Skip to content

Conversation

@miDeb
Copy link
Member

@miDeb miDeb commented Jul 9, 2025

wip, will get the opportunity to continue some work on this during the weekend

@raffael0
Copy link
Member

@miDeb are you still planning on working on this? Otherwise i'd take over

@miDeb
Copy link
Member Author

miDeb commented Jul 21, 2025

@raffael0 sorry for being inactive the last2 weeks; feel free to take this over if you have the capacity

@raffael0
Copy link
Member

No worries. I'll see what I can do

miDeb and others added 4 commits August 28, 2025 19:05
use std::string_view and avoid unnecessary copies

instead of using 2 fixed-size buffers, use dynamically allocated strings that are then reused
…es the execution time of the callback thread which will hopefully alleviate the weird sampling frequency drops.

Additionally the memory leak of in Node.cpp was fixed
@raffael0 raffael0 force-pushed the feature/can-rx-perf branch from 5cc386e to 0d5ba54 Compare August 28, 2025 17:05
@raffael0 raffael0 requested a review from Copilot August 28, 2025 21:52
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces performance improvements for CAN reception, refactors the InfluxDB writer to use modern C++ features, and implements a threaded message processing system for Kvaser CAN drivers.

Key Changes:

  • Refactors InfluxDB writer to use std::string buffers and std::format instead of C-style arrays and sprintf
  • Introduces a dedicated thread-based message queue for Kvaser CAN reception using the readerwriterqueue library
  • Modernizes CAN driver callback signatures with type aliases and lambda expressions

Reviewed Changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
src/logging/InfluxDbWriter.cpp Refactored to use std::string buffers and std::format for data formatting
src/can/CanKvaserReceiveThread.cpp New threaded message processing implementation for Kvaser CAN
src/can/CANManager.cpp Updated to use lambda expressions instead of std::bind
include/can/CANDriver.h Added type alias for CAN receive callback signature
CMakeLists.txt Added readerwriterqueue dependency

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

raffael0 and others added 4 commits August 30, 2025 20:11
- added a mutex to the enqueue since sometimes the driver runs two callbacks at the same time
@raffael0
Copy link
Member

@miDeb Can you please review the code? I had to remove your influx performance improvements as it threw very weird runtime exceptions (Resource busy). I think it may have to do with the multithreaded addDatapoint calls.

@miDeb
Copy link
Member Author

miDeb commented Sep 1, 2025

Is there a measurable perf improvement? And can we tell if it's good enough now?

Otherwise changes look good to me 👍

@raffael0
Copy link
Member

raffael0 commented Sep 1, 2025

No there is no measurable performance impact(yet). I am still evaluating it on the hardware.
Here is the sampling frequency on a test I did yesterday. No sequences or commands were sent during this time. As you can see there wasn't a huge dip like we've seen in previous coldflows/hotfires(The one at the start is a restart of the server). We are conducting a reference test today to trigger the frequency dip and verify that it improved
image

What I already noticed is that the driver doesn't use a single thread in the callbacks. i.e. multiple threads may call the callback at the same time. In the previous solution this was just ignored but now I added a mutex to prevent concurrent access to the queue. My current theory is that there is one thread per channel but I haven't looked into it in detail so far.

This multithreaded callbacks concern me a bit since this might make the readouts non-deterministic and mess up the ordering of the datapoints. But since the software design is better than the previous solution I do not think it got any worse and I guess that's fine for now.

Edit: Also I forgot to push the revert commits of your influx stuff. I pushed it now
Edit: Here is the datarate from the test today. I think it's pretty safe to say that the patch fixes the issue.

image

@raffael0 raffael0 marked this pull request as ready for review September 1, 2025 12:20
@raffael0
Copy link
Member

raffael0 commented Sep 1, 2025

The latest commit changes the dequeue to a blocking version with a 1ms timeout. I looked at a profile and it looks a lot better

raffael0 added a commit that referenced this pull request Sep 5, 2025
This commit decouples the kvaser message receive callback from the message handling in the llserver. This is done by adding a new thread and a queue. Messagesa are added to the queue and removed/processed as soon as possible by a new message handling thread. Previously the long message parsing lead to sampling rate dropoffs. The new version appears to have fixed the bug

Additionally this commit fixes a memory leak in Node.cpp
@raffael0 raffael0 closed this Sep 5, 2025
@raffael0
Copy link
Member

raffael0 commented Sep 5, 2025

I merged manually

raffael0 added a commit that referenced this pull request Oct 9, 2025
Fixes a bug, where the whole llserver would freeze up for ~10 seconds. Ultimately the influxdb is responsible for this, since it sometimes takes >200ms to respond to an insert, instead of the normal 2-20. This PR fixes it by **completely** decoupling the sending thread from the rest of the llserver. I believe that this is also the root cause behind #11.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants