Data Processing
During a trace, a large amount of data is generated (at least 32 bytes per method call). When the incoming trace data outpaces data processing, the data buffer begins to fill up. Once it is full, the execution of the program being traced slows to the rate at which the data can be processed. Since only so much memory is available for the data buffer, fast data processing is imperative.
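The back-pressure described above can be illustrated with a bounded queue. This is a minimal sketch, not the tracer's actual buffer: `run_trace`, the buffer size, and the use of a thread as the "traced program" are all assumptions made for illustration.

```python
import queue
import threading

def run_trace(n_events, buffer_size):
    """Push n_events through a bounded buffer and return them as processed.

    Hypothetical model: the calling thread plays the traced program and a
    worker thread plays the data processing side.
    """
    buf = queue.Queue(maxsize=buffer_size)  # bounded: only so much memory
    processed = []

    def consumer():
        while True:
            item = buf.get()
            if item is None:  # sentinel: no more trace data
                break
            processed.append(item)

    worker = threading.Thread(target=consumer)
    worker.start()
    for event in range(n_events):
        # put() blocks while the buffer is full, so the producer can run
        # no faster than the consumer drains the buffer.
        buf.put(event)
    buf.put(None)
    worker.join()
    return processed
```

Because `put()` blocks on a full queue, a slow consumer directly throttles the producer, which is the slowdown the paragraph above describes.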
Each data message is assigned a sequential identifier before it is sent, giving an absolute ordering. Integer overflow is resolved by also looking at the timestamps on the messages. With these two pieces of information, and the assumption that no more than half of the range of an integer is covered in a single time unit, we're able to reliably order the messages.
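One way to realize this ordering is sketched below. It assumes a 32-bit sequence counter, messages represented as `(timestamp, sequence)` pairs, and that messages in different time units can be ordered by timestamp alone; none of these details are confirmed by the text, and the names `seq_delta` and `compare` are hypothetical.

```python
from functools import cmp_to_key

SEQ_BITS = 32                # assumed width of the sequence counter
SEQ_MOD = 1 << SEQ_BITS
HALF = SEQ_MOD >> 1

def seq_delta(a, b):
    """Signed distance from b to a modulo 2**32, in [-2**31, 2**31)."""
    return ((a - b + HALF) % SEQ_MOD) - HALF

def compare(m1, m2):
    """Order (timestamp, sequence) pairs; usable with cmp_to_key."""
    (t1, s1), (t2, s2) = m1, m2
    if t1 != t2:
        # Different time units: the timestamp decides.
        return -1 if t1 < t2 else 1
    # Same time unit: fewer than half the range was covered, so the
    # wraparound-aware difference gives the true order.
    d = seq_delta(s1, s2)
    return (d > 0) - (d < 0)

messages = [(5, 1), (5, SEQ_MOD - 1), (4, 7), (5, 0)]
ordered = sorted(messages, key=cmp_to_key(compare))
```

Here the counter wraps from `2**32 - 1` to `0` inside time unit 5, and the comparison still places `2**32 - 1` before `0` and `1`.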
As data is collected, it is fed into a priority queue for sorting (using the sequence and timestamp values). As consecutive data points are found, they are fed to data processors for analysis. Due to how the data buffering and sending on the tracing agent is implemented, we can be fairly certain that not too much data will need to be queued up before a chunk of sorted data becomes available.
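The queue-and-release step can be sketched with a binary heap. This is an illustrative reduction, assuming sequence numbers are unique, start at a known value, and have already been made monotonic (no wraparound); `ReorderBuffer` and its methods are hypothetical names.

```python
import heapq

class ReorderBuffer:
    """Holds out-of-order messages and releases consecutive runs."""

    def __init__(self, first_seq=0):
        self._heap = []          # min-heap of (seq, payload)
        self._next = first_seq   # next sequence number to release

    def push(self, seq, payload):
        """Queue one message; return the payloads that are now in order."""
        heapq.heappush(self._heap, (seq, payload))
        ready = []
        # Drain the heap while its smallest entry is the one we expect next.
        while self._heap and self._heap[0][0] == self._next:
            ready.append(heapq.heappop(self._heap)[1])
            self._next += 1
        return ready

buf = ReorderBuffer()
first = buf.push(1, "b")    # nothing released, 0 is still missing
second = buf.push(2, "c")   # still waiting for 0
third = buf.push(0, "a")    # gap filled: the whole run is released
```

Only the messages behind a gap stay queued, so the buffer stays small as long as the sender delivers data roughly in order.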
As sorted data becomes available, it is passed on to all registered data processors. It is up to these processors to do what they want with the data and store whatever information is necessary, as the data is not stored elsewhere after being processed.
There may be breaks in the data. If tracing is suspended, program execution continues without tracing. If tracing is resumed later, any assumptions that data processors may be making about call stacks therefore become invalid. When such a break is encountered in the data stream, the data processors are notified.
Data processors implement the DataProcessor trait and, once registered with the DataRouter, will have their appropriate methods called as data comes up.
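The registration and callback flow, including the break notification, might look roughly like the following. Only the names DataProcessor and DataRouter come from the text. The method names, the `(kind, method)` message shape, and the `CallStackTracker` processor are assumptions for illustration, and the sketch is written in Python rather than as an actual trait.

```python
from abc import ABC, abstractmethod

class DataProcessor(ABC):
    """Hypothetical stand-in for the DataProcessor trait."""

    @abstractmethod
    def process_message(self, message):
        """Handle one sorted data message."""

    def process_data_break(self):
        """Called when a break in the data stream is encountered."""

class DataRouter:
    """Hypothetical router that fans messages out to all processors."""

    def __init__(self):
        self._processors = []

    def register(self, processor):
        self._processors.append(processor)

    def route(self, message):
        for processor in self._processors:
            processor.process_message(message)

    def route_data_break(self):
        for processor in self._processors:
            processor.process_data_break()

class CallStackTracker(DataProcessor):
    """Example processor: keeps its own call stack, as nothing else stores it."""

    def __init__(self):
        self.stack = []

    def process_message(self, message):
        kind, method = message  # assumed shape: ("enter" | "exit", name)
        if kind == "enter":
            self.stack.append(method)
        elif kind == "exit" and self.stack:
            self.stack.pop()

    def process_data_break(self):
        # Tracing was suspended, so the stack can no longer be trusted.
        self.stack.clear()
```

A processor that tracks call stacks has to discard its state on a break, since calls and returns that happened while tracing was suspended were never reported.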
Some data messages are unsequenced. For these messages, order does not matter, so they are immediately propagated through the system, ahead of any data that is queued for sorting. Examples include MapThreadName, MapMethodSignature, and MapException.
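The bypass for unsequenced messages can be sketched as follows. `Dispatcher`, `submit`, the `sink` callback, and the `MethodEntry` message kind are hypothetical; only MapThreadName is taken from the text, and its payload shape here is assumed.

```python
import heapq

class Dispatcher:
    """Forwards unsequenced messages at once and sorts the rest."""

    def __init__(self, sink, first_seq=0):
        self._sink = sink        # callable that receives (kind, payload)
        self._heap = []          # min-heap of (seq, kind, payload)
        self._next = first_seq

    def submit(self, kind, payload, seq=None):
        if seq is None:
            # Unsequenced: order does not matter, so skip the queue.
            self._sink((kind, payload))
            return
        heapq.heappush(self._heap, (seq, kind, payload))
        # Release any consecutive run that is now complete.
        while self._heap and self._heap[0][0] == self._next:
            _, k, p = heapq.heappop(self._heap)
            self._sink((k, p))
            self._next += 1

out = []
dispatcher = Dispatcher(out.append)
dispatcher.submit("MethodEntry", "foo", seq=1)     # queued, waiting for seq 0
dispatcher.submit("MapThreadName", (7, "main"))    # delivered immediately
dispatcher.submit("MethodEntry", "main", seq=0)    # releases seq 0 and 1
```

The mapping message reaches the sink before either sequenced message, even though one of them was submitted first.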