Add comments to clarify per-frame profiler #1039
vauduong merged 4 commits into facebookresearch:master
| Mn::DebugTools::GLFrameProfiler::Value::FrameTime | // Time to render per | ||
| // frame | ||
| Mn::DebugTools::GLFrameProfiler::Value:: | ||
| CpuDuration | // Time to process action (eg. physics, key presses) |
Are physics/simulation and UI processing the only two broad categories? Is there anything else?
The profiler currently measures just one scope; that's its limitation. Measuring more scopes would need more than one profiler instance, as I suggested in #1015. That's a possible path for extending this functionality.
(I might be expanding the Magnum functionality eventually as well, depending on what will be interesting to measure for the Vulkan backend.)
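To illustrate the "one instance per scope" idea, here is a minimal standalone sketch. `ScopeTimer` is a hypothetical helper written for this example, not part of Magnum or the viewer, and it uses `std::chrono::steady_clock` (an assumption on my side, chosen because it is monotonic) rather than whatever clock the real profiler uses.

```cpp
#include <chrono>
#include <string>
#include <utility>

// Hypothetical helper (not a Magnum API): accumulates CPU time for one
// named scope. Create one instance per scope you want to measure.
class ScopeTimer {
 public:
  using Clock = std::chrono::steady_clock;

  explicit ScopeTimer(std::string name) : name_{std::move(name)} {}

  // Call right before the work to be measured.
  void begin() { start_ = Clock::now(); }

  // Call right after the work; adds the elapsed time to the running total.
  void end() {
    total_ += Clock::now() - start_;
    ++samples_;
  }

  // Mean duration of one begin()/end() pair in nanoseconds,
  // or 0 if nothing has been measured yet.
  double meanNanoseconds() const {
    if (samples_ == 0) return 0.0;
    return std::chrono::duration<double, std::nano>(total_).count() /
           samples_;
  }

  const std::string& name() const { return name_; }
  int samples() const { return samples_; }

 private:
  std::string name_;
  Clock::time_point start_{};
  Clock::duration total_{};
  int samples_ = 0;
};
```

In a frame loop one would keep, say, `ScopeTimer physics{"physics"}` and `ScopeTimer events{"events"}` side by side and wrap each block of work in its own `begin()`/`end()` pair, which gives the per-category breakdown asked about above.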
| CpuDuration | // Time to process action (eg. physics, key presses) | ||
| // data per frame | ||
| Mn::DebugTools::GLFrameProfiler::Value:: | ||
| GpuDuration; // Time to process graphics data per frame |
"Process graphics data" = rendering? Or do we mean something else?
@dhruvbatra : "GpuDuration" uses an asynchronous query, such as ARB_timer_query, to record the amount of time the GPU takes to fully complete a set of scoped GL commands.
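Because the query is asynchronous, its result is typically read a few frames after it was issued, so the CPU never waits on the GPU. The sketch below models only that delayed readback with a small ring buffer. It makes no GL calls, and `DelayedValue` is a hypothetical name used for illustration, not something from Magnum or OpenGL.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical model of delayed query readback: each frame stores a new
// measurement and hands back the one recorded `Delay` frames earlier.
template <std::size_t Delay>
class DelayedValue {
  static_assert(Delay > 0, "Delay must be at least one frame");

 public:
  // Stores this frame's value. Returns true and writes the value from
  // `Delay` frames ago to `out` once the buffer has filled up; returns
  // false (leaving `out` untouched) during the first `Delay` frames.
  bool push(std::uint64_t nanoseconds, std::uint64_t& out) {
    const bool ready = count_ >= Delay;
    if (ready) out = slots_[index_];
    slots_[index_] = nanoseconds;
    index_ = (index_ + 1) % Delay;
    if (count_ < Delay) ++count_;
    return ready;
  }

 private:
  std::array<std::uint64_t, Delay> slots_{};
  std::size_t index_ = 0;
  std::size_t count_ = 0;
};
```

With `DelayedValue<3>`, the first three frames report nothing and the fourth frame reports the value measured in the first, which is the kind of lag a "3 frame delay" refers to.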
I think this could be an easy enough interpretation of the numbers:
- FrameTime is the inverse of FPS and should be 16.6 ms or less for smooth interactivity. In my opinion it's better than FPS because you can better measure improvements. "FPS increased by 10" means a totally different thing if it was from 10 to 20 or from 900 to 910, while "frame time decreased by 10 ms" always has a clear meaning.
- CpuDuration measures how much CPU time was spent doing all work in a frame -- event processing, physics, but also traversing the scenegraph, submitting work for the GPU, and driver overhead. It should be less than the GPU duration; if it's not, the GPU is sitting there bored.
- GpuDuration measures how much time it took for the GPU to process all work submitted by the CPU. If it's less than the CPU time, reducing the CPU / driver overhead lets you do / draw more in a frame (or draw frames faster). If it's significantly more than the CPU time, you're GPU-bound -- rendering too dense meshes, having too high texture resolution, or maybe just rendering a lot of objects that end up occluded or out of the view, or having an inefficient layout of the mesh data.

Generally, since the aim here is to render as fast as possible (and not power efficiency, for example), the CPU and GPU time should be roughly equal.
@vauduong in case you haven't stumbled upon it yet, there's an article about the geometry pipeline in Magnum, with more info about how to interpret the values and what makes meshes slow or fast to render: https://blog.magnum.graphics/announcements/new-geometry-pipeline/ It's scary long but written hopefully in a general enough way that might give you useful info even if you don't end up using Magnum further in your career ;)
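The rules of thumb above for reading the three values can be written down as a small sketch. The function names, the `Bottleneck` enum, and the 10% tolerance are all assumptions made for illustration; none of them come from Magnum or the viewer.

```cpp
#include <algorithm>
#include <cmath>

// Convert between frame time (milliseconds) and frames per second.
double fpsFromFrameTime(double frameTimeMs) { return 1000.0 / frameTimeMs; }
double frameTimeFromFps(double fps) { return 1000.0 / fps; }

enum class Bottleneck { Cpu, Gpu, Balanced };

// Hypothetical heuristic: if the two durations differ by no more than
// `tolerance` (relative to the larger one), call the frame balanced;
// otherwise the larger duration names the bottleneck.
Bottleneck classify(double cpuMs, double gpuMs, double tolerance = 0.1) {
  const double larger = std::max(cpuMs, gpuMs);
  if (larger <= 0.0) return Bottleneck::Balanced;
  if (std::fabs(cpuMs - gpuMs) / larger <= tolerance)
    return Bottleneck::Balanced;
  return cpuMs > gpuMs ? Bottleneck::Cpu : Bottleneck::Gpu;
}
```

As a worked example of the FPS point: going from 10 to 20 FPS takes the frame time from 100 ms to 50 ms, a saving of 50 ms, while going from 900 to 910 FPS saves only about 0.012 ms per frame.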
Thanks for the helpful comments @mosra! I did stumble upon that article and it was also super informative :)
@mosra : Saw you are in the reviewer list.
@mosra : Yes, would appreciate any feedback on writing more precise comments to help users understand what the GLProfiler is doing!
Replied above :) I'm realizing some of this info could go straight into Magnum docs as well, because the explanation is currently a bit underwhelming.
| * Uses asynchronous querying to measure the amount of time | ||
| * to fully complete a set of GL commands without stalling rendering, 3 frame | ||
| * delay | ||
| * Asynchronous querying extensions: ARB_timer_query (OpenGL 3.3), |
I would recommend removing such details (L567 - L570).
| * CpuDuration: (Units::Nanoseconds) CPU time spent processing events, | ||
| * physics, traversing SceneGraph, and submitting data to GPU/drivers per | ||
| * frame | ||
| * Measured using std::chrono::high_resolution_clock, 1 frame delay |
Remove details such as L560.
| * frame | ||
| * Measured using std::chrono::high_resolution_clock, 1 frame delay | ||
| * | ||
| * GpuDuration: (Units::Nanoseconds) GPU time spent rendering data submitted |
Measures how much time it takes for the GPU to process all work submitted by the CPU.
| * GpuDuration: (Units::Nanoseconds) GPU time spent rendering data submitted | ||
| * by CPU per frame | ||
| * Uses asynchronous querying to measure the amount of time | ||
| * to fully complete a set of GL commands without stalling rendering, 3 frame |
| Mn::DebugTools::GLFrameProfiler::Value::GpuDuration; | ||
|
|
||
| // VertexFetchRatio and PrimitiveClipRatio only supported for GL 4.6 | ||
| /** |
You do not need this. Undo the change.
Motivation and Context
We added a per-frame profiler in #1015 to display frame duration, CPU duration, and GPU duration at runtime, so that we can spot bottlenecks in data processing when running our viewer. This PR adds comments to clarify how to interpret the values.
How Has This Been Tested
Build and run