-
Notifications
You must be signed in to change notification settings - Fork 6k
[Impeller] take advantage of DisplayList culling #41606
Conversation
A/B Comparison of Framework ToT against a local engine that includes both flutter/flutter#125717 and this PR:
|
if (!buffer_damage.has_value()) { | ||
auto size = surface->GetSize(); | ||
buffer_damage = SkIRect::MakeWH(size.width, size.height); | ||
clip_rect = impeller::IRect::MakeXYWH(0, 0, size.width, size.height); | ||
} | ||
impeller::DlDispatcher impeller_dispatcher(clip_rect.value()); | ||
display_list->Dispatch(impeller_dispatcher, buffer_damage.value()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I implemented this a bit differently for impeller, rather than adding a clip - i added a translate and then shrunk the size of the root render pass, which causes impeller to cull.
translate: https://github.com/flutter/engine/blob/main/flow/compositor_context.cc#L189
I'm not sure if this clip would already do the right thing though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
which causes impeller to cull.
Causes it to cull where?
impeller::Dispatcher::DrawRect immediately calls Canvas::DrawRect without any tests.
Canvas::DrawRect (sometimes does something different for strokes and mask blurs) immediately creates and adds an entity without any other checks.
Where is the culling done if that rectangle was so far outside the clip that it isn't visible? The Dispatch method on DL can do this kind of bounds culling and then the impeller::Dispatcher never even sees the op...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this code here can apply a translation. If there is no damage rect, the cull rect is the size of the surface so it is already sized correctly and 0-based. If there is a damage rect, then the remainder of the surface must remain intact so we need to render the culled ops "in situ" with respect to origin and surface size.
While I still don't think I see all of the relevance of your comment, I don't think we can do translations and resizes of surfaces here because of the nature of this code and the damage rect...?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ding ding! With your later comment that the blinking cursor was missing I think I finally got what you are saying. Unlike the other platforms that reuse the original framebuffer surface to render the partial damage inside a clip in the middle of the surface, Impeller renders the damage to a reduced size surface and then blits it into place?
In that case then the Impeller cull rect should always be set to just the rect(size) value with no origin, but the Dispatch call should be called with the damage rect since it is in the coordinate space of the original frame?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking again, it looks like the translate happens before the frame DisplayList is generated, so both rects should be 0-origin-based?
And, doing all of that fixes the blinking cursor problem...
impeller/aiks/canvas.cc
Outdated
@@ -23,15 +23,25 @@ | |||
namespace impeller { | |||
|
|||
Canvas::Canvas() { | |||
Initialize(); | |||
Initialize(Rect::MakeLTRB(-1e9, -1e9, 1e9, 1e9)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: We've been using Rect::MakeMaximum()
for max coverage/bounds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How well does MakeMaximum deal with being transformed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than worry about specifying a huge clip that might encourage degenerate math, I changed the cull_rect to an std::optional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM
impeller/entity/entity_pass.h
Outdated
@@ -236,6 +236,7 @@ class EntityPass { | |||
|
|||
struct CanvasStackEntry { | |||
Matrix xformation; | |||
Rect clip_bounds; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we rename this to clip_coverage
and add a docstring noting that this rect is screen space? Impeller loosely standardized around using "bounds" for local space and "coverage" for screen space (unless qualified with a specific space term).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One question, does "coverage" imply exact coverage? This would be a conservative estimate of just the bounds of the device clip.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved this struct to canvas.h since it's not used from anywhere else, and how do you feel about "cull_rect" for a name since it is only used for culling, not for clipping?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nah, "coverage" is always assumed to be a best effort minimum in Impeller. "cull_rect" SGTM.
const Rect Canvas::GetCurrentLocalClipBounds() const { | ||
Matrix inverse = xformation_stack_.back().xformation.Invert(); | ||
return xformation_stack_.back().clip_bounds.TransformBounds(inverse); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be good to add a regular test (doesn't use OpenPlaygroundHere
) in aiks_unittests
which uses this method to verify the intersection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way I can have a test that creates a wide ranging DL, then dispatches it to an impeller Dispatcher with a cull_rect (and one that does that with a sub-DisplayList as well), and then checks that there are only Entities for the ops that should have made it through the culling?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created a new canvas_unittests
file and wrote a whole suite of tests which found several bugs in Rect::CutOut().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With respect to making sure that the Entities and Passes are culled by the cull rect, we already have tests of the DisplayList RTree mechanism that make sure that the culled Dispatch
variant correctly omits ops in the playback, and with the new tests in canvas_unittests.cc
that check that Impeller's Canvas
manages the cull rect according to the desired behavior - is there anything left to test here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say this level of test is beyond the call of duty. 😆
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, given that it found errors in an existing geom method, isn't it just "at the call of duty"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already cover rect methods in geometry_unittests.cc. It sounds like I probably missed a case there.
Impeller already tracks the clip bounds in EntityPass at render time, but this change will prevent lots of unnecessary API traffic from hitting Impeller in the first place when platform views are being used, which seems like a good idea to me. I've been strongly considering pulling some of the culling behavior out into Aiks anyhow so that we can throw away culled things earlier, and so I'm not opposed to tracking a screen space coverage rectangle for the clip in the transform stack. |
Another run of the Impeller non-intersecting platform view benchmark on the latest commit produced similar results to before:
|
I tried running this on the Flutter gallery (the old one in flutter/flutter). On the text fields demo, the blinking cursor no longer shows up correctly and the console is filled with messages of the form:
|
I'll have to see where we are creating these DLs without an rtree, but the blinking cursor is likely because I didn't understand what you were saying above about how Impeller manages partial repaints. I'm curious if my latest reply there indicates if I'm on the same page. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! (Modulo the blinking cursor issue mentioned by Jonah)
756e78b
to
fa1cfa8
Compare
The latest version seems to fix both the blinking cursor and "DL with no RTree" problems. |
…125858) flutter/engine@7d87410...024bf94 2023-05-02 [email protected] Roll Skia from 38e56b6da8f9 to c9e0992be00b (3 revisions) (flutter/engine#41655) 2023-05-02 [email protected] Roll Dart SDK from dc4a048e3cf7 to 25c29435f73e (1 revision) (flutter/engine#41654) 2023-05-01 [email protected] [Impeller] take advantage of DisplayList culling (flutter/engine#41606) 2023-05-01 [email protected] Use os_dimension in framework tests. (flutter/engine#41649) 2023-05-01 [email protected] Turn @staticInterop tear-off into closure (flutter/engine#41643) 2023-05-01 [email protected] Roll Fuchsia Linux SDK from SJOgKviZ-kwWd1Z1u... to ur2ymZJCZSj64s6Q2... (flutter/engine#41648) Also rolling transitive DEPS: fuchsia/sdk/core/linux-amd64 from SJOgKviZ-kwW to ur2ymZJCZSj6 If this roll has caused a breakage, revert this CL and stop the roller using the controls here: https://autoroll.skia.org/r/flutter-engine-flutter-autoroll Please CC [email protected],[email protected],[email protected] on the revert to ensure that a human is aware of the problem. To file a bug in Flutter: https://github.com/flutter/flutter/issues/new/choose To report a problem with the AutoRoller itself, please file a bug: https://bugs.chromium.org/p/skia/issues/entry?template=Autoroller+Bug Documentation for the AutoRoller is here: https://skia.googlesource.com/buildbot/+doc/main/autoroll/README.md
SkiaPerf is reporting some nice results from this PR: https://flutter-flutter-perf.skia.org/e/?begin=1682965609&end=1683036391&keys=X5edb569869225d453e9d784e6e3952ea&num_commits=50&request_type=1&xbaroffset=34645 |
Also reductions in cpu/gpu util on the backdrop filter benchmarks https://flutter-flutter-perf.skia.org/e/?begin=1682965609&end=1683036391&keys=X711aa6c13f994bdd30843f360c7e708f&num_commits=50&request_type=1&xbaroffset=34645 |
I have been going through these results. The impact on the non-intersecting platform view benchmark was less than I measured on my iPhone 6 by quite a lot.
|
I think the much earlier culling is driving most of those improvements. Since the dirty region itself is fairly small (just the animating rect). Arguably we should turn of partial repaint for those benchmarks... For the non-intersecting platform views, there is not partial repaint so it seems like the additional culling may not be significant once we fixed Impeller to not encode those extra instructions? They're just draw rects so its fairly cheap. I think the opportunity for that benchmark is in recognizing that the overlay surfaces are empty. To take advantage of that, culling isn't sufficient - we need to skip the entire surface. |
I need to go back and check out the backdrop perf, but I think it was originally testing the caching of BDFLayers and that was before partial repaints. The current situation makes an excellent test of partial repaints, though. If we "fixed" the benchmark to continue rendering the BDFLayers by using a larger animating rect (or one above and one below), we'd then be measuring Impeller's lack of layer caching. |
Actually, isn't there partial repaint, but it isn't very significant? The scrolling list is the only part changing so the header would not need to be redrawn...? Either way, there is culling because before we didn't cull to anything and now we cull at least to the size of the surface which skips all of the data that is outside of the surface. I haven't looked closely at what is generated by the benchmark, but depending on the "overdraw" of the scrolling list, tons of non-visible rects may have been rendered on each frame. Update: It looks like "ShouldRender" was doing culling against the entity's coverage which may have eliminated a lot of them, but the DL culling is per-op, not per-entity, so perhaps it eliminated a bunch more? |
Partial repaint is disabled with platform views. Also neither backend raster cached backdrop filters |
Good point. So the gain there is the granularity of the culling.
When the benchmark was written it was to measure the performance of backdrop filters in order to track the effects of caching them. There was a prototype, but getting it right ended up being harder than expected and so the work was abandoned, but the benchmark remains. At the time we decided that it was good to keep it to measure the performance of that operation even if our short-term plans didn't pan out as we might have other ways of improving it. (flutter/flutter#34870) |
Switching the calls to dispatch into an Impeller Dispatcher to use a cull rect to enable pre-culling of the out-of-bounds ops.
This change showed an improvement of around 2x on the rendering performance of the non-intersecting platform view benchmark, but that was measured without the recent changes to the destructive blend modes in Impeller renderer.