[Impeller] reduce gaussian sampling by 2x #40871
It looks like this pull request may not have tests. Please make sure to add tests before merging. If you need an exemption to this rule, contact Hixie on the #hackers channel in Chat (don't just cc him here, he won't see it! He's on Discord!). If you are not sure if you need tests, consider this rule of thumb: the purpose of a test is to make sure someone doesn't accidentally revert the fix. Ask yourself, is there anything in your PR that you feel it is important we not accidentally revert back to how it was before your fix? Reviewers: Read the Tree Hygiene page and make sure this patch meets those guidelines before LGTMing.
);
total_color += gaussian * Sample(texture_sampler,  // sampler
                                 neg_coord         // texture coordinates
);
Leveraging linear sampling is a great idea. :)
Instead of fanning out the two sides here, I think just taking the original loop and stepping by two, something like for (float16_t i = round(-blur_info.blur_radius); i <= blur_info.blur_radius; i += 2.hf), would be the same?
Should we just round in the contents so blur_radius is always integral?
Yeah, rounding on the CPU seems good.
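For reference, here is a rough sketch of the loop bounds discussed in this thread, written in Python rather than shader code so it can be run directly. The function name `tap_starts` is hypothetical and is not part of the PR. It assumes the radius is rounded once up front (the "round on the CPU" idea) and that each iteration of the stepped loop covers the pair of texels at i and i + 1.

```python
def tap_starts(blur_radius, step):
    """Return the loop variable values visited for a given step size.

    Assumption: blur_radius is rounded once before the loop, so both loop
    bounds are integral. Python's round() resolves exact .5 ties differently
    from some shader languages, so tie values are avoided in the usage below.
    """
    r = round(blur_radius)
    # Mirrors: for (i = -r; i <= r; i += step)
    return list(range(-r, r + 1, step))


# One texture fetch per iteration in both cases.
per_texel = tap_starts(3.4, 1)   # original loop: one nearest sample per texel
per_pair = tap_starts(3.4, 2)    # stepped loop: one linear sample per texel pair
```

With an integral radius r, the per-texel loop runs 2r + 1 times and the stepped loop runs r + 1 times, which is where the roughly 2x reduction comes from.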
Done
Oops, where did it go...
Oh nvm, it's there.
This improved the performance of the backdrop filter blur, good job!
The regular animated blur benchmark also has reduced GPU utilization on this commit.
Uses the technique outlined in https://rastergrid.com/blog/2010/09/efficient-gaussian-blur-with-linear-sampling to reduce the number of texture samples by 2x. Rather than sampling each texel individually with nearest-neighbor filtering, sample between adjacent texels with linear filtering, so that a single fetch returns their lerp'd, weighted combination. The result should be identical to the previous one.
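To illustrate why the two approaches should agree, here is a minimal numeric sketch in Python (not the shader code from this PR). It models a 1D row of texels, and all function names are hypothetical. Two adjacent taps with weights w1 and w2 are merged into one linearly filtered fetch with weight w1 + w2, placed at a fractional offset of w2 / (w1 + w2) past the first tap. The sketch assumes clamp-to-edge addressing and leaves the weights unnormalized for brevity.

```python
import math


def gaussian(x, sigma):
    """Unnormalized-sum Gaussian weight for a tap at distance x."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def sample_nearest(texels, i):
    """Nearest-neighbor fetch with clamp-to-edge addressing (an assumption)."""
    return texels[min(max(i, 0), len(texels) - 1)]


def sample_linear(texels, x):
    """Linearly filtered fetch at fractional position x."""
    i0 = math.floor(x)
    t = x - i0
    return (1.0 - t) * sample_nearest(texels, i0) + t * sample_nearest(texels, i0 + 1)


def merge_taps(o1, w1, o2, w2):
    """Merge two weighted taps into one (offset, weight) pair."""
    w = w1 + w2
    # Algebraically equal to (o1 * w1 + o2 * w2) / w.
    return o1 + (o2 - o1) * w2 / w, w


def blur_nearest(texels, center, radius, sigma):
    """Reference: one nearest fetch per texel, 2 * radius + 1 fetches."""
    return sum(gaussian(i, sigma) * sample_nearest(texels, center + i)
               for i in range(-radius, radius + 1))


def blur_linear(texels, center, radius, sigma):
    """One linear fetch per texel pair; returns (result, fetch_count)."""
    total, fetches = 0.0, 0
    for i in range(-radius, radius + 1, 2):
        w1 = gaussian(i, sigma)
        # The last pair reaches one texel past the radius; give it zero weight.
        w2 = gaussian(i + 1, sigma) if i + 1 <= radius else 0.0
        offset, w = merge_taps(i, w1, i + 1, w2)
        total += w * sample_linear(texels, center + offset)
        fetches += 1
    return total, fetches
```

For an integral radius r, `blur_linear` performs r + 1 fetches where `blur_nearest` performs 2r + 1, and the two results agree up to floating-point rounding. On real hardware the agreement also depends on the precision of the texture unit's filtering, which this sketch does not model.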
From local testing on an iPhone 13, this is about 50% faster on the backdropfilter stress test, though there is no major difference on the animated backdrop filter benchmark, as the blur is already fast enough to not cause frame drops.
Most test apps appear to look identical. Malioc results are much worse, but I suspect that it is not really analyzing the number of loop iterations and is instead just looking at the work per iteration. Because I'm doing more computation per iteration (but fewer iterations), it looks worse.