-
Notifications
You must be signed in to change notification settings - Fork 6k
[Impeller] add compute pass API for memory barriers, re-enable for Vulkan. #49946
Conversation
@@ -32,7 +32,8 @@ | |||
if (!buffer_) { | |||
return; | |||
} | |||
encoder_ = [buffer_ computeCommandEncoder]; | |||
encoder_ = [buffer_ computeCommandEncoderWithDispatchType: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By default we were creating a serial encoder, which acts like it has a global memory barrier between each command. Since we need to specify the barriers for Vulkan, we might as well use the concurrent dispatcher for iOS too.
@@ -308,6 +308,7 @@ TEST_P(ComputeTest, MultiStageInputAndOutput) { | |||
CS1::BindOutput(*pass, DeviceBuffer::AsBufferView(output_buffer_1)); | |||
|
|||
ASSERT_TRUE(pass->Compute(ISize(512, 1)).ok()); | |||
pass->AddBufferMemoryBarrier(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we're now running concurrently, we need a memory barrier here so that step 2 sees the output of step 1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments and explanations look really good. Two notes.
Not an explicit approval as someone who understands what a memory barrier and whether it's being used properly should do that step.
@@ -267,9 +268,7 @@ GeometryVertexType PointFieldGeometry::GetVertexType() const { | |||
// Compute is disabled for Vulkan because the barriers are incorrect, see |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this out of date with your change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, removed.
impeller/renderer/compute_pass.h
Outdated
/// This only handles read after write hazards. | ||
virtual void AddBufferMemoryBarrier() = 0; | ||
|
||
/// @brief Add a barrier that ensures all previously encoded commands have |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Try to use the same verbage on similar API, otherwise my brain thinks "hmm this one Adds a barrier" but Buffer memory doesn't? (I'm guessing they both could be written more similar, and if not, let's clarify)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the adds a barrier
part redundant? I should update this too, I was trying to describe what a barrier does instead of restating the method name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think "ensures all previously encoded ..." is probably sufficient.
The fact that it does so by adding the barrier is, as you stated, is in the API name.
The only thing I'd add I suppose is when this API is supposed to be used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rephrased what this does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the barriers are underspecified in Vulkan and the Metal and Vulkan semantics are different for the transfer and compute/render dependencies.
Other than that, some nits on documentation.
impeller/renderer/compute_pass.h
Outdated
/// matter if the compute command is in a different command buffer. | ||
virtual void AddBufferMemoryBarrier() = 0; | ||
|
||
/// @brief Ensures all previously encoded commands have finished writing to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor edit for clarity: "Ensures that side effects (to textures) of all commands previously encoded in the pass are visible to subsequent commands in the pass."
Since "finished writing" isn't quite the same as having those writes be visible to subsequent commands. Because of caches and such.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
impeller/renderer/compute_pass.h
Outdated
/// textures before any following commands can read from those | ||
/// textures. | ||
/// | ||
/// This method after encoding a command that writes to a storage |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sentence is quite confusing to me. Did you mean "Call this method?" as a recommended usage?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah removed, I don't know what I was doing here
// |ComputePass| | ||
void ComputePassVK::AddBufferMemoryBarrier() { | ||
vk::MemoryBarrier barrier; | ||
barrier.srcAccessMask = vk::AccessFlagBits::eShaderWrite; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These to me seem underspecified. Right now, this says "All commands before the barrier must have buffer-writes in their compute shaders be visible to buffer-reads in also the compute shaders of commands after the barrier".
The same can apply to transfer read-write (VK_ACCESS_TRANSFER_READ|WRITE_BIT
in VK_PIPELINE_STAGE_TRANSFER_BIT
) in the transfer stage in addition to any compute-render dependencies.
You can just or
them up. But I suggest a comment indicating each value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would also match up with Metal semantics implemented above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would I need to specify transfer dependencies in a barrier API that is specifically for compute-compute dependencies?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I clarified on the API docs that it is only for compute-compute dependencies, but I can add a caveat that it does not cover other cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair. But the first compute pass still needs to wait for say transfer writes on the buffers it works on to be visible. So my expectation would be that compute-compute dependencies while necessary are not sufficient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't use transfer commands anywhere yet though. my expectation is that if we needed this we would add this blit/transfer -> compute barrier functionality to the blit pass and not to the compute pass.
impeller/renderer/compute_pass.h
Outdated
/// This method after encoding a command that writes to a storage | ||
/// texture that a subsequent command will read from, provided that | ||
/// is a compute command and not a render pass command. It does not | ||
/// matter if the compute command is in a different command buffer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Different command buffer but still subject to queue submission order right? Perhaps I am being pedantic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, done. Expect of course on metal where the command buffers are themselves a unit of synchronization.
@@ -213,8 +213,53 @@ bool ComputePassVK::BindResource(size_t binding, | |||
return true; | |||
} | |||
|
|||
// Note: | |||
// https://github.com/KhronosGroup/Vulkan-Docs/wiki/Synchronization-Examples |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see how that is the case. Though you do need the buffer and texture references to specify the barriers on. So this approach is definitely more easy to use if coarse.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could easily add these to the API and create the more specific barriers, but given that we dont have any way to evaluate it this yet, I'd prefer to follow the recommendations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sgtm
Golden file changes have been found for this pull request. Click here to view and triage (e.g. because this is an intentional change). If you are still iterating on this change and are not ready to resolve the images on the Flutter Gold dashboard, consider marking this PR as a draft pull request above. You will still be able to view image results on the dashboard, commenting will be silenced, and the check will not try to resolve itself until marked ready for review. |
That golden change looks wrong? |
Yes it does. Let me double check this, we might need one more barrier. |
This pull request has been changed to a draft. The currently pending flutter-gold status will not be able to resolve until a new commit is pushed or the change is marked ready for review again. |
Changing to non-draft for golden updates, still WIP |
Golden file changes are available for triage from new commit, Click here to view. |
Golden file changes are available for triage from new commit, Click here to view. |
static_cast<int64_t>(std::ceil(width * 1.0 / max_wg_size_[0] * 1.0)), | ||
minimum); | ||
command_buffer_vk.dispatch(threadGroups, 1, 1); | ||
command_buffer_vk.dispatch(width, 1, 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic was incorrect...
…142347) flutter/engine@525bd7d...a65a1b5 2024-01-26 [email protected] Roll Skia from c32aa37effcc to 6279c88b9e29 (1 revision) (flutter/engine#50098) 2024-01-26 [email protected] [Impeller] add compute pass API for memory barriers, re-enable for Vulkan. (flutter/engine#49946) 2024-01-26 [email protected] Introduce a prototype of a "header guard enforcement" tool (flutter/engine#48903) 2024-01-26 [email protected] Roll Skia from e24124912cc3 to c32aa37effcc (1 revision) (flutter/engine#50094) 2024-01-26 [email protected] [web] Do not wipe the PlatformViewManager when disposing of a view. (flutter/engine#49991) 2024-01-26 [email protected] Finish landing missing/incorrect header guards across `flutter/engine` (flutter/engine#50069) 2024-01-26 [email protected] Roll Skia from 32f6bff0f193 to e24124912cc3 (2 revisions) (flutter/engine#50093) 2024-01-26 [email protected] Roll Skia from cbdf09d69efc to 32f6bff0f193 (3 revisions) (flutter/engine#50092) 2024-01-26 [email protected] Roll Dart SDK from 5636e338e0b9 to 7337995bc851 (1 revision) (flutter/engine#50087) 2024-01-26 [email protected] Fix Shell::Screenshot for Impeller (flutter/engine#50072) 2024-01-26 [email protected] Roll Skia from ae73baacb793 to cbdf09d69efc (1 revision) (flutter/engine#50085) If this roll has caused a breakage, revert this CL and stop the roller using the controls here: https://autoroll.skia.org/r/flutter-engine-flutter-autoroll Please CC [email protected],[email protected],[email protected] on the revert to ensure that a human is aware of the problem. To file a bug in Flutter: https://github.com/flutter/flutter/issues/new/choose To report a problem with the AutoRoller itself, please file a bug: https://issues.skia.org/issues/new?component=1389291&template=1850622 Documentation for the AutoRoller is here: https://skia.googlesource.com/buildbot/+doc/main/autoroll/README.md
Adds two new APIs that insert memory barriers for compute -> compute dependencies.
Also makes the ComputePassVK automatically insert a compute -> vertex dependency when encoding. This change is sufficient to let the GPU compute implementation of draw points work on Pixel and Samsung Android devices.
For more explaination on these specific barriers, see the documentation added in this PR.
Fixes #49946