Uniform control flow vs textures for flat interpolated varyings #3668
Thanks for the feedback!
firstInstance is not an optional feature. It's an optional parameter to the 'draw' call. It defaults to zero.
Yes. By the letter of the Vulkan spec, they are non-uniform. Yes, this is surprising. See #3489
Push constants are not in v1. See the discussion/investigation in #75
This is the intense debate of the last few months. See #3479 #3554 #3644
This is getting less actionable. Pulling the signal out of the noise: it is customary to put an operation requiring an implicit derivative inside non-uniform control flow. But the spec currently says this should be banned. We are debating how to weaken that rule.
That's not the whole story. Vulkan doesn't actually require total uniformity across the invocation group; it only requires that derivatives are executed within "control flow that is uniform within a derivative group".
I was mistaken about firstInstance; it's only an optional feature for indirect calls. What would be useful here is to know why a particular feature couldn't just be made part of the spec, and to document or link that reasoning. Quoting from one of the linked threads:
This is exactly what I'm getting at. There is a lot of rules lawyering, but I see very little focus on actually making sure well-established rendering techniques can be implemented. Whether you dress this up nicely is less important than the recognition that WebGPU 1.0 will be dead on arrival if these kinds of issues are not fixed. Maybe a rude wake-up call is exactly what is needed, rather than talk about "spending innovation tokens" (also a quote). That is the sort of "professional" language that actual professionals know is meaningless.

I have a repo with months of WebGPU code in it that shows how untenable all these feature cuts can become when they all intersect, because of the many workarounds that all have to be deployed in concert. Even something as simple as rendering a glTF (a format designed for GPU consumption, on the web) doesn't work unless you write code to split and realign data at 256-byte offsets. It's a death of a thousand cuts even if you have infrastructure like a shader linker and polyfills.

In my case, the offending texture sampler is a textureSampleCompare on a depth map without MIPs, so I don't even understand why derivatives would be needed. Deferred isn't a particularly complex rendering technique. If you imagine e.g. a forward+ renderer, the uniformity requirement pretty much blocks that from being realized altogether, unless the developer emulates a tiled/froxel-based hardware renderer entirely in software, with tons of redundant draw calls. That is not a viable thing to do, nor would it be sane. Forward+ renderers have obviously been in use for years now, so it is WebGPU which is wrong.

With regards to throwing warnings, the question is whether a warning (a) signifies a shader that will break e.g. on mobile platforms, or (b) is a false alarm from an overzealous strictness check. Right now it seems like the latter, in which case the ability to silence it should have been there from the get-go. Either way, this should be clearly communicated.
I understand that. The issue is about reconvergence, and that it doesn't happen when most programmers expect it to. Sorry that this isn't explained in #3489, and that I didn't explain it above. Example:
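A minimal sketch of the kind of surprise being discussed (illustrative only; not the exact example from #3489 — bindings and thresholds are made up):

```wgsl
@group(0) @binding(0) var t : texture_2d<f32>;
@group(0) @binding(1) var s : sampler;

@fragment
fn fs(@builtin(position) pos : vec4f,
      @location(0) uv : vec2f) -> @location(0) vec4f {
  if pos.x < 100.0 {
    // Some invocations in a derivative quad may exit here.
    return vec4f(0.0);
  }
  // Many programmers expect the surviving invocations to have
  // "reconverged" by this point, but by the letter of the rules this
  // implicit-derivative sample now sits in non-uniform control flow,
  // so the uniformity analysis flags it.
  return textureSample(t, s, uv);
}
```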
In a previous discussion I gave an executable example showing possibly-surprising non-reconvergence.
Can you use textureSampleCompareLevel and give it level 0?
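For reference, the suggested rewrite might look like this (binding layout and names are illustrative). textureSampleCompareLevel always samples mip level 0 and computes no implicit derivatives, so it is exempt from the uniformity requirement:

```wgsl
@group(0) @binding(0) var shadow_map : texture_depth_2d;
@group(0) @binding(1) var shadow_sampler : sampler_comparison;

fn sample_shadow(uv : vec2f, depth_ref : f32) -> f32 {
  // Unlike textureSampleCompare, this variant needs no derivatives and
  // may therefore be called from non-uniform control flow.
  return textureSampleCompareLevel(shadow_map, shadow_sampler, uv, depth_ref);
}
```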
Perhaps you missed the part where I said we're debating exactly how to relax the rule. Rudeness and name-calling are not helping your case. You've overstayed your welcome.
All: It's fine to be frustrated, but let's not be rude, thanks! (This includes veiled insults.) I expect that we will be adding a local opt-out (allowing suppression of warnings) in #3644 or shortly after. Unfortunately for us, we're choosing to be (or at least strongly considering being) the bearer of bad news with regards to the minimal and strict guarantees of uniformity coming to us from our underlying APIs. We are still figuring out exactly what form this takes, and ask that you please bear with us (and give us feedback about what works and what doesn't) while we figure out what we think is the best way to surface this very tricky and surprising class of issues. As noted by @dneto0, this is an area of very active discussion right now.
A lot of the discussion seems to be conflating the example @dneto0 provided with something like the following (which I think may more closely match what's described in the original post):
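Something along these lines, perhaps (an illustrative sketch, not code from the thread): the branch condition comes from a flat-interpolated varying, so it is constant across each primitive, yet the analysis still treats the sample as sitting in non-uniform control flow:

```wgsl
struct Light { has_shadow : u32 }

@group(0) @binding(0) var shadow_map : texture_depth_2d;
@group(0) @binding(1) var shadow_sampler : sampler_comparison;
@group(0) @binding(2) var<storage, read> lights : array<Light>;

@fragment
fn fs(@location(0) @interpolate(flat) light_index : u32,
      @location(1) shadow_uv : vec3f) -> @location(0) vec4f {
  var shadow = 1.0;
  if lights[light_index].has_shadow != 0u {
    // Flagged by the uniformity analysis, even though light_index is
    // identical for every fragment of the instance.
    shadow = textureSampleCompare(shadow_map, shadow_sampler,
                                  shadow_uv.xy, shadow_uv.z);
  }
  return vec4f(shadow);
}
```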
Right now, the uniformity analysis cannot distinguish between the two examples, so it rejects both. That's an unfortunate state to be in and will hopefully change. Either way, it seems worthwhile for the messaging to distinguish between true positives (where the system would rightfully be the bearer of bad news) and false positives (where the code is fine, and the programmer just needs to invoke the opt-outs once they're ready).
WGSL 2022-12-13 Minutes
WGSL 2023-01-24 Minutes
I'm trying to implement a deferred renderer. As part of this, I figured I'd batch the lights by type, for efficiency, with one instance per light.
However, as some lights have shadows and some don't, this triggers a warning about non-uniform control flow and texture sampling. The light index is @interpolate(flat), but it is still apparently considered non-uniform. Nevertheless, the shader works fine on desktop (macOS); it just throws an ugly warning in the console that I can't suppress. I suspect the warning is there because the shader might break on tiled GPUs.
I'm however at a loss for how to solve this efficiently. Without push constants, the only way I can pass dynamic data to my shader is via a buffer. To avoid creating a unique buffer per draw call, I can use dynamic buffer offsets. But, buffer bindings must be aligned to 256 bytes.
This means that not only do I have to split a perfectly batchable call into one call per instance, but, in order to pass a mere uint32 to a light so it can index into a shared light buffer, I need to allocate 256 bytes per draw call. This seems ridiculous.
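To put numbers on that, here is a quick sketch of the padding arithmetic (the helper names are mine, not WebGPU API; 256 is the spec's default minUniformBufferOffsetAlignment limit):

```javascript
// Each draw call gets one 256-byte-aligned slot in a shared uniform
// buffer, even though the payload is a single 4-byte u32 light index.
const ALIGNMENT = 256;     // WebGPU default minUniformBufferOffsetAlignment
const PAYLOAD_BYTES = 4;   // one u32 per draw

// Hypothetical helpers for sizing and addressing the shared buffer.
function slotOffset(drawIndex) {
  return drawIndex * ALIGNMENT; // dynamic offset passed to setBindGroup
}
function bufferSize(drawCount) {
  return drawCount * ALIGNMENT;
}

const draws = 100;
console.log(bufferSize(draws));     // bytes allocated: 25600
console.log(draws * PAYLOAD_BYTES); // bytes actually used: 400
console.log(slotOffset(3));         // offset of the 4th slot: 768
```

In other words, 64x the memory of the actual payload, purely to satisfy the alignment rule.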
At first, I thought maybe I could use firstInstanceIndex as a push constant, but it would also have to be passed from vertex to fragment, and thus be non-uniform. It is also an optional feature, so I'm not even sure it's commonly available.
TLDR: