Closed
Description
For reasons I don't understand, small changes to my kernels can lead the compiler to pick a different SIMD width (8, 16 or 32).
For some reason I can't explain, some of my kernels that use shared memory and barriers run several times faster when the selected SIMD width is 8.
I'd therefore like a way to force a SIMD width of 8. (Ideally, the compiler would also make better choices on its own.)
For example, maybe __attribute__((vec_type_hint(float8))) could trigger SIMD-8?
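
To make the request concrete, here is a rough sketch of the kind of kernel I mean, written as if the hint were honored as a SIMD-width request. The kernel name, buffer names and the work-group size of 64 are made up for illustration. vec_type_hint is a standard OpenCL C attribute, but as far as I know it is only an optimization hint today, so treating float8 as "compile for SIMD-8" is the proposed behavior, not something the compiler currently guarantees.

```c
// Hypothetical kernel: the name, arguments and sizes are illustrative only.
// Proposed behavior (an assumption, not current behavior): the float8 hint
// would make the compiler select SIMD-8 for this kernel.
__kernel __attribute__((vec_type_hint(float8)))
__attribute__((reqd_work_group_size(64, 1, 1)))
void block_sum(__global const float *in, __global float *out)
{
    // Shared (local) memory tile, one element per work-item.
    __local float tile[64];
    const size_t lid = get_local_id(0);

    tile[lid] = in[get_global_id(0)];
    barrier(CLK_LOCAL_MEM_FENCE);

    // Tree reduction in local memory, with one barrier per step.
    for (size_t stride = 32; stride > 0; stride >>= 1) {
        if (lid < stride)
            tile[lid] += tile[lid + stride];
        barrier(CLK_LOCAL_MEM_FENCE);
    }

    // The first work-item writes the sum for this work-group.
    if (lid == 0)
        out[get_group_id(0)] = tile[0];
}
```

Any other spelling would be fine as well; the point is only to have a per-kernel way to request the width from the source.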