Description
This is forked from a comment on #2.
The problem occurs when doing a fairly simple selection between two loads from the Private storage class (__constant in OpenCL C). The inst-combine optimization changes
- a selection between two loads
into
- a selection between two pointers, followed by a load from the selected pointer.
In the current compilation scheme the two pointers are in the Private storage class, but the compiler incorrectly types the result of the selection as a pointer to the StorageBuffer storage class.
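To make the rewrite concrete, here is a minimal sketch of the two forms in plain C (not OpenCL C, so it can be run on a host). The static const arrays stand in for __constant tables, and the function names select_of_loads and load_of_select are made up for illustration. The asserts are only bounds checks for this host-side sketch. It shows the shape of the transformation, not the compiler's actual IR.

```c
#include <assert.h>

/* Stand-ins for the OpenCL C __constant tables (assumption: plain C analogue). */
static const float kFirst[3]  = {1.0f, 2.0f, 3.0f};
static const float kSecond[3] = {10.0f, 11.0f, 12.0f};

/* Form written in the source: load from both tables, then select a value. */
float select_of_loads(int c, int i) {
    assert(i >= 0 && i < 3);
    float a = kFirst[i];
    float b = kSecond[i];
    return c == 0 ? a : b;
}

/* Form after the rewrite: select between the two table pointers,
   then index and load once through the selected pointer. */
float load_of_select(int c, int i) {
    assert(i >= 0 && i < 3);
    const float *table = (c == 0) ? kFirst : kSecond;
    return table[i];
}
```

Both functions return the same value for every valid c and i. The second form is the one that needs a pointer-typed select, which is where the storage class goes wrong.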
The symptom appears when compiling code like:
__constant float kFirst[3] = {1.0f, 2.0f, 3.0f};
__constant float kSecond[3] = {10.0f, 11.0f, 12.0f};
kernel void foo(global float* A, int c, int i) {
  *A = c == 0 ? kFirst[i] : kSecond[i];
}
The compiler produces this kind of SPIR-V:
%33 = OpVariable %_ptr_Private__arr_float_uint_3 Private %23
%34 = OpVariable %_ptr_Private__arr_float_uint_3 Private %27
%35 = OpVariable %_ptr_StorageBuffer__struct_4 StorageBuffer
%36 = OpVariable %_ptr_StorageBuffer__struct_7 StorageBuffer
%37 = OpVariable %_ptr_StorageBuffer__struct_7 StorageBuffer
%38 = OpFunction %void None %11
%39 = OpLabel
%40 = OpAccessChain %_ptr_StorageBuffer_float %35 %uint_0 %uint_0
%41 = OpAccessChain %_ptr_StorageBuffer_uint %36 %uint_0
%42 = OpLoad %uint %41
%43 = OpAccessChain %_ptr_StorageBuffer_uint %37 %uint_0
%44 = OpLoad %uint %43
%45 = OpIEqual %bool %42 %uint_0
%46 = OpSelect %_ptr_StorageBuffer__arr_float_uint_3 %45 %33 %34
%47 = OpAccessChain %_ptr_StorageBuffer_float %46 %44
%48 = OpLoad %float %47
OpStore %40 %48
OpReturn
OpFunctionEnd
Note the OpSelect at %46. Its operands are pointers into Private, but its result is a pointer to StorageBuffer. That's invalid. We need the initializers for %33 and %34, but even with VariablePointers we can't select between two different pointer-to-Private values, because that capability only covers pointers into the StorageBuffer and Workgroup storage classes.
See also other cases from comments:
- OpPtrAccessChain into Private storage class sometimes generated when indexing into OpenCL module-scope constant #2 (comment)
See workarounds:
- rewrite the loads as function calls, to isolate them from instcombine: OpPtrAccessChain into Private storage class sometimes generated when indexing into OpenCL module-scope constant #2 (comment)
- rewrite the tables as two-dimensional: OpPtrAccessChain into Private storage class sometimes generated when indexing into OpenCL module-scope constant #2 (comment) (This is less general than wrapping in a function)
Another effective workaround is to use -O0 to disable the inst-combine optimization. But that's rather drastic.
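For reference, the two source-level workarounds listed above might look roughly like the following. This is a plain C sketch, not the exact code from the linked comments. The helper names first_at, second_at, pick_via_calls and pick_via_table are hypothetical, as is the kTable layout. In real OpenCL C the tables would be __constant and the result would be stored through the kernel's global pointer. Whether the helpers need to be kept from inlining for the workaround to hold is not stated here, so treat that as an open assumption.

```c
#include <assert.h>

/* Stand-ins for the OpenCL C __constant tables (assumption: plain C analogue). */
static const float kFirst[3]  = {1.0f, 2.0f, 3.0f};
static const float kSecond[3] = {10.0f, 11.0f, 12.0f};

/* Workaround 1: wrap each load in its own function so the select
   operates on loaded values rather than on table pointers. */
static float first_at(int i)  { assert(i >= 0 && i < 3); return kFirst[i]; }
static float second_at(int i) { assert(i >= 0 && i < 3); return kSecond[i]; }

float pick_via_calls(int c, int i) {
    return c == 0 ? first_at(i) : second_at(i);
}

/* Workaround 2: merge the tables into one two-dimensional table, so the
   condition becomes a row index into a single variable. */
static const float kTable[2][3] = {
    {1.0f, 2.0f, 3.0f},
    {10.0f, 11.0f, 12.0f},
};

float pick_via_table(int c, int i) {
    assert(i >= 0 && i < 3);
    return kTable[c == 0 ? 0 : 1][i];
}
```

The two-dimensional form only works when both tables have the same element type and length, which is one reason it is less general than the function wrapper.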