Skip to content

Conversation

@wenju-he
Copy link
Contributor

@wenju-he wenju-he commented Mar 4, 2024

In translation from __spirv_AtomicCompareExchange to OpenCL builtin atomic_compare_exchange_strong_explicit, a new alloca expected is created and read/written in the OpenCL builtin.
The OpenCL builtin call can't have tail marker since the marker requires that callee doesn't access alloca from the caller. Otherwise llvm alias analysis deduces that the alloca isn't accessed by the call, and instcombine pass replaces the load from the alloca after the call with the value stored to the alloca before the call.

…it call

In translation from  __spirv_AtomicCompareExchange to OpenCL builtin
atomic_compare_exchange_strong_explicit, a new alloca `expected` is
created and read/written in the OpenCL builtin.
The OpenCL builtin call can't have tail marker since the marker requires
that callee doesn't access alloca from the caller.
Otherwise llvm alias analysis deduces that the alloca isn't accessed by
the call, and instcombine pass replaces the load from the alloca after
the call with the value stored to the alloca before the call.
@wenju-he
Copy link
Contributor Author

wenju-he commented Mar 4, 2024

Background of the bug:

instcombine command:

/export/users/wenjuhe/llvm/llvm-project/build-slibs/bin/opt -load-pass-plugin=/export/users/wenjuhe/llvm/llvm-project/build-slibs/lib/libLLVMSPIRVLib.so -passes=spirv-to-ocl20,instcombine /export/users/wenjuhe/llvm/llvm-project/llvm/projects/SPIRV-LLVM-Translator/test/tail-call-atomic-cmp-exchg.ll -S -o -

instcombine output before this PR:
output IR is wrong. atomic_compare_exchange_strong_explicit writes to %expected but the new value in %expected isn't used any more.

%"class.sycl::_V1::id" = type { %"class.sycl::_V1::detail::array" }
%"class.sycl::_V1::detail::array" = type { [1 x i64] }

define spir_kernel void @test(ptr addrspace(1) noundef align 8 %_arg_data_accessor, ptr noundef byval(%"class.sycl::_V1::id") align 8 %_arg_data_accessor4) {
entry:
  %expected = alloca i64, align 8
  %0 = load i64, ptr %_arg_data_accessor4, align 8
  %add.ptr = getelementptr inbounds ptr addrspace(4), ptr addrspace(1) %_arg_data_accessor, i64 %0
  %arrayidx.ascast = addrspacecast ptr addrspace(1) %add.ptr to ptr addrspace(4)
  br label %do.body

do.body:                                          ; preds = %do.body, %entry
  %call1 = tail call spir_func noundef i64 @_Z20atomic_load_explicitPU3AS4VU7_Atomicl12memory_order12memory_scope(ptr addrspace(4) %arrayidx.ascast, i32 5, i32 1)
  %1 = inttoptr i64 %call1 to ptr addrspace(4)
  %add.ptr.i = getelementptr inbounds i8, ptr addrspace(4) %1, i64 4
  %2 = ptrtoint ptr addrspace(4) %add.ptr.i to i64
  store i64 %call1, ptr %expected, align 8
  %expected.as = addrspacecast ptr %expected to ptr addrspace(4)
  %3 = tail call spir_func noundef i1 @_Z39atomic_compare_exchange_strong_explicitPU3AS4VU7_AtomiclPU3AS4ll12memory_orderS4_12memory_scope(ptr addrspace(4) %arrayidx.ascast, ptr addrspace(4) %expected.as, i64 %2, i32 5, i32 5, i32 1)
  br i1 true, label %exit, label %do.body

exit:                                             ; preds = %do.body
  ret void
}

declare spir_func noundef i64 @_Z20atomic_load_explicitPU3AS4VU7_Atomicl12memory_order12memory_scope(ptr addrspace(4), i32, i32)

declare spir_func noundef i1 @_Z39atomic_compare_exchange_strong_explicitPU3AS4VU7_AtomiclPU3AS4ll12memory_orderS4_12memory_scope(ptr addrspace(4), ptr addrspace(4), i64, i32, i32, i32)

instcombine output after this PR:
output IR is correct.

%"class.sycl::_V1::id" = type { %"class.sycl::_V1::detail::array" }
%"class.sycl::_V1::detail::array" = type { [1 x i64] }

define spir_kernel void @test(ptr addrspace(1) noundef align 8 %_arg_data_accessor, ptr noundef byval(%"class.sycl::_V1::id") align 8 %_arg_data_accessor4) {
entry:
  %expected = alloca i64, align 8
  %0 = load i64, ptr %_arg_data_accessor4, align 8
  %add.ptr = getelementptr inbounds ptr addrspace(4), ptr addrspace(1) %_arg_data_accessor, i64 %0
  %arrayidx.ascast = addrspacecast ptr addrspace(1) %add.ptr to ptr addrspace(4)
  br label %do.body

do.body:                                          ; preds = %do.body, %entry
  %call1 = tail call spir_func noundef i64 @_Z20atomic_load_explicitPU3AS4VU7_Atomicl12memory_order12memory_scope(ptr addrspace(4) %arrayidx.ascast, i32 5, i32 1)
  %1 = inttoptr i64 %call1 to ptr addrspace(4)
  %add.ptr.i = getelementptr inbounds i8, ptr addrspace(4) %1, i64 4
  %2 = ptrtoint ptr addrspace(4) %add.ptr.i to i64
  store i64 %call1, ptr %expected, align 8
  %expected.as = addrspacecast ptr %expected to ptr addrspace(4)
  %3 = call spir_func noundef i1 @_Z39atomic_compare_exchange_strong_explicitPU3AS4VU7_AtomiclPU3AS4ll12memory_orderS4_12memory_scope(ptr addrspace(4) %arrayidx.ascast, ptr addrspace(4) %expected.as, i64 %2, i32 5, i32 5, i32 1)
  %call2 = load i64, ptr addrspace(4) %expected.as, align 8
  %4 = icmp eq i64 %call2, %call1
  br i1 %4, label %exit, label %do.body

exit:                                             ; preds = %do.body
  ret void
}

declare spir_func noundef i64 @_Z20atomic_load_explicitPU3AS4VU7_Atomicl12memory_order12memory_scope(ptr addrspace(4), i32, i32)

declare spir_func noundef i1 @_Z39atomic_compare_exchange_strong_explicitPU3AS4VU7_AtomiclPU3AS4ll12memory_orderS4_12memory_scope(ptr addrspace(4), ptr addrspace(4), i64, i32, i32, i32)

@wenju-he
Copy link
Contributor Author

wenju-he commented Mar 4, 2024

@MrSidims could you please review?
IIUC tail marker can't be represented in SPIRV file at the moment, so no transcoding test is added.

@MrSidims MrSidims requested a review from svenvh March 5, 2024 12:09
@MrSidims MrSidims merged commit 1ff4a76 into KhronosGroup:main Mar 5, 2024
@wenju-he wenju-he deleted the tail-call_new-alloca branch March 6, 2024 00:24
@svenvh svenvh mentioned this pull request Apr 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants