-
Notifications
You must be signed in to change notification settings - Fork 13.4k
[flang][OpenMP] Generalize fixing alloca
IP pre-condition for private
ops
#122866
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ate` ops This PR generalizes a fix that we implemented previously for `omp.wsloop`s. The fix makes sure the pre-condtion that the `alloca` block has a single successor whenever we inline delayed privatizers is respected. I simply moved the fix to `allocatePrivateVars` so that it kicks in for any op not just `omp.wsloop`. This handles a bug uncovered by [a test](https://github.com/OpenMP-Validation-and-Verification/OpenMP_VV/blob/master/tests/4.5/target_simd/test_target_simd_safelen.F90) in the OpenMP_VV test suite.
@llvm/pr-subscribers-mlir-openmp @llvm/pr-subscribers-flang-openmp Author: Kareem Ergawy (ergawy) ChangesThis PR generalizes a fix that we implemented previously for This handles a bug uncovered by a test in the OpenMP_VV test suite. Full diff: https://github.com/llvm/llvm-project/pull/122866.diff 2 Files Affected:
diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
index d6112fa9af1182..6ffe169038c54e 100644
--- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
@@ -1340,13 +1340,23 @@ allocatePrivateVars(llvm::IRBuilderBase &builder,
llvm::SmallVectorImpl<llvm::Value *> &llvmPrivateVars,
const llvm::OpenMPIRBuilder::InsertPointTy &allocaIP,
llvm::DenseMap<Value, Value> *mappedPrivateVars = nullptr) {
- llvm::IRBuilderBase::InsertPointGuard guard(builder);
// Allocate private vars
llvm::BranchInst *allocaTerminator =
llvm::cast<llvm::BranchInst>(allocaIP.getBlock()->getTerminator());
+ if (allocaTerminator->getNumSuccessors() != 1) {
+ splitBB(llvm::OpenMPIRBuilder::InsertPointTy(
+ allocaIP.getBlock(), allocaTerminator->getIterator()),
+ true, "omp.region.after_alloca");
+ }
+
+ llvm::IRBuilderBase::InsertPointGuard guard(builder);
+ // Update the allocaTerminator in case the alloca block was split above.
+ allocaTerminator =
+ llvm::cast<llvm::BranchInst>(allocaIP.getBlock()->getTerminator());
builder.SetInsertPoint(allocaTerminator);
assert(allocaTerminator->getNumSuccessors() == 1 &&
"This is an unconditional branch created by OpenMPIRBuilder");
+
llvm::BasicBlock *afterAllocas = allocaTerminator->getSuccessor(0);
// FIXME: Some of the allocation regions do more than just allocating.
@@ -1876,11 +1886,6 @@ convertOmpWsloop(Operation &opInst, llvm::IRBuilderBase &builder,
SmallVector<llvm::Value *> privateReductionVariables(
wsloopOp.getNumReductionVars());
- splitBB(llvm::OpenMPIRBuilder::InsertPointTy(
- allocaIP.getBlock(),
- allocaIP.getBlock()->getTerminator()->getIterator()),
- true, "omp.region.after_alloca");
-
llvm::Expected<llvm::BasicBlock *> afterAllocas = allocatePrivateVars(
builder, moduleTranslation, privateBlockArgs, privateDecls,
mlirPrivateVars, llvmPrivateVars, allocaIP);
diff --git a/mlir/test/Target/LLVMIR/openmp-target-simd-on_device.mlir b/mlir/test/Target/LLVMIR/openmp-target-simd-on_device.mlir
new file mode 100644
index 00000000000000..0ce90578ea9d62
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/openmp-target-simd-on_device.mlir
@@ -0,0 +1,34 @@
+// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s
+
+module attributes {omp.is_target_device = true} {
+ omp.private {type = private} @simd_privatizer : !llvm.ptr alloc {
+ ^bb0(%arg0: !llvm.ptr):
+ omp.yield(%arg0 : !llvm.ptr)
+ }
+
+ llvm.func @test_target_simd() {
+ omp.target {
+ %5 = llvm.mlir.constant(1 : i32) : i32
+ %x = llvm.alloca %5 x i32 {bindc_name = "x"} : (i32) -> !llvm.ptr
+ omp.simd private(@simd_privatizer %x -> %arg1 : !llvm.ptr) {
+ omp.loop_nest (%arg2) : i32 = (%5) to (%5) inclusive step (%5) {
+ omp.yield
+ }
+ }
+ omp.terminator
+ }
+ llvm.return
+ }
+
+}
+
+// CHECK-LABEL: define {{.*}} @__omp_offloading_{{.*}}_test_target_simd_{{.*}}
+
+// CHECK: %[[INT:.*]] = alloca i32, align 4
+// CHECK: br label %[[LATE_ALLOC_BB:.*]]
+
+// CHECK: [[LATE_ALLOC_BB]]:
+// CHECK: br label %[[AFTER_ALLOC_BB:.*]]
+
+// CHECK: [[AFTER_ALLOC_BB]]:
+// CHECK: br i1 %{{.*}}, label %{{.*}}, label %{{.*}}
|
@llvm/pr-subscribers-mlir Author: Kareem Ergawy (ergawy) ChangesThis PR generalizes a fix that we implemented previously for This handles a bug uncovered by a test in the OpenMP_VV test suite. Full diff: https://github.com/llvm/llvm-project/pull/122866.diff 2 Files Affected:
diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
index d6112fa9af1182..6ffe169038c54e 100644
--- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
@@ -1340,13 +1340,23 @@ allocatePrivateVars(llvm::IRBuilderBase &builder,
llvm::SmallVectorImpl<llvm::Value *> &llvmPrivateVars,
const llvm::OpenMPIRBuilder::InsertPointTy &allocaIP,
llvm::DenseMap<Value, Value> *mappedPrivateVars = nullptr) {
- llvm::IRBuilderBase::InsertPointGuard guard(builder);
// Allocate private vars
llvm::BranchInst *allocaTerminator =
llvm::cast<llvm::BranchInst>(allocaIP.getBlock()->getTerminator());
+ if (allocaTerminator->getNumSuccessors() != 1) {
+ splitBB(llvm::OpenMPIRBuilder::InsertPointTy(
+ allocaIP.getBlock(), allocaTerminator->getIterator()),
+ true, "omp.region.after_alloca");
+ }
+
+ llvm::IRBuilderBase::InsertPointGuard guard(builder);
+ // Update the allocaTerminator in case the alloca block was split above.
+ allocaTerminator =
+ llvm::cast<llvm::BranchInst>(allocaIP.getBlock()->getTerminator());
builder.SetInsertPoint(allocaTerminator);
assert(allocaTerminator->getNumSuccessors() == 1 &&
"This is an unconditional branch created by OpenMPIRBuilder");
+
llvm::BasicBlock *afterAllocas = allocaTerminator->getSuccessor(0);
// FIXME: Some of the allocation regions do more than just allocating.
@@ -1876,11 +1886,6 @@ convertOmpWsloop(Operation &opInst, llvm::IRBuilderBase &builder,
SmallVector<llvm::Value *> privateReductionVariables(
wsloopOp.getNumReductionVars());
- splitBB(llvm::OpenMPIRBuilder::InsertPointTy(
- allocaIP.getBlock(),
- allocaIP.getBlock()->getTerminator()->getIterator()),
- true, "omp.region.after_alloca");
-
llvm::Expected<llvm::BasicBlock *> afterAllocas = allocatePrivateVars(
builder, moduleTranslation, privateBlockArgs, privateDecls,
mlirPrivateVars, llvmPrivateVars, allocaIP);
diff --git a/mlir/test/Target/LLVMIR/openmp-target-simd-on_device.mlir b/mlir/test/Target/LLVMIR/openmp-target-simd-on_device.mlir
new file mode 100644
index 00000000000000..0ce90578ea9d62
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/openmp-target-simd-on_device.mlir
@@ -0,0 +1,34 @@
+// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s
+
+module attributes {omp.is_target_device = true} {
+ omp.private {type = private} @simd_privatizer : !llvm.ptr alloc {
+ ^bb0(%arg0: !llvm.ptr):
+ omp.yield(%arg0 : !llvm.ptr)
+ }
+
+ llvm.func @test_target_simd() {
+ omp.target {
+ %5 = llvm.mlir.constant(1 : i32) : i32
+ %x = llvm.alloca %5 x i32 {bindc_name = "x"} : (i32) -> !llvm.ptr
+ omp.simd private(@simd_privatizer %x -> %arg1 : !llvm.ptr) {
+ omp.loop_nest (%arg2) : i32 = (%5) to (%5) inclusive step (%5) {
+ omp.yield
+ }
+ }
+ omp.terminator
+ }
+ llvm.return
+ }
+
+}
+
+// CHECK-LABEL: define {{.*}} @__omp_offloading_{{.*}}_test_target_simd_{{.*}}
+
+// CHECK: %[[INT:.*]] = alloca i32, align 4
+// CHECK: br label %[[LATE_ALLOC_BB:.*]]
+
+// CHECK: [[LATE_ALLOC_BB]]:
+// CHECK: br label %[[AFTER_ALLOC_BB:.*]]
+
+// CHECK: [[AFTER_ALLOC_BB]]:
+// CHECK: br i1 %{{.*}}, label %{{.*}}, label %{{.*}}
|
Seems like the failed simd tests now pass according to @ajarmusch. Can you guys please ✅ if you do not have any objections to the PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This passes my tests and the changes look good. Thanks for the fix
Linaro reported a regression due to this PR: https://linaro.atlassian.net/browse/LLVM-1520. I am taking a look into this ... (cc @luporl) |
…PrivateVars` While llvm#122866 fixed some issues, it introduced a regression in worksharing loops. The new bug comes from the fact that we now conditionally created the `after_alloca` block based on the number of predecessors of the alloca insertion point. This is unneccessary, we can just alway create the block. If we do this, we respect the post condtions expected after calling `allocatePrivateVars` (i.e. that the `afterAlloca` block has a single predecessor.
A possible fix for the Fujistu regression: #123168. |
…atePrivateVars` (#123168) While #122866 fixed some issues, it introduced a regression in worksharing loops. The new bug comes from the fact that we now conditionally created the `after_alloca` block based on the number of sucessors of the alloca insertion point. This is unneccessary, we can just alway create the block. If we do this, we respect the post condtions expected after calling `allocatePrivateVars` (i.e. that the `afterAlloca` block has a single predecessor.
…k in `allocatePrivateVars` (#123168) While llvm/llvm-project#122866 fixed some issues, it introduced a regression in worksharing loops. The new bug comes from the fact that we now conditionally created the `after_alloca` block based on the number of sucessors of the alloca insertion point. This is unneccessary, we can just alway create the block. If we do this, we respect the post condtions expected after calling `allocatePrivateVars` (i.e. that the `afterAlloca` block has a single predecessor.
This PR generalizes a fix that we implemented previously for
omp.wsloop
s. The fix makes sure the pre-condtion that thealloca
block has a single successor whenever we inline delayed privatizers is respected. I simply moved the fix toallocatePrivateVars
so that it kicks in for any op not justomp.wsloop
.This handles a bug uncovered by a test in the OpenMP_VV test suite.