-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[flang][OpenMP] Fix unstrucuted control flow inside a collapsed wsloop #77329
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[flang][OpenMP] Fix unstrucuted control flow inside a collapsed wsloop #77329
Conversation
When unstrucuted control flow is used inside a collapsed wsloop, the inner-most (collapsed) loop statements are skipped (loop control is handled by the wsloop operation). However, blocks are still created without knowledge of the loop collapse. Without any loop control code generated for the collapsed loop, no branch would be added at the end of the block for the end-loop statement, resulting in invalid IR (an unterminated block) and for incorrect control flow. The correct behaviour would be to branch unconditionally to the next out-most end-loop statement. This block is already successfully recognised as the successor block. This patch adds an additional condition to add a branch to this successor block. I believe this should only match in cases which previously would have produced an unterminated block and so this should not effect any already working code construct. One alternative solution that was considered was to handle this in the previous else if case where successor == eval.constrolSuccessor. This condition does not match here because the control successor is the beginning of the loop construct, whereas the successor is the next outer-most loop construct. Removing this condition led to numerous test failures. This passes llvm-testsuite and can produce correct results from spec2017 **rate** (no openmp). Please advise if other testing is necessary.
@llvm/pr-subscribers-flang-fir-hlfir Author: Tom Eccles (tblah) ChangesWhen unstrucuted control flow is used inside a collapsed wsloop, the inner-most (collapsed) loop statements are skipped (loop control is handled by the wsloop operation). However, blocks are still created without knowledge of the loop collapse. Without any loop control code generated for the collapsed loop, no branch would be added at the end of the block for the end-loop statement, resulting in invalid IR (an unterminated block) and for incorrect control flow. The correct behaviour would be to branch unconditionally to the next out-most end-loop statement. This block is already successfully recognised as the successor block. This patch adds an additional condition to add a branch to this successor block. I believe this should only match in cases which previously would have produced an unterminated block and so this should not effect any already working code construct. One alternative solution that was considered was to handle this in the previous else if case where successor == eval.constrolSuccessor. This condition does not match here because the control successor is the beginning of the loop construct, whereas the successor is the next outer-most loop construct. Removing this condition led to numerous test failures. This passes llvm-testsuite and can produce correct results from spec2017 rate (no openmp). Please advise if other testing is necessary. Full diff: https://github.com/llvm/llvm-project/pull/77329.diff 2 Files Affected:
diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp
index 2bceee09b4f0f2..7c33a9c9d6d48e 100644
--- a/flang/lib/Lower/Bridge.cpp
+++ b/flang/lib/Lower/Bridge.cpp
@@ -4159,6 +4159,12 @@ class FirConverter : public Fortran::lower::AbstractConverter {
successor == eval.controlSuccessor)
// Exit from a degenerate, empty construct block.
genBranch(eval.parentConstruct->constructExit->block);
+ else if (successor->block)
+ // Catch case with omp collapse(n) when dealing with an end-do statement
+ // which has been collapsed into an outer loop and unstructured control
+ // flow is used. Create a fall through branch to the successor (the
+ // outer loop). This is not known to affect other cases.
+ genBranch(successor->block);
}
}
diff --git a/flang/test/Lower/OpenMP/wsloop-unstructured.f90 b/flang/test/Lower/OpenMP/wsloop-unstructured.f90
new file mode 100644
index 00000000000000..310768c743a6a1
--- /dev/null
+++ b/flang/test/Lower/OpenMP/wsloop-unstructured.f90
@@ -0,0 +1,63 @@
+! RUN: bbc -emit-hlfir -fopenmp -o - %s | FileCheck %s
+
+subroutine sub(imax, jmax, x, y)
+ integer, intent(in) :: imax, jmax
+ real, intent(in), dimension(1:imax, 1:jmax) :: x, y
+
+ integer :: i, j, ii
+
+ ! collapse(2) is needed to reproduce the issue
+ !$omp parallel do collapse(2)
+ do j = 1, jmax
+ do i = 1, imax
+ do ii = 1, imax ! note that this loop is not collapsed
+ if (x(i,j) < y(ii,j)) then
+ ! exit needed to force unstructured control flow
+ exit
+ endif
+ enddo
+ enddo
+ enddo
+end subroutine sub
+
+! this is testing that we don't crash generating code for this: in particular
+! that all blocks are terminated
+
+! CHECK-LABEL: func.func @_QPsub(
+! CHECK-SAME: %[[VAL_0:.*]]: !fir.ref<i32> {fir.bindc_name = "imax"},
+! CHECK-SAME: %[[VAL_1:.*]]: !fir.ref<i32> {fir.bindc_name = "jmax"},
+! CHECK-SAME: %[[VAL_2:.*]]: !fir.ref<!fir.array<?x?xf32>> {fir.bindc_name = "x"},
+! CHECK-SAME: %[[VAL_3:.*]]: !fir.ref<!fir.array<?x?xf32>> {fir.bindc_name = "y"}) {
+! [...]
+! CHECK: omp.wsloop for (%[[VAL_53:.*]], %[[VAL_54:.*]]) : i32 = ({{.*}}) to ({{.*}}) inclusive step ({{.*}}) {
+! [...]
+! CHECK: cf.br ^bb1
+! CHECK: ^bb1:
+! CHECK: cf.br ^bb2
+! CHECK: ^bb2:
+! [...]
+! CHECK: cf.br ^bb3
+! CHECK: ^bb3:
+! [...]
+! CHECK: %[[VAL_63:.*]] = arith.cmpi sgt, %{{.*}}, %{{.*}} : i32
+! CHECK: cf.cond_br %[[VAL_63]], ^bb4, ^bb7
+! CHECK: ^bb4:
+! [...]
+! CHECK: %[[VAL_76:.*]] = arith.cmpf olt, %{{.*}}, %{{.*}} fastmath<contract> : f32
+! CHECK: cf.cond_br %[[VAL_76]], ^bb5, ^bb6
+! CHECK: ^bb5:
+! CHECK: cf.br ^bb7
+! CHECK: ^bb6:
+! [...]
+! CHECK: cf.br ^bb3
+! CHECK: ^bb7:
+! CHECK: cf.br ^bb8
+! CHECK: ^bb8:
+! CHECK: omp.yield
+! CHECK: }
+! CHECK: omp.terminator
+! CHECK: }
+! CHECK: cf.br ^bb1
+! CHECK: ^bb1:
+! CHECK: return
+! CHECK: }
|
@llvm/pr-subscribers-flang-openmp Author: Tom Eccles (tblah) ChangesWhen unstrucuted control flow is used inside a collapsed wsloop, the inner-most (collapsed) loop statements are skipped (loop control is handled by the wsloop operation). However, blocks are still created without knowledge of the loop collapse. Without any loop control code generated for the collapsed loop, no branch would be added at the end of the block for the end-loop statement, resulting in invalid IR (an unterminated block) and for incorrect control flow. The correct behaviour would be to branch unconditionally to the next out-most end-loop statement. This block is already successfully recognised as the successor block. This patch adds an additional condition to add a branch to this successor block. I believe this should only match in cases which previously would have produced an unterminated block and so this should not effect any already working code construct. One alternative solution that was considered was to handle this in the previous else if case where successor == eval.constrolSuccessor. This condition does not match here because the control successor is the beginning of the loop construct, whereas the successor is the next outer-most loop construct. Removing this condition led to numerous test failures. This passes llvm-testsuite and can produce correct results from spec2017 rate (no openmp). Please advise if other testing is necessary. Full diff: https://github.com/llvm/llvm-project/pull/77329.diff 2 Files Affected:
diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp
index 2bceee09b4f0f2..7c33a9c9d6d48e 100644
--- a/flang/lib/Lower/Bridge.cpp
+++ b/flang/lib/Lower/Bridge.cpp
@@ -4159,6 +4159,12 @@ class FirConverter : public Fortran::lower::AbstractConverter {
successor == eval.controlSuccessor)
// Exit from a degenerate, empty construct block.
genBranch(eval.parentConstruct->constructExit->block);
+ else if (successor->block)
+ // Catch case with omp collapse(n) when dealing with an end-do statement
+ // which has been collapsed into an outer loop and unstructured control
+ // flow is used. Create a fall through branch to the successor (the
+ // outer loop). This is not known to affect other cases.
+ genBranch(successor->block);
}
}
diff --git a/flang/test/Lower/OpenMP/wsloop-unstructured.f90 b/flang/test/Lower/OpenMP/wsloop-unstructured.f90
new file mode 100644
index 00000000000000..310768c743a6a1
--- /dev/null
+++ b/flang/test/Lower/OpenMP/wsloop-unstructured.f90
@@ -0,0 +1,63 @@
+! RUN: bbc -emit-hlfir -fopenmp -o - %s | FileCheck %s
+
+subroutine sub(imax, jmax, x, y)
+ integer, intent(in) :: imax, jmax
+ real, intent(in), dimension(1:imax, 1:jmax) :: x, y
+
+ integer :: i, j, ii
+
+ ! collapse(2) is needed to reproduce the issue
+ !$omp parallel do collapse(2)
+ do j = 1, jmax
+ do i = 1, imax
+ do ii = 1, imax ! note that this loop is not collapsed
+ if (x(i,j) < y(ii,j)) then
+ ! exit needed to force unstructured control flow
+ exit
+ endif
+ enddo
+ enddo
+ enddo
+end subroutine sub
+
+! this is testing that we don't crash generating code for this: in particular
+! that all blocks are terminated
+
+! CHECK-LABEL: func.func @_QPsub(
+! CHECK-SAME: %[[VAL_0:.*]]: !fir.ref<i32> {fir.bindc_name = "imax"},
+! CHECK-SAME: %[[VAL_1:.*]]: !fir.ref<i32> {fir.bindc_name = "jmax"},
+! CHECK-SAME: %[[VAL_2:.*]]: !fir.ref<!fir.array<?x?xf32>> {fir.bindc_name = "x"},
+! CHECK-SAME: %[[VAL_3:.*]]: !fir.ref<!fir.array<?x?xf32>> {fir.bindc_name = "y"}) {
+! [...]
+! CHECK: omp.wsloop for (%[[VAL_53:.*]], %[[VAL_54:.*]]) : i32 = ({{.*}}) to ({{.*}}) inclusive step ({{.*}}) {
+! [...]
+! CHECK: cf.br ^bb1
+! CHECK: ^bb1:
+! CHECK: cf.br ^bb2
+! CHECK: ^bb2:
+! [...]
+! CHECK: cf.br ^bb3
+! CHECK: ^bb3:
+! [...]
+! CHECK: %[[VAL_63:.*]] = arith.cmpi sgt, %{{.*}}, %{{.*}} : i32
+! CHECK: cf.cond_br %[[VAL_63]], ^bb4, ^bb7
+! CHECK: ^bb4:
+! [...]
+! CHECK: %[[VAL_76:.*]] = arith.cmpf olt, %{{.*}}, %{{.*}} fastmath<contract> : f32
+! CHECK: cf.cond_br %[[VAL_76]], ^bb5, ^bb6
+! CHECK: ^bb5:
+! CHECK: cf.br ^bb7
+! CHECK: ^bb6:
+! [...]
+! CHECK: cf.br ^bb3
+! CHECK: ^bb7:
+! CHECK: cf.br ^bb8
+! CHECK: ^bb8:
+! CHECK: omp.yield
+! CHECK: }
+! CHECK: omp.terminator
+! CHECK: }
+! CHECK: cf.br ^bb1
+! CHECK: ^bb1:
+! CHECK: return
+! CHECK: }
|
[I'm on vacation until 1.16; responses may be slow] Have you tried other code variations? An exit from or to an intermediate or outer loop in addition to an exit from the innermost loop? An exit that is not at the end of the loop body? Unstructured branches other than exit, such as a simple goto? Since this is only needed for unstructured code, please add that to the condition, as in the prior degenerate/empty construct case. That also serves to document that restriction. Is there some simple condition that can be checked to determine that the context is a collapsed loop? If yes, please consider adding that. Please consider writing a more focused comment. One possibility: // Exit from a collapsed loop nest. |
Thank you for taking a look. After looking more closely at more code examples, I think I will need to take a different approach to support |
When unstrucuted control flow is used inside a collapsed wsloop, the inner-most (collapsed) loop statements are skipped (loop control is handled by the wsloop operation). However, blocks are still created without knowledge of the loop collapse. Without any loop control code generated for the collapsed loop, no branch would be added at the end of the block for the end-loop statement, resulting in invalid IR (an unterminated block) and for incorrect control flow.
The correct behaviour would be to branch unconditionally to the next out-most end-loop statement. This block is already successfully recognised as the successor block.
This patch adds an additional condition to add a branch to this successor block. I believe this should only match in cases which previously would have produced an unterminated block and so this should not effect any already working code construct.
One alternative solution that was considered was to handle this in the previous else if case where successor == eval.constrolSuccessor. This condition does not match here because the control successor is the beginning of the loop construct, whereas the successor is the next outer-most loop construct. Removing this condition led to numerous test failures.
This passes llvm-testsuite and can produce correct results from spec2017 rate (no openmp). Please advise if other testing is necessary.