[flang][OpenMP] Fix unstrucuted control flow inside a collapsed wsloop #77329

tblah · 2024-01-08T16:08:29Z

When unstrucuted control flow is used inside a collapsed wsloop, the inner-most (collapsed) loop statements are skipped (loop control is handled by the wsloop operation). However, blocks are still created without knowledge of the loop collapse. Without any loop control code generated for the collapsed loop, no branch would be added at the end of the block for the end-loop statement, resulting in invalid IR (an unterminated block) and for incorrect control flow.

The correct behaviour would be to branch unconditionally to the next out-most end-loop statement. This block is already successfully recognised as the successor block.

This patch adds an additional condition to add a branch to this successor block. I believe this should only match in cases which previously would have produced an unterminated block and so this should not effect any already working code construct.

One alternative solution that was considered was to handle this in the previous else if case where successor == eval.constrolSuccessor. This condition does not match here because the control successor is the beginning of the loop construct, whereas the successor is the next outer-most loop construct. Removing this condition led to numerous test failures.

This passes llvm-testsuite and can produce correct results from spec2017 rate (no openmp). Please advise if other testing is necessary.

When unstrucuted control flow is used inside a collapsed wsloop, the inner-most (collapsed) loop statements are skipped (loop control is handled by the wsloop operation). However, blocks are still created without knowledge of the loop collapse. Without any loop control code generated for the collapsed loop, no branch would be added at the end of the block for the end-loop statement, resulting in invalid IR (an unterminated block) and for incorrect control flow. The correct behaviour would be to branch unconditionally to the next out-most end-loop statement. This block is already successfully recognised as the successor block. This patch adds an additional condition to add a branch to this successor block. I believe this should only match in cases which previously would have produced an unterminated block and so this should not effect any already working code construct. One alternative solution that was considered was to handle this in the previous else if case where successor == eval.constrolSuccessor. This condition does not match here because the control successor is the beginning of the loop construct, whereas the successor is the next outer-most loop construct. Removing this condition led to numerous test failures. This passes llvm-testsuite and can produce correct results from spec2017 **rate** (no openmp). Please advise if other testing is necessary.

llvmbot · 2024-01-08T16:08:57Z

@llvm/pr-subscribers-flang-fir-hlfir

Author: Tom Eccles (tblah)

Changes

When unstrucuted control flow is used inside a collapsed wsloop, the inner-most (collapsed) loop statements are skipped (loop control is handled by the wsloop operation). However, blocks are still created without knowledge of the loop collapse. Without any loop control code generated for the collapsed loop, no branch would be added at the end of the block for the end-loop statement, resulting in invalid IR (an unterminated block) and for incorrect control flow.

The correct behaviour would be to branch unconditionally to the next out-most end-loop statement. This block is already successfully recognised as the successor block.

This patch adds an additional condition to add a branch to this successor block. I believe this should only match in cases which previously would have produced an unterminated block and so this should not effect any already working code construct.

One alternative solution that was considered was to handle this in the previous else if case where successor == eval.constrolSuccessor. This condition does not match here because the control successor is the beginning of the loop construct, whereas the successor is the next outer-most loop construct. Removing this condition led to numerous test failures.

This passes llvm-testsuite and can produce correct results from spec2017 rate (no openmp). Please advise if other testing is necessary.

Full diff: https://github.com/llvm/llvm-project/pull/77329.diff

2 Files Affected:

(modified) flang/lib/Lower/Bridge.cpp (+6)
(added) flang/test/Lower/OpenMP/wsloop-unstructured.f90 (+63)

diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp
index 2bceee09b4f0f2..7c33a9c9d6d48e 100644
--- a/flang/lib/Lower/Bridge.cpp
+++ b/flang/lib/Lower/Bridge.cpp
@@ -4159,6 +4159,12 @@ class FirConverter : public Fortran::lower::AbstractConverter {
                successor == eval.controlSuccessor)
         // Exit from a degenerate, empty construct block.
         genBranch(eval.parentConstruct->constructExit->block);
+      else if (successor->block)
+        // Catch case with omp collapse(n) when dealing with an end-do statement
+        // which has been collapsed into an outer loop and unstructured control
+        // flow is used. Create a fall through branch to the successor (the
+        // outer loop). This is not known to affect other cases.
+        genBranch(successor->block);
     }
   }
 
diff --git a/flang/test/Lower/OpenMP/wsloop-unstructured.f90 b/flang/test/Lower/OpenMP/wsloop-unstructured.f90
new file mode 100644
index 00000000000000..310768c743a6a1
--- /dev/null
+++ b/flang/test/Lower/OpenMP/wsloop-unstructured.f90
@@ -0,0 +1,63 @@
+! RUN: bbc -emit-hlfir -fopenmp -o - %s | FileCheck %s
+
+subroutine sub(imax, jmax, x, y)
+  integer, intent(in) :: imax, jmax
+  real, intent(in), dimension(1:imax, 1:jmax) :: x, y
+
+  integer :: i, j, ii
+
+  ! collapse(2) is needed to reproduce the issue
+  !$omp parallel do collapse(2)
+  do j = 1, jmax
+    do i = 1, imax
+      do  ii = 1, imax ! note that this loop is not collapsed
+        if (x(i,j) < y(ii,j)) then
+          ! exit needed to force unstructured control flow
+          exit
+        endif
+      enddo
+    enddo
+  enddo
+end subroutine sub
+
+! this is testing that we don't crash generating code for this: in particular
+! that all blocks are terminated
+
+! CHECK-LABEL:   func.func @_QPsub(
+! CHECK-SAME:                      %[[VAL_0:.*]]: !fir.ref<i32> {fir.bindc_name = "imax"},
+! CHECK-SAME:                      %[[VAL_1:.*]]: !fir.ref<i32> {fir.bindc_name = "jmax"},
+! CHECK-SAME:                      %[[VAL_2:.*]]: !fir.ref<!fir.array<?x?xf32>> {fir.bindc_name = "x"},
+! CHECK-SAME:                      %[[VAL_3:.*]]: !fir.ref<!fir.array<?x?xf32>> {fir.bindc_name = "y"}) {
+! [...]
+! CHECK:             omp.wsloop for  (%[[VAL_53:.*]], %[[VAL_54:.*]]) : i32 = ({{.*}}) to ({{.*}}) inclusive step ({{.*}}) {
+! [...]
+! CHECK:               cf.br ^bb1
+! CHECK:             ^bb1:
+! CHECK:               cf.br ^bb2
+! CHECK:             ^bb2:
+! [...]
+! CHECK:               cf.br ^bb3
+! CHECK:             ^bb3:
+! [...]
+! CHECK:               %[[VAL_63:.*]] = arith.cmpi sgt, %{{.*}}, %{{.*}} : i32
+! CHECK:               cf.cond_br %[[VAL_63]], ^bb4, ^bb7
+! CHECK:             ^bb4:
+! [...]
+! CHECK:               %[[VAL_76:.*]] = arith.cmpf olt, %{{.*}}, %{{.*}} fastmath<contract> : f32
+! CHECK:               cf.cond_br %[[VAL_76]], ^bb5, ^bb6
+! CHECK:             ^bb5:
+! CHECK:               cf.br ^bb7
+! CHECK:             ^bb6:
+! [...]
+! CHECK:               cf.br ^bb3
+! CHECK:             ^bb7:
+! CHECK:               cf.br ^bb8
+! CHECK:             ^bb8:
+! CHECK:               omp.yield
+! CHECK:             }
+! CHECK:             omp.terminator
+! CHECK:           }
+! CHECK:           cf.br ^bb1
+! CHECK:         ^bb1:
+! CHECK:           return
+! CHECK:         }

llvmbot · 2024-01-08T16:08:58Z

@llvm/pr-subscribers-flang-openmp

Author: Tom Eccles (tblah)

Changes

When unstrucuted control flow is used inside a collapsed wsloop, the inner-most (collapsed) loop statements are skipped (loop control is handled by the wsloop operation). However, blocks are still created without knowledge of the loop collapse. Without any loop control code generated for the collapsed loop, no branch would be added at the end of the block for the end-loop statement, resulting in invalid IR (an unterminated block) and for incorrect control flow.

The correct behaviour would be to branch unconditionally to the next out-most end-loop statement. This block is already successfully recognised as the successor block.

This patch adds an additional condition to add a branch to this successor block. I believe this should only match in cases which previously would have produced an unterminated block and so this should not effect any already working code construct.

One alternative solution that was considered was to handle this in the previous else if case where successor == eval.constrolSuccessor. This condition does not match here because the control successor is the beginning of the loop construct, whereas the successor is the next outer-most loop construct. Removing this condition led to numerous test failures.

This passes llvm-testsuite and can produce correct results from spec2017 rate (no openmp). Please advise if other testing is necessary.

Full diff: https://github.com/llvm/llvm-project/pull/77329.diff

2 Files Affected:

(modified) flang/lib/Lower/Bridge.cpp (+6)
(added) flang/test/Lower/OpenMP/wsloop-unstructured.f90 (+63)

diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp
index 2bceee09b4f0f2..7c33a9c9d6d48e 100644
--- a/flang/lib/Lower/Bridge.cpp
+++ b/flang/lib/Lower/Bridge.cpp
@@ -4159,6 +4159,12 @@ class FirConverter : public Fortran::lower::AbstractConverter {
                successor == eval.controlSuccessor)
         // Exit from a degenerate, empty construct block.
         genBranch(eval.parentConstruct->constructExit->block);
+      else if (successor->block)
+        // Catch case with omp collapse(n) when dealing with an end-do statement
+        // which has been collapsed into an outer loop and unstructured control
+        // flow is used. Create a fall through branch to the successor (the
+        // outer loop). This is not known to affect other cases.
+        genBranch(successor->block);
     }
   }
 
diff --git a/flang/test/Lower/OpenMP/wsloop-unstructured.f90 b/flang/test/Lower/OpenMP/wsloop-unstructured.f90
new file mode 100644
index 00000000000000..310768c743a6a1
--- /dev/null
+++ b/flang/test/Lower/OpenMP/wsloop-unstructured.f90
@@ -0,0 +1,63 @@
+! RUN: bbc -emit-hlfir -fopenmp -o - %s | FileCheck %s
+
+subroutine sub(imax, jmax, x, y)
+  integer, intent(in) :: imax, jmax
+  real, intent(in), dimension(1:imax, 1:jmax) :: x, y
+
+  integer :: i, j, ii
+
+  ! collapse(2) is needed to reproduce the issue
+  !$omp parallel do collapse(2)
+  do j = 1, jmax
+    do i = 1, imax
+      do  ii = 1, imax ! note that this loop is not collapsed
+        if (x(i,j) < y(ii,j)) then
+          ! exit needed to force unstructured control flow
+          exit
+        endif
+      enddo
+    enddo
+  enddo
+end subroutine sub
+
+! this is testing that we don't crash generating code for this: in particular
+! that all blocks are terminated
+
+! CHECK-LABEL:   func.func @_QPsub(
+! CHECK-SAME:                      %[[VAL_0:.*]]: !fir.ref<i32> {fir.bindc_name = "imax"},
+! CHECK-SAME:                      %[[VAL_1:.*]]: !fir.ref<i32> {fir.bindc_name = "jmax"},
+! CHECK-SAME:                      %[[VAL_2:.*]]: !fir.ref<!fir.array<?x?xf32>> {fir.bindc_name = "x"},
+! CHECK-SAME:                      %[[VAL_3:.*]]: !fir.ref<!fir.array<?x?xf32>> {fir.bindc_name = "y"}) {
+! [...]
+! CHECK:             omp.wsloop for  (%[[VAL_53:.*]], %[[VAL_54:.*]]) : i32 = ({{.*}}) to ({{.*}}) inclusive step ({{.*}}) {
+! [...]
+! CHECK:               cf.br ^bb1
+! CHECK:             ^bb1:
+! CHECK:               cf.br ^bb2
+! CHECK:             ^bb2:
+! [...]
+! CHECK:               cf.br ^bb3
+! CHECK:             ^bb3:
+! [...]
+! CHECK:               %[[VAL_63:.*]] = arith.cmpi sgt, %{{.*}}, %{{.*}} : i32
+! CHECK:               cf.cond_br %[[VAL_63]], ^bb4, ^bb7
+! CHECK:             ^bb4:
+! [...]
+! CHECK:               %[[VAL_76:.*]] = arith.cmpf olt, %{{.*}}, %{{.*}} fastmath<contract> : f32
+! CHECK:               cf.cond_br %[[VAL_76]], ^bb5, ^bb6
+! CHECK:             ^bb5:
+! CHECK:               cf.br ^bb7
+! CHECK:             ^bb6:
+! [...]
+! CHECK:               cf.br ^bb3
+! CHECK:             ^bb7:
+! CHECK:               cf.br ^bb8
+! CHECK:             ^bb8:
+! CHECK:               omp.yield
+! CHECK:             }
+! CHECK:             omp.terminator
+! CHECK:           }
+! CHECK:           cf.br ^bb1
+! CHECK:         ^bb1:
+! CHECK:           return
+! CHECK:         }

vdonaldson · 2024-01-08T18:26:30Z

[I'm on vacation until 1.16; responses may be slow]

Have you tried other code variations? An exit from or to an intermediate or outer loop in addition to an exit from the innermost loop? An exit that is not at the end of the loop body? Unstructured branches other than exit, such as a simple goto?

Since this is only needed for unstructured code, please add that to the condition, as in the prior degenerate/empty construct case. That also serves to document that restriction.

Is there some simple condition that can be checked to determine that the context is a collapsed loop? If yes, please consider adding that.

Please consider writing a more focused comment. One possibility: // Exit from a collapsed loop nest.

tblah · 2024-01-09T16:28:46Z

Thank you for taking a look. After looking more closely at more code examples, I think I will need to take a different approach to support collapse(n) for n > 2

tblah requested review from kiranchandramohan and vdonaldson January 8, 2024 16:08

llvmbot added flang Flang issues not falling into any other category flang:fir-hlfir flang:openmp labels Jan 8, 2024

tblah closed this Jan 9, 2024

kiranchandramohan mentioned this pull request Jan 15, 2024

[Flang][OpenMP] Restructure recursive lowering in createBodyOfOp #77761

Merged

kparzysz added a commit that referenced this pull request Jan 15, 2024

Add testcase from #77329

8ead6c5

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[flang][OpenMP] Fix unstrucuted control flow inside a collapsed wsloop #77329

[flang][OpenMP] Fix unstrucuted control flow inside a collapsed wsloop #77329

tblah commented Jan 8, 2024

llvmbot commented Jan 8, 2024

llvmbot commented Jan 8, 2024

vdonaldson commented Jan 8, 2024

tblah commented Jan 9, 2024

[flang][OpenMP] Fix unstrucuted control flow inside a collapsed wsloop #77329

[flang][OpenMP] Fix unstrucuted control flow inside a collapsed wsloop #77329

Conversation

tblah commented Jan 8, 2024

llvmbot commented Jan 8, 2024

llvmbot commented Jan 8, 2024

vdonaldson commented Jan 8, 2024

tblah commented Jan 9, 2024