[Flang][OpenMP] Use simdloop operation only for omp simd pragma #79559
The OpenMP standard differentiates between the `omp simd` (2.9.3.1) and `omp do/for simd` (2.9.3.2, section numbers from the OpenMP 5.0 standard) pragmas. The first describes a loop that needs to be vectorized. The second describes a loop that needs to be workshared between the existing threads, where each thread can use SIMD instructions to execute its chunk of the loop.

That is why the `!$omp simd` do-loop should be modeled as an `omp.simdloop` operation with compiler hints for vectorization, whereas the worksharing loop, the `!$omp do simd` do-loop, should be represented as a worksharing loop (`omp.wsloop`).

Currently Flang denotes both types of OpenMP pragma by the `omp.simdloop` operation. As a consequence, we cannot differentiate between the `!$omp parallel simd` do-loop and the `!$omp parallel do simd` do-loop. The second loop should be workshared between multiple threads. The first describes a loop that needs to be redundantly executed by multiple threads. The current Flang implementation does not perform worksharing for the `!$omp do simd` pragma and generates valid code only for the first case.
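To make the distinction between redundant execution and worksharing concrete, here is a minimal C++ sketch that only models which iterations each thread runs. It is not Flang or OpenMP runtime code: `Range`, `redundantSchedule`, `workshareSchedule`, and `totalIterations` are hypothetical names, and the contiguous near-equal chunking is just one possible schedule, assumed here for illustration.

```cpp
#include <cassert>
#include <vector>

// Half-open iteration range [begin, end) assigned to one thread.
struct Range {
  int begin;
  int end;
};

// Redundant execution: every thread runs the whole iteration space.
// Precondition (assumed): numThreads > 0.
std::vector<Range> redundantSchedule(int numIterations, int numThreads) {
  return std::vector<Range>(numThreads, Range{0, numIterations});
}

// Worksharing: the iteration space is split into contiguous chunks,
// one per thread, so each iteration runs exactly once.
// Precondition (assumed): numThreads > 0.
std::vector<Range> workshareSchedule(int numIterations, int numThreads) {
  std::vector<Range> chunks;
  const int base = numIterations / numThreads;
  const int extra = numIterations % numThreads;
  int begin = 0;
  for (int t = 0; t < numThreads; ++t) {
    // The first `extra` threads take one additional iteration.
    const int size = base + (t < extra ? 1 : 0);
    chunks.push_back(Range{begin, begin + size});
    begin += size;
  }
  return chunks;
}

// Total number of loop-body executions across all threads.
int totalIterations(const std::vector<Range> &schedule) {
  int total = 0;
  for (const Range &r : schedule)
    total += r.end - r.begin;
  return total;
}
```

With 10 iterations and 4 threads, the redundant schedule executes 40 loop bodies in total, while the workshared schedule executes 10, split into chunks of 3, 3, 2, and 2.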
@llvm/pr-subscribers-flang-fir-hlfir @llvm/pr-subscribers-flang-openmp
Author: Dominik Adamski (DominikAdamski)
Full diff: https://github.com/llvm/llvm-project/pull/79559.diff
9 Files Affected:
diff --git a/flang/lib/Lower/OpenMP.cpp b/flang/lib/Lower/OpenMP.cpp
index d2215f4d1bf1ce..0888b25f1c59d5 100644
--- a/flang/lib/Lower/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP.cpp
@@ -3311,6 +3311,43 @@ static void createWsLoop(Fortran::lower::AbstractConverter &converter,
/*outer=*/false, &dsp);
}
+static void createSimdWsLoop(
+ Fortran::lower::AbstractConverter &converter,
+ Fortran::lower::pft::Evaluation &eval, llvm::omp::Directive ompDirective,
+ const Fortran::parser::OmpClauseList &beginClauseList,
+ const Fortran::parser::OmpClauseList *endClauseList, mlir::Location loc) {
+ ClauseProcessor cp(converter, beginClauseList);
+ cp.processTODO<
+ Fortran::parser::OmpClause::Aligned, Fortran::parser::OmpClause::Allocate,
+ Fortran::parser::OmpClause::Linear, Fortran::parser::OmpClause::Safelen,
+ Fortran::parser::OmpClause::Simdlen, Fortran::parser::OmpClause::Order>(
+ loc, ompDirective);
+ // TODO: Add support for vectorization - add vectorization hints inside loop
+ // body.
+ // OpenMP standard does not specify the length of vector instructions.
+ // Currently we safely assume that for !$omp do simd pragma the SIMD length
+ // is equal to 1 (i.e. we generate standard workshare loop).
+ // When support for vectorization is enabled, then we need to add handling of
+ // if clause. Currently if clause can be skipped because we always assume
+ // SIMD length = 1.
+ createWsLoop(converter, eval, ompDirective, beginClauseList, endClauseList,
+ loc);
+}
+
+static bool isWorkshareSimdConstruct(llvm::omp::Directive ompDirective) {
+ switch (ompDirective) {
+ default:
+ return false;
+ case llvm::omp::OMPD_distribute_parallel_do_simd:
+ case llvm::omp::OMPD_do_simd:
+ case llvm::omp::OMPD_parallel_do_simd:
+ case llvm::omp::OMPD_target_parallel_do_simd:
+ case llvm::omp::OMPD_target_teams_distribute_parallel_do_simd:
+ case llvm::omp::OMPD_teams_distribute_parallel_do_simd:
+ return true;
+ }
+}
+
static void genOMP(Fortran::lower::AbstractConverter &converter,
Fortran::lower::SymMap &symTable,
Fortran::semantics::SemanticsContext &semanticsContext,
@@ -3377,10 +3414,17 @@ static void genOMP(Fortran::lower::AbstractConverter &converter,
")");
}
- // 2.9.3.1 SIMD construct
if (llvm::omp::allSimdSet.test(ompDirective)) {
- createSimdLoop(converter, eval, ompDirective, loopOpClauseList,
- currentLocation);
+ if (isWorkshareSimdConstruct(ompDirective)) {
+ // 2.9.3.2 Workshare SIMD construct
+ createSimdWsLoop(converter, eval, ompDirective, loopOpClauseList,
+ endClauseList, currentLocation);
+
+ } else {
+ // 2.9.3.1 SIMD construct
+ createSimdLoop(converter, eval, ompDirective, loopOpClauseList,
+ currentLocation);
+ }
} else {
createWsLoop(converter, eval, ompDirective, loopOpClauseList, endClauseList,
currentLocation);
diff --git a/flang/test/Lower/OpenMP/FIR/if-clause.f90 b/flang/test/Lower/OpenMP/FIR/if-clause.f90
index ef98a00f10dbd2..a1235be8e61ea2 100644
--- a/flang/test/Lower/OpenMP/FIR/if-clause.f90
+++ b/flang/test/Lower/OpenMP/FIR/if-clause.f90
@@ -28,7 +28,7 @@ program main
! ----------------------------------------------------------------------------
! DO SIMD
! ----------------------------------------------------------------------------
- ! CHECK: omp.simdloop
+ ! CHECK: omp.wsloop
! CHECK-NOT: if({{.*}})
! CHECK-SAME: {
!$omp do simd
@@ -36,15 +36,13 @@ program main
end do
!$omp end do simd
- ! CHECK: omp.simdloop
- ! CHECK-SAME: if({{.*}})
+ ! CHECK: omp.wsloop
!$omp do simd if(.true.)
do i = 1, 10
end do
!$omp end do simd
- ! CHECK: omp.simdloop
- ! CHECK-SAME: if({{.*}})
+ ! CHECK: omp.wsloop
!$omp do simd if(simd: .true.)
do i = 1, 10
end do
@@ -103,7 +101,7 @@ program main
! CHECK: omp.parallel
! CHECK-NOT: if({{.*}})
! CHECK-SAME: {
- ! CHECK: omp.simdloop
+ ! CHECK: omp.wsloop
! CHECK-NOT: if({{.*}})
! CHECK-SAME: {
!$omp parallel do simd
@@ -113,8 +111,7 @@ program main
! CHECK: omp.parallel
! CHECK-SAME: if({{.*}})
- ! CHECK: omp.simdloop
- ! CHECK-SAME: if({{.*}})
+ ! CHECK: omp.wsloop
!$omp parallel do simd if(.true.)
do i = 1, 10
end do
@@ -122,8 +119,7 @@ program main
! CHECK: omp.parallel
! CHECK-SAME: if({{.*}})
- ! CHECK: omp.simdloop
- ! CHECK-SAME: if({{.*}})
+ ! CHECK: omp.wsloop
!$omp parallel do simd if(parallel: .true.) if(simd: .false.)
do i = 1, 10
end do
@@ -131,7 +127,7 @@ program main
! CHECK: omp.parallel
! CHECK-SAME: if({{.*}})
- ! CHECK: omp.simdloop
+ ! CHECK: omp.wsloop
! CHECK-NOT: if({{.*}})
! CHECK-SAME: {
!$omp parallel do simd if(parallel: .true.)
@@ -142,8 +138,7 @@ program main
! CHECK: omp.parallel
! CHECK-NOT: if({{.*}})
! CHECK-SAME: {
- ! CHECK: omp.simdloop
- ! CHECK-SAME: if({{.*}})
+ ! CHECK: omp.wsloop
!$omp parallel do simd if(simd: .true.)
do i = 1, 10
end do
@@ -306,7 +301,7 @@ program main
! CHECK: omp.parallel
! CHECK-NOT: if({{.*}})
! CHECK-SAME: {
- ! CHECK: omp.simdloop
+ ! CHECK: omp.wsloop
! CHECK-NOT: if({{.*}})
! CHECK-SAME: {
!$omp target parallel do simd
@@ -318,8 +313,7 @@ program main
! CHECK-SAME: if({{.*}})
! CHECK: omp.parallel
! CHECK-SAME: if({{.*}})
- ! CHECK: omp.simdloop
- ! CHECK-SAME: if({{.*}})
+ ! CHECK: omp.wsloop
!$omp target parallel do simd if(.true.)
do i = 1, 10
end do
@@ -329,8 +323,7 @@ program main
! CHECK-SAME: if({{.*}})
! CHECK: omp.parallel
! CHECK-SAME: if({{.*}})
- ! CHECK: omp.simdloop
- ! CHECK-SAME: if({{.*}})
+ ! CHECK: omp.wsloop
!$omp target parallel do simd if(target: .true.) if(parallel: .false.) &
!$omp& if(simd: .true.)
do i = 1, 10
@@ -342,7 +335,7 @@ program main
! CHECK: omp.parallel
! CHECK-NOT: if({{.*}})
! CHECK-SAME: {
- ! CHECK: omp.simdloop
+ ! CHECK: omp.wsloop
! CHECK-NOT: if({{.*}})
! CHECK-SAME: {
!$omp target parallel do simd if(target: .true.)
@@ -355,8 +348,7 @@ program main
! CHECK-SAME: {
! CHECK: omp.parallel
! CHECK-SAME: if({{.*}})
- ! CHECK: omp.simdloop
- ! CHECK-SAME: if({{.*}})
+ ! CHECK: omp.wsloop
!$omp target parallel do simd if(parallel: .true.) if(simd: .false.)
do i = 1, 10
end do
diff --git a/flang/test/Lower/OpenMP/FIR/loop-combined.f90 b/flang/test/Lower/OpenMP/FIR/loop-combined.f90
index 117f7d625270ec..a6cec1beb49c86 100644
--- a/flang/test/Lower/OpenMP/FIR/loop-combined.f90
+++ b/flang/test/Lower/OpenMP/FIR/loop-combined.f90
@@ -23,7 +23,7 @@ program main
! ----------------------------------------------------------------------------
! DO SIMD
! ----------------------------------------------------------------------------
- ! CHECK: omp.simdloop
+ ! CHECK: omp.wsloop
!$omp do simd
do i = 1, 10
end do
@@ -33,7 +33,7 @@ program main
! PARALLEL DO SIMD
! ----------------------------------------------------------------------------
! CHECK: omp.parallel
- ! CHECK: omp.simdloop
+ ! CHECK: omp.wsloop
!$omp parallel do simd
do i = 1, 10
end do
@@ -54,7 +54,7 @@ program main
! ----------------------------------------------------------------------------
! CHECK: omp.target
! CHECK: omp.parallel
- ! CHECK: omp.simdloop
+ ! CHECK: omp.wsloop
!$omp target parallel do simd
do i = 1, 10
end do
diff --git a/flang/test/Lower/OpenMP/Todo/omp-do-simd-aligned.f90 b/flang/test/Lower/OpenMP/Todo/omp-do-simd-aligned.f90
new file mode 100644
index 00000000000000..b62c54182442ac
--- /dev/null
+++ b/flang/test/Lower/OpenMP/Todo/omp-do-simd-aligned.f90
@@ -0,0 +1,16 @@
+! This test checks lowering of OpenMP do simd aligned() pragma
+
+! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s
+! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s
+subroutine testDoSimdAligned(int_array)
+ use iso_c_binding
+ type(c_ptr) :: int_array
+!CHECK: not yet implemented: Unhandled clause ALIGNED in DO SIMD construct
+!$omp do simd aligned(int_array)
+ do index_ = 1, 10
+ call c_test_call(int_array)
+ end do
+!$omp end do simd
+
+end subroutine testDoSimdAligned
+
diff --git a/flang/test/Lower/OpenMP/Todo/omp-do-simd-linear.f90 b/flang/test/Lower/OpenMP/Todo/omp-do-simd-linear.f90
new file mode 100644
index 00000000000000..a9e0446ec8c34e
--- /dev/null
+++ b/flang/test/Lower/OpenMP/Todo/omp-do-simd-linear.f90
@@ -0,0 +1,14 @@
+! This test checks lowering of OpenMP do simd linear() pragma
+
+! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s
+! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s
+subroutine testDoSimdLinear(int_array)
+ integer :: int_array(*)
+!CHECK: not yet implemented: Unhandled clause LINEAR in DO SIMD construct
+!$omp do simd linear(int_array)
+ do index_ = 1, 10
+ end do
+!$omp end do simd
+
+end subroutine testDoSimdLinear
+
diff --git a/flang/test/Lower/OpenMP/Todo/omp-do-simd-safelen.f90 b/flang/test/Lower/OpenMP/Todo/omp-do-simd-safelen.f90
new file mode 100644
index 00000000000000..054eb52ea170ac
--- /dev/null
+++ b/flang/test/Lower/OpenMP/Todo/omp-do-simd-safelen.f90
@@ -0,0 +1,14 @@
+! This test checks lowering of OpenMP do simd safelen() pragma
+
+! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s
+! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s
+subroutine testDoSimdSafelen(int_array)
+ integer :: int_array(*)
+!CHECK: not yet implemented: Unhandled clause SAFELEN in DO SIMD construct
+!$omp do simd safelen(4)
+ do index_ = 1, 10
+ end do
+!$omp end do simd
+
+end subroutine testDoSimdSafelen
+
diff --git a/flang/test/Lower/OpenMP/Todo/omp-do-simd-simdlen.f90 b/flang/test/Lower/OpenMP/Todo/omp-do-simd-simdlen.f90
new file mode 100644
index 00000000000000..bd00b6f336c931
--- /dev/null
+++ b/flang/test/Lower/OpenMP/Todo/omp-do-simd-simdlen.f90
@@ -0,0 +1,14 @@
+! This test checks lowering of OpenMP do simd simdlen() pragma
+
+! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s
+! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s
+subroutine testDoSimdSimdlen(int_array)
+ integer :: int_array(*)
+!CHECK: not yet implemented: Unhandled clause SIMDLEN in DO SIMD construct
+!$omp do simd simdlen(4)
+ do index_ = 1, 10
+ end do
+!$omp end do simd
+
+end subroutine testDoSimdSimdlen
+
diff --git a/flang/test/Lower/OpenMP/if-clause.f90 b/flang/test/Lower/OpenMP/if-clause.f90
index 032009add31535..f982bf67b07225 100644
--- a/flang/test/Lower/OpenMP/if-clause.f90
+++ b/flang/test/Lower/OpenMP/if-clause.f90
@@ -28,7 +28,7 @@ program main
! ----------------------------------------------------------------------------
! DO SIMD
! ----------------------------------------------------------------------------
- ! CHECK: omp.simdloop
+ ! CHECK: omp.wsloop
! CHECK-NOT: if({{.*}})
! CHECK-SAME: {
!$omp do simd
@@ -36,15 +36,13 @@ program main
end do
!$omp end do simd
- ! CHECK: omp.simdloop
- ! CHECK-SAME: if({{.*}})
+ ! CHECK: omp.wsloop
!$omp do simd if(.true.)
do i = 1, 10
end do
!$omp end do simd
- ! CHECK: omp.simdloop
- ! CHECK-SAME: if({{.*}})
+ ! CHECK: omp.wsloop
!$omp do simd if(simd: .true.)
do i = 1, 10
end do
@@ -103,7 +101,7 @@ program main
! CHECK: omp.parallel
! CHECK-NOT: if({{.*}})
! CHECK-SAME: {
- ! CHECK: omp.simdloop
+ ! CHECK: omp.wsloop
! CHECK-NOT: if({{.*}})
! CHECK-SAME: {
!$omp parallel do simd
@@ -113,8 +111,7 @@ program main
! CHECK: omp.parallel
! CHECK-SAME: if({{.*}})
- ! CHECK: omp.simdloop
- ! CHECK-SAME: if({{.*}})
+ ! CHECK: omp.wsloop
!$omp parallel do simd if(.true.)
do i = 1, 10
end do
@@ -122,8 +119,7 @@ program main
! CHECK: omp.parallel
! CHECK-SAME: if({{.*}})
- ! CHECK: omp.simdloop
- ! CHECK-SAME: if({{.*}})
+ ! CHECK: omp.wsloop
!$omp parallel do simd if(parallel: .true.) if(simd: .false.)
do i = 1, 10
end do
@@ -131,7 +127,7 @@ program main
! CHECK: omp.parallel
! CHECK-SAME: if({{.*}})
- ! CHECK: omp.simdloop
+ ! CHECK: omp.wsloop
! CHECK-NOT: if({{.*}})
! CHECK-SAME: {
!$omp parallel do simd if(parallel: .true.)
@@ -142,8 +138,7 @@ program main
! CHECK: omp.parallel
! CHECK-NOT: if({{.*}})
! CHECK-SAME: {
- ! CHECK: omp.simdloop
- ! CHECK-SAME: if({{.*}})
+ ! CHECK: omp.wsloop
!$omp parallel do simd if(simd: .true.)
do i = 1, 10
end do
@@ -306,7 +301,7 @@ program main
! CHECK: omp.parallel
! CHECK-NOT: if({{.*}})
! CHECK-SAME: {
- ! CHECK: omp.simdloop
+ ! CHECK: omp.wsloop
! CHECK-NOT: if({{.*}})
! CHECK-SAME: {
!$omp target parallel do simd
@@ -318,8 +313,7 @@ program main
! CHECK-SAME: if({{.*}})
! CHECK: omp.parallel
! CHECK-SAME: if({{.*}})
- ! CHECK: omp.simdloop
- ! CHECK-SAME: if({{.*}})
+ ! CHECK: omp.wsloop
!$omp target parallel do simd if(.true.)
do i = 1, 10
end do
@@ -329,8 +323,7 @@ program main
! CHECK-SAME: if({{.*}})
! CHECK: omp.parallel
! CHECK-SAME: if({{.*}})
- ! CHECK: omp.simdloop
- ! CHECK-SAME: if({{.*}})
+ ! CHECK: omp.wsloop
!$omp target parallel do simd if(target: .true.) if(parallel: .false.) &
!$omp& if(simd: .true.)
do i = 1, 10
@@ -342,7 +335,7 @@ program main
! CHECK: omp.parallel
! CHECK-NOT: if({{.*}})
! CHECK-SAME: {
- ! CHECK: omp.simdloop
+ ! CHECK: omp.wsloop
! CHECK-NOT: if({{.*}})
! CHECK-SAME: {
!$omp target parallel do simd if(target: .true.)
@@ -355,8 +348,7 @@ program main
! CHECK-SAME: {
! CHECK: omp.parallel
! CHECK-SAME: if({{.*}})
- ! CHECK: omp.simdloop
- ! CHECK-SAME: if({{.*}})
+ ! CHECK: omp.wsloop
!$omp target parallel do simd if(parallel: .true.) if(simd: .false.)
do i = 1, 10
end do
diff --git a/flang/test/Lower/OpenMP/loop-combined.f90 b/flang/test/Lower/OpenMP/loop-combined.f90
index 960e9518127d75..70488b6a769ce4 100644
--- a/flang/test/Lower/OpenMP/loop-combined.f90
+++ b/flang/test/Lower/OpenMP/loop-combined.f90
@@ -23,7 +23,7 @@ program main
! ----------------------------------------------------------------------------
! DO SIMD
! ----------------------------------------------------------------------------
- ! CHECK: omp.simdloop
+ ! CHECK: omp.wsloop
!$omp do simd
do i = 1, 10
end do
@@ -33,7 +33,7 @@ program main
! PARALLEL DO SIMD
! ----------------------------------------------------------------------------
! CHECK: omp.parallel
- ! CHECK: omp.simdloop
+ ! CHECK: omp.wsloop
!$omp parallel do simd
do i = 1, 10
end do
@@ -54,7 +54,7 @@ program main
! ----------------------------------------------------------------------------
! CHECK: omp.target
! CHECK: omp.parallel
- ! CHECK: omp.simdloop
+ ! CHECK: omp.wsloop
!$omp target parallel do simd
do i = 1, 10
end do
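The dispatch that the `genOMP` hunk above introduces can be sketched in standalone C++. This is a simplified model, not the Flang source: the `Directive` enum, `isSimd`, `isWorkshareSimd`, and `loweredLoopOp` are hypothetical stand-ins for `llvm::omp::Directive`, `llvm::omp::allSimdSet`, `isWorkshareSimdConstruct`, and the `createSimdWsLoop` / `createSimdLoop` / `createWsLoop` calls, and only a handful of directives are listed.

```cpp
#include <cassert>
#include <string>

// Hypothetical subset of loop directives (the real enum is much larger).
enum class Directive {
  Do,
  Simd,
  DoSimd,
  ParallelDoSimd,
  TargetParallelDoSimd
};

// Stand-in for the "is this any SIMD directive" set membership test.
bool isSimd(Directive d) { return d != Directive::Do; }

// Stand-in for the new classification: SIMD directives that also
// workshare the loop between threads.
bool isWorkshareSimd(Directive d) {
  switch (d) {
  case Directive::DoSimd:
  case Directive::ParallelDoSimd:
  case Directive::TargetParallelDoSimd:
    return true;
  default:
    return false;
  }
}

// Name of the loop operation the tests in this patch expect for a directive.
std::string loweredLoopOp(Directive d) {
  if (isSimd(d)) {
    // Workshare SIMD constructs now become plain worksharing loops;
    // only SIMD-only constructs keep the simdloop representation.
    return isWorkshareSimd(d) ? "omp.wsloop" : "omp.simdloop";
  }
  return "omp.wsloop";
}
```

Under this model `Directive::Simd` maps to `omp.simdloop`, while `Directive::DoSimd` and `Directive::ParallelDoSimd` map to `omp.wsloop`, which matches the updated `CHECK` lines in the tests.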
Thanks Dominik, this looks good to me. I just have a small suggestion. Please wait for a day before merging, in case there are any concerns by other reviewers.
The reasoning for this PR is in line with what is needed for proper SIMD support from an OpenMP spec perspective.
This patch introduces the `omp.simd` operation. In contrast to the existing `omp.simdloop` operation, it is intended to hold SIMD information within worksharing loops, rather than representing a SIMD-only loop. Some examples of such loops are "omp do/for simd", "omp distribute simd", "omp target teams distribute parallel do/for simd", etc. For more context on this work, refer to PR #79559. This operation must always be nested within an `omp.wsloop` operation as its only non-terminator child. It follows the same approach as the `omp.distribute` operation, by serving as a simple wrapper operation holding clause information.
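The nesting rule stated above can be illustrated with a small standalone check. This is only a sketch over a hypothetical, heavily simplified `Op` tree, not the MLIR verifier: `Op`, `isTerminator`, `wrapsSingleSimd`, and `verifySimdNesting` are invented for illustration, and the terminator is assumed to be named `omp.terminator`.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of an operation with a single region of child ops.
struct Op {
  std::string name;
  std::vector<Op> body;
};

// Assumption: the region terminator is called "omp.terminator".
bool isTerminator(const Op &op) { return op.name == "omp.terminator"; }

// True if `parent` has exactly one non-terminator child and it is omp.simd.
bool wrapsSingleSimd(const Op &parent) {
  int nonTerminators = 0;
  bool sawSimd = false;
  for (const Op &child : parent.body) {
    if (isTerminator(child))
      continue;
    ++nonTerminators;
    if (child.name == "omp.simd")
      sawSimd = true;
  }
  return nonTerminators == 1 && sawSimd;
}

// Checks that every omp.simd below `op` sits directly inside an omp.wsloop
// as its only non-terminator child. The root op itself is not checked.
bool verifySimdNesting(const Op &op) {
  for (const Op &child : op.body) {
    if (child.name == "omp.simd" &&
        !(op.name == "omp.wsloop" && wrapsSingleSimd(op)))
      return false;
    if (!verifySimdNesting(child))
      return false;
  }
  return true;
}
```

An `omp.wsloop` holding just an `omp.simd` and a terminator passes this check. An `omp.simd` directly under `omp.parallel`, or one that shares its `omp.wsloop` with another non-terminator operation, is rejected.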
The PR for MLIR changes: #79843
Upstream Flang can generate valid workshare code for `omp do` pragma. See: llvm/llvm-project#79559 for more details.
@skatrak will propose a detailed MLIR representation of the `!$omp do simd` pragma.