Skip to content

[LoopInterchange] Fix the vectorizable check for a loop #133667

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

kasuga-fj
Copy link
Contributor

In the profitability check for vectorization, the dependency matrix was not handled correctly. This can result to make a wrong decision: It may say "this loop can be vectorized" when in fact it cannot. The root cause of this is that the check process early returns when it finds '=' or 'I' in the dependency matrix. To make sure that we can actually vectorize the loop, we need to check all the rows of the matrix. This patch fixes the process of checking whether we can vectorize the loop or not. Now it won't make a wrong decision for a loop that cannot be vectorized.

Related: #131130

@llvmbot
Copy link
Member

llvmbot commented Mar 31, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Ryotaro Kasuga (kasuga-fj)

Changes

In the profitability check for vectorization, the dependency matrix was not handled correctly. This can result to make a wrong decision: It may say "this loop can be vectorized" when in fact it cannot. The root cause of this is that the check process early returns when it finds '=' or 'I' in the dependency matrix. To make sure that we can actually vectorize the loop, we need to check all the rows of the matrix. This patch fixes the process of checking whether we can vectorize the loop or not. Now it won't make a wrong decision for a loop that cannot be vectorized.

Related: #131130


Full diff: https://github.com/llvm/llvm-project/pull/133667.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Scalar/LoopInterchange.cpp (+24-17)
  • (modified) llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll (+3-6)
diff --git a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
index e777f950a7c5a..b6b0b7d7a947a 100644
--- a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
@@ -1197,25 +1197,32 @@ LoopInterchangeProfitability::isProfitablePerInstrOrderCost() {
   return std::nullopt;
 }
 
+/// Return true if we can vectorize the loop specified by \p LoopId.
+static bool canVectorize(const CharMatrix &DepMatrix, unsigned LoopId) {
+  for (unsigned I = 0; I != DepMatrix.size(); I++) {
+    char Dir = DepMatrix[I][LoopId];
+    if (Dir != 'I' && Dir != '=')
+      return false;
+  }
+  return true;
+}
+
 std::optional<bool> LoopInterchangeProfitability::isProfitableForVectorization(
     unsigned InnerLoopId, unsigned OuterLoopId, CharMatrix &DepMatrix) {
-  for (auto &Row : DepMatrix) {
-    // If the inner loop is loop independent or doesn't carry any dependency
-    // it is not profitable to move this to outer position, since we are
-    // likely able to do inner loop vectorization already.
-    if (Row[InnerLoopId] == 'I' || Row[InnerLoopId] == '=')
-      return std::optional<bool>(false);
-
-    // If the outer loop is not loop independent it is not profitable to move
-    // this to inner position, since doing so would not enable inner loop
-    // parallelism.
-    if (Row[OuterLoopId] != 'I' && Row[OuterLoopId] != '=')
-      return std::optional<bool>(false);
-  }
-  // If inner loop has dependence and outer loop is loop independent then it
-  // is/ profitable to interchange to enable inner loop parallelism.
-  // If there are no dependences, interchanging will not improve anything.
-  return std::optional<bool>(!DepMatrix.empty());
+  // If the outer loop is not loop independent it is not profitable to move
+  // this to inner position, since doing so would not enable inner loop
+  // parallelism.
+  if (!canVectorize(DepMatrix, OuterLoopId))
+    return false;
+
+  // If inner loop has dependence and outer loop is loop independent then it is
+  // profitable to interchange to enable inner loop parallelism.
+  if (!canVectorize(DepMatrix, InnerLoopId))
+    return true;
+
+  // TODO: Estimate the cost of vectorized loop body when both the outer and the
+  // inner loop can be vectorized.
+  return std::nullopt;
 }
 
 bool LoopInterchangeProfitability::isProfitable(
diff --git a/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll b/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll
index 606117e70db86..b82dd5141a6b2 100644
--- a/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll
+++ b/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll
@@ -15,16 +15,13 @@
 ;   }
 ; }
 ;
-; FIXME: These loops are not exchanged at this time due to the problem of
-; profitablity heuristic for vectorization.
 
-; CHECK:      --- !Missed
+; CHECK:      --- !Passed
 ; CHECK-NEXT: Pass:            loop-interchange
-; CHECK-NEXT: Name:            InterchangeNotProfitable
+; CHECK-NEXT: Name:            Interchanged
 ; CHECK-NEXT: Function:        interchange_necesasry_for_vectorization
 ; CHECK-NEXT: Args:
-; CHECK-NEXT:   - String:          Interchanging loops is not considered to improve cache locality nor vectorization.
-; CHECK-NEXT: ...
+; CHECK-NEXT:   - String:          Loop interchanged with enclosing loop.
 define void @interchange_necesasry_for_vectorization() {
 entry:
   br label %for.i.header

@kasuga-fj
Copy link
Contributor Author

Depends on #133665

@kasuga-fj kasuga-fj force-pushed the users/kasuga-fj/loop-interchange-vectorzation-test branch from b53b7ce to 0f2ac45 Compare April 2, 2025 07:10
@kasuga-fj kasuga-fj force-pushed the users/kasuga-fj/loop-interchange-fix-profitable-vectorization branch from 2db59e8 to ae5d9cf Compare April 2, 2025 07:12
@kasuga-fj kasuga-fj force-pushed the users/kasuga-fj/loop-interchange-fix-profitable-vectorization branch from ae5d9cf to bd84ddc Compare April 2, 2025 07:29
Base automatically changed from users/kasuga-fj/loop-interchange-vectorzation-test to main April 2, 2025 12:02
In the profitability check for vectorization, the dependency matrix was
not handled correctly. This can result to make a wrong decision: It may
say "this loop can be vectorized" when in fact it cannot. The root cause
of this is that the check process early returns when it finds '=' or 'I'
in the dependency matrix. To make sure that we can actually vectorize
the loop, we need to check all the rows of the matrix. This patch fixes
the process of checking whether we can vectorize the loop or not. Now it
won't make a wrong decision for a loop that cannot be vectorized.

Related: #131130
@kasuga-fj kasuga-fj force-pushed the users/kasuga-fj/loop-interchange-fix-profitable-vectorization branch from bd84ddc to b1f0744 Compare April 2, 2025 12:13
Copy link
Collaborator

@sjoerdmeijer sjoerdmeijer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this makes sense, LGTM.

@kasuga-fj kasuga-fj merged commit 91f3965 into main Apr 3, 2025
11 checks passed
@kasuga-fj kasuga-fj deleted the users/kasuga-fj/loop-interchange-fix-profitable-vectorization branch April 3, 2025 07:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants