-
Notifications
You must be signed in to change notification settings - Fork 13.4k
[LoopInterchange] Fix the vectorizable check for a loop #133667
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[LoopInterchange] Fix the vectorizable check for a loop #133667
Conversation
@llvm/pr-subscribers-llvm-transforms Author: Ryotaro Kasuga (kasuga-fj) ChangesIn the profitability check for vectorization, the dependency matrix was not handled correctly. This can result to make a wrong decision: It may say "this loop can be vectorized" when in fact it cannot. The root cause of this is that the check process early returns when it finds '=' or 'I' in the dependency matrix. To make sure that we can actually vectorize the loop, we need to check all the rows of the matrix. This patch fixes the process of checking whether we can vectorize the loop or not. Now it won't make a wrong decision for a loop that cannot be vectorized. Related: #131130 Full diff: https://github.com/llvm/llvm-project/pull/133667.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
index e777f950a7c5a..b6b0b7d7a947a 100644
--- a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
@@ -1197,25 +1197,32 @@ LoopInterchangeProfitability::isProfitablePerInstrOrderCost() {
return std::nullopt;
}
+/// Return true if we can vectorize the loop specified by \p LoopId.
+static bool canVectorize(const CharMatrix &DepMatrix, unsigned LoopId) {
+ for (unsigned I = 0; I != DepMatrix.size(); I++) {
+ char Dir = DepMatrix[I][LoopId];
+ if (Dir != 'I' && Dir != '=')
+ return false;
+ }
+ return true;
+}
+
std::optional<bool> LoopInterchangeProfitability::isProfitableForVectorization(
unsigned InnerLoopId, unsigned OuterLoopId, CharMatrix &DepMatrix) {
- for (auto &Row : DepMatrix) {
- // If the inner loop is loop independent or doesn't carry any dependency
- // it is not profitable to move this to outer position, since we are
- // likely able to do inner loop vectorization already.
- if (Row[InnerLoopId] == 'I' || Row[InnerLoopId] == '=')
- return std::optional<bool>(false);
-
- // If the outer loop is not loop independent it is not profitable to move
- // this to inner position, since doing so would not enable inner loop
- // parallelism.
- if (Row[OuterLoopId] != 'I' && Row[OuterLoopId] != '=')
- return std::optional<bool>(false);
- }
- // If inner loop has dependence and outer loop is loop independent then it
- // is/ profitable to interchange to enable inner loop parallelism.
- // If there are no dependences, interchanging will not improve anything.
- return std::optional<bool>(!DepMatrix.empty());
+ // If the outer loop is not loop independent it is not profitable to move
+ // this to inner position, since doing so would not enable inner loop
+ // parallelism.
+ if (!canVectorize(DepMatrix, OuterLoopId))
+ return false;
+
+ // If inner loop has dependence and outer loop is loop independent then it is
+ // profitable to interchange to enable inner loop parallelism.
+ if (!canVectorize(DepMatrix, InnerLoopId))
+ return true;
+
+ // TODO: Estimate the cost of vectorized loop body when both the outer and the
+ // inner loop can be vectorized.
+ return std::nullopt;
}
bool LoopInterchangeProfitability::isProfitable(
diff --git a/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll b/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll
index 606117e70db86..b82dd5141a6b2 100644
--- a/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll
+++ b/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll
@@ -15,16 +15,13 @@
; }
; }
;
-; FIXME: These loops are not exchanged at this time due to the problem of
-; profitablity heuristic for vectorization.
-; CHECK: --- !Missed
+; CHECK: --- !Passed
; CHECK-NEXT: Pass: loop-interchange
-; CHECK-NEXT: Name: InterchangeNotProfitable
+; CHECK-NEXT: Name: Interchanged
; CHECK-NEXT: Function: interchange_necesasry_for_vectorization
; CHECK-NEXT: Args:
-; CHECK-NEXT: - String: Interchanging loops is not considered to improve cache locality nor vectorization.
-; CHECK-NEXT: ...
+; CHECK-NEXT: - String: Loop interchanged with enclosing loop.
define void @interchange_necesasry_for_vectorization() {
entry:
br label %for.i.header
|
Depends on #133665 |
b53b7ce
to
0f2ac45
Compare
2db59e8
to
ae5d9cf
Compare
ae5d9cf
to
bd84ddc
Compare
In the profitability check for vectorization, the dependency matrix was not handled correctly. This can result to make a wrong decision: It may say "this loop can be vectorized" when in fact it cannot. The root cause of this is that the check process early returns when it finds '=' or 'I' in the dependency matrix. To make sure that we can actually vectorize the loop, we need to check all the rows of the matrix. This patch fixes the process of checking whether we can vectorize the loop or not. Now it won't make a wrong decision for a loop that cannot be vectorized. Related: #131130
bd84ddc
to
b1f0744
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this makes sense, LGTM.
In the profitability check for vectorization, the dependency matrix was not handled correctly. This can result to make a wrong decision: It may say "this loop can be vectorized" when in fact it cannot. The root cause of this is that the check process early returns when it finds '=' or 'I' in the dependency matrix. To make sure that we can actually vectorize the loop, we need to check all the rows of the matrix. This patch fixes the process of checking whether we can vectorize the loop or not. Now it won't make a wrong decision for a loop that cannot be vectorized.
Related: #131130