[DSE] Enable initializes improvement #119116
@llvm/pr-subscribers-llvm-transforms
Author: Haopeng Liu (haopliu)
Changes
Tested with an internal benchmark and this improvement has an expected impact.
Full diff: https://github.com/llvm/llvm-project/pull/119116.diff
1 Files Affected:
diff --git a/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp b/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
index 09e8301b772d96..778f83b7925691 100644
--- a/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
+++ b/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
@@ -164,9 +164,9 @@ static cl::opt<bool>
     OptimizeMemorySSA("dse-optimize-memoryssa", cl::init(true), cl::Hidden,
                       cl::desc("Allow DSE to optimize memory accesses."));
 
-// TODO: turn on and remove this flag.
+// TODO: remove this flag.
 static cl::opt<bool> EnableInitializesImprovement(
-    "enable-dse-initializes-attr-improvement", cl::init(false), cl::Hidden,
+    "enable-dse-initializes-attr-improvement", cl::init(true), cl::Hidden,
     cl::desc("Enable the initializes attr improvement in DSE"));
 
 //===----------------------------------------------------------------------===//
Can you please share the compile-time impact?
Here is the compile-time comparison for this change (no meaningful diff).
@dtcxzyw Can you please fuzz this?
awesome! are there any numbers you can share in the description? also would be nice to link to the PR that added this flag and the initializes functionality
I cannot find any issue with this patch.
Verified with llvm-test-suite + stage2 clang build
LGTM
Very nice! And thanks for helping fuzz this =)
Thank you all! Updated the description with an initial improvement number we got.
Hi, I see problems after this patch. I haven't run tests with
For reference, the IR is:
@g = global [12 x i8] zeroinitializer, align 1
@str = private constant [13 x i8] [i8 97, i8 98, i8 99, i8 100, i8 101, i8 102, i8 103, i8 104, i8 105, i8 106, i8 107, i8 108, i8 0], align 1
define void @copyGto(ptr initializes((0, 12)) %agg.result) nounwind {
entry:
  tail call void @llvm.memmove.p0.p0.i32(ptr noundef nonnull align 1 dereferenceable(12) %agg.result, ptr noundef nonnull align 1 dereferenceable(12) @g, i32 12, i1 false)
  ret void
}
define i16 @main() {
entry:
  tail call void @llvm.memcpy.p0.p0.i32(ptr noundef nonnull align 1 dereferenceable(12) @g, ptr noundef nonnull align 1 dereferenceable(12) @str, i32 12, i1 false)
  tail call void @copyGto(ptr @g)
  call void @check(ptr @g)
  ret i16 0
}
declare void @llvm.memmove.p0.p0.i32(ptr nocapture writeonly, ptr nocapture readonly, i32, i1 immarg)
declare void @llvm.memcpy.p0.p0.i32(ptr noalias nocapture writeonly, ptr noalias nocapture readonly, i32, i1 immarg)
declare void @check(ptr noundef)
I believe that the
I think the problem here is that
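A rough C++ analogue of the IR above may help show why the memcpy into @g must be kept. This is only a sketch: the function names are illustrative, and the explanation is inferred from the IR together with the later note that the fix addressed aliasing through global variables. The callee writes all 12 bytes of its argument, but it reads them from the global, so when the caller passes the global itself, the earlier store is still observed.

```cpp
#include <cstring>

// Global buffer, mirroring @g in the IR.
char g[12];

// Mirrors @copyGto: writes all 12 bytes of `out`, but the bytes come from g.
void copyGto(char *out) { std::memmove(out, g, sizeof g); }

// Mirrors @main: returns true if g still holds the string after the call.
bool globalSurvivesSelfCopy() {
  static const char str[13] = "abcdefghijkl";
  std::memcpy(g, str, sizeof g); // not a dead store: the callee reads g
  copyGto(g);                    // `out` aliases g inside the callee
  return std::memcmp(g, str, sizeof g) == 0;
}
```

If an optimizer removed the memcpy on the grounds that copyGto initializes its whole argument, g would stay zero-filled and the check would fail.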
Thanks! Reverted the enabling. Will look into this case and fix the issue.
Hi @haopliu, sorry for the noise. I found another problem with this patch. If the length of a MemTransferInst or MemSetInst is negative, building the constant range triggers an assertion. The failure shows up while verifying the IR, but it seems that the function GetConstantIntRange may need to check whether the length is negative, or treat it as an unsigned (zero-extended) value. This is a simple reproducer:
You can run it with this command:
And this is the assertion:
If you need any more information, please let me know.
EDIT: Sorry, I meant patch #97373, but it seems to be related to this patch.
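To illustrate the two options mentioned in this comment (rejecting a negative length, or reading it as a zero-extended unsigned value), here is a small standalone C++ sketch. It is not LLVM's GetConstantIntRange; every name below is hypothetical and the 32-bit length width is an assumption taken from the i32 length operands in the IR earlier in the thread.

```cpp
#include <cstdint>
#include <optional>
#include <utility>

// A 32-bit length operand read as a signed value (sign-extended to 64 bits).
int64_t signExtended(int32_t len) { return len; }

// The same bits read as an unsigned value (zero-extended to 64 bits).
int64_t zeroExtended(int32_t len) { return static_cast<uint32_t>(len); }

// Hypothetical helper: the half-open byte range [0, len) written by a
// memset/memcpy-like call, or nothing if the length is not positive.
std::optional<std::pair<int64_t, int64_t>> writtenRange(int64_t len) {
  if (len <= 0)
    return std::nullopt; // a range with upper bound <= lower bound is invalid
  return std::make_pair(int64_t{0}, len);
}
```

With a length of -1, the signed reading yields no range at all, while the zero-extended reading yields a very large but well-formed range; either avoids constructing a range whose upper bound is below its lower bound.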
Thanks!
Hi @ayrivera-intel, Thanks for raising the negative length issue. Fixed this problem in #120874.
(Retry) enable the initializes improvement in DSE. Initially enabled in #119116. Fix the aliasing issue through global variables in #120044. The compile-time comparison of this enabling (no meaningful diff): https://llvm-compile-time-tracker.com/compare.php?from=b46fcb9fa32f24660b1b8858d5c4cbdb76ef9d8b&to=33dc817b81f7898c87b052d1ddfd3d6e6f5b5dbd&stat=instructions%3Au
Tested with an internal search backend loadtest.
With -ftrivial-auto-var-init, this work has a 0.2%-0.3% total QPS improvement. Note that the metric is total QPS rather than CPU time, where even a 1% improvement means a lot.
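As a hedged illustration of why -ftrivial-auto-var-init might interact with this change, the sketch below shows the kind of pattern that could benefit. The types and function names are made up, and whether a given callee actually gets an initializes attribute depends on the compiler's own inference, which this example does not demonstrate.

```cpp
#include <cstring>

struct Request {
  char payload[64];
};

// Writes every byte of *out before reading any of it. This is the kind of
// callee whose pointer parameter could carry initializes((0, 64)).
void initRequest(Request *out) {
  std::memset(out->payload, 'x', sizeof out->payload);
}

char firstPayloadByte() {
  Request r;       // with -ftrivial-auto-var-init=zero or =pattern, Clang
                   // inserts a fill of r at this point
  initRequest(&r); // overwrites all 64 bytes, so that fill is redundant
  return r.payload[0];
}
```

When the callee is known to initialize the whole object, the compiler-inserted fill before the call is a dead store that DSE can remove.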