Skip to content

[DSE] Enable initializes improvement #119116

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Dec 10, 2024
Merged

[DSE] Enable initializes improvement #119116

merged 1 commit into from
Dec 10, 2024

Conversation

haopliu
Copy link
Contributor

@haopliu haopliu commented Dec 8, 2024

Tested with an internal search backend loadtest.

With -ftrivial-auto-var-init, this work has a 0.2%-0.3% total QPS improvement.

Note that, the metric is total QPS instead of cpu-time, even 1% improvement means a lot.

@llvmbot
Copy link
Member

llvmbot commented Dec 8, 2024

@llvm/pr-subscribers-llvm-transforms

Author: Haopeng Liu (haopliu)

Changes

Tested with an internal benchmark and this improvement has an expected impact.


Full diff: https://github.com/llvm/llvm-project/pull/119116.diff

1 Files Affected:

  • (modified) llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp (+2-2)
diff --git a/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp b/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
index 09e8301b772d96..778f83b7925691 100644
--- a/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
+++ b/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
@@ -164,9 +164,9 @@ static cl::opt<bool>
     OptimizeMemorySSA("dse-optimize-memoryssa", cl::init(true), cl::Hidden,
                       cl::desc("Allow DSE to optimize memory accesses."));
 
-// TODO: turn on and remove this flag.
+// TODO: remove this flag.
 static cl::opt<bool> EnableInitializesImprovement(
-    "enable-dse-initializes-attr-improvement", cl::init(false), cl::Hidden,
+    "enable-dse-initializes-attr-improvement", cl::init(true), cl::Hidden,
     cl::desc("Enable the initializes attr improvement in DSE"));
 
 //===----------------------------------------------------------------------===//

@nikic nikic changed the title Enable initializes improvement [DSE] Enable initializes improvement Dec 8, 2024
Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please share the compile-time impact?

@haopliu
Copy link
Contributor Author

haopliu commented Dec 8, 2024

@nikic
Copy link
Contributor

nikic commented Dec 8, 2024

@dtcxzyw Can you please fuzz this?

Copy link
Contributor

@aeubanks aeubanks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome! are there any numbers you can share in the description? also would be nice to link to the PR that added this flag and the initializes functionality

@dtcxzyw
Copy link
Member

dtcxzyw commented Dec 10, 2024

@dtcxzyw Can you please fuzz this?

I cannot find any issue with this patch.

@dtcxzyw
Copy link
Member

dtcxzyw commented Dec 10, 2024

Verified with llvm-test-suite + stage2 clang build

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jvoung
Copy link
Contributor

jvoung commented Dec 10, 2024

Very nice! And thanks for helping fuzz this =)

@haopliu
Copy link
Contributor Author

haopliu commented Dec 10, 2024

Thank you all! Updated the description with an initial improvement number we got.

@haopliu haopliu merged commit ebe741f into llvm:main Dec 10, 2024
10 checks passed
@mikaelholmen
Copy link
Collaborator

Hi,

I see problems after this patch. I haven't run tests with -enable-dse-initializes-attr-improvement before but now I see that with
opt -passes="dse" bbi-102254.ll -S -o - -enable-dse-initializes-attr-improvement
the "memcpy" in the input program is simply removed. But that is needed because it writes to "@g", and without it "copyGto" won't copy the right data (to "@g" itself).
So is dse too aggressive here, or is the problem that there shouldn't be "initializes((0, 12))" on the "copyGto" input argument?

bbi-102254.ll.gz

@nikic
Copy link
Contributor

nikic commented Dec 11, 2024

For reference, the IR is:

@g = global [12 x i8] zeroinitializer, align 1
@str = private constant [13 x i8] [i8 97, i8 98, i8 99, i8 100, i8 101, i8 102, i8 103, i8 104, i8 105, i8 106, i8 107, i8 108, i8 0], align 1

define void @copyGto(ptr initializes((0, 12)) %agg.result) nounwind {
entry:
  tail call void @llvm.memmove.p0.p0.i32(ptr noundef nonnull align 1 dereferenceable(12) %agg.result, ptr noundef nonnull align 1 dereferenceable(12) @g, i32 12, i1 false)
  ret void
}

define i16 @main() {
entry:
  tail call void @llvm.memcpy.p0.p0.i32(ptr noundef nonnull align 1 dereferenceable(12) @g, ptr noundef nonnull align 1 dereferenceable(12) @str, i32 12, i1 false)
  tail call void @copyGto(ptr @g)
  call void @check(ptr @g)
  ret i16 0
}

declare void @llvm.memmove.p0.p0.i32(ptr nocapture writeonly, ptr nocapture readonly, i32, i1 immarg)
declare void @llvm.memcpy.p0.p0.i32(ptr noalias nocapture writeonly, ptr noalias nocapture readonly, i32, i1 immarg)
declare void @check(ptr noundef)

I believe that the initializes attribute on copyGto is correct, at least per the current spec:

This attribute only holds for the memory accessed via this pointer parameter. Other arbitrary accesses to the same memory via other pointers are allowed.

I think the problem here is that

DSEState::getInitializesArgMemLoc(const Instruction *I) {
only looks at the call arguments, but doesn't account for the fact that the pointer may also be read via escaped memory (prior to initialization).

haopliu added a commit that referenced this pull request Dec 11, 2024
haopliu added a commit that referenced this pull request Dec 11, 2024
@haopliu
Copy link
Contributor Author

haopliu commented Dec 11, 2024

Thanks! Reverted the enabling. Will look into this case and fix the issue.

@ayrivera-intel
Copy link
Contributor

ayrivera-intel commented Dec 11, 2024

Hi @haopliu ,

Sorry for the noise or spamming your mailbox. I found another problem with this patch. If the length of the MemTransferInst or MemSetInst has a negative value then constant range will produce an issue. The problem happens verifying the IR, but seems that the function GetConstantIntRange may need to check if the value of the length is negative or treat it as an unsigned zero extension. This is a simple reproducer:

; Function Attrs: nounwind uwtable
define dso_local i32 @foo(ptr nocapture noundef writeonly %a, ptr nocapture noundef readonly %b) {
entry:
  call void @llvm.memcpy.p0.p0.i64(ptr %a, ptr %b, i64 -1, i1 false)
  ret i32 undef
}

declare void @llvm.memcpy.p0.p0.i64(ptr nocapture writeonly, ptr nocapture readonly, i64, i1)

You can run it with this command: opt -passes=function-attrs simple.ll -S

And this is the assertion:

Attribute 'initializes' does not support unordered ranges

If you need any more information, please let me know.

EDIT: Sorry, I meant patch #97373 . But seems that is related to this patch.

@mikaelholmen
Copy link
Collaborator

Thanks! Reverted the enabling. Will look into this case and fix the issue.

Thanks!

@haopliu
Copy link
Contributor Author

haopliu commented Dec 22, 2024

Hi @ayrivera-intel,

Thanks for raising the negative length issue. Fixed this problem in #120874.

haopliu added a commit that referenced this pull request Jan 23, 2025
(Retry) enable the initializes improvement in DSE.

Initially enabled in #119116.

Fix the aliasing issue through global variables in
#120044.

The compile-time comparison of this enabling (no meaningful diff):
https://llvm-compile-time-tracker.com/compare.php?from=b46fcb9fa32f24660b1b8858d5c4cbdb76ef9d8b&to=33dc817b81f7898c87b052d1ddfd3d6e6f5b5dbd&stat=instructions%3Au
github-actions bot pushed a commit to arm/arm-toolchain that referenced this pull request Jan 24, 2025
(Retry) enable the initializes improvement in DSE.

Initially enabled in llvm/llvm-project#119116.

Fix the aliasing issue through global variables in
llvm/llvm-project#120044.

The compile-time comparison of this enabling (no meaningful diff):
https://llvm-compile-time-tracker.com/compare.php?from=b46fcb9fa32f24660b1b8858d5c4cbdb76ef9d8b&to=33dc817b81f7898c87b052d1ddfd3d6e6f5b5dbd&stat=instructions%3Au
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants