Handle scalable store size in MemCpyOptimizer #118957
Conversation
The compiler crashes with an ICE when it tries to create a `memset` with a scalable size.
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-backend-aarch64

Author: Momchil Velikov (momchil-velikov)

Changes: The compiler crashes with an ICE when it tries to create a `memset` with a scalable size.

Full diff: https://github.com/llvm/llvm-project/pull/118957.diff

2 Files Affected:
- llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
- llvm/test/CodeGen/AArch64/memset-scalable-size.ll
diff --git a/llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp b/llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
index 0cba5d077da62b..fc5f6ff2b7f377 100644
--- a/llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
+++ b/llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
@@ -800,8 +800,9 @@ bool MemCpyOptPass::processStore(StoreInst *SI, BasicBlock::iterator &BBI) {
// in subsequent passes.
auto *T = V->getType();
if (T->isAggregateType()) {
- uint64_t Size = DL.getTypeStoreSize(T);
IRBuilder<> Builder(SI);
+ Value *Size =
+ Builder.CreateTypeSize(Builder.getInt64Ty(), DL.getTypeStoreSize(T));
auto *M = Builder.CreateMemSet(SI->getPointerOperand(), ByteVal, Size,
SI->getAlign());
M->copyMetadata(*SI, LLVMContext::MD_DIAssignID);
diff --git a/llvm/test/CodeGen/AArch64/memset-scalable-size.ll b/llvm/test/CodeGen/AArch64/memset-scalable-size.ll
new file mode 100644
index 00000000000000..8ea6330f235a69
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/memset-scalable-size.ll
@@ -0,0 +1,56 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S --passes=memcpyopt < %s | FileCheck %s
+target triple = "aarch64-unknown-linux"
+
+define void @f0() {
+; CHECK-LABEL: define void @f0() {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: [[P:%.*]] = alloca { <vscale x 16 x i1>, <vscale x 16 x i1> }, align 2
+; CHECK-NEXT: [[TMP0:%.*]] = call i64 @llvm.vscale.i64()
+; CHECK-NEXT: [[TMP1:%.*]] = mul i64 [[TMP0]], 4
+; CHECK-NEXT: call void @llvm.memset.p0.i64(ptr align 2 [[P]], i8 0, i64 [[TMP1]], i1 false)
+; CHECK-NEXT: call void @g(ptr [[P]])
+; CHECK-NEXT: ret void
+;
+entry:
+ %p = alloca { <vscale x 16 x i1>, <vscale x 16 x i1>}, align 2
+ store { <vscale x 16 x i1>, <vscale x 16 x i1> } zeroinitializer, ptr %p, align 2
+ call void @g(ptr %p)
+ ret void
+}
+
+define void @f1() {
+; CHECK-LABEL: define void @f1() {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: [[P:%.*]] = alloca { <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float> }, align 16
+; CHECK-NEXT: [[TMP0:%.*]] = call i64 @llvm.vscale.i64()
+; CHECK-NEXT: [[TMP1:%.*]] = mul i64 [[TMP0]], 48
+; CHECK-NEXT: call void @llvm.memset.p0.i64(ptr align 16 [[P]], i8 0, i64 [[TMP1]], i1 false)
+; CHECK-NEXT: call void @g(ptr [[P]])
+; CHECK-NEXT: ret void
+;
+entry:
+ %p = alloca {<vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float> }, align 16
+ store {<vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float> } zeroinitializer, ptr %p, align 16
+ call void @g(ptr %p)
+ ret void
+}
+
+define void @f2() {
+; CHECK-LABEL: define void @f2() {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: [[P:%.*]] = alloca { <vscale x 8 x double>, <vscale x 8 x double>, <vscale x 8 x double> }, align 16
+; CHECK-NEXT: [[TMP0:%.*]] = call i64 @llvm.vscale.i64()
+; CHECK-NEXT: [[TMP1:%.*]] = mul i64 [[TMP0]], 192
+; CHECK-NEXT: call void @llvm.memset.p0.i64(ptr align 16 [[P]], i8 0, i64 [[TMP1]], i1 false)
+; CHECK-NEXT: call void @g(ptr [[P]])
+; CHECK-NEXT: ret void
+;
+entry:
+ %p = alloca {<vscale x 8 x double>, <vscale x 8 x double>, <vscale x 8 x double> }, align 16
+ store {<vscale x 8 x double>, <vscale x 8 x double>, <vscale x 8 x double> } zeroinitializer, ptr %p, align 16
+ call void @g(ptr %p)
+ ret void
+}
+
+declare void @g(ptr)
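For context on the fix: `DL.getTypeStoreSize(T)` returns a `TypeSize`, and for scalable vector types the byte count is only known as `vscale * known-min-size`, so forcing it into a plain `uint64_t` (as the old code did) is what triggered the ICE. `IRBuilderBase::CreateTypeSize` materialises the size as IR instead. Below is a minimal sketch of that idea, not the exact LLVM implementation; the helper name `emitStoreSize` is hypothetical.

```cpp
#include "llvm/IR/DataLayout.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Intrinsics.h"
#include "llvm/Support/TypeSize.h"

using namespace llvm;

// Hypothetical helper, not part of the patch: materialise the store size of
// type T as an i64 Value, roughly the way IRBuilderBase::CreateTypeSize does.
static Value *emitStoreSize(IRBuilder<> &Builder, const DataLayout &DL,
                            Type *T) {
  TypeSize TS = DL.getTypeStoreSize(T);
  ConstantInt *MinSize = Builder.getInt64(TS.getKnownMinValue());

  // Fixed-size types: the store size is a compile-time constant.
  if (!TS.isScalable())
    return MinSize;

  // Scalable types: the size is only known as vscale * known-min-size, so
  // emit `call i64 @llvm.vscale.i64()` followed by a multiply, matching the
  // CHECK lines in the new test.
  Value *VScale =
      Builder.CreateIntrinsic(Builder.getInt64Ty(), Intrinsic::vscale, {});
  return Builder.CreateMul(VScale, MinSize);
}
```

For fixed-size aggregates this folds to a plain constant, preserving the old behaviour; for scalable aggregates it yields the `llvm.vscale.i64`/`mul` sequence checked in the new test.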
Thanks for the fix. LGTM.
Is it actually desirable to convert scalable load/store into a memset?
Ah, good point!
LGTM