[clang] Use different memory layout type for _BitInt(N) in LLVM IR #91364
Conversation
Currently, for i128:128 targets, a correct lowering to iN is possible either for __int128 or for _BitInt(129+), but not both. Since we now have a correct implementation of __int128, this patch fixes codegen issues by lowering _BitInt(129+) types to an array of i8 for the "memory" representation, similarly to how it is already done for bools. Fixes llvm#85139 Fixes llvm#83419
@llvm/pr-subscribers-hlsl @llvm/pr-subscribers-clang-codegen Author: Mariya Podchishchaeva (Fznamznon). Changes: currently, for i128:128 targets, a correct implementation is possible either for __int128 or for _BitInt(129+) with lowering to iN, but not both. Since a correct implementation of __int128 is now in place after a21abc7, this patch fixes codegen issues by lowering _BitInt(129+) types to an array of i8 for the "memory" representation, similarly to how it is already done for bools. Full diff: https://github.com/llvm/llvm-project/pull/91364.diff 6 Files Affected:
diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp
index d96c7bb1e568..7e631e469a88 100644
--- a/clang/lib/CodeGen/CGExpr.cpp
+++ b/clang/lib/CodeGen/CGExpr.cpp
@@ -1989,6 +1989,14 @@ llvm::Value *CodeGenFunction::EmitLoadOfScalar(Address Addr, bool Volatile,
return EmitAtomicLoad(AtomicLValue, Loc).getScalarVal();
}
+ if (const auto *BIT = Ty->getAs<BitIntType>()) {
+ if (BIT->getNumBits() > 128) {
+ // Long _BitInt has array of bytes as in-memory type.
+ llvm::Type *NewTy = ConvertType(Ty);
+ Addr = Addr.withElementType(NewTy);
+ }
+ }
+
llvm::LoadInst *Load = Builder.CreateLoad(Addr, Volatile);
if (isNontemporal) {
llvm::MDNode *Node = llvm::MDNode::get(
diff --git a/clang/lib/CodeGen/CGExprConstant.cpp b/clang/lib/CodeGen/CGExprConstant.cpp
index 94962091116a..98ab1e23d128 100644
--- a/clang/lib/CodeGen/CGExprConstant.cpp
+++ b/clang/lib/CodeGen/CGExprConstant.cpp
@@ -1774,6 +1774,18 @@ llvm::Constant *ConstantEmitter::emitForMemory(CodeGenModule &CGM,
return Res;
}
+ if (const auto *BIT = destType->getAs<BitIntType>()) {
+ if (BIT->getNumBits() > 128) {
+ // Long _BitInt has array of bytes as in-memory type.
+ ConstantAggregateBuilder Builder(CGM);
+ llvm::Type *DesiredTy = CGM.getTypes().ConvertTypeForMem(destType);
+ auto *CI = cast<llvm::ConstantInt>(C);
+ llvm::APInt Value = CI->getValue();
+ Builder.addBits(Value, /*OffsetInBits=*/0, /*AllowOverwrite=*/false);
+ return Builder.build(DesiredTy, /*AllowOversized*/ false);
+ }
+ }
+
return C;
}
diff --git a/clang/lib/CodeGen/CGExprScalar.cpp b/clang/lib/CodeGen/CGExprScalar.cpp
index d84531959b50..717d47d20dea 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -5348,6 +5348,13 @@ Value *ScalarExprEmitter::VisitVAArgExpr(VAArgExpr *VE) {
return llvm::UndefValue::get(ArgTy);
}
+ if (const auto *BIT = Ty->getAs<BitIntType>()) {
+ if (BIT->getNumBits() > 128) {
+ // Long _BitInt has array of bytes as in-memory type.
+ ArgPtr = ArgPtr.withElementType(ArgTy);
+ }
+ }
+
// FIXME Volatility.
llvm::Value *Val = Builder.CreateLoad(ArgPtr);
diff --git a/clang/lib/CodeGen/CodeGenTypes.cpp b/clang/lib/CodeGen/CodeGenTypes.cpp
index e8d75eda029e..55c618677ddb 100644
--- a/clang/lib/CodeGen/CodeGenTypes.cpp
+++ b/clang/lib/CodeGen/CodeGenTypes.cpp
@@ -114,6 +114,12 @@ llvm::Type *CodeGenTypes::ConvertTypeForMem(QualType T, bool ForBitField) {
return llvm::IntegerType::get(getLLVMContext(),
(unsigned)Context.getTypeSize(T));
+ if (const auto *BIT = T->getAs<BitIntType>()) {
+ if (BIT->getNumBits() > 128)
+ R = llvm::ArrayType::get(CGM.Int8Ty,
+ (unsigned)Context.getTypeSize(T) / 8);
+ }
+
// Else, don't map it.
return R;
}
diff --git a/clang/test/CodeGen/ext-int-cc.c b/clang/test/CodeGen/ext-int-cc.c
index 001e866d34b4..83f20dcb0667 100644
--- a/clang/test/CodeGen/ext-int-cc.c
+++ b/clang/test/CodeGen/ext-int-cc.c
@@ -131,7 +131,7 @@ void ParamPassing3(_BitInt(15) a, _BitInt(31) b) {}
// are negated. This will give an error when a target does support larger
// _BitInt widths to alert us to enable the test.
void ParamPassing4(_BitInt(129) a) {}
-// LIN64: define{{.*}} void @ParamPassing4(ptr byval(i129) align 8 %{{.+}})
+// LIN64: define{{.*}} void @ParamPassing4(ptr byval([24 x i8]) align 8 %{{.+}})
// WIN64: define dso_local void @ParamPassing4(ptr %{{.+}})
// LIN32: define{{.*}} void @ParamPassing4(ptr %{{.+}})
// WIN32: define dso_local void @ParamPassing4(ptr %{{.+}})
diff --git a/clang/test/CodeGen/ext-int.c b/clang/test/CodeGen/ext-int.c
index 4cb399d108f2..a6a632bd985d 100644
--- a/clang/test/CodeGen/ext-int.c
+++ b/clang/test/CodeGen/ext-int.c
@@ -1,12 +1,19 @@
-// RUN: %clang_cc1 -triple x86_64-gnu-linux -O3 -disable-llvm-passes -emit-llvm -o - %s | FileCheck %s --check-prefixes=CHECK,CHECK64
-// RUN: %clang_cc1 -triple x86_64-windows-pc -O3 -disable-llvm-passes -emit-llvm -o - %s | FileCheck %s --check-prefixes=CHECK,CHECK64
-// RUN: %clang_cc1 -triple i386-gnu-linux -O3 -disable-llvm-passes -emit-llvm -o - %s | FileCheck %s --check-prefixes=CHECK,LIN32
-// RUN: %clang_cc1 -triple i386-windows-pc -O3 -disable-llvm-passes -emit-llvm -o - %s | FileCheck %s --check-prefixes=CHECK,WIN32
+// RUN: %clang_cc1 -std=c23 -triple x86_64-gnu-linux -O3 -disable-llvm-passes -emit-llvm -o - %s | FileCheck %s --check-prefixes=CHECK,CHECK64
+// RUN: %clang_cc1 -std=c23 -triple x86_64-windows-pc -O3 -disable-llvm-passes -emit-llvm -o - %s | FileCheck %s --check-prefixes=CHECK,CHECK64
+// RUN: %clang_cc1 -std=c23 -triple i386-gnu-linux -O3 -disable-llvm-passes -emit-llvm -o - %s | FileCheck %s --check-prefixes=CHECK,LIN32
+// RUN: %clang_cc1 -std=c23 -triple i386-windows-pc -O3 -disable-llvm-passes -emit-llvm -o - %s | FileCheck %s --check-prefixes=CHECK,WIN32
+
+// CHECK64: %struct.S1 = type { i17, [4 x i8], [24 x i8] }
+// CHECK64: %struct.S2 = type { [40 x i8], i32, [4 x i8] }
//GH62207
unsigned _BitInt(1) GlobSize1 = 0;
// CHECK: @GlobSize1 = {{.*}}global i1 false
+// CHECK64: @__const.foo.A = private unnamed_addr constant { i17, [4 x i8], <{ i8, [23 x i8] }> } { i17 1, [4 x i8] undef, <{ i8, [23 x i8] }> <{ i8 -86, [23 x i8] zeroinitializer }> }, align 8
+// CHECK64: @BigGlob = {{.*}}global <{ i8, i8, [38 x i8] }> <{ i8 -68, i8 2, [38 x i8] zeroinitializer }>, align 8
+// CHECK64: @f.p = internal global <{ i8, i8, [22 x i8] }> <{ i8 16, i8 39, [22 x i8] zeroinitializer }>, align 8
+
void GenericTest(_BitInt(3) a, unsigned _BitInt(3) b, _BitInt(4) c) {
// CHECK: define {{.*}}void @GenericTest
int which = _Generic(a, _BitInt(3): 1, unsigned _BitInt(3) : 2, _BitInt(4) : 3);
@@ -62,3 +69,85 @@ void Size1ExtIntParam(unsigned _BitInt(1) A) {
// CHECK: store i1 %[[PARAM_LOAD]], ptr %[[IDX]]
B[2] = A;
}
+
+#if __BITINT_MAXWIDTH__ > 128
+struct S1 {
+ _BitInt(17) A;
+ _BitInt(129) B;
+};
+
+int foo(int a) {
+ // CHECK64: %A1 = getelementptr inbounds %struct.S1, ptr %B, i32 0, i32 0
+ // CHECK64: store i17 1, ptr %A1, align 8
+ // CHECK64: %B2 = getelementptr inbounds %struct.S1, ptr %B, i32 0, i32 2
+ // CHECK64: %0 = load i32, ptr %a.addr, align 4
+ // CHECK64: %conv = sext i32 %0 to i129
+ // CHECK64: store i129 %conv, ptr %B2, align 8
+ // CHECK64: %B3 = getelementptr inbounds %struct.S1, ptr %A, i32 0, i32 2
+ // CHECK64: %1 = load i129, ptr %B3, align 8
+ // CHECK64: %conv4 = trunc i129 %1 to i32
+ // CHECK64: %B5 = getelementptr inbounds %struct.S1, ptr %B, i32 0, i32 2
+ // CHECK64: %2 = load i129, ptr %B5, align 8
+ struct S1 A = {1, 170};
+ struct S1 B = {1, a};
+ return (int)A.B + (int)B.B;
+}
+
+struct S2 {
+ _BitInt(257) A;
+ int B;
+};
+
+_BitInt(257) bar() {
+ // CHECK64: define {{.*}}void @bar(ptr {{.*}} sret([40 x i8]) align 8 %[[RET:.+]])
+ // CHECK64: %A = alloca %struct.S2, align 8
+ // CHECK64: %0 = getelementptr inbounds { <{ i8, [39 x i8] }>, i32, [4 x i8] }, ptr %A, i32 0, i32 0
+ // CHECK64: %1 = getelementptr inbounds <{ i8, [39 x i8] }>, ptr %0, i32 0, i32 0
+ // CHECK64: store i8 1, ptr %1, align 8
+ // CHECK64: %2 = getelementptr inbounds { <{ i8, [39 x i8] }>, i32, [4 x i8] }, ptr %A, i32 0, i32 1
+ // CHECK64: store i32 10000, ptr %2, align 8
+ // CHECK64: %A1 = getelementptr inbounds %struct.S2, ptr %A, i32 0, i32 0
+ // CHECK64: %3 = load i257, ptr %A1, align 8
+ // CHECK64: store i257 %3, ptr %[[RET]], align 8
+ struct S2 A = {1, 10000};
+ return A.A;
+}
+
+void TakesVarargs(int i, ...) {
+ // CHECK64: define{{.*}} void @TakesVarargs(i32
+__builtin_va_list args;
+__builtin_va_start(args, i);
+
+_BitInt(160) A = __builtin_va_arg(args, _BitInt(160));
+ // CHECK64: %[[ARG:.+]] = load i160
+ // CHECK64: store i160 %[[ARG]], ptr %A, align 8
+}
+
+_BitInt(129) *f1(_BitInt(129) *p) {
+ // CHECK64: getelementptr inbounds [24 x i8], {{.*}} i64 1
+ return p + 1;
+}
+
+char *f2(char *p) {
+ // CHECK64: getelementptr inbounds i8, {{.*}} i64 24
+ return p + sizeof(_BitInt(129));
+}
+
+auto BigGlob = (_BitInt(257))700;
+// CHECK64: define {{.*}}void @foobar(ptr {{.*}} sret([40 x i8]) align 8 %[[RET1:.+]])
+_BitInt(257) foobar() {
+ // CHECK64: %A = alloca [40 x i8], align 8
+ // CHECK64: %0 = load i257, ptr @BigGlob, align 8
+ // CHECK64: %add = add nsw i257 %0, 1
+ // CHECK64: store i257 %add, ptr %A, align 8
+ // CHECK64: %1 = load i257, ptr %A, align 8
+ // CHECK64: store i257 %1, ptr %[[RET1]], align 8
+ _BitInt(257) A = BigGlob + 1;
+ return A;
+}
+
+void f() {
+ static _BitInt(130) p = {10000};
+}
+
+#endif
This is unfortunate, and will likely result in the FPGAs needing to generate extra bits here, so this is somewhat harmful in that regard.
It seems to me this is a case where we're trying to work around an LLVM bug? Should we just be fixing that instead?
Maybe add a helper somewhere to check "is this type a bitint wider than 128 bits"?
clang/lib/CodeGen/CGExprConstant.cpp (outdated)
      // Long _BitInt has array of bytes as in-memory type.
      ConstantAggregateBuilder Builder(CGM);
      llvm::Type *DesiredTy = CGM.getTypes().ConvertTypeForMem(destType);
      auto *CI = cast<llvm::ConstantInt>(C);
I'm not sure this cast is guaranteed to succeed? At least in some cases, we emit constant expressions involving a ptrtoint. Maybe at the widths in question, that can't happen, but this deserves a comment explaining what's going on.
I've added a comment. I'm not able to get a ptrtoint in a constant expression involving a big _BitInt.
How about a "small" _BitInt? The comment starts with "// LLVM type doesn't match AST type only for big enough _BitInts", and for AArch32 and AArch64 we are going to have non-matching LLVM types even for "small" _BitInts: for AArch32 because the ABI wants the padding bits of the in-memory representation to contain zero or the sign bit, and for both because we'd like to emit loads/stores in bigger chunks, e.g. i17 is a single i32 load/store, as opposed to two separate accesses to i16 and i8.
The test case here is just going to be something like _SomeSplitBitIntType x = (unsigned long)&someVariable;. What code do we actually produce for this? Sometimes we'll be able to fall back on dynamic initialization, but that's not always an option.
Ideally, it's just invalid to do something like that. It certainly needs to be diagnosed if the integer type is narrower than the pointer, and wider is also problematic, although less so and in a different way.
Well, it seems it doesn't depend on the size of _BitInt. Using something like _SomeSplitBitIntType x = (unsigned long)&someVariable; either fails (if, for example, I apply constexpr) or falls back on dynamic initialization. So I changed the comment to make it more generic.
You mean, revert https://reviews.llvm.org/D86310? Making any changes in LLVM here is painful; I'd rather not revisit that. CC @hvdijk @rnk
I didn't, no, but I hadn't seen all that conversation. Aaron has explained a bit more of the context here, and I'm finding myself pretty confused/out of the loop. As this is effectively all codegen, I suspect you, plus your CCs, are the best ones to review this. I don't see a problem with this except for the FPGA folks, though between: 1- FPGA folks rarely (if ever) use large types like this if they can help it. […] I don't think I have strong feelings here.
I don't think FPGA folks will run into any practical issue with this; it only impacts the in-memory types, and backends shouldn't really be using in-memory types for anything anyway.
clang/lib/CodeGen/CGExpr.cpp (outdated)
@@ -1989,6 +1989,14 @@ llvm::Value *CodeGenFunction::EmitLoadOfScalar(Address Addr, bool Volatile,
     return EmitAtomicLoad(AtomicLValue, Loc).getScalarVal();
   }

  if (const auto *BIT = Ty->getAs<BitIntType>()) {
    if (BIT->getNumBits() > 128) {
For a number of bits >64 and <=128, LLVM's iN type will have an identical representation to Clang's _BitInt(N) but different alignment. I think this is fine (I think nothing needs their alignment to match Clang's), but could you double-check to make sure you agree?
These types remain unchanged.
Thanks for doing this; it's unfortunate that Clang is in a rather broken state with these types right now, and it will be good to see improvement. I think the approach you're taking here is the only approach that will work.
I played with the idea of using LLVM packed structs ([…]). LLVM DataLayout's idea of […]. Using byte arrays for the in-memory type should work, so it's probably the best path forward.
Hmm. I think this is actually pretty different from the […]. The problem being presented here is this: […]

However, it doesn't follow from the need to use […]

I expect that problem (2) also applies to […]. The upshot is that code like […]

Edit: I originally defined […]
If you want to do things that way, you will need to (1) generalize CodeGenTypes with a new API that will return this load/store type when applicable and (2) look at all the places we call […]. You definitely should not be hard-coding 128 in a bunch of places. The load/store type should always be […]
You're suggesting we should fork ConvertTypeForMem into two functions? So there's actually three types: the "register" type, the "load/store" type, and the "in-memory" type. I guess that makes sense from a theoretical perspective, but... as a practical matter, I'm not sure how many places need to call the proposed "ConvertTypeForLoadStore". In EmitLoadOfScalar(), instead of checking for BitInt, you just unconditionally do […]
My experience is that compiler writers are really good at hacking in special cases to make their test cases work and really bad at recognizing that their case isn't as special as they think. There are three types already called out for special treatment in […]
Thank you, everyone, for the feedback. I'm working on applying it.
clang/lib/CodeGen/CGExpr.cpp (outdated)
  if (const auto *BIT = Ty->getAs<BitIntType>()) {
    if (BIT->getNumBits() > 128) {
      // Long _BitInt has array of bytes as in-memory type.
      llvm::Type *NewTy = ConvertType(Ty);
Shouldn't we be calling ConvertTypeForMem here?
The idea was to load not the array but iN, so ConvertType here was intentional. However, I'm updating this patch soon; it will use a special load/store type, whose idea is described in #91364 (comment).
Oh, I see. It looks close to what we are trying to do with #93495, which is:
- create in-memory representations according to the target ABI
- improve efficiency of loads/stores: e.g. a load/store of i18 in LLVM must touch just 3 bytes, so a compiler would emit one 16-bit load and one 8-bit load, but if i18 comes from _BitInt(18) then a single 32-bit load would work better.
This patch was mostly intended to fix codegen issues with big _BitInt types (>128 bits for 64-bit targets); however, I'm adding the new idea of a load/store type, so that seems close.
✅ With the latest revision this PR passed the C/C++ code formatter.
This is generally looking great, and I think it's ready to go as soon as you can finish the tests. (You said you weren't able to update all the tests — did you have questions about the remaining tests?)
I did have a thought, though. Are we confident that the in-memory layout that LLVM is using for these large integer types matches the layout specified by the ABI? I know this patch makes the overall sizes match, but there's also an endianness question. When LLVM stores an i96, I assume it always stores them using the overall endianness of the target; for example, on i386, it might do three 32-bit stores with the low 32 bits at offset 0, the middle 32 bits at offset 4, and the high 32 bits at offset 8. I just want to make sure that the ABI specification for _BitInt always matches that. In particular, I'm worried that it might do some middle-endian thing where it breaks the integer into chunks and then stores those chunks in little-endian order even on a big-endian machine. (That is generally the right thing to do for BigInt types because most arithmetic operations access the chunks in little-endian order, and doing adjacent memory accesses in increasing order is generally more architecture-friendly.)
@@ -107,17 +107,52 @@ llvm::Type *CodeGenTypes::ConvertTypeForMem(QualType T, bool ForBitField) {
     return llvm::IntegerType::get(FixedVT->getContext(), BytePadded);
   }

  // If this is a bool type, or a bit-precise integer type in a bitfield
  // representation, map this integer to the target-specified size.
Let's keep this comment; we just need to update it a little:
// If T is _Bool or a _BitInt type, ConvertType will produce an IR type
// with the exact semantic bit-width of the AST type; for example,
// _BitInt(17) will turn into i17. In memory, however, we need to store
// such values extended to their full storage size as decided by AST
// layout; this is an ABI requirement. Ideally, we would always use an
// integer type that's just the bit-size of the AST type; for example, if
// sizeof(_BitInt(17)) == 4, _BitInt(17) would turn into i32. That is what's
// returned by convertTypeForLoadStore. However, that type does not
// always satisfy the size requirement on memory representation types
// described above. For example, a 32-bit platform might reasonably set
// sizeof(_BitInt(65)) == 12, but i96 is likely to have an alloc size
// of 16 bytes in the LLVM data layout. In these cases, we simply return
// a byte array of the appropriate size.
Added, thanks.
FWIW, I was chasing down ABI documents yesterday, and found: x86-64 (https://gitlab.com/x86-psABIs/x86-64-ABI): […]
ARM 32-bit (https://github.com/ARM-software/abi-aa/blob/main/aapcs32/aapcs32.rst): […]
ARM 64-bit (https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst): […]
The latest RISC-V, LoongArch, and CSKY ABI documents I could find did not mention […]
Co-authored-by: John McCall <[email protected]>
…-project into long-bitint-align
Okay, so x86_64 describes it in byte terms and says they're little-endian, which is consistent with the overall target. Interestingly, it does not guarantee the content of the excess bits. The code generation in this patch is consistent with that: the extension we do is unnecessary but allowed, and then we truncate it away after load. If we ever add some way to tell the backend that a truncation is known to be reversing a sign/zero-extension, we'll need to not set it on this target. 32-bit and 64-bit ARM describe it in terms of smaller units, but the units are expressly laid out according to the overall endianness of the target, which composes to mean that the bytes overall are also laid out according to that endianness.
Given all that, I feel pretty comfortable relying on LLVM's […]
LGTM
Co-authored-by: John McCall <[email protected]>
FYI this already exists in the form of […]
This solves 5-6 issues we had downstream, many thanks!
Ah, neat. Mariya, would you mind looking into setting this properly on the truncates we're doing here? It'd be fine to do that as a follow-up; no need to hold up this PR for it. You'll need some kind of target hook to tell us whether to set it or not. Probably that ought to go in the Basic TargetInfo, just so all of the target-specific ABI configuration is done in one place.
Sure.
There are two problems with _BitInt prior to this patch:
Example: currently, for i128:128 targets, a correct lowering to iN is possible either for __int128 or for _BitInt(129+), but not both, since the correct implementation of __int128 was put in place by a21abc7.
When this happens, opaque [M x i8] types are used, where M = sizeof(_BitInt(N)).
This patch also introduces the concept of a load/store type and adds an API to CodeGenTypes that returns the IR type that should be used for load and store operations. This is particularly useful when a _BitInt ends up having an array of bytes as its memory layout type. For _BitInt(N), let M = sizeof(_BitInt(N)) and BITS = M * 8. Loads and stores of iBITS both (1) produce far better code from the backends and (2) are far more optimizable by IR passes than loads and stores of [M x i8].
Fixes #85139
Fixes #83419