Don't `alloca` just to look at a discriminant #138391
Today we're making LLVM do a bunch of extra work for every enum you match on, even trivial stuff like `Option<bool>`. Let's not.
Some changes occurred in compiler/rustc_codegen_ssa
// CHECK-LABEL: @four_valued
// CHECK-LABEL: i16 @four_valued(i16{{.+}}%x)
// CHECK-NEXT: {{^.*:$}}
// CHECK-NEXT: ret i16 %0
// CHECK-NEXT: ret i16 %x
annot: these used to have the parameter named `%0` because it was written into an `%x` alloca, but now that that's no longer needed, we give the `%x` name to the parameter directly.
let tag_op = match self.val {
    OperandValue::ZeroSized => bug!(),
    OperandValue::Immediate(_) | OperandValue::Pair(_, _) => {
        self.extract_field(fx, bx, tag_field)
    }
    OperandValue::Ref(place) => {
        let tag = place.with_type(self.layout).project_field(bx, tag_field);
        bx.load_operand(tag)
    }
};
This `codegen_get_discr` method is moved nearly unchanged from `PlaceRef`, just now we have the extra case to `extract_field` if it's immediate(s). If it's a `Ref`, though, we do the same `project_field` as before.
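As a rough illustration of that split, here's a toy model, not the real rustc code: `ToyOperand` and `read_tag` are hypothetical stand-ins for `OperandValue` and the tag-reading part of `codegen_get_discr`, and a plain `u64` slice stands in for memory. The point is only that the immediate and pair cases can hand back a value they already hold, while the `Ref` case has to go through memory.

```rust
// Toy model only: these names are hypothetical stand-ins, not rustc's real
// `OperandValue` or `PlaceRef` types.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ToyOperand {
    /// No data at all, so there is no tag to read.
    ZeroSized,
    /// A single value held directly (think: one SSA value).
    Immediate(u64),
    /// Two values held directly (think: a scalar pair).
    Pair(u64, u64),
    /// The value lives in memory, starting at this index of the slice.
    Ref(usize),
}

/// Reads field number `tag_field` of the operand.
/// Only the `Ref` case touches `memory`; the others already hold the value.
pub fn read_tag(op: ToyOperand, memory: &[u64], tag_field: usize) -> u64 {
    match op {
        ToyOperand::ZeroSized => panic!("a zero-sized operand has no tag to read"),
        ToyOperand::Immediate(v) => {
            assert_eq!(tag_field, 0, "an immediate has only one field");
            v
        }
        ToyOperand::Pair(a, b) => match tag_field {
            0 => a,
            1 => b,
            _ => panic!("a pair has only two fields"),
        },
        // Same idea as projecting to the field and loading it.
        ToyOperand::Ref(base) => memory[base + tag_field],
    }
}
```

Calling `read_tag(ToyOperand::Pair(1, 42), &[], 0)` never looks at the (empty) memory slice, which is the property the PR is after for things like `Option<u32>`.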
@bors try @rust-timer queue
Don't `alloca` just to look at a discriminant

Today we're making LLVM do a bunch of extra work when you match on trivial stuff like `Option<bool>` or `ControlFlow<u8>`.

This PR changes that so that simple types like `Option<u32>` or `Result<(), Box<Error>>` can stay as `OperandValue::ScalarPair` and we can still read the discriminant from them, rather than needing to write them into memory to have a `PlaceValue` just to get the discriminant out.

Fixes rust-lang#137503
r? codegen
☀️ Try build successful - checks-actions
Finished benchmarking commit (3cda1f4): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never

Instruction count
This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

Max RSS (memory usage)
Results (primary -2.2%)
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

Cycles
This benchmark run did not return any relevant results for this metric.

Binary size
Results (primary -0.0%, secondary -0.0%)
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

Bootstrap: 779.599s -> 779.273s (-0.04%)
sym::discriminant_value => {
    if ret_ty.is_integral() {
        args[0].deref(bx.cx()).codegen_get_discr(bx, ret_ty)
    } else {
        span_bug!(span, "Invalid discriminant type for `{:?}`", arg_tys[0])
    }
}
Annot: this is dead code, since `LowerIntrinsics` turns all the calls to this into MIR.
(And by deleting it there was only the one caller of `codegen_get_discr` to update.)
TBH I'd hoped for some icount improvements from doing less work, but oh well. I'll take no regressions and a debug-mode code size improvement, given how small of a code change it took.

@rustbot ready
    return bx.cx().const_poison(cast_to);
}

let (tag_scalar, tag_encoding, tag_field) = match self.layout.variants {
    Variants::Empty => unreachable!("we already handled uninhabited types"),
question: Can we just move the poison return here, instead of doing the `if`? Or is there a case where `is_uninhabited` is true, but `.variants` is not `Empty`?
I was wondering that myself, TBH, but wasn't sure so left it as it was.

Exploring a bit more now (and thinking of the whole "is `(i32, !)` a ZST?" issue), it looks like yes it's possible: https://rust.godbolt.org/z/sjq1hEs8r

    pub enum Foo<T> {
        Hmm(i32, T)
    }
    pub type Bar = Foo<std::convert::Infallible>;

That `Bar` has

    uninhabited: true,
    variants: Single {
        index: 0,
    },

so it's uninhabited but still single-variant.
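One way to poke at this from ordinary code, without dumping the layout, is to look at sizes. This is only a sketch: `bar_size` and `infallible_size` are made-up helper names, and Rust doesn't guarantee the layout of such types, so treat the result as an observation about current compilers rather than a promise.

```rust
use std::convert::Infallible;
use std::mem::size_of;

pub enum Foo<T> {
    Hmm(i32, T),
}

// No value of this type can exist, because `Hmm` needs an `Infallible`.
pub type Bar = Foo<Infallible>;

/// Size of the uninhabited-but-single-variant type (hypothetical helper).
pub fn bar_size() -> usize {
    size_of::<Bar>()
}

/// Size of the plain empty enum, for comparison (hypothetical helper).
pub fn infallible_size() -> usize {
    size_of::<Infallible>()
}
```

`Infallible` itself takes no space, while `Bar` still reserves room for its `i32` field, which fits it being laid out as a present single variant rather than as `Variants::Empty`.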
rust/compiler/rustc_abi/src/layout.rs
Lines 25 to 29 in 8536f20

    // A variant is absent if it's uninhabited and only has ZST fields.
    // Present uninhabited variants only require space for their fields,
    // but *not* an encoding of the discriminant (e.g., a tag value).
    // See issue #49298 for more details on the need to leave space
    // for non-ZST uninhabited data (mostly partial initialization).
if let Some(discr) = self.layout.ty.discriminant_for_variant(bx.tcx(), index) {
    discr.val
} else {
    assert_eq!(index, FIRST_VARIANT);
    0
};
question: why is this behavior not embedded in `discriminant_for_variant` itself?
On an ADT it doesn't have the option return: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.AdtDef.html#method.discriminant_for_variant

On `Ty`, though, it returns `None` for things that are neither `enum`s nor coroutines. From a quick scan there appear to be a bunch of places `unwrap`ing or `?`ing it, presumably as a "look, this is a type where I don't actually want to look at the discriminant in the first place". (But `mem::discriminant` works on arbitrary stuff, even `i32`, so stable code can get to checking a discriminant, especially in debug where the MIR is polymorphic.)

So it's not obvious to me that changing the `Ty` version to default to just giving a zero instead of a `None` would be a good thing.

(I changed it from what existed before on `Place` because it used to be using the `index.as_u32()` rather than zero, but I can't see any way that it would ever be possible to have a type with a non-zero `VariantIdx` reach this arm, so wanted to upgrade that to an ICE if somehow it ever did since that's probably a mistake -- after all, returning the `VariantIdx` as a discriminant is the wrong thing to do for enums too.)
r? WaffleLapkin
☀️ Test successful - checks-actions
Post-merge analysis result

Test differences
No test diffs found
Finished benchmarking commit (addae07): comparison URL.

Overall result: ✅ improvements - no action needed

@rustbot label: -perf-regression

Instruction count
This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

Max RSS (memory usage)
Results (primary -2.6%)
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

Cycles
Results (secondary 4.6%)
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

Binary size
Results (primary -0.0%, secondary -0.0%)
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

Bootstrap: 774.314s -> 774.074s (-0.03%)