Don't `alloca` just to look at a discriminant #138391

scottmcm · 2025-03-12T08:01:56Z

Today we're making LLVM do a bunch of extra work when you match on trivial stuff like `Option<bool>` or `ControlFlow<u8>`.

This PR changes that so that simple types like `Option<u32>` or `Result<(), Box<Error>>` can stay as `OperandValue::Pair` and we can still read the discriminant from them, rather than needing to write them into memory to get a `PlaceValue` just to read the discriminant out.

Fixes #137503

Today we're making LLVM do a bunch of extra work for every enum you match on, even trivial stuff like `Option<bool>`. Let's not.
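For a concrete picture of the kind of code this targets, here is a minimal sketch of matches on such small enums. The function names are made up for illustration; only the types come from the description above.

```rust
use std::ops::ControlFlow;

/// Hypothetical example: `Option<bool>` is a tiny value, so reading its
/// discriminant should not need a trip through stack memory.
pub fn classify(x: Option<bool>) -> u8 {
    match x {
        None => 0,
        Some(false) => 1,
        Some(true) => 2,
    }
}

/// Hypothetical example: `Option<u32>` is typically handled as a
/// (tag, payload) pair of scalars.
pub fn unwrap_or_zero(x: Option<u32>) -> u32 {
    match x {
        Some(v) => v,
        None => 0,
    }
}

/// Hypothetical example using `ControlFlow<u8>` (the continue type defaults to `()`).
pub fn is_break(c: ControlFlow<u8>) -> bool {
    match c {
        ControlFlow::Break(_) => true,
        ControlFlow::Continue(()) => false,
    }
}
```

Each `match` here only needs the discriminant (and possibly the payload) of a value that is already available as one or two scalars.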

rustbot · 2025-03-12T08:02:02Z

r? @lcnr

rustbot has assigned @lcnr.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

rustbot · 2025-03-12T08:02:03Z

Some changes occurred in compiler/rustc_codegen_ssa

cc @WaffleLapkin

scottmcm · 2025-03-12T08:06:18Z

tests/codegen/match-optimizes-away.rs

-    // CHECK-LABEL: @four_valued
+    // CHECK-LABEL: i16 @four_valued(i16{{.+}}%x)
    // CHECK-NEXT: {{^.*:$}}
-    // CHECK-NEXT: ret i16 %0
+    // CHECK-NEXT: ret i16 %x


Annot: These used to name the parameter %0 because it was written into an %x alloca. Now that the alloca is no longer needed, the parameter gets the %x name directly.
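The test source is not shown in this thread, so the following is only a guess at the shape of function that CHECK lines like these exercise: an identity written as an exhaustive `match` over a four-variant enum with a 16-bit representation. The enum name, variant names, and `repr` are assumptions.

```rust
/// Hypothetical four-variant enum; `repr(u16)` is assumed so that it
/// would lower to a 16-bit integer like the `i16` in the CHECK lines.
#[repr(u16)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Four {
    A,
    B,
    C,
    D,
}

/// An identity function spelled out as a match. After optimization this
/// kind of function can collapse to returning its argument unchanged.
pub fn four_valued(x: Four) -> Four {
    match x {
        Four::A => Four::A,
        Four::B => Four::B,
        Four::C => Four::C,
        Four::D => Four::D,
    }
}
```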

scottmcm · 2025-03-12T08:09:50Z

compiler/rustc_codegen_ssa/src/mir/operand.rs

+        let tag_op = match self.val {
+            OperandValue::ZeroSized => bug!(),
+            OperandValue::Immediate(_) | OperandValue::Pair(_, _) => {
+                self.extract_field(fx, bx, tag_field)
+            }
+            OperandValue::Ref(place) => {
+                let tag = place.with_type(self.layout).project_field(bx, tag_field);
+                bx.load_operand(tag)
+            }
+        };


This `codegen_get_discr` method is moved nearly unchanged from `PlaceRef`. The only new part is the extra case that calls `extract_field` when the operand is one or two immediates. If it's a `Ref`, though, we do the same `project_field` as before.
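The idea can be sketched with a toy model. None of the types below are the real `rustc_codegen_ssa` ones: `ToyOperand`, `read_tag`, and the `memory` slice are stand-ins invented for illustration, with plain integers in place of backend values.

```rust
/// Toy stand-in for an operand: either held directly as scalars,
/// or living in memory and referred to by a slot index.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum ToyOperand {
    ZeroSized,
    Immediate(u64),
    Pair(u64, u64),
    Ref(usize),
}

/// Reads the tag field of an operand. `memory` models stack slots,
/// each holding the fields of one in-memory value.
fn read_tag(op: &ToyOperand, memory: &[Vec<u64>], tag_field: usize) -> u64 {
    match op {
        // A zero-sized value has no tag to read.
        ToyOperand::ZeroSized => panic!("zero-sized operand has no tag"),
        // Already a scalar: the tag is the value itself.
        ToyOperand::Immediate(v) => {
            assert_eq!(tag_field, 0);
            *v
        }
        // Already a pair of scalars: pick the right half, no memory needed.
        ToyOperand::Pair(a, b) => match tag_field {
            0 => *a,
            1 => *b,
            _ => panic!("a pair only has fields 0 and 1"),
        },
        // In memory: project to the field and load it, as before.
        ToyOperand::Ref(slot) => memory[*slot][tag_field],
    }
}
```

The point of the sketch is that only the `Ref` arm touches `memory`; the other arms read the tag from values that are already at hand.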

scottmcm · 2025-03-12T08:11:22Z

@bors try @rust-timer queue

bors · 2025-03-12T08:12:35Z

⌛ Trying commit 143f393 with merge 3cda1f4...

lcnr · 2025-03-12T08:22:54Z

r? codegen

bors · 2025-03-12T10:16:41Z

☀️ Try build successful - checks-actions
Build commit: 3cda1f4 (3cda1f4b257a0fdcc0b8279efce522cebd5263c5)

rust-timer · 2025-03-12T12:14:52Z

Finished benchmarking commit (3cda1f4): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.8%	[-0.8%, -0.8%]	1
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-0.8%	[-0.8%, -0.8%]	1

Max RSS (memory usage)

Results (primary -2.2%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-2.2%	[-2.2%, -2.2%]	1
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-2.2%	[-2.2%, -2.2%]	1

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

Results (primary -0.0%, secondary -0.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	0.0%	[0.0%, 0.0%]	1
Improvements ✅ (primary)	-0.0%	[-0.1%, -0.0%]	27
Improvements ✅ (secondary)	-0.0%	[-0.1%, -0.0%]	10
All ❌✅ (primary)	-0.0%	[-0.1%, -0.0%]	27

Bootstrap: 779.599s -> 779.273s (-0.04%)
Artifact size: 365.29 MiB -> 365.24 MiB (-0.01%)

scottmcm · 2025-03-12T19:05:59Z

compiler/rustc_codegen_ssa/src/mir/intrinsic.rs

-            sym::discriminant_value => {
-                if ret_ty.is_integral() {
-                    args[0].deref(bx.cx()).codegen_get_discr(bx, ret_ty)
-                } else {
-                    span_bug!(span, "Invalid discriminant type for `{:?}`", arg_tys[0])
-                }
-            }


Annot: this is dead code, since LowerIntrinsics turns all the calls to this into MIR.

(And by deleting it there was only the one caller of codegen_get_discr to update.)
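For context, `discriminant_value` is the intrinsic behind `std::mem::discriminant`, so user code still reaches this functionality; per the annotation above, the call is simply lowered to MIR before codegen sees it. A minimal usage sketch, with a made-up `Shape` enum and helper name:

```rust
use std::mem;

/// Hypothetical enum used only for this illustration.
#[allow(dead_code)]
#[derive(Debug)]
enum Shape {
    Circle(f64),
    Square(f64),
    Point,
}

/// Returns true when both values are the same variant,
/// regardless of the payload they carry.
fn same_variant(a: &Shape, b: &Shape) -> bool {
    mem::discriminant(a) == mem::discriminant(b)
}
```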

scottmcm · 2025-03-12T19:06:23Z

TBH I'd hoped for some icount improvements from doing less work, but oh well. I'll take no regressions and a debug-mode code size improvement, given how small of a code change it took.

@rustbot ready

WaffleLapkin · 2025-03-13T05:58:39Z

compiler/rustc_codegen_ssa/src/mir/operand.rs

+            return bx.cx().const_poison(cast_to);
+        }
+        let (tag_scalar, tag_encoding, tag_field) = match self.layout.variants {
+            Variants::Empty => unreachable!("we already handled uninhabited types"),


question: Can we just move the poison return here, instead of doing the if? Or is there a case where is_uninhabited is true, but .variants is not Empty?

I was wondering that myself, TBH, but wasn't sure so left it as it was.

Exploring a bit more now (and thinking of the whole "is (i32, !) a ZST?" issue), it looks like yes it's possible: https://rust.godbolt.org/z/sjq1hEs8r

pub enum Foo<T> { Hmm(i32, T) }
pub type Bar = Foo<std::convert::Infallible>;

That Bar has

uninhabited: true,
variants: Single {
    index: 0,
},

so it's uninhabited but still single-variant.

rust/compiler/rustc_abi/src/layout.rs

Lines 25 to 29 in 8536f20

// A variant is absent if it's uninhabited and only has ZST fields.
// Present uninhabited variants only require space for their fields,
// but *not* an encoding of the discriminant (e.g., a tag value).
// See issue #49298 for more details on the need to leave space
// for non-ZST uninhabited data (mostly partial initialization).
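One way to poke at this from ordinary Rust is to compare sizes. The sketch below reuses the `Foo`/`Bar` definitions from the comment; `report_sizes` is a made-up helper. The size of `Bar` is a layout detail that is not guaranteed, so the code reports it rather than assuming a value.

```rust
use std::convert::Infallible;
use std::mem::size_of;

pub enum Foo<T> {
    Hmm(i32, T),
}

pub type Bar = Foo<Infallible>;

/// Returns (size of `Infallible`, size of `Bar`).
/// `Infallible` has no variants and occupies no space, while `Bar` is
/// uninhabited but still has an `i32` field it may need room for.
pub fn report_sizes() -> (usize, usize) {
    (size_of::<Infallible>(), size_of::<Bar>())
}

/// This compiles even though it can never be called with a real value:
/// `Bar` still has exactly one variant to match on.
pub fn first_field(b: Bar) -> i32 {
    match b {
        Foo::Hmm(n, _) => n,
    }
}
```

If the second number comes back non-zero, that is the "present uninhabited variant" case the quoted layout comment describes.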

WaffleLapkin · 2025-03-13T06:02:03Z

compiler/rustc_codegen_ssa/src/mir/operand.rs

+                    if let Some(discr) = self.layout.ty.discriminant_for_variant(bx.tcx(), index) {
+                        discr.val
+                    } else {
+                        assert_eq!(index, FIRST_VARIANT);
+                        0
+                    };


question: why is this behavior not embedded in discriminant_for_variant itself?

On an ADT it doesn't have the option return: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.AdtDef.html#method.discriminant_for_variant

On `Ty`, though, it returns `None` for things that are neither enums nor coroutines. From a quick scan there appear to be a bunch of places `unwrap`ing or `?`ing it, presumably as a "look, this is a type where I don't actually want to look at the discriminant in the first place". (But `mem::discriminant` works on arbitrary stuff, even `i32`, so stable code can get to checking a discriminant, especially in debug where the MIR is polymorphic.)

So it's not obvious to me that changing the Ty version to default to just giving a zero instead of a None would be a good thing.

(I changed it from what existed before on Place because it used to be using the index.as_u32() rather than zero, but I can't see any way that it would ever be possible to have a type with a non-zero VariantIdx reach this arm, so wanted to upgrade that to an ICE if somehow it ever did since that's probably a mistake -- after all, returning the VariantIdx as a discriminant is the wrong thing to do for enums too.)
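The fallback described here can be modeled in a few lines. This is a toy sketch, not the compiler code: `discr_or_zero` is a made-up name, `ty_discr` stands in for whatever `Ty::discriminant_for_variant` returned, and plain integers replace `VariantIdx` and the discriminant value.

```rust
/// Toy model of the fallback: use the type's own discriminant when it has
/// one, otherwise insist we are looking at variant 0 and answer zero.
fn discr_or_zero(ty_discr: Option<u128>, variant_index: u32) -> u128 {
    // Stand-in for the compiler's `FIRST_VARIANT` constant.
    const FIRST_VARIANT: u32 = 0;
    match ty_discr {
        Some(val) => val,
        None => {
            // A type without discriminants should only ever have one "variant";
            // anything else is treated as a bug rather than silently accepted.
            assert_eq!(variant_index, FIRST_VARIANT);
            0
        }
    }
}
```

Note that the variant index is never returned as the discriminant: for an enum with explicit discriminants the two can differ, which is why the first call in a check like `discr_or_zero(Some(7), 3)` yields 7 rather than 3.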

WaffleLapkin · 2025-03-13T16:45:48Z

r? WaffleLapkin
@bors r+

bors · 2025-03-13T16:45:51Z

📌 Commit 2b15dd1 has been approved by WaffleLapkin

It is now in the queue for this repository.

bors · 2025-03-14T00:42:34Z

⌛ Testing commit 2b15dd1 with merge addae07...

bors · 2025-03-14T03:50:12Z

☀️ Test successful - checks-actions
Approved by: WaffleLapkin
Pushing addae07 to master...

github-actions · 2025-03-14T03:52:00Z

Post-merge analysis result

Test differences

No test diffs found

rust-timer · 2025-03-14T06:22:32Z

Finished benchmarking commit (addae07): comparison URL.

Overall result: ✅ improvements - no action needed

@rustbot label: -perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.7%	[-0.7%, -0.7%]	1
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-0.7%	[-0.7%, -0.7%]	1

Max RSS (memory usage)

Results (primary -2.6%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-2.6%	[-2.6%, -2.6%]	1
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-2.6%	[-2.6%, -2.6%]	1

Cycles

Results (secondary 4.6%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	4.6%	[3.2%, 5.9%]	2
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-	-	0

Binary size

Results (primary -0.0%, secondary -0.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	0.0%	[0.0%, 0.0%]	1
Improvements ✅ (primary)	-0.0%	[-0.1%, -0.0%]	27
Improvements ✅ (secondary)	-0.0%	[-0.1%, -0.0%]	11
All ❌✅ (primary)	-0.0%	[-0.1%, -0.0%]	27

Bootstrap: 774.314s -> 774.074s (-0.03%)
Artifact size: 365.05 MiB -> 364.97 MiB (-0.02%)

Don't alloca just to look at a discriminant

143f393

Today we're making LLVM do a bunch of extra work for every enum you match on, even trivial stuff like `Option<bool>`. Let's not.

rustbot assigned lcnr Mar 12, 2025

rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Mar 12, 2025

rustbot added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Mar 12, 2025

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 12, 2025

rustbot assigned workingjubilee and unassigned lcnr Mar 12, 2025

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 12, 2025

WaffleLapkin reviewed Mar 13, 2025

Add more comments to discriminant calculations.

2b15dd1

WaffleLapkin approved these changes Mar 13, 2025

rustbot assigned WaffleLapkin and unassigned workingjubilee Mar 13, 2025

bors removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Mar 13, 2025

bors added the S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. label Mar 13, 2025

bors added the merged-by-bors This PR was explicitly merged by bors. label Mar 14, 2025

bors merged commit addae07 into rust-lang:master Mar 14, 2025
7 checks passed

rustbot added this to the 1.87.0 milestone Mar 14, 2025

scottmcm deleted the SSA-discriminants branch March 14, 2025 05:32
