-
-
Notifications
You must be signed in to change notification settings - Fork 14.9k
match: Use an aggregate equality comparison for constant array/slice patterns when possible #155216
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
c6676a2
05633c2
6049501
ba47e98
00fd1a1
85524ad
3da8657
5ba2ad7
e552045
a287bc9
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -4,14 +4,56 @@ use rustc_abi::FieldIdx; | |
| use rustc_middle::mir::*; | ||
| use rustc_middle::span_bug; | ||
| use rustc_middle::thir::*; | ||
| use rustc_middle::ty::{self, Ty, TypeVisitableExt}; | ||
| use rustc_middle::ty::{self, Ty, TyCtxt, TypeVisitableExt}; | ||
|
|
||
| use crate::builder::Builder; | ||
| use crate::builder::expr::as_place::{PlaceBase, PlaceBuilder}; | ||
| use crate::builder::matches::{ | ||
| FlatPat, MatchPairTree, PatConstKind, PatternExtraData, SliceLenOp, TestableCase, | ||
| }; | ||
|
|
||
| /// Below this length, an array or slice pattern is compared element by element | ||
| /// rather than as a single aggregate, since the per-element comparisons are | ||
| /// unlikely to be more expensive than a `PartialEq::eq` call. | ||
| const AGGREGATE_EQ_MIN_LEN: usize = 4; | ||
|
|
||
| /// Checks whether every pattern in `elements` is a `PatKind::Constant` and, | ||
| /// if so, reconstructs a single aggregate `ty::Value` that represents the whole | ||
| /// array or slice. Returns `None` when any element is not a constant or the | ||
| /// sequence is too short to benefit from an aggregate comparison. | ||
| fn try_reconstruct_aggregate_constant<'tcx>( | ||
| tcx: TyCtxt<'tcx>, | ||
| aggregate_ty: Ty<'tcx>, | ||
| elements: &[Pat<'tcx>], | ||
| ) -> Option<ty::Value<'tcx>> { | ||
| // Short arrays are not worth an aggregate comparison. | ||
| if elements.len() < AGGREGATE_EQ_MIN_LEN { | ||
| return None; | ||
| } | ||
| let branches = elements | ||
| .iter() | ||
| .map(|pat| { | ||
| if let PatKind::Constant { value } = pat.kind { | ||
| Some(ty::Const::new_value(tcx, value.valtree, value.ty)) | ||
| } else { | ||
| None | ||
| } | ||
|
Comment on lines
+36
to
+40
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It might also be worth reconstructing aggregate constants for arrays/slices of arrays of constants, etc.? I'm not a specialization expert, but it looks like arrays of bytewise-comparable things are also bytewise-comparable, at least for common array lengths1. Since array and slice equality are specialized based on their element types' bytewise-comparability, we should be able to get better codegen for nested array patterns too (as long as the inner arrays are of one of those common lengths), I think? Footnotes
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @dianne, interesting. I’ll look into this next! 🙂 |
||
| }) | ||
| .collect::<Option<Vec<_>>>()?; | ||
| let valtree = ty::ValTree::from_branches(tcx, branches); | ||
| Some(ty::Value { ty: aggregate_ty, valtree }) | ||
| } | ||
|
|
||
| impl<'a, 'tcx> Builder<'a, 'tcx> { | ||
| /// Check if we can use aggregate `PartialEq::eq` comparisons for constant array/slice patterns. | ||
| /// This is not possible in const contexts, because `PartialEq` is not const-stable yet. | ||
| fn can_use_aggregate_eq(&self) -> bool { | ||
| let in_const_context = self.tcx.is_const_fn(self.def_id.to_def_id()) | ||
| || !self.tcx.hir_body_owner_kind(self.def_id).is_fn_or_closure(); | ||
| !in_const_context | ||
| } | ||
|
jakubadamw marked this conversation as resolved.
|
||
| } | ||
|
|
||
| /// For an array or slice pattern's subpatterns (prefix/slice/suffix), returns a list | ||
| /// of those subpatterns, each paired with a suitably-projected [`PlaceBuilder`]. | ||
| fn prefix_slice_suffix<'a, 'tcx>( | ||
|
|
@@ -220,10 +262,36 @@ impl<'tcx> MatchPairTree<'tcx> { | |
| _ => None, | ||
| }; | ||
| if let Some(array_len) = array_len { | ||
| for (subplace, subpat) in | ||
| prefix_slice_suffix(&place_builder, Some(array_len), prefix, slice, suffix) | ||
| // When all elements are constants and there is no `..` | ||
| // subpattern, compare the whole array at once via | ||
| // `PartialEq::eq` rather than element by element. | ||
| if slice.is_none() | ||
| && suffix.is_empty() | ||
| && cx.can_use_aggregate_eq() | ||
| && let Some(aggregate_value) = | ||
| try_reconstruct_aggregate_constant(cx.tcx, pattern.ty, prefix) | ||
| { | ||
| MatchPairTree::for_pattern(subplace, subpat, cx, &mut subpairs, extra_data); | ||
| Some(TestableCase::Constant { | ||
| value: aggregate_value, | ||
| kind: PatConstKind::Aggregate, | ||
| }) | ||
| } else { | ||
| for (subplace, subpat) in prefix_slice_suffix( | ||
| &place_builder, | ||
| Some(array_len), | ||
| prefix, | ||
| slice, | ||
| suffix, | ||
| ) { | ||
| MatchPairTree::for_pattern( | ||
| subplace, | ||
| subpat, | ||
| cx, | ||
| &mut subpairs, | ||
| extra_data, | ||
| ); | ||
| } | ||
| None | ||
| } | ||
| } else { | ||
| // If the array length couldn't be determined, ignore the | ||
|
|
@@ -235,33 +303,57 @@ impl<'tcx> MatchPairTree<'tcx> { | |
| pattern.ty | ||
| ), | ||
| ); | ||
| None | ||
| } | ||
|
|
||
| None | ||
| } | ||
| PatKind::Slice { ref prefix, ref slice, ref suffix } => { | ||
| for (subplace, subpat) in | ||
| prefix_slice_suffix(&place_builder, None, prefix, slice, suffix) | ||
| // When there is no `..`, all elements are constants, and | ||
| // there are at least two of them, collapse the individual | ||
| // element subpairs into a single aggregate comparison that | ||
| // is performed after the length check. | ||
| if slice.is_none() | ||
|
Comment on lines
+310
to
+314
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Edit: assuming prefixes and suffixes are typically small and hand-written, it's probably not worth the trouble to use aggregate equality for them.
Edit: after |
||
| && suffix.is_empty() | ||
| && cx.can_use_aggregate_eq() | ||
| && let Some(aggregate_value) = | ||
| try_reconstruct_aggregate_constant(cx.tcx, pattern.ty, prefix) | ||
| { | ||
| MatchPairTree::for_pattern(subplace, subpat, cx, &mut subpairs, extra_data); | ||
| } | ||
|
|
||
| if prefix.is_empty() && slice.is_some() && suffix.is_empty() { | ||
| // This pattern is shaped like `[..]`. It can match a slice | ||
| // of any length, so no length test is needed. | ||
| None | ||
| } else { | ||
| // Any other shape of slice pattern requires a length test. | ||
| // Slice patterns with a `..` subpattern require a minimum | ||
| // length; those without `..` require an exact length. | ||
| Some(TestableCase::Slice { | ||
| len: u64::try_from(prefix.len() + suffix.len()).unwrap(), | ||
| op: if slice.is_some() { | ||
| SliceLenOp::GreaterOrEqual | ||
| } else { | ||
| SliceLenOp::Equal | ||
| subpairs.push(MatchPairTree { | ||
| place, | ||
| testable_case: TestableCase::Constant { | ||
| value: aggregate_value, | ||
| kind: PatConstKind::Aggregate, | ||
| }, | ||
| subpairs: Vec::new(), | ||
| pattern_span: pattern.span, | ||
| }); | ||
| Some(TestableCase::Slice { | ||
| len: u64::try_from(prefix.len()).unwrap(), | ||
| op: SliceLenOp::Equal, | ||
| }) | ||
| } else { | ||
| for (subplace, subpat) in | ||
| prefix_slice_suffix(&place_builder, None, prefix, slice, suffix) | ||
| { | ||
| MatchPairTree::for_pattern(subplace, subpat, cx, &mut subpairs, extra_data); | ||
| } | ||
|
|
||
| if prefix.is_empty() && slice.is_some() && suffix.is_empty() { | ||
| // This pattern is shaped like `[..]`. It can match | ||
| // a slice of any length, so no length test is needed. | ||
| None | ||
| } else { | ||
| // Any other shape of slice pattern requires a length test. | ||
| // Slice patterns with a `..` subpattern require a minimum | ||
| // length; those without `..` require an exact length. | ||
| Some(TestableCase::Slice { | ||
| len: u64::try_from(prefix.len() + suffix.len()).unwrap(), | ||
| op: if slice.is_some() { | ||
| SliceLenOp::GreaterOrEqual | ||
| } else { | ||
| SliceLenOp::Equal | ||
| }, | ||
| }) | ||
| } | ||
| } | ||
| } | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,51 @@ | ||
| // MIR for `array_match` after built | ||
|
|
||
| fn array_match(_1: [u8; 4]) -> bool { | ||
| debug x => _1; | ||
| let mut _0: bool; | ||
| let mut _2: &[u8; 4]; | ||
| let mut _3: bool; | ||
| scope 1 { | ||
| } | ||
|
|
||
| bb0: { | ||
| PlaceMention(_1); | ||
| _2 = &_1; | ||
| _3 = <[u8; 4] as PartialEq>::eq(copy _2, const &*b"\x01\x02\x03\x04") -> [return: bb4, unwind: bb8]; | ||
| } | ||
|
|
||
| bb1: { | ||
| _0 = const false; | ||
| goto -> bb7; | ||
| } | ||
|
|
||
| bb2: { | ||
| falseEdge -> [real: bb6, imaginary: bb1]; | ||
| } | ||
|
|
||
| bb3: { | ||
| goto -> bb1; | ||
| } | ||
|
|
||
| bb4: { | ||
| switchInt(move _3) -> [0: bb1, otherwise: bb2]; | ||
| } | ||
|
|
||
| bb5: { | ||
| FakeRead(ForMatchedPlace(None), _1); | ||
| unreachable; | ||
| } | ||
|
|
||
| bb6: { | ||
| _0 = const true; | ||
| goto -> bb7; | ||
| } | ||
|
|
||
| bb7: { | ||
| return; | ||
| } | ||
|
|
||
| bb8 (cleanup): { | ||
| resume; | ||
| } | ||
| } |
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The general approach feels unfortunate: we transformed constants into patterns in
const_to_pat, and now we try to reverse that transformation. Have we tried keeping the original constant around in the output ofconst_to_patand using it at runtime? Tho this has the same issue of const-dependent MIR lowering that @dianne pointed out.View changes since the review
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we do that, maybe we could also synthesize a constant during THIR building for hand-written array/slice patterns, like this PR currently does in MIR building? That way, we wouldn't end up worse codegen for hand-written array patterns than for const array items used as patterns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dunno, small handwritten arrays behave pretty much like tuples, it could be better codegen sometimes not to make them into constants:
It's admittedly a stretch but I'm tempted to err on the side of respecting user intent especially before the MIR boundary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, right. Maybe we should have a test for that in
mir-opt/building/match/sort_candidates.rsor such if we start optimizing array/slice comparisons? I think as-is this PR may also turn that into 8 sequential tests, each aTestKind::AggregateEqfor a different constant.Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Nadrieril, as an alternative to being guided solely by user “intent”, would it be sensible to raise the threshold on the number of elements in an array pattern where we would use the aggregate equality? Right now it’s
> 2. Perhaps 4 would work better? I suppose a quantitative comparison with benchmarks could be of use here, but sadly I can’t commit to that with my present schedule.Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On further thought, I think this is an unsound optimization, for example:
This function is sound if patterns were guaranteed to match left-to-right (this isn't set in stone yet, but likely).
Here if you replace this with
if PartialEq::eq(&*x, &[0, 1])not only are you taking a reference which itself has opsem consequences (this may make some later foreign writes invalid), but also may be reading uninitialized data.I think this convinces me we shouldn't even attempt to do this. Preserving "user intent" actually means preserving correct semantics here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it'd be most appropriate to instead fix whatever changed to make LLVM stop doing this optimization? Looking at #110870 (comment), the match in the example function
fin that issue used to get simplified down enough in LLVM that the SLP vectorizer pass could turn the sequence of comparisons into a vector comparison in LLVM IR. I'm not familiar enough with LLVM to say what changed or how to fix it, though.The other place I could imagine doing something like this would be a MIR transform pass, but that seems iffy. I haven't followed the relevant opsem discussions super closely (other than reading rust-lang/unsafe-code-guidelines#346 just now) but it seems like even turning the safe match on a normal reference in
fin #110870 into a single comparison in MIR could introduce UB, judging from testing with CTFE and Miri (without-Zmiri-recursive-validation): those don't detect UB iffis given a reference to partially-initialized array if it short-circuits before reading anything uninit; of course, it's UB if the match becomes a call to a comparison intrinsic like araw_eqorcompare_bytes. Maybe a new comparison intrinsic or two with the right semantics could both work and give us more control over the LLVM IR we emit, but I'm guessing that's extreme overkill.