Stop emitting one-at-a-time byte ops when swapping byte arrays #134946
Labels
A-codegen
Area: Code generation
C-optimization
Category: An issue highlighting optimization opportunities or PRs implementing such
T-compiler
Relevant to the compiler team, which will review and decide on the PR/issue.
T-libs
Relevant to the library team, which will review and decide on the PR/issue.
Comments
bors
added a commit
to rust-lang-ci/rust
that referenced
this issue
Dec 31, 2024
Redo the swap code for better tail & padding handling

A couple of parts here

## Fixes rust-lang#134713

Actually using the type passed to `ptr::swap_nonoverlapping` for anything other than its size + align turns out to not work, so this redo goes back to *always* swapping via not-`!noundef` integers.

(Except in `const`, which keeps doing the same thing as before to preserve `@RalfJung's` fix from rust-lang#134689)

## Fixes rust-lang#134946

I'd previously moved the swapping to use auto-vectorization, but someone pointed out on Discord that the tail loop handling from that left a whole bunch of byte-by-byte swapping around. This PR goes back to a manual chunk, with at most logarithmic more instructions for the tail.

(There are other ways that could potentially handle the tail even better, but this seems to be pretty good, since it's how LLVM ends up lowering operations on types like `i56`.)

## Polymorphization

Since it's silly to have separate copies of swapping -- especially *untyped* swapping! -- for `u32`, `i32`, `f32`, `[u16; 2]`, etc, this sends everything to byte versions, but still mono'd by alignment. That should make it more ok that the code is somewhat more complex, since we only get 7 monomorphizations of the complicated bit.

(One day we'll be able to remove some of the hacks by being able to just call `foo::<{align_of::<T>()}>`, but since alignments are only powers of two, the manual dispatch out isn't a big deal.)
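To illustrate the "at most logarithmic more instructions for the tail" idea, here is a minimal sketch in safe Rust. It is not the code from the PR: the function name `swap_bytes_chunked` and the 8-byte `CHUNK` width are assumptions chosen for illustration. Whole chunks are swapped first, and the leftover bytes are then handled by one step per set bit of the remainder (4, 2, 1) rather than by a byte-at-a-time loop.

```rust
/// Swap two equal-length byte slices without a byte-at-a-time tail loop.
/// `swap_bytes_chunked` and `CHUNK` are illustrative names, not std APIs.
pub fn swap_bytes_chunked(a: &mut [u8], b: &mut [u8]) {
    assert_eq!(a.len(), b.len(), "slices must have equal length");
    const CHUNK: usize = 8; // assumed chunk width (must be a power of two)

    // Main body: whole CHUNK-sized blocks.
    let body = a.len() / CHUNK * CHUNK;
    for (x, y) in a[..body]
        .chunks_exact_mut(CHUNK)
        .zip(b[..body].chunks_exact_mut(CHUNK))
    {
        x.swap_with_slice(y);
    }

    // Tail: fewer than CHUNK bytes remain. Each set bit of `tail` selects one
    // power-of-two step, so at most log2(CHUNK) swaps run.
    let tail = a.len() - body;
    let mut pos = body;
    let mut step = CHUNK / 2;
    while step > 0 {
        if tail & step != 0 {
            a[pos..pos + step].swap_with_slice(&mut b[pos..pos + step]);
            pos += step;
        }
        step /= 2;
    }
}
```

For the `[u8; 44]` case from this issue, the sketch performs five 8-byte swaps followed by a single 4-byte swap, with no single-byte operations.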
matthiaskrgr
added a commit
to matthiaskrgr/rust
that referenced
this issue
Apr 9, 2025
Ensure `swap_nonoverlapping` is really always untyped This replaces rust-lang#134954, which was arguably overcomplicated. ## Fixes rust-lang#134713 Actually using the type passed to `ptr::swap_nonoverlapping` for anything other than its size + align turns out to not work, so this goes back to always erasing the types down to just bytes. (Except in `const`, which keeps doing the same thing as before to preserve `@RalfJung's` fix from rust-lang#134689) ## Fixes rust-lang#134946 I'd previously moved the swapping to use auto-vectorization *on bytes*, but someone pointed out on Discord that the tail loop handling from that left a whole bunch of byte-by-byte swapping around. This goes back to manual tail handling to avoid that, then still triggers auto-vectorization on pointer-width values. (So you'll see `<4 x i64>` on `x86-64-v3` for example.)
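As a rough sketch of what "erasing the types down to just bytes" can look like, the following hypothetical helper swaps two regions through `MaybeUninit` bytes, so padding or uninitialized bytes are moved without being interpreted as a `T`. The name `swap_untyped`, the scratch-buffer approach, and the word-sized chunking are assumptions for illustration, not the standard library's implementation, and nothing here is claimed about the code LLVM generates for it.

```rust
use std::mem::{size_of, MaybeUninit};
use std::ptr;

/// Hypothetical untyped swap of `count` values of `T`.
///
/// # Safety
/// `x` and `y` must each be valid for reads and writes of `count` values of
/// `T`, and the two regions must not overlap.
pub unsafe fn swap_untyped<T>(x: *mut T, y: *mut T, count: usize) {
    type Byte = MaybeUninit<u8>;
    type Word = MaybeUninit<usize>;
    const WORD: usize = size_of::<Word>();

    let bytes = size_of::<T>() * count;
    // From here on only the size matters: the pointers are plain byte pointers.
    let (x, y) = (x.cast::<Byte>(), y.cast::<Byte>());
    let mut scratch = Word::uninit();
    let tmp = scratch.as_mut_ptr().cast::<Byte>();

    // Swap `len <= WORD` bytes at offset `off` through the scratch buffer.
    // `copy_nonoverlapping` is an untyped copy, so uninitialized bytes are fine.
    let swap_at = |off: usize, len: usize| unsafe {
        ptr::copy_nonoverlapping(x.add(off), tmp, len);
        ptr::copy_nonoverlapping(y.add(off), x.add(off), len);
        ptr::copy_nonoverlapping(tmp, y.add(off), len);
    };

    // Pointer-width chunks first.
    let mut off = 0;
    while bytes - off >= WORD {
        swap_at(off, WORD);
        off += WORD;
    }

    // Manual tail: one step per set bit of the remainder, never byte-by-byte.
    let tail = bytes - off;
    let mut step = WORD / 2;
    while step > 0 {
        if tail & step != 0 {
            swap_at(off, step);
            off += step;
        }
        step /= 2;
    }
}
```

A caller would use it like `unsafe { swap_untyped(a.as_mut_ptr(), b.as_mut_ptr(), a.len()) }` on two non-overlapping arrays of the same length.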
bors
added a commit
to rust-lang-ci/rust
that referenced
this issue
Apr 9, 2025
Ensure `swap_nonoverlapping` is really always untyped (try-jobs: x86_64-gnu-distcheck)
Zalathar
added a commit
to Zalathar/rust
that referenced
this issue
Apr 10, 2025
Ensure `swap_nonoverlapping` is really always untyped
github-actions bot
pushed a commit
to model-checking/verify-rust-std
that referenced
this issue
Apr 19, 2025
Ensure `swap_nonoverlapping` is really always untyped
Mentioned on Discord https://discord.com/channels/273534239310479360/592856094527848449/1319367290902286367
Demo when swapping `[u8; 44]`: https://rust.godbolt.org/z/rznror9aG

Especially with opt-level 2, there's still
And in opt-level 3 it's