Description
Summary
On my machine, the runtime of slice::contains in a tight loop changes drastically (~2×) depending on whether an additional, unrelated benchmark using slice::binary_search is present in the same binary. The contains benchmark runs before the binary_search benchmark, yet it still becomes much slower when that later (third) benchmark is compiled into the program.
This looks like a codegen/layout/inlining interaction (or similar), not a source-level change to the contains call.
Reproduction
Command
cargo run --release
Repro code (src/main.rs)
#![allow(unused)]
use PieceType::*;
use std::hint::black_box;
use std::time::Instant;

#[derive(Debug, PartialEq, Eq, PartialOrd, Ord, Clone, Copy)]
enum PieceType {
    Pawn,
    Knight,
    Bishop,
    Rook,
    Queen,
    King,
}

macro_rules! benchmark {
    ($code:block, $num_iterations:expr $(,)? ) => {{
        let now = Instant::now();
        for _ in 0..$num_iterations {
            black_box($code);
        }
        println!("Time taken to run: {:?}", now.elapsed());
    }};
}

fn main() {
    let num_iterations: usize = 25_000_000_000;
    benchmark!(
        { matches!(Pawn, Knight | Bishop | Rook | Queen) },
        num_iterations,
    );
    benchmark!(
        { const { [Knight, Bishop, Rook, Queen] }.contains(&Pawn) },
        num_iterations,
    );
    benchmark!(
        {
            const { [Knight, Bishop, Rook, Queen] }
                .binary_search(&Pawn)
                .is_ok()
        },
        num_iterations,
    );
}
Observed output (with binary_search benchmark present)
Time taken to run: 4.700495676s
Time taken to run: 10.036953544s
Time taken to run: 4.750905003s
Control case
Comment out the last benchmark!( ... binary_search ... ) block (no other changes), then run the same command.
Observed output (without binary_search benchmark)
Time taken to run: 4.819309506s
Time taken to run: 4.75536901s
Expected
The second benchmark (const { [Knight, Bishop, Rook, Queen] }.contains(&Pawn)) should have comparable runtime regardless of whether a later, unrelated benchmark is present.
Actual
contains becomes ~2× slower only when the binary_search benchmark exists in the same program.
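One possible way to narrow down the codegen/layout/inlining hypothesis is to move each benchmark body into its own `#[inline(never)]` function, so that the three loops are compiled as separate functions instead of being expanded side by side in `main`. The following is only a sketch under assumptions, not a confirmed diagnosis: the function names (`bench_matches`, `bench_contains`, `bench_binary_search`), the `CANDIDATES` const item (used here in place of the inline `const { ... }` block) and the reduced `ITERATIONS` value are all hypothetical and not part of the original repro. If the ~2× gap disappears in this form, that would hint at an interaction inside `main`; if it persists, code placement or alignment may be worth a look.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

#[allow(dead_code)] // King is never constructed in this sketch
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord, Clone, Copy)]
enum PieceType {
    Pawn,
    Knight,
    Bishop,
    Rook,
    Queen,
    King,
}
use PieceType::*;

// Assumption: a const item standing in for the inline `const { ... }` block.
const CANDIDATES: [PieceType; 4] = [Knight, Bishop, Rook, Queen];

// Assumption: far fewer iterations than the original 25_000_000_000,
// so the sketch finishes quickly; raise it for real measurements.
const ITERATIONS: u64 = 10_000_000;

// Each loop lives in its own non-inlined function so that adding or
// removing one benchmark does not change how the others are expanded in main.
#[inline(never)]
fn bench_matches(iterations: u64) -> (u64, Duration) {
    let start = Instant::now();
    let mut hits = 0u64;
    for _ in 0..iterations {
        hits += black_box(matches!(Pawn, Knight | Bishop | Rook | Queen)) as u64;
    }
    (hits, start.elapsed())
}

#[inline(never)]
fn bench_contains(iterations: u64) -> (u64, Duration) {
    let start = Instant::now();
    let mut hits = 0u64;
    for _ in 0..iterations {
        hits += black_box(CANDIDATES.contains(&Pawn)) as u64;
    }
    (hits, start.elapsed())
}

#[inline(never)]
fn bench_binary_search(iterations: u64) -> (u64, Duration) {
    let start = Instant::now();
    let mut hits = 0u64;
    for _ in 0..iterations {
        hits += black_box(CANDIDATES.binary_search(&Pawn).is_ok()) as u64;
    }
    (hits, start.elapsed())
}

fn main() {
    // Same order as the original repro: matches!, contains, binary_search.
    let (hits, elapsed) = bench_matches(ITERATIONS);
    println!("matches!:      {:?} (hits: {})", elapsed, hits);
    let (hits, elapsed) = bench_contains(ITERATIONS);
    println!("contains:      {:?} (hits: {})", elapsed, hits);
    let (hits, elapsed) = bench_binary_search(ITERATIONS);
    println!("binary_search: {:?} (hits: {})", elapsed, hits);
}
```

Running this with `cargo run --release`, once as written and once with the `bench_binary_search` call commented out, mirrors the control case above. The hit counts are returned only so that each loop has an observable result; `Pawn` is not in the array, so they should all be zero.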
Environment
Hardware (Lenovo Legion Pro 7i Gen 10, 2025)
- CPU: Intel Core Ultra 9 275HX
- RAM: 64 GB DDR5-6400
- GPU: NVIDIA GeForce RTX 5070 Laptop GPU
Software
- OS: Garuda Linux (Arch-based), x86_64
- Rust Version: rustc 1.92.0 (ded5c06 2025-12-08) (Arch Linux rust 1:1.92.0-1)
- Build/run:
cargo run --release