slice::contains runtime doubles when an unrelated binary_search benchmark is present #150450

@Gourab-Ghosh

Summary

On my machine, the runtime of slice::contains in a tight loop roughly doubles (~2×) depending on whether an additional, unrelated benchmark using slice::binary_search is present in the same binary. The contains benchmark runs before the binary_search benchmark, yet it still becomes much slower when the binary_search benchmark is compiled into the same program and executed after it.

This looks like a codegen/layout/inlining interaction (or similar), not a source-level change to the contains call.
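One way to probe this hypothesis, sketched here as an assumption rather than a confirmed diagnosis, is to move each timed loop into its own function marked #[inline(never)]. That attribute is only a hint to the compiler, but when honored it gives each loop its own symbol, which makes the generated assembly easier to compare between the two builds. The names time_contains and time_binary_search are hypothetical helpers, and the iteration count is reduced for a quick run.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

#[allow(dead_code)] // King is never constructed in this sketch
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord, Clone, Copy)]
enum PieceType {
    Pawn,
    Knight,
    Bishop,
    Rook,
    Queen,
    King,
}
use PieceType::*;

// Hypothetical helper: the `contains` loop from the repro, kept in a
// separate function that the compiler is asked not to inline.
#[inline(never)]
fn time_contains(iterations: usize) -> Duration {
    let now = Instant::now();
    for _ in 0..iterations {
        black_box(const { [Knight, Bishop, Rook, Queen] }.contains(&Pawn));
    }
    now.elapsed()
}

// Hypothetical helper: the `binary_search` loop from the repro,
// isolated in the same way.
#[inline(never)]
fn time_binary_search(iterations: usize) -> Duration {
    let now = Instant::now();
    for _ in 0..iterations {
        black_box(
            const { [Knight, Bishop, Rook, Queen] }
                .binary_search(&Pawn)
                .is_ok(),
        );
    }
    now.elapsed()
}

fn main() {
    // Much smaller than the 25_000_000_000 used in the repro, for a quick run.
    let iterations: usize = 1_000_000;
    println!("contains:      {:?}", time_contains(iterations));
    println!("binary_search: {:?}", time_binary_search(iterations));
}
```

If the slowdown disappears with this layout, that would be consistent with an inlining or code-placement effect inside main, though it would not prove it.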


Reproduction

Command

cargo run --release

Repro code (src/main.rs)

#![allow(unused)]

use PieceType::*;
use std::hint::black_box;
use std::time::Instant;

#[derive(Debug, PartialEq, Eq, PartialOrd, Ord, Clone, Copy)]
enum PieceType {
    Pawn,
    Knight,
    Bishop,
    Rook,
    Queen,
    King,
}

macro_rules! benchmark {
    ($code:block, $num_iterations:expr $(,)? ) => {{
        let now = Instant::now();
        for _ in 0..$num_iterations {
            black_box($code);
        }
        println!("Time taken to run: {:?}", now.elapsed());
    }};
}

fn main() {
    let num_iterations: usize = 25_000_000_000;

    benchmark!(
        { matches!(Pawn, Knight | Bishop | Rook | Queen) },
        num_iterations,
    );
    benchmark!(
        { const { [Knight, Bishop, Rook, Queen] }.contains(&Pawn) },
        num_iterations,
    );
    benchmark!(
        {
            const { [Knight, Bishop, Rook, Queen] }
                .binary_search(&Pawn)
                .is_ok()
        },
        num_iterations,
    );
}

Observed output (with binary_search benchmark present)

Time taken to run: 4.700495676s
Time taken to run: 10.036953544s
Time taken to run: 4.750905003s

Control case

Comment out the last benchmark!( ... binary_search ... ) block (no other changes), then run the same command.

Observed output (without binary_search benchmark)

Time taken to run: 4.819309506s
Time taken to run: 4.75536901s

Expected

The second benchmark (const { [Knight, Bishop, Rook, Queen] }.contains(&Pawn)) should have comparable runtime regardless of whether a later, unrelated benchmark is present.
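As a sanity check on that expectation, the sketch below confirms that the three benchmarked expressions compute the same predicate for every variant. It relies on the array being sorted under the derived Ord, which follows declaration order, since binary_search requires a sorted slice. The names via_matches, via_contains, via_binary_search, ALL, and TARGETS are hypothetical and not part of the repro.

```rust
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord, Clone, Copy)]
enum PieceType {
    Pawn,
    Knight,
    Bishop,
    Rook,
    Queen,
    King,
}
use PieceType::*;

// Every variant, in declaration order.
const ALL: [PieceType; 6] = [Pawn, Knight, Bishop, Rook, Queen, King];
// The array used by the benchmarks; already sorted under the derived Ord.
const TARGETS: [PieceType; 4] = [Knight, Bishop, Rook, Queen];

fn via_matches(p: PieceType) -> bool {
    matches!(p, Knight | Bishop | Rook | Queen)
}

fn via_contains(p: PieceType) -> bool {
    TARGETS.contains(&p)
}

fn via_binary_search(p: PieceType) -> bool {
    TARGETS.binary_search(&p).is_ok()
}

fn main() {
    for &p in ALL.iter() {
        let expected = via_matches(p);
        // All three formulations must agree for every variant.
        assert_eq!(via_contains(p), expected);
        assert_eq!(via_binary_search(p), expected);
        println!("{:?}: {}", p, expected);
    }
}
```

Because the three formulations agree on every input, any timing difference between them comes from the generated code rather than from differing results.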

Actual

contains becomes ~2× slower (about 10.0 s instead of about 4.8 s for 25 billion iterations) only when the binary_search benchmark exists in the same program.


Environment

Hardware (Lenovo Legion Pro 7i Gen 10, 2025)

  • CPU: Intel Core Ultra 9 275HX
  • RAM: 64 GB DDR5-6400
  • GPU: NVIDIA GeForce RTX 5070 Laptop GPU

Software

  • OS: Garuda Linux (Arch-based), x86_64
  • Rust Version: rustc 1.92.0 (ded5c06 2025-12-08) (Arch Linux rust 1:1.92.0-1)
  • Build/run: cargo run --release

Labels

  • A-LLVM (Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.)
  • C-bug (Category: This is a bug.)
  • T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.)
  • needs-triage (This issue may need triage. Remove it if it has been sufficiently triaged.)