Skip to content

Major performance regression on n-queens #19687

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
reem opened this issue Dec 10, 2014 · 4 comments
Closed

Major performance regression on n-queens #19687

reem opened this issue Dec 10, 2014 · 4 comments
Labels
I-slow Issue: Problems and improvements with respect to performance of generated code.

Comments

@reem
Copy link
Contributor

reem commented Dec 10, 2014

The single-threaded implementation here: https://github.com/reem/rust-n-queens used to take 500k ns/iter for n=12, but now takes closer to 7,000k ns/iter.

I'm not really sure how to go about narrowing this issue.

@huonw
Copy link
Member

huonw commented Dec 10, 2014

Just double checking: are you building with optimisations?

@reem
Copy link
Contributor Author

reem commented Dec 10, 2014

Yup. Originally tried with cargo bench then compiled manually with
--opt-level 3 -C lto
On Tue, Dec 9, 2014 at 9:02 PM Huon Wilson [email protected] wrote:

Just double checking: are you building with optimisations?


Reply to this email directly or view it on GitHub
#19687 (comment).

@mahkoh
Copy link
Contributor

mahkoh commented Dec 10, 2014

The faster version doesn't actually bench anything:

    callq   _ZN15n_queens_helper20h5eeb7aeda9e3dcfbFaaE
    leaq    8(%rsp), %rcx
    leaq    16(%rsp), %rdx
    .align  16, 0x90
.LBB2_4:
    movq    %rax, 8(%rsp)
    #APP
    #NO_APP
    #APP
    #NO_APP
    decq    %rbx
    jne .LBB2_4

whereas the slower version

    .align  16, 0x90
.LBB1_4:
    xorl    %edi, %edi
    xorl    %esi, %esi
    xorl    %edx, %edx
    callq   _ZN15n_queens_helper20h7ab2ef9deb11893eFaaE
    movq    %rax, 8(%rsp)
    #APP
    #NO_APP
    #APP
    #NO_APP
    decq    %rbx
    jne .LBB1_4

Generated with the following code and an unknown rust version from approximately 11-8:

extern crate test;
use test::Bencher;

fn n_queens(n: i32) -> uint {
    return n_queens_helper((1 << n as uint) -1, 0, 0, 0);
}

fn n_queens_helper(all_ones: i32, left_diags: i32, columns: i32, right_diags: i32) -> uint {
    let mut solutions = 0;
    let mut valid_spots = !(left_diags | columns | right_diags) & all_ones;
    while valid_spots != 0 {
        let spot = -valid_spots & valid_spots;
        valid_spots = valid_spots ^ spot;
        solutions += n_queens_helper(
            all_ones,
            (left_diags | spot) << 1,
            (columns | spot),
            (right_diags | spot) >> 1);
    }
    return solutions + ((columns == all_ones) as uint)
}

//#[test]
//fn test_parallel_n_queens() {
//    n_queens_helper(0, 0, 0, 0);
//}

#[bench]
fn bench_n_queens(b: &mut Bencher) {
    b.iter(|| test::black_box(n_queens(12)));
}

This generates the slower version. Remove the // to generate the faster version.

Not a regression.

@kmcallister kmcallister added I-slow Issue: Problems and improvements with respect to performance of generated code. B-reproduce labels Jan 17, 2015
@emberian
Copy link
Member

@reem closing due to inactivity and @mahkoh's indication that this isn't a regression. Feel free to repro and reopen.

lnicola pushed a commit to lnicola/rust that referenced this issue May 20, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
I-slow Issue: Problems and improvements with respect to performance of generated code.
Projects
None yet
Development

No branches or pull requests

5 participants