Missed bound check removal #41789

Open
leonardo-m opened this issue May 6, 2017 · 1 comment
Labels
A-codegen Area: Code generation A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-enhancement Category: An issue proposing an enhancement or a PR with one. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

leonardo-m commented May 6, 2017

A little test program:

#[inline(never)]
fn almost_product(nums: &[i32]) -> Vec<i32> {
    let mut result = vec![0; nums.len()];

    let mut prod = 1;
    for (i, &x) in nums.iter().enumerate() {
        result[i] = prod;
        prod *= x;
    }
    result
}

fn main() {
    let data = [1, 2, 3, 4, 11];
    println!("{:?}", almost_product(&data));
    let data = [1, 2, 3, 4, 11, 26, 54, 6];
    println!("{:?}", almost_product(&data));
}

Compiling it with:
nightly-x86_64-pc-windows-gnu - rustc 1.19.0-nightly (f4209651e 2017-05-05)

using:
rustc -C opt-level=3 --emit asm test1.rs

gives this asm for the loop:

_ZN5test114almost_product17he2636a2faa7b0543E:
...
.LBB4_9:
    cmpq    %r12, %rdi
    jae .LBB4_10
    movl    (%r15,%rdi,4), %eax
    movl    %esi, (%rcx,%rdi,4)
    incq    %rdi
    imull   %esi, %eax
    addq    $-4, %rbx
    movl    %eax, %esi
    jne .LBB4_9
...
.LBB4_10:
    leaq    panic_bounds_check_loc.5(%rip), %rcx
    movq    %rdi, %rdx
    movq    %r12, %r8
    callq   _ZN4core9panicking18panic_bounds_check17h1fe2f83f670bcee9E
    ud2

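The bounds check isn't removed, even though result is created with nums.len() elements, so i < result.len() holds on every iteration. To remove it I need to use get_unchecked_mut(), or code like this (with the same main function):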

#![feature(core_intrinsics)]
use std::intrinsics::assume;

#[inline(never)]
fn almost_product(nums: &[i32]) -> Vec<i32> {
    let mut result = vec![0; nums.len()];

    let mut prod = 1;
    for (i, &x) in nums.iter().enumerate() {
        unsafe { assume(i < result.len()); }
        result[i] = prod;
        prod *= x;
    }
    result
}

Now the loop compiles to clean asm, with no bounds check:

.LBB3_11:
    movl    %edx, %esi
    movl    (%r12,%rcx,4), %edx
    imull   %esi, %edx
    movl    %esi, (%rax,%rcx,4)
    incq    %rcx
    addq    $-4, %rdi
    cmpq    %rcx, %rbx
    jne .LBB3_11

If I instead use:
unsafe { *result.get_unchecked_mut(i) = prod; }

the loop also gets unrolled four times:

.LBB3_15:
	movl	(%rcx), %esi
	imull	%edx, %esi
	movl	%edx, -12(%rbx)
	movl	4(%rcx), %edx
	imull	%esi, %edx
	movl	%esi, -8(%rbx)
	movl	8(%rcx), %esi
	imull	%edx, %esi
	movl	%edx, -4(%rbx)
	movl	12(%rcx), %edx
	imull	%esi, %edx
	movl	%esi, (%rbx)
	addq	$16, %rbx
	addq	$16, %rcx
	cmpq	%rdi, %rcx
	jne	.LBB3_15
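
For comparison, here is a minimal sketch of a safe formulation that avoids indexing altogether, by zipping the output slots with the input elements. The name almost_product_zip is hypothetical and not part of the original report. Since no index expression remains, there is no bounds check to eliminate, but the generated asm for this variant has not been checked against the compiler version above, so treat it as a possible workaround rather than a confirmed fix.

```rust
// Hypothetical safe variant of almost_product (name is an assumption,
// not from the original report). Each output slot is paired with its
// input element, so the loop body contains no index expression.
#[inline(never)]
fn almost_product_zip(nums: &[i32]) -> Vec<i32> {
    let mut result = vec![0; nums.len()];

    let mut prod = 1;
    // zip stops at the shorter of the two iterators; both have
    // nums.len() elements here, so every slot is written.
    for (slot, &x) in result.iter_mut().zip(nums) {
        *slot = prod;
        prod *= x;
    }
    result
}

fn main() {
    let data = [1, 2, 3, 4, 11];
    println!("{:?}", almost_product_zip(&data));
    let data = [1, 2, 3, 4, 11, 26, 54, 6];
    println!("{:?}", almost_product_zip(&data));
}
```

This produces the same values as the indexed version, so it can be swapped into the test program to compare the emitted loop.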
@Mark-Simulacrum Mark-Simulacrum added the I-slow Issue: Problems and improvements with respect to performance of generated code. label Jun 22, 2017
@Mark-Simulacrum Mark-Simulacrum added the C-enhancement Category: An issue proposing an enhancement or a PR with one. label Jul 26, 2017
@jonas-schievink jonas-schievink added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Mar 17, 2020
@workingjubilee workingjubilee added the A-codegen Area: Code generation label Nov 29, 2023

workingjubilee (Member) commented

Something seems to still be a problem:
https://godbolt.org/z/jW6K8qYvE
