Skip to content

[ER] Array bound tests in a simple enough case #84314

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
leonardo-m opened this issue Apr 18, 2021 · 6 comments
Open

[ER] Array bound tests in a simple enough case #84314

leonardo-m opened this issue Apr 18, 2021 · 6 comments
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@leonardo-m
Copy link

leonardo-m commented Apr 18, 2021

This comes from reduction of code that implements a binary heap (sift up) on an array:

pub fn foo(data: &[u32]) -> u32 {
    let i = data.len();
    if i / 2 == 0 {
        return 0;
    }
    data[i - 1] + data[i / 2 - 1]
}

rustc (1.53.0-nightly 392ba2b 2021-04-17) seems unable to remove the array bound tests, despite I think the code should be panic-less:

foo:
        sub     rsp, 8
        mov     rdx, rsi
        shr     rdx
        je      .LBB0_6
        mov     rcx, rsi
        sub     rcx, 1
        jb      .LBB0_4
        lea     rax, [rdx - 1]
        cmp     rax, rsi
        jae     .LBB0_5
        mov     eax, dword ptr [rdi + 4*rdx - 4]
        add     eax, dword ptr [rdi + 4*rcx]
        pop     rcx
        ret
.LBB0_6:
        xor     eax, eax
        pop     rcx
        ret
.LBB0_4:
        lea     rdx, [rip + .L__unnamed_1]
        mov     rdi, rcx
        xor     esi, esi
        call    qword ptr [rip + core::panicking::panic_bounds_check@GOTPCREL]
        ud2
.LBB0_5:
        lea     rdx, [rip + .L__unnamed_2]
        mov     rdi, rax
        call    qword ptr [rip + core::panicking::panic_bounds_check@GOTPCREL]
        ud2

I expected it to compile to something closer to:

foo:
        mov     rax, rsi
        shr     rax
        je      .LBB0_1
        mov     eax, dword ptr [rdi + 4*rax - 4]
        add     eax, dword ptr [rdi + 4*rsi - 4]
        ret
.LBB0_1:
        xor     eax, eax
        ret
@nikic
Copy link
Contributor

nikic commented Apr 18, 2021

Godbolt: https://rust.godbolt.org/z/dKjnjradW

@nikic
Copy link
Contributor

nikic commented Apr 18, 2021

It looks like there's an unnecessary one-use check on this fold: https://llvm.godbolt.org/z/TEMqoEEWE

@nikic
Copy link
Contributor

nikic commented Apr 18, 2021

Looks like the one-use case goes through this fold: https://github.com/llvm/llvm-project/blob/f1aaa306ee6c7bf73dfd7b578a57ba8807bde539/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp#L2277-L2284

@nikic nikic added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. I-slow Issue: Problems and improvements with respect to performance of generated code. labels Apr 19, 2021
@nikic
Copy link
Contributor

nikic commented Apr 19, 2021

llvm/llvm-project@9423f78 eliminates one of the panics. The new IR is:

  %0 = icmp ult i64 %data.1, 2
  br i1 %0, label %bb6, label %bb3

bb3:                                              ; preds = %start
  %_4 = lshr i64 %data.1, 1
  %_12 = add nsw i64 %_4, -1
  %_16 = icmp ult i64 %_12, %data.1
  br i1 %_16, label %bb5, label %panic1, !prof !2

@nikic
Copy link
Contributor

nikic commented May 9, 2021

For the record, this change was reverted due to performance regressions on ARM. May be reapplied after those are addressed (relevant patches are https://reviews.llvm.org/D101196 and https://reviews.llvm.org/D101778).

@nikic
Copy link
Contributor

nikic commented Aug 28, 2021

The change has since been reapplied and is part of LLVM 13. We're now left with only one bounds check: https://rust.godbolt.org/z/bnqn4chGK

@Noratrieb Noratrieb added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Apr 5, 2023
@workingjubilee workingjubilee added the C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such label Feb 14, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

4 participants