JIT: should be able to remove redundant bounds-checks even if array contents change #6518

JosephTremoulet · 2016-08-18T14:13:56Z

Consider this program:

using System.Runtime.CompilerServices;

namespace N
{
    public class C
    {
        int[] a;
        int x;

        [MethodImpl(MethodImplOptions.NoInlining)]
        void array_instance(int s, int d1, int d2)
        {
            a[d1] = a[s] + 1;
            a[d2] = a[s];
        }

        [MethodImpl(MethodImplOptions.NoInlining)]
        void field(int s, int d1, int d2)
        {
            x = a[s] + 1;
            a[d2] = a[s];
        }

        [MethodImpl(MethodImplOptions.NoInlining)]
        static void array_static(int[] a, int s, int d1, int d2)
        {
            a[d1] = a[s] + 1;
            a[d2] = a[s];
        }

        public static int Main(string[] args)
        {
            C c = new C();
            var a = new int[] { 1, 2, 3, 4, 5, 6, 7, 8 };
            c.a = a;
            C.array_static(a, 3, 5, 7);
            c.array_instance(5, 3, 7);
            c.field(5, 3, 7);
            return 100;
        }
    }
}

In array_instance, we currently generate bounds-checks for both ldelems of a[s], even though we CSE the load of array a. The bounds-checks themselves are not CSE candidates (because they have type void, and even if that checked passed they're not in the list of CSE-candidate opcodes), so CSE doesn't remove them. If we could CSE the ldelems wholesale, we'd include the bounds-check in that CSE, as we do in method field, but we can't do that in array_instance because the store to a[d1] might overwrite a[s]. Assertion prop doesn't help either, because the to loads of array a have different conservative value numbers, and assertion prop uses the conservative numbers for fear of breaking safety by removing a redundant bounds-check but redundantly fetching the array (not that the fear is warranted in this case, as CSE removes the redundant load). By contrast, in method array_static, where the array a is an argument, the array portion of the two ldelems has the same conservative value number, so assertion prop removes the second a[s] bounds check even though it has to leave in the second a[s] load due to the possibility that the store to a[d1] overwrote a[s].

This same effect prevents hoisting invariant bounds-checks out of loops where the loop mutates the array. That happens in this code from benchmark JIT/Performance/CodeQuality/BenchI/BubbleSort2's inner loop:

            for (int j = i; j <= Bound; j++) {
                if (x[i] > x[j]) {
                    int temp = x[j];
                    x[j] = x[i];
                    x[i] = temp;
                }
            }

Here, x and i are invariant w.r.t. this loop, but x[i] gets stored to and so the loads of x[i] can't be hoisted out of the loop. The bounds check on the load x[i], however, should be hoistable. Loop hoisting doesn't hoist it because it's a CSE candidate, but even if it did, neither CSE nor assertion prop would remove the check in the loop (because of the same reasons as in array_instance above).

category:cq
theme:bounds-checks
skill-level:expert
cost:medium
impact:small

The text was updated successfully, but these errors were encountered:

russellhadley · 2016-08-19T20:03:23Z

@JosephTremoulet Reading this I'm thinking that we should instrument the compiler to not insert any bounds checks and see what the win is in the benchmarks, that would give us an upper bound on the cost bounds checks and let us prioritize this better.

briansull · 2016-08-22T17:29:42Z

We already have the ability to generate code that omit all of the array bounds checks:

set ComPlus_JitNoRngChks=1

#ifdef FEATURE_ENABLE_NO_RANGE_CHECKS
RETAIL_CONFIG_DWORD_INFO(PRIVATE_JitNoRangeChks, W("JitNoRngChks"), 0, "If 1, don't generate range checks")
#endif

JosephTremoulet · 2017-04-14T15:17:34Z

Currently we do remove the bounds check on the second a[s] in array_instance (due presumably to dotnet/coreclr#9451). However, we allocate three separate registers for array a, and we redundantly sign-extend s at both a[s] accesses, so the codegen still isn't as good as array_static. To wit, array_static currently looks like this:

G_M9659_IG01:
       4883EC28             sub      rsp, 40

G_M9659_IG02:
       8B4108               mov      eax, dword ptr [rcx+8]
       3BD0                 cmp      edx, eax
       732F                 jae      SHORT G_M9659_IG04
       4863D2               movsxd   rdx, edx
       448B549110           mov      r10d, dword ptr [rcx+4*rdx+16]
       41FFC2               inc      r10d
       443BC0               cmp      r8d, eax
       731F                 jae      SHORT G_M9659_IG04
       4D63C0               movsxd   r8, r8d
       4689548110           mov      dword ptr [rcx+4*r8+16], r10d
       8B549110             mov      edx, dword ptr [rcx+4*rdx+16]
       443BC8               cmp      r9d, eax
       730E                 jae      SHORT G_M9659_IG04
       4963C1               movsxd   rax, r9d
       89548110             mov      dword ptr [rcx+4*rax+16], edx

G_M9659_IG03:
       4883C428             add      rsp, 40
       C3                   ret

G_M9659_IG04:
       E8D376265F           call     CORINFO_HELP_RNGCHKFAIL
       CC                   int3

whilst array_instance currently looks like this:

G_M18826_IG01:
       56                   push     rsi
       4883EC20             sub      rsp, 32

G_M18826_IG02:
       488B4108             mov      rax, gword ptr [rcx+8]
       488BC8               mov      rcx, rax
       4C8BD0               mov      r10, rax
       458B5A08             mov      r11d, dword ptr [r10+8]
       413BD3               cmp      edx, r11d
       7336                 jae      SHORT G_M18826_IG04
       4863F2               movsxd   rsi, edx
       458B54B210           mov      r10d, dword ptr [r10+4*rsi+16]
       41FFC2               inc      r10d
       453BC3               cmp      r8d, r11d
       7326                 jae      SHORT G_M18826_IG04
       4D63C0               movsxd   r8, r8d
       4689548110           mov      dword ptr [rcx+4*r8+16], r10d
       488BC8               mov      rcx, rax
       4863D2               movsxd   rdx, edx
       8B449010             mov      eax, dword ptr [rax+4*rdx+16]
       453BCB               cmp      r9d, r11d
       730F                 jae      SHORT G_M18826_IG04
       4963D1               movsxd   rdx, r9d
       89449110             mov      dword ptr [rcx+4*rdx+16], eax

G_M18826_IG03:
       4883C420             add      rsp, 32
       5E                   pop      rsi
       C3                   ret

G_M18826_IG04:
       E85F76275F           call     CORINFO_HELP_RNGCHKFAIL
       CC                   int3

mikedn · 2017-04-14T15:55:18Z

and we redundantly sign-extend s at both a[s] accesses

Funny, I've just ran into something that looks like the opposite. Dictionary's FindEntry generates something like this:

G_M16822_IG05:
       mov      rcx, gword ptr [rsi+16]
; range check
       cmp      r15d, dword ptr [rcx+8]
       jae      G_M16822_IG13
; partial CSE def (r12) of array element address 
       movsxd   rdx, r15d
       lea      r12, [rdx+2*rdx]
; entry hashCode load
       cmp      dword ptr [rcx+8*r12+32], ebx
       jne      SHORT G_M16822_IG09
       mov      r13, gword ptr [rsi+24]
       mov      rcx, gword ptr [rsi+16]
; range check again!
       cmp      r15d, dword ptr [rcx+8]
       jae      G_M16822_IG13
; CSE use (r12)
       mov      rax, gword ptr [rcx+8*r12+16]

Not sure if it's related, in FindEntry the array doesn't change between the 2 range checks.

In your example AFAIR the reason why the sign extend cast isn't CSEd is that it's not considered worthwhile.

msftgits transferred this issue from dotnet/coreclr Jan 31, 2020

msftgits added this to the Future milestone Jan 31, 2020

BruceForstall added the JitUntriaged CLR JIT issues needing additional triage label Oct 28, 2020

BruceForstall removed the JitUntriaged CLR JIT issues needing additional triage label Jan 24, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

JIT: should be able to remove redundant bounds-checks even if array contents change #6518

JIT: should be able to remove redundant bounds-checks even if array contents change #6518

JosephTremoulet commented Aug 18, 2016 •

edited by BruceForstall

Loading

russellhadley commented Aug 19, 2016

Uh oh!

briansull commented Aug 22, 2016 •

edited

Loading

Uh oh!

JosephTremoulet commented Apr 14, 2017

Uh oh!

mikedn commented Apr 14, 2017

Uh oh!

JIT: should be able to remove redundant bounds-checks even if array contents change #6518

JIT: should be able to remove redundant bounds-checks even if array contents change #6518

Comments

JosephTremoulet commented Aug 18, 2016 • edited by BruceForstall Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

russellhadley commented Aug 19, 2016

Uh oh!

briansull commented Aug 22, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

JosephTremoulet commented Apr 14, 2017

Uh oh!

mikedn commented Apr 14, 2017

Uh oh!

JosephTremoulet commented Aug 18, 2016 •

edited by BruceForstall

Loading

briansull commented Aug 22, 2016 •

edited

Loading