Reopen of Bug 49389 - Condition not constant folded #90417

tesuji · 2024-04-28T21:17:19Z

This old issue doesn't seem to be fixed as all: https://bugs.llvm.org/show_bug.cgi?id=49389
Bug 49389 would be reposted below (godbolt link: https://godbolt.org/z/hdza9Pb8z):

#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>

#define N 1234

bool eat_digits(uint8_t s[N]) {
    size_t i = 0;
    while (i < N) {
        uint8_t c = s[i];
        if ('0' <= c && c <= '9') {
            i += 1;
        } else {
            break;
        }
    }
    return i <= N;
}

compiles to:

eat_digits:                             # @eat_digits
        xor     ecx, ecx
.LBB0_1:                                # =>This Inner Loop Header: Depth=1
        mov     rax, rcx
        cmp     rcx, 1234
        je      .LBB0_3
        movzx   edx, byte ptr [rdi + rax]
        add     dl, -48
        lea     rcx, [rax + 1]
        cmp     dl, 10
        jb      .LBB0_1
.LBB0_3:
        cmp     rax, 1235
        setb    al
        ret

GCC is able to compile it to:

eat_digits:
        mov     eax, 1
        ret

This is originally from: rust-lang/rust#81432

v01dXYZ · 2024-04-29T16:02:26Z

It seems using return i <= N instead of break in the loop body "solves" the missing optimisation.

Nevertheless, I wonder which pass could be responsible of that kind of optimisation IndVarSimplify or ConstraintElimination or other ?

I looked at ConstraintElimination and it fails to analyse the following loop pattern:

[ while.cond ]   -->   [ while.body ]
         |                   |
         |                   v
         +---------------> [ while.end ]

but if we duplicate [while.end] into [ while.end.cond ] and [ while.end.body ] which make now [ while.cond ] to be a dominator of [ while.end.cond ] (and same for [ while.body ]/[ while.end.body ]) we get the following IR before ConstraintElimination:

define dso_local i1 @eat_digits(ptr nocapture noundef readonly %s) local_unnamed_addr #0 {
entry:
  br label %while.cond

while.cond:                                       ; preds = %while.body, %entry
  %i.0 = phi i64 [ 0, %entry ], [ %add, %while.body ]
  %exitcond.not = icmp eq i64 %i.0, 1234
  br i1 %exitcond.not, label %common.ret, label %while.body

while.body:                                       ; preds = %while.cond
  %arrayidx = getelementptr inbounds i8, ptr %s, i64 %i.0
  %0 = load i8, ptr %arrayidx, align 1
  %1 = add i8 %0, -48
  %or.cond = icmp ult i8 %1, 10
  %add = add nuw nsw i64 %i.0, 1
  br i1 %or.cond, label %while.cond, label %while.end.2

common.ret:                                       ; preds = %while.cond, %while.end.2
  %common.ret.op = phi i1 [ %cmp7, %while.end.2 ], [ true, %while.cond ]
  ret i1 %common.ret.op

while.end.2:                                      ; preds = %while.body
  %cmp7 = icmp ult i64 %i.0, 1235
  br label %common.ret
}

ConstraintElimination is able to infer that %cmp7 is true now. I suspect the following code is responsible of missing to propagate the condition information for this kind of loop pattern:

llvm-project/llvm/lib/Transforms/Scalar/ConstraintElimination.cpp

Lines 199 to 204 in 6f02120

    
             /// Returns true if we can add a known condition from BB to its successor 
        
             /// block Succ. 
        
             bool canAddSuccessor(BasicBlock &BB, BasicBlock *Succ) const { 
        
               return DT.dominates(BasicBlockEdge(&BB, Succ), Succ); 
        
             } 
        
           };

v01dXYZ · 2024-04-30T06:02:11Z

@fhahn Do you think this bug is related to ConstraintElimination ?

v01dXYZ · 2024-05-01T16:06:44Z

The faulty case manually fed to opt:

define dso_local i1 @eat_digits(ptr nocapture noundef readonly %s) local_unnamed_addr #0 {
entry:
  br label %while.cond

while.cond:                                       ; preds = %while.body, %entry
  %i.0 = phi i64 [ 0, %entry ], [ %add, %while.body ]
  %exitcond.not = icmp eq i64 %i.0, 1234
  br i1 %exitcond.not, label %while.end, label %while.body

while.body:                                       ; preds = %while.cond
  %arrayidx = getelementptr inbounds i8, ptr %s, i64 %i.0
  %0 = load i8, ptr %arrayidx, align 1
  %1 = add i8 %0, -48
  %or.cond = icmp ult i8 %1, 10
  %add = add nuw nsw i64 %i.0, 1
;  %add = add i64 %i.0, 1
  br i1 %or.cond, label %while.cond, label %while.end

while.end:                                        ; preds = %while.body, %while.cond
  %cmp6 = icmp ult i64 %i.0, 1235
  ret i1 %cmp6
}

$ opt faulty_manual.ll -passes="print<scalar-evolution>"

Printing analysis 'Scalar Evolution Analysis' for function 'eat_digits':
Classifying expressions for: @eat_digits
  %i.0 = phi i64 [ 0, %entry ], [ %add, %while.body ]
  -->  {0,+,1}<nuw><nsw><%while.cond> U: [0,1235) S: [0,1235)           Exits: <<Unknown>>              LoopDispositions: { %while.cond: Computable }
  %arrayidx = getelementptr inbounds i8, ptr %s, i64 %i.0
  -->  {%s,+,1}<nw><%while.cond> U: full-set S: full-set                Exits: <<Unknown>>              LoopDispositions: { %while.cond: Computable }
  %0 = load i8, ptr %arrayidx, align 1
  -->  %0 U: full-set S: full-set               Exits: <<Unknown>>              LoopDispositions: { %while.cond: Variant }
  %1 = add i8 %0, -48
  -->  (-48 + %0) U: full-set S: full-set               Exits: <<Unknown>>              LoopDispositions: { %while.cond: Variant }
  %add = add nuw nsw i64 %i.0, 1
  -->  {1,+,1}<nuw><nsw><%while.cond> U: [1,1236) S: [1,1236)           Exits: <<Unknown>>              LoopDispositions: { %while.cond: Computable }
Determining loop execution counts for: @eat_digits
Loop %while.cond: <multiple exits> Unpredictable backedge-taken count.
  exit count for while.cond: i64 1234
  exit count for while.body: ***COULDNOTCOMPUTE***
Loop %while.cond: constant max backedge-taken count is i64 1234
Loop %while.cond: symbolic max backedge-taken count is i64 1234
  symbolic max exit count for while.cond: i64 1234
  symbolic max exit count for while.body: ***COULDNOTCOMPUTE***

ScalarEvolution get everything we need, but it is not used.

When looking at the loop counter, if the min/max values of the SCEV ranges are relevant (not extremal), add this information as ConditionFacts for the Loop header. It allows icmp simplifications independently of the branch taken, such as using `i <= N` after a block `for(i=0; i<N; i+=1) { ... }`. Fixes llvm#90417

v01dXYZ · 2024-05-08T10:55:55Z

I was able to solve your problems for the case you presented: a loop with a constant bound. The case of a dynamic bound is not yet supported. GCC seems to optimise only the constant case too.

When looking at the loop counter, if the min/max values of the SCEV ranges are relevant (not extremal), add this information as ConditionFacts for the Loop header. It allows icmp simplifications independently of the branch taken, such as using `i <= N` after a block `for(i=0; i<N; i+=1) { ... }`. Fixes llvm#90417

When looking at the loop counter, if the min/max values of the SCEV ranges are relevant (not extremal), add this information as ConditionFacts for the Loop header. It allows icmp simplifications independently of the branch taken, such as using `i <= N` after a block `for(i=0; i<N; i+=1) { ... }`. Fixes llvm#90417 IMPORTANT: Concerning modifications to Polly One test for polly was modified as an icmp simplification could be done earlier in the pipeline than previously. The input of `polly::CodePreparationPass` is almost the same one, except the name of one block that is different because of this simplification done earlier.

Test cases inspired by #90417.

For loops, we can use the condition in the loop header as upper bound on the compared induction in the unique exit block, if it exists. This can be done even if there are multiple in-loop edges to the unique exit block, as any other exit may only exit earlier. More generally, we could add the OR of all exit conditions to the exit, but that's a possible future extension. Fixes llvm#90417.

For loops, we can use the condition in the loop header as upper bound on the compared induction in the unique exit block, if it exists. This can be done even if there are multiple in-loop edges to the unique exit block, as any other exit may only exit earlier. More generally, we could add the OR of all exit conditions to the exit, but that's a possible future extension. Fixes llvm#90417. !fixup properly handle non-monotonically increasing case.

For loops, we can use the condition in the loop header as upper bound on the compared induction in the unique exit block, if it exists. This can be done even if there are multiple in-loop edges to the unique exit block, as any other exit may only exit earlier. More generally, we could add the OR of all exit conditions to the exit, but that's a possible future extension. Fixes llvm#90417.

For loops, we can use the condition in the loop header as upper bound on the compared induction in the unique exit block, if it exists. This can be done even if there are multiple in-loop edges to the unique exit block, as any other exit may only exit earlier. More generally, we could add the OR of all exit conditions to the exit, but that's a possible future extension. Fixes llvm#90417. !fixup properly handle non-monotonically increasing case. !fixup always emit precondition, don't check unique predecessor

github-actions bot added the new issue label Apr 28, 2024

Sirraide added llvm:optimizations missed-optimization and removed new issue labels Apr 29, 2024

v01dXYZ mentioned this issue May 8, 2024

[ConstraintElimination] Use SCEV ranges information for Loop counter #91457

Closed

v01dXYZ mentioned this issue May 28, 2024

[SimplifyIndVar] Push more users to worklist for simplifyUsers #93598

Open

fhahn added a commit that referenced this issue Jun 6, 2024

[ConstraintElim] Add set of tests where a loop iv is used in exit.

41d7350

Test cases inspired by #90417.

fhahn mentioned this issue Jun 6, 2024

[ConstraintElim] Use cond from header as upper bound on IV in exit BB. #94610

Merged

tesuji mentioned this issue Jul 7, 2024

LLVM cannot prove that index is in bounds rust-lang/rust#81432

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Reopen of Bug 49389 - Condition not constant folded #90417

Reopen of Bug 49389 - Condition not constant folded #90417

tesuji commented Apr 28, 2024

v01dXYZ commented Apr 29, 2024

v01dXYZ commented Apr 30, 2024

v01dXYZ commented May 1, 2024

v01dXYZ commented May 8, 2024

Reopen of Bug 49389 - Condition not constant folded #90417

Reopen of Bug 49389 - Condition not constant folded #90417

Comments

tesuji commented Apr 28, 2024

v01dXYZ commented Apr 29, 2024

v01dXYZ commented Apr 30, 2024

v01dXYZ commented May 1, 2024

v01dXYZ commented May 8, 2024