Skip to content

Reopen of Bug 49389 - Condition not constant folded #90417

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
tesuji opened this issue Apr 28, 2024 · 4 comments
Open

Reopen of Bug 49389 - Condition not constant folded #90417

tesuji opened this issue Apr 28, 2024 · 4 comments

Comments

@tesuji
Copy link

tesuji commented Apr 28, 2024

This old issue doesn't seem to be fixed as all: https://bugs.llvm.org/show_bug.cgi?id=49389
Bug 49389 would be reposted below (godbolt link: https://godbolt.org/z/hdza9Pb8z):

#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>

#define N 1234

bool eat_digits(uint8_t s[N]) {
    size_t i = 0;
    while (i < N) {
        uint8_t c = s[i];
        if ('0' <= c && c <= '9') {
            i += 1;
        } else {
            break;
        }
    }
    return i <= N;
}

compiles to:

eat_digits:                             # @eat_digits
        xor     ecx, ecx
.LBB0_1:                                # =>This Inner Loop Header: Depth=1
        mov     rax, rcx
        cmp     rcx, 1234
        je      .LBB0_3
        movzx   edx, byte ptr [rdi + rax]
        add     dl, -48
        lea     rcx, [rax + 1]
        cmp     dl, 10
        jb      .LBB0_1
.LBB0_3:
        cmp     rax, 1235
        setb    al
        ret

GCC is able to compile it to:

eat_digits:
        mov     eax, 1
        ret

This is originally from: rust-lang/rust#81432

@v01dXYZ
Copy link
Contributor

v01dXYZ commented Apr 29, 2024

It seems using return i <= N instead of break in the loop body "solves" the missing optimisation.

Nevertheless, I wonder which pass could be responsible of that kind of optimisation IndVarSimplify or ConstraintElimination or other ?

I looked at ConstraintElimination and it fails to analyse the following loop pattern:

[ while.cond ]   -->   [ while.body ]
         |                   |
         |                   v
         +---------------> [ while.end ]

but if we duplicate [while.end] into [ while.end.cond ] and [ while.end.body ] which make now [ while.cond ] to be a dominator of [ while.end.cond ] (and same for [ while.body ]/[ while.end.body ]) we get the following IR before ConstraintElimination:

define dso_local i1 @eat_digits(ptr nocapture noundef readonly %s) local_unnamed_addr #0 {
entry:
  br label %while.cond

while.cond:                                       ; preds = %while.body, %entry
  %i.0 = phi i64 [ 0, %entry ], [ %add, %while.body ]
  %exitcond.not = icmp eq i64 %i.0, 1234
  br i1 %exitcond.not, label %common.ret, label %while.body

while.body:                                       ; preds = %while.cond
  %arrayidx = getelementptr inbounds i8, ptr %s, i64 %i.0
  %0 = load i8, ptr %arrayidx, align 1
  %1 = add i8 %0, -48
  %or.cond = icmp ult i8 %1, 10
  %add = add nuw nsw i64 %i.0, 1
  br i1 %or.cond, label %while.cond, label %while.end.2

common.ret:                                       ; preds = %while.cond, %while.end.2
  %common.ret.op = phi i1 [ %cmp7, %while.end.2 ], [ true, %while.cond ]
  ret i1 %common.ret.op

while.end.2:                                      ; preds = %while.body
  %cmp7 = icmp ult i64 %i.0, 1235
  br label %common.ret
}

ConstraintElimination is able to infer that %cmp7 is true now. I suspect the following code is responsible of missing to propagate the condition information for this kind of loop pattern:

/// Returns true if we can add a known condition from BB to its successor
/// block Succ.
bool canAddSuccessor(BasicBlock &BB, BasicBlock *Succ) const {
return DT.dominates(BasicBlockEdge(&BB, Succ), Succ);
}
};

@v01dXYZ
Copy link
Contributor

v01dXYZ commented Apr 30, 2024

@fhahn Do you think this bug is related to ConstraintElimination ?

@v01dXYZ
Copy link
Contributor

v01dXYZ commented May 1, 2024

The faulty case manually fed to opt:

define dso_local i1 @eat_digits(ptr nocapture noundef readonly %s) local_unnamed_addr #0 {
entry:
  br label %while.cond

while.cond:                                       ; preds = %while.body, %entry
  %i.0 = phi i64 [ 0, %entry ], [ %add, %while.body ]
  %exitcond.not = icmp eq i64 %i.0, 1234
  br i1 %exitcond.not, label %while.end, label %while.body

while.body:                                       ; preds = %while.cond
  %arrayidx = getelementptr inbounds i8, ptr %s, i64 %i.0
  %0 = load i8, ptr %arrayidx, align 1
  %1 = add i8 %0, -48
  %or.cond = icmp ult i8 %1, 10
  %add = add nuw nsw i64 %i.0, 1
;  %add = add i64 %i.0, 1
  br i1 %or.cond, label %while.cond, label %while.end

while.end:                                        ; preds = %while.body, %while.cond
  %cmp6 = icmp ult i64 %i.0, 1235
  ret i1 %cmp6
}
$ opt faulty_manual.ll -passes="print<scalar-evolution>"

Printing analysis 'Scalar Evolution Analysis' for function 'eat_digits':
Classifying expressions for: @eat_digits
  %i.0 = phi i64 [ 0, %entry ], [ %add, %while.body ]
  -->  {0,+,1}<nuw><nsw><%while.cond> U: [0,1235) S: [0,1235)           Exits: <<Unknown>>              LoopDispositions: { %while.cond: Computable }
  %arrayidx = getelementptr inbounds i8, ptr %s, i64 %i.0
  -->  {%s,+,1}<nw><%while.cond> U: full-set S: full-set                Exits: <<Unknown>>              LoopDispositions: { %while.cond: Computable }
  %0 = load i8, ptr %arrayidx, align 1
  -->  %0 U: full-set S: full-set               Exits: <<Unknown>>              LoopDispositions: { %while.cond: Variant }
  %1 = add i8 %0, -48
  -->  (-48 + %0) U: full-set S: full-set               Exits: <<Unknown>>              LoopDispositions: { %while.cond: Variant }
  %add = add nuw nsw i64 %i.0, 1
  -->  {1,+,1}<nuw><nsw><%while.cond> U: [1,1236) S: [1,1236)           Exits: <<Unknown>>              LoopDispositions: { %while.cond: Computable }
Determining loop execution counts for: @eat_digits
Loop %while.cond: <multiple exits> Unpredictable backedge-taken count.
  exit count for while.cond: i64 1234
  exit count for while.body: ***COULDNOTCOMPUTE***
Loop %while.cond: constant max backedge-taken count is i64 1234
Loop %while.cond: symbolic max backedge-taken count is i64 1234
  symbolic max exit count for while.cond: i64 1234
  symbolic max exit count for while.body: ***COULDNOTCOMPUTE***

ScalarEvolution get everything we need, but it is not used.

v01dXYZ pushed a commit to v01dXYZ/llvm-project that referenced this issue May 8, 2024
When looking at the loop counter, if the min/max values of the SCEV
ranges are relevant (not extremal), add this information as
ConditionFacts for the Loop header.

It allows icmp simplifications independently of the branch taken, such
as using `i <= N` after a block `for(i=0; i<N; i+=1) { ... }`.

Fixes llvm#90417
@v01dXYZ
Copy link
Contributor

v01dXYZ commented May 8, 2024

I was able to solve your problems for the case you presented: a loop with a constant bound. The case of a dynamic bound is not yet supported. GCC seems to optimise only the constant case too.

v01dXYZ pushed a commit to v01dXYZ/llvm-project that referenced this issue May 9, 2024
When looking at the loop counter, if the min/max values of the SCEV
ranges are relevant (not extremal), add this information as
ConditionFacts for the Loop header.

It allows icmp simplifications independently of the branch taken, such
as using `i <= N` after a block `for(i=0; i<N; i+=1) { ... }`.

Fixes llvm#90417
v01dXYZ pushed a commit to v01dXYZ/llvm-project that referenced this issue May 9, 2024
When looking at the loop counter, if the min/max values of the SCEV
ranges are relevant (not extremal), add this information as
ConditionFacts for the Loop header.

It allows icmp simplifications independently of the branch taken, such
as using `i <= N` after a block `for(i=0; i<N; i+=1) { ... }`.

Fixes llvm#90417

IMPORTANT: Concerning modifications to Polly

One test for polly was modified as an icmp simplification could be
done earlier in the pipeline than previously. The input of
`polly::CodePreparationPass` is almost the same one, except the name
of one block that is different because of this simplification done
earlier.
nikic pushed a commit to nikic/llvm-project that referenced this issue May 13, 2024
When looking at the loop counter, if the min/max values of the SCEV
ranges are relevant (not extremal), add this information as
ConditionFacts for the Loop header.

It allows icmp simplifications independently of the branch taken, such
as using `i <= N` after a block `for(i=0; i<N; i+=1) { ... }`.

Fixes llvm#90417

IMPORTANT: Concerning modifications to Polly

One test for polly was modified as an icmp simplification could be
done earlier in the pipeline than previously. The input of
`polly::CodePreparationPass` is almost the same one, except the name
of one block that is different because of this simplification done
earlier.
v01dXYZ pushed a commit to v01dXYZ/llvm-project that referenced this issue May 13, 2024
When looking at the loop counter, if the min/max values of the SCEV
ranges are relevant (not extremal), add this information as
ConditionFacts for the Loop header.

It allows icmp simplifications independently of the branch taken, such
as using `i <= N` after a block `for(i=0; i<N; i+=1) { ... }`.

Fixes llvm#90417

IMPORTANT: Concerning modifications to Polly

One test for polly was modified as an icmp simplification could be
done earlier in the pipeline than previously. The input of
`polly::CodePreparationPass` is almost the same one, except the name
of one block that is different because of this simplification done
earlier.
fhahn added a commit that referenced this issue Jun 6, 2024
fhahn added a commit to fhahn/llvm-project that referenced this issue Jun 6, 2024
For loops, we can use the condition in the loop header as upper bound on
the compared induction in the unique exit block, if it exists. This can
be done even if there are multiple in-loop edges to the unique exit
block, as any other exit may only exit earlier.

More generally, we could add the OR of all exit conditions to the exit,
but that's a possible future extension.

Fixes llvm#90417.
fhahn added a commit to fhahn/llvm-project that referenced this issue Jun 6, 2024
For loops, we can use the condition in the loop header as upper bound on
the compared induction in the unique exit block, if it exists. This can
be done even if there are multiple in-loop edges to the unique exit
block, as any other exit may only exit earlier.

More generally, we could add the OR of all exit conditions to the exit,
but that's a possible future extension.

Fixes llvm#90417.
nikic pushed a commit to nikic/llvm-project that referenced this issue Jun 6, 2024
For loops, we can use the condition in the loop header as upper bound on
the compared induction in the unique exit block, if it exists. This can
be done even if there are multiple in-loop edges to the unique exit
block, as any other exit may only exit earlier.

More generally, we could add the OR of all exit conditions to the exit,
but that's a possible future extension.

Fixes llvm#90417.

!fixup properly handle non-monotonically increasing case.
fhahn added a commit to fhahn/llvm-project that referenced this issue Jun 29, 2024
For loops, we can use the condition in the loop header as upper bound on
the compared induction in the unique exit block, if it exists. This can
be done even if there are multiple in-loop edges to the unique exit
block, as any other exit may only exit earlier.

More generally, we could add the OR of all exit conditions to the exit,
but that's a possible future extension.

Fixes llvm#90417.
fhahn added a commit to fhahn/llvm-project that referenced this issue Jul 1, 2024
For loops, we can use the condition in the loop header as upper bound on
the compared induction in the unique exit block, if it exists. This can
be done even if there are multiple in-loop edges to the unique exit
block, as any other exit may only exit earlier.

More generally, we could add the OR of all exit conditions to the exit,
but that's a possible future extension.

Fixes llvm#90417.

!fixup properly handle non-monotonically increasing case.

!fixup always emit precondition, don't check unique predecessor
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants