Missed optimization in math expression: max(min(a,b),max(a,b)) == max(a,b) #34955

zamazan4ik · 2017-12-10T16:47:16Z


Bugzilla Link	35607
Version	trunk
OS	All
Blocks	#34959
CC	@hfinkel,@RKSimon,@rotateright
Fixed by commit(s)	r369386

Extended Description

clang(trunk) with '--std=c++17 -O3 -march=native -ffast-math' flags for this code:

#include <algorithm>

int test(int a, int b) {
    return std::max(std::min(a,b), std::max(a,b));
}

generates this assembly:

test(int, int): # @test(int, int)
  cmp esi, edi
  mov eax, edi
  cmovle eax, esi
  cmp edi, esi
  cmovl edi, esi
  cmp eax, edi
  cmovge edi, eax
  mov eax, edi
  ret

gcc(trunk) with '--std=c++17 -O3 -march=native -ffast-math':

test(int, int):
        cmp     edi, esi
        mov     eax, edi
        cmovl   eax, esi
        ret

Helpful link: https://github.com/gcc-mirror/gcc/blob/07b69d3f1cd3dd8ebb0af1fbff95914daee477d2/gcc/match.pd

zamazan4ik · 2017-12-10T17:10:28Z

Also check this case:

#include <algorithm>

int test(int a, int b)
{
    return std::max(std::max(a,b), std::max(b,a));
}

rotateright · 2017-12-11T19:07:14Z

The examples here don't need -ffast-math because they're integer ops.

This might be solved by canonicalizing min/max harder in instcombine, so we 'see' the common compare instructions:

define i32 @max_of_minmax(i32 %a, i32 %b) {
  %cmin = icmp slt i32 %b, %a
  %cmax = icmp slt i32 %a, %b
  %min = select i1 %cmin, i32 %b, i32 %a
  %max = select i1 %cmax, i32 %b, i32 %a
  %cmax2 = icmp slt i32 %min, %max
  %max2 = select i1 %cmax2, i32 %min, i32 %max
  ret i32 %max2
}

define i32 @max_of_maxmax(i32 %a, i32 %b) {
  %cmax1 = icmp slt i32 %a, %b
  %cmax2 = icmp slt i32 %b, %a
  %max1 = select i1 %cmax1, i32 %b, i32 %a
  %max2 = select i1 %cmax2, i32 %a, i32 %b
  %cmax3 = icmp slt i32 %max2, %max1
  %max3 = select i1 %cmax3, i32 %max2, i32 %max1
  ret i32 %max3
}

rotateright · 2017-12-11T20:14:01Z

On 2nd thought, if we ignore the trailing select, this can be viewed as a failure of InstSimplify to fold the last icmp.

We have "simplifyICmpWithMinMax" but it doesn't look for patterns like this where both operands are min and/or max.

rotateright · 2017-12-11T20:29:07Z

We have "simplifyICmpWithMinMax" but it doesn't look for patterns like this
where both operands are min and/or max.

Oops - it actually does look for these kinds of patterns after checking other cases. It's just missing some potential matches like:
https://rise4fun.com/Alive/Mdm

But that won't solve the 1st example. We'd have to match that as a min/max of min/max operands.

rotateright · 2017-12-12T15:09:59Z

We have "simplifyICmpWithMinMax" but it doesn't look for patterns like this
where both operands are min and/or max.

Oops - it actually does look for these kinds of patterns after checking
other cases. It's just missing some potential matches like:
https://rise4fun.com/Alive/Mdm

On 3rd thought, the real problem is that early-cse doesn't know that these are equivalent:
std::max(a,b)
std::max(b,a)
(for integers at least)

Ie, it knows that binops and compares can commute operands and be equivalent, but because min/max are not IR instructions or intrinsics, it fails to see min/max in the same way.

rotateright · 2018-01-07T17:30:34Z

Related logic added to value tracking:
https://reviews.llvm.org/rL321672

rotateright · 2018-03-04T17:13:58Z

Early-cse was improved for integer min/max here:
https://reviews.llvm.org/rL320640

...but we still need to match this pattern in instcombine.

define i32 @test(i32, i32) {
  %3 = icmp slt i32 %1, %0
  %4 = icmp slt i32 %0, %1
  %5 = select i1 %3, i32 %1, i32 %0
  %6 = select i1 %4, i32 %1, i32 %0
  %7 = icmp slt i32 %5, %6
  %8 = select i1 %7, i32 %6, i32 %5
  ret i32 %8
}

zamazan4ik · 2018-08-25T08:13:59Z

See this example:

#include <algorithm>

int test(float a, float b)
{
    return std::max(std::min(a,b), std::max(a,b));
}

I have changed here variables types to float and optimization failed too - clang trunk with '-O3 -ffast-math'generates this:

test(float, float):                              # @test(float, float)
        movaps  xmm2, xmm1
        minss   xmm2, xmm0
        maxss   xmm1, xmm0
        maxss   xmm1, xmm2
        cvttss2si       eax, xmm1
        ret

rotateright · 2019-08-20T15:30:42Z

All integer min/max patterns should be optimized after:
https://reviews.llvm.org/rL369386

FP will have to check fast-math-flags to handle NaN and -0.0 properly.
If so, we need to make sure that our FMF propagation is working as expected. In particular, we may need to extend FMF to phi nodes of FP values, so they get applied to a 'select' when we run -simplifycfg.

RKSimon · 2019-08-20T16:57:23Z

Current Codegen: https://godbolt.org/z/kVBRNY

rotateright · 2019-10-09T19:05:52Z

Update - we have FMF on phi with:
https://reviews.llvm.org/D67564

...but there was feedback that this may have unintended consequences, so posted for discussion on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2019-September/135444.html

No responses so far, but as suggested, I'm waiting to build on that until people have plenty of time to see that.

Assuming it sticks, the next step would be to fix SimplifyCFG to propagate FMF from phi to select.

rotateright · 2019-11-17T22:40:30Z

Assuming it sticks, the next step would be to fix SimplifyCFG to propagate
FMF from phi to select.

That part is at least partly done:
https://reviews.llvm.org/rGebf9bf2cbc8f

But this example is harder than I imagined: we have to propagate FMF through memory ops and/or function parameters because the min/max calls take references (pointers) as arguments. That means we don't start with a phi of FP values; it's a phi of pointers to FP values.

RKSimon · 2021-11-27T00:04:52Z

mentioned in issue #34959

RKSimon mentioned this issue Dec 10, 2017

[meta] Missed fast-math optimizations #34959

Open

llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 10, 2021

RKSimon mentioned this issue Feb 17, 2025

[LoopVectorize] fast-math-flags on fcmp are not propagated correctly #34886

Open

RKSimon added llvm:instcombine floating-point Floating-point math labels Feb 17, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Missed optimization in math expression: max(min(a,b),max(a,b)) == max(a,b) #34955

Missed optimization in math expression: max(min(a,b),max(a,b)) == max(a,b) #34955

zamazan4ik commented Dec 10, 2017 •

edited by RKSimon

Loading

zamazan4ik commented Dec 10, 2017 •

edited by RKSimon

Loading

rotateright commented Dec 11, 2017 •

edited by RKSimon

Loading

rotateright commented Dec 11, 2017

rotateright commented Dec 11, 2017

rotateright commented Dec 12, 2017

rotateright commented Jan 7, 2018

rotateright commented Mar 4, 2018 •

edited by RKSimon

Loading

zamazan4ik commented Aug 25, 2018 •

edited by RKSimon

Loading

rotateright commented Aug 20, 2019

RKSimon commented Aug 20, 2019

rotateright commented Oct 9, 2019

rotateright commented Nov 17, 2019

RKSimon commented Nov 27, 2021

Missed optimization in math expression: max(min(a,b),max(a,b)) == max(a,b) #34955

Missed optimization in math expression: max(min(a,b),max(a,b)) == max(a,b) #34955

Comments

zamazan4ik commented Dec 10, 2017 • edited by RKSimon Loading

Extended Description

zamazan4ik commented Dec 10, 2017 • edited by RKSimon Loading

rotateright commented Dec 11, 2017 • edited by RKSimon Loading

rotateright commented Dec 11, 2017

rotateright commented Dec 11, 2017

rotateright commented Dec 12, 2017

rotateright commented Jan 7, 2018

rotateright commented Mar 4, 2018 • edited by RKSimon Loading

zamazan4ik commented Aug 25, 2018 • edited by RKSimon Loading

rotateright commented Aug 20, 2019

RKSimon commented Aug 20, 2019

rotateright commented Oct 9, 2019

rotateright commented Nov 17, 2019

RKSimon commented Nov 27, 2021

zamazan4ik commented Dec 10, 2017 •

edited by RKSimon

Loading

zamazan4ik commented Dec 10, 2017 •

edited by RKSimon

Loading

rotateright commented Dec 11, 2017 •

edited by RKSimon

Loading

rotateright commented Mar 4, 2018 •

edited by RKSimon

Loading

zamazan4ik commented Aug 25, 2018 •

edited by RKSimon

Loading