-
Notifications
You must be signed in to change notification settings - Fork 13.5k
Missed optimization in math expression: max(min(a,b),max(a,b)) == max(a,b) #34955
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Also check this case: #include <algorithm>
int test(int a, int b)
{
return std::max(std::max(a,b), std::max(b,a));
} |
The examples here don't need -ffast-math because they're integer ops. This might be solved by canonicalizing min/max harder in instcombine, so we 'see' the common compare instructions: define i32 @max_of_minmax(i32 %a, i32 %b) {
%cmin = icmp slt i32 %b, %a
%cmax = icmp slt i32 %a, %b
%min = select i1 %cmin, i32 %b, i32 %a
%max = select i1 %cmax, i32 %b, i32 %a
%cmax2 = icmp slt i32 %min, %max
%max2 = select i1 %cmax2, i32 %min, i32 %max
ret i32 %max2
}
define i32 @max_of_maxmax(i32 %a, i32 %b) {
%cmax1 = icmp slt i32 %a, %b
%cmax2 = icmp slt i32 %b, %a
%max1 = select i1 %cmax1, i32 %b, i32 %a
%max2 = select i1 %cmax2, i32 %a, i32 %b
%cmax3 = icmp slt i32 %max2, %max1
%max3 = select i1 %cmax3, i32 %max2, i32 %max1
ret i32 %max3
} |
On 2nd thought, if we ignore the trailing select, this can be viewed as a failure of InstSimplify to fold the last icmp. We have "simplifyICmpWithMinMax" but it doesn't look for patterns like this where both operands are min and/or max. |
Oops - it actually does look for these kinds of patterns after checking other cases. It's just missing some potential matches like: But that won't solve the 1st example. We'd have to match that as a min/max of min/max operands. |
On 3rd thought, the real problem is that early-cse doesn't know that these are equivalent: Ie, it knows that binops and compares can commute operands and be equivalent, but because min/max are not IR instructions or intrinsics, it fails to see min/max in the same way. |
Related logic added to value tracking: |
Early-cse was improved for integer min/max here: ...but we still need to match this pattern in instcombine. define i32 @test(i32, i32) {
%3 = icmp slt i32 %1, %0
%4 = icmp slt i32 %0, %1
%5 = select i1 %3, i32 %1, i32 %0
%6 = select i1 %4, i32 %1, i32 %0
%7 = icmp slt i32 %5, %6
%8 = select i1 %7, i32 %6, i32 %5
ret i32 %8
} |
See this example:
I have changed here variables types to float and optimization failed too - clang trunk with '-O3 -ffast-math'generates this:
|
All integer min/max patterns should be optimized after: FP will have to check fast-math-flags to handle NaN and -0.0 properly. |
Current Codegen: https://godbolt.org/z/kVBRNY |
Update - we have FMF on phi with: ...but there was feedback that this may have unintended consequences, so posted for discussion on llvm-dev: No responses so far, but as suggested, I'm waiting to build on that until people have plenty of time to see that. Assuming it sticks, the next step would be to fix SimplifyCFG to propagate FMF from phi to select. |
That part is at least partly done: But this example is harder than I imagined: we have to propagate FMF through memory ops and/or function parameters because the min/max calls take references (pointers) as arguments. That means we don't start with a phi of FP values; it's a phi of pointers to FP values. |
mentioned in issue #34959 |
Extended Description
clang(trunk) with '--std=c++17 -O3 -march=native -ffast-math' flags for this code:
generates this assembly:
gcc(trunk) with '--std=c++17 -O3 -march=native -ffast-math':
Helpful link: https://github.com/gcc-mirror/gcc/blob/07b69d3f1cd3dd8ebb0af1fbff95914daee477d2/gcc/match.pd
The text was updated successfully, but these errors were encountered: