known_identity has wrong values with -ffast-math #13813

rafbiels · 2024-05-16T16:10:27Z

Describe the bug

The initial value for sycl::minimum<float> is set to inf which gets turned into 0 with -ffast-math. More generally, both min and max are affected for any type which has_infinity.

The issue is here:

llvm/sycl/include/sycl/known_identity.hpp

Lines 262 to 271 in 7271d61

    
           /// Returns maximal possible value as identity for MIN operations. 
        
           template <typename BinaryOperation, typename AccumulatorT> 
        
           struct known_identity_impl<BinaryOperation, AccumulatorT, 
        
                                      std::enable_if_t<IsMinimumIdentityOp< 
        
                                          AccumulatorT, BinaryOperation>::value>> { 
        
             static constexpr AccumulatorT value = static_cast<AccumulatorT>( 
        
                 std::numeric_limits<AccumulatorT>::has_infinity 
        
                     ? std::numeric_limits<AccumulatorT>::infinity() 
        
                     : (std::numeric_limits<AccumulatorT>::max)()); 
        
           };

std::numeric_limits<T>::infinity() should not be used here when -fno-honor-infinities is enabled (e.g. by -ffast-math). This causes the value to be evaluated as zero when used (at least in some cases) and subsequently leads to incorrect results of a program. Note that std::numeric_limits<float>::has_infinity evaluates to true regardless of the honor-infinities option.

This started failing in the nightly-2024-02-16 tag. The code in question hasn't changed in months, but it seems that the behaviour of -ffast-math has changed, possibly due to 73159a9

So it seems the code has been problematic earlier, but it was hidden by the -fno-honor-infinities not being set in this context.

I think there could be three solutions:

Just use std::numeric_limits<T>::max() in all cases. What's the benefit of using infinity here?
Add a second condition checking whether -fno-honor-infinities is used.
Make sure std::numeric_limits<T>::has_infinity evaluates to false when -fno-honor-infinities is enabled.

To reproduce

This is a minimal example of sycl::joint_reduce computing the minimum for a vector of 4 positive floats:

#include <sycl/sycl.hpp>
#include <cstdio>

int main() {
    sycl::queue q{};
    std::vector<float> data{4.0f, 1.0f, 3.0f, 2.0f};
    float result{data[0]};
    {
        sycl::buffer<float> inBuf{data};
        sycl::buffer<float> outBuf{&result, 1};
        q.submit([&inBuf, &outBuf](sycl::handler& cgh){
            sycl::accessor inAcc{inBuf, cgh, sycl::read_only};
            sycl::accessor outAcc{outBuf, cgh, sycl::write_only};
            cgh.parallel_for(sycl::nd_range<1>{1,1}, [inAcc, outAcc](sycl::nd_item<1> item){
                const float* start{&inAcc[0]};
                const float* end{start + inAcc.size()};
                outAcc[0] = sycl::joint_reduce(item.get_group(), start, end, sycl::minimum<>{});
            });
        });
    }
    printf("Result: %f\n", result);
    return 0;
}

When compiled with -O3 -ffast-math this prints 0.0 instead of the expected 1.0:

$ clang++ -fsycl -o test test.cpp -O3 -ffast-math
./test
Result: 0.000000

It can be worked around with -fhonor-infinities:

$ clang++ -fsycl -o test test.cpp -O3 -ffast-math -fhonor-infinities
./test
Result: 1.000000

Note this example doesn't directly use sycl::known_identity. It presents that the issue breaks a high-level feature. I didn't manage to create a small enough reproducer directly using the value, as it tends to get inlined / evaluated at compile time, and therefore not showing the issue. I have seen the value directly evaluating to zero in a larger project though.

Environment

Linux, Ubuntu 22.04
any target (tested Level Zero and CUDA)
GNU libstdc++6 v12.3.0
DPC++ tag nightly-2024-02-16 (first failing) through 2024-05-15 (latest tested)

Additional context

No response

The text was updated successfully, but these errors were encountered:

npmiller · 2024-05-21T09:21:30Z

cc @andykaylor any thoughts on this?

Your patch linked is NFC so I don't really see how it would have changed the behavior, but maybe you have ideas on the best way to fix this, or why it only started failing recently.

rafbiels · 2024-05-21T11:15:46Z

There was a discussion of this in the original PR: llvm/llvm-project#81173 (comment)

Regardless of whether it was this PR or another in the upstream pull-down which changed the behaviour, it looks to me like the change was correct, i.e. the current behaviour matches the documentation: -ffast-math enables -fno-honor-infinities. It looks like it wasn't the case before. However, the known_identity code has been wrong before and the pull-down only exposed this. It's the known_identity implementation that needs to be fixed.

The known_identity bug in older versions could be exposed by setting -fno-honor-infinities explicitly in the compilation command, but it was so niche that no one observed it. Now that it's seen with -ffast-math, it's much more severe.

aelovikov-intel · 2024-06-06T22:31:31Z

I'm afraid https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#table.identities mandates that infinity must be used by the implementation. @gmlueck , WDYT?

rafbiels · 2024-06-07T12:01:30Z

Does -ffast-math conform to the SYCL standard otherwise?

BTW, this is (rightly) triggering the -Wnan-infinity-disabled warning which is hidden by including SYCL headers as system headers. Including the headers as non-system shows this.

test.cpp:

#include <sycl/known_identity.hpp>
float foo() {return sycl::known_identity<sycl::minimum<float>,float>::value;}

command:

$ clang++ -c test.cpp -fno-honor-infinities -I$(dirname $(which clang++))/../include -I$(dirname $(which clang++))/../include/sycl

sycl/known_identity.hpp:269:13: warning: use of infinity is undefined behavior due to the currently enabled floating-point options [-Wnan-infinity-disabled]
  269 |           ? std::numeric_limits<AccumulatorT>::infinity()
      |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
test.cpp:2:71: note: in instantiation of static data member 'sycl::detail::known_identity_impl<sycl::minimum<float>, float>::value' requested here
    2 | float foo() {return sycl::known_identity<sycl::minimum<float>,float>::value;}
      |                                                                       ^

gmlueck · 2024-06-07T13:39:47Z

Just use std::numeric_limits<T>::max() in all cases. What's the benefit of using infinity here?

I don't think we should do this. Imagine a "min" reduction where all the values are INF (and the application is not compiled with -fno-honor-infinities). If the reduction used max as the identity, then I presume the result of the reduction would be max. But that would be wrong. Since all the values are INF, you'd expect the result to be INF.

Therefore, I think we should do one of the following:

Add a second condition checking whether -fno-honor-infinities is used.

Make sure std::numeric_limits<T>::has_infinity evaluates to false when -fno-honor-infinities is enabled.

I presume the effect on SYCL reductions is the same for either of these options. Obviously, changing the definition of std::numeric_limits<T>::has_infinity is a much larger scope, and I'd defer to @andykaylor to decide if this is the right thing to do.

If we do not decide to change the definition of has_infinity (or we think it's a very long discussion), we could change the SYCL headers to have an additional check for -fno-honor-infinities. We can always remove that check later if we decide to change the definition of has_infinity.

I think it is OK if the definition of SYCL's sycl::known_identity changes when the application is compiled with -fno-honor-infinities. Yes, this makes us non-conformant, but I think this is OK. When the user compiles with this option, they are specifically asking to trade-off correctness for performance.

There is a similar case for "max" reductions, where the known identity is -std::numeric_limits<T>::infinity(). Here, the identify should be lowest when compiling with -fno-honor-infinities. I did not see any other places in the SYCL spec that depend on std::numeric_limits<T>::infinity().

aelovikov-intel · 2024-06-07T22:35:01Z

I don't think we can even detect it...

$  diff -Naur <(clang++ -x c++ /dev/null -o - -E -dD -fhonor-infinities -DMY1) <(clang++ -x c++ /dev/null -o - -E -dD -fno-honor-infinites -DMY2)
--- /dev/fd/63  2024-06-07 15:34:16.516477537 -0700
+++ /dev/fd/62  2024-06-07 15:34:16.516477537 -0700
@@ -457,7 +457,7 @@
 #define __STDC_UTF_16__ 1
 #define __STDC_UTF_32__ 1
 # 1 "<command line>" 1
-#define MY1 1
+#define MY2 1
 #define __GCC_HAVE_DWARF2_CFI_ASM 1
 # 1 "<built-in>" 2
 # 1 "/dev/null" 2

gmlueck · 2024-06-10T13:15:15Z

I don't think we can even detect it...

I assume we could change the compiler to predefine a macro when -fno-honor-infinities is passed ...

aelovikov-intel · 2024-06-10T14:46:23Z

What if we use SYCL_EXTERNAL and multiple TUs with different fast math options? Wouldn't it be ODR-violation?

gmlueck · 2024-06-10T19:11:11Z

I agree that it's a bit weird for the sycl::known_identity trait to change based on the compiler flag. However, this seems less bad than returning a wrong value from a reduction. I'm open to other options if you have one.

aelovikov-intel · 2024-06-10T19:14:38Z

We can try using pragmas to disable FP optimizations in SYCL library code using known_identity. I'm not sure how to provide a way for the user to optimize it. sycl::reduction says that we can ignore user-provided identity if we know it ourselves for the given binary operation.

gmlueck · 2024-06-10T20:41:09Z

I suppose the SYCL RT could just ignore known_identity when we know it is INF and the TU is compiled with -fno-honor-infinities. Instead, the RT would silently use max instead in this case. That way, we will still get any optimizations associated with -fno-honor-infinities, and the reduction will still return the right answer (assuming the user's input does not contain any INF's).

cperkinsintel · 2025-02-14T22:34:58Z

I've opened a PR to fix this: #17028 . It uses the __FAST_MATH__ macro to navigate the cases.
A better solution than that could be to define a macro for -fno-honor-infinities. That would be more correct.

…17028) Technically a workaround, this fixes the problem with `known_identity<minOp/maxOp>` being incorrect when using `-ffast-math`. It works because we have a `__FAST_MATH__` macro that is defined when that flag is used. Better, would be to detect `-fno-honor-infinities`. Opening a JIRA/discussion for that, but this is a big improvement in the interim. See discussion #13813

rafbiels added the bug Something isn't working label May 16, 2024

npmiller added the confirmed label May 21, 2024

cperkinsintel mentioned this issue Feb 14, 2025

[SYCL] Fix min/max known_identity<> wrong when -ffast-math is used. #17028

Merged

cperkinsintel mentioned this issue Apr 29, 2025

[SYCL] tighten known_identity to finite_math macro #18247

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

known_identity has wrong values with -ffast-math #13813

known_identity has wrong values with -ffast-math #13813

rafbiels commented May 16, 2024

npmiller commented May 21, 2024

rafbiels commented May 21, 2024

aelovikov-intel commented Jun 6, 2024

rafbiels commented Jun 7, 2024 •

edited

Loading

gmlueck commented Jun 7, 2024

aelovikov-intel commented Jun 7, 2024

gmlueck commented Jun 10, 2024

aelovikov-intel commented Jun 10, 2024

gmlueck commented Jun 10, 2024

aelovikov-intel commented Jun 10, 2024

gmlueck commented Jun 10, 2024

cperkinsintel commented Feb 14, 2025

known_identity has wrong values with -ffast-math #13813

known_identity has wrong values with -ffast-math #13813

Comments

rafbiels commented May 16, 2024

Describe the bug

To reproduce

Environment

Additional context

npmiller commented May 21, 2024

rafbiels commented May 21, 2024

aelovikov-intel commented Jun 6, 2024

rafbiels commented Jun 7, 2024 • edited Loading

gmlueck commented Jun 7, 2024

aelovikov-intel commented Jun 7, 2024

gmlueck commented Jun 10, 2024

aelovikov-intel commented Jun 10, 2024

gmlueck commented Jun 10, 2024

aelovikov-intel commented Jun 10, 2024

gmlueck commented Jun 10, 2024

cperkinsintel commented Feb 14, 2025

rafbiels commented Jun 7, 2024 •

edited

Loading