Skip to content

known_identity has wrong values with -ffast-math #13813

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
rafbiels opened this issue May 16, 2024 · 12 comments
Open

known_identity has wrong values with -ffast-math #13813

rafbiels opened this issue May 16, 2024 · 12 comments
Labels
bug Something isn't working confirmed

Comments

@rafbiels
Copy link
Contributor

Describe the bug

The initial value for sycl::minimum<float> is set to inf which gets turned into 0 with -ffast-math. More generally, both min and max are affected for any type which has_infinity.

The issue is here:

/// Returns maximal possible value as identity for MIN operations.
template <typename BinaryOperation, typename AccumulatorT>
struct known_identity_impl<BinaryOperation, AccumulatorT,
std::enable_if_t<IsMinimumIdentityOp<
AccumulatorT, BinaryOperation>::value>> {
static constexpr AccumulatorT value = static_cast<AccumulatorT>(
std::numeric_limits<AccumulatorT>::has_infinity
? std::numeric_limits<AccumulatorT>::infinity()
: (std::numeric_limits<AccumulatorT>::max)());
};

std::numeric_limits<T>::infinity() should not be used here when -fno-honor-infinities is enabled (e.g. by -ffast-math). This causes the value to be evaluated as zero when used (at least in some cases) and subsequently leads to incorrect results of a program. Note that std::numeric_limits<float>::has_infinity evaluates to true regardless of the honor-infinities option.

This started failing in the nightly-2024-02-16 tag. The code in question hasn't changed in months, but it seems that the behaviour of -ffast-math has changed, possibly due to 73159a9

So it seems the code has been problematic earlier, but it was hidden by the -fno-honor-infinities not being set in this context.

I think there could be three solutions:

  • Just use std::numeric_limits<T>::max() in all cases. What's the benefit of using infinity here?
  • Add a second condition checking whether -fno-honor-infinities is used.
  • Make sure std::numeric_limits<T>::has_infinity evaluates to false when -fno-honor-infinities is enabled.

To reproduce

This is a minimal example of sycl::joint_reduce computing the minimum for a vector of 4 positive floats:

#include <sycl/sycl.hpp>
#include <cstdio>

int main() {
    sycl::queue q{};
    std::vector<float> data{4.0f, 1.0f, 3.0f, 2.0f};
    float result{data[0]};
    {
        sycl::buffer<float> inBuf{data};
        sycl::buffer<float> outBuf{&result, 1};
        q.submit([&inBuf, &outBuf](sycl::handler& cgh){
            sycl::accessor inAcc{inBuf, cgh, sycl::read_only};
            sycl::accessor outAcc{outBuf, cgh, sycl::write_only};
            cgh.parallel_for(sycl::nd_range<1>{1,1}, [inAcc, outAcc](sycl::nd_item<1> item){
                const float* start{&inAcc[0]};
                const float* end{start + inAcc.size()};
                outAcc[0] = sycl::joint_reduce(item.get_group(), start, end, sycl::minimum<>{});
            });
        });
    }
    printf("Result: %f\n", result);
    return 0;
}

When compiled with -O3 -ffast-math this prints 0.0 instead of the expected 1.0:

$ clang++ -fsycl -o test test.cpp -O3 -ffast-math
./test
Result: 0.000000

It can be worked around with -fhonor-infinities:

$ clang++ -fsycl -o test test.cpp -O3 -ffast-math -fhonor-infinities
./test
Result: 1.000000

Note this example doesn't directly use sycl::known_identity. It presents that the issue breaks a high-level feature. I didn't manage to create a small enough reproducer directly using the value, as it tends to get inlined / evaluated at compile time, and therefore not showing the issue. I have seen the value directly evaluating to zero in a larger project though.

Environment

  • Linux, Ubuntu 22.04
  • any target (tested Level Zero and CUDA)
  • GNU libstdc++6 v12.3.0
  • DPC++ tag nightly-2024-02-16 (first failing) through 2024-05-15 (latest tested)

Additional context

No response

@rafbiels rafbiels added the bug Something isn't working label May 16, 2024
@npmiller
Copy link
Contributor

cc @andykaylor any thoughts on this?

Your patch linked is NFC so I don't really see how it would have changed the behavior, but maybe you have ideas on the best way to fix this, or why it only started failing recently.

@rafbiels
Copy link
Contributor Author

There was a discussion of this in the original PR: llvm/llvm-project#81173 (comment)

Regardless of whether it was this PR or another in the upstream pull-down which changed the behaviour, it looks to me like the change was correct, i.e. the current behaviour matches the documentation: -ffast-math enables -fno-honor-infinities. It looks like it wasn't the case before. However, the known_identity code has been wrong before and the pull-down only exposed this. It's the known_identity implementation that needs to be fixed.

The known_identity bug in older versions could be exposed by setting -fno-honor-infinities explicitly in the compilation command, but it was so niche that no one observed it. Now that it's seen with -ffast-math, it's much more severe.

@aelovikov-intel
Copy link
Contributor

I'm afraid https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#table.identities mandates that infinity must be used by the implementation. @gmlueck , WDYT?

@rafbiels
Copy link
Contributor Author

rafbiels commented Jun 7, 2024

Does -ffast-math conform to the SYCL standard otherwise?

BTW, this is (rightly) triggering the -Wnan-infinity-disabled warning which is hidden by including SYCL headers as system headers. Including the headers as non-system shows this.

test.cpp:

#include <sycl/known_identity.hpp>
float foo() {return sycl::known_identity<sycl::minimum<float>,float>::value;}

command:

$ clang++ -c test.cpp -fno-honor-infinities -I$(dirname $(which clang++))/../include -I$(dirname $(which clang++))/../include/sycl

sycl/known_identity.hpp:269:13: warning: use of infinity is undefined behavior due to the currently enabled floating-point options [-Wnan-infinity-disabled]
  269 |           ? std::numeric_limits<AccumulatorT>::infinity()
      |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
test.cpp:2:71: note: in instantiation of static data member 'sycl::detail::known_identity_impl<sycl::minimum<float>, float>::value' requested here
    2 | float foo() {return sycl::known_identity<sycl::minimum<float>,float>::value;}
      |                                                                       ^

@gmlueck
Copy link
Contributor

gmlueck commented Jun 7, 2024

  • Just use std::numeric_limits<T>::max() in all cases. What's the benefit of using infinity here?

I don't think we should do this. Imagine a "min" reduction where all the values are INF (and the application is not compiled with -fno-honor-infinities). If the reduction used max as the identity, then I presume the result of the reduction would be max. But that would be wrong. Since all the values are INF, you'd expect the result to be INF.

Therefore, I think we should do one of the following:

  • Add a second condition checking whether -fno-honor-infinities is used.
  • Make sure std::numeric_limits<T>::has_infinity evaluates to false when -fno-honor-infinities is enabled.

I presume the effect on SYCL reductions is the same for either of these options. Obviously, changing the definition of std::numeric_limits<T>::has_infinity is a much larger scope, and I'd defer to @andykaylor to decide if this is the right thing to do.

If we do not decide to change the definition of has_infinity (or we think it's a very long discussion), we could change the SYCL headers to have an additional check for -fno-honor-infinities. We can always remove that check later if we decide to change the definition of has_infinity.

I think it is OK if the definition of SYCL's sycl::known_identity changes when the application is compiled with -fno-honor-infinities. Yes, this makes us non-conformant, but I think this is OK. When the user compiles with this option, they are specifically asking to trade-off correctness for performance.

There is a similar case for "max" reductions, where the known identity is -std::numeric_limits<T>::infinity(). Here, the identify should be lowest when compiling with -fno-honor-infinities. I did not see any other places in the SYCL spec that depend on std::numeric_limits<T>::infinity().

@aelovikov-intel
Copy link
Contributor

I don't think we can even detect it...

$  diff -Naur <(clang++ -x c++ /dev/null -o - -E -dD -fhonor-infinities -DMY1) <(clang++ -x c++ /dev/null -o - -E -dD -fno-honor-infinites -DMY2)
--- /dev/fd/63  2024-06-07 15:34:16.516477537 -0700
+++ /dev/fd/62  2024-06-07 15:34:16.516477537 -0700
@@ -457,7 +457,7 @@
 #define __STDC_UTF_16__ 1
 #define __STDC_UTF_32__ 1
 # 1 "<command line>" 1
-#define MY1 1
+#define MY2 1
 #define __GCC_HAVE_DWARF2_CFI_ASM 1
 # 1 "<built-in>" 2
 # 1 "/dev/null" 2

@gmlueck
Copy link
Contributor

gmlueck commented Jun 10, 2024

I don't think we can even detect it...

I assume we could change the compiler to predefine a macro when -fno-honor-infinities is passed ...

@aelovikov-intel
Copy link
Contributor

What if we use SYCL_EXTERNAL and multiple TUs with different fast math options? Wouldn't it be ODR-violation?

@gmlueck
Copy link
Contributor

gmlueck commented Jun 10, 2024

I agree that it's a bit weird for the sycl::known_identity trait to change based on the compiler flag. However, this seems less bad than returning a wrong value from a reduction. I'm open to other options if you have one.

@aelovikov-intel
Copy link
Contributor

We can try using pragmas to disable FP optimizations in SYCL library code using known_identity. I'm not sure how to provide a way for the user to optimize it. sycl::reduction says that we can ignore user-provided identity if we know it ourselves for the given binary operation.

@gmlueck
Copy link
Contributor

gmlueck commented Jun 10, 2024

I suppose the SYCL RT could just ignore known_identity when we know it is INF and the TU is compiled with -fno-honor-infinities. Instead, the RT would silently use max instead in this case. That way, we will still get any optimizations associated with -fno-honor-infinities, and the reduction will still return the right answer (assuming the user's input does not contain any INF's).

@cperkinsintel
Copy link
Contributor

I've opened a PR to fix this: #17028 . It uses the __FAST_MATH__ macro to navigate the cases.
A better solution than that could be to define a macro for -fno-honor-infinities. That would be more correct.

againull pushed a commit that referenced this issue Feb 18, 2025
…17028)

Technically a workaround, this fixes the problem with
`known_identity<minOp/maxOp>` being incorrect when using `-ffast-math`.
It works because we have a `__FAST_MATH__` macro that is defined when
that flag is used. Better, would be to detect `-fno-honor-infinities`.
Opening a JIRA/discussion for that, but this is a big improvement in the
interim.

See discussion #13813
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working confirmed
Projects
None yet
Development

No branches or pull requests

5 participants