Draft: Fix `{f16,f32,f64,f128}::div_euclid` #134062

traviscross · 2024-12-09T09:03:12Z

The current implementation of div_euclid for floating point numbers violates the invariant, stated in the documentation, that:

a.rem_euclid(b) ~= a.div_euclid(b).mul_add(-b, a)

This fixes that problem.

When the magnitude of the exact quotient is greater than or equal to the maximum integer that can be represented precisely in the type, that invariant is not generally possible to uphold -- and without control over the rounding mode it would be difficult to calculate in any case -- so we return NaN in those instances.

r? ghost

The current implementation of `div_euclid` for floating point numbers violates the invariant, stated in the documentation, that: ```rust a.rem_euclid(b) ~= a.div_euclid(b).mul_add(-b, a) ``` This fixes that problem. When the magnitude of the exact quotient is greater than or equal to the maximum integer that can be represented precisely in the type, that invariant is not generally possible to uphold -- and without control over the rounding mode it would be difficult to calculate in any case -- so we return `NaN` in those instances.

tczajka · 2024-12-09T11:04:26Z

When the magnitude of the exact quotient is greater than or equal to the maximum integer that can be represented precisely in the type, that invariant is not generally possible to uphold

The Euclidean division property is not meant to hold after rounding, it is meant to hold before rounding, i.e. in infinite precision. It is always possible to satisfy in a unique way.

theemathas · 2024-12-09T12:20:24Z

May I know the reason why you went with NaN, and not Infinity?

traviscross · 2024-12-09T17:52:49Z

The Euclidean division property is not meant to hold after rounding, it is meant to hold before rounding, i.e. in infinite precision. It is always possible to satisfy in a unique way.

Yes, agreed (though it is an interesting caveat that should be noted in the documentation). Calculating it, though, is the problem. This was the motivating bit:

...and without control over the rounding mode it would be difficult to calculate in any case...

We need a div_trunc fused primitive that computes (x / y).trunc() without intermediate rounding. This draft PR goes as far as it may be practical to go while building on rounded operations. And that's the point, is to explore those tradeoffs and see how happy we are with them.

The only other option I see is to add a div_trunc intrinsic with inline assembly for each platform and a soft-float fallback.

The question this PR asks, though, in lieu of that, is whether a correct implementation of div_euclid with a restricted codomain is better than the current incorrect implementation in core.

May I know the reason why you went with NaN, and not Infinity?

The purpose here is to add restriction to the codomain. There is a finite inexact value that we could give as output, but we're unable to calculate it with this approach, so returning NaN is more correct in the same way that certain operations return NaN in cases of underflow.

traviscross · 2024-12-09T19:04:58Z

In terms of correctness, at this point, for f16 I've proved this algorithm correct (taking into account the restricted codomain) by exhaustively checking that for all possible dividends and divisors, the result is identical to the correctly-rounded result of an ideal implementation that does the calculation in infinite precision.

For all non-NaN finite f16 dividends and divisors where the divisor is not zero, I've also verified (again taking into account the restricted codomain) that the relation x.rem_euclid(y) == x.div_euclid(y).mul_add(-y, x) holds exactly.

For f32 and f64, such exhaustive checking is of course not feasible, but preliminary results from fuzzing have not found any deviations from the equivalent calculations in infinite precision.

With respect to proving this analytically, div_special directly encodes the IEEE 754-2008 rules. Then, essentially, we need to show that this is correct for computing div_trunc:

let sign = xor_sign(x, y, 1.0);
let (x, y) = (x.abs(), y.abs());
let q = x / y;
if q > ((1u64 << f64::MANTISSA_DIGITS) - 1) as f64 {
    return f64::NAN;
}
let qt = q.trunc();
return if qt.mul_add(-y, x).is_sign_negative() { qt - 1.0 } else { qt }.copysign(sign)

The intuition here is that the problem happens when the exact quotient (of the absolute value of the dividend and divisor) is a bit smaller than some integer and is then rounded up to that integer ahead of the truncation. To reliably detect this, we need to do some calculation centered on zero so we can use the +0 / -0 distinction to our advantage. That's the trick here.

So we calculate the remainder starting from this perhaps-incorrectly-truncated quotient. If the sign is negative (even and particularly -0), then we know that this incorrect rounding-up must have happened, and so we correct it by subtracting one, which is OK because we've already excluded (by returning NaN) quotients that cannot be represented precisely or that might have been the rounded result of some other quotient that cannot be represented precisely.

traviscross · 2024-12-10T08:32:54Z

As an example of why this trick for calculating div_trunc breaks down when we can no longer precisely represent the quotient, consider this f16 division:

x = 0.1456298828125_f16
y = 3.057718276977539e-5_f16
(x / y) = 4764.0
(x / y).trunc()  = 4764.0
(x / y).trunc().mul_add(-y, x) = -3.981590270996094e-5
exact!(x / y) = 4762.697855750487329434697855750487329427
exact!((x / y).trunc()) = 4762.0
exact!((x / y).trunc()) as f16 = 4760.0

Here, the quotient is rounded up, and we can detect that. The trouble is, we can't tell whether it was rounded up from [4762, 4763) or from [4763, 4764). If it had been the latter, then 4764.0 would still have been the correct rounded result.

But as it is, the exact result is 4762.69.., which when truncated, is 4762, which would properly round down to 4760.

Since we can't readily distinguish these two cases, we can't fix up the rounding of the truncated quotient.

traviscross · 2024-12-12T19:37:41Z

Closing in favor of:

Correctly rounded floating point div_euclid. #134145

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Dec 9, 2024

traviscross mentioned this pull request Dec 9, 2024

Fix {f16,f32,f64,f128}::div_euclid #133755

Closed

traviscross added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 9, 2024

theemathas mentioned this pull request Dec 11, 2024

Correctly rounded floating point div_euclid. #134145

Open

traviscross closed this Dec 12, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Draft: Fix `{f16,f32,f64,f128}::div_euclid` #134062

Draft: Fix `{f16,f32,f64,f128}::div_euclid` #134062

traviscross commented Dec 9, 2024

tczajka commented Dec 9, 2024

theemathas commented Dec 9, 2024

traviscross commented Dec 9, 2024 •

edited

Loading

traviscross commented Dec 9, 2024 •

edited

Loading

traviscross commented Dec 10, 2024

traviscross commented Dec 12, 2024

Draft: Fix {f16,f32,f64,f128}::div_euclid #134062

Draft: Fix {f16,f32,f64,f128}::div_euclid #134062

Conversation

traviscross commented Dec 9, 2024

tczajka commented Dec 9, 2024

theemathas commented Dec 9, 2024

traviscross commented Dec 9, 2024 • edited Loading

traviscross commented Dec 9, 2024 • edited Loading

traviscross commented Dec 10, 2024

traviscross commented Dec 12, 2024

Draft: Fix `{f16,f32,f64,f128}::div_euclid` #134062

Draft: Fix `{f16,f32,f64,f128}::div_euclid` #134062

traviscross commented Dec 9, 2024 •

edited

Loading

traviscross commented Dec 9, 2024 •

edited

Loading