Draft: Fix {f16,f32,f64,f128}::div_euclid
#134062
The current implementation of `div_euclid` for floating point numbers violates the invariant, stated in the documentation, that:

```rust
a.rem_euclid(b) ~= a.div_euclid(b).mul_add(-b, a)
```

This fixes that problem.

When the magnitude of the exact quotient is greater than or equal to the maximum integer that can be represented precisely in the type, that invariant is not generally possible to uphold -- and without control over the rounding mode it would be difficult to calculate in any case -- so we return `NaN` in those instances.
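One way to probe the invariant above is to compute both of its sides for a given pair of inputs and compare them. The sketch below does that for `f64`; the helper name `invariant_sides` is hypothetical and is not part of this PR.

```rust
/// Returns `(lhs, rhs)` of the documented invariant for `f64` inputs:
/// `lhs = a.rem_euclid(b)` and `rhs = a.div_euclid(b).mul_add(-b, a)`.
/// Hypothetical helper, for illustration only.
fn invariant_sides(a: f64, b: f64) -> (f64, f64) {
    let lhs = a.rem_euclid(b);
    // `mul_add` evaluates q * (-b) + a with a single rounding.
    let rhs = a.div_euclid(b).mul_add(-b, a);
    (lhs, rhs)
}
```

For inputs whose quotient is exact, such as `invariant_sides(-7.0, 4.0)`, both sides come out as `1.0`. The inputs of interest for this PR are those where `a / b` is rounded up to an integer.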
The Euclidean division property is not meant to hold after rounding; it is meant to hold before rounding, i.e. in infinite precision. It is always possible to satisfy in a unique way.
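For reference, the exact property can be observed with integers, where no rounding takes place: `a == q * b + r` with `0 <= r < |b|`. This is a minimal sketch using the standard `i64::div_euclid` and `i64::rem_euclid`; the function name `euclid_property_holds` is made up for this illustration.

```rust
/// Checks the Euclidean division property for integers.
/// `b` must be nonzero, otherwise `div_euclid` panics.
/// Hypothetical helper, for illustration only.
fn euclid_property_holds(a: i64, b: i64) -> bool {
    let q = a.div_euclid(b);
    let r = a.rem_euclid(b);
    // The remainder is non-negative and smaller than |b|,
    // and quotient and remainder reconstruct `a` exactly.
    a == q * b + r && 0 <= r && r < b.abs()
}
```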
May I know the reason why you went with NaN, and not Infinity?
Yes, agreed (though it is an interesting caveat that should be noted in the documentation). Calculating it, though, is the problem. This was the motivating bit:
We need a The only other option I see is to add a The question this PR asks, though, in lieu of that, is whether a correct implementation of
The purpose here is to add a restriction to the codomain. There is a finite inexact value that we could give as output, but we're unable to calculate it with this approach, so returning
In terms of correctness, at this point, for For all non- For With respect to proving this analytically,

```rust
let sign = xor_sign(x, y, 1.0);
let (x, y) = (x.abs(), y.abs());
let q = x / y;
if q > ((1u64 << f64::MANTISSA_DIGITS) - 1) as f64 {
    return f64::NAN;
}
let qt = q.trunc();
return if qt.mul_add(-y, x).is_sign_negative() { qt - 1.0 } else { qt }.copysign(sign)
```

The intuition here is that the problem happens when the exact quotient (of the absolute values of the dividend and divisor) is a bit smaller than some integer and is then rounded up to that integer ahead of the truncation. To reliably detect this, we need to do some calculation centered on zero so we can use the So we calculate the remainder starting from this perhaps-incorrectly-truncated quotient. If the sign is negative (even and particularly
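The detection step described above can be tried in isolation. The following is a sketch for `f64` restricted to positive, finite inputs, so the `xor_sign` / `copysign` handling from the snippet is left out; the function name `corrected_trunc_quotient` is hypothetical.

```rust
/// Truncated quotient of two positive, finite `f64` values, corrected for
/// the case where `x / y` was rounded up to an integer before truncation.
/// Returns NaN when the rounded quotient exceeds 2^53 - 1.
/// Hypothetical helper, for illustration only.
fn corrected_trunc_quotient(x: f64, y: f64) -> f64 {
    let q = x / y;
    // Largest integer below which every integer is exactly representable.
    let max_exact = ((1u64 << f64::MANTISSA_DIGITS) - 1) as f64;
    if q > max_exact {
        return f64::NAN;
    }
    let qt = q.trunc();
    // Fused remainder x - qt * y, rounded once. A negative sign means
    // qt * y overshoots x, i.e. the quotient had been rounded up.
    if qt.mul_add(-y, x).is_sign_negative() {
        qt - 1.0
    } else {
        qt
    }
}
```

With `x = 1.0` and `y = 0.1`, `x / y` rounds to `10.0` even though the `f64` nearest to 0.1 is slightly larger than one tenth, so the fused remainder comes out negative and the sketch returns `9.0`.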
As an example of why this trick for calculating

```text
x = 0.1456298828125_f16
y = 3.057718276977539e-5_f16
(x / y) = 4764.0
(x / y).trunc() = 4764.0
(x / y).trunc().mul_add(-y, x) = -3.981590270996094e-5
exact!(x / y) = 4762.697855750487329434697855750487329427
exact!((x / y).trunc()) = 4762.0
exact!((x / y).trunc()) as f16 = 4760.0
```

Here, the quotient is rounded up, and we can detect that. The trouble is, we can't tell whether it was rounded up from But as it is, the exact result is Since we can't readily distinguish these two cases, we can't fix up the rounding of the truncated quotient.
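The numbers in this example can be re-derived with exact integer arithmetic, without needing `f16` support. The sketch below assumes the scalings `x = 1193 * 2^-13` and `y = 513 * 2^-24` (which reproduce the decimal values listed) and relies on `f16` values in `[4096, 8192)` being spaced 4 apart; all names in it are hypothetical.

```rust
// x / y = (1193 * 2^-13) / (513 * 2^-24) = (1193 << 11) / 513
const NUM: u64 = 1193 << 11; // 2_443_264
const DEN: u64 = 513;
/// Spacing of f16 values in [4096, 8192), from the 11-bit significand.
const STEP: u64 = 4;

/// Rounds num/den to the nearest multiple of `step`, ties to the even multiple.
fn round_to_multiple(num: u64, den: u64, step: u64) -> u64 {
    let unit = den * step; // one step, scaled by den
    let k = num / unit; // whole steps below the value
    let twice_rem = 2 * (num % unit);
    let up = twice_rem > unit || (twice_rem == unit && k % 2 == 1);
    (k + up as u64) * step
}

/// The quotient x / y as rounded onto the f16 grid.
fn rounded_quotient() -> u64 {
    round_to_multiple(NUM, DEN, STEP)
}

/// Exact remainder x - q * y, in units of 2^-24.
fn remainder_units(q: u64) -> i64 {
    NUM as i64 - (q * DEN) as i64
}

/// Truncation of the exact quotient.
fn exact_trunc() -> u64 {
    NUM / DEN
}
```

Here `rounded_quotient()` gives 4764, `remainder_units(4764)` gives -668 (that is, -668 * 2^-24, the negative remainder shown), `exact_trunc()` gives 4762, and `round_to_multiple(4762, 1, STEP)` gives 4760.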
Closing in favor of:
r? ghost