Skip to content

device-libs: Split nonfinite edge case handling of fract usage#1587

Open
arsenm wants to merge 1 commit intoamd-mainfrom
device-libs/split-f64-fract-finite-only-check
Open

device-libs: Split nonfinite edge case handling of fract usage#1587
arsenm wants to merge 1 commit intoamd-mainfrom
device-libs/split-f64-fract-finite-only-check

Conversation

@arsenm
Copy link

@arsenm arsenm commented Feb 27, 2026

Break the builtin macro usage into the main sequence and the edge case fixup part. Perform the edge case checks on the original argument value, rather than the result of the earlier computation. We know this sequence cannot introduce inf or nan. This enables value tracking to trivially use the knowledge that the argument isn't inf or nan. The current sequence is 2 instructions too deep for the default value tracking recursion depth limit.

This eliminates codegen differences in the remaining trig functions when FINITE_ONLY_OPT is deleted.

@arsenm arsenm requested a review from b-sumner as a code owner February 27, 2026 13:51
@z1-cciauto
Copy link
Collaborator

double frac = BUILTIN_FRACTION_F64_IMPL(BUILTIN_FLDEXP_F64(f2, -2));
double frac_fixup = BUILTIN_FRACTION_F64_FIXUP(frac, x);

f2 = BUILTIN_FLDEXP_F64(frac_fixup, 2);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to split this into 3 statements and add 2 new variables. A single line like the original is better.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disagree. It's not like there's a statement and variable count budget. Temporary variables help readability.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In some cases that may be true, but not here. Please write this as a single statement.

@arsenm arsenm force-pushed the device-libs/split-f64-fract-finite-only-check branch from f22a444 to 10333a0 Compare February 27, 2026 16:58
@z1-cciauto
Copy link
Collaborator

EVALUATE(x, p2, p1, p0, f2, f1, f0);

f2 = BUILTIN_FLDEXP_F64(BUILTIN_FRACTION_F64(BUILTIN_FLDEXP_F64(f2, -2)), 2);
f2 = BUILTIN_FLDEXP_F64(BUILTIN_FRACTION_F64_FIXUP(BUILTIN_FRACTION_F64_IMPL(BUILTIN_FLDEXP_F64(f2, -2)), x), 2);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it x here and not f2? The original implementation tested f2.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the whole point of doing this. x is the original argument, which is trivially provable to not be infinity or nan. The above computation cannot introduce an infinity or nan. We know that isinf(x) == isinf(f2) at this point, so we can do the special case checks on the original argument

Break the builtin macro usage into the main sequence and the edge
case fixup part. Perform the edge case checks on the original argument
value, rather than the result of the earlier computation. We know this
sequence cannot introduce inf or nan. This enables value tracking to
trivially use the knowledge that the argument isn't inf or nan. The current
sequence is 2 instructions too deep for the default value tracking
recursion depth limit.

This eliminates codegen differences in the remaining trig functions
when FINITE_ONLY_OPT is deleted.
@arsenm arsenm force-pushed the device-libs/split-f64-fract-finite-only-check branch from 10333a0 to 52ea735 Compare March 4, 2026 19:00
@z1-cciauto
Copy link
Collaborator

@arsenm
Copy link
Author

arsenm commented Mar 5, 2026

ping

@lamb-j lamb-j changed the base branch from amd-staging to amd-main March 5, 2026 17:19
Copy link
Collaborator

@b-sumner b-sumner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sigh. Splitting the fraction like this is just ugly and feels like a workaround.

@skganesan008
Copy link
Collaborator

!PSDB

@z1-cciauto
Copy link
Collaborator

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants