Skip to content

[AArch64] Attempt to further split the arch default and implied exts. #106304

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

ahmedbougacha
Copy link
Member

In the tablegen definitions, we have now split the lists of extensions enabled by default for an arch version from the features that are required.

However, the frontend (in AArch64TargetInfo::setFeatureEnabled) still ends up bringing in all the default-enabled extensions anyway, so concretely we still end up with SVE2 for M4, now that it's declared as being v9.2a.

This takes the implied vs. default split to its conclusion, by emitting the two lists separately in tablegen, and really only enabling the really-required extensions. The default-enabled extensions are still enabled by default when explicitly specifying an arch version (via -march/target(arch=)).
Of course, what's "really required" is messy, and in some cases even contradictory, so I'm not sure this is all that practical overall.

This isn't a PR I intend to merge (or at the very least definitely not as-is), but I'm curious to see what others think. My conclusion is that we don't really want this after all, and since we don't expose the arch version directly (anymore) anyway, I'm thinking of downgrading the apple- definitions to v8a, explicitly listing the features, and discouraging or disallowing -march for darwin targets.

See individual commits:

[AArch64] Remove not-always-required features from arch implied.

These features aren't always required for the given base architecture
version, and usually are only conditionally required when some other
extension is present (e.g., FP/AdvSIMD).

This breaks a *ton* of tests (and maybe user code too), because most
places that just ask for an arch feature (without going through the
now-default-ext-aware -march/arch= machinery) now miss out on a lot.

We'd also have to add all the missing features to existing CPUs
that now lose them.  I haven't done that here yet.

Some of the features seem wrong in the Arm ARM, e.g., BF16 or I8MM
being declared as mandatory as of e.g., v8.6a, even though they really
need NEON, and NEON isn't mandatory.

[AArch64] Further split the arch default and implied exts.

We started doing this at the tablegen level, but the generated
arch info only contained the list of default extensions.
So in the frontend, we end up enabling all the default extensions
for arch vN anyway, when we only asked for a CPU that happens to
be declared as supporting vN.

This tries to split that into distinct lists, using the implied
subtarget features (and isolating only the AEK subset of those)
most of the time, but also using the default extensions when
explicitly asking for an arch (via -march or target(arch=)).

[clang] Don't enable SVE2 for v9.0a in the frontend.

We did this change in the backend, and we did it in the driver.
But the frontend continued to do it, and still does it even after this
change, because it picks up the DefaultExts (rather than the implied
required features).

This is a first step, at least.

Ideally we wouldn't need the frontend to do this too, but for functions
that have a target attribute, we can't rely on the driver, and we do
want the target to be visible at the frontend level, so we can't rely on
the backend either.  So we have to do this here as well.

We did this change in the backend, and we did it in the driver.
But the frontend continued to do it, and still does it even after this
change, because it picks up the DefaultExts (rather than the implied
required features).

This is a first step, at least.

Ideally we wouldn't need the frontend to do this too, but for functions
that have a target attribute, we can't rely on the driver, and we do
want the target to be visible at the frontend level, so we can't rely on
the backend either.  So we have to do this here as well.
We started doing this at the tablegen level, but the generated
arch info only contained the list of default extensions.
So in the frontend, we end up enabling all the default extensions
for arch vN anyway, when we only asked for a CPU that happens to
be declared as supporting vN.

This tries to split that into distinct lists, using the implied
subtarget features (and isolating only the AEK subset of those)
most of the time, but also using the default extensions when
explicitly asking for an arch (via -march or target(arch=)).
These features aren't always required for the given base architecture
version, and usually are only conditionally required when some other
extension is present (e.g., FP/AdvSIMD).

This breaks a *ton* of tests (and maybe user code too), because most
places that just ask for an arch feature (without going through the
now-default-ext-aware -march/arch= machinery) now miss out on a lot.

We'd also have to add all the missing features to existing CPUs
that now lose them.  I haven't done that here yet.

Some of the features seem wrong in the Arm ARM, e.g., BF16 or I8MM
being declared as mandatory as of e.g., v8.6a, even though they really
need NEON, and NEON isn't mandatory.
@ahmedbougacha
Copy link
Member Author

In the meantime, let's do #106312

@tmatheson-arm
Copy link
Contributor

This looks like the right direction to go.

@tmatheson-arm
Copy link
Contributor

Are you planning to follow through with this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants