Avoid broadcast/extract when implementing memset #416
base: aie-public
I think we should extend the target hook to have something like:
It is risky to remove all that code.
Discussed offline: we have an alignment here, we are just spotting the wrong one in the legalizer.
if (MFI.getObjectAlign(FI) < Alignment)
  MFI.setObjectAlignment(FI, Alignment);
}
}
Yes, it is a blunt fix. We cut away an optimization, and PR testing doesn't notice the loss.
However, this is not the way. Instead, I propose we change the lowering code so that the MMOs of the generated store instructions carry the appropriate alignment. We need a less lazy constructor for that derived MMO. I think that should be perfectly acceptable to upstream.
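To make the proposal concrete, here is a minimal sketch of the difference between a "lazy" derived MMO and one that picks up the raised stack object alignment. It is a toy model, not LLVM code: `ToyFrameObject`, `ToyMMO`, `toyCommonAlignment`, `deriveLazy` and `deriveFromFrameObject` are all hypothetical names invented for illustration, and the real change would have to go through the actual MachineMemOperand construction in the lowering code.

```cpp
#include <cstdint>

// Toy stand-ins, NOT the LLVM classes: just enough state to show the idea.
struct ToyFrameObject {
  uint64_t AlignBytes; // alignment of the stack object, a nonzero power of two
};

struct ToyMMO {
  uint64_t SizeBytes;
  uint64_t AlignBytes;
};

// Largest power of two that divides both the base alignment and the offset,
// i.e. the lowest set bit of (BaseAlign | Offset). BaseAlign must be a
// nonzero power of two.
constexpr uint64_t toyCommonAlignment(uint64_t BaseAlign, uint64_t Offset) {
  return (BaseAlign | Offset) & (~(BaseAlign | Offset) + 1);
}

// "Lazy" derivation: start from whatever alignment the original memset MMO
// recorded, even if the stack object has been realigned since.
constexpr ToyMMO deriveLazy(const ToyMMO &Orig, uint64_t Offset,
                            uint64_t Size) {
  return ToyMMO{Size, toyCommonAlignment(Orig.AlignBytes, Offset)};
}

// Less lazy derivation (assumed shape of the fix): start from the current
// alignment of the frame object that the store addresses.
constexpr ToyMMO deriveFromFrameObject(const ToyFrameObject &FO,
                                       uint64_t Offset, uint64_t Size) {
  return ToyMMO{Size, toyCommonAlignment(FO.AlignBytes, Offset)};
}
```

In this model, a memset MMO recorded with 4-byte alignment yields a 4-byte-aligned store MMO via `deriveLazy`, while `deriveFromFrameObject` on an object raised to 32 bytes yields 32 at offset 0 and 16 at offset 16.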
The generic G_MEMSET legalizer helper tweaks the alignment of stack objects to make them amenable to vector implementations. However, the vector store that it creates doesn't have that alignment info available, so our legalization scalarizes it. After that scalarization, the result is no better than the original code that uses 32-bit stores.
I have disabled that cleverness. This gives good results, including in stack size, on the reduced example that I have added as a regression test.
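The cost of the missing alignment info can be illustrated with a small counting model. This is only a sketch under stated assumptions: the 32-byte vector width (`kToyVectorBytes`) and the all-or-nothing legalization rule are hypothetical, as are all function names; only the 32-bit scalar store size comes from the description above.

```cpp
#include <cstdint>

constexpr uint64_t kToyVectorBytes = 32; // hypothetical vector store width
constexpr uint64_t kScalarBytes = 4;     // 32-bit scalar stores

// Stores emitted for ONE vector-wide store after a toy legalization step:
// kept as a single store if its MMO is aligned enough, otherwise split into
// 32-bit scalar stores.
constexpr uint64_t toyStoresAfterLegalization(uint64_t MMOAlignBytes) {
  return MMOAlignBytes >= kToyVectorBytes ? 1
                                          : kToyVectorBytes / kScalarBytes;
}

// Stores emitted for a memset of LenBytes (assumed to be a multiple of the
// vector width) that was lowered to vector stores whose MMOs all carry
// MMOAlignBytes.
constexpr uint64_t toyVectorMemsetStoreCount(uint64_t LenBytes,
                                             uint64_t MMOAlignBytes) {
  return (LenBytes / kToyVectorBytes) *
         toyStoresAfterLegalization(MMOAlignBytes);
}

// Stores emitted by the plain lowering that uses 32-bit stores throughout.
constexpr uint64_t toyScalarMemsetStoreCount(uint64_t LenBytes) {
  return LenBytes / kScalarBytes;
}
```

In this model a 64-byte memset whose store MMOs still say 4-byte alignment costs as many stores as the scalar lowering, even though the stack object itself was realigned (and possibly padded) for vectors. Only when the MMO carries the 32-byte alignment does the vector path pay off.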
DRAFT DRAFT DRAFT
This would be a draft PR, but I want to see whether it comes through standard CI testing