-
Notifications
You must be signed in to change notification settings - Fork 13.4k
[CodeGen] Renumber slot indexes before register allocation #66334
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
https://llvm-compile-time-tracker.com/ shows a geomean 0.00% change. TODO: I still need to update ~15 tests with manual checks. |
@llvm/pr-subscribers-llvm-transforms ChangesRegAllocGreedy uses SlotIndexes::getApproxInstrDistance to approximate the length of a live range for its heuristics. Renumbering all slot indexes with the default instruction distance ensures that this estimate will be as accurate as possible, and will not depend on the history of how instructions have been added to and removed from SlotIndexes's maps.This also means that enabling -early-live-intervals, which runs the -- Patch is 31.92 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/66334.diff 330 Files Affected:
|
Error: Command failed due to missing milestone. |
All tests updated now. |
I've thought about doing this but it only makes the problem harder to observe. It still happens when renumbering occurs during allocation |
I don't think that's a bad thing. I think this makes things better, even if they're still not perfect. |
This seems reasonable to me. For AMDGPU graphics, this seems like it yields a very minor reduction in mean VGPR usage, and slightly greater reduction in mean SGPR usage. There are some a handful of edges case where VGPR usage increased significantly (20 VGPRs). I guess this is unsurprising given it tweaks a heuristic input. |
Rebased on #66627 which allowed me to update |
I'm on the fence with that change. My concerns is that it adds compile time and may trigger other renumbering down the line. Could you rename |
Yes, see earlier comment :)
This was based on the "renumber-all-slotindexes" branch here: https://llvm-compile-time-tracker.com/?config=NewPM-O3&stat=instructions%3Au&remote=jayfoad |
How about |
Both sounds good to me. Slight preference for |
Sorry, seen it then forgot about it :P. |
Renamed renumberAllIndexes -> packIndexes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - but probably wait a little in case @qcolombet has further input
RegAllocGreedy uses SlotIndexes::getApproxInstrDistance to approximate the length of a live range for its heuristics. Renumbering all slot indexes with the default instruction distance ensures that this estimate will be as accurate as possible, and will not depend on the history of how instructions have been added to and removed from SlotIndexes's maps. This also means that enabling -early-live-intervals, which runs the SlotIndexes analysis earlier, will not cause large amounts of churn due to different register allocator decisions.
RegAllocGreedy uses SlotIndexes::getApproxInstrDistance to approximate the length of a live range for its heuristics. Renumbering all slot indexes with the default instruction distance ensures that this estimate will be as accurate as possible, and will not depend on the history of how instructions have been added to and removed from SlotIndexes's maps. This also means that enabling -early-live-intervals, which runs the SlotIndexes analysis earlier, will not cause large amounts of churn due to different register allocator decisions.
Merged manually in e0919b1 |
Applies: llvm#66334 llvm#67038 Packing the slot indexes before register allocation is useful for us because it evens the gaps between slots after all the optimization passes that happen before `greedy` and may have removed a different number of instructions between AArch64 and X86. This leads to different slot gaps and, hence, slightly different regalloc in some cases. We backport the above patches for our LLVM, with the main difference being the absence of some convenient data structure iterators, which we had to convert to be compatible with our ADT infrastructure. We add the `-pack-indexes` flag to activate this. Addressses: systems-nuts/unifico#291
…7038) PR #66334 tried to renumber slot indexes before register allocation, but the numbering was still affected by list entries for instructions which had been erased. Fix this to make the register allocator's live range length heuristics even less dependent on the history of how instructions have been added to and removed from SlotIndexes's maps.
RegAllocGreedy uses SlotIndexes::getApproxInstrDistance to approximate
the length of a live range for its heuristics. Renumbering all slot
indexes with the default instruction distance ensures that this estimate
will be as accurate as possible, and will not depend on the history of
how instructions have been added to and removed from SlotIndexes's maps.
This also means that enabling -early-live-intervals, which runs the
SlotIndexes analysis earlier, will not cause large amounts of churn due
to different register allocator decisions.