You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Generalize loop pre-header creation and loop hoisting
A loop pre-header is a block that flows directly (and only) to the loop entry
block. The loop pre-header is the only non-loop predecessor of the entry block.
Loop invariant code can be hoisted to the loop pre-header where it is
guaranteed to be executed just once (per loop entry).
Currently, a loop pre-header has a number of restrictions:
- it is only created for a "do-while" (top entry) loop, not for a mid-loop entry.
- it isn't created if the current loop head and the loop entry block are in
different EH try regions
Additionally, partially due those restrictions, loop hoisting has restrictions:
- it requires a "do-while" (top entry) loop
- it requires the existing `head` block to dominate the loop entry block
- it requires the existing `head` block to be in the same EH region as the entry block
- it won't hoist if the `entry` block is the first block of a handler
This change removes all these restrictions.
Previously, even if we did create a pre-header, the definition of a pre-header
was a little weaker: an entry predecessor could be a non-loop block and also
not the pre-header, if the predecessor was dominated by the entry block. This
is more complicated to reason about, so I change the pre-header creation to
force entry block non-loop predecessors to branch to the pre-header instead.
This case only rarely occurs, when we have what looks like an outer loop back
edge but the natural loop recognition package doesn't recognize it as an outer loop.
I added a "stress mode" to always create a loop pre-header immediately after
loop recognition. This is disabled currently because loop cloning doesn't
respect the special status and invariants of a pre-header, and so inserts
all the cloning conditions and copied blocks after the pre-header, triggering
new loop structure asserts. This should be improved in the future.
A lot more checking of the loop table and loop annotations on blocks has been
added. This revealed a number of problems with loop unrolling leaving things
in a bad state for downstream phases. Loop unrolling has been updated to fix
this, in particular, the loop table is rebuilt if it is detected that we unroll
a loop that contains nested loops, since there will now be multiple copies of
those nested loops. This is the first case where we might rebuild the loop
table, so it lays the groundwork for potentially rebuilding the loop table in
other cases, such as after loop cloning where we don't add the "slow path"
loops to the table.
There is some code refactoring to simplify the "find loops" code as well.
Some change details:
- `optSetBlockWeights` is elevated to a "phase" that runs prior to loop recognition.
- LoopFlags is simplified:
- LPFLG_DO_WHILE is removed; call `lpIsTopEntry` instead
- LPFLG_ONE_EXIT is removed; check `lpExitCnt == 1` instead
- LPFLG_HOISTABLE is removed (there are no restrictions anymore)
- LPFLG_CONST is removed: check `lpFlags & (LPFLG_CONST_INIT | LPFLG_CONST_LIMIT) == (LPFLG_CONST_INIT | LPFLG_CONST_LIMIT)` instead (only used in one place
- bool lpContainsCall is removed and replaced by LPFLG_CONTAINS_CALL
- Added a `lpInitBlock` field to the loop table. For constant and variable
initialization loops, code assumed that these expressions existed in the
`head` block. This isn't true anymore if we insert a pre-header block.
So, capture the block where these actually exist when we determine that
they do exist, and explicitly use this block pointer where needed.
- Added `fgComputeReturnBlocks()` to extract this code out of `fgComputeReachability` into a function
- Added `optFindAndScaleGeneralLoopBlocks()` to extract this out of loop recognition to its own function.
- Added `optResetLoopInfo()` to reset the loop table and block annotations related to loops
- Added `fgDebugCheckBBNumIncreasing()` to allow asserting that the bbNum
order of blocks is increasing. This should be used in phases that depend
on this order to do bbNum comparisons.
- Add a lot more loop table validation in `fgDebugCheckLoopTable()`
0 commit comments