[RISCV] Add test for copy propagation issue with VMV0. NFC #75347
I am currently looking into selecting mask registers as virtual registers
instead of physical registers (i.e. copies into V0) in order to simplify #71764.
That is, we would select masked pseudos by using the vmv0 reg class for the mask operand:
PseudoVADD_VV_M1_MASK %x, %y, %mask:vmv0
instead of how we currently copy it into v0:
$v0 = COPY %mask:vr
PseudoVADD_VV_M1_MASK %x, %y, $v0
One issue I've run into with this approach is that register allocation can fail
on vector compare instructions, due to an interaction with MachineCSE and how
we model register overlap constraints.
We currently model vector register overlap constraints with early clobber. For
instructions like PseudoVMSEQ_VV_M2_MASK, this is more restrictive than what is
needed since the mask operand can be the same as the destination register, but
there is currently no way in LLVM to mark only a subset of operands as
being clobbered [1]:
early-clobber %res:vr = PseudoVMSEQ_VV_M2_MASK %pt:vr(tied-def 0), ..., %mask:vmv0, ...
The issue arises if the passthru operand is a copy of the mask operand, e.g.:
%mask:vmv0 = ...
%pt:vr = COPY %mask
MachineCSE performs trivial copy propagation and will coalesce the copy of
%mask to the passthru operand:
early-clobber %res:vr = PseudoVMSEQ_VV_M2_MASK %mask:vmv0(tied-def 0), ..., %mask:vmv0, ...
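As a rough illustration of the rewrite above, here is a toy Python model of trivial copy propagation. It is only a sketch: the tuple-based instruction representation and the `propagate_copies` helper are hypothetical, and the real MachineCSE pass applies additional legality checks that are not modelled here.

```python
def propagate_copies(instrs):
    """Replace uses of a COPY's destination with the COPY's source.

    instrs: list of (dest, opcode, operands) tuples in program order.
    Returns a new list with operands rewritten and unused COPYs dropped.
    This is a toy model, not how MachineCSE is actually implemented.
    """
    copy_src = {}
    rewritten = []
    for dest, opcode, operands in instrs:
        # Rewrite each use that names a previously seen COPY destination.
        operands = [copy_src.get(op, op) for op in operands]
        if opcode == "COPY":
            copy_src[dest] = operands[0]
        rewritten.append((dest, opcode, operands))
    # Drop COPYs whose destination no longer has any use.
    used = {op for _, _, ops in rewritten for op in ops}
    return [i for i in rewritten if not (i[1] == "COPY" and i[0] not in used)]


before = [
    ("%mask", "DEF", []),
    ("%pt", "COPY", ["%mask"]),
    # Operands: passthru (tied to the def), then mask.
    ("%res", "PseudoVMSEQ_VV_M2_MASK", ["%pt", "%mask"]),
]
after = propagate_copies(before)
```

In this model the compare ends up with `%mask` as both its passthru and mask operand, which is the shape shown in the MIR above.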
The two-address instruction pass then sees the tied operand and constrains the
def's register class:
%mask:vmv0 = ...
%res:vmv0 = COPY %mask
early-clobber %res:vmv0 = PseudoVMSEQ_VV_M2_MASK %res:vmv0, ..., %mask:vmv0
Because of the early-clobber constraint, %mask and %res need to be assigned
separate registers. But vmv0 contains only one register (v0), so allocation errors out.
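The infeasibility can be sketched with a small brute-force model in Python. Everything here is an assumption made for illustration: the `REG_CLASSES` table is truncated (the real vr class has more registers), and `find_assignment` is a hypothetical helper, not how LLVM's register allocator works.

```python
from itertools import product

# Toy register classes. Assumption: vr is truncated to four registers here.
REG_CLASSES = {
    "vmv0": ["v0"],
    "vr": ["v0", "v1", "v2", "v3"],
}


def find_assignment(vregs, must_differ):
    """Brute-force search for a physical register assignment.

    vregs: dict mapping virtual register name -> register class name.
    must_differ: pairs of virtual registers that may not share a physical
    register (here, an early-clobber def and a live input).
    Returns the first valid assignment as a dict, or None if none exists.
    """
    names = list(vregs)
    choices = [REG_CLASSES[vregs[n]] for n in names]
    for combo in product(*choices):
        assignment = dict(zip(names, combo))
        if all(assignment[a] != assignment[b] for a, b in must_differ):
            return assignment
    return None


# After copy propagation and the two-address pass: %res is constrained to vmv0.
constrained = find_assignment({"res": "vmv0", "mask": "vmv0"}, [("res", "mask")])

# Without the coalescing: %res stays in vr and can avoid v0.
unconstrained = find_assignment({"res": "vr", "mask": "vmv0"}, [("res", "mask")])
```

With both virtual registers in vmv0 the search finds nothing, while leaving `%res` in vr lets it pick a register other than v0.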
This doesn't occur today because we explicitly copy the mask into $v0 first, so
the coalescing never occurs in the first place.
I will post a separate PR with one possible approach to fixing this (teaching
MachineCSE to avoid coalescing in this case).
[1] https://discourse.llvm.org/t/earlyclobber-but-for-a-subset-of-the-inputs/55240