[RISCV] Add test for copy propagation issue with VMV0. NFC #75347

lukel97 · 2023-12-13T15:19:15Z

I am currently looking into selecting mask registers as virtual registers
instead of physical registers (i.e. copies into V0) in order to simplify #71764,
i.e. select masked pseudos by using the vmv0 reg class for the mask operand:

PseudoVADD_VV_M1_MASK %x, %y, %mask:vmv0

instead of how we currently copy it into v0:

$v0 = COPY %mask:vr
PseudoVADD_VV_M1_MASK %x, %y, $v0

One issue I've run into with this approach is that register allocation can fail
on vector compare instructions, due to an interaction with MachineCSE and how
we model register overlap constraints.

We currently model vector register overlap constraints with early clobber. For
instructions like PseudoVMSEQ_VV_M2_MASK, this is more restrictive than what is
needed since the mask operand can be the same as the destination register, but
there's currently no way in LLVM today to mark only a subset of operands as
being clobbered: [1]

early-clobber %res:vr = PseudoVMSEQ_VV_M2_MASK %pt:vr(tied-def 0), ..., %mask:vmv0, ...

The issue arises if passthru operand is a copy of mask operand, e.g:

%mask:vmv0 = ...
%pt:vr = COPY %mask

MachineCSE performs trivial copy propagation and will coalesce the copy of
%mask to the passthru operand:

early-clobber %res:vr = PseudoVMSEQ_VV_M2_MASK %mask:vmv0(tied-def 0), ..., %mask:vmv0, ...

The two address instruction pass then sees the tied operand and constrains the
def's register class:

%mask:vmv0 = ...
%res:vmv0 = COPY %mask
early-clobber %res:vmv0 = PseudoVMSNE_VV_M2_MASK %res:vmv0, ..., %mask:vmv0

Because of the early-clobber constraint, %mask and %res will need to be
separate registers: But vmv0 only has one register, and allocation errors out.

This doesn't occur today because we explicitly copy the mask into $v0 first, so
the coalescing never occurs in the first place.

I will post a separate PR with one possible approach to fixing (teaching
MachineCSE to avoid coalescing in this case)

[1] https://discourse.llvm.org/t/earlyclobber-but-for-a-subset-of-the-inputs/55240

I am currently looking into selecting mask registers as virtual registers instead of physical registers (i.e. copies into V0) in order to simplify PseudoVADD_VV_M1_MASK %x, %y, %mask:vmv0 instead of how we currently copy it into v0: $v0 = COPY %mask:vr PseudoVADD_VV_M1_MASK %x, %y, $v0 One issue I've run into with this approach is that register allocation can fail on vector compare instructions, due to an interaction with MachineCSE and how we model register overlap constraints. We currently model vector register overlap constraints with early clobber. For instructions like PseudoVMSEQ_VV_M2_MASK, this is more restrictive than what is needed since the mask operand can be the same as the destination register, but there's currently no way in LLVM today to mark only a subset of operands as being clobbered: [1] early-clobber %res:vr = PseudoVMSEQ_VV_M2_MASK %pt:vr(tied-def 0), ..., %mask:vmv0, ... The issue arises if passthru operand is a copy of mask operand, e.g: %mask:vmv0 = ... %pt:vr = COPY %mask MachineCSE performs trivial copy propagation and will coalesce the copy of %mask to the passthru operand: early-clobber %res:vr = PseudoVMSEQ_VV_M2_MASK %mask:vmv0(tied-def 0), ..., %mask:vmv0, ... The two address instruction pass then sees the tied operand and constrains the def's register class: %mask:vmv0 = ... %res:vmv0 = COPY %mask early-clobber %res:vmv0 = PseudoVMSNE_VV_M2_MASK %res:vmv0, ..., %mask:vmv0 Because of the early-clobber constraint, %mask and %res will need to be separate registers: But vmv0 only has one register, and allocation errors out. This doesn't occur today because we explicitly copy the mask into $v0 first, so the coalescing never occurs in the first place. I will post a separate PR with one possible approach to fixing (teaching MachineCSE to avoid coalescing in this case) [1] https://discourse.llvm.org/t/earlyclobber-but-for-a-subset-of-the-inputs/55240

topperc

LGTM

lukel97 · 2024-01-02T06:52:34Z

After some discussion with others, it looks using singleton register classes (i.e. VMV0) opens up a whole can of worms, and RISC-V would likely be the only target to extensively do so. It's probably not worth tackling these issues, so closing for now

lukel97 requested review from BeMg, preames, topperc, wangpc-pp and yetingk December 13, 2023 15:19

topperc approved these changes Dec 13, 2023

View reviewed changes

lukel97 closed this Jan 2, 2024

lukel97 mentioned this pull request Apr 15, 2024

[RISCV] Don't use V0 directly in patterns #88496

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[RISCV] Add test for copy propagation issue with VMV0. NFC #75347

[RISCV] Add test for copy propagation issue with VMV0. NFC #75347

Uh oh!

lukel97 commented Dec 13, 2023 •

edited

Loading

Uh oh!

topperc left a comment

Uh oh!

lukel97 commented Jan 2, 2024

Uh oh!

Uh oh!

[RISCV] Add test for copy propagation issue with VMV0. NFC #75347

[RISCV] Add test for copy propagation issue with VMV0. NFC #75347

Uh oh!

Conversation

lukel97 commented Dec 13, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

topperc left a comment

Choose a reason for hiding this comment

Uh oh!

lukel97 commented Jan 2, 2024

Uh oh!

Uh oh!

lukel97 commented Dec 13, 2023 •

edited

Loading