
[RISCV] Fix vmerge.vvm/vmv.v.v getting folded into ops with mismatching EEW #101152

Merged: 4 commits, Jul 30, 2024
Changes from 3 commits
10 changes: 9 additions & 1 deletion llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
@@ -3855,11 +3855,19 @@ bool RISCVDAGToDAGISel::performCombineVMergeAndVOps(SDNode *N) {
// If we end up changing the VL or mask of True, then we need to make sure it
// doesn't raise any observable fp exceptions, since changing the active
// elements will affect how fflags is set.
if (TrueVL != VL || !IsMasked)
if (TrueVL != VL || !IsMasked) {
Collaborator:

I don't think this is specific to the case where we change VL or the mask. Even if the VL values are exactly equal, they describe a different number of bits when the EEWs differ, so folding away the vmerge.vvm or vmv.v.v is still illegal.

I think you can also use a much simpler check here: the VT of the True operand should equal the VT of the vmerge or vmv.v.v. (Really the respective operand, but we don't have widening or narrowing versions of either, so that's equivalent.)
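Something along these lines, as a rough sketch only (it reuses the surrounding function's True and N and assumes both carry their vector result type as result 0):

// Sketch of the suggested type check: bail out when True's result type
// differs from the vmerge/vmv.v.v result type, i.e. when the EEWs differ.
if (True.getSimpleValueType() != N->getSimpleValueType(0))
  return false;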

if (mayRaiseFPException(True.getNode()) &&
    !True->getFlags().hasNoFPExcept())
  return false;

// If the EEW of True is different from vmerge's SEW, then we cannot change
// the VL or mask.
if (Log2_64(True.getScalarValueSizeInBits()) !=
Contributor:

Maybe don't use a log here; we could do a shift left by log2sew instead. A shift is cheaper than a log, I think.
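An illustrative rewrite along those lines (sketch only, keeping the operand lookup from the patch and assuming the pseudo's SEW operand holds log2(SEW), as the existing comparison implies):

// Compare the EEW directly against 1 << Log2SEW instead of taking the log
// of the EEW; the SEW operand of the pseudo already stores the log2 value.
unsigned Log2SEW = N->getConstantOperandVal(
    RISCVII::getSEWOpNum(TII->get(N->getMachineOpcode())) - 1);
if (True.getScalarValueSizeInBits() != (1u << Log2SEW))
  return false;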

    N->getConstantOperandVal(
        RISCVII::getSEWOpNum(TII->get(N->getMachineOpcode())) - 1))
Contributor:

The opcode is known here, right? So I think we don't need to look up the SEW operand index from TSFlags; we can hardcode it.
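A hypothetical sketch of that idea; the operand index below is only a placeholder, not the actual operand number for these pseudos:

// If the operand layout of the matched vmerge.vvm/vmv.v.v pseudo is fixed,
// the SEW operand index can be a plain constant instead of a TSFlags lookup.
constexpr unsigned SEWOpIdx = 0; // placeholder index, not the real one
if (Log2_64(True.getScalarValueSizeInBits()) !=
    N->getConstantOperandVal(SEWOpIdx))
  return false;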

  return false;
}

SDLoc DL(N);

// From the preconditions we checked above, we know the mask and thus glue
40 changes: 40 additions & 0 deletions llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-vops.ll
@@ -1196,3 +1196,43 @@ define <vscale x 2 x i32> @true_mask_vmerge_implicit_passthru(<vscale x 2 x i32>
)
ret <vscale x 2 x i32> %b
}

define <vscale x 2 x i32> @unfoldable_mismatched_sew_mask(<vscale x 2 x i32> %passthru, <vscale x 1 x i64> %x, <vscale x 1 x i64> %y, <vscale x 2 x i1> %mask, i64 %avl) {
; CHECK-LABEL: unfoldable_mismatched_sew_mask:
; CHECK: # %bb.0:
; CHECK-NEXT: vsetvli zero, a0, e64, m1, ta, ma
; CHECK-NEXT: vadd.vv v9, v9, v10
; CHECK-NEXT: vsetvli zero, a0, e32, m1, tu, ma
; CHECK-NEXT: vmerge.vvm v8, v8, v9, v0
; CHECK-NEXT: ret
%a = call <vscale x 1 x i64> @llvm.riscv.vadd.nxv1i64.nxv1i64(<vscale x 1 x i64> poison, <vscale x 1 x i64> %x, <vscale x 1 x i64> %y, i64 %avl)
%a.bitcast = bitcast <vscale x 1 x i64> %a to <vscale x 2 x i32>
%b = call <vscale x 2 x i32> @llvm.riscv.vmerge.nxv2i32.nxv2i32(
<vscale x 2 x i32> %passthru,
<vscale x 2 x i32> %passthru,
<vscale x 2 x i32> %a.bitcast,
<vscale x 2 x i1> %mask,
i64 %avl
)
ret <vscale x 2 x i32> %b
}

define <vscale x 2 x i32> @unfoldable_mismatched_sew_avl(<vscale x 2 x i32> %passthru, <vscale x 1 x i64> %x, <vscale x 1 x i64> %y) {
; CHECK-LABEL: unfoldable_mismatched_sew_avl:
; CHECK: # %bb.0:
; CHECK-NEXT: vsetivli zero, 5, e64, m1, ta, ma
; CHECK-NEXT: vadd.vv v9, v9, v10
; CHECK-NEXT: vsetivli zero, 3, e32, m1, tu, ma
; CHECK-NEXT: vmv.v.v v8, v9
; CHECK-NEXT: ret
%a = call <vscale x 1 x i64> @llvm.riscv.vadd.nxv1i64.nxv1i64(<vscale x 1 x i64> poison, <vscale x 1 x i64> %x, <vscale x 1 x i64> %y, i64 5)
%a.bitcast = bitcast <vscale x 1 x i64> %a to <vscale x 2 x i32>
%b = call <vscale x 2 x i32> @llvm.riscv.vmerge.nxv2i32.nxv2i32(
<vscale x 2 x i32> %passthru,
<vscale x 2 x i32> %passthru,
<vscale x 2 x i32> %a.bitcast,
<vscale x 2 x i1> splat (i1 true),
i64 3
)
ret <vscale x 2 x i32> %b
}
14 changes: 14 additions & 0 deletions llvm/test/CodeGen/RISCV/rvv/vmv.v.v-peephole.ll
@@ -180,3 +180,17 @@ define <vscale x 2 x i32> @unfoldable_vredsum(<vscale x 2 x i32> %passthru, <vsc
%b = call <vscale x 2 x i32> @llvm.riscv.vmv.v.v.nxv2i32(<vscale x 2 x i32> %passthru, <vscale x 2 x i32> %a, iXLen 1)
ret <vscale x 2 x i32> %b
}

define <vscale x 2 x i32> @unfoldable_mismatched_sew_diff_vl(<vscale x 2 x i32> %passthru, <vscale x 1 x i64> %x, <vscale x 1 x i64> %y) {
; CHECK-LABEL: unfoldable_mismatched_sew_diff_vl:
; CHECK: # %bb.0:
; CHECK-NEXT: vsetivli zero, 6, e64, m1, ta, ma
; CHECK-NEXT: vadd.vv v9, v9, v10
; CHECK-NEXT: vsetivli zero, 3, e32, m1, tu, ma
; CHECK-NEXT: vmv.v.v v8, v9
; CHECK-NEXT: ret
%a = call <vscale x 1 x i64> @llvm.riscv.vadd.nxv1i64.nxv1i64(<vscale x 1 x i64> poison, <vscale x 1 x i64> %x, <vscale x 1 x i64> %y, iXLen 6)
%a.bitcast = bitcast <vscale x 1 x i64> %a to <vscale x 2 x i32>
%b = call <vscale x 2 x i32> @llvm.riscv.vmv.v.v.nxv2i32(<vscale x 2 x i32> %passthru, <vscale x 2 x i32> %a.bitcast, iXLen 3)
ret <vscale x 2 x i32> %b
}