[RISCV] Refactor extract_subvector lowering slightly. NFC #65391

Merged 2 commits on Sep 11, 2023
22 changes: 12 additions & 10 deletions llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -8735,16 +8735,17 @@ SDValue RISCVTargetLowering::lowerEXTRACT_SUBVECTOR(SDValue Op,
}
}

+  // With an index of 0 this is a cast-like subvector, which can be performed
+  // with subregister operations.
+  if (OrigIdx == 0)
+    return Op;
+
   // If the subvector vector is a fixed-length type, we cannot use subregister
   // manipulation to simplify the codegen; we don't know which register of a
   // LMUL group contains the specific subvector as we only know the minimum
   // register size. Therefore we must slide the vector group down the full
   // amount.
   if (SubVecVT.isFixedLengthVector()) {
-    // With an index of 0 this is a cast-like subvector, which can be performed
-    // with subregister operations.
-    if (OrigIdx == 0)
-      return Op;
     MVT ContainerVT = VecVT;
     if (VecVT.isFixedLengthVector()) {
       ContainerVT = getContainerForFixedLengthVector(VecVT);
@@ -8776,17 +8777,18 @@ SDValue RISCVTargetLowering::lowerEXTRACT_SUBVECTOR(SDValue Op,
   if (RemIdx == 0)
     return Op;

-  // Else we must shift our vector register directly to extract the subvector.
-  // Do this using VSLIDEDOWN.
+  // Else SubVecVT is a fractional LMUL and may need to be slid down.
+  assert(RISCVVType::decodeVLMUL(getLMUL(SubVecVT)).second);
Collaborator

Inside decomposeSubvectorInsertExtractToSubRegs, there's a comment which reads:
// Note that this is not guaranteed to find a subregister index, such as
// when we are extracting from one VR type to another.

This seems to contradict your new assert here.

After thinking about it, I think that comment is stale because moving the index=0 case early adds a precondition to that routine that SubVecVT != VecVT, and thus sizeof(SubVecVT) < sizeof(VecVT).

If you agree, would you mind updating the comment to reflect that? We've only changed the invariant for one of the two callers, so we can't add the assert in the callee itself, but maybe right before the call in this caller? And maybe also add an assert that the returned subregister index is not NoSubRegister?

Contributor Author

Handling the index=0 case early doesn't necessarily guarantee that sizeof(SubVecVT) < sizeof(VecVT). Consider an extract of v2i8 from an nxv2i8 at index 2: it's a well-formed extract_subvector, but at vscale=1, sizeof(v2i8) == sizeof(nxv2i8).
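
To make the size comparison in the comment above concrete, here is a minimal, self-contained sketch. It is not LLVM code: `VecTy`, `sizeInBits`, `v2i8()` and `nxv2i8()` are hypothetical stand-ins for `MVT` and its size queries, used only to show how a fixed-length and a scalable type can have the same size at vscale=1.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an LLVM MVT; this is NOT the real API.
struct VecTy {
  unsigned MinElts; // known minimum number of elements
  unsigned EltBits; // bits per element
  bool Scalable;    // true for nxv* types, false for fixed-length types
};

// Size in bits at a given runtime vscale. Fixed-length types ignore vscale.
inline std::uint64_t sizeInBits(const VecTy &T, unsigned VScale) {
  assert(VScale >= 1 && "vscale is always at least 1");
  std::uint64_t Elts =
      T.Scalable ? std::uint64_t(T.MinElts) * VScale : T.MinElts;
  return Elts * T.EltBits;
}

// The two types named in the comment above (illustrative constructors).
inline VecTy v2i8() { return {2, 8, false}; }
inline VecTy nxv2i8() { return {2, 8, true}; }
```

Under these assumptions, `sizeInBits(v2i8(), 1)` and `sizeInBits(nxv2i8(), 1)` are equal, while for any larger vscale the scalable type is strictly bigger. This is why only a "not larger than" relation between the two types can be relied on statically.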

I also tried the NoSubRegister assert, but as it turns out, if both SubVecVT and VecVT are LMUL=1 or less, e.g. v2i8 and nxv2i8, they'll both have the LMUL=1 register class and decomposeSubvectorInsertExtractToSubRegs will return NoSubRegister.

But if we get NoSubRegister then we don't end up performing the subregister extract anyway, because we only do it when VecVT.bitsGT(getLMUL1VT(VecVT)). I think we can put the assert in that branch.
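
The register-class argument above can be illustrated with a simplified model. This is only a sketch under stated assumptions: `RVVBitsPerBlock` is assumed to mirror LLVM's `RISCV::RVVBitsPerBlock` (64), while `RegClass`, `regClassForMinBits` and `canHaveSubRegIdx` are hypothetical names that do not exist in LLVM. The real decomposeSubvectorInsertExtractToSubRegs also computes which subregister to use, which this toy ignores.

```cpp
#include <cassert>

// Assumption: mirrors LLVM's RISCV::RVVBitsPerBlock. Everything else is a toy.
constexpr unsigned RVVBitsPerBlock = 64;

// Vector register classes, ordered by LMUL (hypothetical enum).
enum class RegClass { VR, VRM2, VRM4, VRM8 };

// Register class for a vector type whose (container) known-minimum size is
// MinBits. Fractional LMULs and LMUL=1 all share the single-register class.
inline RegClass regClassForMinBits(unsigned MinBits) {
  assert(MinBits > 0 && MinBits <= 8 * RVVBitsPerBlock);
  if (MinBits <= RVVBitsPerBlock)
    return RegClass::VR; // LMUL <= 1
  if (MinBits <= 2 * RVVBitsPerBlock)
    return RegClass::VRM2;
  if (MinBits <= 4 * RVVBitsPerBlock)
    return RegClass::VRM4;
  return RegClass::VRM8;
}

// In this model a subregister index exists only when the subvector lives in
// a strictly smaller register class than the vector it is extracted from.
inline bool canHaveSubRegIdx(unsigned VecMinBits, unsigned SubVecMinBits) {
  return regClassForMinBits(SubVecMinBits) < regClassForMinBits(VecMinBits);
}
```

For example, `canHaveSubRegIdx(16, 16)` models the v2i8 from nxv2i8 case (assuming the fixed-length type's container is also LMUL <= 1) and is false, whereas `canHaveSubRegIdx(128, 64)` models extracting an LMUL=1 type from an LMUL=2 type and is true.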

Contributor Author

The only case I can think of where decomposing would fail and we would still try to extract a subregister is when SubVecVT and VecVT have the same LMUL and that LMUL is greater than 1.
In that case though, the only valid index is 0:

IDX must be a constant multiple of T's known minimum vector length. If T is a scalable vector, IDX is first scaled by the runtime scaling factor of T. Elements IDX through (IDX + num_elements(T) - 1) must be valid VECTOR indices.

So we should already have returned early, given the index=0 precondition.
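
The index rule quoted above can be checked with a small sketch. It assumes both types are scalable and share an element type, so equal LMUL means equal known-minimum element count. `isValidScalableExtractIdx` is a hypothetical helper, not an LLVM function.

```cpp
#include <cassert>
#include <cstdint>

// Toy check of the quoted EXTRACT_SUBVECTOR index rule for two scalable
// types. VecMinElts and SubMinElts are known-minimum element counts.
inline bool isValidScalableExtractIdx(unsigned VecMinElts, unsigned SubMinElts,
                                      unsigned Idx, unsigned VScale) {
  assert(VScale >= 1 && SubMinElts >= 1);
  // IDX must be a multiple of the result type's known minimum length.
  if (Idx % SubMinElts != 0)
    return false;
  // For a scalable result type, IDX is scaled by the runtime vscale.
  std::uint64_t First = std::uint64_t(Idx) * VScale;
  std::uint64_t Last = First + std::uint64_t(SubMinElts) * VScale - 1;
  // Every extracted element must be a valid index into the source vector.
  return Last < std::uint64_t(VecMinElts) * VScale;
}
```

With equal element counts, for instance 16 and 16, index 0 is accepted and the next candidate multiple, 16, is rejected because it runs past the end of the source vector. That matches the claim that 0 is the only valid index when both types have the same LMUL.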


   // If the vector type is an LMUL-group type, extract a subvector equal to the
-  // nearest full vector register type. This should resolve to a EXTRACT_SUBREG
-  // instruction.
+  // nearest full vector register type.
   MVT InterSubVT = VecVT;
   if (VecVT.bitsGT(getLMUL1VT(VecVT))) {
+    // If VecVT has an LMUL > 1, then SubVecVT should have a smaller LMUL, and
+    // we should have successfully decomposed the extract into a subregister.
+    assert(SubRegIdx != RISCV::NoSubRegister);
     InterSubVT = getLMUL1VT(VecVT);
-    Vec = DAG.getNode(ISD::EXTRACT_SUBVECTOR, DL, InterSubVT, Vec,
-                      DAG.getConstant(OrigIdx - RemIdx, DL, XLenVT));
+    Vec = DAG.getTargetExtractSubreg(SubRegIdx, DL, InterSubVT, Vec);
   }

   // Slide this vector register down by the desired number of elements in order