[ARM][TableGen][MC] Change the ARM mnemonic operands to be optional for ASM parsing #83436
Conversation
@llvm/pr-subscribers-mc
Author: Alfie Richards (AlfieRichardsArm)
Changes
This changes the way the assembly matcher works for AArch32 parsing. Previously there was a pile of hacks which dictated whether the CC and CCOut operands should be present, and which de facto chose whether the wide/narrow (or Thumb1/Thumb2/ARM) instruction version was selected. This meant much of the TableGen machinery present for the assembly matching was effectively being bypassed and worked around. This patch makes the CC and CCOut operands optional, which allows the ASM matcher to operate as it was designed and means we can avoid doing some of the hacks done previously. This also adds an option for the target to prioritize the smaller instruction encodings, as is required for AArch32.
Patch is 136.53 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/83436.diff
23 Files Affected:
diff --git a/llvm/include/llvm/Target/Target.td b/llvm/include/llvm/Target/Target.td
index 0d97a47190b196..8448b768d6e7d6 100644
--- a/llvm/include/llvm/Target/Target.td
+++ b/llvm/include/llvm/Target/Target.td
@@ -948,9 +948,18 @@ class AsmOperandClass {
/// error will be suppressed if all of the remaining unmatched operands are
/// marked as IsOptional.
///
- /// Optional arguments must be at the end of the operand list.
+ /// Note: Optional arguments have caveats if they are not at the end of this
+ /// list with regard to custom operand parsing; see below.
bit IsOptional = false;
+ // FIXME: Ideally this would not be necessary, but removing it would involve
+ // interleaving the parsing and matching processes.
+ /// Set to 1 if the parser should assume this operand will always be present
+ /// for the sake of calculating the operand index with regard to which custom
+ /// operand parser should be used.
+ /// This is only used for custom operands that are not at the end of the instruction.
+ bit OptionalShouldOffsetCustomParsers = true;
+
/// The name of the method on the target specific asm parser that returns the
/// default operand for this optional operand. This method is only used if
/// IsOptional == 1. If not set, this will default to "defaultFooOperands",
@@ -1724,6 +1733,11 @@ class Target {
// setting hasExtraDefRegAllocReq and hasExtraSrcRegAllocReq to 1
// for all opcodes if this flag is set to 0.
int AllowRegisterRenaming = 0;
+
+ // SortBySize - Should the assembly matcher prefer the smaller
+ // instruction encodings? 1 if the instruction set should sort by size,
+ // 0 otherwise.
+ int SortBySize = 0;
}
//===----------------------------------------------------------------------===//
diff --git a/llvm/lib/Target/ARM/ARM.td b/llvm/lib/Target/ARM/ARM.td
index 877781568307dc..f380a8e40337b7 100644
--- a/llvm/lib/Target/ARM/ARM.td
+++ b/llvm/lib/Target/ARM/ARM.td
@@ -1746,4 +1746,5 @@ def ARM : Target {
let AssemblyParsers = [ARMAsmParser];
let AssemblyParserVariants = [ARMAsmParserVariant];
let AllowRegisterRenaming = 1;
+ let SortBySize = 1;
}
diff --git a/llvm/lib/Target/ARM/ARMInstrFormats.td b/llvm/lib/Target/ARM/ARMInstrFormats.td
index 14e315534570d2..6e0466f1bc11bd 100644
--- a/llvm/lib/Target/ARM/ARMInstrFormats.td
+++ b/llvm/lib/Target/ARM/ARMInstrFormats.td
@@ -155,7 +155,13 @@ def iflags_op : Operand<i32> {
// ARM Predicate operand. Default to 14 = always (AL). Second part is CC
// register whose default is 0 (no register).
-def CondCodeOperand : AsmOperandClass { let Name = "CondCode"; }
+def CondCodeOperand : AsmOperandClass {
+ let Name = "CondCode";
+ let PredicateMethod = "isCondCode";
+ let DefaultMethod = "defaultCondCodeOp";
+ let IsOptional = true;
+ let OptionalShouldOffsetCustomParsers = false;
+}
def pred : PredicateOperand<OtherVT, (ops i32imm, i32imm),
(ops (i32 14), (i32 zero_reg))> {
let PrintMethod = "printPredicateOperand";
@@ -174,7 +180,12 @@ def cmovpred : Operand<i32>, PredicateOp,
}
// Conditional code result for instructions whose 's' bit is set, e.g. subs.
-def CCOutOperand : AsmOperandClass { let Name = "CCOut"; }
+def CCOutOperand : AsmOperandClass {
+ let Name = "CCOut";
+ let DefaultMethod = "defaultCCOutOp";
+ let IsOptional = true;
+ let OptionalShouldOffsetCustomParsers = false;
+}
def cc_out : OptionalDefOperand<OtherVT, (ops CCR), (ops (i32 zero_reg))> {
let EncoderMethod = "getCCOutOpValue";
let PrintMethod = "printSBitModifierOperand";
@@ -468,7 +479,7 @@ class InstThumb<AddrMode am, int sz, IndexMode im,
// These are aliases that require C++ handling to convert to the target
// instruction, while InstAliases can be handled directly by tblgen.
class AsmPseudoInst<string asm, dag iops, dag oops = (outs)>
- : InstTemplate<AddrModeNone, 0, IndexModeNone, Pseudo, GenericDomain,
+ : InstTemplate<AddrModeNone, 4, IndexModeNone, Pseudo, GenericDomain,
"", NoItinerary> {
let OutOperandList = oops;
let InOperandList = iops;
diff --git a/llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp b/llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp
index 37bfb76a494dee..342afae7b05f67 100644
--- a/llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp
+++ b/llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp
@@ -18,6 +18,7 @@
#include "llvm/ADT/APFloat.h"
#include "llvm/ADT/APInt.h"
#include "llvm/ADT/STLExtras.h"
+#include "llvm/ADT/SmallBitVector.h"
#include "llvm/ADT/SmallSet.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/StringMap.h"
@@ -47,6 +48,7 @@
#include "llvm/Support/Casting.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/Compiler.h"
+#include "llvm/Support/Debug.h"
#include "llvm/Support/ErrorHandling.h"
#include "llvm/Support/MathExtras.h"
#include "llvm/Support/SMLoc.h"
@@ -79,6 +81,7 @@ extern const ARMInstrTable ARMDescs;
} // end namespace llvm
namespace {
+class ARMOperand;
enum class ImplicitItModeTy { Always, Never, ARMOnly, ThumbOnly };
@@ -446,9 +449,10 @@ class ARMAsmParser : public MCTargetAsmParser {
}
bool validatetLDMRegList(const MCInst &Inst, const OperandVector &Operands,
- unsigned ListNo, bool IsARPop = false);
+ unsigned MnemonicOpsEndInd, unsigned ListIndex,
+ bool IsARPop = false);
bool validatetSTMRegList(const MCInst &Inst, const OperandVector &Operands,
- unsigned ListNo);
+ unsigned MnemonicOpsEndInd, unsigned ListIndex);
int tryParseRegister();
bool tryParseRegisterWithWriteBack(OperandVector &);
@@ -504,20 +508,28 @@ class ARMAsmParser : public MCTargetAsmParser {
bool parseDirectiveSEHEpilogEnd(SMLoc L);
bool parseDirectiveSEHCustom(SMLoc L);
+ std::unique_ptr<ARMOperand> defaultCondCodeOp();
+ std::unique_ptr<ARMOperand> defaultCCOutOp();
+
bool isMnemonicVPTPredicable(StringRef Mnemonic, StringRef ExtraToken);
StringRef splitMnemonic(StringRef Mnemonic, StringRef ExtraToken,
- unsigned &PredicationCode,
- unsigned &VPTPredicationCode, bool &CarrySetting,
- unsigned &ProcessorIMod, StringRef &ITMask);
+ ARMCC::CondCodes &PredicationCode,
+ ARMVCC::VPTCodes &VPTPredicationCode,
+ bool &CarrySetting, unsigned &ProcessorIMod,
+ StringRef &ITMask);
void getMnemonicAcceptInfo(StringRef Mnemonic, StringRef ExtraToken,
StringRef FullInst, bool &CanAcceptCarrySet,
bool &CanAcceptPredicationCode,
bool &CanAcceptVPTPredicationCode);
bool enableArchExtFeature(StringRef Name, SMLoc &ExtLoc);
- void tryConvertingToTwoOperandForm(StringRef Mnemonic, bool CarrySetting,
- OperandVector &Operands);
- bool CDEConvertDualRegOperand(StringRef Mnemonic, OperandVector &Operands);
+ void tryConvertingToTwoOperandForm(StringRef Mnemonic,
+ ARMCC::CondCodes PredicationCode,
+ bool CarrySetting, OperandVector &Operands,
+ unsigned MnemonicOpsEndInd);
+
+ bool CDEConvertDualRegOperand(StringRef Mnemonic, OperandVector &Operands,
+ unsigned MnemonicOpsEndInd);
bool isThumb() const {
// FIXME: Can tablegen auto-generate this?
@@ -657,15 +669,20 @@ class ARMAsmParser : public MCTargetAsmParser {
void cvtThumbBranches(MCInst &Inst, const OperandVector &);
void cvtMVEVMOVQtoDReg(MCInst &Inst, const OperandVector &);
- bool validateInstruction(MCInst &Inst, const OperandVector &Ops);
- bool processInstruction(MCInst &Inst, const OperandVector &Ops, MCStreamer &Out);
- bool shouldOmitCCOutOperand(StringRef Mnemonic, OperandVector &Operands);
- bool shouldOmitPredicateOperand(StringRef Mnemonic, OperandVector &Operands);
- bool shouldOmitVectorPredicateOperand(StringRef Mnemonic, OperandVector &Operands);
+ bool validateInstruction(MCInst &Inst, const OperandVector &Ops,
+ unsigned MnemonicOpsEndInd);
+ bool processInstruction(MCInst &Inst, const OperandVector &Ops,
+ unsigned MnemonicOpsEndInd, MCStreamer &Out);
+ bool shouldOmitVectorPredicateOperand(StringRef Mnemonic,
+ OperandVector &Operands,
+ unsigned MnemonicOpsEndInd);
bool isITBlockTerminator(MCInst &Inst) const;
- void fixupGNULDRDAlias(StringRef Mnemonic, OperandVector &Operands);
- bool validateLDRDSTRD(MCInst &Inst, const OperandVector &Operands,
- bool Load, bool ARMMode, bool Writeback);
+
+ void fixupGNULDRDAlias(StringRef Mnemonic, OperandVector &Operands,
+ unsigned MnemonicOpsEndInd);
+ bool validateLDRDSTRD(MCInst &Inst, const OperandVector &Operands, bool Load,
+ bool ARMMode, bool Writeback,
+ unsigned MnemonicOpsEndInd);
public:
enum ARMMatchResultTy {
@@ -675,6 +692,7 @@ class ARMAsmParser : public MCTargetAsmParser {
Match_RequiresThumb2,
Match_RequiresV8,
Match_RequiresFlagSetting,
+ Match_RequiresDestinationRegisterMatchASourceRegister,
#define GET_OPERAND_DIAGNOSTIC_TYPES
#include "ARMGenAsmMatcher.inc"
@@ -714,6 +732,9 @@ class ARMAsmParser : public MCTargetAsmParser {
unsigned validateTargetOperandClass(MCParsedAsmOperand &Op,
unsigned Kind) override;
unsigned checkTargetMatchPredicate(MCInst &Inst) override;
+ unsigned
+ checkEarlyTargetMatchPredicate(MCInst &Inst,
+ const OperandVector &Operands) override;
bool MatchAndEmitInstruction(SMLoc IDLoc, unsigned &Opcode,
OperandVector &Operands, MCStreamer &Out,
@@ -4052,6 +4073,66 @@ static MCRegister MatchRegisterName(StringRef Name);
/// }
+static bool isDataTypeToken(StringRef Tok) {
+ return Tok == ".8" || Tok == ".16" || Tok == ".32" || Tok == ".64" ||
+ Tok == ".i8" || Tok == ".i16" || Tok == ".i32" || Tok == ".i64" ||
+ Tok == ".u8" || Tok == ".u16" || Tok == ".u32" || Tok == ".u64" ||
+ Tok == ".s8" || Tok == ".s16" || Tok == ".s32" || Tok == ".s64" ||
+ Tok == ".p8" || Tok == ".p16" || Tok == ".f32" || Tok == ".f64" ||
+ Tok == ".f" || Tok == ".d";
+}
+
+static unsigned getMnemonicOpsEndInd(const OperandVector &Operands) {
+ unsigned MnemonicOpsEndInd = 1;
+ // Special case for CPS, which has a mnemonic side token possibly storing the
+ // ie/id variant
+ if (Operands[0]->isToken() &&
+ static_cast<ARMOperand &>(*Operands[0]).getToken() == "cps") {
+ if (Operands.size() > 1 && Operands[1]->isImm() &&
+ static_cast<ARMOperand &>(*Operands[1]).getImm()->getKind() ==
+ llvm::MCExpr::Constant &&
+ (dyn_cast<MCConstantExpr>(
+ static_cast<ARMOperand &>(*Operands[1]).getImm())
+ ->getValue() == ARM_PROC::IE ||
+ dyn_cast<MCConstantExpr>(
+ static_cast<ARMOperand &>(*Operands[1]).getImm())
+ ->getValue() == ARM_PROC::ID))
+ ++MnemonicOpsEndInd;
+ }
+
+ // In some circumstances the cond code moves to the right
+ bool RHSCondCode = false;
+ while (MnemonicOpsEndInd < Operands.size()) {
+ auto Op = static_cast<ARMOperand &>(*Operands[MnemonicOpsEndInd]);
+ // Special case for it instructions which have a condition code on the RHS
+ if (Op.isITMask()) {
+ RHSCondCode = true;
+ MnemonicOpsEndInd++;
+ }
+ // Special case for data type tokens, after which any condition code is on the RHS
+ else if (Op.isToken() &&
+ (
+ // There are several special cases not covered by
+ // isDataTypeToken
+ Op.getToken() == ".w" || Op.getToken() == ".bf16" ||
+ Op.getToken() == ".p64" || Op.getToken() == ".f16" ||
+ isDataTypeToken(Op.getToken()))) {
+ // In the mnemonic operators the cond code must always precede the data type,
+ // so we can now safely assume any subsequent cond code is on the RHS,
+ // as is the case for VCMP and VPT.
+ RHSCondCode = true;
+ MnemonicOpsEndInd++;
+ }
+ // Skip all mnemonic operator types
+ else if (Op.isCCOut() || (Op.isCondCode() && !RHSCondCode) ||
+ Op.isVPTPred() || (Op.isToken() && Op.getToken() == ".w"))
+ MnemonicOpsEndInd++;
+ else
+ break;
+ }
+ return MnemonicOpsEndInd;
+}
+
bool ARMAsmParser::parseRegister(MCRegister &Reg, SMLoc &StartLoc,
SMLoc &EndLoc) {
const AsmToken &Tok = getParser().getTok();
@@ -5597,37 +5678,86 @@ ParseStatus ARMAsmParser::parseAM3Offset(OperandVector &Operands) {
return ParseStatus::Success;
}
+// Finds the index of the first CondCode operand; returns 0 if there is none
+unsigned findCondCodeInd(const OperandVector &Operands,
+ unsigned MnemonicOpsEndInd) {
+ for (unsigned I = 1; I < MnemonicOpsEndInd; ++I) {
+ auto Op = static_cast<ARMOperand &>(*Operands[I]);
+ if (Op.isCondCode())
+ return I;
+ }
+ return 0;
+}
+
+unsigned findCCOutInd(const OperandVector &Operands,
+ unsigned MnemonicOpsEndInd) {
+ for (unsigned I = 1; I < MnemonicOpsEndInd; ++I) {
+ auto Op = static_cast<ARMOperand &>(*Operands[I]);
+ if (Op.isCCOut())
+ return I;
+ }
+ return 0;
+}
+
/// Convert parsed operands to MCInst. Needed here because this instruction
/// only has two register operands, but multiplication is commutative so
/// assemblers should accept both "mul rD, rN, rD" and "mul rD, rD, rN".
void ARMAsmParser::cvtThumbMultiply(MCInst &Inst,
const OperandVector &Operands) {
- ((ARMOperand &)*Operands[3]).addRegOperands(Inst, 1);
- ((ARMOperand &)*Operands[1]).addCCOutOperands(Inst, 1);
- // If we have a three-operand form, make sure to set Rn to be the operand
- // that isn't the same as Rd.
- unsigned RegOp = 4;
- if (Operands.size() == 6 &&
- ((ARMOperand &)*Operands[4]).getReg() ==
- ((ARMOperand &)*Operands[3]).getReg())
- RegOp = 5;
- ((ARMOperand &)*Operands[RegOp]).addRegOperands(Inst, 1);
- Inst.addOperand(Inst.getOperand(0));
- ((ARMOperand &)*Operands[2]).addCondCodeOperands(Inst, 2);
+ unsigned MnemonicOpsEndInd = getMnemonicOpsEndInd(Operands);
+ unsigned CondI = findCondCodeInd(Operands, MnemonicOpsEndInd);
+ unsigned CondOutI = findCCOutInd(Operands, MnemonicOpsEndInd);
+
+ // 2 operand form
+ unsigned RegRd = MnemonicOpsEndInd;
+ unsigned RegRn = MnemonicOpsEndInd + 1;
+ unsigned RegRm = MnemonicOpsEndInd;
+
+ if (Operands.size() == MnemonicOpsEndInd + 3) {
+ // If we have a three-operand form, make sure to set Rn to be the operand
+ // that isn't the same as Rd.
+ if (((ARMOperand &)*Operands[RegRd]).getReg() ==
+ ((ARMOperand &)*Operands[MnemonicOpsEndInd + 1]).getReg()) {
+ RegRn = MnemonicOpsEndInd + 2;
+ RegRm = MnemonicOpsEndInd + 1;
+ } else {
+ RegRn = MnemonicOpsEndInd + 1;
+ RegRm = MnemonicOpsEndInd + 2;
+ }
+ }
+
+ // Rd
+ ((ARMOperand &)*Operands[RegRd]).addRegOperands(Inst, 1);
+ // CCOut
+ if (CondOutI != 0) {
+ ((ARMOperand &)*Operands[CondOutI]).addCCOutOperands(Inst, 1);
+ } else {
+ ARMOperand Op = *ARMOperand::CreateCCOut(0, Operands[0]->getEndLoc());
+ Op.addCCOutOperands(Inst, 1);
+ }
+ // Rn
+ ((ARMOperand &)*Operands[RegRn]).addRegOperands(Inst, 1);
+ // Rm
+ ((ARMOperand &)*Operands[RegRm]).addRegOperands(Inst, 1);
+
+ // Cond code
+ if (CondI != 0) {
+ ((ARMOperand &)*Operands[CondI]).addCondCodeOperands(Inst, 2);
+ } else {
+ ARMOperand Op =
+ *ARMOperand::CreateCondCode(llvm::ARMCC::AL, Operands[0]->getEndLoc());
+ Op.addCondCodeOperands(Inst, 2);
+ }
}
void ARMAsmParser::cvtThumbBranches(MCInst &Inst,
const OperandVector &Operands) {
- int CondOp = -1, ImmOp = -1;
- switch(Inst.getOpcode()) {
- case ARM::tB:
- case ARM::tBcc: CondOp = 1; ImmOp = 2; break;
+ unsigned MnemonicOpsEndInd = getMnemonicOpsEndInd(Operands);
+ unsigned CondI = findCondCodeInd(Operands, MnemonicOpsEndInd);
+ unsigned Cond =
+ (CondI == 0 ? ARMCC::AL
+ : static_cast<ARMOperand &>(*Operands[CondI]).getCondCode());
- case ARM::t2B:
- case ARM::t2Bcc: CondOp = 1; ImmOp = 3; break;
-
- default: llvm_unreachable("Unexpected instruction in cvtThumbBranches");
- }
// first decide whether or not the branch should be conditional
// by looking at it's location relative to an IT block
if(inITBlock()) {
@@ -5638,9 +5768,6 @@ void ARMAsmParser::cvtThumbBranches(MCInst &Inst,
case ARM::t2Bcc: Inst.setOpcode(ARM::t2B); break;
}
} else {
- // outside IT blocks we can only have unconditional branches with AL
- // condition code or conditional branches with non-AL condition code
- unsigned Cond = static_cast<ARMOperand &>(*Operands[CondOp]).getCondCode();
switch(Inst.getOpcode()) {
case ARM::tB:
case ARM::tBcc:
@@ -5657,36 +5784,55 @@ void ARMAsmParser::cvtThumbBranches(MCInst &Inst,
switch(Inst.getOpcode()) {
// classify tB as either t2B or t1B based on range of immediate operand
case ARM::tB: {
- ARMOperand &op = static_cast<ARMOperand &>(*Operands[ImmOp]);
+ ARMOperand &op = static_cast<ARMOperand &>(*Operands[MnemonicOpsEndInd]);
if (!op.isSignedOffset<11, 1>() && isThumb() && hasV8MBaseline())
Inst.setOpcode(ARM::t2B);
break;
}
// classify tBcc as either t2Bcc or t1Bcc based on range of immediate operand
case ARM::tBcc: {
- ARMOperand &op = static_cast<ARMOperand &>(*Operands[ImmOp]);
+ ARMOperand &op = static_cast<ARMOperand &>(*Operands[MnemonicOpsEndInd]);
if (!op.isSignedOffset<8, 1>() && isThumb() && hasV8MBaseline())
Inst.setOpcode(ARM::t2Bcc);
break;
}
}
- ((ARMOperand &)*Operands[ImmOp]).addImmOperands(Inst, 1);
- ((ARMOperand &)*Operands[CondOp]).addCondCodeOperands(Inst, 2);
+ ((ARMOperand &)*Operands[MnemonicOpsEndInd]).addImmOperands(Inst, 1);
+ if (CondI != 0) {
+ ((ARMOperand &)*Operands[CondI]).addCondCodeOperands(Inst, 2);
+ } else {
+ ARMOperand Op =
+ *ARMOperand::CreateCondCode(llvm::ARMCC::AL, Operands[0]->getEndLoc());
+ Op.addCondCodeOperands(Inst, 2);
+ }
}
void ARMAsmParser::cvtMVEVMOVQtoDReg(
MCInst &Inst, const OperandVector &Operands) {
- // mnemonic, condition code, Rt, Rt2, Qd, idx, Qd again, idx2
- assert(Operands.size() == 8);
+ unsigned MnemonicOpsEndInd = getMnemonicOpsEndInd(Operands);
+ unsigned CondI = findCondCodeInd(Operands, MnemonicOpsEndInd);
- ((ARMOperand &)*Operands[2]).addRegOperands(Inst, 1); // Rt
- ((ARMOperand &)*Operands[3]).addRegOperands(Inst, 1); // Rt2
- ((ARMOperand &)*Operands[4]).addRegOperands(Inst, 1); // Qd
- ((ARMOperand &)*Operands[5]).addMVEPairVectorIndexOperands(Inst, 1); // idx
+ // mnemonic, condition code, Rt, Rt2, Qd, idx, Qd again, idx2
+ assert(Operands.size() == MnemonicOpsEndInd + 6);
+
+ ((ARMOperand &)*Operands[MnemonicOpsEndInd]).addRegOperands(Inst, 1); // Rt
+ ((ARMOperand &)*Operands[MnemonicOpsEndInd + 1])
+ .addRegOperands(Inst, 1); // Rt2
+ ((ARMOperand &)*Operands[MnemonicOpsEndInd + 2]).addRegOperands(Inst, 1); // Qd
+ ((ARMOperand &)*Operands[MnemonicOpsEndInd + 3])
+ .addMVEPairVectorIndexOperands(Inst, 1); // idx
// skip second copy of Qd in Operands[6]
- ((ARMOperand &)*Operands[7]).addMVEPairVectorIndexOperands(Inst, 1); // idx2
- ((ARMOperand &)*Operands[1]).addCondCodeOperands(Inst, 2); // condition code
+ ((ARMOperand &)*Operands[MnemonicOpsEndInd + 5])
+ .addMVEPairVectorIndexOperands(Inst, 1); // idx2
+ if (CondI != 0) {
+ ((ARMOperand &)*Operands[CondI])
+ .addCondCodeOperands(Inst, 2); // condition code
+ } else {
+ ARMOperand Op =
+ *ARMOperand::CreateCondCode(ARMCC::AL, Operands[0]->getEndLoc());
+ Op.addCondCodeOperands(Inst, 2);
+ }
}
/// Parse an ARM memory expression, return false if...
[truncated]
@llvm/pr-subscribers-backend-arm
In case there is any interest from @llvm/issue-subscribers-backend-mips: I noticed that sorting by smallest encoding has effects on MIPS parsing (hence the inclusion of the option, rather than having it on by default, so as not to break tests). I don't know what the correct behavior is for MIPS.
Also note #83413 is a prerequisite for this change.
This change is intended to have minor or no user-facing changes. It allows the removal of some hacks and makes ARM parsing more idiomatic with respect to how TableGen works. Additionally, it is part of a larger piece of work to address the numerous flaws in how we handle ".w" and ".n".
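For context on those suffixes: ".w" forces a wide (32-bit) encoding and ".n" a narrow (16-bit) one. A minimal sketch, assuming a Thumb2-capable target:

```
adds.n r0, r0, #1   @ force the narrow 16-bit encoding
adds.w r0, r0, #1   @ force the wide 32-bit encoding
adds   r0, r0, #1   @ no suffix: the assembler chooses an encoding
```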
The MCA test failures seem genuine; they make assumptions about optional operands that this change invalidates. I'm investigating now.
✅ With the latest revision this PR passed the C/C++ code formatter.
Also, for quite a nice example of improved behavior from this change, see the changed test. Previously, the de-facto wide/narrow decision meant the instruction was parsed as a wide encoding, which was then converted to narrow by a hacky bit of code that the test checks for. This patch changes the test to check that the narrow encoding is used, because it is now selected directly by the TableGen matcher and the hack isn't exercised (I intend to go through these hacks and remove all those that become unused once this patch is accepted). Also see the numerous cases of improved (though more verbose) diagnostics.
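A hand-written illustration of the new behavior (not the actual test from the patch; the opcode names and invocation are assumptions):

```
@ assumed invocation: llvm-mc -triple=thumbv7 -show-encoding
adds r0, r0, r1   @ previously matched as the wide t2ADDrr and then shrunk by
                  @ the post-match hack; now the matcher picks the 16-bit
                  @ tADDrr encoding directly
```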
I have merged the prerequisites. As long as the tests pass and @ostannard doesn't have any changes to request, I will merge this today. Thank you @s-barannikov, @kosarev, and @statham-arm for your help with this.
This LGTM too.
This change broke building FFmpeg for arm targets. Assembling https://github.com/FFmpeg/FFmpeg/blob/master/libavcodec/arm/vp8_armv6.S#L73-L75 produces the error shown below. It can be reproduced with the following standalone test snippet:

```
.syntax unified
@.thumb

        it ne
        pkhtbne r11, r11, r11, asr #16
        nop
```

Built like this:

```
$ clang -target armv7-linux-gnueabihf -c repro.s
repro.s:5:27: error: asr operand expected.
        pkhtbne r11, r11, r11, asr #16
                          ^
repro.s:6:9: error: incorrect condition in IT block; got 'al', but expected 'ne'
        nop
        ^
```

Please consider reverting this if you can't come up with a fix during the day. (Or I can push a revert myself.)

Also, as a side note: from the merged git commit, I see that you have your email address hidden on GitHub. Apparently the bot didn't make a comment about that for some reason. So, per https://github.com/llvm/llvm-project/blob/main/.github/workflows/email-check.yaml, please turn off the "Keep my email addresses private" setting in your account, and see LLVM Discourse for more information.
Hi @mstorsjo,
@mstorsjo I have a fix; I will add you to it shortly. Thank you for finding this.
This was broken by llvm#83436, as optional operands mean the parsePKHImm parser is sometimes applied to operands in other positions, which previously produced an error. Now this instead fails the parse. However, this unfortunately means the default parsing happens, which combines the shift and the base operator and makes the diagnostics nonsense. To fix this I had to add a special-case diagnostic, which is unfortunate. Ideally these two methods for handling operands should be unified, and the TableGen matching mechanisms should be used for validating parsed shifts rather than special cases in parsing.
I tested: the whole file, and the rest of the project, builds fine now - thanks! (It usually requires you to run configure for the project, to generate a suitable
Great. Thanks for finding this!
This was broken by #83436, as optional operands mean that when the CC operand is provided, the `parsePKHImm` parser is applied to register operands, which previously erroneously produced an error.
Hi @AlfieRichardsArm we are seeing a case with inline asm that no longer compiles after your change. Consider the following code:

```
void a() {
  __asm("strexd r0, r2, [r3]");
  __asm("ldrexd r0, [r2]");
}
```

Prior to your change, this compiled successfully:
But after your change, the compiler is now emitting errors for the same code:
Is this expected?
Hi @dyung, perhaps we previously parsed these but shouldn't have? (For me it causes a crash on old builds.) In this case they would both need an extra operand, e.g. as sketched below.
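A sketch of the likely intended full forms, assuming the shorthand implies the architectural Rt/Rt2 register pairing:

```
strexd r0, r2, r3, [r3]   @ Rd, Rt, Rt2, [Rn]
ldrexd r0, r1, [r2]       @ Rt, Rt2, [Rn]
```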
(I'm guessing at the missing registers.) Is the source for the project these came from public?
The code is unfortunately not from a public codebase. I am told that it is a short form of essentially the longer form you posted. If you try earlier versions of clang and compile directly to an object file and disassemble it, I believe it generated the assembly you mentioned. (I also noticed that earlier versions of clang hit an assertion failure when generating the assembly.)
Ah okay, very strange. I can't find any references to this in specifications or docs. Do you have any documentation for it? I would be happy to add some aliases for this alternate form if it's correct behavior, but I can't seem to find documentation for it or references to it before this code.
It does look like GCC will accept these alternative forms. I expect that this will have been an unofficial/undocumented bit of GCC behaviour that clang has copied.
@smithp35 Ah okay, in that case shall I re-add this behavior and add some tests? As mentioned above, this seems to have been broken previously, but didn't have a test.
I think adding that behaviour back makes sense, as existing implementations make a de facto standard. Ideally we can trace this back, and when we add a test we can say it is for compatibility with GCC.
Okay, I will create a ticket for this and pick it up. By "trace this back" do you mean find a reference to this in GCC? Or find a source for the previous behavior in LLVM?
Yes, all I've done is plug the example into GCC. If we're lucky there will be a comment in the source code or commit message for GCC, or for clang saying that it has copied GCC. There will be a limit to what we can do; sometimes there won't be anything written down that can easily be found.
…ctions (#86507) These aliases were supported previously; there was a regression at some point. This adds back the alternate forms and tidies up this section of code a little. See #83436 (comment) for the initial report regarding this change.
This changes the way the assembly matcher works for AArch32 parsing. Previously there was a pile of hacks which dictated whether the CC, CCOut, and VCC operands should be present, and which de facto chose whether the wide/narrow (or Thumb1/Thumb2/ARM) instruction version was selected.
This meant much of the TableGen machinery present for the assembly matching was effectively being bypassed and worked around.
This patch makes the CC and CCOut operands optional, which allows the ASM matcher to operate as it was designed and means we can avoid doing some of the hacks done previously. This also adds an option for the target to prioritize the smaller instruction encodings, as is required for AArch32.
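As an illustrative sketch of what the optional operands permit (hand-written, not taken from the patch's tests), assuming a Thumb2 target:

```
adds  r0, r0, #1    @ CCOut ('s') present, condition code omitted (defaults to AL)
it    eq
addeq r0, r0, #1    @ condition code present, CCOut omitted
add.w r0, r0, #1    @ neither present; the matcher fills in the defaults
```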