-
Notifications
You must be signed in to change notification settings - Fork 13.4k
[BOLT] Correctly print preferred disassembly for annotated instructions #120564
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…tion. This patch makes sure that BinaryContext::printInstruction prints the preferred disassembly. Preferred disassembly only gets printed when there are no annotations on the MCInst. Therefore, this patch temporarily removes the annotations before printing it. A few examples of before and after on AArch64 instructions are as follows: ``` BEFORE AFTER (preferred disassembly) ret x30 ret orr x30, xzr, x0 mov x30, x0 hint llvm#29 autiasp hint llvm#12 autia1716 ``` Clearly, the preferred disassembly is easier for developers to read, and is the disassembly that tools should be printing. This patch is motivated as part of future work on the llvm-bolt-binary-analysis tool, making sure that the reports it prints do use preferred disassembly. This patch was cherry-picked from https://github.com/kbeyls/llvm-project/tree/bolt-gadget-scanner-prototype. In this current patch, this only affects existing RISCV test cases. This patch also does improve test cases in future patches that will introduce a binary analysis for llvm-bolt-binary-analysis that checks for correct application of pac-ret (pointer authentication on return addresses).
@llvm/pr-subscribers-bolt Author: Kristof Beyls (kbeyls) ChangesThis patch makes sure that A few examples of before and after on AArch64 instructions are as follows:
Clearly, the preferred disassembly is easier for developers to read, and is the disassembly that tools should be printing. This patch is motivated as part of future work on the llvm-bolt-binary-analysis tool, making sure that the reports it prints do use preferred disassembly. This patch was cherry-picked from https://github.com/kbeyls/llvm-project/tree/bolt-gadget-scanner-prototype. In this current patch, this only affects existing RISCV test cases. This patch also does improve test cases in future patches that will introduce a binary analysis for llvm-bolt-binary-analysis that checks for correct application of pac-ret (pointer authentication on return addresses). Full diff: https://github.com/llvm/llvm-project/pull/120564.diff 5 Files Affected:
diff --git a/bolt/lib/Core/BinaryContext.cpp b/bolt/lib/Core/BinaryContext.cpp
index ac96b836ed5796..0fcbf7452cf880 100644
--- a/bolt/lib/Core/BinaryContext.cpp
+++ b/bolt/lib/Core/BinaryContext.cpp
@@ -1961,7 +1961,15 @@ void BinaryContext::printInstruction(raw_ostream &OS, const MCInst &Instruction,
OS << "\tjit\t" << MIB->getTargetSymbol(Instruction)->getName()
<< " # ID: " << DynamicID;
} else {
- InstPrinter->printInst(&Instruction, 0, "", *STI, OS);
+ // If there are annotations on the instruction, the MCInstPrinter will fail to
+ // print the preferred alias as it only does so when the number of operands is
+ // as expected. See
+ // https://github.com/llvm/llvm-project/blob/782f1a0d895646c364a53f9dcdd6d4ec1f3e5ea0/llvm/lib/MC/MCInstPrinter.cpp#L142
+ // Therefore, create a temporary copy of the Inst from which the annotations are
+ // removed, and print that Inst.
+ MCInst InstNoAnnot = Instruction;
+ MIB->stripAnnotations(InstNoAnnot);
+ InstPrinter->printInst(&InstNoAnnot, 0, "", *STI, OS);
}
if (MIB->isCall(Instruction)) {
if (MIB->isTailCall(Instruction))
diff --git a/bolt/test/RISCV/call-annotations.s b/bolt/test/RISCV/call-annotations.s
index ff99e0f1dfd848..f876544e214cab 100644
--- a/bolt/test/RISCV/call-annotations.s
+++ b/bolt/test/RISCV/call-annotations.s
@@ -16,13 +16,13 @@ f:
// CHECK-LABEL: Binary Function "_start" after building cfg {
// CHECK: auipc ra, f
-// CHECK-NEXT: jalr ra, -0x4(ra) # Offset: 4
-// CHECK-NEXT: jal ra, f # Offset: 8
-// CHECK-NEXT: jal zero, f # TAILCALL # Offset: 12
+// CHECK-NEXT: jalr -0x4(ra) # Offset: 4
+// CHECK-NEXT: jal f # Offset: 8
+// CHECK-NEXT: j f # TAILCALL # Offset: 12
// CHECK-LABEL: Binary Function "long_tail" after building cfg {
// CHECK: auipc t1, f
-// CHECK-NEXT: jalr zero, -0x18(t1) # TAILCALL # Offset: 8
+// CHECK-NEXT: jr -0x18(t1) # TAILCALL # Offset: 8
// CHECK-LABEL: Binary Function "compressed_tail" after building cfg {
// CHECK: jr a0 # TAILCALL # Offset: 0
diff --git a/bolt/test/RISCV/relax.s b/bolt/test/RISCV/relax.s
index ec390ea76b5c72..74f049b8f8dd94 100644
--- a/bolt/test/RISCV/relax.s
+++ b/bolt/test/RISCV/relax.s
@@ -5,9 +5,9 @@
// RUN: llvm-objdump -d %t.bolt | FileCheck --check-prefix=OBJDUMP %s
// CHECK: Binary Function "_start" after building cfg {
-// CHECK: jal ra, near_f
+// CHECK: jal near_f
// CHECK-NEXT: auipc ra, far_f
-// CHECK-NEXT: jalr ra, 0xc(ra)
+// CHECK-NEXT: jalr 0xc(ra)
// CHECK-NEXT: j near_f
// CHECK: Binary Function "_start" after fix-riscv-calls {
diff --git a/bolt/test/RISCV/reloc-branch.s b/bolt/test/RISCV/reloc-branch.s
index 6a8b5a28e19d92..780d978f36f447 100644
--- a/bolt/test/RISCV/reloc-branch.s
+++ b/bolt/test/RISCV/reloc-branch.s
@@ -7,7 +7,7 @@
.p2align 1
// CHECK: Binary Function "_start" after building cfg {
_start:
-// CHECK: beq zero, zero, .Ltmp0
+// CHECK: beqz zero, .Ltmp0
beq zero, zero, 1f
nop
// CHECK: .Ltmp0
diff --git a/bolt/test/RISCV/reloc-jal.s b/bolt/test/RISCV/reloc-jal.s
index ce54265fac05e9..62a2f1dbea03a7 100644
--- a/bolt/test/RISCV/reloc-jal.s
+++ b/bolt/test/RISCV/reloc-jal.s
@@ -14,7 +14,7 @@ f:
.globl _start
.p2align 1
_start:
-// CHECK: jal ra, f
+// CHECK: jal f
jal ra, f
ret
.size _start, .-_start
|
@mtvec @MouseSplinter : FYI, as this patch touches on some of the RISC-V tests that you wrote (if I'm reading git history correctly) |
✅ With the latest revision this PR passed the C/C++ code formatter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
This patch makes sure that
BinaryContext::printInstruction
prints the preferred disassembly. Preferred disassembly only gets printed when there are no annotations on the MCInst. Therefore, this patch temporarily removes the annotations before printing it.A few examples of before and after on AArch64 instructions are as follows:
Clearly, the preferred disassembly is easier for developers to read, and is the disassembly that tools should be printing.
This patch is motivated as part of future work on the llvm-bolt-binary-analysis tool, making sure that the reports it prints do use preferred disassembly.
This patch was cherry-picked from https://github.com/kbeyls/llvm-project/tree/bolt-gadget-scanner-prototype.
In this current patch, this only affects existing RISCV test cases.
This patch also does improve test cases in future patches that will introduce a binary analysis for llvm-bolt-binary-analysis that checks for correct application of pac-ret (pointer authentication on return addresses).