Skip to content

[BOLT] Correctly print preferred disassembly for annotated instructions #120564

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Dec 20, 2024

Conversation

kbeyls
Copy link
Collaborator

@kbeyls kbeyls commented Dec 19, 2024

This patch makes sure that BinaryContext::printInstruction prints the preferred disassembly. Preferred disassembly only gets printed when there are no annotations on the MCInst. Therefore, this patch temporarily removes the annotations before printing it.

A few examples of before and after on AArch64 instructions are as follows:

  BEFORE                     AFTER
                             (preferred disassembly)

  ret   x30                  ret
  orr   x30, xzr, x0         mov   x30, x0
  hint  #29                  autiasp
  hint  #12                  autia1716

Clearly, the preferred disassembly is easier for developers to read, and is the disassembly that tools should be printing.

This patch is motivated as part of future work on the llvm-bolt-binary-analysis tool, making sure that the reports it prints do use preferred disassembly.

This patch was cherry-picked from https://github.com/kbeyls/llvm-project/tree/bolt-gadget-scanner-prototype.

In this current patch, this only affects existing RISCV test cases.

This patch also does improve test cases in future patches that will introduce a binary analysis for llvm-bolt-binary-analysis that checks for correct application of pac-ret (pointer authentication on return addresses).

…tion.

This patch makes sure that BinaryContext::printInstruction prints the
preferred disassembly. Preferred disassembly only gets printed when
there are no annotations on the MCInst. Therefore, this patch
temporarily removes the annotations before printing it.

A few examples of before and after on AArch64 instructions are as
follows:

```
  BEFORE                     AFTER
                             (preferred disassembly)

  ret   x30                  ret
  orr   x30, xzr, x0         mov x30, x0
  hint  llvm#29                  autiasp
  hint  llvm#12                  autia1716
```

Clearly, the preferred disassembly is easier for developers to read,
and is the disassembly that tools should be printing.

This patch is motivated as part of future work on the
llvm-bolt-binary-analysis tool, making sure that the reports it prints
do use preferred disassembly.

This patch was cherry-picked from https://github.com/kbeyls/llvm-project/tree/bolt-gadget-scanner-prototype.

In this current patch, this only affects existing RISCV test cases.

This patch also does improve test cases in future patches that will
introduce a binary analysis for llvm-bolt-binary-analysis that checks
for correct application of pac-ret (pointer authentication on return
addresses).
@llvmbot
Copy link
Member

llvmbot commented Dec 19, 2024

@llvm/pr-subscribers-bolt

Author: Kristof Beyls (kbeyls)

Changes

This patch makes sure that BinaryContext::printInstruction prints the preferred disassembly. Preferred disassembly only gets printed when there are no annotations on the MCInst. Therefore, this patch temporarily removes the annotations before printing it.

A few examples of before and after on AArch64 instructions are as follows:

  BEFORE                     AFTER
                             (preferred disassembly)

  ret   x30                  ret
  orr   x30, xzr, x0         mov   x30, x0
  hint  #<!-- -->29                  autiasp
  hint  #<!-- -->12                  autia1716

Clearly, the preferred disassembly is easier for developers to read, and is the disassembly that tools should be printing.

This patch is motivated as part of future work on the llvm-bolt-binary-analysis tool, making sure that the reports it prints do use preferred disassembly.

This patch was cherry-picked from https://github.com/kbeyls/llvm-project/tree/bolt-gadget-scanner-prototype.

In this current patch, this only affects existing RISCV test cases.

This patch also does improve test cases in future patches that will introduce a binary analysis for llvm-bolt-binary-analysis that checks for correct application of pac-ret (pointer authentication on return addresses).


Full diff: https://github.com/llvm/llvm-project/pull/120564.diff

5 Files Affected:

  • (modified) bolt/lib/Core/BinaryContext.cpp (+9-1)
  • (modified) bolt/test/RISCV/call-annotations.s (+4-4)
  • (modified) bolt/test/RISCV/relax.s (+2-2)
  • (modified) bolt/test/RISCV/reloc-branch.s (+1-1)
  • (modified) bolt/test/RISCV/reloc-jal.s (+1-1)
diff --git a/bolt/lib/Core/BinaryContext.cpp b/bolt/lib/Core/BinaryContext.cpp
index ac96b836ed5796..0fcbf7452cf880 100644
--- a/bolt/lib/Core/BinaryContext.cpp
+++ b/bolt/lib/Core/BinaryContext.cpp
@@ -1961,7 +1961,15 @@ void BinaryContext::printInstruction(raw_ostream &OS, const MCInst &Instruction,
     OS << "\tjit\t" << MIB->getTargetSymbol(Instruction)->getName()
        << " # ID: " << DynamicID;
   } else {
-    InstPrinter->printInst(&Instruction, 0, "", *STI, OS);
+    // If there are annotations on the instruction, the MCInstPrinter will fail to
+    // print the preferred alias as it only does so when the number of operands is
+    // as expected. See
+    // https://github.com/llvm/llvm-project/blob/782f1a0d895646c364a53f9dcdd6d4ec1f3e5ea0/llvm/lib/MC/MCInstPrinter.cpp#L142
+    // Therefore, create a temporary copy of the Inst from which the annotations are
+    // removed, and print that Inst.
+    MCInst InstNoAnnot = Instruction;
+    MIB->stripAnnotations(InstNoAnnot);
+    InstPrinter->printInst(&InstNoAnnot, 0, "", *STI, OS);
   }
   if (MIB->isCall(Instruction)) {
     if (MIB->isTailCall(Instruction))
diff --git a/bolt/test/RISCV/call-annotations.s b/bolt/test/RISCV/call-annotations.s
index ff99e0f1dfd848..f876544e214cab 100644
--- a/bolt/test/RISCV/call-annotations.s
+++ b/bolt/test/RISCV/call-annotations.s
@@ -16,13 +16,13 @@ f:
 
 // CHECK-LABEL: Binary Function "_start" after building cfg {
 // CHECK:      auipc ra, f
-// CHECK-NEXT: jalr ra, -0x4(ra) # Offset: 4
-// CHECK-NEXT: jal ra, f # Offset: 8
-// CHECK-NEXT: jal zero, f # TAILCALL  # Offset: 12
+// CHECK-NEXT: jalr -0x4(ra) # Offset: 4
+// CHECK-NEXT: jal f # Offset: 8
+// CHECK-NEXT: j f # TAILCALL  # Offset: 12
 
 // CHECK-LABEL: Binary Function "long_tail" after building cfg {
 // CHECK:      auipc t1, f
-// CHECK-NEXT: jalr zero, -0x18(t1) # TAILCALL  # Offset: 8
+// CHECK-NEXT: jr -0x18(t1) # TAILCALL  # Offset: 8
 
 // CHECK-LABEL: Binary Function "compressed_tail" after building cfg {
 // CHECK:      jr a0 # TAILCALL  # Offset: 0
diff --git a/bolt/test/RISCV/relax.s b/bolt/test/RISCV/relax.s
index ec390ea76b5c72..74f049b8f8dd94 100644
--- a/bolt/test/RISCV/relax.s
+++ b/bolt/test/RISCV/relax.s
@@ -5,9 +5,9 @@
 // RUN: llvm-objdump -d %t.bolt | FileCheck --check-prefix=OBJDUMP %s
 
 // CHECK:      Binary Function "_start" after building cfg {
-// CHECK:      jal ra, near_f
+// CHECK:      jal near_f
 // CHECK-NEXT: auipc ra, far_f
-// CHECK-NEXT: jalr ra, 0xc(ra)
+// CHECK-NEXT: jalr 0xc(ra)
 // CHECK-NEXT: j near_f
 
 // CHECK:      Binary Function "_start" after fix-riscv-calls {
diff --git a/bolt/test/RISCV/reloc-branch.s b/bolt/test/RISCV/reloc-branch.s
index 6a8b5a28e19d92..780d978f36f447 100644
--- a/bolt/test/RISCV/reloc-branch.s
+++ b/bolt/test/RISCV/reloc-branch.s
@@ -7,7 +7,7 @@
   .p2align 1
 // CHECK: Binary Function "_start" after building cfg {
 _start:
-// CHECK: beq zero, zero, .Ltmp0
+// CHECK: beqz zero, .Ltmp0
   beq zero, zero, 1f
   nop
 // CHECK: .Ltmp0
diff --git a/bolt/test/RISCV/reloc-jal.s b/bolt/test/RISCV/reloc-jal.s
index ce54265fac05e9..62a2f1dbea03a7 100644
--- a/bolt/test/RISCV/reloc-jal.s
+++ b/bolt/test/RISCV/reloc-jal.s
@@ -14,7 +14,7 @@ f:
   .globl _start
   .p2align 1
 _start:
-// CHECK: jal ra, f
+// CHECK: jal f
   jal ra, f
   ret
   .size _start, .-_start

@kbeyls
Copy link
Collaborator Author

kbeyls commented Dec 19, 2024

@mtvec @MouseSplinter : FYI, as this patch touches on some of the RISC-V tests that you wrote (if I'm reading git history correctly)

Copy link

github-actions bot commented Dec 19, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Contributor

@maksfb maksfb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@kbeyls kbeyls merged commit 4111841 into llvm:main Dec 20, 2024
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants