Skip to content

Introduce DIExpression::foldConstantMath() #71718

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 29, 2024

Conversation

rastogishubham
Copy link
Contributor

DIExpressions can get very long and have a lot of redundant operations. This function uses simple pattern matching to fold constant math that can be evaluated at compile time.

The hope is that other people can contribute other patterns as well.

I also couldn't see a good way of combining this with DIExpression::constantFold so it stands alone.

This is part of a stack of patches and comes after
#69768
#71717

@llvmbot
Copy link
Member

llvmbot commented Nov 8, 2023

@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-debuginfo

Author: Shubham Sandeep Rastogi (rastogishubham)

Changes

DIExpressions can get very long and have a lot of redundant operations. This function uses simple pattern matching to fold constant math that can be evaluated at compile time.

The hope is that other people can contribute other patterns as well.

I also couldn't see a good way of combining this with DIExpression::constantFold so it stands alone.

This is part of a stack of patches and comes after
#69768
#71717


Patch is 25.16 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/71718.diff

4 Files Affected:

  • (modified) llvm/include/llvm/IR/DebugInfoMetadata.h (+85)
  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfExpression.h (-61)
  • (modified) llvm/lib/IR/DebugInfoMetadata.cpp (+280-1)
  • (modified) llvm/unittests/IR/MetadataTest.cpp (+277)
diff --git a/llvm/include/llvm/IR/DebugInfoMetadata.h b/llvm/include/llvm/IR/DebugInfoMetadata.h
index b347664883fd9f2..354ae48c3d676c1 100644
--- a/llvm/include/llvm/IR/DebugInfoMetadata.h
+++ b/llvm/include/llvm/IR/DebugInfoMetadata.h
@@ -3044,6 +3044,11 @@ class DIExpression : public MDNode {
   /// expression and constant on failure.
   std::pair<DIExpression *, const ConstantInt *>
   constantFold(const ConstantInt *CI);
+
+  /// Try to shorten an expression with constand math operations that can be
+  /// evaulated at compile time. Returns a new expression on success, or the old
+  /// expression if there is nothing to be reduced.
+  DIExpression *foldConstantMath();
 };
 
 inline bool operator==(const DIExpression::FragmentInfo &A,
@@ -3073,6 +3078,86 @@ template <> struct DenseMapInfo<DIExpression::FragmentInfo> {
   static bool isEqual(const FragInfo &A, const FragInfo &B) { return A == B; }
 };
 
+/// Holds a DIExpression and keeps track of how many operands have been consumed
+/// so far.
+class DIExpressionCursor {
+  DIExpression::expr_op_iterator Start, End;
+
+public:
+  DIExpressionCursor(const DIExpression *Expr) {
+    if (!Expr) {
+      assert(Start == End);
+      return;
+    }
+    Start = Expr->expr_op_begin();
+    End = Expr->expr_op_end();
+  }
+
+  DIExpressionCursor(ArrayRef<uint64_t> Expr)
+      : Start(Expr.begin()), End(Expr.end()) {}
+
+  DIExpressionCursor(const DIExpressionCursor &) = default;
+
+  /// Consume one operation.
+  std::optional<DIExpression::ExprOperand> take() {
+    if (Start == End)
+      return std::nullopt;
+    return *(Start++);
+  }
+
+  /// Consume N operations.
+  void consume(unsigned N) { std::advance(Start, N); }
+
+  /// Return the current operation.
+  std::optional<DIExpression::ExprOperand> peek() const {
+    if (Start == End)
+      return std::nullopt;
+    return *(Start);
+  }
+
+  /// Return the next operation.
+  std::optional<DIExpression::ExprOperand> peekNext() const {
+    if (Start == End)
+      return std::nullopt;
+
+    auto Next = Start.getNext();
+    if (Next == End)
+      return std::nullopt;
+
+    return *Next;
+  }
+
+  std::optional<DIExpression::ExprOperand> peekNextN(unsigned N) const {
+    if (Start == End)
+      return std::nullopt;
+    DIExpression::expr_op_iterator Nth = Start;
+    for (unsigned I = 0; I < N; I++) {
+      Nth = Nth.getNext();
+      if (Nth == End)
+        return std::nullopt;
+    }
+    return *Nth;
+  }
+
+  void assignNewExpr(ArrayRef<uint64_t> Expr) {
+    DIExpression::expr_op_iterator NewStart(Expr.begin());
+    DIExpression::expr_op_iterator NewEnd(Expr.end());
+    this->Start = NewStart;
+    this->End = NewEnd;
+  }
+
+  /// Determine whether there are any operations left in this expression.
+  operator bool() const { return Start != End; }
+
+  DIExpression::expr_op_iterator begin() const { return Start; }
+  DIExpression::expr_op_iterator end() const { return End; }
+
+  /// Retrieve the fragment information, if any.
+  std::optional<DIExpression::FragmentInfo> getFragmentInfo() const {
+    return DIExpression::getFragmentInfo(Start, End);
+  }
+};
+
 /// Global variables.
 ///
 /// TODO: Remove DisplayName.  It's always equal to Name.
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfExpression.h b/llvm/lib/CodeGen/AsmPrinter/DwarfExpression.h
index 667a9efc6f6c04f..4daa78b15b8e298 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfExpression.h
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfExpression.h
@@ -31,67 +31,6 @@ class DIELoc;
 class TargetRegisterInfo;
 class MachineLocation;
 
-/// Holds a DIExpression and keeps track of how many operands have been consumed
-/// so far.
-class DIExpressionCursor {
-  DIExpression::expr_op_iterator Start, End;
-
-public:
-  DIExpressionCursor(const DIExpression *Expr) {
-    if (!Expr) {
-      assert(Start == End);
-      return;
-    }
-    Start = Expr->expr_op_begin();
-    End = Expr->expr_op_end();
-  }
-
-  DIExpressionCursor(ArrayRef<uint64_t> Expr)
-      : Start(Expr.begin()), End(Expr.end()) {}
-
-  DIExpressionCursor(const DIExpressionCursor &) = default;
-
-  /// Consume one operation.
-  std::optional<DIExpression::ExprOperand> take() {
-    if (Start == End)
-      return std::nullopt;
-    return *(Start++);
-  }
-
-  /// Consume N operations.
-  void consume(unsigned N) { std::advance(Start, N); }
-
-  /// Return the current operation.
-  std::optional<DIExpression::ExprOperand> peek() const {
-    if (Start == End)
-      return std::nullopt;
-    return *(Start);
-  }
-
-  /// Return the next operation.
-  std::optional<DIExpression::ExprOperand> peekNext() const {
-    if (Start == End)
-      return std::nullopt;
-
-    auto Next = Start.getNext();
-    if (Next == End)
-      return std::nullopt;
-
-    return *Next;
-  }
-
-  /// Determine whether there are any operations left in this expression.
-  operator bool() const { return Start != End; }
-
-  DIExpression::expr_op_iterator begin() const { return Start; }
-  DIExpression::expr_op_iterator end() const { return End; }
-
-  /// Retrieve the fragment information, if any.
-  std::optional<DIExpression::FragmentInfo> getFragmentInfo() const {
-    return DIExpression::getFragmentInfo(Start, End);
-  }
-};
-
 /// Base class containing the logic for constructing DWARF expressions
 /// independently of whether they are emitted into a DIE or into a .debug_loc
 /// entry.
diff --git a/llvm/lib/IR/DebugInfoMetadata.cpp b/llvm/lib/IR/DebugInfoMetadata.cpp
index f7f36129ec8557c..b73ffa3880fcc4a 100644
--- a/llvm/lib/IR/DebugInfoMetadata.cpp
+++ b/llvm/lib/IR/DebugInfoMetadata.cpp
@@ -1857,7 +1857,6 @@ DIExpression *DIExpression::append(const DIExpression *Expr,
     }
     Op.appendToVector(NewOps);
   }
-
   NewOps.append(Ops.begin(), Ops.end());
   auto *result = DIExpression::get(Expr->getContext(), NewOps);
   assert(result->isValid() && "concatenated expression is not valid");
@@ -1998,6 +1997,286 @@ DIExpression::constantFold(const ConstantInt *CI) {
           ConstantInt::get(getContext(), NewInt)};
 }
 
+static bool isConstOperation(uint64_t Op) {
+  return Op == dwarf::DW_OP_constu || Op == dwarf::DW_OP_consts;
+}
+
+static bool operationIsNoOp(uint64_t Op, uint64_t Val) {
+  switch (Op) {
+  case dwarf::DW_OP_plus:
+  case dwarf::DW_OP_minus:
+  case dwarf::DW_OP_shl:
+  case dwarf::DW_OP_shr:
+    return Val == 0;
+  case dwarf::DW_OP_mul:
+  case dwarf::DW_OP_div:
+    return Val == 1;
+  default:
+    return false;
+  }
+}
+
+static std::optional<uint64_t>
+foldOperationIfPossible(uint64_t Op, uint64_t Operand1, uint64_t Operand2) {
+  switch (Op) {
+  case dwarf::DW_OP_plus:
+    return Operand1 + Operand2;
+  case dwarf::DW_OP_minus:
+    return Operand1 - Operand2;
+  case dwarf::DW_OP_shl:
+    return Operand1 << Operand2;
+  case dwarf::DW_OP_shr:
+    return Operand1 >> Operand2;
+  case dwarf::DW_OP_mul:
+    return Operand1 * Operand2;
+  case dwarf::DW_OP_div:
+    return Operand1 / Operand2;
+  default:
+    return std::nullopt;
+  }
+}
+
+static bool operationsAreFoldableAndCommutative(uint64_t Op1, uint64_t Op2) {
+  if (Op1 != Op2)
+    return false;
+  switch (Op1) {
+  case dwarf::DW_OP_plus:
+  case dwarf::DW_OP_mul:
+    return true;
+  default:
+    return false;
+  }
+}
+
+static void moveCursorAndLocPos(DIExpressionCursor &Cursor, uint64_t &Loc,
+                                const DIExpression::ExprOperand &Op) {
+  Cursor.consume(1);
+  Loc = Loc + Op.getSize();
+}
+
+DIExpression *DIExpression::foldConstantMath() {
+
+  SmallVector<uint64_t, 8> WorkingOps;
+  WorkingOps.append(Elements.begin(), Elements.end());
+  uint64_t Loc = 0;
+  DIExpressionCursor Cursor(WorkingOps);
+
+  while (Loc < WorkingOps.size()) {
+
+    auto Op1 = Cursor.peek();
+    // Expression has no operations, exit
+    if (!Op1)
+      break;
+    auto Op1Raw = Op1->getOp();
+    auto Op1Arg = Op1->getArg(0);
+
+    // {DW_OP_plus_uconst, 0} -> {}
+    if (Op1Raw == dwarf::DW_OP_plus_uconst && Op1Arg == 0) {
+      WorkingOps.erase(WorkingOps.begin() + Loc, WorkingOps.begin() + Loc + 2);
+      moveCursorAndLocPos(Cursor, Loc, *Op1);
+      continue;
+    }
+
+    if (!isConstOperation(Op1Raw) && Op1Raw != dwarf::DW_OP_plus_uconst) {
+      // Early exit, all of the following patterns start with a constant value
+      moveCursorAndLocPos(Cursor, Loc, *Op1);
+      continue;
+    }
+
+    auto Op2 = Cursor.peekNext();
+    // All following patterns require at least 2 Operations, exit
+    if (!Op2)
+      break;
+    auto Op2Raw = Op2->getOp();
+
+    // {DW_OP_const[u, s], 0, DW_OP_[plus, minus, shl, shr]} -> {}
+    // {DW_OP_const[u, s], 1, DW_OP_[mul, div]} -> {}
+    if (isConstOperation(Op1Raw) && operationIsNoOp(Op2Raw, Op1Arg)) {
+      WorkingOps.erase(WorkingOps.begin() + Loc, WorkingOps.begin() + Loc + 3);
+      Cursor.assignNewExpr(WorkingOps);
+      Loc = 0;
+      continue;
+    }
+
+    auto Op2Arg = Op2->getArg(0);
+
+    // {DW_OP_plus_uconst, Const1, DW_OP_plus_uconst, Const2} ->
+    // {DW_OP_plus_uconst, Const1 + Const2}
+    if (Op1Raw == dwarf::DW_OP_plus_uconst &&
+        Op2Raw == dwarf::DW_OP_plus_uconst) {
+      auto Result = Op1Arg + Op2Arg;
+      WorkingOps.erase(WorkingOps.begin() + Loc + 2,
+                       WorkingOps.begin() + Loc + 4);
+      WorkingOps[Loc + 1] = Result;
+      Cursor.assignNewExpr(WorkingOps);
+      Loc = 0;
+      continue;
+    }
+
+    // {DW_OP_const[u, s], Const1, DW_OP_plus_uconst Const2} -> {DW_OP_constu,
+    // Const1 + Const2}
+    if (isConstOperation(Op1Raw) && Op2Raw == dwarf::DW_OP_plus_uconst) {
+      auto Result = Op1Arg + Op2Arg;
+      WorkingOps.erase(WorkingOps.begin() + Loc + 2,
+                       WorkingOps.begin() + Loc + 4);
+      WorkingOps[Loc] = dwarf::DW_OP_constu;
+      WorkingOps[Loc + 1] = Result;
+      Cursor.assignNewExpr(WorkingOps);
+      Loc = 0;
+      continue;
+    }
+
+    auto Op3 = Cursor.peekNextN(2);
+    // Op2 could still match a pattern, skip iteration
+    if (!Op3) {
+      moveCursorAndLocPos(Cursor, Loc, *Op1);
+      continue;
+    }
+    auto Op3Raw = Op3->getOp();
+
+    // {DW_OP_const[u, s], Const1, DW_OP_const[u, s], Const2, DW_OP_[plus,
+    // minus, mul, div, shl, shr] -> {DW_OP_constu, Const1 [+, -, *, /, <<, >>]
+    // Const2}
+    if (isConstOperation(Op1Raw) && isConstOperation(Op2Raw)) {
+      auto Result = foldOperationIfPossible(Op3Raw, Op1Arg, Op2Arg);
+      if (!Result) {
+        moveCursorAndLocPos(Cursor, Loc, *Op1);
+        continue;
+      }
+      WorkingOps.erase(WorkingOps.begin() + Loc + 2,
+                       WorkingOps.begin() + Loc + 5);
+      WorkingOps[Loc] = dwarf::DW_OP_constu;
+      WorkingOps[Loc + 1] = *Result;
+      Cursor.assignNewExpr(WorkingOps);
+      Loc = 0;
+      continue;
+    }
+
+    // {DW_OP_plus_uconst, Const1, DW_OP_const[u, s], Const1, DW_OP_plus} ->
+    // {DW_OP_plus_uconst, Const1 + Const2}
+    if (Op1Raw == dwarf::DW_OP_plus_uconst && isConstOperation(Op2Raw) &&
+        Op3Raw == dwarf::DW_OP_plus) {
+      auto Result = Op1Arg + Op2Arg;
+      WorkingOps.erase(WorkingOps.begin() + Loc + 2, WorkingOps.begin() + 5);
+      WorkingOps[Loc + 1] = Result;
+      Cursor.assignNewExpr(WorkingOps);
+      Loc = 0;
+      continue;
+    }
+
+    auto Op3Arg = Op3->getArg(0);
+    // {DW_OP_const[u, s], Const1, DW_OP_plus, DW_OP_plus_uconst, Const2} ->
+    // {DW_OP_plus_uconst, Const1 + Const2}
+    if (isConstOperation(Op1Raw) && Op2Raw == dwarf::DW_OP_plus &&
+        Op3Raw == dwarf::DW_OP_plus_uconst) {
+      auto Result = Op1Arg + Op3Arg;
+      WorkingOps.erase(WorkingOps.begin() + Loc + 2, WorkingOps.begin() + 5);
+      WorkingOps[Loc] = dwarf::DW_OP_plus_uconst;
+      WorkingOps[Loc + 1] = Result;
+      Cursor.assignNewExpr(WorkingOps);
+      Loc = 0;
+      continue;
+    }
+
+    auto Op4 = Cursor.peekNextN(3);
+    // Op2 and Op3 could still match a pattern, skip iteration
+    if (!Op4) {
+      moveCursorAndLocPos(Cursor, Loc, *Op1);
+      continue;
+    }
+    auto Op4Raw = Op4->getOp();
+
+    // {DW_OP_const[u, s], Const1, DW_OP_[plus, mul], Const2, DW_OP_[plus, mul]}
+    // -> {DW_OP_constu, Const1 [+, *] Const2, DW_OP_[plus, mul]}
+    if (isConstOperation(Op1Raw) && isConstOperation(Op3Raw) &&
+        operationsAreFoldableAndCommutative(Op2Raw, Op4Raw)) {
+      auto Result = foldOperationIfPossible(Op2Raw, Op1Arg, Op3Arg);
+      if (!Result)
+        llvm_unreachable(
+            "Something went very wrong here! This should always work");
+      WorkingOps.erase(WorkingOps.begin() + Loc + 3,
+                       WorkingOps.begin() + Loc + 6);
+      WorkingOps[Loc] = dwarf::DW_OP_constu;
+      WorkingOps[Loc + 1] = *Result;
+      Cursor.assignNewExpr(WorkingOps);
+      Loc = 0;
+      continue;
+    }
+    auto Op5 = Cursor.peekNextN(4);
+    if (!Op5) {
+      moveCursorAndLocPos(Cursor, Loc, *Op1);
+      continue;
+    }
+    auto Op5Raw = Op5->getOp();
+    auto Op5Arg = Op5->getArg(0);
+
+    // {DW_OP_const[u, s], Const1, DW_OP_plus, DW_OP_LLVM_arg, Arg1, DW_OP_plus,
+    // DW_OP_plus_uconst, Const2} -> {DW_OP_constu, Const1 + Const2, DW_OP_plus,
+    // DW_OP_LLVM_arg, Arg1, DW_OP_plus}
+    if (isConstOperation(Op1Raw) && Op3Raw == dwarf::DW_OP_LLVM_arg &&
+        Op5Raw == dwarf::DW_OP_plus_uconst && Op2Raw == Op4Raw &&
+        Op4Raw == dwarf::DW_OP_plus) {
+      auto Result = Op1Arg + Op5Arg;
+      WorkingOps.erase(WorkingOps.begin() + Loc + 6,
+                       WorkingOps.begin() + Loc + 8);
+      WorkingOps[Loc] = dwarf::DW_OP_constu;
+      WorkingOps[Loc + 1] = Result;
+      Cursor.assignNewExpr(WorkingOps);
+      Loc = 0;
+      continue;
+    }
+
+    auto Op4Arg = Op4->getArg(0);
+
+    // {DW_OP_plus_uconst, Const1, DW_OP_LLVM_arg, Arg1, DW_OP_plus,
+    // DW_OP_const[u, s], Const2, DW_OP_plus} -> {DW_OP_plus_uconst, Const1 +
+    // Const2, DW_OP_LLVM_arg, Arg1, DW_OP_plus}
+    if (Op1Raw == dwarf::DW_OP_plus_uconst && Op2Raw == dwarf::DW_OP_LLVM_arg &&
+        Op3Raw == Op5Raw && Op3Raw == dwarf::DW_OP_plus &&
+        isConstOperation(Op4Raw)) {
+      auto Result = Op1Arg + Op4Arg;
+      WorkingOps.erase(WorkingOps.begin() + Loc + 5,
+                       WorkingOps.begin() + Loc + 8);
+      WorkingOps[Loc + 1] = Result;
+      Cursor.assignNewExpr(WorkingOps);
+      Loc = 0;
+      continue;
+    }
+
+    auto Op6 = Cursor.peekNextN(5);
+    if (!Op6) {
+      moveCursorAndLocPos(Cursor, Loc, *Op1);
+      continue;
+    }
+    auto Op6Raw = Op6->getOp();
+    // {DW_OP_const[u, s], Const1, DW_OP_[plus, mul], DW_OP_LLVM_arg, Arg1,
+    // DW_OP_[plus, mul], DW_OP_const[u, s], Const2, DW_OP_[plus, mul]} ->
+    // {DW_OP_constu, Const1 [+, *] Const2, DW_OP_[plus, mul], DW_OP_LLVM_arg,
+    // Arg1, DW_OP_[plus, mul]}
+    if (isConstOperation(Op1Raw) && Op3Raw == dwarf::DW_OP_LLVM_arg &&
+        isConstOperation(Op5Raw) &&
+        operationsAreFoldableAndCommutative(Op2Raw, Op4Raw) &&
+        operationsAreFoldableAndCommutative(Op4Raw, Op6Raw)) {
+      auto Result = foldOperationIfPossible(Op2Raw, Op1Arg, Op5Arg);
+      if (!Result)
+        llvm_unreachable(
+            "Something went very wrong here! This should always work");
+      WorkingOps.erase(WorkingOps.begin() + Loc + 6,
+                       WorkingOps.begin() + Loc + 9);
+      WorkingOps[Loc] = dwarf::DW_OP_constu;
+      WorkingOps[Loc + 1] = *Result;
+      Cursor.assignNewExpr(WorkingOps);
+      Loc = 0;
+      continue;
+    }
+
+    moveCursorAndLocPos(Cursor, Loc, *Op1);
+  }
+  auto *Result = DIExpression::get(getContext(), WorkingOps);
+  assert(Result->isValid() && "concatenated expression is not valid");
+  return Result;
+}
+
 uint64_t DIExpression::getNumLocationOperands() const {
   uint64_t Result = 0;
   for (auto ExprOp : expr_ops())
diff --git a/llvm/unittests/IR/MetadataTest.cpp b/llvm/unittests/IR/MetadataTest.cpp
index 1570631287b2a78..2326c5c2c78c630 100644
--- a/llvm/unittests/IR/MetadataTest.cpp
+++ b/llvm/unittests/IR/MetadataTest.cpp
@@ -3120,6 +3120,283 @@ TEST_F(DIExpressionTest, get) {
   EXPECT_EQ(N0WithPrependedOps, N2);
 }
 
+TEST_F(DIExpressionTest, Fold) {
+
+  // Remove a No-op DW_OP_plus_uconst from an expression.
+  SmallVector<uint64_t, 8> Ops = {dwarf::DW_OP_plus_uconst, 0};
+  auto *Expr = DIExpression::get(Context, Ops);
+  auto *E = Expr->foldConstantMath();
+  SmallVector<uint64_t, 8> ResOps;
+  auto *ResExpr = DIExpression::get(Context, ResOps);
+  EXPECT_EQ(E, ResExpr);
+
+  // Remove a No-op add from an expression.
+  Ops[0] = dwarf::DW_OP_constu;
+  Ops[1] = 0;
+  Ops.push_back(dwarf::DW_OP_plus);
+  Expr = DIExpression::get(Context, Ops);
+  E = Expr->foldConstantMath();
+  ResExpr = DIExpression::get(Context, ResOps);
+  EXPECT_EQ(E, ResExpr);
+
+  // Remove a No-op subtract from an expression.
+  Ops[2] = dwarf::DW_OP_minus;
+  Expr = DIExpression::get(Context, Ops);
+  E = Expr->foldConstantMath();
+  ResExpr = DIExpression::get(Context, ResOps);
+  EXPECT_EQ(E, ResExpr);
+
+  // Remove a No-op shift left from an expression.
+  Ops[2] = dwarf::DW_OP_shl;
+  Expr = DIExpression::get(Context, Ops);
+  E = Expr->foldConstantMath();
+  ResExpr = DIExpression::get(Context, ResOps);
+  EXPECT_EQ(E, ResExpr);
+
+  // Remove a No-op shift right from an expression.
+  Ops[2] = dwarf::DW_OP_shr;
+  Expr = DIExpression::get(Context, Ops);
+  E = Expr->foldConstantMath();
+  ResExpr = DIExpression::get(Context, ResOps);
+  EXPECT_EQ(E, ResExpr);
+
+  // Remove a No-op multiply from an expression.
+  Ops[2] = dwarf::DW_OP_mul;
+  Ops[1] = 1;
+  Expr = DIExpression::get(Context, Ops);
+  E = Expr->foldConstantMath();
+  ResExpr = DIExpression::get(Context, ResOps);
+  EXPECT_EQ(E, ResExpr);
+
+  // Remove a No-op divide from an expression.
+  Ops[2] = dwarf::DW_OP_div;
+  Expr = DIExpression::get(Context, Ops);
+  E = Expr->foldConstantMath();
+  ResExpr = DIExpression::get(Context, ResOps);
+  EXPECT_EQ(E, ResExpr);
+
+  // Test fold {DW_OP_plus_uconst, Const1, DW_OP_plus_uconst, Const2} ->
+  // {DW_OP_plus_uconst, Const1 + Const2}
+  Ops[0] = dwarf::DW_OP_plus_uconst;
+  Ops[1] = 2;
+  Ops[2] = dwarf::DW_OP_plus_uconst;
+  Ops.push_back(3);
+  Expr = DIExpression::get(Context, Ops);
+  E = Expr->foldConstantMath();
+  ResOps.push_back(dwarf::DW_OP_plus_uconst);
+  ResOps.push_back(5);
+  ResExpr = DIExpression::get(Context, ResOps);
+  EXPECT_EQ(E, ResExpr);
+
+  // Test {DW_OP_constu, Const1, DW_OP_plus_uconst, Const2} -> {DW_OP_constu,
+  // Const1 + Const2}
+  Ops[0] = dwarf::DW_OP_constu;
+  Expr = DIExpression::get(Context, Ops);
+  E = Expr->foldConstantMath();
+  ResOps[0] = dwarf::DW_OP_constu;
+  ResExpr = DIExpression::get(Context, ResOps);
+  EXPECT_EQ(E, ResExpr);
+
+  // Test {DW_OP_constu, Const1, DW_OP_constu, Const2, DW_OP_plus} ->
+  // {DW_OP_constu, Const1 + Const2}
+  Ops[1] = 8;
+  Ops[2] = dwarf::DW_OP_constu;
+  Ops[3] = 2;
+  Ops.push_back(dwarf::DW_OP_plus);
+  Expr = DIExpression::get(Context, Ops);
+  E = Expr->foldConstantMath();
+  ResOps[0] = dwarf::DW_OP_constu;
+  ResOps[1] = 10;
+  ResExpr = DIExpression::get(Context, ResOps);
+  EXPECT_EQ(E, ResExpr);
+
+  // Test {DW_OP_constu, Const1, DW_OP_constu, Const2, DW_OP_minus} ->
+  // {DW_OP_constu, Const1 - Const2}
+  Ops[4] = dwarf::DW_OP_minus;
+  Expr = DIExpression::get(Context, Ops);
+  E = Expr->foldConstantMath();
+  ResOps[0] = dwarf::DW_OP_constu;
+  ResOps[1] = 6;
+  ResExpr = DIExpression::get(Context, ResOps);
+  EXPECT_EQ(E, ResExpr);
+
+  // Test {DW_OP_constu, Const1, DW_OP_constu, Const2, DW_OP_mul} ->
+  // {DW_OP_constu, Const1 * Const2}
+  Ops[4] = dwarf::DW_OP_mul;
+  Expr = DIExpression::get(Context, Ops);
+  E = Expr->foldConstantMath();
+  ResOps[0] = dwarf::DW_OP_constu;
+  ResOps[1] = 16;
+  ResExpr = DIExpression::get(Context, ResOps);
+  EXPECT_EQ(E, ResExpr);
+
+  // Test {DW_OP_constu, Const1, DW_OP_constu, Const2, DW_OP_div} ->
+  // {DW_OP_constu, Const1 / Const2}
+  Ops[4] = dwarf::DW_OP_div;
+  Expr = DIExpression::get(Context, Ops);
+  E = Expr->foldConstantMath();
+  ResOps[0] = dwarf::DW_OP_constu;
+  ResOps[1] = 4;
+  ResExpr = DIExpression::get(Context, ResOps);
+  EXPECT_EQ(E, ResExpr);
+
+  // Test {DW_OP_constu, Const1, DW_OP_constu, Const2, DW_OP_shl} ->
+  // {DW_OP_constu, Const1 << Const2}
+  Ops[4] = dwarf::DW_OP_shl;
+  Expr = DIExpression::get(Context, Ops);
+  E = Expr->foldConstantMath();
+  ResOps[0] = dwarf::DW_OP_constu;
+  ResOps[1] =...
[truncated]

}

// {DW_OP_const[u, s], Const1, DW_OP_plus_uconst Const2} -> {DW_OP_constu,
// Const1 + Const2}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't you need to take the signedness into account while doing the addition?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so, since all the Ops are uint64_t everything is in signed 2's complement and can be added right?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Starting in DWARF v5, the expression stack can have any base type. The default type or "generic type" is the size of an address, so not necessarily 64-bit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does DWARF handle under/overflow in its expressions?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With respect to the generic type, "except as otherwise specified, the arithmetic operations perform addressing arithmetic, that is, unsigned arithmetic that is performed module one plus the largest representable address." Also, "Operations do not cause an exception on overflow."

Copy link
Contributor Author

@rastogishubham rastogishubham Nov 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pogo59

If we look at DebugInfoMetadata.h line 2645:

class DIExpression : public MDNode {
  friend class LLVMContextImpl;
  friend class MDNode;

  std::vector<uint64_t> Elements;

While DWARF itself defines any base type, it seems like for LLVM, it is all still uint64_t right? Or am I missing something?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, this is operating on the metadata format, not the final DWARF. It would be good to have a test that folds to a value wider than 32 bits and show that emitting the final DWARF for a 32-bit target properly truncates.

}

// {DW_OP_const[u, s], Const1, DW_OP_plus_uconst Const2} -> {DW_OP_constu,
// Const1 + Const2}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does DWARF handle under/overflow in its expressions?

Copy link
Member

@jmorse jmorse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to record that I'm really glad this is being addressed; it feels like it could probably be made more readable and there are some ideas for that in reviews, I don't want to wade in too. My 2 cents from a high level -- transforming from old-expression into a new-expression rather than editing in-place might be easier.

@rastogishubham
Copy link
Contributor Author

@jmorse @adrian-prantl @pogo59 @felipepiovezan I think I have addressed most of the feedback that was given on this patch. Please have a look


// {DW_OP_const[u, s], Const1, DW_OP_[plus, mul], DW_OP_const[u, s], Const2,
// DW_OP_[plus, mul]} -> {DW_OP_constu, Const1 [+, *] Const2, DW_OP_[plus, mul]}
static bool tryFoldCommutativeMath(uint64_t Op1Raw, uint64_t Op1Arg,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are Op1 .. Op4 always WorkinOpts[0..4]?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, Op1 .. Op4 are relative, where Op1 is the first current Op in the sequence, and Op4 is the 4th one in the sequence, that is why we use Loc to figure out what to erase

Copy link
Collaborator

@pogo59 pogo59 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test doesn't look at any overflow/underflow cases.

@rastogishubham
Copy link
Contributor Author

To make things easier, I have changed the patch to just support unsigned operations for now, I can add signed operation support in a later patch

static bool isConstantVal(uint64_t Op) { return Op == dwarf::DW_OP_constu; }

// Returns true if an operation and operand result in a No Op.
static bool isNeutralElement(uint64_t Op, uint64_t Val) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could the first type be dwarf::LocationAtom?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes but what is the difference/advantage of that?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it's less easy to confuse the two uint64_t parameters. It's a very minor improvement.

return Operand1 - Operand2;
}
case dwarf::DW_OP_shl: {
if (Operand2 > 64)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is incorrect. You want the equivalent of "does this overflow?" here, too.

return Operand1 << Operand2;
}
case dwarf::DW_OP_shr: {
if (Operand2 > 64)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check is not needed, the result will still be null?

@rastogishubham rastogishubham force-pushed the DIExpressionFoldMath branch 3 times, most recently from 39ea838 to 8f4da5f Compare April 4, 2024 02:04
@rastogishubham rastogishubham force-pushed the DIExpressionFoldMath branch 2 times, most recently from 3b40e56 to 03eac36 Compare May 1, 2024 23:10
@rastogishubham rastogishubham force-pushed the DIExpressionFoldMath branch from 03eac36 to 21324f8 Compare May 10, 2024 23:23
Copy link

github-actions bot commented May 10, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@@ -0,0 +1,81 @@
//===- llvm/IR/DIExpressionOptimizer.h - Constant folding of DIExpressions --*-
//C++ -*-===//
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this only works if it's on the first line. Just remove the llvm/IR/ bit.

using namespace llvm;

/// Returns true if the Op is a DW_OP_constu.
bool isConstantVal(uint64_t Op);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These functions probably don't need to be all public. If you just move the entry-level function into DIExpressionOptimizer.cpp then these can all be static methods?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And then you probably don't even need a new header file.

@rastogishubham rastogishubham force-pushed the DIExpressionFoldMath branch 2 times, most recently from 04bb1b4 to 5ac6bf9 Compare May 15, 2024 17:46
@rastogishubham rastogishubham force-pushed the DIExpressionFoldMath branch 2 times, most recently from ce9a2fe to 3b03b65 Compare May 23, 2024 22:30
rastogishubham added a commit that referenced this pull request May 29, 2024
This is an NFC patch to move DIExpressionCursor to DebugInfoMetada.h, so
that it can be used by classes in that header file.

Specifically, I want to use DIExpressionCursor in a subsequent patch:
#71718
@rastogishubham rastogishubham force-pushed the DIExpressionFoldMath branch from 3b03b65 to accb2f3 Compare May 29, 2024 22:43
DIExpressions can get very long and have a lot of redundant operations.
This function uses simple pattern matching to fold constant math that
can be evaluated at compile time.
@rastogishubham rastogishubham force-pushed the DIExpressionFoldMath branch from accb2f3 to f2afb34 Compare May 29, 2024 22:48
@rastogishubham rastogishubham merged commit b12f81b into llvm:main May 29, 2024
4 of 6 checks passed
@rastogishubham rastogishubham deleted the DIExpressionFoldMath branch May 29, 2024 23:10
rastogishubham added a commit that referenced this pull request May 29, 2024
…1719)

This patch uses `DIExpression::foldConstantMath()` at the end of a
`DIExpression::append()`. Which should help in reducing the size of
DIExpressions that grow because of salvaging debug info

This is part of a stack of patches and comes after:
#69768
#71717
#71718
alanzhao1 pushed a commit to alanzhao1/llvm-project that referenced this pull request May 29, 2024
…sion (llvm#71721)

This patch uses `DIExpression::foldConstantMath()` at the result of a
Salvaged expression, that is, it runs the folding optimizations after an
expression has been salvaged completely, to reduce how many times the
fold optimization function is called. Which should help in reducing the
size of DIExpressions that grow because of salvaging debug info

After checking the size of the dSYM with and without this change, I saw
a decrease of about 300KB, where the debug_loc section is about 1.6 GB
in size.

Where the debug loc section reduced in size by 212KB and it is 193MB in
size, the rest comes from the debug_info section

This is part of a stack of patches and comes after:
llvm#69768
llvm#71717
llvm#71718
llvm#71719
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants