Introduce DIExpression::foldConstantMath() #71718

rastogishubham · 2023-11-08T18:36:27Z

DIExpressions can get very long and have a lot of redundant operations. This function uses simple pattern matching to fold constant math that can be evaluated at compile time.

The hope is that other people can contribute other patterns as well.

I also couldn't see a good way of combining this with DIExpression::constantFold so it stands alone.

This is part of a stack of patches and comes after
#69768
#71717

llvmbot · 2023-11-08T18:36:58Z

@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-debuginfo

Author: Shubham Sandeep Rastogi (rastogishubham)

Changes

DIExpressions can get very long and have a lot of redundant operations. This function uses simple pattern matching to fold constant math that can be evaluated at compile time.

The hope is that other people can contribute other patterns as well.

I also couldn't see a good way of combining this with DIExpression::constantFold so it stands alone.

This is part of a stack of patches and comes after
#69768
#71717

Patch is 25.16 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/71718.diff

4 Files Affected:

(modified) llvm/include/llvm/IR/DebugInfoMetadata.h (+85)
(modified) llvm/lib/CodeGen/AsmPrinter/DwarfExpression.h (-61)
(modified) llvm/lib/IR/DebugInfoMetadata.cpp (+280-1)
(modified) llvm/unittests/IR/MetadataTest.cpp (+277)

diff --git a/llvm/include/llvm/IR/DebugInfoMetadata.h b/llvm/include/llvm/IR/DebugInfoMetadata.h
index b347664883fd9f2..354ae48c3d676c1 100644
--- a/llvm/include/llvm/IR/DebugInfoMetadata.h
+++ b/llvm/include/llvm/IR/DebugInfoMetadata.h
@@ -3044,6 +3044,11 @@ class DIExpression : public MDNode {
   /// expression and constant on failure.
   std::pair<DIExpression *, const ConstantInt *>
   constantFold(const ConstantInt *CI);
+
+  /// Try to shorten an expression with constand math operations that can be
+  /// evaulated at compile time. Returns a new expression on success, or the old
+  /// expression if there is nothing to be reduced.
+  DIExpression *foldConstantMath();
 };
 
 inline bool operator==(const DIExpression::FragmentInfo &A,
@@ -3073,6 +3078,86 @@ template <> struct DenseMapInfo<DIExpression::FragmentInfo> {
   static bool isEqual(const FragInfo &A, const FragInfo &B) { return A == B; }
 };
 
+/// Holds a DIExpression and keeps track of how many operands have been consumed
+/// so far.
+class DIExpressionCursor {
+  DIExpression::expr_op_iterator Start, End;
+
+public:
+  DIExpressionCursor(const DIExpression *Expr) {
+    if (!Expr) {
+      assert(Start == End);
+      return;
+    }
+    Start = Expr->expr_op_begin();
+    End = Expr->expr_op_end();
+  }
+
+  DIExpressionCursor(ArrayRef<uint64_t> Expr)
+      : Start(Expr.begin()), End(Expr.end()) {}
+
+  DIExpressionCursor(const DIExpressionCursor &) = default;
+
+  /// Consume one operation.
+  std::optional<DIExpression::ExprOperand> take() {
+    if (Start == End)
+      return std::nullopt;
+    return *(Start++);
+  }
+
+  /// Consume N operations.
+  void consume(unsigned N) { std::advance(Start, N); }
+
+  /// Return the current operation.
+  std::optional<DIExpression::ExprOperand> peek() const {
+    if (Start == End)
+      return std::nullopt;
+    return *(Start);
+  }
+
+  /// Return the next operation.
+  std::optional<DIExpression::ExprOperand> peekNext() const {
+    if (Start == End)
+      return std::nullopt;
+
+    auto Next = Start.getNext();
+    if (Next == End)
+      return std::nullopt;
+
+    return *Next;
+  }
+
+  std::optional<DIExpression::ExprOperand> peekNextN(unsigned N) const {
+    if (Start == End)
+      return std::nullopt;
+    DIExpression::expr_op_iterator Nth = Start;
+    for (unsigned I = 0; I < N; I++) {
+      Nth = Nth.getNext();
+      if (Nth == End)
+        return std::nullopt;
+    }
+    return *Nth;
+  }
+
+  void assignNewExpr(ArrayRef<uint64_t> Expr) {
+    DIExpression::expr_op_iterator NewStart(Expr.begin());
+    DIExpression::expr_op_iterator NewEnd(Expr.end());
+    this->Start = NewStart;
+    this->End = NewEnd;
+  }
+
+  /// Determine whether there are any operations left in this expression.
+  operator bool() const { return Start != End; }
+
+  DIExpression::expr_op_iterator begin() const { return Start; }
+  DIExpression::expr_op_iterator end() const { return End; }
+
+  /// Retrieve the fragment information, if any.
+  std::optional<DIExpression::FragmentInfo> getFragmentInfo() const {
+    return DIExpression::getFragmentInfo(Start, End);
+  }
+};
+
 /// Global variables.
 ///
 /// TODO: Remove DisplayName.  It's always equal to Name.
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfExpression.h b/llvm/lib/CodeGen/AsmPrinter/DwarfExpression.h
index 667a9efc6f6c04f..4daa78b15b8e298 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfExpression.h
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfExpression.h
@@ -31,67 +31,6 @@ class DIELoc;
 class TargetRegisterInfo;
 class MachineLocation;
 
-/// Holds a DIExpression and keeps track of how many operands have been consumed
-/// so far.
-class DIExpressionCursor {
-  DIExpression::expr_op_iterator Start, End;
-
-public:
-  DIExpressionCursor(const DIExpression *Expr) {
-    if (!Expr) {
-      assert(Start == End);
-      return;
-    }
-    Start = Expr->expr_op_begin();
-    End = Expr->expr_op_end();
-  }
-
-  DIExpressionCursor(ArrayRef<uint64_t> Expr)
-      : Start(Expr.begin()), End(Expr.end()) {}
-
-  DIExpressionCursor(const DIExpressionCursor &) = default;
-
-  /// Consume one operation.
-  std::optional<DIExpression::ExprOperand> take() {
-    if (Start == End)
-      return std::nullopt;
-    return *(Start++);
-  }
-
-  /// Consume N operations.
-  void consume(unsigned N) { std::advance(Start, N); }
-
-  /// Return the current operation.
-  std::optional<DIExpression::ExprOperand> peek() const {
-    if (Start == End)
-      return std::nullopt;
-    return *(Start);
-  }
-
-  /// Return the next operation.
-  std::optional<DIExpression::ExprOperand> peekNext() const {
-    if (Start == End)
-      return std::nullopt;
-
-    auto Next = Start.getNext();
-    if (Next == End)
-      return std::nullopt;
-
-    return *Next;
-  }
-
-  /// Determine whether there are any operations left in this expression.
-  operator bool() const { return Start != End; }
-
-  DIExpression::expr_op_iterator begin() const { return Start; }
-  DIExpression::expr_op_iterator end() const { return End; }
-
-  /// Retrieve the fragment information, if any.
-  std::optional<DIExpression::FragmentInfo> getFragmentInfo() const {
-    return DIExpression::getFragmentInfo(Start, End);
-  }
-};
-
 /// Base class containing the logic for constructing DWARF expressions
 /// independently of whether they are emitted into a DIE or into a .debug_loc
 /// entry.
diff --git a/llvm/lib/IR/DebugInfoMetadata.cpp b/llvm/lib/IR/DebugInfoMetadata.cpp
index f7f36129ec8557c..b73ffa3880fcc4a 100644
--- a/llvm/lib/IR/DebugInfoMetadata.cpp
+++ b/llvm/lib/IR/DebugInfoMetadata.cpp
@@ -1857,7 +1857,6 @@ DIExpression *DIExpression::append(const DIExpression *Expr,
     }
     Op.appendToVector(NewOps);
   }
-
   NewOps.append(Ops.begin(), Ops.end());
   auto *result = DIExpression::get(Expr->getContext(), NewOps);
   assert(result->isValid() && "concatenated expression is not valid");
@@ -1998,6 +1997,286 @@ DIExpression::constantFold(const ConstantInt *CI) {
           ConstantInt::get(getContext(), NewInt)};
 }
 
+static bool isConstOperation(uint64_t Op) {
+  return Op == dwarf::DW_OP_constu || Op == dwarf::DW_OP_consts;
+}
+
+static bool operationIsNoOp(uint64_t Op, uint64_t Val) {
+  switch (Op) {
+  case dwarf::DW_OP_plus:
+  case dwarf::DW_OP_minus:
+  case dwarf::DW_OP_shl:
+  case dwarf::DW_OP_shr:
+    return Val == 0;
+  case dwarf::DW_OP_mul:
+  case dwarf::DW_OP_div:
+    return Val == 1;
+  default:
+    return false;
+  }
+}
+
+static std::optional<uint64_t>
+foldOperationIfPossible(uint64_t Op, uint64_t Operand1, uint64_t Operand2) {
+  switch (Op) {
+  case dwarf::DW_OP_plus:
+    return Operand1 + Operand2;
+  case dwarf::DW_OP_minus:
+    return Operand1 - Operand2;
+  case dwarf::DW_OP_shl:
+    return Operand1 << Operand2;
+  case dwarf::DW_OP_shr:
+    return Operand1 >> Operand2;
+  case dwarf::DW_OP_mul:
+    return Operand1 * Operand2;
+  case dwarf::DW_OP_div:
+    return Operand1 / Operand2;
+  default:
+    return std::nullopt;
+  }
+}
+
+static bool operationsAreFoldableAndCommutative(uint64_t Op1, uint64_t Op2) {
+  if (Op1 != Op2)
+    return false;
+  switch (Op1) {
+  case dwarf::DW_OP_plus:
+  case dwarf::DW_OP_mul:
+    return true;
+  default:
+    return false;
+  }
+}
+
+static void moveCursorAndLocPos(DIExpressionCursor &Cursor, uint64_t &Loc,
+                                const DIExpression::ExprOperand &Op) {
+  Cursor.consume(1);
+  Loc = Loc + Op.getSize();
+}
+
+DIExpression *DIExpression::foldConstantMath() {
+
+  SmallVector<uint64_t, 8> WorkingOps;
+  WorkingOps.append(Elements.begin(), Elements.end());
+  uint64_t Loc = 0;
+  DIExpressionCursor Cursor(WorkingOps);
+
+  while (Loc < WorkingOps.size()) {
+
+    auto Op1 = Cursor.peek();
+    // Expression has no operations, exit
+    if (!Op1)
+      break;
+    auto Op1Raw = Op1->getOp();
+    auto Op1Arg = Op1->getArg(0);
+
+    // {DW_OP_plus_uconst, 0} -> {}
+    if (Op1Raw == dwarf::DW_OP_plus_uconst && Op1Arg == 0) {
+      WorkingOps.erase(WorkingOps.begin() + Loc, WorkingOps.begin() + Loc + 2);
+      moveCursorAndLocPos(Cursor, Loc, *Op1);
+      continue;
+    }
+
+    if (!isConstOperation(Op1Raw) && Op1Raw != dwarf::DW_OP_plus_uconst) {
+      // Early exit, all of the following patterns start with a constant value
+      moveCursorAndLocPos(Cursor, Loc, *Op1);
+      continue;
+    }
+
+    auto Op2 = Cursor.peekNext();
+    // All following patterns require at least 2 Operations, exit
+    if (!Op2)
+      break;
+    auto Op2Raw = Op2->getOp();
+
+    // {DW_OP_const[u, s], 0, DW_OP_[plus, minus, shl, shr]} -> {}
+    // {DW_OP_const[u, s], 1, DW_OP_[mul, div]} -> {}
+    if (isConstOperation(Op1Raw) && operationIsNoOp(Op2Raw, Op1Arg)) {
+      WorkingOps.erase(WorkingOps.begin() + Loc, WorkingOps.begin() + Loc + 3);
+      Cursor.assignNewExpr(WorkingOps);
+      Loc = 0;
+      continue;
+    }
+
+    auto Op2Arg = Op2->getArg(0);
+
+    // {DW_OP_plus_uconst, Const1, DW_OP_plus_uconst, Const2} ->
+    // {DW_OP_plus_uconst, Const1 + Const2}
+    if (Op1Raw == dwarf::DW_OP_plus_uconst &&
+        Op2Raw == dwarf::DW_OP_plus_uconst) {
+      auto Result = Op1Arg + Op2Arg;
+      WorkingOps.erase(WorkingOps.begin() + Loc + 2,
+                       WorkingOps.begin() + Loc + 4);
+      WorkingOps[Loc + 1] = Result;
+      Cursor.assignNewExpr(WorkingOps);
+      Loc = 0;
+      continue;
+    }
+
+    // {DW_OP_const[u, s], Const1, DW_OP_plus_uconst Const2} -> {DW_OP_constu,
+    // Const1 + Const2}
+    if (isConstOperation(Op1Raw) && Op2Raw == dwarf::DW_OP_plus_uconst) {
+      auto Result = Op1Arg + Op2Arg;
+      WorkingOps.erase(WorkingOps.begin() + Loc + 2,
+                       WorkingOps.begin() + Loc + 4);
+      WorkingOps[Loc] = dwarf::DW_OP_constu;
+      WorkingOps[Loc + 1] = Result;
+      Cursor.assignNewExpr(WorkingOps);
+      Loc = 0;
+      continue;
+    }
+
+    auto Op3 = Cursor.peekNextN(2);
+    // Op2 could still match a pattern, skip iteration
+    if (!Op3) {
+      moveCursorAndLocPos(Cursor, Loc, *Op1);
+      continue;
+    }
+    auto Op3Raw = Op3->getOp();
+
+    // {DW_OP_const[u, s], Const1, DW_OP_const[u, s], Const2, DW_OP_[plus,
+    // minus, mul, div, shl, shr] -> {DW_OP_constu, Const1 [+, -, *, /, <<, >>]
+    // Const2}
+    if (isConstOperation(Op1Raw) && isConstOperation(Op2Raw)) {
+      auto Result = foldOperationIfPossible(Op3Raw, Op1Arg, Op2Arg);
+      if (!Result) {
+        moveCursorAndLocPos(Cursor, Loc, *Op1);
+        continue;
+      }
+      WorkingOps.erase(WorkingOps.begin() + Loc + 2,
+                       WorkingOps.begin() + Loc + 5);
+      WorkingOps[Loc] = dwarf::DW_OP_constu;
+      WorkingOps[Loc + 1] = *Result;
+      Cursor.assignNewExpr(WorkingOps);
+      Loc = 0;
+      continue;
+    }
+
+    // {DW_OP_plus_uconst, Const1, DW_OP_const[u, s], Const1, DW_OP_plus} ->
+    // {DW_OP_plus_uconst, Const1 + Const2}
+    if (Op1Raw == dwarf::DW_OP_plus_uconst && isConstOperation(Op2Raw) &&
+        Op3Raw == dwarf::DW_OP_plus) {
+      auto Result = Op1Arg + Op2Arg;
+      WorkingOps.erase(WorkingOps.begin() + Loc + 2, WorkingOps.begin() + 5);
+      WorkingOps[Loc + 1] = Result;
+      Cursor.assignNewExpr(WorkingOps);
+      Loc = 0;
+      continue;
+    }
+
+    auto Op3Arg = Op3->getArg(0);
+    // {DW_OP_const[u, s], Const1, DW_OP_plus, DW_OP_plus_uconst, Const2} ->
+    // {DW_OP_plus_uconst, Const1 + Const2}
+    if (isConstOperation(Op1Raw) && Op2Raw == dwarf::DW_OP_plus &&
+        Op3Raw == dwarf::DW_OP_plus_uconst) {
+      auto Result = Op1Arg + Op3Arg;
+      WorkingOps.erase(WorkingOps.begin() + Loc + 2, WorkingOps.begin() + 5);
+      WorkingOps[Loc] = dwarf::DW_OP_plus_uconst;
+      WorkingOps[Loc + 1] = Result;
+      Cursor.assignNewExpr(WorkingOps);
+      Loc = 0;
+      continue;
+    }
+
+    auto Op4 = Cursor.peekNextN(3);
+    // Op2 and Op3 could still match a pattern, skip iteration
+    if (!Op4) {
+      moveCursorAndLocPos(Cursor, Loc, *Op1);
+      continue;
+    }
+    auto Op4Raw = Op4->getOp();
+
+    // {DW_OP_const[u, s], Const1, DW_OP_[plus, mul], Const2, DW_OP_[plus, mul]}
+    // -> {DW_OP_constu, Const1 [+, *] Const2, DW_OP_[plus, mul]}
+    if (isConstOperation(Op1Raw) && isConstOperation(Op3Raw) &&
+        operationsAreFoldableAndCommutative(Op2Raw, Op4Raw)) {
+      auto Result = foldOperationIfPossible(Op2Raw, Op1Arg, Op3Arg);
+      if (!Result)
+        llvm_unreachable(
+            "Something went very wrong here! This should always work");
+      WorkingOps.erase(WorkingOps.begin() + Loc + 3,
+                       WorkingOps.begin() + Loc + 6);
+      WorkingOps[Loc] = dwarf::DW_OP_constu;
+      WorkingOps[Loc + 1] = *Result;
+      Cursor.assignNewExpr(WorkingOps);
+      Loc = 0;
+      continue;
+    }
+    auto Op5 = Cursor.peekNextN(4);
+    if (!Op5) {
+      moveCursorAndLocPos(Cursor, Loc, *Op1);
+      continue;
+    }
+    auto Op5Raw = Op5->getOp();
+    auto Op5Arg = Op5->getArg(0);
+
+    // {DW_OP_const[u, s], Const1, DW_OP_plus, DW_OP_LLVM_arg, Arg1, DW_OP_plus,
+    // DW_OP_plus_uconst, Const2} -> {DW_OP_constu, Const1 + Const2, DW_OP_plus,
+    // DW_OP_LLVM_arg, Arg1, DW_OP_plus}
+    if (isConstOperation(Op1Raw) && Op3Raw == dwarf::DW_OP_LLVM_arg &&
+        Op5Raw == dwarf::DW_OP_plus_uconst && Op2Raw == Op4Raw &&
+        Op4Raw == dwarf::DW_OP_plus) {
+      auto Result = Op1Arg + Op5Arg;
+      WorkingOps.erase(WorkingOps.begin() + Loc + 6,
+                       WorkingOps.begin() + Loc + 8);
+      WorkingOps[Loc] = dwarf::DW_OP_constu;
+      WorkingOps[Loc + 1] = Result;
+      Cursor.assignNewExpr(WorkingOps);
+      Loc = 0;
+      continue;
+    }
+
+    auto Op4Arg = Op4->getArg(0);
+
+    // {DW_OP_plus_uconst, Const1, DW_OP_LLVM_arg, Arg1, DW_OP_plus,
+    // DW_OP_const[u, s], Const2, DW_OP_plus} -> {DW_OP_plus_uconst, Const1 +
+    // Const2, DW_OP_LLVM_arg, Arg1, DW_OP_plus}
+    if (Op1Raw == dwarf::DW_OP_plus_uconst && Op2Raw == dwarf::DW_OP_LLVM_arg &&
+        Op3Raw == Op5Raw && Op3Raw == dwarf::DW_OP_plus &&
+        isConstOperation(Op4Raw)) {
+      auto Result = Op1Arg + Op4Arg;
+      WorkingOps.erase(WorkingOps.begin() + Loc + 5,
+                       WorkingOps.begin() + Loc + 8);
+      WorkingOps[Loc + 1] = Result;
+      Cursor.assignNewExpr(WorkingOps);
+      Loc = 0;
+      continue;
+    }
+
+    auto Op6 = Cursor.peekNextN(5);
+    if (!Op6) {
+      moveCursorAndLocPos(Cursor, Loc, *Op1);
+      continue;
+    }
+    auto Op6Raw = Op6->getOp();
+    // {DW_OP_const[u, s], Const1, DW_OP_[plus, mul], DW_OP_LLVM_arg, Arg1,
+    // DW_OP_[plus, mul], DW_OP_const[u, s], Const2, DW_OP_[plus, mul]} ->
+    // {DW_OP_constu, Const1 [+, *] Const2, DW_OP_[plus, mul], DW_OP_LLVM_arg,
+    // Arg1, DW_OP_[plus, mul]}
+    if (isConstOperation(Op1Raw) && Op3Raw == dwarf::DW_OP_LLVM_arg &&
+        isConstOperation(Op5Raw) &&
+        operationsAreFoldableAndCommutative(Op2Raw, Op4Raw) &&
+        operationsAreFoldableAndCommutative(Op4Raw, Op6Raw)) {
+      auto Result = foldOperationIfPossible(Op2Raw, Op1Arg, Op5Arg);
+      if (!Result)
+        llvm_unreachable(
+            "Something went very wrong here! This should always work");
+      WorkingOps.erase(WorkingOps.begin() + Loc + 6,
+                       WorkingOps.begin() + Loc + 9);
+      WorkingOps[Loc] = dwarf::DW_OP_constu;
+      WorkingOps[Loc + 1] = *Result;
+      Cursor.assignNewExpr(WorkingOps);
+      Loc = 0;
+      continue;
+    }
+
+    moveCursorAndLocPos(Cursor, Loc, *Op1);
+  }
+  auto *Result = DIExpression::get(getContext(), WorkingOps);
+  assert(Result->isValid() && "concatenated expression is not valid");
+  return Result;
+}
+
 uint64_t DIExpression::getNumLocationOperands() const {
   uint64_t Result = 0;
   for (auto ExprOp : expr_ops())
diff --git a/llvm/unittests/IR/MetadataTest.cpp b/llvm/unittests/IR/MetadataTest.cpp
index 1570631287b2a78..2326c5c2c78c630 100644
--- a/llvm/unittests/IR/MetadataTest.cpp
+++ b/llvm/unittests/IR/MetadataTest.cpp
@@ -3120,6 +3120,283 @@ TEST_F(DIExpressionTest, get) {
   EXPECT_EQ(N0WithPrependedOps, N2);
 }
 
+TEST_F(DIExpressionTest, Fold) {
+
+  // Remove a No-op DW_OP_plus_uconst from an expression.
+  SmallVector<uint64_t, 8> Ops = {dwarf::DW_OP_plus_uconst, 0};
+  auto *Expr = DIExpression::get(Context, Ops);
+  auto *E = Expr->foldConstantMath();
+  SmallVector<uint64_t, 8> ResOps;
+  auto *ResExpr = DIExpression::get(Context, ResOps);
+  EXPECT_EQ(E, ResExpr);
+
+  // Remove a No-op add from an expression.
+  Ops[0] = dwarf::DW_OP_constu;
+  Ops[1] = 0;
+  Ops.push_back(dwarf::DW_OP_plus);
+  Expr = DIExpression::get(Context, Ops);
+  E = Expr->foldConstantMath();
+  ResExpr = DIExpression::get(Context, ResOps);
+  EXPECT_EQ(E, ResExpr);
+
+  // Remove a No-op subtract from an expression.
+  Ops[2] = dwarf::DW_OP_minus;
+  Expr = DIExpression::get(Context, Ops);
+  E = Expr->foldConstantMath();
+  ResExpr = DIExpression::get(Context, ResOps);
+  EXPECT_EQ(E, ResExpr);
+
+  // Remove a No-op shift left from an expression.
+  Ops[2] = dwarf::DW_OP_shl;
+  Expr = DIExpression::get(Context, Ops);
+  E = Expr->foldConstantMath();
+  ResExpr = DIExpression::get(Context, ResOps);
+  EXPECT_EQ(E, ResExpr);
+
+  // Remove a No-op shift right from an expression.
+  Ops[2] = dwarf::DW_OP_shr;
+  Expr = DIExpression::get(Context, Ops);
+  E = Expr->foldConstantMath();
+  ResExpr = DIExpression::get(Context, ResOps);
+  EXPECT_EQ(E, ResExpr);
+
+  // Remove a No-op multiply from an expression.
+  Ops[2] = dwarf::DW_OP_mul;
+  Ops[1] = 1;
+  Expr = DIExpression::get(Context, Ops);
+  E = Expr->foldConstantMath();
+  ResExpr = DIExpression::get(Context, ResOps);
+  EXPECT_EQ(E, ResExpr);
+
+  // Remove a No-op divide from an expression.
+  Ops[2] = dwarf::DW_OP_div;
+  Expr = DIExpression::get(Context, Ops);
+  E = Expr->foldConstantMath();
+  ResExpr = DIExpression::get(Context, ResOps);
+  EXPECT_EQ(E, ResExpr);
+
+  // Test fold {DW_OP_plus_uconst, Const1, DW_OP_plus_uconst, Const2} ->
+  // {DW_OP_plus_uconst, Const1 + Const2}
+  Ops[0] = dwarf::DW_OP_plus_uconst;
+  Ops[1] = 2;
+  Ops[2] = dwarf::DW_OP_plus_uconst;
+  Ops.push_back(3);
+  Expr = DIExpression::get(Context, Ops);
+  E = Expr->foldConstantMath();
+  ResOps.push_back(dwarf::DW_OP_plus_uconst);
+  ResOps.push_back(5);
+  ResExpr = DIExpression::get(Context, ResOps);
+  EXPECT_EQ(E, ResExpr);
+
+  // Test {DW_OP_constu, Const1, DW_OP_plus_uconst, Const2} -> {DW_OP_constu,
+  // Const1 + Const2}
+  Ops[0] = dwarf::DW_OP_constu;
+  Expr = DIExpression::get(Context, Ops);
+  E = Expr->foldConstantMath();
+  ResOps[0] = dwarf::DW_OP_constu;
+  ResExpr = DIExpression::get(Context, ResOps);
+  EXPECT_EQ(E, ResExpr);
+
+  // Test {DW_OP_constu, Const1, DW_OP_constu, Const2, DW_OP_plus} ->
+  // {DW_OP_constu, Const1 + Const2}
+  Ops[1] = 8;
+  Ops[2] = dwarf::DW_OP_constu;
+  Ops[3] = 2;
+  Ops.push_back(dwarf::DW_OP_plus);
+  Expr = DIExpression::get(Context, Ops);
+  E = Expr->foldConstantMath();
+  ResOps[0] = dwarf::DW_OP_constu;
+  ResOps[1] = 10;
+  ResExpr = DIExpression::get(Context, ResOps);
+  EXPECT_EQ(E, ResExpr);
+
+  // Test {DW_OP_constu, Const1, DW_OP_constu, Const2, DW_OP_minus} ->
+  // {DW_OP_constu, Const1 - Const2}
+  Ops[4] = dwarf::DW_OP_minus;
+  Expr = DIExpression::get(Context, Ops);
+  E = Expr->foldConstantMath();
+  ResOps[0] = dwarf::DW_OP_constu;
+  ResOps[1] = 6;
+  ResExpr = DIExpression::get(Context, ResOps);
+  EXPECT_EQ(E, ResExpr);
+
+  // Test {DW_OP_constu, Const1, DW_OP_constu, Const2, DW_OP_mul} ->
+  // {DW_OP_constu, Const1 * Const2}
+  Ops[4] = dwarf::DW_OP_mul;
+  Expr = DIExpression::get(Context, Ops);
+  E = Expr->foldConstantMath();
+  ResOps[0] = dwarf::DW_OP_constu;
+  ResOps[1] = 16;
+  ResExpr = DIExpression::get(Context, ResOps);
+  EXPECT_EQ(E, ResExpr);
+
+  // Test {DW_OP_constu, Const1, DW_OP_constu, Const2, DW_OP_div} ->
+  // {DW_OP_constu, Const1 / Const2}
+  Ops[4] = dwarf::DW_OP_div;
+  Expr = DIExpression::get(Context, Ops);
+  E = Expr->foldConstantMath();
+  ResOps[0] = dwarf::DW_OP_constu;
+  ResOps[1] = 4;
+  ResExpr = DIExpression::get(Context, ResOps);
+  EXPECT_EQ(E, ResExpr);
+
+  // Test {DW_OP_constu, Const1, DW_OP_constu, Const2, DW_OP_shl} ->
+  // {DW_OP_constu, Const1 << Const2}
+  Ops[4] = dwarf::DW_OP_shl;
+  Expr = DIExpression::get(Context, Ops);
+  E = Expr->foldConstantMath();
+  ResOps[0] = dwarf::DW_OP_constu;
+  ResOps[1] =...
[truncated]

llvm/include/llvm/IR/DebugInfoMetadata.h

llvm/lib/IR/DebugInfoMetadata.cpp

adrian-prantl · 2023-11-11T01:08:47Z

llvm/lib/IR/DebugInfoMetadata.cpp

+    }
+
+    // {DW_OP_const[u, s], Const1, DW_OP_plus_uconst Const2} -> {DW_OP_constu,
+    // Const1 + Const2}


don't you need to take the signedness into account while doing the addition?

I don't think so, since all the Ops are uint64_t everything is in signed 2's complement and can be added right?

Starting in DWARF v5, the expression stack can have any base type. The default type or "generic type" is the size of an address, so not necessarily 64-bit.

How does DWARF handle under/overflow in its expressions?

With respect to the generic type, "except as otherwise specified, the arithmetic operations perform addressing arithmetic, that is, unsigned arithmetic that is performed module one plus the largest representable address." Also, "Operations do not cause an exception on overflow."

@pogo59

If we look at DebugInfoMetadata.h line 2645:

class DIExpression : public MDNode { friend class LLVMContextImpl; friend class MDNode; std::vector<uint64_t> Elements;

While DWARF itself defines any base type, it seems like for LLVM, it is all still uint64_t right? Or am I missing something?

Right, this is operating on the metadata format, not the final DWARF. It would be good to have a test that folds to a value wider than 32 bits and show that emitting the final DWARF for a 32-bit target properly truncates.

llvm/lib/IR/DebugInfoMetadata.cpp

felipepiovezan · 2023-11-13T22:46:37Z

llvm/lib/IR/DebugInfoMetadata.cpp

+    }
+
+    // {DW_OP_const[u, s], Const1, DW_OP_plus_uconst Const2} -> {DW_OP_constu,
+    // Const1 + Const2}


How does DWARF handle under/overflow in its expressions?

jmorse

Just to record that I'm really glad this is being addressed; it feels like it could probably be made more readable and there are some ideas for that in reviews, I don't want to wade in too. My 2 cents from a high level -- transforming from old-expression into a new-expression rather than editing in-place might be easier.

llvm/lib/IR/DebugInfoMetadata.cpp

rastogishubham · 2024-03-09T01:04:51Z

@jmorse @adrian-prantl @pogo59 @felipepiovezan I think I have addressed most of the feedback that was given on this patch. Please have a look

llvm/lib/IR/DebugInfoMetadata.cpp

adrian-prantl · 2024-03-09T01:23:34Z

llvm/lib/IR/DebugInfoMetadata.cpp

+
+// {DW_OP_const[u, s], Const1, DW_OP_[plus, mul], DW_OP_const[u, s], Const2,
+// DW_OP_[plus, mul]} -> {DW_OP_constu, Const1 [+, *] Const2, DW_OP_[plus, mul]}
+static bool tryFoldCommutativeMath(uint64_t Op1Raw, uint64_t Op1Arg,


Are Op1 .. Op4 always WorkinOpts[0..4]?

No, Op1 .. Op4 are relative, where Op1 is the first current Op in the sequence, and Op4 is the 4th one in the sequence, that is why we use Loc to figure out what to erase

pogo59

The test doesn't look at any overflow/underflow cases.

llvm/unittests/IR/MetadataTest.cpp

rastogishubham · 2024-03-14T20:13:24Z

To make things easier, I have changed the patch to just support unsigned operations for now, I can add signed operation support in a later patch

llvm/include/llvm/IR/DebugInfoMetadata.h

llvm/lib/IR/DebugInfoMetadata.cpp

adrian-prantl · 2024-03-15T15:50:32Z

llvm/lib/IR/DebugInfoMetadata.cpp

+static bool isConstantVal(uint64_t Op) { return Op == dwarf::DW_OP_constu; }
+
+// Returns true if an operation and operand result in a No Op.
+static bool isNeutralElement(uint64_t Op, uint64_t Val) {


could the first type be dwarf::LocationAtom?

Yes but what is the difference/advantage of that?

So it's less easy to confuse the two uint64_t parameters. It's a very minor improvement.

llvm/lib/IR/DebugInfoMetadata.cpp

adrian-prantl · 2024-03-15T15:53:31Z

llvm/lib/IR/DebugInfoMetadata.cpp

+    return Operand1 - Operand2;
+  }
+  case dwarf::DW_OP_shl: {
+    if (Operand2 > 64)


I think this is incorrect. You want the equivalent of "does this overflow?" here, too.

adrian-prantl · 2024-03-15T15:54:01Z

llvm/lib/IR/DebugInfoMetadata.cpp

+    return Operand1 << Operand2;
+  }
+  case dwarf::DW_OP_shr: {
+    if (Operand2 > 64)


This check is not needed, the result will still be null?

llvm/lib/IR/DebugInfoMetadata.cpp

github-actions · 2024-05-10T23:26:25Z

✅ With the latest revision this PR passed the C/C++ code formatter.

adrian-prantl · 2024-05-11T00:43:16Z

llvm/include/llvm/IR/DIExpressionOptimizer.h

@@ -0,0 +1,81 @@
+//===- llvm/IR/DIExpressionOptimizer.h - Constant folding of DIExpressions --*-
+//C++ -*-===//


this only works if it's on the first line. Just remove the llvm/IR/ bit.

adrian-prantl · 2024-05-11T00:44:58Z

llvm/include/llvm/IR/DIExpressionOptimizer.h

+using namespace llvm;
+
+/// Returns true if the Op is a DW_OP_constu.
+bool isConstantVal(uint64_t Op);


These functions probably don't need to be all public. If you just move the entry-level function into DIExpressionOptimizer.cpp then these can all be static methods?

And then you probably don't even need a new header file.

This is an NFC patch to move DIExpressionCursor to DebugInfoMetada.h, so that it can be used by classes in that header file. Specifically, I want to use DIExpressionCursor in a subsequent patch: #71718

DIExpressions can get very long and have a lot of redundant operations. This function uses simple pattern matching to fold constant math that can be evaluated at compile time.

…1719) This patch uses `DIExpression::foldConstantMath()` at the end of a `DIExpression::append()`. Which should help in reducing the size of DIExpressions that grow because of salvaging debug info This is part of a stack of patches and comes after: #69768 #71717 #71718

…sion (llvm#71721) This patch uses `DIExpression::foldConstantMath()` at the result of a Salvaged expression, that is, it runs the folding optimizations after an expression has been salvaged completely, to reduce how many times the fold optimization function is called. Which should help in reducing the size of DIExpressions that grow because of salvaging debug info After checking the size of the dSYM with and without this change, I saw a decrease of about 300KB, where the debug_loc section is about 1.6 GB in size. Where the debug loc section reduced in size by 212KB and it is 193MB in size, the rest comes from the debug_info section This is part of a stack of patches and comes after: llvm#69768 llvm#71717 llvm#71718 llvm#71719

llvmbot added debuginfo llvm:ir labels Nov 8, 2023

This was referenced Nov 8, 2023

Use DIExpression::foldConstantMath() at the result of an append() #71719

Merged

Use DIExpression::foldConstantMath at the result of a Salvaged expression #71721

Merged

rastogishubham requested review from MaskRay, adrian-prantl, SLTozer, dwblaikie and jmorse November 8, 2023 18:41

adrian-prantl reviewed Nov 11, 2023

View reviewed changes

adrian-prantl requested review from pogo59 and felipepiovezan November 11, 2023 01:14

rastogishubham force-pushed the DIExpressionFoldMath branch from 003c0f6 to 93f4a70 Compare November 13, 2023 21:21

felipepiovezan reviewed Nov 13, 2023

View reviewed changes

jmorse reviewed Nov 14, 2023

View reviewed changes

llvm/lib/IR/DebugInfoMetadata.cpp Outdated Show resolved Hide resolved

rastogishubham mentioned this pull request Feb 29, 2024

[NFC] Move DIExpressionCursor to DebugInfoMetadata.h #69768

Merged

rastogishubham force-pushed the DIExpressionFoldMath branch from 93f4a70 to dfba61c Compare March 9, 2024 01:02

adrian-prantl reviewed Mar 9, 2024

View reviewed changes

rastogishubham force-pushed the DIExpressionFoldMath branch from dfba61c to 2725f7c Compare March 9, 2024 01:28

pogo59 reviewed Mar 11, 2024

View reviewed changes

llvm/unittests/IR/MetadataTest.cpp Outdated Show resolved Hide resolved

llvm/unittests/IR/MetadataTest.cpp Show resolved Hide resolved

rastogishubham force-pushed the DIExpressionFoldMath branch from 2725f7c to 8fb4645 Compare March 14, 2024 20:11

rastogishubham requested review from jmorse, felipepiovezan, pogo59 and adrian-prantl March 14, 2024 20:13

adrian-prantl reviewed Mar 15, 2024

View reviewed changes

rastogishubham force-pushed the DIExpressionFoldMath branch from 8fb4645 to 64ddbb3 Compare March 20, 2024 18:59

rastogishubham force-pushed the DIExpressionFoldMath branch 3 times, most recently from 39ea838 to 8f4da5f Compare April 4, 2024 02:04

rastogishubham force-pushed the DIExpressionFoldMath branch from 8f4da5f to e379aa2 Compare April 24, 2024 21:48

rastogishubham force-pushed the DIExpressionFoldMath branch 2 times, most recently from 3b40e56 to 03eac36 Compare May 1, 2024 23:10

rastogishubham force-pushed the DIExpressionFoldMath branch from 03eac36 to 21324f8 Compare May 10, 2024 23:23

adrian-prantl reviewed May 11, 2024

View reviewed changes

rastogishubham force-pushed the DIExpressionFoldMath branch 2 times, most recently from 04bb1b4 to 5ac6bf9 Compare May 15, 2024 17:46

rastogishubham force-pushed the DIExpressionFoldMath branch 2 times, most recently from ce9a2fe to 3b03b65 Compare May 23, 2024 22:30

rastogishubham force-pushed the DIExpressionFoldMath branch from 3b03b65 to accb2f3 Compare May 29, 2024 22:43

Introduce DIExpression::foldConstantMath()

f2afb34

DIExpressions can get very long and have a lot of redundant operations. This function uses simple pattern matching to fold constant math that can be evaluated at compile time.

rastogishubham force-pushed the DIExpressionFoldMath branch from accb2f3 to f2afb34 Compare May 29, 2024 22:48

rastogishubham merged commit b12f81b into llvm:main May 29, 2024
4 of 6 checks passed

rastogishubham deleted the DIExpressionFoldMath branch May 29, 2024 23:10

		@@ -0,0 +1,81 @@
		//===- llvm/IR/DIExpressionOptimizer.h - Constant folding of DIExpressions --*-
		//C++ -*-===//

Introduce DIExpression::foldConstantMath() #71718

Introduce DIExpression::foldConstantMath() #71718

Uh oh!

Conversation

rastogishubham commented Nov 8, 2023

Uh oh!

llvmbot commented Nov 8, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rastogishubham Nov 14, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jmorse left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

rastogishubham commented Mar 9, 2024

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

pogo59 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

rastogishubham commented Mar 14, 2024

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

github-actions bot commented May 10, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Nov 8, 2023 •

edited

Loading

rastogishubham Nov 14, 2023 •

edited

Loading

github-actions bot commented May 10, 2024 •

edited

Loading