@unexpectedlydefined

This is patch 1 of 3 adding to llvm-dwarfdump the ability to measure DWARF coverage of local variables in terms of source lines, as discussed in this RFC.

This patch adds the basic variable coverage implementation. By default, each inlined instance is shown separately, with its full inlining chain. Alternatively, --combine-instances prints a combined view that averages coverage across all instances of a function.

For now, the tool simply prints the number of source lines over which each variable is covered. Later patches in the series will add the comparison against a baseline.

Example output:

$ llvm-dwarfdump --show-variable-coverage somefile
Variable coverage statistics:
Function InlChain                   Variable Decl                  LinesCovered
foo                                 bar      path/to/somefile.h:54 3
foo      path/to/someotherfile.c:32 bar      path/to/somefile.h:54 2
foo                                 baz      main.c:76             9
$ llvm-dwarfdump --show-variable-coverage somefile --combine-instances
Variable coverage statistics:
Function InstanceCount Variable Decl                  LinesCovered
foo      2             bar      path/to/somefile.h:54 2.5
foo      1             baz      main.c:76             9
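
The combined figure in the example output can be reproduced by hand: `bar` is covered on 3 lines in one instance of `foo` and on 2 lines in the other, so the combined view reports (3 + 2) / 2 = 2.5. The sketch below illustrates that arithmetic only; `averageLinesCovered` is a hypothetical helper and is not taken from the patch, which may compute the value differently.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper, not taken from the patch: the mean of the
// per-instance line counts recorded for one variable.
double averageLinesCovered(const std::vector<unsigned> &PerInstance) {
  // Every reported variable is assumed to have at least one instance.
  assert(!PerInstance.empty() && "expected at least one instance");
  unsigned long Total =
      std::accumulate(PerInstance.begin(), PerInstance.end(), 0ul);
  return static_cast<double>(Total) / PerInstance.size();
}
```

With the example data, `averageLinesCovered({3, 2})` gives 2.5 for `bar`, and `averageLinesCovered({9})` gives 9 for `baz`.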

@llvmbot
Member

llvmbot commented Nov 26, 2025

@llvm/pr-subscribers-debuginfo

@llvm/pr-subscribers-llvm-binary-utilities

Author: None (unexpectedlydefined)

Changes

Patch is 28.44 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/169646.diff

8 Files Affected:

  • (modified) llvm/docs/CommandGuide/llvm-dwarfdump.rst (+37)
  • (added) llvm/test/tools/llvm-dwarfdump/X86/Inputs/coverage-opt.ll (+134)
  • (added) llvm/test/tools/llvm-dwarfdump/X86/Inputs/coverage.ll (+170)
  • (added) llvm/test/tools/llvm-dwarfdump/X86/coverage.test (+24)
  • (modified) llvm/tools/llvm-dwarfdump/CMakeLists.txt (+1)
  • (added) llvm/tools/llvm-dwarfdump/Coverage.cpp (+244)
  • (modified) llvm/tools/llvm-dwarfdump/llvm-dwarfdump.cpp (+20-1)
  • (modified) llvm/tools/llvm-dwarfdump/llvm-dwarfdump.h (+2)
diff --git a/llvm/docs/CommandGuide/llvm-dwarfdump.rst b/llvm/docs/CommandGuide/llvm-dwarfdump.rst
index dfc0431f07826..c3af617d8b6c2 100644
--- a/llvm/docs/CommandGuide/llvm-dwarfdump.rst
+++ b/llvm/docs/CommandGuide/llvm-dwarfdump.rst
@@ -193,6 +193,17 @@ OPTIONS
 
             The :option:`--debug-frame` and :option:`--eh-frame` options are aliases, in cases where both sections are present one command outputs both.
 
+.. option:: --show-variable-coverage
+
+            Show per-variable coverage metrics. The output format is described
+            in the section below (:ref:`variable-coverage-format`).
+
+.. option:: --combine-instances
+
+            Use with :option:`--show-variable-coverage` to average variable
+            coverage across inlined subroutine instances instead of printing
+            them separately.
+
 .. option:: @<FILE>
 
             Read command-line options from `<FILE>`.
@@ -243,6 +254,32 @@ The following is generated if there are no errors reported::
     "error-count": 0
   }
 
+.. _variable-coverage-format:
+
+FORMAT OF VARIABLE COVERAGE OUTPUT
+----------------------------------
+
+The :option:`--show-variable-coverage` option differs from
+:option:`--statistics` by printing per-variable debug info coverage metrics
+based on the number of source lines covered instead of the number of
+instruction bytes. Compared to counting instruction bytes, this is more stable
+across compilations and better reflects the debugging experience. The output is
+a tab-separated table containing the following columns:
+
+      - `Function` ==> Name of the function the variable was found in
+      - `InstanceCount` (when :option:`--combine-instances` is specified) ==>
+        Number of instances of the function; this is 1 for functions that have
+        not been inlined, and n+1 for functions that have been inlined n times
+      - `InlChain` (when :option:`--combine-instances` is not specified) ==>
+        Chain of call sites (file and line number) that the function has been
+        inlined into; this will be empty if the function has not been inlined
+      - `Variable` ==> Name of the variable
+      - `Decl` ==> Source location (file and line number) of the variable's
+        declaration
+      - `LinesCovered` ==> Number of source lines covered by the variable's
+        debug information in the input file
+
+
 EXIT STATUS
 -----------
 
diff --git a/llvm/test/tools/llvm-dwarfdump/X86/Inputs/coverage-opt.ll b/llvm/test/tools/llvm-dwarfdump/X86/Inputs/coverage-opt.ll
new file mode 100644
index 0000000000000..81e7feaf1addc
--- /dev/null
+++ b/llvm/test/tools/llvm-dwarfdump/X86/Inputs/coverage-opt.ll
@@ -0,0 +1,134 @@
+; The source code of the test case:
+; extern void fn3(int *);
+; extern void fn2 (int);
+; __attribute__((noinline))
+; void
+; fn1 (int x, int y)
+; {
+;   int u = x + y;
+;   if (x > 1)
+;     u += 1;
+;   else
+;     u += 2;
+;   if (y > 4)
+;     u += x;
+;   int a = 7;
+;   fn2 (a);
+;   u --;
+; }
+
+; __attribute__((noinline))
+; int f()
+; {
+;   int l, k;
+;   fn3(&l);
+;   fn3(&k);
+;   fn1 (l, k);
+;   return 0;
+; }
+
+; ModuleID = 'test.c'
+source_filename = "test.c"
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+; Function Attrs: noinline nounwind uwtable
+define dso_local void @fn1(i32 noundef %0, i32 noundef %1) local_unnamed_addr !dbg !9 {
+  call void @llvm.dbg.value(metadata i32 %0, metadata !14, metadata !DIExpression()), !dbg !18
+  call void @llvm.dbg.value(metadata i32 %1, metadata !15, metadata !DIExpression()), !dbg !18
+  call void @llvm.dbg.value(metadata !DIArgList(i32 %0, i32 %1), metadata !16, metadata !DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_plus, DW_OP_stack_value)), !dbg !18
+  call void @llvm.dbg.value(metadata i32 undef, metadata !16, metadata !DIExpression()), !dbg !18
+  call void @llvm.dbg.value(metadata i32 undef, metadata !16, metadata !DIExpression()), !dbg !18
+  call void @llvm.dbg.value(metadata i32 7, metadata !17, metadata !DIExpression()), !dbg !18
+  tail call void @fn2(i32 noundef 7), !dbg !19
+  call void @llvm.dbg.value(metadata i32 undef, metadata !16, metadata !DIExpression(DW_OP_constu, 1, DW_OP_minus, DW_OP_stack_value)), !dbg !18
+  ret void, !dbg !20
+}
+
+; Function Attrs: argmemonly mustprogress nofree nosync nounwind willreturn
+declare void @llvm.lifetime.start.p0i8(i64 immarg, i8* nocapture)
+
+declare !dbg !21 void @fn2(i32 noundef) local_unnamed_addr
+
+; Function Attrs: argmemonly mustprogress nofree nosync nounwind willreturn
+declare void @llvm.lifetime.end.p0i8(i64 immarg, i8* nocapture)
+
+; Function Attrs: noinline nounwind uwtable
+define dso_local i32 @f() local_unnamed_addr !dbg !25 {
+  %1 = alloca i32, align 4
+  %2 = alloca i32, align 4
+  %3 = bitcast i32* %1 to i8*, !dbg !31
+  call void @llvm.lifetime.start.p0i8(i64 4, i8* nonnull %3), !dbg !31
+  %4 = bitcast i32* %2 to i8*, !dbg !31
+  call void @llvm.lifetime.start.p0i8(i64 4, i8* nonnull %4), !dbg !31
+  call void @llvm.dbg.value(metadata i32* %1, metadata !29, metadata !DIExpression(DW_OP_deref)), !dbg !32
+  call void @fn3(i32* noundef nonnull %1), !dbg !33
+  call void @llvm.dbg.value(metadata i32* %2, metadata !30, metadata !DIExpression(DW_OP_deref)), !dbg !32
+  call void @fn3(i32* noundef nonnull %2), !dbg !34
+  %5 = load i32, i32* %1, align 4, !dbg !35, !tbaa !36
+  call void @llvm.dbg.value(metadata i32 %5, metadata !29, metadata !DIExpression()), !dbg !32
+  %6 = load i32, i32* %2, align 4, !dbg !40, !tbaa !36
+  call void @llvm.dbg.value(metadata i32 %6, metadata !30, metadata !DIExpression()), !dbg !32
+  call void @fn1(i32 noundef %5, i32 noundef %6), !dbg !41
+  call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %4), !dbg !42
+  call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %3), !dbg !42
+  ret i32 0, !dbg !43
+}
+
+declare !dbg !44 void @fn3(i32* noundef) local_unnamed_addr
+
+; Function Attrs: nofree nosync nounwind readnone speculatable willreturn
+declare void @llvm.dbg.value(metadata, metadata, metadata)
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!2, !3, !4, !5, !6, !7}
+!llvm.ident = !{!8}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang version 14.0.6", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, splitDebugInlining: false, nameTableKind: None)
+!1 = !DIFile(filename: "test.c", directory: "/")
+!2 = !{i32 7, !"Dwarf Version", i32 5}
+!3 = !{i32 2, !"Debug Info Version", i32 3}
+!4 = !{i32 1, !"wchar_size", i32 4}
+!5 = !{i32 7, !"PIC Level", i32 2}
+!6 = !{i32 7, !"PIE Level", i32 2}
+!7 = !{i32 7, !"uwtable", i32 1}
+!8 = !{!"clang version 14.0.6"}
+!9 = distinct !DISubprogram(name: "fn1", scope: !1, file: !1, line: 5, type: !10, scopeLine: 6, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !13)
+!10 = !DISubroutineType(types: !11)
+!11 = !{null, !12, !12}
+!12 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
+!13 = !{!14, !15, !16, !17}
+!14 = !DILocalVariable(name: "x", arg: 1, scope: !9, file: !1, line: 5, type: !12)
+!15 = !DILocalVariable(name: "y", arg: 2, scope: !9, file: !1, line: 5, type: !12)
+!16 = !DILocalVariable(name: "u", scope: !9, file: !1, line: 7, type: !12)
+!17 = !DILocalVariable(name: "a", scope: !9, file: !1, line: 14, type: !12)
+!18 = !DILocation(line: 0, scope: !9)
+!19 = !DILocation(line: 15, column: 3, scope: !9)
+!20 = !DILocation(line: 17, column: 1, scope: !9)
+!21 = !DISubprogram(name: "fn2", scope: !1, file: !1, line: 2, type: !22, flags: DIFlagPrototyped, spFlags: DISPFlagOptimized, retainedNodes: !24)
+!22 = !DISubroutineType(types: !23)
+!23 = !{null, !12}
+!24 = !{}
+!25 = distinct !DISubprogram(name: "f", scope: !1, file: !1, line: 20, type: !26, scopeLine: 21, flags: DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !28)
+!26 = !DISubroutineType(types: !27)
+!27 = !{!12}
+!28 = !{!29, !30}
+!29 = !DILocalVariable(name: "l", scope: !25, file: !1, line: 22, type: !12)
+!30 = !DILocalVariable(name: "k", scope: !25, file: !1, line: 22, type: !12)
+!31 = !DILocation(line: 22, column: 3, scope: !25)
+!32 = !DILocation(line: 0, scope: !25)
+!33 = !DILocation(line: 23, column: 3, scope: !25)
+!34 = !DILocation(line: 24, column: 3, scope: !25)
+!35 = !DILocation(line: 25, column: 8, scope: !25)
+!36 = !{!37, !37, i64 0}
+!37 = !{!"int", !38, i64 0}
+!38 = !{!"omnipotent char", !39, i64 0}
+!39 = !{!"Simple C/C++ TBAA"}
+!40 = !DILocation(line: 25, column: 11, scope: !25)
+!41 = !DILocation(line: 25, column: 3, scope: !25)
+!42 = !DILocation(line: 27, column: 1, scope: !25)
+!43 = !DILocation(line: 26, column: 3, scope: !25)
+!44 = !DISubprogram(name: "fn3", scope: !1, file: !1, line: 1, type: !45, flags: DIFlagPrototyped, spFlags: DISPFlagOptimized, retainedNodes: !24)
+!45 = !DISubroutineType(types: !46)
+!46 = !{null, !47}
+!47 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !12, size: 64)
diff --git a/llvm/test/tools/llvm-dwarfdump/X86/Inputs/coverage.ll b/llvm/test/tools/llvm-dwarfdump/X86/Inputs/coverage.ll
new file mode 100644
index 0000000000000..a475c38a80de2
--- /dev/null
+++ b/llvm/test/tools/llvm-dwarfdump/X86/Inputs/coverage.ll
@@ -0,0 +1,170 @@
+; The source code of the test case:
+; extern void fn3(int *);
+; extern void fn2 (int);
+; __attribute__((noinline))
+; void
+; fn1 (int x, int y)
+; {
+;   int u = x + y;
+;   if (x > 1)
+;     u += 1;
+;   else
+;     u += 2;
+;   if (y > 4)
+;     u += x;
+;   int a = 7;
+;   fn2 (a);
+;   u --;
+; }
+
+; __attribute__((noinline))
+; int f()
+; {
+;   int l, k;
+;   fn3(&l);
+;   fn3(&k);
+;   fn1 (l, k);
+;   return 0;
+; }
+
+; ModuleID = 'test.c'
+source_filename = "test.c"
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+; Function Attrs: noinline nounwind optnone uwtable
+define dso_local void @fn1(i32 noundef %0, i32 noundef %1) !dbg !10 {
+  %3 = alloca i32, align 4
+  %4 = alloca i32, align 4
+  %5 = alloca i32, align 4
+  %6 = alloca i32, align 4
+  store i32 %0, i32* %3, align 4
+  call void @llvm.dbg.declare(metadata i32* %3, metadata !15, metadata !DIExpression()), !dbg !16
+  store i32 %1, i32* %4, align 4
+  call void @llvm.dbg.declare(metadata i32* %4, metadata !17, metadata !DIExpression()), !dbg !18
+  call void @llvm.dbg.declare(metadata i32* %5, metadata !19, metadata !DIExpression()), !dbg !20
+  %7 = load i32, i32* %3, align 4, !dbg !21
+  %8 = load i32, i32* %4, align 4, !dbg !22
+  %9 = add nsw i32 %7, %8, !dbg !23
+  store i32 %9, i32* %5, align 4, !dbg !20
+  %10 = load i32, i32* %3, align 4, !dbg !24
+  %11 = icmp sgt i32 %10, 1, !dbg !26
+  br i1 %11, label %12, label %15, !dbg !27
+
+12:                                               ; preds = %2
+  %13 = load i32, i32* %5, align 4, !dbg !28
+  %14 = add nsw i32 %13, 1, !dbg !28
+  store i32 %14, i32* %5, align 4, !dbg !28
+  br label %18, !dbg !29
+
+15:                                               ; preds = %2
+  %16 = load i32, i32* %5, align 4, !dbg !30
+  %17 = add nsw i32 %16, 2, !dbg !30
+  store i32 %17, i32* %5, align 4, !dbg !30
+  br label %18
+
+18:                                               ; preds = %15, %12
+  %19 = load i32, i32* %4, align 4, !dbg !31
+  %20 = icmp sgt i32 %19, 4, !dbg !33
+  br i1 %20, label %21, label %25, !dbg !34
+
+21:                                               ; preds = %18
+  %22 = load i32, i32* %3, align 4, !dbg !35
+  %23 = load i32, i32* %5, align 4, !dbg !36
+  %24 = add nsw i32 %23, %22, !dbg !36
+  store i32 %24, i32* %5, align 4, !dbg !36
+  br label %25, !dbg !37
+
+25:                                               ; preds = %21, %18
+  call void @llvm.dbg.declare(metadata i32* %6, metadata !38, metadata !DIExpression()), !dbg !39
+  store i32 7, i32* %6, align 4, !dbg !39
+  %26 = load i32, i32* %6, align 4, !dbg !40
+  call void @fn2(i32 noundef %26), !dbg !41
+  %27 = load i32, i32* %5, align 4, !dbg !42
+  %28 = add nsw i32 %27, -1, !dbg !42
+  store i32 %28, i32* %5, align 4, !dbg !42
+  ret void, !dbg !43
+}
+
+; Function Attrs: nofree nosync nounwind readnone speculatable willreturn
+declare void @llvm.dbg.declare(metadata, metadata, metadata)
+
+declare void @fn2(i32 noundef)
+
+; Function Attrs: noinline nounwind optnone uwtable
+define dso_local i32 @f() !dbg !44 {
+  %1 = alloca i32, align 4
+  %2 = alloca i32, align 4
+  call void @llvm.dbg.declare(metadata i32* %1, metadata !47, metadata !DIExpression()), !dbg !48
+  call void @llvm.dbg.declare(metadata i32* %2, metadata !49, metadata !DIExpression()), !dbg !50
+  call void @fn3(i32* noundef %1), !dbg !51
+  call void @fn3(i32* noundef %2), !dbg !52
+  %3 = load i32, i32* %1, align 4, !dbg !53
+  %4 = load i32, i32* %2, align 4, !dbg !54
+  call void @fn1(i32 noundef %3, i32 noundef %4), !dbg !55
+  ret i32 0, !dbg !56
+}
+
+declare void @fn3(i32* noundef)
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!2, !3, !4, !5, !6, !7, !8}
+!llvm.ident = !{!9}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang version 14.0.6", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, splitDebugInlining: false, nameTableKind: None)
+!1 = !DIFile(filename: "test.c", directory: "/")
+!2 = !{i32 7, !"Dwarf Version", i32 5}
+!3 = !{i32 2, !"Debug Info Version", i32 3}
+!4 = !{i32 1, !"wchar_size", i32 4}
+!5 = !{i32 7, !"PIC Level", i32 2}
+!6 = !{i32 7, !"PIE Level", i32 2}
+!7 = !{i32 7, !"uwtable", i32 1}
+!8 = !{i32 7, !"frame-pointer", i32 2}
+!9 = !{!"clang version 14.0.6"}
+!10 = distinct !DISubprogram(name: "fn1", scope: !1, file: !1, line: 5, type: !11, scopeLine: 6, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition, unit: !0, retainedNodes: !14)
+!11 = !DISubroutineType(types: !12)
+!12 = !{null, !13, !13}
+!13 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
+!14 = !{}
+!15 = !DILocalVariable(name: "x", arg: 1, scope: !10, file: !1, line: 5, type: !13)
+!16 = !DILocation(line: 5, column: 10, scope: !10)
+!17 = !DILocalVariable(name: "y", arg: 2, scope: !10, file: !1, line: 5, type: !13)
+!18 = !DILocation(line: 5, column: 17, scope: !10)
+!19 = !DILocalVariable(name: "u", scope: !10, file: !1, line: 7, type: !13)
+!20 = !DILocation(line: 7, column: 7, scope: !10)
+!21 = !DILocation(line: 7, column: 11, scope: !10)
+!22 = !DILocation(line: 7, column: 15, scope: !10)
+!23 = !DILocation(line: 7, column: 13, scope: !10)
+!24 = !DILocation(line: 8, column: 7, scope: !25)
+!25 = distinct !DILexicalBlock(scope: !10, file: !1, line: 8, column: 7)
+!26 = !DILocation(line: 8, column: 9, scope: !25)
+!27 = !DILocation(line: 8, column: 7, scope: !10)
+!28 = !DILocation(line: 9, column: 7, scope: !25)
+!29 = !DILocation(line: 9, column: 5, scope: !25)
+!30 = !DILocation(line: 11, column: 7, scope: !25)
+!31 = !DILocation(line: 12, column: 7, scope: !32)
+!32 = distinct !DILexicalBlock(scope: !10, file: !1, line: 12, column: 7)
+!33 = !DILocation(line: 12, column: 9, scope: !32)
+!34 = !DILocation(line: 12, column: 7, scope: !10)
+!35 = !DILocation(line: 13, column: 10, scope: !32)
+!36 = !DILocation(line: 13, column: 7, scope: !32)
+!37 = !DILocation(line: 13, column: 5, scope: !32)
+!38 = !DILocalVariable(name: "a", scope: !10, file: !1, line: 14, type: !13)
+!39 = !DILocation(line: 14, column: 7, scope: !10)
+!40 = !DILocation(line: 15, column: 8, scope: !10)
+!41 = !DILocation(line: 15, column: 3, scope: !10)
+!42 = !DILocation(line: 16, column: 5, scope: !10)
+!43 = !DILocation(line: 17, column: 1, scope: !10)
+!44 = distinct !DISubprogram(name: "f", scope: !1, file: !1, line: 20, type: !45, scopeLine: 21, spFlags: DISPFlagDefinition, unit: !0, retainedNodes: !14)
+!45 = !DISubroutineType(types: !46)
+!46 = !{!13}
+!47 = !DILocalVariable(name: "l", scope: !44, file: !1, line: 22, type: !13)
+!48 = !DILocation(line: 22, column: 7, scope: !44)
+!49 = !DILocalVariable(name: "k", scope: !44, file: !1, line: 22, type: !13)
+!50 = !DILocation(line: 22, column: 10, scope: !44)
+!51 = !DILocation(line: 23, column: 3, scope: !44)
+!52 = !DILocation(line: 24, column: 3, scope: !44)
+!53 = !DILocation(line: 25, column: 8, scope: !44)
+!54 = !DILocation(line: 25, column: 11, scope: !44)
+!55 = !DILocation(line: 25, column: 3, scope: !44)
+!56 = !DILocation(line: 26, column: 3, scope: !44)
diff --git a/llvm/test/tools/llvm-dwarfdump/X86/coverage.test b/llvm/test/tools/llvm-dwarfdump/X86/coverage.test
new file mode 100644
index 0000000000000..1f4471b07e5f3
--- /dev/null
+++ b/llvm/test/tools/llvm-dwarfdump/X86/coverage.test
@@ -0,0 +1,24 @@
+; RUN: llc %S/Inputs/coverage.ll -o %t.o -filetype=obj
+; RUN: llc %S/Inputs/coverage-opt.ll -o %t-opt.o -filetype=obj
+
+; RUN: llvm-dwarfdump --show-variable-coverage %t.o | FileCheck %s
+
+; CHECK:      Variable coverage statistics:
+; CHECK-NEXT: Function InlChain Variable Decl LinesCovered
+; CHECK-NEXT: f k test.c:22 5
+; CHECK-NEXT: f l test.c:22 5
+; CHECK-NEXT: fn1 a test.c:14 11
+; CHECK-NEXT: fn1 u test.c:7 11
+; CHECK-NEXT: fn1 x test.c:5 11
+; CHECK-NEXT: fn1 y test.c:5 11
+
+; RUN: llvm-dwarfdump --show-variable-coverage --combine-instances %t-opt.o | FileCheck %s --check-prefix=COMBINE
+
+; COMBINE:      Variable coverage statistics:
+; COMBINE-NEXT: Function InstanceCount Variable Decl LinesCovered
+; COMBINE-NEXT: f 1 k test.c:22 2
+; COMBINE-NEXT: f 1 l test.c:22 3
+; COMBINE-NEXT: fn1 1 a test.c:14 1
+; COMBINE-NEXT: fn1 1 u test.c:7 1
+; COMBINE-NEXT: fn1 1 x test.c:5 1
+; COMBINE-NEXT: fn1 1 y test.c:5 1
diff --git a/llvm/tools/llvm-dwarfdump/CMakeLists.txt b/llvm/tools/llvm-dwarfdump/CMakeLists.txt
index 7a0adf32e938c..1ed3f4901c4fc 100644
--- a/llvm/tools/llvm-dwarfdump/CMakeLists.txt
+++ b/llvm/tools/llvm-dwarfdump/CMakeLists.txt
@@ -11,6 +11,7 @@ set(LLVM_LINK_COMPONENTS
   )
 
 add_llvm_tool(llvm-dwarfdump
+  Coverage.cpp
   SectionSizes.cpp
   Statistics.cpp
   llvm-dwarfdump.cpp
diff --git a/llvm/tools/llvm-dwarfdump/Coverage.cpp b/llvm/tools/llvm-dwarfdump/Coverage.cpp
new file mode 100644
index 0000000000000..e74b31900a471
--- /dev/null
+++ b/llvm/tools/llvm-dwarfdump/Coverage.cpp
@@ -0,0 +1,244 @@
+//===-- Coverage.cpp - Debug info coverage metrics ------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm-dwarfdump.h"
+#include "llvm/BinaryFormat/Dwarf.h"
+#include "llvm/DebugInfo/DIContext.h"
+#include "llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h"
+#include "llvm/DebugInfo/DWARF/DWARFCompileUnit.h"
+#include "llvm/DebugInfo/DWARF/DWARFContext.h"
+#include "llvm/IR/CFG.h"
+#include "llvm/IR/DebugInfoMetadata.h"
+#include "llvm/IR/DebugProgramInstruction.h"
+#include "llvm/IR/Instructions.h"
+#include "llvm/IR/Module.h"
+#include "llvm/IRReader/IRReader.h"
+#include "llvm/Object/ObjectFile.h"
+#include "llvm/Support/MemoryBuffer.h"
+#include "llvm/Support/SourceMgr.h"
+#include <set>
+
+using namespace llvm;
+using namespace llvm::dwarf;
+using namespace llvm::object;
+
+typedef std::pair<std::string, std::string> StringPair;
+
+static std::optional<std::set<std::pair<uint16_t, uint32_t>>>
+computeVariableCoverage(DWARFContext &DICtx, DWARFDie DIE,
+                        const DWARFDebugLine::LineTable *const LineTable) {
+  auto addLines = [](const DWARFDebugLine::LineTable *LineTable,
+                     std::set<std::pair<uint16_t, uint32_t>> &Lines,
+                     DWARFAddressRange Range,
+                     std::map<std::string, uint16_t, std::less<>> &FileNames) {
+    std::vector<uint32_t> Rows;
+    if (LineTable->lookupAddressRange({Range.LowPC, Range.SectionIndex},
+                                      Range.HighPC - Range.LowPC, Rows)) {
+      for (const auto &RowI : Rows) {
+        const auto Row = LineTable->Rows[RowI];
+        if (Row.Address.Address < Range.LowPC)
+          continue;
+        const auto FileIndex = Row.File;
+
+        if (!any_of(FileNames,
+                    [FileIndex](auto &FN) { return FN.second == FileIndex; })) {
+          std::string Name;
+          LineTable->getFileNameByIndex(
+            ...
[truncated]

@github-actions

⚠️ undef deprecator found issues in your code. ⚠️

You can test this locally with the following command:
git diff -U0 --pickaxe-regex -S '([^a-zA-Z0-9#_-]undef([^a-zA-Z0-9_-]|$)|UndefValue::get)' 'HEAD~1' HEAD llvm/test/tools/llvm-dwarfdump/X86/Inputs/coverage-opt.ll llvm/test/tools/llvm-dwarfdump/X86/Inputs/coverage.ll llvm/tools/llvm-dwarfdump/Coverage.cpp llvm/tools/llvm-dwarfdump/llvm-dwarfdump.cpp llvm/tools/llvm-dwarfdump/llvm-dwarfdump.h

The following files introduce new uses of undef:

  • llvm/test/tools/llvm-dwarfdump/X86/Inputs/coverage-opt.ll

Undef is now deprecated and should only be used in the rare cases where no replacement is possible. For example, a load of uninitialized memory yields undef. You should use poison values for placeholders instead.

In tests, avoid using undef and having tests that trigger undefined behavior. If you need an operand with some unimportant value, you can add a new argument to the function and use that instead.

For example, this is considered a bad practice:

define void @fn() {
  ...
  br i1 undef, ...
}

Please use the following instead:

define void @fn(i1 %cond) {
  ...
  br i1 %cond, ...
}

Please refer to the Undefined Behavior Manual for more information.

Contributor

@SLTozer left a comment

Some comments from my look over this patch. Due to its size (understandable, since it adds and tests the smallest complete form of this feature), it might take a couple of passes to complete the review.

Also since this is an implementation of the RFC that you've linked to in the PR description, it'd be a good idea to comment on the RFC thread linking to this PR so that those who participated in or watched that thread can follow through to the implementation.

Also on that note it'd be good to hear from @adrian-prantl - do you have any thoughts on this feature and its implementation, and/or have any suggested reviewers who may be more familiar with dwarfdump than me?

Show per-variable coverage metrics. The output format is described
in the section below (:ref:`variable-coverage-format`).

.. option:: --combine-instances
Contributor

Suggested change
- .. option:: --combine-instances
+ .. option:: --combine-inline-variable-instances

Maybe this could be named something more detailed/descriptive, since it's a sub-option of the above.

Comment on lines 106 to 110
do
  if (Parent.getTag() == DW_TAG_subprogram ||
      Parent.getTag() == DW_TAG_inlined_subroutine)
    Parents.push_back(Parent);
while ((Parent = Parent.getParent()));
Contributor

Suggested change
- do
-   if (Parent.getTag() == DW_TAG_subprogram ||
-       Parent.getTag() == DW_TAG_inlined_subroutine)
-     Parents.push_back(Parent);
- while ((Parent = Parent.getParent()));
+ do {
+   if (Parent.getTag() == DW_TAG_subprogram ||
+       Parent.getTag() == DW_TAG_inlined_subroutine)
+     Parents.push_back(Parent);
+ } while ((Parent = Parent.getParent()));

Would definitely suggest using braces for the do while.

Comment on lines 107 to 109
if (Parent.getTag() == DW_TAG_subprogram ||
    Parent.getTag() == DW_TAG_inlined_subroutine)
  Parents.push_back(Parent);
Contributor

In the case of lambda functions, we may have DWARF structured like:

DW_TAG_subprogram # Outer function
  ...
  DW_TAG_structure_type # Lambda struct
    ...
    DW_TAG_subprogram # Lambda function

In these cases, if DIE is the lambda function, do we want to add the outer function to Parents (as the code will currently do), or are we only interested in traversing inlined frames? I assume since AFAICT this is only used for printing inline chains, it might be better to early exit after we reach a DW_TAG_subprogram (I'm decently confident that that will work, but not certain!).
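
The early exit suggested above can be sketched on a simplified model. `Tag`, `Die`, and `collectInlineChain` below are hypothetical stand-ins rather than the LLVM `DWARFDie` API, and the sketch assumes the walk starts at the variable's enclosing scope; the patch's actual starting point is not visible in the quoted lines.

```cpp
#include <vector>

// Simplified stand-ins for DWARF tags and DIEs; this is not the LLVM API.
enum class Tag { Subprogram, InlinedSubroutine, StructureType, LexicalBlock };

struct Die {
  Tag DieTag;
  const Die *Parent; // nullptr at the top of the tree
};

// Collects the enclosing inlined frames of `Start`, ending at (and
// including) the first DW_TAG_subprogram reached.
std::vector<const Die *> collectInlineChain(const Die *Start) {
  std::vector<const Die *> Chain;
  for (const Die *D = Start; D; D = D->Parent) {
    if (D->DieTag == Tag::InlinedSubroutine) {
      Chain.push_back(D);
    } else if (D->DieTag == Tag::Subprogram) {
      Chain.push_back(D);
      break; // Do not walk out of the concrete function.
    }
  }
  return Chain;
}
```

In this model, a scope nested in a lambda's `DW_TAG_subprogram` yields a chain containing only the lambda, so the outer function that owns the lambda struct is never added.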

Comment on lines 32 to 35
static std::optional<std::set<std::pair<uint16_t, uint32_t>>>
computeVariableCoverage(DWARFContext &DICtx, DWARFDie DIE,
                        const DWARFDebugLine::LineTable *const LineTable) {
  auto addLines = [](const DWARFDebugLine::LineTable *LineTable,
Contributor

I think the function and the lambda could do with commenting (a few sentences) to explain what they're doing; it might also help if DIE was renamed VariableDIE, to make it clear that this is the variable that coverage is being computed for, and also to explain with either a comment or a typedef that the returned set contains {FileIndex, Line} pairings (representing unique source locations).
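
As a rough illustration of the typedef idea, the sketch below names the {FileIndex, Line} pairing and shows why a set of such pairs counts unique source locations. `SourceLine`, `SourceLineSet`, `Row`, and `collectLines` are hypothetical names, `Row` is a simplified stand-in for a line-table row, and skipping line 0 is an assumption inferred from the review discussion rather than confirmed patch behavior.

```cpp
#include <cstdint>
#include <set>
#include <utility>
#include <vector>

// A unique source location: {file index in the line table, line number}.
using SourceLine = std::pair<uint16_t, uint32_t>;
using SourceLineSet = std::set<SourceLine>;

// Simplified stand-in for a line-table row; not the LLVM type.
struct Row {
  uint16_t File;
  uint32_t Line;
};

// Gathers the distinct source locations named by a sequence of rows.
SourceLineSet collectLines(const std::vector<Row> &Rows) {
  SourceLineSet Lines;
  for (const Row &R : Rows)
    if (R.Line != 0) // Assumption: line 0 does not name a source line.
      Lines.emplace(R.File, R.Line);
  return Lines;
}
```

Several rows that map to the same file and line collapse into one entry, so the size of the returned set is the number of source lines covered.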

std::map<std::string, uint16_t, std::less<>> FileNames;

auto Locations = DIE.getLocations(DW_AT_location);
std::optional<std::set<std::pair<uint16_t, uint32_t>>> Lines;
Contributor

Regarding this and the other std::optional<std::set<...>> instances, it may be worth explicitly noting the purpose of the optional in a comment; as far as I can tell, after these blocks, we'll see an empty optional if there were no location ranges, and a present optional containing an empty set if location ranges were found that cover no source locations (e.g. if all lines were line 0), and this is relevant for the logic below that determines whether or not to fallback to or intersect with the parent's location ranges, is that correct?

Author

Yes, this is correct - I've added a comment explaining this.
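
The distinction confirmed in this thread can be sketched as follows. This is one possible reading of the fallback-or-intersect logic, not the patch's code: `resolveCoverage` is a hypothetical function, and the rule that a variable with no location ranges inherits all of its parent's lines is an assumption.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <optional>
#include <set>
#include <utility>

using SourceLineSet = std::set<std::pair<uint16_t, uint32_t>>;

// Hypothetical combination step. std::nullopt means "no location ranges
// were found"; an empty set means "ranges exist but cover no source lines".
SourceLineSet resolveCoverage(const std::optional<SourceLineSet> &Own,
                              const SourceLineSet &ParentLines) {
  if (!Own)
    return ParentLines; // Nothing to restrict by: fall back to the parent.
  SourceLineSet Result;
  std::set_intersection(Own->begin(), Own->end(), ParentLines.begin(),
                        ParentLines.end(),
                        std::inserter(Result, Result.begin()));
  return Result;
}
```

The two "empty" cases diverge here: an absent optional returns every parent line, while a present but empty set returns no lines at all. That difference is why the optional wrapper cannot be replaced by a plain set.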

bool CombineInstances, raw_ostream &OS) {
VarMap Vars;

WithColor(errs(), HighlightColor::Remark) << "Processing DIEs...\n";
Contributor

Is this still used, or is it a "dev" print?

Author

Removed for now, but it could be worth considering some form of progress reporting as it can take a long time to run on very large inputs, especially with the additional analysis that will be added in later patches.

Comment on lines +1 to +2
; RUN: llc %S/Inputs/coverage.ll -o %t.o -filetype=obj
; RUN: llc %S/Inputs/coverage-opt.ll -o %t-opt.o -filetype=obj
Contributor

As I understand it, both of these input files will be used by other tests independent of this one in a future patch? Just noting this for review, as normally these tests would each be self-contained, but in this case they'll also be used as inputs for multiple tests, including at least one that uses both the files as input to one command - is that correct?

Author

Yes - subsequent patches will add several more llvm-dwarfdump invocations using different argument combinations, including some that use both files as input.

if (CombineInstances)
OS << Var.Instances;
else if (Var.Parents.size())
displayParents(Var.Parents, OS);
Contributor

I imagine in some cases the displayed inline chain might be quite long, and could potentially even overflow most terminals - I'm not sure if there's a good way around this without over-complicating the display, so this may be the best solution for now, but maybe a comment (could be TODO or FIXME) noting this limitation would be good.

Contributor

@SLTozer SLTozer left a comment

A few more comments. This largely LGTM, apart from some nits and one real functional question regarding variables without DW_AT_location.

Comment on lines 40 to 41
call void @llvm.dbg.value(metadata i32 undef, metadata !16, metadata !DIExpression()), !dbg !18
call void @llvm.dbg.value(metadata i32 undef, metadata !16, metadata !DIExpression()), !dbg !18
Contributor

I think this file was probably generated by an older compiler version - 2 recommended changes for this and the other test file are:

  • Use metadata i32 poison instead of metadata i32 undef; we try to avoid adding new instances of undef except for where there is actual undefined behaviour (and dead debug values are emitted with poison now rather than undef).
  • Use debug value records instead of intrinsics - see here for the docs.

Fixed by regenerating both files using a newer compiler version and changing all instances of undef to poison.

using namespace llvm::dwarf;
using namespace llvm::object;

typedef std::pair<std::string, std::string> StringPair;
Contributor

Is this typedef used anywhere?

This was left over from a later patch - removed for now.

/// Pair of file index and line number representing a source location.
typedef std::pair<uint16_t, std::size_t> SourceLocation;

/// Returns the set of source lines covered by a variable's debug information.
Contributor

Suggest expanding this to give a bit more detail about what the function does and what the return values may mean. I'd suggest describing the possible return values, similar to the comment inside the function ("Returns an empty set if the variable has location ranges that don't cover any source lines, and returns an empty optional (std::nullopt) if the variable has no location ranges."). It may also be worth mentioning that the ranges will be limited to the containing function's address ranges, or will directly use that function's address ranges if the variable has no DW_AT_location, to minimize future surprise.

I've updated the function to no longer return an optional, since the only case where an empty optional was returned was when both the variable and the containing scope had no ranges, and it seems logical to report zero covered lines for the variable in that case anyway.

}
};

std::map<std::string, uint16_t, std::less<>> FileNames;
Contributor

What's the purpose of FileNames? As far as I can tell, we insert into it, but never read from it (except for deciding whether or not to insert into it).

This was left over from a later patch - removed for now.

typedef std::pair<uint16_t, std::size_t> SourceLocation;

/// Returns the set of source lines covered by a variable's debug information.
static std::optional<std::set<SourceLocation>>
Contributor

Recommend replacing uses of std::set here with DenseSet, which should be more efficient for this case (see also here - the advantages of std::set aren't used here). For the set intersection, there is llvm::set_intersect (needs #include "llvm/ADT/SetOperations.h") which will work for DenseSet.
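
For reference, `llvm::set_intersect(S1, S2)` updates `S1` in place so that it keeps only the elements also present in `S2`. The sketch below reproduces that behaviour with `std::unordered_set` so that it builds without LLVM headers; `PairHash`, `LocationSet`, and `intersectInPlace` are hypothetical names used only for illustration. In the patch itself the container would be a `DenseSet` of `SourceLocation`, with `llvm::set_intersect` doing this work.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_set>
#include <utility>

// Pair of file index and line number, as in the patch.
using SourceLocation = std::pair<uint16_t, std::size_t>;

// Hypothetical hash for the pair; std::unordered_set needs one because the
// standard library provides no std::hash specialization for std::pair.
struct PairHash {
  std::size_t operator()(const SourceLocation &Loc) const {
    return std::hash<std::size_t>()(Loc.second) * 31 +
           std::hash<uint16_t>()(Loc.first);
  }
};

using LocationSet = std::unordered_set<SourceLocation, PairHash>;

// Keeps in S1 only the elements that are also in S2.
void intersectInPlace(LocationSet &S1, const LocationSet &S2) {
  for (auto It = S1.begin(); It != S1.end();) {
    if (S2.count(*It) == 0)
      It = S1.erase(It); // erase returns the next valid iterator.
    else
      ++It;
  }
}
```

Neither container keeps elements in sorted order, which matches the reviewer's point that the ordering guarantee of std::set is not needed here.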

// present (but containing an empty set) if ranges were found but contained no
// source locations, in order to distinguish the two cases.

auto Locations = VariableDIE.getLocations(DW_AT_location);
Contributor

Does this correctly handle cases where a variable is totally undef (that is, it has no location at all)? There will be some cases where a variable gets emitted without a DW_AT_location at all; in these cases, we would want to set Lines to be an empty set, as it is a variable with explicitly no live ranges, as opposed to a variable with a DW_AT_location that doesn't have a location list, meaning it has a live range for its entire parent scope.

I've updated this to return an empty set and report zero covered lines for the variable in this case.
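
The three cases distinguished in this thread can be summarised in a small model. This is only a sketch of the behaviour described above, not the patch's implementation: `LocationKind`, `VariableInfo`, and `coveredLines` are hypothetical names, and source lines are represented as plain integers rather than file/line pairs.

```cpp
#include <cstddef>
#include <set>

// Hypothetical classification of how a variable's location is described.
enum class LocationKind {
  None,         // No DW_AT_location: the variable has no live ranges.
  SingleExpr,   // DW_AT_location without a location list: whole parent scope.
  LocationList  // DW_AT_location with a location list: the listed ranges only.
};

struct VariableInfo {
  LocationKind Kind;
  std::set<std::size_t> ListLines; // Lines covered by the location list, if any.
};

// Returns the lines a variable is covered on, given the lines of its scope.
std::set<std::size_t> coveredLines(const VariableInfo &Var,
                                   const std::set<std::size_t> &ScopeLines) {
  switch (Var.Kind) {
  case LocationKind::None:
    return {}; // Explicitly no coverage.
  case LocationKind::SingleExpr:
    return ScopeLines; // Live for the entire parent scope.
  case LocationKind::LocationList: {
    // Limit the listed lines to those that belong to the scope.
    std::set<std::size_t> Result;
    for (std::size_t Line : Var.ListLines)
      if (ScopeLines.count(Line))
        Result.insert(Line);
    return Result;
  }
  }
  return {};
}
```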

Comment on lines +19 to +24
COMBINE-NEXT: f 1 k test.c:22 5
COMBINE-NEXT: f 1 l test.c:22 5
COMBINE-NEXT: fn1 1 a test.c:14 0
COMBINE-NEXT: fn1 1 u test.c:7 0
COMBINE-NEXT: fn1 1 x test.c:5 0
COMBINE-NEXT: fn1 1 y test.c:5 0
Contributor

@SLTozer SLTozer Dec 17, 2025

Results here look a little odd. All the fn1 variables other than a were made fully undef in the most recent change, but even a has 0 line coverage here, even though it should have full coverage in fn1. Is this happening because fn1 is ending up with 0 scope lines, or is there some problem with the logic for finding the scope lines? (k and l have changed results as well; possibly the fn1 results are being miscounted as being in f, and thus all the scope lines for fn1 are being counted for f variables? Something to look into either way.)

Comment on lines +37 to +44
#dbg_value(i32 poison, !15, !DIExpression(), !19)
#dbg_value(i32 poison, !16, !DIExpression(), !19)
#dbg_value(!DIArgList(i32 poison, i32 poison), !17, !DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_plus, DW_OP_stack_value), !19)
#dbg_value(i32 poison, !17, !DIExpression(), !19)
#dbg_value(i32 poison, !17, !DIExpression(), !19)
#dbg_value(i32 7, !18, !DIExpression(), !19)
tail call void @fn2(i32 noundef 7) #3, !dbg !20
#dbg_value(i32 poison, !17, !DIExpression(DW_OP_constu, 1, DW_OP_minus, DW_OP_stack_value), !19)
Contributor

Looks like all of the non-const arguments became poison, rather than just the ones that were previously undef. It's good to test at least one variable that's completely dead, so I'd recommend keeping at least one "fully dead" variable.
