[CaptureTracking][NFC] Clarify usage expectations in PointerMayBeCaptured comments #132744

Camsyn · 2025-03-24T14:29:31Z

Fixes issue #132739.

CaptureAnalysis only considers captures through the def-use chain of the provided pointer, explicitly excluding captures of underlying values or implicit captures like those involving external globals.

The previous comment for PointerMayBeCaptured did not clearly state this limitation, leading to its incorrect usage in files such as ThreadSanitizer.cpp and SanitizerMetadata.cpp.

This PR addresses this by refining the comments for the relevant APIs within PointerMayBeCaptured to explicitly document this behavior.

Fixes issue llvm#132739. This fixes a missed capture detection scenario where only the base pointer is captured. Previously, PointerMayBeCaptured only performed forward def-use analysis on the given pointer, which worked for base pointers (Arguments/Allocas/Calls) but failed to detect captures of derived pointers when their base pointers were captured. The fix: For the input pointer, first trace back to its base pointer using `getUnderlyingObject`, then perform capture analysis on the base. This ensures proper handling of derived pointers with bases captured. Performance considerations: - Most existing callers already pass base pointers (the common case), so the added backtracing has negligible overhead - The few callers (ThreadSanitizer.cpp/SanitizerBinaryMetadata.cpp) passing arbitrary pointers are sanitizer-related components where compilation time is less critical compared to runtime overhead This addresses false negatives in capture detection while maintaining reasonable compilation efficiency.

github-actions · 2025-03-24T14:30:39Z

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

llvmbot · 2025-03-24T14:31:17Z

@llvm/pr-subscribers-llvm-analysis

Author: None (Camsyn)

Changes

Fixes issue #132739.

This fixes a missed capture detection scenario where only the base pointer is captured, as follows:

int arr[2] = {0, 0};
foo(arr); // arr is captured
arr[1] = 1; // arr + 1 is mis-analyzed as non-capture

Previously, PointerMayBeCaptured only performed forward def-use analysis on the given pointer, which worked for base pointers (Arguments/Allocas/Calls) but failed to detect captures of derived pointers when their base pointers were captured.

// Forward Tracing
return analyzeCaptures(Ptr);

The fix: For the input pointer, first trace back to its base pointer using getUnderlyingObject, then perform capture analysis on the base. This ensures proper handling of derived pointers with bases captured.

// Backward Tracing
const Value *Obj = getUnderlyingObject(Ptr);
// Forward Tracing
return analyzeCaptures(Obj);

Performance considerations:

Most existing callers already pass base pointers (the common case), so the added backtracing has negligible overhead
The few callers (ThreadSanitizer.cpp/SanitizerBinaryMetadata.cpp) passing arbitrary pointers are sanitizer-related components where compilation time is less critical compared to runtime overhead

This addresses false negatives in capture detection while maintaining reasonable compilation efficiency.

Full diff: https://github.com/llvm/llvm-project/pull/132744.diff

2 Files Affected:

(modified) llvm/lib/Analysis/CaptureTracking.cpp (+4)
(modified) llvm/unittests/Analysis/CaptureTrackingTest.cpp (+28)

diff --git a/llvm/lib/Analysis/CaptureTracking.cpp b/llvm/lib/Analysis/CaptureTracking.cpp
index aa53a27537567..0d598cd245464 100644
--- a/llvm/lib/Analysis/CaptureTracking.cpp
+++ b/llvm/lib/Analysis/CaptureTracking.cpp
@@ -424,6 +424,10 @@ void llvm::PointerMayBeCaptured(const Value *V, CaptureTracker *Tracker,
   if (MaxUsesToExplore == 0)
     MaxUsesToExplore = DefaultMaxUsesToExplore;
 
+  // When analyzing a derived pointer, we need to analyze its underlying
+  // object to determine whether it is captured.
+  // E.g., `ptr + 1` is captured if `ptr` is captured.
+  V = getUnderlyingObjectAggressive(V);
   SmallVector<const Use *, 20> Worklist;
   Worklist.reserve(getDefaultMaxUsesToExploreForCaptureTracking());
   SmallSet<const Use *, 20> Visited;
diff --git a/llvm/unittests/Analysis/CaptureTrackingTest.cpp b/llvm/unittests/Analysis/CaptureTrackingTest.cpp
index ea3f21efc014c..f35fc0a151470 100644
--- a/llvm/unittests/Analysis/CaptureTrackingTest.cpp
+++ b/llvm/unittests/Analysis/CaptureTrackingTest.cpp
@@ -132,3 +132,31 @@ TEST(CaptureTracking, MultipleUsesInSameInstruction) {
   EXPECT_EQ(ICmp, CT.Captures[6]->getUser());
   EXPECT_EQ(1u, CT.Captures[6]->getOperandNo());
 }
+
+TEST(CaptureTracking, DerivedPointerIfBasePointerCaptured) {
+  StringRef Assembly = R"(
+    declare void @bar(ptr)
+
+    define void @test() {
+      %stkobj = alloca [2 x i32]
+      %derived = getelementptr inbounds [2 x i32], ptr %stkobj, i64 0, i64 1
+      store i32 1, ptr %derived
+      call void @bar(ptr %stkobj)
+      ret void
+    }
+  )";
+
+  LLVMContext Context;
+  SMDiagnostic Error;
+  auto M = parseAssemblyString(Assembly, Error, Context);
+  ASSERT_TRUE(M) << "Bad assembly?";
+
+  Function *F = M->getFunction("test");
+  BasicBlock *BB = &F->getEntryBlock();
+  Instruction *StackObj = &*BB->begin();
+  Instruction *DerviedPtr = StackObj->getNextNode();
+  
+  // The base object and its derived pointer are both captured.
+  EXPECT_TRUE(PointerMayBeCaptured(StackObj, true));
+  EXPECT_TRUE(PointerMayBeCaptured(DerviedPtr, true));
+}

antoniofrighetto

Mind sharing the compile-time impact for this change?

llvm/unittests/Analysis/CaptureTrackingTest.cpp

Camsyn · 2025-03-25T06:40:13Z

Mind sharing the compile-time impact for this change?

Thanks @dtcxzyw for his performance testing. In the llvm-opt-benchmark he built, this modification brings ~0.01% compilation overhead.

I also tried llvm-test-suite with release mode clang locally, and the output also shows that the patch has a negligible impact on compile time, as follows:

Tests: 3326
Metric: compile_time

Program                                       compile_time                    
                                              results-orig results-patch diff 
tools/test/ret1                                 0.12         0.12         0.0%
Bitcode/Be...hmarks/Halide/blur/halide_blur     0.09         0.09         0.0%
MicroBench...marks/Builtins/Int128/Builtins     2.04         2.04         0.0%
MicroBench...Diffusion/AnisotropicDiffusion     1.92         1.92         0.0%
MicroBench...teralFiltering/BilateralFilter     1.77         1.77         0.0%
SingleSour...itTests/Threads/2010-12-08-tls     0.03         0.03         0.0%
SingleSource/UnitTests/Threads/tls              0.03         0.03         0.0%
SingleSour...PlanNativePath/outer-loop-vect     1.73         1.73         0.0%
SingleSour...ce/UnitTests/Vectorizer/any-of     3.03         3.03         0.0%
SingleSour...UnitTests/Vectorizer/find-last     4.37         4.37         0.0%
SingleSour...UnitTests/Vectorizer/gcc-loops     1.29         1.29         0.0%
SingleSour...tTests/Vectorizer/index-select     2.32         2.32         0.0%
SingleSour...itTests/Vectorizer/recurrences     1.83         1.83         0.0%
SingleSour.../UnitTests/SignlessTypes/cast2     0.04         0.04         0.0%
SingleSour...ce/UnitTests/SignlessTypes/ccc     0.04         0.04         0.0%
                           Geomean difference                             0.0%
      compile_time                     
run   results-orig results-patch   diff
count  3326.000000  3326.000000   453.0
mean   0.329113     0.329113      0.0  
std    3.125678     3.125678      0.0  
min    0.000000     0.000000      0.0  
25%    0.000000     0.000000      0.0  
50%    0.000000     0.000000      0.0  
75%    0.000000     0.000000      0.0  
max    122.717300   122.717300    0.0

nikic

We definitely should not be doing this in CaptureTracking -- passing a sensible value is the responsibility of the caller. In most contexts, performing capture checks only makes sense on "identified function-local objects".

Camsyn · 2025-03-25T15:12:21Z

We definitely should not be doing this in CaptureTracking -- passing a sensible value is the responsibility of the caller. In most contexts, performing capture checks only makes sense on "identified function-local objects".

If so, should we consider stating this in the comments of PointerMayBeCaptured, or renaming it FuncLocalObjMayBeCaptured or whatever?

The name of this function is very misleading, in fact, ThreadSanitizer.cpp and SanitizerBinaryMetadata .cpp have passed an arbitrary pointer, i.e., operand pointer of load/store, to it.

Finally, thanks for your feedback 😀.

github-actions · 2025-03-26T10:03:08Z

✅ With the latest revision this PR passed the C/C++ code formatter.

nikic · 2025-04-04T12:38:41Z

Sure, expanding the comment on the function is a good idea.

…iewer" This reverts commit cd64fa3.

This reverts commit 87c0937.

…ured comments The comments for PointerMayBeCaptured and related APIs have been updated to explicitly state that these functions assume the input is an function-local object. It is the caller's responsibility to ensure this precondition is met. This clarification aligns with recent discussions and feedback from code reviews, aiming to prevent misuse and potential analysis errors.

Camsyn · 2025-04-12T08:15:29Z

Since I didn’t see the comment update in the latest commit yet, I’ve gone ahead and included it in this PR — hope that’s okay.

nikic · 2025-04-14T12:24:05Z

llvm/include/llvm/Analysis/CaptureTracking.h

@@ -41,13 +41,17 @@ namespace llvm {
  /// MaxUsesToExplore specifies how many uses the analysis should explore for
  /// one value before giving up due too "too many uses". If MaxUsesToExplore
  /// is zero, a default value is assumed.
+  /// This assumes the pointer is to a function-local object. The caller is
+  /// responsible for ensuring this.


This depends on the use case. In most cases you'd want to pass a function-local object, but e.g. to infer captures() attribute you'd just work with arbitrary function arguments.

I think I'd add something like this instead:

This function only considers captures of the passed value via its use-def chain, without considering captures of values it may be based on, or implicit captures such as for external globals.

This depends on the use case. In most cases you'd want to pass a function-local object, but e.g. to infer captures() attribute you'd just work with arbitrary function arguments.

I think I'd add something like this instead:

This function only considers captures of the passed value via its use-def chain, without considering captures of values it may be based on, or implicit captures such as for external globals.

Good point, and thank you for the improved phrasing! Your suggested comment is indeed more appropriate. Just a minor correction: it should be 'def-use chain' rather than 'use-def chain' because our CaptureAnalysis works by forward tracking using getUser.

nikic

LGTM

nikic · 2025-04-14T14:13:27Z

Can you please update the PR description to match the new contents of the PR?

Camsyn · 2025-04-14T14:26:49Z

Very honored

Can you please update the PR description to match the new contents of the PR?

Done

github-actions · 2025-04-14T16:06:52Z

@Camsyn Congratulations on having your first Pull Request (PR) merged into the LLVM Project!

Your changes will be combined with recent changes from other authors, then tested by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR.

Please check whether problems have been caused by your change specifically, as the builds can include changes from many authors. It is not uncommon for your change to be included in a build that fails due to someone else's changes, or infrastructure issues.

How to do this, and the rest of the post-merge process, is covered in detail here.

If your change does cause a problem, it may be reverted, or you can revert it yourself. This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again.

If you don't get any reports, no action is required from you. Your changes are working as expected, well done!

…ured comments (llvm#132744) Fixes issue llvm#132739. CaptureAnalysis only considers captures through the def-use chain of the provided pointer, explicitly excluding captures of underlying values or implicit captures like those involving external globals. The previous comment for `PointerMayBeCaptured` did not clearly state this limitation, leading to its incorrect usage in files such as ThreadSanitizer.cpp and SanitizerMetadata.cpp. This PR addresses this by refining the comments for the relevant APIs within `PointerMayBeCaptured` to explicitly document this behavior.

llvmbot added the llvm:analysis label Mar 24, 2025

antoniofrighetto changed the title ~~[CaptureTracking] Supports analysis for dervied pointers~~ [CaptureTracking] Supports analysis for derived pointers Mar 24, 2025

antoniofrighetto requested a review from nikic March 24, 2025 15:00

antoniofrighetto reviewed Mar 24, 2025

View reviewed changes

llvm/unittests/Analysis/CaptureTrackingTest.cpp Outdated Show resolved Hide resolved

llvm/unittests/Analysis/CaptureTrackingTest.cpp Outdated Show resolved Hide resolved

This was referenced Mar 24, 2025

Task submission dtcxzyw/llvm-opt-benchmark#1312

Open

pre-commit: PR132744 dtcxzyw/llvm-opt-benchmark#2225

Closed

Tackle some format issues adhering the suggestions of the reviewer

cd64fa3

nikic requested changes Mar 25, 2025

View reviewed changes

Camsyn added 3 commits April 12, 2025 16:10

Revert "Tackle some format issues adhering the suggestions of the rev…

5ba4919

…iewer" This reverts commit cd64fa3.

Revert "[CaptureTracking] Supports analysis for dervied pointers"

087e9e4

This reverts commit 87c0937.

Camsyn changed the title ~~[CaptureTracking] Supports analysis for derived pointers~~ [CaptureTracking][NFC] Clarify usage expectations in PointerMayBeCaptured comments Apr 12, 2025

nikic reviewed Apr 14, 2025

View reviewed changes

Refine the comment aadhering to the reviewer

1d8292b

nikic approved these changes Apr 14, 2025

View reviewed changes

Merge branch 'main' into fix-capture

9ab5a98

nikic merged commit 71d1059 into llvm:main Apr 14, 2025
10 of 11 checks passed

Camsyn deleted the fix-capture branch April 15, 2025 07:57

This was referenced Apr 24, 2025

[TSan, SanitizerBinaryMetadata] Analyze the capture status for alloca rather than arbitrary Addr #132756

Merged

Request Commit Access For Camsyn #137457

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[CaptureTracking][NFC] Clarify usage expectations in PointerMayBeCaptured comments #132744

[CaptureTracking][NFC] Clarify usage expectations in PointerMayBeCaptured comments #132744

Camsyn commented Mar 24, 2025 •

edited

Loading

github-actions bot commented Mar 24, 2025

llvmbot commented Mar 24, 2025

antoniofrighetto left a comment

Camsyn commented Mar 25, 2025 •

edited

Loading

nikic left a comment

Camsyn commented Mar 25, 2025

github-actions bot commented Mar 26, 2025 •

edited

Loading

nikic commented Apr 4, 2025

Camsyn commented Apr 12, 2025

nikic Apr 14, 2025

Camsyn Apr 14, 2025

nikic left a comment

nikic commented Apr 14, 2025

Camsyn commented Apr 14, 2025

github-actions bot commented Apr 14, 2025

[CaptureTracking][NFC] Clarify usage expectations in PointerMayBeCaptured comments #132744

[CaptureTracking][NFC] Clarify usage expectations in PointerMayBeCaptured comments #132744

Conversation

Camsyn commented Mar 24, 2025 • edited Loading

github-actions bot commented Mar 24, 2025

llvmbot commented Mar 24, 2025

antoniofrighetto left a comment

Choose a reason for hiding this comment

Camsyn commented Mar 25, 2025 • edited Loading

nikic left a comment

Choose a reason for hiding this comment

Camsyn commented Mar 25, 2025

github-actions bot commented Mar 26, 2025 • edited Loading

nikic commented Apr 4, 2025

Camsyn commented Apr 12, 2025

nikic Apr 14, 2025

Choose a reason for hiding this comment

Camsyn Apr 14, 2025

Choose a reason for hiding this comment

nikic left a comment

Choose a reason for hiding this comment

nikic commented Apr 14, 2025

Camsyn commented Apr 14, 2025

github-actions bot commented Apr 14, 2025

Camsyn commented Mar 24, 2025 •

edited

Loading

Camsyn commented Mar 25, 2025 •

edited

Loading

github-actions bot commented Mar 26, 2025 •

edited

Loading