-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[CaptureTracking] False Negatives in PointerMayBeCaptured
Due to Missing Underlying Object Backtracking
#132739
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Labels
Comments
Camsyn
added a commit
to Camsyn/llvm-project
that referenced
this issue
Mar 24, 2025
Fixes issue llvm#132739. This fixes a missed capture detection scenario where only the base pointer is captured. Previously, PointerMayBeCaptured only performed forward def-use analysis on the given pointer, which worked for base pointers (Arguments/Allocas/Calls) but failed to detect captures of derived pointers when their base pointers were captured. The fix: For the input pointer, first trace back to its base pointer using `getUnderlyingObject`, then perform capture analysis on the base. This ensures proper handling of derived pointers with bases captured. Performance considerations: Most existing callers already pass base pointers (the common case), so the added backtracing has negligible overhead The few callers (ThreadSanitizer.cpp/SanitizerBinaryMetadata.cpp) passing arbitrary pointers are sanitizer-related components where compilation time is less critical compared to runtime overhead This addresses false negatives in capture detection while maintaining reasonable compilation efficiency.
Camsyn
added a commit
to Camsyn/llvm-project
that referenced
this issue
Mar 24, 2025
Fixes issue llvm#132739. This fixes a missed capture detection scenario where only the base pointer is captured. Previously, PointerMayBeCaptured only performed forward def-use analysis on the given pointer, which worked for base pointers (Arguments/Allocas/Calls) but failed to detect captures of derived pointers when their base pointers were captured. The fix: For the input pointer, first trace back to its base pointer using `getUnderlyingObject`, then perform capture analysis on the base. This ensures proper handling of derived pointers with bases captured. Performance considerations: - Most existing callers already pass base pointers (the common case), so the added backtracing has negligible overhead - The few callers (ThreadSanitizer.cpp/SanitizerBinaryMetadata.cpp) passing arbitrary pointers are sanitizer-related components where compilation time is less critical compared to runtime overhead This addresses false negatives in capture detection while maintaining reasonable compilation efficiency.
Camsyn
added a commit
to Camsyn/llvm-project
that referenced
this issue
Mar 24, 2025
…a` rather than arbitrary `Addr` This commit addresses the limitation in `PointerMayBeCaptured` analysis when dealing with derived pointers (e.g. arr+1) as described in issue llvm#132739. The current implementation may miss captures of the underlying alloca when analyzing derived pointers. Even when working correctly, backtracking to the original alloca during analysis causes redundancy to the outer's `findAllocaForValue`. Key changes: Directly analyze the capture status of the underlying alloca instead of derived pointers to ensure accurate capture detection This patch fixes some FNs of TSan, referring to the appending testcases for more details.
Camsyn
added a commit
to Camsyn/llvm-project
that referenced
this issue
Mar 25, 2025
…a` rather than arbitrary `Addr` This commit addresses the limitation in `PointerMayBeCaptured` analysis when dealing with derived pointers (e.g. arr+1) as described in issue llvm#132739. The current implementation may miss captures of the underlying alloca when analyzing derived pointers. Even when working correctly, backtracking to the original alloca during analysis causes redundancy to the outer's `findAllocaForValue`. Key changes: Directly analyze the capture status of the underlying alloca instead of derived pointers to ensure accurate capture detection This patch fixes some FNs of TSan, referring to the appending testcases for more details.
nikic
pushed a commit
that referenced
this issue
Apr 14, 2025
…ured comments (#132744) Fixes issue #132739. CaptureAnalysis only considers captures through the def-use chain of the provided pointer, explicitly excluding captures of underlying values or implicit captures like those involving external globals. The previous comment for `PointerMayBeCaptured` did not clearly state this limitation, leading to its incorrect usage in files such as ThreadSanitizer.cpp and SanitizerMetadata.cpp. This PR addresses this by refining the comments for the relevant APIs within `PointerMayBeCaptured` to explicitly document this behavior.
var-const
pushed a commit
to ldionne/llvm-project
that referenced
this issue
Apr 17, 2025
…ured comments (llvm#132744) Fixes issue llvm#132739. CaptureAnalysis only considers captures through the def-use chain of the provided pointer, explicitly excluding captures of underlying values or implicit captures like those involving external globals. The previous comment for `PointerMayBeCaptured` did not clearly state this limitation, leading to its incorrect usage in files such as ThreadSanitizer.cpp and SanitizerMetadata.cpp. This PR addresses this by refining the comments for the relevant APIs within `PointerMayBeCaptured` to explicitly document this behavior.
Camsyn
added a commit
to Camsyn/llvm-project
that referenced
this issue
Apr 24, 2025
…a` rather than arbitrary `Addr` This commit addresses the limitation in `PointerMayBeCaptured` analysis when dealing with derived pointers (e.g. arr+1) as described in issue llvm#132739. The current implementation may miss captures of the underlying alloca when analyzing derived pointers. Even when working correctly, backtracking to the original alloca during analysis causes redundancy to the outer's `findAllocaForValue`. Key changes: Directly analyze the capture status of the underlying alloca instead of derived pointers to ensure accurate capture detection This patch fixes some FNs of TSan, referring to the appending testcases for more details.
melver
pushed a commit
that referenced
this issue
Apr 24, 2025
…a` rather than arbitrary `Addr` (#132756) This PR is based on my last PR #132752 (the first commit of this PR), but addressing a different issue. This commit addresses the limitation in `PointerMayBeCaptured` analysis when dealing with derived pointers (e.g. arr+1) as described in issue #132739. The current implementation of `PointerMayBeCaptured` may miss captures of the underlying `alloca` when analyzing derived pointers, leading to some FNs in TSan, as follows: ```cpp void *Thread(void *a) { ((int*)a)[1] = 43; return 0; } int main() { int Arr[2] = {41, 42}; pthread_t t; pthread_create(&t, 0, Thread, &Arr[0]); // Missed instrumentation here due to the FN of PointerMayBeCaptured Arr[1] = 43; barrier_wait(&barrier); pthread_join(t, 0); } ``` Refer to this [godbolt page](https://godbolt.org/z/n67GrxdcE) to get the compilation result of TSan. Even when `PointerMayBeCaptured` working correctly, it should backtrack to the original `alloca` firstly during analysis, causing redundancy to the outer's `findAllocaForValue`. ```cpp const AllocaInst *AI = findAllocaForValue(Addr); // Instead of Addr, we should check whether its base pointer is captured. if (AI && !PointerMayBeCaptured(Addr, true)) ... ``` Key changes: Directly analyze the capture status of the underlying `alloca` instead of derived pointers to ensure accurate capture detection ```cpp const AllocaInst *AI = findAllocaForValue(Addr); // Instead of Addr, we should check whether its base pointer is captured. if (AI && !PointerMayBeCaptured(AI, true)) ... ```
IanWood1
pushed a commit
to IanWood1/llvm-project
that referenced
this issue
May 6, 2025
…a` rather than arbitrary `Addr` (llvm#132756) This PR is based on my last PR llvm#132752 (the first commit of this PR), but addressing a different issue. This commit addresses the limitation in `PointerMayBeCaptured` analysis when dealing with derived pointers (e.g. arr+1) as described in issue llvm#132739. The current implementation of `PointerMayBeCaptured` may miss captures of the underlying `alloca` when analyzing derived pointers, leading to some FNs in TSan, as follows: ```cpp void *Thread(void *a) { ((int*)a)[1] = 43; return 0; } int main() { int Arr[2] = {41, 42}; pthread_t t; pthread_create(&t, 0, Thread, &Arr[0]); // Missed instrumentation here due to the FN of PointerMayBeCaptured Arr[1] = 43; barrier_wait(&barrier); pthread_join(t, 0); } ``` Refer to this [godbolt page](https://godbolt.org/z/n67GrxdcE) to get the compilation result of TSan. Even when `PointerMayBeCaptured` working correctly, it should backtrack to the original `alloca` firstly during analysis, causing redundancy to the outer's `findAllocaForValue`. ```cpp const AllocaInst *AI = findAllocaForValue(Addr); // Instead of Addr, we should check whether its base pointer is captured. if (AI && !PointerMayBeCaptured(Addr, true)) ... ``` Key changes: Directly analyze the capture status of the underlying `alloca` instead of derived pointers to ensure accurate capture detection ```cpp const AllocaInst *AI = findAllocaForValue(Addr); // Instead of Addr, we should check whether its base pointer is captured. if (AI && !PointerMayBeCaptured(AI, true)) ... ```
IanWood1
pushed a commit
to IanWood1/llvm-project
that referenced
this issue
May 6, 2025
…a` rather than arbitrary `Addr` (llvm#132756) This PR is based on my last PR llvm#132752 (the first commit of this PR), but addressing a different issue. This commit addresses the limitation in `PointerMayBeCaptured` analysis when dealing with derived pointers (e.g. arr+1) as described in issue llvm#132739. The current implementation of `PointerMayBeCaptured` may miss captures of the underlying `alloca` when analyzing derived pointers, leading to some FNs in TSan, as follows: ```cpp void *Thread(void *a) { ((int*)a)[1] = 43; return 0; } int main() { int Arr[2] = {41, 42}; pthread_t t; pthread_create(&t, 0, Thread, &Arr[0]); // Missed instrumentation here due to the FN of PointerMayBeCaptured Arr[1] = 43; barrier_wait(&barrier); pthread_join(t, 0); } ``` Refer to this [godbolt page](https://godbolt.org/z/n67GrxdcE) to get the compilation result of TSan. Even when `PointerMayBeCaptured` working correctly, it should backtrack to the original `alloca` firstly during analysis, causing redundancy to the outer's `findAllocaForValue`. ```cpp const AllocaInst *AI = findAllocaForValue(Addr); // Instead of Addr, we should check whether its base pointer is captured. if (AI && !PointerMayBeCaptured(Addr, true)) ... ``` Key changes: Directly analyze the capture status of the underlying `alloca` instead of derived pointers to ensure accurate capture detection ```cpp const AllocaInst *AI = findAllocaForValue(Addr); // Instead of Addr, we should check whether its base pointer is captured. if (AI && !PointerMayBeCaptured(AI, true)) ... ```
IanWood1
pushed a commit
to IanWood1/llvm-project
that referenced
this issue
May 6, 2025
…a` rather than arbitrary `Addr` (llvm#132756) This PR is based on my last PR llvm#132752 (the first commit of this PR), but addressing a different issue. This commit addresses the limitation in `PointerMayBeCaptured` analysis when dealing with derived pointers (e.g. arr+1) as described in issue llvm#132739. The current implementation of `PointerMayBeCaptured` may miss captures of the underlying `alloca` when analyzing derived pointers, leading to some FNs in TSan, as follows: ```cpp void *Thread(void *a) { ((int*)a)[1] = 43; return 0; } int main() { int Arr[2] = {41, 42}; pthread_t t; pthread_create(&t, 0, Thread, &Arr[0]); // Missed instrumentation here due to the FN of PointerMayBeCaptured Arr[1] = 43; barrier_wait(&barrier); pthread_join(t, 0); } ``` Refer to this [godbolt page](https://godbolt.org/z/n67GrxdcE) to get the compilation result of TSan. Even when `PointerMayBeCaptured` working correctly, it should backtrack to the original `alloca` firstly during analysis, causing redundancy to the outer's `findAllocaForValue`. ```cpp const AllocaInst *AI = findAllocaForValue(Addr); // Instead of Addr, we should check whether its base pointer is captured. if (AI && !PointerMayBeCaptured(Addr, true)) ... ``` Key changes: Directly analyze the capture status of the underlying `alloca` instead of derived pointers to ensure accurate capture detection ```cpp const AllocaInst *AI = findAllocaForValue(Addr); // Instead of Addr, we should check whether its base pointer is captured. if (AI && !PointerMayBeCaptured(AI, true)) ... ```
Ankur-0429
pushed a commit
to Ankur-0429/llvm-project
that referenced
this issue
May 9, 2025
…a` rather than arbitrary `Addr` (llvm#132756) This PR is based on my last PR llvm#132752 (the first commit of this PR), but addressing a different issue. This commit addresses the limitation in `PointerMayBeCaptured` analysis when dealing with derived pointers (e.g. arr+1) as described in issue llvm#132739. The current implementation of `PointerMayBeCaptured` may miss captures of the underlying `alloca` when analyzing derived pointers, leading to some FNs in TSan, as follows: ```cpp void *Thread(void *a) { ((int*)a)[1] = 43; return 0; } int main() { int Arr[2] = {41, 42}; pthread_t t; pthread_create(&t, 0, Thread, &Arr[0]); // Missed instrumentation here due to the FN of PointerMayBeCaptured Arr[1] = 43; barrier_wait(&barrier); pthread_join(t, 0); } ``` Refer to this [godbolt page](https://godbolt.org/z/n67GrxdcE) to get the compilation result of TSan. Even when `PointerMayBeCaptured` working correctly, it should backtrack to the original `alloca` firstly during analysis, causing redundancy to the outer's `findAllocaForValue`. ```cpp const AllocaInst *AI = findAllocaForValue(Addr); // Instead of Addr, we should check whether its base pointer is captured. if (AI && !PointerMayBeCaptured(Addr, true)) ... ``` Key changes: Directly analyze the capture status of the underlying `alloca` instead of derived pointers to ensure accurate capture detection ```cpp const AllocaInst *AI = findAllocaForValue(Addr); // Instead of Addr, we should check whether its base pointer is captured. if (AI && !PointerMayBeCaptured(AI, true)) ... ```
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Description
PointerMaybeCaptured
, as its name suggests, is supposed to take an arbitrary pointer and analyze whether it might be captured.However, the current capture checking in
PointerMayBeCaptured
may produce false negatives by directly tracking the def-use chain of the given pointer without first backtracking to its underlying object. This leads to missed captures when analyzing derived pointers (e.g., GEP results) with their base objects captured.Reproducer
Analyze whether
&arr[1]
is captured or not.Analysis of Current Usage
Currently, there are TWO different patterns of the pointer argument passed to
PointerMayBeCaptured
.Safe Patterns (Majority of Existing Uses)
Direct patterns with guaranteed base objects:
Alloca
instructionsnoalias
attributesgetUnderlyingObject()
These account for uses in:
InstructionSimplify.cpp
InstCombineCompares.cpp
FunctionAttrs.cpp
(with manual object backtracking)DeadStoreElimination.cpp
SimplifyCFG.cpp
Potentially Problematic Cases
ThreadSanitizer.cpp
: Uses pointer operand of Store/Load directlySanitizerBinaryMetadata.cpp
: Processes pointer operand of Store/Load for GWP-San metadataThe FNs of capture relation of load/store addresses might lead to FN in TSan's race detection, observed in this Godbolt example:
TWO Potential Fixes
Fix of Restriction: Explicitly or implicitly restrict the pattern of pointer argument of
PointerMayBeCaptured
Fix of Implementation (Recommended):
PointerMayBeCaptured
:PointerMayBeCaptured
just pass the underlying object pointer to it, this fix would not introduce obvious overhead, but eliminate the FN when the argument pointer is not of underlying object.The text was updated successfully, but these errors were encountered: