[analyzer] Refine invalidation caused by fread
#93408
@@ -717,18 +717,71 @@ const ExplodedNode *StreamChecker::getAcquisitionSite(const ExplodedNode *N,
  return nullptr;
}

/// Invalidate only the requested elements instead of the whole buffer.
/// This is basically a refinement of the more generic 'escapeArgs' or
/// the plain old 'invalidateRegions'.
/// This only works if the \p StartIndex and \p Count are concrete or
/// perfectly-constrained.
static ProgramStateRef
escapeByStartIndexAndCount(ProgramStateRef State, CheckerContext &C,
                           const CallEvent &Call, const MemRegion *Buffer,
                           QualType ElemType, SVal StartIndex, SVal Count) {
  if (!llvm::isa_and_nonnull<SubRegion>(Buffer))
    return State;

  auto UnboxAsInt = [&C, &State](SVal V) -> std::optional<int64_t> {
    auto &SVB = C.getSValBuilder();
    if (const llvm::APSInt *Int = SVB.getKnownValue(State, V))
      return Int->tryExtValue();
    return std::nullopt;
  };

  auto StartIndexVal = UnboxAsInt(StartIndex);
  auto CountVal = UnboxAsInt(Count);

Suggested change:
-  auto StartIndexVal = UnboxAsInt(StartIndex);
-  auto CountVal = UnboxAsInt(Count);
+  std::optional<int64_t> StartIndexVal = UnboxAsInt(StartIndex);
+  std::optional<int64_t> CountVal = UnboxAsInt(Count);
The explicitly specified type would make this code easier to read (without it, my first guess was that these are int variables because the lambda is named ...AsInt).
This loop does not work if the element size of the array differs from the "size" parameter passed to fread:
int buffer[100];
fread(buffer + 1, 3, 5, file);
In this case fread should read 3*5 = 15 bytes into the array. If sizeof(int) == 4, then 4 elements should be invalidated in the buffer, starting from index 1 (the last one is only partially overwritten).
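The arithmetic behind this can be sketched as follows. This is only an illustration: `ElementRange` and `touchedElements` are hypothetical names that do not appear in the patch, and the sketch assumes concrete, non-negative values that do not overflow.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of the patch: half-open range [First, Last)
// of array elements overlapped by an fread of Size * NMemb bytes that starts
// at element StartIndex.
struct ElementRange {
  int64_t First;
  int64_t Last; // one past the last touched element
};

inline ElementRange touchedElements(int64_t StartIndex, int64_t ElemSize,
                                    int64_t Size, int64_t NMemb) {
  assert(ElemSize > 0 && StartIndex >= 0 && Size >= 0 && NMemb >= 0);
  const int64_t Bytes = Size * NMemb; // assumed not to overflow
  // Round up: a partially overwritten element must be invalidated too.
  const int64_t Count = (Bytes + ElemSize - 1) / ElemSize;
  return {StartIndex, StartIndex + Count};
}
```

With 4-byte ints, `touchedElements(1, 4, 3, 5)` covers elements 1 through 4, matching the example above.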
Yes, this is a problem.
To properly invalidate element-wise, I must bind correctly typed conjured symbols so that subsequent lookups in the store hit the entry.
Consequently, I can't treat the buffer as a char array and invalidate each char in the range.
I must use ElementRegions, so I need properly aligned, item-wise iteration and invalidation.
So I could try to convert the element size and item count into a start byte offset and a length in bytes. After this, I could check whether the start byte offset lands on an item boundary and whether the length covers a whole number of element objects.
This way we could cover the case like:
int buffer[100]; // assuming 4 byte ints.
fread(buffer + 1, 2, 10, file); // Reads 20 bytes of data, which is divisible by 4, thus we can think of the 'buffer[1:6]' elements (indices 1 through 5) as individually invalidated.
WDYT?
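A minimal sketch of that boundary check, assuming concrete values. The names `ElementSpan` and `wholeElementsCovered` are made up for illustration and are not what the patch uses:

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical result type: index of the first element and the number of
// whole elements covered by the read.
struct ElementSpan {
  int64_t First;
  int64_t Count;
};

// Hypothetical helper, not part of the patch. Returns the span of whole
// elements covered by reading Size * NMemb bytes at StartByteOffset, or
// std::nullopt if the read does not begin and end on element boundaries.
inline std::optional<ElementSpan>
wholeElementsCovered(int64_t StartByteOffset, int64_t ElemSize, int64_t Size,
                     int64_t NMemb) {
  assert(ElemSize > 0 && StartByteOffset >= 0 && Size >= 0 && NMemb >= 0);
  const int64_t Length = Size * NMemb; // assumed not to overflow
  if (StartByteOffset % ElemSize != 0 || Length % ElemSize != 0)
    return std::nullopt; // misaligned: fall back to coarser invalidation
  return ElementSpan{StartByteOffset / ElemSize, Length / ElemSize};
}
```

For the example above (start offset 4 bytes, 4-byte ints, size 2, count 10) this yields 5 elements starting at index 1. The earlier `fread(buffer + 1, 3, 5, file)` case yields no span, because 15 bytes is not a multiple of 4.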
Implemented.
Why do we see an ElementRegion in the case when we're freading into the beginning of an array variable?
I see that the element region layer should be there if we did pointer arithmetic or if this is a symbolic region converted to a type, but I'm not sure that this also covers the "simply read into an array" case. Could you add a simple test case that validates that the individual element invalidation activates in a situation like
int arr[10];
fread(arr, sizeof(int), 5, <FILE pointer>);
?
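For reference, the runtime behaviour such a test would model can be checked with an ordinary program. This is a hedged sketch rather than an analyzer test: `countUntouchedAfterFread` is a made-up name, and it assumes `std::tmpfile` is usable in the environment.

```cpp
#include <cstdio>

// Illustrative only: returns how many of the 10 elements still hold the
// sentinel after reading 5 ints into the start of the array, or -1 if the
// temporary file could not be set up.
inline int countUntouchedAfterFread() {
  std::FILE *F = std::tmpfile();
  if (!F)
    return -1;

  const int Source[5] = {0, 1, 2, 3, 4};
  if (std::fwrite(Source, sizeof(int), 5, F) != 5) {
    std::fclose(F);
    return -1;
  }
  std::rewind(F);

  int Arr[10];
  for (int &E : Arr)
    E = -1; // sentinel that never occurs in Source

  const std::size_t Read = std::fread(Arr, sizeof(int), 5, F);
  std::fclose(F);
  if (Read != 5)
    return -1;

  // Only Arr[0..4] may have been overwritten; Arr[5..9] keep the sentinel.
  int Untouched = 0;
  for (int E : Arr)
    if (E == -1)
      ++Untouched;
  return Untouched;
}
```

The last five elements keep their old values, which is why element-wise invalidation is expected to leave them alone.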
If fread reads to the beginning of a buffer, we won't have an ElementRegion, so the heuristic for eagerly binding the invalidated elements won't trigger. This is unfortunate, but you can think of it as keeping the previous behavior.
To circumvent this, I'd need to know the type of the pointee.
This would imply that I should special-case TypedValueRegion and SymbolicRegion.
I'll think about it.
Implemented.
There is some redundancy here in getting the argument values: SizeVal and NMembVal are already available and could be used by the function (if it took NonLoc).