Skip to content
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
88 changes: 77 additions & 11 deletions clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -717,18 +717,71 @@ const ExplodedNode *StreamChecker::getAcquisitionSite(const ExplodedNode *N,
return nullptr;
}

/// Invalidate only the requested elements instead of the whole buffer.
/// This is basically a refinement of the more generic 'escapeArgs' or
/// the plain old 'invalidateRegions'.
/// This only works if the \p StartIndex and \p Count are concrete or
/// perfectly-constrained.
static ProgramStateRef
escapeByStartIndexAndCount(ProgramStateRef State, CheckerContext &C,
const CallEvent &Call, const MemRegion *Buffer,
QualType ElemType, SVal StartIndex, SVal Count) {
if (!llvm::isa_and_nonnull<SubRegion>(Buffer))
return State;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd slightly prefer a dyn_cast_or_null instead of this isa check that's eventually followed by a bare cast.


auto UnboxAsInt = [&C, &State](SVal V) -> std::optional<int64_t> {
auto &SVB = C.getSValBuilder();
if (const llvm::APSInt *Int = SVB.getKnownValue(State, V))
return Int->tryExtValue();
return std::nullopt;
};

auto StartIndexVal = UnboxAsInt(StartIndex);
auto CountVal = UnboxAsInt(Count);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
auto StartIndexVal = UnboxAsInt(StartIndex);
auto CountVal = UnboxAsInt(Count);
std::optional<int64_t> StartIndexVal = UnboxAsInt(StartIndex);
std::optional<int64_t> CountVal = UnboxAsInt(Count);

The explicitly specified type would make this code easier to read (without it, my first guess was that these are int variables because the lambda is named ...AsInt).


// FIXME: Maybe we could make this more generic, and expose this by the
// 'invalidateRegions' API. After doing so, it might make sense to make this
// limit configurable.
constexpr int MaxInvalidatedElementsLimit = 64;
if (!StartIndexVal || !CountVal || *CountVal > MaxInvalidatedElementsLimit) {
return State->invalidateRegions({loc::MemRegionVal{Buffer}},
Call.getOriginExpr(), C.blockCount(),
C.getLocationContext(),
/*CausesPointerEscape=*/false);
}

constexpr auto DoNotInvalidateSuperRegion =
RegionAndSymbolInvalidationTraits::InvalidationKinds::
TK_DoNotInvalidateSuperRegion;

auto &RegionManager = Buffer->getMemRegionManager();
SmallVector<SVal> EscapingVals;
EscapingVals.reserve(*CountVal);

RegionAndSymbolInvalidationTraits ITraits;
for (auto Idx : llvm::seq(*StartIndexVal, *StartIndexVal + *CountVal)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This loop does not work if the type of the array is not the same as the "size" parameter passed to fread.:

int buffer[100];
fread(buffer + 1, 3, 5, file);

In this case 3*5 bytes should be read by fread into the array. If sizeof(int)==4 4 elements should be invalidated in the buffer starting from index 1.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is a problem.
To properly invalidate element-wise, I must bind the correctly typed conjured symbols so that subsequent lookups in the store would hit the entry.
Consequently, I can't consider the buffer as a char array and invalidate each chars in the range.
I must use ElementRegions, thus I need properly aligned item-wise iteration and invalidation.

So, I could try to convert the element size and item count into a start byte offset and length in bytes. After this I could check if the start bytes offset would land on an item boundary and if the length would cover a whole element object.

This way we could cover the case like:

int buffer[100]; // assuming 4 byte ints.
fread(buffer + 1, 2, 10, file); // Reads 20 bytes of data, which is dividable by 4, thus we can think of the 'buffer[1:6]' elements as individually invalidated.

WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented.

NonLoc Index = C.getSValBuilder().makeArrayIndex(Idx);
const auto *Element = RegionManager.getElementRegion(
ElemType, Index, cast<SubRegion>(Buffer), C.getASTContext());
EscapingVals.push_back(loc::MemRegionVal(Element));
ITraits.setTrait(Element, DoNotInvalidateSuperRegion);
}
return State->invalidateRegions(EscapingVals, Call.getOriginExpr(),
C.blockCount(), C.getLocationContext(),
/*CausesPointerEscape=*/false,
/*InvalidatedSymbols=*/nullptr, &Call,
&ITraits);
}

static ProgramStateRef escapeArgs(ProgramStateRef State, CheckerContext &C,
const CallEvent &Call,
ArrayRef<unsigned int> EscapingArgs) {
const auto *CE = Call.getOriginExpr();

SmallVector<SVal> EscapingVals;
EscapingVals.reserve(EscapingArgs.size());
for (auto EscArgIdx : EscapingArgs)
EscapingVals.push_back(Call.getArgSVal(EscArgIdx));
State = State->invalidateRegions(EscapingVals, CE, C.blockCount(),
C.getLocationContext(),
/*CausesPointerEscape=*/false);
auto GetArgSVal = [&Call](int Idx) { return Call.getArgSVal(Idx); };
auto EscapingVals = to_vector(map_range(EscapingArgs, GetArgSVal));
State = State->invalidateRegions(EscapingVals, Call.getOriginExpr(),
C.blockCount(), C.getLocationContext(),
/*CausesPointerEscape=*/false,
/*InvalidatedSymbols=*/nullptr);
return State;
}

Expand Down Expand Up @@ -937,8 +990,21 @@ void StreamChecker::evalFreadFwrite(const FnDescription *Desc,

// At read, invalidate the buffer in any case of error or success,
// except if EOF was already present.
if (IsFread && !E.isStreamEof())
State = escapeArgs(State, C, Call, {0});
if (IsFread && !E.isStreamEof()) {
// Try to invalidate the individual elements.
if (const auto *BufferFirstElem =
dyn_cast_or_null<ElementRegion>(Call.getArgSVal(0).getAsRegion())) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we see an ElementRegion in the case when we're freading into the beginning of an array variable?

I see that the element region layer should be there if we did pointer arithmetic or if this is a symbolic region converted to a type; but not sure that this covers the "simply read into an array" case as well. Could you add a simple testcase that validates that the individual element invalidation activates in a situation like

int arr[10];
fread(arr, sizeof(int), 5, <FILE pointer>);

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case fread reads to the beginning of a buffer, we won't have an ElementRegion, thus the heuristic for eagerly binding the invalidated elements won't trigger. This is unfortunate, but you can think of this as we keep the previous behavior.
To circumvent this, I'd need to know the type for for the pointee.
This would imply that I should special-case TypedValueRegion and SymbolicRegion.

I'll think about it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented.

const MemRegion *Buffer = BufferFirstElem->getSuperRegion();
QualType ElemTy = BufferFirstElem->getElementType();
SVal FirstAccessedItem = BufferFirstElem->getIndex();
SVal ItemCount = Call.getArgSVal(2);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is some redundancy in getting the argument values, SizeVal and NMembVal are already available and could be used by the function (if it would take NonLoc).

State = escapeByStartIndexAndCount(State, C, Call, Buffer, ElemTy,
FirstAccessedItem, ItemCount);
} else {
// Otherwise just fall back to invalidating the whole buffer.
State = escapeArgs(State, C, Call, {0});
}
}

// Generate a transition for the success state.
// If we know the state to be FEOF at fread, do not add a success state.
Expand Down
Loading