Skip to content

Use widened bounds to update bounds in context #821

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
May 11, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion clang/include/clang/Sema/BoundsAnalysis.h
Original file line number Diff line number Diff line change
Expand Up @@ -203,7 +203,7 @@ namespace clang {
// @param[in] B is the current CFGBlock.
// return A mapping of Stmts to variables whose bounds are killed by the
// Stmt.
StmtDeclSetTy GetKillSet(const clang::CFGBlock *B);
StmtDeclSetTy GetKilledBounds(const clang::CFGBlock *B);

private:
// Compute Gen set for each edge in the CFG. If there is an edge B1->B2 and
Expand Down
14 changes: 11 additions & 3 deletions clang/lib/Sema/BoundsAnalysis.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -720,13 +720,21 @@ void BoundsAnalysis::ComputeOutSets(ElevatedCFGBlock *EB,
}
}

StmtDeclSetTy BoundsAnalysis::GetKillSet(const CFGBlock *B) {
ElevatedCFGBlock *EB = BlockMap[B];
StmtDeclSetTy BoundsAnalysis::GetKilledBounds(const CFGBlock *B) {
auto I = BlockMap.find(B);
if (I == BlockMap.end())
return StmtDeclSetTy();

ElevatedCFGBlock *EB = I->second;
return EB->Kill;
}

BoundsMapTy BoundsAnalysis::GetWidenedBounds(const CFGBlock *B) {
ElevatedCFGBlock *EB = BlockMap[B];
auto I = BlockMap.find(B);
if (I == BlockMap.end())
return BoundsMapTy();

ElevatedCFGBlock *EB = I->second;
return EB->In;
}

Expand Down
105 changes: 94 additions & 11 deletions clang/lib/Sema/SemaBounds.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2142,6 +2142,78 @@ namespace {
}
}

void ResetKilledBounds(StmtDeclSetTy &KilledBounds, Stmt *St,
BoundsContextTy &ObservedBounds) {
auto I = KilledBounds.find(St);
if (I == KilledBounds.end())
return;

// KilledBounds stores a mapping of statements to all variables whose
// bounds are killed by each statement. Here we reset the bounds of all
// variables killed by the statement S to the declared bounds.
for (const VarDecl *V : I->second) {
if (const BoundsExpr *Bounds = V->getBoundsExpr())

// TODO: Throughout clang in general (and inside dataflow analysis in
// particular) we repeatedly invoke ExpandBoundsToRange in order to
// canonicalize the bounds of a variable to RangeBoundsExpr. Sometimes
// we do this multiple times for the same variable. This is very
// inefficient because ExpandBoundsToRange can allocate AST data
// structures that are permanently allocated and increase the memory
// usage of the compiler. The solution is to canonicalize the bounds
// once and attach it to the VarDecl. See issue
// https://github.com/microsoft/checkedc-clang/issues/830.

ObservedBounds[V] = S.ExpandBoundsToRange(V, Bounds);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is inefficient for memory use to call ExpandBoundsToRange repeatedly on the bounds for a variable. ExpandBoundsToRange can allocate AST data structures, and we don't want to do that in a dataflow analysis. Could you open a follow-up issue about reducing allocation of AST nodes and add a TODO here. The solution is to compute it once and attach the canoncialized value to the VarDecl.

This change is beyond the scope of this PR, so I think it should be done separately.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created issue to track this: #830

}
}

void UpdateCtxWithWidenedBounds(BoundsMapTy &WidenedBounds,
BoundsContextTy &ObservedBounds) {
// WidenedBounds contains the mapping from _Nt_array_ptr to the offset by
// which its declared bounds should be widened. In this function we apply
// the offset to the declared bounds of the _Nt_array_ptr and update its
// bounds in ObservedBounds.

for (const auto item : WidenedBounds) {
const VarDecl *V = item.first;
unsigned Offset = item.second;

// We normalize the declared bounds to RangBoundsExpr here so that we
// can easily apply the offset to the upper bound.

// TODO: Throughout clang in general (and inside dataflow analysis in
// particular) we repeatedly invoke ExpandBoundsToRange in order to
// canonicalize the bounds of a variable to RangeBoundsExpr. Sometimes
// we do this multiple times for the same variable. This is very
// inefficient because ExpandBoundsToRange can allocate AST data
// structures that are permanently allocated and increase the memory
// usage of the compiler. The solution is to canonicalize the bounds
// once and attach it to the VarDecl. See issue
// https://github.com/microsoft/checkedc-clang/issues/830.

BoundsExpr *Bounds = S.ExpandBoundsToRange(V, V->getBoundsExpr());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here too we may be allocating AST data structures multiple times. We want to avoid doing that, and do the expansion only once.

if (RangeBoundsExpr *RBE = dyn_cast<RangeBoundsExpr>(Bounds)) {
const llvm::APInt
APIntOff(Context.getTargetInfo().getPointerWidth(0), Offset);
IntegerLiteral *WidenedOffset = CreateIntegerLiteral(APIntOff);

Expr *Lower = RBE->getLowerExpr();
Expr *Upper = RBE->getUpperExpr();

// WidenedUpperBound = UpperBound + WidenedOffset.
Expr *WidenedUpper = ExprCreatorUtil::CreateBinaryOperator(
S, Upper, WidenedOffset,
BinaryOperatorKind::BO_Add);

RangeBoundsExpr *R =
new (Context) RangeBoundsExpr(Lower, WidenedUpper,
SourceLocation(), SourceLocation());
ObservedBounds[V] = R;
}
}
}

// Walk the CFG, traversing basic blocks in reverse post-oder.
// For each element of a block, check bounds declarations. Skip
// CFG elements that are subexpressions of other CFG elements.
Expand Down Expand Up @@ -2177,13 +2249,28 @@ namespace {
StmtSet MemoryCheckedStmts;
StmtSet BoundsCheckedStmts;
IdentifyChecked(Body, MemoryCheckedStmts, BoundsCheckedStmts, CheckedScopeSpecifier::CSS_Unchecked);

// Run the bounds widening analysis on this function.
BoundsAnalysis BA = getBoundsAnalyzer();
BA.WidenBounds(FD);
if (S.getLangOpts().DumpWidenedBounds)
BA.DumpWidenedBounds(FD);

PostOrderCFGView POView = PostOrderCFGView(Cfg);
ResetFacts();
for (const CFGBlock *Block : POView) {
AFA.GetFacts(Facts);
CheckingState BlockState = GetIncomingBlockState(Block, BlockStates);
// TODO: update BlockState.ObservedBounds to reflect the widened bounds
// for the block.

// Get the widened bounds for the current block as computed by the
// bounds widening analysis invoked by WidenBounds above.
BoundsMapTy WidenedBounds = BA.GetWidenedBounds(Block);
// Also get the bounds killed (if any) by each statement in the current
// block.
StmtDeclSetTy KilledBounds = BA.GetKilledBounds(Block);
// Update the Observed bounds with the widened bounds calculated above.
UpdateCtxWithWidenedBounds(WidenedBounds, BlockState.ObservedBounds);

for (CFGElement Elem : *Block) {
if (Elem.getKind() == CFGElement::Statement) {
CFGStmt CS = Elem.castAs<CFGStmt>();
Expand Down Expand Up @@ -2218,8 +2305,11 @@ namespace {
// bounds for each variable v that is in scope are the widened
// bounds for v (if any), or the declared bounds for v (if any).
GetDeclaredBounds(this->S, BlockState.ObservedBounds, S);
// TODO: update BlockState.ObservedBounds to reset any widened
// bounds that are killed by S to the declared variable bounds.

// If any bounds are killed by statement S, reset their bounds
// to their declared bounds.
ResetKilledBounds(KilledBounds, S, BlockState.ObservedBounds);
Copy link
Contributor

@kkjeer kkjeer May 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be done before saving InitialObservedBounds = BlockState.ObservedBounds. InitialBounds should contain the contents of ObservedBounds before calling Check(S, CSS, BlockState) in order for the observed bounds to be reset after the Check call.

For example:

void pointer_widening() {
  nt_array_ptr<char> p : count(0) = "a";

  if (p[0]) {
    // This assignment kills the widened bounds of p.
    // InitialObservedBounds should contain p => bounds(p, p + 0),
    // not p => bounds(p, p + 0 + 1).
    // After checking this assignment, BlockState.ObservedBounds should contain
    // p => bounds(p, p + 0).
    p = "";
    if (p[1]) // observed bounds of p are (p, p + 0), so this is an out-of-bounds memory access
  }
}


BoundsContextTy InitialObservedBounds = BlockState.ObservedBounds;
BlockState.G.clear();

Expand Down Expand Up @@ -5191,13 +5281,6 @@ void Sema::CheckFunctionBodyBoundsDecls(FunctionDecl *FD, Stmt *Body) {
Checker.Check(Body, CheckedScopeSpecifier::CSS_Unchecked);
}

if (Cfg != nullptr) {
BoundsAnalysis BA = Checker.getBoundsAnalyzer();
BA.WidenBounds(FD);
if (getLangOpts().DumpWidenedBounds)
BA.DumpWidenedBounds(FD);
}

#if TRACE_CFG
llvm::outs() << "Done " << FD->getName() << "\n";
#endif
Expand Down
20 changes: 20 additions & 0 deletions clang/test/CheckedC/inferred-bounds/widened-bounds-check.c
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
// Tests for using bounds widening to control "out-of-bounds access" errors.
//
// RUN: %clang_cc1 -verify -verify-ignore-unexpected=note -fcheckedc-extension %s

#include <limits.h>

void f1() {
_Nt_array_ptr<char> p : bounds(p, p) = "";

if (*p)
if (*(p + 1))
if (*(p + 3)) // expected-error {{out-of-bounds memory access}}
{}

if (*p) {
p++;
if (*(p + 1)) // expected-error {{out-of-bounds memory access}}
{}
}
}
46 changes: 23 additions & 23 deletions clang/test/CheckedC/inferred-bounds/widened-bounds.c
Original file line number Diff line number Diff line change
Expand Up @@ -21,8 +21,8 @@ void f2() {
_Nt_array_ptr<char> p : count(0) = "ab";

if (*p)
if (*(p + 1)) // expected-error {{out-of-bounds memory access}}
if (*(p + 2)) // expected-error {{out-of-bounds memory access}}
if (*(p + 1))
if (*(p + 2))
{}

// CHECK: In function: f2
Expand Down Expand Up @@ -84,8 +84,8 @@ void f5() {
char p _Nt_checked[] : count(0) = "abc";

if (p[0])
if (p[1]) // expected-error {{out-of-bounds memory access}}
if (p[2]) // expected-error {{out-of-bounds memory access}}
if (p[1])
if (p[2])
{}

// CHECK: In function: f5
Expand Down Expand Up @@ -149,7 +149,7 @@ void f8() {
if (*p)
if (*(p + 1))
if (*(p + 2))
if (*(p + 3)) // expected-error {{out-of-bounds memory access}}
if (*(p + 3))
{}

// CHECK: In function: f8
Expand Down Expand Up @@ -267,8 +267,8 @@ void f13() {

if (p[0])
if (1[p])
if (p[2]) // expected-error {{out-of-bounds memory access}}
if (3[p]) // expected-error {{out-of-bounds memory access}}
if (p[2])
if (3[p])
{}

// CHECK: In function: f13
Expand Down Expand Up @@ -319,7 +319,7 @@ void f15(int i) {

_Nt_array_ptr<char> q : count(0) = "a";
if (*q)
if (*(q - 1)) // expected-error {{out-of-bounds memory access}}
if (*(q - 1))
{}

// CHECK: [B7]
Expand Down Expand Up @@ -355,7 +355,7 @@ void f16(_Nt_array_ptr<char> p : bounds(p, p)) {
_Nt_array_ptr<char> r : bounds(p, p + 1) = "a";

if (*(p))
if (*(p + 1)) // expected-error {{out-of-bounds memory access}}
if (*(p + 1))
{}

// CHECK: In function: f16
Expand Down Expand Up @@ -447,9 +447,9 @@ void f19() {
_Nt_array_ptr<char> p : count(0) = "a";

if (*p)
if (*(p + 1)) // expected-error {{out-of-bounds memory access}}
if (*(p + 3)) // expected-error {{out-of-bounds memory access}}
if (*(p + 2)) // expected-error {{out-of-bounds memory access}}
if (*(p + 1))
if (*(p + 3))
if (*(p + 2))
{}

// CHECK: In function: f19
Expand Down Expand Up @@ -540,8 +540,8 @@ void f21() {
char p _Nt_checked[] : count(0) = "abc";

while (p[0])
while (p[1]) // expected-error {{out-of-bounds memory access}}
while (p[2]) // expected-error {{out-of-bounds memory access}}
while (p[1])
while (p[2])
{}

// CHECK: In function: f21
Expand All @@ -565,8 +565,8 @@ void f22() {
_Nt_array_ptr<char> p : count(0) = "a";

if (*p)
while (*(p + 1)) // expected-error {{out-of-bounds memory access}}
if (*(p + 2)) // expected-error {{out-of-bounds memory access}}
while (*(p + 1))
if (*(p + 2))
{}

// CHECK: In function: f22
Expand Down Expand Up @@ -615,7 +615,7 @@ B: p;

while (*p) {
p;
while (*(p + 1)) { // expected-error {{out-of-bounds memory access}}
while (*(p + 1)) {
C: p;
}
}
Expand Down Expand Up @@ -674,9 +674,9 @@ void f25() {

for (; *p; ) {
i = 0;
for (; *(p + 1); ) { // expected-error {{out-of-bounds memory access}}
for (; *(p + 1); ) {
i = 1;
for (; *(p + 2); ) { // expected-error {{out-of-bounds memory access}}
for (; *(p + 2); ) {
i = 2;
}
}
Expand Down Expand Up @@ -874,7 +874,7 @@ void f28() {

switch (*p) {
case 1:
switch (*(p + 1)) { // expected-error {{out-of-bounds memory access}}
switch (*(p + 1)) {
case 2: break;
}
// CHECK: [B8]
Expand Down Expand Up @@ -917,17 +917,17 @@ void f29() {
// CHECK: case 'a':
// CHECK: upper_bound(p) = 1

if (*(p + 1)) { // expected-error {{out-of-bounds memory access}}
if (*(p + 1)) {
i = 0;
// CHECK: 1: i = 0
// CHECK: upper_bound(p) = 2

for (;*(p + 2);) { // expected-error {{out-of-bounds memory access}}
for (;*(p + 2);) {
i = 1;
// CHECK: 1: i = 1
// CHECK: upper_bound(p) = 3

while (*(p + 3)) { // expected-error {{out-of-bounds memory access}}
while (*(p + 3)) {
i = 2;
// CHECK: 1: i = 2
// CHECK: upper_bound(p) = 4
Expand Down