Skip to content

Explain bounds proof failure caused by free variables in inferred/declared bounds #903

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 53 commits into from
Jan 26, 2021

Conversation

dopelsunce
Copy link
Contributor

@dopelsunce dopelsunce commented Aug 18, 2020

This PR enables the compiler to emit an error when a free variable (defined below) is found in the declared/inferred bounds on bounds assignments, and passing arguments. A note is printed for each free variable found.

Free variables w.r.t. expressions

Assume the helper function VarSet(E) returns the set of variables in expression E.

Given two expressions SrcExpr and DstExpr, a set of equivalent expression sets EQ. A variable V in VarSet(SrcExpr) is free if:

  1. (V does not equal to some constant) V is not an integer type, or there is no set in EQ that contains a constant and V; and,
  2. (V does not have an alias in VarSet(DstExpr)) For each U in VarSet(DstExpr), there is not set in EQ that contains both V and U.

Free variables w.r.t. declared/inferred bounds

We extend the above definition for free variables with respect to the lower (or upper bounds) of the declared bounds (or inferred bounds). In other words, a variable used by the declared lower bounds may be free even if it is used in the inferred upper bounds.

Affected cases

Free variable is detected in the following cases:

  1. On variable initialization or assignment, inferred bounds do not imply declared bounds. Error msg: "it is not possible to prove that the inferred bounds of p imply the declared bounds of p after ...".
  2. On passing arguments, the inferred bounds do not imply the expected argument bounds. Error msg: "it is not possible to prove argument meets declared bounds for n-th parameter"

Note: we do not check free variables in static bounds cast.

Note for each free variable

We use the template: "note: the declared/inferred bounds use the value of the variable 'q', and there is no relational information between 'q' and any of the variables used by the inferred/declared bounds".

@@ -1699,7 +1993,14 @@ namespace {
llvm::outs() << "\nSource range:";
SrcRange.Dump(llvm::outs());
#endif
ProofResult R = SrcRange.InRange(DeclaredRange, Cause, EquivExprs, Facts);
// For static bounds cast, we do not check free variables.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is somewhat confusing since it's not clear that checking static bounds cast will always pass a nullptr EquivExprs parameter. I suggest moving this logic into the InRangeWithFreeVars function, so that in InRangeWithFreeVars, if EquivExprs is null, simply return InRange without checking any free variables.

Copy link
Member

@dtarditi dtarditi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the work. I have another suggestion in addition to @kkjeer's suggestions. This could be done as follow-up work.

@@ -64,7 +66,9 @@ _Array_ptr<int> f14(unsigned num) {
}

_Array_ptr<int> f15(unsigned num1, unsigned num2){
_Array_ptr<int> p : byte_count(num1) = test_f13(num2); // expected-warning {{cannot prove declared bounds for 'p' are valid after initialization}} \
_Array_ptr<int> p : byte_count(num1) = test_f13(num2); // expected-error {{it is not possible to prove that the inferred bounds of 'p' imply the declared bounds of 'p' after initialization}} \
// expected-note {{the declared upper bounds use the value of the variable 'num1', and there is no relational information between 'num1' and any of the variables used by the inferred upper bounds}} \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The notes about the bounds should be displayed before the note about lack of relational information. This will be confusing for a programmer to follow. This can be addressed in follow-up work.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion @dtarditi. Created issue #912 to track this.

// the source bounds.
HasFreeVariables = 0x200, // Source or destination has free variables.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: change this value to 0x100 and remove trailing comma.

* Fix issue #909 and #911 by ignoring casts in EquivExprs

* Replace IgnoreCasts with IgnoreParenCasts and add test cases
Copy link
Member

@dtarditi dtarditi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The algorithm seems to have subtle complications that I didn't realize when we first discussed the idea. The basic idea is just to look at sets of variables, but there are subtleties involving expressions using those variables. Expressions that use those variables may have relational information between each other and may appear in equivalent expression sets. This relational information is being lost when we boil things down to just sets of variables.

@kkjeer and @dopelsunce, it may be worth discussing next steps before making changes. I think this may address some of the error messages that we're seeing be generated now.

FreeVariablePosition Pos2,
EquivExprSets *EquivExprs,
FreeVariableListTy &FreeVars) {
// If E1 or E2 accesses memory via poiter, we skip because we cannot
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in comment: poit->pointer.

@@ -1172,6 +1270,43 @@ namespace {
return ProofResult::Maybe;
}

// InRangeWithFreeVars is the same as InRange, but gathers free variables
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this function needs a better comment. This function is an extension of InRange. It tries to prove that R is within this range. If R is provably not within this range, it sets Cause to explain reasons why. This function takes into account variables that are free in one of the lower/upper bounds expressions that are being compared.

InRange compares the lower bounds from the two ranges and then the upper bounds from the two ragnes. Given a pair of lower (or upper) bounds expressions, a variable is free if it appears in one expression but does not appear in the other expression, and there is no known relation between that variable and variables in the other expression. In that situation, it is impossible to prove that the comparison is true, given the facts that the compiler has.

EquivExprSets *EquivExprs,
std::pair<ComparisonSet, ComparisonSet> &Facts,
FreeVariableListTy &FreeVariables) {
// If we do not have equivalence relational information (EquivExprs
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd reword this comment:

If there is no relational information at all, then the compiler 
didn't attempt to gather any.  To avoid confusing
programmers, don't try to take free variables into account.
There may be some simple fact the compiler doesn't know that would cause
confusion about why the compiler is claiming that no fact is known.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kkjeer, could you update this comment? Thanks.

@@ -426,7 +426,7 @@ void f42(void) {
// CHECK: | `-DeclRefExpr {{0x[0-9a-f]+}} '_Array_ptr<int>' lvalue Var {{0x[0-9a-f]+}} 'p' '_Array_ptr<int>'
// CHECK: `-IntegerLiteral {{0x[0-9a-f]+}} 'int' 1

_Array_ptr<int> r : count(1) = &p[0];
_Array_ptr<int> r : count(1) = &p[0]; // expected-error {{it is not possible to prove that the inferred bounds of 'r' imply the declared bounds of 'r' after initialization}}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems unexpected to me that we can no longer prove this. We seem to be missing the fact that r equals p.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The set EquivExprs contains { (_Array_ptr<int>)(&p[0]), r }. Without the changes in this PR, there would be a warning: "cannot prove declared bounds for 'r' are valid after initialization" since the bases p and r of the inferred bounds (p, p + 1) and declared bounds (r, r + 1) are not equal.

return HasFreeVariables;
}

// AddFreeVariables maps each free variable in SrcVars w.r.t. DstVars
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment isn't very clear. Given two sets of variables SrcVars and DstVars, for each variable v in SrcVars, it determines whether v occurs in DstVars. There are also some side conditions about v with respect to EquivExprs.

return false;

bool HasFreeVariables = false;
EqualExprTy Vars1 = CollectVariableSet(S, E1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This algorithm isn't correct because of its handling of equivalent expression sets. With equivalent expression sets, you may have two expressions that are known to be equal, such as "a.f == b.f", even though there is no equality of the variables used in the expressions. We don't know that a == b, for example.

We may need to generalize the notion of a variable v having relational information to another variable y. We need to check that there no expressions constructed from v or y that have relational information either.


// EquivVars holds sets of DeclRefExpr and IntegerLiteral filtered from
// EquivExprs.
EquivExprSets EquivVars;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An implication of my comment is that we shouldn't be filtering EquivVars. We could be removing expressions that have equality relationships and that use variables.

// EquivExprs that contains both V and a constant (i.e. an integer-typed
// value); and
// 2. for each variable U in DstVars, there is not set in EquivExprs
// that contains both V and U.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think condition 2 needs to be amended to work in terms of uses of variables. There is not a set in EquivExprs that contains two different expressions, where one expression uses V and the other expression uses U.

def note_free_variable_decl_or_inferred : Note<
"the %select{declared|inferred}0 %select{lower |upper |}1bounds use the value "
"of the variable '%2', and there is no relational information between '%2' "
"and any of the variables used by the %select{inferred|declared}0 "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may need to be reworded. There is no relational information involving v and any of the expressions used by the other bounds. Perhaps something along the lines of

"the %select{declared|inferred}0 %slect{lower | upper |}bound use the variable %2' and there is no relational information involving '%2' and any of expressions used by the %select{inferred|declared}0 "

kkjeer and others added 2 commits January 14, 2021 15:51
* Change IsEqualToConstant to search for an rvalue cast of a variable

* Add FindVarRelationship method to search for an indirect relationship between variables

* Remove EquivVars set that filters out everything except DeclRefExprs and IntegerLiterals

* Remove expected free variables errors where there are indirect variable relationships

* Update note messages to mention lack of relational information between a free variable and expressions

* Remove comma from expected notes

* Fix comments

* Make free variable-related methods non-static

* Move top-level free variable method CheckFreeVarInExprs above other free variable methods

* Fix comment typo

* Update CheckFreeVarsInExprs comment

Co-authored-by: Katherine Kjeer <[email protected]>
This test was originally in the checkedc/tests/typechecking directory but was moved to checkedc-clang since it tests compiler implementation details (diagnostic messages).
Copy link
Member

@dtarditi dtarditi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few suggestions around comments that I'd like to see resolved. Also, there is a merge conflict according to GitHub. Once both of these are done, I'll approve the merge.

EquivExprSets *EquivExprs,
std::pair<ComparisonSet, ComparisonSet> &Facts,
FreeVariableListTy &FreeVariables) {
// If we do not have equivalence relational information (EquivExprs
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kkjeer, could you update this comment? Thanks.

@@ -1172,6 +1270,43 @@ namespace {
return ProofResult::Maybe;
}

// InRangeWithFreeVars is the same as InRange, but gathers free variables
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kkjeer, could you update this comment?

const EqualExprTy &GetVariableList() const { return VariableList; }

bool VisitDeclRefExpr(DeclRefExpr *E) {
if (!EqualExprsContainsExpr(SemaRef, VariableList, E, nullptr)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kkjeer, is my comment still accurate? If so, could you leave a comment in the code about the algorithm being quadratic. I don't like to leave quadratic algorithms around in a compiler without calling them out.

@kkjeer kkjeer merged commit b7be752 into master Jan 26, 2021
@kkjeer kkjeer deleted the non-equiv-exprs-in-bounds branch January 26, 2021 01:02
mattmccutchen-cci added a commit to correctcomputation/checkedc-clang that referenced this pull request Feb 4, 2021
We don't think it is important to the test, and since
checkedc#903, it is causing a compiler error (which was
previously a warning).
mattmccutchen-cci added a commit to correctcomputation/checkedc-clang that referenced this pull request Feb 12, 2021
We don't think it is important to the test, and since
checkedc#903, it is causing a compiler error (which was
previously a warning).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants