-
Notifications
You must be signed in to change notification settings - Fork 79
Explain bounds proof failure caused by free variables in inferred/declared bounds #903
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
* Simplify logic and improve comments
…der declared bounds as well
clang/lib/Sema/SemaBounds.cpp
Outdated
@@ -1699,7 +1993,14 @@ namespace { | |||
llvm::outs() << "\nSource range:"; | |||
SrcRange.Dump(llvm::outs()); | |||
#endif | |||
ProofResult R = SrcRange.InRange(DeclaredRange, Cause, EquivExprs, Facts); | |||
// For static bounds cast, we do not check free variables. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is somewhat confusing since it's not clear that checking static bounds cast will always pass a nullptr EquivExprs parameter. I suggest moving this logic into the InRangeWithFreeVars
function, so that in InRangeWithFreeVars
, if EquivExprs
is null, simply return InRange
without checking any free variables.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the work. I have another suggestion in addition to @kkjeer's suggestions. This could be done as follow-up work.
@@ -64,7 +66,9 @@ _Array_ptr<int> f14(unsigned num) { | |||
} | |||
|
|||
_Array_ptr<int> f15(unsigned num1, unsigned num2){ | |||
_Array_ptr<int> p : byte_count(num1) = test_f13(num2); // expected-warning {{cannot prove declared bounds for 'p' are valid after initialization}} \ | |||
_Array_ptr<int> p : byte_count(num1) = test_f13(num2); // expected-error {{it is not possible to prove that the inferred bounds of 'p' imply the declared bounds of 'p' after initialization}} \ | |||
// expected-note {{the declared upper bounds use the value of the variable 'num1', and there is no relational information between 'num1' and any of the variables used by the inferred upper bounds}} \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The notes about the bounds should be displayed before the note about lack of relational information. This will be confusing for a programmer to follow. This can be addressed in follow-up work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang/lib/Sema/SemaBounds.cpp
Outdated
// the source bounds. | ||
HasFreeVariables = 0x200, // Source or destination has free variables. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: change this value to 0x100
and remove trailing comma.
… for the declared bounds (#913)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The algorithm seems to have subtle complications that I didn't realize when we first discussed the idea. The basic idea is just to look at sets of variables, but there are subtleties involving expressions using those variables. Expressions that use those variables may have relational information between each other and may appear in equivalent expression sets. This relational information is being lost when we boil things down to just sets of variables.
@kkjeer and @dopelsunce, it may be worth discussing next steps before making changes. I think this may address some of the error messages that we're seeing be generated now.
clang/lib/Sema/SemaBounds.cpp
Outdated
FreeVariablePosition Pos2, | ||
EquivExprSets *EquivExprs, | ||
FreeVariableListTy &FreeVars) { | ||
// If E1 or E2 accesses memory via poiter, we skip because we cannot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo in comment: poit->pointer.
clang/lib/Sema/SemaBounds.cpp
Outdated
@@ -1172,6 +1270,43 @@ namespace { | |||
return ProofResult::Maybe; | |||
} | |||
|
|||
// InRangeWithFreeVars is the same as InRange, but gathers free variables |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this function needs a better comment. This function is an extension of InRange. It tries to prove that R is within this range. If R is provably not within this range, it sets Cause to explain reasons why. This function takes into account variables that are free in one of the lower/upper bounds expressions that are being compared.
InRange compares the lower bounds from the two ranges and then the upper bounds from the two ragnes. Given a pair of lower (or upper) bounds expressions, a variable is free if it appears in one expression but does not appear in the other expression, and there is no known relation between that variable and variables in the other expression. In that situation, it is impossible to prove that the comparison is true, given the facts that the compiler has.
clang/lib/Sema/SemaBounds.cpp
Outdated
EquivExprSets *EquivExprs, | ||
std::pair<ComparisonSet, ComparisonSet> &Facts, | ||
FreeVariableListTy &FreeVariables) { | ||
// If we do not have equivalence relational information (EquivExprs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd reword this comment:
If there is no relational information at all, then the compiler
didn't attempt to gather any. To avoid confusing
programmers, don't try to take free variables into account.
There may be some simple fact the compiler doesn't know that would cause
confusion about why the compiler is claiming that no fact is known.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kkjeer, could you update this comment? Thanks.
@@ -426,7 +426,7 @@ void f42(void) { | |||
// CHECK: | `-DeclRefExpr {{0x[0-9a-f]+}} '_Array_ptr<int>' lvalue Var {{0x[0-9a-f]+}} 'p' '_Array_ptr<int>' | |||
// CHECK: `-IntegerLiteral {{0x[0-9a-f]+}} 'int' 1 | |||
|
|||
_Array_ptr<int> r : count(1) = &p[0]; | |||
_Array_ptr<int> r : count(1) = &p[0]; // expected-error {{it is not possible to prove that the inferred bounds of 'r' imply the declared bounds of 'r' after initialization}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems unexpected to me that we can no longer prove this. We seem to be missing the fact that r equals p.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The set EquivExprs
contains { (_Array_ptr<int>)(&p[0]), r }
. Without the changes in this PR, there would be a warning: "cannot prove declared bounds for 'r' are valid after initialization" since the bases p
and r
of the inferred bounds (p, p + 1)
and declared bounds (r, r + 1)
are not equal.
clang/lib/Sema/SemaBounds.cpp
Outdated
return HasFreeVariables; | ||
} | ||
|
||
// AddFreeVariables maps each free variable in SrcVars w.r.t. DstVars |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment isn't very clear. Given two sets of variables SrcVars and DstVars, for each variable v in SrcVars, it determines whether v occurs in DstVars. There are also some side conditions about v with respect to EquivExprs.
clang/lib/Sema/SemaBounds.cpp
Outdated
return false; | ||
|
||
bool HasFreeVariables = false; | ||
EqualExprTy Vars1 = CollectVariableSet(S, E1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This algorithm isn't correct because of its handling of equivalent expression sets. With equivalent expression sets, you may have two expressions that are known to be equal, such as "a.f == b.f", even though there is no equality of the variables used in the expressions. We don't know that a == b, for example.
We may need to generalize the notion of a variable v having relational information to another variable y. We need to check that there no expressions constructed from v or y that have relational information either.
clang/lib/Sema/SemaBounds.cpp
Outdated
|
||
// EquivVars holds sets of DeclRefExpr and IntegerLiteral filtered from | ||
// EquivExprs. | ||
EquivExprSets EquivVars; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An implication of my comment is that we shouldn't be filtering EquivVars. We could be removing expressions that have equality relationships and that use variables.
clang/lib/Sema/SemaBounds.cpp
Outdated
// EquivExprs that contains both V and a constant (i.e. an integer-typed | ||
// value); and | ||
// 2. for each variable U in DstVars, there is not set in EquivExprs | ||
// that contains both V and U. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think condition 2 needs to be amended to work in terms of uses of variables. There is not a set in EquivExprs that contains two different expressions, where one expression uses V and the other expression uses U.
def note_free_variable_decl_or_inferred : Note< | ||
"the %select{declared|inferred}0 %select{lower |upper |}1bounds use the value " | ||
"of the variable '%2', and there is no relational information between '%2' " | ||
"and any of the variables used by the %select{inferred|declared}0 " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may need to be reworded. There is no relational information involving v and any of the expressions used by the other bounds. Perhaps something along the lines of
"the %select{declared|inferred}0 %slect{lower | upper |}bound use the variable %2' and there is no relational information involving '%2' and any of expressions used by the %select{inferred|declared}0 "
* Change IsEqualToConstant to search for an rvalue cast of a variable * Add FindVarRelationship method to search for an indirect relationship between variables * Remove EquivVars set that filters out everything except DeclRefExprs and IntegerLiterals * Remove expected free variables errors where there are indirect variable relationships * Update note messages to mention lack of relational information between a free variable and expressions * Remove comma from expected notes * Fix comments * Make free variable-related methods non-static * Move top-level free variable method CheckFreeVarInExprs above other free variable methods * Fix comment typo * Update CheckFreeVarsInExprs comment Co-authored-by: Katherine Kjeer <[email protected]>
This test was originally in the checkedc/tests/typechecking directory but was moved to checkedc-clang since it tests compiler implementation details (diagnostic messages).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a few suggestions around comments that I'd like to see resolved. Also, there is a merge conflict according to GitHub. Once both of these are done, I'll approve the merge.
clang/lib/Sema/SemaBounds.cpp
Outdated
EquivExprSets *EquivExprs, | ||
std::pair<ComparisonSet, ComparisonSet> &Facts, | ||
FreeVariableListTy &FreeVariables) { | ||
// If we do not have equivalence relational information (EquivExprs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kkjeer, could you update this comment? Thanks.
clang/lib/Sema/SemaBounds.cpp
Outdated
@@ -1172,6 +1270,43 @@ namespace { | |||
return ProofResult::Maybe; | |||
} | |||
|
|||
// InRangeWithFreeVars is the same as InRange, but gathers free variables |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kkjeer, could you update this comment?
const EqualExprTy &GetVariableList() const { return VariableList; } | ||
|
||
bool VisitDeclRefExpr(DeclRefExpr *E) { | ||
if (!EqualExprsContainsExpr(SemaRef, VariableList, E, nullptr)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kkjeer, is my comment still accurate? If so, could you leave a comment in the code about the algorithm being quadratic. I don't like to leave quadratic algorithms around in a compiler without calling them out.
…into non-equiv-exprs-in-bounds # Conflicts: # clang/lib/Sema/SemaBounds.cpp
We don't think it is important to the test, and since checkedc#903, it is causing a compiler error (which was previously a warning).
We don't think it is important to the test, and since checkedc#903, it is causing a compiler error (which was previously a warning).
This PR enables the compiler to emit an error when a free variable (defined below) is found in the declared/inferred bounds on bounds assignments, and passing arguments. A note is printed for each free variable found.
Free variables w.r.t. expressions
Assume the helper function
VarSet(E)
returns the set of variables in expressionE
.Given two expressions
SrcExpr
andDstExpr
, a set of equivalent expression setsEQ
. A variableV
inVarSet(SrcExpr)
is free if:V
does not equal to some constant)V
is not an integer type, or there is no set inEQ
that contains a constant andV
; and,V
does not have an alias inVarSet(DstExpr)
) For eachU
inVarSet(DstExpr)
, there is not set inEQ
that contains bothV
andU
.Free variables w.r.t. declared/inferred bounds
We extend the above definition for free variables with respect to the lower (or upper bounds) of the declared bounds (or inferred bounds). In other words, a variable used by the declared lower bounds may be free even if it is used in the inferred upper bounds.
Affected cases
Free variable is detected in the following cases:
Note: we do not check free variables in static bounds cast.
Note for each free variable
We use the template: "note: the declared/inferred bounds use the value of the variable 'q', and there is no relational information between 'q' and any of the variables used by the inferred/declared bounds".