Extend static bounds declaration checking#624
Conversation
Everytime in bounds declaration checking we encounter a comparison between zero and an unsigned integer, we can use the fact that 0 <= any unsigned integer expression. In this commit, the representation of ranges has been extended. Instead of ConstantSizedRange, a new representation is introduced: BaseRange. InRange() checks whether the range is constant-sized or variable-sized, and compare lower/upper offsets based on that. CreateBaseRange() is responsible for setting the correct RangeType to a range.
dtarditi
left a comment
There was a problem hiding this comment.
Thanks for making this change. I think with variable-sized ranges, there are additional cases that you need to check. I think the code would benefit from checking upper/lower bounds independently and then combining the results. This would handle the additional cases and make the code simpler.
| // const2 as signed (APSInt) integers. They must have the same bitsize. A | ||
| // variable-sized range is a range that has the form (e1 + e2, e1 + e3), where | ||
| // e1, e2, and e3 are all expressions. | ||
| class BaseRange { |
There was a problem hiding this comment.
I think ranges are more complex than this comment indicates. A variable sized range has the form (e1 + e2, e1 + e3), where one or both of e2 and e3 are non-constant expressions.
I think it will be useful to state the invariant about the representation:
- If a range is constant sized, the LowerOffsetExpr and UpperOffsetExpr should be null.
- For a variable sized range, you'll need to think about to indicate whether to use the constant offset or the expression.
In theory, you could just check whether the expression is constant or not. We don't want to do this in practice because that's expensive to determine. So we want to compute this when handed the expression.
| } | ||
|
|
||
| bool IsConstantSizedRange() { | ||
| return BaseRangeType == RangeType::ConstantSized && !UpperOffsetExpr && !LowerOffsetExpr; |
There was a problem hiding this comment.
I think the check that UpperOffsetExpr and LowerOffsetExpr are null should be an assert. This is part of the invariant about the representation of ranges.
| bool IsConstantSizedRange() { | ||
| return BaseRangeType == RangeType::ConstantSized && !UpperOffsetExpr && !LowerOffsetExpr; | ||
| } | ||
|
|
There was a problem hiding this comment.
I would recommend adding a matching method IsVariableSizedRange().
|
|
||
| // Is R in range of this range? | ||
| ProofResult InRange(ConstantSizedRange &R, ProofFailure &Cause, EquivExprSets *EquivExprs) { | ||
| ProofResult InRange(BaseRange &R, ProofFailure &Cause, EquivExprSets *EquivExprs) { |
There was a problem hiding this comment.
We should probably change that comment at line 1754 for clarity: Is R a subrange of this range?
| Result = ProofResult::False; | ||
| } | ||
| } else { // at least one of the ranges is not constant-sized | ||
| Result = ProofResult::Maybe; |
There was a problem hiding this comment.
I think the logic needs to be reworked to check the lower and upper bound separately. For each bound, you can them come to the conclusion that:
- this range's lower bound <= R's lower bound (true, false, or maybe).
- R's upper bounds <= his range's upper bound (true, false, or maybe).
If 1 and 2 are true, the result is true.
If 1 or 2 are false, the result is false.
Otherwise the result is maybe.
For each bound involved in a ej <= ek, comparison, you need to handle the cases where
- ej and ek are constant
- ej is 0 and ek is an unsigned integer expression.
If you follow this approach, there is no need to handle variable vs. constant-sized ranges separately.
|
|
||
| // Does R partially overlap this range? | ||
| ProofResult PartialOverlap(ConstantSizedRange &R) { | ||
| ProofResult PartialOverlap(BaseRange &R) { |
There was a problem hiding this comment.
I think this could be generalized.
There was a problem hiding this comment.
I addressed all the above comments (except this one) in my newest commit.
Can you please elaborate more on this comment?
dtarditi
left a comment
There was a problem hiding this comment.
I have additional suggestions for the code.
| // If both ranges are constant-sized, the constant integer values of the offsets are compared. | ||
| // If at least one of the ranges is variable-sized, then in some special cases | ||
| // we can prove the lower bound is correct. | ||
| if (IsConstantSizedRange() && R.IsConstantSizedRange()) { |
There was a problem hiding this comment.
Instead of checking that the ranges are constant sized, how about checking that both lower bounds are constant? Then you can eliminate the last case at lines 1794 - 1796.
After reading this code, I think having fields named LowerOffsetExpr and LowerOffset (as well UpperOffsetExpr and UpperOffset) are confusing. The Expr suffix doesn't mean much. LowerOffsetVariable and LowerOffsetConstant instead. It is also worth introducing predicate functions IsLowerOffsetConstant and IsLowerOffsetVariable, as that makes the code easier to read.
| // They must have the same bitsize. | ||
| // In this case, UpperOffsetExpr and LowerOffsetExpr should be both null. | ||
| // - If one or both of e2 and e3 are non-constant expressions, the range is Variable-sized. | ||
| // TODO: elaborate more. |
There was a problem hiding this comment.
Please state the invariants for the representation.
There was a problem hiding this comment.
@dtarditi For
I will add the following lines to the comment:
- for constant-sized range:
More specifically: (UpperOffsetExpr == nullptr && LowerOffsetExpr == nullptr && BaseRangeKind == Kind::ConstantSized) - for variable-sized range:
More specifically: ((UpperOffsetExpr != nullptr || LowerOffsetExpr != nullptr) && BaseRangeKind == Kind::VariableSized)
Is that enough?
| } | ||
| if (UpperOffset < R.UpperOffset) { | ||
| } else if (LowerOffsetExpr && R.LowerOffsetExpr && | ||
| !EqualValue(S.Context, LowerOffsetExpr, R.LowerOffsetExpr, EquivExprs)) |
There was a problem hiding this comment.
What about the case when the offset expressions are equal?
| // integer that will fit pointer width. | ||
| void SplitIntoBaseAndOffset(Expr *E, Expr *&Base, | ||
| llvm::APSInt &Offset) { | ||
| llvm::APSInt &Offset, Expr *&OffsetExpr, |
There was a problem hiding this comment.
Instead of adding the extra parameter OnlyConstant, I suggest always letting this function always split expressions if possible and checking the results returned by the function. I think you will need to change the function to return a value indicating what it did.
- remove RangeType field from BaseRange - rename fields to better reflect their purpose - introduce predicate functions for lower and upper bound kinds
|
@dtarditi I have made some new changes. The code is ready to be reviewed. |
dtarditi
left a comment
There was a problem hiding this comment.
Looks very good! I have a few more suggested changes.
| R->SetLower(LowerOffset); | ||
| R->SetUpper(UpperOffset); | ||
| return true; | ||
| if (LowerOffsetKind == BaseRange::Kind::ConstantSized && UpperOffsetKind == BaseRange::Kind::ConstantSized) { |
There was a problem hiding this comment.
It is OK for one of the upper bounds or lower bounds to be constant and the other to be variable. I think you should set the lower/upper bounds independently of each other:
if (LowerOffsetKind == BaseRange::Kind::ConstantSized) {
…
} else if (LowerOffsetKind == BaseRange::Kind::VariableSize) {
…
} else
return BaseRange::KInd::Invalid
| if (UpperOffsetConstant.getExtValue() == 0 && | ||
| R.UpperOffsetConstant.getExtValue() == 0) | ||
| return ProofResult::True; | ||
| if (SignsMatch(UpperOffsetConstant, R.UpperOffsetConstant) && |
There was a problem hiding this comment.
I don't think that you need to check that the signs match for the integer constants. We always represent integers using signed integers.
|
@dtarditi Thank you very much for your comments. New changes are ready for review. |
dtarditi
left a comment
There was a problem hiding this comment.
I have some additional suggestions.
| // R.LowerOffset is within this range, but R.UpperOffset is above the range | ||
| if (LowerOffset <= R.LowerOffset && R.LowerOffset < UpperOffset && | ||
| UpperOffset < R.UpperOffset) | ||
| if (LowerOffsetConstant <= R.LowerOffsetConstant && R.LowerOffsetConstant < UpperOffsetConstant && |
There was a problem hiding this comment.
I think you need to check that the two ranges are constant-sized here.
| // Convert an expression E into a base and offset form. | ||
| // - If E is pointer arithmetic involving addition or subtraction of a | ||
| // constant integer, return the base and offset. | ||
| // - If E is a pointer arithmetic involving addition or subtraction of |
There was a problem hiding this comment.
This comment needs to be updated. OnlyConstant is gone now.
| return p; | ||
| } | ||
|
|
||
| _Array_ptr<int> test_f70(int c) : byte_count(c); |
There was a problem hiding this comment.
Could you add some tests for _Nt_array_ptr too?
There was a problem hiding this comment.
@dtarditi Yes, I added some tests for _Nt_array_ptr and will push the changes shortly. There are more tests (for both nt_array_ptr and array_ptr) in CheckedC repo.
|
@dtarditi Thanks for the suggestions. I pushed the new changes to the branch. |
|
@dtarditi I improved the comments of various functions. I hope they are precise enough! |
…rectory. (#624) * Follow new guideline: Avoid `cd DIR && FOO` in RUN lines. Instead, put `FOO` on a separate run line. `cd` affects later RUN lines, and `cd DIR && FOO` could mislead the reader that it doesn't and that's why we needed to put both parts on one line. I added the proposed guideline to #346. I'm changing the existing tests before changing more tests to use `cd` as part of this PR. * Add `-o /dev/null` to clang commands that use `-` as input. * Add -working-directory to clang commands with named input files. * `cd` to a temporary directory for `3c -dump-stats` commands.
I have not added the tests yet.