Check return value bounds by kkjeer · Pull Request #1150 · checkedc/checkedc-clang

kkjeer · 2021-08-10T18:47:59Z

This PR checks that the inferred bounds of a return value expression imply the declared return bounds (if any) for the enclosing function.

This PR does not check modifications to variables or other lvalue expressions that are used in return bounds. For example:

_Array_ptr<int> f(int size) : count(size) {
  size = 3; // modify the size variable used in the declared return bounds
  return 0;
}

At the assignment size = 3, the bounds checker will not emit any diagnostic messages even though the modified variable size is used in the declared return bounds. This can be done in a separate PR.

Test updates:

Disable two 3C tests: functionDeclEnd.c and itype_nt_arr_cast due to functions returning expressions with invalid bounds (see issue 3C test failures when checking return bounds #1147).
CheckedC/dump-dataflow-facts.c: two functions returned expressions with invalid bounds. These functions now return 0.
CheckedC/static-checking/bounds-decl-checking.c: four functions returned expressions with invalid bounds. These functions now return expressions with valid bounds.
CheckedC/static-checking/return-bounds.c: add a new test file that tests the return bounds checking behavior introduced in this PR.
checkedc/465: add expected errors to thirteen function in checkedc tests that return expressions with invalid bounds.
checkedc-llvm-test-suite/114: update one function in the LLVM test suite that returned an expression with unknown bounds.

…ons, and set the ReturnBounds member to the expanded return bounds

…sion

…ue bounds checking errors

…FAIL See checkedc-clang issue 1147. These tests should be updated to avoid the compiler errors that arise due to functions in these test files return expressions with unknown bounds. Once these test files are updated, they should no longer be marked as XFAIL.

…ing return value bounds

…d return bounds checking compiler errors

… using the set of expressions that produce the same value as RetExpr

sulekhark · 2021-08-12T13:36:39Z

+                                            FD->getReturnType(),
+                                            BoundsValueExpr::Kind::Return);
+          ReturnBounds =
+            BoundsUtil::ExpandToRange(S, ReturnVal, FD->getBoundsExpr());


The function ExpandToRange only expands count (or, byte_count) bounds expressions. Does this mean that in a range bounds expression like bounds(_Return_value, _Return_value + 2), the abstract place holder _Return_value will not be replaced by the concrete value ReturnVal?

It will be great if a positive test case (no errors or warnings) with similar return bounds as in the test case f10 in clang/test/CheckedC/static-checking/return-bounds.c is added.

ReturnVal is a _Return_value expression (a BoundsValueExpr with the Kind Return). If a function has declared bounds of count(size), ExpandToRange will return bounds(_Return_value, _Return_value + size). If a function has declared bounds of bounds(_Return_value, _Return_value + size), ExpandToRange will return bounds(_Return_value, _Return_value + size).

I've added a test case for this (function f7 with no errors or warnings) to return-bounds.c.

Ok, thanks for the clarification.

sulekhark

LGTM! Thank you!

mattmccutchen-cci · 2021-08-12T21:06:20Z

Is it intentional that this change applies to return values with bounds-safe interfaces? Consider the following example:

_Array_ptr<char> test(void) : count(1) {
  char *x;
  return x;
}

char *test_itype(void) : count(1) {
  char *x;
  return x;
}

Both functions compile without error on master. With this PR, both functions generate errors:

itype_return.c:3:10: error: return value has unknown bounds, bounds expected because the function 'test' has bounds
  return x;
  ~~~~~~~^
itype_return.c:1:18: note: (expanded) declared return bounds are 'bounds(_Return_value, _Return_value + 1)'
_Array_ptr<char> test(void) : count(1) {
                 ^
itype_return.c:8:10: error: return value has unknown bounds, bounds expected because the function 'test_itype' has bounds
  return x;
  ~~~~~~~^
itype_return.c:6:7: note: (expanded) declared return bounds are 'bounds(_Return_value, _Return_value + 1)'
char *test_itype(void) : count(1) {
      ^

For test, this makes sense: the lack of bounds checking at a function return was inconsistent with all other contexts, which IIUC is the bug being fixed in this PR. However, for test_itype, the error seems to violate this statement from section 6.3.8 of the specification:

The checking of return statements (Section 4.9) includes return statements where there is a return bounds-safe interface for the enclosing function:

In checked contexts.

In unchecked contexts, when the return statement expression has been converted implicitly to an unchecked pointer type.

Since x already has an unchecked pointer type, it does not get an implicit conversion inserted per section 6.3.7, so the return statement does not meet the condition to be included in bounds checking.

3C has been assuming that a function whose return type has a bounds-safe interface can return an arbitrary unchecked pointer, so if that design is changing, we'll need to find a way to adjust the design of 3C correspondingly. That is, in essence, why the itype_nt_arr_cast.c test is failing. In functionDeclEnd.c, bounds checking of return values is not the focus of the test, so if need be, we can probably find a way to change that test to avoid triggering the new error.

kkjeer · 2021-08-12T21:30:05Z

The behavior in this PR (regarding bounds checking unchecked pointers in unchecked contexts with bounds-safe interfaces) is intentional with respect to the latest Checked C discussion as far as I understand. @dtarditi @sulekhark should we change this behavior so it is more consistent with the 3C design?

mattmccutchen-cci · 2021-08-12T21:41:46Z

Perhaps more to the point: if I'm right that the "latest Checked C discussion" as reflected by this PR is no longer consistent with the specification, can you please update the specification or otherwise point us to where we can get the latest information?

sulekhark · 2021-08-12T23:07:20Z

Perhaps more to the point: if I'm right that the "latest Checked C discussion" as reflected by this PR is no longer consistent with the specification, can you please update the specification or otherwise point us to where we can get the latest information?

@mattmccutchen-cci
As you pointed out, Section 6.3.8 (middle of page 116)

The checking of return statements (Section 4.9) includes return statements where there is return bounds-safe interface for the enclosing function:
In checked contexts,
In unchecked contexts, when the return statement expression has been converted implicitly to an unchecked pointer type

While our current implementation is consistent with Section 4.9, I think this part of the spec (Section 6.3.8 - middle of page 116) is unclear and we need to modify the spec and add one more scenario to the above statement as shown below. This will make this part of the spec consistent with the previous part.

In unchecked contexts, when an unchecked pointer is returned

However, we will discuss this matter once again with David tomorrow, and resolve it (i.e. make the spec and the implementation consistent with each other).

mattmccutchen-cci · 2021-08-12T23:45:34Z

I guess I'm still unclear on the overall design intent of bounds-safe interfaces. Clearly, one important use case is to allow both unmodified C code and fully Checked C code to call external library functions that have bounds-safe interfaces (as in the Checked C system headers). Is it also a goal to be able to add bounds-safe interfaces to plain C functions in the user's code without having to modify those implementations right away? This is essentially what 3C is trying to do.

If the answer is indeed "no", I noticed two other parts of the spec that could use clarification. In Section 6.3.1:

A function that has a bounds-safe interface and whose body does not use checked pointer or array types is compiled as though the bounds-safe interface has been stripped from its source code.

That sentence appears in a discussion of what happens when unchecked code uses a program element with a bounds-safe interface. But read in isolation, it does seem to suggest that my test_itype example above should be allowed.

Also, an assignment like the following is already rejected by the compiler before this PR (similar to the restriction this PR is adding for return values):

char *global : count(1);

void assign_global(void) {
  char *x;
  global = x;
}

But Section 6.3.8 says:

The checking that an expression statement, declaration, or bundled block implies the declared bounds of variables includes:

In checked contexts, all variables with bounds-safe interfaces.

In unchecked contexts, all variables with bounds-safe interfaces that are modified by an assignment within the statement, declaration, or bundled block, where the right-hand side expression is converted implicitly from a checked pointer type to an unchecked pointer type.

Does another scenario for an unchecked right-hand side need to be added here as well?

mwhicks1 · 2021-08-13T19:13:03Z

We've been discussing this issue over at Correct Computation and the summary is:

We are struggling to understand the goals and principles underlying the rules for itypes, especially when used on function definitions (rather than prototypes). If we understood these principles and goals, we might better understand how to judge one design vs. an alternative.

One thing I find confusing is that itypes currently used for standard library functions could not be applied to those functions themselves, in their original C. For example, in the header string_checked.h I have

char *strtok(char * restrict s1 : itype(restrict _Nt_array_ptr<char>),
             const char * restrict s2 : itype(restrict _Nt_array_ptr<const char>)) :
  itype(_Nt_array_ptr<char>);

The standard library was compiled from C code that implements this function, e.g.,

char *strtok(char * restrict s, const char * restrict sep) {
 {
	static char *p;
	if (!s && !(s = p)) return NULL;
	s += strspn(s, sep);
	if (!*s) return p = 0;
	p = s + strcspn(s, sep);
	if (*p) *p++ = 0;
	else p = 0;
	return s;
}

I would have thought that if I can advertise in string_checked.h that this code as an itype, I could just as well have attached the itype directly to the code itself, i.e.,

char *strtok(char * restrict s : itype(restrict _Nt_array_ptr<char>),
             const char * restrict sep : itype(restrict _Nt_array_ptr<const char>)) :
  itype(_Nt_array_ptr<char>) {
  	static char *p;
	if (!s && !(s = p)) return NULL;
	s += strspn(s, sep);
	if (!*s) return p = 0;
	p = s + strcspn(s, sep);
	if (*p) *p++ = 0;
	else p = 0;
	return s;
}

It feels strange to me that this does not work, but typechecking this code with its original C type and the prototype also present with an itype does work. What principle justifies the difference in treatment?

sulekhark · 2021-08-14T01:20:39Z

As per the discussion we had with David today, we agree that if the itypes used in a function declaration are applied to the function definition, and the function definition is compiled in an unchecked scope, and the function body does not use any checked pointers, then the compiler must not issue any bounds-checking errors or warnings.

For the strtok code show below, compiling it with the master branch :

#include <string.h>
char *strtok(char * restrict s : itype(restrict _Nt_array_ptr<char>),
             const char * restrict sep : itype(restrict _Nt_array_ptr<const char>)) :
  itype(_Nt_array_ptr<char>) {
  	static char *p;
	if (!s && !(s = p)) return NULL;
	s += strspn(s, sep);
	if (!*s) return p = 0;
	p = s + strcspn(s, sep);
	if (*p) *p++ = 0;
	else p = 0;
	return s;
}

currently gives two errors (I have elided the notes):

 t.c:7:14: error: inferred bounds for 's' are unknown after assignment
        if (!s && !(s = p)) return 0;

t.c:8:2: error: inferred bounds for 's' are unknown after assignment
        s += strspn(s, sep);

Going forward, these errors will be issued when the above function is analyzed in checked scope but not when analyzed in an unchecked scope. Additionally, in unchecked scope the return bounds checking (as per this PR) on the returned pointer s in the above example will not be performed.

We will update the Checked C spec and the compiler code to be consistent with this behavior.

mwhicks1 · 2021-08-14T11:21:24Z

Thanks for the update. When the new spec is available we will reread with a mindset toward consistency/principle, to see if we have any further thoughts about itypes; if so, we'll comment on a separate thread.

mattmccutchen-cci · 2021-08-16T17:55:36Z

we agree that if the itypes used in a function declaration are applied to the function definition, and the function definition is compiled in an unchecked scope, and the function body does not use any checked pointers, then the compiler must not issue any bounds-checking errors or warnings

For the record: it looks like #1157, #1158, and #1159 have been filed for this work. Thanks!

… a bounds-safe interface, for return values that have not been implicitly converted to an unchecked pointer

sulekhark

LGTM! Thank you!

The purpose of this function was to test function declaration rewriting for itype array pointers declared with bounds but without an explicit itype. This is equally well tested when a correct bound is inferred for the parameter array. (Matt: Remove the XFAIL now that the test is fixed.) (cherry picked from commit 57d31d2)

mgrang

LGTM.

…into check-return-bounds

kakje added 16 commits August 6, 2021 15:59

Add FunctionDeclaration and ReturnVal members to CheckBoundsDeclarati…

ec74226

…ons, and set the ReturnBounds member to the expanded return bounds

Add ReturnStmt to the ProofStmtKind enum

e589681

Do not check for free variables if E1 or E2 is a _Return_value expres…

e0e2bc3

…sion

Add ProveReturnBoundsValidity method

b53da94

Add diagnostic note messages for declared and inferred return bounds

3d860c2

Add DiagnoseUnknownReturnBounds method

3055184

Add ValidateReturnBounds method

f821bf5

Validate the return value bounds in CheckReturnStmt

91a6128

Change return values in bounds-decl-checking test to avoid return val…

0d29c2b

…ue bounds checking errors

Add return-bounds.c test file containing some initial tests for check…

1197e7c

…ing return value bounds

Change 2 return values in dump-dataflow-facts.c test in order to avoi…

9d369fd

…d return bounds checking compiler errors

Add bounds-safe interface test cases to return-bounds.c

7bbc1f6

Fix the logic for recording equality between RetExpr and ReturnVal by…

9013f6d

… using the set of expressions that produce the same value as RetExpr

Add tests for bounds casts to return-bounds.c

8237d6c

Add tests for function calls to return-bounds.c

df46f91

kkjeer requested review from arbipher, dtarditi, mgrang and sulekhark August 10, 2021 18:47

This was referenced Aug 10, 2021

Check return value bounds: update tests checkedc/checkedc#465

Merged

Check return value bounds: update test microsoft/checkedc-llvm-test-suite#114

Merged

sulekhark reviewed Aug 12, 2021

View reviewed changes

Add no warnings/errors test case for expanded declared return bounds

b613afb

sulekhark approved these changes Aug 12, 2021

View reviewed changes

kakje added 4 commits August 17, 2021 13:52

Skip checking return bounds in an unchecked scope, for functions with…

f8a72a6

… a bounds-safe interface, for return values that have not been implicitly converted to an unchecked pointer

Update return bounds tests (add more tests for bounds-safe interfaces)

86ccdbf

Enable 3C/itype_nt_arr_cast.c test

489ec1a

Add checked/unchecked scope comments to 3C/functionDeclEnd.c test

7600025

kkjeer mentioned this pull request Aug 17, 2021

3C test failures when checking return bounds #1147

Closed

This was referenced Aug 17, 2021

Fix functionDeclEnd.c test that fails new return value bounds check correctcomputation/checkedc-clang#682

Closed

Change test7 in functionDeclEnd.c to infer a valid bound correctcomputation/checkedc-clang#684

Merged

sulekhark approved these changes Aug 18, 2021

View reviewed changes

sulekhark mentioned this pull request Aug 24, 2021

Clarification updates to the Checked C specification. checkedc/checkedc#469

Merged

mgrang approved these changes Aug 24, 2021

View reviewed changes

Merge branch 'master' of https://github.com/microsoft/checkedc-clang …

d4a7eab

…into check-return-bounds

kkjeer merged commit 6375a86 into master Aug 27, 2021

kkjeer deleted the check-return-bounds branch August 27, 2021 22:54

mattmccutchen-cci mentioned this pull request Sep 8, 2021

Merge from Microsoft 2021-09-08 correctcomputation/checkedc-clang#700

Merged

mattmccutchen-cci mentioned this pull request Nov 17, 2021

"Symmetric" liberal itypes: allow data flow inside or outside function to use either side of itype? correctcomputation/checkedc-clang#743

Open

This was referenced Jan 15, 2022

Check return value bounds: update tests checkedc/checkedc-fork#463

Closed

Clarification updates to the Checked C specification. checkedc/checkedc-fork#466

Closed

Check return value bounds checkedc/checkedc-llvm-project#1149

Closed

Conversation

kkjeer commented Aug 10, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

sulekhark Aug 12, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

kkjeer Aug 12, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

sulekhark Aug 12, 2021

Choose a reason for hiding this comment

Uh oh!

sulekhark left a comment

Choose a reason for hiding this comment

Uh oh!

mattmccutchen-cci commented Aug 12, 2021

Uh oh!

kkjeer commented Aug 12, 2021

Uh oh!

mattmccutchen-cci commented Aug 12, 2021

Uh oh!

sulekhark commented Aug 12, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mattmccutchen-cci commented Aug 12, 2021

Uh oh!

mwhicks1 commented Aug 13, 2021

Uh oh!

sulekhark commented Aug 14, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mwhicks1 commented Aug 14, 2021

Uh oh!

mattmccutchen-cci commented Aug 16, 2021

Uh oh!

sulekhark left a comment

Choose a reason for hiding this comment

Uh oh!

mgrang left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

kkjeer commented Aug 10, 2021 •

edited

Loading

sulekhark Aug 12, 2021 •

edited

Loading

kkjeer Aug 12, 2021 •

edited

Loading

sulekhark commented Aug 12, 2021 •

edited

Loading

sulekhark commented Aug 14, 2021 •

edited

Loading