Skip to content

Parsing and typechecking for new null-terminated array types. #394

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Oct 10, 2017

Conversation

dtarditi
Copy link
Member

This change adds parsing and type checking for two new kinds of Checked C types: null-terminated arrays and pointers to null-terminated arrays. Null-terminated arrays are written using the nt_checked keyword in place of the checked keyword. An example declaration is int arr nt_checked[10];. Pointers to null-terminated arrays are constructed using the nt_array_ptr keyword instead of the array_ptr keyword (the backward-compatible names for the keywords are _Nt_checked and _Nt_array_ptr).

To represent nt_array_ptrs in the IR, the existing enumeration for the different kinds of pointers is extended. For nt_checked arrays, the boolean indicating whether or not an array is checked is changed to an enumeration. The parsing changes are straightforward. The new keywords are recognized at the points where the existing Checked C keywords are recognized. They are then mapped to the appropriate enumeration tag.

For type checking, the profile methods for type object memorization/construction are changed to take into account the enumeration changes. We also add checks that null-terminated arrays and pointers are only constructed from integer or pointer types (the assumption for now is that the integer 0 will be the null terminator, so only data for which 0 is a valid value can be null-terminated). We do not allow nt_array_ptrs of void (the problem is that conversions through void could allow the null terminator to be partially overwritten when types larger than a character are involved).

With these rules we can construct interesting data structures, such as arrays of null-terminated pointers (i.e. arrays of strings), arrays of null-terminated pointers to null-terminated pointers (and so on), and arrays of null-terminated arrays (arrays of pre-allocated strings of fixed sizes).

Most of the typechecking changes are extending the implicit conversion logic in SemaExpr.cpp to implement the implicit conversion rules for nt_array_ptr. An nt_array_ptr can be converted implicitly to an array_ptr or ptr (bounds checking and checking of bounds declarations will prevent the null terminator from being overwritten). However, implicit conversions in the other direction are not allowed. An unchecked pointer, array_ptr, or ptr cannot be converted implicitly to an array_ptr. The data in memory may not be null-terminated or, if it is, there may be an alias that allows the terminator to be overwritten.

For conditional expressions (e1 ? e2 : e3), the implicit conversion rules are a bit more complex. If the type of one arm of the conditional expression is an nt_array_ptr and the type of the other arm is an unchecked pointer or array_ptr, the resulting type is an array_ptr. Bounds checking will prevent the null terminator from being overwritten.

For pointers to arrays, we allow an unchecked array to be converted to a checked array, but not vice versa. We do not allow an unchecked array to be converted to an nt_array.

This change contains code for one additional issue: we now disallow array_ptrs of function types (checkedc/checkedc#34). This is because functions are indeterminate in size, so no bounds checking is possible. We allow array_ptrs of ptrs to function types.

Testing:

  • Extended the existing Checked C typechecking tests to cover nt_array_ptrs and checked arrays. Most of this work for this change went into extending those tests. These tests are pretty comprehensive and cover many situations where types are checked. This will we be covered by a separate pull request for the Checked C repo.
  • Passes existing automated testing for Linux and Windows, including the LNT test suite and LNT tests that have been converted to Checked C.

Copy link
Collaborator

@awruef awruef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

else {
else if (Kind == CheckedArrayKind::NtChecked) {
// Do not propagate - null-terminated arrays of arrays are not
// allowed and BuildArrayType will issue an error.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does that mean this isn't reachable? Should there be an llvm_unreachable here? Or the error will happen later?

Copy link
Collaborator

@lenary lenary left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've read these changes twice and nothing is jumping out at me as wrong or particularly confusing. Thanks for getting this done, a huge change!

dtarditi added a commit to checkedc/checkedc that referenced this pull request Oct 10, 2017
#202)

This change adds tests for parsing and typechecking the new Checked C null-terminated array and pointer types.  This matches a corresponding Checked C clang compiler pull request (checkedc/checkedc-clang#394).  The changes are integrated into existing test cases for checked arrays and array_ptrs.  

The tests for parsing are straightforward.  The new keywords `nt_checked` and `nt_array_ptr` can be used in place of the `checked` and `array_ptr` keywords.  For type checking, we add tests for building the new types. We check that null-terminated arrays and pointers can only be built from integer or pointer types (the assumption is that 0 is the null-terminator value, so only types for which 0 is a valid value can be used).

We also add tests that cover implicit conversions at assignments, function calls, and conditional expressions.  Nt_array_ptrs can be converted to array_ptrs and ptrs, but not the reverse.  The problem is that array_ptrs or ptrs may not point to null-terminated data, or, if they do, there may be an alias that permits the null-terminator to be overwritten.  We also add tests that cover uses of operators with the new types.

The tests include positive cases (where no error is expected) and many more negative cases, where the compiler is expected to issue an error. To avoid unnecessary source code changes that would obscure the new tests, I avoided renumbering temporary variables. I added letter suffixes to existing variables instead.

The compiler change disallows array_ptrs to function types (#34), so we add tests of that too.  Functions have indeterminate size, so bounds checking does not make sense for them.  We allow array_ptrs to ptrs to function types.
@dtarditi dtarditi merged commit fd945a0 into checkedc:master Oct 10, 2017
@dtarditi dtarditi deleted the issue380 branch October 24, 2017 23:25
sulekhark pushed a commit that referenced this pull request Feb 27, 2021
…394)

A right parentheses appearing between a function prototype and the beginning of
its definition caused anything falling between the parenthesis and the actual
end of the prototype to be deleted. This problem is fixed by using clang
provided source locations instead of locating the closing parenthesis by
iterating of the source code characters. Fixes #392.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants