Replace current_expr_value with expression temporaries. #560

Closed
@secure-sw-dev-bot

This issue was copied from checkedc/checkedc-clang#561


This change replaces current_expr_value in the Checked C clang IR with expression temporaries. An expression temporary is a temporary variable that holds the result of computing a subexpression of an expression. We use expression temporaries to compute bounds for string literals and compound array literals. The bounds are used for static and dynamic checking.

The Checked C specification uses current_expr_value in its description of bounds inference. As a result, when a subexpression's value appears in the bounds of an enclosing expression, each bounds inference step has to adjust current_expr_value to undo the effect of the enclosing expression on that value. For example, if the bounds of e1 are bounds(current_expr_value, current_expr_value + 5), then the bounds of e1 + e2 require subtracting the value of e2. The bounds of the parent expression are bounds(current_expr_value - e2, current_expr_value - e2 + 5). If e2 has side effects, it is not possible to recompute the value of e2. By using expression temporaries, we avoid these complications.
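The problem with recomputing e2 can be illustrated with a small, self-contained sketch. It is not compiler code: addresses are modelled as plain integers, and `Bounds`, `makeBounds`, `nextOffset`, `boundsByRecompute`, and `boundsByTemporary` are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>

// Hypothetical model: addresses are plain integers so the arithmetic is easy to follow.
struct Bounds { long lower; long upper; };

Bounds makeBounds(long lower, long upper) {
    assert(lower <= upper);
    return Bounds{lower, upper};
}

int calls = 0;

// Hypothetical e2 with a side effect: every evaluation returns a new value.
long nextOffset() { return ++calls; }

// Style based on current_expr_value: the bounds of e1 + e2 are expressed by
// subtracting e2 again, which evaluates e2 a second time.
Bounds boundsByRecompute(long e1) {
    long value = e1 + nextOffset();   // value of e1 + e2
    long again = nextOffset();        // second evaluation of e2 differs
    return makeBounds(value - again, value - again + 5);
}

// Style based on an expression temporary: bind the value of e1 once and
// express the bounds in terms of the temporary, so e2 is never re-evaluated.
Bounds boundsByTemporary(long e1) {
    long tmp = e1;                    // expression temporary for e1
    long value = tmp + nextOffset();  // value of e1 + e2, evaluated once
    (void)value;
    return makeBounds(tmp, tmp + 5);
}
```

With e1 at address 100 and `calls` starting at 0, `boundsByRecompute` yields bounds(99, 104), which no longer describes e1, while `boundsByTemporary` yields the intended bounds(100, 105).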

The clang AST has several existing forms of temporaries: CXXBindTemporaryExpr, MaterializeTemporaryExpr, and OpaqueValueExpr. The first two are specialized for C++, and the third is only used for temporaries that are "locally obvious". We don't generalize or refactor the existing classes because we would likely break something or make future merges from clang much more difficult.

Instead, we create yet another class for temporaries called CHKCBindTemporaryExpr, modelled after CXXBindTemporaryExpr. CXXBindTemporaryExpr is specialized for inserting destructor calls. The class CHKCBindTemporaryExpr binds the result of a subexpression to a temporary variable; we use objects of type CHKCBindTemporaryExpr to represent the temporary. The binding class is paired with a class for using the value of an expression temporary. We use BoundsValueExpr for the use case.

We insert expression temporaries for string literals and compound array literals at the conversion of the array type to a pointer type (array-to-pointer decay, in clang terminology). During bounds inference, we look for the pattern of a binding of an expression temporary whose subexpression is a possibly parenthesized literal, and use the temporary to construct the bounds.
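The pattern match described above can be sketched on a toy AST. This is an illustration under stated assumptions, not the clang implementation: `Kind`, `Expr`, `InferredBounds`, `ignoreParens`, and `inferLiteralBounds` are hypothetical, and the element count is simply stored on the literal node rather than derived from a type.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <optional>

// Hypothetical node kinds; the real AST has many more.
enum class Kind { StringLiteral, Paren, BindTemporary };

struct Expr {
    Kind kind;
    std::size_t elements;       // element count, meaningful only for StringLiteral
    std::shared_ptr<Expr> sub;  // subexpression of Paren and BindTemporary
};

// Bounds of the form bounds(temporary, temporary + count).
struct InferredBounds {
    const Expr *temporary;
    std::size_t count;
};

// Strip any number of enclosing parentheses.
const Expr *ignoreParens(const Expr *e) {
    assert(e && "expression must not be null");
    while (e->kind == Kind::Paren && e->sub) e = e->sub.get();
    return e;
}

// Recognize "temporary binding around a possibly parenthesized literal"
// and build bounds that refer to the temporary instead of the literal.
std::optional<InferredBounds> inferLiteralBounds(const Expr *e) {
    if (!e || e->kind != Kind::BindTemporary || !e->sub) return std::nullopt;
    const Expr *inner = ignoreParens(e->sub.get());
    if (inner->kind != Kind::StringLiteral) return std::nullopt;
    return InferredBounds{e, inner->elements};
}
```

Any expression that is not a temporary binding, or whose subexpression is not a literal once parentheses are stripped, falls through to `std::nullopt`, leaving it to whatever other inference rules apply.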

Most of the changes here are boilerplate changes related to adding a new IR node. There are a few interesting places:

  • We tried inserting the expression temporaries at the creation of literals instead of at array-to-pointer decays, but that didn't work well. There are lots of places in the compiler that assume they are operating on exactly a string literal, and they all had to be patched.
  • During code generation, we track the LLVM value object used to represent the result of evaluating the subexpression of a temporary binding. We create a map from the temporary binding to the value object. At uses, we use that information to obtain the value of the subexpression.
  • Temporary expression binding is a form of declaration. During AST tree transformation (TreeTransform.h), we track when a binding has been transformed so that we can transform the use too.
  • We don't expect uses of temporary expressions to appear during AST serialization. These are created by the compiler during bounds inference for expressions, and we don't serialize ASTs with these inferred bounds. If we ever need to do that, we'll need to apply the same logic used for declarations to keep bindings/uses in sync.
  • We need to skip expression temporaries in some helper functions on expressions and in a few cases where expression temporaries now appear.
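The code generation bookkeeping described in the list above can be sketched as a map from bindings to already computed values. The names `TempBinding`, `TempUse`, and `Emitter` are hypothetical, and a `long` stands in for the LLVM value object; the sketch only shows the idea of evaluating the subexpression once and looking the result up at each use.

```cpp
#include <cassert>
#include <functional>
#include <unordered_map>

// Hypothetical binding node: owns the subexpression to evaluate.
struct TempBinding { std::function<long()> sub; };

// Hypothetical use node: refers back to its binding.
struct TempUse { const TempBinding *binding; };

class Emitter {
    // Map from a temporary binding to the value produced for its subexpression.
    std::unordered_map<const TempBinding *, long> values_;

public:
    // Evaluate the subexpression once and remember the result for later uses.
    long emitBinding(const TempBinding &b) {
        long v = b.sub();
        values_[&b] = v;
        return v;
    }

    // A use never re-evaluates the subexpression; it reads the recorded value.
    long emitUse(const TempUse &u) const {
        auto it = values_.find(u.binding);
        assert(it != values_.end() && "use emitted before its binding");
        return it->second;
    }
};
```

Because uses go through the map, a subexpression with side effects runs exactly once no matter how many times its value is referenced in bounds.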

Testing:

  • Add new clang test cases that check that the expected clang ASTs are synthesized for array literals and string literals, and that the expected LLVM IR is generated as well.
  • Add new runtime tests to the Checked C repo that check bounds checking of subscripting and dereferences of string literals and compound array literals (such as "abcd"[index]).
  • Existing automated tests pass, including Checked C tests, clang Checked C tests, and LNT testing.
