Add a dynamic check for pointer arithmetic of null pointers #647

mgrang · 2019-07-30T23:42:23Z

See issue #237

mgrang · 2019-07-30T23:43:38Z

This is version 1 of the patch and does not yet handle all the cases. Please review it and give your suggestions. I will add test cases once the patch is finalized.

Driving example:

void foo() {
  int *i = NULL;
  ++i; // No dynamic check added here.

  array_ptr<char> p : count(3) = NULL;
  ++p; // Dynamic check added here.

  array_ptr<char> q : count(3) = "abc";
  ++q; // Dynamic check added here.
}

dtarditi · 2019-07-31T21:52:33Z

lib/CodeGen/CGExprScalar.cpp

+  if (type->isCheckedPointerType() || type->isCheckedArrayType()) {
+    LValueBaseInfo BaseInfo;
+    TBAAAccessInfo TBAAInfo;
+    Address Addr = CGF.EmitPointerWithAlignment(E->getSubExpr(),


This will result in E's subexpression being evaluated twice. This would be a problem for something where the subexpression might have a side-effect, such as (a[f()])++. I think you need to push the insertion of the check into the cases below. The only interesting cases are those where a pointer type can occur.
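The double-evaluation concern can be illustrated outside the compiler. The following is a minimal sketch, not the patch's code: `f`, `calls`, and both helper functions are hypothetical names, and `assert` stands in for the dynamic check. It contrasts evaluating the operand of `(a[f()])++` twice with evaluating it once and checking the value already computed.

```cpp
#include <cassert>

// Hypothetical stand-ins: `calls` counts how often the index expression runs.
static int calls = 0;
static int f() { ++calls; return 0; }

// Problematic shape: the operand is evaluated once for the check and again
// for the increment, so the side effect in f() happens twice.
int *incrementEvaluatingTwice(int **a) {
  int *checked = a[f()];   // first evaluation, used only for the check
  assert(checked != nullptr);
  return ++a[f()];         // second evaluation performs the increment
}

// Intended shape: evaluate the operand once, then check and update through
// the same reference.
int *incrementEvaluatingOnce(int **a) {
  int *&slot = a[f()];     // single evaluation
  assert(slot != nullptr);
  return ++slot;
}
```

With a one-element array of pointers, the first helper bumps `calls` by two per call and the second by one, which is the behavioral difference the review comment is pointing at.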

mgrang · 2019-07-31T23:00:51Z

lib/CodeGen/CGExprScalar.cpp

@@ -2427,6 +2427,15 @@ ScalarExprEmitter::EmitScalarPrePostIncDec(const UnaryOperator *E, LValue LV,

  // Next most common: pointer increment.
  } else if (const PointerType *ptr = type->getAs<PointerType>()) {


According to the spec, array_ptrs of function pointers are not allowed. So I guess we do not need a check in the case for function pointers.

Agreed - we do not need a check for the case of function pointers.

mgrang · 2019-07-31T23:03:06Z

This patch currently handles the case of pre/post inc/dec of array_ptrs. I am now working to extend this patch to handle binary ops as well. For example:

array_ptr<char> p = NULL;
p += 1;
p = p - 1;

mgrang · 2019-07-31T23:07:52Z

Should we guard the insertion of non-null checks via a clang flag? The user can pass a flag like -fno-runtime-checks to skip insertion of run time checks for performance-critical code.

dtarditi · 2019-07-31T23:19:11Z

Yes, I think a clang flag controlling insertion of these checks is a good idea. We'll need it so that we can examine the performance impact of adding the checks.

mgrang · 2019-08-01T01:01:59Z

Yes, I think a clang flag controlling insertion of these checks is a good idea. We'll need it so that we can examine the performance impact of adding the checks.

Added in latest patch set.

mgrang · 2019-08-01T17:21:16Z

lib/CodeGen/CGDynamicCheck.cpp

@@ -54,6 +54,9 @@ void CodeGenFunction::EmitDynamicNonNullCheck(const Address BaseAddr, const Qual
  if (!getLangOpts().CheckedC)
    return;

+  if (!CGM.getCodeGenOpts().CheckedCRuntimeChecks)


Runtime checks for non-null will only be added if the flag -fcheckedc-runtime-checks is present. Since I have added this check here, it means this would affect all non-null checks (and not just those for pointer arithmetic). This is resulting in 3 unit test failures which we need to fix (by adding this flag to those tests). Is this behavior acceptable?

It should be on by default, and something that we can disable.
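As a rough illustration of "on by default, and something that we can disable", here is a stand-alone sketch. It is not Clang's option parser: `nullChecksEnabled` is a hypothetical helper, the flag spellings are the ones used at this point in the thread, and the "last occurrence wins" rule is an assumption about how a positive/negative flag pair would typically be resolved.

```cpp
#include <string>
#include <vector>

// Hypothetical model of a default-on flag pair; not Clang's real parsing code.
bool nullChecksEnabled(const std::vector<std::string> &args) {
  bool enabled = true;  // checks are on unless explicitly disabled
  for (const std::string &arg : args) {
    if (arg == "-fcheckedc-runtime-checks")
      enabled = true;
    else if (arg == "-fno-checkedc-runtime-checks")
      enabled = false;
    // Assumption: when both spellings appear, the last one wins.
  }
  return enabled;
}
```

Under this model the existing unit tests would keep their checks without passing any extra flag, and only a performance experiment would need the negative spelling.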

lenary · 2019-08-01T18:46:30Z

Nice!

dtarditi · 2019-08-01T19:31:01Z

I think you should rename the flag to indicate that it is specific to null checks. With the current name, someone might think we disable bounds checks too.

sunnychatterjee · 2019-08-02T19:08:26Z

include/clang/Basic/CodeGenOptions.def

@@ -364,6 +364,9 @@ CODEGENOPT(BranchTargetEnforcement, 1, 0)
 /// Whether to emit unused static constants.
 CODEGENOPT(KeepStaticConsts, 1, 0)

+/// Whether to add null ptr checks for checkedc.


Do we want the nullptr checks to be emitted by default?

Yes, we want to emit them by default. Please see @dtarditi 's comment above.

sunnychatterjee · 2019-08-02T19:09:15Z

include/clang/Driver/Options.td

+
+def fcheckedc_null_ptr_checks : Flag<["-"], "fcheckedc-null-ptr-checks">,
+  Group<f_Group>, Flags<[CC1Option]>,
+  HelpText<"Enable runtime null ptr checks">;


Nit: "Enable runtime nullptr checks". Similarly, at other places.

Thanks. I wanted to distinguish general "null pointers" from the C++ "nullptr" keyword. That's the reason for the space in between. I guess I can spell it out as "null pointer".

mgrang · 2019-08-02T19:27:00Z

This patch is causing 3 run time unit tests to assert. I am working on fixing it.

mgrang · 2019-08-06T23:59:51Z

lib/CodeGen/CGExprScalar.cpp

+}
+
+static Expr *peelOffOuterExpr(Expr *E) {
+  if (auto *ICE = dyn_cast<ImplicitCastExpr>(E))


Currently we only handle these two types of exprs because I do not have a driving example for any other types of exprs.

mgrang · 2019-08-07T00:00:59Z

lib/CodeGen/CGExprScalar.cpp

+    CGF.EmitDynamicNonNullCheck(Addr, Ty);
+    return;
+
+  } else if (RV.isComplex()) {


I also do not have any driving examples for complex and aggregate rvalues but I have added code here just in case. As an alternative, we could simply assert here and only handle scalar rvalues.

A suggestion: If you don't have motivating examples but have added the code for completeness, you can refer in comments to other parts of the codebase where such cases are being handled.

This is no longer present in the latest patch.

mgrang · 2019-08-08T00:21:04Z

I ran the LNT test suite locally and do not see any issues due to this patch. However, 10 tests fail with errors like "expected identifier or '('". These fail even without this patch.

sunnychatterjee · 2019-08-08T00:25:32Z

lib/CodeGen/CGExprScalar.cpp

+  if (B->isAdditiveOp() && B->getType()->isPointerType()) {
+    if (B->getLHS()->getType()->isPointerType()) {
+      return B->getLHS();
+    } else if (B->getRHS()->getType()->isPointerType()) {


Nit: Coding style. In some of the other places like CGDynamicCheck.cpp, I'm seeing that for single statements under an if-else, we aren't using braces. Should we drop the braces from the single statements under if-else here?

For example:

if (!(BaseTy->isCheckedPointerType() || BaseTy->isCheckedArrayType()))
  return;

sunnychatterjee · 2019-08-08T00:29:17Z

lib/CodeGen/CGExprScalar.cpp

@@ -2319,6 +2319,106 @@ static BinOpInfo createBinOpInfoFromIncDec(const UnaryOperator *E,
  return BinOp;
 }

+static Expr *peelOffPointerArithmetic(const BinaryOperator *B) {


Nit: I guess this function is getting the pointer type for expressions like (a + i) or (i + a), where 'a' is the pointer type and 'i' is the integer offset. Can we add a comment describing this?

Also, maybe we can rename the function to be more clear on the intent? Something like: getPointerTypeFromPointerOffsetExpression or something similar.

peelOffPointerArithmetic has been adapted from a similar function in lib/StaticAnalyzer/Core/BugReporterVisitors.cpp. I wanted to keep the same name to make the association clear and also to help while grep'ping for the function.

Added comment describing the function.

I agree with Sunny that the name isn't that meaningful.
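For readers following the naming discussion, the behavior described above (pick the pointer-typed operand out of `a + i` or `i + a`) can be sketched with toy types. `ToyExpr`, `ToyBinaryOp`, and `pointerOperandOf` are hypothetical stand-ins for illustration; they are not Clang's `Expr` or `BinaryOperator` classes and not the function in the patch.

```cpp
// Toy expression node: only records whether the expression has pointer type.
struct ToyExpr {
  bool isPointer;
};

// Toy binary operator with just enough state to mirror the checks in the diff.
struct ToyBinaryOp {
  char opcode;            // '+', '-', '*', ...
  bool resultIsPointer;   // type of the whole expression
  const ToyExpr *lhs;
  const ToyExpr *rhs;
  bool isAdditive() const { return opcode == '+' || opcode == '-'; }
};

// Return the pointer-typed operand of `p + i`, `i + p`, or `p - i`.
// Return nullptr when the operation is not pointer arithmetic.
const ToyExpr *pointerOperandOf(const ToyBinaryOp &b) {
  if (!b.isAdditive() || !b.resultIsPointer)
    return nullptr;
  if (b.lhs->isPointer)
    return b.lhs;
  if (b.rhs->isPointer)
    return b.rhs;
  return nullptr;
}
```

Note that a pointer difference such as `p - q` yields an integer, so the result-type test filters it out before either operand is inspected.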

sunnychatterjee · 2019-08-08T00:37:33Z

lib/CodeGen/CGExprScalar.cpp

+  return nullptr;
+}
+
+static Expr *peelOffOuterExpr(Expr *E) {


Can we rename this function to: skipCasts or something similar, which expresses the intent more clearly.

What about parentheses? Do we create expressions for them in the ASTs as well? If so, we'd want to skip them as well.

The check for: if (const auto *BO = dyn_cast<BinaryOperator>(E)) can be more targeted, since you only handle additive operators. So, you can use: isAdditiveOp() here.

peelOffOuterExpr has been adapted from a similar function in lib/StaticAnalyzer/Core/BugReporterVisitors.cpp. I wanted to keep the same name to make the association clear and also to help while grep'ping for the function.

Parentheses are represented in the AST as nesting levels. For example, (a * b) + c is represented as:

      +
     / \
    *   c
   / \
  a   b

isAdditiveOp is already checked inside peelOffPointerArithmetic.
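If wrapper nodes such as parentheses or implicit casts do sit between an operator and its pointer operand, stripping them could look like the toy sketch below. `ToyNode`, `Kind`, and `skipParensAndCasts` are hypothetical names used only for illustration. Clang's own `Expr` class provides helpers such as `IgnoreParenImpCasts()` for this purpose; whether the patch should call one of them is not settled in this thread.

```cpp
// Toy node kinds: two wrapper kinds plus a leaf for everything else.
enum class Kind { Paren, ImplicitCast, Leaf };

struct ToyNode {
  Kind kind;
  const ToyNode *sub;  // wrapped expression; nullptr for leaves
};

// Walk through wrapper nodes until a non-wrapper expression is reached.
const ToyNode *skipParensAndCasts(const ToyNode *e) {
  while (e && (e->kind == Kind::Paren || e->kind == Kind::ImplicitCast))
    e = e->sub;
  return e;
}
```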

dtarditi

The code needs to be revised to not evaluate pointer-typed subexpressions twice. I think additional tests are needed.

I'm interested in data on the effect of this on code size and on the effectiveness of optimization. That can be done as a follow-up work item.

dtarditi · 2019-08-08T20:24:36Z

lib/CodeGen/CGExprScalar.cpp

@@ -2319,6 +2319,106 @@ static BinOpInfo createBinOpInfoFromIncDec(const UnaryOperator *E,
  return BinOp;
 }

+static Expr *peelOffPointerArithmetic(const BinaryOperator *B) {


I agree with Sunny that the name isn't that meaningful.

dtarditi · 2019-08-08T20:57:36Z

lib/CodeGen/CGExprScalar.cpp

@@ -2427,6 +2529,9 @@ ScalarExprEmitter::EmitScalarPrePostIncDec(const UnaryOperator *E, LValue LV,

  // Next most common: pointer increment.
  } else if (const PointerType *ptr = type->getAs<PointerType>()) {
+    // Insert a dynamic check for arithmetic on null checked pointers.
+    emitDynamicNonNullCheck(CGF, E->getSubExpr());


The caller of this function has already evaluated getSubExpr() to an lvalue LV. You are re-evaluating E->getSubExpr() here. That's semantically incorrect because the subexpression may have a side-effect. Earlier in the function, code to load from the lvalue LV has already been generated, with the runtime result stored in the variable value. The runtime check needs to be against value, and must not re-evaluate the subexpression.

dtarditi · 2019-08-08T20:59:09Z

test/CheckedC/dynamic-checks/ptr-arithmetic-null-checks-code-gen.c

@@ -0,0 +1,60 @@
+// RUN: %clang_cc1 %s -emit-llvm -o - | FileCheck %s


Please add some checks for more complex lvalue-producing subexpressions:

- array subscripting
- members
- subexpressions with side-effects

dtarditi · 2019-08-08T21:02:04Z

lib/CodeGen/CGExprScalar.cpp

@@ -3162,6 +3267,9 @@ static Value *emitPointerArithmetic(CodeGenFunction &CGF,
    std::swap(pointerOperand, indexOperand);
  }

+  // Insert a dynamic check for arithmetic on null checked pointers.
+  emitDynamicNonNullCheck(CGF, pointerOperand);


This has similar problems, in that pointerOperand will be re-evaluated.

dtarditi · 2019-08-08T21:10:17Z

lib/CodeGen/CGExprScalar.cpp

+// Determine whether to emit a dynamic non-null check.
+static void emitDynamicNonNullCheck(CodeGenFunction &CGF, Expr *E) {
+  if (!CGF.CGM.getCodeGenOpts().CheckedCNullPtrChecks)
+    return;


I would structure this code differently. It's an optimization to not emit a dynamic null check if E can be proved at compile-time to be non-null. So I'd try to prove that E does not need a null check and return if the null check can be skipped. Then I would go into the general case of adding a check.

Please improve the comments on this method. We should always explain what non-obvious arguments are. For example, in this case E is the pointer-typed expression in a pointer-arithmetic expression. We need to check the value of E is non-null, if E is a checked pointer type.
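The suggested control flow (rule out the cases that need no check first, then fall through to the general case) might be sketched as follows. `ToyPointerExpr`, its fields, and `needsDynamicNullCheck` are hypothetical; in particular, `provablyNonNull` stands for the result of some compile-time analysis that this thread does not specify.

```cpp
// Toy description of the pointer-typed operand E of a pointer-arithmetic
// expression. Both fields are assumptions made for this sketch.
struct ToyPointerExpr {
  bool isCheckedPointer;  // e.g. an array_ptr<T> rather than a plain T *
  bool provablyNonNull;   // outcome of a hypothetical static analysis
};

// Decide whether a runtime non-null check must be emitted for `e`.
bool needsDynamicNullCheck(const ToyPointerExpr &e, bool checksEnabled) {
  if (!checksEnabled)
    return false;  // the user turned the checks off
  if (!e.isCheckedPointer)
    return false;  // unchecked pointers do not get a check
  if (e.provablyNonNull)
    return false;  // optimization: the check can be skipped
  return true;     // general case: emit the check
}
```

Keeping the early exits together makes it easy to add further "no check needed" proofs later without touching the emission path.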

mgrang · 2019-08-13T18:31:36Z

lib/CodeGen/CGExprScalar.cpp

@@ -2427,6 +2433,9 @@ ScalarExprEmitter::EmitScalarPrePostIncDec(const UnaryOperator *E, LValue LV,

  // Next most common: pointer increment.
  } else if (const PointerType *ptr = type->getAs<PointerType>()) {
+    // Insert a dynamic check for arithmetic on null checked pointers.


I am not sure about adding a compile time check to see if value is indeed null so that the runtime check can be elided. LLVM has a function called isKnownNonZero which is used in a couple of places in Clang but in my case it always returns 0.

Consider this code:

array_ptr<char> s = NULL;
++s;

This is the IR:

%s = alloca i8*, align 8
store i8* null, i8** %s, align 8
%0 = load i8*, i8** %s, align 8
%_Dynamic_check.non_null = icmp ne i8* %0, null

I guess we need to check what is getting stored in %s. If it is indeed non-null, then we can elide the check. Is this a valid approach?
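In source-level terms, the IR above loads the pointer, compares it with null, and only then lets the arithmetic proceed. A minimal model of that behavior is sketched below. `checkedIncrement` is a hypothetical helper, and the exception merely stands in for whatever failure action the compiler emits when the check fails.

```cpp
#include <stdexcept>

// Model of `++p` on a checked pointer with a dynamic non-null check.
char *checkedIncrement(char *&p) {
  char *loaded = p;  // corresponds to the load into %0
  if (loaded == nullptr)  // the icmp ne against null evaluated to false
    throw std::runtime_error("dynamic check failed: arithmetic on null pointer");
  p = loaded + 1;  // the increment runs only after the check passes
  return p;
}
```

The check uses the value that was already loaded, so the operand is read once, and a failing check leaves the pointer unmodified.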

mgrang · 2019-08-14T17:54:17Z

This patch had merge conflicts with master. While trying to resolve the conflicts, git added other commits to this PR. At this point, I would simply create a new PR with the conflicts fixed. Please see https://github.com/microsoft/checkedc-clang/pull/663/files.

dtarditi · 2019-08-15T22:41:37Z

Closing this PR because it is superseded by PR #663.

Add a dynamic check for pointer arithmetic of null pointers

aa0c702

mgrang requested review from dtarditi and sunnychatterjee July 30, 2019 23:45

Update Testing.md with ARM lit testing guidelines (#645)

9337489

dtarditi reviewed Jul 31, 2019

Mandeep Singh Grang added 3 commits July 31, 2019 15:57

Move the non-null check into the case for pointers

584cbc7

Remove extra newline

ce10295

Add newline

c1d4092

mgrang commented Jul 31, 2019

Add new flags -f[no]checkedc-runtime-checks

6d66baa

Added two new flags to control whether runtime checks are added: -fcheckedc-runtime-checks and -fno-checkedc-runtime-checks. Also added runtime checks for pointer arithmetic using binary operations like ptr += 1 and ptr = ptr - 1.

Fix comment in CodeGenOptions.def

d8651ee

mgrang commented Aug 1, 2019

mgrang mentioned this pull request Aug 1, 2019

Add non-null checks to pointer arithmetic #237

Closed

Mandeep Singh Grang added 2 commits August 1, 2019 12:16

Conform CGDynamicCheck.cpp to 80 char line limit [NFC] (#651)

600ec52

Turn on runtime checks by default

151613b

Mandeep Singh Grang added 2 commits August 2, 2019 10:19

Rename flag to -f[no-]checkedc-null-ptr-checks

491948e

Add unit test

aaaddc6

sunnychatterjee reviewed Aug 2, 2019

Handle ptr arithmetic with on temp ptrs

a1d542b

mgrang commented Aug 7, 2019

Mandeep Singh Grang and others added 2 commits August 6, 2019 17:19

Avoid processing null checks if flag is present

d6236b8

Fixed the comment for the CheckedPointerKind enum class. (#652)

784d0f7

- Added a missing comment for _Nt_Array_ptr. - Fixed an incomplete sentence for _Array_ptr.

Change 'null ptr' to 'null pointer'

d2a260f

sunnychatterjee reviewed Aug 8, 2019

Removed braces for single statement if's and added comments

bfd404f

dtarditi reviewed Aug 8, 2019

Use computed value instead of re-emitting subexpr

77d885c

mgrang commented Aug 13, 2019

Mandeep Singh Grang added 16 commits August 14, 2019 09:04

Remove unneeded header

eda761a

Add a dynamic check for pointer arithmetic of null pointers

a2b644f

Move the non-null check into the case for pointers

491108a

Remove extra newline

9f861d6

Add newline

b358a94

Add new flags -f[no]checkedc-runtime-checks

f337e46

Added two new flags to control whether runtime checks are added: -fcheckedc-runtime-checks and -fno-checkedc-runtime-checks. Also added runtime checks for pointer arithmetic using binary operations like ptr += 1 and ptr = ptr - 1.

Fix comment in CodeGenOptions.def

c5768dc

Turn on runtime checks by default

858c250

Rename flag to -f[no-]checkedc-null-ptr-checks

8ed7650

Add unit test

65a4fc5

Handle ptr arithmetic with on temp ptrs

f2025b0

Avoid processing null checks if flag is present

cbe64db

Change 'null ptr' to 'null pointer'

95411fd

Removed braces for single statement if's and added comments

5a7fe68

Use computed value instead of re-emitting subexpr

989927a

Remove unneeded header and fix line limits

916e7ea

mgrang mentioned this pull request Aug 14, 2019

Add a dynamic check for pointer arithmetic of null pointers #663

Merged

dtarditi closed this Aug 15, 2019

mgrang deleted the dyn-check branch October 31, 2019 21:12

This was referenced Jan 16, 2022

Add a dynamic check for pointer arithmetic of null pointers checkedc/checkedc-llvm-project#643

Closed

Add a dynamic check for pointer arithmetic of null pointers checkedc/checkedc-llvm-project#659

Closed
