Skip to content

Commit c71c452

Browse files
authored
String literals with checked types and improved support for array literals. (#396)
This adds compiler support for using string literals in checked scopes. In C, a string literal has the type "array of character", where there are several possible character types. In Checked C, that corresponds to an "unchecked array of character". To be able to use string in checked scopes, we need a way that they can be typed as a "checked null-terminated array of characters." We support this in two ways. First we alter the type of string literals that appears in checked scopes to have null-terminated array types. Array-to-pointer conversion converts this to having an nt_array_ptr test, and at assignments nt_array_ptr can converted array_ptr.. Second, we allow array literals (of which strings are a special case) to be converted implicitly to a checked pointer type when a checked pointer type is expected. The checked pointer type should have the same pointer kind as the expected pointer type. The implicit conversions can happen at assignments, function calls, and within initializers. The language rules haven't been added to the Checked C specification yet. The proposed rule is described[here](checkedc/checkedc#207). Array literals specified using initializers or compound literal expressions already worked for the most part. The C language rules use the type of the variable being initialized or the type of the compound literal expression as guides for filling in the type. The clang machinery for filling in arrays worked for checked arrays too. This commit adds one important missing case: incomplete array types. We were not propagating the checked enum when the complete array type was formed after the number of array elements had been determined. It also checks in a checked scope that the compound literal type is a checked type. This closes off one more route by which an unchecked type could appear in a checked scope. This commit adds a new method for checking types that appear syntactically in expressions in checked scopes (for example, in cast operators and compound literal expressions). This produces error message with similar detail to those produced for declarations, calling out the problematic unchecked type within a constructed type if necessary. The existing error message for cast operations are switched to use this method, so we get better error message for cast expressions too. Testing: - Added new tests in the Checked C repo in initializers.c that cover strings, array initializers, and pointers initialized via strings or compound literal expressions with array type. - Passes existing automated testing for Linux and Windows. I had to modify several LNT benchmarks because the improved typing discovered that we were passing strings to methods with bounds-safe interfaces.
1 parent fd945a0 commit c71c452

File tree

7 files changed

+121
-44
lines changed

7 files changed

+121
-44
lines changed

include/clang/Basic/DiagnosticSemaKinds.td

Lines changed: 8 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -9332,29 +9332,30 @@ def err_bounds_type_annotation_lost_checking : Error<
93329332
"|argument for parameter used in function return bounds expression"
93339333
"|argument for parameter used in function parameter bounds expression}1">;
93349334

9335-
def err_checked_scope_type: Error<
9335+
def err_checked_scope_decl_type: Error<
93369336
"%select{parameter|return|local variable|global variable|member}0 %select{|used }1in a checked scope must have "
93379337
"%select{a |a pointer, array or function type that uses only |a bounds-safe interface type that uses only }2"
93389338
"checked %plural{0:type %plural{2:|:or a bounds-safe interface}0|:types or parameter/return types with bounds-safe interfaces}2">;
93399339

9340+
def err_checked_scope_type: Error<
9341+
"type in a checked scope must %plural{0:be a checked pointer or array type"
9342+
"|:use only checked types or parameter/return types with bounds-safe interfaces}0">;
9343+
93409344
def note_checked_scope_declaration : Note<
93419345
"%select{parameter|return|local variable|global variable|member}0 declared here">;
93429346

93439347
def note_checked_scope_problem_type : Note<
93449348
"%0 is not allowed in a checked scope">;
93459349

9346-
def err_checked_scope_type_for_casting : Error<
9347-
"invalid casting to unchecked type %0 in a checked scope">;
9348-
93499350
def err_checked_scope_no_prototype_func : Error<
93509351
"function without a prototype cannot be used or declared in a checked scope">;
93519352

9352-
def err_checked_scope_no_variable_args : Error<
9353+
def err_checked_scope_decl_variable_args : Error<
93539354
"%select{parameter|return|local variable|global variable|member}0 %select{|used }1in a checked scope "
93549355
"cannot have variable arguments">;
93559356

9356-
def err_checked_scope_no_variable_args_for_casting : Error<
9357-
"invalid casting to type having variable arguments in a checked scope">;
9357+
def err_checked_scope_type_variable_args : Error<
9358+
"type in a checked scope cannot have variable arguments">;
93589359

93599360
def err_checked_scope_no_variadic_func_for_declaration : Error<
93609361
"variable arguments function cannot be made in a checked scope">;

include/clang/Sema/Sema.h

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -3831,6 +3831,8 @@ class Sema {
38313831
bool DiagnoseCheckedDecl(const ValueDecl *D,
38323832
SourceLocation UseLoc = SourceLocation());
38333833

3834+
bool DiagnoseTypeInCheckedScope(QualType Ty, SourceLocation Start, SourceLocation End);
3835+
38343836
/// \brief Warn if we're implicitly casting from a _Nullable pointer type to a
38353837
/// _Nonnull one.
38363838
void diagnoseNullableToNonnullConversion(QualType DstType, QualType SrcType,

lib/Sema/SemaBounds.cpp

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -690,8 +690,10 @@ namespace {
690690
return LValueBounds(ICE->getSubExpr());
691691
return CreateBoundsUnknown();
692692
}
693-
// TODO: fill in these cases.
693+
// TODO: these cases need CurrentExprValue to be implemented to express
694+
// the bounds.
694695
case Expr::CompoundLiteralExprClass:
696+
case Expr::StringLiteralClass:
695697
return CreateBoundsAllowedButNotComputed();
696698
default:
697699
return CreateBoundsUnknown();

lib/Sema/SemaCast.cpp

Lines changed: 11 additions & 28 deletions
Original file line numberDiff line numberDiff line change
@@ -2574,24 +2574,15 @@ void CastOperation::CheckCStyleCast() {
25742574
// Checked C - No C-style casts to unchecked pointer/array type or variadic
25752575
// type in a checked block.
25762576
if (Self.getCurScope()->isCheckedScope()) {
2577-
bool IsUnchecked = DestType->isOrContainsUncheckedType();
2578-
bool HasVariadicType = DestType->hasVariadicType();
2579-
bool ConstructsNullPointer = false;
2580-
if (IsUnchecked)
2581-
if (DestType->isVoidPointerType() &&
2582-
!SrcExpr.isInvalid()) {
2583-
const IntegerLiteral *Lit = dyn_cast<IntegerLiteral>(SrcExpr.get());
2584-
if (Lit && !Lit->getValue())
2585-
ConstructsNullPointer = true;
2586-
}
2587-
if ((IsUnchecked && !ConstructsNullPointer) || HasVariadicType) {
2588-
if (IsUnchecked) {
2589-
Self.Diag(OpRange.getBegin(), diag::err_checked_scope_type_for_casting)
2590-
<< DestType;
2591-
} else {
2592-
Self.Diag(OpRange.getBegin(),
2593-
diag::err_checked_scope_no_variable_args_for_casting);
2594-
}
2577+
bool isNullPointerConstant =
2578+
DestType->isVoidPointerType() &&
2579+
DestType->isUncheckedPointerType() &&
2580+
!SrcExpr.isInvalid() &&
2581+
SrcExpr.get()->isNullPointerConstant(Self.Context,
2582+
Expr::NPC_NeverValueDependent);
2583+
if (!isNullPointerConstant &&
2584+
!Self.DiagnoseTypeInCheckedScope(DestType, OpRange.getBegin(),
2585+
OpRange.getEnd())) {
25952586
SrcExpr = ExprError();
25962587
return;
25972588
}
@@ -2644,21 +2635,13 @@ void CastOperation::CheckBoundsCast(tok::TokenKind kind) {
26442635
// Checked C - No C-style casts to unchecked pointer/array type or variadic
26452636
// type in a checked block.
26462637
if (Self.getCurScope()->isCheckedScope()) {
2647-
bool IsUnchecked = DestType->isOrContainsUncheckedType();
2648-
bool HasVariadicType = DestType->hasVariadicType();
26492638
if (Kind == CK_AssumePtrBounds) {
26502639
Self.Diag(OpRange.getBegin(),
26512640
diag::err_checked_scope_no_assume_bounds_casting);
26522641
SrcExpr = ExprError();
26532642
return;
2654-
} else if (IsUnchecked || HasVariadicType) {
2655-
if (IsUnchecked) {
2656-
Self.Diag(OpRange.getBegin(), diag::err_checked_scope_type_for_casting)
2657-
<< DestType;
2658-
} else {
2659-
Self.Diag(OpRange.getBegin(),
2660-
diag::err_checked_scope_no_variable_args_for_casting);
2661-
}
2643+
} else if (!Self.DiagnoseTypeInCheckedScope(DestType, OpRange.getBegin(),
2644+
OpRange.getEnd())) {
26622645
SrcExpr = ExprError();
26632646
return;
26642647
}

lib/Sema/SemaChecking.cpp

Lines changed: 27 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -11588,7 +11588,8 @@ bool Sema::AllowedInCheckedScope(QualType Ty, const BoundsExpr *Bounds,
1158811588
return false;
1158911589
}
1159011590
}
11591-
return AllowedInCheckedScope(Ty->getPointeeType(), nullptr, false, Loc,
11591+
QualType ReferentType = QualType(Ty->getPointeeOrArrayElementType(), 0);
11592+
return AllowedInCheckedScope(ReferentType, nullptr, false, Loc,
1159211593
ProblemLoc, ProblemTy);
1159311594
} else if (const FunctionProtoType *fpt = Ty->getAs<FunctionProtoType>()) {
1159411595
const BoundsExpr *ReturnBounds = fpt->getReturnBounds();
@@ -11661,7 +11662,7 @@ bool Sema::DiagnoseCheckedDecl(const ValueDecl *Decl, SourceLocation UseLoc) {
1166111662
if (!AllowedInCheckedScope(Ty, TargetDecl->getBoundsExpr(),
1166211663
isa<ParmVarDecl>(TargetDecl), CSTL_TopLevel,
1166311664
ProblemLoc, ProblemTy)) {
11664-
Diag(Loc, diag::err_checked_scope_type) << DeclKind << IsUse
11665+
Diag(Loc, diag::err_checked_scope_decl_type) << DeclKind << IsUse
1166511666
<< ProblemLoc;
1166611667
if (IsUse) {
1166711668
Diag(TargetDecl->getLocStart(), diag::note_checked_scope_declaration)
@@ -11685,14 +11686,37 @@ bool Sema::DiagnoseCheckedDecl(const ValueDecl *Decl, SourceLocation UseLoc) {
1168511686
}
1168611687

1168711688
if (Ty->hasVariadicType()) {
11688-
Diag(Loc, diag::err_checked_scope_no_variable_args) << DeclKind
11689+
Diag(Loc, diag::err_checked_scope_decl_variable_args) << DeclKind
1168911690
<< IsUse;
1169011691
Result = false;
1169111692
}
1169211693

1169311694
return Result;
1169411695
}
1169511696

11697+
bool Sema::DiagnoseTypeInCheckedScope(QualType Ty, SourceLocation StartLoc,
11698+
SourceLocation EndLoc) {
11699+
CheckedScopeTypeLocation ProblemLoc = CSTL_TopLevel;
11700+
QualType ProblemTy = Ty;
11701+
if (!AllowedInCheckedScope(Ty, nullptr, false, CSTL_TopLevel,
11702+
ProblemLoc, ProblemTy)) {
11703+
Diag(StartLoc, diag::err_checked_scope_type) << ProblemLoc
11704+
<< SourceRange(StartLoc, EndLoc);
11705+
11706+
// Print a note about the problem type if it might not be obvious.
11707+
if (ProblemLoc != CSTL_TopLevel || !DisplaysAsArrayOrPointer(ProblemTy))
11708+
Diag(StartLoc, diag::note_checked_scope_problem_type) << ProblemTy;
11709+
return false;
11710+
}
11711+
11712+
if (Ty->hasVariadicType()) {
11713+
Diag(StartLoc, diag::err_checked_scope_type_variable_args) <<
11714+
SourceRange(StartLoc, EndLoc);
11715+
return false;
11716+
}
11717+
11718+
return true;
11719+
}
1169611720

1169711721
//===--- Layout compatibility ----------------------------------------------//
1169811722

lib/Sema/SemaExpr.cpp

Lines changed: 66 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1731,9 +1731,13 @@ Sema::ActOnStringLiteral(ArrayRef<Token> StringToks, Scope *UDLScope) {
17311731
// Get an array type for the string, according to C99 6.4.5. This includes
17321732
// the nul terminator character as well as the string length for pascal
17331733
// strings.
1734+
CheckedArrayKind ArrayKind = getCurScope()->isCheckedScope() ?
1735+
CheckedArrayKind::NtChecked : CheckedArrayKind::Unchecked;
1736+
17341737
QualType StrTy = Context.getConstantArrayType(CharTyConst,
1735-
llvm::APInt(32, Literal.GetNumStringChars()+1),
1736-
ArrayType::Normal, 0);
1738+
llvm::APInt(32, Literal.GetNumStringChars() + 1),
1739+
ArrayType::Normal, 0,
1740+
ArrayKind);
17371741

17381742
// OpenCL v1.1 s6.5.3: a string literal is in the constant address space.
17391743
if (getLangOpts().OpenCL) {
@@ -5795,6 +5799,14 @@ Sema::BuildCompoundLiteralExpr(SourceLocation LParenLoc, TypeSourceInfo *TInfo,
57955799
SourceRange(LParenLoc, LiteralExpr->getSourceRange().getEnd())))
57965800
return ExprError();
57975801

5802+
if (getCurScope()->isCheckedScope()) {
5803+
if (literalType->isArrayType())
5804+
literalType = MakeCheckedArrayType(literalType);
5805+
if (!DiagnoseTypeInCheckedScope(literalType, LParenLoc, RParenLoc))
5806+
return ExprError();
5807+
literalType = RewriteBoundsSafeInterfaceTypes(literalType);
5808+
}
5809+
57985810
InitializedEntity Entity
57995811
= InitializedEntity::InitializeCompoundLiteralInit(TInfo);
58005812
InitializationKind Kind
@@ -8213,6 +8225,50 @@ Sema::CheckTransparentUnionArgumentConstraints(QualType ArgType,
82138225
return Compatible;
82148226
}
82158227

8228+
// If the LHS type is a checked pointer type and the RHS is an array literal,
8229+
// retype the RHS to have the same kind of checked pointer. We assume
8230+
// array-to-pointer converison has already been done.
8231+
static bool arrayConstantCheckedConversion(Sema &S, QualType LHSType,
8232+
ExprResult &RHS) {
8233+
// Recognize the syntax for the constant.
8234+
ImplicitCastExpr *ICE = dyn_cast<ImplicitCastExpr>(RHS.get());
8235+
if (!ICE)
8236+
return false;
8237+
8238+
if (ICE->getCastKind() != CK_ArrayToPointerDecay)
8239+
return false;
8240+
8241+
Expr *Child = ICE->getSubExpr()->IgnoreParens();
8242+
if (!isa<InitListExpr>(Child) && !isa<StringLiteral>(Child) &&
8243+
!isa<CompoundLiteralExpr>(Child))
8244+
return false;
8245+
8246+
// See if the constant needs to be retyped.
8247+
QualType RHSType = RHS.get()->getType();
8248+
const PointerType *RHSPointerType = dyn_cast<PointerType>(RHSType);
8249+
const PointerType *LHSPointerType = dyn_cast<PointerType>(LHSType);
8250+
8251+
if (!LHSType->isCheckedPointerType() || !RHSPointerType ||
8252+
LHSPointerType->getKind() == RHSPointerType->getKind())
8253+
return false;
8254+
8255+
QualType RHSPointee = RHSPointerType->getPointeeType();
8256+
8257+
// Retype the constant.
8258+
8259+
// For checked null-terminated pointers, only retype the constant if the
8260+
// type would be a valid null-termianted ponter type.
8261+
if (LHSType->isCheckedPointerNtArrayType() && !RHSPointee->isIntegerType() &&
8262+
!RHSPointee->isPointerType())
8263+
return false;
8264+
8265+
RHSType = S.Context.getPointerType(RHSPointee, LHSPointerType->getKind());
8266+
RHS = S.ImpCastExprToType(RHS.get(), RHSType, CK_ArrayToPointerDecay,
8267+
clang::VK_RValue, nullptr,
8268+
Sema::CCK_ImplicitConversion, true);
8269+
return true;
8270+
}
8271+
82168272
Sema::AssignConvertType
82178273
Sema::CheckSingleAssignmentConstraints(QualType LHSType, ExprResult &CallerRHS,
82188274
bool Diagnose,
@@ -8228,7 +8284,6 @@ Sema::CheckSingleAssignmentConstraints(QualType LHSType, ExprResult &CallerRHS,
82288284
// to put the updated value.
82298285
ExprResult LocalRHS = CallerRHS;
82308286
ExprResult &RHS = ConvertRHS ? CallerRHS : LocalRHS;
8231-
82328287
if (getLangOpts().CPlusPlus) {
82338288
if (!LHSType->isRecordType() && !LHSType->isAtomicType()) {
82348289
// C++ 5.17p3: If the left operand is not of class type, the
@@ -8305,6 +8360,14 @@ Sema::CheckSingleAssignmentConstraints(QualType LHSType, ExprResult &CallerRHS,
83058360
return Incompatible;
83068361
}
83078362

8363+
// Checked C: implicitly convert array constants to checked pointer types
8364+
// when the LHS is a checked pointer type, This is done after array-to-pointer
8365+
// conversion to avoid duplicating the type conversion logic there.
8366+
if (arrayConstantCheckedConversion(*this, LHSType, RHS)) {
8367+
if (RHS.isInvalid())
8368+
return Incompatible;
8369+
}
8370+
83088371
Expr *PRE = RHS.get()->IgnoreParenCasts();
83098372
if (Diagnose && isa<ObjCProtocolExpr>(PRE)) {
83108373
ObjCProtocolDecl *PDecl = cast<ObjCProtocolExpr>(PRE)->getProtocol();

lib/Sema/SemaInit.cpp

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -160,7 +160,8 @@ static void CheckStringInit(Expr *Str, QualType &DeclT, const ArrayType *AT,
160160
// Return a new array type (C99 6.7.8p22).
161161
DeclT = S.Context.getConstantArrayType(IAT->getElementType(),
162162
ConstVal,
163-
ArrayType::Normal, 0);
163+
ArrayType::Normal, 0,
164+
IAT->getKind());
164165
updateStringLiteralType(Str, DeclT);
165166
return;
166167
}
@@ -1698,7 +1699,8 @@ void InitListChecker::CheckArrayType(const InitializedEntity &Entity,
16981699
}
16991700

17001701
DeclType = SemaRef.Context.getConstantArrayType(elementType, maxElements,
1701-
ArrayType::Normal, 0);
1702+
ArrayType::Normal, 0,
1703+
arrayType->getKind());
17021704
}
17031705
if (!hadError && VerifyOnly) {
17041706
// If there are any members of the array that get value-initialized, check

0 commit comments

Comments
 (0)