This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Commit 6f55bc8

JIT: Fix value type box optimization
Boxing a value type produces a non-null result. If the result of the box is only used to feed a compare against null, the jit tries to optimize the box away entirely since the result of the comparison is known. Such idiomatic expressions arise fairly often in generics instantiated over value types.

In the current implementation the box expands into two parts. The first is an upstream statement to allocate a boxed object and assign a reference to the boxed object to a local var known as the "box temp". The second is an expression tree whose value is the box temp and that also contains an encapsulated copy from the value being boxed to the payload section of the boxed object. The box node also contains a pointer back to the first statement (more on this later).

In the examples being discussed here this second tree is a child of a compare node whose other child is a null pointer. When the optimization fires, the upstream allocation statement is located via the pointer in the box node and removed, and the entire compare is replaced with a constant 0 or 1 as appropriate.

Unfortunately the encapsulated copy in the box subtree may include side effects that should be preserved, and so this transformation is unsafe. Note that the copy subtree as a whole always contains side effects, since the copy stores values into the heap; that store no longer happens once the box is optimized away. But the side effects that happen when producing the value to box must remain.

In the initial example from #12949 the side effects in question were introduced by the jit's optimizer to capture a CSE definition. #13016 gives several other examples where the side effects are present in the initial user code. For instance the value being boxed might come from an array, in which case the encapsulated copy in the box expression tree would contain the array null check and bounds check. So removing the entire tree can alter behavior.
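
To make the hazard concrete, here is a minimal C++ sketch; it is an analogy, not jit code. The names `loadElement`, `boxIsNonNull`, `unsafeFold`, and `safeFold` are hypothetical. A `std::unique_ptr<int>` stands in for the boxed object, and a bounds-checked vector load stands in for the array access whose checks must survive.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <vector>

// Toy stand-in for a bounds-checked array load. The throw models the
// side effect carried by the source of the box's copy.
int loadElement(const std::vector<int>& a, std::size_t i)
{
    if (i >= a.size())
    {
        throw std::out_of_range("index out of range");
    }
    return a[i];
}

// What the user wrote: box a[i], then compare the box against null.
bool boxIsNonNull(const std::vector<int>& a, std::size_t i)
{
    // Allocate the "box" and copy the loaded value into its payload.
    std::unique_ptr<int> box = std::make_unique<int>(loadElement(a, i));
    return box != nullptr; // always true if the load did not throw
}

// Unsafe folding: the whole box tree is dropped, so the load never runs
// and an out-of-range index goes unnoticed.
bool unsafeFold(const std::vector<int>&, std::size_t)
{
    return true;
}

// Safe folding: the allocation and the store are gone, but the load is
// still evaluated for its side effect before the constant is produced.
bool safeFold(const std::vector<int>& a, std::size_t i)
{
    (void)loadElement(a, i);
    return true;
}
```

With an in-range index all three functions return `true`. With an out-of-range index, `boxIsNonNull` and `safeFold` both throw, while `unsafeFold` silently returns `true`, which is the behavior change described above.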
This fix attempts to carefully preserve the important side effects by reworking how a box is imported. The copy is now moved out from under the box into a second upstream statement. The box itself is then just a trivial side-effect-free reference to the box temp. To ensure proper ordering of side effects the jit spills the evaluation stack before appending the copy statement.

When the optimization fires the jit removes the upstream heap allocation as before, as well as the now-trivial compare tree. It then analyzes the source side of the upstream copy. If the source is side effect free, the copy is removed entirely. If not, the jit modifies the copy into a minimal load of the boxed value, and this load should reproduce the necessary side effects.

The optimization is only performed when the tree shape of the copy matches expected patterns. There are some expected cases where the tree won't match, for instance if the optimization is invoked while the jit is inlining. Because this optimization runs at several points the jit can catch these cases once inlining completes. One case that could be handled is not: when the assignment part of the copy is itself a subtree of a comma. This doesn't happen often.

The optimization is now also extended to handle the case where the comparison operation is `cgt.un`. This doesn't catch any new cases but causes the optimization to happen earlier, typically during importation, which should reduce jit time slightly.

Generally the split of the box into two upstream statements reduces code size, especially when the box expression is incorporated into a larger tree -- for example a call. However in some cases where the value being boxed comes from an array, preserving the array bounds check now causes loop cloning to kick in and increase code size. Hence the overall size impact on the jit-diff set is essentially zero.
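
The choice of what to do with the upstream copy can be summarized in a small decision function. This is a hedged sketch based on the description above and on the `gtFoldExprSpecial` change in this commit; `CopySource`, `CopyAction`, and `chooseCopyAction` are hypothetical names, and the jit itself works on `GenTree` nodes rather than on a flag struct like this one.

```cpp
#include <cassert>

// Hypothetical summary of the copy's source tree.
struct CopySource
{
    bool pendingInline;   // source is still a placeholder for an inline candidate
    bool hasSideEffects;  // e.g. a null check or bounds check
    bool isStruct;        // source has struct type
    bool isKnownStructOp; // struct source is one of the recognized load forms
};

enum class CopyAction
{
    Bail,           // leave everything alone; maybe retry later
    RemoveCopy,     // no side effects: the copy becomes a NOP
    KeepScalarRead, // scalar: keep just the source as the statement
    ReadFirstByte   // struct: retype the source as a one-byte load
};

CopyAction chooseCopyAction(const CopySource& src)
{
    if (src.pendingInline)
    {
        return CopyAction::Bail; // revisit once inlining has completed
    }
    if (!src.hasSideEffects)
    {
        return CopyAction::RemoveCopy;
    }
    if (!src.isStruct)
    {
        return CopyAction::KeepScalarRead;
    }
    if (!src.isKnownStructOp)
    {
        return CopyAction::Bail; // unrecognized struct source shape
    }
    return CopyAction::ReadFirstByte;
}
```

For example, a bounds-checked scalar array element yields `KeepScalarRead`, while a struct element loaded through a recognized form yields `ReadFirstByte`, since there is no need to read the whole struct and no place to put it.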
Added a number of new test cases showing the variety of situations that must be handled and the need to spill before appending the copy statement. Fixes #12949.
1 parent f3e228b commit 6f55bc8

19 files changed: +902 -19 lines changed

src/jit/gentree.cpp

Lines changed: 156 additions & 11 deletions
@@ -7454,7 +7454,8 @@ GenTreePtr Compiler::gtCloneExpr(
 
             case GT_BOX:
                 copy = new (this, GT_BOX)
-                    GenTreeBox(tree->TypeGet(), tree->gtOp.gtOp1, tree->gtBox.gtAsgStmtWhenInlinedBoxValue);
+                    GenTreeBox(tree->TypeGet(), tree->gtOp.gtOp1, tree->gtBox.gtAsgStmtWhenInlinedBoxValue,
+                               tree->gtBox.gtCopyStmtWhenInlinedBoxValue);
                 break;
 
             case GT_INTRINSIC:
@@ -12032,32 +12033,168 @@ GenTreePtr Compiler::gtFoldExprSpecial(GenTreePtr tree)
 
     switch (oper)
     {
-
         case GT_EQ:
         case GT_NE:
+        case GT_GT:
             // Optimize boxed value classes; these are always false. This IL is
             // generated when a generic value is tested against null:
             // <T> ... foo(T x) { ... if ((object)x == null) ...
             if (val == 0 && op->IsBoxedValue())
             {
-                // Change the assignment node so we don't generate any code for it.
+                // The tree under the box must be side effect free
+                // since we drop it if we optimize the compare.
+                assert(!gtTreeHasSideEffects(op->gtBox.gtOp.gtOp1, GTF_SIDE_EFFECT));
 
+                // grab related parts for the optimization
                 GenTreePtr asgStmt = op->gtBox.gtAsgStmtWhenInlinedBoxValue;
                 assert(asgStmt->gtOper == GT_STMT);
-                GenTreePtr asg = asgStmt->gtStmt.gtStmtExpr;
-                assert(asg->gtOper == GT_ASG);
+                GenTreePtr copyStmt = op->gtBox.gtCopyStmtWhenInlinedBoxValue;
+                assert(copyStmt->gtOper == GT_STMT);
 #ifdef DEBUG
                 if (verbose)
                 {
-                    printf("Bashing ");
-                    printTreeID(asg);
-                    printf(" to NOP as part of dead box operation\n");
+                    printf("\nAttempting to optimize BOX(valueType) %s null\n", GenTree::OpName(oper));
                     gtDispTree(tree);
+                    printf("\nWith assign\n");
+                    gtDispTree(asgStmt);
+                    printf("\nAnd copy\n");
+                    gtDispTree(copyStmt);
                 }
 #endif
+
+                // We don't expect GT_GT with signed compares, and we
+                // can't predict the result if we do see it, since the
+                // boxed object addr could have its high bit set.
+                if ((oper == GT_GT) && !tree->IsUnsigned())
+                {
+                    JITDUMP(" bailing; unexpected signed compare via GT_GT\n");
+                    goto FAIL;
+                }
+
+                // If we don't recognize the form of the assign, bail.
+                GenTreePtr asg = asgStmt->gtStmt.gtStmtExpr;
+                if (asg->gtOper != GT_ASG)
+                {
+                    JITDUMP(" bailing; unexpected assignment op %s\n", GenTree::OpName(asg->gtOper));
+                    goto FAIL;
+                }
+
+                // If we don't recognize the form of the copy, bail.
+                GenTree* copy = copyStmt->gtStmt.gtStmtExpr;
+                if (copy->gtOper != GT_ASG)
+                {
+                    // GT_RET_EXPR is a tolerable temporary failure.
+                    // The jit will revisit this optimization after
+                    // inlining is done.
+                    if (copy->gtOper == GT_RET_EXPR)
+                    {
+                        JITDUMP(" bailing; must wait for replacement of copy %s\n", GenTree::OpName(copy->gtOper));
+                    }
+                    else
+                    {
+                        // Anything else is a missed case we should
+                        // figure out how to handle. One known case
+                        // is GT_COMMAs enclosing the GT_ASG we are
+                        // looking for.
+                        JITDUMP(" bailing; unexpected copy op %s\n", GenTree::OpName(copy->gtOper));
+                    }
+                    goto FAIL;
+                }
+
+                // If the copy is a struct copy, make sure we know how to isolate
+                // any source side effects.
+                GenTreePtr copySrc = copy->gtOp.gtOp2;
+
+                // If the copy source is from a pending inline, wait for it to resolve.
+                if (copySrc->gtOper == GT_RET_EXPR)
+                {
+                    JITDUMP(" bailing; must wait for replacement of copy source %s\n",
+                            GenTree::OpName(copySrc->gtOper));
+                    goto FAIL;
+                }
+
+                bool hasSrcSideEffect = false;
+                bool isStructCopy = false;
+
+                if (gtTreeHasSideEffects(copySrc, GTF_SIDE_EFFECT))
+                {
+                    hasSrcSideEffect = true;
+
+                    if (copySrc->gtType == TYP_STRUCT)
+                    {
+                        isStructCopy = true;
+
+                        if ((copySrc->gtOper != GT_OBJ) && (copySrc->gtOper != GT_IND) && (copySrc->gtOper != GT_FIELD))
+                        {
+                            // We don't know how to handle other cases, yet.
+                            JITDUMP(" bailing; unexpected copy source struct op with side effect %s\n",
+                                    GenTree::OpName(copySrc->gtOper));
+                            goto FAIL;
+                        }
+                    }
+                }
+
+                // Proceed with the optimization
+                //
+                // Change the assignment expression to a NOP.
+                JITDUMP("\nBashing NEWOBJ [%06u] to NOP\n", dspTreeID(asg));
                 asg->gtBashToNOP();
 
-                op = gtNewIconNode(oper == GT_NE);
+                // Change the copy expression so it preserves key
+                // source side effects.
+                JITDUMP("\nBashing COPY [%06u]", dspTreeID(copy));
+
+                if (!hasSrcSideEffect)
+                {
+                    // If there were no copy source side effects just bash
+                    // the copy to a NOP.
+                    copy->gtBashToNOP();
+                    JITDUMP(" to NOP\n");
+                }
+                else if (!isStructCopy)
+                {
+                    // For scalar types, go ahead and produce the
+                    // value as the copy is fairly cheap and likely
+                    // the optimizer can trim things down to just the
+                    // minimal side effect parts.
+                    copyStmt->gtStmt.gtStmtExpr = copySrc;
+                    JITDUMP(" to scalar read via [%06u]\n", dspTreeID(copySrc));
+                }
+                else
+                {
+                    // For struct types read the first byte of the
+                    // source struct; there's no need to read the
+                    // entire thing, and no place to put it.
+                    assert(copySrc->gtOper == GT_OBJ || copySrc->gtOper == GT_IND || copySrc->gtOper == GT_FIELD);
+                    copySrc->ChangeOper(GT_IND);
+                    copySrc->gtType = TYP_BYTE;
+                    copyStmt->gtStmt.gtStmtExpr = copySrc;
+                    JITDUMP(" to read first byte of struct via modified [%06u]\n", dspTreeID(copySrc));
+                }
+
+                // Set up the result of the compare.
+                int compareResult = 0;
+                if (oper == GT_GT)
+                {
+                    // GT_GT(null, box) == false
+                    // GT_GT(box, null) == true
+                    compareResult = (op1 == op);
+                }
+                else if (oper == GT_EQ)
+                {
+                    // GT_EQ(box, null) == false
+                    // GT_EQ(null, box) == false
+                    compareResult = 0;
+                }
+                else
+                {
+                    assert(oper == GT_NE);
+                    // GT_NE(box, null) == true
+                    // GT_NE(null, box) == true
+                    compareResult = 1;
+                }
+                op = gtNewIconNode(compareResult);
+
                 if (fgGlobalMorph)
                 {
                     if (!fgIsInlining())
@@ -12070,9 +12207,15 @@ GenTreePtr Compiler::gtFoldExprSpecial(GenTreePtr tree)
                     op->gtNext = tree->gtNext;
                     op->gtPrev = tree->gtPrev;
                 }
-                fgSetStmtSeq(asgStmt);
+
+                if (fgStmtListThreaded)
+                {
+                    fgSetStmtSeq(asgStmt);
+                    fgSetStmtSeq(copyStmt);
+                }
                 return op;
             }
+
             break;
 
         case GT_ADD:
@@ -12240,7 +12383,9 @@ GenTreePtr Compiler::gtFoldExprSpecial(GenTreePtr tree)
             break;
     }
 
-    /* The node is not foldable */
+    /* The node is not foldable */
+
+FAIL:
 
     return tree;
 
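
The compare-folding rules from the `gtFoldExprSpecial` hunk above can be restated as a small standalone function. This is only a sketch: `Oper` and `foldBoxNullCompare` are hypothetical names, the `boxIsFirstOperand` flag stands in for the `op1 == op` test in the diff, and the real code folds a `GenTree` node rather than returning a value through a pointer.

```cpp
#include <cassert>

enum class Oper
{
    EQ,
    NE,
    GT
};

// Computes the constant result of comparing a boxed value type against
// null. Returns false (leaving *result untouched) when the compare
// cannot be folded.
bool foldBoxNullCompare(Oper oper, bool boxIsFirstOperand, bool isUnsigned, int* result)
{
    switch (oper)
    {
        case Oper::EQ:
            *result = 0; // a box is never equal to null
            return true;

        case Oper::NE:
            *result = 1; // a box always differs from null
            return true;

        case Oper::GT:
            if (!isUnsigned)
            {
                // A signed compare is unpredictable: the object
                // address could have its high bit set.
                return false;
            }
            // Unsigned: box > null is true, null > box is false.
            *result = boxIsFirstOperand ? 1 : 0;
            return true;
    }
    return false;
}
```

Operand order only matters for the unsigned greater-than case, which is why the equality cases ignore `boxIsFirstOperand`.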
src/jit/gentree.h

Lines changed: 10 additions & 3 deletions
@@ -2965,9 +2965,16 @@ struct GenTreeBox : public GenTreeUnOp
     // This is the statement that contains the assignment tree when the node is an inlined GT_BOX on a value
     // type
     GenTreePtr gtAsgStmtWhenInlinedBoxValue;
-
-    GenTreeBox(var_types type, GenTreePtr boxOp, GenTreePtr asgStmtWhenInlinedBoxValue)
-        : GenTreeUnOp(GT_BOX, type, boxOp), gtAsgStmtWhenInlinedBoxValue(asgStmtWhenInlinedBoxValue)
+    // And this is the statement that copies from the value being boxed to the box payload
+    GenTreePtr gtCopyStmtWhenInlinedBoxValue;
+
+    GenTreeBox(var_types  type,
+               GenTreePtr boxOp,
+               GenTreePtr asgStmtWhenInlinedBoxValue,
+               GenTreePtr copyStmtWhenInlinedBoxValue)
+        : GenTreeUnOp(GT_BOX, type, boxOp)
+        , gtAsgStmtWhenInlinedBoxValue(asgStmtWhenInlinedBoxValue)
+        , gtCopyStmtWhenInlinedBoxValue(copyStmtWhenInlinedBoxValue)
     {
     }
 #if DEBUGGABLE_GENTREE

src/jit/importer.cpp

Lines changed: 10 additions & 5 deletions
@@ -5260,7 +5260,7 @@ void Compiler::impImportAndPushBox(CORINFO_RESOLVED_TOKEN* pResolvedToken)
                                    gtNewArgList(op2));
         }
 
-        /* Remember that this basic block contains 'new' of an array */
+        /* Remember that this basic block contains 'new' of an object */
         compCurBB->bbFlags |= BBF_HAS_NEWOBJ;
 
         GenTreePtr asg = gtNewTempAssign(impBoxTemp, op1);
@@ -5302,11 +5302,16 @@ void Compiler::impImportAndPushBox(CORINFO_RESOLVED_TOKEN* pResolvedToken)
             op1 = gtNewAssignNode(gtNewOperNode(GT_IND, lclTyp, op1), exprToBox);
         }
 
-        op2 = gtNewLclvNode(impBoxTemp, TYP_REF);
-        op1 = gtNewOperNode(GT_COMMA, TYP_REF, op1, op2);
+        // Spill eval stack to flush out any pending side effects.
+        impSpillSideEffects(true, (unsigned)CHECK_SPILL_ALL DEBUGARG("impImportAndPushBox"));
 
-        // Record that this is a "box" node.
-        op1 = new (this, GT_BOX) GenTreeBox(TYP_REF, op1, asgStmt);
+        // Set up this copy as a second assignment.
+        GenTreePtr copyStmt = impAppendTree(op1, (unsigned)CHECK_SPILL_NONE, impCurStmtOffs);
+
+        op1 = gtNewLclvNode(impBoxTemp, TYP_REF);
+
+        // Record that this is a "box" node and keep track of the matching parts.
+        op1 = new (this, GT_BOX) GenTreeBox(TYP_REF, op1, asgStmt, copyStmt);
 
         // If it is a value class, mark the "box" node. We can use this information
         // to optimise several cases:
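
Why the evaluation stack is spilled before the copy statement is appended can be illustrated with a toy importer. This is a loose model under stated assumptions, not the jit's data structures: `ToyImporter`, `spillAll`, and `importBox` are hypothetical names, expressions are plain strings, and every pending stack entry is treated as if it had side effects.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct ToyImporter
{
    std::vector<std::string> evalStack;  // pending expressions, not yet evaluated
    std::vector<std::string> statements; // evaluated strictly in this order

    // Move every pending stack entry into its own temp-assignment
    // statement, so its side effects are ordered before anything
    // appended afterwards.
    void spillAll()
    {
        for (std::size_t i = 0; i < evalStack.size(); ++i)
        {
            std::string temp = "tmp" + std::to_string(i);
            statements.push_back(temp + " = " + evalStack[i]);
            evalStack[i] = temp;
        }
    }

    // Import a box of `value`: allocation statement, spill, copy
    // statement, then push a trivial reference to the box temp.
    void importBox(const std::string& value)
    {
        statements.push_back("boxTemp = alloc");
        spillAll();
        statements.push_back("*payload(boxTemp) = " + value);
        evalStack.push_back("box(boxTemp)");
    }
};
```

If a call such as `f()` is already pending on the stack when `a[i]` is boxed, spilling places `tmp0 = f()` ahead of the copy, so the call's side effects still happen before any exception from `a[i]`. Without the spill, the copy statement would run first and the two side effects would swap order.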
Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
+// Licensed to the .NET Foundation under one or more agreements.
+// The .NET Foundation licenses this file to you under the MIT license.
+// See the LICENSE file in the project root for more information.
+
+using System;
+
+public struct S<K>
+{
+    public int x;
+    public int y;
+    public K val;
+}
+
+public class X<K,V>
+{
+    public X(K k)
+    {
+        a = new S<K>[2];
+        a[1].val = k;
+        a[1].x = 3;
+        a[1].y = 4;
+    }
+
+    public void Assert(bool b)
+    {
+        if (!b) throw new Exception("bad!");
+    }
+
+    public int Test()
+    {
+        int r = 0;
+        for (int i = 0; i < a.Length; i++)
+        {
+            Assert(a[i].val != null);
+            r += a[i].val.GetHashCode();
+        }
+        return r;
+    }
+
+    S<K>[] a;
+}
+
+class B
+{
+    public static int Main()
+    {
+        var a = new X<int, string>(11);
+        int z = a.Test();
+        return (z == 11 ? 100 : 0);
+    }
+}
Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
+<?xml version="1.0" encoding="utf-8"?>
+<Project ToolsVersion="12.0" DefaultTargets="Build" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
+  <Import Project="$([MSBuild]::GetDirectoryNameOfFileAbove($(MSBuildThisFileDirectory), dir.props))\dir.props" />
+  <PropertyGroup>
+    <Configuration Condition=" '$(Configuration)' == '' ">Debug</Configuration>
+    <Platform Condition=" '$(Platform)' == '' ">AnyCPU</Platform>
+    <AssemblyName>$(MSBuildProjectName)</AssemblyName>
+    <SchemaVersion>2.0</SchemaVersion>
+    <ProjectGuid>{7B521917-193E-48BB-86C6-FE013F3DFF35}</ProjectGuid>
+    <OutputType>Exe</OutputType>
+    <AppDesignerFolder>Properties</AppDesignerFolder>
+    <FileAlignment>512</FileAlignment>
+    <ProjectTypeGuids>{786C830F-07A1-408B-BD7F-6EE04809D6DB};{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}</ProjectTypeGuids>
+    <ReferencePath>$(ProgramFiles)\Common Files\microsoft shared\VSTT\11.0\UITestExtensionPackages</ReferencePath>
+    <SolutionDir Condition="$(SolutionDir) == '' Or $(SolutionDir) == '*Undefined*'">..\..\</SolutionDir>
+
+    <NuGetPackageImportStamp>7a9bfb7d</NuGetPackageImportStamp>
+  </PropertyGroup>
+  <!-- Default configurations to help VS understand the configurations -->
+  <PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Debug|AnyCPU' ">
+  </PropertyGroup>
+  <PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Release|AnyCPU' ">
+  </PropertyGroup>
+  <ItemGroup>
+    <CodeAnalysisDependentAssemblyPaths Condition=" '$(VS100COMNTOOLS)' != '' " Include="$(VS100COMNTOOLS)..\IDE\PrivateAssemblies">
+      <Visible>False</Visible>
+    </CodeAnalysisDependentAssemblyPaths>
+  </ItemGroup>
+  <PropertyGroup>
+    <DebugType></DebugType>
+    <Optimize>True</Optimize>
+    <AllowUnsafeBlocks>True</AllowUnsafeBlocks>
+  </PropertyGroup>
+  <ItemGroup>
+    <Compile Include="$(MSBuildProjectName).cs" />
+  </ItemGroup>
+  <ItemGroup>
+    <Service Include="{82A7F48D-3B50-4B1E-B82E-3ADA8210C358}" />
+  </ItemGroup>
+  <Import Project="$([MSBuild]::GetDirectoryNameOfFileAbove($(MSBuildThisFileDirectory), dir.targets))\dir.targets" />
+  <PropertyGroup Condition=" '$(MsBuildProjectDirOverride)' != '' ">
+  </PropertyGroup>
+</Project>
