@tbkka tbkka commented Mar 10, 2020

This seems to be the single most common bridge cast,
and repeated lookup seems to be a performance issue
for the common operation of bridging a string-valued
NSDictionary into Swift. Spending a couple of static
words of memory to memoize it could be a nice perf win.

Resolves rdar://55237013
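
A minimal C++ sketch of the memoization idea described above, not the actual runtime change: the names `Conformance`, `slowLookup`, and `stringBridgeConformance` are hypothetical, and the real lookup in the Swift runtime is more involved. It shows a single static, atomically updated pointer caching the result of a slow lookup for the one hot case.

```cpp
#include <atomic>
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a conformance record / witness table.
struct Conformance {
    std::string typeName;
};

// Counts how often the slow path runs (for demonstration only).
static std::atomic<int> slowLookupCount{0};

// Hypothetical slow lookup: stands in for a search through a conformance table.
const Conformance *slowLookup(const std::string &typeName) {
    static const std::unordered_map<std::string, Conformance> table = {
        {"String", Conformance{"String"}},
        {"Array", Conformance{"Array"}},
    };
    slowLookupCount.fetch_add(1, std::memory_order_relaxed);
    auto it = table.find(typeName);
    return it == table.end() ? nullptr : &it->second;
}

// Memoized lookup for the single hot case: one static pointer of storage.
const Conformance *stringBridgeConformance() {
    static std::atomic<const Conformance *> cached{nullptr};
    const Conformance *c = cached.load(std::memory_order_acquire);
    if (c == nullptr) {
        // First call (or a racing first call): do the slow lookup once.
        c = slowLookup("String");
        assert(c != nullptr && "String must be present in the table");
        cached.store(c, std::memory_order_release);
    }
    return c;
}
```

Two threads racing on the first call may both run the slow lookup, but they store the same pointer, so the cached value stays consistent without a lock.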

@tbkka tbkka requested a review from Catfish-Man March 10, 2020 16:11

tbkka commented Mar 10, 2020

@swift-ci Please benchmark

@swift-ci

Performance: -O

| Regression | OLD | NEW | DELTA | RATIO |
| --- | --- | --- | --- | --- |
| FlattenListFlatMap | 3695 | 5598 | +51.5% | 0.66x (?) |
| FlattenListLoop | 2773 | 3183 | +14.8% | 0.87x (?) |
| PrefixArrayLazy | 13 | 14 | +7.7% | 0.93x (?) |

| Improvement | OLD | NEW | DELTA | RATIO |
| --- | --- | --- | --- | --- |
| ObjectiveCBridgeToNSArray | 9400 | 7200 | -23.4% | 1.31x |
| ObjectiveCBridgeToNSSet | 11450 | 9000 | -21.4% | 1.27x |
| ObjectiveCBridgeFromNSStringForced | 1780 | 1535 | -13.8% | 1.16x (?) |
| ObjectiveCBridgeStubFromArrayOfNSString2 | 2120 | 1860 | -12.3% | 1.14x (?) |
| ObjectiveCBridgeFromNSArrayAnyObjectToStringForced | 29400 | 26600 | -9.5% | 1.11x (?) |
| ObjectiveCBridgeFromNSArrayAnyObjectForced | 3300 | 3000 | -9.1% | 1.10x (?) |
| ObjectiveCBridgeFromNSSetAnyObjectToString | 46000 | 42000 | -8.7% | 1.10x (?) |
| DictionaryBridgeToObjC_Access | 558 | 510 | -8.6% | 1.09x (?) |
| ObjectiveCBridgeFromNSArrayAnyObjectToString | 28900 | 26600 | -8.0% | 1.09x (?) |
| ObjectiveCBridgeStubToArrayOfNSString2 | 2800 | 2580 | -7.9% | 1.09x (?) |
| ObjectiveCBridgeStubFromNSDateRef | 2540 | 2370 | -6.7% | 1.07x (?) |

Code size: -O

Performance: -Osize

| Regression | OLD | NEW | DELTA | RATIO |
| --- | --- | --- | --- | --- |
| StringInterpolationManySmallSegments | 8800 | 10700 | +21.6% | 0.82x (?) |

| Improvement | OLD | NEW | DELTA | RATIO |
| --- | --- | --- | --- | --- |
| ObjectiveCBridgeToNSArray | 9250 | 7200 | -22.2% | 1.28x |
| ObjectiveCBridgeToNSSet | 11450 | 9050 | -21.0% | 1.27x |
| ObjectiveCBridgeFromNSStringForced | 1765 | 1540 | -12.7% | 1.15x (?) |
| ObjectiveCBridgeStubFromArrayOfNSString2 | 2100 | 1870 | -11.0% | 1.12x (?) |
| DictionaryBridgeToObjC_Access | 558 | 511 | -8.4% | 1.09x (?) |

Code size: -Osize

Performance: -Onone

| Improvement | OLD | NEW | DELTA | RATIO |
| --- | --- | --- | --- | --- |
| ObjectiveCBridgeToNSArray | 9800 | 7600 | -22.4% | 1.29x |
| ObjectiveCBridgeToNSSet | 11850 | 9500 | -19.8% | 1.25x |
| ObjectiveCBridgeStubFromArrayOfNSString2 | 2080 | 1860 | -10.6% | 1.12x (?) |
| ObjectiveCBridgeStubToArrayOfNSString2 | 2840 | 2560 | -9.9% | 1.11x (?) |
| ObjectiveCBridgeFromNSString | 2165 | 1955 | -9.7% | 1.11x (?) |
| ObjectiveCBridgeFromNSStringForced | 1840 | 1665 | -9.5% | 1.11x (?) |
| ObjectiveCBridgeFromNSArrayAnyObjectToStringForced | 30200 | 27600 | -8.6% | 1.09x (?) |
| ObjectiveCBridgeFromNSArrayAnyObjectToString | 30000 | 27500 | -8.3% | 1.09x (?) |
| DictionaryBridgeToObjC_Access | 650 | 601 | -7.5% | 1.08x (?) |
| TypeFlood | 148 | 137 | -7.4% | 1.08x (?) |

Code size: -swiftlibs

How to read the data

The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).
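
The DELTA and RATIO columns appear to follow from OLD and NEW as (NEW - OLD) / OLD and OLD / NEW. This is inferred from the numbers in the tables above, not from the benchmark tool's source. A small C++ sketch under that assumption (the `Comparison` type and both function names are made up for illustration, and the threshold check follows only the wording of the note above):

```cpp
#include <cassert>
#include <cmath>

// Hypothetical result type: relative change in percent and speedup ratio.
struct Comparison {
    double deltaPercent;  // (NEW - OLD) / OLD * 100; negative means faster
    double ratio;         // OLD / NEW; above 1.0 means faster
};

// Compare two benchmark timings (assumed to be in the same unit).
Comparison compare(double oldTime, double newTime) {
    assert(oldTime > 0 && newTime > 0);
    return {(newTime - oldTime) / oldTime * 100.0, oldTime / newTime};
}

// Assumed reporting rule: a change is listed when its magnitude exceeds
// the threshold (8% for performance, per the note above).
bool exceedsThreshold(const Comparison &c, double thresholdPercent = 8.0) {
    return std::fabs(c.deltaPercent) > thresholdPercent;
}
```

For example, `compare(9400, 7200)` gives a delta of about -23.4% and a ratio of about 1.31, matching the ObjectiveCBridgeToNSArray row in the first -O table.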

Hardware Overview
  Model Name: Mac mini
  Model Identifier: Macmini8,1
  Processor Name: Intel Core i7
  Processor Speed: 3.2 GHz
  Number of Processors: 1
  Total Number of Cores: 6
  L2 Cache (per Core): 256 KB
  L3 Cache: 12 MB
  Memory: 64 GB


tbkka commented Mar 10, 2020

@Catfish-Man -- Any thoughts about FlattenListFlatMap and FlattenListLoop here? I don't offhand see any reason those benchmarks would touch the Obj-C bridging path, which means they should be unaffected by this change. Am I missing something? Apart from that, the benchmarks here look nice.

@Catfish-Man

FlattenListFlatMap has been odder than one might expect; I've been looking into it for the array changes I've been working on, and it doesn't respond to changes in a way I can make sense of.

I wouldn't worry about it too much but taking a quick sample/instruments profile locally might be worthwhile just in case.


tbkka commented Mar 10, 2020

@swift-ci Please benchmark

@swift-ci

Performance: -O

| Improvement | OLD | NEW | DELTA | RATIO |
| --- | --- | --- | --- | --- |
| FlattenListFlatMap | 6249 | 4124 | -34.0% | 1.52x (?) |
| ObjectiveCBridgeToNSArray | 9500 | 7250 | -23.7% | 1.31x (?) |
| ObjectiveCBridgeToNSSet | 11350 | 9100 | -19.8% | 1.25x (?) |
| FlattenListLoop | 3365 | 2829 | -15.9% | 1.19x (?) |
| UTF8Decode_InitDecoding | 167 | 141 | -15.6% | 1.18x (?) |
| ObjectiveCBridgeStubFromArrayOfNSString2 | 2130 | 1820 | -14.6% | 1.17x (?) |
| ObjectiveCBridgeFromNSStringForced | 1800 | 1580 | -12.2% | 1.14x (?) |
| NormalizedIterator_fastPrenormal | 660 | 580 | -12.1% | 1.14x (?) |
| Chars2 | 3500 | 3100 | -11.4% | 1.13x (?) |
| ObjectiveCBridgeFromNSArrayAnyObjectToStringForced | 30600 | 27200 | -11.1% | 1.12x (?) |
| Calculator | 155 | 138 | -11.0% | 1.12x (?) |
| ObjectiveCBridgeFromNSArrayAnyObjectToString | 29800 | 26800 | -10.1% | 1.11x (?) |
| ObjectiveCBridgeStubToArrayOfNSString2 | 2820 | 2540 | -9.9% | 1.11x (?) |
| IterateData | 878 | 794 | -9.6% | 1.11x (?) |
| StringHashing_fastPrenormal | 590 | 540 | -8.5% | 1.09x (?) |
| ObjectiveCBridgeFromNSArrayAnyObjectForced | 3320 | 3080 | -7.2% | 1.08x (?) |
| OpenClose | 61 | 57 | -6.6% | 1.07x (?) |

Code size: -O

Performance: -Osize

| Regression | OLD | NEW | DELTA | RATIO |
| --- | --- | --- | --- | --- |
| FlattenListLoop | 2653 | 2924 | +10.2% | 0.91x (?) |

| Improvement | OLD | NEW | DELTA | RATIO |
| --- | --- | --- | --- | --- |
| ObjectiveCBridgeToNSArray | 9300 | 7000 | -24.7% | 1.33x |
| ObjectiveCBridgeToNSSet | 11550 | 9050 | -21.6% | 1.28x (?) |
| ObjectiveCBridgeStubFromArrayOfNSString2 | 2130 | 1800 | -15.5% | 1.18x (?) |
| UTF8Decode_InitDecoding | 164 | 139 | -15.2% | 1.18x (?) |
| Calculator | 154 | 138 | -10.4% | 1.12x (?) |
| StringHashing_fastPrenormal | 590 | 530 | -10.2% | 1.11x (?) |
| ObjectiveCBridgeFromNSArrayAnyObjectToStringForced | 29800 | 26800 | -10.1% | 1.11x (?) |
| Chars2 | 3400 | 3100 | -8.8% | 1.10x (?) |

Code size: -Osize

Performance: -Onone

| Regression | OLD | NEW | DELTA | RATIO |
| --- | --- | --- | --- | --- |
| ObjectiveCBridgeStubFromNSDate | 3020 | 3270 | +8.3% | 0.92x (?) |

| Improvement | OLD | NEW | DELTA | RATIO |
| --- | --- | --- | --- | --- |
| ObjectiveCBridgeToNSArray | 9800 | 7350 | -25.0% | 1.33x |
| ObjectiveCBridgeToNSSet | 11800 | 9450 | -19.9% | 1.25x |
| UTF8Decode_InitDecoding | 185 | 156 | -15.7% | 1.19x |
| ObjectiveCBridgeStubFromArrayOfNSString2 | 2110 | 1850 | -12.3% | 1.14x (?) |
| ObjectiveCBridgeFromNSStringForced | 1835 | 1615 | -12.0% | 1.14x (?) |
| ObjectiveCBridgeFromNSString | 2135 | 1895 | -11.2% | 1.13x (?) |
| ObjectiveCBridgeFromNSArrayAnyObjectToStringForced | 30600 | 27400 | -10.5% | 1.12x (?) |
| ObjectiveCBridgeFromNSArrayAnyObjectToString | 30200 | 27200 | -9.9% | 1.11x (?) |
| ObjectiveCBridgeStubToArrayOfNSString2 | 2820 | 2580 | -8.5% | 1.09x (?) |

Code size: -swiftlibs

Depending on benchmark results, this may get backed out.

tbkka commented Mar 10, 2020

@swift-ci Please benchmark

@swift-ci

Performance: -O

| Improvement | OLD | NEW | DELTA | RATIO |
| --- | --- | --- | --- | --- |
| ObjectiveCBridgeToNSArray | 9350 | 7100 | -24.1% | 1.32x |
| ObjectiveCBridgeToNSSet | 11350 | 8750 | -22.9% | 1.30x |
| UTF8Decode_InitDecoding | 164 | 138 | -15.9% | 1.19x |
| ObjectiveCBridgeStubFromArrayOfNSString2 | 2130 | 1840 | -13.6% | 1.16x (?) |
| ObjectiveCBridgeFromNSStringForced | 1770 | 1540 | -13.0% | 1.15x (?) |
| NormalizedIterator_fastPrenormal | 640 | 570 | -10.9% | 1.12x |
| ObjectiveCBridgeFromNSArrayAnyObjectToStringForced | 29400 | 26600 | -9.5% | 1.11x (?) |
| ObjectiveCBridgeStubToArrayOfNSString2 | 2800 | 2540 | -9.3% | 1.10x (?) |
| ObjectiveCBridgeFromNSArrayAnyObjectToString | 29100 | 26400 | -9.3% | 1.10x (?) |
| Chars2 | 3400 | 3100 | -8.8% | 1.10x (?) |
| Calculator | 152 | 139 | -8.6% | 1.09x (?) |
| DictionaryBridgeToObjC_Access | 552 | 506 | -8.3% | 1.09x (?) |
| OpenClose | 61 | 56 | -8.2% | 1.09x (?) |
| ObjectiveCBridgeFromNSArrayAnyObjectForced | 3240 | 3020 | -6.8% | 1.07x (?) |

Code size: -O

Performance: -Osize

| Improvement | OLD | NEW | DELTA | RATIO |
| --- | --- | --- | --- | --- |
| ObjectiveCBridgeToNSArray | 9300 | 6950 | -25.3% | 1.34x |
| ObjectiveCBridgeToNSSet | 11200 | 8800 | -21.4% | 1.27x |
| UTF8Decode_InitDecoding | 164 | 136 | -17.1% | 1.21x |
| ObjectiveCBridgeStubFromArrayOfNSString2 | 2090 | 1800 | -13.9% | 1.16x (?) |
| ObjectiveCBridgeFromNSStringForced | 1770 | 1545 | -12.7% | 1.15x (?) |
| Calculator | 154 | 136 | -11.7% | 1.13x |
| ObjectiveCBridgeFromNSArrayAnyObjectToStringForced | 29600 | 26600 | -10.1% | 1.11x (?) |
| ObjectiveCBridgeFromNSArrayAnyObjectToString | 29100 | 26200 | -10.0% | 1.11x (?) |
| DictionaryBridgeToObjC_Access | 577 | 523 | -9.4% | 1.10x (?) |
| Chars2 | 3400 | 3100 | -8.8% | 1.10x (?) |
| ObjectiveCBridgeStubToArrayOfNSString2 | 2760 | 2520 | -8.7% | 1.10x (?) |
| ObjectiveCBridgeFromNSDictionaryAnyObjectForced | 4300 | 3950 | -8.1% | 1.09x (?) |
| NormalizedIterator_fastPrenormal | 670 | 620 | -7.5% | 1.08x (?) |
| ObjectiveCBridgeFromNSArrayAnyObjectForced | 3260 | 3020 | -7.4% | 1.08x (?) |
| ObjectiveCBridgeFromNSSetAnyObjectToString | 43500 | 40500 | -6.9% | 1.07x (?) |

Code size: -Osize

Performance: -Onone

| Regression | OLD | NEW | DELTA | RATIO |
| --- | --- | --- | --- | --- |
| ObjectiveCBridgeStubNSDateRefAccess | 3323 | 3601 | +8.4% | 0.92x (?) |

| Improvement | OLD | NEW | DELTA | RATIO |
| --- | --- | --- | --- | --- |
| ObjectiveCBridgeToNSArray | 9600 | 7250 | -24.5% | 1.32x |
| ObjectiveCBridgeToNSSet | 11750 | 9400 | -20.0% | 1.25x |
| UTF8Decode_InitDecoding | 183 | 157 | -14.2% | 1.17x |
| ObjectiveCBridgeStubFromArrayOfNSString2 | 2130 | 1850 | -13.1% | 1.15x (?) |
| ObjectiveCBridgeFromNSArrayAnyObjectForced | 5020 | 4400 | -12.4% | 1.14x (?) |
| ObjectiveCBridgeFromNSArrayAnyObjectToString | 30300 | 26900 | -11.2% | 1.13x (?) |
| DictionaryBridgeToObjC_Access | 645 | 599 | -7.1% | 1.08x (?) |
| StringToDataEmpty | 750 | 700 | -6.7% | 1.07x (?) |

Code size: -swiftlibs


tbkka commented Mar 10, 2020

@Catfish-Man -- Thoughts? Given the lack of interesting difference between these benchmarks, I'm inclined to back out the last change (caching Dictionary and Array conformances) and just keep the String conformance cache.


tbkka commented Mar 10, 2020

@swift-ci Please test

@swift-ci

Build failed
Swift Test OS X Platform
Git Sha - b13fb96

@swift-ci

Build failed
Swift Test Linux Platform
Git Sha - b13fb96

@tbkka tbkka merged commit 46bba7a into swiftlang:master Mar 11, 2020
@tbkka tbkka deleted the tbkka-casting-stringBridgePerf branch October 16, 2020 00:33