String.StartsWith performance - OrdinalCompareSubstring #2825
For the curious, here is how OrdinalCompareSubstring performs in place of EqualsHelper.
Two 15-character strings that differ at index n:
Two equal length character strings:
[System.Security.SecuritySafeCritical] // auto-generated
[ReliabilityContract(Consistency.WillNotCorruptState, Cer.MayFail)]
private unsafe static bool OrdinalCompareSubstring(String strA, String strB, int length)
I think a better name for this method would be StartsWithOrdinalHelper.
Sure thing. I based the name off the CoreRT version of EqualsHelper, OrdinalCompareEqualLengthStrings.
I think it is fine for this to be a different method. BTW: We will apply profile-guided optimizations (aka
LGTM otherwise. Thank you for doing this!
The CI failures are known issues.
if (*(long*)a != *(long*)b) goto ReturnFalse;
if (*(long*)(a + 4) != *(long*)(b + 4)) goto ReturnFalse;
if (*(long*)(a + 8) != *(long*)(b + 8)) goto ReturnFalse;
a += 12; b += 12; length -= 12;
Move length -= 12; to be the first statement of the while loop, so the subtraction is more likely to have completed in the CPU pipeline by the time the next iteration's while test runs.
I'm a bit torn on that one. It is faster for longer equal strings but it is a tax on early exits (differs within the first 12 characters) which should be more common.
Long = 252 char equal strings
Short = 252 char strings that differ at index 1
Method | AvrTime | StdDev | op/s |
---|---|---|---|
LongStartsWithIncAfter | 33.7967 ns | 1.9410 ns | 29,678,807.37 |
LongStartsWithIncBefore | 33.2239 ns | 1.8214 ns | 30,183,392.10 |
ShortStartsWithIncAfter | 7.0180 ns | 0.5621 ns | 142,983,128.85 |
ShortStartsWithIncBefore | 7.2873 ns | 0.5598 ns | 137,668,403.59 |
Probably leave it then; % diff is bigger on early exits than the gain on equals
You get almost the same gain without the penalty by just changing
a += 12; b += 12; length -= 12;
to
length -= 12; a += 12; b += 12;
Pretty small numbers but consistent across multiple runs. Can't hurt.
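For readers following along, here is a rough sketch of the loop shape being discussed, with the length update placed first in the final statement. It is not the code from this PR: it uses safe spans instead of raw pointers, and the names BlockCompareSketch, Read4 and EqualPrefix are made up for illustration. The actual win from the reordering is a micro-architectural effect that only the benchmarks above can show.

```csharp
using System;
using System.Runtime.InteropServices;

static class BlockCompareSketch
{
    // Reads four chars (8 bytes) starting at index i as one 64-bit value,
    // standing in for the *(long*)(a + i) reads in the unsafe original.
    static long Read4(ReadOnlySpan<char> s, int i) =>
        MemoryMarshal.Read<long>(MemoryMarshal.AsBytes(s.Slice(i, 4)));

    // Hypothetical helper: true when the first `length` chars of a and b match.
    // Assumes both spans hold at least `length` chars.
    public static bool EqualPrefix(ReadOnlySpan<char> a, ReadOnlySpan<char> b, int length)
    {
        int i = 0;
        while (length >= 12)
        {
            // Early exits come first, so a mismatch pays no bookkeeping cost.
            if (Read4(a, i) != Read4(b, i)) return false;
            if (Read4(a, i + 4) != Read4(b, i + 4)) return false;
            if (Read4(a, i + 8) != Read4(b, i + 8)) return false;
            // The counter tested by the loop condition is updated first.
            length -= 12; i += 12;
        }
        // Remaining 0..11 chars, compared one at a time in this sketch.
        while (length > 0)
        {
            if (a[i] != b[i]) return false;
            length -= 1; i += 1;
        }
        return true;
    }
}
```

Calling BlockCompareSketch.EqualPrefix("abcdefghijklmnop".AsSpan(), "abcdefghijklmnoX".AsSpan(), 15) compares one 12-char block plus a 3-char tail and never looks at the differing 16th character.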
Updated with name change. AMD64 => WIN64. length -= moved to the front in EqualsHelper, StartsWithOrdinalHelper, and CompareOrdinalHelper. ALIGN_ACCESS implemented.
ALIGN_ACCESS conditional removed. Length checks removed in first int alignment read.
👍
Cheers for everyone's input on this. Thanks!
Alternative to #2667
Improves the performance of an ordinal based StartsWith comparison. The existing code has the overhead of figuring out the exact index that differs along with redundant argument validation.
OrdinalCompareSubstring is a variation of EqualsHelper which is aware when it has reached the end of the string and can control the result of reading the null terminator (or next character after the provided length). The bitwise OR provided the best performance while avoiding having a branch or a result dependency.
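As a rough illustration of the tail handling described above, here is a minimal sketch in safe C#. It is an approximation rather than the method in this PR: the names StartsWithSketch, CharOrTerminator and StartsWithOrdinal are hypothetical, a helper models the null terminator that the unsafe code reads directly, and each two-char step stands in for one 32-bit read.

```csharp
using System;

static class StartsWithSketch
{
    // Models the raw read in the unsafe original: index == s.Length yields the
    // null terminator that .NET strings carry after their last character.
    // (The real pointer read needs no conditional; this one exists only for safety.)
    static char CharOrTerminator(string s, int i) => i < s.Length ? s[i] : '\0';

    // Hypothetical helper; assumes the caller has already checked
    // that prefix.Length <= str.Length.
    public static bool StartsWithOrdinal(string str, string prefix)
    {
        int length = prefix.Length;
        int i = 0;

        // Compare two chars per step, standing in for one 32-bit read.
        while (length >= 2)
        {
            if (str[i] != prefix[i] || str[i + 1] != prefix[i + 1]) return false;
            length -= 2; i += 2;
        }

        // length is now 0 or 1. The non-short-circuit | evaluates both operands,
        // so the result does not branch on length. When length == 0 the char
        // comparison touches the position after the prefix and is ignored.
        return (length == 0) | (CharOrTerminator(str, i) == CharOrTerminator(prefix, i));
    }
}
```

With an even-length prefix the final comparison reads past the prefix, and `length == 0` forces the result to true regardless of what was read; with an odd-length prefix it compares the last real character.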
The new function does not replace EqualsHelper because its performance profile differs enough that I was not comfortable applying it to String.Equals. One difference is the gotos suggested by @jkotas: they perform better if the branch is not taken but worse than an inline return if it is taken. However, with the StartsWith call all string lengths are faster, so it seems like a good trade-off to leave them in.
Results for when strings match:
(Note: These times include the full StartsWith code path)
cc @jkotas @benaadams @justinvp