Optimize string.EndsWith(char) for const values #69038
static void DoIt4(string s)
{
    if (s.EndsWith('x')) { Console.WriteLine(s); }
}
;
; BEFORE
;
; Method ConsoleApp7.Program:DoIt4(System.String)
G_M63046_IG01:
;; size=0 bbWeight=1 PerfScore 0.00
G_M63046_IG02:
mov eax, dword ptr [rcx+8]
lea edx, [rax-1]
cmp eax, edx
jbe SHORT G_M63046_IG05
;; size=10 bbWeight=1 PerfScore 3.75
G_M63046_IG03:
mov eax, edx
cmp word ptr [rcx+2*rax+12], 120
jne SHORT G_M63046_IG05
;; size=10 bbWeight=0.50 PerfScore 2.12
G_M63046_IG04:
tail.jmp [System.Console:WriteLine(System.String)]
;; size=6 bbWeight=0.50 PerfScore 1.00
G_M63046_IG05:
ret
;; size=1 bbWeight=0.50 PerfScore 0.50
; Total bytes of code: 27
;
; AFTER
;
; Method ConsoleApp7.Program:DoIt4(System.String)
G_M63046_IG01:
;; size=0 bbWeight=1 PerfScore 0.00
G_M63046_IG02:
mov eax, dword ptr [rcx+8]
cmp word ptr [rcx+2*rax+10], 120
jne SHORT G_M63046_IG04
;; size=11 bbWeight=1 PerfScore 6.00
G_M63046_IG03:
tail.jmp [System.Console:WriteLine(System.String)]
;; size=6 bbWeight=0.50 PerfScore 1.00
G_M63046_IG04:
ret
;; size=1 bbWeight=0.50 PerfScore 0.50
; Total bytes of code: 18
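The AFTER listing drops the length check and compares the 16-bit word at `[rcx+2*rax+10]` directly, where `rax` is the length loaded from `[rcx+8]`. The sketch below is one way to read why that can be safe for a non-zero constant. It is not the runtime's implementation: it models only the tail of a string object as a byte array (a 32-bit length, then UTF-16 chars, then a zero terminator, little-endian), and `StringLayoutSketch`, `Model`, and `EndsWithNoLengthCheck` are hypothetical names. With length 0, the computed offset falls inside the length field itself, whose upper 16 bits are zero.

```csharp
using System;
using System.Buffers.Binary;
using System.Text;

public static class StringLayoutSketch
{
    // Hypothetical model of a string object starting at its length field:
    // [int32 length][UTF-16 chars...][UTF-16 '\0' terminator].
    // In the listing these sit at object offsets +8 (length) and +12 (first char).
    public static byte[] Model(string s)
    {
        byte[] buffer = new byte[4 + 2 * s.Length + 2];
        BinaryPrimitives.WriteInt32LittleEndian(buffer, s.Length);
        Encoding.Unicode.GetBytes(s, 0, s.Length, buffer, 4);
        return buffer; // the last two bytes stay zero and act as the terminator
    }

    // Mirrors `cmp word ptr [rcx+2*rax+10], c` with no length check.
    // Relative to the length field, the last char lives at offset 2 * length + 2.
    // Assumption: c is not '\0'; a zero char would wrongly match an empty string here.
    public static bool EndsWithNoLengthCheck(byte[] model, char c)
    {
        int length = BinaryPrimitives.ReadInt32LittleEndian(model);
        int offset = 2 * length + 2; // length == 0 gives offset 2, inside the length field
        ushort lastWord = BinaryPrimitives.ReadUInt16LittleEndian(model.AsSpan(offset));
        return lastWord == c;
    }

    public static void Main()
    {
        Console.WriteLine(EndsWithNoLengthCheck(Model("box"), 'x')); // True
        Console.WriteLine(EndsWithNoLengthCheck(Model("bar"), 'x')); // False
        Console.WriteLine(EndsWithNoLengthCheck(Model(""), 'x'));    // False: reads the zero upper half of the length
    }
}
```

This only works because the length field sits immediately before the first char in this layout, so the same arithmetic would not carry over to a type laid out differently.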
|
If this makes it more efficient than the equivalent open-coded checks, are there places not using EndsWith that should be changed to use it? |
We did a pass a while back to identify & rewrite calls from EndsWith("single-char", Ordinal) to EndsWith(char) throughout the libs. I think we even have an analyzer for it? But you're right - I should do another pass as part of this PR to see if we missed any call sites. One difficulty is that we can't do the rewrite for libs which target netstandard, since EndsWith(char) doesn't exist in netfx. |
I actually meant code manually doing e.g. if (path.Length == 0 || path[^1] != '/')... rather than already using EndsWith. |
e.g. runtime/src/libraries/System.Diagnostics.Process/src/System/Diagnostics/Process.Windows.cs Line 650 in 5195418
runtime/src/libraries/System.Net.HttpListener/src/System/Net/Managed/HttpEndPointListener.cs Line 181 in 5195418
runtime/src/libraries/System.Net.Http/src/System/Net/Http/Headers/ContentDispositionHeaderValue.cs Lines 446 to 449 in 5195418
runtime/src/libraries/System.Net.Http/src/System/Net/Http/HttpContent.cs Lines 194 to 196 in 5195418
runtime/src/libraries/System.Private.CoreLib/src/System/Globalization/DateTimeFormat.cs Line 531 in 5195418
runtime/src/libraries/System.Private.CoreLib/src/System/Globalization/CultureData.cs Line 1002 in 5195418
etc. |
Something like runtime/src/libraries/System.Net.Http/src/System/Net/Http/HttpContent.cs Lines 194 to 196 in 5195418
if (charset.Length > 2 &&
    charset.StartsWith('\"') &&
    charset.EndsWith('\"'))
which would effectively become
charset.Length > 2 && charset._firstChar == '\"' && Unsafe.Add(ref charset._firstChar, (nint)(uint)charset.Length - 1) == '\"';
and we'd expect that to be faster? (Because the JIT can't infer that the length check protects the subsequent accesses?) |
Oh, I understand now. Thanks both for the clarification! :) |
You're correct that at the moment the length check doesn't protect "complex" accesses like |
Yup, I can see how Also it does clean up other cases more, e.g.
fileName.Length > 0 && fileName[0] == '\"' && fileName[fileName.Length - 1] == '\"';
becomes
fileName.StartsWith('\"') && fileName.EndsWith('\"'); |
Some of those call sites target ns2.0 so can't be updated. Additionally, some of the call sites target I could add a somewhat optimized EndsWith(T) on MemoryExtensions, but to get full optimization would require some JIT work (see #69080). We could also perhaps get the JIT to recognize the span[span.Length - 1] pattern, but it would require (presumably larger) JIT work. This isn't really my area so I'm going by my gut on what the relative costs of these two options are. It's probably worth doing the StartsWith(T) and EndsWith(T) extensions anyway for parity with other types. I'll file an issue. |
Latest iteration has a merge from main + scours the runtime project for locations where we can update call sites to take advantage of these A few notes for reviewers:
And if in doubt, remember the logic table!
string.StartsWith(char) => string.Length > 0 && string[0] == char
!string.StartsWith(char) => string.Length == 0 || string[0] != char
string.EndsWith(char) => string.Length > 0 && string[string.Length - 1] == char
!string.EndsWith(char) => string.Length == 0 || string[string.Length - 1] != char |
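For reviewers who want to convince themselves of the logic table, here is a minimal check that compares each row against the built-in methods on a few sample inputs. The class and method names (`LogicTableCheck`, `AllRowsHold`, and the `*OpenCoded` helpers) are hypothetical and exist only for this illustration; the sample strings are arbitrary.

```csharp
using System;

public static class LogicTableCheck
{
    // Open-coded forms taken from the right-hand side of the logic table.
    public static bool StartsWithOpenCoded(string s, char c) => s.Length > 0 && s[0] == c;
    public static bool NotStartsWithOpenCoded(string s, char c) => s.Length == 0 || s[0] != c;
    public static bool EndsWithOpenCoded(string s, char c) => s.Length > 0 && s[s.Length - 1] == c;
    public static bool NotEndsWithOpenCoded(string s, char c) => s.Length == 0 || s[s.Length - 1] != c;

    // True when all four rows of the table agree with the built-in methods.
    public static bool AllRowsHold(string s, char c) =>
        s.StartsWith(c) == StartsWithOpenCoded(s, c) &&
        !s.StartsWith(c) == NotStartsWithOpenCoded(s, c) &&
        s.EndsWith(c) == EndsWithOpenCoded(s, c) &&
        !s.EndsWith(c) == NotEndsWithOpenCoded(s, c);

    public static void Main()
    {
        string[] samples = { "", "/", "a/", "/a", "abc" };
        char[] chars = { '/', 'a' };

        foreach (string s in samples)
        {
            foreach (char c in chars)
            {
                if (!AllRowsHold(s, c))
                {
                    Console.WriteLine($"Mismatch for \"{s}\" and '{c}'");
                    return;
                }
            }
        }

        Console.WriteLine("All rows hold for the sampled inputs.");
    }
}
```

The empty string is the case worth watching when rewriting call sites: the negated forms must use `||` with `Length == 0`, not `&&`.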
src/libraries/System.Net.Http/src/System/Net/Http/Headers/NameValueHeaderValue.cs
Looks like
|
Drat, my desire to force my preferred coding conventions on the world is foiled again. :) |
Co-authored-by: Jeremy Barton
From looking at source.dot.net, all usages of string.EndsWith(char) throughout our libraries pass const values for the char parameter.
Just like the string.StartsWith(char) optimization that was done in #63734, we can take advantage of the layout of string and reduce the number of branches and total codegen size for string.EndsWith(char).
I've also put some comments in both StartsWith and EndsWith explaining how the optimizations work and what assumptions they make, including that the code shouldn't be copied & pasted to other data types like char[] or ROS<char>.