-
Notifications
You must be signed in to change notification settings - Fork 5k
Perf: String.StartsWith Linux regression from 1x speed to 100x speed #15372
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Labels
area-System.Runtime
os-linux
Linux OS (any supported distro)
tenet-performance
Performance related issue
Milestone
Comments
I assume this is actually due to the ICU work, not the change you pointed out. Previously we would have just been doing a simple ASCII search. Now we will be loading collators and doing a bunch more work per iteration. |
(For reference, dotnet/corefx#1632 only touched the Ordinal case, whereas this test is using CurrentCulture.) |
ellismg
referenced
this issue
in ellismg/corefx
Dec 16, 2015
In general lingustic comparsion is slower than ordinal comparision. When using ICU for globalization, the lingustic StartsWith code path actually ends up doing a lot of work inside ICU itself. While we'll try to make things better (see dotnet/corefx#3672) for common cases like ASCII only, if code doesn't need ligustic comparision the best practice is to not request it. BaseNumberConverter was using a lingustic StartsWith to detect a hex prefix when trying to parse a number but it is better served by just always using ordinal comparisons. This results in a ~35% QPS improvement in some very barebones ASP.NET scenarios (like the one outlined in richardkiene/CoreCLRWebAPISample) on my local machine.
ellismg
referenced
this issue
in ellismg/coreclr
Jan 13, 2016
In ICU doing any sort of index of operation (which includes Prefix and Suffix checking) is relatively expensive. ICU ends up doing a fair amount of work and allocations in order to construct a searcher object which could be reused, but our APIs are not amenable towards working in this manner. However, for some cultures we can often fast path ASCII searches when we know that ASCII and Ordinal comparisions are the same, as is the case for both Invariant and en-US. This change has CompareInfo hold some additional state about a locale to decide if we can do this optimiztion and then wires it up to IndexOf, LastIndexOf, IsPrefix and IsSuffix. In the future, we can try to extend the set of allowable cultures that we preform this optimization on by coming up with better checks on when it is safe to preform this transformation. Today, this optimization does not apply when IgnoreSymbols is set, because we would need to blank some ASCII symbol characters. If this ends up being a common operation, we could consider having ordinal implementations that also ignore symbols. This represents the best that we can do for dotnet/corefx#3672. It gets us back to where we were before for many common real world cases. Fixes dotnet/corefx#3672.
ellismg
referenced
this issue
in ellismg/coreclr
Jan 13, 2016
In ICU doing any sort of index of operation (which includes Prefix and Suffix checking) is relatively expensive. ICU ends up doing a fair amount of work and allocations in order to construct a searcher object which could be reused, but our APIs are not amenable towards working in this manner. However, for some cultures we can often fast path ASCII searches when we know that ASCII and Ordinal comparisions are the same, as is the case for both Invariant and en-US. This change has CompareInfo hold some additional state about a locale to decide if we can do this optimiztion and then wires it up to IndexOf, LastIndexOf, IsPrefix and IsSuffix. In the future, we can try to extend the set of allowable cultures that we preform this optimization on by coming up with better checks on when it is safe to preform this transformation. Today, this optimization does not apply when IgnoreSymbols is set, because we would need to blank some ASCII symbol characters. If this ends up being a common operation, we could consider having ordinal implementations that also ignore symbols. This represents the best that we can do for dotnet/corefx#3672. It gets us back to where we were before for many common real world cases. Fixes dotnet/corefx#3672.
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Labels
area-System.Runtime
os-linux
Linux OS (any supported distro)
tenet-performance
Performance related issue
The perf test and all applicable helper methods:
Linux results after regression (today, 10/6/2015)
Windows results:
Linux results before Regression: (Unfortunately these are from before I was recording dates with my results, so I'm not sure when they're from or even how valid they are - the important thing is moreso the current Linux results vs Windows)
I verified the results from today with a Console App and received the same results as with the xunit perf runner.
My first guess is that this has something to do with dotnet/coreclr#1632.
The text was updated successfully, but these errors were encountered: