-
Notifications
You must be signed in to change notification settings - Fork 18k
reflect: use binary search in MethodByName #50617
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Change https://golang.org/cl/378634 mentions this issue: |
Thanks for the thorough analysis. It's nice to see someone really dig into a performance problem like this. 120 methods is just crazy. I wonder how common this problem actually is, and if a different fix, if any, is the right way forward. MethodByName works very hard, even ignoring the a linear scan. Also, using the reflect package to call a method is sometimes necessary but often avoidable with a bit of work elsewhere. I am nervous about opening the door to optimizations of the reflect package. |
As a regular Go user, I think it's pretty reasonable to assume that MethodByName() won't be an O(N) operation. I think the author did list one such example, the templating engine, where it matters, but one would argue that e.g. if you make a reflection-based HTTP router, or any kind of dynamic dispatch, e.g. in an interpreter like yaegi, you would probably just use MethodByName(). You would not think of caching it unless you know that it's implemented as a linear search, because it's a standard library function, and most of the time the user's assumption would be that it's reasonably (as much as it can be reasonable given that it's reflection) optimised. |
@YuriyNasretdinov in fact, yaegi does use MethodByName in a few places. I haven't done any research to find out whether these usages lie on a hot path or anything. I can probably run the benchmarks that involve calling MethodByName and give some old/new timings. |
I think you can try to fix it with something like this:
|
@MBkkt probably. It may be worthwhile to see which strategy gives the mean average boost on different real codebases as opposed to the benchmarks. That being said, I would like to wait for a decision on whether any optimization to MethodByName should be done at all. |
This seems to fix in https://go-review.googlesource.com/c/go/+/425481 cc @golang/runtime |
Well, I'm more surprised that it was accepted without an open issue while mine patch was on hold due to discussions. Looking and the CL you've mentioned, this issue indeed can be closed. |
What version of Go are you using (
go version
)?Does this issue reproduce with the latest release?
Yes.
What operating system and processor architecture are you using (
go env
)?Linux/amd64
But it should be irrelevant.
What did you do?
Executed hugo on digitalgov.gov site, collected the CPU profiles. Found that
reflect.Type.MethodByName
is top1.If we take a look at the MethodByName sources, we'll notice that it uses a linear search. There is even a comment to fix that:
go/src/reflect/type.go
Lines 888 to 893 in a99c38d
What did you expect to see?
reflect.Type.MethodByName
doesn't use linear search for bigger slices.What did you see instead?
reflect.Type.MethodByName
is quite slow for some cases.More info, numbers, benchmarks
I'll send a CL that adds a binary search to the
reflect.Type.MethodByName
, but since there are a few caveats, I think it's worthwhile to share some info I gathered by investigating this.After patching the reflect package, I compared the results.
Old site build time: 20.4s
New site build time: 16.5s
If we take a look at the "new" CPU profile, MethodByName doesn't appear in top20 at all, even though it was a top1 entry before:
But then we have a question: why we get an impact that big? We need to collect some stats to be sure that it's not a coincidence.
First, let's see a distribution of number of methods.
It looks like vast majority of calls goes through 120-entry slice with linear search. Not very common I guess, but that's what we get.
But if the searched method would be located at 2nd or 3rd pos, it wouldn't be a problem. Therefore, we need to see how "found at" indexes are distributed.
Most cases fall into "found at i=98" category, which is pretty bad for linear search.
In total, building that site makes hugo call MethodByName 17042238 times.
Benchmarking the new method shows mixed, but expected, results:
In general, the new approach is only slower if searched method happens to be located somewhere in the beginning of the slice. If we assume that they're somewhere in the middle on average, then binary search performs better. I would still keep a linear search path for small slices, like
len<=8
to avoid a minor regression for small slices (that are not so uncommon).Caveats
sort.Search
We can decrease the effect of (2) by using a linear search path for small slices, but in general, this side-effect is hard to avoid.
The text was updated successfully, but these errors were encountered: