gh-89013: Improve the performance of methodcaller #106960

Closed
eendebakpt wants to merge 9 commits

eendebakpt
Contributor

@eendebakpt eendebakpt commented Jul 21, 2023

This PR improves the performance of methodcaller by using the vectorcall protocol. The gain depends on the use case, but simple calls become more than twice as fast. Creation is a bit slower (because the vectorcall arguments have to be constructed), but the net gain is already positive after a single invocation of the methodcaller:

call: Mean +- std dev: [main] 138 ns +- 1 ns -> [patch] 55.5 ns +- 1.6 ns: 2.49x faster
creation+call: Mean +- std dev: [main] 257 ns +- 10 ns -> [patch] 190 ns +- 2 ns: 1.35x faster
call kwarg: Mean +- std dev: [main] 222 ns +- 8 ns -> [patch] 81.4 ns +- 0.4 ns: 2.72x faster
creation+call kwarg: Mean +- std dev: [main] 387 ns +- 11 ns -> [patch] 334 ns +- 5 ns: 1.16x faster

Geometric mean: 1.80x faster
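
For context, the observable behaviour being optimised can be sketched in pure Python. The helper name `py_methodcaller` is a hypothetical stand-in used only for illustration; the actual implementation is written in C and this sketch says nothing about its internals.

```python
from operator import methodcaller


def py_methodcaller(name, /, *args, **kwargs):
    """Pure-Python stand-in with the same observable behaviour (hypothetical name)."""
    def caller(obj):
        # Look up the method by name on each call and forward the stored arguments.
        return getattr(obj, name)(*args, **kwargs)
    return caller


data = [3, 1, 2]
methodcaller("sort", reverse=True)(data)      # same effect as data.sort(reverse=True)

other = [3, 1, 2]
py_methodcaller("sort", reverse=True)(other)  # the stand-in behaves the same way
```

The arguments are fixed at creation time and only the target object varies per call, which is why any per-call work that can be moved to creation pays off as soon as the callable is reused.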
Benchmark script
import pyperf

setup = """
from operator import methodcaller as mc
arr = []
call = mc('sort')
call_kwarg = mc('sort', reverse=True)
"""

runner = pyperf.Runner()
runner.timeit(name="call", stmt="call(arr)", setup=setup)
runner.timeit(name="creation+call", stmt="call = mc('sort'); call(arr)", setup=setup)
runner.timeit(name="call kwarg", stmt="call_kwarg(arr)", setup=setup)
runner.timeit(name="creation+call kwarg", stmt="call = mc('sort', reverse=True); call(arr)", setup=setup)
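
If pyperf is not installed, a rough cross-check of the same four cases can be made with the standard-library `timeit` module. This is only a sketch: the function name `run_cases` and the `number`/`repeat` defaults are arbitrary choices made here, and the results will be noisier than the pyperf figures above.

```python
import timeit

SETUP = """
from operator import methodcaller as mc
arr = []
call = mc('sort')
call_kwarg = mc('sort', reverse=True)
"""

# Same statements as in the pyperf script.
CASES = {
    "call": "call(arr)",
    "creation+call": "call = mc('sort'); call(arr)",
    "call kwarg": "call_kwarg(arr)",
    "creation+call kwarg": "call = mc('sort', reverse=True); call(arr)",
}


def run_cases(number=100_000, repeat=5):
    """Return the best observed time per statement in nanoseconds for each case."""
    results = {}
    for name, stmt in CASES.items():
        timings = timeit.repeat(stmt, setup=SETUP, number=number, repeat=repeat)
        # Take the minimum over the repeats to reduce the influence of system noise.
        results[name] = min(timings) / number * 1e9
    return results


if __name__ == "__main__":
    for name, ns in run_cases().items():
        print(f"{name}: {ns:.1f} ns")
```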

Note: this is a continuation of #27782. Changes with respect to that PR:

  • methodcaller_call was removed in favor of the vectorcall path, but it is still needed for some builds. It has been restored, in a less efficient form.
  • Store the full positional arguments as xargs instead of a slice, to avoid creating an additional tuple.
  • The additional storage in methodcallerobject compared to main is now two PyObject* variables and a tuple holding the keyword argument names.
  • To avoid constructing the vectorcall arguments when the methodcaller is defined, the construction can also be deferred to the first invocation of the methodcaller. A branch with this approach is main...eendebakpt:cpython:fastmethodcaller_lazy_vectorcall
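
The idea behind the points above can be modelled in pure Python. This is an illustrative sketch, not the C code from this PR: the class name `MethodCallerModel` and its attribute names are hypothetical. It only assumes the general vectorcall convention, in which a call receives one flat sequence of positional values followed by keyword values, plus a separate tuple of keyword names.

```python
class MethodCallerModel:
    """Hypothetical model of a methodcaller that precomputes its call layout."""

    def __init__(self, name, /, *args, **kwargs):
        self.name = name
        # One flat sequence built once at creation: slot 0 is reserved for the
        # target object, then positional values, then keyword values.
        self.vectorcall_args = [None, *args, *kwargs.values()]
        # Number of positional entries, counting the reserved slot.
        self.nargs = 1 + len(args)
        # Keyword names are stored separately, in the same order as their values.
        self.kwnames = tuple(kwargs)

    def __call__(self, obj):
        # Work on a copy so that re-entrant calls never share slot 0.
        call_args = list(self.vectorcall_args)
        call_args[0] = obj
        positional = call_args[1:self.nargs]
        keywords = dict(zip(self.kwnames, call_args[self.nargs:]))
        return getattr(obj, self.name)(*positional, **keywords)


sorter = MethodCallerModel("sort", reverse=True)
values = [1, 3, 2]
sorter(values)  # values is now sorted in descending order
```

The model shows the trade-off described above: building the flat argument sequence and the keyword-name tuple costs something at creation, and in exchange each call only has to fill in the target object. The lazy variant would move the construction in `__init__` to the first `__call__`.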

@eendebakpt eendebakpt changed the title Draft: gh-27782: Improve the performance of methodcaller Draft: gh-89013: Improve the performance of methodcaller Jul 21, 2023
@eendebakpt eendebakpt changed the title Draft: gh-89013: Improve the performance of methodcaller gh-89013: Improve the performance of methodcaller Jul 23, 2023
@eendebakpt eendebakpt changed the title gh-89013: Improve the performance of methodcaller Draft: gh-89013: Improve the performance of methodcaller Jul 23, 2023
@eendebakpt eendebakpt changed the title Draft: gh-89013: Improve the performance of methodcaller gh-89013: Improve the performance of methodcaller Jul 24, 2023
@corona10
Member

corona10 commented Aug 1, 2023

Closing this PR since #107201 has been merged.

@corona10 corona10 closed this Aug 1, 2023
@eendebakpt eendebakpt deleted the fastmethodcaller branch March 16, 2025 20:31
4 participants