Skip to content

bpo-37207: Use PEP 590 vectorcall to speed up range(), list() and dict() by about 30% #13930

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 10 commits into from

Conversation

markshannon
Copy link
Member

@markshannon markshannon commented Jun 9, 2019

@tirkarthi
Copy link
Member

test_range seems to leak references. Using test.bisect_cmd gives me test.test_range.RangeTest.test_invalid_invocation as the relevant test.

➜  cpython git:(pr_13930) ./python.exe -m test -R 3:3 test_range
Run tests sequentially
0:00:00 load avg: 1.75 [1/1] test_range
beginning 6 repetitions
123456
......
test_range leaked [11, 11, 11] references, sum=33
test_range failed

== Tests result: FAILURE ==

1 test failed:
    test_range

Total duration: 19 sec 722 ms
Tests result: FAILURE

@@ -3286,6 +3286,56 @@ dict_init(PyObject *self, PyObject *args, PyObject *kwds)
return dict_update_common(self, args, kwds, "dict");
}

static PyObject *
dict_vectorcall(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there should be a newline right after the (.

@jdemeyer
Copy link
Contributor

CC @vstinner because this PR actually affects the stable ABI for PyTypeObject (unlike #13858 where this was erroneously claimed)

Objects/call.c Outdated
@@ -1320,6 +1320,10 @@ _PyStack_UnpackDict(PyObject *const *args, Py_ssize_t nargs, PyObject *kwargs,
return 0;
}

if (PyArg_ValidateKeywordArguments(kwargs) == 0) {
Copy link
Contributor

@jdemeyer jdemeyer Jun 14, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change may or may not make sense, but in any case I think that it needs to be justified individually and a test added, preferably on a separate PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment in Include/cpython/abstract.h says that _PyStack_UnpackDict does not check the type of keyword arguments. So this change is inconsistent with that. If you think that this should be changed, please open a separate issue explaining the rationale for this change.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The additional validation should have been in PyVectorcall_Call which does need to do validation. I've moved it there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think that it's better to move that change to a different PR with rationale explained for this change.

@jdemeyer
Copy link
Contributor

Do you have an idea why this speeds up those calls that much? I'm just wondering if we could instead try to optimize the existing code path instead of implementing a mostly duplicate code path.

@jdemeyer
Copy link
Contributor

What are your plans for inheritance of tp_vectorcall? That should be clarified and documented.

Copy link
Contributor

@jdemeyer jdemeyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with the overall approach here. However I have various minor comments (some already mentioned):

  1. I'm not convinced that vectorcall should be used for dict(). There was a discussion about using METH_FASTCALL for dict.update() in bpo-29312 and the conclusion was that it was a bad idea. So either we need to argue that the conclusion on bpo-29312 was wrong or that it doesn't apply to the dict() constructor.
  2. You should try to minimize duplication of functionality. For example range_new and range_vectorcall serve essentially the same purpose, so why are there two functions? You could implement both in terms of a common function or implement range_new as
static PyObject *
range_new(PyTypeObject *type, PyObject *args, PyObject *kw)
{
    if (kw && PyDict_GET_SIZE(kw) != 0) {
        PyErr_Format(PyExc_TypeError, "range() takes no keyword arguments");
        return NULL;
    }
    return range_vectorcall(type, _PyTuple_ITEMS(args),
                            PyTuple_GET_SIZE(args), NULL);
}
  1. Use Py_ssize_t for the nargs variable.
  2. The change to PyVectorcall_Call does not belong here. Please open a separate PR for that.
  3. I'm getting a compiler warning
Objects/rangeobject.c:765:26: warning: initialization from incompatible pointer type [-Wincompatible-pointer-types]
         .tp_vectorcall = range_vectorcall
                          ^~~~~~~~~~~~~~~~

@jdemeyer
Copy link
Contributor

jdemeyer commented Jul 4, 2019

Since dict() and dict.update() are very similar (basically, dict(*args, **kwargs) starts from an empty dict and then calls update(*args, **kwargs) on it), we should handle them together. I suggest to take that change out of this PR.

@methane
Copy link
Member

methane commented Jul 4, 2019

dict and dict.update are different.
dict(d1, **d2) is well known idiom to merge two dicts. While there is {**d1, **d2} now, many people are familiar with old idiom.

@jdemeyer
Copy link
Contributor

jdemeyer commented Jul 4, 2019

@methane, I don't understand which point you are trying to make.

Right now, there is a common implementation of dict() and dict.update() using VARARGS. I would like to change that to a common implementation of dict() and dict.update() using FASTCALL. Do you agree?

@jdemeyer
Copy link
Contributor

I opened https://bugs.python.org/issue37540 for the issue of checking that keyword names are strings.

@jdemeyer
Copy link
Contributor

jdemeyer commented Aug 5, 2019

I created #14588 and #14682 to separate two pieces from this PR. May I suggest again to merge those first? That would allow simplifying this PR and allowing to move forward with other PRs depending on #14588 and #14682.

miss-islington pushed a commit that referenced this pull request Aug 15, 2019
Base PR for other PRs that want to play with `type.__call__` such as #13930 and #14589.

The author is really @markshannon I just made the PR.


https://bugs.python.org/issue37207



Automerge-Triggered-By: @encukou
lisroach pushed a commit to lisroach/cpython that referenced this pull request Sep 10, 2019
Base PR for other PRs that want to play with `type.__call__` such as python#13930 and python#14589.

The author is really @markshannon I just made the PR.


https://bugs.python.org/issue37207



Automerge-Triggered-By: @encukou
@encukou
Copy link
Member

encukou commented Sep 11, 2019

It would make sense to delegate tp_call for range() and list() to vectorcall, rather than duplicating the implementations.

@@ -5170,6 +5170,11 @@ inherit_slots(PyTypeObject *type, PyTypeObject *base)
!(type->tp_flags & Py_TPFLAGS_HEAPTYPE))
{
type->tp_flags |= _Py_TPFLAGS_HAVE_VECTORCALL;
/* Also inherit tp_vectorcall if tp_call of the
* metaclass is type.__call__ */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* metaclass is type.__call__ */
* metaclass is type.__call__ */

if (nargs == 1) {
int err = 0;
PyObject *arg = args[0];
_Py_IDENTIFIER(keys);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like two same identifiers in single source file.
Please make _Py_IDENTIFIER(keys); global static variable.

DinoV pushed a commit to DinoV/cpython that referenced this pull request Jan 14, 2020
Base PR for other PRs that want to play with `type.__call__` such as python#13930 and python#14589.

The author is really @markshannon I just made the PR.


https://bugs.python.org/issue37207



Automerge-Triggered-By: @encukou
@encukou
Copy link
Member

encukou commented Feb 11, 2020

This PR has 4 relatively independent parts, and all of it seems stuck on one of them. I opened #18464 to get range() out of the way.

miss-islington pushed a commit that referenced this pull request Feb 18, 2020
This continues the `range()` part of #13930. The complete pull request is stalled on discussions around dicts, but `range()` should not be controversial. (And I plan to open PRs for other parts if this is merged.)
On top of Mark's change, I unified `range_new` and `range_vectorcall`, which had a lot of duplicate code.


https://bugs.python.org/issue37207
@vstinner
Copy link
Member

This PR was splitted into multiple smaller PRs which all are now all merged: see https://bugs.python.org/issue37207

@vstinner vstinner closed this Mar 30, 2020
websurfer5 pushed a commit to websurfer5/cpython that referenced this pull request Jul 20, 2020
Base PR for other PRs that want to play with `type.__call__` such as python#13930 and python#14589.

The author is really @markshannon I just made the PR.


https://bugs.python.org/issue37207



Automerge-Triggered-By: @encukou
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants