-
-
Notifications
You must be signed in to change notification settings - Fork 31.9k
gh-118218: Reuse return tuple in itertools.pairwise #118219
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
With this change: ``` b1(1) min=152.8us mean=155.5us ± 3.8us (25 repeats, 1000 loops) b2(1) min=149.0us mean=159.1us ± 8.8us (25 repeats, 1000 loops) b3(1) min=232.6us mean=242.5us ± 11.7us (25 repeats, 1000 loops) b1(10) min=279.2us mean=296.9us ± 16.6us (25 repeats, 1000 loops) b2(10) min=249.5us mean=259.2us ± 12.2us (25 repeats, 1000 loops) b3(10) min=386.6us mean=398.8us ± 10.1us (25 repeats, 1000 loops) b1(1000) min=20.3ms mean=20.7ms ± 0.5ms (25 repeats, 1000 loops) b2(1000) min=16.7ms mean=17.1ms ± 0.2ms (25 repeats, 1000 loops) b3(1000) min=26.0ms mean=26.2ms ± 0.3ms (25 repeats, 1000 loops) ``` Without this change: ``` b1(1) min=142.2us mean=143.0us ± 0.9us (25 repeats, 1000 loops) b2(1) min=142.7us mean=143.3us ± 1.0us (25 repeats, 1000 loops) b3(1) min=219.8us mean=227.2us ± 4.4us (25 repeats, 1000 loops) b1(10) min=314.2us mean=323.8us ± 4.1us (25 repeats, 1000 loops) b2(10) min=335.4us mean=341.8us ± 5.1us (25 repeats, 1000 loops) b3(10) min=362.0us mean=386.2us ± 14.9us (25 repeats, 1000 loops) b1(1000) min=26.5ms mean=27.3ms ± 0.3ms (25 repeats, 1000 loops) b2(1000) min=29.8ms mean=30.2ms ± 0.2ms (25 repeats, 1000 loops) b3(1000) min=26.0ms mean=26.5ms ± 0.4ms (25 repeats, 1000 loops) ```
@@ -301,6 +302,11 @@ pairwise_new_impl(PyTypeObject *type, PyObject *iterable) | |||
} | |||
po->it = it; | |||
po->old = NULL; | |||
po->result = PyTuple_Pack(2, Py_None, Py_None); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PyTuple_Pack
is slow. I created an issue #118222 to discuss this.
An alternative that might be faster:
po->result = PyTuple_Pack(2, Py_None, Py_None); | |
PyObject tmp[2] = {Py_None, Py_None}; | |
po->result = _PyTuple_FromArraySteal(tmp, 2),; |
(note: untested!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested this locally. Since this line is once called once for each call to pairwise
the performance difference is insignificant. It does matter for the other call to PyTuple_Pack
(the one in pairwise_next
), but that is best handled in a different PR.
PyTuple_SET_ITEM(result, 0, Py_NewRef(old)); | ||
PyTuple_SET_ITEM(result, 1, Py_NewRef(new)); | ||
Py_DECREF(last_old); | ||
Py_DECREF(last_new); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since last_new
is equal to old
we could do:
PyTuple_SET_ITEM(result, 0, Py_NewRef(old)); | |
PyTuple_SET_ITEM(result, 1, Py_NewRef(new)); | |
Py_DECREF(last_old); | |
Py_DECREF(last_new); | |
PyTuple_SET_ITEM(result, 0, old); | |
PyTuple_SET_ITEM(result, 1, Py_NewRef(new)); | |
Py_DECREF(last_old); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not necessarily true that last_new
is old
. Really up to the whims of when the user lets go of the tuple
Conceptually this seems reasonable to me. Please do have someone else give it a thorough review (Serhiy would be a reasonable choice). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please compare it also with the simple code that uses PyTuple_New
+ PyTuple_SET_ITEM
instead of PyTuple_Pack
if the latter is so slow.
11f2e8e
to
68b5d27
Compare
Using this diff: diff --git a/Modules/itertoolsmodule.c b/Modules/itertoolsmodule.c
index 6ee447ef6a..668c5a4f15 100644
--- a/Modules/itertoolsmodule.c
+++ b/Modules/itertoolsmodule.c
@@ -356,7 +356,11 @@ pairwise_next(pairwiseobject *po)
return NULL;
}
/* Future optimization: Reuse the result tuple as we do in enumerate() */
- result = PyTuple_Pack(2, old, new);
+ result = PyTuple_New(2);
+ if (result != NULL) {
+ PyTuple_SET_ITEM(result, 0, Py_NewRef(old));
+ PyTuple_SET_ITEM(result, 1, Py_NewRef(new));
+ }
Py_XSETREF(po->old, new);
Py_DECREF(old);
return result; Baseline:
This PR:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
it = pairwise([None, None]) | ||
gc.collect() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In all tests above except for zip_longest
there is a next(it)
before gc.collect()
. All these tests were added in 226a012 (bpo-42536).
@brandtbucher, why is such difference? Is there a bug in test_zip_longest_result_gc
? Do we need next(it)
there and here?
Misc/NEWS.d/next/Library/2024-04-24-07-45-08.gh-issue-118218.m1OHbN.rst
Outdated
Show resolved
Hide resolved
Misc/NEWS.d/next/Library/2024-04-24-07-45-08.gh-issue-118218.m1OHbN.rst
Outdated
Show resolved
Hide resolved
…1OHbN.rst Co-authored-by: Serhiy Storchaka <[email protected]>
Thank you for your contribution @hauntsaninja. |
As per the comment, this is shamelessly based off of enumobject.c
With this change:
Without this change:
Benchmarking script: