bpo-45260: Add superinstruction, UNPACK_SEQUENCE__STORE_FAST #28519
You don't need to do this in specialize.c, right? You can do it in the compiler. That's faster.
Yes, I have implemented it in the compiler (https://github.com/zcpara/cpython/pull/2/files). The performance (geometric mean) of both implementations is almost the same. Which implementation should we go with?
From your benchmarks, 1.008x faster (0.8% faster) is almost in the realm of noise. It seems that only unpack_sequence (which is a microbenchmark) benefited. I don't know if we'll ever run out of instructions, but this doesn't seem worth it IMO.
FWIW, thus far the superinstructions implemented by Mark have been done via the specializing interpreter. From what I understand, the main benefit over a compiler implementation is that when users disassemble/inspect/trace bytecode, they won't encounter some strange-looking instruction; they will see the old instruction instead. So we don't need to document the new instruction or support it in third-party tools.
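For readers who want to see what such tools observe, here is a minimal sketch using the standard `dis` module to locate the instruction pair this PR targets. The function names `unpack_pair` and `unpack_store_pairs` are made up for illustration, and the exact opcode that follows `UNPACK_SEQUENCE` differs between CPython versions, so the check only looks for a name that starts with `STORE_FAST`.

```python
import dis


def unpack_pair(pair):
    # Tuple unpacking into two local variables.
    a, b = pair
    return a, b


def unpack_store_pairs(func):
    """Return (opname, next_opname) for each UNPACK_SEQUENCE that is
    immediately followed by a STORE_FAST-style instruction."""
    opnames = [ins.opname for ins in dis.get_instructions(func)]
    return [
        (first, second)
        for first, second in zip(opnames, opnames[1:])
        if first == "UNPACK_SEQUENCE" and second.startswith("STORE_FAST")
    ]


# Prints the adjacent pairs found by the default (non-adaptive) disassembly.
print(unpack_store_pairs(unpack_pair))
```

By default `dis` reports the unspecialized instructions, which is the behaviour described above: tools built on it keep seeing `UNPACK_SEQUENCE` rather than a specializer-only opcode.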
Lib/opcode.py
Outdated
@@ -247,6 +247,7 @@ def jabs_op(name, op):
"STORE_ATTR_SLOT",
"STORE_ATTR_WITH_HINT",
# Super instructions
"UNPACK_SEQUENCE_ST",
Please keep to the naming convention and rename this to UNPACK_SEQUENCE__STORE_FAST. Thank you.
Done.
Nonetheless, I wanted to say that this is still very impressive work! Thank you for implementing the instruction and benchmarking this.
I don’t want to derail this PR, but I feel that in general the compiler should do it if it can, as in this case. “Specialization” would be for cases where runtime data needs to be taken into account (e.g. BINARY_SUBSCR vs. BINARY_SUBSCR_LIST etc.). But we should really see which has the shorter code.
Sorry to seem to be arguing with myself, but actually I stand corrected. The specializer produces a bunch of super-instructions, like LOAD_FAST__LOAD_FAST. See the switch in optimize() in specialize.c.
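As a rough illustration of the idea (not the code in specialize.c), the pairing can be sketched as a table lookup over adjacent opcode names. This toy model assumes the in-place scheme in which only the first opcode of a pair is rewritten and the second instruction stays where it is; `SUPERINSTRUCTIONS` and `fuse_pairs` are hypothetical names.

```python
# Toy model only: a lookup table from adjacent opcode pairs to the
# superinstruction that would replace the first opcode of the pair.
SUPERINSTRUCTIONS = {
    ("LOAD_FAST", "LOAD_FAST"): "LOAD_FAST__LOAD_FAST",
    ("UNPACK_SEQUENCE", "STORE_FAST"): "UNPACK_SEQUENCE__STORE_FAST",
}


def fuse_pairs(opnames):
    """Return a copy of opnames with the first opcode of each matching
    pair replaced by its superinstruction; the second opcode is kept."""
    result = list(opnames)
    i = 0
    while i < len(result) - 1:
        pair = (result[i], result[i + 1])
        if pair in SUPERINSTRUCTIONS:
            result[i] = SUPERINSTRUCTIONS[pair]
            i += 2  # the second instruction is consumed by the pair
        else:
            i += 1
    return result


print(fuse_pairs(["LOAD_FAST", "LOAD_FAST", "UNPACK_SEQUENCE",
                  "STORE_FAST", "STORE_FAST"]))
```

Keeping the sequence the same length means offsets and jump targets are unaffected in this model, which is one reason to rewrite in place rather than delete the second instruction.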
Updated the implementation to do this in the compiler.
Sorry about arriving a bit late to this discussion. I don't think this is worthwhile. There is no performance improvement, and the new instruction is quite large and complex. Ignoring the Care needs to be taken not to break tracing. This PR breaks line tracing on:
(
    a,
    b,
) = (
    1,2
)
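One way to check line tracing for a snippet like this is to record the `line` events delivered through `sys.settrace`. The sketch below is only an assumed test harness, not the test used in CPython's suite; `SOURCE`, `traced_lines` and the wrapper function `unpack` are illustrative names. The recorded line numbers depend on the interpreter version, so no expected output is shown.

```python
import sys

# The multi-line unpacking from the comment above, wrapped in a function.
SOURCE = """\
def unpack():
    (
        a,
        b,
    ) = (
        1,2
    )
    return a, b
"""


def traced_lines(source, func_name):
    """Run the named function from source and collect its 'line' events."""
    namespace = {}
    exec(compile(source, "<tracing-demo>", "exec"), namespace)
    func = namespace[func_name]
    lines = []

    def tracer(frame, event, arg):
        # Only trace the frame of the function under test.
        if frame.f_code is func.__code__:
            if event == "line":
                lines.append(frame.f_lineno)
            return tracer
        return None

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func()
    finally:
        sys.settrace(previous)
    return result, lines


result, lines = traced_lines(SOURCE, "unpack")
print(result, lines)
```

Comparing the recorded line numbers before and after a bytecode change shows whether any line event was lost or added.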
We now specialize
https://bugs.python.org/issue45260