Skip to content

bpo-45260: Add superinstruction, UNPACK_SEQUENCE__STORE_FAST #28519

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 8 commits into from

Conversation

zcpara
Copy link

@zcpara zcpara commented Sep 22, 2021

@zcpara zcpara requested a review from markshannon as a code owner September 22, 2021 15:03
@the-knights-who-say-ni

This comment has been minimized.

@gvanrossum
Copy link
Member

You don't need to do this in specialize.c, right? You can do it in the compiler. That's faster.

@zcpara
Copy link
Author

zcpara commented Sep 26, 2021

You don't need to do this in specialize.c, right? You can do it in the compiler. That's faster.

Yes. I implement it in the compiler (https://github.com/zcpara/cpython/pull/2/files). The performance (geometric mean) of both implementations are almost the same. Which implementation do we need?

Copy link
Member

@Fidget-Spinner Fidget-Spinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From your benchmarks 1.008x faster (0.8% faster) is almost in the realm of noise. It seems that only unpack_sequence (which is a microbenchmark) benefited. I don't know if we'll ever run out of instructions, but this doesn't seem to worth it IMO.

FWIW, thus far the superinstructions implemented by Mark have been via the specializing interpreter. From what I understand, the main benefit over a compiler implementation is that when users dissassemble/inspect/trace bytecode, they won't encounter some strange-looking instruction and it will instead fall back to the old instruction. So we don't need to document or support third party tools.

Lib/opcode.py Outdated
@@ -247,6 +247,7 @@ def jabs_op(name, op):
"STORE_ATTR_SLOT",
"STORE_ATTR_WITH_HINT",
# Super instructions
"UNPACK_SEQUENCE_ST",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please keep with the naming convention and rename this to UNPACK_SEQUENCE__STORE_FAST. Thank you.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@Fidget-Spinner
Copy link
Member

Nonetheless, I wanted to say that this is still very impressive work! Thank you for implementing the instruction and benchmarking this.

@gvanrossum
Copy link
Member

FWIW, thus far the superinstructions implemented by Mark have been via the specializing interpreter. From what I understand, the main benefit over a compiler implementation is that when users dissassemble/inspect/trace bytecode, they won't encounter some strange-looking instruction and it will instead fall back to the old instruction. So we don't need to document or support third party tools.

I don’t want to derail this PR, but I feel in general the compiler should do it if it can, like in this case. “Specialization” would be for cases where runtime data needs to be taken into account (e.g. BINARY_SUBSCR vs. BINARY_SUBSCR_LIST etc.). But we should really see which has the shorter code.

@gvanrossum
Copy link
Member

I don’t want to derail this PR, but I feel in general the compiler should do it if it can, like in this case.

Sorry to seem arguing with myself, but actually I stand corrected. The specializer produces a bunch of super-instructions, like LOAD_FAST__LOAD_FAST. See the switch in optimize() in specialize.c.

@zcpara
Copy link
Author

zcpara commented Sep 30, 2021

Update the implementation in the compiler.

@zcpara zcpara changed the title bpo-45260: Add superinstruction, UNPACK_SEQUENCE_ST bpo-45260: Add superinstruction, UNPACK_SEQUENCE__STORE_FAST Oct 9, 2021
@markshannon
Copy link
Member

Sorry about arriving a bit late to this discussion.

I don't think this is worthwhile. There is no performance improvement, and the new instruction is quite large and complex.

Ignoring the UNPACK_SEQUENCE benchmark, the difference is performance is just noise.
All other superinstructions are pairs of very simple instructions.

Care needs to be taken not to break tracing. This PR breaks line tracing on:

(
    a,
    b,
) = (
    1,2
)

@markshannon
Copy link
Member

We now specialize UNPACK_SEQUENCE, which means that adding this superinstruction would interfere with that specialization.
We have experimented with specialized superinstructions and, with the possibe exception of LOAD_FAST__LOAD_ATTR_INSTANCE_VALUE, they don't seem to be worthwhile.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants