-
-
Notifications
You must be signed in to change notification settings - Fork 31.8k
gh-107557: Remove unnecessary SAVE_IP instructions #108583
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
(@Fidget-Spinner I'm not sure if reusing your gh-107557 as the issue in the PR subject is right. I've mostly been skipping news for optimizer/executor based changes because they don't matter to end users and only a few core devs even care. Those interested can look at the commit messages.) |
Python/optimizer.c
Outdated
@@ -372,6 +372,32 @@ static PyTypeObject UOpExecutor_Type = { | |||
.tp_as_sequence = &uop_as_sequence, | |||
}; | |||
|
|||
static int | |||
squeeze_stubs( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
squeeze_stubs( | |
shift_stubs( |
(the name implied to me that something happens to the size of the stubs).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about move_stubs()
(echoing that it uses memmove()
).
need_ip = true; | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert here that last_instr >= 0
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I can just initialize it to 0 though. If the trace is empty we shouldn't get here but the code here won't do anything.
or variable_used(instr, "error") | ||
or variable_used(instr, "pop_1_error") | ||
or variable_used(instr, "exception_unwind") | ||
or variable_used(instr, "resume_with_error") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could make variable_used
accept a sequence of strings to check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could, but why bother? Plus, I'd kind of like variable_used(instr, X, Y, Z)
to look for the sequence of identifiers X Y Z
. (And I'd also like to allow keywords, e.g. goto
.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the reviews! I've changed the function name, shortened the comment, and changed the initialization of last_instr. I'll merge once tests pass.
Python/optimizer.c
Outdated
@@ -372,6 +372,32 @@ static PyTypeObject UOpExecutor_Type = { | |||
.tp_as_sequence = &uop_as_sequence, | |||
}; | |||
|
|||
static int | |||
squeeze_stubs( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about move_stubs()
(echoing that it uses memmove()
).
need_ip = true; | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I can just initialize it to 0 though. If the trace is empty we shouldn't get here but the code here won't do anything.
or variable_used(instr, "error") | ||
or variable_used(instr, "pop_1_error") | ||
or variable_used(instr, "exception_unwind") | ||
or variable_used(instr, "resume_with_error") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could, but why bother? Plus, I'd kind of like variable_used(instr, X, Y, Z)
to look for the sequence of identifiers X Y Z
. (And I'd also like to allow keywords, e.g. goto
.)
|
Benchmarking results show this would be 4% slower than main if the optimizer was always on (it isn't). |
As a final optimization pass, squeeze out any SAVE_IP instructions that are not needed.
For example, if we have
we can remove the first
SAVE_IP
, sinceLOAD_FAST
can neither deoptimize nor error out; but we cannot remove the secondSAVE_IP
, sinceLOAD_GLOBAL
can error out.This pass requires new per-opcode flags,
HAS_ERROR_FLAG
andHAS_DEOPT_FLAG
, which must be defined for regular ops as well as for uops. As a bonus we also remove redundantNOP
opcodes.(It may seem suboptimal to call
squeeze_stubs()
twice -- once after constructing an initial trace if it has stubs, once after removing unneeded uops; but to avoid this we'd need to change a bunch of APIs to return where the gap is. This seems simpler.)