-
-
Notifications
You must be signed in to change notification settings - Fork 31.8k
gh-106581: Support multiple uops pushing new values #108895
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Also avoid the need for the awkward .clone() call in the argument to mgr.adjust_inverse() and mgr.adjust().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just three questions for my own understanding:
def adjust(self, offset: StackOffset) -> None: | ||
for down in offset.deep: | ||
deep = list(offset.deep) | ||
high = list(offset.high) | ||
for down in deep: | ||
self.adjust_deeper(down) | ||
for up in offset.high: | ||
for up in high: | ||
self.adjust_higher(up) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious, why? I assume it has something to do with the comment you removed below (# Use clone() since adjust_inverse() mutates final_offset
), but I'm not sure I understand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed. That .clone()
call was needed because adjust_higher()
and adjust_deeper()
change self.final_offset
(in place), and the argument here may be self.final_offset
. (And ditto for min_offset
.) So previously my "fix" for this was to pass in a clone of the argument. The new fix avoids the need to clone in the caller but effectively clones the deep
and high
lists of the argument.
# Note that for array output effects we may still write | ||
# past the stack top. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems surprising to me. When could this happen?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E.g. in UNPACK_SEQUENCE
and its specializations. For array outputs, the variable is just a pointer, and it is initialized before the body of the opcode is executed. The body is supposed to write its output directly into the stack using this pointer. But the SP is not adjusted until after the body is executed. This is iffy only if you believe that something else might write over the portion of the stack above the SP while that code is running. But nothing ever does that.
assert_no_pokes(managers) | ||
|
||
if mgr.instr.name == "SAVE_CURRENT_IP": | ||
next_instr_is_set = True | ||
if cache_offset: | ||
out.emit(f"next_instr += {cache_offset};") | ||
if tier == TIER_ONE: | ||
assert_no_pokes(managers) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are pokes bad for these?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because I don't want to think about what to do with loose pokes in macros containing SAVE_CURRENT_IP
(i.e. some specializations of CALL
), since next_instr_is_set
causes the normal handling of loose pokes below to be skipped.
If and when we need to support that case, this assert will remind us (or at least me :-) that some additional effort is needed here. (Kind of like how the LOAD_FAST_PUSH_NULL
use case reminded me to fix this issue here.)
Brandt reported to me that he tried to create a macro instruction like this:
and got an error from the cases generator. I've confirmed that this currently gives:
This PR fixes that, by moving the "poke" code generation for all uops to the end, after the stack adjustment is done. I've added a test that confirms this works (the test elicits the same errors as above without the fixes).
None of the existing generated files are changed by this PR, since none of the current macro ops do this.
The fix looks a bit convoluted, in part due to the refactoring, in part because I added some extra assertions to ensure that
_PUSH_FRAME
,_POP_FRAME
andSAVE_CURRENT_IP
are only used in ways that don't violate the assumptions by the fix.