Skip to content

gh-106581: Support multiple uops pushing new values #108895

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Sep 5, 2023

Conversation

gvanrossum
Copy link
Member

@gvanrossum gvanrossum commented Sep 4, 2023

Brandt reported to me that he tried to create a macro instruction like this:

macro(LOAD_FAST_PUSH_NULL) = LOAD_FAST + PUSH_NULL;

and got an error from the cases generator. I've confirmed that this currently gives:

AssertionError: Push or pop above current stack level: stack_pointer[0]
...
AssertionError: Error writing macro LOAD_FAST_PUSH_NULL

This PR fixes that, by moving the "poke" code generation for all uops to the end, after the stack adjustment is done. I've added a test that confirms this works (the test elicits the same errors as above without the fixes).

None of the existing generated files are changed by this PR, since none of the current macro ops do this.

The fix looks a bit convoluted, in part due to the refactoring, in part because I added some extra assertions to ensure that _PUSH_FRAME, _POP_FRAME and SAVE_CURRENT_IP are only used in ways that don't violate the assumptions by the fix.

Also avoid the need for the awkward .clone() call in the argument
to mgr.adjust_inverse() and mgr.adjust().
Copy link
Member

@brandtbucher brandtbucher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just three questions for my own understanding:

Comment on lines 263 to 269
def adjust(self, offset: StackOffset) -> None:
for down in offset.deep:
deep = list(offset.deep)
high = list(offset.high)
for down in deep:
self.adjust_deeper(down)
for up in offset.high:
for up in high:
self.adjust_higher(up)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious, why? I assume it has something to do with the comment you removed below (# Use clone() since adjust_inverse() mutates final_offset), but I'm not sure I understand.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. That .clone() call was needed because adjust_higher() and adjust_deeper() change self.final_offset (in place), and the argument here may be self.final_offset. (And ditto for min_offset.) So previously my "fix" for this was to pass in a clone of the argument. The new fix avoids the need to clone in the caller but effectively clones the deep and high lists of the argument.

Comment on lines +463 to +464
# Note that for array output effects we may still write
# past the stack top.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems surprising to me. When could this happen?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E.g. in UNPACK_SEQUENCE and its specializations. For array outputs, the variable is just a pointer, and it is initialized before the body of the opcode is executed. The body is supposed to write its output directly into the stack using this pointer. But the SP is not adjusted until after the body is executed. This is iffy only if you believe that something else might write over the portion of the stack above the SP while that code is running. But nothing ever does that.

Comment on lines +445 to +452
assert_no_pokes(managers)

if mgr.instr.name == "SAVE_CURRENT_IP":
next_instr_is_set = True
if cache_offset:
out.emit(f"next_instr += {cache_offset};")
if tier == TIER_ONE:
assert_no_pokes(managers)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are pokes bad for these?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because I don't want to think about what to do with loose pokes in macros containing SAVE_CURRENT_IP (i.e. some specializations of CALL), since next_instr_is_set causes the normal handling of loose pokes below to be skipped.

If and when we need to support that case, this assert will remind us (or at least me :-) that some additional effort is needed here. (Kind of like how the LOAD_FAST_PUSH_NULL use case reminded me to fix this issue here.)

@gvanrossum gvanrossum merged commit b87263b into python:main Sep 5, 2023
@gvanrossum gvanrossum deleted the load-fast-push-null branch September 5, 2023 20:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants