gh-106812: Refactor to allow uops with array stack effects #107564

gvanrossum · 2023-08-02T17:48:28Z

This adds a new file, stacking.py, which tracks pushes and pops across the uops comprising a macro. Instruction writing for non-macro instructions is also unified with this.

The generated files look quite different, but I have carefully verified that everything works. (And usually if it doesn't, it won't even build. :-)

TODO:

(If possible) Clean up circular import between instructions.py and stacking.py
Pass Analyzer around and turn a few asserts into error messages
Remove redundant Analyzer.check_macro_consistency (fold into write_components)
Remove errors from Analyzer.stack_analysis that are no longer errors
Remove commented-out lines from `Formatter.assign`
Change StackItem so the effect itself is included in deep/high
(Maybe) Change StackItem to have a StackOffset member instead of inheriting it
(Probably not) Change StackOffset operations to use __add__, __sub__ etc.

Issue: Code generator: support variable stack effects in macros #106812

gvanrossum · 2023-08-03T02:49:34Z

@Fidget-Spinner I am tempted to just merge this without fixing everything I listed above -- we can improve things iteratively, and I feel it's more important to go back to gh-106581 (this started as a giant yak to shave for that). What do you think?

Fidget-Spinner · 2023-08-03T05:27:39Z

> I am tempted to just merge this without fixing everything I listed above -- we can improve things iteratively, and I feel it's more important to go back to gh-106581 (this started as a giant yak to shave for that). What do you think?

I'd prefer we work on this iteratively as well. It's easier to review that way too.

Also, surprisingly kenjin is an actual taken GH user, which isn't me, but bears the same name.

Fidget-Spinner

looks good in general

Fidget-Spinner · 2023-08-03T05:41:17Z

Tools/cases_generator/analysis.py

+                    if vars[eff.name] != eff:
+                        self.error(
+                            f"Instruction {instr.name!r} has "
+                            f"inconsistent types for variable {eff.name!r}: "


Should this be inconsistent types or just inconsistent in general?


Fidget-Spinner · 2023-08-03T05:51:40Z

Tools/cases_generator/analysis.py

+            for name, eff in vars.items():
+                if name in all_vars:
+                    if all_vars[name] != eff:
+                        self.warning(
+                            f"Macro {mac.name!r} has"
+                            f"inconsistent types for variable {name!r}: "
+                            f"{all_vars[name]} vs {eff} in {part.instr.name!r}",
+                            mac.macro,
+                        )
+                else:
+                    all_vars[name] = eff


Will this block ever warn? Wouldn't it all be consistent as it's already checked in get_var_names?


Fidget-Spinner · 2023-08-03T06:00:37Z

Tools/cases_generator/stacking.py

+
+
+@dataclasses.dataclass
+class StackOffset:


Could this be represented instead as an index from the TOS, then a capture of the "stack"?

E.g. for a PEEK(1) it would be index=-2, stack=[item1, item2, item3, item4].


gvanrossum · 2023-08-03T18:29:34Z

Tools/cases_generator/analysis.py

+                    if vars[eff.name] != eff:
+                        self.error(
+                            f"Instruction {instr.name!r} has "
+                            f"inconsistent types for variable {eff.name!r}: "


Yeah, it could mean that either type, cond or size is inconsistent. I'll change it.

gvanrossum · 2023-08-03T18:33:16Z

Tools/cases_generator/analysis.py

+            for name, eff in vars.items():
+                if name in all_vars:
+                    if all_vars[name] != eff:
+                        self.warning(
+                            f"Macro {mac.name!r} has"
+                            f"inconsistent types for variable {name!r}: "
+                            f"{all_vars[name]} vs {eff} in {part.instr.name!r}",
+                            mac.macro,
+                        )
+                else:
+                    all_vars[name] = eff


This checks for inconsistency across different instructions. E.g.

    op(A, (args[oparg] --)) { ... }
    op(B, (args[oparg+1] --)) { ... }
    macro(M) = A + B;
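As a rough illustration of that cross-instruction check, here is a minimal Python sketch. The `Effect` dataclass and this standalone `check_macro_consistency` function are hypothetical simplifications for this example, not the classes in `analysis.py`; the sketch only assumes that an effect carries a name, type, cond and size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Effect:
    """Hypothetical stand-in for a stack effect (not the real StackEffect)."""
    name: str
    type: str = ""
    cond: str = ""
    size: str = ""


def check_macro_consistency(parts: dict[str, list[Effect]]) -> list[str]:
    """Return one warning per variable whose effect differs between parts."""
    all_vars: dict[str, Effect] = {}
    warnings: list[str] = []
    for instr_name, effects in parts.items():
        for eff in effects:
            seen = all_vars.get(eff.name)
            if seen is None:
                all_vars[eff.name] = eff  # first sighting: remember it
            elif seen != eff:
                warnings.append(
                    f"inconsistent definitions for variable {eff.name!r}: "
                    f"{seen} vs {eff} in {instr_name!r}"
                )
    return warnings


# Mirrors the A + B macro above: same name, different array sizes.
macro_m = {
    "A": [Effect("args", size="oparg")],
    "B": [Effect("args", size="oparg+1")],
}
warnings = check_macro_consistency(macro_m)
for message in warnings:
    print(message)
```

Each instruction is consistent on its own here, so a per-instruction check passes; the mismatch only shows up once the parts of the macro are compared against each other.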

gvanrossum · 2023-08-03T18:34:10Z

Tools/cases_generator/analysis.py

@@ -371,7 +439,7 @@ def stack_analysis(
                        eff.size for eff in instr.input_effects + instr.output_effects
                    ):
                        # TODO: Eventually this will be needed, at least for macros.
-                        self.error(
+                        self.warning(


FWIW, the checks in this function are no longer needed, and in fact it's not called any more, so I'm deleting it. :-)

gvanrossum · 2023-08-03T19:03:33Z

Tools/cases_generator/stacking.py

+
+
+@dataclasses.dataclass
+class StackOffset:


Hm... That's closer to the old way of doing this, where the stack was represented (implicitly) by variables _tmp_1, _tmp_2, etc., and the stack offset of an effect was represented as being mapped to one of those variables (using the old input_mapping and output_mapping members of Component). I ran into some problems there when an effect is conditional, and managed to hack that in (but only for the output effects of the last component). But with array effects for guard instructions like we will need for CALL guards I just couldn't hack it any more, and instead I came up with this abstraction, which can handle conditional and array effects anywhere in a macro.
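To make the idea of a symbolic offset concrete, here is a minimal Python sketch. It is not the `StackOffset` class from `stacking.py`: the `Effect` and `SymbolicOffset` names, their fields, and the rendering format are all assumptions made for this example. The point is only that an offset kept as lists of popped and pushed effects can still be rendered as a C expression when some effects are arrays or conditional.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Effect:
    """Hypothetical stack effect with an optional array size or condition."""
    name: str
    size: str = ""  # e.g. "oparg" for an array effect such as args[oparg]
    cond: str = ""  # e.g. "oparg & 1" for a conditional effect

    def count(self) -> str:
        """C expression for the number of stack slots this effect occupies."""
        if self.size:
            return f"({self.size})"
        if self.cond:
            return f"(({self.cond}) ? 1 : 0)"
        return "1"


@dataclass
class SymbolicOffset:
    """Offset from the starting stack pointer, kept as lists of effects."""
    popped: list[Effect] = field(default_factory=list)
    pushed: list[Effect] = field(default_factory=list)

    def pop(self, eff: Effect) -> None:
        self.popped.append(eff)

    def push(self, eff: Effect) -> None:
        self.pushed.append(eff)

    def as_c_expr(self) -> str:
        """Render the offset, folding single-slot effects into a constant."""
        const = 0
        terms: list[str] = []
        for sign, effects in (("-", self.popped), ("+", self.pushed)):
            for eff in effects:
                slots = eff.count()
                if slots == "1":
                    const += 1 if sign == "+" else -1
                else:
                    terms.append(f"{sign} {slots}")
        return " ".join([str(const)] + terms)


offset = SymbolicOffset()
offset.pop(Effect("callable"))
offset.pop(Effect("args", size="oparg"))
offset.push(Effect("res"))
offset.push(Effect("null", cond="oparg & 1"))
print(offset.as_c_expr())  # 0 - (oparg) + ((oparg & 1) ? 1 : 0)
```

Because the offset stays symbolic until it is rendered, the same bookkeeping works for an effect in any position of a macro, which a fixed mapping onto numbered temporaries cannot do.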

The only thing this cannot handle is a situation where a value is temporarily pushed onto the stack, stays there for the next uop, and then is popped later:

    op(A, (-- temp)) { ... }  // stack: [] -> [temp]
    op(B, (--)) { ... }       // stack: [temp] -> [temp]
    op(C, (temp --)) { ... }  // stack: [temp] -> []
    macro(M1) = A + C;
    macro(M2) = A + B + C;

Here both M1 and M2 have a net stack effect of 0, n_popped is 0, and n_pushed is 0, and we cannot express using n_popped and n_pushed that this uses one temporary stack item. For M1 that's not a problem, because the algorithm translates the push in A and the pop in C into a copy, which doesn't require stack space. (Also, the copy disappears because the variable name is the same.) But for M2 the push and pop are not adjacent so they are not optimized away like that.
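The bookkeeping gap can be seen in a small Python sketch. The `summarize` function is hypothetical, and it assumes one plausible reading of the two counters: `n_popped` is how far below the starting level the macro reads, and `n_pushed` is the final level measured from that low point. It does not model the push/pop-to-copy optimization described above.

```python
def summarize(uops: list[tuple[int, int]]) -> dict[str, int]:
    """Summarize a macro given each uop as a (pops, pushes) pair."""
    level = lowest = highest = 0
    for pops, pushes in uops:
        level -= pops
        lowest = min(lowest, level)
        level += pushes
        highest = max(highest, level)
    return {
        "n_popped": -lowest,         # depth read below the starting level
        "n_pushed": level - lowest,  # final level relative to that low point
        "net": level,                # net stack effect
        "peak": highest,             # temporary space above the starting level
    }


A, B, C = (0, 1), (0, 0), (1, 0)  # the three ops from the example above
m1 = summarize([A, C])
m2 = summarize([A, B, C])
print(m1)
print(m2)
```

Both macros come out with `n_popped`, `n_pushed` and `net` equal to 0 while `peak` is 1, so the two counters alone cannot reveal that a temporary slot is in use.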

I could improve the algorithm to recognize this situation and use a copy for M2, but it's more complicated and it's unlikely that we'll need this. (Most likely in a real case there would be a result pushed onto the stack at the end, so the problem of phantom stack space wouldn't occur.) I ought to at least detect it and warn, but I'd rather do that in a future PR, since this one is complex enough as it is.

Note: the generated code for M2 clearly shows the problem:

    TARGET(M2) {
        PyObject *temp;
        // A
        {
            ...
        }
        stack_pointer[0] = temp;
        // B
        {
            ...
        }
        // C
        temp = stack_pointer[0];
        {
            ...
        }
        DISPATCH();
    }

Note that stack_pointer[0] is an invalid stack item, pointing just above the current stack top. (The actual top is stack_pointer[-1].)


gvanrossum

(Just me mumbling to myself.)

gvanrossum · 2023-08-03T22:49:53Z

Tools/cases_generator/stacking.py

+            res = f"stack_pointer[{index}]"
+        if not lax:
+            # Check that we're not reading or writing above stack top.
+            # Skip this for output variable initialization (lax=True).


I'm still pondering if I can add a working check for writing arrays above stack level. This is tricky because output array variables are initialized before the body of the opcode, at a point where stack_pointer is still low. E.g.

inst(UNPACK_SEQUENCE_TUPLE, (unused/1, seq -- values[oparg])) { ... }

which produces output like this:

    TARGET(UNPACK_SEQUENCE_TUPLE) {
        PyObject *seq;
        PyObject **values;
        seq = stack_pointer[-1];
        values = stack_pointer - 1;
        ...
        STACK_SHRINK(1);
        STACK_GROW(oparg);
        next_instr += 1;
        DISPATCH();
    }

The assert would trigger, because the code (...) may well write above stack_pointer, but all is well because of the following STACK_GROW(oparg). Hence the lax flag, which disables the assert for output array variables (L372 below). It's hard to conceive of a realistic example where the failing assert would not be a false positive. A theoretical example would be:

    op(A, (-- temp[oparg])) { ... }
    op(B, (temp[oparg] --)) { ... }
    macro(M) = A + B;

But I don't expect we'll ever write such code.
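The effect of the `lax` flag can be sketched in Python as follows. The `stack_ref` function is hypothetical and works on plain integer indexes at generation time, whereas the code under review deals with symbolic offsets and emits asserts; only the shape of the check is meant to carry over.

```python
def stack_ref(index: int, lax: bool = False) -> str:
    """Return a C expression for a slot relative to stack_pointer.

    Index -1 is the current top of stack; index 0 and above lie above it.
    """
    if not lax and index >= 0:
        # Strict mode: refuse to read or write above the current stack top.
        raise ValueError(f"stack_pointer[{index}] is above the stack top")
    return f"stack_pointer[{index}]"


print(stack_ref(-1))          # ordinary access to the top item
print(stack_ref(0, lax=True))  # output variable initialized before the stack grows
```

With `lax=True` the check is skipped, which matches the case above where an output array is set up before the later `STACK_GROW(oparg)` makes those slots valid.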

Follow-up in #107649:

This fixes two tiny defects in analysis.py that I didn't catch in time in #107564:

- `get_var_names` in `check_macro_consistency` should skip `UNUSED` names.
- Fix an occurrence of `is UNUSED` (should be `==`).
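A short Python sketch of why `==` is the safer comparison here. It assumes, for illustration only, that `UNUSED` is a plain string sentinel; the `named_effects` helper is hypothetical.

```python
UNUSED = "unused"  # assumed sentinel value, for illustration only


def named_effects(names: list[str]) -> list[str]:
    """Keep only names that are not the UNUSED placeholder."""
    # Compare by value: a name built at run time (for example, parsed from
    # a file) may equal UNUSED without being the same object, and whether
    # `is` happens to match depends on string interning.
    return [name for name in names if name != UNUSED]


parsed = "".join(["un", "used"])  # equal to UNUSED, constructed at run time
print(named_effects(["seq", parsed, "values"]))  # ['seq', 'values']
```

Identity checks on strings rely on an implementation detail, so a value comparison keeps the filter correct no matter how the name was produced.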

gvanrossum added 26 commits July 31, 2023 16:16

Make repr(StackEffect()) nicer

67d0445

Ignore name when comparing StackEffect

f8a56f9

Improve counting of things read/written

2e80b55

Record which instructions use DEOPT_IF

c2f4277

Blackify a few turds left by refactor

80f2895

Add warning() and note() helpers

41878fe

Use self.note() for non-viable uops in macros

4ad5db2

Temporary check for troublesome macros

9a0bdb1

Replace part of Instruction.write()

d989f6b

Replace most of the rest of Instruction.write()

533fe6b

Spread enum InstructionFormat across multiple lines

d92516c

Compute macro stack effect info using stacking.py

ce14682

Disappear wrap_macro

d067e45

Add helper method to Formatter for family size static assert

5437eca

Replace write_macro with stacking logic

dbc04f8

Emit cache vars in macros

cf46cc2

Increment next_instr in macros

c979089

Move conditional init test to Formatter.declare()

c41c906

Fix bug in merge(); clean up a bit

cfb5bcb

Fix test_generated_cases.py

4cb3986

Unify Peek/Poke to StackItem

51e3e29

Get rid of initial_offset and net_offset (unused)

2195484

Refactor write_macro_instr

aa9832a

Unify macro and single instruction writing

3027ccb

Skip pokes for unmoved names

597b97f

Fix tests

ce6c203

gvanrossum changed the title ~~Refactor to allow uops with array stack effects~~ gh-106812: Refactor to allow uops with array stack effects Aug 2, 2023

bedevere-bot mentioned this pull request Aug 2, 2023

Code generator: support variable stack effects in macros #106812

Closed

gvanrossum added the skip news label Aug 2, 2023

Add static_assert back to macro

dde4595

gvanrossum requested a review from Fidget-Spinner August 2, 2023 18:56

gvanrossum added 7 commits August 2, 2023 13:33

Remove some unneeded imports and commented-out code

8e96cc1

Fix test by moving import

652093c

Rm {in,out}put_mapping, stack, {initial,final}_sp

d21f8a4

Make StackOffset a member instead of base class of StackItem

008721c

Fix type errors

b4a9652

Move merge into __init__; offset includes effect

777bbd4

Simplify get_stack_effect_info_for_macro

7c930e3

Remove unused import of abstractmethod

0ff94ac

Fidget-Spinner reviewed Aug 3, 2023


Improve consistency errors; rm stack_analysis()

39051e3

(stack_analysis.py was no longer being called!)

gvanrossum commented Aug 3, 2023


Add asserts for push/pop above stack level

09a40fa

See python#107564 (comment)

gvanrossum marked this pull request as ready for review August 3, 2023 20:50

bedevere-bot added the awaiting core review label Aug 3, 2023

Improve asserts for push/pop above stack level

925388f

gvanrossum commented Aug 3, 2023


Remove unused def string_effect_size and tests

bb56748

gvanrossum merged commit 400835e into python:main Aug 4, 2023

bedevere-bot removed the awaiting core review label Aug 4, 2023

gvanrossum mentioned this pull request Aug 5, 2023

gh-106812: Fix two tiny bugs in analysis.py #107649

Merged

gvanrossum deleted the stacking branch August 5, 2023 04:21
