GH-98831: Implement array support in cases generator #100912

gvanrossum · 2023-01-10T05:35:39Z

There are some issues, but as Mark said we needn't attempt to get the generator to do everything (at least not right now).

In particular, as I mentioned in the meeting, there's no full support for output arrays (but maybe we'll never need that).

While I don't envision super-instructions with array operands, once macros become more popular they may well need arrays -- that's currently not supported.

We might want to consider a macro to DECREF an array of object pointers; this is currently done in an ad-hoc fashion.

Issue: Generate the interpreter #98831

bedevere-bot · 2023-01-10T05:35:42Z

🤖 New build scheduled with the buildbot fleet by @gvanrossum for commit 597fd69 🤖

If you want to schedule another build, you need to add the :hammer: test-with-refleak-buildbots label again.

brandtbucher

My computer is forcing a restart on me in two minutes, so here's what I've got. I still need to finish looking at test_generator.py and generate_cases.py tomorrow (there's a lot of new logic that I don't quite follow yet).

Python/bytecodes.c

Tools/cases_generator/generate_cases.py

Tools/cases_generator/parser.py

gvanrossum · 2023-01-14T02:26:04Z

@brandtbucher This should be ready for your review now. I don't intend to convert the remaining opcodes that have an array stack effect -- they either have an output array (for which this logic doesn't really work) or they are one of the main CALL specializations (wow there are a lot of those).

Let me know if I can explicate something for you.

brandtbucher

Thanks for your patience. A few more things (mostly minor):

brandtbucher · 2023-01-17T17:56:58Z

Python/opcode_metadata.h

+    [LIST_APPEND] = { -1, -1, DIR_NONE, DIR_NONE, DIR_NONE, true, INSTR_FMT_IX },
+    [SET_ADD] = { -1, -1, DIR_NONE, DIR_NONE, DIR_NONE, true, INSTR_FMT_IX },


All four of the changed opcodes in this file should be INSTR_FMT_IB, not INSTR_FMT_IX.

I'm guessing you search for "oparg" in the body of the instruction, but I think we need to now search for it in array dimensions too.

Good catch. Fixed.
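
A minimal sketch of the detection discussed above, assuming the generator chooses the format by looking for the identifier `oparg`. The helper name `instr_format` and its parameters are hypothetical; only `INSTR_FMT_IB` and `INSTR_FMT_IX` come from the diff.

```python
import re

# Hypothetical helper, not the generator's actual code.
# Matches 'oparg' only as a whole identifier.
_OPARG = re.compile(r"\boparg\b")


def instr_format(body: str, dimensions: list[str]) -> str:
    """Return INSTR_FMT_IB if 'oparg' appears in the body or in any
    array dimension, else INSTR_FMT_IX."""
    uses_oparg = bool(_OPARG.search(body)) or any(
        _OPARG.search(dim) for dim in dimensions
    )
    return "INSTR_FMT_IB" if uses_oparg else "INSTR_FMT_IX"
```

With this shape, an instruction whose body never mentions `oparg` but whose stack effect is `values[oparg-1]` would still be classified as `INSTR_FMT_IB`.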

brandtbucher · 2023-01-17T18:04:15Z

Tools/cases_generator/parser.py

+                if not (dim := self.dimension()):
+                    raise self.make_syntax_error("Expected dimension")
+                self.require(lx.RBRACKET)
+                return StackEffect(tkn.text, "PyObject **", dim.text.strip())
+            else:
+                return StackEffect(tkn.text)
+
+    @contextual
+    def dimension(self) -> Dimension | None:
+        tokens: list[lx.Token] = []
+        while (tkn := self.peek()) and tkn.kind != lx.RBRACKET:
+            tokens.append(tkn)
+            self.next()
+        return Dimension(lx.to_text(tokens).strip())


Nitpicks:

dimension can't return None, right?

Suggested change

-                if not (dim := self.dimension()):
-                    raise self.make_syntax_error("Expected dimension")
-                self.require(lx.RBRACKET)
-                return StackEffect(tkn.text, "PyObject **", dim.text.strip())
-            else:
-                return StackEffect(tkn.text)
-
-    @contextual
-    def dimension(self) -> Dimension | None:
-        tokens: list[lx.Token] = []
-        while (tkn := self.peek()) and tkn.kind != lx.RBRACKET:
-            tokens.append(tkn)
-            self.next()
-        return Dimension(lx.to_text(tokens).strip())
+                dim = self.dimension()
+                self.require(lx.RBRACKET)
+                return StackEffect(tkn.text, "PyObject **", dim.text.strip())
+            else:
+                return StackEffect(tkn.text)
+
+    @contextual
+    def dimension(self) -> Dimension:
+        tokens: list[lx.Token] = []
+        while (tkn := self.peek()) and tkn.kind != lx.RBRACKET:
+            tokens.append(tkn)
+            self.next()
+        return Dimension(lx.to_text(tokens).strip())

Also, I think the current definition of dimension allows an empty dimension (like stuff[]). Probably worth guarding against.

Ah, but the correct fix to the code is for dimension() to check that it's got at least one token and return None otherwise.
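
A rough, standalone sketch of that guard, assuming a flat token list instead of the real parser state. `Token` and `parse_dimension` are hypothetical stand-ins for `lx.Token` and `Parser.dimension()`, and the `"RBRACKET"` kind string is illustrative.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Token:  # hypothetical stand-in for lx.Token
    kind: str
    text: str


def parse_dimension(tokens: list[Token], pos: int) -> tuple[str | None, int]:
    """Collect tokens up to (not including) the closing bracket.

    Returns (dimension_text, new_pos), or (None, pos) if no token was
    consumed, so the caller can report "Expected dimension".
    """
    start = pos
    while pos < len(tokens) and tokens[pos].kind != "RBRACKET":
        pos += 1
    if pos == start:
        return None, start  # empty dimension, as in stuff[]
    return " ".join(t.text for t in tokens[start:pos]), pos
```

Returning `None` for the empty case keeps the caller's `if not (dim := ...)` check meaningful, which is why the `Dimension | None` annotation stays.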

brandtbucher · 2023-01-17T18:05:52Z

Tools/cases_generator/test_generator.py

@@ -4,6 +4,29 @@
 import tempfile

 import generate_cases
+from parser import StackEffect


Question about this file: is it supposed to run as part of our test suite, or is the workflow just to use pytest or something locally?

For now I just run it locally after I make changes.

I don't think our test suite supports pytest, and I don't want to make it a separate GitHub workflow. Eventually we'll have to rewrite it to use unittest. But not today.

brandtbucher · 2023-01-17T18:08:29Z

Tools/cases_generator/test_generator.py

+            STACK_SHRINK(oparg*2);
+            STACK_SHRINK(2);


I'm sort of surprised we don't combine these. Not that it matters much, though.

brandtbucher · 2023-01-17T18:11:10Z

Tools/cases_generator/generate_cases.py

+            # The only supported output array forms are:
+            # - unused[...]
+            # - X[...] where X[...] matches an i99nput array form
+            self.emit(f"MOVE_ITEMS({src.name}, {dst.name}, {src.size});")


This feels backwards to me. My mental model is the memcpy argument order:

Suggested change

-            self.emit(f"MOVE_ITEMS({src.name}, {dst.name}, {src.size});")
+            self.emit(f"MOVE_ITEMS({dst.name}, {src.name}, {src.size});")
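
For illustration only, the memcpy-style ordering (destination first, then source, then count) could be modelled like this. `move_items` is a hypothetical Python analogue operating on lists; the `MOVE_ITEMS()` macro itself is not an existing macro.

```python
def move_items(dst: list, src: list, n: int) -> None:
    """Copy the first n items of src over the first n items of dst.

    Argument order mirrors C's memcpy(dest, src, n): destination first.
    """
    dst[:n] = src[:n]


stack_slots = [None, None, None]
move_items(stack_slots, ["a", "b", "c"], 2)
```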

brandtbucher · 2023-01-17T18:17:42Z

Tools/cases_generator/generate_cases.py

+        if isym and isym != osym:
+            self.emit(f"STACK_SHRINK({isym});")
+        if diff < 0:
+            self.emit(f"STACK_SHRINK({-diff});")
        if diff > 0:
            self.emit(f"STACK_GROW({diff});")
-        elif diff < 0:
-            self.emit(f"STACK_SHRINK({-diff});")
+        if osym and osym != isym:
+            self.emit(f"STACK_GROW({osym});")


See my comment earlier. It probably doesn't matter much, but it feels like we should be combining these into a single STACK_GROW/STACK_SHRINK per instruction.

I thought about that. There are some problematic theoretical cases where the generator would need to understand the symbolic sizes to be able to tell if it's a net grow or shrink operation, and it can't first do a grow/shrink of the numeric part only, since you could have e.g.

inst(FOO, (a, b[oparg] -- c, d))

which grows if oparg == 0 but shrinks for oparg >= 2. We cannot safely do this as grow(1); shrink(oparg), nor as grow(1 - oparg) nor as shrink(oparg - 1). It must be done as shrink(1 + oparg); grow(2), or use STACK_ADJUST(), but the latter doesn't do bounds checking in debug mode. The best way is shrink(oparg); grow(1), but I left it even simpler, as shrink(1); shrink(oparg); grow(2).

Honestly, the case analysis got hairy enough that I decided that this was okay to leave to future generations; the compiler will probably do a fine job optimizing shrink(a); shrink(b) to shrink(a + b).
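
The ordering problem can be seen with a toy model. This is only a sketch: the `Stack` class, the `run` helper, and the exact checks (no negative sizes, level kept within `0..limit`) are assumptions about what the debug-mode bounds checking looks like, not the real macros.

```python
class Stack:
    """Toy model of a bounds-checked stack level (assumed checks)."""

    def __init__(self, level: int, limit: int) -> None:
        self.level = level
        self.limit = limit

    def grow(self, n: int) -> None:
        if n < 0:
            raise ValueError("negative grow")
        self.level += n
        if self.level > self.limit:
            raise ValueError("stack overflow")

    def shrink(self, n: int) -> None:
        if n < 0:
            raise ValueError("negative shrink")
        self.level -= n
        if self.level < 0:
            raise ValueError("stack underflow")


def run(steps, level, limit, oparg):
    """Apply (op, size_fn) steps; return the final level or None on a failed check."""
    stack = Stack(level, limit)
    try:
        for op, size in steps:
            getattr(stack, op)(size(oparg))
    except ValueError:
        return None
    return stack.level


# inst(FOO, (a, b[oparg] -- c, d)): pops 1 + oparg items, pushes 2.
emitted = [("shrink", lambda n: 1), ("shrink", lambda n: n), ("grow", lambda n: 2)]
grow_first = [("grow", lambda n: 1), ("shrink", lambda n: n)]
combined = [("grow", lambda n: 1 - n)]

# Full stack holding a plus three array items, oparg == 3.
results = {
    "emitted": run(emitted, level=4, limit=4, oparg=3),
    "grow_first": run(grow_first, level=4, limit=4, oparg=3),
    "combined": run(combined, level=4, limit=4, oparg=3),
}
```

In this model only the shrink-first sequence passes every check: growing first overshoots the limit on a full stack, and the combined form passes a negative size when oparg >= 2.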

brandtbucher · 2023-01-17T18:21:28Z

Tools/cases_generator/generate_cases.py

+            # NOTE: MOVE_ITEMS() does not actually exist.
+            # The only supported output array forms are:
+            # - unused[...]
+            # - X[...] where X[...] matches an i99nput array form


At first I thought this was something like "i18n" or "a11y". ;)

Suggested change

-            # - X[...] where X[...] matches an i99nput array form
+            # - X[...] where X[...] matches an input array form

brandtbucher · 2023-01-17T18:22:17Z

Tools/cases_generator/generate_cases.py

+            for i in range(len(ieffects)):
+                ieffect = ieffects[i]


;)

Suggested change

-            for i in range(len(ieffects)):
-                ieffect = ieffects[i]
+            for i, ieffect in enumerate(ieffects):

brandtbucher · 2023-01-17T18:23:15Z

Tools/cases_generator/generate_cases.py

+            for i in range(len(oeffects)):
+                oeffect = oeffects[i]


Suggested change

-            for i in range(len(oeffects)):
-                oeffect = oeffects[i]
+            for i, oeffect in enumerate(oeffects):

brandtbucher · 2023-01-17T18:24:32Z

Tools/cases_generator/generate_cases.py

+            oeffects = list(reversed(self.output_effects))
+            for i in range(len(oeffects)):
+                oeffect = oeffects[i]
+                if oeffect.name in self.unmoved_names:
+                    continue
+                osize = string_effect_size(list_effect_size(oeffects[:i+1]))
+                if oeffect.size:
+                    dst = StackEffect(f"&PEEK({osize})", "PyObject **")
+                else:
+                    dst = StackEffect(f"PEEK({osize})", "")


It seems like this whole block is identical to the logic for input effects. Maybe factor out into a function?

I'm not excited about having to pick a name for a helper method that's five lines and takes three parameters (i, ieffect, ieffects).

Ah, I missed the differences at the top and bottom of the for loops (I thought the entire loop could be factored out). No need to do this then.

gvanrossum · 2023-01-17T21:23:35Z

If the buildbot tests pass I'll just merge it, unless you strenuously object.

bedevere-bot · 2023-01-17T21:23:52Z

🤖 New build scheduled with the buildbot fleet by @gvanrossum for commit c7edea2 🤖

If you want to schedule another build, you need to add the :hammer: test-with-buildbots label again.

gvanrossum · 2023-01-17T23:54:08Z

I'm just gonna land now. The "ARM64 Windows PR" buildbot is red, but it's been red for a week.

gvanrossum added 16 commits January 6, 2023 09:07

Tweak 'This file is generated...' header

4b17476

Start on array stack effects

69f04e4

Simple input array effects work

b9bd7e3

Ignore Context when comparing Nodes

bb53c44

Improve stack effect counting (TODO: use everywhere)

4a15c53

Fix stack effect counting in most places

0a1e9a2

Fix TODO

b023236

Fix opcode metadata for array stack effects

9ec91e1

Array-ize LIST_APPEND

af785de

Array-ize SET_ADD

4fddf2b

Array-ize LIST_EXTEND

ac22115

Array-ize SET_UPDATE

6c83be0

Array-ize BUILD_TUPLE

c7999dd

Array-ize BUILD_LIST

a7aa425

Use ERROR_IF() in BUILD_STRING

11f6a58

Merge remote-tracking branch 'origin/main' into cases-array

597fd69

gvanrossum added skip news 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section labels Jan 10, 2023

gvanrossum requested a review from brandtbucher January 10, 2023 05:35

bedevere-bot added awaiting core review and removed 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section labels Jan 10, 2023

bedevere-bot mentioned this pull request Jan 10, 2023

Generate the interpreter #98831

Closed

brandtbucher reviewed Jan 12, 2023

gvanrossum added 6 commits January 11, 2023 21:57

Apply most suggestions from code review

abff4ae

Accept more general dimensions

a02a9eb

Fix LIST_EXTEND

2bcd90f

Fix LIST_APPEND, SET_ADD, SET_UPDATE

c579a40

Merge remote-tracking branch 'origin/main' into cases-array

21ac256

Array-ize BUILD_SET

70e983d

gvanrossum added 3 commits January 13, 2023 17:42

Array-ize BUILD_MAP

827481e

Array-ize BUILD_CONST_KEY_MAP

aa69c55

Add comment warning that RAISE_VARARGS needs to stay legacy

6cbabc7

brandtbucher reviewed Jan 17, 2023

gvanrossum added 7 commits January 17, 2023 12:22

Look for 'oparg' in the whole instruction

c7cfe9a

Add docstring for effect_size()

a1a4688

Parenthesize certain symbolic effects

16a2d99

Anything that's not 'oparg' or 'oparg*2' will be parenthesized.

Fix typo in comment

1ef085d

Swap args of imaginary MOVE_ITEMS() macro

1bf42ce

Tune for loops

c2be1ab

Disallow empty dimension

c7edea2

gvanrossum added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Jan 17, 2023

bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Jan 17, 2023

brandtbucher approved these changes Jan 17, 2023

bedevere-bot added awaiting merge and removed awaiting core review labels Jan 17, 2023

gvanrossum merged commit 80e3e34 into python:main Jan 17, 2023

bedevere-bot removed the awaiting merge label Jan 17, 2023

gvanrossum deleted the cases-array branch January 17, 2023 23:59

kumaraditya303 mentioned this pull request Feb 16, 2023

Set: BUILD_SET opcode can be failed with segfault #101952

Closed
