Skip to content

GH-98831: Implement array support in cases generator #100912

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 32 commits into from
Jan 17, 2023

Conversation

gvanrossum
Copy link
Member

@gvanrossum gvanrossum commented Jan 10, 2023

There are some issues, but as Mark said we needn't attempt to get the generator to do everything (at least not right now).

In particular, as I mentioned in the meeting, there's no full support for output arrays (but maybe we'll never need that).

While I don't envision super-instructions with array operands, once macros become more popular they may well need arrays -- that's currently not supported.

We might want to consider a macro to DECREF an array of object pointers; this is currently done in an ad-hoc fashion.

@gvanrossum gvanrossum added skip news 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section labels Jan 10, 2023
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @gvanrossum for commit 597fd69 🤖

If you want to schedule another build, you need to add the :hammer: test-with-refleak-buildbots label again.

@bedevere-bot bedevere-bot added awaiting core review and removed 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section labels Jan 10, 2023
Copy link
Member

@brandtbucher brandtbucher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My computer is forcing a restart on me in two minutes, so here's what I've got. I still need to finish looking at test_generator.py and generate_cases.py tomorrow (there's a lot of new logic that I don't quite follow yet).

@gvanrossum
Copy link
Member Author

@brandtbucher This should be ready for your review now. I don't intend to convert the remaining opcodes that have an array stack effect -- they either have an output array (for which this logic doesn't really work) or they are one of the main CALL specializations (wow there are a lot of those).

Let me know if I can explicate something for you.

Copy link
Member

@brandtbucher brandtbucher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your patience. A few more things (mostly minor):

Comment on lines 48 to 49
[LIST_APPEND] = { -1, -1, DIR_NONE, DIR_NONE, DIR_NONE, true, INSTR_FMT_IX },
[SET_ADD] = { -1, -1, DIR_NONE, DIR_NONE, DIR_NONE, true, INSTR_FMT_IX },
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All four of the changed opcodes in this file should be INSTR_FMT_IB, not INSTR_FMT_IX.

I'm guessing you search for "oparg" in the body of the instruction, but I think we need to now search for it in array dimensions too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Fixed.

Comment on lines 240 to 253
if not (dim := self.dimension()):
raise self.make_syntax_error("Expected dimension")
self.require(lx.RBRACKET)
return StackEffect(tkn.text, "PyObject **", dim.text.strip())
else:
return StackEffect(tkn.text)

@contextual
def dimension(self) -> Dimension | None:
tokens: list[lx.Token] = []
while (tkn := self.peek()) and tkn.kind != lx.RBRACKET:
tokens.append(tkn)
self.next()
return Dimension(lx.to_text(tokens).strip())
Copy link
Member

@brandtbucher brandtbucher Jan 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpicks:

dimension can't return None, right?

Suggested change
if not (dim := self.dimension()):
raise self.make_syntax_error("Expected dimension")
self.require(lx.RBRACKET)
return StackEffect(tkn.text, "PyObject **", dim.text.strip())
else:
return StackEffect(tkn.text)
@contextual
def dimension(self) -> Dimension | None:
tokens: list[lx.Token] = []
while (tkn := self.peek()) and tkn.kind != lx.RBRACKET:
tokens.append(tkn)
self.next()
return Dimension(lx.to_text(tokens).strip())
dim = self.dimension()
self.require(lx.RBRACKET)
return StackEffect(tkn.text, "PyObject **", dim.text.strip())
else:
return StackEffect(tkn.text)
@contextual
def dimension(self) -> Dimension:
tokens: list[lx.Token] = []
while (tkn := self.peek()) and tkn.kind != lx.RBRACKET:
tokens.append(tkn)
self.next()
return Dimension(lx.to_text(tokens).strip())

Also, I think the current definition of dimension allows an empty dimension (like stuff[]). Probably worth guarding against.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, but the correct fix to the code is for dimension() to check that it's got at least one token and return None otherwise.

@@ -4,6 +4,29 @@
import tempfile

import generate_cases
from parser import StackEffect
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question about this file: is it supposed to run as part of our test suite, or is the workflow just to use pytest or something locally?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now I just run it locally after I make changes.

I don't think our test suite supports pytest, and I don't want to make it a separate GitHub workflow. Eventually we'll have to rewrite it to use unittest. But not today.

Comment on lines +365 to +366
STACK_SHRINK(oparg*2);
STACK_SHRINK(2);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sort of surprised we don't combine these. Not that it matters much, though.

# The only supported output array forms are:
# - unused[...]
# - X[...] where X[...] matches an i99nput array form
self.emit(f"MOVE_ITEMS({src.name}, {dst.name}, {src.size});")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels backwards to me. My mental model is the memcpy argument order:

Suggested change
self.emit(f"MOVE_ITEMS({src.name}, {dst.name}, {src.size});")
self.emit(f"MOVE_ITEMS({dst.name}, {src.name}, {src.size});")

Comment on lines +119 to +126
if isym and isym != osym:
self.emit(f"STACK_SHRINK({isym});")
if diff < 0:
self.emit(f"STACK_SHRINK({-diff});")
if diff > 0:
self.emit(f"STACK_GROW({diff});")
elif diff < 0:
self.emit(f"STACK_SHRINK({-diff});")
if osym and osym != isym:
self.emit(f"STACK_GROW({osym});")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comment earlier. It probably doesn't matter much, but it feels like we should be combining these into a single STACK_GROW/STACK_SHRINK per instruction.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about that. There are some problematic theoretical cases where the generator would need to understand the symbolic sizes to be able to tell if it's a net grow or shrink operation, and it can't first do a grow/shrink of the numeric part only, since you could have e.g.

inst(FOO, (a, b[oparg] -- c, d))

which shrinks if oparg == 0 but grows for oparg >= 2. We cannot safely do this as grow(1); shrink(oparg) or shrink(oparg); grow(1), nor as grow(1 - oparg) nor as shrink(oparg - 1). It must be done as shrink(1 + oparg); grow(2), or use STACK_ADJUST(), but the latter doesn't do bounds checking in debug mode. The best way is shrink(oparg); grow(1), but I left it even simpler, as shrink(1); shrink(oparg); grow(2).

Honestly, the case analysis got hairy enough that I decided that this was okay to leave to future generations; the compiler will probably do a fine job optimizing shrink(a); shrink(b) to shrink(a + b).

# NOTE: MOVE_ITEMS() does not actually exist.
# The only supported output array forms are:
# - unused[...]
# - X[...] where X[...] matches an i99nput array form
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At first I thought this was something like "i18n" or "a11y". ;)

Suggested change
# - X[...] where X[...] matches an i99nput array form
# - X[...] where X[...] matches an input array form

Comment on lines 257 to 258
for i in range(len(ieffects)):
ieffect = ieffects[i]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

;)

Suggested change
for i in range(len(ieffects)):
ieffect = ieffects[i]
for i, ieffect in enumerate(ieffects):

Comment on lines 291 to 292
for i in range(len(oeffects)):
oeffect = oeffects[i]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for i in range(len(oeffects)):
oeffect = oeffects[i]
for i, oeffect in enumerate(oeffects):

Comment on lines 290 to 299
oeffects = list(reversed(self.output_effects))
for i in range(len(oeffects)):
oeffect = oeffects[i]
if oeffect.name in self.unmoved_names:
continue
osize = string_effect_size(list_effect_size(oeffects[:i+1]))
if oeffect.size:
dst = StackEffect(f"&PEEK({osize})", "PyObject **")
else:
dst = StackEffect(f"PEEK({osize})", "")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like this whole block is identical to the logic for input effects. Maybe factor out into a function?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not excited about having to pick a name for a helper method that's five lines and takes three parameters (i, ieffect, ieffects).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I missed the differences at the top and bottom of the for loops (I thought the entire loop could be factored out). No need to do this then.

@gvanrossum
Copy link
Member Author

If the buildbot tests pass I'll just merge it, unless you strenuously object.

@gvanrossum gvanrossum added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Jan 17, 2023
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @gvanrossum for commit c7edea2 🤖

If you want to schedule another build, you need to add the :hammer: test-with-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Jan 17, 2023
@gvanrossum
Copy link
Member Author

I'm just gonna land now. The "ARM64 Windows PR" buildbot is read, but it's been red for a week.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants