
bpo-44187: Quickening infrastructure #26264


Merged

merged 30 commits into python:main on Jun 7, 2021

Conversation

markshannon
Member

@markshannon markshannon commented May 20, 2021

First step toward implementing PEP 659.

https://bugs.python.org/issue44187

@markshannon markshannon requested a review from a team as a code owner May 20, 2021 11:50
@bratao

bratao commented May 22, 2021

@markshannon one thing I noticed is the introduction of the HotPy prefix, which does not exist elsewhere in CPython. I understand that this is due to the heritage of this code, based on some previous projects.
But wouldn't it be better to simply call it PyCacheEntry, for example? I can see developers trying to make sense of the Hot prefix, as it has a meaning in JITs.

Member

@gvanrossum gvanrossum left a comment

Here's my first round of comments; I haven't gotten to specialize.c yet, but I figured I'd send this in case I don't get to that tonight (which is likely).

Member

@gvanrossum gvanrossum left a comment

Okay, I got through everything this time, and I think I understand it. Some suggestions to make it easier to understand for new readers.

Comment on lines 40 to 42
static uint8_t adaptive[256] = { 0 };

static uint8_t cache_requirements[256] = { 0 };
Member

These variables could use a comment explaining their purpose. (Also, maybe we should plan to generate these from info added to opcode.py, like opcode_targets.h?)
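As an illustration of the generation idea floated here, a single spec could drive both 256-entry tables. Everything in this sketch is hypothetical: the opcode numbers, adaptive-variant numbers, and entry counts are invented for illustration, not CPython's real values.

```python
# Hypothetical sketch: derive both 256-entry tables from one spec,
# in the spirit of how opcode_targets.h is generated from opcode.py.
# All numbers below are invented for illustration.
SPECIALIZABLE = {
    # base opcode: (adaptive variant opcode, cache entries required)
    106: (180, 2),
    116: (181, 1),
}

def build_tables(spec):
    """Return (adaptive, cache_requirements) as 256-entry lists."""
    adaptive = [0] * 256
    cache_requirements = [0] * 256
    for opcode, (adaptive_opcode, n_entries) in spec.items():
        adaptive[opcode] = adaptive_opcode
        cache_requirements[opcode] = n_entries
    return adaptive, cache_requirements
```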

cache_offset = i/2;
}
else if (oparg > 255) {
/* Cannot access required cache_offset */
Member

Maybe in some kind of debug mode it would be nice to report whether this happens at all? If we see this frequently we need to change the strategy. OTOH maybe we never expect it and we could put assert(0) here???

/* Cannot access required cache_offset */
continue;
}
cache_offset += need;
Member

Looks like this will over-count if there are eligible opcodes with an EXTENDED_ARG prefix.
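The EXTENDED_ARG concern can be made concrete with a small model of the counting pass. This is a sketch, not the PR's C code: the `cache_requirements` mapping and the opcode numbers other than EXTENDED_ARG are stand-ins, and it assumes (as the quoted diff suggests) that an instruction whose effective oparg exceeds 255 cannot address its cache and must be skipped rather than counted.

```python
EXTENDED_ARG = 144  # CPython's EXTENDED_ARG opcode number

def entries_needed(instructions, cache_requirements):
    """Count cache entries for a list of (opcode, oparg) code units.

    An instruction carrying an EXTENDED_ARG prefix has an effective
    oparg > 255, so it cannot index its cache entries; counting it
    anyway is the over-count described above.
    """
    count = 0
    prefixed = False
    for opcode, oparg in instructions:
        if opcode == EXTENDED_ARG:
            prefixed = True
            continue
        if not prefixed and oparg <= 255:
            count += cache_requirements.get(opcode, 0)
        prefixed = False
    return count + 1  # one extra entry holds the count itself
```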

Python/ceval.c Outdated
@@ -1343,6 +1343,14 @@ eval_frame_handle_pending(PyThreadState *tstate)
#define JUMPTO(x) (next_instr = first_instr + (x))
#define JUMPBY(x) (next_instr += (x))

/* Get opcode and opcode from original instructions, not quickened form. */
Member

Suggested change
/* Get opcode and opcode from original instructions, not quickened form. */
/* Get opcode and oparg from original instructions, not quickened form. */

return &last_cache_plus_one[-1-n].entry;
}

/* Following two functions determine the index of a cache entry from the
Member

This comment doesn't seem correct - they take the index and return oparg or offset.
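For readers puzzling over the negative indexing in `last_cache_plus_one[-1-n]`, here is a toy model of the layout in Python. The list layout and names are illustrative only; the real code works on a single C allocation holding cache entries followed by instructions.

```python
def make_quickened(cache_entries, instructions):
    """Toy layout: cache entries first, then instructions."""
    return list(cache_entries) + list(instructions)

def get_cache_entry(quickened, first_instr_index, n):
    """Return cache entry n, counting backwards from just before the
    first instruction -- a Python analogue of the C expression
    last_cache_plus_one[-1 - n]."""
    return quickened[first_instr_index - 1 - n]
```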

@markshannon
Member Author

@gvanrossum , @iritkatriel I think I've addressed all your comments.
Specifically, I've:

  • Factored out the common code in optimize and entries_needed
  • Made sure that previous_opcode is correct to avoid super-instructions stomping on adaptive ones
  • Added more checks for EXTENDED_ARG to avoid wasting memory
  • Fixed up lots of comments.
  • Added an extended comment showing the layout of the quickening data

Member

@gvanrossum gvanrossum left a comment

Looking good. Maybe we should ask Dino to have a look? (Between this PR and PEP 659 he should have enough to judge the design, right?)

Also, please look at the compiler warnings found by GitHub Actions for Windows (x64).

<instr 0> <instr 1> <instr 2> <instr 3> <--- co->co_first_instr
<instr 4> <instr 5> <instr 6> <instr 7>
...
<instr N-1>
Member

Maybe M-1? The number of cache entries doesn't have to match the number of instructions.

}
previous_opcode = opcode;
}
return cache_offset+1;
Member

Suggested change
return cache_offset+1;
return cache_offset + 1; // One extra for the count entry

@markshannon
Member Author

I've been implementing some specialization of LOAD_ATTR in another branch, and from that it became clear that two more things were needed:

  1. We need to use the next index (index + 1) for efficiency as that is what is available in the interpreter.
  2. We need to initialize the adaptive cache for adaptive instructions (or they crash).

I've implemented those in the last three commits.
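The second point above can be sketched as follows. The field names (`original_oparg`, `counter`) are guesses for illustration, not CPython's actual struct; the point is just that an adaptive instruction reads its cache entry on every execution, so quickening must write valid initial values before the instruction first runs.

```python
from dataclasses import dataclass

@dataclass
class AdaptiveCacheEntry:
    # Illustrative fields: the original oparg (since the quickened
    # instruction's oparg now holds a cache index) and a counter the
    # adaptive instruction consults before trying to specialize.
    original_oparg: int
    counter: int

def quicken_adaptive(cache, cache_index, original_oparg, initial_counter):
    # Without this initialization, the first execution of the adaptive
    # instruction would read uninitialized memory -- the crash noted above.
    cache[cache_index] = AdaptiveCacheEntry(original_oparg, initial_counter)
```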

gvanrossum added a commit to faster-cpython/tools that referenced this pull request Jun 1, 2021
}
Py_ssize_t size = PyBytes_GET_SIZE(code->co_code);
int instr_count = (int)(size/sizeof(_Py_CODEUNIT));
if (instr_count > MAX_SIZE_TO_QUICKEN) {
Member

Would it be possible instead to quicken the first 5000 instructions and then exit? (That would avoid a cliff where a minor change in the code tips it over the no-optimization limit and makes it run much slower.)


There's a memory cost to creating a duplicate code array, so it would risk wasting memory disproportionately.

@markshannon markshannon added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Jun 6, 2021
@bedevere-bot

🤖 New build scheduled with the buildbot fleet by @markshannon for commit ec50298 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Jun 6, 2021
@markshannon markshannon added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Jun 7, 2021
@bedevere-bot

🤖 New build scheduled with the buildbot fleet by @markshannon for commit ab3a30b 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Jun 7, 2021
@markshannon
Member Author

Windows tests were failing before this PR.
We seem to be right on the edge of running out of stack on Windows when up against the recursion limit.

@markshannon markshannon merged commit 001eb52 into python:main Jun 7, 2021
@markshannon markshannon deleted the quickening-infrastructure branch June 7, 2021 17:56
@@ -73,9 +73,10 @@ def get_pooled_int(value):
alloc_deltas = [0] * repcount
fd_deltas = [0] * repcount
getallocatedblocks = sys.getallocatedblocks
getallocatedblocks = sys.getallocatedblocks
Member

why?

Member Author

Typo. #26624
