Skip to content

bpo-33387: Simplify bytecodes for try-finally, try-except and with blocks. #6641

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Nov 21, 2019

Conversation

markshannon
Copy link
Member

@markshannon markshannon commented Apr 29, 2018

Quoting the bpo issue:

The six complex bytecodes currently used for implementing 'with' and 'try' 
statements can be replaced with just two simpler bytecodes.
The six bytecodes are WITH_CLEANUP_START, WITH_CLEANUP_FINISH,
BEGIN_FINALLY, END_FINALLY, CALL_FINALLY and POP_FINALLY.
They can be replaced with RERAISE and WITH_EXCEPT_START.

See https://bugs.python.org/issue32949 for more details of the new bytecodes and 
how they are used in the 'with' statement.

The try-finally statement can be implemented broadly as 
   SETUP_FINALLY except
   try-body
   POP_BLOCK
   finally-body
   JUMP exit
except:
   finally-body
exit:

The implementation of with statements is a bit more complex. The with statement

   with EXPR as VAR:
       BLOCK

is implemented as:

        <code for EXPR>
        SETUP_WITH  ex
        <code to store to VAR> or POP_TOP
        <code for BLOCK>
        LOAD_CONST (None, None, None)
        CALL_FUNCTION_EX 0
        JUMP_FORWARD  done
    ex:  WITH_EXCEPT_START (calls EXPR.__exit__)
        POP_JUMP_IF_TRUE t:
        RERAISE
    t:  POP_TOP * 3 (remove exception from stack)
        POP_EXCEPT
        POP_TOP
    done:

https://bugs.python.org/issue33387

@markshannon
Copy link
Member Author

markshannon commented May 19, 2018

For the record, here is the result of pyperformance.
The appears to be a mean speedup of about 1%, but I doubt that it is significant.

Performance version: 0.6.2
Report on Linux-4.15.0-20-generic-x86_64-with-debian-buster-sid
Number of logical CPUs: 8
Start date: 2018-05-16 13:12:14.879026
End date: 2018-05-16 13:38:49.462145

### 2to3 ###
Mean +- std dev: 672 ms +- 42 ms -> 646 ms +- 8 ms: 1.04x faster
Significant (t=4.57)

### chameleon ###
Mean +- std dev: 25.1 ms +- 0.4 ms -> 23.9 ms +- 0.4 ms: 1.05x faster
Significant (t=16.30)

### chaos ###
Mean +- std dev: 257 ms +- 3 ms -> 253 ms +- 3 ms: 1.01x faster
Not significant

### crypto_pyaes ###
Mean +- std dev: 213 ms +- 2 ms -> 214 ms +- 2 ms: 1.01x slower
Not significant

### deltablue ###
Mean +- std dev: 15.6 ms +- 0.2 ms -> 15.2 ms +- 0.1 ms: 1.02x faster
Significant (t=11.53)

### django_template ###
Mean +- std dev: 312 ms +- 4 ms -> 331 ms +- 3 ms: 1.06x slower
Significant (t=-29.21)

### dulwich_log ###
Mean +- std dev: 151 ms +- 3 ms -> 148 ms +- 1 ms: 1.02x faster
Significant (t=8.85)

### fannkuch ###
Mean +- std dev: 929 ms +- 11 ms -> 933 ms +- 11 ms: 1.00x slower
Not significant

### float ###
Mean +- std dev: 214 ms +- 4 ms -> 211 ms +- 6 ms: 1.01x faster
Not significant

### genshi_text ###
Mean +- std dev: 81.7 ms +- 1.4 ms -> 75.8 ms +- 1.4 ms: 1.08x faster
Significant (t=23.14)

### genshi_xml ###
Mean +- std dev: 173 ms +- 3 ms -> 169 ms +- 2 ms: 1.02x faster
Significant (t=8.36)

### go ###
Mean +- std dev: 541 ms +- 6 ms -> 529 ms +- 4 ms: 1.02x faster
Significant (t=12.50)

### hexiom ###
Mean +- std dev: 20.0 ms +- 0.2 ms -> 19.3 ms +- 0.2 ms: 1.04x faster
Significant (t=19.28)

### html5lib ###
Mean +- std dev: 193 ms +- 6 ms -> 197 ms +- 7 ms: 1.02x slower
Significant (t=-3.50)

### json_dumps ###
Mean +- std dev: 27.2 ms +- 0.7 ms -> 26.1 ms +- 0.5 ms: 1.04x faster
Significant (t=8.78)

### json_loads ###
Mean +- std dev: 52.7 us +- 0.5 us -> 51.7 us +- 0.5 us: 1.02x faster
Not significant

### logging_format ###
Mean +- std dev: 25.3 us +- 0.3 us -> 27.2 us +- 0.3 us: 1.07x slower
Significant (t=-33.07)

### logging_silent ###
Mean +- std dev: 361 ns +- 4 ns -> 362 ns +- 6 ns: 1.00x slower
Not significant

### logging_simple ###
Mean +- std dev: 22.4 us +- 0.3 us -> 22.7 us +- 0.3 us: 1.01x slower
Not significant

### mako ###
Mean +- std dev: 35.9 ms +- 0.5 ms -> 35.2 ms +- 0.7 ms: 1.02x faster
Significant (t=6.54)

### meteor_contest ###
Mean +- std dev: 186 ms +- 6 ms -> 180 ms +- 2 ms: 1.04x faster
Significant (t=8.92)

### nbody ###
Mean +- std dev: 224 ms +- 6 ms -> 218 ms +- 4 ms: 1.03x faster
Significant (t=6.34)

### nqueens ###
Mean +- std dev: 231 ms +- 7 ms -> 224 ms +- 3 ms: 1.03x faster
Significant (t=6.80)

### pathlib ###
Mean +- std dev: 50.7 ms +- 1.6 ms -> 46.1 ms +- 0.8 ms: 1.10x faster
Significant (t=19.91)

### pickle ###
Mean +- std dev: 22.3 us +- 0.6 us -> 20.9 us +- 0.5 us: 1.07x faster
Significant (t=14.70)

### pickle_dict ###
Mean +- std dev: 68.8 us +- 6.0 us -> 64.0 us +- 6.2 us: 1.07x faster
Significant (t=4.30)

### pickle_list ###
Mean +- std dev: 8.57 us +- 0.15 us -> 8.64 us +- 0.64 us: 1.01x slower
Not significant

### pickle_pure_python ###
Mean +- std dev: 1.09 ms +- 0.02 ms -> 1.07 ms +- 0.02 ms: 1.02x faster
Not significant

### pidigits ###
Mean +- std dev: 272 ms +- 2 ms -> 270 ms +- 0 ms: 1.01x faster
Not significant

### python_startup ###
Mean +- std dev: 18.9 ms +- 2.0 ms -> 19.6 ms +- 2.4 ms: 1.04x slower
Significant (t=-3.48)

### python_startup_no_site ###
Mean +- std dev: 14.1 ms +- 1.9 ms -> 11.0 ms +- 1.0 ms: 1.28x faster
Significant (t=20.04)

### raytrace ###
Mean +- std dev: 1.14 sec +- 0.01 sec -> 1.16 sec +- 0.10 sec: 1.02x slower
Not significant

### regex_compile ###
Mean +- std dev: 389 ms +- 9 ms -> 375 ms +- 6 ms: 1.04x faster
Significant (t=10.40)

### regex_dna ###
Mean +- std dev: 265 ms +- 2 ms -> 272 ms +- 14 ms: 1.02x slower
Significant (t=-3.60)

### regex_effbot ###
Mean +- std dev: 4.92 ms +- 0.16 ms -> 5.71 ms +- 0.73 ms: 1.16x slower
Significant (t=-8.23)

### regex_v8 ###
Mean +- std dev: 40.2 ms +- 0.3 ms -> 40.3 ms +- 2.4 ms: 1.00x slower
Not significant

### richards ###
Mean +- std dev: 163 ms +- 6 ms -> 158 ms +- 3 ms: 1.03x faster
Significant (t=5.16)

### scimark_fft ###
Mean +- std dev: 655 ms +- 15 ms -> 669 ms +- 29 ms: 1.02x slower
Significant (t=-3.18)

### scimark_lu ###
Mean +- std dev: 364 ms +- 19 ms -> 358 ms +- 18 ms: 1.02x faster
Not significant

### scimark_monte_carlo ###
Mean +- std dev: 247 ms +- 9 ms -> 230 ms +- 8 ms: 1.07x faster
Significant (t=10.69)

### scimark_sor ###
Mean +- std dev: 468 ms +- 8 ms -> 440 ms +- 13 ms: 1.06x faster
Significant (t=14.28)

### scimark_sparse_mat_mult ###
Mean +- std dev: 8.34 ms +- 0.09 ms -> 8.17 ms +- 0.42 ms: 1.02x faster
Significant (t=3.04)

### spectral_norm ###
Mean +- std dev: 250 ms +- 4 ms -> 250 ms +- 9 ms: 1.00x slower
Not significant

### sqlalchemy_declarative ###
Mean +- std dev: 279 ms +- 4 ms -> 279 ms +- 7 ms: 1.00x slower
Not significant

### sqlalchemy_imperative ###
Mean +- std dev: 52.1 ms +- 1.7 ms -> 52.4 ms +- 1.8 ms: 1.01x slower
Not significant

### sqlite_synth ###
Mean +- std dev: 8.40 us +- 0.23 us -> 7.37 us +- 0.23 us: 1.14x faster
Significant (t=24.96)

### sympy_expand ###
Mean +- std dev: 872 ms +- 8 ms -> 925 ms +- 11 ms: 1.06x slower
Significant (t=-30.34)

### sympy_integrate ###
Mean +- std dev: 38.7 ms +- 0.4 ms -> 41.0 ms +- 0.8 ms: 1.06x slower
Significant (t=-19.24)

### sympy_str ###
Mean +- std dev: 391 ms +- 5 ms -> 419 ms +- 7 ms: 1.07x slower
Significant (t=-25.24)

### sympy_sum ###
Mean +- std dev: 184 ms +- 2 ms -> 189 ms +- 3 ms: 1.03x slower
Significant (t=-11.19)

### telco ###
Mean +- std dev: 18.9 ms +- 0.6 ms -> 19.2 ms +- 0.6 ms: 1.02x slower
Not significant

### tornado_http ###
Mean +- std dev: 350 ms +- 4 ms -> 348 ms +- 4 ms: 1.01x faster
Not significant

### unpack_sequence ###
Mean +- std dev: 113 ns +- 1 ns -> 106 ns +- 1 ns: 1.07x faster
Significant (t=48.50)

### unpickle ###
Mean +- std dev: 26.0 us +- 0.2 us -> 27.1 us +- 2.3 us: 1.04x slower
Significant (t=-3.61)

### unpickle_list ###
Mean +- std dev: 7.40 us +- 0.06 us -> 8.00 us +- 0.40 us: 1.08x slower
Significant (t=-11.46)

### unpickle_pure_python ###
Mean +- std dev: 892 us +- 16 us -> 871 us +- 46 us: 1.02x faster
Significant (t=3.43)

### xml_etree_generate ###
Mean +- std dev: 255 ms +- 5 ms -> 244 ms +- 4 ms: 1.04x faster
Significant (t=13.63)

### xml_etree_iterparse ###
Mean +- std dev: 202 ms +- 6 ms -> 200 ms +- 6 ms: 1.01x faster
Not significant

### xml_etree_parse ###
Mean +- std dev: 297 ms +- 10 ms -> 314 ms +- 17 ms: 1.06x slower
Significant (t=-6.36)

### xml_etree_process ###
Mean +- std dev: 214 ms +- 3 ms -> 207 ms +- 4 ms: 1.03x faster
Significant (t=10.29)

@markshannon markshannon closed this Jul 8, 2018
@markshannon markshannon reopened this Jul 8, 2018
@markshannon markshannon force-pushed the python-better-try-finally branch from 0dfe44c to d8b54c9 Compare July 8, 2018 11:55
@methane
Copy link
Member

methane commented Jul 11, 2018

@markshannon FYI, perf compare_to old.json new.json -G (-G option) makes output more compact and readable.

@markshannon markshannon force-pushed the python-better-try-finally branch from d8b54c9 to 3efb201 Compare September 9, 2018 11:45
@markshannon markshannon force-pushed the python-better-try-finally branch from 3efb201 to b16b8e8 Compare September 19, 2018 13:18
@markshannon
Copy link
Member Author

All comments addressed and rebased to fix merge conflict with ddd1949

@markshannon markshannon force-pushed the python-better-try-finally branch from b16b8e8 to d64244b Compare May 6, 2019 15:14
@markshannon
Copy link
Member Author

Rebased again

@markshannon markshannon force-pushed the python-better-try-finally branch from d64244b to 6f916e9 Compare May 18, 2019 14:24
@markshannon
Copy link
Member Author

Rebased yet again.
I'd like to get the merged, so I don't have to keep rebasing.

@markshannon markshannon force-pushed the python-better-try-finally branch 2 times, most recently from b9c698e to cdb2581 Compare June 4, 2019 14:18
@markshannon
Copy link
Member Author

Rebased yet again (twice)

Copy link
Member

@brettcannon brettcannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly stylistic stuff and requests for comments to help explain the code in some spots.


Calls the function in position 7 on the stack with the top three
items on the stack as arguments.
Used to implement the call ``exit(*exc_info())`` when an exception
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was initially misleading to me as my brain thought of sys.exit() instead of context_manager.__exit__().

Suggested change
Used to implement the call ``exit(*exc_info())`` when an exception
Used to implement the call ``context_manager.__exit__(*exc_info())`` when an exception

} codetracker;

static void
reset(codetracker *tracker)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps a comment as to why some fields don't require resetting while others do? And maybe why the fields in init_codetracker() differ from those in reset()? Basically my brain is wondering why e.g. start_line is initialized but not reset, and vice-versa.

There's also a naming discrepancy between reset() and init_codetracker() in that one includes codetracker and the other does not.

}
while (tracker->offset < tracker->lnotab_len &&
tracker->addr >= tracker->line_addr + tracker->lnotab[tracker->offset]) {
tracker->line_addr = tracker->line_addr + tracker->lnotab[tracker->offset];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
tracker->line_addr = tracker->line_addr + tracker->lnotab[tracker->offset];
tracker->line_addr += tracker->lnotab[tracker->offset];

return op == SETUP_FINALLY && !is_try_except(op, target_op) && !is_async_for(op, target_op);
}

#define TRY_EXCEPT 250
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect to find this value in opcode.h but I'm not, so why the new definition of an opcode here? (A comment would be enough for me to clarify why this is here.)

}
else if (is_try_finally(op, target_op)) {
int addr = tracker->addr;
// Skip over duplicate finally blocks if line is after body.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Skip over duplicate finally blocks if line is after body.
// Skip over duplicate 'finally' blocks if line is after body.

PyErr_SetString(PyExc_ValueError,
"can't jump into the middle of a block");
return -1;
/* Validate change of block stack */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/* Validate change of block stack */
/* Validate change of block stack. */

}
break;
}
/* Check for illegal jumps out of finally or except blocks */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/* Check for illegal jumps out of finally or except blocks */
/* Check for illegal jumps out of finally or except blocks. */

{
/* Pop the exit function. */
delta++;
/* Unwind block stack */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/* Unwind block stack */
/* Unwind block stack. */

Python/compile.c Outdated
@@ -2587,7 +2620,7 @@ compiler_for(struct compiler *c, stmt_ty s)
if (start == NULL || end == NULL || cleanup == NULL)
return 0;

if (!compiler_push_fblock(c, FOR_LOOP, start, end))
if (!compiler_push_fblock(c, FOR_LOOP, start, end, NULL))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're up for it, would you mind fixing these if blocks to use {}? Otherwise the style is shifting and I would rather promote the new style than the old one.

Python/compile.c Outdated
compiler_use_next_block(c, finally);
if (!compiler_push_fblock(c, FINALLY_END, finally, NULL))

/* End of body; start the cleanup */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/* End of body; start the cleanup */
/* End of body; start the cleanup. */

@markshannon
Copy link
Member Author

All review comments addressed.

@brettcannon
Copy link
Member

FYI no need to wait on me on an approving review.

@markshannon markshannon force-pushed the python-better-try-finally branch from db08c6d to c611d1b Compare June 13, 2019 21:27
@markshannon
Copy link
Member Author

markshannon commented Jun 14, 2019

Rebased yet again...

Time to press that merge button ⬇️ 🙂

@brettcannon
Copy link
Member

@markshannon is there anything preventing you from doing the merge at this point?

@markshannon markshannon force-pushed the python-better-try-finally branch from c611d1b to 1bd40b8 Compare June 23, 2019 10:00
@markshannon
Copy link
Member Author

@brettcannon No technical obstacles, but I think that for large PRs like this one, it is best if someone other than the author does the merge.
Partly to discourage things being merged without proper review, and partly to reduce any acrimony should changes need to be reverted.
Also, merging is an unambiguous signal that the PR has been approved. Comments like "LGTM" and clicking the "approved" button are open to interpretation.

@brettcannon
Copy link
Member

@markshannon Are you choosing to skip my doc change suggestions? (I'm asking because you said all the comments were addressed but those were left open/unresolved, so I just wanted to check the PR is in its final state sans merge conflict.)

As for clicking "approved" on a review, that's as close as people typically get for approval on a core dev's PR. Historically core devs have done their own merges as they will be the ones to handle reversions, etc. IOW if you want to address my remaining doc comments and are willing to handle reversion duties, etc., then I can click the merge button for you.

@markshannon markshannon force-pushed the python-better-try-finally branch from 1bd40b8 to 385b2df Compare July 14, 2019 11:27
@markshannon markshannon force-pushed the python-better-try-finally branch 2 times, most recently from 54f381c to 23cfe89 Compare September 12, 2019 10:16
@markshannon markshannon force-pushed the python-better-try-finally branch from 23cfe89 to 62a3625 Compare October 7, 2019 11:13
@markshannon markshannon force-pushed the python-better-try-finally branch from 62a3625 to 8457a00 Compare November 12, 2019 09:42
…parate code for normal and exceptional paths.

Remove BEGIN_FINALLY, END_FINALLY, CALL_FINALLY and POP_FINALLY bytecodes. Implement finally blocks by code duplication.
Reimplement frame.lineno setter using line numbers rather than bytecode offsets.
@markshannon markshannon force-pushed the python-better-try-finally branch from 8457a00 to 6364ee7 Compare November 12, 2019 10:19
@markshannon
Copy link
Member Author

I've been holding off on this as I wanted to let the 3.8 release stabilize before changing the bytecode.
3.8 all seems fine, so unless anyone objects, I'm going to merge this tomorrow.

@brettcannon I have addressed all your comments.

@markshannon markshannon merged commit fee5526 into python:master Nov 21, 2019
@markshannon markshannon deleted the python-better-try-finally branch November 21, 2019 09:11
@bedevere-bot
Copy link

@markshannon: Please replace # with GH- in the commit message next time. Thanks!

jacobneiltaylor pushed a commit to jacobneiltaylor/cpython that referenced this pull request Dec 5, 2019
…parate code for normal and exceptional paths. (python#6641)

Remove BEGIN_FINALLY, END_FINALLY, CALL_FINALLY and POP_FINALLY bytecodes. Implement finally blocks by code duplication.
Reimplement frame.lineno setter using line numbers rather than bytecode offsets.
shihai1991 pushed a commit to shihai1991/cpython that referenced this pull request Jan 31, 2020
…parate code for normal and exceptional paths. (python#6641)

Remove BEGIN_FINALLY, END_FINALLY, CALL_FINALLY and POP_FINALLY bytecodes. Implement finally blocks by code duplication.
Reimplement frame.lineno setter using line numbers rather than bytecode offsets.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants