Line numbers for errors off in traces #106290

gvanrossum · 2023-06-30T19:07:44Z

When a uop encounters an error, the line number is set to -1 or to the wrong place.

Repro (on latest main):

a = 1

def func():
    global a
    i = 20
    while i > 0:
        del a
        i -= 1
        if i >= 3:
            a = 2

func()

Run this normally and you get this traceback:

Traceback (most recent call last):
  File "/Users/guido/cpython/t.py", line 12, in <module>
    func()
  File "/Users/guido/cpython/t.py", line 7, in func
    del a
        ^
NameError: name 'a' is not defined

But run with -Xuops, and you get:

Traceback (most recent call last):
  File "/Users/guido/cpython/t.py", line 12, in <module>
    func()
  File "/Users/guido/cpython/t.py", line -1, in func
NameError: name 'a' is not defined

Linked PRs

gh-106290: Fix edge cases around uops #106319

The text was updated successfully, but these errors were encountered:

gvanrossum · 2023-07-01T05:05:17Z

Having spent some time trying to debug this, I still have no solution. I do know this: the invariants in the code regarding next_instr, frame->prev_instr, and the like, are devilish, and I'm not sure what assumptions ENTER_EXECUTOR is making.

I think that whenever we're in the body of an instruction (i.e. in bytecodes.c):

frame->prev_instr points to the word containing the opcode+oparg of the current instruction
next_instr initially points to the next word; when the instruction exits, it must have skipped over the cache, or jumped to a new instruction

When executing regular bytecode, upon failing a DEOPT_IF guard, neither of these is changed. Upon failing an ERROR_IF check, next_instr is unused and frame->prev_instr is incorporated (as tb_lasti) in the traceback.

When executing uops, there are at least three reasons to distinguish for leaving _PyUopExecute() (and that's just assuming we don't switch to a new frame):

We execute the EXIT_TRACE uop
We fail a DEOPT_IF guard
We fail an ERROR_IF check

The ENTER_EXECUTOR opcode doesn't distinguish between the first two, and in these cases always resumes where frame->prev_instr points. (When resuming it sets next_instr to one more than frame->prev_instr -- this is to restore the invariant of the top two bullets above.)

When the executor returns NULL, the assumption is that (after restoring frame from cframe.current_frame) frame->prev_instr points to the instruction that raised the error.

But when I put all this together something's off! I generate uop sequences like this:

(more uops)
LOAD_FAST_CHECK (n)
SET_IP (ip)
EXIT_TRACE

where ip is the instruction offset of the original LOAD_FAST_CHECK bytecode, but to make everything (mostly) work the SET_IP opcode must subtract one from ip before storing the value in frame->prev_instr (adding the start of the bytecode sequence of course). For example, if I have a bytecode LOAD_FAST_CHECK 2 at byte offset 20, i.e. instruction offset 10, I generate SET_IP 10, but it must store co_code_adaptive + 9 into frame->prev_instr. This ensures that DEOPT_IF and EXIT_TRACE work; ERROR_IF almost works, but sets tb_lasti to 9 and as a result the traceback is subtly wrong. I have attempted various other permutations, but none work.

I fear to change the logic in JUMP_BACKWARD or ENTER_EXECUTOR, but maybe that's the real culprit? That would also require changes to the counter-optimizer used to test the infrastructure though.

I also believe that if the first uop in a trace fails an ERROR_IF, frame->prev_instr is still pointing at the ENTER_EXECUTOR instruction. (And quite possibly the same for DEOPT_IF, but I don't have a way to repro that.)

PS. The one thing that's not at fault here is the stack pointer. The logic around setting and restoring that is clear and unimpeachable.

gvanrossum · 2023-07-01T20:16:17Z

Okay, I think I've got it. In the EXIT_TRACE uop, we need to decrement frame->prev_instr in order for the rest of the logic to work.

Explanation: When we return from the executor, we goto resume_frame, where the following code is executed (after expanding the macro SET_LOCALS_FROM_FRAME()):

resume_frame:
    next_instr = frame->prev_instr + 1;
    stack_pointer = _PyFrame_GetStackPointer(frame);
    // (Some debug stuff omitted)
    DISPATCH();

The DISPATCH() macro then uses either computed goto or switch to execute the instruction at next_instr.

All in all this means that if we have a "destination" opcode indicated by the final SET_IP uop, we must arrange for frame->prev_instr to point one word before that destination, so that the code at the resume_frame label correctly sets next_instr to the destination.

When we exit because of a failed DEOPT_IF guard, frame->prev_instr points to the bytecode that needs to be restarted; again, we need to decrement it before returning from the uop executor -- the executor's caller has no special case for deopt, we must make it jump to the bytecode that needs to be re-executed. (This case is handled differently in Tier 2 than in Tier 1 -- in Tier 1, we jump straight to the implementation of the unspecialized bytecode, leaving all other state unchanged; in Tier 2, we actually re-execute the specialized bytecode, counting on it to fail the same guard and then jump to the case for the unspecialized bytecode.

OTOH, when we exit because of a failed ERROR_IF, we must keep frame->prev_instr pointing at the bytecode that failed, since that is the situation expected by the error handling code.

I suppose there are two ways to fix this, given my implementation:

In SAVE_IP n, make frame->prev_instr point at instruction offset n - 1; increment it on an error exit
In SAVE_IP n, make frame->prev_instr point at instruction offset n; decrement it on regular or deoptimizing exit

In my PR I have used (2), since it emulates the state of the Tier 1 interpreter (during execution of a certain bytecode, frame->prev_instr points to that instruction).

We could also incorporate the decrement in the caller of the executor, but beware: there are two places where it's called:

In the ENTER_EXECUTOR bytecode
In _PyOptimizer_BackEdge(), when it inserts an ENTER_EXECUTOR instruction

The trick is that EXIT_TRACE must decrement prev_instr.

- Tweak uops debugging output - Fix the bug from gh-106290 - Rename `SET_IP` to `SAVE_IP` (per faster-cpython/ideas#558) - Add a `SAVE_IP` uop at the start of the trace (ditto) - Allow `unbound_local_error`; this gives us uops for `LOAD_FAST_CHECK`, `LOAD_CLOSURE`, and `DELETE_FAST` - Longer traces - Support `STORE_FAST_LOAD_FAST`, `STORE_FAST_STORE_FAST` - Add deps on pycore_uops.h to Makefile(.pre.in)

gvanrossum added the type-bug An unexpected behavior, bug, or error label Jun 30, 2023

gvanrossum added a commit to gvanrossum/cpython that referenced this issue Jul 1, 2023

Fix pythongh-106290 (wrong line numbers in traceback)

7944a80

The trick is that EXIT_TRACE must decrement prev_instr.

gvanrossum mentioned this issue Jul 1, 2023

gh-106290: Fix edge cases around uops #106319

Merged

gvanrossum closed this as completed Jul 3, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Line numbers for errors off in traces #106290

Line numbers for errors off in traces #106290

gvanrossum commented Jun 30, 2023 •

edited by bedevere-bot

Loading

gvanrossum commented Jul 1, 2023

gvanrossum commented Jul 1, 2023 •

edited

Loading

Line numbers for errors off in traces #106290

Line numbers for errors off in traces #106290

Comments

gvanrossum commented Jun 30, 2023 • edited by bedevere-bot Loading

Linked PRs

gvanrossum commented Jul 1, 2023

gvanrossum commented Jul 1, 2023 • edited Loading

gvanrossum commented Jun 30, 2023 •

edited by bedevere-bot

Loading

gvanrossum commented Jul 1, 2023 •

edited

Loading