Skip to content

gh-93351: Ensure the position information in AST nodes created by the parser is always consistent #93352

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 30, 2022

Conversation

pablogsal
Copy link
Member

Closes: #93351

@pablogsal pablogsal requested a review from isidentical as a code owner May 30, 2022 13:38
@pablogsal pablogsal added 3.10 only security fixes 3.11 only security fixes 3.12 only security fixes needs backport to 3.10 only security fixes needs backport to 3.11 only security fixes labels May 30, 2022
Copy link
Member

@isidentical isidentical left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should also do the same in the AST validator, otherwise this would lead an interpreter crash on the debug mode when the user tries to compile a malformed AST node object.

>>> tree = ast.parse("a = 1")
>>> tree.body[0].lineno = 0
>>> compile(tree, "<test>", "exec")
python: Python/Python-ast.c:2118: _PyAST_Assign: Assertion `p->lineno != 0 && p->lineno <= p->end_lineno' failed.
[1]    26880 IOT instruction (core dumped)  ./python

@isidentical
Copy link
Member

Or maybe we should only limit this to nodes created by the parser. WDYT @pablogsal? Otherwise, I feel like it might break some code (even though it is clearly invalid). For example this now works, but if we start validating the line numbers for user created ASTs it will stop working:

>>> tree = ast.parse("a = 1")
>>> tree.body[0].lineno = -3121
>>> code = compile(tree, "<test>", "exec")
>>> ln = {}
>>> exec(code, ln)
>>> ln["a"]
1

@pablogsal
Copy link
Member Author

pablogsal commented May 30, 2022

Or maybe we should only limit this to nodes created by the parser. WDYT @pablogsal

I'm preparing the other PR as we speak :)

I still prefer to keep these separated as the other PR fixes a reported issue with pytest

@pablogsal
Copy link
Member Author

@isidentical Hummmmm there is an ordering problem here. We create the C nodes before we validate them, so the assert will trigger before we can raise a nice exception. Maybe we should add the check only on the validation step..... what do you think?

There is a separate problem which is that end_line_number defaults to 0, but if the user is setting just the line number currently then that leaves the node as invalid. I plan to add a check in the AST constructor to correct this, but we can just say "well, that's invalid now".

@pablogsal
Copy link
Member Author

I have changed to PR to just adding the check in the AST validator.

@isidentical
Copy link
Member

I have changed to PR to just adding the check in the AST validator.

I think that's definietly better (and since we are running the validator on debug mode even for the regular parser outputs, I think it should cover all the bases).

@isidentical
Copy link
Member

Should we drop the backport labels since this is theoritically a breaking change @pablogsal? Otherwise it LGTM (with a news entry).

Copy link
Member

@isidentical isidentical left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (though there might be some missing cases for the validation , like excepthandler?)

Python/ast.c Outdated
return 0; \
} \
if ((node->lineno < 0 && node->end_lineno != node->lineno) || \
(node->col_offset < 0 && node->col_offset != node->end_col_offset)) { \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT:

Suggested change
(node->col_offset < 0 && node->col_offset != node->end_col_offset)) { \
(node->col_offset < 0 && node->col_offset != node->end_col_offset)) { \

@pablogsal
Copy link
Member Author

Should we drop the backport labels since this is theoritically a breaking change @pablogsal? Otherwise it LGTM (with a news entry).

For 3.10 yes, but for 3.11 no because failing to do this causes the compiler to break.

@pablogsal pablogsal removed needs backport to 3.10 only security fixes 3.10 only security fixes labels May 30, 2022
@pablogsal
Copy link
Member Author

LGTM (though there might be some missing cases for the validation , like excepthandler?)

I added two more in the last commit.

@pablogsal pablogsal merged commit 5893b5d into python:main May 30, 2022
@pablogsal pablogsal deleted the gh-93351 branch May 30, 2022 18:30
@miss-islington
Copy link
Contributor

Thanks @pablogsal for the PR 🌮🎉.. I'm working now to backport this PR to: 3.11.
🐍🍒⛏🤖

@bedevere-bot
Copy link

GH-93360 is a backport of this pull request to the 3.11 branch.

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request May 30, 2022
…by the parser is always consistent (pythonGH-93352)

(cherry picked from commit 5893b5d)

Co-authored-by: Pablo Galindo Salgado <[email protected]>
miss-islington added a commit that referenced this pull request May 30, 2022
… parser is always consistent (GH-93352)

(cherry picked from commit 5893b5d)

Co-authored-by: Pablo Galindo Salgado <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.11 only security fixes 3.12 only security fixes skip news
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Ensure the position information in AST nodes created by the parser is always consistent
4 participants