Skip to content

Don't fail parsing ALTER TABLE ADD COLUMN ending with a semicolon #246

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 31, 2020

Conversation

miuy56dc
Copy link
Contributor

@miuy56dc miuy56dc commented Jul 29, 2020

Fix parse_column_def #233

ALTER TABLE tab ADD COLUMN foo TEXT;

@coveralls
Copy link

Pull Request Test Coverage Report for Build 186613638

  • 3 of 3 (100.0%) changed or added relevant lines in 2 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 91.743%

Totals Coverage Status
Change from base Build 186377189: 0.0%
Covered Lines: 4389
Relevant Lines: 4784

💛 - Coveralls

Copy link
Contributor

@nickolay nickolay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not obvious that this is the correct fix. How is {the set of tokens we break on in parse_column_def} determined? Why is it OK to break on irrelevant tokens? This should be noted in a comment I think.

@miuy56dc
Copy link
Contributor Author

I have compared CREATE TABLE t (a int); and ALTER TABLE tab ADD COLUMN foo TEXT;. We invoke parse_column_def to parse CREATE statement it will break the loop when it meets Token::RParen, but for ALTER statement, it will not match either of them in the end. That's why I add Token::SemiColon to fix that.

@nickolay nickolay changed the title Fix parse_column_def error Don't fail parsing ALTER TABLE ADD COLUMN ending with a semicolon Jul 31, 2020
@nickolay nickolay merged commit 9c1a5a7 into apache:main Jul 31, 2020
nickolay added a commit to nickolay/sqlparser-rs that referenced this pull request Jul 31, 2020
Since PR apache#93
`parse_column_def` parses a set of column options in a loop, e.g. given:

```
                  _______ column_def _______
CREATE TABLE foo (bar INT NOT NULL DEFAULT 1, )
                          -------- ---------
                          option 1  option 2
````

it parses column options until it encounters one of the delimiter tokens

First when we only supported `CREATE TABLE`, the set of delimiters that
stopped the parsing used to be `Token::Comma | Token::RParen`.

Then we added support for `ALTER TABLE ADD COLUMN <column_def>`. Turns
out the parser started to bail if the statement ended with a semicolon,
while attempting to parse the semicolon as a column option, as we forgot
to add it to the set of delimiter tokens.

This was recently fixed in apache#246
by including Token::SemiColon to the list, but it felt wrong to have 
to update this list, and to have a common list of delimiters for two
different contexts (CREATE TABLE with parens vs ALTER TABLE ADD COLUMN
without parens).

Also our current approach cannot handle multiple statements NOT
separated by a semicolon, as is common in MS SQL DDL. We don't
explicitly support it in `parse_statements`, but that's a use-case
like to keep in mind nevertheless.
nickolay added a commit to nickolay/sqlparser-rs that referenced this pull request Jul 31, 2020
Since PR apache#93
`parse_column_def` parses a set of column options in a loop, e.g. given:

```
                  _______ column_def _______
CREATE TABLE foo (bar INT NOT NULL DEFAULT 1, )
                          -------- ---------
                          option 1  option 2
````

it parses column options until it encounters one of the delimiter tokens

First when we only supported `CREATE TABLE`, the set of delimiters that
stopped the parsing used to be `Token::Comma | Token::RParen`.

Then we added support for `ALTER TABLE ADD COLUMN <column_def>`. Turns
out the parser started to bail if the statement ended with a semicolon,
while attempting to parse the semicolon as a column option, as we forgot
to add it to the set of delimiter tokens.

This was recently fixed in apache#246
by including Token::SemiColon to the list, but it felt wrong to have
to update this list, and to have a common list of delimiters for two
different contexts (CREATE TABLE with parens vs ALTER TABLE ADD COLUMN
without parens).

Also our current approach cannot handle multiple statements NOT
separated by a semicolon, as is common in MS SQL DDL. We don't
explicitly support it in `parse_statements`, but that's a use-case
like to keep in mind nevertheless.
@nickolay
Copy link
Contributor

It's not obvious that this is the correct fix. How is {the set of tokens we break on in parse_column_def} determined? Why is it OK to break on irrelevant tokens? This should be noted in a comment I think.

I've opened #251 to fix that.

nickolay added a commit to nickolay/sqlparser-rs that referenced this pull request Aug 10, 2020
Since PR apache#93
`parse_column_def` parses a set of column options in a loop, e.g. given:

```
                  _______ column_def _______
CREATE TABLE foo (bar INT NOT NULL DEFAULT 1, )
                          -------- ---------
                          option 1  option 2
````

it parses column options until it encounters one of the delimiter tokens

First when we only supported `CREATE TABLE`, the set of delimiters that
stopped the parsing used to be `Token::Comma | Token::RParen`.

Then we added support for `ALTER TABLE ADD COLUMN <column_def>`. Turns
out the parser started to bail if the statement ended with a semicolon,
while attempting to parse the semicolon as a column option, as we forgot
to add it to the set of delimiter tokens.

This was recently fixed in apache#246
by including Token::SemiColon to the list, but it felt wrong to have
to update this list, and to have a common list of delimiters for two
different contexts (CREATE TABLE with parens vs ALTER TABLE ADD COLUMN
without parens).

Also our current approach cannot handle multiple statements NOT
separated by a semicolon, as is common in MS SQL DDL. We don't
explicitly support it in `parse_statements`, but that's a use-case
like to keep in mind nevertheless.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants