Resolved bug in parse_function_arg #1826

Open
wants to merge 2 commits into
base: main

LucaCappelletti94
Contributor

This pull request resolves the bug described in issue #1825, which was caused by an incorrect implementation of the named argument parsing. It also adds a few tests to verify that the new implementation is correct.

The previous implementation incorrectly assumed that an argument name can never coincide with a type name. However, the set of type names that sqlparser recognises is a superset of the types available in any single dialect. It is therefore valid syntax to use, for instance, int2 as an argument name in PostgreSQL, while the same name would be interpreted as a type elsewhere.

I have changed the parsing to use a look-ahead to determine whether the first token is an argument name or a type.
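To illustrate the look-ahead idea, here is a minimal sketch. It does not use sqlparser's real Parser: the helper names `is_type_keyword` and `split_arg` and the tiny list of types are assumptions made for this example only.

```rust
/// Toy stand-in for a type lookup; a real parser recognises far more types.
fn is_type_keyword(word: &str) -> bool {
    matches!(
        word.to_ascii_lowercase().as_str(),
        "int2" | "int4" | "integer" | "text" | "boolean"
    )
}

/// Splits `[ argname ] argtype` into an optional name and a type.
/// If the first word is followed by a type keyword, the first word is
/// treated as the argument name, even if it could also be read as a type.
fn split_arg<'a>(tokens: &[&'a str]) -> Option<(Option<&'a str>, &'a str)> {
    match tokens {
        // Look-ahead: the second word is a type, so the first is a name.
        [name, ty, ..] if is_type_keyword(ty) => Some((Some(*name), *ty)),
        // No type follows: the first word must itself be the type.
        [ty, ..] if is_type_keyword(ty) => Some((None, *ty)),
        _ => None,
    }
}
```

With this sketch, `split_arg(&["int2", "int4"])` reads int2 as the name and int4 as the type, while `split_arg(&["int2", "DEFAULT", "1"])` reads int2 as an unnamed type.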

Best,
Luca

Comment on lines 5205 to 5217
// It may appear that the first token can be converted into a known
// type, but this could also be a collision as some types are only
// present in some dialects and therefore some type reserved keywords
// may be freely used as argument names in other dialects.

// To check whether the first token is a name or a type, we need to
// peek the next token, which if it is another type keyword, then the
// first token is a name and not a type in itself.
let potential_tokens = [Token::Eq, Token::RParen, Token::Comma];
if !self.peek_keyword(Keyword::DEFAULT)
    && !potential_tokens.contains(&self.peek_token().token)
{
    name = Some(Ident::new(next_token.to_string()));

Contributor

Wondering if something like this would work instead?

if let DataType::Custom(n, _) = &data_type {
  if let Some(dt) = self.maybe_parse(|parser| parser.parse_data_type())? {
    match n.0[0].clone() {
      ObjectNamePart::Identifier(ident) => name = Some(ident),
    }
    data_type = dt;
  }
}

If so, I think it would more closely match the desired goal: parse an optional data type when the first token is a regular identifier.
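For readers unfamiliar with the pattern, the speculative parse that maybe_parse provides can be sketched as follows. This is a toy model, not sqlparser's implementation: `ToyParser`, its fields, and its three-entry type list are assumptions for illustration, and the toy returns a plain Option, whereas the snippet above applies `?` to the result of the real method.

```rust
/// Hypothetical miniature parser over whitespace-separated words.
struct ToyParser {
    tokens: Vec<String>,
    index: usize,
}

impl ToyParser {
    fn new(input: &str) -> Self {
        ToyParser {
            tokens: input.split_whitespace().map(str::to_string).collect(),
            index: 0,
        }
    }

    /// Consumes one word if it is a known type, otherwise reports an error.
    fn parse_data_type(&mut self) -> Result<String, String> {
        match self.tokens.get(self.index) {
            Some(word) if matches!(word.as_str(), "int2" | "int4" | "text") => {
                self.index += 1;
                Ok(word.clone())
            }
            other => Err(format!("expected a data type, found {:?}", other)),
        }
    }

    /// Runs `f`; on failure, rewinds to the starting position and returns None.
    fn maybe_parse<T>(
        &mut self,
        f: impl FnOnce(&mut ToyParser) -> Result<T, String>,
    ) -> Option<T> {
        let start = self.index;
        match f(self) {
            Ok(value) => Some(value),
            Err(_) => {
                // Undo whatever the failed attempt consumed.
                self.index = start;
                None
            }
        }
    }
}
```

The point of the design is that a failed attempt leaves the parser exactly where it started, so the caller can fall back to another interpretation of the same tokens.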

Contributor Author

I can try it out. I wasn't aware of maybe_parse, which certainly seems to make the code less confusing to read.

Contributor Author

The code you proposed does not work, because Int2 (or any analogous type) does not match if let DataType::Custom(n, _) = &data_type {; it is parsed into other variants. I am trying to update my own version to use maybe_parse instead of the keyword check.

Contributor Author

See commit 1801b2a

@@ -5199,13 +5199,20 @@ impl<'a> Parser<'a> {

// parse: [ argname ] argtype
let mut name = None;
let next_token = self.peek_token();

Contributor

The code you proposed does not work as Int2 (or any analogous such type) does not fall in if let DataType::Custom(n, _) = &data_type {

Oh, what did you mean by Int2 not being parsed as a custom data type in this example? Do we get back a different type, or does parse_data_type fail in that scenario?

Ideally, I think we want to do without this self.peek_token() to avoid the clone it incurs.

Contributor Author

The argument named Int2 (as described in the issue) is not parsed as DataType::Custom, but as DataType::Int2. Analogously, any other argument name that collides with a data type from another SQL engine would be parsed into a type.

Now, if I converted DataType::Int2 back to a string, I would get some fixed capitalization, in this case INT2. Without peek_token, I am unsure how we can keep the original token from being lost.
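The loss of the original spelling can be reproduced with a small stand-alone model. `ToyDataType` and `parse_toy_type` are hypothetical names for this sketch; the only behaviour taken from the discussion is that the Int2 variant prints as INT2.

```rust
use std::fmt;

/// Toy data type: a built-in variant with a fixed spelling, and a custom
/// variant that keeps the identifier exactly as written.
#[derive(Debug, PartialEq)]
enum ToyDataType {
    Int2,
    Custom(String),
}

impl fmt::Display for ToyDataType {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // The built-in variant always prints in one fixed spelling.
            ToyDataType::Int2 => write!(f, "INT2"),
            ToyDataType::Custom(name) => write!(f, "{}", name),
        }
    }
}

/// Maps a word to a toy data type, matching `int2` case-insensitively.
fn parse_toy_type(word: &str) -> ToyDataType {
    if word.eq_ignore_ascii_case("int2") {
        ToyDataType::Int2
    } else {
        ToyDataType::Custom(word.to_string())
    }
}
```

Here `parse_toy_type("Int2").to_string()` yields INT2, so the spelling Int2 cannot be recovered from the parsed value alone, whereas a custom name such as MyType round-trips unchanged.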

Contributor

Ah, I see, that makes sense! Maybe we can do something like this to restrict the cloning to only when it is necessary?

let data_type_idx = self.get_current_index();
if let Some(next_data_type) = self.maybe_parse(|parser| {
    name = parser.token_at(data_type_idx).to_string();
   // ...
})

Come to think of it, would we not need to sanity check that the first token is actually a Token::Word variant? The current code seems to assume that this is the case, which is not necessarily true.
For example, consider how the following SQL would be parsed; we can probably add a test case for it:

function(struct<a,b> int64)

We would call to_string() on only the first token, which would be struct, even though this query is technically invalid?

Contributor Author

Could you provide a complete example of such a broken case, so that I may add it to the test suite?
