-
Notifications
You must be signed in to change notification settings - Fork 601
Resolved bug in parse_function_arg
#1826
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
src/parser/mod.rs
Outdated
// It may appear that the first token can be converted into a known | ||
// type, but this could also be a collision as some types are only | ||
// present in some dialects and therefore some type reserved keywords | ||
// may be freely used as argument names in other dialects. | ||
|
||
// To check whether the first token is a name or a type, we need to | ||
// peek the next token, which if it is another type keyword, then the | ||
// first token is a name and not a type in itself. | ||
let potential_tokens = [Token::Eq, Token::RParen, Token::Comma]; | ||
if !self.peek_keyword(Keyword::DEFAULT) | ||
&& !potential_tokens.contains(&self.peek_token().token) | ||
{ | ||
name = Some(Ident::new(next_token.to_string())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wondering if something like this work instead?
if let DataType::Custom(n, _) = &data_type {
if let Some(dt) = self.maybe_parse(|parser| parser.parse_data_type())? {
match n.0[0].clone() {
ObjectNamePart::Identifier(ident) => name = Some(ident),
}
data_type = dt;
}
}
thinking if so it would closer match the desired goal to parse an optional datatype if the first token is regular identifier
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can try it out, I wasn't aware of maybe_parse
which certainly seems to make it less confusing to read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code you proposed does not work as Int2
(or any analogous such type) does not fall in if let DataType::Custom(n, _) = &data_type {
, but other variants. I am trying to update my own version using maybe_parse
instead of the keywords check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See commit 1801b2a
@@ -5199,13 +5199,20 @@ impl<'a> Parser<'a> { | |||
|
|||
// parse: [ argname ] argtype | |||
let mut name = None; | |||
let next_token = self.peek_token(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code you proposed does not work as Int2 (or any analogous such type) does not fall in if let DataType::Custom(n, _) = &data_type {
Oh how did you mean here by Int2
in this example not being parsed as a custom datatype, do we get back a different type or does parse_data_type
fail in that scenario?
I think ideally we will want to do without this self.peek_token()
to avoid the cloning that it includes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The argument named Int2
(as described in the issue) is not parsed as DataType::Custom
, but as a DataType::Int2
. Analogously, any other such argument names that collides with data types from other SQL engines would be parsed into a type.
Now, if I were to convert back to string DataType::Int2
I would get some arbitrary capitalization which in this case is INT2
- without the peek_token
, I am unsure how we can preserve the initial token from being lost.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see that makes sense! Maybe something like this we can do to restrict the cloning to only when necessary?
let data_type_idx = self.get_current_index();
if let Some(next_data_type) = self.maybe_parse(|parser| {
name = parser.token_at(data_type_idx).to_string();
// ...
})
Coming to think about it, would we not need to sanity check that the first token is actually a Token::Word
variant? current code seems to assume that to be the case which might not necessarily be true.
For example following how the following sql would be parsed, we can probably have a test case it
function(struct<a,b> int64)
we would call to_string()
on only the first token which would be struct
even though this query is technically invalid?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you provide a complete example of such a broken case, so that I may add it to the test suite?
This pull request resolves the bug described in issue #1825, which was caused by an incorrect implementation of the named argument parsing. It also adds a few tests to verify that the new implementation is correct.
The previous implementation made the incorrect assumption that arguments name cannot have the same name as types, but the set of types that are parsed as types in
sqlparser
is a superset of the types that are present in each dialect. Therefore, it is correct syntax to use as argument name for instanceint2
for PostgreSQL, while this same argument name would be interpreted as a type elsewhere.I have changed the parsing to determine via a look-ahead whether the name is a type or not.
Best,
Luca