Resolved bug in parse_function_arg #1826

Open: wants to merge 2 commits into base: main
20 changes: 15 additions & 5 deletions src/parser/mod.rs
@@ -5199,12 +5199,22 @@ impl<'a> Parser<'a> {

         // parse: [ argname ] argtype
         let mut name = None;
+        let next_token = self.peek_token();
Contributor:

"The code you proposed does not work as Int2 (or any analogous such type) does not fall in if let DataType::Custom(n, _) = &data_type {"

What do you mean by Int2 not being parsed as a custom data type in this example? Do we get back a different type, or does parse_data_type fail in that scenario?

Ideally, I think we want to do without this self.peek_token() call, to avoid the clone it incurs.

Contributor Author:

The argument named Int2 (as described in the issue) is parsed not as DataType::Custom but as DataType::Int2. Likewise, any other argument name that collides with a data type from another SQL engine would be parsed as a type.

If I converted DataType::Int2 back to a string, I would get some arbitrary capitalization, in this case INT2. Without peek_token, I am not sure how we can keep the original token from being lost.
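
The spelling loss described above can be reproduced in isolation. The following is a minimal sketch, not sqlparser code: ToyDataType and parse_toy_type are hypothetical stand-ins that only assume keywords are matched case-insensitively and printed with one fixed spelling.

```rust
use std::fmt;

// Hypothetical stand-in for a parser's data-type enum (not sqlparser's DataType).
#[derive(Debug, PartialEq)]
enum ToyDataType {
    Int2,
    Custom(String),
}

impl fmt::Display for ToyDataType {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // Known types are printed with one fixed spelling.
            ToyDataType::Int2 => write!(f, "INT2"),
            // Unknown words keep whatever the user wrote.
            ToyDataType::Custom(name) => write!(f, "{}", name),
        }
    }
}

// Case-insensitive keyword match, as assumed for this sketch.
fn parse_toy_type(word: &str) -> ToyDataType {
    if word.eq_ignore_ascii_case("int2") {
        ToyDataType::Int2
    } else {
        ToyDataType::Custom(word.to_string())
    }
}

fn main() {
    // The argument name "Int2" is swallowed by the known-type variant...
    let parsed = parse_toy_type("Int2");
    assert_eq!(parsed, ToyDataType::Int2);
    // ...so printing it no longer yields the original spelling.
    assert_eq!(parsed.to_string(), "INT2");
    // A name that is not a type keyword round-trips unchanged.
    assert_eq!(parse_toy_type("str1").to_string(), "str1");
}
```

Under these assumptions, the only way to recover "Int2" is to keep the original token (or its position) before the type parser runs.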

Contributor:

Ah, I see, that makes sense! Maybe we can do something like this to restrict the cloning to only when it is necessary?

let data_type_idx = self.get_current_index();
if let Some(next_data_type) = self.maybe_parse(|parser| {
    // `name` is an Option<Ident>, so wrap the token text accordingly
    name = Some(Ident::new(parser.token_at(data_type_idx).to_string()));
    // ...
})? {
    // ...
}

Come to think of it, wouldn't we need to sanity-check that the first token is actually a Token::Word variant? The current code seems to assume that this is the case, which is not necessarily true.
For example, consider how the following SQL would be parsed; we can probably add a test case for it:

function(struct<a,b> int64)

Wouldn't we call to_string() on only the first token, which would be struct, even though this query is technically invalid?
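
One possible shape for such a guard, sketched outside the real parser: ToyToken and arg_name_at are hypothetical, and the rule "a name is a single word directly followed by another word" is an assumption made for illustration, not the check adopted in this PR.

```rust
// Hypothetical, heavily simplified token type; a real tokenizer has many more variants.
#[derive(Debug, Clone, PartialEq)]
enum ToyToken {
    Word(String),
    Lt,
    Gt,
    Comma,
}

// Returns the argument name only when the token at `idx` is a plain word that is
// directly followed by another word (the start of the type). Anything else,
// such as `struct<`, is rejected instead of being stringified blindly.
fn arg_name_at(tokens: &[ToyToken], idx: usize) -> Option<String> {
    match (tokens.get(idx), tokens.get(idx + 1)) {
        (Some(ToyToken::Word(name)), Some(ToyToken::Word(_))) => Some(name.clone()),
        _ => None,
    }
}

fn main() {
    // `Int2 INT`: a word followed by a word, so `Int2` is accepted as the name.
    let named = vec![
        ToyToken::Word("Int2".to_string()),
        ToyToken::Word("INT".to_string()),
    ];
    assert_eq!(arg_name_at(&named, 0), Some("Int2".to_string()));

    // `struct<a,b> int64`: the first "type" spans several tokens, so no name is taken.
    let composite = vec![
        ToyToken::Word("struct".to_string()),
        ToyToken::Lt,
        ToyToken::Word("a".to_string()),
        ToyToken::Comma,
        ToyToken::Word("b".to_string()),
        ToyToken::Gt,
        ToyToken::Word("int64".to_string()),
    ];
    assert_eq!(arg_name_at(&composite, 0), None);
}
```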

Contributor Author:

Could you provide a complete example of such a broken case, so that I may add it to the test suite?

         let mut data_type = self.parse_data_type()?;
-        if let DataType::Custom(n, _) = &data_type {
-            // the first token is actually a name
-            match n.0[0].clone() {
-                ObjectNamePart::Identifier(ident) => name = Some(ident),
-            }
+
+        // It may appear that the first token can be converted into a known
+        // type, but this could also be a collision as some types are only
+        // present in some dialects and therefore some type reserved keywords
+        // may be freely used as argument names in other dialects.
+
+        // To check whether the first token is a name or a type, we need to
+        // peek the next token, which if it is another type keyword, then the
+        // first token is a name and not a type in itself.
+        let potential_tokens = [Token::Eq, Token::RParen, Token::Comma];
+        if !self.peek_keyword(Keyword::DEFAULT)
+            && !potential_tokens.contains(&self.peek_token().token)
+        {
+            name = Some(Ident::new(next_token.to_string()));
Contributor:

I'm wondering whether something like this would work instead?

if let DataType::Custom(n, _) = &data_type {
  if let Some(dt) = self.maybe_parse(|parser| parser.parse_data_type())? {
    match n.0[0].clone() {
      ObjectNamePart::Identifier(ident) => name = Some(ident),
    }
    data_type = dt;
  }
}

If so, I think it would more closely match the desired goal: parse an optional data type when the first token is a regular identifier.
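
For readers unfamiliar with the pattern, here is a minimal sketch of "try to parse, rewind on failure" applied to `[ argname ] argtype`. ToyParser and all of its methods are hypothetical and only mimic the idea behind maybe_parse; this is not sqlparser's implementation and not the code that ended up in the PR.

```rust
// Hypothetical miniature parser over whitespace-separated words.
struct ToyParser<'a> {
    tokens: Vec<&'a str>,
    index: usize,
}

impl<'a> ToyParser<'a> {
    fn new(input: &'a str) -> Self {
        ToyParser { tokens: input.split_whitespace().collect(), index: 0 }
    }

    // Consumes one token and succeeds only if it names a known type.
    fn parse_type(&mut self) -> Result<String, String> {
        let tok = self
            .tokens
            .get(self.index)
            .copied()
            .ok_or_else(|| "unexpected end of input".to_string())?;
        self.index += 1;
        let upper = tok.to_ascii_uppercase();
        if matches!(upper.as_str(), "INT" | "INT2" | "VARCHAR") {
            Ok(upper)
        } else {
            Err(format!("unknown type: {}", tok))
        }
    }

    // Runs `f`; on failure, rewinds to where the attempt started and returns None.
    fn maybe_parse<T, F>(&mut self, f: F) -> Option<T>
    where
        F: FnOnce(&mut Self) -> Result<T, String>,
    {
        let start = self.index;
        match f(self) {
            Ok(value) => Some(value),
            Err(_) => {
                self.index = start;
                None
            }
        }
    }

    // parse: [ argname ] argtype
    fn parse_arg(&mut self) -> Result<(Option<String>, String), String> {
        let first = self
            .tokens
            .get(self.index)
            .copied()
            .ok_or_else(|| "unexpected end of input".to_string())?;
        self.index += 1;
        // If a type follows, the first token was a name; keep its original spelling.
        if let Some(data_type) = self.maybe_parse(|p| p.parse_type()) {
            return Ok((Some(first.to_string()), data_type));
        }
        // Otherwise step back and treat the first token as the type itself.
        self.index -= 1;
        let data_type = self.parse_type()?;
        Ok((None, data_type))
    }
}

fn main() {
    // A type keyword used as a name survives with its original capitalization.
    assert_eq!(
        ToyParser::new("Int2 INT").parse_arg(),
        Ok((Some("Int2".to_string()), "INT".to_string()))
    );
    // A lone type is parsed as an unnamed argument.
    assert_eq!(ToyParser::new("varchar").parse_arg(), Ok((None, "VARCHAR".to_string())));
}
```

The design point is that the name is taken from the raw token rather than from a parsed type, so a collision with a type keyword does not change its spelling.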

Contributor Author:

I can try it out. I wasn't aware of maybe_parse, which certainly seems to make the code less confusing to read.

Contributor Author:

The code you proposed does not work, because Int2 (or any analogous type) does not match if let DataType::Custom(n, _) = &data_type {; it is parsed into other variants instead. I am trying to update my own version to use maybe_parse instead of the keyword check.

Contributor Author:

See commit 1801b2a

             data_type = self.parse_data_type()?;
         }
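
The lookahead rule stated in the comments of this hunk can be sketched on plain strings. This is an illustration under stated assumptions: first_token_is_name is a hypothetical helper, and tokens are modeled as &str rather than the parser's Token type.

```rust
// Hypothetical helper: given the token that follows the first one in an
// argument, decide whether the first token was a name (so a type must follow)
// or already the type (so the argument ends or a default value starts).
fn first_token_is_name(next: &str) -> bool {
    let ends_argument =
        matches!(next, "=" | ")" | ",") || next.eq_ignore_ascii_case("DEFAULT");
    !ends_argument
}

fn main() {
    // f(int2 INT): `INT` follows `int2`, so `int2` is the argument name.
    assert!(first_token_is_name("INT"));
    // f(INT2) and f(INT2, ...): the argument ends, so `INT2` is the type.
    assert!(!first_token_is_name(")"));
    assert!(!first_token_is_name(","));
    // f(INT2 DEFAULT 0) and f(INT2 = 0): a default follows, so `INT2` is the type.
    assert!(!first_token_is_name("default"));
    assert!(!first_token_is_name("="));
}
```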

168 changes: 168 additions & 0 deletions tests/sqlparser_postgres.rs
@@ -4098,6 +4098,174 @@ fn parse_update_in_with_subquery() {
     pg_and_generic().verified_stmt(r#"WITH "result" AS (UPDATE "Hero" SET "name" = 'Captain America', "number_of_movies" = "number_of_movies" + 1 WHERE "secret_identity" = 'Sam Wilson' RETURNING "id", "name", "secret_identity", "number_of_movies") SELECT * FROM "result""#);
 }

+#[test]
+fn parser_create_function_with_args() {
+    let sql1 = r#"CREATE OR REPLACE FUNCTION check_strings_different(str1 VARCHAR, str2 VARCHAR) RETURNS BOOLEAN LANGUAGE plpgsql AS $$
+BEGIN
+    IF str1 <> str2 THEN
+        RETURN TRUE;
+    ELSE
+        RETURN FALSE;
+    END IF;
+END;
+$$"#;
+
+    assert_eq!(
+        pg_and_generic().verified_stmt(sql1),
+        Statement::CreateFunction(CreateFunction {
+            or_alter: false,
+            or_replace: true,
+            temporary: false,
+            name: ObjectName::from(vec![Ident::new("check_strings_different")]),
+            args: Some(vec![
+                OperateFunctionArg::with_name(
+                    "str1",
+                    DataType::Varchar(None),
+                ),
+                OperateFunctionArg::with_name(
+                    "str2",
+                    DataType::Varchar(None),
+                ),
+            ]),
+            return_type: Some(DataType::Boolean),
+            language: Some("plpgsql".into()),
+            behavior: None,
+            called_on_null: None,
+            parallel: None,
+            function_body: Some(CreateFunctionBody::AsBeforeOptions(Expr::Value(
+                (Value::DollarQuotedString(DollarQuotedString {value: "\nBEGIN\n    IF str1 <> str2 THEN\n        RETURN TRUE;\n    ELSE\n        RETURN FALSE;\n    END IF;\nEND;\n".to_owned(), tag: None})).with_empty_span()
+            ))),
+            if_not_exists: false,
+            using: None,
+            determinism_specifier: None,
+            options: None,
+            remote_connection: None,
+        })
+    );

+    let sql2 = r#"CREATE OR REPLACE FUNCTION check_not_zero(int1 INT) RETURNS BOOLEAN LANGUAGE plpgsql AS $$
+BEGIN
+    IF int1 <> 0 THEN
+        RETURN TRUE;
+    ELSE
+        RETURN FALSE;
+    END IF;
+END;
+$$"#;
+    assert_eq!(
+        pg_and_generic().verified_stmt(sql2),
+        Statement::CreateFunction(CreateFunction {
+            or_alter: false,
+            or_replace: true,
+            temporary: false,
+            name: ObjectName::from(vec![Ident::new("check_not_zero")]),
+            args: Some(vec![
+                OperateFunctionArg::with_name(
+                    "int1",
+                    DataType::Int(None)
+                )
+            ]),
+            return_type: Some(DataType::Boolean),
+            language: Some("plpgsql".into()),
+            behavior: None,
+            called_on_null: None,
+            parallel: None,
+            function_body: Some(CreateFunctionBody::AsBeforeOptions(Expr::Value(
+                (Value::DollarQuotedString(DollarQuotedString {value: "\nBEGIN\n    IF int1 <> 0 THEN\n        RETURN TRUE;\n    ELSE\n        RETURN FALSE;\n    END IF;\nEND;\n".to_owned(), tag: None})).with_empty_span()
+            ))),
+            if_not_exists: false,
+            using: None,
+            determinism_specifier: None,
+            options: None,
+            remote_connection: None,
+        })
+    );

+    let sql3 = r#"CREATE OR REPLACE FUNCTION check_values_different(a INT, b INT) RETURNS BOOLEAN LANGUAGE plpgsql AS $$
+BEGIN
+    IF a <> b THEN
+        RETURN TRUE;
+    ELSE
+        RETURN FALSE;
+    END IF;
+END;
+$$"#;
+    assert_eq!(
+        pg_and_generic().verified_stmt(sql3),
+        Statement::CreateFunction(CreateFunction {
+            or_alter: false,
+            or_replace: true,
+            temporary: false,
+            name: ObjectName::from(vec![Ident::new("check_values_different")]),
+            args: Some(vec![
+                OperateFunctionArg::with_name(
+                    "a",
+                    DataType::Int(None)
+                ),
+                OperateFunctionArg::with_name(
+                    "b",
+                    DataType::Int(None)
+                ),
+            ]),
+            return_type: Some(DataType::Boolean),
+            language: Some("plpgsql".into()),
+            behavior: None,
+            called_on_null: None,
+            parallel: None,
+            function_body: Some(CreateFunctionBody::AsBeforeOptions(Expr::Value(
+                (Value::DollarQuotedString(DollarQuotedString {value: "\nBEGIN\n    IF a <> b THEN\n        RETURN TRUE;\n    ELSE\n        RETURN FALSE;\n    END IF;\nEND;\n".to_owned(), tag: None})).with_empty_span()
+            ))),
+            if_not_exists: false,
+            using: None,
+            determinism_specifier: None,
+            options: None,
+            remote_connection: None,
+        })
+    );

+    let sql4 = r#"CREATE OR REPLACE FUNCTION check_values_different(int1 INT, int2 INT) RETURNS BOOLEAN LANGUAGE plpgsql AS $$
+BEGIN
+    IF int1 <> int2 THEN
+        RETURN TRUE;
+    ELSE
+        RETURN FALSE;
+    END IF;
+END;
+$$"#;
+    assert_eq!(
+        pg_and_generic().verified_stmt(sql4),
+        Statement::CreateFunction(CreateFunction {
+            or_alter: false,
+            or_replace: true,
+            temporary: false,
+            name: ObjectName::from(vec![Ident::new("check_values_different")]),
+            args: Some(vec![
+                OperateFunctionArg::with_name(
+                    "int1",
+                    DataType::Int(None)
+                ),
+                OperateFunctionArg::with_name(
+                    "int2",
+                    DataType::Int(None)
+                ),
+            ]),
+            return_type: Some(DataType::Boolean),
+            language: Some("plpgsql".into()),
+            behavior: None,
+            called_on_null: None,
+            parallel: None,
+            function_body: Some(CreateFunctionBody::AsBeforeOptions(Expr::Value(
+                (Value::DollarQuotedString(DollarQuotedString {value: "\nBEGIN\n    IF int1 <> int2 THEN\n        RETURN TRUE;\n    ELSE\n        RETURN FALSE;\n    END IF;\nEND;\n".to_owned(), tag: None})).with_empty_span()
+            ))),
+            if_not_exists: false,
+            using: None,
+            determinism_specifier: None,
+            options: None,
+            remote_connection: None,
+        })
+    );
+}

 #[test]
 fn parse_create_function() {
     let sql = "CREATE FUNCTION add(INTEGER, INTEGER) RETURNS INTEGER LANGUAGE SQL IMMUTABLE STRICT PARALLEL SAFE AS 'select $1 + $2;'";