Improve the parsing of methods in MySQL #31556

Open
14 tasks
TherChenYang opened this issue Jun 3, 2024 · 4 comments · May be fixed by #33335

Comments

@TherChenYang
Collaborator

TherChenYang commented Jun 3, 2024

Background

Hi community.
The ShardingSphere SQL parser engine parses SQL into an AST (Abstract Syntax Tree) and then visits the AST to produce a SQLStatement (a Java object).

Currently, we are planning to enhance the support for MySQL SQL parsing in ShardingSphere.

More details:
https://shardingsphere.apache.org/document/current/en/reference/sharding/parse/

Issue Background Explanation

The original parsing work may have overlooked the parsing of method parameters. ShardingSphere needs to pay attention to the table names and field names that appear in method parameters. If method parameters are parsed incorrectly, it will cause problems in the subsequent binding and rewriting tasks.

For the verification work, we need to complete the following items.

  1. Find example SQL for this method on the official website: Built-In Function and Operator Reference.
  2. Verify that the SQL itself can be parsed by the parser.
  3. Check that the parsed SQLStatement correctly captures the parameters of the method.
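
To make step 3 concrete, here is a minimal sketch of what "capturing the parameters" means for a single call. It is not ShardingSphere's parser: `FunctionArgumentSketch` and `topLevelArguments` are hypothetical names, and the helper only splits the text of one call on top-level commas. It assumes single-quoted strings without backslash escapes. The list it returns is the kind of expected value you can compare against the parameters found in the parsed SQLStatement.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Toy helper (hypothetical, not part of ShardingSphere): lists the top-level arguments of one call. */
final class FunctionArgumentSketch {

    /**
     * Returns the top-level arguments of the first call to functionName in sql.
     * Commas inside nested parentheses or single-quoted strings do not split arguments.
     */
    static List<String> topLevelArguments(String sql, String functionName) {
        String upperSql = sql.toUpperCase(Locale.ROOT);
        int nameStart = upperSql.indexOf(functionName.toUpperCase(Locale.ROOT) + "(");
        if (nameStart < 0) {
            throw new IllegalArgumentException("No call to " + functionName + " in: " + sql);
        }
        List<String> arguments = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;
        boolean inQuote = false;
        // Start at the opening parenthesis that follows the function name.
        for (int i = nameStart + functionName.length(); i < sql.length(); i++) {
            char c = sql.charAt(i);
            if (c == '\'') {
                inQuote = !inQuote;
                current.append(c);
            } else if (inQuote) {
                current.append(c);
            } else if (c == '(') {
                if (depth > 0) {
                    current.append(c); // parenthesis of a nested call
                }
                depth++;
            } else if (c == ')') {
                depth--;
                if (depth == 0) {
                    String last = current.toString().trim();
                    if (!last.isEmpty() || !arguments.isEmpty()) {
                        arguments.add(last);
                    }
                    return arguments;
                }
                current.append(c);
            } else if (c == ',' && depth == 1) {
                arguments.add(current.toString().trim());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        throw new IllegalArgumentException("Unbalanced parentheses in: " + sql);
    }

    public static void main(String[] args) {
        // prints ['hi', 4, '??']
        System.out.println(topLevelArguments("SELECT LPAD('hi',4,'??')", "LPAD"));
        // prints [LENGTH(user_name), 3]
        System.out.println(topLevelArguments("SELECT LEAST(LENGTH(user_name), 3) FROM t_user", "LEAST"));
        // prints []
        System.out.println(topLevelArguments("SELECT LAST_INSERT_ID()", "LAST_INSERT_ID"));
    }
}
```

The second call shows the case this issue cares about: a column name (`user_name`, a made-up column) nested inside another function's parameter still has to be found by the real parser.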

Task

LAG()
LAST_INSERT_ID()
LAST_VALUE()
LCASE()
LEAD()
LEAST()
LEFT()
LENGTH()
LN()
LOCALTIME()
LOCALTIMESTAMP()
LOCATE()
LOWER()
LPAD()

Overall Procedure

If you intend to participate in fixing this issue, please feel free to leave a comment below the issue. Community members will assign the issue accordingly.

For example, you can leave a comment like this: "Hi, please assign this issue to me. Thank you!"

Once you have claimed the issue, please review the syntax of the SQL on the official website of the corresponding database. Execute the SQL on the respective database to ensure the correctness of the SQL syntax.

You can check the corresponding source of each SQL case on the official database website by clicking on the link provided below each case.

Next, execute the SQL cases mentioned above in the database to ensure that the SQL syntax itself is correct. You can quickly start the corresponding database from its Docker image and then connect to it with a client you are familiar with.

Fixing ANTLR Grammar Parsing Issue

Once you have confirmed the correctness of the SQL syntax, you can validate and fix the grammar parsing issue in ShardingSphere.

If you are using IntelliJ IDEA, you will need to install the ANTLR plugin before proceeding.

If ANTLR reports a parsing error, repair the .g4 file by comparing it with the official database syntax until ANTLR can parse the SQL correctly.

When there is no error message in the ANTLR Preview window, it means that ANTLR can correctly parse the SQL.

Visitor problem fix

After ANTLR parses the SQL into an abstract syntax tree, ShardingSphere traverses the tree through a Visitor and extracts the required information.
If you need to extract Segments, first run the following command under the shardingsphere-parser module to compile the entire parser module:

mvn -T 2C clean install -DskipTests

Then override the corresponding visit method in SQLStatementVisitor as needed to extract the corresponding Segment.
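
The following is a minimal sketch of the visitor idea only, assuming a toy AST. None of these classes exist in ShardingSphere: `Node`, `FunctionNode`, `ColumnNode`, `LiteralNode` and `ColumnCollector` are hypothetical stand-ins for the ANTLR-generated contexts and the real Segments. It shows why each visit method has to descend into function parameters: a column referenced inside a nested call is found only if the visitor recurses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Toy AST and visitor (hypothetical names, not ShardingSphere classes). */
final class ParameterVisitorSketch {

    interface Node {
        <T> T accept(Visitor<T> visitor);
    }

    interface Visitor<T> {
        T visitFunction(FunctionNode node);

        T visitColumn(ColumnNode node);

        T visitLiteral(LiteralNode node);
    }

    static final class FunctionNode implements Node {
        final String name;
        final List<Node> parameters;

        FunctionNode(String name, Node... parameters) {
            this.name = name;
            this.parameters = Arrays.asList(parameters);
        }

        @Override
        public <T> T accept(Visitor<T> visitor) {
            return visitor.visitFunction(this);
        }
    }

    static final class ColumnNode implements Node {
        final String name;

        ColumnNode(String name) {
            this.name = name;
        }

        @Override
        public <T> T accept(Visitor<T> visitor) {
            return visitor.visitColumn(this);
        }
    }

    static final class LiteralNode implements Node {
        final String text;

        LiteralNode(String text) {
            this.text = text;
        }

        @Override
        public <T> T accept(Visitor<T> visitor) {
            return visitor.visitLiteral(this);
        }
    }

    /** Collects every column name referenced anywhere inside a function's parameters. */
    static final class ColumnCollector implements Visitor<List<String>> {
        @Override
        public List<String> visitFunction(FunctionNode node) {
            List<String> columns = new ArrayList<>();
            for (Node parameter : node.parameters) {
                // Recurse so that columns inside nested calls are not lost.
                columns.addAll(parameter.accept(this));
            }
            return columns;
        }

        @Override
        public List<String> visitColumn(ColumnNode node) {
            return Collections.singletonList(node.name);
        }

        @Override
        public List<String> visitLiteral(LiteralNode node) {
            return Collections.emptyList();
        }
    }

    public static void main(String[] args) {
        // Models LPAD(LOWER(user_name), 10, '*'); user_name is a made-up column.
        Node call = new FunctionNode("LPAD",
                new FunctionNode("LOWER", new ColumnNode("user_name")),
                new LiteralNode("10"),
                new LiteralNode("'*'"));
        System.out.println(call.accept(new ColumnCollector())); // prints [user_name]
    }
}
```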

Add assertion test file

Once the SQL parsing problem above is fixed, add the corresponding test cases.
The steps are as follows:

  1. Add the corresponding sql-case in the sql/supported directory.
  2. Add case assertions in the case directory of the shardingsphere-test-it-parser module.
  3. Run org.apache.shardingsphere.test.it.sql.parser.internal.InternalSQLParserIT
    After SQL Parser IT runs successfully, you can submit a PR.
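
When writing the case assertions in step 2, character positions are easy to get wrong by hand. The sketch below assumes that the assertion files record zero-based, inclusive start and stop offsets for each segment. Check that assumption against existing cases in the module before relying on it. `IndexSketch` and `inclusiveRange` are hypothetical names, not part of the test framework.

```java
/** Toy helper (hypothetical): computes the offsets of a fragment inside a SQL string. */
final class IndexSketch {

    /** Returns {start, stop} of the first occurrence of fragment in sql, zero-based and inclusive. */
    static int[] inclusiveRange(String sql, String fragment) {
        int start = sql.indexOf(fragment);
        if (start < 0) {
            throw new IllegalArgumentException("Fragment not found: " + fragment);
        }
        return new int[] {start, start + fragment.length() - 1};
    }

    public static void main(String[] args) {
        int[] range = inclusiveRange("SELECT LPAD('hi',4,'??')", "LPAD('hi',4,'??')");
        System.out.println(range[0] + " " + range[1]); // prints 7 23
    }
}
```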

Relevant Skills

  1. Be proficient in Java
  2. Have a basic understanding of ANTLR g4 files
  3. Be familiar with MySQL SQL

github-actions bot commented Jul 3, 2024

There hasn't been any activity on this issue recently, and in order to prioritize active issues, it will be marked as stale.

@github-actions github-actions bot added the stale label Jul 3, 2024
@terrymanu terrymanu removed the stale label Jul 7, 2024
@ashishbania

Hi, please assign this issue to me. Thank you!

@strongduanmu
Member

@ashishbania Welcome, I have just assigned this issue to you. Please enjoy it.

@ganesh-vk

Hi, I'd like to work on this issue if it has not been fixed yet, but I am unable to find the database used to verify the SQL queries. I only found the "employees" database that MySQL provides in its documentation. Am I supposed to create a test database to verify the syntax of the queries first, or is there a pre-existing database with which I can complete the verification? Thanks!
