Finally, for testing purposes, is this a good way to add a convenience method for debugging in IntelliJ? If not, how would this normally be done when the grammar only expects one statement at a time, but you want to test, for example, that all ten statements are correct?
root
: EOF
// this line is for testing only
| selectStatement (SEMICOLON selectStatement)* (SEMICOLON EOF? | EOF)
// this line is for the actual parser
| selectStatement (SEMICOLON EOF? | EOF)
;
CodePudding user response:
Whether to allow for multiple statements is pretty much just a grammar design choice. Depending on the context, it may be simpler to assume that you'll only ever see a single statement at a time, or that multiple statements can easily be separated and each sent to the parser on its own.
It does look like it would be useful for you.
A simpler version would be:
root : selectStatement? (SEMICOLON selectStatement)* SEMICOLON? EOF ;
You should always have an EOF.
Another thing that doesn't always dawn on designers is that your grammar can have multiple start rules. So you could also have a
selectStart: selectStatement SEMICOLON? EOF;
rule that only allows for a single statement, and depending upon your situation you can choose which start rule to use. I had a graphical tool for a language I wrote, so sometimes I parsed exprs, sometimes stmts and sometimes scripts. Each had its own start rule. But don't forget to end a start rule with an EOF. This forces the parser to look at ALL of your input. Without it, it will parse as much as it can, but will ignore trailing input that doesn't fit a parse rule.
(Well, it's possible to leave out EOF if you have a custom stream that remains open, so that there is no end of input. However, this would not be the case in your situation.)
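To make the EOF point concrete, here is a minimal hand-written sketch in Python. It is not ANTLR and not generated code. The tokenizer, the toy "SELECT x FROM y" rule and the names tokenize, match_select and parse_single are all assumptions made up for illustration. It only mimics the behaviour described above: without the end-of-input check the recognizer stops quietly after the part it understands, and with the check the trailing input becomes an error.

```python
import re

# Toy tokenizer: identifiers, '*', ',' and ';' only. Not a real SQL lexer.
TOKEN = re.compile(r"\s*([A-Za-z_]\w*|[*,;])")


def tokenize(text):
    """Split text into a flat list of token strings."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def match_select(tokens, i):
    """Toy stand-in for selectStatement: SELECT (name | '*') FROM name.

    Returns the index just past the statement, or None if it does not match.
    """
    part = tokens[i:i + 4]
    if (len(part) == 4
            and part[0].upper() == "SELECT"
            and part[1] not in {";", ","}
            and part[2].upper() == "FROM"
            and part[3] not in {";", ",", "*"}):
        return i + 4
    return None


def parse_single(tokens, require_eof):
    """Toy version of `selectStatement SEMICOLON? EOF`.

    Returns the number of tokens consumed. With require_eof=False the
    end-of-input check is skipped, which imitates a start rule without EOF.
    """
    i = match_select(tokens, 0)
    if i is None:
        raise SyntaxError("expected a select statement")
    if i < len(tokens) and tokens[i] == ";":
        i += 1
    if require_eof and i != len(tokens):
        raise SyntaxError(f"unexpected input at token {i}: {tokens[i]!r}")
    return i
```

With the input "select a from t; drop everything", calling parse_single(tokens, require_eof=False) consumes the first five tokens and silently ignores the rest, while require_eof=True raises a SyntaxError that points at the leftover "drop".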
CodePudding user response:
There are several arguments in favor of single-statement processing:
- Killer Reason: Each statement can use a different delimiter (which you cannot handle in the parser grammar). Delimiters are not part of the SQL syntax.
- In editors you will want to know where a statement starts and ends without first parsing the full text (think of megabyte-sized dumps), e.g. for executing a single statement.
- You don't want to lose the details of all following statements just because one statement contains a syntax error.
- Parsing a single statement at a time gives you much better response times (e.g. when editing SQL code while it's still being parsed).
- The server can process single statements only, anyway.
I implemented the statement handling in MySQL Workbench and did it the same way in MySQL Shell for VS Code. The statement splitter is usually very fast (100ms in C for a million statements, depending on the box it runs on). This allows a quick first run to determine the statement ranges, show the statement indicators and make the editor ready for statement requests (for execution). After that, a background thread is used to parse the individual statements for errors, and it can be stopped at any time when a statement is edited.
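For illustration, a splitter along these lines could be sketched as follows. This is not the MySQL Workbench implementation, only a minimal Python sketch under simplifying assumptions: it knows quoted strings, backtick identifiers and "--" line comments, it ignores block comments and the DELIMITER command itself, and the names split_statements, check_all and parse_one are hypothetical. The parse_one callback stands in for whatever calls your single-statement start rule.

```python
def split_statements(text, delimiter=";"):
    """Return (start, end) offsets of every non-empty statement in text.

    The delimiter is a parameter because it is not part of the SQL syntax
    and may differ between scripts.
    """
    ranges = []
    start = i = 0
    n = len(text)
    while i < n:
        ch = text[i]
        if ch in ("'", '"', "`"):
            # Skip a quoted string or identifier so that a delimiter
            # inside it does not end the statement.
            i += 1
            while i < n:
                if text[i] == "\\" and ch != "`":
                    i += 2  # backslash escape: skip the escaped character
                    continue
                if text[i] == ch:
                    if i + 1 < n and text[i + 1] == ch:
                        i += 2  # doubled quote is an escaped quote
                        continue
                    break
                i += 1
            i += 1  # step past the closing quote
        elif text.startswith("--", i):
            # Line comment: skip to the end of the line.
            end = text.find("\n", i)
            i = n if end == -1 else end + 1
        elif text.startswith(delimiter, i):
            if text[start:i].strip():
                ranges.append((start, i))
            i += len(delimiter)
            start = i
        else:
            i += 1
    if text[start:].strip():
        ranges.append((start, n))  # last statement without a delimiter
    return ranges


def check_all(text, parse_one, delimiter=";"):
    """Run parse_one on each statement and collect ((start, end), message)
    for every statement that raises, so one bad statement does not hide
    the results for the statements after it."""
    failures = []
    for s, e in split_statements(text, delimiter):
        try:
            parse_one(text[s:e])
        except Exception as exc:
            failures.append(((s, e), str(exc)))
    return failures
```

For the test case in the question, you would put the ten statements into one string, pass it to check_all together with a function that runs the single-statement start rule, and assert that the returned list is empty. The grammar itself then never needs a testing-only alternative.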