Whether to parse multiple statements at once or not-CodePudding

I noticed that in the

Finally, for testing purposes, is this a good way to add a convenience method for debugging in IntelliJ, or how would this normally be done if the grammar only expects one statement at a time, and you want to, for example, test to make sure all ten statements are correct?

root
    : EOF
    // this line is for testing only
    | selectStatement (SEMICOLON selectStatement)* (SEMICOLON EOF? | EOF)
    // this line is for the actual parser
    | selectStatement (SEMICOLON EOF? | EOF)
    ;

CodePudding user response：

Whether to allow for multiple statements is pretty much just a grammar design choice. Depending upon the context, it might have been more straightforward to know you'll only see a single statement at a time, or that you can easily separate multiple statements and send each to a parser.

It does look like it would be useful for you.

A simpler version would be:

root : selectStatement? (SEMICOLON selectStatement)* SEMICOLON? EOF

you should always have an EOF

Another thing that doesn't always dawn on designers is that your grammar can have multiple start rules. So you could also have a

selectStart: selectStatement SEMICOLO? EOF;

rule that only allows for a single statement and depending upon you situation you can choose which start rule to use. I had a graphical tool for a language I wrote so sometimes I parsed exprs, sometimes stmts and sometimes scripts. Each had its own start rule. But don't forget to end a start rule with an EOF. This forces the parser to look at ALL of your input. Without it, it will parse as much as it can, but will ignore training input that doesn't fit a parse rule. (well, it's possible not to have EOF, IF you have a custom stream that remains open so there is no end of input. However, this would not be the case in your situation.)

CodePudding user response：

There are several arguments pro single-statement processing:

Killer Reason: Each statement can use a different delimiter (which you cannot handle in the parser grammar). Delimiters are not part of the SQL syntax.
In editors you will want to know where a statement starts and ends, without first parsing the full text (think of megabytes sized dumps), e.g. for executing a single statement.
You don't want to miss all following statement details, if only one statement contains a syntax error.
Parsing a single statement at a time gives you much better response times (e.g. when editing SQL code while it's still being parsed).
The server can process single statements only, anyway.

I implemented the statement handling in MySQL Workbench and did it the same way in MySQL Shell for VS Code. The statement splitter is usually very fast (100ms in C for a million statements, depending on the box it runs on). This allows to do a quick first run for the statement ranges, show the statement indicator and make the editor ready for statement requests (for execution). After than a background thread is used to parse the individual statements for errors, which can be stopped at any time when a statement was edited.