Writing the body of a SQL grammar

Time:08-10

I am new to ANTLR4 and am sketching out the skeleton of a SQL grammar, where a script or editor will eventually contain one or more SQL statements of the form:

statement1; 
statement2;
statement3; 
etc...

Here is what I have thus far:

root
    : sqlStatements EOF
    ;

sqlStatements
    : sqlStatement*
    ;

sqlStatement
    : MOCK SEMI?
    ;

MOCK
    : [a-zA-Z0-9 \n]    // mock of some stuff to emulate a statement
    ;

SEPARATOR
    : [ \t\n] 
    ;

SEMI
    : ';'
    ;

And my input:

# input.txt
hi;
this is sql; and so is this
and continues here; and again the fourth statement

This parses correctly, but since this is the first grammar I've written with ANTLR4, I'm wondering about the following:

  • Is there a good/common way to eat common separators? For example, say I have a bunch of whitespace between a separator token and the mock SQL statement. Currently I'm consuming the separators inside my mock statement; what would be a better way? Basically, I'm just looking to 'throw out' all whitespace.
  • Does the grammar look ok or have any obvious errors?

CodePudding user response:

You can add the skip command to your lexer rule. This is very common, since it saves you from having to spell out in every parser rule where something like whitespace is allowed. Skipped input is discarded by the lexer and never appears in the token stream.

SEPARATOR
    : [ \t\n]  -> skip
    ;
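
One caveat with the question's grammar: MOCK is declared before SEPARATOR and its character set includes the space and the newline. When two lexer rules match the same amount of input, ANTLR prefers the rule listed first, so those characters would still become MOCK tokens and the skip rule would only ever see tabs. MOCK also matches just one character at a time. Below is a minimal sketch of one way to rearrange things, assuming whitespace should never be part of a statement. The grammar name MockSql, the `+` quantifiers, the added `\r`, and the semicolon-separated statement list are assumptions for illustration, not part of the original post.

```antlr
grammar MockSql;   // hypothetical grammar name

root
    : sqlStatements EOF
    ;

// Zero or more statements separated by ';', with an optional trailing ';'
sqlStatements
    : (sqlStatement (SEMI sqlStatement)* SEMI?)?
    ;

// A mock statement is simply a run of word-like tokens
sqlStatement
    : MOCK+
    ;

// No space or newline in the set, so MOCK no longer swallows separators
MOCK
    : [a-zA-Z0-9]+
    ;

SEMI
    : ';'
    ;

// Whitespace is dropped by the lexer and never reaches the parser
SEPARATOR
    : [ \t\r\n]+ -> skip
    ;
```

With this sketch, `hi;` lexes as MOCK SEMI and `this is sql;` as three MOCK tokens followed by SEMI; the whitespace in between produces no tokens at all.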

You can also send them to another channel if you might want access to those tokens later but don't want to deal with them in parser rules.
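
As a sketch of the channel variant, assuming the same SEPARATOR rule (with `\r` and `+` added here as an assumption), the predefined HIDDEN channel can be used in place of skip:

```antlr
// Whitespace tokens are still created, but on the hidden channel,
// so parser rules never see them.
SEPARATOR
    : [ \t\r\n]+ -> channel(HIDDEN)
    ;
```

The parser behaves the same as with skip, but the whitespace tokens stay in the token stream. That can help if a later tool, such as a formatter, needs the original spacing; in the Java runtime, for example, they should be reachable through methods like `getHiddenTokensToLeft` on the token stream.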
