I have the following grammar which works fine:
selectStatement
: simpleSelectStatement (setOperand selectStatement)?;
However, I would like to break up selectStatement so that the top-level rule tells us whether the statement contains a set operation at all. For example:
selectStatement
: simpleSelectStatement | setOperation
;
setOperation
: simpleSelectStatement (setOperand selectStatement)
;
Unfortunately, to choose between the two alternatives, the parser has to scan the entire SELECT statement to see whether a UNION follows, and only then does it know which rule to delegate to. For example, one test statement took 24 tokens of lookahead just to figure out what type of statement it was!
Is there a way to resolve this, or is the only option basically "put it back into one root statement type"? (The UNION usually comes so late in the statement that deciding the statement type up front could take almost an entire parse by itself.) Here is a full working grammar to test with:
grammar DBParser;
options { caseInsensitive=true; }
root
: selectStatement SEMI? EOF
;
selectStatement
: simpleSelectStatement | setOperation
;
setOperation
: simpleSelectStatement (setOperand selectStatement)
;
simpleSelectStatement
: selectClause
| OPEN_PAREN selectStatement CLOSE_PAREN
;
selectClause
: SELECT selectItem (COMMA selectItem)*
;
selectItem
: NUMBER ( FROM IDENTIFIER )?
;
setOperand
: UNION ALL?|EXCLUDE|INTERSECT
;
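SELECT : 'SELECT'; // SELECT *...
LIMIT : 'LIMIT'; // ORDER BY x LIMIT 20
ALL : 'ALL'; // SELECT ALL vs. SELECT DISTINCT; WHERE ALL (...); UNION ALL...
UNION : 'UNION'; // Set operation
EXCLUDE : 'EXCLUDE'; // Set operation (referenced by setOperand, was missing a token definition)
INTERSECT : 'INTERSECT'; // Set operation (referenced by setOperand, was missing a token definition)
FROM : 'FROM'; // FROM clause
AS : 'AS'; // Alias
WITH : 'WITH'; // Common table expression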
SEMI : ';'; // Statement terminator
OPEN_PAREN : '('; // Function calls, object declarations
CLOSE_PAREN : ')';
COMMA : ',';
NUMBER
: [0-9]+
;
IDENTIFIER
: [A-Z_] [A-Z_0-9]*
;
WHITESPACE
: [ \t\r\n] -> skip
;
CodePudding user response:
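Maybe just try labelled alternatives? Fold the three shapes into a single selectStatement rule and label each alternative, so the type of the parse tree node tells you which kind of statement you have: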
grammar DBParser;
options { caseInsensitive=true; }
root
: selectStatement SEMI? EOF
;
selectStatement
: SELECT selectItem (COMMA selectItem)* # simpleSelect
| OPEN_PAREN selectStatement CLOSE_PAREN # parenSelect
| selectStatement setOperand selectStatement # setOperation
;
//setOperation
// : simpleSelectStatement setOperand selectStatement # set
// ;
//simpleSelectStatement:
// selectClause
// | OPEN_PAREN selectStatement CLOSE_PAREN
// ;
//selectClause
// : SELECT selectItem (COMMA selectItem)*
// ;
selectItem
: NUMBER ( FROM IDENTIFIER )?
;
setOperand
: UNION ALL?|EXCLUDE|INTERSECT
;
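SELECT : 'SELECT'; // SELECT *...
LIMIT : 'LIMIT'; // ORDER BY x LIMIT 20
ALL : 'ALL'; // SELECT ALL vs. SELECT DISTINCT; WHERE ALL (...); UNION ALL...
UNION : 'UNION'; // Set operation
EXCLUDE : 'EXCLUDE'; // Set operation (referenced by setOperand, was missing a token definition)
INTERSECT : 'INTERSECT'; // Set operation (referenced by setOperand, was missing a token definition)
FROM : 'FROM'; // FROM clause
AS : 'AS'; // Alias
WITH : 'WITH'; // Common table expression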
SEMI : ';'; // Statement terminator
OPEN_PAREN : '('; // Function calls, object declarations
CLOSE_PAREN : ')';
COMMA : ',';
NUMBER
: [0-9]+
;
IDENTIFIER
: [A-Z_] [A-Z_0-9]*
;
WHITESPACE
: [ \t\r\n] -> skip
;
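Why this should avoid the long lookahead: as far as I understand it, ANTLR 4 rewrites a directly left-recursive rule into a "primary" part followed by a loop over the operator. The sketch below shows roughly that shape. It is a simplified illustration, not the code ANTLR actually generates (the real rewrite also passes a precedence parameter and uses predicates), and selectPrimary is a hypothetical helper name that is not part of the grammar above.

```antlr
// Conceptual sketch only: approximately what the left-recursive
// selectStatement rule behaves like after ANTLR's internal rewrite.
selectStatement
    : selectPrimary (setOperand selectPrimary)*   // after one operand, a single token decides: setOperand or not
    ;

// Hypothetical helper rule (an assumption for illustration, not in the original grammar).
selectPrimary
    : SELECT selectItem (COMMA selectItem)*       // starts with SELECT
    | OPEN_PAREN selectStatement CLOSE_PAREN      // starts with '('
    ;
```

Every decision in that shape needs only one token: SELECT versus OPEN_PAREN to pick the operand, then "is the next token a setOperand?" to decide whether to continue. You would still keep the labelled version in your real grammar, because the labels are what give you distinct context classes (SimpleSelectContext, ParenSelectContext, SetOperationContext) and the matching listener/visitor methods such as visitSetOperation. The selectStatement directly under root is then a SetOperationContext exactly when the outermost statement is a set operation.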