I am trying to parse the following SELECT statement:
select 1 union all (select 1) union all (with cte as (select 1) select 1 from tbl limit 1)
union all select 1 union all (select 1 limit 1)
And this is the grammar I currently have (at least to be able to reproduce the issue):
parser grammar DBParser;
options { tokenVocab = DBLexer;}
root
: selectStatement SEMI? EOF
;
selectStatement
: withClause?
( selectClause | OPEN_PAREN selectStatement CLOSE_PAREN | selectStatement setOperand selectStatement)
orderLimitClause?
;
withClause
: WITH IDENTIFIER AS OPEN_PAREN selectClause CLOSE_PAREN
;
orderLimitClause
: LIMIT NUMBER
;
selectClause
: SELECT NUMBER ( FROM IDENTIFIER )?
;
setOperand
: UNION ALL?
;
lexer grammar DBLexer;
options { caseInsensitive=true; }
SELECT : 'SELECT'; // SELECT *...
LIMIT : 'LIMIT'; // ORDER BY x LIMIT 20
ALL : 'ALL'; // SELECT ALL vs. SELECT DISTINCT; WHERE ALL (...); UNION ALL...
UNION : 'UNION'; // Set operation
FROM : 'FROM'; // FROM clause
AS : 'AS'; // Alias / CTE definition
WITH : 'WITH'; // Common table expression (WITH clause)
SEMI : ';'; // Statement terminator
OPEN_PAREN : '('; // Function calls, object declarations
CLOSE_PAREN : ')';
NUMBER
: [0-9]+
;
IDENTIFIER
: [A-Z_] [A-Z_0-9]*
;
WHITESPACE
: [ \t\r\n] -> skip
;
The issue is the selectStatement rule, which refers to itself from within its own alternatives:
selectStatement
: withClause?
( selectClause | OPEN_PAREN selectStatement <-- here
;
What might be a possible way to resolve this? Note that the main part of the SQL grammar is from
Answer:
You'll need a higher-level rule that separates the set operation from the individual SELECT. Maybe this:
selectStatement
: simpleSelectStatement (setOperand selectStatement)?
;
simpleSelectStatement
: withClause?
( selectClause | OPEN_PAREN selectStatement CLOSE_PAREN)
orderLimitClause?
;
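A possible alternative, offered as a sketch rather than a tested drop-in: ANTLR 4 can rewrite direct left recursion by itself, but only when the self-reference is the first element of a top-level alternative. In the original rule it sits inside a parenthesized sub-block and behind the optional withClause, which is why the tool rejects it. Hoisting the set operation into its own top-level alternative should let the single-rule layout work. The other rules (withClause, selectClause, orderLimitClause, setOperand) are assumed to be unchanged from the question.

```antlr
// Sketch only: relies on ANTLR 4's built-in handling of direct left recursion.
selectStatement
    : selectStatement setOperand selectStatement    // recursive alternative: the self-reference comes first
    | withClause?
      ( selectClause | OPEN_PAREN selectStatement CLOSE_PAREN )
      orderLimitClause?                             // non-recursive start that anchors the recursion
    ;
```

With this form the set operators group left to right, whereas the right-recursive version above nests them to the right. Both accept the same input, but the shape of the parse tree differs, which can matter if further set operators are added later.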