Removing this left-recursive way to define a SELECT statement

Time:08-19

I am trying to parse the following SELECT statement:

select 1 union all (select 1) union all (with cte as (select 1) select 1 from tbl limit 1) 
union all select 1 union all (select 1 limit 1)

And this is the grammar I currently have (at least to be able to reproduce the issue):

parser grammar DBParser;
options { tokenVocab = DBLexer;}

root
    : selectStatement SEMI? EOF
    ;

selectStatement
    : withClause?
    ( selectClause | OPEN_PAREN selectStatement CLOSE_PAREN | selectStatement setOperand selectStatement)
    orderLimitClause?
    ;

withClause
    : WITH IDENTIFIER AS OPEN_PAREN selectClause CLOSE_PAREN
    ;

orderLimitClause
    : LIMIT NUMBER
    ;

selectClause
    : SELECT NUMBER ( FROM IDENTIFIER )?
    ;

setOperand
    : UNION ALL?
    ;

lexer grammar DBLexer;
options { caseInsensitive=true; }
SELECT              :           'SELECT';                   // SELECT *...
LIMIT               :           'LIMIT';                    // ORDER BY x LIMIT 20
ALL                 :           'ALL';                      // SELECT ALL vs. SELECT DISTINCT; WHERE ALL (...); UNION ALL...
UNION               :           'UNION';                    // Set operation
FROM                :           'FROM';                     // SELECT ... FROM tbl
AS                  :           'AS';                       // WITH cte AS (...)
WITH                :           'WITH';                     // Common table expression

SEMI                :           ';';                        // Statement terminator
OPEN_PAREN          :           '(';                        // Function calls, object declarations
CLOSE_PAREN         :           ')';

NUMBER
    : [0-9]+
    ;
IDENTIFIER
    : [A-Z_] [A-Z_0-9]*
    ;
WHITESPACE
    : [ \t\r\n] -> skip
    ;

The issue is the selectStatement rule, which refers to itself and is therefore left-recursive:

selectStatement
    : withClause?
    ( selectClause | OPEN_PAREN selectStatement <-- here 
    ;

What might be a possible way to resolve this? Note that the main part of the SQL grammar is from [image not available].

Answer:

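You'll need a higher-level rule that handles the set operation, so that selectStatement always starts with a non-recursive rule instead of a reference to itself. Maybe this:

selectStatement
    : simpleSelectStatement (setOperand selectStatement)?
    ;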

simpleSelectStatement
    : withClause?
    ( selectClause | OPEN_PAREN selectStatement CLOSE_PAREN)
    orderLimitClause?
    ;
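
To see why this shape avoids the left recursion, here is a rough hand-written sketch in Python that mirrors the two rules above as a recursive-descent parser. It is not ANTLR output: the names Parser, tokenize and parse, and the tuple/dict tree it returns, are made up for illustration. The point is that every function consumes at least one token before it calls itself again, which the original selectStatement rule could not guarantee.

```python
import re

# One token per match: a run of digits, a word, or a punctuation character.
TOKEN_RE = re.compile(r"\s*(?:(\d+)|([A-Za-z_][A-Za-z_0-9]*)|([();]))")
KEYWORDS = {"SELECT", "LIMIT", "ALL", "UNION", "FROM", "AS", "WITH"}

def tokenize(text):
    """Split text into (kind, value) pairs; keywords are case-insensitive."""
    tokens, pos, text = [], 0, text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character at offset {pos}")
        number, word, punct = m.groups()
        if number is not None:
            tokens.append(("NUMBER", number))
        elif word is not None:
            kind = word.upper() if word.upper() in KEYWORDS else "IDENTIFIER"
            tokens.append((kind, word))
        else:
            tokens.append((punct, punct))
        pos = m.end()
    tokens.append(("EOF", ""))
    return tokens

class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos][0]

    def accept(self, kind):
        """Consume the next token if it has the given kind."""
        if self.peek() == kind:
            self.pos += 1
            return True
        return False

    def expect(self, kind):
        """Consume a token of the given kind and return its text, or fail."""
        if not self.accept(kind):
            raise SyntaxError(f"expected {kind}, got {self.peek()}")
        return self.tokens[self.pos - 1][1]

    # root : selectStatement SEMI? EOF
    def root(self):
        tree = self.select_statement()
        self.accept(";")
        self.expect("EOF")
        return tree

    # selectStatement : simpleSelectStatement (setOperand selectStatement)?
    def select_statement(self):
        left = self.simple_select_statement()  # consumes tokens first
        if self.accept("UNION"):
            op = "UNION ALL" if self.accept("ALL") else "UNION"
            return (op, left, self.select_statement())
        return left

    # simpleSelectStatement
    #   : withClause? (selectClause | '(' selectStatement ')') orderLimitClause?
    def simple_select_statement(self):
        cte = None
        if self.accept("WITH"):
            name = self.expect("IDENTIFIER")
            self.expect("AS")
            self.expect("(")
            cte = (name, self.select_clause())
            self.expect(")")
        if self.accept("("):
            body = self.select_statement()  # recursion only after '('
            self.expect(")")
        else:
            body = self.select_clause()
        limit = self.expect("NUMBER") if self.accept("LIMIT") else None
        return {"with": cte, "body": body, "limit": limit}

    # selectClause : SELECT NUMBER (FROM IDENTIFIER)?
    def select_clause(self):
        self.expect("SELECT")
        value = self.expect("NUMBER")
        table = self.expect("IDENTIFIER") if self.accept("FROM") else None
        return ("select", value, table)

def parse(text):
    return Parser(text).root()
```

Calling parse() on the statement from the question should return a right-nested chain of "UNION ALL" tuples. That nesting is a side effect of writing the rule as simpleSelectStatement (setOperand selectStatement)?, so if left-associative set operations matter, the tree would need to be rebuilt afterwards or the rule written as a loop instead.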