I am having a tough time figuring out a way to handle the SQL set operation, that it, being able to do things like query1 UNION ALL query2
. It should support the following statements:
select 1 union all select 2 -- ok
with cte as (select 1) select 1 from cte union all select 2 -- ok
(select 1 limit 1) union all select 1 union all (select 2) -- ok
And should not support the following, as one of the table operands to the query expression needs to be parenthesized:
select 1 from cte union all with cte as (select 1) -- no
select 1 limit 1 union select 1 -- no
Here is the grammar I have so far:
selectStatement:
withClause?
(selectStatementWithParens|selectClause|setOperation)
orderClause?
(limitClause offsetClause?)?
;
selectStatementWithParens
: OPEN_PAREN selectStatement CLOSE_PAREN
;
selectClause:
SELECT // a bunch of other stuff...
;
setOperation:
(selectClause | selectStatementWithParens)
(setOperand (selectClause | selectStatementWithParens))*
;
setOperand
: UNION (DISTINCT|ALL)?
| INTERSECT DISTINCT?
| EXCEPT DISTINCT?
;
Though it does work -- it seems to be a bit heavy handed, and I don't like having the selectStatementWithParens
underneath the WITH
(though it does work, such the WITH
is optional, and so it can support (select 1)
or (with x as (select 1) select 1))
. How could this be improved, or does it already look pretty good?
CodePudding user response:
Consider this:
parser grammar DBParser
;
options {
tokenVocab = DBLexer;
}
root: selectStatement SEMI? EOF;
selectStatement
: selectClause # selectClauseStatement
| selectStatement limitClause # limitedSelectStatement
| OPEN_PAREN selectStatement CLOSE_PAREN # parenSelectStatement
| selectStatement setOperand selectStatement # setOperationStatement
;
selectClause: SELECT NUMBER;
limitClause: LIMIT NUMBER;
setOperand: UNION ALL?;
It gives a usable parse tree (so far as I can tell) and has max K = 1
I may have missed some subtlety, but perhaps it puts you on a better path.