Why does using a character class make this left-recursive?

Time:08-10

I have the following to define a basic arithmetic grammar:

grammar Calc;


expression
    : '(' expression ')'                // parenExpression has highest precedence
    | expression ('*' | '/') expression // then multDivExpression
    | expression ('+' | '-') expression // then addSubExpression
    | OPERAND
    ;

// 12 or .12 or 2. or 2.38
OPERAND
    : [0-9]+ ('.' [0-9]*)?
    | '.' [0-9]+
    ;

And it can handle something like:

1+2*3

However, as soon as I change the ('*' | '/') to the character class [*/] I get the following error:

[Image: screenshot of the ANTLR error, which reports left recursion in the grammar]

How does adding a character class cause this? Is it because I'm using it in a parser rule rather than a lexer rule, or is something else going on?

Update: extracting the character class out of the grammar rule fixes it, but it'd be great to understand why:

expression
    : '(' expression ')'           // parenExpression has highest precedence
    | expression MULDIV expression // then multDivExpression
    | expression ADDSUB expression // then addSubExpression
    | OPERAND
    ;

MULDIV
    : [*/]
    ;

ADDSUB
    : [-+]
    ;

CodePudding user response:

Character classes are only available to Lexer rules.

When you put literals ('+', '/', etc.) in a parser rule, implicit Lexer rules are defined for you (and given rather unusable names like T__1, T__2, etc.). So literals are just a convenient way of inserting implicit tokens. A character class gets no such treatment; it only works inside an actual Lexer rule.

Not sure which version of ANTLR you're using, but I get a "rule expression has no defined parameters" error, and not mutual left recursion.

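I think you're seeing a side effect of the character class syntax not being available in parser rules. There, [...] has an entirely different meaning: placed after a rule reference, it passes arguments to that rule. So expression [*/] expression is read as a call to expression with the argument */, which would explain the "no defined parameters" message.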
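To illustrate the two meanings of the square brackets, here is a minimal sketch. It is not taken from the question: the grammar name ArgsDemo, the rules start and item, and the parameter depth are all made up for the example, and the parameter declaration assumes the default Java target.

```antlr
grammar ArgsDemo;   // hypothetical grammar, for illustration only

start
    : item[0] EOF               // parser rule: [...] passes the argument 0
    ;

// A parser rule that declares one parameter (Java target assumed).
item[int depth]
    : '(' item[$depth + 1] ')'  // parser rule: [...] passes an argument again
    | OPERAND
    ;

OPERAND
    : [0-9]+                    // lexer rule: [...] is a character class
    ;

WS
    : [ \t\r\n]+ -> skip        // lexer rule: character class for whitespace
    ;
```

The same bracket syntax is therefore an argument list in start and item, but a set of characters in OPERAND and WS. This is why a set such as [*/] has to live in its own lexer rule, as in the MULDIV rule from the question's update.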
