antlr visitor: lookup of reserved words efficiently

Time:10-18

I'm learning ANTLR. As part of my learning process, I'm writing a little stack-based (RPN) language, along the lines of PostScript or Forth. For instance:

10 20 mul

This would push 10 and 20 on the stack and then perform a multiply, which pops two values, multiplies them, and pushes 200. I'm using the visitor pattern. And I find myself writing some code that's kind of insane. There has to be a better way.

Here's a section of my WaveParser.g4 file:

any_operator:
    value_operator |
    stack_operator |
    logic_operator |
    math_operator |
    flow_control_operator;

value_operator:
  BIND | DEF
  ;

stack_operator:
  DUP |
  EXCH |
  POP |
  COPY |
  ROLL |
  INDEX |
  CLEAR |
  COUNT
  ;

BIND is just the bind keyword, etc. So my visitor has this method:

antlrcpp::Any WaveVisitor::visitAny_operator(WaveParser::Any_operatorContext *ctx);

And now here's where I'm getting to the very ugly code I'm writing, which leads to the question.

Value::Operator op = Value::Operator::NO_OP;

WaveParser::Value_operatorContext * valueOp = ctx->value_operator();
WaveParser::Stack_operatorContext * stackOp = ctx->stack_operator();
WaveParser::Logic_operatorContext * logicOp = ctx->logic_operator();
WaveParser::Math_operatorContext * mathOp = ctx->math_operator();
WaveParser::Flow_control_operatorContext * flowOp = ctx->flow_control_operator();

if (valueOp) {
    if (valueOp->BIND()) {
        op = Value::Operator::BIND;
    }
    else if (valueOp->DEF()) {
        op = Value::Operator::DEF;
    }
}
else if (stackOp) {
    if (stackOp->DUP()) {
        op = Value::Operator::DUP;
    }
    ...
}
...

I'm supporting approximately 50 operators, and it's insane that I'm going to have this series of if statements to figure out which operator this is. There must be a better way to do this. I couldn't find a field on the context that mapped to something I could use in a hashmap table.

I don't know if I should give every one of my operators a separate rule and use the corresponding method in my visitor, or if there's something else I'm missing.

Is there a better way?

CodePudding user response:

With ANTLR, it's usually very helpful to label components of your rules, as well as the high-level alternatives.

If part of a parser rule can only be one thing with a single type, the default accessors are usually just fine. But if several alternatives are really variants of the "same thing", or the same sub-rule is referenced more than once in a parser rule and you want to tell the references apart, it's pretty handy to give them names. (Once you start doing this and see the effect on the Context classes, it'll become pretty obvious where labels provide value.)

Also, when rules have multiple top-level alternatives, it's very handy to give each of them a label. This will cause ANTLR to generate a separate Context class for each alternative, instead of dumping everything from every alternative into a single class.

(I made up some of the rules and tokens below just to get a grammar that compiles.)

grammar WaveParser;

any_operator
    : value_operator        # val_op
    | stack_operator        # stack_op
    | logic_operator        # logic_op
    | math_operator         # math_op
    | flow_control_operator # flow_op
    ;

value_operator: op = ( BIND | DEF);

stack_operator
    : op = (
        DUP
        | EXCH
        | POP
        | COPY
        | ROLL
        | INDEX
        | CLEAR
        | COUNT
    )
    ;

logic_operator:        op = (AND | OR);
math_operator:         op = (ADD | SUB);
flow_control_operator: op = (FLOW1 | FLOW2);

AND: 'and';
OR:  'or';
ADD: '+';
SUB: '-';

FLOW1: '>>';
FLOW2: '<<';

BIND:  'bind';
DEF:   'def';
DUP:   'dup';
EXCH:  'exch';
POP:   'pop';
COPY:  'copy';
ROLL:  'roll';
INDEX: 'index';
CLEAR: 'clear';
COUNT: 'count';
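
With the op = (...) label in place, each operator context carries the matched token in a single field, so in the C++ target the token type should be available as ctx->op->getType(). That value is what you can key a hash map on instead of writing the if/else chain. Here is a minimal sketch of the lookup, written so it compiles without the ANTLR runtime: the WaveTokens constants are stand-ins for the generated WaveParser::BIND, WaveParser::DEF, and so on (the numeric values are made up), Operator is a stand-in for the question's Value::Operator, and operatorTable and lookupOperator are hypothetical helper names. Only the value and stack operators from the question are listed.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Stand-ins for the token-type constants ANTLR generates on the parser
// class (e.g. WaveParser::BIND). The numeric values here are made up.
namespace WaveTokens {
enum : std::size_t {
    AND = 1, OR, ADD, SUB, FLOW1, FLOW2,
    BIND, DEF, DUP, EXCH, POP, COPY, ROLL, INDEX, CLEAR, COUNT
};
}

// Hypothetical stand-in for the question's Value::Operator.
enum class Operator {
    NO_OP, BIND, DEF, DUP, EXCH, POP, COPY, ROLL, INDEX, CLEAR, COUNT
};

// Built once on first use; maps a token type to the interpreter's operator.
const std::unordered_map<std::size_t, Operator>& operatorTable() {
    static const std::unordered_map<std::size_t, Operator> table = {
        {WaveTokens::BIND,  Operator::BIND},
        {WaveTokens::DEF,   Operator::DEF},
        {WaveTokens::DUP,   Operator::DUP},
        {WaveTokens::EXCH,  Operator::EXCH},
        {WaveTokens::POP,   Operator::POP},
        {WaveTokens::COPY,  Operator::COPY},
        {WaveTokens::ROLL,  Operator::ROLL},
        {WaveTokens::INDEX, Operator::INDEX},
        {WaveTokens::CLEAR, Operator::CLEAR},
        {WaveTokens::COUNT, Operator::COUNT},
    };
    return table;
}

// In a real visitor, tokenType would come from ctx->op->getType().
// Token types missing from the table fall back to NO_OP.
Operator lookupOperator(std::size_t tokenType) {
    const auto& table = operatorTable();
    const auto it = table.find(tokenType);
    return it == table.end() ? Operator::NO_OP : it->second;
}
```

In the visitor, the body of each operator method would then shrink to something like op = lookupOperator(ctx->op->getType()), and supporting a new operator means adding one row to the table rather than another branch.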