Is it possible to modify ANTLR parser errors?

Time:08-25

I have a grammar where a language statement (outside of flow statements) consists of <expression>;. Currently, when ANTLR parses the input and finds that the ; is missing, it reports it as missing at the position of the next token, which generally means the start of the next line. This is less than ideal, since in this case it is more accurate to say it is missing at the end of the previous token rather than at the start of the next.

Is there a way to instruct ANTLR to indicate the end of the previous token as the position for the error, rather than the start of the next token? I understand why the parser sees it this way, but for a person it can be a little confusing (especially for newbie programmers who might be learning the language). I'm using this in a GUI editor project to give programmers a better tool, so I would also like to deliver a better error message (since the goal is to deliver a better development and learning experience).

I am using a custom IAntlrErrorListener<IToken> implementation to collect the errors for display to the programmer, in case that is important. I'm hoping there is an easy way to indicate this at the parser rule level, so that the parser can be instructed to report the end of the preceding token as the problem position, rather than the start of the next token.

CodePudding user response:

When you create your listener, you could give it a reference to the token stream. From the token the error is reported on, you can call getTokenIndex(), subtract 1 from the result, and pass that to the get() method of the token stream to fetch the previous token. (Those are the Java runtime names; in the C# runtime the equivalents are the TokenIndex property and the Get() method.)

Note: It probably won’t prove quite so simple, as there may be multiple intervening tokens (for example, whitespace or comments kept on a hidden channel) that you have to step over to reach the token you really want to use. So, while you can get to other tokens in the stream, it might be “fiddly” to find the previous token that you want for your error message. (This is probably part of why ANTLR gives you the reference that it does.)
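The walk-back described above could be sketched roughly as follows. This is only an illustration and not ANTLR code: Tok is a hypothetical stand-in for the fields an ANTLR token exposes (in the Java runtime, getText(), getLine(), getCharPositionInLine() and getChannel()), the list stands in for the token stream, and the names PreviousTokenLocator, previousVisibleIndex, endOf and reportPosition are made up for this example. It assumes whitespace is sent to a hidden channel, that lines are 1-based and columns 0-based as ANTLR reports them, and that the previous token does not span multiple lines.

```java
import java.util.List;

class PreviousTokenLocator {
    // Hypothetical stand-in for the parts of an ANTLR token used here.
    static final class Tok {
        final String text;
        final int line;     // 1-based line of the token's first character
        final int column;   // 0-based column of the token's first character
        final int channel;  // 0 = default channel, anything else = hidden
        Tok(String text, int line, int column, int channel) {
            this.text = text;
            this.line = line;
            this.column = column;
            this.channel = channel;
        }
    }

    static final int DEFAULT_CHANNEL = 0;

    // Index of the nearest default-channel token before `index`, or -1 if none.
    static int previousVisibleIndex(List<Tok> tokens, int index) {
        for (int i = index - 1; i >= 0; i--) {
            if (tokens.get(i).channel == DEFAULT_CHANNEL) {
                return i;
            }
        }
        return -1;
    }

    // {line, column} just past the last character of a single-line token.
    static int[] endOf(Tok t) {
        return new int[] { t.line, t.column + t.text.length() };
    }

    // Position to report for an error raised on tokens.get(offendingIndex):
    // the end of the previous visible token, falling back to the start of
    // the offending token when there is no previous visible token.
    static int[] reportPosition(List<Tok> tokens, int offendingIndex) {
        int prev = previousVisibleIndex(tokens, offendingIndex);
        if (prev < 0) {
            Tok t = tokens.get(offendingIndex);
            return new int[] { t.line, t.column };
        }
        return endOf(tokens.get(prev));
    }
}
```

In a real listener, the offending token's index and the token stream would take the place of offendingIndex and the list. You would probably want to apply this only to "missing token" style errors, since for other errors the offending token's own position may already be the right one to show.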
