Clean way to interrupt Bison/Flex based parser

Time:09-20

tl;dr How to get a Bison/Flex parser to periodically run code that checks for an interruption request from the user?


I am looking to make a Bison/Flex based parser stop cleanly in response to interactive input. In other words, the parser should periodically check for user interruption, and if an interruption request is detected, it should exit cleanly. I know that I can stop a Bison parser using YYABORT, but I am not sure where to insert the interruption checks. Which Bison rule is run is determined by the contents of the input file. Is there a way to specify that a certain piece of code should be run periodically regardless of the contents of the file that is being parsed? Should the interruption checks be handled on the Bison side or the Flex side?

CodePudding user response:

Take a look at flex's YY_USER_ACTION. The code in that macro is run every time a rule matches, just before that rule's action, so in practice it runs once per recognized token. I'm not sure whether Bison has anything similar.
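
As a rough sketch of that idea, assuming the interruption arrives as SIGINT and that the scanner is a plain (non-reentrant) C scanner whose yylex returns int: the code below is meant for the %{ ... %} prologue of the Flex file. The names interrupt_requested, on_sigint and install_interrupt_handler are made up for illustration; only YY_USER_ACTION and the convention that yylex returns 0 at end of input come from flex itself.

```c
/* Sketch for the %{ ... %} prologue of a Flex file (names are hypothetical). */
#include <signal.h>

/* Set by the signal handler, read by the scanner.
 * volatile sig_atomic_t is the type C guarantees is safe for this. */
static volatile sig_atomic_t interrupt_requested = 0;

/* Signal handler: only records that the user asked to stop. */
void on_sigint(int signo) {
    (void)signo;
    interrupt_requested = 1;
}

/* Call once before yyparse(). Returns 0 on success, -1 on failure. */
int install_interrupt_handler(void) {
    return signal(SIGINT, on_sigint) == SIG_ERR ? -1 : 0;
}

/* Flex expands YY_USER_ACTION inside yylex(), before each matched rule's
 * action. Returning 0 here makes the parser see end of input.
 * Assumption: yylex() returns int, as in a default C scanner. */
#define YY_USER_ACTION if (interrupt_requested) return 0;
```

With this in place the check runs once per matched rule, so the scanner stops at the next token boundary after the flag is set. Whether the parser then reports a syntax error depends on where in the grammar the input was cut off.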

CodePudding user response:

In the standard parser/lexer model, the parser knows absolutely nothing about the input mechanism. It simply transforms a stream of tokens into a parse tree. "Files" and "interactive input" are not part of a parser's data model, and you'll find it much more convenient to maintain that separation.

A Bison parser can use YYABORT to clean up and terminate (by returning the error code 1). That's the same error return as is produced by a syntax error. It's important to use YYABORT in order to free used resources, particularly if the parser stack includes allocated objects. So, as you say, the question resolves to how the lexer communicates the desire to terminate.

Here, the lexer's options are limited. It can return a special-purpose token, not used in any parser rule, which will trigger a syntax error. Or it could just return 0, indicating that there is no more input, which might or might not trigger a syntax error. (Of those options, I'd go for returning 0, but there's not much difference.)

If the parser is doing anything more complicated than building up an AST -- for example, if it will actually attempt to produce some product, like executable code, then you will want to include a mechanism which suppresses further processing. That could be through a global (yuk!), or shared state communicated between the parser and the lexer using Bison's additional parameter declarations. The shared state could be as simple as a boolean flag, which might need to be checked:

  • in yyerror, in order to suppress the syntax error;
  • in any parser error action, which should YYABORT on premature end of input;
  • in the parser's final reduction action (that is, the reduction to the start symbol), which should suppress further processing and probably call YYABORT;
  • in whoever called the parser, in order to correctly interpret yyparse's error return.

So an easy solution would be to add a %param declaration in your Bison file for a bool* parameter, remembering to adjust the prototypes of yylex, yyerror, and any other functions which need the extra parameter.
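
A minimal sketch of the C side of that shared flag, assuming the grammar declares %param { bool *interrupted } and does not use %locations or a pure parser (either would change the prototypes). The enum parse_outcome and the function classify_parse are hypothetical helpers, not part of Bison; the yyparse return values (0 for success, nonzero for failure) are Bison's.

```c
/* Assumed grammar declaration (not shown here): %param { bool *interrupted } */
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical classification of how a parse ended. */
enum parse_outcome { PARSE_OK, PARSE_FAILED, PARSE_INTERRUPTED };

/* With the %param above, yyerror receives the flag before the message.
 * After an interrupt, the syntax error is an artifact, so stay silent. */
void yyerror(bool *interrupted, const char *msg) {
    if (*interrupted)
        return;
    fprintf(stderr, "parse error: %s\n", msg);
}

/* For the caller of yyparse(): the flag takes priority over the status,
 * because an interrupted parse may return either 0 or 1. */
enum parse_outcome classify_parse(int yyparse_status, const bool *interrupted) {
    if (*interrupted)
        return PARSE_INTERRUPTED;
    return yyparse_status == 0 ? PARSE_OK : PARSE_FAILED;
}
```

A caller would then write something like classify_parse(yyparse(&flag), &flag), where flag is the same bool the lexer sets when it decides to stop.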

How you actually detect the interrupt in your lexical scanner is a separate problem. Parsing a buffer's worth of input does not usually take a noticeable amount of time, so the easiest solution might be to let the interruption produce an EOF indication for the lexer, and then attempt to figure out whether the EOF was a real end of input or a user interrupt either in your <<EOF>> action or in an implementation of yywrap.
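
One way this could look, as a sketch only: a helper that a custom YY_INPUT definition might call to fill the scanner's buffer. It reports end of input as soon as an interruption has been requested and records why, so that an <<EOF>> action or yywrap can tell the two cases apart. The struct input_state and both function names are invented for this example, and how interrupt_requested gets set is left open.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical state shared between the input code and the EOF handling. */
struct input_state {
    bool interrupt_requested;  /* set by whatever detects the user's request */
    bool ended_by_interrupt;   /* set here when input is cut short */
};

/* Reads up to max bytes into buf. A return of 0 is what Flex treats as
 * end of input. On interrupt, nothing is read and the reason is recorded. */
size_t read_or_stop(struct input_state *st, FILE *in, char *buf, size_t max) {
    if (st->interrupt_requested) {
        st->ended_by_interrupt = true;
        return 0;
    }
    return fread(buf, 1, max, in);
}

/* For use in an <<EOF>> action or in yywrap(): true if the end of input
 * was produced by an interruption rather than by the real end of the file. */
bool input_was_interrupted(const struct input_state *st) {
    return st->ended_by_interrupt;
}
```

Because the check happens once per buffer refill rather than once per token, this is cheaper than checking in every rule, at the cost of finishing the current buffer before stopping.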
