I am trying to get an example from the Flex manual [1] working. The example shows Flex rules for a quoted string that may contain octal escape codes.
The manual is a bit incomplete in its description of the action for the closing quote. It simply has this comment:
/* return string constant token type and
* value to parser
*/
So I created code that I thought would work, but apparently my code is incorrect.
Below is the lexer followed by the parser. When I execute the generated parser, I get this output:
The string is: ''
What I expect, and want, is this output:
The string is: 'John Doe'
My input is this: "John Doe"
What am I doing wrong, please?
Here is the lexer:
%option noyywrap
%x STR
%{
#include "parse.tab.h"
#define MAX_STR_CONST 100
%}
%%
    char string_buf[MAX_STR_CONST];
    char *string_buf_ptr;
\" { string_buf_ptr = string_buf; BEGIN(STR); }
<STR>{
\" { /* closing quote - all done */
BEGIN(INITIAL);
*string_buf_ptr = '\0';
yylval.strval = strdup(string_buf_ptr);
return(STRING);
}
\n { /* error - unterminated string constant */
perror("Error - unterminated string");
yyterminate();
}
\\[0-7]{1,3} { /* octal escape sequence */
int result;
(void) sscanf(yytext + 1, "%o", &result);
if (result > 0xff) {
perror("Error - octal escape is out-of-bounds");
yyterminate();
}
*string_buf_ptr++ = result;
}
\\[0-9]+ { /* bad escape sequence */
perror("Error - bad escape sequence");
yyterminate();
}
\\n *string_buf_ptr++ = '\n';
\\t *string_buf_ptr++ = '\t';
\\r *string_buf_ptr++ = '\r';
\\b *string_buf_ptr++ = '\b';
\\f *string_buf_ptr++ = '\f';
\\(.|\n) *string_buf_ptr++ = yytext[1];
[^\\\n\"]+ {
char *yptr = yytext;
while (*yptr)
*string_buf_ptr++ = *yptr++;
}
}
%%
Here is the parser:
%{
#include <stdio.h>
#include <stdlib.h>
/* interface to the lexer */
extern int yylineno; /* from lexer */
int yylex(void);
void yyerror(const char *s, ...);
extern FILE *yyin;
int yyparse (void);
%}
%union {
char *strval;
}
%token <strval> STRING
%%
start
: STRING { printf("The string is: '%s'", $1);}
;
%%
int main(int argc, char *argv[])
{
yyin = fopen(argv[1], "r");
yyparse();
fclose(yyin);
return 0;
}
void yyerror(const char *s, ...)
{
fprintf(stderr, "%d: %s\n", yylineno, s);
}
[1] See pages 24-25 of the Flex manual: https://epaperpress.com/lexandyacc/download/flex.pdf
Answer:
Your action is:
*string_buf_ptr = '\0';
yylval.strval = strdup(string_buf_ptr);
return STRING;
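It seems pretty clear that strdup of string_buf_ptr will return a newly-allocated copy of an empty string, since you just set the character pointed to by string_buf_ptr to 0. The copy needs to start at string_buf, the beginning of the buffer, not at string_buf_ptr, which points at its end.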
Two comments:
- This bug has essentially nothing to do with Flex (or Bison). I know that it is always tempting to assume that the most unfamiliar technology you are using is the source of errors, but making assumptions like that is not a very effective debugging technique.
- A debugger is often a faster way of finding bugs than Stack Overflow. There's a bit of a learning curve to using gdb, but it will definitely pay off in the end (perhaps even soon).
Also, perror is intended to present the user with an error message based on the value of errno. That's not very useful in this context; you probably want to call yyerror. (However, you'll need to declare it in the lexer, unless you arrange for its prototype to be inserted in parse.tab.h. See %code requires / %code provides in the Bison manual for how to do that.)
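One way to get the prototype into parse.tab.h is a %code provides block in the parser file. This is a sketch that assumes the header is generated by Bison (for example with bison -d, as the parse.tab.h name suggests):

```yacc
%code provides {
  /* Bison copies this block into the generated header (and the parser
     source) after the token and YYSTYPE definitions, so the lexer picks
     up the prototype through its existing #include "parse.tab.h". */
  void yyerror(const char *s, ...);
}
```

With that in place, a lexer action could call, for example, yyerror("unterminated string constant") where it currently calls perror, and the separate yyerror declaration in the parser's %{ ... %} prologue becomes redundant.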