I wrote a pretty simple left-recursive grammar in yacc, based on data types for a simple C-like language. The generated yacc parser reduces before the left recursion suffix, and I have no clue why.
Here is the source code:
%%
start: type {
printf("Reduced to start.\n");
};
type: pointer {
printf("Reduced pointer to type.\n");
}
| char {
printf("Reduced char to type.\n");
};
char: KW_CHAR {
printf("Reduced to char.\n");
};
pointer: type ASTERISK {
printf("Reduced to pointer.\n");
};
%%
Given the input char *
(KW_CHAR ASTERISK):
Reduced to char.
Reduced char to type.
syntax error
CodePudding user response:
You don't seem to be asking about the parse error you're receiving (which, as noted in the comments, is likely an issue with the token types being returned by yylex()
), but rather why the parser reductions are performed in the order shown by your trace. That's the question I will try to address here, even though it's quite possible that it's an XY problem, because understanding reduction order is important.
In this case, the reduction order is pretty simple. If you have a production:
pointer: type ASTERISK
then type
must be reduced before ASTERISK
is shifted. In an LR parser, reductions must be done when the right-hand side to be reduced ends exactly at the last consumed input token. (The lookahead token has been identified by the lexical scanner but not yet consumed by the parser. So it's not part of the reduction and can be used also to identify other reductions ending at the same point.
I hope it's more evident why with the production
type: char
char
needs to be reduced before type
. Until char
is reduced, it's not available to be used in the reduction of type
.
So the reduction order shown by your parser (first char
, then type
, and only after the *
is shifted, pointer
) is precisely what would be expected.