I have two fairly similar patterns in Lexer.x: the first for numbers, the second for bytes. Here they are:
$digit=0-9
$byte=[a-f0-9]
$digit { \s -> TNum (readRational s) }
$digit .$digit { \s -> TNum (readRational s) }
$digit .$digit e$digit { \s -> TNum (readRational s) }
$digit e$digit { \s -> TNum (readRational s) }
$byte$byte { \s -> TByte (encodeUtf8(pack s)) }
And this is my Parser.y:
%token
cnst { TNum $$}
byte { TByte $$}
'[' { TOSB }
']' { TCSB }
%%
Expr:
'[' byte ']' {$1}
| cnst {$1}
When I run it, I get:
[ 11 ] parse error
11 ok
But when I put the byte pattern before the number patterns in the lexer:
$digit=0-9
$byte=[a-f0-9]
$byte$byte { \s -> TByte (encodeUtf8(pack s)) }
$digit { \s -> TNum (readRational s) }
$digit .$digit { \s -> TNum (readRational s) }
$digit .$digit e$digit { \s -> TNum (readRational s) }
$digit e$digit { \s -> TNum (readRational s) }
I get:
[ 11 ] ok
11 parse error
I think this happens because the lexer turns the input string into tokens first and only then hands them to the parser. So when the parser expects a byte token but receives a number token, it has no way to reinterpret that value as a different token. What should I do in this situation?
CodePudding user response:
In that case you should postpone parsing. For example, you can add a TNumByte data constructor that stores the value as a String:
data Token
= TByte ByteString
| TNum Rational
| TNumByte String
-- …
For a sequence of $digit characters, it is not yet clear whether we should interpret it as a byte or as a number, so we construct a TNumByte for it:
$digit=0-9
$byte=[a-f0-9]
$digit$digit { TNumByte }
$byte$byte { \s -> TByte (encodeUtf8(pack s)) }
$digit { \s -> TNum (readRational s) }
$digit .$digit { \s -> TNum (readRational s) }
$digit .$digit e$digit { \s -> TNum (readRational s) }
$digit e$digit { \s -> TNum (readRational s) }
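To make the deferred decision concrete, here is a minimal Haskell sketch of what "deciding later" could look like outside of Alex and Happy. The names Context, asByte, asNum and resolve are hypothetical and do not come from the code above. The sketch also uses Data.ByteString.Char8.pack in place of encodeUtf8 (pack s), which should produce the same bytes for plain ASCII digits, and it assumes the stored string contains only decimal digits.

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString.Char8 as BC

-- Token type as in the answer, with the ambiguous case kept as a raw String.
data Token
  = TByte ByteString
  | TNum Rational
  | TNumByte String
  deriving (Eq, Show)

-- Hypothetical: where the token appears in the grammar.
data Context = InBrackets | Bare

-- Hypothetical helper: read the stored digits as a byte literal.
-- For ASCII input this matches encodeUtf8 (pack s).
asByte :: String -> ByteString
asByte = BC.pack

-- Hypothetical helper: read the stored digits as a number.
-- Assumes the string holds only decimal digits.
asNum :: String -> Rational
asNum s = fromInteger (read s)

-- Resolve an ambiguous token once the surrounding context is known.
-- Tokens that are already unambiguous pass through unchanged.
resolve :: Context -> Token -> Token
resolve InBrackets (TNumByte s) = TByte (asByte s)
resolve Bare       (TNumByte s) = TNum (asNum s)
resolve _          t            = t
```

In a real grammar the same conversions would probably live in the semantic actions of the rules that accept the ambiguous token, rather than in a separate resolve pass.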
Then in the parser we can decide based on the context:
%token
cnst { TNum $$ }
byte { TByte $$ }
numbyte { TNumByte $$ } --