I am facing a problem with rules that need a long lookahead.
Take as an example a parser that consumes integers or fractions, where a fraction has no GAPs in the numerator (however, it may have GAPs between the numerator and the SLASH).
The regular expression [0-9_ ]*?([0-9]+\/[0-9_ ]+)|[0-9_ ]+ describes the valid inputs; you can check some examples here.
Here is one way to write it:
Value: Integer | Fraction;
Fraction: IntegerTokenStar DigitPlus GapStar SLASH IntegerToken
DigitPlus: DIGIT DigitPlus | DIGIT
GapStar: GAP GapStar | %empty
Integer: IntegerTokenPlus
IntegerToken: DIGIT | GAP
IntegerTokenStar: IntegerToken IntegerTokenStar | %empty
IntegerTokenPlus: IntegerToken IntegerTokenPlus | IntegerToken
But it fails to parse even an example like 0 0/0: IntegerTokenStar consumes as much as it can, so when the parser then tries to parse the numerator there is no digit left, and falling back to Integer is not possible because the input contains a '/'.
How can I write this in a conceptually clear way that still produces a valid parser?
Examples
A few strings and the expected (i)nteger part, (n)umerator, (d)enominator.
1_1_ 1___/1_1 -> fraction {i:"1_1_ ",n:"1___", d:"1_1"}
1_1_ 1___1_1 -> integer {i:"1_1_ 1___1_1",n:"", d:""}
1_1_1___/1_1 -> fraction {i:"",n:"1_1_1___",d:"1_1"}
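To make the expected splits above checkable, here is a minimal reference sketch in Python (not Bison) that could be used to compare against whatever the generated parser returns. The function name `split_value` is hypothetical, and the rule it applies is only my reading of the three examples: the numerator is taken to be the space-free run directly before the '/', and everything before that run is the integer part. It does not validate characters and it is not derived from frac.y.

```python
def split_value(text):
    """Split text into (kind, integer_part, numerator, denominator).

    Assumption (read off the examples, not the grammar): a space is the
    only thing that separates the integer part from the numerator.
    """
    slash = text.find("/")
    if slash == -1:
        # No SLASH at all: the whole input is the integer part.
        return ("integer", text, "", "")

    left, denominator = text[:slash], text[slash + 1:]
    # Cut just after the last space before the SLASH; rfind returns -1
    # when there is no space, which makes the integer part empty.
    cut = left.rfind(" ") + 1
    return ("fraction", left[:cut], left[cut:], denominator)


# The three examples from the question, as (kind, i, n, d).
EXPECTED = {
    "1_1_ 1___/1_1": ("fraction", "1_1_ ", "1___", "1_1"),
    "1_1_ 1___1_1": ("integer", "1_1_ 1___1_1", "", ""),
    "1_1_1___/1_1": ("fraction", "", "1_1_1___", "1_1"),
}

for text, expected in EXPECTED.items():
    assert split_value(text) == expected, (text, split_value(text))
```

Under the same assumption, the failing input 0 0/0 splits into i:"0 ", n:"0", d:"0".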
frac.y