I have the following PEG grammar defined:
Program = _{ SOI ~ Expr ~ EOF }
Expr = { UnaryExpr | BinaryExpr }
Term = _{Int | "(" ~ Expr ~ ")" }
UnaryExpr = { Operator ~ Term }
BinaryExpr = { Term ~ (Operator ~ Term)* }
Operator = { " " | "-" | "*" | "^" }
Int = @{ Operator? ~ ASCII_DIGIT }
WHITESPACE = _{ " " | "\t" }
EOF = _{ EOI | ";" }
And the following expressions are all parsed correctly:
1 2
1 - 2
1 -2
1 - -2
1
1
-1
But any expression that begins with a negative number errors
-1 2
errors with
--> 1:4
|
1 | -1 2
| ^---
|
= expected EOI
What I expect (what I would like) is for -1 2
to be treated the same as 1 -2
, that is a Binary expression that is made up of two Unary Expressions.
I have toyed around with a lot of variations with no success. And, I'm open to using an entirely different paradigm if I need to, but I'd really like to keep the UnaryExpression idea since I've already built my parser around it.
I'm new to PEG, so I'd appreciate any help.
For what its worth, I'm using Rust v1.59 and https://pest.rs/ to both parse and test my expressions.
CodePudding user response:
You have a small error in the Expr
logic. The first part before the |
takes precedence if both match.
And -1
is a valid UnaryExpr
so the program as a whole is expected to match SOI ~ UnaryExpr ~ EOF
in this case. But there is additional data ( 2
) which leads to this error.
If you reverse the possibilities of Expr
so that Expr = { BinaryExpr | UnaryExpr }
the example works. The reason for that is that first BinaryExpr
will be checked and only if that fails UnaryExpr
.