Home > Software design >  Basic parsing structure for nested structures?
Basic parsing structure for nested structures?

Time:10-29

I've come across the following bnf quite frequently for parsing a basic arithmetic expression:

S :== EXPRESSION
EXPRESSION :== TERM | TERM { [ ,-] TERM] }
TERM :== FACTOR | FACTOR { [*,/] FACTOR] }
FACTOR :== number | '(' EXPRESSION ')'

-- or --

expression :    term | term   term | term − term
term :      factor | factor * factor | factor / factor
factor :    number | ( expression ) |   factor | − factor

This would parse something like 2 3-4*(1 2). However, what is required to parse something like 1 1 1 1, as the above factor cannot also refer to the expression production itself?

CodePudding user response:

Your first version is correct, but not BNF. It uses "extended" syntax, which includes the repetition operator { ... }, which means "zero or more instances of the pattern inside the braces". So TERM { [ ,-] TERM] } means TERM or TERM TERM or TERM - TERM TERM or TERM TERM TERM - TERM, etc.

Your second version is not correct (and does not correspond to the first version). You don't provide any citation for it, so I can't tell if you copied it incorrectly or your source was simply wrong. Here's a correct version:

expression: term | expression   term | expression − term
term:       factor | term * factor | term / factor
factor:     number | ( expression ) |   factor | − factor

I hope it becomes clear how that analyses 1 1 1 1 into a left-associated sequence of additions.

  • Related