Understanding ABNF syntax "0<pchar>"

RFC 3986 defines the rule:

path-empty = 0<pchar>

For simplicity, let's assume pchar is defined:

pchar = "a" / "b" / "c"

What does path-empty match and how is it matched?

I've read the Wikipedia page on ABNF. My guess is that it matches the empty string (regex ^(?![\s\S])). If that is the case, why reference pchar at all? Is there not a simpler way to match the empty string in ABNF without referencing another rule?

How could this be translated to ANTLR4?

CodePudding user response:

Yes, you are correct. path-empty derives the empty string.

In ABNF, the right-hand side of a rule must contain at least one element, that is, something other than spaces, newlines, and comments. See RFC 5234, page 10. Within that constraint, there are several ways to define the empty string. path-empty = 0<pchar> is one of them: the leading 0 is a repeat count, so it means "exactly zero of <pchar>". But path-empty = "" and path-empty = 0pchar would also work. ABNF does not say whether one form is preferred over the others.
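
To make the "exactly zero of" reading concrete, here is a minimal Python sketch of ABNF repetition. It assumes the simplified pchar from the question ("a", "b" or "c"), not the real pchar from RFC 3986. The helper names match_repetition and is_path_empty are made up for illustration and do not come from any ABNF library.

```python
import re

# Simplified pchar from the question: "a" / "b" / "c".
# This is an assumption for the example, not RFC 3986's real pchar.
# ABNF quoted literals are case-insensitive, hence re.IGNORECASE.
PCHAR = re.compile(r"[abc]", re.IGNORECASE)


def match_repetition(text: str, minimum: int, maximum: int) -> bool:
    """Hypothetical helper: True if text consists of between minimum and
    maximum pchar characters, like the ABNF form <min>*<max>pchar."""
    if not minimum <= len(text) <= maximum:
        return False
    return all(PCHAR.fullmatch(ch) for ch in text)


def is_path_empty(text: str) -> bool:
    # In ABNF, "0pchar" is shorthand for "0*0pchar": exactly zero repetitions.
    return match_repetition(text, 0, 0)


print(is_path_empty(""))   # True
print(is_path_empty("a"))  # False
```

With a repeat count of zero, the element being repeated never gets a chance to match anything, which is why it makes no difference to the result whether the rule says 0<pchar>, 0pchar, or "".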

Note that RFC 3986 uses a prose-val, i.e., <pchar>, instead of the rulename pchar (or "", 0pchar, or simply a prose-val such as <empty string>). It's unclear why, but the effect is the same: path-empty derives the empty string. Still, <pchar> is not the same as pchar. A prose-val is a "last resort" way to add an informal description to a rule.

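In ANTLR4, the rule would just be path_empty : ;. Note that ANTLR's naming convention draws a strict boundary between lexer and parser: lexer rule names start with an uppercase letter and parser rule names with a lowercase letter. Hyphens are not allowed in ANTLR rule names either, hence path_empty rather than path-empty. ABNF does not have this lexer/parser distinction. In fact, this grammar could be converted to a single ANTLR lexer grammar, which is a good exercise in understanding the power of ANTLR lexers.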