Nearley parser - how to return indefinite string of matches without ambiguity? (Only four lines)

Time:12-09

I am writing software meant to make it easy to publish a choose-your-own-adventure story. I want to move my parser from JavaScript I wrote myself to Nearley.

I have a four-line Nearley grammar:

main->(excludebrackets link:+ excludebrackets):+
link->"[LINK:"i excludebrackets "|" excludebrackets "]"
{% (d) => {return '<a href ="func__' + d[3][0].join("") + '()">' + d[1][0].join("") + "</a>"}%}
excludebrackets->[^\\[\]]:+ | null

The only problem is the very top line. The "link" nonterminal does an excellent job of turning things like:

[LINK: shoot | shoot_dragon] into <a href ="func__ shoot_dragon()"> shoot </a>. But if I try to use more complex code:

You could [LINK: shoot | shoot_dragon] the dragon with your arrows or [LINK: draw | stab_dragon] your sword, but you'd have to let it get close.

my grammar is ambiguous and thus returns many results. (The results seem easy to work with because of the way JavaScript handles nulls, but this is at best slower than it needs to be.)

The more general question is: how can I return an indefinite series of two matches, without ambiguity?

(As a bonus, can anyone explain what :*, :+, and :? mean exactly? I don't get the question mark.)

CodePudding user response:

If by "how can I return an indefinite series of two matches, without ambiguity?" you mean "what's the unambiguous equivalent of a* a*?", the answer is that a* a* is necessarily ambiguous, so the only solution is to remove one of the two repetitions; a single a* matches exactly the same input.
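To make the ambiguity concrete, here is a small JavaScript sketch that lists every way a run of plain text can be divided between two adjacent repetitions. The helper name splitRun is made up for illustration and is not part of Nearley. In the question's grammar, the trailing excludebrackets of one iteration sits right next to the leading excludebrackets of the next, so each split listed here corresponds to a separate parse.

```javascript
// splitRun is a hypothetical helper, not a Nearley API.
// It lists every way `run` can be shared between the two
// repetitions in a pattern shaped like `a* a*`.
function splitRun(run) {
  const result = [];
  for (let i = 0; i <= run.length; i++) {
    // The first repetition takes the first i characters,
    // the second repetition takes whatever is left.
    result.push([run.slice(0, i), run.slice(i)]);
  }
  return result;
}

// A run of n characters has n + 1 possible splits.
console.log(splitRun("abc").length);
```

With a single repetition there is only one way to consume the run, which is why dropping one of the two removes the ambiguity.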

Since excludebrackets is already an arbitrary repetition, you can use the much simpler

main->(excludebrackets link):* excludebrackets
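The shape of that rule can also be mirrored by hand, which shows why it yields exactly one result. The following is only a sketch in plain JavaScript, not Nearley output: renderLinks is a hypothetical name, and as an extra assumption the link label is not allowed to contain "|" so that the split between label and target is fixed.

```javascript
// Hand-written mirror of: main -> (excludebrackets link):* excludebrackets
// renderLinks is a hypothetical helper, not part of Nearley.
function renderLinks(input) {
  // excludebrackets: zero or more characters other than \ [ ]
  const text = /[^\\\[\]]*/y;
  // link: "[LINK:" label "|" target "]", case-insensitive like "[LINK:"i.
  // Assumption: the label may not contain "|".
  const link = /\[LINK:([^\\\[\]|]*)\|([^\\\[\]]*)\]/iy;
  let pos = 0;
  let out = "";
  for (;;) {
    // Plain text; this always matches, possibly with an empty string.
    text.lastIndex = pos;
    out += text.exec(input)[0];
    pos = text.lastIndex;
    // Try one link; if there is none, the text just read was the trailing run.
    link.lastIndex = pos;
    const m = link.exec(input);
    if (m === null) break;
    out += '<a href ="func__' + m[2] + '()">' + m[1] + "</a>";
    pos = link.lastIndex;
  }
  if (pos !== input.length) {
    throw new Error("unexpected character at position " + pos);
  }
  return out;
}

console.log(renderLinks("You could [LINK: shoot | shoot_dragon] the dragon."));
```

Because plain text and "[" never overlap, each position in the input offers only one choice, so there is a single way through the loop.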

You could have written excludebrackets without the explicit | null by using :* instead of :+. For any X, X+ | null is the same as X*, by definition.
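As for the bonus question: Nearley's :*, :+ and :? mean "zero or more", "one or more" and "zero or one" (that is, optional), the same as *, + and ? in regular expressions. The sketch below uses JavaScript regular expressions as a stand-in for the grammar operators; the variable names are chosen only for illustration.

```javascript
// Regular-expression analogues of Nearley's :*, :+ and :? operators.
const zeroOrMore = /^a*$/;        // like a:*
const oneOrMore = /^a+$/;         // like a:+
const zeroOrOne = /^a?$/;         // like a:?  (the item is optional)
const plusOrNull = /^(?:a+|)$/;   // like a:+ | null

// For each input, record which patterns accept it.
const table = ["", "a", "aa"].map((s) => ({
  input: s,
  star: zeroOrMore.test(s),
  plus: oneOrMore.test(s),
  optional: zeroOrOne.test(s),
  plusOrNull: plusOrNull.test(s),
}));

console.log(table);
```

In the table, the star and plusOrNull columns agree on every input, which is the X+ | null versus X* equivalence in miniature.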
