JavaScript grammar notation

Time:08-29

In section 5.1.5, Grammar Notation, of the ECMAScript specification there is an example of the grammar notation for parameterized productions:

References to nonterminals on the right-hand side of a production can also be parameterized. For example:

StatementList :
    ReturnStatement
    ExpressionStatement[+In]

is equivalent to saying:

StatementList :
    ReturnStatement
    ExpressionStatement_In

Should it be

StatementList :
    ReturnStatement
    ExpressionStatement
    ExpressionStatement_In

similarly to nonterminals on the left-hand side of productions?

CodePudding user response:

No, I think this is deliberate. It adds the parameter to the production. Although ExpressionStatement_In would be shorter, it would be even more confusing to mix parameterized and non-parameterized non-terminals.

FWIW, the In parameter means (as noted in 13.10):

The [In] grammar parameter is needed to avoid confusing the in operator in a relational expression with the in operator in a for statement.

So Expression[+In] means the in operator can appear in the production without causing a syntactic ambiguity, whereas Expression[~In] means in may not appear in the expression, because it would be confused with the in of for ... in. Thus for(a in b; false; ) { } is a syntax error.
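
One way to observe this is to ask an engine to parse a few snippets without running them. This is only a minimal sketch, assuming a standard JavaScript runtime such as Node.js or a browser without a restrictive content security policy; the helper name parses is made up for illustration, and new Function merely compiles the body here, it never calls it:

```javascript
// Hypothetical helper: returns true if `source` parses as a function body.
// The function is only compiled, never invoked, so undeclared names are fine.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e; // anything else is unexpected, so let it propagate
  }
}

// `in` directly in the first clause of a three-part for: the [~In] variant applies.
console.log(parses("for (a in b; false; ) { }"));   // false
// Parentheses re-enter Expression[+In], so `in` is allowed again.
console.log(parses("for ((a in b); false; ) { }")); // true
// An ordinary for-in statement still parses.
console.log(parses("for (a in b) { }"));            // true
// Outside a for header, `in` is a plain relational operator.
console.log(parses("var x = a in b;"));             // true
```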

CodePudding user response:

In the ECMAScript formalism, bracketed parameter lists are a form of macro expansion. Like any other macro facility, they do not actually carry any intrinsic semantic information. Brackets can be eliminated with a macro preprocessor, and such tools do exist.

Leaving aside the lookahead restrictions, there are three related but very different uses of brackets in the formalism's metasyntax:

  1. On the left-hand side of a production, bracketed parameters indicate that the production is defining multiple non-terminals, whose names may (or may not) have _parameter appended. As you say, the definition

    ExpressionStatement[In] : ...
    

    would indicate that the (macro-expanded) grammar had two non-terminals, ExpressionStatement and ExpressionStatement_In. You can define a name with multiple parameters, in which case all the combinations are being defined. (And, in the current version of ECMA-262, ExpressionStatement is not actually qualified with an In parameter. Its actual definition is ExpressionStatement[Yield, Await], which is shorthand for four different non-terminals.)

  2. That might lead you to believe that it would be OK to use ExpressionStatement or ExpressionStatement_In on the right-hand side of a production. And you could, technically, do that, since the notation is just a macro notation. But it would be quite confusing and the grammar does not ever do that. (There were earlier grammar drafts which did though, and some of the automated tools permit it. But it's better to pretend that once a non-terminal is defined with bracketed parameters, it must always be used with bracketed parameters, the same ones in the same order.)

    Fortunately, the real grammar uses explicit right-hand side parameters, which have three forms (remembering that there can be multiple parameters, so what's inside the brackets is actually a list). The three possibilities are:

    • [+parameter], which is the same as appending _parameter to the non-terminal's name; i.e. an indication that the version of the non-terminal indicated is the one with the feature.

    • [~parameter], which is technically a no-op, the same as nothing being appended to the name, indicating that the version indicated is the one without the feature.

    • [?parameter] is the same as either [+parameter] or [~parameter], depending on whether the particular production being expanded was invoked with or without the parameter. This syntax can only be used if the non-terminal being defined and the non-terminal being used both have a parameter with the same name. But it's not required that they be the same non-terminal.

  3. Finally, you can make particular alternatives conditional on a parameter (or a list of parameters) by putting a bracketed list at the beginning of the alternative. (In this position, it doesn't immediately follow the name of a non-terminal, so there is no ambiguity.) There are two possible conditions:

    • [+parameter]: the alternative is only included for the non-terminals (of the set being defined) which have that parameter.

    • [~parameter]: the alternative is only included for the non-terminals (of the set being defined) which do not have that parameter.
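
The three uses above can be sketched as a tiny macro expander for a single production. This is an illustration under stated assumptions, not the specification's own algorithm or the interface of any real tool: the production Stmt, the non-terminal Expr, and the helpers ref, allSettings, resolve, keep and expand are all invented here, and right-hand-side arguments are assumed to be listed in the same order as the parameters were declared.

```javascript
// Toy model of the bracket notation; every name here is hypothetical.
// Right-hand-side symbols are terminal strings or references built by ref().
const ref = (name, ...args) => ({ name, args }); // args look like "+In", "~In", "?Yield"

// A made-up production, written in the spec's notation as:
//   Stmt[Yield, Await] :
//       Expr[+In, ?Yield, ?Await] ;
//       [+Await] await Expr[~In, ?Yield, ?Await] ;
const stmt = {
  name: "Stmt",
  params: ["Yield", "Await"],
  alts: [
    { guard: null,     symbols: [ref("Expr", "+In", "?Yield", "?Await"), ";"] },
    { guard: "+Await", symbols: ["await", ref("Expr", "~In", "?Yield", "?Await"), ";"] },
  ],
};

// Use 1: a left-hand side with n parameters defines 2^n names.
function allSettings(params) {
  return params.reduce(
    (sets, p) => sets.flatMap(s => [s, new Set([...s, p])]),
    [new Set()]
  );
}

// Use 2: [+p] appends _p, [~p] appends nothing, and [?p] copies the
// setting of p from the non-terminal currently being defined.
function resolve(sym, on) {
  if (typeof sym === "string") return sym; // terminal, unchanged
  return sym.name + sym.args
    .filter(a => a[0] === "+" || (a[0] === "?" && on.has(a.slice(1))))
    .map(a => "_" + a.slice(1))
    .join("");
}

// Use 3: a guarded alternative is kept only when its guard matches.
function keep(alt, on) {
  if (alt.guard === null) return true;
  return (alt.guard[0] === "+") === on.has(alt.guard.slice(1));
}

function expand(prod) {
  const out = {};
  for (const on of allSettings(prod.params)) {
    const lhs = prod.name +
      prod.params.filter(p => on.has(p)).map(p => "_" + p).join("");
    out[lhs] = prod.alts
      .filter(alt => keep(alt, on))
      .map(alt => alt.symbols.map(s => resolve(s, on)).join(" "));
  }
  return out;
}

const expanded = expand(stmt);
console.log(expanded);
// Stmt             -> [ "Expr_In ;" ]
// Stmt_Await       -> [ "Expr_In_Await ;", "await Expr_Await ;" ]
// Stmt_Yield       -> [ "Expr_In_Yield ;" ]
// Stmt_Yield_Await -> [ "Expr_In_Yield_Await ;", "await Expr_Yield_Await ;" ]
```

In this toy output the guarded alternative only appears in the two expansions that carry _Await, and each [?parameter] argument has turned into either a suffix or nothing.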

Note that, as an exception to the usual rule for grammars, which is that all non-terminals actually be used somewhere, the ECMAScript formalism does not require that every possible expansion of a parameterised non-terminal actually be reachable. So the correct way to macro-expand bracket notation is to start with the root non-terminals and traverse the tree of productions, creating individual instances of a bracketed non-terminal when its first use is encountered.
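
The on-demand traversal described above might look like the following worklist sketch. Everything in it is an assumption made for illustration: the three-production toy grammar (Program, Stmt, Expr), the helpers ref, mangle and instantiate, and the string encoding of arguments are invented, and alternative guards and lookahead restrictions are left out to keep it short.

```javascript
// Toy grammar; all names are hypothetical. Terminals are plain strings.
const ref = (name, ...args) => ({ name, args }); // args look like "+In", "~Await", "?In"

const grammar = {
  Program: { params: [],              alts: [[ref("Stmt", "~Await")]] },
  Stmt:    { params: ["Await"],       alts: [[ref("Expr", "+In", "?Await"), ";"]] },
  Expr:    { params: ["In", "Await"], alts: [["id"], [ref("Expr", "?In", "?Await"), "+", "id"]] },
};

// Expanded name: base name plus _p for each declared parameter that is set.
function mangle(grammar, name, on) {
  return name +
    grammar[name].params.filter(p => on.has(p)).map(p => "_" + p).join("");
}

// Start at the root and create an instance of a parameterised non-terminal
// only when a reference to that particular instance is first encountered.
function instantiate(grammar, root) {
  const out = new Map();
  const pending = [{ name: root, on: new Set() }];
  while (pending.length > 0) {
    const { name, on } = pending.pop();
    const lhs = mangle(grammar, name, on);
    if (out.has(lhs)) continue; // this instance already exists
    out.set(lhs, grammar[name].alts.map(alt => alt.map(sym => {
      if (typeof sym === "string") return sym; // terminal, unchanged
      const target = new Set();
      for (const arg of sym.args) {
        const mode = arg[0];
        const param = arg.slice(1);
        if (mode === "+" || (mode === "?" && on.has(param))) target.add(param);
      }
      pending.push({ name: sym.name, on: target });
      return mangle(grammar, sym.name, target);
    }).join(" ")));
  }
  return out;
}

const instances = instantiate(grammar, "Program");
console.log([...instances.keys()]); // [ "Program", "Stmt", "Expr_In" ]
console.log(instances.get("Expr_In")); // [ "id", "Expr_In + id" ]
```

Expanding every combination of this toy grammar up front would define seven non-terminals (one for Program, two for Stmt, four for Expr); starting from the root creates only the three that are actually reachable.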
