Consider whether x in the declaration int x; is an expression.
I used to think that it's certainly not, but the grammar calls the variable name an id-expression here.
One could then argue that only expression is an expression, not ??-expression. But then in 1 + 2, neither 1 nor 2 matches, because those are an additive-expression and a multiplicative-expression respectively, not expressions. But common sense says those should be called expressions too.
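To make the two situations concrete, here is a minimal sketch; the function name demo is only an illustration and does not come from the question. The comments mark which occurrences of x are under discussion.

```cpp
#include <cassert>

// Hypothetical helper, used only to put both occurrences of x side by side.
int demo() {
    int x;       // declaration: the grammar reaches this x through an id-expression,
                 // but it is the name being declared, not something being evaluated
    x = 1 + 2;   // expression statement: x, 1, 2, 1 + 2 and x = 1 + 2 are all
                 // things one would normally call expressions
    return x;    // x used as an expression again
}
```

Calling demo() returns the value computed by the expression statement; the declaration on its own computes nothing.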
We could decide that any ??-expression (including expression) is an expression, but then the variable name in a declaration matches as well.
We could define an expression to be any ??-expression except id-expression, but this feels rather arbitrary.
What's the right grammatical definition of an expression, and is the variable name in its declaration an expression or not?
CodePudding user response:
After looking at the links provided by @LanguageLawyer (1, 2), I'm convinced the consensus is that id-expression is a misnomer, and not always an expression (e.g. it's not an expression in a declaration).
Then, a source substring is an expression if at least one of its parents in the parse tree is called either expression, or *-expression [1] but not id-expression, and that parent expands exactly to this substring and nothing more.
This is the same definition @n.m. proposed, except I allow "*-expression and not id-expression" nodes as well.
[1] * is a wildcard for any string.
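The naming rule above can be sketched as a predicate on grammar symbol names. This is only an illustration: the function name names_expression_node is made up here, and the sketch assumes the wildcard stands for a non-empty prefix.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: decides from the grammar symbol's name alone whether
// a parse-tree node counts as an "expression" node under the rule above.
bool names_expression_node(const std::string& symbol) {
    const std::string suffix = "-expression";
    if (symbol == "expression") return true;      // the plain grammar symbol
    if (symbol == "id-expression") return false;  // the explicit exception
    // Any other name that ends in "-expression" (assumed non-empty prefix).
    return symbol.size() > suffix.size() &&
           symbol.compare(symbol.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Small usage check with symbol names taken from the C++ grammar.
void check_examples() {
    assert(names_expression_node("additive-expression"));
    assert(names_expression_node("multiplicative-expression"));
    assert(!names_expression_node("id-expression"));
    assert(!names_expression_node("declarator-id"));
}
```

The predicate covers only the naming half of the definition; the requirement that the node expand to exactly the substring in question still has to be checked against the parse tree.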
CodePudding user response:
What does the question "is x an expression" mean?
When we talk about specific occurrences of expressions and identifiers in a specific program, we must consider its parse tree. A substring of a program is an expression if some occurrence of an expression node in its parse tree expands to that substring.
Thus, x in the declaration int x; is not an expression, because there is no expression node in the parse tree (of any valid program that contains int x; as a declaration) that expands to this occurrence of x. There is an id-expression node, but that particular id-expression is not an expansion of an expression node; it is part of an expansion of a declaration node.
When talking about expressions and identifiers in isolation, "is a" means "expands to/contracts to, according to some rule in the grammar". Thus, taken in isolation, x is an expression. This means we can construct a program where x is an expression according to the definition above.
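The parse-tree definition can be sketched in a few lines. Everything here is an assumption made for illustration: the Node type, the function names, and the two hand-built trees, which are heavily abbreviated (a real parse has many intermediate nodes between the ones shown).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// A parse-tree node: a grammar symbol plus the half-open token range
// [begin, end) that the node expands to.
struct Node {
    std::string symbol;
    std::size_t begin;
    std::size_t end;
    std::vector<Node> children;
};

// True if some node labelled "expression" expands to exactly [begin, end).
bool is_expression(const Node& node, std::size_t begin, std::size_t end) {
    if (node.symbol == "expression" && node.begin == begin && node.end == end)
        return true;
    for (const Node& child : node.children)
        if (is_expression(child, begin, end)) return true;
    return false;
}

// Abbreviated, hand-built tree for the tokens: int x ;
Node declaration_tree() {
    return Node{"simple-declaration", 0, 3, {
        Node{"decl-specifier-seq", 0, 1, {}},
        Node{"init-declarator-list", 1, 2, {
            Node{"declarator-id", 1, 2, {
                Node{"id-expression", 1, 2, {}}}}}}}};
}

// Abbreviated, hand-built tree for the tokens: x ;
Node statement_tree() {
    return Node{"expression-statement", 0, 2, {
        Node{"expression", 0, 1, {
            Node{"id-expression", 0, 1, {}}}}}};
}
```

With these trees, is_expression(declaration_tree(), 1, 2) is false, because the id-expression for x hangs under a declarator, while is_expression(statement_tree(), 0, 1) is true, because an expression node expands to exactly that x.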
These definitions are purely syntactic and as such are valid for any language and grammar production. However, the C++ standard states informally that
An expression is a sequence of operators and operands that specifies a computation
and in several places talks about subexpressions as expressions in their own right. Thus the term "expression" in the standard does not coincide with the grammar element expression.
This is not an insurmountable problem however. The grammar is only a tool. The standard could have defined the grammar differently:
expression:
    expression = expression
    expression + expression
    expression * expression
    ...
    ( expression )
    id-expression
and resolved the ambiguities in English text. If we want the notion of expression to correspond closely to the grammar element, we should probably think of the grammar as if it were presented this way.
Alternatively, as you suggest, we can consider an expansion of any ??-expression an "expression", but replace certain occurrences of id-expression with a new symbol. The two approaches seem equivalent.
Note: this version of the answer is a complete rewrite. The previous version was a result of a misunderstanding.