The C programming language documentation page "Precedence and order of evaluation" states:
The direction of evaluation does not affect the results of expressions that include more than one multiplication (*), addition (+), or binary-bitwise (&, |, or ^) operator at the same level. Order of operations is not defined by the language.
What exactly does the above mean (perhaps a code example will help)?
CodePudding user response:
That page is not particularly well-written.
Precedence determines which operators are grouped with which operands in an expression - it does not dictate the order in which subexpressions are evaluated. For example, in the expression a + b * c, the * operator has higher precedence than the + operator, so the expression is parsed as a + (b * c) - the result of a is added to the result of b * c.
However, each of the expressions a, b, and c may be evaluated in any order, even simultaneously (interleaved or in parallel). There's no requirement that b be evaluated before c or that either must be evaluated before a.
Associativity also affects grouping of operators and operands when you have multiple operators of the same precedence - the expression a + b + c is parsed as (a + b) + c because the + operator (along with the other arithmetic operators) is left-associative. The result of a + b is added to the result of c.
But like with precedence above, this does not control order of evaluation. Again, each of a, b, and c may be evaluated in any order.
The only operators which force left-to-right evaluation of their operands are the &&, ||, ?:, and comma operators.
CodePudding user response:
I assume that what the cited documentation is trying to say is that given the code
a = f1() + f2() + f3();
or
b = f1() * f2() * f3();
we do not know which of the functions f1, f2, or f3 will be called first.
However, it is guaranteed that the result of calling f1 will be added to the result of calling f2, and that this intermediate sum will then be added to the result of calling f3. Similarly for the multiplications involved in computing b. These aspects of the evaluation order are guaranteed due to the left-associativity of addition and multiplication. That is, the results (both the defined and the unspecified aspects) are the same as if the expressions had been written
a = (f1() + f2()) + f3();
and
b = (f1() * f2()) * f3();
Upon reading the cited documentation, however, I fear that I may be wrong. It's possible that the cited documentation is simply wrong, in that it seems to be suggesting that the +, *, &, |, and ^ operators are somehow an exception to the associativity rules, and that the defined left-associativity is somehow not applicable. That's nonsense, of course: left-associativity is just as real when applied to these operators as it is when applied to, say, - and /.
To explain: If we write
10 - 5 - 2
it is unquestionably equivalent to
(10 - 5) - 2
and therefore results in 3. It is not equivalent to
10 - (5 - 2)
and the result is therefore not 7. Subtraction is not commutative and not associative, so the order you do things in almost always matters.
In real mathematics, of course, addition and multiplication are fully commutative and associative, meaning that you can mix things up almost any which way and still get the same result. But what's not as well known is that computer mathematics are significantly enough different from "real" mathematics that not all of the rules (in particular, associativity) actually apply.
Consider the operation
-100000000 + 2000000000 + 200000000
If it's evaluated the way I've said it has to be, it's
(-100000000 + 2000000000) + 200000000
which is
1900000000 + 200000000
which is 2100000000, which is fine.
If someone (or some compiler) chose to evaluate it the way I've said it couldn't be evaluated, on the other hand, it might come out as
-100000000 + (2000000000 + 200000000) /* WRONG */
which is
-100000000 + 2200000000
which is... wait a minute. We're in trouble already. 2200000000 needs a full 32 bits, which means it can't be properly represented as a positive, 32-bit signed integer (whose maximum is 2147483647).
In other words, this is an example of an expression which, if you evaluate it in the wrong order, can overflow, and theoretically become undefined.
Similar things can happen with floating-point arithmetic. The expression
1.2e-50 * 3.4e300 * 5.6e20
will overflow (exceed the maximum value of a double, which is good up to about 1.8e308) if the second multiplication wrongly happens first. The expression
2.3e100 * 4.5e-200 * 6.7e-200
will underflow (to zero, falling below the smallest positive value a double can represent) if the second multiplication happens first.
The point I'm trying to make here is that computer addition and multiplication are not quite associative, meaning that a compiler should not regroup them. If a compiler does (as the cited documentation seems to, wrongly, claim is possible), you, the programmer, can see results which are significantly and wrongly different from what the C Standard said you were allowed to expect.
I hope this all makes some kind of sense, although in closing, I should perhaps suggest that it's not necessarily as unambiguous and clear-cut as I've made it sound. I believe what I've described (that is, the strict left-to-right grouping of multiplication and addition, with no license for the compiler to regroup them) is what's formally required by the current C standards, and by IEEE-754. However, I'm not sure it has been required by all versions of the C Standard, and I don't believe it was clearly guaranteed by Ritchie's original definition of C, either. It's not guaranteed by all C compilers, it's not depended upon or cared about by all C programmers, and it's not appreciated by people who write documentation like that cited in this thread.
(Also, for those really interested in fine points, regrouping integer addition as if it were associative is not quite so wrong, or at least not visibly or detectably wrong, if you know you're compiling for a processor that quietly wraps around on signed integer overflow.)