Why and how is the given code printing 235 in C?
#include <stdio.h>
int main()
{
    int i;
    char ch[] = { 'z', 'o', 'h', 'o' };
    char *ptr, *str1;
    ptr = ch;
    str1 = ch;
    i = (*ptr-- + ++*str1) - 10;
    printf("%d", i);
    return 0;
}
CodePudding user response:
In i = (*ptr-- + ++*str1) - 10;, *ptr-- is parsed as *(ptr--) because postfix -- binds more tightly than unary *.
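The grouping can be checked with a small sketch that keeps every pointer inside the array. The function names below are hypothetical, and the pointer deliberately starts at the last element so that the decrement is well defined, unlike in the question's code.

```c
/* Hypothetical helpers, not from the question: each returns 1 when the
   expression behaved as the grouping rules predict. */

/* *p-- groups as *(p--): read the element, then move the pointer back. */
int postfix_decrement_moves_pointer(void)
{
    char ch[] = { 'z', 'o', 'h', 'o' };
    char *p = &ch[3];        /* last element, so p-- stays inside ch */
    char value = *p--;       /* reads ch[3]; afterwards p points to ch[2] */
    return value == 'o' && p == &ch[2] && ch[3] == 'o';
}

/* (*p)-- decrements the pointed-to char and leaves the pointer alone. */
int parenthesized_decrement_changes_char(void)
{
    char ch[] = { 'z', 'o', 'h', 'o' };
    char *p = ch;
    char value = (*p)--;     /* yields the old 'z'; ch[0] becomes 'z' - 1 */
    return value == 'z' && ch[0] == 'z' - 1 && p == ch;
}
```

Both functions return 1, which shows that the parentheses are needed to decrement the character rather than the pointer.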
Since the prior ptr = ch; set ptr to point to the first element of ch, ptr-- attempts to decrement ptr to point before the start of ch. The behavior of this is not defined by the C standard.
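For contrast, a pointer may legally point one past the last element of an array, which gives a common way to walk an array backwards without ever forming a pointer before its start. A minimal sketch follows; sum_backwards is a hypothetical helper, not part of the question's code.

```c
#include <stddef.h>

/* Hypothetical helper: adds up count chars starting at first, visiting
   them from last to first. */
int sum_backwards(const char *first, size_t count)
{
    /* One past the last element: valid to form, not to dereference. */
    const char *p = first + count;
    int sum = 0;

    while (p != first) {
        sum += *--p;   /* step back first, then read; p never goes below first */
    }
    return sum;
}
```

Because the decrement happens only while p is still above first, the loop stops with p equal to first and never computes first - 1.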
Since ptr-- normally evaluates to the value of ptr before the decrement, *ptr-- might produce the value that ptr points to, as if it were *ptr instead of *ptr--. Thus, it would produce the value of 'z'. However, due to the incorrect pointer decrement, this is not guaranteed by the C standard. Once there is behavior not defined by the standard, none of the other behavior on the same code path is defined.
If we remove the errant -- and use (*ptr + ++*str1) - 10, then, by itself, *ptr produces the value for 'z', which is 122 when the ASCII character set is used. str1 also points to the first element of ch, so ++*str1 increments that element to change it from 122 to 123.
At this point, there is again behavior not defined by the C standard. *ptr uses ch[0], and ++*str1 modifies it, and there is no sequencing between the two. By the rule in C 2018 6.5, paragraph 2, the behavior of the program is not defined by the C standard.
If we suppose sequencing is somehow added, so that the increment comes later than *ptr, then ++*str1 produces the incremented value, 123. Then + adds 122 and 123 to produce 245, and subtracting 10 produces 235.
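The supposed sequencing can be written out explicitly by splitting the expression into separate statements. This is a sketch of one well-defined variant, not what the original program is guaranteed to do; sequenced_sum and the temporaries left and right are names introduced here, and the out-of-bounds ptr-- is left out.

```c
/* Hypothetical well-defined variant of the question's expression. */
int sequenced_sum(void)
{
    char ch[] = { 'z', 'o', 'h', 'o' };
    char *ptr = ch;
    char *str1 = ch;

    int left = *ptr;        /* read ch[0] first: the value of 'z' */
    int right = ++*str1;    /* then increment ch[0] and take the new value */

    return (left + right) - 10;
}
```

On an implementation that uses ASCII, printf("%d", sequenced_sum()); prints 235, because each statement ends with a sequence point and the read is therefore ordered before the increment.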
CodePudding user response:
In the expression *ptr--, postfix -- has higher precedence than unary *, so the expression is parsed as *(ptr--); you are dereferencing the result of ptr--, which yields 'z' (ASCII 122). As a side effect, ptr is decremented and now points outside of the array bounds.
In the expression ++*str, unary ++ and unary * have the same precedence, so the expression is parsed as ++(*str), which is the same as 'z' + 1, or ASCII 123. 122 + 123 - 10 gives you 235. As a side effect, ch[0] now contains ASCII 123, or '{'.
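To see that ++*str changes the character and not the pointer, it can be compared with *++str, which does the opposite. This is a sketch with hypothetical function names; both variants have defined behavior because each touches ch[0] only once per expression.

```c
/* Hypothetical helpers: each returns 1 when the expression behaved as
   the parse described above predicts. */

/* ++*str groups as ++(*str): the pointed-to char is incremented. */
int prefix_increment_changes_char(void)
{
    char ch[] = { 'z', 'o', 'h', 'o' };
    char *str = ch;
    char value = ++*str;     /* ch[0] becomes 'z' + 1; str is unchanged */
    return value == 'z' + 1 && ch[0] == 'z' + 1 && str == ch;
}

/* *++str groups as *(++str): the pointer advances, then is dereferenced. */
int prefix_increment_moves_pointer(void)
{
    char ch[] = { 'z', 'o', 'h', 'o' };
    char *str = ch;
    char value = *++str;     /* str now points to ch[1], which holds 'o' */
    return value == 'o' && str == &ch[1] && ch[0] == 'z';
}
```

Both functions return 1; the operators are applied right to left, so whichever one sits next to the operand is applied first.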
This whole mess is equivalent to writing
tmp1 = ptr;
tmp2 = *str;
i = *tmp1 + (tmp2 + 1) - 10;
ptr = ptr - 1;
*str = *str + 1;
except that the updates to i, ptr, and *str can happen in any order, even simultaneously.
CodePudding user response:
To evaluate (*ptr-- + ++*str1), the compiler applies the rules of the C grammar: postfix operators bind more tightly than prefix operators. Postfix operators are applied from left to right, then prefix operators are applied to the result from right to left.
*ptr-- hence applies * to the result of ptr--, which is the original value of ptr. The side effect on ptr occurs sometime before the next sequence point, i.e. the end of the statement. Since ptr has been assigned to the beginning of the array ch, *ptr-- has the value 'z'.
Note however that decrementing ptr before the beginning of the array has undefined behavior. While this particular instance should not cause any problem on most current systems, the mere presence of undefined behavior may let the compiler perform aggressive optimisations with counterintuitive results.
++*str is parsed as ++(*str): the value is *str + 1, hence the next character in the character set after z (*), and the value of *str is incremented as a side effect sometime before the next sequence point.
This is another instance of undefined behavior because *ptr and *str refer to the same element of the array ch: the side effect on *str clearly conflicts with *ptr reading the same element. The C Standard describes this as undefined behavior because an object modified in an expression can only be read to compute the modified value, which is not the case here.
The observed behavior, the output of 235, is consistent with the full expression evaluating to ('z' + ('z' + 1)) - 10 along with unobserved side effects and the use of the ASCII encoding where 'z' has the value 122: 122 + 122 + 1 - 10 = 235.
Note that this is only a potential explanation for the observed behavior; the C Standard does not guarantee this behavior at all. The behavior is undefined and the program could behave in different ways: it could output 236, 329, 330, it could stop with or without an error, or it could produce other unpredictable results.
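As an illustration of why a different number is plausible, here is a well-defined variant in which the increment is forced to happen before the read. It is only one hypothetical ordering, written with separate statements; increment_first_sum and its temporaries are names introduced for this sketch, and the out-of-bounds decrement of ptr is omitted.

```c
/* Hypothetical variant: the increment is sequenced before the read. */
int increment_first_sum(void)
{
    char ch[] = { 'z', 'o', 'h', 'o' };
    char *ptr = ch;
    char *str = ch;

    int right = ++*str;     /* increment ch[0] first and take the new value */
    int left = *ptr;        /* this read now sees the incremented ch[0] */

    return (left + right) - 10;
}
```

With ASCII this returns 123 + 123 - 10 = 236, one of the alternative outputs mentioned above.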
(*) Note that ++(*str) would also have implementation-defined behavior on a hypothetical system where 'z' would have the value of CHAR_MAX with char signed by default. This makes ++*str attempt to convert a value beyond the range of signed type char, a situation covered by 6.3.1.3: "[...] Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised." Many systems have 'z' with a value of 122 and CHAR_MAX set to 127... close but no cigar :)
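Whether incrementing a given char stays in range can be checked against CHAR_MAX from <limits.h>. A minimal sketch, with increment_fits_in_char as a hypothetical helper name:

```c
#include <limits.h>

/* Hypothetical helper: returns 1 when c + 1 is still representable in
   char, so that ++ applied to a char holding c needs no out-of-range
   conversion. */
int increment_fits_in_char(char c)
{
    return c < CHAR_MAX;
}
```

On a system where 'z' is 122, increment_fits_in_char('z') returns 1, since CHAR_MAX is at least 127.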