prefix and postfix with dereferencing operator

Time:05-01

Why and how is the given code printing 235 in C?

#include <stdio.h>

int main()
{
    int i;

    char ch[] = { 'z', 'o', 'h', 'o' };

    char *ptr, *str1;
    ptr = ch;
    str1 = ch;
    i = (*ptr-- + ++*str1) - 10;
    printf("%d", i);

    return 0;
}

CodePudding user response:

In i = (*ptr-- + ++*str1) - 10;, *ptr-- is parsed as *(ptr--) because postfix -- binds more tightly than unary *.
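
As a hedged illustration of that parse, here is a small sketch in which the pointer starts at the second element, so the decrement stays inside the array and the behavior is defined. The function name read_then_step_back and its parameters are made up for this example; they are not from the question.

```c
#include <stddef.h>

/* Hypothetical helper: reads through p with *p-- and reports where p ends up.
   buf must have at least two elements. p starts at buf[1], so stepping back
   to buf[0] stays inside the array. */
char read_then_step_back(const char *buf, ptrdiff_t *offset_after)
{
    const char *p = buf + 1;    /* points at the second element */
    char value = *p--;          /* parsed as *(p--): reads buf[1], then p moves to buf[0] */
    *offset_after = p - buf;    /* 0 after the decrement */
    return value;
}
```

Called on an array such as { 'a', 'b', 'c' }, it returns 'b' and stores 0 through offset_after: the dereference sees the pointer value from before the decrement.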

Since the prior ptr=ch; set ptr to point to the first element of ch, ptr-- attempts to decrement ptr to point before the start of ch. The behavior of this is not defined by the C standard.

Since ptr-- normally evaluates to the value of ptr before the decrement, *ptr-- might produce the value that ptr points to, as if it were *ptr instead of *ptr--. Thus, it would produce the value of 'z'. However, because of the invalid pointer decrement, this is not guaranteed by the C standard. Once there is behavior not defined by the standard, none of the other behavior on the same code path is defined either.

If we remove the errant -- and use (*ptr + ++*str1) - 10, then, by itself, *ptr produces the value for 'z', which is 122 when the ASCII character set is used. str1 also points to the first element of ch, so ++*str1 increments that element, changing it from 122 to 123.

At this point, there is again behavior not defined by the C standard. *ptr uses ch[0], and ++*str1 modifies it, and there is no sequencing between the two. By the rule in C 2018 6.5 paragraph 2, the behavior of the program is not defined by the C standard.

If we suppose sequencing is somehow added, so that the increment comes later than *ptr, then ++*str1 produces the incremented value, 123.

Then + adds 122 and 123 to produce 245, and subtracting 10 produces 235.
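
One way to actually add that sequencing is to split the read and the increment into separate statements and leave ptr alone. This is only a sketch of a well-defined variant, not the original program; the function name sequenced_result and the temporaries left and right are invented for the example.

```c
/* Hypothetical well-defined variant: the read and the write of ch[0]
   are separate statements, and ptr is never moved out of bounds. */
int sequenced_result(void)
{
    char ch[] = { 'z', 'o', 'h', 'o' };
    char *ptr = ch;
    char *str1 = ch;

    int left = *ptr;         /* value of 'z', 122 with ASCII */
    int right = ++*str1;     /* ch[0] is incremented; the expression yields the new value */

    return (left + right) - 10;   /* 235 when 'z' is 122 */
}
```

Here the result is guaranteed to be 'z' + ('z' + 1) - 10, because the semicolon after the first statement sequences the read before the increment.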

CodePudding user response:

In the expression *ptr--, postfix -- has higher precedence than unary *, so the expression is parsed as *(ptr--); you are dereferencing the result of ptr--, which yields 'z' (ASCII 122). As a side effect, ptr is decremented and now points outside of the array bounds.

In the expression ++*str1, unary ++ and unary * have the same precedence and group right to left, so the expression is parsed as ++(*str1), whose value is 'z' + 1, or ASCII 123. 122 + 123 - 10 gives you 235. As a side effect, ch[0] now contains ASCII 123, or '{'.
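
To look at the ++(*p) grouping in isolation, here is a minimal sketch. The name bump_first and the buffer are assumptions made for illustration only.

```c
/* Hypothetical helper: applies ++*p to the first element of buf and
   returns the value of that expression. */
int bump_first(char *buf)
{
    char *p = buf;
    int result = ++*p;   /* parsed as ++(*p): buf[0] is incremented, p itself does not move */
    return result;       /* the new value of buf[0] */
}
```

With buf holding { 'a', 'b' }, the call returns 'a' + 1, buf[0] now holds that value, and buf[1] is untouched. The increment applies to the element, not to the pointer.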

This whole mess is equivalent to writing

tmp1 = ptr;
tmp2 = *str1;
i = *tmp1 + (tmp2 + 1) - 10;
ptr = ptr - 1;
*str1 = *str1 + 1;

except that the updates to i, ptr, and *str1 can happen in any order, even simultaneously.

CodePudding user response:

To evaluate (*ptr-- + ++*str1), the compiler applies the rules of the C grammar: postfix operators bind more tightly than prefix operators. Postfix operators are applied first, from left to right, and then prefix operators are applied to the result from right to left.

*ptr-- hence applies * to the result of ptr--, which is the original value of ptr. The side effect on ptr occurs sometime before the next sequence point, i.e. before the next statement. Since ptr has been assigned the address of the beginning of the array ch, *ptr-- has the value 'z'.

Note however that decrementing ptr before the beginning of the array has undefined behavior... While this particular instance should not cause any problem on most current systems, the mere presence of undefined behavior may let the compiler perform aggressive optimisations with counterintuitive results.
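
For comparison, a backwards walk can be written so that the pointer never goes below the start of the array. The sketch below starts one past the end, which C allows as long as that position is not dereferenced, and decrements before each read. The name sum_backwards is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: sums the n elements of buf from last to first.
   The pointer is decremented before each read, so it stays within
   [buf, buf + n] and never points before the array. */
int sum_backwards(const char *buf, size_t n)
{
    int total = 0;
    const char *p = buf + n;    /* one past the last element: valid to form, not to dereference */
    while (p != buf) {
        --p;                    /* step back first, then read */
        total += *p;
    }
    return total;
}
```

The loop ends with p equal to buf, which is the lowest pointer value the standard lets you compute for this array.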

++*str1 is parsed as ++(*str1): its value is *str1 + 1, hence the next character in the character set after z (*), and the value of *str1 is incremented as a side effect sometime before the next sequence point.

This is another instance of undefined behavior because *ptr and *str1 refer to the same element of the array ch: the side effect on *str1 clearly conflicts with *ptr reading the same element. The C Standard describes this as undefined behavior because an object modified in an expression can only be read to compute the modified value, which is not the case here.

The observed behavior, the output of 235, is consistent with the full expression evaluating to ('z' + ('z' + 1)) - 10 along with unobserved side effects and the use of the ASCII encoding where 'z' has the value 122: 122 + 122 + 1 - 10 = 235.

Note that this is only a potential explanation for the observed behavior; the C Standard does not guarantee it at all. The behavior is undefined and the program could behave in different ways: it could output 236, 329, 330, it could stop with or without an error, or it could produce other unpredictable results.


(*) Note that ++(*str1) would also have implementation-defined behavior on a hypothetical system where 'z' has the value CHAR_MAX and char is signed by default. This makes ++*str1 attempt to convert a value beyond the range of the signed type char, a situation covered by 6.3.1.3: "[...] Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised." Many systems have 'z' with a value of 122 and CHAR_MAX set to 127... close but no cigar :)
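
If you want to check for that corner case up front, a rough sketch could look like this. The function name increment_is_implementation_defined is invented here, and it only covers the situation described in the footnote (plain char signed and already at CHAR_MAX).

```c
#include <limits.h>

/* Hypothetical check: returns nonzero when ++ applied to a char holding c
   would have to convert CHAR_MAX + 1 back to a signed char type. */
int increment_is_implementation_defined(char c)
{
    int char_is_signed = (CHAR_MIN < 0);   /* plain char is signed when CHAR_MIN is negative */
    return char_is_signed && c == CHAR_MAX;
}
```

When plain char is unsigned, the function always returns 0, since the conversion back to an unsigned type is well defined.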
