What language in the standard makes this code work, printing '-1'?
unsigned int u = UINT_MAX;
signed int s = u;
printf("%d", s);
https://en.cppreference.com/w/c/language/conversion
otherwise, if the target type is signed, the behavior is implementation-defined (which may include raising a signal)
https://gcc.gnu.org/onlinedocs/gcc/Integers-implementation.html#Integers-implementation
GCC supports only two’s complement integer types, and all bit patterns are ordinary values.
The result of, or the signal raised by, converting an integer to a signed integer type when the value cannot be represented in an object of that type (C90 6.2.1.2, C99 and C11 6.3.1.3):
For conversion to a type of width N, the value is reduced modulo 2^N to be within range of the type; no signal is raised.
To me it seems like converting UINT_MAX
to an int
would therefore mean dividing UINT_MAX
by 2^(CHAR_BIT * sizeof(int))
. For the sake of argument, with 32 bit ints, 0xFFFFFFFF / 2^32 = 0xFFFFFFFF
. So this doesnt really explain how the value '-1' ends up in the int
.
Is there some language somewhere else that says after the modulo division we just reinterpret the bits? Or some other part of the standard that takes precedence before the parts I have referenced?
CodePudding user response:
No part of the C standard guarantees that your code shall print -1 in general. As it says, the result of the conversion is implementation-defined. However, the GCC documentation does promise that if you compile with their implementation, then your code will print -1. It's nothing to do with bit patterns, just math.
The clearly intended reading of "reduced modulo 2^N" in the GCC manual is that the result should be the unique number in the range of signed int
that is congruent mod 2^N to the input. This is a precise mathematical way of defining the "wrapping" behavior that you expect, which happens to coincide with what you would get by reinterpreting the bits.
Assuming 32 bits, UINT_MAX
has the value 4294967295. This is congruent mod 4294967296 to -1. That is, the difference between 4294967295 and -1 is a multiple of 4294967296, namely 4294967296 itself. Moreover, this is necessarily the unique such number in [-2147483648, 2147483647]. (Any other number congruent to -1 would be at least -1 4294967296 = 4294967295
, or at most -1 - 4294967296 = -4294967297
). So -1 is the result of the conversion.
In other words, add or subtract 4294967296 repeatedly until you get a number that's in the range of signed int
. There's guaranteed to be exactly one such number, and in this case it's -1.