How does C casting unsigned to signed work?-CodePudding

What language in the standard makes this code work, printing '-1'?

unsigned int u = UINT_MAX;
signed   int s = u;
printf("%d", s);

https://en.cppreference.com/w/c/language/conversion

otherwise, if the target type is signed, the behavior is implementation-defined (which may include raising a signal)

https://gcc.gnu.org/onlinedocs/gcc/Integers-implementation.html#Integers-implementation

GCC supports only two’s complement integer types, and all bit patterns are ordinary values.

The result of, or the signal raised by, converting an integer to a signed integer type when the value cannot be represented in an object of that type (C90 6.2.1.2, C99 and C11 6.3.1.3):

For conversion to a type of width N, the value is reduced modulo 2^N to be within range of the type; no signal is raised.

To me it seems like converting UINT_MAX to an int would therefore mean dividing UINT_MAX by 2^(CHAR_BIT * sizeof(int)). For the sake of argument, with 32 bit ints, 0xFFFFFFFF / 2^32 = 0xFFFFFFFF. So this doesnt really explain how the value '-1' ends up in the int.

Is there some language somewhere else that says after the modulo division we just reinterpret the bits? Or some other part of the standard that takes precedence before the parts I have referenced?

CodePudding user response：

No part of the C standard guarantees that your code shall print -1 in general. As it says, the result of the conversion is implementation-defined. However, the GCC documentation does promise that if you compile with their implementation, then your code will print -1. It's nothing to do with bit patterns, just math.

The clearly intended reading of "reduced modulo 2^N" in the GCC manual is that the result should be the unique number in the range of signed int that is congruent mod 2^N to the input. This is a precise mathematical way of defining the "wrapping" behavior that you expect, which happens to coincide with what you would get by reinterpreting the bits.

Assuming 32 bits, UINT_MAX has the value 4294967295. This is congruent mod 4294967296 to -1. That is, the difference between 4294967295 and -1 is a multiple of 4294967296, namely 4294967296 itself. Moreover, this is necessarily the unique such number in [-2147483648, 2147483647]. (Any other number congruent to -1 would be at least -1 4294967296 = 4294967295, or at most -1 - 4294967296 = -4294967297). So -1 is the result of the conversion.

In other words, add or subtract 4294967296 repeatedly until you get a number that's in the range of signed int. There's guaranteed to be exactly one such number, and in this case it's -1.