Using stdint.h
from glibc (gcc 9.2.1 on SUSE Linux, Intel Core i7 processor), I came across a most strange behaviour when printing INT32_MIN directly:
#include <stdio.h>
#include <stdint.h>
int main(void)
{
printf("%d\n", INT16_MIN);
int a = INT16_MIN;
printf("%d\n", a);
printf("%ld\n", INT32_MIN);
long b = INT32_MIN;
printf("%ld\n", b);
printf("%ld\n", INT64_MIN);
long c = INT64_MIN;
printf("%ld\n", c);
}
which outputs:
-32768
-32768
2147483648
-2147483648
-9223372036854775808
-9223372036854775808
Furthermore, if I try
printf("%ld\n", -INT32_MIN);
I get the same result, but with a compiler warning: integer overflow in expression '-2147483648' of type 'int' results in '-2147483648' [-Woverflow].
Not that this is incredibly bad for any existing program (actually it seems pretty harmless), but is this a bug in good old printf?
CodePudding user response:
Is this a bug in glibc printf?
No.
printf("%ld\n", INT32_MIN);
…2147483648
There is an easy way for this to happen. Under the x86-64 System V ABI your system uses, the second integer/pointer argument to a function is passed in the 64-bit register RSI. INT32_MIN
is a 32-bit int
with bit pattern 0x80000000, since that is the two's complement pattern for −2,147,483,648. The rules for passing a 32-bit value in a 64-bit register are that it is passed in the low 32 bits, and the high bits are not used by the called routine. For this call, 0x80000000 was put into the low 32 bits, and the high bits happened to be set to zero.
Then printf examines the format string and expects a 64-bit long. So it goes looking in RSI for a 64-bit integer. Of course, the rules for passing a 64-bit integer are to use the entire register, so printf takes all 64 bits, 0x0000000080000000. That is the bit pattern for 2,147,483,648, so printf prints 2147483648.
Of course, the C standard does not define the behavior, so other things could happen, but this is a likely scenario for what did happen in the instance you observed.
printf("%d\n", INT16_MIN);
…-32768
Since int is 32 bits in your C implementation, the int16_t value INT16_MIN is automatically promoted to int for the function call, so this passes an int, and %d expects an int; there is no mismatch, and the correct value is printed.
Similarly, the other printf calls in the question have arguments that match the conversion specifications (given the particular definitions of int16_t and such in your C implementation; they could mismatch in others), so their values are printed correctly.
CodePudding user response:
Let's observe this specific snippet
long a = INT32_MIN;
printf("%ld\n", a); // #1
printf("%ld\n", INT32_MIN); // #2
It prints the following (as per your example; I haven't tried it myself, and it's implementation-dependent, as I'll note later):
-2147483648
2147483648
Now, in #1 we are passing a long to printf. The function works with a variadic argument list, and based on the format specifier, the value of our parameter will be interpreted as a signed long, and thus we end up with the value -2147483648.
For #2 we have a somewhat different situation. If you look at how INT32_MIN is defined, you'll see that it's a plain old C macro, for example:
# define INT32_MIN (-2147483647-1)
so, our line
printf("%ld\n", INT32_MIN);
is actually
printf("%ld\n", (-2147483647-1));
where the argument type isn't a long but an int. Compare it to this:
int b = INT32_MIN;
printf("%ld\n", b);
and it prints the same (positive) value you observed.
Now it boils down to what your question should actually be: what happens when printf receives a value of a type that doesn't match the format specifier?