Why can't I use %d to print the address of a variable

#include <stdio.h>

int main()
{
  int a = 10;

  printf("address = %d\n ", &a);

  return 0; 
}

test.c:11:29: warning: format specifies type 'int' but the argument has type 'int *' [-Wformat]
  printf("address = %d\n ", &a);

Output if I use %d:

address = -376933160

Output if I use %p. This also looks weird; shouldn't I get a positive integer instead?

address = 0x7ffee07004d8

I know the correct way to do this is %p, but in a YouTube video I saw people simply use %d without any problems. I wonder if my setup is wrong; I am using VS Code to run the program.

Update

Now I am aware -376933160 is an incorrect value, but still I wonder why it just outputs a random number instead of stopping the execution ?

CodePudding user response：

Format specifiers each have a specific purpose. %d is for values of type int (a short argument is promoted to int when it is passed to printf). long uses %ld, and there are further variants for the other integer types.

You have a pointer, which holds an address. Although it is a numerical value, it is special in its purpose and should be formatted with %p, the proper way to print pointers.

Also, the size of a pointer varies by architecture, so it may not match the size of the int that %d expects.

Regarding the different values that were printed: if you print the address of a variable with both formats in the same execution, you may again get one 'positive' hex value and one 'negative' integer value, yet the two are almost the same value. With %d you typically see only the lower 32 bits of the 64-bit address, interpreted as a signed int, so it comes out negative whenever bit 31 is set. With %p the address is shown as an unsigned hex value (a sign is meaningless for an address), so it looks positive. In memory, those 32 bits are identical in both cases. The difference comes from the different widths of the two types and from the "two's complement" representation of signed integers.

Note: The two values you posted are not the same address. You got them in different executions of the program and ASLR was on, so the actual address of a changed between executions.

It is important to mention that, even though I refer to pointers in this answer as numerical values, it is not correct to regard them as such: they hold addresses, which are their own category of type (thanks @JohnBullinger for clarifying).

Use the correct format specifier to avoid this warning, as it is informing you that you may have mistyped or used the wrong variable, since what you're trying to print is not a regular numerical value but an address.

CodePudding user response：

Now I am aware -376933160 is an incorrect value, but still I wonder why it just outputs a random number instead of stopping the execution ?

Because it is Undefined Behaviour. It does what it does. No rules.

Using the wrong format specifier is cheating the compiler. Imagine that I want to buy your old car. The price is $5000. I pay you, but not in US$. If I pay £5000, you come out ahead. But if I pay in drachmas (5000 GRD), you will not be very happy, and your behaviour will be undefined. Maybe you will chase me with a baseball bat in your hand, or maybe you will simply give up and accept your losses.

The same happens with the compiler and the printf function.

CodePudding user response：

For the same reason you can't use %f, or %c, or anything other than %p.

%d expects its corresponding argument to have type int; if the argument doesn't have type int, then the behavior is undefined. You may get reasonable-looking output, or you may not.

On most 64-bit machines, sizeof (int *) > sizeof (int). On x86-64, pointers are 64 bits wide while ints are 32 bits wide. If you pass a pointer as the argument for %d, printf will typically read only half of the pointer value, and the output is gibberish.

Now I am aware -376933160 is an incorrect value, but still I wonder why it just outputs a random number instead of stopping the execution ?

Again, you're likely only seeing the lower 32 bits of a 64-bit pointer value.

Undefined behavior does not mean "stop execution" or "print an error" or anything else - it just means that neither the compiler nor the runtime environment is required to handle the situation in any particular way.

CodePudding user response：

The loss of bits and the appearance of the minus sign provoke a warning when you do the conversion directly:

adr.c:7:13: warning: overflow in conversion from 'long int' to 'int' changes value from '140732663858392' to '-529529640' [-Woverflow]
    7 |   int adr = 0x7ffee07004d8;

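0xe07004d8 (the lower 32 bits) is larger than 2^31 - 1, the largest value a 32-bit int can hold, so it wraps around to a negative number. With 0x7ffe0000000a, it converts to '10'.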

Not a random number, and no reason to "stop execution".

Such a "cast" rarely makes sense, especially not on pointers (unless you convert a pointer to a long just to play with it).

CodePudding user response：

The compiler will generally issue a warning if there is a conversion between pointer and integer types without an explicit cast as a protection to the programmer. You can eliminate the warning by casting &a to long long int.

Depending on the system, you may be able to print the decimal value if you cast the address and print it with %lld:

printf("address = %lld\n ", (long long int)&a);

However, not every system supports ll as a length modifier: it was added in C99, so older compilers and C libraries may lack it.