Is this behavior intended in C or is it platform dependent?

Time:12-04

I remembered the fast inverse square root algorithm from Quake 3 and wanted to try some bit shenanigans.

#include <stdio.h>

int main()
{
    unsigned int x[2] = {0xffffffff, 0x0f01};
    
    unsigned long y = * (long *) &x[0];
    
    unsigned long z = ((unsigned long)x[1] << 32) + x[0];
    
    printf("x[0] = %x, x[1] = %x\n(x[1] << 32) + x[0] = %lx\ny = %lx", x[0], x[1], z, y);

    return 0;
}

and it outputs this

x[0] = ffffffff, x[1] = f01
(x[1] << 32) + x[0] = f01ffffffff
y = f01ffffffff

Is it supposed to be doing this?

CodePudding user response:

Is this behavior intended in C or is it platform dependent?

Technically, it's undefined behavior. You are accessing an object of type unsigned int by dereferencing a pointer to long, which violates the strict aliasing rule (see "What is the strict aliasing rule?"). The Wikipedia article on the algorithm covers this as well: https://en.wikipedia.org/wiki/Fast_inverse_square_root#Avoiding_undefined_behavior .

unsigned int x[2];    
*(long *) &x[0]; // invalid

Even if your compiler tries to make sense of it rather than exploiting the undefined behavior, you still need to be lucky enough that &x[0] is aligned to alignof(long). If that holds too, the result depends on long being 64 bits wide and on endianness (https://en.wikipedia.org/wiki/Endianness): you get 0xf01ffffffff on a little-endian platform such as x86-64, and 0xffffffff00000f01 on a big-endian one. (ARM can run in either mode, but is usually configured as little-endian.)
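One common way to get the same reinterpretation without the aliasing or alignment problem is to copy the bytes with memcpy. The following is only a minimal sketch of that idea applied to the array from the question; the helper names load_u64 and combine_u64 are hypothetical and do not come from the original post, and fixed-width types are used so the code does not assume a 64-bit long.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret the 8 bytes of a two-element uint32_t
   array as one uint64_t. memcpy copies the object representation byte by
   byte, so there is no strict aliasing or alignment issue. The result
   still depends on the platform's byte order. */
static uint64_t load_u64(const uint32_t words[2])
{
    uint64_t value;
    memcpy(&value, words, sizeof value);
    return value;
}

/* Hypothetical helper: build the value arithmetically instead. This gives
   the same result on every platform, regardless of byte order. */
static uint64_t combine_u64(uint32_t low, uint32_t high)
{
    return ((uint64_t)high << 32) | low;
}
```

Called with the question's data, combine_u64(x[0], x[1]) yields 0xf01ffffffff everywhere, while load_u64(x) matches it only on a little-endian platform.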

CodePudding user response:

By dereferencing the address &x[0] and treating the object located at that address as a long, although it is an unsigned int, you are violating the strict aliasing rule, which causes undefined behavior.

Is it supposed to be doing this?

It can do anything it wants, as the behavior of your program is undefined.

CodePudding user response:

It is perhaps interesting to consider a similar program whose behavior is well-defined (aliasing via a union is permitted in C), but still may differ between platforms:

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

int main(void) {
    union {
        uint32_t x[2];
        uint64_t y;
    } u = { {0xffffffff, 0x0f01} };
    
    uint64_t z = ((uint64_t)u.x[1] << 32) + u.x[0];
    
    printf("x[0] = %" PRIx32 ", x[1] = %" PRIx32 "\n(x[1] << 32) + x[0] = %" PRIx64 "\ny = %" PRIx64 "\n",
            u.x[0], u.x[1], z, u.y);

    return 0;
}

It happens to produce the same output for me on my machine as your program did for you:

$ ./a.out
x[0] = ffffffff, x[1] = f01
(x[1] << 32) + x[0] = f01ffffffff
y = f01ffffffff

Among other things, this shows that on my machine (as, probably, on yours) both 32-bit and 64-bit unsigned integers are represented with little-endian byte order.
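If you want to check the byte order directly rather than infer it from the union's output, something along these lines should work. It is a rough sketch, and is_little_endian is a hypothetical name chosen for illustration. Inspecting an object through an unsigned char pointer is always allowed by the aliasing rules.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if the least significant byte of a
   uint32_t is stored at the lowest address (little-endian), 0 otherwise. */
static int is_little_endian(void)
{
    uint32_t probe = 1;
    /* Reading any object through unsigned char is well-defined. */
    const unsigned char *bytes = (const unsigned char *)&probe;
    return bytes[0] == 1;
}
```

On a machine where this returns 1, the union member u.y in the program above is expected to read as f01ffffffff.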
