Casting / accessing same memory area as different variables in C

Time:09-22

I have 4 bytes of memory and would like to access this memory as both unsigned long L and char c[4]. What would be the most efficient way to do this in C?

For example, if I set L to 300, then the bytes will be set to 0x0000012c, and if I access c[3] I would expect to see 0x2c.

If I increase c[3] by one, it becomes 0x2d and L now has a value of 301

Thanks

CodePudding user response:

You can create a union of the two types:

union u1 {
    unsigned long l;
    char c[sizeof(unsigned long)];
};

Then if you create a variable of the union type, you can write to one member and read the other.

Keep in mind, however, that the result depends on the endianness of your system. x86-based systems are little-endian, meaning the least significant byte comes first. If that's the case, the c[0] member, not c[3], would actually have the value 0x2c.

CodePudding user response:

You could investigate the memory using a union and type punning:

#include <stdint.h>
#include <stdio.h>

typedef union {
    uint32_t L;
    unsigned char c[sizeof(uint32_t)];  /* unsigned, so bytes >= 0x80 are not sign-extended by %X */
} foo;

int main(void) {
    foo x;
    x.L = 123;
    printf("%X %X %X %X\n", x.c[0], x.c[1], x.c[2], x.c[3]);
}

The output will probably be 7B 0 0 0 or 0 0 0 7B depending on the endianness of your machine.

CodePudding user response:

If you have memory that has been defined as an unsigned long, as with unsigned long L;, then you may access its bytes through a pointer to char, as with:

char *c = (char *) &L;
printf("Byte 3 of L is %d.\n", c[3]);

This works because C 2018 6.3.2.3 7 says:

… When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of the object. Successive increments of the result, up to the size of the object, yield pointers to the remaining bytes of the object.

and 6.5 7 says we can access any object through a character type:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:

— a type compatible with the effective type of the object,

— a character type.

This is also true for an unsigned long that was created by storing an unsigned long in allocated memory, as with unsigned long *p = malloc(sizeof *p); *p = 300;.

However, the order of the bytes in the unsigned long is implementation-defined. C requires unsigned integer types to use a pure binary notation but does not specify the order of the bits within the bytes or the bytes within the representation, and it also allows padding.

Also, it is generally preferable to use an unsigned char to access the bytes, rather than a char, to avoid potential complications with signedness.

If you just have “some memory,” say a special region in an embedded system, and you want to access it both as an unsigned long and a char as described above, then you may need support from your compiler beyond what the C standard requires. Of course, C implementations intended for use in embedded systems generally provide support for accessing such memory as needed.

However, even if you do not have a defined or allocated object and do not have special support from the compiler, you can put the bytes of an unsigned long in that memory by using character types. In particular, memcpy is defined to copy byte-by-byte, so it would serve:

unsigned long L = 300;        // Prepare a normal unsigned long.
memcpy(target, &L, sizeof L); // Copy bytes of L to the address in pointer "target" (needs <string.h>).

Then, of course, you can also set a character pointer to point to the same place as target and use that pointer to access the bytes, as above.

Another way to use the same memory for two or more object types is through a union. This is defined in C but not in C++:

typedef union
{
    unsigned long L;
    char c[4];
} MyUnion;
MyUnion x = { .L = 300 };
printf("Byte 3 of the unsigned long is %d.\n", x.c[3]);

This works because C 2018 6.5.2.3 3 tells us that the . operator accesses the value of the named member of the union, and note 99 makes it clear that, if the requested member is not the last one stored, “the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type”.
