In C: Can I have two pointers of different types pointing to the same address?

Time:12-23

Question:

Can I have two pointers of different types (uint32_t * and char *) pointing to the very same address?


Here is why I want to have this:

I want to convert UTF-8 to UTF-32 and vice versa in C.

Let's say I have a variable of type uint32_t that contains one UTF-32 encoded Unicode character, and I already know that it needs 4 bytes when encoded in UTF-8. Its binary representation is this:

00000000000aaabbbbbbccccccdddddd

a, b, c and d are 4 different ranges where each bit can be 0 or 1. With clever bitwise &, | and << operations I can rearrange these bits so that at the end there is this new distribution:

00000aaa00bbbbbb00cccccc00dddddd

And then I can set the marker bits (using | again) to get this:

11110aaa10bbbbbb10cccccc10dddddd

When I split this into 4 subsequent char variables in an array I have this:

11110aaa  10bbbbbb  10cccccc  10dddddd

which is exactly the UTF-8 encoding of the same Unicode character.
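The bit rearrangement described above can be sketched roughly as follows. The function names pack_utf8_4 and split_msb_first are hypothetical, and the sketch assumes the code point is already known to lie in the 4-byte range U+10000..U+10FFFF. The bytes are extracted with shifts rather than by looking at the uint32_t in memory, so the result does not depend on the machine's byte order.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: packs a code point in U+10000..U+10FFFF into the
   pattern 11110aaa 10bbbbbb 10cccccc 10dddddd, held in one 32-bit value. */
static uint32_t pack_utf8_4(uint32_t cp)
{
    assert(cp >= 0x10000u && cp <= 0x10FFFFu);  /* 4-byte range only */

    /* Spread the four bit groups so each one sits in its own byte. */
    uint32_t spread = ((cp & 0x1C0000u) << 6)   /* aaa    -> bits 24..26 */
                    | ((cp & 0x03F000u) << 4)   /* bbbbbb -> bits 16..21 */
                    | ((cp & 0x000FC0u) << 2)   /* cccccc -> bits  8..13 */
                    |  (cp & 0x00003Fu);        /* dddddd -> bits  0..5  */

    /* Set the UTF-8 marker bits: 11110... and three times 10... */
    return spread | 0xF0808080u;
}

/* Hypothetical helper: writes the packed value into out[], most
   significant byte first, independent of the host's byte order. */
static void split_msb_first(uint32_t packed, unsigned char out[4])
{
    out[0] = (unsigned char)(packed >> 24);
    out[1] = (unsigned char)((packed >> 16) & 0xFFu);
    out[2] = (unsigned char)((packed >> 8) & 0xFFu);
    out[3] = (unsigned char)(packed & 0xFFu);
}
```

For U+1F600, pack_utf8_4 returns 0xF09F9880, and split_msb_first turns that into the bytes F0 9F 98 80.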

So the very same 4 bytes in memory should be one single uint32_t variable and, at the same time, an array of 4 char variables. I want to have this:

uint32_t *utf32;
char utf8[4];

  • utf32 is a pointer to a single uint32_t variable that is 4 bytes long.
  • utf8 is an array of 4 char elements, each 1 byte long; used in an expression, it converts to a pointer to its first element.

And I want both to refer to the very same address, so that I can write a UTF-32 encoded character through utf32, transform it in place, and then read the result from the array utf8. Is this possible? If so, how can I do it?

(I used this technique very often when I was coding in COBOL in the previous millennium, because in COBOL it's easy to overload the same region in the memory with many different definitions. But I don't know how to do it in C.)


I have found a lot of questions dealing with 2 pointers pointing to the same address, but in these questions the pointers always have the same type. Some other questions are about why you get an error if a pointer defined with a certain type points to an address that was defined with another type. But I didn't find anything about two pointers of different types sharing the same address.

CodePudding user response:

Can I have two pointers of different types (uint32_t * and char *) pointing to the very same address?

Yes, you can.

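#include <assert.h>
#include <stdint.h>

union U {
  uint32_t ui32;
  char c[4];
};

union U u;
u.ui32 = ...

uint32_t *pi = &u.ui32;  /* points at the 32-bit member */
char *cp = u.c;          /* points at the first byte of the same storage */

/* Pointers of different types cannot be compared directly; convert both to void * first. */
assert((void *)pi == (void *)cp);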

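Two different pointer types pointing to the same address is not a problem in itself. Reading or writing an object's bytes through a char * (or unsigned char *) is always allowed in C. It is the reverse direction, accessing a char array through a uint32_t * obtained by a cast, that can violate the alignment and aliasing rules. Also note that which byte of ui32 ends up in c[0] depends on the machine's byte order.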
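As a rough illustration of the union approach and its byte-order caveat, here is a minimal sketch. The helper names host_is_little_endian and union_bytes_msb_first are hypothetical, unsigned char is used instead of char to avoid sign issues, and the sketch assumes the host is either plain little-endian or plain big-endian.

```c
#include <assert.h>
#include <stdint.h>

union U {
    uint32_t ui32;
    unsigned char c[4];
};

/* Hypothetical helper: returns 1 when the host stores the least
   significant byte of a uint32_t at the lowest address. */
static int host_is_little_endian(void)
{
    union U u;
    u.ui32 = 1u;
    return u.c[0] == 1u;
}

/* Hypothetical helper: stores packed in the union, then reads the bytes
   back through the array member, most significant byte first. On a
   little-endian host the array has to be walked backwards. */
static void union_bytes_msb_first(uint32_t packed, unsigned char out[4])
{
    assert(sizeof(uint32_t) == 4);  /* the layout below relies on this */

    union U u;
    u.ui32 = packed;
    int little = host_is_little_endian();
    for (int i = 0; i < 4; i++)
        out[i] = little ? u.c[3 - i] : u.c[i];
}
```

On a little-endian machine, storing 0xF09F9880 in ui32 leaves 0x80 in c[0], not 0xF0, so the array cannot be used directly as the UTF-8 sequence without this reordering.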

CodePudding user response:

Yes, two pointers of different types can point to the same address.

Let's say the UTF-32 value is stored somewhere in memory and you know where; I will refer to that location as address (assumed here to be a void *, so that no cast is needed).

So if you want to treat these 4 bytes as a uint32_t, you could do this:

uint32_t* utf32 = address;

And you can just as easily treat it as a char array:

char* utf8 = address;

If you then want to access a char you just do:

utf8[index]
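A minimal sketch of this idea, with one caveat added: inspecting a uint32_t object through a character pointer is fine, but going the other way (treating an arbitrary char buffer as a uint32_t) is safer with memcpy than with a pointer cast, because the buffer may be misaligned. The names load_u32 and same_address_demo are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reads a uint32_t out of a byte buffer by copying,
   so the buffer does not need to be aligned for uint32_t. */
static uint32_t load_u32(const unsigned char *bytes)
{
    assert(bytes != NULL);
    uint32_t value;
    memcpy(&value, bytes, sizeof value);
    return value;
}

/* Hypothetical demo: returns 1 when both pointers refer to the same
   address and the bytes seen through utf8 reassemble to the same value. */
static int same_address_demo(void)
{
    uint32_t word = 0xF09F9880u;
    uint32_t *utf32 = &word;
    unsigned char *utf8 = (unsigned char *)&word;  /* character-type access is allowed */

    return (void *)utf32 == (void *)utf8 && load_u32(utf8) == *utf32;
}
```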