Is it possible to have a union of arrays in C?

Time:12-06

I wish to have a type that can be used as either of two different array layouts, depending on context. They will not be used interchangeably while the program is running; rather, when the program is started with a particular start-up flag, the type will be addressed as one of the array types, for example:

array1[2][100] or array2[200];

I am not interested in how the data is organised (well, I am, but it is not relevant to what I wish to achieve). Can I use a union like this:

union m_arrays
{
   uint16_t array1[2][100];
   uint16_t array2[200];
};

or do I have to use a pointer and alloc it at runtime?

uint16_t *array;

array = malloc(200 * sizeof(uint16_t));
uint16_t m_value = 100;

*(array + 199) = m_value;
// equivalent: array1[1][99] == *(array + 199)
// equivalent: array2[199] == *(array + 199)

I haven't tried anything yet.

CodePudding user response:

A union holds only one of its members at a time. All members share the same storage, so only one member can be "bound" at any moment.

In general, the size of a union is at least the size in bytes of its largest member (the compiler may add padding for alignment).

Let me give an example:

#include <stdio.h>

typedef union m_arrays
{
   int array1[2][100];
   int array2[400];
} a;

int main(void)
{
    /* sizeof yields a size_t, so the matching format is %zu, not %d */
    printf("%zu\n", sizeof(a));

    return 0;
}

In this example, the program prints 1600 (assuming int is 4 bytes; the exact value depends on the platform), which is the size in bytes of the largest member, array2. So yes, you can have a union of arrays in C.
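
To connect this back to the question, here is a minimal sketch of picking one view of the union based on a start-up flag. The names m_arrays_t, use_2d, store_at and load_at are hypothetical and chosen only for illustration; just the two member declarations come from the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ROWS 2
#define COLS 100

/* Same two members as in the question; the typedef name is hypothetical. */
typedef union {
    uint16_t array1[ROWS][COLS];
    uint16_t array2[ROWS * COLS];
} m_arrays_t;

/* Store value at flat position pos (0..199), using the view selected by
 * the hypothetical use_2d flag (for example, set once from argv). */
void store_at(m_arrays_t *m, int use_2d, size_t pos, uint16_t value)
{
    assert(pos < ROWS * COLS);
    if (use_2d)
        m->array1[pos / COLS][pos % COLS] = value;
    else
        m->array2[pos] = value;
}

/* Read the element at flat position pos through the selected view. */
uint16_t load_at(const m_arrays_t *m, int use_2d, size_t pos)
{
    assert(pos < ROWS * COLS);
    return use_2d ? m->array1[pos / COLS][pos % COLS] : m->array2[pos];
}
```

Both views refer to the same storage, so no allocation is needed; the flag only decides which member name the code goes through.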

CodePudding user response:

Yes, this does work, and it's actually precisely because of how arrays are different from pointers. I'm sure you've heard that arrays in C are really just pointers, but the truth is that there are some important differences.
First, an array is an object with its own storage, wherever it is declared: on the stack for a local variable, in static storage for a global, or inside a struct or union. malloc doesn't give you an array variable; it returns a pointer to a block of heap memory, which you then access through that pointer. A pointer can point anywhere; you can even set it to an arbitrary integer if you want (though there's no guarantee you can access the memory it points to).
Second, because arrays are fixed length, the compiler can and does allocate them for you when you declare them. Importantly, this comes with the guarantee that the whole array is in one contiguous memory block. So if you declare int arr[2][100], you'll have 200 int slots allocated in a row. That means element arr[y][x] sits y*100 + x ints from the start, exactly where it would be in a flat int[200]. You could also do something like int *arr2 = &arr[0][0] and then treat arr2 as a one-dimensional array, even though arr itself decays to a pointer to an array of 100 ints (int (*)[100]), not to an int**. Indexing past the end of a row this way is formally out of bounds for that row and compilers may warn about it; my point is that the memory layout is what makes it possible.
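
The row-major layout described here can be checked without indexing past the end of a row, by comparing byte offsets. This is only a small sketch; flat_index and byte_offset_of are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>

#define ROWS 2
#define COLS 100

/* Flat position of element [y][x] in a row-major ROWS x COLS array. */
size_t flat_index(size_t y, size_t x)
{
    assert(y < ROWS && x < COLS);
    return y * COLS + x;
}

/* Distance in bytes from the start of arr to element arr[y][x]. */
size_t byte_offset_of(int arr[ROWS][COLS], size_t y, size_t x)
{
    assert(y < ROWS && x < COLS);
    return (size_t)((const char *)&arr[y][x] - (const char *)arr);
}
```

For any valid y and x, byte_offset_of(arr, y, x) should equal flat_index(y, x) * sizeof(int), which is what lets the two-dimensional and one-dimensional views line up.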

The third, and probably most important difference, is a consequence of the second. When you have an array in a struct or union, the struct/union isn't just holding a pointer to the first element. It holds the entire array. This is often used for copying arrays or returning them from functions. What this means for you is that what you want to do works despite what someone who's heard that arrays are pointers might initially think. If arrays were just an address and they were initialized by allocating at that address, there would be two different arrays initialized at two different places, and having the pointers to them in a union would mean one gets overwritten and now you have an array somewhere that you can't access.
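
A short sketch of that third point: a struct that contains an array holds all of the elements, so assigning or returning the struct copies them. The names int4 and make_sequence are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* The array is stored inside the struct, not merely pointed to. */
struct int4 {
    int data[4];
};

/* Fill the array with start, start+1, start+2, start+3.
 * Returning the struct by value returns a copy of all four elements. */
struct int4 make_sequence(int start)
{
    struct int4 s;

    assert(start <= INT_MAX - 3);   /* avoid signed overflow below */
    for (int i = 0; i < 4; i++)
        s.data[i] = start + i;
    return s;
}
```

Assigning one struct int4 to another copies the four ints as well, so changing the copy leaves the original untouched. A struct holding only an int* would copy just the address.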

So when this all comes together, your union of arrays basically has one array with two different ways of accessing the data (which is what you want if I'm not mistaken). A little example:

#include <stdio.h>
int main(void) {
    union {
        int arr1[4];
        int arr2[2][2];
    } u;
    u.arr1[0] = 1;
    u.arr1[1] = 2;
    u.arr1[2] = 3;
    u.arr1[3] = 4;
    printf("%d %d\n%d %d\n", u.arr2[0][0], u.arr2[0][1], u.arr2[1][0], u.arr2[1][1]);
    return 0;
}

Output:

1 2
3 4

We can also quickly walk through why this wouldn't work with pure pointers. Let's say we instead had a union like this:

union {
    int* arr1;
    int** arr2;
} u;

Then we might initialize with u.arr1 = (int*) malloc(4 * sizeof (int));. Then we could use arr1 like a normal array. But what happens when we try to use arr2? Well, arr2[y][x] is of course syntactic sugar for *(*(arr2 + y) + x). The first dereference reads from memory that actually holds ints, but because arr2 is an int**, whatever is stored there is treated as an int*, that is, as an address. So when we add x to it and dereference again, we're dereferencing something that is really integer data. C will try to do it, and if you're very unlucky it will succeed; I say unlucky because then you'll be messing with arbitrary memory. What's more likely is a segfault, because whatever value is there is most likely not an address your program has access to.
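
If run-time allocation is wanted, a pointer version that does work uses a pointer to an array (one row) rather than an int**. The following is a sketch under that assumption; row_t and alloc_grid are hypothetical names, and the caller is responsible for calling free.

```c
#include <assert.h>
#include <stdlib.h>

#define COLS 2

/* One row of the grid. A row_t* steps over whole rows, so g[y][x] works. */
typedef int row_t[COLS];

/* Allocate rows * COLS zero-initialised ints as one contiguous block.
 * Returns NULL if the allocation fails. */
row_t *alloc_grid(size_t rows)
{
    assert(rows > 0);
    return calloc(rows, sizeof(row_t));
}
```

With row_t *g = alloc_grid(2), the expressions g[y][x] and ((int *)g)[y * COLS + x] refer to the same element, because all rows come from a single allocation and there is no table of row pointers in between.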
