Home > Software design >  How to resize a hashtable in C? Array refuses to change
How to resize a hashtable in C? Array refuses to change

Time:10-21

struct table
  {
    int size;
    key_value *array[];
  };

struct table *create_table(int size)
{
    struct table *table = calloc(1, sizeof(struct table));
    table->size = size;
    return table;
}

void resize_table(struct table *table, int new_size)
{
    struct table *new_table = create_table(new_size);
    for(int i = 0; i < table->size; i  )
      {
      key_value *p = table->array[i]->next;
      while(has_next(p))
        {
        insert(new_table, p->key, p->value);
        p = p->next;
        }
      }
    *table = *new_table;
}

I tried making the array only array[], this didn't work either. When I try to get rehashed values through new_table, it works perfectly. But after assigning *table = *new_table and accessing rehashed values through table, it doesn't work. Why does assert(new_table->array[23]->key==23) work, but not assert(table->array[23]->key==23)? I have put table equal to new_table, shouldn't they be the same now? What's weird is that the size value has changed, but not the array.

When I try to change just the pointer:

table->array = new_table->array;

I get invalid use of flexible array member, which I don't understand, aren't I just changing a pointer in the struct to a different adress, why does it even matter that it's an array?

Can I solve this by using alloc/realloc maybe?

CodePudding user response:

There are a lot of misconceptions here.

  • key_value *array[]; is a so-called flexible array member, a special feature. Specifically it is an array of key_value pointers, of a size unknown at the point of declaration. It is an array, not a pointer.

    Therefore table->array = new_table->array; tries to assign an array to another array, which isn't allowed in C. Arrays must be copied with memcpy or similar.

  • Flexible array members mostly make sense when you need to allocate adjacent memory directly following the struct. A hash table isn't an obvious use-case for such, but it can be made to work (and gives good cache performance). What you need to do when allocating a flex array member is to allocate room for the struct and the array in one go, like this:

    struct table *table = calloc(1, sizeof(struct table)   sizeof(key_value*[size]) );
    

    A special attribute of flexible array members is that sizeof(struct table) doesn't take the flex array in account. So we need to specify that size separately.

    If you don't do this, then key_value isn't valid memory and can't be used.

    calloc can be assumed to set all pointers to null, so that's good, keep calloc instead of malloc which doesn't init anything.

  • This here won't work:

    void resize_table(struct table *table, int new_size)
    {
      struct table *new_table = create_table(new_size);
    

    For the reasons explained at Dynamic memory access only works inside function

    There are two ways you could make this work:

    • Either table is assumed to point at previously allocated data. In that case you should not call create_table but modify the already allocated data. Possibly realloc it. In which case realloc must look use sizeof(struct table) sizeof(key_value*[new_size]).

      realloc, unlike calloc, does not initialize the data to zero, so you must do that explicitly for the new items if you use realloc.

    • Or alternatively you can ask the caller to provide you with an address to the previously allocated data, through struct table** table. Then you can free it and then just call create_table anew, storing the result inside *table. Naturally you might want to memcpy over any items that should be preserved before calling free though.

  • Similarly, *table = *new_table; won't work since that will not copy the flexible array member, nor any data pointed at by the pointers it contains.

    But in general one cannot hard copy a struct when it has pointer members to dynamically allocated data, or you end up with two structs where the members point at the same data. This means that you would have to allocate all data anew using either of the two methods I described above.

Note that all of the above merely allocates room for the pointers themselves in key_value *array[]. They have to be set to point at actual data allocated elsewhere, outside this struct.

CodePudding user response:

When you originally had

struct table
  {
    int size;
    key_value *array[24];
  };

(or whatever), the fixed-size 24-element array is part of your struct, and is counted in sizeof(struct table), and space is allocated for it correctly in

struct table *create_table(int size)
{
    struct table *table = calloc(1, sizeof(struct table));
    table->size = size;
    return table;
}

But when you change this to a flexible array, it has no default size, and no storage for it is assumed in sizeof(struct table). This should be obvious, because you've explicitly told the compiler you don't know how big it should be. You could easily confirm your understanding here by just printing sizeof the two versions of your struct.

Therefore, you need to explicitly tell calloc how many extra bytes to allocate:

    struct table *table = calloc(1, sizeof(struct table)   size * sizeof(key_value*));

Finally, the assignment

  *table = *new_table;

won't work either, because the two objects are not necessarily the same size. You can't resize the first object (whose array length the compiler doesn't even know) like this, and no elements will be copied from the flexible array either (because the compiler doesn't know how many there are).

Now, you already allocated an object of the correct size - You need to return this to the caller. The usual approach would be to pass a pointer-to-pointer, so you can update the caller with the new address:

void resize_table(struct table **table_ptr, int new_size)
{
    struct table *table = *table_ptr; /* this is just to avoid rewriting the rest of the code */
    struct table *new_table = create_table(new_size);
    for(int i = 0; i < table->size; i  )
      {
      key_value *p = table->array[i]->next;
      while(has_next(p))
        {
        insert(new_table, p->key, p->value);
        p = p->next;
        }
      }
    *table_ptr = new_table;
    free(table); /* you're replacing the table, not updating it */
}

There are potentially other problems, like the fact that you seem to always skip the first element in a bucket, and you could use realloc to reduce the number of redundant calloc/free operations.

CodePudding user response:

Since array inside the struct is without a size specification, it's a flexible array member and it needs dynamic memory allocation. Your code never do that so your array has no memory. You can implement it using a flexible array but for this I would go for a pointer to pointer to key_value.

Like:

struct table
{
    int size;
    key_value **array;
};

struct table *create_table(int size)
{
    struct table *table = malloc(sizeof *table);
    assert(table != NULL);
    table->size = size;
    table->array = calloc(size, sizeof *table->array);
    assert(table->array != NULL);
    return table;
}

void resize_table(struct table *table, int new_size)
{
    key_value **tmp = realloc(table->array, new_size * sizeof *table->array);
    if (tmp == NULL)
    {
        // Error handling or just:
        exit(1);
    }
    table->size = new_size;
    table->array = tmp;
}

If you prefer a solution using a flexible array member, it would something like:

struct table
{
    int size;
    key_value *array[];
};

struct table *create_table(int size)
{
    struct table *table = calloc(1, sizeof *table   size * sizeof table->array[0]);
    assert(table != NULL);
    table->size = size;
    return table;
}

struct table * resize_table(struct table *table, int new_size)
{
    struct table *tmp = realloc(table, sizeof *table   new_size * sizeof table->array[0]);
    if (tmp == NULL)
    {
        // Error handling or just:
        exit(1);
    }
    tmp->size = new_size;
    return tmp;
}

and use the resize_table like:

table = resize_table(table, new_size);
  • Related