Inconsistency in size of array cells

Time:01-11

I'm doing a school project implementing some sorting algorithms in C, and I'm working on a Binary Insertion Sort for arrays of a generic data type (so I'm using void* items and void** arrays).

I have a binarySearch function that returns the index at which I would have to insert an item into the array to preserve the ordering of its elements (according to a comparison function that I pass to the sorting function), and this works correctly.

int binary_search(void **arr, void *item, long start, long end, int data_size, compFunc compare)
{
  int s = start, e = end;
  while (s <= e)
  {
    long middle = s + (e - s) / 2;
    int comparison = compare(item, arr[middle]);
    if (comparison == 0)
      return middle;
    else if (comparison > 0)
      s = middle + 1;
    else
      e = middle - 1;
  }
  return s;
}

Then I have the binaryInsertSort function:

void binary_insert_sort(void **arr, long arr_size, int data_size, compFunc compare)
{
  long explored, j, pos;
  void *current = malloc(sizeof(void*)), *holder;
  if(!current){
    perror("Error allocating memory\n");
    exit(EXIT_FAILURE);
  }
  for (explored = 1; explored < arr_size; explored++)
  {
    memcpy(current, arr[explored], data_size);
    j = explored - 1;
    pos = binary_search(arr, current, 0, j, data_size, compare);
    while (j >= pos)
    {
      holder = malloc(sizeof(void *));
      memcpy(holder, (*arr) + (j + 1)*data_size, data_size);
//CRASHES HERE
      memcpy((*arr) + (j + 1) * data_size, (*arr) + (j * data_size), data_size);
      memcpy((*arr) + (j * data_size), holder, data_size);
      j--;
      free(holder);
    }
  }
  free(current);
}

I call these functions like this

int main(int argc, char const *argv[])
{
    char* arr[] = {"a", "b", "f", "d", "c", "g", "e", "1"};
    int n = sizeof(arr)/sizeof(char*);

    binary_insert_sort((void**)arr, n, sizeof(char*), string_compare);
    print_string_array(arr, n);
    return 0;
}

But it always crashes after the first memcpy, when I try to move arr[j+1] into arr[j]. I tried some debugging and, after printing (int)(arr[1]-arr[0]) (which should print the size of a cell, if I understand correctly), I noticed that cells have size 2 rather than the expected sizeof(char*) = 4, so there are problems accessing them correctly with *arr + j*data_size, since I'm moving j * 2 cells rather than j.

why does this happen?

I apologize if I'm missing something basic, or if my English or formatting isn't right; this is my first time asking.

CodePudding user response:

Your approach is flawed from the start because of how the functions are declared and called, for example:

binary_insert_sort((void**)arr, n, sizeof(char*), string_compare);
                   ^^^^^^^^^^^  

That is, if within the function you dereference the pointer, for example

arr[explored]

then the expression will have the type void *. So if the original array has for example the type

char arr[] = "hello";

then the expression above will use incorrect pointer arithmetic. That is, instead of the address being computed as arr + explored * sizeof( char ), it will be computed as arr + explored * sizeof( void * ).
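To make the difference in stride concrete, here is a minimal sketch. The two helper names are hypothetical (they are not part of the question's code); each one measures how many bytes `p + 1` advances for a given pointer type.

```c
#include <stddef.h>

/* Hypothetical helper: bytes covered by "p + 1" when p is a void **.
   The step equals sizeof(void *), whatever the real element type is. */
static size_t step_through_void_pp(void)
{
    void *cells[2] = { NULL, NULL };
    void **p = cells;
    return (size_t)((char *)(p + 1) - (char *)p);
}

/* Hypothetical helper: bytes covered by "p + 1" when p is a char *.
   The step is sizeof(char), which is always 1. */
static size_t step_through_char_p(void)
{
    char text[] = "hello";
    char *p = text;
    return (size_t)((p + 1) - p);
}
```

So indexing a char array through a void ** skips sizeof(void *) bytes per index instead of one.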

Also the function is inefficient. There are too many memory allocations in the while loop

while (j >= pos)
{
  holder = malloc(sizeof(void *));
  //...

The function should be declared at least like

void binary_insert_sort( void *arr, size_t arr_size, size_t data_size, compFunc compare);

The function binary_search has a redundant parameter start. You are calling the function always passing 0 as its argument for the parameter start

pos = binary_search(arr, current, 0, j, data_size, compare);

Instead of the parameters start and end it is enough to pass the number of elements in the sub-array. So the function could be declared like

int binary_search( const void *arr, const void *item, size_t arr_size, size_t data_size, compFunc compare);

Or similarly to the standard C function bsearch like

int binary_search( const void *item, const void *arr, size_t arr_size, size_t data_size, compFunc compare);

If the sub-array already contains an element equal to the searched element, then the function should return the position after the existing element in the sub-array instead of the position of the existing element.
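A minimal sketch of such a search, assuming the bsearch-like parameter order shown above. The name upper_bound, the size_t return type, and the compFunc typedef are assumptions made for illustration, not something the question defines this way.

```c
#include <stddef.h>

/* Assumed comparator type, the same shape qsort() and bsearch() use. */
typedef int (*compFunc)(const void *, const void *);

/* Hypothetical name: returns the index of the first element in
   arr[0..count) that compares greater than item, i.e. the insertion
   point located after any elements equal to item. */
size_t upper_bound(const void *item, const void *arr, size_t count,
                   size_t size, compFunc compare)
{
    const char *base = arr;          /* char * allows byte-wise arithmetic */
    size_t lo = 0, hi = count;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (compare(item, base + mid * size) >= 0)
            lo = mid + 1;            /* item belongs to the right of mid */
        else
            hi = mid;                /* item belongs at mid or to its left */
    }
    return lo;
}

/* Example comparator for int elements. */
static int compare_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}
```

Inserting after equal elements is what keeps the resulting insertion sort stable.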

CodePudding user response:

I'm doing a school project implementing some sorting algorithms in C, and I'm working on a Binary Insertion Sort for arrays of a generic data type (so I'm using void* items and void** arrays).

You clarified in comments that what you mean is that you want a function that can sort arrays having any element type. This is exactly what the standard library's qsort() function does, so you should look to it for guidance on how such a function might look and work.

In particular, you need to understand that C has no generic data type. A void * can point to an object of any type, but void * itself is a specific, complete type, not generic in any way.* Thus, using void * items does not serve your purpose at all. That would be true even if you wanted to sort only arrays of pointers, because the C language does not guarantee that different pointer types have the same representation as each other, or even the same size, except that char * and void * are required to have the same size and representation.

In other words, no, you don't have void * items, and you don't want to sort a "void ** array". And therefore no, your binarySearch() function for an array of void * does not serve your purposes -- however well it does its job, it's the wrong job.

Following qsort(), here's a signature that would serve your purpose:

typedef int (*compFunc)(const void *, const void *);

void binary_insert_sort(void *arr, size_t element_count, size_t element_size,
        compFunc compare);

Note that the array to sort is conveyed via a pointer to its first element, received by the function as a pointer of type void * -- not void **. This does present an issue, however: you cannot perform pointer arithmetic or array indexing on a void *, because these operations are defined in terms of the size of the pointed-to type, which is unknown in this case because void is an incomplete type. But it should not be a particular surprise that such an issue arises, because the whole point of the exercise is to sort objects whose size is not known when the function is compiled.

So what do you do? The traditional approach would be via converting to char *:

#define ELEMENT_POINTER(base, index, size) ((char *) (base) + (index) * (size))

void *element_3_for_example = ELEMENT_POINTER(arr, 3, element_size);

So a comparison would then look like this ...

int result = compare(ELEMENT_POINTER(arr, i, element_size),
                     ELEMENT_POINTER(arr, j, element_size));

... and a swap might look like this:

void *temp = malloc(element_size);

// ...

memcpy(temp, ELEMENT_POINTER(arr, i, element_size), element_size);
memcpy(ELEMENT_POINTER(arr, i, element_size), ELEMENT_POINTER(arr, j, element_size), element_size);
memcpy(ELEMENT_POINTER(arr, j, element_size), temp, element_size);

// ...

free(temp);

I'll leave it to you to work the actual exercise in terms of those or similar constructs.
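For reference, one possible way the pieces could fit together is sketched below. It is only a sketch under stated assumptions: it shifts the tail with a single memmove() rather than the pairwise swaps shown above, it keeps the question's perror()/exit() handling for allocation failure, and the body of string_compare is assumed, since the question does not show it.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef int (*compFunc)(const void *, const void *);

#define ELEMENT_POINTER(base, index, size) ((char *)(base) + (index) * (size))

void binary_insert_sort(void *arr, size_t element_count, size_t element_size,
                        compFunc compare)
{
    if (element_count < 2 || element_size == 0)
        return;

    /* One scratch buffer, sized for one element, reused for the whole sort. */
    void *current = malloc(element_size);
    if (!current) {
        perror("Error allocating memory");
        exit(EXIT_FAILURE);
    }

    for (size_t i = 1; i < element_count; i++) {
        memcpy(current, ELEMENT_POINTER(arr, i, element_size), element_size);

        /* Binary search in the sorted prefix [0, i) for the insertion
           point after any equal elements (keeps the sort stable). */
        size_t lo = 0, hi = i;
        while (lo < hi) {
            size_t mid = lo + (hi - lo) / 2;
            if (compare(current, ELEMENT_POINTER(arr, mid, element_size)) >= 0)
                lo = mid + 1;
            else
                hi = mid;
        }

        /* Shift elements [lo, i) one slot to the right, then insert. */
        memmove(ELEMENT_POINTER(arr, lo + 1, element_size),
                ELEMENT_POINTER(arr, lo, element_size),
                (i - lo) * element_size);
        memcpy(ELEMENT_POINTER(arr, lo, element_size), current, element_size);
    }
    free(current);
}

/* Assumed comparator for an array of char *: each argument points to an
   element of the array, i.e. to a char *, not to the characters. */
static int string_compare(const void *a, const void *b)
{
    char *const *sa = a;
    char *const *sb = b;
    return strcmp(*sa, *sb);
}
```

With this signature the call from the question becomes binary_insert_sort(arr, n, sizeof(char *), string_compare), with no cast needed. Only the pointers stored in the array are moved, so the string literals themselves are never written to.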


As for the actual question ...

why does this happen?

, it's because your function is confused about whether the elements of the array are themselves the items being sorted or whether they are pointers to the elements being sorted. Erroneously assuming the latter, it makes the further questionable choice of swapping the data instead of the pointers. In this particular case, the data happen to be pointers after all, though that would not always be the case. The data they point to are arrays containing string literals, and

  1. These arrays are not the expected size, so you have bounds overruns on both reading and writing, and
  2. They are not writable anyway, which is probably the specific source of the error.
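The size mismatch in point 1 can be checked directly. In this small sketch (the helper names are hypothetical), the array behind the literal "a" holds the character plus the terminating '\0', which is independent of how large a char * cell is.

```c
#include <stddef.h>

/* Hypothetical helper: size of the array behind the string literal "a",
   i.e. one character plus the terminating null character. */
static size_t literal_array_size(void)
{
    return sizeof "a";
}

/* Hypothetical helper: size of one cell of a char *arr[] array. */
static size_t cell_size(void)
{
    return sizeof(char *);
}
```

The value 2 observed for arr[1]-arr[0] is consistent with a compiler that happens to place the literals "a" and "b" next to each other in memory, but that layout is not guaranteed, and subtracting pointers into two different arrays is formally undefined. Either way, the result measures the distance between two strings, not the size of an array cell.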

* One could consider void to be a generic data type, as indeed this answer could be taken to demonstrate. But you cannot declare an object to have type void, nor access an object via an lvalue of type void, so this is largely moot.
