How to properly compare data of any type in C with pointers and type conversion to char

Time:06-15

I know that C is not a language that easily allows developers to write generic, type-agnostic functions. However, as a learning exercise I am trying to write a function that accepts an array of any data type (passed as void *) and a scalar of any data type (also passed as void *). The function walks through the array with a pointer, compares the scalar to each element, and reports whether they match. To do this, I cast the scalar and the array to char * and use strcmp from the string.h library. The function looks like the example below.

void find_vector_indices(void *vec, void *value, size_t num_bytes, size_t arr_size) {
    char compare;
    char *val = (char *) value;
    char *dst;
    for (int i = 0; i < arr_size; i++) {
        dst = (char *) vec + (i * num_bytes);
        compare = strcmp(dst, val);
        if (compare == 0) printf("Strings Match\n");
    }
}

In the main function if I have the following code using integers, the above function works just fine and prints Strings Match twice, just as I would expect.

    int a[3] = {3, 2, 3};
    int b = 3;
    find_vector_indices(a, &b, sizeof(int), 3);

However, if I modify the main function to be a float array and a float scalar like so, the function no longer works.

    float a[3] = {3.1, 2.2, 3.1};
    float b = 3.1;
    find_vector_indices(a, &b, sizeof(float), 3);

I assume this has something to do with how float and integer values are represented in memory. However, as far as I know, both occupy 4 bytes and should be representable as character strings, which I believe the pointer arithmetic in this function addresses correctly. The solution is probably very simple, but it is escaping me. Any help would be appreciated.

CodePudding user response:

strcmp() is for comparing strings (NUL-terminated sequence of characters), not arbitrary binary data. Use memcmp() instead.

Instead of

compare = strcmp(dst, val);

You should use

compare = memcmp(dst, val, num_bytes);

On a little-endian machine, the 2nd, 3rd, and 4th bytes of a small int such as 2 or 3 are zero, so they act as string terminators and your code appears to work.

On the other hand, the float value 3.1 is represented as 0x40466666 (according to the IEEE-754 Floating Point Converter). This contains no 0x00 bytes, so strcmp() keeps reading past the end of the element and is much less likely to appear to work.
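The point about zero bytes can be checked directly. The following is a minimal sketch, not part of the original answer; print_bytes() and has_zero_byte() are hypothetical helper names, and the result for the float assumes IEEE-754 single precision.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: prints the n bytes stored at obj in hexadecimal,
   in memory order. */
void print_bytes(const void *obj, size_t n) {
    const unsigned char *p = obj;
    for (size_t i = 0; i < n; i++) {
        printf("%02x ", p[i]);
    }
    printf("\n");
}

/* Hypothetical helper: returns 1 if any of the n bytes at obj is 0x00,
   i.e. if strcmp() would find a terminator inside the object. */
int has_zero_byte(const void *obj, size_t n) {
    return memchr(obj, 0, n) != NULL;
}
```

On a typical little-endian machine with a 4-byte int, print_bytes() would be expected to show 03 00 00 00 for the int 3 and 66 66 46 40 for the float 3.1f, so has_zero_byte() reports a terminator only for the int.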


Also, the type of compare should be int, not char. strcmp() and memcmp() return int, and the range of the result is not specified (only that it is positive, negative, or zero, depending on the result of the comparison). Truncating it to char could therefore turn a nonzero result into zero, so a mismatch might be reported as a match.
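Putting both suggestions together, a corrected version might look like the sketch below. It is an illustration rather than the original poster's final code: count_matching_elements() is a hypothetical name, and returning the match count is an added convenience so the result can be checked.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical variant of find_vector_indices(): compares each element to
   *value byte by byte with memcmp() and returns the number of matches. */
size_t count_matching_elements(const void *vec, const void *value,
                               size_t num_bytes, size_t arr_size) {
    const unsigned char *base = vec;
    size_t matches = 0;
    for (size_t i = 0; i < arr_size; i++) {
        /* int, not char: the result of memcmp() must not be truncated. */
        int compare = memcmp(base + i * num_bytes, value, num_bytes);
        if (compare == 0) {
            printf("Element %zu matches\n", i);
            matches++;
        }
    }
    return matches;
}
```

Called with the int array {3, 2, 3} and the float array {3.1, 2.2, 3.1} from the question, this version should find two matches in each case.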

CodePudding user response:

In addition to @MikeCAT's good answer:

Use consistent types

// for (int i = 0; i < arr_size; i++) {
for (size_t i = 0; i < arr_size; i++) {

"see if they match"?

Even with the fix suggested by @MikeCAT, 0.0 == -0.0 is true, yet find_vector_indices() would report that they differ: they have the same value but different bit patterns. This can also happen with other floating-point encodings and with pointers.

float a[1] = {-0.0};
float b = 0.0;
find_vector_indices(a, &b, sizeof b, 1);
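The difference between comparing by value and comparing by bytes can be shown in isolation. This is a small sketch with hypothetical helper names, assuming an IEEE-754 float where 0.0 and -0.0 differ only in the sign bit.

```c
#include <string.h>

/* Hypothetical helper: returns 1 when the two floats compare equal with ==. */
int equal_by_value(float x, float y) {
    return x == y;
}

/* Hypothetical helper: returns 1 when the two floats have identical bytes,
   which is what a memcmp()-based search tests. */
int equal_by_bytes(float x, float y) {
    return memcmp(&x, &y, sizeof x) == 0;
}
```

For the pair -0.0f and 0.0f, equal_by_value() should return 1 while equal_by_bytes() returns 0, which is why the call above finds no match.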

See also the struct concerns raised by @John Bollinger.

Alternative

Rather than use memcmp(), strcmp(), etc., pass in a compare function int cmp(const void *, const void *) and return the address of the first element for which it returns zero.

// Illustrative untested code.
void *find_vector_match(const void *vec, const void *value, size_t element_size,
    size_t arr_n, int (cmp)(const void *, const void *)) {
 
  while (arr_n > 0) {
    arr_n--;
    if (cmp(vec, value) == 0) {
      return (void *) vec;
    }
    vec = (const unsigned char*)vec + element_size;
  }
  return NULL;
}
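As a possible way to use this, the sketch below pairs the function with a float comparator. cmp_float() is a hypothetical name introduced here, and find_vector_match() is restated with an explicit function-pointer parameter and a byte pointer so that the block compiles on its own.

```c
#include <stddef.h>

/* Restated from the answer above: returns the address of the first element
   for which cmp() returns 0, or NULL if there is none. */
void *find_vector_match(const void *vec, const void *value, size_t element_size,
                        size_t arr_n, int (*cmp)(const void *, const void *)) {
    const unsigned char *p = vec;
    for (size_t i = 0; i < arr_n; i++, p += element_size) {
        if (cmp(p, value) == 0) {
            return (void *) p;
        }
    }
    return NULL;
}

/* Hypothetical comparator: compares two floats by value, qsort()-style,
   so that 0.0 and -0.0 are treated as equal. */
int cmp_float(const void *a, const void *b) {
    float x = *(const float *) a;
    float y = *(const float *) b;
    return (x > y) - (x < y);
}
```

Because the comparison is done by value, searching an array containing -0.0f for 0.0f should return the address of that element, unlike the byte-wise version.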