proper way of comparing two strings other than strcmp

Time:09-16

I read that the proper way of comparing two strings is to use the strcmp() function.

However, I am not sure why the first comparison below evaluates to true (i.e. the statements inside the first "if" are executed and printed).

#include <stdio.h>
#include <string.h>

int main(void)
{
    printf("Hello World\n");

    char *x;
    char *w = "oliver bier";
    char *w2 = "oliver bier";
    x = w;

    if (x == "oliver bier") { // what is being compared here? why does this evaluate to true?
        printf("Is this comparing strings as well????\n");
        printf("Or is this comparing the type of the variable x and the type of oliver bier, i.e. a string??\n");
    }

    if (strcmp(x, w2) == 0) { // I read that this is the correct way of comparing strings
        printf("I think this is the correct way to compare two strings\n");
    }

    if (x == w) { // what is being compared here?
        printf("This is expected!! since x and w store the same address\n");
    }

    return 0;
}


So basically, why does if(x == "oliver bier") evaluate to true as well? I thought x is a pointer to a character, and "oliver bier" is a string.

CodePudding user response:

Literal strings in C are really arrays of characters, whose lifetime lasts for the entire run of the program.

When you get a pointer to a literal string, you get a pointer to its first character (it's normal array-to-pointer decay).

Also, the C specification allows compilers to reuse literals. So for example all instances of "oliver bier" can (and most likely will) be the exact same array, and the pointers to its first character will then of course also be the same.

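That's the reason the comparison x == "oliver bier" can evaluate to true: when the compiler reuses the literal, both sides are pointers to the first character of the same array.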

But if you change it to:

char x[] = "oliver bier";

Then the comparison will always be false: x is now a separate array initialized with a copy of the characters, so a pointer to its first element and the pointer to the literal array "oliver bier" will be different.
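
As a minimal sketch of that difference (the function names here are hypothetical, chosen only for illustration), the following compares an array initialized from a literal against a pointer to the same literal text, once by address and once by content. The literal is stored in a pointer variable first, which is an assumption made to avoid the compiler warning that a direct comparison against a string literal usually triggers.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: compares ADDRESSES.
   x is its own array object, so its address can never equal
   the address of the literal that lit points to. */
bool array_shares_address_with_literal(void)
{
    const char *lit = "oliver bier";  /* pointer to the literal's first character */
    char x[] = "oliver bier";         /* separate array holding a copy of the characters */

    return x == lit;                  /* x decays to a pointer; addresses are compared */
}

/* Hypothetical helper: compares CONTENTS of the same two objects. */
bool array_equals_literal_by_content(void)
{
    const char *lit = "oliver bier";
    char x[] = "oliver bier";

    return strcmp(x, lit) == 0;       /* character-by-character comparison */
}
```

Calling the first function should yield false and the second true, since the array and the literal are distinct objects that happen to hold the same characters.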

CodePudding user response:

Most of the time, calling a function like strcmp is the proper and the only way of reliably comparing two strings. Most of the time, checking pointer equality is not a reliable way.

The problem is that two different pointers can point to two different memory regions that contain separate copies of the same string. You can have this:

     ---          --- --- --- --- --- --- --- --- --- --- --- --- 
p1: | *--------> | o | l | i | v | e | r |   | b | i | e | r |\0 |
     ---          --- --- --- --- --- --- --- --- --- --- --- --- 

     ---          --- --- --- --- --- --- --- --- --- --- --- --- 
p2: | *--------> | o | l | i | v | e | r |   | b | i | e | r |\0 |
     ---          --- --- --- --- --- --- --- --- --- --- --- --- 

Or you can have this:

     ---          --- --- --- --- --- --- --- --- --- --- --- --- 
p3: | *--------> | o | l | i | v | e | r |   | b | i | e | r |\0 |
     ---          --- --- --- --- --- --- --- --- --- --- --- --- 
                   ^
     ---           |
p4: | *------------'
     --- 

p1 and p2 point to different strings, so the pointers will compare unequal. p3 and p4 point to the same string, so the pointers will compare equal.

If the pointers compare equal, obviously strcmp will say the strings are equal, too. But if the pointers are different, the strings might be the same (as in p1 and p2), or they might be different.
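
The two pictures above can be reproduced with a small sketch. The names first_copy, second_copy, struct comparison and compare are assumptions made for this example; two separate arrays are used instead of literals so that the addresses are guaranteed to differ.

```c
#include <stdbool.h>
#include <string.h>

/* Two distinct arrays that contain the same characters. */
char first_copy[]  = "oliver bier";
char second_copy[] = "oliver bier";

/* Result of comparing two strings both ways. */
struct comparison {
    bool same_pointer;   /* do both pointers hold the same address? */
    bool same_contents;  /* do the pointed-to characters match? */
};

/* Hypothetical helper: reports pointer equality and content equality. */
struct comparison compare(const char *a, const char *b)
{
    struct comparison result;
    result.same_pointer  = (a == b);
    result.same_contents = (strcmp(a, b) == 0);
    return result;
}
```

With p1 = first_copy and p2 = second_copy, compare(p1, p2) should report different pointers but equal contents. With p3 = p4 = first_copy, both fields should be true.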

Sometimes people write things like

if(str1 == str2 || strcmp(str1, str2) == 0)

This checks to see whether the two strings str1 and str2 are the same. If the pointers are equal, then the strings are the same, and only if the pointers are not equal does the code perform the (more expensive) call to strcmp to check the actual, pointed-to characters.
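
One way to package that idiom is a small helper like the sketch below. The name str_equal is hypothetical, and the handling of NULL pointers is an added assumption that is not part of the one-line idiom above (passing NULL to strcmp is undefined behavior, so the helper checks for it first).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if a and b refer to equal strings. */
bool str_equal(const char *a, const char *b)
{
    if (a == b)                  /* same address (or both NULL): trivially equal */
        return true;
    if (a == NULL || b == NULL)  /* assumption: exactly one NULL means "not equal" */
        return false;
    return strcmp(a, b) == 0;    /* different addresses: compare the characters */
}
```

The pointer check is only a shortcut. The result is the same as calling strcmp alone for non-NULL arguments; the helper just skips the character scan when both pointers already refer to the same string.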

When you have two string literals in your program that happen to be the same, like

char *w = "oliver bier";
char *w2 = "oliver bier";

or

char *w = "oliver bier";
...
if(w == "oliver bier") { ... }

you can't predict, in general, whether their pointers will be the same or different. It depends on whether the compiler was clever enough to have one in-memory copy of the string do double duty for its use in multiple places. Once upon a time this "cleverness" was quite rare, although I gather that today it's pretty common.
