What's the length of a string in C when I use the "\x00" to interrupt a string?

Time:09-28

char buf1[1024] = "771675175\x00AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA";
char buf2[1024] = "771675175\x00";
char buf3[1024] = "771675175\0AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA";
char buf4[1024] = "771675175\0";
char buf5[1024] = "771675175";
buf5[9] = 0;
char buf6[1024] = "771675175";
buf6[9] = 0;
buf6[10] = 'A';

printf("%d\n", strlen(buf1));
printf("%d\n", strlen(buf2));
printf("%d\n", strlen(buf3));
printf("%d\n", strlen(buf4));
printf("%d\n", strlen(buf5));
printf("%d\n", strlen(buf6));

if("\0" == "\x00"){
    printf("YES!");
}

Output:

10
9
9
9
9
9
YES!

As shown above, I use "\x00" to terminate a string early. As far as I know, when strlen() encounters "\x00", it returns the number of characters before the terminator, not counting the "\x00" itself. So why is the length of buf1 equal to 10?

CodePudding user response:

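As pointed out in the comments section, hexadecimal escape sequences have no length limit and end at the first character that is not a valid hexadecimal digit. All of the subsequent A characters are valid hexadecimal digits, so they are part of the escape sequence. The value of that escape sequence therefore does not fit in a char. The C standard requires the value of a hexadecimal escape sequence to be representable in the corresponding character type, so this is a constraint violation: the compiler must issue a diagnostic (GCC, for example, warns "hex escape sequence out of range"), and the standard does not define the resulting value. A compiler that accepts the code anyway will likely produce a single non-zero character, which is consistent with the observed length of 10.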

You should change

char buf1[1024] = "771675175\x00AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA";

to:

char buf1[1024] = "771675175\x00" "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA";

Also, strlen returns a value of type size_t. The correct printf format specifier for size_t is %zu, not %d. Even if %d works on your platform, it may fail on other platforms.

The following program will print the desired result of 9:

#include <stdio.h>
#include <string.h>

int main( void )
{
    char buf1[1024] = "771675175\x00" "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA";

    printf( "%zu\n", strlen(buf1) );
}

Also, it is worth noting that the following line does not make sense:

if("\0" == "\x00")

In that if condition, you are comparing two pointers, each of which points to the first character of a string literal. The result depends on whether the compiler stores both string literals at the same memory location. Some compilers merge identical string literals into one memory location, some do not. Normally, this is irrelevant to the programmer, so it does not make much sense to compare these addresses.

You probably wanted to write the following instead, which will compare the actual character values:

if( '\0' == '\x00' )

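There is a big difference between a string literal and a character constant: a string literal is an array of char that is converted to a pointer in most expressions, whereas a character constant is a single value of type int.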
