char buf1[1024] = "771675175\x00AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA";
char buf2[1024] = "771675175\x00";
char buf3[1024] = "771675175\0AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA";
char buf4[1024] = "771675175\0";
char buf5[1024] = "771675175";
buf5[9] = 0;
char buf6[1024] = "771675175";
buf6[9] = 0;
buf6[10] = 'A';
printf("%d\n", strlen(buf1));
printf("%d\n", strlen(buf2));
printf("%d\n", strlen(buf3));
printf("%d\n", strlen(buf4));
printf("%d\n", strlen(buf5));
printf("%d\n", strlen(buf6));
if("\0" == "\x00"){
printf("YES!");
}
Output:
10
9
9
9
9
9
YES!
As shown above, I use "\x00" to terminate a string early.
As far as I know, when strlen() meets the "\x00", it returns the number of characters before the terminator, not including the "\x00" itself.
So why is the length of buf1 equal to 10?
CodePudding user response:
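As pointed out in the comments section, hexadecimal escape sequences have no length limit and terminate at the first character that is not a valid hexadecimal digit. All of the subsequent A characters are valid hexadecimal digits, so they are part of the escape sequence. The resulting value does not fit in a char, which the C standard does not permit; compilers typically warn about it and store whatever single character value they choose. Either way, the whole sequence \x00AAAA...A produces one character rather than a terminator followed by A characters. In your case that character is evidently non-zero, so strlen counts the 9 digits plus that one character and returns 10.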
You should change
char buf1[1024] = "771675175\x00AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA";
to:
char buf1[1024] = "771675175\x00" "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA";
Also, strlen returns a value of type size_t. The correct printf format specifier for size_t is %zu, not %d. Even if %d works on your platform, it may fail on other platforms.
The following program will print the desired result of 9:
#include <stdio.h>
#include <string.h>

int main( void )
{
    char buf1[1024] = "771675175\x00" "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA";
    printf( "%zu\n", strlen(buf1) );
}
Also, it is worth noting that the following line does not make sense:
if("\0" == "\x00")
In that if condition, you are comparing the addresses of two string literals, not their contents. Whether the comparison is true depends on whether the compiler stores both literals at the same memory location: some compilers merge identical string literals, some do not. Normally, this is irrelevant to the programmer, so comparing these addresses does not make much sense.
You probably wanted to write the following instead, which will compare the actual character values:
if( '\0' == '\x00' )
There is a big difference between a string literal and a character constant.