Why is this C program detecting two '\0' characters in a string when an unconnected line

Time:11-30

I am currently learning C, and I wrote this program to check whether the '\0' char really is at the end of strings, as pointed out in "K and R".

I had the strangest result though.

If I comment the "int lista[] = {0, 1, 2, 3, 4};" statement out of the program (this statement has nothing to do with the other statements of this program; it was part of another test I was going to make), the output comes out as expected, detecting one '\0' char ending the string. However, if I leave the statement uncommented, the program output detects two '\0' chars at the end of the string. Why does this happen?

This is the program with the statement uncommented:

#include <stdio.h>

int main(void)
{
    int lista[] = {0, 1, 2, 3, 4};
    char string[] = "linhas";
    
    for (int i = 0; i <= sizeof(string); i++)
    {
        if (string[i] != '\0')
        {
            printf("%c\n", string[i]);
        }
        else
        {
            printf("this dawmn null char\n");
        }
    }
}

This outputs:

l
i
n
h
a
s
this dawmn null char
this dawmn null char

This is the program with the line commented out:

#include <stdio.h>

int main(void)
{
    /*int lista[] = {0, 1, 2, 3, 4};*/
    char string[] = "linhas";

    for (int i = 0; i <= sizeof(string); i++)
    {
        if (string[i] != '\0')
        {
            printf("%c\n", string[i]);
        }
        else
        {
            printf("this dawmn null char\n");
        }
    }
}

It outputs:

l
i
n
h
a
s
this dawmn null char

CodePudding user response:

Your loop

for (int i = 0; i <= sizeof(string); i++)

is ever-so-slightly wrong. It should be

for (int i = 0; i < sizeof(string); i++)

By using <=, you make one too many trips through the loop, and you access memory outside of the string array. It looks like, with the lista array in place, the extra byte you mistakenly access (outside of the string array) happens to be a 0, so you get an extra, second printout of your "this dawmn null char" message.

But then, when you comment out the lista array, it must be the case that the extra byte you mistakenly access isn't 0, so it gets printed as itself, instead. It might be an invisible control character, which is why you don't see anything. I suggest changing your code to

if (string[i] != '\0')
    printf("string contains %d\n", string[i]);
else
    printf("this damn null char\n");

to see this more clearly.

The important lesson here is that if you have a loop that's supposed to run N times, there are two ways to write it. In C, the vast majority of the time, you want to write it as

for(i = 0; i < N; i++)

That's a "0-based" loop, that runs from 0 to N-1, for a total of N trips. Once in a while, you want a 1-based loop:

for(i = 1; i <= N; i++)

This runs from 1 to N, again for a total of N trips. But if you write

for(i = 0; i <= N; i++)      /* usually WRONG */

your loop runs from 0 to N, for a total of N + 1 trips.

CodePudding user response:

Do not confuse strlen(string), which in your case is 6, with sizeof(string), which is the size of the array including the '\0' byte, so 7 here.

In the case of a string declared as an array with "automatic size", the difference is only one, but if you had char string[256], sizeof(string) would be 256, which is not the same as strlen(string) + 1.

With char *string, sizeof(string) would be the size of the pointer itself, likely 8 or 4.

CodePudding user response:

@SteveSummit has explained everything in detail. Here is a short answer.

Accessing the element string[sizeof(string)] is undefined behavior, so it's "pointless" to discuss what value it should have. I put "pointless" in quotes because it can be useful, for debugging purposes, to understand how undefined behavior manifests itself. But if this code were to go into production, you should NEVER access string[sizeof(string)]. It's always out of bounds and always a bug.
