How does the compiler add a null-terminator when there is no more space?

Time:10-06

I know this question has been asked a million times, but I am still confused. I intentionally made the array size equal to the number of characters in the string:

#include <stdio.h>

int main(void)
{
    int i = 0;
    char str[5] = "check";
    while((str[i] != '\0') && (i != 10)){ // (i != 10) aborts the loop
        printf("str[i] = %c\n", *(str + i));
        printf("i = %d\n", i);
        i++;
    }
}

The output is:

str[i] = c
i = 0
str[i] = h
i = 1
str[i] = e
i = 2
str[i] = c
i = 3
str[i] = k
i = 4

Why does it still store a null character at the end, even though no space was allocated for it? Does it depend on the compiler, or do all of them work the same way?

CodePudding user response:

The library section of the C standard defines a string as follows:

A string is a contiguous sequence of characters terminated by and including the first null character.

With char str[5] = "check";, str[] is not a string as it lacks a null character. str[] could be called an array of char.


After a few iterations, the condition in while((str[i] != '\0') && (i != 10)){ reads str[5], which is outside str[]. This is undefined behavior (UB), as @Vlad from Moscow noted. Anything may happen.

In OP's case, the UB apparently showed up as str[5] reading as zero, which ended the loop. This may differ on a subsequent run.

Do not rely on this result. It is UB.


Why does it still store a null character at the end, even though no space was allocated for it?

The loop may have ended because a 0 happened to be next in memory, or for other reasons. It is UB.

Does it depend on the compiler,

No behavior is specified for any compiler. Nothing guarantees that the result is consistent, even with the same compiler. It is UB.

... or do all of them work the same way?

There is no specification that they work the same way. There is no specification that they work differently. It is UB.

CodePudding user response:

You ask:

Why does it still store a null character at the end, even though no space was allocated for it?

It doesn't. Please look at this modification of your code:

#include <stdio.h>

int main(void)
{
    int i = 0;
    char a = 'x';
    char str[5] = "check";
    char b = 'x';
    while((str[i] != '\0') && (i != 10)){ // (i != 10) aborts the loop
        printf("str[i] = %c\n", *(str + i));
        printf("i = %d\n", i);
        i++;
    }
}

In my case, it produced:

str[i] = c
i = 0
str[i] = h
i = 1
str[i] = e
i = 2
str[i] = c
i = 3
str[i] = k
i = 4
str[i] = x
i = 5
str[i] = x
i = 6
str[i] = 
i = 7

So it does not add a \0 at the end. The next address might contain a \0, but that would have nothing to do with str. It depends on what the compiler decides to place after the array, and that may depend on a number of things: the size of str is one, the other variables in the same scope are another.

Don't make any assumptions. It's UB, and the problem with this type of mistake is that it might just appear to work.

If it does appear to work, don't draw conclusions from that, as you did when you assumed a \0 is added. It isn't.

CodePudding user response:

Strings in C always end with a special character, \0 (the null terminator), which signals the end of the string.

The thing is, you haven't allocated enough space for the array to hold the terminator, so your loop just keeps going past the array into adjacent memory.

The loop can stop at i = 10 only because of the (i != 10) check in your own condition. Without that check, the loop would keep reading forward until it happened to hit a zero byte or reached memory your program is not allowed to access, at which point the OS is like: OI LAD, where do you think you are going?, and kills your program to protect that memory.
