strlen() is returning a value that includes the null terminator

Time:05-01

I'm using an Ubuntu machine, compiling with Clang.

I'm reading a simple file, storing it in a buffer, then getting its length. I expected to get 5 but got 6.

strlen() isn't supposed to include the null terminator. Is this perhaps because I performed a cast on the buffer?

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main() {

    unsigned char buffer[30];
    memset(buffer, '\0', 30);

    int fd_read = open("test.txt", O_RDONLY);
    read(fd_read, buffer, 29);

    ssize_t length = strlen((const char *)buffer);
    printf("%zu\n", length);
}

Contents of test.txt:

Hello

Output:

6

CodePudding user response:

strlen() isn't supposed to include the null terminator.

That is true.

Is this perhaps because I performed a cast on the buffer?

The cast is not what is causing the problem. It is only needed because buffer is declared as unsigned char; with a plain char buffer you could drop it.

I'm reading a simple file, storing it in a buffer, then getting its length. I expected to get 5 but got 6.

The likely scenario, as pointed out by Chris Dodd, is that there is a newline character at the end of the string you read, and strlen counts it like any other character. To remove it, if it is indeed there:

buffer[strcspn((const char *)buffer, "\n")] = '\0';
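
As a minimal sketch of why that one-liner is safe, assuming the buffer is already null-terminated (strip_newline is a hypothetical helper name, not something from the question):

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: cut the string at the first '\n', if any.
 * strcspn returns the index of the first character that appears in
 * the reject set, or the string length when none is found, so the
 * write below always lands inside the string (at worst on the
 * existing terminator). Returns the new length. */
static size_t strip_newline(char *s) {
    assert(s != NULL);
    s[strcspn(s, "\n")] = '\0';
    return strlen(s);
}
```

With char line[] = "Hello\n"; strlen(line) is 6 before the call, and strip_newline(line) returns 5. A string without a newline is left unchanged.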

Other considerations about your code:

  • You should verify the return value of open to confirm that the file was successfully opened.

  • memset(buffer, '\0', 30); is unnecessary; you can null-terminate buffer yourself after reading:

    ssize_t nbytes = read(fd_read, buffer, sizeof buffer - 1);
    
    if(nbytes >= 0)  
        buffer[nbytes] = '\0';
    

    Or you can initialize the array with 0s:

    unsigned char buffer[30] = {'\0'}; // or 0
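
Putting the considerations above together (checking open, null-terminating from the return value of read), a minimal sketch could look like this. The helper name read_text and its signature are hypothetical, chosen for illustration, and it assumes a single read call is enough for a small file:

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to size - 1 bytes from path into buf
 * and null-terminate the result. Returns the number of bytes read,
 * or -1 if open or read fails (buf is then an empty string). */
static ssize_t read_text(const char *path, char *buf, size_t size) {
    assert(buf != NULL && size > 0);
    buf[0] = '\0';                  /* empty string if anything fails */

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;                  /* file could not be opened */

    ssize_t nbytes = read(fd, buf, size - 1);
    close(fd);
    if (nbytes < 0)
        return -1;                  /* read error */

    buf[nbytes] = '\0';             /* terminate right after the data */
    return nbytes;
}
```

For a test.txt containing Hello and a newline, read_text("test.txt", buffer, sizeof buffer) would return 6, which matches what strlen reports afterwards.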
    

CodePudding user response:

Your program is somewhat convoluted, using modified types for no reason. Yet the problem comes from neither these typing issues nor the use of casts; it is much more likely that the file contains 6 bytes instead of 5, namely the letters Hello and a newline(*).

Here is a modified version:

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main() {
    char buffer[30] = { 0 };

    int fd_read = open("test.txt", O_RDONLY);
    if (fd_read >= 0) {
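        ssize_t count = read(fd_read, buffer, sizeof(buffer) - 1);
        size_t length = strlen(buffer);
        printf("count=%zd, length=%zu\n", count, length);
        printf("contents: {");
        /* dump each byte in hex so a trailing 0A (newline) is visible;
           the signed index keeps the loop from running if read returned -1 */
        for (ssize_t i = 0; i < count; i++) {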
            printf("%3.2X", (unsigned char)buffer[i]);
        }
        printf(" }\n");
        close(fd_read);
    }
    return 0;
}

(*) Or possibly, on legacy platforms, Hello and a CR/LF end-of-line sequence (7 bytes), which is translated to a single '\n' byte by the read library function. On such platforms read is a wrapper around system calls that performs complex postprocessing.
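
To confirm the byte count suggested above without relying on strlen at all, you can ask the file system for the file's size. This is only a sketch under POSIX assumptions; file_size is a hypothetical helper name:

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return the size in bytes of the file at path
 * as reported by fstat, or -1 if it cannot be opened or examined. */
static off_t file_size(const char *path) {
    struct stat st;
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;                  /* file could not be opened */

    int rc = fstat(fd, &st);
    close(fd);
    return rc == 0 ? st.st_size : -1;
}
```

If file_size("test.txt") reports 6 for a file that visibly contains only Hello, the extra byte is the trailing newline most editors add.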
