Put null byte at the end of C array-CodePudding

I have seen this coding pattern multiple times when I am reading the source code of different projects.

z = read(fd, buf, sizeof(buf));
buf[z] = 0;

As far as I know, this is to make sure the c-style string is correctly terminated by a null byte. But is this a good coding style? It looks like there is a potential out-of-bound write that may corrupt other data.

CodePudding user response：

buf must be big enough to hold the data plus the null terminator. The correct way is to read sizeof(buf)-1 (assuming buf is type char[]) bytes to buf.

buf[sizeof(buf)] will access the memory after the last allocated member and is invalid because C arrays are 0 indexed.

The number returned by read is the number of bytes read or -1 if read fails so be sure to check for this possibility.

The number of bytes read corresponds to the index in buf of the first byte not read.

Correct code:

z = read(fd, buf, sizeof(buf)-1);
if (z == -1) {
    // Handle error
} else {
    buf[z] = 0;
}

CodePudding user response：

As far as I know, this is to make sure the c-style string is correctly terminated by a null byte.

Yes, that seems to be the most likely purpose. Of course, that's only useful if one wants to interpret the data as a string. That's not so uncommon, but it is by no means universal.

But is this a good coding style? It looks like there is a potential out-of-bound write that may corrupt other data.

You are correct. If the read() transfers the maximum number of bytes, and supposing that buf is defined as an array, not a pointer^*, then the

buf[z] = 0;

will perform an out-of-bounds access to array buf. If one wants to be able to access the contents of buf as a string after the read, without losing any data, then one must reserve one character for the terminator:

z = read(fd, buf, sizeof(buf) - 1);

Code that does not explicitly reserve at least one character in that way implicitly assumes that it will never see a maximum-length read. That may be a relatively safe assumption in some cases, but it's not one I would want to hang my hat on.

Additionally, if the read() fails then it will return -1. In that case, too, the assignment to buf[z] will be out of bounds. That must be avoided by checking the return value of read() before using it, as one almost always should do with the return values of function calls that may indicate error conditions that way.

^*If buf is a pointer rather than an array, then there is an altogether different problem: sizeof(buf) reports on the size of a pointer, not on the size of the object to which it points.

CodePudding user response：

The read function returns the number of bytes written to buf, if the call is successful. If an error occur the return value is -1.

So there is a possible out of bounds write both after the end of buf and before the start of buf.

One way of handling this is:

z = read(fd, buf, sizeof(buf) - 1);
if(z >= 0) {
    buf[z] = 0;
} else {
    /* handle error */
}