Is my understanding of EOF (end of file) in C correct?

Time:08-31

Let's say we have this piece of code:

while ((ch = getc(fp)) != EOF) {
  ...
}

This is how I assume the loop works. Please tell me whether it is correct or whether I have made any mistakes:

  1. getc reads a character from the I/O stream fp
  2. getc stores it in the variable ch
  3. ch is tested against EOF which is normally equivalent to -1 (depending on how it is defined for you).
  4. When EOF is returned it indicates that the End Of File indicator for a specific stream has been set. This means that a previous read operation has attempted to read past the end of the file.

One important question I have:

In this loop, as getc gets to the end of the file, does it eventually try to read past the end of the file, which causes it to return EOF?

CodePudding user response:

Once the file stream reaches EOF, the status is supposed to be sticky — all input operations on the file stream will simply report EOF each time they're called. You can call clearerr() to reset the error/EOF status, in which case the input functions will try to read more data until they encounter EOF again.
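As a rough illustration of the sticky status and of clearerr(), here is a minimal sketch. The helper names extra_reads_returning_eof and clearerr_resets_eof are made up for this example and are not part of any library. On a regular file that has not grown, the extra reads return EOF whether or not the status is sticky, so this only shows the general pattern.

```c
#include <stdio.h>

/* Hypothetical helper: consumes everything in fp, then calls getc()
   `extra` more times without clearing any indicator. Returns how many
   of those extra calls returned EOF. */
int extra_reads_returning_eof(FILE *fp, int extra)
{
    int ch;          /* int, so that EOF is representable */
    int count = 0;

    while ((ch = getc(fp)) != EOF) {
        /* discard the data; only the end condition matters here */
    }
    for (int i = 0; i < extra; i++) {
        if (getc(fp) == EOF)
            count++;
    }
    return count;
}

/* Hypothetical helper: returns 1 if the end-of-file indicator was set
   on fp and clearerr() reset it, 0 otherwise. */
int clearerr_resets_eof(FILE *fp)
{
    int was_set = feof(fp) != 0;

    clearerr(fp);
    return was_set && feof(fp) == 0;
}
```

After clearerr(), the next getc() call tries the underlying file again. If nothing new has been written, it returns EOF once more and the indicator is set again.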

There was a period when the GNU C Library didn't make the status sticky if the input device was a terminal. That is no longer the case these days.

CodePudding user response:

  1. getc reads a character from the I/O stream fp

More precisely, getc attempts to read a character from the fp stream.

  2. getc stores it in the variable ch

getc returns an int value, and the assignment ch = getc(fp) stores that value in ch, if it is representable in the ch type. If it is not representable, it is converted in a manner that depends on the ch type. To avoid complications with this conversion, ch should be int.

  3. ch is tested against EOF

Yes, the resulting value of ch is compared with EOF. This makes the type important: If getc returns EOF and ch is, say, a character type that cannot represent the value of EOF, then the conversion will cause a different value to be stored in ch, and then it will not compare equal to EOF.
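To make the type issue concrete, here is a small sketch. The function names count_chars and eof_survives_unsigned_char are invented for this example, and it assumes a typical implementation where int is wider than char. The first function uses an int for ch, so the loop ends only on a real EOF return. The second shows what happens to EOF when it is squeezed into an unsigned char.

```c
#include <stdio.h>

/* Counts the characters remaining in fp. ch is an int, so every value
   getc() can return, including EOF, is stored unchanged. */
long count_chars(FILE *fp)
{
    int ch;
    long n = 0;

    while ((ch = getc(fp)) != EOF)
        n++;
    return n;
}

/* Returns 1 if EOF still compares equal to EOF after being stored in
   an unsigned char, 0 otherwise. */
int eof_survives_unsigned_char(void)
{
    /* The negative value is reduced modulo UCHAR_MAX + 1. */
    unsigned char narrow = (unsigned char)EOF;
    /* Widening back to int gives a non-negative value. */
    int widened = narrow;

    return widened == EOF;
}
```

With unsigned char the comparison can never succeed, so a loop written that way would not terminate at end of file. With a plain char that happens to be signed, the opposite problem can appear: a legitimate byte such as 0xFF may convert to the same value as EOF and end the loop early.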

… which is normally equivalent to -1 (depending on how it is defined for you).

EOF is −1 in many C implementations, but the C standard only requires it to be an int constant with a negative value.

  4. When EOF is returned it indicates that the End Of File indicator for a specific stream has been set.

No, an EOF return indicates that either end-of-file has been reached (and a further read was attempted) or an error occurred. C 2018 7.21.7.5 3 says:

The getc function returns the next character from the input stream pointed to by stream. If the stream is at end-of-file, the end-of-file indicator for the stream is set and getc returns EOF. If a read error occurs, the error indicator for the stream is set and getc returns EOF.

This means that a previous read operation has attempted to read past the end of the file.

It means one of two things: either a previous operation or the current operation attempted to read past the end of the file, or a previous operation or the current operation encountered an error. In both cases the condition has not since been cleared, as by calling rewind or clearerr.
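Since an EOF return alone does not say which of the two happened, code that cares usually checks feof and ferror after the loop. A minimal sketch follows; the enum stop_reason and the function drain are names made up for this example.

```c
#include <stdio.h>

/* Hypothetical result type: why the read loop stopped. */
enum stop_reason { STOP_END_OF_FILE, STOP_READ_ERROR, STOP_UNKNOWN };

/* Reads fp until getc() returns EOF, stores the number of characters
   read in *count (if count is not NULL), and reports which indicator
   is set on the stream afterwards. */
enum stop_reason drain(FILE *fp, long *count)
{
    int ch;
    long n = 0;

    while ((ch = getc(fp)) != EOF)
        n++;
    if (count != NULL)
        *count = n;

    if (ferror(fp))
        return STOP_READ_ERROR;    /* error indicator is set */
    if (feof(fp))
        return STOP_END_OF_FILE;   /* end-of-file indicator is set */
    return STOP_UNKNOWN;           /* not expected after an EOF return */
}
```

Checking ferror first is a choice made for this sketch, so that an error is reported even if the end-of-file indicator was already set by an earlier operation.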

In this loop, as getc gets to the end of the file, does it eventually try to read past the end of the file, which causes it to return EOF?

If the stream is connected to a file of finite and unchanging size, that loop will eventually attempt to read past the end of file, if nothing in the body of the loop resets the file position, at which point EOF will be returned and the loop will exit. However, if the stream is connected to some “infinite” supply of data, such as a pipe, it is possible the loop will never encounter the end of the stream.

CodePudding user response:

No, it never tries to read past the end of the file, because the ch != EOF condition fails when you get to the end and the loop stops. This happens when the previous iteration read the last character of the file, so you're at the end, not past it.

If you did keep trying to read, it would keep returning EOF unless you call clearerr(). This can be useful when processing files that get extended (think about how the tail -f command works -- it reads until EOF, then keeps checking if the file has grown).
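A rough sketch of that idea, not how tail is actually implemented: the function read_available below is a made-up helper that clears the stream status and then reads whatever is currently there. It assumes a system where data appended and flushed by a writer becomes visible to a second stream open on the same file, which is typical on POSIX-like systems but is not guaranteed by the C standard.

```c
#include <stdio.h>

/* Hypothetical helper: clears any earlier EOF/error status on reader,
   then copies characters into buf until getc() returns EOF or buf is
   full. buf is always NUL-terminated when cap > 0. Returns the number
   of characters stored. */
size_t read_available(FILE *reader, char *buf, size_t cap)
{
    size_t n = 0;
    int ch;

    if (cap == 0)
        return 0;

    clearerr(reader);   /* allow a fresh attempt after a previous EOF */
    while (n + 1 < cap && (ch = getc(reader)) != EOF)
        buf[n++] = (char)ch;
    buf[n] = '\0';
    return n;
}
```

A polling loop in the style of tail -f would call a function like this repeatedly, sleeping between calls when it returns 0.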
