Reading newline characters with fgetc works on Windows but fails on macOS

Time:05-05

My code makes a lot of calls to fgetc(inp).

It works without any problem on Windows, but on macOS the program fails.

I found that the problem is caused by the newline sequence having a different number of characters on the two systems: macOS uses just \n, while Windows uses \r\n.

So I wrote a new function to replace the fgetc(inp) calls that read newline characters:

void getwhite() {
    int white = fgetc(inp);
    if (isspace(white) == 0) {
        fseek(inp, -1, SEEK_CUR);
    }
}

But it doesn't work as expected: it still works fine on Windows, and macOS still gives errors.

CodePudding user response:

Instead of fseek(), you should use ungetc() to push back the byte read from the stream:

int getwhite() {
    int c = fgetc(inp);
    if (!isspace(c)) {
        ungetc(c, inp);
    }
    return c;
}
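
As a minimal sketch of how the pushback behaves, here is the same idea with the stream passed as a parameter instead of the global inp. The name skip_one_space is made up for this example, and the stream is assumed to be open for reading.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical variant of getwhite(): consumes the next byte only if it is
 * whitespace. Any other byte is pushed back so the next read sees it again.
 * Returns the byte that was examined, or EOF at end of file. */
int skip_one_space(FILE *in) {
    assert(in != NULL);
    int c = fgetc(in);
    if (c != EOF && !isspace(c)) {
        ungetc(c, in);  /* one byte of pushback is guaranteed by the C Standard */
    }
    return c;
}
```

Unlike fseek(inp, -1, SEEK_CUR), this does not depend on file offsets, so it should behave the same on text and binary streams.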

Regarding the handling of line endings on Windows and other systems: for legacy reasons, Windows still uses CR LF sequences to indicate the end of lines in text files, and the C library translates these sequences transparently into a single '\n' byte for programs that read files as text, either with fopen() or with the lower level open() interfaces.

This makes file offsets tricky to use, because the number of bytes read from the file may differ from the offset in bytes into the file, and that offset cannot be retrieved with standard functions: the long returned by ftell() for a stream opened in text mode is only meaningful as a value to pass to fseek() with SEEK_SET on the same file opened in text mode. Seeking with a nonzero offset in SEEK_CUR or SEEK_END mode on a text stream has undefined behavior, as specified in the C Standard:

7.21.9.2 The fseek function
Synopsis

#include <stdio.h>
int fseek(FILE *stream, long int offset, int whence);

Description
[...]

For a text stream, either offset shall be zero, or offset shall be a value returned by an earlier successful call to the ftell function on a stream associated with the same file and whence shall be SEEK_SET.
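
To illustrate the one seek pattern the quoted paragraph allows on a text stream, here is a small sketch that saves the position with ftell() and restores it with SEEK_SET. The function name peek_with_ftell is hypothetical and not part of any library.

```c
#include <assert.h>
#include <stdio.h>

/* Reads one byte, then returns to where the stream was before the read.
 * This is valid on a text stream because the offset passed to fseek()
 * comes from ftell() on the same stream and is used with SEEK_SET.
 * Returns the byte read, or EOF on end of file or error. */
int peek_with_ftell(FILE *in) {
    assert(in != NULL);
    long pos = ftell(in);           /* opaque position token in text mode */
    if (pos == -1L) {
        return EOF;                 /* ftell failed */
    }
    int c = fgetc(in);
    if (fseek(in, pos, SEEK_SET) != 0) {
        return EOF;                 /* could not restore the position */
    }
    return c;
}
```

For stepping back a single byte, ungetc() is usually simpler; the ftell()/fseek() pair is more useful when several bytes have to be re-read.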

If you need to rely on file offsets, you should open files in binary mode and handle the line endings explicitly in your own code.
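
One possible way to handle line endings explicitly is sketched below. It assumes the file was opened in binary mode, for example with fopen(path, "rb"), and the helper name fgetc_newline is invented for this example. It also makes a design choice that is not stated above: a lone CR is treated as a line ending as well.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical replacement for fgetc() on a stream opened in binary mode.
 * Returns '\n' for any of the three line endings: LF, CR LF, or a lone CR.
 * All other bytes, and EOF, are returned unchanged. */
int fgetc_newline(FILE *in) {
    assert(in != NULL);
    int c = fgetc(in);
    if (c == '\r') {
        int next = fgetc(in);
        if (next != '\n' && next != EOF) {
            ungetc(next, in);   /* lone CR: keep the following byte */
        }
        return '\n';
    }
    return c;
}
```

With a helper like this, the calling code sees a single '\n' per line ending regardless of which system produced the file.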

Apple operating systems used to represent line endings as a single CR byte, but switched to a single NL byte ('\n') more than 10 years ago, when they adopted the Unix-compatible, Mach-based kernel.
