How to read line by line with universal newline support in C?

Time:08-24

int n = NMAXVAL;
char line[MAXLINE];
int rowCount = 0;
char* token = NULL;

while (rowCount < n && fgets(line, MAXLINE, stdin))
{
    token = strtok(line, " ");
    checkToken(token, n);
    // do something
    rowCount++;
}

My function reads from stdin line by line and performs some tasks. It works fine on Visual Studio, but it behaves differently on Xcode. It took me a while to realize that this is caused by the differences between the line endings \n, \r, and \r\n. How do I make my code support all of these newline sequences?

update: consider this input:

1
10
3 4
5 6 7

On Visual Studio, the function reads line by line. On Xcode, however, it reads the first line just fine, then reads \n instead of 10\n.

CodePudding user response:

On Visual Studio, the function reads line by line. On Xcode, however, it reads the first line just fine, then reads \n instead of 10\n.

The problem is not fgets(). The problem is that the same exact file is treated as a different text file on different platforms.

A simple solution is to create each text file separately on each platform, using that platform's end-of-line convention.

Alternatively, and harder: to read the same exact file on different platforms, write your own line-reading function that identifies an end-of-line in its various forms. Depending on goals, code may need to (re-)open the file in binary mode.
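
A minimal sketch of such a line-reading function follows. The name read_line_universal and its behaviour on overlong lines (extra characters are dropped) are assumptions made for illustration, not part of the original answer. It treats \n, \r, and \r\n each as a single line terminator and does not store the terminator.

```c
#include <stdio.h>

/* Hypothetical helper: read one line from fp into buf (at most size-1 chars).
   "\n", "\r", and "\r\n" each count as one line terminator, which is not stored.
   Characters that do not fit in buf are discarded.
   Returns buf, or NULL if end of file is hit before any character is read. */
char *read_line_universal(char *buf, size_t size, FILE *fp)
{
    size_t len = 0;
    int read_any = 0;
    int c;

    if (size == 0)
        return NULL;

    while ((c = getc(fp)) != EOF) {
        read_any = 1;
        if (c == '\n')
            break;
        if (c == '\r') {
            /* Peek at the next character: swallow it only if it completes "\r\n". */
            int next = getc(fp);
            if (next != '\n' && next != EOF)
                ungetc(next, fp);
            break;
        }
        if (len + 1 < size)
            buf[len++] = (char)c;
    }

    if (!read_any)
        return NULL;

    buf[len] = '\0';
    return buf;
}
```

In the question's loop, this would stand in for fgets(line, MAXLINE, stdin) as read_line_universal(line, MAXLINE, stdin). Note that a blank line in the input is still returned as an empty string.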

Alternatively, and sufficient in many cases: open the file in text mode (as stdin is), use fgets(), and strip the various line endings off.

if (fgets(line, sizeof line, stdin)) {
  line[strcspn(line, "\n\r")] = '\0'; // truncate at the first \n or \r (no-op if neither is present)
}
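
If the stray line endings show up as extra blank lines, as in the question, the same idiom can be wrapped so that lines left empty after stripping are skipped. This is only a sketch: the helper name next_nonblank_line is hypothetical, and it assumes blank lines carry no meaning in the input.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: fetch the next line that is non-empty after its
   line ending has been stripped. Assumes blank lines can be ignored.
   Returns 1 if a line was stored in line, 0 at end of input. */
int next_nonblank_line(char *line, int size, FILE *fp)
{
    while (fgets(line, size, fp)) {
        line[strcspn(line, "\r\n")] = '\0'; /* cut at the first \r or \n */
        if (line[0] != '\0')
            return 1;
    }
    return 0;
}
```

Because fgets() only stops at \n, this approach does not split input whose lines end in a bare \r. Such input arrives as one long line and is cut at the first \r.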

CodePudding user response:

Since the line will be tokenized by whitespace, the input could be read with scanf, which ignores leading whitespace. So scanf does the tokenizing as a feature.
Before scanning a token, try to scan a line ending, \r or \n. If successful, scan for consecutive line endings. This seems to be the issue with fgets: it stops at the first line ending and treats a consecutive line ending as a blank line, and the input lines seem to end with a pair of line endings.
If the scan for a line ending fails, scanf leaves the non-matching character in the input stream.

#include <stdio.h>
#include <stdlib.h>

int main ( void) {
    char token[100] = "";
    char endline[2] = "";
    int rowcount = 0;
    int result = 0;

    while ( 1) {
        scanf ( "%*[ \t]"); //scan and discard any and all space and tab
        if ( 1 == scanf ( "%1[\r\n]", endline)) {
            ++rowcount;
            // scanf ( "%1[\r\n]", endline); // consume one \r or \n
            while ( 1 == scanf ( "%1[\r\n]", endline)) {
                //consume consecutive \r or \n
            }
        }
        if ( 1 == ( result = scanf ( "           