getline in C implementation for Windows OS

Time:11-30

Statement

  • I know there is a function called getline() on Linux/Unix-based operating systems.
  • I want to know which other functions are available on Linux/Unix-based operating systems but not on Windows.

Question

  1. Is there a self-written getline() implementation that can replace the missing one on Windows?

  2. What resources are available for reference and reading?

ssize_t getline(char **lineptr, size_t *n, FILE *stream)

CodePudding user response:

The relevant standard is IEEE Std 1003.1, also called POSIX.1, specifically its System Interfaces, with the list of affected functions here.

I recommend the Linux man pages online. Do not be deterred by the name, because the C functions (described in sections 2 and 3) have a section Conforming to, which lists the standards that specify the feature. If it is C89, C99, etc., then it is included in the C standard; if POSIX.1, then in POSIX; if SUS, then in the Single UNIX Specification, a superset of POSIX; if 4.3BSD, then in old BSD version 4.3; if SVr4, then in Unix System V release 4, and so on.

Windows implements its own extensions to C, and supports very little of POSIX or SUS, but has ported some details from 4.3BSD. Use the Microsoft documentation to find out whether a given function is available.

In Linux, the C libraries expose these features if certain preprocessor macros are defined before any #include directives. These are described at man 7 feature_test_macros. I typically use #define _POSIX_C_SOURCE 200809L for POSIX, and occasionally #define _GNU_SOURCE for GNU C extensions.
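As a minimal sketch of the macro placement described above: the define comes first, and only then the headers. The helper name count_lines() is made up for this illustration; it simply counts how many lines getline() can read from a stream.

```c
/* The feature test macro must be defined before the first #include. */
#define _POSIX_C_SOURCE 200809L

#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: counts the lines that getline() reads from 'in'.
   A final line without a trailing newline is counted as well. */
long count_lines(FILE *in)
{
    char   *line  = NULL;   /* getline() allocates the buffer for us */
    size_t  size  = 0;
    long    count = 0;

    while (getline(&line, &size, in) >= 0)
        count++;

    free(line);             /* safe even if nothing was ever allocated */
    return count;
}
```

If the define is moved below the #include lines, a compiler in strict ISO C mode (for example with -std=c99) may report getline() as undeclared.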


getline() is an excellent interface, and not "leaky" except perhaps when used by programmers used to Microsoft/Windows inanities, like not being able to do wide character output to console without Microsoft-only extensions (because they just didn't want to put that implementation inside fwide(), apparently).

The most common use pattern is to initialize an unallocated buffer, and a suitable line length variable:

    char   *line_buf = NULL;
    size_t  line_max = 0;
    ssize_t line_len;

Then, when you read a line, the C library is free to reallocate the buffer to whatever size is needed to contain the line. For example, a loop that reads standard input line by line might look like this:

    while (1) {
        line_len = getline(&line_buf, &line_max, stdin);
        if (line_len < 0)
            break;

        // line_buf has line_len characters of data in it, and line_buf[line_len] == '\0'.
        // If the input contained embedded '\0' bytes in it, then strlen(line_buf) < line_len.
        // Normally, strlen(line_buf) == line_len.
    }

    free(line_buf);
    line_buf = NULL;
    line_max = 0;

    if (!feof(stdin) || ferror(stdin)) {
        // Not all of input was processed, or there was an error.
    } else {
        // All input processed without I/O errors.
    }

Note that free(NULL) is safe, and does nothing. This means that we can safely use free(line_buf); line_buf = NULL; line_max = 0; after the loop (in fact, at any point we want) to discard the current line buffer. If one is needed, the next getline() or getdelim() call with the same variables will allocate a new one.

The above pattern never leaks memory, and correctly detects all errors during file processing, from I/O errors to not having enough RAM available (or allowed for the current process), although it cannot distinguish between them: it only tells you that an error occurred. It also won't report false errors, unless you break out of the loop in your own added processing code.

Thus, any claims of getline() being "leaky" are anti-POSIX, pro-Microsoft propaganda. For some reason, Microsoft has steadfastly refused to implement these in their own C library, even though they easily could.

If you want to copy parts of the line, I do recommend using strdup() or strndup(), also POSIX.1-2008 functions. They return a dynamically allocated copy of the string, the latter only copying up to the specified number of characters (if the string does not end before that); in all cases, if the functions return a non-NULL pointer, the dynamically allocated string is terminated with a nul '\0', and should be freed with free() just like the getline() buffer above, when no longer needed.
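For illustration, here is a small sketch of copying part of a line with strndup(). The helper first_word() and its choice of whitespace characters are my own assumptions for the example, not part of POSIX.

```c
#define _POSIX_C_SOURCE 200809L

#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a dynamically allocated copy of the first
   whitespace-delimited word on the line (possibly an empty string), or
   NULL if memory allocation fails. The caller releases it with free(). */
char *first_word(const char *line)
{
    const char *start = line + strspn(line, " \t\r\n");   /* skip leading whitespace */
    size_t      len   = strcspn(start, " \t\r\n");        /* length of the word */

    return strndup(start, len);   /* copies len characters, adds '\0' */
}
```

Because the copy is independent of the getline() buffer, it stays valid after the next getline() call reuses or reallocates that buffer.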

If you have to run the code on Windows as well, a good option is to implement your own getline() on the architectures and OSes that do not provide one. (You can use the Pre-defined Compiler Macros Wiki to see how you can detect the code being compiled on a specific architecture, OS, or compiler.)
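A rough sketch of such a compile-time check follows. It assumes that _WIN32 is predefined by compilers targeting Windows (MSVC and MinGW both define it); HAVE_OWN_GETLINE and getline_provider() are hypothetical names made up for this example.

```c
/* Assumption: _WIN32 is predefined when compiling for Windows.
   HAVE_OWN_GETLINE is a hypothetical, project-specific macro. */
#if defined(_WIN32)
#define HAVE_OWN_GETLINE 1
#else
#define HAVE_OWN_GETLINE 0
#endif

/* Reports which getline() this build is expected to use. */
const char *getline_provider(void)
{
    return HAVE_OWN_GETLINE ? "own implementation" : "C library";
}
```

In a real project the same macro would typically guard the definition of the replacement function itself, so that it is only compiled where the C library lacks one.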

An example getline() implementation can be written on top of fgets(), growing the buffer and reading more (appending to existing buffer), until the buffer ends with a newline. It, however, cannot really handle embedded '\0' bytes in the data; to do that, and properly implement getdelim(), you need to read the data character-by-character, using e.g. fgetc().
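Here is a minimal sketch of the character-by-character variant, using fgetc(). The names my_getdelim() and my_getline() are made up so they do not clash with a C library that already has the real ones, and the return type is long because ssize_t is not part of standard C. It does not guard against lines longer than LONG_MAX or against size overflow when doubling the buffer, so treat it as a starting point rather than a finished replacement.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for getdelim(): reads up to and including 'delim'.
   Returns the number of characters stored (excluding the final '\0'),
   or -1 at end of input, on a read error, or when out of memory. */
long my_getdelim(char **lineptr, size_t *n, int delim, FILE *stream)
{
    size_t used = 0;
    int    c;

    if (!lineptr || !n || !stream) {
        errno = EINVAL;
        return -1;
    }
    if (!*lineptr)
        *n = 0;                 /* no buffer yet, so ignore any stale size */

    while ((c = fgetc(stream)) != EOF) {
        /* Keep room for this character and the terminating '\0'. */
        if (used + 2 > *n) {
            size_t new_size = (*n < 64) ? 128 : 2 * *n;
            char  *new_buf  = realloc(*lineptr, new_size);
            if (!new_buf) {
                errno = ENOMEM;
                return -1;      /* the old buffer is still valid and owned by the caller */
            }
            *lineptr = new_buf;
            *n       = new_size;
        }
        (*lineptr)[used++] = (char)c;
        if (c == delim)
            break;
    }

    if (used == 0)
        return -1;              /* nothing read: end of input or read error */

    (*lineptr)[used] = '\0';
    return (long)used;
}

/* Hypothetical stand-in for getline(): a line ends with '\n'. */
long my_getline(char **lineptr, size_t *n, FILE *stream)
{
    return my_getdelim(lineptr, n, '\n', stream);
}
```

Since the length is returned separately, embedded '\0' bytes in the input are preserved in the buffer, which an fgets()-based version cannot do reliably. The calling pattern is the same as for the real getline(): start with a NULL pointer and a zero size, and free() the buffer when done.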
