Difficulties with example 1.9 of The C Programming Language

I am working my way through the exercises in the first chapter of The C Programming Language, and while I understand most of what is said and shown, there is one example that I don't understand.

Section 1.9 shows a function that returns the length of a line and stores the line's contents in a char array passed as an argument.

int get_line(char s[], int lim)
{
    int c, i, l;

    for (i = 0, l = 0; (c = getchar()) != EOF && c != '\n'; ++i) {
        if (i < lim - 1)
            s[l++] = c;
    }
    if (c == '\n')
        if (l < lim - 1)
            s[l++] = c;
    s[l] = '\0';

    return l;
}

What I do not understand is why we need this: if (c == '\n') {...}. Could it not be folded into the for loop, where we already check explicitly that c is not equal to '\n'? I'm having trouble wrapping my head around why this needs to be a separate condition after the loop.

Any light shed would be helpful! Thanks!

CodePudding user response：

The for loop is exited if either c equals EOF or c equals '\n'. Therefore, immediately after the for loop, if you want to know which of the two values c has, you must test it.
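As a minimal sketch of that point (not code from the book): next_char() below is a hypothetical stand-in for getchar() that reads from a string, so the behaviour can be checked without stdin. The loop uses the same two-part condition, and the value left in c is the only record of which part ended it.

```c
#include <assert.h>
#include <stdio.h>   /* EOF */

/* Hypothetical stand-in for getchar(): reads from a string, EOF at its end. */
static int next_char(const char **p)
{
    assert(p != NULL && *p != NULL);
    if (**p == '\0')
        return EOF;
    return (unsigned char)*(*p)++;   /* return current char, then advance */
}

/* Returns '\n' if the loop stopped on a newline, EOF if the input ran out. */
int why_loop_stopped(const char *text)
{
    int c;

    while ((c = next_char(&text)) != EOF && c != '\n')
        ;   /* skip ordinary characters */

    /* Either half of the condition may have ended the loop; only c says which. */
    return c;
}
```

Calling why_loop_stopped("abc\ndef") yields '\n', while why_loop_stopped("abc") yields EOF, which is why get_line() has to look at c again before deciding whether to store a newline.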

CodePudding user response：

"... why we need this: if (c == '\n') {...}."

get_line() is structurally:

get_line() {
  initialize

  while get, A not true and B not true
    perform X

  if B
    perform X
  
  finalize
}

The loop quits under 2 conditions. With one of those (c == '\n'), we still want to perform X somewhere as that is part of the function goal.

"Could this not be combined in the for-loop?"

It could be combined, yet then we have 2 locations that exit the loop.

Typical coding guidelines promote a single location to quit the loop. If we set aside that goal, then:

get_line() {
  initialize

  while get, A not true
    perform X
    if B quit the loop
  
  finalize
}

The code below does this with the same number of condition checks, yet with 2 loop exit points.

int get_line(char s[], int lim) {
    int c, i, l;

    for (i = 0, l = 0; (c = getchar()) != EOF; ++i) {
        if (i < lim - 1)
            s[l++] = c;
        if (c == '\n')
            break;
    }

    s[l] = '\0';
    return l;
}
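One way to convince yourself that the two-exit version behaves like the original is to run both on the same input and compare. The sketch below assumes a hypothetical next_char() that reads from an in-memory string instead of getchar(); the names get_line_after, get_line_break, and same_result are made up for this illustration.

```c
#include <assert.h>
#include <stdio.h>   /* EOF */
#include <string.h>  /* strcmp */

/* Hypothetical stand-in for getchar(): reads from a string cursor. */
static const char *cursor;

static int next_char(void)
{
    assert(cursor != NULL);
    return *cursor ? (unsigned char)*cursor++ : EOF;
}

/* Original shape: one loop exit, newline handled after the loop. */
int get_line_after(char s[], int lim)
{
    int c, i, l;

    for (i = 0, l = 0; (c = next_char()) != EOF && c != '\n'; ++i)
        if (i < lim - 1)
            s[l++] = (char)c;
    if (c == '\n' && l < lim - 1)
        s[l++] = (char)c;
    s[l] = '\0';
    return l;
}

/* Variant: newline handled inside the loop, at the cost of a break. */
int get_line_break(char s[], int lim)
{
    int c, i, l;

    for (i = 0, l = 0; (c = next_char()) != EOF; ++i) {
        if (i < lim - 1)
            s[l++] = (char)c;
        if (c == '\n')
            break;
    }
    s[l] = '\0';
    return l;
}

/* Runs both versions on the same text; returns 1 if length and buffer match. */
int same_result(const char *text, int lim)
{
    char a[64], b[64];
    int la, lb;

    assert(lim >= 1 && lim <= 64);
    cursor = text;
    la = get_line_after(a, lim);
    cursor = text;
    lb = get_line_break(b, lim);
    return la == lb && strcmp(a, b) == 0;
}
```

The two agree because l always equals the smaller of i and lim - 1, so the l < lim - 1 test after the loop and the i < lim - 1 test inside it give the same answer for the newline.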

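We could contort the code to get the 2 checks back on the same line and drop that pesky if (c == '\n') after the loop. This works because c starts at 0 and the c != '\n' test runs before the next getchar() call, so the loop stops right after the newline has been handled by the loop body. Stylistically this may be harder to follow.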

int get_line(char s[], int lim) {
    int c, i, l;

    for (i = 0, l = 0, c = 0; c != '\n' && (c = getchar()) != EOF; ++i) {
        if (i < lim - 1)
            s[l++] = c;
    }
    s[l] = '\0';

    return l;
}

Lastly, the code could use some improvements:

There is no need for both i and l as index counters. One is enough.
Array sizes and indexes are best expressed with the size_t type. Warning: size_t is some unsigned type.
A leading size parameter allows for better static code analysis and self-documenting code: it shows that lim relates to s[].
Avoid math on input parameters so as not to incur overflow. We have more range control over local objects.
Be careful when lim is at an extreme or zero.
Where practical, initialize at declaration rather than assigning afterwards. E.g. int i = 0;
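To illustrate the points about unsigned types, math on parameters, and a zero limit, here is a small sketch. The helper names has_room_subtract and has_room_add are hypothetical and exist only to compare the two ways of writing the "is there room for one more character plus the terminator" test.

```c
#include <assert.h>
#include <stddef.h>  /* size_t */
#include <stdint.h>  /* SIZE_MAX */

/* Risky form: with size == 0, size - 1 wraps around to SIZE_MAX,
   so the test claims there is room in an empty buffer. */
int has_room_subtract(size_t i, size_t size)
{
    return i < size - 1;
}

/* Safer form: do the math on the local index instead of the parameter.
   Assumes i is a real index, i.e. below SIZE_MAX, so i + 1 cannot wrap. */
int has_room_add(size_t i, size_t size)
{
    assert(i < SIZE_MAX);
    return i + 1 < size;
}
```

With size == 0, has_room_subtract(0, 0) reports room while has_room_add(0, 0) does not; for ordinary sizes the two agree.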

get_line() {
  initialize

  while B not true, get, A not true
    perform X
  
  finalize
}

#include <stdio.h>
#include <stdlib.h>
      
size_t get_line(size_t size, char s[size]) {
  int ch = 0;
  size_t i = 0;

  while (ch != '\n' && (ch = getchar()) != EOF) {
    if (i + 1 < size)
      s[i++] = (char) ch;
  }

  // size might have been pathologically 0, so no room for \0
  if (i < size) {
    s[i] = '\0';
  }
  return i;
}

CodePudding user response：

If you want to put it in the loop, you have to do something like this:

int get_line(char s[], int lim)
{
    int c, i, l;

    for (i = 0, l = 0; (c = getchar()) != EOF; ++i) {
        if ((i < lim - 1) && (c != '\n'))
            s[l++] = c;
        else if (c == '\n') {
            if (l < lim - 1)
                s[l++] = c;
            break;
        }
    }

    s[l] = '\0';

    return l;
}

So, as you can see, moving the condition inside the loop leads to more condition checks and a break statement.