I'm trying to write a program whose task is to get rid of extra whitespace in a text stream.
#include <stdio.h>

int main()
{
    int c;
    while ((c = getchar()) != EOF)
    {
        if (c == ' ')
        {
            putchar(c);
            while ((c = getchar()) == ' ');
            if (c == EOF) break;
        }
        putchar(c);
    }
}
I know that the getchar() function returns a new character on each iteration and that we store the character in the variable c, but I can't figure out why we need to assign the result of getchar() to c again in the second while loop.
CodePudding user response:
The inner loop loops as long as it extracts spaces from the stream, effectively skipping them. The result is that, if you have a sequence of spaces, the first one will be printed, and the rest will be discarded.
You need to assign to c even there (as opposed to just doing something like while (getchar() == ' ');) because you'll reach a point where you extract a character that is not a space, so you need to remember what character it was to output it after the loop.
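To see the difference concretely, here is a minimal sketch that runs the same loop structure over a string instead of stdin. Everything in it apart from the loop shape is an assumption made for illustration: next_char() is a hypothetical stand-in for getchar(), END plays the role of EOF, and squeeze_keep() / squeeze_lossy() are made-up names for the two variants.

```c
#include <assert.h>
#include <stddef.h>

#define END (-1) /* plays the role of EOF in this sketch */

/* Hypothetical stand-in for getchar(): reads from a string, not stdin. */
static int next_char(const char *s, size_t *pos)
{
    if (s[*pos] == '\0')
        return END;
    return (unsigned char)s[(*pos)++];
}

/* Inner loop stores what it reads in c, as in the question's program.
   out needs room for 2 * strlen(in) + 1 bytes to be safe for both variants. */
size_t squeeze_keep(const char *in, char *out)
{
    assert(in != NULL && out != NULL);
    size_t pos = 0, n = 0;
    int c;
    while ((c = next_char(in, &pos)) != END) {
        if (c == ' ') {
            out[n++] = (char)c;          /* keep the first space of the run */
            while ((c = next_char(in, &pos)) == ' ')
                ;                        /* skip the rest of the run */
            if (c == END)
                break;
        }
        out[n++] = (char)c;              /* c is the character that ended the run */
    }
    out[n] = '\0';
    return n;
}

/* Inner loop discards what it reads, like: while (getchar() == ' '); */
size_t squeeze_lossy(const char *in, char *out)
{
    assert(in != NULL && out != NULL);
    size_t pos = 0, n = 0;
    int c;
    while ((c = next_char(in, &pos)) != END) {
        if (c == ' ') {
            out[n++] = (char)c;
            while (next_char(in, &pos) == ' ')
                ;                        /* the character that ends the run is dropped */
            if (c == END)                /* never true here: c is still ' ' */
                break;
        }
        out[n++] = (char)c;              /* after a run, this writes a second space */
    }
    out[n] = '\0';
    return n;
}
```

For the input "a  b" (two spaces), squeeze_keep() produces "a b", while squeeze_lossy() produces "a" followed by two spaces: the 'b' was consumed by the inner loop and never stored, and the stale space still held in c is written a second time.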
CodePudding user response:
"I know that getchar() function returns a new character each iteration and we store the character in the c variable, but I can't figure out why do we need to assign c variable to getchar() in the second while loop again."
The reuse of variable c is of no particular consequence. A separate variable could have been used for the inner loop instead. Example:
#include <stdio.h>

int main()
{
    int c;
    while ((c = getchar()) != EOF)
    {
        putchar(c);
        if (c == ' ')
        {
            int c2;
            while ((c2 = getchar()) == ' ');
            if (c2 == EOF) break;
            putchar(c2);
        }
    }
}
Note that a rearrangement of the putchar() calls was required. Either way, when the program encounters a run of spaces, it must both (i) print one space, AND (ii) avoid losing the first non-space, if any, after the run.
It would also have been possible to write the program with only one appearance of a call to getchar(), but then it would need to store more state explicitly (whether the previous character read was a space) and use that to decide whether to output each newly-read character. For example:
#include <stdio.h>

int main(void) {
    int c;
    _Bool was_space = 0;
    while ((c = getchar()) != EOF) {
        if (c != ' ' || !was_space) {
            putchar(c);
        }
        was_space = (c == ' ');
    }
}
In fact, I find both of those alternatives rather clearer than the original. The original seems to go out of its way to minimize the number of variables it declares.
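If you want to check the flag-based logic on fixed inputs rather than by typing at a terminal, a minimal sketch along these lines may help. It applies the same was_space test to a string; the function name squeeze_flag() and the string-in, string-out interface are assumptions made for illustration, not part of the programs above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: copies in to out, collapsing each run of spaces
   to a single space. out must have room for strlen(in) + 1 bytes.
   Returns the number of characters written, not counting the terminator. */
size_t squeeze_flag(const char *in, char *out)
{
    assert(in != NULL && out != NULL);
    size_t n = 0;
    bool was_space = false;              /* was the previous character a space? */
    for (size_t i = 0; in[i] != '\0'; i++) {
        char c = in[i];
        if (c != ' ' || !was_space)      /* same test as in the getchar() version */
            out[n++] = c;
        was_space = (c == ' ');
    }
    out[n] = '\0';
    return n;
}
```

With this sketch, "a   b  c" becomes "a b c" and "  x  " becomes " x ". Note that, like all the programs here, it only compares against ' ', so tabs and newlines pass through unchanged.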