reading EOF as a char in C

Time:09-14

I know it might sound stupid, but how can I exit this loop?

#include <stdio.h>
#include <stdlib.h>

int main(){
    char c;
    do {
        c = getchar();
        printf("%c", c);
    }while(c != EOF);

    return 0;
}

I'm reading a book and trying to do the following exercise: "Verify that the expression getchar() != EOF is 0 or 1". If I read the value of EOF into an int, it is equal to -1, but I can't work out how to catch a -1 as a char. From what I've understood EOF is a value which is not assigned to any other char.

Can anyone help?

edit1: I know that c should be an int. I'm reading it as a char intentionally.

edit2:

int main(){
    int c;
    while((c = getchar()) != EOF)
    {
        printf("%d\n", c);
    }
    return 0;
}

----->

int main(){
    int c;
    int i = 0;
    char str[2];
    while((c = getchar()) != EOF)
    {
        str[i] = c;
        i++;
        if(i > 1) i = 0;
        if(str[0]=='-'&&str[1]=='1')
        {
            c = EOF; // doesn't exit loop
        }
        else printf("%d\n", c);


    }
    return 0;
}

Why can't I understand this?

CodePudding user response:

  1. c has to be an int, not a char. This matters most in implementations where char is unsigned. -1 is represented as 0xffffffff (32-bit int, two's complement) and is truncated to 0xff when assigned to a char. In the comparison, an unsigned char 0xff is promoted to the int 255, which is never equal to -1. That's why you should use int, not char.
  2. Test for EOF before printing. A while(...) {} loop is more suitable:
int main(){
    int c;
    while((c = getchar()) != EOF)
    {
        printf("%c", c);
    }
}


CodePudding user response:

It might help you to understand what's going on if you change the program like this:

#include <stdio.h>

int main(){
    int c;
    do {
        c = getchar();
        printf("%d\n", c);
    } while(c != EOF);
}

You'll notice that I have:

  1. declared c as int
  2. printed it using %d

If I run this program and type "abc" and then hit Enter and then CTRL-D, this is what I see:

97
98
99
10
-1

97, 98, and 99 are the ASCII codes for a, b, and c. 10 is the code for newline, aka \n. And then that -1 is the EOF that resulted when I typed CTRL-D. (If you're on Windows, you'd use CTRL-Z and another Enter instead.)

In this program, although c is an int variable, that does not mean that it does not hold characters! In C, characters are represented by small integers which are their codes in the machine's character set. Here's a modification to demonstrate this:

int c;
int nch = 0;
char string[100];
do {
    c = getchar();
    printf("%d", c);
    if(c >= 32 && c < 127) {
        printf(" = '%c'", c);
        string[nch++] = c;
    }
    printf("\n");
} while(c != EOF);
string[nch] = '\0';
printf("You typed \"%s\"\n", string);

Now it prints

97 = 'a'
98 = 'b'
99 = 'c'
10
-1
You typed "abc"

There's no problem calling

printf(" = '%c'", c);

even though c is an int and %c is for printing characters.
There's no problem assigning

string[nch++] = c;

even though c is an int and string is an array of characters.

CodePudding user response:

getchar() returns an int in the [0 ... UCHAR_MAX] range or EOF.

To distinguish all of these values (typically 257 of them), save the result in an int.

If saved in a char that is signed, EOF, with a typical value of -1, is stored as -1, but so is some valid character, typically the one with the value 255. The loop then ends under either of two conditions: end-of-file, or reading that character.

If saved in a char that is unsigned, EOF, with a typical value of -1, is saved as 255 and never compares equal to EOF, resulting in an infinite loop.

Do the right thing. Save in an int.

Compare before printing; otherwise, when EOF is returned, it may print the same as if the character with the value 255 had been read.

// char c;
int c;
while ((c = getchar()) != EOF) {
    printf("%c", c);
}

"From what I've understood EOF is a value which is not assigned to any other char"

This is not quite true. EOF is negative, whereas getchar() returns characters (as opposed to values of type char) as unsigned char values converted to int, which are never negative.

EOF may be -1, and a char, when signed, can also have the value -1. The key is to think of characters as originally being unsigned values, even when, perhaps surprisingly, they are saved in a signed char.

Deeper: It is an old historical compromise in C that char may be signed or unsigned. Still, character processing is best done as if characters were unsigned char. This affects getchar(), is...(), strcmp() and other functions.

CodePudding user response:

According to the C Standard (7.21 Input/output <stdio.h>)

EOF

which expands to an integer constant expression, with type int and a negative value, that is returned by several functions to indicate end-of-file, that is, no more input from a stream;

As you can see, nothing is said about which negative value the constant expression has.

Usually, however, it is equal to -1.

The condition of the do-while statement

}while(c != EOF);

will correctly evaluate to 1 or 0 if the type char behaves as the type signed char. In this case the object c of the type char is implicitly converted to the type int in the condition due to the integer promotions. That is, the integer value -1 stored in the object c of the type char is promoted to the type int and keeps its value -1.

However depending on compiler options the type char can behave as the type unsigned char. In this case after this assignment

c = getchar()

the variable c will have an unsigned value. For example, if it is assigned EOF, which is equal to -1, then the variable c will have the positive value 255 (assuming an 8-bit char), and it keeps that value after the integer promotions. As a result the condition

}while(c != EOF);

will always evaluate to 1 because -1 is not equal to 255.

So you should always declare the variable c as having the type int. In this case your code will not depend on how the type char behaves: either as signed char or as unsigned char.

CodePudding user response:

In an addendum to your question, you tried to have your loop exit when the user typed "-1", and you wondered why that didn't work. You made one or two mistakes. Here is a corrected version which you can try:

int c;
int i = 0;
char str[20] = "";    /* zero-filled, so str[1] has a defined value before it is set */
while((c = getchar()) != EOF)
{
    if(i < 20)        /* don't write past the end of str */
        str[i++] = c;
    if(str[0]=='-' && str[1]=='1')
    {
        break;        /* fix here: was "c = EOF" */
    }
    else printf("%d\n", c);
}

Your code actually did recognize "-1" and set c to EOF in response, but this accomplished nothing, because the next thing that happened was another call to c = getchar() in the while header at the top of the loop, which overwrote c.

In the modified version I'm calling break when "-1" is seen, and this works.

Note that to exit the loop you must type the two characters "-" and "1", and you must type them at the beginning of the line.

Note that you're reading characters here. The two characters "-" and "1" are not being treated as the integer -1, or as the C EOF value. They're just two characters. If you changed the if statement to

if(str[0]=='x' && str[1]=='y')

you'd end up with a loop that stopped when the user typed "xy" at the beginning of a line.
