I'm trying to have my program end if the user types 'end' and while brainstorming for an implementation, one of the solutions I came up with has an interesting behavior.
In the code below, numbers are stored in firstArray.content[index] and letters/characters are stored in end[whileCounter]. Everything works as intended, except when end should receive an 'i' or an 'n'. When that happens, a newline character '\n' gets stored instead.
I searched online and in the debugger for an explanation of this interesting result but couldn't find a reason. Does anyone know why it happens?
Edit: It doesn't seem to be scanf()'s existing trailing issue as this only happens with specific characters.
Example: If I type 4lol everything works as expected but if I type 4null only ull gets stored in the character array.
I recreated this behavior in the code below:
#define _CRT_SECURE_NO_WARNINGS
#include <stdio.h>
#include <stdlib.h>

struct HeapArrRealNums {
    size_t size;
    size_t nextIndex;
    size_t currentIndex;
    double* content;
};

int main() {
    struct HeapArrRealNums firstArray = { 0 };
    firstArray.size = 4;
    firstArray.nextIndex = 1;
    firstArray.currentIndex = 0;
    firstArray.content = malloc(firstArray.size * sizeof(double));
#define NOT_A_NUMBER 0
    int scanCheck;
    size_t index = firstArray.currentIndex;
    char end[6] = { 'c' };
    size_t whileCounter = 0;
    while ((scanCheck = scanf("%lf", &firstArray.content[index])) != 1) {
        if (scanCheck == EOF) {
            printf("\n----EOF. Program terminated----");
            firstArray.currentIndex = index;
            //return true;
        }
        if (scanCheck == NOT_A_NUMBER) {
            (void)scanf("%c", &end[whileCounter]);
            whileCounter++;
            printf("-%s-", end);
            //continue;
        }
    }
}
Answer:
i is the first character in inf and n is the first character in nan. Those are legitimate inputs for the %f format conversion (and therefore for %lf as well), so scanf will look at the next character. When it turns out that the next character is \n, scanf puts that character back into the input stream and reports a conversion failure by returning 0.
But it doesn't put the i/n back, because the standard says that only one character can be returned to the input stream. The restriction comes from ungetc, which is documented as possibly failing if it is called more than once without a getc in between. scanf uses ungetc to return the character, at least in principle, and it's not allowed to try something which might fail.
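The behavior described above can be observed with a small probe. This is only a sketch: probe_double is a hypothetical helper name (not part of the question's code), and it uses tmpfile() so the input can be supplied as a string instead of being typed. Exactly which character is left in the stream after a failed conversion is library behavior, so the probe reports it rather than assuming it.

```c
#include <stdio.h>

/* Hypothetical helper for this sketch: feeds `input` to fscanf("%lf")
 * through a temporary file and reports what the stream yields next.
 * Returns fscanf's result, or EOF - 1 (a value fscanf never returns)
 * if the temporary file cannot be created.
 * *next_char receives the next character in the stream, or EOF. */
int probe_double(const char *input, int *next_char)
{
    FILE *fp = tmpfile();
    if (fp == NULL)
        return EOF - 1;

    fputs(input, fp);
    rewind(fp);               /* switch from writing to reading */

    double value;
    int status = fscanf(fp, "%lf", &value);
    *next_char = fgetc(fp);   /* whatever scanf left behind */

    fclose(fp);
    return status;
}
```

Calling probe_double("4lol\n", &next) from main should return 1 with next set to 'l', because the number ends cleanly and only one character has to be pushed back. Calling it with "null\n" or "i\n" returns 0; printing next in those cases shows how much of the input your C library consumed before giving up.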
Really, scanf is not a great interface for reading data more complicated than a sequence of numbers. But it can be made to work if you are aware of its limitations. When it's possible that either a number or a word is the next thing, try the word first:
// This is one of a zillion variants on this theme. You need to be
// clear about your precise requirements.
// Requires <stdio.h> and <ctype.h>. The declarations of end and datum
// below are assumptions, sized to match the buffer in the question.
unsigned char c;
int status;
char end[6];    // up to 5 letters plus the terminating '\0'
double datum;
while (1) {
    // Skip spaces but not newlines
    while ((status = scanf("%c", &c)) == 1
           && isspace(c) && c != '\n') { }
    if (status == EOF) break;
    if (c == '\n') {
        // Handle end of line
        continue;   // the newline has already been consumed
    }
    ungetc(c, stdin);
    if (isalpha(c)) {
        status = scanf("%5[a-zA-Z]", end);
        // do something with the word
    }
    else {
        status = scanf("%lf", &datum);
        if (status == 0) {
            // Handle conversion error
            // Note: the invalid character is back in the input.
        }
        // Handle datum
    }
    // Continue looping
}
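Another common way around the pushback limit is to avoid scanf on the stream entirely: read a whole line with fgets and parse the in-memory copy, where a failed parse cannot lose characters. The following is a minimal sketch of that idea, not the approach used above; classify_line and the line_kind values are hypothetical names, and the rule that a line must contain exactly one word or one number is an assumption about the requirements.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result codes for this sketch. */
enum line_kind { LINE_NUMBER, LINE_END, LINE_INVALID };

/* Classify one line of input (for example, one filled in by fgets).
 * Stores the parsed value in *out when the result is LINE_NUMBER. */
enum line_kind classify_line(const char *line, double *out)
{
    char word[8];
    char extra;
    char *rest;

    /* Word check first: exactly "end", surrounding whitespace allowed. */
    if (sscanf(line, "%7s %c", word, &extra) == 1 && strcmp(word, "end") == 0)
        return LINE_END;

    /* strtod parses the string in memory, so nothing is consumed on failure. */
    double value = strtod(line, &rest);
    if (rest == line)
        return LINE_INVALID;    /* no number at all, e.g. "null" */

    /* Allow trailing whitespace (including the newline), nothing else. */
    while (isspace((unsigned char)*rest))
        rest++;
    if (*rest != '\0')
        return LINE_INVALID;    /* trailing junk, e.g. "4lol" */

    *out = value;
    return LINE_NUMBER;
}
```

A caller would typically loop on fgets(buffer, sizeof buffer, stdin) and pass each buffer to classify_line, stopping on LINE_END. Note that strtod accepts inf and nan just as %lf does, so a line containing only "nan" is classified as LINE_NUMBER; if that is unwanted, the value can be checked with isfinite from <math.h>.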