Using fscanf, scanning a file into a struct in C, but the first argument is failing already

Time:12-12

I have a file where I'm trying to read each line into a struct in C to further work with it.

The file looks like this:

Bread,212,2.7,36,6,9.8,0.01,0.01,10,500 
Pasta,347,2.5,64,13,7,0.01,0.01,6,500 
Honey,340,0.01,83,0.01,0.01,0.01,0.01,22.7,425 
Olive-oil,824,92,0.01,0.01,0.01,0.01,13.8,35,500 
White-beans,320,2.7,44,21,18,0.01,0.01,11,400 
Flaxseed-oil,828,92,0.01,0.01,0.01,52,14,100,100 
Cereal,363,6.5,58,13,9.9,0.01,0.01,11,1000 
Hazelnuts,644,61.6,10.5,12,0.01,0.09,7.83,16.74,252 

So I wrote a for-loop to iterate over the lines in the file, storing each value in a field of a struct. When I print the fields of the struct, it's already going wrong with the first argument, the string.

It is printing:

scanresult: 1, name:  ■B, kcal: 0.00, omega 3: 0.00, omega 6: 0.00, carb: 0.00, protein: 0.00, fib: 0.00, price: 0.00, weight: 0.00g

Scanres should be 10, not 1, and the values should match the ones of the first line of the file.

I have tried with and without whitespace in front of the first conversion in the format string. I also compiled with the warning flags -Wall and -pedantic, which reported no issues.

What else could cause this problem?

The code looks like this:

#include <stdio.h>

#define MAX_CHAR 100
#define SIZE_OF_SHELF 8

typedef struct {
    char name[MAX_CHAR];
    double kcal, fat, omega_3, omega_6, carb, protein, fib, price, weight;
} Food;

int main(void) {
    int i = 0, scanres;
    Food Shelf[SIZE_OF_SHELF];
    FILE *fp;

    fp = fopen("foods.txt", "r");

    if (! fp) {
        printf("error loading file. bye.\n");
        return 0;
    }

    for (i = 0; !feof(fp); i++) {
        scanres = fscanf(fp, " %[^,],%lf,%lf,%lf,%lf,%lf,%lf,%lf,%lf,%lf ",
                         Shelf[i].name,
                         &Shelf[i].kcal, &Shelf[i].fat,
                         &Shelf[i].carb, &Shelf[i].protein,
                         &Shelf[i].fib, &Shelf[i].omega_3,
                         &Shelf[i].omega_6, &Shelf[i].price,
                         &Shelf[i].weight);
        
        printf("scanres: %d, name: %s, kcal: %.2f, omega 3: %.2f, omega 6: %.2f, carb: %.2f, protein: %.2f, fib: %.2f, price: %.2f, weight: %.2fg\n",
               scanres, Shelf[i].name, Shelf[i].kcal,
               Shelf[i].omega_3, Shelf[i].omega_6, Shelf[i].carb, 
               Shelf[i].protein, Shelf[i].fib, Shelf[i].price,
               Shelf[i].weight);
    }
    return 0;
}

If anybody can spot what I'm doing wrong, please let me know.

CodePudding user response:

Check whether the file starts with a Byte Order Mark (BOM). For UTF-8 that is the three bytes ef bb bf at the very beginning of the file. You can use hexdump (or any binary editor) to inspect it.

File with BOM:


00000000  ef bb bf 42 72 65 61 64  2c 32 31 32 2c 32 2e 37  |...Bread,212,2.7|
00000010  2c 33 36 2c 36 2c 39 2e  38 2c 30 2e 30 31 2c 30  |,36,6,9.8,0.01,0|
00000020  2e 30 31 2c 31 30 2c 35  30 30 20 0a 50 61 73 74  |.01,10,500 .Past|
00000030  61 2c 33 34 37 2c 32 2e  35 2c 36 34 2c 31 33 2c  |a,347,2.5,64,13,|
...

File without BOM:


00000000  42 72 65 61 64 2c 32 31  32 2c 32 2e 37 2c 33 36  |Bread,212,2.7,36|
00000010  2c 36 2c 39 2e 38 2c 30  2e 30 31 2c 30 2e 30 31  |,6,9.8,0.01,0.01|
00000020  2c 31 30 2c 35 30 30 20  0a 50 61 73 74 61 2c 33  |,10,500 .Pasta,3|
00000030  34 37 2c 32 2e 35 2c 36  34 2c 31 33 2c 37 2c 30  |47,2.5,64,13,7,0|
...

CodePudding user response:

It's likely that, besides having a Byte Order Mark (BOM), the original copy of the foods.txt file was encoded using UTF-16, instead of ASCII or the more popular and compatible UTF-8. Taking a cue from wildplasser's answer, here is a hex dump of the first portion of the file in the little-endian variant of that encoding:

00000000  ff fe 42 00 72 00 65 00  61 00 64 00 2c 00 32 00  |..B.r.e.a.d.,.2.|
00000010  31 00 32 00 2c 00 32 00  2e 00 37 00 2c 00 33 00  |1.2.,.2...7.,.3.|
00000020  36 00 2c 00 36 00 2c 00  39 00 2e 00 38 00 2c 00  |6.,.6.,.9...8.,.|
00000030  30 00 2e 00 30 00 31 00  2c 00 30 00 2e 00 30 00  |0...0.1.,.0...0.|
00000040  31 00 2c 00 31 00 30 00  2c 00 35 00 30 00 30 00  |1.,.1.0.,.5.0.0.|
00000050  20 00 0a 00 50 00 61 00  73 00 74 00 61 00 2c 00  | ...P.a.s.t.a.,.|
00000060  33 00 34 00 37 00 2c 00  32 00 2e 00 35 00 2c 00  |3.4.7.,.2...5.,.|

The leading ff fe is the byte order mark, and would account for the mysterious ■ that showed up in the output name: ■B. Thereafter, every other byte is 0, which is why "Bread" appears truncated to "B": printf's %s stops at the first zero byte. The first %lf then runs into one of those zero bytes instead of a digit and can't parse a double, which is why fscanf returns 1 instead of 10.
