I was trying to read some data from a file using fread() and noticed that my file kept growing. Since I was only reading from the file, that behaviour made no sense to me. So I wrote the code below and found that if I use putw() to write data to a file and then try to read from that file (before closing and reopening it), fread expands the file to be able to read from it.
Operating System: Windows 8.1
Compiler: MinGW gcc
The code:
#include <stdio.h>

typedef struct {
    int a;
    int b;
} A;

int main(void) {
    FILE* f = fopen("file", "wb");
    A a;
    a.a = 2;
    a.b = 3;
    putw(1, f);
    fwrite(&a, sizeof(A), 1, f);
    fclose(f); // To make sure that wb mode and fwrite are not responsible
    f = fopen("file", "rb+");
    printf("initial position: %ld\n", ftell(f));
    putw(1, f);
    printf("position after putw: %ld\n", ftell(f));
    // fread returns size_t, so cast it to match the %d conversion
    printf("fread result: %d\n", (int)fread(&a, sizeof(A), 1, f));
    printf("position after 1st fread: %ld\n", ftell(f));
    printf("fread result: %d\n", (int)fread(&a, sizeof(A), 1, f));
    printf("position after 2nd fread: %ld\n", ftell(f));
    fclose(f);
    remove("file");
    return 0;
}
RESULT:
initial position: 0
position after putw: 4
fread result: 1
position after 1st fread: 12
fread result: 1
position after 2nd fread: 20
Answer:
The Problems
There are a few issues in the code that can lead to undefined behavior:
- mixing wide- & byte-oriented functions,
- using file contents located after the most recent wide character written to a wide-oriented stream (potentially causing framing errors), and
- calling input functions after output functions without an intervening fflush.
Issue 2 is tricky to phrase succinctly; the C standard section quoted below should make it clearer.
The behavior of functions as related to orientation is defined in C17 (draft) § 7.21.2, paragraphs 4 and 5:
4 Each stream has an orientation. After a stream is associated with an external file, but before any operations are performed on it, the stream is without orientation. Once a wide character input/output function has been applied to a stream without orientation, the stream becomes a wide-oriented stream. Similarly, once a byte input/output function has been applied to a stream without orientation, the stream becomes a byte-oriented stream. Only a call to the freopen function or the fwide[*] function can otherwise alter the orientation of a stream. (A successful call to freopen removes any orientation.)
5 Byte input/output functions shall not be applied to a wide-oriented stream and wide character input/output functions shall not be applied to a byte-oriented stream. The remaining stream operations do not affect, and are not affected by, a stream’s orientation, except for the following additional restrictions: […]
— For wide-oriented streams, after a successful call to a file-positioning function that leaves the file position indicator prior to the end-of-file, a wide character output function can overwrite a partial multibyte character; any file contents beyond the byte(s) written are henceforth indeterminate.
Mixing output & input without flushing is covered by § 7.19.5.3 6 (fopen):
6 When a file is opened with update mode ('+' as the second or third character in the above list of mode argument values), both input and output may be performed on the associated stream. However, output shall not be directly followed by input without an intervening call to the fflush function or to a file positioning function (fseek, fsetpos, or rewind), […]
These are also listed in the Big List of Undefined Behavior, Annex J.2:
The behavior is undefined in the following circumstances:
[…]
— A byte input/output function is applied to a wide-oriented stream, or a wide character input/output function is applied to a byte-oriented stream (7.21.2).
— Use is made of any portion of a file beyond the most recent wide character written to a wide-oriented stream (7.21.2).
[…]
— An output operation on an update stream is followed by an input operation without an intervening call to the fflush function or a file positioning function, […] (7.19.5.3).
The Solutions
There are two approaches:
- use freopen in between the wide-character and byte-oriented functions, or
- use only byte-oriented functions (e.g. fwrite), and call fflush (or fseek, as per the standard) in between writing & reading.
Note fwide can only set the orientation of unoriented streams, so it can't address the issues; once the orientation of a stream is set, it can only be cleared with freopen.
freopen Solution
freopen on its own addresses 2 of the 3 issues:
- It clears the orientation in between the wide- and byte-oriented functions, so they're not mixed.
- On its own, freopen will leave any garbage characters in the tail of the file, though it shouldn't be an issue in the given example. If this is an issue, the stream must first be truncated (though this isn't appropriate for the example).
- freopen calls fflush, so that output is not directly followed by input.
const char* fName = "file";
f = fopen(fName, "rb+");
putw(1, f);
// truncate here, if applicable
if (freopen(NULL, "rb+", f)) {
    int nA;
    fread(&nA, sizeof(nA), 1, f); // skip past the int written by putw
    printf("fread result: %d\n", (int)fread(&a, sizeof(A), 1, f));
    printf("position after 1st fread: %ld\n", ftell(f));
    printf("fread result: %d\n", (int)fread(&a, sizeof(A), 1, f));
    printf("position after 2nd fread: %ld\n", ftell(f));
}
Byte-Oriented I/O Solution
Replacing putw with fwrite and adding a call to fflush addresses all 3 issues:
- No more wide-oriented functions are used, so there's no orientation mixing.
- With no wide-oriented functions being used, you don't have the problem of framing errors mentioned in § 7.21.2 5.
- fflush explicitly addresses § 7.19.5.3 6.
const char* fName = "file";
f = fopen(fName, "rb+");
int nA = 1;
fwrite(&nA, sizeof(nA), 1, f);
fflush(f); // required between output and input on an update stream
printf("fread result: %d\n", (int)fread(&a, sizeof(A), 1, f));
printf("position after 1st fread: %ld\n", ftell(f));
printf("fread result: %d\n", (int)fread(&a, sizeof(A), 1, f));
printf("position after 2nd fread: %ld\n", ftell(f));
PS
In the context of the toy problem, the call to putw followed by fread doesn't make much sense as something that would be done in production (though that's not as important, as its purpose is to illustrate an issue). As such, the above solutions might not address aspects of production code that mixes putw with fread.
Only minimal error handling is shown in the sample code.