I'm getting some weird behavior in the following function:
char* read_file(const char* filename) {
FILE* file = fopen(filename, "rb"); // error check
fseek(file, 0L, SEEK_END); // error check this too
unsigned int size = ftell(file); // error check
rewind(file);
// I know a char is a single byte, but just for readibility's sake:
char* buffer = malloc(sizeof(char) * size 1); // error check
unsigned int result = fread(buffer, sizeof(char), size, file); // more error check
fclose(file);
// puts(buffer); // see the rest of the question, this is line 16
return buffer;
}
If line 16 stays uncommented, I get undefined behavior in my program, i.e. the string returned was not what I expected, i.e. an error happened. And with error checking I spare the question from having, I'm getting that result
is equal to size
, so fread read the file successfully. Now, if I uncomment line 16 and print the result, everything works and no undefined behavior happens. Why might this be happening?
CodePudding user response:
malloc
allocates uninitialized storage, which means the contents of the returned buffer is not initialized to zero.
You allocate a buffer of size 1
bytes, and then read size
bytes from the file into it. That final byte in the buffer, the one you got because of that 1
? You never wrote anything to it. That string is missing its null terminator, and fread
won't do that for you.
You'll have to add the null terminator yourself, otherwise puts
will hit undefined behaviour trying to print it:
buffer[result] = '\0';
CodePudding user response:
In addition to @Etienne de Martel good answer, OP's code had other weaknesses.
No check for open success
FILE* file = fopen(filename, "rb"); // error check
// add
if (file == NULL) {
fprintf(stderr, "Unable t open file <%s>\n", filename);
return EXIT_FAILURE;
}
No error check, Unneeded L
//fseek(file, 0L, SEEK_END); // error check this too
if (fseek(file, 0, SEEK_END)) {
// add error message here
return EXIT_FAILURE;
}
Wrong type, no error check
// unsigned int size = ftell(file);
// ftell returns a long
long size = ftell(file);
if (size == -1) {
// add error message here
return EXIT_FAILURE;
}
rewind(file); // this is OK
No size error check, conceptual wrong multiplication, no allocation check
// char* buffer = malloc(sizeof(char) * size 1); // error check
if (size < 0 || size >= SIZE_MAX) {
fprintf(stderr, "Size %ld was to large to convert to size_t\n", size);
return EXIT_FAILURE;
}
// Even though it makes no arithmetic difference as sizeof(char) is 1,
// multiply the size of the object by the _sum_
// char* buffer = malloc(sizeof(char) * (size 1));
// Even better as, size by the referenced object, not type
char* buffer = malloc(sizeof buffer[0] * (size 1));
// Add allocation check
if (buffer == NULL) {
// add error message here
return EXIT_FAILURE;
}
Use correct return type, add result check
// unsigned int result = fread(buffer, sizeof(char), size, file);
// fread returns a size_t
// size_t result = fread(buffer, sizeof(char), size, file);
// Even better as
size_t result = fread(buffer, sizeof buffer[0], size, file);
if (result != size) {
// add error message here
return EXIT_FAILURE;
}
fclose(file); // OK
Finally we get to OP's most pressing problem
Code attempts to print using puts(buffer)
which expects buffer
to be a string. The character array buffer[]
is not certainly a string as it may lack a null character.
Various solutions: append a null character.
// Make certain `buffer[]` contains a final '\0'.
buffer[result] = '\0';
// then print
puts(buffer);
Or print using fwrite()
, which does not need buffer[]
as a string.
fwrite(buffer, sizeof buffer[0], result, stdout);
Deeper
Remaining problems:
Data was read using
"rb"
and writing withstdout
(either withputs()
orfwrite()
. This creates issues on some platforms as writing in text_mode (stdout
typical default mode) the binary data"\r\n"
may print as"\r\r\n"
. Work-around: re-openstdout
in binary mode.Data read may contain null characters, in which case
puts()
will stop early at the first'\0'
- not the appended one.fwrite()
avoids this issue.File size may exceed
LONG_MAX
orSIZE_MAX
or exceedmalloc()
available memory. We need to determine file size another way and allocate differently to cope with such huge files.On unusual systems the
fseek(), ftell()
trick does not work. The best solution involves not needing to read the entire file into one array/string, but into many blocks. Leaves details of that for another day.