Home > OS >  Extremely weird behavior in fread(); read entire string when printing string but says it did not rea
Extremely weird behavior in fread(); read entire string when printing string but says it did not rea

Time:01-03

I'm getting some weird behavior in the following function:

char* read_file(const char* filename) {
  FILE* file = fopen(filename, "rb"); // error check

  fseek(file, 0L, SEEK_END); // error check this too
  unsigned int size = ftell(file); // error check

  rewind(file);

  // I know a char is a single byte, but just for readibility's sake:

  char* buffer = malloc(sizeof(char) * size   1); // error check

  unsigned int result = fread(buffer, sizeof(char), size, file); // more error check

  fclose(file);
  // puts(buffer); // see the rest of the question, this is line 16
  return buffer;
}

If line 16 stays uncommented, I get undefined behavior in my program, i.e. the string returned was not what I expected, i.e. an error happened. And with error checking I spare the question from having, I'm getting that result is equal to size, so fread read the file successfully. Now, if I uncomment line 16 and print the result, everything works and no undefined behavior happens. Why might this be happening?

CodePudding user response:

malloc allocates uninitialized storage, which means the contents of the returned buffer is not initialized to zero.

You allocate a buffer of size 1 bytes, and then read size bytes from the file into it. That final byte in the buffer, the one you got because of that 1? You never wrote anything to it. That string is missing its null terminator, and fread won't do that for you.

You'll have to add the null terminator yourself, otherwise puts will hit undefined behaviour trying to print it:

buffer[result] = '\0';

CodePudding user response:

In addition to @Etienne de Martel good answer, OP's code had other weaknesses.

No check for open success

  FILE* file = fopen(filename, "rb"); // error check
  
  // add
  if (file == NULL) {
    fprintf(stderr, "Unable t open file <%s>\n", filename);
    return EXIT_FAILURE;
  }  

No error check, Unneeded L

  //fseek(file, 0L, SEEK_END); // error check this too

  if (fseek(file, 0, SEEK_END)) {
    // add error message here
    return EXIT_FAILURE;
  }

Wrong type, no error check

  // unsigned int size = ftell(file);

  // ftell returns a long
  long size = ftell(file);
  if (size == -1) {
    // add error message here
    return EXIT_FAILURE;
  }

  rewind(file);  // this is OK

No size error check, conceptual wrong multiplication, no allocation check

  // char* buffer = malloc(sizeof(char) * size   1); // error check

  if (size < 0 || size >= SIZE_MAX) {
    fprintf(stderr, "Size %ld was to large to convert to size_t\n", size);
    return EXIT_FAILURE;
  }

  // Even though it makes no arithmetic difference as sizeof(char) is 1,
  // multiply the size of the object by the _sum_
  // char* buffer = malloc(sizeof(char) * (size   1));
  
  // Even better as, size by the referenced object, not type 
  char* buffer = malloc(sizeof buffer[0] * (size   1));

  // Add allocation check
  if (buffer == NULL) {
    // add error message here
    return EXIT_FAILURE;
  }

Use correct return type, add result check

  // unsigned int result = fread(buffer, sizeof(char), size, file);
  // fread returns a size_t
  // size_t result = fread(buffer, sizeof(char), size, file);
  // Even better as
  size_t result = fread(buffer, sizeof buffer[0], size, file);
   
  if (result != size) {
    // add error message here
    return EXIT_FAILURE;
  }
      
  fclose(file); // OK

Finally we get to OP's most pressing problem

Code attempts to print using puts(buffer) which expects buffer to be a string. The character array buffer[] is not certainly a string as it may lack a null character.

Various solutions: append a null character.

  // Make certain `buffer[]` contains a final '\0'.
  buffer[result] = '\0';
  // then print 
  puts(buffer);

Or print using fwrite(), which does not need buffer[] as a string.

  fwrite(buffer, sizeof buffer[0], result, stdout);

Deeper

Remaining problems:

  • Data was read using "rb" and writing with stdout (either with puts() or fwrite(). This creates issues on some platforms as writing in text_mode (stdout typical default mode) the binary data "\r\n" may print as "\r\r\n". Work-around: re-open stdout in binary mode.

  • Data read may contain null characters, in which case puts() will stop early at the first '\0' - not the appended one. fwrite() avoids this issue.

  • File size may exceed LONG_MAX or SIZE_MAX or exceed malloc() available memory. We need to determine file size another way and allocate differently to cope with such huge files.

  • On unusual systems the fseek(), ftell() trick does not work. The best solution involves not needing to read the entire file into one array/string, but into many blocks. Leaves details of that for another day.

  • Related