CS50 recover.c: need explanation on file creation

I watched a solution online and one detail wasn't clear to me.
Which specific command line does the job of creating the jpg files?
I suppose nothing other than sprintf could do that, but how exactly does it do that?
As far as I'm aware, it only prints the name of the file into a memory location but doesn't create the file.

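#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    // Expect exactly one argument: the raw card image
    if (argc != 2)
    {
        printf("Usage: ./recover image\n");
        return 1;
    }
    FILE *raw_file = fopen(argv[1], "r");
    if (raw_file == NULL)
    {
        printf("Could not open file\n");
        return 1;
    }
    unsigned char buffer[512];
    int count = 0;
    FILE *output = NULL;
    char *filename = malloc(8); // room for "000.jpg" plus the terminating '\0'

    // Read the raw file one 512-byte block at a time
    while (fread(buffer, 1, 512, raw_file) == 512)
    {
        // A JPEG signature marks the start of a new image
        if (buffer[0] == 0xff && buffer[1] == 0xd8 && buffer[2] == 0xff && (buffer[3] & 0xf0) == 0xe0)
        {
            sprintf(filename, "%03i.jpg", count);
            output = fopen(filename, "w");
            count++;
        }
        if (output != NULL)
        {
            fwrite(buffer, 1, 512, output);
        }
    }
    free(filename);
    // output is still NULL if no JPEG signature was ever found
    if (output != NULL)
    {
        fclose(output);
    }
    fclose(raw_file);

    return 0;
}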

CodePudding user response:

Which specific command line does the job of creating the jpg files?

Typically we call these "lines of [C] code". The term "command line" usually refers to commands run as whole programs in a terminal (or "console"). There are actually two lines of code responsible for creating the jpg files here. The first is:

output = fopen(filename, "w");

This creates the file and opens it for writing (if a file with that name already exists, it is truncated to zero length). At this point the file is empty, since no data has been written to it yet. filename is just the name of the file to be created. fopen returns a pointer to a FILE stream, which is assigned to the output variable; all later writes go through that stream.
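To see that fopen alone creates the file, here is a minimal sketch. The helper name create_empty_and_measure is made up for illustration and is not part of the original program; it creates a file, closes it without writing anything, and reports the resulting size.

```c
#include <stdio.h>

/* Hypothetical helper (not in the original code): creates or truncates
   the named file, writes nothing, and returns the file's size in bytes,
   or -1 if the file could not be opened. */
long create_empty_and_measure(const char *name)
{
    FILE *f = fopen(name, "w");   /* the file comes into existence here */
    if (f == NULL)
    {
        return -1;
    }
    fclose(f);                    /* nothing was written in between */

    f = fopen(name, "rb");        /* reopen it just to measure its size */
    if (f == NULL)
    {
        return -1;
    }
    fseek(f, 0, SEEK_END);
    long size = ftell(f);
    fclose(f);
    return size;
}
```

Calling it with a name such as "000.jpg" should leave an empty file of that name in the current directory and return 0.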

The contents of the file are written in this line:

fwrite(buffer, 1, 512, output);

The data to be written comes from buffer, which holds the 512 bytes read from the raw file by the fread call in the loop condition.
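Both fread and fwrite return the number of items they actually transferred, so the read-then-write pattern can be sketched with the return values checked. The helper copy_blocks and the BLOCK_SIZE macro below are assumptions for illustration, not part of the original program.

```c
#include <stdio.h>

#define BLOCK_SIZE 512

/* Hypothetical helper: copies whole 512-byte blocks from in to out.
   Returns the number of blocks copied, or -1 if a write falls short.
   A trailing partial block is ignored, as in the original loop. */
long copy_blocks(FILE *in, FILE *out)
{
    unsigned char buffer[BLOCK_SIZE];
    long blocks = 0;

    while (fread(buffer, 1, BLOCK_SIZE, in) == BLOCK_SIZE)
    {
        if (fwrite(buffer, 1, BLOCK_SIZE, out) != BLOCK_SIZE)
        {
            return -1;
        }
        blocks++;
    }
    return blocks;
}
```

The original loop ignores the fwrite return value; checking it, as here, is one way to notice a full disk or a similar write failure.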

I suppose nothing other than sprintf could do that, but how exactly does it do that?

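The sprintf here only builds the filename: it writes the formatted string into the memory that filename points to and never touches the disk. Given a zero-padded format such as "%03i.jpg", the names will be "000.jpg", "001.jpg", "002.jpg" and so on. Each name is 7 characters plus the terminating null character, which is why 8 bytes are allocated.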
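As a small sketch of the name formatting on its own, the following uses snprintf, which takes the buffer size and so cannot overrun it. The helper make_jpeg_name is hypothetical, and the "%03i.jpg" format is assumed from the "000.jpg" naming described above.

```c
#include <stdio.h>

/* Hypothetical helper: formats count as a zero-padded, three-digit
   JPEG filename into out. Returns what snprintf returns: the number
   of characters the full name needs, excluding the terminating '\0'. */
int make_jpeg_name(char *out, size_t size, int count)
{
    return snprintf(out, size, "%03i.jpg", count);
}
```

With an 8-byte buffer, make_jpeg_name(name, 8, 7) stores "007.jpg" in name. No file exists until that name is later passed to fopen.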

fwrite usually does not write directly to the disk, as this would be very inefficient. Instead, the data goes into an in-memory buffer that belongs to the stream. Closing the stream flushes that buffer to the disk, which is what the following line does (it should probably also be called inside the loop, before opening a new output file, so that every file gets closed and not just the last one):

fclose(output);
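One possible way to close each file before the next one is opened is sketched below. The helper switch_output is hypothetical; in the original loop it would stand in for the plain fopen call, for example output = switch_output(output, filename).

```c
#include <stdio.h>

/* Hypothetical helper: closes the current output stream, if there is
   one, so its buffered data is flushed to disk, then opens the next
   file for binary writing. Returns the new stream, or NULL on failure. */
FILE *switch_output(FILE *current, const char *next_name)
{
    if (current != NULL)
    {
        fclose(current);
    }
    return fopen(next_name, "wb");
}
```

After the loop, a single fclose on the last stream (guarded against NULL) is still needed.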

When the raw file is being read, note that there is some check of its contents in place:

if(buffer[0] == 0xff && buffer[1] == 0xd8 && buffer[2] == 0xff && (buffer[3] & 0xf0) == 0xe0)

0xFF 0xD8 is the "Start Of Image" JPEG marker. It is used here to detect that a 512-byte block of the raw file starts a new JPEG. The third and fourth bytes check that an application segment marker (0xFF followed by a byte from 0xE0 to 0xEF) comes next. Have a look at the "Common JPEG markers" table on the JPEG Wikipedia page.
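The same check can be pulled out into a small function, which may make the loop easier to read. This is only a sketch; the name starts_jpeg and the added length parameter are assumptions, not part of the original code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true when the block begins with the
   four-byte JPEG signature tested in the original if statement. */
bool starts_jpeg(const unsigned char *block, size_t len)
{
    return len >= 4
        && block[0] == 0xff
        && block[1] == 0xd8
        && block[2] == 0xff
        && (block[3] & 0xf0) == 0xe0;  /* high nibble must be 0xE */
}
```

The mask 0xf0 keeps only the upper four bits of the fourth byte, so any value from 0xe0 to 0xef passes.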

It's recommended to add the 'b' character (denoting "binary") to the mode string passed to fopen when dealing with binary data, as suggested in the comments. For instance, as explained in this fopen manual:

   The mode string can also include the letter 'b' either as a last
   character or as a character between the characters in any of the
   two-character strings described above.  This is strictly for
   compatibility with C89 and has no effect; the 'b' is ignored on
   all POSIX conforming systems, including Linux.  (Other systems
   may treat text files and binary files differently, and adding the
   'b' may be a good idea if you do I/O to a binary file and expect
   that your program may be ported to non-UNIX environments.)
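A minimal sketch of using "wb" and "rb" follows. The helper binary_round_trip is made up for illustration: it writes a few bytes in binary mode, reads them back, and reports whether they came back unchanged.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes len bytes to the named file in binary
   mode, reads them back in binary mode, and returns 1 if the bytes
   are identical, 0 otherwise (including on any I/O error). */
int binary_round_trip(const char *name, const unsigned char *data, size_t len)
{
    unsigned char back[64];
    if (len > sizeof back)
    {
        return 0;
    }

    FILE *out = fopen(name, "wb");     /* 'b': no newline translation */
    if (out == NULL)
    {
        return 0;
    }
    size_t written = fwrite(data, 1, len, out);
    if (fclose(out) != 0 || written != len)
    {
        return 0;
    }

    FILE *in = fopen(name, "rb");
    if (in == NULL)
    {
        return 0;
    }
    size_t got = fread(back, 1, sizeof back, in);
    fclose(in);

    return got == len && memcmp(back, data, len) == 0;
}
```

On Linux the result is the same with or without 'b', as the manual excerpt says. On systems that translate line endings in text mode, bytes such as 0x0A inside JPEG data are where the two modes could differ.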