Read binary file chunk by chunk in C

Time:06-14

I'm new to C and used to Python. Because of the current situation in the world, my primary computer was destroyed and all I am left with is a 12-year-old laptop. Python would not work very well there.

My goal is to create an encryption algorithm. Currently it's fairly simple, but I will upgrade it later.

How can I read a file in chunks and overwrite its data? Here's my code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef unsigned char byte;

void printa(byte a[]) {
    printf("[");
    for (int i = 0; i < strlen(a); i++) {
        printf("%d", a[i]);
        if (i != strlen(a) - 1) {
            printf(", ");
        };
    };
    printf("]\n");
};

int encrypt(byte data[], byte key[], int keyi) {
    for (int i = 0; i < strlen(data); i++) {
        data[i] += key[keyi];
        keyi = (keyi + 1) * (keyi < strlen(key) - 1); //branchless if (keyi<strlen(key)) {keyi++;} else {keyi=0;};
    };
    return keyi;
};

void encrypt_file(byte filename[], byte key[], int buffer_size) {
    FILE *file;
    byte *buff = calloc(sizeof(byte), buffer_size);
    file = fopen(filename, "rb");
    if (file == NULL) {
        printf("UNABLE TO OPEN FILE\n");
    };
};

int main() {
    encrypt_file("test.txt", "key", 1024);
    return 0;
};

My files will be around 2GB in size, but my laptop only has 512MB of RAM, so I have to use chunking.

CodePudding user response:

To encrypt the file on a block basis, you should use fread and fwrite.

To reset the key index keyi at the end of key, comparing it to strlen(key) - 1 is potentially very inefficient because strlen() iterates on the whole string to locate the null terminator. Just test if key[keyi] is a null byte.

Also don't worry about branchless code:

  • any good compiler will generate branchless code for a simple test if (key[keyi] == '\0') keyi = 0;
  • branches are not a real issue on modern processors, as long as they can be consistently predicted.

Here is a modified version:

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef unsigned char byte;

int encrypt(byte data[], size_t size, const byte key[], int keyi) {
    for (size_t i = 0; i < size; i++) {
        data[i] ^= key[keyi++];  // involutory encryption (apply twice to decrypt)
        if (key[keyi] == '\0')
            keyi = 0;
    }
    return keyi;
}

int encrypt_file(const char *filename, const char *filename2,
                 const byte key[], size_t buffer_size)
{
    byte *buff = calloc(sizeof(*buff), buffer_size);
    if (buff == NULL) {
        fprintf(stderr, "Cannot allocate memory\n");
        return 1;
    }
    FILE *f1 = fopen(filename, "rb");
    if (f1 == NULL) {
        fprintf(stderr, "Cannot open file %s: %s\n", filename, strerror(errno));
        free(buff);
        return 1;
    }
    FILE *f2 = fopen(filename2, "wb");
    if (f2 == NULL) {
        fprintf(stderr, "Cannot open file %s: %s\n", filename2, strerror(errno));
        fclose(f1);
        free(buff);
        return 1;
    }
    size_t nread;
    int keyi = 0;
    while ((nread = fread(buff, sizeof(*buff), buffer_size, f1)) != 0) {
        keyi = encrypt(buff, nread, key, keyi);
        if (fwrite(buff, sizeof(*buff), nread, f2) != nread) {
            fprintf(stderr, "Error writing to file %s: %s\n", filename2, strerror(errno));
            fclose(f1);
            fclose(f2);
            free(buff);
            return 1;  // report the write failure to the caller
        }
    }
    fclose(f1);
    fclose(f2);
    free(buff);
    return 0;
}

int main() {
    return encrypt_file("test.txt", "test.out", (const byte *)"key", 1024);
}

You can modify the program to encrypt the file in place, but be aware that if the process is interrupted, a partially encrypted file will be difficult to restore. You should also try and preserve file modification times. A much safer approach is to encrypt the partition with appropriate tools, some of which support hidden partitions and plausible deniability.
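
For illustration only, here is a minimal sketch of what the in-place variant could look like. The function name xor_file_in_place is hypothetical, the XOR step is the same one used in encrypt above, and the sketch assumes a non-empty key and a C library whose fseek can handle the file's size. It carries exactly the risk described above: an interruption leaves the file partially encrypted.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: XOR a file in place, chunk by chunk.
   key must be a non-empty null-terminated string.
   Returns 0 on success, 1 on failure. */
int xor_file_in_place(const char *filename, const unsigned char *key,
                      size_t buffer_size)
{
    FILE *f = fopen(filename, "rb+");   /* read and write, no truncation */
    if (f == NULL)
        return 1;
    unsigned char *buff = malloc(buffer_size);
    if (buff == NULL) {
        fclose(f);
        return 1;
    }
    int status = 0;
    size_t keyi = 0;
    size_t nread;
    while ((nread = fread(buff, 1, buffer_size, f)) != 0) {
        for (size_t i = 0; i < nread; i++) {
            buff[i] ^= key[keyi++];
            if (key[keyi] == '\0')
                keyi = 0;
        }
        /* Step back over the chunk just read, overwrite it, then
           reposition again: a file positioning call is required
           between a write and the next read on the same stream. */
        if (fseek(f, -(long)nread, SEEK_CUR) != 0
        ||  fwrite(buff, 1, nread, f) != nread
        ||  fseek(f, 0, SEEK_CUR) != 0) {
            status = 1;
            break;
        }
    }
    if (ferror(f))
        status = 1;
    if (fclose(f) != 0)
        status = 1;
    free(buff);
    return status;
}
```

Calling the function a second time with the same key restores the original contents, since XOR is its own inverse.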

CodePudding user response:

You want to use fread and fwrite, passing your buff as ptr, sizeof(byte) as size, and buffer_size as nmemb.

size_t fread(void *ptr, size_t size, size_t nmemb, FILE *stream);

size_t fwrite(const void *ptr, size_t size, size_t nmemb,
                     FILE *stream);
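
As a rough sketch of how these arguments fit together, the loop below copies one stream to another in chunks. The helper name copy_in_chunks is made up for this example, and the two streams are assumed to be already open in binary mode; an encryption step would go between the read and the write.

```c
#include <stdio.h>
#include <stdlib.h>

typedef unsigned char byte;

/* Hypothetical helper: copy `in` to `out` in chunks of buffer_size bytes.
   Returns the number of bytes copied, or -1 on error. */
long long copy_in_chunks(FILE *in, FILE *out, size_t buffer_size)
{
    byte *buff = malloc(buffer_size);
    if (buff == NULL)
        return -1;
    long long total = 0;
    size_t nread;
    /* ptr = buff, size = sizeof(byte), nmemb = buffer_size */
    while ((nread = fread(buff, sizeof(byte), buffer_size, in)) != 0) {
        /* Write only the nread elements that were actually read. */
        if (fwrite(buff, sizeof(byte), nread, out) != nread) {
            free(buff);
            return -1;
        }
        total += (long long)nread;
    }
    free(buff);
    /* fread returns 0 at end of file and on error alike. */
    return ferror(in) ? -1 : total;
}
```

Note that fwrite is given nread rather than buffer_size, because the last chunk of a file is usually shorter than the buffer.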

CodePudding user response:

How can I read file in chunks and overwrite its data?

In the first place, if you want to both read and write the same pre-existing file, in place, then open it in mode rb+. Mode rb, which you are currently using, does not permit writing.

In the second place, reading and writing binary data in chunks is exactly the role of the standard fread() and fwrite() functions.

You will also need to reposition the file pointer in order to overwrite data you just read. For that, there are ftell() and fseek() or fgetpos() and fsetpos(). However, I suggest instead avoiding this by writing the output to a different file, and (only) after the new file is fully written and closed, replacing the original with it. This avoids the possibility of being left with a partially-encrypted file in the event that the process fails in the middle.
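
The final replacement step could be sketched as follows. The helper name replace_file is hypothetical, and the fallback branch assumes a platform where rename() refuses to overwrite an existing file (on POSIX systems rename() replaces the target directly). The fallback is not atomic, so treat this as an outline rather than a robust implementation.

```c
#include <stdio.h>

/* Hypothetical helper: replace `original` with the fully written `temp`.
   Call it only after the stream writing `temp` was closed successfully.
   Returns 0 on success, 1 on failure. */
int replace_file(const char *temp, const char *original)
{
    /* On POSIX systems this replaces `original` in a single step. */
    if (rename(temp, original) == 0)
        return 0;
    /* Assumed fallback for libraries that will not rename onto an
       existing file: remove the original first, then rename. */
    if (remove(original) != 0)
        return 1;
    return rename(temp, original) != 0;
}
```

If encryption fails partway through, the temporary file can simply be removed and the original is left untouched.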

Do not neglect to pay attention to all these functions' return values, as they report on errors if they occur. But you absolutely must pay attention to the return value of fread(), because it also reports on the amount of data that was actually read. That will be important at least once for every file you process whose length is not an exact multiple of the selected chunk size.
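
To make the short final chunk concrete, here is a small sketch that records how many bytes each fread() call actually delivered. The helper name chunk_sizes and its parameters are invented for this example. For a 2500-byte file and a 1024-byte buffer it records 1024, 1024 and 452.

```c
#include <stdio.h>

/* Hypothetical helper: read `f` in chunks of at most chunk_size bytes and
   store each chunk's actual length in sizes[] (at most max_chunks entries).
   Returns the number of chunks read, or -1 on a read error. */
int chunk_sizes(FILE *f, unsigned char *buff, size_t chunk_size,
                size_t sizes[], int max_chunks)
{
    int n = 0;
    size_t nread;
    while (n < max_chunks && (nread = fread(buff, 1, chunk_size, f)) != 0) {
        sizes[n++] = nread;  /* may be less than chunk_size for the last chunk */
    }
    /* fread returns 0 both at end of file and on error;
       ferror tells the two cases apart. */
    return ferror(f) ? -1 : n;
}
```

Any processing of the buffer should therefore use the count returned by fread(), never the buffer size.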
