Question:
Is there a good way to write a 3D float vector of size (9000,9000,4) to an output file in C ?
My C program generates a 9000x9000 image matrix with 4 color values (R, G, B, A) for each pixel. I need to save this data as an output file to be read into a numpy.array() (or similar) using python at a later time. Each color value is saved as a float (can be larger than 1.0) which will be normalized in the python portion of the code.
Currently, I am writing the (9000,9000,4) sized vector into a CSV file with 81 million lines and 4 columns. This is slow for reading and writing and it creates large files (~650MB).
NOTE: I run the program multiple times (up to 20) for each trial, so read/write times and file sizes add up.
Current C Code:
This is the snippet that initializes and writes the 3D vector.
// initializes the vector with data from 'makematrix' class instance
vector<vector<vector<float>>> colorMat = makematrix->getMatrix();
outfile.open("../output/11_14MidRed9k8.csv",std::ios::out);
if (outfile.is_open()) {
outfile << "r,g,b,a\n"; // writes column labels
for (unsigned int l=0; l<colorMat.size(); l ) { // 0 to 8999
for (unsigned int m=0; m<colorMat[0].size(); m ) { // 0 to 8999
outfile << colorMat[l][m][0] << ',' << colorMat[l][m][1] << ','
<< colorMat[l][m][2] << ',' << colorMat[l][m][3] << '\n';
}
}
}
outfile.close();
Summary:
I am willing to change the output file type, the data structures I used, or anything else that would make this more efficient. Any and all suggestions are welcome!
CodePudding user response:
Use the old C file functions and binary format
auto startT = chrono::high_resolution_clock::now();
ofstream outfile;
FILE* f = fopen("example.bin", "wb");
if (f) {
const int imgWidth = 9000;
const int imgHeight = 9000;
fwrite(&imgWidth, sizeof(imgWidth), 1, f);
fwrite(&imgHeight, sizeof(imgHeight), 1, f);
for (unsigned int i=0; i<colorMat.size(); i)
{
fwrite(&colorMat[i], sizeof(struct Pixel), 1, f);
}
}
auto endT = chrono::high_resolution_clock::now();
cout << "Time taken : " << chrono::duration_cast<chrono::seconds>(endT-startT).count() << endl;
fclose(f);
The format is the following :
[ImageWidth][ImageHeight][RGBA][RGBA[RGBA]... for all ImageWidth * ImageHeight pixels.
Your sample ran in 119s in my machine. This code ran in 2s.
But please note that the file will be huge anyway : you are writing the equivalent of two 8K files without any kind of compression.
Besides that, some tips on your code :
- Don't use a vector of floats to represent your pixels. They won't have more components than RGBA. Instead create a simple struct with four floats.
- You don't need to look through width and height separately. Internally all lines are put sequentially one after the other. It is easier to create a one dimension array of width * height size.