I have pixels from an image which are stored in a binary file.
I would like to use a function to quickly read this file.
For the moment I have this:
std::vector<int> _data;
std::ifstream file(_rgbFile.string(), std::ios_base::binary);
while (!file.eof())
{
    char singleByte[1];
    file.read(singleByte, 1);
    int b = singleByte[0];
    _data.push_back(b);
}
std::cout << "end" << std::endl;
file.close();
But for 4096 * 4096 * 3 images it already takes noticeable time.
Is it possible to optimize this function?
CodePudding user response:
You could make this faster by reading the whole file in one go, and preallocating the necessary storage in the vector beforehand:
std::ifstream file(_rgbFile.string(), std::ios_base::binary);

std::streampos posStart = file.tellg();
file.seekg(0, std::ios::end);
std::streampos posEnd = file.tellg();
file.seekg(posStart);

std::vector<char> _data;
_data.resize(posEnd - posStart, 0);
file.read(_data.data(), posEnd - posStart);

std::cout << "end" << std::endl;
file.close();
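As a variation (not part of the original answer), the same one-shot read can be written with stream iterators. It is more concise, though typically somewhat slower than seeking to the end and issuing a single read(), since it still consumes the stream character by character:

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Read an entire binary file into a vector in one expression.
// The extra parentheses around the first argument avoid the
// "most vexing parse" (the line being read as a function declaration).
std::vector<char> readAll(const std::string& path)
{
    std::ifstream file(path, std::ios_base::binary);
    return std::vector<char>((std::istreambuf_iterator<char>(file)),
                             std::istreambuf_iterator<char>());
}
```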
Avoiding unnecessary I/O
By reading the whole file in a single read() call you avoid a large number of read calls and the ifstream's per-call buffering overhead. If the file is very large and you don't want to load it all into memory at once, you can instead load smaller chunks of maybe a few MB each.
You also avoid a huge number of function calls: reading a 4096 * 4096 * 3 file byte-by-byte means issuing ifstream::read 50'331'648 times!
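For the large-file case, a chunked read might look like the following sketch (the 1 MiB default chunk size is an arbitrary choice, not from the answer):

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Read a file in fixed-size chunks, appending each chunk to the result.
// Issues few read calls while keeping the read buffer at one chunk.
std::vector<char> readInChunks(const std::string& path,
                               std::size_t chunkSize = 1 << 20) // 1 MiB
{
    std::ifstream file(path, std::ios_base::binary);
    std::vector<char> data;
    std::vector<char> chunk(chunkSize);
    while (file)
    {
        file.read(chunk.data(), static_cast<std::streamsize>(chunk.size()));
        // gcount() is the number of bytes the last read actually produced;
        // the final chunk is usually partial.
        data.insert(data.end(), chunk.begin(), chunk.begin() + file.gcount());
    }
    return data;
}
```

In a real loader you would process or stream out each chunk instead of accumulating them, which is the point of chunking.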
Vector preallocation
std::vector grows dynamically when you insert new elements and no space is left. Each time the vector resizes, it needs to allocate a new, larger memory area and copy all current elements over to the new location. Most vector implementations choose a growth factor between 1.5 and 2, so each resize results in a 1.5-2x larger allocation.
This can be avoided entirely by calling std::vector::reserve or std::vector::resize. With these functions the vector's memory only needs to be allocated once, with at least as many elements as you requested.
Godbolt example
Here's a godbolt example that shows the performance improvement, testing a ~48MB file (4096 * 4096 * 3 bytes):
gcc 11.2, with optimizations disabled:

Old | New
---|---
1300ms | 16ms

gcc 11.2, -O3:

Old | New
---|---
878ms | 13ms
Small bug in the code
As @TedLyngmo has pointed out, your code also contains a small bug: the EOF flag is only set once you have tried to read past the end of the file (see this question). So the last read, the one that sets the EOF bit, didn't actually read a byte, and you end up with one extra element in your array containing uninitialized garbage.
You could fix this by checking for EOF directly after the read:
while (true) {
    char singleByte[1];
    file.read(singleByte, 1);
    if (file.eof())
        break;
    int b = singleByte[0];
    _data.push_back(b);
}
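An equivalent, slightly more idiomatic variation (my own, not from the answer) puts the read in the loop condition itself; this also stops on I/O errors, not just on EOF, so no stale byte is ever pushed:

```cpp
#include <fstream>
#include <string>
#include <vector>

// Read one byte at a time; get() returns a falsy stream once EOF
// (or an error) is hit, so the loop body only runs for real bytes.
std::vector<int> readBytes(const std::string& path)
{
    std::vector<int> data;
    std::ifstream file(path, std::ios_base::binary);
    char c;
    while (file.get(c))
        data.push_back(c); // implicit char -> int conversion
    return data;
}
```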