During a recent interview, I suggested using vector<pair<int,int>>
over vector<vector<int>>
since we only wanted to store two values for every entry in the vector. I said something to the tune of "we should use vector<pair<int,int>>
over vector<vector<int>>
since the latter is heavier than the former".
After the coding session was over, they said it was a good idea to use pair over a vector and asked me to elaborate what I meant by "heavier" earlier. I wasn't able to elaborate, unfortunately. Yes, I know we can enter only two values in a pair but many more in a vector and that vector gets resized automatically when its size==capacity, etc. but how should I have answered their question - why specifically was using vector<pair<int,int>>
better than vector<vector<int>>
? What extra things are done in the latter case?
CodePudding user response:
Each vector is a single contiguous area of memory, dynamically allocated.
Let's say that you have 1000 values you'll be working with.
std::vector<std::pair<int, int>>
This gets you a single, contiguous block of memory, for 2000 integers.
std::vector<std::vector<int>>
This gets you a single contiguous block of memory for 1000 vectors.
Each one of those 1000 std::vector
s gets you another contiguous block of memory for just two integers.
So, instead of one single contiguous block of memory, for this data structure, it will consist of 1001 blocks of memory scattered all over. You have no guarantees, whatsoever, that all those blocks of memory will be contiguous, one after another.
Each dynamic memory allocation comes at a cost. The cost is fairly small but it adds up very, very quickly. A single penny is easily ignored. A thousand pennies should be enough to get you a cup of coffee at Starbucks.
Furthermore, modern CPUs are very, very good at accessing contiguous blocks of memory. Iterating over a single contiguous block of memory to add up two thousand int
s will be much, much faster than doing the same over a thousand disjointed sections of memory.