Home > Software engineering >  In C , sorting a vector of big struct would be slow?
In C , sorting a vector of big struct would be slow?

Time:02-20

I thought when sorting a vector of structs, the struct elements would be moved around, or the copy-constructor would be called, which will bring in some overhead, so it should be slow.

For example,

struct big_struct_t {
  // with many data members, just to name a few
  int val;
  vector<string> strs;
  ...
};

int main() {
  vector<big_struct_t> V;
  ... // populate V with 10k elements for example

  sort(V.begin(), V.end(), [](const big_struct_t& lhs, const big_struct_t& rhs) {
    return lhs.val < rhs.val;
  });
}

But after testing the above code, seems like no copy-constructor called at all during sorting. So I wonder how the sort actually works? It doesn't need to move the elements around at all inside the vector?

CodePudding user response:

Like the comments said, the performance of std::sort is highly dependent on the elements.

std::vector stores elements contiguously. The only way to change the order of elements is to change their values, by copying, moving or swapping them.

std::sort works by comparing and swapping elements in a random-accessible range (such as a std::vector), which, consequently, must also satisfy Benchmark results

Benchmark on quick-bench.com.

As you can see, sorting a std::vector<bigger_struct_t> is 49x slower than std::vector<big_struct_t>, due to the std::array.

Sorting both std::list<bigger_struct_t> and std::list<big_struct_t> is much faster than std::vector<bigger_struct_t>, since they don't have to deal with the std::array.

Both are also considerably slower than std::vector<big_struct_t>, so avoid using std::list unless you need to.

PS: I'm not sure why std::list<bigger_struct_t> is slower than std::list<big_struct_t>. Maybe a problem with the benchmark? NOTE: See @FrançoisAndrieux's comment about the likely culprit being cache misses.

  • Related