Home > OS >  How does std::unordered_map find values?
How does std::unordered_map find values?

Time:12-25

When iterating over an unordered map of values, std::unordered_map<Foo, Bar>, it will use an iterator pointing to values instd::pair<Foo, Bar>. This makes it seem like std::unordered_map stores its values internally as std::pairs of values. If I'm not mistaken, std::unordered_map works by hashing the key and using that for lookups. So, if the internal structure is something like a list of pairs, how does it know which pair has the hashed key? Wouldn't it need to hash the whole pair, value included?

If the internal structure does not hold pairs, does calling std::unordered_map::begin() then create a std::pair object using the data in the map? Then how does modifying the data in the pair also modify the data in the actual map itself?

Thank you.

CodePudding user response:

Let's pretend that the map is really a single vector:

std::vector<std::pair<Foo, Bar>> container;

It's easy to visualize that iterating over this container iterates over std::pairs of two classes.

The real unordered map works the same way, except instead of a vector the map stores std::pair<Foo, Bar>s in a hash table, with additional pointers that stitch the whole hash table together, that are not exposed via the map's iterators.

The way that associative containers, like maps, get loosely explained and described -- as a key/value lookup -- makes it sound like in the maps keys and values are stored separately: here's a hash table of keys, and each key points to its corresponding value. However that is not the case. The key and its value are kept together, tightly coupled in discrete std::pair objects, and the map's internal structure arranges them in a hashed table that the iterator knows how to iterate over.

So, if the internal structure is something like a list of pairs, how does it know which pair has the hashed key?

Neither. An unordered map can be loosely described as a:

std::vector<std::list<std::pair<Key, Value>>>

A key's hash is the index in the vector. All Keys in the ith position in this vector have the same hash value. All keys with the same hash are stored in a single linked list.

  • Related