This is more of an intellectual exercise, but are there battle-tested C libraries implementing hash map/set (std::unordered_map
, std::unordered_set
), red-back trees (std::map
, std::set
) using std::vector
s?
Take std::unordered_set
for instance. My idea is to use 3 vectors, X
(stores heads of chains), Y
(stores set elements), Z
(temporary container).
Roughly speaking, given a set element 12345
,
Let
i = X[hash(12345) % X.size()]
. ThenY[i]
is the head of the chain that12345
lives on.Y[i]
is a pair(val, j)
.val
is some element value.Y[j]
is the next item on chain.Once
12345
is found, deleting it can leave a "hole" inY
.A new element will be pushed back to
Y
andX
will be adjusted accordingly.If the number of "holes" in
Y
exceeds, e.g. 50% ofY.size()
, adjustX
andY
globally to erase all the "holes", during whichZ
might be needed as a temporary storage.
The idea applies to trees in std::set
and std::map
. Of course many other details need to be carefully taken care of.
Has anybody tried something like this? The motivation is to keep the data structure as compact as possible, and to avoid memory allocations as much as possible. I imagine this will yield some good speedup for small and medium size applications -- or maybe I am wrong?
Thanks!
CodePudding user response:
Yes, there are. Google dense_hash_map is one of such example.
There is an immense variety of hash maps and tables built with purpose-specific requirements like cache locality, size, read speed, write speed. As speed is highly dependent on cache locality, it is very common for these implementations to use vectors as backend storage.
Have a look at this shootout between hashmaps and browse through every one of them.