I was going through the Stanford algorithms course in which they had mentioned prime no. as a 'quick and dirty' method of hashing. So I was attempting to implement my own hash table class, but I am stuck at finding the best way to get the closest prime number to n (number of 'buckets'). Sieve of Eratosthenes is valid, but will take O(nloglogn) time complexity.
Is there any better way to go about this?
CodePudding user response:
You can use a prime number test like Miller Rabin primality test with a bit of randomness to pick the number of buckets. While the test runs in constant time for a single std::size_t you might have to try a few numbers. But resizing a hashtable is rare.
Alternatively you can just pre-compute a list of suitable primes for use as hashtable size, e.g. the closest prime to 2^n. You can still add randomness to the hash by using (key ^ seed) % prime
.
CodePudding user response:
GCC just uses a precomputed LUT. No reason to recompute primes when you can just store an array and use std::lower_bound
In fact, they use two LUTs. The primary one is an array of all prime number up to 2^64. The other is a small inline one that contains the last prime number <= index i which is useful in constructors and common cases of small dictionaries.