This is the log (truncated) from my analysis of the hashmap:
unsigned nbuckets = octree->Nodes.bucket_count();
LOG(Error, "bucket size = {0}, kk = {1}", nbuckets, octree->Nodes.max_load_factor());
for (auto& x : octree->Nodes) {
LOG(Warning, "Element [{0}:{1}] is in bucket {2}", x.first.morton, x.second.position, octree->Nodes.bucket(x.first));
}
[ 00:19:26.169 ]: [Error] bucket size = 512, kk = 1.0
[ 00:19:26.169 ]: [Warning] Element [132120576:X:384 Y:384 Z:384] is in bucket 0
[ 00:19:26.169 ]: [Warning] Element [115343360:X:128 Y:384 Z:384] is in bucket 0
[ 00:19:26.169 ]: [Warning] Element [98566144:X:384 Y:128 Z:384] is in bucket 0
[ 00:19:26.169 ]: [Warning] Element [81788928:X:128 Y:128 Z:384] is in bucket 0
[ 00:19:26.169 ]: [Warning] Element [65011712:X:384 Y:384 Z:128] is in bucket 0
[ 00:19:26.169 ]: [Warning] Element [48234496:X:128 Y:384 Z:128] is in bucket 0
[ 00:19:26.169 ]: [Warning] Element [31457280:X:384 Y:128 Z:128] is in bucket 0
[ 00:19:26.169 ]: [Warning] Element [16515072:X:192 Y:192 Z:192] is in bucket 0
[ 00:19:26.169 ]: [Warning] Element [14417920:X:64 Y:192 Z:192] is in bucket 0
[ 00:19:26.169 ]: [Warning] Element [12320768:X:192 Y:64 Z:192] is in bucket
....
Map is defined like this:
std::unordered_map<Int3Pos, OctreeNode, Int3Pos::HashFunction> Nodes;
where OctreeNode is a struct with some values. And Int3Pos:
struct Int3Pos // TODO rename
{
public:
Int3_64 position;
uint64 morton;
Int3Pos() : position(Int3_64(0, 0, 0)), morton(0) { }
Int3Pos(Int3_64 c)
{
this->position = c;
this->morton = mrtn(this->position);
}
bool operator==(const Int3Pos& otherPos) const
{
if (this->morton == otherPos.morton) return true;
else return false;
}
struct HashFunction
{
size_t operator()(const Int3Pos& pos) const
{
return pos.morton;
}
};
};
It tells 512 buckets, but all go to 0. Or I am misinterpreting something?
CodePudding user response:
Your hash function is very bad. The greatest common divisor of the morton values you shown is
GCD(132120576, 115343360, 98566144, 81788928, 65011712, 48234496, 31457280, 16515072, 14417920, 12320768) = 2^18 = 262144
It is divided by 512. All values go to the (first) bucket 0.
You can try
return pos.morton % 511; // or pos.morton % 1023, or std::hash<decltype(pos.morton)>{}(pos.morton).
CodePudding user response:
It appears that you use MSVC implementation, which uses power of two sizes of unordered_map
. Some other implementation prefers prime-sized.
The advantage of power-of-two is fast bucket access. Instead of %
operation (modulus), &
operation (bitwise and) can be used.
The advantage of prime-sized is tolerance for not very good hash fuctions like you have.
If you can use other libraries, like boost::unordered_map
, you can try them, to avoid dependency on the specific STL details. But better yet improve the hash function.