Say we have a hash function H, and two byte strings a and b (which might be long, e.g. several MiB in size, so we want to avoid hashing them again). We already know the values of H(a) and H(b), and want to calculate H(a + b) (the hash of the two strings concatenated together).
We'd like to have a function F that can calculate H(a + b) from H(a), H(b) and any other properties of a and b we can calculate beforehand (e.g. lengths), and that takes less time than just hashing the whole string.
The hash function H doesn't need to be cryptographic, but should be good enough for HashMaps or similar uses.
Do such functions H and F exist? Or what should I search for or research if I'd like to find out?
CodePudding user response:
Java's string hash is s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1] (modulo 2^32, i.e. with 32-bit int overflow).
A property of this hash is that H(a + b) = 31^(b.length()) * H(a) + H(b).
You can compute 31^(b.length()) using exponentiation by squaring in logarithmic time. If you wish to precompute, you can precompute 31^length for each of your strings and store it with the precomputed hash.
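As a minimal sketch of that idea: the class name ConcatHash and the helpers pow31 and combine below are hypothetical names made up for illustration, not part of the Java library. The sketch relies on String.hashCode(), which works on chars rather than raw bytes, so for byte strings you would apply the same polynomial to your own byte-based hash.

```java
final class ConcatHash {
    // Computes 31^n modulo 2^32 by exponentiation by squaring.
    // Java int multiplication wraps on overflow, which gives the modulo for free.
    static int pow31(int n) {
        int result = 1;
        int base = 31;
        while (n > 0) {
            if ((n & 1) == 1) {
                result *= base;
            }
            base *= base;
            n >>>= 1;
        }
        return result;
    }

    // Combines the hashes of a and b into the hash of a + b,
    // using only H(a), H(b) and the length of b.
    static int combine(int hashA, int hashB, int lengthB) {
        return pow31(lengthB) * hashA + hashB;
    }

    public static void main(String[] args) {
        String a = "hello, ";
        String b = "world";
        int combined = combine(a.hashCode(), b.hashCode(), b.length());
        // Compare against hashing the concatenated string directly.
        System.out.println(combined == (a + b).hashCode());
    }
}
```

The combine step costs O(log(length of b)) multiplications, or a single multiplication if 31^length is stored next to each precomputed hash.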
CodePudding user response:
You need to say more about what you expect from your hash function. Otherwise my answer is: use bit parity as the hash, because B(a + b) = (B(a) + B(b)) mod 2.
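To make the parity example concrete, here is a small sketch. The class ParityHash and its methods are hypothetical names for illustration. It shows that the property holds trivially, and also why it is not enough on its own: the hash only ever takes the values 0 and 1.

```java
final class ParityHash {
    // Parity (0 or 1) of the number of set bits across all bytes.
    // XOR-ing the bytes together preserves the overall bit parity.
    static int parity(byte[] data) {
        int acc = 0;
        for (byte x : data) {
            acc ^= x;
        }
        return Integer.bitCount(acc & 0xFF) & 1;
    }

    // B(a + b) = (B(a) + B(b)) mod 2, which is the same as XOR on single bits.
    static int combine(int parityA, int parityB) {
        return parityA ^ parityB;
    }
}
```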