I have this Rabin Karp implementation. Now the only thing I'm doing for rolling hash is subtract power*source[i]
from the sourceHash
. power
is 31^target.size()-1 % mod
But I can't understand why we're adding mod
to sourceHash
when it becomes negative. I have tried adding other values but it doesn't work and it only works when we add mod
. Why is this? Is there a specific reason why we're adding mod
and not anything else (like a random big number for example).
int rbk(string source, string target){
int m = target.size();
int n = source.size();
int mod = 128;
int prime = 11;
int power = 1;
int targetHash = 0, sourceHash = 0;
for(int i = 0; i < m - 1; i ){
power =(power*prime) % mod;
}
for(int i = 0; i < target.size(); i ){
sourceHash = (sourceHash*prime source[i]) % mod;
targetHash = (targetHash*prime target[i]) % mod;
}
for(int i = 0; i < n-m 1; i ){
if(targetHash == sourceHash){
bool flag = true;
for(int j = 0; j < m; j ){
if(source[i j] != target[j]){
flag = false;
break;
}
}
if(flag){
return 1;
}
}
if(i < n-m){
sourceHash = (prime*(sourceHash - source[i]*power) source[i m]) % mod;
if(sourceHash < 0){
sourceHash = mod;
}
}
}
return -1;
}
CodePudding user response:
When using modulo arithmetics (mod n)
we have just n
distinct numbers: 0, 1, 2, ..., n - 1
.
All the other numbers which out of 0 .. n - 1
are equal to some number in 0 .. n - 1
:
-n ~ 0
-n 1 ~ 1
-n 2 ~ 2
...
-2 ~ n - 2
-1 ~ n - 1
or
n ~ 0
n 1 ~ 1
n 2 ~ 2
...
2 * n ~ 0
2 * n 1 ~ 0
In general case A ~ B
if and only if (A - B) % n = 0
(here %
stands for remainder).
When implementing Rabin Karp algorithm we can have two potential problems:
- Hash can be too large, we can face integer overflow
- Negative remainder can be implemented in different way on different compilers:
-5 % 3 == -2 == 1
To deal with both problems, we can normalize remainder and operate with numbers within safe 0 .. n - 1
range only.
For arbitrary value A
we can put
A = (A % n n) % n;