Home > Back-end >  Are KMP and Rabin-Karp Sliding window algorithms for pattern match?
Are KMP and Rabin-Karp Sliding window algorithms for pattern match?

Time:05-09

I am trying to understand the Sliding Window algorithm for pattern match. I came across KMP and Rabin-Karp and these looks to be using Sliding Window way of finding pattern in the text. Can we classify KMP and Rabin-Karp types of Sliding window algorithms ?

CodePudding user response:

Classification is always tricky. We can't really do it without a strict definition of what we are talking about, and such definitions are often rather arbitrary.

That being said, in my head a sliding window algorithm has a fixed-sized window that it shifts along the sequence, in each iteration moving a bit of the left part of the window out and shifting the same amount in from the right. With that definition, Rabin-Karp is clearly a sliding window algorithm. In each iteration, it removes one character on the left and add one from the right to update the hash value.

Knuth-Morris-Pratt doesn't do this. You certainly can think of it as having a window, but this window is a border of the prefix of your pattern that you currently match, and when you can extend the match the window size grows, while when you cannot match, it shrinks to the length of a shorter border. The size varies from zero to the length of the pattern during a run of the algorithm.

That being said, my definition of a sliding window algorithm is arbitrary, and others will surely disagree.

Is it useful to have this definition? I think it is. The sliding window idea is an abstraction that might let us construct new algorithms. Do we lose anything by allowing variable size windows? Perhaps not. But what do KMP and RK have in common that we can abstract to something that might be useful?

I don't think there is a lot. They scan the string from left to right (unlike e.g. Boyer-Moore), but they don't share much beyond that. One is hash based, the other is border based. The details are completely different.

You can always come up with a classification where they both will fit, but unless that class tells you something beyond the trivial, there isn't much use to it, would be my opinion.

  • Related