Algorithm-RabinKarp
view release on metacpan or search on metacpan
lib/Algorithm/RabinKarp.pm view on Meta::CPAN
warn "Possible match for 'needle' at @pos"
if $needle_hash eq $hay_hash;
}
=head1 DESCRIPTION
This is an implementation of Rabin and Karp's streaming hash, as
described in "Winnowing: Local Algorithms for Document Fingerprinting" by
Schleimer, Wilkerson, and Aiken. Following the suggestion of Schleimer,
I am using their second equation:
$H[ $c[2..$k + 1] ] = (( $H[ $c[1..$k] ] - $c[1] ** $k ) + $c[$k+1] ) * $k
The results of this hash encodes information about the next k values in
the stream (hense k-gram.) This means for any given stream of length n
integer values (or characters), you will get back n - k + 1 hash
values.
For best results, you will want to create a code generator that filters
your data to remove all unnecessary information. For example, in a large
( run in 0.544 second using v1.01-cache-2.11-cpan-39bf76dae61 )