Algorithm-RabinKarp
lib/Algorithm/RabinKarp.pm view on Meta::CPAN
use Algorithm::RabinKarp::Util qw(stream_fh stream_string);
use UNIVERSAL;
use constant BASE => 2;
our $VERSION = "0.41";
=head1 NAME
Algorithm::RabinKarp - Rabin-Karp streaming hash
=head1 SYNOPSIS
my $text = "A do run run run, a do run run";
my $kgram = Algorithm::RabinKarp->new($window, $text);
or
my $kgram2 = Algorithm::RabinKarp->new($window, $fh);
my $needle_hash = $needle->next;
while (my ($hay_hash, @pos) = $haystack->next) {
warn "Possible match for 'needle' at @pos"
if $needle_hash eq $hay_hash;
}
=head1 DESCRIPTION
This is an implementation of Rabin and Karp's streaming hash, as
described in "Winnowing: Local Algorithms for Document Fingerprinting" by
Schleimer, Wilkerson, and Aiken. Following the suggestion of Schleimer,
I am using their second equation:
$H[ $c[2..$k+1] ] = (( $H[ $c[1..$k] ] - $c[1] * $b ** $k ) + $c[$k+1] ) * $b
where $b is the base of the hash (this module defines BASE as 2).
Each hash value encodes information about the next k values in
the stream (hence "k-gram"). This means that for any stream of n
integer values (or characters), you will get back n - k + 1 hash
values.
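The recurrence and the n - k + 1 count described above can be sketched in a few lines of plain Perl. This is an illustration only, not the module's implementation: kgram_hashes is a hypothetical helper that is not part of the Algorithm::RabinKarp API, it hashes character codes obtained with ord, and it applies no modulus, so it assumes k is small enough for the values to fit in a native integer.

```perl
use strict;
use warnings;

use constant BASE => 2;    # same base the module declares

# Hypothetical helper, not part of Algorithm::RabinKarp's API.
# Returns one rolling hash value per k-gram in $text.
sub kgram_hashes {
    my ($k, $text) = @_;
    my @c = map { ord } split //, $text;
    return () if $k < 1 || @c < $k;

    # Initial window: H(c1..ck) = c1*b^k + c2*b^(k-1) + ... + ck*b
    my $hash = 0;
    $hash = ($hash + $c[$_]) * BASE for 0 .. $k - 1;

    my $high   = BASE ** $k;    # weight of the value leaving the window
    my @hashes = ($hash);
    for my $i ($k .. $#c) {
        # H(c2..ck+1) = ((H(c1..ck) - c1 * b^k) + ck+1) * b
        $hash = (($hash - $c[$i - $k] * $high) + $c[$i]) * BASE;
        push @hashes, $hash;
    }
    return @hashes;
}

my $text   = "A do run run run, a do run run";
my @hashes = kgram_hashes(3, $text);

# One hash per window position: n - k + 1 values in total.
printf "%d characters, %d hashes\n", length($text), scalar @hashes;
```

Because each update only removes the oldest value and adds the newest, every hash after the first costs a constant amount of work regardless of k. Identical k-grams always produce identical hashes, so the two leading occurrences of "run" in the sample text yield equal entries in @hashes.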