Algorithm-RabinKarp

 view release on metacpan or  search on metacpan

lib/Algorithm/RabinKarp.pm  view on Meta::CPAN

use Algorithm::RabinKarp::Util qw(stream_fh stream_string);

use UNIVERSAL;

use constant BASE => 2;

our $VERSION = "0.41";

=head1 NAME

Algorithm::RabinKarp - Rabin-Karp streaming hash

=head1 SYNOPSIS

  my $text = "A do run run run, a do run run";
  my $kgram = Algorithm::RabinKarp->new($window, $text);

or

  my $kgram2 = Algorithm::RabinKarp->new($window, $fh);

lib/Algorithm/RabinKarp.pm  view on Meta::CPAN

  my $needle_hash = $needle->next;
  
  while (my ($hay_hash, @pos) = $haystack->next) {
    warn "Possible match for 'needle' at @pos" 
      if $needle_hash eq $hay_hash;
  }
  
  
=head1 DESCRIPTION

This is an implementation of Rabin and Karp's streaming hash, as
described in "Winnowing: Local Algorithms for Document Fingerprinting" by
Schleimer, Wilkerson, and Aiken.  Following the suggestion of Schleimer,
I am using their second equation:

  $H[ $c[2..$k + 1] ] = (( $H[ $c[1..$k] ] - $c[1] ** $k ) + $c[$k+1] ) * $k

The results of this hash encodes information about the next k values in
the stream (hense k-gram.) This means for any given stream of length n
integer values (or characters), you will get back n - k + 1 hash
values.



( run in 0.245 second using v1.01-cache-2.11-cpan-a5abf4f5562 )