Algorithm-BinarySearch-Vec

 view release on metacpan or  search on metacpan

Vec.pm  view on Meta::CPAN


Algorithm::BinarySearch::Vec - binary search functions and basic set operations for vec()-vectors with fast XS implementations

=head1 SYNOPSIS

 use Algorithm::BinarySearch::Vec;
 
 ##-------------------------------------------------------------
 ## Constants
 my $NOKEY   = $Algorithm::BinarySearch::Vec::KEY_NOT_FOUND;
 my $is_fast = $Algorithm::BinarySearch::Vec::HAVE_XS;
 
 ##-------------------------------------------------------------
 ## Search: element-wise
 $index = vbsearch   ($v,$key,$nbits,$lo,$hi);	##-- match only
 $index = vbsearch_lb($v,$key,$nbits,$lo,$hi);	##-- lower bound
 $index = vbsearch_ub($v,$key,$nbits,$lo,$hi);	##-- upper bound
 
 ##-------------------------------------------------------------
 ## Search: array-wise
 $indices = vabsearch   ($v,\@keys,$nbits,$lo,$hi); ##-- match only
 $indices = vabsearch_lb($v,\@keys,$nbits,$lo,$hi); ##-- lower bound
 $indices = vabsearch_ub($v,\@keys,$nbits,$lo,$hi); ##-- upper bound
 
 ##-------------------------------------------------------------
 ## Search: vector-wise
 $ixvec = vvbsearch   ($v,$keyvec,$nbits,$lo,$hi); ##-- match only
 $ixvec = vvbsearch_lb($v,$keyvec,$nbits,$lo,$hi); ##-- lower bound
 $ixvec = vvbsearch_ub($v,$keyvec,$nbits,$lo,$hi); ##-- upper bound
 
 ##-------------------------------------------------------------
 ## Set Operations
 $cv = vunion($av,$bv,$nbits);          ##-- set union
 $cv = vintersect($av,$bv,$nbits);      ##-- set intersection
 $cv = vsetdiff($av,$bv,$nbits);        ##-- set difference
 
 ##-------------------------------------------------------------
 ## Debugging
 $val  = vget($vec,$i,$nbits);
 undef = vset($vec,$i,$nbits, $newval);
 $vals = vec2array($vec,$nbits);


=head1 DESCRIPTION

The Algorithm::BinarySearch::Vec perl module provides binary search functions and
basic set operations for L<vec()|perlfunc/vec-EXPR-OFFSET-BITS>-vectors,
including fast XS implementations in the package C<Algorithm::BinarySearch::Vec::XS>.
The XS implementations are used by default if available, otherwise pure-perl fallbacks are provided.
You can check whether the XS implementations are available on your system by examining the
boolean scalar C<$Algorithm::BinarySearch::Vec::HAVE_XS>.

=cut

##======================================================================
## Data Conventions
=pod

=head2 Data Conventions

All API functions provided by this module assume that the elements of the vec()-style vector arguments
are sorted in strictly ascending order.  The user is responsible for assuring that this is the case,
since no additional checking is done by this module.

=cut

##======================================================================
## Exports
=pod

=head2 Exports

This module support the following export tags:

=over 4

=item :api

Exports all API functions (default).

=item :const

Exports the following constant(s):

=over 4

=item $KEY_NOT_FOUND

Constant returned by search functions indicating that the requested key
was not found, or the requested bound is not within the data vector.

=back

=item :debug

Exports debugging subs for the XS module (vget(), vset()).

=item :all

Exports everything available.

=back

=cut

##======================================================================
## API: Search: element-wise
=pod

=head2 Search: element-wise

=over 4

=item vbsearch($v,$key,$nbits,?$ilo,?$ihi)

Binary search for $key in the vec()-style vector $v, which contains elements
of $nbits bits each, sorted in ascending order.  $ilo and $ihi if specified are
indices to limit the search.  $ilo defaults to 0, $ihi defaults to (8*$nbits/bytes::length($v)),
i.e. the entire vector is to be searched.
Returns the index $i of an element in $v matching $key (C<vec($v,$i,$nbits)==$key>,
with ($ilo E<lt>= $i E<lt> $ihi)),
or $KEY_NOT_FOUND if no such element exists.

=item vbsearch_lb($v,$key,$nbits,?$ilo,?$ihi)

Binary search for the lower-bound of $key in the vec()-style vector $v.
Arguments are as for L<vbsearch()|vbsearch>.

Returns the maximum index $i such that
C<vec($v,$i,$nbits) E<lt>= $key>
and
C<vec($v,$j,$nbits) E<lt> $key> for all $j with $ilo E<lt>= $j E<lt> $i,
or $KEY_NOT_FOUND if no such $i exists (i.e. if C<vec($v,$ilo,$nbits) E<gt> $key>).
In other words,
returns the least index of a match for $key in $v whenever a match exists,
otherwise the greatest index whose value in $v is strictly less than $key if that exists,
and $KEY_NOT_FOUND if all values in $v are strictly greater than $key.

This is equivalent to (but usually much faster than):

 return $KEY_NOT_FOUND if (vec($v,$ilo,$nbits) > $key);
 for (my $i=$ilo; $i < $ihi; $i++) {
   return $i   if (vec($v,$i,$nbits) == $key);
   return $i-1 if (vec($v,$i,$nbits) >  $key);
 }
 return ($ihi-1);

Note that the semantics of this function differ substantially from
the C++ STL function lower_bound().


=item vbsearch_ub($v,$key,$nbits,?$ilo,?$ihi)

Binary search for the upper-bound of $key in the vec()-style vector $v.
Arguments are as for L<vbsearch()|vbsearch>.

Returns the minimum index $i such that
C<vec($v,$i,$nbits) E<gt>= $key>
and
C<vec($v,$j,$nbits) E<gt> $key> for all $j with $i E<lt> $j E<lt> $ihi,
or $KEY_NOT_FOUND if no such $i exists (i.e. if C<vec($v,$ihi-1,$nbits) E<lt> $key>).
In other words,
returns the greatest index of a match for $key in $v whenever a match exists,
otherwise the least index whose value in $v is strictly greater than $key if that exists,
and $KEY_NOT_FOUND if all values in $v are strictly less than $key.

This is equivalent to (but usually much faster than):

 return $KEY_NOT_FOUND if (vec($v,$ihi-1,$nbits) < $key);
 for ($i=$ihi-1; $i >= 0; $i--) {
   return $i   if (vec($v,$i,$nbits) == $key)
   return $i+1 if (vec($v,$i,$nbits) <  $key)
 }
 return $ilo;

Note that the semantics of this function differ substantially from
the C++ STL function upper_bound().

=back

=cut

##======================================================================
## API: Search: array-wise
=pod

=head2 Search: array-wise

=over 4

=item vabsearch($v,\@keys,$nbits,?$ilo,?$ihi)

Binary search for each value in the ARRAY-ref \@keys in the vec()-style vector $v.
Other arguments are as for L<vbsearch()|vbsearch>.
Returns an ARRAY-ref of indices.
This is equivalent to (but usually much faster than):

 $indices = [map {vbsearch($v,$_,$nbits,$ilo,$ihi)} @keys];


=item vabsearch_lb($v,\@keys,$nbits,?$ilo,?$ihi)

Binary search for the lower-bound of each value in the ARRAY-ref \@keys in the vec()-style vector $v.
Other arguments are as for L<vbsearch()|vbsearch>.
Returns an ARRAY-ref of indices.
This is equivalent to (but usually much faster than):

 $indices = [map {vbsearch_lb($v,$_,$nbits,$ilo,$ihi)} @keys];

=item vabsearch_ub($v,\@keys,$nbits,?$ilo,?$ihi)

Binary search for the upper-bound of each value in the ARRAY-ref \@keys in the vec()-style vector $v.
Other arguments are as for L<vbsearch()|vbsearch>.
Returns an ARRAY-ref of indices.
This is equivalent to (but usually much faster than):

 $indices = [map {vbsearch_ub($v,$_,$nbits,$ilo,$ihi)} @keys];

=back

=cut

##======================================================================
## API: Search: vec-wise
=pod

=head2 Search: vec-wise

=over 4

=item vvbsearch($v,$keyvec,$nbits,?$ilo,?$ihi)

Binary search for each key in the key-vector $keyvec in the "haystack"-vector $v.
Other arguments are as for L<vbsearch()|vbsearch>.
Returns a vec()-vector of 32-bit indices.
This is equivalent to (but usually faster than):

 $ixvec = pack('N*', @{vabsearch($v,vec2array($keyvec,$nbits),$nbits,$ilo,$ihi)});

=item vvbsearch_lb($v,$keyvec,$nbits,?$ilo,?$ihi)

Binary lower-bound search for each key in the key-vector $keyvec in the "haystack"-vector $v.
Other arguments are as for L<vbsearch()|vbsearch>.
Returns a vec()-vector of 32-bit indices.
This is equivalent to (but usually faster than):

 $ixvec = pack('N*', @{vabsearch_lb($v,vec2array($keyvec,$nbits),$nbits,$ilo,$ihi)});

=item vvbsearch_ub($v,$keyvec,$nbits,?$ilo,?$ihi)

Binary upper-bound search for each key in the key-vector $keyvec in the "haystack"-vector $v.
Other arguments are as for L<vbsearch()|vbsearch>.
Returns a vec()-vector of 32-bit indices.
This is equivalent to (but usually faster than):

 $ixvec = pack('N*', @{vabsearch_ub($v,vec2array($keyvec,$nbits),$nbits,$ilo,$ihi)});

=back

=cut

##======================================================================
## API: Set Opterations
=pod

=head2 Set Operations

The set operations supported by this module assume that the vec()-style vector sets
are sorted in ascending order, contain no duplicates, and are encoded with
C<$nbits E<gt>= 8>; i.e. every element-boundary must lie on a byte-boundary.
The vector-sets returned by these API functions should also conform to these
conventions whenever the parameters do.

=over 4

=item vunion($av,$bv,$nbits)

Computes the union of two sorted vec()-style sets C<$av> and C<$bv>
and returns the result as a sorted vector-set.  Complexity is I<O>(C<$a> + C<$b>)>.

=item vintersect($av,$bv,$nbits)

Computes the intersection of two sorted vec()-style sets C<$av> and C<$bv>
and returns the result as a sorted vector-set.  Complexity is I<O>(C<$A> * log C<$B>),
where C<$A> is the shorter and C<$B> the longer of the argument vectors C<$a> and C<$b>.

=item vsetdiff($av,$bv,$nbits)

Computes the difference of two sorted vec()-style sets C<$av> and C<$bv>
and returns the result as a sorted vector-set.  Complexity is I<O>(C<$A> * log C<$B>),
where C<$A> is the shorter and C<$B> the longer of the argument vectors C<$a> and C<$b>.

=back

=cut

##======================================================================
## Debugging
=pod

=head2 Debugging and Miscellaneous

=over 4

=item vget($vec,$i,$nbits)

Debugging XS-wrapper equivalent to C<vec($vec,$i,$nbits)>.

=item vset($vec,$i,$nbits,$newval)

Debugging XS-wrapper equivalent to C<vec($vec,$i,$nbits)=$newval>.

=item vec2array($vec,$nbits)

Debugging utility, equivalent to

 [map {vec($vec,$_,$nbits)} (0..(length($vec)*8/$nbits-1))]

=back

=cut

##======================================================================
## Footer
=pod

=head1 SEE ALSO

L<vec() in perlfunc(1)|perlfunc/vec-EXPR-OFFSET-BITS>,
PDL(3perl),
perl(1).

=head1 AUTHOR

Bryan Jurish E<lt>moocow@cpan.orgE<gt>

=head1 COPYRIGHT AND LICENSE

Copyright (C) 2012-2016 by Bryan Jurish

This library is free software; you can redistribute it and/or modify
it under the same terms as Perl itself, either Perl version 5.10.1 or,
at your option, any later version of Perl 5 you may have available.

=cut



( run in 0.635 second using v1.01-cache-2.11-cpan-e1769b4cff6 )