Algorithm-BitVector
view release on metacpan or search on metacpan
lib/Algorithm/BitVector.pm view on Meta::CPAN
Since Perl does not expect these two operators to be invoked in a void context, such
statements in your code will elicit a warning from Perl to that effect. If these
warnings annoy you, you can turn them off by surrounding such statements with C<no
warnings "void";> and C<use warnings;> directives. The other option is to invoke
such statements in the following manner:
$bv1 = $bv1 << $n;
$bv2 = $bv1 >> $n;
That works because the overload definitions for these bit shift operators return the
bitvector object on which they are invoked. As it turns out, this also allows for
chained invocation of these operators, as in
$bv1 = $bv1 << 3 << 1 >> 2; # 100011
$bv1 = $bv1 << 2 >> 1 >> 3; # 111000
=item B<Iterating over a bitvector:>
while (<$bv1>) {
print "$_ ";
} # 1 1 1 0 0 0
This is made possible by the overload definition for the C<< <> >> operator. The
C<Algorithm::BitVector> class includes an inner class C<BitVecIterator> for this
purpose.
=back
=head1 CONSTRUCTING BITVECTORS
=over 4
=item B<Constructing an empty bitvector:>
$bv = Algorithm::BitVector->new( size => 0 );
print "$bv\n"; # no output
=item B<Constructing a given sized bitvector of all zeros:>
$bv = Algorithm::BitVector->new( size => 13 ); # 0000000000000
=item B<Constructing a bitvector from an integer value:>
$bv = Algorithm::BitVector->new( intVal => 5006 ); # 1001110001110
The module returns the smallest possible bitvector that would accommodate the
integer value provided with the C<intVal> option.
=item B<Constructing a bitvector by specifying both the size and the integer values:>
As mentioned, with the C<intVal> option, you get the smallest possible bitvector
that can be generated from that integer. If you want a I<larger> bitvector, you can
also supply the C<size> option as shown below:
$bv = Algorithm::BitVector->new( intVal => 5006, size => 16 ); # 0001001110001110
If the value supplied for the C<size> option in such a call is not larger than the
smallest bit array that represents the C<intVal> value, the constructor will throw an
exception.
=item B<Constructing a bitvector from a very large integer:>
use Math::BigInt;
$x = Math::BigInt->new('12345678901234567890123456789012345678901234567890');
$bv = Algorithm::BitVector->new( intVal => $x );
#1000011100100111111101100011011010011010101011111000001111001010000\
#10101000000100110011101000111101011111000110001111111000110010110110\
#01110001111110000101011010010
=item B<Constructing a bitvector from a bit string:>
$bv = Algorithm::BitVector->new( bitstring => '00110011' ); # 00110011
=item B<Constructing a bitvector from an ASCII text string:>
$bv = Algorithm::BitVector->new( textstring => "hello\n" );
# 011010000110010101101100011011000110111100001010
=item B<Constructing a bitvector from a hex string:>
$bv = Algorithm::BitVector->new( hexstring => "68656c6c6f" );
# 0110100001100101011011000110110001101111
=item B<Constructing a bitvector from a bit list:>
$bv = Algorithm::BitVector->new( bitlist => [1, 1, 0, 1] ); # 1101
=item B<Constructing a bitvector from the contents of a disk file:>
$bv = Algorithm::BitVector->new( filename => 'testinput1.txt' );
print "$bv\n"; # Nothing to show yet
$bv1 = $bv->read_bits_from_file(64); # Now you have a bitvector from the
# first 64 bits
Note that it takes two calls to create bitvectors from the contents of a file. The
first merely creates an empty bitvector just to set the necessary file handle for
reading the file. It is the second call in which you invoke the method
C<read_bits_from_file()> that actually returns a bitvector from the bits read from
the file. Each call to C<read_bits_from_file()> in this manner spits out a new bit
vector.
=back
=head1 METHODS
=head3 close_file_handle()
=over 4
When you construct bitvectors by block scanning a disk file, after you are done, you
can call this method to close the file handle that was created to read the file:
$bv = Algorithm::BitVector->new( filename => 'testinput.txt' );
## your code to read bit blocks for constructing bitvectors goes here
$bv->close_file_handle();
lib/Algorithm/BitVector.pm view on Meta::CPAN
$gcd = $bv1->gcd( $bv2 ); # 10
print int($gcd); # 2
The result returned by C<gcd()> is a bitvector.
=back
=head3 gen_random_bits()
=over 4
$bv = Algorithm::BitVector->new( intVal => 0 );
$bv = $bv->gen_random_bits(16); # 1100111001010101
The call to C<gen_random_bits()> returns a bitvector whose bits are randomly
generated. The number of bits in the returned bitvector equals the argument integer.
=back
=head3 get_bit()
=over 4
This method gives you array-like access to the individual bits of a bitvector.
$bv = Algorithm::BitVector->new( bitstring => '10111' );
print $bv->get_bit(0); # 1 (the first bit)
print $bv->get_bit(1); # 0
print $bv->get_bit(4); # 1 (the last bit)
Negative values for the index scan a bitvector from right to left, with the C<-1>
index standing for the last (meaning the right-most) bit in the vector:
print $bv->get_bit(-1); # 1 (the last bit)
print $bv->get_bit(-2); # 1
print $bv->get_bit(-5); # 1 (the first bit)
The C<get_bit()> can also return a slice of a bitvector if the argument to the
method is an anonymous array of the index range you desire, as in the second
statement below:
$bv = Algorithm::BitVector->new( bitstring => "10111011");
my $arr = $bv->get_bit( [3..7] );
print "@$arr\n"; # 1 1 0 1 1
In this example, we want C<get_bit()> to return all bits at positions indexed 3
through 7, both ends inclusive. Note that the slice is returned as an array of bits.
=back
=head3 get_bitvector_in_ascii()
=over 4
$bv = Algorithm::BitVector->new( textstring => "hello" );
print "$bv\n"; # 0110100001100101011011000110110001101111
print $bv->get_bitvector_in_ascrii(); # hello
The method returns a string of ASCII characters by converting successive 8-bit slices
of the bitvector into an ASCII character. It throws an exception if the size of the
bit pattern is not a multiple of 8. Calling this method to create a text-based print
representation of a bit vector makes sense only if you don't expect to see any
unprintable characters in the successive 8-bit slices of the bitvector. Let's say
you have created a bitvector from a text string using the appropriate constructor
call. Subsequently, you encrypted this text string. Next, you or someone else
decrypts the encrypted bit stream. Since what comes out at the decryption end must
be the original text string, it would make sense to invoke this method to recover the
original text.
=back
=head3 get_bitvector_in_hex()
=over 4
Assuming that length of your bitvector is a multiple of 4, this methods returns a
hex representation of the bit pattern:
$bv = Algorithm::BitVector->new(bitstring => "0110100001100101011011000110110001101111");
print $bv->get_bitvector_in_hex(); # 68656c6c6f
The hex representation is returned in the form if a string of hex characters. This
method throws an exception if the size of the bitvector is not a multiple of 4.
=back
=head3 get_slice()
=over 4
You can use this method to get a slice of a BitVector that is within a specified range.
You can specify the index range with the usual range operator in Perl. If the index
range is, say, '5..11', the method will return all bits at index values 5 through 10.
my $bv9 = Algorithm::BitVector->new( intVal => 63437, size => 16 );
print "BitVector for testing get_slice(): $bv9\n"; # 1111011111001101
my $slice_bv = $bv9->get_slice( [5..11] );
print "slice BitVector for index values 5 through 10: $slice_bv\n"; # 111110
Note that the method returns the slice in the form of a BitVector object.
=back
=head3 gf_divide_by_modulus()
=over 4
This method is for modular division in the Galois Field C<GF(2^n)>. You must specify
the modulus polynomial through a bit pattern and also the value of the integer C<n>:
$mod = Algorithm::BitVector->new( bitstring => '100011011' ); # AES modulus
$n = 8;
$a = Algorithm::BitVector->new( bitstring => '11100010110001' );
($quotient, $remainder) = $a->gf_divide_by_modulus($mod, $n);
print "$quotient\n"; # 00000000111010
print "$remainder\n"; # 10001111
What this example illustrates is dividing the bitvector C<$a> by the modulus bit
vector C<$mod>. For a more general division of one bitvector C<$a> by another bit
vector C<$b>, you would carry out a multiplication of C<$a> by the MI of C<$b>, where
MI stands for "multiplicative inverse" as returned by a call to the method
C<gf_MI()>. A call to C<gf_divide_by_modulus()> returns two bitvectors, one for the
quotient and the other for the remainder.
=back
=head3 gf_MI()
=over 4
This method returns the multiplicative inverse of a bit pattern in the Galois Field
C<GF(2^n)>. You must specify both the modulus polynomial through its bit pattern and
the value of C<n>:
$modulus = Algorithm::BitVector->new( bitstring => '100011011' ); # AES modulus
$n = 8;
$a = Algorithm::BitVector->new( bitstring => '00110011' );
print $a->gf_MI($modulus, $n); # 01101100
Note that the multiplicative inverse is returned as a bitvector.
=back
=head3 gf_multiply()
=over 4
This method returns a product of two bit patterns in the Galois Field C<GF(2)> field.
That is, given any two polynomials with their coefficients drawn from the 0 and 1
values in C<GF(2)>, this method returns the product polynomial.
$a = Algorithm::BitVector->new( bitstring => '0110001' );
$b = Algorithm::BitVector->new( bitstring => '0110' );
print $a->gf_multiply($b); #00010100110
As you would expect, in general, the bit pattern returned by this method will be
longer than the lengths of the two operand bitvectors. The result returned by the
method is in the form of a bitvector.
=back
=head3 gf_multiply_modular()
=over 4
This method carries out modular multiplication in the Galois Field C<GF(2^n)>. You
must supply it the bitvector for the modulus polynomial and the value of C<n>.
$modulus = Algorithm::BitVector->new( bitstring => '100011011' ); # AES modulus
$n = 8;
$a = Algorithm::BitVector->new( bitstring => '0110001' );
$b = Algorithm::BitVector->new( bitstring => '0110' );
print $a->gf_multiply_modular($b, $modulus, $n); # 10100110
This example returns the product of the bit patterns C<$a> and C<$b> modulo the bit
pattern C<$modulus> in C<GF(2^8)>. The result returned by the method is in the form
of a bitvector.
=back
=head3 hamming_distance()
=over 4
Hamming distance is commonly used to measure dissimilarity between two bitvectors of
the same size.
$bv1 = Algorithm::BitVector->new( bitstring => '11111111' );
$bv2 = Algorithm::BitVector->new( bitstring => '00101011' );
print $bv1->hamming_distance( $bv2 ); # 4
This distance returns the number of bit positions in which two the bit patterns
differ. The method throws an exception if the two bitvectors are not of the same
length. The value returned is an integer.
=back
=head3 int_value()
=over 4
You can find the integer value of a bitvector by
$bv = Algorithm::BitVector->new( intVal => 5678 );
print $bv3->int_value(); # 5678
or, even more simply by
print int($bv); # 5678
which works on account of the overloading of the C<0+> operator.
=back
=head3 is_power_of_2()
=over 4
You can use this predicate to test if the integer value of a bitvector is a power of
2:
$bv = Algorithm::BitVector->new( bitstring => '10000000001110' );
print int($bv); # 826
print $bv->is_power_of_2(); # 0
The predicate returns 1 for true and 0 for false.
=back
=head3 is_power_of_2_sparse()
=over 4
This does the same thing as the C<is_power_of_2()> method but in a way that makes it
faster for large bitvectors with very few bits set.
$bv = Algorithm::BitVector->new( size => 2000000 );
$bv->set_bit(345234, 1);
print $bv->is_power_of_2_sparse(); # 1
=back
=head3 jaccard_distance()
=over 4
The Jaccard distance between two bitvectors is 1 minus the Jaccard similarity
coefficient:
$bv1 = Algorithm::BitVector->new( bitstring => '11111111' );
$bv2 = Algorithm::BitVector->new( bitstring => '10101011' );
print $bv1->jaccard_distance( $bv2 ); # 0.375
lib/Algorithm/BitVector.pm view on Meta::CPAN
=back
=head3 pad_from_left()
=over 4
You can pad a bitvector from the left with a designated number of zeros:
$bv = Algorithm::BitVector->new( bitstring => '101010' );
print $bv->pad_from_left( 4 ); # 0000101010
The method returns the bitvector on which it is invoked. So you can think of it as
an in-place extension of a bitvector (although, under the hood, the extension is
carried out by giving a new longer C<_vector> attribute to the bitvector object).
=back
=head3 pad_from_right()
=over 4
You can pad a bitvector from the right with a designated number of zeros:
$bv = Algorithm::BitVector->new( bitstring => '101010' );
print $bv->pad_from_right( 4 ); # 1010100000
The method returns the bitvector on which it is invoked. So you can think of it as
an in-place extension of a bitvector (although, under the hood, the extension is
carried out by giving a new longer C<_vector> attribute to the bitvector object).
=back
=head3 permute()
=over 4
You can permute the bits in a bitvector with a permutation list as shown below:
$bv1 = Algorithm::BitVector->new( intVal => 203, size => 8 );
print $bv1; # 11001011
$bv2 = $bv1->permute( [3, 2, 1, 0, 7, 6, 5, 4] );
print $bv2; # 00111101
The method returns a new bitvector with permuted bits.
=back
=head3 rank_of_bit_set_at_index()
=over 4
You can measure the "rank" of a bit that is set at a given index. Rank is the number
of bits that are set up to the argument position, as in
$bv = Algorithm::BitVector->new( bitstring => '01010101011100' );
print $bv->rank_of_bit_set_at_index( 10 ); # 6
The value 6 returned by this call to C<rank_of_bit_set_at_index()> is the number of
bits set up to the position indexed 10 (including that position). The method throws
an exception if there is no bit set at the argument position.
=back
=head3 read_bits_from_file()
=over 4
Constructing bitvectors from the contents of a disk file takes two steps: First you
must make the call shown in the first statement below. The purpose of this call is to
create a file handle that is associated with the variable C<$bv> in this case.
Subsequent invocations of C<read_bits_from_file($n)> on this variable will read
blocks of C<$n> bits and return a bitvector for each block thus read. The variable
C<$n> must be a multiple of 8 for this to work.
$bv = Algorithm::BitVector->new( filename => 'testinput.txt' );
$bv1 = $bv->read_bits_from_file(64);
$bv2 = $bv->read_bits_from_file(64);
...
...
$bv->close_file_handle();
When reading a file as shown above, you can test the attribute C<more_to_read> of the
bitvector object in order to find out if there is more to be read in the file. The
C<while> loop shown below reads all of a file in 64-bit blocks.
$bv = Algorithm::BitVector->new( filename => 'testinput.txt' );
while ($bv->{more_to_read}) {
my $bv_read = $bv->read_bits_from_file( 64 );
print "$bv_read\n";
}
$bv->close_file_handle();
The size of the last bitvector constructed from a file corresponds to how many bytes
remain unread in the file at that point. It is your responsibility to zero-pad the
last bitvector appropriately if, say, you are doing block encryption of the whole
file.
=back
=head3 reset()
=over 4
You can reset a previously constructed bitvector all either all 1's or all 0's by
calling this method:
$bv = Algorithm::BitVector->new( intVal => 203, size => 8 );
print $bv; # 11001011
$bv->reset(1);
print $bv; # 11111111
$bv->reset(0);
print $bv; # 00000000
What the method accomplishes can be thought of as in-place resetting of the bits. The
method does not return anything.
=back
=head3 reverse()
lib/Algorithm/BitVector.pm view on Meta::CPAN
in the fifth statement above works.
=back
=head3 shift_right()
=over 4
If you want to shift a bitvector non-circularly to the right, this is the method to
call:
$bv = Algorithm::BitVector->new( bitlist => [1,1, 1, 0, 0, 1] );
$bv->shift_right(3);
print $bv; # 000111
$bv = Algorithm::BitVector->new( bitlist => [1,1, 1, 0, 0, 1] );
$bv->shift_right(3)->shift_right(2);
print $bv; # 000001
As the bitvector is shifted non-circularly to the right, the exposed bit positions on
the left are filled with zeros. Note that the method returns the bitvector object
on which it is invoked. That is the reason why the chained invocation of the method
in the fifth statement above works.
=back
=head3 test_for_primality()
=over 4
If the integer value of a bitvector is small (meaning smaller than C<< 0x7f_ff_ff_ff >>),
you can use this method to test it for its primality through the Miller-Rabin
probabilistic test:
$p = 7417;
$bv = Algorithm::BitVector->new( intVal => $p );
$check = $bv->test_for_primality();
print "The primality test for $p: $check\n";
# The primality test for 7417: is a prime with probability 0.99993896484375
The method returns one of two strings: If the primality test succeeds, the method
returns a string like "C<is a prime with probability xxxxx>". And if the test fails,
the method returns the string "C<is NOT a prime>".
=back
=head3 unpermute()
=over 4
This method reverses the permutation carried out by a call to the C<permute()> method
as shown below:
$bv1 = Algorithm::BitVector->new( intVal => 203, size => 8 );
print $bv1; # 11001011
$bv2 = $bv1->permute( [3, 2, 1, 0, 7, 6, 5, 4] );
print $bv2; # 00111101
$bv3 = $bv2->unpermute( [3, 2, 1, 0, 7, 6, 5, 4] );
print $bv3; # 11001011
The method returns a new bitvector with unpermuted bits. Also note that the method
throws an exception if the permutation list is not as long as the size of the
bitvector on which the method is invoked.
=back
=head3 write_to_file()
=over 4
This method writes the bitvectors in your program to a disk file:
$bv1 = Algorithm::BitVector->new( bitstring => '00001010' );
open my $FILEOUT, ">test.txt";
$bv1->write_to_file( $FILEOUT );
close $FILEOUT;
The size of a bitvector must be a multiple of 8 for this write method to work. If
this condition is not met, the method will throw an exception.
B<Important for Windows Users:> When writing an internally generated bitvector out
to a disk file, it may be important to open the file in the binary mode, since
otherwise the bit pattern `00001010' ('\n') in your bitstring will be written out as
0000110100001010 ('\r\n') which is the line break on Windows machines.
=back
=head1 REQUIRED
This module imports the following modules:
Math::BigInt
List::Util
Math::Random
Fcntl
=head1 THE C<Examples> DIRECTORY
The C<Examples> directory contains the following script that invokes all of the
functionality of this module:
BitVectorDemo.pl
In case there is any doubt about how exactly to invoke a method or how to use an
operator, please look at the usage in this script.
=head1 EXPORT
None by design.
=head1 BUGS
Please notify the author if you encounter any bugs. When sending email, please place
the string 'BitVector' in the subject line.
=head1 INSTALLATION
Download the archive from CPAN in any directory of your choice. Unpack the archive
with a command that on a Linux machine would look like:
tar zxvf Algorithm-BitVector-1.26.tar.gz
This will create an installation directory for you whose name will be
C<Algorithm-BitVector-1.26>. Enter this directory and execute the following commands
for a standard install of the module if you have root privileges:
perl Makefile.PL
make
make test
sudo make install
If you do not have root privileges, you can carry out a non-standard install the
module in any directory of your choice by:
perl Makefile.PL prefix=/some/other/directory/
make
make test
make install
( run in 0.987 second using v1.01-cache-2.11-cpan-98e64b0badf )