Compress-BGZF
lib/Compress/BGZF/Reader.pm
# sysread wrapper that checks the returned byte count and returns the data read
# (internally we should never read off the end of the file - doing so indicates
# either a software bug or a corrupt input file, so we croak)
#-------------------------------------------------------------------------
# ARG 0 : filehandle to read from
# ARG 1 : number of bytes to read
#-------------------------------------------------------------------------
# RET 0 : string read
#-------------------------------------------------------------------------
my ($fh, $len) = @_;
my $buf = '';
my $r = sysread $fh, $buf, $len;
# sysread returns undef on error, so check for that before comparing counts
croak "Returned unexpected byte count" if (! defined $r || $r != $len);
return $buf;
}
1;
__END__
=head1 NAME
Compress::BGZF::Reader - Performs blocked GZIP (BGZF) decompression
=head1 SYNOPSIS
use Compress::BGZF::Reader;
# Use as filehandle
my $fh_bgz = Compress::BGZF::Reader->new_filehandle( $bgz_filename );
# you can do this, but it's probably faster just to pipe gunzip
while (my $line = <$fh_bgz>) {
print $line;
}
# here's the random-access goodness
# fetch 32 bytes from uncompressed offset 1001
seek $fh_bgz, 1001, 0;
read $fh_bgz, my $data, 32;
print $data;
# Use as object
my $reader = Compress::BGZF::Reader->new( $bgz_filename );
# Move to a virtual offset (somehow pre-calculated) and read 32 bytes
$reader->move_to_vo( $virt_offset );
my $chunk = $reader->read_data(32);
print $chunk;
$reader->write_index( $fn_idx );
=head1 DESCRIPTION
C<Compress::BGZF::Reader> is a module implementing random access to the BGZIP file
format. While it can do sequential/streaming reads, there is really no point
in using it for this purpose over standard GZIP tools/libraries, since BGZIP
is GZIP-compatible.
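As an illustration of that compatibility, a BGZF file is a series of concatenated gzip members, so a generic gzip library can stream it as long as it reads past the first member. The sketch below uses the core C<IO::Compress::Gzip> and C<IO::Uncompress::Gunzip> modules on an in-memory stand-in rather than a real BGZF file (real BGZF blocks also carry an extra header field that is not written here), so treat it as a demonstration of multi-member reading only.

```perl
use strict;
use warnings;
use IO::Compress::Gzip     qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

# Build a two-member gzip stream in memory. This is only a stand-in for
# BGZF blocks: each chunk becomes its own gzip member.
my @chunks = ("first block\n", "second block\n");
my $stream = '';
for my $chunk (@chunks) {
    gzip \$chunk => \$stream, Append => 1
        or die "gzip failed: $GzipError";
}

# MultiStream => 1 makes the decompressor continue past the first member
# instead of stopping at the end of it.
my $plain = '';
gunzip \$stream => \$plain, MultiStream => 1
    or die "gunzip failed: $GunzipError";

print $plain;
```

Without C<< MultiStream => 1 >>, only the first member would be decompressed, which is the usual pitfall when streaming block-compressed files through a generic gzip reader.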
There are two main modes of construction - as an object (using C<new()>) and
as a filehandle glob (using C<new_filehandle()>). The filehandle mode is
straightforward for general use: it emulates seek/read/tell functionality and
can be passed to other classes/methods that expect a filehandle. The object mode
has additional features such as seeking to virtual offsets and dumping the
offset index to file.
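For readers unfamiliar with virtual offsets, the BGZF specification defines one as a 64-bit value holding the compressed-file offset of a block in the upper 48 bits and the offset within that block's uncompressed data in the lower 16 bits. The following is a minimal sketch of that arithmetic. The helper names C<make_virtual_offset> and C<split_virtual_offset> are hypothetical and not part of this module, and it is an assumption that values built this way are what C<move_to_vo()> expects.

```perl
use strict;
use warnings;

# Hypothetical helper: pack a block's compressed-file offset and an offset
# inside its uncompressed data into one virtual offset.
# Assumes a perl built with 64-bit integer support.
sub make_virtual_offset {
    my ($block_offset, $in_block_offset) = @_;
    die "in-block offset out of range"
        if $in_block_offset < 0 || $in_block_offset > 0xFFFF;
    return ($block_offset << 16) | $in_block_offset;
}

# Hypothetical helper: the inverse of make_virtual_offset.
sub split_virtual_offset {
    my ($vo) = @_;
    return ($vo >> 16, $vo & 0xFFFF);
}

my $vo = make_virtual_offset(4096, 137);
my ($block, $within) = split_virtual_offset($vo);
printf "vo=%d block=%d within=%d\n", $vo, $block, $within;
```

Because the in-block part is limited to 16 bits, an uncompressed block can hold at most 64 KiB, which is why virtual offsets normally come from an index or from an earlier pass over the file rather than from simple arithmetic on uncompressed positions.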
=head1 METHODS
=head2 Filehandle Functions
=over 4
=item B<new_filehandle>
my $fh_bgzf = Compress::BGZF::Reader->new_filehandle( $input_fn );
Create a new C<Compress::BGZF::Reader> engine and tie it to an C<IO::File> handle,
which is returned. Takes a single mandatory argument: the name of the file to be
read from.
=item B<< <> >>
=item B<readline>
=item B<seek>
=item B<read>
=item B<tell>
=item B<eof>
my $line = <$fh_bgzf>;
my $line = readline $fh_bgzf;
seek $fh_bgzf, 256, 0;
read $fh_bgzf, my $buffer, 32;
my $loc = tell $fh_bgzf;
print "End of file\n" if eof($fh_bgzf);
These functions emulate the standard Perl functions of the same names.
=back
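Because the handle emulates the standard functions, code written against an ordinary seekable filehandle should not need to know that it is reading BGZF data. The sketch below shows a hypothetical helper, C<fetch_range>, exercised on an in-memory handle. Passing it a handle returned by C<new_filehandle> instead is assumed to behave the same way, per the emulation described above, but that is not tested here.

```perl
use strict;
use warnings;

# Hypothetical helper: read $len bytes starting at $offset from any
# seekable filehandle, using only seek and read.
sub fetch_range {
    my ($fh, $offset, $len) = @_;
    seek $fh, $offset, 0 or die "seek failed: $!";
    my $got = read $fh, my $buf, $len;
    die "read failed: $!" unless defined $got;
    return $buf;
}

# Demonstration data: 100 fixed-width records of 5 bytes each.
my $text = join '', map { sprintf "%04d\n", $_ } 0 .. 99;
open my $fh, '<', \$text or die "open failed: $!";

my $chunk = fetch_range($fh, 50, 10);   # records 10 and 11
print $chunk;
print "position now: ", tell($fh), "\n";
```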
=head2 Object-oriented Methods
=over 4
=item B<new>
my $reader = Compress::BGZF::Reader->new( $fn_in );
Create a new C<Compress::BGZF::Reader> engine. Requires a single argument - the
name of the BGZIP file to be read from.
=item B<move_to>