Compress-BGZF

lib/Compress/BGZF/Reader.pm


    # sysread wrapper that checks the returned byte count and returns the
    # data read (internally we should never read off the end of the file;
    # doing so indicates either a software bug or a corrupt input file, so
    # we croak)

    #-------------------------------------------------------------------------
    # ARG 0 : filehandle to read from
    # ARG 1 : bytes to read
    #-------------------------------------------------------------------------
    # RET 0 : string read
    #-------------------------------------------------------------------------

    my ($fh, $len) = @_;
    my $buf = '';
    my $r = sysread $fh, $buf, $len;
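    # sysread returns undef on error, so check that before comparing counts
    croak "Error reading from file: $!" if (! defined $r);
    croak "Returned unexpected byte count" if ($r != $len);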

    return $buf;

}

1;


__END__

=head1 NAME

Compress::BGZF::Reader - Performs blocked GZIP (BGZF) decompression

=head1 SYNOPSIS

    use Compress::BGZF::Reader;

    # Use as filehandle
    my $fh_bgz = Compress::BGZF::Reader->new_filehandle( $bgz_filename );

    # you can do this, but it's probably faster just to pipe gunzip
    while (my $line = <$fh_bgz>) {
        print $line;
    }

    # here's the random-access goodness
    # fetch 32 bytes from uncompressed offset 1001
    seek $fh_bgz, 1001, 0;
    read $fh_bgz, my $data, 32;
    print $data;

    # Use as object
    my $reader = Compress::BGZF::Reader->new( $bgz_filename );

    # Move to a virtual offset (somehow pre-calculated) and read 32 bytes
    $reader->move_to_vo( $virt_offset );
    $data = $reader->read_data(32);
    print $data;

    $reader->write_index( $fn_idx );

=head1 DESCRIPTION

C<Compress::BGZF::Reader> is a module implementing random access to the BGZIP file
format. While it can do sequential/streaming reads, there is really no point
in using it for this purpose over standard GZIP tools/libraries, since BGZIP
is GZIP-compatible.

There are two main modes of construction - as an object (using C<new()>) and
as a filehandle glob (using C<new_filehandle()>). The filehandle mode is
straightforward for general use (emulating seek/read/tell functionality and
passing to other classes/methods that expect a filehandle).  The object mode
has additional features such as seeking to virtual offsets and dumping the
offset index to file.
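
The BGZF format, as described in the SAM specification, defines a virtual
offset as a single 64-bit integer: the upper 48 bits hold the byte offset of
a compressed block within the file, and the lower 16 bits hold the position
within that block's uncompressed data. The following is a minimal sketch of
that arithmetic. It assumes that C<move_to_vo()> follows this convention and
that perl was built with 64-bit integers. The helper names
C<make_virtual_offset> and C<split_virtual_offset> are hypothetical and are
not part of this module.

    use strict;
    use warnings;

    # Hypothetical helpers, not part of Compress::BGZF::Reader.
    # Assumes a perl built with 64-bit integer support.

    # Pack a compressed block offset and a within-block offset into
    # a single BGZF virtual offset.
    sub make_virtual_offset {
        my ($block_offset, $within_block) = @_;
        die "within-block offset must be in 0..65535\n"
            if $within_block < 0 || $within_block > 0xFFFF;
        return ($block_offset << 16) | $within_block;
    }

    # Split a virtual offset back into its two components.
    sub split_virtual_offset {
        my ($virt_offset) = @_;
        return ($virt_offset >> 16, $virt_offset & 0xFFFF);
    }

    my $vo = make_virtual_offset(18_000, 1_001);
    my ($block, $within) = split_virtual_offset($vo);
    printf "vo=%d block=%d within=%d\n", $vo, $block, $within;

In practice virtual offsets usually come from an existing index rather than
being computed by hand, since the compressed block offset has to point at a
real block boundary.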

=head1 METHODS

=head2 Filehandle Functions

=over 4

=item B<new_filehandle>

    my $fh_bgzf = Compress::BGZF::Reader->new_filehandle( $input_fn );

Create a new C<Compress::BGZF::Reader> engine and tie it to an IO::File handle,
which is returned. Takes a single mandatory argument: the name of the file to
be read from.

=item B<< <> >>

=item B<readline>

=item B<seek>

=item B<read>

=item B<tell>

=item B<eof>

    my $line = <$fh_bgzf>;
    my $line = readline $fh_bgzf;
    seek $fh_bgzf, 256, 0;
    read $fh_bgzf, my $buffer, 32;
    my $loc = tell $fh_bgzf;
    print "End of file\n" if eof($fh_bgzf);

These functions emulate the standard Perl functions of the same names.

=back

=head2 Object-oriented Methods

=over 4

=item B<new>

    my $reader = Compress::BGZF::Reader->new( $fn_in );

Create a new C<Compress::BGZF::Reader> engine. Requires a single argument - the
name of the BGZIP file to be read from.

=item B<move_to>

