CPANPLUS

 view release on metacpan or  search on metacpan

inc/bundle/Archive/Tar.pm  view on Meta::CPAN

An object of class Archive::Tar represents a .tar(.gz) archive full
of files and things.

=head1 Object Methods

=head2 Archive::Tar->new( [$file, $compressed] )

Returns a new Tar object. If given any arguments, C<new()> calls the
C<read()> method automatically, passing on the arguments provided to
the C<read()> method.

If C<new()> is invoked with arguments and the C<read()> method fails
for any reason, C<new()> returns undef.

=cut

my $tmpl = {
    _data   => [ ],
    _file   => 'Unknown',
};

### install get/set accessors for this object.
for my $key ( keys %$tmpl ) {
    no strict 'refs';
    *{__PACKAGE__."::$key"} = sub {
        my $self = shift;
        $self->{$key} = $_[0] if @_;
        return $self->{$key};
    }
}

sub new {
    my $class = shift;
    $class = ref $class if ref $class;

    ### copying $tmpl here since a shallow copy makes it use the
    ### same aref, causing for files to remain in memory always.
    my $obj = bless { _data => [ ], _file => 'Unknown', _error => '' }, $class;

    if (@_) {
        unless ( $obj->read( @_ ) ) {
            $obj->_error(qq[No data could be read from file]);
            return;
        }
    }

    return $obj;
}

=head2 $tar->read ( $filename|$handle, [$compressed, {opt => 'val'}] )

Read the given tar file into memory.
The first argument can either be the name of a file or a reference to
an already open filehandle (or an IO::Zlib object if it's compressed)

The C<read> will I<replace> any previous content in C<$tar>!

The second argument may be considered optional, but remains for
backwards compatibility. Archive::Tar now looks at the file
magic to determine what class should be used to open the file
and will transparently Do The Right Thing.

Archive::Tar will warn if you try to pass a bzip2 / xz compressed file and the
IO::Uncompress::Bunzip2 / IO::Uncompress::UnXz are not available and simply return.

Note that you can currently B<not> pass a C<gzip> compressed
filehandle, which is not opened with C<IO::Zlib>, a C<bzip2> compressed
filehandle, which is not opened with C<IO::Uncompress::Bunzip2>, a C<xz> compressed
filehandle, which is not opened with C<IO::Uncompress::UnXz>, nor a string
containing the full archive information (either compressed or
uncompressed). These are worth while features, but not currently
implemented. See the C<TODO> section.

The third argument can be a hash reference with options. Note that
all options are case-sensitive.

=over 4

=item limit

Do not read more than C<limit> files. This is useful if you have
very big archives, and are only interested in the first few files.

=item filter

Can be set to a regular expression.  Only files with names that match
the expression will be read.

=item md5

Set to 1 and the md5sum of files will be returned (instead of file data)
    my $iter = Archive::Tar->iter( $file,  1, {md5 => 1} );
    while( my $f = $iter->() ) {
        print $f->data . "\t" . $f->full_path . $/;
    }

=item extract

If set to true, immediately extract entries when reading them. This
gives you the same memory break as the C<extract_archive> function.
Note however that entries will not be read into memory, but written
straight to disk. This means no C<Archive::Tar::File> objects are
created for you to inspect.

=back

All files are stored internally as C<Archive::Tar::File> objects.
Please consult the L<Archive::Tar::File> documentation for details.

Returns the number of files read in scalar context, and a list of
C<Archive::Tar::File> objects in list context.

=cut

sub read {
    my $self = shift;
    my $file = shift;
    my $gzip = shift || 0;
    my $opts = shift || {};

    unless( defined $file ) {
        $self->_error( qq[No file to read from!] );
        return;
    } else {
        $self->_file( $file );
    }

    my $handle = $self->_get_handle($file, $gzip, READ_ONLY->( ZLIB ) )
                    or return;

    my $data = $self->_read_tar( $handle, $opts ) or return;

    $self->_data( $data );

    return wantarray ? @$data : scalar @$data;
}

sub _get_handle {
    my $self     = shift;
    my $file     = shift;   return unless defined $file;
    my $compress = shift || 0;
    my $mode     = shift || READ_ONLY->( ZLIB ); # default to read only

    ### Check if file is a file handle or IO glob
    if ( ref $file ) {
	return $file if eval{ *$file{IO} };
	return $file if eval{ $file->isa(q{IO::Handle}) };
	$file = q{}.$file;
    }

    ### get a FH opened to the right class, so we can use it transparently
    ### throughout the program
    my $fh;
    {   ### reading magic only makes sense if we're opening a file for
        ### reading. otherwise, just use what the user requested.
        my $magic = '';
        if( MODE_READ->($mode) ) {
            open my $tmp, $file or do {
                $self->_error( qq[Could not open '$file' for reading: $!] );
                return;
            };

            ### read the first 6 bytes of the file to figure out which class to
            ### use to open the file.
            sysread( $tmp, $magic, 6 );
            close $tmp;
        }

        ### is it xz?
        ### if you asked specifically for xz compression, or if we're in
        ### read mode and the magic numbers add up, use xz
        if( XZ and (
               ($compress eq COMPRESS_XZ) or
               ( MODE_READ->($mode) and $magic =~ XZ_MAGIC_NUM )
            )
        ) {
            if( MODE_READ->($mode) ) {
                $fh = IO::Uncompress::UnXz->new( $file ) or do {
                    $self->_error( qq[Could not read '$file': ] .
                        $IO::Uncompress::UnXz::UnXzError
                    );
                    return;
                };
            } else {
                $fh = IO::Compress::Xz->new( $file ) or do {
                    $self->_error( qq[Could not write to '$file': ] .
                        $IO::Compress::Xz::XzError
                    );
                    return;
                };
            }

        ### is it bzip?
        ### if you asked specifically for bzip compression, or if we're in
        ### read mode and the magic numbers add up, use bzip
        } elsif( BZIP and (
                ($compress eq COMPRESS_BZIP) or
                ( MODE_READ->($mode) and $magic =~ BZIP_MAGIC_NUM )
            )
        ) {

            ### different reader/writer modules, different error vars... sigh
            if( MODE_READ->($mode) ) {
                $fh = IO::Uncompress::Bunzip2->new( $file, MultiStream => 1 ) or do {
                    $self->_error( qq[Could not read '$file': ] .
                        $IO::Uncompress::Bunzip2::Bunzip2Error
                    );
                    return;
                };

            } else {



( run in 3.563 seconds using v1.01-cache-2.11-cpan-39bf76dae61 )