Archive-Libarchive-FFI
}
    while(1)
    {
      $r = archive_read_next_header($a, my $entry);
      if($r == ARCHIVE_EOF)
      {
        last;
      }
      if($r != ARCHIVE_OK)
      {
        print archive_error_string($a), "\n";
      }
      if($r < ARCHIVE_WARN)
      {
        exit 1;
      }
      $r = archive_write_header($ext, $entry);
      if($r != ARCHIVE_OK)
      {
        print archive_error_string($ext), "\n";
      }
      elsif(archive_entry_size($entry) > 0)
      {
        copy_data($a, $ext);
      }
    }
    archive_read_close($a);
    archive_read_free($a);
    archive_write_close($ext);
    archive_write_free($ext);

    sub copy_data
    {
      my($ar, $aw) = @_;
      my $r;
      while(1)
      {
        $r = archive_read_data_block($ar, my $buff, my $offset);
        if($r == ARCHIVE_EOF)
        {
          return;
        }
        if($r != ARCHIVE_OK)
        {
          die archive_error_string($ar), "\n";
        }
        $r = archive_write_data_block($aw, $buff, $offset);
        if($r != ARCHIVE_OK)
        {
          die archive_error_string($aw), "\n";
        }
      }
    }
Unicode
Libarchive deals with two types of string-like data. Pathnames, user
names and group names are proper strings and are encoded in the codeset
for the current POSIX locale. Content data for files is stored and
retrieved as raw bytes.
The usual operational procedure in Perl is to convert everything on
input into UTF-8, operate on the UTF-8 data and then convert (if
necessary) everything on output to the desired output format.
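As a rough illustration of that decode, operate, encode cycle, the
sketch below uses only the core Encode module and makes no libarchive
calls. The sample input bytes and the choice of KOI8-R as the output
encoding are assumptions made for the example.

```perl
use strict;
use warnings;
use Encode qw( decode encode );

# Assumed sample input: the UTF-8 encoded bytes of a six letter
# Cyrillic word, as they might arrive from a file or a socket.
my $input_bytes = "\xd0\xbf\xd1\x80\xd0\xb8\xd0\xb2\xd0\xb5\xd1\x82";

# 1. Convert the input bytes into a Perl character string.
my $text = decode('UTF-8', $input_bytes);

# 2. Operate on characters, not bytes.
my $char_count = length $text;   # counts characters
my $upper      = uc $text;       # case mapping uses Unicode rules

# 3. Convert to the desired output encoding (KOI8-R is an assumption).
my $output_bytes = encode('KOI8-R', $upper);
```

Note that length reports 12 for $input_bytes but 6 for $text, which is
why the operations in the middle step should see decoded characters.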
In order to get useful string data out of libarchive, this module
translates its input/output using the codeset for the current POSIX
locale. So you must be using a POSIX locale that supports the
characters in the pathnames of the archives you are going to process,
and it is highly recommended that you use a UTF-8 locale, which should
cover everything.
    use strict;
    use warnings;
    use utf8;
    use Archive::Libarchive::FFI qw( :all );
    use POSIX qw( setlocale LC_ALL );

    # substitute en_US.utf8 for the correct UTF-8 locale for your region.
    setlocale(LC_ALL, "en_US.utf8"); # or 'export LANG=en_US.utf8' from your shell.

    my $entry = archive_entry_new();
    archive_entry_set_pathname($entry, "привет.txt");
    my $string = archive_entry_pathname($entry); # "привет.txt"
    archive_entry_free($entry);
If you try to pass a string with characters unsupported by your current
locale, the behavior is undefined. If you try to retrieve strings with
characters unsupported by your current locale you will get undef.
Unfortunately locale names are not portable across systems, so you
should probably not hard code the locale as shown here unless you know
the correct locale name for all the platforms that your script will
run on.
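One way to avoid hard coding a single name is to try several candidate
spellings and keep the first one the system accepts. The sketch below
uses only the core POSIX module. The candidate list is an assumption
for illustration and is not a list this module provides or checks.

```perl
use strict;
use warnings;
use POSIX qw( setlocale LC_ALL );

# Hypothetical candidate names; adjust for the platforms you target.
my @candidates = qw( en_US.UTF-8 en_US.utf8 C.UTF-8 );

my $selected;
foreach my $name (@candidates)
{
  # setlocale returns undef when the system does not know the name.
  my $result = setlocale(LC_ALL, $name);
  if(defined $result)
  {
    $selected = $result;
    last;
  }
}

if(defined $selected)
{
  print "using locale $selected\n";
}
else
{
  print "no UTF-8 locale found among: @candidates\n";
}
```

If none of the candidates is accepted, the locale is left as it was, so
a script should still check archive_perl_utf8_mode before relying on
non-ASCII pathnames.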
There are two Perl-only functions that give information about the
current codeset as understood by libarchive. archive_perl_utf8_mode
returns true if the currently selected codeset is UTF-8.
    use strict;
    use warnings;
    use Archive::Libarchive::FFI qw( :all );

    die "must use UTF-8 locale" unless archive_perl_utf8_mode();
archive_perl_codeset returns the currently selected codeset.
    use strict;
    use warnings;
    use utf8; # the source below contains non-ASCII string literals
    use Archive::Libarchive::FFI qw( :all );

    my $entry = archive_entry_new();

    if(archive_perl_codeset() =~ /^(ISO-8859-5|CP1251|KOI8-R|UTF-8)$/)
    {
      archive_entry_set_pathname($entry, "привет.txt");
      my $string = archive_entry_pathname($entry); # "привет.txt"
    }
    else
    {
      archive_entry_set_pathname($entry, "privet.txt");
      my $string = archive_entry_pathname($entry); # "privet.txt"
    }
Because libarchive reads and writes file content within an archive
using raw bytes, if your file content has non-ASCII characters in it,
then you need to encode them before writing:
    use Encode qw( encode );

    archive_write_data($archive, encode('UTF-8', "привет.txt"));
    # or
    archive_write_data($archive, encode('KOI8-R', "привет.txt"));
and decode them after reading:
    use Encode qw( decode );

    my $raw;
    archive_read_data($archive, $raw, 10240);
    my $decoded_content = decode('UTF-8', $raw);
    # or
    my $decoded_content = decode('KOI8-R', $raw);
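The encoding used to decode content has to match the one used when it
was written. The sketch below illustrates this with the core Encode
module alone, with no archive involved. The sample text and the
variable names are assumptions made for the example.

```perl
use strict;
use warnings;
use Encode qw( decode encode );

# Sample text written with escapes so the source file stays ASCII.
my $text = "\x{43f}\x{440}\x{438}\x{432}\x{435}\x{442}";

# The same characters produce different byte sequences per encoding.
my $utf8_bytes = encode('UTF-8',  $text);   # two bytes per letter
my $koi8_bytes = encode('KOI8-R', $text);   # one byte per letter

# Decoding with the wrong encoding can be caught by asking Encode to
# croak on malformed input. A copy is passed because decode may modify
# its source argument when a CHECK value is given.
my $copy    = $koi8_bytes;
my $decoded = eval { decode('UTF-8', $copy, Encode::FB_CROAK()) };
my $mismatch_detected = defined $decoded ? 0 : 1;

# Decoding with the matching encoding restores the original text.
my $round_trip = decode('KOI8-R', $koi8_bytes);
```

Strict checking only catches byte sequences that are invalid in the
assumed encoding. Two single-byte encodings can be confused with each
other without any error, so the encoding still has to be known.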
SUPPORT
If you find bugs, please open an issue on the project GitHub
repository:
https://github.com/plicease/Archive-Libarchive-FFI/issues?state=open
If you have a fix, please open a pull request. You can see the
CONTRIBUTING file for traps, hints and pitfalls.
CAVEATS
Archive and entry objects are really pointers to opaque C structures
and need to be freed using one of archive_read_free, archive_write_free
or archive_entry_free, in order to free the resources associated with
those objects.
Proper Unicode (or non-ASCII character support) depends on setting the
correct POSIX locale, which is system dependent.
The documentation that comes with libarchive is not that great (by its
own admission), being somewhat incomplete, and containing a few subtle
errors. In writing the documentation for this distribution, I borrowed
heavily (read: stole wholesale) from the libarchive documentation,
making changes where appropriate for use under Perl (changing NULL to
undef for example, along with the interface change to make that work).
I may have, and probably have, introduced additional subtle errors.
Patches to the documentation that match the implementation, or fixes to
the implementation so that it matches the documentation (whichever is
appropriate) would be greatly appreciated.
SEE ALSO
The intent of this module is to provide a low-level, fairly thin, direct
interface to libarchive, on which a more Perlish OO layer could easily
be written.
Archive::Libarchive::XS
Archive::Libarchive::FFI
Both of these provide the same API to libarchive but the bindings are
implemented in XS for one and via FFI::Raw for the other.
Archive::Libarchive::Any
Offers whichever is available, either the XS or FFI version. The
actual algorithm as to which is picked is subject to change,
depending on which version seems to be the most reliable.
Archive::Peek::Libarchive
Archive::Extract::Libarchive
Both of these provide a higher level, less complete perlish interface
to libarchive.
Archive::Tar
Archive::Tar::Wrapper
Just some of the many modules on CPAN that will read/write tar
archives.
Archive::Zip
Just one of the many modules on CPAN that will read/write zip
archives.
Archive::Any
A module that attempts to read/write multiple formats using different
methods depending on what perl modules are installed, and preferring
pure perl modules.
AUTHOR
Graham Ollis <plicease@cpan.org>