Archive-BagIt
README.mkdn view on Meta::CPAN
- enhanced testsuite
- reduce complexity
- use modern perl code
- add flag to enable very strict verify
# Backward Compatibility
To reduce the complexity of the code in the current module, support for
- parallel processing
- synchronous I/O
has been removed. The existing code is very fast, so there is no performance loss.
In the near future, support for [Archive::BagIt::Fast](https://metacpan.org/pod/Archive%3A%3ABagIt%3A%3AFast) will also be removed, because it needs hooks that increase code
complexity in the current module without any performance benefit.
# FAQ
## How to access the manifest-entries directly?
README.mkdn view on Meta::CPAN
}
The tagmanifests can be accessed in a similar way.
## How fast is [Archive::BagIt](https://metacpan.org/pod/Archive%3A%3ABagIt)?
I have made great efforts to optimize Archive::BagIt for high throughput. There are two limiting factors:
- Calculation of checksums: switching from the "Digest" module to OpenSSL via [Net::SSLeay](https://metacpan.org/pod/Net%3A%3ASSLeay) gave a significant
speed increase.
- Loading the files referenced in the manifest files: this was previously done serially with synchronous I/O. With
the [IO::Async](https://metacpan.org/pod/IO%3A%3AAsync) module the files are now loaded asynchronously, which gives a large performance gain.
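The first point can be illustrated with a minimal sketch of a chunked digest calculation through the OpenSSL EVP interface exposed by Net::SSLeay. The helper name `openssl_hexdigest` and the chunk size are assumptions made for this example; they are not part of Archive::BagIt's API, and the module's own implementation may differ.

```perl
use strict;
use warnings;
use Net::SSLeay ();

# Register the digest algorithms once, as the role modules do in BEGIN.
Net::SSLeay::OpenSSL_add_all_digests();

# Hypothetical helper (not part of Archive::BagIt): reads a filehandle
# in chunks and returns the digest as a hex string.
sub openssl_hexdigest {
    my ($algorithm, $fh, $chunk_size) = @_;
    $chunk_size ||= 8192;    # assumed chunk size, tune as needed

    my $md = Net::SSLeay::EVP_get_digestbyname($algorithm)
        or die "unknown digest algorithm '$algorithm'";
    my $ctx = Net::SSLeay::EVP_MD_CTX_create();
    Net::SSLeay::EVP_DigestInit($ctx, $md);

    binmode $fh;
    while (read($fh, my $buffer, $chunk_size)) {
        Net::SSLeay::EVP_DigestUpdate($ctx, $buffer);
    }

    my $raw = Net::SSLeay::EVP_DigestFinal($ctx);    # raw binary digest
    Net::SSLeay::EVP_MD_CTX_destroy($ctx);
    return unpack 'H*', $raw;                        # convert to hex
}

# Usage with an in-memory file, so the sketch needs no files on disk.
open my $fh, '<', \"example payload\n"
    or die "cannot open in-memory file: $!";
print openssl_hexdigest('sha256', $fh), "\n";
close $fh;
```

Reading in fixed-size chunks keeps memory use constant regardless of the payload file size.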
On my system with 8 cores, an SSD and a large 9 GB bag with 568 payload files, the results for `verify_bag()` are:
              processing time           run time             throughput
    Version   user time   system time   total time   total   MB/s
    v0.71     38.31s      1.60s         39.938s      100%    230
    v0.81     25.48s      1.68s         27.1s        67%     340
    v0.82     48.85s      3.89s         6.84s        17%     1346
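The relationship between the columns can be checked with a short calculation. This is only a sketch over the numbers in the table above; the helper names `speedup` and `implied_size_mb` are made up for this example. Note that for v0.82 the user time exceeds the total run time, which is what one expects when the work is spread over several cores.

```perl
use strict;
use warnings;

# Values copied from the table above.
my %total_time = ('v0.71' => 39.938, 'v0.81' => 27.1, 'v0.82' => 6.84);    # seconds
my %throughput = ('v0.71' => 230,    'v0.81' => 340,  'v0.82' => 1346);    # MB/s

# Hypothetical helper: how many times faster $version is than $baseline.
sub speedup {
    my ($baseline, $version) = @_;
    return $total_time{$baseline} / $total_time{$version};
}

# Hypothetical helper: bag size implied by run time and throughput.
sub implied_size_mb {
    my ($version) = @_;
    return $total_time{$version} * $throughput{$version};
}

for my $version (sort keys %total_time) {
    printf "%s: %.2fx relative to v0.71, implied bag size %.0f MB\n",
        $version, speedup('v0.71', $version), implied_size_mb($version);
}
```

All three rows imply a bag size of roughly 9.2 GB, consistent with the stated 9 GB bag.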
## How fast is [Archive::BagIt::Fast](https://metacpan.org/pod/Archive%3A%3ABagIt%3A%3AFast)?
lib/Archive/BagIt.pm view on Meta::CPAN
=back
=head1 Backward Compatibility
To reduce the complexity of the code in the current module, support for
=over
=item parallel processing
=item synchronous I/O
=back
has been removed. The existing code is very fast, so there is no performance loss.
In the near future, support for L<Archive::BagIt::Fast> will also be removed, because it needs hooks that increase code
complexity in the current module without any performance benefit.
=head1 FAQ
lib/Archive/BagIt.pm view on Meta::CPAN
=head2 How fast is L<Archive::BagIt>?
I have made great efforts to optimize Archive::BagIt for high throughput. There are two limiting factors:
=over
=item Calculation of checksums: switching from the "Digest" module to OpenSSL via L<Net::SSLeay> gave a significant
speed increase.
=item Loading the files referenced in the manifest files: this was previously done serially with synchronous I/O. With
the L<IO::Async> module the files are now loaded asynchronously, which gives a large performance gain.
=back
On my system with 8 cores, an SSD and a large 9 GB bag with 568 payload files, the results for C<verify_bag()> are:
              processing time           run time             throughput
    Version   user time   system time   total time   total   MB/s
    v0.71     38.31s      1.60s         39.938s      100%    230
    v0.81     25.48s      1.68s         27.1s        67%     340
    v0.82     48.85s      3.89s         6.84s        17%     1346
lib/Archive/BagIt/Role/OpenSSL.pm view on Meta::CPAN
=head1 VERSION
version 0.101
=head2 has_async_support()
Returns true if asynchronous I/O is available, i.e. if IO::Async could be loaded; otherwise returns false.
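A check of this kind is commonly written as a guarded C<require>. The following is a minimal sketch of that pattern, not the module's actual code; the helper name C<module_available> is an assumption made for this example.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Archive::BagIt): returns 1 if the
# named module can be loaded, 0 otherwise.
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # Foo::Bar -> Foo/Bar.pm
    return eval { require $file; 1 } ? 1 : 0;
}

# Decide once whether the asynchronous code path can be used at all.
my $has_async = module_available('IO::Async::Loop');
print 'async I/O is ', ($has_async ? 'available' : 'not available'), "\n";
```

Wrapping C<require> in C<eval> turns a missing optional dependency into a false return value instead of a fatal error, so callers can fall back to the synchronous path.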
=head2 get_hash_string($fh)
Calls the synchronous or the asynchronous function to calculate the digest of the file behind C<$fh>, depending on the result of C<< $bag->use_async() >>,
and returns the digest as a hex string.
=head1 AVAILABILITY
The latest version of this module is available from the Comprehensive Perl
Archive Network (CPAN). Visit L<http://www.perl.com/CPAN/> to find a CPAN
site near you, or see L<https://metacpan.org/module/Archive::BagIt/>.
=head1 BUGS AND LIMITATIONS
lib/Archive/BagIt/Role/OpenSSL/Async.pm view on Meta::CPAN
package Archive::BagIt::Role::OpenSSL::Async;
use strict;
use warnings;
use Moo;
use namespace::autoclean;
use IO::Async::Loop;
use IO::Async::Stream;
use Net::SSLeay ();
our $VERSION = '0.101'; # VERSION
# ABSTRACT: handles asynchronous digest calculation using openssl
sub BEGIN {
Net::SSLeay::OpenSSL_add_all_digests();
is => 'rw',
}
has 'name' => (
required => 1,
is => 'ro',
lib/Archive/BagIt/Role/OpenSSL/Async.pm view on Meta::CPAN
1;
__END__
=pod
=encoding UTF-8
=head1 NAME
Archive::BagIt::Role::OpenSSL::Async - handles asynchronous digest calculation using openssl
=head1 VERSION
version 0.101
=head1 AVAILABILITY
The latest version of this module is available from the Comprehensive Perl
Archive Network (CPAN). Visit L<http://www.perl.com/CPAN/> to find a CPAN
site near you, or see L<https://metacpan.org/module/Archive::BagIt/>.
lib/Archive/BagIt/Role/OpenSSL/Sync.pm view on Meta::CPAN
package Archive::BagIt::Role::OpenSSL::Sync;
use strict;
use warnings FATAL => 'all';
use Moo;
use namespace::autoclean;
use Net::SSLeay ();
our $VERSION = '0.101'; # VERSION
# ABSTRACT: handles synchronous digest calculation using openssl
sub BEGIN {
Net::SSLeay::OpenSSL_add_all_digests();
}
has 'name' => (
required => 1,
is => 'ro',
);
lib/Archive/BagIt/Role/OpenSSL/Sync.pm view on Meta::CPAN
1;
__END__
=pod
=encoding UTF-8
=head1 NAME
Archive::BagIt::Role::OpenSSL::Sync - handles synchronous digest calculation using openssl
=head1 VERSION
version 0.101
=head1 AVAILABILITY
The latest version of this module is available from the Comprehensive Perl
Archive Network (CPAN). Visit L<http://www.perl.com/CPAN/> to find a CPAN
site near you, or see L<https://metacpan.org/module/Archive::BagIt/>.