Archive-BagIt

 view release on metacpan or  search on metacpan

Changes  view on Meta::CPAN

  Change: 2f83bef072390585a28ed9014263458e4ecdc724
  Author: Andreas Romeyke <art1@andreas-romeyke.de>
  Date : 2025-10-17 17:24:14 +0000

    - minor 

  Change: 047dc9bccc8be96d4ad4acd01975c8ba086cc07f
  Author: Andreas Romeyke <art1@andreas-romeyke.de>
  Date : 2025-10-17 16:50:53 +0000

    - refactoring, reduce complexity of _verify_baginfo() 

  Change: b553ea364ed6b470da765f022c3872747beacf5e
  Author: Andreas Romeyke <art1@andreas-romeyke.de>
  Date : 2025-10-17 16:11:51 +0000

    - fixed perlcritic warning 

  Change: 6976c6173188757924eff68cf819798a5d103aa3
  Author: Andreas Romeyke <art1@andreas-romeyke.de>
  Date : 2025-10-17 16:11:28 +0000

Changes  view on Meta::CPAN

  Change: 3206eccbf7fe122c1afc96318fd664d4d5fa2f76
  Author: Andreas Romeyke <art1@andreas-romeyke.de>
  Date : 2025-02-20 13:34:35 +0000

    - minor, renamed to private method 

  Change: e01551190ef45f4eb8d5f3ed7904e75e22c731f4
  Author: Andreas Romeyke <art1@andreas-romeyke.de>
  Date : 2025-02-20 13:22:04 +0000

    - bugfix, improve verify_baginfo() to check key and value problems 

  Change: a66b1d0b6048efb515adf5d3a4f2d4cb4970df86
  Author: Andreas Romeyke <art1@andreas-romeyke.de>
  Date : 2025-02-20 13:19:31 +0000

    - add verify_baginfo() related tests 

------------------------------------------
version 0.096 at 2025-01-29 08:02:59 +0000
------------------------------------------

  Change: 4a7021fe3860b7ff1bd67f7bdb06bf37a2602f47
  Author: art1@andreas-romeyke.de <art1pirat@fsfe.org>
  Date : 2025-01-29 09:02:59 +0000

    Released 0.096 

MANIFEST  view on Meta::CPAN

bagit_conformance_suite/v0.97/windows-only/out-of-scope-file-paths-using-shortcut-for-fetch/data/text-file.txt
bagit_conformance_suite/v0.97/windows-only/out-of-scope-file-paths-using-shortcut-for-fetch/fetch.txt
bagit_conformance_suite/v0.97/windows-only/out-of-scope-file-paths-using-shortcut-for-fetch/manifest-md5.txt
bagit_conformance_suite/v0.97/windows-only/out-of-scope-file-paths-using-shortcut-for-fetch/tagmanifest-md5.txt
bagit_conformance_suite/v0.97/windows-only/out-of-scope-file-paths-using-shortcut/bag-info.txt
bagit_conformance_suite/v0.97/windows-only/out-of-scope-file-paths-using-shortcut/bagit.txt
bagit_conformance_suite/v0.97/windows-only/out-of-scope-file-paths-using-shortcut/data/bare-filename
bagit_conformance_suite/v0.97/windows-only/out-of-scope-file-paths-using-shortcut/data/text-file.txt
bagit_conformance_suite/v0.97/windows-only/out-of-scope-file-paths-using-shortcut/manifest-md5.txt
bagit_conformance_suite/v0.97/windows-only/out-of-scope-file-paths-using-shortcut/tagmanifest-md5.txt
bagit_conformance_suite/v0.97/windows-only/out-of-scope-file-paths-using-unc-for-fetch/.-verifyvalid-null.txt
bagit_conformance_suite/v0.97/windows-only/out-of-scope-file-paths-using-unc-for-fetch/bag-info.txt
bagit_conformance_suite/v0.97/windows-only/out-of-scope-file-paths-using-unc-for-fetch/bagit.txt
bagit_conformance_suite/v0.97/windows-only/out-of-scope-file-paths-using-unc-for-fetch/data/bare-filename
bagit_conformance_suite/v0.97/windows-only/out-of-scope-file-paths-using-unc-for-fetch/data/text-file.txt
bagit_conformance_suite/v0.97/windows-only/out-of-scope-file-paths-using-unc-for-fetch/fetch.txt
bagit_conformance_suite/v0.97/windows-only/out-of-scope-file-paths-using-unc-for-fetch/manifest-md5.txt
bagit_conformance_suite/v0.97/windows-only/out-of-scope-file-paths-using-unc-for-fetch/tagmanifest-md5.txt
bagit_conformance_suite/v0.97/windows-only/out-of-scope-file-paths-using-unc/bag-info.txt
bagit_conformance_suite/v0.97/windows-only/out-of-scope-file-paths-using-unc/bagit.txt
bagit_conformance_suite/v0.97/windows-only/out-of-scope-file-paths-using-unc/data/bare-filename

MANIFEST  view on Meta::CPAN

out/test/Archive-BagIt/src/src_bag_deep/bag-info.txt
out/test/Archive-BagIt/src/src_bag_deep/bagit.txt
out/test/Archive-BagIt/src/src_bag_deep/data/3
out/test/Archive-BagIt/src/src_bag_deep/data/subdir1/1
out/test/Archive-BagIt/src/src_bag_deep/data/subdir2/subsubdir/2
out/test/Archive-BagIt/src/src_bag_deep/manifest-md5.txt
out/test/Archive-BagIt/src/src_bag_deep/tagmanifest-md5.txt
out/test/Archive-BagIt/src/src_files/1
out/test/Archive-BagIt/src/src_files/2
out/test/Archive-BagIt/store_bag.t
out/test/Archive-BagIt/verify_bag.t
out/test/Archive-BagIt/verify_baginfo.t
out/test/Archive-BagIt/verify_simple.t
t/00-compile.t
t/00-load.t
t/access_bag.t
t/base.t
t/blns.license.txt
t/blns.txt
t/internal.t
t/manifest.t
t/naughty_strings.t
t/payload_files.t

MANIFEST  view on Meta::CPAN

t/src/src_bag_deep/bag-info.txt
t/src/src_bag_deep/bagit.txt
t/src/src_bag_deep/data/3
t/src/src_bag_deep/data/subdir1/1
t/src/src_bag_deep/data/subdir2/subsubdir/2
t/src/src_bag_deep/manifest-md5.txt
t/src/src_bag_deep/tagmanifest-md5.txt
t/src/src_files/1
t/src/src_files/2
t/store_bag.t
t/verify_bag.t
t/verify_baginfo.t
t/verify_simple.t
test_baginfo.pl
xt/author/critic.t
xt/author/distmeta.t
xt/author/eol.t
xt/author/minimum-version.t
xt/author/mojibake.t
xt/author/no-tabs.t
xt/author/pod-coverage.t
xt/author/pod-linkcheck.t
xt/author/pod-syntax.t

README.mkdn  view on Meta::CPAN

- only encoding UTF-8 is supported
- version 0.97 or 1.0 allowed
- version 0.97 requires tag-/manifest-files with md5-fixity
- version 1.0 requires tag-/manifest-files with sha512-fixity
- BOM is not supported
- Carriage Return in bagit-files are not allowed
- fetch.txt is unsupported

At the moment only filepaths in linux-style are supported.

To get an more detailled overview, see the testsuite under `t/verify_bag.t` and corresponding test bags from the BagIt conformance testsuite of Library of Congress under `bagit_conformance_suite/`.

See [https://datatracker.ietf.org/doc/rfc8493/?include\_text=1](https://datatracker.ietf.org/doc/rfc8493/?include_text=1) for details.

# TODO

- enhanced testsuite
- reduce complexity
- use modern perl code
- add flag to enable very strict verify

# Backward Compatibility

To reduce the complexity of code in current module the support for

- parallel processing
=item synchronous I/O

is removed. The existing code is very fast, so there is no performance loss.

README.mkdn  view on Meta::CPAN


## How fast is [Archive::BagIt](https://metacpan.org/pod/Archive%3A%3ABagIt)?

I have made great efforts to optimize Archive::BagIt for high throughput. There are two limiting factors:

- calculation of checksums, by switching from the module "Digest" to OpenSSL by using [Net::SSLeay](https://metacpan.org/pod/Net%3A%3ASSLeay) a significant
   speed increase could be achieved.
- loading the files referenced in the manifest files was previously done serially and using synchronous I/O. By
   using the [IO::Async](https://metacpan.org/pod/IO%3A%3AAsync) module, the files are loaded asynchronously, the performance gain is huge.

On my system with 8cores, SSD and a large 9GB bag with 568 payload files the results for `verify_bag()` are:

                     processing time          run time             throughput
    Version       user time    system time    total time    total    MB/s
     v0.71        38.31s        1.60s         39.938s       100%     230
     v0.81        25.48s        1.68s         27.1s          67%     340
     v0.82        48.85s        3.89s          6.84s         17%    1346

## How fast is [Archive::BagIt::Fast](https://metacpan.org/pod/Archive%3A%3ABagIt%3A%3AFast)?

It depends. On my system with 8cores, SSD and a 38MB bag with 48 payload files the results for `verify_bag()` are:

                   Rate         Base         Fast
    Base         3.01/s           --         -21%
    Fast         3.80/s          26%           --

On my system with 8cores, SSD and a large 9GB bag with 568 payload files the results for `verify_bag()` are:

                 s/iter         Base         Fast
    Base           74.6           --          -9%
    Fast           68.3           9%           --

But you should measure which variant is best for you. In general the default [Archive::BagIt](https://metacpan.org/pod/Archive%3A%3ABagIt) is fast enough.

## How to update an old bag of version v0.97 to v1.0?

You could try this:

README.mkdn  view on Meta::CPAN

# THANKS

Thanks to Rob Schmidt <rjeschmi@gmail.com> for the trustful handover of the project and thanks for your initial work!
I would also like to thank Patrick Hochstenbach and Rusell McOrmond for their valuable and especially detailed advice!
And without the helpful, sometimes rude help of the IRC channel #perl I would have been stuck in a lot of problems.
Without the support of my colleagues at SLUB Dresden, the project would never have made it this far.

# SYNOPSIS

This modules will hopefully help with the basic commands needed to create
and verify a bag. This part supports BagIt 1.0 according to RFC 8493 (\[https://tools.ietf.org/html/rfc8493\](https://tools.ietf.org/html/rfc8493)).

You only need to know the following methods first:

## read a BagIt

    use Archive::BagIt;

    #read in an existing bag:
    my $bag_dir = "/path/to/bag";
    my $bag = Archive::BagIt->new($bag_dir);

## construct a BagIt around a payload

    use Archive::BagIt;
    my $bag2 = Archive::BagIt->make_bag($bag_dir);

## verify a BagIt-dir

    use Archive::BagIt;

    # Validate a BagIt archive against its manifest
    my $bag3 = Archive::BagIt->new($bag_dir);
    my $is_valid1 = $bag3->verify_bag();

    # Validate a BagIt archive against its manifest, report all errors
    my $bag4 = Archive::BagIt->new($bag_dir);
    my $is_valid2 = $bag4->verify_bag( {report_all_errors => 1} );

## read a BagIt-dir, change something, store

Because all methods operate lazy, you should ensure to parse parts of the bag \*BEFORE\* you modify it.
Otherwise it will be overwritten!

    use Archive::BagIt;
    my $bag5 = Archive::BagIt->new($bag_dir); # lazy, nothing happened
    $bag5->load(); # this updates the object representation by parsing the given $bag_dir
    $bag5->store(); # this writes the bag new

README.mkdn  view on Meta::CPAN

    my $bag = Archive::BagIt->new(
        bag_path => $bag_dir,
    );

The arguments are:

- `bag_path` - path to bag-directory
- `force_utf8` - if set the warnings about non portable filenames are disabled (default: enabled)
- `use_plugins` - expected manifest plugin strings, if set it uses the requested plugins,
      example `Archive::BagIt::Plugin::Manifest::SHA256`.
      HINT: this option \*disables\* the forced fixity check in `verify_bag()`!

The bag object will use $bag\_dir, BUT an existing $bag\_dir is not read. If you use `store()` an existing bag will be overwritten!

See `load()` if you want to parse/modify an existing bag.

## has\_force\_utf8()

to check if force\_utf8() was set.

If set it ignores warnings about potential filepath problems.

README.mkdn  view on Meta::CPAN

Getter/Setter for bag info. Expects/returns an array of HashRefs implementing simple key-value pairs.

HINT: RFC8493 does not allow \*reordering\* of entries!

## has\_bag\_info()

returns true if bag info exists.

## errors()

Getter to return collected errors after a `verify_bag()` call with Option `report_all_errors`

## warnings()

Getter to return collected warnings after a `verify_bag()` call

## digest\_callback()

This method could be reimplemented by derived classes to handle fixity checks in own way. The
getter returns an anonymous function with following interface:

    my $digest = $self->digest_callback;
    &$digest( $digestobject, $filename);

This anonymous function MUST use the `get_hash_string()` function of the [Archive::BagIt::Role::Algorithm](https://metacpan.org/pod/Archive%3A%3ABagIt%3A%3ARole%3A%3AAlgorithm) role,

README.mkdn  view on Meta::CPAN

Returns all values which match $searchkey, undef otherwise

## is\_baginfo\_key\_reserved\_as\_uniq($searchkey)

returns true if key is reserved and should be uniq

## is\_baginfo\_key\_reserved( $searchkey )

returns true if key is reserved

## verify\_baginfo()

checks baginfo-keys, returns true if all fine, otherwise returns undef and the message is pushed to `errors()`.
Warnings pushed to ` warnings() `

## delete\_baginfo\_by\_key( $searchkey )

deletes an entry of given $searchkey if exists.
If multiple entries with $searchkey exists, only the last one is deleted.

## exists\_baginfo\_key( $searchkey )

README.mkdn  view on Meta::CPAN

## load\_plugins

As default SHA512 and MD5 will be loaded and therefore used. If you want to create a bag only with one or a specific
checksum-algorithm, you could use this method to (re-)register it. It expects list of strings with namespace of type:
Archive::BagIt::Plugin::Algorithm::XXX where XXX is your chosen fixity algorithm.

## load()

Triggers loading of an existing bag

## verify\_bag($opts)

A method to verify a bag deeply. If `$opts` is set with `{return_all_errors}` all fixity errors are reported.
The default ist to croak with error message if any error is detected.

HINT: You might also want to check [Archive::BagIt::Fast](https://metacpan.org/pod/Archive%3A%3ABagIt%3A%3AFast) to see a more direct way of accessing files (and thus faster).

## calc\_payload\_oxum()

returns an array with octets and streamcount of payload-dir

## calc\_bagsize()

lib/Archive/BagIt.pm  view on Meta::CPAN

                and ($keys{$key} > 1)
        ) {
            $ret &&=$self->_collect_errors("Baginfo key '$key' exists $keys{$key}, but should be uniq!");
        }
    }
    return $ret;
}

###############################################

sub _verify_baginfo {
    my ($self, $info) = @_;
    my $ret = 1;

    if (!defined $info) {
        if (exists $self->{bag_info_file}) {
            $ret &&= $self->_collect_errors("'bag-info.txt' exists, but is not (partially) parseable!");
        }
    } else {
        $ret &&= $self->_check_baginfo_keys_generically($info);
        # check for payload oxum

lib/Archive/BagIt.pm  view on Meta::CPAN

            if ("$octets.$streamcount" ne $loaded_payloadoxum) {
                $ret &&= $self->_collect_errors( "Payload-Oxum differs, calculated $octets.$streamcount but $loaded_payloadoxum was expected by bag-info.txt");
            }
        }
    }
    return $ret;
}

###############################################

sub verify_baginfo {
    my ($self) = @_;
    my $info = $self->bag_info();
    if (List::Util::any { /the baginfo file .* could not be parsed correctly/ } @{$self->{'errors'}}) {
        return;
    }
    return $self->_verify_baginfo($info);
}

###############################################


sub delete_baginfo_by_key {
    my ($self, $searchkey) = @_;
    my $idx = $self->_find_baginfo_idx($searchkey);
    if (defined $idx) {
        splice @{$self->{bag_info}}, $idx, 1; # delete nth indexed entry

lib/Archive/BagIt.pm  view on Meta::CPAN

    $self->payload_path;
    $self->manifest_files;
    $self->checksum_algos;
    $self->tagmanifest_files;
    return 1;
}

###############################################


sub verify_bag {
    my ($self,$opts) = @_;
    #removed the ability to pass in a bag in the parameters, but might want options
    #like $return all errors rather than dying on first one
    my $bagit = $self->bag_path;
    my $version = $self->bag_version(); # to call trigger
    my $encoding = $self->bag_encoding(); # to call trigger
    my $baginfo = $self->verify_baginfo(); #to call trigger

    my $fetch_file = File::Spec->catfile($self->metadata_path, "fetch.txt");
    my $payload_dir   = $self->payload_path;
    my $return_all_errors = $opts->{return_all_errors};

    if (-f $fetch_file) {
        croak("Fetching via file '$fetch_file' is not supported by current Archive::BagIt implementation")
    }
    # check forced fixity
    if ($self->has_forced_fixity_algorithm()) {

lib/Archive/BagIt.pm  view on Meta::CPAN


    unless ($version > .95) {
        croak ("Bag Version $version is unsupported");
    }

    my @errors;


    # check for manifests
    foreach my $algorithm ( keys %{ $self->manifests }) {
        my $res = $self->manifests->{$algorithm}->verify_manifest($self->payload_files, $return_all_errors);
        if ((defined $res) && ($res ne "1")) { push @errors, $res; }
    }
    #check for tagmanifests
    foreach my $algorithm ( keys %{ $self->manifests }) {
        my $res = $self->manifests->{$algorithm}->verify_tagmanifest($self->non_payload_files, $return_all_errors);
        if ((defined $res) && ($res ne "1")) { push @errors, $res; }
    }
    push @{$self->{errors}}, @errors;
    my $err = $self->errors();
    my @err =  @{ $err };
    if (scalar( @err ) > 0) {
        croak join("\n","bag verify for bagit version '$version' failed with invalid files.", @err);
    }
    return 1;
}


sub calc_payload_oxum {
    my($self) = @_;
    my @payload = @{$self->payload_files};
    my $octets=0;
    my $streamcount = scalar @payload;

lib/Archive/BagIt.pm  view on Meta::CPAN

=item BOM is not supported

=item Carriage Return in bagit-files are not allowed

=item fetch.txt is unsupported

=back

At the moment only filepaths in linux-style are supported.

To get an more detailled overview, see the testsuite under F<t/verify_bag.t> and corresponding test bags from the BagIt conformance testsuite of Library of Congress under F<bagit_conformance_suite/>.

See L<https://datatracker.ietf.org/doc/rfc8493/?include_text=1> for details.

=head1 TODO

=over

=item enhanced testsuite

=item reduce complexity

=item use modern perl code

=item add flag to enable very strict verify

=back

=head1 Backward Compatibility

To reduce the complexity of code in current module the support for

=over

=item parallel processing

lib/Archive/BagIt.pm  view on Meta::CPAN

=over

=item calculation of checksums, by switching from the module "Digest" to OpenSSL by using L<Net::SSLeay> a significant
   speed increase could be achieved.

=item loading the files referenced in the manifest files was previously done serially and using synchronous I/O. By
   using the L<IO::Async> module, the files are loaded asynchronously, the performance gain is huge.

=back

On my system with 8cores, SSD and a large 9GB bag with 568 payload files the results for C<verify_bag()> are:

                    processing time          run time             throughput
   Version       user time    system time    total time    total    MB/s
    v0.71        38.31s        1.60s         39.938s       100%     230
    v0.81        25.48s        1.68s         27.1s          67%     340
    v0.82        48.85s        3.89s          6.84s         17%    1346

=head2 How fast is L<Archive::BagIt::Fast>?

It depends. On my system with 8cores, SSD and a 38MB bag with 48 payload files the results for C<verify_bag()> are:

                  Rate         Base         Fast
   Base         3.01/s           --         -21%
   Fast         3.80/s          26%           --

On my system with 8cores, SSD and a large 9GB bag with 568 payload files the results for C<verify_bag()> are:

                s/iter         Base         Fast
   Base           74.6           --          -9%
   Fast           68.3           9%           --

But you should measure which variant is best for you. In general the default L<Archive::BagIt> is fast enough.

=head2 How to update an old bag of version v0.97 to v1.0?

You could try this:

lib/Archive/BagIt.pm  view on Meta::CPAN

=head1 THANKS

Thanks to Rob Schmidt <rjeschmi@gmail.com> for the trustful handover of the project and thanks for your initial work!
I would also like to thank Patrick Hochstenbach and Rusell McOrmond for their valuable and especially detailed advice!
And without the helpful, sometimes rude help of the IRC channel #perl I would have been stuck in a lot of problems.
Without the support of my colleagues at SLUB Dresden, the project would never have made it this far.

=head1 SYNOPSIS

This modules will hopefully help with the basic commands needed to create
and verify a bag. This part supports BagIt 1.0 according to RFC 8493 ([https://tools.ietf.org/html/rfc8493](https://tools.ietf.org/html/rfc8493)).

You only need to know the following methods first:

=head2 read a BagIt

    use Archive::BagIt;

    #read in an existing bag:
    my $bag_dir = "/path/to/bag";
    my $bag = Archive::BagIt->new($bag_dir);

=head2 construct a BagIt around a payload

    use Archive::BagIt;
    my $bag2 = Archive::BagIt->make_bag($bag_dir);

=head2 verify a BagIt-dir

    use Archive::BagIt;

    # Validate a BagIt archive against its manifest
    my $bag3 = Archive::BagIt->new($bag_dir);
    my $is_valid1 = $bag3->verify_bag();

    # Validate a BagIt archive against its manifest, report all errors
    my $bag4 = Archive::BagIt->new($bag_dir);
    my $is_valid2 = $bag4->verify_bag( {report_all_errors => 1} );

=head2 read a BagIt-dir, change something, store

Because all methods operate lazy, you should ensure to parse parts of the bag *BEFORE* you modify it.
Otherwise it will be overwritten!

    use Archive::BagIt;
    my $bag5 = Archive::BagIt->new($bag_dir); # lazy, nothing happened
    $bag5->load(); # this updates the object representation by parsing the given $bag_dir
    $bag5->store(); # this writes the bag new

lib/Archive/BagIt.pm  view on Meta::CPAN

The arguments are:

=over 1

=item C<bag_path> - path to bag-directory

=item C<force_utf8> - if set the warnings about non portable filenames are disabled (default: enabled)

=item C<use_plugins> - expected manifest plugin strings, if set it uses the requested plugins,
      example C<Archive::BagIt::Plugin::Manifest::SHA256>.
      HINT: this option *disables* the forced fixity check in C<verify_bag()>!

=back

The bag object will use $bag_dir, BUT an existing $bag_dir is not read. If you use C<store()> an existing bag will be overwritten!

See C<load()> if you want to parse/modify an existing bag.

=head2 has_force_utf8()

to check if force_utf8() was set.

lib/Archive/BagIt.pm  view on Meta::CPAN

Getter/Setter for bag info. Expects/returns an array of HashRefs implementing simple key-value pairs.

HINT: RFC8493 does not allow *reordering* of entries!

=head2 has_bag_info()

returns true if bag info exists.

=head2 errors()

Getter to return collected errors after a C<verify_bag()> call with Option C<report_all_errors>

=head2 warnings()

Getter to return collected warnings after a C<verify_bag()> call

=head2 digest_callback()

This method could be reimplemented by derived classes to handle fixity checks in own way. The
getter returns an anonymous function with following interface:

   my $digest = $self->digest_callback;
   &$digest( $digestobject, $filename);

This anonymous function MUST use the C<get_hash_string()> function of the L<Archive::BagIt::Role::Algorithm> role,

lib/Archive/BagIt.pm  view on Meta::CPAN

Returns all values which match $searchkey, undef otherwise

=head2 is_baginfo_key_reserved_as_uniq($searchkey)

returns true if key is reserved and should be uniq

=head2 is_baginfo_key_reserved( $searchkey )

returns true if key is reserved

=head2 verify_baginfo()

checks baginfo-keys, returns true if all fine, otherwise returns undef and the message is pushed to C<errors()>.
Warnings pushed to C< warnings() >

=head2 delete_baginfo_by_key( $searchkey )

deletes an entry of given $searchkey if exists.
If multiple entries with $searchkey exists, only the last one is deleted.

=head2 exists_baginfo_key( $searchkey )

lib/Archive/BagIt.pm  view on Meta::CPAN

=head2 load_plugins

As default SHA512 and MD5 will be loaded and therefore used. If you want to create a bag only with one or a specific
checksum-algorithm, you could use this method to (re-)register it. It expects list of strings with namespace of type:
Archive::BagIt::Plugin::Algorithm::XXX where XXX is your chosen fixity algorithm.

=head2 load()

Triggers loading of an existing bag

=head2 verify_bag($opts)

A method to verify a bag deeply. If C<$opts> is set with C<{return_all_errors}> all fixity errors are reported.
The default ist to croak with error message if any error is detected.

HINT: You might also want to check L<Archive::BagIt::Fast> to see a more direct way of accessing files (and thus faster).

=head2 calc_payload_oxum()

returns an array with octets and streamcount of payload-dir

=head2 calc_bagsize()

lib/Archive/BagIt/Role/Manifest.pm  view on Meta::CPAN

# $tmp->{filename} = $filename;
sub calc_digests {
    my ($self, $bagit, $filenames_ref) = @_;
    my @digest_hashes;
    my %digest_results;
    # serial variant
    @digest_hashes = map {$self->_fill_digest_hashref($bagit, $_)} @{$filenames_ref};
    return \@digest_hashes;
}

sub _verify_XXX_manifests {
    my ($self, $xxprefix, $xxmanifest_entries, $files_ref, $return_all_errors) = @_;
    # Read the manifest file
    my @files = @{ $files_ref };
    my @invalid_messages;
    my $bagit = $self->bagit->bag_path;
    my $algorithm = $self->algorithm()->name;
    my $subref_invalid_report_or_die = sub {
        my $message = shift;
        if (defined $return_all_errors) {
            push @invalid_messages, $message;

lib/Archive/BagIt/Role/Manifest.pm  view on Meta::CPAN

        push @{$self->bagit->{errors}},
            join("\n\t",
                sort @invalid_messages
            );
        return;
    }
    return 1;
}


sub verify_manifest {
    my ($self, $payload_files_ref, $return_all_errors) = @_;
    if ($self->manifest_file()) {
        return $self->_verify_XXX_manifests(
            "manifest",
            $self->manifest_entries(),
            $payload_files_ref,
            $return_all_errors
        );
    }
    return;
}


sub verify_tagmanifest {
    my ($self, $non_payload_files_ref, $return_all_errors) = @_;
    my @non_payload_files = grep {$_ !~ m#tagmanifest-[0-9a-zA-Z]+\.txt$#} @{ $non_payload_files_ref };
    if ($self->tagmanifest_file()) {
        return $self->_verify_XXX_manifests(
            "tagmanifest",
            $self->tagmanifest_entries(),
            \@non_payload_files,
            $return_all_errors
        );
    }
    return;
}

sub __create_xxmanifest {

lib/Archive/BagIt/Role/Manifest.pm  view on Meta::CPAN

   }

=head2 tagmanifest_entries()

returns the tagmanifest_entries() for the current digest algorithm, the result is hashref, see L<manifest_entries()>

=head2 calc_digests($bagit, $filenames_ref, $opts)

Method to calculate and return all digests for a a list of files. This method will be overwritten by L<Archive::BagIt::Fast>.

=head2 verify_manifest($payload_files, $return_all_errors)

check fixities of payload files in both directions

=head2 verify_tagmanifest($non_payload_files, $return_all_errors)

check fixities of non-payload files in both directions

=head2 create_manifest()

creates a new manifest file for payload files

=head2 create_tagmanifest()

creates a new tagmanifest file for non payload files

out/test/Archive-BagIt/base.t  view on Meta::CPAN

{
  my $bag = $Class->new({bag_path=>$SRC_BAG});
  ok($bag,        "Object created");
  isa_ok ($bag,   $Class);

  note ("checksum algos:", explain $bag->checksum_algos);
  note ("manifest files:", explain $bag->manifest_files);
  note ("bag path:", explain $bag->bag_path);
  note ("metadata path: ", explain $bag->metadata_path);
  note explain $bag->tagmanifest_files;
  my $result = $bag->verify_bag;
  ok($result,     "Bag verifies");
}

{
  note "copying to $DST_BAG";
  if(-d $DST_BAG) {
    rmtree($DST_BAG);
  }
  mkdir($DST_BAG);
  copy($SRC_FILES."/1", $DST_BAG);

out/test/Archive-BagIt/base.t  view on Meta::CPAN

  like (
    $warning ,
    qr/no payload path/,
    'Got expected warning from make_bag()',
  ) or diag 'got unexpected warnings:' , explain($warning);

  ok ($bag,       "Object created");
  isa_ok ($bag,   $Class);
  ok ($bag->load(), "Bag loaded");

  my $result = $bag->verify_bag();
  ok($result,     "Bag verifies");

  rmtree($DST_BAG);
}

{

  my $bag = $Class->new($SRC_BAG);
  my @manifests = $bag->manifest_files();
  my $cnt = scalar @manifests;

out/test/Archive-BagIt/payload_files.t  view on Meta::CPAN

my $DST_BAG = File::Spec->catdir(@ROOT, 'dst_bag');


#validate tests

{
  my $bag = $Class->new($SRC_BAG);
  ok($bag,        "Object created");
  isa_ok ($bag,   $Class);

  my $result = $bag->verify_bag();

  ok($result,     "Bag verifies");
}

{
  my $bag = $Class->new($SRC_BAG_DEEP);
  ok($bag,        "Object created");
  isa_ok ($bag,   $Class);

  my $result = $bag->verify_bag();

  ok($result,     "deep Bag verifies");
}


{
  mkdir($DST_BAG);
  copy($SRC_FILES."/1", $DST_BAG);
  copy($SRC_FILES."/2", $DST_BAG);

  my $bag = $Class->make_bag($DST_BAG);

  ok ($bag,       "Object created");
  isa_ok ($bag,   $Class);
  my $result = $bag->verify_bag();
  ok($result,     "Bag verifies");

  rmtree($DST_BAG);
}

{

  my $bag = $Class->new($SRC_BAG);
  my @manifests = $bag->manifest_files();
  my $cnt = scalar @manifests;

out/test/Archive-BagIt/verify_bag.t  view on Meta::CPAN

use open ':std', ':encoding(UTF-8)';
use Test::More tests => 122;
use Test::Exception;
use File::Spec;
use File::Path;
use File::Copy;
use File::Temp qw(tempdir);
use File::Slurp qw( read_file write_file);

## tests
# verify incorrect manifest or tagmanifest-checksums

my @alg = qw( md5 sha512);
my @prefix_manifestfiles = qw(tagmanifest manifest);

sub _prepare_bag {
    my ($bag_dir) = @_;
    mkpath($bag_dir . "/data");
    write_file("$bag_dir/data/payload1.txt", "PAYLOAD1" );
    write_file("$bag_dir/data/payload2.txt", "PAYLOAD2" );
    write_file("$bag_dir/data/payload3.txt", "PAYLOAD3" );

out/test/Archive-BagIt/verify_bag.t  view on Meta::CPAN

note "base tests";
my $Class_base = 'Archive::BagIt';
use_ok($Class_base);
foreach my $prefix (@prefix_manifestfiles) {
    foreach my $alg (@alg) {
        # preparation tests
        my $bag_dir = File::Temp::tempdir(CLEANUP => 1);
        _prepare_bag($bag_dir);
        my $bag_ok = Archive::BagIt->make_bag($bag_dir);
        isa_ok($bag_ok, 'Archive::BagIt', "create new valid IE bagit");
        ok($bag_ok->verify_bag(), "check if bag is verified correctly");
        my $bag_ok2 = Archive::BagIt->make_bag("$bag_dir/"); #add slash at end of $bag_dir
        isa_ok($bag_ok2, 'Archive::BagIt', "create new valid IE bagit (with slash)");
        ok($bag_ok2->verify_bag(), "check if bag is verified correctly (with slash)");
        _modify_bag("$bag_dir/$prefix-$alg.txt");
        # real tests
        my $bag_invalid1 = new_ok("Archive::BagIt" => [ bag_path => $bag_dir ]);
        throws_ok(
            sub {
                $bag_invalid1->verify_bag(
                    { return_all_errors => 1 }
                )
            }, qr{bag verify for bagit version '1.0' failed with invalid files}, "check if bag fails verification of broken $prefix-$alg.txt (all errors, standard)");
        my $bag_invalid2 = new_ok("Archive::BagIt" => [ bag_path => $bag_dir ]);
        throws_ok(
            sub {
                $bag_invalid2->verify_bag()
            }, qr{digest \($alg\) calculated=.*, but expected=}, "check if bag fails verification of broken $prefix-$alg.txt (first error, standard)");
    }
}

# special test to ensure that return_all_errors work as expected
note "return_all_errors tests";
{
    my $bag_dir = File::Temp::tempdir(CLEANUP => 1);
    _prepare_bag($bag_dir);
    SKIP: {
        skip "skipped because testbag could not created", 1 unless -d $bag_dir;
        my $bag_ok = Archive::BagIt->make_bag($bag_dir);
        isa_ok($bag_ok, 'Archive::BagIt', "create new valid IE bagit");
        ok($bag_ok->verify_bag(), "check if bag is verified correctly");
        write_file("$bag_dir/data/payload1.txt", "PAYLOAD_MODIFIED1");
        # write_file("$bag_dir/data/payload2.txt", "PAYLOAD2" );
        write_file("$bag_dir/data/payload3.txt", "PAYLOAD3_MODIFIED3");
        _modify_bag("$bag_dir/tagmanifest-sha512.txt");
        _modify_bag("$bag_dir/tagmanifest-md5.txt");
        my $bag_invalid1 = new_ok("Archive::BagIt" => [ bag_path => $bag_dir ]);
        throws_ok(
            sub {
                $bag_invalid1->verify_bag()
            },
            qr{file.*'data/payload1.txt'.* invalid, digest.*'}s,
            "check if bag fails verification of broken fixity for payload (all errors)"
        );
        my $bag_invalid2 = new_ok("Archive::BagIt" => [ bag_path => $bag_dir ]);
        throws_ok(
            sub {
                $bag_invalid2->verify_bag(
                    { return_all_errors => 1 }
                )
            },
            qr{bag verify for bagit version '1.0' failed with invalid files.*file.*normalized='data/payload1.txt'.*file.*normalized='data/payload3.txt'.*file.*normalized='bag-info.txt'}s,
            "check if bag fails verification of broken fixity for payload (all errors)"
        );
    }
}
# tests against bagit conformance testsuite of Library of Congress, see README.1st in t/src/bagit_conformance_suite for details
{
    my @should_fail_bags_097 = (
        [ qw(../bagit_conformance_suite/v0.97/invalid/baginfo-missing-encoding), qr{Encoding line missed} ],
        [ qw(../bagit_conformance_suite/v0.97/invalid/bom-in-bagit.txt), qr{Version string '.+BagIt-Version: 0\.97'.* is incorrect} ],
        [ qw(../bagit_conformance_suite/v0.97/invalid/corrupt-data-file), qr{file '.*' \(normalized='data/bare-filename'\) invalid, digest \(md5\).*} ],

out/test/Archive-BagIt/verify_bag.t  view on Meta::CPAN

        [qw(../bagit_conformance_suite/v0.97/valid/bag-with-leading-dot-slash-in-manifest), "bagit_conformance_suite/v0.97/valid/bag-with-leading-dot-slash-in-manifest"],
        [qw(../bagit_conformance_suite/v0.97/valid/bag-with-space), "bagit_conformance_suite/v0.97/valid/bag-with-space"],
    );
    note "version 0.97 conformance tests";

    foreach my $entry (@should_fail_bags_097) {
        my $bagdir = $entry->[0];
        my $descr = $bagdir; $descr =~ s|../bagit_conformance_suite/||;
        my $expected = $entry->[1];
        my $bag = new_ok ("Archive::BagIt" => [ bag_path => $bagdir ]);
        throws_ok(sub{ $bag->verify_bag();}, $expected, "conformance v0.97, fail: $descr");
    }
    foreach my $entry ( @should_pass_bags_097) {
        my $bagdir = $entry->[0];
        my $descr = $bagdir; $descr =~ s|../bagit_conformance_suite/||;
        my $expected = $entry->[1];
        my $bag = new_ok ("Archive::BagIt" => [ bag_path => $bagdir ]);
        ok(sub{ $bag->verify_bag();}, "conformance v0.97, pass: $descr");
    }

    note "version 1.0 conformance tests";
    foreach my $entry ( @should_fail_bags_100) {
        my $bagdir = $entry->[0];
        my $descr = $bagdir; $descr =~ s|../bagit_conformance_suite/||;
        my $expected = $entry->[1];
        my $bag = new_ok ("Archive::BagIt" => [ bag_path => $bagdir ]);
        throws_ok(sub{ $bag->verify_bag();}, $expected, "conformance v1.0, fail: $descr");
    }
    foreach my $entry ( @should_pass_bags_100) {
        my $bagdir = $entry->[0];
        my $descr = $bagdir; $descr =~ s|../bagit_conformance_suite/||;
        my $expected = $entry->[1];

        my $bag = new_ok ("Archive::BagIt" => [ bag_path => $bagdir ]);
        ok(sub{ $bag->verify_bag();}, "conformance v1.0, pass: $descr");
    }

    { # check if payload oxum is verified correctly
        my $bag_dir = File::Temp::tempdir(CLEANUP => 1);
        _prepare_bag($bag_dir);
        my $bag_ok = Archive::BagIt->make_bag($bag_dir);
        isa_ok($bag_ok, 'Archive::BagIt', "create new valid IE bagit");
        ok($bag_ok->verify_bag(), "check if bag is verified correctly");
        # modify payload oxum
        my $bif = File::Spec->catfile($bag_dir, "bag-info.txt");
        my $bi = read_file($bif );
        $bi =~ s/Payload-Oxum: .*/Payload-Oxum: 0.0/;
        write_file($bif, $bi);
        # also modify tagmanifest files to be valid
        my $bag = Archive::BagIt->new( $bag_dir);
        foreach my $algorithm ( keys %{ $bag->manifests }) {
            ok($bag->manifests->{$algorithm}->create_tagmanifest(), "rewrite tagmanifests for $algorithm");
        }
        my $bag_invalid = Archive::BagIt->new( $bag_dir);
        throws_ok(
            sub {
                $bag_invalid->verify_bag(
                    { return_all_errors => 1 }
                )
            }, qr{bag verify for bagit version '1.0' failed with invalid files}, "check if bag fails verification of broken Payload-Oxum"
        );
    }
}

1;

out/test/Archive-BagIt/verify_baginfo.t  view on Meta::CPAN

    delete $bag->{'bag_info'};
    throws_ok( sub{$bag->_parse_bag_info( undef )}, qr{_parse_baginfo.* called with undef value}, "bag-info parsing undef");
}
{
    delete $bag->{'warnings'};
    delete $bag->{'errors'};
    delete $bag->{'bag_info'};
    my $got = $bag->_parse_bag_info( "" );
    is_deeply( $got, [], "bag-info parsing valid empty");
    $bag->{"bag_info"} = $got;
    ok($bag->verify_baginfo(), "bag-info verify valid empty");
    is_deeply( $bag->{warnings}, ["Payload-Oxum was expected in bag-info.txt, but not found!"], "bag-info parsing valid empty, warning for missed payload oxum");
}

{
    delete $bag->{'warnings'};
    delete $bag->{'errors'};
    delete $bag->{'bag_info'};
    my $input =<<BAGINFO;
Foo: Bar
Foo1: Baz

out/test/Archive-BagIt/verify_baginfo.t  view on Meta::CPAN

        { "Foo1", "Baz"},
        { "Foo2", "Bar2"},
        { "Foo3", "Bar3"},
        { "Foo4", "Bar4\n  Baz4\n  Bay4"},
        { "Foo5", "Bar5"},
        { "Foo6", "Bar6: Baz6"}
    );
    my $got = $bag->_parse_bag_info( $input );
    is_deeply( $got, \@expected, "bag-info parsing valid");
    $bag->{"bag_info"} = $got;
    ok($bag->verify_baginfo(), "bag-info verify valid");
    is_deeply( $bag->{warnings}, ["Payload-Oxum was expected in bag-info.txt, but not found!"], "bag-info parsing valid, warning for missed payload oxum");
}

{
    delete $bag->{'warnings'};
    delete $bag->{'errors'};
    delete $bag->{'bag_info'};
    my $input =<<BAGINFO;
Foo:
BAGINFO
    my $got = $bag->_parse_bag_info( $input );
    is_deeply( $got, [], "bag-info parsing invalid");
    $bag->{"bag_info"} = $got;
    ok(!$bag->verify_baginfo(), "bag-info verify invalid");
    #is_deeply( $bag->{warnings}, ["Payload-Oxum was expected in bag-info.txt, but not found!"], "bag-info parsing valid, warning for missed payload oxum");
    is_deeply($bag->{errors}, ["the baginfo file '$SRC_BAG/bag-info.txt' could not be parsed correctly, because following text blob not fullfill the match requirements for values: '\n'"], "bag-info parsing valid, error logged" );
}

{
    my $input =<<BAGINFO;
 ::: foo
BAGINFO
    my $dir = tempdir(CLEANUP => 1);
    write_file(File::Spec->catfile($dir, "bag-info.txt"), $input);
    my $bag2 = Archive::BagIt->new({bag_path=>$dir});
    my $got = $bag2->verify_baginfo();
    ok(!$bag2->verify_baginfo(), "bag-info verify fully invalid");
    is_deeply($bag2->{errors}, ["the baginfo file '$dir/bag-info.txt' could not be parsed correctly, because following text blob not fullfill the match requirements for keys: '$input'"], "bag-info parsing valid, error logged" );
}
{

    my $input =<<BAGINFO;
Foo: Bar
Foo: Baz
Foo2 : Bar2
Foo3:   Bar3
Foo4: Bar4
  Baz4
  Bay4
Foo5: Bar5
Foo6: Bar6: Baz6
BAGINFO
    my $dir = tempdir(CLEANUP => 1);
    write_file(File::Spec->catfile($dir, "bag-info.txt"), $input);
    my $bag2 = Archive::BagIt->new({bag_path=>$dir});
    ok($bag2->verify_baginfo(), "bag-info verify fully valid");
    is_deeply($bag2->{warnings}, ["Payload-Oxum was expected in bag-info.txt, but not found!"], "bag-info parsing fully valid, warning for missed payload oxum");
    is_deeply($bag2->{errors}, [], "bag-info verify fully valid, no error log exists");
}

{
    delete $bag->{'warnings'};
    delete $bag->{'errors'};
    delete $bag->{'bag_info'};
    my $input = <<BAGINFO;
test:
Bagging-Date: 2025-02-20
Bag-Software-Agent: Archive::BagIt <https://metacpan.org/pod/Archive::BagIt>

out/test/Archive-BagIt/verify_baginfo.t  view on Meta::CPAN

BAGINFO
    my $expected = <<EXPECTED;

Bagging-Date: 2025-02-20
Bag-Software-Agent: Archive::BagIt <https://metacpan.org/pod/Archive::BagIt>
Payload-Oxum: 1.1
Bag-Size: 1 B
EXPECTED

    my $got = $bag->_parse_bag_info($input);
    is_deeply($got, [], "bag-info verify fully valid, parsed output");
    #is_deeply($bag->{warnings}, ["Payload-Oxum was expected in bag-info.txt, but not found!"], "bag-info parsing fully valid, warning for missed payload oxum");
    is_deeply($bag->{errors},
        ["the baginfo file 'src/src_bag/bag-info.txt' could not be parsed correctly, because following text blob not fullfill the match requirements for values: '$expected'"], "bag-info verify fully valid, no error log exists");
}

{
    my $dir = tempdir(CLEANUP => 1);
    my $input =<<BAGINFO;
test:
BAGINFO
    mkdir(File::Spec->catdir($dir, "data"));
    write_file(File::Spec->catfile($dir, "data", "1.txt"), "1");
    write_file(File::Spec->catfile($dir, "bag-info.txt"), $input);

out/test/Archive-BagIt/verify_baginfo.t  view on Meta::CPAN

    my $dir = tempdir(CLEANUP => 1);
    my $input =<<BAGINFO;
test:
BAGINFO
    mkdir(File::Spec->catdir($dir, "data"));
    write_file(File::Spec->catfile($dir, "bagit.txt"), "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8");
    write_file(File::Spec->catfile($dir, "bag-info.txt"), $input);
    write_file(File::Spec->catfile($dir, "manifest-sha512.txt"), "");

    my $bag4=Archive::BagIt->new($dir);
    ok(!$bag4->verify_baginfo(), "verify_baginfo() with broken bag-info.txt");
    throws_ok(sub{$bag4->verify_bag({report_all_errors => 1})}, qr{the baginfo file .* could not be parsed correctly, because following text blob not fullfill the match requirements for values}, "verify_bag() with broken bag-info.txt");
}

{
    my $bag = new_ok("Archive::BagIt" => [ bag_path => File::Spec->catdir(@ROOT, 'broken_baginfo') ]);
    throws_ok(sub {$bag->verify_bag({ report_all_errors => 1 })}, qr{bag verify for bagit version '1.0' failed with invalid files.\nthe baginfo file .* could not be parsed correctly, because following text blob not fullfill the match requirements for...
}

1;

out/test/Archive-BagIt/verify_simple.t  view on Meta::CPAN

use warnings;
use diagnostics;
use Test::More tests => 4;
use Test::Exception;
my $valid_bag = "bagit_conformance_suite/v1.0/valid/basicBag";
my $invalid_bag = "bagit_conformance_suite/v1.0/invalid/missing-from-manifest";
use Archive::BagIt;

{
    my $bag = new_ok("Archive::BagIt" => [ bag_path => $valid_bag]);
    ok($bag->verify_bag(), "conformance v1.0, pass");
}
#
#
#
{
    my $bag = new_ok("Archive::BagIt" => [ bag_path => $invalid_bag]);
    throws_ok(sub {$bag->verify_bag()}, qr{which is not in}, "conformance v1.0, fail");
}

1;

t/base.t  view on Meta::CPAN

{
  my $bag = $Class->new({bag_path=>$SRC_BAG});
  ok($bag,        "Object created");
  isa_ok ($bag,   $Class);

  note ("checksum algos:", explain $bag->checksum_algos);
  note ("manifest files:", explain $bag->manifest_files);
  note ("bag path:", explain $bag->bag_path);
  note ("metadata path: ", explain $bag->metadata_path);
  note explain $bag->tagmanifest_files;
  my $result = $bag->verify_bag;
  ok($result,     "Bag verifies");
}

{
  note "copying to $DST_BAG";
  if(-d $DST_BAG) {
    rmtree($DST_BAG);
  }
  mkdir($DST_BAG);
  copy($SRC_FILES."/1", $DST_BAG);

t/base.t  view on Meta::CPAN

  like (
    $warning ,
    qr/no payload path/,
    'Got expected warning from make_bag()',
  ) or diag 'got unexpected warnings:' , explain($warning);

  ok ($bag,       "Object created");
  isa_ok ($bag,   $Class);
  ok ($bag->load(), "Bag loaded");

  my $result = $bag->verify_bag();
  ok($result,     "Bag verifies");

  rmtree($DST_BAG);
}

{

  my $bag = $Class->new($SRC_BAG);
  my @manifests = $bag->manifest_files();
  my $cnt = scalar @manifests;

t/payload_files.t  view on Meta::CPAN

my $DST_BAG = File::Spec->catdir(@ROOT, 'dst_bag');


#validate tests

{
  my $bag = $Class->new($SRC_BAG);
  ok($bag,        "Object created");
  isa_ok ($bag,   $Class);

  my $result = $bag->verify_bag();

  ok($result,     "Bag verifies");
}

{
  my $bag = $Class->new($SRC_BAG_DEEP);
  ok($bag,        "Object created");
  isa_ok ($bag,   $Class);

  my $result = $bag->verify_bag();

  ok($result,     "deep Bag verifies");
}


{
  mkdir($DST_BAG);
  copy($SRC_FILES."/1", $DST_BAG);
  copy($SRC_FILES."/2", $DST_BAG);

  my $bag = $Class->make_bag($DST_BAG);

  ok ($bag,       "Object created");
  isa_ok ($bag,   $Class);
  my $result = $bag->verify_bag();
  ok($result,     "Bag verifies");

  rmtree($DST_BAG);
}

{

  my $bag = $Class->new($SRC_BAG);
  my @manifests = $bag->manifest_files();
  my $cnt = scalar @manifests;

t/verify_bag.t  view on Meta::CPAN

use open ':std', ':encoding(UTF-8)';
use Test::More tests => 122;
use Test::Exception;
use File::Spec;
use File::Path;
use File::Copy;
use File::Temp qw(tempdir);
use File::Slurp qw( read_file write_file);

## tests
# verify incorrect manifest or tagmanifest-checksums

my @alg = qw( md5 sha512);
my @prefix_manifestfiles = qw(tagmanifest manifest);

sub _prepare_bag {
    my ($bag_dir) = @_;
    mkpath($bag_dir . "/data");
    write_file("$bag_dir/data/payload1.txt", "PAYLOAD1" );
    write_file("$bag_dir/data/payload2.txt", "PAYLOAD2" );
    write_file("$bag_dir/data/payload3.txt", "PAYLOAD3" );

t/verify_bag.t  view on Meta::CPAN

note "base tests";
my $Class_base = 'Archive::BagIt';
use_ok($Class_base);
foreach my $prefix (@prefix_manifestfiles) {
    foreach my $alg (@alg) {
        # preparation tests
        my $bag_dir = File::Temp::tempdir(CLEANUP => 1);
        _prepare_bag($bag_dir);
        my $bag_ok = Archive::BagIt->make_bag($bag_dir);
        isa_ok($bag_ok, 'Archive::BagIt', "create new valid IE bagit");
        ok($bag_ok->verify_bag(), "check if bag is verified correctly");
        my $bag_ok2 = Archive::BagIt->make_bag("$bag_dir/"); #add slash at end of $bag_dir
        isa_ok($bag_ok2, 'Archive::BagIt', "create new valid IE bagit (with slash)");
        ok($bag_ok2->verify_bag(), "check if bag is verified correctly (with slash)");
        _modify_bag("$bag_dir/$prefix-$alg.txt");
        # real tests
        my $bag_invalid1 = new_ok("Archive::BagIt" => [ bag_path => $bag_dir ]);
        throws_ok(
            sub {
                $bag_invalid1->verify_bag(
                    { return_all_errors => 1 }
                )
            }, qr{bag verify for bagit version '1.0' failed with invalid files}, "check if bag fails verification of broken $prefix-$alg.txt (all errors, standard)");
        my $bag_invalid2 = new_ok("Archive::BagIt" => [ bag_path => $bag_dir ]);
        throws_ok(
            sub {
                $bag_invalid2->verify_bag()
            }, qr{digest \($alg\) calculated=.*, but expected=}, "check if bag fails verification of broken $prefix-$alg.txt (first error, standard)");
    }
}

# special test to ensure that return_all_errors work as expected
note "return_all_errors tests";
{
    my $bag_dir = File::Temp::tempdir(CLEANUP => 1);
    _prepare_bag($bag_dir);
    SKIP: {
        skip "skipped because testbag could not created", 1 unless -d $bag_dir;
        my $bag_ok = Archive::BagIt->make_bag($bag_dir);
        isa_ok($bag_ok, 'Archive::BagIt', "create new valid IE bagit");
        ok($bag_ok->verify_bag(), "check if bag is verified correctly");
        write_file("$bag_dir/data/payload1.txt", "PAYLOAD_MODIFIED1");
        # write_file("$bag_dir/data/payload2.txt", "PAYLOAD2" );
        write_file("$bag_dir/data/payload3.txt", "PAYLOAD3_MODIFIED3");
        _modify_bag("$bag_dir/tagmanifest-sha512.txt");
        _modify_bag("$bag_dir/tagmanifest-md5.txt");
        my $bag_invalid1 = new_ok("Archive::BagIt" => [ bag_path => $bag_dir ]);
        throws_ok(
            sub {
                $bag_invalid1->verify_bag()
            },
            qr{file.*'data/payload1.txt'.* invalid, digest.*'}s,
            "check if bag fails verification of broken fixity for payload (all errors)"
        );
        my $bag_invalid2 = new_ok("Archive::BagIt" => [ bag_path => $bag_dir ]);
        throws_ok(
            sub {
                $bag_invalid2->verify_bag(
                    { return_all_errors => 1 }
                )
            },
            qr{bag verify for bagit version '1.0' failed with invalid files.*file.*normalized='data/payload1.txt'.*file.*normalized='data/payload3.txt'.*file.*normalized='bag-info.txt'}s,
            "check if bag fails verification of broken fixity for payload (all errors)"
        );
    }
}
# tests against bagit conformance testsuite of Library of Congress, see README.1st in t/src/bagit_conformance_suite for details
{
    my @should_fail_bags_097 = (
        [ qw(../bagit_conformance_suite/v0.97/invalid/baginfo-missing-encoding), qr{Encoding line missed} ],
        [ qw(../bagit_conformance_suite/v0.97/invalid/bom-in-bagit.txt), qr{Version string '.+BagIt-Version: 0\.97'.* is incorrect} ],
        [ qw(../bagit_conformance_suite/v0.97/invalid/corrupt-data-file), qr{file '.*' \(normalized='data/bare-filename'\) invalid, digest \(md5\).*} ],

t/verify_bag.t  view on Meta::CPAN

        [qw(../bagit_conformance_suite/v0.97/valid/bag-with-leading-dot-slash-in-manifest), "bagit_conformance_suite/v0.97/valid/bag-with-leading-dot-slash-in-manifest"],
        [qw(../bagit_conformance_suite/v0.97/valid/bag-with-space), "bagit_conformance_suite/v0.97/valid/bag-with-space"],
    );
    note "version 0.97 conformance tests";

    foreach my $entry (@should_fail_bags_097) {
        my $bagdir = $entry->[0];
        my $descr = $bagdir; $descr =~ s|../bagit_conformance_suite/||;
        my $expected = $entry->[1];
        my $bag = new_ok ("Archive::BagIt" => [ bag_path => $bagdir ]);
        throws_ok(sub{ $bag->verify_bag();}, $expected, "conformance v0.97, fail: $descr");
    }
    foreach my $entry ( @should_pass_bags_097) {
        my $bagdir = $entry->[0];
        my $descr = $bagdir; $descr =~ s|../bagit_conformance_suite/||;
        my $expected = $entry->[1];
        my $bag = new_ok ("Archive::BagIt" => [ bag_path => $bagdir ]);
        ok(sub{ $bag->verify_bag();}, "conformance v0.97, pass: $descr");
    }

    note "version 1.0 conformance tests";
    foreach my $entry ( @should_fail_bags_100) {
        my $bagdir = $entry->[0];
        my $descr = $bagdir; $descr =~ s|../bagit_conformance_suite/||;
        my $expected = $entry->[1];
        my $bag = new_ok ("Archive::BagIt" => [ bag_path => $bagdir ]);
        throws_ok(sub{ $bag->verify_bag();}, $expected, "conformance v1.0, fail: $descr");
    }
    foreach my $entry ( @should_pass_bags_100) {
        my $bagdir = $entry->[0];
        my $descr = $bagdir; $descr =~ s|../bagit_conformance_suite/||;
        my $expected = $entry->[1];

        my $bag = new_ok ("Archive::BagIt" => [ bag_path => $bagdir ]);
        ok(sub{ $bag->verify_bag();}, "conformance v1.0, pass: $descr");
    }

    { # check if payload oxum is verified correctly
        my $bag_dir = File::Temp::tempdir(CLEANUP => 1);
        _prepare_bag($bag_dir);
        my $bag_ok = Archive::BagIt->make_bag($bag_dir);
        isa_ok($bag_ok, 'Archive::BagIt', "create new valid IE bagit");
        ok($bag_ok->verify_bag(), "check if bag is verified correctly");
        # modify payload oxum
        my $bif = File::Spec->catfile($bag_dir, "bag-info.txt");
        my $bi = read_file($bif );
        $bi =~ s/Payload-Oxum: .*/Payload-Oxum: 0.0/;
        write_file($bif, $bi);
        # also modify tagmanifest files to be valid
        my $bag = Archive::BagIt->new( $bag_dir);
        foreach my $algorithm ( keys %{ $bag->manifests }) {
            ok($bag->manifests->{$algorithm}->create_tagmanifest(), "rewrite tagmanifests for $algorithm");
        }
        my $bag_invalid = Archive::BagIt->new( $bag_dir);
        throws_ok(
            sub {
                $bag_invalid->verify_bag(
                    { return_all_errors => 1 }
                )
            }, qr{bag verify for bagit version '1.0' failed with invalid files}, "check if bag fails verification of broken Payload-Oxum"
        );
    }
}

1;

t/verify_baginfo.t  view on Meta::CPAN

    delete $bag->{'bag_info'};
    throws_ok( sub{$bag->_parse_bag_info( undef )}, qr{_parse_baginfo.* called with undef value}, "bag-info parsing undef");
}
{
    delete $bag->{'warnings'};
    delete $bag->{'errors'};
    delete $bag->{'bag_info'};
    my $got = $bag->_parse_bag_info( "" );
    is_deeply( $got, [], "bag-info parsing valid empty");
    $bag->{"bag_info"} = $got;
    ok($bag->verify_baginfo(), "bag-info verify valid empty");
    is_deeply( $bag->{warnings}, ["Payload-Oxum was expected in bag-info.txt, but not found!"], "bag-info parsing valid empty, warning for missed payload oxum");
}

{
    delete $bag->{'warnings'};
    delete $bag->{'errors'};
    delete $bag->{'bag_info'};
    my $input =<<BAGINFO;
Foo: Bar
Foo1: Baz

t/verify_baginfo.t  view on Meta::CPAN

        { "Foo1", "Baz"},
        { "Foo2", "Bar2"},
        { "Foo3", "Bar3"},
        { "Foo4", "Bar4\n  Baz4\n  Bay4"},
        { "Foo5", "Bar5"},
        { "Foo6", "Bar6: Baz6"}
    );
    my $got = $bag->_parse_bag_info( $input );
    is_deeply( $got, \@expected, "bag-info parsing valid");
    $bag->{"bag_info"} = $got;
    ok($bag->verify_baginfo(), "bag-info verify valid");
    is_deeply( $bag->{warnings}, ["Payload-Oxum was expected in bag-info.txt, but not found!"], "bag-info parsing valid, warning for missed payload oxum");
}

{
    delete $bag->{'warnings'};
    delete $bag->{'errors'};
    delete $bag->{'bag_info'};
    my $input =<<BAGINFO;
Foo:
BAGINFO
    my $got = $bag->_parse_bag_info( $input );
    is_deeply( $got, [], "bag-info parsing invalid");
    $bag->{"bag_info"} = $got;
    ok(!$bag->verify_baginfo(), "bag-info verify invalid");
    #is_deeply( $bag->{warnings}, ["Payload-Oxum was expected in bag-info.txt, but not found!"], "bag-info parsing valid, warning for missed payload oxum");
    is_deeply($bag->{errors}, ["the baginfo file '$SRC_BAG/bag-info.txt' could not be parsed correctly, because following text blob not fullfill the match requirements for values: '\n'"], "bag-info parsing valid, error logged" );
}

{
    my $input =<<BAGINFO;
 ::: foo
BAGINFO
    my $dir = tempdir(CLEANUP => 1);
    write_file(File::Spec->catfile($dir, "bag-info.txt"), $input);
    my $bag2 = Archive::BagIt->new({bag_path=>$dir});
    my $got = $bag2->verify_baginfo();
    ok(!$bag2->verify_baginfo(), "bag-info verify fully invalid");
    is_deeply($bag2->{errors}, ["the baginfo file '$dir/bag-info.txt' could not be parsed correctly, because following text blob not fullfill the match requirements for keys: '$input'"], "bag-info parsing valid, error logged" );
}
{

    my $input =<<BAGINFO;
Foo: Bar
Foo: Baz
Foo2 : Bar2
Foo3:   Bar3
Foo4: Bar4
  Baz4
  Bay4
Foo5: Bar5
Foo6: Bar6: Baz6
BAGINFO
    my $dir = tempdir(CLEANUP => 1);
    write_file(File::Spec->catfile($dir, "bag-info.txt"), $input);
    my $bag2 = Archive::BagIt->new({bag_path=>$dir});
    ok($bag2->verify_baginfo(), "bag-info verify fully valid");
    is_deeply($bag2->{warnings}, ["Payload-Oxum was expected in bag-info.txt, but not found!"], "bag-info parsing fully valid, warning for missed payload oxum");
    is_deeply($bag2->{errors}, [], "bag-info verify fully valid, no error log exists");
}

{
    delete $bag->{'warnings'};
    delete $bag->{'errors'};
    delete $bag->{'bag_info'};
    my $input = <<BAGINFO;
test:
Bagging-Date: 2025-02-20
Bag-Software-Agent: Archive::BagIt <https://metacpan.org/pod/Archive::BagIt>

t/verify_baginfo.t  view on Meta::CPAN

BAGINFO
    my $expected = <<EXPECTED;

Bagging-Date: 2025-02-20
Bag-Software-Agent: Archive::BagIt <https://metacpan.org/pod/Archive::BagIt>
Payload-Oxum: 1.1
Bag-Size: 1 B
EXPECTED

    my $got = $bag->_parse_bag_info($input);
    is_deeply($got, [], "bag-info verify fully valid, parsed output");
    #is_deeply($bag->{warnings}, ["Payload-Oxum was expected in bag-info.txt, but not found!"], "bag-info parsing fully valid, warning for missed payload oxum");
    is_deeply($bag->{errors},
        ["the baginfo file 'src/src_bag/bag-info.txt' could not be parsed correctly, because following text blob not fullfill the match requirements for values: '$expected'"], "bag-info verify fully valid, no error log exists");
}

{
    my $dir = tempdir(CLEANUP => 1);
    my $input =<<BAGINFO;
test:
BAGINFO
    mkdir(File::Spec->catdir($dir, "data"));
    write_file(File::Spec->catfile($dir, "data", "1.txt"), "1");
    write_file(File::Spec->catfile($dir, "bag-info.txt"), $input);

t/verify_baginfo.t  view on Meta::CPAN

    my $dir = tempdir(CLEANUP => 1);
    my $input =<<BAGINFO;
test:
BAGINFO
    mkdir(File::Spec->catdir($dir, "data"));
    write_file(File::Spec->catfile($dir, "bagit.txt"), "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8");
    write_file(File::Spec->catfile($dir, "bag-info.txt"), $input);
    write_file(File::Spec->catfile($dir, "manifest-sha512.txt"), "");

    my $bag4=Archive::BagIt->new($dir);
    ok(!$bag4->verify_baginfo(), "verify_baginfo() with broken bag-info.txt");
    throws_ok(sub{$bag4->verify_bag({report_all_errors => 1})}, qr{the baginfo file .* could not be parsed correctly, because following text blob not fullfill the match requirements for values}, "verify_bag() with broken bag-info.txt");
}

{
    my $bag = new_ok("Archive::BagIt" => [ bag_path => File::Spec->catdir(@ROOT, 'broken_baginfo') ]);
    throws_ok(sub {$bag->verify_bag({ report_all_errors => 1 })}, qr{bag verify for bagit version '1.0' failed with invalid files.\nthe baginfo file .* could not be parsed correctly, because following text blob not fullfill the match requirements for...
}

1;

t/verify_simple.t  view on Meta::CPAN

use warnings;
use diagnostics;
use Test::More tests => 4;
use Test::Exception;
my $valid_bag = "bagit_conformance_suite/v1.0/valid/basicBag";
my $invalid_bag = "bagit_conformance_suite/v1.0/invalid/missing-from-manifest";
use Archive::BagIt;

{
    my $bag = new_ok("Archive::BagIt" => [ bag_path => $valid_bag]);
    ok($bag->verify_bag(), "conformance v1.0, pass");
}
#
#
#
{
    my $bag = new_ok("Archive::BagIt" => [ bag_path => $invalid_bag]);
    throws_ok(sub {$bag->verify_bag()}, qr{which is not in}, "conformance v1.0, fail");
}

1;

xt/author/eol.t  view on Meta::CPAN

    't/src/src_bag_deep/bag-info.txt',
    't/src/src_bag_deep/bagit.txt',
    't/src/src_bag_deep/data/3',
    't/src/src_bag_deep/data/subdir1/1',
    't/src/src_bag_deep/data/subdir2/subsubdir/2',
    't/src/src_bag_deep/manifest-md5.txt',
    't/src/src_bag_deep/tagmanifest-md5.txt',
    't/src/src_files/1',
    't/src/src_files/2',
    't/store_bag.t',
    't/verify_bag.t',
    't/verify_baginfo.t',
    't/verify_simple.t'
);

eol_unix_ok($_, { trailing_whitespace => 1 }) foreach @files;
done_testing;

xt/author/no-tabs.t  view on Meta::CPAN

    't/src/src_bag_deep/bag-info.txt',
    't/src/src_bag_deep/bagit.txt',
    't/src/src_bag_deep/data/3',
    't/src/src_bag_deep/data/subdir1/1',
    't/src/src_bag_deep/data/subdir2/subsubdir/2',
    't/src/src_bag_deep/manifest-md5.txt',
    't/src/src_bag_deep/tagmanifest-md5.txt',
    't/src/src_files/1',
    't/src/src_files/2',
    't/store_bag.t',
    't/verify_bag.t',
    't/verify_baginfo.t',
    't/verify_simple.t'
);

notabs_ok($_) foreach @files;
done_testing;



( run in 1.937 second using v1.01-cache-2.11-cpan-39bf76dae61 )