Datahub-Factory

lib/Datahub/Factory/Command/transport.pm


=back

=head3 Plugin configuration

    [Importer]
    plugin = OAI
    id_path = 'lidoRecID.0._'

    [plugin_importer_OAI]
    endpoint = https://oai.my.museum/oai

    [Fixer]
    plugin = Fix

    [plugin_fixer_Fix]
    file_name = '/home/datahub/my.fix'

    [Exporter]
    plugin = YAML

lib/Datahub/Factory/Command/transport.pm

    condition = record.institution_name
    fixers = FOO, BAR

    [plugin_fixer_Fix]
    file_name = /home/datahub/my.fix

The C<[plugin_fixer_Fix]> section can either load a fix file directly (via
the C<file_name> option) or conditionally load one of several fix files, so
that a single data stream can be processed with different fixes (e.g. when
two institutions with different data models share the same API endpoint).
Conditional loading is enabled by setting the C<condition> and C<fixers>
options.
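
As a rough illustration of the selection logic described above, the following core-Perl sketch picks a fix file by comparing a value from the record with each fixer's C<condition>. The hash layout, the C<select_fix_file> helper, the flat C<institution_name> key, the C<BAR> section and all file paths are assumptions made for this example; this is not the module's actual implementation.

```perl
use strict;
use warnings;

# Hypothetical in-memory form of the [plugin_fixer_*] sections.
my %config = (
    plugin_fixer_Fix => {
        condition => 'institution_name',    # record key to inspect (simplified)
        fixers    => ['FOO', 'BAR'],
    },
    plugin_fixer_FOO => {
        condition => 'Museum of Foo',
        file_name => '/home/datahub/foo.fix',    # assumed path
    },
    plugin_fixer_BAR => {
        condition => 'Museum of Bar',            # assumed value
        file_name => '/home/datahub/bar.fix',    # assumed path
    },
);

# Return the fix file whose condition equals the record value,
# or nothing when no fixer matches.
sub select_fix_file {
    my ($config, $record) = @_;
    my $main  = $config->{plugin_fixer_Fix};
    my $value = $record->{ $main->{condition} };
    return unless defined $value;

    for my $name (@{ $main->{fixers} }) {
        my $section = $config->{"plugin_fixer_$name"} or next;
        return $section->{file_name} if $section->{condition} eq $value;
    }
    return;
}

my $file = select_fix_file(\%config, { institution_name => 'Museum of Bar' });
print "$file\n";
```

Records whose value matches none of the listed fixers yield no fix file here; how the real plugin treats that case is not covered by this sketch.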

=head4 Conditional fixers

    [plugin_fixer_Fix]
    condition = record.institution_name
    fixers = FOO, BAR

    [plugin_fixer_FOO]
    condition = 'Museum of Foo'

lib/Datahub/Factory/Exporter.pm

        oauth_password      => 'adatahub'
    };

    my $exporter = Datahub::Factory->exporter('Datahub')->new($datahub_options);

    $exporter->add({'id' => 1});

=head1 DESCRIPTION

Datahub::Factory::Exporter is a L<role|Moose::Role> for packages that export
data to an endpoint. It enforces a generic, reusable interface so that
different exporter packages can be loaded and executed programmatically.
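
The idea of a shared interface can be sketched in core Perl without any role system. In the example below, C<My::Exporter::Memory>, C<load_exporter> and the list of required methods are hypothetical names chosen for illustration. The real role may enforce its interface differently; only the C<add> method is taken from the synopsis above.

```perl
use strict;
use warnings;

# Hypothetical exporter that keeps records in memory.
package My::Exporter::Memory;

sub new {
    my ($class, %args) = @_;
    return bless { records => [] }, $class;
}

sub add {
    my ($self, $item) = @_;
    push @{ $self->{records} }, $item;
    return 1;
}

package main;

# Methods every exporter must provide in this sketch.
my @required = qw(add);

# Check the interface by name, then instantiate the package.
sub load_exporter {
    my ($class, %options) = @_;
    for my $method (@required) {
        die "$class does not implement $method\n"
            unless $class->can($method);
    }
    return $class->new(%options);
}

my $exporter = load_exporter('My::Exporter::Memory');
$exporter->add({ id => 1 });
print scalar(@{ $exporter->{records} }), "\n";
```

Because the caller relies only on the agreed method names, any package that provides them can be swapped in by name.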

=head1 AUTHORS

Pieter De Praetere <pieter@packed.be>

Matthias Vandermaesen <matthias.vandermaesen@vlaamsekunstcollectie.be>

=head1 COPYRIGHT

lib/Datahub/Factory/Importer.pm

=head1 NAME

Datahub::Factory::Importer - Namespace for importer packages

=head1 SYNOPSIS

    use Datahub::Factory;
    use Data::Dumper qw(Dumper);

    my $importer_options = {
        endpoint => 'https://my.oai.org/oai'
    };

    my $importer = Datahub::Factory->importer('OAI')->new($importer_options);

    $importer->importer->each(sub {
        my $item = shift;
        print Dumper($item);
    });

=head1 DESCRIPTION

lib/Datahub/Factory/Importer/CollectiveAccess.pm

use Datahub::Factory::Sane;

our $VERSION = '1.77';

use Moo;
use Catmandu;
use namespace::clean;

with 'Datahub::Factory::Importer';

has endpoint   => (is => 'ro', required => 1);
has username   => (is => 'ro', required => 1);
has password   => (is => 'ro', required => 1);
has display    => (is => 'ro', default => 'teaser');

sub _build_importer {
    my $self = shift;
    my $ca = Catmandu->store('CA',
        username   => $self->username,
        password   => $self->password,
        url        => $self->endpoint,
        display    => $self->display
    );
    return $ca->bag;
}

1;
__END__

=encoding utf-8

=head1 NAME

Datahub::Factory::Importer::CollectiveAccess - Import data from a L<CollectiveAccess|http://collectiveaccess.org/> instance

=head1 SYNOPSIS

    use Datahub::Factory;
    use Data::Dumper qw(Dumper);

    my $ca = Datahub::Factory->importer('CollectiveAccess')->new(
        endpoint => 'https://my.ca.org/ca',
        username => 'datahub',
        password => 'datahub'
    );

    $ca->importer->each(sub {
        my $item = shift;
        print Dumper($item);
    });

=head1 DESCRIPTION

Datahub::Factory::Importer::CollectiveAccess uses L<Catmandu|http://librecat.org/Catmandu/> to fetch a list of records
from a L<CollectiveAccess|http://collectiveaccess.org/> instance. It returns an L<Importer|Catmandu::Importer>.

=head1 PARAMETERS

=over

=item C<endpoint>

URL of the CA instance (e.g. I<http://demo.collectiveaccess.org>).

=item C<username>

Name of a user that can be used to query the API.

=item C<password>

Password for the user.

lib/Datahub/Factory/Importer/OAI.pm

use Datahub::Factory::Sane;

our $VERSION = '1.77';

use Moo;
use Catmandu::Importer::OAI;
use namespace::clean;

with 'Datahub::Factory::Importer';

has endpoint        => (is => 'ro', required => 1);
has metadata_prefix => (is => 'ro', default => sub {
    return 'oai_lido';
});
has handler         => (is => 'ro');
has set             => (is => 'ro');
has from            => (is => 'ro');
has until           => (is => 'ro');
has username        => (is => 'ro');
has password        => (is => 'ro');


sub _build_importer {
    my $self = shift;
    my $importer = Catmandu::Importer::OAI->new(
        url            => $self->endpoint,
        handler        => $self->handler,
        metadataPrefix => $self->metadata_prefix,
        from           => $self->from,
        until          => $self->until,
        set            => $self->set,
        username       => $self->username,
        password       => $self->password,
    );
    return $importer;
}

1;
__END__

=encoding utf-8

=head1 NAME

Datahub::Factory::Importer::OAI - Import data from an L<OAI-PMH|https://www.openarchives.org/pmh/> endpoint

=head1 SYNOPSIS

    use Datahub::Factory;
    use Data::Dumper qw(Dumper);

    my $oai = Datahub::Factory->importer('OAI')->new(
        endpoint        => 'https://biblio.ugent.be/oai',
        metadata_prefix => 'oai_dc',
        set             => '2011'
    );

    $oai->importer->each(sub {
        my $item = shift;
        print Dumper($item);
    });

=head1 DESCRIPTION

Datahub::Factory::Importer::OAI imports data from an OAI-PMH endpoint. By default it uses the C<ListRecords>
verb to return all records using the I<oai_lido> format. It is possible to only return records from a single
I<Set> or those created, modified or deleted between two dates (I<from> and I<until>).

It automatically deals with I<resumptionTokens>, so client code does not have to implement paging.
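
To make the paging behaviour concrete, here is a minimal core-Perl sketch of a harvest loop that follows resumption tokens. The C<%pages> data, C<list_records> and C<harvest_all> are invented stand-ins for real HTTP requests, and the sketch does not reproduce how L<Catmandu::Importer::OAI> implements paging internally.

```perl
use strict;
use warnings;

# Mock endpoint: each page holds records and an optional
# resumptionToken that points at the next page (illustrative data).
my %pages = (
    ''    => { records => ['rec1', 'rec2'], token => 'page2' },
    page2 => { records => ['rec3', 'rec4'], token => 'page3' },
    page3 => { records => ['rec5'],         token => undef },
);

# Hypothetical stand-in for a single ListRecords request.
sub list_records {
    my ($token) = @_;
    return $pages{ defined $token ? $token : '' };
}

# Request pages until a response carries no resumptionToken.
sub harvest_all {
    my @records;
    my $token;
    do {
        my $page = list_records($token);
        push @records, @{ $page->{records} };
        $token = $page->{token};
    } while (defined $token);
    return @records;
}

print join(', ', harvest_all()), "\n";
```

Client code that uses the importer only sees the combined stream of records, which is what the C<each> callback in the synopsis receives.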

=head1 PARAMETERS

Only the C<endpoint> parameter is required.

=over

=item C<endpoint>

URL of the OAI endpoint.

=item handler( sub {} | $object | 'NAME' | '+NAME' )

Handler to transform each record from XML DOM (L<XML::LibXML::Element>) into
a Perl hash.

Handlers can be provided as a function reference, an instance of a Perl
package that implements C<parse>, or a package NAME. A package name prepended
by C<+> is used as given; any other name is prefixed with
C<Catmandu::Importer::OAI::Parser>. E.g. C<foobar> will create a
C<Catmandu::Importer::OAI::Parser::foobar> instance.
By default the handler L<Catmandu::Importer::OAI::Parser::oai_dc> is used for
metadataPrefix C<oai_dc>, L<Catmandu::Importer::OAI::Parser::marcxml> for
C<marcxml>, L<Catmandu::Importer::OAI::Parser::mods> for
C<mods>, L<Catmandu::Importer::OAI::Parser::Lido> for
C<Lido> and L<Catmandu::Importer::OAI::Parser::struct> for other formats.
In addition, L<Catmandu::Importer::OAI::Parser::raw> returns the XML
as it is.

=item C<metadata_prefix>

Any metadata prefix the endpoint supports. Defaults to C<oai_lido>.

=item C<set>

Optionally, a set to get records from.

=item C<from>

Optionally, a lower-bound date: only records created, modified or deleted on
or after this date are returned.

=item C<until>

lib/Datahub/Factory/Introduction.pod


A simple example that pushes OAI data to a YAML output on STDOUT:

    [General]
    id_path = administrativeMetadata.recordWrap.recordID.0._

    [Importer]
    plugin = OAI

    [plugin_importer_OAI]
    endpoint = https://datahub.vlaamsekunstcollectie.be/oai
    handler = +Catmandu::Importer::OAI::Parser::lido
    metadata_prefix = oai_lido

    [Fixer]
    plugin = Fix

    [plugin_fixer_Fix]
    file_name = '/home/foobar/datahub.fix'

    [Exporter]

lib/Datahub/Factory/Introduction.pod


An example defining multiple fix transforms based on a context-dependent value:

    [General]
    id_path = 'administrativeMetadata.recordWrap.recordID.0._'

    [Importer]
    plugin = OAI

    [plugin_importer_OAI]
    # endpoint = 'http://collections.britishart.yale.edu/oaicatmuseum/OAIHandler'
    endpoint = https://datahub.vlaamsekunstcollectie.be/oai
    handler = +Catmandu::Importer::OAI::Parser::lido
    metadata_prefix = oai_lido

    [Fixer]
    plugin = Fix

    [plugin_fixer_Fix]
    condition_path = '_metadata.administrativeMetadata.0.recordWrap.recordSource.0.legalBodyName.0.appellationValue.0._'
    fixers = MSK, GRO
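
The C<condition_path> value above reads like a dotted path in which numeric segments index arrays and the remaining segments are hash keys. Under that assumption, a lookup could be sketched in core Perl as follows. C<get_path> is a hypothetical helper, and the record is made up and shortened for the example.

```perl
use strict;
use warnings;

# Walk a dotted path through nested hashes and arrays.
# Numeric segments index arrays; other segments are hash keys.
sub get_path {
    my ($data, $path) = @_;
    for my $segment (split /\./, $path) {
        if (ref $data eq 'ARRAY' && $segment =~ /^\d+$/) {
            $data = $data->[$segment];
        }
        elsif (ref $data eq 'HASH') {
            $data = $data->{$segment};
        }
        else {
            return;    # the path does not exist in this record
        }
    }
    return $data;
}

# Made-up record using a shortened version of the configured path.
my $record = {
    recordSource => [
        { legalBodyName => [ { appellationValue => [ { _ => 'MSK' } ] } ] },
    ],
};

my $value = get_path($record,
    'recordSource.0.legalBodyName.0.appellationValue.0._');
print "$value\n";
```

The value found at the path is what would then be compared with the condition of each configured fixer.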


