AC-MrGamoo


lib/AC/MrGamoo/FileList.pm

# -*- perl -*-

# Copyright (c) 2010 AdCopy
# Author: Jeff Weisberg
# Created: 2010-Jan-14 17:04 (EST)
# Function: get list of files to map
#
# $Id: FileList.pm,v 1.1 2010/11/01 18:41:42 jaw Exp $

package AC::MrGamoo::FileList;
use AC::MrGamoo::Customize;
use AC::Import;
use strict;

our @ISA    = 'AC::MrGamoo::Customize';
our @EXPORT = qw(get_file_list);
our @CUSTOM = @EXPORT;

1;

=head1 NAME

AC::MrGamoo::FileList - get list of files

=head1 SYNOPSIS

    emacs /myperldir/Local/MrGamoo/FileList.pm
    copy. paste. edit.

    use lib '/myperldir';
    my $m = AC::MrGamoo::D->new(
        class_filelist    => 'Local::MrGamoo::FileList',
    );

=head1 IMPORTANT

You can fire up the system, and get the servers talking to each other, and
perform some limited tests without this file.

But you must provide this file in order to actually run map/reduce jobs.

=head1 DESCRIPTION

MrGamoo only runs map/reduce jobs.
It is up to you to get the files onto the servers,
keep track of where they are, and tell MrGamoo.

Some people keep the file meta-information in a sql database.
Some people keep the file meta-information in a yenta map.
Some people keep the file meta-information in the filesystem.

When a new job starts, your C<get_file_list> function will be
called with the job config, and should return an arrayref
of matching files along with meta-info.

Each element of the returned arrayref should be a hashref
containing at least the following fields:

=head2 filename

the name of the file, relative to the C<basedir> in your config file.

    filename    => 'www/2010/01/17/23/5943_prod_5x2N5qyerdeddsNi'

=head2 location

an arrayref of servers where this file is located. the locations
should be the persistent-ids of the servers (see MySelf).

if the same file is replicated on multiple servers, mrgamoo can
intelligently determine which servers will process which files,
and can also recover from failures.

    location	=> [ 'mrm@athena.example.com', 'mrm@zeus.example.com' ]

=head2 size

this should be the size of the file, in bytes. mrgamoo will consider
the sizes of files in determining which servers will process which files.

    size	=> 10843
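
=head1 EXAMPLE

The following is a minimal sketch, not the one true implementation. It
takes the "meta-information in the filesystem" approach: it walks a
directory tree and returns one hashref per file, with the three fields
described above. C<$BASEDIR> and C<@SERVERS> are hypothetical names
invented for this example; they are not part of MrGamoo. The sketch also
assumes that C<get_file_list> is called as a plain function with the job
config as its first argument, and it ignores that config. A real
implementation would use the job config to select only the matching
files.

    package Local::MrGamoo::FileList;
    use strict;
    use warnings;
    use File::Find ();
    use File::Spec ();

    # hypothetical settings (not part of MrGamoo), adjust for your site.
    # $BASEDIR should agree with the basedir in your config file.
    our $BASEDIR = '/data/mrgamoo';
    our @SERVERS = ('mrm@athena.example.com');

    sub get_file_list {
        my $config = shift;     # job config, unused in this sketch

        my @files;
        File::Find::find({
            no_chdir => 1,      # $_ holds the full path of each entry
            wanted   => sub {
                my @st = stat($_) or return;
                return unless -f _;     # plain files only
                push @files, {
                    filename => File::Spec->abs2rel($_, $BASEDIR),
                    location => [ @SERVERS ],
                    size     => $st[7], # size in bytes
                };
            },
        }, $BASEDIR);

        return \@files;
    }

    1;

This sketch lists every file under C<$BASEDIR> and claims that each one
lives on every server in C<@SERVERS>. If your files are spread across
servers, you will need to look up the real locations, for example from a
database, and fill in C<location> per file.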

=head1 BUGS

none. you write this yourself.

=head1 SEE ALSO

    AC::MrGamoo

=head1 AUTHOR

    You!

=cut


