Parallel-Downloader
lib/Parallel/Downloader.pm
sub _default_log {
    my ( $self, $msg ) = @_;
    print "$msg->{msg}\n";
    return;
}
1;
__END__
=pod
=head1 NAME
Parallel::Downloader - simply download multiple files at once
=head1 VERSION
version 0.132071
=head1 SYNOPSIS
    use HTTP::Request::Common qw( GET POST );
    use Parallel::Downloader 'async_download';

    # simple example
    my @requests = map GET( "http://google.com" ), ( 1..15 );
    my @responses = async_download( requests => \@requests );

    # complex example
    my @complex_reqs = ( ( map POST( "http://google.com", [ type_id => $_ ] ), ( 1..60 ) ),
                         ( map POST( "http://yahoo.com", [ type_id => $_ ] ), ( 1..60 ) ) );

    my $downloader = Parallel::Downloader->new(
        requests => \@complex_reqs,
        workers => 50,
        conns_per_host => 12,
        aehttp_args => {
            timeout => 30,
            on_prepare => sub {
                print "download started ($AnyEvent::HTTP::ACTIVE / $AnyEvent::HTTP::MAX_PER_HOST)\n"
            }
        },
        debug => 1,
        logger => sub {
            my ( $downloader, $message ) = @_;
            print "downloader sez [$message->{type}]: $message->{msg}\n";
        },
    );

    my @complex_responses = $downloader->run;
=head1 DESCRIPTION
This is not a library to build a parallel downloader on top of. It is a
downloading client built on top of AnyEvent::HTTP.
Its goal is not to be better, faster, or smaller than anything else. Its goal is
to give the user a single function that takes a batch of HTTP requests and
returns their responses with as little fuss as possible and, most importantly,
without downloading them in sequence.
It handles the busywork of grouping requests by host and limiting the number of
simultaneous requests per host, separately from capping the overall number of
connections. This lets the user make full use of their own connection without
abusing remote hosts.
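As a rough illustration of the grouping idea described above, the following
core-Perl sketch buckets URLs by host and works out which of them could start
at once under a per-host limit and an overall cap. It is not the module's
actual implementation; the helper names C<group_by_host> and C<initial_batch>
and the C<*.example> URLs are hypothetical, and the limits 4 and 10 are simply
the documented defaults for C<conns_per_host> and C<workers>.

```perl
use strict;
use warnings;

# Hypothetical helper: bucket URLs by their host part.
sub group_by_host {
    my @urls = @_;
    my %by_host;
    for my $url ( @urls ) {
        # Naive host extraction, good enough for a sketch.
        my ( $host ) = $url =~ m{^\w+://([^/:]+)}
          or next;
        push @{ $by_host{$host} }, $url;
    }
    return \%by_host;
}

# Hypothetical helper: pick the URLs that may start immediately, taking at
# most $per_host from each host and at most $workers overall.
sub initial_batch {
    my ( $by_host, $per_host, $workers ) = @_;
    my @batch;
    for my $host ( sort keys %{$by_host} ) {
        my @queue = @{ $by_host->{$host} };
        my $take  = scalar( @queue ) < $per_host ? scalar( @queue ) : $per_host;
        push @batch, @queue[ 0 .. $take - 1 ];
    }
    # Enforce the overall cap after the per-host limit has been applied.
    splice @batch, $workers if @batch > $workers;
    return @batch;
}

my @urls = (
    ( map { "http://a.example/$_" } 1 .. 6 ),
    ( map { "http://b.example/$_" } 1 .. 2 ),
);
my $by_host = group_by_host( @urls );
my @batch   = initial_batch( $by_host, 4, 10 );
printf "%d of %d requests can start immediately\n", scalar @batch, scalar @urls;
```

With these inputs, four requests for the first host and both requests for the
second host are selected; the remaining two wait until a slot for their host
frees up.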
Of course, there are facilities to customize the exact limits employed and to
add logging and such; but C<async_download> is the primary piece of the API and
should be enough for most uses.
=head1 FUNCTIONS
=head2 async_download
Can be exported on request. It instantiates a Parallel::Downloader object with
the given parameters, runs it, and returns the results. Its parameters are as
follows:
=head3 requests (required)
Reference to an array of HTTP::Request objects, all of which will be downloaded.
=head3 aehttp_args
A reference to a hash containing arguments that will be passed to
AnyEvent::HTTP::http_request.
Default is an empty hashref.
=head3 conns_per_host
Sets the number of connections allowed per host by changing the corresponding
AnyEvent::HTTP package variable.
Default is '4'.
=head3 debug
A boolean that determines whether logging operations are a NOP or actually run.
Set to any true value to activate the logging.
Default is '0'.
=head3 logger
A reference to a sub that will receive the downloader object and a hash
reference containing logging information. Whether that sub then prints it to
the screen, writes it to a database, or sends it to some other target is up to
the user.
Default is a sub that prints to the screen.
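A minimal sketch of a custom logger that collects messages in memory instead of
printing them. It assumes the calling convention shown in the SYNOPSIS, where
the sub receives the downloader and a hash reference with C<type> and C<msg>
keys. The variable names and the message type C<example> are made up for the
demonstration, which calls the sub by hand so that it runs without any network
access.

```perl
use strict;
use warnings;

my @log_lines;

# Same calling convention as the logger in the SYNOPSIS: the downloader
# object first, then a hash reference with 'type' and 'msg' keys.
my $collecting_logger = sub {
    my ( $downloader, $message ) = @_;
    push @log_lines, sprintf( "[%s] %s", $message->{type}, $message->{msg} );
    return;
};

# Stand-alone demonstration with a hand-made message. The type 'example' is
# an assumption for this sketch, not a value the module is known to emit.
$collecting_logger->( undef, { type => 'example', msg => 'hello' } );
print "$_\n" for @log_lines;
```

Such a sub would be passed as C<< logger => $collecting_logger >>, together
with a true C<debug> value so that logging is actually performed.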
=head3 workers
The number of workers to be used for downloading. Useful for controlling the
total number of connections your machine will try to establish.
Default is '10'.
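The options documented so far can be combined in one call. The following is
only a sketch: the values are arbitrary, the C<example.com> URLs are
placeholders, and the environment variable C<RUN_DOWNLOAD_EXAMPLE> is a
hypothetical switch added here so that nothing touches the network unless
explicitly requested.

```perl
use strict;
use warnings;

# Documented options, set explicitly instead of relying on the defaults.
my %options = (
    workers        => 20,                   # overall cap on parallel downloads
    conns_per_host => 2,                    # per-host connection limit
    debug          => 1,                    # turn logging on
    aehttp_args    => { timeout => 30 },    # handed to AnyEvent::HTTP::http_request
);

# RUN_DOWNLOAD_EXAMPLE is a made-up switch for this sketch; without it the
# script only builds the option hash.
if ( $ENV{RUN_DOWNLOAD_EXAMPLE} ) {
    require HTTP::Request::Common;
    require Parallel::Downloader;

    my @requests = map { HTTP::Request::Common::GET( "http://example.com/?page=$_" ) } 1 .. 5;
    my @responses = Parallel::Downloader::async_download( requests => \@requests, %options );
    printf "received %d responses\n", scalar @responses;
}
```

Keeping C<conns_per_host> well below C<workers> only pays off when the requests
are spread over several hosts; with a single host, the per-host limit is the
effective ceiling.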
=head3 build_response
A reference to a sub that will be called on completion of a request to build the
response variable that will be returned for this request. It receives as