Alien-ROOT
view release on metacpan or search on metacpan
inc/inc_File-Fetch/File/Fetch.pm view on Meta::CPAN
### Note that the rules of what a file:// url means vary by the operating system
### of the host being addressed. Thus file:///d|/foo/bar.txt means the obvious
### 'D:\foo\bar.txt' on windows, but on unix it means '/d|/foo/bar.txt' and
### not '/foo/bar.txt'
###
### Similarly if the host interpreting the url is VMS then
### file:///disk$user/my/notes/note12345.txt' means
### 'DISK$USER:[MY.NOTES]NOTE123456.TXT' but will be returned the same as
### if it is unix where it means /disk$user/my/notes/note12345.txt'.
### Except for some cases in the File::Spec methods, Perl on VMS will generally
### handle UNIX format file specifications.
###
### This means it is impossible to serve certain file:// urls on certain systems.
###
### Thus are the problems with a protocol-less specification. :-(
###
sub _parse_uri {
my $self = shift;
my $uri = shift or return;
my $href = { uri => $uri };
### find the scheme ###
$uri =~ s|^(\w+)://||;
$href->{scheme} = $1;
### See rfc 1738 section 3.10
### http://www.faqs.org/rfcs/rfc1738.html
### And wikipedia for more on windows file:// urls
### http://en.wikipedia.org/wiki/File://
if( $href->{scheme} eq 'file' ) {
my @parts = split '/',$uri;
### file://hostname/...
### file://hostname/...
### normalize file://localhost with file:///
$href->{host} = $parts[0] || '';
### index in @parts where the path components begin;
my $index = 1;
### file:////hostname/sharename/blah.txt
if ( HAS_SHARE and not length $parts[0] and not length $parts[1] ) {
$href->{host} = $parts[2] || ''; # avoid warnings
$href->{share} = $parts[3] || ''; # avoid warnings
$index = 4 # index after the share
### file:///D|/blah.txt
### file:///D:/blah.txt
} elsif (HAS_VOL) {
### this code comes from dmq's patch, but:
### XXX if volume is empty, wouldn't that be an error? --kane
### if so, our file://localhost test needs to be fixed as wel
$href->{vol} = $parts[1] || '';
### correct D| style colume descriptors
$href->{vol} =~ s/\A([A-Z])\|\z/$1:/i if ON_WIN;
$index = 2; # index after the volume
}
### rebuild the path from the leftover parts;
$href->{path} = join '/', '', splice( @parts, $index, $#parts );
} else {
### using anything but qw() in hash slices may produce warnings
### in older perls :-(
@{$href}{ qw(host path) } = $uri =~ m|([^/]*)(/.*)$|s;
}
### split the path into file + dir ###
{ my @parts = File::Spec::Unix->splitpath( delete $href->{path} );
$href->{path} = $parts[1];
$href->{file} = $parts[2];
}
### host will be empty if the target was 'localhost' and the
### scheme was 'file'
$href->{host} = '' if ($href->{host} eq 'localhost') and
($href->{scheme} eq 'file');
return $href;
}
=head2 $where = $ff->fetch( [to => /my/output/dir/ | \$scalar] )
Fetches the file you requested and returns the full path to the file.
By default it writes to C<cwd()>, but you can override that by specifying
the C<to> argument:
### file fetch to /tmp, full path to the file in $where
$where = $ff->fetch( to => '/tmp' );
### file slurped into $scalar, full path to the file in $where
### file is downloaded to a temp directory and cleaned up at exit time
$where = $ff->fetch( to => \$scalar );
Returns the full path to the downloaded file on success, and false
on failure.
=cut
sub fetch {
my $self = shift or return;
my %hash = @_;
my $target;
my $tmpl = {
to => { default => cwd(), store => \$target },
};
check( $tmpl, \%hash ) or return;
my ($to, $fh);
### you want us to slurp the contents
if( ref $target and UNIVERSAL::isa( $target, 'SCALAR' ) ) {
$to = tempdir( 'FileFetch.XXXXXX', CLEANUP => 1 );
### plain old fetch
} else {
$to = $target;
### On VMS force to VMS format so File::Spec will work.
$to = VMS::Filespec::vmspath($to) if ON_VMS;
### create the path if it doesn't exist yet ###
unless( -d $to ) {
eval { mkpath( $to ) };
return $self->_error(loc("Could not create path '%1'",$to)) if $@;
}
}
### set passive ftp if required ###
local $ENV{FTP_PASSIVE} = $FTP_PASSIVE;
### we dont use catfile on win32 because if we are using a cygwin tool
### under cmd.exe they wont understand windows style separators.
my $out_to = ON_WIN ? $to.'/'.$self->output_file
: File::Spec->catfile( $to, $self->output_file );
for my $method ( @{ $METHODS->{$self->scheme} } ) {
my $sub = '_'.$method.'_fetch';
unless( __PACKAGE__->can($sub) ) {
$self->_error(loc("Cannot call method for '%1' -- WEIRD!",
$method));
next;
}
### method is blacklisted ###
next if grep { lc $_ eq $method } @$BLACKLIST;
### method is known to fail ###
next if $METHOD_FAIL->{$method};
### there's serious issues with IPC::Run and quoting of command
### line arguments. using quotes in the wrong place breaks things,
### and in the case of say,
### C:\cygwin\bin\wget.EXE --quiet --passive-ftp --output-document
### "index.html" "http://www.cpan.org/index.html?q=1&y=2"
### it doesn't matter how you quote, it always fails.
local $IPC::Cmd::USE_IPC_RUN = 0;
if( my $file = $self->$sub(
to => $out_to
)){
unless( -e $file && -s _ ) {
$self->_error(loc("'%1' said it fetched '%2', ".
"but it was not created",$method,$file));
### mark the failure ###
$METHOD_FAIL->{$method} = 1;
next;
} else {
### slurp mode?
if( ref $target and UNIVERSAL::isa( $target, 'SCALAR' ) ) {
### open the file
open my $fh, "<$file" or do {
$self->_error(
loc("Could not open '%1': %2", $file, $!));
return;
};
### slurp
$$target = do { local $/; <$fh> };
}
my $abs = File::Spec->rel2abs( $file );
return $abs;
}
inc/inc_File-Fetch/File/Fetch.pm view on Meta::CPAN
example, to use an ftp proxy:
$ENV{ftp_proxy} = 'foo.com';
Refer to the LWP::UserAgent manpage for more details.
=head2 I used 'lynx' to fetch a file, but its contents is all wrong!
C<lynx> can only fetch remote files by dumping its contents to C<STDOUT>,
which we in turn capture. If that content is a 'custom' error file
(like, say, a C<404 handler>), you will get that contents instead.
Sadly, C<lynx> doesn't support any options to return a different exit
code on non-C<200 OK> status, giving us no way to tell the difference
between a 'successful' fetch and a custom error page.
Therefor, we recommend to only use C<lynx> as a last resort. This is
why it is at the back of our list of methods to try as well.
=head2 Files I'm trying to fetch have reserved characters or non-ASCII characters in them. What do I do?
C<File::Fetch> is relatively smart about things. When trying to write
a file to disk, it removes the C<query parameters> (see the
C<output_file> method for details) from the file name before creating
it. In most cases this suffices.
If you have any other characters you need to escape, please install
the C<URI::Escape> module from CPAN, and pre-encode your URI before
passing it to C<File::Fetch>. You can read about the details of URIs
and URI encoding here:
http://www.faqs.org/rfcs/rfc2396.html
=head1 TODO
=over 4
=item Implement $PREFER_BIN
To indicate to rather use commandline tools than modules
=back
=head1 BUG REPORTS
Please report bugs or other issues to E<lt>bug-file-fetch@rt.cpan.org<gt>.
=head1 AUTHOR
This module by Jos Boumans E<lt>kane@cpan.orgE<gt>.
=head1 COPYRIGHT
This library is free software; you may redistribute and/or modify it
under the same terms as Perl itself.
=cut
# Local variables:
# c-indentation-style: bsd
# c-basic-offset: 4
# indent-tabs-mode: nil
# End:
# vim: expandtab shiftwidth=4:
( run in 2.159 seconds using v1.01-cache-2.11-cpan-ed4147ee29a )