Alien-ROOT

 view release on metacpan or  search on metacpan

inc/inc_File-Fetch/File/Fetch.pm  view on Meta::CPAN

### Note that the rules of what a file:// url means vary by the operating system
### of the host being addressed. Thus file:///d|/foo/bar.txt means the obvious
### 'D:\foo\bar.txt' on windows, but on unix it means '/d|/foo/bar.txt' and
### not '/foo/bar.txt'
###
### Similarly if the host interpreting the url is VMS then
### file:///disk$user/my/notes/note12345.txt' means
### 'DISK$USER:[MY.NOTES]NOTE123456.TXT' but will be returned the same as
### if it is unix where it means /disk$user/my/notes/note12345.txt'.
### Except for some cases in the File::Spec methods, Perl on VMS will generally
### handle UNIX format file specifications.
###
### This means it is impossible to serve certain file:// urls on certain systems.
###
### Thus are the problems with a protocol-less specification. :-(
###

sub _parse_uri {
    my $self = shift;
    my $uri  = shift or return;

    my $href = { uri => $uri };

    ### find the scheme ###
    $uri            =~ s|^(\w+)://||;
    $href->{scheme} = $1;

    ### See rfc 1738 section 3.10
    ### http://www.faqs.org/rfcs/rfc1738.html
    ### And wikipedia for more on windows file:// urls
    ### http://en.wikipedia.org/wiki/File://
    if( $href->{scheme} eq 'file' ) {

        my @parts = split '/',$uri;

        ### file://hostname/...
        ### file://hostname/...
        ### normalize file://localhost with file:///
        $href->{host} = $parts[0] || '';

        ### index in @parts where the path components begin;
        my $index = 1;

        ### file:////hostname/sharename/blah.txt
        if ( HAS_SHARE and not length $parts[0] and not length $parts[1] ) {

            $href->{host}   = $parts[2] || '';  # avoid warnings
            $href->{share}  = $parts[3] || '';  # avoid warnings

            $index          = 4         # index after the share

        ### file:///D|/blah.txt
        ### file:///D:/blah.txt
        } elsif (HAS_VOL) {

            ### this code comes from dmq's patch, but:
            ### XXX if volume is empty, wouldn't that be an error? --kane
            ### if so, our file://localhost test needs to be fixed as wel
            $href->{vol}    = $parts[1] || '';

            ### correct D| style colume descriptors
            $href->{vol}    =~ s/\A([A-Z])\|\z/$1:/i if ON_WIN;

            $index          = 2;        # index after the volume
        }

        ### rebuild the path from the leftover parts;
        $href->{path} = join '/', '', splice( @parts, $index, $#parts );

    } else {
        ### using anything but qw() in hash slices may produce warnings
        ### in older perls :-(
        @{$href}{ qw(host path) } = $uri =~ m|([^/]*)(/.*)$|s;
    }

    ### split the path into file + dir ###
    {   my @parts = File::Spec::Unix->splitpath( delete $href->{path} );
        $href->{path} = $parts[1];
        $href->{file} = $parts[2];
    }

    ### host will be empty if the target was 'localhost' and the
    ### scheme was 'file'
    $href->{host} = '' if   ($href->{host}      eq 'localhost') and
                            ($href->{scheme}    eq 'file');

    return $href;
}

=head2 $where = $ff->fetch( [to => /my/output/dir/ | \$scalar] )

Fetches the file you requested and returns the full path to the file.

By default it writes to C<cwd()>, but you can override that by specifying
the C<to> argument:

    ### file fetch to /tmp, full path to the file in $where
    $where = $ff->fetch( to => '/tmp' );

    ### file slurped into $scalar, full path to the file in $where
    ### file is downloaded to a temp directory and cleaned up at exit time
    $where = $ff->fetch( to => \$scalar );

Returns the full path to the downloaded file on success, and false
on failure.

=cut

sub fetch {
    my $self = shift or return;
    my %hash = @_;

    my $target;
    my $tmpl = {
        to  => { default => cwd(), store => \$target },
    };

    check( $tmpl, \%hash ) or return;

    my ($to, $fh);
    ### you want us to slurp the contents
    if( ref $target and UNIVERSAL::isa( $target, 'SCALAR' ) ) {
        $to = tempdir( 'FileFetch.XXXXXX', CLEANUP => 1 );

    ### plain old fetch
    } else {
        $to = $target;

        ### On VMS force to VMS format so File::Spec will work.
        $to = VMS::Filespec::vmspath($to) if ON_VMS;

        ### create the path if it doesn't exist yet ###
        unless( -d $to ) {
            eval { mkpath( $to ) };

            return $self->_error(loc("Could not create path '%1'",$to)) if $@;
        }
    }

    ### set passive ftp if required ###
    local $ENV{FTP_PASSIVE} = $FTP_PASSIVE;

    ### we dont use catfile on win32 because if we are using a cygwin tool
    ### under cmd.exe they wont understand windows style separators.
    my $out_to = ON_WIN ? $to.'/'.$self->output_file
                        : File::Spec->catfile( $to, $self->output_file );

    for my $method ( @{ $METHODS->{$self->scheme} } ) {
        my $sub =  '_'.$method.'_fetch';

        unless( __PACKAGE__->can($sub) ) {
            $self->_error(loc("Cannot call method for '%1' -- WEIRD!",
                        $method));
            next;
        }

        ### method is blacklisted ###
        next if grep { lc $_ eq $method } @$BLACKLIST;

        ### method is known to fail ###
        next if $METHOD_FAIL->{$method};

        ### there's serious issues with IPC::Run and quoting of command
        ### line arguments. using quotes in the wrong place breaks things,
        ### and in the case of say,
        ### C:\cygwin\bin\wget.EXE --quiet --passive-ftp --output-document
        ### "index.html" "http://www.cpan.org/index.html?q=1&y=2"
        ### it doesn't matter how you quote, it always fails.
        local $IPC::Cmd::USE_IPC_RUN = 0;

        if( my $file = $self->$sub(
                        to => $out_to
        )){

            unless( -e $file && -s _ ) {
                $self->_error(loc("'%1' said it fetched '%2', ".
                     "but it was not created",$method,$file));

                ### mark the failure ###
                $METHOD_FAIL->{$method} = 1;

                next;

            } else {

                ### slurp mode?
                if( ref $target and UNIVERSAL::isa( $target, 'SCALAR' ) ) {

                    ### open the file
                    open my $fh, "<$file" or do {
                        $self->_error(
                            loc("Could not open '%1': %2", $file, $!));
                        return;
                    };

                    ### slurp
                    $$target = do { local $/; <$fh> };

                }

                my $abs = File::Spec->rel2abs( $file );
                return $abs;

            }

inc/inc_File-Fetch/File/Fetch.pm  view on Meta::CPAN

example, to use an ftp proxy:

    $ENV{ftp_proxy} = 'foo.com';

Refer to the LWP::UserAgent manpage for more details.

=head2 I used 'lynx' to fetch a file, but its contents is all wrong!

C<lynx> can only fetch remote files by dumping its contents to C<STDOUT>,
which we in turn capture. If that content is a 'custom' error file
(like, say, a C<404 handler>), you will get that contents instead.

Sadly, C<lynx> doesn't support any options to return a different exit
code on non-C<200 OK> status, giving us no way to tell the difference
between a 'successful' fetch and a custom error page.

Therefor, we recommend to only use C<lynx> as a last resort. This is
why it is at the back of our list of methods to try as well.

=head2 Files I'm trying to fetch have reserved characters or non-ASCII characters in them. What do I do?

C<File::Fetch> is relatively smart about things. When trying to write
a file to disk, it removes the C<query parameters> (see the
C<output_file> method for details) from the file name before creating
it. In most cases this suffices.

If you have any other characters you need to escape, please install
the C<URI::Escape> module from CPAN, and pre-encode your URI before
passing it to C<File::Fetch>. You can read about the details of URIs
and URI encoding here:

  http://www.faqs.org/rfcs/rfc2396.html

=head1 TODO

=over 4

=item Implement $PREFER_BIN

To indicate to rather use commandline tools than modules

=back

=head1 BUG REPORTS

Please report bugs or other issues to E<lt>bug-file-fetch@rt.cpan.org<gt>.

=head1 AUTHOR

This module by Jos Boumans E<lt>kane@cpan.orgE<gt>.

=head1 COPYRIGHT

This library is free software; you may redistribute and/or modify it
under the same terms as Perl itself.


=cut

# Local variables:
# c-indentation-style: bsd
# c-basic-offset: 4
# indent-tabs-mode: nil
# End:
# vim: expandtab shiftwidth=4:






( run in 2.159 seconds using v1.01-cache-2.11-cpan-ed4147ee29a )