Bio-Das

 view release on metacpan or  search on metacpan

Das/HTTP/Fetch.pm  view on Meta::CPAN

=item $path = $fetcher->path

Return the path part of the HTTP request.

=item $request = $fetcher->request

Return the L<Bio::Das::Request> object that the fetcher will attempt
to satisfy.

=item $args = $fetcher->args

Returns a hashref containing the CGI arguments to be passed to the
HTTP server.  This is simply delegated to the request's args() method.


=item $url = $fetcher->url

Returns the URL for the HTTP request. This is simply delegated to the
request's url() method.

=item $headers = $fetcher->outgoing_headers

Returns a hashref containing the HTTP headers that will be sent in the
request.

=item $host = $fetcher->host

Returns the host to which the fetcher will connect.  Note that this is
B<not> necessarily the same host as the DAS server, as this method
will return the name of the B<proxy> if an HTTP proxy has been
specified.  To get the DAS server hostname, call
$fetcher->request->host.

=item $credentials = $fetcher->auth

Return the authentication credentials as a base64-encoded string.

=item $header = $fetcher->incoming_header

Retrieve the incoming HTTP header.  Depending on the state of the
connection, the header may be empty or incomplete.

=cut

sub socket           { shift->{socket}           }
sub path             { shift->{path}             }
sub request          { shift->{request}          }
sub outgoing_args    { shift->request->args      }
sub url              { shift->request->url       }
sub outgoing_headers { shift->{outgoing_headers} }
sub host             { shift->{host}             }  # mostly for debugging purposes
sub auth             { shift->{auth}             }
sub incoming_header  { shift->{incoming_header}  }  # buffer for header data


=item $mode = $fetcher->mode([$new_mode])

This misnamed method gets or sets the protocol, which is one of 'http'
for regular cleartext transactions or 'https' for transactions using
the encrypting SSL/TLS protocol.  Note that you must have
IO::Socket::SSL and its associated libraries in order to use SSL/TLS.

=cut

sub mode {
  my $self = shift;
  my $d    = $self->{mode};
  $self->{mode} = shift if @_;
  $d;
}

=item $mode = $fetcher->mode([$new_mode])

This misnamed method gets or sets the protocol, which is one of 'http'
for regular cleartext transactions or 'https' for transactions using
the encrypting SSL/TLS protocol.  Note that you must have
IO::Socket::SSL and its associated libraries in order to use SSL/TLS.

=cut

sub method   {
  my $self = shift;
  my $meth = uc $self->request->method;
  return 'GET' unless $meth;
  if ($meth eq 'AUTO') {
    return $self->outgoing_args ? 'POST' : 'GET';
  }
  return $meth;
}

=item $status = $fetcher->status([$new_status])

This method is used to interrogate or change the status of the
transaction. The status keeps track of what has been done so far, and
is one of:

  waiting          # request not yet sent
  reading header   # request sent, waiting for HTTP header
  reading body     # HTTP header received, waiting for HTTP body
  parsing body     # HTTP body partially received, parsing it
  0                # transaction finished normally, EOF.

=cut

sub status   {
  my $self = shift;
  my $d    = $self->{status};
  if (@_) {
      $self->{status} = shift;
      warn "STATUS $self->{status}" if $self->debug;
  }
  $d;
}

=item $debug = $fetcher->debug([$new_debug])

Get or set the debug flag, which enables verbose diagnostic messages.

=cut

sub debug {
  my $self = shift;
  my $d    = $self->{debug};
  $self->{debug} = shift if @_;
  $d;
}

=item ($protocol,$host,$port,$path,$user,$pass) = Bio::Das::HTTP::Fetch->parse_url($url,$norfcwarn)

This method is invoked as a class method (as
Bio::Das::HTTP::Fetch->parse_url) to parse a URL into its
components. The $norfcwarn flag inhibits a warning about the unsafe
nature of embedding username/password information in the URL of
unencrypted transactions.

=cut

# very basic URL-parsing sub
sub parse_url {
  my $self = shift;
  my ($url,$norfcwarn)  = @_;

  my ($ssl,$hostent,$path) = $url =~ m!^http(s?)://([^/]+)(/?[^\#]*)! or return;
  $path ||= '/';

  my ($user,$pass); 
  ($user, $hostent) = $hostent =~ /^(.*@)?(.*)/;
  ($user, $pass) = split(':',substr($user,0,length($user)-1)) if $user;
  if ($pass && !$ssl && !$norfcwarn) {
    warn "Using password in unencrypted URI against RFC #2396 recommendation";
  }

  my ($host,$port) = split(':',$hostent);
  my ($mode,$defport);
  if ($ssl) {
    $mode='https';
    $defport=443;
  } else {
    $mode='http';
    $defport=80;
  }
  return ($mode,$host,$port||$defport,$path,$user,$pass);
}

=item $socket = Bio::Das::HTTP::Fetch->connect($protocol,$host,$port)

This method is used to make a nonblocking connection to the indicated
host and port.  $protocol is one of 'http' or 'https'.  The resulting
IO::Socket will be returned in case of success.  Undef will be
returned in case of other errors.

=cut

# this is called to connect to remote host
sub connect {
  my $pack = shift;
  my ($mode,$host,$port) = @_;
  my $sock;
  if ($mode eq 'https') {
    load_ssl();
    $sock = IO::Socket::SSL->new(Proto => 'tcp',
				 Type => SOCK_STREAM,
				 SSL_use_cert => 0,
				 SSL_verify_mode => 0x00)
  } else {
    $sock = IO::Socket::INET->new(Proto => 'tcp',
				  Type  => SOCK_STREAM)
  }

  return unless $sock;
  $sock->blocking(0);
  my $host_ip = inet_aton($host) or return $pack->error("410 Unknown host $host");
  my $addr = sockaddr_in($port,$host_ip);
  my $result = $sock->IO::Socket::INET::connect($addr);  # don't allow SSL to do its handshake yet!
  return $sock if $result;  # return the socket if connected immediately
  return $sock if $! == EINPROGRESS;  # or if it's in progress
  return;                             # return undef on other errors
}

=item $status = $fetcher->send_request()

This method sends the HTTP request and returns the resulting status.
Because of the vagaries of nonblocking IO, the complete request can be
sent in one shot, in which case the returned status will be "reading
header", or only a partial request might have been written, in which
case the returned status will be "waiting."  In the latter case,
send_request() should be called again until the complete request has
been submitted.

If a communications error occurs, send_request() will return undef, in
which case it should not be called again.

=cut

# this is called to send the HTTP request
sub send_request {
  my $self = shift;
  warn "$self->send_request()" if $self->debug;

  die "not in right state, expected state 'waiting' but got '",$self->status,"'"
    unless $self->status eq 'waiting';

  unless ($self->{socket}->connected) {
    $! = $self->{socket}->sockopt(SO_ERROR);
    return $self->error("411 Couldn't connect: $!") ;
  }

  # if we're in https mode, then we need to complete the
  # SSL handshake at this point
  if ($self->mode eq 'https') {
    $self->complete_ssl_handshake($self->{socket}) || return $self->error("412 SSL error ".$self->{socket}->error);
  }

  $self->{formatted_request} ||= $self->format_request();

  warn "SENDING $self->{formatted_request}" if $self->debug;

  # Send the header and request.  Note that we have to respect
  # both IO::Socket EWOULDBLOCK errors as well as the dodgy
  # IO::Socket::SSL "SSL wants a write" error.
  my $bytes = syswrite($self->{socket},$self->{formatted_request});
  if (!$bytes) {
    return $self->status if $! == EWOULDBLOCK;  # still trying
    return $self->status if $self->{socket}->errstr =~ /SSL wants a write/;
    return $self->error("412 Communications error: $!");
  }
  if ($bytes >= length $self->{formatted_request}) {
    $self->status('reading header');
  } else {
    substr($self->{formatted_request},0,$bytes) = '';  # truncate and try again
  }
  $self->status;
}

=item $status = $fetcher->read()

This method is called when the fetcher is in one of the read states
(reading header, reading body or parsing body).  If successful, it
returns the new status.  If unsuccessful, it returns undef.

On the end of the transaction read() will return numeric 0.

=cut

# this is called when the socket is ready to be read
sub read {
  my $self = shift;
  my $stat = $self->status;
  return $self->read_header if $stat eq 'reading header';
  return $self->read_body   if $stat eq 'reading body'
                            or $stat eq 'parsing body';
}

# read the header through to the $CRLF$CRLF (blank line)
# return a true value for 200 OK
sub read_header {
  my $self = shift;

  my $bytes = sysread($self->{socket},$self->{header},READ_UNIT,length ($self->{header}||''));
  if (!defined $bytes) {
    return $self->status if $! == EWOULDBLOCK;
    return $self->status if $self->{socket}->errstr =~ /SSL wants a read/;
  }
  return $self->error("412 Communications error") unless $bytes > 0;

  # have we found the CRLF yet?
  my $i = rindex($self->{header},"$CRLF$CRLF");
  return $self->status unless $i >= 0;  # no, so keep waiting

  # found the header
  # If we have stuff after the header, then process it
  my $header     = substr($self->{header},0,$i);
  my $extra_data = substr($self->{header},$i+4);

  my ($status_line,@other_lines) = split $CRLF,$header;
  my ($stat_code,$stat_msg) = $status_line =~ m!^HTTP/1\.[01] (\d+) (.+)!;

  # If unauthorized, capture the realm for the authentication 
  if($stat_code == 401){
    # Can't use do_headers, Request will barf on lack of X-Das version

Das/HTTP/Fetch.pm  view on Meta::CPAN


  if (my $request = $self->request) {
    $request->headers(\%headers) || return $self->error($request->error);
  }
  1;
}

=item $result = $fetcher->do_body($body_data)

This method handles the parsing of the DAS document data by sending it
to the Bio::Das::Request object.  It returns a true result if parsing
was successful, or false otherwise.

=cut
# this is called to read the body of the message and act on it
sub do_body {
  my $self = shift;
  my $data = shift;

  my $request = $self->request or return;
  if ($self->status eq 'reading body') { # transition
    $request->start_body or return;
    $self->status('parsing body');
  }

  warn "parsing()...." if $self->debug;
  return $request->body($data);
}

=item $error = $fetcher->error([$new_error])

When called without arguments, error() returns the last error message
generated by the module.  When called with arguments, error() sets the
error message and returns undef.

=cut

# warn in case of error and return undef
sub error {
  my $self = shift;
  if (@_) {
    unless (ref $self) {
      $ERROR = "@_";
      return;
    }
    warn "$self->{url}: ",@_ if $self->debug;
    $self->{error} = "@_";
    return;
  } else {
    return ref($self) ? $self->{error} : $ERROR;
  }
}

=item $fetcher->load_ssl

This method performs initialization needed to use SSL/TLS transactions.

=cut

sub load_ssl {
  eval 'require IO::Socket::SSL' or croak "Must have IO::Socket::SSL installed to use https: urls: $@";

  # cheating a bit -- IO::Socket::SSL doesn't have this function, and needs to!
  eval <<'END' unless defined &IO::Socket::SSL::pending;
sub IO::Socket::SSL::pending {
  my $self = shift;
  my $ssl  = ${*$self}{'_SSL_object'};
  return Net::SSLeay::pending($ssl); # *
}
END

}

=item $fetcher->complete_ssl_handshake($sock)

This method is called to complete the SSL handshake, which must be
performed in blocking mode.  After completing the connection, the
socket is set back to nonblocking.

=cut

sub complete_ssl_handshake {
  my $self = shift;
  my $sock = shift;
  $sock->blocking(1);  # handshake requires nonblocking i/o
  my $result = $sock->connect_SSL($sock);
  $sock->blocking(0);
}

# necessary to define these methods so that IO::Socket::INET objects will act like
# IO::Socket::SSL objects.
sub IO::Socket::INET::pending { 0     }
sub IO::Socket::INET::errstr  { undef }


=head1 AUTHOR

Lincoln Stein <lstein@cshl.org>.

Copyright (c) 2001 Cold Spring Harbor Laboratory

This library is free software; you can redistribute it and/or modify
it under the same terms as Perl itself.  See DISCLAIMER.txt for
disclaimers of warranty.

=head1 SEE ALSO

L<Bio::Das::Request>, L<Bio::Das::HTTP::Fetch>,
L<Bio::Das::Segment>, L<Bio::Das::Type>, L<Bio::Das::Stylesheet>,
L<Bio::Das::Source>, L<Bio::RangeI>

=cut

1;



( run in 1.224 second using v1.01-cache-2.11-cpan-39bf76dae61 )