BioPerl

Bio/DB/DBFetch.pm

Allows the dynamic retrieval of entries from databases using the
dbfetch script at EBI:
L<http:E<sol>E<sol>www.ebi.ac.ukE<sol>cgi-binE<sol>dbfetch>.

To keep server changes transparent to callers, the host type (currently
only ebi) and the location (defaults to ebi) are configured separately.
This allows servers in other geographical locations to be added later.
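
The separation of host type and location can be pictured as a two-level
lookup table. The following is only a minimal sketch under stated
assumptions: the C<%HOSTS> layout, the URL template, and the helper name
C<location_url_for> are hypothetical and chosen for illustration; they are
not taken from this module.

```perl
use strict;
use warnings;

# Hypothetical host table; the layout and the URL template are assumptions
# made for illustration, not copied from the module.
my %HOSTS = (
    dbfetch => {
        baseurl => 'http://%s/cgi-bin/dbfetch?db=embl&style=raw',
        hosts   => { ebi => 'www.ebi.ac.uk' },
    },
);

# Resolve a (server type, location) pair to a base URL, or die if unknown.
sub location_url_for {
    my ($servertype, $location) = @_;
    my $entry = $HOSTS{$servertype}
        or die "Unknown server type '$servertype'\n";
    my $host = $entry->{hosts}{$location}
        or die "Unknown location '$location' for '$servertype'\n";
    return sprintf $entry->{baseurl}, $host;
}

print location_url_for('dbfetch', 'ebi'), "\n";
```

With a table like this, supporting a mirror in another location would only
require a new entry under C<hosts>; calling code would stay unchanged.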

This is a superclass which is called by instantiable subclasses with
correct parameters.

=head1 FEEDBACK

=head2 Mailing Lists

User feedback is an integral part of the evolution of this and other
Bioperl modules. Send your comments and suggestions preferably to one
of the Bioperl mailing lists.  Your participation is much appreciated.

  bioperl-l@bioperl.org                  - General discussion
  http://bioperl.org/wiki/Mailing_lists  - About the mailing lists

=head2 Support 

Please direct usage questions or support issues to the mailing list:

I<bioperl-l@bioperl.org>

rather than to the module maintainer directly. Many experienced and
responsive experts will be able to look at the problem and quickly
address it. Please include a thorough description of the problem
with code and data examples if at all possible.

=head2 Reporting Bugs

Report bugs to the Bioperl bug tracking system to help us keep track
of the bugs and their resolution.  Bug reports can be submitted via the
web:

  https://github.com/bioperl/bioperl-live/issues

=head1 AUTHOR - Heikki Lehvaslaiho

Email Heikki Lehvaslaiho E<lt>heikki-at-bioperl-dot-orgE<gt>

=head1 APPENDIX

The rest of the documentation details each of the object
methods. Internal methods are usually prefixed with an underscore (_).

=cut

# Let the code begin...

package Bio::DB::DBFetch;
use strict;
use vars qw($MODVERSION $DEFAULTFORMAT $DEFAULTLOCATION
	         $DEFAULTSERVERTYPE);

$MODVERSION = '0.1';
use HTTP::Request::Common;

use base qw(Bio::DB::WebDBSeqI);

# the new way to make modules a little more lightweight

BEGIN { 	
    # global vars
    $DEFAULTSERVERTYPE = 'dbfetch';
    $DEFAULTLOCATION = 'ebi';
}


=head1 Routines from Bio::DB::WebDBSeqI

=head2 get_request

 Title   : get_request
 Usage   : my $request = $self->get_request(-uids => \@ids, -format => 'fasta')
 Function: builds the HTTP GET request used to fetch the entries
 Returns : an HTTP::Request object
 Args    : %qualifiers = a hash of qualifiers (ids, format, etc)
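
The URL assembly performed by this method can be sketched with core Perl
only. This is an illustration under stated assumptions, not the module's
API: the helper name C<build_fetch_url> and the example base URL are
hypothetical, and the base URL is assumed to already carry a query string.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the string assembly in get_request.
# $base is assumed to already contain a query string (e.g. "?db=embl").
sub build_fetch_url {
    my ($base, $format, $uids) = @_;
    die "Must specify a value for UIDs to fetch\n" unless defined $uids;

    # Accept either a single id or a reference to an array of ids.
    my @ids = ref($uids) eq 'ARRAY' ? @$uids : ($uids);
    warn "More than 50 ids requested; the server ignores the rest\n"
        if @ids > 50;

    return $base . "&format=$format" . '&id=' . join(',', @ids);
}

# The base URL below is an assumed example value.
my $base = 'http://www.ebi.ac.uk/cgi-bin/dbfetch?db=embl&style=raw';
print build_fetch_url($base, 'fasta', [qw(J00231 K01298)]), "\n";
```

Several ids are sent as one comma-separated C<id> parameter, which is why
the 50-entry limit applies to the request as a whole.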

=cut

sub get_request {
	my ($self, @qualifiers) = @_;
	my ($uids, $format) = $self->_rearrange([qw(UIDS FORMAT)],
														 @qualifiers);

	$self->throw("Must specify a value for UIDs to fetch")
	  unless defined $uids;
	my $tmp;
	my $format_string = '';
	$format ||= $self->default_format;
	($format, $tmp) = $self->request_format($format);
	$format_string = "&format=$format"; 
	my $url = $self->location_url();
	my $uid;
	if( ref($uids) =~ /ARRAY/i ) {
		$uid = join (',', @$uids);
		$self->warn ('The server accepts a maximum of 50 entries per request. The rest are ignored.')
		  if scalar @$uids >50;
	} else {
		$uid = $uids;
	}

	return GET $url. $format_string. '&id='. $uid;
}


=head2 postprocess_data

 Title   : postprocess_data
 Usage   : $self->postprocess_data ( 'type' => 'string',
				     'location' => \$datastr);
 Function: process downloaded data before loading into a Bio::SeqIO
 Returns : void
 Args    : hash with two keys - 'type' can be 'string' or 'file'
                              - 'location' either file location or string
                                           reference containing data
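
The two cases can be sketched as standalone functions. This is a minimal
illustration, not the method itself: the helper names
C<strip_leading_ws> and C<drop_leading_blank_lines> and the sample data
are hypothetical.

```perl
use strict;
use warnings;

# For data held in a string: remove all leading whitespace in place.
sub strip_leading_ws {
    my ($ref) = @_;
    $$ref =~ s/^\s+//;
    return;
}

# For data read as lines: drop blank lines from the top only.
sub drop_leading_blank_lines {
    my @lines = @_;
    shift @lines while @lines && $lines[0] =~ /^\s*$/;
    return @lines;
}

my $data = "\n\n  ID   X56734\n";
strip_leading_ws(\$data);
print $data;

my @kept = drop_leading_blank_lines("\n", "  \n", "ID   X56734\n", "\n", "//\n");
print scalar(@kept), " lines kept\n";
```

Blank lines that appear after the first non-blank line are kept, so only
the top of the downloaded data is affected.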

=cut

# remove occasional blank lines at top of web output
sub postprocess_data {
  my ($self, %args) = @_;
  if ($args{type} eq 'string') {
    ${$args{location}} =~ s/^\s+//;  # get rid of leading whitespace
  }
  elsif ($args{type} eq 'file') {
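    open my $in, "<", $args{location} or $self->throw("Cannot open $args{location}: $!");
    my @data = <$in>;
    close $in;
    # drop blank lines from the top; shifting inside a foreach over @data
    # would skip elements and could leave a blank line behind
    shift @data while @data && $data[0] =~ /^\s+$/;
    open my $out, ">", $args{location} or $self->throw("Cannot write to $args{location}: $!");
    print $out @data;
    close $out;
  }
}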


