BioPerl

Bio/DB/Query/GenBank.pm


=head1 AUTHOR - Lincoln Stein

Email lstein@cshl.org

=head1 APPENDIX

The rest of the documentation details each of the
object methods. Internal methods are usually
preceded with an underscore (_).

=cut

# Let the code begin...

package Bio::DB::Query::GenBank;
use strict;
use URI::Escape 'uri_unescape';
use Bio::DB::NCBIHelper;


#use constant EPOST       => $Bio::DB::NCBIHelper::HOSTBASE . '/entrez/eutils/epost.fcgi';
#use constant ESEARCH     => $Bio::DB::NCBIHelper::HOSTBASE . '/entrez/eutils/esearch.fcgi';
# Referencing the package variable $Bio::DB::NCBIHelper::HOSTBASE inside the
# constant definition does not seem to work in Perl 5.10.1 or 5.16.3, so only
# the path portion of each URL is stored here.
use constant EPOST       => '/entrez/eutils/epost.fcgi';
use constant ESEARCH     => '/entrez/eutils/esearch.fcgi';
use constant DEFAULT_DB  => 'protein';
use constant MAXENTRY    => 100;

use vars qw(@ATTRIBUTES);

use base qw(Bio::DB::Query::WebQuery);

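BEGIN {
  @ATTRIBUTES = qw(db reldate mindate maxdate datetype maxids);
  # Generate a combined getter/setter for each attribute. Each accessor
  # returns the value held before the call, even when a new value is set.
  for my $method (@ATTRIBUTES) {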
    eval <<END;
sub $method {
   my \$self = shift;
   my \$d    = \$self->{'_$method'};
   \$self->{'_$method'} = shift if \@_;
   \$d;
}
END
  }
}

=head2 new

 Title   : new
 Usage   : $db = Bio::DB::Query::GenBank->new(@args)
 Function: create new query object
 Returns : new query object
 Args    : -db       database (see below for allowable values;
                     defaults to 'protein')
           -query    query string
           -mindate  minimum date to retrieve from (YYYY/MM/DD)
           -maxdate  maximum date to retrieve from (YYYY/MM/DD)
           -reldate  relative date to retrieve from (days)
           -datetype date field to use ('edat' or 'mdat';
                     defaults to 'mdat')
           -ids      array ref of IDs (overrides query)
           -maxids   the maximum number of IDs you wish to collect
                     (defaults to 100)

This method creates a new query object.  Typically you will specify a
-db and a -query argument, possibly modified by -mindate, -maxdate, or
-reldate.  -mindate and -maxdate specify minimum and maximum dates for
entries you are interested in retrieving, expressed in the form
YYYY/MM/DD.  -reldate is used to fetch entries that are more recent
than the indicated number of days.
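
As a rough illustration of the date arguments, the following sketch
builds a query restricted to a date range.  The query string and the
dates are made-up example values, not recommendations.

    use Bio::DB::Query::GenBank;

    # Illustrative values only: the query string and dates are assumptions.
    my $query = Bio::DB::Query::GenBank->new(
        -db       => 'nucleotide',
        -query    => 'Arabidopsis[ORGN] AND topoisomerase[TITL]',
        -mindate  => '2001/01/01',
        -maxdate  => '2002/12/31',
        -datetype => 'edat',
    );

    # To select by recency instead, replace -mindate/-maxdate with,
    # for example, -reldate => 30.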

If you provide an array reference of IDs in -ids, the query will be
ignored and the list of IDs will be used when the query is passed to a
Bio::DB::GenBank object's get_Stream_by_query() method.  A variety of
IDs are automatically recognized, including GI numbers, Accession
numbers, Accession.version numbers and locus names.
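
A minimal sketch of the -ids form, assuming Bio::DB::GenBank is
installed and NCBI is reachable.  The two accessions are example
values only.

    use Bio::DB::GenBank;
    use Bio::DB::Query::GenBank;

    # Example accessions; substitute the IDs you actually need.
    my $query = Bio::DB::Query::GenBank->new(
        -db  => 'nucleotide',
        -ids => [qw(J00522 AF303112)],
    );

    # Hand the query to a GenBank handle and iterate over the results.
    my $gb     = Bio::DB::GenBank->new;
    my $stream = $gb->get_Stream_by_query($query);
    while ( my $seq = $stream->next_seq ) {
        print $seq->display_id, "\n";
    }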

By default, the query will collect only the first 100 IDs and will
generate an exception if you call the ids() method and the query
returned more than that number.  To increase this maximum, set -maxids
to a number larger than the number of IDs you expect to obtain.  This
only affects the list of IDs you obtain when you call the ids()
method, and does not affect in any way the number of entries you
receive when you generate a SeqIO stream from the query.
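
The following sketch shows one way -maxids might be raised before
calling ids().  The query string and the limit of 5000 are
illustrative assumptions.

    use Bio::DB::Query::GenBank;

    # Illustrative query and limit; choose a limit above the expected count.
    my $query = Bio::DB::Query::GenBank->new(
        -db     => 'protein',
        -query  => 'topoisomerase[TITL]',
        -maxids => 5000,
    );

    my $count = $query->count;   # number of entries matching the query
    my @ids   = $query->ids;     # raises an exception if more than
                                 # -maxids IDs were returned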

-db option values:

  The most commonly used databases are:

      protein
      nucleotide
      nuccore
      nucgss
      nucest
      unigene

  An up-to-date list of database names supported by NCBI eUtils is
  always available at:
  https://eutils.ncbi.nlm.nih.gov/entrez/eutils/einfo.fcgi?

  Note, however, that not all of these databases return data types
  that are parsable by Bio::DB::GenBank.

=cut

sub new {
  my $class = shift;
  my $self  = $class->SUPER::new(@_);
  # -query and -ids are extracted here but not stored by this method;
  # they are left to the parent constructor.
  my ($query,$db,$reldate,$mindate,$maxdate,$datetype,$ids,$maxids)
    = $self->_rearrange([qw(QUERY DB RELDATE MINDATE MAXDATE DATETYPE IDS MAXIDS)],@_);
  $self->db($db || DEFAULT_DB);
  # Only set the optional attributes that were actually supplied.
  $reldate  && $self->reldate($reldate);
  $mindate  && $self->mindate($mindate);
  $maxdate  && $self->maxdate($maxdate);
  $maxids   && $self->maxids($maxids);
  # Fall back to the modification date when no date type is given.
  $self->datetype($datetype || 'mdat');
  $self;
}

=head2 cookie


