Bio-DB-SeqFeature
lib/Bio/DB/SeqFeature/Store.pm view on Meta::CPAN
@args = (-name=>shift);
}
push @args,(-aliases=>1);
$self->get_features_by_name(@args);
}
=head2 get_features_by_type
Title : get_features_by_type
Usage : @features = $db->get_features_by_type(@types)
Function: looks up features by their primary_tag
Returns : a list of matching features
Args : list of primary tags
Status : public
This method will return a list of features that have any of the
primary tags given in the argument list. For compatibility with
gbrowse and Bio::DB::GFF, types can be qualified using a colon:
primary_tag:source_tag
in which case only features that match both the primary_tag B<and> the
indicated source_tag will be returned. If the database was loaded from
a GFF3 file, this corresponds to the third and second columns of the
row, in that order.
For example, given the GFF3 lines:
ctg123 geneFinder exon 1300 1500 . + . ID=exon001
ctg123 fgenesH exon 1300 1520 . + . ID=exon002
exon001 and exon002 will be returned by searching for type "exon", but
only exon002 will be returned by searching for type "exon:fgenesH".
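The matching rule can be sketched in plain Perl. This is a minimal
illustration, not the module's implementation: the in-memory feature
hashes and the match_types() helper are hypothetical, and exact,
case-sensitive string comparison is an assumption.

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-ins for the two GFF3 rows above.
my @features = (
    { id => 'exon001', source_tag => 'geneFinder', primary_tag => 'exon' },
    { id => 'exon002', source_tag => 'fgenesH',    primary_tag => 'exon' },
);

# Illustrative helper (not part of Bio::DB::SeqFeature::Store): return the
# IDs of features matching any of the given "primary_tag[:source_tag]" types.
sub match_types {
    my ($features, @types) = @_;
    my @hits;
    FEATURE: for my $f (@$features) {
        for my $type (@types) {
            my ($primary, $source) = split /:/, $type, 2;
            next unless $f->{primary_tag} eq $primary;
            next if defined $source && $f->{source_tag} ne $source;
            push @hits, $f->{id};
            next FEATURE;    # one match is enough; avoid duplicates
        }
    }
    return @hits;
}

my @all     = match_types(\@features, 'exon');          # both exons
my @fgenesh = match_types(\@features, 'exon:fgenesH');  # exon002 only
print "exon: @all\n";
print "exon:fgenesH: @fgenesh\n";
```

An unqualified type matches on the primary tag alone, while a qualified
type must also agree on the source tag.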
=cut
sub get_features_by_type {
  my $self = shift;
  my @types = @_;
  $self->_features(-type=>\@types);
}
=head2 get_features_by_location
Title : get_features_by_location
Usage : @features = $db->get_features_by_location(@args)
Function: looks up features by their location
Returns : a list of matching features
Args : see below
Status : public
This method fetches features based on a location range lookup. You
call it using a positional list of arguments, or a list of
(-argument=E<gt>$value) pairs.
The positional form is as follows:
  $db->get_features_by_location($seqid [,$start [,$end]])
The $seqid is the name of the sequence on which the feature resides,
and start and end are optional endpoints for the match. If the
endpoints are missing then any feature on the indicated seqid is
returned.
Examples:
  get_features_by_location('chr1');            # all features on chromosome 1
  get_features_by_location('chr1',5000);       # features between 5000 and the end
  get_features_by_location('chr1',5000,8000);  # features between 5000 and 8000
Location lookups are overlapping. A feature will be returned if it
partially or completely overlaps the indicated range.
The named argument form gives you more control:
  Argument       Value
  --------       -----
  -seq_id        The name of the sequence on which the feature resides
  -start         Start of the range
  -end           End of the range
  -strand        Strand of the feature
  -range_type    Type of range to search over
The B<-strand> argument, if present, can be one of "0" to find
features that are on both strands, "+1" to find only plus strand
features, and "-1" to find only minus strand features. Specifying a
strand of undef is the same as not specifying this argument at all,
and retrieves all features regardless of their strandedness.
The B<-range_type> argument, if present, can be one of "overlaps" (the
default), to find features whose positions overlap the indicated
range; "contains", to find features whose endpoints are completely
contained within the indicated range; or "contained_in", to find
features that completely enclose the indicated range, with both
endpoints outside it.
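The three range types can be expressed as simple coordinate
comparisons. The sketch below is illustrative only: range_matches() is
a hypothetical helper, and treating endpoints as inclusive (so that
equal endpoints count as a match) is an assumption, not something the
text above specifies.

```perl
use strict;
use warnings;

# Illustrative predicate (hypothetical helper, not the module's code):
# does a feature spanning [$fstart,$fend] match the query range
# [$qstart,$qend] under the given range type? Endpoints are inclusive.
sub range_matches {
    my ($range_type, $fstart, $fend, $qstart, $qend) = @_;
    $range_type ||= 'overlaps';
    if ($range_type eq 'overlaps') {
        # feature and query share at least one position
        return $fstart <= $qend && $fend >= $qstart;
    }
    if ($range_type eq 'contains') {
        # feature lies entirely inside the query range
        return $fstart >= $qstart && $fend <= $qend;
    }
    if ($range_type eq 'contained_in') {
        # feature encloses the query range
        return $fstart <= $qstart && $fend >= $qend;
    }
    die "unknown range type '$range_type'";
}

# Query range 5000..8000 against three hypothetical features.
my %feature = (
    inside   => [6000,  7000],
    straddle => [4000,  5500],
    around   => [1000, 10000],
);
for my $type (qw(overlaps contains contained_in)) {
    my @hits = grep { range_matches($type, @{ $feature{$_} }, 5000, 8000) }
               sort keys %feature;
    print "$type: @hits\n";
}
```

With these inputs, all three features overlap the query, only "inside"
is contained by it, and only "around" encloses it.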
=cut
sub get_features_by_location {
  my $self = shift;
  my ($seqid,$start,$end,$strand,$rangetype) =
    rearrange([['SEQ_ID','SEQID','REF'],'START',['STOP','END'],'STRAND','RANGE_TYPE'],@_);
  $self->_features(-seqid=>$seqid,
                   -start=>$start||undef,
                   -end=>$end||undef,
                   -strand=>$strand||undef,
                   -range_type=>$rangetype);
}
=head2 get_features_by_attribute
Title : get_features_by_attribute
Usage : @features = $db->get_features_by_attribute(@args)
Function: looks up features by their attributes/tags
Returns : a list of matching features
Args : see below
Status : public
This implements a simple tag filter. Pass a list of tag names and
their values. The module will return a list of features whose tag
names and values match. Tag names are case insensitive. If multiple
tag name/value pairs are present, they will be ANDed together. To
match any of a list of values, use an array reference for the value.
Examples:
# return all features whose "function" tag is "GO:0000123"
@features = $db->get_features_by_attribute(function => 'GO:0000123');
# return all features whose "function" tag is "GO:0000123" or "GO:0000555"
@features = $db->get_features_by_attribute(function => ['GO:0000123','GO:0000555']);
# return all features whose "function" tag is "GO:0000123" or "GO:0000555"
# and whose "confirmed" tag is 1
@features = $db->get_features_by_attribute(function => ['GO:0000123','GO:0000555'],
confirmed => 1);
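The AND/OR semantics described above can be sketched as follows. This
is an illustration under stated assumptions, not the module's
implementation: the feature hashes and filter_by_attribute() are
hypothetical, and values are compared with exact string equality.

```perl
use strict;
use warnings;

# Hypothetical features, each with a hash of tag => [values].
my @features = (
    { id => 'f1', tags => { function => ['GO:0000123'], confirmed => [1] } },
    { id => 'f2', tags => { function => ['GO:0000555'], confirmed => [0] } },
    { id => 'f3', tags => { function => ['GO:0000999'], confirmed => [1] } },
);

# Illustrative filter: every tag in %$wanted must match (AND); an array
# reference lists acceptable alternatives for that tag (OR).
sub filter_by_attribute {
    my ($features, $wanted) = @_;
    my @hits;
    FEATURE: for my $f (@$features) {
        # Lower-case the feature's tag names so lookups ignore case.
        my %tags = map { (lc($_) => $f->{tags}{$_}) } keys %{ $f->{tags} };
        for my $name (keys %$wanted) {
            my $want  = $wanted->{$name};
            my @ok    = ref($want) eq 'ARRAY' ? @$want : ($want);
            my $have  = $tags{ lc $name } || [];
            my $found = grep { my $v = $_; grep { $_ eq $v } @ok } @$have;
            next FEATURE unless $found;
        }
        push @hits, $f->{id};
    }
    return @hits;
}

my @either = filter_by_attribute(\@features,
    { Function => ['GO:0000123', 'GO:0000555'] });
my @both   = filter_by_attribute(\@features,
    { function => ['GO:0000123', 'GO:0000555'], confirmed => 1 });
print "either: @either\n";
print "both: @both\n";
```

The first call uses a differently cased tag name to show the
case-insensitive lookup; the second adds a further tag, which narrows
the result.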
=cut
sub get_features_by_attribute {
  my $self = shift;
  my %attributes = ref($_[0]) ? %{$_[0]} : @_;
  %attributes or $self->throw("Usage: get_features_by_attribute(attribute_name=>\$attribute_value...)");
  $self->_features(-attributes=>\%attributes);
}
###
# features() call -- main query interface
#
=head2 features
Title : features
Usage : @features = $db->features(@args)
Function: generalized query & retrieval interface
Returns : list of features
lib/Bio/DB/SeqFeature/Store.pm view on Meta::CPAN
Title : fetch_sequence
Usage : $sequence = $db->fetch_sequence(-seq_id=>$seqid,-start=>$start,-end=>$end)
Function: Fetch the indicated subsequence from the database
Returns : The sequence string (not a sequence object, unless -bioseq is true)
Args : see below
Status : public
This method retrieves a portion of the indicated sequence. The arguments are:
  Argument     Value
  --------     -----
  -seq_id      Chromosome, contig or other DNA segment
  -seqid       Synonym for -seq_id
  -name        Synonym for -seq_id
  -start       Start of range
  -end         End of range
  -class       Obsolete argument used for Bio::DB::GFF compatibility. If
               specified will qualify the seq_id as "$seq_id:$class".
  -bioseq      Boolean flag; if true, returns a Bio::Seq object instead
               of a sequence string.
You can call fetch_sequence using the following shortcuts:
  $seq = $db->fetch_sequence('chr3');             # entire chromosome
  $seq = $db->fetch_sequence('chr3',1000);        # position 1000 to end of chromosome
  $seq = $db->fetch_sequence('chr3',undef,5000);  # position 1 to 5000
  $seq = $db->fetch_sequence('chr3',1000,5000);   # positions 1000 to 5000
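The way missing endpoints default in these shortcuts can be sketched
in plain Perl. This is a minimal illustration that assumes 1-based,
inclusive coordinates with start <= end; the %dna hash and the
subsequence() helper are hypothetical stand-ins, not the module's
storage or code.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a stored sequence.
my %dna = (chr3 => 'ACGTACGTAC');

# Illustrative helper: 1-based, inclusive coordinates; a missing start
# defaults to 1 and a missing end defaults to the sequence length.
sub subsequence {
    my ($seqid, $start, $end) = @_;
    my $seq = $dna{$seqid};
    return undef unless defined $seq;
    $start = 1            unless defined $start;
    $end   = length($seq) unless defined $end;
    # substr() is 0-based, so shift the start by one.
    return substr($seq, $start - 1, $end - $start + 1);
}

my $whole  = subsequence('chr3');            # entire sequence
my $tail   = subsequence('chr3', 4);         # position 4 to the end
my $head   = subsequence('chr3', undef, 5);  # positions 1 to 5
my $middle = subsequence('chr3', 4, 6);      # positions 4 to 6
print join("\n", $whole, $tail, $head, $middle), "\n";
```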
=cut
###
# fetch_sequence()
#
# equivalent to old Bio::DB::GFF->dna() method
#
sub fetch_sequence {
  my $self = shift;
  my ($seqid,$start,$end,$class,$bioseq) = rearrange([['NAME','SEQID','SEQ_ID'],
                                                      'START',['END','STOP'],'CLASS','BIOSEQ'],@_);
  $seqid = "$seqid:$class" if defined $class;
  my $seq = $self->seq($seqid,$start,$end);
  return $seq unless $bioseq;
  require Bio::Seq unless Bio::Seq->can('new');
  my $display_id = defined $start ? "$seqid:$start..$end" : $seqid;
  return Bio::Seq->new(-display_id=>$display_id,-seq=>$seq);
}
=head2 segment
Title : segment
Usage : $segment = $db->segment($seq_id [,$start] [,$end] [,$absolute])
Function: restrict the database to a sequence range
Returns : a Bio::DB::SeqFeature::Segment object
Args : sequence id, start and end ranges (optional)
Status : public
This is a convenience method that can be used when you are interested
in the contents of a particular sequence landmark, such as a
contig. Specify the ID of a sequence or other landmark in the database
and optionally a start and endpoint relative to that landmark. The
method will look up the region and return a
Bio::DB::SeqFeature::Segment object that spans it. You can then use
this segment object to make location-restricted queries on the database.
Example:
$segment = $db->segment('contig23',1,1000); # first 1000 bp of contig23
my @mRNAs = $segment->features('mRNA'); # all mRNAs that overlap segment
Although you will usually want to fetch segments that correspond to
physical sequences in the database, you can actually use any feature
in the database as the sequence ID. The segment() method will perform
a get_features_by_name() internally and then transform the feature
into the appropriate coordinates.
The named feature should exist once and only once in the database. If
it exists multiple times in the database and you attempt to call
segment() in a scalar context, you will get an exception. A workaround
is to call the method in a list context, as in:
my ($segment) = $db->segment('contig23',1,1000);
or
my @segments = $db->segment('contig23',1,1000);
However, having multiple same-named features in the database is often
an indication of underlying data problems.
If the optional $absolute argument is a true value, then the specified
coordinates are interpreted in the (absolute) coordinate system of the
reference sequence rather than relative to the named landmark.
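The scalar versus list context rule can be illustrated with Perl's
wantarray. The sketch below only mimics the documented behavior; the
%by_name index and find_segments() are hypothetical and do not touch a
real database.

```perl
use strict;
use warnings;

# Hypothetical name index: 'dup1' is stored twice.
my %by_name = (
    contig23 => [ { seq_id => 'contig23', start => 1, end => 50_000 } ],
    dup1     => [ { seq_id => 'chr1', start => 100, end => 200 },
                  { seq_id => 'chr2', start => 700, end => 900 } ],
);

# Illustrative lookup: refuse to pick one of several matches when the
# caller asked for a single value.
sub find_segments {
    my ($name) = @_;
    my @matches = @{ $by_name{$name} || [] };
    die "called in a scalar context but multiple features match\n"
        if !wantarray && @matches > 1;
    return wantarray ? @matches : $matches[0];
}

my $single   = find_segments('contig23');   # scalar context, one match
my @multiple = find_segments('dup1');       # list context, two matches
my ($first)  = find_segments('dup1');       # list context, keep the first
my $ok       = eval { my $s = find_segments('dup1'); 1 };  # scalar: dies
print "scalar call on a duplicated name failed: $@" unless $ok;
```

Wrapping the left-hand side in parentheses, as in the third call, is
what forces list context and avoids the exception.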
=cut
###
# Replacement for Bio::DB::GFF->segment() method
#
sub segment {
  my $self = shift;
  my (@features,@args);
  if (@_ == 1 && blessed($_[0])) {
    @features = @_;
    @args = ();
  }
  else {
    @args = $self->setup_segment_args(@_);
    @features = $self->get_features_by_name(@args);
  }
  if (!wantarray && @features > 1) {
    $self->throw(<<END);
segment() called in a scalar context but multiple features match.
Either call in a list context or narrow your search using the -types or -class arguments
END
  }
  my ($rel_start,$rel_end,$abs) = rearrange(['START',['STOP','END'],'ABSOLUTE'],@args);
  $rel_start = 1 unless defined $rel_start;
  my @segments;
  for my $f (@features) {