Bio-Graphics-Glyph-decorated_gene
lib/Bio/Graphics/Glyph/decorated_transcript.pm view on Meta::CPAN
package Bio::Graphics::Glyph::decorated_transcript;
use strict;
use warnings;
use Bio::Graphics::Panel;
use Bio::Graphics::Feature;    # used by get_decorations_as_features()
use List::Util qw[min max];
use vars qw($VERSION);
$VERSION = '0.02';
use constant DECORATION_TAG_NAME => 'protein_decorations';
use constant DEBUG => 0;
my @color_names = Bio::Graphics::Panel::color_names;
use base
qw(Bio::Graphics::Glyph::processed_transcript);
sub my_description {
return <<END;
This glyph extends the functionality of the Bio::Graphics::Glyph::processed_transcript glyph
and allows protein decorations (e.g., signal peptides, transmembrane domains, protein domains)
to be drawn on top of gene models. Currently, the glyph can draw decorations in the form of colored or outlined boxes
inside or around CDS segments. Protein decorations are specified at the 'mRNA' transcript level
in protein coordinates. Protein coordinates are automatically mapped to nucleotide coordinates by the glyph.
Decorations are allowed to span exon-exon junctions, in which case decorations are split between exons.
By default, the glyph automatically assigns different colors to different types of protein decorations, while
decorations of the same type are always assigned the same color.
Protein decorations are provided either with mRNA features inside GFF files (see example below) or
dynamically via a callback function using the B<additional_decorations> option (see glyph options).
The following line is an example of an mRNA feature in a GFF file that contains two protein decorations,
one signal peptide predicted by SignalP and one transmembrane domain predicted by TMHMM:
chr1 my_source mRNA 74796 75599 . + . ID=rna_gene-1;protein_decorations=SignalP40:SP:1:23:0:my_comment,TMHMM:TM:187:209:0
Each protein decoration consists of up to six fields separated by colons (the last two are optional):
1) Type. Decoration type, used for example to specify the decoration source (e.g. 'SignalP40')
2) Name. Decoration name. Used as decoration label by default (e.g. 'SP' for signal peptide)
3) Start. Start coordinate at the protein-level (1-based coordinate)
4) End. End coordinate at the protein-level
5) Score. Optional. Score associated with a decoration (e.g. Pfam E-value). This score can be used
to dynamically filter or color decorations via callbacks (see glyph options).
6) Description. Optional. User-defined description of decoration. The glyph ignores this description,
but it will be made available to callback functions for inspection. Special characters
like ':' or ',' that might interfere with the GFF tag parser should be avoided.
If callback functions are used as glyph parameters (see below), the callback is called for each
decoration separately. That is, the callback can get called multiple times for the same CDS feature,
but each time with a different active decoration. The currently drawn (active) decoration is made available
to the callback via the glyph method 'active_decoration'. The active decoration is returned in the form
of a Bio::Graphics::Feature object, with decoration data fields mapped to corresponding feature
attributes in the following way:
type --> \$glyph->active_decoration->type
name --> \$glyph->active_decoration->name
nucleotide start coordinate --> \$glyph->active_decoration->start
nucleotide end coordinate --> \$glyph->active_decoration->end
protein start coordinate --> \$glyph->active_decoration->get_tag_values('p_start')
protein end coordinate --> \$glyph->active_decoration->get_tag_values('p_end')
score --> \$glyph->active_decoration->score
description --> \$glyph->active_decoration->desc
In addition, the glyph passed to the callback allows access to the parent glyph and
parent feature if required (use \$glyph->parent or \$glyph->parent->feature).
NOTE: This glyph works only with GFF3-compliant features. In particular, make sure that
every feature has a valid unique ID and that all child features have a valid parent ID.
END
}
sub my_options {
return {
decoration_visible => [
'boolean',
'false',
'Specifies whether decorations should be visible or not. For selective display of individual',
'decorations, specify a callback function and return 1 or 0 after inspecting the active',
'decoration of the glyph. '],
decoration_color => [
'color',
undef,
'Decoration background color. If no color is specified, colors are assigned automatically',
'by decoration type and name, whereas decorations of identical type and name are assigned',
'the same color. A special color \'transparent\' can be used here in combination with',
'the option \'decoration_border\' to draw decorations as outlines.'],
decoration_border => [
['none', 'solid', 'dashed'],
'none',
'Decoration border style. By default, decorations are drawn without border (\'none\' or',
'0). Other valid options here include \'solid\' or \'dashed\'.'],
decoration_border_color => [
'color',
'black',
'Color of decoration border.'],
decoration_label => [
'string',
undef,
'Decoration label. If not specified, the second data field of the decoration is used',
'as label. Set this option to 0 to get unlabeled decorations. If the label text',
'extends beyond the size of the decorated segment, the label will be clipped. Clipping',
'does not occur for SVG output.'],
lib/Bio/Graphics/Glyph/decorated_transcript.pm view on Meta::CPAN
print STDERR "\n";
}
}
$self->{'sorted_decorations'} = $sorted_decorations;
return $sorted_decorations;
}
# returns decorations of feature as Bio::Graphics::Feature array, with coordinates mapped to nucleotide space
sub get_decorations_as_features
{
my $feature = shift;
my $additional_decorations = shift; # optional
my $cds_tag_name = shift; # optional; default: "CDS"
my @features;
my $map = _get_coordinate_map($feature, $cds_tag_name);
my @decorations = get_feature_decorations($feature);
push(@decorations, @$additional_decorations) if ($additional_decorations);
# map coordinates and encapsulate in Bio::Graphics::Feature objects
foreach my $decoration (@decorations)
{
my ( $type, $name, $p_start, $p_end, $score, $desc ) = split( ":", $decoration );
if (!defined $p_end)
{
warn "get_decorations_as_features(): WARNING: invalid decoration data for feature $feature(".$feature->primary_tag."):\n$decoration\n";
next;
}
my $nt_start =$map->{$p_start}->{'codon_start'};
if (!$nt_start)
{
warn "get_decorations_as_features(): WARNING: could not map decoration start coordinate on feature $feature(".$feature->primary_tag."):\n$decoration\n";
next;
}
my $nt_end = $map->{$p_end}->{'codon_end'};
if (!$nt_end)
{
warn "get_decorations_as_features(): WARNING: could not map decoration end coordinate on feature $feature(".$feature->primary_tag."):\n$decoration\n";
next;
}
( $nt_start, $nt_end ) = ( $nt_end, $nt_start )
if ( $nt_start > $nt_end );
my $f = Bio::Graphics::Feature->new
(
-type => $type,
-name => $name,
-display_name => $name,
-start => $nt_start,
-end => $nt_end,
-score => $score,
-desc => $desc,
-seq_id => $feature->seq_id,
-strand => $feature->strand,
-attributes => { # remember protein coordinates for callbacks
'p_start' => $p_start,
'p_end' => $p_end
}
);
warn "DECORATION=$decoration --> $nt_start:$nt_end\n" if (DEBUG);
push(@features, $f);
}
return wantarray ? @features : \@features;
}
# map protein to nucleotide coordinate
sub _get_coordinate_map {
my $feature = shift;
my $cds_tag_name = shift || 'CDS';
my %map;
# sort all CDS features by coordinates
# NOTE: filtering for CDS features by passing feature type to get_SeqFeatures()
# does not work for some reason, probably when no feature store attached
my @cds = grep { $_->primary_tag eq $cds_tag_name } $feature->get_SeqFeatures();
if ( $feature->strand > 0 ) {
my ( $ppos, $residue ) = ( 1, 0 );
my @sorted_cds = sort { $a->start <=> $b->start } (@cds);
foreach my $c (@sorted_cds) {
$map{ $ppos - 1 }{'codon_end'} = $c->start + $residue - 1
if ($residue);
for (
my $ntpos = $c->start + $residue ;
$ntpos <= $c->end ;
$ntpos += 3
)
{
$map{$ppos}{'codon_start'} = $ntpos;
$map{$ppos}{'codon_end'} = $ntpos + 2;
$ppos++;
$residue = $ntpos + 2 - $c->end;
}
}
}
else {
my ( $ppos, $residue ) = ( 1, 0 );
my @sorted_cds = reverse sort { $a->start <=> $b->start } (@cds);
foreach my $c (@sorted_cds) {
$map{ $ppos - 1 }{'codon_end'} = $c->end - $residue + 1
if ($residue);
for (
my $ntpos = $c->end - $residue ;
$ntpos >= $c->start ;
$ntpos -= 3
)
{
$map{$ppos}{'codon_start'} = $ntpos;
$map{$ppos}{'codon_end'} = $ntpos - 2;
# print $self->feature->name."\t$ppos\t".$self->{'p2n'}{$ppos}{'codon_start'}."\t".$self->{'p2n'}{$ppos}{'codon_end'}."\n" if ($self->feature->name eq "DAF19-b");
$ppos++;
$residue = $c->start - ( $ntpos - 2 );
}
lib/Bio/Graphics/Glyph/decorated_transcript.pm view on Meta::CPAN
$gd->string( $font, $h_left + 2, $gd->isa("GD::SVG::Image") ? $label_top-1 : $label_top, $label, $self->factory->translate_color($label_color) );
$gd->clip( 0, 0, $gd->width, $gd->height )
if ( !$gd->isa("GD::SVG::Image") );
}
1;
__END__
=head1 NAME
Bio::Graphics::Glyph::decorated_transcript - draws processed transcript with protein decorations
=head1 SYNOPSIS
See L<Bio::Graphics::Panel> and L<Bio::Graphics::Glyph>.
=head1 DESCRIPTION
This glyph extends the functionality of the L<Bio::Graphics::Glyph::processed_transcript> glyph
and allows protein decorations (e.g., signal peptides, transmembrane domains, protein domains)
to be drawn on top of gene models. Currently, the glyph can draw decorations in the form of colored or outlined boxes
inside or around CDS segments. Protein decorations are specified at the 'mRNA' transcript level
in protein coordinates. Protein coordinates are automatically mapped to nucleotide coordinates by the glyph.
Decorations are allowed to span exon-exon junctions, in which case decorations are split between exons.
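The protein-to-nucleotide mapping can be sketched for the forward strand as follows. This is a minimal illustration, assuming CDS segments are given as plain [start, end] pairs rather than feature objects. The helper name C<protein_to_nt_map> and the coordinates are hypothetical. The loop follows the forward-strand branch of the glyph's internal C<_get_coordinate_map> routine.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the glyph's API): build a
# protein -> nucleotide map for a forward-strand transcript whose
# CDS segments are passed as [start, end] pairs.
sub protein_to_nt_map {
    my @cds = sort { $a->[0] <=> $b->[0] } @_;
    my %map;
    # After each segment, $residue holds the number of bases of a split
    # codon that spill over into the next segment (0, 1 or 2).
    my ( $ppos, $residue ) = ( 1, 0 );
    foreach my $c (@cds) {
        my ( $start, $end ) = @$c;
        # A codon split across the previous junction ends inside this segment.
        $map{ $ppos - 1 }{codon_end} = $start + $residue - 1 if $residue;
        for ( my $nt = $start + $residue ; $nt <= $end ; $nt += 3 ) {
            $map{$ppos}{codon_start} = $nt;
            $map{$ppos}{codon_end}   = $nt + 2;
            $ppos++;
            $residue = $nt + 2 - $end;
        }
    }
    return \%map;
}

# Two CDS segments; the second codon starts at nt 104 and is split
# across the exon-exon junction.
my $map = protein_to_nt_map( [ 101, 104 ], [ 201, 208 ] );
printf "residue %d: %d-%d\n", $_, @{ $map->{$_} }{qw(codon_start codon_end)}
    for sort { $a <=> $b } keys %$map;
```

In this example residue 2 maps to 104-202, so a decoration covering it would be drawn in two pieces, one at the end of the first segment and one at the start of the second.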
By default, the glyph automatically assigns different colors to different types of protein decorations, while
decorations of the same type are always assigned the same color.
Protein decorations are provided either with mRNA features inside GFF files (see example below) or
dynamically via a callback function using the B<additional_decorations> option (see glyph options).
The following line is an example of an mRNA feature in a GFF file that contains two protein decorations,
one signal peptide predicted by SignalP and one transmembrane domain predicted by TMHMM:
C<chr1 my_source mRNA 74796 75599 . + . ID=rna_gene-1;protein_decorations=SignalP40:SP:1:23:0:my_comment,TMHMM:TM:187:209:0>
Each protein decoration consists of up to six fields separated by colons (the last two are optional):
=over
=item 1. type
Decoration type. Used, for example, to specify the decoration source (e.g. 'SignalP40')
=item 2. name
Decoration name. Used as decoration label by default (e.g. 'SP' for signal peptide)
=item 3. start
Start coordinate at the protein-level (1-based coordinate)
=item 4. end
End coordinate at the protein-level
=item 5. score
Optional. Score associated with a decoration (e.g. Pfam E-value). This score can be used
to dynamically filter or color decorations via callbacks (see glyph options).
=item 6. description
Optional. User-defined description of decoration. The glyph ignores this description,
but it will be made available to callback functions for inspection. Special characters
like ':' or ',' that might interfere with the GFF tag parser should be avoided.
=back
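The field layout above can be illustrated with a small parser. This is a sketch, assuming the tag value is available as a plain string. C<parse_decorations> is a hypothetical helper and not part of the glyph's API.

```perl
use strict;
use warnings;

# Hypothetical helper: split a protein_decorations tag value into records.
sub parse_decorations {
    my ($tag_value) = @_;
    my @records;
    foreach my $decoration ( split /,/, $tag_value ) {
        my ( $type, $name, $p_start, $p_end, $score, $desc ) =
            split /:/, $decoration;
        # Type, name, start and end are required; score and description
        # are optional and stay undef when absent.
        next unless defined $p_end;
        push @records, {
            type    => $type,
            name    => $name,
            p_start => $p_start,
            p_end   => $p_end,
            score   => $score,
            desc    => $desc,
        };
    }
    return @records;
}

my @decorations =
    parse_decorations('SignalP40:SP:1:23:0:my_comment,TMHMM:TM:187:209:0');
foreach my $d (@decorations) {
    printf "%s %s %d-%d\n", @{$d}{qw(type name p_start p_end)};
}
```

Because both ':' and ',' act as separators here, a description containing either character would be split incorrectly, which is why such characters should be avoided.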
If callback functions are used as glyph parameters (see below), the callback is called for each
decoration separately. That is, the callback can get called multiple times for the same CDS feature,
but each time with a different active decoration. The currently drawn (active) decoration is made available
to the callback via the glyph method 'active_decoration'. The active decoration is returned in the form
of a Bio::Graphics::Feature object, with decoration data fields mapped to corresponding feature
attributes in the following way:
=over
=item * type --> $glyph->active_decoration->type
=item * name --> $glyph->active_decoration->name
=item * nucleotide start coordinate --> $glyph->active_decoration->start
=item * nucleotide end coordinate --> $glyph->active_decoration->end
=item * protein start coordinate --> $glyph->active_decoration->get_tag_values('p_start')
=item * protein end coordinate --> $glyph->active_decoration->get_tag_values('p_end')
=item * score --> $glyph->active_decoration->score
=item * description --> $glyph->active_decoration->desc
=back
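A callback that inspects the active decoration might look like the following sketch. It assumes the usual Bio::Graphics callback argument order (feature, option name, part number, total parts, glyph). The stub packages are hypothetical and exist only so the example runs without a panel.

```perl
use strict;
use warnings;

# Callback intended for -decoration_visible: show only TMHMM decorations.
# Assumes the standard Bio::Graphics callback arguments.
my $visible = sub {
    my ( $feature, $option_name, $part_no, $total_parts, $glyph ) = @_;
    return $glyph->active_decoration->type eq 'TMHMM' ? 1 : 0;
};

# Hypothetical stand-ins for the glyph and its active decoration, used
# only to exercise the callback outside of a Bio::Graphics::Panel.
{
    package StubDecoration;
    sub new  { my ( $class, %args ) = @_; return bless {%args}, $class }
    sub type { $_[0]{type} }

    package StubGlyph;
    sub new { my ( $class, $d ) = @_; return bless { d => $d }, $class }
    sub active_decoration { $_[0]{d} }
}

for my $type (qw(SignalP40 TMHMM)) {
    my $glyph = StubGlyph->new( StubDecoration->new( type => $type ) );
    print "$type: ",
        $visible->( undef, 'decoration_visible', 0, 1, $glyph ), "\n";
}
```

In a real panel the same code reference would be passed as the value of the C<-decoration_visible> option when the track is added.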
In addition, the glyph passed to the callback allows access to the parent glyph and
parent feature if required (use $glyph->parent or $glyph->parent->feature).
NOTE: This glyph works only with GFF3-compliant features. In particular, make sure that
every feature has a valid unique ID and that all child features have a valid parent ID.
=head2 OPTIONS
This glyph inherits all options from the L<Bio::Graphics::Glyph::processed_transcript> glyph.
In addition, it recognizes the following glyph-specific options:
  Option          Description                              Default
  ------          -----------                              -------
  -decoration_visible                                      false
                  Specifies whether decorations should be visible
                  or not. For selective display of individual
                  decorations, specify a callback function and
                  return 1 or 0 after inspecting the active decoration
                  of the glyph.
  -decoration_color                                        <auto>