Bio-Graphics

lib/Bio/Graphics/Glyph/decorated_transcript.pm

package Bio::Graphics::Glyph::decorated_transcript;

use strict;
use warnings;

use Bio::Graphics::Panel;
use List::Util qw[min max];

use constant DECORATION_TAG_NAME => 'protein_decorations';
use constant DEBUG              => 0;

my @color_names = Bio::Graphics::Panel::color_names;

use base
  qw(Bio::Graphics::Glyph::processed_transcript Bio::Graphics::Glyph::segments);

sub my_description {
  return <<END;
This glyph extends the functionality of the Bio::Graphics::Glyph::processed_transcript glyph 
and draws protein decorations (e.g., signal peptides, transmembrane domains, protein domains)
on top of gene models. Currently, the glyph can draw decorations as colored or outlined boxes 
inside or around CDS segments. Protein decorations are specified at the 'mRNA' transcript level 
in protein coordinates, which the glyph automatically maps to nucleotide coordinates. 
Decorations may span exon-exon junctions, in which case they are split between exons. 
By default, the glyph assigns different colors to different types of protein decorations, while 
decorations of the same type always receive the same color. 

Protein decorations are provided either with mRNA features inside GFF files (see example below) or 
dynamically via a callback function using the B<additional_decorations> option (see glyph options).
The following line is an example of an mRNA feature in a GFF file that contains two protein decorations, 
one signal peptide predicted by SignalP and one transmembrane domain predicted by TMHMM:

chr1   my_source   mRNA  74796  75599   .  +  .  ID=rna_gene-1;protein_decorations=SignalP40:SP:1:23:0:my_comment,TMHMM:TM:187:209:0

Each protein decoration consists of six fields separated by a colon:

1) Type. Typically identifies the source of the decoration (e.g. 'SignalP40')
2) Name. Decoration name. Used as decoration label by default (e.g. 'SP' for signal peptide)
3) Start. Start coordinate at the protein level (1-based coordinate)
4) End. End coordinate at the protein level
5) Score. Optional. Score associated with a decoration (e.g. Pfam E-value). This score can be used 
   to dynamically filter or color decorations via callbacks (see glyph options).
6) Description. Optional. User-defined description of decoration. The glyph ignores this description, 
   but it will be made available to callback functions for inspection. Special characters 
   like ':' or ',' that might interfere with the GFF tag parser should be avoided. 

If callback functions are used as glyph parameters (see below), the callback is called for each
decoration separately. That is, the callback can be called multiple times for the same CDS feature,
but each time with a different decoration. The currently drawn (active) decoration is made available 
to the callback via the glyph method 'active_decoration'. The active decoration is returned as
a Bio::Graphics::Feature object, with decoration data fields mapped to the corresponding feature
attributes as follows:

  type --> \$glyph->active_decoration->type
  name --> \$glyph->active_decoration->name
  nucleotide start coordinate --> \$glyph->active_decoration->start
  nucleotide end coordinate --> \$glyph->active_decoration->end
  protein start coordinate --> \$glyph->active_decoration->get_tag_values('p_start')
  protein end coordinate --> \$glyph->active_decoration->get_tag_values('p_end')
  score --> \$glyph->active_decoration->score
  description --> \$glyph->active_decoration->description

In addition, the glyph passed to the callback allows access to the parent glyph and
parent feature if required (use \$glyph->parent or \$glyph->parent->feature). 

END
}

sub my_options {
    return {
	decoration_visible => [
	    'boolean',
	    'false',
	    'Specifies whether decorations should be visible or not. For selective display of individual', 
        'decorations, specify a callback function and return 1 or 0 after inspecting the active',
        'decoration of the glyph. '],
	decoration_color => [
	    'color',
	    undef,
	    'Decoration background color. If no color is specified, colors are assigned automatically',
	    'by decoration type and name, so that decorations of identical type and name are assigned',
	    'the same color. The special color \'transparent\' can be used here in combination with',
	    'the option \'decoration_border\' to draw decorations as outlines.'],
	decoration_border => [
	    ['none', 'solid', 'dashed'],
	    'none',
	    'Decoration border style. By default, decorations are drawn without border (\'none\' or',
	    '0). Other valid options here include \'solid\' or \'dashed\'.'],
	decoration_border_color => [
	    'color',
	    'black',
	    'Color of decoration border.'],
	decoration_label => [
	    'string',
	    undef,
	    'Decoration label. If not specified, the second data field of the decoration is used',
	    'as label. Set this option to 0 to get unlabeled decorations. If the label text',
	    'extends beyond the size of the decorated segment, the label will be clipped. Clipping',
	    'does not occur for SVG output.'],
	decoration_label_position => [
	    ['inside', 'above', 'below'],
	    undef,

lib/Bio/Graphics/Glyph/decorated_transcript.pm  view on Meta::CPAN

		my @sorted = reverse sort { $a->length <=> $b->length } (@{$self->mapped_decorations});
		$sorted_decorations = \@sorted;

		if (DEBUG)
		{
			print STDERR "sorted decorations: ";
			foreach my $sd (@$sorted_decorations) { print STDERR $sd->name."(".$sd->length.") "; }
			print STDERR "\n";
		}
	}

	$self->{'sorted_decorations'} = $sorted_decorations;
	
	return $sorted_decorations;
}

sub _map_decorations {
	my $self    = shift;
	my $feature = $self->feature;
	
	$self->_map_coordinates();

	my @mapped_decorations;
	foreach my $h ( @{$self->all_decorations} ) {
		my ( $type, $name, $p_start, $p_end, $score, $desc ) = split( ":", $h );

		if (!defined $p_end)
		{
			warn "_map_decorations(): WARNING: invalid decoration data for feature $feature: '$h'\n";
			next;
		}

		my $nt_start = $self->_map_codon_start($p_start);
		if (!$nt_start)
		{
			warn "DECORATION=$h\n";
			warn "_map_decorations(): WARNING: could not map decoration start coordinate on feature $feature(".$feature->primary_tag.")\n";
			next;
		}
		my $nt_end = $self->_map_codon_end($p_end);
		if (!$nt_end)
		{
			warn "DECORATION=$h\n";
			warn "_map_decorations(): WARNING: could not map decoration end coordinate on feature $feature(".$feature->primary_tag.")\n";
			next;
		}

		( $nt_start, $nt_end ) = ( $nt_end, $nt_start )
		  if ( $nt_start > $nt_end );

		my $f = Bio::Graphics::Feature->new
		(
			-type => $type,
			-name => $name,
			-start => $nt_start,
			-end => $nt_end,
			-score => $score,
			-desc => $desc,
			-seq_id => $feature->seq_id,
			-strand => $feature->strand,
			-attributes => {   # remember protein coordinates for callbacks  
				'p_start' => $p_start, 
				'p_end' => $p_end 
			}
		);

#		my $mapped_decoration = "$h:$nt_start:$nt_end";
		push( @mapped_decorations, $f );
		
		# init stack offset for stacked decorations
		if ($self->decoration_position($f) eq 'stacked_bottom')
		{			
			if (!defined $self->{'stack_offset_bottom'}{$f})
			{				
				$self->{'cur_stack_offset_bottom'} = 2 
					if (!defined $self->{'cur_stack_offset_bottom'});
					
				$self->{'stack_offset_bottom'}{$f} = $self->{'cur_stack_offset_bottom'};
				$self->{'cur_stack_offset_bottom'} += $self->decoration_height($f);

				warn "$self: stack offset ".$f->name."($f): ".$self->{'stack_offset_bottom'}{$f}."\n"
					if (DEBUG);
			}
		}
		
		warn "DECORATION=$h --> $nt_start:$nt_end\n" if (DEBUG);
	}

	$self->{'mapped_decorations'} = \@mapped_decorations;
}

sub _map_codon_start {
	my $self               = shift;
	my $protein_coordinate = shift;
	
	$self->throw('protein coordinate not specified')
		if (!$protein_coordinate and DEBUG);

	return $self->{'p2n'}->{$protein_coordinate}->{'codon_start'};
}

sub _map_codon_end {
	my $self               = shift;
	my $protein_coordinate = shift;
	
	$self->throw('protein coordinate not specified')
		if (!$protein_coordinate and DEBUG);

	return $self->{'p2n'}->{$protein_coordinate}->{'codon_end'};
}

# map protein to nucleotide coordinate
sub _map_coordinates {
	my $self = shift;

	# sort all CDS features by coordinates
	# NOTE: filtering for CDS features by passing the feature type to get_SeqFeatures()
	# does not work for some reason, probably when no feature store is attached
	my @cds =
	  grep { $_->primary_tag eq 'CDS' } $self->feature->get_SeqFeatures();
	if ( $self->feature->strand > 0 ) {

lib/Bio/Graphics/Glyph/decorated_transcript.pm  view on Meta::CPAN

	$gd->string( $font, $h_left + 2,
		$label_top, $label, $self->factory->translate_color($label_color) );
	$gd->clip( 0, 0, $gd->width, $gd->height )
	  if ( !$gd->isa("GD::SVG::Image") );
}

1;

__END__

=head1 NAME

Bio::Graphics::Glyph::decorated_transcript - draws processed transcript with protein decorations

=head1 SYNOPSIS

  See L<Bio::Graphics::Panel> and L<Bio::Graphics::Glyph>.

=head1 DESCRIPTION

This glyph extends the functionality of the L<Bio::Graphics::Glyph::processed_transcript> glyph 
and draws protein decorations (e.g., signal peptides, transmembrane domains, protein domains)
on top of gene models. Currently, the glyph can draw decorations as colored or outlined boxes 
inside or around CDS segments. Protein decorations are specified at the 'mRNA' transcript level 
in protein coordinates, which the glyph automatically maps to nucleotide coordinates. 
Decorations may span exon-exon junctions, in which case they are split between exons. 
By default, the glyph assigns different colors to different types of protein decorations, while 
decorations of the same type always receive the same color. 
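
The idea behind the protein-to-nucleotide mapping can be pictured with the 
following minimal sketch. It is not the glyph's own code: the helper 
C<map_residue> and the CDS coordinates are hypothetical, and the sketch assumes 
that the first CDS base is the first base of codon 1 (phase 0). It enumerates 
every CDS base for readability, which is fine for illustration but wasteful for 
real genes.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the glyph: map a 1-based protein position
# to the genomic (start, end) of its codon. Assumes phase 0 for the first CDS.
# A codon that straddles an exon-exon junction yields a span that includes
# the intron.
sub map_residue {
    my ( $residue, $strand, @cds ) = @_;    # @cds: [start, end] pairs

    # Order the CDS segments 5' to 3' along the transcript.
    my @ordered = sort { $a->[0] <=> $b->[0] } @cds;
    @ordered = reverse @ordered if $strand < 0;

    # List genomic positions in transcript order.
    my @bases;
    for my $seg (@ordered) {
        my @pos = ( $seg->[0] .. $seg->[1] );
        @pos = reverse @pos if $strand < 0;
        push @bases, @pos;
    }

    my $first = 3 * ( $residue - 1 );       # 0-based offset of codon base 1
    return if $residue < 1 || $first + 2 > $#bases;

    # Report the codon with start <= end, whatever the strand.
    my @codon = sort { $a <=> $b } @bases[ $first, $first + 2 ];
    return @codon;
}

my @cds = ( [ 101, 109 ], [ 201, 206 ] );
my ( $s, $e ) = map_residue( 4, +1, @cds );
print "residue 4 on + strand: $s-$e\n";
```

With the two example segments, residue 4 is the first codon of the second 
segment, so the script prints C<residue 4 on + strand: 201-203>.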

Protein decorations are provided either with mRNA features inside GFF files (see example below) or 
dynamically via a callback function using the B<additional_decorations> option (see glyph options).
The following line is an example of an mRNA feature in a GFF file that contains two protein decorations, 
one signal peptide predicted by SignalP and one transmembrane domain predicted by TMHMM:

C<chr1   my_source   mRNA  74796  75599   .  +  .  ID=rna_gene-1;protein_decorations=SignalP40:SP:1:23:0:my_comment,TMHMM:TM:187:209:0>

Each protein decoration consists of six fields separated by a colon:


=over

=item 1. type

Decoration type. Typically identifies the source of the decoration (e.g. 'SignalP40')

=item 2. name

Decoration name. Used as decoration label by default (e.g. 'SP' for signal peptide)

=item 3. start

Start coordinate at the protein level (1-based coordinate)

=item 4. end

End coordinate at the protein level

=item 5. score

Optional. Score associated with a decoration (e.g. Pfam E-value). This score can be used 
to dynamically filter or color decorations via callbacks (see glyph options).

=item 6. description

Optional. User-defined description of decoration. The glyph ignores this description, 
but it will be made available to callback functions for inspection. Special characters 
like ':' or ',' that might interfere with the GFF tag parser should be avoided. 

=back 
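
As an illustration of the field layout, the sketch below splits a 
C<protein_decorations> value into records. The helper C<parse_decorations> and 
its hash keys are names chosen for this example only; the glyph itself receives 
the decorations from the feature. Entries lacking an end coordinate are skipped, 
mirroring the validity check the glyph applies before mapping.

```perl
use strict;
use warnings;

# Illustrative only: split a protein_decorations value into records.
# The hash keys are names chosen for this sketch.
sub parse_decorations {
    my ($tag_value) = @_;
    my @records;
    for my $entry ( split /,/, $tag_value ) {
        my ( $type, $name, $p_start, $p_end, $score, $desc ) =
          split /:/, $entry;
        next unless defined $p_end;    # the first four fields are required
        push @records,
          {
            type        => $type,
            name        => $name,
            p_start     => $p_start,
            p_end       => $p_end,
            score       => $score,
            description => $desc,
          };
    }
    return @records;
}

my @decorations =
  parse_decorations('SignalP40:SP:1:23:0:my_comment,TMHMM:TM:187:209:0');
printf "%s %s %d-%d\n", @{$_}{qw(type name p_start p_end)} for @decorations;
```

For the example line above this prints C<SignalP40 SP 1-23> and 
C<TMHMM TM 187-209>; the second record has no description because its sixth 
field is absent.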

If callback functions are used as glyph parameters (see below), the callback is called for each
decoration separately. That is, the callback can be called multiple times for a given CDS feature,
but each time with a different decoration that overlaps with this CDS. The currently drawn (active) 
decoration is made available to the callback via the glyph method 'active_decoration'. The active 
decoration is returned as a Bio::Graphics::Feature object, with decoration data fields 
mapped to the corresponding feature attributes as follows:

=over

=item * type --> $glyph->active_decoration->type

=item * name --> $glyph->active_decoration->name

=item * nucleotide start coordinate --> $glyph->active_decoration->start

=item * nucleotide end coordinate --> $glyph->active_decoration->end

=item * protein start coordinate --> $glyph->active_decoration->get_tag_values('p_start')

=item * protein end coordinate --> $glyph->active_decoration->get_tag_values('p_end')

=item * score --> $glyph->active_decoration->score

=item * description --> $glyph->active_decoration->description

=back 

In addition, the glyph passed to the callback allows access to the parent glyph and
parent feature if required (use $glyph->parent or $glyph->parent->feature). 
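
A callback for selective display might look like the sketch below. It relies on 
the usual Bio::Graphics convention that option callbacks receive the feature, 
the option name, the part number, the total number of parts and the glyph, in 
that order. C<MockDecoration> and C<MockGlyph> are hypothetical stand-ins that 
only mimic the two methods the callback touches, so that the example runs 
without a panel.

```perl
use strict;
use warnings;

# Callback sketch for -decoration_visible: show TMHMM decorations only.
my $show_tm_only = sub {
    my ( $feature, $option, $part_no, $total_parts, $glyph ) = @_;
    my $decoration = $glyph->active_decoration;
    return 0 unless $decoration;
    return $decoration->type eq 'TMHMM' ? 1 : 0;
};

# Hypothetical stand-ins, used only to exercise the callback here.
{
    package MockDecoration;
    sub new  { my ( $class, $type ) = @_; return bless { type => $type }, $class }
    sub type { return $_[0]{type} }
}
{
    package MockGlyph;
    sub new { my ( $class, $d ) = @_; return bless { decoration => $d }, $class }
    sub active_decoration { return $_[0]{decoration} }
}

for my $type (qw(TMHMM SignalP40)) {
    my $glyph   = MockGlyph->new( MockDecoration->new($type) );
    my $visible = $show_tm_only->( undef, 'decoration_visible', 0, 0, $glyph );
    print "$type visible: $visible\n";
}
```

In a real track the code reference would be supplied as the value of 
C<-decoration_visible> when the track is added with the C<decorated_transcript> 
glyph, and the glyph would provide the active decoration itself.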

=head2 OPTIONS

This glyph inherits all options from the L<Bio::Graphics::Glyph::processed_transcript> glyph. 
In addition, it recognizes the following glyph-specific options:

  Option          Description                                              Default
  ------          -----------                                              -------

  -decoration_visible       
  
                  Specifies whether decorations should be visible          false
                  or not. For selective display of individual 
                  decorations, specify a callback function and 
                  return 1 or 0 after inspecting the active decoration
                  of the glyph. 

  -decoration_color
  
                  Decoration background color. If no color is              <auto>
                  specified, colors are assigned automatically by
                  decoration type and name, whereas decorations of 


