InSilicoSpectro

lib/InSilicoSpectro/InSilico/MassCalculator.pm

  return $massType;

} # getMassType


=head2 setModif($mr)

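Registers an InSilico::ModRes object: its monoisotopic and average mass deltas
are stored under the key "mod_" followed by the modification name.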


=cut

sub setModif
{
  my $mr=shift;
  my $name=$mr->get('name');

  $elMass{"mod_$name"} = [$mr->{delta_monoisotopic}, $mr->{delta_average}];

} # setModif




# -----------------------------------------------------------
# Peptide functions
# -----------------------------------------------------------

=head1 Peptide-related functions

This section groups all the functions that are directly related to peptides, i.e.
peptide mass computation and protein digestion. 

=head2 Modifications

To deal properly with modified proteins (and hence peptides) and to compute
their masses and, for peptides, their MS/MS spectra, we introduce a convention
for localizing modifications in protein/peptide sequences.

A protein/peptide sequence is a sequence of amino acids 

  a_1 a_2 a_3 a_4 ... a_n.

The corresponding modifications are represented either as a string or as a
vector.

=head2 String representation

The string takes the form

  m_0:m_1:m_2:m_3:m_4: ... :m_n:m_(n+1),

where m_0 is the N-terminal site modification, m_i is the modification of a_i,
and m_(n+1) is the C-terminal site modification. For instance, a peptide

  DEMSCGHTK

might be modified according to

  ACET_nterm:::Oxidation::Cys_CAM::Oxidation:::

which means that there is an N-terminal acetylation, the methionine and the
histidine are oxidized, the cysteine is carboxyamidomethylated, and there is no
C-terminal modification. In this notation, an empty field between two colons
denotes the absence of a modification at that position. The modification
identifiers come from the configuration file.
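
As an illustration of the convention, the following standalone sketch (not part
of this module) splits the example string into its fields and pairs each
non-empty field with its site. The variable names are assumptions made for the
example only.

```perl
use strict;
use warnings;

# Illustrative only: pair each field of a modification string with its site.
my $peptide = 'DEMSCGHTK';
my $modif   = 'ACET_nterm:::Oxidation::Cys_CAM::Oxidation:::';

# The -1 limit keeps trailing empty fields, which split drops by default.
my @fields = split(/:/, $modif, -1);
die "expected length(peptide)+2 fields\n"
  unless @fields == length($peptide) + 2;

# Index 0 is the N-terminus, 1..n the residues, n+1 the C-terminus.
my @sites = ('N-term', split(//, $peptide), 'C-term');

my @modified;
for my $i (0 .. $#fields) {
  next unless length $fields[$i];          # empty field: no modification
  push @modified, "$i:$sites[$i]:$fields[$i]";
}
print join("\n", @modified), "\n";
```

The four lines printed are the index, the site and the modification name of
each modified position: the N-terminus, M at index 3, C at index 5 and H at
index 7.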

In the above string notation it is possible to define variable modifications,
see function variablePeptide.

=head2 Vector representation

Alternatively, modifications can be localized by using a vector of strings.
The length of the vector is len(peptide)+2: the element at index 0 corresponds
to the N-terminus, the element at index len(peptide)+1 to the C-terminus, and
the indices from 1 to len(peptide) to the amino acids. The strings of this
vector follow the same rules as the fields between ':' in the modification
string, i.e. they contain the name of a modification or nothing, or they
define variable modifications, see function variablePeptide.
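
Since both representations carry the same fields, converting between them is
mechanical. The sketch below assumes two hypothetical helpers,
modif_string_to_vector and modif_vector_to_string, which do not exist in this
module and are shown only to make the correspondence concrete.

```perl
use strict;
use warnings;

# Hypothetical helper (not in MassCalculator.pm): string -> vector.
sub modif_string_to_vector {
  my ($string, $len) = @_;
  # Negative limit so that trailing empty fields are preserved.
  my @vector = split(/:/, $string, -1);
  die "expected " . ($len + 2) . " fields, got " . scalar(@vector) . "\n"
    unless @vector == $len + 2;
  return \@vector;
}

# Hypothetical helper (not in MassCalculator.pm): vector -> string.
sub modif_vector_to_string {
  my ($vector) = @_;
  return join(':', map { defined $_ ? $_ : '' } @$vector);
}

my $peptide    = 'DEMSCGHTK';
my $string     = 'ACET_nterm:::Oxidation::Cys_CAM::Oxidation:::';
my $vector     = modif_string_to_vector($string, length $peptide);
my $round_trip = modif_vector_to_string($vector);

print scalar(@$vector), " elements\n";
print "round trip ", ($round_trip eq $string ? "ok" : "differs"), "\n";
```

The length check is a design choice of the sketch: it rejects strings whose
number of fields does not match the peptide instead of padding them silently.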

=head2 The PMF case

In PMF only the peptide masses matter and it is not necessary to know the
location of the modifications; we only need to know how many times each one
occurs. Hence, when dealing with PMF computations we introduce a third
convention for modification description. We use a vector that alternates
the number of occurrences and the modification name:

  num1, modif1, num2, modif2, ...
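
To show how this format relates to the localized one, here is a minimal sketch
that counts the fixed modifications of a localized vector. The helper name
localized_to_pmf is hypothetical and not provided by this module, and the
sketch assumes that every non-empty element is a plain modification name (it
does not interpret variable modifications).

```perl
use strict;
use warnings;

# Hypothetical helper (not in MassCalculator.pm): collapse a localized
# vector into the alternating (count, name) list used for PMF.
sub localized_to_pmf {
  my ($vector) = @_;
  my (%count, @order);
  for my $m (@$vector) {
    next unless defined $m && length $m;   # skip unmodified positions
    push @order, $m unless $count{$m}++;   # remember first-seen order
  }
  return [ map { ($count{$_}, $_) } @order ];
}

my @localized = ('ACET_nterm', '', '', 'Oxidation', '', 'Cys_CAM',
                 '', 'Oxidation', '', '', '');
my $pmf = localized_to_pmf(\@localized);
print join(', ', @$pmf), "\n";
```

For the example peptide this prints the list 1, ACET_nterm, 2, Oxidation,
1, Cys_CAM.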

=head2 modifToString($modif, [$len])

In order to display modification strings/vectors conveniently, we provide
a single function modifToString that accepts all three formats and displays the
modifications in $modif as a string using an appropriate style. If the parameter
$len (peptide length) is also given, then modifToString pads the returned
string to the full length if necessary (MS/MS only).
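
The two behaviours described here can be sketched in isolation as follows.
This is an illustrative re-implementation under stated assumptions, not the
module's own code: pmf_to_display and pad_localized are hypothetical names,
and the padding sketch only completes a localized vector to $len+2 elements
without producing the final display string.

```perl
use strict;
use warnings;

# Hypothetical helper: render a PMF vector as "NUMx(MODIF), NUMx(MODIF), ...".
sub pmf_to_display {
  my ($modif) = @_;
  my @parts;
  for (my $i = 0; $i < @$modif; $i += 2) {
    push @parts, "$modif->[$i]x($modif->[$i+1])";
  }
  return join(', ', @parts);
}

# Hypothetical helper: pad a localized vector with empty strings so that it
# has one element per residue plus one for each terminus.
sub pad_localized {
  my ($modif, $len) = @_;
  die "Modification vector too long\n" if @$modif > $len + 2;
  return [ @$modif, ('') x ($len + 2 - @$modif) ];
}

my $display = pmf_to_display([2, 'Oxidation', 1, 'Cys_CAM']);
my $padded  = pad_localized(['ACET_nterm', '', '', 'Oxidation'], 9);

print "$display\n";
print scalar(@$padded), " elements\n";
```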

=cut
sub modifToString
{
  my ($modif, $len) = @_;

  if (ref($modif) eq 'ARRAY'){
    if ((length($modif->[0]) > 0) && ($modif->[0] eq int($modif->[0]))){
      # First element is an integer => list of modifs for PMF
      # Start from an empty string so that length() never sees undef
      my $string = '';
      for (my $i = 0; $i < @$modif; $i+=2){
        $string .= ', ' if (length($string) > 0);
        $string .= "$modif->[$i]x($modif->[$i+1])";
      }
      return $string;
    }
    else{
      # Localized modifs for MS/MS
      my @extra;
      if (defined($len)){
	croak("Modification vector too long") if (@$modif > $len+2);
	for (my $i = 0; $i < $len+2-@$modif; $i++){
	  push(@extra, '');


