InSilicoSpectro
lib/InSilicoSpectro/InSilico/MassCalculator.pm
return $massType;
} # getMassType
=head2 setModif($mr)
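Registers an InSilico::ModRes object $mr: its monoisotopic and average mass
deltas are stored under the key "mod_$name", where $name is the name of the
modification.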
=cut
sub setModif
{
  my $mr=shift;
  my $name=$mr->get('name');
  $elMass{"mod_$name"} = [$mr->{delta_monoisotopic}, $mr->{delta_average}];
} # setModif
# -----------------------------------------------------------
# Peptide functions
# -----------------------------------------------------------
=head1 Peptide-related functions
This section groups all the functions that are directly related to peptides, i.e.
peptide mass computation and protein digestion.
=head2 Modifications
To handle modified proteins (and hence peptides) properly and to compute their
masses and, for peptides, their MS/MS spectra, we introduce a convention for
locating modifications in protein/peptide sequences.
A protein/peptide sequence is a sequence of amino acids
a_1 a_2 a_3 a_4 ... a_n.
The corresponding modifications are represented either as a string or as a
vector.
=head2 String representation
The string takes the form
m_0:m_1:m_2:m_3:m_4: ... :m_n:m_(n+1),
where m_0 is the N-terminal site modification, m_i is the modification of a_i, and
m_(n+1) is the C-terminal site modification. For instance, a peptide
DEMSCGHTK
might be modified according to
ACET_nterm:::Oxidation::Cys_CAM::Oxidation:::
which means that there is an N-terminal acetylation, the methionine and the
histidine are oxidized, the cysteine is carboxyamidomethylated, and there is no
C-terminal modification. Empty positions between colons denote the absence of
a modification. The modification identifiers come from the configuration file.
The string notation can also define variable modifications; see the function
variablePeptide.
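As an illustration, the following is a minimal sketch in plain Perl of how the
fields of such a string line up with the peptide. It is not taken from this
module, and the variable names are hypothetical. It relies on split with a
limit of -1, which keeps trailing empty fields, so that a peptide of length n
yields n+2 fields.

    use strict;
    use warnings;

    my $peptide = 'DEMSCGHTK';
    my $modif   = 'ACET_nterm:::Oxidation::Cys_CAM::Oxidation:::';

    # A limit of -1 keeps the trailing empty fields that split drops by default.
    my @fields = split(/:/, $modif, -1);
    die "unexpected number of fields\n" unless @fields == length($peptide) + 2;

    # Index 0 is the N-terminus, 1..n the residues, n+1 the C-terminus.
    my @sites = ('N-term', split(//, $peptide), 'C-term');

    my @report;
    for my $i (0 .. $#fields) {
        next unless length $fields[$i];   # empty field: no modification
        push @report, "$i $sites[$i]: $fields[$i]";
    }
    print "$_\n" for @report;

For the example above this lists positions 0 (N-term), 3 (M), 5 (C) and 7 (H).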
=head2 Vector representation
Alternatively, modifications can be located by using a vector of strings.
The length of the vector is len(peptide)+2: the element at index 0 corresponds
to the N-terminus, the element at index len(peptide)+1 to the C-terminus, and
the indices from 1 to len(peptide) to the amino acids. The strings of this
vector follow the same rule as the strings between ':' in the modification
string, i.e. they contain the name of the modification or nothing, or they
define variable modifications; see the function variablePeptide.
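A minimal sketch of the vector form for the peptide DEMSCGHTK used earlier,
again in plain Perl with hypothetical variable names. It assumes, based on the
rule just stated, that joining the vector elements with ':' gives the
equivalent string representation.

    use strict;
    use warnings;

    my $peptide = 'DEMSCGHTK';

    # One slot per residue plus one for each terminus, all initially unmodified.
    my @modif = ('') x (length($peptide) + 2);

    $modif[0] = 'ACET_nterm';   # N-terminus
    $modif[3] = 'Oxidation';    # M, third residue
    $modif[5] = 'Cys_CAM';      # C, fifth residue
    $modif[7] = 'Oxidation';    # H, seventh residue

    # Joining the elements with ':' gives the string form of the same vector.
    my $string = join(':', @modif);
    print "$string\n";

Because the residue at position i of the peptide sits at index i of the vector,
no offset arithmetic is needed when addressing a site.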
=head2 The PMF case
In peptide mass fingerprinting (PMF) only the peptide masses matter, so it is
not necessary to know the location of the modifications, only how many times
each one occurs. Hence, when dealing with PMF computations we introduce a third
convention for describing modifications: a vector that alternates the number of
occurrences and the modification name:
num1, modif1, num2, modif2, ...
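As a sketch of how the two conventions relate, a located vector can be reduced
to this count form by tallying the non-empty entries. The code below is plain
Perl written for illustration, not this module's implementation; the variable
names are hypothetical, and sorting by name is only an assumption made to get a
reproducible order.

    use strict;
    use warnings;

    # Located (MS/MS style) vector for DEMSCGHTK, termini included.
    my @located = ('ACET_nterm', '', '', 'Oxidation', '', 'Cys_CAM',
                   '', 'Oxidation', '', '', '');

    # Tally each modification name, skipping unmodified positions.
    my %count;
    $count{$_}++ for grep { length } @located;

    # Alternate counts and names: num1, modif1, num2, modif2, ...
    my @pmf = map { ($count{$_}, $_) } sort keys %count;
    print join(', ', @pmf), "\n";

The positional information is discarded in this step, so the conversion cannot
be reversed.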
=head2 modifToString($modif, [$len])
To display modification strings/vectors conveniently, we provide a single
function modifToString that accepts all three formats and returns the
modifications in $modif as a string in an appropriate style. If the parameter
$len (peptide length) is also given, modifToString pads the returned string to
the full length if necessary (MS/MS only).
=cut
sub modifToString
{
  my ($modif, $len) = @_;
  if (ref($modif) eq 'ARRAY'){
    if ((length($modif->[0]) > 0) && ($modif->[0] eq int($modif->[0]))){
      # First element is an integer => list of modifs for PMF
      my $string = ''; # start empty so that length($string) is always defined
      for (my $i = 0; $i < @$modif; $i+=2){
        $string .= ', ' if (length($string) > 0);
        $string .= "$modif->[$i]x($modif->[$i+1])";
      }
      return $string;
    }
    else{
      # Localized modifs for MS/MS
      my @extra;
      if (defined($len)){
        croak("Modification vector too long") if (@$modif > $len+2);
        # Pad with empty entries up to the $len+2 expected positions
        for (my $i = 0; $i < $len+2-@$modif; $i++){
          push(@extra, '');