Bio-Homology-InterologWalk
lib/Bio/Homology/InterologWalk.pm view on Meta::CPAN
$all_seen{$idIN} = 1; $all_seen{$idOUT} = 1;
#then I record which of the original ids actually feature in at least one putative interaction
$old_id_present{$idIN} = 1 if(exists($start_data_set{$idIN}));
$old_id_present{$idOUT} = 1 if(exists($start_data_set{$idOUT}));
#lastly I store those new ids never seen in the starting dataset
$new_id_set{$idOUT} = 1 unless(exists($start_data_set{$idOUT}));
}
my $number_of_old_IDs = keys %old_id_present;
my $percentage = $number_of_elements_start_ds ? ($number_of_old_IDs / $number_of_elements_start_ds) * 100 : 0; #avoid dividing by zero on an empty starting dataset
print("Number of IDs from the original dataset that appear in the network: $number_of_old_IDs\n");
print("Percentage of IDs from the original dataset that appear in the final dataset: $percentage\n");
my $number_of_new_IDs = keys %new_id_set;
my $number_of_network_nodes = keys %all_seen;
$percentage = $number_of_network_nodes ? ($number_of_new_IDs / $number_of_network_nodes) * 100 : 0; #avoid dividing by zero on an empty network
if($onetoone_only){
print("Number of total UNIQUE IDs in interaction dataset (considering ONE-TO-ONE orthologies only): $number_of_network_nodes\n");
}else{
print("Number of total UNIQUE IDs in interaction dataset: $number_of_network_nodes\n");
}
print("Number of NEW ids (i.e. not seen in starting data set): $number_of_new_IDs\n");
print("Percentage of new ids over the total: $percentage\n");
#I save all the new ids in a flat file. This might be useful to do some analysis of their functional annotation
foreach my $id (sort keys %new_id_set){
#print ("$id\t$new_id_set{$id}\n");
print $out_data $id, "\n";
}
$sth->finish();
$dbh->disconnect();
close($start_data);
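
The bookkeeping in the snippet above can be sketched in isolation as follows. This is a minimal sketch, not the module's own code: `summarise_ids` and its return keys are hypothetical names, and the input is assumed to be a plain list of `[id_in, id_out]` pairs rather than rows fetched from a database handle. It mirrors the snippet in checking only the second id of each pair for novelty.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::Homology::InterologWalk):
# given the starting ids and a list of [id_in, id_out] pairs, count how many
# starting ids reappear in the network and how many ids are new.
sub summarise_ids {
    my ($start_ids, $pairs) = @_;
    my %start = map { $_ => 1 } @$start_ids;
    my (%all_seen, %old_present, %new_ids);

    for my $pair (@$pairs) {
        my ($id_in, $id_out) = @$pair;
        for my $id ($id_in, $id_out) {
            $all_seen{$id}    = 1;                       # every node in the network
            $old_present{$id} = 1 if exists $start{$id}; # starting ids that made it through
        }
        # as in the snippet above, only the second id of a pair is checked for novelty
        $new_ids{$id_out} = 1 unless exists $start{$id_out};
    }

    my $n_start = scalar(keys %start);
    my $n_nodes = scalar(keys %all_seen);
    my $n_old   = scalar(keys %old_present);
    my $n_new   = scalar(keys %new_ids);

    return {
        nodes     => $n_nodes,
        old_count => $n_old,
        new_count => $n_new,
        # both percentages fall back to 0 when the denominator is empty
        old_pct   => $n_start ? 100 * $n_old / $n_start : 0,
        new_pct   => $n_nodes ? 100 * $n_new / $n_nodes : 0,
        new_ids   => [ sort keys %new_ids ],
    };
}

# Usage with made-up ids
my $summary = summarise_ids(
    [qw(A B C D)],
    [ [qw(A B)], [qw(A X)], [qw(Y C)] ],
);
printf "%d nodes, %d old ids (%.1f%%), %d new ids (%.1f%%)\n",
    @{$summary}{qw(nodes old_count old_pct new_count new_pct)};
```

Keeping the three hashes separate, as the original does, means each count is simply the number of keys, and the sorted key list of the "new" hash is what would be written to the flat file.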
lib/Bio/Homology/InterologWalk.pm view on Meta::CPAN
}
foreach my $homology_member (@{$genelist}){
$DF_orthologue_id = $homology_member->stable_id;
next if ($DF_orthologue_id eq $init_id);#I don't want to print the query gene itself again
$DF_oname = $homology_member->display_label;
#OPI
my $pairwise_alignment_from_multiple = $homology->get_SimpleAlign;
$DF_opi = $pairwise_alignment_from_multiple->overall_percentage_identity;
#$opi = sprintf("%.3f", $overall_pid); #rounded
#replace missing values with a '-' placeholder; a legitimate 0 (e.g. a zero distance) is kept
foreach my $field ($DF_orthologue_id, $DF_oname, $DF_odesc, $DF_dnds, $DF_opi, $DF_fsa_x, $DF_fsa_y, $DF_nndist){
$field = '-' unless(defined($field) && length($field));
}
lib/Bio/Homology/InterologWalk.pm view on Meta::CPAN
);
Purpose : Analyses several ancillary data fields obtained alongside the actual
putative PPI IDs and collates them into an Interolog Prioritisation Index (IPX), which associates a
numerical index with each putative PPI based on biological metadata. The index takes into account
a number of features related to each of the steps involved in the orthology walk.
The metadata features fall into two broad classes:
- features related to the interaction. These include: Interaction Type, Interaction
Detection Method, interaction coming from a SPOKE-expanded complex, interaction
reconfirmed through multiple taxa, interaction reconfirmed through multiple detection methods
- features related to the two orthology mappings. These include: orthology type
(one-to-one, one-to-many, many-to-one, many-to-many), OPI (percentage identity of the
conserved columns - see Bio::SimpleAlign), node-to-node distance, distance from the
first shared ancestor, (under development) dN/dS ratio
The IPX computation also involves a normalisation stage. The subroutine requires
five arguments (meanscore_x) representing mean values to be used for normalisation.
The actual means are computed in get_mean_scores(), which must therefore be run before
compute_prioritisation_index().
Returns : success/failure
Argument : -input_path : path to the input tsv file. A suitable input for this subroutine is the
final output of the orthology walk pipeline (see doInterologWalk.pl for usage guidelines).
The input file should have a .06out extension
lib/Bio/Homology/InterologWalk.pm view on Meta::CPAN
-term_graph : a GO::Parser graph object obtained from parse_ontology() containing a
network representation of the PSI-MI controlled vocabulary of terms.
-meanscore_em : mean experimental method score for normalisation
-meanscore_it : mean interaction type score for normalisation
-meanscore_dm : mean detection method score for normalisation
-meanscore_me_dm : mean 'multiple detection methods' score for normalisation
-meanscore_me_taxa : mean 'multiple taxa' score for normalisation
Throws : -
Comment : -
See Also : L<http://search.cpan.org/~cjfields/BioPerl-1.6.1/Bio/SimpleAlign.pm#overall_percentage_identity>, L</get_mean_scores>, C<doScores.pl> for sample usage
=cut
sub compute_prioritisation_index{
my %args = @_;
my $in_path = $args{input_path};
my $out_path = $args{output_path};
my $score_path = $args{score_path};
my $graph = $args{term_graph};
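
The subroutine above receives its parameters as a flat list of key/value pairs that is copied into `%args`. As an illustration of that calling convention only, the sketch below shows one way such named arguments could be checked before use. `check_named_args` is a hypothetical helper, not something the module provides, and the file names in the usage lines are made up.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical helper (not part of Bio::Homology::InterologWalk):
# dies with a list of the required named arguments that are missing or undefined,
# otherwise returns a reference to the argument hash.
sub check_named_args {
    my ($required, %args) = @_;
    my @missing = grep { !defined $args{$_} } @$required;
    croak "Missing required argument(s): @missing" if @missing;
    return \%args;
}

# Usage: the key names follow the documentation above, the values are invented
my $args = check_named_args(
    [qw(input_path output_path score_path)],
    input_path  => 'example.06out',
    output_path => 'example.07out',
    score_path  => 'example.scores',
);
print "Reading from $args->{input_path}\n";
```

Failing early on a missing key avoids a later, harder to trace error when an undefined path is passed to `open`.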
scripts/Data/psi-mi.obo view on Meta::CPAN
id: MI:2046
name: lethal dose 50
def: "The LD50 is the dose that kills half (50%) of the animals tested" [PMID:14755292]
subset: Drugable
synonym: "ld50" EXACT PSI-MI-alternate []
synonym: "lethal dose 50 %" EXACT PSI-MI-alternate []
is_a: MI:0640 ! parameter type
[Term]
id: MI:2047
name: percentage of plasma protein binding
def: "Percentage of the drug that is bound in plasma proteins" [PMID:14755292]
subset: Drugable
synonym: "plasma prot binding" EXACT PSI-MI-short []
synonym: "protein binding %" EXACT PSI-MI-alternate []
is_a: MI:0640 ! parameter type
[Term]
id: MI:2048
name: drug biotransformation
def: "The chemical conversion of drugs to other compounds in the body, excluding degradation due to any inherent chemical instability of drugs in biological media." [PMID:14755292]
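
The stanzas above follow the OBO flat-file layout: a `[Term]` header followed by `tag: value` lines. The module documentation states that the ontology is loaded through parse_ontology() into a graph object; the sketch below is not that code, only a minimal hand-rolled reader to show how the `id`, `name` and `is_a` tags relate. `parse_obo_terms` is a hypothetical name, and the reader ignores every other tag as well as non-Term stanzas.

```perl
use strict;
use warnings;

# Hypothetical helper: parse [Term] stanzas from OBO text into a hash
# keyed by term id. Only the id, name and is_a tags are retained.
sub parse_obo_terms {
    my ($text) = @_;
    my (%terms, $current);

    for my $line (split /\n/, $text) {
        $line =~ s/\s+$//;    # drop trailing whitespace

        if ($line =~ /^\[(\w+)\]$/) {
            # a new stanza starts; only [Term] stanzas are collected
            $current = $1 eq 'Term' ? {} : undef;
            next;
        }
        next unless $current;    # header lines or a skipped stanza

        if ($line =~ /^id:\s*(\S+)/) {
            $current->{id} = $1;
            $terms{$1} = $current;
        }
        elsif ($line =~ /^name:\s*(.+)$/) {
            $current->{name} = $1;
        }
        elsif ($line =~ /^is_a:\s*(\S+)/) {
            # the parent id precedes the optional "! comment" part
            push @{ $current->{is_a} }, $1;
        }
    }
    return \%terms;
}

# Usage on two of the stanzas shown above
my $terms = parse_obo_terms(<<'OBO');
[Term]
id: MI:2046
name: lethal dose 50
is_a: MI:0640 ! parameter type

[Term]
id: MI:2047
name: percentage of plasma protein binding
is_a: MI:0640 ! parameter type
OBO

for my $id (sort keys %$terms) {
    print "$id\t$terms->{$id}{name}\t@{ $terms->{$id}{is_a} || [] }\n";
}
```

The `is_a` links are what turn the flat list of terms into the network that the scoring code traverses, which is why they are kept as a list per term.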