ALBD
SYNOPSIS
This package consists of Perl modules and supporting Perl
programs that perform Literature Based Discovery (LBD). The core
data from which LBD is performed are co-occurrence matrices
generated from UMLS::Association. ALBD is based on the ABC
co-occurrence model. Many options can be specified, and many
ranking methods are available, including novel ranking methods that
use association measures as well as frequency-based ranking
methods. See samples/lbd for more info. ALBD can perform open and
closed LBD as well as time slicing evaluation.
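The ABC co-occurrence model can be illustrated with a minimal sketch. This is not ALBD's implementation: it assumes the co-occurrence matrix is a hash of hashes keyed by CUI, and the sub name `openDiscovery`, the toy CUIs, and the counts are all hypothetical. The score used here (the number of B terms linking A to each C term) is a simple frequency-based choice made only for illustration.

```perl
use strict;
use warnings;

# Hypothetical toy co-occurrence matrix: $matrix{$cui1}{$cui2} = count.
my %matrix = (
    'C0000001' => { 'C0000002' => 3, 'C0000003' => 1 },
    'C0000002' => { 'C0000004' => 2 },
    'C0000003' => { 'C0000004' => 5, 'C0000001' => 1 },
);

# Open discovery sketch: start at an A term, collect its linking B terms,
# then collect C terms that co-occur with some B term but not with A.
sub openDiscovery {
    my ($matrixRef, $startCui) = @_;
    my %bTerms = %{ $matrixRef->{$startCui} // {} };
    my %cTerms;
    foreach my $linkCui (keys %bTerms) {
        foreach my $targetCui (keys %{ $matrixRef->{$linkCui} // {} }) {
            next if $targetCui eq $startCui;       # skip the start term itself
            next if exists $bTerms{$targetCui};    # skip terms already linked to A
            $cTerms{$targetCui}++;                 # count the linking B terms
        }
    }
    return \%cTerms;
}

my $cTermsRef = openDiscovery(\%matrix, 'C0000001');
foreach my $cui (sort keys %{$cTermsRef}) {
    print "$cui\t$cTermsRef->{$cui}\n";
}
```

Closed discovery would start from both an A term and a C term and look for the B terms they share; the same hash-of-hashes layout would serve for that case.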
ALBD requires UMLS::Association both to compute the co-occurrence
database from which the co-occurrence matrix is derived and to
rank the generated C terms.
UMLS::Association requires the UMLS::Interface module to access
the Unified Medical Language System (UMLS) for semantic type filtering
and to determine if CUIs are valid.
The following sections describe the organization of this software
lib/ALBD.pm view on Meta::CPAN
=head1 ABSTRACT
This package consists of Perl modules and supporting Perl
programs that perform Literature Based Discovery (LBD). The core
data from which LBD is performed are co-occurrence matrices
generated from UMLS::Association. ALBD is based on the ABC
co-occurrence model. Many options can be specified, and many
ranking methods are available, including novel ranking methods that
use association measures as well as frequency-based ranking
methods. See samples/lbd for more info. ALBD can perform open and
closed LBD as well as time slicing evaluation.
=head1 INSTALL
To install the module, run the following magic commands:
perl Makefile.PL
make
make test
make install
lib/ALBD.pm view on Meta::CPAN
perl Makefile.PL PREFIX=/home/sid
It is possible to modify other parameters during installation. The
details can be found in the ExtUtils::MakeMaker documentation.
However, we strongly recommend leaving the other parameters alone
unless you know what you are doing.
=head1 CONFIGURATION FILE
There are many parameters that can be specified, both for open and
closed discovery as well as time slicing evaluation. Please see the
samples folder for info and sample configuration files.
=cut
######################################################################
# Description
######################################################################
#
# This is a description geared more towards understanding or modifying
lib/LiteratureBasedDiscovery/Evaluation.pm view on Meta::CPAN
# ALBD::Evaluation.pm
#
# Provides functionality to evaluate LBD systems
# Key components are:
# Results Matrix <- all new knowledge generated by an LBD system (e.g.
# all proposed discoveries of a system from pre-cutoff
# data).
# Gold Standard Matrix <- the gold standard knowledge matrix (e.g. all
# knowledge present in the post-cutoff dataset
# that is not present in the pre-cutoff dataset).
#
# Copyright (c) 2017
#
lib/LiteratureBasedDiscovery/Evaluation.pm view on Meta::CPAN
# along with this program; if not, write to
#
# The Free Software Foundation, Inc.,
# 59 Temple Place - Suite 330,
# Boston, MA 02111-1307, USA.
package Evaluation;
use strict;
use warnings;
# Timeslicing evaluation that calculates the precision of LBD
# (O(k), where k is the number of keys in results)
# input: $resultsMatrixRef <- ref to a matrix of LBD results
# $goldMatrixRef <- ref to a gold standard matrix
# output: the precision of results
sub calculatePrecision {
my $resultsMatrixRef = shift;
my $goldMatrixRef = shift;
# calculate the precision, which is the percentage of results that
# are in the gold standard
lib/LiteratureBasedDiscovery/Evaluation.pm view on Meta::CPAN
my $count = 0;
foreach my $key(keys %{$resultsMatrixRef}) {
if (exists ${$goldMatrixRef}{$key}) {
$count++;
}
}
return $count/(scalar keys %{$resultsMatrixRef});
}
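A small usage sketch of the precision computation described above. It re-declares the logic in a hypothetical sub `precisionOf` rather than loading the module, treats each matrix as a flat hash whose keys identify one discovery (the 'CUI1,CUI2' key format is an assumption), and adds an empty-results guard that the original code shown here does not have.

```perl
use strict;
use warnings;

# Precision sketch: fraction of result keys that also appear in gold.
sub precisionOf {
    my ($resultsRef, $goldRef) = @_;
    my $total = scalar keys %{$resultsRef};
    return 0 if $total == 0;    # guard (assumption): avoid division by zero
    my $count = grep { exists $goldRef->{$_} } keys %{$resultsRef};
    return $count / $total;
}

# Hypothetical toy data: four proposed discoveries, two of them in gold.
my %results = map { $_ => 1 } ('C1,C2', 'C1,C3', 'C1,C4', 'C1,C5');
my %gold    = map { $_ => 1 } ('C1,C2', 'C1,C5', 'C1,C9');

printf "precision = %.2f\n", precisionOf(\%results, \%gold);
```

Returning 0 for an empty result set is one possible convention; a caller might prefer to treat that case as undefined instead.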
# Timeslicing evaluation that calculates the recall of LBD
# (O(k), where k is the number of keys in gold)
# input: $resultsMatrixRef <- ref to a matrix of LBD results
# $goldMatrixRef <- ref to a gold standard matrix
# output: the recall of results
sub calculateRecall {
my $resultsMatrixRef = shift;
my $goldMatrixRef = shift;
# calculate the recall which is the percentage of knowledge in the gold
# standard that was generated by the LBD system
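The recall described in the comment above can be sketched in the same way. This is an illustration under assumptions, not the body of `calculateRecall`: the sub name `recallOf`, the flat 'CUI1,CUI2' key format, and the empty-gold guard are all hypothetical.

```perl
use strict;
use warnings;

# Recall sketch: fraction of gold keys that the LBD results also contain.
sub recallOf {
    my ($resultsRef, $goldRef) = @_;
    my $total = scalar keys %{$goldRef};
    return 0 if $total == 0;    # guard (assumption): avoid division by zero
    my $count = grep { exists $resultsRef->{$_} } keys %{$goldRef};
    return $count / $total;
}

# Hypothetical toy data: three gold discoveries, two of them recovered.
my %results = map { $_ => 1 } ('C1,C2', 'C1,C3', 'C1,C4', 'C1,C5');
my %gold    = map { $_ => 1 } ('C1,C2', 'C1,C5', 'C1,C9');

printf "recall = %.2f\n", recallOf(\%results, \%gold);
```

Iterating over the gold keys rather than the result keys matches the O(k) note in the comment, where k is the number of keys in gold.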
lib/LiteratureBasedDiscovery/TimeSlicing.pm view on Meta::CPAN
# Boston, MA 02111-1307, USA.
package TimeSlicing;
use strict;
use warnings;
use LiteratureBasedDiscovery::Discovery;
#
# Calculates and outputs to STDOUT Time Slicing evaluation stats of
# precision and recall at $numIntervals intervals, Mean Average Precision
# (MAP), precision at k, and frequency at k
# input: $trueMatrixRef <- a ref to a hash of true discoveries
# $rowRanksRef <- a ref to a hash of arrays of ranked predictions.
# Each hash key is a cui, each hash element is an
# array of ranked predictions for that cui. The ranked
# predictions are cuis ordered in descending order
# based on association (from Rank::RankDescending).
# $numIntervals <- the number of recall intervals to generate
sub outputTimeSlicingResults {
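Two of the statistics named above, precision at k and Mean Average Precision, can be sketched as follows. This is not the body of `outputTimeSlicingResults`: the sub names are hypothetical, the true-discovery hash is assumed to map each start cui to a hash of its true discoveries, and precision at k here divides by the number of predictions actually available when fewer than k exist, which is only one of several conventions.

```perl
use strict;
use warnings;

# Assumed shapes (hypothetical toy data):
# %true maps a start cui to a hash of its true discoveries;
# %rowRanks maps a start cui to an array of predictions, best first.
my %true = (
    'C0000001' => { 'C0000004' => 1, 'C0000007' => 1 },
);
my %rowRanks = (
    'C0000001' => [ 'C0000004', 'C0000005', 'C0000007', 'C0000006' ],
);

# Precision at k: fraction of the top k predictions that are true.
sub precisionAtK {
    my ($trueRowRef, $ranksRef, $k) = @_;
    my $last = $k < @{$ranksRef} ? $k : scalar @{$ranksRef};
    return 0 if $last == 0;
    my $hits = grep { exists $trueRowRef->{$_} } @{$ranksRef}[0 .. $last - 1];
    return $hits / $last;
}

# Average precision: mean of the precision values taken at each true hit.
sub averagePrecision {
    my ($trueRowRef, $ranksRef) = @_;
    my ($hits, $sum) = (0, 0);
    for my $i (0 .. $#{$ranksRef}) {
        next unless exists $trueRowRef->{ $ranksRef->[$i] };
        $hits++;
        $sum += $hits / ($i + 1);
    }
    my $numTrue = scalar keys %{$trueRowRef};
    return $numTrue ? $sum / $numTrue : 0;
}

# MAP: mean of the average precision over all start cuis with predictions.
my ($apSum, $rows) = (0, 0);
foreach my $cui (keys %rowRanks) {
    $apSum += averagePrecision($true{$cui} // {}, $rowRanks{$cui});
    $rows++;
}
my $map = $rows ? $apSum / $rows : 0;

printf "P\@2 = %.2f\n", precisionAtK($true{'C0000001'}, $rowRanks{'C0000001'}, 2);
printf "MAP = %.4f\n", $map;
```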
samples/runSample.pl view on Meta::CPAN
# Demo file showing how to run open discovery using the sample data, and how
# to perform time slicing evaluation using the sample data
# run a sample lbd using the parameters in the lbd configuration file
print "\n OPEN DISCOVERY \n";
`perl ../utils/runDiscovery.pl lbdConfig`;
print "LBD Open discovery results output to sampleOutput\n\n";
# run a sample time slicing
# first remove the co-occurrences of the precutoff matrix (in this case it is
# the sampleExplicitMatrix) from the post cutoff matrix. This generates a gold
# standard discovery matrix from which time slicing may be performed
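The gold-standard step described in the comment above can be sketched in memory as follows. This is an illustration only, not the utility the sample script calls: the sub name `removeKnown`, the hash-of-hashes layout, and the toy CUIs and counts are all assumptions.

```perl
use strict;
use warnings;

# Hypothetical toy matrices: $matrix{$rowCui}{$colCui} = co-occurrence count.
my %preCutoff = (
    'C0000001' => { 'C0000002' => 4 },
    'C0000002' => { 'C0000003' => 1 },
);
my %postCutoff = (
    'C0000001' => { 'C0000002' => 6, 'C0000003' => 2 },
    'C0000002' => { 'C0000003' => 3 },
    'C0000004' => { 'C0000001' => 1 },
);

# Keep only the post-cutoff co-occurrences absent from the pre-cutoff
# matrix; what remains is the new knowledge used as the gold standard.
sub removeKnown {
    my ($postRef, $preRef) = @_;
    my %gold;
    foreach my $row (keys %{$postRef}) {
        foreach my $col (keys %{ $postRef->{$row} }) {
            next if exists $preRef->{$row} && exists $preRef->{$row}{$col};
            $gold{$row}{$col} = $postRef->{$row}{$col};
        }
    }
    return \%gold;
}

my $goldRef = removeKnown(\%postCutoff, \%preCutoff);
foreach my $row (sort keys %{$goldRef}) {
    foreach my $col (sort keys %{ $goldRef->{$row} }) {
        print "$row\t$col\t$goldRef->{$row}{$col}\n";
    }
}
```

Rows whose every co-occurrence was already known before the cutoff drop out of the gold standard entirely, so they contribute nothing to the later precision and recall figures.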