eval results from the CPAN

ALBD

view release on metacpan or search on metacpan


  SYNOPSIS
        This package consists of Perl modules along with supporting Perl
        programs that perform Literature Based Discovery (LBD). The core 
        data from which LBD is performed are co-occurrences matrices 
        generated from UMLS::Association. ALBD is based on the ABC
        co-occurrence model. Many options can be specified, and many
        ranking methods are available. The novel ranking methods that use
        association measure are available as well as frequency based
        ranking methods. See samples/lbd for more info. Can perform open and
        closed LBD as well as time slicing evaluation.

        ALBD requires UMLS::Association both to compute the co-occurrence
        database that the co-occurrence matrix is derived from, but also for 
        ranking the generated C terms.

        UMLS::Association requires the UMLS::Interface module to access 
        the Unified Medical Language System (UMLS) for semantic type filtering
        and to determine if CUIs are valid.

        The following sections describe the organization of this software

lib/ALBD.pm view on Meta::CPAN

=head1 ABSTRACT

      This package consists of Perl modules along with supporting Perl
      programs that perform Literature Based Discovery (LBD). The core 
      data from which LBD is performed are co-occurrences matrices 
      generated from UMLS::Association. ALBD is based on the ABC
      co-occurrence model. Many options can be specified, and many
      ranking methods are available. The novel ranking methods that use
      association measure are available as well as frequency based
      ranking methods. See samples/lbd for more info. Can perform open and
      closed LBD as well as time slicing evaluation.

=head1 INSTALL

To install the module, run the following magic commands:

  perl Makefile.PL
  make
  make test
  make install

lib/ALBD.pm view on Meta::CPAN

  perl Makefile.PL PREFIX=/home/sid

It is possible to modify other parameters during installation. The
details of these can be found in the ExtUtils::MakeMaker
documentation. However, it is highly recommended not messing around
with other parameters, unless you know what you're doing.

=head1 CONFIGURATION FILE

There are many parameters that can be specified, both for open and
close discovery as well as time slicing evaluation. Please see the 
samples folder for info and sample configuration files.

=cut


######################################################################
#                          Description
######################################################################
#
# This is a description heared more towards understanding or modifying

lib/LiteratureBasedDiscovery/Evaluation.pm view on Meta::CPAN

# ALBD::Evaluation.pm
#
# Provides functionality to evaluate LBD systems
# Key components are:
# Results Matrix <- all new knowledge generated by an LBD system (e.g.
#                   all proposed discoveries of a system from pre-cutoff
#                   data).
# Gold Standard Matrix <- the gold standard knowledge matrix (e.g. all
#                         knowledge present in the post-cutoff dataset
#                         that is not present in the pre-cutoff dataset).
#
# Copyright (c) 2017
#

lib/LiteratureBasedDiscovery/Evaluation.pm view on Meta::CPAN

# along with this program; if not, write to
#
# The Free Software Foundation, Inc.,
# 59 Temple Place - Suite 330,
# Boston, MA  02111-1307, USA.

package Evaluation;
use strict;
use warnings;

# Timeslicing evaluation that calculates the precision of LBD 
# (O(k), where k is the number of keys in results)
# input: $resultsMatrixRef <- ref a matrix of LBD results
#        $goldMatrixRef <- ref to a gold standard matrix
# output: the precision of results
sub calculatePrecision {
    my $resultsMatrixRef = shift;
    my $goldMatrixRef = shift;

    # calculate the precision which is the percentage of results that are 
    # are in the gold standard

lib/LiteratureBasedDiscovery/Evaluation.pm view on Meta::CPAN

    my $count = 0;
    foreach my $key(keys %{$resultsMatrixRef}) {
	if (exists ${$goldMatrixRef}{$key}) {
	    $count++;
	}
    }
    return $count/(scalar keys %{$resultsMatrixRef});

}

# Timeslicing evaluation that calculate the recall of LBD 
# (O(k), where k (is the number of keys in gold)
# input: $resultsMatrixRef <- ref a matrix of LBD results
#        $goldMatrixRef <- ref to a gold standard matrix
# output: the recall of results
sub calculateRecall {
    my $resultsMatrixRef = shift;
    my $goldMatrixRef = shift;
    
    # calculate the recall which is the percentage of knowledge in the gold
    # standard that was generated by the LBD system

lib/LiteratureBasedDiscovery/TimeSlicing.pm view on Meta::CPAN

# Boston, MA  02111-1307, USA.

package TimeSlicing;
use strict;
use warnings;

use LiteratureBasedDiscovery::Discovery;


#
# Calculates and outputs to STDOUT Time Slicing evaluation stats of
# precision and recall at $numIntervals intervals, Mean Average Precision
# (MAP), precision at k, and frequency at k
# input:  $trueMatrixRef <- a ref to a hash of true discoveries
#         $rowRanksRef <- a ref to a hash of arrays of ranked predictions. 
#                         Each hash key is a cui,  each hash element is an 
#                         array of ranked predictions for that cui. The ranked 
#                         predictions are cuis are ordered in descending order 
#                         based on association. (from Rank::RankDescending)
#         $numIntervals <- the number of recall intervals to generate
sub outputTimeSlicingResults {

samples/runSample.pl view on Meta::CPAN

#Demo file, showing how to run open discovery using the sample data, and how 
# to perform time slicing evaluation using the sample data

# run a sample lbd using the parameters in the lbd configuration file
print "\n           OPEN DISCOVERY          \n";
`perl ../utils/runDiscovery.pl lbdConfig`;
print "LBD Open discovery results output to sampleOutput\n\n";

# run a sample time slicing
# first remove the co-occurrences of the precutoff matrix (in this case it is 
# the sampleExplicitMatrix from the post cutoff matrix. This generates a gold 
# standard discovery matrix from which time slicing may be performed

( run in 1.048 second using v1.01-cache-2.11-cpan-acf6aa7dc9e )