ALBD
GPL.txt
These requirements apply to the modified work as a whole. If
identifiable sections of that work are not derived from the Program,
and can be reasonably considered independent and separate works in
themselves, then this License, and its terms, do not apply to those
sections when you distribute them as separate works. But when you
distribute the same sections as part of a whole which is a work based
on the Program, the distribution of the whole must be on the terms of
this License, whose permissions for other licensees extend to the
entire whole, and thus to each and every part regardless of who wrote it.
Thus, it is not the intent of this section to claim rights or contest
your rights to work written entirely by you; rather, the intent is to
exercise the right to control the distribution of derivative or
collective works based on the Program.
In addition, mere aggregation of another work not based on the Program
with the Program (or with a work based on the Program) on a volume of
a storage or distribution medium does not bring the other work under
the scope of this License.
3. You may copy and distribute the Program (or a work based on it,
|
GPL.txt
all those who receive copies directly or indirectly through you, then
the only way you could satisfy both it and this License would be to
refrain entirely from distribution of the Program.
If any portion of this section is held invalid or unenforceable under
any particular circumstance, the balance of the section is intended to
apply and the section as a whole is intended to apply in other
circumstances.
It is not the purpose of this section to induce you to infringe any
patents or other property right claims or to contest validity of any
such claims; this section has the sole purpose of protecting the
integrity of the free software distribution system, which is
implemented by public license practices. Many people have made
generous contributions to the wide range of software distributed
through that system in reliance on consistent application of that
system; it is up to the author/donor to decide if he or she is willing
to distribute software through any other system and a licensee cannot
impose that choice.
This section is intended to make thoroughly clear what is believed to
|
GPL.txt
OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
convey the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>
|
INSTALL
NAME
ALBD Installation Guide
TESTING PLATFORMS
ALBD has been developed and tested on Linux primarily using Perl.
SYNOPSIS
perl Makefile.PL
make
make test
make install
DESCRIPTION
ALBD provides a system for performing ABC co-occurrence literature-based
discovery using a variety of options and association-based ranking
methods.
REQUIREMENTS
ALBD REQUIRES the following software packages and data:
|
INSTALL
The package is freely available at:
Stage 3: Install ALBD package
The usual way to install the package is to run the following commands:
perl Makefile.PL
make
make test
make install
You will often need root access/superuser privileges to run make
install. The module can also be installed locally. To do a local
install, you need to specify a PREFIX option when you run 'perl
Makefile.PL'. For example,
perl Makefile.PL PREFIX=/home
or
|
INSTALL
Of course, you could also add the 'use lib' line to the top of the
program yourself, but you might not want to do that. You will need to
replace 5.8.3 with whatever version of Perl you are using. The preceding
instructions should be sufficient for standard and slightly non-standard
installations. However, if you need to modify other makefile options you
should look at the ExtUtils::MakeMaker documentation. Modifying other
makefile options is not recommended unless you really, absolutely, and
completely know what you're doing!
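As a rough sketch of what such a 'use lib' line looks like, assuming a local install under PREFIX=/home: the directory below is hypothetical, since the real location depends on your PREFIX and Perl version.

```perl
use strict;
use warnings;

# Hypothetical local install path (an assumption, not taken from ALBD);
# substitute the directory created by your own PREFIX and Perl version.
use lib '/home/lib/perl5/site_perl/5.8.3';

# 'use lib' puts the directory at the front of @INC, so modules
# installed there are found before the system-wide ones.
print "$INC[0]\n";
```

Setting the PERL5LIB environment variable to the same directory has a similar effect without editing the program.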
NOTE: If one (or more) of the tests run by 'make test' fails, you will
see a summary of the tests that failed, followed by a message of the
form "make: *** [test_dynamic] Error Y" where Y is a number between 1
and 255 (inclusive). If the number is less than 255, then it indicates
how many tests failed (if more than 254 tests failed, then 254 will still
be shown). If one or more tests died, then 255 will be shown. For more
details, see:
Stage 4: Create a co-occurrence matrix
ALBD requires that a co-occurrence matrix of CUIs has been created. This
matrix is stored as a flat file, in a sparse matrix format such that
each line contains three tab-separated values: cui_1, cui_2, and n_11 (the
count of their co-occurrences). Any matrix with that format is
acceptable; however, the intended method of matrix generation is to
|
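To make the sparse format from Stage 4 concrete, here is a minimal sketch that reads tab-separated "cui_1, cui_2, n_11" lines into a hash of hashes. The helper name read_cooccurrence_matrix and the sample CUIs and counts are made up for illustration; this is not the reader ALBD itself uses.

```perl
use strict;
use warnings;

# Parse sparse co-occurrence lines of the form "cui_1<TAB>cui_2<TAB>n_11"
# into a hash of hashes. read_cooccurrence_matrix is a hypothetical helper,
# not a function provided by ALBD.
sub read_cooccurrence_matrix {
    my ($fh) = @_;
    my %matrix;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ /^\s*$/;          # skip blank lines
        my ($cui1, $cui2, $n11) = split /\t/, $line;
        next unless defined $n11;          # skip lines with fewer than three fields
        $matrix{$cui1}{$cui2} += $n11;     # sum counts if a pair is repeated
    }
    return \%matrix;
}

# Example with made-up CUIs and counts, read from an in-memory string.
my $sample = "C0000001\tC0000002\t3\nC0000001\tC0000003\t1\n";
open my $fh, '<', \$sample or die "cannot open in-memory sample: $!";
my $matrix = read_cooccurrence_matrix($fh);
close $fh;
print "$matrix->{C0000001}{C0000002}\n";
```

Summing repeated pairs is a design choice of this sketch; a stricter reader could treat a duplicate pair as an error instead.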
MANIFEST
samples/lbdConfig
samples/postCutoffMatrix
samples/runSample.pl
samples/sampleExplicitMatrix
samples/sampleGoldMatrix
samples/timeSliceCuiList
samples/timeSlicingConfig
samples/configFileSamples/UMLSAssociationConfig
samples/configFileSamples/UMLSInterfaceConfig
samples/configFileSamples/UMLSInterfaceInternalConfig
t/test.t
t/goldSampleOutput
t/goldSampleTimeSliceOutput
utils/runDiscovery.pl
utils/datasetCreator/applyMaxThreshold.pl
utils/datasetCreator/applyMinThreshold.pl
utils/datasetCreator/applySemanticFilter.pl
utils/datasetCreator/combineCooccurrenceMatrices.pl
utils/datasetCreator/makeOrderNotMatter.pl
utils/datasetCreator/removeCUIPair.pl
utils/datasetCreator/removeExplicit.pl
utils/datasetCreator/testMatrixEquality.pl
utils/datasetCreator/dataStats/getCUICooccurrences.pl
utils/datasetCreator/dataStats/getMatrixStats.pl
utils/datasetCreator/dataStats/metaAnalysis.pl
utils/datasetCreator/fromMySQL/dbToTab.pl
utils/datasetCreator/fromMySQL/removeQuotes.pl
utils/datasetCreator/squaring/convertForSquaring_MATLAB.pl
utils/datasetCreator/squaring/squareMatrix.m
utils/datasetCreator/squaring/squareMatrix_partial.m
utils/datasetCreator/squaring/squareMatrix_perl.pl
META.yml Module YAML meta-data (added by MakeMaker)
|
README
The following sections describe the organization of this software
package and how to use it. A few typical examples are given to help
clearly understand the usage of the modules and the supporting
utilities.
INSTALL
To install the module, run the following magic commands:
perl Makefile.PL
make
make test
make install
This will install the module in the standard location. You will, most
probably, require root privileges to install in standard system
directories. To install in a non-standard directory, specify a prefix
during the 'perl Makefile.PL' stage as:
perl Makefile.PL PREFIX=/home/programs
It is possible to modify other parameters during installation. The
|
README
removeCUIPair.pl -- removes all occurrences of the specified CUI pair
from the co-occurrence matrix
removeExplicit.pl -- removes any keys that occur in an explicit
co-occurrence matrix from another co-occurrence matrix (typically the
squared explicit co-occurrence matrix itself, which generates a
prediction matrix, or the post cutoff matrix used in time slicing to
generate a gold standard file)
testMatrixEquality.pl -- checks to see if two co-occurrence matrix files
contain the same data
Also included are several subfolders with more specific purposes. Within
the dataStats subfolder are scripts to collect various statistics about
the co-occurrence matrices used in LBD. These scripts include:
getCUICooccurrences.pl -- a data statistics file that gets the number of
co-occurrences, and number of unique co-occurrences for every CUI in the
dataset
|
lib/ALBD.pm
association measure are available as well as frequency based
ranking methods. See samples/lbd for more info. Can perform open and
closed LBD as well as time slicing evaluation.
|
lib/LiteratureBasedDiscovery/Rank.pm
        foreach my $cuiPair (sort { $tiedAMWScores{$b} <=> $tiedAMWScores{$a} } keys %tiedAMWScores) {
            $ltcAMWScores{$cuiPair} = $currentRank;
            $currentRank--;
        }
    }
    return \%ltcAMWScores;
}
sub score_cosineDistance {
    my $startingMatrixRef = shift;
    my $explicitMatrixRef = shift;
    my $implicitMatrixRef = shift;
|
lib/LiteratureBasedDiscovery/TimeSlicing.pm
sub calculatePrecisionAndRecall_implicit {
    my $trueMatrixRef = shift;
    my $rowRanksRef = shift;
    my $numIntervals = shift;
    my %precision = ();
    my %recall = ();
    foreach my $rowKey (keys %{$trueMatrixRef}) {
        my $trueRef = ${$trueMatrixRef}{$rowKey};
        my $rankedPredictionsRef = ${$rowRanksRef}{$rowKey};
|
lib/LiteratureBasedDiscovery/TimeSlicing.pm
        if ($numTrue == 0) {
            next;
        }
        if ($numPredictions == 0) {
            next;
        }
        my $interval = $numPredictions / $numIntervals;
        for (my $i = 0; $i <= 1; $i += (1 / $numIntervals)) {
            my $numTrueForInterval = 1;
            if ($i > 0) {
                $numTrueForInterval = $numTrue * $i;
            }
|
lib/LiteratureBasedDiscovery/TimeSlicing.pm
            $precision{$i} += ($truePositive / $numChecked);
            $recall{$i} += ($truePositive / $numTrue);
        }
    }
    foreach my $i (keys %precision) {
        $precision{$i} /= (scalar keys %{$trueMatrixRef});
        $recall{$i} /= (scalar keys %{$trueMatrixRef});
    }
    return (\%precision, \%recall);
}
|
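The excerpts above accumulate truePositive / numChecked as precision and truePositive / numTrue as recall. The following is a minimal, self-contained sketch of those two ratios for a single ranked list, assuming a top-k cutoff. The helper name precision_recall_at_k and the CUIs are made up, and it is not the routine in TimeSlicing.pm (for instance, it returns zeros for empty inputs where the module skips the row).

```perl
use strict;
use warnings;

# Precision and recall of the top-$k entries of a ranked prediction list
# against a set of true items. Hypothetical helper, not part of TimeSlicing.pm.
sub precision_recall_at_k {
    my ($rankedRef, $trueRef, $k) = @_;
    my $numTrue = scalar keys %{$trueRef};
    # Never check more entries than were actually predicted.
    my $numChecked = $k < @{$rankedRef} ? $k : scalar @{$rankedRef};
    return (0, 0) if $numTrue == 0 || $numChecked == 0;

    my $truePositive = 0;
    for my $i (0 .. $numChecked - 1) {
        $truePositive++ if exists $trueRef->{ $rankedRef->[$i] };
    }
    return ($truePositive / $numChecked, $truePositive / $numTrue);
}

# Made-up CUIs: two of the top three predictions are true, out of four true items.
my @ranked = qw(C01 C02 C03 C04);
my %true   = map { $_ => 1 } qw(C01 C03 C07 C09);
my ($precision, $recall) = precision_recall_at_k(\@ranked, \%true, 3);
printf "precision=%.3f recall=%.3f\n", $precision, $recall;
```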
lib/LiteratureBasedDiscovery/TimeSlicing.pm
    my $trueMatrixRef = shift;
    my $rowRanksRef = shift;
    my %meanCooccurrenceCount = ();
    my $interval = 1;
    for (my $k = 1; $k <= 100; $k += $interval) {
        $meanCooccurrenceCount{$k} = 0;
        foreach my $rowKey (keys %{$trueMatrixRef}) {
            my $rankedPredictionsRef = ${$rowRanksRef}{$rowKey};
            if (!defined $rankedPredictionsRef) {
                next;
            }
            my $trueRef = ${$trueMatrixRef}{$rowKey};
|
samples/lbdConfig
<rankingProcedure>averageMinimumWeight
<rankingMeasure>ll
|
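The two configuration lines above follow a "<key>value" pattern. As an illustration only, the sketch below reads lines of that shape into a hash. The helper name parse_config_lines is hypothetical, the handling of blank lines and '#' comments is an assumption, and this is not the parser ALBD itself uses.

```perl
use strict;
use warnings;

# Parse "<key>value" lines into a hash. parse_config_lines is a hypothetical
# helper and is not the parser ALBD itself uses.
sub parse_config_lines {
    my (@lines) = @_;
    my %config;
    for my $line (@lines) {
        next if $line =~ /^\s*(?:#.*)?$/;  # assumed: skip blanks and '#' comments
        if ($line =~ /^\s*<([^>]+)>\s*(.*?)\s*$/) {
            $config{$1} = $2;              # key inside <...>, value after it
        }
    }
    return \%config;
}

my $config = parse_config_lines(
    '<rankingProcedure>averageMinimumWeight',
    '<rankingMeasure>ll',
);
print "$config->{rankingProcedure} using $config->{rankingMeasure}\n";
```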
samples/timeSlicingConfig
<rankingProcedure>averageMinimumWeight
<rankingMeasure>ll
|
t/test.t
#!/usr/local/bin/perl -w
my $precRecallErrorTol = 0.0001;
my $atKErrorTol = 1.0;
`(cd ./samples/; perl runSample.pl) &`;
print "Performing Open Discovery Tests:\n";
my %goldScores = ();
open IN, './t/goldSampleOutput'
    or die ("Error: Cannot open gold sample output\n");
while (my $line = <IN>) {
    if ($line =~ /\d+\t(\d+\.\d+)\t(C\d+)/) {
        $goldScores{$2} = $1;
|
t/test.t
    }
}
ok ($allExist == 1, "All CUIs exist in the output");
ok ($allMatch == 1, "All Scores are the same in the output");
print "Done with Open Discovery Tests\n\n";
print "Performing Time Slicing Tests\n";
(my $goldAPScoresRef, my $goldMAP, my $goldPAtKScoresRef, my $goldFAtKScoresRef)
    = &readTimeSlicingData('./t/goldSampleTimeSliceOutput');
(my $newAPScoresRef, my $newMAP, my $newPAtKScoresRef, my $newFAtKScoresRef)
    = &readTimeSlicingData('./samples/sampleTimeSliceOutput');
|
utils/datasetCreator/fromMySQL/removeQuotes.pl
use strict;
use warnings;

# Strip every double-quote character from the input file.
my $inFile  = '1980_1984_window1_retest_data.txt';
my $outFile = '1980_1984_window1_restest_DELETEME';
open my $in, '<', $inFile or die "unable to open inFile: $inFile\n";
open my $out, '>', $outFile or die "unable to open outFile: $outFile\n";
while (my $line = <$in>) {
    $line =~ s/"//g;
    print {$out} $line;
}
close $in;
close $out or die "unable to close outFile: $outFile\n";
|
utils/runDiscovery.pl
    . "\nOPTIONS\n"
    . " --assocConfig path to the UMLS::Association Config File\n"
    . " --interfaceConfig path to the UMLS::Interface Config File\n"
    . "\nUSAGE EXAMPLES\n"
    . " runDiscovery lbdConfigFile\n";
;
my $DEBUG = 0;
my $HELP = '';
my $VERSION;
my %options = ();
$options{'assocConfig'} = '';
$options{'interfaceConfig'} = '';
GetOptions('debug' => \$DEBUG,
|