ALBD

NAME
    ALBD README

  SYNOPSIS
        This package consists of Perl modules along with supporting Perl
        programs that perform Literature Based Discovery (LBD). The core
        data from which LBD is performed are co-occurrence matrices
        generated from UMLS::Association. ALBD is based on the ABC
        co-occurrence model. Many options can be specified, and many
        ranking methods are available, including novel ranking methods
        that use association measures as well as frequency-based ranking
        methods. See samples/lbd for more info. ALBD can perform open and
        closed LBD as well as time slicing evaluation.
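
        As a rough illustration of the ABC model in open discovery, the
        sketch below starts from a term A, gathers the B terms that
        co-occur with it, and then proposes C terms that co-occur with
        some B but not directly with A. It is not ALBD's implementation:
        the hash %cooc, the sub open_discovery, the CUIs, and the counts
        are all hypothetical, and scoring each C term by its number of
        linking B terms is just one simple frequency-based choice.

```perl
use strict;
use warnings;

# Toy co-occurrence data: $cooc{$x}{$y} = number of times $x and $y co-occur.
# The CUIs and counts are invented for illustration only.
my %cooc = (
    C0000001 => { C0000002 => 5, C0000003 => 2 },
    C0000002 => { C0000004 => 7, C0000005 => 1 },
    C0000003 => { C0000004 => 3, C0000001 => 2 },
);

# Open discovery under the ABC model (hypothetical helper, not part of ALBD):
# collect the B terms that co-occur with A, then the C terms that co-occur
# with some B but not directly with A. Each C term is scored by the number
# of B terms that link it to A.
sub open_discovery {
    my ($cooc, $a_term) = @_;
    my %b_terms = %{ $cooc->{$a_term} // {} };
    my %score;
    for my $b_term (keys %b_terms) {
        for my $c_term (keys %{ $cooc->{$b_term} // {} }) {
            next if $c_term eq $a_term;          # skip the start term itself
            next if exists $b_terms{$c_term};    # already co-occurs with A
            $score{$c_term}++;
        }
    }
    return \%score;
}

# Print candidate C terms, highest score first (ties broken by CUI).
my $scores = open_discovery(\%cooc, 'C0000001');
for my $c (sort { $scores->{$b} <=> $scores->{$a} || $a cmp $b } keys %$scores) {
    print "$c\t$scores->{$c}\n";
}
```

        Closed discovery would instead fix both A and C and look for the
        B terms shared between them.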

        ALBD requires UMLS::Association both to compute the co-occurrence
        database from which the co-occurrence matrix is derived and to
        rank the generated C terms.

        UMLS::Association requires the UMLS::Interface module to access 
        the Unified Medical Language System (UMLS) for semantic type filtering
        and to determine if CUIs are valid.

        The following sections describe the organization of this software
        package and how to use it. A few typical examples are given to help
        clearly understand the usage of the modules and the supporting
        utilities.

  INSTALL
        To install the module, run the following magic commands:

          perl Makefile.PL
          make
          make test
          make install

        This will install the module in the standard location. You will, most
        probably, require root privileges to install in standard system
        directories. To install in a non-standard directory, specify a prefix
        during the 'perl Makefile.PL' stage as:

          perl Makefile.PL PREFIX=/home/programs

        It is possible to modify other parameters during installation. The
        details of these can be found in the ExtUtils::MakeMaker documentation.
        However, it is highly recommended not to mess around with other
        parameters unless you know what you're doing.

  CO-OCCURRENCE MATRIX SETUP
    ALBD requires that a co-occurrence matrix of CUIs has been created. This
    matrix is stored as a flat file in a sparse matrix format, such that
    each line contains three tab-separated values: cui_1, cui_2, and n_11,
    the count of their co-occurrences. Any matrix in that format is
    acceptable; however, the intended method of matrix generation is to
    convert a UMLS::Association database into a flat matrix file. These
    databases are created using the CUICollector tool of UMLS::Association,
    which is run over the MetaMapped Medline baseline. With such a database
    in place, run utils/datasetCreator/fromMySQL/dbToTab.pl to convert the
    desired database into a matrix file. Note that the code in dbToTab.pl
    is just a sample mysql command; if the input database was created by
    another method, a different command may be needed. As long as the
    resulting co-occurrence matrix is in the correct format, LBD may be run
    on it. This allows flexibility in where co-occurrence information comes
    from.

    Note: utils/datasetCreator/fromMySQL/removeQuotes.pl may need to be run
    on the resulting tab-separated file if quotes are included in the
    resulting co-occurrence matrix file.
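
    To make the expected file format concrete, here is a minimal sketch of
    reading such a matrix into a hash of hashes. The sub read_matrix and the
    sample data are hypothetical and not part of ALBD. Summing the counts of
    repeated pairs is an assumption, and the quote stripping is only a
    stand-in that is not claimed to match what removeQuotes.pl does.

```perl
use strict;
use warnings;

# Parse sparse co-occurrence lines of the form "cui_1<TAB>cui_2<TAB>n_11".
# Returns a hash of hashes: $matrix->{$cui_1}{$cui_2} = $n_11.
# (Hypothetical helper for illustration; not part of ALBD.)
sub read_matrix {
    my ($fh) = @_;
    my %matrix;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ /^\s*$/;            # ignore blank lines
        $line =~ s/"//g;                     # drop stray double quotes, if any
        my ($cui_1, $cui_2, $n_11) = split /\t/, $line;
        die "Malformed line $.: $line\n"
            unless defined $n_11 && $n_11 =~ /^\d+$/;
        $matrix{$cui_1}{$cui_2} += $n_11;    # assumption: sum repeated pairs
    }
    return \%matrix;
}

# Demonstration on an in-memory file; the CUIs and counts are made up.
my $sample = "C0000001\tC0000002\t5\n\"C0000001\"\t\"C0000003\"\t2\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my $matrix = read_matrix($fh);
close $fh;

printf "%s co-occurs with %d CUIs\n",
    'C0000001', scalar keys %{ $matrix->{C0000001} };
```

    For a real matrix file, open the file path in place of the in-memory
    string and pass the resulting filehandle to the same sub.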

  Set Up Dummy UMLS::Association Database
    UMLS::Association requires that a database can be connected to that is
    in the correct format. Although this database is not required for ALBD
    (since co-occurrence data is loaded from a co-occurrence matrix), it is
    required to run UMLS::Association. If you ran UMLS::Association to
    generate a co-occurrence matrix, you should be fine. Otherwise you will
    need to create a dummy database that it can connect to. This can be done
    in a few steps:

    1) Open mysql: type mysql at the terminal.

    2) Create the default database in the correct format. Type:

        CREATE DATABASE cuicounts;
        use cuicounts;
        CREATE TABLE N_11(cui_1 CHAR(10), cui_2 CHAR(10), n_11 BIGINT(20));

  INITIALIZING THE MODULE
    To create an instance of the ALBD object, using default values for all
    configuration options:

        my %options = ();
        $options{'lbdConfig'} = 'configFile';
        my $lbd = LiteratureBasedDiscovery->new(\%options);
        $lbd->performLBD();

    The following configuration options are also provided:

    'assocConfig' path to a UMLS::Association configuration file. Default
    location is 'config/association'. Replace this file for your computer to
    avoid having to specify it each time.

    'interfaceConfig' path to a UMLS::Interface configuration file. Default
    location is '../config/interface'. Replace this file for your computer
    to avoid having to specify it each time.

    These are passed through a hash. For example:

        my %options = ();
        $options{'assocConfig'}   = '/home/share/ALBD/config/association';
        $options{'interfaceConfig'} = '/home/share/ALBD/config/interface';
        $options{'lbdConfig'} = 'configFile';
        my $lbd = LiteratureBasedDiscovery->new(\%options);
        $lbd->performLBD();

  CONTENTS
    All the modules that will be installed in the Perl system directory are
    present in the '/lib' directory tree of the package.

    The package contains a utils/ directory that contains Perl utility
    programs. These utilities use the modules or provide some supporting
    functionality.

    runDiscovery.pl -- runs LBD using the parameters specified in the input
    file, and outputs to an output file.

    The package contains a large selection of functions to manipulate CUI
    Co-occurrence matrices in the utils/datasetCreator/ directory. These are


