Bio-Grep


      algorithm that searches all sequences for the specified regular
      expression. Compared to suffix arrays, this is very slow for (nearly)
      perfect matches and large databases, but it requires no additional
      software. Currently it keeps all hits of one sequence in memory, which
      can be a problem for Fasta files with huge sequences (for example,
      whole chromosomes stored as one sequence).
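
      The memory remark can be illustrated with a minimal sketch of a
      regular-expression scan. This is not the code of the back-end itself:
      the subroutine find_hits and the toy sequences are hypothetical. It
      only shows the general pattern, in which every match of one sequence
      is collected in memory before the next sequence is scanned.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::Grep: collect all matches of
# $regex in one sequence and keep them in memory until the scan is done.
sub find_hits {
    my ( $id, $seq, $regex ) = @_;
    my @hits;
    while ( $seq =~ /$regex/g ) {
        push @hits, {
            id    => $id,
            start => $-[0] + 1,    # 1-based start of the match
            end   => $+[0],        # 1-based end of the match
            match => substr( $seq, $-[0], $+[0] - $-[0] ),
        };
    }
    return @hits;
}

# Toy data; a real Fasta file would be read record by record.
my %sequences = (
    seq1 => 'ACGTTGCAACGT',
    seq2 => 'TTTTACGAACGG',
);

for my $id ( sort keys %sequences ) {
    for my $hit ( find_hits( $id, $sequences{$id}, qr/ACG[TA]/ ) ) {
        print join( "\t", @{$hit}{qw(id start end match)} ), "\n";
    }
}
```

      The list of hits grows with the number of matches in a sequence, so
      a whole chromosome stored as a single record can produce a very
      large list.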

  Vmatch (http://vmatch.de/) for the Vmatch back-end. Commercial software.
    The Vmatch tests assume that vmatch is in your path. You can later
    specify a path to vmatch that is not in your path; in that case the
    tests will fail, but the module should work if the specified path
    is correct.
 
  Agrep (http://www.tgries.de/agrep/) for the Agrep back-end. Packages
    are available for some Linux distributions (Debian:
    apt-get install agrep). Fink has packages for Mac OS X, and
    ebuilds for Gentoo are available, too. As with Vmatch, the Agrep
    tests assume that agrep is in your path.
    TRE-agrep (http://laurikari.net/tre/) is much slower but has
    more features and fewer limitations.

  GUUGle (http://bibiserv.techfak.uni-bielefeld.de/guugle/)
    A suffix array implementation for RNA sequences. It only allows
    searching for exact matches. It is very memory efficient and needs no
    precalculated suffix arrays. Open Source.

  ...and other Perl modules. You will get a warning about missing
  modules when you run the make command. A lot of dependencies, we know,
  but most of them are standard software in bioinformatics. So please
  check if some of them are already installed on your workstation.
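
  Since the Vmatch and Agrep tests assume that the programs are in your
  path, it can help to check this before running the tests. The following
  is a small sketch for Unix-like systems; the helper find_in_path is
  hypothetical and not part of this distribution.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return the first executable file named $program
# found in the directories of PATH, or undef if there is none.
sub find_in_path {
    my ($program) = @_;
    for my $dir ( File::Spec->path ) {
        my $candidate = File::Spec->catfile( $dir, $program );
        return $candidate if -f $candidate && -x _;
    }
    return;
}

# Program names as used in the dependency list above.
for my $program (qw(vmatch agrep)) {
    my $found = find_in_path($program);
    printf "%-8s %s\n", $program,
        defined $found ? $found : 'not found in PATH';
}
```

  A program reported as missing only means that the tests for that
  back-end are expected to fail; the other back-ends are not affected.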


INSTALLATION

To install this module, type the following (AFTER installing the
software listed in the "Dependencies" section):

    perl Build.PL
    ./Build
    ./Build test
    ./Build install

Alternatively, to install it "the old way", you can use the following
commands:

    perl Makefile.PL
    make
    make test
    make install


DOCUMENTATION


1. Tutorials
------------

bgrep is an example implementation. Its source code is well documented, so
it may be a good starting point.

A cookbook, not yet comprehensive, is available in perldoc Bio::Grep::Cookbook.
Please contribute recipes if you can!


2. Performance
--------------

2.1 Vmatch


*  Try $sbe->settings->showdesc(200) if you don't need upstream or downstream
   regions. This makes the parser get all data directly out of vmatch output.
   Otherwise the parser will call vsubseqselect for every search result.

*  Try $sbe->settings->online(1) if you allow many mismatches.
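
The two tips above can be combined in a small helper. This is only a sketch:
the subroutine tune_vmatch and its option names are hypothetical, and only
the showdesc and online settings are taken from the text above. $sbe is
assumed to be a Vmatch back-end object, created as described in
perldoc Bio::Grep.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the two performance tips to a Vmatch
# back-end object $sbe.
sub tune_vmatch {
    my ( $sbe, %opt ) = @_;

    # No upstream or downstream regions needed: the parser can then take
    # all data directly from the vmatch output.
    $sbe->settings->showdesc(200) if !$opt{need_flanking_regions};

    # Many mismatches allowed: switch on the online option.
    $sbe->settings->online(1) if $opt{many_mismatches};

    return $sbe;
}

# Usage (assumes $sbe already holds a Vmatch back-end object):
#   tune_vmatch( $sbe, many_mismatches => 1 );
```

Whether either setting pays off depends on the database and the query, so
it is worth timing a search with and without it.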


3. FAQ
------

- Is it possible to get the coordinates of the hit out of the alignment?
  Yes. $res->alignment->get_seq_by_pos(1)->...

  See perldoc Bio::SimpleAlign.


BUGS

Please report any bugs, recipes for the cookbook or feature requests to
bug-bio-grep@rt.cpan.org, or through the web interface at
http://rt.cpan.org.


COPYRIGHT AND LICENSE

Copyright (C) 2007-2009 by M. Riester.

Based on Weigel::Search v0.13, Copyright (C) 2005-2006 by Max Planck 
Institute for Developmental Biology, Tuebingen.

This library is free software; you can redistribute it and/or modify
it under the same terms as Perl itself, either Perl version 5.8.4 or,
at your option, any later version of Perl 5 you may have available.


