Bio-Grep
lib/Bio/Grep.pm view on Meta::CPAN
=head2 CONSTRUCTOR
=over
=item C<new($backend)>
This method constructs a C<Bio::Grep> back-end object. Available external back-ends
are C<Vmatch>, C<Agrep>, and C<GUUGle>. Perl regular
expressions are available in the C<RE> back-end. C<Vmatch> is the default.
The temporary path is set to C<File::Spec-E<gt>tmpdir()>.
  my $sbe = Bio::Grep->new('Agrep');
Returns an object that uses L<Bio::Grep::Backend::BackendI>
as its base class. See L<Bio::Grep::Backend::BackendI>, L<Bio::Grep::Backend::Vmatch>,
L<Bio::Grep::Backend::Agrep>, L<Bio::Grep::Backend::GUUGle> and
L<Bio::Grep::Backend::RE>.
=back
=head1 FEATURES
=over
=item
C<Bio::Grep> supports most of the features of the back-ends. If you need a
particular feature that is not supported, please file a feature request. In general, it
should be easy to integrate. For a complete list of supported features, see
L<Bio::Grep::SearchSettings>; for an overview, see
L<"FEATURE COMPARISON">.
=item
This module should be suitable for large data sets. The back-end output is piped
to a temporary file and the parser only stores the current hit in memory.
=item
C<Bio::Grep> includes an interface for search result filters. See L<"FILTERS">.
This module also allows you to retrieve up- and downstream regions. Together
with filters, this makes C<Bio::Grep> an ideal framework for seed and extend
algorithms.
=item
C<Bio::Grep> was designed with web services in mind and therefore
checks the settings carefully before calling back-ends. See L<"SECURITY">.
=back
=head1 QUICK START
This is only a short overview of the functionality of this module.
You should also read L<Bio::Grep::Backend::BackendI> and the documentation of
the back-end you want to use (e.g. L<Bio::Grep::Backend::Vmatch>).
L<Bio::Grep::Cookbook> is a (not yet comprehensive) collection of recipes for
common problems.
=head2 GENERATE DATABASES
As a first step, you have to generate a C<Bio::Grep> database from the Fasta
file you want to search. A C<Bio::Grep> database consists of several
files and allows you to retrieve information about the database as well
as to perform queries as quickly and memory-efficiently as possible. You have to do
this only once for every file.
For example:
  my $sbe = Bio::Grep->new('Vmatch');
  $sbe->generate_database({
    file        => 'ATH1.cdna',
    datapath    => 'data',
    description => 'AGI Transcripts',
  });
Now, in a second script:
  my $sbe = Bio::Grep->new('Vmatch');
  $sbe->settings->datapath('data');
  my %local_dbs_description = $sbe->get_databases();
  my @local_dbs = sort keys %local_dbs_description;
Alternatively, you can use L<bgrep> which is part of this distribution:
  bgrep --backend Vmatch --database TAIR6_cdna_20060907 --datapath data --createdb
=head2 SEARCH SETTINGS
All search settings are stored in the L<Bio::Grep::SearchSettings>
object of the back-end:
  $sbe->settings
To set an option, call:
  $sbe->settings->optionname(value)
For example:
  $sbe->settings->datapath('data');
  # take first available database
  $sbe->settings->database($local_dbs[0]);
  $sbe->settings->query('AGAGCCCT');
See the documentation of your back-end for available options.
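When several options come from one place, such as a configuration hash, the accessor calls shown above can be driven by a loop. The following is only a minimal sketch: C<apply_settings()> is a hypothetical helper and not part of C<Bio::Grep>, and it assumes that every option is a real accessor method on the settings object, so that C<can()> is able to find it.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::Grep): apply option => value
# pairs using the accessor style $settings->optionname(value).
sub apply_settings {
    my ( $settings, %option ) = @_;
    for my $name ( sort keys %option ) {
        # Assumption: options are real methods, so can() finds them.
        # Unknown names are rejected instead of being called blindly.
        my $setter = $settings->can($name)
            or die "Unknown search setting: $name\n";
        $settings->$setter( $option{$name} );
    }
    return $settings;
}
```

With a back-end object at hand, this could be called as C<apply_settings( $sbe-E<gt>settings, datapath =E<gt> 'data', query =E<gt> 'AGAGCCCT' )>.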
=head2 SEARCH
To start the back-end with the specified settings, simply call:
  $sbe->search();
lib/Bio/Grep.pm view on Meta::CPAN
  } catch Bio::Root::SystemException with {
      my $E = shift;
      print STDERR 'Back-end call failed: ' .
          $E->{'-text'} . ' (' . $E->{'-line'} . ")\n";
      exit(1);
  } catch Bio::Root::BadParameter with {
      my $E = shift;
      print STDERR 'Wrong Settings: ' .
          $E->{'-text'} . ' (' . $E->{'-line'} . ")\n";
      exit(1);
  } otherwise {
      my $E = shift;
      print STDERR "An unexpected exception occurred: \n$E";
      exit(1);
  };
C<Bio::Grep> throws a C<SystemException> when a system() call fails and a
C<BadParameter> exception whenever it detects a problem in the settings.
Be aware that C<Bio::Grep> does not catch every invalid setting. In such a case,
the back-end call will fail and this module will throw a C<SystemException>.
Whenever it is not possible to open, copy, close, delete or
write a file, croak() (L<Carp>) is called.
See L<Bio::Root::Exception>, L<Carp>.
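The same three cases (back-end failure, bad settings, anything else) can also be told apart with a plain C<eval> block instead of try/catch syntax. This is a sketch of that alternative technique, not the module's own recommendation. It assumes the exceptions are objects of the classes named above; C<classify_error()> and C<run_search()> are hypothetical helpers.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical helper: map an exception to a short label.
sub classify_error {
    my ($err) = @_;
    if ( blessed $err ) {
        return 'backend'  if $err->isa('Bio::Root::SystemException');
        return 'settings' if $err->isa('Bio::Root::BadParameter');
    }
    return 'other';    # for example a plain string from croak()
}

# Run a code reference; return 'ok' or the label of the failure.
sub run_search {
    my ($code) = @_;
    my $ok = eval { $code->(); 1 };
    return 'ok' if $ok;
    return classify_error($@);
}
```

A caller could then write C<my $status = run_search( sub { $sbe-E<gt>search() } );> and decide what to report based on C<$status>.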
=head1 SECURITY
Using this module in exposed environments (web services, for example) should be quite secure. All
tests run in taint mode. C<Bio::Grep> checks the settings before it generates the string
for the system() call and uses L<File::Temp> for all temporary files.
However, keep in mind that without further checks of the settings it is quite B<easy to start a query that will run
forever>, especially with the C<RE>
back-end. You should therefore limit C<mismatches>, C<query_length> and all
other settings that have a significant impact on the calculation time.
You should also set C<maxhits>.
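As an illustration of such limits, a web service could check user input before it touches the settings object at all. The sketch below is an assumption-laden example: C<check_request()>, the C<%LIMIT> values and the restriction of the query to nucleotide letters are hypothetical and not part of C<Bio::Grep>. The letter check would not suit the C<RE> back-end, where the query is a regular expression.

```perl
use strict;
use warnings;

# Illustrative limits only; choose values that suit your data and hardware.
my %LIMIT = ( query_chars => 50, mismatches => 3, maxhits => 1000 );

# Hypothetical pre-check for user-supplied values (not part of Bio::Grep).
# Returns a list of problems; an empty list means the request passed.
sub check_request {
    my (%arg) = @_;
    my @problem;

    my $query = defined $arg{query} ? $arg{query} : '';
    push @problem, 'query must consist of the letters A, C, G, T, U or N'
        if $query !~ /\A[ACGTUN]+\z/i;
    push @problem, "query is longer than $LIMIT{query_chars} characters"
        if length($query) > $LIMIT{query_chars};

    for my $name (qw(mismatches maxhits)) {
        my $value = $arg{$name};
        next if !defined $value;
        if ( $value !~ /\A[0-9]+\z/ ) {
            push @problem, "$name must be a non-negative integer";
        }
        elsif ( $value > $LIMIT{$name} ) {
            push @problem, "$name must not exceed $LIMIT{$name}";
        }
    }
    return @problem;
}

my @errors = check_request( query => 'AGAGCCCT', mismatches => 1, maxhits => 100 );
print @errors ? "Rejected: @errors\n" : "Request accepted\n";
```

Only values that pass such a check would then be handed to C<$sbe-E<gt>settings>.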
=head1 INCOMPATIBILITIES
None reported.
=head1 BUGS AND LIMITATIONS
No bugs have been reported.
There is not yet a convenient interface for searching with multiple queries. However,
C<Vmatch> and C<GUUGle> support this feature, so you can generate a Fasta query file
with L<Bio::SeqIO> and then set C<$sbe-E<gt>settings-E<gt>query_file()>. To
find out which query a match belongs to, check C<$res-E<gt>query>.
It is likely that C<$sbe-E<gt>settings-E<gt>query> will be renamed to C<queries()>.
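Once a query file is in use, the hits can be sorted by the query they belong to. The following is a minimal sketch under two assumptions: the back-end object returns hits one at a time from C<next_res()>, and C<$res-E<gt>query> returns a sequence object with an C<id()> method. C<hits_by_query()> is a hypothetical helper, not part of C<Bio::Grep>.

```perl
use strict;
use warnings;

# Hypothetical helper: group hits by the ID of their query.
# Assumes $sbe->next_res() yields one hit per call and that
# $res->query is a sequence object offering id().
sub hits_by_query {
    my ($sbe) = @_;
    my %hits;
    while ( my $res = $sbe->next_res ) {
        push @{ $hits{ $res->query->id } }, $res;
    }
    return \%hits;
}
```

After setting C<query_file()> and calling C<$sbe-E<gt>search()>, C<hits_by_query($sbe)> would return a hash reference that maps each query ID to an array of its hits.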
Please report any bugs or feature requests to
C<bug-bio-grep@rt.cpan.org>, or through the web interface at
L<http://rt.cpan.org>.
=head1 SEE ALSO
L<Bio::Grep::Cookbook>
L<Bio::Grep::Backend::BackendI>
L<Bio::Grep::Backend::Vmatch>
L<Bio::Grep::Backend::GUUGle>
L<Bio::Grep::Backend::RE>
L<Bio::Grep::Backend::Agrep>
L<Bio::Grep::Benchmarks>
=head2 PUBLICATIONS
C<GUUGle>: L<http://bioinformatics.oxfordjournals.org/cgi/content/full/22/6/762>
=head1 AUTHOR
Markus Riester, E<lt>mriester@gmx.deE<gt>
=head1 LICENSE AND COPYRIGHT
Copyright (C) 2007-2009 by M. Riester.
Based on Weigel::Search v0.13, Copyright (C) 2005-2006 by Max Planck
Institute for Developmental Biology, Tuebingen.
This module is free software; you can redistribute it and/or
modify it under the same terms as Perl itself.
=head1 DISCLAIMER OF WARRANTY
BECAUSE THIS SOFTWARE IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
FOR THE SOFTWARE, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
PROVIDE THE SOFTWARE "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER
EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE
ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE SOFTWARE IS WITH
YOU. SHOULD THE SOFTWARE PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL
NECESSARY SERVICING, REPAIR, OR CORRECTION.
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
REDISTRIBUTE THE SOFTWARE AS PERMITTED BY THE ABOVE LICENSE, BE
LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL,
OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE
THE SOFTWARE (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING
RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A
FAILURE OF THE SOFTWARE TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF
SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
SUCH DAMAGES.
=cut
# vim: ft=perl sw=4 ts=4 expandtab