FASTAid

lib/FASTAid.pm

    foreach my $seq ( @{$seq_array_ref} ) {

        # ...do something with each FASTA sequence...
    }


=head1 DESCRIPTION

FASTAid indexes files containing FASTA sequence records and allows quick
random-access retrieval of one or more FASTA sequences.

FASTAid writes the index to a file with the suffix '.fec'.
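
The idea behind a byte-offset index can be sketched as follows. This is a
minimal illustration, not FASTAid's implementation or its '.fec' file format:
C<build_offset_index> and C<fetch_record> are hypothetical helpers, and the
index is kept in memory rather than written to disk.

```perl
use strict;
use warnings;
use Carp;

# Hypothetical helper (not part of FASTAid): map each record ID to the byte
# offset of its header line.
sub build_offset_index {
    my ($fh) = @_;
    my %offset_of;
    while (1) {
        my $pos  = tell $fh;    # offset of the line about to be read
        my $line = <$fh>;
        last unless defined $line;
        next unless $line =~ /^>(\S+)/;
        croak "There is already an entry $1" if exists $offset_of{$1};
        $offset_of{$1} = $pos;
    }
    return \%offset_of;
}

# Hypothetical helper: seek to a stored offset and read one whole record.
sub fetch_record {
    my ( $fh, $offset ) = @_;
    seek( $fh, $offset, 0 ) or croak "seek failed: $!";
    my $record = <$fh>;         # header line
    while ( defined( my $line = <$fh> ) ) {
        last if $line =~ /^>/;  # the next record begins here
        $record .= $line;
    }
    return $record;
}

# An in-memory filehandle stands in for a FASTA file on disk.
my $fasta = ">seq1 first\nACGT\nTTGA\n>seq2 second\nGGCC\n";
open( my $fh, '<', \$fasta ) or die "cannot open in-memory FASTA: $!";

my $index = build_offset_index($fh);
print fetch_record( $fh, $index->{seq2} );
```

Because each lookup is a single seek followed by a short read, retrieval cost
does not depend on where the record sits in the file.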


=head1 DIAGNOSTICS

=over

=item C<< could not open FASTA file >>

A file could not be opened. Most likely the path you supplied is wrong or you do not
have permission to read the file.

=item C<< There is already an entry ID >>

The same identifier appears more than once in the FASTA file you supplied. This is a fatal
error because FASTAid uses the identifier to index the position of the sequence.

=item C<< Cannot write FASTAid index >>

The index could not be written. This is a file system error; most likely you do not have
permission to write in the directory that contains the FASTA file.

=item C<< Must supply at least one ID >>

No identifiers were supplied as arguments to C<retrieve_entry>. FASTAid uses the
identifier as the lookup key, so it cannot retrieve an entry without one.

=item C<< Entry ID = <id> not found! >>

An identifier could not be found in the index. This is a warning, not a fatal error,
because if other identifiers are supplied to retrieve_entry, those sequences will be
returned even if others fail.

There are two common causes for this warning: either the index is out of date and does
not contain the identifier, or the identifier was misspelled when attempting the lookup.

=back
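
The difference between the fatal diagnostics and the warning matters to the
caller: a fatal error has to be trapped with C<eval>, while a warning can be
collected through C<$SIG{__WARN__}>. The sketch below shows both. It assumes
the diagnostics are raised with Carp's C<croak> and C<carp>, and C<lookup_ids>
is a hypothetical stand-in that only mimics the messages above; it is not
FASTAid's C<retrieve_entry>.

```perl
use strict;
use warnings;
use Carp;

# Hypothetical stand-in for a lookup routine; it only mimics the diagnostics
# described above and is not FASTAid's implementation.
sub lookup_ids {
    my ( $index, @ids ) = @_;
    croak 'Must supply at least one ID' unless @ids;
    my @found;
    for my $id (@ids) {
        if ( exists $index->{$id} ) { push @found, $index->{$id} }
        else                        { carp "Entry ID = $id not found!" }
    }
    return \@found;
}

my %index = ( seq1 => 'ACGT', seq2 => 'GGCC' );

# Collect warnings instead of letting them go to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

# A missing ID only warns; the other sequences still come back.
my $found = lookup_ids( \%index, 'seq1', 'seqX', 'seq2' );

# A fatal diagnostic has to be trapped with eval.
my $ok    = eval { lookup_ids( \%index ); 1 };
my $fatal = $ok ? '' : $@;
```

After this runs, C<$found> holds the two sequences that were present,
C<@warnings> holds the single "not found" message, and C<$fatal> holds the
text of the fatal error.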

=head1 DEPENDENCIES

L<version>


=head1 INCOMPATIBILITIES

None reported.


=head1 BUGS AND LIMITATIONS

No bugs have been reported.

Please report any bugs or feature requests to
C<bug-FASTAid@rt.cpan.org>, or through the web interface at
L<http://rt.cpan.org>.


=head1 AUTHOR

 Jarret Glasscock C<< <glasscock_cpan@mac.com> >>
 Dave Messina C<< <dave-pause@davemessina.net> >>


=head1 ACKNOWLEDGMENTS

 This software was developed at the Genome Sequencing Center at Washington
 University, St. Louis, MO.


=head1 COPYRIGHT

 Copyright (C) 2004-6 Glasscock, Messina. All Rights Reserved.


=head1 DISCLAIMER

 This software is provided "as is" without warranty of any kind.


=cut


# PRAGMAS
use strict;
use warnings;

# INCLUDES
use Carp;


=head2 create_index

 Usage    : create_index($my_fasta_file) or die "index was not created";
 Function : creates a byte-offset index file recording the positions of FASTA-formatted entries
 Returns  : true if the index was created, false otherwise
 Args     : a single argument, the path to a FASTA file

=cut

sub create_index {
    my ($fasta) = @_;
    my $index = $fasta . '.fec';

    my ( %data, $begin, $id );
    # three-argument open, so the file name is never parsed as a mode
    open( DB, '<', $fasta )
        or croak( qq{could not open FASTA file $fasta:\n}, qq{$!\n} );

    # record offsets of records in perl database
    while (<DB>) {


