Bio-Kmer


The BioPerl way

  use strict;
  use warnings;
  use Bio::SeqIO;
  use Bio::Kmer;

  # Load up any Bio::SeqIO object. Quality values will be
  # faked internally to help with compatibility even if
  # a fastq file is given.
  my $seqin = Bio::SeqIO->new(-file=>"input.fasta");
  my $kmer=Bio::Kmer->new($seqin);
  my $kmerHash=$kmer->kmers();
  my $countOfCounts=$kmer->histogram();

=head1 DESCRIPTION

A module for helping with kmer analysis. The basic methods help count kmers and can produce a count of counts.  Currently this module only supports fastq format.  Although this module can count kmers with pure perl, it is recommended to give the opti...
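
To make "count kmers" and "count of counts" concrete, here is a minimal pure Perl sketch. It is not the module's implementation: the helper names C<count_kmers> and C<count_of_counts> are hypothetical, and the sketch ignores quality values, reverse complements, threading, and file parsing.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::Kmer): count every
# overlapping kmer of length $k in a list of sequence strings.
sub count_kmers {
  my ($seqs, $k) = @_;
  my %count;
  for my $seq (@$seqs) {
    my $last = length($seq) - $k;   # last valid start position
    for my $i (0 .. $last) {        # empty range if the sequence is shorter than $k
      $count{ uc substr($seq, $i, $k) }++;
    }
  }
  return \%count;
}

# Hypothetical helper: build a count of counts, where $hist->[$n]
# is the number of distinct kmers observed exactly $n times.
sub count_of_counts {
  my ($count) = @_;
  my @hist = (0);
  $hist[$_]++ for values %$count;
  $_ //= 0 for @hist;               # fill gaps so every index is defined
  return \@hist;
}

my $count = count_kmers(["ACGTACGT", "ACGT"], 3);
my $hist  = count_of_counts($count);

printf "%s\t%d\n", $_, $count->{$_} for sort keys %$count;
printf "%d kmers seen %d time(s)\n", $hist->[$_], $_ for 1 .. $#$hist;
```

In this toy input, C<ACG> and C<CGT> each occur three times and C<GTA> and C<TAC> once, so the count of counts has two kmers at frequency 1 and two at frequency 3.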

=head1 DEPENDENCIES

  * BioPerl
  * Jellyfish >=2
  * Perl threads
  * Perl >=5.10

=head1 VARIABLES

=over

=item $Bio::Kmer::iThreads

Boolean indicating whether the module is using threads.

=back

=head1 METHODS

=over

=item Bio::Kmer->new($filename, \%options)

Create a new instance of the kmer counter.  One object per file. 

  Filename can be either a file path or a Bio::SeqIO object.

  Applicable arguments for \%options:
  Argument     Default    Description
  kmercounter  perl       Which kmer counting software to use.
                          Choices: Perl, Jellyfish.
  kmerlength|k 21         Kmer length.
  numcpus      1          Number of CPUs to use. The pure Perl
                          counter uses Perl multithreading;
                          otherwise the value is passed to
                          external software such as Jellyfish.
  gt           1          Ignore any kmer whose count is lower
                          than this value. This can speed up
                          analysis if you do not care about
                          low-count kmers.
  sample       1          Retain only this fraction of kmers:
                          1 is 100%; 0 is 0%.
                          Only works with the Perl kmer counter.
  verbose      0          Print more messages.

  Examples:
  my $kmer=Bio::Kmer->new("file.fastq.gz",{kmercounter=>"jellyfish",numcpus=>4});

=back
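
The C<gt> and C<sample> options described above can be pictured with the following sketch. It assumes, from the option table alone, that C<gt> drops kmers whose count is lower than the threshold and that C<sample> keeps each kmer with the given probability. The helper names are hypothetical and the module's internals may differ.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::Kmer): keep only kmers
# whose count is at least $gt, i.e. ignore counts lower than $gt.
sub filter_by_count {
  my ($count, $gt) = @_;
  my %kept = map  { $_ => $count->{$_} }
             grep { $count->{$_} >= $gt }
             keys %$count;
  return \%kept;
}

# Hypothetical helper: keep each kmer with probability $sample.
# rand() returns a value in [0, 1), so 1 keeps all and 0 keeps none.
sub subsample {
  my ($kmers, $sample) = @_;
  return [ grep { rand() < $sample } @$kmers ];
}

my $kept = filter_by_count({ AAA => 1, CCC => 5, GGG => 2 }, 2);
print join(",", sort keys %$kept), "\n";

my $all  = subsample([qw(AAA CCC GGG)], 1);
my $none = subsample([qw(AAA CCC GGG)], 0);
print scalar(@$all), " kept, ", scalar(@$none), " kept\n";
```

With the default C<gt> of 1, every kmer that was counted at least once passes the filter, which matches the table's default behaviour of retaining everything.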


=cut

=pod

=over

=item $kmer->ntcount()

Returns the number of base pairs counted.
In some cases, such as when counting with Jellyfish,
that number is not calculated directly; instead, it
is derived from the total length of the kmers.
Internally, this number is stored as $kmer->{_ntcount}.

Note: internally runs $kmer->histogram() if
$kmer->{_ntcount} has not yet been set.

  Arguments: None
  Returns:   integer

=back
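
The note above describes a compute-on-first-use pattern, sketched below with a hypothetical class C<My::LazyCounter>. Reading "total length of kmers" as the sum of count times kmer length is an assumption made for this illustration, not a statement about how Bio::Kmer computes the value.

```perl
use strict;
use warnings;

package My::LazyCounter;   # hypothetical class, for illustration only

sub new {
  my ($class, %count) = @_;
  return bless { count => {%count}, histogram_runs => 0 }, $class;
}

# One pass over the counts. As a side effect it stores the base total,
# assumed here to be the sum of (count * kmer length).
sub histogram {
  my ($self) = @_;
  $self->{histogram_runs}++;
  my @hist = (0);
  my $nt   = 0;
  for my $kmer (keys %{ $self->{count} }) {
    my $n = $self->{count}{$kmer};
    $hist[$n]++;
    $nt += $n * length($kmer);
  }
  $self->{_ntcount} = $nt;
  return \@hist;
}

# Run histogram() only if the cached value has not been set yet.
sub ntcount {
  my ($self) = @_;
  $self->histogram() unless defined $self->{_ntcount};
  return $self->{_ntcount};
}

package main;

my $c = My::LazyCounter->new(ACG => 3, GTA => 1);
print $c->ntcount(), "\n";   # first call triggers histogram()
print $c->ntcount(), "\n";   # second call reuses the cached value
```

The second call does not repeat the pass over the counts, which is the practical benefit of caching the value on the object.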


=cut

=pod

=over

=item $kmer->count()

Counts kmers. This method is called as soon as new() is called,
so you should never have to call it yourself.
The kmer counts are cached in RAM.

  Arguments: None
  Returns:   None

=back


=cut

=pod

=over

=item $kmer->clearCache

Clears kmer counts and histogram counts.  You should probably never use


