Bio-Community


lib/Bio/Community/Tools/Rarefier.pm

by sampling artefacts. When comparing two identical communities, one sampled
with 10,000 counts and the other with only 1,000 counts, the smaller community
will appear less diverse. A solution is to repeatedly bootstrap the larger
community by taking 1,000 random members from it.

This module uses L<Bio::Community::Sampler> to take random members from communities
and normalize them by their number of counts. After all random repetitions have
been performed, average communities or representative communities are returned.
These communities all have the same number of counts.
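
The repeated subsampling described above can be illustrated with a minimal,
standalone sketch. It does not use this module or L<Bio::Community::Sampler>;
the C<rarefy> helper and the species counts are hypothetical, and it only
assumes that a community can be represented as a hash of member counts.

```perl
use strict;
use warnings;
use List::Util qw(shuffle sum);

# Hypothetical helper (not part of Bio::Community): rarefy a community given
# as { member => count } by drawing $sample_size members without replacement,
# $num_repetitions times, and averaging the counts over all repetitions.
sub rarefy {
    my ($counts, $sample_size, $num_repetitions) = @_;

    # Expand the counts into a pool holding one entry per individual
    my @pool = map { ($_) x $counts->{$_} } sort keys %{$counts};
    die "Sample size exceeds community size\n" if $sample_size > @pool;

    my %total;
    for my $rep (1 .. $num_repetitions) {
        # Shuffle the pool and keep the first $sample_size individuals
        my @picked = (shuffle @pool)[0 .. $sample_size - 1];
        $total{$_}++ for @picked;
    }

    # Turn summed counts into per-repetition averages
    $_ /= $num_repetitions for values %total;
    return \%total;
}

my %large = (A => 6000, B => 3000, C => 1000);   # 10,000 counts in total
my $avg   = rarefy(\%large, 1000, 10);
printf "%s: %.1f\n", $_, $avg->{$_} for sort keys %{$avg};
```

The averaged counts always add up to the sample size (1,000 here), which is
what makes the rarefied communities comparable to each other.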

=head1 AUTHOR

Florent Angly L<florent.angly@gmail.com>

=head1 SUPPORT AND BUGS

User feedback is an integral part of the evolution of this and other Bioperl
modules. Please direct usage questions or support issues to the mailing list, 
L<bioperl-l@bioperl.org>, rather than to the module maintainer directly. Many
experienced and responsive experts will be able to look at the problem and quickly
address it. Please include a thorough description of the problem with code and
data examples if at all possible.

If you have found a bug, please report it on the BioPerl bug tracking system
to help us keep track of the bugs and their resolution:
L<https://redmine.open-bio.org/projects/bioperl/>

=head1 COPYRIGHT

Copyright 2011-2014 by Florent Angly <florent.angly@gmail.com>

This library is free software; you can redistribute it and/or modify
it under the same terms as Perl itself, either Perl version 5.10.1 or,
at your option, any later version of Perl 5 you may have available.

=head1 APPENDIX

The rest of the documentation details each of the object
methods. Internal methods are usually preceded by an underscore (_).

=head2 new

 Function: Create a new Bio::Community::Tools::Rarefier object
 Usage   : my $rarefier = Bio::Community::Tools::Rarefier->new( );
 Args    : -metacommunity  : see metacommunity()
           -num_repetitions: see num_repetitions()
           -threshold      : see threshold()
           -sample_size    : see sample_size()
           -drop           : see drop()
           -seed           : see set_seed()
 Returns : a new Bio::Community::Tools::Rarefier object

=cut


package Bio::Community::Tools::Rarefier;

use Moose;
use MooseX::NonMoose;
use MooseX::StrictConstructor;
use namespace::autoclean;
use Bio::Community::Meta;
use Bio::Community::Meta::Beta;
use List::Util qw(min);
use Method::Signatures;

use POSIX; # defines DBL_EPSILON to something like 2.22044604925031e-16
use constant REL_EPSILON => 1 + 10 * DBL_EPSILON;

extends 'Bio::Root::Root';
with 'Bio::Community::Role::PRNG';


=head2 metacommunity

 Function: Get or set the metacommunity to normalize.
 Usage   : my $meta = $rarefier->metacommunity;
 Args    : A Bio::Community::Meta object
 Returns : A Bio::Community::Meta object

=cut

has metacommunity => (
   is => 'rw',
   isa => 'Maybe[Bio::Community::Meta]',
   required => 0,
   default => undef,
   lazy => 1,
   init_arg => '-metacommunity',
   trigger => sub { $_[0]->_clear_avg_meta; $_[0]->_clear_repr_meta },
);


=head2 sample_size

 Function: Get or set the sample size, i.e. the number of members to pick
           randomly at each iteration. It has to be smaller than or equal to the
           total count of the smallest community or an error will be generated.
           If the sample size is omitted, it defaults to the get_members_count()
           of the smallest community.
 Usage   : my $sample_size = $rarefier->sample_size;
 Args    : integer for the sample size
 Returns : integer for the sample size
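
The default and the upper bound described above can be sketched as follows.
This is an illustration only: C<resolve_sample_size> is a hypothetical helper,
not the module's internal code, and communities are assumed to be plain hashes
of member counts.

```perl
use strict;
use warnings;
use List::Util qw(min sum);

# Hypothetical helper (not part of Bio::Community): pick the sample size for
# a list of communities, each given as { member => count }.
sub resolve_sample_size {
    my ($communities, $requested) = @_;

    # Total count of each community; the smallest total is the upper bound
    my @totals   = map { sum( 0, values %{$_} ) } @{$communities};
    my $smallest = min @totals;

    # No explicit sample size: default to the smallest community's count
    return $smallest if not defined $requested;

    die "Sample size $requested exceeds smallest community ($smallest)\n"
        if $requested > $smallest;
    return $requested;
}

my @communities = ( { A => 10, B => 5 }, { A => 3, C => 4 } );
print resolve_sample_size(\@communities), "\n";      # smallest total: 3 + 4
print resolve_sample_size(\@communities, 5), "\n";   # explicit, within bound
```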

=cut

has sample_size => (
   is => 'rw',
   isa => 'Maybe[PositiveInt]',
   required => 0,
   default => undef,
   lazy => 1,
   init_arg => '-sample_size',
   trigger => sub { $_[0]->_clear_avg_meta; $_[0]->_clear_repr_meta },
);


=head2 threshold

 Function: Get or set the threshold. While iterating, when the beta diversity or
           distance between the average community and the average community at


