Bio-Sampling-Valection
lib/Bio/Sampling/Valection.pm
package Bio::Sampling::Valection;
use strict;
use warnings;
use Carp;
=head1 NAME
Bio::Sampling::Valection
=head1 VERSION
Version 1.0.1
=cut
our $VERSION = '1.0.1';
=head1 DESCRIPTION
Bio::Sampling::Valection - Sampler for verification
Bio::Sampling::Valection contains a variety of algorithms for choosing verification candidates from competing tools or
parameterizations, to fairly assess their performance against each other.
Originally created to select from single nucleotide variant (SNV) calls generated by SNV mutation
calling algorithms, this software can easily be extended to other mutation types (e.g. structural
variants, gene fusions, etc.) provided the data is formatted correctly.
This software requires the valection package (http://labs.oicr.on.ca/boutros-lab/software/valection).
=head1 SYNOPSIS
There are six selection methods available through six functions. They all take the following arguments:
- **budget**: an integer specifying how many candidates to select
- **infile**: a path to a file which contains the calls from all callers. The infile should be formatted with a tab separating the caller and call on each line:
caller1 name\ta call this caller made
caller2 name\ta call this caller made
e.g. (with the callers "magnifying glass" and "eye dropper", each followed by a tab and then the call)
magnifying glass chr1 576834
magnifying glass chr1 6878924
eye dropper chr1 496267
eye dropper chr1 6878924
Note that the call can contain a tab, but the caller may not.
- **outfile**: a path to the file where the selected calls should be written
- **seed** (optional): an integer to seed the random number generator with (used to randomize sampling)
    use Bio::Sampling::Valection;
    # Run the sampling to select 10 candidates
    run_equal_per_caller(10, "/home/me/calls.valec", "/home/me/selections.txt", 50);
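The infile format described above can be illustrated with a minimal sketch. The helpers C<format_call_line> and C<parse_call_line> are hypothetical and not part of this module; they only show that a line is a caller name, a tab, and the call, and that a reader should split on the first tab only because the call itself may contain tabs.

```perl
use strict;
use warnings;

# Hypothetical helpers, not part of Bio::Sampling::Valection.

# Build one infile line: caller name, a tab, then the call.
sub format_call_line {
    my ($caller, $call) = @_;
    die "caller name must not contain a tab\n" if $caller =~ /\t/;
    return "$caller\t$call\n";
}

# Split a line on the first tab only, so tabs inside the call survive.
sub parse_call_line {
    my ($line) = @_;
    chomp $line;
    my ($caller, $call) = split /\t/, $line, 2;
    return ($caller, $call);
}

# Assumed example: a call made of a chromosome and a position joined by a tab.
my $line = format_call_line('eye dropper', "chr1\t6878924");
my ($caller, $call) = parse_call_line($line);
print "caller=$caller call=$call\n";
```

Limiting C<split> to two fields is the design choice that matters here: an unlimited split would break a call such as the one above into separate pieces.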
=cut
use Exporter 'import';
our @EXPORT = qw(run_directed_sampling run_random_sampling run_equal_per_caller run_equal_per_overlap run_increasing_with_overlap run_decreasing_with_overlap);
=head1 FUNCTIONS
The functions are named as follows:
=over
=item run_directed_sampling
=item run_random_sampling
=item run_equal_per_caller
=item run_equal_per_overlap
=item run_increasing_with_overlap
=item run_decreasing_with_overlap
=back
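Each function wraps a call to the external valection executable. As a hedged sketch only, the command can be assembled as a list so that paths containing spaces or quotes need no shell quoting. The helper C<build_valection_command> is hypothetical; the positional argument order and the C<v> method argument are taken from C<run_directed_sampling> in this file, and the check that the budget is a non-negative integer is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::Sampling::Valection.
# Returns the argument list: valection <budget> <method> <infile> <outfile> [seed]
sub build_valection_command {
    my ($budget, $method, $infile, $outfile, $seed) = @_;
    # Assumption: the budget must be a non-negative integer.
    die "budget must be an integer\n"
        unless defined $budget && $budget =~ /^[0-9]+$/;
    my @cmd = ('valection', $budget, $method, $infile, $outfile);
    push @cmd, $seed if defined $seed;    # the seed is optional
    return @cmd;
}

my @cmd = build_valection_command(
    10, 'v', '/home/me/calls.valec', '/home/me/selections.txt', 50
);
print join(' ', @cmd), "\n";

# With valection installed, the list could be run without a shell:
# system(@cmd) == 0 or die "valection failed: $?\n";
```

Passing a list to C<system> bypasses the shell, which avoids the manual single quoting that the backtick form relies on.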
=cut
sub run_directed_sampling {
    my $budget = shift;
    my $infile = shift;
    my $outfile = shift;
    my $seed = shift;

    # Check that the valection executable can be found on the PATH
    my $valection_path = `which valection`;
    if (! $valection_path) {
        croak(
            "You must install the valection library before using this package.\n" .
            "It is available at labs.oicr.on.ca/boutros-lab/software/valection.\n"
        );
    }

    # The seed is optional; pass nothing when it is not given
    if (! defined $seed) {
        $seed = "";
    }

    # Run valection with the 'v' method argument and echo its output
    my $output_string = `valection $budget v '$infile' '$outfile' $seed`;
    print($output_string);
}
sub run_random_sampling {
    my $budget = shift;
    my $infile = shift;
    my $outfile = shift;
    my $seed = shift;
    my $valection_path = `which valection`;
    if (! $valection_path) {
        croak(
            "You must install the valection library before using this package.\n" .