Bio-Sampling-Valection

lib/Bio/Sampling/Valection.pm

package Bio::Sampling::Valection;

use strict;
use warnings;
use Carp;

=head1 NAME

Bio::Sampling::Valection - Sampler for verification

=head1 VERSION

Version 1.0.1

=cut

our $VERSION = '1.0.1';

=head1 DESCRIPTION

Bio::Sampling::Valection - Sampler for verification

Bio::Sampling::Valection contains a variety of algorithms for choosing verification candidates from competing tools or
parameterizations, to fairly assess their performance against each other.

Originally created to select from single nucleotide variant (SNV) calls generated by SNV mutation
calling algorithms, this software can easily be extended to other mutation types (e.g. structural
variants, gene fusions, etc.) provided the data is formatted correctly.

This software requires the valection package (http://labs.oicr.on.ca/boutros-lab/software/valection).

=head1 SYNOPSIS

There are six selection methods available through six functions. They all take the following arguments:

- **budget**: an integer specifying how many candidates to select

- **infile**: a path to a file which contains the calls from all callers. The infile should be formatted with a tab separating the caller and call on each line:

	caller1 name\ta call this caller made
	caller2 name\ta call this caller made

e.g.

	magnifying glass	chr1 576834
	magnifying glass	chr1 6878924
	eye dropper	chr1 496267
	eye dropper	chr1 6878924

Note that the call may contain tabs, but the caller name may not.

- **outfile**: a path to the file where the selected calls should be written

- **seed** (optional): an integer to seed the random number generator with (used to randomize sampling)

	use Bio::Sampling::Valection;

	# Run the sampling to select 10 candidates
	run_equal_per_caller(10, "/home/me/calls.valec", "/home/me/selections.txt", 50);
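
The infile layout described above can be produced from Perl data with a few lines of code. The following is a minimal sketch, not part of Bio::Sampling::Valection: the helper name C<write_calls_file> and the hash-of-arrays input are assumptions made for illustration, and the example writes to a temporary file rather than a real calls file.

```perl
	use strict;
	use warnings;
	use File::Temp qw(tempfile);

	# Hypothetical helper (not part of this module): writes one
	# "caller<TAB>call" line per call, matching the infile layout above.
	sub write_calls_file {
		my ($path, $calls_by_caller) = @_;

		open(my $fh, '>', $path) or die "Cannot open $path: $!";
		for my $caller (sort keys %{$calls_by_caller}) {
			# Everything before the first tab is read as the caller name,
			# so a caller name must not contain a tab itself.
			die "Caller name '$caller' contains a tab\n" if $caller =~ /\t/;
			for my $call (@{ $calls_by_caller->{$caller} }) {
				print {$fh} "$caller\t$call\n";
			}
		}
		close($fh) or die "Cannot close $path: $!";
		return;
	}

	# Write the example calls shown above to a temporary file.
	my ($tmp_fh, $infile) = tempfile(SUFFIX => '.valec', UNLINK => 1);
	close($tmp_fh);
	write_calls_file($infile, {
		'magnifying glass' => ['chr1 576834', 'chr1 6878924'],
		'eye dropper'      => ['chr1 496267', 'chr1 6878924'],
	});
```

The resulting path could then be passed as the infile argument to any of the six functions.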


=cut

use Exporter 'import';
our @EXPORT = qw(run_directed_sampling run_random_sampling run_equal_per_caller run_equal_per_overlap run_increasing_with_overlap run_decreasing_with_overlap);

=head1 FUNCTIONS

The functions are named as follows:

=over

=item run_directed_sampling

=item run_random_sampling

=item run_equal_per_caller

=item run_equal_per_overlap

=item run_increasing_with_overlap

=item run_decreasing_with_overlap

=back
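
The functions whose source is shown in this file croak when the valection executable cannot be found, so a calling script may want to check for it up front. The following is a minimal sketch using only core modules; the helper name C<find_on_path> is an assumption made for illustration and is not part of this module, which performs its own check internally.

```perl
	use strict;
	use warnings;
	use File::Spec;

	# Hypothetical helper (not part of this module): returns the full path
	# of the first executable file named $program found on PATH, or undef.
	sub find_on_path {
		my ($program) = @_;
		for my $dir (File::Spec->path()) {
			my $candidate = File::Spec->catfile($dir, $program);
			return $candidate if -f $candidate && -x _;
		}
		return;
	}

	my $valection = find_on_path('valection');
	if (defined $valection) {
		print "valection found at $valection\n";
	}
	else {
		print "valection not found on PATH; install it before sampling\n";
	}
```

Checking before the call lets a script report the missing dependency in its own words instead of relying on the croak message.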

=cut

sub run_directed_sampling {
	my $budget = shift;
	my $infile = shift;
	my $outfile = shift;
	my $seed = shift;

	# Make sure the external valection executable is on the PATH
	my $valection_path = `which valection`;
	if (! $valection_path) {
		croak(
			"You must install the valection library before using this package.\n" .
			"It is available at labs.oicr.on.ca/boutros-lab/software/valection.\n"
			);
		}

	# The seed is optional; pass nothing extra if it was not given
	if (! defined $seed) {
		$seed = "";
		}

	# Run valection with the method argument 'v' and echo its output
	my $output_string = `valection $budget v '$infile' '$outfile' $seed`;
	print($output_string);
	}

sub run_random_sampling {
	my $budget = shift;
	my $infile = shift;
	my $outfile = shift;
	my $seed = shift;

	my $valection_path = `which valection`;
	if (! $valection_path) {
		croak(
			"You must install the valection library before using this package.\n" .


