CLIPSeqTools


Conservation score for the read. The score is calculated as the average
phastCons score of all the nucleotides of the read. To minimize storage needs,
the phastCons conservation score is multiplied by 1000 to convert it from a
floating-point number to an integer.

=back
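
The conservation score described above can be illustrated with a minimal
sketch. It is written in Python for brevity and is not the suite's own code.
The function name, the use of rounding rather than truncation, and returning
C<None> for a read with no scores are all assumptions made for illustration.

```python
def scaled_conservation(phastcons_scores):
    """Average per-nucleotide phastCons scores and scale the result by 1000.

    Hypothetical helper: the name, the rounding rule and the handling of
    empty input are assumptions, not taken from CLIPSeqTools.
    """
    if not phastcons_scores:
        return None  # assumption: a read without scores has no conservation value
    mean = sum(phastcons_scores) / len(phastcons_scores)
    # Store as an integer on a 0-1000 scale instead of a float on a 0-1 scale.
    return round(mean * 1000)


print(scaled_conservation([0.5, 1.0]))
```

Under these assumptions, a stored value of 500 corresponds to an average
phastCons score of 0.5.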

=head2 C<clipseqtools>

=head3 Description

C<clipseqtools> is the main toolbox of the I<CLIPSeqTools> suite. It runs
analyses on a single dataset. It offers a wide selection of tools that cover
many aspects of a CLIP-Seq analysis pipeline.

=head3 Commands

Each command of C<clipseqtools> is designed to perform a well-defined task. To
invoke a command, use:

  clipseqtools <command>

C<clipseqtools> supports the following commands, which can be run
independently or as a predefined pipeline.

=over

=item 1. C<all>

Run all of the commands as a pipeline. This is probably the most common
choice unless you need fine-grained control over what is run.

=item 2. C<reads_long_gaps_size_distribution>

Measure the size distribution of long alignment gaps (e.g. alignments across
exon-exon junctions) produced by a gap-aware aligner.

=item 3. C<size_distribution>

Measure the size distribution for reads.

=item 4. C<cluster_size_and_score_distribution>

Assemble reads into clusters and measure the distribution of cluster sizes and
of the number of reads contained in each cluster.

=item 5. C<count_reads_on_genic_elements>

Count reads on transcripts, genes, exons and introns.

=item 6. C<distribution_on_genic_elements>

Measure how reads are distributed along the length of 5'UTR, CDS and 3'UTR.

=item 7. C<distribution_on_introns_exons>

Measure how reads are distributed along the length of exons and introns.

=item 8. C<genome_coverage>

Measure the percentage of the genome covered by reads.

=item 9. C<genomic_distribution>

Count reads on genes, repeats, exons, introns, 5'UTRs, ...

=item 10. C<nmer_enrichment_over_shuffled>

Measure the enrichment of N-mers within the reads over shuffled reads.

=item 11. C<nucleotide_composition>

Measure the nucleotide composition along reads.

=item 12. C<conservation_distribution>

Measure the number of reads at each conservation level.

=back

=head3 Running analysis on subsets of data

Since C<clipseqtools> relies on database tables, filtering the data and running
an analysis on a subset is straightforward. The user only needs to supply the
filtering criteria when executing a command. The syntax for the filtering
criteria is probably best explained with an example.

Example:

To run an analysis only on reads that are highly conserved (conservation score
above 500, i.e. an average phastCons score above 0.5), have a deletion and are
not repeats, add the following flags when running a command:

  --filter conservation=">500" --filter deletion="def" --filter rmsk="undef"

The supported operators for creating a filter are: C<< >, >=, <, <=, =, !=,
def, undef >>.
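
One way to picture how such filters could select rows from a database table is
the sketch below. It is a Python illustration under stated assumptions, not
the code that C<clipseqtools> runs. The function names are hypothetical, and
mapping C<def> and C<undef> to C<IS NOT NULL> and C<IS NULL>, joining several
filters with C<AND>, and using SQL placeholders are all assumptions. The shell
removes the quotes from the command line, so each filter arrives as a string
such as C<< conservation=>500 >>.

```python
import re

# Comparison operators listed in the text; two-character operators come first
# so that ">=" is not misread as ">" followed by "=".
_COMPARISON = re.compile(r"^(>=|<=|!=|>|<|=)\s*(.+)$")


def filter_to_sql(spec):
    """Translate one 'column=<expression>' filter into (sql, params).

    Hypothetical helper; the SQL mapping is an assumption of this sketch.
    """
    column, _, expression = spec.partition("=")
    if not re.fullmatch(r"\w+", column):
        raise ValueError(f"invalid column name: {column!r}")
    expression = expression.strip()
    if expression == "def":
        return f"{column} IS NOT NULL", []
    if expression == "undef":
        return f"{column} IS NULL", []
    match = _COMPARISON.match(expression)
    if match is None:
        raise ValueError(f"unsupported filter expression: {expression!r}")
    operator, value = match.groups()
    # Bind integers as integers so numeric comparisons behave as expected.
    if value.lstrip("-").isdigit():
        value = int(value)
    return f"{column} {operator} ?", [value]


def build_where(specs):
    """Combine several filters into one WHERE clause (assumed to be AND-ed)."""
    fragments, params = [], []
    for spec in specs:
        sql, values = filter_to_sql(spec)
        fragments.append(sql)
        params.extend(values)
    return " AND ".join(fragments), params


where, params = build_where(["conservation=>500", "deletion=def", "rmsk=undef"])
print(where, params)
```

Keeping the values as bound parameters rather than pasting them into the SQL
string is a design choice of the sketch; it avoids quoting problems when a
filter value contains unexpected characters.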


=head2 C<clipseqtools-compare>

=head3 Description

C<clipseqtools-compare> is a toolbox that can be used to compare CLIP-Seq
datasets with each other.

=head3 Commands

C<clipseqtools-compare> supports the following commands, which can be run
independently.

=over

=item 1. C<all>

Run all of the commands as a pipeline. This is probably the most common
choice unless you need fine-grained control over what is run.

=item 2. C<libraries_overlap_stats>


