Bio-ToolBox
file parent directory instead of the current directory.
v.1.10.2 (svn 591)
- Added a new position option for adjusting the coordinates of features
retrieved with the script get_features.pl. Coordinates may be adjusted
at the 5 prime, 3 prime, or both ends of stranded features. This also
fixes bugs where collected features on the reverse strand with adjusted
coordinates were not reported properly.
- Improved automatic recognition of the name, score, and other columns
in the converter scripts data2bed.pl, data2gff.pl, and data2wig.pl.
- Improved the Cluster and Treeview export function in script
manipulate_datasets.pl. The CDT files generated now include separate ID
and NAME columns per the specification, and new manipulations are
included prior to exporting, including percentile rank and log2.
- The convert null function in script manipulate_datasets.pl now also
converts zero values if requested.
- Added a new minimum size option for trimming windows in the script
find_enriched_regions.pl.
- Increased the radius from 35 bp to 50 bp when verifying a putative
mapped nucleosome in script map_nucleosomes.pl, leading to fewer
overlapping or offset nucleosomes.
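
The position option described above for get_features.pl depends on which genomic coordinate corresponds to the 5 prime or 3 prime end on each strand. The following is a minimal sketch of that idea, not the script's actual code: the helper name `adjust_feature`, the hash keys, and the convention that a positive value extends the feature outward are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not from get_features.pl).
# $feature  : hash ref with start, stop (start <= stop) and strand (1 or -1)
# $position : '5', '3', or 'both'
# $extend   : bp to extend outward at the chosen end(s); negative values trim
sub adjust_feature {
    my ( $feature, $position, $extend ) = @_;
    my %adjusted = %{$feature};
    my $reverse  = $feature->{strand} < 0;

    # the 5 prime end is the start on the forward strand, the stop on the reverse
    if ( $position eq '5' or $position eq 'both' ) {
        if   ($reverse) { $adjusted{stop}  += $extend }
        else            { $adjusted{start} -= $extend }
    }

    # the 3 prime end is the stop on the forward strand, the start on the reverse
    if ( $position eq '3' or $position eq 'both' ) {
        if   ($reverse) { $adjusted{start} -= $extend }
        else            { $adjusted{stop}  += $extend }
    }
    return \%adjusted;
}

my $gene = { start => 1000, stop => 2000, strand => -1 };
my $new  = adjust_feature( $gene, '5', 100 );
printf "%d-%d\n", $new->{start}, $new->{stop};    # prints 1000-2100
```

Treating the strand explicitly in one place keeps reverse-strand features from being shifted at the wrong end, which is the class of bug the entry mentions.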
requesting database feature types. By default, all database features are
presented to the user as a list when selecting database features to
collect data. The source_exclude parameter in the biotoolbox.cfg
configuration file is now deprecated.
- Upgraded script get_intersecting_features.pl to automatically
recognize input file columns and search for more than 1 feature type
- Fixed bug in script get_datasets.pl where the program would not
continue when only a data database was provided
- Fixed bug of requesting index when using a .kgg file as a gene list in
script pull_features.pl
- Fixed bug in generating file name for Treeview export function in
script manipulate_datasets.pl
- Fixed behavior when reading files to prevent adding the current
program name to the metadata when the input file does not have this
metadata
- Minor updates to script novo_wrapper.pl
v.1.9.0 (svn 493)
- Added new script get_features.pl which generates a list of features
for one or more feature types from a database. Information about the
features may be returned, including name, type, and coordinates. Sub
lib/Bio/ToolBox/Data/file.pm
gene prediction, and known Gene tables. The Bin column may or may not be present.
=item Peak files
These include file extensions F<.narrowPeak> and F<.broadPeak>.
These are special "BED6+4" and "BED6+3" file formats, respectively.
=item CDT
These include file extension F<.cdt>.
Cluster data files used with Cluster 3.0 and Treeview.
=item SGR
Rare file format of chromosome, position, score. File extension F<.sgr>.
=item TEXT
Almost any tab-delimited text file with a F<.txt> or F<.tsv> extension
can be loaded.
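
As an illustration of the Peak file layout mentioned above, here is a small sketch that splits one narrowPeak line into its ten tab-delimited fields (six BED columns plus signalValue, pValue, qValue, and peak). It is not how Bio::ToolBox parses these files; the function name `parse_narrowpeak_line` and the sample values are invented for the example.

```perl
use strict;
use warnings;

# Column names for the narrowPeak (BED6+4) layout
my @NARROWPEAK_COLUMNS =
    qw(chrom start end name score strand signalValue pValue qValue peak);

# Hypothetical helper: returns a hash ref keyed by column name,
# or nothing when the line does not have exactly ten fields
sub parse_narrowpeak_line {
    my ($line) = @_;
    chomp $line;
    my @fields = split /\t/, $line;
    return unless @fields == 10;
    my %peak;
    @peak{@NARROWPEAK_COLUMNS} = @fields;
    return \%peak;
}

# Made-up example record
my $peak = parse_narrowpeak_line(
    "chr1\t9356548\t9356648\t.\t0\t.\t182\t5.0945\t-1\t50\n");
printf "%s:%d-%d summit offset %d\n", @{$peak}{qw(chrom start end peak)};
```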
scripts/manipulate_datasets.pl
        print " Unable to export data to file!\n";
    }
    # since no changes have been made to the data structure, return
    return 0;
}
sub export_treeview_function {
    # this is a specialized function to export a datafile into a format
    # compatible with the Treeview program
    print " Exporting CDT file for Treeview and Cluster analysis\n";
    # First check for previous modifications
    if ( $modification and not $function ) {
        print " There are existing unsaved changes to the data. Do you want to\n";
        my $p = ' save these first before making required, irreversible changes? y/n: ';
        my $answer = prompt($p);
        if ( lc $answer eq 'y' ) {
            rewrite_function();
        }
    }
scripts/manipulate_datasets.pl
L2 - convert dataset to log2
L10 - convert dataset to log10
n0 - convert null values to 0
LIST
        my $p = 'Enter the manipulation(s) in order of desired execution: ';
        my $answer = prompt($p);
        @manipulations = split /[,\s]+/, $answer;
    }
    ### First, delete extraneous datasets or columns
    # the CDT format for Treeview expects a unique ID and NAME column
    # we will duplicate the first column
    unshift @datasets, $datasets[0];
    # perform a reordering of the columns
    $Data->reorder_column(@datasets);
    # rename the first two columns
    $Data->name( 0, 'ID' );
    $Data->name( 1, 'NAME' );
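
The column handling in the excerpt above (duplicate the first index, reorder, then rename the first two columns) can be tried out on a plain array of rows. This sketch assumes a toy table with made-up values and does not use the Bio::ToolBox Data object; it shows only the ID and NAME step, not a complete CDT writer.

```perl
use strict;
use warnings;

# Toy table: the first row is the header, the rest are data rows
my @table = (
    [ 'Name',    'Sample1', 'Sample2' ],
    [ 'YAL001C', 1.5,       2.5 ],
    [ 'YAL002W', 0.5,       0.25 ],
);

# Column indices to keep: the identifier column, then the data columns
my @datasets = ( 0, 1, 2 );

# Duplicate the first index so the identifier column appears twice
unshift @datasets, $datasets[0];    # now (0, 0, 1, 2)

# Rebuild every row in the new column order (rows are copied, not aliased)
my @cdt = map { [ @{$_}[@datasets] ] } @table;

# Rename the first two header cells
@{ $cdt[0] }[ 0, 1 ] = ( 'ID', 'NAME' );

print join( "\t", @{$_} ), "\n" for @cdt;
```

Because each row is copied during the reorder, the original table keeps its header, which mirrors the prompt to save existing changes before this irreversible step.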
scripts/manipulate_datasets.pl
true nulls. If an output file name is specified using the --outfile
option, it will be used. Otherwise, a possible filename will be
suggested based on the input file name. If any modifications are
made to the data structure, a normal data file will still be written.
Note that this could overwrite the exported file if the output file name
was specified on the command line, as both file write subroutines will
use the same name!
=item B<treeview> (menu option B<i>)
Export the data to the CDT format compatible with both Treeview and
Cluster programs for visualizing and/or generating clusters. Specify the
columns containing a unique name and the columns to be analyzed (e.g.
--index <name>,<start-stop>). Extraneous columns are removed.
Additional manipulations on the columns may be performed prior to
exporting. These may be chosen interactively or using the codes
listed below and specified using the --target option.
su - decreasing sort by sum of row values
sm - decreasing sort by mean of row values
cg - median center features (rows)
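
To make the codes listed above concrete, the following sketch applies su, sm, and cg to an array of numeric rows through a dispatch table. The subroutine names and sample data are assumptions for illustration, every row is assumed to be fully numeric with no null values, and the real script may compute these differently.

```perl
use strict;
use warnings;
use List::Util qw(sum);

sub row_sum  { my ($row) = @_; return sum( @{$row} ) }
sub row_mean { my ($row) = @_; return sum( @{$row} ) / scalar( @{$row} ) }

sub row_median {
    my ($row)  = @_;
    my @sorted = sort { $a <=> $b } @{$row};
    my $mid    = int( scalar(@sorted) / 2 );
    return scalar(@sorted) % 2
        ? $sorted[$mid]
        : ( $sorted[ $mid - 1 ] + $sorted[$mid] ) / 2;
}

# su: decreasing sort by sum of row values
sub sort_by_sum {
    my ($rows) = @_;
    return [ sort { row_sum($b) <=> row_sum($a) } @{$rows} ];
}

# sm: decreasing sort by mean of row values
sub sort_by_mean {
    my ($rows) = @_;
    return [ sort { row_mean($b) <=> row_mean($a) } @{$rows} ];
}

# cg: subtract each row's median from its values
sub median_center {
    my ($rows) = @_;
    my @centered;
    for my $row ( @{$rows} ) {
        my $median = row_median($row);
        push @centered, [ map { $_ - $median } @{$row} ];
    }
    return \@centered;
}

my %manipulation = (
    su => \&sort_by_sum,
    sm => \&sort_by_mean,
    cg => \&median_center,
);

# Apply the requested codes in order, split the same way as the prompt input
my $rows = [ [ 1, 2, 3 ], [ 4, 5, 9 ], [ 0, 1, 1 ] ];
for my $code ( split /[,\s]+/, 'su,cg' ) {
    $rows = $manipulation{$code}->($rows);
}
print join( "\t", @{$_} ), "\n" for @{$rows};
```

Order matters here: sorting by sum after median centering would rank rows by their centered values rather than their original totals.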