Bio-ToolBox


CHANGES

	file parent directory instead of the current directory.

v.1.10.2 (svn 591)
	- Added a new position option for adjusting the coordinates of features
	retrieved with the script get_features.pl. Coordinates may be adjusted
	at the 5 prime end, the 3 prime end, or both ends of stranded features.
	This also fixes bugs where collected features on the reverse strand with
	adjusted coordinates were not reported properly.
	- Improved automatic recognition of the name, score, and other columns
	in the converter scripts data2bed.pl, data2gff.pl, and data2wig.pl.
	- Improved the Cluster and Treeview export function in script
	manipulate_datasets.pl. The CDT files generated now include separate ID
	and NAME columns per the specification, and new manipulations are
	included prior to exporting, including percentile rank and log2.
	- The convert null function now also converts zero values if requested
	in script manipulate_datasets.pl.
	- Added new option of a minimum size when trimming windows in the script
	find_enriched_regions.pl.
	- Increased the radius from 35 bp to 50 bp when verifying a putative
	mapped nucleosome in script map_nucleosomes.pl, leading to fewer
	overlapping or offset nucleosomes.
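
The strand-aware coordinate adjustment described for get_features.pl above can be illustrated with a minimal sketch. The helper name extend_feature and its arguments are hypothetical and are not taken from the script; the sketch only assumes that the 5 prime end of a forward-strand feature is its start coordinate and that of a reverse-strand feature is its stop coordinate.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio-ToolBox: extend a stranded feature
# outward by $n bp. $position is '5', '3', or 'both'; $strand is 1 for
# the forward strand or -1 for the reverse strand.
sub extend_feature {
    my ( $start, $stop, $strand, $position, $n ) = @_;
    my $five  = $position eq '5' || $position eq 'both';
    my $three = $position eq '3' || $position eq 'both';
    if ( $strand >= 0 ) {
        # forward strand: the 5 prime end is the start coordinate
        $start -= $n if $five;
        $stop  += $n if $three;
    }
    else {
        # reverse strand: the 5 prime end is the stop coordinate
        $stop  += $n if $five;
        $start -= $n if $three;
    }
    $start = 1 if $start < 1;    # do not run off the start of the chromosome
    return ( $start, $stop );
}

my @fwd = extend_feature( 1000, 2000, 1,  '5', 100 );
my @rev = extend_feature( 1000, 2000, -1, '5', 100 );
print "forward: @fwd\n";
print "reverse: @rev\n";
```

Treating the reverse strand as a mirror image of the forward strand is the step that is easy to get wrong, which is the class of bug the entry above mentions.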

CHANGES

	requesting database feature types. By default, all database features are
	presented to the user as a list when selecting database features to
	collect data. The source_exclude parameter in the biotoolbox.cfg
	configuration file is now deprecated.
	- Upgraded script get_intersecting_features.pl to automatically
	recognize input file columns and search for more than 1 feature type
	- Fixed bug in script get_datasets.pl where the program would not
	continue when only a data database was provided
	- Fixed bug of requesting index when using a .kgg file as a gene list in
	script pull_features.pl
	- Fixed bug in generating file name for Treeview export function in
	script manipulate_datasets.pl
	- Fixed behavior when reading files to prevent adding the current
	program name to the metadata when the input file does not have this
	metadata
	- Minor updates to script novo_wrapper.pl
	
v.1.9.0 (svn 493)
	- Added new script get_features.pl which generates a list of features
	for one or more feature types from a database. Information about the
	features may be returned, including name, type, and coordinates. Sub

lib/Bio/ToolBox/Data/file.pm

gene prediction, and known Gene tables. The Bin column may or may not be present.

=item Peak files

These include file extensions F<.narrowPeak> and F<.broadPeak>. 
These are special extended BED file formats: narrowPeak is "BED6+4" 
and broadPeak is "BED6+3". 

=item CDT

These include file extension F<.cdt>. 
Cluster data files used with Cluster 3.0 and Treeview.

=item SGR

A rarely used file format with three columns: chromosome, position, and 
score. File extension F<.sgr>.

=item TEXT

Almost any tab-delimited text file with a F<.txt> or F<.tsv> extension
can be loaded.
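
As an illustration of two of the formats listed above, here is a minimal sketch that splits one narrowPeak line and one SGR line into named fields. It does not use the Bio::ToolBox API; the subroutine names are hypothetical, the lines are assumed to be tab-delimited, and the narrowPeak column names follow the published ENCODE layout (six BED columns plus signalValue, pValue, qValue, and peak).

```perl
use strict;
use warnings;

# Column names of the ENCODE narrowPeak (BED6+4) layout
my @PEAK_FIELDS =
    qw(chrom start end name score strand signalValue pValue qValue peak);

# Hypothetical parser: returns a hash reference, or undef if the line
# does not have exactly ten tab-delimited columns
sub parse_narrowpeak {
    my ($line) = @_;
    chomp $line;
    my @values = split /\t/, $line;
    return undef unless @values == 10;
    my %peak;
    @peak{@PEAK_FIELDS} = @values;
    return \%peak;
}

# Hypothetical parser for the three-column SGR layout
sub parse_sgr {
    my ($line) = @_;
    chomp $line;
    my ( $chrom, $position, $score ) = split /\t/, $line;
    return { chrom => $chrom, position => $position, score => $score };
}

my $peak = parse_narrowpeak("chr1\t100\t250\tpeak1\t500\t.\t12.5\t8.1\t6.3\t75\n");
my $sgr  = parse_sgr("chr1\t1500\t3.2\n");
print "peak $peak->{name} summit offset $peak->{peak}\n";
print "sgr $sgr->{chrom}:$sgr->{position} = $sgr->{score}\n";
```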

scripts/manipulate_datasets.pl

		print " Unable to export data to file!\n";
	}

	# since no changes have been made to the data structure, return
	return 0;
}

sub export_treeview_function {

	# this is a specialized function to export a datafile into a format
	# compatible with the Treeview program

	print " Exporting CDT file for Treeview and Cluster analysis\n";

	# First check for previous modifications
	if ( $modification and not $function ) {
		print " There are existing unsaved changes to the data. Do you want to\n";
		my $p = ' save these first before making required, irreversible changes? y/n:  ';
		my $answer = prompt($p);
		if ( lc $answer eq 'y' ) {
			rewrite_function();
		}
	}

scripts/manipulate_datasets.pl

  L2 - convert dataset to log2
  L10 - convert dataset to log10
  n0 - convert null values to 0 
LIST
		my $p      = 'Enter the manipulation(s) in order of desired execution: ';
		my $answer = prompt($p);
		@manipulations = split /[,\s]+/, $answer;
	}

	### First, delete extraneous datasets or columns
	# the CDT format for Treeview expects a unique ID and NAME column
	# we will duplicate the first column
	unshift @datasets, $datasets[0];

	# perform a reordering of the columns
	$Data->reorder_column(@datasets);

	# rename the first two columns
	$Data->name( 0, 'ID' );
	$Data->name( 1, 'NAME' );
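
The column duplication in the excerpt above can be shown on a plain array of arrays. This is a minimal sketch that does not use the Bio::ToolBox Data object; the subroutine reorder_for_cdt and the sample table are hypothetical and only mirror the steps visible in the excerpt (duplicate the first requested column, reorder, then rename the first two columns to ID and NAME).

```perl
use strict;
use warnings;

# Example table: row 0 holds the column names, later rows hold data
my @table = (
    [ 'Name',  'Score', 'sampleA', 'sampleB' ],
    [ 'geneA', 5,       1.5,       2.5 ],
    [ 'geneB', 7,       0.5,       3.5 ],
);

# Hypothetical stand-in for the reorder and rename steps
sub reorder_for_cdt {
    my ( $table, @indices ) = @_;

    # duplicate the first requested column so it serves as both ID and NAME
    unshift @indices, $indices[0];

    # keep only the requested columns, in the requested order
    my @out = map { [ @{$_}[@indices] ] } @$table;

    # rename the first two columns in the header row
    $out[0][0] = 'ID';
    $out[0][1] = 'NAME';
    return \@out;
}

# request the name column (0) and the two sample columns (2 and 3)
my $cdt = reorder_for_cdt( \@table, 0, 2, 3 );
print join( "\t", @$_ ), "\n" for @$cdt;
```

Columns that are not requested, such as Score here, are dropped as a side effect of the reordering.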

scripts/manipulate_datasets.pl

true nulls. If an output file name is specified using the --outfile 
option, it will be used. Otherwise, a possible filename will be 
suggested based on the input file name. If any modifications are 
made to the data structure, a normal data file will still be written. 
Note that this could overwrite the exported file if the output file name
was specified on the command line, as both file write subroutines will 
use the same name!

=item B<treeview> (menu option B<i>)

Export the data to the CDT format compatible with both Treeview and 
Cluster programs for visualizing and/or generating clusters. Specify the 
columns containing a unique name and the columns to be analyzed (e.g. 
--index <name>,<start-stop>). Extraneous columns are removed. 
Additional manipulations on the columns may be performed prior to 
exporting. These may be chosen interactively or using the codes 
listed below and specified using the --target option.
  
  su - decreasing sort by sum of row values
  sm - decreasing sort by mean of row values
  cg - median center features (rows)
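
The three manipulations listed above can be sketched in plain Perl as follows. This is not the script's implementation; the subroutine names and the row structure (a hash with name and values keys) are assumptions made for illustration, and null values are not handled.

```perl
use strict;
use warnings;
use List::Util qw(sum);

sub row_sum  { return sum(@_) }
sub row_mean { return sum(@_) / @_ }

# median of a list of numbers
sub median {
    my @sorted = sort { $a <=> $b } @_;
    my $mid    = int( @sorted / 2 );
    return @sorted % 2
        ? $sorted[$mid]
        : ( $sorted[ $mid - 1 ] + $sorted[$mid] ) / 2;
}

# su and sm: sort rows in decreasing order of a per-row key
sub sort_rows_decreasing {
    my ( $key, @rows ) = @_;
    my @keyed  = map { [ $key->( @{ $_->{values} } ), $_ ] } @rows;
    my @sorted = sort { $b->[0] <=> $a->[0] } @keyed;
    return map { $_->[1] } @sorted;
}

# cg: subtract the row median from every value in the row
sub median_center {
    my ($row) = @_;
    my $m = median( @{ $row->{values} } );
    return {
        name   => $row->{name},
        values => [ map { $_ - $m } @{ $row->{values} } ],
    };
}

my @rows = (
    { name => 'geneA', values => [ 1, 2, 3 ] },
    { name => 'geneB', values => [ 4, 5, 9 ] },
    { name => 'geneC', values => [ 2, 2, 2 ] },
);

my @by_sum   = sort_rows_decreasing( \&row_sum, @rows );
my @centered = map { median_center($_) } @by_sum;
print "$_->{name}\t@{ $_->{values} }\n" for @centered;
```

When every row has the same number of values, sorting by sum and sorting by mean give the same order; they differ only if rows have different numbers of values.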


