Bio-ToolBox

######## Bio::ToolBox revision history #############


v2.041
	- Fix bug that prevented compilation on Perls <= 5.28
	- Fix testing bug

v2.04
	- Fix bugs that prevented scripts get_binned_data.pl and 
	  get_relative_data.pl from compiling.
	- Clean up code in utility and big_helper modules.
	- Add new Build tests for big_helper.pm, which requires that
	  external UCSC utilities are in the PATH, but should safely
	  skip if they are not.
	- Add new Build test to verify all included application scripts.
	- Change all tests to use the modern Test2 suite.

v2.03
	- Improve Data table sorting (again) to handle natural sorting by
	  embedded numeric values, regardless of whether they occur at the
	  beginning, middle, or end of the string. This is useful for genes or
	  numbered items with a prefix or suffix.
	- Add new functions to report statistics on feature length and filter
	  features by length (minimum - maximum range) in manipulate_datasets.pl.
	- Add new option to filter alignments based on mapping quality when
	  counting using data collection apps get_datasets.pl, get_binned_data.pl
	  and get_relative_data.pl. 
	- Add new options for specifying what to use for the output Bed Name
	  column from apps get_features.pl and get_gene_regions.pl, including
	  feature Name or ID.
	- Optimize alignment filtering based on flags for a very slight, but
	  measurable, improvement in execution time when collecting alignment
	  counts or generating wig files with bam2wig.pl.
	- Avoid writing duplicate comment lines when merging files in merge_datasets.pl.
	- Remove silly multiple-zero prefix when naming features in data2bed.pl
	  and data2gff.pl.
	- The new_data() method now properly recognizes options in Bio::ToolBox.
	- Add new_bed() shortcut method to Bio::ToolBox.
	- Avoid writing any metadata or comment lines to TSV files. The
	  presumption is that these files are primarily for data export and
	  sharing. Add rudimentary support for writing CSV files.
	- Improve coordinate extraction from coordinate strings, allowing
	  extraction from, for example, "chr1:123,456-789,000:-".
	- Allow genomic coordinate sorting by coordinate string.
	- Handle new Ensembl gencode tags when filtering in Bio::ToolBox::GeneTools.
	- Implement map quality filtering in low level alignment callbacks used by
	  HTS and Sam adapters. Add new use_minimum_mapq() function in
	  Bio::ToolBox::db_helper to set the map quality level on a global scale.
	- Optimize name counting 'ncount' method in Bio::ToolBox::db_helper.
	- Optimize and update API for counting all alignments in a bam file with
	  sum_total_bam_alignments() functions.
	- Remove outdated functions in manipulate_datasets.pl.
	- Fix bug with setting tag values of zero in SeqFeature objects.
	- Rename splice_data() to split_data() in Bio::ToolBox::Data.

v2.02
	- Add support for newer versions of UCSC utilities that no longer
	  allow reading from standard input, particularly 'wigToBigWig'.
	  Temporary wig files are written first and then the utility is
	  called. Updates made to bam2wig.pl, data2wig.pl, and manipulate_wig.pl.
	- Add public headers() boolean method for Data objects.
	- Change duplicate() method for Stream objects to allow generating
	  a duplicate Data object if no output file name is provided.
	- Various library bug fixes and improvements, including reading bedGraph
	  files, writing simple text files, handling spaces in numeric index lists,
	  automatic checking of extensions, parsing annotation files into tables,
	  and speeding up row deletion in large data tables.
	- Fix issues with default output filenames in data collection scripts
	  get_datasets.pl, get_binned_data.pl, and get_relative_data.pl. Default is
	  to reuse input filename unless it was parsed, in which case the basename
	  plus txt is used. Also fix bugs regarding mismatched column names when
	  explicitly not parsing input annotation files.
	- Fix bug with leaving behind MergeDatasetCoordinate column in merge_datasets.pl.
	- Fix bug with using new column name instead of original column name in
	  message statements when manipulating columns in manipulate_datasets.pl.
	- Fix possible bug with undefined strand when automatically flipping coordinates
	  of reversed intervals in SeqFeature objects.
	- Remove deprecated and unused functions.
	- Add missing POD method sections for complete coverage.
	  

v2.01
	- Update chromosome sorting to properly handle chromosomal arms, for
	  example with Drosophila
	- Change '.groups.txt' group file name to '.col_groups.txt' when writing
	  column metadata file for scripts get_binned_data.pl and get_relative_data.pl
	- Change back to '_summary.txt' file name when writing a summary file
	- Change "--blacklist" option to "--exclude" in bam2wig.pl
	- Improve error handling scenarios in data2wig.pl, including invalid indexes
	- Fix bugs in manipulate_datasets.pl, including missing lines in the view function
	  and restricting the addname function to only update a proper "Name" column
	- Update any remaining POD text references about 0-base indexing to 1-base

v2.0
	- Version number change, no code changes

v1.70
	- MAJOR UPDATE: Change all internal and user-oriented column indexing
	  to 1-base instead of 0-base indexing, i.e. column numbers are now
	  listed beginning with 1 instead of 0. WARNING!!! THIS WILL BREAK ALL
	  PRE-EXISTING SCRIPTS AND CODE THAT USE HARD-CODED COLUMN INDEXES!!!
	- MAJOR UPDATE: Use a single unified Bio::ToolBox::Parser module with
	  subclasses for bed, gff, gtf, and ucsc table formats. NOTE: This changed
	  name capitalization of Bio::ToolBox::Parser subclasses from parser
	- Improve parsing of gtf files, especially with duplicate tags
	- Replaced old table sorting algorithm to use numeric, mixed digit-string,


