App-Anchr
# Tuning parameters for the dataset of *E. coli*
[TOC level=1-3]: # " "
- [Tuning parameters for the dataset of *E. coli*](#tuning-parameters-for-the-dataset-of-e-coli)
- [More tools on downloading and preprocessing data](#more-tools-on-downloading-and-preprocessing-data)
    - [Extra external executables](#extra-external-executables)
    - [Two of the leading assemblers](#two-of-the-leading-assemblers)
    - [PacBio specific tools](#pacbio-specific-tools)
- [*Escherichia coli* str. K-12 substr. MG1655](#escherichia-coli-str-k-12-substr-mg1655)
- [Download](#download)
- [Preprocess Illumina reads](#preprocess-illumina-reads)
- [Preprocess PacBio reads](#preprocess-pacbio-reads)
- [Reads stats](#reads-stats)
- [Spades](#spades)
- [Platanus](#platanus)
- [Quorum](#quorum)
- [Down sampling](#down-sampling)
- [Generate k-unitigs (sampled)](#generate-k-unitigs-sampled)
- [Create anchors (sampled)](#create-anchors-sampled)
- [Merge anchors with Qxx, Lxx and QxxLxx](#merge-anchors-with-qxx-lxx-and-qxxlxx)
- [Merge anchors](#merge-anchors)
- [Scaffolding with PE](#scaffolding-with-pe)
- [Different K values](#different-k-values)
- [3GS](#3gs)
- [Local corrections](#local-corrections)
- [Expand anchors](#expand-anchors)
- [Final stats](#final-stats)
# More tools on downloading and preprocessing data
## Extra external executables
```bash
brew install aria2 curl # downloading tools
brew install homebrew/science/sratoolkit # NCBI SRAToolkit
brew reinstall --build-from-source --without-webp gd # broken, can't find libwebp.so.6
brew reinstall --build-from-source gnuplot@4
brew install homebrew/science/mummer # mummer needs gnuplot 4
brew install openblas # numpy
brew install python
pip install --upgrade pip setuptools
pip install matplotlib
brew install homebrew/science/quast # assembly quality assessment
quast --test # may recompile the bundled nucmer
# canu requires gnuplot 5, while mummer requires gnuplot 4
brew install --build-from-source canu
brew unlink gnuplot@4
brew install gnuplot
# keep gnuplot 4 as the linked default
brew unlink gnuplot
brew link gnuplot@4 --force
brew install r --without-tcltk --without-x11
brew install kmergenie --with-maxkmer=200
```
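Because the commands above install both gnuplot 4 and gnuplot 5 and switch the link between them, it can help to confirm which one is currently on the `PATH`. The following is a minimal sketch, not part of the original workflow: the helper names `gnuplot_major` and `check_gnuplot` are hypothetical, and it assumes `gnuplot --version` prints a line of the form `gnuplot 4.6 patchlevel 6`.

```bash
# Extract the major version from a `gnuplot --version` line,
# e.g. "gnuplot 4.6 patchlevel 6" -> "4".
# (hypothetical helper, for illustration only)
gnuplot_major() {
    version="${1#gnuplot }"   # drop the leading "gnuplot "
    echo "${version%%.*}"     # keep the text before the first dot
}

# Report which tool the given gnuplot version is expected to suit,
# following the note that mummer needs gnuplot 4 and canu needs gnuplot 5.
check_gnuplot() {
    case "$(gnuplot_major "$1")" in
        4) echo "gnuplot 4 linked: suits mummer" ;;
        5) echo "gnuplot 5 linked: suits canu" ;;
        *) echo "unexpected gnuplot version: $1" ;;
    esac
}

# Only query gnuplot when it is actually installed.
if command -v gnuplot >/dev/null 2>&1; then
    check_gnuplot "$(gnuplot --version)"
fi
```

If the report names the wrong version for the tool about to be run, the `brew unlink` / `brew link` pair above switches the link.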
## Two of the leading assemblers
```bash
brew install homebrew/science/spades
brew install wang-q/tap/platanus
```
## PacBio specific tools
PacBio is switching its data format from `hdf5` to `bam`, but as of now
(early 2017) the majority of publicly available PacBio data is still in
`.bax.h5` or `hdf5.tgz` format. To deal with these files, PacBio
releases some tools, which can be installed with another dedicated tool
named `pitchfork`.
Their tools *can* be compiled under macOS with Homebrew.
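
Since both the older and the newer formats are in circulation, a download directory may contain a mix of them. The sketch below sorts files into the formats named above purely by file name; the function `pacbio_format` and the example file names are hypothetical, and it does not inspect file contents.

```bash
# Classify a PacBio data file by its name only.
# (hypothetical helper; the categories follow the formats mentioned above)
pacbio_format() {
    case "$1" in
        *.bax.h5)  echo "hdf5" ;;          # legacy per-movie HDF5 file
        *hdf5.tgz) echo "hdf5-archive" ;;  # tarball of HDF5 files
        *.bam)     echo "bam" ;;           # newer BAM-based format
        *)         echo "unknown" ;;
    esac
}

# Example file names are made up for illustration.
for f in movie.1.bax.h5 sample_hdf5.tgz movie.subreads.bam notes.txt; do
    printf '%s\t%s\n' "$f" "$(pacbio_format "$f")"
done
```

Files reported as `hdf5` or `hdf5-archive` are the ones that need the legacy tools described here.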
* Install some third-party tools
```bash
brew install md5sha1sum
brew install zlib boost openblas