App-Anchr


doc/pacbio_consensus.md


echo "==> Clone latest linuxbrew"
git clone https://github.com/Linuxbrew/brew.git ~/.linuxbrew

# .bashrc
if grep -q -i linuxbrew "$HOME/.bashrc"; then
    echo "==> .bashrc already contains linuxbrew"
else
    echo "==> Update .bashrc"

    # Single quotes keep $HOME and $PATH unexpanded until .bashrc is sourced
    LB_PATH='export PATH="$HOME/.linuxbrew/bin:$PATH"'
    LB_MAN='export MANPATH="$HOME/.linuxbrew/share/man:$MANPATH"'
    LB_INFO='export INFOPATH="$HOME/.linuxbrew/share/info:$INFOPATH"'
    echo '# Linuxbrew' >> "$HOME/.bashrc"
    echo "$LB_PATH" >> "$HOME/.bashrc"
    echo "$LB_MAN"  >> "$HOME/.bashrc"
    echo "$LB_INFO" >> "$HOME/.bashrc"
    echo >> "$HOME/.bashrc"

    # Apply the same settings to the current shell
    eval "$LB_PATH"
    eval "$LB_MAN"
    eval "$LB_INFO"
fi
```
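
The script above checks for a marker before appending, which keeps it safe to run more than once. A minimal sketch of the same idea as a reusable function follows; `append_once` and its arguments are hypothetical names introduced here for illustration, not part of the original script.

```bash
# append_once FILE MARKER LINE...
# Appends the given LINEs to FILE unless MARKER (case-insensitive,
# fixed string) already occurs somewhere in FILE.
append_once() {
    local file=$1 marker=$2
    shift 2
    if [ -f "$file" ] && grep -q -i -F -- "$marker" "$file"; then
        return 0   # already configured; nothing to do
    fi
    printf '%s\n' "$@" >> "$file"
}
```

It could be called as `append_once "$HOME/.bashrc" linuxbrew '# Linuxbrew' 'export PATH="$HOME/.linuxbrew/bin:$PATH"'`; a second call with the same marker leaves the file unchanged.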

## Building with pitchfork

For the preparatory steps, see [here](e_coli.md/#pacbio-specific-tools).

```bash
cd ~/share/pitchfork

make GenomicConsensus
make pbfalcon
make pbreports
```

The compiled executables and libraries are placed in `~/share/pitchfork/deployment`.

Test run.

```bash
source ~/share/pitchfork/deployment/setup-env.sh

quiver --help
```
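
Beyond `quiver --help`, it can be handy to confirm that the expected commands actually landed on `PATH` after sourcing `setup-env.sh`. The following is a small sketch; `check_tools` is a hypothetical helper name, and the list of tools to pass is up to you.

```bash
# check_tools CMD...
# Prints one line per command and returns non-zero if any is missing from PATH.
check_tools() {
    local missing=0 tool
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `check_tools quiver` should print `found:   quiver` once the environment is set up correctly.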

## Installing falcon-integrate directly (no longer recommended)

[wiki page](https://github.com/PacificBiosciences/FALCON-integrate/wiki/Installation)

```bash
mkdir -p $HOME/share
cd $HOME/share

# GitHub no longer serves the unauthenticated git:// protocol, so clone over https
git clone https://github.com/PacificBiosciences/FALCON-integrate.git
cd FALCON-integrate
git checkout master  # or whatever version you want
make init
source env.sh
make config-edit-user
make -j all

# The test data is stored on Dropbox, which is unreachable behind the GFW
# make test
```

After the build finishes, an `fc_env` directory is created that contains the executables. Running `tree -L 2 fc_env` reports `6 directories, 79 files`.

# falcon example data

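The data in falcon-examples is downloaded from Dropbox by an obscure program, `git-sym`, so behind the firewall it cannot be used the way the README describes.

In addition, many of its settings are hard-coded cluster paths and SGE configuration, which adds a great deal of complexity and makes the examples hard to follow.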

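Notes:

* fasta files **must** use the `.fasta` extension
* Sequence names in the fasta files must meet the requirements of falcon (fasta2DB of dazz_db); the default SRA names **do not qualify**,
  and the error message is `Pacbio header line format error`
* [Here](https://github.com/PacificBiosciences/FALCON/issues/251) is a script that helps with this problem. A local copy is kept as
  `falcon_name_fasta.pl`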

* Clear intermediate dirs

    ```bash
    find $HOME/data/pacbio -type d -name 'm_*' | xargs rm -fr
    find $HOME/data/pacbio -type d -name 'job_*' | xargs rm -fr
    ```
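
To illustrate the header requirement noted above, here is a minimal sketch that rewrites fasta headers into a `>name/number/start_end` layout. This is not the `falcon_name_fasta.pl` script; the function name `rename_fasta_headers`, the default prefix `falcon_read`, and the assumption that fasta2DB accepts this layout without extra fields are all illustrative.

```bash
# rename_fasta_headers [PREFIX] < in.fasta > out.fasta
# Rewrites each record as >PREFIX/<n>/0_<length>, with the sequence
# joined onto a single line. PREFIX defaults to "falcon_read" (an assumption).
rename_fasta_headers() {
    awk -v prefix="${1:-falcon_read}" '
        # Print the record collected so far, if any.
        function emit() {
            if (n > 0) {
                print ">" prefix "/" n "/0_" length(seq)
                print seq
            }
        }
        /^>/ { emit(); n++; seq = ""; next }              # new record starts
             { gsub(/[ \t\r]/, ""); seq = seq $0 }        # accumulate sequence
        END  { emit() }
    '
}
```

Usage would look like `rename_fasta_headers < SRRxxxx.fasta > renamed.fasta`, where the input file name is a placeholder.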

## The [*E. coli* example](https://github.com/PacificBiosciences/FALCON/wiki/Setup:-Complete-example) in `falcon/example`

* Download the following three files through a proxy to get past the firewall

```bash
mkdir -p $HOME/data/pacbio/rawdata/ecoli_test
cd $HOME/data/pacbio/rawdata/ecoli_test

proxychains4 wget -c https://www.dropbox.com/s/tb78i5i3nrvm6rg/m140913_050931_42139_c100713652400000001823152404301535_s1_p0.1.subreads.fasta
proxychains4 wget -c https://www.dropbox.com/s/v6wwpn40gedj470/m140913_050931_42139_c100713652400000001823152404301535_s1_p0.2.subreads.fasta
proxychains4 wget -c https://www.dropbox.com/s/j61j2cvdxn4dx4g/m140913_050931_42139_c100713652400000001823152404301535_s1_p0.3.subreads.fasta

# N50 14124
# C   105451
faops n50 -C *.subreads.fasta
```
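
For readers unfamiliar with the N50 figure reported above: it is the length L such that sequences of length at least L together cover half of the total bases. A rough sketch of that calculation in shell follows; `n50_of_lengths` is a hypothetical helper, and tie-breaking details may differ from what `faops n50` does.

```bash
# n50_of_lengths: reads one sequence length per line on stdin and prints the N50.
n50_of_lengths() {
    sort -rn | awk '
        { len[NR] = $1; total += $1 }       # lengths arrive longest first
        END {
            half = total / 2
            for (i = 1; i <= NR; i++) {
                running += len[i]
                # the first length at which the running sum reaches half the total
                if (running >= half) { print len[i]; exit }
            }
        }
    '
}
```

Assuming `faops size` prints the sequence name and length as tab-separated columns, something like `faops size *.subreads.fasta | cut -f 2 | n50_of_lengths` should give a comparable number.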

* Configuration file and run

```bash
source ~/share/pitchfork/deployment/setup-env.sh

if [ -d $HOME/data/pacbio/ecoli_test ];
then
    rm -fr $HOME/data/pacbio/ecoli_test
fi
mkdir -p $HOME/data/pacbio/ecoli_test
cd $HOME/data/pacbio/ecoli_test
find $HOME/data/pacbio/rawdata/ecoli_test -name "*.fasta" > input.fofn

# https://github.com/PacificBiosciences/FALCON/blob/master/examples/fc_run_ecoli.cfg


