App-Anchr
doc/bacteria_2_3.md view on Meta::CPAN
cat stat.md
```
| Name | N50 | Sum | # |
|:---------|--------:|-----------:|---------:|
| Genome | 3288558 | 5165770 | 2 |
| Paralogs | 3333 | 155714 | 62 |
| Illumina | 101 | 1368727962 | 13551762 |
| PacBio | 11771 | 1228497092 | 143537 |
| uniq | 101 | 1361783404 | 13483004 |
| scythe | 101 | 1346787728 | 13483004 |
| Q20L60 | 101 | 1264469138 | 12611522 |
| Q25L60 | 101 | 1200269501 | 12011552 |
| Q30L60 | 101 | 1080002384 | 10917028 |
## Vpar: down sampling
```bash
BASE_DIR=$HOME/data/anchr/Vpar
cd ${BASE_DIR}
doc/bacteria_2_3.md view on Meta::CPAN
```
| Name | SumCor | CovCor | N50SR | Sum | # | N50Anchor | Sum | # | N50Others | Sum | # | Kmer | RunTimeKU | RunTimeAN |
|:---------------|--------:|-------:|------:|------:|---:|----------:|------:|---:|----------:|-------:|--:|--------------------:|----------:|:----------|
| Q25L60X40P000 | 75.71M | 40.0 | 35248 | 1.8M | 72 | 35248 | 1.8M | 71 | 865 | 865 | 1 | "31,41,51,61,71,81" | 0:01'13'' | 0:00'52'' |
| Q25L60X40P001 | 75.71M | 40.0 | 32751 | 1.8M | 75 | 32751 | 1.79M | 72 | 4293 | 9.43K | 3 | "31,41,51,61,71,81" | 0:01'15'' | 0:00'52'' |
| Q25L60X40P002 | 75.71M | 40.0 | 32751 | 1.8M | 76 | 32751 | 1.79M | 73 | 4293 | 9.45K | 3 | "31,41,51,61,71,81" | 0:01'13'' | 0:00'54'' |
| Q25L60X40P003 | 75.71M | 40.0 | 32751 | 1.82M | 75 | 32803 | 1.77M | 72 | 23232 | 47.32K | 3 | "31,41,51,61,71,81" | 0:01'14'' | 0:00'47'' |
| Q25L60X80P000 | 151.42M | 80.0 | 32751 | 1.8M | 78 | 32751 | 1.8M | 74 | 645 | 2.58K | 4 | "31,41,51,61,71,81" | 0:01'48'' | 0:01'09'' |
| Q25L60X80P001 | 151.42M | 80.0 | 31667 | 1.8M | 79 | 31667 | 1.8M | 77 | 865 | 1.44K | 2 | "31,41,51,61,71,81" | 0:01'49'' | 0:01'12'' |
| Q25L60X120P000 | 227.13M | 120.0 | 32404 | 1.8M | 83 | 32404 | 1.8M | 78 | 650 | 3.55K | 5 | "31,41,51,61,71,81" | 0:02'27'' | 0:01'21'' |
| Q25L60X160P000 | 302.84M | 160.0 | 31667 | 1.8M | 84 | 31667 | 1.8M | 83 | 865 | 865 | 1 | "31,41,51,61,71,81" | 0:03'05'' | 0:01'42'' |
| Q30L60X40P000 | 75.71M | 40.0 | 35248 | 1.8M | 72 | 35248 | 1.8M | 71 | 855 | 855 | 1 | "31,41,51,61,71,81" | 0:01'14'' | 0:00'56'' |
| Q30L60X40P001 | 75.71M | 40.0 | 32751 | 1.84M | 76 | 32813 | 1.76M | 71 | 32374 | 74.19K | 5 | "31,41,51,61,71,81" | 0:01'13'' | 0:00'45'' |
| Q30L60X40P002 | 75.71M | 40.0 | 32751 | 1.8M | 73 | 32751 | 1.8M | 72 | 855 | 855 | 1 | "31,41,51,61,71,81" | 0:01'13'' | 0:00'44'' |
| Q30L60X40P003 | 75.71M | 40.0 | 32741 | 1.8M | 75 | 32741 | 1.8M | 74 | 865 | 865 | 1 | "31,41,51,61,71,81" | 0:01'13'' | 0:00'45'' |
| Q30L60X80P000 | 151.42M | 80.0 | 32751 | 1.8M | 74 | 32751 | 1.8M | 73 | 865 | 865 | 1 | "31,41,51,61,71,81" | 0:01'49'' | 0:01'08'' |
| Q30L60X80P001 | 151.42M | 80.0 | 32751 | 1.8M | 74 | 32751 | 1.8M | 73 | 865 | 865 | 1 | "31,41,51,61,71,81" | 0:01'50'' | 0:01'12'' |
| Q30L60X120P000 | 227.13M | 120.0 | 32751 | 1.8M | 77 | 32751 | 1.8M | 75 | 865 | 1.49K | 2 | "31,41,51,61,71,81" | 0:02'26'' | 0:01'32'' |
| Q30L60X160P000 | 302.84M | 160.0 | 32404 | 1.8M | 79 | 32404 | 1.8M | 77 | 865 | 1.49K | 2 | "31,41,51,61,71,81" | 0:03'00'' | 0:01'37'' |
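
The `SumCor` and `CovCor` columns are linked by the genome size: the values above are consistent with the 1892775 bp genome listed in the Ftul stat table, since 40x of that genome is about 75.71M bases. A minimal sketch of the arithmetic, assuming `awk` is available; the helper name `cov_to_bases` is hypothetical and not part of `anchr`:

```bash
# Hypothetical helper: number of bases needed to reach a given coverage.
# $1 = genome size in bp, $2 = coverage depth
cov_to_bases() {
    awk -v g="$1" -v c="$2" 'BEGIN { printf "%.2fM\n", g * c / 1e6 }'
}

# Reproduce the SumCor column for the four sampling depths.
for X in 40 80 120 160; do
    printf 'X%s\t%s\n' "$X" "$(cov_to_bases 1892775 "$X")"
done
```

The loop prints 75.71M, 151.42M, 227.13M and 302.84M, matching the table.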
## Ftul: merge anchors
```bash
BASE_NAME=Ftul
cd ${HOME}/data/anchr/${BASE_NAME}
# merge anchors
mkdir -p merge
anchr contained \
doc/bacteria_2_3.md view on Meta::CPAN
$(echo "contigTrim"; faops n50 -H -S -C contigTrim/contig.fasta;) >> stat3.md
cat stat3.md
```
| Name | N50 | Sum | # |
|:-------------|--------:|--------:|---:|
| Genome | 1892775 | 1892775 | 1 |
| Paralogs | 33912 | 93531 | 10 |
| anchor.merge | 32813 | 1801122 | 73 |
| others.merge | 32404 | 64274 | 3 |
| anchor.cover | 32813 | 1796007 | 71 |
| anchorLong | 35248 | 1795927 | 70 |
| contigTrim | 1027458 | 1856949 | 4 |
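
The three columns come from `faops n50 -H -S -C`. For readers without `faops`, the same three quantities can be approximated with standard tools. This is only a sketch: the function name `fasta_n50` is hypothetical, and its single-line output format is an assumption, not `faops` output.

```bash
# Hypothetical helper: print "N50 Sum Count" for a FASTA file.
fasta_n50() {
    # Step 1: emit one sequence length per record.
    awk '/^>/ { if (len) print len; len = 0; next }
         { len += length($0) }
         END { if (len) print len }' "$1" |
        sort -rn |
        # Step 2: walk lengths from longest to shortest until half the total is covered.
        awk '{ l[NR] = $1; sum += $1 }
             END {
                 for (i = 1; i <= NR; i++) {
                     acc += l[i]
                     if (acc * 2 >= sum) { print l[i], sum, NR; exit }
                 }
             }'
}

# Tiny demonstration with sequences of length 4, 3, 2 and 1.
demo=$(mktemp)
printf '>a\nACGT\n>b\nACG\n>c\nAC\n>d\nA\n' > "$demo"
fasta_n50 "$demo"   # prints: 3 10 4
rm -f "$demo"
```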
* Clear QxxLxxXxx.
```bash
BASE_NAME=Ftul
cd ${HOME}/data/anchr/${BASE_NAME}
doc/e_coli.md view on Meta::CPAN
```bash
mkdir -p ~/share/SMRTAnalysis_2.3.0
cd ~/share/SMRTAnalysis_2.3.0
aria2c -x 9 -s 3 -c http://files.pacb.com/software/smrtanalysis/2.3.0/smrtanalysis_2.3.0.140936.run
aria2c -x 9 -s 3 -c http://files.pacb.com/software/smrtanalysis/2.3.0/smrtanalysis-patch_2.3.0.140936.p5.run
aria2c -x 9 -s 3 -c https://atlas.hashicorp.com/ubuntu/boxes/trusty64/versions/20170313.0.7/providers/virtualbox.box
vagrant box add ubuntu/trusty64 trusty-server-cloudimg-amd64-vagrant-disk1.box --force
curl -O https://raw.githubusercontent.com/mhsieh/SMRTAnalysis_2.3.0_install/master/vagrant-u1404/Vagrantfile
vagrant destroy -f
rm -fr .vagrant/
vagrant up --provider virtualbox
```
# *Escherichia coli* str. K-12 substr. MG1655
* Genome: INSDC
doc/e_coli.md view on Meta::CPAN
| Q25L30 | 151 | 1382782641 | 10841386 |
| Q25L60 | 151 | 1317617346 | 9994728 |
| Q25L90 | 151 | 1177142378 | 8586574 |
| Q25L120 | 151 | 837111446 | 5805874 |
| Q30L30 | 125 | 1192536117 | 10716954 |
| Q30L60 | 127 | 1149107745 | 9783292 |
| Q30L90 | 130 | 1021609911 | 8105773 |
| Q30L120 | 139 | 693661043 | 5002158 |
| Q35L30 | 64 | 588252718 | 9588363 |
| Q35L60 | 72 | 366922898 | 5062192 |
| Q35L90 | 95 | 35259773 | 364046 |
| Q35L120 | 124 | 647353 | 5169 |
| PacBio | 13982 | 748508361 | 87225 |
| PacBio.trim | 13630 | 688575670 | 77687 |
| PacBio.20x | 13962 | 99252919 | 11500 |
| PacBio.20x.trim | 13541 | 88697009 | 9980 |
| PacBio.40x | 13948 | 198650072 | 23000 |
| PacBio.40x.trim | 13565 | 179462005 | 20137 |
| PacBio.80x | 13996 | 395094712 | 46000 |
| PacBio.80x.trim | 13608 | 360190363 | 40682 |
doc/masurca.md view on Meta::CPAN
| perl -n -e '/ESTIMATED_GENOME_SIZE=\"(\d+)\"/ and print $1' )
done >> stat.md
cat stat.md
```
| name | N50SR | #SR | N50Contig | #Contig | N50Scaffold | #Scaffold | EstG |
|:--------------|------:|-----:|----------:|--------:|------------:|----------:|--------:|
| PE_SJ_Sanger4 | 4586 | 4187 | 205225 | 69 | 3196849 | 35 | 4602968 |
| PE_SJ_Sanger | 4586 | 4187 | 63274 | 141 | 3070846 | 28 | 4602968 |
| PE_SJ | 4586 | 4187 | 43125 | 219 | 3058404 | 59 | 4602968 |
| PE_Sanger4 | 4705 | 4042 | 125228 | 67 | 534852 | 30 | 4595684 |
| PE_Sanger | 4705 | 4042 | 19435 | 412 | 21957 | 359 | 4595684 |
| PE | 4705 | 4043 | 20826 | 407 | 34421 | 278 | 4595684 |
| superreads | 4705 | 4043 | | | | | 4595684 |
With enough long reads for support, short jump libraries are not needed.
# SuperReads 3.1.3
In February 2017, a new program appeared on the UMD FTP site:
[SuperReads_RNA](ftp://ftp.genome.umd.edu/pub/MaSuRCA/beta/SuperReads_RNA-1.0.1.tar.gz), a simplified
version of MaSuRCA 3.2.1. Most likely `StringTie` uses super-reads to process RNA-seq, and it was
released because many people asked for it.
Based on this release, I simplified MaSuRCA 3.1.3 by removing all dependencies and the parts that
work with `Celera Assembler`, keeping only
doc/model_organisms.md view on Meta::CPAN
```
## s288c: expand anchors
In *Saccharomyces cerevisiae*, the following groups of sequences are completely identical. All of them are recent segmental duplications:
* I:216563-218385, VIII:537165-538987
* I:223713-224783, VIII:550350-551420
* IV:528442-530427, IV:532327-534312, IV:536212-538197
* IV:530324-531519, IV:534209-535404
* IV:5645-7725, X:738076-740156
* IV:7810-9432, X:736368-737990
* IX:9683-11043, X:9666-11026
* IV:1244112-1245373, XV:575980-577241
* VIII:212266-214124, VIII:214264-216122
* IX:11366-14953, X:11349-14936
* XII:468935-470576, XII:472587-474228, XII:482167-483808, XII:485819-487460,
* XII:483798-485798, XII:487450-489450
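
Identical sequences must at least have identical lengths, so the coordinates above can be sanity-checked without touching the genome. A minimal sketch, assuming 1-based inclusive `chr:start-end` ranges; the helper name `same_length` is hypothetical. Equal lengths are a necessary condition only and do not prove that the sequences are identical.

```bash
# Hypothetical helper: report whether all given ranges have the same length.
same_length() {
    printf '%s\n' "$@" | awk -F'[:-]' '
        { len = $3 - $2 + 1                      # inclusive range length
          if (NR == 1) first = len
          else if (len != first) bad = 1 }
        END { if (bad) print "differ"; else print "ok", first }'
}

same_length I:216563-218385 VIII:537165-538987                     # prints: ok 1823
same_length IV:528442-530427 IV:532327-534312 IV:536212-538197     # prints: ok 1986
```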
* anchorLong
doc/model_organisms.md view on Meta::CPAN
| Genome | 25286936 | 137567477 | 8 |
| Paralogs | 4031 | 13665900 | 4492 |
| anchor.merge | 26860 | 117041459 | 9566 |
| others.merge | 8732 | 3092289 | 1004 |
| anchor.cover | 26199 | 116199529 | 9576 |
| anchorLong | 69814 | 115806088 | 4924 |
| contigTrim | 1238480 | 123572499 | 603 |
| spades.contig | 108756 | 132705321 | 61620 |
| spades.scaffold | 142273 | 132725706 | 61182 |
| platanus.contig | 11503 | 156820565 | 359399 |
| platanus.scaffold | 146404 | 129134232 | 71416 |
* quast
```bash
BASE_NAME=iso_1
cd ${HOME}/data/anchr/${BASE_NAME}
rm -fr 9_qa_contig
quast --no-check --threads 16 \
--eukaryote \
doc/pacbio_consensus.md view on Meta::CPAN
```
## The [*E. coli* example](https://github.com/PacificBiosciences/FALCON/wiki/Setup:-Complete-example) in `falcon/example`
* Download the following three files through a proxy
```bash
mkdir -p $HOME/data/pacbio/rawdata/ecoli_test
cd $HOME/data/pacbio/rawdata/ecoli_test
proxychains4 wget -c https://www.dropbox.com/s/tb78i5i3nrvm6rg/m140913_050931_42139_c100713652400000001823152404301535_s1_p0.1.subreads.fasta
proxychains4 wget -c https://www.dropbox.com/s/v6wwpn40gedj470/m140913_050931_42139_c100713652400000001823152404301535_s1_p0.2.subreads.fasta
proxychains4 wget -c https://www.dropbox.com/s/j61j2cvdxn4dx4g/m140913_050931_42139_c100713652400000001823152404301535_s1_p0.3.subreads.fasta
# N50 14124
# C 105451
faops n50 -C *.subreads.fasta
```
* Configuration file and running
```bash
source ~/share/pitchfork/deployment/setup-env.sh