App-Sandy
the `@` character at the beginning of the line; the `sam` format uses the first column (called the
*query template name*).
| Sequence identifier | File format |
| :-- | :-: |
| \>**MYID and Optional information**<br />ATCGATCG | `fasta` |
| @**MYID and Optional information**<br />ATCGATCG<br />+<br />ABCDEFGH | `fastq` |
| **MYID** 99 chr1 123456 20 8M chr1 123478 30 ATCGATCG ABCDEFGH | `sam` |
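The table above can be read as a small lookup rule. The following is a minimal sketch of that rule, not Sandy's own code: the function name `sequence_id` is hypothetical, and it assumes the `fasta` marker is `>` and that `sam` columns are separated by whitespace, as the table rows suggest.

```python
def sequence_id(record_line: str, file_format: str) -> str:
    """Return the sequence identifier found on the first line of a record.

    Hypothetical helper for illustration only; it mirrors the table above.
    """
    line = record_line.rstrip("\n")
    if file_format == "fasta":
        # fasta headers start with '>'; the rest of the line is the identifier
        if not line.startswith(">"):
            raise ValueError("a fasta header must start with '>'")
        return line[1:]
    if file_format == "fastq":
        # fastq headers start with '@'; the rest of the line is the identifier
        if not line.startswith("@"):
            raise ValueError("a fastq header must start with '@'")
        return line[1:]
    if file_format == "sam":
        # sam keeps the identifier in the first column (query template name)
        return line.split()[0]
    raise ValueError(f"unsupported format: {file_format}")


print(sequence_id(">MYID and Optional information", "fasta"))
print(sequence_id("MYID 99 chr1 123456 20 8M chr1 123478 30 ATCGATCG ABCDEFGH", "sam"))
```

Note that only the `sam` case drops the optional information, because the identifier there is limited to a single column.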
Sequence identifiers in the output may be customized with a format string passed by the user. This format
is a combination of literal and escaped characters, much like the format accepted by the C programming
language's `printf` function.
For example, when simulating paired-end sequencing, you can add the read length, read position and mate
position to every sequence identifier with the following format:
%i.%U read=%c:%t-%n mate=%c:%T-%N length=%r
In this case, results in `fastq` format would be:
==> Into R1
lib/App/Sandy/Command/Genome.pm view on Meta::CPAN
For the I<bam> option, B<--append-id> is ignored: the sequence
identifier is split on blank characters, and only the first
field is written to the query name column (first column).
=item B<--join-paired-ends>
By default, paired-end reads are put into two different files,
I<prefix_R[12]_001.fastq(\.gz)?>. If the user wants both outputs
together, she can pass this option.
If the B<--id> format does not contain the escape character %R, it is
automatically inserted right after the first field (blank-separated values),
as in I<id/%R>, which resolves to I<id/1> or I<id/2>.
This is necessary to distinguish which read is R1 and which is R2.
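The rule described for a missing %R can be sketched as follows. This is an illustration under stated assumptions, not the module's implementation: the function name `ensure_read_number` is hypothetical, and "first field" is taken to mean the leading run of non-blank characters.

```python
import re


def ensure_read_number(id_format: str) -> str:
    """Insert '/%R' after the first blank-separated field when %R is absent.

    Hypothetical helper; it only illustrates the rule described above.
    """
    if "%R" in id_format:
        # The user already placed the read number explicitly
        return id_format
    # Append '/%R' to the leading run of non-blank characters only
    return re.sub(r"^(\S+)", r"\1/%R", id_format, count=1)


template = ensure_read_number("%i.%U read=%c:%t-%n")
print(template)
# Resolving %R for each mate then gives the two distinguishable identifiers
print(template.replace("%R", "1"))
print(template.replace("%R", "2"))
```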
=item B<--compression-level>
Sets the speed of compression using the specified digit (between 1 and 9),
where "1" indicates the fastest compression method (less compression) and "9"
indicates the slowest compression method (best compression). The default
compression level is "6".
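To get a feel for the trade-off, the sketch below compresses the same data at levels 1, 6 and 9 with Python's standard `gzip` module. It assumes that Sandy's levels behave like ordinary gzip levels, which the I<.gz> suffix suggests but the text does not state; the sample record is made up for illustration.

```python
import gzip

# A made-up fastq record repeated many times, used only as sample input
record = b"@SR123.1\nATCGATCG\n+\nABCDEFGH\n" * 1000

# Compress the same bytes at the fastest, default and strongest levels
compressed = {level: gzip.compress(record, compresslevel=level) for level in (1, 6, 9)}

for level, blob in compressed.items():
    # Every level must decompress back to the identical input
    assert gzip.decompress(blob) == record
    print(f"level {level}: {len(record)} bytes -> {len(blob)} bytes")
```

The content that comes back is the same at every level; only the output size and the time spent compressing differ.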
lib/App/Sandy/Command/Genome.pm view on Meta::CPAN
=item B<--id>
Override the default template id:
I<single-end> %i.%U_%c_%s_%t_%n and I<paired-end> %i.%U_%c_%s_%S_%E
e.g. SR123.1_chr1_P_1001_1101
See B<Format>
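The substitution itself can be pictured with a few lines of code. This is a minimal sketch, not Sandy's implementation: `expand_id` is a hypothetical name, the values are supplied by hand, and the readings of %U (read number) and %c (sequence id) are assumptions inferred from the example identifier above.

```python
import re


def expand_id(id_format: str, values: dict) -> str:
    """Replace each %<letter> escape with its value; keep literals as they are.

    Hypothetical helper for illustration only.
    """
    def substitute(match):
        escape = match.group(1)
        if escape not in values:
            raise KeyError(f"unknown escape character: %{escape}")
        return str(values[escape])

    # An escape is '%' followed by exactly one character
    return re.sub(r"%(.)", substitute, id_format)


single_end_default = "%i.%U_%c_%s_%t_%n"
values = {"i": "SR123", "U": 1, "c": "chr1", "s": "P", "t": 1001, "n": 1101}
print(expand_id(single_end_default, values))
```

With these hand-picked values the default single-end template yields the identifier shown in the example above.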
=item B<Format>
A B<Format> string is a combination of literal and escape characters, similar to the way I<printf> works.
This gives the user the freedom to customize the fastq sequence identifier to fit her needs. Valid
escape characters are:
B<Common escape characters>
----------------------------------------------------------------------------
 Escape       Meaning
----------------------------------------------------------------------------
 %i           instrument id composed by SR + PID
 %I           job slot number
 %q           quality profile
 %e           sequencing error
 %x           sequencing error position
 %R           read 1, or 2 if it is the paired-end mate
lib/App/Sandy/Command/Genome.pm view on Meta::CPAN
 %C           sequence id type (reference or alternate non reference allele) ***
 %s           read strand
 %t           read start position
 %n           read end position
 %a           read start position regarding reference genome ***
 %b           read end position regarding reference genome ***
 %v           genomic variation position ***
----------------------------------------------------------------------------
*** specific for genomic variation (genome simulation only)
B<Paired-end specific escape characters>
----------------------------------------------------------------------------
 Escape       Meaning
----------------------------------------------------------------------------
 %T           mate read start position
 %N           mate read end position
 %A           mate read start position regarding reference genome ***
 %B           mate read end position regarding reference genome ***
 %D           distance between the paired-reads
 %M           fragment mean
lib/App/Sandy/Command/Genome.pm view on Meta::CPAN
To see the current list of available quality-profiles:
$ sandy quality
And in order to learn how to add your custom quality-profile, see:
$ sandy quality add --help
Sequence identifiers (the first lines of fastq entries) may be customized in the output with
a format string passed by the user. This format is a combination of literal and escaped
characters, much like the format accepted by the C programming language's printf function.
For example, let's simulate paired-end sequencing and add the read length, read position
and mate position to every sequence identifier:
$ sandy genome \
--id="%i.%U read=%c:%t-%n mate=%c:%T-%N length=%r" \
hg38.fa
In this case, results would be:
lib/App/Sandy/Command/Transcriptome.pm view on Meta::CPAN
For the I<bam> option, B<--append-id> is ignored: the sequence
identifier is split on blank characters, and only the first
field is written to the query name column (first column).
=item B<--join-paired-ends>
By default, paired-end reads are put into two different files,
I<prefix_R[12]_001.fastq(\.gz)?>. If the user wants both outputs
together, she can pass this option.
If the B<--id> format does not contain the escape character %R, it is
automatically inserted right after the first field (blank-separated values),
as in I<id/%R>, which resolves to I<id/1> or I<id/2>.
This is necessary to distinguish which read is R1 and which is R2.
=item B<--compression-level>
Sets the speed of compression using the specified digit (between 1 and 9),
where "1" indicates the fastest compression method (less compression) and "9"
indicates the slowest compression method (best compression). The default
compression level is "6".
lib/App/Sandy/Command/Transcriptome.pm view on Meta::CPAN
=item B<--id>
Override the default template id:
I<single-end> %i.%U %U and I<paired-end> %i.%U %U
e.g. SR123.1 1
See B<Format>
=item B<Format>
A B<Format> string is a combination of literal and escape characters, similar to the way I<printf> works.
This gives the user the freedom to customize the fastq sequence identifier to fit her needs. Valid
escape characters are:
B<Common escape characters>
----------------------------------------------------------------------------
 Escape       Meaning
----------------------------------------------------------------------------
 %i           instrument id composed by SR + PID
 %I           job slot number
 %q           quality profile
 %e           sequencing error
 %x           sequencing error position
 %R           read 1, or 2 if it is the paired-end mate
lib/App/Sandy/Command/Transcriptome.pm view on Meta::CPAN
 %C           sequence id type (reference or alternate non reference allele) ***
 %s           read strand
 %t           read start position
 %n           read end position
 %a           read start position regarding reference genome ***
 %b           read end position regarding reference genome ***
 %v           genomic variation position ***
----------------------------------------------------------------------------
*** specific for genomic variation (genome simulation only)
B<Paired-end specific escape characters>
----------------------------------------------------------------------------
 Escape       Meaning
----------------------------------------------------------------------------
 %T           mate read start position
 %N           mate read end position
 %A           mate read start position regarding reference genome ***
 %B           mate read end position regarding reference genome ***
 %D           distance between the paired-reads
 %M           fragment mean
lib/App/Sandy/Command/Transcriptome.pm view on Meta::CPAN
To see the current list of available quality-profiles:
$ sandy quality
And in order to learn how to add your custom quality-profile, see:
$ sandy quality add --help
Sequence identifiers (the first lines of fastq entries) may be customized in the output with
a format string passed by the user. This format is a combination of literal and escaped
characters, much like the format accepted by the C programming language's printf function.
For example, let's simulate paired-end sequencing and add the read length, read position
and mate position to every sequence identifier:
$ sandy expression --id="%i.%U read=%c:%t-%n mate=%c:%T-%N length=%r" my_genes.fa.gz
In this case, results would be:
==> Into R1
@SR.1 read=BRAF:979-880 mate=BRAF:736-835 length=100
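Identifiers built this way are easy to read back downstream. The sketch below parses the R1 header shown above with a regular expression written for this one format string; the pattern and the `parse_id` name are hypothetical, and the coordinates are assumed to be inclusive.

```python
import re

# Pattern tailored to "%i.%U read=%c:%t-%n mate=%c:%T-%N length=%r"
ID_PATTERN = re.compile(
    r"^@?(?P<name>\S+) "
    r"read=(?P<seq>[^:\s]+):(?P<start>\d+)-(?P<end>\d+) "
    r"mate=(?P<mate_seq>[^:\s]+):(?P<mate_start>\d+)-(?P<mate_end>\d+) "
    r"length=(?P<length>\d+)$"
)


def parse_id(header: str) -> dict:
    """Split a customized identifier into named fields (hypothetical helper)."""
    match = ID_PATTERN.match(header.strip())
    if match is None:
        raise ValueError(f"identifier does not follow the expected format: {header!r}")
    fields = match.groupdict()
    # Positions and length are numeric; everything else stays a string
    for key in ("start", "end", "mate_start", "mate_end", "length"):
        fields[key] = int(fields[key])
    return fields


fields = parse_id("@SR.1 read=BRAF:979-880 mate=BRAF:736-835 length=100")
read_span = abs(fields["start"] - fields["end"]) + 1
mate_span = abs(fields["mate_start"] - fields["mate_end"]) + 1
print(fields["name"], fields["seq"], read_span, mate_span, fields["length"])
```

For this header both spans come out equal to the reported length, which is a quick sanity check on a customized identifier.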
my_snprintf() NEED_my_snprintf NEED_my_snprintf_GLOBAL
my_sprintf() NEED_my_sprintf NEED_my_sprintf_GLOBAL
my_strlcat() NEED_my_strlcat NEED_my_strlcat_GLOBAL
my_strlcpy() NEED_my_strlcpy NEED_my_strlcpy_GLOBAL
my_strnlen() NEED_my_strnlen NEED_my_strnlen_GLOBAL
newCONSTSUB() NEED_newCONSTSUB NEED_newCONSTSUB_GLOBAL
newSVpvn_share() NEED_newSVpvn_share NEED_newSVpvn_share_GLOBAL
PL_parser NEED_PL_parser NEED_PL_parser_GLOBAL
PL_signals NEED_PL_signals NEED_PL_signals_GLOBAL
pv_display() NEED_pv_display NEED_pv_display_GLOBAL
pv_escape() NEED_pv_escape NEED_pv_escape_GLOBAL
pv_pretty() NEED_pv_pretty NEED_pv_pretty_GLOBAL
sv_catpvf_mg() NEED_sv_catpvf_mg NEED_sv_catpvf_mg_GLOBAL
sv_catpvf_mg_nocontext() NEED_sv_catpvf_mg_nocontext NEED_sv_catpvf_mg_nocontext_GLOBAL
sv_setpvf_mg() NEED_sv_setpvf_mg NEED_sv_setpvf_mg_GLOBAL
sv_setpvf_mg_nocontext() NEED_sv_setpvf_mg_nocontext NEED_sv_setpvf_mg_nocontext_GLOBAL
sv_unmagicext() NEED_sv_unmagicext NEED_sv_unmagicext_GLOBAL
utf8_to_uvchr_buf() NEED_utf8_to_uvchr_buf NEED_utf8_to_uvchr_buf_GLOBAL
vload_module() NEED_vload_module NEED_vload_module_GLOBAL
vmess() NEED_vmess NEED_vmess_GLOBAL
warner() NEED_warner NEED_warner_GLOBAL
DEFINEP_t8_p8|5.033003||Viu
DEFINEP_t8_pb|5.033003||Viu
DEFINEP_tb|5.035004||Viu
DEFINEP_tb_p8|5.033003||Viu
DEFINEP_tb_pb|5.033003||Viu
DEFSV|5.004005|5.003007|p
DEFSV_set|5.010001|5.003007|p
del_body_by_type|||Viu
delete_eval_scope|5.009004||xViu
delimcpy|5.004000|5.004000|n
delimcpy_no_escape|5.025005||cVni
DEL_NATIVE|5.017010||Viu
del_sv|5.005000||Viu
DEPENDS_PAT_MOD|5.013009||Viu
DEPENDS_PAT_MODS|5.013009||Viu
deprecate|5.011001||Viu
deprecate_disappears_in|5.025009||Viu
deprecate_fatal_in|5.025009||Viu
despatch_signals|5.007001||cVu
destroy_matcher|5.027008||Viu
DETACH|5.005000||Viu
putc|5.003007||Viu
put_charclass_bitmap_innards|5.021004||Viu
put_charclass_bitmap_innards_common|5.023008||Viu
put_charclass_bitmap_innards_invlist|5.023008||Viu
put_code_point|5.021004||Viu
putc_unlocked|5.003007||Viu
putenv|5.005000||Viu
put_range|5.019009||Viu
putw|5.003007||Viu
pv_display|5.006000|5.003007|p
pv_escape|5.009004|5.003007|p
pv_pretty|5.009004|5.003007|p
pv_uni_display|5.007003|5.007003|
pWARN_ALL|5.006000||Viu
pWARN_NONE|5.006000||Viu
pWARN_STD|5.006000||Viu
PWGECOS|5.004005|5.004005|Vn
PWPASSWD|5.005000|5.005000|Vn
qerror|5.006000||cViu
QR_PAT_MODS|5.009005||Viu
QUAD_IS_INT|5.006000|5.006000|Vn
# define PERL_PV_PRETTY_NOCLEAR PERL_PV_ESCAPE_NOCLEAR
#endif
#ifndef PERL_PV_PRETTY_DUMP
# define PERL_PV_PRETTY_DUMP PERL_PV_PRETTY_ELLIPSES|PERL_PV_PRETTY_QUOTE
#endif
#ifndef PERL_PV_PRETTY_REGPROP
# define PERL_PV_PRETTY_REGPROP PERL_PV_PRETTY_ELLIPSES|PERL_PV_PRETTY_LTGT|PERL_PV_ESCAPE_RE
#endif
/* Hint: pv_escape
* Note that unicode functionality is only backported to
* those perl versions that support it. For older perl
* versions, the implementation will fall back to bytes.
*/
#ifndef pv_escape
#if defined(NEED_pv_escape)
static char * DPPP_(my_pv_escape)(pTHX_ SV * dsv, char const * const str, const STRLEN count, const STRLEN max, STRLEN * const escaped, const U32 flags);
static
#else
extern char * DPPP_(my_pv_escape)(pTHX_ SV * dsv, char const * const str, const STRLEN count, const STRLEN max, STRLEN * const escaped, const U32 flags);
#endif
#if defined(NEED_pv_escape) || defined(NEED_pv_escape_GLOBAL)
#ifdef pv_escape
# undef pv_escape
#endif
#define pv_escape(a,b,c,d,e,f) DPPP_(my_pv_escape)(aTHX_ a,b,c,d,e,f)
#define Perl_pv_escape DPPP_(my_pv_escape)
char *
DPPP_(my_pv_escape)(pTHX_ SV *dsv, char const * const str,
const STRLEN count, const STRLEN max,
STRLEN * const escaped, const U32 flags)
{
const char esc = flags & PERL_PV_ESCAPE_RE ? '%' : '\\';
const char dq = flags & PERL_PV_ESCAPE_QUOTE ? '"' : esc;
char octbuf[32] = "%123456789ABCDF";
STRLEN wrote = 0;
STRLEN chsize = 0;
STRLEN readsize = 1;
#if defined(is_utf8_string) && defined(utf8_to_uvchr_buf)
bool isuni = flags & PERL_PV_ESCAPE_UNI ? 1 : 0;
#endif
wrote += chsize;
} else {
char tmp[2];
my_snprintf(tmp, sizeof tmp, "%c", c);
sv_catpvn(dsv, tmp, 1);
wrote++;
}
if (flags & PERL_PV_ESCAPE_FIRSTCHAR)
break;
}
if (escaped != NULL)
*escaped= pv - str;
return SvPVX(dsv);
}
#endif
#endif
#ifndef pv_pretty
#if defined(NEED_pv_pretty)
static char * DPPP_(my_pv_pretty)(pTHX_ SV * dsv, char const * const str, const STRLEN count, const STRLEN max, char const * const start_color, char const * const end_color, const U32 flags);
static
#define pv_pretty(a,b,c,d,e,f,g) DPPP_(my_pv_pretty)(aTHX_ a,b,c,d,e,f,g)
#define Perl_pv_pretty DPPP_(my_pv_pretty)
char *
DPPP_(my_pv_pretty)(pTHX_ SV *dsv, char const * const str, const STRLEN count,
const STRLEN max, char const * const start_color, char const * const end_color,
const U32 flags)
{
const U8 dq = (flags & PERL_PV_PRETTY_QUOTE) ? '"' : '%';
STRLEN escaped;
if (!(flags & PERL_PV_PRETTY_NOCLEAR))
sv_setpvs(dsv, "");
if (dq == '"')
sv_catpvs(dsv, "\"");
else if (flags & PERL_PV_PRETTY_LTGT)
sv_catpvs(dsv, "<");
if (start_color != NULL)
sv_catpv(dsv, D_PPP_CONSTPV_ARG(start_color));
pv_escape(dsv, str, count, max, &escaped, flags | PERL_PV_ESCAPE_NOCLEAR);
if (end_color != NULL)
sv_catpv(dsv, D_PPP_CONSTPV_ARG(end_color));
if (dq == '"')
sv_catpvs(dsv, "\"");
else if (flags & PERL_PV_PRETTY_LTGT)
sv_catpvs(dsv, ">");
if ((flags & PERL_PV_PRETTY_ELLIPSES) && escaped < count)
sv_catpvs(dsv, "...");
return SvPVX(dsv);
}
#endif
#endif
#ifndef pv_display
#if defined(NEED_pv_display)