App-Bin4TSV-8
=head1 SYNOPSIS
This module provides the following Unix-like commands:
colgrep
colsummary
crosstable
csel
csv2tsv
digitdemog
expandtab
venn
=head1 DESCRIPTION
=head1 SEE ALSO
=cut
1 ;
Running the command above also installs the
following modules as dependencies.
App::colsummary
App::colgrep
App::crosstable
App::csel
App::csv2tsv # This will also install Text::CSV
App::digitdemog
App::expandtab # This will also install Text::VisualWidth
App::venn
}
},
"runtime" : {
"requires" : {
"App::colgrep" : "0",
"App::colsummary" : "0",
"App::crosstable" : "0",
"App::csel" : "0",
"App::csv2tsv" : "0",
"App::digitdemog" : "0",
"App::expandtab" : "0",
"App::venn" : "0"
}
}
},
"release_status" : "stable",
"version" : "0.120"
}
directory:
- t
- inc
requires:
App::colgrep: '0'
App::colsummary: '0'
App::crosstable: '0'
App::csel: '0'
App::csv2tsv: '0'
App::digitdemog: '0'
App::expandtab: '0'
App::venn: '0'
version: '0.120'
Makefile.PL
WriteMakefile (
NAME => 'App::Bin4TSV::8' ,
VERSION_FROM => '8.pm' ,
PREREQ_PM => {
App::colgrep => 0 ,
App::colsummary => 0 ,
App::crosstable => 0 ,
App::csel => 0 ,
App::csv2tsv => 0 ,
App::digitdemog => 0 ,
App::expandtab => 0 ,
App::venn => 0
},
AUTHOR => 'Toshiyuki SHIMONO (bin4tsv at gmail.com)' ,
LICENSE => 'perl_5'
) ;
This module provides the following Unix-like commands:
colgrep
colsummary
crosstable
csel
csv2tsv
digitdemog
expandtab
venn
Copyright (c) 2021 Toshiyuki SHIMONO. All rights reserved.
This program is free software; you can redistribute it and/or
modify it under the same terms as Perl itself.
```
setopt interactivecomments
# ↑ Running the single command above makes zsh treat everything after # as a comment, as bash does.
# ↑ To undo this later, run: unsetopt interactivecomments
```
### How to install the commands in this repository individually
```
cpanm App::csv2tsv # ← depends on Text::CSV; takes about 20 seconds
cpanm App::expandtab # ← depends on Text::VisualWidth
cpanm App::colsummary # the rest each finish in about 2 seconds
cpanm App::venn # the module name is App::<command name>
cpanm App::csel # to uninstall, use -U ↓
cpanm App::crosstable # e.g. cpanm -U App::csel
cpanm App::freq # cpanm -v shows installation details
cpanm App::digitdemog
```
### Convert CSV to TSV (csv2tsv)
```
cpanm App::csv2tsv
wget https://www8.cao.go.jp/chosei/shukujitsu/syukujitsu.csv
file syukujitsu.csv # → Non-ISO extended-ASCII text, with CRLF line terminators
nkf syukujitsu.csv | less
nkf syukujitsu.csv | csv2tsv > syukujitsu.tsv
less -x25 syukujitsu.tsv
```
### Align columns vertically for display (expandtab)
```
cpanm App::expandtab # about 8 seconds; depends on Text::VisualWidth
tr ":" "\t" < /etc/passwd | expandtab | less -NS
sed 1,10d /etc/passwd | expandtab -i: -b. | less -NS
```
### Show the characteristics of all columns compactly (colsummary)
1. Example using /etc/passwd
```
colsummary -i: <( grep -v -e '^#' /etc/passwd )
```
2. Download data from the "TRC New Book Open Data" site (「TRC新刊図書オープンデータ」) and run a command line like the one below. The file name will differ as appropriate, for example in its date part.
The site stated that "this bibliographic information may be used freely, without any usage procedure, for both commercial and non-commercial purposes" (as of Sunday, June 20, 2021).
```
cpanm App::colsummary # about 2 seconds
unzip TRCOpenBibData_20210605.zip # extract the zip file
ln -s TRCOpenBibData_20210605.txt 0605.txt
colsummary -v9 -g3 -m0 0605.txt | expandtab -s30
```
3. Example using national holiday data
```
nkf syukujitsu.csv | tr "/," "\t\t" | sed 1d | ~/bin4tsv/*/colsummary | expandtab
```
The command above ↑ produces the following output ↓
```
cpos diff ave. range frequent frequency~lower(multi) digits
1 68 1992.323 1955~2022 2019|2018|2001|2007|1996|1990 22|20|19(4)|18(3)|17(11)|16(8)~14(7)|13(6)|12(8)|11|10|9(10) 4
2 12 6.104 1~12 5|11|1|9|4|3 185|151|147|138|78|74|67|60|35|30|9|1 1~2
3 25 13.618 1~30 23|3|15|11|1|5 161|136|87|70|69|68(2)~10|9|8(3)|7|6(3)|5 1~2
4 23 0.000 こどもの日~結婚の儀 休日|こどもの日|勤労感謝の日|文化の日|憲法記念日|元日 104|68(8)|67|57|56|53~27|16|7|3|2(2)|1(3) 2~12
```
4. Example using the number of people tested by PCR
```
colsummary -= -i, pcr_case_daily.csv | expandtab
# -= treats the first line as the list of column names rather than as data values.
# -i, changes the input delimiter to the comma character (,) instead of the tab character.
```
The command above ↑ produces the following output ↓
```
cpos diff ave. name range frequent frequency~lower(multi) digits
1 487 2020.347 日付 2020/10/1~2021/6/9 2020/8/21|2020/7/19|2020/7/24|2020/5/10|2020/10/26|2020/11/11 1(487) 8~10
2 85 24.988 国立感染症研究所 0|1~517 0|20|8|3|18|36 354|6(2)|5|3(6)|2(23)|1(52) 1~3
3 178 222.912 検疫所 0|1~1733 0|1|13|4|11|3 257|12|5(2)|4(2)|3(3)|2(22)|1(147) 1~4
4 474 3910.355 地方衛生研究所・保健所 398~11856 2425|867|4678|2502|6314|1326 2(13)|1(461) 3~5
5 471 20647.676 民間検査会社 ~0|2~89005 0|37|28|3651|11|5 9|2(8)|1(462) 0|1~5
6 453 1974.018 大学等 ~0|4~6367 0|1012|4766|558|650|3924 14|2(21)|1(431) 0|1~4
7 455 7567.082 医療機関 |9~35210 |430|543|1808|795|10649 18|2(15)|1(439) 0|1~5
```
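The `ave.` column that appears in the `colsummary` output examples above is the mean of the values taken as numbers (input values that are not numbers are counted as 0). This output column can be suppressed with the `-m 0` option. With `expandtab`, half-width space characters ...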
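The averaging rule described above can be sketched with plain `awk`. This is only an illustration of one reading of that rule, not the code `colsummary` itself runs. The helper name `column_means`, the regular expression used to decide what counts as a number, and the sample data are all assumptions made for this example.

```shell
# Hypothetical helper, not part of App::colsummary.
# stdin: tab-separated lines; stdout: "column<TAB>mean" for each column.
column_means() {
  awk -F'\t' '
    {
      for (i = 1; i <= NF; i++)
        # Add the field only if it looks like a plain number;
        # anything else contributes 0 to the sum (assumed rule).
        if ($i ~ /^-?[0-9]+(\.[0-9]+)?$/) sum[i] += $i
      if (NF > ncol) ncol = NF
    }
    END {
      if (NR == 0) exit
      # Divide by the number of lines, so non-numeric cells pull the mean down.
      for (i = 1; i <= ncol; i++) printf "%d\t%.3f\n", i, sum[i] / NR
    }
  '
}

# Made-up sample: column 2 holds "x", "4" and an empty cell.
printf '10\tx\n20\t4\n30\t\n' | column_means
```

Under this reading, a column made up entirely of text gets a mean of 0.000, which matches the holiday-name column in the sample output above.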
### See how line data overlaps across multiple files (venn)
Demo with four process substitutions:
```
cpanm App::venn
function y(){ echo -n $* | perl -pe's/./$&\n/g' }
perldoc List::Util # press q to quit; see minstr and maxstr
```
### Cross tabulation (crosstable)
```
cpanm App::crosstable
```
1. Holiday example
```
awk -F/ 'NR>1{print $2"\t"$1}' syukujitsu.csv | crosstable | csel -p1,-43..-1 | expandtab
```
Output of the above ↓ (June normally has no holidays, but there was one in 1993.)
```
X1*X2 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022
1 2 2 2 2 4 2 2 2 2 4 2 2 2 2 2 4 2 2 2 2 2 2 2 2 2 2 3 2 2 2 2 2 3 2 2 2 2 3 2 2 2 2 2
2 1 1 1 1 1 1 1 1 1 2 2 1 1 1 1 1 2 1 1 1 1 2 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 2 1 3 2 2
3 1 1 2 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 2 1 1 1 1 2 1 1 1 1 1 2 1 1 1 1 1 1
4 1 1 1 1 2 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 2 1 1 1 1 2 1 1 1 1 1 2 2 1 1 1
17 33 2021
18 51 2020
22 73 2019
20 93 2018
17 110 2017
```
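The holiday example feeds two tab-separated columns (month, year) into `crosstable` and gets back a matrix of counts. The idea can be sketched with `sort` and `awk` as below. This is not how App::crosstable is implemented. The helper name `crosstab_sketch`, the `*` placeholder in the header cell, and the sample data are assumptions made for this example.

```shell
# Hypothetical helper, not part of App::crosstable.
# stdin: two tab-separated columns (row key, column key).
crosstab_sketch() {
  LC_ALL=C sort | awk -F'\t' '
    {
      cnt[$1 FS $2]++                       # count each (row, column) pair
      if (!($1 in rseen)) { rseen[$1] = 1; rows[++nr] = $1 }
      if (!($2 in cseen)) { cseen[$2] = 1; cols[++nc] = $2 }
    }
    END {
      line = "*"                            # placeholder for the header cell
      for (c = 1; c <= nc; c++) line = line FS cols[c]
      print line
      for (r = 1; r <= nr; r++) {
        line = rows[r]
        # "+ 0" prints 0 for pairs that never occurred
        for (c = 1; c <= nc; c++) line = line FS (cnt[rows[r] FS cols[c]] + 0)
        print line
      }
    }
  '
}

# Made-up sample: month<TAB>year pairs.
printf '1\t2020\n1\t2020\n2\t2020\n1\t2021\n' | crosstab_sketch
```

In this sketch, row keys come out in byte-wise sorted order (so `10` sorts before `2`), and column keys in the order they first appear after sorting.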
### Count how many times each character appears at each position across all lines (digitdemog)
```
digitdemog <( csel -p11 TRCOpenBibData_20210605.txt ) | expandtab
# The result of the above is as follows ↓
1 2 3 4 5 6 7 8 9
'.' 0 0 0 0 3 0 0 0 0
'0' 0 137 0 0 3 0 0 0 0
'1' 385 313 0 4 0 0 0 0 0
'2' 666 59 0 10 4 0 0 0 0
'3' 118 16 0 4 1 0 0 0 0
'4' 3 6 0 0 0 0 0 0 0
'5' 0 90 0 0 2 3 0 0 0
'6' 0 208 0 0 4 0 0 0 0
'9' 0 253 0 0 3 0 0 0 0
'c' 0 0 1151 0 0 18 3 0 0
'm' 0 0 0 1151 0 0 18 3 0
'×' 0 0 21 0 0 0 0 0 0
end 80 0 0 0 1151 0 0 18 3
```
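The table above counts, for each character, how many lines have it at each position. The `end` row appears to count lines that end just before that position. A rough `awk` sketch of that tally follows. It is not the `digitdemog` implementation. The helper name `char_positions`, the long "character, position, count" output layout, and the ASCII-only sample are assumptions made for this example.

```shell
# Hypothetical helper, not part of App::digitdemog.
# stdin: one value per line (no tab characters assumed).
# stdout: "character<TAB>position<TAB>count", sorted byte-wise.
char_positions() {
  awk '
    {
      n = length($0)
      for (i = 1; i <= n; i++) cnt[substr($0, i, 1) "\t" i]++
      cnt["end\t" (n + 1)]++      # mark where each line ends
    }
    END { for (k in cnt) print k "\t" cnt[k] }
  ' | LC_ALL=C sort
}

# Made-up sample: two sizes and one empty value.
printf '11cm\n12cm\n\n' | char_positions
```

The sketch prints one line per (character, position) pair rather than a matrix. How non-ASCII characters such as `×` are counted depends on the `awk` implementation and locale, which is why the sample sticks to ASCII.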
Extract concrete examples with `digitdemog -L2`.
```
digitdemog -L2 <( csel -p11 TRCOpenBibData_20210605.txt ) | expandtab
# The result of the above is as follows ↓
length freq minstr maxstr
0 80 '' <-- same
4 1151 '11cm'(2) '43cm'(3)
7 18 '12×12cm'(2) '27×39cm'(2)
8 3 '18×8.5cm' <-- same
```
Use `digitdemog -.`.
```
digitdemog -. <( csel -p11 TRCOpenBibData_20210605.txt ) | expandtab
# The result of the above is as follows ↓
1 2 3 4 5 6 7 8 9
'.' 0 0 0 0 3. 0 0 0 0
'0' 0 137 0 0 3 0 0 0 0
'1' 385 313 0 4 0 0 0 0 0
'2' 666 59 0 10 4 0 0 0 0
'3' 118 16 0 4 1 0 0 0 0
'4' 3 6 0 0 0 0 0 0 0
'5' 0 90 0 0 2 3. 0 0 0
'6' 0 208 0 0 4 0 0 0 0