DiaColloDB
dcdb-query.perl view on Meta::CPAN
##-- timing
if ($dotime || $niters > 1) {
  $cli->info("operation completed in ", $timer->timestr,
             ($niters > 1 ? sprintf(" (%.2f iter/sec)", $niters/$timer->elapsed) : qw()),
            );
}
__END__
###############################################################
## pods
###############################################################
=pod
=encoding utf8
=head1 NAME
dcdb-query.perl - query a DiaColloDB diachronic collocation database
=head1 SYNOPSIS
dcdb-query.perl [OPTIONS] DBURL QUERY1 [QUERY2]
General Options:
-help # display a brief usage summary
-version # display program version
-[no]time # do/don't report operation timing (default=do)
-iters NITERS # benchmark NITERS iterations of query
Query Options:
-col, -ug, -ddc, -tdf # select profile type (collocations, unigrams, ddc client, tdf matrix; default=-col)
-(a|b)?date DATES # set target DATE or /REGEX/ or MIN-MAX
-(a|b)?slice SLICE # set target date slice (default=1)
-groupby GROUPBY # set result aggregation (default=l)
-kbest KBEST # return only KBEST items per date-slice (default=10)
-nokbest # disable k-best pruning
-cutoff CUTOFF # set minimum score for returned items (default=none)
-nocutoff # disable cutoff pruning
-[no]global # do/don't trim profiles globally (vs. locally by date-slice; default=don't)
-[no]strings # debug: do/don't stringify returned profile (default=do)
-1pass, -2pass # do/don't use fast but incorrect 1-pass method (default=don't)
-O KEY=VALUE # set DiaColloDB::Client option
-SO KEY=VALUE # set sub-client option (for list:// clients)
Scoring Options:
-f # score by raw frequency
-lf # score by log-frequency
-fm # score by frequency per million tokens
-lfm # score by log-frequency per million tokens
-milf # score by pointwise mutual information x log-frequency product
-mi1 # score by raw pointwise mutual information
-mi3 # score by pointwise mutual information^3 (Rychlý 2008)
-ld # score by scaled log-Dice coefficient (Rychlý 2008; default)
-ll # score by 1-sided log-likelihood ratio (Evert 2008)
-eps EPS # smoothing constant (default=0)
-diff DIFFOP # diff operation (adiff|diff|sum|min|max|avg|havg|gavg; default=adiff)
I/O Options:
-user USER[:PASSWD] # user credentials for HTTP queries
-text # use text output (default)
-json # use JSON output
-null # don't output profile at all
-[no]pretty # do/don't pretty-print JSON output (default=do)
-log-level LEVEL # set minimum DiaColloDB log-level
Arguments:
DBURL # DB URL (file://, rcfile://, http://, or list://)
QUERY1 # space-separated target1 string(s) LIST or /REGEX/ or DDC-query
QUERY2 # space-separated target2 string(s) LIST or /REGEX/ or DDC-query (for diff profiles)
Grouping and Filtering:
GROUPBY is a space- or comma-separated list of the form ATTR1[=FILTER1] ..., where:
- ATTR is the name or alias of a supported attribute (e.g. 'lemma', 'pos', etc.), and
- FILTER is either a |-separated LIST of literal values or a /REGEX/[gimsadlu]*
Diff Operations:
DIFFOP is one of: adiff diff sum min max avg havg gavg lavg
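The GROUPBY syntax described under "Grouping and Filtering" can be illustrated with a small parser. This is only a sketch: C<parse_groupby()> is a hypothetical helper that is not part of DiaColloDB, the tag values C<NN> and C<NE> are made up, and splitting on whitespace and commas means that a /REGEX/ containing either character would not survive.

```perl
use strict;
use warnings;

## parse_groupby($str): hypothetical helper, NOT part of DiaColloDB.
## Splits a request of the form "ATTR1[=FILTER1] ..." into a list of hashes
## { attr=>ATTR, type=>'none'|'list'|'regex', ... }.
sub parse_groupby {
  my $str = shift;
  my @specs;
  ## items are separated by whitespace and/or commas
  foreach my $item (grep { length } split /[\s,]+/, $str) {
    my ($attr, $filter) = split /=/, $item, 2;
    if (!defined $filter) {
      ## bare attribute, no filter
      push @specs, { attr=>$attr, type=>'none' };
    }
    elsif ($filter =~ m{^/(.*)/([gimsadlu]*)$}) {
      ## /REGEX/ with optional trailing flags
      push @specs, { attr=>$attr, type=>'regex', regex=>$1, flags=>$2 };
    }
    else {
      ## |-separated list of literal values
      push @specs, { attr=>$attr, type=>'list', values=>[split /\|/, $filter] };
    }
  }
  return @specs;
}

## usage: one regex-filtered attribute and one list-filtered attribute
foreach my $spec (parse_groupby('lemma=/^ge/i, pos=NN|NE')) {
  print join("\t", $spec->{attr}, $spec->{type}), "\n";
}
```

A real implementation would also have to resolve attribute aliases (such as C<l> for C<lemma>), which this sketch leaves untouched.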
=cut
###############################################################
## DESCRIPTION
###############################################################
=pod
=head1 DESCRIPTION
dcdb-query.perl
is a command-line utility for querying a
L<DiaColloDB|DiaColloDB> diachronic collocation database.
=cut
###############################################################
## OPTIONS AND ARGUMENTS
###############################################################
=pod
=head1 OPTIONS AND ARGUMENTS
=cut
###############################################################
# Arguments
###############################################################
=pod
=head2 Arguments
=over 4
=item DBURL
URL identifying the L<DiaColloDB|DiaColloDB>
database to be queried,
in a form accepted by L<DiaColloDB::Client-E<gt>open()|DiaColloDB::Client/open>.
In particular, I<DBURL> can be a local L<DiaColloDB|DiaColloDB> database directory,
in which case it will be queried via
the L<DiaColloDB::Client::file|DiaColloDB::Client::file> class.
[...]
=item -fm
score by frequency per million tokens
=item -lfm
score by log-frequency per million tokens
=item -milf
score by pointwise mutual information x log-frequency product
=item -mi1
score by raw pointwise mutual information
=item -mi3
score by pointwise mutual information^3 (Rychlý 2008)
=item -ld
score by scaled log-Dice coefficient (Rychlý 2008; default)
=item -ll
score by 1-sided log-likelihood ratio (Evert 2008)
=item -eps EPS
score function smoothing constant (default=0.5)
=item -diff DIFFOP
diff operation to use for
L<comparison profiles|/QUERY2>.
Known values:
adiff # absolute score difference (default)
diff # raw score difference
sum # sum
min # minimum
max # maximum
avg # average
havg # pseudo-harmonic average
gavg # pseudo-geometric average
=back
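To make the scoring and diff options above more concrete, the following sketch computes a scaled log-Dice score in the form given by Rychlý (2008), C<14 + log2(2*f12 / (f1+f2))>, and applies a few of the simpler diff operations to two such scores. Several things here are assumptions rather than DiaColloDB's actual code: the way C<$eps> is added to each frequency, the reading of C<adiff> as C<abs(a-b)>, the names C<log_dice()> and C<%diffop>, and the frequencies themselves. The pseudo-averages (C<havg>, C<gavg>) are left out because their definitions are not given here.

```perl
use strict;
use warnings;
use List::Util qw(min max);

## log-Dice after Rychly (2008): 14 + log2( 2*f12 / (f1+f2) ).
## ASSUMPTION: the smoothing constant $eps is simply added to each
## frequency; the real smoothing strategy may differ.
sub log_dice {
  my ($f1, $f2, $f12, $eps) = @_;
  $eps //= 0;
  return 14 + log( 2*($f12+$eps) / (($f1+$eps) + ($f2+$eps)) ) / log(2);
}

## textbook stand-ins for some DIFFOP values (hypothetical table)
my %diffop = (
  diff  => sub { $_[0] - $_[1] },       ## raw score difference
  adiff => sub { abs($_[0] - $_[1]) },  ## ASSUMPTION: absolute value of the difference
  sum   => sub { $_[0] + $_[1] },
  min   => sub { min(@_) },
  max   => sub { max(@_) },
  avg   => sub { ($_[0] + $_[1]) / 2 },
);

## made-up frequencies for one collocate in two profiles
my $score1 = log_dice(1000, 200, 50);
my $score2 = log_dice(1200, 300, 20);
printf "%-5s %.3f\n", $_, $diffop{$_}->($score1, $score2) foreach sort keys %diffop;
```

When the two items always co-occur (C<f1 == f2 == f12>), the ratio is 1 and the score reaches its maximum of 14; halving the co-occurrence frequency lowers the score by exactly 1.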
=cut
###############################################################
# I/O and Logging Options
=pod
=head2 I/O and Logging Options
=over 4
=item -user USER[:PASSWD]
specify user credentials for HTTP queries.
=item -text
generate text output (default).
=item -json
generate JSON output.
=item -html
generate HTML output.
=item -null
don't output profile data at all (for timing and debugging).
=item -[no]pretty
do/don't pretty-print JSON output (default=do).
=item -score-format FORMAT
L<sprintf|perlfunc/sprintf>-format for score formatting,
used by text and HTML output modes.
=item -log-level LEVEL
set minimum L<DiaColloDB::Logger|DiaColloDB::Logger> log-level.
=back
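The effect of a score FORMAT and of pretty-printing can be sketched with core Perl alone. Everything in this example is illustrative: the item data, the C<%.4f> format value, the tab-separated line layout, and the helper C<text_line()> are assumptions and do not reproduce the script's real output.

```perl
use strict;
use warnings;
use JSON::PP;

## toy profile data (made-up items and scores, for illustration only)
my @items = (
  { item=>'foo', score=>9.87654321 },
  { item=>'bar', score=>7.5 },
);

## text_line($item, $fmt): hypothetical helper, NOT part of DiaColloDB.
## Formats the score with a sprintf-style FORMAT, followed by the item label.
sub text_line {
  my ($it, $fmt) = @_;
  return sprintf("$fmt\t%s", $it->{score}, $it->{item});
}

## text-style output, one line per item
my $score_format = '%.4f';   ## hypothetical FORMAT value
print text_line($_, $score_format), "\n" foreach @items;

## JSON output: compact vs. pretty-printed (sorted keys for stable output)
my $compact = JSON::PP->new->canonical(1)->encode(\@items);
my $pretty  = JSON::PP->new->canonical(1)->pretty->encode(\@items);
print $compact, "\n", $pretty;
```

The compact form is easier to pipe into other tools, while the pretty-printed form is easier to read in a terminal.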
=cut
###############################################################
# Bugs and Limitations
###############################################################
=pod
=head1 BUGS AND LIMITATIONS
Probably many.
=cut
###############################################################
# Footer
###############################################################
=pod
=head1 ACKNOWLEDGEMENTS
Perl by Larry Wall.
=head1 AUTHOR
Bryan Jurish E<lt>moocow@cpan.orgE<gt>
=head1 SEE ALSO