DiaColloDB
##========================================================================
## POD DOCUMENTATION, auto-generated by podextract.perl
##========================================================================
## NAME
=pod
=head1 NAME
DiaColloDB::Client::list - diachronic collocation db: client: distributed
=cut
##========================================================================
## DESCRIPTION
=pod
=head1 DESCRIPTION
DiaColloDB::Client::list is a subclass of
L<DiaColloDB::Client|DiaColloDB::Client> for
accessing a set of distributed L<DiaColloDB|DiaColloDB> databases
via a C<list://> URL whose path part is a whitespace- or "+"-separated list
of sub-URLs supported by L<DiaColloDB::Client|DiaColloDB::Client>.
It supports the L<DiaColloDB::Client|DiaColloDB::Client> API
by calling the relevant methods on each of its sub-clients.
new() options and object structure:
##-- DiaColloDB::Client: options
url => $url, ##-- list url (sub-urls, separated by whitespace, "+SCHEME://", or "+://")
##
##-- DiaColloDB::Client::list
urls => \@urls, ##-- sub-urls
opts => \%opts, ##-- sub-client options
fudge => $fudge, ##-- get ($fudge*$kbest) items from sub-clients (-1:all; 0|1:none; default=10)
fork => $bool, ##-- run each subclient query in its own fork? (default=if available)
lazy => $bool, ##-- use temporary on-demand sub-clients (true,default) or persistent sub-clients (false)
extend => $bool, ##-- use extend() queries to acquire correct f2 counts? (default=true)
logFudge => $level, ##-- log-level for fudge-coefficient debugging (default='debug')
logThread => $level, ##-- log-level for thread operations (default='none')
##
##-- guts
#clis => \@clis, ##-- per-url sub-clients for "busy" (non-"lazy") mode
The most important client parameter is the fudge-coefficient option C<fudge=E<gt>$fudge>, which requests
that up to C<$fudge*$kbest> items be retrieved from sub-clients for each L<profile()|profile>
call. If C<$fudge E<lt> 0>, all collocates will be retrieved from each sub-client,
and trimming will be performed exclusively by the superordinate DiaColloDB::Client::list object.
If C<$fudge == 0>, only the C<$kbest> collocates from each sub-client will be retrieved.
The default value of 10 should return reasonable results without too great
a performance penalty in most cases, but be aware that the results for C<$fudge E<gt>= 0> may not be strictly correct
due to sub-client local pruning; see L</KNOWN BUGS> for details.
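The effect of the fudge coefficient can be illustrated with a small stand-alone sketch.
It does not use DiaColloDB itself: the helpers C<sub_kbest()>, C<top_n()>, and C<merged_kbest()>
are hypothetical names introduced here, raw score sums stand in for the real scoring functions,
and the treatment of fractional C<$fudge> values is an assumption.

```perl
use strict;
use warnings;

##-- hypothetical helper (not part of DiaColloDB): number of items to request
##   from each sub-client for a given $fudge and $kbest; -1 means "all"
sub sub_kbest {
  my ($fudge, $kbest) = @_;
  return -1     if ($fudge < 0);   ##-- retrieve everything
  return $kbest if ($fudge <= 1);  ##-- 0|1: no fudging (assumed for fractions too)
  return $fudge * $kbest;
}

##-- keys of a {item=>score} HASH-ref by descending score (ties: by key),
##   trimmed to the first $n keys unless $n < 0
sub top_n {
  my ($scores, $n) = @_;
  my @keys = sort { $scores->{$b} <=> $scores->{$a} || $a cmp $b } keys %$scores;
  $#keys = $n - 1 if ($n >= 0 && $n < @keys);
  return @keys;
}

##-- prune each sub-client result locally, sum the scores, trim to $kbest
sub merged_kbest {
  my ($fudge, $kbest, @subs) = @_;
  my $n = sub_kbest($fudge, $kbest);
  my %sum;
  foreach my $sub (@subs) {
    $sum{$_} += $sub->{$_} foreach top_n($sub, $n);
  }
  return top_n(\%sum, $kbest);
}

##-- toy data: "c" is the best collocate overall, but only second-best locally
my @subs = ({a=>10, c=>6}, {b=>10, c=>6});
print join(',', merged_kbest( 0, 1, @subs)), "\n";  ##-- prints "a": local pruning hides "c"
print join(',', merged_kbest(-1, 1, @subs)), "\n";  ##-- prints "c": exact result
```

In this toy example a fudge coefficient of 2 would already suffice to recover "c",
which is the kind of trade-off the default value aims at.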
This module supports parallel processing of sub-client queries using whatever
threading implementation (if any) is provided by the L<DiaColloDB::threads|DiaColloDB::threads> module.
Parallel sub-client processing is enabled by default if
a working
L<threads|threads> or L<forks|forks> module was found by
L<DiaColloDB::threads|DiaColloDB::threads>,
but can be disabled by specifying
the C<fork=E<gt>0> option to the list-client.
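The general idea of running each sub-client query in its own process can be sketched
with plain C<fork()> and C<pipe()>. This is a generic illustration only, not the
L<DiaColloDB::threads|DiaColloDB::threads> implementation: C<run_jobs()> is a hypothetical
helper, each "query" is assumed to be a CODE-ref returning a string, and a Unix-like
platform with a working C<fork()> is assumed.

```perl
use strict;
use warnings;
use POSIX ();

##-- hypothetical helper: run each job (a CODE-ref returning a string) in its own
##   child process if $parallel is true, else serially; results are in job order
sub run_jobs {
  my ($parallel, @jobs) = @_;
  return map { $_->() } @jobs if (!$parallel);

  my @kids;
  foreach my $job (@jobs) {
    pipe(my $rd, my $wr) or die "pipe failed: $!";
    my $pid = fork();
    die "fork failed: $!" if (!defined $pid);
    if ($pid == 0) {
      ##-- child: run the job, send its result to the parent, and exit
      close($rd);
      print {$wr} $job->();
      close($wr);
      POSIX::_exit(0);  ##-- skip END blocks and buffers inherited from the parent
    }
    ##-- parent: keep the read end, collect the result later
    close($wr);
    push @kids, [$pid, $rd];
  }

  my @results;
  foreach my $kid (@kids) {
    my ($pid, $rd) = @$kid;
    local $/ = undef;  ##-- slurp the whole result
    my $out = <$rd>;
    push @results, (defined $out ? $out : '');
    close($rd);
    waitpid($pid, 0);
  }
  return @results;
}

print join(' ', run_jobs(1, sub { "a" }, sub { "b" })), "\n";
```

Passing a false first argument runs the same jobs serially, analogous in spirit to C<fork=E<gt>0>.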
=head2 List URLs
List URLs passed as the C<url> option to the constructor can be either ARRAY-refs
of sub-URLs or simple strings with an optional C<list://> scheme.
In the latter case, sub-URLs in the argument string are separated by whitespace
or by a plus character ("+") followed by the sub-URL scheme, e.g.:
["file://a","file://b"] ##-- ARRAY-ref of explicit file URLs
["a" , "b" ] ##-- ARRAY-ref of implicit file URLs
"list://file://a file://b" ##-- string with space-separated explicit file URLs
"list://a b" ##-- string with space-separated implicit file URLs
"list://file://a+file://b" ##-- string with "+"-separated explicit file URLs
"list://a+://b" ##-- string with "+"-separated implicit file URLs
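The separator rules above could be implemented roughly as follows. This is a sketch
under assumptions, not the module's actual parser: C<split_list_url()> is a hypothetical
helper, and the pattern used to recognize a URL scheme is a guess.

```perl
use strict;
use warnings;

##-- hypothetical helper: split a list URL (string or ARRAY-ref) into sub-URLs
sub split_list_url {
  my $url = shift;
  return @$url if (ref($url) eq 'ARRAY');  ##-- ARRAY-ref: already split
  $url =~ s{^list://}{};                   ##-- optional "list://" scheme
  ##-- split on whitespace, or on "+" if followed by "SCHEME://" or "://"
  my @urls = grep { $_ ne '' }
    split(m{\s+|\+(?=[A-Za-z][A-Za-z0-9+.\-]*://|://)}, $url);
  s{^://}{} for @urls;                     ##-- "+://b" leaves "://b": drop empty scheme
  return @urls;
}

print join(' | ', split_list_url("list://file://a+file://b")), "\n";  ##-- file://a | file://b
print join(' | ', split_list_url("list://a+://b")), "\n";             ##-- a | b
```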
Options can be passed to the appropriate sub-URLs via those URLs' query strings,
as described in L<DiaColloDB::Client/open>.
Options to the DiaColloDB::Client::list object itself can be passed in by using
a sub-URL consisting of a HASH-ref or only a query string, e.g.:
["a","b",{fudge=>0}] ##-- ARRAY-ref with local options as HASH-ref
["a","b","?fudge=0"] ##-- ARRAY-ref with local options as query-string
"list://a b ?fudge=0" ##-- space-separated string with local options
"list://a+://b+://?fudge=0" ##-- "+"-separated string with local options
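Separating local options from real sub-URLs might look like the following sketch.
C<partition_urls()> is a hypothetical helper, not part of the module; it assumes the
sub-URLs have already been split into a list and performs no URL-decoding of option values.

```perl
use strict;
use warnings;

##-- hypothetical helper: separate real sub-URLs from local option items
##   (HASH-refs or query-string-only items such as "?fudge=0")
sub partition_urls {
  my @items = @_;
  my (@urls, %opts);
  foreach my $item (@items) {
    if (ref($item) eq 'HASH') {
      %opts = (%opts, %$item);             ##-- options as HASH-ref
    }
    elsif ($item =~ /^\?(.*)$/) {
      foreach my $pair (split(/[&;]/, $1)) {
        my ($key, $val) = split(/=/, $pair, 2);
        $opts{$key} = $val;                ##-- options as query string (not decoded)
      }
    }
    else {
      push(@urls, $item);                  ##-- ordinary sub-URL
    }
  }
  return (\@urls, \%opts);
}

my ($urls, $opts) = partition_urls("a", "b", "?fudge=0");
print "urls=@$urls fudge=$opts->{fudge}\n";  ##-- urls=a b fudge=0
```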
=cut
##======================================================================
## Footer
##======================================================================
=pod
=head1 KNOWN BUGS
=head2 Incorrect Independent Collocate Frequencies
Prior to the introduction of L<extend()|DiaColloDB/extend> queries in
DiaColloDB v0.11.000, the list-clients were B<always>
apt to return incorrect independent collocate frequencies I<f2> whenever