
##========================================================================
## POD DOCUMENTATION, auto-generated by podextract.perl

##========================================================================
## NAME
=pod

=head1 NAME

DiaColloDB::Client::list - diachronic collocation db: client: distributed

=cut

##========================================================================
## DESCRIPTION
=pod

=head1 DESCRIPTION

DiaColloDB::Client::list is a subclass of
L<DiaColloDB::Client|DiaColloDB::Client> for
accessing a set of distributed L<DiaColloDB|DiaColloDB> databases
via a C<list://> URL whose path part is a whitespace- or "+"-separated list
of sub-URLs supported by L<DiaColloDB::Client|DiaColloDB::Client>.
It supports the L<DiaColloDB::Client|DiaColloDB::Client> API
by calling the relevant methods on each of its sub-clients.

new() options and object structure:

 ##-- DiaColloDB::Client: options
 url  => $url,       ##-- list url (sub-urls, separated by whitespace, "+SCHEME://", or "+://")
 ##
 ##-- DiaColloDB::Client::list
 urls  => \@urls,     ##-- sub-urls
 opts  => \%opts,     ##-- sub-client options
 fudge => $fudge,     ##-- get ($fudge*$kbest) items from sub-clients (-1:all; 0|1:none; default=10)
 fork => $bool,       ##-- run each subclient query in its own fork? (default=if available)
 lazy => $bool,       ##-- use temporary on-demand sub-clients (true,default) or persistent sub-clients (false)
 extend => $bool,     ##-- use extend() queries to acquire correct f2 counts? (default=true)
 logFudge => $level,  ##-- log-level for fudge-coefficient debugging (default='debug')
 logThread => $level, ##-- log-level for thread operations (default='none')
 ##
 ##-- guts
 #clis => \@clis,     ##-- per-url sub-clients for "busy" (non-"lazy") mode

The most important client parameter is the fudge-coefficient option C<fudge=E<gt>$fudge>, which requests
that up to C<$fudge*$kbest> items be retrieved from sub-clients for each L<profile()|profile>
call.  If C<$fudge E<lt> 0>, all collocates will be retrieved from each sub-client,
and trimming will be performed exclusively by the superordinate DiaColloDB::Client::list object.
If C<$fudge == 0>, only the C<$kbest> collocates from each sub-client will be retrieved.
The default value of 10 should return reasonable results without an excessive
performance penalty in most cases, but be aware that the results for C<$fudge E<gt> 0> may not be strictly correct
due to sub-client local pruning; see L</KNOWN BUGS> for details.
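
The effect of the fudge coefficient can be illustrated with a small stand-alone sketch.
The helpers C<sub_kbest()>, C<prune()>, and C<merged_best()> below are hypothetical and are not part of the
DiaColloDB API, the counts are invented, and scores are simply summed raw counts rather than
real DiaColloDB scores; the sketch only shows how local pruning by sub-clients can change the merged result:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the DiaColloDB API): number of items to
# request from each sub-client, following the option summary above
# (-1: all; 0|1: no fudging, i.e. just $kbest; otherwise $fudge*$kbest).
sub sub_kbest {
  my ($fudge, $kbest) = @_;
  return -1     if $fudge < 0;    # negative: retrieve all collocates
  return $kbest if $fudge <= 1;   # 0 or 1: only the $kbest best items
  return $fudge * $kbest;
}

# Toy data (invented): per-sub-client counts for three collocates.
my @subs = (
  { x => 10, y => 9, z => 1 },
  { z => 10, y => 9, x => 1 },
);

# Keep only the $n highest-scoring items of an {item => score} hash (-1: keep all).
sub prune {
  my ($prf, $n) = @_;
  my @keys = sort { $prf->{$b} <=> $prf->{$a} || $a cmp $b } keys %$prf;
  splice(@keys, $n) if $n >= 0 && $n < @keys;
  return { map { $_ => $prf->{$_} } @keys };
}

# Sum locally pruned sub-profiles, then trim the merged result to $kbest items.
sub merged_best {
  my ($fudge, $kbest) = @_;
  my %sum;
  for my $sub (@subs) {
    my $local = prune($sub, sub_kbest($fudge, $kbest));
    $sum{$_} += $local->{$_} for keys %$local;
  }
  return join ',', sort keys %{ prune(\%sum, $kbest) };
}

print "fudge=0:  ", merged_best(0, 1), "\n";   # each sub-client keeps only its own best item
print "fudge=-1: ", merged_best(-1, 1), "\n";  # all items are merged before trimming
```

In this toy setting, the collocate C<y> has the highest summed count, but it is never the single best item
of either sub-client, so it is lost when C<$fudge == 0>. A larger fudge coefficient makes such losses
less likely at the cost of transferring more items.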

This module supports parallel processing of sub-client queries using whatever
threading implementation (if any) is provided by the L<DiaColloDB::threads|DiaColloDB::threads> module.
Parallel sub-client processing is enabled by default if a working
L<threads|threads> or L<forks|forks> module was found by
L<DiaColloDB::threads|DiaColloDB::threads>, but can be disabled by specifying
the C<fork=E<gt>0> option to the list-client.

=head2 List URLs

List URLs passed as the C<url> option to the constructor can be either ARRAY-refs
of sub-URLs or simple strings with an optional C<list://> scheme.
In the latter case, sub-URLs in the argument string are separated by whitespace
or by a plus character ("+") followed by the sub-URL scheme, e.g.:

 ["file://a","file://b"]        ##-- ARRAY-ref of explicit file URLs
 ["a"       , "b"      ]        ##-- ARRAY-ref of implicit file URLs
 
 "list://file://a file://b"     ##-- string with space-separated explicit file URLs
 "list://a b"                   ##-- string with space-separated implicit file URLs
 
 "list://file://a+file://b"     ##-- list with "+"-separated explicit file URLs
 "list://a+://b"                ##-- list with "+"-separated implicit file URLs

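The separator conventions above can be illustrated with a small stand-alone sketch.
The helper C<split_list_url()> below is hypothetical and is not part of the DiaColloDB API;
it only mimics the splitting rules as documented here, and the module's actual parser may differ in details:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the DiaColloDB API): split a list-URL
# string into its sub-URLs according to the conventions described above.
sub split_list_url {
  my ($url) = @_;
  $url =~ s{^list://}{};   # the list:// scheme is optional

  # Separators: whitespace, or "+" followed by "SCHEME://" or a bare "://".
  my @urls = grep { length }
    split m{\s+|\+(?=[A-Za-z][A-Za-z0-9+.\-]*://|://)}, $url;

  s{^://}{} for @urls;     # "+://b" leaves "://b": drop the empty scheme
  return @urls;
}

print join(' | ', split_list_url('list://file://a file://b')), "\n";
print join(' | ', split_list_url('list://a b')), "\n";
print join(' | ', split_list_url('list://file://a+file://b')), "\n";
print join(' | ', split_list_url('list://a+://b')), "\n";
```

The lookahead keeps the scheme attached to the sub-URL that follows the "+", so explicit sub-URLs such as
C<file://b> survive the split unchanged, while implicit ones are reduced to the bare path.
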
Options can be passed to the appropriate sub-URLs via those URLs' query strings,
as described in L<DiaColloDB::Client/open>.
Options to the DiaColloDB::Client::list object itself can be passed in by using
a sub-URL consisting of a HASH-ref or only a query string, e.g.:

 ["a","b",{fudge=>0}]           ##-- ARRAY-ref with local options as HASH-ref
 ["a","b","?fudge=0"]           ##-- ARRAY-ref with local options as query-string
 
 "list://a b ?fudge=0"          ##-- space-separated string with local options
 "list://a+://b+://?fudge=0"    ##-- "+"-separated string with local options

=cut

##======================================================================
## Footer
##======================================================================
=pod

=head1 KNOWN BUGS

=head2 Incorrect Independent Collocate Frequencies

Prior to the introduction of L<extend()|DiaColloDB/extend> queries in
DiaColloDB v0.11.000, list-clients were B<always>
apt to return incorrect independent collocate frequencies I<f2> whenever


