CWB-Web


README

    perl Makefile.PL PREFIX=~/perl INSTALLMAN1DIR=~/perl/man/man1

Note that you will then have to include the appropriate subdirectories of
"~/perl/lib/perl5/" in your Perl search path in order to use the CWB modules.

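One way to extend the search path from inside a script is the core "lib" pragma (setting the PERL5LIB environment variable is another). The following is a minimal sketch that assumes the PREFIX=~/perl installation shown above. Perl does not expand "~" itself, so the path is built from $ENV{HOME}; the variable name $cwb_lib is only illustrative, and the exact subdirectories used depend on your Perl version and platform.

```perl
use strict;
use warnings;

# Assumption: the modules were installed with PREFIX=~/perl as shown above.
# Perl does not expand "~", so build the path from $ENV{HOME}.
my $cwb_lib;
BEGIN { $cwb_lib = "$ENV{HOME}/perl/lib/perl5" }   # must be set at compile time
use lib $cwb_lib;                                  # prepends the directory to @INC

# Show the resulting search path; the new directory is near the front.
print "$_\n" for @INC;
```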

PACKAGE OVERVIEW

The CWB::Web package contains the following Perl modules:

    CWB::Web::Cache     a transparent shared cache for CQP query results
    CWB::Web::Query     convenient front-end to CQP queries, with pre-formatted
                        kwic lines
    CWB::Web::Search    search-engine-like queries on CWB corpora

NB: The CWB::Web::Query and CWB::Web::Search modules do not make use of the CQP
query cache, so they are much less efficient than a custom implementation.

See the manual pages of these modules (e.g. "perldoc CWB::Web::Cache") for
further, though still incomplete, information.

lib/CWB/Web/Cache.pm

## for the cache directory path and caching strategy.  In order to make a named query result
## (of the running CQP process) persistent, it is stored in the disk cache directory, and a unique
## identifier is returned to the calling program.  This identifier can then be used to recover
## the persistent named query in a subsequent session (unless the result has already expired from
## the cache, a case that the caller must handle).
##
## The CWB::Web::Cache module can also execute simple CQP queries and make their results
## persistent.  The query results are identified by corpus, query string, and an optional sort clause
## (stored as metadata) rather than a single unique identifier, and can be shared among different
## processes using the same cache directory.  When a persistent query result has expired from the
## cache, it is re-created in a way transparent to the calling program (by re-executing the query
## expression in the CQP process).
##
## The cache directory contains two subdirectories and an optional CONFIG file:
##   index/  ...  text files as 'markers' for cached queries (may contain 'metadata' about the cached query)
##   data/   ...  named query results stored in CQP's internal format
## A persistent named query is stored in a file with the name <corpus>:<query_name> in the data/
## subdirectory (e.g. DICKENS:ResultA-1121, where the numerical suffix is used to create a unique filename
## if necessary).  A text file with the same name is created in the index/ directory and may hold
## meta-information about the cached query result.  Storing a named query proceeds in the following steps:
##   1. create cache directory and subdirectories if they do not exist
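
The layout described in the comments above can be sketched in plain Perl. This is not the module's actual code: reserve_cache_entry() is a hypothetical helper, and the simple counter used as the numerical suffix is an assumption (the source only says that a suffix is added when needed). The sketch covers step 1 and the creation of the index/ marker file; it does not write any CQP data and ignores race conditions between processes.

```perl
use strict;
use warnings;
use File::Path qw(make_path);
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper (not part of CWB::Web::Cache): create the cache layout
# and reserve a unique <corpus>:<query_name>[-<n>] entry name.
sub reserve_cache_entry {
    my ($cachedir, $corpus, $query_name) = @_;
    my %sub = map { $_ => File::Spec->catdir($cachedir, $_) } qw(index data);
    make_path(values %sub);    # step 1: does nothing if the directories exist

    my $name = "$corpus:$query_name";
    my $n = 0;
    # Append a numerical suffix until the name is unused in index/.
    while (-e File::Spec->catfile($sub{index}, $name)) {
        $name = "$corpus:$query_name-" . ++$n;
    }
    # The (empty) index file acts as the marker; metadata could be written here.
    my $marker = File::Spec->catfile($sub{index}, $name);
    open(my $fh, '>', $marker) or die "Cannot create $marker: $!";
    close($fh);
    return $name;
}

my $dir = tempdir(CLEANUP => 1);    # throw-away cache directory for the demo
print reserve_cache_entry($dir, "DICKENS", "ResultA"), "\n";   # DICKENS:ResultA
print reserve_cache_entry($dir, "DICKENS", "ResultA"), "\n";   # DICKENS:ResultA-1
```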

lib/CWB/Web/Cache.pm


=head1 SYNOPSIS

  use CWB::CQP;
  use CWB::Web::Cache;

  $cqp = new CWB::CQP;
  $cache = new CWB::Web::Cache -cqp => $cqp, -cachedir => $dir,
    [-cachesize => $cache_size,] [-cachetime => $expiration_time];

  # transparently execute and cache simple CQP queries
  $id = $cache->query(-corpus => "DICKENS", -query => '[pos="NN"] "of" "England"');
  ($size) = $cqp->exec("size $id");

  # optional features: sort clause, set keyword, subquery, and maximal number of matches
  $id = $cache->query(
    -corpus => "DICKENS", -query => $query,
    -sort => $sort_clause,
    -keyword => $set_keyword_command,
    -subquery => $subquery,
    -cut => $max_nr_of_matches  # reasonable default calculated from cache size

lib/CWB/Web/Cache.pm


The B<CWB::Web::Cache> module provides a simple shared caching mechanism
for CQP query results, making them persistent across multiple CQP sessions.
Old data files are deleted automatically once they are older than the specified I<$expiration_time>, or
when necessary to keep the cache from growing beyond the specified I<$cache_size> limit.
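
The two clean-up criteria can be illustrated with a small stand-alone sketch. It is not the module's implementation: entries_to_expire() is a hypothetical helper, the record fields (name, size, mtime) are made up for the example, and the choices of seconds and bytes as units and of "oldest first" as the eviction order are assumptions.

```perl
use strict;
use warnings;

# Hypothetical helper (not the module's actual code): given cache entries as
# { name, size, mtime } records, return the names that should be deleted.
sub entries_to_expire {
    my ($entries, $now, $expiration_time, $cache_size) = @_;
    my (@expired, @kept);
    # Pass 1: entries older than the expiration time always go (oldest first).
    for my $e (sort { $a->{mtime} <=> $b->{mtime} } @$entries) {
        if ($now - $e->{mtime} > $expiration_time) { push @expired, $e->{name} }
        else                                       { push @kept, $e }
    }
    my $total = 0;
    $total += $_->{size} for @kept;
    # Pass 2: while still over the size limit, drop the oldest remaining entry.
    while (@kept && $total > $cache_size) {
        my $e = shift @kept;
        $total -= $e->{size};
        push @expired, $e->{name};
    }
    return @expired;
}

my @entries = (
    { name => "DICKENS:A", size => 400, mtime => 100 },
    { name => "DICKENS:B", size => 500, mtime => 900 },
    { name => "DICKENS:C", size => 300, mtime => 950 },
);
# now = 1000; entries expire after 600 seconds; cache limited to 600 bytes
print join(", ", entries_to_expire(\@entries, 1000, 600, 600)), "\n";
# prints: DICKENS:A, DICKENS:B
```

Here DICKENS:A is removed because of its age, and DICKENS:B is removed only because the two remaining entries together would still exceed the size limit.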

Note that a B<CWB::Web::Cache> handle must be created with a pre-initialised CQP backend (i.e.
a B<CWB::CQP> object), which will be used to access the cache and (re-)run a query when necessary.

Most scripts will access the cache through the B<query()> method, which executes and caches CQP queries
in a fully transparent way (with optional C<sort> clause, C<set keyword> command, subquery,
and C<cut> to limit the maximal number of matches).  After successful execution, the query result is
loaded into the CQP backend, the appropriate corpus is activated, and the I<$id> of the named query is
returned.

The C<sort> clause is executed I<after> a C<set keyword> command 
so that C<keyword> anchors can be used in sorting.

Direct access to cache entries is provided by the low-level methods B<store()> and B<retrieve()>.
Note that these are intended for internal use only and may change in future releases.


