=head1 NAME
BDB - Asynchronous Berkeley DB access
=head1 SYNOPSIS
use BDB;
my $env = db_env_create;
mkdir "bdtest", 0700;
db_env_open
$env,
"bdtest",
BDB::INIT_LOCK | BDB::INIT_LOG | BDB::INIT_MPOOL
| BDB::INIT_TXN | BDB::RECOVER | BDB::USE_ENVIRON | BDB::CREATE,
0600;
$env->set_flags (BDB::AUTO_COMMIT | BDB::TXN_NOSYNC, 1);
my $db = db_create $env;
db_open $db, undef, "table", undef, BDB::BTREE, BDB::AUTO_COMMIT | BDB::CREATE
| BDB::READ_UNCOMMITTED, 0600;
db_put $db, undef, "key", "data", 0, sub {
db_del $db, undef, "key";
};
db_sync $db;
# when you also use Coro, management is easy:
use Coro::BDB;
# automatic event loop integration with AnyEvent:
use AnyEvent::BDB;
# automatic result processing with EV:
my $WATCHER = EV::io BDB::poll_fileno, EV::READ, \&BDB::poll_cb;
# with Glib:
add_watch Glib::IO BDB::poll_fileno,
in => sub { BDB::poll_cb; 1 };
# or simply flush manually
BDB::flush;
=head1 DESCRIPTION
See the BerkeleyDB documentation (L<http://www.oracle.com/technology/documentation/berkeley-db/db/index.html>).
The BDB API is very similar to the C API (the translation has been very faithful).
See also the example sections in the document below and possibly the eg/
subdirectory of the BDB distribution. Last but not least, see the IO::AIO
documentation, as that module uses almost the same asynchronous request
model as this module.
I know this is woefully inadequate documentation. Send a patch!
=head1 REQUEST ANATOMY AND LIFETIME
Every request method creates a request, which is a C data structure not
directly visible to Perl.
During their existence, BDB requests travel through the following states,
in order:
=over 4
=item ready
Immediately after a request is created it is put into the ready state,
waiting for a thread to execute it.
=item execute
A thread has accepted the request for processing and is currently
executing it (e.g. blocking in read).
=item pending
The request has been executed and is waiting for result processing.
While request submission and execution are fully asynchronous, result
processing is not and relies on the perl interpreter calling C<poll_cb>
(or another function with the same effect).
=item result
The request results are processed synchronously by C<poll_cb>.
The C<poll_cb> function will process all outstanding BDB requests by
calling their callbacks, freeing memory associated with them and managing
any groups they are contained in.
=item done
Request has reached the end of its lifetime and holds no resources anymore
(except possibly for the Perl object, but its connection to the actual
BDB request is severed and calling its methods will either do nothing or
result in a runtime error).
=back
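The states above can be observed from Perl through the request counters. The following is a minimal sketch, assuming Berkeley DB and this module are installed; the scratch directory, the choice of open flags and the C<$done> variable are illustrative and not taken from the distribution.

```perl
use strict;
use warnings;
use BDB;
use File::Temp qw(tempdir);

# Throw-away environment directory (illustrative; any writable directory works).
my $home = tempdir (CLEANUP => 1);
my $env  = db_env_create;

my $done = 0;

# Passing a callback makes the request asynchronous: it is queued ("ready")
# and later picked up by a worker thread ("execute").
db_env_open $env, $home,
   BDB::INIT_MPOOL | BDB::CREATE | BDB::PRIVATE, 0600,
   sub { $done = 1 };

# Nothing has called poll_cb yet, so the callback cannot have run and the
# request still counts as outstanding.
my $outstanding = BDB::nreqs ();

# flush waits until the request is "pending" and then processes the result,
# which is the point where the callback is invoked.
BDB::flush ();

print "outstanding before flush: $outstanding\n";
print "callback ran: $done, outstanding after flush: ", BDB::nreqs (), "\n";

db_env_close $env;   # no callback given: executed synchronously
```

The callback only runs inside C<BDB::flush> here because result processing always happens in the Perl interpreter, never in the worker thread.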
=cut
package BDB;
use common::sense;
use base 'Exporter';
our $VERSION;
BEGIN {
$VERSION = '1.92';
our @BDB_REQ = qw(
db_env_open db_env_close db_env_txn_checkpoint db_env_lock_detect
db_env_memp_sync db_env_memp_trickle db_env_dbrename db_env_dbremove
db_env_log_archive db_env_lsn_reset db_env_fileid_reset
db_open db_close db_compact db_sync db_verify db_upgrade
db_put db_exists db_get db_pget db_del db_key_range
db_txn_commit db_txn_abort db_txn_finish
db_c_close db_c_count db_c_put db_c_get db_c_pget db_c_del
db_sequence_open db_sequence_close
db_sequence_get db_sequence_remove
);
our @EXPORT = (@BDB_REQ, qw(dbreq_pri dbreq_nice db_env_create db_create));
our @EXPORT_OK = qw(
poll_fileno poll_cb poll_wait flush
min_parallel max_parallel max_idle
nreqs nready npending nthreads
max_poll_time max_poll_reqs
);
require XSLoader;
XSLoader::load ("BDB", $VERSION);
}
=head1 BERKELEYDB FUNCTIONS
All of these are functions. The create functions simply return a new
object and never block. All the remaining functions take an optional
callback as last argument. If it is missing, then the function will be
executed synchronously. In both cases, C<$!> will reflect the return value
of the function.
BDB functions that cannot block (mostly functions that manipulate
settings) are method calls on the relevant objects, so the rule of thumb
is: if it's a method, it's not blocking, if it's a function, it takes a
callback as last argument.
In the following, C<$int> signifies an integer return value,
C<bdb_filename> is a "filename" (octets on unix, madness on windows),
C<U32> is an unsigned 32 bit integer, C<int> is some integer, C<NV> is a
floating point value.
Most C<SV *> types are generic perl scalars (for input and output of data
values).
The various C<DB_ENV> etc. arguments are handles returned by
C<db_env_create>, C<db_create>, C<txn_begin> and so on. If they have an
appended C<_ornull> this means they are optional and you can pass C<undef>
for them, resulting in a NULL pointer on the C level.
The C<SV *callback> is the optional callback function to call when the
request is completed. This last callback argument is special: the callback
is simply the last argument passed. If there are "optional" arguments
before the callback they can be left out. The callback itself can be left
out or specified as C<undef>, in which case the function will be executed
synchronously.
For example, C<db_env_txn_checkpoint> usually is called with all integer
arguments zero. These can be left out, so all of these specify a call
to C<< DB_ENV->txn_checkpoint >>, to be executed asynchronously with a
callback to be called:
db_env_txn_checkpoint $db_env, 0, 0, 0, sub { };
db_env_txn_checkpoint $db_env, 0, 0, sub { };
db_env_txn_checkpoint $db_env, sub { };
While these all specify a call to C<< DB_ENV->txn_checkpoint >> to be
executed synchronously:
db_env_txn_checkpoint $db_env, 0, 0, 0, undef;
db_env_txn_checkpoint $db_env, 0, 0, 0;
db_env_txn_checkpoint $db_env, 0;
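Wrappers written in Perl can follow the same calling convention. The following is a minimal sketch in pure Perl; C<split_callback> and C<checkpoint_args> are hypothetical helpers invented for this illustration, they are not part of BDB and do not show how the module itself parses its arguments.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BDB: separates a trailing code reference
# from the other arguments, mirroring the "callback is the last argument" rule.
sub split_callback {
   my @args = @_;
   my $cb = @args && ref $args[-1] eq "CODE" ? pop @args : undef;
   return ($cb, @args);
}

# Hypothetical wrapper with the same argument shape as db_env_txn_checkpoint:
# optional integer arguments default to 0 when they are left out.
sub checkpoint_args {
   my ($cb, $env, $kbyte, $min, $flags) = split_callback (@_);
   return {
      env   => $env,
      kbyte => $kbyte // 0,
      min   => $min   // 0,
      flags => $flags // 0,
      mode  => $cb ? "asynchronous" : "synchronous",
   };
}

# Callback directly after the handle: all optional arguments left out.
my $async = checkpoint_args ("ENV", sub { });

# All arguments spelled out, callback given as undef.
my $sync  = checkpoint_args ("ENV", 0, 0, 0, undef);

print "$async->{mode}, $sync->{mode}\n";
```

Peeling the callback off the end first is what allows the optional arguments in front of it to be omitted without ambiguity.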
=head2 BDB functions
Functions in the BDB namespace, exported by default:
$env = db_env_create (U32 env_flags = 0)
flags: RPCCLIENT
db_env_open (DB_ENV *env, bdb_filename db_home, U32 open_flags, int mode, SV *callback = 0)
open_flags: INIT_CDB INIT_LOCK INIT_LOG INIT_MPOOL INIT_REP INIT_TXN RECOVER RECOVER_FATAL USE_ENVIRON USE_ENVIRON_ROOT CREATE LOCKDOWN PRIVATE REGISTER SYSTEM_MEM
db_env_close (DB_ENV *env, U32 flags = 0, SV *callback = 0)
db_env_txn_checkpoint (DB_ENV *env, U32 kbyte = 0, U32 min = 0, U32 flags = 0, SV *callback = 0)
flags: FORCE
db_env_lock_detect (DB_ENV *env, U32 flags = 0, U32 atype = DB_LOCK_DEFAULT, SV *dummy = 0, SV *callback = 0)
atype: LOCK_DEFAULT LOCK_EXPIRE LOCK_MAXLOCKS LOCK_MAXWRITE LOCK_MINLOCKS LOCK_MINWRITE LOCK_OLDEST LOCK_RANDOM LOCK_YOUNGEST
db_env_memp_sync (DB_ENV *env, SV *dummy = 0, SV *callback = 0)
db_env_memp_trickle (DB_ENV *env, int percent, SV *dummy = 0, SV *callback = 0)
db_env_dbremove (DB_ENV *env, DB_TXN_ornull *txnid, bdb_filename file, bdb_filename database, U32 flags = 0, SV *callback = 0)
db_env_dbrename (DB_ENV *env, DB_TXN_ornull *txnid, bdb_filename file, bdb_filename database, bdb_filename newname, U32 flags = 0, SV *callback = 0)
db_env_log_archive (DB_ENV *env, SV *listp, U32 flags = 0, SV *callback = 0)
db_env_lsn_reset (DB_ENV *env, bdb_filename db, U32 flags = 0, SV *callback = 0)
db_env_fileid_reset (DB_ENV *env, bdb_filename db, U32 flags = 0, SV *callback = 0)
$db = db_create (DB_ENV *env = 0, U32 flags = 0)
flags: XA_CREATE
db_open (DB *db, DB_TXN_ornull *txnid, bdb_filename file, bdb_filename database, int type, U32 flags, int mode, SV *callback = 0)
flags: AUTO_COMMIT CREATE EXCL MULTIVERSION NOMMAP RDONLY READ_UNCOMMITTED THREAD TRUNCATE
db_close (DB *db, U32 flags = 0, SV *callback = 0)
flags: DB_NOSYNC
db_verify (DB *db, bdb_filename file, bdb_filename database = 0, SV *dummy = 0, U32 flags = 0, SV *callback = 0)
db_upgrade (DB *db, bdb_filename file, U32 flags = 0, SV *callback = 0)
db_compact (DB *db, DB_TXN_ornull *txn = 0, SV *start = 0, SV *stop = 0, SV *unused1 = 0, U32 flags = DB_FREE_SPACE, SV *unused2 = 0, SV *callback = 0)
flags: FREELIST_ONLY FREE_SPACE
db_sync (DB *db, U32 flags = 0, SV *callback = 0)
db_key_range (DB *db, DB_TXN_ornull *txn, SV *key, SV *key_range, U32 flags = 0, SV *callback = 0)
db_put (DB *db, DB_TXN_ornull *txn, SV *key, SV *data, U32 flags = 0, SV *callback = 0)
flags: APPEND NODUPDATA NOOVERWRITE
db_exists (DB *db, DB_TXN_ornull *txn, SV *key, U32 flags = 0, SV *callback = 0) (v4.6)
db_get (DB *db, DB_TXN_ornull *txn, SV *key, SV *data, U32 flags = 0, SV *callback = 0)
flags: CONSUME CONSUME_WAIT GET_BOTH SET_RECNO MULTIPLE READ_COMMITTED READ_UNCOMMITTED RMW
db_pget (DB *db, DB_TXN_ornull *txn, SV *key, SV *pkey, SV *data, U32 flags = 0, SV *callback = 0)
flags: CONSUME CONSUME_WAIT GET_BOTH SET_RECNO MULTIPLE READ_COMMITTED READ_UNCOMMITTED RMW
db_del (DB *db, DB_TXN_ornull *txn, SV *key, U32 flags = 0, SV *callback = 0)
db_txn_commit (DB_TXN *txn, U32 flags = 0, SV *callback = 0)
flags: TXN_NOSYNC TXN_SYNC
db_txn_abort (DB_TXN *txn, SV *callback = 0)
db_c_close (DBC *dbc, SV *callback = 0)
db_c_count (DBC *dbc, SV *count, U32 flags = 0, SV *callback = 0)
db_c_put (DBC *dbc, SV *key, SV *data, U32 flags = 0, SV *callback = 0)
flags: AFTER BEFORE CURRENT KEYFIRST KEYLAST NODUPDATA
db_c_get (DBC *dbc, SV *key, SV *data, U32 flags = 0, SV *callback = 0)
flags: CURRENT FIRST GET_BOTH GET_BOTH_RANGE GET_RECNO JOIN_ITEM LAST NEXT NEXT_DUP NEXT_NODUP PREV PREV_DUP PREV_NODUP SET SET_RANGE SET_RECNO READ_UNCOMMITTED MULTIPLE MULTIPLE_KEY RMW
db_c_pget (DBC *dbc, SV *key, SV *pkey, SV *data, U32 flags = 0, SV *callback = 0)
db_c_del (DBC *dbc, U32 flags = 0, SV *callback = 0)
=over 4
=item BDB::poll_fileno
Return the I<request result pipe file descriptor>. This filehandle must be
polled for reading by some mechanism outside this module (e.g. Event or
select, see below or the SYNOPSIS). If the pipe becomes readable you have
to call C<poll_cb> to check the results.
See C<poll_cb> for an example.
=item BDB::poll_cb
Process some outstanding events on the result pipe. You have to call this
regularly. Returns the number of events processed. Returns immediately
when no events are outstanding. The number of events processed depends on
the settings of C<BDB::max_poll_reqs> and C<BDB::max_poll_time>.
If not all requests were processed for whatever reason, the filehandle
will still be ready when C<poll_cb> returns.
Example: Install an Event watcher that automatically calls
BDB::poll_cb with high priority:
Event->io (fd => BDB::poll_fileno,
poll => 'r', async => 1,
cb => \&BDB::poll_cb);
=item BDB::max_poll_reqs $nreqs
=item BDB::max_poll_time $seconds
These set the maximum number of requests (default C<0>, meaning infinity)
that are being processed by C<BDB::poll_cb> in one call, respectively
the maximum amount of time (default C<0>, meaning infinity) spent in
C<BDB::poll_cb> to process requests (more correctly the minimum amount
of time C<poll_cb> is allowed to use).
Setting C<max_poll_time> to a non-zero value creates an overhead of one
syscall per request processed, which is not normally a problem unless your
callbacks are really really fast or your OS is really really slow (I am
not mentioning Solaris here). Using C<max_poll_reqs> incurs no overhead.
Setting these is useful if you want to ensure some level of
interactiveness when perl is not fast enough to process all requests in
time.
For interactive programs, values such as C<0.01> to C<0.1> should be fine.
Example: Install an EV watcher that automatically calls
BDB::poll_cb with low priority, to ensure that other parts of the
program get the CPU sometimes even under high load.
# try not to spend much more than 0.1s in poll_cb
BDB::max_poll_time 0.1;
my $bdb_poll = EV::io BDB::poll_fileno, EV::READ, \&BDB::poll_cb;
=item BDB::poll_wait
If there are any outstanding requests and none of them in the result
phase, wait till the result filehandle becomes ready for reading (simply
does a C<select> on the filehandle). This is useful if you want to
synchronously wait for some requests to finish.
See C<nreqs> for an example.
=item BDB::poll
Waits until some requests have been handled.
Returns the number of requests processed, but is otherwise strictly
equivalent to:
BDB::poll_wait, BDB::poll_cb
=item BDB::flush
Wait till all outstanding BDB requests have been handled.
Strictly equivalent to:
BDB::poll_wait, BDB::poll_cb
while BDB::nreqs;
=back
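When no event loop module is in use, the functions above can be combined with Perl's built-in four-argument C<select>. The following is a minimal sketch, assuming Berkeley DB and this module are installed; the scratch directory, the open flags, the limit of 16 and the C<$opened> variable are illustrative choices.

```perl
use strict;
use warnings;
use BDB;
use File::Temp qw(tempdir);

my $home = tempdir (CLEANUP => 1);   # illustrative scratch directory
my $env  = db_env_create;

# Limit the work done per poll_cb call, so a real program could interleave
# other work between calls (the value 16 is arbitrary).
BDB::max_poll_reqs (16);

my $opened = 0;
db_env_open $env, $home,
   BDB::INIT_MPOOL | BDB::CREATE | BDB::PRIVATE, 0600,
   sub { $opened = $! ? 0 : 1 };

# Bit vector for select with only the result pipe's descriptor set.
my $rin = "";
vec ($rin, BDB::poll_fileno (), 1) = 1;

while (BDB::nreqs ()) {
   # Block until the result pipe is readable, then process results.
   my $rout = $rin;
   select ($rout, undef, undef, undef) >= 0
      or next;   # interrupted, e.g. by a signal: simply retry
   BDB::poll_cb ();
}

print "environment opened: $opened\n";
db_env_close $env;
```

A real program would pass a timeout or additional descriptors to C<select>; with only the result pipe in the set, this loop behaves much like C<BDB::flush>.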
=head2 VERSION CHECKING
BerkeleyDB comes in various versions, many of which have minor
incompatibilities. This means that traditional "at least version x.x"
checks are often not sufficient.
Example: set the log_autoremove option in a way compatible with both
versions before v4.7 and v4.7 or later. Note the use of & on the constants
to avoid triggering a compile-time error when the symbol isn't available.
$DB_ENV->set_flags (&BDB::LOG_AUTOREMOVE ) if BDB::VERSION v0, v4.7;
$DB_ENV->log_set_config (&BDB::LOG_AUTO_REMOVE) if BDB::VERSION v4.7;
=over 4
=item BDB::VERSION
The C<BDB::VERSION> function, when called without arguments, returns the
Berkeley DB version as a v-string (usually with 3 components). You should
use C<lt> and C<ge> operators exclusively to make comparisons.
Example: check for at least version 4.7.
BDB::VERSION ge v4.7 or die;
=item BDB::VERSION min-version
Returns true if the BDB version is at least the given version (specified
as a v-string), false otherwise.
Example: check for at least version 4.7.
BDB::VERSION v4.7 or die;
=item BDB::VERSION min-version, max-version
Returns true if the BDB version is at least version C<min-version> (specify C<undef> or C<v0> for any minimum version)
and less than C<max-version>.
Example: check whether the version is strictly less than v4.7.
BDB::VERSION v0, v4.7
or die "version 4.7 is not yet supported";
=back
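The comparison rules can be tried without Berkeley DB installed. The following is a minimal sketch in pure Perl; C<version_in_range> is a hypothetical stand-in that takes the version to test as an explicit first argument, and the version numbers are made up for the illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for BDB::VERSION with an explicit "have" version:
# true if $min <= $have < $max, where undef means "no limit".
sub version_in_range {
   my ($have, $min, $max) = @_;
   return 0 if defined $min && $have lt $min;   # below the minimum
   return 0 if defined $max && $have ge $max;   # at or above the maximum
   return 1;
}

my $have = v4.6.21;   # pretend this is what BDB::VERSION returned

printf "at least 4.5:          %d\n", version_in_range ($have, v4.5);
printf "at least 4.7:          %d\n", version_in_range ($have, v4.7);
printf "strictly below 4.7:    %d\n", version_in_range ($have, v0, v4.7);
printf "4.10 counts as >= 4.7: %d\n", version_in_range (v4.10.0, v4.7);
```

V-strings compare component by component under C<lt> and C<ge>, so C<v4.10> sorts after C<v4.7>, which a comparison of the plain strings "4.10" and "4.7" would get wrong.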
=cut
sub VERSION {
# I was dumb enough to override the VERSION method here, so let's try
# to fix it up.
if ($_[0] eq __PACKAGE__) {
$VERSION
} else {
if (@_ > 0) {
return undef if VERSION_v lt $_[0];
if (@_ > 1) {
return undef if VERSION_v ge $_[1];
}
}
VERSION_v
}
}
=head2 CONTROLLING THE NUMBER OF THREADS
=over 4
=item BDB::min_parallel $nthreads
Set the minimum number of BDB threads to C<$nthreads>. The current
default is C<8>, which means eight asynchronous operations can execute
concurrently at any one time (the number of outstanding requests,
however, is unlimited).
BDB starts threads only on demand, when a BDB request is queued and
no free thread exists. Please note that queueing up a hundred requests can
create demand for a hundred threads, even if it turns out that everything
is in the cache and could have been processed faster by a single thread.
It is recommended to keep the number of threads relatively low, as some
Linux kernel versions will scale negatively with the number of threads
(higher parallelism => MUCH higher latency). With current Linux 2.6
versions, 4-32 threads should be fine.
Under most circumstances you don't need to call this function, as the
module selects a default that is suitable for low to moderate load.
=item BDB::max_parallel $nthreads
Sets the maximum number of BDB threads to C<$nthreads>. If more than the
specified number of threads are currently running, this function kills
them. This function blocks until the limit is reached.
While C<$nthreads> is zero, BDB requests get queued but not executed
until the number of threads has been increased again.
This module automatically runs C<max_parallel 0> at program end, to ensure
that all threads are killed and that there are no outstanding requests.
Under normal circumstances you don't need to call this function.
=item BDB::max_idle $nthreads
Limit the number of threads (default: 4) that are allowed to idle (i.e.,
threads that did not get a request to process within 10 seconds). That
means if a thread becomes idle while C<$nthreads> other threads are also
idle, it will free its resources and exit.
This is useful when you allow a large number of threads (e.g. 100 or 1000)
to allow for extremely high load situations, but want to free resources
under normal circumstances (1000 threads can easily consume 30MB of RAM).
The default is probably ok in most situations, especially if thread
creation is fast. If thread creation is very slow on your system you might
want to use larger values.
=item $oldmaxreqs = BDB::max_outstanding $maxreqs
This is a very bad function to use in interactive programs because it
blocks, and a bad way to reduce concurrency because it is inexact: Better
use an C<aio_group> together with a feed callback.
Sets the maximum number of outstanding requests to C<$maxreqs>. If you
try to queue up more than this number of requests, the next call to
C<poll_cb> (and C<poll_some> and other functions calling C<poll_cb>)
will block until the limit is no longer exceeded.
The default value is very large, so there is no practical limit on the
number of outstanding requests.
You can still queue as many requests as you want. Therefore,
C<max_outstanding> is mainly useful in simple scripts (with low values) or
as a stop gap to shield against fatal memory overflow (with large values).
=item $old_cb = BDB::set_sync_prepare $cb
Sets a callback that is called whenever a request is created without an
explicit callback. It has to return two code references. The first is used
as the request callback (it should save the return status), and the second
is called to wait until the first callback has been called (it must set
C<$!> to the return status).
This mechanism can be used to include BDB into other event mechanisms,
such as L<Coro::BDB>.
To allow other, callback-based, events to be executed while callback-less
ones are run, you could use this sync prepare function:
sub {
my $status;
(
sub { $status = $! },
sub { BDB::poll while !defined $status; $! = $status },
)
}
It works by polling for results till the request has finished and then
sets C<$!> to the return value. This means that if you don't use a
callback, BDB would simply fall back to synchronous operations.
The default, which also applies if the sync prepare function is set to
C<undef>, is to execute callback-less BDB requests in the foreground thread,
setting C<$!> to the return value, without polling for other events.
=back
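How the two code references returned by a sync prepare function cooperate can be shown without Berkeley DB. The following is a simulation in pure Perl, not the module's implementation: C<@pending> and C<fake_poll> are hypothetical stand-ins for the queue of finished requests and for C<BDB::poll>, and C<ENOENT> is just an arbitrary status to pass through.

```perl
use strict;
use warnings;
use Errno qw(ENOENT);

# Stand-ins for BDB internals (both hypothetical): @pending holds finished
# requests awaiting result processing, fake_poll plays the role of BDB::poll.
my @pending;
sub fake_poll {
   my $deliver = shift @pending or die "no pending request";
   $deliver->();
}

# Same shape as a sync prepare function: returns the request callback and
# the wait function, which share $status.
my $prepare = sub {
   my $status;
   (
      sub { $status = $! + 0 },   # request callback: save a numeric copy of $!
      sub { fake_poll () while !defined $status; $! = $status },
   )
};

# What happens for one callback-less request:
my ($request_cb, $wait_cb) = $prepare->();

# 1. The request finishes elsewhere; its callback will see the status in $!.
push @pending, sub { $! = ENOENT; $request_cb->() };

# 2. The caller blocks in the wait function until the callback has run,
#    then finds the status in $! as if the call had been synchronous.
$wait_cb->();
my $errno = $! + 0;

print "request finished with errno $errno\n";
```

Because the wait function keeps polling until its own request has completed, callbacks of other requests delivered in the meantime still get to run.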
=head2 STATISTICAL INFORMATION
=over 4
=item BDB::nreqs
Returns the number of requests currently in the ready, execute or pending
states (i.e. for which their callback has not been invoked yet).
Example: wait till there are no outstanding requests anymore:
BDB::poll_wait, BDB::poll_cb
while BDB::nreqs;
=item BDB::nready
Returns the number of requests currently in the ready state (not yet
executed).
=item BDB::npending
Returns the number of requests currently in the pending state (executed,
but not yet processed by poll_cb).
=back
=cut
set_sync_prepare (undef);
min_parallel 8;
END { flush }
1;
=head1 COMMON PITFALLS
=head2 Unexpected Crashes
Remember that, by default, BDB will execute requests in parallel, in
somewhat random order. That means that it is easy to run a C<db_get>
request on the same database as a concurrent C<db_close> request, leading
to a crash, silent data corruption, eventually the next world war on
terrorism.
If you only ever use foreground requests (without a callback), this will
not be an issue (unless you use threads).
=head2 Unexpected Freezes or Deadlocks
Remember that, by default, BDB will execute requests in parallel, which
easily leads to deadlocks (even concurrent put's on the same database can