Data-SortedSet-Shared
view release on metacpan or search on metacpan
Pop and peek
my ($member, $score) = $z->pop_min; # remove + return the lowest, or () if empty
my ($member, $score) = $z->pop_max;
my ($member, $score) = $z->peek_min; # without removing
my ($member, $score) = $z->peek_max;
Iteration
$z->each(sub { my ($member, $score) = @_; ... });
"each" snapshots all members under the read lock, then invokes the
callback once per member in score order after the lock is released, so
the callback may safely call back into the set.
Introspection and lifecycle
$z->count; $z->max_entries; $z->stats; # see STATS
$z->path; $z->memfd; $z->sync; $z->unlink; # or Class->unlink($path)
$z->eventfd; $z->fileno; $z->notify; $z->eventfd_consume;
"sync" flushes the mapping to its backing store and "unlink" removes the
backing file (also callable as "Class->unlink($path)"). "path" returns
the backing file path ("undef" for an anonymous or memfd-backed set) and
"memfd" returns the descriptor of a "new_memfd" set (-1 otherwise). The
eventfd methods let another process wait for updates: "eventfd" lazily
creates an eventfd and returns its descriptor (croaks on failure;
calling it again returns the same fd), "fileno" returns the current
eventfd descriptor or -1, "notify" writes a wakeup (returning false if
no eventfd is attached), and "eventfd_consume" reads and resets the
counter, returning it as an integer or "undef" when nothing is pending.
SHARING ACROSS PROCESSES
The set lives in a shared mapping, so several processes operate on the
same data with no serialization layer in between. There are three ways
to share it:
* A backing file -- every process calls "new($path, $max)" on the same
path. The first to arrive creates and sizes the file (serialized by
an exclusive lock); the rest map it.
* An anonymous mapping inherited across "fork" -- create with
"new(undef, $max)" before forking; the parent and its children then
share the one mapping.
* A memfd -- create with "new_memfd($name, $max)" and hand its "memfd"
descriptor to an unrelated process (over a UNIX socket with
"SCM_RIGHTS", or while the creator is alive via "/proc/$pid/fd/$n"),
which reopens it with new_from_fd($fd).
# children populate a fork-shared set; the parent reads the result
my $z = Data::SortedSet::Shared->new(undef, 1_000_000);
for my $k (1 .. 4) {
unless (fork) { # child
$z->add($k * 1_000_000 + $_, rand) for 1 .. 1000;
exit;
}
}
1 while wait != -1; # reap children
print $z->count, "\n"; # 4000
Every operation is serialized by the rwlock, so concurrent writers do
not corrupt the tree. A writer can wake readers blocked in other
processes through the eventfd interface: it calls "notify" after a
batch, and a reader selects on "fileno" then drains the count with
"eventfd_consume".
COMPLEXITY
"score"/"exists"/"peek_*" are O(1); "add"/"remove"/"incr"/"rank"/
"at_rank"/"pop_*" and locating a range bound are O(log n); a range or
iteration of "k" members is O(log n + k), scanning sequentially through
the linked leaves.
STATS
stats() returns a hashref with keys: "count", "max_entries", "height"
(B+tree height), "node_capacity", "nodes_used", "index_slots",
"index_load" (occupied fraction of the member index), "ops" (running
count of write-path calls, whether or not they changed the set), and
"mmap_size" (bytes).
SECURITY
The mmap region is writable by all processes that open it. Do not share
backing files with untrusted processes.
CRASH SAFETY
The write lock is a futex-based rwlock with PID-encoded ownership; if a
writer dies while holding it, the next writer detects the dead owner and
recovers. Reader slots are reclaimed similarly. Limitation: PID reuse is
not detected, which is very unlikely in practice but cannot be ruled
out.
SEE ALSO
Data::SortedSet::Shared::Strings (string-keyed variant, bundled with
this distribution), Data::Intern::Shared, Data::SpatialHash::Shared,
Data::HashMap::Shared, Data::Heap::Shared, Data::Graph::Shared, and the
rest of the "Data::*::Shared" family.
AUTHOR
vividsnow
LICENSE
This is free software; you can redistribute it and/or modify it under
the same terms as Perl itself.
( run in 0.310 second using v1.01-cache-2.11-cpan-bbe5e583499 )