Data-HashMap-Shared
lib/Data/HashMap/Shared.pm
# cursor auto-destroyed when out of scope
C<shm_xx_each> is also safe to use with C<remove> during iteration.
Resize/compaction is deferred until iteration ends.
Diagnostics:

    my $cap = shm_xx_capacity $map;        # current table capacity (slots)
    my $tb = shm_xx_tombstones $map;       # tombstone count
    my $au = shm_xx_arena_used $map;       # arena bytes used (0 for int-only)
    my $ac = shm_xx_arena_cap $map;        # arena total capacity (0 for int-only)
    my $sz = shm_xx_mmap_size $map;        # backing file size in bytes
    my $ok = shm_xx_reserve $map, $n;      # pre-grow (false if exceeds max)
    my $ev = shm_xx_stat_evictions $map;   # cumulative LRU eviction count
    my $ex = shm_xx_stat_expired $map;     # cumulative TTL expiration count
    my $rc = shm_xx_stat_recoveries $map;  # cumulative stale lock recovery count
    my $p = $map->path;                    # backing file path (method only)
    my $s = $map->stats;                   # hashref with all diagnostics in one call
    # stats keys: size, capacity, max_entries, tombstones, mmap_size,
    # arena_used, arena_cap, evictions, expired, recoveries, max_size, ttl

C<set_multi>, C<get_multi>, C<remove_multi>, C<get_with_ttl>, C<stats>,
C<path>, C<sync>, and C<unlink> are method-only (no keyword form).
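
The C<stats> hashref can feed simple health checks. The following is a minimal sketch, not part of the module: C<map_health> is a hypothetical helper, and only the key names it reads (C<size>, C<capacity>, C<tombstones>, C<arena_used>, C<arena_cap>) come from the list above. It is fed a stand-in hashref so it runs without a shared map.

```perl
use strict;
use warnings;

# Hypothetical helper: derive fill ratios from a stats-style hashref.
# Only the key names are taken from the documentation; the ratios
# themselves are illustrative, not something the module reports.
sub map_health {
    my ($s) = @_;
    my $cap   = $s->{capacity}  || 0;
    my $arena = $s->{arena_cap} || 0;   # 0 for int-only maps
    return {
        load_factor     => $cap   ? $s->{size}       / $cap   : 0,
        tombstone_ratio => $cap   ? $s->{tombstones} / $cap   : 0,
        arena_fill      => $arena ? $s->{arena_used} / $arena : 0,
    };
}

# Stand-in data; with a real map this would be: my $s = $map->stats;
my $h = map_health({
    size => 250, capacity => 1000, tombstones => 100,
    arena_used => 0, arena_cap => 0,
});
printf "load=%.2f tombstones=%.2f arena=%.2f\n",
    @{$h}{qw(load_factor tombstone_ratio arena_fill)};
```

Guarding the divisions keeps the helper usable for int-only variants, where both arena figures are reported as 0.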
File management:

    $map->sync;                                # flush the mmap to the backing file (msync MS_SYNC)
    $map->unlink;                              # remove backing file (mmap stays valid)
    Data::HashMap::Shared::II->unlink($path);  # class method form

C<sync> issues a synchronous C<msync(2)> over the whole mapping (every
shard, for sharded maps) and dies on error. Use it to force durability of
a file-backed map; it is a no-op for anonymous mappings, which have no
backing file. Changes are visible to other processes sharing the mapping
without C<sync>; it only affects on-disk persistence.
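
Because C<sync> dies on error, a caller that prefers to log and continue can trap the exception. The following is a sketch under assumptions: C<checkpoint> is a hypothetical wrapper, and C<My::SyncStub> is a stand-in class so the example runs without a real map. The only behaviour taken from the text above is that C<sync> dies on failure.

```perl
use strict;
use warnings;

# Hypothetical wrapper: flush a map and report failure instead of dying.
# Works with any object whose ->sync method dies on error.
sub checkpoint {
    my ($map) = @_;
    my $ok = eval { $map->sync; 1 };
    warn "sync failed: $@" unless $ok;
    return $ok ? 1 : 0;
}

# Stand-in object (not the real module) used only to exercise the wrapper.
package My::SyncStub {
    sub new {
        my ($class, %args) = @_;
        return bless { %args }, $class;
    }
    sub sync {
        my ($self) = @_;
        die "msync: I/O error\n" if $self->{broken};
        return;
    }
}

print "checkpoint ok: ", checkpoint(My::SyncStub->new), "\n";
```

With a real map, C<checkpoint($map)> would be called at whatever durability points the application needs; readers in other processes see changes regardless.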
=head2 Crash Safety
If a process dies (e.g., SIGKILL, OOM kill) while holding the write lock,
other processes detect the stale lock within 2 seconds and automatically
recover. The writer's PID is encoded in the rwlock word itself (single
atomic CAS, no crash window). On C<FUTEX_WAIT> timeout, waiters
C<kill($pid, 0)> the holder and CAS-release the lock if it's dead.
Reader-side recovery uses a 1024-slot table in the shared mmap (one slot
per process, claimed lazily on first lock; fork()'d children claim a
fresh slot via C<pthread_atfork>). On a writer-lock timeout the recovery
scan CAS-claims each dead PID's slot, drains the waiter counts, and
force-resets the reader counter once no live reader holds it, so a
worker killed mid-C<incr_by> no longer pins the rwlock indefinitely.
If a live reader is concurrently present, the dead slot is left intact
for the next recovery cycle (preserves the only record of the stuck
counter). Beyond 1024 simultaneous handles per map, new handles skip
slot tracking and fall back to the slow per-timeout drain.
The same path validates and rebuilds the LRU doubly-linked list if a
dead writer left it inconsistent. C<stat_recoveries> (the C<recoveries>
key in C<stats>) counts every recovery event.
Recovery uses C<kill($pid, 0)> for liveness, which cannot distinguish a
reused PID from the original. Hitting a false "alive" requires a process to
die in the brief window it holds a read lock B<and> the kernel to cycle
through the entire PID space back to that exact number within the ~2-second
recovery window B<and> hand it to a long-lived process, i.e. a runaway fork
storm. Even then the effect is bounded: writers stall until the recycled
process exits; reads are unaffected and no data is corrupted. Writer-crash
recovery is immune (the writer PID lives in the lock word and is reclaimed
independently of the slot table).
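
For illustration, the same liveness probe can be written in Perl. This is a sketch of the general signal-0 technique, not the module's internal code: C<pid_alive> is a hypothetical name, and treating C<EPERM> as "alive" is a common convention assumed here rather than documented behaviour of this module.

```perl
use strict;
use warnings;
use Errno qw(EPERM);

# Perl-level analogue of a kill(pid, 0) liveness probe.
# Signal 0 runs the existence and permission checks without
# delivering anything to the target process.
sub pid_alive {
    my ($pid) = @_;
    return 1 if kill 0, $pid;    # process exists and we may signal it
    return 1 if $! == EPERM;     # exists, but belongs to another user
    return 0;                    # typically ESRCH: no such process
}

print "self alive: ", pid_alive($$), "\n";
```

As noted above, a probe of this kind only says that some process currently owns the PID; it cannot tell a recycled PID from the original holder.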
B<Limitation>: PID-based recovery assumes all processes share the same
PID namespace. Cross-container sharing (different PID namespaces) is not
supported.
After recovery from a mid-mutation crash, the map data may be partially
inconsistent (e.g., one entry was being updated when the writer died).
Map structure (locks, LRU, free lists, counters) is restored, but the
specific entry being mutated may have stale or partial bytes. Calling
C<clear> after detecting a stale lock recovery is recommended for
safety-critical applications.
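
One way to act on that recommendation is to watch the recovery counter and clear the map when it advances. The following is a minimal sketch: C<make_recovery_guard> and the C<My::RecoveryStub> stand-in class are hypothetical, and only C<stats> (with its C<recoveries> key) and C<clear> are taken from this documentation. Coordinating which process performs the clear in a multi-process setup is left out.

```perl
use strict;
use warnings;

# Hypothetical guard: remember the recovery counter and wipe the map
# once it has advanced. Relies only on ->stats->{recoveries} and ->clear.
sub make_recovery_guard {
    my ($map) = @_;
    my $seen = $map->stats->{recoveries};
    return sub {
        my $now = $map->stats->{recoveries};
        return 0 if $now == $seen;             # nothing recovered since last check
        $map->clear;                           # discard possibly torn entries
        $seen = $map->stats->{recoveries};     # re-read in case clear touches counters
        return 1;
    };
}

# Stand-in object (not the real module) so the sketch is runnable.
package My::RecoveryStub {
    sub new {
        my ($class) = @_;
        return bless { recoveries => 0, cleared => 0 }, $class;
    }
    sub stats { my ($self) = @_; return { recoveries => $self->{recoveries} } }
    sub clear { my ($self) = @_; $self->{cleared}++; return }
}

my $map   = My::RecoveryStub->new;
my $check = make_recovery_guard($map);
$map->{recoveries}++;                          # simulate a stale-lock recovery
print "cleared: ", $check->(), "\n";
```

Calling the guard before a batch of reads keeps the check cheap, since it costs one C<stats> call when nothing has happened.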
=head1 BENCHMARKS
Throughput versus other shared-memory / on-disk solutions, 25K entries,
single process, Linux x86_64. All values in M ops/s (higher is better).
Run C<perl -Mblib bench/vs.pl 25000> to reproduce.
B<Integer key E<rarr> integer value> (Shared::II):

                BerkeleyDB  LMDB  Shared::II
    INSERT              31    46         184
    LOOKUP              35    40         383
    INCREMENT           16    18         165

B<String key E<rarr> string value, short> (inline E<le> 7B, Shared::SS):

            FastMmap  BerkeleyDB  LMDB  SharedMem  Shared::SS
    INSERT        11          26    40         62         130
    LOOKUP        10          32    34        146         213
    DELETE        14          18    --         32          68

B<String key E<rarr> string value, long> (~50-100B, Shared::SS):

            BerkeleyDB  LMDB  SharedMem  Shared::SS
    INSERT          25    37         61         133
    LOOKUP          30    33        125         229

B<LRU cache lookup> (25K entries, lock-free clock eviction):

        plain  LRU
    II    350  373  (lock-free, ~6% faster via clock)
    SS    159  159

B<Cross-process> (25K SS entries, 2 processes, ops/s):

                 Shared::SS  SharedMem     LMDB
    READS         3,250,000  1,986,000  728,000
    WRITES        2,801,000    826,000   95,000
    MIXED 50/50   3,691,000  1,963,000  211,000

LMDB was benchmarked with MDB_WRITEMAP|MDB_NOSYNC|MDB_NOMETASYNC|MDB_NORDAHEAD,
BerkeleyDB with DB_PRIVATE and a 128MB cache.