Data-HashMap-Shared
# cursor auto-destroyed when out of scope
"shm_xx_each" is also safe to use with "remove" during iteration.
Resize/compaction is deferred until iteration ends.
Diagnostics:
my $cap = shm_xx_capacity $map;        # current table capacity (slots)
my $tb = shm_xx_tombstones $map;       # tombstone count
my $au = shm_xx_arena_used $map;       # arena bytes used (0 for int-only)
my $ac = shm_xx_arena_cap $map;        # arena total capacity (0 for int-only)
my $sz = shm_xx_mmap_size $map;        # backing file size in bytes
my $ok = shm_xx_reserve $map, $n;      # pre-grow (false if exceeds max)
my $ev = shm_xx_stat_evictions $map;   # cumulative LRU eviction count
my $ex = shm_xx_stat_expired $map;     # cumulative TTL expiration count
my $rc = shm_xx_stat_recoveries $map;  # cumulative stale lock recovery count
my $p = $map->path;                    # backing file path (method only)
my $s = $map->stats;                   # hashref with all diagnostics in one call
# stats keys: size, capacity, max_entries, tombstones, mmap_size,
#   arena_used, arena_cap, evictions, expired, recoveries, max_size, ttl
"set_multi", "get_multi", "remove_multi", "get_with_ttl", "stats",
"path", "sync", and "unlink" are method-only (no keyword form).
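As a rough illustration of how the "stats" hashref might be consumed, the sketch below derives a load factor and a tombstone ratio from the "size", "capacity", and "tombstones" keys. The helper name "table_health" is hypothetical, the hash literal stands in for a real "$map->stats" call, and the sketch assumes "size" is the number of live entries.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the module): summarise the hashref
# returned by $map->stats. Only the documented keys size, capacity and
# tombstones are read; "size" is assumed to be the live entry count.
sub table_health {
    my ($stats) = @_;
    my $cap = $stats->{capacity} or return;    # avoid division by zero
    return {
        load       => $stats->{size} / $cap,         # live entries per slot
        tombstones => $stats->{tombstones} / $cap,   # deleted slots per slot
    };
}

# Stand-in for: my $stats = $map->stats;
my $stats  = { size => 512, capacity => 1024, tombstones => 256 };
my $health = table_health($stats);
printf "load %.2f, tombstones %.2f\n",
    $health->{load}, $health->{tombstones};
```

Reading everything from one "stats" call keeps the numbers closer to a single point in time than several separate keyword calls would.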
File management:
$map->sync; # flush the mmap to the backing file (msync MS_SYNC)
$map->unlink; # remove backing file (mmap stays valid)
Data::HashMap::Shared::II->unlink($path); # class method form
"sync" issues a synchronous msync(2) over the whole mapping (every
shard, for sharded maps) and dies on error. Use it to force durability
of a file-backed map; it is a no-op for anonymous mappings, which have
no backing file. Changes are visible to other processes sharing the
mapping without "sync"; it only affects on-disk persistence.
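Because "sync" dies on error, a caller that prefers to log and continue may want to trap the exception. A minimal sketch, assuming only the documented "sync" method; the wrapper name "try_sync" is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical wrapper (not part of the module): flush a map and report
# failure instead of dying. $map is any handle with the documented
# sync method.
sub try_sync {
    my ($map) = @_;
    my $ok = eval { $map->sync; 1 };   # sync dies on msync failure
    warn "sync failed: $@" unless $ok;
    return $ok ? 1 : 0;
}
```

A periodic checkpoint loop could call "try_sync($map)" and retry or alert on a 0 return, rather than letting one failed flush take the process down.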
Crash Safety
If a process dies (e.g., SIGKILL, OOM kill) while holding the write
lock, other processes detect the stale lock within 2 seconds and
automatically recover. The writer's PID is encoded in the rwlock word
itself (single atomic CAS, no crash window). On "FUTEX_WAIT" timeout,
waiters probe the holder with "kill($pid, 0)" and CAS-release the lock
if the holder is dead.
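The liveness probe itself can be expressed in a few lines of Perl. This is an illustration of the "kill($pid, 0)" technique only, not the module's internal code; treating EPERM as "alive" is an assumption of this sketch.

```perl
use strict;
use warnings;
use Errno qw(EPERM);

# Illustration only: signal 0 delivers nothing; it merely checks whether
# the PID exists and could be signalled.
sub pid_alive {
    my ($pid) = @_;
    return 1 if kill 0, $pid;       # process exists and we may signal it
    return $! == EPERM ? 1 : 0;     # exists but belongs to another user
}

print "self alive: ", pid_alive($$), "\n";
```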
Reader-side recovery uses a 1024-slot table in the shared mmap (one slot
per process, claimed lazily on first lock; fork()'d children claim a
fresh slot via "pthread_atfork"). On a writer-lock timeout the recovery
scan CAS-claims each dead PID's slot, drains the waiter counts, and
force-resets the reader counter once no live reader holds it, so a
worker killed mid-"incr_by" no longer pins the rwlock indefinitely. If a
live reader is concurrently present, the dead slot is left intact for
the next recovery cycle (preserves the only record of the stuck
counter). Beyond 1024 simultaneous handles per map, new handles skip
slot tracking and fall back to the slow per-timeout drain.
The same path validates and rebuilds the LRU doubly-linked list if a
dead writer left it inconsistent. "shm_xx_stat_recoveries" (the
"recoveries" key in "stats") counts every recovery event.
Recovery uses "kill($pid, 0)" for liveness, which cannot distinguish a
reused PID from the original. Hitting a false "alive" requires a process
to die in the brief window it holds a read lock and the kernel to cycle
through the entire PID space back to that exact number within the
~2-second recovery window and hand it to a long-lived process, i.e. a
runaway fork storm. Even then the effect is bounded: writers stall until
the recycled process exits; reads are unaffected and no data is
corrupted. Writer-crash recovery is immune (the writer PID lives in the
lock word and is reclaimed independently of the slot table).
Limitation: PID-based recovery assumes all processes share the same PID
namespace. Cross-container sharing (different PID namespaces) is not
supported.
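On Linux, cooperating processes could compare PID namespace identifiers before attaching to the same map. The sketch below is an assumption-laden illustration: it relies on the Linux-specific "/proc/self/ns/pid" symlink, and the helper name "pid_namespace_id" is hypothetical.

```perl
use strict;
use warnings;

# Linux-only sketch: return an identifier such as "pid:[4026531836]" for
# the caller's PID namespace, or undef where /proc/self/ns/pid is not
# available or not readable.
sub pid_namespace_id {
    return readlink '/proc/self/ns/pid';
}

my $id = pid_namespace_id();
print defined $id ? "namespace: $id\n" : "namespace: unknown\n";
```

Two processes that report the same string are in the same PID namespace; how they exchange the string (a sidecar file, an environment variable) is left to the application.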
After recovery from a mid-mutation crash, the map data may be partially
inconsistent (e.g., one entry was being updated when the writer died).
Map structure (locks, LRU, free lists, counters) is restored, but the
specific entry being mutated may have stale or partial bytes. Calling
"clear" after detecting a stale lock recovery is recommended for
safety-critical applications.
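One way to act on that recommendation is to watch the "recoveries" counter and clear the map when it grows. A minimal sketch: the function name "make_recovery_guard" is hypothetical, and it assumes "clear" is available in method form alongside the documented "stats" method.

```perl
use strict;
use warnings;

# Hypothetical guard (not part of the module): remember the recovery
# counter and wipe the map when it increases. Assumes $map provides the
# documented stats method and a clear method.
sub make_recovery_guard {
    my ($map) = @_;
    my $seen = $map->stats->{recoveries};
    return sub {
        my $now = $map->stats->{recoveries};
        return 0 if $now == $seen;   # nothing recovered since last check
        $seen = $now;
        $map->clear;                 # discard possibly half-written entries
        return 1;
    };
}
```

Calling the returned closure before each batch of work returns 1 when a recovery was observed and the map was cleared. Every process running such a guard would clear independently, so an application may prefer to designate a single process for the check.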
BENCHMARKS
Throughput versus other shared-memory / on-disk solutions, 25K entries,
single process, Linux x86_64. All values in M ops/s (higher is better).
Run "perl -Mblib bench/vs.pl 25000" to reproduce.
Integer key → integer value (Shared::II):
            BerkeleyDB   LMDB   Shared::II
INSERT              31     46          184
LOOKUP              35     40          383
INCREMENT           16     18          165
String key → string value, short (inline ≤ 7B, Shared::SS):
         FastMmap   BerkeleyDB   LMDB   SharedMem   Shared::SS
INSERT         11           26     40          62          130
LOOKUP         10           32     34         146          213
DELETE         14           18     --          32           68
String key → string value, long (~50-100B, Shared::SS):
         BerkeleyDB   LMDB   SharedMem   Shared::SS
INSERT           25     37          61          133
LOOKUP           30     33         125          229
LRU cache lookup (25K entries, lock-free clock eviction):
     plain   LRU
II     350   373   (lock-free, ~6% faster via clock)
SS     159   159
Cross-process (25K SS entries, 2 processes, ops/s):
              Shared::SS   SharedMem      LMDB
READS          3,250,000   1,986,000   728,000
WRITES         2,801,000     826,000    95,000
MIXED 50/50    3,691,000   1,963,000   211,000
LMDB was benchmarked with
MDB_WRITEMAP|MDB_NOSYNC|MDB_NOMETASYNC|MDB_NORDAHEAD, BerkeleyDB with
DB_PRIVATE and a 128MB cache.