Data-HashMap-Shared

        # cursor auto-destroyed when out of scope

    "shm_xx_each" is also safe to use with "remove" during iteration.
    Resize/compaction is deferred until iteration ends.

    Diagnostics:

        my $cap = shm_xx_capacity $map;           # current table capacity (slots)
        my $tb  = shm_xx_tombstones $map;         # tombstone count
        my $au  = shm_xx_arena_used $map;         # arena bytes used (0 for int-only)
        my $ac  = shm_xx_arena_cap $map;          # arena total capacity (0 for int-only)
        my $sz  = shm_xx_mmap_size $map;          # backing file size in bytes
        my $ok  = shm_xx_reserve $map, $n;        # pre-grow (false if exceeds max)
        my $ev  = shm_xx_stat_evictions $map;     # cumulative LRU eviction count
        my $ex  = shm_xx_stat_expired $map;       # cumulative TTL expiration count
        my $rc  = shm_xx_stat_recoveries $map;    # cumulative stale lock recovery count
        my $p   = $map->path;                     # backing file path (method only)
        my $s   = $map->stats;                    # hashref with all diagnostics in one call
        # stats keys: size, capacity, max_entries, tombstones, mmap_size,
        #   arena_used, arena_cap, evictions, expired, recoveries, max_size, ttl

    "set_multi", "get_multi", "remove_multi", "get_with_ttl", "stats",
    "path", "sync", and "unlink" are method-only (no keyword form).
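
    As an illustration, the counters returned by "stats" can be turned
    into simple health ratios. The sketch below is not part of the module:
    "table_health" is a hypothetical helper, and it assumes that "size" is
    the live entry count and that "capacity" and "tombstones" are both
    measured in slots. The sample numbers are made up.

```perl
use strict;
use warnings;

# Hypothetical helper: derive ratios from a stats-style hashref.
# Assumes keys "size", "capacity" and "tombstones" as listed above.
sub table_health {
    my ($s) = @_;
    my $cap = $s->{capacity} or return;    # no slots: nothing to report
    return {
        load            => $s->{size}       / $cap,   # fraction of slots live
        tombstone_ratio => $s->{tombstones} / $cap,   # fraction of slots dead
    };
}

# Made-up sample values standing in for $map->stats.
my $h = table_health({ size => 50, capacity => 100, tombstones => 25 });
printf "load %.2f, tombstones %.2f\n", $h->{load}, $h->{tombstone_ratio};
```

    With a real map the helper would presumably be called as
    table_health($map->stats).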

    File management:

        $map->sync;                               # flush the mmap to the backing file (msync MS_SYNC)
        $map->unlink;                             # remove backing file (mmap stays valid)
        Data::HashMap::Shared::II->unlink($path); # class method form

    "sync" issues a synchronous msync(2) over the whole mapping (every
    shard, for sharded maps) and dies on error. Use it to force durability
    of a file-backed map; it is a no-op for anonymous mappings, which have
    no backing file. Changes are visible to other processes sharing the
    mapping without "sync" — it only affects on-disk persistence.

  Crash Safety
    If a process dies (e.g., SIGKILL, OOM kill) while holding the write
    lock, other processes detect the stale lock within 2 seconds and
    automatically recover. The writer's PID is encoded in the rwlock word
    itself (single atomic CAS, no crash window). On "FUTEX_WAIT" timeout,
    waiters probe the holder with "kill($pid, 0)" and CAS-release the lock
    if the holder is dead.
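
    The liveness probe can be sketched in plain Perl. The module performs
    this check in its native code, so the following is only an
    illustration: "pid_alive" is a hypothetical name, and treating EPERM
    as "alive" is an assumption about reasonable behaviour, not a
    statement about what the module does.

```perl
use strict;
use warnings;
use Errno qw(EPERM);

# Hypothetical helper: true if a process with this PID currently exists.
# Signal 0 delivers nothing; the kernel only runs its existence and
# permission checks.
sub pid_alive {
    my ($pid) = @_;
    return 1 if kill 0, $pid;
    # EPERM means the process exists but we may not signal it.
    return $!{EPERM} ? 1 : 0;
}

printf "current process alive: %d\n", pid_alive($$);
```

    A recovery path would call such a probe only after a lock wait has
    timed out, so the extra system call stays off the fast path.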

    Reader-side recovery uses a 1024-slot table in the shared mmap (one slot
    per process, claimed lazily on first lock; fork()'d children claim a
    fresh slot via "pthread_atfork"). On a writer-lock timeout the recovery
    scan CAS-claims each dead PID's slot, drains the waiter counts, and
    force-resets the reader counter once no live reader holds it — so a
    worker killed mid-"incr_by" no longer pins the rwlock indefinitely. If a
    live reader is concurrently present, the dead slot is left intact for
    the next recovery cycle (preserves the only record of the stuck
    counter). Beyond 1024 simultaneous handles per map, new handles skip
    slot tracking and fall back to the slow per-timeout drain.

    The same path validates and rebuilds the LRU doubly-linked list if a
    dead writer left it inconsistent. "stat_recoveries" (the "recoveries"
    key in "stats") counts every recovery event.

    Recovery uses "kill($pid, 0)" for liveness, which cannot distinguish a
    reused PID from the original. Hitting a false "alive" requires a process
    to die in the brief window it holds a read lock and the kernel to cycle
    through the entire PID space back to that exact number within the
    ~2-second recovery window and hand it to a long-lived process — i.e. a
    runaway fork storm. Even then the effect is bounded: writers stall until
    the recycled process exits; reads are unaffected and no data is
    corrupted. Writer-crash recovery is immune (the writer PID lives in the
    lock word and is reclaimed independently of the slot table).

    Limitation: PID-based recovery assumes all processes share the same PID
    namespace. Cross-container sharing (different PID namespaces) is not
    supported.

    After recovery from a mid-mutation crash, the map data may be partially
    inconsistent (e.g., one entry was being updated when the writer died).
    Map structure (locks, LRU, free lists, counters) is restored, but the
    specific entry being mutated may have stale or partial bytes. Calling
    "clear" after detecting a stale lock recovery is recommended for
    safety-critical applications.
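
    One way to act on this recommendation is to compare the cumulative
    recovery counter before and after a unit of work. This is a minimal
    sketch, not the module's own mechanism: "run_with_recovery_check" is a
    hypothetical helper, and it assumes only that the map object provides
    the "stats" and "clear" methods mentioned above. Because the counter
    is cumulative for the whole map, a recovery performed by any process
    during the interval triggers the clear.

```perl
use strict;
use warnings;

# Hypothetical helper: run $work against $map, then clear the map if the
# "recoveries" counter advanced while the work was running.
# Returns 1 if the map was cleared, 0 otherwise.
sub run_with_recovery_check {
    my ($map, $work) = @_;
    my $before = $map->stats->{recoveries};
    $work->($map);
    my $after = $map->stats->{recoveries};
    if ($after > $before) {
        $map->clear;    # discard possibly half-written entries
        return 1;
    }
    return 0;
}
```

    Clearing discards all entries, so this suits caches and other data
    that can be rebuilt; applications that cannot afford the loss need
    their own validation of the affected entries.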

BENCHMARKS
    Throughput versus other shared-memory / on-disk solutions, 25K entries,
    single process, Linux x86_64. All values in M ops/s (higher is better).
    Run "perl -Mblib bench/vs.pl 25000" to reproduce.

    Integer key → integer value (Shared::II):

                  BerkeleyDB   LMDB   Shared::II
        INSERT          31       46         184
        LOOKUP          35       40         383
        INCREMENT       16       18         165

    String key → string value, short (inline ≤ 7B, Shared::SS):

                  FastMmap   BerkeleyDB   LMDB   SharedMem   Shared::SS
        INSERT        11          26       40        62          130
        LOOKUP        10          32       34       146          213
        DELETE        14          18       --        32           68

    String key → string value, long (~50-100B, Shared::SS):

                  BerkeleyDB   LMDB   SharedMem   Shared::SS
        INSERT        25         37        61          133
        LOOKUP        30         33       125          229

    LRU cache lookup (25K entries, lock-free clock eviction):

                  plain   LRU
        II         350    373   (lock-free, ~6% faster via clock)
        SS         159    159

    Cross-process (25K SS entries, 2 processes, ops/s):

                      Shared::SS   SharedMem       LMDB
        READS        3,250,000    1,986,000     728,000
        WRITES       2,801,000      826,000      95,000
        MIXED 50/50  3,691,000    1,963,000     211,000

    LMDB was benchmarked with
    MDB_WRITEMAP|MDB_NOSYNC|MDB_NOMETASYNC|MDB_NORDAHEAD, and BerkeleyDB
    with DB_PRIVATE and a 128MB cache.


