Hash-SharedMem

 view release on metacpan or  search on metacpan

DESIGN  view on Meta::CPAN

but included to satisfy Perl's requirements for an SV buffer, so that an
SV can be constructed to represent the string without needing to copy the
string.)  The string can contain octets of any value, including NUL; the
explicit length is the canonical way to determine where the string ends.
A string in the data file cannot contain anything that's not an octet
(or, via Perl's aliasing, a Latin-1 character).

A collection of keys and values is represented as a btree (actually a
B+-tree), in which the keys are sorted lexicographically.  The maximum
number of entries in a btree node is a parameter to the design, called
MAXFANOUT, on which all users of a shash must agree.  It must be odd,
and between 3 and 255 inclusive.  (Lower limit is inherent in btree
structure; upper limit is to fit in a byte.)  A suggested value for
it is 15.  Each btree node except for the root node contains between
(MAXFANOUT+1)/2 and MAXFANOUT (inclusive) entries.  The root node
contains between 2 and MAXFANOUT (inclusive) entries if there is at
least one layer of btree nodes below it, or between 0 and MAXFANOUT
(inclusive) entries if its entries are individual key/value pairs.
A btree node begins with a word header, in which bits are allocated thus:

	0-5   layer (0 = pointing directly to key/value pairs)
	6-7   zero padding
	8-15  fanout (range 0 to MAXFANOUT inclusive)
	16-63 zero padding

(Note: six bits for the layer number is enough to accommodate the maximum
possible number of addressable nodes in a btree with the minimum possible
fanout.  Trees in practice are unlikely to get anywhere near this height.)

The node header is immediately followed by an array of entries, as
many as indicated by the header.  Each entry consists of two words.
The first word is a pointer to a string object representing the first
(lexicographically earliest) key in the collection represented by the
entry.  The second word is a pointer to either a btree node of the next
layer down (if this node is not level zero) or a string representing a
value (if this node is level zero).  The key to which the entry refers
is either the first key in the collection represented by the subnode,
or the key under which the value is stored, respectively.

String and btree node objects are permitted to be multiply referenced.
(The key arrangement implies that a btree node cannot appear more than
once in a single tree, but it is expected that nodes will appear in
multiple trees.)  It is also permitted for their representations to
overlap, with each other or even with the file header, provided that
they do not use lines that contain mutable data.  (Overlapping isn't
very useful, except that the empty btree node and empty string, whose
representations are all-bits-zero, can sensibly be aliased to padding
in the header.)

For compatibility checking purposes, the design parameters (other than
endianness) are encapsulated as a word quantity, in which bits are
allocated thus:

	0-7   log2(line/byte)
	8-15  log2(page/byte)
	16-23 MAXFANOUT
	24-63 zero padding

If, in the future, there is any change in the file formats, it should be
indicated by setting bits in what is currently zero padding.  This might
be formulated as setting bit flags, or as changing a version number.
ext2's concept of readonly-compatible changes may be useful, but
comprehensibility of a shash to old code is not generally a priority.

A data file begins with this header:

	* word: magic number 0xc693dac5ed5e47c2

	* word: design parameters

	* word: total file length in bytes (must be page-aligned, and
	  must match physical file length)

	* zero pad to line alignment

	* word: offset of next point in file that may be allocated (must
	  be line-aligned; if equal to total file length then the file
	  is full)

	* zero pad to line alignment

	* word: root pointer and handoff flag (see notes below)

	* zero pad to line alignment

The earliest point at which the next-allocation word can point is the
end of the header.

A process wishing to write data into the file must allocate space for
it by (atomically) moving the next-allocation pointer.  It owns the
lines located between the old and new positions of the pointer.  It may
initially write arbitrarily to lines that it owns.  However, once a line
has been made reachable from the root pointer (by containing (part of)
a reachable object), that line must not be written to again, even if it
later becomes unreachable from the root.  (It remains reachable to anyone
who read the root pointer when it was reachable.)  A process may give
back allocated lines, by moving the next-allocation pointer backwards.
Such lines must not have been made reachable (i.e., must still be legal
to write to), and (due to the single allocation pointer arrangement)
can only be given back when they immediately precede the current
next-allocation position.

There are no restrictions on the content of file space that is neither
part of the header nor has ever been part of a reachable object.
This includes parts of reachable lines that don't form objects, lines
owned by some process that have not been made reachable, and the unowned
lines following the next-allocation position.

A data file is identified, among the data files in a single shash, by a
word quantity.  Identifier zero has special significance, and identifiers
are otherwise assigned sequentially by means described below.  Data file
identifiers are permitted to wrap.  This design does not protect against
race conditions that span a large fraction of a data file identifier
cycle.

The root pointer word contains both a pointer and a one-bit flag to
control handoff.  The handoff flag is bit 0, and the rest of the word must
be a valid pointer (with bits 1 and 2 necessarily zero).  The pointer is
read by masking off the handoff flag from the word.  The shash's root is
pointed to by the pointer part of the root pointer word in the current
data file.  The handoff flag governs how a root can be superseded.



( run in 0.460 second using v1.01-cache-2.11-cpan-39bf76dae61 )