Algorithm-MinPerfHashTwoLevel
lib/Algorithm/MinPerfHashTwoLevel.pm
x = ((x >> 16) ^ x) * 0x45d9f3b;
x = ((x >> 16) ^ x);
which is just a simple 32-bit integer hash function I found at
https://stackoverflow.com/a/12996028, but any decent reversible
integer hash function would do.
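To illustrate what "reversible" means here, the following is a minimal sketch of such a mixer together with its inverse. The names mix32, unmix32 and inverse_mod_2_32 are hypothetical, and the use of two multiply rounds is an assumption based on the linked Stack Overflow answer, since the excerpt above shows only the tail of the function. It is not taken from this module's source.

```c
#include <assert.h>
#include <stdint.h>

/* Multiplier that appears in the excerpt above. */
#define MIX_MULT UINT32_C(0x45d9f3b)

/* Forward mix. Two multiply rounds is an ASSUMPTION (see lead-in). */
static uint32_t mix32(uint32_t x) {
    x = ((x >> 16) ^ x) * MIX_MULT;
    x = ((x >> 16) ^ x) * MIX_MULT;
    x = (x >> 16) ^ x;
    return x;
}

/* Multiplicative inverse of an odd value modulo 2^32, found by
 * Newton iteration; each step doubles the number of correct bits. */
static uint32_t inverse_mod_2_32(uint32_t a) {
    uint32_t inv = a;            /* correct to 3 bits for any odd a */
    for (int i = 0; i < 5; i++)
        inv *= 2u - a * inv;
    return inv;
}

/* Inverse mix: undo the steps of mix32 in reverse order. */
static uint32_t unmix32(uint32_t x) {
    const uint32_t inv = inverse_mod_2_32(MIX_MULT);
    x = (x >> 16) ^ x;           /* a 16-bit xorshift undoes itself */
    x *= inv; x = (x >> 16) ^ x;
    x *= inv; x = (x >> 16) ^ x;
    return x;
}

/* Small self-check of the round trip. */
static void mix32_selfcheck(void) {
    assert((uint32_t)(MIX_MULT * inverse_mod_2_32(MIX_MULT)) == 1u);
    assert(unmix32(mix32(UINT32_C(0xdeadbeef))) == UINT32_C(0xdeadbeef));
}
```

Because every step (xorshift by 16, multiplication by an odd constant) is a bijection on 32-bit values, distinct inputs always map to distinct outputs.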
*NOTE* In Perl a given string may have different binary representations
depending on whether or not it is encoded as utf8. This module uses the
same conventions as Perl itself, which is that keys are stored in their
minimal form when possible, and are only stored in their unicode (utf8)
form when they cannot be downgraded to latin-1. This ensures that the
unicode and latin-1 representations of a given string are treated as the
same key. This module deals with this by "normalizing" the keys and values
into latin-1 where possible, and tracking the original representation as a
flag. See key_normalized and key_is_utf8 (and their 'val' equivalents)
documented in the construct method.
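The normalization described above can be sketched roughly as follows. This is an illustration only: normalize_key and the KEY_* names are hypothetical, the flag values 0, 1 and 2 follow the "latin-1/not-utf8", "is-utf8" and "was-utf8" convention documented for the on-disk format, and the treatment of pure-ASCII utf8 input as "was-utf8" is an assumption. The input is assumed to be well-formed UTF-8 and the output buffer must not overlap the input.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names for the three flag values. */
enum { KEY_NOT_UTF8 = 0, KEY_IS_UTF8 = 1, KEY_WAS_UTF8 = 2 };

/* Normalize a key to its minimal form.
 *   in, len  : key bytes
 *   is_utf8  : nonzero if the bytes are UTF-8 encoded
 *   out      : buffer of at least len bytes
 *   out_len  : receives the normalized length
 * Returns the flag describing the original representation. */
static int normalize_key(const unsigned char *in, size_t len, int is_utf8,
                         unsigned char *out, size_t *out_len) {
    size_t i, n = 0;

    if (!is_utf8) {              /* already in minimal (byte) form */
        memcpy(out, in, len);
        *out_len = len;
        return KEY_NOT_UTF8;
    }
    for (i = 0; i < len; i++) {
        unsigned char c = in[i];
        if (c < 0x80) {
            out[n++] = c;        /* ASCII is identical in both forms */
        } else if ((c == 0xC2 || c == 0xC3) && i + 1 < len
                   && (in[i + 1] & 0xC0) == 0x80) {
            /* U+0080..U+00FF: two UTF-8 bytes become one latin-1 byte */
            out[n++] = (unsigned char)(((c & 0x03) << 6) | (in[i + 1] & 0x3F));
            i++;
        } else {
            /* code point above U+00FF: cannot be downgraded */
            memcpy(out, in, len);
            *out_len = len;
            return KEY_IS_UTF8;
        }
    }
    *out_len = n;
    return KEY_WAS_UTF8;         /* downgraded; restore as utf8 on fetch */
}
```

With this scheme the utf8 and latin-1 spellings of "café" normalize to the same four bytes and differ only in the recorded flag.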
=head2 METHODS
=over 4
=item new
lib/Tie/Hash/MinPerfHashTwoLevel/OnDisk.pm
Currently there is only one supported file format variant, 5.
The file structure consists of a header, followed by a byte vector of seed/state
data for the hash function, followed by a bucket table with records of a fixed size,
optionally followed by a bitvector of flags for the keys with two bits per key,
optionally followed by a bitvector of flags for the values with one bit per value,
followed by a string table containing the comment for the file and the strings it
contains, and lastly a checksum: the last 8 bytes of the file contain a hash of the
rest of the file. The key flags may be 0 for "latin-1/not-utf8", 1 for "is-utf8",
or 2 for "was-utf8", which is used for keys that can be represented as latin-1
but should be restored as unicode/utf8. The val flags are similar but do not (need to)
support "was-utf8".
Structure:
Header
Hash-state
Bucket-table
Key flags (optional)
Val flags (optional)
Strings
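As an illustration of the "two bits per key" flag vector mentioned above, a packed bitvector can be read and written as in the sketch below. The function names are hypothetical, and the bit order (four keys per byte, lowest-numbered key in the least significant bits) is an assumption made for the example; the actual on-disk layout may differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of bytes needed to hold 2-bit flags for n keys. */
static size_t key_flag_bytes(size_t n) {
    return (n + 3) / 4;
}

/* Store the 2-bit flag for key number idx.
 * ASSUMED layout: key 0 occupies bits 0-1 of byte 0, key 1 bits 2-3, ... */
static void key_flag_set(uint8_t *vec, size_t idx, unsigned flag) {
    unsigned shift = (unsigned)(idx & 3) * 2;
    vec[idx >> 2] = (uint8_t)((vec[idx >> 2] & ~(3u << shift))
                              | ((flag & 3u) << shift));
}

/* Fetch the 2-bit flag for key number idx. */
static unsigned key_flag_get(const uint8_t *vec, size_t idx) {
    return (vec[idx >> 2] >> ((idx & 3) * 2)) & 3u;
}
```

The one-bit-per-value vector would work the same way with eight entries per byte and a mask of 1 instead of 3.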
padlist_dup|||
padlist_store|||
padname_dup|||
padname_free|||
padnamelist_dup|||
padnamelist_free|||
parse_body|||
parse_gv_stash_name|||
parse_ident|||
parse_lparen_question_flags|||
parse_unicode_opts|||
parse_uniprop_string|||
parser_dup|||
parser_free_nexttoke_ops|||
parser_free|||
path_is_searchable|||n
peep|||
pending_ident|||
perl_alloc_using|||n
perl_alloc|||n
perl_clone_using|||n
#endif
#ifndef PERL_PV_PRETTY_DUMP
# define PERL_PV_PRETTY_DUMP PERL_PV_PRETTY_ELLIPSES|PERL_PV_PRETTY_QUOTE
#endif
#ifndef PERL_PV_PRETTY_REGPROP
# define PERL_PV_PRETTY_REGPROP PERL_PV_PRETTY_ELLIPSES|PERL_PV_PRETTY_LTGT|PERL_PV_ESCAPE_RE
#endif
/* Hint: pv_escape
* Note that unicode functionality is only backported to
* those perl versions that support it. For older perl
* versions, the implementation will fall back to bytes.
*/
#ifndef pv_escape
#if defined(NEED_pv_escape)
static char * DPPP_(my_pv_escape)(pTHX_ SV * dsv, char const * const str, const STRLEN count, const STRLEN max, STRLEN * const escaped, const U32 flags);
static
#else
extern char * DPPP_(my_pv_escape)(pTHX_ SV * dsv, char const * const str, const STRLEN count, const STRLEN max, STRLEN * const escaped, const U32 flags);