Algorithm-Toy-HashSC
view release on metacpan or search on metacpan
lib/Algorithm/Toy/HashSC.pm view on Meta::CPAN
# -*- Perl -*-
#
# Toy deterministic separate chain hash implementation, based on code in
# "Algorithms (4th Edition)" by Robert Sedgewick and Kevin Wayne. This
# code is not for any sort of use where performance is critical, or
# where malicious input may cause "Algorithmic Complexity Attacks" (see
# perlsec(1)).
#
# run perldoc(1) on this file for additional documentation
package Algorithm::Toy::HashSC;
use 5.010;
use strict;
use warnings;
use Carp qw/croak/;
use Moo;
use namespace::clean;
use Scalar::Util qw/looks_like_number/;
our $VERSION = '0.02';
##############################################################################
#
# ATTRIBUTES
# Each list should end up with ~N/M key-value pairs, assuming the input
# is not malicious, and that the hash function is perfect enough. "M"
# here is the modulus, and "N" is the number of key-value pairs added.
#
# Internally, it's an array of array of arrays, or something like that.
has _chain => (
is => 'rw',
default => sub { [] },
);
has modulus => (
is => 'rw',
default => sub { 7 },
coerce => sub {
die 'modulus must be a positive integer > 1'
if !defined $_[0]
or !looks_like_number $_[0]
or $_[0] < 2;
return int $_[0];
},
trigger => sub {
my ($self) = @_;
# clobber extant hash (Moo does not provide old value, so cannot do
# this only when the modulus changes, oh well)
$self->_chain( [] ) unless $self->unsafe;
},
);
# Boolean, disables various sanity checks if set to a true value (in
# particular whether the hash is cleared when the modulus is changed).
has unsafe => (
is => 'rw',
default => sub { 0 },
coerce => sub { $_[0] ? 1 : 0 },
);
##############################################################################
#
# METHODS
lib/Algorithm/Toy/HashSC.pm view on Meta::CPAN
for my $i ( 0 .. $#$chain ) {
if ( $key eq $chain->[$i][0] ) {
my $kvpair = splice @$chain, $i, 1;
return $kvpair->[1];
}
}
}
return;
}
1;
__END__
=head1 NAME
Algorithm::Toy::HashSC - a toy separate chain hash implementation
=head1 SYNOPSIS
use Algorithm::Toy::HashSC;
my $h = Algorithm::Toy::HashSC->new;
$h->put("key", "value");
$h->get("key"); # "value"
$h->put("blah", 42);
$h->keys; # "key","blah"
$h->take("blah"); # 42
$h->keys; # "key"
# perhaps more interesting once more key/value pairs are added
$h->keys_with("key") # "key"
# keys in a particular chain (from 0 to the modulus-1)
$h->keys_in(0);
# reset things
$h->clear_hash;
# change the number of chains (or buckets). this will destory
# any prior contents of the hash
$h->modulus(2);
# or the modulus can be set via the constructor
$h = Algorithm::Toy::HashSC->new( modulus => 2 );
# Danger zone!
$h->unsafe(1);
=head1 DESCRIPTION
This is a toy separate chain hash implementation; productive uses are
left as an exercise to the reader, probably music or artwork where the
particulars of the hash code and modulus groups the data in a desired
manner; this ordering or grouping can help determine e.g. pitch sets,
rhythmic material, etc. Hence, the B<keys_in> and B<keys_with> methods
to obtain the keys in a chain, or with a particular key. Variety could
be added by varying the modulus, or changing the B<hash> or
B<hashcode> methods.
This module is not for use where performance is a concern, or where
untrusted user input may be supplied for the key material.
L<perlsec/"Algorithmic Complexity Attacks"> discusses why Perl's hash
are no longer deterministic and thus not suitable for deterministic
music composition.
=head1 CONSTRUCTOR
The B<new> method accepts any of the L</"ATTRIBUTES">.
=head1 ATTRIBUTES
=over 4
=item B<_chain>
Where the keys and values are stored. Internal value. No peeking!
=item B<modulus> I<an-integer-greater-than-one>
Gets or sets the B<modulus> attribute. This determines the number of
chains available. It probably should be a prime number (and must be
greater than one) to better help evenly distribute the keys. A smaller
B<modulus> will cause longer chains, that is, more keys and values
lumped together.
Setting this value will clear the contents of the hash (by default).
=item B<unsafe> I<boolean>
If set to a true value, will allow unsafe operations. Possible side-
effects include old keys and values lingering in the hash (use
B<clear_hash> if this is a problem) or keys and values not being
available, or to allow duplicate keys to be stored (depending on the
particulars of B<modulus> and the result of the B<hash> calculation).
(A caller could pass keys with a B<hashcode> method that violates the
B<unsafe> setting, but that's their problem (or feature).)
=back
=head1 METHODS
=over 4
=item B<clear_hash>
Clears the contents of the hash and returns the object.
The hash may also be cleared by default when various attributes
are altered.
=item B<get> I<key>
Obtains the value for the given I<key>. The key may be a scalar value,
or a coderef that provides a B<hashcode> method. Equality is tested for
via the C<eq> operator (L<perlop/"Equality Operators">).
Like B<take> but not destructive.
=item B<hash> I<key>
Used internally to calculate the index of the chain (or bucket) where
a given I<key> resides, within the limits of the B<modulus>
attribute. This calculation can be adjusted by supplying keys with a
B<hashcode> method.
=item B<keys>
Returns the keys present in the hash, if any. Keys are ordered based
on the structure of the hash chains, and this ordering will not
change unless the B<modulus> attribute or B<hash> or B<hashcode>
methods are altered.
=item B<keys_in> I<chain-number>
Returns the keys in the given I<chain-number> where I<chain-number> is
less than B<modulus>, if any.
=item B<keys_with> I<key>
Returns the keys in the same chain as the given key, if any. As with
B<keys>, this grouping will not change unless various attributes or
methods are altered. A smaller B<modulus> will cause more keys to
group together.
=item B<put> I<key> I<value>
Adds the given key and value to the hash. The I<key> may be an object
with a B<hashcode> method; this method should return a number that for
the expected set of key values results in evenly distributed numbers
across the given B<modulus>. If the key already exists in the hash, the
value will be updated.
=item B<take> I<key>
Deletes the given key/value pair from the hash, and returns the value. A
destructive version of B<get>.
=back
=head1 BUGS
=head2 Reporting Bugs
Bugs, patches, and whatnot might best be applied towards:
L<http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Algorithm-Toy-HashSC>
L<https://github.com/thrig/Algorithm-Toy-HashSC>
=head2 Known Issues
The default hash code calculation has not been tested to determine how
evenly it spreads keys out across the modulus space. As a workaround,
the hash key can be an object that provides a B<hashcode> method, in
which case this issue falls out of scope of this module.
=head1 SEE ALSO
L<perlsec/"Algorithmic Complexity Attacks"> - details on why Perl's hash
do not behave so simply as that of this module do.
"Algorithms" (4th Edition) by Robert Sedgewick and Kevin Wayne.
L<Hash::Util> - insight into Perl's hashes.
The "smhasher" project may help verify whether B<hashcode> functions are
as perfect as possible.
=head1 COPYRIGHT AND LICENSE
Copyright 2015 Jeremy Mates
This program is distributed under the (Revised) BSD License:
L<https://opensource.org/licenses/BSD-3-Clause>
=cut
( run in 1.127 second using v1.01-cache-2.11-cpan-39bf76dae61 )