Cache-Memcached-Turnstile
"cache_get_or_compute" call.
The "compute_time" parameter (in integer seconds) indicates a high
estimate of the time it might take to compute the value on cache miss.
You can generally be a tad generous on this. It defaults to 2 seconds.
The "expiration" parameter indicates the desired expiration time for the
computed value. It defaults to 0, which is unbounded retention. That is
not usually a good idea, so make sure to provide a better value. The
unit is seconds from "now" or, if larger than 30 days, the value is
treated as an absolute Unix timestamp (Memcached's rule, not ours).
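To put these parameters in context, here is a minimal usage sketch. The
"key" and "compute_cb" parameter names and the import list are
assumptions based on the descriptions in this document (see the module
synopsis for the authoritative interface), and expensive_lookup() is a
hypothetical stand-in for your own computation:

    use Cache::Memcached::Fast;
    use Cache::Memcached::Turnstile qw(cache_get_or_compute);

    my $memd = Cache::Memcached::Fast->new(
        { servers => ['127.0.0.1:11211'] }
    );

    # Fetch the value for "some_key", computing and caching it on a miss.
    my $value = cache_get_or_compute(
        $memd,
        key          => 'some_key',
        expiration   => 60,  # seconds from now
        compute_time => 2,   # generous estimate of the computation time
        compute_cb   => sub {
            return expensive_lookup('some_key');
        },
    );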
Finally, the "wait" parameter can either be a function reference, a
number (may be fractional, in unit of seconds), or it may be omitted
altogether. If omitted, "wait" will be set to the "compute_time"
parameter if one was explicitly provided. Otherwise, it defaults to 0.1
seconds to avoid blocking clients too long.
If "wait" is a number (or it was set to a number as per the
aforementioned defaults), and if the running process has a cache miss,
but there is another process already updating the cached value, then we
will wait for "wait" number of seconds and retry to fetch (once).
If "wait" is a function reference, then that function will be called
under the conditions we'd otherwise wait & retry. The function is
invoked as "$wait->($memd_client, $parameter_hashref)". Its return value
is directly returned from "cache_get_or_compute", so if you want logic
similar to the *wait, then retry* logic that is the default, then you
could use a callback like the following:
    wait => sub {
        my ($memd_client, $args) = @_;
        # ... custom logic here ...
        # Retry. But don't go into an infinite loop, thus the empty callback:
        return cache_get_or_compute(
            $memd_client, %$args, "wait" => sub { return() }
        );
    }
"multi_cache_get_or_compute"
This function is the multi-key implementation of the Thundering Herd
protection: it attempts to minimize the number of client-server
round-trips and uses Memcached's batch interface throughout.
"multi_cache_get_or_compute" will attempt to fetch or compute the cached
value for the each of the keys, and will try really hard to avoid more
than one user recomputing any given cached value at any given time. Most
of the interface mimicks that of the single-key version as much as
possible, but there are some important differences highlighted below. As
"keys" needs to be a reference to an array containing array references
of key/expiration pairs. "compute_cb" receives an extra, third,
parameter as compared to the single-key implementation: A references to
an array containing the keys for which values need to be computed. (This
list of keys is possibly only a subset of the original set of keys.) The
callback needs to return a reference to an array of values which
correspond to the computed values for each input key in turn.
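As a sketch of a multi-key call (with the same caveats as the single-key
sketch above: parameter names other than "keys" and "compute_cb" are
assumptions, expensive_lookup() is hypothetical, and the structure of
the return value is not shown here):

    my $values = multi_cache_get_or_compute(
        $memd,
        keys => [
            [ key_a => 60 ],   # key/expiration pairs
            [ key_b => 120 ],
        ],
        compute_time => 5,     # should cover computing *all* missing keys
        compute_cb   => sub {
            my $keys_to_compute = $_[2];  # third parameter: keys to compute
            # Return one computed value per key, in the same order.
            return [ map { expensive_lookup($_) } @$keys_to_compute ];
        },
    );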
The "wait" parameter works fundamentally the same as in the single-key
function, but the callback variant also receives a third parameter: The
list of keys whose values weren't available from the cache and couldn't
be locked for computation. The callback is expected to return a hash
reference of keys and values. This is different from the "compute_cb"
interface to allow for easy calling back into
"multi_cache_get_or_compute" for retries (see the "wait" example for the
single-key implementation above).
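Mirroring the single-key "wait" example above, a retrying callback could
call back into "multi_cache_get_or_compute" and return its result. This
is a sketch only; it simply retries the original arguments once:

    wait => sub {
        my ($memd_client, $args) = @_;  # a third argument lists the missed keys
        # Retry once for the full original set of keys; the empty callback
        # prevents an infinite retry loop.
        return multi_cache_get_or_compute(
            $memd_client, %$args, "wait" => sub { return +{} },
        );
    },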
As with the single-key variant, "compute_time" is the additional
cached-value lifetime for a single value, so it should be at least an
upper bound (or slightly more) on the computation time for a single key.
Alas, there is a trade-off here: Since the implementation seeks to limit
the number of roundtrips as much as possible, it will pass all
keys-to-be-computed to one run of the "compute_cb". This means that the
computation time can add up to be significantly more than the single-key
"compute_time" value, so the "compute_time" parameter may have to be
adjusted upwards depending on the situation and relative cost. Failing
to do so will result in seeing more hard cache misses on concurrent use
as well as an increase in the number of cache entries being recomputed
multiple times in parallel, which this module aims to avoid in the first
place.
In other words, the rule of thumb for the multi-key interface is: Be
somewhat generous on the "compute_time" setting and provide a separate
and appropriate "wait" time or implementation.
SEE ALSO
<http://en.wikipedia.org/wiki/Thundering_herd_problem>
Cache::Memcached::Fast
<https://github.com/ericflo/django-newcache> and
<https://bitbucket.org/zzzeek/dogpile.cache/> for examples of prior art.
AUTHOR
Steffen Mueller, <smueller@cpan.org>
Rafaël Garcia-Suarez, <rgs@consttype.org>
ACKNOWLEDGMENT
This module was originally developed for Booking.com. With approval from
Booking.com, this module was generalized and put on CPAN, for which the
authors would like to express their gratitude.
COPYRIGHT AND LICENSE
(C) 2013 Steffen Mueller. All rights reserved.
This code is available under the same license as Perl version
5.8.1 or higher.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.