Cache-Memcached-Turnstile



The C<compute_time> parameter (in integer seconds)
indicates a high estimate of the time it might take to compute the value
on cache miss. You can generally be a tad generous on this. It defaults
to 2 seconds.

The C<expiration> parameter indicates the desired expiration time for
the computed value. It defaults to C<0>, which means the value never expires.
That is not usually a good idea, so make sure to provide a better value.
Values of up to 30 days are interpreted as seconds from "now"; larger values
are interpreted as an absolute Unix epoch timestamp (Memcached rules, not ours).
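
The expiration rule can be illustrated with a small helper. This is only a
sketch: C<expiration_to_epoch> is a hypothetical name that is not part of this
module, and the 30-day threshold reflects the Memcached convention mentioned
above.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Cache::Memcached::Turnstile.
# Mirrors the Memcached convention for interpreting an expiration value.
use constant THIRTY_DAYS => 60 * 60 * 24 * 30;

sub expiration_to_epoch {
  my ($expiration, $now) = @_;
  $now = time() if not defined $now;

  return undef if $expiration == 0;                         # 0: never expires
  return $now + $expiration if $expiration <= THIRTY_DAYS;  # relative seconds
  return $expiration;                                       # absolute epoch
}
```

Passing C<$now> explicitly keeps the helper deterministic, which is convenient
when checking the boundary at exactly 30 days.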

Finally, the C<wait> parameter can either be a function reference, a number
(may be fractional, in unit of seconds), or it may be omitted altogether.
If omitted, C<wait> will be set to the C<compute_time> parameter if one was
explicitly provided. Otherwise, it defaults to C<0.1> seconds to avoid
blocking clients too long.
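
The defaulting rules for C<wait> can be summarized in a few lines. The
following is a sketch rather than the module's actual code: C<resolve_wait>
is a hypothetical helper, and treating an undefined C<wait> as "omitted" is an
assumption made for the illustration.

```perl
use strict;
use warnings;

# Hypothetical helper illustrating the documented defaults for "wait".
sub resolve_wait {
  my (%args) = @_;

  # An explicit wait (number or code reference) always wins.
  return $args{wait} if defined $args{wait};

  # Otherwise fall back to an explicitly provided compute_time.
  return $args{compute_time} if defined $args{compute_time};

  # Neither was given: use the short default.
  return 0.1;
}
```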

If C<wait> is a number (or it was set to a number as per the aforementioned
defaults), and if the running process has a cache miss, but there is
another process already updating the cached value, then we will
wait for C<wait> seconds and retry the fetch once.

If C<wait> is a function reference, then that function will be called
under the conditions we'd otherwise wait & retry. The function is
invoked as C<$wait-E<gt>($memd_client, $parameter_hashref)>. Its
return value is directly returned from C<cache_get_or_compute>,
so if you want logic similar to the I<wait, then retry> logic that is
the default, then you could use a callback like the following:

  wait => sub {
    my ($memd_client, $args) = @_;
    # ... custom logic here ...
    # Retry. But don't go into infinite loop, thus the empty callback:
    return cache_get_or_compute($memd_client, %$args, "wait" => sub {return()});
  }

=head2 C<multi_cache_get_or_compute>

This function is the multi-key implementation of the Thundering Herd protection.
It attempts to minimize the number of client-server roundtrips as much
as it can and uses Memcached's batch interface throughout.

C<multi_cache_get_or_compute> will attempt to fetch or compute the cached value for
each of the keys, and will try really hard to avoid more than one user recomputing
any given cached value at any given time. Most of the interface mimics that
of the single-key version as much as possible, but there are some important
differences highlighted below.

C<keys> needs to be a reference to an array containing array references of
key/expiration pairs. C<compute_cb> receives an extra, third, parameter
as compared to the single-key implementation: a reference to an array
containing the keys for which values need to be computed. (This list of
keys is possibly only a subset of the original set of keys.) The callback
needs to return a reference to an array of values which correspond to the
computed values for each input key in turn.
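
As an illustration of these shapes, here is a sketch of a C<keys> structure
and a matching C<compute_cb>. The key names, the C<%expensive_source> hash,
and the direct invocation with C<undef> in place of a client are all
assumptions made to keep the example self-contained.

```perl
use strict;
use warnings;

# Key/expiration pairs, as described for the "keys" parameter.
my $keys_param = [ ["foo", 60], ["bar", 120] ];

# Hypothetical data source standing in for an expensive computation.
my %expensive_source = (foo => 1, bar => 2, baz => 3);

# The third argument is an array reference of the keys that need computing.
my $compute_cb = sub {
  my ($memd_client, $args, $keys) = @_;
  # Return the values in the same order as the keys in @$keys.
  return [ map { $expensive_source{$_} } @$keys ];
};

# Calling the callback directly; undef stands in for the Memcached client.
my $values = $compute_cb->(undef, {}, ["bar", "foo"]);
```

The important property is that the returned array lines up with the key list
the callback was given, which may be shorter than the original C<keys> list.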

The C<wait> parameter works fundamentally the same as in the single-key function,
but the callback variant also receives a third parameter:
The list of keys whose values weren't available from the cache and couldn't be
locked for computation. The callback is expected to return a hash reference
of keys and values. This is different from the C<compute_cb> interface
to allow for easy calling back into C<multi_cache_get_or_compute> for retries
(see the C<wait> example for the single-key implementation above).
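
A multi-key C<wait> callback does not have to retry. The following sketch
simply reports every unavailable key as C<undef> in the returned hash
reference. Whether C<undef> is a sensible fallback value is an assumption that
depends on the application.

```perl
use strict;
use warnings;

# Hypothetical fallback: do not retry, return a hash reference that maps
# each key that could not be fetched or locked to undef.
my $wait_cb = sub {
  my ($memd_client, $args, $keys) = @_;
  return +{ map { ($_ => undef) } @$keys };
};

# Calling the callback directly; undef stands in for the Memcached client.
my $result = $wait_cb->(undef, {}, ["foo", "bar"]);
```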

As with the single-key variant, C<compute_time> is the additional
cached-value lifetime for a single value, so it should be at least an upper bound
(or slightly more) on the computation time for a single key.
Alas, there is a trade-off here: Since the implementation seeks to
limit the number of roundtrips as much as possible, it will
pass all keys-to-be-computed to one run of the C<compute_cb>.
This means that the computation time can add up to be significantly more
than the single-key C<compute_time> value, so the C<compute_time>
parameter may have to be adjusted upwards depending on the situation
and relative cost. Failing to do so will result in seeing more hard cache
misses on concurrent use as well as an increase in the number of cache
entries being recomputed multiple times in parallel, which this module
aims to avoid in the first place.

In other words, the rule of thumb for the multi-key interface is:
Be somewhat generous on the C<compute_time> setting and provide
a separate and appropriate C<wait> time or implementation.
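
One way to apply this rule of thumb is to scale a per-key estimate by the
number of keys in the batch. This is a rough, pessimistic sketch that assumes
every key has to be computed in the same C<compute_cb> run;
C<batch_compute_time> is a hypothetical helper, not part of this module.

```perl
use strict;
use warnings;
use POSIX qw(ceil);

# Hypothetical helper: scale a per-key estimate by the batch size and round
# up to whole seconds, since compute_time is given in integer seconds.
sub batch_compute_time {
  my ($per_key_seconds, $num_keys) = @_;
  return ceil($per_key_seconds * $num_keys);
}
```

The result can be passed as C<compute_time>, with a separate and much smaller
C<wait> value so that clients are not blocked for the whole batch duration.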

=head1 SEE ALSO

L<http://en.wikipedia.org/wiki/Thundering_herd_problem>

L<Cache::Memcached::Fast>

L<https://github.com/ericflo/django-newcache>
and L<https://bitbucket.org/zzzeek/dogpile.cache/> for examples of
prior art.

=head1 AUTHOR

Steffen Mueller, E<lt>smueller@cpan.orgE<gt>

Rafaël Garcia-Suarez, E<lt>rgs@consttype.orgE<gt>

=head1 ACKNOWLEDGMENT

This module was originally developed for Booking.com.
With approval from Booking.com, this module was generalized
and put on CPAN, for which the authors would like to express
their gratitude.

=head1 COPYRIGHT AND LICENSE

 (C) 2013 Steffen Mueller. All rights reserved.
 
 This code is available under the same license as Perl version
 5.10.1 or higher.
 
 This program is distributed in the hope that it will be useful,
 but WITHOUT ANY WARRANTY; without even the implied warranty of
 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

=cut


