AnyEvent
   use AnyEvent;
   my $quit = AnyEvent->condvar;
   $fcp->txn_client_get ($url)->cb (sub {
      ...
      $quit->send;
   });
   $quit->recv;
=head1 BENCHMARKS
To give you an idea of the performance and overheads that AnyEvent adds
over the event loops themselves, and to give you an impression of the speed
of various event loops, I prepared some benchmarks.
=head2 BENCHMARKING ANYEVENT OVERHEAD
Here is a benchmark of various supported event models used natively and
through AnyEvent. The benchmark creates a lot of timers (with a zero
timeout) and I/O watchers (watching STDOUT, a pty, to become writable,
which it is), lets them fire exactly once and destroys them again.
Source code for this benchmark is found as F<eg/bench> in the AnyEvent
distribution. It uses the L<AE> interface, which makes a real difference
for the EV and Perl backends only.
=head3 Explanation of the columns
I<watchers> is the number of event watchers created/destroyed. Since
different event models feature vastly different performance, each event
loop was given a number of watchers so that overall runtime is acceptable
and similar between the tested event loops (and to keep them from
crashing): Glib would probably take thousands of years if asked to process
the same number of watchers as EV in this benchmark.
I<bytes> is the number of bytes (as measured by the resident set size,
RSS) consumed by each watcher. This method of measuring captures both C
and Perl-based overheads.
I<create> is the time, in microseconds (millionths of seconds), that it
takes to create a single watcher. The callback is a closure shared between
all watchers, to avoid adding memory overhead. That means closure creation
and memory usage are not included in the figures.
I<invoke> is the time, in microseconds, used to invoke a simple
callback. The callback simply counts down a Perl variable and, once it has
been invoked I<watchers> times, calls C<< ->send >> on a condvar once to
signal the end of this phase.
I<destroy> is the time, in microseconds, that it takes to destroy a single
watcher.
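As a rough sketch of how the I<create>, I<invoke> and I<destroy> columns
could be measured (this is not the code in F<eg/bench>; the watcher count
is arbitrary and, for simplicity, only zero-timeout timers are used so that
every watcher fires exactly once):

```perl
use strict;
use warnings;
use AnyEvent;                 # also provides the AE:: functions
use Time::HiRes qw(time);

my $watchers = 10_000;        # assumption: arbitrary, far below the benchmark's counts

my $done    = AE::cv;
my $pending = $watchers;

# One closure shared by all watchers: count down, signal the condvar at zero.
my $cb = sub { $done->send unless --$pending };

my $t0 = time;
my @w;
push @w, AE::timer(0, 0, $cb) for 1 .. $watchers;   # create phase
my $t1 = time;
$done->recv;                                        # invoke phase
my $t2 = time;
@w = ();                                            # destroy phase
my $t3 = time;

printf "create  %.2f us per watcher\n", ($t1 - $t0) / $watchers * 1e6;
printf "invoke  %.2f us per watcher\n", ($t2 - $t1) / $watchers * 1e6;
printf "destroy %.2f us per watcher\n", ($t3 - $t2) / $watchers * 1e6;
```

The figures this prints depend on the machine and on whichever backend
AnyEvent selects, so they are not comparable to the published results.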
=head3 Results
   name         watchers bytes create invoke destroy comment
   EV/EV          100000   223   0.47   0.43    0.27 EV native interface
   EV/Any         100000   223   0.48   0.42    0.26 EV + AnyEvent watchers
   Coro::EV/Any   100000   223   0.47   0.42    0.26 coroutines + Coro::Signal
   Perl/Any       100000   431   2.70   0.74    0.92 pure perl implementation
   Event/Event     16000   516  31.16  31.84    0.82 Event native interface
   Event/Any       16000  1203  42.61  34.79    1.80 Event + AnyEvent watchers
   IOAsync/Any     16000  1911  41.92  27.45   16.81 via IO::Async::Loop::IO_Poll
   IOAsync/Any     16000  1726  40.69  26.37   15.25 via IO::Async::Loop::Epoll
   Glib/Any        16000  1118  89.00  12.57   51.17 quadratic behaviour
   Tk/Any           2000  1346  20.96  10.75    8.00 SEGV with >> 2000 watchers
   POE/Any          2000  6951 108.97 795.32   14.24 via POE::Loop::Event
   POE/Any          2000  6648  94.79 774.40  575.51 via POE::Loop::Select
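One way to read the AnyEvent overhead off these results is to add up
create, invoke and destroy for a native row and for the matching C<Any>
row, then compare the two. The following sketch does that for EV and Event.
The numbers are copied from the results above; the variable names are only
illustrative.

```perl
use strict;
use warnings;

# create, invoke, destroy in microseconds, copied from the results.
my %row = (
    'EV/EV'       => [  0.47,  0.43, 0.27 ],
    'EV/Any'      => [  0.48,  0.42, 0.26 ],
    'Event/Event' => [ 31.16, 31.84, 0.82 ],
    'Event/Any'   => [ 42.61, 34.79, 1.80 ],
);

# Total cost per watcher over its whole life cycle.
my %total;
for my $name (keys %row) {
    my $sum = 0;
    $sum += $_ for @{ $row{$name} };
    $total{$name} = $sum;
}

for my $loop (qw(EV Event)) {
    my ($native, $any) = @total{ "$loop/$loop", "$loop/Any" };
    printf "%-5s native %.2f us, via AnyEvent %.2f us (x%.2f)\n",
        $loop, $native, $any, $any / $native;
}
# EV    native 1.17 us, via AnyEvent 1.16 us (x0.99)
# Event native 63.82 us, via AnyEvent 79.20 us (x1.24)
```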
=head3 Discussion
The benchmark does I<not> measure scalability of the event loop very
well. For example, a select-based event loop (such as the pure perl one)
can never compete with an event loop that uses epoll when the number of
file descriptors grows high. In this benchmark, all events become ready at
the same time, so select/poll-based implementations get an unnatural speed
boost.
Also, note that the number of watchers usually has a nonlinear effect on
overall speed, that is, creating twice as many watchers doesn't take twice
the time - usually it takes longer. This puts event loops tested with a
higher number of watchers at a disadvantage.
To put the range of results into perspective, consider that on the
benchmark machine, handling an event takes roughly 1600 CPU cycles with
EV, 3100 CPU cycles with AnyEvent's pure perl loop and almost 3000000 CPU
cycles with POE.
C<EV> is the sole leader in both speed (highest) and memory use
(lowest). When using the L<AE> API there is zero overhead. When going
through the AnyEvent API, create is about 5-6 times slower while the other
times are equal, so it still uses far less memory than any other event
loop and is still faster than Event natively.
The pure perl implementation hits a few sweet spots (both the
constant timeout and the use of a single fd hit optimisations in the perl
interpreter and the backend itself). Nevertheless, this shows that it
adds very little overhead in itself. Like any select-based backend, its
performance becomes really bad with lots of file descriptors (and few of
them active), of course, but this was not the subject of this benchmark.
The C<Event> module has a relatively high setup and callback invocation
cost, but overall it takes third place.
C<IO::Async> performs admirably well, about on par with C<Event>, even
when using its pure perl backend.
C<Glib>'s memory usage is quite a bit higher, but it features a
faster callback invocation and overall ends up in the same class as
C<Event>. However, Glib scales extremely badly: doubling the number of
watchers increases the processing time by more than a factor of four,
making it completely unusable with larger numbers of watchers
(note that only a single file descriptor was used in the benchmark, so
inefficiencies of C<poll> do not account for this).
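To make the "factor of four" concrete: if the processing time grows like
n**k in the number of watchers n, then timing the same run at n and at 2n
watchers gives k as the base-2 logarithm of the time ratio. The sketch
below uses a hypothetical helper, C<growth_exponent>, and made-up timings,
not measurements. A ratio of two means linear behaviour (k = 1), a ratio
of four means quadratic behaviour (k = 2).

```perl
use strict;
use warnings;

# Estimate k in t(n) ~ n**k from two timings taken at n and at 2n watchers.
sub growth_exponent {
    my ($t_n, $t_2n) = @_;
    return log($t_2n / $t_n) / log(2);
}

# Hypothetical timings in seconds, chosen only to illustrate the two cases.
printf "time doubles:    k = %.2f\n", growth_exponent(1.0, 2.0);
printf "time quadruples: k = %.2f\n", growth_exponent(1.0, 4.0);
# time doubles:    k = 1.00
# time quadruples: k = 2.00
```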
The C<Tk> adaptor works relatively well. The fact that it crashes with
more than 2000 watchers is a big setback, however, as correctness takes
precedence over speed. Nevertheless, its performance is surprising, as the
file descriptor is dup()ed for each watcher. This shows that the dup()