AnyEvent-Fork
vfork where possible. This gives the speed of vfork, with the
flexibility of fork.
Forking usually creates a copy-on-write copy of the parent process.
For example, modules or data files that are loaded will not use
additional memory after a fork. Exec'ing a new process, in contrast,
means modules and data files might need to be loaded again, at extra
CPU and memory cost.
But when forking, you still create a copy of your data structures -
if the program frees them and replaces them by new data, the child
processes will retain the old version even if it isn't used, which
can suddenly and unexpectedly increase memory usage when freeing
memory.
For example, Gtk2::CV is an image viewer optimised for large
directories (millions of pictures). It also forks subprocesses for
thumbnail generation, which inherit the data structure that stores
all file information. If the user changes the directory, it gets
freed in the main process, leaving a copy in the thumbnailer
processes. This can lead to many times the memory usage that would
actually be required. The solution is to fork early (and accept being
unable to dynamically generate more subprocesses or to do this from a
module)... or to use AnyEvent::Fork.
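The effect described above can be reproduced with core Perl alone. The following is a minimal sketch, not taken from Gtk2::CV: the @file_info array is a hypothetical stand-in for the large file list, and the two pipes exist only to sequence the demonstration.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical stand-in for a large structure such as a directory listing.
my @file_info = map { "picture-$_.jpg" } 1 .. 1000;

pipe (my $report_r, my $report_w) or die "pipe: $!";
pipe (my $go_r,     my $go_w)     or die "pipe: $!";

my $pid = fork;
die "fork: $!" unless defined $pid;

if ($pid == 0) {
   # child: wait until the parent has replaced its data, then report
   # how many entries this process's inherited copy still holds
   close $report_r;
   close $go_w;
   my $eof = <$go_r>;               # returns once the parent closes $go_w
   print {$report_w} scalar @file_info, "\n";
   close $report_w;
   POSIX::_exit (0);                # skip END blocks inherited from the parent
}

# parent: "change directory", i.e. throw the old list away
close $report_w;
close $go_r;
@file_info = ("new-directory.jpg");
close $go_w;                        # wakes up the child

chomp (my $child_count = <$report_r>);
waitpid $pid, 0;

my $parent_count = @file_info;
print "parent holds $parent_count entries, child still holds $child_count\n";
```

The child keeps all 1000 entries although the parent has long since released them, which is the memory that a long-lived thumbnailer process would continue to pin.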
There is a trade-off between more sharing with fork (which can be
good or bad), and no sharing with exec.
This module allows the main program to do a controlled fork, and
allows modules to exec processes safely at any time. When creating a
custom process pool you can take advantage of data sharing via fork
without the risk of sharing large dynamic data structures that will
blow up child memory usage.
In other words, this module puts you in control over what is being
shared and what isn't, at all times.
Exec'ing a new perl process might be difficult.
For example, it is not easy to find the correct path to the perl
interpreter - $^X might not be a perl interpreter at all. Worse,
there might not even be a perl binary installed on the system.
This module tries hard to identify the correct path to the perl
interpreter. With a cooperative main program, exec'ing the
interpreter might not even be necessary, but even without help from
the main program, it will still work when used from a module.
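To illustrate why $^X alone is not enough, here is a rough sketch of one possible heuristic: trust the candidate path only if its last component looks like a perl binary, and otherwise fall back to the path recorded in the core Config module. The helper name guess_perl_path and the exact pattern are assumptions made for this example and should not be read as the module's actual logic.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper, not part of the AnyEvent::Fork API.
sub guess_perl_path {
   my ($candidate) = @_;

   # accept names such as "perl", "/usr/bin/perl" or "perl5.36.0"
   return $candidate
      if $candidate =~ m{(?:^|[/\\])perl(?:[0-9]+(?:\.[0-9]+)*)?(?:\.exe)?\z}i;

   # otherwise fall back to the path perl was configured with
   return $Config{perlpath};
}

print guess_perl_path ($^X), "\n";
print guess_perl_path ("/usr/bin/my-embedding-app"), "\n";
```

Note that the fallback can still point to a file that no longer exists, for example after perl has been uninstalled or moved.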
Exec'ing a new perl process might be slow, as all necessary modules have
to be loaded from disk again, with no guarantees of success.
Long running processes might run into problems when perl is upgraded
and modules are no longer loadable because they refer to a different
perl version, or parts of a distribution are newer than the ones
already loaded.
This module supports creating pre-initialised perl processes to be
used as a template for new processes at a later time, e.g. for use
in a process pool.
Forking might be impossible when a program is running.
For example, POSIX makes it almost impossible to fork from a
multi-threaded program while doing anything useful in the child - in
fact, if your perl program uses POSIX threads (even indirectly via
e.g. IO::AIO or threads), you cannot call fork on the perl level
anymore without risking memory corruption or worse on a number of
operating systems.
This module can safely fork helper processes at any time, by calling
fork+exec in C, in a POSIX-compatible way (via Proc::FastSpawn).
Parallel processing with fork might be inconvenient or difficult to
implement. Modules might not work in both parent and child.
For example, when a program uses an event loop and creates watchers
it becomes very hard to use the event loop from a child program, as
the watchers already exist but are only meaningful in the parent.
Worse, a module might want to use such a module, not knowing whether
another module or the main program also does, leading to problems.
Apart from event loops, graphical toolkits also commonly fall into
the "unsafe module" category, or just about anything that
communicates with the external world, such as network libraries and
file I/O modules, which usually don't like being copied and then
allowed to continue in two processes.
With this module only the main program is allowed to create new
processes by forking (because only the main program can know when it
is still safe to do so) - all other processes are created via
fork+exec, which makes it possible to use modules such as event
loops or window interfaces safely.
EXAMPLES
This is where the wall of text ends and code speaks.
Create a single new process, tell it to run your worker function.
   AnyEvent::Fork
      ->new
      ->require ("MyModule")
      ->run ("MyModule::worker", sub {
         my ($master_filehandle) = @_;

         # now $master_filehandle is connected to the
         # $slave_filehandle in the new process.
      });
"MyModule" might look like this:
   package MyModule;

   sub worker {
      my ($slave_filehandle) = @_;

      # now $slave_filehandle is connected to the $master_filehandle
      # in the original process. have fun!
   }
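To show the two filehandles actually carrying data, here is a small round-trip sketch. It assumes AnyEvent and AnyEvent::Fork are installed, and it defines the worker inline with eval instead of loading a module file; the name MyWorker::run and the message text are made up for this example.

```perl
use strict;
use warnings;
use AnyEvent;
use AnyEvent::Fork;

my $done = AE::cv;
my $watcher;

AnyEvent::Fork
   ->new
   # define the (hypothetical) worker function inside the new process
   ->eval ('
      sub MyWorker::run {
         my ($fh) = @_;
         syswrite $fh, "hello from worker $$\n";
      }
   ')
   ->run ("MyWorker::run", sub {
      my ($fh) = @_;

      # wait until the worker has written something, then read it
      $watcher = AE::io $fh, 0, sub {
         sysread $fh, my $buf, 4096;
         undef $watcher;
         $done->send ($buf);
      };
   });

chomp (my $reply = $done->recv);
print "master received: $reply\n";
```

A single sysread is sufficient here only because the message is one short line; a real protocol would need proper framing, or a helper such as AnyEvent::Handle.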
Create a pool of server processes all accepting on the same socket.
   # create listener socket
   my $listener = ...;

   # create a pool template, initialise it and give it the socket
   my $pool = AnyEvent::Fork
                 ->new
                 ->require ("Some::Stuff", "My::Server")
                 ->send_fh ($listener);