Cluster-Init

lib/Cluster/Init.pm


=item *

Create L<"/etc/cltab">.

=item * 

Replicate L<"/etc/cltab"> to all nodes.

=item * 

Run 'C<clinit -d>' on each node.  Putting this in F</etc/inittab> as a
"respawn" process would be a good idea, or you could have it started
as a managed process under HACMP, VCS, Linux-HA etc.

=item * 

Run 'C<clinit my_group my_level>' on each node where you want resource
group I<my_group> to be running at runlevel I<my_level>.  

=item * 

Check current status in L<"/var/run/clinit/clstat"> on each node.  (Or
use B<OpenMosix::HA>, which collates this for you across all nodes.)

=back
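
As a rough sketch, the steps above might look like the following when
run from one administrative host.  The node names I<node1> and
I<node2>, the use of B<scp> and B<ssh> for replication and remote
execution, and the names I<my_group> and I<my_level> are placeholders
chosen for illustration, not part of Cluster::Init.  The sketch also
assumes that B<clinit> is on each node's PATH and that the status file
is plain text:

  # copy the same cltab to every node (hypothetical node names)
  for node in node1 node2; do
    scp /etc/cltab "$node":/etc/cltab
  done

  # start the daemon on each node; backgrounded here only because
  # this is an interactive test rather than an inittab entry
  for node in node1 node2; do
    ssh "$node" 'nohup clinit -d > /dev/null 2>&1 < /dev/null &'
  done

  # bring up my_group at my_level on the node that should host it
  ssh node1 'clinit my_group my_level'

  # inspect the resulting status on that node
  ssh node1 'cat /var/run/clinit/clstat'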

=head1 INSTALLATION

Use Perl's normal sequence:

  perl Makefile.PL
  make
  make test
  make install

You'll need to install this module on each node in the cluster.  

This module includes a script, L</clinit>, which will be installed when
you run 'make install'.  See the output of C<perl -V:installscript> to
find out which directory the script is installed in.
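
For example, the following sketch prints the script directory and then
checks for B<clinit> there.  It assumes a default install; if you
passed options such as C<INSTALL_BASE> to F<Makefile.PL>, the script
may have been placed elsewhere:

  # show the configured script directory
  perl -V:installscript

  # look for clinit in that directory
  ls -l "$(perl -MConfig -e 'print $Config{installscript}')/clinit"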

=head1 CONCEPTS

=over 4

=item Cluster

A group of machines administered as a single unit and offering a
common set of services.  See I<enterprise cluster>,
I<high-availability cluster>, and I<high-throughput cluster>.

=item Computing Cluster

See I<High-Throughput Cluster>.

=item Enterprise Cluster

A well-administered B<enterprise infrastructure> (see
L<http://www.Infrastructures.Org>), in which each machine, whether
desktop or server, provides scalable commodity services.  Any machine
or group of machines can be easily and quickly replaced, with
minimal user impact, without restoring from backups, with no advance
notice or unique preparation.  May include elements of both I<high
availability> and I<high throughput> clusters.  

=item High-Availability Cluster

(Also B<HA Cluster>.)  A cluster of machines optimized for providing
high uptime and minimal user impact in case of hardware failure, in
return for increased per-node expense and complexity.  Normally
includes shared disk, unattended failover of filesystem mounts and IP
and MAC addresses, and automatic daemon restart on the surviving
node(s).  Suitable for applications such as NFS and database servers,
and other services which normally cannot be replicated easily.

Examples of HA cluster platforms include OpenMosix::HA, Linux-HA, AIX
HACMP, and Veritas VCS.

Due to the expense of providing the per-node redundancy required for
high availability, HA clusters are normally not scalable to the
hundreds of nodes typically needed for high-throughput applications.
OpenMosix::HA is the exception to this rule; it provides an HA layer
on top of a high-throughput openMosix cluster.

=item High-Throughput Cluster

A cluster of machines optimized for cheaply delivering large
quantities of work in a short time, in return for reduced per-process
reliability.  May include features such as process checkpointing and
migration, high-speed interconnects, or distributed shared memory.
Some high-throughput clusters are optimized for scavenging unused
cycles on desktop machines.  Most high-throughput clusters are
suitable for supercomputing-class applications which can be
parallelized across dozens, hundreds, or even thousands of nodes.

Examples of high-throughput cluster platforms include OpenMosix::HA,
openMosix itself, Linux Beowulf, and Condor.

Due to the internode dependencies inherent in distributed shared
memory or migration of interactive processes, high-throughput clusters
normally do not meet the needs of high availability -- they are
intended for brute-force problem solving where the death of a single
process out of thousands is not significant.  High-throughput clusters
are not typically designed to provide mission-critical interactive
services to the public.  

The one (known) exception is OpenMosix::HA -- it provides high
availability for both interactive and batch processes running on a
high-throughput openMosix cluster. 

=item Resource Group

A collection of applications and physical resources (like filesystem
mounts) which need to execute together on the same cluster node.
Resource groups allow easy migration of applications between nodes.

Cluster::Init supports resource groups explicitly.  Resource groups
are configured in L<"/etc/cltab">.

For example, B<sendmail>, F</etc/sendmail.cf>, and the
F</var/spool/mqueue> directory might make up a resource group -- they
all need to be present on the same node.  From L<"/etc/cltab">, you
could spawn the scripts which update F<sendmail.cf>, mount F<mqueue>,
and then start B<sendmail> itself.  

Another example: Apache, a virtual IP address, and the filesystem
containing the HTML document tree might together constitute a resource
group.  To start this resource group, you might need to mount the
filesystem, configure the virtual IP with B<ifconfig>, and start
B<httpd>.  This sequence can easily be specified in F</etc/cltab>.

=back
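
To illustrate the Apache resource group described under I<Resource
Group> above, the start sequence could be wrapped in a small script
for F</etc/cltab> to spawn.  This is only a sketch: the device
F</dev/sdb1>, the mount point F</var/www>, the interface alias
C<eth0:1>, the address C<192.0.2.10>, and the use of B<apachectl> to
start httpd are all assumptions for illustration.  The entry that
would invoke such a script is covered in L<"/etc/cltab">:

  #!/bin/sh
  # Hypothetical start script for an Apache resource group.
  set -e   # stop at the first failing step

  # 1. mount the filesystem holding the HTML document tree
  mount /dev/sdb1 /var/www

  # 2. bring up the virtual IP as an interface alias
  ifconfig eth0:1 192.0.2.10 netmask 255.255.255.0 up

  # 3. start the web server
  apachectl start

Running the steps in this order means httpd starts only after its
document tree and address are in place.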

=head1 UTILITIES

=head2 clinit

Cluster::Init includes B<clinit>, a script which is intended to be a
bolt-in cluster init tool.  The script is called like C<init> or
C<telinit>, with the addition of a new "resource group" argument.  See
the output of C<clinit -h>.  

The first time you run B<clinit>, pass only the C<-d> flag; this
starts the B<Cluster::Init> daemon.  The flag does not put the daemon
in the background, so that it can run as a "respawn" entry in
F</etc/inittab>.  If you're testing from the command line or running
from a shell script, use 'C<clinit -d &>'.
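
A respawn entry might look like the line below.  The fields follow the
usual SysV F</etc/inittab> layout of C<id:runlevels:action:process>;
the id C<cl>, the runlevels C<2345>, and the path F</usr/bin/clinit>
are assumptions, so check C<perl -V:installscript> for the real
location on your system:

  cl:2345:respawn:/usr/bin/clinit -d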

Once you have the daemon running, use B<clinit> I<without> the C<-d>
flag.  This will cause it to run as a client only, talking to the
daemon via a UNIX domain socket.  At this point you will use B<clinit>
in roughly the same way you would use the UNIX B<telinit>, in this


