set results from the CPAN

AI-Genetic

view release on metacpan or search on metacpan


package AI::Genetic;

use strict;
use Carp;

use vars qw/$VERSION/;

$VERSION = 0.05;

use AI::Genetic::Defaults;

# new AI::Genetic. More modular.
# Not too many checks are done still.

##### Shared private vars
# this hash predefines some strategies

my %_strategy = (
		 rouletteSinglePoint => \&AI::Genetic::Defaults::rouletteSinglePoint,
		 rouletteTwoPoint    => \&AI::Genetic::Defaults::rouletteTwoPoint,
		 rouletteUniform     => \&AI::Genetic::Defaults::rouletteUniform,

		 tournamentSinglePoint => \&AI::Genetic::Defaults::tournamentSinglePoint,
		 tournamentTwoPoint    => \&AI::Genetic::Defaults::tournamentTwoPoint,
		 tournamentUniform     => \&AI::Genetic::Defaults::tournamentUniform,

		 randomSinglePoint => \&AI::Genetic::Defaults::randomSinglePoint,
		 randomTwoPoint    => \&AI::Genetic::Defaults::randomTwoPoint,
		 randomUniform     => \&AI::Genetic::Defaults::randomUniform,
		);

# this hash maps the genome types to the
# classes they're defined in.

my %_genome2class = (
		     bitvector   => 'AI::Genetic::IndBitVector',
		     rangevector => 'AI::Genetic::IndRangeVector',
		     listvector  => 'AI::Genetic::IndListVector',
		    );

##################

# sub new():
# This is the constructor. It creates a new AI::Genetic
# object. Options are:
# -population: set the population size
# -crossover:  set the crossover probability
# -mutation:   set the mutation probability
# -fitness:    set the fitness function
# -type:       set the genome type. See docs.
# -terminate:  set termination sub.

sub new {
  my ($class, %args) = @_;

  my $self = bless {
		    ADDSEL => {},   # user-defined selections
		    ADDCRS => {},   # user-defined crossovers
		    ADDMUT => {},   # user-defined mutations
		    ADDSTR => {},   # user-defined strategies
		   } => $class;

  $self->{FITFUNC}    = $args{-fitness}    || sub { 1 };
  $self->{CROSSRATE}  = $args{-crossover}  || 0.95;
  $self->{MUTPROB}    = $args{-mutation}   || 0.05;
  $self->{POPSIZE}    = $args{-population} || 100;
  $self->{TYPE}       = $args{-type}       || 'bitvector';
  $self->{TERM}       = $args{-terminate}  || sub { 0 };

  $self->{PEOPLE}     = [];   # list of individuals
  $self->{GENERATION} = 0;    # current gen.

  $self->{INIT}       = 0;    # whether pop is initialized or not.
  $self->{SORTED}     = 0;    # whether the population is sorted by score or not.
  $self->{INDIVIDUAL} = '';   # name of individual class to use().

  return $self;
}

# sub createStrategy():
# This method creates a new strategy.
# It takes two arguments: name of strategy, and
# anon sub that implements it.

sub createStrategy {
  my ($self, $name, $sub) = @_;

  if (ref($sub) eq 'CODE') {
    $self->{ADDSTR}{$name} = $sub;
  } else {
    # we don't know what this operation is.
    carp <<EOC;
ERROR: Must specify anonymous subroutine for strategy.
       Strategy '$name' will be deleted.
EOC
    ;
    delete $self->{ADDSTR}{$name};
    return undef;
  }

  return $name;
}

# sub evolve():
# This method evolves the population using a specific strategy
# for a specific number of generations.

sub evolve {
  my ($self, $strategy, $gens) = @_;

  unless ($self->{INIT}) {

Genetic.pm view on Meta::CPAN

# It also examines the type of individuals we want, and
# require()s the proper class. Throws an error if it can't.
# Must pass to it an anon list that will be passed to the
# newRandom method of the individual.

# In case of bitvector, $newArgs is length of bitvector.
# In case of rangevector, $newArgs is anon list of anon lists.
# each sub-anon list has two elements, min number and max number.
# In case of listvector, $newArgs is anon list of anon lists.
# Each sub-anon list contains possible values of gene.

sub init {
  my ($self, $newArgs) = @_;

  $self->{INIT} = 0;

  my $ind;
  if (exists $_genome2class{$self->{TYPE}}) {
    $ind = $_genome2class{$self->{TYPE}};
  } else {
    $ind = $self->{TYPE};
  }

  eval "use $ind";  # does this work if package is in same file?
  if ($@) {
    carp "ERROR: Init failed. Can't require '$ind': $@,";
    return undef;
  }

  $self->{INDIVIDUAL} = $ind;
  $self->{PEOPLE}     = [];
  $self->{SORTED}     = 0;
  $self->{GENERATION} = 0;
  $self->{INITARGS}   = $newArgs;

  push @{$self->{PEOPLE}} =>
    $ind->newRandom($newArgs) for 1 .. $self->{POPSIZE};

  $_->fitness($self->{FITFUNC}) for @{$self->{PEOPLE}};

  $self->{INIT} = 1;
}

# sub people():
# returns the current list of individuals in the population.
# note: this returns the actual array ref, so any changes
# made to it (ex, shift/pop/etc) will be reflected in the
# population.

sub people {
  my $self = shift;

  if (@_) {
    $self->{PEOPLE} = shift;
    $self->{SORTED} = 0;
  }

  $self->{PEOPLE};
}

# useful little methods to set/query parameters.
sub size       { $_[0]{POPSIZE}    = $_[1] if defined $_[1]; $_[0]{POPSIZE}   }
sub crossProb  { $_[0]{CROSSRATE}  = $_[1] if defined $_[1]; $_[0]{CROSSRATE} }
sub mutProb    { $_[0]{MUTPROB}    = $_[1] if defined $_[1]; $_[0]{MUTPROB}   }
sub indType    { $_[0]{INDIVIDUAL} }
sub generation { $_[0]{GENERATION} }

# sub inject():
# This method is used to add individuals to the current population.
# The point of it is that sometimes the population gets stagnant,
# so it could be useful add "fresh blood".
# Takes a variable number of arguments. The first argument is the
# total number, N, of new individuals to add. The remaining arguments
# are genomes to inject. There must be at most N genomes to inject.
# If the number, n, of genomes to inject is less than N, N - n random
# genomes are added. Perhaps an example will help?
# returns 1 on success and undef on error.

sub inject {
  my ($self, $count, @genomes) = @_;

  unless ($self->{INIT}) {
    carp "can't inject() before init()";
    return undef;
  }

  my $ind = $self->{INDIVIDUAL};

  my @newInds;
  for my $i (1 .. $count) {
    my $genes = shift @genomes;

    if ($genes) {
      push @newInds => $ind->newSpecific($genes, $self->{INITARGS});
    } else {
      push @newInds => $ind->newRandom  ($self->{INITARGS});      
    }
  }

  $_->fitness($self->{FITFUNC}) for @newInds;

  push @{$self->{PEOPLE}} => @newInds;

  return 1;
}

__END__

=head1 NAME

AI::Genetic - A pure Perl genetic algorithm implementation.

=head1 SYNOPSIS

    use AI::Genetic;
    my $ga = new AI::Genetic(
        -fitness    => \&fitnessFunc,
        -type       => 'bitvector',
        -population => 500,
        -crossover  => 0.9,
        -mutation   => 0.01,
	-terminate  => \&terminateFunc,
       );

     $ga->init(10);
     $ga->evolve('rouletteTwoPoint', 100);
     print "Best score = ", $ga->getFittest->score, ".\n";

     sub fitnessFunc {
         my $genes = shift;

         my $fitness;
         # assign a number to $fitness based on the @$genes
         # ...

         return $fitness;
      }

      sub terminateFunc {
         my $ga = shift;

         # terminate if reached some threshold.
         return 1 if $ga->getFittest->score > $THRESHOLD;
         return 0;
      }

=head1 DESCRIPTION

This module implements a Genetic Algorithm (GA) in pure Perl.
Other Perl modules that achieve the same thing (perhaps better,
perhaps worse) do exist. Please check CPAN. I mainly wrote this
module to satisfy my own needs, and to learn something about GAs
along the way.

B<PLEASE NOTE:> As of v0.02, AI::Genetic has been re-written from
scratch to be more modular and expandable. To achieve this, I had
to modify the API, so it is not backward-compatible with v0.01.
As a result, I do not plan on supporting v0.01.

I will not go into the details of GAs here, but here are the
bare basics. Plenty of information can be found on the web.

In a GA, a population of individuals compete for survival. Each
individual is designated by a set of genes that define its
behaviour. Individuals that perform better (as defined by the
fitness function) have a higher chance of mating with other
individuals. When two individuals mate, they swap some of
their genes, resulting in an individual that has properties
from both of its "parents". Every now and then, a mutation
occurs where some gene randomly changes value, resulting in
a different individual. If all is well defined, after a few
generations, the population should converge on a "good-enough"
solution to the problem being tackled.

A GA implementation runs for a discrete number of time steps
called I<generations>. What happens during each generation can
vary greatly depending on the strategy being used (See 
L</"STRATEGIES"> for more info).
Typically, a variation of the following happens at
each generation:

=over 4

=item B<1. Selection>

Here the performance of all the individuals is evaluated
based on the fitness function, and each is given a specific
fitness value. The higher the value, the bigger the chance
of an individual passing its genes on in future generations
through mating (crossover).

=item B<2. Crossover>

Here, individuals selected are randomly paired up for
crossover (aka I<sexual reproduction>). This is further
controlled by the crossover rate specified and may result in
a new offspring individual that contains genes common to
both parents. New individuals are injected into the current
population.

=item B<3. Mutation>

In this step, each individual is given the chance to mutate
based on the mutation probability specified. If an individual
is to mutate, each of its genes is given the chance to randomly
switch its value to some other state.

=back

=head1 CLASS METHODS

Here are the public methods.

=over 4

=item I<$ga>-E<gt>B<new>(I<options>)

This is the constructor. It accepts options in the form of
hash-value pairs. These are:

=over 8

=item B<-population>

Genetic.pm view on Meta::CPAN

              [qw/red big thin/],
              [qw/blue small fat/],
             );

this adds 5 new individuals, 2 with the specified genetic coding, and 3 randomly
generated.

=item I<$ga>-E<gt>B<evolve>(I<strategy>, ?I<num_generations>?)

This method causes the GA to evolve the population using the specified strategy.
A strategy name has to be specified as the first argument. The second argument
is optional and specifies the number of generations to evolve. It defaults to
1. See L</"STRATEGIES"> for more information on the default strategies.

Each generation consists of the following steps:

=over

=item o

The population is sorted according to the individuals' fitnesses.

=item o

The subroutine corresponding to the named strategy is called with one argument,
the AI::Genetic object. This subroutine is expected to alter the object itself.

=item o

If a termination subroutine is given, it is executed and the return value is
checked. Evolution terminates if this sub returns a true value.

=back

=item I<$ga>-E<gt>B<getFittest>(?I<N>?)

This returns the I<N> fittest individuals. If not specified,
I<N> defaults to 1. As a side effect, it sorts the population by
fitness score. The actual AI::Genetic::Individual objects are returned.
You can use the C<genes()> and C<score()> methods to get the genes and the
scores of the individuals. Please check L<AI::Genetic::Individual> for details.

=item I<$ga>-E<gt>B<sortPopulation>

This method sorts the population according to fitness function. The results
are cached for speed.

=item I<$ga>-E<gt>B<sortIndividuals>(?[I<ListOfIndividuals>]?)

Given an anonymous list of individuals, this method sorts them according
to fitness, returning an anonymous list of the sorted individuals.

=item I<$ga>-E<gt>B<people>()

Returns an anonymous list of individuals of the current population.
B<IMPORTANT>: the actual array reference used by the AI::Genetic object
is returned, so any changes to it will be reflected in I<$ga>.

=item I<$ga>-E<gt>B<size>(?I<newSize>?)

This method is used to query and set the population size.

=item I<$ga>-E<gt>B<crossProb>(?I<newProb>?)

This method is used to query and set the crossover rate.

=item I<$ga>-E<gt>B<mutProb>(?I<newProb>?)

This method is used to query and set the mutation rate.

=item I<$ga>-E<gt>B<indType>()

This method returns the type of individual: I<bitvector>, I<listvector>,
or I<rangevector>.

=item I<$ga>-E<gt>B<generation>()

This method returns the current generation.

=back

=head1 FITNESS FUNCTION

Very quickly you will realize that properly defining the fitness function
is the most important aspect of a GA. Most of the time that a genetic
algorithm takes to run is spent in running the fitness function for each
separate individual to get its fitness. AI::Genetic tries to minimize this
time by caching the fitness result for each individual. But, B<you should
spend a lot of time optimizing your fitness function to achieve decent run
times.>

The fitness function should expect only one argument, an anonymous list of
genes, corresponding to the individual being analyzed. It is expected
to return a number which defines the fitness score of the said individual.
The higher the score, the more fit the individual, the more the chance it
has to be chosen for crossover.

=head1 STRATEGIES

AI::Genetic comes with 9 predefined strategies. These are:

=over

=item rouletteSinglePoint

This strategy implements roulette-wheel selection and single-point crossover.

=item rouletteTwoPoint

This strategy implements roulette-wheel selection and two-point crossover.

=item rouletteUniform

This strategy implements roulette-wheel selection and uniform crossover.

=item tournamentSinglePoint

This strategy implements tournament selection and single-point crossover.

=item tournamentTwoPoint

This strategy implements tournament selection and two-point crossover.

=item tournamentUniform

This strategy implements tournament selection and uniform crossover.

=item randomSinglePoint

( run in 0.723 second using v1.01-cache-2.11-cpan-39bf76dae61 )