AI-NeuralNet-Kohonen
view release on metacpan or search on metacpan
lib/AI/NeuralNet/Kohonen.pm view on Meta::CPAN
confess "{weight_dim} not set" unless $self->{weight_dim};
confess "{map_dim_x} not set" unless $self->{map_dim_x};
confess "{map_dim_y} not set" unless $self->{map_dim_y};
for my $x (0..$self->{map_dim_x}){
$self->{map}->[$x] = [];
for my $y (0..$self->{map_dim_y}){
$self->{map}->[$x]->[$y] = new AI::NeuralNet::Kohonen::Node(
dim => $self->{weight_dim},
missing_mask => $self->{missing_mask},
);
}
}
}
=head1 METHOD clear_map
As L<METHOD randomise_map> but sets all C<map> nodes to
either the value supplied as the only paramter, or C<undef>.
=cut
sub clear_map { my $self=shift;
confess "{weight_dim} not set" unless $self->{weight_dim};
confess "{map_dim_x} not set" unless $self->{map_dim_x};
confess "{map_dim_y} not set" unless $self->{map_dim_y};
my $val = shift || $self->{missing_mask};
my $w = [];
foreach (0..$self->{weight_dim}){
push @$w, $val;
}
for my $x (0..$self->{map_dim_x}){
$self->{map}->[$x] = [];
for my $y (0..$self->{map_dim_y}){
$self->{map}->[$x]->[$y] = new AI::NeuralNet::Kohonen::Node(
weight => $w,
dim => $self->{weight_dim},
missing_mask => $self->{missing_mask},
);
}
}
}
=head1 METHOD train
Optionally accepts a parameter that is the number of epochs
for which to train: the default is the value in the C<epochs> field.
An epoch is composed of A number of generations, the number being
the total number of input vectors.
For every generation, iterates:
=over 4
=item 1
selects a target from the input array (see L</PRIVATE METHOD _select_target>);
=item 2
finds the best-matching unit (see L</METHOD find_bmu>);
=item 3
adjusts the neighbours of the BMU (see L</PRIVATE METHOD _adjust_neighbours_of>);
=back
At the end of every generation, the learning rate is decayed
(see L</PRIVATE METHOD _decay_learning_rate>).
See C<CONSTRUCTOR new> for details of applicable callbacks.
Returns a true value.
=cut
sub train { my ($self,$epochs) = (shift,shift);
$epochs = $self->{epochs} unless defined $epochs;
&{$self->{train_start}} if exists $self->{train_start};
for my $epoch (1..$epochs){
$self->{t} = $epoch;
&{$self->{epoch_start}} if exists $self->{epoch_start};
for (0..$#{$self->{input}}){
my $target = $self->_select_target;
my $bmu = $self->find_bmu($target);
$self->_adjust_neighbours_of($bmu,$target);
}
$self->_decay_learning_rate;
&{$self->{epoch_end}} if exists $self->{epoch_end};
}
&{$self->{train_end}} if $self->{train_end};
return 1;
}
=head1 METHOD find_bmu
For a specific taraget, finds the Best Matching Unit in the map
and return the x/y index.
Accepts: a reference to an array that is the target.
Returns: a reference to an array that is the BMU (and should
perhaps be abstracted as an object in its own right), indexed as follows:
=over 4
=item 0
euclidean distance from the supplied target
=item 1, 2
I<x> and I<y> co-ordinate in the map
=back
See L</METHOD get_weight_at>,
and L<AI::NeuralNet::Kohonen::Node/distance_from>,
=cut
sub find_bmu { my ($self,$target) = (shift,shift);
my $closest = []; # [value, x,y] value and co-ords of closest match
for my $x (0..$self->{map_dim_x}){
for my $y (0..$self->{map_dim_y}){
my $distance = $self->{map}->[$x]->[$y]->distance_from( $target );
$closest = [$distance,0,0] if $x==0 and $y==0;
$closest = [$distance,$x,$y] if $distance < $closest->[0];
}
}
return $closest;
}
=head1 METHOD get_weight_at
Returns a reference to the weight array at the supplied I<x>,I<y>
co-ordinates.
Accepts: I<x>,I<y> co-ordinates, each a scalar.
Returns: reference to an array that is the weight of the node, or
C<undef> on failure.
lib/AI/NeuralNet/Kohonen.pm view on Meta::CPAN
# Format input data
foreach (@{$self->{input}}){
print OUT join("\t",@{$_->{values}});
if ($_->{class}){
print OUT " $_->{class} " ;
}
print OUT "\n";
}
# EOF
print OUT chr 26;
close OUT;
return 1;
}
#
# Process ASCII from table field or input file
# Accepts: ASCII as array or array ref
#
sub _process_input_text { my ($self) = (shift);
if (not defined $_[1]){
if (ref $_[0] eq 'ARRAY'){
@_ = @{$_[0]};
} else {
@_ = split/[\n\r\f]+/,$_[0];
}
}
chomp @_;
my @specs = split/\s+/,(shift @_);
#- Dimensionality of the vectors (integer, compulsory).
$self->{weight_dim} = shift @specs;
$self->{weight_dim}--; # Perl indexing
#- Topology type, either hexa or rect (string, optional, case-sensitive).
my $display = shift @specs;
if (not defined $display and exists $self->{display}){
# Intentionally blank
} elsif (not defined $display){
$self->{display} = undef;
} elsif ($display eq 'hexa'){
$self->{display} = 'hex'
} elsif ($display eq 'rect'){
$self->{display} = undef;
}
#- Map dimension in x-direction (integer, optional).
$_ = shift @specs;
$self->{map_dim_x} = $_ if defined $_;
#- Map dimension in y-direction (integer, optional).
$_ = shift @specs;
$self->{map_dim_y} = $_ if defined $_;
#- Neighborhood type, either bubble or gaussian (string, optional, case-sen- sitive).
# not implimented
# Format input data
foreach (@_){
$self->_add_input_from_str($_);
}
return 1;
}
=head1 PRIVATE METHOD _select_target
Return a random target from the training set in the C<input> field,
unless the C<targeting> field is defined, when the targets are
iterated over.
=cut
sub _select_target { my $self=shift;
if (not $self->{targeting}){
return $self->{input}->[
(int rand(scalar @{$self->{input}}))
];
}
else {
$self->{tar}++;
if ($self->{tar}>$#{ $self->{input} }){
$self->{tar} = 0;
}
return $self->{input}->[$self->{tar}];
}
}
=head1 PRIVATE METHOD _adjust_neighbours_of
Accepts: a reference to an array containing
the distance of the BMU from the target, as well
as the x and y co-ordinates of the BMU in the map;
a reference to the target, which is an
C<AI::NeuralNet::Kohonen::Input> object.
Returns: true.
=head2 FINDING THE NEIGHBOURS OF THE BMU
( t )
sigma(t) = sigma(0) exp ( - ------ )
( lambda )
Where C<sigma> is the width of the map at any stage
in time (C<t>), and C<lambda> is a time constant.
Lambda is our field C<time_constant>.
The map radius is naturally just half the map width.
=head2 ADJUSTING THE NEIGHBOURS OF THE BMU
W(t+1) = W(t) + THETA(t) L(t)( V(t)-W(t) )
Where C<L> is the learning rate, C<V> the target vector,
and C<W> the weight. THETA(t) represents the influence
of distance from the BMU upon a node's learning, and
is calculated by the C<Node> class - see
L<AI::NeuralNet::Kohonen::Node/distance_effect>.
=cut
sub _adjust_neighbours_of { my ($self,$bmu,$target) = (shift,shift,shift);
my $neighbour_radius = int (
($self->{map_dim_a}/$self->{neighbour_factor}) * exp(- $self->{t} / $self->{time_constant})
);
# Distance from co-ord vector (0,0) as integer
# Basically map_width * y + x
my $centre = ($self->{map_dim_a}*$bmu->[2])+$bmu->[1];
# Set the class of the BMU
$self->{map}->[ $bmu->[1] ]->[ $bmu->[2] ]->{class} = $target->{class};
lib/AI/NeuralNet/Kohonen.pm view on Meta::CPAN
__END__
1;
=head1 FILE FORMAT
This module has begun to attempt the I<SOM_PAK> format:
I<SOM_PAK> file format version 3.1 (April 7, 1995),
Helsinki University of Technology, Espoo:
=over 4
The input data is stored in ASCII-form as a list of entries, one line
...for each vectorial sample.
The first line of the file is reserved for status knowledge of the
entries; in the present version it is used to define the following
items (these items MUST occur in the indicated order):
- Dimensionality of the vectors (integer, compulsory).
- Topology type, either hexa or rect (string, optional, case-sensitive).
- Map dimension in x-direction (integer, optional).
- Map dimension in y-direction (integer, optional).
- Neighborhood type, either bubble or gaussian (string, optional, case-sen-
sitive).
...
Subsequent lines consist of n floating-point numbers followed by an
optional class label (that can be any string) and two optional
qualifiers (see below) that determine the usage of the corresponding
data entry in training programs. The data files can also contain an
arbitrary number of comment lines that begin with '#', and are
ignored. (One '#' for each comment line is needed.)
If some components of some data vectors are missing (due to data
collection failures or any other reason) those components should be
marked with 'x'...[in processing, these] are ignored.
...
Each data line may have two optional qualifiers that determine the
usage of the data entry during training. The qualifiers are of the
form codeword=value, where spaces are not allowed between the parts of
the qualifier. The optional qualifiers are the following:
=over 4
=item -
Enhancement factor: e.g. weight=3. The training rate for the
corresponding input pattern vector is multiplied by this
parameter so that the reference vectors are updated as if this
input vector were repeated 3 times during training (i.e., as if
the same vector had been stored 2 extra times in the data file).
=item -
Fixed-point qualifier: e.g. fixed=2,5. The map unit defined by
the fixed-point coordinates (x = 2; y = 5) is selected instead of
the best-matching unit for training. (See below for the definition
of coordinates over the map.) If several inputs are forced to
known locations, a wanted orientation results in the map.
=back
=back
Not (yet) implimented in file format:
=over 4
=item *
hexa/rect is only visual, and only in the ::Demo::RGB package atm
=item *
I<neighbourhood type> is always gaussian.
=item *
i<x> for missing data.
=item *
the two optional qualifiers
=back
=cut
=head1 SEE ALSO
See L<AI::NeuralNet::Kohonen::Node/distance_from>;
L<AI::NeuralNet::Kohonen::Demo::RGB>.
L<The documentation for C<SOM_PAK>|ftp://cochlea.hut.fi/pub/som_pak>,
which has lots of advice on map building that may or may not be applicable yet.
A very nice explanation of Kohonen's algorithm:
L<AI-Junkie SOM tutorial part 1|http://www.fup.btinternet.co.uk/aijunkie/som1.html>
=head1 AUTHOR AND COYRIGHT
This implimentation Copyright (C) Lee Goddard, 2003-2006.
All Rights Reserved.
Available under the same terms as Perl itself.
( run in 1.016 second using v1.01-cache-2.11-cpan-39bf76dae61 )