Revision history for Perl extension AI::NeuralNet::SOM.
0.07 Sat May 24 08:53:26 CEST 2008
- fix: Hexa::initialize: corner case when @data is empty not handled (tom fawcett)
0.06 Fri May 23 10:23:29 CEST 2008
- fix: label '0' in label method (tom fawcett)
- fix: value '0' in value method (rho)
0.05 Wed Jan 16 20:58:19 CET 2008
- improvement of documentation
- training now holds sigma and l constant during an epoch, but applies ALL vectors (exactly once)
0.04 Jun 17 CEST 2007
MANIFEST
t/som.t
t/rect.t
t/hexa.t
t/torus.t
t/pods.t
lib/AI/NeuralNet/SOM.pm
lib/AI/NeuralNet/SOM/Rect.pm
lib/AI/NeuralNet/SOM/Hexa.pm
lib/AI/NeuralNet/SOM/Torus.pm
lib/AI/NeuralNet/SOM/Utils.pm
META.yml Module meta-data (added by MakeMaker)
lib/AI/NeuralNet/SOM.pm
  use AI::NeuralNet::SOM::Rect;
  my $nn = new AI::NeuralNet::SOM::Rect (output_dim => "5x6",
                                         input_dim  => 3);
$nn->initialize;
$nn->train (30,
[ 3, 2, 4 ],
[ -1, -1, -1 ],
[ 0, 4, -3]);
my @mes = $nn->train (30, ...); # learn about the smallest errors
# during training
print $nn->as_data; # dump the raw data
print $nn->as_string; # pretty-print the current vectors
use AI::NeuralNet::SOM::Torus;
# similar to above
use AI::NeuralNet::SOM::Hexa;
my $nn = new AI::NeuralNet::SOM::Hexa (output_dim => 6,
input_dim => 4);
$nn->initialize ( [ 0, 0, 0, 0 ] ); # all get this value
lib/AI/NeuralNet/SOM.pm
overly slow.
Particular emphasis has been given to making the package play nicely with
others; so there is no use of files, no arcane dependencies, etc.
=head2 Scenario
The basic idea is that the neural network consists of a 2-dimensional
array of N-dimensional vectors. When the training is started these
vectors may be completely random, but over time the network learns
from the sample data, which is a set of N-dimensional vectors.
Slowly, the vectors in the network approximate the sample vectors fed
in. If the sample vectors contain clusters, these clusters will appear
as neighbourhoods within the rectangle (or whatever topology you are
using).
Technically, you have reduced the dimensionality from N to 2.
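
To illustrate the reduction (a sketch only; the rectangular topology
from the synopsis is assumed and the vectors are arbitrary): after
training, any N-dimensional vector can be mapped onto the 2-dimensional
grid position of its best matching unit:

    use AI::NeuralNet::SOM::Rect;
    my $nn = new AI::NeuralNet::SOM::Rect (output_dim => "5x6",
                                           input_dim  => 3);
    $nn->initialize;
    $nn->train (30, [ 3, 2, 4 ], [ -1, -1, -1 ], [ 0, 4, -3 ]);
    my ($x, $y) = $nn->bmu ([ 3, 2, 4 ]);      # 2D position of the closest unit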
=head1 INTERFACE
lib/AI/NeuralNet/SOM.pm
=item I<initialize>
I<$nn>->initialize
You need to initialize all vectors in the map before training. There are
several ways to do this (a combined sketch follows the list):
=over
=item providing data vectors
If you provide a list of vectors, these will be used in turn to seed the neurons. If the list is
shorter than the number of neurons, the list will be started over. That way it is trivial to
zero everything:
$nn->initialize ( [ 0, 0, 0 ] );
=item providing no data
Then all vectors will get randomized values (in the range [ -0.5 .. 0.5 ]).
=item using eigenvectors (see L</HOWTOs>)
=back
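
A combined sketch of these options (the vector values are arbitrary):

    $nn->initialize;                               # random values in [ -0.5 .. 0.5 ]
    $nn->initialize ( [ 0, 0, 0 ] );               # zero all neurons
    $nn->initialize ( [ 0, 0, 0 ], [ 1, 1, 1 ] );  # cycle through the two seeds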
=item I<train>
I<$nn>->train ( I<$epochs>, I<@vectors> )
lib/AI/NeuralNet/SOM.pm
$nn->train (30,
[ 3, 2, 4 ],
[ -1, -1, -1 ],
[ 0, 4, -3]);
=cut
sub train {
my $self = shift;
my $epochs = shift || 1;
die "no data to learn" unless @_;
$self->{LAMBDA} = $epochs / log ($self->{_Sigma0}); # educated guess?
my @mes = (); # this will contain the errors during the epochs
for my $epoch (1..$epochs) {
$self->{T} = $epoch;
my $sigma = $self->{_Sigma0} * exp ( - $self->{T} / $self->{LAMBDA} ); # compute current radius
my $l = $self->{_L0} * exp ( - $self->{T} / $epochs ); # current learning rate
my @veggies = @_; # make a local copy, that will be destroyed in the loop
	while (@veggies) {                                  # apply ALL vectors exactly once per epoch
	    # (loop body reconstructed from the documented behaviour)
	    my $sample    = splice @veggies, int (rand (scalar @veggies)), 1;  # pick one randomly, removing it
	    my @bmu       = $self->bmu ($sample);           # best matching unit as (x, y, distance)
	    push @mes, $bmu[2];                             # remember the error for this sample
	    my $neighbors = $self->neighbors ($sigma, $bmu[0], $bmu[1]);       # all units within the current radius
	    $self->_adjust ($l, $sigma, $_, $sample) foreach @$neighbors;      # pull them towards the sample
	}
    }
    return @mes;
}
sub _adjust { # http://www.ai-junkie.com/ann/som/som4.html
my $self = shift;
my $l = shift; # the learning rate
my $sigma = shift; # the current radius
my $unit = shift; # which unit to change
my ($x, $y, $d) = @$unit; # it contains the distance
my $v = shift; # the vector which makes the impact
my $w = $self->{map}->[$x]->[$y]; # find the data behind the unit
my $theta = exp ( - ($d ** 2) / (2 * $sigma ** 2)); # gaussian impact (using distance and current radius)
foreach my $i (0 .. $#$w) { # adjusting values
$w->[$i] = $w->[$i] + $theta * $l * ( $v->[$i] - $w->[$i] );
}
}
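# A worked example of the adjustment above (the values are chosen purely for
# illustration): with $l = 0.1, $d = 1 and $sigma = 2 we get
# $theta = exp (-1/8) ~ 0.88, so a weight w = 0 facing an input v = 1 moves to
# 0 + 0.88 * 0.1 * (1 - 0) ~ 0.09.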
=pod
=item I<bmu>
lib/AI/NeuralNet/SOM.pm

=item I<mean_error>

I<$me> = I<$nn>->mean_error (I<@vectors>)

This method takes a list of vectors and computes the average distance
between them and their respective best matching units.
Obviously, the longer you let your SOM be trained, the smaller the error should become.
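
A small usage sketch (the vectors are arbitrary):

    $nn->train (30, @vectors);
    print "mean error: ", $nn->mean_error (@vectors), "\n";  # should shrink with longer training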
=cut
sub mean_error {
my $self = shift;
my $error = 0;
map { $error += $_ } # then add them all up
map { ( $self->bmu($_) )[2] } # then find the distance
@_; # take all data vectors
return ($error / scalar @_); # return the mean value
}
=pod
=item I<neighbors>
I<$ns> = I<$nn>->neighbors (I<$sigma>, I<$x>, I<$y>)
Finds all neighbors of (X, Y) with a distance smaller than SIGMA. Returns a list reference of (X, Y, distance) triples.
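
A sketch of how the result can be consumed (the coordinates are arbitrary):

    my $ns = $nn->neighbors (2.5, 3, 2);
    foreach my $unit (@$ns) {
	my ($x, $y, $d) = @$unit;            # grid position and distance from (3, 2)
	print "($x, $y) at distance $d\n";
    }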
lib/AI/NeuralNet/SOM.pm
=item I<radius> (read-only)
I<$radius> = I<$nn>->radius
Returns the I<radius> of the map. Different topologies interpret this differently.
=item I<map>
I<$m> = I<$nn>->map
This method returns a reference to the map data. See the appropriate subclass of the data
representation.
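
As the subclasses implement it, the map is a two-dimensional array of
vectors (array references):

    my $m = $nn->map;
    my $v = $m->[0]->[0];                    # the vector of the unit at (0, 0)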
=cut
sub map {
my $self = shift;
return $self->{map};
}
=pod
lib/AI/NeuralNet/SOM.pm

=item I<as_string>
print I<$nn>->as_string
This method creates a pretty-print version of the current vectors.
=cut
sub as_string { die; }                       # to be implemented by the subclasses
=pod
=item I<as_data>
print I<$nn>->as_data
This method creates a string containing the raw vector data, row by
row. This can be fed into gnuplot, for instance.
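
A minimal sketch for dumping the map into a file (the file name is arbitrary):

    open my $fh, '>', 'som.dat' or die "cannot write: $!";
    print $fh $nn->as_data;
    close $fh;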
=cut
sub as_data { die; }                         # to be implemented by the subclasses
=pod
=back
=head1 HOWTOs
=over
=item I<using Eigenvectors to initialize the SOM>
See the example script in the directory C<examples> provided in the
distribution. It uses L<PDL> (for speed and scalability, but the
results are not as good as I had thought).
=item I<loading and saving a SOM>
See the example script in the directory C<examples>. It uses
C<Storable> to directly dump the data structure onto disk. Storage and
retrieval is quite fast.
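
A minimal sketch of that approach (the file name is arbitrary):

    use Storable qw(store retrieve);
    store $nn, 'som.storable';               # dump the blessed structure onto disk
    my $nn2 = retrieve 'som.storable';       # ... and restore it later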
=back
=head1 FAQs
=over
=item I<I get 'uninitialized value ...' warnings, many of them>
lib/AI/NeuralNet/SOM/Hexa.pm
sub diameter {                               # (head reconstructed: accessor for the hexagon diameter _D)
    my $self = shift;
return $self->{_D};
}
=pod
=cut
sub initialize {
my $self = shift;
my @data = @_;
    my $i = 0;
    my $get_from_stream = @data
	? sub {                                    # seed the neurons from the list, cycling if needed
	      $i = 0 if $i > $#data;
	      return [ @{ $data[$i++] } ];         # cloning !
	  }
	: sub {                                    # or use random values in [ -0.5 .. 0.5 ]
	      return [ map { rand( 1 ) - 0.5 } 1..$self->{_Z} ];
	  };
for my $x (0 .. $self->{_D}-1) {
for my $y (0 .. $self->{_D}-1) {
$self->{map}->[$x]->[$y] = &$get_from_stream;
}
}
}
lib/AI/NeuralNet/SOM/Hexa.pm

=item I<as_string>
Not implemented.
=cut
## TODO: pretty printing of this as hexagon ?
sub as_string { die "not implemented"; }
=pod
=item I<as_data>
Not implemented.
=cut
sub as_data { die "not implemented"; }
=pod
=back
=head1 AUTHOR
Robert Barta, E<lt>rho@devc.atE<gt>
=head1 COPYRIGHT AND LICENSE
lib/AI/NeuralNet/SOM/Rect.pm
use AI::NeuralNet::SOM::Rect;
my $nn = new AI::NeuralNet::SOM::Rect (output_dim => "5x6",
input_dim => 3);
$nn->initialize;
$nn->train (30,
[ 3, 2, 4 ],
[ -1, -1, -1 ],
[ 0, 4, -3]);
print $nn->as_data;
=head1 INTERFACE
=head2 Constructor
The constructor takes the following arguments (in addition to those of the base class):
=over
=item C<output_dim> : (mandatory, no default)
lib/AI/NeuralNet/SOM/Rect.pm
}
=pod
=head2 Methods
=cut
sub initialize {
my $self = shift;
my @data = @_;
    my $i = 0;
    my $get_from_stream = @data
	? sub {                                    # seed the neurons from the list, cycling if needed
	      $i = 0 if $i > $#data;
	      return [ @{ $data[$i++] } ];         # cloning !
	  }
	: sub {                                    # or use random values in [ -0.5 .. 0.5 ]
	      return [ map { rand( 1 ) - 0.5 } 1..$self->{_Z} ];
	  };
for my $x (0 .. $self->{_X}-1) {
for my $y (0 .. $self->{_Y}-1) {
$self->{map}->[$x]->[$y] = &$get_from_stream;
}
}
}
lib/AI/NeuralNet/SOM/Rect.pm
}
$s .= sprintf "\n";
}
$s .= sprintf "\n";
}
return $s;
}
=pod
=item I<as_data>
print I<$nn>->as_data
This method creates a string containing the raw vector data, row by
row. This can be fed into gnuplot, for instance.
=cut
sub as_data {
my $self = shift;
my $s = '';
my $dim = scalar @{ $self->{map}->[0]->[0] };
for my $x (0 .. $self->{_X}-1) {
for my $y (0 .. $self->{_Y}-1){
for my $w ( 0 .. $dim-1 ){
$s .= sprintf ("\t%f", $self->{map}->[$x]->[$y]->[$w]);
}
$s .= sprintf "\n";
	}
    }
    return $s;
}

lib/AI/NeuralNet/SOM/Torus.pm
use AI::NeuralNet::SOM::Torus;
my $nn = new AI::NeuralNet::SOM::Torus (output_dim => "5x6",
input_dim => 3);
$nn->initialize;
$nn->train (30,
[ 3, 2, 4 ],
[ -1, -1, -1 ],
[ 0, 4, -3]);
print $nn->as_data;
=head1 DESCRIPTION
This SOM is very similar to the one with a rectangular topology, except that
the rectangle is connected along its top and bottom edges to first form a
cylinder; that cylinder is then formed into a torus by also connecting the
rectangle's left and right borders (L<http://en.wikipedia.org/wiki/Torus>).
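On a C<"5x6"> map, for example, the units at (0, 0) and (4, 0) are therefore
direct neighbours, as are (0, 0) and (0, 5), because the edges wrap around.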
=head1 INTERFACE
It exposes the same interface as the base class.
sub _find {                                    # test helper (head reconstructed; argument order $v, $m assumed)
    my ($v, $m) = @_;
    foreach my $x ( 0 .. 4 ) {
foreach my $y ( 0 .. 5 ) {
return 1 if AI::NeuralNet::SOM::Utils::vector_distance ($m->[$x]->[$y], $v) < 0.01;
}
}
return 0;
}
ok ($nn->as_string, 'pretty print');
ok ($nn->as_data, 'raw format');
# print $nn->as_string;
}
{
my $nn = new AI::NeuralNet::SOM::Rect (output_dim => "5x6",
input_dim => 3);
$nn->initialize;
    sub _find {                                # test helper (head reconstructed; argument order $v, $m assumed)
	my ($v, $m) = @_;
	foreach my $x ( 0 .. 4 ) {
foreach my $y ( 0 .. 5 ) {
return 1 if AI::NeuralNet::SOM::Utils::vector_distance ($m->[$x]->[$y], $v) < 0.01;
}
}
return 0;
}
ok ($nn->as_string, 'pretty print');
ok ($nn->as_data, 'raw format');
# print $nn->as_string;
}
__END__
# randomized pick
@vectors = ...;
my $get = sub {
    return $vectors[ int (rand (scalar @vectors)) ];   # single-element access, not a slice
};