AI-Perceptron-Simple
view release on metacpan or search on metacpan
lib/AI/Perceptron/Simple.pm view on Meta::CPAN
package AI::Perceptron::Simple;
use 5.008001;
use strict;
use warnings;
use Carp "croak";
use utf8;
binmode STDOUT, ":utf8";
require local::lib; # no local::lib in tests, this is also to avoid loading local::lib multiple times
use Text::CSV qw( csv );
use Text::Matrix;
use File::Basename qw( basename );
use List::Util qw( shuffle );
=head1 NAME
AI::Perceptron::Simple
A Newbie Friendly Module to Create, Train, Validate and Test Perceptrons / Neurons
lib/AI/Perceptron/Simple.pm view on Meta::CPAN
$nerve->tame( ... );
$nerve->exercise( ... );
$nerve->train( $training_data_csv, $expected_column_name, $save_nerve_to );
# or
$nerve->train(
$training_data_csv, $expected_column_name, $save_nerve_to,
$show_progress, $identifier); # these two parameters must go together
# validate
$nerve->take_lab_test( ... );
$nerve->take_mock_exam( ... );
# fill results to original file
$nerve->validate( {
stimuli_validate => $validation_data_csv,
predicted_column_index => 4,
} );
# or
# fill results to a new file
$nerve->validate( {
stimuli_validate => $validation_data_csv,
predicted_column_index => 4,
results_write_to => $new_csv
} );
# test - see "validate" method, same usage
$nerve->take_real_exam( ... );
$nerve->work_in_real_world( ... );
$nerve->test( ... );
# confusion matrix
my %c_matrix = $nerve->get_confusion_matrix( {
full_data_file => $file_csv,
actual_output_header => $header_name,
predicted_output_header => $predicted_header_name,
more_stats => 1, # optional
} );
lib/AI/Perceptron/Simple.pm view on Meta::CPAN
preserve_as_yaml save_perceptron_yaml revive_from_yaml load_perceptron_yaml
);
our %EXPORT_TAGS = (
process_data => [ qw( shuffle_data shuffle_stimuli ) ],
local_data => [ qw( preserve save_perceptron revive load_perceptron ) ],
portable_data => [ qw( preserve_as_yaml save_perceptron_yaml revive_from_yaml load_perceptron_yaml ) ],
);
=head1 DESCRIPTION
This module provides methods to build, train, validate and test a perceptron. It can also save the data of the perceptron for future use for any actual AI programs.
This module is also aimed to help newbies grasp hold of the concept of perceptron, training, validation and testing as much as possible. Hence, all the methods and subroutines in this module are decoupled as much as possible so that the actual script...
The implementation here is super basic as it only takes in input of the dendrites and calculate the output. If the output is higher than the threshold, the final result (category) will
be 1 aka perceptron is activated. If not, then the result will be 0 (not activated).
Depending on how you view or categorize the final result, the perceptron will fine tune itself (aka train) based on the learning rate until the desired result is met. Everything from
here on is all mathematics and numbers which only makes sense to the computer and not humans anymore.
Whenever the perceptron fine tunes itself, it will increase/decrease all the dendrites that is significant (attributes labelled 1) for each input. This means that even when the
perceptron successfully fine tunes itself to suite all the data in your file for the first round, the perceptron might still get some of the things wrong for the next round of training.
Therefore, the perceptron should be trained for as many rounds as possible. The more "confusion" the perceptron is able to correctly handle, the more "mature" the perceptron is.
lib/AI/Perceptron/Simple.pm view on Meta::CPAN
shuffle_data( @_ );
}
sub shuffle_data {
my $stimuli = shift or croak "Please specify the original file name";
my @shuffled_stimuli_names = @_
or croak "Please specify the output files for the shuffled data";
my @aoa;
for ( @shuffled_stimuli_names ) {
# copied from _real_validate_or_test
# open for shuffling
my $aoa = csv (in => $stimuli, encoding => ":encoding(utf-8)");
my $attrib_array_ref = shift @$aoa; # 'remove' the header, it's annoying :)
@aoa = shuffle( @$aoa ); # this can only process actual array
unshift @aoa, $attrib_array_ref; # put back the headers before saving file
csv( in => \@aoa, out => $_, encoding => ":encoding(utf-8)" )
and
print "Saved shuffled data into ", basename($_), "!\n";
lib/AI/Perceptron/Simple.pm view on Meta::CPAN
my $data_ref = shift;
my %data = %{ $data_ref };
# check keys
$data{ learning_rate } = LEARNING_RATE if not exists $data{ learning_rate };
$data{ threshold } = THRESHOLD if not exists $data{ threshold };
#####
# don't pack this key checking process into a subroutine for now
# this is also used in &_real_validate_or_test
my @missing_keys;
for ( qw( initial_value attribs ) ) {
push @missing_keys, $_ unless exists $data{ $_ };
}
croak "Missing keys: @missing_keys" if @missing_keys;
#####
# continue to process the rest of the data
my %attributes;
lib/AI/Perceptron/Simple.pm view on Meta::CPAN
}
}
=head1 VALIDATION RELATED METHODS
All the validation methods here have the same parameters as the actual C<validate> method and they all do the same stuff. They are also used in the same way.
=head2 take_mock_exam (...)
=head2 take_lab_test (...)
=head2 validate ( \%options )
This method validates the perceptron against another set of data after it has undergone the training process.
This method calculates the output of each row of data and write the result into the predicted column. The data begin written into the new file or the original file will maintain it's sequence.
Please take note that this method will load all the data of the validation stimuli, so please split your stimuli into multiple files if possible and call this method a few more times.
For C<%options>, the followings are needed unless mentioned:
lib/AI/Perceptron/Simple.pm view on Meta::CPAN
This column will be filled with binary numbers and the full new data will be saved to the file specified in the C<results_write_to> key.
=item results_write_to => $new_csv_file
Optional.
The default behaviour will write the predicted output back into C<stimuli_validate> ie the original data. The sequence of the data will be maintained.
=back
I<*This method will call C<_real_validate_or_test> to do the actual work.>
=cut
sub take_mock_exam {
my ( $self, $data_hash_ref ) = @_;
$self->_real_validate_or_test( $data_hash_ref );
}
sub take_lab_test {
my ( $self, $data_hash_ref ) = @_;
$self->_real_validate_or_test( $data_hash_ref );
}
sub validate {
my ( $self, $data_hash_ref ) = @_;
$self->_real_validate_or_test( $data_hash_ref );
}
=head1 TESTING RELATED SUBROUTINES/METHODS
All the testing methods here have the same parameters as the actual C<test> method and they all do the same stuff. They are also used in the same way.
=head2 take_real_exam (...)
=head2 work_in_real_world (...)
=head2 test ( \%options )
This method is used to put the trained nerve to the test. You can think of it as deploying the nerve for the actual work or maybe putting the nerve into an empty brain and see how
well the brain survives :)
This method works and behaves the same way as the C<validate> method. See C<validate> for the details.
I<*This method will call &_real_validate_or_test to do the actual work.>
=cut
# redirect to _real_validate_or_test
sub take_real_exam {
my ( $self, $data_hash_ref ) = @_;
$self->_real_validate_or_test( $data_hash_ref );
}
sub work_in_real_world {
my ( $self, $data_hash_ref ) = @_;
$self->_real_validate_or_test( $data_hash_ref );
}
sub test {
my ( $self, $data_hash_ref ) = @_;
$self->_real_validate_or_test( $data_hash_ref );
}
=head2 _real_validate_or_test ( $data_hash_ref )
This is where the actual validation or testing takes place.
C<$data_hash_ref> is the list of parameters passed into the C<validate> or C<test> methods.
This is a B<method>, so use the OO way. This is one of the exceptions to the rules where private subroutines are treated as methods :)
=cut
sub _real_validate_or_test {
my $self = shift; my $data_hash_ref = shift;
#####
my @missing_keys;
for ( qw( stimuli_validate predicted_column_index ) ) {
push @missing_keys, $_ unless exists $data_hash_ref->{ $_ };
}
croak "Missing keys: @missing_keys" if @missing_keys;
lib/AI/Perceptron/Simple.pm view on Meta::CPAN
unshift @$aoa, $attrib_array_ref;
print "Saving data to $output_file\n";
csv( in => $aoa, out => $output_file, encoding => ":encoding(utf-8)" );
print "Done saving!\n";
}
=head2 &_fill_predicted_values ( $self, $stimuli_validate, $predicted_index, $aoa )
This is where the filling in of the predicted values takes place. Take note that the parameters naming are the same as the ones used in the C<validate> and C<test> method.
This subroutine should be called in the procedural way.
=cut
sub _fill_predicted_values {
my ( $self, $stimuli_validate, $predicted_index, $aoa ) = @_;
# CSV processing is all according to the documentation of Text::CSV
open my $data_fh, "<:encoding(UTF-8)", $stimuli_validate
lib/AI/Perceptron/Simple.pm view on Meta::CPAN
=head2 get_confusion_matrix ( \%options )
Returns the confusion matrix in the form of a hash. The hash will contain these keys: C<true_positive>, C<true_negative>, C<false_positive>, C<false_negative>, C<accuracy>, C<sensitivity>. More stats like C<precision>, C<specificity> and C<F1_Score> ...
If you are trying to manipulate the confusion matrix hash or something, take note that all the stats are in percentage (%) in decimal (if any) except the total entries.
For C<%options>, the followings are needed unless mentioned:
=over 4
=item full_data_file => $filled_test_file
This is the CSV file filled with the predicted values.
Make sure that you don't do anything to the actual and predicted output in this file after testing the nerve. These two columns must contain binary values only!
=item actual_output_header => $actual_column_name
=item predicted_output_header => $predicted_column_name
The binary values are treated as follows:
=over 4
=item C<0> is negative
( run in 2.186 seconds using v1.01-cache-2.11-cpan-f56aa216473 )