* make subroutines exportable, the names are too long
* :local_data
* :portable_data
* cleaned & refactored code
* &display_confusion_matrix
* improved the documentation
Version 1.01 24 AUGUST 2021
* Fixed some technical issues
* fixed test scripts not run in correct sequence
* must be creation -> train -> validate/test
* local::lib issue should be fixed by now
Version 1.00 23 AUGUST 2021
* The following features were implemented over the course of time (see also My::Perceptron v0.04 on github):
* create perceptron
* process data: &train method
* read csv - for training stage
* save and load the perceptron
* output algorithm for train
* read and calculate data line by line
* validate method
* read csv bulk
* write predicted values into original file
* write predicted values into new file
* test method
* read csv bulk
* write predicted values into original file
* write predicted values into new file
* confusion matrix
* read only expected and predicted columns, line by line
* TP, TN, FP, FN
* total entries
* accuracy
* sensitivity
* display confusion matrix data to console
* use Text::Matrix
* synonyms
* synonyms MUST call the actual subroutines instead of copy-pasting their code!
* train: tame, exercise
* validate: take_mock_exam, take_lab_test
* test: take_real_exam, work_in_real_world
* generate_confusion_matrix: get_exam_results
* display_confusion_matrix: display_exam_results
* save_perceptron: preserve
* load_perceptron: revive
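The synonym rule above (a synonym calls the actual subroutine rather than duplicating its body) can be sketched as follows. This is a minimal illustration only: `My::Nerve` and its trivial `train` body are hypothetical stand-ins and are not taken from the module's source.

```perl
use strict;
use warnings;

package My::Nerve;    # hypothetical class, for illustration only

sub new { return bless { trained => 0 }, shift }

# the actual method that does the work
sub train {
    my ( $self, @args ) = @_;
    $self->{trained}++;
    return scalar @args;    # number of arguments received
}

# synonyms forward the object and every argument to the actual method
sub tame     { my $self = shift; return $self->train( @_ ) }
sub exercise { my $self = shift; return $self->train( @_ ) }

package main;

my $nerve    = My::Nerve->new;
my $received = $nerve->tame( "train.csv", "brand", "nerve.file" );
$nerve->exercise( "train.csv", "brand", "nerve.file" );

print "arguments forwarded: $received\n";
print "train ran $nerve->{trained} times\n";
```

Because every synonym delegates, a fix to `train` is picked up by `tame` and `exercise` automatically.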
t/00-load.t
t/00-manifest.t
t/00-pod-coverage.t
t/00-pod.t
t/02-creation.t
t/02-state_portable.t
t/02-state_synonyms.t
t/04-train.t
t/04-train_synonyms_exercise.t
t/04-train_synonyms_tame.t
t/06-validate.t
t/06-validate_synonyms_lab.t
t/06-validate_synonyms_mock.t
t/08-confusion_matrix.t
t/08-confusion_matrix_synonyms.t
t/10-test.t
t/10-test_synonyms_exam.t
t/10-test_synonyms_work.t
t/12-shuffle_data.t
t/12-shuffle_data_synonym.t
t/book_list_test-filled-non-binary.csv
t/book_list_test-filled.csv
t/book_list_test.csv
t/book_list_test_exam-filled.csv
t/book_list_test_work-filled.csv
t/book_list_to_shuffle.csv
t/book_list_train.csv
t/book_list_validate-filled.csv
t/book_list_validate.csv
t/book_list_validate_lab-filled.csv
t/book_list_validate_mock-filled.csv
t/perceptron_1.nerve
t/perceptron_exercise.nerve
t/perceptron_state_synonyms.nerve
t/perceptron_tame.nerve
t/portable.nerve
t/portable_nerve.yaml
t/shuffled_1.csv
t/shuffled_2.csv
t/shuffled_3.csv
xt/boilerplate.t
AI-Perceptron-Simple (v1.04)
A Newbie Friendly Module to Create, Train, Validate and Test Perceptrons / Neurons
This module provides methods to build, train, validate and test a perceptron. It can also save the data of the perceptron for future use for any actual AI programs.
This module is also aimed to help newbies grasp hold of the concept of perceptron, training, validation and testing as much as possible. Hence, all the methods and subroutines in this module are decoupled as much as possible so that the actual script...
INSTALLATION
To install this module, run the following commands:
perl Makefile.PL
make
make test
docs/AI-Perceptron-Simple-1.04.html view on Meta::CPAN
<li><a href="#train-stimuli_train_csv-expected_output_header-save_nerve_to_file">train ( $stimuli_train_csv, $expected_output_header, $save_nerve_to_file )</a></li>
<li><a href="#train-stimuli_train_csv-expected_output_header-save_nerve_to_file-display_stats-identifier">train ( $stimuli_train_csv, $expected_output_header, $save_nerve_to_file, $display_stats, $identifier )</a></li>
<li><a href="#calculate_output-self-stimuli_hash">&_calculate_output( $self, \%stimuli_hash )</a></li>
<li><a href="#tune-self-stimuli_hash-tune_up_or_down">&_tune( $self, \%stimuli_hash, $tune_up_or_down )</a></li>
</ul>
</li>
<li><a href="#VALIDATION-RELATED-METHODS">VALIDATION RELATED METHODS</a>
<ul>
<li><a href="#take_mock_exam">take_mock_exam (...)</a></li>
<li><a href="#take_lab_test">take_lab_test (...)</a></li>
<li><a href="#validate-options">validate ( \%options )</a></li>
</ul>
</li>
<li><a href="#TESTING-RELATED-SUBROUTINES-METHODS">TESTING RELATED SUBROUTINES/METHODS</a>
<ul>
<li><a href="#take_real_exam">take_real_exam (...)</a></li>
<li><a href="#work_in_real_world">work_in_real_world (...)</a></li>
<li><a href="#test-options">test ( \%options )</a></li>
<li><a href="#real_validate_or_test-data_hash_ref">_real_validate_or_test ( $data_hash_ref )</a></li>
<li><a href="#fill_predicted_values-self-stimuli_validate-predicted_index-aoa">&_fill_predicted_values ( $self, $stimuli_validate, $predicted_index, $aoa )</a></li>
</ul>
</li>
<li><a href="#RESULTS-RELATED-SUBROUTINES-METHODS">RESULTS RELATED SUBROUTINES/METHODS</a>
<ul>
<li><a href="#get_exam_results">get_exam_results ( ... )</a></li>
<li><a href="#get_confusion_matrix-options">get_confusion_matrix ( \%options )</a></li>
<li><a href="#collect_stats-options">&_collect_stats ( \%options )</a></li>
<li><a href="#calculate_total_entries-c_matrix_ref">&_calculate_total_entries ( $c_matrix_ref )</a></li>
<li><a href="#calculate_accuracy-c_matrix_ref">&_calculate_accuracy ( $c_matrix_ref )</a></li>
<li><a href="#calculate_sensitivity-c_matrix_ref">&_calculate_sensitivity ( $c_matrix_ref )</a></li>
docs/AI-Perceptron-Simple-1.04.html view on Meta::CPAN
# train
$nerve->tame( ... );
$nerve->exercise( ... );
$nerve->train( $training_data_csv, $expected_column_name, $save_nerve_to );
# or
$nerve->train(
$training_data_csv, $expected_column_name, $save_nerve_to,
$show_progress, $identifier); # these two parameters must go together
# validate
$nerve->take_lab_test( ... );
$nerve->take_mock_exam( ... );
# fill results to original file
$nerve->validate( {
stimuli_validate => $validation_data_csv,
predicted_column_index => 4,
} );
# or
# fill results to a new file
$nerve->validate( {
stimuli_validate => $validation_data_csv,
predicted_column_index => 4,
results_write_to => $new_csv
} );
# test - see "validate" method, same usage
$nerve->take_real_exam( ... );
$nerve->work_in_real_world( ... );
$nerve->test( ... );
# confusion matrix
my %c_matrix = $nerve->get_confusion_matrix( {
full_data_file => $file_csv,
actual_output_header => $header_name,
predicted_output_header => $predicted_header_name,
docs/AI-Perceptron-Simple-1.04.html view on Meta::CPAN
<dt id="portable_data---subroutines-under-NERVE-PORTABILITY-RELATED-SUBROUTINES-section"><code>:portable_data</code> - subroutines under <code>NERVE PORTABILITY RELATED SUBROUTINES</code> section.</dt>
<dd>
</dd>
</dl>
<p>Most of the interface is object-oriented.</p>
<h1 id="DESCRIPTION">DESCRIPTION</h1>
<p>This module provides methods to build, train, validate and test a perceptron. It can also save the data of the perceptron for future use for any actual AI programs.</p>
<p>This module is also aimed to help newbies grasp hold of the concept of perceptron, training, validation and testing as much as possible. Hence, all the methods and subroutines in this module are decoupled as much as possible so that the actual scr...
<p>The implementation here is super basic as it only takes in the input of the dendrites and calculates the output. If the output is higher than the threshold, the final result (category) will be 1 aka perceptron is activated. If not, then the result will...
<p>Depending on how you view or categorize the final result, the perceptron will fine tune itself (aka train) based on the learning rate until the desired result is met. Everything from here on is all mathematics and numbers which only makes sense to...
<p>Whenever the perceptron fine tunes itself, it will increase/decrease all the dendrites that are significant (attributes labelled 1) for each input. This means that even when the perceptron successfully fine tunes itself to suit all the data in you...
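The activation and tuning behaviour described above can be sketched in a few lines. This is only an illustration under stated assumptions: the subroutine names `calculate_output`, `categorize` and `tune`, the two attributes, and the values chosen for the learning rate and threshold are hypothetical and are not the module's own code or defaults.

```perl
use strict;
use warnings;

# hypothetical values, chosen only for this illustration
my $learning_rate = 0.5;
my $threshold     = 0.5;
my %dendrites     = ( glossy_cover => 0.25, has_flowers => 0.25 );

# sum the weights of every attribute that is switched on (labelled 1)
sub calculate_output {
    my ( $weights, $stimuli ) = @_;
    my $sum = 0;
    $sum += $weights->{ $_ } * $stimuli->{ $_ } for keys %$weights;
    return $sum;
}

# activated (1) only if the output is higher than the threshold
sub categorize {
    my ( $weights, $stimuli ) = @_;
    return calculate_output( $weights, $stimuli ) > $threshold ? 1 : 0;
}

# nudge only the attributes labelled 1; $direction is +1 (up) or -1 (down)
sub tune {
    my ( $weights, $stimuli, $direction ) = @_;
    for ( keys %$weights ) {
        $weights->{ $_ } += $direction * $learning_rate if $stimuli->{ $_ };
    }
}

my %row      = ( glossy_cover => 1, has_flowers => 0 );
my $expected = 1;

my $before = categorize( \%dendrites, \%row );
# a wrong prediction triggers tuning: up if 1 was expected, down otherwise
tune( \%dendrites, \%row, $expected ? 1 : -1 ) if $before != $expected;
my $after = categorize( \%dendrites, \%row );

print "before: $before, after: $after\n";
```

Only `glossy_cover` changes here, because `has_flowers` is labelled 0 in this row and is therefore not considered significant for it.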
<h1 id="CONVENTIONS-USED">CONVENTIONS USED</h1>
docs/AI-Perceptron-Simple-1.04.html view on Meta::CPAN
<h2 id="tame">tame ( ... )</h2>
<h2 id="exercise">exercise ( ... )</h2>
<h2 id="train-stimuli_train_csv-expected_output_header-save_nerve_to_file">train ( $stimuli_train_csv, $expected_output_header, $save_nerve_to_file )</h2>
<h2 id="train-stimuli_train_csv-expected_output_header-save_nerve_to_file-display_stats-identifier">train ( $stimuli_train_csv, $expected_output_header, $save_nerve_to_file, $display_stats, $identifier )</h2>
<p>Trains the perceptron.</p>
<p><code>$stimuli_train_csv</code> is the set of data / input (in CSV format) used to train the perceptron, while <code>$save_nerve_to_file</code> is the filename that will be generated each time the perceptron finishes the training process. This data file ...
<p><code>$expected_output_header</code> is the header name of the column in the CSV file that holds the actual category, i.e. the expected values. This is used to determine whether to tune the nerve up or down. This value should only be 0 or 1 for the sake of simpli...
<p><code>$display_stats</code> is <b>optional</b> and the default is 0. When enabled, it displays more output about the tuning process, showing the following:</p>
<dl>
<dt id="tuning-status">tuning status</dt>
<dd>
docs/AI-Perceptron-Simple-1.04.html view on Meta::CPAN
<p>Value is <code>0</code></p>
</dd>
</dl>
<p>This subroutine should be called in the procedural way for now.</p>
<h1 id="VALIDATION-RELATED-METHODS">VALIDATION RELATED METHODS</h1>
<p>All the validation methods here take the same parameters as the actual <code>validate</code> method, do the same work and are used in the same way.</p>
<h2 id="take_mock_exam">take_mock_exam (...)</h2>
<h2 id="take_lab_test">take_lab_test (...)</h2>
<h2 id="validate-options">validate ( \%options )</h2>
<p>This method validates the perceptron against another set of data after it has undergone the training process.</p>
<p>This method calculates the output of each row of data and writes the result into the predicted column. The data being written into the new file or the original file will maintain its sequence.</p>
<p>Please take note that this method loads all the data of the validation stimuli at once, so split large stimuli into multiple files where possible and call this method once per file.</p>
<p>For <code>%options</code>, the following keys are required unless noted otherwise:</p>
<dl>
<dt id="stimuli_validate-csv_file">stimuli_validate => $csv_file</dt>
<dd>
<p>This is the CSV file containing the validation data. Make sure that it contains a column for the predicted values, as it is needed by the next key: <code>predicted_column_index</code></p>
</dd>
<dt id="predicted_column_index-column_number">predicted_column_index => $column_number</dt>
<dd>
<p>This is the index of the column that contains the predicted output values. <code>$index</code> starts from <code>0</code>.</p>
<p>This column will be filled with binary numbers and the full new data will be saved to the file specified in the <code>results_write_to</code> key.</p>
</dd>
<dt id="results_write_to-new_csv_file">results_write_to => $new_csv_file</dt>
<dd>
<p>Optional.</p>
<p>By default, the predicted output is written back into <code>stimuli_validate</code>, i.e. the original data. The sequence of the data will be maintained.</p>
</dd>
</dl>
<p><i>*This method will call <code>_real_validate_or_test</code> to do the actual work.</i></p>
<h1 id="TESTING-RELATED-SUBROUTINES-METHODS">TESTING RELATED SUBROUTINES/METHODS</h1>
<p>All the testing methods here take the same parameters as the actual <code>test</code> method, do the same work and are used in the same way.</p>
<h2 id="take_real_exam">take_real_exam (...)</h2>
<h2 id="work_in_real_world">work_in_real_world (...)</h2>
<h2 id="test-options">test ( \%options )</h2>
<p>This method is used to put the trained nerve to the test. You can think of it as deploying the nerve for the actual work or maybe putting the nerve into an empty brain and see how well the brain survives :)</p>
<p>This method works and behaves the same way as the <code>validate</code> method. See <code>validate</code> for the details.</p>
<p><i>*This method will call &_real_validate_or_test to do the actual work.</i></p>
<h2 id="real_validate_or_test-data_hash_ref">_real_validate_or_test ( $data_hash_ref )</h2>
<p>This is where the actual validation or testing takes place.</p>
<p><code>$data_hash_ref</code> is the list of parameters passed into the <code>validate</code> or <code>test</code> methods.</p>
<p>This is a <b>method</b>, so use the OO way. This is one of the exceptions to the rules where private subroutines are treated as methods :)</p>
<h2 id="fill_predicted_values-self-stimuli_validate-predicted_index-aoa">&_fill_predicted_values ( $self, $stimuli_validate, $predicted_index, $aoa )</h2>
<p>This is where the filling in of the predicted values takes place. Take note that the parameter names are the same as the ones used in the <code>validate</code> and <code>test</code> methods.</p>
<p>This subroutine should be called in the procedural way.</p>
<h1 id="RESULTS-RELATED-SUBROUTINES-METHODS">RESULTS RELATED SUBROUTINES/METHODS</h1>
<p>This part is related to generating the confusion matrix.</p>
<h2 id="get_exam_results">get_exam_results ( ... )</h2>
<p>The parameters and usage are the same as <code>get_confusion_matrix</code>. See the next method.</p>
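As a rough illustration of what a confusion matrix holds, the sketch below tallies TP, TN, FP and FN from pairs of expected and predicted values and derives accuracy and sensitivity from them. The data, the hash key names and the choice to report plain fractions are assumptions made for this example; the module's own return format may differ.

```perl
use strict;
use warnings;

# hypothetical [ expected, predicted ] pairs standing in for the two CSV columns
my @pairs = ( ( [ 1, 1 ] ) x 4, ( [ 0, 0 ] ) x 3, ( [ 0, 1 ] ) x 2, ( [ 1, 0 ] ) x 1 );

my %c_matrix = ( TP => 0, TN => 0, FP => 0, FN => 0 );
for my $pair ( @pairs ) {
    my ( $expected, $predicted ) = @$pair;
    if    (  $expected &&  $predicted ) { $c_matrix{ TP }++ }   # true positive
    elsif ( !$expected && !$predicted ) { $c_matrix{ TN }++ }   # true negative
    elsif ( !$expected &&  $predicted ) { $c_matrix{ FP }++ }   # false positive
    else                                { $c_matrix{ FN }++ }   # false negative
}

my $total = $c_matrix{ TP } + $c_matrix{ TN } + $c_matrix{ FP } + $c_matrix{ FN };

# accuracy: share of all entries that were predicted correctly
my $accuracy = ( $c_matrix{ TP } + $c_matrix{ TN } ) / $total;

# sensitivity: share of the actual positives that were found
my $sensitivity = $c_matrix{ TP } / ( $c_matrix{ TP } + $c_matrix{ FN } );

printf "total: %d, accuracy: %.2f, sensitivity: %.2f\n", $total, $accuracy, $sensitivity;
```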
docs/specifications.t view on Meta::CPAN
#
# Version 0.01 - completed on 8 August 2021
# [v] able to create perceptron
# [v] able to process data: &train method
# [v] read csv - for training stage
# [v] able to save the actual perceptron object and load it back
#
# Version 0.02 - completed on 17 August 2021
# [v] implement output algorithm for train and finalize it
# [v] read and calculate data line by line, not bulk, so no shuffling method
# [v] implement validate method
# [v] read csv bulk - for validating and testing stages
# [v] write into a new csv file - validation and testing stages
# [v] implement testing method
# [v] read csv bulk - for validating and testing stages
# [v] write into a new csv file - validation and testing stages
#
# Version 0.03 - completed on 19 August 2021
# [v] implement confusion matrix
# [v] read only expected and predicted columns, line by line
# [v] return a hash of data
docs/specifications.t view on Meta::CPAN
# [v] accuracy
# [v] sensitivity
# [v] remove the return value for "train" method
# [v] display confusion matrix data to console
# [v] use Text::Matrix
#
# Version 0.04 / Version 1.0 - completed on 23 AUGUST 2021
# [v] add synonyms
# [v] synonyms MUST call actual subroutines and not copy pasting!
# train: [v] tame [v] exercise
# validate: [v] take_mock_exam [v] take_lab_test
# test: [v] take_real_exam [v] work_in_real_world
# generate_confusion_matrix: [v] get_exam_results
# display_confusion_matrix: [v] display_exam_results
# save_perceptron: [v] preserve
# load_perceptron: [v] revive
#
# Version 1.01
# [v] fixed currently known issues as much as possible (see 'Changes')
# - "long size integer" === "byte order not compatible"
#
docs/specifications.t view on Meta::CPAN
#
# Version ?.??
# ? implement shuffling system into training stage, bulk data processing
# ? Data processing: splitting data, k-fold
# -...
#
#
############ "flow" of the codes ############
# these three steps could be done in separate scripts if necessary
# &train and &validate could be put inside a loop or something
# the parameters make more sense when they are taken from @ARGV
# so when it's the first time training, it will create the nerve_file,
# the second time and up it will directly override that file since everything is read from it
# ... anyway :) after all, the training stage wasn't meant to be a fully working program, so it shouldn't be a problem
# just assume that
$perceptron->train( $stimuli_train, $save_nerve_to_file );
# reads training stimuli from csv
# tune attributes based on csv data
# calls the same subroutine to do the calculation
# shouldn't give any output upon completion
# should save a copy of itself into a new file
# returns the nerve's data filename to be used in validate()
# these two can go into a loop with conditions checking
# which means that we can actually write this
# $perceptron->validate( $stimuli_validate,
# $perceptron->train( $stimuli_train, $save_nerve_to_file )
# );
# and then check the confusion matrix, if not satisfied, run the loop again :)
$perceptron->validate( $stimuli_validate, $nerve_data_to_read );
$perceptron->test( $stimuli_test ); # loads nerve data from data file, turn into a object, then do the following:
# reads from csv :
# validation stimuli
# testing stimuli
# both will call the same subroutine to do calculation
# both will write predicted data into the original data file
# show results ie confusion matrix (TP-true positive, TN-true negative, FP-false positive, FN-false negative)
# this should only be done during validation and testing
$perceptron->generate_confusion_matrix( { 1 => $csv_header_true, 0 => $csv_header_false } );
# calculates the 4 thingy based on the current data on hand (RAM), don't read from file again, it shouldn't be a problem
# returns a hash
# ie it must be used together with validate() and test() to avoid problems
# ie validate() and test() must be in different scripts, which makes sense
# unless, create 3 similar objects to do the work in one go
# save data of the trained perceptron
$perceptron->save_data( $data_file );
# see train() on saving copy of the perceptron
# load data of perceptron for use in actual program
My::Perceptron::load_data( $data_file );
# loads the perceptron and returns the actual My::Perceptron object
# should work though as Storable claims it can do that
lib/AI/Perceptron/Simple.pm view on Meta::CPAN
# train
$nerve->tame( ... );
$nerve->exercise( ... );
$nerve->train( $training_data_csv, $expected_column_name, $save_nerve_to );
# or
$nerve->train(
$training_data_csv, $expected_column_name, $save_nerve_to,
$show_progress, $identifier); # these two parameters must go together
# validate
$nerve->take_lab_test( ... );
$nerve->take_mock_exam( ... );
# fill results to original file
$nerve->validate( {
stimuli_validate => $validation_data_csv,
predicted_column_index => 4,
} );
# or
# fill results to a new file
$nerve->validate( {
stimuli_validate => $validation_data_csv,
predicted_column_index => 4,
results_write_to => $new_csv
} );
# test - see "validate" method, same usage
$nerve->take_real_exam( ... );
$nerve->work_in_real_world( ... );
$nerve->test( ... );
# confusion matrix
my %c_matrix = $nerve->get_confusion_matrix( {
full_data_file => $file_csv,
actual_output_header => $header_name,
predicted_output_header => $predicted_header_name,
lib/AI/Perceptron/Simple.pm view on Meta::CPAN
preserve_as_yaml save_perceptron_yaml revive_from_yaml load_perceptron_yaml
);
our %EXPORT_TAGS = (
process_data => [ qw( shuffle_data shuffle_stimuli ) ],
local_data => [ qw( preserve save_perceptron revive load_perceptron ) ],
portable_data => [ qw( preserve_as_yaml save_perceptron_yaml revive_from_yaml load_perceptron_yaml ) ],
);
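To show how export tags such as `:local_data` behave on import, here is a self-contained sketch built on the core `Exporter` module. The package `My::Nerve::Tools` and its stub subroutines are hypothetical; only the tag and subroutine names are borrowed from the listing above.

```perl
use strict;
use warnings;

BEGIN {
    package My::Nerve::Tools;    # hypothetical module defined inline
    use Exporter qw( import );
    our @EXPORT_OK   = qw( preserve revive shuffle_data );
    our %EXPORT_TAGS = (
        process_data => [ qw( shuffle_data ) ],
        local_data   => [ qw( preserve revive ) ],
    );
    sub preserve     { return "preserved" }
    sub revive       { return "revived" }
    sub shuffle_data { return "shuffled" }
    $INC{ "My/Nerve/Tools.pm" } = 1;    # mark the inline package as loaded
}

package main;

# import only the subroutines grouped under the :local_data tag
use My::Nerve::Tools qw( :local_data );

my $saved       = preserve();
my $restored    = revive();
my $has_shuffle = defined &main::shuffle_data ? 1 : 0;    # not imported

print "$saved, $restored, shuffle_data imported: $has_shuffle\n";
```

Subroutines outside the requested tag stay reachable through their fully qualified names, for example `My::Nerve::Tools::shuffle_data()`.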
=head1 DESCRIPTION
This module provides methods to build, train, validate and test a perceptron. It can also save the data of the perceptron for future use for any actual AI programs.
This module is also aimed to help newbies grasp hold of the concept of perceptron, training, validation and testing as much as possible. Hence, all the methods and subroutines in this module are decoupled as much as possible so that the actual script...
The implementation here is super basic as it only takes in the input of the dendrites and calculates the output. If the output is higher than the threshold, the final result (category) will
be 1 aka perceptron is activated. If not, then the result will be 0 (not activated).
Depending on how you view or categorize the final result, the perceptron will fine tune itself (aka train) based on the learning rate until the desired result is met. Everything from
here on is all mathematics and numbers which only makes sense to the computer and not humans anymore.
Whenever the perceptron fine tunes itself, it will increase/decrease all the dendrites that are significant (attributes labelled 1) for each input. This means that even when the
lib/AI/Perceptron/Simple.pm view on Meta::CPAN
shuffle_data( @_ );
}
sub shuffle_data {
my $stimuli = shift or croak "Please specify the original file name";
my @shuffled_stimuli_names = @_
or croak "Please specify the output files for the shuffled data";
my @aoa;
for ( @shuffled_stimuli_names ) {
# copied from _real_validate_or_test
# open for shuffling
my $aoa = csv (in => $stimuli, encoding => ":encoding(utf-8)");
my $attrib_array_ref = shift @$aoa; # 'remove' the header, it's annoying :)
@aoa = shuffle( @$aoa ); # this can only process actual array
unshift @aoa, $attrib_array_ref; # put back the headers before saving file
csv( in => \@aoa, out => $_, encoding => ":encoding(utf-8)" )
and
print "Saved shuffled data into ", basename($_), "!\n";
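The header handling used while shuffling can be isolated into a small in-memory sketch. It assumes the rows are already parsed into an array of arrays and leaves out all file reading and writing; the sample rows are made up for illustration.

```perl
use strict;
use warnings;
use List::Util qw( shuffle );

# hypothetical parsed CSV: the first row is the header
my @rows = ( [ "book_name", "brand" ], [ "A", 1 ], [ "B", 0 ], [ "C", 1 ] );

# separate the header from the data rows so that it is never shuffled
my ( $header, @data ) = @rows;

# shuffle only the data rows, then put the header back on top
my @shuffled = ( $header, shuffle @data );

print join( ",", @$_ ), "\n" for @shuffled;
```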
lib/AI/Perceptron/Simple.pm view on Meta::CPAN
my $data_ref = shift;
my %data = %{ $data_ref };
# check keys
$data{ learning_rate } = LEARNING_RATE if not exists $data{ learning_rate };
$data{ threshold } = THRESHOLD if not exists $data{ threshold };
#####
# don't pack this key checking process into a subroutine for now
# this is also used in &_real_validate_or_test
my @missing_keys;
for ( qw( initial_value attribs ) ) {
push @missing_keys, $_ unless exists $data{ $_ };
}
croak "Missing keys: @missing_keys" if @missing_keys;
#####
# continue to process the rest of the data
my %attributes;
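The pattern above (fill in defaults first, then refuse to continue if required keys are missing) can be tried out in isolation. In this sketch the subroutine name `check_options` and the constant values are assumptions made for illustration, not the module's actual defaults.

```perl
use strict;
use warnings;
use Carp qw( croak );

use constant LEARNING_RATE => 0.05;    # assumed value, for illustration
use constant THRESHOLD     => 0.5;     # assumed value, for illustration

# hypothetical helper: returns a copy of the options with defaults applied
sub check_options {
    my %data = %{ shift() };

    # optional keys fall back to the defaults
    $data{ learning_rate } = LEARNING_RATE unless exists $data{ learning_rate };
    $data{ threshold }     = THRESHOLD     unless exists $data{ threshold };

    # required keys are collected first so that all of them are reported at once
    my @missing_keys = grep { !exists $data{ $_ } } qw( initial_value attribs );
    croak "Missing keys: @missing_keys" if @missing_keys;

    return \%data;
}

my $ok    = check_options( { initial_value => 0.01, attribs => [ "glossy_cover" ] } );
my $error = eval { check_options( { attribs => [] } ); 1 } ? "" : $@;

print "threshold defaulted to $ok->{ threshold }\n";
print "error: $error";
```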
lib/AI/Perceptron/Simple.pm view on Meta::CPAN
=head2 exercise ( ... )
=head2 train ( $stimuli_train_csv, $expected_output_header, $save_nerve_to_file )
=head2 train ( $stimuli_train_csv, $expected_output_header, $save_nerve_to_file, $display_stats, $identifier )
Trains the perceptron.
C<$stimuli_train_csv> is the set of data / input (in CSV format) used to train the perceptron, while C<$save_nerve_to_file> is
the filename that will be generated each time the perceptron finishes the training process. This data file is the data of the C<AI::Perceptron::Simple>
object and it is used in the C<validate> method.
C<$expected_output_header> is the header name of the column in the CSV file that holds the actual category, i.e. the expected values. This is used to determine whether to tune the nerve up or down. This value should only be 0 or 1 for the sake of simplicity.
C<$display_stats> is B<optional> and the default is 0. When enabled, it displays more output about the tuning process, showing the following:
=over 4
=item tuning status
Indicates the nerve was tuned up, down or no tuning needed
lib/AI/Perceptron/Simple.pm view on Meta::CPAN
}
#print $_, ": ", $self->{ attributes_hash_ref }{ $_ }, "\n";
}
}
}
=head1 VALIDATION RELATED METHODS
All the validation methods here take the same parameters as the actual C<validate> method, do the same work and are used in the same way.
=head2 take_mock_exam (...)
=head2 take_lab_test (...)
=head2 validate ( \%options )
This method validates the perceptron against another set of data after it has undergone the training process.
This method calculates the output of each row of data and writes the result into the predicted column. The data being written into the new file or the original file will maintain its sequence.
Please take note that this method loads all the data of the validation stimuli at once, so split large stimuli into multiple files where possible and call this method once per file.
For C<%options>, the following keys are required unless noted otherwise:
=over 4
=item stimuli_validate => $csv_file
This is the CSV file containing the validation data. Make sure that it contains a column for the predicted values, as it is needed by the next key: C<predicted_column_index>
=item predicted_column_index => $column_number
This is the index of the column that contains the predicted output values. C<$index> starts from C<0>.
This column will be filled with binary numbers and the full new data will be saved to the file specified in the C<results_write_to> key.
=item results_write_to => $new_csv_file
Optional.
By default, the predicted output is written back into C<stimuli_validate>, i.e. the original data. The sequence of the data will be maintained.
=back
I<*This method will call C<_real_validate_or_test> to do the actual work.>
=cut
sub take_mock_exam {
my ( $self, $data_hash_ref ) = @_;
$self->_real_validate_or_test( $data_hash_ref );
}
sub take_lab_test {
my ( $self, $data_hash_ref ) = @_;
$self->_real_validate_or_test( $data_hash_ref );
}
sub validate {
my ( $self, $data_hash_ref ) = @_;
$self->_real_validate_or_test( $data_hash_ref );
}
=head1 TESTING RELATED SUBROUTINES/METHODS
All the testing methods here take the same parameters as the actual C<test> method, do the same work and are used in the same way.
=head2 take_real_exam (...)
=head2 work_in_real_world (...)
=head2 test ( \%options )
This method is used to put the trained nerve to the test. You can think of it as deploying the nerve for the actual work or maybe putting the nerve into an empty brain and see how
well the brain survives :)
This method works and behaves the same way as the C<validate> method. See C<validate> for the details.
I<*This method will call &_real_validate_or_test to do the actual work.>
=cut
# redirect to _real_validate_or_test
sub take_real_exam {
my ( $self, $data_hash_ref ) = @_;
$self->_real_validate_or_test( $data_hash_ref );
}
sub work_in_real_world {
my ( $self, $data_hash_ref ) = @_;
$self->_real_validate_or_test( $data_hash_ref );
}
sub test {
my ( $self, $data_hash_ref ) = @_;
$self->_real_validate_or_test( $data_hash_ref );
}
=head2 _real_validate_or_test ( $data_hash_ref )
This is where the actual validation or testing takes place.
C<$data_hash_ref> is the list of parameters passed into the C<validate> or C<test> methods.
This is a B<method>, so use the OO way. This is one of the exceptions to the rules where private subroutines are treated as methods :)
=cut
sub _real_validate_or_test {
my $self = shift; my $data_hash_ref = shift;
#####
my @missing_keys;
for ( qw( stimuli_validate predicted_column_index ) ) {
push @missing_keys, $_ unless exists $data_hash_ref->{ $_ };
}
croak "Missing keys: @missing_keys" if @missing_keys;
#####
my $stimuli_validate = $data_hash_ref->{ stimuli_validate };
my $predicted_index = $data_hash_ref->{ predicted_column_index };
# actual processing starts here
my $output_file = defined $data_hash_ref->{ results_write_to }
? $data_hash_ref->{ results_write_to }
: $stimuli_validate;
# open for writing results
my $aoa = csv (in => $stimuli_validate, encoding => ":encoding(utf-8)");
my $attrib_array_ref = shift @$aoa; # 'remove' the header, it's annoying :)
$aoa = _fill_predicted_values( $self, $stimuli_validate, $predicted_index, $aoa );
# put back the array of headers before saving file
unshift @$aoa, $attrib_array_ref;
print "Saving data to $output_file\n";
csv( in => $aoa, out => $output_file, encoding => ":encoding(utf-8)" );
print "Done saving!\n";
}
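The flow of `_real_validate_or_test` (choose the output file, set the header aside, fill the predicted column, restore the header) can be followed on in-memory data. In this sketch the table, the file name and the stand-in prediction rule are hypothetical; the real method reads and writes CSV files and derives each prediction from the trained nerve.

```perl
use strict;
use warnings;

# hypothetical in-memory table standing in for the parsed CSV
my @table = (
    [ "book_name", "glossy_cover", "predicted" ],
    [ "Book A",    1,              "" ],
    [ "Book B",    0,              "" ],
);
my %options = ( stimuli_validate => "validate.csv", predicted_column_index => 2 );

# fall back to the input file when no separate output file is given
my $output_file = defined $options{ results_write_to }
    ? $options{ results_write_to }
    : $options{ stimuli_validate };

my $header = shift @table;    # set the header row aside
for my $row ( @table ) {
    # stand-in prediction: copies the glossy_cover flag into the predicted column
    $row->[ $options{ predicted_column_index } ] = $row->[1] ? 1 : 0;
}
unshift @table, $header;      # restore the header before saving

print "would save to $output_file\n";
print join( ",", @$_ ), "\n" for @table;
```

Because the rows are modified in place and never reordered, the sequence of the data is preserved.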
=head2 &_fill_predicted_values ( $self, $stimuli_validate, $predicted_index, $aoa )
This is where the filling in of the predicted values takes place. Take note that the parameter names are the same as the ones used in the C<validate> and C<test> methods.
This subroutine should be called in the procedural way.
=cut
sub _fill_predicted_values {
my ( $self, $stimuli_validate, $predicted_index, $aoa ) = @_;
# CSV processing is all according to the documentation of Text::CSV
open my $data_fh, "<:encoding(UTF-8)", $stimuli_validate
or croak "Can't open $stimuli_validate: $!";
my $csv = Text::CSV->new( {auto_diag => 1, binary => 1} );
my $attrib = $csv->getline($data_fh);
$csv->column_names( $attrib );
# individual row
my $row = 0;
while ( my $data = $csv->getline_hr($data_fh) ) {
t/06-validate.t view on Meta::CPAN
use AI::Perceptron::Simple;
use FindBin;
# TRAINING_DATA & VALIDATION_DATA have the same contents. Don't do this in the real world:
# use different sets of data for training and validating the nerve. The same goes for testing data.
# I'm doing this only to make sure the nerve is working correctly
use constant TRAINING_DATA => $FindBin::Bin . "/book_list_train.csv";
use constant VALIDATION_DATA => $FindBin::Bin . "/book_list_validate.csv";
use constant VALIDATION_DATA_OUTPUT_FILE => $FindBin::Bin . "/book_list_validate-filled.csv";
use constant MODULE_NAME => "AI::Perceptron::Simple";
use constant WANT_STATS => 1;
use constant IDENTIFIER => "book_name";
# 36 headers
my @attributes = qw (
glossy_cover has_plastic_layer_on_cover male_present female_present total_people_1 total_people_2 total_people_3
total_people_4 total_people_5_n_above has_flowers flower_coverage_more_than_half has_leaves leaves_coverage_more_than_half has_trees
trees_coverage_more_than_half has_other_living_things has_fancy_stuff has_obvious_inanimate_objects red_shades blue_shades yellow_shades
orange_shades green_shades purple_shades brown_shades black_shades overall_red_dominant overall_green_dominant
t/06-validate.t view on Meta::CPAN
for ( 0..5 ) {
print "Round $_\n";
$perceptron->train( TRAINING_DATA, "brand", $nerve_file, WANT_STATS, IDENTIFIER );
print "\n";
}
#print Dumper($perceptron), "\n";
# write back to original file
my $ori_file_size = -s VALIDATION_DATA;
stdout_like {
ok ( $perceptron->validate( {
stimuli_validate => VALIDATION_DATA,
predicted_column_index => 4,
} ),
"Validate succeeded!" );
} qr/book_list_validate\.csv/, "Correct output for validate when saving file";
# with new output file
stdout_like {
ok ( $perceptron->validate( {
stimuli_validate => VALIDATION_DATA,
predicted_column_index => 4,
results_write_to => VALIDATION_DATA_OUTPUT_FILE
} ),
"Validate succedded!" );
} qr/book_list_validate\-filled\.csv/, "Correct output for validate when saving to NEW file";
ok( -e VALIDATION_DATA_OUTPUT_FILE, "New validation file found" );
isnt( -s VALIDATION_DATA_OUTPUT_FILE, 0, "New output file is not empty" );
done_testing;
# besiyata d'shmaya
t/06-validate_synonyms_lab.t view on Meta::CPAN
use AI::Perceptron::Simple;
use FindBin;
# TRAINING_DATA & VALIDATION_DATA have the same contents. Don't do this in the real world:
# use different sets of data for training and validating the nerve. The same goes for testing data.
# I'm doing this only to make sure the nerve is working correctly
use constant TRAINING_DATA => $FindBin::Bin . "/book_list_train.csv";
use constant VALIDATION_DATA => $FindBin::Bin . "/book_list_validate.csv";
use constant VALIDATION_DATA_OUTPUT_FILE => $FindBin::Bin . "/book_list_validate_lab-filled.csv";
use constant MODULE_NAME => "AI::Perceptron::Simple";
use constant WANT_STATS => 1;
use constant IDENTIFIER => "book_name";
# 36 headers
my @attributes = qw (
glossy_cover has_plastic_layer_on_cover male_present female_present total_people_1 total_people_2 total_people_3
total_people_4 total_people_5_n_above has_flowers flower_coverage_more_than_half has_leaves leaves_coverage_more_than_half has_trees
trees_coverage_more_than_half has_other_living_things has_fancy_stuff has_obvious_inanimate_objects red_shades blue_shades yellow_shades
orange_shades green_shades purple_shades brown_shades black_shades overall_red_dominant overall_green_dominant
t/06-validate_synonyms_lab.t view on Meta::CPAN
print "Round $_\n";
$perceptron->train( TRAINING_DATA, "brand", $nerve_file, WANT_STATS, IDENTIFIER );
print "\n";
}
#print Dumper($perceptron), "\n";
# write back to original file
my $ori_file_size = -s VALIDATION_DATA;
stdout_like {
ok ( $perceptron->take_lab_test( {
stimuli_validate => VALIDATION_DATA,
predicted_column_index => 4,
} ),
"Validate succedded!" );
} qr/book_list_validate\.csv/, "Correct output for take_lab_test when saving file";
# with new output file
stdout_like {
ok ( $perceptron->take_lab_test( {
stimuli_validate => VALIDATION_DATA,
predicted_column_index => 4,
results_write_to => VALIDATION_DATA_OUTPUT_FILE
} ),
"Validate succedded!" );
} qr/book_list_validate_lab\-filled\.csv/, "Correct output for take_lab_test when saving to NEW file";
ok( -e VALIDATION_DATA_OUTPUT_FILE, "New validation file found" );
isnt( -s VALIDATION_DATA_OUTPUT_FILE, 0, "New output file is not empty" );
done_testing;
# besiyata d'shmaya
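The stdout_like assertions above wrap a call in a block and match whatever it prints against a regex. As a rough sketch of that idea using only core modules, the snippet below captures output sent to the default handle and checks it with like. Both save_predictions and capture_stdout are hypothetical helpers written for illustration; they are not part of AI::Perceptron::Simple, and the printed message is an assumed format.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical stand-in for a method that prints the file it saved to.
# The message format is an assumption, not the module's real output.
sub save_predictions {
    my ($file) = @_;
    print "Saved data into $file\n";
    return 1;
}

# Run $code and return everything it printed to the default output handle.
# Output sent explicitly to another handle (e.g. STDERR) is not captured.
sub capture_stdout {
    my ($code) = @_;
    my $buffer = '';
    open my $fh, '>', \$buffer or die "Cannot open in-memory handle: $!";
    my $previous = select $fh;    # make $fh the default handle for print
    $code->();
    select $previous;             # restore the original default handle
    close $fh;
    return $buffer;
}

my $result;
my $output = capture_stdout( sub { $result = save_predictions("book_list_validate.csv") } );

ok( $result, "Call returned a true value" );
like( $output, qr/book_list_validate\.csv/, "Output names the file that was written" );

done_testing;
```

Escaping the dot in the regex matters here: an unescaped `.` would also match any other character in that position of the filename.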
t/06-validate_synonyms_mock.t
use AI::Perceptron::Simple;
use FindBin;
# TRAINING_DATA and VALIDATION_DATA have the same contents here. In the real world, don't do this:
# use different data sets for training and for validating the nerve. The same goes for the testing data.
# I'm doing this only to make sure the nerve is working correctly.
use constant TRAINING_DATA => $FindBin::Bin . "/book_list_train.csv";
use constant VALIDATION_DATA => $FindBin::Bin . "/book_list_validate.csv";
use constant VALIDATION_DATA_OUTPUT_FILE => $FindBin::Bin . "/book_list_validate_mock-filled.csv";
use constant MODULE_NAME => "AI::Perceptron::Simple";
use constant WANT_STATS => 1;
use constant IDENTIFIER => "book_name";
# 36 headers
my @attributes = qw (
glossy_cover has_plastic_layer_on_cover male_present female_present total_people_1 total_people_2 total_people_3
total_people_4 total_people_5_n_above has_flowers flower_coverage_more_than_half has_leaves leaves_coverage_more_than_half has_trees
trees_coverage_more_than_half has_other_living_things has_fancy_stuff has_obvious_inanimate_objects red_shades blue_shades yellow_shades
orange_shades green_shades purple_shades brown_shades black_shades overall_red_dominant overall_green_dominant
t/06-validate_synonyms_mock.t
print "Round $_\n";
$perceptron->train( TRAINING_DATA, "brand", $nerve_file, WANT_STATS, IDENTIFIER );
print "\n";
}
#print Dumper($perceptron), "\n";
# write back to original file
my $ori_file_size = -s VALIDATION_DATA;
stdout_like {
ok ( $perceptron->take_mock_exam( {
stimuli_validate => VALIDATION_DATA,
predicted_column_index => 4,
} ),
"Validation succeeded!" );
} qr/book_list_validate\.csv/, "Correct output for take_mock_exam when saving file";
# with new output file
stdout_like {
ok ( $perceptron->take_mock_exam( {
stimuli_validate => VALIDATION_DATA,
predicted_column_index => 4,
results_write_to => VALIDATION_DATA_OUTPUT_FILE
} ),
"Validation succeeded!" );
} qr/book_list_validate_mock\-filled\.csv/, "Correct output for take_mock_exam when saving to NEW file";
ok( -e VALIDATION_DATA_OUTPUT_FILE, "New validation file found" );
isnt( -s VALIDATION_DATA_OUTPUT_FILE, 0, "New output file is not empty" );
done_testing;
# besiyata d'shmaya
t/10-test.t
use constant IDENTIFIER => "book_name";
my $nerve_file = $FindBin::Bin . "/perceptron_1.nerve";
ok( -s $nerve_file, "Found nerve file to load" );
my $mature_nerve = AI::Perceptron::Simple::load_perceptron( $nerve_file );
# write to original file
stdout_like {
ok ( $mature_nerve->test( {
stimuli_validate => TEST_DATA,
predicted_column_index => 4,
} ),
"Testing stage succeeded!" );
} qr/book_list_test\.csv/, "Correct output for testing when saving back to original file";
# with new output file
stdout_like {
ok ( $mature_nerve->test( {
stimuli_validate => TEST_DATA,
predicted_column_index => 4,
results_write_to => TEST_DATA_NEW_FILE
} ),
"Testing stage succeeded!" );
} qr/book_list_test\-filled\.csv/, "Correct output for testing when saving to NEW file";
ok( -e TEST_DATA_NEW_FILE, "New testing file found" );
isnt( -s TEST_DATA_NEW_FILE, 0, "New output file is not empty" );
t/10-test_synonyms_exam.t
use constant IDENTIFIER => "book_name";
my $nerve_file = $FindBin::Bin . "/perceptron_1.nerve";
ok( -s $nerve_file, "Found nerve file to load" );
my $mature_nerve = AI::Perceptron::Simple::load_perceptron( $nerve_file );
# write to original file
stdout_like {
ok ( $mature_nerve->take_real_exam( {
stimuli_validate => TEST_DATA,
predicted_column_index => 4,
} ),
"Testing stage succeeded!" );
} qr/book_list_test\.csv/, "Correct output for testing when saving back to original file";
# with new output file
stdout_like {
ok ( $mature_nerve->take_real_exam( {
stimuli_validate => TEST_DATA,
predicted_column_index => 4,
results_write_to => TEST_DATA_NEW_FILE
} ),
"Testing stage succeeded!" );
} qr/book_list_test_exam\-filled\.csv/, "Correct output for testing when saving to NEW file";
ok( -e TEST_DATA_NEW_FILE, "New testing file found" );
isnt( -s TEST_DATA_NEW_FILE, 0, "New output file is not empty" );
t/10-test_synonyms_work.t
use constant IDENTIFIER => "book_name";
my $nerve_file = $FindBin::Bin . "/perceptron_1.nerve";
ok( -s $nerve_file, "Found nerve file to load" );
my $mature_nerve = AI::Perceptron::Simple::load_perceptron( $nerve_file );
# write to original file
stdout_like {
ok ( $mature_nerve->work_in_real_world( {
stimuli_validate => TEST_DATA,
predicted_column_index => 4,
} ),
"Testing stage succeeded!" );
} qr/book_list_test\.csv/, "Correct output for testing when saving back to original file";
# with new output file
stdout_like {
ok ( $mature_nerve->work_in_real_world( {
stimuli_validate => TEST_DATA,
predicted_column_index => 4,
results_write_to => TEST_DATA_NEW_FILE
} ),
"Testing stage succeeded!" );
} qr/book_list_test_work\-filled\.csv/, "Correct output for testing when saving to NEW file";
ok( -e TEST_DATA_NEW_FILE, "New testing file found" );
isnt( -s TEST_DATA_NEW_FILE, 0, "New output file is not empty" );