Version 1.04 17 SEPTEMBER 2021
* fixed some critical problems
* yml nerve not loading back as an AI::Perceptron::Simple object
* fix docs: missing parameter $nerve for:
* save_perceptron
* save_perceptron_yaml
* changed die to croak for file opening
Version 1.03 9 SEPTEMBER 2021
* data processing subroutine available:
* shuffle data
* added import tag ":process_data"
* added more useful data to the confusion matrix:
* sum of columns and rows to make it look more classic :)
* more_stats option to show more stats:
* precision, specificity, F1 score, negative predictive value, false negative rate, false positive rate
* false discovery rate, false omission rate, balanced accuracy
Version 1.02 26 AUGUST 2021
* minimum perl version changed to 5.8.1 due to YAML
* fix test for display_confusion_matrix
* modifier "n" ( >= perl 5.22 ) changed to primitive '?:', 5.22 is too high
* fixed inaccurate test for output
* YAML (nerve file) for portability supported
* made subroutines exportable, as the fully qualified names are too long
* :local_data
* :portable_data
* cleaned & refactored codes
* &display_confusion_matrix
* improved the documentation
Version 1.01 24 AUGUST 2021
* Fixed some technical issues
* fixed test scripts not running in the correct sequence
* must be creation -> train -> validate/test
* local::lib issue should be fixed by now
Version 1.00 23 AUGUST 2021
* The following features were implemented over time (see also My::Perceptron v0.04 on GitHub):
* create perceptron
* process data: &train method
* read csv - for training stage
* save and load the perceptron
* output algorithm for train
* read and calculate data line by line
* validate method
* read csv bulk
* write predicted values into original file
* write predicted values into new file
* test method
* read csv bulk
* write predicted values into original file
* write predicted values into new file
* confusion matrix
* read only expected and predicted columns, line by line
* return a hash of data
* TP, TN, FP, FN
* total entries
* accuracy
* sensitivity
* display confusion matrix data to console
* use Text::Matrix
* synonyms
* synonyms MUST call the actual subroutines, not copy-pasted code!
* train: tame, exercise
* validate: take_mock_exam, take_lab_test
* test: take_real_exam, work_in_real_world
* generate_confusion_matrix: get_exam_results
* display_confusion_matrix: display_exam_results
* save_perceptron: preserve
t/04-train_synonyms_exercise.t
t/04-train_synonyms_tame.t
t/06-validate.t
t/06-validate_synonyms_lab.t
t/06-validate_synonyms_mock.t
t/08-confusion_matrix.t
t/08-confusion_matrix_synonyms.t
t/10-test.t
t/10-test_synonyms_exam.t
t/10-test_synonyms_work.t
t/12-shuffle_data.t
t/12-shuffle_data_synonym.t
t/book_list_test-filled-non-binary.csv
t/book_list_test-filled.csv
t/book_list_test.csv
t/book_list_test_exam-filled.csv
t/book_list_test_work-filled.csv
t/book_list_to_shuffle.csv
t/book_list_train.csv
t/book_list_validate-filled.csv
t/book_list_validate.csv
t/book_list_validate_lab-filled.csv
docs/AI-Perceptron-Simple-1.04.html
* VERSION
* SYNOPSIS
* EXPORT
* DESCRIPTION
* CONVENTIONS USED
* DATASET STRUCTURE
* PERCEPTRON DATA
* DATA PROCESSING RELATED SUBROUTINES
    * shuffle_stimuli ( ... )
    * shuffle_data ( $original_data => $shuffled_1, $shuffled_2, ... )
    * shuffle_data ( ORIGINAL_DATA, $shuffled_1, $shuffled_2, ... )
* CREATION RELATED SUBROUTINES/METHODS
    * new ( \%options )
    * get_attributes
    * learning_rate ( $value )
    * learning_rate
    * threshold ( $value )
    * threshold
lib/AI/Perceptron/Simple.pm
$nerve = AI::Perceptron::Simple->new( {
initial_value => $size_of_each_dendrite,
learning_rate => 0.3, # optional
threshold => 0.85, # optional
attribs => \@dendrites,
} );
# train
$nerve->tame( ... );
$nerve->exercise( ... );
$nerve->train( $training_data_csv, $expected_column_name, $save_nerve_to );
# or
$nerve->train(
$training_data_csv, $expected_column_name, $save_nerve_to,
$show_progress, $identifier); # these two parameters must go together
# validate
$nerve->take_lab_test( ... );
$nerve->take_mock_exam( ... );
# fill results to original file
$nerve->validate( {
stimuli_validate => $validation_data_csv,
predicted_column_index => 4,
} );
# or
# fill results to a new file
$nerve->validate( {
stimuli_validate => $validation_data_csv,
predicted_column_index => 4,
results_write_to => $new_csv
} );
# test - see "validate" method, same usage
$nerve->take_real_exam( ... );
$nerve->work_in_real_world( ... );
$nerve->test( ... );
# confusion matrix
my %c_matrix = $nerve->get_confusion_matrix( {
full_data_file => $file_csv,
actual_output_header => $header_name,
predicted_output_header => $predicted_header_name,
more_stats => 1, # optional
} );
# accessing the confusion matrix
my @keys = qw( true_positive true_negative false_positive false_negative
total_entries accuracy sensitivity );
for ( @keys ) {
print $_, " => ", $c_matrix{ $_ }, "\n";
}
# output to console
$nerve->display_confusion_matrix( \%c_matrix, {
zero_as => "bad apples", # cat milk green etc.
one_as => "good apples", # dog honey pink etc.
} );
# saving and loading data of perceptron locally
# NOTE: nerve data is automatically saved after each training process
use AI::Perceptron::Simple ":local_data";
my $nerve_file = "apples.nerve";
preserve( ... );
save_perceptron( $nerve, $nerve_file );
# load data of perceptron for use in the actual program
my $apple_nerve = revive( ... );
my $apple_nerve = load_perceptron( $nerve_file );
# for portability of nerve data
use AI::Perceptron::Simple ":portable_data";
my $yaml_nerve_file = "pearls.yaml";
preserve_as_yaml ( ... );
save_perceptron_yaml ( $nerve, $yaml_nerve_file );
# load nerve data on the other computer
my $pearl_nerve = revive_from_yaml ( ... );
my $pearl_nerve = load_perceptron_yaml ( $yaml_nerve_file );
# processing data
use AI::Perceptron::Simple ":process_data";
shuffle_stimuli ( ... )
shuffle_data ( ORIGINAL_STIMULI, $new_file_1, $new_file_2, ... );
shuffle_data ( $original_stimuli => $new_file_1, $new_file_2, ... );
=head1 EXPORT
None by default.
All the subroutines from C<DATA PROCESSING RELATED SUBROUTINES>, C<NERVE DATA RELATED SUBROUTINES> and C<NERVE PORTABILITY RELATED SUBROUTINES> sections are importable through tags or manually specifying them.
The tags available include the following:
=over 4
=item C<:process_data> - subroutines under C<DATA PROCESSING RELATED SUBROUTINES> section.
=item C<:local_data> - subroutines under C<NERVE DATA RELATED SUBROUTINES> section.
=item C<:portable_data> - subroutines under C<NERVE PORTABILITY RELATED SUBROUTINES> section.
=back
Most of the other functionality is OO.
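For example, a single subroutine can be imported by name, or a whole group through its tag (a minimal sketch, assuming the module is installed):

    # import one helper manually ...
    use AI::Perceptron::Simple "shuffle_data";

    # ... or pull in a whole group through its tag
    use AI::Perceptron::Simple ":portable_data";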
=cut
use Exporter qw( import );
our @EXPORT_OK = qw(
shuffle_data shuffle_stimuli
preserve save_perceptron revive load_perceptron
preserve_as_yaml save_perceptron_yaml revive_from_yaml load_perceptron_yaml
);
our %EXPORT_TAGS = (
process_data => [ qw( shuffle_data shuffle_stimuli ) ],
local_data => [ qw( preserve save_perceptron revive load_perceptron ) ],
portable_data => [ qw( preserve_as_yaml save_perceptron_yaml revive_from_yaml load_perceptron_yaml ) ],
);
=head1 DESCRIPTION
This module provides methods to build, train, validate and test a perceptron. It can also save the perceptron's data for future use in any actual AI program.
This module also aims to help newcomers grasp the concepts of perceptron, training, validation and testing as much as possible. Hence, all the methods and subroutines in this module are decoupled as much as possible so that the actual script...
The implementation here is very basic: it simply takes the input from the dendrites and calculates the output. If the output is higher than the threshold, the final result (category) will be 1, i.e. the perceptron is activated. If not, the result will be 0 (not activated).
Depending on how you view or categorize the final result, the perceptron will fine-tune itself (i.e. train) based on the learning rate until the desired result is met. Everything from here on is mathematics and numbers which only make sense to the computer and not to humans anymore.
Whenever the perceptron fine-tunes itself, it will increase or decrease all the dendrites that are significant (attributes labelled 1) for each input. This means that even when the perceptron successfully fine-tunes itself to suit all the data in your file for the first round, it might still get some things wrong in the next round of training.
Therefore, the perceptron should be trained for as many rounds as possible. The more "confusion" the perceptron is able to handle correctly, the more "mature" the perceptron is.
No one defines how "mature" it is except the programmer himself/herself :)
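As a rough illustration of the idea above (illustrative pseudo-Perl with made-up variable names, not the module's internal code), one training pass over a single row of data might look like this:

    # %weights : attribute name => dendrite weight
    # %row     : one row of data, binary attribute values ( 0 or 1 )
    my $sum = 0;
    $sum += $weights{ $_ } * $row{ $_ } for keys %weights;

    my $output = $sum >= $threshold ? 1 : 0;    # activated or not

    if ( $output != $expected ) {
        # nudge only the significant dendrites ( attributes labelled 1 )
        for my $attrib ( grep { $row{ $_ } == 1 } keys %weights ) {
            $weights{ $attrib } += $expected == 1 ? $learning_rate : -$learning_rate;
        }
    }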
=head1 CONVENTIONS USED
Please take note that not every subroutine/method must be used to make things work; they are all listed here for the sake of complete documentation.
Private methods/subroutines are prefixed with C<_> or C<&_> and they aren't meant to be called directly. You can if you want to. There are quite a number of them, to be honest; just ignore them if you happen to see them :)
Synonyms are documented before the actual, i.e. technical, subroutines/methods. You will see C<...> as the parameters if they are synonyms. Move on to the next subroutine/method until you find something like C<\%options> as the parameter or anything that isn't...
=head1 DATASET STRUCTURE
I<This module can only process CSV files.>
Any field, i.e. column, that will be used for processing must be binary, i.e. C<0> or C<1> only. Your dataset can contain other columns with non-binary data as long as they are not one of the dendrites.
There are some sample datasets in the C<t> directory. The original dataset can also be found in C<docs/book_list.csv>. The files can also be found L<here|https://github.com/Ellednera/AI-Perceptron-Simple>.
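For illustration only (these column names are made up and do not match the bundled files), a valid dataset could look like this, with a non-binary identifier column, binary dendrite columns and a binary expected-output column:

    book_name,has_hard_cover,has_pictures,is_thick,brand
    "Some Book",1,0,1,1
    "Another Book",0,1,0,0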
=head1 PERCEPTRON DATA
The perceptron/neuron data is stored using the C<Storable> module.
See C<Portability of Nerve Data> section below for more info on some known issues.
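A quick sketch of saving and reviving a nerve on the same machine, following the SYNOPSIS (the file name is a placeholder):

    use AI::Perceptron::Simple ":local_data";

    save_perceptron( $nerve, "apples.nerve" );               # Storable-based
    my $revived_nerve = load_perceptron( "apples.nerve" );   # gives back the blessed object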
=head1 DATA PROCESSING RELATED SUBROUTINES
These subroutines can be imported using the tag C<:process_data>.
These subroutines should be called in the procedural way.
=head2 shuffle_stimuli ( ... )
The parameters and usage are the same as C<shuffle_data>. See the next two subroutines.
=head2 shuffle_data ( $original_data => $shuffled_1, $shuffled_2, ... )
=head2 shuffle_data ( ORIGINAL_DATA, $shuffled_1, $shuffled_2, ... )
Shuffles C<$original_data> or C<ORIGINAL_DATA> and saves them to other files.
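A short usage sketch (file names are placeholders); each output file receives its own independently shuffled copy of the original rows, as the code below shows:

    use AI::Perceptron::Simple ":process_data";

    shuffle_data( "book_list.csv" => "shuffled_1.csv", "shuffled_2.csv", "shuffled_3.csv" );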
=cut
sub shuffle_stimuli {
shuffle_data( @_ );
}
sub shuffle_data {
my $stimuli = shift or croak "Please specify the original file name";
my @shuffled_stimuli_names = @_
or croak "Please specify the output files for the shuffled data";
my @aoa;
for ( @shuffled_stimuli_names ) {
# copied from _real_validate_or_test
# open for shuffling
my $aoa = csv (in => $stimuli, encoding => ":encoding(utf-8)");
my $attrib_array_ref = shift @$aoa; # 'remove' the header, it's annoying :)
@aoa = shuffle( @$aoa ); # this can only process actual array
unshift @aoa, $attrib_array_ref; # put back the headers before saving file
csv( in => \@aoa, out => $_, encoding => ":encoding(utf-8)" )
and
print "Saved shuffled data into ", basename($_), "!\n";
}
}
=head1 CREATION RELATED SUBROUTINES/METHODS
=head2 new ( \%options )
Creates a brand new perceptron and initializes the value of each attribute / dendrite aka. weight. Think of it as the thickness or plasticity of the dendrites.
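A minimal creation sketch following the SYNOPSIS (the attribute names below are taken from the test scripts):

    my @attributes = qw( has_trees trees_coverage_more_than_half has_other_living_things );

    my $nerve = AI::Perceptron::Simple->new( {
        initial_value => 0.01,         # starting weight of every dendrite
        attribs       => \@attributes,
        learning_rate => 0.3,          # optional
        threshold     => 0.85,         # optional
    } );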
t/02-state_portable.t
#!/usr/bin/perl
use strict;
use warnings;
use Test::More;
use AI::Perceptron::Simple qw( :portable_data );
# for the :local_data test, see 04-train.t; 02-state_synonyms.t utilizes the full invocation
use FindBin;
use constant MODULE_NAME => "AI::Perceptron::Simple";
my @attributes = qw ( has_trees trees_coverage_more_than_half has_other_living_things );
my $total_headers = scalar @attributes;
my $perceptron = AI::Perceptron::Simple->new( {
initial_value => 0.01,
attribs => \@attributes
} );
subtest "All data related subroutines found" => sub {
# this only checks if the subroutines are contained in the package
ok( AI::Perceptron::Simple->can("preserve_as_yaml"), "&preserve_as_yaml is present" );
ok( AI::Perceptron::Simple->can("save_perceptron_yaml"), "&save_perceptron_yaml is present" );
ok( AI::Perceptron::Simple->can("revive_from_yaml"), "&revive_from_yaml is present" );
ok( AI::Perceptron::Simple->can("load_perceptron_yaml"), "&load_perceptron_yaml is present" );
};
my $yaml_nerve_file = $FindBin::Bin . "/portable_nerve.yaml";
# save file
save_perceptron_yaml( $perceptron, $yaml_nerve_file );
ok( -e $yaml_nerve_file, "Found the YAML perceptron." );
# load and check
ok( my $transfered_nerve = load_perceptron_yaml( $yaml_nerve_file ), "&load_perceptron_yaml is working" );
is_deeply( $transfered_nerve, $perceptron, "&load_perceptron_yaml - correct data after loading" );
is ( ref ($transfered_nerve), "AI::Perceptron::Simple", "Loaded back as a blessed object" );
# test synonyms
AI::Perceptron::Simple::preserve_as_yaml( $perceptron, $yaml_nerve_file );
ok( -e $yaml_nerve_file, "Synonym - Found the YAML perceptron." );
ok( $transfered_nerve = AI::Perceptron::Simple::revive_from_yaml( $yaml_nerve_file ), "&revive_from_yaml is working correctly" );
is_deeply( $transfered_nerve, $perceptron, "&revive_from_yaml - correct data after loading" );
is ( ref ($transfered_nerve), "AI::Perceptron::Simple", "Loaded back as a blessed object" );
done_testing();
# besiyata d'shmaya
t/04-train.t
#!/usr/bin/perl
use strict;
use warnings;
use Test::More;
use AI::Perceptron::Simple qw( :local_data );
# pwd is the actual .pm module in blib
# ie. My-Perceptron/blib/lib/My/Perceptron.pm
use FindBin;
use constant TRAINING_DATA => $FindBin::Bin . "/book_list_train.csv";
use constant MODULE_NAME => "AI::Perceptron::Simple";
use constant WANT_STATS => 1;
use constant IDENTIFIER => "book_name";
# 36 headers
t/06-validate.t
use strict;
use warnings;
use Test::More;
use Test::Output;
use AI::Perceptron::Simple;
use FindBin;
# TRAINING_DATA & VALIDATION_DATA have the same contents, in real world, don't do this
# use different sets of data for training and validating the nerve. Same goes to testing data.
# I'm doing this only to make sure the nerve is working correctly
use constant TRAINING_DATA => $FindBin::Bin . "/book_list_train.csv";
use constant VALIDATION_DATA => $FindBin::Bin . "/book_list_validate.csv";
use constant VALIDATION_DATA_OUTPUT_FILE => $FindBin::Bin . "/book_list_validate-filled.csv";
use constant MODULE_NAME => "AI::Perceptron::Simple";
use constant WANT_STATS => 1;
use constant IDENTIFIER => "book_name";
t/06-validate_synonyms_lab.t
use strict;
use warnings;
use Test::More;
use Test::Output;
use AI::Perceptron::Simple;
use FindBin;
# TRAINING_DATA & VALIDATION_DATA have the same contents, in real world, don't do this
# use different sets of data for training and validating the nerve. Same goes to testing data.
# I'm doing this only to make sure the nerve is working correctly
use constant TRAINING_DATA => $FindBin::Bin . "/book_list_train.csv";
use constant VALIDATION_DATA => $FindBin::Bin . "/book_list_validate.csv";
use constant VALIDATION_DATA_OUTPUT_FILE => $FindBin::Bin . "/book_list_validate_lab-filled.csv";
use constant MODULE_NAME => "AI::Perceptron::Simple";
use constant WANT_STATS => 1;
use constant IDENTIFIER => "book_name";
t/06-validate_synonyms_mock.t
use strict;
use warnings;
use Test::More;
use Test::Output;
use AI::Perceptron::Simple;
use FindBin;
# TRAINING_DATA & VALIDATION_DATA have the same contents, in real world, don't do this
# use different sets of data for training and validating the nerve. Same goes to testing data.
# I'm doing this only to make sure the nerve is working correctly
use constant TRAINING_DATA => $FindBin::Bin . "/book_list_train.csv";
use constant VALIDATION_DATA => $FindBin::Bin . "/book_list_validate.csv";
use constant VALIDATION_DATA_OUTPUT_FILE => $FindBin::Bin . "/book_list_validate_mock-filled.csv";
use constant MODULE_NAME => "AI::Perceptron::Simple";
use constant WANT_STATS => 1;
use constant IDENTIFIER => "book_name";
t/08-confusion_matrix.t
use FindBin;
use constant TEST_FILE => $FindBin::Bin . "/book_list_test-filled.csv";
use constant NON_BINARY_FILE => $FindBin::Bin . "/book_list_test-filled-non-binary.csv";
my $nerve_file = $FindBin::Bin . "/perceptron_1.nerve";
my $perceptron = AI::Perceptron::Simple::load_perceptron( $nerve_file );
ok ( my %c_matrix = $perceptron->get_confusion_matrix( {
full_data_file => TEST_FILE,
actual_output_header => "brand",
predicted_output_header => "predicted",
} ),
"get_confusion_matrix method is working");
is ( ref \%c_matrix, ref {}, "Confusion matrix in correct data structure" );
is ( $c_matrix{ true_positive }, 2, "Correct true_positive" );
is ( $c_matrix{ true_negative }, 4, "Correct true_negative" );
is ( $c_matrix{ false_positive }, 1, "Correct false_positive" );
is ( $c_matrix{ false_negative }, 3, "Correct false_negative" );
is ( $c_matrix{ total_entries }, 10, "Total entries is correct" );
ok ( AI::Perceptron::Simple::_calculate_total_entries( \%c_matrix ),
"Testing the 'untestable' &_calculate_total_entries" );
is ( $c_matrix{ total_entries }, 10, "'illegal' calculation of total entries is correct" );
t/08-confusion_matrix_synonyms.t
use FindBin;
use constant TEST_FILE => $FindBin::Bin . "/book_list_test-filled.csv";
use constant NON_BINARY_FILE => $FindBin::Bin . "/book_list_test-filled-non-binary.csv";
my $nerve_file = $FindBin::Bin . "/perceptron_1.nerve";
my $perceptron = AI::Perceptron::Simple::load_perceptron( $nerve_file );
ok ( my %c_matrix = $perceptron->get_exam_results( {
full_data_file => TEST_FILE,
actual_output_header => "brand",
predicted_output_header => "predicted",
} ),
"get_exam_results method is working");
is ( ref \%c_matrix, ref {}, "Confusion matrix in correct data structure" );
is ( $c_matrix{ true_positive }, 2, "Correct true_positive" );
is ( $c_matrix{ true_negative }, 4, "Correct true_negative" );
is ( $c_matrix{ false_positive }, 1, "Correct false_positive" );
is ( $c_matrix{ false_negative }, 3, "Correct false_negative" );
is ( $c_matrix{ total_entries }, 10, "Total entries is correct" );
ok ( AI::Perceptron::Simple::_calculate_total_entries( \%c_matrix ),
"Testing the 'untestable' &_calculate_total_entries" );
is ( $c_matrix{ total_entries }, 10, "'illegal' calculation of total entries is correct" );
t/12-shuffle_data.t
#!/usr/bin/perl
use strict;
use warnings;
use Test::More;
use Test::Output;
# use AI::Perceptron::Simple "shuffle_data";
use AI::Perceptron::Simple ":process_data";
use FindBin;
# NOTE: with the fat comma ( => ), the bareword ORIGINAL_STIMULI would be auto-quoted into the
# literal string "ORIGINAL_STIMULI" instead of expanding the constant, so a plain comma is used below
use constant ORIGINAL_STIMULI => $FindBin::Bin . "/book_list_to_shuffle.csv";
my $original_stimuli = $FindBin::Bin . "/book_list_to_shuffle.csv";
my $shuffled_data_1 = $FindBin::Bin . "/shuffled_1.csv";
my $shuffled_data_2 = $FindBin::Bin . "/shuffled_2.csv";
my $shuffled_data_3 = $FindBin::Bin . "/shuffled_3.csv";
ok( -e $original_stimuli, "Found the original file" );
{
local $@;
eval { shuffle_data };
like( $@, qr/^Please specify/, "Croaked at invocation without any arguments" )
}
{
local $@;
eval { shuffle_data($original_stimuli) };
like( $@, qr/output files/, "Croaked when new file names not present" )
}
shuffle_data( $original_stimuli => $shuffled_data_1, $shuffled_data_2, $shuffled_data_3 );
stdout_like {
shuffle_data( ORIGINAL_STIMULI, $shuffled_data_1, $shuffled_data_2, $shuffled_data_3 );
} qr/^Saved/, "Correct output after saving file";
ok( -e $shuffled_data_1, "Found the first shuffled file" );
ok( -e $shuffled_data_2, "Found the second shuffled file" );
ok( -e $shuffled_data_3, "Found the third shuffled file" );
done_testing();
# besiyata d'shmaya
t/12-shuffle_data_synonym.t
#!/usr/bin/perl
use strict;
use warnings;
use Test::More;
use Test::Output;
use AI::Perceptron::Simple "shuffle_stimuli";
#use AI::Perceptron::Simple ":process_data";
use FindBin;
# NOTE: with the fat comma ( => ), the bareword ORIGINAL_STIMULI would be auto-quoted into the
# literal string "ORIGINAL_STIMULI" instead of expanding the constant, so a plain comma is used below
use constant ORIGINAL_STIMULI => $FindBin::Bin . "/book_list_to_shuffle.csv";
my $original_stimuli = $FindBin::Bin . "/book_list_to_shuffle.csv";
my $shuffled_data_1 = $FindBin::Bin . "/shuffled_1.csv";
my $shuffled_data_2 = $FindBin::Bin . "/shuffled_2.csv";
my $shuffled_data_3 = $FindBin::Bin . "/shuffled_3.csv";
ok( -e $original_stimuli, "Found the original file" );
{
local $@;
eval { shuffle_stimuli };
like( $@, qr/^Please specify/, "Croaked at invocation without any arguments" )
}
{
local $@;
eval { shuffle_stimuli($original_stimuli) };
like( $@, qr/output files/, "Croaked when new file names not present" )
}
shuffle_stimuli( $original_stimuli => $shuffled_data_1, $shuffled_data_2, $shuffled_data_3 );
stdout_like {
shuffle_stimuli( ORIGINAL_STIMULI, $shuffled_data_1, $shuffled_data_2, $shuffled_data_3 );
} qr/^Saved/, "Correct output after saving file";
ok( -e $shuffled_data_1, "Found the first shuffled file" );
ok( -e $shuffled_data_2, "Found the second shuffled file" );
ok( -e $shuffled_data_3, "Found the third shuffled file" );
done_testing();
# besiyata d'shmaya