AI-NeuralNet-BackProp


BackProp.pm

			print $str;
			$x++;
			if($x>$break-1) {
				print "\n";
				$x=0;
			}
		}
		print "\n";
	}
	
	# Returns percentage difference between all elements of two
	# array refs of exact same length (in elements).
	# Now calculates actual difference in numerical value.
	sub pdiff {
		no strict 'refs';
		shift if(substr($_[0],0,4) eq 'AI::'); 
		my $a1	=	shift;
		my $a2	=	shift;
		my $a1s	=	$#{$a1}; #AI::NeuralNet::BackProp::_FETCHSIZE($a1);
		my $a2s	=	$#{$a2}; #AI::NeuralNet::BackProp::_FETCHSIZE($a2);
		my ($a,$b,$diff,$t);

BackProp.pm

			if($a!=$b) {
				if($a<$b){$t=$a;$a=$b;$b=$t;}
				$a=1 if(!$a);
				$diff+=(($a-$b)/$a)*100;
			}
		}
		$a1s = 1 if(!$a1s);
		return sprintf("%.10f",($diff/$a1s));
	}
	
	# Returns the absolute difference between $fa and $fb
	# as a percentage of $fa.
	sub p {
		shift if(substr($_[0],0,4) eq 'AI::'); 
		my ($fa,$fb)=(shift,shift);
		# abs($fb-$fa)/$fa, scaled to a percentage, to three decimal places
		sprintf("%.3f",((($fb-$fa)*((($fb-$fa)<0)?-1:1))/$fa)*100);
	}
	
	# This sub will take an array ref of a data set, which it expects in this format:
	#   my @data_set = (	[ ...inputs... ], [ ...outputs ... ],
	#				   				   ... rows ...
	#				   );
	#
	# This sub returns the percentage of 'forgetfulness' when the net learns all the
	# data in the set in order. Usage:
	#
	#	 learn_set(\@data,[ options ]);
	#
	# Options are options in hash form. They can be of any form that $net->learn takes.
	#
	# It returns a percentage string.
	#
	sub learn_set {
		my $self;	# declared separately: 'my $x = EXPR if COND' is undefined behaviour in Perl
		$self		=	shift if(substr($_[0],0,4) eq 'AI::'); 
		my $data	=	shift;
		my %args	=	@_;
		my $len		=	$#{$data}/2-1;
		my $inc		=	$args{inc};
		my $max		=	$args{max};
	    my $error	=	$args{error};
	    my $p		=	(defined $args{flag})	?$args{flag}	   :1;

BackProp.pm

			$res=$data->[$row]->[0]-$self->run($data->[$row-1])->[0];
		}
		return $res;
	}
	
	# This sub will take an array ref of a data set, which it expects in this format:
	#   my @data_set = (	[ ...inputs... ], [ ...outputs ... ],
	#				   				   ... rows ...
	#				   );
	#
	# This sub returns the percentage of 'forgetfulness' when the net learns all the
	# data in the set in RANDOM order. Usage:
	#
	#	 learn_set_rand(\@data,[ options ]);
	#
	# Options are options in hash form. They can be of any form that $net->learn takes.
	#
	# It returns a true value.
	#
	sub learn_set_rand {
		my $self;	# declared separately, as in learn_set() above
		$self		=	shift if(substr($_[0],0,4) eq 'AI::'); 

BackProp.pm

		my $rS		=	$self->{rS};
		my $r		=	$rB;#-$rA+1;
		return $in if(!$rA && !$rB);
		my $l		=	$self->{OUT}-1;
		my $out 	=	[];
		# Adjust for a maximum outside what we have seen so far
		for my $i (0..$l) {
			$rS=$in->[$i] if($in->[$i]>$rS);
		}
		#print "\$l:$l,\$rA:$rA,\$rB:$rB,\$rS:$rS,\$r:$r\n";
		# Loop through, convert each value to a percentage of the maximum, then
		# multiply that percentage by the range and add the range base to get the final value
		for my $i (0..$l) {
			#print "\$i:$i,\$in:$in->[$i]\n";
			$rS=1 if(!$rS);
			my $t=intr((($rS-$in->[$i])/$rS)*$r+$rA);
			#print "t:$t,$self->{rRef}->[$t],i:$i\n";
			$out->[$i] = $self->{rRef}->[$t];
		}
		$self->{rS}=$rS;
		return $out;
	}
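	# Worked trace of the mapping above, under assumed values (illustration only):
	#   suppose $rA=0 and $rB=5 (so $r=5), $rS=10 is the largest output seen so far,
	#   and $in->[$i]=7. Then $t = intr(((10-7)/10)*5 + 0) = intr(1.5) = 2, so the
	#   final value is $self->{rRef}->[2]. Note that larger raw outputs map to
	#   lower indices of the rRef table.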

BackProp.pm

UPDATED: You can now learn inputs with a 0 value. Beware though, it may not learn() a 0 value 
in the input map if you have randomness disabled. See NOTES on using a 0 value with randomness
disabled.

The first two arguments may be array refs (or now, strings), and they may be of different lengths.

Options should be passed in hash form. There are three options:
	 
	 inc	=>	$learning_gradient
	 max	=>	$maximum_iterations
	 error	=>	$maximum_allowable_percentage_of_error
	 

$learning_gradient is an optional value used to adjust the weights of the internal
connections. If $learning_gradient is omitted, it defaults to 0.20.
 
$maximum_iterations is the maximum number of iterations the loop should do.
It defaults to 1024.  Set it to 0 if you never want the loop to quit before
the pattern is perfectly learned.

$maximum_allowable_percentage_of_error is the maximum allowable error to have. If 
this is set, then learn() will return when the percentage difference between the
actual results and desired results falls below $maximum_allowable_percentage_of_error.
If you do not include 'error', or $maximum_allowable_percentage_of_error is set to -1,
then learn() will not return until it gets an exact match for the desired result OR it
reaches $maximum_iterations.
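
For instance, a minimal call using all three options might look like this (a sketch; the
network dimensions and option values are arbitrary):

	use AI::NeuralNet::BackProp;

	my $net = AI::NeuralNet::BackProp->new(1, 5, 5);	# 1 layer, 5 inputs, 5 outputs
	$net->learn(
		[ 0, 1, 1, 0, 1 ],	# input map
		[ 1, 0, 1, 0, 1 ],	# desired result
		inc   => 0.05,		# learning gradient
		max   => 2048,		# give up after 2048 iterations
		error => 5,		# or stop once within 5% of the desired results
	);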


=item $net->learn_set(\@set, [ options ]);

UPDATE: Inputs and outputs in the dataset can now be strings. See the information on
auto-crunching in learn().

This takes the same options as learn() and allows you to specify a set to learn, rather
than individual patterns. A dataset is an array reference with at least two elements in the
array, each element being another array reference (or now, a scalar string). For each pattern
to learn, the set holds one array ref of inputs followed by one array ref of outputs:

	my @set = (
		# inputs        outputs
		[ 1,2,3,4 ],  [ 1,3,5,6 ],
		[ 0,2,5,6 ],  [ 0,2,1,2 ]
	);
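
A call might then look like this (a sketch; the option values are illustrative):

	$net->learn_set(\@set, inc => 0.2, max => 4096);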


See the paragraph on measuring forgetfulness, below. There are 
two learn_set()-specific option tags available:

	flag     =>  $flag
	pattern  =>  $row

If "flag" is set to some TRUE value, as in "flag => 1" in the hash of options, or if the option "flag"
is not set, then it will return a percentage representing the amount of forgetfulness. Otherwise,
learn_set() will return an integer specifying the amount of forgetfulness when all the patterns 
are learned. 

If "pattern" is set, then learn_set() will use that pattern in the data set to measure forgetfulness by.
If "pattern" is omitted, it defaults to the first pattern in the set. Example:

	my @set = (
		[ 0,1,0,1 ],  [ 0 ],
		[ 0,0,1,0 ],  [ 1 ],
		[ 1,1,0,1 ],  [ 2 ],  #  <---
		[ 0,1,1,0 ],  [ 3 ]
	);

If you wish to measure forgetfulness as indicated by the line with the arrow, then you would
pass 2 as the "pattern" option, as in "pattern => 2".

Now why the heck would anyone want to measure forgetfulness, you ask? Maybe you wonder how I 
even measure that. Well, it is not a vital value that you have to know. I just put in a 
"forgetfulness measure" one day because I thought it would be neat to know. 

How the module measures forgetfulness is this: First, it learns all the patterns in the set provided,
then it will run the very first pattern (or whatever pattern is specified by the "pattern" option)
in the set after it has finished learning. It will compare the run() output with the desired output
as specified in the dataset. In a perfect world, the two should match exactly. What we measure is
how much they don't match, and that is the amount of forgetfulness the network has.
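
Equivalently, you could take the same measurement by hand with the methods documented here
(a sketch; @set is laid out in input/output pairs as above):

	$net->learn_set(\@set);			# learn every pattern, in order
	my $got    = $net->run($set[0]);	# re-run the measured input pattern
	my $wanted = $set[1];			# its desired output
	print $net->pdiff($got, $wanted);	# percent difference = forgetfulness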

NOTE: In version 0.77 percentages were disabled because of a bug. Percentages are now enabled.

Example (from examples/ex_dow.pl):

	# Data from 1989 (as far as I know..this is taken from example data on BrainMaker)
	my @data = ( 
		#	Mo  CPI  CPI-1 CPI-3 	Oil  Oil-1 Oil-3    Dow   Dow-1 Dow-3   Dow Ave (output)
		[	1, 	229, 220,  146, 	20.0, 21.9, 19.5, 	2645, 2652, 2597], 	[	2647  ],
		[	2, 	235, 226,  155, 	19.8, 20.0, 18.3, 	2633, 2645, 2585], 	[	2637  ],
		[	3, 	244, 235,  164, 	19.6, 19.8, 18.1, 	2627, 2633, 2579], 	[	2630  ],
		[	4, 	261, 244,  181, 	19.6, 19.6, 18.1, 	2611, 2627, 2563], 	[	2620  ],

BackProp.pm

Level 1 ($level = 1) : This causes ALL debugging information for the network to be dumped
as the network runs. In this mode, it is a good idea to pipe your STDIO to a file, especially
for large programs.

Level 2 ($level = 2) : A slightly-less verbose form of debugging, not as many internal 
data dumps.

Level 3 ($level = 3) : JUST prints weight mapping as weights change.

Level 4 ($level = 4) : JUST prints the benchmark info for EACH learn loop iteration, not just
learning as a whole. Also prints the percentage difference for each loop between current network
results and desired results, as well as the learning gradient ('increment').

Level 4 is useful for seeing if you need to give a smaller learning increment to learn().
I used level 4 debugging quite often in creating the letters.pl example script and the small_1.pl
example script.

Toggles debugging off when called with no arguments. 
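
A typical session might look like this (a sketch; the input and output maps are arbitrary):

	$net->debug(4);				# watch per-iteration error and gradient
	$net->learn([0,1,1,0], [1,0,0,1]);	# assumes a matching four-element network
	$net->debug();				# no arguments: toggles debugging off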



BackProp.pm

it will print the $high_state_character (can be more than one character) for every element that
has a true value, and the $low_state_character for every element that has a false value. 
If you do not supply a $high_state_character, or the $high_state_character is a null or empty or 
undefined string, then join_cols() will just print the numerical value of each element separated
by a null character (\0). join_cols() defaults to the latter behaviour.
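
For example, under the behaviour described above, this call (a sketch) prints a six-element
map as two rows of three:

	$net->join_cols([1,0,1,1,1,0], 3, '#', '.');
	# prints:
	# #.#
	# ##.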



=item $net->pdiff($array_ref_A, $array_ref_B);

This function is used VERY heavily internally to calculate the difference in percent
between elements of the two array refs passed. It returns a %.10f (sprintf-format) 
percent string.
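
A worked example, following the formula visible in the source above (and assuming the elided
loop walks every element; each difference is taken relative to the larger value, and the sum
is divided by the last index):

	my $diff = $net->pdiff([1,2,3], [1,2,4]);
	# element-wise: 0 + 0 + ((4-3)/4)*100 = 25
	# 25 / 2 = "12.5000000000"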


=item $net->p($a,$b);

Returns a floating point number which represents the absolute difference between $a and $b as a percentage of $a.
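
A worked example, per the sprintf("%.3f",...) formula in the source above:

	my $pct = $net->p(8, 10);
	# abs(10 - 8) / 8 * 100 = "25.000"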



=item $net->intr($float);

Rounds a floating-point number to an integer using sprintf() and int(). Provides
better rounding than just calling int() on the float. Also used very heavily internally.
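
For example (per the description above: int() truncates, while intr() rounds to the
nearest integer):

	print int(2.7);		# 2
	print $net->intr(2.7);	# 3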



docs.htm

<P>Note, the old method of calling crunch on the values still works just as well.</P>
<P><B>UPDATED:</B> You can now learn inputs with a 0 value. Beware though, it may not <A HREF="#item_learn"><CODE>learn()</CODE></A> a 0 value 
in the input map if you have randomness disabled. See NOTES on using a 0 value with randomness
disabled.</P>
<P>The first two arguments may be array refs (or now, strings), and they may be of different lengths.</P>
<P>Options should be passed in hash form. There are three options:
</P>
<PRE>
         inc    =&gt;      $learning_gradient
         max    =&gt;      $maximum_iterations
         error  =&gt;      $maximum_allowable_percentage_of_error</PRE>
<P>$learning_gradient is an optional value used to adjust the weights of the internal
connections. If $learning_gradient is omitted, it defaults to 0.20.
</P>
<P>
$maximum_iterations is the maximum number of iterations the loop should do.
It defaults to 1024.  Set it to 0 if you never want the loop to quit before
the pattern is perfectly learned.</P>
<P>$maximum_allowable_percentage_of_error is the maximum allowable error to have. If 
this is set, then <A HREF="#item_learn"><CODE>learn()</CODE></A> will return when the percentage difference between the
actual results and desired results falls below $maximum_allowable_percentage_of_error.
If you do not include 'error', or $maximum_allowable_percentage_of_error is set to -1,
then <A HREF="#item_learn"><CODE>learn()</CODE></A> will not return until it gets an exact match for the desired result OR it
reaches $maximum_iterations.</P>
<P></P>
<DT><STRONG><A NAME="item_learn_set">$net-&gt;learn_set(\@set, [ options ]);</A></STRONG><BR>
<DD>
<B>UPDATED:</B> Inputs and outputs in the dataset can now be strings. See information on auto-crunching
in <A HREF="#item_learn"><CODE>learn()</CODE></A>.
<P>This takes the same options as <A HREF="#item_learn"><CODE>learn()</CODE></A> and allows you to specify a set to learn, rather
than individual patterns. A dataset is an array reference with at least two elements in the
array, each element being another array reference (or now, a scalar string). For each pattern to

docs.htm

                # inputs        outputs
                [ 1,2,3,4 ],  [ 1,3,5,6 ],
                [ 0,2,5,6 ],  [ 0,2,1,2 ]
        );</PRE>
<P>See the paragraph on measuring forgetfulness, below. There are 
two learn_set()-specific option tags available:</P>
<PRE>
        flag     =&gt;  $flag
        pattern  =&gt;  $row</PRE>
<P>If ``flag'' is set to some TRUE value, as in ``flag =&gt; 1'' in the hash of options, or if the option ``flag''
is not set, then it will return a percentage representing the amount of forgetfulness. Otherwise,
<A HREF="#item_learn_set"><CODE>learn_set()</CODE></A> will return an integer specifying the amount of forgetfulness when all the patterns 
are learned.</P>
<P>If ``pattern'' is set, then <A HREF="#item_learn_set"><CODE>learn_set()</CODE></A> will use that pattern in the data set to measure forgetfulness by.
If ``pattern'' is omitted, it defaults to the first pattern in the set. Example:</P>
<PRE>
        my @set = (
                [ 0,1,0,1 ],  [ 0 ],
                [ 0,0,1,0 ],  [ 1 ],
                [ 1,1,0,1 ],  [ 2 ],  #  &lt;---
                [ 0,1,1,0 ],  [ 3 ]

docs.htm

If you wish to measure forgetfulness as indicated by the line with the arrow, then you would
pass 2 as the &quot;pattern&quot; option, as in &quot;pattern =&gt; 2&quot;.</P>
<P>Now why the heck would anyone want to measure forgetfulness, you ask? Maybe you wonder how I 
even measure that. Well, it is not a vital value that you have to know. I just put in a 
``forgetfulness measure'' one day because I thought it would be neat to know.</P>
<P>How the module measures forgetfulness is this: First, it learns all the patterns in the set provided,
then it will run the very first pattern (or whatever pattern is specified by the ``pattern'' option)
in the set after it has finished learning. It will compare the <A HREF="#item_run"><CODE>run()</CODE></A> output with the desired output
as specified in the dataset. In a perfect world, the two should match exactly. What we measure is
how much they don't match, and that is the amount of forgetfulness the network has.</P>
<P>NOTE: In version 0.77 percentages were disabled because of a bug. Percentages are now enabled.</P>
<P>Example (from examples/ex_dow.pl):</P>
<PRE>
        # Data from 1989 (as far as I know..this is taken from example data on BrainMaker)
        my @data = ( 
                #       Mo  CPI  CPI-1 CPI-3    Oil  Oil-1 Oil-3    Dow   Dow-1 Dow-3   Dow Ave (output)
                [       1,      229, 220,  146,         20.0, 21.9, 19.5,       2645, 2652, 2597],      [       2647  ],
                [       2,      235, 226,  155,         19.8, 20.0, 18.3,       2633, 2645, 2585],      [       2637  ],
                [       3,      244, 235,  164,         19.6, 19.8, 18.1,       2627, 2633, 2579],      [       2630  ],
                [       4,      261, 244,  181,         19.6, 19.6, 18.1,       2611, 2627, 2563],      [       2620  ],
                [       5,      276, 261,  196,         19.5, 19.6, 18.0,       2630, 2611, 2582],      [       2638  ],

docs.htm

of debugging.
<P>Level 0 ($level = 0) : Default, no debugging information printed. All printing is 
left to the calling script.</P>
<P>Level 1 ($level = 1) : This causes ALL debugging information for the network to be dumped
as the network runs. In this mode, it is a good idea to pipe your STDIO to a file, especially
for large programs.</P>
<P>Level 2 ($level = 2) : A slightly-less verbose form of debugging, not as many internal 
data dumps.</P>
<P>Level 3 ($level = 3) : JUST prints weight mapping as weights change.</P>
<P>Level 4 ($level = 4) : JUST prints the benchmark info for EACH learn loop iteration, not just
learning as a whole. Also prints the percentage difference for each loop between current network
results and desired results, as well as the learning gradient ('increment').</P>
<P>Level 4 is useful for seeing if you need to give a smaller learning increment to <A HREF="#item_learn"><CODE>learn()</CODE></A>.
I used level 4 debugging quite often in creating the letters.pl example script and the small_1.pl
example script.</P>
<P>Toggles debugging off when called with no arguments.</P>
<P></P>
<DT><STRONG><A NAME="item_save">$net-&gt;save($filename);</A></STRONG><BR>
<DD>
This will save the complete state of the network to disk, including all weights and any
words crunched with <A HREF="#item_crunch"><CODE>crunch()</CODE></A> . Also saves any output ranges set with <A HREF="#item_range"><CODE>range()</CODE></A> .

docs.htm

it prints the elements of $array_ref to STDIO, adding a newline (\n) after every $row_length_in_elements
number of elements has passed. Additionally, if you include a $high_state_character and a $low_state_character,
it will print the $high_state_character (can be more than one character) for every element that
has a true value, and the $low_state_character for every element that has a false value. 
If you do not supply a $high_state_character, or the $high_state_character is a null or empty or 
undefined string, then <A HREF="#item_join_cols"><CODE>join_cols()</CODE></A> will just print the numerical value of each element separated
by a null character (\0). <A HREF="#item_join_cols"><CODE>join_cols()</CODE></A> defaults to the latter behaviour.
<P></P>
<DT><STRONG><A NAME="item_pdiff">$net-&gt;pdiff($array_ref_A, $array_ref_B);</A></STRONG><BR>
<DD>
This function is used VERY heavily internally to calculate the difference in percent
between elements of the two array refs passed. It returns a %.10f (sprintf-format) 
percent string.
<P></P>
<DT><STRONG><A NAME="item_p">$net-&gt;p($a,$b);</A></STRONG><BR>
<DD>
Returns a floating point number which represents the absolute difference between $a and $b as a percentage of $a.
<P></P>
<DT><STRONG><A NAME="item_intr">$net-&gt;intr($float);</A></STRONG><BR>
<DD>
Rounds a floating-point number to an integer using <CODE>sprintf()</CODE> and <CODE>int()</CODE>. Provides
better rounding than just calling <CODE>int()</CODE> on the float. Also used very heavily internally.
<P></P>
<DT><STRONG><A NAME="item_high">$net-&gt;high($array_ref);</A></STRONG><BR>
<DD>
Returns the index of the element in array REF passed with the highest comparative value.
<P></P>

examples/ex_add2.pl

	use English;
	
	my $ofile = "addnet_data.txt";
	
	open( OUTFP, ">$ofile" ) or die "Could not open output file\n";
	
	my ( $layers, $inputs, $outputs, $top, $inc, $runtime,
	$forgetfulness );
	my @answers;
	my @predictions;
	my @percent_diff;
	
	$inputs = 3;
	$outputs = 1;
	my ($maxinc,$maxtop,$incstep);
	select OUTFP; $OUTPUT_AUTOFLUSH = 1; select STDOUT;
	print OUTFP "layers inc top forgetfulness time \%diff1 \%diff2 \%diff3
	\%diff4\n\n";
	
	for( $layers = 1; $layers <= 3; $layers++ ){
	 if( $layers <= 2 ){

examples/ex_add2.pl

	  if( $inc > .3 ){
	   $maxtop = 3;
	  }
	  else{
	   $maxtop = 4;
	  }
	  for( $top=1; $top <=$maxtop; $top++ ){
	   addnet();
	   printf OUTFP "%d %.3f %d %g %s %f %f %f %f\n",
	   $layers, $inc, $top, $forgetfulness, timestr($runtime),
	$percent_diff[0],
	   $percent_diff[1], $percent_diff[2], $percent_diff[3];
	   print "layers inc top forgetfulness time \%diff1 \%diff2 \%diff3
	\%diff4\n";
	   printf "%d %.3f %d %g %s %f %f %f %f\n",
	   $layers, $inc, $top, $forgetfulness, timestr($runtime),
	$percent_diff[0],
	   $percent_diff[1], $percent_diff[2], $percent_diff[3];
	  }
	 }
	}
	
	#....................................................
	sub addnet
	{
	 print "\nCreate a new net with $layers layers, 3 inputs, and 1 output\n";
	 my $net = AI::NeuralNet::BackProp->new($layers,3,1);
	

examples/ex_add2.pl

	      [ 2654, 2234, 2534 ] );
	
	    test_net( $net, @input );
	}
	#.....................................................................
	 sub test_net {
	  my @set;
	  my $fb;
	  my $net = shift;
	  my @data = @_;
	  undef @percent_diff; #@answers; undef @predictions;
	
	  for( my $i=0; defined( $data[$i] ); $i++ ){
	   @set = @{ $data[$i] };
	   $fb = $net->run(\@set)->[0];
	   # Print output
	   print "Test Factors: (",join(',',@set),")\n";
	   my $answer = eval( join( '+',@set ));
	   push @percent_diff, 100.0 * abs( $answer - $fb )/ $answer;
	   print "Prediction : $fb      answer: $answer\n";
	  }
	 }
	
	


