AI-NaiveBayes1

 view release on metacpan or  search on metacpan

Changes  view on Meta::CPAN

2.011  Fri May 28 07:55:50 ADT 2021
  - merged corrections from Daniel Bohmer

2.010  Tue May 25 18:30:47 ADT 2021
  - fixing bugs reported by perltesters on some platforms.
    It seems to be due to running tests in a parallel way, and
    that using the same name of the temporary file is the problem.

2.009  Sun May 23 09:23:06 ADT 2021
  - fixed bug with t/auxfunctions.pl not being located during
    testing (hopefully fixed because it requires a different
    environment to reproduce)
  - made Changes reverse chronological

2.008  Fri May 21 10:56:33 ADT 2021
  - GitHub release
  - documentation improvements

2.007  Wed Jan 29 05:58:37 AST 2020
  - documentation update
  - fixed bug: import_from_YAML and import_from_YAML_file now

Changes  view on Meta::CPAN

  - improved an error message
  - fixed some testing problems due to whitespace
  - small improvement in generating documentation

1.5   Wed Jan 30 08:06:22 AST 2008
  - fixed testing problems due to differences in the lowest
    significant digit on different platforms

1.4   Fri Dec 14 11:42:16 AST 2007
  - added test skipping if YAML module is not available
  - removing deprecated "our" reserved word

1.3   Fri Dec  7 07:26:47 AST 2007
  - added reading from a table format (add_table)
  - fixed some warnings reported during testing
  - added option "with counts" to print_model
  - added smoothing option for attributes

1.2   Mon Mar 14 08:03:16 AST 2005
  - addition to documentation
  - fixing a minor bug and a warning

META.json  view on Meta::CPAN

   },
   "name" : "AI-NaiveBayes1",
   "no_index" : {
      "directory" : [
         "t",
         "inc"
      ]
   },
   "prereqs" : {
      "build" : {
         "requires" : {
            "ExtUtils::MakeMaker" : "0"
         }
      },
      "configure" : {
         "requires" : {
            "ExtUtils::MakeMaker" : "0"
         }
      },
      "runtime" : {
         "requires" : {
            "YAML" : "0.0"
         }
      }
   },
   "release_status" : "stable",
   "version" : "2.012"
}

META.yml  view on Meta::CPAN

---
abstract: 'Naive Bayes Classification'
author:
  - 'Vlado Keselj https://web.cs.dal.ca/~vlado'
build_requires:
  ExtUtils::MakeMaker: '0'
configure_requires:
  ExtUtils::MakeMaker: '0'
dynamic_config: 1
generated_by: 'ExtUtils::MakeMaker version 7.04, CPAN::Meta::Converter version 2.150001'
license: perl
meta-spec:
  url: http://module-build.sourceforge.net/META-spec-v1.4.html
  version: '1.4'
name: AI-NaiveBayes1
no_index:
  directory:
    - t
    - inc
requires:
  YAML: '0.0'
version: '2.012'

NaiveBayes1.pm  view on Meta::CPAN

		     );
	}
    }				# foreach real attribute
}				# end of sub train

sub predict {
  my ($self, %params) = @_;
  my $newattrs = $params{attributes} or die "Missing 'attributes' parameter for predict()";
  my $m = $self->{model};  # For convenience
  
  my %scores;
  my @labels = @{ $self->{labels} };
  $scores{$_} = $m->{labelprob}{$_} foreach (@labels);
  foreach my $att (keys(%{ $newattrs })) {
      if (!defined($self->{attribute_type}{$att})) { die "Unknown attribute: `$att'" }
      next if $self->{attribute_type}{$att} eq 'real';
      die unless exists($self->{stat_attributes}{$att});
      my $attval = $newattrs->{$att};
      die "Unknown value `$attval' for attribute `$att'."
      unless exists($self->{stat_attributes}{$att}{$attval}) or
	  exists($self->{smoothing}{$att});
      foreach my $label (@labels) {
	  if (exists($m->{condprob}{$att}{$attval}) and
	      exists($m->{condprob}{$att}{$attval}{$label}) and
	      $m->{condprob}{$att}{$attval}{$label} > 0 ) {
	      $scores{$label} *=
		  $m->{condprob}{$att}{$attval}{$label};
	  } elsif (exists($self->{smoothing}{$att})) {
	      $scores{$label} *=
                  $m->{condprob}{$att}{'*'}{$label};
	  } else { $scores{$label} = 0 }

      }
  }

  foreach my $att (keys %{$newattrs}){
      next unless $self->{attribute_type}{$att} eq 'real';
      my $sum=0; my %nscores;
      foreach my $label (@labels) {
	  die unless exists $m->{real_stat}{$att}{$label}{mean};
	  $nscores{$label} =
              0.398942280401433 / $m->{real_stat}{$att}{$label}{stddev}*
              exp( -0.5 *
                  ( ( $newattrs->{$att} -
                      $m->{real_stat}{$att}{$label}{mean})
                    / $m->{real_stat}{$att}{$label}{stddev}
                  ) ** 2
		 );
	  $sum += $nscores{$label};
      }
      if ($sum==0) { print STDERR "Ignoring all Gaussian probabilities: all=0!\n" }
      else {
	  foreach my $label (@labels) { $scores{$label} *= $nscores{$label} }
      }
  }

  my $sumPx = 0.0;
  $sumPx += $scores{$_} foreach (keys(%scores));
  $scores{$_} /= $sumPx foreach (keys(%scores));
  return \%scores;
}

sub print_model {
    my $self = shift;
    my $withcounts = '';
    if ($#_>-1 && $_[0] eq 'with counts')
    { shift @_; $withcounts = 1; }
    my $m = $self->{model};
    my @labels = $self->labels;
    my $r;

NaiveBayes1.pm  view on Meta::CPAN

		     label=>'repairs=N',cases=>14);
  $nb->add_instances(attributes=>{model=>'T',place=>'N'},
		     label=>'repairs=Y',cases=> 6);
  $nb->add_instances(attributes=>{model=>'T',place=>'N'},
		     label=>'repairs=N',cases=>84);

  $nb->train;

  print "Model:\n" . $nb->print_model;
  
  # Find results for unseen instances
  my $result = $nb->predict
     (attributes => {model=>'T', place=>'N'});

  foreach my $k (keys(%{ $result })) {
      print "for label $k P = " . $result->{$k} . "\n";
  }

  # export the model into a string
  my $string = $nb->export_to_YAML();

  # create the same model from the string
  my $nb1 = AI::NaiveBayes1->import_from_YAML($string);

  # write the model to a file (shorter than model->string->file)
  $nb->export_to_YAML_file('t/tmp1');

NaiveBayes1.pm  view on Meta::CPAN


=over 4

=item new()

Constructor. Creates a new C<AI::NaiveBayes1> object and returns it.

=item import_from_YAML($string)

Constructor. Creates a new C<AI::NaiveBayes1> object from a string where it is
represented in C<YAML>.  Requires YAML module.

=item import_from_YAML_file($file_name)

Constructor. Creates a new C<AI::NaiveBayes1> object from a file where it is
represented in C<YAML>.  Requires YAML module.

=back

=head2 Non-Constructor Methods

=over 4

=item add_table()

Add instances from a table.  The first row are attributes, followed by

NaiveBayes1.pm  view on Meta::CPAN


Add instances from a CSV file.  Primitive format implementation (e.g.,
no commas allowed in attribute names or values).

=item drop_attributes(@attributes)

Delete attributes after adding instances.

=item set_real(list_of_attributes)

Delares a list of attributes to be real-valued.  During training,
their conditional probabilities will be modeled with Gaussian (normal)
distributions. 

=item C<add_instance(attributes=E<gt>HASH,label=E<gt>STRING|ARRAY)>

Adds a training instance to the categorizer.

=item C<add_instances(attributes=E<gt>HASH,label=E<gt>STRING|ARRAY,cases=E<gt>NUMBER)>

Adds a number of identical instances to the categorizer.

=item export_to_YAML()

Returns a C<YAML> string representation of an C<AI::NaiveBayes1>
object.  Requires YAML module.

=item C<export_to_YAML_file( $file_name )>

Writes a C<YAML> string representation of an C<AI::NaiveBayes1>
object to a file.  Requires YAML module.

=item C<print_model( OPTIONAL 'with counts' )>

Returns a string, human-friendly representation of the model.
The model is supposed to be trained before calling this method.
One argument 'with counts' can be supplied, in which case explanatory
expressions with counts are printed as well.

=item train()

Calculates the probabilities that will be necessary for categorization
using the C<predict()> method.

=item C<predict( attributes =E<gt> HASH )>

Use this method to predict the label of an unknown instance.  The
attributes should be of the same format as you passed to
C<add_instance()>.  C<predict()> returns a hash reference whose keys
are the names of labels, and whose values are corresponding
probabilities.

=item C<labels>

Returns a list of all the labels the object knows about (in no
particular order), or the number of labels if called in a scalar
context.

=back

NaiveBayes1.pm  view on Meta::CPAN

for each C=c we collect the mean value (m) and standard deviation (s)
for A during training.  During classification, P(A=a|C=c) is estimated
using Gaussian distribution, i.e., in the following way:

                    1               (a-m)^2
 P(A=a|C=c) = ------------ * exp( - ------- )
              sqrt(2*Pi)*s           2*s^2

this boils down to the following lines of code:

    $scores{$label} *=
    0.398942280401433 / $m->{real_stat}{$att}{$label}{stddev}*
      exp( -0.5 *
           ( ( $newattrs->{$att} -
               $m->{real_stat}{$att}{$label}{mean})
             / $m->{real_stat}{$att}{$label}{stddev}
           ) ** 2
	   );

i.e.,

NaiveBayes1.pm  view on Meta::CPAN

and CPAN-testers, including: Andreas Koenig, Alexandr Ciornii, jlatour,
Jost.Krieger, tvmaly, Matthew Musgrove, Michael Stevens, Nigel Horne,
Graham Crookham, David Cantrell (dcantrell).

=head1 AUTHOR

Copyright 2003-21 Vlado Keselj L<https://web.cs.dal.ca/~vlado>.
In 2004 Yung-chung Lin provided implementation of the Gaussian model for
continous variables.

This script is provided "as is" without expressed or implied warranty.
This is free software; you can redistribute it and/or modify it under
the same terms as Perl itself.

The module is available on CPAN (L<https://metacpan.org/author/VLADO>), and
L<https://web.cs.dal.ca/~vlado/srcperl/>.  The latter site is
updated more frequently.

=head1 SEE ALSO

L<Algorithm::NaiveBayes>, L<perl>.

README  view on Meta::CPAN


DEPENDENCIES

The methods for importing to and exporting from YAML format require
module YAML.

COPYRIGHT AND LICENCE

Copyright 2003-21 Vlado Keselj https://web.cs.dal.ca/~vlado

This script is provided "as is" without express or implied warranty.
This is free software; you can redistribute it and/or modify it under
the same terms as Perl itself.



( run in 1.006 second using v1.01-cache-2.11-cpan-49f99fa48dc )