AI-Calibrate

Changes

Revision history for Perl extension AI::Calibrate.

1.5   Fri Aug 3 2012
      - Changes to ./t/AI-Calibrate-1.t to let it pass with almost-equal
        numbers.

1.4   Thu Aug 2 2012
      - Revised calibration algorithm based on a bug report
      - Updated tests in ./t
      - Added ./t/AI-Calibrate-KL.t using Kun Liu's dataset.
      - Added ./t/AI-Calibrate-pathologies.t to test for pathological cases.

1.3   Fri Nov 4 2011
      - Removed dependency on Test::Deep, added explicit declaration of
        dependency on Test::More to Makefile.PL

1.2   Thu Nov 3 2011
      - Fixed test ./t/AI-Calibrate-NB.t so that it no longer fails on
        slight differences between floating point numbers.  It used to
        call is_deeply; it now compares with a small tolerance.

1.1   Thu Feb 28 19:00:06 2008
      - Added new function print_mapping
      - Added new test file AI-Calibrate-NB.t which, if AI::NaiveBayes1 is
        present, trains a classifier and calibrates it.

1.0   Thu Feb 05 11:37:31 2008
      - First public release to CPAN.

0.01  Thu Jan 24 11:37:31 2008
      - original version; created by h2xs 1.23 with options
        -XA -n AI::Calibrate

README

Support vector machines have a similar problem.  Both classifier types should be
calibrated before their scores are used as probability estimates.  This module
calibrates a classifier using the Pool Adjacent Violators algorithm.

INSTALLATION

To install this module type the following:

   perl Makefile.PL
   make
   make test
   make install

DEPENDENCIES

No other modules are required to use AI::Calibrate.  The test suite
uses Test::More, and the optional test t/AI-Calibrate-NB.t also uses
AI::NaiveBayes1 if it is installed.

COPYRIGHT AND LICENCE

Copyright (C) 2008 by Tom Fawcett

lib/AI/Calibrate.pm

use constant PROB  => 1;

=head1 NAME

AI::Calibrate - Perl module for producing probabilities from classifier scores

=head1 SYNOPSIS

  use AI::Calibrate ':all';
  ... train a classifier ...
  ... test classifier on $points ...
  $calibrated = calibrate($points);

=head1 DESCRIPTION

Classifiers usually return some sort of an instance score with their
classifications.  These scores can be used as probabilities in various
calculations, but first they need to be I<calibrated>.  Naive Bayes, for
example, is a very useful classifier, but the scores it produces are usually
"bunched" around 0 and 1, making these scores poor probability estimates.
Support vector machines have a similar problem.  Both classifier types should
be calibrated before their scores are used as probability estimates.

This module calibrates classifier scores using a method called the Pool
Adjacent Violators (PAV) algorithm.  After you train a classifier, you take a
(usually separate) set of test instances and run them through the classifier,
collecting the scores assigned to each.  You then supply this set of instances
to the calibrate function defined here, and it will return a set of ranges
mapping from a score range to a probability estimate.
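
To get a feel for what "pooling adjacent violators" means, here is a
rough, self-contained Perl sketch of the idea.  It is illustration only
(pav_sketch is a made-up name); the calibrate function provided by this
module is the actual implementation:

 # Sketch of Pool Adjacent Violators: sort instances by descending
 # score, treat each as its own pool, then repeatedly merge any pool
 # whose mean true-class value exceeds that of the pool before it.
 sub pav_sketch {
     my ($points) = @_;    # arrayref of [SCORE, TRUE_CLASS] pairs
     my @sorted = sort { $b->[0] <=> $a->[0] } @$points;
     my @pools  = map { { sum => $_->[1], n => 1, score => $_->[0] } }
                  @sorted;
     my $merged = 1;
     while ($merged) {
         $merged = 0;
         for my $i (0 .. $#pools - 1) {
             my ($cur, $nxt) = @pools[ $i, $i + 1 ];
             if ($nxt->{sum} / $nxt->{n} > $cur->{sum} / $cur->{n}) {
                 $cur->{sum}  += $nxt->{sum};     # pool the violators
                 $cur->{n}    += $nxt->{n};
                 $cur->{score} = $nxt->{score};   # keep the lowest score
                 splice(@pools, $i + 1, 1);
                 $merged = 1;
                 last;                            # rescan from the top
             }
         }
     }
     # Each pool becomes a [lowest_score, pooled_probability] pair,
     # sorted by descending score.
     return [ map { [ $_->{score}, $_->{sum} / $_->{n} ] } @pools ];
 }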

For example, assume you have the following set of instance results from your
classifier.  Each result is of the form C<[ASSIGNED_SCORE, TRUE_CLASS]>:

 my $points = [
              [.9, 1],
              [.8, 1],

lib/AI/Calibrate.pm

This means that, given a SCORE produced by the classifier, you can map the
SCORE onto a probability like this:

               SCORE >= .9        prob = 1
         .9  > SCORE >= .7        prob = 3/4
         .7  > SCORE >= .45       prob = 2/3
         .45 > SCORE >= .3        prob = 1/2
         .3  > SCORE >= .2        prob = 1/3
         .2  > SCORE >= .02       prob = 0
         .02 > SCORE              prob = 0
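
The module's score_prob function performs this lookup for you.  Purely
for illustration, an equivalent hand-rolled lookup (lookup_prob is a
made-up name, not part of the module's interface) might be:

 # Walk the [threshold, probability] pairs, sorted by descending
 # threshold, and return the probability attached to the first
 # threshold that the score reaches.
 sub lookup_prob {
     my ($calibrated, $score) = @_;
     for my $pair (@$calibrated) {
         my ($threshold, $prob) = @$pair;
         return $prob if $score >= $threshold;
     }
     return 0;    # below the lowest threshold
 }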

For a realistic example of classifier calibration, see the test file
t/AI-Calibrate-NB.t, which uses the AI::NaiveBayes1 module to train a Naive
Bayes classifier then calibrates it using this module.

=cut

=head1 FUNCTIONS

=over 4

=item B<calibrate>

t/AI-Calibrate-1.t

#  -*- Mode: CPerl -*-
use strict;
# Before `make install' is performed this script should be runnable with
# `make test'. After `make install' it should work as `perl AI-Calibrate.t'

#########################

use Test::More tests => 34;
BEGIN { use_ok('AI::Calibrate', ':all') };

srand;

sub deeply_approx {
    # Like Test::More::is_deeply but uses approx() to compare elements.
    my( $got, $expected ) = @_;
    my $EPSILON = 1.0e-6;
    sub max {  $_[0] > $_[1] ? $_[0] : $_[1] }
    sub approx {

t/AI-Calibrate-1.t

    }
    my @delta_expected;
    for my $point (@$calibrated_expected) {
        my($thresh, $class) = @$point;
        push(@delta_expected, [ $thresh+$delta, $class]);
    }
    my $delta_got = calibrate(\@delta_points, 0);
    ok(deeply_approx($delta_got, \@delta_expected), "unsorted cal $_");
}

my @test_estimates =
  ( [100,    1   ],
    [.9,     1   ],
    [.8,     1   ],
    [.7,     3/4 ],
    [.5,     3/4 ],
    [.45,    2/3 ],
    [.35,    2/3 ],
    [.3,     1/2 ],
    [.2,     1/3 ],
    [.02,    0   ],
    [.00001, 0   ]
  );


print "Using this mapping:\n";
print_mapping($calibrated_got);
print "\n";

for my $pair (@test_estimates) {
    my($score, $prob_expected) = @$pair;
    my $prob_got = score_prob($calibrated_got, $score);
    is($prob_got, $prob_expected, "score_prob test @$pair");
}

t/AI-Calibrate-KL.t

#  -*- Mode: CPerl -*-
use strict;
# Before `make install' is performed this script should be runnable with
# `make test'. After `make install' it should work as `perl AI-Calibrate.t'

#########################

use Test::More tests => 4;
BEGIN { use_ok('AI::Calibrate', ':all') };

sub trim($) {
    my $string = shift;
    $string =~ s/^\s+//;
    $string =~ s/\s+$//;
    return $string;
}

#  These points are from Kun Liu

t/AI-Calibrate-NB.t

#  -*- Mode: CPerl -*-
use English;
use strict;
# Before `make install' is performed this script should be runnable with
# `make test'. After `make install' it should work as `perl AI-Calibrate.t'

use Test::More;

eval("use AI::NaiveBayes1");
if ($EVAL_ERROR) {
    plan skip_all => 'AI::NaiveBayes1 does not seem to be present';
} else {
    plan tests => 2;
}

use_ok('AI::Calibrate', ':all');

my @instances =
  ( [ { outlook=>'sunny',temperature=>85,humidity=>85,windy=>'FALSE'},
      'no'],
    [ {outlook=>'sunny',temperature=>80,humidity=>90,windy=>'TRUE'},
      'no'],
    [ {outlook=>'overcast',temperature=>83,humidity=>86,windy=>'FALSE'},

t/AI-Calibrate-NB.t

  (
   [0.779495793582905, 1],
   [0.535425255450615, 0.666666666666667]
  );

for my $i (0 .. $#expected) {
    print "$i = @{$expected[$i]}\n";
}

# This fails because two numbers differ at the 15th digit:
# is_deeply($calibrated, \@expected, "Naive Bayes calibration test");

sub close_enough {
    my($x, $y) = @_;
    return(abs($x - $y) < 1.0e-5);
}

sub lists_close_enough {
    my($got, $expected) = @_;
    if (@$got != @$expected) {
        return 0;
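    }
    # [Excerpt truncated by the MetaCPAN view.  Presumably the rest
    #  pairs up the elements and applies close_enough() to each number,
    #  along these lines (a sketch, not the distribution's actual code):]
    for my $i (0 .. $#$got) {
        for my $j (0 .. $#{ $got->[$i] }) {
            return 0 unless close_enough($got->[$i][$j],
                                         $expected->[$i][$j]);
        }
    }
    return 1;
}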

t/AI-Calibrate-pathologies.t

#  -*- Mode: CPerl -*-
use strict;
# Before `make install' is performed this script should be runnable with
# `make test'. After `make install' it should work as `perl AI-Calibrate.t'

use Test::More tests => 6;
BEGIN { use_ok('AI::Calibrate', ':all') };

my $points0 = [ ];


use Data::Dumper;

is_deeply( calibrate($points0), [], "empty point set");

my $points1 = [


