AI-Calibrate
lib/AI/Calibrate.pm view on Meta::CPAN
(positive class).
$sorted is boolean (0 by default) indicating whether the data are already
sorted by score. Unless this is set to 1, calibrate() will sort the data
itself.
calibrate() returns a reference to an ordered list of array references:
[ [score, prob], [score, prob], [score, prob] ... ]
Scores will be in descending numerical order. See the DESCRIPTION section for
how this structure is interpreted. You can pass this structure to the
B<score_prob> function, along with a new score, to get a probability.
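
As an illustration of how such a structure could be consulted, here is a minimal sketch. It rests on an assumption: each pair is read as a threshold, and a new score takes the probability of the first pair whose score it meets or exceeds. The name lookup_prob is hypothetical and this is not the module's score_prob; the DESCRIPTION section remains the authority on the interpretation.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AI::Calibrate): look up a probability
# in a [ [score, prob], ... ] list sorted by descending score.
sub lookup_prob {
    my ($calibrated, $score) = @_;
    for my $pair (@$calibrated) {
        my ($threshold, $prob) = @$pair;
        # Assumption: the first threshold the score reaches decides the result.
        return $prob if $score >= $threshold;
    }
    # Assumption: a score below every threshold maps to probability 0.
    return 0;
}

my $mapping = [ [0.9, 1], [0.7, 0.5], [0.5, 0] ];
printf "%.2f\n", lookup_prob($mapping, 0.8);
```

Under these assumptions a score of 0.8 falls between the first two thresholds and receives the second pair's probability, 0.5.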
=cut
sub calibrate {
    my($data, $sorted) = @_;
    if (DEBUG) {
        print "Original data:\n";
lib/AI/Calibrate.pm view on Meta::CPAN
probability of one to each positive instance and a probability of zero to each
negative instance, and puts each instance in its own group. It then looks, at
each iteration, for adjacent violators: adjacent groups whose probabilities
locally increase rather than decrease. When it finds such groups, it pools
them and replaces their probability estimates with the average of the group's
values. It continues this process of averaging and replacement until the
entire sequence is monotonically decreasing. The result is a sequence of
instances, each of which has a score and an associated probability estimate,
which can then be used to map scores into probability estimates.
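
The pooling procedure described above can be sketched as follows. This is not the module's own code: it is a single-pass, stack-based variant of PAV that pools a new group into its predecessor as soon as a violation appears, rather than rescanning the sequence at each iteration. The name pav_sketch and the hash-based group layout are illustrative assumptions. Input is taken to be a list of [score, class] pairs with class 1 for positive and 0 for negative.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AI::Calibrate): pool adjacent violators.
# Takes a reference to a list of [score, class] pairs and returns a
# reference to a list of [score, prob] pairs in descending score order.
sub pav_sketch {
    my ($data) = @_;

    # Work through the instances from highest score to lowest.
    my @sorted = sort { $b->[0] <=> $a->[0] } @$data;

    my @groups;
    for my $pt (@sorted) {
        # Each instance starts in its own group; its probability is its class.
        push @groups, { scores => [ $pt->[0] ], sum => $pt->[1], n => 1 };

        # Adjacent violators: the newest group's mean exceeds its
        # predecessor's. Pool them and recheck against the group before.
        while (@groups > 1
            && $groups[-1]{sum} / $groups[-1]{n}
             > $groups[-2]{sum} / $groups[-2]{n})
        {
            my $last = pop @groups;
            push @{ $groups[-1]{scores} }, @{ $last->{scores} };
            $groups[-1]{sum} += $last->{sum};
            $groups[-1]{n}   += $last->{n};
        }
    }

    # Every instance receives the average of the group it ended up in.
    my @result;
    for my $g (@groups) {
        my $prob = $g->{sum} / $g->{n};
        push @result, [ $_, $prob ] for @{ $g->{scores} };
    }
    return \@result;
}

my $mapping = pav_sketch([ [0.9, 1], [0.8, 1], [0.7, 0], [0.6, 1], [0.5, 0] ]);
printf "%.1f => %.2f\n", @$_ for @$mapping;
```

In this example the negative instance at 0.7 and the positive instance at 0.6 are adjacent violators, so they are pooled and both receive a probability of 0.5. The other three instances keep their initial estimates of 1, 1 and 0.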
For further information on the PAV algorithm, you can read the section in my
paper referenced below.
=head1 EXPORT
This module exports three functions: calibrate, score_prob and print_mapping.
=head1 BUGS
None known. This implementation is straightforward but inefficient (its running time
is O(n^2) in the length of the data series). A linear time algorithm is