Algorithm-AM
lib/Algorithm/AM.pm
my $context = pack "S!4", @context_list;
return $context;
}
1;
__END__
=pod
=encoding UTF-8
=head1 NAME
Algorithm::AM - Classify data with Analogical Modeling
=head1 VERSION
version 3.13
=head1 SYNOPSIS
 use Algorithm::AM;
 my $dataset = dataset_from_file(path => 'finnverb', format => 'nocommas');
 my $am = Algorithm::AM->new(training_set => $dataset);
 my $result = $am->classify($dataset->get_item(0));
 print @{ $result->winners };
 print ${ $result->statistical_summary };
=head1 DESCRIPTION
This module provides an object-oriented interface for
classifying single items using the analogical modeling algorithm.
To work with sets of items needing to be classified, see
L<Algorithm::AM::Batch>. To run classification from the command line
without writing your own Perl code, see L<analogize>.
This module logs information using L<Log::Any>, so to see any log
output you need to set an adapter (see L<Log::Any::Adapter>). See the
L</classify> method for more information on logged data.
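As a minimal sketch of setting an adapter, the following routes every Log::Any message to STDERR using the C<Stderr> adapter that ships with Log::Any. The choice of STDERR is only an assumption for illustration; any other Log::Any adapter can be set the same way.

```perl
use strict;
use warnings;
use Log::Any::Adapter;

# Send all Log::Any messages, including those logged by Algorithm::AM,
# to STDERR. Do this before calling classify so nothing is missed.
Log::Any::Adapter->set('Stderr');
```

Without an adapter, Log::Any discards messages silently, so classification still works but nothing is printed.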
=head1 BACKGROUND AND TERMINOLOGY
Analogical Modeling (or AM) was developed as an exemplar-based
approach to modeling language usage, and has also been found useful
in modeling other "sticky" phenomena. AM is especially suited to this
because it predicts probabilistic occurrences instead of assigning
static labels for instances.
AM was designed not as a classifier but as a cognitive theory explaining
variation in human behavior. As such, though in practice it is often used
like any other machine learning classifier, it differs from one on some fine
theoretical points. As a theory of human behavior, much of the value in its
predictions lies in matching observed human behavior, including non-determinism
and degradations in accuracy caused by paucity of data.
The AM algorithm could be called a
L<probabilistic|http://en.wikipedia.org/wiki/Probabilistic_classification>,
L<instance-based|http://en.wikipedia.org/wiki/Instance-based_learning>
classifier. However, the probabilities given for each classification
are not degrees of certainty, but actual probabilities of occurring
in real usage. AM models "sticky" phenomena as being intrinsically
sticky, not as deterministic phenomena that just require more data to be
predicted perfectly.
Though it is possible to choose an outcome probabilistically, in practice
users are generally interested in either the full predicted probability
distribution or the outcome with the highest probability. The entire
outcome probability distribution can be retrieved via
L<Algorithm::AM::Result/scores_normalized>. The highest probability outcome
can be retrieved via L<Algorithm::AM::Result/winners>.
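The three ways of reading a prediction described above (full distribution, highest-probability outcome, probabilistic choice) can be illustrated with a small standalone sketch. It does not call Algorithm::AM and is not the module's implementation: the score values and the helper names C<normalize>, C<winners> and C<choose_outcome> are hypothetical, invented here for illustration only.

```perl
use strict;
use warnings;
use List::Util qw(sum max);

# Hypothetical raw scores per outcome; these numbers are made up and
# are not output from Algorithm::AM.
my %scores = (A => 6, B => 3, C => 1);

# Scale raw scores so that they sum to 1 (a probability distribution).
sub normalize {
    my (%raw) = @_;
    my $total = sum(values %raw);
    return map { $_ => $raw{$_} / $total } keys %raw;
}

# Return every outcome tied for the highest probability, sorted by name.
sub winners {
    my (%dist) = @_;
    my $best = max(values %dist);
    my @tied = grep { $dist{$_} == $best } keys %dist;
    return sort @tied;
}

# Choose one outcome at random, weighted by its probability.
# $roll is a number in [0, 1); it defaults to rand().
sub choose_outcome {
    my ($dist, $roll) = @_;
    $roll = rand() unless defined $roll;
    my @outcomes   = sort keys %$dist;
    my $cumulative = 0;
    for my $outcome (@outcomes) {
        $cumulative += $dist->{$outcome};
        return $outcome if $roll < $cumulative;
    }
    return $outcomes[-1];    # guard against floating-point round-off
}

my %dist = normalize(%scores);
printf "%s: %.1f\n", $_, $dist{$_} for sort keys %dist;
print "winners: @{[ winners(%dist) ]}\n";
print "sampled: ", choose_outcome(\%dist), "\n";
```

Note that C<winners> returns a list rather than a single value, since two or more outcomes can tie for the highest probability.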
If you're only interested in classification accuracy based on the highest
probability outcome (treating AM like any other classification algorithm),
use L<Algorithm::AM::Result/result>.
See L<Algorithm::AM::Result> for other types of information available
after classification. See L<Algorithm::AM::algorithm> for details
on the actual mechanism of classification.
AM practitioners often use specialized terminology, but most of it
has more common machine learning equivalents.
This software tries to use the specialized terminology for end-user-facing
tasks like reports or command-line APIs.
AM uses the term "exemplar" where ML uses "training instance". Historically
the AM software used the word "item" to refer to either training or test
instances, and that term is retained here. AM has "outcomes" and ML has
"class labels" (we use the latter). Finally, AM practitioners refer to
"variables", and we use the ML term "feature" here.
=head1 EXPORTS
When this module is imported, it also imports the following:
=over
=item L<Algorithm::AM::Result>
=item L<Algorithm::AM::DataSet>
Also imports L<Algorithm::AM::DataSet/dataset_from_file>.
=item L<Algorithm::AM::DataSet::Item>
Also imports L<Algorithm::AM::DataSet::Item/new_item>.
=item L<Algorithm::AM::BigInt>
Also imports L<Algorithm::AM::BigInt/bigcmp>.
=back
=head1 METHODS
=for Pod::Coverage BUILD
=head2 C<new>
Creates a new instance of an analogical modeling classifier. This
method takes named parameters which set state described in the
documentation for the relevant methods. The only required parameter