AI-MaxEntropy
  # and, let it learn
  my $model = $me->learn;

  # then, we can make predictions on unseen data

  # ask what a red thing is most likely to be
  print $model->predict(['red'])."\n";
  # the answer is apple, because all red things the learner has ever seen
  # are apples
  
  # ask what a smooth thing is most likely to be
  print $model->predict(['smooth'])."\n";
  # the answer is banana, because the learner has seen more smooth bananas
  # (weighted 3) than smooth apples (weighted 2)

  # ask what a red, long thing is most likely to be
  print $model->predict(['red', 'long'])."\n";
  # the answer is banana, because the learner has seen more long bananas
  # (weighted 3) than red apples (weighted 2)

  # print the scores of all possible labels for the features round and red
  for ($model->all_labels) {
      my $s = $model->score(['round', 'red'] => $_);
      print "$_: $s\n";
  }
  
  # save the model
  $model->save('model_file');

  # load the model
  $model->load('model_file');

=head1 CONCEPTS

=head2 What is a Maximum Entropy model?

The Maximum Entropy (ME) model is a popular approach in machine learning.
From a user's point of view, it behaves like a classifier that labels things
according to what it has previously learnt.

Theoretically, an ME learner tries to recover the real probability
distribution of the data from a limited number of observations, by applying
the principle of maximum entropy.
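
As a rough illustration of the principle (not part of this module), the
following sketch computes the Shannon entropy of two candidate distributions
over three labels. The helper C<entropy> and the numbers are made up for
this example. When nothing else is known about the data, the principle of
maximum entropy prefers the candidate with the higher entropy, that is, the
more uniform one.

  use strict;
  use warnings;

  # Shannon entropy (in nats) of a discrete distribution given as a list
  # of probabilities; terms with zero probability contribute nothing
  sub entropy {
      my @p = @_;
      my $h = 0;
      for my $p (@p) {
          $h -= $p * log($p) if $p > 0;
      }
      return $h;
  }

  my $uniform = entropy(1/3, 1/3, 1/3);   # no preference among labels
  my $skewed  = entropy(0.8, 0.1, 0.1);   # strong preference for one label

  printf "uniform: %.4f\n", $uniform;
  printf "skewed:  %.4f\n", $skewed;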

Some good tutorials on Maximum Entropy models can be found here:

L<http://homepages.inf.ed.ac.uk/s0450736/maxent.html>

=head2 Features

Generally, a feature is a binary function that answers a yes-no question
about a given piece of data.

For example,

  "Is it a red apple?"

  "Is it a yellow banana?"

If the answer is yes,
we say this feature is active on that piece of data.

In practice, a feature is usually represented as
a tuple C<E<lt>x, yE<gt>>. For example, the two features above can be
represented as

  <red, apple>

  <yellow, banana>

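To make the definition concrete, here is a minimal sketch, independent of
this module's internals, in which a feature C<E<lt>x, yE<gt>> is an ordinary
Perl closure that returns 1 when it is active and 0 otherwise. The helper
C<make_feature> and the layout of the data hash are assumptions made for
this example only.

  use strict;
  use warnings;

  # build the binary function for the feature <x, y>: it is active on a
  # piece of data when the data has attribute x and is labelled y
  sub make_feature {
      my ($x, $y) = @_;
      return sub {
          my ($data) = @_;
          my $has_x = grep { $_ eq $x } @{ $data->{attributes} };
          return ($has_x && $data->{label} eq $y) ? 1 : 0;
      };
  }

  my $red_apple     = make_feature('red', 'apple');
  my $yellow_banana = make_feature('yellow', 'banana');

  my $data = { attributes => ['big', 'round', 'red'], label => 'apple' };

  print "<red, apple> active: ",     $red_apple->($data),     "\n";
  print "<yellow, banana> active: ", $yellow_banana->($data), "\n";
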
=head2 Samples

A sample is a set of active features, all of which share a common C<y>.
This common C<y> is sometimes called the label or tag.
For example, given a big round red apple, the corresponding sample is

  {<big, apple>, <round, apple>, <red, apple>}

In this module, a sample is denoted in Perl code as

  $xs => $y => $w

C<$xs> is an array ref holding all C<x>,
C<$y> is a scalar holding the label
and C<$w> is the weight of the sample, which tells how many times the
sample occurs.

Therefore, the above sample can be denoted as

  ['big', 'round', 'red'] => 'apple' => 1.0

The weight C<$w> can be omitted when it equals 1.0,
so the notation above can be shortened to

  ['big', 'round', 'red'] => 'apple'
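
Because C<< => >> is just a comma in Perl, this notation is a plain list of
two or three elements. The sketch below uses a hypothetical helper
C<sample_to_features>, which is not part of this module, to unpack such a
list, apply the default weight and expand it into the feature tuples
described above.

  use strict;
  use warnings;

  # expand a sample ($xs => $y => $w) into its weight and the list of
  # active features <x, y>; the weight defaults to 1.0 when omitted
  sub sample_to_features {
      my ($xs, $y, $w) = @_;
      $w = 1.0 unless defined $w;
      my @features = map { [$_, $y] } @$xs;
      return ($w, @features);
  }

  my ($w, @features) = sample_to_features(['big', 'round', 'red'] => 'apple');

  print "weight: $w\n";
  print "<$_->[0], $_->[1]>\n" for @features;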

=head2 Models

With a set of samples, a model can be learnt for future predictions.
The model (essentially the lambda vector) is a knowledge representation
of the samples that it has seen before.
By applying the model, we can calculate the probability of each possible
label for a given sample and choose the most probable one
according to these probabilities.
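
As a sketch of that calculation, assume the usual conditional maximum
entropy form, in which the probability of a label C<y> is proportional to
the exponential of the summed lambda values of the active features
C<E<lt>x, yE<gt>>. The lambda values and the helper C<label_probabilities>
below are invented for illustration; they are not produced by this module
and do not necessarily reflect how it stores or evaluates a model.

  use strict;
  use warnings;

  # hypothetical lambda vector, keyed by label and then by x
  my %lambda = (
      apple  => { red => 1.2, round => 0.8, smooth => 0.1 },
      banana => { long => 1.5, yellow => 1.0, smooth => 0.3 },
  );

  # p(y | xs) = exp(sum of lambdas of active features <x, y>) / Z,
  # where Z normalises the probabilities so that they sum to 1
  sub label_probabilities {
      my ($xs) = @_;
      my %score;
      for my $y (keys %lambda) {
          my $sum = 0;
          $sum += $lambda{$y}{$_} || 0 for @$xs;
          $score{$y} = exp($sum);
      }
      my $z = 0;
      $z += $_ for values %score;
      return map { ($_ => $score{$_} / $z) } keys %score;
  }

  my %p = label_probabilities(['red', 'round']);

  # choose the most probable label
  my ($best) = sort { $p{$b} <=> $p{$a} } keys %p;

  printf "%s: %.3f\n", $_, $p{$_} for sort keys %p;
  print "prediction: $best\n";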

=head1 FUNCTIONS

NOTE: This is still an alpha version; the APIs may change
in future versions.

=head2 new

Create a Maximum Entropy learner. Optionally, initial values of properties
can be specified.

  my $me1 = AI::MaxEntropy->new;
  my $me2 = AI::MaxEntropy->new(
      algorithm => { epsilon => 1e-6 });
  my $me3 = AI::MaxEntropy->new(
      algorithm => { m => 7, epsilon => 1e-4 },
      smoother => { type => 'gaussian', sigma => 0.8 }
  );