AI-MaxEntropy



=head2 Features

Generally, a feature is a binary function that answers a yes-no question
about a given piece of data.

For example,

  "Is it a red apple?"

  "Is it a yellow banana?"

If the answer is yes,
we say this feature is active on that piece of data.

In practice, a feature is usually represented as
a tuple C<E<lt>x, yE<gt>>. For example, the two features above can be
represented as

  <red, apple>

  <yellow, banana>
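
To make the "binary function" reading concrete, here is a minimal sketch
in plain Perl. The helper C<make_feature> is hypothetical and is not part
of this module; it only illustrates when a feature C<E<lt>x, yE<gt>>
counts as active.

  use strict;
  use warnings;

  # Hypothetical helper: builds the binary function for a feature <x, y>.
  # It returns 1 only when the data carries $x and is labelled $y.
  sub make_feature {
      my ($x, $y) = @_;
      return sub {
          my ($attrs, $label) = @_;
          my $has_x = grep { $_ eq $x } @$attrs;
          return ($label eq $y && $has_x) ? 1 : 0;
      };
  }

  my $red_apple = make_feature('red', 'apple');
  my $on_apple  = $red_apple->(['big', 'round', 'red'], 'apple');
  my $on_banana = $red_apple->(['yellow', 'long'], 'banana');
  print "$on_apple $on_banana\n";   # prints "1 0"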

=head2 Samples

A sample is a set of active features, all of which share a common C<y>.
This common C<y> is sometimes called the label or tag.
For example, for a big round red apple, the corresponding sample is

  {<big, apple>, <round, apple>, <red, apple>}

In this module, a sample is denoted in Perl code as

  $xs => $y => $w

C<$xs> is an array ref holding all C<x>,
C<$y> is a scalar holding the label
and C<$w> is the weight of the sample, which tells how many times the
sample occurs.

Therefore, the above sample can be denoted as

  ['big', 'round', 'red'] => 'apple' => 1.0

The weight C<$w> can be omitted when it equals 1.0,
so the notation above can be shortened to

  ['big', 'round', 'red'] => 'apple'
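
Since C<=E<gt>> is just Perl's "fat comma", a sample written this way is
an ordinary list of two or three elements. The sketch below shows how
such a list could be unpacked with a default weight. C<unpack_sample> is
a hypothetical helper for illustration only, not a function of this
module.

  use strict;
  use warnings;

  # Hypothetical helper, not part of AI::MaxEntropy.
  sub unpack_sample {
      my ($xs, $y, $w) = @_;
      $w = 1.0 unless defined $w;   # weight defaults to 1.0 when omitted
      return ($xs, $y, $w);
  }

  my @full  = unpack_sample(['big', 'round', 'red'] => 'apple' => 1.0);
  my @short = unpack_sample(['big', 'round', 'red'] => 'apple');

  # Both forms describe the same sample.
  print "same weight\n" if $full[2] == $short[2];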

=head2 Models

From a set of samples, a model can be learnt for future predictions.
The model (essentially the lambda vector) is a knowledge representation
of the samples that it has seen before.
By applying the model, we can calculate the probability of each possible
label for a given sample, and then choose the most probable label
according to these probabilities.
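
As a rough illustration of that last step, the sketch below uses the
usual conditional maximum entropy form, in which the probability of a
label is proportional to the exponential of the summed lambdas of the
active features. The hash layout, the lambda values and the helper
C<label_probs> are all made up for this example; they are assumptions
and do not describe how this module stores or applies its model.

  use strict;
  use warnings;

  # Made-up weights lambda{x}{y} for features <x, y>; illustration only.
  my %lambda = (
      red    => { apple =>  1.2, banana => -0.5 },
      round  => { apple =>  0.8, banana => -0.9 },
      yellow => { apple => -0.3, banana =>  1.5 },
  );
  my @labels = ('apple', 'banana');

  # p(y | xs) = exp(sum of lambdas of active features) / normalizer
  sub label_probs {
      my ($xs) = @_;
      my %score;
      for my $y (@labels) {
          my $sum = 0;
          for my $x (@$xs) {
              $sum += $lambda{$x}{$y} if exists $lambda{$x};
          }
          $score{$y} = exp($sum);
      }
      my $z = 0;
      $z += $_ for values %score;
      return map { ($_ => $score{$_} / $z) } @labels;
  }

  my %p = label_probs(['red', 'round']);
  my ($best) = sort { $p{$b} <=> $p{$a} } @labels;
  printf "%s %.3f\n", $_, $p{$_} for @labels;
  print "best: $best\n";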

=head1 FUNCTIONS

NOTE: This is still an alpha version; the APIs may change
in future versions.

=head2 new

Create a Maximum Entropy learner. Optionally, initial values of properties
can be specified.

  my $me1 = AI::MaxEntropy->new;
  my $me2 = AI::MaxEntropy->new(
      algorithm => { epsilon => 1e-6 });
  my $me3 = AI::MaxEntropy->new(
      algorithm => { m => 7, epsilon => 1e-4 },
      smoother => { type => 'gaussian', sigma => 0.8 }
  );

=head2 see

Let the Maximum Entropy learner see a sample.

  my $me = AI::MaxEntropy->new;

  # see a sample with default weight 1.0
  $me->see(['red', 'round'] => 'apple');
  
  # see a sample with specified weight 0.5
  $me->see(['yellow', 'long'] => 'banana' => 0.5);

The sample can also be represented in attribute-value form, which looks like

  $me->see({color => 'yellow', shape => 'long'} => 'banana');
  $me->see({color => ['red', 'green'], shape => 'round'} => 'apple');

Internally, the two samples above are converted to

  $me->see(['color:yellow', 'shape:long'] => 'banana');
  $me->see(['color:red', 'color:green', 'shape:round'] => 'apple');
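
The conversion can be pictured with the following sketch. The helper
C<flatten_attrs> is a hypothetical re-implementation written for this
example; the module's own code, and the order in which it emits the
resulting features, may differ. Attribute names are sorted here only to
make the output deterministic.

  use strict;
  use warnings;

  # Hypothetical helper, not part of AI::MaxEntropy.
  sub flatten_attrs {
      my ($attrs) = @_;
      my @xs;
      for my $name (sort keys %$attrs) {
          my $value  = $attrs->{$name};
          # A value may be a single scalar or an array ref of scalars.
          my @values = ref($value) eq 'ARRAY' ? @$value : ($value);
          push @xs, "${name}:$_" for @values;
      }
      return \@xs;
  }

  my $xs = flatten_attrs({ color => ['red', 'green'], shape => 'round' });
  print join(', ', @$xs), "\n";   # color:red, color:green, shape:round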

=head2 forget_all

Forget all samples the learner has seen previously.

=head2 cut

Cut the features that occur fewer than the specified number of times.

For example, 

  ...
  $me->cut(1);

will cut all features that occur fewer than one time.

=head2 learn

Learn a model from all the samples that the learner has seen so far.
Returns an L<AI::MaxEntropy::Model> object, which can be used to make
predictions on unlabeled samples.

  ...


