AI-MaxEntropy

    Theoretically, an ME learner tries to recover the real probability
    distribution of the data from a limited number of observations, by
    applying the principle of maximum entropy.
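
    As a rough illustration of the principle, the following self-contained
    sketch (it does not use this module; the "entropy" helper and the two
    distributions are made up for the example) compares the Shannon entropy
    of a uniform and a skewed distribution over three labels. With no
    constraints from observations, the uniform one has the larger entropy.

```perl
use strict;
use warnings;

# Shannon entropy (in nats) of a discrete distribution given as a list
# of probabilities; zero-probability terms contribute nothing.
sub entropy {
    my @probs = @_;
    my $h = 0;
    for my $p (@probs) {
        $h -= $p * log($p) if $p > 0;
    }
    return $h;
}

# Hypothetical distributions over three labels.
my @uniform = (1/3, 1/3, 1/3);
my @skewed  = (0.8, 0.1, 0.1);

printf "uniform: %.4f\n", entropy(@uniform);
printf "skewed:  %.4f\n", entropy(@skewed);
```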

    Some good tutorials on the Maximum Entropy model can be found here:

    <http://homepages.inf.ed.ac.uk/s0450736/maxent.html>

  Features
    Generally, a feature is a binary function that answers a yes-no question
    about a specified piece of data.

    For example,

      "Is it a red apple?"

      "Is it a yellow banana?"

    If the answer is yes, we say this feature is active on that piece of
    data.

    In practice, a feature is usually represented as a tuple "<x, y>". For
    example, the above two features can be represented as

      <red, apple>

      <yellow, banana>
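
    To make the "binary function" view concrete, here is a minimal sketch in
    plain Perl. The "make_feature" helper is hypothetical and is not part of
    this module; it only shows how a tuple "<x, y>" can be read as a yes-no
    test on a piece of data.

```perl
use strict;
use warnings;

# Build a feature <x, y> as a closure that returns 1 when the piece of
# data has attribute $x and label $y, and 0 otherwise.
sub make_feature {
    my ($x, $y) = @_;
    return sub {
        my ($attrs, $label) = @_;
        my $has_x = grep { $_ eq $x } @$attrs;
        return ($label eq $y && $has_x) ? 1 : 0;
    };
}

my $red_apple     = make_feature('red', 'apple');
my $yellow_banana = make_feature('yellow', 'banana');

# The first feature is active on a big round red apple; the second is not.
print $red_apple->(['big', 'round', 'red'], 'apple'), "\n";
print $yellow_banana->(['big', 'round', 'red'], 'apple'), "\n";
```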

  Samples
    A sample is a set of active features, all of which share a common "y".
    This common "y" is sometimes called the label or tag. For example, for a
    big round red apple, the corresponding sample is

      {<big, apple>, <round, apple>, <red, apple>}

    In this module, a sample is denoted in Perl code as

      $xs => $y => $w

    $xs is an array ref holding all "x", $y is a scalar holding the label
    and $w is the weight of the sample, which tells how many times the
    sample occurs.

    Therefore, the above sample can be denoted as

      ['big', 'round', 'red'] => 'apple' => 1.0

    The weight $w can be omitted when it equals 1.0, so the above
    denotation can be shortened to

      ['big', 'round', 'red'] => 'apple'
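
    The relation between this notation and the set of feature tuples can be
    sketched as follows. The "expand_sample" helper is hypothetical and only
    mirrors the description above; it is not the module's internal code.

```perl
use strict;
use warnings;

# Expand a sample ($xs => $y => $w) into its feature tuples <x, y>.
# The weight defaults to 1.0 when it is omitted.
sub expand_sample {
    my ($xs, $y, $w) = @_;
    $w = 1.0 unless defined $w;
    my @features = map { [$_, $y] } @$xs;
    return (\@features, $w);
}

my ($features, $weight) = expand_sample(['big', 'round', 'red'] => 'apple');

print join(', ', map { "<$_->[0], $_->[1]>" } @$features), "\n";
print "weight: $weight\n";
```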

  Models
    With a set of samples, a model can be learnt for future predictions. The
    model (essentially the lambda vector) is a knowledge representation of
    the samples that it has seen before. By applying the model, we can
    calculate the probability of each possible label for a given sample
    and choose the most probable one according to these probabilities.
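
    A minimal sketch of how a lambda vector can be turned into label
    probabilities, assuming the usual conditional maximum entropy form in
    which the probability of a label is proportional to the exponential of
    the summed lambdas of its active features. The lambda values, the
    %lambda layout and the "label_probabilities" helper are all invented for
    illustration; the module's actual model storage may differ.

```perl
use strict;
use warnings;

# Hypothetical lambda values, one per feature <x, y>, keyed as $lambda{$y}{$x}.
# A real model would learn these from samples.
my %lambda = (
    apple  => { red => 1.2, round => 0.8 },
    banana => { yellow => 1.5, long => 1.0, round => -0.5 },
);

# p(y | xs) is proportional to exp(sum of the lambdas of the active features).
sub label_probabilities {
    my ($xs, $labels) = @_;
    my %score;
    my $z = 0;
    for my $y (@$labels) {
        my $sum = 0;
        for my $x (@$xs) {
            $sum += $lambda{$y}{$x} if exists $lambda{$y}{$x};
        }
        $score{$y} = exp($sum);
        $z += $score{$y};    # normalization constant
    }
    return map { $_ => $score{$_} / $z } @$labels;
}

my %p = label_probabilities(['red', 'round'], ['apple', 'banana']);

# Choose the label with the highest probability.
my ($best) = sort { $p{$b} <=> $p{$a} } keys %p;

printf "%s: %.3f\n", $_, $p{$_} for sort keys %p;
print "prediction: $best\n";
```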

FUNCTIONS
    NOTE: This is still an alpha version; the APIs may change in future
    versions.

  new
    Create a Maximum Entropy learner. Optionally, initial values of
    properties can be specified.

      my $me1 = AI::MaxEntropy->new;
      my $me2 = AI::MaxEntropy->new(
          algorithm => { epsilon => 1e-6 });
      my $me3 = AI::MaxEntropy->new(
          algorithm => { m => 7, epsilon => 1e-4 },
          smoother => { type => 'gaussian', sigma => 0.8 }
      );

  see
    Let the Maximum Entropy learner see a sample.

      my $me = AI::MaxEntropy->new;

      # see a sample with default weight 1.0
      $me->see(['red', 'round'] => 'apple');
  
      # see a sample with specified weight 0.5
      $me->see(['yellow', 'long'] => 'banana' => 0.5);

    A sample can also be represented in attribute-value form, like

      $me->see({color => 'yellow', shape => 'long'} => 'banana');
      $me->see({color => ['red', 'green'], shape => 'round'} => 'apple');

    Internally, the two samples above are converted to

      $me->see(['color:yellow', 'shape:long'] => 'banana');
      $me->see(['color:red', 'color:green', 'shape:round'] => 'apple');
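
    The conversion can be pictured with the following standalone sketch.
    "flatten_attributes" is a hypothetical helper, not the module's own
    routine, and the keys are sorted here only to make the output
    deterministic.

```perl
use strict;
use warnings;

# Flatten { attr => value } or { attr => [values] } into 'attr:value' strings.
sub flatten_attributes {
    my ($attrs) = @_;
    my @xs;
    for my $attr (sort keys %$attrs) {
        my $v = $attrs->{$attr};
        # A scalar value is treated as a one-element list.
        my @values = ref($v) eq 'ARRAY' ? @$v : ($v);
        push @xs, map { "${attr}:$_" } @values;
    }
    return \@xs;
}

my $xs = flatten_attributes({ color => ['red', 'green'], shape => 'round' });
print join(', ', @$xs), "\n";
```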

  forget_all
    Forget all samples the learner has seen previously.

  cut
    Cut the features that occur fewer times than the specified number.

    For example,

      ...
      $me->cut(1)

    will cut all features that occur less than one time.
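
    The idea of cutting rare features can be sketched in plain Perl as
    below. This is an illustration under assumptions, not the module's
    implementation: the sample layout, the "cut_features" helper, and the
    choice to count each feature "<x, y>" by sample weight are all made up
    for the example.

```perl
use strict;
use warnings;

# Samples stored as [$xs, $y, $w] triples (a hypothetical layout).
my @samples = (
    [['red', 'round'],   'apple',  1.0],
    [['green', 'round'], 'apple',  1.0],
    [['yellow', 'long'], 'banana', 1.0],
);

# Drop every feature <x, y> that occurs fewer than $threshold times.
# Assumption: an occurrence is counted with the weight of its sample.
sub cut_features {
    my ($samples, $threshold) = @_;

    # First pass: count how often each feature <x, y> occurs.
    my %count;
    for my $s (@$samples) {
        my ($xs, $y, $w) = @$s;
        $count{$y}{$_} += $w for @$xs;
    }

    # Second pass: keep only the features that reach the threshold.
    my @kept;
    for my $s (@$samples) {
        my ($xs, $y, $w) = @$s;
        push @kept, [[ grep { $count{$y}{$_} >= $threshold } @$xs ], $y, $w];
    }
    return \@kept;
}

my $cut = cut_features(\@samples, 2);
print join(' ', @{ $_->[0] }), " => $_->[1]\n" for @$cut;
```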

  learn
    Learn a model from all the samples that the learner has seen so far.
    Returns an AI::MaxEntropy::Model object, which can be used to make
    predictions on unlabeled samples.

      ...

      my $model = $me->learn;

      print $model->predict(['x1', 'x2', ...]);
