Algorithm-Viterbi
lib/Algorithm/Viterbi.pm
    my ($prob, $v_path, $v_prob) = $v->forward_viterbi($observations);
    -- or --
    my $training_data = [
        [ 'walk', 'Sunny' ],
        [ 'walk', 'Sunny' ],
        [ 'walk', 'Rainy' ],
        [ 'shop', 'Rainy' ],
        [ 'clean', 'Rainy' ],
        [ 'clean', 'Rainy' ],
        ...
    ];
    $v->train($training_data);
    my ($prob, $v_path, $v_prob) = $v->forward_viterbi($observations);
=head1 DESCRIPTION
Algorithm::Viterbi computes the forward probability, the Viterbi path
and the Viterbi probability of a sequence of observations, based on
given start, emission and transition probabilities.
Alternatively, the start, emission and transition probabilities can be
computed from a set of training data.
This module was inspired by the Wikipedia article on the Viterbi
algorithm. Rather than copying all the text, I'm just including the
link to the Wikipedia page:
L<http://en.wikipedia.org/wiki/Viterbi_algorithm>.
I think the page is well written and I see no need to repeat the theory
here. Reading it may clarify the documentation below.
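To make the three quantities named above concrete, here is a minimal,
self-contained sketch of the textbook Viterbi recursion. It is an
illustration only: the sub name C<viterbi_sketch> and the model numbers are
assumptions chosen for the example, and the module's own C<forward_viterbi>
may organise the computation differently, so its return values need not
match this sketch.

```perl
use strict;
use warnings;

# Hypothetical model (illustrative numbers), laid out as
# start{state}, transition{from}{to}, emission{observation}{state}.
my @states = qw(Rainy Sunny);
my %start  = (Rainy => 0.6, Sunny => 0.4);
my %transition = (
    Rainy => { Rainy => 0.7, Sunny => 0.3 },
    Sunny => { Rainy => 0.4, Sunny => 0.6 },
);
my %emission = (
    walk  => { Rainy => 0.1, Sunny => 0.6 },
    shop  => { Rainy => 0.4, Sunny => 0.3 },
    clean => { Rainy => 0.5, Sunny => 0.1 },
);

# Returns (forward probability, Viterbi path, Viterbi probability).
sub viterbi_sketch {
    my ($obs) = @_;
    my ($first, @rest) = @$obs;

    # Per state: [ sum over all paths, best path so far, its probability ].
    my %t;
    for my $s (@states) {
        my $p = $start{$s} * ($emission{$first}{$s} // 0);
        $t{$s} = [ $p, [$s], $p ];
    }

    for my $o (@rest) {
        my %u;
        for my $next (@states) {
            my ($total, $best_path, $best_prob) = (0, [], -1);
            for my $prev (@states) {
                my ($f, $path, $v) = @{ $t{$prev} };
                # Probability of moving prev -> next and emitting $o there.
                my $step = ($transition{$prev}{$next} // 0)
                         * ($emission{$o}{$next} // 0);
                $total += $f * $step;            # forward: sum over paths
                if ($v * $step > $best_prob) {   # Viterbi: keep the maximum
                    ($best_path, $best_prob) = ([ @$path, $next ], $v * $step);
                }
            }
            $u{$next} = [ $total, $best_path, $best_prob ];
        }
        %t = %u;
    }

    # Combine the per-state results for the last observation.
    my ($forward, $v_path, $v_prob) = (0, [], -1);
    for my $s (@states) {
        my ($f, $path, $v) = @{ $t{$s} };
        $forward += $f;
        ($v_path, $v_prob) = ($path, $v) if $v > $v_prob;
    }
    return ($forward, $v_path, $v_prob);
}

my ($prob, $v_path, $v_prob) = viterbi_sketch([qw(walk shop clean)]);
printf "forward=%.6f path=%s viterbi=%.5f\n",
    $prob, join(' ', @$v_path), $v_prob;
# prints: forward=0.033612 path=Sunny Rainy Rainy viterbi=0.01344
```

The forward probability sums over every state sequence that could have
produced the observations, while the Viterbi path and probability keep only
the single most likely sequence.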
=cut
use strict;
use warnings;
=head1 METHODS
=over 8
=item new
Creates a new C<Algorithm::Viterbi> object.
The following attributes can be set with the constructor:
    my $v = Algorithm::Viterbi->new(
        start_state => '$',
        unknown_emission_prob => undef,
        unknown_transition_prob => 0);
The values of the attributes in the example are the default values.
For a detailed description and use of these attributes, see below.
=cut
sub new
{
    my $class = shift;
    my $self = {@_};
    bless $self, $class;
    $self->{unknown_transition_prob} = 0 if (!defined($self->{unknown_transition_prob}));
    $self->{start_state} = '$' if (!defined($self->{start_state}));
    return $self;
}
=item train
This method computes the start, emission and transition probabilities
from a set of observations and their associated states.
The probabilities are simple relative frequencies counted from the passed
observations, with no smoothing. If you require sophisticated smoothing of
the start, emission and/or transition probabilities, you're better off
rolling your own.
The C<start_state> attribute is a dummy state that serves as the source
state of the first transition. By default it is set to '$'. You can change
it in the constructor, or later by accessing the member directly. See the
example below.
This state can also be used as a separator between consecutive sequences
of observations. For example, you could assign this state (tag) to every
end-of-sentence symbol when training on a pre-tagged corpus.
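As a rough illustration of the separator idea, the sketch below repeats
only the transition-counting step on a tiny corpus. The corpus, its tags
and the variable names are hypothetical. It does not call the module; it
mirrors the counting loop in C<train>, where each state is counted as a
transition from the previous one, beginning at the start state.

```perl
use strict;
use warnings;

my $start_state = '$';    # the module's default start state

# Hypothetical pre-tagged corpus: two sentences, each ending in a '.'
# that carries the start state as its tag.
my $corpus = [
    [ 'dogs', 'NOUN' ], [ 'bark',  'VERB' ], [ '.', '$' ],
    [ 'cats', 'NOUN' ], [ 'sleep', 'VERB' ], [ '.', '$' ],
];

# Count every tag as a transition from the previous tag,
# starting from $start_state.
my %transition_count;
my $prev = $start_state;
for my $pair (@$corpus) {
    my ($word, $tag) = @$pair;
    $transition_count{$prev}{$tag}++;
    $prev = $tag;
}

printf "%s -> NOUN: %d\n", $start_state,
    $transition_count{$start_state}{NOUN};
# prints: $ -> NOUN: 2
```

Because the end-of-sentence symbol is tagged with the start state, the
first word of the second sentence is counted as following the start state
rather than the last word of the first sentence.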
The training data is passed as a reference to an array of
[ observation, state ] pairs, as shown in the following example:
    use strict;
    use Algorithm::Viterbi;
    use Data::Dumper;
    my $observations = [
        [ 'work', 'rainy' ],
        [ 'work', 'sunny' ],
        [ 'walk', 'sunny' ],
        [ 'walk', 'rainy' ],
        [ 'shop', 'rainy' ],
        [ 'work', 'rainy' ],
    ];
    my $v = Algorithm::Viterbi->new(start_state => '###');
    $v->train($observations);
    print Dumper($v);
will produce:
    $VAR1 = bless( {
      'transition' => {
        'sunny' => {
          'sunny' => '0.5',
          'rainy' => '0.25'
        },
        'rainy' => {
          'sunny' => '0.5',
          'rainy' => '0.5'
        },
        '###' => {
          'rainy' => '0.25'
        }
      },
      'emission' => {
        'shop' => {
          'rainy' => '0.25'
        },
        'walk' => {
          'sunny' => '0.5',
          'rainy' => '0.25'
        },
        'work' => {
          'sunny' => '0.5',
          'rainy' => '0.5'
        }
      },
      'start_state' => '###',
      'states' => [
        'sunny',
        'rainy'
      ],
      'unknown_transition_prob' => 0,
      'start' => {
        'sunny' => '0.333333333333333',
        'rainy' => '0.666666666666667'
      }
    }, 'Algorithm::Viterbi' );
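The C<start> and C<emission> values in the dump above can be reproduced
with a short stand-alone sketch. The normalisation used here (start = share
of all observations, emission = count divided by the number of times the
state occurs) is an assumption inferred from those numbers, not taken from
the module's code, and the variable names are illustrative.

```perl
use strict;
use warnings;

my $observations = [
    [ 'work', 'rainy' ],
    [ 'work', 'sunny' ],
    [ 'walk', 'sunny' ],
    [ 'walk', 'rainy' ],
    [ 'shop', 'rainy' ],
    [ 'work', 'rainy' ],
];

# Count how often each state occurs and how often it emits each observation.
my (%state_count, %emit_count);
for my $pair (@$observations) {
    my ($obs, $state) = @$pair;
    $state_count{$state}++;
    $emit_count{$obs}{$state}++;
}

# Assumed normalisation, inferred from the example output:
# start    = state count / total number of observations
# emission = (observation, state) count / state count
my $total = @$observations;
my %start = map { $_ => $state_count{$_} / $total } keys %state_count;
my %emission;
for my $obs (keys %emit_count) {
    for my $state (keys %{ $emit_count{$obs} }) {
        $emission{$obs}{$state}
            = $emit_count{$obs}{$state} / $state_count{$state};
    }
}

printf "start(rainy)=%.3f emission(walk|rainy)=%.2f\n",
    $start{rainy}, $emission{walk}{rainy};
# prints: start(rainy)=0.667 emission(walk|rainy)=0.25
```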
=cut
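sub train
{
    my ($self, $training_data) = @_;
    my $ep = {};    # emission counts, keyed by observation then state
    my $tp = {};    # transition counts, keyed by previous state then state
    my $sp = {};    # number of occurrences of each state
    my $pt = $self->{start_state};    # previous state; begins at start_state
    foreach my $o (@$training_data) {
        my ($a, $t) = @$o;    # (observation, state)
        $ep->{$a}{$t}++;
        $tp->{$pt}{$t}++;
        $pt = $t;
        $sp->{$t}++;
    }
    #emission
    foreach my $a (keys %$ep) {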