Algorithm-Viterbi
view release on metacpan or search on metacpan
algorithm in Wikipedia, the free encyclopedia. Rather than copying all
text, I'm just including the link to the Wikipedia page:
<http://en.wikipedia.org/wiki/Viterbi_algorithm>. I think the page is
well-written and I see no need to repeat the theory here. Reading it may
clarify the documentation below.
METHODS
new Creates a new "Algorithm::Viterbi" object. The following
attributes can be set with the constructor:
my $v = Algorthm::Viterbi->new(
start_state => '$',
unknown_emission_prob => undef,
unknown_transition_prob => 0);
The values of the attributes in the example are the default
values. For a detailed description and use of these attributes,
see below.
train This method computes the start, emission and transition
probabilities from a set of observations and their associated
states. The probabilities are simple averages of the passed
observations, so if you require sophisticated smoothing on the
emission, start and/or transition, then you're better off
rolling your own.
The value of member start_state is a bogus state used to define
the begin state of the first transition. By default, this state
is set to '$'. You can change this by setting the variable in
the constructor or later by accessing the member directly. See
example below.
This state can also be used as a separator between the beginning
and end of a sequence of observations. For example, you could
assign this state (tag) to every end-of-sentence symbol when
training on a pre-tagged corpus.
The set of observations is passed as a reference to an array as
shown in the following example:
use strict;
use Algorithm::Viterbi;
use Data::Dumper;
my $observations = [
[ 'work', 'rainy' ],
[ 'work', 'sunny' ],
[ 'walk', 'sunny' ],
[ 'walk', 'rainy' ],
[ 'shop', 'rainy' ],
[ 'work', 'rainy' ],
];
my $v = Algorithm::Viterbi->new(start_state => '###');
$v->train($observations);
print Dumper($v);
will produce:
$VAR1 = bless( {
'transition' => {
'sunny' => {
'sunny' => '0.5',
'rainy' => '0.25'
},
'rainy' => {
'sunny' => '0.5',
'rainy' => '0.5'
},
'###' => {
'rainy' => '0.25'
}
},
'emission' => {
'shop' => {
'rainy' => '0.25'
},
'walk' => {
'sunny' => '0.5',
'rainy' => '0.25'
},
'work' => {
'sunny' => '0.5',
'rainy' => '0.5'
}
},
'start_state' => '###',
'states' => [
'sunny',
'rainy'
],
'unknown_transition_prob' => 0,
'start' => {
'sunny' => '0.333333333333333',
'rainy' => '0.666666666666667'
}
}, 'Algorithm::Viterbi' );
start Initializes the start probabilities. The start probabilities are
passed as a reference to a hash, as shown in this example:
my $start_probability = { 'Rainy'=> 0.6, 'Sunny'=> 0.4 };
my $v = Algorithm::Viterbi->new();
$v->start($start_probability);
From the start probabilities, all possible states are derived,
by copying the keys of the start hash. This list of states is
used by the forward_viterbi method. It is therefore important to
mention all possible states in the start hash.
Returns the start probabilities.
emission
Initializes the emission probabilities. The emission is passed
as a reference to a hash, as shown in this example:
my $emission_probability = {
'shop' => { 'Sunny' => '0.3', 'Rainy' => '0.4' },
'swim' => { 'Sunny' => '0.1' },
'walk' => { 'Sunny' => '0.5', 'Rainy' => '0.1' },
( run in 1.089 second using v1.01-cache-2.11-cpan-cdf2f3d4e48 )