Algorithm-ExpectationMaximization

 view release on metacpan or  search on metacpan

lib/Algorithm/ExpectationMaximization.pm  view on Meta::CPAN

indicates that the ID tag is in the first column, the
character C<0> that the second column is to be ignored, and
the C<1>'s that follow that the 3rd, the 4th, and the 5th
columns are to be used for clustering.

If instead of random seeding, you wish to use the kmeans
based seeding, just replace the option C<random> supplied
for C<seeding> by C<kmeans>.  You can also do manual seeding
by designating a specified set of data elements to serve as
cluster seeds.  The call to the constructor in this case
looks like

    my $clusterer = Algorithm::ExpectationMaximization->new(
                                datafile            => $datafile,
                                mask                => $mask,
                                K                   => 3,
                                max_em_iterations   => 300,
                                seeding             => 'manual',
                                seed_tags           => ['a26', 'b53', 'c49'],
                                terminal_output     => 1,
                    );

where the option C<seed_tags> is set to an anonymous array
of symbolic names associated with the data elements.

If you know the class priors, you can supply them through an
additional option to the constructor that looks like

    class_priors    => [0.6, 0.2, 0.2],

for the case of C<K> equal to 3.  B<In general, this would
be a useful thing to do only for the case of manual
seeding.> If you go for manual seeding, the order in which
the priors are expressed should correspond to the order of
the manually chosen tags supplied through the C<seed_tags>
option.



( run in 0.856 second using v1.01-cache-2.11-cpan-64827b87656 )