Algorithm-FuzzyCmeans
=head1 NAME
Algorithm::FuzzyCmeans - Perl implementation of fuzzy c-means clustering
=head1 SYNOPSIS
    use Algorithm::FuzzyCmeans;

    # input documents
    my %documents = (
        Alex => { 'Pop' => 10, 'R&B' => 6, 'Rock' => 4 },
        Bob  => { 'Jazz' => 8, 'Reggae' => 9 },
        Dave => { 'Classic' => 4, 'World' => 4 },
        Ted  => { 'Jazz' => 9, 'Metal' => 2, 'Reggae' => 6 },
        Fred => { 'Hip-hop' => 3, 'Rock' => 3, 'Pop' => 3 },
        Sam  => { 'Classic' => 8, 'Rock' => 1 },
    );

    my $fcm = Algorithm::FuzzyCmeans->new(
        distance_class => 'Algorithm::FuzzyCmeans::Distance::Cosine',
        m              => 2.0,
    );
    foreach my $id (keys %documents) {
        $fcm->add_document($id, $documents{$id});
    }

    my $num_cluster = 3;
    my $num_iter    = 20;
    $fcm->do_clustering($num_cluster, $num_iter);

    # show clustering result
    foreach my $id (sort { $a cmp $b } keys %{ $fcm->memberships }) {
        printf "%s\t%s\n", $id,
            join "\t", map { sprintf "%.4f", $_ } @{ $fcm->memberships->{$id} };
    }

    # show cluster centroids
    foreach my $centroid (@{ $fcm->centroids }) {
        print join "\t", map { sprintf "%s:%.4f", $_, $centroid->{$_} }
            keys %{ $centroid };
        print "\n";
    }
=head1 DESCRIPTION
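Algorithm::FuzzyCmeans is a Perl implementation of fuzzy c-means clustering, a soft clustering method in which each document belongs to every cluster with a degree of membership rather than to exactly one cluster.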
=head1 METHODS
=head2 new
Create a new instance.
The `m' option is the fuzziness coefficient and must be greater than 1.0 (default: 2.0).
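To illustrate what the fuzziness coefficient does, the following is a minimal sketch of the textbook fuzzy c-means membership formula, u_j = 1 / sum_k (d_j / d_k) ** (2 / (m - 1)). It is not taken from the module's source. The function name memberships_for is hypothetical, and the sketch assumes that every distance is strictly positive.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Algorithm::FuzzyCmeans): given the
# distances from one document to every centroid, return its degrees of
# membership using the textbook fuzzy c-means formula.
# Assumes every distance is strictly positive and $m > 1.
sub memberships_for {
    my ($m, @dist) = @_;
    die "m must be greater than 1.0\n" unless $m > 1.0;
    my $exp = 2 / ($m - 1);
    my @u;
    for my $dj (@dist) {
        my $sum = 0;
        $sum += ($dj / $_) ** $exp for @dist;
        push @u, 1 / $sum;
    }
    return @u;
}

# Same distances (1 and 2), two values of m.
for my $m (2.0, 3.0) {
    my @u = memberships_for($m, 1, 2);
    printf "m=%.1f: %s\n", $m, join ' ', map { sprintf '%.4f', $_ } @u;
}
```

With distances 1 and 2, this prints memberships of 0.8000 and 0.2000 for m = 2.0, and 0.6667 and 0.3333 for m = 3.0. A larger m therefore gives softer, more evenly spread memberships.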
The `distance_class' option is the name of a class that provides the distance function between vectors. Currently, 'Algorithm::FuzzyCmeans::Distance::Euclid' (Euclidean distance) and 'Algorithm::FuzzyCmeans::Distance::Cosine' (cosine distance) are supported (default: cosine).
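For readers unfamiliar with the two measures, here is a rough sketch of how they can be computed on sparse vectors stored as hash references. These helpers are hypothetical and are not the code of the distance classes named above. The sketch assumes that cosine distance means 1 minus cosine similarity, which is a common convention but may differ from what the module does.

```perl
use strict;
use warnings;

# Hypothetical helpers, not the module's own code: distances between two
# sparse vectors stored as hash references (missing keys count as 0).

sub euclid_distance {
    my ($x, $y) = @_;
    my %keys = map { ($_ => 1) } keys %$x, keys %$y;
    my $sum = 0;
    for my $k (keys %keys) {
        my $diff = ($x->{$k} // 0) - ($y->{$k} // 0);
        $sum += $diff * $diff;
    }
    return sqrt($sum);
}

sub cosine_distance {
    my ($x, $y) = @_;
    my ($dot, $nx, $ny) = (0, 0, 0);
    $nx += $_ ** 2 for values %$x;
    $ny += $_ ** 2 for values %$y;
    for my $k (keys %$x) {
        $dot += $x->{$k} * $y->{$k} if exists $y->{$k};
    }
    # Assumption: an all-zero vector is treated as maximally distant.
    return 1 if $nx == 0 || $ny == 0;
    return 1 - $dot / sqrt($nx * $ny);
}

my $bob = { Jazz => 8, Reggae => 9 };
my $ted = { Jazz => 9, Metal => 2, Reggae => 6 };
printf "euclid: %.4f\n", euclid_distance($bob, $ted);
printf "cosine: %.4f\n", cosine_distance($bob, $ted);
```

Cosine distance ignores vector length, so it tends to suit documents whose feature counts differ widely in scale. Euclidean distance is sensitive to magnitude.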
=head2 add_document($id, $vector)
Add an input document to the Algorithm::FuzzyCmeans instance. The $id parameter is the identifier of the document, and the $vector parameter is its feature vector. $vector must be a hash reference; each key of $vector parameter is th...
=head2 do_clustering($num_cluster, $num_iter)
Cluster the input documents. The $num_cluster parameter specifies the number of output clusters, and the $num_iter parameter specifies the number of clustering iterations.
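In the textbook algorithm, each iteration recomputes the centroids from the current memberships and then the memberships from the new centroids. The sketch below shows only the centroid step, c_j = sum_i (u_ij ** m) * x_i / sum_i (u_ij ** m). It is an illustration under that assumption, not the module's implementation, and update_centroids is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical sketch of the textbook centroid update, not the module's code.
# $docs: { id => { feature => weight, ... }, ... }
# $u:    { id => [ membership in cluster 0, cluster 1, ... ], ... }
sub update_centroids {
    my ($docs, $u, $num_cluster, $m) = @_;
    my @centroids;
    for my $j (0 .. $num_cluster - 1) {
        my %sum;
        my $norm = 0;
        for my $id (keys %$docs) {
            my $w = $u->{$id}[$j] ** $m;    # membership raised to m
            $norm += $w;
            $sum{$_} += $w * $docs->{$id}{$_} for keys %{ $docs->{$id} };
        }
        if ($norm > 0) {
            $sum{$_} /= $norm for keys %sum;
        }
        push @centroids, \%sum;
    }
    return \@centroids;
}

# Made-up documents and memberships for illustration only.
my $docs = { a => { x => 2 }, b => { x => 4, y => 2 } };
my $u    = { a => [0.5, 0.5], b => [0.5, 0.5] };
my $new  = update_centroids($docs, $u, 2, 2.0);
for my $j (0 .. $#$new) {
    print "centroid $j: ",
        join(' ', map { "$_=$new->{$_}" } ()),
        join(' ', map { sprintf '%s:%.4f', $_, $new->[$j]{$_} } sort keys %{ $new->[$j] }),
        "\n";
}
```

Because both documents have equal membership in both clusters here, each centroid is simply the average of the two vectors.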
=head2 memberships
Accessor for the clustering result. It returns a hash reference whose keys are the identifiers of the input documents and whose values are array references holding each document's degree of membership in every output cluster.
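When a single cluster per document is wanted, the degrees of membership can be reduced to a hard assignment by taking the largest one. The helper below is a hypothetical sketch that works on any hash reference of the shape described above; the sample values are made up and are not output of the module.

```perl
use strict;
use warnings;

# Hypothetical helper: pick the cluster with the largest degree of
# membership for each document. $memberships has the shape
# { id => [ degree for cluster 0, degree for cluster 1, ... ] }.
sub hard_assignments {
    my ($memberships) = @_;
    my %best;
    for my $id (keys %$memberships) {
        my $u   = $memberships->{$id};
        my $top = 0;
        for my $j (1 .. $#$u) {
            $top = $j if $u->[$j] > $u->[$top];
        }
        $best{$id} = $top;
    }
    return \%best;
}

# Made-up membership values for illustration only.
my $sample = {
    Alex => [0.10, 0.85, 0.05],
    Bob  => [0.70, 0.20, 0.10],
};
my $assigned = hard_assignments($sample);
print "$_ => cluster $assigned->{$_}\n" for sort keys %$assigned;
```

In real use the argument would be the result of $fcm->memberships after do_clustering has run.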
=head2 centroids
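Accessor for the vectors of the cluster centroids. It returns an array reference with one hash reference per cluster, in which each key is a feature and each value is that feature's weight in the centroid.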
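A centroid is easier to read when its features are sorted by weight. The following sketch lists the heaviest features of each centroid; top_features is a hypothetical helper and the centroid values are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the $n heaviest features of one centroid,
# given as a hash reference of feature => weight. Ties are broken by name.
sub top_features {
    my ($centroid, $n) = @_;
    my @keys = sort { $centroid->{$b} <=> $centroid->{$a} || $a cmp $b }
               keys %$centroid;
    $n = @keys if $n > @keys;
    return @keys[0 .. $n - 1];
}

# Made-up centroids for illustration only.
my @centroids = (
    { Jazz => 0.61, Reggae => 0.58, Metal => 0.07 },
    { Pop  => 0.52, Rock   => 0.40, 'R&B' => 0.31 },
);
for my $i (0 .. $#centroids) {
    print "cluster $i: ", join(', ', top_features($centroids[$i], 2)), "\n";
}
```

In real use the loop would run over @{ $fcm->centroids } instead of the sample array.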
=head1 AUTHOR
Mizuki Fujisawa E<lt>fujisawa@bayon.ccE<gt>
=head1 SEE ALSO
=over
=item Wikipedia: Fuzzy c-means clustering
http://en.wikipedia.org/wiki/Cluster_Analysis#Fuzzy_c-means_clustering
=back
=head1 LICENSE
This library is free software; you can redistribute it and/or modify
it under the same terms as Perl itself.
=cut