Algorithm-LinearManifoldDataClusterer

 view release on metacpan or  search on metacpan

lib/Algorithm/LinearManifoldDataClusterer.pm  view on Meta::CPAN

I<In linear-manifold based machine learning, instead of constructing a single
subspace in the manner described above, we construct a set of subspaces, one for each
data cluster on the nonlinear manifold.  After the subspaces have been constructed, a
data element is assigned to that subspace for which the reconstruction error is the
least.> On the face of it, this sounds like a chicken-and-egg sort of a problem.  You
need to have already clustered the data in order to construct the subspaces at
different places on the manifold so that you can figure out which cluster to place a
data element in.

Such problems, when they do possess a solution, are best tackled through iterative
algorithms in which you start with a guess for the final solution, you rearrange the
measured data on the basis of the guess, and you then use the new arrangement of the
data to refine the guess.  Subsequently, you iterate through the second and the third
steps until you do not see any discernible changes in the new arrangements of the
data.  This forms the basis of the clustering algorithm that is described under
B<Phase 1> in the section that follows.  This algorithm was first proposed in the
article "Dimension Reduction by Local Principal Component Analysis" by Kambhatla and
Leen that appeared in the journal Neural Computation in 1997.

Unfortunately, experiments show that the algorithm as proposed by Kambhatla and Leen
is much too sensitive to how the clusters are seeded initially.  To get around this
limitation of the basic clustering-by-minimization-of-reconstruction-error, this
module implements a two phased approach.  In B<Phase 1>, we introduce a multiplier

lib/Algorithm/LinearManifoldDataClusterer.pm  view on Meta::CPAN

output the clustering result produced by the graph partitioning algorithm.

=back

=over

=item B<get_data_from_csv()>:

    $clusterer->get_data_from_csv();

As you can guess from its name, the method extracts the data from the CSV file named
in the constructor.

=item B<linear_manifold_clusterer()>:

    $clusterer->linear_manifold_clusterer();   

    or 

    my $clusters = $clusterer->linear_manifold_clusterer();



( run in 0.487 second using v1.01-cache-2.11-cpan-ba35b6b0368 )