Algorithm-KNN-XS

 view release on metacpan or  search on metacpan

lib/Algorithm/KNN/XS.pm  view on Meta::CPAN

package Algorithm::KNN::XS;
use 5.006001;
use strict;
use warnings;

our $VERSION = '0.01001';

require XSLoader;
XSLoader::load('Algorithm::KNN::XS', $VERSION);

our %ANN_SPLIT_RULE = (
    'ANN_KD_SUGGEST'  => 0,
    'ANN_KD_STD'      => 1,
    'ANN_KD_MIDPT'    => 2,
    'ANN_KD_FAIR'     => 3,
    'ANN_KD_SL_MIDPT' => 4,
    'ANN_KD_SL_FAIR'  => 5,
);

our %ANN_SHRINK_RULE = (
    'ANN_BD_SUGGEST'  => 0,
    'ANN_BD_NONE'     => 1,
    'ANN_BD_SIMPLE'   => 2,
    'ANN_BD_CENTROID' => 3,
);

=head1 NAME

Algorithm::KNN::XS - A class interface to perform a fast k neareast neighbor
search using libANN via XS.

=head1 SYNOPSIS

    use Algorithm::KNN::XS;
    
    # define data points in a N dimensional space
    my @points = (
      [7.45, 2],
      [16.56, 32.1],
    );
    
    # create a new knn object
    my $knn = Algorithm::KNN::XS->new(points => \@points);
    
    # perform a k nearest neighbor search at the given query point, the result
    # contains all neighbors and their distance to the query point.
    my $result = $knn->annkSearch(query_point => [4.23, 2.45]);

=head1 DESCRIPTION

A class interface todo a fast k nearest neighbor search using libANN via XS.

The XS Module automatically creates the namespace
Algorithm::KNN::XS::LibANNInterface.

More information about libANN can be found with the following links.

libANN main page: L<http://www.cs.umd.edu/~mount/ANN/>
libANN manual: L<http://www.cs.umd.edu/~mount/ANN/Files/1.1.2/ANNmanual_1.1.pdf>

If you want to use another Minkowski distance metric than "Euclidean" you must
compile another libANN library with the correct options before building this
Module. Possible other distances are "Manhatten" and "Max".

=head1 USAGE

=head2 Methods

=over 4

=item * Algorithm::KNN::XS->new ( ... )

This class method instantiates a new kd or bd tree from a points array or a
string of a dumped tree.

If points and a dump string is given, the dump string will be ignored.

The method will croak if something goes wrong.

libANN will do a exit(1) which is not catchable via eval if a wrong dump
string is given, so make sure that the dump is valid before using that
parameter.

    my $knn = Algorithm::KNN::XS->new(
        points      => [
            [3.453, 2.324],
            [6.874, 4.874],
        ],
        dump        => '',
        bd_tree     => 1,
        bucket_size => 1,
        split_rule  => 'ANN_KD_SUGGEST',
        shrink_rule => 'ANN_BD_SUGGEST',
    );

=over 8

=item * points

The number of points provided in each element also sets the dimension of the
tree that is going to be created. Each item must have the same amount of
points.

Either this or the dump parameter is mandatory.

=item * dump

A String of a tree that got dumped via the method Dump(points => 1) of this
module.

You can use a dump of a kd tree to create a bd tree but not the other way
around.

libANN will do a exit(1) which is not catchable via eval if you try to input
a string that was dumped without the "points" parameter set to 1 or if the
string is not a valid dump.

Either this or the points parameter is mandatory.

lib/Algorithm/KNN/XS.pm  view on Meta::CPAN

    my $stats = {
        dimension             => 2,
        n_points              => 7,
        bucket_size           => 1,
        leaves                => 7,
        trvial_leaves         => 0,
        splitting_nodes       => 6,
        shrinking_nodes       => 0,
        depth                 => 4,
        avg_leaf_aspect_ratio => 1.61012744903564,
    };

=back

=back

=cut

sub getStats {
    my $stats = shift->tree->getStats();

    return {
        dimension             => $stats->[0],
        n_points              => $stats->[1],
        bucket_size           => $stats->[2],
        leaves                => $stats->[3],
        trvial_leaves         => $stats->[4],
        splitting_nodes       => $stats->[5],
        shrinking_nodes       => $stats->[6],
        depth                 => $stats->[7],
        avg_leaf_aspect_ratio => $stats->[8],
    };
}

# helper function to get the data in a better to handle array of hashes
sub _restructure_return_value {
    my $self             = shift;
    my $result           = shift;
    my $idx_last_element = $self->tree()->theDim();

    my @return_value;

    foreach my $idx (0 .. scalar @{$result} - 1) {
        foreach my $idx_element (0 .. $idx_last_element) {
            if ($idx_element != $idx_last_element) {
                push @{$return_value[$idx]->{point}}, $result->[$idx]->[$idx_element];
            }
            else {
                $return_value[$idx]->{distance} = $result->[$idx]->[$idx_element];
            }
        }
    }

    return \@return_value;
}

=head1 SEE ALSO

=over 4

=item * L<http://www.cs.umd.edu/~mount/ANN/>

=item * L<http://www.cs.umd.edu/~mount/ANN/Files/1.1.2/ANNmanual_1.1.pdf>

=back

=head1 AUTHOR

Stephan Conrad, E<lt>conrad@stephanconrad.deE<gt>

=head1 COPYRIGHT AND LICENSE

Copyright (C) 2011 by Stephan Conrad

This library is free software; you can redistribute it and/or modify
it under the same terms as Perl itself, either Perl version 5.6.1 or,
at your option, any later version of Perl 5 you may have available.

=cut

1;

__END__



( run in 1.114 second using v1.01-cache-2.11-cpan-df04353d9ac )