Algorithm-SVMLight
lib/Algorithm/SVMLight.pm
sub predict {
  my ($self, %params) = @_;
  for ('attributes') {
    die "Missing required '$_' parameter" unless exists $params{$_};
  }

  # Translate attribute names to feature indices; names with no known
  # index are ignored.
  my (@values, @indices);
  while (my ($key) = each %{ $params{attributes} }) {
    push @indices, $self->{features}{$key} if exists $self->{features}{$key};
  }

  # Put indices in ascending order and collect values in the same order.
  @indices = sort {$a <=> $b} @indices;
  foreach my $i (@indices) {
    push @values, $params{attributes}{ $self->{rfeatures}[$i] };
  }

  # warn "Predicting: (@indices), (@values)\n";
  $self->predict_i(\@indices, \@values);
}
sub add_instance {
  my ($self, %params) = @_;
  for ('attributes', 'label') {
    die "Missing required '$_' parameter" unless exists $params{$_};
  }
  for ($params{label}) {
    die "Label must be a real number, not '$_'" unless /^-?\d+(\.\d+)?$/;
  }

  # Give each attribute name not seen before the next free 1-based index,
  # and record the name in the reverse (index => name) array.
  my @values;
  my @indices;
  while (my ($key, $val) = each %{ $params{attributes} }) {
    unless ( exists $self->{features}{$key} ) {
      $self->{features}{$key} = 1 + keys %{ $self->{features} };
      push @{ $self->{rfeatures} }, $key;
    }
    push @indices, $self->{features}{$key};
  }

  # Put indices in ascending order and collect values in the same order.
  @indices = sort { $a <=> $b} @indices;
  foreach my $i (@indices) {
    push @values, $params{attributes}{ $self->{rfeatures}[$i] };
  }

  #warn "Adding document: (@indices), (@values) => $params{label}\n";

  # Optional parameters and their defaults.
  my $id = exists $params{query_id} ? $params{query_id} : 0;
  my $slack = exists $params{slack_id} ? $params{slack_id} : 1;
  my $cost = exists $params{cost_factor} ? $params{cost_factor} : 1.0;
  $self->add_instance_i($params{label}, "", \@indices, \@values, $id, $slack, $cost);
}
sub write_model {
  my ($self, $file) = @_;
  $self->_write_model($file);

  # Write a footer line
  if ( my $numf = keys %{ $self->{features} } ) {
    open my $fh, '>>', $file or die "Can't write footer to $file: $!";
    print $fh ('#rfeatures: [undef, ' ,
               join( ', ', map _escape($self->{rfeatures}[$_]), 1..$numf ),
               "]\n");
    close $fh or die "Can't write footer to $file: $!";
  }
}
sub read_model {
  my ($self, $file) = @_;
  $self->_read_model($file);

  # Read the footer line.  Its contents are passed to eval, so only load
  # model files from a trusted source.
  open my $fh, '<', $file or die "Can't read $file: $!";
  local $_;
  while (<$fh>) {
    next unless /^#rfeatures: (\[.*\])$/;
    my $rf = $self->{rfeatures} = eval $1;
    die $@ if $@;
    $self->{features} = { map {$rf->[$_], $_} 1..$#$rf };
  }
  close $fh;
}
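sub _escape {
  # Quote a feature name as a single-quoted Perl literal on one line.
  # Note: \n and \r are not expanded inside single quotes, so a name
  # containing a newline is read back with a literal backslash sequence.
  local $_ = shift;
  s/([\\'])/\\$1/g;
  s/\n/\\n/g;
  s/\r/\\r/g;
  return "'$_'";
}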
1;
__END__
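The footer that write_model appends and read_model parses back can be exercised on its own, without a model file. The following is a minimal sketch, not the module's code: the names quote_name, footer_line and parse_footer are hypothetical, and the newline handling of _escape is left out for brevity. It assumes, as the code above does, that slot 0 of the reverse array is unused.

```perl
use strict;
use warnings;

# Quote one feature name as a single-quoted Perl literal, escaping only
# backslashes and single quotes (the two characters special inside '...').
sub quote_name {
    my ($name) = @_;
    $name =~ s/([\\'])/\\$1/g;
    return "'$name'";
}

# Build the footer line; slot 0 is written as undef because indices start at 1.
sub footer_line {
    my (@rfeatures) = @_;
    return '#rfeatures: [undef, '
         . join(', ', map { quote_name($_) } @rfeatures[1 .. $#rfeatures])
         . "]\n";
}

# Parse a footer line back into the reverse array and the name => index hash.
# String eval runs whatever the line contains, so use it on trusted input only.
sub parse_footer {
    my ($line) = @_;
    return unless $line =~ /^#rfeatures: (\[.*\])$/;
    my $rf = eval $1;
    die $@ if $@;
    my %features = map { ($rf->[$_] => $_) } 1 .. $#$rf;
    return ($rf, \%features);
}

my @names = (undef, 'foo', "it's", 'back\\slash');
my $line  = footer_line(@names);
my ($rf, $features) = parse_footer($line);
print $line;
```

Names containing a quote or a backslash survive the round trip because both characters are escaped before the literal is built.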
=head1 NAME

Algorithm::SVMLight - Perl interface to SVMLight Machine-Learning Package

=head1 SYNOPSIS

  use Algorithm::SVMLight;
  my $s = Algorithm::SVMLight->new;

  $s->add_instance
    (attributes => {foo => 1, bar => 1, baz => 3},
     label => 1);

  $s->add_instance
    (attributes => {foo => 2, blurp => 1},
     label => -1);

  ... repeat for several more instances, then:
  $s->train;

  # Find results for unseen instances
  my $result = $s->predict
    (attributes => {bar => 3, blurp => 2});

=head1 DESCRIPTION

This module implements a Perl interface to Thorsten Joachims' SVMLight
package:

=over 4

SVMLight is an implementation of Vapnik's Support Vector Machine
[Vapnik, 1995] for the problem of pattern recognition, for the problem
of regression, and for the problem of learning a ranking function. The
optimization algorithms used in SVMlight are described in [Joachims,
2002a] and [Joachims, 1999a]. The algorithm has scalable memory
requirements and can handle problems with many thousands of support
vectors efficiently.

-- http://svmlight.joachims.org/

=back

Support Vector Machines in general, and SVMLight specifically,
represent some of the best-performing Machine Learning approaches in
domains such as text categorization, image recognition, bioinformatics
string processing, and others.

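The translation from attribute names to integer indices that add_instance and predict perform can be tried without the compiled SVMLight library. The following is a minimal sketch in plain Perl, not the module's own code: to_sparse, %features, @rfeatures and the $grow flag are hypothetical names. It also sorts the attribute names so the assigned indices are reproducible, whereas the module iterates with each and so assigns indices in hash order.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the module's internal state: a name => index
# hash and an index => name array whose slot 0 is left unused.
my %features;
my @rfeatures = (undef);

# Return sorted indices and matching values for one attribute hash.
# With $grow true, unseen names get the next free index (as when adding
# training instances); otherwise they are skipped (as when predicting).
sub to_sparse {
    my ($attrs, $grow) = @_;
    my @indices;
    for my $name (sort keys %$attrs) {
        if (!exists $features{$name}) {
            next unless $grow;
            my $next = 1 + keys %features;   # indices are 1-based
            $features{$name} = $next;
            push @rfeatures, $name;
        }
        push @indices, $features{$name};
    }
    @indices = sort { $a <=> $b } @indices;
    my @values = map { $attrs->{ $rfeatures[$_] } } @indices;
    return (\@indices, \@values);
}

my ($i1, $v1) = to_sparse({ foo => 1, bar => 1, baz => 3 }, 1);
my ($i2, $v2) = to_sparse({ foo => 2, blurp => 1 }, 1);
my ($i3, $v3) = to_sparse({ bar => 3, blurp => 2, unseen => 9 }, 0);

print "indices: @$i3  values: @$v3\n";
```

With the sorted assignment used here, bar, baz, foo and blurp receive indices 1 to 4, so the last call yields indices (1, 4) with values (3, 2); the attribute named unseen has no index and is dropped.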
For efficiency reasons, the underlying SVMLight engine indexes features by integers, not