Algorithm-PageRank-XS
lib/Algorithm/PageRank/XS.pm
package Algorithm::PageRank::XS;
use 5.008005;
use strict;
use warnings;
use Carp;
require Exporter;
use AutoLoader;
our $VERSION = '0.04';
require XSLoader;
XSLoader::load('Algorithm::PageRank::XS', $VERSION);
1;
__END__;
=head1 NAME
Algorithm::PageRank::XS - A Fast PageRank implementation
=head1 DESCRIPTION
This module implements a simple PageRank algorithm in C. The goal is
to quickly get a vector that is close to the principal eigenvector of
the stochastic matrix of a graph.
L<Algorithm::PageRank> also computes PageRank, but it is slow and
memory intensive. This module was developed to compute PageRank
on graphs with millions of arcs. It will not, however, scale
up to quadrillions of arcs (see the TODO).
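To make the idea of approximating that eigenvector concrete, here is a minimal pure-Perl sketch of power iteration. It is an illustration only, not the C code this module runs: the function name C<pagerank_sketch>, its defaults, the uniform treatment of dangling nodes, and the normalization (ranks sum to 1) are all assumptions, so the numbers it produces need not match the output of this module.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Algorithm::PageRank::XS.
# $arcs is a flat array ref of (from, to) pairs.
sub pagerank_sketch {
    my ($arcs, %opt) = @_;
    my $alpha     = exists $opt{alpha}       ? $opt{alpha}       : 0.85;
    my $max_tries = exists $opt{max_tries}   ? $opt{max_tries}   : 100;
    my $eps       = exists $opt{convergence} ? $opt{convergence} : 1e-6;

    # Build the adjacency lists and collect every node name.
    my (%out, %nodes);
    my @pairs = @$arcs;
    while (my ($from, $to) = splice(@pairs, 0, 2)) {
        push @{ $out{$from} }, $to;
        $nodes{$_} = 1 for $from, $to;
    }
    my @names = sort keys %nodes;
    my $n     = scalar @names;
    return {} unless $n;

    # Start from the uniform vector and iterate.
    my %rank = map { $_ => 1 / $n } @names;
    for my $try (1 .. $max_tries) {
        # Assumption: rank held by nodes without outgoing arcs is
        # spread evenly over all nodes.
        my $dangling = 0;
        $dangling += $rank{$_} for grep { !$out{$_} } @names;

        my %next = map { $_ => (1 - $alpha) / $n + $alpha * $dangling / $n } @names;
        for my $from (keys %out) {
            my $share = $alpha * $rank{$from} / scalar @{ $out{$from} };
            $next{$_} += $share for @{ $out{$from} };
        }

        # L1 distance between two successive vectors.
        my $diff = 0;
        $diff += abs($next{$_} - $rank{$_}) for @names;
        %rank = %next;
        last if $diff < $eps;
    }
    return \%rank;
}

my $ranks = pagerank_sketch([
    'John'  => 'Joey',
    'John'  => 'James',
    'Joey'  => 'John',
    'James' => 'Joey',
]);
printf "%s %.4f\n", $_, $ranks->{$_} for sort keys %$ranks;
```

On this small graph the sketch ranks Joey highest, then John, then James.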
=head1 SYNOPSIS
use Algorithm::PageRank::XS;
my $pr = Algorithm::PageRank::XS->new();
$pr->graph([
    'John'  => 'Joey',
    'John'  => 'James',
    'Joey'  => 'John',
    'James' => 'Joey',
]);
$pr->result();
# {
# 'James' => '0.569840431213379',
# 'Joey' => '1',
# 'John' => '0.754877686500549'
# }
# The following simple program reads tab-separated arcs and prints the ranks.
use Algorithm::PageRank::XS;
my $pr = Algorithm::PageRank::XS->new();
while (<>) {
    chomp;
    my ($from, $to) = split(/\t/, $_);
    $pr->add_arc($from, $to);
}
my $r = $pr->results();
while (my ($name, $rank) = each(%{$r})) {
    print "$name,$rank\n";
}
=head1 METHODS
=head2 new %PARAMS
Create a new PageRank object. Possible parameters:
=over 4
=item alpha
The damping factor: one minus the probability that a random surfer jumps to an arbitrary,
unconnected node instead of following an arc. Decreasing this number makes convergence more
likely, but moves the result further from the true eigenvector.
=item max_tries
The maximum number of tries until we give up trying to achieve convergence.
=item convergence
The threshold that the difference between two successive vectors must fall below before we say
we are "convergent enough". The difference shrinks at the rate at which C<alpha^t> goes to 0. Thus,
if you set C<alpha> to C<0.85> and C<convergence> to C<0.000001>, you will need about C<85> tries.
=back
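The figure of about 85 tries follows from solving C<alpha^t = convergence> for C<t>, which gives C<t = log(convergence) / log(alpha)>. A small sketch of that arithmetic is below. The helper name C<estimated_tries> is made up for this example and is not a method of this module, and the result is only a rough estimate of how many tries a run needs.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Algorithm::PageRank::XS.
# Returns the t for which alpha**t equals the convergence threshold.
sub estimated_tries {
    my ($alpha, $convergence) = @_;
    return log($convergence) / log($alpha);
}

# With alpha = 0.85 and convergence = 0.000001 this is roughly 85.
printf "%.2f\n", estimated_tries(0.85, 0.000001);
```

Because the estimate grows as C<alpha> approaches 1, it is worth setting C<max_tries> comfortably above it when raising C<alpha>.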
=head2 add_arc
Add an arc to the PageRank object before running the computation.
The node names can be arbitrary strings. For example,
$pr->add_arc("Apple", "Orange");
means that C<"Apple"> links to C<"Orange">.
=head2 graph