Algorithm-WordLevelStatistics


lib/Algorithm/WordLevelStatistics.pm

          count => 150,
          sigma_nor => 3.71679156784182
        }
  ...
 }

=item compute_spectrum()

The return value is a reference to a hash like the following:

 {
   C => 50.2020428972437,
   count => 47,
   sigma_nor => 6.16069263723295
 }

=back

=head1 THEORY

The word level statistics algorithm uses a generalization of the level statistics analysis of quantum disordered systems to automatically extract keywords from literary texts.

The algorithm takes into account not only the relative frequencies of the words in the text but also their spatial distribution, and it is based on the observation that relevant words are naturally clustered by the authors of the ...

Word level statistics does not need a reference corpus: it uses a single document to extract that document's keywords. Moreover, it can be considered "language agnostic", because the algorithm does not need an "a priori" word classification (e...
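
The clustering idea can be illustrated with a minimal sketch. It is not the module's implementation: C<spacing_sigma> is a hypothetical helper introduced here, and it only computes the standard deviation of the gaps between consecutive occurrences of a word divided by their mean. The C<sigma_nor> and C<C> values reported by the module presumably involve further normalization that is not reproduced here.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Algorithm::WordLevelStatistics.
# Takes a reference to the sorted positions of one word in the text and
# returns (standard deviation of the gaps) / (mean of the gaps).
sub spacing_sigma {
    my ($positions) = @_;

    # Gaps between consecutive occurrences of the word.
    my @gaps = map { $positions->[$_] - $positions->[ $_ - 1 ] }
               1 .. $#{$positions};
    return if @gaps < 2;    # too few occurrences to measure a spread

    my $mean = 0;
    $mean += $_ for @gaps;
    $mean /= @gaps;

    my $var = 0;
    $var += ( $_ - $mean )**2 for @gaps;
    $var /= @gaps;

    return sqrt($var) / $mean;
}

# Assumed toy data: one word spread evenly, one word bunched together.
my $even      = spacing_sigma( [ 10, 20, 30, 40, 50 ] );
my $clustered = spacing_sigma( [ 1, 2, 3, 4, 100 ] );

printf "even: %.3f  clustered: %.3f\n", $even, $clustered;
```

The evenly spaced word yields 0 because all of its gaps are equal, while the clustered word yields a clearly larger value. Under the assumption stated above, words with a large spread of gaps are the candidates for keywords.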


=head1 HISTORY

=over 4

=item 0.03

Corrected the test case (added test file to tarball).

=item 0.02

Removed the dependency on L<Statistics::Lite>.
Removing the dependency makes this a self-contained Perl module.
The module is also faster (~15% speed improvement on large files).

=item 0.01

Initial version of the module.

=back

=head1 AUTHOR

Francesco Nidito

=head1 COPYRIGHT

Copyright 2009 Francesco Nidito. All rights reserved.

This library is free software; you can redistribute it and/or
modify it under the same terms as Perl itself.

=head1 SEE ALSO

L<http://bioinfo2.ugr.es/TextKeywords/>.

=cut


