Lingua-EN-Fathom


lib/Lingua/EN/Fathom.pm

=head2 percent_complex_words

Returns the percentage of complex words in the analysed text file or block. A
complex word is one with three or more syllables. This statistic is used to
calculate the Fog index.

=head2 num_sentences

Returns the number of sentences in the analysed text file or block. A sentence
is any group of words and non-words terminated by a single full stop. Spaces
may occur before and after the full stop.

=head2 num_text_lines

Returns the number of lines containing some text in the analysed
text file or block.

=head2 num_non_text_lines

Returns the number of lines containing no text in the analysed
text file or block.

=head2 num_blank_lines

Returns the number of empty lines in the analysed
text file or block.

=head2 num_paragraphs

Returns the number of paragraphs in the analysed text file or block.

=head2 syllables_per_word

Returns the average number of syllables per word in the analysed 
text file or block.

=head2 words_per_sentence

Returns the average number of words per sentence in the analysed 
text file or block.


=head2 READABILITY

Three indices of text readability are calculated. They all measure complexity as
a function of syllables per word and words per sentence. They assume the text is
well formed and logical. You could analyse a passage of nonsensical English and
find the readability is quite good, provided the words are not too complex and
the sentences not too long.

For more information see: L<http://www.plainlanguage.com/Resources/readability.html>


=head2 fog

Returns the Fog index for the analysed text file or block.

  ( words_per_sentence + percent_complex_words ) * 0.4

The Fog index, developed by Robert Gunning, is a well-known and simple
formula for measuring readability. The index indicates the number of years
of formal education a reader of average intelligence would need in order to
understand the text on a single reading, given its word and sentence workload.

   18 unreadable
   14 difficult
   12 ideal
   10 acceptable
    8 childish
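
As a minimal sketch of the arithmetic, the formula above can be applied to the two averages shown in the sample output under C<report> (11.25 words per sentence, 20.00 percent complex words). The helper names C<fog_index> and C<fog_label> are hypothetical and not part of the module, and treating each table row as a lower bound for its label is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the Fog formula to the two averages.
sub fog_index {
    my ($words_per_sentence, $percent_complex_words) = @_;
    return ($words_per_sentence + $percent_complex_words) * 0.4;
}

# Hypothetical helper: pick the label of the highest table row that the
# score reaches. Treating each row as a lower bound is an assumption.
sub fog_label {
    my ($score) = @_;
    return 'unreadable' if $score >= 18;
    return 'difficult'  if $score >= 14;
    return 'ideal'      if $score >= 12;
    return 'acceptable' if $score >= 10;
    return 'childish';
}

my $fog = fog_index(11.25, 20.00);
printf "Fog: %.4f (%s)\n", $fog, fog_label($fog);   # prints "Fog: 12.5000 (ideal)"
```

The result agrees with the Fog value in the sample report.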


=head2 flesch

Returns the Flesch reading ease score for the analysed text file or block.

   206.835 - (1.015 * words_per_sentence) - (84.6 * syllables_per_word)

This score rates text on a 100 point scale. The higher the score, the easier
it is to understand the text. A score of 60 to 70 is considered to be optimal.
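
The formula above can be checked by hand with a short sketch. The function name C<flesch_reading_ease> is hypothetical, and the inputs are the rounded averages from the sample output under C<report>, so the result may differ in the later decimals from a value computed on unrounded averages.

```perl
use strict;
use warnings;

# Hypothetical helper: Flesch reading ease from the two averages.
sub flesch_reading_ease {
    my ($words_per_sentence, $syllables_per_word) = @_;
    return 206.835
         - (1.015 * $words_per_sentence)
         - (84.6  * $syllables_per_word);
}

# Rounded averages taken from the sample report.
printf "Flesch: %.2f\n", flesch_reading_ease(11.25, 1.7704);   # prints "Flesch: 45.64"
```

A score of about 45 falls below the 60 to 70 band described above, so the sample text would count as harder than optimal.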


=head2 kincaid

Returns the Flesch-Kincaid grade level score for the analysed text
file or block.

   (11.8 * syllables_per_word) + (0.39 * words_per_sentence) - 15.59

This score rates text on a U.S. grade-school level, so a score of 8.0 means
that the document can be understood by an eighth grader. A score of 7.0 to
8.0 is considered to be optimal.
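
A minimal sketch of the same calculation, assuming the rounded averages from the sample output under C<report> as inputs. The function name C<kincaid_grade> is hypothetical and not part of the module.

```perl
use strict;
use warnings;

# Hypothetical helper: Flesch-Kincaid grade level from the two averages.
sub kincaid_grade {
    my ($words_per_sentence, $syllables_per_word) = @_;
    return (11.8 * $syllables_per_word)
         + (0.39 * $words_per_sentence)
         - 15.59;
}

# Rounded averages taken from the sample report.
printf "Kincaid: %.2f\n", kincaid_grade(11.25, 1.7704);   # prints "Kincaid: 9.69"
```

With these inputs the grade level comes out between 9 and 10, above the 7.0 to 8.0 range described as optimal.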

=head2 unique_words

Returns a hash of unique words. The words (in lower case) are the hash keys,
and the number of occurrences of each word is the corresponding hash value.


=head2 report

    print($text->report);

Produces a text-based report containing all Fathom statistics for
the currently analysed text block or file. For example:
    
    Number of characters       : 813
    Number of words            : 135
    Percent of complex words   : 20.00
    Average syllables per word : 1.7704
    Number of sentences        : 12
    Average words per sentence : 11.2500
    Number of text lines       : 13
    Number of non text lines   : 0
    Number of blank lines      : 8
    Number of paragraphs       : 4


    READABILITY INDICES

    Fog                        : 12.5000
    Flesch                     : 45.6429


