Lingua-NATools


lib/Lingua/NATools/Lexicon.pm


  $lex = Lingua::NATools::Lexicon->new("file.lex");

  $word = $lex->word_from_id(2);

  $id = $lex->id_from_word("cavalo");

  @ids = $lex->sentence_to_ids("era uma vez um gato maltez");

  $sentence = $lex->ids_to_sentence(10,2,3,2,5,4,3,2,5);

  $lex->size;

  $lex->id_count(2);

  $lex->close;

=head1 DESCRIPTION

This module encapsulates NATools lexicon files, making them
accessible from Perl through an object-oriented interface.
First, open a lexicon file using:

 $lex = Lingua::NATools::Lexicon->new("lexicon.file.lex");

When you are done, do not forget to close it. Closing frees memory
and makes room for opening other lexicon files.

 $lex->close;

Lexicon files map words to identifiers and vice versa. Usage is
simple: use

  $lex->id_from_word($word)

to get an id for a word. Use

  $lex->word_from_id($id)

to get the word back from its id. To convert a whole sentence at
once, use C<sentence_to_ids> to parse it into ids and
C<ids_to_sentence> to build it from ids.
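
The two-way mapping described above can be sketched with a small
in-memory stand-in. C<Toy::Lexicon> is hypothetical and is not part of
this distribution: the real module reads a lexicon file through the
NATools library, and the id numbering used here (starting at 1) is an
assumption made only for the illustration.

```perl
use strict;
use warnings;

package Toy::Lexicon;    # hypothetical stand-in, not the real module

sub new {
    my ($class, @words) = @_;
    my %id_of;
    my @word_of = (undef);    # assumption: ids start at 1 in this toy
    for my $w (@words) {
        next if exists $id_of{$w};    # each word gets exactly one id
        push @word_of, $w;
        $id_of{$w} = $#word_of;
    }
    return bless { id_of => \%id_of, word_of => \@word_of }, $class;
}

# word -> id (undef if the word is unknown to the toy lexicon)
sub id_from_word { my ($self, $w)  = @_; return $self->{id_of}{$w} }

# id -> word
sub word_from_id { my ($self, $id) = @_; return $self->{word_of}[$id] }

# One lookup per id, joined by single spaces.
sub ids_to_sentence {
    my $self = shift;
    return join " ", map { $self->word_from_id($_) } @_;
}

package main;

my $toy = Toy::Lexicon->new(qw(era uma vez um gato));
my $id  = $toy->id_from_word("gato");
print $toy->word_from_id($id), "\n";                 # prints: gato
print $toy->ids_to_sentence(1, 2, 3, 4, 5), "\n";    # prints: era uma vez um gato
```

The point of the sketch is only the round trip: an id obtained from
C<id_from_word> gives the same word back through C<word_from_id>.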

=head2 C<new>

This is the C<Lingua::NATools::Lexicon> constructor. Pass it a
I<lexicon> file.  These files usually end with a C<.lex> extension.
It returns C<undef> if the file does not exist or cannot be opened:

   my $lexicon = Lingua::NATools::Lexicon->new("file.lex");
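
Since the constructor signals failure by returning C<undef> rather than
dying, callers may want to check the result before using it. The
following is a minimal sketch; C<open_lexicon> is a hypothetical
wrapper and is not provided by this distribution.

```perl
use strict;
use warnings;

# open_lexicon() is a hypothetical wrapper, not part of Lingua::NATools.
# It turns the constructor's undef return into an exception.
sub open_lexicon {
    my ($file) = @_;
    require Lingua::NATools::Lexicon;    # loaded lazily; dies if not installed
    my $lex = Lingua::NATools::Lexicon->new($file);
    die "Cannot open lexicon '$file'\n" unless defined $lex;
    return $lex;
}

# Catch the failure and report it instead of calling methods on undef.
my $lex = eval { open_lexicon("file.lex") };
warn $@ unless defined $lex;
```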

=cut

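sub new {
    my ($class, $filename) = @_;
    # Fail early if no path was given or it is not a regular file.
    return undef unless defined $filename && -f $filename;

    # A negative handle from wlopen means the lexicon could not be opened.
    my $wlid = Lingua::NATools::wlopen($filename);
    return undef if $wlid < 0;

    return bless +{ id => $wlid } => $class # amen
}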

=head2 C<save>

This method saves the current lexicon object to the supplied file:

   $lexicon->save("/there/lexicon.lex");

=cut

sub save {
    my ($self, $filename) = @_;
    Lingua::NATools::wlsave($self->{id}, $filename);
}

=head2 C<close>

Call this method to close a lexicon. This is important to free
resources: it releases memory, and only a limited number of lexicons
can be open at the same time.

   $lexicon->close;
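
Because only a limited number of lexicons can be open, it can help to
guarantee that C<close> runs even when the code using the lexicon dies.
One way to do that is sketched below; C<with_lexicon> is a hypothetical
helper, not part of this distribution, and it works with any object
that has a C<close> method.

```perl
use strict;
use warnings;

# with_lexicon() is a hypothetical helper, not part of Lingua::NATools.
# It runs $code with the lexicon, always calls close afterwards, and
# re-throws any exception raised by $code. Call it in list context.
sub with_lexicon {
    my ($lex, $code) = @_;
    my @result;
    my $ok  = eval { @result = $code->($lex); 1 };
    my $err = $@;
    $lex->close;          # always release the lexicon
    die $err unless $ok;  # propagate the original error
    return @result;
}
```

A call would look like
C<< my ($word) = with_lexicon($lexicon, sub { $_[0]->word_from_id(2) }); >>.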

=cut

sub close {
    my $self = shift;
    Lingua::NATools::wlclose($self->{id});
}

=head2 C<word_from_id>

This method is used to convert one I<word-id> to a I<word>:

   my $word = $lexicon->word_from_id ($word_id);

=cut

sub word_from_id {
    my ($self, $id) = @_;
    return Lingua::NATools::wlgetbyid($self->{id}, $id);
}

=head2 C<ids_to_sentence>

This method calls C<word_from_id> for each passed parameter.
Thus, it receives a list of word identifiers, and returns the
corresponding string. Words are separated by a space character.

   my $sentence = $lexicon->ids_to_sentence(1,3,5,2,3,6);

=cut

sub ids_to_sentence {
    # We will need something more to handle correct cases
    my $self = shift;
    return join(" ",map { $self->word_from_id($_) } @_);
}

=head2 C<id_from_word>
