Lingua-NATools
lib/Lingua/NATools/Lexicon.pm
    $lex = Lingua::NATools::Lexicon->new("file.lex");
    $word = $lex->word_from_id(2);
    $id = $lex->id_from_word("cavalo");
    @ids = $lex->sentence_to_ids("era uma vez um gato maltez");
    $sentence = $lex->ids_to_sentence(10,2,3,2,5,4,3,2,5);
    $lex->size;
    $lex->id_count(2);
    $lex->close;
=head1 DESCRIPTION
This module wraps NATools lexicon files, making them accessible from
Perl through an object-oriented interface. First, open a lexicon file:

    $lex = Lingua::NATools::Lexicon->new("lexicon.file.lex");

When you are done, do not forget to close it. Closing frees memory and
releases the lexicon so that other lexicon files can be opened.

    $lex->close;
Lexicon files map words to identifiers and vice versa. Usage is
simple: call

    $lex->id_from_word($word)

to get the identifier for a word, and

    $lex->word_from_id($id)

to get the word back from its identifier. To convert a whole sentence
at once, use C<ids_to_sentence> to build a sentence from identifiers or
C<sentence_to_ids> to parse a sentence into identifiers.
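The word/identifier round trip described above can be illustrated with a small in-memory stand-in. This is only a sketch: C<Sketch::Lexicon> is a hypothetical class written for this example, and the assumption that identifiers are array positions is ours; the real module delegates lookups to the NATools C library.

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-in for a lexicon (not the real class).
{
    package Sketch::Lexicon;

    sub new {
        my ($class, @words) = @_;
        my %id_of;
        # Assumption for this sketch: an identifier is the word's position.
        @id_of{@words} = 0 .. $#words;
        return bless { words => [@words], id_of => \%id_of }, $class;
    }

    # Word to identifier; undef for a word that is not in the table.
    sub id_from_word { my ($self, $word) = @_; return $self->{id_of}{$word} }

    # Identifier back to word.
    sub word_from_id { my ($self, $id) = @_; return $self->{words}[$id] }
}

my $sketch = Sketch::Lexicon->new(qw(era uma vez um gato));
my $id     = $sketch->id_from_word("gato");
my $word   = $sketch->word_from_id($id);
print "$word has id $id\n";    # prints "gato has id 4"
```

The two lookups are inverses of each other for any word present in the table, which is the property the real C<id_from_word> and C<word_from_id> pair is described as providing.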
=head2 C<new>
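This is the C<Lingua::NATools::Lexicon> constructor. Pass it the path to a
I<lexicon> file; these files usually have a C<.lex> extension. It returns
C<undef> if the path is not a regular file or if the underlying C<wlopen>
call fails:

    my $lexicon = Lingua::NATools::Lexicon->new("file.lex");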
=cut
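sub new {
    my ($class, $filename) = @_;
    # Refuse anything that is not an existing regular file.
    return undef unless defined $filename && -f $filename;
    my $wlid = Lingua::NATools::wlopen($filename);
    # A negative handle is treated as failure.
    return undef if $wlid < 0;
    return bless +{ id => $wlid } => $class; # amen
}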
=head2 C<save>
This method saves the current lexicon object to the supplied file:

    $lexicon->save("/there/lexicon.lex");
=cut
sub save {
    my ($self, $filename) = @_;
    Lingua::NATools::wlsave($self->{id}, $filename);
}
=head2 C<close>
Call this method to close a lexicon. This is important to free resources:
both memory and lexicon slots, as only a limited number of lexicons can be
open at a time.

    $lexicon->close;
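Because open lexicons are a limited resource, it can help to make sure C<close> runs even when the code using the lexicon dies. The following is a minimal sketch of one way to do that; C<with_handle> and C<Sketch::Handle> are hypothetical names invented for this example and are not part of the module.

```perl
use strict;
use warnings;

# Hypothetical stand-in that only counts close() calls (not the real class).
{
    package Sketch::Handle;
    our $closed = 0;
    sub new   { return bless {}, shift }
    sub close { $closed++ }
}

# Run a callback with the handle, then close it even if the callback dies.
sub with_handle {
    my ($handle, $code) = @_;
    my $ok  = eval { $code->($handle); 1 };
    my $err = $@;              # save the error before close() can clobber it
    $handle->close;
    die $err unless $ok;       # re-throw after cleanup
    return;
}

with_handle(Sketch::Handle->new, sub { print "working\n" });
eval { with_handle(Sketch::Handle->new, sub { die "failed\n" }) };
print "closed $Sketch::Handle::closed handles\n";    # prints "closed 2 handles"
```

With a real lexicon object the same wrapper would call the C<close> method documented here, assuming the object is used only inside the callback.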
=cut
sub close {
    my $self = shift;
    Lingua::NATools::wlclose($self->{id});
}
=head2 C<word_from_id>
This method is used to convert one I<word-id> to a I<word>:
    my $word = $lexicon->word_from_id($word_id);
=cut
sub word_from_id {
    my ($self, $id) = @_;
    return Lingua::NATools::wlgetbyid($self->{id}, $id);
}
=head2 C<ids_to_sentence>
This method calls C<word_from_id> for each passed parameter.
Thus, it receives a list of word identifiers, and returns the
corresponding string. Words are separated by a space character.
    my $sentence = $lexicon->ids_to_sentence(1,3,5,2,3,6);
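The lookup-and-join behaviour described above can be tried without a lexicon file. In this sketch the C<@words> table and both C<sketch_*> functions are hypothetical; in particular, splitting on whitespace is only the simplest assumption for the inverse direction, since how the real C<sentence_to_ids> tokenizes is not shown in this section.

```perl
use strict;
use warnings;

# Hypothetical id-to-word table standing in for an open lexicon.
my @words = qw(era uma vez um gato);

# Reverse table, word to id, built from the same list.
my %id_of;
@id_of{@words} = 0 .. $#words;

# Same shape as ids_to_sentence: look up each id, join with one space.
sub sketch_ids_to_sentence {
    my @ids = @_;
    return join(" ", map { $words[$_] } @ids);
}

# Inverse direction, assuming plain whitespace tokenization.
sub sketch_sentence_to_ids {
    my ($sentence) = @_;
    return map { $id_of{$_} } split ' ', $sentence;
}

my $sentence = sketch_ids_to_sentence(0, 1, 2, 3, 4);
my @ids      = sketch_sentence_to_ids($sentence);
print "$sentence\n";    # prints "era uma vez um gato"
```

An identifier with no entry in the table would yield C<undef> inside the C<join>, so callers working with untrusted identifiers may want to validate them first.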
=cut
sub ids_to_sentence {
    # We will need something more to handle correct cases
    my $self = shift;
    return join(" ", map { $self->word_from_id($_) } @_);
}
=head2 C<id_from_word>