Lingua-NATools
=item *
the configuration file ("nat.cnf" - metadata information)
=item *
the corpus
=item *
the corpus indexes
=item *
the probabilistic translation dictionaries ("source-target.dmp", "target-source.dmp")
=item *
the n-gram (bigram, trigram and tetragram) databases ("source.ngrams", "target.ngrams")
=back
=head2 Known Switches
=over 4
=item tokenize
The C<-tokenize> flag can be used to force NATools to tokenize the
texts. Note that at the moment a Portuguese tokenizer is used for all
languages. This might change in the future.
=item id
The C<-id=name> flag can be used to force the name of the NATools
corpus. By default the name is read interactively.
=item q
The C<-q> flag can be used to force quiet mode. In this case, the
corpus name is extracted from the file names.
=item lang
The C<-lang=PT..EN> flag can be used to force the corpus languages
(here, PT and EN).
=item ngrams
The C<-ngrams> flag can be set to force NATools to create the n-gram
indexes.
=item noEM
The C<-noEM> flag is used to bypass the EM algorithm (mainly useful
for debugging).
=item ipfp
The C<-ipfp> flag is mutually exclusive with C<-noEM>, C<-samplea> and
C<-sampleb>. It selects IPFP as the EM algorithm to be used. The
optional numeric argument is the number of iterations, which defaults
to 5.
=item samplea
The C<-samplea> flag is mutually exclusive with C<-noEM>, C<-ipfp> and
C<-sampleb>. It selects Sample A as the EM algorithm to be used. The
optional numeric argument is the number of iterations, which defaults
to 10.
=item sampleb
The C<-sampleb> flag is mutually exclusive with C<-noEM>, C<-ipfp> and
C<-samplea>. It selects Sample B as the EM algorithm to be used. The
optional numeric argument is the number of iterations, which defaults
to 10.
=back
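The mutual-exclusion rule and the iteration defaults described above can be
sketched as a small check. This is only an illustration, not the code used by
C<nat-create>: the helper names C<conflicting_switches> and C<iterations_for>
are hypothetical, and the sketch assumes the switch names and any numeric
argument have already been extracted from the command line.

```perl
use strict;
use warnings;

# Hypothetical helpers, not part of NATools.
# Switches documented above as mutually exclusive.
my @EXCLUSIVE = qw(noEM ipfp samplea sampleb);

# Iteration defaults as documented above.
my %DEFAULT_ITERATIONS = (ipfp => 5, samplea => 10, sampleb => 10);

# Given the names of the switches that were passed, return the
# mutually exclusive ones if more than one of them is present,
# or an empty list if there is no conflict.
sub conflicting_switches {
    my %given = map { $_ => 1 } @_;
    my @found = grep { $given{$_} } @EXCLUSIVE;
    return @found > 1 ? @found : ();
}

# Return the iteration count for an EM switch: the explicit value
# when one was supplied, the documented default otherwise.
sub iterations_for {
    my ($switch, $value) = @_;
    return defined $value ? $value : $DEFAULT_ITERATIONS{$switch};
}

# Example: -ipfp together with -noEM is rejected, -ngrams is unaffected.
my @conflict = conflicting_switches(qw(ipfp noEM ngrams));
print "conflicting switches: @conflict\n" if @conflict;
```

A caller could refuse to run when C<conflicting_switches> returns a non-empty
list, and otherwise pass the result of C<iterations_for> to the selected
algorithm.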
=head1 SEE ALSO
NATools documentation, perl(1)
=head1 AUTHOR
Alberto Manuel Brandão Simões, E<lt>ambs@cpan.orgE<gt>
=head1 COPYRIGHT AND LICENSE
Copyright (C) 2006-2011 by Alberto Manuel Brandão Simões
=cut