Lingua-NATools


scripts/nat-create

=item *

the configuration file ("nat.cnf" - metadata information)

=item *

the corpus

=item *

the corpus indexes

=item *

the probabilistic translation dictionaries ("source-target.dmp", "target-source.dmp")

=item *

the n-gram (bigram, trigram and tetragram) databases ("source.ngrams", "target.ngrams")

=back
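
As a rough illustration of the list above, the following Perl sketch reports which of the named files are absent from a corpus directory. It assumes that all files sit directly in one directory and that the n-gram databases are optional; neither point is stated here. The helper name C<missing_files> is hypothetical and is not part of NATools.

```perl
use strict;
use warnings;
use File::Spec;

# File names as given in the list above.
my @CORE   = ('nat.cnf', 'source-target.dmp', 'target-source.dmp');
# Assumption: treated as optional, since they are built on request.
my @NGRAMS = ('source.ngrams', 'target.ngrams');

# Hypothetical helper: returns the names from @names that do not
# exist inside directory $dir.
sub missing_files {
    my ($dir, @names) = @_;
    return grep { my $path = File::Spec->catfile($dir, $_); !-e $path } @names;
}

# Only runs when a directory is passed on the command line.
if (@ARGV) {
    my $dir = $ARGV[0];
    my @core   = missing_files($dir, @CORE);
    my @ngrams = missing_files($dir, @NGRAMS);
    print @core   ? "missing core files: @core\n"     : "core files present\n";
    print @ngrams ? "missing n-gram files: @ngrams\n" : "n-gram files present\n";
}
```

The corpus and index files are left out of the check because their names are not given in the list.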

=head2 Known Switches

=over 4

=item tokenize

The C<-tokenize> flag can be used to force NATools to tokenize the
texts. Note that at the moment a Portuguese tokenizer is used for all
languages. This might change in the future.

=item id

The C<-id=name> flag can be used to set the name of the NATools corpus.
By default the name is read interactively.

=item q

The C<-q> flag can be used to force quiet mode. In this case, the
name is extracted from the file names.

=item lang

The C<-lang=PT..EN> flag can be used to force the corpus languages.

=item ngrams

The C<-ngrams> flag can be set to force NATools to create n-gram
indexes.

=item noEM

The C<-noEM> flag is used to bypass the EM algorithm (useful mainly
for debugging).

=item ipfp

The C<-ipfp> flag is mutually exclusive with C<-noEM>, C<-samplea> and
C<-sampleb>. It selects IPFP as the EM algorithm to use. An optional
numeric argument sets the number of iterations (default: 5).

=item samplea

The C<-samplea> flag is mutually exclusive with C<-noEM>, C<-ipfp> and
C<-sampleb>. It selects Sample A as the EM algorithm to use. An optional
numeric argument sets the number of iterations (default: 10).

=item sampleb

The C<-sampleb> flag is mutually exclusive with C<-noEM>, C<-ipfp> and
C<-samplea>. It selects Sample B as the EM algorithm to use. An optional
numeric argument sets the number of iterations (default: 10).

=back
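
The mutual-exclusion rule and the default iteration counts described above can be sketched in Perl as follows. This is an illustration only, not the code used by nat-create: the helper name C<choose_em> is hypothetical, and the C<-flag=N> spelling for the optional numeric argument is an assumption modelled on C<-id=name>.

```perl
use strict;
use warnings;

# Default iteration counts as documented above.
my %DEFAULT_ITERATIONS = (ipfp => 5, samplea => 10, sampleb => 10);

# Hypothetical helper: scans raw switches such as '-ipfp' or '-ipfp=7'
# (the '=N' syntax is an assumption) and returns (method, iterations).
# Dies when more than one of the four EM-related switches is given.
# Returns (undef, undef) when none is given.
sub choose_em {
    my @args = @_;
    my ($method, $iterations);
    for my $arg (@args) {
        # Ignore switches that are unrelated to the EM algorithm.
        next unless $arg =~ /^-(noEM|ipfp|samplea|sampleb)(?:=(\d+))?$/;
        my ($name, $n) = ($1, $2);
        die "-$name cannot be combined with -$method\n" if defined $method;
        $method = $name;
        # -noEM bypasses the algorithm, so it runs zero iterations.
        $iterations = $name eq 'noEM' ? 0 : ($n // $DEFAULT_ITERATIONS{$name});
    }
    return ($method, $iterations);
}

# Only runs when switches are passed on the command line.
if (@ARGV) {
    my ($method, $iterations) = choose_em(@ARGV);
    if (defined $method) {
        print "method=$method iterations=$iterations\n";
    } else {
        print "no EM-related switch given\n";
    }
}
```

Under these assumptions, C<choose_em('-ipfp')> yields C<('ipfp', 5)>, C<choose_em('-samplea=3')> yields C<('samplea', 3)>, and C<choose_em('-noEM', '-ipfp')> dies with an error.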

=head1 SEE ALSO

NATools documentation, perl(1)

=head1 AUTHOR

Alberto Manuel Brandão Simões, E<lt>ambs@cpan.orgE<gt>

=head1 COPYRIGHT AND LICENSE

Copyright (C) 2006-2011 by Alberto Manuel Brandão Simões

=cut


