Lingua-NATools
lib/Lingua/NATools.pm
$pcorpus->run_post(5);
=head2 C<run_generic_EM>
This method invokes one of the three algorithms for Entropy
Maximization of the alignment matrix: C<nat-sampleA>, C<nat-sampleB>
and C<nat-ipfp>.
Call the method with the name of the algorithm ("sampleA",
"sampleB" or "ipfp"), the number of iterations to run, and the
chunk to process.
Returns the time used to run the command.
$pcorpus->run_generic_EM("ipfp", 5, 3);
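The correspondence between the algorithm names and the commands listed above can be sketched as a small lookup. This is only an illustration in C; C<em_command_name> is a hypothetical helper, not a NATools function, and the module itself may build the command name differently.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of NATools: maps an algorithm name
 * accepted by run_generic_EM to the command named in the documentation.
 * Returns NULL for a missing or unknown name. */
static const char *em_command_name(const char *algorithm)
{
    if (algorithm == NULL)
        return NULL;
    if (strcmp(algorithm, "sampleA") == 0)
        return "nat-sampleA";
    if (strcmp(algorithm, "sampleB") == 0)
        return "nat-sampleB";
    if (strcmp(algorithm, "ipfp") == 0)
        return "nat-ipfp";
    return NULL; /* unknown algorithm name */
}
```

Under this sketch, C<em_command_name("ipfp")> yields C<nat-ipfp>, and any other spelling is rejected instead of being passed on to the shell.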
=head2 C<align_all>
This method will re-align all chunks in the corpora repository. It
will not re-encode them, just re-align.
scripts/nat-create
=item noEM
The C<-noEM> flag bypasses the EM-Algorithm (useful mainly for
debugging).
=item ipfp
The C<-ipfp> flag is mutually exclusive with C<-noEM>, C<-samplea> and
C<-sampleb>. It selects IPFP as the EM-Algorithm to be used. The
optional numeric argument is the number of iterations, which defaults
to 5.
=item samplea
The C<-samplea> flag is mutually exclusive with C<-noEM>, C<-ipfp> and
C<-sampleb>. It selects Sample A as the EM-Algorithm to be used. The
optional numeric argument is the number of iterations, which defaults
to 10.
=item sampleb
The C<-sampleb> flag is mutually exclusive with C<-noEM>, C<-ipfp> and
C<-samplea>. It selects Sample B as the EM-Algorithm to be used. The
optional numeric argument is the number of iterations, which defaults
to 10.
=back
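The mutual exclusion and the per-algorithm defaults described above can be summarised in a short sketch. It is written in C purely for illustration: C<nat-create> is a Perl script and may process its options differently, and every type and function name below is hypothetical. The documentation does not say which algorithm runs when none of the flags is given, so the sketch reports that case as unspecified instead of guessing.

```c
#include <stdbool.h>

/* Hypothetical types for illustration only. */
typedef enum {
    EM_UNSPECIFIED, /* no flag given; default not stated in the docs */
    EM_NONE,        /* -noEM */
    EM_IPFP,        /* -ipfp */
    EM_SAMPLE_A,    /* -samplea */
    EM_SAMPLE_B     /* -sampleb */
} em_choice_t;

typedef struct {
    bool no_em, ipfp, sample_a, sample_b;
    int iterations; /* <= 0 means no numeric argument was given */
} em_flags_t;

/* Returns false when more than one of the mutually exclusive flags is
 * set. Otherwise stores the selected algorithm and iteration count,
 * applying the documented defaults (5 for IPFP, 10 for Sample A/B). */
static bool resolve_em(const em_flags_t *f, em_choice_t *choice,
                       int *iterations)
{
    int set = f->no_em + f->ipfp + f->sample_a + f->sample_b;
    if (set > 1)
        return false;

    *iterations = 0;
    if (f->no_em) {
        *choice = EM_NONE;
    } else if (f->ipfp) {
        *choice = EM_IPFP;
        *iterations = f->iterations > 0 ? f->iterations : 5;
    } else if (f->sample_a) {
        *choice = EM_SAMPLE_A;
        *iterations = f->iterations > 0 ? f->iterations : 10;
    } else if (f->sample_b) {
        *choice = EM_SAMPLE_B;
        *iterations = f->iterations > 0 ? f->iterations : 10;
    } else {
        *choice = EM_UNSPECIFIED;
    }
    return true;
}
```

In this sketch, C<-ipfp> alone resolves to 5 iterations, C<-samplea> with an argument of 20 resolves to 20, and combining C<-noEM> with any other flag is rejected.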
=head1 SEE ALSO
NATools documentation, perl(1)
=head1 AUTHOR
* @file
* @brief Corpora pre-processing unit
*/
/**
* @brief maximum number of words in a translation unit
*/
#define MAXBUF 500
/**
* @brief number of iterations between updating progress information
*/
#define STEP 100
/**
* @brief value used as default size when allocating the buffer for
* the index
*/
#define DEFAULT_INDEX_SIZE 150000
static nat_boolean_t quiet;
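The role of C<STEP> can be illustrated with a minimal, self-contained loop that writes progress only once every C<STEP> iterations. This is a sketch under assumptions, not the NATools code: C<should_report> and C<process_units> are hypothetical names, a plain C<int> stands in for C<nat_boolean_t>, and it is assumed that the C<quiet> flag suppresses progress output.

```c
#include <stdio.h>

#define STEP 100 /* same value as in the excerpt above */

/* Hypothetical helper: returns 1 when a progress update is due.
 * Assumes that quiet mode suppresses progress output. */
static int should_report(unsigned long processed, int quiet)
{
    return !quiet && processed > 0 && processed % STEP == 0;
}

/* Hypothetical processing loop; returns the number of progress
 * updates written to `out`. */
static unsigned long process_units(unsigned long total, int quiet,
                                   FILE *out)
{
    unsigned long updates = 0;
    for (unsigned long i = 1; i <= total; i++) {
        /* ... per-unit work would go here ... */
        if (should_report(i, quiet)) {
            fprintf(out, "\r%lu units processed", i);
            updates++;
        }
    }
    return updates;
}
```

Reporting only every C<STEP> iterations keeps terminal output from dominating the running time on large corpora.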