Acme-Lingua-ZH-Remix
Instance method. Optionally takes "min" and "max" parameters that
constrain the sentence length (number of characters).
Both min and max must be integers greater than or equal to zero, and
the value of max should be greater than the value of min. If any of
these values is invalid, it is treated as if it had not been passed.
The default values of min and max are 0 and 140, respectively.
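The validation rules above can be sketched as a small standalone function. This is only one reading of the documented behaviour, not the module's own code: the helper name normalize_length_options is hypothetical, and resetting both values to their defaults when max is not greater than min is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Acme::Lingua::ZH::Remix: applies the
# documented rules to a hash of options and fills in the defaults.
sub normalize_length_options {
    my (%options) = @_;
    my %defaults = (min => 0, max => 140);

    for my $key (qw(min max)) {
        my $value = $options{$key};
        # Keep only non-negative integers; anything else counts as "not passed".
        delete $options{$key}
            unless defined $value && $value =~ /\A[0-9]+\z/;
    }

    %options = (%defaults, %options);

    # Assumption: if max is not greater than min, fall back to both defaults.
    @options{qw(min max)} = @defaults{qw(min max)}
        unless $options{max} > $options{min};

    return %options;
}

my %opts = normalize_length_options(min => 10, max => "lots");
print "min=$opts{min} max=$opts{max}\n";   # prints "min=10 max=140"
```

Here the invalid max is dropped and replaced by the default of 140, while the valid min of 10 is kept.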
The implementation is based on a random algorithm, so the time it needs
to generate a result is not known in advance. If it takes more than 1000
iterations, it aborts and returns the result anyway, regardless of the
length constraint. This can happen when the lengths of the phrases in
the corpus do not add up to a value within the given range.
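The iteration cap described above can be illustrated with a simplified, self-contained sketch. It is not the module's algorithm: the phrase list and the assemble function are hypothetical stand-ins, and the loop simply stops once the length reaches min. It shows only how a bounded random loop returns whatever it has built when the range cannot be met.

```perl
use strict;
use warnings;
use utf8;

binmode STDOUT, ':encoding(UTF-8)';

# Hypothetical stand-in for phrases taken from a corpus (4 to 7 characters each).
my @phrases = ('今天天氣很好，', '我們去散步，', '然後喝茶，', '真的嗎，');

# Prepend random phrases until the length reaches min without exceeding max,
# giving up after $max_iterations attempts and returning what has been built.
sub assemble {
    my (%options) = @_;
    my ($min, $max) = @options{qw(min max)};
    my $max_iterations = 1000;

    my @picked     = ('。');   # start from a one-character ending
    my $length     = 1;
    my $iterations = 0;

    while ($iterations++ < $max_iterations) {
        last if $length >= $min;
        my $phrase = $phrases[ rand @phrases ];
        next if $length + length($phrase) > $max;   # would overshoot: retry
        unshift @picked, $phrase;
        $length += length $phrase;
    }

    return join '', @picked;
}

print assemble(min => 10, max => 20), "\n";

# Unsatisfiable range: every phrase is too long, so the loop runs out of
# iterations and only the ending is returned.
print assemble(min => 3, max => 3), "\n";
```

With this phrase list the first call always yields between 10 and 16 characters, while the second exhausts its 1000 iterations and returns just the one-character ending.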
The returned scalar is the generated sentence, a string of wide
characters (one for which Encode::is_utf8 returns true).
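What "a string of wide characters" means in practice can be shown with core modules only. In this sketch the literal $sentence is a hypothetical stand-in for a returned sentence. Such a string counts characters rather than bytes, and it has to be encoded before being written to a byte stream.

```perl
use strict;
use warnings;
use utf8;                        # string literals below are decoded characters
use Encode qw(encode is_utf8);

my $sentence = '你好，世界。';    # hypothetical stand-in for a returned sentence

print is_utf8($sentence) ? "wide\n" : "bytes\n";   # prints "wide"
print length($sentence), "\n";                      # prints 6 (characters)

# Encode explicitly before writing to a file or socket that expects bytes.
my $octets = encode('UTF-8', $sentence);
print length($octets), "\n";                        # prints 18 (bytes)
```

The min and max constraints are stated in characters, so they correspond to the first length here, not to the encoded byte count.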
AUTHOR
Kang-min Liu <gugod@gugod.org>
lib/Acme/Lingua/ZH/Remix.pm
Instance method. Optionally takes "min" and "max" parameters that constrain the
sentence length (number of characters).
Both min and max must be integers greater than or equal to zero, and the value
of max should be greater than the value of min. If any of these values is
invalid, it is treated as if it had not been passed.
The default values of min and max are 0 and 140, respectively.
The implementation is based on a random algorithm, so the time it needs to
generate a result is not known in advance. If it takes more than 1000
iterations, it aborts and returns the result anyway, regardless of the length
constraint. This can happen when the lengths of the phrases in the corpus do not
add up to a value within the given range.
The returned scalar is the generated sentence, a string of wide characters (one
for which Encode::is_utf8 returns true).
=cut
sub random_sentence {
    my ($self, %options) = @_;
lib/Acme/Lingua/ZH/Remix.pm
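    # Pick a sentence-ending phrase; fall back to an ellipsis if none is returned.
    my $ending = $self->random_phrase(random(qw/。 ！ ？/)) || "…";

    # Re-draw until the ending alone fits within the maximum length.
    while ( length($ending) > $options{max} ) {
        $ending = $self->random_phrase(random(qw/。 ！ ？/)) || "…";
    }
    unshift @phrases, $ending;
    my $l = length($ending);

    my $iterations = 0;
    my $max_iterations = 1000;
    my $average = ($options{min} + $options{max}) / 2;

    # Random target length between min and max; if that comes out as 0,
    # use the average, then max.
    my $desired = int(rand($options{max} - $options{min}) + $options{min}) || $average || $options{max};

    while ($iterations++ < $max_iterations) {
        # Re-draw the phrase type while phrase_ratio reports 0 for it.
        my $x;
        do {
            $x = random('，', '」', '）', '/')
        } while ($self->phrase_ratio($x) == 0);

        # Prepend the phrase only if the sentence stays shorter than max.
        my $p = $self->random_phrase($x);
        if ($l + length($p) < $options{max}) {
            unshift @phrases, $p;
            $l += length($p);
        }

        # Stop within 10% of the target length, or within 20% once half
        # of the iterations have been used.
        my $r = abs(1 - $l/$desired);
        last if $r < 0.1;
        last if $r < 0.2 && $iterations >= $max_iterations/2;
    }

    # Drop a trailing comma and strip brackets that enclose the whole sentence.
    $str = join "", @phrases;
    $str =~ s/，$//;
    $str =~ s/^「(.+)」$/$1/;

    # Where a comma meets an ellipsis, randomly keep one of the two.
    if (rand > 0.5) {
        $str =~ s/(，)……/$1/gs;
    } else {
        $str =~ s/，(……)/$1/gs;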