Acme-CPANLists-PERLANCAR


lib/Acme/CPANLists/PERLANCAR/Task/PickingRandomLinesFromFile.pm

from a specified file. The whole content of the file need not be slurped into
memory, but the routine still has to make a single pass over every line of the
file. The algorithm is the one described in perlfaq (see `perldoc -q "random
line"`).

If you pick more than one line, the result might contain duplicates.
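
The single-pass idea can be sketched as follows. This is only an illustration
of the perlfaq technique, not the module's actual code: the helper name
`pick_random_lines` is hypothetical, and the sketch assumes that each requested
line is chosen independently, which is why duplicates can appear.

```perl
use strict;
use warnings;

# Hypothetical helper (not File::Random's own code): pick $n lines in one
# pass over the filehandle, keeping at most $n lines in memory.
sub pick_random_lines {
    my ($fh, $n) = @_;
    my @picked;
    my $count = 0;
    while (defined(my $line = <$fh>)) {
        $count++;
        # Each slot independently replaces its pick with probability
        # 1/$count, so every line seen so far is equally likely per slot.
        for my $slot (0 .. $n - 1) {
            $picked[$slot] = $line if rand($count) < 1;
        }
    }
    return @picked;    # empty list if the file has no lines
}

# Usage with an in-memory filehandle standing in for a real file.
my $text = "alpha\nbeta\ngamma\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
print pick_random_lines($fh, 2);
close $fh;
```

Because the slots are filled independently, two slots may end up holding the
same line.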

_
            },
            {
                module=>'File::RandomLine',
                summary => 'Recommended for large files',
                description => <<'_',

This module gives you a choice of two algorithms. The first is similar to
<pm:File::Random> (the scan method) and gives each line of the file equal
weight. The second algorithm is more interesting: it seeks to a random position
in the file, discards the line fragment found there (i.e. searches forward for
the next newline character), reads the next line, and repeats the process until
the desired number of lines is reached. This means one doesn't have to read the
whole file, and the picking process is much faster than the scan method. It
might be preferred for very large files.

Note that due to the nature of the algorithm, each line is weighted by the
number of characters in the line immediately preceding it. In other words, a
line that follows a long line has a greater probability of being picked.
Depending on your use case or the line length variation of your file, this
algorithm might or might not be acceptable to you.
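
The seek-based approach can be sketched roughly as below. This is a minimal
illustration under stated assumptions, not the module's implementation: the
helper name `pick_by_seeking` is hypothetical, and wrapping around to the first
line when the seek lands in the last line is an assumption made here so that
every line remains reachable.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper sketching the seek-based idea.
sub pick_by_seeking {
    my ($path, $n) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my $size = -s $fh;
    my @picked;
    while ($size && @picked < $n) {
        # Jump to a random byte offset inside the file.
        seek $fh, int(rand($size)), 0 or die "Cannot seek in $path: $!";
        my $fragment = <$fh>;    # discard the rest of the line we landed in
        my $line     = <$fh>;    # the next full line is the pick
        # Assumption: if we landed in the last line, wrap to the first one.
        unless (defined $line) {
            seek $fh, 0, 0 or die "Cannot seek in $path: $!";
            $line = <$fh>;
        }
        push @picked, $line;
    }
    close $fh;
    return @picked;
}

# Usage: write a small temporary file, then pick three lines from it.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} "short\n", "a much longer line than the others\n", "mid\n";
close $tmp_fh;
print pick_by_seeking($tmp_path, 3);
```

In this sketch the line after the long one ("mid") is picked most often,
because a random offset is most likely to land inside the long line.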

_
            },
            {
                module => 'File::Random::Pick',
                description => <<'_',

This module is an alternative to <pm:File::Random>. It offers a `random_line()`
routine that avoids duplicates.
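
One common way to pick several lines in a single pass without duplicates is
reservoir sampling. The sketch below illustrates that general technique; it is
not necessarily how this module works, and the helper name
`pick_distinct_lines` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: keep a reservoir of up to $n lines, each taken from
# a different position in the file.
sub pick_distinct_lines {
    my ($fh, $n) = @_;
    my @reservoir;
    my $count = 0;
    while (defined(my $line = <$fh>)) {
        $count++;
        if (@reservoir < $n) {
            # Fill the reservoir with the first $n lines.
            push @reservoir, $line;
        }
        else {
            # Replace a random slot with probability $n/$count.
            my $j = int(rand($count));
            $reservoir[$j] = $line if $j < $n;
        }
    }
    return @reservoir;    # fewer than $n lines if the file is shorter
}

# Usage with an in-memory filehandle standing in for a real file.
my $text = "alpha\nbeta\ngamma\ndelta\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
print pick_distinct_lines($fh, 2);
close $fh;
```

No file position is picked twice here, although identical text can still appear
more than once if the file itself contains repeated lines.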

_
            },
            {
                module => 'App::PickRandomLines',
                description => <<'_',

A CLI that lets you use <pm:File::Random::Pick> or <pm:File::RandomLine> from
the command line.

_
            },
        ],
    },
);

1;
# ABSTRACT: Picking random lines from a file

__END__

=pod

=encoding UTF-8

=head1 NAME

Acme::CPANLists::PERLANCAR::Task::PickingRandomLinesFromFile - Picking random lines from a file

=head1 VERSION

This document describes version 0.26 of Acme::CPANLists::PERLANCAR::Task::PickingRandomLinesFromFile (from Perl distribution Acme-CPANLists-PERLANCAR), released on 2017-09-08.

=head1 MODULE LISTS

=head2 Picking random lines from a file

=over

=item * L<File::Random>

The C<random_line()> function from this module picks one or more random lines
from a specified file. The whole content of the file need not be slurped into
memory, but the routine still has to make a single pass over every line of the
file. The algorithm is the one described in perlfaq (see C<perldoc -q "random
line">).

If you pick more than one line, the result might contain duplicates.


=item * L<File::RandomLine> - Recommended for large files

This module gives you a choice of two algorithms. The first is similar to
L<File::Random> (the scan method) and gives each line of the file equal
weight. The second algorithm is more interesting: it seeks to a random position
in the file, discards the line fragment found there (i.e. searches forward for
the next newline character), reads the next line, and repeats the process until
the desired number of lines is reached. This means one doesn't have to read the
whole file, and the picking process is much faster than the scan method. It
might be preferred for very large files.

Note that due to the nature of the algorithm, each line is weighted by the
number of characters in the line immediately preceding it. In other words, a
line that follows a long line has a greater probability of being picked.
Depending on your use case or the line length variation of your file, this
algorithm might or might not be acceptable to you.


=item * L<File::Random::Pick>

This module is an alternative to L<File::Random>. It offers a C<random_line()>
routine that avoids duplicates.


=item * L<App::PickRandomLines>

A CLI that lets you use L<File::Random::Pick> or L<File::RandomLine> from
the command line.


=back

=head1 HOMEPAGE


