File-CountLines
The file is read in equally sized blocks. The size of the blocks can be
supplied with the "blocksize" option. The default is 4096, and can be
changed by setting $File::CountLines::BlockSize.
Do not use a block size smaller than the length of the separator; that
might produce wrong results. (In general there's no reason to choose a
smaller block size at all. Depending on the size of your file, a larger
block size might speed things up a bit.)
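The block-wise reading described above can be sketched roughly as
follows. This is an illustration only, not the module's actual code:
the helper name count_separators is hypothetical, and carrying the
unmatched tail of each block over to the next one is just one way to
cope with a separator that straddles a block boundary. The sketch
counts separator occurrences and does not treat a final line without
a trailing separator specially.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical helper for illustration; not the module's actual code.
# Counts non-overlapping occurrences of a byte-string separator in a
# file that is read in fixed-size blocks.
sub count_separators {
    my ($filename, $separator, $blocksize) = @_;
    $blocksize ||= 4096;    # same default as the documented one
    my $sep_len = length $separator;
    die "separator must not be empty\n" unless $sep_len;

    # Read raw bytes, without any decoding layer.
    open my $fh, '<:raw', $filename
        or die "Cannot open '$filename': $!\n";

    my ($count, $carry) = (0, '');
    while (read($fh, my $block, $blocksize)) {
        my $buf = $carry . $block;
        my ($pos, $last_end) = (0, 0);
        while (($pos = index($buf, $separator, $pos)) >= 0) {
            $count++;
            $pos += $sep_len;
            $last_end = $pos;
        }
        # Keep only the unmatched tail that could still be the start
        # of a separator completed by the next block.
        my $keep_from = max($last_end, length($buf) - ($sep_len - 1));
        $carry = substr($buf, $keep_from);
    }
    close $fh;
    return $count;
}
```

A call such as count_separators($filename, "\r\n", 64_000) would then
count CRLF sequences using 64,000-byte blocks.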
Character Encodings
If you supply a separator yourself, it should not be a decoded string.
The file is read in binary mode, which implies that this module works
fine for text files in ASCII-compatible encodings, including ASCII
itself, UTF-8 and all the ISO-8859-* encodings (aka Latin-1, Latin-2,
...).
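Because the file is read as bytes, a separator given as a character
string has to be encoded first. A minimal sketch using the core
Encode module, assuming the file is UTF-8; the choice of U+2028 as a
separator and the variable names are only for illustration:

```perl
use strict;
use warnings;
use Encode qw(encode);

# A decoded (character) string: U+2028 LINE SEPARATOR, example only.
my $sep_chars = "\x{2028}";

# Encode it to bytes in the file's encoding (UTF-8 assumed here).
# The result is the three bytes 0xE2 0x80 0xA8.
my $sep_bytes = encode('UTF-8', $sep_chars);

printf "%d character(s), %d byte(s)\n",
    length($sep_chars), length($sep_bytes);
```

It is the byte string in $sep_bytes, not $sep_chars, that should be
supplied as the separator.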
Note that multibyte encodings such as UTF-32, UTF-16le, UTF-16be and
UCS-2 encode a line feed character in such a way that the 0x0A byte is
a substring of the encoded character, but if you search blindly for that
byte you will get false positives. For example the *LATIN CAPITAL LETTER
lib/File/CountLines.pm
can be supplied with the C<blocksize> option. The default is 4096,
and can be changed by setting C<$File::CountLines::BlockSize>.
Do not use a block size smaller than the length of the separator; that
might produce wrong results. (In general there's no reason to choose a
smaller block size at all. Depending on the size of your file, a larger
block size might speed things up a bit.)
=head1 Character Encodings
If you supply a separator yourself, it should not be a decoded string.
The file is read in binary mode, which implies that this module
works fine for text files in ASCII-compatible encodings, including
ASCII itself, UTF-8 and all the ISO-8859-* encodings (aka Latin-1,
Latin-2, ...).
Note that multibyte encodings such as UTF-32, UTF-16le, UTF-16be
and UCS-2 encode a line feed character in such a way that the C<0x0A>
byte is a substring of the encoded character, but if you search blindly
for that byte you will get false positives. For example the I<LATIN CAPITAL