Acme-InputRecordSeparatorIsRegexp
Remember: the value of $/ is a string, not a regex. awk has to
be better for something. :-)
This module attempts to get around that limitation, providing
a mechanism (using tied filehandles) to get the readline function
(and readline operator <...>) to define "lines" with respect to
a regular expression.
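The tied-filehandle idea can be sketched in core Perl. This is only an illustration, not the module's actual implementation: the package name My::RegexpReadline, the 4096-byte chunk size, and the buffering strategy are all assumptions made for the example, and it assumes the separator pattern never matches the empty string.

```perl
use strict;
use warnings;

# Hypothetical illustration class; NOT the module's real code.
package My::RegexpReadline;

sub TIEHANDLE {
    my ($class, $fh, $sep) = @_;
    return bless { fh => $fh, sep => qr/$sep/, buf => '' }, $class;
}

# Return the next record (including its separator), or undef at end of input.
sub _next_record {
    my ($self) = @_;
    while (1) {
        # Accept a separator only if it ends before the end of the buffer;
        # a match touching the end (e.g. "\r") might grow with more input ("\r\n").
        if ($self->{buf} =~ /$self->{sep}/ && $+[0] < length($self->{buf})) {
            return substr($self->{buf}, 0, $+[0], '');
        }
        my $n = read($self->{fh}, my $chunk, 4096);
        if (!$n) {    # end of input (or read error): flush what is left
            return undef unless length($self->{buf});
            return substr($self->{buf}, 0, length($self->{buf}), '');
        }
        $self->{buf} .= $chunk;
    }
}

# Called by readline() and <...> on the tied handle.
sub READLINE {
    my ($self) = @_;
    return $self->_next_record unless wantarray;
    my @records;
    while (defined(my $rec = $self->_next_record)) {
        push @records, $rec;
    }
    return @records;
}

package main;
use Symbol ();

# In-memory input mixing Unix, Windows, and old Mac line endings.
open my $in, '<', \"one\ntwo\r\nthree\rfour" or die "open failed: $!";
my $fh = Symbol::gensym();
tie *$fh, 'My::RegexpReadline', $in, '\r\n|\r|\n';

my @records = <$fh>;
printf "%d records read\n", scalar @records;
```

The separator is listed as \r\n before \r so that a Windows line ending is consumed as one unit rather than as two separate separators.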
A common use case is reading a text file when you don't know
whether it uses Unix (\n), Windows/DOS (\r\n), or Mac (\r)
style line endings, or whether it might contain all three.
Other use cases include files that contain several types of
records, where each type of record is terminated by a different
sequence of characters.
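As a rough sketch of the multiple-record-type case, the following core-Perl snippet splits an in-memory string whose records end in different terminators. The data format (";" ending "command" records, "!!" ending "alert" records) is hypothetical, and the snippet slurps the whole input rather than reading through a tied filehandle, so it only illustrates what a regular-expression record separator means.

```perl
use strict;
use warnings;

# Hypothetical data: "command" records end with ';', "alert" records end with '!!'.
my $text = "set a=1;set b=2;disk full!!set c=3;";
my $sep  = qr/;|!!/;

# Each match is the shortest run of text ending in a separator,
# or, failing that, whatever is left at the end of the input.
my @records = $text =~ /\G(.*?$sep|.+)/gs;

for my $rec (@records) {
    my $type = $rec =~ /!!\z/ ? 'alert' : 'command';
    print "$type: $rec\n";
}
```

Each record keeps its terminator, which is how the record type is recognized afterwards.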
INSTALLATION
To install this module, run the following commands:
lib/Acme/InputRecordSeparatorIsRegexp.pm
=head1 VERSION
Version 0.07
=head1 SYNOPSIS
use Acme::InputRecordSeparatorIsRegexp;
# open-then-tie
open my $fh, '<', 'file-with-Win-Mac-and-Unix-line-endings';
tie *$fh, 'Acme::IRSRegexp', $fh, '\r\n|\n|\r';
while (<$fh>) {
# $_ could have "\r\n", "\n", or "\r" line ending now
}
# tie-then-open
tie *{$fh=Symbol::gensym}, 'Acme::IRSRegexp', qr/\r\n|[\r\n]/;
open $fh, '<', 'file-with-ambiguous-line-endings';
$line = <$fh>;
Remember: the value of $/ is a string, not a regex. B<awk>
has to be better for something. :-)
=back
This module provides a mechanism to read records from a file
using a regular expression as a record separator.
A common use case for this module is reading a text file
when you don't know whether it uses Unix (C<\n>),
Windows/DOS (C<\r\n>), or Mac (C<\r>) style line endings,
or whether it might contain all three. To properly parse
this file, you could tie its file handle to this package with
the appropriate regular expression:
my $fh = Symbol::gensym;
tie *$fh, 'Acme::InputRecordSeparatorIsRegexp', '\r\n|\r|\n';
open $fh, '<', 'file-with-ambiguous-line-endings';
@lines = <$fh>;