Acme-InputRecordSeparatorIsRegexp


README


    Remember: the value of $/ is a string, not a regex. awk has to
    be better for something. :-)

This module attempts to get around that limitation, providing
a mechanism (using tied filehandles) to get the readline function
(and readline operator <...>) to define "lines" with respect to
a regular expression.
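
The tied-filehandle idea can be sketched in a few lines of core Perl.
This is only an illustration under stated assumptions, not the module's
actual implementation: the package name Local::RegexLines is hypothetical,
the sketch slurps the whole file on first read instead of streaming it,
and it assumes the separator pattern contains no capturing groups.

```perl
use strict;
use warnings;

package Local::RegexLines;    # hypothetical name, not part of the module

# tie *GLOB, 'Local::RegexLines', $source_handle, $separator_pattern
sub TIEHANDLE {
    my ($class, $fh, $sep) = @_;
    return bless { fh => $fh, sep => qr/$sep/, records => undef }, $class;
}

# Read the whole underlying handle once and cut it into records.
sub _load {
    my ($self) = @_;
    return if defined $self->{records};
    local $/;                              # undef $/ means slurp mode
    my $text = readline($self->{fh});
    $text = '' unless defined $text;
    my $sep = $self->{sep};
    # A record is the shortest run of characters that ends in a separator,
    # or whatever is left over at the end of the input.
    $self->{records} = [ $text =~ /.*?(?:$sep)|.+/gs ];
    return;
}

# Called for <$handle> and readline($handle) on the tied glob.
sub READLINE {
    my ($self) = @_;
    $self->_load;
    return wantarray
        ? splice(@{ $self->{records} })    # list context: all remaining records
        : shift @{ $self->{records} };     # scalar context: the next record
}

package main;
use Symbol qw(gensym);

# In-memory file with mixed line endings, so the sketch is self-contained.
my $data = "one\r\ntwo\nthree\rfour";
open my $raw, '<', \$data or die "cannot open in-memory file: $!";

# Tie a separate glob so READLINE can still read from the untied $raw.
my $tied = gensym;
tie *$tied, 'Local::RegexLines', $raw, '\r\n|\r|\n';

my @records = <$tied>;
printf "%d records\n", scalar @records;    # prints "4 records"
```

Each record keeps its own terminator, just as a line read with the
ordinary readline keeps its trailing "\n".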

A common use case is reading a text file whose line-ending
style is unknown: Unix (\n), Windows/DOS (\r\n), or Mac (\r),
or possibly a mix of all three.

Another use case is a file that contains several types of
records, each terminated by a different sequence of
characters.
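
As a rough illustration of the multiple-record-type case, the sketch below
splits an in-memory string in plain Perl, without the module. The data
layout is invented for the example: header records are assumed to end in
";;" and body records in "\n". Capturing the terminator alongside the text
lets the caller tell the record types apart.

```perl
use strict;
use warnings;

# Hypothetical data: header records end in ";;", body records end in "\n".
my $data = "H1;;H2;;body one\nbody two\n";
my $sep  = qr/;;|\n/;

# Walk the string, capturing each record's text and its terminator.
my @records;
while ($data =~ /\G(.*?)($sep)/gcs) {
    push @records, { text => $1, terminator => $2 };
}

for my $r (@records) {
    my $kind = $r->{terminator} eq ';;' ? 'header' : 'body';
    print "$kind: $r->{text}\n";
}
```

With the module, the same pattern would be passed as the record separator
when tying the filehandle, and each record read would end in whichever
terminator matched.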


INSTALLATION

To install this module, run the following commands:

lib/Acme/InputRecordSeparatorIsRegexp.pm


=head1 VERSION

Version 0.07

=head1 SYNOPSIS

    use Acme::InputRecordSeparatorIsRegexp;

    # open-then-tie
    open my $fh, '<', 'file-with-Win-Mac-and-Unix-line-endings'
        or die "open failed: $!";
    tie *$fh, 'Acme::IRSRegexp', $fh, '\r\n|\n|\r';
    while (<$fh>) {
        # $_ could have "\r\n", "\n", or "\r" line ending now
    }

    # tie-then-open
    tie *{$fh=Symbol::gensym}, 'Acme::IRSRegExp', qr/\r\n|[\r\n]/;
    open $fh, '<', 'file-with-ambiguous-line-endings'
        or die "open failed: $!";
    my $line = <$fh>;

lib/Acme/InputRecordSeparatorIsRegexp.pm


Remember: the value of $/ is a string, not a regex. B<awk>
has to be better for something. :-)

=back

This module provides a mechanism to read records from a file
using a regular expression as a record separator.

A common use case for this module is reading a text file
whose line-ending style is unknown: Unix (C<\n>),
Windows/DOS (C<\r\n>), or Mac (C<\r>), or possibly a mix of
all three. To parse such a file properly, tie its filehandle
to this package with the appropriate regular expression:

    my $fh = Symbol::gensym;
    tie *$fh, 'Acme::InputRecordSeparatorIsRegexp', '\r\n|\r|\n';
    open $fh, '<', 'file-with-ambiguous-line-endings'
        or die "open failed: $!";

    my @lines = <$fh>;


