Acme-InputRecordSeparatorIsRegexp

to the input record separator of a file handle, available since
v0.04,  is to import this package's C<open> function and to
specify an C<:irs(...)> I<pseudo-layer>.

   use Acme::InputRecordSeparatorIsRegexp 'open';
   $result = open FILEHANDLE, "<:irs(REGEXP)", EXPR
   $result = open FILEHANDLE, "<:irs(REGEXP)", EXPR, LIST
   $result = open FILEHANDLE, "<:irs(REGEXP)", REFERENCE

   $result = open my $fh, '<:irs(\r\n|\r|\n)', "ambiguous-line-endings.txt";

The C<:irs(...)> layer may be combined with other layers.

   open my $fh, '<:encoding(UTF-16):irs(\R)', "ambiguous.txt";

See also: L<"binmode">
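
To make the idea of a regular expression as record separator concrete,
here is a minimal sketch in core Perl. It does not use this package;
C<split_records> is a hypothetical helper that splits an in-memory string
the way C<readline> would split a stream. It also shows that alternation
order matters when one separator is a prefix of another, because Perl tries
alternatives left to right. Whether this package behaves identically in that
case is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of this package: split $text into records,
# each keeping the separator that terminated it, as readline would.
sub split_records {
    my ($text, $irs) = @_;
    my @records;
    # \G continues from the previous match; /c keeps pos() after a failed match.
    push @records, $1 while $text =~ /\G(.*?(?:$irs))/gcs;
    my $pos = pos($text) // 0;
    # Whatever follows the last separator is an unterminated final record.
    push @records, substr($text, $pos) if $pos < length $text;
    return @records;
}

my $text = "one\r\ntwo\rthree\nfour";

my @longest_first  = split_records($text, qr/\r\n|\r|\n/);
my @shortest_first = split_records($text, qr/\r|\n|\r\n/);

printf "%d records with \\r\\n listed first\n", scalar @longest_first;
printf "%d records with \\r listed first\n",    scalar @shortest_first;
```

With C<\r> listed first, the C<\r\n> pair is consumed as two separate
separators, so an extra record containing only C<\n> appears.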

=head2 autochomp

Returns the current setting, or sets the C<autochomp> attribute
of a file handle associated with this package. When the
C<autochomp> attribute of the file handle is enabled, any lines
read from the file handle through the C<readline> function
or C<< <> >> operator will be returned with the (custom) line
endings automatically removed.

    use Acme::InputRecordSeparatorIsRegexp 'open','autochomp';
    open my $fh, '<:irs(\R)', 'ambiguous.txt';
    autochomp($fh,1);           # enable autochomp
    my $is_autochomped = autochomp($fh);
    autochomp(tied(*$fh), 0);   # disable

This function can also be called as a method on the I<tied>
file handle.

    (tied *$fh)->autochomp(1);  # enable
    $fh->autochomp(0);          # not OK, must use tied handle

Enabling C<autochomp> with this function on a regular file handle
will tie the file handle into this package using the current
value of C<$/> as the handle's record separator. If you are
just looking for autochomp functionality and don't care about
applying regular expressions to determine line endings, this
function provides an (inefficient) way to do that to
arbitrary file handles.

The default attribute value is false.
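
The following sketch shows how a tied file handle can offer an
autochomp switch. It is not this package's implementation:
C<Sketch::AutochompHandle> is a hypothetical class that slurps its input
for simplicity, and only scalar-context reads are handled.

```perl
use strict;
use warnings;
use Symbol qw(gensym);

package Sketch::AutochompHandle;   # hypothetical class, for illustration only

sub TIEHANDLE {
    my ($class, $fh, $irs) = @_;
    return bless { fh => $fh, irs => $irs, autochomp => 0, rest => undef }, $class;
}

# Getter/setter: returns the current setting, optionally changing it first.
sub autochomp {
    my $self = shift;
    $self->{autochomp} = $_[0] ? 1 : 0 if @_;
    return $self->{autochomp};
}

# Called for <$fh> on the tied handle (scalar context only in this sketch).
sub READLINE {
    my $self = shift;
    # Slurp the underlying handle once; simple rather than efficient.
    $self->{rest} //= do {
        local $/;
        my $fh  = $self->{fh};
        my $all = <$fh>;
        $all // '';
    };
    return unless length $self->{rest};
    my $irs = $self->{irs};
    if ($self->{rest} =~ s/\A(.*?)($irs)//s) {
        # Drop the matched separator only when autochomp is enabled.
        return $self->{autochomp} ? $1 : $1 . $2;
    }
    my $last = $self->{rest};      # final record without a separator
    $self->{rest} = '';
    return $last;
}

package main;

my $data = "one\r\ntwo\rthree\n";
open my $raw, '<', \$data or die "open: $!";

my $fh = gensym;
tie *$fh, 'Sketch::AutochompHandle', $raw, qr/\r\n|\r|\n/;

my $first = <$fh>;                 # separator kept
(tied *$fh)->autochomp(1);
my $second = <$fh>;                # separator removed
my $third  = <$fh>;
my $eof    = <$fh>;                # undef at end of input
```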

=head2 binmode FILEHANDLE, LAYER

Overrides Perl's builtin L<binmode|perlfunc/"binmode"> function.
If the I<pseudo-layer> C<:irs(...)> is specified, applies the
given regular expression as the dynamic input record separator for
the given filehandle.
Any other layers specified are passed to Perl's builtin C<binmode>
function.


=head2 input_record_separator

Returns the current setting, or changes the setting, of a file handle's
input record separator, I<including file handles that have not
already been tied to this package>. This overcomes a limitation
in L<IO::Handle::input_record_separator|IO::Handle/"METHODS">
where input record separators are not supported on a per-filehandle
basis.
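
The limitation can be seen with core Perl alone. In this small sketch
(the variable names are illustrative), C<$/> is changed with one file handle
in mind, yet every handle read in that scope is affected:

```perl
use strict;
use warnings;

my ($semicolons, $newlines) = ("a;b;c", "x\ny\n");
open my $fh1, '<', \$semicolons or die "open: $!";
open my $fh2, '<', \$newlines   or die "open: $!";

my (@from_fh1, @from_fh2);
{
    local $/ = ';';        # meant for $fh1 only ...
    @from_fh1 = <$fh1>;    # three records: "a;", "b;", "c"
    @from_fh2 = <$fh2>;    # ... but $fh2 is read with ';' as well
}

printf "fh1: %d records, fh2: %d record(s)\n",
    scalar @from_fh1, scalar @from_fh2;
```

Because C<$fh2> contains no semicolon, its whole content comes back as a
single record instead of two lines.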

With no arguments, returns the input record separator associated
with the file handle. For regular file handles, this is always
the current value of L<< C<$/>|perlvar/"$INPUT_RECORD_SEPARATOR" >>.

    use Acme::InputRecordSeparatorIsRegexp ':all';

    open my $fh_reg, "<", "some_text_file";
    open my $fh_pkg, '<:irs(\d)', "some_text_file";

    $rs = $fh_reg->input_record_separator;    #   "\n"
    $rs = input_record_separator($fh_reg);    #   "\n"
    $rs = $fh_pkg->input_record_separator;    #   '\d'
    $rs = input_record_separator($fh_pkg);    #   '\d'

With two or more arguments, sets the input record separator for
the file handle to the regular expression given by the second
argument (any argument after the second is ignored). For regular
file handles, a side effect is that the file handle will be tied
to this package:

    print ref(tied *$fh_reg);        #   ""
    $fh_reg->input_record_separator(qr/\r\n|\r|\n/);
    print ref(tied *$fh_reg);        #   "Acme::InputRecordSeparatorIsRegexp"

If you are just looking for the functionality of setting different
input record separators on different file handles but don't care about
applying regular expressions to determine line endings, this function
provides an (inefficient) way to do that for arbitrary file handles.
Note that the argument to set the input record separator is treated
as a regular expression, so apply C<quotemeta> to it as necessary.
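
As an illustration of why quoting matters, this core-Perl sketch uses
C<split> as a stand-in for record splitting. The separator C<.> is meant
literally, but as an unquoted regular expression it matches any character:

```perl
use strict;
use warnings;

my $literal = '.';                   # separator intended to be a literal dot
my $quoted  = quotemeta $literal;    # the two-character string \.

# Unquoted, '.' matches every character, so every field is empty
# and split returns nothing at all.
my @unquoted_fields = split /$literal/, 'a.b.c';

# Quoted, only the literal dot separates fields.
my @quoted_fields = split /$quoted/, 'a.b.c';

printf "unquoted: %d fields, quoted: %d fields\n",
    scalar @unquoted_fields, scalar @quoted_fields;
```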

=head1 METHODS

=head2 chomp

    my $chars_removed = (tied *$fh)->chomp($line_from_fh);
    my $chars_removed = (tied *$fh)->chomp(@lines_from_fh);

Like the builtin L<< C<chomp>|perlfunc/"chomp" >> function,
but removes from the end of each line the text that matches
the file handle's custom input record separator regular
expression, instead of C<$/>. Like the builtin C<chomp>,
returns the total number of characters removed from
all its arguments. See also L<"autochomp">.
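
The semantics described above can be sketched in core Perl as follows.
C<regex_chomp> is a hypothetical stand-in for the method, not the package's
own code. It modifies its arguments in place through C<@_> aliasing and
returns the number of characters removed, and the builtin C<chomp> is shown
for comparison.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the tied handle's chomp method.
sub regex_chomp {
    my $irs     = shift;
    my $removed = 0;
    for my $line (@_) {                  # @_ aliases the caller's variables
        # \z anchors at the very end of the string, unlike $.
        $removed += length $1 if $line =~ s/($irs)\z//;
    }
    return $removed;
}

my @lines = ("one\r\n", "two\r", "three");
my $count = regex_chomp(qr/\r\n|\r|\n/, @lines);

# The builtin only knows about $/ (a newline by default).
my $copy    = "two\r";
my $builtin = chomp $copy;

print "regex_chomp removed $count, builtin chomp removed $builtin\n";
```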

=head1 INTERNALS

In unusual circumstances, you may be interested in some of the
internals of the tied file handle object. You can set the values
of these internals by passing additional arguments to the
C<tie> statement or passing a hash reference to this package's 
L<"open"> function, for example:

    my $th = Acme::InputRecordSeparatorIsRegexp->open( $regex, '<', $filename,
                { bufsize => 65336 } );

=head2 bufsize

The amount of data, in bytes, to read from the input stream at
a time. For performance reasons, this should be at least a few kilobytes.
B<For the module to work correctly, it should also be much larger
than the length of any sequence of characters that could be construed
as a line ending.>
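
To see why the buffer must be comfortably larger than any separator,
consider this deliberately naive chunked reader. It is a sketch under
stated assumptions, not this package's buffering logic: C<read_records> is
a hypothetical helper that emits a record as soon as the separator pattern
matches in the current buffer.

```perl
use strict;
use warnings;

# Hypothetical naive reader: refill a buffer $bufsize bytes at a time and
# emit every record whose separator is already visible in the buffer.
sub read_records {
    my ($fh, $irs, $bufsize) = @_;
    my $buffer = '';
    my @records;
    while (read($fh, my $chunk, $bufsize)) {
        $buffer .= $chunk;
        push @records, $1 while $buffer =~ s/\A(.*?(?:$irs))//s;
    }
    push @records, $buffer if length $buffer;   # unterminated tail, if any
    return @records;
}

my $data = "ab\r\ncd\r\n";
my $irs  = qr/\r\n|\r|\n/;

open my $tiny, '<', \$data or die "open: $!";
my @tiny_buffer = read_records($tiny, $irs, 3);      # "\r\n" straddles a chunk

open my $roomy, '<', \$data or die "open: $!";
my @roomy_buffer = read_records($roomy, $irs, 1024);

printf "bufsize 3: %d records, bufsize 1024: %d records\n",
    scalar @tiny_buffer, scalar @roomy_buffer;
```

With a three-byte buffer the first chunk ends in C<\r>, which already
matches the pattern, so the following C<\n> is reported as a record of its
own. A larger buffer makes such boundary cases rare, which is the motivation
for the advice above.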

=head1 ALIAS

The package C<Acme::IRSRegexp> is an alias for
C<Acme::InputRecordSeparatorIsRegexp>, allowing you to write

    use Acme::InputRecordSeparatorIsRegexp;
    tie *$fh, 'Acme::IRSRegexp', 

=head1 AUTHOR

Marty O'Brien, C<< <mob at cpan.org> >>

=head1 BUGS, LIMITATIONS, AND OTHER NOTES

Because this package must often pre-fetch input to determine where
a line-ending is, it is generally not appropriate to apply this
package to C<STDIN> or other terminal-like input.

Changing C<$/> will have no effect on a file handle that has
already been tied to this package.

Calling L<< C<chomp>|perlfunc/"chomp" >> on a return value from this
package will operate with C<$/>, B<not> with the regular expression
associated with the tied file handle. Use the construction
C<< tied(*$fh)->chomp(...) >> to perform the chomp operation on
a filehandle that has customized its input record separator with
this package. Or see the L<< C<autochomp>|"autochomp" >> method
to automatically get chomped input.

Please report any bugs or feature requests to 
C<bug-acme-inputrecordseparatorisregexp at rt.cpan.org>, or through
the web interface at L<http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Acme-InputRecordSeparatorIsRegexp>.  
I will be notified, and then you'll
automatically be notified of progress on your bug as I make changes.


=head1 SUPPORT

You can find documentation for this module with the perldoc command.

    perldoc Acme::InputRecordSeparatorIsRegexp

You can also look for information at:

=over 4

=item * RT: CPAN's request tracker (report bugs here)

L<http://rt.cpan.org/NoAuth/Bugs.html?Dist=Acme-InputRecordSeparatorIsRegexp>

=item * AnnoCPAN: Annotated CPAN documentation

L<http://annocpan.org/dist/Acme-InputRecordSeparatorIsRegexp>

=item * CPAN Ratings

L<http://cpanratings.perl.org/d/Acme-InputRecordSeparatorIsRegexp>

=item * Search CPAN

L<http://search.cpan.org/dist/Acme-InputRecordSeparatorIsRegexp/>

=back


=head1 ACKNOWLEDGEMENTS

L<perlvar|perlvar/"$INPUT_RECORD_SEPARATOR">

=head1 LICENSE AND COPYRIGHT

Copyright 2013-2018 Marty O'Brien.

This program is free software; you can redistribute it and/or modify it
under the terms of the Artistic License (2.0). You may obtain a
copy of the full license at:

L<http://www.perlfoundation.org/artistic_license_2_0>

Any use, modification, and distribution of the Standard or Modified
Versions is governed by this Artistic License. By using, modifying or
distributing the Package, you accept this license. Do not use, modify,
or distribute the Package, if you do not accept this license.

If your Modified Version has been derived from a Modified Version made
by someone other than you, you are nevertheless required to ensure that
your Modified Version complies with the requirements of this license.

This license does not grant you the right to use any trademark, service
mark, tradename, or logo of the Copyright Holder.

This license includes the non-exclusive, worldwide, free-of-charge
patent license to make, have made, use, offer to sell, sell, import and
otherwise transfer the Package with respect to any patent claims
