Regexp-Bind

Bind.pm

  $record = bind_array($string, $regexp);
  @record = global_bind_array($string, $regexp);

  $record = bind_array(\$string, $regexp);
  @record = global_bind_array(\$string, $regexp);

  $record = bind(\$string, $embedded_regexp);
  @record = global_bind(\$string, $embedded_regexp);


=head1 DESCRIPTION

This module extends Perl's native regexp matching. It binds anonymous
hashes or named variables to matched buffers. Both normal regexp
syntax and embedded regexp syntax are supported. You can view it as a
tiny data extraction system.

=head1 FUNCTIONS

Two types of functions are available for export on request. They bind
the given fields to captured contents, and return anonymous hashes or
arrays of the fields.

In the following examples, you can pass in either a string or a
reference to a string.

=head2 Match the first occurrence

  use Data::Dumper;

=head3 Binding to anonymous hash

  $record = bind($string, $regexp, qw(field_1 field_2 field_3));
  print Dumper $record;

=head3 Binding to array

  $record = bind_array($string, $regexp);
  print $record->[0];

=head2 Do global matching and store matched parts in @record 

=head3 Binding to anonymous hash

  @record = global_bind($string, $regexp, qw(field_1 field_2 field_3));
  print Dumper $_ foreach @record;

=head3 Binding to array

  @record = global_bind_array($string, $regexp);
  print $record[0]->[0];
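
The difference between first-match and global binding can be pictured
in plain Perl. The following is only a sketch and not the module's
implementation: sketch_bind() and sketch_global_bind() are
hypothetical stand-ins, they accept plain strings only (no string
references), and the global variant assumes the regexp has exactly one
capture group per field.

  use strict;
  use warnings;

  # Hypothetical stand-in: bind the captures of the first match to fields.
  sub sketch_bind {
      my ($string, $regexp, @fields) = @_;
      my @captures = $string =~ $regexp or return;
      my %record;
      @record{@fields} = @captures;      # pair names with captures by position
      return \%record;
  }

  # Hypothetical stand-in: one anonymous hash per match.
  sub sketch_global_bind {
      my ($string, $regexp, @fields) = @_;
      my @captures = $string =~ /$regexp/g;   # flat list of all captures
      my @records;
      while (my @row = splice @captures, 0, scalar @fields) {
          my %record;
          @record{@fields} = @row;
          push @records, \%record;
      }
      return @records;
  }

  my $text  = "alice 30\nbob 25\n";
  my $first = sketch_bind($text, qr/(\w+) (\d+)/, qw(name age));
  print "$first->{name} $first->{age}\n";            # alice 30
  print "$_->{name} $_->{age}\n"                     # alice 30, then bob 25
      for sketch_global_bind($text, qr/(\w+) (\d+)/, qw(name age));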

=head1 NAMED VARIABLE BINDING

To use named variable binding, set $Regexp::Bind::USE_NAMED_VAR to a
non-undef value. Matched parts are then bound to named variables when
you call bind(). Named variable binding is not supported by
global_bind(), bind_array(), or global_bind_array().

  $Regexp::Bind::USE_NAMED_VAR = 1;
  bind($string, $regexp, qw(field_1 field_2 field_3));
  print "$field_1 $field_2 $field_3\n";


=head1 EMBEDDED REGEXP

Embedded regexp syntax lets you embed field names directly in the
regexp itself. The syntax exploits the in-line comment feature of
regexps.

The module first tries to detect whether embedded syntax is used. If
it is, the comments are stripped and the regexp is turned back into a
simple one.

With embedded syntax, for the sake of simplicity and legibility, field
names are restricted to B<alphanumerics> only (underscores, as in the
examples here, are also used). bind_array() and global_bind_array() do
not support embedded syntax.


Example:

  bind($string, qr'# (?#<field_1>\w+) (?#<field_2>\d+)\n'm);

is converted into

  bind($string, qr'# (\w+) (\d+)\n'm);

If embedded syntax is detected, any further input arguments are
ignored. This means that

  bind($string, qr'# (?#<field_1>\w+) (?#<field_2>\d+)\n'm,
       qw(field_1 field_2));

is the same as

  bind($string, qr'# (?#<field_1>\w+) (?#<field_2>\d+)\n'm);

and conceptually equal to

  bind($string, qr'# (\w+) (\d+)\n'm, qw(field_1 field_2));



Note that the module simply replaces B<(?#E<lt>field nameE<gt>> with
B<(> and binds the field's name to the corresponding buffer. It does
not check for syntax correctness, so more elaborate usage may fail.
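
As a rough illustration of the replacement described above (not the
module's actual code), the conversion can be sketched in plain Perl.
The helper name strip_embedded() is hypothetical, and the pattern
assumes that field names consist of word characters.

  use strict;
  use warnings;

  # Hypothetical helper, not part of Regexp::Bind: rewrites each
  # "(?#<name>" to "(" and collects the field names in order.
  sub strip_embedded {
      my ($pattern) = @_;
      my @fields;
      $pattern =~ s/\(\?#<(\w+)>/push @fields, $1; '('/ge;
      return ($pattern, @fields);
  }

  my ($plain, @fields) =
      strip_embedded('# (?#<field_1>\w+) (?#<field_2>\d+)\n');
  print "$plain\n";       # prints: # (\w+) (\d+)\n
  print "@fields\n";      # prints: field_1 field_2

The returned pattern and field list correspond to the arguments of the
plain (non-embedded) bind() call shown earlier.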


=head1 INLINE FILTERING

Inline filtering now works with B<embedded syntax>. Each matched part
is saved in $_, and you can apply a simple transformation within the
braces before it is exported.

  bind($string, qr'# (?#<field_1>{ s/\s+//, $_ }\w+) (?#<field_2>{ $_*= 10, $_ }\d+)\n'm);
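
Each filter above ends in ", $_" because s/// and "*=" do not
necessarily evaluate to the transformed text (s/// returns a count, for
instance), whereas the comma operator makes the whole expression
evaluate to $_. The sketch below shows that idea with code references.
It is an assumption-laden illustration: apply_filter() is a
hypothetical helper, and how the module itself evaluates the filter
text is not shown in this document.

  use strict;
  use warnings;

  # Hypothetical helper: run a filter with the captured text in $_
  # and hand back whatever the filter evaluates to.
  sub apply_filter {
      my ($filter, $value) = @_;
      local $_ = $value;      # work on a copy of the captured text
      return $filter->();
  }

  my $word   = apply_filter(sub { s/\s+//, $_ }, ' foo');
  my $number = apply_filter(sub { $_ *= 10, $_ }, '42');
  print "$word $number\n";    # prints: foo 420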


=cut

package Regexp::Bind;

use Exporter;
our @ISA = qw(Exporter);
our @EXPORT_OK = qw(bind global_bind bind_array global_bind_array);
our $VERSION = '0.05';


