Text-Embed


lib/Text/Embed.pm  view on Meta::CPAN

package Text::Embed;

use strict;
use warnings;
use Carp;

our $VERSION  = '0.03';

my %modules   = ();
my %regexen   = ();
my %callbacks = ();
my %handles   = ();

my $rex_proc  = undef;
my $rex_parse = undef;

my $NL        = '(?:\r?\n)'; 
my $VARS      = '\$\((\w+)\)';

#
# Default handlers for parsing - see POD
# 

my %def_parse  =
(
    ':underscore' => qr/${NL}__([^_].*[^_])__$NL/,
    ':define'     => qr/${NL}#define\s+?(\S+?)(?:$NL|\s+?$NL|\s+?)/,
    ':cdata'      => sub{$_ = shift or return; 
                       return($$_ =~ m#\s*?<!\[(.+?)\[(.*?)\]\]>\s*#sgo);
                     },
);

$def_parse{':default'} = $def_parse{':underscore'};
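# Group the alternation so that /^$rex_parse$/ anchors every alias, not just the first and last
$rex_parse             = '(?:' . join('|', map { quotemeta } keys %def_parse) . ')';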

#
# Default handlers for processing - see POD
# 

my %def_proc  =
(
    ':raw'            => undef,
    ':trim'           => sub{ trim($_[1]);     },
    ':compress'       => sub{ compress($_[1]); },
    ':block-indent'   => sub{ block($_[1]);    },
    ':block-noindent' => sub{ block($_[1],1);  },

    ':strip-cpp'      => sub{strip($_[1],'/\*','\*/'),strip($_[1], '//');},
    ':strip-c'        => sub{strip($_[1],'/\*','\*/');},
    ':strip-xml'      => sub{strip($_[1],'<!--','-->');},
    ':strip-perl'     => sub{strip($_[1]);},
);

$def_proc{':default'}  = $def_proc{':raw'};
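# Group the alternation so that /^$rex_proc$/ anchors every alias, not just the first and last
$rex_proc              = '(?:' . join('|', map { quotemeta } keys %def_proc) . ')';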

#
# import: 
# process arguments and tie caller's %DATA
#
sub import
{
    my $package = shift;
    my $regex   = shift;
    my $cback   = @_ ? [@_] : undef;
    my $caller  = caller;

    $regex = $def_parse{$regex}    if($regex && $regex =~ /^$rex_parse$/);
    $regex = $def_parse{':default'} unless $regex;

    # NB: test for existence...
    if(!exists $modules{$caller}){
        # process all callbacks that are stringified
        no strict 'refs';
        if($cback){
            foreach(@$cback){
                if(!ref $_){
                    if($_ =~ /^$rex_proc$/){
                        # predefined alias
                        $_ = $def_proc{$_};
                    }
                    else{
                        # stringy code ref - relative or absolute
                        $_ = ($_ =~ /\:\:/go) ? \&{$_} : 
                                                \&{$caller."\::".$_}; 
                    }
                }
                else{

lib/Text/Embed.pm  view on Meta::CPAN

        ...          # process pairs
    );

    ...

    __DATA__

    ...

=head3 Stage 1: Parsing

By default, B<Text::Embed> separates segments with a syntax similar to the 
__DATA__ token: a line consisting of an identifier with two underscores on 
each side. Of course, a suitable syntax depends on the text being embedded.

A REGEX or CODE reference can be passed as the first argument - in order 
to gain finer control of how __DATA__ is parsed:

=over 4

=item REGEX

    use Text::Embed qr(<<<<<<<<(\w*?)>>>>>>>>);

A regular expression will be used in a call to C<split()>. Any 
leading or trailing empty strings will be removed automatically.

=item CODE

    use Text::Embed sub{$_ = shift; ...};
    use Text::Embed \&Some::Other::Function;

A subroutine will be passed a reference to the __DATA__ I<string>. 
It should return a LIST of key-value pairs.

=back
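
The two parsing styles above can be sketched without loading the module. This is only an illustration under stated assumptions: the sample text, the C<key = value> format and the variable names are hypothetical, and the module's own internals may differ. It shows how C<split()> with a capturing group yields alternating keys and values, and how a CODE parser can return the same kind of LIST from a string reference.

```perl
use strict;
use warnings;

# Hypothetical embedded text using the delimiter from the REGEX item.
my $text = join "\n",
    '<<<<<<<<GREETING>>>>>>>>', 'hello world',
    '<<<<<<<<FAREWELL>>>>>>>>', 'goodbye', '';

my $delim = qr(<<<<<<<<(\w*?)>>>>>>>>);

# The capture group makes split() return the identifiers between the chunks.
my @fields = split $delim, $text;

# split() already drops trailing empty fields; drop the leading ones too.
shift @fields while @fields && $fields[0] eq '';

my %data = @fields;    # GREETING and FAREWELL, values keep their newlines

# A CODE-style parser: takes a reference to the text, returns key-value pairs.
# The "key = value" line format is an assumption made for this sketch.
my $parser = sub {
    my $ref = shift;
    return map { /^(\w+)\s*=\s*(.*)$/ ? ($1, $2) : () } split /\n/, $$ref;
};

my %cfg = $parser->(\"name = demo\nmode = test\n");
```

Note that the values produced by C<split()> still carry the surrounding newlines, which is what the processing stage is for.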

In the name of laziness, B<Text::Embed> provides a few 
predefined formats:

=over 4

=item :default

Line-oriented, __DATA__-like format:

    __BAZ__
    baz baz baz
    __FOO__
    foo foo foo
    foo foo foo

=item :define

CPP-like format (%DATA is read-only, so it can be used to define constants):

    #define BAZ     baz baz baz
    #define FOO     foo foo foo
                    foo foo foo

=item :cdata

Line-agnostic CDATA-like format. Anything outside of tags is ignored.

    <![BAZ[baz baz baz]]>
    <![FOO[
        foo foo foo
        foo foo foo
    ]]>

=back
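
As a rough illustration of the B<:cdata> format, the sketch below applies the same pattern that the module's C<:cdata> handler uses (minus the C</o> flag) to a sample string. The sample text and variable names are hypothetical. In list context, a global match returns every captured name and body in order, which is exactly the LIST of key-value pairs a parser is expected to return.

```perl
use strict;
use warnings;

# Hypothetical sample: text outside the tags should be ignored.
my $text = <<'END';
this line is outside any tag
<![BAZ[baz baz baz]]>
<![FOO[
    foo foo foo
]]>
END

# Global match in list context: returns (name, body, name, body, ...).
my @pairs = $text =~ m#\s*?<!\[(.+?)\[(.*?)\]\]>\s*#sg;

my %data = @pairs;
```

The C</s> flag lets the body span several lines, so multi-line segments such as C<FOO> keep their inner newlines and indentation.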

=head3 Stage 2: Processing

After parsing, each key-value pair can be further processed by an arbitrary
number of callbacks. 

A common usage of this might be controlling how whitespace is represented 
in each segment. B<Text::Embed> provides some likely defaults which operate
on the hash values only.

=over 4

=item :trim

Removes leading and trailing whitespace

=item :compress

Replaces each run of whitespace with a single <SPACE>

=item :block-indent

Removes leading and trailing blank lines, preserves all indentation

=item :block-noindent

Removes leading and trailing blank lines, preserves only the indentation 
that is unique to each line (indentation shared by every line is removed)

=item :raw

Leaves the value untouched

=item :default

Same as B<:raw>

=back

If you need more control, CODE references or named subroutines can be 
invoked as necessary. At this point it is safe to rename or modify keys. 
Undefining a key removes the entry from B<%DATA>.
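
A minimal sketch of how such a chain might behave is shown below. It does not load the module. It assumes that each callback receives references to the key and the value, and C<apply_callbacks> is a hypothetical driver written for this illustration, not the module's own code.

```perl
use strict;
use warnings;

# ASSUMPTION: callbacks are called with (\$key, \$value).
sub trim_value   { my (undef, $v) = @_; $$v =~ s/^\s+|\s+$//g if defined $$v }
sub lc_key       { my ($k) = @_; $$k = lc $$k if defined $$k }
sub drop_private { my ($k) = @_; undef $$k if defined $$k && $$k =~ /^_/ }

# Hypothetical driver: run every callback over every pair, in order.
sub apply_callbacks {
    my ($pairs, @callbacks) = @_;
    my %out;
    while (my ($key, $value) = splice @$pairs, 0, 2) {
        $_->(\$key, \$value) for @callbacks;
        # An undefined key means the entry is dropped.
        $out{$key} = $value if defined $key;
    }
    return %out;
}

my %data = apply_callbacks(
    [ FOO => "  foo foo  \n", _TMP => "scratch", Baz => "\nbaz" ],
    \&trim_value, \&lc_key, \&drop_private,
);
```

Here the callbacks run in the order given, so C<_TMP> is lower-cased before C<drop_private> undefines its key and removes the entry.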

=head3 An Example Callback Chain

As a brief example, consider a module that has some embedded SQL. 
We can implement a processing callback that will prepare each statement, 
leaving B<%DATA> full of ready-to-execute DBI statement handles: 

    package Whatever;


