Inline-Awk

 view release on metacpan or  search on metacpan

Awk.pm  view on Meta::CPAN

package Inline::Awk;


###############################################################################
#
# Inline::Awk - Add awk code to your Perl programs.
#
# John McNamara, jmcnamara@cpan.org
#
# Documentation after __END__
#


use strict;
use Carp;
require Inline;


use vars qw($VERSION @ISA);
@ISA = qw(Inline);
$VERSION = '0.04';


###############################################################################
#
# register(). This function is required by Inline See the Inline-API pod.
#
sub register {
    return {
                language => 'Awk',
                aliases  => ['AWK', 'awk'],
                type     => 'interpreted',
                suffix   => 'pl',
           };
}


###############################################################################
#
# build(). This function is required by Inline See the Inline-API pod.
#
# Unlike other inline modules we don't interface with a compiler or
# interpreter. Instead we translate the awk code into Perl code using a2p and
# eval it into the user's program.
#
# The main body of the awk code is wrapped in a sub called awk() that the user
# can call. It accepts arguments and localised them into @ARGV.
#
# Any functions are stripped out and given there own copy of the global
# variables created by a2p. This allows the user to write functions and then
# call them from Perl.
#
# The code is derived from Foo.pm. The majority of the smoke and mirrors is
# handled by Inline.
#
sub build {

    my $self    = shift;


    # Set up the path and object names.
    my $awk_obj;
    my $path    = $self->{API}{install_lib} .'/auto/' .$self->{API}{modpname};
    my $obj     = $self->{API}{location};
    ($awk_obj   = $obj) =~ s|$self->{API}{suffix}$|awk|;


    # Create the build directory if necessary.
    $self->mkpath($path) unless -d $path;


    # Get the awk code.
    my $awk_code = $self->{API}{code};


    # In awk a standalone END block implies a default PATTERN block.
    # This behaviour isn't replicated by a2p. Therefore, if the awk code
    # includes an END block we append an empty PATTERN block to force a2p
    # to generate the required while loop.
    #
    $awk_code .= "\n{}\n" if $awk_code =~ m[END(\s*)?{];


    # Create the awk "object" file. In this case it is just a source file.
    open  OBJ, "> $awk_obj" or croak "Can't open $awk_obj for output: $!\n";
    print OBJ  $awk_code;
    close OBJ;


    # Run the awk code through a2p to generate the Perl code.
    #
    my @code = `a2p $awk_obj`;
    chomp(@code);


    # Remove the shebang lines and the switch processing
    splice(@code, 0, 9);


    # Add code for processing args other than @ARGV and add a modified switch
    # processor to take account of the fact that the code is being called from
    # within a sub.
    #
    my @main = ( 'local @ARGV = @_ if @_;',
                 '',
                 '# process any FOO=bar switches',
                 'if (@ARGV) {',
                 '    eval "\$$1$2;" while $ARGV[0] =~ /^(\w+=)(.*)/ '.
                                                       '&& shift @ARGV;',
                 '}',
                 ''
               );


    # The following code is a workaround for the fact that a2p sometimes chomps
    # lines without setting $\ for subsequent print statements.
    #
    push @main, '$\ = "\n";' if grep { /\s+chomp;/ } @code;


    # Store global assignments for use in any subroutines. Variables can be
    # declared more than once so we use a hash to store the last declaration
    # only. We also deal with chained assignments such as $\ = $/ = "\n";.
    #
    my $globals;
    my %globals;

    foreach (@code) {
        last if /while \(/;             # Bail at first while
        last if /^sub /;                # or bail at first sub
        next unless /^(\$\S+)\s=\s/;    # Match assignment

        # Extract chained assignments
        my $line = $_;
        my @vars;
        push @vars, $1 while $line =~ s|^(\$\S+)\s=\s||;

        # Strip trailing comments, crudely
        $line =~ s|;\s+#.*|;|;

        # Literal value remains in $line
        foreach my $var (@vars) {
            $globals{$var} = "    $var = $line\n";
        }
    }


    # Format the globals for printing
    $globals = "\n". join '', values %globals;


    # Separate the main code from the subs.
    while (@code) {
        last if $code[0] =~ /^sub \w+ \{$/;
        push @main, shift @code;
    }


    # Add the global variable to any subroutines
    foreach (@code) {
        s/(^sub .*)/$1$globals/;
    }


Awk.pm  view on Meta::CPAN

    __END__
    __AWK__

    function square(num) {
        return num * num
    }

    function echo(str) {
        return str " " str
    }

You can call an awk program via the C<awk()> function. Here is a simple version of the Unix utility C<wc> which counts the number of lines, words and characters in a file:

    use Inline AWK;

    awk();

    __END__
    __AWK__

    # Simple minded wc
    BEGIN {
        file = ARGV[1]
    }

    {
        words += NF
        chars += length($0) +1 # +2 in DOS
    }

    END {
        printf("%7d%8d%8d %s\n", NR, words, chars, file)
    }




=head2 awk()

The C<awk()> function is imported into you Inline::Awk program by default. It allows you to run C<Inline::Awk> code as a program and to pass arguments to it.

Say, for example, that you have an awk program called C<parsefile.awk> that is normally run like this:

    awk -f parsefile.awk type=1 example.ini

If you then turned C<parsefile.awk> into an Inline::Awk program, (perhaps by using the C<a2a> utility in the distro), you could run the code from within a Perl program as follows:

    awk('type=1', 'example.ini');

If you are using C<-w> or C<warnings> in your Perl program you should quote any literal string in the variable assignment that you pass:

    awk('type=ini',   'example.ini'); # gives a warning with -w
    awk('type="ini"', 'example.ini'); # no warning

The default action of an awk program is to loop over the files that it is passed as arguments. Therefore, the C<awk()> function without arguments is equivalent to inserting the following code into your Perl program:

    while (<>) {
        # Converted awk code here
    }

As usual, the empty diamond operator, C<E<lt>E<gt>> will operate on C<@ARGV>, shifting off the elements until it is empty. Therefore, C<@ARGV> will be cleared after you call C<awk()>. However, C<awk()> creates a C<local> copy of any arguments that it...

    awk(@ARGV);
    # Do something else with @ARGV in Perl

An awk program doesn't loop over a file if it contains a BEGIN block only:

    use Inline AWK;

    awk();

    __END__
    __AWK__

    BEGIN { print "Hello, world!" }


As with all Perl functions the return value of C<awk()> is the last expression evaluated. This is an unintentional feature but you may find a use for it.

If your program only has awk functions and no awk program you can ignore C<awk()>. However, it is still imported into you Perl program.




=head1 HOW TO USE Inline::Awk

You can use C<Inline::Awk> in any of the following ways. See also the Inline documentation:

Method 1 (the standard method):

    use Inline AWK;

    # Call the awk code

    __END__
    __AWK__

    # awk code here

Method 2 (for simple code):

    use Inline AWK => "# awk code here";


Method 3 (requires Inline::Files):

    use Inline::Files;
    use Inline AWK;

    # Call the awk code

    __AWK__

    # awk code here

Note, any of the following use declarations are valid:

    use Inline awk;
    use Inline Awk;
    use Inline AWK;



( run in 1.902 second using v1.01-cache-2.11-cpan-5a3173703d6 )