App-Framework
lib/App/Framework/Extension/Filter.pm view on Meta::CPAN
=back
=item B<app($app, $opts_href, $state_href, $line)>
Called once for each input line. Allows the line to be processed; set the B<output> field of the state HASH to have text written to the output (see L</Output>).
Arguments are:
=over 4
=item I<$app> - The application object
=item I<$opts_href> - HASH ref to the command line options (see L<App::Framework::Feature::Options> and L</Filter Options>)
=item I<$state_href> - HASH ref to state
=item I<$line> - Text of input line
=back
=item B<app_end($app, $opts_href, $state_href)>
Called once for each input file. Called at the end of processing. Allows for any end of file tidy up, data sorting etc.
Arguments are:
=over 4
=item I<$app> - The application object
=item I<$opts_href> - HASH ref to the command line options (see L<App::Framework::Feature::Options> and L</Filter Options>)
=item I<$state_href> - HASH ref to state
=back
=back
=head2 Output
By default, each time the extension calls the 'app' subroutine it sets the B<output> field of the state HASH to undef. The 'app'
subroutine must set this field to some value for the extension to write anything to the output file.
For example, the following simple 'app' subroutine causes all input files to be output uppercased:
    sub app
    {
        my ($app, $opts_href, $state_href, $line) = @_ ;
        # uppercase
        $state_href->{output} = uc $line ;
    }
If no L</outfile> option is specified, all output is written to STDOUT. Normally the output is written line-by-line, after each line has been processed. If the L</buffer>
option is specified, all output lines are buffered (in the state variable L</output_lines>) and written out once all input has been processed. Similarly, if the L</inplace>
option is specified, buffering is used to process the complete input file, which is then overwritten with the output.
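The output rule described above can be illustrated with a small self-contained sketch. It does not load App::Framework: the C<run_filter> loop is only a stand-in, written under the assumption that the extension resets B<output> before each call and writes nothing when the field is left undefined. The names C<run_filter> and the blank-line test are hypothetical.

```perl
use strict ;
use warnings ;

# Example 'app' callback: drop blank lines, uppercase everything else
sub app
{
	my ($app, $opts_href, $state_href, $line) = @_ ;

	# Leaving 'output' undefined means nothing is written for this line
	return if $line =~ /^\s*$/ ;

	$state_href->{output} = uc $line ;
}

# Stand-in for the extension's per-line loop (an assumption, not the module's code)
sub run_filter
{
	my (@lines) = @_ ;
	my %state ;
	my @written ;
	foreach my $line (@lines)
	{
		$state{output} = undef ;              # reset before each call
		app(undef, {}, \%state, $line) ;
		push @written, $state{output} if defined $state{output} ;
	}
	return @written ;
}

print "$_\n" for run_filter('first line', '', 'second line') ;
```

Only the two non-blank lines are printed, because the callback never sets B<output> for the empty one.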
=head2 Outfile option
The L</outfile> option may be used to set the output filename. This may include variables that are specific to the Filter extension, where each variable's value is updated for each
input file being processed. The following Filter-specific variables may be used:
$filter{'filter_file'} = $state_href->{file} ;
$filter{'filter_filenum'} = $state_href->{file_number} ;
my ($base, $path, $ext) = fileparse($file, '\..*') ;
$filter{'filter_name'} = $base ;
$filter{'filter_base'} = $base ;
$filter{'filter_path'} = $path ;
$filter{'filter_ext'} = $ext ;
=over 4
=item I<filter_file> - Input full file path
=item I<filter_base> - Basename of input file (excluding extension)
=item I<filter_name> - Alias for L</filter_base>
=item I<filter_path> - Directory path of input file
=item I<filter_ext> - Extension of input file
=item I<filter_filenum> - Input file number (starting from 1)
=back
NOTE: When specifying these variables in options on the command line, you must escape them as required by your operating system's shell (e.g. use single quotes ' ' around
the value in Linux).
For example, the command line arguments:
    -outfile '/tmp/$filter_name-$filter_filenum.txt' afile.doc /doc/bfile.text
process './afile.doc' into '/tmp/afile-1.txt', and '/doc/bfile.text' into '/tmp/bfile-2.txt'.
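As a rough illustration of how such a template could be expanded, the sketch below derives the variables with C<fileparse> from L<File::Basename>, as in the listing above, and substitutes them into the template. The helpers C<filter_vars> and C<expand_outfile>, and the substitution regexp, are assumptions made for this example; the extension's own expansion code may differ.

```perl
use strict ;
use warnings ;
use File::Basename ;

# Hypothetical helper: build the Filter-specific variables for one input file
sub filter_vars
{
	my ($file, $file_number) = @_ ;
	my ($base, $path, $ext) = fileparse($file, '\..*') ;
	return (
		filter_file    => $file,
		filter_filenum => $file_number,
		filter_name    => $base,
		filter_base    => $base,
		filter_path    => $path,
		filter_ext     => $ext,
	) ;
}

# Hypothetical helper: replace $name in the template; unknown names are left as-is
sub expand_outfile
{
	my ($template, %vars) = @_ ;
	$template =~ s/\$(\w+)/exists $vars{$1} ? $vars{$1} : "\$$1"/ge ;
	return $template ;
}

my $template = '/tmp/$filter_name-$filter_filenum.txt' ;
my $file_number = 0 ;
foreach my $file ('afile.doc', '/doc/bfile.text')
{
	print expand_outfile($template, filter_vars($file, ++$file_number)), "\n" ;
}
```

Note that the template is held in a single-quoted Perl string here for the same reason it is single-quoted on the command line: the C<$filter_...> names must reach the expansion step unexpanded.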
=head2 Example
As an example, here is a script that filters one or more HTML files to strip out unwanted sections (they are actually Doxygen HTML files
that I wanted to convert into a pdf book):
#!/usr/bin/perl
#
use strict ;
use App::Framework '::Filter' ;
# VERSION
our $VERSION = '1.00' ;
## Create app
go() ;
#----------------------------------------------------------------------
sub app_begin
{
my ($app, $opts_href, $state_href, $line) = @_ ;
# force in-place editing
$app->set(inplace => 1) ;
lib/App/Framework/Extension/Filter.pm view on Meta::CPAN
}
else
{
## STDOUT - so ignore
}
}
#----------------------------------------------------------------------------
=item B<_wr_output($state_href, $opts_href, $line)>
Write a line to the output file, if an output file handle is open.
=cut
sub _wr_output
{
my $this = shift ;
my ($state_href, $opts_href, $line) = @_ ;
my $fh = $this->out_fh ;
$this->_dbg_prt(["_wr_output($line) fh=$fh\n"]) ;
if ($fh)
{
print $fh "$line\n" ;
}
}
# ============================================================================================
# END OF PACKAGE
=back
=head1 DIAGNOSTICS
Setting the debug flag to level 1 prints out (to STDOUT) some debug messages; setting it to level 2 prints out more verbose messages.
=head1 AUTHOR
Steve Price C<< <sdprice at cpan.org> >>
=head1 BUGS
None that I know of!
=cut
1;
__END__
* app_start - allows hash setup
* app_end - allows file creation/tweak
* app
** return output line?
** HASH state auto- updated with:
*** all output lines (so far)
*** regexp match vars (under 'vars' ?)
** app sets HASH 'output' to tell filter what to output (allows multi-line?)
* options
** inplace - buffers up lines then overwrites (input) file
** dir - output to dir
** input file wildcards
** recurse - does recursive file find (ignore .cvs .svn)
** output - can spec filename template ($name.ext)
* Filtering feature
** All extra loading of filter submodules
** Feature options: +Filter(perl c) - specifies extra Filter::Perl, Filter::C modules
* Filter spec:
(
('<spec>', <flags>, <code>),
('<spec>', <flags>, <code>),
('<spec>', <flags>, <code>),
)
Each entry is performed on the line; move on to the next entry if no match OR match and (flags & FILTER_CONTINUE) [default]
Calls <code> on match AND (flags & FILTER_CALL); calls app if no <code> specified
Flag bitmasks:
FILTER_CONTINUE - allows next entry to be processed if matches; normally stops
FILTER_CALL - call code on match
<spec> is of the form:
[<cond>:]/<regexp>/[:<setvars>]
<cond> evaluatable condition that must be met before running the regexp. Variables can be used by name
(names are converted to $state->{'vars'}{name})
<setvars> colon separated list of variable assignments evaluated on match. Variables used by name (as <cond>). Regexp matches
accessed by $n or \n
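
One possible reading of the filter spec notes above, sketched as runnable Perl. Everything here is an assumption made for illustration: the flag values, the C<apply_filters> name, and the use of precompiled regexps in place of the <spec> string syntax (the <cond> and <setvars> parts are not modelled). The notes themselves are ambiguous about whether continuing is the default, so this sketch stops on a match unless FILTER_CONTINUE is set.

```perl
use strict ;
use warnings ;

# Flag values are assumptions; the notes name the flags but not their values
use constant FILTER_CONTINUE => 1 ;
use constant FILTER_CALL     => 2 ;

# Hypothetical evaluator: each entry is [ qr/regexp/, flags, code ]
sub apply_filters
{
	my ($line, $default_code, @entries) = @_ ;
	my @results ;
	foreach my $entry (@entries)
	{
		my ($regexp, $flags, $code) = @$entry ;
		next unless $line =~ $regexp ;          # no match: try the next entry

		if ($flags & FILTER_CALL)
		{
			$code ||= $default_code ;           # fall back to 'app' when no code is given
			push @results, $code->($line) ;
		}
		last unless $flags & FILTER_CONTINUE ;  # a match stops processing unless flagged
	}
	return @results ;
}

my @entries = (
	[ qr/^#/,   FILTER_CALL | FILTER_CONTINUE, sub { "comment" } ],
	[ qr/TODO/, FILTER_CALL,                   undef ],
	[ qr/./,    FILTER_CALL,                   sub { "any" } ],
) ;
my @results = apply_filters('# TODO: tidy up', sub { "app" }, @entries) ;
print join(',', @results), "\n" ;
```

For the sample line, the first entry matches and continues, the second matches and falls back to the default callback, and the third is never reached.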