DTA-CAB

dta-cab-analyze.perl

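## cleanup(): abort pending jobs and remove temporary queue files (unless -keep was given);
## does nothing in child processes
sub cleanup {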
  if (!$fp || !$fp->is_child) {
    #print STDERR "$0: END block running\n"; ##-- DEBUG
    $fp->abort()  if ($fp);
    $fp->unlink() if ($fp && !$keeptmp);
    #$statq->unlink() if ($statq && !$keeptmp);
    #File::Path::rmtree($blockdir) if ($blockdir && !$keeptmp);
  }
}

END {
  cleanup();
}

__END__
=pod

=head1 NAME

dta-cab-analyze.perl - Command-line analysis interface for DTA::CAB

=head1 SYNOPSIS

 dta-cab-analyze.perl [OPTIONS...] DOCUMENT_FILE(s)...

 General Options
  -help                           ##-- show short usage summary
  -version                        ##-- show version & exit
  -verbose LEVEL                  ##-- alias for -log-level=LEVEL
  -begin CODE                     ##-- evaluate CODE early in script
  -onload CODE                    ##-- evaluate CODE after loading analyzer(s)
  -module MODULE                  ##-- alias for -begin="use MODULE;"
  -end CODE                       ##-- evaluate CODE late in script

 Parallelization Options
  -jobs NJOBS                     ##-- fork() off up to NJOBS parallel jobs (default=0: don't fork() at all)
  -job-queue QPATH                ##-- use QPATH as job-queue socket (default: temporary)
  -keep , -nokeep                 ##-- do/don't keep temporary queue files (default: don't)

 Analysis Options
  -config PLFILE                  ##-- load analyzer config file PLFILE
  -analysis-class  CLASS          ##-- set analyzer class (if -config is not specified)
  -analysis-option OPT=VALUE      ##-- set analysis option
  -profile , -noprofile           ##-- do/don't report profiling information (default: do)

 I/O Options
  -list                           ##-- arguments are list-files, not filenames
  -words                          ##-- arguments are word text, not filenames
  -input-class CLASS              ##-- select input parser class (default: Text)
  -input-option OPT=VALUE         ##-- set input parser option

  -output-class CLASS             ##-- select output formatter class (default: Text)
  -output-option OPT=VALUE        ##-- set output formatter option
  -output-level LEVEL             ##-- override output formatter level (default: 1)
  -output-format TEMPLATE         ##-- set output format (default=STDOUT)

  -format-class CLASS             ##-- alias for -input-class=CLASS -output-class=CLASS
  -format-option OPT=VALUE        ##-- alias for -input-option OPT=VALUE -output-option OPT=VALUE

 Block-wise Processing Options
  -block SIZE[{k,M,G,T}][@EOB]    ##-- pseudo-streaming block-wise analysis (not for all formats)
  -noblock                        ##-- disable block-wise processing
  -log-block-info LEVEL           ##-- log block-info at LEVEL (default=INFO)
  -log-block-trace LEVEL          ##-- log block-trace at LEVEL (default=none)
  -log-block-profile LEVEL        ##-- log block-profile at LEVEL (default=none)

 Logging Options                  ##-- see Log::Log4perl(3pm)
  -log-level LEVEL                ##-- set minimum log level (default=TRACE)
  -log-stderr , -nolog-stderr     ##-- do/don't log to stderr (default=true)
  -log-syslog , -nolog-syslog     ##-- do/don't log to syslog (default=false)
  -log-file LOGFILE               ##-- log directly to LOGFILE (default=none)
  -log-rotate , -nolog-rotate     ##-- do/don't auto-rotate log files (default=true)
  -log-config L4PFILE             ##-- log4perl config file (overrides -log-stderr, etc.)
  -log-watch  , -nowatch          ##-- do/don't watch log4perl config file (default=false)
  -log-option OPT=VALUE           ##-- set any logging option (e.g. -log-option twlevel=trace)

=cut

##==============================================================================
## Description
##==============================================================================
=pod

=head1 DESCRIPTION

dta-cab-analyze.perl is a command-line utility for analyzing
documents with the L<DTA::CAB|DTA::CAB> analysis suite, without the need
to set up and/or connect to an independent server.

=cut

##==============================================================================
## Options and Arguments
##==============================================================================
=pod

=head1 OPTIONS AND ARGUMENTS

=cut

##==============================================================================
## Options: General Options
=pod

=head2 General Options

=over 4

=item -help

Display a short help message and exit.

=item -man

Display a longer help message and exit.

=item -version

Display program and module version information and exit.

=item -verbose


=back

=cut

##==============================================================================
## Options: Analysis Options
=pod

=head2 Analysis Options

=over 4

=item -config PLFILE

B<Required>.

Load analyzer configuration from PLFILE,
which should be a perl source file parseable
by L<DTA::CAB::Persistent::loadFile()|DTA::CAB::Persistent/item_loadFile>
as a L<DTA::CAB::Analyzer|DTA::CAB::Analyzer> object.
Prototypically, this file will just look like:

 our $obj = DTA::CAB->new( opt1=>$val1, ... );

=item -analysis-option OPT=VALUE

Set an arbitrary analysis option C<OPT> to C<VALUE>.
May be multiply specified.

=item -profile , -noprofile

Do/don't report profiling information (default: do).

=back

=cut

##==============================================================================
## Options: I/O Options
=pod

=head2 I/O Options

=over 4

=item -list

Arguments are list files (one input per line), not filenames.
List files may also contain a subset of command-line options
in addition to input filenames.
Not compatible with the C<-words> option.

=item -words

Arguments are word text, not filenames.
Not compatible with the C<-list> option.

=item -block SIZE[{k,M,G,T}][@EOB]

Do pseudo-streaming block-wise analysis.
Currently only supported for 'TT' and 'TJ' formats.
SIZE is the minimum size in bytes for non-final analysis blocks,
and may have an optional SI suffix 'k', 'M', 'G', or 'T'.
EOB indicates the desired block-boundary type; either 's' to
force all block-boundaries to be sentence boundaries,
or 't' ('w') for token (word) boundaries.  Default=128k@w.

=item -input-class CLASS

Select input parser class (default: Text).

=item -input-option OPT=VALUE

Set an arbitrary input parser option.
May be multiply specified.



=item -output-class CLASS

Select output formatter class (default: Text).

=item -output-option OPT=VALUE

Set an arbitrary output formatter option.
May be multiply specified.

=item -output-level LEVEL

Override output formatter level (default: 1).

=item -output-format FORMAT

Set the output format (default: '-', i.e. STDOUT).
FORMAT is a printf-style template that may contain the following %-escapes:

 %f  : INFILE           : current input file
 %b  : basename(INFILE) : basename of current input file
 %d  : dirname(INFILE)  : directory of current input file
 %x  : extension(INFILE): extension of current input file
 %F  :                  : alias for %d/%b
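
As a rough illustration of how these escapes might be expanded, the
following sketch uses a hypothetical helper C<expand_output_format()>,
which is not part of DTA::CAB.  It assumes that C<%b> excludes the file
extension, that C<%x> includes the leading dot, and that C<%d> carries no
trailing slash; the actual implementation may differ on any of these points.

 use strict;
 use warnings;
 use File::Basename qw(fileparse);

 ## expand_output_format($fmt,$infile): hypothetical helper (assumption, not the DTA::CAB API)
 ##  + expands the %-escapes listed above in a single pass
 sub expand_output_format {
   my ($fmt, $infile) = @_;
   ##-- split INFILE into basename, directory, and extension (assumed: ".ext" with leading dot)
   my ($base, $dir, $ext) = fileparse($infile, qr/\.[^.]*/);
   $dir =~ s{/+\z}{} if ($dir ne '/');  ##-- assumption: %d has no trailing slash
   my %esc = (f=>$infile, b=>$base, d=>$dir, x=>$ext, F=>"$dir/$base");
   ##-- single-pass substitution: expanded text is never re-scanned for escapes
   $fmt =~ s/%([fbdxF])/$esc{$1}/g;
   return $fmt;
 }

 print expand_output_format('%F.out', 'corpus/doc.tt'), "\n";

Any %-sequence not listed above is left untouched by this sketch.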

=back
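
The C<-block> argument syntax described above, C<SIZE[{k,M,G,T}][@EOB]>,
could be parsed along the following lines.  This is only a sketch:
C<parse_block_spec()> is a hypothetical helper and not part of DTA::CAB,
the use of powers of 1024 for the size suffixes is an assumption (the text
only calls them SI suffixes), and falling back to C<w> when C<@EOB> is
omitted is likewise assumed from the documented default.

 use strict;
 use warnings;

 ## ($min_bytes,$eob) = parse_block_spec($spec): hypothetical helper (assumption)
 ##  + dies if $spec does not match SIZE[{k,M,G,T}][@EOB]
 sub parse_block_spec {
   my ($spec) = @_;
   ##-- assumption: suffixes are binary multiples (k=1024, ...)
   my %mult = (''=>1, k=>1024, M=>1024**2, G=>1024**3, T=>1024**4);
   $spec =~ /\A(\d+)([kMGT]?)(?:\@([stw]))?\z/
     or die("invalid block specification '$spec'");
   my ($n, $suffix, $eob) = ($1, $2, $3);
   $eob = 'w' if (!defined($eob));  ##-- assumption: missing @EOB means word boundaries
   $eob = 'w' if ($eob eq 't');     ##-- 't' and 'w' both select token (word) boundaries
   return ($n * $mult{$suffix}, $eob);
 }

 my ($min_bytes, $eob) = parse_block_spec('128k@w');
 print "min_bytes=$min_bytes eob=$eob\n";

Normalizing C<t> to C<w> keeps later code from having to check two
spellings of the same boundary type.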

=cut


##======================================================================
## Footer
##======================================================================
=pod

=head1 ACKNOWLEDGEMENTS

Perl by Larry Wall.

=head1 AUTHOR

Bryan Jurish E<lt>moocow@cpan.orgE<gt>

=head1 COPYRIGHT AND LICENSE


