DTA-CAB
dta-cab-analyze.perl view on Meta::CPAN
sub cleanup {
  ##-- skip cleanup in forked child processes: only the parent (or a non-forking run) cleans up
  if (!$fp || !$fp->is_child) {
    #print STDERR "$0: END block running\n"; ##-- DEBUG
    $fp->abort() if ($fp);
    $fp->unlink() if ($fp && !$keeptmp);
    #$statq->unlink() if ($statq && !$keeptmp);
    #File::Path::rmtree($blockdir) if ($blockdir && !$keeptmp);
  }
}

END {
  cleanup();
}
__END__
=pod
=head1 NAME
dta-cab-analyze.perl - Command-line analysis interface for DTA::CAB
=head1 SYNOPSIS
dta-cab-analyze.perl [OPTIONS...] DOCUMENT_FILE(s)...
General Options
-help ##-- show short usage summary
-version ##-- show version & exit
-verbose LEVEL ##-- alias for -log-level=LEVEL
-begin CODE ##-- evaluate CODE early in script
-onload CODE ##-- evaluate CODE after loading analyzer(s)
-module MODULE ##-- alias for -begin="use MODULE;"
-end CODE ##-- evaluate CODE late in script
Parallelization Options
-jobs NJOBS ##-- fork() off up to NJOBS parallel jobs (default=0: don't fork() at all)
-job-queue QPATH ##-- use QPATH as job-queue socket (default: temporary)
-keep , -nokeep ##-- do/don't keep temporary queue files (default: don't)
Analysis Options
-config PLFILE ##-- load analyzer config file PLFILE
-analysis-class CLASS ##-- set analyzer class (if -config is not specified)
-analysis-option OPT=VALUE ##-- set analysis option
-profile , -noprofile ##-- do/don't report profiling information (default: do)
I/O Options
-list ##-- arguments are list-files, not filenames
-words ##-- arguments are word text, not filenames
-input-class CLASS ##-- select input parser class (default: Text)
-input-option OPT=VALUE ##-- set input parser option
-output-class CLASS ##-- select output formatter class (default: Text)
-output-option OPT=VALUE ##-- set output formatter option
-output-level LEVEL ##-- override output formatter level (default: 1)
-output-format TEMPLATE ##-- set output format (default=STDOUT)
-format-class CLASS ##-- alias for -input-class=CLASS -output-class=CLASS
-format-option OPT=VALUE ##-- alias for -input-option OPT=VALUE -output-option OPT=VALUE
Block-wise Processing Options
-block SIZE[{k,M,G,T}][@EOB] ##-- pseudo-streaming block-wise analysis (not for all formats)
-noblock ##-- disable block-wise processing
-log-block-info LEVEL ##-- log block-info at LEVEL (default=INFO)
-log-block-trace LEVEL ##-- log block-trace at LEVEL (default=none)
-log-block-profile LEVEL ##-- log block-profile at LEVEL (default=none)
Logging Options ##-- see Log::Log4perl(3pm)
-log-level LEVEL ##-- set minimum log level (default=TRACE)
-log-stderr , -nolog-stderr ##-- do/don't log to stderr (default=true)
-log-syslog , -nolog-syslog ##-- do/don't log to syslog (default=false)
-log-file LOGFILE ##-- log directly to LOGFILE (default=none)
-log-rotate , -nolog-rotate ##-- do/don't auto-rotate log files (default=true)
-log-config L4PFILE ##-- log4perl config file (overrides -log-stderr, etc.)
-log-watch , -nowatch ##-- do/don't watch log4perl config file (default=false)
-log-option OPT=VALUE ##-- set any logging option (e.g. -log-option twlevel=trace)
=cut
##==============================================================================
## Description
##==============================================================================
=pod
=head1 DESCRIPTION
dta-cab-analyze.perl is a command-line utility for analyzing
documents with the L<DTA::CAB|DTA::CAB> analysis suite, without the need
to set up and/or connect to an independent server.
=cut
##==============================================================================
## Options and Arguments
##==============================================================================
=pod
=head1 OPTIONS AND ARGUMENTS
=cut
##==============================================================================
## Options: General Options
=pod
=head2 General Options
=over 4
=item -help
Display a short help message and exit.
=item -man
Display a longer help message and exit.
=item -version
Display program and module version information and exit.
=item -verbose
=back
=cut
##==============================================================================
## Options: Other Options
=pod
=head2 Analysis Options
=over 4
=item -config PLFILE
B<Required>.
Load analyzer configuration from PLFILE,
which should be a perl source file parseable
by L<DTA::CAB::Persistent::loadFile()|DTA::CAB::Persistent/item_loadFile>
as a L<DTA::CAB::Analyzer|DTA::CAB::Analyzer> object.
Prototypically, this file will just look like:
our $obj = DTA::CAB->new( opt1=>$val1, ... );
=item -analysis-option OPT=VALUE
Set an arbitrary analysis option C<OPT> to C<VALUE>.
May be multiply specified.
=item -profile , -noprofile
Do/don't report profiling information (default: do).
=back
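
The following is a minimal sketch of how repeated C<OPT=VALUE> options such as C<-analysis-option> could be collected into a hash. It assumes Getopt::Long; the source does not show how the script itself parses its options, and the option keys C<foo> and C<bar> are purely hypothetical.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

## ASSUMPTION: Getopt::Long is used here for illustration only;
## 'foo' and 'bar' are hypothetical analysis option names.
my %analysis_opts;
my @args = ('-analysis-option', 'foo=1',
            '-analysis-option', 'bar=baz',
            'doc.txt');

## '=s%' collects each OPT=VALUE occurrence as a key/value pair in the hash
GetOptionsFromArray(\@args, 'analysis-option=s%' => \%analysis_opts)
  or die "$0: could not parse options\n";

## parsed options are removed from @args; only the document file remains
print "$_=$analysis_opts{$_}\n" for sort keys %analysis_opts;
print "remaining arguments: @args\n";
```

Because the destination is a hash, specifying the option several times accumulates entries rather than overwriting a single value.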
=cut
##==============================================================================
## Options: I/O Options
=pod
=head2 I/O Options
=over 4
=item -list
Arguments are list files (1 input per line), not filenames.
List-file arguments can actually contain a subset of command-line options
in addition to input filenames.
Not compatible with the C<-words> option.
=item -words
Arguments are word text, not filenames.
Not compatible with the C<-list> option.
=item -block SIZE[{k,M,G,T}][@EOB]
Do pseudo-streaming block-wise analysis.
Currently only supported for 'TT' and 'TJ' formats.
SIZE is the minimum size in bytes for non-final analysis blocks,
and may have an optional SI suffix 'k', 'M', 'G', or 'T'.
EOB indicates the desired block-boundary type; either 's' to
force all block-boundaries to be sentence boundaries,
or 't' ('w') for token (word) boundaries. Default=128k@w.
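
As an illustration of the C<SIZE[{k,M,G,T}][@EOB]> syntax, the sketch below parses such a specification into a byte count and a boundary type. The helper C<parse_block_spec> is hypothetical and not part of DTA::CAB. It assumes that suffixes denote powers of 1024 (the text above only says "SI suffix", so powers of 1000 may be intended instead) and that a missing EOB falls back to 'w', as in the documented default.

```perl
use strict;
use warnings;

## Hypothetical helper, not part of DTA::CAB.
## ASSUMPTION: suffixes are powers of 1024; the documentation only says "SI suffix".
my %MULTIPLIER = ('' => 1, k => 1024, M => 1024**2, G => 1024**3, T => 1024**4);

sub parse_block_spec {
  my ($spec) = @_;
  my ($num, $suffix, $eob) = ($spec =~ /^(\d+)([kMGT]?)(?:\@([stw]))?$/)
    or die "invalid block specification '$spec'\n";
  ##-- 't' and 'w' both request token (word) boundaries; 'w' is the assumed fallback
  $eob = 'w' if !defined($eob) || $eob eq 't';
  return { size => $num * $MULTIPLIER{$suffix}, eob => $eob };
}

my $block = parse_block_spec('128k@w');
print "size=$block->{size} eob=$block->{eob}\n";
```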
=item -input-class CLASS
Select input parser class (default: Text).
=item -input-option OPT=VALUE
Set an arbitrary input parser option.
May be multiply specified.
=item -output-class CLASS
Select output formatter class (default: Text).
=item -output-option OPT=VALUE
Set an arbitrary output formatter option.
May be multiply specified.
=item -output-level LEVEL
Override output formatter level (default: 1).
=item -output-format FORMAT
Set output format (default: '-', i.e. STDOUT).
FORMAT is a printf-style template which may contain the following %-escapes:

 %f : INFILE            : current input file
 %b : basename(INFILE)  : basename of current input file
 %d : dirname(INFILE)   : directory of current input file
 %x : extension(INFILE) : extension of current input file
 %F :                   : alias for %d/%b
=back
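
The %-escapes accepted by C<-output-format> can be illustrated with a short sketch. The function C<expand_output_format> is hypothetical and not taken from DTA::CAB. It assumes that C<%b> is the basename without its extension and that C<%x> includes the leading dot; the script's actual behavior may differ on both points.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

## Hypothetical helper, not part of DTA::CAB.
## ASSUMPTIONS: %b excludes the extension; %x includes the leading dot.
sub expand_output_format {
  my ($template, $infile) = @_;
  my ($base, $dir, $ext) = fileparse($infile, qr/\.[^.]*/);
  $dir =~ s{/+$}{} unless $dir eq '/';   ##-- fileparse() leaves a trailing slash
  my %escape = (
    f => $infile,
    b => $base,
    d => $dir,
    x => $ext,
    F => "$dir/$base",                   ##-- alias for %d/%b
  );
  ##-- single pass, so substituted text is never re-expanded
  (my $out = $template) =~ s/%([fbdxF])/$escape{$1}/g;
  return $out;
}

print expand_output_format('%F.out', 'corpus/doc.txt'), "\n";
```

A template without any escapes, such as the default '-', passes through unchanged.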
=cut
##======================================================================
## Footer
##======================================================================
=pod
=head1 ACKNOWLEDGEMENTS
Perl by Larry Wall.
=head1 AUTHOR
Bryan Jurish E<lt>moocow@cpan.orgE<gt>
=head1 COPYRIGHT AND LICENSE