Convert-UUlib

 view release on metacpan or  search on metacpan

UUlib.pm  view on Meta::CPAN

  MSG_MESSAGE   just a message, nothing important
  MSG_NOTE      something that should be noticed
  MSG_WARNING   important msg, processing continues
  MSG_ERROR     processing has been terminated
  MSG_FATAL     decoder cannot process further requests
  MSG_PANIC     recovery impossible, app must terminate

=head2 Options

  OPT_VERSION	version number MAJOR.MINORplPATCH (ro)
  OPT_FAST	assumes only one part per file
  OPT_DUMBNESS	switch off the program's intelligence
  OPT_BRACKPOL	give numbers in [] higher precendence
  OPT_VERBOSE	generate informative messages
  OPT_DESPERATE	try to decode incomplete files
  OPT_IGNREPLY	ignore RE:plies (off by default)
  OPT_OVERWRITE	whether it's OK to overwrite ex. files
  OPT_SAVEPATH	prefix to save-files on disk
  OPT_IGNMODE	ignore the original file mode
  OPT_DEBUG	print messages with FILE/LINE info
  OPT_ERRNO	get last error code for RET_IOERR (ro)
  OPT_PROGRESS	retrieve progress information
  OPT_USETEXT	handle text messages
  OPT_PREAMB	handle Mime preambles/epilogues
  OPT_TINYB64	detect short B64 outside of Mime
  OPT_ENCEXT	extension for single-part encoded files
  OPT_REMOVE	remove input files after decoding (dangerous)
  OPT_MOREMIME	strict MIME adherence
  OPT_DOTDOT	".."-unescaping has not yet been done on input files
  OPT_RBUF      set default read I/O buffer size in bytes
  OPT_WBUF      set default write I/O buffer size in bytes
  OPT_AUTOCHECK automatically check file list after every loadfile

=head2 Result/Error codes

  RET_OK        everything went fine
  RET_IOERR     I/O Error - examine errno
  RET_NOMEM     not enough memory
  RET_ILLVAL    illegal value for operation
  RET_NODATA    decoder didn't find any data
  RET_NOEND     encoded data wasn't ended properly
  RET_UNSUP     unsupported function (encoding)
  RET_EXISTS    file exists (decoding)
  RET_CONT      continue -- special from ScanPart
  RET_CANCEL    operation canceled

=head2 File States

 This code is zero, i.e. "false":

  UUFILE_READ   Read in, but not further processed

 The following state codes are or'ed together:

  FILE_MISPART  Missing Part(s) detected
  FILE_NOBEGIN  No 'begin' found
  FILE_NOEND    No 'end' found
  FILE_NODATA   File does not contain valid uudata
  FILE_OK       All Parts found, ready to decode
  FILE_ERROR    Error while decoding
  FILE_DECODED  Successfully decoded
  FILE_TMPFILE  Temporary decoded file exists

=head2 Encoding types

  UU_ENCODED    UUencoded data
  B64_ENCODED   Mime-Base64 data
  XX_ENCODED    XXencoded data
  BH_ENCODED    Binhex encoded
  PT_ENCODED    Plain-Text encoded (MIME)
  QP_ENCODED    Quoted-Printable (MIME)
  YENC_ENCODED  yEnc encoded (non-MIME)

=head1 EXPORTED FUNCTIONS

=head2 Initializing and cleanup

Initialize is automatically called when the module is loaded and allocates
quite a small amount of memory for todays machines ;) CleanUp releases that
again.

On my machine, a fairly complete decode with DBI backend needs about 10MB
RSS to decode 20000 files.

=over

=item CleanUp

Release memory, file items and clean up files. Should be called after a
decoidng run, if you want to start a new one.

=back

=head2 Setting and querying options

=over

=item $option = GetOption OPT_xxx

=item SetOption OPT_xxx, opt-value

=back

See the C<OPT_xxx> constants above to see which options exist.

=head2 Setting various callbacks

=over

=item SetMsgCallback [callback-function]

=item SetBusyCallback [callback-function]

=item SetFileCallback [callback-function]

=item SetFNameFilter [callback-function]

=back

=head2 Call the currently selected FNameFilter

=over

=item $file = FNameFilter $file

=back

=head2 Loading sourcefiles, optionally fuzzy merge and start decoding

=over

=item ($retval, $count) = LoadFile $fname, [$id, [$delflag, [$partno]]]

Load the given file and scan it for encoded contents. Optionally tag it
with the given id, and if C<$delflag> is true, delete the file after it
is no longer necessary. If you are certain of the part number, you can
specify it as the last argument.

A better (usually faster) way of doing this is using the C<SetFNameFilter>
functionality.

=item $retval = Smerge $pass

If you are desperate, try to call C<Smerge> with increasing C<$pass>
values, beginning at C<0>, to try to merge parts that usually would not
have been merged.

Most probably this will result in garbled files, so never do this by
default, except:

If the C<OPT_AUTOCHECK> option has been disabled (by default it is
enabled) to speed up file loading, then you I<have> to call C<Smerge -1>
after loading all files as an additional pre-pass (which is normally done
by C<LoadFile>).

=item $item = GetFileListItem $item_number

Return the C<$item> structure for the C<$item_number>'th found file, or
C<undef> of no file with that number exists.

The first file has number C<0>, and the series has no holes, so you can
iterate over all files by starting with zero and incrementing until you
hit C<undef>.

This function has to walk the linear list of fils on each access, so
if you want to iterate over all items, it is usually faster to use
C<GetFileList>.

=item @items = GetFileList

Similar to C<GetFileListItem>, but returns all files in one go.

=back

=head2 Decoding files

=over

=item $retval = $item->rename ($newname)

Change the ondisk filename where the decoded file will be saved.

=item $retval = $item->decode_temp

Decode the file into a temporary location, use C<< $item->infile >> to
retrieve the temporary filename.

=item $retval = $item->remove_temp

Remove the temporarily decoded file again.

=item $retval = $item->decode ([$target_path])

Decode the file to its destination, or the given target path.

=item $retval = $item->info (callback-function)

=back

=head2 Querying (and setting) item attributes

=over

=item $state    = $item->state

=item $mode     = $item->mode ([newmode])

=item $uudet    = $item->uudet

=item $size     = $item->size

=item $filename = $item->filename ([newfilename})

=item $subfname = $item->subfname

=item $mimeid   = $item->mimeid

=item $mimetype = $item->mimetype

=item $binfile  = $item->binfile

=back

=head2 Information about source parts

=over

=item $parts = $item->parts

Return information about all parts (source files) used to decode the file
as a list of hashrefs with the following structure:

 {
   partno   => <integer describing the part number, starting with 1>,
   # the following member sonly exist when they contain useful information
   sfname   => <local pathname of the file where this part is from>,
   filename => <the ondisk filename of the decoded file>,
   subfname => <used to cluster postings, possibly the posting filename>,
   subject  => <the subject of the posting/mail>,
   origin   => <the possible source (From) address>,
   mimetype => <the possible mimetype of the decoded file>,
   mimeid   => <the id part of the Content-Type>,
 }

Usually you are interested mostly the C<sfname> and possibly the C<partno>
and C<filename> members.

=back

=head2 Functions below are not documented and not very well tested - feedback welcome

  QuickDecode
  EncodeMulti
  EncodePartial
  EncodeToStream
  EncodeToFile
  E_PrepSingle
  E_PrepPartial

=head2 EXTENSION FUNCTIONS

Functions found in this module but not documented in the uulib documentation:

=over

=item $msg = straction ACT_xxx

Return a human readable string representing the given action code.

=item $msg = strerror RET_xxx

Return a human readable string representing the given error code.

=item $str = strencoding xxx_ENCODED

Return the name of the encoding type as a string.

=item $str = strmsglevel MSG_xxx

Returns the message level as a string.

=item SetFileNameCallback $cb

Sets (or queries) the FileNameCallback, which is called whenever the
decoding library can't find a filename and wants to extract a filename
from the subject line of a posting. The callback will be called with
two arguments, the subject line and the current candidate for the
filename. The latter argument can be C<undef>, which means that no
filename could be found (and likely no one exists, so it is safe to also
return C<undef> in this case). If it doesn't return anything (not even
C<undef>!), then nothing happens, so this is a no-op callback:

   sub cb {
      return ();
   }

If it returns C<undef>, then this indicates that no filename could be
found. In all other cases, the return value is taken to be the filename.

This is a slightly more useful callback:



( run in 1.752 second using v1.01-cache-2.11-cpan-df04353d9ac )