Data-CTable
(either because there was no cache file yet or the cache file was out
of date or corrupted or otherwise unusable), read() will then try to
create a cache file. This, of course, takes some time, but the time
taken will be more than made up in the speedup of the next read()
operation on the same file.
If creating the cache file fails (for example, because file
permissions didn't allow the cache directory to be created or the
cache file to be written), read() generates a warning explaining why
cacheing failed, but the read() operation itself still succeeds.
No parameters in the object itself are set or modified to indicate the
success or failure of writing the cache file.
Similarly, there is no way to tell whether a successful read()
operation read from the cache or from the original data file. If you
want to be SURE the reading was from the data file, either turn off
_CacheOnRead, or call the read_file() method instead of read().
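The fallback behavior described above can be sketched with core modules only. This is a minimal illustration, not the module's actual code: parse_file() and read_with_cache() are hypothetical names, and comparing modification times is only an assumed way of deciding that a cache is out of date. Unlike read(), this sketch reports where the data came from, which makes the flow easier to observe.

```perl
use strict;
use warnings;
use Storable qw(nstore retrieve);
use File::Temp qw(tempdir);

## Hypothetical parser standing in for the real file reader.
sub parse_file {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    chomp(my @lines = <$fh>);
    close($fh);
    return { lines => \@lines };
}

## Returns ($data, $source), where $source is "cache" or "file".
sub read_with_cache {
    my ($path, $cache) = @_;

    ## Assumed freshness test: the cache must exist and be at least
    ## as new as the data file.
    if (-e $cache and (stat($cache))[9] >= (stat($path))[9]) {
        my $data = eval { retrieve($cache) };
        return ($data, "cache") if $data;
    }

    ## Cache missing, stale, or unreadable: read the data file itself.
    my $data = parse_file($path);

    ## Writing the cache is optional: warn on failure, still succeed.
    eval { nstore($data, $cache) }
        or warn "Failed to cache $cache: " . ($@ || $!) . "\n";

    return ($data, "file");
}

my $dir   = tempdir(CLEANUP => 1);
my $path  = "$dir/people.txt";
my $cache = "$dir/people.txt.cache";

open(my $out, '>', $path) or die "Cannot write $path: $!";
print $out "First,Last\nAda,Lovelace\n";
close($out);

my ($first,  $source1) = read_with_cache($path, $cache);
my ($second, $source2) = read_with_cache($path, $cache);
print "$source1 then $source2\n";
```

The first call has no cache to use, so it parses the file and stores a cache; the second call finds that cache and thaws it instead.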
NOTE: because the name of the cache file to be used is calculated just
before the read() is actually done, the cache file can only be found
if the _CacheSubDir and _CacheExtension are the same as they were when
the cache was last created. If you change these parameters after
having previously cached a file, the older caches could be "orphaned"
and just sit around wasting disk space.
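To see why changed parameters orphan an older cache, consider a sketch in which the cache path is derived from the data file path plus the two parameters. The helper cache_path() and the sample values are hypothetical; the real prep_cache_file() may compute the name differently.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);
use File::Spec;

## Hypothetical helper: derive a cache path from the data file path,
## a cache subdirectory (may be empty), and a cache extension.
sub cache_path {
    my ($file, $subdir, $ext) = @_;
    my ($name, $dir) = fileparse($file);
    my @parts = ($dir, (length($subdir) ? $subdir : ()), $name . $ext);
    return File::Spec->catfile(@parts);
}

## Same data file, two different parameter settings.
my $old = cache_path("data/people.txt", "cache", ".cache");
my $new = cache_path("data/people.txt", "tmp",   ".x");

print "$old\n";
print "$new\n";
```

Because the two settings yield different paths, a cache written under the first setting is never looked up under the second, and it stays on disk until someone deletes it.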
=head2 Cacheing on write()
You may optionally set _CacheOnWrite (default = false) to true. If
done, then a cache file will be saved for files written using the
write() command. Read about write() below for more about why you
might want to do this.
=head1 AUTOMATIC DIRECTORY CREATION
When Data::CTable needs to write a file (a cache file or a data
file), it automatically tries to create any directories or
subdirectories you specify in the _FileName or _CacheSubDir
parameters.
If it fails while writing a data file, write() will fail (and you will
be warned). If it fails to create a directory while writing a cache
file, a warning will be issued, but the overall read() or write()
operation will still return a result indicating success.
Any directories created will have the permissions 0777 (world-writable)
for easy cleanup.
Generally, the only directory the module will have to create is a
subdirectory to hold cache files.
However, since other directories could be created, be sure to exercise
caution when allowing the module to create any directories for you on
any system where security might be an issue.
Also, if the 0666 permissions on the cache files themselves are too
liberal, you can either 1) turn off cacheing, or 2) call the
prep_cache_file() method to get the name of the cache file that would
have been written, if any, and then restrict its permissions:
chmod (0600, $this->prep_cache_file());
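The effect of that chmod call can be checked with a standalone sketch that uses only core modules. The directory and file names below are placeholders chosen for illustration, and the file is created by hand rather than by the module.

```perl
use strict;
use warnings;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

my $base      = tempdir(CLEANUP => 1);
my $cache_dir = "$base/cache";
my $cache     = "$cache_dir/people.txt.cache";

## Create the subdirectory; the requested mode is still subject to umask.
make_path($cache_dir, { mode => 0777 });

## Stand-in for a cache file written by the module.
open(my $fh, '>', $cache) or die "Cannot write $cache: $!";
print $fh "placeholder cache contents\n";
close($fh);

chmod(0666, $cache) or warn "chmod failed: $!";  ## Liberal perms, as described above
chmod(0600, $cache) or die  "chmod failed: $!";  ## Restrict to the owner only

## Mask off the file-type bits to get just the permission bits.
my $mode = (stat($cache))[2] & 07777;
printf "%04o\n", $mode;
```

On systems without Unix-style permissions the reported mode may differ, so treat this as a Unix-oriented illustration.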
=head1 READING DATA FILES
    ## Replacing data in table with data read from a file

    $t->read($Path)     ## Simple calling convention

    $t->read(           ## Named-parameter convention

        ## Params that override params in the object if supplied...

        _FileName       => $Path,  ## Full or partial path of file to read
        _FieldList      => [...],  ## Fields to read; others to be discarded
        _HeaderRow      => 0,      ## No header row (_FieldList required!)
        _LineEnding     => undef,  ## Text line ending (undef means guess)
        _FDelimiter     => undef,  ## Field delimiter (undef means guess)
        _ReturnMap      => 1,      ## Whether to decode internal returns
        _ReturnEncoding => "\x0B", ## How to decode returns.
        _MacRomanMap    => undef,  ## Whether/when to read Mac char set
        _CacheOnRead    => 0,      ## Enable/disable cacheing behavior
        _CacheExtension => ".x",   ## Extension to add to cache file name
        _CacheSubDir    => "",     ## (Sub-)dir, if any, for cache files

        ## Params specific to the read()/write() methods...

        _MaxRecords     => 200,    ## Limit on how many records to read
    )

    $t->read_file()     ## Internal: same as read(); ignores cacheing
read() opens a Merge, CSV, or Tab-delimited file and reads in all or
some fields, and all or some records, REPLACING ANY EXISTING DATA in
the CTable object.
Using the simple calling convention, just pass it a file name. All
other parameters will come from the object (or will be defaulted if
absent). To specify additional parameters or override any parameters
in the object while reading, use the named-parameter calling
convention.
See the full PARAMETER LIST, above, or read on for some extra details:
_ReturnMap controls whether return characters encoded as ASCII 11
should be mapped back to real newlines (C<"\n">) when read into memory.
If false, they are left as ASCII 11 characters. (default is "true")
_ReturnEncoding controls the character that returns are encoded as, if
different from ASCII 11.
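The combined effect of these two parameters can be sketched as a single substitution. The helper decode_returns() is hypothetical and only mirrors the behavior described above; it is not taken from the module.

```perl
use strict;
use warnings;

## Hypothetical helper: map encoded returns back to real newlines.
##   $return_map: true to decode, false to leave the value untouched
##   $encoding:   the encoded-return character (assumed default: ASCII 11)
sub decode_returns {
    my ($value, $return_map, $encoding) = @_;
    return $value unless $return_map;
    $encoding = "\x0B" unless defined($encoding) and length($encoding);
    (my $decoded = $value) =~ s/\Q$encoding\E/\n/g;
    return $decoded;
}

my $raw       = "line one\x0Bline two";
my $decoded   = decode_returns($raw, 1);   ## ASCII 11 becomes "\n"
my $untouched = decode_returns($raw, 0);   ## ASCII 11 is preserved
my $custom    = decode_returns("a|b", 1, "|");

print "$decoded\n";
```

Passing a different character as the third argument corresponds to setting _ReturnEncoding to something other than ASCII 11.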
_FieldList is an array (reference) listing the names of fields to
import, in order (and will become the object's _FieldList upon
successful completion of the read() operation). If not provided and
not found in the object, or empty, then all fields found in the file
are imported and the object's field list will be set from those found
in the file, in the order found there. If _HeaderRow is false, then
$this->read_postcheck();
$this->progress("Thawed $FileName.");
$Success = 1;
goto done; ## Successful completion: we read from the cache.
}
## Could not retrieve for whatever reason (maybe cache did not
## exist yet or was out of date or had to be abandoned). So just
## read normally and possibly write the cache.
cache_failed:
{
$Success = $this->read_file(%$Params) or goto done;
## Now, having read successfully, we try to write the cache
## for next time. Writing the cache is optional; failing to
## write it is not a failure of the method.
{ ## Code in this block may fail and that's OK.
## First, pre-flight.
$this->warn("Cache file $CacheFileName cannot be created/overwritten: $!"),
goto done ## Successful completion.
unless $this->try_file_write($CacheFileName);
## The data to be stored is:
## 1) All data columns read by read_file()
## 2) Any parameters set by read_file()
## 3) _Subset param indicating partial fieldlist was read from file.
## 4) _Newline setting so we know if it's compatible when read back.
## No other parameters should be cached because we want a
## read from the cache to produce exactly the same result
## as a read from the file itself would have produced.
## After a read, fieldlist() will contain the fields
## actually read, so cols_hash() WILL yield all the
## columns.
my $Data = {(
## Refs to each column read by read_file()
%{ $this->cols_hash() },
## Other parameters set by read_file()
_FieldList => $this->{_FieldList },
_LineEnding => $this->{_LineEnding},
_FDelimiter => $this->{_FDelimiter},
_HeaderRow => $this->{_HeaderRow },
_Subset => $this->{_Subset },
_Newline => "\n",
)};
$this->warn("Failed to cache $CacheFileName"),
unlink($CacheFileName),
goto done ## Successful completion.
unless $this->write_cache($Data, $CacheFileName);
chmod 0666, $CacheFileName; ## Liberal perms if possible.
}
goto done; ## Successful completion: we read from the file & maybe saved cache.
}
done:
return ($Success);
}
=pod
=head1 WRITING DATA FILES
    ## Writing some or all data from table into a data file

    $t->write($Path)    ## Simple calling convention

    $t->write(          ## Named-parameter convention

        ## Params that override params in the object if supplied...

        _FileName       => $Path,  ## "Base path"; see _WriteExtension
        _WriteExtension => ".out", ## Insert/append extension to _FileName
        _FieldList      => [...],  ## Fields to write; others ignored
        _Selection      => [...],  ## Record (#s) to write; others ignored
        _HeaderRow      => 0,      ## Whether to include header row (default is true)
        _LineEnding     => undef,  ## Record delimiter (default is "\n")
        _FDelimiter     => undef,  ## Field delimiter (default is comma)
        _ReturnMap      => 1,      ## Whether to encode internal returns
        _ReturnEncoding => "\x0B", ## How to encode returns
        _MacRomanMap    => undef,  ## Whether/when to write Mac char set
        _CacheOnWrite   => 1,      ## Enable saving cache after write()
        _CacheExtension => ".x",   ## Extension to add to cache file name
        _CacheSubDir    => "",     ## (Sub-)dir, if any, for cache files

        ## Params specific to the read()/write() methods...

        _MaxRecords     => 200,    ## Limit on how many records to write
    )

    $t->write_file()    ## Internal: same as write(); ignores cacheing
write() writes a Merge, CSV, or Tab-delimited file.
It uses parameters as described above. Any parameters not supplied
are taken from the object.
Using the simple calling convention, just pass it a path which will
override the _FileName parameter in the object, if any.
All other parameters will come from the object (or will be defaulted
if absent).
## if an attempt to create needed subdirectories has failed.
my $CacheFileName = $this->prep_cache_file($WriteFileName, $CacheExtension, $CacheSubDir)
or goto done;
## Pre-flight the cache file for writing.
$this->warn("Cache file $CacheFileName cannot be created/overwritten: $!"),
goto done ## Successful completion.
unless $this->try_file_write($CacheFileName);
## The data to be stored is:
## 1) All data columns written by write_file()
## 2) Any file format parameters used by write_file()
## Calculate the main writing-related parameters using the same
## logic that write_file() uses...
## Default for FieldList is all fields.
$FieldList ||= $this->fieldlist();
## Convert from optional "dos", "mac", "unix" symbolic values.
$LineEnding = $this->lineending_string($LineEnding);
## Default for LineEnding is "\n" (CR on Mac; LF on Unix; CR/LF on DOS)
$LineEnding = "\n" unless length($LineEnding);
## Default for FDelimiter is comma
$FDelimiter = ',' unless length($FDelimiter);
## Default for HeaderRow is true.
$HeaderRow = 1 unless defined($HeaderRow);
## No other parameters should be cached because we want a
## read from the cache to produce exactly the same result
## as a read from the file itself would have produced.
my $Data = {(
## Refs to each column written
%{ $this->cols_hash($FieldList)},
## Other relevant file-format parameters
_FieldList => $FieldList,
_LineEnding => $LineEnding,
_FDelimiter => $FDelimiter,
_HeaderRow => $HeaderRow,
_Subset => $this->{_Subset} || 0,
_Newline => "\n",
## We don't need to save _ReturnMap and
## _ReturnEncoding because those are relevant only
## when reading physical files. Cached data has the
## return chars already encoded as returns.
)};
$this->warn("Failed to cache $CacheFileName"),
unlink($CacheFileName), ## Delete cache if failure
goto done ## Successful completion.
unless $this->write_cache($Data, $CacheFileName);
chmod 0666, $CacheFileName; ## Liberal perms if possible.
done:
return($WriteFileName);
}
sub write_cache
{
my $this = shift;
my ($Data, $CacheFileName) = @_;
$this->progress("Storing $CacheFileName...");
my $Success = nstore($Data, $CacheFileName);
$this->progress("Stored $CacheFileName.") if $Success;
done:
return($Success);
}
sub write_file ## Just write; don't worry about cacheing
{
my $this = shift;
my $Params = (@_ == 1 ? {_FileName => $_[0]} : {@_});
my($FileName, $FieldList, $Selection, $MaxRecords, $LineEnding, $FDelimiter, $QuoteFields, $ReturnMap, $ReturnEncoding, $MacRomanMap, $HeaderRow, $WriteExtension)
= map {$this->getparam($Params, $_)}
qw(_FileName _FieldList _Selection _MaxRecords _LineEnding _FDelimiter _QuoteFields _ReturnMap _ReturnEncoding _MacRomanMap _HeaderRow _WriteExtension);
my $Success;
$this->{_ErrorMsg} = "";
## if FileName is unspecified, or is the single character "-",
## then default to STDOUT.
$FileName = \ *STDOUT if ($FileName =~ /^-?$/);
## If we have a regular file handle, bless it into IO::File.
$FileName = bless ($FileName, 'IO::File') if ref($FileName) =~ /(HANDLE)|(GLOB)/;
## If we have a file handle either passed or constructed, make note of that fact.
my $GotHandle = ref($FileName) eq 'IO::File';
$this->{_ErrorMsg} = "FileName must be specified for write()", goto done
unless $GotHandle or length($FileName);
## Default for FieldList is all fields.
$FieldList ||= $this->fieldlist();
## Default for Selection is all records.
$Selection ||= $this->selection();
## Default for MaxRecords is 0 (meaning write all records)
$MaxRecords = 0 unless (int($MaxRecords) == $MaxRecords);
## Convert from optional "dos", "mac", "unix" symbolic values.
$LineEnding = $this->lineending_string($LineEnding);
## Default for LineEnding is "\n" (CR on Mac; LF on Unix; CR/LF on DOS)
$LineEnding = "\n" unless length($LineEnding);