Data-CTable

CTable.pm  view on Meta::CPAN

(either because there was no cache file yet or the cache file was out
of date or corrupted or otherwise unusable), read() will then try to
create a cache file.  This, of course, takes some time, but the time
taken will be more than made up in the speedup of the next read()
operation on the same file.

If creating the cache file fails (for example, because file
permissions didn't allow the cache directory to be created or the
cache file to be written), read() generates a warning explaining why
cacheing failed, but the read() operation itself still succeeds.
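
The "warn but still succeed" behavior can be sketched with core
modules only.  This is a minimal illustration, not the module's own
code: try_cache() is a hypothetical helper, and the file names and
sample data are made up for the example.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);
use Storable qw(nstore);

## Hypothetical helper (not part of Data::CTable): try to store $Data
## and report failure with a warning instead of dying.
sub try_cache
{
	my ($Data, $CacheFileName) = @_;

	## nstore() can die on failure (for example when the file cannot
	## be created), so trap that case too.
	my $Ok = eval { nstore($Data, $CacheFileName) };

	unless ($Ok)
	{
		my $Why = $@ || $! || "unknown error";
		chomp($Why);
		warn("Failed to cache $CacheFileName: $Why\n");
		unlink($CacheFileName);
		return(0);
	}

	return(1);
}

my $Dir  = tempdir(CLEANUP => 1);
my $Data = { Name => ["Ann", "Bob"], _FieldList => ["Name"] };

## The "read" has already succeeded by this point; cacheing is extra.
my $Good = try_cache($Data, File::Spec->catfile($Dir, "people.txt.cache"));
my $Bad  = try_cache($Data, File::Spec->catfile($Dir, "missing", "people.txt.cache"));

print "cache written: ", ($Good ? "yes" : "no"), "\n";
print "cache written: ", ($Bad  ? "yes" : "no"), "\n";
```

The second call targets a directory that does not exist, so it only
produces a warning; the caller's own result is unaffected.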

No parameters in the object itself are set or modified to indicate the
success or failure of writing the cache file.  

Similarly, there is no way to tell whether a successful read()
operation read from the cache or from the original data file.  If you
want to be SURE the reading was from the data file, either turn off
_CacheOnRead, or call the read_file() method instead of read().

NOTE: because the name of the cache file to be used is calculated just
before the read() is actually done, the cache file can only be found
if the _CacheSubDir and _CacheExtension are the same as they were when
the cache was last created.  If you change these parameters after
having previously cached a file, the older caches could be "orphaned"
and just sit around wasting disk space.
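
To see why changed parameters orphan older caches, here is a rough
sketch of how a cache path might be derived from a data file path.
The helper cache_path_for() and the exact layout (sub-directory beside
the data file, extension appended to the file name) are assumptions
made for illustration; the module's real calculation may differ.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);
use File::Spec;

## Hypothetical helper (not part of Data::CTable): derive a cache path
## from a data file path plus the two cache parameters.
sub cache_path_for
{
	my ($DataFile, $CacheSubDir, $CacheExtension) = @_;

	my ($Name, $Dir) = fileparse($DataFile);
	my @Sub          = length($CacheSubDir) ? ($CacheSubDir) : ();

	return(File::Spec->catfile($Dir, @Sub, $Name . $CacheExtension));
}

## Example values only; not necessarily the module's defaults.
my $Old = cache_path_for("data/people.txt", "cache", ".cache");
my $New = cache_path_for("data/people.txt", "tmp",   ".x");

## Different settings give different paths, so a cache written under
## the old settings is never looked for under the new ones.
print "old: $Old\nnew: $New\n";
```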

=head2 Cacheing on write()

You may optionally set _CacheOnWrite (default = false) to true.  If
done, then a cache file will be saved for files written using the
write() command.  Read about write() below for more about why you
might want to do this.

=head1 AUTOMATIC DIRECTORY CREATION

When Data::CTable needs to write a file (a cache file or a data
file), it automatically tries to create any directories or
subdirectories you specify in the _FileName or _CacheSubDir
parameters.

If directory creation fails while writing a data file, write() will
fail (and you will be warned).  If it fails while writing a cache
file, a warning will be issued, but the overall read() or write()
operation will still return a result indicating success.

Any directories created will have the permissions 0777
(world-writable) for easy cleanup.

Generally, the only directory the module will have to create is a
subdirectory to hold cache files.

However, since other directories could be created, be sure to exercise
caution when allowing the module to create any directories for you on
any system where security might be an issue.

Also, if the 0666 permissions on the cache files themselves are too
liberal, you can either 1) turn off cacheing, or 2) call the
prep_cache_file() method to get the name of the cache file that would
have been written, if any, and then restrict its permissions:

	chmod (0600, $this->prep_cache_file());
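
If the liberal permissions are a concern, one possible approach is to
create the cache directory yourself with tighter permissions before
the module needs it, and then tighten each cache file afterwards.  The
sketch below uses only core modules; the directory and file names are
made up, the empty file merely stands in for a cache file, and it
assumes the module leaves the permissions of an already existing
directory alone.

```perl
use strict;
use warnings;
use File::Path qw(make_path);
use File::Spec;
use File::Temp qw(tempdir);

my $Base     = tempdir(CLEANUP => 1);
my $CacheDir = File::Spec->catdir($Base, "cache");

## Create the directory ourselves with owner-only permissions
## (the effective mode is still subject to the process umask).
make_path($CacheDir, { mode => 0700 });

## Stand-in for a cache file the module would have written.
my $CacheFile = File::Spec->catfile($CacheDir, "people.txt.cache");
open(my $FH, ">", $CacheFile) or die "Cannot create $CacheFile: $!";
close($FH);

## Restrict the cache file to its owner.
chmod(0600, $CacheFile) or die "Cannot chmod $CacheFile: $!";

my $DirMode  = (stat($CacheDir ))[2] & 07777;
my $FileMode = (stat($CacheFile))[2] & 07777;
printf("dir %04o, file %04o\n", $DirMode, $FileMode);
```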

=head1 READING DATA FILES

	## Replacing data in table with data read from a file

	$t->read($Path)     ## Simple calling convention

	$t->read(           ## Named-parameter convention

	     ## Params that override params in the object if supplied...

	     _FileName      => $Path, ## Full or partial path of file to read

	     _FieldList     => [...], ## Fields to read; others to be discarded

	     _HeaderRow     => 0,     ## No header row (_FieldList required!)

	     _LineEnding    => undef, ## Text line ending (undef means guess)
	     _FDelimiter    => undef, ## Field delimiter (undef means guess)

	     _ReturnMap     => 1,     ## Whether to decode internal returns
	     _ReturnEncoding=>"\x0B", ## How to decode returns.
	     _MacRomanMap   => undef, ## Whether/when to read Mac char set 

	     _CacheOnRead   => 0,     ## Enable/disable cacheing behavior
	     _CacheExtension=> ".x",  ## Extension to add to cache file name
	     _CacheSubDir   => "",    ## (Sub-)dir, if any, for cache files

	     ## Params specific to the read()/write() methods...

	     _MaxRecords    => 200,   ## Limit on how many records to read
	     )

	$t->read_file()     ## Internal: same as read(); ignores cacheing

read() opens a Merge, CSV, or Tab-delimited file and reads in all or
some fields, and all or some records, REPLACING ANY EXISTING DATA in
the CTable object.

Using the simple calling convention, just pass it a file name.  All
other parameters will come from the object (or will be defaulted if
absent).  To specify additional parameters or override any parameters
in the object while reading, use the named-parameter calling
convention.

See the full PARAMETER LIST, above, or read on for some extra details:

_ReturnMap controls whether return characters encoded as ASCII 11
(vertical tab, C<"\x0B">) should be mapped back to real newlines
(C<"\n">) when read into memory.  If false, they are left as ASCII 11
characters.  The default is true.

_ReturnEncoding controls the character that returns are encoded as, if
different from ASCII 11.
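
The mapping described by _ReturnMap and _ReturnEncoding amounts to a
simple substitution on each field value.  The helper decode_returns()
below is hypothetical and only illustrates the idea; it is not the
module's own implementation.

```perl
use strict;
use warnings;

## Hypothetical helper (not part of Data::CTable): turn encoded
## returns back into real newlines.
sub decode_returns
{
	my ($Value, $ReturnEncoding) = @_;

	## Fall back to ASCII 11 (vertical tab) if no encoding is given.
	$ReturnEncoding = "\x0B"
		unless defined($ReturnEncoding) && length($ReturnEncoding);

	(my $Decoded = $Value) =~ s/\Q$ReturnEncoding\E/\n/g;

	return($Decoded);
}

## A field as it might appear in a data file: one line on disk,
## two lines once decoded.
my $Raw     = "123 Main St.\x0BSuite 4";
my $Decoded = decode_returns($Raw);

print $Decoded, "\n";
```

Passing a second argument, such as C<"|">, decodes a file that used a
different character to stand in for returns.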

_FieldList is an array (reference) listing the names of fields to
import, in order (and will become the object's _FieldList upon
successful completion of the read() operation).  If not provided and
not found in the object, or empty, then all fields found in the file
are imported and the object's field list will be set from those found
in the file, in the order found there.  If _HeaderRow is false, then


		$this->read_postcheck();

		$this->progress("Thawed  $FileName.");

		$Success = 1;
		goto done;         ## Successful completion: we read from the cache.
	}

	## Could not retrieve for whatever reason (maybe cache did not
	## exist yet or was out of date or had to be abandoned).  So just
	## read normally and possibly write the cache.

  cache_failed:
	{
		$Success = $this->read_file(%$Params) or goto done;

		## Now, having read successfully, we try to write the cache
		## for next time.  Writing the cache is optional; failing to
		## write it is not a failure of the method.

		{	## Code in this block may fail and that's OK.

			## First, pre-flight.
			$this->warn("Cache file $CacheFileName cannot be created/overwritten: $!"), 
			goto done											## Successful completion.
				unless $this->try_file_write($CacheFileName);

			## The data to be stored is: 

			##   1) All data columns read by read_file()
			##   2) Any parameters set by read_file()
			##   3) _Subset param indicating partial fieldlist was read from file.
			##   4) _Newline setting so we know if it's compatible when read back.

			## No other parameters should be cached because we want a
			## read from the cache to produce exactly the same result
			## as a read from the file itself would have produced.

			## After a read, fieldlist() will contain the fields
			## actually read, so cols_hash() WILL yield all the
			## columns.

			my $Data = {(
						 ## Refs to each column read by read_file()
						 %{                 $this->cols_hash() },	

						 ## Other parameters set by read_file()
						 _FieldList		=>	$this->{_FieldList },	
						 _LineEnding	=>	$this->{_LineEnding},
						 _FDelimiter	=>	$this->{_FDelimiter},
						 _HeaderRow		=>	$this->{_HeaderRow },
						 _Subset		=>	$this->{_Subset    },
						 _Newline		=>	"\n",
						 )};
			
			$this->warn("Failed to cache $CacheFileName"), 
			unlink($CacheFileName), 
			goto done                                    ## Successful completion.
				unless $this->write_cache($Data, $CacheFileName);
			chmod 0666, $CacheFileName;					 ## Liberal perms if possible.
		}

		goto done;    ## Successful completion: we read from the file & maybe saved cache.
	}
	
  done:
	return ($Success);
}

=pod

=head1 WRITING DATA FILES

	## Writing some or all data from table into a data file

	$t->write($Path)              ## Simple calling convention

	$t->write(                    ## Named-parameter convention

	     ## Params that override params in the object if supplied...

	     _FileName      => $Path, ## "Base path"; see _WriteExtension

	     _WriteExtension=> ".out",## Insert/append extension to _FileName

	     _FieldList     => [...], ## Fields to write; others ignored
	     _Selection     => [...], ## Record (#s) to write; others ignored

	     _HeaderRow     => 0,     ## Include header row in file

	     _LineEnding    => undef, ## Record delimiter (default is "\n")
	     _FDelimiter    => undef, ## Field delimiter (default is comma)

	     _ReturnMap     => 1,     ## Whether to encode internal returns
	     _ReturnEncoding=>"\x0B", ## How to encode returns
	     _MacRomanMap   => undef, ## Whether/when to write Mac char set 


	     _CacheOnWrite  => 1,     ## Enable saving cache after write()
	     _CacheExtension=> ".x",  ## Extension to add to cache file name
	     _CacheSubDir   => "",    ## (Sub-)dir, if any, for cache files

	     ## Params specific to the read()/write() methods...

	     _MaxRecords    => 200,   ## Limit on how many records to write
	     )

	$t->write_file()    ## Internal: same as write(); ignores cacheing

write() writes a Merge, CSV, or Tab-delimited file.

It uses parameters as described above.  Any parameters not supplied
are taken from the object.

Using the simple calling convention, just pass it a path which will
override the _FileName parameter in the object, if any.

All other parameters will come from the object (or will be defaulted
if absent).  
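
The two calling conventions reduce to the same thing: a hash of
parameters that take precedence over values stored in the object.  A
minimal sketch of that idea follows.  The $Object hash and the helpers
normalize_params() and effective_param() are hypothetical stand-ins,
and the sketch assumes that any supplied key wins over the object's
value.

```perl
use strict;
use warnings;

## Hypothetical stand-in for a table object holding stored parameters.
my $Object = { _FileName => "old.txt", _FDelimiter => "\t" };

## One argument means "this is the path"; otherwise treat the
## arguments as name => value pairs.
sub normalize_params
{
	return(@_ == 1 ? { _FileName => $_[0] } : { @_ });
}

## A supplied parameter wins; otherwise fall back to the object.
sub effective_param
{
	my ($Params, $Name) = @_;

	return(exists($Params->{$Name}) ? $Params->{$Name} : $Object->{$Name});
}

my $Simple = normalize_params("new.csv");
my $Named  = normalize_params(_FileName => "new.csv", _FDelimiter => ",");

print "simple: ", effective_param($Simple, "_FileName"), "\n";
print "named:  ", effective_param($Named,  "_FileName"), "\n";
```

With the simple convention only _FileName is overridden, so the
delimiter still comes from the object; with the named convention both
values come from the call.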

	## if an attempt to create needed subdirectories has failed.

	my $CacheFileName	= $this->prep_cache_file($WriteFileName, $CacheExtension, $CacheSubDir)
		or goto done;
	
	## Pre-flight the cache file for writing.
	$this->warn("Cache file $CacheFileName cannot be created/overwritten: $!"), 
	goto done											## Successful completion.
		unless $this->try_file_write($CacheFileName);

	## The data to be stored is:
	
	##   1) All data columns written by write_file()
	##   2) Any file format parameters used by write_file()

	## Calculate the main writing-related parameters using the same
	## logic that write_file() uses...

	## Default for FieldList is all fields.
	$FieldList			||= $this->fieldlist();
	
	## Convert from optional "dos", "mac", "unix" symbolic values.
	$LineEnding = $this->lineending_string($LineEnding);
	
	## Default for LineEnding is "\n" (CR on Mac; LF on Unix; CR/LF on DOS)
	$LineEnding			= "\n"		unless length($LineEnding);
	
	## Default for FDelimiter is comma
	$FDelimiter			= ','		unless length($FDelimiter);

	## Default for HeaderRow is true.
	$HeaderRow			= 1 		unless defined($HeaderRow);

	## No other parameters should be cached because we want a
	## read from the cache to produce exactly the same result
	## as a read from the file itself would have produced.
	
	my $Data = {(
				 ## Refs to each column written
				 %{                 $this->cols_hash($FieldList)},	
				 
				 ## Other relevant file-format parameters
				 _FieldList		=>	$FieldList,
				 _LineEnding	=>	$LineEnding,
				 _FDelimiter	=>	$FDelimiter,
				 _HeaderRow		=>	$HeaderRow,
				 _Subset		=>	$this->{_Subset} || 0,
				 _Newline		=>	"\n",
				 
				 ## We don't need to save _ReturnMap and
				 ## _ReturnEncoding because those only are relevant
				 ## when reading physical files. Cached data has the
				 ## return chars already encoded as returns.
				 
				 )};
	
	$this->warn("Failed to cache $CacheFileName"), 
	unlink($CacheFileName), 					 ## Delete cache if failure
	goto done                                    ## Successful completion.
		unless $this->write_cache($Data, $CacheFileName);
	chmod 0666, $CacheFileName;					 ## Liberal perms if possible.
	
  done:
	return($WriteFileName);
}

sub write_cache
{
	my $this					= shift;
	my ($Data, $CacheFileName)	= @_;
	
	$this->progress("Storing $CacheFileName...");
	
	my $Success = nstore($Data, $CacheFileName);
	
	$this->progress("Stored  $CacheFileName.") if $Success;
	
  done:
	return($Success);
}

sub write_file		## Just write; don't worry about cacheing
{
	my $this		= shift;
	my $Params		= (@_ == 1 ? {_FileName => $_[0]} : {@_});

	my($FileName, $FieldList, $Selection, $MaxRecords, $LineEnding, $FDelimiter, $QuoteFields, $ReturnMap, $ReturnEncoding, $MacRomanMap, $HeaderRow, $WriteExtension) 
	    = map {$this->getparam($Params, $_)} 
	qw(_FileName  _FieldList  _Selection  _MaxRecords  _LineEnding  _FDelimiter  _QuoteFields  _ReturnMap  _ReturnEncoding  _MacRomanMap  _HeaderRow  _WriteExtension);

	my $Success;
	
	$this->{_ErrorMsg} = "";

	## if FileName is unspecified, or is the single character "-",
	## then default to STDOUT.
	$FileName = \ *STDOUT if ($FileName =~ /^-?$/);
	
	## If we have a regular file handle, bless it into IO::File.
    $FileName = bless ($FileName, 'IO::File') if ref($FileName) =~ /(HANDLE)|(GLOB)/;
	
	## If we have a file handle either passed or constructed, make note of that fact.
	my $GotHandle = ref($FileName) eq 'IO::File';
	
	$this->{_ErrorMsg} = "FileName must be specified for write()", goto done
		unless $GotHandle or length($FileName);
	
	## Default for FieldList is all fields.
	$FieldList			||= $this->fieldlist();
	
	## Default for Selection is all records.
	$Selection			||= $this->selection();

	## Default for MaxRecords is 0 (meaning write all records)
	$MaxRecords			= 0			unless (int($MaxRecords) == $MaxRecords);

	## Convert from optional "dos", "mac", "unix" symbolic values.
	$LineEnding = $this->lineending_string($LineEnding);

	## Default for LineEnding is "\n" (CR on Mac; LF on Unix; CR/LF on DOS)
	$LineEnding			= "\n"		unless length($LineEnding);


