DBD-XBase
lib/XBase/Index.pm view on Meta::CPAN
@ISA = qw( XBase::Base );
$VERSION = '1.05';
$DEBUG = 0;
$VERBOSE = 0 unless defined $VERBOSE;
# Set up a global variable denoting the byte order (endianness),
# detected from how a double with value 1 is packed
my $packed = pack('d', 1);
if ($packed eq "\077\360\000\000\000\000\000\000") {
    $BIGEND = 1;
} elsif ($packed eq "\000\000\000\000\000\000\360\077") {
    $BIGEND = 0;
} else {
    die "XBase::Index: your architecture is not supported.\n";
}
# Open appropriate index file and create object according to suffix
sub new {
    my ($class, $file) = (shift, shift);
    my @opts = @_;
    print "XBase::Index::new($class, $file, @_)\n" if $XBase::Index::VERBOSE;
    if (ref $class) { @opts = ('dbf', $class, @opts); }
    # accept suffixes of any length, so that four-letter ones
    # like .sdbm are recognized as well
    my ($ext) = ($file =~ /\.(\w+)$/);
    $ext = defined $ext ? lc $ext : '';
    if ($ext eq 'sdbm' or $ext eq 'pag' or $ext eq 'dir') {
        require XBase::SDBM;
        $ext = 'SDBM';
    }
    my $object = eval "new XBase::$ext \$file, \@opts";
    return $object if defined $object;
    __PACKAGE__->Error("Error loading index: unknown extension\n") if $@;
    return;
}
# For XBase::*x object, a record is one page, object XBase::*x::Page here
sub get_record {
    my ($self, $num) = @_;
    return $self->{'pages_cache'}{$num}
        if defined $self->{'pages_cache'}{$num};
    my $newpage = (ref $self) . '::Page::new';
    my $page = $self->$newpage($num);
    if (defined $page) {
        $self->{'pages_cache'}{$num} = $page;
        local $^W = 0;
        print "Page $page->{'num'}:\tkeys: @{[ map { s/\s+$//; $_; } @{$page->{'keys'}}]}\n\tvalues: @{$page->{'values'}}\n" if $DEBUG;
        print "\tlefts: @{$page->{'lefts'}}\n" if defined $page->{'lefts'} and $DEBUG;
    }
    $page;
}
# Get next (value, record number in dbf) pair
# The important values of the index object are 'level' holding the
# current level of the "cursor", 'pages' holding an array of pages
# currently open for each level and 'rows' with an array of current row
# in each level
sub fetch {
    my $self = shift;
    my ($level, $page, $row, $key, $val, $left);
    # cycle while we get to the leaf record or otherwise get
    # a real value, not a pointer to lower page
    while (not defined $val)
    {
        $level = $self->{'level'};
        # if we do not have level, let's start from zero
        if (not defined $level) {
            $level = $self->{'level'} = 0;
            $page = $self->get_record($self->{'start_page'});
            if (not defined $page) {
                $self->Error("Index corrupt: $self: no root page $self->{'start_page'}\n");
                return;
            }
            # and initialize 'pages' and 'rows'
            $self->{'pages'} = [ $page ];
            $self->{'rows'} = [];
        }
        # get current page for this level
        $page = $self->{'pages'}[$level];
        if (not defined $page) {
            $self->Error("Index corrupt: $self: page for level $level lost in normal course\n");
            return;
        }
        # get current row for current level and increase it
        # (or setup to zero)
        my $row = $self->{'rows'}[$level];
        if (not defined $row) {
            $row = $self->{'rows'}[$level] = 0;
        } else {
            $self->{'rows'}[$level] = ++$row;
        }
        # get the (key, value, pointer) from the page
        ($key, $val, $left) = $page->get_key_val_left($row);
        # there is another page to walk
        if (defined $left) {
            # go deeper
            $level++;
            my $oldpage = $page;
            # load the next page
            $page = $self->get_record($left);
            if (not defined $page) {
                $self->Error("Index corrupt: $self: no page $left, ref'd from $oldpage, row $row, level $level\n");
                return;
            }
            # and put it into the structure
            $self->{'pages'}[$level] = $page;
            $self->{'rows'}[$level] = undef;
            $self->{'level'} = $level;
            # and even if some index structures allow the
You can test your index using the indexdump script in the main
directory of the DBD::XBase distribution (I mean testing XBase::Index
on correct index data, not testing a corrupted index file, of course ;-).
Just run

    ./indexdump ~/path/index.ndx
    ./indexdump ~/path/index.cdx tag_name

or

    perl -Ilib ./indexdump ~/path/index.cdx tag_name

if you haven't installed this version of XBase.pm/DBD::XBase yet. You
should get the content of the index file. On each row, there is
the key value and a record number of the record in the dbf file. Let
me know if you get results different from those you expect. I'd
probably ask you to send me the index file (and possibly the dbf file
as well), so that I can debug the problem.
The index file is (as already noted) a complement to a dbf file. An
index file without a dbf doesn't make much sense, because the only thing
you can get from it is the record number in the dbf file, not the
actual data. But it does make sense for testing: dump the content of the
index to see if the sequence is OK.
The index formats usually distinguish between numeric and character
data. Some of the file formats include the information about the type
in the index file, others depend on the dbf file. Since with indexdump
we only look at the index file, you may need to specify the -type
option to indexdump if it complains that it doesn't know the data
type of the values (this is the case with cdx at least). The possible
values are num, char and date, and the call would look like

    ./indexdump -type=num ~/path/index.cdx tag_name

(this -type option may not work with all index formats at the moment
-- will be fixed and patches always welcome).
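
To see why the type has to be known, consider that the same key bytes
read very differently as text and as a number. The sketch below is purely
illustrative: decode_key is a hypothetical helper, not a part of
XBase::Index, and the layouts it assumes (an 8-byte little-endian double
for num, a space-padded string for char) are assumptions made for the
example, not a description of any particular index format. The date type
is left out.

```perl
use strict;
use warnings;

# Hypothetical helper: interpret raw key bytes according to a declared type.
# Assumed layouts (illustration only): 'num' is an 8-byte little-endian
# double, 'char' is a string padded with trailing spaces.
sub decode_key {
    my ($bytes, $type) = @_;
    if ($type eq 'num') {
        return unpack('d<', $bytes);
    }
    if ($type eq 'char') {
        (my $text = $bytes) =~ s/\s+$//;    # strip the padding
        return $text;
    }
    die "decode_key: unknown type '$type'\n";
}

# The same eight bytes, read once as a number and once as text
# (the text reading is shown as hex because it is not printable).
my $raw = pack('d<', 1234.5);
print "as num:  ", decode_key($raw, 'num'), "\n";
print "as char: ", unpack('H*', decode_key($raw, 'char')), "\n";
```

Without the type, a reader of the index has no way to choose between
these two interpretations, which is what the -type option resolves.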
You can use the C<-ddebug> option to indexdump to see how pages are
fetched and decoded, or run the debugger to see the calls and parsing.
=head2 Using the index files to speed up searches in dbf
The syntax for using the index files to access data in the dbf file is
generally

    my $table = new XBase "tablename";
        # or any other arguments to get the XBase object
        # see XBase(3)
    my $cur = $table->prepare_select_with_index("indexfile",
        "list", "of", "fields", "to", "return");

or

    my $cur = $table->prepare_select_with_index(
        [ "indexfile_with_tags", "tag_name" ],
        "list", "of", "fields", "to", "return");

where we specify the tag in the index file (this is necessary with cdx
and mdx). After we have the cursor, we can seek to a given record and
start fetching the data:

    $cur->find_eq('jezek');
    while (my @data = $cur->fetch) {
        # do something with @data
    }

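
The calls above can be combined into one small routine. This is only a
sketch: rows_from_key is a hypothetical helper, not a part of XBase, and
it assumes that prepare_select_with_index returns a false value on
failure. It uses nothing beyond the methods shown above, plus
XBase->errstr for the error message.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of XBase): collect the rows an index
# cursor yields after seeking to $key. $index is either an index file
# name or an [ $file, $tag ] array reference, as in the calls above.
sub rows_from_key {
    my ($table, $index, $key, @fields) = @_;
    # Assumption: a false return value signals failure.
    my $cur = $table->prepare_select_with_index($index, @fields)
        or die "Cannot prepare a select with the index\n";
    $cur->find_eq($key);
    my @rows;
    while (my @data = $cur->fetch) {
        push @rows, [ @data ];
    }
    return \@rows;
}

# Only runs when a dbf file, an index file and a key are given:
#   perl script.pl table.dbf index.ndx jezek FIELD1 FIELD2
if (@ARGV >= 3) {
    require XBase;
    my ($dbf, $ndx, $key, @fields) = @ARGV;
    my $table = XBase->new($dbf) or die XBase->errstr;
    print join("\t", @$_), "\n"
        for @{ rows_from_key($table, $ndx, $key, @fields) };
}
```

Returning array references keeps the helper independent of what is done
with the rows; for large tables you would rather process each row inside
the loop than collect them all.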
=head2 Supported index formats
The following table summarizes which formats are supported by
XBase::Index. If the field says something other than Yes, I welcome
testers and offers of example index files.

    Reading of index files -- types supported by XBase::Index

    type    string      numeric     date
    ----------------------------------------------------------
    ndx     Yes         Yes         Yes (you need to
                                    convert to Julian)
    ntx     Yes         Yes         Untested
    idx     Untested    Untested    Untested
            (but should be pretty usable)
    mdx     Untested    Untested    Untested
    cdx     Yes         Yes         Untested

    Writing of index files -- not supported until the reading
    is stable enough.

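
The conversion to Julian mentioned for ndx dates can be sketched as
follows. The function ymd_to_julian_day is a hypothetical helper, not a
part of XBase::Index; it uses the standard integer formula for the Julian
Day Number of a Gregorian calendar date. That this number is exactly the
value the index expects is an assumption you should check against your
own data (indexdump shows the stored keys).

```perl
use strict;
use warnings;

# Hypothetical helper: Gregorian calendar date -> Julian Day Number,
# computed with the standard integer arithmetic formula.
sub ymd_to_julian_day {
    my ($year, $month, $day) = @_;
    my $adj = int((14 - $month) / 12);   # 1 for January and February, else 0
    my $y = $year + 4800 - $adj;
    my $m = $month + 12 * $adj - 3;      # March is 0, ..., February is 11
    return $day
        + int((153 * $m + 2) / 5)
        + 365 * $y
        + int($y / 4) - int($y / 100) + int($y / 400)
        - 32045;
}

print ymd_to_julian_day(2000, 1, 1), "\n";    # prints 2451545
```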
So if you have access to an index file that is untested or unsupported
and you care about support of these formats, contact me. If you are
able to actually generate those files on request, so much the better,
because I may need a specific file size or type to check something. If
the file format you work with is supported, I still appreciate a report
that it really works for you.
B<Please note> that there is very little documentation about the file
formats and the work on XBase::Index is heavily based on making
assumptions based on real life data. Also, the documentation is often
wrong or only describes some format variations but not the others.
I personally do not need the index support but am more than happy to
make it a reality for you. So I need your help -- contact me if it
doesn't work for you and offer me your files for testing. Mentioning
the word XBase somewhere in the Subject line will get you (hopefully ;-)
a fast response. Mentioning the word Help or similar stupidity will
probably make my filters consider your email as spam. Help yourself by
making my life easier in helping you.
=head2 Programmer's notes

Programmers might find the following information useful when trying
to debug XBase::Index from their files:

The XBase::Index module contains the basic XBase::Index package and
also packages XBase::ndx, XBase::ntx, XBase::idx, XBase::mdx and
XBase::cdx, and for each of these also a package
XBase::index_type::Page. Reading the file goes like this: you create
an object calling either new XBase::Index or new XBase::ndx (or