DBD-XBase
lib/XBase/Index.pm
are looking for. That way you've avoided reading all pages describing
the values that are lower. Here you descend one level, fetch the page
and again search the list of keys in that page. And you repeat this
process until you get to the leaf (lowest) level, where you finally
find a pointer to the dbf. XBase::Index does this for you.
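The descent described above can be sketched in a few lines of Perl. This is an illustration only, not the code XBase::Index uses: the %pages structure, its field names and find_recno are hypothetical, and the sketch assumes that each entry in a non-leaf page holds the largest key of the child page it points to. The real formats keep the pages as binary blocks in the index file.

```perl
use strict;
use warnings;

# Hypothetical in-memory "index file": page number => list of entries.
# Non-leaf entries point to a child page, leaf entries carry a dbf
# record number. Keys inside each page are sorted.
my %pages = (
    0 => [ { key => 'dog',   page  => 1 }, { key => 'zebra', page  => 2 } ],
    1 => [ { key => 'ant',   recno => 7 }, { key => 'cat',   recno => 2 },
           { key => 'dog',   recno => 5 } ],
    2 => [ { key => 'jezek', recno => 3 }, { key => 'zebra', recno => 1 } ],
);

# Walk from the root page down to a leaf and return the record number
# for $wanted, or undef if the key is not in the index.
sub find_recno {
    my ($pages, $root, $wanted) = @_;
    my $pageno = $root;
    while (1) {
        my $entries = $pages->{$pageno} or return;
        # first entry whose key is greater than or equal to the value
        my ($hit) = grep { $_->{key} ge $wanted } @$entries;
        return unless defined $hit;
        if (exists $hit->{page}) {
            $pageno = $hit->{page};     # descend one level
            next;
        }
        return $hit->{key} eq $wanted ? $hit->{recno} : undef;
    }
}

print find_recno(\%pages, 0, 'jezek'), "\n";
```

Only one page per level is read, so pages holding lower values are never touched.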
Some of the formats also support multiple indexes in one file --
usually there is one top-level index that, for different field values,
points to different root pages in the index file (so-called tags).
XBase::Index supports (or aims to support) the following index
formats: ndx, ntx, mdx, cdx and idx. They differ in the way they store
the keys and pointers but the idea is always the same: make a tree of
pages, where each page contains keys and pointers either to pages at
lower levels, or to the dbf (or both). XBase::Index only supports
read-only access to the index files at the moment (and if you need
to write them as well, keep following the reading support, because it
needs to be stable before I get to work on updating the indexes).
=head2 Testing your index file (and XBase::Index)
You can test your index using the indexdump script in the main
directory of the DBD::XBase distribution (I mean testing XBase::Index
on correct index data, not testing a corrupted index file, of course ;-)
Just run
./indexdump ~/path/index.ndx
./indexdump ~/path/index.cdx tag_name
or
perl -Ilib ./indexdump ~/path/index.cdx tag_name
if you haven't installed this version of XBase.pm/DBD::XBase yet. You
should get the content of the index file. On each row, there is
the key value and a record number of the record in the dbf file. Let
me know if you get results different from those you expect. I'd
probably ask you to send me the index file (and possibly the dbf file
as well), so that I can debug the problem.
The index file is (as already noted) a complement to a dbf file. An index
file without a dbf doesn't make much sense because the only thing that
you can get from it is the record number in the dbf file, not the
actual data. But it makes sense to test -- dump the content of the
index to see if the sequence is OK.
The index formats usually distinguish between numeric and character
data. Some of the file formats include the information about the type
in the index file, others depend on the dbf file. Since with indexdump
we only look at the index file, you may need to specify the -type
option to indexdump if it complains that it doesn't know the data
type of the values (this is the case with cdx at least). The possible
values are num, char and date and the call would be like
./indexdump -type=num ~/path/index.cdx tag_name
(this -type option may not work with all index formats at the moment
-- will be fixed and patches always welcome).
You can use the C<-ddebug> option to indexdump to see how pages are
fetched and decoded, or run the debugger to see the calls and parsing.
=head2 Using the index files to speed up searches in dbf
The syntax for using the index files to access data in the dbf file is
generally
my $table = XBase->new("tablename");
# or any other arguments to get the XBase object
# see XBase(3)
my $cur = $table->prepare_select_with_index("indexfile",
"list", "of", "fields", "to", "return");
or
my $cur = $table->prepare_select_with_index(
[ "indexfile_with_tags", "tag_name" ],
"list", "of", "fields", "to", "return");
where we specify the tag in the index file (this is necessary with cdx
and mdx). Once we have the cursor, we can seek to a given record and
start fetching the data:
$cur->find_eq('jezek');
while (my @data = $cur->fetch) {
    # do something
}
=head2 Supported index formats
The following table summarizes which formats are supported by
XBase::Index. If the field says something other than Yes, I welcome
testers and offers of example index files.
Reading of index files -- types supported by XBase::Index
type    string      numeric     date
----------------------------------------------------------
ndx     Yes         Yes         Yes (you need to
                                convert to Julian)
ntx     Yes         Yes         Untested
idx     Untested    Untested    Untested
        (but should be pretty usable)
mdx     Untested    Untested    Untested
cdx     Yes         Yes         Untested
Writing of index files -- not supported until the reading
is stable enough.
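The ndx row in the table notes that date values need converting to Julian. A minimal sketch of the usual calendar-to-Julian-day-number arithmetic follows. julian_day is a hypothetical helper, not part of XBase::Index, and whether its result can be passed straight to find_eq for your index is an assumption you should check against your own data.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of XBase::Index): convert a Gregorian
# calendar date to a Julian day number using integer arithmetic.
sub julian_day {
    my ($year, $month, $day) = @_;
    my $adj = int((14 - $month) / 12);   # 1 for January and February, else 0
    my $y   = $year + 4800 - $adj;
    my $m   = $month + 12 * $adj - 3;    # March = 0 ... February = 11
    return $day + int((153 * $m + 2) / 5) + 365 * $y
         + int($y / 4) - int($y / 100) + int($y / 400) - 32045;
}

print julian_day(2000, 1, 1), "\n";      # prints 2451545
```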
So if you have access to an index file that is untested or unsupported
and you care about support of these formats, contact me. If you are
able to actually generate those files on request, all the better, because I
may need a specific file size or type to check something. If the file
format you work with is supported, I still appreciate a report that it
really works for you.
B<Please note> that there is very little documentation about the file
formats and the work on XBase::Index is heavily based on making