Fsdb

lib/Fsdb.pm  view on Meta::CPAN

#!/usr/bin/perl -w

#
# Fsdb.pm
#
# Copyright (C) 1991-2024 by John Heidemann <johnh@isi.edu>
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License,
# version 2, as published by the Free Software Foundation.
# 
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
# 
# You should have received a copy of the GNU General Public License along
# with this program; if not, write to the Free Software Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
#

package Fsdb;

use warnings;
use strict;
use utf8;

=encoding utf8

=head1 NAME

Fsdb - a flat-text database for shell scripting


=cut
our $VERSION = '3.4';

=head1 SYNOPSIS

Fsdb, the flatfile streaming database, is a package of commands
for manipulating flat-ASCII databases from
shell scripts.  Fsdb is useful to process medium amounts of data (with
very little data you'd do it by hand, with megabytes you might want a
real database).
Fsdb was known as Jdb from 1991 to Oct. 2008.

Fsdb is very good at doing things like:

=over 4

=item *

extracting measurements from experimental output

=item *

examining data to address different hypotheses

=item *

joining data from different experiments

=item *

eliminating/detecting outliers

=item *

computing statistics on data
(mean, confidence intervals, correlations, histograms)

=item *

reformatting data for graphing programs

=back

Fsdb is built around the idea of a flat text file as a database.
Fsdb files (by convention, with the extension F<.fsdb>),
have a header documenting the schema (what the columns mean),
and then each line represents a database record (or row).

For example:

	#fsdb experiment duration
	ufs_mab_sys 37.2
	ufs_mab_sys 37.3
	ufs_rcp_real 264.5
	ufs_rcp_real 277.9

This is a simple file with four experiments (the rows),
each with a description and a run time
in the first and second columns.
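
Because the format is just a header line plus one record per line,
it can be read with a few lines of plain Perl.  The following is a
minimal sketch, not the Fsdb library's own API: C<parse_fsdb> is a
hypothetical helper, and it assumes a header of the form
C<#fsdb col1 col2 ...> with no options and whitespace-separated fields.

```perl
use strict;
use warnings;

# Hand-rolled sketch, NOT the Fsdb library API.
# parse_fsdb is a hypothetical helper; it assumes an option-free
# "#fsdb col1 col2 ..." header and whitespace-separated fields.
sub parse_fsdb {
    my ($text) = @_;
    my (@cols, @rows);
    for my $line (split /\n/, $text) {
        $line =~ s/^\s+//;              # tolerate leading indentation
        next if $line eq '';
        if ($line =~ /^#fsdb\s+(.*)$/) {
            @cols = split ' ', $1;      # column names come from the header
            next;
        }
        next if $line =~ /^#/;          # skip any other comment lines
        my @fields = split ' ', $line;
        my %row;
        @row{@cols} = @fields;          # hash slice: column name => value
        push @rows, \%row;
    }
    return (\@cols, \@rows);
}

my $data = <<'END';
#fsdb experiment duration
ufs_mab_sys 37.2
ufs_mab_sys 37.3
ufs_rcp_real 264.5
ufs_rcp_real 277.9
END

my ($cols, $rows) = parse_fsdb($data);

# Mean duration per experiment, keyed by the first column.
my (%sum, %n);
for my $r (@$rows) {
    $sum{ $r->{experiment} } += $r->{duration};
    $n{ $r->{experiment} }++;
}
printf "%s %.2f\n", $_, $sum{$_} / $n{$_} for sort keys %sum;
```

Keying each row by the column names from the header, rather than by
position, means a script keeps working if columns are reordered.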

Rather than hand-code scripts to do each special case, Fsdb provides
higher-level functions.  Although it's often easy to throw together a
custom script to do any single task, I believe that there are several
advantages to using Fsdb:

=over 4

Still in beta, but picking up some bug fixes.

=over 4

=item ENHANCEMENT

L<dbmapreduce> now generates plausible output when given no rows
of input.

=item ENHANCEMENT

L<dbroweval>'s warnings option was backwards;
it is now corrected.  As a result, warnings in user code now default off
(like in fsdb-1.x).

=item BUG FIX

L<dbcolpercentile> now defaults to assuming the target column is numeric.
The new option C<-N> allows selection of a non-numeric target.

=item BUG FIX

L<dbcolscorrelate> now includes C<--sample> and C<--nosample> options
to compute the sample or full population correlation coefficients.
Thanks to Xue Cai for finding this bug.

=back


=head2 2.11, 14-Oct-08

Still in beta, but picking up some bug fixes.

=over 4

=item ENHANCEMENT

L<html_table_to_db> is now more aggressive about filling in empty cells
with the official empty value, rather than leaving them blank or as whitespace.

=item ENHANCEMENT

L<dbpipeline> now catches failures during pipeline element setup
and exits reasonably gracefully.

=item BUG FIX

L<dbsubprocess> now reaps child processes, thus avoiding
running out of processes when used a lot.

=back

=head2 2.12, 16-Oct-08

Finally, a full (non-beta) 2.x release!

=over 4

=item INCOMPATIBLE CHANGE

Jdb has been renamed Fsdb, the flatfile-streaming database.
This change affects all internal Perl APIs,
but no shell command-level APIs.
While Jdb served well for more than ten years,
it is easily confused with the Java debugger (even though Jdb was there first!).
It also is too generic to work well in web search engines.
Finally, Jdb stands for ``John's database'', and we're a bit beyond that.
(However, some call me the ``file-system guy'', so 
one could argue it retains that meaning.)

If you just used the shell commands, this change should not affect you.
If you used the Perl-level libraries directly in your code,
you should be able to rename "Jdb" to "Fsdb" to move to 2.12.

The jdb-announce list has not yet been renamed, but it will be shortly.

With this release I've accomplished everything I wanted to
in fsdb-2.x.  I therefore expect to return to boring, bugfix releases.

=back

=head2 2.13, 30-Oct-08

=over 4

=item BUG FIX

L<dbrowaccumulate> now treats non-numeric data as zero by default.

=item BUG FIX

Fixed a perl-5.10ism in L<dbmapreduce> that
breaks that program under 5.8. 
Thanks to Martin Lukac for reporting the bug.

=back

=head2 2.14, 26-Nov-08

=over 4

=item BUG FIX

Improved documentation for L<dbmapreduce>'s C<-f> option.

=item ENHANCEMENT

L<dbcolmovingstats> now computes a moving standard deviation in addition
to a moving mean.

=back


=head2 2.15, 13-Apr-09

=over 4

=item BUG FIX

Fix a F<make install> bug reported by Shalindra Fernando.


