Fsdb

README  view on Meta::CPAN

Fsdb(3)               User Contributed Perl Documentation              Fsdb(3)

NAME
       Fsdb - a flat-text database for shell scripting

SYNOPSIS
       Fsdb, the flatfile streaming database, is a package of commands for
       manipulating flat-ASCII databases from shell scripts.  Fsdb is useful
       for processing medium amounts of data (with very little data you'd do
       it by hand, with megabytes you might want a real database).  Fsdb was
       known as Jdb from 1991 to Oct. 2008.

       Fsdb is very good at doing things like:

       +o   extracting measurements from experimental output

       +o   examining data to address different hypotheses

       +o   joining data from different experiments

       +o   eliminating/detecting outliers

       +o   computing   statistics   on   data   (mean,  confidence  intervals,
           correlations, histograms)

       +o   reformatting data for graphing programs

       Fsdb is built around the idea of a flat text file as a database.   Fsdb
       files  (by  convention,  with  the  extension  .fsdb),  have  a  header
       documenting the schema (what the columns  mean),  and  then  each  line
       represents a database record (or row).

       For example:

               #fsdb experiment duration
               ufs_mab_sys 37.2
               ufs_mab_sys 37.3
               ufs_rcp_real 264.5
               ufs_rcp_real 277.9

       This is a simple file with four experiments (the rows), each with a
       description and a run time in the first and second columns.
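
       To make the role of the header concrete, here is a minimal sketch in
       plain awk (not an Fsdb command) that uses the "#fsdb" line to find a
       column by name.  The function name fsdb_col is hypothetical, and the
       sketch assumes the header holds only column names, as in the example
       above, with whitespace-separated fields.

```shell
# Sketch only: plain awk, not part of Fsdb.
# fsdb_col FILE NAME prints the values of the column called NAME,
# using the "#fsdb" header line to map names to positions.
# Assumes the header lists only column names (no extra options).
fsdb_col() {
    awk -v want="$2" '
        /^#fsdb/ {                      # header: "#fsdb name1 name2 ..."
            for (i = 2; i <= NF; i++)
                if ($i == want) col = i - 1
            next
        }
        /^#/ { next }                   # skip other comment lines
        col  { print $col }             # data row: print the wanted field
    ' "$1"
}
```

       For instance, "fsdb_col data.fsdb duration" would print one run time
       per line, where data.fsdb is a hypothetical file holding the example
       above.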

       Rather than hand-code scripts to do each special case, Fsdb provides
       higher-level functions.  Although it's often easy to throw together a
       custom script to do any single task, I believe that there are several
       advantages to using Fsdb:

       +o   these programs provide a higher level interface than plain Perl, so

           **  Fewer lines of simpler code:

                   dbrow '_experiment eq "ufs_mab_sys"' | dbcolstats duration

               Picks out just one type of experiment and  computes  statistics
               on it, rather than:

                   while (<>) { @F = split; $sum+=$F[1]; $ss+=$F[1]**2; $n++; }
                   $mean = $sum / $n; $std_dev = ...

               in dozens of places.

       +o   the library uses names for columns, so

           **  No more $F[1], use "_duration".
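
       As a rough illustration of the work the pipeline above saves, the
       following sketch does the selection and the statistics by hand in
       plain awk.  It is not how Fsdb is implemented.  The function name
       mean_sd is hypothetical, the column names come from the example file,
       and the choice of the sample (n-1) standard deviation is an
       assumption.

```shell
# Sketch only: plain awk standing in for a select-then-summarize pipeline.
# mean_sd FILE KEY prints "mean sample_std_dev" of the duration column
# for rows whose experiment column equals KEY.
mean_sd() {
    awk -v key="$2" '
        /^#fsdb/ {                      # map column names to positions
            for (i = 2; i <= NF; i++) pos[$i] = i - 1
            next
        }
        /^#/ { next }                   # skip comment lines
        $pos["experiment"] == key {     # keep only the matching rows
            x = $pos["duration"]
            n++; sum += x; ss += x * x
        }
        END {
            if (n < 2) exit 1           # sample std dev needs 2+ rows
            mean = sum / n
            var = (ss - n * mean * mean) / (n - 1)
            if (var < 0) var = 0        # guard against rounding error
            printf "%.2f %.4f\n", mean, sqrt(var)
        }
    ' "$1"
}
```

       Looking columns up by name through the header, rather than by
       position, is the same idea as writing "_duration" instead of $F[1].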

           fsdb-2.x.

       SEMI-COMPATIBLE CHANGE
           The header of fsdb files has changed; it is now #fsdb, not #h (or
           #L), and parsing of -F and -R is also different.  See dbfilealter
           for the new specification.  The v1 file format will be read,
           compatibly, but not written.

       BUG FIX
           dbmapreduce now tolerates comments  that  precede  the  first  key,
           instead of failing with an error message.

   2.9, 6-Aug-08
       Still in beta; just a quick bug-fix for dbmapreduce.

       ENHANCEMENT
           dbmapreduce  now  generates  plausible output when given no rows of
           input.

   2.10, 23-Sep-08
       Still in beta, but picking up some bug fixes.

       ENHANCEMENT
           dbmapreduce now generates plausible output when given  no  rows  of
           input.

       ENHANCEMENT
           The dbroweval warnings option was backwards; it is now corrected.
           As a result, warnings in user code now default to off (as in
           fsdb-1.x).

       BUG FIX
           dbcolpercentile now defaults  to  assuming  the  target  column  is
           numeric.   The  new  option  "-N" allows selection of a non-numeric
           target.

       BUG FIX
           dbcolscorrelate now includes "--sample" and "--nosample" options to
           compute the sample or  full  population  correlation  coefficients.
           Thanks to Xue Cai for finding this bug.

   2.11, 14-Oct-08
       Still in beta, but picking up some bug fixes.

       ENHANCEMENT
           html_table_to_db  is  now  more  aggressive  about filling in empty
           cells with the official empty value, rather than leaving them blank
           or as whitespace.

       ENHANCEMENT
           dbpipeline now catches failures during pipeline element  setup  and
           exits reasonably gracefully.

       BUG FIX
           dbsubprocess  now  reaps child processes, thus avoiding running out
           of processes when used a lot.

   2.12, 16-Oct-08
       Finally, a full (non-beta) 2.x release!

       INCOMPATIBLE CHANGE
           Jdb has been renamed Fsdb, the flatfile-streaming  database.   This
           change  affects  all internal Perl APIs, but no shell command-level
           APIs.  While Jdb served well for more than ten years, it is  easily
           confused with the Java debugger (even though Jdb was there first!).
           It  also  is  too  generic  to  work  well  in  web search engines.
           Finally, Jdb stands for ``John's database'', and we're a bit beyond
           that.  (However, some call me the ``file-system guy'', so one could
           argue it retains that meaning.)

           If you just used the shell commands, this change should not  affect
           you.   If  you used the Perl-level libraries directly in your code,
           you should be able to rename "Jdb" to "Fsdb" to move to 2.12.

           The jdb-announce list has not yet been renamed, but it will be
           shortly.

           With this release I've  accomplished  everything  I  wanted  to  in
           fsdb-2.x.  I therefore expect to return to boring, bugfix releases.

   2.13, 30-Oct-08
       BUG FIX
           dbrowaccumulate now treats non-numeric data as zero by default.

       BUG FIX
           Fixed  a perl-5.10ism in dbmapreduce that breaks that program under
           5.8.  Thanks to Martin Lukac for reporting the bug.

   2.14, 26-Nov-08
       BUG FIX
           Improved documentation for dbmapreduce's "-f" option.

       ENHANCEMENT
           dbcolmovingstats now computes a moving standard deviation in
           addition to a moving mean.

   2.15, 13-Apr-09
       BUG FIX
           Fix a make install bug reported by Shalindra Fernando.

   2.16, 14-Apr-09
       BUG FIX
           Another minor release bug: on some systems programize_module loses
           executable permissions.  Again reported by Shalindra Fernando.

   2.17, 25-Jun-09
       TYPO FIXES
           Typo in the dbroweval manual fixed.

       IMPROVEMENT
           There  is no longer a comment line to label columns in dbcolneaten,
           instead the header  line  is  tweaked  to  line  up.   This  change
           restores  the  Jdb-1.x  behavior,  and  means that repeated runs of
           dbcolneaten no longer add comment lines each time.

       BUG FIX
           It turns out   dbcolneaten  was  not  correctly  handling  trailing
           spaces   when  given  the  "-E"  option  to  suppress  them.   This
           regression is now fixed.

       EXTENSION
           dbroweval(1) can now handle direct references to the last  row  via
           $lfref, a dubious but now documented feature.


