Alien-Judy

 view release on metacpan or  search on metacpan

src/judy-1.0.5/doc/int/10minutes.htm  view on Meta::CPAN

<A NAME="pgfId=997381">
 </A>
This continues until the population grows to 32 keys. At this point an actual tree structure is formed with a &quot;compressed&quot; 256-ary node (branch) that decodes the first byte of each key. The value 32 was chosen because this is where a tree s...
<P CLASS="Body">
<A NAME="pgfId=997382">
 </A>
There are three kinds of branches. Two are 1-cache-line fill objects to traverse, and one is a 2-cache-line fill object to traverse. In every path down the tree and at all populations, a maximum of one of the 2-cache-line fill branches is used. This ...
<P CLASS="Body">
<A NAME="pgfId=997383">
 </A>
On the other extreme, a highly populated Judy1 tree where the key has been decoded down to 1 byte, and the density of a 256-wide sub-expanse of keys grows to greater than 0.094 (25 keys / 256 expanse), a bitmap of 32 bytes (256 bits) is formed from a...
<P CLASS="Body">
<A NAME="pgfId=997579">
 </A>
Notice that to insert or delete a key is almost as simple as setting or clearing a bit. Also notice, the memory consumption is almost the same for both 32- and 64-bit Judy trees. Given the same set of keys, both 32- and 64-bit Judy trees have remarka...
<P CLASS="Body">
<A NAME="pgfId=997580">
 </A>
In this short writeup it wasn't possible to describe all the data structure details such as: Root, JPM, narrow and rich pointers, linear, bitmap and uncompressed branches, value areas, immediate indexes, terminal nodes (leafs), least compressed form,...
<P CLASS="Body">
<A NAME="pgfId=997386">

src/judy-1.0.5/src/Judy1/Judy1.h  view on Meta::CPAN

#define cJ1_LEAF6_MAXPOP1    (J_1_MAXB / 6)
#define cJ1_LEAF7_MAXPOP1    (J_1_MAXB / 7)
#define cJ1_LEAFW_MAXPOP1    ((J_1_MAXB - cJU_BYTESPERWORD) / cJU_BYTESPERWORD)

#endif


// MAXIMUM POPULATIONS OF IMMEDIATE JPs:
//
// These specify the maximum Population of immediate JPs with various Index
// Sizes (== sizes of remaining undecoded Index bits).

#define cJ1_IMMED1_MAXPOP1  ((sizeof(jp_t) - 1) / 1)    // 7 [15].
#define cJ1_IMMED2_MAXPOP1  ((sizeof(jp_t) - 1) / 2)    // 3  [7].
#define cJ1_IMMED3_MAXPOP1  ((sizeof(jp_t) - 1) / 3)    // 2  [5].

#ifdef JU_64BIT
#define cJ1_IMMED4_MAXPOP1  ((sizeof(jp_t) - 1) / 4)    //    [3].
#define cJ1_IMMED5_MAXPOP1  ((sizeof(jp_t) - 1) / 5)    //    [3].
#define cJ1_IMMED6_MAXPOP1  ((sizeof(jp_t) - 1) / 6)    //    [2].
#define cJ1_IMMED7_MAXPOP1  ((sizeof(jp_t) - 1) / 7)    //    [2].

src/judy-1.0.5/src/JudyCommon/JudyCount.c  view on Meta::CPAN

// not really a search, but just a count.

	case cJU_JPLEAF_B1:  LEAFB1ABOVE(j__udyCountLeafB1);


#ifdef JUDY1
// ----------------------------------------------------------------------------
// FULL POPULATION:
//
// Return the count of Indexes AT OR ABOVE Index, which is the total population
// of the expanse (a constant) less the value of the undecoded digit remaining
// in Index (its base-0 offset in the expanse), which yields an inclusive count
// above.
//
// TBD:  This only supports a 1-byte full expanse.  Should this extract a
// stored value for pop0 and possibly more LSBs of Index, to handle larger full
// expanses?

	case cJ1_JPFULLPOPU1:
	    return(cJU_JPFULLPOPU1_POP0 + 1 - JU_DIGITATSTATE(Index, 1));
#endif

src/judy-1.0.5/src/JudyCommon/JudyDel.c  view on Meta::CPAN

#define JU_IMMED_01(NewJPType,ParentJPType)                             \
                                                                        \
            assert(parentJPtype == (ParentJPType));                     \
            assert(JU_JPDCDPOP0(Pjp) == JU_TRIMTODCDSIZE(Index));       \
            JU_JPSETADT(Pjp, 0, 0, NewJPType);                          \
            return(1)

// Convert cJ*_JPIMMED_*_02 to cJU_JPIMMED_*_01:
//
// Move the undeleted Index, whichever does not match the least bytes of Index,
// from undecoded-bytes-only (in jp_1Index or jp_LIndex as appropriate) to
// jp_DcdPopO (full-field).  Pjp, Index, and offset are in the context.

#define JU_IMMED_02(cIS,LeafType,NewJPType)             \
        {                                               \
            LeafType Pleaf;                             \
                                                        \
            assert((ParentLevel - 1) == (cIS));         \
  JUDY1CODE(Pleaf  = (LeafType) (Pjp->jp_1Index);)      \
  JUDYLCODE(Pleaf  = (LeafType) (Pjp->jp_LIndex);)      \
  JUDYLCODE(PjvRaw = (Pjv_t) (Pjp->jp_Addr);)           \

src/judy-1.0.5/src/JudyCommon/JudyDel.c  view on Meta::CPAN

                || ((Pjpm->jpm_JP.jp_Type) == cJU_JPBRANCH_B)
                || ((Pjpm->jpm_JP.jp_Type) == cJU_JPBRANCH_U));

// WALK THE TREE 
//
// Note:  Recursive code in j__udyDelWalk() knows how to collapse a lower-level
// BranchL containing a single JP into the parent JP as a narrow pointer, but
// the code here cant do that for a top-level BranchL.  The result can be
// PArray -> JPM -> BranchL containing a single JP.  This situation is
// unavoidable because a JPM cannot contain a narrow pointer; the BranchL is
// required in order to hold the top digit decoded, and it does not collapse to
// a LEAFW until the population is low enough.
//
// TBD:  Should we add a topdigit field to JPMs so they can hold narrow
// pointers?

            if (j__udyDelWalk(Pjp, Index, cJU_ROOTSTATE, Pjpm) == -1)
            {
                JU_COPY_ERRNO(PJError, Pjpm);
                return(JERRI);
            }

src/judy-1.0.5/src/JudyCommon/JudyGet.c  view on Meta::CPAN

        Word_t    Index         // to retrieve.
#else
        Pcvoid_t  PArray,       // from which to retrieve.
        Word_t    Index,        // to retrieve.
        PJError_t PJError       // optional, for returning error info.
#endif
        )
{
        Pjp_t     Pjp;          // current JP while walking the tree.
        Pjpm_t    Pjpm;         // for global accounting.
        uint8_t   Digit;        // byte just decoded from Index.
        Word_t    Pop1;         // leaf population (number of indexes).
        Pjll_t    Pjll;         // pointer to LeafL.
        DBGCODE(uint8_t ParentJPType;)

#ifndef JUDYGETINLINE

        if (PArray == (Pcvoid_t) NULL)  // empty array.
        {
  JUDY1CODE(return(0);)
  JUDYLCODE(return((PPvoid_t) NULL);)

src/judy-1.0.5/src/JudyCommon/JudyPrevNext.c  view on Meta::CPAN

// See the manual entry for the API.
//
// OVERVIEW OF Judy*Prev():
//
// Use a reentrant switch statement (state machine, SM1 = "get") to decode the
// callers *PIndex-1, starting with the (PArray), through branches, if
// any, down to an immediate or a leaf.  Look for *PIndex-1 in that leaf, and
// if found, return it.
//
// A dead end is either a branch that does not contain a JP for the appropriate
// digit in *PIndex-1, or a leaf that does not contain the undecoded digits of
// *PIndex-1.  Upon reaching a dead end, backtrack through the leaf/branches
// that were just traversed, using a list (history) of parent JPs that is built
// while going forward in SM1Get.  Start with the current leaf or branch.  In a
// backtracked leaf, look for an Index less than *PIndex-1.  In each
// backtracked branch, look "sideways" for the next JP, if any, lower than the
// one for the digit (from *PIndex-1) that was previously decoded.  While
// backtracking, if a leaf has no previous Index or a branch has no lower JP,
// go to its parent branch in turn.  Upon reaching the JRP, return failure, "no
// previous Index".  The backtrack process is sufficiently different from
// SM1Get to merit its own separate reentrant switch statement (SM2 =
// "backtrack").
//
// While backtracking, upon finding a lower JP in a branch, there is certain to
// be a "prev" Index under that JP (unless the Judy array is corrupt).
// Traverse forward again, this time taking the last (highest, right-most) JP
// in each branch, and the last (highest) Index upon reaching an immediate or a

src/judy-1.0.5/src/JudyCommon/JudyPrivate.h  view on Meta::CPAN

           pointer.  Since this is shared with the Decode field, the combined
           sizes must be 3[7], that is, 1 word less 1 byte for the JP Type.

   The 1-byte field jp_Type is represented as:

   1..3 == Number of bytes in the population (Pop0) word of the Branch or Leaf
           below the pointer (note:  1..7 on 64-bit); indicates:
           - number of bytes in Decode field == 3 - this number;
           - number of bytes remaining to decode.
           Note:  The maximum is 3, not 4, because the 1st byte of the Index is
           always decoded digitally in the top branch.
   -B-  == JP points to a Branch (there are many kinds of Branches).
   -L-  == JP points to a Leaf (there are many kinds of Leaves).

   (2)  == Digit of Index decoded by position offset in branch (really
           0..0xff).

    4*  == Digit of Index necessary for decoding a "narrow" pointer, in a
           Decode field; replaces 1 missing branch (really 0..0xff).

    4+  == Digit of Index NOT necessary for decoding a "narrow" pointer, but
           used for fast traversal of the SM by Judy1Test() and JudyLGet()
           (see the code) (really 0..0xff).

    0   == Byte in a JPs Pop0 field that is always ignored, because a leaf

src/judy-1.0.5/src/JudyCommon/JudyPrivate.h  view on Meta::CPAN

#define cJU_FULLBITMAPB ((BITMAPB_t) cJU_ALLONES)
#define cJU_FULLBITMAPL ((BITMAPL_t) cJU_ALLONES)


// ****************************************************************************
// MISCELLANEOUS JUDY-SPECIFIC DECLARATIONS
// ****************************************************************************

// ROOT STATE:
//
// State at the start of the Judy SM, based on 1 byte decoded per state; equal
// to the number of bytes per Index to decode.

#define cJU_ROOTSTATE (sizeof(Word_t))


// SUBEXPANSES PER STATE:
//
// Number of subexpanses per state traversed, which is the number of JPs in a
// branch (actual or theoretical) and the number of bits in a bitmap.

src/judy-1.0.5/src/JudyCommon/JudyPrivateBranch.h  view on Meta::CPAN

#define JU_BRANCHB_MIN_POP        1000
#define JU_BRANCHB_MAX_POP      (-1UL)

#endif // NO_BRANCHU


// MISCELLANEOUS MACROS:

// Get N most significant bits from the shifted Index word:
//
// As Index words are decoded, they are shifted left so only relevant,
// undecoded Index bits remain.

#define JU_BITSFROMSFTIDX(SFTIDX, N)  ((SFTIDX) >> (cJU_BITSPERWORD - (N)))

// TBD:  I have my doubts about the necessity of these macros (dlb):

// Produce 1-digit mask at specified state:

#define cJU_MASKATSTATE(State)  (0xffL << (((State) - 1) * cJU_BITSPERBYTE))

// Get byte (digit) from Index at the specified state, right justified:

src/judy-1.0.5/src/JudyL/JudyL.h  view on Meta::CPAN

#define cJL_LEAF7_MAXPOP1       (J_L_MAXB / (7 + cJU_BYTESPERWORD))
#define cJL_LEAFW_MAXPOP1 \
           ((J_L_MAXB - cJU_BYTESPERWORD) / (2 * cJU_BYTESPERWORD))

#endif // 64-bit


// MAXIMUM POPULATIONS OF IMMEDIATE JPs:
//
// These specify the maximum Population of immediate JPs with various Index
// Sizes (== sizes of remaining undecoded Index bits).  Since the JP Types enum
// already lists all the immediates in order by state and size, calculate these
// values from it to avoid redundancy.

#define cJL_IMMED1_MAXPOP1  ((cJU_BYTESPERWORD - 1) / 1)        // 3 [7].
#define cJL_IMMED2_MAXPOP1  ((cJU_BYTESPERWORD - 1) / 2)        // 1 [3].
#define cJL_IMMED3_MAXPOP1  ((cJU_BYTESPERWORD - 1) / 3)        // 1 [2].

#ifdef JU_64BIT
#define cJL_IMMED4_MAXPOP1  ((cJU_BYTESPERWORD - 1) / 4)        //   [1].
#define cJL_IMMED5_MAXPOP1  ((cJU_BYTESPERWORD - 1) / 5)        //   [1].

src/judy-1.0.5/src/JudySL/JudySL.c  view on Meta::CPAN

#define SCLSIZE(LEN)  (((LEN) + STRUCTOVD + WORDSIZE - 1) / WORDSIZE)

// string routines, may replace with your own
//
#define STRCMP(S1,S2)   strcmp((void *)(S1), (void *)(S2))
#define STRCPY(S1,S2)   strcpy((void *)(S1), (void *)(S2))
#define STRLEN(S1)      (strlen((void *)(S1)) + 1)


// Index and value area for a shortcut leaf, depending on how it matches the
// undecoded remainder of the Index, given a Pscl_t that includes type bits
// that must be cleared:
//
// PSCLINDEX() and PSCLVALUE() are also useful when Pscl contains uncleared
// TYPE bits.
//
// Note:  SCLCMP() cannot take advantage of knowing the Index length because
// the scl_Index length is not pre-known when these macros are used.

#define PSCLINDEX(PSCL)  ((CLEAR_PSCL(PSCL))->scl_Index)
#define PSCLVALUE(PSCL)  ((CLEAR_PSCL(PSCL))->scl_Pvalue)

src/judy-1.0.5/test/SLcompare.c  view on Meta::CPAN


wc /data/domnames.txt
28826508 28826508 488227095 /data/domnames.txt

For the curious, JudySL (I.E JSLI/G macros in this program) is simply an
application subroutine that uses JudyL to the work.  JudyL is a very
scaleable word to word (or pointer) mapper.  JudySL is implemented as a
digital tree using JudyL arrays as 2^32 (or 2^64 in 64 bit machines)
wide (virtual) nodes.  Each level in the digital tree decodes the next 4
(or 8) bytes of the line/string until only one entry would be in the
node.  Then the remaining undecoded portion of the line is stored as a
string from the parent node.  A digital tree does not need rotation or
balancing.  In practice, digital trees are rarely use because of their
poor memory efficiency in the nodes and a tendency to have large depths.
These problems are solved in the Judy algorithm.  For more details see
application notes on:  www.sourcejudy.com.  And for the really curious,
a technical paper is available in the application section.  If you would
like to be a reviewer, contact Doug Baskins at <doug@sourcejudy.com>

*/



( run in 1.134 second using v1.01-cache-2.11-cpan-26ccb49234f )