KinoSearch


core/KinoSearch/Index/SegPostingList.c

        InStream *skip_stream           = self->skip_stream;
        SkipStepper *const skip_stepper = self->skip_stepper;
        uint32_t new_doc_id             = skip_stepper->doc_id;
        int64_t new_filepos             = InStream_Tell(post_stream);

        /* Assuming the default skip_interval of 16...
         *
         * Say we're currently on the 5th doc matching this term, and we get a
         * request to skip to the 18th doc matching it.  We won't have skipped
         * yet, but we'll have already consumed 5 of the 16 docs in the
         * current skip interval, so num_skipped starts at -5.  Hence the
         * modulus in the following formula.
         */
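        int32_t num_skipped = 0 - (self->count % skip_interval);
        /* A nonzero count that is an exact multiple of skip_interval means
         * a whole interval has already been consumed. */
        if (num_skipped == 0 && self->count != 0) {
            num_skipped = 0 - skip_interval;
        }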

        // See if there's anything to skip. 
        while (target > skip_stepper->doc_id) {
            new_doc_id    = skip_stepper->doc_id;
            new_filepos   = skip_stepper->filepos;
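
The starting value of num_skipped described in the comment above can be checked in isolation. The following is a minimal sketch, not KinoSearch code: `initial_num_skipped` is a hypothetical helper, and the parameter types are assumptions, since the excerpt does not show how `self->count` and `skip_interval` are declared.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of KinoSearch): returns the non-positive
 * starting value for num_skipped, given how many docs have been read so
 * far and the skip interval.  Types are assumed for illustration. */
static int32_t
initial_num_skipped(uint32_t count, uint32_t skip_interval)
{
    assert(skip_interval > 0);

    /* Docs already consumed within the current skip interval. */
    int32_t num_skipped = 0 - (int32_t)(count % skip_interval);

    /* An exact, nonzero multiple means a whole interval was consumed. */
    if (num_skipped == 0 && count != 0) {
        num_skipped = 0 - (int32_t)skip_interval;
    }
    return num_skipped;
}
```

With the default interval of 16, a count of 5 gives -5, a count of 0 gives 0, and a count of 16 gives -16 rather than 0.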

core/KinoSearch/Search/Compiler.cfh

     * 
     * @param factor The multiplier.
     */
    public void
    Apply_Norm_Factor(Compiler *self, float factor);

    /**  Take a newly minted Compiler object and apply query-specific
     * normalization factors.  Should be called at or near the end of
     * construction.
     *
     * For a TermQuery, the scoring formula is approximately:
     * 
     *     ( tf_d * idf_t / norm_d ) * ( tf_q * idf_t / norm_q ) 
     * 
     * Normalize() is theoretically concerned with applying the second half of
     * that formula to the Compiler's weight. What actually happens depends
     * on how the Compiler and Similarity methods called internally are
     * implemented.
     */
    public void
    Normalize(Compiler *self);
    
    /** Return an array of Span objects, indicating where in the given
     * field the text that matches the parent query occurs.  In this case,
     * the span's offset and length are measured in Unicode code points.
     * The default implementation returns an empty array.       

core/KinoSearch/Search/TermQuery.c


void
TermCompiler_apply_norm_factor(TermCompiler *self, float query_norm_factor) 
{
    self->query_norm_factor = query_norm_factor;

    /* Multiply raw weight by the idf and norm_q factors in this:
     *
     *      ( tf_q * idf_t / norm_q )
     *
     * Note: factoring in IDF a second time is correct; idf_t appears in
     * both halves of the scoring formula.
     */
    self->normalized_weight 
        = self->raw_weight * self->idf * query_norm_factor;
}

float
TermCompiler_get_weight(TermCompiler *self)
{
    return self->normalized_weight;
}
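
To see the arithmetic of the two functions above outside the library, here is a minimal standalone sketch. `SketchTermCompiler` and the `sketch_` functions are hypothetical stand-ins that mirror only the fields visible in the excerpt; the real TermCompiler struct has more members and is not reproduced here.

```c
#include <assert.h>

/* Hypothetical stand-in for TermCompiler, limited to the fields that
 * appear in the excerpt above. */
typedef struct {
    float idf;
    float raw_weight;
    float query_norm_factor;
    float normalized_weight;
} SketchTermCompiler;

/* Mirrors the excerpt: store the factor, then scale the raw weight by
 * both the idf and the query norm factor. */
static void
sketch_apply_norm_factor(SketchTermCompiler *self, float query_norm_factor)
{
    assert(self != 0);
    self->query_norm_factor = query_norm_factor;
    self->normalized_weight
        = self->raw_weight * self->idf * query_norm_factor;
}

/* Mirrors the excerpt's accessor. */
static float
sketch_get_weight(SketchTermCompiler *self)
{
    return self->normalized_weight;
}
```

For example, a raw weight of 3.0, an idf of 2.0 and a norm factor of 0.5 yield a normalized weight of 3.0. If raw_weight already carries one idf factor (an assumption; the excerpt does not show where raw_weight is set), the result contains idf squared, which is what the note about factoring in IDF a second time refers to.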

lib/KinoSearch/Search/Compiler.pod

B<factor> - The multiplier.

=back

=head2 normalize()

Take a newly minted Compiler object and apply query-specific
normalization factors.  Should be called at or near the end of
construction.

For a TermQuery, the scoring formula is approximately:

    ( tf_d * idf_t / norm_d ) * ( tf_q * idf_t / norm_q ) 

normalize() is theoretically concerned with applying the second half of
that formula to the Compiler's weight. What actually happens depends
on how the Compiler and Similarity methods called internally are
implemented.
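
The approximate formula above can be written out as a small function to make the two halves explicit. This is an illustrative sketch only: `sketch_term_score` and its parameters are hypothetical names taken from the symbols in the formula, and it does not reflect how KinoSearch actually distributes the computation between Compiler, Similarity and Matcher classes.

```c
#include <assert.h>

/* Hypothetical illustration of the approximate TermQuery formula:
 *
 *     ( tf_d * idf_t / norm_d ) * ( tf_q * idf_t / norm_q )
 *
 * Parameter names follow the symbols in the formula. */
static float
sketch_term_score(float tf_d, float norm_d,
                  float tf_q, float norm_q, float idf_t)
{
    assert(norm_d != 0.0f && norm_q != 0.0f);

    /* First half: the document side. */
    float doc_half = tf_d * idf_t / norm_d;

    /* Second half: the query side, the part normalize() is
     * theoretically concerned with. */
    float query_half = tf_q * idf_t / norm_q;

    return doc_half * query_half;
}
```

Because idf_t appears in both halves, the product scales with the square of idf_t.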

=head2 get_parent()

Accessor for the Compiler's parent Query object.

=head2 get_similarity()

Accessor for the Compiler's Similarity object.


