Acme-CPANModules-BloomFilters


README

    2022-03-18.

DESCRIPTION
    A bloom filter is a data structure that lets you quickly check whether
    an element is in a set. Compared to a regular hash, it is much more
    memory-efficient. The downside is that a bloom filter can give false
    positives, although false negatives are not possible. In essence, a
    bloom filter answers either "possibly in set" or "definitely not in
    set" for a given item. You can configure the false-positive rate: for a
    given number of items, the larger the filter, the lower the rate.
    Example applications of bloom filters include: 1) checking whether a
    password is in a dictionary of millions of common/compromised passwords;
    2) checking an email address against a leak database; 3) virus pattern
    checking; 4) IP/domain blacklisting/whitelisting. Because of these
    properties, a bloom filter is sometimes combined with other data
    structures. For example, a small bloom filter can be distributed with a
    piece of software to check against a database. When the answer from the
    bloom filter is "possibly in set", the software can then consult an
    online database to confirm whether the item is indeed in the set. Thus,
    a bloom filter can be used to reduce the number of direct queries to the
    database.
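
    The link between filter size and false-positive rate can be made
    concrete with the standard approximation, which assumes independent,
    uniformly distributed hash functions. The symbols below are introduced
    here for illustration and do not appear in the original text: m is the
    number of bits in the filter, n the number of items added, k the number
    of hash functions, and p the false-positive rate.

```latex
p \approx \left(1 - e^{-kn/m}\right)^{k},
\qquad
m_{\text{opt}} = -\frac{n \ln p}{(\ln 2)^{2}},
\qquad
k_{\text{opt}} = \frac{m}{n} \ln 2
```

    As a rough worked example under these assumptions, n = 1,000,000 items
    and p = 0.01 give m of about 9.6 million bits (roughly 1.2 MB) and k of
    about 7 hash functions.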

    In Perl, my default go-to choice is Algorithm::BloomFilter, unless
    there's a specific feature I need from other implementations.

lib/Acme/CPANModules/BloomFilters.pm

    summary => "List of bloom filter modules on CPAN",
    description => <<'_',

A bloom filter is a data structure that lets you quickly check whether an
element is in a set. Compared to a regular hash, it is much more
memory-efficient. The downside is that a bloom filter can give false positives,
although false negatives are not possible. In essence, a bloom filter answers
either "possibly in set" or "definitely not in set" for a given item. You can
configure the false-positive rate: for a given number of items, the larger the
filter, the lower the rate. Example applications of bloom filters include: 1)
checking whether a password is in a dictionary of millions of common/compromised
passwords; 2) checking an email address against a leak database; 3) virus
pattern checking; 4) IP/domain blacklisting/whitelisting. Because of these
properties, a bloom filter is sometimes combined with other data structures. For
example, a small bloom filter can be distributed with a piece of software to
check against a database. When the answer from the bloom filter is "possibly in
set", the software can then consult an online database to confirm whether the
item is indeed in the set. Thus, a bloom filter can be used to reduce the number
of direct queries to the database.

In Perl, my default go-to choice is <pm:Algorithm::BloomFilter>, unless there's
a specific feature I need from other implementations.

lib/Acme/CPANModules/BloomFilters.pm


=head1 DESCRIPTION

A bloom filter is a data structure that lets you quickly check whether an
element is in a set. Compared to a regular hash, it is much more
memory-efficient. The downside is that a bloom filter can give false positives,
although false negatives are not possible. In essence, a bloom filter answers
either "possibly in set" or "definitely not in set" for a given item. You can
configure the false-positive rate: for a given number of items, the larger the
filter, the lower the rate. Example applications of bloom filters include: 1)
checking whether a password is in a dictionary of millions of common/compromised
passwords; 2) checking an email address against a leak database; 3) virus
pattern checking; 4) IP/domain blacklisting/whitelisting. Because of these
properties, a bloom filter is sometimes combined with other data structures. For
example, a small bloom filter can be distributed with a piece of software to
check against a database. When the answer from the bloom filter is "possibly in
set", the software can then consult an online database to confirm whether the
item is indeed in the set. Thus, a bloom filter can be used to reduce the number
of direct queries to the database.
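
The two-tier lookup described above can be sketched in plain Perl. This is a minimal, hand-rolled illustration using only core modules, not the implementation of any CPAN module: the helper names (C<bf_new>, C<bf_add>, C<bf_test>, C<is_compromised>), the filter size, the number of hashes, and the use of MD5-based double hashing are all assumptions made for the example, and the C<%authoritative> hash merely stands in for the full (possibly remote) database.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5);

# Hypothetical helpers for illustration only; not any CPAN module's API.

# Create a filter of $n_bits bits (all zero) probed by $n_hashes hashes.
sub bf_new {
    my ($n_bits, $n_hashes) = @_;
    my %bf = (
        bits     => "\0" x int(($n_bits + 7) / 8),
        n_bits   => $n_bits,
        n_hashes => $n_hashes,
    );
    return \%bf;
}

# Derive the bit positions for an item from two 32-bit words of its MD5
# digest (double hashing: position_i = h1 + i * h2, modulo the filter size).
sub bf_positions {
    my ($bf, $item) = @_;
    my ($h1, $h2) = unpack 'N2', md5($item);
    $h2 |= 1;    # keep the step odd so it is never zero
    return map { ($h1 + $_ * $h2) % $bf->{n_bits} } 0 .. $bf->{n_hashes} - 1;
}

# Adding an item sets all of its bits.
sub bf_add {
    my ($bf, $item) = @_;
    vec($bf->{bits}, $_, 1) = 1 for bf_positions($bf, $item);
    return;
}

# Returns 0 for "definitely not in set", 1 for "possibly in set".
sub bf_test {
    my ($bf, $item) = @_;
    for my $pos (bf_positions($bf, $item)) {
        return 0 unless vec($bf->{bits}, $pos, 1);
    }
    return 1;
}

# Stand-in for the authoritative database (assumed data).
my %authoritative = map { $_ => 1 } qw(123456 password qwerty letmein);

# Small filter built from the database and shipped with the software.
my $bf = bf_new(1024, 7);
bf_add($bf, $_) for keys %authoritative;

sub is_compromised {
    my ($password) = @_;
    # "Definitely not in set": skip the expensive lookup entirely.
    return 0 unless bf_test($bf, $password);
    # "Possibly in set": confirm against the authoritative source.
    return exists $authoritative{$password} ? 1 : 0;
}

printf "%s: %s\n", $_, is_compromised($_) ? 'compromised' : 'ok'
    for qw(qwerty zebra-staple-42);
```

Because the filter never produces false negatives, only the "possibly in set" answers reach the authoritative lookup, so a false positive costs one extra query but never a wrong final answer.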

In Perl, my default go-to choice is L<Algorithm::BloomFilter>, unless there's
a specific feature I need from other implementations.


