AI-Categorizer
view on that data.
A knowledge set is encapsulated by the "AI::Categorizer::KnowledgeSet"
class. Before you can start playing with categorizers, you will have to
start playing with knowledge sets, so that the categorizers have some data
to train on. See the documentation for the "AI::Categorizer::KnowledgeSet"
module for information on its interface.
Feature selection
Deciding which features are the most important is a large part of the
categorization task. You cannot simply consider every word in every
training document, plus every word in the document being categorized.
There are two main reasons: first, training and categorization would take
far too long and use far too much memory; second, the significant features
of the documents would get lost in the "noise" of the insignificant ones.
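To make the idea concrete, here is a minimal sketch of one simple selection criterion: keep only the features that appear in the most training documents. It is an illustration under stated assumptions, not the code used by "AI::Categorizer::KnowledgeSet". The helper names (select_features_by_document_frequency, prune_document) and the plain hash-of-counts document layout are hypothetical, and the sample data assumes stopwords have already been removed.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AI::Categorizer): keep the $keep features
# that occur in the largest number of training documents.
# Each document is a hash ref mapping feature (word) => count.
sub select_features_by_document_frequency {
    my ($documents, $keep) = @_;

    # Count how many documents contain each feature.
    my %doc_freq;
    for my $doc (@$documents) {
        $doc_freq{$_}++ for keys %$doc;
    }

    # Rank by descending document frequency; ties are broken alphabetically
    # so the result is deterministic.
    my @ranked = sort { $doc_freq{$b} <=> $doc_freq{$a} or $a cmp $b }
                 keys %doc_freq;

    $keep = @ranked if $keep > @ranked;
    my %selected = map { $_ => 1 } @ranked[0 .. $keep - 1];
    return \%selected;
}

# Hypothetical helper: drop every feature that was not selected.
sub prune_document {
    my ($doc, $selected) = @_;
    my %pruned = map { $_ => $doc->{$_} }
                 grep { $selected->{$_} } keys %$doc;
    return \%pruned;
}

# Toy training set, assumed to be already tokenized.
my @training = (
    { perl => 3, module => 1, cpan   => 1 },
    { perl => 1, regex  => 2, module => 1 },
    { perl => 2, socket => 1 },
);

my $selected = select_features_by_document_frequency(\@training, 2);
my $pruned   = prune_document($training[0], $selected);
```

The same $selected set would be applied both to the training documents and to any document being categorized, so that both sides work with the same reduced vocabulary.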
The process of selecting the most important features in the training set is
called "feature selection". It is managed by the
"AI::Categorizer::KnowledgeSet" class, and you will find the details of
lib/AI/Categorizer.pm
A knowledge set is encapsulated by the
C<AI::Categorizer::KnowledgeSet> class. Before you can start playing
with categorizers, you will have to start playing with knowledge sets,
so that the categorizers have some data to train on. See the
documentation for the C<AI::Categorizer::KnowledgeSet> module for
information on its interface.
=head3 Feature selection
Deciding which features are the most important is a large part of the
categorization task. You cannot simply consider every word in every
training document, plus every word in the document being categorized.
There are two main reasons: first, training and categorization would
take far too long and use far too much memory; second, the significant
features of the documents would get lost in the "noise" of the
insignificant ones.
The process of selecting the most important features in the training
set is called "feature selection". It is managed by the
C<AI::Categorizer::KnowledgeSet> class, and you will find the details
lib/AI/Categorizer/Learner/Boolean.pm
return $score;
}
=head1 DESCRIPTION
This is an abstract class which turns boolean categorizers
(categorizers based on algorithms that can just provide yes/no
categorization decisions for a single document and single category)
into multi-valued categorizers. For instance, the decision tree
categorizer C<AI::Categorizer::Learner::DecisionTree> maintains a
decision tree for each category, then uses it to decide whether a
certain document belongs to the given category.
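The general pattern described above (one yes/no model per category, combined into a multi-valued answer) can be sketched as follows. This is a self-contained toy and not the implementation in this module: the package ToyBooleanLearner, its methods (create_toy_model, toy_decision, train, categorize) and the document layout are all hypothetical names chosen for illustration, and the "model" is deliberately trivial.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a boolean learner. None of these names come
# from AI::Categorizer.
package ToyBooleanLearner;

sub new {
    my ($class) = @_;
    return bless { models => {} }, $class;
}

# Category-specific model: the set of words that occur in the positive
# examples but in none of the negative ones.
sub create_toy_model {
    my ($self, $positives, $negatives) = @_;
    my %model;
    for my $doc (@$positives) { $model{$_} = 1   for @{ $doc->{words} } }
    for my $doc (@$negatives) { delete $model{$_} for @{ $doc->{words} } }
    return \%model;
}

# Yes/no decision for a single document and a single category's model.
sub toy_decision {
    my ($self, $doc, $model) = @_;
    my $hits = grep { $model->{$_} } @{ $doc->{words} };
    return $hits ? 1 : 0;
}

# Train one boolean model per category.
sub train {
    my ($self, $docs, $categories) = @_;
    for my $cat (@$categories) {
        my (@pos, @neg);
        for my $doc (@$docs) {
            if ($doc->{categories}{$cat}) { push @pos, $doc }
            else                          { push @neg, $doc }
        }
        $self->{models}{$cat} = $self->create_toy_model(\@pos, \@neg);
    }
    return $self;
}

# Multi-valued result: every category whose boolean model answers "yes".
sub categorize {
    my ($self, $doc) = @_;
    my @yes = grep { $self->toy_decision($doc, $self->{models}{$_}) }
              keys %{ $self->{models} };
    return sort @yes;
}

package main;

my @docs = (
    { words => [qw(goal match striker)],   categories => { sports   => 1 } },
    { words => [qw(election vote match)],  categories => { politics => 1 } },
    { words => [qw(stadium vote funding)], categories => { sports => 1, politics => 1 } },
);

my $learner = ToyBooleanLearner->new->train(\@docs, [qw(sports politics)]);
my @labels  = $learner->categorize({ words => [qw(striker vote)] });
```

Because each category is decided independently, a document can end up in several categories or in none, which is what makes the combined categorizer multi-valued.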
Any class that inherits from this class should implement the following
methods:
=head2 create_boolean_model()
Used during training to create a category-specific model. The type of
model you create is up to you - it should be returned as a scalar.
Whatever you return will be available to you in the