Algorithm-AM
view release on metacpan or search on metacpan
lib/Algorithm/AM/Batch.pm view on Meta::CPAN
Algorithm::AM::DataSet->import::into($target, 'dataset_from_file');
Algorithm::AM::DataSet::Item->import::into($target, 'new_item');
return;
}
sub BUILD {
my ($self, $args) = @_;
# check for invalid arguments
my $class = ref $self;
my %valid_attrs = map {$_ => 1}
Class::Tiny->get_all_attributes_for($class);
my @invalids = grep {!$valid_attrs{$_}} sort keys %$args;
if(@invalids){
croak "Invalid attributes for $class: " . join ' ',
sort @invalids;
}
if(!exists $args->{training_set}){
croak "Missing required parameter 'training_set'";
}
if(!(ref $args) || !$args->{training_set}->isa(
'Algorithm::AM::DataSet')){
croak 'Parameter training_set should be an ' .
'Algorithm::AM::DataSet';
}
for(qw(
begin_hook
begin_test_hook
begin_repeat_hook
training_item_hook
end_repeat_hook
end_test_hook
end_hook
)){
if(exists $args->{$_} and 'CODE' ne ref $args->{$_}){
croak "Input $_ should be a subroutine";
}
}
return;
}
sub classify_all {
my ($self, $test_set) = @_;
if(!$test_set || 'Algorithm::AM::DataSet' ne ref $test_set){
croak q[Must provide a DataSet to classify_all];
}
if($self->training_set->cardinality != $test_set->cardinality){
croak 'Training and test sets do not have the same ' .
'cardinality (' . $self->training_set->cardinality .
' and ' . $test_set->cardinality . ')';
}
$self->_set_test_set($test_set);
if($self->begin_hook){
$self->begin_hook->($self);
}
# save the result objects from all items, all iterations, here
my @all_results;
foreach my $item_number (0 .. $test_set->size - 1) {
if($log->is_debug){
$log->debug('Test items left: ' .
$test_set->size + 1 - $item_number);
}
my $test_item = $test_set->get_item($item_number);
# store the results just for this item
my @item_results;
if($self->begin_test_hook){
$self->begin_test_hook->($self, $test_item);
}
if($log->is_debug){
my ( $sec, $min, $hour ) = localtime();
$log->info(
sprintf( "Time: %2s:%02s:%02s\n", $hour, $min, $sec) .
$test_item->comment . "\n" .
sprintf( "0/$self->{repeat} %2s:%02s:%02s",
$hour, $min, $sec ) );
}
my $iteration = 1;
while ( $iteration <= $self->repeat ) {
if($self->begin_repeat_hook){
$self->begin_repeat_hook->(
$self, $test_item, $iteration);
}
# this sets excluded_items
my ($training_set, $excluded_items) = $self->_make_training_set(
$test_item, $iteration);
# classify the item with the given training set and
# configuration
my $am = Algorithm::AM->new(
training_set => $training_set,
exclude_nulls => $self->exclude_nulls,
exclude_given => $self->exclude_given,
linear => $self->linear,
);
my $result = $am->classify($test_item);
_log_result($result)
if($log->is_info);
if($log->is_info){
my ( $sec, $min, $hour ) = localtime();
$log->info(
sprintf(
$iteration . '/' . $self->repeat .
' %2s:%02s:%02s',
$hour, $min, $sec
)
);
}
if($self->end_repeat_hook){
lib/Algorithm/AM/Batch.pm view on Meta::CPAN
=head2 C<linear>
This is passed directly to the L<new|Algorithm::AM/new> method of
L<Algorithm::AM> during each classification in the L</classify_all>
method.
=head2 C<classify_all>
Using the analogical modeling algorithm, this method classifies
the test items in the project and returns a list of
L<Result|Algorithm::AM::Result> objects.
L<Log::Any> is used to log information about the current progress and
timing. The statistical summary, analogical set, and gang summary
(without items listed) are logged at the info level, and the full
gang summary with items listed is logged at the debug level.
Hooks are provided to the user for monitoring or modifying
classification configuration. These hooks may be passed into the
object constructor or set via one of the accessor methods.
Batch classification proceeds as follows:
call begin_hook
loop all test set items
call begin_test_hook
repeat X times, where X is specified by the "repeat" setting
call begin_repeat_hook
create a training set;
- for each item in the provided training set,
up to max_training_items
exclude the item with probability 1 - probability
exclude the item if specified via training_item_hook
classify the item with the given training set
call end_repeat_hook
call end_test_hook
call end_hook
The Batch object itself is passed to these hooks, so the user is free
to change settings such as L</probability> or L</max_training_items>,
or even add training data, at any point. Other information is passed to
these hooks as well, as detailed in the method documentation.
=head2 C<begin_hook>
$batch->begin_hook(sub {
my ($batch) = @_;
$batch->probability(.5);
});
This hook is called first thing in the L</classify_all> method, and is
given the Batch object instance.
=head2 C<begin_test_hook>
$batch->begin_repeat_hook(sub {
my ($batch, $test_item) = @_;
$batch->probability(.5);
print $test_item->comment . "\n";
});
This hook is called by L</classify_all> before any iterations of
classification start for each test item. It is provided with the Batch
object instance and the test item.
=head2 C<begin_repeat_hook>
$batch->begin_repeat_hook(sub {
my ($batch, $test_item, $iteration) = @_;
$batch->probability(.5);
print $test_item->comment . "\n";
print "I'm on iteration $iteration\n";
});
This hook is called during L</classify_all> at the beginning of each
iteration of classification of a test item. It is provided with
the Batch object instance, the test item, and the iteration number,
which will vary between 1 and the setting for L</repeat>.
=head2 C<training_item_hook>
$batch->begin_repeat_hook(sub {
my ($batch, $test_item, $iteration, $training_item) = @_;
$batch->probability(.5);
print $test_item->comment . "\n";
print "I'm on iteration $iteration\n";
if($training_item->comment eq 'include me!'){
return 1;
}else{
return 0;
}
});
This hook is called by L</classify_all> while populating a training
set during each iteration of classification. It is provided with
the Batch object instance, the test item, the iteration number, and
an item which may be included in the training set. If the return value
is true, then the item will be included in the training set; otherwise,
it will not.
=head2 C<end_repeat_hook>
$batch->begin_repeat_hook(sub {
my ($batch, $test_item, $iteration, $excluded_items, $result) = @_;
$batch->probability(.5);
print $test_item->comment . "\n";
print "I finished iteration $iteration\n";
print 'I excluded ' . scalar @$excluded_items .
" items from training\n";
print ${$result->statistical_summary};
});
This hook is called during L</classify_all> at the end of each
iteration of classification of a test item. It is provided with
the Batch object instance, the test item, the iteration number, an
array ref containing training items excluded from the training set, and
the result object returned by L<classify|Algorithm::AM/classify>.
=head2 C<end_test_hook>
$batch->begin_repeat_hook(sub {
my ($batch, $test_item, @results) = @_;
$batch->probability(.5);
print $test_item->comment . "\n";
my $iterations = @results;
my $correct = 0;
for my $result (@result){
$correct++ if $result->result ne 'incorrect';
}
print 'Item ' . $item->comment .
" correct $correct/$iterations times\n";
});
This hook is called by L</classify_all> after all classifications
of a single item are finished. It is provided with the Batch
object instance as well as a list of the
L<Result|Algorithm::AM::Result> objects returned by
L<Algorithm::AM/classify> during each iteration of classification.
=head2 C<end_hook>
$batch->end_hook(sub {
my ($batch, @results) = @_;
for my $result(@results){
print ${$result->statistical_summary};
}
});
This hook is called after all classifications are finished. It is
provided with the Batch object instance as well as a list of all of
the L<Result|Algorithm::AM::Result> objects returned by
L<Algorithm::AM/classify>.
=head1 AUTHOR
Theron Stanford <shixilun@yahoo.com>, Nathan Glenn <garfieldnate@gmail.com>
=head1 COPYRIGHT AND LICENSE
This software is copyright (c) 2021 by Royal Skousen.
This is free software; you can redistribute it and/or modify it under
the same terms as the Perl 5 programming language system itself.
=cut
( run in 0.903 second using v1.01-cache-2.11-cpan-96521ef73a4 )