ControlBreak
# 'last' values on the next iteration.
$cb->continue();
}
# simulate break at end of data, if we iterated at least once
if ($cb->iteration > 0) {
printf "%s,%s,%d%s\n", $cb->last('Country'), $cb->last('District'), $district_total, '*';
printf "%s total,%s,%d%s\n", $cb->last('Country'), '', $country_total, '**';
}
__DATA__
Canada,Alberta,Calgary,1019942
Canada,Ontario,Ottawa,812129
Canada,Ontario,Toronto,2600000
Canada,Quebec,Montreal,1704694
Canada,Quebec,Quebec City,531902
Canada,Quebec,Sherbrooke,161323
USA,Arizona,Phoenix,1640641
USA,California,Los Angeles,3919973
USA,California,San Jose,1026700
USA,Illinois,Chicago,2756546
USA,New York,New York City,8930002
USA,New York,Buffalo,281757
USA,Pennsylvania,Philadelphia,1619355
USA,Texas,Houston,2345606
DESCRIPTION
The ControlBreak module provides a class that is used to detect control
breaks; i.e. when a value changes.
Typically, the data being retrieved or iterated over is ordered and
there may be more than one value that is of interest. For example
consider a table of population data with columns for country, district
and city, sorted by country and district. With this module you can
create an object that will detect changes in the district or country,
considered level 1 and level 2 respectively. The calling program can
take action, such as printing subtotals, whenever level changes are
detected.
Ordered data is not a requirement. An example using unordered data would
be counting consecutive numbers within a data stream; e.g. 0 0 1 1 1 1 0
1 1. Using ControlBreak you can detect each change and count the
consecutive values, yielding two zeros, four 1's, one zero, and two 1's.
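The consecutive-number example can be sketched in plain Perl, without the module, to show the idea of a single-level control break. This is only an illustration under stated assumptions: count_runs is a hypothetical helper, not part of ControlBreak, and it compares values numerically. Note the explicit flush after the loop, since no break fires at the end of the data.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of ControlBreak): collapse a stream of
# numbers into [value, run_length] pairs.
sub count_runs {
    my @stream = @_;
    my @runs;
    my ($last, $count);
    for my $value (@stream) {
        if (defined $last && $value != $last) {
            push @runs, [$last, $count];    # control break: close the run
            $count = 0;
        }
        $last = $value;
        $count++;
    }
    # No break fires at end of data, so flush the final run by hand.
    push @runs, [$last, $count] if defined $last;
    return @runs;
}

print join(', ', map { "$_->[1] x $_->[0]" } count_runs(qw(0 0 1 1 1 1 0 1 1))), "\n";
# prints: 2 x 0, 4 x 1, 1 x 0, 2 x 1
```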
Note that ControlBreak cannot detect the end of your data stream. The
test() method is normally called within a loop to detect changes in
control variables, but once the last iteration is processed there are no
further calls to test() as the loop ends. It may be necessary,
therefore, to do additional processing after the loop in order to handle
the very last data group; e.g. to print a final set of subtotals.
To simplify this situation, method test_and_do() can be used in place of
test() and continue().
FIELDS
iteration
A readonly field that provides the current iteration number.
This can be useful if you are doing any final processing after an
iteration loop has ended. In the event that the data stream is empty and
there were no iterations, then you can condition your final processing
on iteration > 0.
Note that the iteration field is incremented by test() (or
test_and_do()). Therefore, when called within a loop it is effectively
zero-based if referenced within the iteration block before test() is
invoked, and then one-based after test().
level_names
A readonly field that provides a list of the level names that were
provided as arguments to new().
METHODS
new ( $level_name [, $level_name ]... )
Create a new ControlBreak object.
Arguments are user-defined names for each level, in minor to major
order. The set of names must be unique, and they must each start with a
letter or underscore, followed by any number of letters, numbers or
underscores.
A level name can also begin with a '+', which denotes that a numeric
comparison will be used for the values processed at this level.
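The naming rules above can be expressed as a single pattern. The following is a rough sketch, not the module's own validation code: parse_level_name is a hypothetical helper, "letters" is assumed to mean ASCII letters, and the uniqueness check is left out.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of ControlBreak): split an argument into
# (name, is_numeric), or return an empty list if the name is malformed.
sub parse_level_name {
    my ($arg) = @_;
    return unless $arg =~ /\A(\+?)([A-Za-z_][A-Za-z0-9_]*)\z/;
    return ($2, $1 eq '+' ? 1 : 0);
}

for my $arg (qw(District +Year 9lives)) {
    my ($name, $numeric) = parse_level_name($arg);
    if (defined $name) {
        printf "%-9s ok (%s comparison)\n", $arg, $numeric ? 'numeric' : 'default';
    }
    else {
        printf "%-9s rejected\n", $arg;
    }
}
```

Here 'District' and '+Year' are accepted, while '9lives' is rejected because it starts with a digit.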
The number of arguments to new() determines the number of control levels
that will be monitored. The variables provided to method test() must
match these levels in number and datatype.
The order of the arguments corresponds to a hierarchical level of
control, from lowest to highest; i.e. the first argument corresponds to
level 1, the second to level 2, etc. This also corresponds to sort
order, from minor to major, when iterating through a data stream.
break ( [ $level_name ] )
The break() method provides a convenient way to check whether the last
invocation of the test method resulted in a control break, or a control
break greater than or equal to the <level_name> optionally provided as
an argument.
For example, if you have levels 'City', 'State' and 'Country', and
there's a control break on level 1 (City), then invoking break() will
return 1 and therefore be treated as true within a condition. If there
was no control break, then 0 (false) is returned.
When invoked with a level name argument, break() will map the level name
to a level number and compare it to the control break level determined
by the last invocation of test(). If the tested control break level
number is equal to or higher than the argument level, then that level
number is returned and, since it will be non-zero, treated as a true
value within a condition. Otherwise, zero (false) is returned.
Ultimately the point of this is that you can use it to write a series of
actions, like printing subtotals and clearing subtotal variables, such
that a higher level control break will trigger actions associated with
lower level control breaks. For example:
my $cb = ControlBreak->new( qw/City State Country/ );
if ( $cb->break() ) {
say '=== control break detected at level: ' . $cb->levelname;
}
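The cascading behaviour described above can be illustrated without the module. This is a minimal sketch under stated assumptions: break_at is a hypothetical stand-in for break(), and $levelnum plays the role of the level number found by the most recent test().

```perl
use strict;
use warnings;

# Assumed stand-ins for the object's state: level names in minor-to-major
# order, mapped to level numbers 1, 2, 3.
my @names = qw(City State Country);
my %num_of;
@num_of{@names} = (1 .. scalar @names);

# Hypothetical analogue of break(): return the break level if it is at or
# above the named level (or if no name is given), otherwise 0.
sub break_at {
    my ($levelnum, $name) = @_;
    return $levelnum unless defined $name;
    return $levelnum >= $num_of{$name} ? $levelnum : 0;
}

my $levelnum = 2;    # pretend the State value changed
for my $name (@names) {
    print "$name subtotal\n" if break_at($levelnum, $name);
}
# prints "City subtotal" and "State subtotal", but not "Country subtotal"
```

A break at level 2 therefore triggers the level 1 action as well, which is the cascade from major to minor levels.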
Normally last() is used while iterating through a data stream. When a
level change (i.e. control break) is detected, the current data value
has changed relative to the preceding iteration. At this point it may be
necessary to take some action, such as printing a subtotal. But the
subtotal will be for the preceding group of data and the current value
belongs to the next group. The last() method allows you to access the
value for the group that was just processed so, for example, the group
name can be included on the subtotal line.
For example, if control levels were named 'X' and 'Y' and you are
iterating through data and invoking test($x, $y) at each iteration, then
invoking $cb->last('Y') on iteration 9 will return the value of $y on
iteration 8.
Note that continue() should not be invoked before last() within the
scope of an iteration loop; i.e. continue() should be the last thing
done before the next turn of the loop.
levelname
Return the level name for the most recent invocation of the test()
method.
levelnum
Return the level number for the most recent invocation of the test()
method.
level_numbers
Return a list of level numbers corresponding to the levels defined in
new(). This can be useful, for example, when you want to set up some
lexical variables for use as indexes into a list you might use to
accumulate subtotals.
my $cb = ControlBreak->new( qw( L1 L2 EOD ) );
my @totals;
my ($L1, $L2, $EOD) = $cb->level_numbers;
foreach my $sublist (@list_of_lists) {
my ($control1, $control2, $number) = $sublist->@*;
...
my $sub_totals = sub {
if ($cb->break('L1')) {
# report the L1 subtotal here
$totals[$L1] = 0; # clear the subtotal
}
...
# accumulate subtotals
map { $totals[$_] += $number } $cb->level_numbers;
};
$cb->test_and_do(
$control1,
$control2,
$cb->iteration == $#list_of_lists,
$sub_totals
);
}
reset
Resets the state of the object so it can be used again for another set
of iterations using the same number and type of controls established when
the object was instantiated with new(). Any comparisons that were
subsequently modified are retained.
test ( $var1 [, $var2 ]... )
Submits the control variables for testing against the values from the
previous iteration.
Testing is done in reverse order, from highest to lowest (major to
minor) and stops once a change is detected. Where it stops determines
the control break level. For example, if $var2 changed, method levelnum
will return 2. If $var2 did not change, but $var1 did, then method
levelnum() will return 1. If nothing changes, then levelnum() will
return 0.
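The major-to-minor comparison can be sketched as follows. This is an illustration only, not the module's implementation: break_level is a hypothetical helper, it uses string comparison for every level, and it does not model how the first iteration is handled.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of ControlBreak): compare the current
# values with the previous ones from the highest level down and return
# the first level number that changed, or 0 if nothing changed.
sub break_level {
    my ($prev, $curr) = @_;
    for my $i (reverse 0 .. $#$curr) {
        return $i + 1 if $prev->[$i] ne $curr->[$i];
    }
    return 0;
}

# Values are given in minor-to-major order: (district, country).
print break_level([qw(Ontario Canada)], [qw(Ontario Canada)]), "\n";   # 0
print break_level([qw(Ontario Canada)], [qw(Quebec Canada)]),  "\n";   # 1
print break_level([qw(Quebec Canada)],  [qw(Arizona USA)]),    "\n";   # 2
```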
Note that the level numbers set by test(...) are true if there was a
level change, and false if there wasn't. So, they can be used as a
simple boolean test of whether there was a change. Or you can use the
break() method to determine whether any control break has occurred.
Because level numbers correspond to the hierarchical data order, they
can be used to trigger multiple actions; e.g. levelnum() >= 1 could be
used to print subtotals for level 1 whenever a control break occurred
for level 1, 2 or 3. It is usually the case that higher control breaks
are meant to cascade to lower control levels and this can be achieved in
this fashion. The break() method simplifies this.
Note that method continue() must be called at the end of each iteration
in order to save the values of the iteration for the next iteration. If
not, the next test(...) invocation will croak.
test_and_do ( $var1 [, $var2 ]... $var_end, $coderef )
The test_and_do() method is similar to test(). It takes the same
arguments as test(), plus one additional argument that is an anonymous
code reference. Internally, it calls test() and then, if there is a
control break, calls the anonymous subroutine provided in the last
argument. Typically, that code will perform work related to subtotals or
other actions necessary when a control break occurs.
But test_and_do() does one other thing. It expects the last control
variable ($var_end) to be an end of data indicator, such as the perl
builtin operator eof. This indicator should return false on each
iteration over the data until the very last iteration -- when it should
change to true, thereby triggering a major control break.
What test_and_do does then is to add an extra loop. This simulates a
final record and will trigger test() to signal control breaks at all
levels. Thus, the code provided will be executed between every change of
data AND after all data has been iterated over.
This avoids the necessity of repeating the control break actions you've
put inside the data loop immediately after the loop's closing bracket.
When you just use test() and continue(), an end-of-data control break
won't occur and the simplest workaround is to just duplicate your
control break code after the loop's closing bracket.
Here's a typical use case involving end of file processing. Note the
extra control level, named 'EOF', and the use of the eof builtin
function as the second last argument of test_and_do():
my $cb = ControlBreak->new( qw( L1 L2 EOF ) );