Benchmark-Lab
use Benchmark::Lab;

my $bl = Benchmark::Lab->new;
my $context = { n => 25 };
my $res = $bl->start( "Fact", $context );
printf( "Median rate: %d/sec\n", $res->{median_rate} );
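The synopsis assumes a package named "Fact" that defines the structured benchmark. A minimal sketch is below; only "do_task" is documented in this section, so the optional hook subs shown are an assumption based on the setup/before/after/teardown feature list:

```perl
package Fact;

# Optional, untimed phases; these exact sub names are an assumption
# inferred from the setup/teardown feature description, not documented here.
sub setup    { }    # runs once, before any iterations
sub teardown { }    # runs once, after all iterations

# The timed task: compute n! iteratively for the n passed in the context.
sub do_task {
    my ($context) = @_;
    my $fact = 1;
    $fact *= $_ for 2 .. $context->{n};
    return $fact;
}

1;
```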
Analyzing results
TBD. Analysis will be added in a future release.
METHODS
new
Returns a new Benchmark::Lab object.
Valid attributes include:
* "min_secs" - minimum elapsed time in seconds; default 0
* "max_secs" - maximum elapsed time in seconds; default 300
* "min_reps" - minimum number of task repetitions; default 1; minimum
  1
* "max_reps" - maximum number of task repetitions; default 100
* "verbose" - when true, progress will be logged to STDERR; default
  false
The logic for benchmark duration is as follows:
* benchmarking always runs until both "min_secs" and "min_reps" are
satisfied
* when profiling, benchmarking stops after minimums are satisfied
* when not profiling, benchmarking stops once one of "max_secs" or
"max_reps" is exceeded.
Note that "elapsed time" for the "min_secs" and "max_secs" is wall-clock
time, not the cumulative recorded time of the task itself.
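The duration logic above can be sketched as a single predicate. This is an illustrative reconstruction of the documented stopping rule, not the module's actual internals:

```perl
# Decide whether another iteration should run, given elapsed wall-clock
# seconds, repetitions so far, an options hashref, and a profiling flag.
# Illustrative sketch of the documented stopping rule, not the real code.
sub keep_going {
    my ($elapsed, $reps, $opt, $profiling) = @_;

    # Benchmarking always runs until both minimums are satisfied.
    return 1 if $elapsed < $opt->{min_secs} || $reps < $opt->{min_reps};

    # When profiling, stop as soon as the minimums are met.
    return 0 if $profiling;

    # When not profiling, stop once either maximum is exceeded.
    return 0 if $elapsed > $opt->{max_secs} || $reps > $opt->{max_reps};

    return 1;
}
```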
start
my $result = $bl->start( $package, $context, $label );
This method executes the structured benchmark from the given $package.
The $context parameter is passed to all task phases. The $label is used
for diagnostic output to describe the benchmark being run.
If parameters are omitted, $package defaults to "main", an empty hash
reference is used for the $context, and the $label defaults to the
$package.
It returns a hash reference with the following keys:
* "elapsed" - total wall clock time to execute the benchmark
  (including non-timed portions).
* "total_time" - sum of recorded task iteration times.
* "iterations" - total number of times the "do_task" function was
  called.
* "percentiles" - hash reference with 1, 5, 10, 25, 50, 75, 90, 95 and
  99th percentile iteration times. There may be duplicates if there
  were fewer than 100 iterations.
* "median_rate" - the inverse of the 50th percentile time.
* "timing" - array reference with individual iteration times as
  (floating point) seconds.
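As a sketch of how these fields relate, using a hand-built result hash (the values are invented, not real output): "median_rate" is simply the reciprocal of the 50th-percentile time, and "total_time" is the sum over "timing".

```perl
use List::Util qw(sum);

# A hand-built result hash with the documented shape (values invented).
my $res = {
    timing      => [ 0.010, 0.012, 0.011, 0.013, 0.010 ],
    percentiles => { 50 => 0.011 },
};

my $iterations  = scalar @{ $res->{timing} };
my $total_time  = sum @{ $res->{timing} };
my $median_rate = 1 / $res->{percentiles}{50};    # inverse of p50

printf "%d iterations, %.3fs total, %.0f/sec median\n",
    $iterations, $total_time, $median_rate;
```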
CAVEATS
If the "do_task" function executes in less time than the timer
granularity, an error will be thrown. For benchmarks that do not have
before/after functions, simply repeating the function under test within
"do_task" will be sufficient.
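One way to apply this advice is to batch many calls of a cheap function inside "do_task" so each timed iteration is well above the clock granularity. The package name, helper sub, and batch size below are all illustrative:

```perl
package FastOp;

sub _cheap { my ($x) = @_; return $x * $x }    # the function under test

# Each timed iteration runs the cheap operation 1_000 times so the
# measured span comfortably exceeds the timer granularity.
sub do_task {
    my ($context) = @_;
    my $acc = 0;
    $acc += _cheap($_) for 1 .. 1_000;
    return $acc;
}

1;
```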
RATIONALE
I believe most approaches to benchmarking are flawed, primarily because
they focus on finding a *single* measurement. Single metrics are easy to
grok and easy to compare ("foo was 13% faster than bar!"), but they
obscure the full distribution of timing data and (as a result) are often
unstable.
Most of the time, people hand-wave this issue and claim that the Central
Limit Theorem (CLT) solves the problem for a large enough sample size.
Unfortunately, the CLT holds only if means and variances are finite and
some real world distributions are not (e.g. hard drive error frequencies
best fit a Pareto distribution).
Further, we often care more about the shape of the distribution than
just a single point. For example, I would rather have a process with
mean µ that stays within 0.9µ - 1.1µ than one that varies from 0.5µ -
1.5µ.
And a process that is 0.1µ 90% of the time and 9.1µ 10% of the time
(still with mean µ!) might be great or terrible, depending on the
application.
This module grew out of a desire for detailed benchmark timing data,
plus some additional features, which I couldn't find in existing
benchmarking modules:
* Raw timing data - I wanted to be able to get raw timing data, to
  allow more flexible statistical analysis of timing distributions.
* Monotonic clock - I wanted times from a high-resolution monotonic
  clock (if available).
* Setup/before/after/teardown - I wanted to be able to
  initialize/reset state not just once at the start, but before each
  iteration and without it being timed.
* Devel::NYTProf integration - I wanted to be able to run the exact
  same code I benchmarked through Devel::NYTProf, also limiting the
  profiler to the benchmark task alone, not the setup/teardown/etc.
  code.
Eventually, I hope to add some more robust graphic visualization and
statistical analyses of timing distributions. This might include both
single-point estimates (like other benchmarking modules) but also more
sophisticated metrics, like non-parametric measures for comparing
samples with unequal variances.
SEE ALSO
There are many benchmarking modules on CPAN with a mix of features that
may be sufficient for your needs. To my knowledge, none give raw timing
data.