Benchmark-MCE

 view release on metacpan or  search on metacpan

lib/Benchmark/MCE.pm  view on Meta::CPAN


Benchmark::MCE - Perl multi-core benchmarking framework

=head1 SYNOPSIS

  use Benchmark::MCE;

  # Run 2 benchmarks (custom functions) and time them on a single core:
  my %stat_single = suite_run({
    threads => 1,
    bench   => {
      Bench1 => sub { ...code1... },
      Bench2 => '...code2...'       # String is also fine
    }
  );

  # Run each across multiple cores.
  # Use the extended (arrayref) definition to check for correctness of output.
  my %stat_multi = suite_run({
    threads => system_identity(1),  # Workers count equal to system logical cores
    bench   => {
      Bench1 => [\&code1, $expected_output1],
      Bench2 => [\&code2, $expected_output2],
    }
  );
  
  # Calculate the multi/single core scalability
  my %scal = calc_scalability(\%stat_single, \%stat_multi);

=head1 DESCRIPTION

A benchmarking framework originally designed for the L<Benchmark::DKbench> multi-core
CPU benchmarking suite. Released as a stand-alone to be used for custom benchmarks
of any type, as well as other kinds of stress-testing, throughput testing etc. 

You define custom functions (usually with randomized workloads) that can be run on
any number of parallel workers, using the low-overhead Many-Core Engine (L<MCE>).

=head1 FUNCTIONS
 
=head2 C<system_identity>

 my $cores = system_identity($quiet?);

Prints out software/hardware configuration and returns the number of logical cores
detected using L<System::CPU>.

Any argument will suppress printout and will only return the number of cores.

=head2 C<suite_run>

 my %stats = suite_run(\%options);

Runs the benchmark suite given the C<%options> and prints results. Returns a hash
with run stats that looks like this:

 %stats = (
   $bench_name_1 => {times => [ ... ], scores => [ ... ]},
    ...
   _total => {times => [ ... ], scores => [ ... ]},
   _opt   => {iter => $iterations, threads => $no_threads, ...}
 );

Note that the times reported will be average times per thread (or per function
call if you prefer), however the scores reported (if a reference time is supplied)
are sums across all threads. So you expect for ideal scaling 1 thread vs 2 threads
to return the same times, double the scores.

=head3 Options:

=over 4

=item * C<bench> (HashRef, with alias C<benchmarks>) B<required>:
A hashref with keys being your unique custom benchmark names and values being
arrays:

C<< name => [ $coderef, $expected?, $ref_time?, $quick_arg?, $normal_arg? ] >>

where:

=over 4

=item * C<$coderef> B<required>:
Reference to your benchmark function. See L<BENCHMARK FUNCTIONS> for more details.

=item * C<$expected>:
Expected output of the benchmark function on successful run (for PASS/FAIL - PASS
will be always assumed is parameter is undefined).

=item * C<$ref_time>:
Reference time in seconds for score of 1000.

=item * C<$quick_arg>:
Argument to pass to the benchmark function in C<quick> mode (for workload scaling).

=item * C<$normal_arg>:
Argument to pass to the benchmark function in normal mode (for workload scaling).

=back

=item * C<threads> (Int; default 1):
Parallel benchmark threads. They are L<MCE> workers, so not 'threads' in the technical
sense. Each of the benchmarks defined will launch on each of the threads, hence the
total workload is multiplied by the number of C<threads>. Times will be averaged
across threads, while scores will be summed.

=item * C<iter> (Int; default 1):
Number of suite iterations (with min/max/avg at the end when > 1).

=item * C<include> (Regex):
Only run benchmarks whose names match regex.

=item * C<exclude> (Regex):
Skip benchmarks whose names match regex.

=item * C<filter> (CodeRef):
Custom filter callback for finer control. It receives C<($opt, $bench, $bench_def)>
and should return true to run a benchmark.

=item * C<time> (Bool):
Report time (sec) instead of score. Set to true by C<quick> or if at least one
benchmark has no reference time declared. Otherwise score output is the default.

=item * C<quick> (Bool; default 0):
Use each benchmark's quick argument and imply C<time=1>.

=item * C<scale> (Int; default 1):
Scale the bench workload (number of calls of the benchmark functions) by x times.
Forced to 1 with C<quick> or C<no_mce>.

=item * C<stdev> (Bool; default 0):
Show relative standard deviation (for C<iter> > 1).

=item * C<sleep> (Int; default 0):
Number of seconds to sleep after each benchmark run.

=item * C<duration> (Int, seconds):
Minimum duration in seconds for suite run (overrides C<iter>).

=item * C<srand> (Int; default 1):
Define a fixed seed to keep runs reproducible when your benchmark functions use
C<rand>. The seed will be passed to C<srand> before each call to a benchmark
function. Set to 0 to skip rand seeding.

=item * C<no_check> (Bool; default 0):
Do not check for Pass/Fail even if reference output is defined.

=item * C<no_mce> (Bool; default 0):
Do not run under L<MCE::Loop> (sets C<threads=1>, C<scale=1>).

=back

=head2 C<calc_scalability>

 my %scal = calc_scalability(\%stat_single, \%stat_multi, $keep_outliers?);

Given the C<%stat_single> results of a single-threaded C<suite_run> and C<%stat_multi>
results of a multi-threaded run, will calculate, print and return the multi-thread
scalability (including averages, ranges etc for multiple iterations).

Unless C<$keep_outliers> is true, the overall scalability is an average after droping
Benchmarks that are non-scaling outliers (over 2*stdev less than the mean).

The result hash return looks like this:

 %scal = (
   bench_name => $bench_avg_scalability,
    ...
   _total => $total_avg_scalability
 );


=head2 C<suite_calc>

 my ($stats, $stats_multi, $scal) = suite_calc(\%suite_run_options, $keep_outliers?);

Convenience function that combines 3 calls, L</suite_run> with C<threads=E<gt>1>,
L</suite_run> with C<threads=E<gt>system_identity(1)> and L</calc_scalability> with
the results of those two, returning hashrefs with the results of all three calls.

For single-core systems (or when C<system_identity(1)> does not return E<gt> 1)
only C<$stats> will be returned.

You can override the C<system_identity(1)> call and run the multi-thread bench with
a custom number of threads by passing C<threads =E<gt> [count]>.

=head1 BENCHMARK FUNCTIONS

The benchmark functions will be called with two parameters that you can choose to
take advantage of.
The first one is what you define as either the C<$quick_arg> or C<$normal_arg>,
with the intention being to have a way to run a C<quick> mode that lets you test with
smaller workloads. The second argument will be an integer that's the chunk number
from L<MCE::Loop> - it will be 1 for the call on the first thread, 2 from the second
thread etc, so your function may track which worker/chunk is running.

The function may return a string, usually a checksum, that will be checked against
the (optional) C<$expected> parameter to show a Pass/Fail (useful for verifying
correctness, stress testing, etc.).

Example:

  use Benchmark::MCE;
  use Math::Trig qw/:great_circle :pi/;

  sub great_circle {
    my $size  = shift || 1;  # Optionally have an argument that scales the workload
    my $chunk = shift;       # Optionally use the chunk number
    my $dist = 0;
    $dist +=
      great_circle_distance(rand(pi), rand(2 * pi), rand(pi), rand(2 * pi))
        for 1 .. $size;
    return $dist; # Returning a value is optional for the Pass/Fail functionality
  }

  my %stats = suite_run({
      bench => { 'Math::Trig' =>  # A unique name for the benchmark
        [
        \&great_circle,      # Reference to bench function

lib/Benchmark/MCE.pm  view on Meta::CPAN

    } else {
        foreach (1..$opt->{iter}) {
            _print("Iteration $_ of $opt->{iter}...\n") if $opt->{iter} > 1;
            _run_iteration($opt, \%stats);
        }
    }

    _total_stats($opt, \%stats) if $opt->{iter} > 1;

    return %stats;
}

sub calc_scalability {
    my $stats1   = shift;
    my $stats2   = shift;
    my $outliers = shift;
    my $opt      = $stats1->{_opt};
    my $opt2     = $stats2->{_opt};

    die "Different, non-zero thread count expected between runs"
        if !$opt->{threads}
        || !$opt2->{threads}
        || $opt->{threads} == $opt2->{threads};

    ($opt, $opt2) = ($stats2->{_opt}, $stats1->{_opt})
        if $opt->{threads} > $opt2->{threads};

    die "Same scale expected between runs" if $opt->{scale} != $opt2->{scale};

    my $threads = $opt2->{threads} / $opt->{threads};
    my $display = $opt->{time} ? 'times' : 'scores';

    $opt->{f} = $opt->{time} ? '%.3f' : '%5.0f';
    my ($cnt, @perf, @scal, %scal);
    _print(   "Multi thread Scalability:\n"
            . _pad("Benchmark",           24)
            . _pad("Multi perf xSingle",  24)
            . _pad("Multi scalability %", 24)
            . "\n");
    foreach my $bench (sort keys %{$stats1}) {
        next if $bench eq '_total';
        next unless $stats1->{$bench}->{times} && $stats2->{$bench}->{times};
        $cnt++;
        my @res1 = _min_max_avg($stats1->{$bench}->{times});
        my @res2 = _min_max_avg($stats2->{$bench}->{times});
        $scal{$bench} = $res1[2]/$res2[2]*100 if $res2[2];
        push @perf, $res1[2]/$res2[2]*$threads if $res2[2];
        push @scal, $scal{$bench} if $scal{$bench};
        _print(   _pad("$bench:", 24)
                . _pad(sprintf("%.2f",  $perf[-1]), 24)
                . _pad(sprintf("%2.0f", $scal[-1]), 24) . "\n")
            if @perf;
    }
    die "No bench times recorded" unless @perf;
    _print(("-"x40)."\n");
    my @avg1 = _min_max_avg($stats1->{_total}->{$display});
    my @avg2 = _min_max_avg($stats2->{_total}->{$display});
    _print(__PACKAGE__, " summary ($cnt benchmark");
    _print("s") if $cnt > 1;
    _print(" x$opt->{scale} scale")     if $opt->{scale} > 1;
    _print(", $opt->{iter} iterations") if $opt->{iter} > 1;
    _print(", $opt2->{threads} threads):\n");
    $opt->{f} .= "s" if $opt->{time};
    my $f = $opt->{time} ? '%.3f' : '%.0f';
    $f = $opt->{iter} > 1 ? "$opt->{f}\t($f - $f)" : $opt->{f};
    @avg1 =  $opt->{iter} > 1 ? ($avg1[2], $avg1[0], $avg1[1]) : ($avg1[2]);
    @avg2 =  $opt->{iter} > 1 ? ($avg2[2], $avg2[0], $avg2[1]) : ($avg2[2]);
    _print(_pad("Single:").sprintf($f, @avg1)."\n");
    _print(_pad("Multi:").sprintf($f, @avg2)."\n");
    my @newperf = $outliers ? @perf : _drop_outliers(\@perf, -1);
    my @newscal = $outliers ? @scal : _drop_outliers(\@scal, -1);
    @perf = _min_max_avg(\@newperf);
    @scal = _min_max_avg(\@newscal);
    $scal{_total} = $scal[2];
    _print(   _pad("Multi/Single perf:")
            . sprintf("%.2fx\t(%.2f - %.2f)", $perf[2], $perf[0], $perf[1])
            . "\n");
    _print(
        _pad("Multi scalability:") . sprintf(
            "%2.1f%% \t(%.0f%% - %.0f%%)", $scal[2], $scal[0], $scal[1]
            )
            . "\n"
    );

    return %scal;
}

sub _init_options {
    my $opt = shift;
    $opt->{iter}  ||= $opt->{iterations} || 1;
    $opt->{bench} ||= $opt->{benchmarks} || $opt->{extra_bench};
    die "No benchmarks defined" unless $opt->{bench} && %{$opt->{bench}};
    foreach my $b (keys %{$opt->{bench}}) {
        if (!ref($opt->{bench}->{$b})) { # string
            my $f = eval "sub { $opt->{bench}->{$b} }";
            die "Error compiling benchmark '$b': $@" if $@;
            $opt->{bench}->{$b} = $f;
        }
        $opt->{bench}->{$b} = [$opt->{bench}->{$b}]
            if ref($opt->{bench}->{$b}) eq 'CODE';    # wrap coderef
        die "Error defining benchmark '$b'"
            if ref($opt->{bench}->{$b}) ne 'ARRAY';
    }
    $opt->{threads} ||= 1;
    $opt->{scale}   ||= 1;
    ($opt->{time}, $opt->{no_check}) = (1, 1) if $opt->{quick};
    $opt->{scale} = 1 if $opt->{quick} || $opt->{no_mce};
    foreach my $arr (values %{$opt->{bench}}) {
        $opt->{time}     = 1 unless scalar(@$arr) > 2 && $arr->[2] && $arr->[2] > 0;
        $opt->{no_check} = 1 unless scalar(@$arr) > 1 && defined $arr->[1];
    }
    $opt->{f} = $opt->{time} ? '%.3f' : '%5.0f';
    $opt->{threads} = 1 if $opt->{no_mce};
}

sub _run_iteration {
    my $opt        = shift;
    my $stats      = shift;
    my $benchmarks = $opt->{bench};
    my $title      = $opt->{time} ? 'Time (sec)' : 'Score';
    _print(_pad("Benchmark")._pad($title));
    _print("Pass/Fail") unless $opt->{no_check};
    _print("\n");
    my ($total_score, $total_time, $i) = (0, 0, 0);
    foreach my $bench (sort keys %$benchmarks) {
        next if $opt->{filter} && !$opt->{filter}->($opt, $bench, $benchmarks->{$bench});
        next if $opt->{exclude} && $bench =~ /$opt->{exclude}/;
        next if $opt->{include} && $bench !~ /$opt->{include}/;
        my ($time, $res) = _mce_bench_run($opt, $benchmarks->{$bench});
        my $score =
            $benchmarks->{$bench}->[2] && $time
            ? int(1000 * $opt->{threads} * $benchmarks->{$bench}->[2] / $time + 0.5)
            : 1;
        $total_score += $score;
        $total_time  += $time;
        $i++;
        push @{$stats->{$bench}->{times}}, $time;
        push @{$stats->{$bench}->{scores}}, $score;
        my $d = $stats->{$bench}->{$opt->{time} ? 'times' : 'scores'}->[-1];
        $stats->{$bench}->{fail}++ if !$opt->{quick} && $res ne 'Pass';
        _print(_pad("$bench:")._pad(sprintf($opt->{f}, $d)));
        _print("$res") unless $opt->{no_check};
        _print("\n");
        sleep $opt->{sleep} if $opt->{sleep};
    }
    die "No benchmarks to run\n" unless $i;
    my $s = int($total_score/$i+0.5);
    _print(_pad("Overall $title: ")
            . sprintf($opt->{f} . "\n", $opt->{time} ? $total_time : $s));
    push @{$stats->{_total}->{times}}, $total_time;
    push @{$stats->{_total}->{scores}}, $s;
}

sub _mce_bench_run {
    my $opt        = shift;
    my $benchmark  = shift;
    my @bench_copy = @$benchmark;
    $bench_copy[3] = $bench_copy[4] if scalar(@bench_copy) > 3 && !$opt->{quick};
    return _bench_run(\@bench_copy, 1, $opt->{srand}) if $opt->{no_mce};

    my @stats = mce_loop {
        my ($mce, $chunk_ref, $chunk_id) = @_;
        for (@{$chunk_ref}) {
            my ($time, $res) = _bench_run(\@bench_copy, $_, $opt->{srand});
            MCE->gather([$time, $res]);
        }
    }
    (1 .. $opt->{threads} * $opt->{scale});

    my ($res, $time) = ('Pass', 0);
    foreach (@stats) {
        $time += $_->[0];
        $res = $_->[1] if $_->[1] ne 'Pass';
    }

    return $time/$opt->{threads} * $opt->{scale}, $res;
}

sub _bench_run {
    my $benchmark = shift;
    my $chunk_no  = shift;
    my $srand     = shift // 1;
    srand($srand) if $srand > 0; # For repeatability
    my $t0   = _get_time();
    my $out  = $benchmark->[0]->($benchmark->[3], $chunk_no);
    my $time = sprintf("%.3f", _get_time()-$t0);
    my $r    = !defined $benchmark->[1]
        || $out eq $benchmark->[1] ? 'Pass' : "Fail ($out)";
    return $time, $r;
}

sub _total_stats {
    my $opt     = shift;
    my $stats   = shift;
    my $display = $opt->{time} ? 'times'      : 'scores';
    my $title   = $opt->{time} ? 'Time (sec)' : 'Score';
    _print(   "Aggregates ($opt->{iter} iterations"
            . ($opt->{threads} > 1 ? ", $opt->{threads} threads" : "") . "):\n"
            . _pad("Benchmark", 24)
            . _pad("Avg $title")
            . _pad("Min $title")
            . _pad("Max $title"));
    _print(_pad("stdev %")) if $opt->{stdev};
    _print(_pad("Pass %")) unless $opt->{no_check};
    _print("\n");

    foreach my $bench (sort keys %{$opt->{bench}}) {
        next unless $stats->{$bench}->{$display};
        my $str = _calc_stats($opt, $stats->{$bench}->{$display});
        _print(_pad("$bench:",24).$str);
        _print(
            _pad(
                sprintf("%d",
                    100 * ($opt->{iter} - ($stats->{$bench}->{fail} || 0)) /
                        $opt->{iter})
            )
        ) unless $opt->{no_check};
        _print("\n");
    }

    my $str = _calc_stats($opt, $stats->{_total}->{$display});
    _print(_pad("Overall Avg $title:", 24)."$str\n");
}

sub _calc_stats {
    my $opt = shift;
    my $arr = shift;
    my $pad = shift;
    my ($min, $max, $avg) = _min_max_avg($arr);
    my $str = join '', map {_pad(sprintf($opt->{f}, $_), $pad)} ($avg,$min,$max);
    if ($opt->{stdev} && $avg) {
        my $stdev = _avg_stdev($arr);
        $stdev *= 100/$avg;
        $str .= _pad(sprintf("%0.2f%%", $stdev), $pad);
    }
    return $avg, $str;
}

sub _min_max_avg {
    my $arr = shift;
    return (0, 0, 0) unless @$arr;
    return min(@$arr), max(@$arr), sum(@$arr)/scalar(@$arr);
}

sub _avg_stdev {
    my $arr = shift;
    return (0, 0) unless @$arr;
    my $sum = sum(@$arr);
    my $avg = $sum/scalar(@$arr);
    my @sq;
    push @sq, ($avg - $_)**2 for (@$arr);
    my $dev = _min_max_avg(\@sq);
    return $avg, sqrt($dev);
}

# $single = single tail of dist curve outlier, 1 for over (right), -1 for under (left)
sub _drop_outliers {



( run in 1.753 second using v1.01-cache-2.11-cpan-96521ef73a4 )