C-Blocks

bench/c-blocks-vs-inline.pl


sub c_blocks_sub_Nth_prime {
	my $N = shift;
	my $to_return;
	cblock { sv_setiv($to_return, get_Nth_prime(SvIV($N))); }
	return $to_return;
}

use Time::HiRes qw(gettimeofday tv_interval);

my $N_iterations = 1000;
for my $log_N (1, 1.5, 2, 2.5, 3, 3.5, 4) {
	my $N = int(10**$log_N);
	print "--- N = $N ---\n";
	
	# C::Blocks test
	my $C_Blocks_accum = 0;
	my $C_Blocks_result;
	for (1 .. $N_iterations) {
		my $t0 = [gettimeofday];
		cblock { sv_setiv($C_Blocks_result, get_Nth_prime(SvIV($N))); }
		my $elapsed = tv_interval ($t0);
		$C_Blocks_accum += $elapsed;
	}
	my $C_Blocks_time = $C_Blocks_accum / $N_iterations;
	
	# C::Blocks sub test
	my $C_Blocks_sub_accum = 0;
	my $C_Blocks_sub_result;
	for (1 .. $N_iterations) {
		my $t0 = [gettimeofday];
		$C_Blocks_sub_result = c_blocks_sub_Nth_prime($N);
		my $elapsed = tv_interval ($t0);
		$C_Blocks_sub_accum += $elapsed;
	}
	my $C_Blocks_sub_time = $C_Blocks_sub_accum / $N_iterations;
	
	# Inline::C test
	my $Inline_C_accum = 0;
	my $Inline_C_result;
	for (1 .. $N_iterations) {
		my $t0 = [gettimeofday];
		$Inline_C_result = get_Nth_prime($N);
		my $elapsed = tv_interval ($t0);
		$Inline_C_accum += $elapsed;
	}
	my $Inline_C_time = $Inline_C_accum / $N_iterations;
	
	print "C::Blocks/sub took $C_Blocks_sub_accum seconds, $C_Blocks_sub_time on average\n";
	print "C::Blocks took $C_Blocks_accum seconds, $C_Blocks_time on average\n";
	print "Inline::C took $Inline_C_accum seconds, $Inline_C_time on average\n";
	print "C::Blocks gave $C_Blocks_result; C::Blocks/sub gave $C_Blocks_sub_result; Inline::C gave $Inline_C_result\n";
}

__END__

__C__

lib/C/Blocks.pm

traditional C code, making it straightforward but lengthy. 
C::Blocks has the upper hand in execution rate (always faster than 
L<PDL>, though never by more than a factor of two) and in predictable 
scaling (almost perfectly linear in system size, versus slightly nonlinear 
behavior in the PDL implementation). I'd say the number of lines of code
is the primary deciding factor here, but the trade-off might fall
differently for more complicated calculations.

The calculation of the Mandelbrot set provides a very interesting 
benchmark. The algorithm involves a loop that has a fixed maximum 
number of iterations, but which can exit early once the point being 
tested is known to escape. This exit-early behavior knocks PDL out of 
the race: there's no good way to implement it in PDL short of writing 
a low-level implementation.
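
To make the exit-early loop concrete, here is a minimal sketch of a standard escape-time iteration in plain C. It is not the code used in the benchmark. The function name C<mandel_iterations>, its parameters, and the bailout radius of 2 are assumptions chosen for illustration.

```c
/* Hypothetical helper, not taken from the C::Blocks benchmark.
 * Counts iterations of z -> z*z + c, starting from z = 0, until
 * |z| exceeds 2 or max_iter iterations have been performed. */
int mandel_iterations(double c_re, double c_im, int max_iter)
{
    double z_re = 0.0, z_im = 0.0;
    int i;
    for (i = 0; i < max_iter; i++) {
        double re2 = z_re * z_re;
        double im2 = z_im * z_im;
        /* Early exit: once |z|^2 > 4 the orbit is known to escape. */
        if (re2 + im2 > 4.0) break;
        /* z = z*z + c, written out in real and imaginary parts. */
        z_im = 2.0 * z_re * z_im + c_im;
        z_re = re2 - im2 + c_re;
    }
    /* i == max_iter means the orbit stayed bounded for the whole loop. */
    return i;
}
```

Each point needs a different number of passes through the loop, so the per-element work is data-dependent. That is the property that is awkward to express with whole-array operations.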

The comparison between C::Blocks and PDL can best be summarized thus. 
If you have a very small dataset, fewer than 1000 elements, C::Blocks 
will out-perform PDL due to PDL's costly method launch mechanism. If 
you have multiple tightly nested for-loops, where operations within the 
for-loops are based on the indices, then C::Blocks will likely give you 
a competitive computation rate, at the cost of many more lines of code. 
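
As an illustration of the index-based nested loops mentioned above, the following sketch fills a row-major grid with a value computed from the loop indices. The function name C<fill_index_grid> and the formula are hypothetical, and nothing here comes from the actual benchmarks.

```c
#include <stddef.h>

/* Hypothetical example: fill a rows-by-cols row-major grid so that
 * element (i, j) holds i*i + j*j, a value that depends only on the
 * loop indices. */
void fill_index_grid(double *out, size_t rows, size_t cols)
{
    for (size_t i = 0; i < rows; i++) {
        for (size_t j = 0; j < cols; j++) {
            out[i * cols + j] = (double)(i * i + j * j);
        }
    }
}
```

The C version is a few lines longer than an array expression would be, which is the lines-of-code trade-off described above.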


