longer to compile, but the resulting machine code would be more 
efficient. When can you expect a C::Blocks solution to be a good 
choice?

=head2 When not to use C::Blocks

Don't rewrite an existing XS module using C::Blocks. A C::Blocks API to 
your XS code might be useful, but don't rewrite mature XS code. 
C::Blocks can save you from the effort of producing a new XS 
distribution, but if you've already put in that effort, don't throw it 
away.

Don't replace a handful of Perl statements with their C-API 
equivalents. Perl's core has been pretty highly optimized and is 
compiled at high optimization levels. At best, you'll get incremental
performance gains, and they will likely come at the expense of many
additional lines of code. This probably isn't worth it.

Don't discount the cost of marshalling Perl data into C data. Obtaining 
C representations of your data will always cost you at least a few 
clock cycles, and it will usually add lines of code, too. You're likely 
to see the best performance benefits if you can marshall the data as 
early as possible and use that C-accessible data many times over. For 
example, if you have a data-parsing stage in which you build a complex 
data structure representing that data, try to build a C structure 
instead of a Perl structure at parse time. All future operations will
have access to the C representation.
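
To make the parse-time suggestion concrete, here is a minimal sketch in plain C, assuming the input is a string of whitespace-separated numbers. The C<dataset> struct and the C<dataset_parse> and C<dataset_free> functions are hypothetical names invented for this illustration, not part of C::Blocks, and the glue that would hand the resulting pointer back to Perl is deliberately omitted.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical container: parsed once, then reused by every later step. */
typedef struct {
    size_t  n;       /* number of parsed values */
    double *values;  /* heap-allocated array holding n values */
} dataset;

/* Parse whitespace-separated numbers into a dataset.
 * Returns NULL if memory allocation fails. */
dataset *dataset_parse(const char *text) {
    assert(text != NULL);
    dataset *ds = malloc(sizeof *ds);
    if (ds == NULL) return NULL;
    ds->n = 0;
    ds->values = NULL;

    size_t capacity = 0;
    const char *cursor = text;
    for (;;) {
        char *end;
        double v = strtod(cursor, &end);
        if (end == cursor) break;  /* nothing more could be parsed */
        if (ds->n == capacity) {
            /* Grow the array geometrically to keep appends cheap. */
            size_t new_cap = capacity ? 2 * capacity : 16;
            double *grown = realloc(ds->values, new_cap * sizeof *grown);
            if (grown == NULL) {
                free(ds->values);
                free(ds);
                return NULL;
            }
            ds->values = grown;
            capacity = new_cap;
        }
        ds->values[ds->n++] = v;
        cursor = end;
    }
    return ds;
}

/* Release a dataset created by dataset_parse. */
void dataset_free(dataset *ds) {
    if (ds != NULL) {
        free(ds->values);
        free(ds);
    }
}
```

The marshalling cost is paid once, inside C<dataset_parse>. Later calculations can walk C<values> directly without converting any Perl scalars.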

=head2 C::Blocks vs Perl and PDL

In what follows, I assume you have already marshalled your data into a
C data structure, like an array or a struct.

C<C::Blocks> outperforms Perl on O(N) numeric calculations on arrays, 
often by a factor greater than 10. (An O(N) calculation is any
algorithm that only needs to examine each data point once, so its
running time should scale linearly with the number of data points.) In fact,
C<C::Blocks> is competitive with L<PDL> in such calculations. 
C<C::Blocks> requires more lines of code, though. For a calculation of 
the average of a dataset, L<PDL> uses only one line, a Perl 
implementation uses three, and C::Blocks uses 14. What you gain in 
speed you lose in lines of code.
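
For a sense of what the C side of such an average looks like, here is a minimal sketch in plain C, assuming the data has already been marshalled into an array of doubles. This is not the benchmarked code; the function name C<average> and its signature are assumptions made for illustration, and the C::Blocks wrapping needed to call it from Perl is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n doubles; visits each element exactly once, so O(N). */
double average(const double *data, size_t n) {
    assert(data != NULL && n > 0);  /* the mean of nothing is undefined */
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += data[i];
    }
    return sum / (double)n;
}
```

Even this bare version takes several lines for what is a single expression in PDL, which is the trade-off described above.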

Another interesting comparison between L<PDL> and C::Blocks is the
calculation of the Euclidean distance for an N-dimensional vector, where N
scales from very small to very large numbers. The calculation is always
O(N), but is more complex than the simple average already discussed,
and not explicitly implemented as a low-level L<PDL> routine. The
L<PDL> implementation is only a single very readable line, highlighting
L<PDL>'s expressiveness. The C::Blocks implementation is 14 lines of
traditional C code, making it straightforward but lengthy. The
C::Blocks implementation has the upper hand in execution rate (always
faster than L<PDL>, though never by more than a factor of two) and in
predictable scaling (almost perfectly linear in system size, vs slightly
nonlinear behavior in the PDL implementation). I'd say the number of lines
of code is the primary deciding factor here, but the trade-off might fall
differently for more complicated calculations.

The calculation of the Mandelbrot set provides a very interesting
benchmark. The algorithm involves a loop that has a fixed maximum
number of iterations, but which can exit early once the calculation
is known to diverge. This exit-early algorithm knocks PDL out of the race.
There's no good way to implement this in PDL short of writing a
low-level implementation.
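
The early-exit loop in question can be sketched in plain C as follows. This is a minimal illustration of the standard escape-time iteration for a single point, not the benchmark code itself. The function name C<escape_time> and its parameters are assumptions, and the surrounding loops over the image grid are omitted.

```c
/* Count iterations of z -> z*z + c, starting from z = 0, until |z| exceeds 2
 * or max_iter is reached. (cr, ci) are the real and imaginary parts of c. */
int escape_time(double cr, double ci, int max_iter) {
    double zr = 0.0, zi = 0.0;
    int i;
    for (i = 0; i < max_iter; i++) {
        double zr2 = zr * zr;
        double zi2 = zi * zi;
        /* Early exit: once |z|^2 > 4 the orbit is certain to diverge. */
        if (zr2 + zi2 > 4.0) break;
        zi = 2.0 * zr * zi + ci;  /* imaginary part, using the old zr */
        zr = zr2 - zi2 + cr;      /* real part, using the old squares */
    }
    return i;
}
```

Each point stops iterating at a different step. A scalar C loop handles that naturally, whereas a whole-array formulation does not.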

The comparison between C::Blocks and PDL can best be summarized as follows.
If you have a very small dataset, fewer than 1000 elements, C::Blocks
will outperform PDL due to PDL's costly method launch mechanism. If
you have multiple tightly nested for-loops, where operations within the 
for loops are based on the indices, then C::Blocks will likely give you 
a competitive computation rate, at the cost of many more lines of code. 
If those for-loops have the possibility of an early exit, PDL may run 
significantly slower than C::Blocks, and may even run slower than pure 
Perl. Finally, if you have image manipulations or calculations, PDL is 
almost certainly the better tool, as it has a lot of low-level image 
manipulation routines already.

=head2 C::Blocks vs Graph

I have not had the opportunity to write and run additional benchmarks 
for C::Blocks. The next obvious choice would be a comparison with
L<Graph>, but I have not yet endeavored to produce those calculations.

=head1 KEYWORDS

The way that C<C::Blocks> provides these functionalities is through lexically
scoped keywords: C<cblock>, C<clex>, C<cshare>, and C<csub>. These keywords
precede a block of C code enclosed in curly brackets. Because these use the
Perl keyword API, they parse the C code during Perl's parse stage, so any
errors in your C code will be caught at parse time, not at run time.

=over

=item cblock { code }

C code contained in a C<cblock> gets wrapped into a special type of C function
and compiled during the compilation stage of the surrounding Perl code. The
resulting function is inserted into the Perl op tree at the precise location of
the block and is called when the interpreter reaches this part of the code.

The code in a C<cblock> is wrapped into a function, so function and struct
declarations are not allowed. Also, variable declarations and preprocessor
definitions are confined to the C<cblock> and will not be present in later
C<cblock>s. For that sort of behavior, see C<clex>.

By default, variables with C<$> sigils are interpreted as referring to
the C<SV*> representing the variable in the current lexical scope. The
exception is when a variable is declared with a type, a la

 my Class::Name $thing;

Here C<Class::Name> specifies the type of C<$thing>. The package
C<Class::Name> has information used by C::Blocks to unpack and repack
C<$thing> into the appropriate C data type. Implementation details are
discussed under L<C::Blocks/TYPES>.

Note: If you need to leave a C<cblock> early, you should use a C<return>
statement without any arguments. Be aware that this will bypass any
cleanup code provided by types.

=item clex { code }