GCC-Builtins
/* inputs */
: [input] "mr" (input)
/* clobbers: we are messing with these registers: */
: "eax"
);
// return an arrayref of the two outputs
AV* ret = newAV();
sv_2mortal((SV*)ret);
av_push(ret, newSViv(num_leading_zeros));
av_push(ret, newSViv(mssb));
return ret;
}
You can also inline assembly in your Perl code with Inline::ASM.
Be advised that GCC builtins also compile down to assembly code. In
fact, the above assembly code is how GCC implements clz(). So inline
assembly and GCC::Builtins should yield more or less the same
performance gain.
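To illustrate the relationship between the two approaches, here is a minimal C sketch that computes the same quantity both ways. It assumes an x86 target and the bsr (bit scan reverse) instruction, neither of which is confirmed by the part of the listing shown above. On other targets it simply falls back to the builtin. The function names are hypothetical and are not part of GCC::Builtins.

```c
#include <assert.h>
#include <limits.h>

/* Width of unsigned int in bits (32 on common platforms). */
#define UINT_BITS ((int)(sizeof(unsigned int) * CHAR_BIT))

/* Index of the most significant set bit, counted from bit 0 (the least
   significant bit), computed with the GCC builtin. x must be nonzero
   because __builtin_clz(0) is undefined. */
static int msb_index_builtin(unsigned int x)
{
    assert(x != 0);
    return UINT_BITS - 1 - __builtin_clz(x);
}

/* The same index via inline assembly. Assumption: on x86 the bsr
   instruction yields this index directly; elsewhere use the builtin. */
static int msb_index_asm(unsigned int x)
{
    assert(x != 0);
#if defined(__i386__) || defined(__x86_64__)
    unsigned int idx;
    __asm__("bsrl %1, %0" : "=r"(idx) : "rm"(x) : "cc");
    return (int)idx;
#else
    return msb_index_builtin(x);
#endif
}
```

Both functions should return the same value for any nonzero input, which is the sense in which the two approaches are expected to perform similarly.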
TESTING
For each exported sub there is a corresponding auto-generated test
file. The test goes as far as loading the library and calling the
function from Perl.
However, the expected results in these tests may contain errors,
because they were not verified against a C test program.
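A C check of the kind mentioned above could compare each builtin against a straightforward reference implementation. The sketch below does this for __builtin_popcount() and __builtin_ctz(). The helper names are hypothetical and are not part of the distribution.

```c
#include <assert.h>

/* Reference popcount: count set bits one at a time. */
static int popcount_reference(unsigned int x)
{
    int n = 0;
    while (x != 0) {
        n += (int)(x & 1u);
        x >>= 1;
    }
    return n;
}

/* Reference ctz: count zero bits from the least significant end.
   x must be nonzero, matching the builtin's precondition. */
static int ctz_reference(unsigned int x)
{
    assert(x != 0);
    int n = 0;
    while ((x & 1u) == 0) {
        n++;
        x >>= 1;
    }
    return n;
}

/* Returns 1 when both builtins agree with their references for x. */
static int builtins_agree(unsigned int x)
{
    assert(x != 0);
    return __builtin_popcount(x) == popcount_reference(x)
        && __builtin_ctz(x) == ctz_reference(x);
}
```

Values produced this way could then serve as the expected results in the Perl test files.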
BENCHMARKS
Counting leading zeros (clz) is used to benchmark the GCC builtin
__builtin_clz() against a pure Perl implementation suggested by Perl
Monk coldr3ality <https://perlmonks.org/?node_id=1232041> in this
discussion <https://perlmonks.org/?node_id=11158279>.
clz(), operating on the binary representation of a number, counts the
zero bits starting from the most significant end until it finds the
first bit set to 1. The result is therefore the zero-based position of
the most significant set bit, counted from the most significant end.
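The definition above can be sketched in C as a plain loop and compared with the builtin. This is an illustrative reference only, not the pure Perl implementation used in the benchmark, and the function names are hypothetical. Note that __builtin_clz() is undefined for an input of zero, so both functions require a nonzero argument.

```c
#include <assert.h>
#include <limits.h>

/* Count zero bits from the most significant end until the first set
   bit is found. x must be nonzero. */
static int clz_reference(unsigned int x)
{
    assert(x != 0);
    const int width = (int)(sizeof(unsigned int) * CHAR_BIT);
    unsigned int mask = 1u << (width - 1);  /* start at the top bit */
    int n = 0;
    while ((x & mask) == 0) {
        n++;
        mask >>= 1;
    }
    return n;
}

/* The same count using the GCC builtin. x must be nonzero because
   __builtin_clz(0) is undefined. */
static int clz_builtin(unsigned int x)
{
    assert(x != 0);
    return __builtin_clz(x);
}
```

For a 32-bit unsigned int, the value 1 has 31 leading zeros and a value with its top bit set has none.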
The benchmarks favour the GCC builtin __builtin_clz() which is about
twice as fast as the pure Perl implementation.
The benchmarks can be run with make benchmarks. An easy way to let Perl
fetch and unpack the distribution for you is to use cpanm to open a
shell:

    cpanm --look GCC::Builtins

and then:

    perl Makefile.PL && make all && make test && make benchmarks
The following benchmark results indicate that using GCC::Builtins
(clz() and clzl() in this case) yields a gain of more than 100% (a call
rate 110% to 120% higher) over equivalent pure Perl code:

    Benchmark: timing 50000000 iterations of clz/xs, clz/pp-ugly...
         clz/xs: 3.92331 wallclock secs ( 3.92 usr + 0.00 sys = 3.92 CPU) @ 12755102.04/s (n=50000000)
    clz/pp-ugly: 8.24574 wallclock secs ( 8.23 usr + 0.00 sys = 8.23 CPU) @ 6075334.14/s (n=50000000)

                      Rate clz/pp-ugly      clz/xs
    clz/pp-ugly  6075334/s          --        -52%
    clz/xs      12755102/s        110%          --

    KEY:
      clz/xs      : calling GCC builtin clz() via XS from Perl
      clz/pp-ugly : as suggested by coldr3ality (see https://perlmonks.org/?node_id=11158279)

    Benchmark: timing 50000000 iterations of clzl/xs, clzl/pp-ugly...
         clzl/xs: 3.84597 wallclock secs ( 3.84 usr + 0.00 sys = 3.84 CPU) @ 13020833.33/s (n=50000000)
    clzl/pp-ugly: 8.44006 wallclock secs ( 8.43 usr + 0.00 sys = 8.43 CPU) @ 5931198.10/s (n=50000000)

                       Rate clzl/pp-ugly      clzl/xs
    clzl/pp-ugly  5931198/s           --         -54%
    clzl/xs      13020833/s         120%           --

    KEY:
      clzl/xs      : calling GCC builtin clzl() via XS from Perl
      clzl/pp-ugly : as suggested by coldr3ality (see https://perlmonks.org/?node_id=11158279)

So, it pays to use this module if performance is an issue.
CAVEATS
If you observe weird return values or core dumps, it is very likely
that the fault is mine, made while compiling the XS typemap. The
typemap file in the distribution was compiled by me to translate C's
data types into Perl's, and for some of them I am not sure what the
right type is. For example, is C's uint_fast16_t equivalent to Perl's
T_UV? How about C's long double mapping to Perl's T_DOUBLE and unsigned
long long to T_U_LONG?
Please report
<https://rt.cpan.org/NoAuth/ReportBug.html?Queue=GCC-Builtins> any
corrections.
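One way to investigate the typemap questions above on a given platform is to compare the ranges of the C types involved. The sketch below rests on assumptions that should be checked against the typemap actually in use: that T_U_LONG passes values through C's unsigned long, that T_DOUBLE passes them through C's double, and that the UV behind T_UV is 64 bits wide. The helper names are hypothetical.

```c
#include <assert.h>
#include <float.h>
#include <limits.h>
#include <stdint.h>

/* Assumption: T_U_LONG converts through unsigned long. Returns 1 when
   every unsigned long long value fits in an unsigned long. */
static int ull_fits_in_ulong(void)
{
    return ULLONG_MAX <= ULONG_MAX;
}

/* Assumption: T_DOUBLE converts through double. Returns 1 when long
   double has no more precision or exponent range than double. */
static int long_double_fits_in_double(void)
{
    return LDBL_MANT_DIG <= DBL_MANT_DIG && LDBL_MAX_EXP <= DBL_MAX_EXP;
}

/* Assumption: the UV behind T_UV is 64 bits wide. Widens a
   uint_fast16_t and checks that the round trip is lossless. */
static uint64_t widen_uint_fast16(uint_fast16_t v)
{
    uint64_t wide = (uint64_t)v;
    assert((uint_fast16_t)wide == v);
    return wide;
}
```

Where one of the first two functions returns 0, the corresponding mapping can truncate or round values on that platform.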
Note that lib/GCC/Builtins.pm, lib/GCC/Builtins.xs and typemap are
auto-generated by the above scripts. Do not edit them. Edit
sbin/build-gcc-builtins-package.pl instead.
AUTHOR
Andreas Hadjiprocopis, <bliako ta cpan.org / andreashad2 ta gmail.com>
BUGS
Please report any bugs or feature requests to bug-gcc-builtins at
rt.cpan.org, or through the web interface at
https://rt.cpan.org/NoAuth/ReportBug.html?Queue=GCC-Builtins. I will be
notified, and then you'll automatically be notified of progress on your
bug as I make changes.
SUPPORT
You can find documentation for this module with the perldoc command.

    perldoc GCC::Builtins

You can also look for information at:
* RT: CPAN's request tracker (report bugs here)
https://rt.cpan.org/NoAuth/Bugs.html?Dist=GCC-Builtins
* Review this module at PerlMonks
https://www.perlmonks.org/?node_id=21144
* Search CPAN
https://metacpan.org/release/GCC-Builtins