HTML-Inspect


xt/benchmark_collectOpenGraph.pl

Here is the output on my computer.

1. Initial state
Values higher than 3617.11/s (n=10996) are better
    timethis for 3:  3 wallclock secs ( 2.94 usr +  0.10 sys =  3.04 CPU) @ 3617.11/s (n=10996)

Below is the benchmark.

2. With precompiled XPath queries. Not much faster than hard-coded arguments

    Benchmark: timing 400000 iterations of HARD_XPATH, PRECOMPILED_XPATH, RAW_XPATH...
    HARD_XPATH: 20 wallclock secs (20.05 usr +  0.01 sys = 20.06 CPU) @ 19940.18/s (n=400000)
    PRECOMPILED_XPATH: 19 wallclock secs (19.14 usr +  0.01 sys = 19.15 CPU) @ 20887.73/s (n=400000)
     RAW_XPATH: 20 wallclock secs (20.10 usr +  0.00 sys = 20.10 CPU) @ 19900.50/s (n=400000)
=cut

my $html = slurp("$Bin/../t/data/open-graph-protocol-examples/video-movie.html");
timethis(
    -3,
    sub {
        HTML::Inspect->new(location => 'http://example.com/doc', html_ref => \$html)->collectOpenGraph();
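
A minimal sketch of what "precompiled XPath queries" can mean with XML::LibXML, which the other benchmarks in this distribution load directly. It is an illustration only, not the code inside HTML::Inspect: the sample document, the `$OG_META` variable and the `og_properties` helper are all hypothetical. The expression is compiled once with `XML::LibXML::XPathExpression->new` and the compiled object is then passed to `findnodes` instead of a string.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical miniature document; the benchmark above reads a file from t/data.
my $html = <<'HTML';
<html><head>
<meta property="og:title" content="The Rock">
<meta property="og:type" content="video.movie">
<meta name="description" content="not Open Graph">
</head><body></body></html>
HTML

my $dom = XML::LibXML->load_html(string => $html, recover => 2);

# Compile the expression once; every later call reuses the compiled object.
my $OG_META = XML::LibXML::XPathExpression->new('//meta[starts-with(@property, "og:")]');

# Hypothetical helper: returns the property names of all og:* meta elements.
sub og_properties {
    my ($doc) = @_;
    return map { $_->getAttribute('property') } $doc->findnodes($OG_META);
}

print join(', ', og_properties($dom)), "\n";
```

Given how close the three timings above are, compiling in advance probably matters only for very hot code paths.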

xt/benchmark_hash_or_var.pl

use strict;
use warnings;
use utf8;
use Benchmark;

=pod

Which is faster, accessing a variable or a hash value by key?
Here is the output on my computer:

Benchmark: timing 90000000 iterations of HASH_VALUE, LOCAL_VAR...
HASH_VALUE:  2 wallclock secs ( 1.67 usr +  0.00 sys =  1.67 CPU) @ 53892215.57/s (n=90000000)
 LOCAL_VAR: -1 wallclock secs ( 0.33 usr +  0.00 sys =  0.33 CPU) @ 272727272.73/s (n=90000000)
            (warning: too few iterations for a reliable count)

Below is the benchmark.

=cut

my $hash = {a => 1, b => 2, c => 3, d => 4, e => 5};
my $c    = $hash->{c};
timethese(
    90000000,
    {
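
A rough, self-contained sketch of this kind of comparison, assuming the two cases are simply "read `$hash->{c}`" and "read `$c`". The bodies of the two subs and the `$sink` variable are assumptions for illustration, not the author's code. The iteration count is kept small so the sketch finishes quickly, and for code this cheap Benchmark may report the same "too few iterations" warning that appears in the output above.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);

my $hash = {a => 1, b => 2, c => 3, d => 4, e => 5};
my $c    = $hash->{c};
my $sink;    # hypothetical target so that each sub does an assignment

# timethese returns a hash reference of Benchmark objects keyed by case name.
my $results = timethese(
    1_000_000,
    {
        HASH_VALUE => sub { $sink = $hash->{c} },
        LOCAL_VAR  => sub { $sink = $c },
    }
);

printf "%s ran %d iterations\n", $_, $results->{$_}->iters for sort keys %$results;
```

Both operations are so cheap that the measured difference is likely dominated by the overhead of calling the sub, so the absolute rates should be read with caution.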

xt/benchmark_trim.pl

use warnings;
use utf8;
use Benchmark;

=pod

Is C<$string =~ s/^\s?(.*?)\s?$/$1/s;> faster than C<$string =~ s/^\s//s; $string =~ s/\s$//s;>?
The regexes are specific to our special case, where we know the string has at most one leading and one trailing whitespace character.
Here is the output on my computer:

    Benchmark: timing 5000000 iterations of COPY_TRIMMED, STRIPSS...
    COPY_TRIMMED:  1 wallclock secs ( 0.40 usr +  0.00 sys =  0.40 CPU) @ 12500000.00/s (n=5000000)
         STRIPSS:  2 wallclock secs ( 1.15 usr +  0.00 sys =  1.15 CPU) @ 4347826.09/s (n=5000000)


New iteration (The winner is definitely STRIPGRSZ):
    16:24:21|berov@kb-S340:HTML-Inspect$ perl xt/benchmark_trim.pl 
    Benchmark: timing 5000000 iterations of COPY_TRIMMED, STRIPGRSZ, STRIPSS...
    COPY_TRIMMED: 17 wallclock secs (18.33 usr +  0.00 sys = 18.33 CPU) @ 272776.87/s (n=5000000)
     STRIPGRSZ: 10 wallclock secs (10.50 usr +  0.00 sys = 10.50 CPU) @ 476190.48/s (n=5000000)
       STRIPSS: 12 wallclock secs (11.22 usr +  0.00 sys = 11.22 CPU) @ 445632.80/s (n=5000000)

Below is the benchmark.

=cut

my $str = " some 
       qqqq
other        multi-spase
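
A minimal sketch comparing only the two regex variants quoted in the POD above. The helper names `trim_one_pass` and `trim_two_pass`, the case labels and the sample string are hypothetical, and the STRIPGRSZ variant is not reproduced because its code is not shown here. The sketch first checks that both variants return the same string, then times them.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);

# Hypothetical helper names; only the two regexes come from the POD above.
sub trim_one_pass {
    my ($string) = @_;
    $string =~ s/^\s?(.*?)\s?$/$1/s;    # capture everything between the optional spaces
    return $string;
}

sub trim_two_pass {
    my ($string) = @_;
    $string =~ s/^\s//s;                # drop one leading whitespace character
    $string =~ s/\s$//s;                # drop one trailing whitespace character
    return $string;
}

my $sample = " some\n  multi-line text ";

# Both variants should agree before their speed is compared.
die "variants disagree" unless trim_one_pass($sample) eq trim_two_pass($sample);

my $results = timethese(
    200_000,
    {
        ONE_PASS => sub { trim_one_pass($sample) },
        TWO_PASS => sub { trim_two_pass($sample) },
    }
);
```

Which variant wins may depend on the input length and the Perl version, as the two differing runs above suggest.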

xt/benchmark_xpath.pl

#use Test::More;
use XML::LibXML;
use TestUtils qw(slurp);
use Benchmark;

=pod

Using XPath expressions is faster than getElementsBy*, especially for filtering (roughly 36% more iterations per second in both runs below).
Here is the output on my computer:

    Benchmark: timing 200000 iterations of DOM, XPATH...
           DOM:  8 wallclock secs ( 7.89 usr +  0.00 sys =  7.89 CPU) @ 25348.54/s (n=200000)
         XPATH:  6 wallclock secs ( 5.78 usr +  0.00 sys =  5.78 CPU) @ 34602.08/s (n=200000)
    Benchmark: timing 200000 iterations of DOM2, XPATH2...
          DOM2:  7 wallclock secs ( 6.77 usr +  0.00 sys =  6.77 CPU) @ 29542.10/s (n=200000)
        XPATH2:  5 wallclock secs ( 4.98 usr +  0.00 sys =  4.98 CPU) @ 40160.64/s (n=200000)

Below is the benchmark.

=cut

my $dom = XML::LibXML->load_html(
    string            => \(slurp("$Bin/../t/data/collectOpenGraph.html")),
    recover           => 2,
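
A small sketch of the two styles being compared, assuming the filtering task is "find `meta` elements whose `property` starts with `og:`". That task, the sample document and the helper names `og_via_dom` and `og_via_xpath` are assumptions for illustration; the actual benchmark cases are not shown in this listing. The DOM variant collects all `meta` elements and filters them in Perl, while the XPath variant lets libxml2 do the filtering.

```perl
use strict;
use warnings;
use XML::LibXML;
use Benchmark qw(timethese);

# Hypothetical miniature document standing in for t/data/collectOpenGraph.html.
my $html = <<'HTML';
<html><head>
<meta property="og:title" content="Example">
<meta property="og:type" content="website">
<meta name="keywords" content="none">
</head><body></body></html>
HTML

my $dom = XML::LibXML->load_html(string => $html, recover => 2);

# DOM style: fetch every meta element, then filter in Perl.
sub og_via_dom {
    my ($doc) = @_;
    return grep { ($_->getAttribute('property') // '') =~ /^og:/ }
        $doc->getElementsByTagName('meta');
}

# XPath style: the predicate does the filtering inside libxml2.
sub og_via_xpath {
    my ($doc) = @_;
    return $doc->findnodes('//meta[starts-with(@property, "og:")]');
}

my $results = timethese(
    20_000,
    {
        DOM   => sub { my @nodes = og_via_dom($dom) },
        XPATH => sub { my @nodes = og_via_xpath($dom) },
    }
);
```

On a document this small the gap may be narrower than in the output above; the sketch is meant to show the shape of the comparison, not to reproduce the numbers.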


