HTTP-LoadGen
# start main processing and wait for them to finish
%result=%{HTTP::LoadGen::start_proc $handle};
# thread management
# create a collection of threads
$sem=HTTP::LoadGen::ramp_up
$procnr, $nproc, $start, $max, $duration, $handler;
# wait for them to finish
$sem->down;
# idle a bit
HTTP::LoadGen::delay $prefix, \%param;
# get current thread number
$nr=HTTP::LoadGen::threadnr;
# get the configuration hash
$config=HTTP::LoadGen::options;
# get/set thread-specific user data
$data=HTTP::LoadGen::userdata;
HTTP::LoadGen::userdata=$data;
# get/set thread specific random number generator
$rng=HTTP::LoadGen::rng;
HTTP::LoadGen::rng=$rng;
# next random number
$random=HTTP::LoadGen::rnd $max;
INSTALLATION
perl Makefile.PL
make
make test
make install
DEPENDENCIES
* perl 5.8.8
* IPC::ScoreBoard
* Coro
* AnyEvent
* Async::Interrupt
* Net::SSLeay
DESCRIPTION
This module implements a multi-process and multi-thread load generator
for HTTP. It uses Coro threads. So, in reality it does not use threads
but event-based IO.
Features
* limited support for SSL connections
* keep-alive connections
* configurable delay before and after each request
* run a list of URLs many times
* compute next URL based on the current request
* DNS cache can be preinitialized
* slow ramp up
* request bodies
* custom request headers
Overview
Note, this POD is best viewed via Apache2::PodBrowser.
Parallelism
The load generator follows a 2-level supervisor-worker pattern. The
central function, "loadgen", creates a certain number of child
processes. Each child process then creates in a slow ramp up phase
worker threads up to a configurable total upper thread limit.
The thread limit is configured independently of the number of worker
processes. A reasonable number of processes is about 1.5-5 times
the number of available CPUs. The number of threads can then be, say, 50
or 500 or even 5996 or so. Processes and threads are numbered starting
from 0.
So, assuming there are 3 processes and 10 threads configured the
following table shows how the threads are spread among the processes:
Process | Threads
--------+------------
0 | 0 3 6 9
1 | 1 4 7
2 | 2 5 8
Process 0 will run 4 threads, the other 2 processes 3 threads each. The
number of threads per process can be calculated as:
int($TotalThreadCount / $NProc) + ($ProcNr < $TotalThreadCount % $NProc)
where $NProc is the number of processes used, $ProcNr the number of the
current process and $TotalThreadCount the system-wide thread number.
$ProcNr ranges from 0 to "$NProc - 1".
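The distribution in the table can be reproduced with a short sketch. The helper names threads_per_process and thread_numbers are hypothetical and not part of HTTP::LoadGen. int() is needed because Perl's "/" operator does not truncate, and the round-robin layout is assumed from the table above.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of HTTP::LoadGen: number of threads
# that process $procnr runs out of $total threads on $nproc processes.
sub threads_per_process {
    my ($procnr, $nproc, $total) = @_;
    return int($total / $nproc) + ($procnr < $total % $nproc ? 1 : 0);
}

# Hypothetical helper: the thread numbers assigned to process $procnr,
# assuming the round-robin layout shown in the table.
sub thread_numbers {
    my ($procnr, $nproc, $total) = @_;
    return grep { $_ % $nproc == $procnr } 0 .. $total - 1;
}

# 3 processes, 10 threads
for my $p (0 .. 2) {
    printf "%d | %s\n", $p, join ' ', thread_numbers($p, 3, 10);
}
```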
At the beginning of the ramp-up phase each process starts up a certain
number of threads (maybe 0) to reach the configured start-up thread
number. The configured ramp-up duration then determines in which
intervals new threads are added. So assuming the threads run long enough
you start up with a certain level of parallelism which increases
linearly over a certain time interval up to the configured maximum.
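To picture the linear increase, here is a minimal sketch of a system-wide start schedule. It is an assumption, not the module's implementation: the helper ramp_up_schedule is hypothetical, and it simply spreads the threads beyond the start-up number evenly over the ramp-up duration.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of HTTP::LoadGen. Assumes the threads
# beyond $start are spread evenly over $duration seconds.
sub ramp_up_schedule {
    my ($start, $max, $duration) = @_;
    my @at   = (0) x $start;            # these threads start immediately
    my $rest = $max - $start;
    push @at, $duration * $_ / $rest for 1 .. $rest;
    return @at;                         # start time, indexed by thread number
}

# 2 threads at once, 6 in total, ramp-up over 10 seconds
my @at = ramp_up_schedule(2, 6, 10);
printf "thread %d starts at %.1fs\n", $_, $at[$_] for 0 .. $#at;
```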
The Scoreboard
The multi-process model of "HTTP::LoadGen" means that each process knows
only about its own threads. Sometimes you may want to log for example
the overall number of active requests when a new request is started. Or
you may want to increment a shared variable for each request to see the
For the lack of a better place the $rc element is described here.
RC_STATUS (0)
the HTTP status code. If the request failed because the connection
couldn't be established a code 599 is set here. "RC_STATUSLINE"
describes the problem in more detail in that case.
RC_STATUSLINE (1)
the HTTP status message. If the server responds with the following
first line for example:
HTTP/1.1 501 Method Not Implemented
"RC_STATUS" is 501 while "RC_STATUSLINE" is "Method Not
Implemented".
RC_HTTPVERSION (2)
the server HTTP protocol version. Normally 1.1 or 1.0.
RC_STARTTIME (3)
when the request has been started, fractional number.
RC_CONNTIME (4)
when the connection has been established, fractional number.
RC_FIRSTTIME (5)
when the first line of output has been received, fractional number.
RC_HEADERTIME (6)
when the response HTTP header has been completely received,
fractional number.
RC_BODYTIME (7)
when the response body has been completely received, fractional
number.
RC_HEADERS (8)
a hash containing the response HTTP headers. The values of this hash
are arrays since HTTP header fields can be given multiple times.
Keys (header names) are converted to lower case.
Example:
{
'content-type' => ['text/html; charset=iso-8859-1'],
'connection' => ['close'],
'date' => ['Sun, 04 Jul 2010 18:21:12 GMT'],
'content-length' => ['217'],
'allow' => ['GET,HEAD,POST,OPTIONS,TRACE'],
'server' => ['Apache'],
}
RC_BODY (9)
the response body
RC_DNSCACHED (10)
boolean: has the DNS cache lookup resulted in a hit (1) or miss (0)?
RC_CONNCACHED (11)
boolean: has a kept-alive connection been used?
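As an illustration of how the $rc element might be read, the following sketch derives the phase durations from the timestamps. The index constants are defined locally with the values listed above so that the example is self-contained; the helper summarize and the sample data are made up.

```perl
use strict;
use warnings;

# Index values as listed above, defined locally for this sketch.
use constant {
    RC_STATUS   => 0, RC_STATUSLINE => 1, RC_STARTTIME  => 3,
    RC_CONNTIME => 4, RC_FIRSTTIME  => 5, RC_HEADERTIME => 6,
    RC_BODYTIME => 7, RC_HEADERS    => 8, RC_BODY       => 9,
};

# Hypothetical helper: summarize one result element.
sub summarize {
    my ($rc) = @_;
    my $t0 = $rc->[RC_STARTTIME];
    return {
        status  => $rc->[RC_STATUS],
        connect => $rc->[RC_CONNTIME]   - $t0,
        first   => $rc->[RC_FIRSTTIME]  - $t0,
        header  => $rc->[RC_HEADERTIME] - $t0,
        total   => $rc->[RC_BODYTIME]   - $t0,
        # header values are arrays; take the first content-type, if any
        type    => ($rc->[RC_HEADERS]{'content-type'} || [])->[0],
    };
}

# made-up sample data in the layout described above
my $rc = [501, 'Method Not Implemented', '1.1',
          100.0, 100.25, 100.5, 100.5, 101.0,
          {'content-type' => ['text/html; charset=iso-8859-1']},
          '<html></html>', 1, 0];
my $s = summarize($rc);
printf "%d connect=%.2fs total=%.2fs %s\n",
    @{$s}{qw/status connect total type/};
```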
The %data hash
So, what can be specified in %data? Note, all keys here are case
sensitive.
NWorker (optional)
specifies the number of worker processes to be used. Default is 1.
RampUpStart (optional)
the number of threads to be started up immediately (after the
"ProcInit" phase is over). Default is 1 thread per worker process,
that is "NWorker".
RampUpMax (optional)
the total number of threads that will have been started once the ramp-up
phase is over. That means all processes together will start this
number of threads. If a thread finishes before the ramp-up phase is
over this maximum level of parallelism will never be reached.
Default is the same as "RampUpStart".
RampUpDuration (optional)
the duration of the ramp-up phase in seconds (may be a fraction).
Default is 300 (5 minutes).
ParentInit (optional)
the "ParentInit" handler called as
$data->{ParentInit}->();
One thing to do here is to create a scoreboard for interprocess
communication, see HTTP::LoadGen::ScoreBoard or IPC::ScoreBoard.
Example:
ParentInit=>sub {
# no parameters
# create scoreboard
# options() returns the config hash itself. The NWorker parameter
# is known. SbSlotsz and SbExtra are new. This is to demonstrate
# that the hook routines can access the configuration and evaluate
# and even add custom parameters.
HTTP::LoadGen::ScoreBoard::init_once
@{HTTP::LoadGen::options()}{qw/NWorker SbSlotsz SbExtra/};
}
ParentExit (optional)
the "ParentExit" handler called as
$data->{ParentExit}->();
If a scoreboard is used remember to disconnect.
Example:
ParentExit=>sub {
# no parameters
undef HTTP::LoadGen::ScoreBoard::scoreboard;
HTTP::LoadGen::ScoreBoard::header_count,
HTTP::LoadGen::ScoreBoard::header_bytes,
HTTP::LoadGen::ScoreBoard::body_bytes,
$rc->[RC_STARTTIME],
$rc->[RC_CONNTIME]-$rc->[RC_STARTTIME],
$rc->[RC_FIRSTTIME]-$rc->[RC_STARTTIME],
$rc->[RC_HEADERTIME]-$rc->[RC_STARTTIME],
$rc->[RC_BODYTIME]-$rc->[RC_STARTTIME],
$rc->[RC_STATUS], $rc->[RC_STATUSLINE],
length($rc->[RC_BODY]),
sprintf('%s(%s://%s:%s%s)',
@{$rq}[RQ_METHOD, RQ_SCHEME, RQ_HOST, RQ_PORT, RQ_URI]));
}
times (optional)
the number of times the URL iterator is set up, that is, how many
times the URL list is run.
If omitted or "<=0" the test runs forever.
dnscache (optional)
"loadgen" caches DNS query results. One can prevent DNS queries
completely in 2 ways. One of them is to provide a hash here that
maps names to IP addresses. The other is to have the URL iterator
generate IP addresses instead of host names and optionally "Host"
request header fields.
Another use of this item is to override host name resolution. One can
for example test a newly installed or development server while the
real server continues to work unaffected.
Example:
dnscache=>{
'foertsch.name'=>'127.0.0.1',
},
InitURLs (either InitURLs or URLList or both must be present)
"InitURLs" initializes the URL iterator. It may be a string
describing one of the predefined iterators or a "CODE" reference.
In the latter case it is called without parameters as
$it=$data->{InitURLs}->();
It is expected to return a function that when called as
$new_rq=$it->($rc, $rq);
returns the next request item or "undef" when it runs out of items.
The parameters $rc and $rq describe the previous request ($rq) and
its result ($rc).
For a description of the $rq and $new_rq format see URLList below.
Example:
InitURLs=>sub {
my $url=[qw!GET http foertsch.name 80 /-redir!,
{
keepalive=>KEEPALIVE,
headers=>[
'X-auth'=>1, # necessary to trigger 401 for that URL
], # it also shows a custom request header
}];
return sub {
my ($rc, $rq)=@_;
if( $rc->[RC_STATUS]==401 ) {
# redo with Authorization header
push @{$rq->[RQ_PARAM]->{headers}}, Authorization=>'Basic YmxhOmJsdWI=';
return $rq;
}
my $new_rq=$url;
undef $url; # next time return undef (out-of-requests)
return $new_rq;
};
}
The iterator generator initializes the variable $url and then
returns a closure. Hence, $url is a static variable with respect to
the returned iterator.
The iterator itself checks the HTTP code of the previous request. In
case of a 401 (Authorization Required) it adds an "Authorization"
header to the request header list and retries the operation.
If the previous operation has ended with another HTTP code it
copies $url to an auxiliary variable, undefines it and returns the
auxiliary variable. Thus, only the first time the iterator is called
it returns $url. After that it is always "undef" which signals
*Out-of-Requests*.
If "InitURLs" is a string it is the name of a predefined iterator
generator.
Example:
InitURLs=>'follow'
There are currently 4 such generators. All of them expect an
"URLList" (see below) to be provided.
default
simply walks the "URLList" from start to end.
This one is also used if "InitURLs" is omitted.
random_start
similar to "default" but starts at a random offset in "URLList".
At the end of the list it continues at the beginning until all
"URLList" elements are done once.
follow
similar to "default" but if a request results in a "3xx" HTTP
code and a "Location" header is provided by the server it tries
to follow it recursively.
If the request starting a series of redirections contains a
"postdelay" statement (see below) the delay is postponed until
after the last request of the series. Subsequent requests are
issued without delay.
In fact, the built-in "random_start_follow" iterator is
implemented, for example, as
register_iterator random_start_follow=>sub {
@_=get_iterator('random_start')->();
goto &{get_iterator 'follow'};
};
To turn your own iterator into a following one you could write:
InitURLs=>sub {
return get_iterator('follow')->($my_own_iterator);
}
where $my_own_iterator is an iterator function.
random_start_follow
a combination of the 2 above.
You can register your own named iterators by calling
register_iterator below.
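The wrap-around behaviour described for "random_start" can be pictured with the following sketch. It is not the module's implementation: make_random_start is a hypothetical generator that takes the list explicitly, and the optional $rand argument is an assumed hook that makes the starting offset repeatable.

```perl
use strict;
use warnings;

# Hypothetical generator, not the module's implementation: returns an
# iterator that starts at a random offset in @$list, wraps around at the
# end and returns undef once every element has been returned once.
sub make_random_start {
    my ($list, $rand) = @_;
    $rand ||= sub { int rand $_[0] };   # assumed RNG hook
    my $n      = @$list;
    my $offset = $n ? $rand->($n) : 0;
    my $done   = 0;
    return sub {
        my ($rc, $rq) = @_;             # previous result/request, unused here
        return undef if $done >= $n;
        return $list->[($offset + $done++) % $n];
    };
}

my @urls = map { [GET => 'http', 'localhost', 80, "/$_", {}] } qw/a b c d/;
my $it = make_random_start(\@urls, sub { 2 });   # fixed offset for the demo
while (my $rq = $it->()) {
    print "$rq->[4]\n";
}
```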
URLList (either InitURLs or URLList or both must be present)
See also InitURLs above.
An "URLList" is an array of arrays. Each of these sub-arrays
describes one request. It consists of 6 elements:
[$method, $scheme, $host, $port, $uri, $param]
$method is the HTTP request method, e.g. "GET", "POST", ...
$scheme is either "http" or "https".
$host is the hostname or IP address of the server, e.g.
"foertsch.name" or 109.73.51.50.
$port is the server port to connect to. Usually port 80 is used for
"http" and port 443 for "https".
$uri is the request URI normally starting with a slash ("/"), e.g.
"/impressum.html".
$param is a hash with further options.
To access the elements of a request description HTTP::LoadGen::Run
exports a few constants. They may be used to increase readability.
RQ_METHOD == 0
RQ_SCHEME == 1
RQ_HOST == 2
RQ_PORT == 3
RQ_URI == 4
RQ_PARAM == 5
Example:
URLList=>[
[qw!GET http 109.73.51.50 80 /-redir!,
{
keepalive=>KEEPALIVE,
headers=>[
Authorization=>'Basic YmxhOmJsdWI=',
Host=>'foertsch.name',
],
}],
[qw!HUGO https www.kabatinte.net 443 /!,
{
keepalive=>KEEPALIVE,
predelay=>0.5,
prejitter=>1,
postdelay=>3,
postjitter=>1.5,
body=>'blablub',
}]
]
This "URLList" contains 2 requests, one for a server with the IP
address 109.73.51.50 and one for the host "www.kabatinte.net".
The first one will send the following HTTP request to the server (IP
109.73.51.50, port 80):
GET /-redir HTTP/1.1
Authorization: Basic YmxhOmJsdWI=
Host: foertsch.name
If you need more header fields, "User-Agent" for example, add them
to the "headers" array of the options hash.
The second request is converted into the following HTTP message sent
over SSL to "84.38.75.176:443" assuming that "www.kabatinte.net"
resolves to 84.38.75.176:
HUGO / HTTP/1.1
Host: www.kabatinte.net
Content-Length: 7
blablub
Although no "Host" header is specified in the request element one is
sent. If the request element does not contain a "Host" header one is
added automatically based on $host and $port.
You may also notice the "Content-Length" header. It is sent because
a request body is specified (the "body" item in $param).
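How a request element turns into the messages shown above can be sketched as follows. This is not the module's code: format_request is a hypothetical helper, and both the case-insensitive check for an existing "Host" header and the omission of the default port from the generated "Host" value are assumptions based on the two examples.

```perl
use strict;
use warnings;

# Hypothetical helper, not the module's code: format a request element
# [$method, $scheme, $host, $port, $uri, $param] as an HTTP/1.1 message.
sub format_request {
    my ($method, $scheme, $host, $port, $uri, $param) = @{ $_[0] };
    $param ||= {};
    my @h = @{ $param->{headers} || [] };   # name/value pairs, order kept

    # add a Host header only if the caller did not supply one
    my $has_host = 0;
    for (my $i = 0; $i < @h; $i += 2) {
        $has_host = 1 if lc($h[$i]) eq 'host';
    }
    unless ($has_host) {
        my $default = $scheme eq 'https' ? 443 : 80;
        push @h, Host => ($port == $default ? $host : "${host}:${port}");
    }

    # a request body implies a Content-Length header
    my $body = defined $param->{body} ? $param->{body} : '';
    push @h, 'Content-Length' => length($body) if defined $param->{body};

    my $msg = "$method $uri HTTP/1.1\r\n";
    while (my ($k, $v) = splice(@h, 0, 2)) {
        $msg .= "$k: $v\r\n";
    }
    return $msg . "\r\n" . $body;
}

print format_request(
    [qw!HUGO https www.kabatinte.net 443 /!, { body => 'blablub' }]);
```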
So, what can be specified in the $param part?
keepalive
HTTP::LoadGen::Run exports 3 constants to be used as values.
"KEEPALIVE_USE" permits the use of a previously kept-alive
connection. "KEEPALIVE_STORE" allows the connection to be kept
alive after the request. "KEEPALIVE" combines both of the above.
If you hate readability you can also use the numerical values:
KEEPALIVE_USE==1
KEEPALIVE_STORE==2
KEEPALIVE==3
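Since the numerical values are bit flags, "KEEPALIVE" is the bitwise OR of the other two. The sketch below defines the constants locally with the values listed above, only to stay self-contained, and shows how each flag could be tested.

```perl
use strict;
use warnings;

# Values as listed above, defined locally for this sketch.
use constant { KEEPALIVE_USE => 1, KEEPALIVE_STORE => 2, KEEPALIVE => 3 };

for my $flags (0, KEEPALIVE_USE, KEEPALIVE_STORE, KEEPALIVE) {
    printf "%d: reuse=%s store=%s\n", $flags,
        ($flags & KEEPALIVE_USE   ? 'yes' : 'no'),
        ($flags & KEEPALIVE_STORE ? 'yes' : 'no');
}
```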
predelay and prejitter
These statements define a period to wait before sending the
request. The wait is done after the request description has been
pulled off the iterator but before the "ReqStart" handler is
run.
Both numbers can be fractions. Read them as
predelay ± prejitter
The actual waiting time is calculated as
interval = predelay - prejitter + rand( 2 * prejitter )
If "prejitter > predelay" the interval can become negative. In this
case you won't jump back in time but simply not wait.
To achieve repeatable results a thread-specific random number
generator must be used. See the "rng" function below.
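The waiting time calculation for "predelay" and "prejitter" can be sketched like this. wait_interval is a hypothetical helper, not the module's code, and the optional $rnd argument is an assumed hook that returns a number between 0 and its argument, so that a seeded generator can be plugged in for repeatable runs.

```perl
use strict;
use warnings;

# Hypothetical helper, not the module's code. $rnd is an assumed RNG
# hook returning a number in [0, $max).
sub wait_interval {
    my ($delay, $jitter, $rnd) = @_;
    $rnd ||= sub { rand $_[0] };
    my $interval = $delay - $jitter + $rnd->(2 * $jitter);
    return $interval > 0 ? $interval : 0;   # negative means "do not wait"
}

# predelay 0.5 with prejitter 1: somewhere between 0 and 1.5 seconds
printf "%.2f\n", wait_interval(0.5, 1);
```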
postdelay and postjitter
The same as "predelay" but waiting occurs after the request is
done or more precisely after the "ReqDone" handler returns.
headers
an array (not a hash!) of header fields to be appended to the
HTTP request.
body
a request body
conn_timeout
here you can specify the return value of the prepare-callback
function passed to "AnyEvent::Socket::tcp_connect" when
establishing a connection.
See AnyEvent::Socket for more information.
timeout
the "timeout" parameter used when a connection is converted into
an AnyEvent::Handle object.
See AnyEvent::Handle for more information.
tls_ctx
the "tls_ctx" parameter used when a connection is converted into
an AnyEvent::Handle object.
See AnyEvent::Handle for more information.
By now AnyEvent::Handle supports SSL features like client
certificates and server certificate verification. However, some
things are still missing like SSL session caching. How about
# thread accounting
thread_start;
# set a thread specific RNG
rng=Math::Random::MT->new(threadnr);
return []; # initializes thread specific user data
},
ThreadExit=>sub {
# no parameters
thread_done;
},
ReqStart=>sub {
my ($el)=@_;
# request accounting
req_start;
# started - succeeded - failed = currently pending number of requests
@{userdata()}=(thread_count, req_started-req_success-req_failed);
},
ReqDone=>sub {
my ($rc, $el)=@_;
# request accounting: HTTP status 2xx and 3xx are successful
# other requests are counted as failures.
req_done +($rc->[RC_STATUS]=~/^[23]/), $rc->[RC_HEADERS], $rc->[RC_BODY];
$logger->(threadnr,
@{$rc}[RC_DNSCACHED, RC_CONNCACHED],
@{userdata()},
req_success,
req_failed,
$rc->[RC_STARTTIME],
$rc->[RC_CONNTIME]-$rc->[RC_STARTTIME],
$rc->[RC_FIRSTTIME]-$rc->[RC_STARTTIME],
$rc->[RC_HEADERTIME]-$rc->[RC_STARTTIME],
$rc->[RC_BODYTIME]-$rc->[RC_STARTTIME],
$rc->[RC_STATUS],
length($rc->[RC_BODY]),
@{$el}[RQ_METHOD, RQ_SCHEME, RQ_HOST, RQ_PORT, RQ_URI],
$rc->[RC_STATUSLINE]);
},
dnscache=>{
localhost=>'127.0.0.1',
'kabatinte.net'=>'84.38.75.176',
'www.kabatinte.net'=>'84.38.75.176',
'foertsch.name'=>'109.73.51.50',
},
times=>3, # run the URL list 3 times
InitURLs=>'random_start',
URLList=>do {
my $o={
keepalive=>KEEPALIVE,
qw!predelay 0.05 prejitter 0.1 postdelay 0.5 postjitter 1!,
};
[[qw!GET http foertsch.name 80 /-redir!, $o],
[qw!HUGO https www.kabatinte.net 443 /!, $o]
];
},
}
SEE ALSO
* HTTP::LoadGen::Run
* HTTP::LoadGen::ScoreBoard
* HTTP::LoadGen::Logger
* loadgen
AUTHOR
Torsten Förtsch, <torsten.foertsch@gmx.net>
COPYRIGHT AND LICENSE
Copyright (C) 2010 by Torsten Förtsch
This library is free software; you can redistribute it and/or modify it
under the same terms as Perl itself, either Perl version 5.10.0 or, at
your option, any later version of Perl 5 you may have available.