Apache2-ClickPath
(if session and clusterid are passed as URL parts) or something
mixed.
Assuming that "clusterid" and "session" both identify the session on
"shop.tld.org", "Apache2::ClickPath" can extract them, encode them in
its own session and place them in environment variables.
Each line in the "ClickPathFriendlySessions" section describes one
friendly site. The line consists of the friendly hostname, a list of
URL parts or CGI parameters identifying the friendly session and an
optional short name for this friend, e.g.:
shop.tld.org uri(1) param(session) shop
This means sessions at "shop.tld.org" are identified by the
combination of the first URL part after the leading slash (/) and a
CGI parameter named "session".
If a request now comes in with a "Referer" of
"http://shop.tld.org/25/bin/shop.pl?action=showbasket;session=213"
the "REMOTE_SESSION" environment variable will contain 2 lines:
25
session=213
Their order is determined by the order of "uri()" and "param()"
statements in the configuration section between the hostname and the
short name. The "REMOTE_SESSION_HOST" environment variable will
contain the host name the session belongs to.
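The extraction described above can be sketched in a few lines of Perl.
This is only an illustration of the rule syntax, not the module's own
code: the function name "remote_session" is hypothetical, and the
sketch assumes that no URL decoding is applied and that missing parts
are simply skipped.

```perl
use strict;
use warnings;

# Hypothetical helper: apply rules such as "uri(1)" and "param(session)"
# to a Referer URL and return the resulting lines joined by newlines.
sub remote_session {
    my ($referer, @rules) = @_;

    # Split the URL into its path and its (optional) query string.
    my ($path, $query) = $referer =~ m{^\w+://[^/?#]+([^?#]*)(?:\?([^#]*))?}
        or return;

    # $parts[0] is the empty string before the leading slash,
    # so $parts[1] is the first URL part after it.
    my @parts = split m{/}, $path;

    # CGI parameters may be separated by ";" or "&".
    my %param;
    for my $pair (split /[;&]/, defined $query ? $query : '') {
        my ($name, $value) = split /=/, $pair, 2;
        $param{$name} = $value if defined $value;
    }

    # Emit one line per rule, in the order the rules were given.
    my @lines;
    for my $rule (@rules) {
        if ($rule =~ /^uri\((\d+)\)$/) {
            push @lines, $parts[$1] if defined $parts[$1];
        }
        elsif ($rule =~ /^param\((.+)\)$/) {
            push @lines, "$1=$param{$1}" if exists $param{$1};
        }
    }
    return join "\n", @lines;
}
```

Called with the Referer from the example and the rules "uri(1)" and
"param(session)", this sketch yields the two lines shown above.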
Now a CGI script, a mod_perl handler or something similar can fetch
these environment variables and build links back to "shop.tld.org".
Instead of linking directly back to the shop, your links then point
to that script, which issues an appropriate redirect.
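A minimal sketch of such a redirect script follows. It assumes the
two-line "REMOTE_SESSION" format from the example above. The helper
name "shop_url", the target path "/bin/shop.pl" and the way the URL is
rebuilt (lines containing "=" become CGI parameters, all others become
leading URL parts) are illustrative assumptions, not part of the
module.

```perl
use strict;
use warnings;

# Hypothetical helper: rebuild a link into the friendly site from
# REMOTE_SESSION_HOST and REMOTE_SESSION.
sub shop_url {
    my ($env, $path) = @_;
    my $host = $env->{REMOTE_SESSION_HOST};
    return unless defined $host and defined $env->{REMOTE_SESSION};

    # Assumption: a line with "=" is a CGI parameter, anything else
    # is a URL part that goes in front of the path.
    my (@uri_parts, @params);
    for my $line (split /\n/, $env->{REMOTE_SESSION}) {
        if ($line =~ /=/) { push @params,    $line }
        else              { push @uri_parts, $line }
    }

    my $url = join('/', "http://$host", @uri_parts) . $path;
    $url .= '?' . join(';', @params) if @params;
    return $url;
}

# When run as a CGI script with both variables set, emit the redirect.
if (my $url = shop_url(\%ENV, '/bin/shop.pl')) {
    print "Status: 302 Found\r\nLocation: $url\r\n\r\n";
}
```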
ClickPathFriendlySessionsFile
This directive takes a filename as argument. The file's syntax and
semantics are the same as for "ClickPathFriendlySessions". The file
is reread every time it has been changed, avoiding server restarts
after configuration changes at the price of memory consumption.
ClickPathSecret
ClickPathSecretIV
If you want to run something like a shop with our session
identifiers they must be unguessable. That means that even when
knowing a valid session ID it must be difficult to guess another
one. With these directives a significant part of the session ID is
encrypted with Blowfish in cipher block chaining mode, thus making
the session ID unguessable. "ClickPathSecret" specifies the key,
"ClickPathSecretIV" the initialization vector.
"ClickPathSecretIV" is a simple string of arbitrary length. The
first 8 bytes of its MD5 digest are used as the initialization
vector. If omitted, the string "abcd1234" is the IV.
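The derivation of the IV from a configured string can be sketched as
follows. The function name "iv_from_string" is hypothetical; only the
rule "first 8 bytes of the MD5 digest" is taken from the text above.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5 md5_hex);

# Hypothetical helper: md5() returns the 16-byte binary digest,
# of which the first 8 bytes are kept.
sub iv_from_string {
    my ($string) = @_;
    return substr(md5($string), 0, 8);
}
```

Eight bytes match the Blowfish block size, which is why the digest is
truncated.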
"ClickPathSecret" is given as an "http:", "https:", "file:" or
"data:" URL. Thus the secret can be stored directly as a data URL in
httpd.conf, in a separate file on the local disk or on a possibly
secured server. To enable all modes of accessing the web the
http(s) URL syntax is a bit extended. Maybe you have already used
"http://user:password@server.tld/...". Many browsers allow this
syntax to specify a username and password for HTTP authentication.
But how about proxies, SSL authentication etc.? Well, add another
colon (:) after the password and append a semicolon (;) delimited
list of "key=value" pairs. The special characters (@:;\) can be
quoted with a backslash (\). In fact, all characters can be quoted.
Thus, "\a" and "a" produce the same string "a".
The following keys are defined:
https_proxy
https_proxy_username
https_proxy_password
https_version
https_cert_file
https_key_file
https_ca_file
https_ca_dir
https_pkcs12_file
https_pkcs12_password
their meaning is defined in Crypt::SSLeay.
http_proxy
http_proxy_username
http_proxy_password
these are passed to LWP::UserAgent.
Remember that an HTTP proxy is accessed with the GET, POST, ...
methods whereas an HTTPS proxy is accessed with CONNECT. Don't mix
them, see Crypt::SSLeay.
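The backslash quoting and the semicolon-delimited "key=value" list
described above can be illustrated with a short sketch. The function
names "unquote" and "parse_pairs" are hypothetical, and the module's
actual parser may differ in detail.

```perl
use strict;
use warnings;

# Remove one level of backslash quoting: "\x" becomes "x" for any x.
sub unquote {
    my ($string) = @_;
    $string =~ s/\\(.)/$1/gs;
    return $string;
}

# Split a "key=value;key=value" list on semicolons that are not
# backslash-quoted and return the unquoted pairs as a hash reference.
sub parse_pairs {
    my ($list) = @_;
    my %pairs;

    # A field is a run of quoted characters or of characters
    # other than ";" and "\".
    for my $field ($list =~ /((?:\\.|[^;\\])+)/gs) {
        my ($key, $value) = $field =~ /^((?:\\.|[^=\\])+)=(.*)$/s
            or next;
        $pairs{ unquote($key) } = unquote($value);
    }
    return \%pairs;
}
```

For instance, the value "a\@b\;c\:" is unquoted to "a@b;c:", and a
quoted semicolon inside a value does not end the pair.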
Examples
ClickPathSecret https://john:a\@b\;c\::https_ca_file=/my/ca.pem@secrethost.tld/bin/secret.pl?host=me
fetches the secret from
"https://secrethost.tld/bin/secret.pl?host=me" using "john" as
username and "a@b;c:" as password. The server certificate of
secrethost.tld is verified against the CA certificate found in
"/my/ca.pem".
ClickPathSecret https://::https_pkcs12_file=/my/john.p12;https_pkcs12_password=a\@b\;c\:;https_ca_file=/my/ca.pem@secrethost.tld/bin/secret.pl?host=me
fetches the secret again from
"https://secrethost.tld/bin/secret.pl?host=me" using "/my/john.p12"
as client certificate with "a@b;c:" as password. The server
certificate of secrethost.tld is again verified against the CA
certificate found in "/my/ca.pem".
ClickPathSecret data:,password:very%20secret%20password
here a data-URL is used that produces the content "password:very
secret password".
The URL's content is fetched by LWP::UserAgent once at server
startup.
Its content defines the secret either in binary form, as a string of
hexadecimal characters or as a password. If it starts with "binary:"
the rest of the content is taken as is as the key. If it starts with
"hex:", "pack( 'H*', $arg )" is used to convert the rest to binary.
If it starts with "password:" or with none of these prefixes, the MD5
digest of the rest of the content is used as the secret.
The Blowfish algorithm allows up to 56 bytes as secret. In hex and
binary mode the first 56 bytes are used. You can specify more bytes
but they are ignored. In password mode the MD5 algorithm produces a
16-byte secret.
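The three content forms and the 56-byte limit can be summarized in a
small sketch. The function name "secret_from_content" is hypothetical,
and the sketch assumes that the content is used exactly as fetched,
without stripping whitespace or newlines.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5);

# Hypothetical helper: turn the fetched content into a Blowfish key.
sub secret_from_content {
    my ($content) = @_;
    my $key;

    if ($content =~ /^binary:(.*)$/s) {
        $key = $1;                     # taken as is
    }
    elsif ($content =~ /^hex:(.*)$/s) {
        $key = pack('H*', $1);         # hex digits to binary
    }
    else {
        # "password:" prefix or no known prefix at all
        (my $rest = $content) =~ s/^password://;
        $key = md5($rest);             # 16-byte binary digest
    }

    # Blowfish accepts at most 56 bytes; extra bytes are dropped.
    return substr($key, 0, 56);
}
```

With the data URL example above, the content "password:very secret
password" results in a 16-byte key.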
Working with a load balancer
Most load balancers are able to map a request to a particular machine
based on a part of the request URI. They look for a prefix followed by a
given number of characters or until a suffix is found. The string
between identifies the machine to route the request to.
The name set with "ClickPathMachine" can be used by a load balancer. It
immediately follows the session prefix and is terminated by a single
colon. The default name is always 6 bytes long.
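What a load balancer has to do can be sketched as a simple pattern
match. The function name "machine_from_uri", the session prefix "-S:"
and the sample URI used here are assumptions made for illustration;
use whatever prefix your server is actually configured with.

```perl
use strict;
use warnings;

# Hypothetical helper: return the machine name found between the
# session prefix and the next colon, or undef if the URI carries none.
sub machine_from_uri {
    my ($uri, $prefix) = @_;
    return $uri =~ m{^/\Q$prefix\E([^:/]*):} ? $1 : undef;
}

# Example with an assumed prefix and an invented machine name.
my $machine = machine_from_uri('/-S:node01:abcdef/index.html', '-S:');
```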
Logging
The most important part of user tracking and clickstreams is logging.
With "Apache2::ClickPath" many request URIs contain an initial session
part. Thus, for logfile analyzers most requests are unique which leads
to useless results. Normally Apache's common logfile format starts with
%h %l %u %t \"%r\"
%r stands for *the request*. It is the first line a browser sends to a
server. For use with "Apache2::ClickPath" %r is better changed to "%m
%U%q %H". Since "Apache2::ClickPath" strips the session part from the
current URI %U appears without the session. With this modification
logfile analyzers will produce meaningful results again.
The session can be logged as "%{SESSION}e" at the end of a logfile line.
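Put together, a log configuration along these lines could be used.
This is a sketch based on Apache's common log format; the format
nickname "clickpath" and the log file path are arbitrary choices.

```apache
# Common log format with %r replaced by "%m %U%q %H"
# and the session appended at the end of the line.
LogFormat "%h %l %u %t \"%m %U%q %H\" %>s %b %{SESSION}e" clickpath
CustomLog logs/access_log clickpath
```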
A word about proxies
Depending on your content and your user community, HTTP proxies can
serve a significant part of your traffic. With "Apache2::ClickPath"
almost all requests have to be served by your server.
Debugging
Sometimes it is useful to know the information encoded in a session
identifier. This is why Apache2::ClickPath::Decode exists.
SEE ALSO
Apache2::ClickPath::Store Apache2::ClickPath::StoreClient
Apache2::ClickPath::Decode <http://perl.apache.org>,
<http://httpd.apache.org>
AUTHOR
Torsten Foertsch, <torsten.foertsch@gmx.net>
COPYRIGHT AND LICENSE
Copyright (C) 2004-2005 by Torsten Foertsch
This library is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.
INSTALLATION
perl Makefile.PL
make
make test
make install
DEPENDENCIES
mod_perl 1.999022 (aka 2.0.0-RC5), perl 5.8.0