ApacheLog-Compressor

lib/ApacheLog/Compressor.pm


=item * 01 - change server

=item * 02 - timestamp

=item * 03 - vhost

=item * 04 - user

=item * 05 - useragent

=item * 06 - referer

=item * 07 - url

=item * 80 - reset

=back

The log entry itself normally consists of the following fields:

 N vhost
 N time
 N IP
 N user
 N useragent
 N timestamp
 C method
 C version
 n response
 N bytes
 N url

The format of the log file can be customised; see the next section for details.
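
As a rough illustration of the field list above, the following sketch packs one record with a single L<perlfunc/pack> template and unpacks it again. The field values are hypothetical, and the real module may assemble its packets differently; only the specifiers (C<N>, C<C>, C<n>) are taken from the list.

```perl
use strict;
use warnings;

# Hypothetical, already-numeric values in the order listed above.
my @fields = (
    3,           # vhost
    120,         # time
    0x7F000001,  # IP as a 32-bit integer (127.0.0.1)
    0,           # user
    5,           # useragent
    1300000000,  # timestamp
    0,           # method
    1,           # version
    200,         # response
    5120,        # bytes
    42,          # url
);

# One specifier per field: N = 32-bit big-endian, n = 16-bit big-endian,
# C = 8-bit unsigned. Whitespace inside a pack template is ignored.
my $template = 'N N N N N N C C n N N';

my $packed  = pack($template, @fields);
my @decoded = unpack($template, $packed);

# Eight N fields (32 bytes) + two C fields (2 bytes) + one n field (2 bytes).
printf "%d bytes\n", length $packed;
```

With these specifiers a full entry occupies 36 bytes, regardless of how long the original text line was.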

=head3 FORMAT SPECIFICATION

A custom format can be provided as the C<format> parameter when instantiating
a new L<ApacheLog::Compressor> object via ->L</new>. This format consists of an
arrayref of key/value pairs, each value holding the following information:

=over 4

=item * id - the ID to use when sending packets

=item * type - L<perlfunc/pack> format specifier used when storing and retrieving the data, such as C<N1> or C<n1>. If this is omitted, the item has no entry in the compressed log stream.

=item * regex - the regular expression used for matching this part of the log file. The
final regex will be the concatenation of all regex entries for the format, joined
using \s+ as the delimiter.

=item * process_in - coderef for converting incoming values from a plain text log source into compressed values. It receives $self (the current L<ApacheLog::Compressor> instance) and $data (the current hashref containing the raw data).

=item * process_out - coderef for converting values from a compressed source back to plain text. It receives $self (the current L<ApacheLog::Compressor> instance) and $data (the current hashref containing the raw data).

=back
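
The keys above can be sketched with a small, self-contained example. The two-field format, the sample log line and the C<process_out> body are assumptions made for illustration, and the loop only imitates the regex concatenation described above rather than reproducing the module's own code.

```perl
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa);

# Hypothetical two-field format: an arrayref of key => value pairs.
my $format = [
    duration => { type => 'N1', regex => qr{(\d+)} },
    ip       => {
        type  => 'N1',
        regex => qr{(\S+)},
        # Dotted-quad text -> 32-bit integer.
        process_in => sub {
            my ($self, $data) = @_;
            $data->{ip} = unpack('N1', inet_aton($data->{ip}));
        },
        # 32-bit integer -> dotted-quad text.
        process_out => sub {
            my ($self, $data) = @_;
            $data->{ip} = inet_ntoa(pack('N1', $data->{ip}));
        },
    },
];

# Walk the pairs in order, collecting the keys and their regexes.
my (@keys, @regex);
for (my $i = 0; $i < @$format; $i += 2) {
    my ($key, $spec) = @{$format}[$i, $i + 1];
    next unless $spec->{regex};
    push @keys,  $key;
    push @regex, $spec->{regex};
}

# Concatenate the per-field regexes, using \s+ as the delimiter.
my $line_re = join '\s+', @regex;

my @captures = '1500 192.168.0.1' =~ /^$line_re/ or die 'no match';
my %data;
@data{@keys} = @captures;

# $self is not used by these callbacks, so undef stands in for it here.
my %spec = @$format;
$spec{ip}{process_in}->(undef, \%data);
my $packed_ip = $data{ip};
$spec{ip}{process_out}->(undef, \%data);

print "$packed_ip -> $data{ip}\n";
```

Keeping the conversion in C<process_in> and its inverse in C<process_out> means the stored value can use a fixed-width C<type> while the text form is still recoverable.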

=cut

our %HTTP_METHOD;
our @HTTP_METHOD_LIST = qw(GET PUT HEAD POST OPTIONS DELETE TRACE CONNECT MKCOL PATCH PROPFIND PROPPATCH FILEPATCH COPY MOVE LOCK UNLOCK SIGNATURE DELTA);
{ my $idx = 0; %HTTP_METHOD = map { $_ => $idx++ } @HTTP_METHOD_LIST; }

=head1 METHODS

=cut

=head2 new

Instantiate the class.

Takes the following named parameters:

=over 4

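=item * on_write - coderef to call with packet data for each outgoing packet

=item * format - optional arrayref describing the log format, see L</FORMAT SPECIFICATION>. Defaults to the value returned by L</default_format>.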

=back

=cut


sub new {
	my $class = shift;
	my %args = @_;
	my $format = delete $args{format};
	my $self = bless {
		%args,
		entry_index	=> {},
		entry_cache	=> {},
		log_packet_count => 0,
		timestamp	=> undef,
		server		=> undef,
	}, $class;
	$self->{format} = $format || $self->default_format;
	$self->update_mapping;
	return $self;
}

=head2 default_format

Returns the default format used for parsing log lines.

This is an arrayref containing key => value pairs, see L</FORMAT SPECIFICATION> for
more details.
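
Because the format is an arrayref of pairs rather than a hash, the order of the entries is preserved. The sketch below uses a hypothetical stand-in list (not the module's real default) to show one way of reading the keys in order and replacing a single entry in place.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a format list; not the module's real default.
my $format = [
    vhost    => { type => 'n1' },
    duration => { type => 'N1' },
    ip       => { type => 'N1' },
];

# Keys sit at even offsets and their specs at the following odd offsets.
my @key_offsets = grep { $_ % 2 == 0 } 0 .. $#$format;
my @keys        = @{$format}[@key_offsets];

# Replace the spec for one key without disturbing the order of the list.
for my $i (@key_offsets) {
    $format->[$i + 1] = { type => 'n1' } if $format->[$i] eq 'duration';
}

print join(',', @keys), "\n";
```

A modified list of this shape could then be passed as the C<format> parameter described in L</FORMAT SPECIFICATION>.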

=cut

sub default_format {
	my $self = shift;
	return [
		type		=> { type => 'C1' },
		vhost		=> { id => 0x03, type => 'n1', regex => qr{([^ ]+)} },
		duration	=> { type => 'N1', regex => qr{(\d+)} },
		ip		=> {
			type => 'N1',
			regex => qr{(\S+)\s+\S+},
			process_in => sub {
				my ($self, $data) = @_;
				$data->{ip} = unpack('N1', inet_aton($data->{ip}));
			},


