Linux-SocketFilter
lib/Linux/SocketFilter/Assembler.pm
that is not interesting to the application, allowing higher performance due
to reduced context switches between kernel and userland.
This module allows filter programs to be written in textual code, and
assembled into a binary filter, to attach to the socket using the
C<SO_ATTACH_FILTER> socket option.
=cut
=head1 FILTER MACHINE
The virtual machine on which these programs run is a simple load/store
register machine operating on 32-bit words. It has one general-purpose
accumulator register (C<A>) and one special purpose index register (C<X>).
It has a number of temporary storage locations, called scratchpads (C<M[]>).
It is given read access to the contents of the packet to be filtered in 8-bit
(C<BYTE[]>), 16-bit (C<HALF[]>) or 32-bit (C<WORD[]>) sized quantities. It
also has an implicit program counter, though direct access to it is not
provided.
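As a rough illustration of the machine state described above, the following sketch models it as a plain Perl hash. The names C<new_machine_state> and C<load_packet> are hypothetical and not part of this module; the count of 16 scratchpad cells follows the kernel's C<BPF_MEMWORDS> constant, and packet quantities are assumed to be read in network (big-endian) byte order.

```perl
use strict;
use warnings;

# Hypothetical model of the filter machine's state; not part of this module.
sub new_machine_state {
    my ( $packet ) = @_;
    return {
        A      => 0,               # accumulator
        X      => 0,               # index register
        M      => [ (0) x 16 ],    # scratchpad cells (BPF_MEMWORDS is 16)
        packet => $packet,         # packet bytes, read-only to the filter
    };
}

# Size in bytes and unpack format for each packet access width.
my %FORMAT = ( BYTE => [ 1, "C" ], HALF => [ 2, "n" ], WORD => [ 4, "N" ] );

# Read a BYTE[], HALF[] or WORD[] quantity at $addr, big-endian.
# Returns undef when the read would run past the end of the packet.
sub load_packet {
    my ( $state, $size, $addr ) = @_;
    my ( $len, $fmt ) = @{ $FORMAT{$size} };
    return undef if $addr < 0 or $addr + $len > length $state->{packet};
    return unpack $fmt, substr( $state->{packet}, $addr, $len );
}

my $state = new_machine_state( "\x45\x00\x00\x54\xde\xad\xbe\xef" );
printf "BYTE[0]=0x%02x HALF[2]=0x%04x WORD[4]=0x%08x\n",
    load_packet( $state, BYTE => 0 ),
    load_packet( $state, HALF => 2 ),
    load_packet( $state, WORD => 4 );
```

The three widths differ only in how many bytes are read and how they are combined, which is why a single lookup table is enough here.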
The filter program is run by the kernel on every packet captured by the socket
to which it is attached. It can inspect data in the packet and certain other
items of metadata concerning the packet, and decide if this packet should be
accepted by the capture socket. It returns the number of bytes to capture if
it should be captured, or zero to indicate this packet should be ignored. It
starts on the first instruction, and proceeds forwards, unless the flow is
modified by a jump instruction. The program terminates on a C<RET>
instruction, which informs the kernel of the required fate of the packet. The
last instruction in the filter must therefore be a C<RET> instruction, though
others may appear at earlier points.
In order to guarantee termination of the program in all circumstances, the
virtual machine is deliberately not Turing-complete. All jumps, conditional or
unconditional, may only jump forwards in the program. It is not possible to
construct a loop of instructions that executes repeatedly.
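The forward-only rule follows from the binary encoding: each instruction is a kernel C<struct sock_filter> (C<u16 code>, C<u8 jt>, C<u8 jf>, C<u32 k>), and jump offsets are unsigned counts of instructions to skip. The sketch below packs a small hand-encoded program and lists each jump's targets. The helpers C<pack_insn> and C<jump_targets> are hypothetical names for illustration, and the opcode numbers are taken from C<linux/filter.h> rather than from this module.

```perl
use strict;
use warnings;

# One instruction matches the kernel's struct sock_filter:
# u16 code, u8 jt, u8 jf, u32 k (8 bytes in total).
sub pack_insn { return pack "S C C L", @_ }

# Hypothetical helper: map each jump's position to its possible targets.
sub jump_targets {
    my ( $program ) = @_;
    my %targets;
    my $count = length( $program ) / 8;
    for my $pc ( 0 .. $count - 1 ) {
        my ( $code, $jt, $jf, $k ) =
            unpack "S C C L", substr( $program, $pc * 8, 8 );
        next unless ( $code & 0x07 ) == 0x05;      # BPF_JMP class only
        $targets{$pc} = ( $code & 0xf0 ) == 0x00   # BPF_JA is unconditional
            ? [ $pc + 1 + $k ]
            : [ $pc + 1 + $jt, $pc + 1 + $jf ];
    }
    return \%targets;
}

my $program = join "",
    pack_insn( 0x28, 0, 0, 12 ),       # load the 16-bit value at offset 12
    pack_insn( 0x15, 0, 1, 0x0800 ),   # if A == 0x0800 continue, else skip one
    pack_insn( 0x06, 0, 0, 0xffff ),   # return 65535: accept the packet
    pack_insn( 0x06, 0, 0, 0 );        # return 0: ignore the packet

my $targets = jump_targets( $program );
print "jump at $_ -> @{ $targets->{$_} }\n"
    for sort { $a <=> $b } keys %$targets;
```

Because every offset is added to the position of the following instruction and can never be negative, a target always lies after the jump itself.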
=cut
=head1 FUNCTIONS
=cut
=head2 $filter = assemble( $text )
Takes a program (fragment) in text form and returns a binary string
representing the instructions packed ready for C<attach_filter()>.
The program consists of C<\n>-separated lines of instructions or comments.
Leading whitespace is ignored. Blank lines are ignored. Lines beginning with
a C<;> (after whitespace) are ignored as comments.
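The line-handling rules above can be sketched on their own, without assembling anything. The helper name C<tokenise> is hypothetical; it applies the same trimming, blank-line, and comment rules, and splits each remaining line into an opcode and its comma-separated operands.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the line-handling rules; it only tokenises.
sub tokenise {
    my ( $text ) = @_;
    my @insns;
    foreach my $line ( split m/\n/, $text ) {
        $line =~ s/^\s+//;          # leading whitespace is ignored
        next if $line eq "";        # blank lines are ignored
        next if $line =~ m/^;/;     # comment lines are ignored
        my ( $op, $args ) = split ' ', $line, 2;
        push @insns, [ $op, defined $args ? split( m/,\s*/, $args ) : () ];
    }
    return @insns;
}

my @insns = tokenise( <<'EOF' );
    ; load the length, then a byte at offset X+9
    LD len

    LD BYTE[X+9]
EOF

print join( " | ", map { join " ", @$_ } @insns ), "\n";
```

Only the two C<LD> lines survive; the comment and the blank line are dropped before any opcode lookup takes place.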
=cut
sub assemble
{
   my $self = __PACKAGE__;
   my ( $text ) = @_;

   my $ret = "";

   foreach ( split m/\n/, $text ) {
      s/^\s+//;      # trim leading whitespace
      next if m/^$/; # skip blanks
      next if m/^;/; # skip comments

      # Split the opcode from its comma-separated operands
      my ( $op, $args ) = split ' ', $_, 2;
      my @args = defined $args ? split m/,\s*/, $args : ();

      $self->can( "assemble_$op" ) or
         die "Can't compile $_ - unrecognised op '$op'\n";

      # Dispatch to the assemble_OP method for this opcode
      $ret .= $self->${\"assemble_$op"}( @args );
   }

   return $ret;
}
=head1 INSTRUCTION FORMAT
Each instruction in the program is formed of an opcode followed by its
operands. Numeric literals may be given in decimal, hexadecimal (with a
leading C<0x>), or octal (with a leading C<0>) form, optionally preceded by a
minus sign. Literals are written as C<lit> in the
following descriptions.
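As a minimal sketch of how the three notations map to the same number, the stand-alone converter below relies on Perl's built-in C<oct>, which accepts both C<0x>-prefixed hexadecimal and leading-zero octal strings. The name C<literal_value> is hypothetical, and its patterns are stricter than the parser in this module.

```perl
use strict;
use warnings;

# Hypothetical stand-alone converter; stricter than the module's own parser.
sub literal_value {
    my ( $lit ) = @_;
    my $sign = ( $lit =~ s/^-// ) ? -1 : 1;

    # A leading 0 means hex (0x...) or octal (0...); oct() handles both.
    return $sign * oct( $lit ) if $lit =~ m/^0(?:x[0-9a-f]+|[0-7]*)$/i;

    # Anything else must be plain decimal digits.
    return $sign * $lit if $lit =~ m/^[1-9]\d*$/;

    die "Cannot parse literal $lit\n";
}

printf "%-5s => %d\n", $_, literal_value( $_ ) for qw( 42 0x2a 052 -1 );
```

The first three inputs all denote the value 42; the last shows that the sign is stripped before the base is detected and reapplied afterwards.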
=cut
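# Hex is tried first so that the leading "0" of "0x..." is not taken as a
# complete decimal literal when the pattern is used unanchored.
my $match_literal = qr/-?(?:0x[0-9a-f]+|\d+)/;

sub _parse_literal
{
   my ( $lit ) = @_;

   my $sign = ( $lit =~ s/^-// ) ? -1 : 1;

   return $sign * oct( $lit ) if $lit =~ m/^0x?/; # oct can manage octal or hex
   return $sign * int( $lit ) if $lit =~ m/^\d+$/; # plain decimal digits only
   die "Cannot parse literal $lit\n";
}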
=pod

 LD BYTE[addr]
 LD HALF[addr]
 LD WORD[addr]

Load the C<A> register from the 8, 16, or 32-bit quantity in the packet buffer
at the address. The address may be given in the forms

 lit
 X+lit
 NET+lit
 NET+X+lit

These load from an immediate or C<X>-indexed address, measured from the
beginning of the buffer or, in the C<NET+> forms, from the beginning of the
network header.

 LD len

Load the C<A> register with the length of the packet.

 LD lit

Load the C<A> register with a literal value.

 LD M[lit]

Load the C<A> register with the value from the given scratchpad cell.
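To show how these operand forms differ at the binary level, the sketch below maps a subset of them onto the classic BPF opcode bits from C<linux/filter.h>. The function C<ld_opcode> is a hypothetical illustration, not this module's implementation; it accepts decimal literals only and leaves out the C<NET+> forms.

```perl
use strict;
use warnings;

# Opcode bits as defined in <linux/filter.h>.
use constant {
    BPF_LD  => 0x00,
    BPF_W   => 0x00, BPF_H   => 0x08, BPF_B   => 0x10,
    BPF_IMM => 0x00, BPF_ABS => 0x20, BPF_IND => 0x40,
    BPF_MEM => 0x60, BPF_LEN => 0x80,
};

my %SIZE = ( BYTE => BPF_B, HALF => BPF_H, WORD => BPF_W );

# Hypothetical classifier: returns ( opcode, k ) for a subset of LD operands.
sub ld_opcode {
    my ( $src ) = @_;

    return ( BPF_LD | BPF_W | BPF_LEN, 0 ) if $src eq "len";
    return ( BPF_LD | BPF_MEM, $1 )        if $src =~ m/^M\[(\d+)\]$/;
    return ( BPF_LD | BPF_IMM, $1 )        if $src =~ m/^(\d+)$/;

    # Packet loads: absolute address, or indexed by X when "X+" is present.
    if ( $src =~ m/^(BYTE|HALF|WORD)\[(X\+)?(\d+)\]$/ ) {
        return ( BPF_LD | $SIZE{$1} | ( $2 ? BPF_IND : BPF_ABS ), $3 );
    }

    die "Unsupported LD operand '$src'\n";
}

for my $src ( "len", "7", "M[3]", "BYTE[9]", "HALF[X+2]" ) {
    my ( $code, $k ) = ld_opcode( $src );
    printf "LD %-10s => code 0x%02x, k %d\n", $src, $code, $k;
}
```

Each form selects a different addressing-mode bit, while the access width only matters for loads from the packet buffer.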