686 lines
19 KiB
Groff
686 lines
19 KiB
Groff
|
.\" Copyright (c) 1990 The Regents of the University of California.
|
||
|
.\" All rights reserved.
|
||
|
.\"
|
||
|
.\" Redistribution and use in source and binary forms, with or without
|
||
|
.\" modification, are permitted provided that: (1) source code distributions
|
||
|
.\" retain the above copyright notice and this paragraph in its entirety, (2)
|
||
|
.\" distributions including binary code include the above copyright notice and
|
||
|
.\" this paragraph in its entirety in the documentation or other materials
|
||
|
.\" provided with the distribution, and (3) all advertising materials mentioning
|
||
|
.\" features or use of this software display the following acknowledgement:
|
||
|
.\" ``This product includes software developed by the University of California,
|
||
|
.\" Lawrence Berkeley Laboratory and its contributors.'' Neither the name of
|
||
|
.\" the University nor the names of its contributors may be used to endorse
|
||
|
.\" or promote products derived from this software without specific prior
|
||
|
.\" written permission.
|
||
|
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED
|
||
|
.\" WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
|
||
|
.\" MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
|
||
|
.\"
|
||
|
.\" This document is derived in part from the enet man page (enet.4)
|
||
|
.\" distributed with 4.3BSD Unix.
|
||
|
.\"
|
||
|
.TH BPF 4 "23 May 1991"
|
||
|
.SH NAME
|
||
|
bpf \- Berkeley Packet Filter
|
||
|
.SH SYNOPSIS
|
||
|
.B "pseudo-device bpfilter 16"
|
||
|
.SH DESCRIPTION
|
||
|
The Berkeley Packet Filter
|
||
|
provides a raw interface to data link layers in a protocol
|
||
|
independent fashion.
|
||
|
All packets on the network, even those destined for other hosts,
|
||
|
are accessible through this mechanism.
|
||
|
.PP
|
||
|
The packet filter appears as a character special device,
|
||
|
.I /dev/bpf0, /dev/bpf1,
|
||
|
etc.
|
||
|
After opening the device, the file descriptor must be bound to a
|
||
|
specific network interface with the BIOSETIF ioctl.
|
||
|
A given interface can be shared be multiple listeners, and the filter
|
||
|
underlying each descriptor will see an identical packet stream.
|
||
|
The total number of open
|
||
|
files is limited to the value given in the kernel configuration; the
|
||
|
example given in the SYNOPSIS above sets the limit to 16.
|
||
|
.PP
|
||
|
A separate device file is required for each minor device.
|
||
|
If a file is in use, the open will fail and
|
||
|
.I errno
|
||
|
will be set to EBUSY.
|
||
|
.PP
|
||
|
Associated with each open instance of a
|
||
|
.I bpf
|
||
|
file is a user-settable packet filter.
|
||
|
Whenever a packet is received by an interface,
|
||
|
all file descriptors listening on that interface apply their filter.
|
||
|
Each descriptor that accepts the packet receives its own copy.
|
||
|
.PP
|
||
|
Reads from these files return the next group of packets
|
||
|
that have matched the filter.
|
||
|
To improve performance, the buffer passed to read must be
|
||
|
the same size as the buffers used internally by
|
||
|
.I bpf.
|
||
|
This size is returned by the BIOCGBLEN ioctl (see below), and under
|
||
|
BSD, can be set with BIOCSBLEN.
|
||
|
Note that an individual packet larger than this size is necessarily
|
||
|
truncated.
|
||
|
.PP
|
||
|
The packet filter will support any link level protocol that has fixed length
|
||
|
headers. Currently, only Ethernet, SLIP and PPP drivers have been
|
||
|
modified to interact with
|
||
|
.I bpf.
|
||
|
.PP
|
||
|
Since packet data is in network byte order, applications should use the
|
||
|
.I byteorder(3n)
|
||
|
macros to extract multi-byte values.
|
||
|
.PP
|
||
|
A packet can be sent out on the network by writing to a
|
||
|
.I bpf
|
||
|
file descriptor. The writes are unbuffered, meaning only one
|
||
|
packet can be processed per write.
|
||
|
Currently, only writes to Ethernets and SLIP links are supported.
|
||
|
.SH IOCTLS
|
||
|
The
|
||
|
.I ioctl
|
||
|
command codes below are defined in <net/bpf.h>. All commands require
|
||
|
these includes:
|
||
|
.ft B
|
||
|
.nf
|
||
|
|
||
|
#include <sys/types.h>
|
||
|
#include <sys/time.h>
|
||
|
#include <sys/ioctl.h>
|
||
|
#include <net/bpf.h>
|
||
|
|
||
|
.fi
|
||
|
.ft R
|
||
|
Additionally, BIOCGETIF and BIOCSETIF require \fB<net/if.h>\fR.
|
||
|
|
||
|
In addition to FIONREAD and SIOCGIFADDR, the following commands
|
||
|
may be applied to any open
|
||
|
.I bpf
|
||
|
file.
|
||
|
The (third) argument to the
|
||
|
.I ioctl
|
||
|
should be a pointer to the type indicated.
|
||
|
.TP 10
|
||
|
.B BIOCGBLEN (u_int)
|
||
|
Returns the required buffer length for reads on
|
||
|
.I bpf
|
||
|
files.
|
||
|
.TP 10
|
||
|
.B BIOCSBLEN (u_int)
|
||
|
Sets the buffer length for reads on
|
||
|
.I bpf
|
||
|
files. The buffer must be set before the file is attached to an interface
|
||
|
with BIOCSETIF.
|
||
|
If the requested buffer size cannot be accomodated, the closest
|
||
|
allowable size will be set and returned in the argument.
|
||
|
A read call will result in EIO if it is passed a buffer that is not this size.
|
||
|
.TP 10
|
||
|
.B BIOCGDLT (u_int)
|
||
|
Returns the type of the data link layer underyling the attached interface.
|
||
|
EINVAL is returned if no interface has been specified.
|
||
|
The device types, prefixed with ``DLT_'', are defined in <net/bpf.h>.
|
||
|
.TP 10
|
||
|
.B BIOCPROMISC
|
||
|
Forces the interface into promiscuous mode.
|
||
|
All packets, not just those destined for the local host, are processed.
|
||
|
Since more than one file can be listening on a given interface,
|
||
|
a listener that opened its interface non-promiscuously may receive
|
||
|
packets promiscuously. This problem can be remedied with an
|
||
|
appropriate filter.
|
||
|
.IP
|
||
|
The interface remains in promiscuous mode until all files listening
|
||
|
promiscuously are closed.
|
||
|
.TP 10
|
||
|
.B BIOCFLUSH
|
||
|
Flushes the buffer of incoming packets,
|
||
|
and resets the statistics that are returned by BIOCGSTATS.
|
||
|
.TP 10
|
||
|
.B BIOCGETIF (struct ifreq)
|
||
|
Returns the name of the hardware interface that the file is listening on.
|
||
|
The name is returned in the if_name field of
|
||
|
.I ifr.
|
||
|
All other fields are undefined.
|
||
|
.TP 10
|
||
|
.B BIOCSETIF (struct ifreq)
|
||
|
Sets the hardware interface associate with the file. This
|
||
|
command must be performed before any packets can be read.
|
||
|
The device is indicated by name using the
|
||
|
.I if_name
|
||
|
field of the
|
||
|
.I ifreq.
|
||
|
Additionally, performs the actions of BIOCFLUSH.
|
||
|
.TP 10
|
||
|
.B BIOCSRTIMEOUT, BIOCGRTIMEOUT (struct timeval)
|
||
|
Set or get the read timeout parameter.
|
||
|
The
|
||
|
.I timeval
|
||
|
specifies the length of time to wait before timing
|
||
|
out on a read request.
|
||
|
This parameter is initialized to zero by
|
||
|
.IR open(2),
|
||
|
indicating no timeout.
|
||
|
.TP 10
|
||
|
.B BIOCGSTATS (struct bpf_stat)
|
||
|
Returns the following structure of packet statistics:
|
||
|
.ft B
|
||
|
.nf
|
||
|
|
||
|
struct bpf_stat {
|
||
|
u_int bs_recv;
|
||
|
u_int bs_drop;
|
||
|
};
|
||
|
.fi
|
||
|
.ft R
|
||
|
.IP
|
||
|
The fields are:
|
||
|
.RS
|
||
|
.TP 15
|
||
|
.I bs_recv
|
||
|
the number of packets received by the descriptor since opened or reset
|
||
|
(including any buffered since the last read call);
|
||
|
and
|
||
|
.TP
|
||
|
.I bs_drop
|
||
|
the number of packets which were accepted by the filter but dropped by the
|
||
|
kernel because of buffer overflows
|
||
|
(i.e., the application's reads aren't keeping up with the packet traffic).
|
||
|
.RE
|
||
|
.TP 10
|
||
|
.B BIOCIMMEDIATE (u_int)
|
||
|
Enable or disable ``immediate mode'', based on the truth value of the argument.
|
||
|
When immediate mode is enabled, reads return immediately upon packet
|
||
|
reception. Otherwise, a read will block until either the kernel buffer
|
||
|
becomes full or a timeout occurs.
|
||
|
This is useful for programs like
|
||
|
.I rarpd(8c),
|
||
|
which must respond to messages in real time.
|
||
|
The default for a new file is off.
|
||
|
.TP 10
|
||
|
.B BIOCSETF (struct bpf_program)
|
||
|
Sets the filter program used by the kernel to discard uninteresting
|
||
|
packets. An array of instructions and its length is passed in using
|
||
|
the following structure:
|
||
|
.ft B
|
||
|
.nf
|
||
|
|
||
|
struct bpf_program {
|
||
|
int bf_len;
|
||
|
struct bpf_insn *bf_insns;
|
||
|
};
|
||
|
.fi
|
||
|
.ft R
|
||
|
.IP
|
||
|
The filter program is pointed to by the
|
||
|
.I bf_insns
|
||
|
field while its length in units of `struct bpf_insn' is given by the
|
||
|
.I bf_len
|
||
|
field.
|
||
|
Also, the actions of BIOCFLUSH are performed.
|
||
|
.IP
|
||
|
See section \fBFILTER MACHINE\fP for an explanation of the filter language.
|
||
|
.TP 10
|
||
|
.B BIOCVERSION (struct bpf_version)
|
||
|
Returns the major and minor version numbers of the filter languange currently
|
||
|
recognized by the kernel. Before installing a filter, applications must check
|
||
|
that the current version is compatible with the running kernel. Version
|
||
|
numbers are compatible if the major numbers match and the application minor
|
||
|
is less than or equal to the kernel minor. The kernel version number is
|
||
|
returned in the following structure:
|
||
|
.ft B
|
||
|
.nf
|
||
|
|
||
|
struct bpf_version {
|
||
|
u_short bv_major;
|
||
|
u_short bv_minor;
|
||
|
};
|
||
|
.fi
|
||
|
.ft R
|
||
|
.IP
|
||
|
The current version numbers are given by
|
||
|
.B BPF_MAJOR_VERSION
|
||
|
and
|
||
|
.B BPF_MINOR_VERSION
|
||
|
from <net/bpf.h>.
|
||
|
An incompatible filter
|
||
|
may result in undefined behavior (most likely, an error returned by
|
||
|
.I ioctl()
|
||
|
or haphazard packet matching).
|
||
|
.SH BPF HEADER
|
||
|
The following structure is prepended to each packet returned by
|
||
|
.I read(2):
|
||
|
.in 15
|
||
|
.ft B
|
||
|
.nf
|
||
|
|
||
|
struct bpf_hdr {
|
||
|
struct timeval bh_tstamp;
|
||
|
u_long bh_caplen;
|
||
|
u_long bh_datalen;
|
||
|
u_short bh_hdrlen;
|
||
|
};
|
||
|
.fi
|
||
|
.ft R
|
||
|
.in -15
|
||
|
.PP
|
||
|
The fields, whose values are stored in host order, and are:
|
||
|
.TP 15
|
||
|
.I bh_tstamp
|
||
|
The time at which the packet was processed by the packet filter.
|
||
|
.TP
|
||
|
.I bh_caplen
|
||
|
The length of the captured portion of the packet. This is the minimum of
|
||
|
the truncation amount specified by the filter and the length of the packet.
|
||
|
.TP
|
||
|
.I bh_datalen
|
||
|
The length of the packet off the wire.
|
||
|
This value is independent of the truncation amount specified by the filter.
|
||
|
.TP
|
||
|
.I bh_hdrlen
|
||
|
The length of the BPF header, which may not be equal to
|
||
|
.I sizeof(struct bpf_hdr).
|
||
|
.RE
|
||
|
.PP
|
||
|
The
|
||
|
.I bh_hdrlen
|
||
|
field exists to account for
|
||
|
padding between the header and the link level protocol.
|
||
|
The purpose here is to guarantee proper alignment of the packet
|
||
|
data structures, which is required on alignment sensitive
|
||
|
architectures and and improves performance on many other architectures.
|
||
|
The packet filter insures that the
|
||
|
.I bpf_hdr
|
||
|
and the \fInetwork layer\fR header will be word aligned. Suitable precautions
|
||
|
must be taken when accessing the link layer protocol fields on alignment
|
||
|
restricted machines. (This isn't a problem on an Ethernet, since
|
||
|
the type field is a short falling on an even offset,
|
||
|
and the addresses are probably accessed in a bytewise fashion).
|
||
|
.PP
|
||
|
Additionally, individual packets are padded so that each starts
|
||
|
on a word boundary. This requires that an application
|
||
|
has some knowledge of how to get from packet to packet.
|
||
|
The macro BPF_WORDALIGN is defined in <net/bpf.h> to facilitate
|
||
|
this process. It rounds up its argument
|
||
|
to the nearest word aligned value (where a word is BPF_ALIGNMENT bytes wide).
|
||
|
.PP
|
||
|
For example, if `p' points to the start of a packet, this expression
|
||
|
will advance it to the next packet:
|
||
|
.sp
|
||
|
.RS
|
||
|
.ce 1
|
||
|
.nf
|
||
|
p = (char *)p + BPF_WORDALIGN(p->bh_hdrlen + p->bh_caplen)
|
||
|
.fi
|
||
|
.RE
|
||
|
.PP
|
||
|
For the alignment mechanisms to work properly, the
|
||
|
buffer passed to
|
||
|
.I read(2)
|
||
|
must itself be word aligned.
|
||
|
.I malloc(3)
|
||
|
will always return an aligned buffer.
|
||
|
.ft R
|
||
|
.SH FILTER MACHINE
|
||
|
A filter program is an array of instructions, with all branches forwardly
|
||
|
directed, terminated by a \fBreturn\fP instruction.
|
||
|
Each instruction performs some action on the pseudo-machine state,
|
||
|
which consists of an accumulator, index register, scratch memory store,
|
||
|
and implicit program counter.
|
||
|
|
||
|
The following structure defines the instruction format:
|
||
|
.RS
|
||
|
.ft B
|
||
|
.nf
|
||
|
|
||
|
struct bpf_insn {
|
||
|
u_short code;
|
||
|
u_char jt;
|
||
|
u_char jf;
|
||
|
long k;
|
||
|
};
|
||
|
.fi
|
||
|
.ft R
|
||
|
.RE
|
||
|
|
||
|
The \fIk\fP field is used in differnet ways by different insutructions,
|
||
|
and the \fIjt\fP and \fIjf\fP fields are used as offsets
|
||
|
by the branch intructions.
|
||
|
The opcodes are encoded in a semi-hierarchical fashion.
|
||
|
There are eight classes of intructions: BPF_LD, BPF_LDX, BPF_ST, BPF_STX,
|
||
|
BPF_ALU, BPF_JMP, BPF_RET, and BPF_MISC. Various other mode and
|
||
|
operator bits are or'd into the class to give the actual instructions.
|
||
|
The classes and modes are defined in <net/bpf.h>.
|
||
|
|
||
|
Below are the semantics for each defined BPF instruction.
|
||
|
We use the convention that A is the accumulator, X is the index register,
|
||
|
P[] packet data, and M[] scratch memory store.
|
||
|
P[i:n] gives the data at byte offset ``i'' in the packet,
|
||
|
interpreted as a word (n=4),
|
||
|
unsigned halfword (n=2), or unsigned byte (n=1).
|
||
|
M[i] gives the i'th word in the scratch memory store, which is only
|
||
|
addressed in word units. The memory store is indexed from 0 to BPF_MEMWORDS-1.
|
||
|
\fIk\fP, \fIjt\fP, and \fIjf\fP are the corresponding fields in the
|
||
|
instruction definition. ``len'' refers to the length of the packet.
|
||
|
|
||
|
.TP 10
|
||
|
.B BPF_LD
|
||
|
These instructions copy a value into the accumulator. The type of the
|
||
|
source operand is specified by an ``addressing mode'' and can be
|
||
|
a constant (\fBBPF_IMM\fP), packet data at a fixed offset (\fBBPF_ABS\fP),
|
||
|
packet data at a variable offset (\fBBPF_IND\fP), the packet length
|
||
|
(\fBBPF_LEN\fP),
|
||
|
or a word in the scratch memory store (\fBBPF_MEM\fP).
|
||
|
For \fBBPF_IND\fP and \fBBPF_ABS\fP, the data size must be specified as a word
|
||
|
(\fBBPF_W\fP), halfword (\fBBPF_H\fP), or byte (\fBBPF_B\fP).
|
||
|
The semantics of all the recognized BPF_LD instructions follow.
|
||
|
|
||
|
.RS
|
||
|
.TP 30
|
||
|
.B BPF_LD+BPF_W+BPF_ABS
|
||
|
A <- P[k:4]
|
||
|
.TP
|
||
|
.B BPF_LD+BPF_H+BPF_ABS
|
||
|
A <- P[k:2]
|
||
|
.TP
|
||
|
.B BPF_LD+BPF_B+BPF_ABS
|
||
|
A <- P[k:1]
|
||
|
.TP
|
||
|
.B BPF_LD+BPF_W+BPF_IND
|
||
|
A <- P[X+k:4]
|
||
|
.TP
|
||
|
.B BPF_LD+BPF_H+BPF_IND
|
||
|
A <- P[X+k:2]
|
||
|
.TP
|
||
|
.B BPF_LD+BPF_B+BPF_IND
|
||
|
A <- P[X+k:1]
|
||
|
.TP
|
||
|
.B BPF_LD+BPF_W+BPF_LEN
|
||
|
A <- len
|
||
|
.TP
|
||
|
.B BPF_LD+BPF_IMM
|
||
|
A <- k
|
||
|
.TP
|
||
|
.B BPF_LD+BPF_MEM
|
||
|
A <- M[k]
|
||
|
.RE
|
||
|
|
||
|
.TP 10
|
||
|
.B BPF_LDX
|
||
|
These instructions load a value into the index register. Note that
|
||
|
the addressing modes are more retricted than those of the accumulator loads,
|
||
|
but they include
|
||
|
.B BPF_MSH,
|
||
|
a hack for efficiently loading the IP header length.
|
||
|
.RS
|
||
|
.TP 30
|
||
|
.B BPF_LDX+BPF_W+BPF_IMM
|
||
|
X <- k
|
||
|
.TP
|
||
|
.B BPF_LDX+BPF_W+BPF_MEM
|
||
|
X <- M[k]
|
||
|
.TP
|
||
|
.B BPF_LDX+BPF_W+BPF_LEN
|
||
|
X <- len
|
||
|
.TP
|
||
|
.B BPF_LDX+BPF_B+BPF_MSH
|
||
|
X <- 4*(P[k:1]&0xf)
|
||
|
.RE
|
||
|
|
||
|
.TP 10
|
||
|
.B BPF_ST
|
||
|
This instruction stores the accumulator into the scratch memory.
|
||
|
We do not need an addressing mode since there is only one possibility
|
||
|
for the destination.
|
||
|
.RS
|
||
|
.TP 30
|
||
|
.B BPF_ST
|
||
|
M[k] <- A
|
||
|
.RE
|
||
|
|
||
|
.TP 10
|
||
|
.B BPF_STX
|
||
|
This instruction stores the index register in the scratch memory store.
|
||
|
.RS
|
||
|
.TP 30
|
||
|
.B BPF_STX
|
||
|
M[k] <- X
|
||
|
.RE
|
||
|
|
||
|
.TP 10
|
||
|
.B BPF_ALU
|
||
|
The alu instructions perform operations between the accumulator and
|
||
|
index register or constant, and store the result back in the accumulator.
|
||
|
For binary operations, a source mode is required (\fBBPF_K\fP or
|
||
|
\fBBPF_X\fP).
|
||
|
.RS
|
||
|
.TP 30
|
||
|
.B BPF_ALU+BPF_ADD+BPF_K
|
||
|
A <- A + k
|
||
|
.TP
|
||
|
.B BPF_ALU+BPF_SUB+BPF_K
|
||
|
A <- A - k
|
||
|
.TP
|
||
|
.B BPF_ALU+BPF_MUL+BPF_K
|
||
|
A <- A * k
|
||
|
.TP
|
||
|
.B BPF_ALU+BPF_DIV+BPF_K
|
||
|
A <- A / k
|
||
|
.TP
|
||
|
.B BPF_ALU+BPF_AND+BPF_K
|
||
|
A <- A & k
|
||
|
.TP
|
||
|
.B BPF_ALU+BPF_OR+BPF_K
|
||
|
A <- A | k
|
||
|
.TP
|
||
|
.B BPF_ALU+BPF_LSH+BPF_K
|
||
|
A <- A << k
|
||
|
.TP
|
||
|
.B BPF_ALU+BPF_RSH+BPF_K
|
||
|
A <- A >> k
|
||
|
.TP
|
||
|
.B BPF_ALU+BPF_ADD+BPF_X
|
||
|
A <- A + X
|
||
|
.TP
|
||
|
.B BPF_ALU+BPF_SUB+BPF_X
|
||
|
A <- A - X
|
||
|
.TP
|
||
|
.B BPF_ALU+BPF_MUL+BPF_X
|
||
|
A <- A * X
|
||
|
.TP
|
||
|
.B BPF_ALU+BPF_DIV+BPF_X
|
||
|
A <- A / X
|
||
|
.TP
|
||
|
.B BPF_ALU+BPF_AND+BPF_X
|
||
|
A <- A & X
|
||
|
.TP
|
||
|
.B BPF_ALU+BPF_OR+BPF_X
|
||
|
A <- A | X
|
||
|
.TP
|
||
|
.B BPF_ALU+BPF_LSH+BPF_X
|
||
|
A <- A << X
|
||
|
.TP
|
||
|
.B BPF_ALU+BPF_RSH+BPF_X
|
||
|
A <- A >> X
|
||
|
.TP
|
||
|
.B BPF_ALU+BPF_NEG
|
||
|
A <- -A
|
||
|
.RE
|
||
|
|
||
|
.TP 10
|
||
|
.B BPF_JMP
|
||
|
The jump instructions alter flow of control. Conditional jumps
|
||
|
compare the accumulator against a constant (\fBBPF_K\fP) or
|
||
|
the index register (\fBBPF_X\fP). If the result is true (or non-zero),
|
||
|
the true branch is taken, otherwise the false branch is taken.
|
||
|
Jump offsets are encoded in 8 bits so the longest jump is 256 instructions.
|
||
|
However, the jump always (\fBBPF_JA\fP) opcode uses the 32 bit \fIk\fP
|
||
|
field as the offset, allowing arbitrarily distant destinations.
|
||
|
All conditionals use unsigned comparison conventions.
|
||
|
.RS
|
||
|
.TP 30
|
||
|
.B BPF_JMP+BPF_JA
|
||
|
pc += k
|
||
|
.TP
|
||
|
.B BPF_JMP+BPF_JGT+BPF_K
|
||
|
pc += (A > k) ? jt : jf
|
||
|
.TP
|
||
|
.B BPF_JMP+BPF_JGE+BPF_K
|
||
|
pc += (A >= k) ? jt : jf
|
||
|
.TP
|
||
|
.B BPF_JMP+BPF_JEQ+BPF_K
|
||
|
pc += (A == k) ? jt : jf
|
||
|
.TP
|
||
|
.B BPF_JMP+BPF_JSET+BPF_K
|
||
|
pc += (A & k) ? jt : jf
|
||
|
.TP
|
||
|
.B BPF_JMP+BPF_JGT+BPF_X
|
||
|
pc += (A > X) ? jt : jf
|
||
|
.TP
|
||
|
.B BPF_JMP+BPF_JGE+BPF_X
|
||
|
pc += (A >= X) ? jt : jf
|
||
|
.TP
|
||
|
.B BPF_JMP+BPF_JEQ+BPF_X
|
||
|
pc += (A == X) ? jt : jf
|
||
|
.TP
|
||
|
.B BPF_JMP+BPF_JSET+BPF_X
|
||
|
pc += (A & X) ? jt : jf
|
||
|
.RE
|
||
|
.TP 10
|
||
|
.B BPF_RET
|
||
|
The return instructions terminate the filter program and specify the amount
|
||
|
of packet to accept (i.e., they return the truncation amount). A return
|
||
|
value of zero indicates that the packet should be ignored.
|
||
|
The return value is either a constant (\fBBPF_K\fP) or the accumulator
|
||
|
(\fBBPF_A\fP).
|
||
|
.RS
|
||
|
.TP 30
|
||
|
.B BPF_RET+BPF_A
|
||
|
accept A bytes
|
||
|
.TP
|
||
|
.B BPF_RET+BPF_K
|
||
|
accept k bytes
|
||
|
.RE
|
||
|
.TP 10
|
||
|
.B BPF_MISC
|
||
|
The miscellaneous category was created for anything that doesn't
|
||
|
fit into the above classes, and for any new instructions that might need to
|
||
|
be added. Currently, these are the register transfer intructions
|
||
|
that copy the index register to the accumulator or vice versa.
|
||
|
.RS
|
||
|
.TP 30
|
||
|
.B BPF_MISC+BPF_TAX
|
||
|
X <- A
|
||
|
.TP
|
||
|
.B BPF_MISC+BPF_TXA
|
||
|
A <- X
|
||
|
.RE
|
||
|
.PP
|
||
|
The BPF interface provides the following macros to facilitate
|
||
|
array initializers:
|
||
|
.RS
|
||
|
\fBBPF_STMT\fI(opcode, operand)\fR
|
||
|
.br
|
||
|
and
|
||
|
.br
|
||
|
\fBBPF_JUMP\fI(opcode, operand, true_offset, false_offset)\fR
|
||
|
.RE
|
||
|
.PP
|
||
|
.SH EXAMPLES
|
||
|
The following filter is taken from the Reverse ARP Daemon. It accepts
|
||
|
only Reverse ARP requests.
|
||
|
.RS
|
||
|
.nf
|
||
|
|
||
|
struct bpf_insn insns[] = {
|
||
|
BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
|
||
|
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_REVARP, 0, 3),
|
||
|
BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20),
|
||
|
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, REVARP_REQUEST, 0, 1),
|
||
|
BPF_STMT(BPF_RET+BPF_K, sizeof(struct ether_arp) +
|
||
|
sizeof(struct ether_header)),
|
||
|
BPF_STMT(BPF_RET+BPF_K, 0),
|
||
|
};
|
||
|
.fi
|
||
|
.RE
|
||
|
.PP
|
||
|
This filter accepts only IP packets between host 128.3.112.15 and
|
||
|
128.3.112.35.
|
||
|
.RS
|
||
|
.nf
|
||
|
|
||
|
struct bpf_insn insns[] = {
|
||
|
BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
|
||
|
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 8),
|
||
|
BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 26),
|
||
|
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 2),
|
||
|
BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 30),
|
||
|
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 3, 4),
|
||
|
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 0, 3),
|
||
|
BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 30),
|
||
|
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 1),
|
||
|
BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
|
||
|
BPF_STMT(BPF_RET+BPF_K, 0),
|
||
|
};
|
||
|
.fi
|
||
|
.RE
|
||
|
.PP
|
||
|
Finally, this filter returns only TCP finger packets. We must parse
|
||
|
the IP header to reach the TCP header. The \fBBPF_JSET\fP instruction
|
||
|
checks that the IP fragment offset is 0 so we are sure
|
||
|
that we have a TCP header.
|
||
|
.RS
|
||
|
.nf
|
||
|
|
||
|
struct bpf_insn insns[] = {
|
||
|
BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
|
||
|
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 10),
|
||
|
BPF_STMT(BPF_LD+BPF_B+BPF_ABS, 23),
|
||
|
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, IPPROTO_TCP, 0, 8),
|
||
|
BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20),
|
||
|
BPF_JUMP(BPF_JMP+BPF_JSET+BPF_K, 0x1fff, 6, 0),
|
||
|
BPF_STMT(BPF_LDX+BPF_B+BPF_MSH, 14),
|
||
|
BPF_STMT(BPF_LD+BPF_H+BPF_IND, 14),
|
||
|
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 2, 0),
|
||
|
BPF_STMT(BPF_LD+BPF_H+BPF_IND, 16),
|
||
|
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 0, 1),
|
||
|
BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
|
||
|
BPF_STMT(BPF_RET+BPF_K, 0),
|
||
|
};
|
||
|
.fi
|
||
|
.RE
|
||
|
.SH SEE ALSO
|
||
|
tcpdump(8)
|
||
|
.LP
|
||
|
McCanne, S., Jacobson V.,
|
||
|
.RI ` "An efficient, extensible, and portable network monitor" '
|
||
|
.SH FILES
|
||
|
/dev/bpf0, /dev/bpf1, ...
|
||
|
.SH BUGS
|
||
|
The read buffer must be of a fixed size (returned by the BIOCGBLEN ioctl).
|
||
|
.PP
|
||
|
A file that does not request promiscuous mode may receive promiscuously
|
||
|
received packets as a side effect of another file requesting this
|
||
|
mode on the same hardware interface. This could be fixed in the kernel
|
||
|
with additional processing overhead. However, we favor the model where
|
||
|
all files must assume that the interface is promiscuous, and if
|
||
|
so desired, must utilize a filter to reject foreign packets.
|
||
|
.PP
|
||
|
Data link protocols with variable length headers are not currently supported.
|
||
|
.PP
|
||
|
Under SunOS, if a BPF application reads more than 2^31 bytes of
|
||
|
data, read will fail in EINVAL. You can either fix the bug in SunOS,
|
||
|
or lseek to 0 when read fails for this reason.
|
||
|
.SH HISTORY
|
||
|
.PP
|
||
|
The Enet packet filter was created in 1980 by Mike Accetta and
|
||
|
Rick Rashid at Carnegie-Mellon University. Jeffrey Mogul, at
|
||
|
Stanford, ported the code to BSD and continued its development from
|
||
|
1983 on. Since then, it has evolved into the Ultrix Packet Filter
|
||
|
at DEC, a STREAMS NIT module under SunOS 4.1, and BPF.
|
||
|
.SH AUTHORS
|
||
|
.PP
|
||
|
Steven McCanne, of Lawrence Berkeley Laboratory, implemented BPF in
|
||
|
Summer 1990. Much of the design is due to Van Jacobson.
|