Commit Graph

51 Commits

Author SHA1 Message Date
David Malone
f11c35082b Add a new ioctl for changing the read filter (BIOCSETFNR). This is
just like BIOCSETF but it doesn't drop all the packets buffered on
the discriptor and reset the statistics.

Also, when setting the write filter, don't drop packets waiting to
be read or reset the statistics.

PR:		118486
Submitted by:	Matthew Luckie <mluckie@cs.waikato.ac.nz>
MFC after:	1 month
2008-07-07 09:25:49 +00:00
Christian S.J. Peron
4d621040ff Introduce support for zero-copy BPF buffering, which reduces the
overhead of packet capture by allowing a user process to directly "loan"
buffer memory to the kernel rather than using read(2) to explicitly copy
data from kernel address space.

The user process will issue new BPF ioctls to set the shared memory
buffer mode and provide pointers to buffers and their size. The kernel
then wires and maps the pages into kernel address space using sf_buf(9),
which on supporting architectures will use the direct map region. The
current "buffered" access mode remains the default, and support for
zero-copy buffers must, for the time being, be explicitly enabled using
a sysctl for the kernel to accept requests to use it.

The kernel and user process synchronize use of the buffers with atomic
operations, avoiding the need for system calls under load; the user
process may use select()/poll()/kqueue() to manage blocking while
waiting for network data if the user process is able to consume data
faster than the kernel generates it. Patchs to libpcap are available
to allow libpcap applications to transparently take advantage of this
support. Detailed information on the new API may be found in bpf(4),
including specific atomic operations and memory barriers required to
synchronize buffer use safely.

These changes modify the base BPF implementation to (roughly) abstrac
the current buffer model, allowing the new shared memory model to be
added, and add new monitoring statistics for netstat to print. The
implementation, with the exception of some monitoring hanges that break
the netstat monitoring ABI for BPF, will be MFC'd.

Zerocopy bpf buffers are still considered experimental are disabled
by default. To experiment with this new facility, adjust the
net.bpf.zerocopy_enable sysctl variable to 1.

Changes to libpcap will be made available as a patch for the time being,
and further refinements to the implementation are expected.

Sponsored by:		Seccuris Inc.
In collaboration with:	rwatson
Tested by:		pwood, gallatin
MFC after:		4 months [1]

[1] Certain portions will probably not be MFCed, specifically things
    that can break the monitoring ABI.
2008-03-24 13:49:17 +00:00
Robert Watson
2a0a392e1c Remove trailing whitespace from lines in BPF.
MFC after:	3 days
2007-12-23 14:10:33 +00:00
Max Laier
19ed78ce27 Additions from libpcap 0.9.8 unbreak the build.
Pointy hat to:	mlaier
X-MFC after:	RELENG_7 buildworld
2007-10-21 13:23:32 +00:00
Jung-uk Kim
560a54e10c Add three new ioctl(2) commands for bpf(4).
- BIOCGDIRECTION and BIOCSDIRECTION get or set the setting determining
whether incoming, outgoing, or all packets on the interface should be
returned by BPF.  Set to BPF_D_IN to see only incoming packets on the
interface.  Set to BPF_D_INOUT to see packets originating locally and
remotely on the interface.  Set to BPF_D_OUT to see only outgoing
packets on the interface.  This setting is initialized to BPF_D_INOUT
by default.  BIOCGSEESENT and BIOCSSEESENT are obsoleted by these but
kept for backward compatibility.

- BIOCFEEDBACK sets packet feedback mode.  This allows injected packets
to be fed back as input to the interface when output via the interface is
successful.  When BPF_D_INOUT direction is set, injected outgoing packet
is not returned by BPF to avoid duplication.  This flag is initialized to
zero by default.

Note that libpcap has been modified to support BPF_D_OUT direction for
pcap_setdirection(3) and PCAP_D_OUT direction is functional now.

Reviewed by:	rwatson
2007-02-26 22:24:14 +00:00
Sam Leffler
f09c8c4a46 more juniper dlt's
MFC after:	1 month
2006-09-04 19:24:34 +00:00
Christian S.J. Peron
7eae78a419 If bpf(4) has not been compiled into the kernel, initialize the bpf interface
pointer to a zeroed, statically allocated bpf_if structure. This way the
LIST_EMPTY() macro will always return true. This allows us to remove the
additional unconditional memory reference for each packet in the fast path.

Discussed with:	sam
2006-06-14 02:23:28 +00:00
Christian S.J. Peron
ffdc0471d4 Back out previous two commits, this caused some problems in the namespace
resulting in some build failures. Instead, to fix the problem of bpf not
being present, check the pointer before dereferencing it.

This is a temporary bandaid until we can decide on how we want to handle
the bpf code not being present. This will be fixed shortly.
2006-06-03 18:48:14 +00:00
Christian S.J. Peron
727b73816c Temporarily include files so that our macro checks do something useful. 2006-06-03 18:16:54 +00:00
Christian S.J. Peron
5255290c9c Make sure we don't try to dereference the the if_bpf pointer when bpf has
not been compiled into the the kernel.

Submitted by:	benno
2006-06-03 06:37:00 +00:00
Christian S.J. Peron
16d878cc99 Fix the following bpf(4) race condition which can result in a panic:
(1) bpf peer attaches to interface netif0
	(2) Packet is received by netif0
	(3) ifp->if_bpf pointer is checked and handed off to bpf
	(4) bpf peer detaches from netif0 resulting in ifp->if_bpf being
	    initialized to NULL.
	(5) ifp->if_bpf is dereferenced by bpf machinery
	(6) Kaboom

This race condition likely explains the various different kernel panics
reported around sending SIGINT to tcpdump or dhclient processes. But really
this race can result in kernel panics anywhere you have frequent bpf attach
and detach operations with high packet per second load.

Summary of changes:

- Remove the bpf interface's "driverp" member
- When we attach bpf interfaces, we now set the ifp->if_bpf member to the
  bpf interface structure. Once this is done, ifp->if_bpf should never be
  NULL. [1]
- Introduce bpf_peers_present function, an inline operation which will do
  a lockless read bpf peer list associated with the interface. It should
  be noted that the bpf code will pickup the bpf_interface lock before adding
  or removing bpf peers. This should serialize the access to the bpf descriptor
  list, removing the race.
- Expose the bpf_if structure in bpf.h so that the bpf_peers_present function
  can use it. This also removes the struct bpf_if; hack that was there.
- Adjust all consumers of the raw if_bpf structure to use bpf_peers_present

Now what happens is:

	(1) Packet is received by netif0
	(2) Check to see if bpf descriptor list is empty
	(3) Pickup the bpf interface lock
	(4) Hand packet off to process

From the attach/detach side:

	(1) Pickup the bpf interface lock
	(2) Add/remove from bpf descriptor list

Now that we are storing the bpf interface structure with the ifnet, there is
is no need to walk the bpf interface list to locate the correct bpf interface.
We now simply look up the interface, and initialize the pointer. This has a
nice side effect of changing a bpf interface attach operation from O(N) (where
N is the number of bpf interfaces), to O(1).

[1] From now on, we can no longer check ifp->if_bpf to tell us whether or
    not we have any bpf peers that might be interested in receiving packets.

In collaboration with:	sam@
MFC after:	1 month
2006-06-02 19:59:33 +00:00
Christian S.J. Peron
93e39f0b93 Introduce two new ioctl(2) commands, BIOCLOCK and BIOCSETWF. These commands
enhance the security of bpf(4) by further relinquishing the privilege of
the bpf(4) consumer (assuming the ioctl commands are being implemented).

Once BIOCLOCK is executed, the device becomes locked which prevents the
execution of ioctl(2) commands which can change the underly parameters of the
bpf(4) device. An example might be the setting of bpf(4) filter programs or
attaching to different network interfaces.

BIOCSETWF can be used to set write filters for outgoing packets. Currently if
a bpf(4) consumer is compromised, the bpf(4) descriptor can essentially be used
as a raw socket, regardless of consumer's UID. Write filters give users the
ability to constrain which packets can be sent through the bpf(4) descriptor.

These features are currently implemented by a couple programs which came from
OpenBSD, such as the new dhclient and pflogd.

-Modify bpf_setf(9) to accept a "cmd" parameter. This will be used to specify
 whether a read or write filter is to be set.
-Add a bpf(4) filter program as a parameter to bpf_movein(9) as we will run the
 filter program on the mbuf data once we move the packet in from user-space.
-Rather than execute two uiomove operations, (one for the link header and the
 other for the packet data), execute one and manually copy the linker header
 into the sockaddr structure via bcopy.
-Restructure bpf_setf to compensate for write filters, as well as read.
-Adjust bpf(4) stats structures to include a bd_locked member.

It should be noted that the FreeBSD and OpenBSD implementations differ a bit in
the sense that we unconditionally enforce the lock, where OpenBSD enforces it
only if the calling credential is not root.

Idea from:	OpenBSD
Reviewed by:	mlaier
2005-08-22 19:35:48 +00:00
Sam Leffler
e0d80bffb5 additions from libpcap 0.9.1 release
Approved by:	re (scottl)
2005-07-11 03:16:23 +00:00
Sam Leffler
f6f1669c0f integrate changes from libpcap-0.9.1-096
Reviewed by:	bms
2005-05-28 21:56:41 +00:00
Warner Losh
c398230b64 /* -> /*- for license, minor formatting changes 2005-01-07 01:45:51 +00:00
David Malone
bde800e688 Make the comment for DLT_NULL slightly more accurate.
PR:		62272
Submitted by:	Radim Kolar <hsn@netmag.cz>
MFC after:	1 week
2004-05-30 17:03:48 +00:00
Warner Losh
f36cfd49ad Remove advertising clause from University of California Regent's
license, per letter dated July 22, 1999 and email from Peter Wemm,
Alan Cox and Robert Watson.

Approved by: core, peter, alc, rwatson
2004-04-07 20:46:16 +00:00
Bruce M Simpson
1acc2f81b1 Add more DLT types required by libpcap 0.8.3.
Maintain numeric sort order.
2004-03-31 14:22:13 +00:00
Bruce M Simpson
a7135a6201 Update system bpf headers for libpcap 0.8.3.
Maintain listing of DLT link types in numeric order.
2004-03-31 14:09:26 +00:00
Max Laier
cc5934f5af Tweak existing header and other build infrastructure to be able to build
pf/pflog/pfsync as modules. Do not list them in NOTES or modules/Makefile
(i.e. do not connect it to any (automatic) builds - yet).

Approved by: bms(mentor)
2004-02-26 03:53:54 +00:00
Sam Leffler
437ffe1823 o eliminate widespread on-stack mbuf use for bpf by introducing
a new bpf_mtap2 routine that does the right thing for an mbuf
  and a variable-length chunk of data that should be prepended.
o while we're sweeping the drivers, use u_int32_t uniformly when
  when prepending the address family (several places were assuming
  sizeof(int) was 4)
o return M_ASSERTVALID to BPF_MTAP* now that all stack-allocated
  mbufs have been eliminated; this may better be moved to the bpf
  routines

Reviewed by:	arch@ and several others
2003-12-28 03:56:00 +00:00
Mike Silbersack
d7be4a893c Remove the call to M_ASSERTVALID from BPF_MTAP; some mbufs passed to
mpf are allocated on the stack, which causes this check to falsely trigger.

A new check which takes on-stack mbufs into account will be reintroduced
after 5.2 is out the door.

Approved by:	re (watson)
Requested by:	many
2003-11-28 18:48:59 +00:00
Mike Silbersack
5d5b5d0f99 Add a new macro M_ASSERTVALID which ensures that the mbuf in question
is non-free.  (More checks can/should be added in the future.)

Use M_ASSERTVALID in BPF_MTAP so that we catch when freed mbufs are
passed in, even if no bpf listeners are active.

Inspired by a bug in if_dc caught by Kenjiro Cho.
2003-10-19 22:33:41 +00:00
Sam Leffler
8eab61f3de o add BIOCGDLTLIST and BIOCSDLT ioctls to get the data link type list
and set the link type for use by libpcap and tcpdump
o move mtx unlock in bpfdetach up; it doesn't need to be held so long
o change printf in bpf_detach to distinguish it from the same one in bpfsetdlt

Note there are locking issues here related to ioctl processing; they
have not been addressed here.

Submitted by:	Guy Harris <guy@alum.mit.edu>
Obtained from:	NetBSD (w/ locking modifications)
2003-01-20 19:08:46 +00:00
Sam Leffler
24a229f466 o add support for multiple link types per interface (e.g. 802.11 and Ethernet)
o introduce BPF_TAP and BPF_MTAP macros to hide implementation details and
  ease code portability
o use m_getcl where appropriate

Reviewed by:	many
Approved by:	re
Obtained from:	NetBSD (multiple link type support)
2002-11-14 23:24:13 +00:00
Bill Fenner
94413c0dba Update for libpcap 0.7.1
Originally-committed-to-wrong-repository by:	fenner
2002-06-21 05:29:40 +00:00
Alfred Perlstein
929ddbbb89 Remove __P. 2002-03-19 21:54:18 +00:00
Bill Fenner
46da4bc6fc Update our bpf.h with tcpdump.org's new DLT_ types.
Use our bpf.h instead of tcpdump.org's to build libpcap.
2001-07-31 23:27:06 +00:00
Robert Watson
de5d99354f The advent of if_detach, allowing interface removal at runtime, makes it
possible for a panic to occur if BPF is in use on the interface at the
time of the call to if_detach.  This happens because BPF maintains pointers
to the struct ifnet describing the interface, which is freed by if_detach.

To correct this problem, a new call, bpfdetach, is introduced.  bpfdetach
locates BPF descriptor references to the interface, and NULLs them.  Other
BPF code is modified so that discovery of a NULL interface results in
ENXIO (already implemented for some calls).  Processes blocked on a BPF
call will also be woken up so that they can receive ENXIO.

Interface drivers that invoke bpfattach and if_detach must be modified to
also call bpfattach(ifp) before calling if_detach(ifp).  This is relevant
for buses that support hot removal, such as pccard and usb.  Patches to
all effected devices will not be committed, only to if_wi.c, due to
testing limitations.  To reproduce the crash, load up tcpdump on you
favorite pccard ethernet card, and then eject the card.  As some pccard
drivers do not invoke if_detach(ifp), this bug will not manifest itself
for those drivers.

Reviewed by:	wes
2000-03-19 05:42:34 +00:00
Robert Watson
8ed3828c3b Introduce a new bd_seesent flag to the BPF descriptor, indicating whether or
not the current BPF device should report locally generated packets or not.
This allows sniffing applications to see only packets that are not generated
locally, which can be useful for debugging bridging problems, or other
situations where MAC addresses are not sufficient to identify locally
sourced packets.  Default to true for this flag, so as to provide existing
behavior by default.

Introduce two new ioctls, BIOCGSEESENT and BIOCSSEESENT, which may be used
to manipulate this flag from userland, given appropriate privilege.

Modify bpf.4 to document these two new ioctl arguments.

Reviewed by:	asmodai
2000-03-18 06:30:42 +00:00
Poul-Henning Kamp
eba2a1aeb9 |The hard limit for the BPF buffer size is 32KB, which appears too low
|for high speed networks (even at 100Mbit/s this corresponds to 1/300th
|of a second). The default buffer size is 4KB, but libpcap and ipfilter
|both override this (using the BIOCSBLEN ioctl) and allocate 32KB.
|
|The following patch adds an sysctl for bpf_maxbufsize, similar to the
|one for bpf_bufsize that you added back in December 1995. I choose to
|make the default for this limit 512KB (the value suggested by NFR).

Submitted by:	se
Reviewed by:	phk
2000-01-15 19:46:12 +00:00
Peter Wemm
664a31e496 Change #ifdef KERNEL to #ifdef _KERNEL in the public headers. "KERNEL"
is an application space macro and the applications are supposed to be free
to use it as they please (but cannot).  This is consistant with the other
BSD's who made this change quite some time ago.  More commits to come.
1999-12-29 04:46:21 +00:00
Archie Cobbs
dcb129d597 Add 'const' to the bpf_filter() and bpf_validate() prototypes.
Remove a stale comment from bpf_validate().
1999-12-02 19:36:05 +00:00
Mike Smith
114ae644b5 Implement pseudo_AF_HDRCMPLT, which controls the state of the 'header
completion' flag.  If set, the interface output routine will assume that
the packet already has a valid link-level source address.  This defaults
to off (the address is overwritten)

PR:		kern/10680
Submitted by:	"Christopher N . Harrell" <cnh@mindspring.net>
Obtained from:	NetBSD
1999-10-15 05:07:00 +00:00
Peter Wemm
c3aac50f28 $Id$ -> $FreeBSD$ 1999-08-28 01:08:13 +00:00
Alexander Langer
ba136d4fea Change BPF_ALIGNMENT to long, necessary for correct alignment on Alpha. 1998-10-04 21:53:59 +00:00
Bill Fenner
c2b0c42413 Add DLT_{SLIP,PPP}_BSDOS from libpcap 0.4 1998-09-15 19:35:37 +00:00
Andrey A. Chernov
22f05c4320 Implement DLT_RAW from libpcap 1998-08-18 10:13:11 +00:00
Bruce Evans
c086febef5 Don't attempt to optimize the space allocated for bpf headers if
sizeof(struct bpf_hdr) > 20.  20 is normal on 32-bit systems with
32-bit alignment, but we still assume that the last 2 bytes of the
struct are unnecessary padding on such systems.  On systems with
64-bit longs, struct timeval is bloated to 16 bytes, so bpf headers
certainly don't fit in 18 bytes.
1998-07-13 10:44:02 +00:00
Peter Wemm
6875d25465 Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.
1997-02-22 09:48:43 +00:00
Jordan K. Hubbard
1130b656e5 Make the long-awaited change from $Id$ to $FreeBSD$
This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore.  This update would have been
insane otherwise.
1997-01-14 07:20:47 +00:00
Paul Traina
11a53ef3f3 Update to match definitions in LBL June 96 release 1996-08-19 20:28:25 +00:00
Garrett Wollman
9b44ff2214 Clean up Ethernet drivers:
- fill in and use ifp->if_softc
	- use if_bpf rather than private cookie variables
	- change bpf interface to take advantage of this
	- call ether_ifattach() directly from Ethernet drivers
	- delete kludge in if_attach() that did this indirectly
1996-02-06 18:51:28 +00:00
Mike Pritchard
6c5e9bbdf5 Fix a bunch of spelling errors in the comment fields of
a bunch of system include files.
1996-01-30 23:02:38 +00:00
Bruce Evans
4fda91c705 Moved prototypes for devswitch functions from conf.c and driver sources
to <machine/conf.h>.  conf.h was mechanically generated by
`grep ^d_ conf.c >conf.h'.  This accounts for part of its ugliness.  The
prototypes should be moved back to the driver sources when the functions
are staticalized.
1995-11-04 13:25:33 +00:00
Bruce Evans
6003967057 Fix benign type mismatches in devsw functions. 82 out of 299 devsw
functions were wrong.
1995-09-08 11:09:15 +00:00
Paul Traina
00a838879b Give the BPF the ability to generate signals when a packet is available.
Reviewed by:	pst & wollman
Submitted by:	grossman@cygnus.com
1995-06-15 18:11:00 +00:00
Rodney W. Grimes
9b2e535452 Remove trailing whitespace. 1995-05-30 08:16:23 +00:00
Paul Richards
cea1da3be2 Make idempotent.
Submitted by:	Paul
1994-08-21 05:11:48 +00:00
David Greenman
3c4dd3568f Added $Id$ 1994-08-02 07:55:43 +00:00