freebsd-skq

Author	SHA1	Message	Date
Warner Losh	fbbd9655e5	Renumber copyright clause 4 Renumber cluase 4 to 3, per what everybody else did when BSD granted them permission to remove clause 3. My insistance on keeping the same numbering for legal reasons is too pedantic, so give up on that point. Submitted by: Jan Schaumann <jschauma@stevens.edu> Pull Request: https://github.com/freebsd/freebsd/pull/96	2017-02-28 23:42:47 +00:00
Xin LI	ada6f083b9	MFV r313676: libpcap 1.8.1 MFC after: 1 month	2017-02-13 08:23:39 +00:00
Pedro F. Giffuni	a4641f4eaa	sys/net*: minor spelling fixes. No functional change.	2016-05-03 18:05:43 +00:00
Bjoern A. Zeeb	05fc416403	During if_vmove() we call if_detach_internal() which in turn calls the event handler notifying about interface departure and one of the consumers will detach if_bpf. There is no way for us to re-attach this easily as the DLT and hdrlen are only given on interface creation. Add a function to allow us to query the DLT and hdrlen from a current BPF attachment and after if_attach_internal() manually re-add the if_bpf attachment using these values. Found by panics triggered by nd6 packets running past BPF_MTAP() with no proper if_bpf pointer on the interface. Also add a basic DDB show function to investigate the if_bpf attachment of an interface. Reviewed by: gnn MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D5896	2016-04-11 10:00:38 +00:00
Luiz Otavio O Souza	f87e372ef2	Remove two unnecessary sleeps from the hot path in bpf(4). The first one never triggers because bpf_canfreebuf() can only be true for zero-copy buffers and zero-copy buffers are not read with read(2). The second also never triggers, because we check the free buffer before calling ROTATE_BUFFERS(). If the hold buffer is in use the free buffer will be NULL and there is nothing else to do besides drop the packet. If the free buffer isn't NULL the hold buffer _is_ free and it is safe to rotate the buffers. Update the comment in ROTATE_BUFFERS macro to match the logic described here. While here fix a few typos in comments. MFC after: 2 weeks Sponsored by: Rubicon Communications (Netgate)	2015-07-31 21:43:27 +00:00
Mark Johnston	b23cbbe6db	Move the definition of struct bpf_if to bpf.c. A couple of fields are still exposed via struct bpf_if_ext so that bpf_peers_present() can be inlined into its callers. However, this change eliminates some type duplication in the resulting CTF container, since otherwise ctfmerge(1) propagates the duplication through all types that contain a struct bpf_if. Differential Revision: https://reviews.freebsd.org/D2319 Reviewed by: melifaro, rpaulo	2015-04-20 22:08:11 +00:00
Xin LI	681ed54caa	MFV r276759: libpcap 1.6.2. MFC after: 1 month	2015-01-06 22:29:12 +00:00
Gleb Smirnoff	eaeb0c139a	Style: s/SYS_EVENTHANDLER_H/_SYS_EVENTHANDLER_H_/g Submitted by: bde	2013-10-28 20:32:05 +00:00
Gleb Smirnoff	7ced9c2f66	Instead of putting ifnet declaration into eventhandler.h, move bpf(4) and vlan(4) related event declarations to bpf.h and if_vlan_var.h. To avoid dependency on eventhandler.h, protect these declarations with ifdef SYS_EVENTHANDLER_H. Sponsored by: Netflix Sponsored by: Nginx, Inc.	2013-10-28 07:45:03 +00:00
Guy Helmer	3b3b91e736	Changes to resolve races in bpfread() and catchpacket() that, at worst, cause kernel panics. Add a flag to the bpf descriptor to indicate whether the hold buffer is in use. In bpfread(), set the "hold buffer in use" flag before dropping the descriptor lock during the call to bpf_uiomove(). Everywhere else the hold buffer is used or changed, wait while the hold buffer is in use by bpfread(). Add a KASSERT in bpfread() after re-acquiring the descriptor lock to assist uncovering any additional hold buffer races.	2012-12-10 16:14:44 +00:00
Xin LI	15752fa858	MFV: libpcap 1.3.0. MFC after: 4 weeks	2012-10-05 18:42:50 +00:00
Alexander V. Chernikov	afa85850e7	Fix old panic when BPF consumer attaches to destroying interface. 'flags' field is added to the end of bpf_if structure. Currently the only flag is BPFIF_FLAG_DYING which is set on bpf detach and checked by bpf_attachd() Problem can be easily triggered on SMP stable/[89] by the following command (sort of): 'while true; do ifconfig vlan222 create vlan 222 vlandev em0 up ; tcpdump -pi vlan222 & ; ifconfig vlan222 destroy ; done' Fix possible use-after-free when BPF detaches itself from interface, freeing bpf_bif memory, while interface is still UP and there can be routes via this interface. Freeing is now delayed till ifnet_departure_event is received via eventhandler(9) api. Convert bpfd rwlock back to mutex due lack of performance gain (currently checking if packet matches filter is done without holding bpfd lock and we have to acquire write lock if packet matches) Approved by: kib(mentor) MFC in: 4 weeks	2012-05-21 22:17:29 +00:00
Xin LI	47db53c31a	Sync DLTs with the latest pcap version. MFC after: 2 weeks	2012-05-14 05:10:41 +00:00
Alexander V. Chernikov	51ec1eb70d	- Improve performace for writer-only BPF users. Linux and Solaris (at least OpenSolaris) has PF_PACKET socket families to send raw ethernet frames. The only FreeBSD interface that can be used to send raw frames is BPF. As a result, many programs like cdpd, lldpd, various dhcp stuff uses BPF only to send data. This leads us to the situation when software like cdpd, being run on high-traffic-volume interface significantly reduces overall performance since we have to acquire additional locks for every packet. Here we add sysctl that changes BPF behavior in the following way: If program came and opens BPF socket without explicitly specifyin read filter we assume it to be write-only and add it to special writer-only per-interface list. This makes bpf_peers_present() return 0, so no additional overhead is introduced. After filter is supplied, descriptor is added to original per-interface list permitting packets to be captured. Unfortunately, pcap_open_live() sets catch-all filter itself for the purpose of setting snap length. Fortunately, most programs explicitly sets (event catch-all) filter after that. tcpdump(1) is a good example. So a bit hackis approach is taken: we upgrade description only after second BIOCSETF is received. Sysctl is named net.bpf.optimize_writers and is turned off by default. - While here, document all sysctl variables in bpf.4 Sponsored by Yandex LLC Reviewed by: glebius (previous version) Reviewed by: silence on -net@ Approved by: (mentor) MFC after: 4 weeks	2012-04-06 06:55:21 +00:00
Alexander V. Chernikov	e4b3229aa5	- Improve BPF locking model. Interface locks and descriptor locks are converted from mutex(9) to rwlock(9). This greately improves performance: in most common case we need to acquire 1 reader lock instead of 2 mutexes. - Remove filter(descriptor) (reader) lock in bpf_mtap[2] This was suggested by glebius@. We protect filter by requesting interface writer lock on filter change. - Cover struct bpf_if under BPF_INTERNAL define. This permits including bpf.h without including rwlock stuff. However, this is is temporary solution, struct bpf_if should be made opaque for any external caller. Found by: Dmitrij Tejblum <tejblum@yandex-team.ru> Sponsored by: Yandex LLC Reviewed by: glebius (previous version) Reviewed by: silence on -net@ Approved by: (mentor) MFC after: 3 weeks	2012-04-06 06:53:58 +00:00
Lawrence Stewart	253a3814d4	Revert r228986 until it can be reworked to avoid panicing the kernel when the same interface is attached multiple times with different DLTs, as is done in net80211 for example. Reported by: adrian	2011-12-31 07:21:28 +00:00
Lawrence Stewart	0f89fc22f3	- Introduce the net.bpf.tscfg sysctl tree and associated code so as to make one aspect of time stamp configuration per interface rather than per BPF descriptor. Prior to this, the order in which BPF devices were opened and the per descriptor time stamp configuration settings could cause non-deterministic and unintended behaviour with respect to time stamping. With the new scheme, a BPF attached interface's tscfg sysctl entry can be set to "default", "none", "fast", "normal" or "external". Setting "default" means use the system default option (set with the net.bpf.tscfg.default sysctl), "none" means do not generate time stamps for tapped packets, "fast" means generate time stamps for tapped packets using a hz granularity system clock read, "normal" means generate time stamps for tapped packets using a full timecounter granularity system clock read and "external" (currently unimplemented) means use the time stamp provided with the packet from an underlying source. - Utilise the recently introduced sysclock_getsnapshot() and sysclock_snap2bintime() KPIs to ensure the system clock is only read once per packet, regardless of the number of BPF descriptors and time stamp formats requested. Use the per BPF attached interface time stamp configuration to control if sysclock_getsnapshot() is called and whether the system clock read is fast or normal. The per BPF descriptor time stamp configuration is then used to control how the system clock snapshot is converted to a bintime by sysclock_snap2bintime(). - Remove all FAST related BPF descriptor flag variants. Performing a "fast" read of the system clock is now controlled per BPF attached interface using the net.bpf.tscfg sysctl tree. - Update the bpf.4 man page. Committed on behalf of Julien Ridoux and Darryl Veitch from the University of Melbourne, Australia, as part of the FreeBSD Foundation funded "Feed-Forward Clock Synchronization Algorithms" project. For more information, see http://www.synclab.org/radclock/ In collaboration with: Julien Ridoux (jridoux at unimelb edu au)	2011-12-30 08:57:58 +00:00
Lawrence Stewart	3e47c78798	Revert r227778 in preparation for committing reworked patches in its place.	2011-11-29 12:55:26 +00:00
Lawrence Stewart	b6f1c7db32	- When feed-forward clock support is compiled in, change the BPF header to contain both a regular timestamp obtained from the system clock and the current feed-forward ffcounter value. This enables new possibilities including comparison of timekeeping performance and timestamp correction during post processing. - Add the net.bpf.ffclock_tstamp sysctl to provide a choice between timestamping packets using the feedback or feed-forward system clock. Committed on behalf of Julien Ridoux and Darryl Veitch from the University of Melbourne, Australia, as part of the FreeBSD Foundation funded "Feed-Forward Clock Synchronization Algorithms" project. For more information, see http://www.synclab.org/radclock/ Submitted by: Julien Ridoux (jridoux at unimelb edu au)	2011-11-21 04:17:24 +00:00
Rui Paulo	09b6dcf968	Sync DLTs with the latest pcap version.	2010-10-29 18:41:09 +00:00
Jung-uk Kim	547d94bde3	Implement flexible BPF timestamping framework. - Allow setting format, resolution and accuracy of BPF time stamps per listener. Previously, we were only able to use microtime(9). Now we can set various resolutions and accuracies with ioctl(2) BIOCSTSTAMP command. Similarly, we can get the current resolution and accuracy with BIOCGTSTAMP command. Document all supported options in bpf(4) and their uses. - Introduce new time stamp 'struct bpf_ts' and header 'struct bpf_xhdr'. The new time stamp has both 64-bit second and fractional parts. bpf_xhdr has this time stamp instead of 'struct timeval' for bh_tstamp. The new structures let us use bh_tstamp of same size on both 32-bit and 64-bit platforms without adding additional shims for 32-bit binaries. On 64-bit platforms, size of BPF header does not change compared to bpf_hdr as its members are already all 64-bit long. On 32-bit platforms, the size may increase by 8 bytes. For backward compatibility, struct bpf_hdr with struct timeval is still the default header unless new time stamp format is explicitly requested. However, the behaviour may change in the future and all relevant code is wrapped around "#ifdef BURN_BRIDGES" for now. - Add experimental support for tagging mbufs with time stamps from a lower layer, e.g., device driver. Currently, mbuf_tags(9) is used to tag mbufs. The time stamps must be uptime in 'struct bintime' format as binuptime(9) and getbinuptime(9) do. Reviewed by: net@	2010-06-15 19:28:44 +00:00
Rui Paulo	dcf377edf1	Sync DLTs with latest libpcap version.	2009-04-02 13:02:12 +00:00
Jung-uk Kim	18b6f05552	Revert the previous commit to fix buildworld for now. We have constified 'struct bpf_insn *' for bpf_filter(9) and bpf_validate(9) since r1.19 but they conflict with pcap.h from libpcap.	2008-08-26 16:12:49 +00:00
Jung-uk Kim	32688992ef	Make sys/net/bpf_filter.c build cleanly on user land.	2008-08-26 00:09:26 +00:00
David Malone	f11c35082b	Add a new ioctl for changing the read filter (BIOCSETFNR). This is just like BIOCSETF but it doesn't drop all the packets buffered on the discriptor and reset the statistics. Also, when setting the write filter, don't drop packets waiting to be read or reset the statistics. PR: 118486 Submitted by: Matthew Luckie <mluckie@cs.waikato.ac.nz> MFC after: 1 month	2008-07-07 09:25:49 +00:00
Christian S.J. Peron	4d621040ff	Introduce support for zero-copy BPF buffering, which reduces the overhead of packet capture by allowing a user process to directly "loan" buffer memory to the kernel rather than using read(2) to explicitly copy data from kernel address space. The user process will issue new BPF ioctls to set the shared memory buffer mode and provide pointers to buffers and their size. The kernel then wires and maps the pages into kernel address space using sf_buf(9), which on supporting architectures will use the direct map region. The current "buffered" access mode remains the default, and support for zero-copy buffers must, for the time being, be explicitly enabled using a sysctl for the kernel to accept requests to use it. The kernel and user process synchronize use of the buffers with atomic operations, avoiding the need for system calls under load; the user process may use select()/poll()/kqueue() to manage blocking while waiting for network data if the user process is able to consume data faster than the kernel generates it. Patchs to libpcap are available to allow libpcap applications to transparently take advantage of this support. Detailed information on the new API may be found in bpf(4), including specific atomic operations and memory barriers required to synchronize buffer use safely. These changes modify the base BPF implementation to (roughly) abstrac the current buffer model, allowing the new shared memory model to be added, and add new monitoring statistics for netstat to print. The implementation, with the exception of some monitoring hanges that break the netstat monitoring ABI for BPF, will be MFC'd. Zerocopy bpf buffers are still considered experimental are disabled by default. To experiment with this new facility, adjust the net.bpf.zerocopy_enable sysctl variable to 1. Changes to libpcap will be made available as a patch for the time being, and further refinements to the implementation are expected. Sponsored by: Seccuris Inc. In collaboration with: rwatson Tested by: pwood, gallatin MFC after: 4 months [1] [1] Certain portions will probably not be MFCed, specifically things that can break the monitoring ABI.	2008-03-24 13:49:17 +00:00
Robert Watson	2a0a392e1c	Remove trailing whitespace from lines in BPF. MFC after: 3 days	2007-12-23 14:10:33 +00:00
Max Laier	19ed78ce27	Additions from libpcap 0.9.8 unbreak the build. Pointy hat to: mlaier X-MFC after: RELENG_7 buildworld	2007-10-21 13:23:32 +00:00
Jung-uk Kim	560a54e10c	Add three new ioctl(2) commands for bpf(4). - BIOCGDIRECTION and BIOCSDIRECTION get or set the setting determining whether incoming, outgoing, or all packets on the interface should be returned by BPF. Set to BPF_D_IN to see only incoming packets on the interface. Set to BPF_D_INOUT to see packets originating locally and remotely on the interface. Set to BPF_D_OUT to see only outgoing packets on the interface. This setting is initialized to BPF_D_INOUT by default. BIOCGSEESENT and BIOCSSEESENT are obsoleted by these but kept for backward compatibility. - BIOCFEEDBACK sets packet feedback mode. This allows injected packets to be fed back as input to the interface when output via the interface is successful. When BPF_D_INOUT direction is set, injected outgoing packet is not returned by BPF to avoid duplication. This flag is initialized to zero by default. Note that libpcap has been modified to support BPF_D_OUT direction for pcap_setdirection(3) and PCAP_D_OUT direction is functional now. Reviewed by: rwatson	2007-02-26 22:24:14 +00:00
Sam Leffler	f09c8c4a46	more juniper dlt's MFC after: 1 month	2006-09-04 19:24:34 +00:00
Christian S.J. Peron	7eae78a419	If bpf(4) has not been compiled into the kernel, initialize the bpf interface pointer to a zeroed, statically allocated bpf_if structure. This way the LIST_EMPTY() macro will always return true. This allows us to remove the additional unconditional memory reference for each packet in the fast path. Discussed with: sam	2006-06-14 02:23:28 +00:00
Christian S.J. Peron	ffdc0471d4	Back out previous two commits, this caused some problems in the namespace resulting in some build failures. Instead, to fix the problem of bpf not being present, check the pointer before dereferencing it. This is a temporary bandaid until we can decide on how we want to handle the bpf code not being present. This will be fixed shortly.	2006-06-03 18:48:14 +00:00
Christian S.J. Peron	727b73816c	Temporarily include files so that our macro checks do something useful.	2006-06-03 18:16:54 +00:00
Christian S.J. Peron	5255290c9c	Make sure we don't try to dereference the the if_bpf pointer when bpf has not been compiled into the the kernel. Submitted by: benno	2006-06-03 06:37:00 +00:00
Christian S.J. Peron	16d878cc99	Fix the following bpf(4) race condition which can result in a panic: (1) bpf peer attaches to interface netif0 (2) Packet is received by netif0 (3) ifp->if_bpf pointer is checked and handed off to bpf (4) bpf peer detaches from netif0 resulting in ifp->if_bpf being initialized to NULL. (5) ifp->if_bpf is dereferenced by bpf machinery (6) Kaboom This race condition likely explains the various different kernel panics reported around sending SIGINT to tcpdump or dhclient processes. But really this race can result in kernel panics anywhere you have frequent bpf attach and detach operations with high packet per second load. Summary of changes: - Remove the bpf interface's "driverp" member - When we attach bpf interfaces, we now set the ifp->if_bpf member to the bpf interface structure. Once this is done, ifp->if_bpf should never be NULL. [1] - Introduce bpf_peers_present function, an inline operation which will do a lockless read bpf peer list associated with the interface. It should be noted that the bpf code will pickup the bpf_interface lock before adding or removing bpf peers. This should serialize the access to the bpf descriptor list, removing the race. - Expose the bpf_if structure in bpf.h so that the bpf_peers_present function can use it. This also removes the struct bpf_if; hack that was there. - Adjust all consumers of the raw if_bpf structure to use bpf_peers_present Now what happens is: (1) Packet is received by netif0 (2) Check to see if bpf descriptor list is empty (3) Pickup the bpf interface lock (4) Hand packet off to process From the attach/detach side: (1) Pickup the bpf interface lock (2) Add/remove from bpf descriptor list Now that we are storing the bpf interface structure with the ifnet, there is is no need to walk the bpf interface list to locate the correct bpf interface. We now simply look up the interface, and initialize the pointer. This has a nice side effect of changing a bpf interface attach operation from O(N) (where N is the number of bpf interfaces), to O(1). [1] From now on, we can no longer check ifp->if_bpf to tell us whether or not we have any bpf peers that might be interested in receiving packets. In collaboration with: sam@ MFC after: 1 month	2006-06-02 19:59:33 +00:00
Christian S.J. Peron	93e39f0b93	Introduce two new ioctl(2) commands, BIOCLOCK and BIOCSETWF. These commands enhance the security of bpf(4) by further relinquishing the privilege of the bpf(4) consumer (assuming the ioctl commands are being implemented). Once BIOCLOCK is executed, the device becomes locked which prevents the execution of ioctl(2) commands which can change the underly parameters of the bpf(4) device. An example might be the setting of bpf(4) filter programs or attaching to different network interfaces. BIOCSETWF can be used to set write filters for outgoing packets. Currently if a bpf(4) consumer is compromised, the bpf(4) descriptor can essentially be used as a raw socket, regardless of consumer's UID. Write filters give users the ability to constrain which packets can be sent through the bpf(4) descriptor. These features are currently implemented by a couple programs which came from OpenBSD, such as the new dhclient and pflogd. -Modify bpf_setf(9) to accept a "cmd" parameter. This will be used to specify whether a read or write filter is to be set. -Add a bpf(4) filter program as a parameter to bpf_movein(9) as we will run the filter program on the mbuf data once we move the packet in from user-space. -Rather than execute two uiomove operations, (one for the link header and the other for the packet data), execute one and manually copy the linker header into the sockaddr structure via bcopy. -Restructure bpf_setf to compensate for write filters, as well as read. -Adjust bpf(4) stats structures to include a bd_locked member. It should be noted that the FreeBSD and OpenBSD implementations differ a bit in the sense that we unconditionally enforce the lock, where OpenBSD enforces it only if the calling credential is not root. Idea from: OpenBSD Reviewed by: mlaier	2005-08-22 19:35:48 +00:00
Sam Leffler	e0d80bffb5	additions from libpcap 0.9.1 release Approved by: re (scottl)	2005-07-11 03:16:23 +00:00
Sam Leffler	f6f1669c0f	integrate changes from libpcap-0.9.1-096 Reviewed by: bms	2005-05-28 21:56:41 +00:00
Warner Losh	c398230b64	/* -> /*- for license, minor formatting changes	2005-01-07 01:45:51 +00:00
David Malone	bde800e688	Make the comment for DLT_NULL slightly more accurate. PR: 62272 Submitted by: Radim Kolar <hsn@netmag.cz> MFC after: 1 week	2004-05-30 17:03:48 +00:00
Warner Losh	f36cfd49ad	Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999 and email from Peter Wemm, Alan Cox and Robert Watson. Approved by: core, peter, alc, rwatson	2004-04-07 20:46:16 +00:00
Bruce M Simpson	1acc2f81b1	Add more DLT types required by libpcap 0.8.3. Maintain numeric sort order.	2004-03-31 14:22:13 +00:00
Bruce M Simpson	a7135a6201	Update system bpf headers for libpcap 0.8.3. Maintain listing of DLT link types in numeric order.	2004-03-31 14:09:26 +00:00
Max Laier	cc5934f5af	Tweak existing header and other build infrastructure to be able to build pf/pflog/pfsync as modules. Do not list them in NOTES or modules/Makefile (i.e. do not connect it to any (automatic) builds - yet). Approved by: bms(mentor)	2004-02-26 03:53:54 +00:00
Sam Leffler	437ffe1823	o eliminate widespread on-stack mbuf use for bpf by introducing a new bpf_mtap2 routine that does the right thing for an mbuf and a variable-length chunk of data that should be prepended. o while we're sweeping the drivers, use u_int32_t uniformly when when prepending the address family (several places were assuming sizeof(int) was 4) o return M_ASSERTVALID to BPF_MTAP* now that all stack-allocated mbufs have been eliminated; this may better be moved to the bpf routines Reviewed by: arch@ and several others	2003-12-28 03:56:00 +00:00
Mike Silbersack	d7be4a893c	Remove the call to M_ASSERTVALID from BPF_MTAP; some mbufs passed to mpf are allocated on the stack, which causes this check to falsely trigger. A new check which takes on-stack mbufs into account will be reintroduced after 5.2 is out the door. Approved by: re (watson) Requested by: many	2003-11-28 18:48:59 +00:00
Mike Silbersack	5d5b5d0f99	Add a new macro M_ASSERTVALID which ensures that the mbuf in question is non-free. (More checks can/should be added in the future.) Use M_ASSERTVALID in BPF_MTAP so that we catch when freed mbufs are passed in, even if no bpf listeners are active. Inspired by a bug in if_dc caught by Kenjiro Cho.	2003-10-19 22:33:41 +00:00
Sam Leffler	8eab61f3de	o add BIOCGDLTLIST and BIOCSDLT ioctls to get the data link type list and set the link type for use by libpcap and tcpdump o move mtx unlock in bpfdetach up; it doesn't need to be held so long o change printf in bpf_detach to distinguish it from the same one in bpfsetdlt Note there are locking issues here related to ioctl processing; they have not been addressed here. Submitted by: Guy Harris <guy@alum.mit.edu> Obtained from: NetBSD (w/ locking modifications)	2003-01-20 19:08:46 +00:00
Sam Leffler	24a229f466	o add support for multiple link types per interface (e.g. 802.11 and Ethernet) o introduce BPF_TAP and BPF_MTAP macros to hide implementation details and ease code portability o use m_getcl where appropriate Reviewed by: many Approved by: re Obtained from: NetBSD (multiple link type support)	2002-11-14 23:24:13 +00:00
Bill Fenner	94413c0dba	Update for libpcap 0.7.1 Originally-committed-to-wrong-repository by: fenner	2002-06-21 05:29:40 +00:00

1 2

75 Commits