3828 Commits

Author SHA1 Message Date
shurd
7d4b8facc7 Separate list manipulation locking from state change in multicast
Multicast incorrectly calls into drivers with a mutex held, causing drivers
to have to go through all manner of contortions to use a non-sleepable lock.
Serialize multicast updates instead.

Submitted by:	mmacy <mmacy@mattmacy.io>
Reviewed by:	shurd, sbruno
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D14969
2018-05-02 19:36:29 +00:00
gallatin
40ab8d5ea9 Fix iflib_encap() EFBIG handling bugs
1) Don't give up if m_collapse() fails.  Rather than giving up, try
m_defrag() immediately.

2) Fix a leak where, if the NIC driver rejected the defrag'ed chain
as having too many segments, we would fail to free the chain.

Reviewed by:  Matthew Macy <mmacy@mattmacy.io> (this version of patch)
Submitted by: Matthew Macy <mmacy@mattmacy.io> (early version of leak fix)
2018-04-30 23:53:27 +00:00
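
The two fixes described in this entry can be sketched in user-space C. This is a toy model, not the iflib code: `toy_chain`, `toy_collapse()`, `toy_defrag()`, `toy_map()` and `toy_encap()` are hypothetical stand-ins, and the rules for when compaction succeeds are invented for illustration. It shows falling through to the heavier defragmentation when the cheap collapse fails, and freeing the chain when the driver still rejects it.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for an mbuf chain: only the segment count matters. */
struct toy_chain {
	int nsegs;
};

static int toy_live_chains;	/* chains allocated and not yet freed */

static struct toy_chain *
toy_chain_new(int nsegs)
{
	struct toy_chain *c = malloc(sizeof(*c));

	if (c != NULL) {
		c->nsegs = nsegs;
		toy_live_chains++;
	}
	return (c);
}

static void
toy_chain_free(struct toy_chain *c)
{
	if (c != NULL) {
		free(c);
		toy_live_chains--;
	}
}

/* Cheap compaction; modelled as failing when the chain is far too long. */
static struct toy_chain *
toy_collapse(struct toy_chain *c, int max_segs)
{
	if (c->nsegs > 2 * max_segs)
		return (NULL);		/* input chain is left untouched */
	c->nsegs = max_segs;
	return (c);
}

/* Expensive compaction; modelled as packing four segments into one. */
static struct toy_chain *
toy_defrag(struct toy_chain *c)
{
	c->nsegs = (c->nsegs + 3) / 4;
	return (c);
}

/* The "driver": rejects chains with more than max_segs segments. */
static int
toy_map(const struct toy_chain *c, int max_segs)
{
	return (c->nsegs > max_segs ? EFBIG : 0);
}

/* Returns 0 with *cp ready to send, or EFBIG with *cp freed and NULL. */
static int
toy_encap(struct toy_chain **cp, int max_segs)
{
	struct toy_chain *n;

	if (toy_map(*cp, max_segs) == 0)
		return (0);
	n = toy_collapse(*cp, max_segs);
	if (n == NULL)			/* point 1: fall through to defrag */
		n = toy_defrag(*cp);
	if (n != NULL) {
		*cp = n;
		if (toy_map(*cp, max_segs) == 0)
			return (0);
	}
	toy_chain_free(*cp);		/* point 2: do not leak the chain */
	*cp = NULL;
	return (EFBIG);
}
```

A caller would pass a chain from `toy_chain_new()` to `toy_encap()` and check `toy_live_chains` afterwards to confirm nothing was leaked on the failure path.
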
hselasky
5663cd2837 Add network device event for priority code point, PCP, changes.
When the PCP is changed for either a VLAN network interface or when
prio tagging is enabled for a regular ethernet network interface,
broadcast the IFNET_EVENT_PCP event so applications like ibcore can
update their GID tables accordingly.

MFC after:	3 days
Reviewed by:	ae, kib
Differential Revision:	https://reviews.freebsd.org/D15040
Sponsored by:	Mellanox Technologies
2018-04-26 08:58:27 +00:00
brooks
289ce12298 Translate 32-bit ifmedia requests into native ones.
We use transformation rather than accessors as virtually every driver
implements SIOCGIFMEDIA and all would have to be touched.

Keep the code readable by always performing copies and (possibly no-op)
transforms.

Reviewed by:	jhb, kib
Obtained from:	CheriBSD
MFC after:	1 week
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14996
2018-04-25 15:30:42 +00:00
markj
943580a700 Use dead_bpf_if instead of bp_null.
This fixes a -Wunused error when DEV_BPF and NETGRAPH_BPF are not
defined.

Also remove a stray semicolon added in r332812.

X-MFC with:	r332812
2018-04-24 17:42:25 +00:00
brooks
9a0f94467e Finish removing FDDI and tokenring media support.
This fixes media display for 802.11 wireless devices.

Software outside the base system that uses these media types and
defines should use #ifdef IFM_FDDI or IFM_TOKEN to include or remove
support.

Reported by:	zeising
Reviewed by:	emaste, kib, zeising
Tested by:	zeising
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D15170
2018-04-23 21:10:33 +00:00
ae
c6192ec749 Add dead_bpf_if structure, that should be used as fake bpf_if
during ifnet detach.

Since destroying an interface is not an atomic operation and there is
no synchronization during destroy, it is possible that, in the time
between bpfdetach() and if_free(), an mbuf queued on the interface
being destroyed will be used by ether_input_internal() and
bpf_peers_present() will dereference a NULL bpf_if pointer. To protect
against this, assign a pointer to an empty bpf_if_ext structure instead
of a NULL pointer after bpfdetach().

Reviewed by:	melifaro, eugen
Obtained from:	Yandex LLC
MFC after:	1 week
Sponsored by:	Yandex LLC
Differential Revision:	https://reviews.freebsd.org/D15083
2018-04-20 09:57:31 +00:00
shurd
90779c2bbf iflib: Fix queue distribution when there are no threads
Previously, if there were no threads, all queues which targeted
cores that share an L2 cache were bound to a single core. The intent is
to distribute them across these cores.

Reported by:	olivier
Reviewed by:	sbruno
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D15120
2018-04-18 15:34:18 +00:00
brooks
26c165ead9 Remove support for the Arcnet protocol.
While Arcnet has some continued deployment in industrial controls, the
lack of drivers for any of the PCI, USB, or PCIe NICs on the market
suggests such users aren't running FreeBSD.

Evidence in the PR database suggests that the cm(4) driver (our sole
Arcnet NIC) was broken in 5.0 and has not worked since.

PR:		182297
Reviewed by:	jhibbits, vangyzen
Relnotes:	yes
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D15057
2018-04-13 21:18:04 +00:00
sbruno
18512a7765 Restore r332389 after resolution of locking fixes.
Add one extra lock initialization to iflib_register() that was missed
in the git<->phab conversion.

Split out flag manipulation from general context manipulation in iflib

To avoid blocking on the context lock in the swi thread and risk potential
deadlocks, this change protects lighter weight updates that only need to
be consistent with each other with their own lock.

Submitted by:   Matthew Macy <mmacy@mattmacy.io>
Reviewed by:    shurd
Sponsored by:   Limelight Networks
Differential Revision:  https://reviews.freebsd.org/D14967
2018-04-12 14:35:37 +00:00
vmaffione
3c7434c730 netmap: align codebase to the current upstream (commit id 3fb001303718146)
Changelist:
    - Turn tx_rings and rx_rings arrays into arrays of pointers to kring
      structs. This patch includes fixes for ixv, ixl, ix, re, cxgbe, iflib,
      vtnet and ptnet drivers to cope with the change.
    - Generalize the nm_config() callback to accept a struct containing many
      parameters.
    - Introduce NKR_FAKERING to support buffer sharing (used for netmap
      pipes)
    - Improved API for external VALE modules.
    - Various bug fixes and improvements to the netmap memory allocator,
      including support for externally (userspace) allocated memory.
    - Refactoring of netmap pipes: now linked rings share the same netmap
      buffers, with a separate set of kring pointers (rhead, rcur, rtail).
      Buffer swapping does not need to happen anymore.
    - Large refactoring of the control API towards an extensible solution;
      the goal is to allow the addition of more commands and extension of
      existing ones (with new options) without the need of hacks or the
      risk of running out of configuration space.
      A new NIOCCTRL ioctl has been added to handle all the requests of the
      new control API, which cover all the functionalities so far supported.
      The netmap API bumps from 11 to 12 with this patch. Full backward
      compatibility is provided for the old control command (NIOCREGIF), by
      means of a new netmap_legacy module. Many parts of the old netmap.h
      header have now been moved to netmap_legacy.h (included by netmap.h).

Approved by:	hrs (mentor)
2018-04-12 07:20:50 +00:00
mjg
fa5413e897 iflib: fix up a mismerge in r332419
Led to crashes on boot while in ifconfig.

Submitted by: Matthew Macy <mmacy@mattmacy.io>
2018-04-12 04:11:37 +00:00
shurd
63bcfab69d Properly initialize ifc_nhwtxqs.
Also, since ifc_nhwrxqs is only used in one place, remove it from the struct.
This was preventing iflib_dma_free() from being called via
iflib_device_detach().

Submitted by:	Matthew Macy <mmacy@mattmacy.io>
Reviewed by:	shurd
Sponsored by:	Limelight Networks
2018-04-11 21:41:59 +00:00
brooks
6dcf9514b3 Remove support for FDDI networks.
Defines in net/if_media.h remain in case code copied from ifconfig is in
use elsewhere (supporting a non-existent media type is harmless).

Reviewed by:	kib, jhb
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D15017
2018-04-11 17:28:24 +00:00
sbruno
a87ab66df6 Revert r332389 as it is causing panics for various users and we need
to add some more test cases.
2018-04-11 17:26:53 +00:00
shurd
d55e53acf5 Split out flag manipulation from general context manipulation in iflib
To avoid blocking on the context lock in the swi thread and risk potential
deadlocks, this change protects lighter weight updates that only need to
be consistent with each other with their own lock.

Submitted by:	Matthew Macy <mmacy@mattmacy.io>
Reviewed by:	shurd
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D14967
2018-04-10 19:48:24 +00:00
shurd
05f9f1edaa Make BPF global lock an SX
This allows NIC drivers to sleep on polling config operations.

Submitted by:	Matthew Macy <mmacy@mattmacy.io>
Reviewed by:	shurd
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D14982
2018-04-10 19:42:50 +00:00
vmaffione
8b391e44ef netmap: align codebase to upstream version v11.4
Changelist:
  - remove unused nkr_slot_flags
  - new nm_intr adapter callback to enable/disable interrupts
  - remove unused sysctls and document the other sysctls
  - new infrastructure to support NS_MOREFRAG for NIC ports
  - support for external memory allocator (for now linux-only),
    including linux-specific changes in common headers
  - optimizations within netmap pipes datapath
  - improvements on VALE control API
  - new nm_parse() helper function in netmap_user.h
  - various bug fixes and code clean up

Approved by:	hrs (mentor)
2018-04-09 09:24:26 +00:00
brooks
c2e6899488 Remove the thread argument from ifr_buffer_*() accessors.
They are always used in a context where curthread is the correct thread.
This makes them more similar to the ifr_data_get_ptr() accessor.
2018-04-06 23:25:54 +00:00
brooks
7a2353df98 ifconf(): correct handling of sockaddrs smaller than struct sockaddr.
Portable programs that use SIOCGIFCONF (e.g. traceroute) assume
that each pseudo ifreq is of length MAX(sizeof(struct ifreq),
sizeof(ifr_name) + ifr_addr.sa_len).  For short sockaddrs we copied
too much from the source sockaddr resulting in a heap leak.

I believe only one such sockaddr exists (struct sockaddr_sco which
is 8 bytes) and it is unclear if such sockaddrs end up on interfaces
in practice.  If they did, the result would be an 8 byte heap leak on
current architectures.

admbugs:	869
Reviewed by:	kib
Obtained from:	CheriBSD
MFC after:	3 days
Security:	kernel heap leak
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14981
2018-04-06 20:26:56 +00:00
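
The sizing rule quoted in this entry, and the effect of copying only `sa_len` bytes, can be sketched as follows. The `toy_` types are hypothetical simplifications of `struct sockaddr` and `struct ifreq` (a 16-byte name followed by a 16-byte address), and the sketch only handles sockaddrs that fit inside the entry; it is not the kernel's ifconf() code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_IFNAMSIZ 16

/* Hypothetical, simplified layouts; the real structures differ in detail. */
struct toy_sockaddr {
	unsigned char sa_len;
	unsigned char sa_family;
	char sa_data[14];
};

struct toy_ifreq {
	char ifr_name[TOY_IFNAMSIZ];
	struct toy_sockaddr ifr_addr;
};

/* Length of one pseudo ifreq: MAX(sizeof(ifreq), sizeof(name) + sa_len). */
static size_t
toy_ifreq_len(size_t sa_len)
{
	size_t n = TOY_IFNAMSIZ + sa_len;

	return (n > sizeof(struct toy_ifreq) ? n : sizeof(struct toy_ifreq));
}

/*
 * Fill one entry for a sockaddr no larger than the entry's address field.
 * The whole entry is zeroed first and only sa_len bytes are copied, so
 * nothing beyond the source sockaddr reaches the caller.
 */
static void
toy_fill(struct toy_ifreq *ifr, const char *name, const void *sa,
    size_t sa_len)
{
	memset(ifr, 0, sizeof(*ifr));
	strncpy(ifr->ifr_name, name, sizeof(ifr->ifr_name) - 1);
	if (sa_len > sizeof(ifr->ifr_addr))
		sa_len = sizeof(ifr->ifr_addr);
	memcpy(&ifr->ifr_addr, sa, sa_len);
}
```

With an 8-byte source sockaddr, bytes 8 through 15 of `ifr_addr` stay zero instead of carrying whatever followed the source in memory.
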
brooks
9d79658aab Move most of the contents of opt_compat.h to opt_global.h.
opt_compat.h is mentioned in nearly 180 files. In-progress network
driver compatibility improvements may add over 100 more so this is
closer to "just about everywhere" than "only some files" per the
guidance in sys/conf/options.

Keep COMPAT_LINUX32 in opt_compat.h as it is confined to a subset of
sys/compat/linux/*.c.  A fake _COMPAT_LINUX option ensures opt_compat.h
is created on all architectures.

Move COMPAT_LINUXKPI to opt_dontuse.h as it is only used to control the
set of compiled files.

Reviewed by:	kib, cem, jhb, jtl
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14941
2018-04-06 17:35:35 +00:00
kp
337a0778fc pf: Improve ioctl validation for DIOCRGETTABLES, DIOCRGETTSTATS, DIOCRCLRTSTATS and DIOCRSETTFLAGS
These ioctls can process a number of items at a time, which puts us at
risk of overflow in mallocarray() and of impossibly large allocations
even if we don't overflow.

Limit the allocation to required size (or the user allocation, if that's
smaller). That does mean we need to do the allocation with the rules
lock held (so the number doesn't change while we're doing this), so it
can't use M_WAITOK.

MFC after:	1 week
2018-04-06 15:54:30 +00:00
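
A minimal sketch of the validation idea in this entry: clamp the user-supplied item count to what is actually needed, and refuse any count whose byte size would overflow. `toy_alloc_size()` is a hypothetical helper written for illustration; it is not a pf or mallocarray() interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compute the allocation size for min(user_count, kernel_count) items.
 * Returns 0 and stores the size in *bytes, or -1 if the multiplication
 * would overflow size_t.
 */
static int
toy_alloc_size(size_t user_count, size_t kernel_count, size_t elem_size,
    size_t *bytes)
{
	size_t n = user_count < kernel_count ? user_count : kernel_count;

	if (elem_size != 0 && n > SIZE_MAX / elem_size)
		return (-1);
	*bytes = n * elem_size;
	return (0);
}
```

A request for an absurd number of items then costs no more than the number of items that exist, and a request that cannot be represented at all is rejected before any allocation is attempted.
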
brooks
0080e81d7c Add 32-bit compat for ioctls that take struct ifgroupreq.
Use an accessor to access ifgr_group and ifgr_groups.

Use a macro CASE_IOC_IFGROUPREQ(cmd) in place of case statements such
as "case SIOCAIFGROUP:". This avoids polluting the switch statements
with large numbers of #ifdefs.

Reviewed by:	kib
Obtained from:	CheriBSD
MFC after:	1 week
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14960
2018-04-05 22:14:55 +00:00
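
The case-label macro technique mentioned in this entry can be illustrated like this. The command numbers, the `TOY_COMPAT32` switch and the token-pasting scheme are assumptions made for the example; the macro actually committed derives the 32-bit command differently.

```c
#include <assert.h>

/* Hypothetical command numbers for the native and 32-bit ioctl variants. */
#define TOY_SIOCAIFGROUP	0x1001
#define TOY_SIOCAIFGROUP32	0x2001

#define TOY_COMPAT32 1

/*
 * Expand to one case label, or to two when 32-bit compat is built in.
 * The caller supplies the trailing colon, so the use site still reads
 * like an ordinary case statement.
 */
#ifdef TOY_COMPAT32
#define TOY_CASE_IOC_IFGROUPREQ(cmd)	case (cmd): case (cmd##32)
#else
#define TOY_CASE_IOC_IFGROUPREQ(cmd)	case (cmd)
#endif

static int
toy_is_group_add(unsigned long cmd)
{
	switch (cmd) {
	TOY_CASE_IOC_IFGROUPREQ(TOY_SIOCAIFGROUP):
		return (1);
	default:
		return (0);
	}
}
```

The switch body contains no `#ifdef` at all; the conditional lives once, at the macro definition.
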
brooks
8f46ff8fe4 ifconf(): Always zero the whole struct ifreq.
The previous split of zeroing ifr_name and ifr_addr separately is safe
on current architectures, but would be unsafe if pointers were larger
than 8 bytes. Combining the zeroing adds no real cost (a few
instructions) and makes the security property easier to verify.

Reviewed by:	kib, emaste
Obtained from:	CheriBSD
MFC after:	3 days
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14912
2018-04-05 21:58:28 +00:00
vmaffione
6dfc9aee0f netmap: align if_ptnet guest driver to the upstream code (commit 0e15788)
The change upgrades the driver to use the split Communication Status
Block (CSB) format. In this way the variables written by the guest
and read by the host are allocated in a different cacheline than
the variables written by the host and read by the guest; this is
needed to avoid cache thrashing.

Approved by:	hrs (mentor)
2018-04-04 21:31:12 +00:00
brooks
2b96daf50f Document and enforce assumptions about struct (in6_)ifreq.
- The two types must be type-punnable for shared members of ifr_ifru.
  This allows compatibility accessors to be shared.

- There must be no padding gap between ifr_name and ifr_ifru.  This is
  assumed in tcpdump's use of SIOCGIFFLAGS output which attempts to be
  broadly portable.  This is true for all current architectures, but very
  large (256-bit) fat-pointers could violate this invariant.

Reviewed by:	kib
Obtained from:	CheriBSD
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14910
2018-03-30 21:38:53 +00:00
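
Layout assumptions of this kind can be enforced at compile time. The sketch below uses C11 `_Static_assert` on two hypothetical, cut-down structures (`demo_ifreq` and `demo_in6_ifreq`); the real structures have many more union members, and the exact assertions in the commit may be written differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down versions of the two request structures. */
struct demo_ifreq {
	char ifr_name[16];
	union {
		short ifru_flags[2];
		int ifru_metric;
		char ifru_pad[16];
	} ifr_ifru;
};

struct demo_in6_ifreq {
	char ifr_name[16];
	union {
		short ifru_flags[2];
		int ifru_metric;
		char ifru_pad[272];
	} ifr_ifru;
};

/* No padding gap between ifr_name and ifr_ifru. */
_Static_assert(offsetof(struct demo_ifreq, ifr_ifru) ==
    sizeof(((struct demo_ifreq *)0)->ifr_name),
    "gap between ifr_name and ifr_ifru");

/* Shared union members sit at the same offset in both types. */
_Static_assert(offsetof(struct demo_ifreq, ifr_ifru) ==
    offsetof(struct demo_in6_ifreq, ifr_ifru),
    "ifr_ifru offsets differ between ifreq and in6_ifreq");

/* Runtime view of the same fact, handy for printing or testing. */
static size_t
demo_ifru_offset(void)
{
	return (offsetof(struct demo_ifreq, ifr_ifru));
}
```

If a future architecture changed the alignment of the union, the build would fail at these assertions rather than silently breaking consumers that rely on the layout.
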
brooks
ac0325b4db Use an accessor function to access ifr_data.
This fixes 32-bit compat (no ioctl command definitions are required
as struct ifreq is the same size).  This is believed to be sufficient to
fully support ifconfig on 32-bit systems.

Reviewed by:	kib
Obtained from:	CheriBSD
MFC after:	1 week
Relnotes:	yes
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14900
2018-03-30 18:50:13 +00:00
brooks
a45d44647f Remove infrastructure for token-ring networks.
Reviewed by:	cem, imp, jhb, jmallett
Relnotes:	yes
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14875
2018-03-28 23:33:26 +00:00
brooks
308f791e1c Improve copy-and-pasted versions of SIOCGIFADDR.
The original implementation used a reference to ifr_data and a cast to
do the equivalent of accessing ifr_addr. This was copied multiple
times since 1996.

Approved by:	kib
MFC after:	1 week
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14873
2018-03-27 20:51:49 +00:00
brooks
6907bd334c Fix a whitespace bug missed in refactoring prior to r331641.
MFC with:	r331641
2018-03-27 18:55:39 +00:00
brooks
0754c526f1 Fix access to ifru_buffer on freebsd32.
Make all kernel accesses to ifru_buffer go via access functions
which take the process ABI into account and use an appropriate union
to access members in the correct place in struct ifreq.

Reviewed by:	kib
Obtained from:	CheriBSD
MFC after:	1 week
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14846
2018-03-27 18:26:50 +00:00
kib
9de215608c Allow specifying PCP on packets not belonging to any VLAN.
According to 802.1Q-2014, VLAN tagged packets with VLAN id 0 should be
considered as untagged, and only PCP and DEI values from the VLAN tag
are meaningful.  See for instance
https://www.cisco.com/c/en/us/td/docs/switches/connectedgrid/cg-switch-sw-master/software/configuration/guide/vlan0/b_vlan_0.html.

Make it possible to specify PCP value for outgoing packets on an
ethernet interface.  When PCP is supplied, the tag is appended, VLAN
id set to 0, and PCP is filled by the supplied value.  The code to do
VLAN tag encapsulation is refactored from the if_vlan.c and moved into
if_ethersubr.c.

Drivers might have issues with filtering VID 0 packets on
receive.  This bug should be fixed for each driver.

Reviewed by:	ae (previous version), hselasky, melifaro
Sponsored by:	Mellanox Technologies
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D14702
2018-03-27 15:29:32 +00:00
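
For reference, the 802.1Q tag layout this entry relies on can be sketched as below: a 16-bit TPID of 0x8100 followed by a 16-bit TCI holding PCP (3 bits), DEI (1 bit) and VLAN id (12 bits). The `toy_` helper names are hypothetical and this is not the code moved into if_ethersubr.c.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_ETHERTYPE_VLAN 0x8100

/* Pack PCP, DEI and VLAN id into the 16-bit Tag Control Information. */
static uint16_t
toy_vlan_tci(unsigned pcp, unsigned dei, unsigned vid)
{
	return ((uint16_t)(((pcp & 0x7u) << 13) | ((dei & 0x1u) << 12) |
	    (vid & 0xFFFu)));
}

/* Build a priority-only tag: VLAN id 0, caller-supplied PCP. */
static void
toy_prio_tag(uint8_t out[4], unsigned pcp)
{
	uint16_t tci = toy_vlan_tci(pcp, 0, 0);

	out[0] = TOY_ETHERTYPE_VLAN >> 8;	/* network byte order */
	out[1] = TOY_ETHERTYPE_VLAN & 0xff;
	out[2] = (uint8_t)(tci >> 8);
	out[3] = (uint8_t)(tci & 0xff);
}
```

For PCP 5 the TCI is `0xA000`, so the four tag bytes are `81 00 A0 00`.
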
markj
12dff6d870 Clamp IFLIB_RX_COPY_THRESH to MHLEN in iflib_rxd_pkt_get().
If one has added fields to struct mbuf such that MHLEN is smaller than
this threshold (128), iflib_rxd_pkt_get() may otherwise overrun the
internal mbuf buffer while copying.

Reviewed by:	mmacy
MFC after:	3 days
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D14843
2018-03-25 23:23:19 +00:00
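
The clamp can be sketched as a single comparison. `toy_should_copy()` and `TOY_RX_COPY_THRESH` are hypothetical names; the value 128 is the threshold quoted in the entry, and `mhlen` stands for the space available inside the mbuf header.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_RX_COPY_THRESH 128

/*
 * Decide whether a received packet is small enough to copy into the
 * mbuf's internal buffer. The threshold is clamped to mhlen so the copy
 * can never be larger than the buffer it lands in.
 */
static int
toy_should_copy(size_t pkt_len, size_t mhlen)
{
	size_t limit = (size_t)TOY_RX_COPY_THRESH < mhlen ?
	    (size_t)TOY_RX_COPY_THRESH : mhlen;

	return (pkt_len <= limit);
}
```

With a shrunken internal buffer of 96 bytes, a 120-byte packet is no longer copied even though it is below the nominal threshold.
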
kp
109a7b5eec netpfil: Introduce PFIL_FWD flag
Forwarded packets passed through PFIL_OUT, which made it difficult for
firewalls to figure out if they were forwarding or producing packets. This in
turn is an issue for pf for IPv6 fragment handling: it needs to call
ip6_output() or ip6_forward() to handle the fragments. Figuring out which was
difficult (and until now, incorrect).
Having pfil distinguish the two removes an ugly piece of code from pf.

Introduce a new variant of the netpfil callbacks with a flags variable, which
has PFIL_FWD set for forwarded packets. This allows pf to reliably work out if
a packet is forwarded.

Reviewed by:	ae, kevans
Differential Revision:	https://reviews.freebsd.org/D13715
2018-03-23 16:56:44 +00:00
melifaro
75159f749d Use counter(9) api for the bpf(4) statistics.
Currently each bpf descriptor uses u64 variables to maintain its counters.
On interfaces with high packet rate this leads to unnecessary contention
and inaccurate reporting.

PR:		kern/205320
Reported by:	elofu17 at hotmail.com
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D14726
2018-03-20 22:57:06 +00:00
melifaro
7bb5ee0db4 Fix outgoing TCP/UDP packet drop on arp/ndp entry expiration.
Current arp/nd code relies on the feedback from the datapath indicating
 that the entry is still used. This mechanism is incorporated into the
 arpresolve()/nd6_resolve() routines. After the inpcb route cache
 introduction, the packet path for the locally-originated packets changed,
 passing cached lle pointer to the ether_output() directly. This resulted
 in the arp/ndp entry expiring each time exactly after the configured max_age
 interval. During the small window between the ARP/NDP request and reply
 from the router, most of the packets got lost.

Fix this behaviour by plugging datapath notification code to the packet
 path used by route cache. Unify the notification code by using single
 inlined function with the per-AF callbacks.

Reported by:	sthaug at nethelp.no
Reviewed by:	ae
MFC after:	2 weeks
2018-03-17 17:05:48 +00:00
avos
6ddab56276 Correct comment for IFM_IEEE80211_VHT media variant.
2018-03-15 23:32:29 +00:00
ae
d3176e34c7 Define ethernet type 0x88A8 as ETHERTYPE_QINQ.
Reviewed by:	kp
Obtained from:	OpenBSD
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D14593
2018-03-06 12:01:31 +00:00
shurd
6bea9d5c59 iflib: stop timer callout when stopping
iflib_timer has been seen running after the interface had been removed.
This change prevents that.

Submitted by:	matt.macy@joyent.com
2018-03-02 18:48:07 +00:00
kp
fc599d4911 pf: Cope with overly large net.pf.states_hashsize
If the user configures a states_hashsize or source_nodes_hashsize value we may
not have enough memory to allocate this. This used to lock up pf, because these
allocations used M_WAITOK.

Cope with this by attempting the allocation with M_NOWAIT and falling back to
the default sizes (with M_WAITOK) if these fail.

PR:		209475
Submitted by:	Fehmi Noyan Isi <fnoyanisi AT yahoo.com>
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D14367
2018-02-25 08:56:44 +00:00
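
The fallback strategy in this entry can be approximated in user space: try the requested table size with an allocation that is allowed to fail, and fall back to a default size if it does. `toy_hash_init()` and `TOY_DEFAULT_HASHSIZE` are hypothetical, and `calloc()` merely stands in for a non-sleeping kernel allocation; the second attempt here can also fail, unlike a sleeping kernel allocation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_DEFAULT_HASHSIZE 32768	/* hypothetical default */

struct toy_hash {
	void **buckets;
	size_t size;
};

/* Returns 0 on success; h->size records which size was actually used. */
static int
toy_hash_init(struct toy_hash *h, size_t requested)
{
	h->size = requested;
	h->buckets = calloc(h->size, sizeof(*h->buckets));
	if (h->buckets == NULL) {
		/* Requested size is not satisfiable: use the default. */
		h->size = TOY_DEFAULT_HASHSIZE;
		h->buckets = calloc(h->size, sizeof(*h->buckets));
	}
	return (h->buckets != NULL ? 0 : -1);
}
```

An oversized request then degrades to a working table of the default size instead of blocking indefinitely.
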
rstone
dd56500020 Allow route change requests to not specify the gateway.
Only require a gateway to be specified on a route add request.  On
a route change request that does not specify the gateway, the
gateway will remain the same.  This allows changing other route
parameters without having to re-specify the gateway, like in
"route change 10.0.0.0/8 -mtu 9000".

Update the route(8) manpage to explicitly call out this usage
as being supported.

MFC after: 2 weeks
Sponsored by: Dell EMC Isilon
Reviewed By: eugen (rtsock.c change), rgrimes
Differential Revision: https://reviews.freebsd.org/D14291
2018-02-21 19:13:23 +00:00
shurd
6699ee32ca IFLIB: Make isc_magic unsigned
The IFLIB_MAGIC macro is > INT_MAX, so isc_magic should
be able to contain it.

Reported by:	jeb
Sponsored by:	Limelight Networks
2018-02-21 18:57:00 +00:00
np
a38188ac36 Catch up with the removal of nkr_slot_flags from upstream netmap. No
functional impact intended.

Submitted by:	Vincenzo Maffione <v.maffione@gmail.com>
2018-02-20 21:42:45 +00:00
shurd
c1c5080794 IFLIB: do not remove dmamap on buffer unload
Dmamap is created only on IFC attach. If we remove it on
buffer release, we won't be able to do ifconfig down&up. Only destroy
when in detach.

Reported by:	wma
Reviewed by:	wma
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D14060
2018-02-20 18:33:45 +00:00
wma
d14bbbd48e BPF: Switch to 32 bit compatible mode only when thread is 32 bit
Sometimes 32 bit and 64 bit ioctls are represented by the same number.
This causes an unnecessary switch to 32 bit compatible mode.

This patch prevents switching when we are dealing with a 64 bit executable.
It fixes issue mentioned here

Authored by:           Patryk Duda <pdk@semihalf.com>
Submitted by:          Wojciech Macek <wma@semihalf.com>
Reviewed by:           andrew, wma
Obtained from:         Semihalf
Sponsored by:          IBM, QCM Technologies
Differential revision: https://reviews.freebsd.org/D14023
2018-01-25 12:13:41 +00:00
smh
c62edd09fd Added missing CTLFLAG_VNET to lacp default_strict_mode
Added CTLFLAG_VNET to net.link.lagg.lacp.default_strict_mode which was missed
in r290450.

Reported by:	julian@
MFC after:	1 week
Sponsored by:	Multiplay
2018-01-24 10:13:14 +00:00
rstone
9c794ac899 Increment the route table gen count after a modify
Increment the route table generation count after modifying a
route.  This signals back to TCP connections that they need to
update their L2 caches as the gateway for their route may have
changed.  This is a heavier hammer than is needed, strictly
speaking, but route changes will be unlikely enough that the
performance effects of invalidating all connection route caches
should be negligible.

MFC after:	1 week
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D13990
Reviewed by:	karels
2018-01-23 03:15:44 +00:00
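
The generation-count mechanism described in this entry can be sketched as follows. The structures and function names are hypothetical: a table carries a counter that is bumped on every modification, and each connection compares its remembered counter against the table's before trusting its cached gateway.

```c
#include <assert.h>

/* Hypothetical routing table: one gateway plus a generation counter. */
struct toy_rtable {
	unsigned gen;
	unsigned gateway;
};

/* Hypothetical per-connection cache of the gateway. */
struct toy_conn {
	unsigned cached_gen;
	unsigned cached_gw;
	int valid;
};

/* Modify the route and bump the generation so caches notice. */
static void
toy_route_change(struct toy_rtable *rt, unsigned new_gw)
{
	rt->gateway = new_gw;
	rt->gen++;
}

/* Return the gateway, refreshing the cache if the table has changed. */
static unsigned
toy_conn_gateway(struct toy_conn *c, const struct toy_rtable *rt)
{
	if (!c->valid || c->cached_gen != rt->gen) {
		c->cached_gw = rt->gateway;
		c->cached_gen = rt->gen;
		c->valid = 1;
	}
	return (c->cached_gw);
}
```

Every cache is invalidated by any change, which matches the "heavier hammer" trade-off the entry describes: simple and safe, at the cost of some unnecessary refreshes.
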
rstone
c0c5474ab0 Reduce code duplication for inpcb route caching
Add a new macro to clear both the L3 and L2 route caches, to
hopefully prevent future instances where only the L3 cache was
cleared when both should have been.

MFC after:	1 week
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D13989
Reviewed by:	karels
2018-01-23 03:15:39 +00:00
rstone
4fb0175a26 Invalidate inpcb LLE cache if cached route is invalidated
When the inpcb route cache is invalidated after a change to the
routing tables, we need to invalidate the LLE cache as well.
Previous to this change packets for the connection would continue
to use the old L2 information from the old L3 gateway, and the
packets for the connection would likely be blackholed.

MFC after:	1 week
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D13988
Reviewed by:	karels
2018-01-23 03:15:39 +00:00
kib
027c7f4d66 Fix compat32 for sysctl net.PF_ROUTE...NET_RT_IFLISTL.
Route messages are aligned to the host long type alignment, which
breaks 32bit.

Reported and tested by:	lwhsu
Diagnosed by:	Yuri Pankov <yuripv@icloud.com>
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2018-01-22 20:49:17 +00:00