Commit Graph

3922 Commits

Author SHA1 Message Date
Matt Macy
d7c5a620e2 ifnet: Replace if_addr_lock rwlock with epoch + mutex
Run on LLNW canaries and tested by pho@

gallatin:
Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5
based ConnectX 4-LX NIC, I see an almost 12% improvement in received
packet rate, and a larger improvement in bytes delivered all the way
to userspace.

When the host receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1,
I see, using nstat -I mce0 1 before the patch:

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
4.98   0.00   4.42   0.00 4235592     33   83.80 4720653 2149771   1235 247.32
4.73   0.00   4.20   0.00 4025260     33   82.99 4724900 2139833   1204 247.32
4.72   0.00   4.20   0.00 4035252     33   82.14 4719162 2132023   1264 247.32
4.71   0.00   4.21   0.00 4073206     33   83.68 4744973 2123317   1347 247.32
4.72   0.00   4.21   0.00 4061118     33   80.82 4713615 2188091   1490 247.32
4.72   0.00   4.21   0.00 4051675     33   85.29 4727399 2109011   1205 247.32
4.73   0.00   4.21   0.00 4039056     33   84.65 4724735 2102603   1053 247.32

After the patch

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
5.43   0.00   4.20   0.00 3313143     33   84.96 5434214 1900162   2656 245.51
5.43   0.00   4.20   0.00 3308527     33   85.24 5439695 1809382   2521 245.51
5.42   0.00   4.19   0.00 3316778     33   87.54 5416028 1805835   2256 245.51
5.42   0.00   4.19   0.00 3317673     33   90.44 5426044 1763056   2332 245.51
5.42   0.00   4.19   0.00 3314839     33   88.11 5435732 1792218   2499 245.52
5.44   0.00   4.19   0.00 3293228     33   91.84 5426301 1668597   2121 245.52

Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch

Reviewed by:	gallatin
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D15366
2018-05-18 20:13:34 +00:00
Matt Macy
f2d19f98c1 epoch(9): allocate net epochs earlier in boot 2018-05-18 18:48:00 +00:00
Matt Macy
d71e30de40 epoch: move epoch variables to read mostly section 2018-05-18 17:58:15 +00:00
Ed Maste
891cf3ed44 Use NULL for SYSINIT's last arg, which is a pointer type
Sponsored by:	The FreeBSD Foundation
2018-05-18 17:58:09 +00:00
Matt Macy
70398c2f86 epoch(9): Make epochs non-preemptible by default
There are risks associated with waiting on a preemptible epoch section.
Change the name to make them not be the default and document the issue
under CAVEATS.

Reported by:	markj
2018-05-18 17:29:43 +00:00
Matt Macy
5e68a3dfe3 epoch: add non-preemptible "critical" variant
adds:
- epoch_enter_critical() - can be called inside a different epoch,
  starts a section that will acquire any MTX_DEF mutexes or do
  anything that might sleep.
- epoch_exit_critical() - corresponding exit call
- epoch_wait_critical() - wait variant that is guaranteed that any
  threads in a section are running.
- epoch_global_critical - an epoch_wait_critical safe epoch instance

Requested by:   markj
Approved by:	sbruno
2018-05-18 01:52:51 +00:00
Matt Macy
2aa6f526de Fix !netmap build post r333686
Approved by:	sbruno
2018-05-16 22:25:47 +00:00
Stephen Hurd
5ee36c68e1 Work around lack of TX IRQs in iflib for netmap
When poll() is called via netmap, txsync is initially called,
and if there are no available buffers to reclaim, it waits for the driver
to notify of new buffers. Since the TX IRQ is generally not used in iflib
drivers, this ends up causing a timeout.

Work around this by having the reclaim DELAY(1) if it's initially unable
to reclaim anything, then schedule the tx task, which will spin by
continuously rescheduling itself until some buffers are reclaimed. In
general, the delay is enough to allow some buffers to be reclaimed, so
spinning is minimized.

Reported by:	Johannes Lundberg <johalun0@gmail.com>
Reviewed by:	sbruno
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D15455
2018-05-16 21:03:22 +00:00
Stephen Hurd
99031b8f7d Replace rmlock with epoch in lagg
Use the new epoch based reclamation API. Now the hot paths will not
block at all, and the sx lock is used for the softc data.  This fixes LORs
reported where the rwlock was obtained when the sxlock was held.

Submitted by:	mmacy
Reported by:	Harry Schmalzbauer <freebsd@omnilan.de>
Reviewed by:	sbruno
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D15355
2018-05-14 20:06:49 +00:00
Matt Macy
09f6ff4f1a iflib(9): Add support for cloning pseudo interfaces
Part 3 of many ...
The VPC framework relies heavily on cloning pseudo interfaces
(vmnics, vpc switch, vcpswitch port, hostif, vxlan if, etc).

This pulls in that piece. Some ancillary changes get pulled
in as a side effect.

Reviewed by:	shurd@
Approved by:	sbruno@
Sponsored by:	Joyent, Inc.
Differential Revision:	https://reviews.freebsd.org/D15347
2018-05-11 20:08:28 +00:00
Andrey V. Elsukov
e287c474be Apply the change from r272770 to if_ipsec(4) interface.
It is guaranteed that if_ipsec(4) interface is used only for tunnel
mode IPsec, i.e. decrypted and decapsultaed packet has its own IP header.
Thus we can consider it as new packet and clear the protocols flags.
This allows ICMP/ICMPv6 properly handle errors that may cause this packet.

PR:		228108
MFC after:	1 week
2018-05-11 16:50:25 +00:00
Matt Macy
5c30b378f0 Allow different bridge types to coexist
if_bridge has a lot of limitations that make it scale poorly to higher data
rates. In my projects/VPC branch I leverage the bridge interface between
layers for my high speed soft switch as well as for purposes of stacking
in general.

Reviewed by:	sbruno@
Approved by:	sbruno@
Differential Revision:	https://reviews.freebsd.org/D15344
2018-05-11 05:00:40 +00:00
Dag-Erling Smørgrav
20f8d7bc7e Slight cleanup of interface event logging.
Make if_printf() use vlog() instead of vprintf().  This means it can no
longer return the number of characters printed, as it used to, but every
single call to if_printf() in the entire kernel ignores the return value
anyway; just return 0 so we don't have to change the prototype.

Consistently use if_printf() throughout sys/net/if.c, instead of a
mixture of if_printf() and log().

In ifa_maintain_loopback_route(), don't needlessly log an error if we
either failed to add a route because it already existed or failed to
remove one because it did not.  We still return an error code, though.

MFC after:	1 week
2018-05-11 00:19:49 +00:00
Matt Macy
7bf272a612 Allocate epoch for networking at startup
Additionally add CK to include paths for modules

Approved by:	sbruno@
2018-05-10 19:13:00 +00:00
Andrey V. Elsukov
2e4531a12b Add IFCAP_LINKSTATE support to if_loop(4).
Reviewed by:	wollman
Obtained from:	Yandex LLC
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D15278
2018-05-09 10:50:51 +00:00
Stephen Hurd
ac88e6da11 iflib: print message when iflib_tx_structures_setup fails
Print a message when iflib_tx_structures_setup fails, like we do for
iflib_rx_structures_setup.

Now that we always print a message from within
iflib_qset_structures_setup when it fails, stop printing one in
iflib_device_register() at the call site.

Submitted by:	Jacob Keller <jacob.e.keller@intel.com>
Reviewed by:	gallatin
MFC after:	3 days
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D15300
2018-05-08 17:15:10 +00:00
Stephen Hurd
6108c01395 iflib: cleanup queues when iflib_device_register fail
Submitted by:	Jacob Keller <jacob.e.keller@intel.com>
Reviewed by:	gallatin
MFC after:	3 days
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D15299
2018-05-08 16:56:02 +00:00
Andrew Gallatin
1f7ce05d1d Fix an off-by-one error when deciding to request a tx interrupt
The canonical check for whether or not a ring is drainable is
TXQ_AVAIL() > MAX_TX_DESC() + 2.  Use this same construct here,
in order to avoid a potential off-by-one error where we might otherwise
fail to request an interrupt.

Reviewed by:	mmacy
Sponsored by:	Netflix
2018-05-07 18:11:22 +00:00
Matt Macy
b6f6f88018 r333175 introduced deferred deletion of multicast addresses in order to permit the driver ioctl
to sleep on commands to the NIC when updating multicast filters. More generally this permitted
driver's to use an sx as a softc lock. Unfortunately this change introduced a race whereby a
a multicast update would still be queued for deletion when ifconfig deleted the interface
thus calling down in to _purgemaddrs and synchronously deleting _all_ of the multicast addresses
on the interface.

Synchronously remove all external references to a multicast address before enqueueing for delete.

Reported by:	lwhsu
Approved by:	sbruno
2018-05-06 20:34:13 +00:00
Matt Macy
7edd877a1e The ifnet pointer (ifp) in rt_newaddrmsg can be valid without ifp->if_addr being set if
if the ifnet is still live by way of a reference but
in line for deletion. Check ifp->if_addr before dereferencing.

Approved by:	sbruno
2018-05-06 20:32:47 +00:00
Mark Johnston
9461882562 Add netdump support to iflib.
em(4) and igb(4) were tested by me, and ixgbe(4) and bnxt(4) were
tested by sbruno.

Reviewed by:	mmacy, shurd
MFC after:	1 month
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D15262
2018-05-06 00:57:52 +00:00
Mark Johnston
e505460228 Import the netdump client code.
This is a component of a system which lets the kernel dump core to
a remote host after a panic, rather than to a local storage device.
The server component is available in the ports tree. netdump is
particularly useful on diskless systems.

The netdump(4) man page contains some details describing the protocol.
Support for configuring netdump will be added to dumpon(8) in a future
commit. To use netdump, the kernel must have been compiled with the
NETDUMP option.

The initial revision of netdump was written by Darrell Anderson and
was integrated into Sandvine's OS, from which this version was derived.

Reviewed by:	bdrewery, cem (earlier versions), julian, sbruno
MFC after:	1 month
X-MFC note:	use a spare field in struct ifnet
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D15253
2018-05-06 00:38:29 +00:00
Matt Macy
1ae4848cd0 fix gcc8 warnings
Approved by:	sbruno
2018-05-04 18:57:05 +00:00
Stephen Hurd
b89827a052 iflib: fix invalid free during queue allocation failure
In r301567, code was added to cleanup to prevent memory leaks for the
Tx and Rx ring structs. This code carefully tracked txq and rxq, and
made sure to free them properly during cleanup.

Because we assigned the txq and rxq pointers into the ctx->ifc_txqs and
ctx->ifc_rxqs, we carefully reset these pointers to NULL, so that
cleanup code would not accidentally free the memory twice.

This was changed by r304021 ("Update iflib to support more NIC designs"),
which removed this resetting of the pointers to NULL, because it re-used
the txq and rxq pointers as an index into the queue set array.

Unfortunately, the cleanup code was left alone. Thus, if we fail to
allocate DMA or fail to configure the queues using the drivers ifdi
methods, we will attempt to free txq and rxq. These variables would now
incorrectly point to the wrong location, resulting in a page fault.

There are a number of methods to correct this, but ultimately the root
cause was that we reuse the txq and rxq pointers for two different
purposes.

Instead, when allocating, store the returned pointer directly into
ctx->ifc_txqs and ctx->ifc_rxqs. Then, assign this to txq and rxq as
index pointers before starting the loop to allocate each queue.
Drop the cleanup code for txq and rxq, and only use ctx->ifc_txqs and
ctx->ifc_rxqs.

Thus, we no longer need to free txq or rxq under any error flow, and
intsead rely solely on the pointers stored in ctx->ifc_txqs and
ctx->ifc_rxqs. This prevents the invalid free(), and ensures that we
still properly cleanup after ourselves as before when failing to
allocate.

Submitted by:	Jacob Keller
Reviewed by:	gallatin, sbruno
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D15285
2018-05-04 15:20:34 +00:00
Stephen Hurd
4d613f5d04 iflib: remove unused brscp pointer from iflib_queues_alloc
This pointer was no longer written to as of r315217. Since nothing writes
to the variable, remove it.

Submitted by:	Jacob Keller <jacob.e.keller@intel.com>
Reviewed by:	gallatin, kmacy, sbruno
Differential Revision:	https://reviews.freebsd.org/D15284
2018-05-04 15:11:16 +00:00
Stephen Hurd
aa8a24d37f Allow iflib NIC drivers to sleep rather than busy wait
Since the move to SMP NIC driver locking has had to go through serious
contortions using mtx around long running hardware operations. This moves
iflib past that.

Individual drivers may now sleep when appropriate.

Submitted by:	Matthew Macy <mmacy@mattmacy.io>
Reviewed by:	shurd
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D14983
2018-05-03 17:02:31 +00:00
Stephen Hurd
f3e1324b41 Separate list manipulation locking from state change in multicast
Multicast incorrectly calls in to drivers with a mutex held causing drivers
to have to go through all manner of contortions to use a non sleepable lock.
Serialize multicast updates instead.

Submitted by:	mmacy <mmacy@mattmacy.io>
Reviewed by:	shurd, sbruno
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D14969
2018-05-02 19:36:29 +00:00
Andrew Gallatin
f759470750 Fix iflib_encap() EFBIG handling bugs
1) Don't give up if m_collapse() fails.  Rather than giving up, try
m_defrag() immediately.

2) Fix a leak where, if the NIC driver rejected the defrag'ed chain
as having too many segments, we would fail to free the chain.

Reviewed by:  Matthew Macy <mmacy@mattmacy.io> (this version of patch)
Submitted by: Matthew Macy <mmacy@mattmacy.io> (early version of leak fix)
2018-04-30 23:53:27 +00:00
Hans Petter Selasky
4a381a9e42 Add network device event for priority code point, PCP, changes.
When the PCP is changed for either a VLAN network interface or when
prio tagging is enabled for a regular ethernet network interface,
broadcast the IFNET_EVENT_PCP event so applications like ibcore can
update its GID tables accordingly.

MFC after:	3 days
Reviewed by:	ae, kib
Differential Revision:	https://reviews.freebsd.org/D15040
Sponsored by:	Mellanox Technologies
2018-04-26 08:58:27 +00:00
Brooks Davis
3edb7f4eaf Translate 32-bit ifmedia requests into native ones.
We use transformation rather than accessors as virtually ever driver
implements SIOCGIFMEDIA and all would have to be touched.

Keep the code readable by always performing copies and (possiably no-op)
transforms.

Reviewed by:	jhb, kib
Obtained from:	CheriBSD
MFC after:	1 week
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14996
2018-04-25 15:30:42 +00:00
Mark Johnston
8bd1f0cfd3 Use dead_bpf_if instead of bp_null.
This fixes a -Wunused error when DEV_BPF and NETGRAPH_BPF are not
defined.

Also remove a stray semicolon added in r332812.

X-MFC with:	r332812
2018-04-24 17:42:25 +00:00
Brooks Davis
4204224162 Finish removing FDDI and tokenring media support.
This fixes media display for 802.11 wireless devices.

Software outside the base system that uses these media types and
defines should use #ifdef IFM_FDDI or IFM_TOKEN to include or remove
support.

Reported by:	zeising
Reviewed by:	emaste, kib, zeising
Tested by:	zeising
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D15170
2018-04-23 21:10:33 +00:00
Andrey V. Elsukov
2b9600b449 Add dead_bpf_if structure, that should be used as fake bpf_if
during ifnet detach.

Since destroying interface is not atomic operation and due to the
lack of synhronization during destroy, it is possible, that in the
time between bpfdetach() and if_free() some queued on destroying
interface mbuf will be used by ether_input_internal() and
bpf_peers_present() can dereference NULL bpf_if pointer. To protect
from this, assign pointer to empty bpf_if_ext structure instead of
NULL pointer after bpfdetach().

Reviewed by:	melifaro, eugen
Obtained from:	Yandex LLC
MFC after:	1 week
Sponsored by:	Yandex LLC
Differential Revision:	https://reviews.freebsd.org/D15083
2018-04-20 09:57:31 +00:00
Stephen Hurd
0b75ac7796 iflib: Fix queue distribution when there are no threads
Previously, if there are no threads, all queues which targeted
cores that share an L2 cache were bound to a single core. The intent is
to distribute them across these cores.

Reported by:	olivier
Reviewed by:	sbruno
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D15120
2018-04-18 15:34:18 +00:00
Brooks Davis
3a4fc8a8a1 Remove support for the Arcnet protocol.
While Arcnet has some continued deployment in industrial controls, the
lack of drivers for any of the PCI, USB, or PCIe NICs on the market
suggests such users aren't running FreeBSD.

Evidence in the PR database suggests that the cm(4) driver (our sole
Arcnet NIC) was broken in 5.0 and has not worked since.

PR:		182297
Reviewed by:	jhibbits, vangyzen
Relnotes:	yes
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D15057
2018-04-13 21:18:04 +00:00
Sean Bruno
7b610b60df Restore r332389 after resolution of locking fixes.
Add one extra lock initialization to iflib_register() that was missed
in the git<->phab conversion.

Split out flag manipulation from general context manipulation in iflib

To avoid blocking on the context lock in the swi thread and risk potential
deadlocks, this change protects lighter weight updates that only need to
be consistent with each other with their own lock.

Submitted by:   Matthew Macy <mmacy@mattmacy.io>
Reviewed by:    shurd
Sponsored by:   Limelight Networks
Differential Revision:  https://reviews.freebsd.org/D14967
2018-04-12 14:35:37 +00:00
Vincenzo Maffione
2ff91c175e netmap: align codebase to the current upstream (commit id 3fb001303718146)
Changelist:
    - Turn tx_rings and rx_rings arrays into arrays of pointers to kring
      structs. This patch includes fixes for ixv, ixl, ix, re, cxgbe, iflib,
      vtnet and ptnet drivers to cope with the change.
    - Generalize the nm_config() callback to accept a struct containing many
      parameters.
    - Introduce NKR_FAKERING to support buffers sharing (used for netmap
      pipes)
    - Improved API for external VALE modules.
    - Various bug fixes and improvements to the netmap memory allocator,
      including support for externally (userspace) allocated memory.
    - Refactoring of netmap pipes: now linked rings share the same netmap
      buffers, with a separate set of kring pointers (rhead, rcur, rtail).
      Buffer swapping does not need to happen anymore.
    - Large refactoring of the control API towards an extensible solution;
      the goal is to allow the addition of more commands and extension of
      existing ones (with new options) without the need of hacks or the
      risk of running out of configuration space.
      A new NIOCCTRL ioctl has been added to handle all the requests of the
      new control API, which cover all the functionalities so far supported.
      The netmap API bumps from 11 to 12 with this patch. Full backward
      compatibility is provided for the old control command (NIOCREGIF), by
      means of a new netmap_legacy module. Many parts of the old netmap.h
      header has now been moved to netmap_legacy.h (included by netmap.h).

Approved by:	hrs (mentor)
2018-04-12 07:20:50 +00:00
Mateusz Guzik
66def52613 iflib: fix up a mismerge in r332419
Lead to crashes on boot while in ifconfig.

Submitted by: Matthew Macy <mmacy@mattmacy.io>
2018-04-12 04:11:37 +00:00
Stephen Hurd
90d72813d2 Properly initialize ifc_nhwtxqs.
Also, since ifc_nhwrxqs is only used in one place, remove it from the struct.
This was preventing iflib_dma_free() from being called via
iflib_device_detach().

Submitted by:	Matthew Macy <mmacy@mattmacy.io>
Reviewed by:	shurd
Sponsored by:	Limelight Networks
2018-04-11 21:41:59 +00:00
Brooks Davis
0437c8e3b1 Remove support for FDDI networks.
Defines in net/if_media.h remain in case code copied from ifconfig is in
use elsewere (supporting non-existant media type is harmless).

Reviewed by:	kib, jhb
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D15017
2018-04-11 17:28:24 +00:00
Sean Bruno
7feb881918 Revert r332389 as it is causing panics for various users and we need
to add some more test cases.
2018-04-11 17:26:53 +00:00
Stephen Hurd
5c1d8c4b73 Split out flag manipulation from general context manipulation in iflib
To avoid blocking on the context lock in the swi thread and risk potential
deadlocks, this change protects lighter weight updates that only need to
be consistent with each other with their own lock.

Submitted by:	Matthew Macy <mmacy@mattmacy.io>
Reviewed by:	shurd
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D14967
2018-04-10 19:48:24 +00:00
Stephen Hurd
f422673e10 Make BPF global lock an SX
This allows NIC drivers to sleep on polling config operations.

Submitted by:	Matthew Macy <mmacy@mattmacy.io>
Reviewed by:	shurd
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D14982
2018-04-10 19:42:50 +00:00
Vincenzo Maffione
4f80b14ce2 netmap: align codebase to upstream version v11.4
Changelist:
  - remove unused nkr_slot_flags
  - new nm_intr adapter callback to enable/disable interrupts
  - remove unused sysctls and document the other sysctls
  - new infrastructure to support NS_MOREFRAG for NIC ports
  - support for external memory allocator (for now linux-only),
    including linux-specific changes in common headers
  - optimizations within netmap pipes datapath
  - improvements on VALE control API
  - new nm_parse() helper function in netmap_user.h
  - various bug fixes and code clean up

Approved by:	hrs (mentor)
2018-04-09 09:24:26 +00:00
Brooks Davis
8a4a4a43f8 Remove the thread argument from ifr_buffer_*() accessors.
They are always used in a context where curthread is the correct thread.
This makes them more similar to the ifr_data_get_ptr() accessor.
2018-04-06 23:25:54 +00:00
Brooks Davis
e7fdc72e95 ifconf(): correct handling of sockaddrs smaller than struct sockaddr.
Portable programs that use SIOCGIFCONF (e.g. traceroute) assume
that each pseudo ifreq is of length MAX(sizeof(struct ifreq),
sizeof(ifr_name) + ifr_addr.sa_len).  For short sockaddrs we copied
too much from the source sockaddr resulting in a heap leak.

I believe only one such sockaddr exists (struct sockaddr_sco which
is 8 bytes) and it is unclear if such sockaddrs end up on interfaces
in practice.  If it did, the result would be an 8 byte heap leak on
current architectures.

admbugs:	869
Reviewed by:	kib
Obtained from:	CheriBSD
MFC after:	3 days
Security:	kernel heap leak
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14981
2018-04-06 20:26:56 +00:00
Brooks Davis
6469bdcdb6 Move most of the contents of opt_compat.h to opt_global.h.
opt_compat.h is mentioned in nearly 180 files. In-progress network
driver compabibility improvements may add over 100 more so this is
closer to "just about everywhere" than "only some files" per the
guidance in sys/conf/options.

Keep COMPAT_LINUX32 in opt_compat.h as it is confined to a subset of
sys/compat/linux/*.c.  A fake _COMPAT_LINUX option ensure opt_compat.h
is created on all architectures.

Move COMPAT_LINUXKPI to opt_dontuse.h as it is only used to control the
set of compiled files.

Reviewed by:	kib, cem, jhb, jtl
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14941
2018-04-06 17:35:35 +00:00
Kristof Provost
adfe2f6aff pf: Improve ioctl validation for DIOCRGETTABLES, DIOCRGETTSTATS, DIOCRCLRTSTATS and DIOCRSETTFLAGS
These ioctls can process a number of items at a time, which puts us at
risk of overflow in mallocarray() and of impossibly large allocations
even if we don't overflow.

Limit the allocation to required size (or the user allocation, if that's
smaller). That does mean we need to do the allocation with the rules
lock held (so the number doesn't change while we're doing this), so it
can't M_WAITOK.

MFC after:	1 week
2018-04-06 15:54:30 +00:00
Brooks Davis
756181b8f5 Add 32-bit compat for ioctls that take struct ifgroupreq.
Use an accessor to access ifgr_group and ifgr_groups.

Use an macro CASE_IOC_IFGROUPREQ(cmd) in place of case statements such
as "case SIOCAIFGROUP:". This avoids poluting the switch statements
with large numbers of #ifdefs.

Reviewed by:	kib
Obtained from:	CheriBSD
MFC after:	1 week
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14960
2018-04-05 22:14:55 +00:00
Brooks Davis
2443045f30 ifconf(): Always zero the whole struct ifreq.
The previous split of zeroing ifr_name and ifr_addr seperately is safe
on current architectures, but would be unsafe if pointers were larger
than 8 bytes. Combining the zeroing adds no real cost (a few
instructions) and makes the security property easier to verify.

Reviewed by:	kib, emaste
Obtained from:	CheriBSD
MFC after:	3 days
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14912
2018-04-05 21:58:28 +00:00