Commit Graph

138616 Commits

Author SHA1 Message Date
glebius
6d6c13e9ac Assert that any epoch tracker belongs to the thread stack.
Reviewed by:	kib
2019-10-21 23:12:14 +00:00
kevans
1879756c5f if_tuntap: remove if_{tun,tap}.ko -> if_tuntap.ko links
These drivers have been merged into a single if_tuntap in 13.0. The
compatibility links existed only for the interim and will be MFC'd along
with the if_tuntap merge shortly.

MFC after:	never
2019-10-21 20:28:38 +00:00
glebius
e344cc8c4e Remove epoch tracker from struct thread. It was an ugly crutch to emulate
locking semantics for if_addr_rlock() and if_maddr_rlock().
2019-10-21 18:19:32 +00:00
glebius
8ec25643ca Remove obsoleted KPIs that were used to access interface address lists. 2019-10-21 18:17:03 +00:00
glebius
5e4ec8a4ab Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:13:37 +00:00
glebius
94b667e416 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:13:33 +00:00
glebius
f4c54dafcb Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:13:28 +00:00
glebius
e5e567ec6b Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:13:24 +00:00
glebius
2cc29d9c44 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:13:19 +00:00
glebius
c329711ad4 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:13:14 +00:00
glebius
2b275c29ae Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:12:58 +00:00
glebius
e72d8e699e Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:12:40 +00:00
glebius
6f81707ddf Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:12:36 +00:00
glebius
8a79afdc2f Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:12:31 +00:00
glebius
70b8dd2ad2 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:12:26 +00:00
glebius
ab1b6a2add Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:12:21 +00:00
glebius
6e3ad73bab Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:12:17 +00:00
glebius
fb73f23ad9 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:12:12 +00:00
glebius
ef00a2a81b Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:12:07 +00:00
glebius
197fdf9314 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:12:02 +00:00
glebius
d2ff69d93a Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:11:58 +00:00
glebius
ecac278aef Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:11:54 +00:00
glebius
dfedfb9d27 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:11:48 +00:00
glebius
309ada9dc2 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:11:43 +00:00
glebius
6d7378a35c Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:11:38 +00:00
glebius
6bb2f449cf Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:11:32 +00:00
glebius
6434b43321 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:11:28 +00:00
glebius
cbeb7969f0 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:11:24 +00:00
glebius
cec3f25f17 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:11:19 +00:00
glebius
d863887af9 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:11:15 +00:00
glebius
e674340498 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:11:11 +00:00
glebius
7ec3a50549 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:11:08 +00:00
glebius
27d717ee7e Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:11:02 +00:00
glebius
5b96bcf7a2 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:10:58 +00:00
glebius
e0e4e8207e Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:10:52 +00:00
glebius
bf1634bd38 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:10:46 +00:00
glebius
226158378d Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:08:25 +00:00
glebius
7c12abe654 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:08:20 +00:00
glebius
b5b11156aa Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:08:16 +00:00
glebius
ba78c18044 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:08:12 +00:00
glebius
cbc7bb6bb0 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:08:03 +00:00
glebius
46300469b6 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:07:56 +00:00
glebius
be119b1749 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:07:53 +00:00
glebius
d06a2c38ba Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:07:49 +00:00
glebius
1eb153cb26 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:07:44 +00:00
glebius
cca28bd831 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:07:40 +00:00
glebius
0192bea570 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:07:35 +00:00
glebius
5e9ef24921 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:07:32 +00:00
glebius
cb9389f457 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:07:28 +00:00
glebius
649327d64a Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:07:24 +00:00
glebius
43af96f030 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:07:19 +00:00
glebius
82c6c16e69 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:07:15 +00:00
glebius
2097d5a248 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:07:11 +00:00
glebius
4962a54bea Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:07:07 +00:00
glebius
d6d16b16c9 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:07:02 +00:00
glebius
a282ba9e7b Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:06:57 +00:00
glebius
4d005b1367 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:06:53 +00:00
glebius
7a35f9d5c4 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:06:31 +00:00
glebius
e2629a827f Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:06:26 +00:00
glebius
faba9462d8 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:06:23 +00:00
glebius
2c0cc8d7b9 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:06:19 +00:00
glebius
b74a0d237e Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:06:15 +00:00
glebius
135942a09a Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:06:09 +00:00
glebius
4982be8fba Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:06:06 +00:00
glebius
a0e94fa30a Convert to if_foreach_llmaddr() KPI.
This driver seems to have a bug.  The bug was carefully preserved during
conversion.  In al_eth_mac_table_unicast_add() the argument 'addr',
which is the actual address, is unused.  So the function is called as
many times as we have addresses, but with exactly the same argument
list.  This doesn't make any sense, but was preserved.
2019-10-21 18:05:43 +00:00
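
The pattern behind this run of conversions, and the bug noted in the entry above, can be illustrated with a small user-space sketch. It is not the kernel KPI: `foreach_lladdr`, `struct lladdr`, `struct filter_table`, and both callbacks are hypothetical names chosen for illustration, and summing the callback return values is an assumption of this sketch. The idea is only that the caller hands a callback to an iterator instead of walking the address list itself, and that a callback which ignores its address argument does identical work on every call.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LL_ADDR_LEN	6
#define FILTER_SLOTS	8

/* Hypothetical stand-in for one link-level address entry. */
struct lladdr {
	uint8_t octets[LL_ADDR_LEN];
};

/* Callback type: caller context, one address, and a running count. */
typedef unsigned int (*lladdr_cb_t)(void *arg, const struct lladdr *addr,
    unsigned int count);

/*
 * Hypothetical "foreach" iterator: the caller never touches the list,
 * it only supplies a callback.  Returns the sum of the callback results.
 */
static unsigned int
foreach_lladdr(const struct lladdr *list, size_t n, lladdr_cb_t cb, void *arg)
{
	unsigned int count = 0;

	for (size_t i = 0; i < n; i++)
		count += cb(arg, &list[i], count);
	return (count);
}

/* Hypothetical hardware filter table with a fixed number of slots. */
struct filter_table {
	struct lladdr slots[FILTER_SLOTS];
	unsigned int used;
};

/* Copies each address it is handed into the next free slot. */
static unsigned int
program_addr(void *arg, const struct lladdr *addr, unsigned int count)
{
	struct filter_table *t = arg;

	(void)count;
	if (t->used >= FILTER_SLOTS)
		return (0);
	memcpy(&t->slots[t->used++], addr, sizeof(*addr));
	return (1);
}

/*
 * Shows the kind of bug described above: 'addr' is ignored, so every
 * invocation rewrites slot 0 with the same value.
 */
static unsigned int
program_ignoring_addr(void *arg, const struct lladdr *addr, unsigned int count)
{
	struct filter_table *t = arg;

	(void)addr;
	(void)count;
	memset(&t->slots[0], 0xff, sizeof(t->slots[0]));
	t->used = 1;
	return (1);
}
```

With three addresses in the list, `program_addr` fills three distinct slots, while `program_ignoring_addr` is still called three times but leaves a single slot programmed.
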
glebius
0ae83a96f8 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:00:17 +00:00
glebius
bc8d002395 Convert to if_foreach_llmaddr() KPI. 2019-10-21 17:59:53 +00:00
glebius
6a821f4424 Convert to if_foreach_llmaddr() KPI. 2019-10-21 17:59:16 +00:00
glebius
eb077d0a6c Convert to if_foreach_llmaddr() KPI. 2019-10-21 17:59:02 +00:00
kevans
26d6f82958 tuntap(4): restrict scope of net.link.tap.user_open slightly
net.link.tap.user_open has historically allowed non-root users to do devfs
cloning and open /dev/tap* nodes based on permissions. Loosen this up to
make it only allow users to do devfs cloning -- we no longer check it in
tunopen.

This allows tap devices to be created that can actually be opened by a user,
rather than swiftly restricting them to root because the magic sysctl has
not been set.

The sysctl has not yet been completely deprecated, because more thought is
needed for how to handle the devfs cloning case. There is not an easy
suitable replacement for the sysctl there, and more care needs to be placed
in determining whether that's OK or not.

PR:		200185
2019-10-21 14:38:11 +00:00
avg
bd88b63725 debug.kassert.warnings is a statistic, not a tunable
MFC after:	1 week
2019-10-21 12:21:56 +00:00
luporl
023b61aa0a [PPC64] Add minidump support to PowerNV
Implementation of PowerNV specific minidump code.

Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D21643
2019-10-21 11:56:57 +00:00
bz
691baa0294 frag6: fix vnet teardown leak
When shutting down a VNET we did not clean up the fragmentation hashes.
This has multiple problems: (1) it leaks memory and (2) it leaks the
global counters.  On a system that starts and stops a lot of vnets and
handles a lot of IPv6 fragments, the counters/limits could eventually
be exhausted, and fragment processing would no longer take place.

Unfortunately we do not have a useable variable to indicate when
per-VNET initialization of frag6 has happened (or when destroy happened)
so introduce a boolean to flag this. This is needed here as well as
it was in r353635 for ip_reass.c in order to avoid tripping over the
already destroyed locks if interfaces go away after the frag6 destroy.

While splitting things up, convert the TRY_LOCK to a LOCK operation in
what is now frag6_drain_one().  The try-lock was derived from a manual
hand-rolled implementation and carried forward all the time.  We can no
longer afford not to get the lock, as that would mean we would continue
to leak memory.

Assert that all the buckets are empty before destroying the lock to
ensure long-term stability of a clean shutdown.

Reported by:	hselasky
Reviewed by:	hselasky
MFC after:	3 weeks
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D22054
2019-10-21 08:48:47 +00:00
bz
8fb71bf4c9 frag6: add read-only sysctl for nfrags.
Add a read-only sysctl exporting the global number of fragments
(base system and all vnets).  This is helpful to (a) know how many
fragments are currently being processed, (b) if there are possible
leaks, (c) if vnet teardown is not working correctly, and lastly
(d) it can be used as part of test suites to verify (a) to (c).

MFC after:	3 weeks
Sponsored by:	Netflix
2019-10-21 08:36:15 +00:00
kevans
70c45ec99a tuntap(4): use cdevpriv w/ dtor for last close instead of d_close
cdevpriv dtors will be called when the reference count on the associated
struct file drops to 0, while d_close can be unreliable for cleaning up
state at "last close" for a number of reasons. As far as tunclose/tundtor is
concerned the difference is minimal, so make the switch.
2019-10-20 22:55:47 +00:00
kevans
58e3081500 tuntap(4): Use make_dev_s to avoid si_drv1 race
This allows us to avoid some dance in tunopen for dealing with the
possibility of dev->si_drv1 being NULL as it's set prior to the devfs node
being created in all cases.

There's still the possibility that the tun device hasn't been fully
initialized, since that's done after the devfs node was created. Alleviate
this by returning ENXIO if we're not to that point of tuncreate yet.

This work is what sparked r353128, full initialization of cloned devices
w/ specified make_dev_args.
2019-10-20 22:39:40 +00:00
kevans
472be58c41 tuntap(4): break out after setting TUN_DSTADDR
This is now the only flag we set in this loop, terminate early.
2019-10-20 21:06:25 +00:00
kevans
1942df0600 tuntap(4): Drop TUN_IASET
This flag appears to have been effectively unused since introduction to
if_tun(4) -- drop it now.
2019-10-20 21:03:48 +00:00
marius
3feb54ebc6 - In em_intr(), just call em_handle_link() instead of duplicating it.
- In em_msix_link(), properly handle IGB-class devices after the iflib(4)
  conversion again by only setting EM_MSIX_LINK for the EM-class 82574
  and by re-arming link interrupts unconditionally, i. e. not only in
  case of spurious interrupts. This fixes the interface link state change
  detection for the IGB-class. [1]
- In em_if_update_admin_status(), only re-arm the link state change
  interrupt for 82574 and also only if such a device uses MSI-X, i. e.
  takes advantage of autoclearing. In case of INTx and MSI as well as
  for LEM- and IGB-class devices, re-arming isn't appropriate here and
  setting EM_MSIX_LINK isn't either.
  While at it, consistently take advantage of the hw variable.

PR:	236724 [1]
Differential Revision:	https://reviews.freebsd.org/D21924
2019-10-20 17:40:50 +00:00
jhibbits
9a2f00bcbd powerpc/booke: Don't zero MAS8, it's unnecessary
MAS8 is hypervisor privileged, defining the logical partition (VM) to
operate on for TLB accesses.  It's already guaranteed to be cleared when
booting bare metal (bootloader needs it zeroed to work), and we can't touch
it from a guest.  Assume that if/when we eventually port bhyve to PowerPC
(and Book-E) the hypervisor module will take care of managing MAS8.  This
saves several tens of clock cycles on each TLB miss.

MFC after:	2 weeks
2019-10-20 15:50:33 +00:00
vmaffione
ed5f1b48e4 netmap: minor misc improvements
- use ring->head rather than ring->cur in lb(8)
- use strlcat() rather than strncat()
- fix bandwidth computation in pkt-gen(8)

MFC after:	1 week
2019-10-20 14:15:45 +00:00
mmel
0af3eff0eb Add driver for DesignWare PCIE core, and its Armada 8K specific attachment.
MFC after:	3 weeks
2019-10-20 11:11:32 +00:00
mmel
189f4bb0a2 Update Armada 8k drivers to cover newly imported DT and latest changes
in simple multifunction driver.
- Follow interrupt changes in DT. Split the old ICU driver into
  function-oriented parts and add drivers for newly defined parts
  (system error interrupts).
- Many drivers are children of the simple multifunction driver. After
  r349596 the simple MF driver no longer exports memory resources, and
  all children must use the syscon interface to access their registers.
  Adapt the affected drivers accordingly.

MFC after:	3 weeks
2019-10-20 10:48:27 +00:00
tuexen
71224580cf Fix compile issues when building a kernel without the VIMAGE option.
Thanks to cem@ for discussing the issue which resulted in this patch.

Reviewed by:		cem@
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D22089
2019-10-19 20:48:53 +00:00
cem
5e4aa6b57b hw.intrbalance: Make sysctl tunable
This allows specifying a boot-time preference in loader.conf.
2019-10-19 16:37:49 +00:00
jhibbits
483d2889cb powerpc/booke pmap: Fix printf format type warnings 2019-10-19 16:09:06 +00:00
jkim
84f1fa9393 Merge ACPICA 20191018. 2019-10-19 14:56:44 +00:00
avg
c9402a53ed remove wmb() call from x86 cpu_reset()
The rationale is pretty much the same as in r353747.
There is no subsequent dependent store.
The store is to the regular (TSO) memory anyway.

MFC after:	23 days
2019-10-19 07:13:15 +00:00
avg
043f2c4bee vmm: remove a wmb() call
After removing wmb(), vm_set_rendezvous_func() became super trivial, so
there was no point in keeping it.

The wmb (sfence on amd64, lock nop on i386) was not needed.  This can be
explained from several points of view.

First, wmb() is used for store-store ordering (although the primitive
is undocumented).  There was no obvious subsequent store that needed the
barrier.

Second, x86 has a memory model with strong ordering including total
store order.  An explicit store barrier may be needed only when working
with special memory (device, special caching mode) or using special
instructions (non-temporal stores).  That was not the case for this
code.

Third, I believe that there is a misconception that sfence "flushes" the
store buffer in a sense that it speeds up the propagation of stores from
the store buffer to the global visibility.  I think that such
propagation always happens as fast as possible.  sfence only makes
subsequent stores wait for that propagation to complete.  So, sfence is
only useful for ordering of stores and only in the situations described
above.

Reviewed by:	jhb
MFC after:	23 days
Differential Revision: https://reviews.freebsd.org/D21978
2019-10-19 07:10:15 +00:00
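
The first argument in the entry above, that a store barrier only matters when a later store must not become visible before an earlier one, can be sketched in portable C11. This is a user-space analogy, not the kernel code: `publish`, `try_consume`, `payload`, and `ready` are hypothetical names, and a C11 release fence is used as a stand-in for wmb() (it is somewhat stronger than pure store-store ordering).

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;		/* plain data written by the producer */
static atomic_bool ready;	/* flag polled by the consumer */

/*
 * Producer: two stores whose order matters.  The fence keeps the
 * payload store from being observed after the flag store.
 */
static void
publish(int value)
{
	payload = value;				/* earlier store */
	atomic_thread_fence(memory_order_release);	/* order it before the next store */
	atomic_store_explicit(&ready, true, memory_order_relaxed);	/* later store */
}

/* Consumer: read the payload only after the flag has been seen. */
static bool
try_consume(int *out)
{
	if (!atomic_load_explicit(&ready, memory_order_relaxed))
		return (false);
	atomic_thread_fence(memory_order_acquire);
	*out = payload;
	return (true);
}
```

If `publish()` performed only the payload store, with no dependent flag store after it, the fence would have nothing to order, which is the situation the commit message describes. On x86 with ordinary memory, a release fence like this one typically compiles to no barrier instruction at all.
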
jhibbits
43e6fd09e6 powerpc/aim: Fix comment typo 2019-10-19 02:47:32 +00:00
jhibbits
23cda530a2 powerpc/mpc85xx: Replace global PCI config mutex with per-controller mutex
PCI controllers need to enforce exclusive config register access on their
own bus, not between all buses.
2019-10-19 01:07:35 +00:00
cem
96be16debb Fix debugnet(4) link/build fallout on some configurations
Introduced in r353685 (sys/conf/files), r353694 (debugnet.c db_printf).

Submitted by:	kevans
Reported by:	cy
X-MFC-With:	r353685, r353694
2019-10-18 22:03:36 +00:00
vmaffione
8e95c619ec tap: add support for virtio-net offloads
This patch is part of an effort to make bhyve networking (in particular TCP)
faster. The key strategy to enhance TCP throughput is to let the whole packet
datapath work with TSO/LRO packets (up to 64KB each), so that the per-packet
overhead is amortized over a large number of bytes.
This capability is supported in the guest by means of the vtnet(4) driver,
which is able to handle TSO/LRO packets leveraging the virtio-net header
(see struct virtio_net_hdr and struct virtio_net_hdr_mrg_rxbuf).
A bhyve VM exchanges packets with the host through a network backend,
which can be vale(4) or if_tap(4).
While vale(4) supports TSO/LRO packets, if_tap(4) does not.
This patch extends if_tap(4) with the ability to understand the virtio-net
header, so that a tapX interface can process TSO/LRO packets.
A couple of ioctl commands have been added to configure and probe the
virtio-net header. Once the virtio-net header is set, the tapX interface
acquires all the IFCAP capabilities necessary for TSO/LRO.

Reviewed by:	kevans
Differential Revision:	https://reviews.freebsd.org/D21263
2019-10-18 21:53:27 +00:00
cem
584954c065 nvdimm(4): Persist unit numbers in cdev
They're formatted into the device name like unit numbers, anyway; store the
number in mda_unit => si_drv0 like dev2unit() expects.

No functional change intended.

Sponsored by:	Dell EMC Isilon
2019-10-18 21:32:45 +00:00
markj
70e6052cef Further constrain the use of per-CPU caches for free pages.
In low memory conditions a significant number of pages may end up stuck
in the caches, and currently these caches cannot be reaped, leading to
spurious memory allocation failures and OOM kills.  So:

- Take into account the fact that we may cache up to two full buckets
  of pages per CPU, not just one.
- Increase the amount of RAM required per CPU to enable the caches.

This is a temporary measure until the page cache management policy is
improved.

PR:		241048
Reported and tested by:	Kevin Oberman <rkoberman@gmail.com>
Reviewed by:	alc, kib
Discussed with:	jeff
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22040
2019-10-18 17:36:42 +00:00
markj
6304690b84 Abbreviate softdep lock names.
The softdep lock names were unusually long and tended to stick out in
lock profiling reports.  Abbreviate them and make them consistent with
our conventional style for lock names.

Reviewed by:	mckusick
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22042
2019-10-18 17:01:27 +00:00
glebius
417268623e Make rt_getifa_fib() static. 2019-10-18 15:20:24 +00:00
markj
436ad09030 Tighten mapping protections on preloaded files on amd64.
- We load the kernel at 0x200000.  Memory below that address need not
  be executable, so do not map it as such.
- Remove references to .ldata and related sections in the kernel linker
  script.  They come from ld.bfd's default linker script, but are not
  used, and we now use ld.lld to link the amd64 kernel.  lld does not
  contain a default linker script.
- Pad the .bss to a 2MB boundary, as we do between .text and .data.
  This forces the loader to load additional files starting in the
  following 2MB page, preserving the use of superpage mappings for
  kernel data.
- Map memory above the kernel image with NX.  The kernel linker now
  upgrades protections as needed, and other preloaded file types
  (e.g., entropy, microcode) need not be mapped with execute permissions
  in the first place.

Reviewed by:	kib
MFC after:	1 month
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D21859
2019-10-18 14:05:13 +00:00
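
The padding item in the entry above comes down to rounding an address up to a 2MB boundary. A minimal sketch of that arithmetic follows; `round_up_pow2`, `next_load_addr`, and `SUPERPAGE_SIZE` are hypothetical names for illustration and are not taken from the loader or the linker script.

```c
#include <stdint.h>

#define SUPERPAGE_SIZE	(2u * 1024 * 1024)	/* 2MB superpage on amd64 */

/* Round addr up to a multiple of align; align must be a power of two. */
static uint64_t
round_up_pow2(uint64_t addr, uint64_t align)
{
	return ((addr + align - 1) & ~(align - 1));
}

/*
 * Given the end of a padded .bss, return the address at which the
 * next preloaded file would start under this sketch's assumption.
 */
static uint64_t
next_load_addr(uint64_t bss_end)
{
	return (round_up_pow2(bss_end, SUPERPAGE_SIZE));
}
```

An address that already sits on a 2MB boundary is returned unchanged; anything past a boundary moves to the next one, so whatever is loaded after the kernel never shares a 2MB page with kernel data.
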
markj
360bcad613 Apply mapping protections to preloaded kernel modules on amd64.
With an upcoming change the amd64 kernel will map preloaded files RW
instead of RWX, so the kernel linker must adjust protections
appropriately using pmap_change_prot().

Reviewed by:	kib
MFC after:	1 month
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D21860
2019-10-18 13:56:45 +00:00
markj
62149395d3 Apply mapping protections to .o kernel modules.
Use the section flags to derive mapping protections.  When multiple
sections overlap within a page, the union of their protections must be
applied.  With r353701 the .text and .rodata sections are padded to
ensure that this does not happen on amd64.

Reviewed by:	kib
MFC after:	1 month
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D21896
2019-10-18 13:53:14 +00:00
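
The rule in the entry above, that a page shared by several sections must get the union of their protections, can be sketched as follows. All names here (`struct section`, `flags_to_prot`, `page_prot`, and the `SK_*` constants) are hypothetical; the real code works with ELF section headers and the kernel's own protection flags.

```c
#include <stddef.h>
#include <stdint.h>

#define SK_PAGE_SIZE	4096u

/* Hypothetical protection bits, in the spirit of VM_PROT_*. */
#define SK_PROT_R	0x1u
#define SK_PROT_W	0x2u
#define SK_PROT_X	0x4u

/* Hypothetical section flags, in the spirit of SHF_WRITE and SHF_EXECINSTR. */
#define SK_SEC_WRITE	0x1u
#define SK_SEC_EXEC	0x2u

struct section {
	uint64_t start;		/* first byte of the section */
	uint64_t size;		/* length in bytes */
	unsigned int flags;	/* SK_SEC_* bits */
};

/* Derive a protection from section flags; every section is readable. */
static unsigned int
flags_to_prot(unsigned int flags)
{
	unsigned int prot = SK_PROT_R;

	if (flags & SK_SEC_WRITE)
		prot |= SK_PROT_W;
	if (flags & SK_SEC_EXEC)
		prot |= SK_PROT_X;
	return (prot);
}

/* Union of the protections of every section overlapping the given page. */
static unsigned int
page_prot(const struct section *secs, size_t n, uint64_t page)
{
	uint64_t page_end = page + SK_PAGE_SIZE;
	unsigned int prot = 0;

	for (size_t i = 0; i < n; i++) {
		uint64_t s = secs[i].start;
		uint64_t e = s + secs[i].size;

		if (secs[i].size != 0 && s < page_end && e > page)
			prot |= flags_to_prot(secs[i].flags);
	}
	return (prot);
}
```

When an executable section and a writable section share a page, the result is read, write, and execute together. That is the outcome the padding of .text and .rodata mentioned above is meant to avoid.
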