Commit Graph

129329 Commits

Author SHA1 Message Date
Gleb Smirnoff
0407b8bf7a Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:08:03 +00:00
Gleb Smirnoff
e9c5104f43 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:07:56 +00:00
Gleb Smirnoff
f2921d4c06 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:07:53 +00:00
Gleb Smirnoff
0ae47039de Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:07:49 +00:00
Gleb Smirnoff
33253a3794 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:07:44 +00:00
Gleb Smirnoff
9c0d672823 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:07:40 +00:00
Gleb Smirnoff
a51cd4c558 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:07:35 +00:00
Gleb Smirnoff
307b050d3b Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:07:32 +00:00
Gleb Smirnoff
9bf5c8b430 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:07:28 +00:00
Gleb Smirnoff
2db0e8ff39 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:07:24 +00:00
Gleb Smirnoff
ad4cb014c0 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:07:19 +00:00
Gleb Smirnoff
53dc8c2339 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:07:15 +00:00
Gleb Smirnoff
ba583cfed2 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:07:11 +00:00
Gleb Smirnoff
208af059dd Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:07:07 +00:00
Gleb Smirnoff
69d3f28aa4 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:07:02 +00:00
Gleb Smirnoff
99e76377d2 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:06:57 +00:00
Gleb Smirnoff
1d15d9f065 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:06:53 +00:00
Gleb Smirnoff
349ecfc38a Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:06:31 +00:00
Gleb Smirnoff
0322ca23bc Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:06:26 +00:00
Gleb Smirnoff
f0bcd699b8 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:06:23 +00:00
Gleb Smirnoff
15fd62df7e Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:06:19 +00:00
Gleb Smirnoff
6545173041 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:06:15 +00:00
Gleb Smirnoff
16650a016c Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:06:09 +00:00
Gleb Smirnoff
eed57e321a Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:06:06 +00:00
Gleb Smirnoff
57d361630d Convert to if_foreach_llmaddr() KPI.
This driver seems to have a bug.  The bug was carefully saved during
conversion.  In the al_eth_mac_table_unicast_add() the argument 'addr',
which is the actual address is unused.  So, the function is called as
many times as we have addresses, but with the exactly same argument
list.  This doesn't make any sense, but was preserved.
2019-10-21 18:05:43 +00:00
Gleb Smirnoff
8f49db674c Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:00:17 +00:00
Gleb Smirnoff
f1c5eb0d97 Convert to if_foreach_llmaddr() KPI. 2019-10-21 17:59:53 +00:00
Gleb Smirnoff
d6b5965b77 Convert to if_foreach_llmaddr() KPI. 2019-10-21 17:59:16 +00:00
Gleb Smirnoff
7dce56596f Convert to if_foreach_llmaddr() KPI. 2019-10-21 17:59:02 +00:00
Kyle Evans
3d5013337a tuntap(4): restrict scope of net.link.tap.user_open slightly
net.link.tap.user_open has historically allowed non-root users to do devfs
cloning and open /dev/tap* nodes based on permissions. Loosen this up to
make it only allow users to do devfs cloning -- we no longer check it in
tunopen.

This allows tap devices to be created that can actually be opened by a user,
rather than swiftly restricting them to root because the magic sysctl has
not been set.

The sysctl has not yet been completely deprecated, because more thought is
needed for how to handle the devfs cloning case. There is not an easy
suitable replacement for the sysctl there, and more care needs to be placed
in determining whether that's OK or not.

PR:		200185
2019-10-21 14:38:11 +00:00
Andriy Gapon
3ad1ce46d3 debug,kassert.warnings is a statistic, not a tunable
MFC after:	1 week
2019-10-21 12:21:56 +00:00
Leandro Lupori
95ca4720f0 [PPC64] Add minidump support to PowerNV
Implementation of PowerNV specific minidump code.

Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D21643
2019-10-21 11:56:57 +00:00
Bjoern A. Zeeb
67a10c4644 frag6: fix vnet teardown leak
When shutting down a VNET we did not cleanup the fragmentation hashes.
This has multiple problems: (1) leak memory but also (2) leak on the
global counters, which might eventually lead to a problem on a system
starting and stopping a lot of vnets and dealing with a lot of IPv6
fragments that the counters/limits would be exhausted and processing
would no longer take place.

Unfortunately we do not have a useable variable to indicate when
per-VNET initialization of frag6 has happened (or when destroy happened)
so introduce a boolean to flag this. This is needed here as well as
it was in r353635 for ip_reass.c in order to avoid tripping over the
already destroyed locks if interfaces go away after the frag6 destroy.

While splitting things up convert the TRY_LOCK to a LOCK operation in
now frag6_drain_one().  The try-lock was derived from a manual hand-rolled
implementation and carried forward all the time.  We no longer can afford
not to get the lock as that would mean we would continue to leak memory.

Assert that all the buckets are empty before destroying to lock to
ensure long-term stability of a clean shutdown.

Reported by:	hselasky
Reviewed by:	hselasky
MFC after:	3 weeks
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D22054
2019-10-21 08:48:47 +00:00
Bjoern A. Zeeb
65456706c0 frag6: add read-only sysctl for nfrags.
Add a read-only sysctl exporting the global number of fragments
(base system and all vnets).  This is helpful to (a) know how many
fragments are currently being processed, (b) if there are possible
leaks, (c) if vnet teardown is not working correctly, and lastly
(d) it can be used as part of test-suits to ensure (a) to (c).

MFC after:	3 weeks
Sponsored by:	Netflix
2019-10-21 08:36:15 +00:00
Kyle Evans
6025077704 tuntap(4): use cdevpriv w/ dtor for last close instead of d_close
cdevpriv dtors will be called when the reference count on the associated
struct file drops to 0, while d_close can be unreliable for cleaning up
state at "last close" for a number of reasons. As far as tunclose/tundtor is
concerned the difference is minimal, so make the switch.
2019-10-20 22:55:47 +00:00
Kyle Evans
6869d530c7 tuntap(4): Use make_dev_s to avoid si_drv1 race
This allows us to avoid some dance in tunopen for dealing with the
possibility of dev->si_drv1 being NULL as it's set prior to the devfs node
being created in all cases.

There's still the possibility that the tun device hasn't been fully
initialized, since that's done after the devfs node was created. Alleviate
this by returning ENXIO if we're not to that point of tuncreate yet.

This work is what sparked r353128, full initialization of cloned devices
w/ specified make_dev_args.
2019-10-20 22:39:40 +00:00
Kyle Evans
486c0b2269 tuntap(4): break out after setting TUN_DSTADDR
This is now the only flag we set in this loop, terminate early.
2019-10-20 21:06:25 +00:00
Kyle Evans
6041d76e0c tuntap(4): Drop TUN_IASET
This flag appears to have been effectively unused since introduction to
if_tun(4) -- drop it now.
2019-10-20 21:03:48 +00:00
Marius Strobl
cd1cf2fc1d - In em_intr(), just call em_handle_link() instead of duplicating it.
- In em_msix_link(), properly handle IGB-class devices after the iflib(4)
  conversion again by only setting EM_MSIX_LINK for the EM-class 82574
  and by re-arming link interrupts unconditionally, i. e. not only in
  case of spurious interrupts. This fixes the interface link state change
  detection for the IGB-class. [1]
- In em_if_update_admin_status(), only re-arm the link state change
  interrupt for 82574 and also only if such a device uses MSI-X, i. e.
  takes advantage of autoclearing. In case of INTx and MSI as well as
  for LEM- and IGB-class devices, re-arming isn't appropriate here and
  setting EM_MSIX_LINK isn't either.
  While at it, consistently take advantage of the hw variable.

PR:	236724 [1]
Differential Revision:	https://reviews.freebsd.org/D21924
2019-10-20 17:40:50 +00:00
Justin Hibbits
1cf56858b0 powerpc/booke: Don't zero MAS8, it's unnecessary
MAS8 is hypervisor privileged, defining the logical partition (VM) to
operate on for TLB accesses.  It's already guaranteed to be cleared when
booting bare metal (bootloader needs it zeroed to work), and we can't touch
it from a guest.  Assume that if/when we eventually port bhyve to PowerPC
(and Book-E) the hypervisor module will take care of managing MAS8.  This
saves several (tens) of clocks on each TLB miss.

MFC after:	2 weeks
2019-10-20 15:50:33 +00:00
Vincenzo Maffione
760fa2ab5d netmap: minor misc improvements
- use ring->head rather than ring->cur in lb(8)
 - use strlcat() rather than strncat()
 - fix bandwidth computation in pkt-gen(8)

MFC after:	1 week
2019-10-20 14:15:45 +00:00
Michal Meloun
26abae3f17 Add driver for DesignWare PCIE core, and its Armada 8K specific attachement.
MFC after:	3 weeks
2019-10-20 11:11:32 +00:00
Michal Meloun
4b84206b7f Update Armada 8k drivers to cover newly imported DT and latest changes
in simple multifunction driver.
- follow interrupt changes in DT. Split old ICU driver to function oriented
  parts and add drivers for newly defined parts (system error interrupts).
- Many drivers are children of simple multifunction driver. But after r349596
  simple MF driver doesn't longer exports memory resources, and all children
  must use syscon interface to access their registers. Adapt affected
  drivers to this fact.

MFC after:	3 weeks
2019-10-20 10:48:27 +00:00
Michael Tuexen
4ad8cb6813 Fix compile issues when building a kernel without the VIMAGE option.
Thanks to cem@ for discussing the issue which resulted in this patch.

Reviewed by:		cem@
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D22089
2019-10-19 20:48:53 +00:00
Conrad Meyer
1a352c3c1b hw.intrbalance: Make sysctl tunable
This allows specifying a boot-time preference in loader.conf.
2019-10-19 16:37:49 +00:00
Justin Hibbits
4ffdb9f2a4 powerpc/booke pmap: Fix printf format type warnings 2019-10-19 16:09:06 +00:00
Jung-uk Kim
a009b7dcab Merge ACPICA 20191018. 2019-10-19 14:56:44 +00:00
Andriy Gapon
3aa8a8ed2d remove wmb() call from x86 cpu_reset()
The rationale is pretty much the same as in r353747.
There is no subsequent dependent store.
The store is to the regular (TSO) memory anyway.

MFC after:	23 days
2019-10-19 07:13:15 +00:00
Andriy Gapon
869dbab7ba vmm: remove a wmb() call
After removing wmb(), vm_set_rendezvous_func() became super trivial, so
there was no point in keeping it.

The wmb (sfence on amd64, lock nop on i386) was not needed.  This can be
explained from several points of view.

First, wmb() is used for store-store ordering (although, the primitive
is undocumented).  There was no obvious subsequent store that needed the
barrier.

Second, x86 has a memory model with strong ordering including total
store order.  An explicit store barrier may be needed only when working
with special memory (device, special caching mode) or using special
instructions (non-temporal stores).  That was not the case for this
code.

Third, I believe that there is a misconception that sfence "flushes" the
store buffer in a sense that it speeds up the propagation of stores from
the store buffer to the global visibility.  I think that such
propagation always happens as fast as possible.  sfence only makes
subsequent stores wait for that propagation to complete.  So, sfence is
only useful for ordering of stores and only in the situations described
above.

Reviewed by:	jhb
MFC after:	23 days
Differential Revision: https://reviews.freebsd.org/D21978
2019-10-19 07:10:15 +00:00
Justin Hibbits
f1d4707c31 powerpc/aim: Fix comment typo 2019-10-19 02:47:32 +00:00
Justin Hibbits
a877eb6143 powerpc/mpc85xx: Replace global PCI config mutex with per-controller mutex
PCI controllers need to enforce exclusive config register access on their
own bus, not between all buses.
2019-10-19 01:07:35 +00:00
Conrad Meyer
0634308df2 Fix debugnet(4) link/build fallout on some configurations
Introduced in r353685 (sys/conf/files), r353694 (debugnet.c db_printf).

Submitted by:	kevans
Reported by:	cy
X-MFC-With:	r353685, r353694
2019-10-18 22:03:36 +00:00
Vincenzo Maffione
f8bc74e2f4 tap: add support for virtio-net offloads
This patch is part of an effort to make bhyve networking (in particular TCP)
faster. The key strategy to enhance TCP throughput is to let the whole packet
datapath work with TSO/LRO packets (up to 64KB each), so that the per-packet
overhead is amortized over a large number of bytes.
This capability is supported in the guest by means of the vtnet(4) driver,
which is able to handle TSO/LRO packets leveraging the virtio-net header
(see struct virtio_net_hdr and struct virtio_net_hdr_mrg_rxbuf).
A bhyve VM exchanges packets with the host through a network backend,
which can be vale(4) or if_tap(4).
While vale(4) supports TSO/LRO packets, if_tap(4) does not.
This patch extends if_tap(4) with the ability to understand the virtio-net
header, so that a tapX interface can process TSO/LRO packets.
A couple of ioctl commands have been added to configure and probe the
virtio-net header. Once the virtio-net header is set, the tapX interface
acquires all the IFCAP capabilities necessary for TSO/LRO.

Reviewed by:	kevans
Differential Revision:	https://reviews.freebsd.org/D21263
2019-10-18 21:53:27 +00:00
Conrad Meyer
43e4b6ca7f nvdimm(4): Persist unit numbers in cdev
They're formatted into the device name like unit numbers, anyway; store the
number in mda_unit => si_drv0 like dev2unit() expects.

No functional change intended.

Sponsored by:	Dell EMC Isilon
2019-10-18 21:32:45 +00:00
Mark Johnston
d307bdcc2c Further constrain the use of per-CPU caches for free pages.
In low memory conditions a significant number of pages may end up stuck
in the caches, and currently these caches cannot be reaped, leading to
spurious memory allocation failures and OOM kills.  So:

- Take into account the fact that we may cache up to two full buckets
  of pages per CPU, not just one.
- Increase the amount of RAM required per CPU to enable the caches.

This is a temporary measure until the page cache management policy is
improved.

PR:		241048
Reported and tested by:	Kevin Oberman <rkoberman@gmail.com>
Reviewed by:	alc, kib
Discussed with:	jeff
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22040
2019-10-18 17:36:42 +00:00
Mark Johnston
c456a0a1a6 Abbreviate softdep lock names.
The softdep lock names were unusually long and tended to stick out in
lock profiling reports.  Abbreviate them and make them consistent with
our conventional style for lock names.

Reviewed by:	mckusick
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22042
2019-10-18 17:01:27 +00:00
Gleb Smirnoff
fda454099b Make rt_getifa_fib() static. 2019-10-18 15:20:24 +00:00
Mark Johnston
14327f5334 Tighten mapping protections on preloaded files on amd64.
- We load the kernel at 0x200000.  Memory below that address need not
  be executable, so do not map it as such.
- Remove references to .ldata and related sections in the kernel linker
  script.  They come from ld.bfd's default linker script, but are not
  used, and we now use ld.lld to link the amd64 kernel.  lld does not
  contain a default linker script.
- Pad the .bss to a 2MB as we do between .text and .data.  This
  forces the loader to load additional files starting in the following
  2MB page, preserving the use of superpage mappings for kernel data.
- Map memory above the kernel image with NX.  The kernel linker now
  upgrades protections as needed, and other preloaded file types
  (e.g., entropy, microcode) need not be mapped with execute permissions
  in the first place.

Reviewed by:	kib
MFC after:	1 month
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D21859
2019-10-18 14:05:13 +00:00
Mark Johnston
f822c9e287 Apply mapping protections to preloaded kernel modules on amd64.
With an upcoming change the amd64 kernel will map preloaded files RW
instead of RWX, so the kernel linker must adjust protections
appropriately using pmap_change_prot().

Reviewed by:	kib
MFC after:	1 month
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D21860
2019-10-18 13:56:45 +00:00
Mark Johnston
1d9eae9fb2 Apply mapping protections to .o kernel modules.
Use the section flags to derive mapping protections.  When multiple
sections overlap within a page, the union of their protections must be
applied.  With r353701 the .text and .rodata sections are padded to
ensure that this does not happen on amd64.

Reviewed by:	kib
MFC after:	1 month
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D21896
2019-10-18 13:53:14 +00:00
Andriy Gapon
44a57bfe3a gpioiic: add the detach method
bus_generic_detach was not enough, we also need to clean up the iicbus
child device.

MFC after:	1 week
2019-10-18 12:34:30 +00:00
Yuri Pankov
a161fba992 linux: futex_mtx should follow futex_list
Move futex_mtx to linux_common.ko for amd64 and aarch64 along
with respective list/mutex init/destroy.

PR:		240989
Reported by:	Alex S <iwtcex@gmail.com>
2019-10-18 12:25:33 +00:00
Yuri Pankov
b9d3556a34 linux: provide just one instance of futex_list
Move futex_list definition to linux.c which is included once
in linux.ko (i386) and in linux_common.ko (amd64 and aarch64)
allowing 32/64 bit linux programs to access the same futexes
in the latter case.

PR:		240989
Reviewed by:	dchagin
Differential Revision:	https://reviews.freebsd.org/D22073
2019-10-18 10:28:08 +00:00
Kristof Provost
a0d571cbef pf: Must be in NET_EPOCH to call icmp_error
icmp_reflect(), called through icmp_error() requires us to be in NET_EPOCH.
Failure to hold it leads to the following panic (with INVARIANTS):

  panic: Assertion in_epoch(net_epoch_preempt) failed at /usr/src/sys/netinet/ip_icmp.c:742
  cpuid = 2
  time = 1571233273
  KDB: stack backtrace:
  db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00e0977920
  vpanic() at vpanic+0x17e/frame 0xfffffe00e0977980
  panic() at panic+0x43/frame 0xfffffe00e09779e0
  icmp_reflect() at icmp_reflect+0x625/frame 0xfffffe00e0977aa0
  icmp_error() at icmp_error+0x720/frame 0xfffffe00e0977b10
  pf_intr() at pf_intr+0xd5/frame 0xfffffe00e0977b50
  ithread_loop() at ithread_loop+0x1c6/frame 0xfffffe00e0977bb0
  fork_exit() at fork_exit+0x80/frame 0xfffffe00e0977bf0
  fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00e0977bf0

Note that we now enter NET_EPOCH twice if we enter ip_output() from pf_intr(),
but ip_output() will soon be converted to a function that requires epoch, so
entering NET_EPOCH directly from pf_intr() makes more sense.

Discussed with:	glebius@
2019-10-18 03:36:26 +00:00
Conrad Meyer
7bf55415d5 nvdimm_e820: Fix braino in size=all SPA hint
The sentinel value for "use the rest of the region," -1, isn't zero modulo
PAGE_SIZE.  Relax the check to permit the intended special value.

X-MFC-With:	r353110
Sponsored by:	Dell EMC Isilon
2019-10-18 03:01:21 +00:00
Conrad Meyer
b1f22a0083 x86: Remove unused variable from r353712
It was in my git tree (uncommitted) and didn't get carried over to SVN in
r353712.

X-MFC-With:	r353712
2019-10-18 02:25:30 +00:00
Conrad Meyer
bb044eaf54 x86: Fetch and save standard CPUID leaf 6 in identcpu
Rather than a few scattered places in the tree.  Organize flag names in a
contiguous region of specialreg.h.

While here, delete deprecated PCOMMIT from leaf 7.

No functional change.
2019-10-18 02:18:17 +00:00
Conrad Meyer
6310546dba gdb(4): Implement support for NoAckMode
When the underlying debugport transport is reliable, GDB's additional
checksums and acknowledgements are redundant.  NoAckMode eliminates the
the acks and allows us to skip checking RX checksums.  The GDB packet
framing does not change, so unfortunately (valid) checksums are still
included as message trailers.

The gdb(4) stub in FreeBSD advertises support for the feature in response to
the client's 'qSupported' request IFF the current debugport has the
gdb_dbfeatures flag GDB_DBGP_FEAT_RELIABLE set.  Currently, only netgdb(4)
supports this feature.

If the remote GDB client supports the feature and does not have it disabled
via a GDB configuration knob, it may instruct our gdb(4) stub to enter
NoAckMode.  Unless and until it issues that command, we must continue to
transmit acks as usual (and for now, we continue to wait until we receive
them as well, even if we know the debugport is on a reliable transport).

In the kernel sources, the sense of the flag representing the state of the
feature is reversed from that of the GDB command.  (I.e., it is
'gdb_ackmode', not 'gdb_noackmode.')  This is to avoid confusing double-
negative conditions.

For reference, see:
  * https://sourceware.org/gdb/onlinedocs/gdb/Packet-Acknowledgment.html
  * https://sourceware.org/gdb/onlinedocs/gdb/General-Query-Packets.html#QStartNoAckMode

Reviewed by:	jhb, markj (both earlier version)
Differential Revision:	https://reviews.freebsd.org/D21761
2019-10-17 22:37:25 +00:00
Mark Johnston
a211ca52cd Add an ldscript for amd64 kernel modules.
Use it to pad the text and read-only data sections to a 4KB boundary.
This will be used to enforce strict memory protections for some
sections of loadable kernel modules.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D21970
2019-10-17 21:39:23 +00:00
Conrad Meyer
dda17b3672 Implement NetGDB(4)
NetGDB(4) is a component of a system using a panic-time network stack to
remotely debug crashed FreeBSD kernels over the network, instead of
traditional serial interfaces.

There are three pieces in the complete NetGDB system.

First, a dedicated proxy server must be running to accept connections from
both NetGDB and gdb(1), and pass bidirectional traffic between the two
protocols.

Second, the NetGDB client is activated much like ordinary 'gdb' and
similarly to 'netdump' in ddb(4) after a panic.  Like other debugnet(4)
clients (netdump(4)), the network interface on the route to the proxy server
must be online and support debugnet(4).

Finally, the remote (k)gdb(1) uses 'target remote <proxy>:<port>' (like any
other TCP remote) to connect to the proxy server.

The NetGDB v1 protocol speaks the literal GDB remote serial protocol, and
uses a 1:1 relationship between GDB packets and sequences of debugnet
packets (fragmented by MTU).  There is no encryption utilized to keep
debugging sessions private, so this is only appropriate for local
segments or trusted networks.

Submitted by:	John Reimer <john.reimer AT emc.com> (earlier version)
Discussed some with:	emaste, markj
Relnotes:	sure
Differential Revision:	https://reviews.freebsd.org/D21568
2019-10-17 21:33:01 +00:00
Mark Johnston
092bacb2c4 Clean up some nits in link_elf_(un)load_file().
- Remove a redundant assignment of ef->address.
- Don't return a Mach error number to the caller if vm_map_find() fails.
- Use ptoa() and fix style.

MFC after:	2 weeks
Sponsored by:	Netflix
2019-10-17 21:25:50 +00:00
Mark Johnston
516dc824d9 Belatedly bump __FreeBSD_version for r353537 and related commits.
At least one small update to the out-of-tree DRM drivers is required
now that cdev_pager_free_page() expects an xbusy page.

Discussed with:	jeff, zeising
2019-10-17 20:46:33 +00:00
Conrad Meyer
e9c6962599 debugnet(4): Add optional full-duplex mode
It remains unattached to any client protocol.  Netdump is unaffected
(remaining half-duplex).  The intended consumer is NetGDB.

Submitted by:	John Reimer <john.reimer AT emc.com> (earlier version)
Discussed with:	markj
Differential Revision:	https://reviews.freebsd.org/D21541
2019-10-17 20:25:15 +00:00
Gleb Smirnoff
b2807792f1 Revert two parts of r353292 that enter epoch when processing vlan capabilities.
It could be that entering epoch isn't necessary here, but better take a
conservative approach.

Submitted by:	kp
2019-10-17 20:18:07 +00:00
Conrad Meyer
fde2cf65ce debugnet(4): Infer non-server connection parameters
Loosen requirements for connecting to debugnet-type servers.  Only require a
destination address; the rest can theoretically be inferred from the routing
table.

Relax corresponding constraints in netdump(4) and move ifp validation to
debugnet connection time.

Submitted by:	John Reimer <john.reimer AT emc.com> (earlier version)
Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D21482
2019-10-17 20:10:32 +00:00
Conrad Meyer
757dbc72ef acpica: Fix for the fix, unfortunately
Follow-up to incomplete pedantic change in r353691 by actually fixing the
default implementation to match the interface type.  Mea culpa.

X-MFC-With:	r353691, r339754
2019-10-17 19:53:55 +00:00
Conrad Meyer
8270d35eca Add ddb(4) 'netdump' command to netdump a core without preconfiguration
Add a 'X -s <server> -c <client> [-g <gateway>] -i <interface>' subroutine
to the generic debugnet code.  The imagined use is both netdump, shown here,
and NetGDB (vaporware).  It uses the ddb(4) lexer, with some new extensions,
to parse out IPv4 addresses.

'Netdump' uses the generic debugnet routine to load a configuration and
start a dump, without any netdump configuration prior to panic.

Loosely derived from work by:	John Reimer <john.reimer AT emc.com>
Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D21460
2019-10-17 19:49:20 +00:00
Conrad Meyer
454d8a1ebb acpica: Match ID_PROBE default implementation to interface
After r339754, the additional interface parameter was accidentally left out
of the default acpi_generic_id_probe implementation.  Apparently this does
not cause any real problems, so this fix is mostly stylistic.

No functional change intended.

X-MFC-With:	r339754
2019-10-17 18:45:11 +00:00
Conrad Meyer
addccb8c51 Add a very limited DDB dumpon(8)-alike to MI dumper code
This allows ddb(4) commands to construct a static dumperinfo during
panic/debug and invoke doadump(false) using the provided dumper
configuration (always inserted first in the list).

The intended usecase is a ddb(4)-time netdump(4) command.

Reviewed by:	markj (earlier version)
Differential Revision:	https://reviews.freebsd.org/D21448
2019-10-17 18:29:44 +00:00
Conrad Meyer
6d567ec2da debugnet: Respond to broadcast ARP requests
The in-tree netdump code has always ignored non-directed ARP requests, and
that seems to work most of the time for netdump.

In my work and testing on NetGDB, it seems like sometimes the remote FreeBSD
conversant (the non-panic system) will send broadcast-destination ARP
requests to the debugnet kernel; without this change, those are dropped and
the remote will see EHOSTDOWN "Host is down" errors from the userspace
interface of the network stack.

Discussed with:	markj
2019-10-17 17:48:32 +00:00
Conrad Meyer
d39756c142 debugnet(4): Check hardware-validated UDP checksums
Similar to INET checksums, lazily validate UDP checksums when the driver has
already performed the check for us.  Like debugnet(4) INET checksums,
validation in software is left as future work.

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D21745
2019-10-17 17:19:16 +00:00
Gleb Smirnoff
c0ebee428e Quickly fix up r353683: enter the epoch before calling into netisr_dispatch(). 2019-10-17 17:02:50 +00:00
Ed Maste
b1460baa73 Update Conrad Meyer's email
cem is now a committer

Approved by:	cem
2019-10-17 16:38:44 +00:00
Conrad Meyer
7790c8c199 Split out a more generic debugnet(4) from netdump(4)
Debugnet is a simplistic and specialized panic- or debug-time reliable
datagram transport.  It can drive a single connection at a time and is
currently unidirectional (debug/panic machine transmit to remote server
only).

It is mostly a verbatim code lift from netdump(4).  Netdump(4) remains
the only consumer (until the rest of this patch series lands).

The INET-specific logic has been extracted somewhat more thoroughly than
previously in netdump(4), into debugnet_inet.c.  UDP-layer logic and up, as
much as possible as is protocol-independent, remains in debugnet.c.  The
separation is not perfect and future improvement is welcome.  Supporting
INET6 is a long-term goal.

Much of the diff is "gratuitous" renaming from 'netdump_' or 'nd_' to
'debugnet_' or 'dn_' -- sorry.  I thought keeping the netdump name on the
generic module would be more confusing than the refactoring.

The only functional change here is the mbuf allocation / tracking.  Instead
of initiating solely on netdump-configured interface(s) at dumpon(8)
configuration time, we watch for any debugnet-enabled NIC for link
activation and query it for mbuf parameters at that time.  If they exceed
the existing high-water mark allocation, we re-allocate and track the new
high-water mark.  Otherwise, we leave the pre-panic mbuf allocation alone.
In a future patch in this series, this will allow initiating netdump from
panic ddb(4) without pre-panic configuration.

No other functional change intended.

Reviewed by:	markj (earlier version)
Some discussion with:	emaste, jhb
Objection from:	marius
Differential Revision:	https://reviews.freebsd.org/D21421
2019-10-17 16:23:03 +00:00
Gleb Smirnoff
756368b68b igmp_v1v2_queue_report() doesn't require epoch. 2019-10-17 16:02:34 +00:00
Ed Maste
b7d6b3ffec snd_hda: style(9) whitespace fixup
PR:		241299
Submitted by:	Neel Chauhan
2019-10-17 14:58:03 +00:00
Konstantin Belousov
303fa05a1f swapon_check_swzone(): use already calculated static variables.
Submitted by:	ota@j.email.ne.jp
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D22065
2019-10-17 13:49:47 +00:00
Ed Maste
0a7e64cc55 vt: remove comment that is not true since r259680
r259680 added support to vt(4) for printing double-width characters.
Remove the comment that claims no support.

MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
2019-10-17 13:08:50 +00:00
Andriy Gapon
5fdc2c044e provide a way to assign taskqueue threads to a kernel process
This can be used to group all threads belonging to a single logical
entity under a common kernel process.
I am planning to use the new interface for ZFS threads.

MFC after:	4 weeks
2019-10-17 06:32:34 +00:00
Andriy Gapon
fc0b98158d wbwd: small clean-ups and improvements
This change applies some suggestions by delphij from D21979.
A write-only variable is removed.
There is a diagnostic message if the driver does not recognize the chip.
A chained if-statement is converted to a switch.

MFC after:	3 weeks
2019-10-17 06:21:09 +00:00
Philip Paeps
579b70db89 ether: add older ethertype definitions for QinQ
Older network equipment used the ethertypes 0x9100, 0x9200, and 0x9300 for
outer VLANs, before standardisation introduced 0x88a8.

Submitted by:	 Lutz Donnerhacke <lutz_donnerhacke.de>
Differential Revision:	https://reviews.freebsd.org/D21846
2019-10-17 00:34:53 +00:00
Mark Johnston
d6793db233 Formalize the use of linker scripts for kernel modules.
Automatically apply ldscript.kmod.${MACHINE_ARCH} if it exists.
We already have an i386-specific linker script; rename it accordingly.

Note that the linker script is applied when the object files are
partially linked.  (For amd64 this is also the final link.)

Reviewed by:	imp, kib
Discussed with:	jhb
MFC after:	1 week
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D21887
2019-10-16 22:19:56 +00:00
Mark Johnston
6e6d41f20d Introduce pmap_change_prot() for amd64.
This updates the protection attributes of subranges of the kernel map.
Unlike pmap_protect(), which is typically used for user mappings,
pmap_change_prot() does not perform lazy upgrades of protections.
pmap_change_prot() also updates the aliasing range of the direct map.

Reviewed by:	kib
MFC after:	1 month
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D21758
2019-10-16 22:12:34 +00:00
Mark Johnston
6d775f0ba1 Use KOBJMETHOD_END in the kernel linker.
MFC after:	1 week
2019-10-16 22:06:19 +00:00
Mark Johnston
01cef4caa7 Remove page locking from pmap_mincore().
After r352110 the page lock no longer protects a page's identity, so
there is no purpose in locking the page in pmap_mincore().  Instead,
if vm.mincore_mapped is set to the non-default value of 0, re-lookup
the page after acquiring its object lock, which holds the page's
identity stable.

The change removes the last callers of vm_page_pa_tryrelock(), so
remove it.

Reviewed by:	kib
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D21823
2019-10-16 22:03:27 +00:00
Chuck Silvers
30738a349d Make all the gnop parameters optional in the request from userland,
filling in the same defaults that the current userland module uses.
This allows an old geom_nop.so userland module to work with a new kernel.

Approved by:	imp (mentor)
Reviewed by:	cem
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D21972
2019-10-16 21:49:44 +00:00
Chuck Silvers
1e00bb458d Add a new gctl_get_paraml_opt() interface to extract optional parameters from
the request.  It is the same as gctl_get_paraml() except that the request
is not marked with an error if the parameter is not present.

Approved by:	imp (mentor)
Reviewed by:	cem
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D21972
2019-10-16 21:49:39 +00:00
Mark Johnston
d0c9294b81 Correct the range boundaries used by kern_mincore().
Reported by:	alc
Sponsored by:	Netflix
2019-10-16 21:47:58 +00:00
Konstantin Belousov
96448820a3 Port r353622 to sparc64 and arm v4.
Noted by:	alc
Reviewed by:	alc, jeff, markj
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D22056
2019-10-16 21:07:18 +00:00
Conrad Meyer
f677fed5a2 ddb: Add support for disassembling 'crc32' on amd64 2019-10-16 18:27:27 +00:00
Eric Joyner
087ea4103c Fix compile error introduced in r353658
"adapter" doesn't exist in ixl.

Reported by:	O. Hartmann <ohartmann@walstatt.org>
2019-10-16 18:12:22 +00:00
Eric Joyner
f17e0a71cd ixl: report whether device received pause frames
From Jake:
When updating the device statistics, report whether or not we have
received any pause frames to the iflib stack. This allows the iflib
stack to avoid generating a Tx hang message while the device is paused.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>

Submitted by:	Jacob Keller <jacob.e.keller@intel.com>
Reviewed by:	gallatin@
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D21870
2019-10-16 17:19:17 +00:00
Eric Joyner
e37d3dc11c ix: report isc_pause_frames during stat update
From Jake:
Notify the iflib stack of whether we received any pause frames during
the timer window. This allows the stack to avoid reporting a Tx hang due
to the device being paused.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>

Submitted by:	Jacob Keller <jacob.e.keller@intel.com>
Reviewed by:	gallatin@
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D21869
2019-10-16 17:16:32 +00:00
Eric Joyner
ea0e3f4db5 e1000: correctly set isc_pause_frames only when XOFF increases
From Jake:
The e1000 driver sets the iflib shared context isc_pause_frames value to
the number of received xoff frames. This is done so that the iflib
watchdog timer won't trigger a Tx Hang due to pause frames.

Unfortunately, the function simply sets it to the value of the xoffrxc
counter. Once the device has received a single XOFF packet, the driver
always reports that we received pause frames. This will prevent the Tx
hang detection entirely from that point on.

Fix this by assigning isc_pause_frames to a non-zero value if we
received any XOFF packets in the last timer interval.

We could attempt to calculate the total number of received packets by
doing a subtraction, but the iflib stack only seems to check if
isc_pause_frames is non-zero.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>

Submitted by:	Jacob Keller <jacob.e.keller@intel.com>
Reviewed by:	gallatin@
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D21868
2019-10-16 17:13:46 +00:00
Gleb Smirnoff
b46d70fd88 do_link_state_change() is executed in taskqueue context and in
general is allowed to sleep.  Don't enter the epoch for the
whole duration.  If some event handlers need the epoch, they
should handle that theirselves.

Discussed with:	hselasky
2019-10-16 16:32:58 +00:00
Ian Lepore
caf894f793 Update some comments; no functional changes. Some historical old comments
in this driver indicate that the SD_CAPA register is write-once and after
being set one time the values in it cannot be changed.  That turns out not
to be the case -- the values written to it survive a reset, but they can
be rewritten/changed at any time.
2019-10-16 16:26:35 +00:00
Ian Lepore
d436d6e96f Revert r351218 (by manu). While the changes in r351218 appear to be (and
should be) correct, they lead to the eMMC on a Beaglebone failing to work
in some situations.

The TI sdhci hardware is kind of strange.  The first device inherently
supports 1.8v and 3.3v and the abililty to switch between them, and the
other two devices must be set to 1.8v in the sdhci power control register to
operate correctly, but doing so actually makes them run at 3.3v (unless an
external level-shifter is present in the signal path).  Even the 1.8v on the
first device may actually be 3.3v (or any other value), depending on what
voltage is fed to the VDDS1-VDDS7 power supply pins on the am335x chip.

Another strange quirk is that the convention for am335x sdhci drivers in
linux and uboot and the am335x boot ROM seems to be to set the voltage in
the sdhci capabilities register to 3.0v even though the actual voltage is
3.3v.  Why this is done is a complete mystery to me, but it seems to be
required for correct operation.

If we had complete modern support for the am335x chip we could get the
actual voltages from the FDT data and the regulator framework.  But our
am335x code currently doesn't have any regulator framework support.
Reverting to the prior code will get the popular Beaglebone boards working
again.

This is part of the fix for PR 241301, but also requires r353651 for a
complete fix.

PR:		241301
Discussed with: manu
2019-10-16 16:19:21 +00:00
Ian Lepore
49dfdf633a Relax the sdhci(4) check that filters out the 1.8v voltage option unless
the slot is flagged as 'embedded'.

The features related to embedded and shared slots were added in v3.0 of
the sdhci spec.  Hardware prior to v3 sometimes supported 1.8v on non-
removable devices in embedded systems, but had no way to indicate that
via the standard sdhci registers (instead they use out of band metadata
such as FDT data).

This change adds the controller specification version to the check for
whether to filter out the 1.8v selection.  On older hardware, the 1.8v
option is allowed to remain.  On 3.0 or later it still requires the
embedded-slot flag to remain.

This is part of the fix for PR 241301 (eMMC not detected on Beaglebone).
Changes to the sdhci_ti driver are also needed for a full fix.

PR:		241301
2019-10-16 16:03:19 +00:00
Mark Johnston
b4efea53e0 Clear PGA_WRITEABLE in moea_pvo_remove().
moea_pvo_remove() might remove the last mapping of a page, in which case
it is clearly no longer writeable.  This can happen via pmap_remove(),
or when a CoW fault removes the last mapping of the old page.

Reported and tested by:	bdragon
Reviewed by:	alc, bdragon, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22044
2019-10-16 15:50:12 +00:00
Andriy Gapon
88f8e0984f attach itwd to the module build on x86
MFC after:	19 days
X-MFC with:	r353647
2019-10-16 15:01:44 +00:00
Andriy Gapon
edca4938f7 itwd(4): driver for watchdog function in ITE Super I/O chips
The chips are commonly named with "IT" prefix.

MFC after:	19 days
2019-10-16 14:57:38 +00:00
Andriy Gapon
bc37ac7ea8 wbwd: move to superio(4) bus
This allows to remove a bunch of low level code.
Also, superio(4) provides safer interaction with other drivers
that work with Super I/O configuration registers.

Tested only on PCengines APU2:
superio0: <Nuvoton NCT5104D/NCT6102D/NCT6106D (rev. B+)> at port 0x2e-0x2f on isa0
wbwd0: <Nuvoton NCT6102 (0xc4/0x53) Watchdog Timer> at WDT ldn 0x08 on superio0

The watchdog output is incorrectly wired on that system and the watchdog
does not really do it its job, but the pulse can be seen with a signal
analyzer.

Reviewed by:	delphij, bcr (man page)
MFC after:	19 days
Differential Revision: https://reviews.freebsd.org/D21979
2019-10-16 14:46:04 +00:00
Andriy Gapon
e3df342a32 move nctgpio to superio(4) bus
This is where it logically belongs.
The change allows to drop a bunch of low lewel code.

Reviewed by:	gonzo
MFC after:	19 days
Differential Revision: https://reviews.freebsd.org/D21980
2019-10-16 14:42:49 +00:00
Emmanuel Vadot
3635891a92 dwc3: Use a pair of ()'s around arguments for some macros
Reported by:	hselasky
MFC after:	1 week
X-MFC-With:	r353533
2019-10-16 13:53:53 +00:00
Andrew Turner
e8709d7d42 Use tables to store the information to decode the arm64 ID registers.
Arm updates these with each new architecture revision. To help keep them
updated use a collection of tables to hold the needed information to
decode these registers.

Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D22020
2019-10-16 13:30:28 +00:00
Andrew Turner
9bb37c03fb Stop leaking information from the kernel through timespec
The timespec struct holds a seconds value in a time_t and a nanoseconds
value in a long. On most architectures these are the same size, however
on 32-bit architectures other than i386 time_t is 8 bytes and long is
4 bytes.

Most ABIs will then pad a struct holding an 8 byte and 4 byte value to
16 bytes with 4 bytes of padding. When copying one of these structs the
compiler is free to copy the padding if it wishes.

In this case the padding may contain kernel data that is then leaked to
userspace. Fix this by copying the timespec elements rather than the
entire struct.

This doesn't affect Tier-1 architectures so no SA is expected.

admbugs:	651
MFC after:	1 week
Sponsored by:	DARPA, AFRL
2019-10-16 13:21:01 +00:00
Andriy Gapon
b6528d546f MFV r353637: 10844 Serialize ZTHR operations to eliminate races
illumos/illumos-gate@6a316e1f6d
6a316e1f6d

https://www.illumos.org/issues/10844
  ZoL 61c3391acc Serialize ZTHR operations to eliminate races

Portions contributed by: Jerry Jelinek <jerry.jelinek@joyent.com>
Author: Serapheim Dimitropoulos <serapheim@delphix.com>
Obtained from:	illumos, ZoL
MFC after:	3 weeks
2019-10-16 09:29:01 +00:00
Andriy Gapon
428c45f156 MFV r353630: 10809 Performance optimization of AVL tree comparator functions
illumos/illumos-gate@c4ab0d3f46
c4ab0d3f46

https://www.illumos.org/issues/10809
  Port ZoL ee36c709c3 Performance optimization of AVL tree comparator functions

This is a followup to r337567 that imported the ZoL commit directly into
FreeBSD.  It seems that at the time we did not have some of the earlier
changes, so some pieces of the ZoL change were not applicable.  Also,
the illumos version got a few style cleanups.  Some changes were missed
or incorrectly merged (e.g., vdev_cache_lastused_compare and
metaslab_rangesize_compare).

Obtained from:	ZoL, illumos
MFC after:	25 days
X-MFC after:	r353634
2019-10-16 09:20:08 +00:00
Hans Petter Selasky
a55383e720 Fix panic in network stack due to use after free when receiving
partial fragmented packets before a network interface is detached.

When sending IPv4 or IPv6 fragmented packets and a fragment is lost
before the network device is freed, the mbuf making up the fragment
will remain in the temporary hashed fragment list and cause a panic
when it times out due to accessing a freed network interface
structure.


1) Make sure the m_pkthdr.rcvif always points to a valid network
interface. Else the rcvif field should be set to NULL.

2) Use the rcvif of the last received fragment as m_pkthdr.rcvif for
the fully defragged packet, instead of the first received fragment.

Panic backtrace for IPv6:

panic()
icmp6_reflect() # tries to access rcvif->if_afdata[AF_INET6]->xxx
icmp6_error()
frag6_freef()
frag6_slowtimo()
pfslowtimo()
softclock_call_cc()
softclock()
ithread_loop()

Reviewed by:	bz
Differential Revision:	https://reviews.freebsd.org/D19622
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2019-10-16 09:11:49 +00:00
Andriy Gapon
786c532a8f MFV r348596: 9689 zfs range lock code should not be zpl-specific
illumos/illumos-gate@7931524763

FreeBSD note: some tweaking was needed to avoid a conflict with
sys/rangelock.h.

Author:	Matthew Ahrens <mahrens@delphix.com>
Obtained from:	illumos
MFC after:	3 weeks
2019-10-16 09:04:53 +00:00
Hans Petter Selasky
758a35d0bc VLAN_TRUNKDEV() requires epochification in ibcore after r353292.
Sponsored by:	Mellanox Technologies
2019-10-16 08:56:07 +00:00
Hans Petter Selasky
052bc06c25 Replace rdma_is_upper_dev_rcu() with rdma_vlan_dev_real_dev() in ibcore.
This reduces the number of references to VLAN_TRUNKDEV() in ibcore.
Currently only VLAN is supported as a child interface in FreeBSD.
Remove superfluous RCU locking.

Sponsored by:	Mellanox Technologies
2019-10-16 08:55:29 +00:00
Hans Petter Selasky
8232fd4dcd VLAN_DEVAT() requires epochification in ipoib after r353292.
Sponsored by:	Mellanox Technologies
2019-10-16 08:40:58 +00:00
Andriy Gapon
f6a4b91c75 MFV r353628:
10842 Mutex leak in dsl_dataset_hold_obj()

illumos/illumos-gate@ad027c0ff9
ad027c0ff9

https://www.illumos.org/issues/10842
  ZoL d10b2f1d35 Mutex leak in dsl_dataset_hold_obj()

Portions contributed by: Jerry Jelinek <jerry.jelinek@joyent.com>
Author: Jorgen Lundman <lundman@lundman.net>
Obtained from:	illumos, ZoL
MFC after:	15 days
2019-10-16 07:57:58 +00:00
Konstantin Belousov
2a499f92ba Fix assert in PowerPC pmaps after introduction of object busy.
The VM_PAGE_OBJECT_BUSY_ASSERT() in pmap_enter() implementation should
be only asserted when the code is executed as result of pmap_enter(),
not when the same code is entered from e.g. pmap_enter_quick().  This
is relevant for all PowerPC pmap variants, because mmu_*_enter() is
used as the backend, and assert is located there.

Add a PowerPC private pmap_enter() PMAP_ENTER_QUICK_LOCKED flag to
indicate that the call is not from pmap_enter().  For non-quick-locked
calls, assert that the object is locked.

Reported and tested by:	bdragon
Reviewed by:	alc, bdragon, markj
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D22041
2019-10-16 07:09:15 +00:00
Andriy Gapon
0c4f60b734 MFV r353619: 9691 fat zap should prefetch when iterating
illumos/illumos-gate@52abb70e07
52abb70e07

https://www.illumos.org/issues/9691
  When iterating over a ZAP object, we're almost always certain to
  iterate over the entire object. If there are multiple leaf blocks, we
  can realize a performance win by issuing reads for all the leaf blocks
  in parallel when the iteration begins.
  For example, if we have 10,000 snapshots, "zfs destroy -nv
  pool/fs@1%9999" can take 30 minutes when the cache is cold. This
  change provides a >3x performance improvement, by issuing the reads
  for all ~64 blocks of each ZAP object in parallel.

Author: Matthew Ahrens <mahrens@delphix.com>
Obtained from:	illumos
MFC after:	2 weeks
2019-10-16 07:09:00 +00:00
Andriy Gapon
9efb961d9a MFV r353617: 9425 allow channel programs to be stopped via signals
illumos/illumos-gate@d0cb1fb926
d0cb1fb926

https://www.illumos.org/issues/9425
  Problem Statement
  ZFS Channel program scripts currently require a timeout, so that hung
  or long-running scripts return a timeout error instead of causing ZFS
  to get wedged.  This limit can currently be set up to 100 million Lua
  instructions. Even with a limit in place, it would be desirable to
  have a sys admin (support engineer) be able to cancel a script that is
  taking a long time.

  Proposed Solution
  Make it possible to abort a channel program by sending an interrupt
  signal.In the underlying txg_wait_sync function, switch the cv_wait to
  a cv_wait_sig to catch the signal. Once a signal is encountered, the
  dsl_sync_task function can install a Lua hook that will get called
  before the Lua interpreter executes a new line of code. The
  dsl_sync_task can resume with a standard txg_wait_sync call and wait
  for the txg to complete. Meanwhile, the hook will abort the script and
  indicate that the channel program was canceled. The kernel returns a
  EINTR to indicate that the channel program run was canceled.

FreeBSD note: the return value of cv_wait_sig() has inverted meaning
between us and illumos.

Author: Don Brady <don.brady@delphix.com>
Obtained from:	illumos
MFC after:	4 weeks
2019-10-16 07:00:18 +00:00
Andriy Gapon
179e6dab09 MFV r353615: 9485 Optimize possible split block search space
illumos/illumos-gate@a21fe34979
a21fe34979

https://www.illumos.org/issues/9485
  Port this commit from ZoL:
  4589f3ae4c

Author: Brian Behlendorf <behlendorf1@llnl.gov>
Obtained from:	illumos, ZoL
MFC after:	3 weeks
2019-10-16 06:43:22 +00:00
Andriy Gapon
7149963e95 MFV r353613: 10731 zfs: NULL pointer errors
FreeBSD already had these changes locally.
This commit removes a small formatting difference.

MFC after:	1 week
2019-10-16 06:38:05 +00:00
Andriy Gapon
6cb9ab2bad MFC r353611: 10330 merge recent ZoL vdev and metaslab changes
illumos/illumos-gate@a0b03b161c
a0b03b161c

https://www.illumos.org/issues/10330
  3 recent ZoL changes in the vdev and metaslab code which we can pull over:
  PR 8324 c853f382db 8324 Change target size of metaslabs from 256GB to 16GB
  PR 8290 b194fab0fb 8290 Factor metaslab_load_wait() in metaslab_load()
  PR 8286 419ba59145 8286 Update vdev_is_spacemap_addressable() for new spacemap
  encoding

Author: Serapheim Dimitropoulos <serapheimd@gmail.com>
Obtained from:	illumos, ZoL
MFC after:	2 weeks
2019-10-16 06:26:51 +00:00
Andriy Gapon
b399ca755a MFV r353608: 10165 libzpool: passing argument 1 to restrict-qualified parameter
illumos/illumos-gate@f91fcf59ac
f91fcf59ac

https://www.illumos.org/issues/10165

Author: Toomas Soome <tsoome@me.com>
MFC after:	10 days
2019-10-16 06:09:00 +00:00
Justin Hibbits
d70b36edb5 powerpc/mpc85xx: Fix function type for fsl_pcib_error_intr()
Since it's only called as an interrupt handler, fsl_pcib_eror_intr() should just
match the driver_intr_t type.

Reported by:	bdragon
2019-10-16 03:03:59 +00:00
Justin Hibbits
34ed25a82e powerpc: Add AmigaOne platform, a subclass of MPC85xx
Summary:
The AmigaOne platform, encompassing the X5000 and A1222 at this time, is
based on the mpc85xx platform, but includes some things not listed in
the device tree.  Some custom devices, like CPLD, could be added to the
device tree with an overlay, or other means.  However, some cannot
easily be done, such as the power button interrupt.

The directory will also become a location to add AmigaOne platform drivers,
such as the aforementioned CPLD, and its children.

Reviewed by:	bdragon
Differential Revision:	https://reviews.freebsd.org/D21829
2019-10-16 00:38:50 +00:00
Kristof Provost
1d95443818 Generalize ARM specific comments in devmap
The comments in devmap are very ARM specific, this generalizes them for other
architectures.

Submitted by:	Nicholas O'Brien <nickisobrien_gmail.com>
Reviewed by:	manu, philip
Sponsored by:	Axiado
Differential Revision:	https://reviews.freebsd.org/D22035
2019-10-15 23:21:52 +00:00
Eric Joyner
d61b6a41dd ixgbe: Disable EEE for backplane X550EM_X
From Zach:
Intel documentation indicates that backplane X550EM_X KR devices do not
support Energy Efficient Ethernet. Prior to this patch, X552 devices
(device ID 0x15AB) will crash the system when transitioning EEE state
via sysctl.

Signed-off-by: Zach Vargas <zvargas@xes-inc.com>

PR:		240320
Submitted by:	Zach Vargas <zvargas@xes-inc.com>
Reviewed by:	erj@
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D21673
2019-10-15 21:56:19 +00:00
Gleb Smirnoff
4b25d1f2e3 Missing from r353596. 2019-10-15 21:32:38 +00:00
Gleb Smirnoff
bac060388f When assertion for a thread not being in an epoch fails also print all
entered epochs. Works with EPOCH_TRACE only.

Reviewed by:	hselasky
Differential Revision:	https://reviews.freebsd.org/D22017
2019-10-15 21:24:25 +00:00
Edward Tomasz Napierala
014931a858 Add copyrights that I forgot to add when splitting arb.h off from tree.h.
While here clean up the RCS tags.

Suggested by:	lstewart
MFC after:	2 weeks
Sponsored by:	Klara Inc, Netflix
2019-10-15 19:44:43 +00:00
John Baldwin
65db3aa8c4 Install an ACPI PCI bus notify handler.
Rescan a PCI bus when the ACPI_NOTIFY_BUS_CHECK event is posted to a
PCI bus.

Reviewed by:	scottl
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D21948
2019-10-15 19:12:09 +00:00
John Baldwin
8c9861f575 Support hot insertion and removal of PCI devices on EC2.
Install ACPI notify handlers on PCI devices with an _EJ0 method.  This
handler is invoked when devices are added or removed.

- When an ACPI_NOTIFY_DEVICE_CHECK event posts, rescan the parent bus
  device.  Note that strictly speaking we only need to rescan the
  specified device, but BUS_RESCAN is what is available, so we rescan
  the entire bus.
- When an ACPI_NOTIFY_EJECT_REQUEST event posts, detach the device
  associated with the ACPI handle, invoke the _EJ0 method, and then
  delete the device.

Eventually this might be changed to vector notify events to devd in
userspace where devctl can be used instead to permit more complex
actions such as graceful unmounting of filesystems.

Tested by:	cperciva
Reviewed by:	cperciva, imp, scottl
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D21948
2019-10-15 19:04:39 +00:00
John Baldwin
21d3196209 Export pci_attach() and pci_detach().
Reviewed by:	imp
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D21948
2019-10-15 18:58:01 +00:00
Navdeep Parhar
693a9dfce2 cxgbe(4): An EQ update can be requested in a TX_PKTS2 work request.
MFC after:	1 week
Sponsored by:	Chelsio Communications
2019-10-15 17:35:39 +00:00
John Baldwin
a2b282151f Use -march=octeon+ for OCTEON1.
External binutils requires octeon+ for saa.

Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D22033
2019-10-15 17:28:26 +00:00
Ruslan Bukin
a53a43d988 Fix dwmmc(4) driver attachment when ext_resources are not present.
Ignore only ENOENT (no DTS properties found) and ENODEV (driver not
present) non-zero return values from ext_resources.

Reviewed by:	manu
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D22043
2019-10-15 17:24:21 +00:00
John Baldwin
6ed6d6931a Fix a write-only variable warning from external GCC.
Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D22032
2019-10-15 17:17:16 +00:00
John Baldwin
017128c99b Don't set the OUTPUT_FORMAT explicitly but let ld derive it.
This fixes an error with modern ld.bfd and is inline with the changes in
r215251 and r217612.

Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D22031
2019-10-15 17:14:30 +00:00
John Baldwin
64f1604a76 Update MIPS kernel builds to work with mips-gcc.
- Use a default -march of mips64 on N64 and N32 kernels.
- Set the endianness (via MIPS_ENDIAN) and ABI (via MIPS_ABI) in
  CFLAGS from MACHINE_ARCH.  ARCH_FLAGS now only sets a different
  -march value if needed.
- TRAMP_ARCH_FLAGS inherits MIPS_ENDIAN from MACHINE_ARCH but does
  not set the ABI since XLPN32 needs an N64 ABI for the trampoline
  loader.  When TRAMP_ARCH_FLAGS is used it must set both -march
  and -mabi.

Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D22030
2019-10-15 17:11:42 +00:00
Andriy Gapon
67f8ab8ebb fix up r353565, somehow a few files did not get committed
MFC after:	3 weeks
X-MFC with:	r353565
2019-10-15 15:52:01 +00:00
Gleb Smirnoff
237c1f932b Remove pfctlinput2(). It came from KAME and had never ever been in use. 2019-10-15 15:40:03 +00:00
Andriy Gapon
6f2721b907 MFV r353561: 10343 ZoL: Prefix all refcount functions with zfs_
illumos/illumos-gate@e914ace2e9
e914ace2e9

https://www.illumos.org/issues/10343
  On the openzfs feature/porting matrix, this is listed as:
  prefix to refcount funcs/types
  Having these changes will make it easier to share other work across the
  different ZFS operating systems.
  PR 7963 424fd7c3e Prefix all refcount functions with zfs_
  PR 7885 & 7932 c13060e47 Linux 4.19-rc3+ compat: Remove refcount_t compat
  PR 5823 & 5842 4859fe796 Linux 4.11 compat: avoid refcount_t name conflict

Author: Tim Schumacher <timschumi@gmx.de>
Obtained from:	illumos, ZoL
MFC after:	3 weeks
2019-10-15 15:09:36 +00:00