Commit Graph

132288 Commits

Author SHA1 Message Date
Emmanuel Vadot
2491b25c3a linuxkpi: Add rcu_work functions
The rcu_work function helps to queue some work after waiting for a grace
period.
This is needed by DRM drivers.

Sponsored-by: The FreeBSD Foundation
Reviewed by:	hselasky
Differential Revision:	https://reviews.freebsd.org/D24942
2020-05-21 20:18:38 +00:00
Mark Johnston
cb7c78fd28 Fix ACCEPT_FILTER_DEFINE to pass the version to MODULE_VERSION.
MFC with:	r361263
2020-05-21 18:38:41 +00:00
Brandon Bergren
e1110c4082 [PowerPC] Fix kernel boot on powerpc
Recent changes have caused the vmspace objects to start coming from KVA
instead of direct-mapped memory on powerpc. As far as I can tell, this is
not actually a problem, so we should stop arbitrarily asserting that it is.

I do not know why this was not being triggered before.

Approved by:	jhibbits
Sponsored by:	Tag1 Consulting, Inc.
2020-05-21 15:53:16 +00:00
John-Mark Gurney
80d7c14c42 Bring in support for single core Zynq devices. Turns out that real
hardware, the registers appear like there's two cores, but the second
core does not work, so base the number of cores upon the chip id.

Tested on a XC7Z007S.

also, previous commit was suppose to be D14429.

Submitted by:   Thomas Skibo
Differential Revision:  https://reviews.freebsd.org/D14429
2020-05-21 06:40:51 +00:00
John-Mark Gurney
4ee4e0cd20 minor cleanup of white space, and function name in panic...
This is a partial commit of the review.

Submitted by:   Thomas Skibo
Differential Revision:  https://reviews.freebsd.org/D23319
Reviewed by:	andrew
2020-05-21 06:17:54 +00:00
Doug Moore
c807b9ec06 For the case when RB_REMOVE requires a nontrivial search to find the
node to replace the one being removed, restructure to first remove the
replacement node and correct the parent pointers around it, and then
let the all-cases code at the end deal with the parent of the deleted
node, making it point to the replacement node. This removes one or two
conditional branches.

Reviewed by:	markj
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D24845
2020-05-21 05:34:02 +00:00
Adrian Chadd
a100c050eb [ath] Hopefully recover better-er upon RX restart on AR9380.
This is all very long-standing bug stuff that is touchy and still poorly
documented. Ok, here goes.

The basic bug:

* deleting a VAP causes the RX path (and TX path too) to be restarted
  without a full chip reset, which causes RX hangs on the AR9380 and later.
  (ie, the ones with the newer DMA engine.)

The basic fix:

* do an RX flush when stopping RX in ath_vap_delete() to match what happens
  when RX is stopped elsewhere.  This ensures any pending frames are completed
  and we restart at the right spot; it also ensures we don't push new RX buffers
  into the hardware if we're stopping receive.

The other issues I found:

* Don't bother checking the RX packet ring in the deferred read taskqueue;
  that's specifically supposed to be for completing frames rather than
  just yanking them off the receive ring.

* Cancel/drain any pending deferred read taskqueue.  This isn't done inside
  any locks so we should be super careful here.  This stops the hardware
  being reprogrammed at the same time in another thread/CPU whilst we're
  stopping RX.

* .. (yes, this should be better serialised, but that's for another day. maybe.)

* Add more debugging to trace what's going on here.

And the fun bit:

* Reinitialise the RX FIFO ONLY if we've been reset or stopped, rather than just
  reset.  I noticed that after all the above was done I was STILL seeing RXEOL.
  RXEOL isn't enabled on the AR9380 so I'd only see it if I was sending TX frames
  (ie a ping where it'd be transmitted but never received) so I was not being
  spammed by RXEOL.  So, as long as stuff is stopped, restart it.

This seems to be doing the right thing in both AP and STA modes.

What I should do next, if I ever get time:

* as I said above, serialise the receive stop/start to include taskqueues
* monitor RXEOL on the AR9380 and I keep seeing it spammed / lockups, just
  go do a full chip reset to get things back on track. It sucks, but it
  is better than nothing.

Tested:

* AR9380 AP/STA mode, adding/deleting a hostap VAP to trigger the TX/RX
  queue stop/start; whilst also running an iperf through it.  Lots of times.
  Lots.  Of.. Times.
2020-05-21 04:35:12 +00:00
Adrian Chadd
9a2de0c3d6 [ath] reset hardware if this particular mac bug is seen.
I have to dig into why I'm seeing it on chips as late as the AR9380 era
stuff (as it's marked as an AR5416 bug, but who knows!) but i'm seeing
aggregate TX frames complete with no blockack bit set.  So, everything
should be treated as a failure and do a hardware reset for good measure.

Tested:

* AR9380, STA mode
* AR9580 (5GHz), AP mode
2020-05-21 04:26:20 +00:00
Adrian Chadd
8af1445957 [ath_rate_sample] Obey the maximum frame length even when using static rates.
I wasn't enforcing the maximum packet length when using static rates
so although the driver was enforcing it itself OK, the statistics were
sometimes going into the wrong bin.

Tested:

* AR9380, STA mode
2020-05-21 03:53:45 +00:00
Justin Hibbits
b923b34a0f powerpc: Handle machine checks caused by D-ERAT multihit
Instead of crashing the user process when a D-ERAT multihit is detected, try
to flush the ERAT, and continue.  This machine check indicates a likely PMAP
invalidation shortcoming that will need to be addressed, but it's
recoverable, so just recover.  The recovery is pmap-specific to flush the
ERAT, so add a pmap function to do so, currently only implemented by the
POWER9 radix pmap.
2020-05-21 03:33:20 +00:00
Ryan Moeller
245bfd34da Deduplicate fsid comparisons
Comparing fsid_t objects requires internal knowledge of the fsid structure
and yet this is duplicated across a number of places in the code.

Simplify by creating a fsidcmp function (macro).

Reviewed by:	mjg, rmacklem
Approved by:	mav (mentor)
MFC after:	1 week
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D24749
2020-05-21 01:55:35 +00:00
John Baldwin
4f98ffdd1d Fix libstand build breakage after r361298.
- Use enc_xform_aes_xts.setkey() directly instead of duplicating the code
  now that it no longer calls malloc().
- Rather than bringing back all of xform_userland.h, add a conditional
  #include of <stand.h> to xform_enc.h.
- Update calls to encrypt/decrypt callbacks in enc_xform_aes_xts for
  separate input/output pointers.

Pointy hat to:	jhb
2020-05-20 22:25:41 +00:00
Konstantin Belousov
2c6d9dc0bb Change the samantic of struct link_map l_addr member.
It previously returned the object map base address, while all other
ELF operating systems return load offset, i.e. the difference between
map base and the link base.

Explain the meaning of the field in the man page.

Stop filling the mips-only l_offs member, which is apparently unused.

PR:	246561
Requested by:	Damjan Jovanovic <damjan.jov@gmail.com>
Reviewed by:	emaste, jhb, cem (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D24918
2020-05-20 22:08:26 +00:00
Konstantin Belousov
ea6020830c amd64: Add a knob to flush RSB on context switches if machine has SMEP.
The flush is needed to prevent cross-process ret2spec, which is not handled
on kernel entry if IBPB is enabled but SMEP is present.
While there, add i386 RSB flush.

Reported by:	Anthony Steinhauser <asteinhauser@google.com>
Reviewed by:	markj, Anthony Steinhauser
Discussed with:	philip
admbugs:	961
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2020-05-20 22:00:31 +00:00
Konstantin Belousov
36e1ad61e8 Do not consider CAP_RDCL_NO as an indicator for all MDS vulnerabilities
handled by hardware.

Reported by:	Anthony Steinhauser <asteinhauser@google.com>
admbugs:	962
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2020-05-20 21:22:25 +00:00
John Baldwin
3e9470482a Various cleanups to the software encryption transform interface.
- Consistently use 'void *' for key schedules / key contexts instead
  of a mix of 'caddr_t', 'uint8_t *', and 'void *'.

- Add a ctxsize member to enc_xform similar to what auth transforms use
  and require callers to malloc/zfree the context.  The setkey callback
  now supplies the caller-allocated context pointer and the zerokey
  callback is removed.  Callers now always use zfree() to ensure
  key contexts are zeroed.

- Consistently use C99 initializers for all statically-initialized
  instances of 'struct enc_xform'.

- Change the encrypt and decrypt functions to accept separate in and
  out buffer pointers.  Almost all of the backend crypto functions
  already supported separate input and output buffers and this makes
  it simpler to support separate buffers in OCF.

- Remove xform_userland.h shim to permit transforms to be compiled in
  userland.  Transforms no longer call malloc/free directly.

Reviewed by:	cem (earlier version)
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D24855
2020-05-20 21:21:01 +00:00
John Baldwin
2aa1dc7e3b Print CPU informtion later in boot.
Match other architectures and print CPU information during
cpu_startup().  In particular, this prints the information after the
message buffer is initialized which allows it to be retrieved after
boot via dmesg(8).

While here, add some extern declarations to <machine/md_var.h> in
place of duplicated declarations in various source files.

Reviewed by:	brooks
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D24936
2020-05-20 21:16:54 +00:00
John Baldwin
6adcdf6577 Simplify hot-patching cpu_switch() for lack of UserLocal register.
Rather than walking all of cpu_switch looking for the sequence of
instructions to patch, add a global label at the location that needs
the patch applied.

Reviewed by:	brooks, Alfredo Mazzinghi <alfredo.mazzinghi_cl.cam.ac.uk>
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D24931
2020-05-20 21:15:43 +00:00
John Baldwin
6faabe5411 Remove copyinfrom() and copyinstrfrom().
These functions were added in 2001 and are currently unused.
copyinfrom() looks to have never been used.  copyinstrfrom() was used
for two weeks before the code was refactored to remove it's sole use.

Reviewed by:	brooks, kib
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D24928
2020-05-20 20:58:17 +00:00
John Baldwin
871c7dd2aa Merge freebsd32_exec_setregs() into exec_setregs() on MIPS.
The stack pointer was being decremented by 64k twice previously.

Reviewed by:	brooks
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D24930
2020-05-20 19:51:39 +00:00
Ed Maste
697b271da9 pkgbase: use -dev,-dbg instead of -development,-debug
-development is long and awkward, and is also inconsistent with prior art
from the Linux world, which uses -dev (Debian) or -devel (Red Hat).  Follow
the Debian convention, and similarly for debug info packages.

Also remove redundant pkgbase development tag from includes.  We already tag
include files with package=runtime,dev; there is no need to separately tag
them as dev.

Discussed with:	bapt
Reviewed by:	manu
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D24139
2020-05-20 19:45:22 +00:00
Mark Johnston
66b415fb8f Don't block on the range lock in zfs_getpages().
After r358443 the vnode object lock no longer synchronizes concurrent
zfs_getpages() and zfs_write() (which must update vnode pages to
maintain coherence).  This created a potential deadlock between ZFS
range locks and VM page busy locks: a fault on a mapped file will cause
the fault page to be busied, after which zfs_getpages() locks a range
around the file offset in order to map adjacent, resident pages;
zfs_write() locks the range first, and then must busy vnode pages when
synchronizing.

Solve this by adding a non-blocking mode for ZFS range locks, and using
it in zfs_getpages().  If zfs_getpages() fails to acquire the range
lock, only the fault page will be populated.

Reported by:	bdrewery
Reviewed by:	avg
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D24839
2020-05-20 18:29:23 +00:00
Conrad Meyer
f4ce062964 vmm(4): Add 12 user ABI compat after r349948
Reported by:	kp
Reviewed by:	jhb, kp
Tested by:	kp
Differential Revision:	https://reviews.freebsd.org/D24929
2020-05-20 17:27:54 +00:00
Kristof Provost
5fcaec1a0b bnxt: isc_nrxd_max and isc_ntxd_max must be powers of two
Reviewed by:	gallatin, rpokala
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D24922
2020-05-20 16:07:37 +00:00
Li-Wen Hsu
db7ec3c3e6 Fix i386 build for r361275
kponsored by:	The FreeBSD Foundation
2020-05-20 13:51:27 +00:00
Konstantin Belousov
d0a4068359 mlx5_core: add more port module event types to decode.
Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies
MFC after:	3 days
2020-05-20 11:20:45 +00:00
Konstantin Belousov
6418350cf4 mlx5_core: add "PMD type not enabled" port module event type.
Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies
MFC after:	3 days
2020-05-20 11:10:10 +00:00
Wei Hu
a560f3ebd7 HyperV socket implementation for FreeBSD
This change adds Hyper-V socket feature in FreeBSD. New socket address
family AF_HYPERV and its kernel support are added.

Submitted by:	Wei Hu <weh@microsoft.com>
Reviewed by:	Dexuan Cui <decui@microsoft.com>
Relnotes:	yes
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D24061
2020-05-20 11:03:59 +00:00
Roger Pau Monné
b5ba8a0f32 dev/xenstore: fix return with locks held
Fix returning from xenstore device with locks held, which triggers the
following panic:

# cat /dev/xen/xenstore
^C
userret: returning with the following locks held:
exclusive sx evtchn_ringc_sx (evtchn_ringc_sx) r = 0 (0xfffff8000650be40) locked @ /usr/src/sys/dev/xen/evtchn/evtchn_dev.c:262

Note this is not a security issue since access to the device is
limited to root by default.

Sponsored by:	Citrix Systems R&D
MFC after:	1 week
2020-05-20 11:01:10 +00:00
Andriy Gapon
22d1b05c8c iwm: improve rfkill handling
Previously the driver handled the bit within itself, but did not expose
the state change to net80211 and interface layers.
This change uses net80211 KPI for rfkill signaling.
The code is modeled after similar code in iwn and wpi.

Reviewed by:	adrian
MFC after:	2 weeks
Differential Revision: https://reviews.freebsd.org/D24923
2020-05-20 08:15:09 +00:00
Justin Hibbits
baeeef1d8f powerpc/radix mmu: No need for delayed TLB invalidation
x86 needs delayed TLB invalidation because invalidation requires an
expensive IPI.  PowerPC has had a TLB invalidation instruction since the
POWER1 in 1990, so there's no need to delay anything.
2020-05-20 02:33:41 +00:00
Toomas Soome
22ed31c23f lz4 hash table does not start zeroed
illumos issue: https://www.illumos.org/issues/12757

Submitted by:	andyf
2020-05-19 19:53:12 +00:00
Mark Johnston
591b09b486 Define a module version for accept filter modules.
Otherwise accept filters compiled into the kernel do not preempt
preloaded accept filter modules.  Then, the preloaded file registers its
accept filter module before the kernel, and the kernel's attempt fails
since duplicate accept filter list entries are not permitted.  This
causes the preloaded file's module to be released, since
module_register_init() does a lookup by name, so the preloaded file is
unloaded, and the accept filter's callback points to random memory since
preload_delete_name() unmaps the file on x86 as of r336505.

Add a new ACCEPT_FILTER_DEFINE macro which wraps the accept filter and
module definitions, and ensures that a module version is defined.

PR:		245870
Reported by:	Thomas von Dein <freebsd@daemon.de>
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2020-05-19 18:35:08 +00:00
Mark Johnston
615a9fb5fe Use the symbolic name for "modmetadata_set".
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2020-05-19 18:34:50 +00:00
Navdeep Parhar
b0dede77b1 cxgbe/iw_cxgbe: Add an async callback to notify iw_cxgbe in case of a
fatal error.

Submitted by:	Krishnamraju Eraparaju @ Chelsio
MFC after:	2 weeks
Sponsored by:	Chelsio Communications
2020-05-19 16:28:20 +00:00
Andrew Turner
ad020198ff Stop performing a full icache sync when the DIC and IDC flags are set
The DIC and IDC bits in the CTR_EL0 register signal to the kernel when it
can relax the instruction cache synchronisation operations. The IDC bit
means we can relax cleaning the data cache to the point of unification
while the DIC bit means we don't need to invalidate the instruction cache
for data coherence. In both cases an appropriate barrier is still needed.

For now only implement the case where both bits are set, as is the case
on the Neoverse-N1 as used in the Amazon AWS Graviton 2 CPU. Note that
this behaviour is a optional on the N1 so we may later need to implement
only one or the other bit being set.

There is a tunable to disable each flag on boot.

Testing on a 4 core Graviton 2 instance found a significant improvement
in sys and real time when running "make buildkernel -j4", with no
significant difference in user time.

Reviewed by:	markj
Sponsored by:	Innovate UK
Differential Revision:	https://reviews.freebsd.org/D24853
2020-05-19 16:04:27 +00:00
Andrew Turner
13628ada98 Create MSI/MSI-X isrcs as needed in the GICv3 ITS driver
Previously we would create an isrc for each MSI/MSI-X interrupt. This
causes issues for other interrupt sources in the system, e.g. a GPIO
driver, as they may be unable to allocate interrupts. This works around
this by allocating the isrc only when needed.

Reported by:	alisaidi@amazon.com
Reviewed by:	mmel
Sponsored by:	Innovaate UK
Differential Revision:	https://reviews.freebsd.org/D24876
2020-05-19 15:27:20 +00:00
Takanori Watanabe
022f27959e Fix Typo in ng_hci_le_connection_complete_ep struct.
PR:	246538
Submitted by:	Marc Veldman
2020-05-19 13:58:52 +00:00
Emmanuel Vadot
eda697d2ef linuxkpi: Add irq_work.h
Since handlers are call in a thread context we can simply use a workqueue
to emulate those functions.
The DRM code was patched to do that already, having it in linuxkpi allows us
to not patch the upstream code.

Sponsored-by: The FreeBSD Foundation
Reviewed by:	hselasky
Differential Revision:	https://reviews.freebsd.org/D24859
2020-05-19 09:04:35 +00:00
Emmanuel Vadot
5e30a739c7 linuxkpi: add pci_dev_present
pci_dev_present shows if a set of pci ids are present in the system.
It just wraps pci_find_device.
Needed by DRMv5.2

Submitted by:	Austing Shafer (ashafer@badland.io)
Differential Revision:	https://reviews.freebsd.org/D24796
2020-05-19 08:44:33 +00:00
Emmanuel Vadot
ff443195bf linuxkpi: Add __init_waitqueue_head
The only difference with init_waitqueue_head is that the name and the
lock class key are provided but we don't use those so use init_waitqueue_head
directly.

Sponsored-by: The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D24861
2020-05-19 08:43:17 +00:00
Michael Tuexen
999f86d67d Replace snprintf() by SCTP_SNPRINTF() and let SCTP_SNPRINTF() map
to snprintf() on FreeBSD. This allows to check for failures of snprintf()
on platforms other than FreeBSD kernel.
2020-05-19 07:23:35 +00:00
Michael Tuexen
821bae7cf3 Revert r361209:
cem noted that on FreeBSD snprintf() can not fail and code should not
check for that.

A followup commit will replace the usage of snprintf() in the SCTP
sources with a variadic macro SCTP_SNPRINTF, which will simply map to
snprintf() on FreeBSD and do a checking similar to r361209 on
other platforms.
2020-05-19 07:21:11 +00:00
Kyle Evans
47c7d8327c zfs: reject read(2) of a dirfd with EISDIR
This is independent of the recently-discussed global change, which is still
in review/discussion stage.

This is effectively a measure for consistency in the ZFS world, where
FreeBSD was the only platform (as far as I could find) that allowed this.
What ZFS exposes is decidedly not useful for any real purposes, to
paraphrase (hopefully faithfully) jhb's findings when exploring this:

The size of a directory in ZFS is the number of directory entries within.
When reading a directory, you would instead get the leading part of its raw
contents; the amount you get being dictated by the "size," i.e. number of
directory entries. There's decidedly (luckily) no stack disclosure happening
here, though the behavior is bizarre and almost certainly a historical
accident.

This change has already been upstreamed to OpenZFS.

MFC after:	1 week
2020-05-19 02:41:05 +00:00
Justin Hibbits
1da3138f68 powerpc/mmu: Don't use the cache instructions to zero pages
A page (even physmem) can be marked as cache-inhibited.  Attempting to use
'dcbz' to zero a page mapped cache-inhibited triggers an alignment
exception, which is fatal in kernel.  This was seen when testing hardware
acceleration with X on POWER9.

At some point in the future, this should be changed to a more straight
forward zero loop instead of bzero(), and a similar change be made to the
other pmaps.

Reported by:	pkubaj@
2020-05-19 01:06:31 +00:00
Mike Karels
2fdbcbea76 Fix NULL-pointer bug from r361228.
Note that in_pcb_lport and in_pcb_lport_dest can be called with a NULL
local address for IPv6 sockets; handle it.  Found by syzkaller.

Reported by:	cem
MFC after:	1 month
2020-05-19 01:05:13 +00:00
Mike Karels
2510235150 Allow TCP to reuse local port with different destinations
Previously, tcp_connect() would bind a local port before connecting,
forcing the local port to be unique across all outgoing TCP connections
for the address family. Instead, choose a local port after selecting
the destination and the local address, requiring only that the tuple
is unique and does not match a wildcard binding.

Reviewed by:	tuexen (rscheff, rrs previous version)
MFC after:	1 month
Sponsored by:	Forcepoint LLC
Differential Revision:	https://reviews.freebsd.org/D24781
2020-05-18 22:53:12 +00:00
Michael Tuexen
6863ab0b8b Remove assignment without effect.
MFC after:	3 days
2020-05-18 19:48:38 +00:00
Michael Tuexen
9e8c9c9ef6 Don't check an unsigned variable for being negative.
MFC after:		3 days.
2020-05-18 19:35:46 +00:00
Michael Tuexen
04ce5df9c9 Remove redundant assignment.
MFC after:		3 days
2020-05-18 19:23:01 +00:00