Commit Graph

133034 Commits

Author SHA1 Message Date
Xin LI
8510f61acd sys/geom: consistently use _PATH_DEV instead of hardcoding "/dev/".
Reviewed by:	cem
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D25565
2020-07-09 02:52:39 +00:00
Emmanuel Vadot
e18c80bcb6 twsi: Fix for > Allwinner A20
Every revision of twsi after the A20 have a bug where we need to
write again the control register after each interrupts. We also need
to add some delay before writing to this register, a simple read of the
same register does the job so do that.
Also fix the case when we have finish sending all the bytes, it only worked
for 1 byte transfer (the same kind that we do for talking to the PMIC on A20
boards).
While here add more debug messages and rework some of them.

This was tested by talking to a AT23C32 eeprom and a DS3231 RTC from an
H3 and A20 board.

PR:		247576
Reported by:	Manuel Stühn (freebsd@justmail.de)
MFC after:	1 week
2020-07-08 19:14:44 +00:00
Emmanuel Vadot
9d2c88ab2a extres/syscon_generic: Make device quiet if not in boot verbose
On some boards there is a lot of of syscon node that are unused as
more specific drivers is probed before, no need to flood the console
for the mostly-unused generic ones.

MFC after:	1 week
2020-07-08 17:14:44 +00:00
Alan Somers
6f818c1fb0 geli: enable direct dispatch
geli does all of its crypto operations in a separate thread pool, so
g_eli_start, g_eli_read_done, and g_eli_write_done don't actually do very
much work. Enabling direct dispatch eliminates the g_up/g_down bottlenecks,
doubling IOPs on my system. This change does not affect the thread pool.

Reviewed by:	markj
MFC after:	2 weeks
Sponsored by:	Axcient
Differential Revision:	https://reviews.freebsd.org/D25587
2020-07-08 17:12:12 +00:00
Michael Tuexen
fcbfdc0ab6 Improve consistency.
MFC after:		1 week
2020-07-08 16:23:40 +00:00
Michael Tuexen
ef9095c72a Fix error description.
MFC after:		1 week
2020-07-08 16:04:06 +00:00
Michael Tuexen
c96d7c373e Don't accept FORWARD-TSN chunks when I-FORWARD-TSN was negotiated
and vice versa.

MFC after:		1 week
2020-07-08 15:49:30 +00:00
Michael Tuexen
32df1c9ebb Improve handling of PKTDROP chunks. This includes the input validation
to address two issues found by ossfuzz testing the userland stack:
* https://oss-fuzz.com/testcase-detail/5387560242380800
* https://oss-fuzz.com/testcase-detail/4887954068865024
and adding support for I-DATA chunks in addition to DATA chunks.
2020-07-08 12:25:19 +00:00
Takanori Watanabe
de402d6322 Add support for [read|write] supported data length commands.
Fix ng_hci_le_long_term_key_request_negative_reply_cp struct
while here.

PR:	247809
Submitted by:	Marc Veldman
2020-07-08 06:33:07 +00:00
Rick Macklem
3eaf03766e Add support for ext_pgs mbufs to nfsm_uiombuf().
This patch uses a slightly different algorithm for the non-ext_pgs case,
where a variable called "mcp" is maintained, pointing to the current
location that mbuf data can be filled into. This avoids use of
mtod(mp, char *) + mp->m_len to calculate the location, since this does
not work for ext_pgs mbufs and I think it makes the algorithm more readable.
This change should not result in semantic changes for the non-ext_pgs case.

This is another in the series of commits that add support to the NFS client
and server for building RPC messages in ext_pgs mbufs with anonymous pages.
This is useful so that the entire mbuf list does not need to be
copied before calling sosend() when NFS over TLS is enabled.

Since ND_EXTPG is never set yet, there is no semantic change at this time.
2020-07-08 02:28:08 +00:00
Scott Long
b302c2e5c9 Migrate the feature of excluding RAM pages to use "excludelist"
as its nomenclature.

MFC after:	1 week
2020-07-07 20:33:11 +00:00
Mark Johnston
04ed45b968 Rebuild sysent when capabilities.conf is updated.
Reviewed by:	brooks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25571
2020-07-07 16:35:52 +00:00
Richard Scheffenegger
c201ce0b4a Fix KASSERT during tcp_newtcpcb when low on memory
While testing with system default cc set to cubic, and
running a memory exhaustion validation, FreeBSD panics for a
missing inpcb reference / lock.

Reviewed by:	rgrimes (mentor), tuexen (mentor)
Approved by:	rgrimes (mentor), tuexen (mentor)
MFC after:	3 weeks
Sponsored by:	NetApp, Inc.
Differential Revision:	https://reviews.freebsd.org/D25583
2020-07-07 12:10:59 +00:00
Rick Macklem
022346fa62 Add support for ext_pgs mbufs to nfsrvd_rephead().
This is another in the series of commits that add support to the NFS client
and server for building RPC messages in ext_pgs mbufs with anonymous pages.
This is useful so that the entire mbuf list does not need to be
copied before calling sosend() when NFS over TLS is enabled.

Since ND_EXTPG is never set yet, there is no semantic change at this time.
2020-07-07 00:42:23 +00:00
Kristof Provost
38d715f789 riscv plic: Do not complete interrupts until the interrupt handler has run
We cannot complete the interrupt (i.e. write to the claims/complete register
until the interrupt handler has actually run. We don't run the interrupt
handler immediately from intr_isrc_dispatch(), we only schedule it for later
execution.

If we immediately complete it (i.e. before the interrupt handler proper has
run) the interrupt may be triggered again if the interrupt source remains set.
From RISC-V Instruction Set Manual: Volume II: Priviliged Architecture, 7.4
Interrupt Gateways:

"If a level-sensitive interrupt source deasserts the interrupt after the PLIC
core accepts the request and before the interrupt is serviced, the interrupt
request remains present in the IP bit of the PLIC core and will be serviced by
a handler, which will then have to determine that the interrupt device no
longer requires service."

In other words, we may receive interrupts twice.

Avoid that by postponing the completion until after the interrupt handler has
run.

If the interrupt is handled by a filter rather than by scheduling an interrupt
thread we must also complete the interrupt, so set up a post_filter handler
(which is the same as the post_ithread handler).

Reviewed by:	mhorne
Sponsored by:	Axiado
Differential Revision:	https://reviews.freebsd.org/D25531
2020-07-06 21:29:50 +00:00
Mark Johnston
26dd427800 Split nhop_ref_object().
Now nhop_ref_object() unconditionally acquires a reference, and the new
nhop_try_ref_object() uses refcount_acquire_if_not_zero() to
conditionally acquire a reference.  Since the former is cheaper, use it
when we know that the initial counter value is non-zero.  No functional
change intended.

Reviewed by:	melifaro
Differential Revision:	https://reviews.freebsd.org/D25535
2020-07-06 21:20:57 +00:00
Mitchell Horne
2192efc03b RISC-V boot1.efi and loader.efi support
This implementation doesn't have any major deviations from the other EFI
ports. I've copied the boilerplate from arm and arm64.

I've tested this with the following boot flows:
OpenSBI (M-mode) -> u-boot (S-mode) -> loader.efi -> FreeBSD
OpenSBI (M-mode) -> u-boot (S-mode) -> boot1.efi -> loader.efi -> FreeBSD

Due to the way that u-boot handles secondary CPUs, OpenSBI >= v0.7 is required,
as the HSM extension is needed to bring them up explicitly. Because of this,
using BBL as the SBI implementation will not be possible. Additionally, there
are a few recent u-boot changes that are required as well, all of which will be
present in the upcoming v2020.07 release.

Looks good:	emaste
Differential Revision:	https://reviews.freebsd.org/D25135
2020-07-06 18:19:42 +00:00
Mark Johnston
866a5d1298 Regenerate.
Sponsored by:	The FreeBSD Foundation
2020-07-06 16:34:49 +00:00
Mark Johnston
bdfe61e05e Permit cpuset_(get|set)domain() in capability mode.
These system calls already perform validation of their parameters when
called in capability mode, identical to cpuset_(get|set)affinity().

MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2020-07-06 16:34:29 +00:00
Pawel Biernacki
e94fdc3833 kern.tty_info_kstacks: set compact format as default 2020-07-06 16:34:15 +00:00
Mark Johnston
69b565d7c0 Allow accesses of the caller's CPU and domain sets in capability mode.
cpuset_(get|set)(affinity|domain)(2) permit a get or set of the calling
thread or process' CPU and domain set in capability mode, but only when
the thread or process ID is specified as -1.  Extend this to cover the
case where the ID actually matches the caller's TID or PID, since some
code, such as our pthread_attr_get_np() implementation, always provides
an explicit ID.

It was not and still is not permitted to access CPU and domain sets for
other threads in the same process when the process is in capability
mode.  This might change in the future.

Submitted by:	Greg V <greg@unrelenting.technology> (original version)
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D25552
2020-07-06 16:34:09 +00:00
Pawel Biernacki
cd1c083d80 kern.tty_info_kstacks: add a compact format
Add a more compact display format for kern.tty_info_kstacks inspired by
procstat -kk. Set it as a default one.

# sysctl kern.tty_info_kstacks=1
kern.tty_info_kstacks: 0 -> 1
# sleep 2
^T
load: 0.17  cmd: sleep 623 [nanslp] 0.72r 0.00u 0.00s 0% 2124k
#0 0xffffffff80c4443e at mi_switch+0xbe
#1 0xffffffff80c98044 at sleepq_catch_signals+0x494
#2 0xffffffff80c982c2 at sleepq_timedwait_sig+0x12
#3 0xffffffff80c43af3 at _sleep+0x193
#4 0xffffffff80c50e31 at kern_clock_nanosleep+0x1a1
#5 0xffffffff80c5119b at sys_nanosleep+0x3b
#6 0xffffffff810ffc69 at amd64_syscall+0x119
#7 0xffffffff810d5520 at fast_syscall_common+0x101
sleep: about 1 second(s) left out of the original 2
^C
# sysctl kern.tty_info_kstacks=2
kern.tty_info_kstacks: 1 -> 2
# sleep 2
^T
load: 0.24  cmd: sleep 625 [nanslp] 0.81r 0.00u 0.00s 0% 2124k
mi_switch+0xbe sleepq_catch_signals+0x494 sleepq_timedwait_sig+0x12
sleep+0x193 kern_clock_nanosleep+0x1a1 sys_nanosleep+0x3b
amd64_syscall+0x119 fast_syscall_common+0x101
sleep: about 1 second(s) left out of the original 2
^C

Suggested by:	avg
Reviewed by:	mjg
Relnotes:	yes
Sponsored by:	Mysterious Code Ltd.
Differential Revision:	https://reviews.freebsd.org/D25487
2020-07-06 16:33:28 +00:00
Mark Johnston
9eb997cb48 Lift cpuset Capsicum checks into a subroutine.
Otherwise the same checks are duplicated across four different system
call implementations, cpuset_(get|set)(affinity|domain)().  No
functional change intended.

MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2020-07-06 16:33:21 +00:00
Brandon Bergren
60185d8965 [PowerPC] XIVE dispatch tweaks
* Only read the DPCPU pointer once per xive_dispatch call.
  * Optimize HE decoding for the common cases.

Reported by:	jhibbits (in irc)
Reviewed by:	jhibbits
Sponsored by:	Tag1 Consulting, Inc.
Differential Revision:	https://reviews.freebsd.org/D25545
2020-07-06 15:15:37 +00:00
Mark Johnston
b256d25c50 iflib: Fix some nits in the rx refill code.
- Get rid of the ifl_vm_addrs array.  It is not used by any existing
  consumer, so we are just dirtying a couple of cache lines for no
  reason.
- Use uma_zalloc(fl->ifl_zone) instead of m_cljget().  Otherwise
  m_cljget() is doing unnecessary work to look up the correct zone, when
  iflib already knows what that zone is.
- ifl_gen is only used when INVARIANTS is on, so make that more clear.
- Fix some style nits and inconsistencies.

Reviewed by:	gallatin
Tested by:	pho
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25490
2020-07-06 14:52:21 +00:00
Mark Johnston
a363e1d4d0 iflib: Fix handling of mbuf cluster allocation failures.
When refilling an rx freelist, make sure we only update the hardware
producer index if at least one cluster was allocated.  Otherwise the
NIC is programmed to write a previously used cluster, typically
resulting in a use-after-free when packet data is written by the
hardware.

Also make sure that we don't update the fragment index cursor if the
last allocation attempt didn't succeed.  For at least Intel drivers,
iflib assumes that the consumer index and fragment index cursor stay in
lockstep, but this assumption was violated in the face of cluster
allocation failures.

Reported and tested by:	pho
Reviewed by:	gallatin, hselasky
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25489
2020-07-06 14:52:09 +00:00
Andrew Turner
eed8b80f64 Add a driver for bcm2838 PCI express controller
This adds support for the Broadcom bcm2711 PCI express controller, found
on the Raspberry Pi 4 (aka the bcm2838 SoC). The driver has only been
developed against the soldered-on VIA XHCI controller and not tested
with other end points.

Submitted by:	Robert Crowston <crowston_protonmail.com>
Differential Revision:	https://reviews.freebsd.org/D25068
2020-07-06 08:51:55 +00:00
Hans Petter Selasky
1866c98e64 Infiniband clients must be attached and detached in a specific order in ibcore.
Currently the linking order of the infiniband, IB, modules decide in which
order the clients are attached and detached. For example one IB client may
use resources from another IB client. This can lead to a potential deadlock
at shutdown. For example if the ipoib is unregistered after the ib_multicast
client is detached, then if ipoib is using multicast addresses a deadlock may
happen, because ib_multicast will wait for all its resources to be freed before
returning from the remove method.

Fix this by using module_xxx_order() instead of module_xxx().

Differential Revision:	https://reviews.freebsd.org/D23973
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2020-07-06 08:50:11 +00:00
Mateusz Guzik
9b0c2e5909 vfs: expand on vhold_smr comment 2020-07-06 02:00:35 +00:00
Mateusz Guzik
d363fa4127 lockf: elide avoidable locking in lf_advlockasync
While here assert on ls_threads state.
2020-07-05 23:07:54 +00:00
Rick Macklem
34fc29e0c9 Add support for ext_pgs mbufs to nfsm_strtom().
Also, add a new function nfsm_add_ext_pgs() which will either add a page
or add a new ext_pgs mbuf with a page to the mbuf list. Used by nfsm_strtom().
This is another in the series of commits that add support to the NFS client
and server for building RPC messages in ext_pgs mbufs with anonymous pages.
This is useful so that the entire mbuf list does not need to be
copied before calling sosend() when NFS over TLS is enabled.

Since ND_EXTPG is never set yet, there is no semantic change at this time.
2020-07-05 21:55:16 +00:00
Konstantin Belousov
4543c1c329 Fix typo.
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2020-07-05 20:54:01 +00:00
Hans Petter Selasky
588fbadffb Fix include file order in io.h in the LinuxKPI.
Make sure sys/types.h is included before machine/vm.h.

PR:		247775
Submitted by:	pkubaj@
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-07-05 19:38:36 +00:00
Andrew Turner
fcf7a48191 Rerun kernel ifunc resolvers after all CPUs have started
On architectures that use RELA relocations it is safe to rerun the ifunc
resolvers on after all CPUs have started, but while they are sill parked.

On arm64 with big.LITTLE this is needed as some SoCs have shipped with
different ID register values the big and little clusters meaning we were
unable to rely on the register values from the boot CPU.

Add support for rerunning the resolvers on arm64 and amd64 as these are
both RELA using architectures.

Reviewed by:	kib
Sponsored by:	Innovate UK
Differential Revision:	https://reviews.freebsd.org/D25455
2020-07-05 14:38:22 +00:00
Edward Tomasz Napierala
4d2b7be54a Fix Linux recvmsg(2) when msg_namelen returned is 0. Previously
it would fail with EINVAL, breaking some of the Python regression
tests.

While here, cap the user-controlled message length.

Note that the code doesn't seem to be copying out the new length
in either (success or failure) case. This will be addressed separately.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25392
2020-07-05 10:57:28 +00:00
Navdeep Parhar
3bbb68f0e3 cxgbe(4): Fix a bug (introduced in r362905) where some tx traffic wasn't
being reported to BPF.
2020-07-05 05:14:33 +00:00
Pawel Biernacki
bae599753b dev.ixl.<N>.debug: mark as MPSAFE
This node provides no handler, it's implicitly MPSAFE.

Reviewed by:	erj
Sponsored by:	Mysterious Code Ltd.
Differential Revision:	https://reviews.freebsd.org/D25408
2020-07-04 14:20:03 +00:00
Edward Tomasz Napierala
7fcc9f7ee5 Add /proc/sys/kernel/tainted to linprocfs(5). Helps LTP.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25556
2020-07-04 11:26:03 +00:00
Edward Tomasz Napierala
8b99a63fd8 Make linprocfs(5) create /proc/bus/pci/devices/, and linsysfs(5)
create /sys/class/power_supply/.  This silences some warnings
from biology/linux-foldingathome.

Reported by:	0mp
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25557
2020-07-04 11:22:35 +00:00
Mateusz Guzik
11c345b18f devfs: fix a vnode use-after-free in devfs_ioctl
The vnode to be replaced was read with a shared lock, meaning 2 racing threads
can find the same one.

While here clean it up a little bit.
2020-07-04 06:27:28 +00:00
Mateusz Guzik
d6d9ddd41f linux: fix ioctl performance for termios
TCGETS et al are frequently issued by Linux binaries while the previous code
avoidably ping-pongs a global sx lock and serializes on Giant.

Note that even with the fix the common case will serialize on a per-tty lock.
2020-07-04 06:25:41 +00:00
Mateusz Guzik
dc3c991598 Add char and short types to kcsan 2020-07-04 06:22:05 +00:00
Mateusz Guzik
8e5679aa10 audit: provide AUDITING_TD for !AUDIT case 2020-07-04 06:21:20 +00:00
Rick Macklem
dccb580624 Add support for ext_pgs mbufs to nfscl_reqstart() and nfsm_set().
This is another in the series of commits that add support to the NFS client
and server for building RPC messages in ext_pgs mbufs with anonymous pages.
This is useful so that the entire mbuf list does not need to be
copied before calling sosend() when NFS over TLS is enabled.

Since ND_EXTPG is never set yet, there is no semantic change at this time.
2020-07-04 03:28:13 +00:00
Conrad Meyer
c74a3041f0 Add domain policy allocation for amd64 fpu_kern_ctx
Like other types of allocation, fpu_kern_ctx are frequently allocated per-cpu.
Provide the API and sketch some example consumers.

fpu_kern_alloc_ctx_domain() preferentially allocates memory from the
provided domain, and falls back to other domains if that one is empty
(DOMAINSET_PREF(domain) policy).

Maybe it makes more sense to just shove one of these in the DPCPU area
sooner or later -- left for future work.

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D22053
2020-07-03 14:54:46 +00:00
Mateusz Guzik
58199a7052 ifdef out pg_jobc assertions added in r361967
They trigger for some people, the bug is not obvious, there are no takers
for fixing it, the issue already had to be there for years beforehand and
is low priority.
2020-07-03 09:23:11 +00:00
Alexander V. Chernikov
eddfb2e86f Fix IPv6 regression introduced by r362900.
PR:		kern/247729
2020-07-03 08:06:26 +00:00
Rick Macklem
606007409c Fix build breakage caused by r362903. Only pmap.h is needed now, but
vm_page.h and vm_pageout.h is needed later, so put them in now.

Pointy hat goes on me.
2020-07-03 05:21:05 +00:00
Navdeep Parhar
d735920d33 cxgbe(4): changes in the Tx path to help increase tx coalescing.
- Ask the firmware for the number of frames that can be stuffed in one
  work request.

- Modify mp_ring to increase the likelihood of tx coalescing when there
  are just one or two threads that are doing most of the tx.  Add teeth
  to the abdication mechanism by pushing the consumer lock into mp_ring.
  This reduces the likelihood that a consumer will get stuck with all
  the work even though it is above its budget.

- Add support for coalesced tx WR to the VF driver.  This, with the
  changes above, results in a 7x improvement in the tx pps of the VF
  driver for some common cases.  The firmware vets the L2 headers
  submitted by the VF driver and it's a big win if the checks are
  performed for a batch of packets and not each one individually.

Reviewed by:	jhb@
MFC after:	2 weeks
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25454
2020-07-03 04:44:23 +00:00
Rick Macklem
2da1527844 Add support for ext_pgs mbufs to nfsm_build().
This is the first of a series of commits that add support to the NFS client
and server for building RPC messages in ext_pgs mbufs with anonymous pages.
This is useful so that the entire mbuf list does not need to be
copied before calling sosend() when NFS over TLS is enabled.

Since ND_EXTPG is never set yet, there is no semantic change at this time.
2020-07-03 01:19:29 +00:00
Alexander V. Chernikov
6ad7446c6f Complete conversions from fib<4|6>_lookup_nh_<basic|ext> to fib<4|6>_lookup().
fib[46]_lookup_nh_ represents pre-epoch generation of fib api, providing less guarantees
 over pointer validness and requiring on-stack data copying.

With no callers remaining, remove fib[46]_lookup_nh_ functions.

Submitted by:	Neel Chauhan <neel AT neelc DOT org>
Differential Revision:	https://reviews.freebsd.org/D25445
2020-07-02 21:04:08 +00:00
Alan Somers
f60b4812d8 Fix page fault in zfsctl_snapdir_getattr
Must acquire the z_teardown_lock before accessing the zfsvfs_t object. I
can't reproduce this panic on demand, but this looks like the correct
solution.

PR:		247668
Reviewed by:	avg
MFC after:	2 weeks
Sponsored by:	Axcient
Differential Revision:	https://reviews.freebsd.org/D25543
2020-07-02 13:17:31 +00:00
Mateusz Guzik
a2de789ebb cred: add a prediction to crfree for td->td_realucred == cr
This matches crhold and eliminates an assembly maze in the common case.
2020-07-02 12:58:07 +00:00
Mateusz Guzik
d23850207b cache: add missing call to cache_ncp_invalid for negative hits
Note the dtrace probe can fire even the entry is gone, but I don't think that's
worth fixing.
2020-07-02 12:56:20 +00:00
Mateusz Guzik
d129e0eba0 cache: fix misplaced fence in cache_ncp_invalidate
The intent was to mark the entry as invalid before cache_zap starts messing
with it.

While here add some comments.
2020-07-02 12:54:50 +00:00
Konstantin Belousov
92d8df2f37 mlx5_core: remove unneccessary LFENCE instruction.
Use fence instead of barrier, which is optimized to take advantage of
the x86 TSO memory model.

Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2020-07-02 10:44:45 +00:00
Konstantin Belousov
f334f212d9 linuxkpi: improvements for linux_pid_task() and linux_get_pid_task().
Unify functions bodies.
Do not call tdfind() if pid is passed, and do not call pfind() if tid
is supplied.

Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D25534
2020-07-02 10:42:58 +00:00
Konstantin Belousov
4bc5ce2c74 Use tdfind() in pget().
Reviewed by:	jhb, hselasky
Sponsored by:	Mellanox Technologies
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D25532
2020-07-02 10:40:47 +00:00
Kristof Provost
b865714d95 riscv pmap: zero reserved pte bits in ppn
The top 10 bits of a pte are reserved by specification[1] and are not part of
the PPN.

[1] 'Volume II: RISC-V Privileged Architectures V20190608-Priv-MSU-Ratified',
'4.4.1 Addressing and Memory Protection', page 72: "The PTE format for Sv39 is
shown in Figure 4.18. ... Bits 63–54 are reserved for future use and must be
zeroed by software for forward compatibility."

Submitted by:	Nathaniel Filardo <nwf20@cl.cam.ac.uk>
Reviewed by:	kp, mhorne
Differential Revision:	https://reviews.freebsd.org/D25523
2020-07-01 19:15:43 +00:00
Kristof Provost
6f11e59d72 riscv locore.S: load constant prior to loop
A very minor micro-optimization; t0 is not clobbered between the loop top and
bottom and there appear to be no other branches to this label.

Submitted by:	Nathaniel Filardo <nwf20@cl.cam.ac.uk>
Reviewed by:	mhorne
Differential Revision:	https://reviews.freebsd.org/D25524
2020-07-01 19:12:47 +00:00
Kristof Provost
d53a2816c7 riscv: Log missing registers in dump_regs()
If we panic we dump the registers for debugging. This is very useful, but it
missed several registers (ra, sp, gp and tp).

Log these as well. Especially the return address value is extremely useful.

Sponsored by:	Axiado
2020-07-01 19:11:02 +00:00
Michael Tuexen
e54b7cd007 Fix the cleanup handling in a error path for TCP BBR.
Reported by:		syzbot+df7899c55c4cc52f5447@syzkaller.appspotmail.com
Reviewed by:		rscheff
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D25486
2020-07-01 17:17:06 +00:00
Andrew Turner
e4fc3b653a Read the CPU 0 arm64 ID registers early in initarm
We also update the kernel view early in the boot. This will allow the
use of the common kernel view in ifunc resolvers.

Sponsored by:	Innovate UK
2020-07-01 16:57:57 +00:00
Andrew Turner
eeada9221b Move ID reading signatures to a better header
The functions to read the common user and kernel ID registers should be
in cpu.h rather than undefined.h as they are related to CPU details and
used by undefined instruction handlers.

Sponsored by:	Innovate UK
2020-07-01 16:17:51 +00:00
Mark Johnston
d16a2e4784 Fix a possible next-hop refcount leak when handling IPSec traffic.
It may be possible to fix this by deferring the lookup, but let's
keep the initial change simple to make MFCs easier.

PR:		246951
Reviewed by:	melifaro
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25519
2020-07-01 15:42:48 +00:00
Andrew Turner
9eb07d5627 Read the arm64 ID registers earlier in the boot process.
Also move parsing the registers to just after the secondary CPUs have
started. This means the kernel register view from all CPUs is available
after the CPU SYSINITs have finished, e.g. for use by ifunc resolvers.

Sponsored by:	Innovate UK
Differential Revision:	https://reviews.freebsd.org/D25505
2020-07-01 15:17:45 +00:00
Andrew Turner
ecc8ccb441 Simplify the flow when getting/setting an isrc
Rather than unlocking and returning we can just perform the needed action
only when the interrupt source is valid and reuse the unlock in both the
valid irq and invalid irq cases.

Sponsored by:	Innovate UK
2020-07-01 12:07:28 +00:00
Edward Tomasz Napierala
6d76adbb6d Rework linux accept(2). This makes the code flow easier to follow,
and fixes a bug where calling accept(2) could result in closing fd 0.

Note that the code still contains a number of problems: it makes
assumptions about l_sockaddr_in being the same as sockaddr_in,
the EFAULT-related code looks like it doesn't work at all, and the
socket type check is racy.  Those will be addressed later on;
I'm trying to work in small steps to avoid breaking one thing while
fixing another.

It fixes Redis, among other things.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25461
2020-07-01 10:37:08 +00:00
Hans Petter Selasky
9a4e535b39 The "pid" field in the LinuxKPI task struct is typically set to the thread ID
and not the process ID. Make sure the linux_task_exiting() function uses tdfind()
to lookup the BSD procedure structure pointer by the "pid" field, and only
fallback to pfind() when no match is found! This makes linux_task_exiting()
in line with the rest of the code.

Differential Revision: https://reviews.freebsd.org/D25509
Submitted by:	Greg V <greg@unrelenting.technology>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-07-01 08:23:57 +00:00
Mateusz Guzik
5d1c042d32 cache: lockless forward lookup with smr
This eliminates the need to take bucket locks in the common case.

Concurrent lookup utilizng the same vnodes is still bottlenecked on referencing
and locking path components, this will be taken care of separately.

Reviewed by:	kib
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D23913
2020-07-01 05:59:08 +00:00
Mateusz Guzik
f8022be3e6 vfs: protect vnodes with smr
vget_prep_smr and vhold_smr can be used to ref a vnode while within vfs_smr
section, allowing consumers to get away without locking.

See vhold_smr and vdropl for comments explaining caveats.

Reviewed by:	kib
Testec by:	pho
Differential Revision:	https://reviews.freebsd.org/D23913
2020-07-01 05:56:29 +00:00
Takanori Watanabe
263a104f43 Allow some Bluetooth LE related HCI request to non-root user.
PR:	247588
Reported by:	Greg V (greg@unrelenting.technology)
Reviewed by:	emax
Differential Revision:	https://reviews.freebsd.org/D25516
2020-07-01 04:00:54 +00:00
Ryan Moeller
e5539fb618 libifconfig: Add function to get bridge status
The new function operates similarly to ifconfig_lagg_get_lagg_status and
likewise is accompanied by a function to free the bridge status data structure.

I have included in this patch the relocation of some strings describing STP
parameters and the PV2ID macro from ifconfig into net/if_bridgevar.h as they
are useful for consumers of libifconfig.

Reviewed by:	kp, melifaro, mmacy
Approved by:	mmacy (mentor)
MFC after:	1 week
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D25460
2020-07-01 02:32:41 +00:00
Conrad Meyer
64612d4e44 geom(4): Kill GEOM_PART_EBR_COMPAT option
Take advantage of Warner's nice new real GEOM aliasing system and use it for
aliased partition names that actually work.

Our canonical EBR partition name is the weird, not-default-on-x86-prior-to-
this-revision "da1p4+00001234."  However, if compatibility mode (tunable
kern.geom.part.ebr.compat_aliases) is enabled (1, default), we continue to
provide the alias names like "da1p5" in addition to the weird canonical
names.

Naming partition providers was just one aspect of the COMPAT knob; in
addition it limited mutability, in part because it did not preserve existing
EBR header content aside from that of LBA 0.  This change saves the EBR
header for LBA 0, as well as for every EBR partition encountered.  That way,
when we write out the EBR partition table on modification, we can restore
any bootloader or other metadata in both LBA0 (the first data-containing EBR
may start after 0) as well as every logical EBR we read from the disk, and
only update the geometry metadata and linked list pointers that describe the
actual partitioning.

(This change does not add support for the 'bootcode' verb to EBR.)

PR:		232463
Reported by:	Manish Jain <bourne.identity AT hotmail.com>
Discussed with:	ae (no objection)
Relnotes:	maybe
Differential Revision:	https://reviews.freebsd.org/D24939
2020-07-01 02:16:36 +00:00
Oleksandr Tymoshenko
94bc2117b4 Add i.MX 8M Quad support
- Add CCM driver and clocks implementations for i.MX 8M
- Add GPC driver for iMX8
- Add clock tree for i.MX 8M Quad
- Add clocks support and new compat strings (where required) for existing i.MX 6 UART, I2C, and GPIO drivers
- Enable aarch64-compatible drivers form i.MX 6 in arm64 GENERIC kernel config
- Add dtb/imx8 kernel module with DTBs for Nitrogen8M and iMX8MQ EVK

With this patch both Nitrogen8M and iMX8MQ EVK boot with NFS root up to multiuser login prompt

Reviewed by:	manu
Differential Revision:	https://reviews.freebsd.org/D25274
2020-07-01 00:33:16 +00:00
Adrian Chadd
39ca7ca568 [net80211] Commit files missing in the previous commit
These belong to my previous commit, but apparently I typed ieee80211_vhf.[ch]
and forgot ht.h.  Le oops.
2020-07-01 00:24:55 +00:00
Adrian Chadd
f1481c8d3b [net80211] Migrate HT/legacy protection mode and preamble calculation to per-VAP flags
The later firmware devices (including iwn!) support multiple configuration
contexts for a lot of things, leaving it up to the firmware to decide
which channel and vap is active.  This allows for things like off-channel
p2p sta/ap operation and other weird things.

However, net80211 is still focused on a "net80211 drives all" when it comes to driving
the NIC, and as part of this history a lot of these options are global and not per-VAP.
This is fine when net80211 drives things and all VAPs share a single channel - these
parameters importantly really reflect the state of the channel! - but it will increasingly
be not fine when we start supporting more weird configurations and more recent NICs.
Yeah, recent like iwn/iwm.

Anyway - so, migrate all of the HT protection, legacy protection and preamble
stuff to be per-VAP.  The global flags are still there; they're now calculated
in a deferred taskqueue that mirrors the old behaviour.  Firmware based drivers
which have per-VAP configuration of these parameters can now just listen to the
per-VAP options.

What do I mean by per-channel? Well, the above configuration parameters really
are about interoperation with other devices on the same channel. Eg, HT protection
mode will flip to legacy/mixed if it hears ANY BSS that supports non-HT stations or
indicates it has non-HT stations associated.  So, these flags really should be
per-channel rather than per-VAP, and then for things like "do i need short preamble
or long preamble?" turn into a "do I need it for this current operating channel".
Then any VAP using it can query the channel that it's on, reflecting the real
required state.

This patch does none of the above paragraph just yet.

I'm also cheating a bit - I'm currently not using separate taskqueues for
the beacon updates and the per-VAP configuration updates.  I can always further
split it later if I need to but I didn't think it was SUPER important here.

So:

* Create vap taskqueue entries for ERP/protection, HT protection and short/long
  preamble;
* Migrate the HT station count, short/long slot station count, etc - into per-VAP
  variables rather than global;
* Fix a bug with my WME work from a while ago which made it per-VAP - do the WME
  beacon update /after/ the WME update taskqueue runs, not before;
* Any time the HT protmode configuration changes or the ERP protection mode
  config changes - schedule the task, which will call the driver without the
  net80211 lock held and all correctly serialised;
* Use the global flags for beacon IEs and VAP flags for probe responses and
  other IE situations.

The primary consumer of this is ath10k.  iwn could use it when sending RXON,
but we don't support IBSS or AP modes on it yet, and I'm not yet sure whether
it's required in STA mode (ie whether the firmware parses beacons to change
protection mode or whether we need to.)

Tested:

* AR9280, STA/AP
* AR9380, DWDS STA+STA/AP
* ath10k work, STA/AP
* Intel 6235, STA
* Various rtwn / run NICs, DWDS STA and STA configurations
2020-07-01 00:23:49 +00:00
Mark Johnston
7290cb47fc Convert cryptostats to a counter_u64 array.
The global counters were not SMP-friendly.  Use per-CPU counters
instead.

Reviewed by:	jhb
Sponsored by:	Rubicon Communications, LLC (Netgate)
Differential Revision:	https://reviews.freebsd.org/D25466
2020-06-30 22:01:21 +00:00
Michael Tuexen
7a3f60e7f5 Fix a bug introduced in https://svnweb.freebsd.org/changeset/base/362173
Reported by:		syzbot+f3a6fccfa6ae9d3ded29@syzkaller.appspotmail.com
MFC after:		1 week
2020-06-30 21:50:05 +00:00
Edward Tomasz Napierala
c2da36fecd Make linprocfs(5) create the /proc/<PID>/task/ directores.
This is to silence down some Chromium assertions.

PR:		kern/240991
Analyzed by:	Alex S <iwtcex@gmail.com>
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25256
2020-06-30 16:24:28 +00:00
Edward Tomasz Napierala
9bc42c18cb Make linux(4) ignore SA_INTERRUPT. The zsh(1) binary from Bionic uses it.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25499
2020-06-30 16:18:09 +00:00
Andrew Turner
518da7ace8 Add dwc_otg_acpi
Create an acpi attachment for the DWC USB OTG device. This is present in
the Raspberry Pi 4 in the USB-C port normally used to power the board. Some
firmware presents the kernel with ACPI tables rather than FDT so we need
an ACPI attachment.

Submitted by:	Greg V <greg_unrelenting.technology>
Approved by:	hselasky (removal of All rights reserved)
Differential Revision:	https://reviews.freebsd.org/D25203
2020-06-30 15:58:29 +00:00
Mark Johnston
a5ae70f5a0 Remove unused 32-bit compatibility structures from cryptodev.
The counters are exported by a sysctl and have the same width on all
platforms anyway.

Reviewed by:	cem, delphij, jhb
Sponsored by:	Rubicon Communications, LLC (Netgate)
Differential Revision:	https://reviews.freebsd.org/D25465
2020-06-30 15:57:11 +00:00
Mark Johnston
a5c053f5a7 Remove CRYPTO_TIMING.
It was added a very long time ago.  It is single-threaded, so only
really useful for basic measurements, and in the meantime we've gotten
some more sophisticated profiling tools.

Reviewed by:	cem, delphij, jhb
Sponsored by:	Rubicon Communications, LLC (Netgate)
Differential Revision:	https://reviews.freebsd.org/D25464
2020-06-30 15:56:54 +00:00
Hans Petter Selasky
d326a6c7c1 Document the is_signed(), type_max() and type_min() function macros in the
LinuxKPI. Try to make the function argument more readable.

Suggested by:	several
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-06-30 08:41:33 +00:00
Andrew Gallatin
46cac10b3b Fix a panic when unloading firmware
LIST_FOREACH_SAFE() is not safe in the presence
of other threads removing list entries when a
mutex is released.

This is not in the critical path, so just restart
the scan each time we drop the lock, rather than
using a marker.

Reviewed by:	jhb, markj
Sponsored by:	Netflix
2020-06-29 21:35:50 +00:00
Kyle Evans
97ce5033a8 linux: reposition the comment for bsd_to_linux_bits/linux_to_bsd_bits
rpokala notes that splitting the definitions like this is kind of silly,
since the comment applies to both.  Move the comment up (or the definition
down, depending on your perspective on life) accordingly.

Reported by:	rpokala
2020-06-29 17:47:00 +00:00
Conrad Meyer
8a64110e43 vm: Add missing WITNESS warnings for M_WAITOK allocation
vm_map_clip_{end,start} and lookup_clip_start allocate memory M_WAITOK
for !system_map vm_maps.  Add WITNESS warning annotation for !system_map
callers who may be holding non-sleepable locks.

Reviewed by:	markj
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D25283
2020-06-29 16:54:00 +00:00
Hans Petter Selasky
d0eed838e3 Implement is_signed(), type_max() and type_min() function macros in the
LinuxKPI.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-06-29 13:08:40 +00:00
Ruslan Bukin
1ea7952510 Coresight: provide device_attach method for FDT bus.
Sponsored by:	DARPA, AFRL
2020-06-29 12:59:09 +00:00
Andrew Turner
b639b3b195 Fix the spelling of identify in the arm64 identcpu code
Sponsored by:	Innovate UK
2020-06-29 09:37:07 +00:00
Andrew Turner
45e999d918 Create a kernel arm64 ID register view
In preparation for using ifuncs in the kernel is is useful to have a common
view of the arm64 ID registers across all CPUs. Add this and extract the
logic for finding the lower value of two fields to a new helper function.

Sponsored by:	Innovate UK
Differential Revision:	https://reviews.freebsd.org/D25463
2020-06-29 09:08:36 +00:00
Kyle Evans
5403f186a7 linuxolator: implement memfd_create syscall
This effectively mirrors our libc implementation, but with minor fudging --
name needs to be copied in from userspace, so we just copy it straight into
stack-allocated memfd_name into the correct position rather than allocating
memory that needs to be cleaned up.

The sealing-related fcntl(2) commands, F_GET_SEALS and F_ADD_SEALS, have
also been implemented now that we support them.

Note that this implementation is still not quite at feature parity w.r.t.
the actual Linux version; some caveats, from my foggy memory:

- Need to implement SHM_GROW_ON_WRITE, default for memfd (in progress)
- LTP wants the memfd name exposed to fdescfs
- Linux allows open() of an fdescfs fd with O_TRUNC to truncate after dup.
  (?)

Interested parties can install and run LTP from ports (devel/linux-ltp) to
confirm any fixes.

PR:		240874
Reviewed by:	kib, trasz
Differential Revision:	https://reviews.freebsd.org/D21845
2020-06-29 03:09:14 +00:00
Mark Johnston
8c277118d8 Fix UMA's first-touch policy on systems with empty domains.
Suppose a thread is running on a CPU in a NUMA domain with no physical
RAM.  When an item is freed to a first-touch zone, it ends up in the
cross-domain bucket.  When the bucket is full, it gets placed in another
domain's bucket queue.  However, when allocating an item, UMA will
always go to the keg upon a per-CPU cache miss because the empty
domain's bucket queue will always be empty.  This means that a non-empty
domain's bucket queues can grow very rapidly on such systems.  For
example, it can easily cause mbuf allocation failures when the zone
limit is reached.

Change cache_alloc() to follow a round-robin policy when running on an
empty domain.

Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25355
2020-06-28 21:35:04 +00:00
Mark Johnston
3507b8d467 Remove some redundant assignments and computations.
Reported by:	alc
Reviewed by:	alc, kib
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D25400
2020-06-28 21:34:38 +00:00
Oleksandr Tymoshenko
4c95d46303 Configure rx_delay/tx_delay values for RK3399/RK3328 GMAC
For 1000Mb mode to work reliably TX/RX delays need to be configured
between the TX/RX clock and the respective signals on the PHY
to compensate for differing trace lengths on the PCB.

Reviewed by:	manu
MFC after:	1 week
2020-06-28 21:11:10 +00:00
Edward Tomasz Napierala
4fe5361cbe Make linux(4) support SO_PROTOCOL. Running Python test suite
with python3.8 from Focal triggers those.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25491
2020-06-28 18:56:32 +00:00
Andrew Turner
23e42a83c1 Use EFI memory map to determine attributes for Acpi mappings on arm64.
AcpiOsMapMemory is used for device memory when e.g. an _INI method wants
to access physical memory, however, aarch64 pmap_mapbios is hardcoded to
writeback. Search for the correct memory type to use in pmap_mapbios.

Submitted by:	Greg V <greg_unrelenting.technology>
Differential Revision:	https://reviews.freebsd.org/D25201
2020-06-28 15:03:07 +00:00
Michael Tuexen
e99ce3eac5 Don't send packets containing ERROR chunks in response to unknown
chunks when being in a state where the verification tag to be used
is not known yet.

MFC after:		1 week
2020-06-28 14:11:36 +00:00
Michael Tuexen
f2f66ef6d2 Don't check ch for not being NULL, since that is true.
MFC after:		1 week
2020-06-28 11:12:03 +00:00
Konstantin Belousov
557905569d amd64 pmap: explain ptepindex.
Reviewed by:	markj
Discussed with:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D25187
2020-06-27 19:29:07 +00:00
Edward Tomasz Napierala
d5629eb216 Make linux(4) warn about unsupported SA_ flags.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25453
2020-06-27 15:50:35 +00:00
Edward Tomasz Napierala
a39cdcd7e7 Regen.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2020-06-27 14:43:29 +00:00
Edward Tomasz Napierala
308e194cbf Add proper types for linux message queue syscalls; mostly taken
from 32-bit Linuxulator.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25386
2020-06-27 14:42:08 +00:00
Edward Tomasz Napierala
36507f85dc Add syscall definitions for linux xattr syscalls.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25387
2020-06-27 14:39:44 +00:00
Edward Tomasz Napierala
8036e7876d Adjust types of linuxulator syscalls, to match include/linux/syscalls.h
in vanilla Linux git tree.

Reviewed by:	markj
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25385
2020-06-27 14:37:36 +00:00
Li-Wen Hsu
18db3c616f rtwn: Add a USB ID for Buffalo WI-U2-433DHP
PR:		247573
Submitted by:	HATANO Tomomi <hatanou@infolab.ne.jp>
MFC after:	1 week
2020-06-27 07:34:15 +00:00
Adrian Chadd
b5e7ee4718 [ath_hal] Add KeyMiss for AR5212/AR5416 series chips.
This is a flag from the MAC that says the received packet didn't match
a keycache slot.  This isn't technically a problem as WEP keys don't
match keycache slots (they're "global" keys), but it could be useful
for tracking down CCMP decryption failures.

Right now it's a no-op - it mirrors what the AR9300 HAL does and it
just increments a counter.  But, hey, maybe one day I'll use it for
diagnosing keycache/CCMP decrypt issues.
2020-06-27 02:59:51 +00:00
Konstantin Belousov
ee06cffcd2 vm_page_free_prep(): correct description of the required page and object state.
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D25482
2020-06-27 02:31:39 +00:00
Matt Macy
4dc16f4391 Fix "current" variable name conflict with openzfs
The variable "current" is an alias for curthread
in openzfs. Rename all variable uses of current
in dtrace.c to curstate.
2020-06-27 00:57:48 +00:00
Matt Macy
56e5ad5ff7 Rename nvpair.c to bsd_nvpair.c to not conflict with openzfs' version. 2020-06-27 00:55:03 +00:00
Alexander Motin
4cee4598e7 Add mostly dummy hw.pci.enable_aspm tunable.
The only thing this tunable enables now is reporting to ACPI _OSC that
Active State Power Management and Clock Power Management Capability are
"supported" by the OS.

I've found that at least some Supermicro server boards do not allow OS
to support native PCIe hot-plug unless it reports those capabilities.
After spending significant time in PCIe specs I have found very little
motivation for that, and none of it applies to those motherboards, not
enabling ASPM themselves.  So unless OS explicitly wants to save power,
I see nothing for it to do there actually.

I guess it may get sense to support ASPM when we get Thunderbolt support.
Otherwise I have no system with PCIe hot-plug where power saving matters.

It would be nice to enable this by default, but I worry that it affect
power saving of some laptops, even though I haven't noticed that myself.
2020-06-26 19:55:11 +00:00
Andriy Gapon
4302208388 sound/hda: fix interrupt handler endless loop after r362294
Not all interrupt sources that affect CIS bit were acknowledged.
Specifically, bits in STATESTS (aka WAKESTS) were left set.

The fix is to disable WAKEEN and clear STATESTS bits before the HDA
interrupt is enabled.  This way we should never get any STATESTS bits.

I also added placeholders for all event bits that we currently do not
enable, do not handle and do not clear.  This might get useful when / if
we enable any of them.

Reported by:	kib (Apollo Lake hardware)
Tested by:	kib (earlier, different change)
MFC after:	2 weeks
X-MFC with:	r362294
2020-06-26 09:46:03 +00:00
Andriy Gapon
8bf2c3c9f6 ena: fix module build after r362530
Somehow I missed the makefile when moving the change from phabricator to
svn.

MFC after:	1 week
X-MFC with:	r362530
2020-06-26 09:32:57 +00:00
Rick Macklem
db4b8d7e0d Bump the version since r362639 changed the internal API between the NFS
kernel modules so they must all be rebuilt.
2020-06-26 03:14:30 +00:00
Rick Macklem
4476c1def0 Add a boolean argument to nfscl_reqstart() to indicate that ext_pgs mbufs
should be used.

For KERN_TLS (and possibly some other future network interface) the mbuf
list passed into sosend() must be ext_pgs mbufs. The krpc could simply
copy all the mbuf data into ext_pgs mbufs before calling sosend(), but
that would be inefficient for large RPC messages.
This patch adds an argument to nfscl_reqstart() to indicate that it should
fill the RPC message into ext_pgs mbufs.
It also adds fields to "struct nfsrv_descript" needed for building NFS RPC
messages in ext_pgs mbufs, along with new flags for this.

Since the argument is always "false", this commit should not result in any
semantic change. However, this commit prepares the code
for future commits that will add support for building of NFS RPC messages
in ext_pgs mbufs.
2020-06-26 03:11:54 +00:00
John Baldwin
94578db218 Reduce contention on per-adapter lock.
- Move temporary sglists into the session structure and protect them
  with a per-session lock instead of a per-adapter lock.

- Retire an unused session field, and move a debugging field under
  INVARIANTS to avoid using the session lock for completion handling
  when INVARIANTS isn't enabled.

- Use counter_u64 for per-adapter statistics.

Note that this helps for cases where multiple sessions are used
(e.g. multiple IPsec SAs or multiple KTLS connections).  It does not
help for workloads that use a single session (e.g. a single GELI
volume).

Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25457
2020-06-26 00:01:31 +00:00
John Baldwin
dae61c9d09 Simplify IPsec transform-specific teardown.
- Rename from the teardown callback from 'zeroize' to 'cleanup' since
  this no longer zeroes keys.

- Change the callback return type to void.  Nothing checked the return
  value and it was always zero.

- Don't have esp call into ah since it no longer needs to depend on
  this to clear the auth key.  Instead, both are now private and
  self-contained.

Reviewed by:	delphij
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25443
2020-06-25 23:59:16 +00:00
John Baldwin
f82eb2a6f0 Enter and exit the network epoch for async IPsec callbacks.
When an IPsec packet has been encrypted or decrypted, the next step in
the packet's traversal through the network stack is invoked from a
crypto worker thread, not from the original calling thread.  These
threads need to enter the network epoch before passing packets down to
IP output routines or up to transport protocols.

Reviewed by:	ae
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25444
2020-06-25 23:57:30 +00:00
Vincenzo Maffione
9503233f87 iflib: fix compilation issue introduced in r362621
The ifp local variable is useful even without netmap
and altq, as it is used to check for IFF_DRV_RUNNING.

MFC after:	2 weeks
2020-06-25 20:43:21 +00:00
John Baldwin
20869b25cc Use zfree() to explicitly zero IPsec keys.
Reviewed by:	delphij
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25442
2020-06-25 20:31:06 +00:00
Mark Johnston
f4134e3d87 Implement an approximation of Linux MADV_DONTNEED semantics.
Linux MADV_DONTNEED is not advisory: it has side effects for anonymous
memory, and some system software depends on that.  In particular,
MADV_DONTNEED causes anonymous pages to be discarded.  If the mapping is
a private mapping of a named object then subsequent faults are to
repopulate the range from that object, otherwise pages will be
zero-filled.  For mappings of non-anonymous objects, Linux MADV_DONTNEED
can be implemented in the same way as our MADV_DONTNEED.

This implementation differs from Linux semantics in its handling of
private mappings, inherited through fork(), of non-anonymous objects.
After applying MADV_DONTNEED, subsequent faults will repopulate the
mapping from the parent object rather than the root of the shadow chain.

PR:		230160
Reviewed by:	alc, kib
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D25330
2020-06-25 20:30:30 +00:00
Alexander Motin
701267ad19 Fix few panics on NVMe's timing out initialization requests.
MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2020-06-25 20:29:29 +00:00
John Baldwin
6572e5ff66 Use explicit_bzero() instead of bzero() for sensitive data.
Reviewed by:	delphij
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25441
2020-06-25 20:25:35 +00:00
John Baldwin
9b6dc28176 Explicitly zero the temporary auth context used to generate HMAC state.
Reviewed by:	delphij
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25439
2020-06-25 20:22:44 +00:00
John Baldwin
347c369294 Explicitly zero hash results and context in glxsb_authcompute().
Reviewed by:	delphij
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25438
2020-06-25 20:21:34 +00:00
John Baldwin
b172f23dd7 Use zfree() instead of bzero() and free().
These bzero's should have been explicit_bzero's.

Reviewed by:	cem, delphij
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25437
2020-06-25 20:20:22 +00:00
John Baldwin
17a831ea25 Zero the temporary HMAC key in hmac_init_pad().
Reviewed by:	delphij
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25436
2020-06-25 20:18:55 +00:00
John Baldwin
4a711b8d04 Use zfree() instead of explicit_bzero() and free().
In addition to reducing lines of code, this also ensures that the full
allocation is always zeroed avoiding possible bugs with incorrect
lengths passed to explicit_bzero().

Suggested by:	cem
Reviewed by:	cem, delphij
Approved by:	csprng (cem)
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25435
2020-06-25 20:17:34 +00:00
Vincenzo Maffione
d8b2d26b15 iflib: netmap: add support for partial ring openings
Reviewed by:	gallatin
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D25254
2020-06-25 19:44:24 +00:00
Vincenzo Maffione
88a688663a iflib: netmap: add per-tx-queue netmap support
Reviewed by:	gallatin
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D25253
2020-06-25 19:35:43 +00:00
Mark Johnston
90297b6471 Add SCTP_SUPPORT to the default kernel options.
Otherwise out-of-tree module builds will be broken for a lack of a
definition of MK_SCTP_SUPPORT.

Reported by:	Michael Butler <imb@protected-networks.net>
MFC with:	r362614
Sponsored by:	The FreeBSD Foundation
2020-06-25 19:12:27 +00:00
Doug Moore
3a509754de Eliminate the color field from the RB element struct. Identify the
color of a node (or, really, the color of the link from the parent to
the node) by using one of the last two bits of the parent pointer in
that parent node. Adjust rebalancing methods to account for where
colors are stored, and the fact that null children have a color too.

Adjust RB_PARENT and RB_SET_PARENT to account for this change.

Reviewed by:	markj
Tested by:	pho, hselasky
Differential Revision:	https://reviews.freebsd.org/D25418
2020-06-25 17:44:14 +00:00
Navdeep Parhar
7c228be30b cxgbe(4): Add a pointer to the adapter softc in vi_info.
There were quite a few places where port_info was being accessed only to
get to the adapter.

Reviewed by:	jhb@
MFC after:	1 week
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25432
2020-06-25 17:04:22 +00:00
Mark Johnston
79ddb55c39 Add SCTP_SUPPORT handling to config.mk.
Reviewed by:	jhb, tuexen
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25402
2020-06-25 15:25:00 +00:00
Mark Johnston
84242cf68a Call swap_pager_freespace() from vm_object_page_remove().
All vm_object_page_remove() callers, except
linux_invalidate_mapping_pages() in the LinuxKPI, free swap space when
removing a range of pages from an object.  The LinuxKPI case appears to
be an unintentional omission that could result in leaked swap blocks, so
unconditionally free swap space in vm_object_page_remove() to protect
against similar bugs in the future.

Reviewed by:	alc, kib
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25329
2020-06-25 15:21:21 +00:00
Conrad Meyer
4daa95f85d bhyve(8): For prototyping, reattempt decode in userspace
If userspace has a newer bhyve than the kernel, it may be able to decode
and emulate some instructions vmm.ko is unaware of.  In this scenario,
reset decoder state and try again.

Reviewed by:	grehan
Differential Revision:	https://reviews.freebsd.org/D24464
2020-06-25 00:18:42 +00:00
Vladimir Kondratyev
54cca285fc atkbd/evdev: recognize the Chromebook menu key as F13 like Linux does.
This is the key on the right side of the function keys, with the
"hamburger menu" icon on it.

Submitted by:		GregV <greg@unrelenting.technology>
MFC after:		1 week
Differential Revision:	https://reviews.freebsd.org/D25390
2020-06-25 00:09:43 +00:00
Mark Johnston
ddf1843203 acpi_ibm(4): Rename disengaged mode to unthrottled mode.
This mode was added in r362496.  Rename it to make the meaning more
clear.

PR:		247306
Suggested by:	rpokala
Submitted by:	Ali Abdallah <ali.abdallah@suse.com>
MFC with:	r362496
2020-06-24 19:51:03 +00:00
Enji Cooper
d6701b6c8c Add kern.features.witness
Adding `kern.features.witness` helps expose whether or not the kernel has
`options WITNESS` enabled, so the `feature_present(3)` API can be used
to query whether or not witness(9) is built into the kernel.

This support is helpful with userspace applications (generally speaking,
tests), as it can be queried to determine whether or not tests related
to WITNESS should be run.

MFC after:	1 week
Reviewed by: cem, darrick.freebsd_gmail.com
Differential Revision: https://reviews.freebsd.org/D25302
Sponsored by:	DellEMC Isilon
2020-06-24 18:51:01 +00:00
Mark Johnston
1388cfe1b5 ipfw(4): make O_IPVER/ipversion match IPv4 or 6, not just IPv4.
Submitted by:	Neel Chauhan <neel AT neelc DOT org>
Reviewed by:	Lutz Donnerhacke
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D25227
2020-06-24 15:46:33 +00:00
Mitchell Horne
133b1f1461 Only invalidate the early DTB mapping if it exists
This temporary mapping will become optional. Booting via loader(8)
means that the DTB will have already been copied into the kernel's
staging area, and is therefore covered by the early KVA mappings.

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D24911
2020-06-24 15:21:12 +00:00
Mitchell Horne
f7d2df2a8a Handle load from loader(8)
In locore, we must detect and handle different arguments passed by
loader(8) compared to what we recieve when booting directly via SBI
firmware. Currently we receive the hart ID in a0 and a pointer to the
device tree blob in a1. loader(8) provides only a pointer to its
metadata in a0.

The solution to this is to add an additional entry point, _alt_start.
This will be placed first in the .text section, so SBI firmware will
enter here, and jump to the common pagetable setup shortly after. Since
loader(8) understands our ELF kernel, it will enter at the ELF's entry
address, which points to _start. This approach leads to very little
guesswork as to which way we booted.

Fix-up initriscv() to parse the loader's metadata, continuing to use
fake_preload_metadata() in the SBI direct boot case.

Reviewed by:	markj, jrtc27 (asm portion)
Differential Revision:	https://reviews.freebsd.org/D24912
2020-06-24 15:20:00 +00:00
Michael Tuexen
132c073866 Fix the acconting for fragmented unordered messages when using
interleaving.
This was reported for the userland stack in
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=19321

MFC after:		1 week
2020-06-24 14:47:51 +00:00
Richard Scheffenegger
6e26dd0dbe TCP: fix cubic RTO reaction.
Proper TCP Cubic operation requires the knowledge
of the maximum congestion window prior to the
last congestion event.

This restores and improves a bugfix previously added
by jtl@ but subsequently removed due to a revert.

Reported by:	chengc_netapp.com
Reviewed by:	chengc_netapp.com, tuexen (mentor)
Approved by:	tuexen (mentor), rgrimes (mentor)
MFC after:	2 weeks
Sponsored by:	NetApp, Inc.
Differential Revision:	https://reviews.freebsd.org/D25133
2020-06-24 13:52:53 +00:00
Richard Scheffenegger
9dc7d8a246 TCP: make after-idle work for transactional sessions.
The use of t_rcvtime as proxy for the last transmission
fails for transactional IO, where the client requests
data before the server can respond with a bulk transfer.

Set aside a dedicated variable to actually track the last
locally sent segment going forward.

Reported by:	rrs
Reviewed by:	rrs, tuexen (mentor)
Approved by:	tuexen (mentor), rgrimes (mentor)
MFC after:	2 weeks
Sponsored by:	NetApp, Inc.
Differential Revision:	https://reviews.freebsd.org/D25016
2020-06-24 13:42:42 +00:00
Marcin Wojtas
fab2a758cc Fix AccessWidth and BitWidth parsing in SPCR table
The ACPI Specification defines a Generic Address Structure (GAS),
which is used to describe UART controller register layout in the
SPCR table. The driver responsible for parsing it (uart_cpu_acpi)
wrongly associates the Access Size field to the uart_bas's regshft
and the register BitWidth to the regiowidth - according to
the definitions it should be opposite.

This problem remained hidden most likely because the majority of platforms
use 32-bit registers (BitWidth) which are accessed with the according
size (Dword). However on Marvell Armada 8k / Cn913x platforms,
the 32-bit registers should be accessed with Byte granulity, which
unveiled the issue.

This patch fixes above by proper values assignment and slightly improved
parsing.

Note that handling of the AccessWidth set to EFI_ACPI_6_0_UNDEFINED is
needed to work around a buggy SPCR table on EC2 x86 "bare metal" instances.

Reviewed by: manu, imp, cperciva, greg_unrelenting.technology
Obtained from: Semihalf
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D25373
2020-06-24 12:15:27 +00:00
Michael Tuexen
87c0bf77d9 Fix alignment issue manifesting in the userland stack.
MFC after:		1 wwek
2020-06-23 23:05:05 +00:00
Doug Moore
158c55a584 In r362552, RB_SET_PARENT is defined, and use in parens in
RB_CLEAR_NODE.  But it is not an expression, and ought not to be
enclosed in parens.  Remove them.

Approved by:	markj
Differential Revision:	https://reviews.freebsd.org/D25421
2020-06-23 22:47:54 +00:00
Kirk McKusick
9407f25df2 Optimize g_journal's superblock update by noting that the summary
information is neither read nor written so it need not be written
out when updating the superblock.

PR:           247425
Sponsored by: Netflix
2020-06-23 21:44:00 +00:00