Commit Graph

132949 Commits

Author SHA1 Message Date
Michael Tuexen
6ddc843832 Fix a use-after-free bug for the userland stack. The kernel
stack is not affected.
Thanks to Mark Wodrich from Google for finding and reporting the
bug.

MFC after:		1 week
2020-07-10 11:15:10 +00:00
Andrew Turner
0d266dedf7 Split long lines in the Raspberry Pi FB driver
Sponsored by:	Innovate UK
2020-07-10 09:34:47 +00:00
Mateusz Guzik
74f61caed5 vfs: fix early termination of kern_getfsstat
The kernel would unlock already unlocked mutex if the buffer got filled up
before the mount list ended.

Reported by:	pho
Fixes:	r363069 ("vfs: depessimize getfsstat when only the count is requested")
2020-07-10 09:24:27 +00:00
Mateusz Guzik
422f38d8ea vfs: fix trivial whitespace issues which don't interefere with blame
.. even without the -w switch
2020-07-10 09:01:36 +00:00
Mateusz Guzik
6c69e69724 vfs: depessimize getfsstat when only the count is requested
This avoids relocking mountlist_mtx for each entry.
2020-07-10 06:47:58 +00:00
Mateusz Guzik
8c1f410c19 vfs: avoid spurious memcpy in vfs_statfs
It is quite often called for the very same buffer.
2020-07-10 06:46:42 +00:00
Kyle Evans
423a033ba7 memfd_create: turn on SHM_GROW_ON_WRITE
memfd_create fds will no longer require an ftruncate(2) to set the size;
they'll grow (to the extent that it's possible) upon write(2)-like syscalls.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D25502
2020-07-10 00:45:16 +00:00
Kyle Evans
3f07b9d9f8 shm_open2: Implement SHM_GROW_ON_WRITE
Lack of SHM_GROW_ON_WRITE is actively breaking Python's memfd_create tests,
so go ahead and implement it. A future change will make memfd_create always
set SHM_GROW_ON_WRITE, to match Linux behavior and unbreak Python's tests
on -CURRENT.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D25502
2020-07-10 00:43:45 +00:00
Kyle Evans
50030f9547 shmfd: make shm_size a vm_ooffset_t
On 32-bit platforms, this expands the shm_size to a 64-bit quantity and
resolves a mismatch between the shmfd size and underlying vm_object size.
The implementation did not account for this kind of mismatch.

Reviewed by:	kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D25602
2020-07-10 00:03:06 +00:00
Scott Long
ffc568ba8b Revert r362998, r326999 while a better compatibility strategy is devised. 2020-07-09 22:38:36 +00:00
Mark Johnston
fe59cb6ba2 Apply the logic from r363051 to semctl(2) and __sem_base field.
Reported by:	Jeffball <jeffball@grimm-co.com>
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25600
2020-07-09 18:34:54 +00:00
Mark Johnston
f4f16af1d3 Avoid copying out kernel pointers from msgctl(IPC_STAT).
While this behaviour is harmless, it is really just an artifact of the
fact that the msgctl(2) implementation uses a user-visible structure as
part of the internal implementation, so it is not deliberate and these
pointers are not useful to userspace.  Thus, NULL them out before
copying out, and remove references to them from the manual page.

Reported by:	Jeffball <jeffball@grimm-co.com>
Reviewed by:	emaste, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25600
2020-07-09 17:26:49 +00:00
Andrew Turner
201a1f34da Add a driver to talk to the Raspberry Pi firmware
Communicating with the Raspberry Pi firmware is currently handled by each
driver calling into the mbox driver, however the device tree is structured
such that they should be calling into a firmware driver.

Add a driver for this node with an interface to communicate to the firmware
via the mbox interface.

There is a sysctl to get the firmware revision. This is a unix date so can
be parsed with:

root@generic:~ # date -j -f '%s' sysctl -n dev.bcm2835_firmware.0.revision
Tue Nov 19 16:40:28 UTC 2019

Reviewed by:	manu
Sponsored by:	Innovate UK
Differential Revision:	https://reviews.freebsd.org/D25572
2020-07-09 16:28:13 +00:00
Michael Tuexen
b6734d8f4a Optimize flushing of receive queues.
This addresses an issue found and reported for the userland stack in
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=21243

MFC after:		1 week
2020-07-09 16:18:42 +00:00
Xin LI
b23a7fbaab g_concat_find_device: trim /dev/ if it is present, like other GEOM
classes.

Reviewed by:	cem
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D25596
2020-07-09 08:00:46 +00:00
Xin LI
8510f61acd sys/geom: consistently use _PATH_DEV instead of hardcoding "/dev/".
Reviewed by:	cem
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D25565
2020-07-09 02:52:39 +00:00
Emmanuel Vadot
e18c80bcb6 twsi: Fix for > Allwinner A20
Every revision of twsi after the A20 have a bug where we need to
write again the control register after each interrupts. We also need
to add some delay before writing to this register, a simple read of the
same register does the job so do that.
Also fix the case when we have finish sending all the bytes, it only worked
for 1 byte transfer (the same kind that we do for talking to the PMIC on A20
boards).
While here add more debug messages and rework some of them.

This was tested by talking to a AT23C32 eeprom and a DS3231 RTC from an
H3 and A20 board.

PR:		247576
Reported by:	Manuel Stühn (freebsd@justmail.de)
MFC after:	1 week
2020-07-08 19:14:44 +00:00
Emmanuel Vadot
9d2c88ab2a extres/syscon_generic: Make device quiet if not in boot verbose
On some boards there is a lot of of syscon node that are unused as
more specific drivers is probed before, no need to flood the console
for the mostly-unused generic ones.

MFC after:	1 week
2020-07-08 17:14:44 +00:00
Alan Somers
6f818c1fb0 geli: enable direct dispatch
geli does all of its crypto operations in a separate thread pool, so
g_eli_start, g_eli_read_done, and g_eli_write_done don't actually do very
much work. Enabling direct dispatch eliminates the g_up/g_down bottlenecks,
doubling IOPs on my system. This change does not affect the thread pool.

Reviewed by:	markj
MFC after:	2 weeks
Sponsored by:	Axcient
Differential Revision:	https://reviews.freebsd.org/D25587
2020-07-08 17:12:12 +00:00
Michael Tuexen
fcbfdc0ab6 Improve consistency.
MFC after:		1 week
2020-07-08 16:23:40 +00:00
Michael Tuexen
ef9095c72a Fix error description.
MFC after:		1 week
2020-07-08 16:04:06 +00:00
Michael Tuexen
c96d7c373e Don't accept FORWARD-TSN chunks when I-FORWARD-TSN was negotiated
and vice versa.

MFC after:		1 week
2020-07-08 15:49:30 +00:00
Michael Tuexen
32df1c9ebb Improve handling of PKTDROP chunks. This includes the input validation
to address two issues found by ossfuzz testing the userland stack:
* https://oss-fuzz.com/testcase-detail/5387560242380800
* https://oss-fuzz.com/testcase-detail/4887954068865024
and adding support for I-DATA chunks in addition to DATA chunks.
2020-07-08 12:25:19 +00:00
Takanori Watanabe
de402d6322 Add support for [read|write] supported data length commands.
Fix ng_hci_le_long_term_key_request_negative_reply_cp struct
while here.

PR:	247809
Submitted by:	Marc Veldman
2020-07-08 06:33:07 +00:00
Rick Macklem
3eaf03766e Add support for ext_pgs mbufs to nfsm_uiombuf().
This patch uses a slightly different algorithm for the non-ext_pgs case,
where a variable called "mcp" is maintained, pointing to the current
location that mbuf data can be filled into. This avoids use of
mtod(mp, char *) + mp->m_len to calculate the location, since this does
not work for ext_pgs mbufs and I think it makes the algorithm more readable.
This change should not result in semantic changes for the non-ext_pgs case.

This is another in the series of commits that add support to the NFS client
and server for building RPC messages in ext_pgs mbufs with anonymous pages.
This is useful so that the entire mbuf list does not need to be
copied before calling sosend() when NFS over TLS is enabled.

Since ND_EXTPG is never set yet, there is no semantic change at this time.
2020-07-08 02:28:08 +00:00
Scott Long
b302c2e5c9 Migrate the feature of excluding RAM pages to use "excludelist"
as its nomenclature.

MFC after:	1 week
2020-07-07 20:33:11 +00:00
Mark Johnston
04ed45b968 Rebuild sysent when capabilities.conf is updated.
Reviewed by:	brooks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25571
2020-07-07 16:35:52 +00:00
Richard Scheffenegger
c201ce0b4a Fix KASSERT during tcp_newtcpcb when low on memory
While testing with system default cc set to cubic, and
running a memory exhaustion validation, FreeBSD panics for a
missing inpcb reference / lock.

Reviewed by:	rgrimes (mentor), tuexen (mentor)
Approved by:	rgrimes (mentor), tuexen (mentor)
MFC after:	3 weeks
Sponsored by:	NetApp, Inc.
Differential Revision:	https://reviews.freebsd.org/D25583
2020-07-07 12:10:59 +00:00
Rick Macklem
022346fa62 Add support for ext_pgs mbufs to nfsrvd_rephead().
This is another in the series of commits that add support to the NFS client
and server for building RPC messages in ext_pgs mbufs with anonymous pages.
This is useful so that the entire mbuf list does not need to be
copied before calling sosend() when NFS over TLS is enabled.

Since ND_EXTPG is never set yet, there is no semantic change at this time.
2020-07-07 00:42:23 +00:00
Kristof Provost
38d715f789 riscv plic: Do not complete interrupts until the interrupt handler has run
We cannot complete the interrupt (i.e. write to the claims/complete register
until the interrupt handler has actually run. We don't run the interrupt
handler immediately from intr_isrc_dispatch(), we only schedule it for later
execution.

If we immediately complete it (i.e. before the interrupt handler proper has
run) the interrupt may be triggered again if the interrupt source remains set.
From RISC-V Instruction Set Manual: Volume II: Priviliged Architecture, 7.4
Interrupt Gateways:

"If a level-sensitive interrupt source deasserts the interrupt after the PLIC
core accepts the request and before the interrupt is serviced, the interrupt
request remains present in the IP bit of the PLIC core and will be serviced by
a handler, which will then have to determine that the interrupt device no
longer requires service."

In other words, we may receive interrupts twice.

Avoid that by postponing the completion until after the interrupt handler has
run.

If the interrupt is handled by a filter rather than by scheduling an interrupt
thread we must also complete the interrupt, so set up a post_filter handler
(which is the same as the post_ithread handler).

Reviewed by:	mhorne
Sponsored by:	Axiado
Differential Revision:	https://reviews.freebsd.org/D25531
2020-07-06 21:29:50 +00:00
Mark Johnston
26dd427800 Split nhop_ref_object().
Now nhop_ref_object() unconditionally acquires a reference, and the new
nhop_try_ref_object() uses refcount_acquire_if_not_zero() to
conditionally acquire a reference.  Since the former is cheaper, use it
when we know that the initial counter value is non-zero.  No functional
change intended.

Reviewed by:	melifaro
Differential Revision:	https://reviews.freebsd.org/D25535
2020-07-06 21:20:57 +00:00
Mitchell Horne
2192efc03b RISC-V boot1.efi and loader.efi support
This implementation doesn't have any major deviations from the other EFI
ports. I've copied the boilerplate from arm and arm64.

I've tested this with the following boot flows:
OpenSBI (M-mode) -> u-boot (S-mode) -> loader.efi -> FreeBSD
OpenSBI (M-mode) -> u-boot (S-mode) -> boot1.efi -> loader.efi -> FreeBSD

Due to the way that u-boot handles secondary CPUs, OpenSBI >= v0.7 is required,
as the HSM extension is needed to bring them up explicitly. Because of this,
using BBL as the SBI implementation will not be possible. Additionally, there
are a few recent u-boot changes that are required as well, all of which will be
present in the upcoming v2020.07 release.

Looks good:	emaste
Differential Revision:	https://reviews.freebsd.org/D25135
2020-07-06 18:19:42 +00:00
Mark Johnston
866a5d1298 Regenerate.
Sponsored by:	The FreeBSD Foundation
2020-07-06 16:34:49 +00:00
Mark Johnston
bdfe61e05e Permit cpuset_(get|set)domain() in capability mode.
These system calls already perform validation of their parameters when
called in capability mode, identical to cpuset_(get|set)affinity().

MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2020-07-06 16:34:29 +00:00
Pawel Biernacki
e94fdc3833 kern.tty_info_kstacks: set compact format as default 2020-07-06 16:34:15 +00:00
Mark Johnston
69b565d7c0 Allow accesses of the caller's CPU and domain sets in capability mode.
cpuset_(get|set)(affinity|domain)(2) permit a get or set of the calling
thread or process' CPU and domain set in capability mode, but only when
the thread or process ID is specified as -1.  Extend this to cover the
case where the ID actually matches the caller's TID or PID, since some
code, such as our pthread_attr_get_np() implementation, always provides
an explicit ID.

It was not and still is not permitted to access CPU and domain sets for
other threads in the same process when the process is in capability
mode.  This might change in the future.

Submitted by:	Greg V <greg@unrelenting.technology> (original version)
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D25552
2020-07-06 16:34:09 +00:00
Pawel Biernacki
cd1c083d80 kern.tty_info_kstacks: add a compact format
Add a more compact display format for kern.tty_info_kstacks inspired by
procstat -kk. Set it as a default one.

# sysctl kern.tty_info_kstacks=1
kern.tty_info_kstacks: 0 -> 1
# sleep 2
^T
load: 0.17  cmd: sleep 623 [nanslp] 0.72r 0.00u 0.00s 0% 2124k
#0 0xffffffff80c4443e at mi_switch+0xbe
#1 0xffffffff80c98044 at sleepq_catch_signals+0x494
#2 0xffffffff80c982c2 at sleepq_timedwait_sig+0x12
#3 0xffffffff80c43af3 at _sleep+0x193
#4 0xffffffff80c50e31 at kern_clock_nanosleep+0x1a1
#5 0xffffffff80c5119b at sys_nanosleep+0x3b
#6 0xffffffff810ffc69 at amd64_syscall+0x119
#7 0xffffffff810d5520 at fast_syscall_common+0x101
sleep: about 1 second(s) left out of the original 2
^C
# sysctl kern.tty_info_kstacks=2
kern.tty_info_kstacks: 1 -> 2
# sleep 2
^T
load: 0.24  cmd: sleep 625 [nanslp] 0.81r 0.00u 0.00s 0% 2124k
mi_switch+0xbe sleepq_catch_signals+0x494 sleepq_timedwait_sig+0x12
sleep+0x193 kern_clock_nanosleep+0x1a1 sys_nanosleep+0x3b
amd64_syscall+0x119 fast_syscall_common+0x101
sleep: about 1 second(s) left out of the original 2
^C

Suggested by:	avg
Reviewed by:	mjg
Relnotes:	yes
Sponsored by:	Mysterious Code Ltd.
Differential Revision:	https://reviews.freebsd.org/D25487
2020-07-06 16:33:28 +00:00
Mark Johnston
9eb997cb48 Lift cpuset Capsicum checks into a subroutine.
Otherwise the same checks are duplicated across four different system
call implementations, cpuset_(get|set)(affinity|domain)().  No
functional change intended.

MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2020-07-06 16:33:21 +00:00
Brandon Bergren
60185d8965 [PowerPC] XIVE dispatch tweaks
* Only read the DPCPU pointer once per xive_dispatch call.
  * Optimize HE decoding for the common cases.

Reported by:	jhibbits (in irc)
Reviewed by:	jhibbits
Sponsored by:	Tag1 Consulting, Inc.
Differential Revision:	https://reviews.freebsd.org/D25545
2020-07-06 15:15:37 +00:00
Mark Johnston
b256d25c50 iflib: Fix some nits in the rx refill code.
- Get rid of the ifl_vm_addrs array.  It is not used by any existing
  consumer, so we are just dirtying a couple of cache lines for no
  reason.
- Use uma_zalloc(fl->ifl_zone) instead of m_cljget().  Otherwise
  m_cljget() is doing unnecessary work to look up the correct zone, when
  iflib already knows what that zone is.
- ifl_gen is only used when INVARIANTS is on, so make that more clear.
- Fix some style nits and inconsistencies.

Reviewed by:	gallatin
Tested by:	pho
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25490
2020-07-06 14:52:21 +00:00
Mark Johnston
a363e1d4d0 iflib: Fix handling of mbuf cluster allocation failures.
When refilling an rx freelist, make sure we only update the hardware
producer index if at least one cluster was allocated.  Otherwise the
NIC is programmed to write a previously used cluster, typically
resulting in a use-after-free when packet data is written by the
hardware.

Also make sure that we don't update the fragment index cursor if the
last allocation attempt didn't succeed.  For at least Intel drivers,
iflib assumes that the consumer index and fragment index cursor stay in
lockstep, but this assumption was violated in the face of cluster
allocation failures.

Reported and tested by:	pho
Reviewed by:	gallatin, hselasky
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25489
2020-07-06 14:52:09 +00:00
Andrew Turner
eed8b80f64 Add a driver for bcm2838 PCI express controller
This adds support for the Broadcom bcm2711 PCI express controller, found
on the Raspberry Pi 4 (aka the bcm2838 SoC). The driver has only been
developed against the soldered-on VIA XHCI controller and not tested
with other end points.

Submitted by:	Robert Crowston <crowston_protonmail.com>
Differential Revision:	https://reviews.freebsd.org/D25068
2020-07-06 08:51:55 +00:00
Hans Petter Selasky
1866c98e64 Infiniband clients must be attached and detached in a specific order in ibcore.
Currently the linking order of the infiniband, IB, modules decide in which
order the clients are attached and detached. For example one IB client may
use resources from another IB client. This can lead to a potential deadlock
at shutdown. For example if the ipoib is unregistered after the ib_multicast
client is detached, then if ipoib is using multicast addresses a deadlock may
happen, because ib_multicast will wait for all its resources to be freed before
returning from the remove method.

Fix this by using module_xxx_order() instead of module_xxx().

Differential Revision:	https://reviews.freebsd.org/D23973
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2020-07-06 08:50:11 +00:00
Mateusz Guzik
9b0c2e5909 vfs: expand on vhold_smr comment 2020-07-06 02:00:35 +00:00
Mateusz Guzik
d363fa4127 lockf: elide avoidable locking in lf_advlockasync
While here assert on ls_threads state.
2020-07-05 23:07:54 +00:00
Rick Macklem
34fc29e0c9 Add support for ext_pgs mbufs to nfsm_strtom().
Also, add a new function nfsm_add_ext_pgs() which will either add a page
or add a new ext_pgs mbuf with a page to the mbuf list. Used by nfsm_strtom().
This is another in the series of commits that add support to the NFS client
and server for building RPC messages in ext_pgs mbufs with anonymous pages.
This is useful so that the entire mbuf list does not need to be
copied before calling sosend() when NFS over TLS is enabled.

Since ND_EXTPG is never set yet, there is no semantic change at this time.
2020-07-05 21:55:16 +00:00
Konstantin Belousov
4543c1c329 Fix typo.
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2020-07-05 20:54:01 +00:00
Hans Petter Selasky
588fbadffb Fix include file order in io.h in the LinuxKPI.
Make sure sys/types.h is included before machine/vm.h.

PR:		247775
Submitted by:	pkubaj@
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-07-05 19:38:36 +00:00
Andrew Turner
fcf7a48191 Rerun kernel ifunc resolvers after all CPUs have started
On architectures that use RELA relocations it is safe to rerun the ifunc
resolvers on after all CPUs have started, but while they are sill parked.

On arm64 with big.LITTLE this is needed as some SoCs have shipped with
different ID register values the big and little clusters meaning we were
unable to rely on the register values from the boot CPU.

Add support for rerunning the resolvers on arm64 and amd64 as these are
both RELA using architectures.

Reviewed by:	kib
Sponsored by:	Innovate UK
Differential Revision:	https://reviews.freebsd.org/D25455
2020-07-05 14:38:22 +00:00
Edward Tomasz Napierala
4d2b7be54a Fix Linux recvmsg(2) when msg_namelen returned is 0. Previously
it would fail with EINVAL, breaking some of the Python regression
tests.

While here, cap the user-controlled message length.

Note that the code doesn't seem to be copying out the new length
in either (success or failure) case. This will be addressed separately.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25392
2020-07-05 10:57:28 +00:00