After r358443 the vnode object lock no longer synchronizes concurrent
zfs_getpages() and zfs_write() (which must update vnode pages to
maintain coherence). This created a potential deadlock between ZFS
range locks and VM page busy locks: a fault on a mapped file will cause
the fault page to be busied, after which zfs_getpages() locks a range
around the file offset in order to map adjacent, resident pages;
zfs_write() locks the range first, and then must busy vnode pages when
synchronizing.
Solve this by adding a non-blocking mode for ZFS range locks, and using
it in zfs_getpages(). If zfs_getpages() fails to acquire the range
lock, only the fault page will be populated.
Reported by: bdrewery
Reviewed by: avg
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D24839
Libarchive 3.4.3
Relevant vendor changes:
PR #1352: support negative zstd compression levels
PR #1359: improve zstd version checking
PR #1348: support RHT.security.selinux from GNU tar
PR #1357: support for archives compressed with pzstd
PR #1367: fix issues in acl tests
PR #1372: child handling cleanup
PR #1378: fix memory leak from passphrase callback
This change adds Hyper-V socket feature in FreeBSD. New socket address
family AF_HYPERV and its kernel support are added.
Submitted by: Wei Hu <weh@microsoft.com>
Reviewed by: Dexuan Cui <decui@microsoft.com>
Relnotes: yes
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D24061
Fix returning from xenstore device with locks held, which triggers the
following panic:
# cat /dev/xen/xenstore
^C
userret: returning with the following locks held:
exclusive sx evtchn_ringc_sx (evtchn_ringc_sx) r = 0 (0xfffff8000650be40) locked @ /usr/src/sys/dev/xen/evtchn/evtchn_dev.c:262
Note this is not a security issue since access to the device is
limited to root by default.
Sponsored by: Citrix Systems R&D
MFC after: 1 week
Previously the driver handled the bit within itself, but did not expose
the state change to net80211 and interface layers.
This change uses net80211 KPI for rfkill signaling.
The code is modeled after similar code in iwn and wpi.
Reviewed by: adrian
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24923
radio is disabled through the communication device toggle key (also known
as the RF raidio kill button). Only the CTRL-EVENT-DISCONNECTED will be
issued.
Submitted by: avg
Reported by: avg
MFC after: 1 week
x86 needs delayed TLB invalidation because invalidation requires an
expensive IPI. PowerPC has had a TLB invalidation instruction since the
POWER1 in 1990, so there's no need to delay anything.
Otherwise accept filters compiled into the kernel do not preempt
preloaded accept filter modules. Then, the preloaded file registers its
accept filter module before the kernel, and the kernel's attempt fails
since duplicate accept filter list entries are not permitted. This
causes the preloaded file's module to be released, since
module_register_init() does a lookup by name, so the preloaded file is
unloaded, and the accept filter's callback points to random memory since
preload_delete_name() unmaps the file on x86 as of r336505.
Add a new ACCEPT_FILTER_DEFINE macro which wraps the accept filter and
module definitions, and ensures that a module version is defined.
PR: 245870
Reported by: Thomas von Dein <freebsd@daemon.de>
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
The DIC and IDC bits in the CTR_EL0 register signal to the kernel when it
can relax the instruction cache synchronisation operations. The IDC bit
means we can relax cleaning the data cache to the point of unification
while the DIC bit means we don't need to invalidate the instruction cache
for data coherence. In both cases an appropriate barrier is still needed.
For now only implement the case where both bits are set, as is the case
on the Neoverse-N1 as used in the Amazon AWS Graviton 2 CPU. Note that
this behaviour is a optional on the N1 so we may later need to implement
only one or the other bit being set.
There is a tunable to disable each flag on boot.
Testing on a 4 core Graviton 2 instance found a significant improvement
in sys and real time when running "make buildkernel -j4", with no
significant difference in user time.
Reviewed by: markj
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D24853
Previously we would create an isrc for each MSI/MSI-X interrupt. This
causes issues for other interrupt sources in the system, e.g. a GPIO
driver, as they may be unable to allocate interrupts. This works around
this by allocating the isrc only when needed.
Reported by: alisaidi@amazon.com
Reviewed by: mmel
Sponsored by: Innovaate UK
Differential Revision: https://reviews.freebsd.org/D24876
If certctl is installed on the system we're configuring, do a certctl
rehash.
Note that certctl may not be present if the world we've installed was built
either WITHOUT_OPENSSL or WITHOUT_CAROOT. In this scenario, we don't
currently see if the host has a certctl as this may be an indication that
the system *shouldn't* have certs installed into /etc/ssl.
Reviewed by: allanjude, dteske
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D24640
Before this change, swapon(8) implied that -F works as a standalone option,
which is not the case and would produce a usage message. This change extends
the description of the -F option to mention that -a is required with it.
PR: 238551
Submitted by: Christian Baltini
MFC after: 5 days
Apparently, when the -u, -i and -I options where added to sed(1), it was
forgotten to add them to both lines in the SYNOPSIS section. They were only
added to the second line, although they apply to both.
With the updated SYNOPSIS, it is now allowed (and consistent) to run:
sed -i BAK s/foo/bar/g some_file
PR: 240556
Submitted by: Oliver Fromme
MFC after: 5 days
Since handlers are call in a thread context we can simply use a workqueue
to emulate those functions.
The DRM code was patched to do that already, having it in linuxkpi allows us
to not patch the upstream code.
Sponsored-by: The FreeBSD Foundation
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D24859
pci_dev_present shows if a set of pci ids are present in the system.
It just wraps pci_find_device.
Needed by DRMv5.2
Submitted by: Austing Shafer (ashafer@badland.io)
Differential Revision: https://reviews.freebsd.org/D24796
The only difference with init_waitqueue_head is that the name and the
lock class key are provided but we don't use those so use init_waitqueue_head
directly.
Sponsored-by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D24861
cem noted that on FreeBSD snprintf() can not fail and code should not
check for that.
A followup commit will replace the usage of snprintf() in the SCTP
sources with a variadic macro SCTP_SNPRINTF, which will simply map to
snprintf() on FreeBSD and do a checking similar to r361209 on
other platforms.
This is independent of the recently-discussed global change, which is still
in review/discussion stage.
This is effectively a measure for consistency in the ZFS world, where
FreeBSD was the only platform (as far as I could find) that allowed this.
What ZFS exposes is decidedly not useful for any real purposes, to
paraphrase (hopefully faithfully) jhb's findings when exploring this:
The size of a directory in ZFS is the number of directory entries within.
When reading a directory, you would instead get the leading part of its raw
contents; the amount you get being dictated by the "size," i.e. number of
directory entries. There's decidedly (luckily) no stack disclosure happening
here, though the behavior is bizarre and almost certainly a historical
accident.
This change has already been upstreamed to OpenZFS.
MFC after: 1 week
A page (even physmem) can be marked as cache-inhibited. Attempting to use
'dcbz' to zero a page mapped cache-inhibited triggers an alignment
exception, which is fatal in kernel. This was seen when testing hardware
acceleration with X on POWER9.
At some point in the future, this should be changed to a more straight
forward zero loop instead of bzero(), and a similar change be made to the
other pmaps.
Reported by: pkubaj@
Note that in_pcb_lport and in_pcb_lport_dest can be called with a NULL
local address for IPv6 sockets; handle it. Found by syzkaller.
Reported by: cem
MFC after: 1 month
Previously, tcp_connect() would bind a local port before connecting,
forcing the local port to be unique across all outgoing TCP connections
for the address family. Instead, choose a local port after selecting
the destination and the local address, requiring only that the tuple
is unique and does not match a wildcard binding.
Reviewed by: tuexen (rscheff, rrs previous version)
MFC after: 1 month
Sponsored by: Forcepoint LLC
Differential Revision: https://reviews.freebsd.org/D24781