The rcu_work function helps to queue some work after waiting for a grace
period.
This is needed by DRM drivers.
Sponsored-by: The FreeBSD Foundation
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D24942
Recent changes have caused the vmspace objects to start coming from KVA
instead of direct-mapped memory on powerpc. As far as I can tell, this is
not actually a problem, so we should stop arbitrarily asserting that it is.
I do not know why this was not being triggered before.
Approved by: jhibbits
Sponsored by: Tag1 Consulting, Inc.
hardware, the registers appear like there's two cores, but the second
core does not work, so base the number of cores upon the chip id.
Tested on a XC7Z007S.
also, previous commit was suppose to be D14429.
Submitted by: Thomas Skibo
Differential Revision: https://reviews.freebsd.org/D14429
node to replace the one being removed, restructure to first remove the
replacement node and correct the parent pointers around it, and then
let the all-cases code at the end deal with the parent of the deleted
node, making it point to the replacement node. This removes one or two
conditional branches.
Reviewed by: markj
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D24845
This is all very long-standing bug stuff that is touchy and still poorly
documented. Ok, here goes.
The basic bug:
* deleting a VAP causes the RX path (and TX path too) to be restarted
without a full chip reset, which causes RX hangs on the AR9380 and later.
(ie, the ones with the newer DMA engine.)
The basic fix:
* do an RX flush when stopping RX in ath_vap_delete() to match what happens
when RX is stopped elsewhere. This ensures any pending frames are completed
and we restart at the right spot; it also ensures we don't push new RX buffers
into the hardware if we're stopping receive.
The other issues I found:
* Don't bother checking the RX packet ring in the deferred read taskqueue;
that's specifically supposed to be for completing frames rather than
just yanking them off the receive ring.
* Cancel/drain any pending deferred read taskqueue. This isn't done inside
any locks so we should be super careful here. This stops the hardware
being reprogrammed at the same time in another thread/CPU whilst we're
stopping RX.
* .. (yes, this should be better serialised, but that's for another day. maybe.)
* Add more debugging to trace what's going on here.
And the fun bit:
* Reinitialise the RX FIFO ONLY if we've been reset or stopped, rather than just
reset. I noticed that after all the above was done I was STILL seeing RXEOL.
RXEOL isn't enabled on the AR9380 so I'd only see it if I was sending TX frames
(ie a ping where it'd be transmitted but never received) so I was not being
spammed by RXEOL. So, as long as stuff is stopped, restart it.
This seems to be doing the right thing in both AP and STA modes.
What I should do next, if I ever get time:
* as I said above, serialise the receive stop/start to include taskqueues
* monitor RXEOL on the AR9380 and I keep seeing it spammed / lockups, just
go do a full chip reset to get things back on track. It sucks, but it
is better than nothing.
Tested:
* AR9380 AP/STA mode, adding/deleting a hostap VAP to trigger the TX/RX
queue stop/start; whilst also running an iperf through it. Lots of times.
Lots. Of.. Times.
I have to dig into why I'm seeing it on chips as late as the AR9380 era
stuff (as it's marked as an AR5416 bug, but who knows!) but i'm seeing
aggregate TX frames complete with no blockack bit set. So, everything
should be treated as a failure and do a hardware reset for good measure.
Tested:
* AR9380, STA mode
* AR9580 (5GHz), AP mode
I wasn't enforcing the maximum packet length when using static rates
so although the driver was enforcing it itself OK, the statistics were
sometimes going into the wrong bin.
Tested:
* AR9380, STA mode
Instead of crashing the user process when a D-ERAT multihit is detected, try
to flush the ERAT, and continue. This machine check indicates a likely PMAP
invalidation shortcoming that will need to be addressed, but it's
recoverable, so just recover. The recovery is pmap-specific to flush the
ERAT, so add a pmap function to do so, currently only implemented by the
POWER9 radix pmap.
Comparing fsid_t objects requires internal knowledge of the fsid structure
and yet this is duplicated across a number of places in the code.
Simplify by creating a fsidcmp function (macro).
Reviewed by: mjg, rmacklem
Approved by: mav (mentor)
MFC after: 1 week
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D24749
- Use enc_xform_aes_xts.setkey() directly instead of duplicating the code
now that it no longer calls malloc().
- Rather than bringing back all of xform_userland.h, add a conditional
#include of <stand.h> to xform_enc.h.
- Update calls to encrypt/decrypt callbacks in enc_xform_aes_xts for
separate input/output pointers.
Pointy hat to: jhb
It previously returned the object map base address, while all other
ELF operating systems return load offset, i.e. the difference between
map base and the link base.
Explain the meaning of the field in the man page.
Stop filling the mips-only l_offs member, which is apparently unused.
PR: 246561
Requested by: Damjan Jovanovic <damjan.jov@gmail.com>
Reviewed by: emaste, jhb, cem (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D24918
The flush is needed to prevent cross-process ret2spec, which is not handled
on kernel entry if IBPB is enabled but SMEP is present.
While there, add i386 RSB flush.
Reported by: Anthony Steinhauser <asteinhauser@google.com>
Reviewed by: markj, Anthony Steinhauser
Discussed with: philip
admbugs: 961
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
- Consistently use 'void *' for key schedules / key contexts instead
of a mix of 'caddr_t', 'uint8_t *', and 'void *'.
- Add a ctxsize member to enc_xform similar to what auth transforms use
and require callers to malloc/zfree the context. The setkey callback
now supplies the caller-allocated context pointer and the zerokey
callback is removed. Callers now always use zfree() to ensure
key contexts are zeroed.
- Consistently use C99 initializers for all statically-initialized
instances of 'struct enc_xform'.
- Change the encrypt and decrypt functions to accept separate in and
out buffer pointers. Almost all of the backend crypto functions
already supported separate input and output buffers and this makes
it simpler to support separate buffers in OCF.
- Remove xform_userland.h shim to permit transforms to be compiled in
userland. Transforms no longer call malloc/free directly.
Reviewed by: cem (earlier version)
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D24855
Match other architectures and print CPU information during
cpu_startup(). In particular, this prints the information after the
message buffer is initialized which allows it to be retrieved after
boot via dmesg(8).
While here, add some extern declarations to <machine/md_var.h> in
place of duplicated declarations in various source files.
Reviewed by: brooks
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D24936
Rather than walking all of cpu_switch looking for the sequence of
instructions to patch, add a global label at the location that needs
the patch applied.
Reviewed by: brooks, Alfredo Mazzinghi <alfredo.mazzinghi_cl.cam.ac.uk>
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D24931
These functions were added in 2001 and are currently unused.
copyinfrom() looks to have never been used. copyinstrfrom() was used
for two weeks before the code was refactored to remove it's sole use.
Reviewed by: brooks, kib
Obtained from: CheriBSD
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D24928
-development is long and awkward, and is also inconsistent with prior art
from the Linux world, which uses -dev (Debian) or -devel (Red Hat). Follow
the Debian convention, and similarly for debug info packages.
Also remove redundant pkgbase development tag from includes. We already tag
include files with package=runtime,dev; there is no need to separately tag
them as dev.
Discussed with: bapt
Reviewed by: manu
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D24139
After r358443 the vnode object lock no longer synchronizes concurrent
zfs_getpages() and zfs_write() (which must update vnode pages to
maintain coherence). This created a potential deadlock between ZFS
range locks and VM page busy locks: a fault on a mapped file will cause
the fault page to be busied, after which zfs_getpages() locks a range
around the file offset in order to map adjacent, resident pages;
zfs_write() locks the range first, and then must busy vnode pages when
synchronizing.
Solve this by adding a non-blocking mode for ZFS range locks, and using
it in zfs_getpages(). If zfs_getpages() fails to acquire the range
lock, only the fault page will be populated.
Reported by: bdrewery
Reviewed by: avg
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D24839
This change adds Hyper-V socket feature in FreeBSD. New socket address
family AF_HYPERV and its kernel support are added.
Submitted by: Wei Hu <weh@microsoft.com>
Reviewed by: Dexuan Cui <decui@microsoft.com>
Relnotes: yes
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D24061
Fix returning from xenstore device with locks held, which triggers the
following panic:
# cat /dev/xen/xenstore
^C
userret: returning with the following locks held:
exclusive sx evtchn_ringc_sx (evtchn_ringc_sx) r = 0 (0xfffff8000650be40) locked @ /usr/src/sys/dev/xen/evtchn/evtchn_dev.c:262
Note this is not a security issue since access to the device is
limited to root by default.
Sponsored by: Citrix Systems R&D
MFC after: 1 week
Previously the driver handled the bit within itself, but did not expose
the state change to net80211 and interface layers.
This change uses net80211 KPI for rfkill signaling.
The code is modeled after similar code in iwn and wpi.
Reviewed by: adrian
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24923
x86 needs delayed TLB invalidation because invalidation requires an
expensive IPI. PowerPC has had a TLB invalidation instruction since the
POWER1 in 1990, so there's no need to delay anything.
Otherwise accept filters compiled into the kernel do not preempt
preloaded accept filter modules. Then, the preloaded file registers its
accept filter module before the kernel, and the kernel's attempt fails
since duplicate accept filter list entries are not permitted. This
causes the preloaded file's module to be released, since
module_register_init() does a lookup by name, so the preloaded file is
unloaded, and the accept filter's callback points to random memory since
preload_delete_name() unmaps the file on x86 as of r336505.
Add a new ACCEPT_FILTER_DEFINE macro which wraps the accept filter and
module definitions, and ensures that a module version is defined.
PR: 245870
Reported by: Thomas von Dein <freebsd@daemon.de>
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
The DIC and IDC bits in the CTR_EL0 register signal to the kernel when it
can relax the instruction cache synchronisation operations. The IDC bit
means we can relax cleaning the data cache to the point of unification
while the DIC bit means we don't need to invalidate the instruction cache
for data coherence. In both cases an appropriate barrier is still needed.
For now only implement the case where both bits are set, as is the case
on the Neoverse-N1 as used in the Amazon AWS Graviton 2 CPU. Note that
this behaviour is a optional on the N1 so we may later need to implement
only one or the other bit being set.
There is a tunable to disable each flag on boot.
Testing on a 4 core Graviton 2 instance found a significant improvement
in sys and real time when running "make buildkernel -j4", with no
significant difference in user time.
Reviewed by: markj
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D24853
Previously we would create an isrc for each MSI/MSI-X interrupt. This
causes issues for other interrupt sources in the system, e.g. a GPIO
driver, as they may be unable to allocate interrupts. This works around
this by allocating the isrc only when needed.
Reported by: alisaidi@amazon.com
Reviewed by: mmel
Sponsored by: Innovaate UK
Differential Revision: https://reviews.freebsd.org/D24876
Since handlers are call in a thread context we can simply use a workqueue
to emulate those functions.
The DRM code was patched to do that already, having it in linuxkpi allows us
to not patch the upstream code.
Sponsored-by: The FreeBSD Foundation
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D24859
pci_dev_present shows if a set of pci ids are present in the system.
It just wraps pci_find_device.
Needed by DRMv5.2
Submitted by: Austing Shafer (ashafer@badland.io)
Differential Revision: https://reviews.freebsd.org/D24796
The only difference with init_waitqueue_head is that the name and the
lock class key are provided but we don't use those so use init_waitqueue_head
directly.
Sponsored-by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D24861
cem noted that on FreeBSD snprintf() can not fail and code should not
check for that.
A followup commit will replace the usage of snprintf() in the SCTP
sources with a variadic macro SCTP_SNPRINTF, which will simply map to
snprintf() on FreeBSD and do a checking similar to r361209 on
other platforms.
This is independent of the recently-discussed global change, which is still
in review/discussion stage.
This is effectively a measure for consistency in the ZFS world, where
FreeBSD was the only platform (as far as I could find) that allowed this.
What ZFS exposes is decidedly not useful for any real purposes, to
paraphrase (hopefully faithfully) jhb's findings when exploring this:
The size of a directory in ZFS is the number of directory entries within.
When reading a directory, you would instead get the leading part of its raw
contents; the amount you get being dictated by the "size," i.e. number of
directory entries. There's decidedly (luckily) no stack disclosure happening
here, though the behavior is bizarre and almost certainly a historical
accident.
This change has already been upstreamed to OpenZFS.
MFC after: 1 week
A page (even physmem) can be marked as cache-inhibited. Attempting to use
'dcbz' to zero a page mapped cache-inhibited triggers an alignment
exception, which is fatal in kernel. This was seen when testing hardware
acceleration with X on POWER9.
At some point in the future, this should be changed to a more straight
forward zero loop instead of bzero(), and a similar change be made to the
other pmaps.
Reported by: pkubaj@
Note that in_pcb_lport and in_pcb_lport_dest can be called with a NULL
local address for IPv6 sockets; handle it. Found by syzkaller.
Reported by: cem
MFC after: 1 month
Previously, tcp_connect() would bind a local port before connecting,
forcing the local port to be unique across all outgoing TCP connections
for the address family. Instead, choose a local port after selecting
the destination and the local address, requiring only that the tuple
is unique and does not match a wildcard binding.
Reviewed by: tuexen (rscheff, rrs previous version)
MFC after: 1 month
Sponsored by: Forcepoint LLC
Differential Revision: https://reviews.freebsd.org/D24781