Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
Instead of including stdint.h for uintptr_t, include sys/_types.h and use
__types for everything that isn't a native C keyword type.
Remove the #include of cdefs.h. It appears after the include of armreg.h
which has a precondition of cdefs.h being included before it, so everyone
including sysarch.h is already including cdefs.h. (When armv5 support
goes away, there will be no need include armreg.h here either.)
Unfortunately, the unprefixed struct member names "addr" and "len" cannot
be changed, because 3rd-party software is relying on them (libcompiler_rt
is one known consumer).
The hold_count and wire_count fields of struct vm_page are separate
reference counters with similar semantics. The remaining essential
differences are that holds are not counted as a reference with respect
to LRU, and holds have an implicit free-on-last unhold semantic whereas
vm_page_unwire() callers must explicitly determine whether to free the
page once the last reference to the page is released.
This change removes the KPIs which directly manipulate hold_count.
Functions such as vm_fault_quick_hold_pages() now return wired pages
instead. Since r328977 the overhead of maintaining LRU for wired pages
is lower, and in many cases vm_fault_quick_hold_pages() callers would
swap holds for wirings on the returned pages anyway, so with this change
we remove a number of page lock acquisitions.
No functional change is intended. __FreeBSD_version is bumped.
Reviewed by: alc, kib
Discussed with: jeff
Discussed with: jhb, np (cxgbe)
Tested by: pho (previous version)
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D19247
All MMCBR bridges have to implement all the MMCBR variables. This
implements them for everybody that currently doesn't.
A common routine for this should be written.
This lets PCIe MSI-X device interrupts work on the MACCHIATObin
(Marvell Armada 8k), which allows e.g. the Intel igb NIC to fully work.
Submitted by: Greg V <greg@unrelenting.technology>
Reviewed by: mw, bcran
Differential Revision: https://reviews.freebsd.org/D20775
NANDFS has been broken for years. Remove it. The NAND drivers that
remain are for ancient parts that are no longer relevant. They are
polled, have terrible performance and just for ancient arm
hardware. NAND parts have evolved significantly from this early work
and little to none of it would be relevant should someone need to
update to support raw nand. This code has been off by default for
years and has violated the vnode protocol leading to panics since it
was committed.
Numerous posts to arch@ and other locations have found no actual users
for this software.
Relnotes: Yes
No Objection From: arch@
Differential Revision: https://reviews.freebsd.org/D20745
had been done years ago. I did. All this time we've only compiled a LINT
kernel for TARGET_ARCH=arm. Now separate LINT-V5 and LINT-V7 configs are
generated and built.
There are two new files in arm/conf, NOTES.armv5 and NOTES.armv7, containing
some of what used to be in the arm NOTES file. That file now contains only
the bits that are common to v5 and v7.
The makeLINT.mk file now creates the LINT-V5 and LINT-V7 files by concatening
sys/conf/NOTES, arm/conf/NOTES, and arm/conf/NOTES.armv{5,7} in that order.
pwm(9), but also maintains the historical sysctl config interface for
compatiblity with existing apps. The two config systems are not compatible
with each other; if you use both interfaces to change configurations you're
likely to end up with incorrect output or none at all.
upcoming functional changes.
Add an ofw_compat_data table for probing compat strings, and use it to add
PNP data. Remove some stray semicolons at the end of macro definitions,
and add a PWM_LOCK_ASSERT macro to round out the usual suite. Move the
device_t and driver_methods structs to the end of the file. Tweak comments.
is nothing left in the file that related to pwmbus at all. It just contains
prototypes for the functions implemented in dev/pwm.ofw_pwm.c, so name it
accordingly and fix the include protect wrappers to match.
A new pwmbus.h will be coming along in a future commit.
The pwm and pwmbus interfaces were nearly identical, this merges them into a
single pwmbus interface. The pwmbus driver now implements the pwmbus
interface by simply passing all calls through to its parent (the hardware
driver). The channel_count method moves from pwm to pwmbus, and the
get_bus method is deleted (just no longer needed).
The net effect is that the interface for doing pwm stuff is now the same
regardless of whether you're a child of pwmbus, or some random driver
elsewhere in the hierarchy that is bypassing the pwmbus layer and is talking
directly to the hardware driver via cross-hierarchy connections established
using fdt data.
The pwmc driver is now a child of pwmbus, instead of being its sibling
(that's why the get_bus method is no longer needed; pwmc now gets the
device_t of the bus using device_get_parent()).
pollution that was cleaned up recently, and this file got missed in the
cleanup because it's not attached to the build unless you specifically
request this device in a custom kernel config.
driver is compiled into the kernel but pwmbus will be loaded as a module
when needed (and because of that, pwmbus_attach_bus() is going away in
the near future). Instead, just directly do what that function did:
register the fdt xfef handle, and attach the pwmbus.
resources if they got allocated (because detach() gets called from attach()
to handle various failures), and delete the pwmbus child if it got created.
Hide unused code under #ifdef notyet (in one case the only caller is under
that same ifdef), or if it is arm (not arm64) specific code under the
__arm__ ifdef to not yield -Wunused-function warnings during the arm64
kernel compile.
MFC after: 2 weeks
The A3700 has a different GPIO controller and thus, do not use the old (and
shared) code for Marvell.
The pinctrl driver, also part of the controller, is not supported yet (but
the implementation should be straightforward).
Sponsored by: Rubicon Communications, LLC (Netgate)
This allows SDIO (through CAM) to attach to an upstream, e.g.,
..
sdhci_bcm0 pnpinfo name=mmc@7e300000 compat=brcm,bcm2835-mmc
sdiob0
..
Without this, upon trying to load sdio, we would panic with
"bus_add_child is not implemented".
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
In the DMA case, given we disable the data interrupts, we never seem
to get DATA_END. Given we are relying on DMA interrupts we are not
using the SDHCI state machine and hence only call into
sdhci_platform_will_handle() for the first check of data.
We do not call "will handle" for any following round trips of the same
transaction if block size * count > BCM_DMA_BLOCK_SIZE.
Manually check "left" in the DMA interrupt handler to see if we have at
least another full BCM_DMA_BLOCK_SIZE to handle.
Without this change we would DMA that and then even start a DMA with
left == 0 which would lead to a timeout and error.
Now we re-enable data interrupts and return and let the SDHCI generic
interrupt handler and state machine pick the SPACE_AVAIL up and then
find that it should punt to the pio_handler for the remaining bytes
or finish the data transaction.
With this change block mode seems to work beyond 7 * 64byte blocks,
which worked as it was below BCM_DMA_BLOCK_SIZE.
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D20199
Extending what the initial revision, r273264, r276985, r277346 have
started for the transfer mode and command registers, another pair of
16bit registers written in sequence are block size and block count,
which fall together onto the same 32bit line and hence the same
register(s) would be written twice in sequence for those as well.
Use a similar approach to transfer mode and command and save the writes
to either of the block regiters and then only execute a write once.
We can do this as with transfer mode their values are meaningless until
a command is issued so we can use that write to command as a trigger
to also write out the block registers.
Compared to transfer mode and command the value of block count can
change, so we need to keep state and actually read the block registers
back the first time after a write.
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D20197
These calls are not the same in general: the former will dequeue the
page if it is enqueued, while the latter will just leave it alone. But,
all existing uses of the former apply to unmanaged pages, which are
never enqueued in the first place. No functional change intended.
Reviewed by: kib
MFC after: 1 week
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D20470
cpufunc, in terms of __builtin_ffs and the like, for arm32 v6 and v7
architectures, and use those, rather than the simple libkern
implementations, in building arm32 kernels.
Reviewed by: manu
Approved by: kib, markj (mentors)
Tested by: iz-rpi03_hs-karlsruhe.de, mikael.urankar_gmail.com, ian
Differential Revision: https://reviews.freebsd.org/D20412
Some clocks used the NM type but this clock is for the ones with the
formula "clk = clkin / n / m" and not "clk = clkin * n / m"
Use the new frac clock for them.
Add a clock driver for clock that can either be used in integer mode
with one N factor and one M divider or in fractional mode where the
output frequency is chosen between two predifined output.
This was enumerated with exhaustive search for sys/eventhandler.h includes,
cross-referenced against EVENTHANDLER_* usage with the comm(1) utility. Manual
checking was performed to avoid redundant includes in some drivers where a
common os_bsd.h (for example) included sys/eventhandler.h indirectly, but it is
possible some of these are redundant with driver-specific headers in ways I
didn't notice.
(These CUs did not show up as missing eventhandler.h in tinderbox.)
X-MFC-With: r347984
This allows replacing "sys/eventfilter.h" includes with "sys/_eventfilter.h"
in other header files (e.g., sys/{bus,conf,cpu}.h) and reduces header
pollution substantially.
EVENTHANDLER_DECLARE and EVENTHANDLER_LIST_DECLAREs were moved out of .c
files into appropriate headers (e.g., sys/proc.h, powernv/opal.h).
As a side effect of reduced header pollution, many .c files and headers no
longer contain needed definitions. The remainder of the patch addresses
adding appropriate includes to fix those files.
LOCK_DEBUG and LOCK_FILE_LINE_ARG are moved to sys/_lock.h, as required by
sys/mutex.h since r326106 (but silently protected by header pollution prior
to this change).
No functional change (intended). Of course, any out of tree modules that
relied on header pollution for sys/eventhandler.h, sys/lock.h, or
sys/mutex.h inclusion need to be fixed. __FreeBSD_version has been bumped.
Having IPSEC compiled into the kernel imposes a non-trivial
performance penalty on multi-threaded workloads due to IPSEC
refcounting. In my benchmarks of multi-threaded UDP
transmit (connected sockets), I've seen a roughly 20% performance
penalty when the IPSEC option is included in the kernel (16.8Mpps
vs 13.8Mpps with 32 senders on a 14 core / 28 HTT Xeon
2697v3)). This is largely due to key_addref() incrementing and
decrementing an atomic reference count on the default
policy. This cause all CPUs to stall on the same cacheline, as it
bounces between different CPUs.
Given that relatively few users use ipsec, and that it can be
loaded as a module, it seems reasonable to ask those users to
load the ipsec module so as to avoid imposing this penalty on the
GENERIC kernel. Its my hope that this will make FreeBSD look
better in "out of the box" benchmark comparisons with other
operating systems.
Many thanks to ae for fixing auto-loading of ipsec.ko when
ifconfig tries to configure ipsec, and to cy for volunteering
to ensure the the racoon ports will load the ipsec.ko module
Reviewed by: cem, cy, delphij, gnn, jhb, jpaetzel
Differential Revision: https://reviews.freebsd.org/D20163
tun(4) and tap(4) share the same general management interface and have a lot
in common. Bugs exist in tap(4) that have been fixed in tun(4), and
vice-versa. Let's reduce the maintenance requirements by merging them
together and using flags to differentiate between the three interface types
(tun, tap, vmnet).
This fixes a couple of tap(4)/vmnet(4) issues right out of the gate:
- tap devices may no longer be destroyed while they're open [0]
- VIMAGE issues already addressed in tun by kp
[0] emaste had removed an easy-panic-button in r240938 due to devdrn
blocking. A naive glance over this leads me to believe that this isn't quite
complete -- destroy_devl will only block while executing d_* functions, but
doesn't block the device from being destroyed while a process has it open.
The latter is the intent of the condvar in tun, so this is "fixed" (for
certain definitions of the word -- it wasn't really broken in tap, it just
wasn't quite ideal).
ifconfig(8) also grew the ability to map an interface name to a kld, so
that `ifconfig {tun,tap}0` can continue to autoload the correct module, and
`ifconfig vmnet0 create` will now autoload the correct module. This is a
low overhead addition.
(MFC commentary)
This may get MFC'd if many bugs in tun(4)/tap(4) are discovered after this,
and how critical they are. Changes after this are likely easily MFC'd
without taking this merge, but the merge will be easier.
I have no plans to do this MFC as of now.
Reviewed by: bcr (manpages), tuexen (testing, syzkaller/packetdrill)
Input also from: melifaro
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D20044
Use it wherever COMPAT_FREEBSD11 is currently specified, like r309749.
Reviewed by: imp, jhb, markj
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D20120