130662 Commits

Author SHA1 Message Date
Mateusz Guzik
44c78346f6 x86: fix assertion in ipi_send_cpu to range check the passed id
Prior to the change for sufficiently bad id (and in particular NOCPU which is -1)
it would access memory outside of the cpu_apic_ids array.
2020-01-19 21:35:51 +00:00
Justin Hibbits
95a8fce118 [PowerPC64] fix crash when using machdep.moea64_bpvo_pool_size tunable
Summary:
This fixes kernel crashing when tunable "machdep.moea64_bpvo_pool_size" is
set to a value higher then 327680 (default value).  Function
moea64_mid_bootstrap() relies on moea64_bpvo_pool_size, but at time of the
use the variable wan't yet updated with the new value provided by user.

Problem was detected after trying to use a VM with 64GB of RAM, and default
moea64_bpvo_pool_size is insufficient (kernel boot used more than 470000) .
I think default value must be discussed to address this use case, or find a
way to calculate pool size automatically based on amount of memory detected.

Test Plan: Tested on QEMU VM with 64GB of RAM using "set
machdep.moea64_bpvo_pool_size=655360" on loader prompt

Submitted by:	Alfredo Dal'Ava Júnior (alfredo.junior_eldorado.org.br)
Differential Revision:	https://reviews.freebsd.org/D23233
2020-01-19 21:17:57 +00:00
Emmanuel Vadot
2de9b4d347 zilinx/zy7_qspi: Add a qspi driver for Zynq platforms.
This is a qspi driver for the Xilinx Zynq-7000 chip.
It could be useful for anyone wanting to boot a system from flash memory
instead of SD cards.

Submitted by:	Thomas Skibo (thomasskibo@yahoo.com)
Differential Revision:	https://reviews.freebsd.org/D14698
2020-01-19 20:04:44 +00:00
Emmanuel Vadot
f70961bf86 rk805: Add a regnode_init method
This method will set the desired voltaged based on values in the DTS.
It will not enable the regulator, this is the job of either a consumer
or regnode_set_constraint SYSINIT if the regulator is boot_on or always_on.

Reviewed by:	mmel
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D23216
2020-01-19 19:56:50 +00:00
Emmanuel Vadot
0fe5379c6a arm: allwinner: Add GPIO Interrupt support
Not all pins in Allwinner have interrupts support so we rely
on the padconf data to add the proper caps when pin_getcaps is called.
The pin is switch to the specific "eint" function during setup_intr and
switched back to its old function in teardown_intr.
Only INTR_MAP_DATA_GPIO is supported for now.

MFC after:	1 month
2020-01-19 19:51:20 +00:00
Emmanuel Vadot
d07cc22b30 arm: allwinner: Fix padconf for interrupts information
Add a eint_bank member to the allwinner_pins structure.
On Allwinner SoCs not all pins can do interrupt.
Older SoC (A10/A13 and A20) there is a maximum number of interrupts
set to 32 and all the configuration is done in the same registers.
While on "newer" SoCs (>=A31) interrupts registers are splitted per
pin bank (i.e. all interrupts available in bank B will be configured
with a sets of registers and the one in bank G in another set).
While here set the names to all interrupts function to
pX_eintY where X is the bank name and Y the interrupt number.

To whom ever in the future look at the H5 manual and notice that the bank F
have interrupts support : This isn't true, trust me.

MFC after:	1 month
2020-01-19 19:14:49 +00:00
Jeff Roberson
9c83ff2d86 It has not been possible to recursively terminate a vnode object for some time
now.  Eliminate the dead code that supports it.

Approved by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D22908
2020-01-19 18:36:03 +00:00
Jeff Roberson
98087a066f Make collapse synchronization more explicit and allow it to complete during
paging.

Shadow objects are marked with a COLLAPSING flag while they are collapsing with
their backing object.  This gives us an explicit test rather than overloading
paging-in-progress.  While split is on-going we mark an object with SPLIT.
These two operations will modify the swap tree so they must be serialized
and swap_pager_getpages() can now directly detect these conditions and page
more conservatively.

Callers to vm_object_collapse() now will reliably wait for a collapse to finish
so that the backing chain is as short as possible before other decisions are
made that may inflate the object chain.  For example, split, coalesce, etc.
It is now safe to run fault concurrently with collapse.  It is safe to increase
or decrease paging in progress with no lock so long as there is another valid
ref on increase.

This change makes collapse more reliable as a secondary benefit.  The primary
benefit is making it safe to drop the object lock much earlier in fault or
never acquire it at all.

This was tested with a new shadow chain test script that uncovered long
standing bugs and will be integrated with stress2.

Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D22908
2020-01-19 18:30:23 +00:00
Jeff Roberson
811d05fcb7 Provide an API for interlocked refcount sleeps.
Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D22908
2020-01-19 18:18:17 +00:00
Mateusz Guzik
28479aaae2 vfs: allow v_holdcnt to transition 0->1 without the interlock
Since r356672 ("vfs: rework vnode list management") there is nothing to do
apart from altering freevnodes count, but this much can be safely done based
on the result of atomic_fetchadd.

Reviewed by:	kib
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D23186
2020-01-19 17:47:04 +00:00
Mateusz Guzik
059cb4843b cache: counter_u64_add_protected -> counter_u64_add
Fixes booting on RISC-V where it does happen to not be equivalent.

Reported by:	lwhsu
2020-01-19 17:05:26 +00:00
Mateusz Guzik
1399033590 cache: convert numcachehv to counter(9) on 64-bit platforms 2020-01-19 05:37:27 +00:00
Mateusz Guzik
512fa9a4e0 vfs: plug a conditional assigment of lo_name in getnewvnode
It only matters for witness. No functional changes.
2020-01-19 05:36:45 +00:00
Kyle Evans
05d7dd739c sysent targets: further cleanup and deduplication
r355473 vastly improved the readability and cleanliness of these Makefiles.
Every single one of them follows the same pattern and duplicates the exact
same logic.

Now that we have GENERATED/SRCS, split SRCS up into the two parameters we'll
use for ${MAKESYSCALLS} rather than assuming a specific ordering of SRCS and
include a common sysent.mk to handle the rest. This makes it less tedious to
make sweeping changes.

Some default values are provided for GENERATED/SYSENT_*; almost all of these
just use a 'syscalls.master' and 'syscalls.conf' in cwd, and they all use
effectively the same filenames with an arbitrary prefix. Most ABIs will be
able to get away with just setting GENERATED_PREFIX and including
^/sys/conf/sysent.mk, while others only need light additions. kern/Makefile
is the notable exception, as it doesn't take a SYSENT_CONF and the generated
files are spread out between ^/sys/kern and ^/sys/sys, but it otherwise fits
the pattern enough to use the common version.

Reviewed by:	brooks, imp
Nice!:		emaste
Differential Revision:	https://reviews.freebsd.org/D23197
2020-01-18 20:37:45 +00:00
Andrew Gallatin
2052680238 pcpu_page_alloc: guard against empty NUMA domains
Some systems, such as higher end Threadripper, may have
NUMA domains with no physical memory, Don't allocate
from these domains.

This fixes a "panic: vm_wait in early boot" on my 2990WX desktop

Reviewed by:	jeff
Sponsored by:	Netflix
2020-01-18 18:25:37 +00:00
Eugene Grosbein
2888eb4091 ifa_maintain_loopback_route: adjust debugging output
Correction after r333476:

- write this as LOG_DEBUG again instead of LOG_INFO;
- get back function name into the message;
- error may be ESRCH if an address is removed in process (by carp f.e.),
not only ENOENT;
- expression complexity grows, so try making it more readable.

MFC after:	1 week
2020-01-18 04:48:05 +00:00
Brandon Bergren
ee628685e8 D23057: [PowerPC] Fix offset calculations in bridge mode
In rS354701, I replaced text relocations with offsets from &generictrap.

Unfortunately, the magic variable I was using doesn't actually mean the
address of &generictrap, in bridge mode it actually means &generictrap64.

So, for bridge mode to work, it is necessary to differentiate between
"where do we need to branch to to handle a trap" and "where is &generictrap
for purposes of doing relative math".

Introduce a new TRAP_ENTRY and use it instead of TRAP_GENTRAP for doing
actual calls to the generic trap handler.

Reported by:	Mark Millard <marklmi@yahoo.com>
Reviewed by:	jhibbits
Sponsored by:	Tag1 Consulting, Inc.
Differential Revision:	https://reviews.freebsd.org/D23057
2020-01-18 04:12:41 +00:00
Conrad Meyer
151e04b3fe GEOM label: strip leading/trailing space synthesizing devfs names
%20%20%20 is ugly and doesn't really help make human-readable devfs names.

PR:		243318
Reported by:	Peter Eriksson <pen AT lysator.liu.se>
Relnotes:	yes
2020-01-18 03:33:44 +00:00
Justin Hibbits
de8dd262c4 Add a 'SINGLETON' directive to kobj interface definition
Summary:
This makes the interface described in the definition file act like a
pseudo-IFUNC service, by caching the found method locally.

Applying this to the PowerPC MMU definitions, it yields a significant
(15-20%) performance improvement, seen in both a 'make buildworld' and a
parallel build of LLVM, on a POWER9 system.

Reviewed By:	imp
Differential Revision:	https://reviews.freebsd.org/D23245
2020-01-18 02:39:38 +00:00
Mateusz Guzik
2d0c620272 vfs: distribute freevnodes counter per-cpu
It gets rolled up to the global when deferred requeueing is performed.
A dedicated read routine makes sure to return a value only off by a certain
amount.

This soothes a global serialisation point for all 0<->1 hold count transitions.

Reviewed by:	jeff
Differential Revision:	https://reviews.freebsd.org/D23235
2020-01-18 01:29:02 +00:00
Justin Hibbits
490ebb8f35 powerpc: Fix the NUMA domain list on powernv
Summary:
Consolidate the NUMA associativity handling into a platform function.
Non-NUMA platforms will just fall back to the default (0).  Currently
only implemented for powernv, which uses a lookup table to map the
device tree associativity into a system NUMA domain.

Fixes hangs on powernv after r356534, and corrects a fairly longstanding
bug in powernv's NUMA handling, which ended up using domains 1 and 2 for
devices and memory on power9, while CPUs were bound to domains 0 and 1.

Reviewed by:	bdragon, luporl
Differential Revision:	https://reviews.freebsd.org/D23220
2020-01-18 01:26:54 +00:00
Brandon Bergren
432ff6eead [PowerPC] Fix Book-E direct map for >=16G ram on e5500
It turns out the maximum TLB1 page size on e5500 is 4G, despite the format
being defined for up to 1TB.

So, we need to clamp the DMAP TLB1 entries to not attempt to create 16G or
larger entries.

Fixes boot on my X5000 in which I just installed 16G of RAM.

Reviewed by:	jhibbits
Sponsored by:	Tag1 Consulting, Inc.
Differential Revision:	https://reviews.freebsd.org/D23244
2020-01-18 01:22:54 +00:00
Brandon Bergren
eabe020579 [PowerPC] Save a dword in the powerpc64 signal trampoline
In r291668, an instruction was added to sigcode64.S without the nop pad at
the end being taken out.

Due to alignment, this means that a dword is being wasted on the shared
page for no reason.

Take out this nop, and add some comments while I'm here.

Reviewed by:	jhibbits
Sponsored by:	Tag1 Consulting, Inc.
Differential Revision:	https://reviews.freebsd.org/D23055
2020-01-17 23:41:35 +00:00
Conrad Meyer
c1475a191e net80211: Move rate printing in amrr_node_stats() to a separate method
This makes amrr_node_stats() cleaner and allows the rate printing to be
reusable.

Submitted by:	Neel Chauhan <neel at neelc.org>
Reviewed by:	adrian
Differential Revision:	https://reviews.freebsd.org/D22318
2020-01-17 22:04:11 +00:00
Kyle Evans
ac64d0f1ea bcm2835_vcbus: unifdef all platform definitions
Raspberry Pi are all over the board, and the reality is that there's no harm
in including all of the definitions by default but plenty of harm in the
current situation. This change is safe because we match a definition by root
/compatible in the FDT, so there will be no false-positives because of it.

The main array of definitions grows, but it's only walked exactly once to
determine which we need to use.
2020-01-17 21:39:28 +00:00
John Baldwin
7f1c5f3604 Check for invalid sstatus values in set_mcontext().
Previously, this check was only in sys_sigreturn() which meant that
user applications could write invalid values to the register via
setcontext() or swapcontext().

Reviewed by:	mhorne
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D23219
2020-01-17 19:13:49 +00:00
John Baldwin
ffbf0e3ce4 Save and restore floating point registers in get/set_mcontext().
arm64 and riscv were only saving and restoring floating point
registers for sendsig() and sys_sigreturn(), but not for getcontext(),
setcontext(), and swapcontext().

While here, remove an always-false check for uap being NULL from
sys_sigreturn().

Reviewed by:	br, mhorne
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D23218
2020-01-17 19:01:59 +00:00
Mitchell Horne
7fdf7f9183 RISC-V: fix global pointer assignment at boot
As part of the RISC-V ABI, the gp register is expected to initialized
with the address of __global_pointer$ as early as possible. This allows
loads and stores from .sdata to be relaxed based on the value of gp. In
locore.S we do this initialization twice, once each for va and mpva.
However, in both cases the initialization is preceded by an la
instruction, which in theory could be relaxed by the linker.

Move the initialization of gp to be slightly earlier (before la
cpu_exception_handler), and add an additional gp initialization at the
very beginning of _start, before virtual memory is set up.

Reported by:	jrtc27
Reviewed by:	jrtc27
Differential Revision:	https://reviews.freebsd.org/D23139
2020-01-17 17:03:25 +00:00
Ruslan Bukin
36e33a8ef0 Use unsigned loads in fubyte, fuword16, generic_bs_r_1, generic_bs_r_2
as these functions should do zero-extend.

Discovered by running pci_read_cap(), and by hint from James Clarke.

Reviewed by:	James Clarke <jrtc27@jrtc27.com>
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D23236
2020-01-17 16:48:20 +00:00
Leandro Lupori
e07530d2df [PPC] Fix wrong comment
pcb_context[20] holds r12-r31 and not r14-r31, as the comment said.
2020-01-17 14:43:58 +00:00
Mateusz Guzik
d3cc535474 vfs: provide F_ISUNIONSTACK as a kludge for libc
Prior to introduction of this op libc's readdir would call fstatfs(2), in
effect unnecessarily copying kilobytes of data just to check fs name and a
mount flag.

Reviewed by:	kib (previous version)
Differential Revision:	https://reviews.freebsd.org/D23162
2020-01-17 14:42:25 +00:00
Mateusz Guzik
1ad72b270c vfs: shorten lock hold time in vdbatch_process 2020-01-17 14:39:00 +00:00
Gleb Smirnoff
66c6c556b6 Change argument order of epoch_call() to more natural, first function,
then its argument.

Reviewed by:	imp, cem, jhb
2020-01-17 06:10:24 +00:00
Jeff Roberson
5844774900 Fix a long standing bug that was made worse in r355765. When we are cowing a
page that was previously mapped read-only it exists in pmap until pmap_enter()
returns.  However, we held no reference to the original page after the copy
was complete.  This allowed vm_object_scan_all_shadowed() to collapse an
object that still had pages mapped.  To resolve this, add another page pointer
to the faultstate so we can keep the page xbusy until we're done with
pmap_enter().  Handle busy pages in scan_all_shadowed.  This is already done
in vm_object_collapse_scan().

Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D23155
2020-01-17 03:44:04 +00:00
Warner Losh
38b37b93d4 We only want to send the speedup to the lower layers when there's a shortage.
Only send a speedup when there's a shortage. While this is a little racy, lost
races aren't a big deal for this function. If there's a shorage just popping up
after we check these values, then we'll catch it next time. If there's a
shortage that's just clearing up, we may do some work at the lower layers a
little sooner than we otherwise would have. Sicne shortages are relatively rare
events, both races are acceptable.

Reviewed by: chs
Differential Revision: https://reviews.freebsd.org/D23182
2020-01-17 01:16:23 +00:00
Warner Losh
3cf5dd8401 Use buf to send speedup
It turns out there's a problem with using g_io to send the speedup. It leads to
a race when there's a resource shortage when a disk fails.

Instead, send BIO_SPEEDUP via struct buf. This is pretty straight forward,
except we need to transfer the bio_flags from b_ioflags for BIO_SPEEDUP commands
in g_vfs_strategy.

Reviewed by: kirk, chs
Differential Revision: https://reviews.freebsd.org/D23117
2020-01-17 01:16:19 +00:00
Warner Losh
8b522bdae6 Pass BIO_SPEEDUP through all the geom layers
While some geom layers pass unknown commands down, not all do. For the ones that
don't, pass BIO_SPEEDUP down to the providers that constittue the geom, as
applicable. No changes to vinum or virstor because I was unsure how to add this
support, and I'm also unsure how to test these. gvinum doesn't implement
BIO_FLUSH either, so it may just be poorly maintained. gvirstor is for testing
and not supportig BIO_SPEEDUP is fine.

Reviewed by: chs
Differential Revision: https://reviews.freebsd.org/D23183
2020-01-17 01:15:55 +00:00
Mateusz Guzik
e2fa68513e unionfs: use MNTK_NOMSYNC 2020-01-16 22:45:08 +00:00
Emmanuel Vadot
e82ba2c544 dwmmc: Remove max_hz from the softc
We never use it so directly set the value to the mmc host structure.
2020-01-16 21:50:53 +00:00
Mateusz Guzik
66f67d5e5e vfs: increment numvnodes without the vnode list lock unless under pressure
The vnode list lock is only needed to reclaim free vnodes or kick the vnlru
thread (or to block and not miss a wake up (but note the sleep has a timeout so
this would not be a correctness issue)). Try to get away without the lock by
just doing an atomic increment.

The lock is contended e.g., during poudriere -j 104 where about half of all
acquires come from vnode allocation code.

Note the entire scheme needs a rewrite, the above just reduces it's SMP impact.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D23140
2020-01-16 21:45:21 +00:00
Mateusz Guzik
b7f50b9ad1 vfs: refcator vnode allocation
Semantics are almost identical. Some code is deduplicated and there are
fewer memory accesses.

Reviewed by:	kib, jeff
Differential Revision:	https://reviews.freebsd.org/D23158
2020-01-16 21:43:13 +00:00
Emmanuel Vadot
bcd380e88b arm64: rockchip: Add RK3399 PWM driver
Add a driver for the pwm controller in the RK3399 SoC

Submitted by:	bdragon (original version)
Reviewed by:	ganbold (previous version)
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D19046
2020-01-16 21:25:13 +00:00
Emmanuel Vadot
3ee778c15e arm64: rockchip: Add new interface for rk_pinctrl
The gpio controller in rockchips SoC in a child of the pinctrl driver
and cannot control pullups and pulldowns.
Use the new fdt_pinctrl interface for accessing pin capabilities and
setting them.
We can now report that every pins is capable of being IN or OUT function
and PULLUP PULLDOWN.
If the pin isn't in gpio mode no changes will be allowed.

Reviewed by:	ganbold (previous version)
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D22849
2020-01-16 21:21:20 +00:00
Emmanuel Vadot
94292d7e95 fdt_pinctrl: Add new methods for gpios
Most of the gpio controller cannot configure or get the configuration
of the pin muxing as it's usually handled in the pinctrl driver.
But they can know what is the pinmuxing driver either because they are
child of it or via the gpio-range property.
Add some new methods to fdt_pinctrl that a pin controller can implement.
Some methods are :
fdt_pinctrl_is_gpio: Use to know if the pin in the gpio mode
fdt_pinctrl_set_flags: Set the flags of the pin (pullup/pulldown etc ...)
fdt_pinctrl_get_flags: Get the flags of the pin (pullup/pulldown etc ...)

The defaults method returns EOPNOTSUPP.

Reviewed by:	ian, bcr (manpages)
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D23093
2020-01-16 21:19:27 +00:00
Emmanuel Vadot
42d3cf0f82 regulator_fixed: Add a get_voltage method
Some consumer cannot know the voltage of the regulator without it.
While here, refuse to attach is min_voltage != max_voltage, it
shouldn't happens anyway.

Reviewed by:	mmel
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D23003
2020-01-16 20:52:26 +00:00
Emmanuel Vadot
2fbeda2aef arm: allwinner: ahci: target-supply is optional
The target-supply regulator is optional so don't fail if it's not present.
While here disable the clock on detach.

MFC after:	2 weeks
X-MFC-With:	356600
2020-01-16 20:19:20 +00:00
Kirill Ponomarev
bc6e80ddc1 Generate MAC address from the FreeBSD OUI range.
Submitted by:	aleksandr.fedorov_vstack_com
Approved by:	kevans
Differential Revision:	https://reviews.freebsd.org/D23168
2020-01-16 20:12:15 +00:00
Emmanuel Vadot
96c7e7f427 arm: allwinner: Add support for bank supply
Each GPIO bank is powered by a different pin and so can be powered at different
voltage from different regulators.
Add a new config that now hold the pinmux data and the banks available on each
SoCs.
Since the aw_gpio driver being also the pinmux one it's attached before the PMIC
so add a config_intrhook_oneshot function that will enable the needed regulators
when the system is fully functional.

MFC after:	2 weeks
2020-01-16 20:02:41 +00:00
Emmanuel Vadot
5fcd039b3f axp8xx: Add a regnode_init method
This method will set the desired voltaged based on values in the DTS.
It will not enable the regulator, this is the job of either a consumer
or regnode_set_constraint SYSINIT if the regulator is boot_on or always_on.

MFC after:	2 weeks
2020-01-16 19:59:00 +00:00
Emmanuel Vadot
194840a170 axp8xx: Add missing voltage regulators offset
This lead to writing the desired voltage value to the wrong register.

MFC after:	2 weeks
2020-01-16 19:57:38 +00:00