Commit Graph

138226 Commits

Author SHA1 Message Date
Edward Tomasz Napierala
616a676a05 cam: clear stack-allocated CCB in the target layer
Note that, as pointed out by scottl@, this code should really look
a bit different, in that the stack allocations should be replaced
with dynamic allocation, and the periph creation should be moved
to a context where one can use M_WAITOK.  See the review for more
details.  For now let's go with a minimal fix until we're done with
UMA CCBs.

Reviewed By:	mav, imp
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D30298
2021-07-21 10:18:28 +01:00
Hans Petter Selasky
0f8dafb458 Implement the SOUND_MIXER_WRITE_MUTE and SOUND_MIXER_READ_MUTE ioctl(9)s.
These two ioctls are not part of the current version of OSS and were
considered obsolete. However, their behaviour is not the same as their
old one, so this implementation is specific to FreeBSD.

Older OSS versions had the MUTE ioctls take and return an integer with
a value of 0 or 1, which meant that the _whole_ mixer is unmuted or
muted respectively. In my implementation, the ioctl takes and returns
a bitmask that tells us which devices are muted.

This allows us to mute and unmute only the devices we want, instead of the
whole mixer. The bitmask works the same way as in DEVMASK, RECMASK and
RECSRC.

Integrated the hardware volume feature with the new mute system.

Submitted by:	Christos Margiolis <christos@freebsd.org>
Differential Revision:	https://reviews.freebsd.org/D31130
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2021-07-21 11:10:30 +02:00
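The per-device mute bitmask described in 0f8dafb458 can be illustrated with a few bit helpers. This is only a sketch of the bitmask idea, not the driver code: the `DEV_*` indices and the `mute_*` helper names are hypothetical stand-ins, and a real program would pass the mask through the SOUND_MIXER_READ_MUTE and SOUND_MIXER_WRITE_MUTE ioctls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical device indices, standing in for the mixer device numbers. */
enum { DEV_VOLUME = 0, DEV_PCM = 4, DEV_MIC = 7 };

/* Return `mask` with device `dev` marked as muted. */
static uint32_t
mute_set(uint32_t mask, int dev)
{
	assert(dev >= 0 && dev < 32);
	return (mask | (UINT32_C(1) << dev));
}

/* Return `mask` with device `dev` marked as unmuted. */
static uint32_t
mute_clear(uint32_t mask, int dev)
{
	assert(dev >= 0 && dev < 32);
	return (mask & ~(UINT32_C(1) << dev));
}

/* Report whether device `dev` is muted in `mask`. */
static bool
mute_is_set(uint32_t mask, int dev)
{
	assert(dev >= 0 && dev < 32);
	return (((mask >> dev) & 1u) != 0);
}
```

Because each device owns one bit, muting the microphone leaves every other device's state untouched, which is the difference from the old whole-mixer 0/1 value.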
Jessica Clarke
8c439847f0 riscv: Include spibus and spigen in GENERIC
We already attempt to enable the SiFive SPI controller, but since spibus
isn't enabled it isn't actually built.

Reviewed by:	kp, philip
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31027
2021-07-21 06:46:09 +01:00
Jessica Clarke
4707bb0430 pci_dw: Detect number of outbound regions automatically
Currently we use the num-viewports property to decide how many outbound
regions there are we can use, defaulting to 2. However, Linux has
stopped using that and so it no longer appears in new device trees, such
as for the SiFive FU740. Instead, it's possible to just probe the
hardware directly.

Reviewed by:	mmel
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31030
2021-07-21 05:51:20 +01:00
Jessica Clarke
f240dfff22 pci_dw: Support modern "unroll" iATU mode
This supersedes the old legacy mode where a viewport register was used
to mux multiple regions behind a single set of registers, and is used on
the SiFive FU740.

Reviewed by:	mmel
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31029
2021-07-21 05:50:50 +01:00
Jessica Clarke
f8c1701f23 pci_dw: Support multiple memory windows
Currently we assume there is only one memory and one prefetch memory
window, and ignore the latter. However, the SiFive FU740 has two normal
memory windows.

As part of this, the viewports are rearranged. Previously the viewports
were memory, config then optionally I/O. Both to simplify the config
index calculation and to ensure it can always be mapped even if we have
too many memory windows for the number of viewports, config is moved to
being the first viewport.

This generalisation now also naturally supports mapping prefetch memory
windows.

Reviewed by:	mmel
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31028
2021-07-21 05:50:24 +01:00
Jessica Clarke
ade2ea3c45 riscv: Fix pindex level confusion
The pindex values are assigned from the L3 leaves upwards, meaning there
are NUL2E L3 tables and then NUL1E L2 tables (with a further NUL0E L1
tables in future when we implement Sv48 support). Therefore anything
below NUL2E is an L3 table's page and anything above or equal to NUL2E
is an L2 table's page (with the threshold of NUL2E + NUL1E marking the
start of the L1 tables' pages in Sv48). Thus all the comparisons and
arithmetic operations must use NUL2E to handle the L3/L2 allocation (and
thus L2/L1 entry) transition point, not NUL1E as all but pmap_alloc_l2
were doing.

To make matters confusing, the NUL1E and NUL2E definitions in the RISC-V
pmap are based on a 4-level page hierarchy but we currently use the
3-level Sv39 format (as that's the only required one, and hardware
support for the 4-level Sv48 is not widespread). This means that, in
effect, the above bug cancels out with the bloated NULxE definitions
such that things "work" (but are still technically wrong, and thus would
break when adding Sv48 support), with one exception. pmap_enter_l2 is
currently the only function to use the correct constant, but since
_pmap_alloc_l3 uses the incorrect constant, it will do complete nonsense
when it needs to allocate a new L2 table (which is rather rare). In this
instance, _pmap_alloc_l3, whilst it would correctly determine the pindex
was for an L2 table, would only subtract NUL1E when computing l1index
and thus go way out of bounds (by 511*512*512 bytes, or 127.75 MiB) of
its own L1 table and, thanks to pmap_distribute_l1, of every other
pmap's L1 table in the whole system. This has likely never been hit as
it would presumably instantly fault and panic.

Reviewed by:	markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31087
2021-07-21 02:51:26 +01:00
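The pindex layout described in ade2ea3c45 can be sketched as follows. The constants below are assumed 3-level (Sv39-style) counts chosen for illustration, and the helper names are hypothetical; the commit itself notes that the in-tree NUL1E and NUL2E definitions are based on a 4-level hierarchy, so this is not the pmap code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LN_ENTRIES	UINT64_C(512)		/* entries per page-table page */
/* Assumed illustrative counts, not the in-tree definitions. */
#define NUL1E		LN_ENTRIES		/* number of L2 tables */
#define NUL2E		(LN_ENTRIES * NUL1E)	/* number of L3 tables */

static_assert(NUL1E < NUL2E, "L3 table pages come first in pindex order");

/* pindex values below NUL2E name L3 table pages; the rest name L2 tables. */
static bool
pindex_is_l2_table(uint64_t pindex)
{
	return (pindex >= NUL2E);
}

/*
 * For an L2 table page, recover the index of the L1 entry pointing at it.
 * The transition point is NUL2E; subtracting NUL1E here is the bug the
 * commit message describes.
 */
static uint64_t
l1index_for_l2_table(uint64_t pindex)
{
	return (pindex - NUL2E);
}
```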
Jessica Clarke
a1f9cdb1ab sifive_uart: Fix input character dropping in ddb and at a mountroot prompt
These use the raw console interface and poll. Unfortunately, the SiFive
UART puts the FIFO empty bit inside the FIFO data register, which means
that the act of checking whether a character is available also dequeues
any character from the FIFO, requiring the user to press each key twice.
However, since we configure the watermark to be 0 and, when the UART has
been grabbed for the console, we have interrupts off, we can abuse the
interrupt pending register to act as a substitute for the FIFO empty
bit.

This perhaps suggests that the console interface should move from having
rxready and getc to having getc_nonblock and getc (or make getc take a
bool), as all the places that call rxready do so to avoid blocking on
getc when there is no character available.

Reviewed by:	kp, philip
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31025
2021-07-21 02:51:25 +01:00
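The polling problem described in a1f9cdb1ab can be modelled with a small simulated FIFO. This is a sketch under assumptions: `struct fake_uart` and all function names are hypothetical, and the bit positions (empty flag in bit 31 of the data register, receive watermark in bit 1 of the interrupt pending register) are assumed for illustration rather than taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RXDATA_EMPTY	(UINT32_C(1) << 31)	/* assumed: FIFO-empty flag */
#define IP_RXWM		(UINT32_C(1) << 1)	/* assumed: RX watermark pending */

struct fake_uart {
	uint8_t		fifo[8];
	unsigned	head;	/* index of the oldest byte */
	unsigned	count;	/* bytes currently queued */
};

static void
fake_uart_push(struct fake_uart *u, uint8_t c)
{
	assert(u->count < sizeof(u->fifo));
	u->fifo[(u->head + u->count) % sizeof(u->fifo)] = c;
	u->count++;
}

/* Reading the data register always dequeues, even when only polling. */
static uint32_t
read_rxdata(struct fake_uart *u)
{
	uint32_t v;

	if (u->count == 0)
		return (RXDATA_EMPTY);
	v = u->fifo[u->head];
	u->head = (u->head + 1) % sizeof(u->fifo);
	u->count--;
	return (v);
}

/* With the watermark at 0, the pending bit is set whenever data is queued. */
static uint32_t
read_ip(const struct fake_uart *u)
{
	return (u->count > 0 ? IP_RXWM : 0);
}

/* Problematic poll: the check itself consumes the character. */
static bool
rxready_via_rxdata(struct fake_uart *u)
{
	return ((read_rxdata(u) & RXDATA_EMPTY) == 0);
}

/* Side-effect-free poll using the interrupt pending register instead. */
static bool
rxready_via_ip(const struct fake_uart *u)
{
	return ((read_ip(u) & IP_RXWM) != 0);
}
```

In the model, a poll through the data register reports a character as ready and then finds the FIFO empty on the following read, which matches the "press each key twice" symptom.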
Jessica Clarke
4c4a6884ad cgem: Add support for the SiFive FU740
Note that currently Linux's device tree uses the FU540's compatible
string, as does upstream U-Boot, but the U-Boot shipped with the board
based on an older patch series has the correct FU740 name. Thankfully
they are the same, at least as far as software is concerned.

Whilst here, fix a style(9) nit.

Reviewed by:	philip, kp
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31034
2021-07-21 02:51:25 +01:00
Jessica Clarke
d9e85f2c6f riscv: Implement missing nexus methods
This is required for the SiFive FU740's PCIe controller. Copied from
arm64 with the only difference being changing pmap_mapdev_attr to
pmap_mapdev as riscv only has the latter.

Reviewed by:	mhorne
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31032
2021-07-21 02:51:25 +01:00
Jessica Clarke
e77ef47d36 geom_label: Partially reinstate old sysinstall(8) workaround
This partially reverts commit af433832f7.
Since such bogus disklabels still exist in the wild, we now probe for a
disklabel to decide whether to ignore the UFS partition or not; if there
is a label then we use the old behaviour, and if there isn't one then we
use the new behaviour.

Reviewed by:	cy, mckusick
Differential Revision:	https://reviews.freebsd.org/D31068
2021-07-21 02:51:25 +01:00
Kevin Bowling
b684d812fc arm: Bump KSTACK_PAGES default to match i386/amd64
See 3f6867ef63 for additional context.

It is also needed for OpenZFS performance and stability.

Reviewed by:	ian (arm), imp
Differential Revision:	https://reviews.freebsd.org/D31244
2021-07-20 18:35:54 -07:00
Kornel Duleba
986bbba9a6 arm/mv: Don't rely on firmware MSI mapping in ICU
On Armada8k boards various peripherals (e.g. USB) have interrupt lines
connected to one of the ICU interrupt controllers.
After an interrupt is detected it triggers MSI to a given address,
with a programmed value. This in turn triggers an SPI interrupt.
Normally the MSI vector should be allocated by the ICU's parent and set
during interrupt allocation.
Instead of doing that we relied on the ICU being pre-configured in firmware.
This worked with EDK2 and older versions of U-Boot, but in the newer
ones that is no longer the case.
Extend ICU msi-parents - GICP and SEI to support MSI interface
and use it during interrupt allocation.
This allows us to boot on Armada 7k/8k SoCs independent from the
firmware configuration and successfully use modern U-Boot + device tree.

For SATA interrupts we need to apply a workaround previously done in firmware.
We have two SATA ports connected to one controller.
Each port gets its own interrupt, but only one of them is
described in dts, and the ahci_generic driver expects only one irq too.
Fix it by mapping both interrupts to the same MSI when one of them
is allocated, which allows us to use both SATA ports.

Reviewed by: mmel, mw
Obtained from: Semihalf
Sponsored by: Marvell
Differential Revision: https://reviews.freebsd.org/D28803
2021-07-20 23:24:42 +02:00
Kristof Provost
32271c4d38 pf: clean up syncookie callout on vnet shutdown
Ensure that we cancel any outstanding callouts for syncookies when we
terminate the vnet.

MFC after:	1 week
Sponsored by:	Modirum MDPay
2021-07-20 21:13:25 +02:00
Kristof Provost
84db87b8da pf: remove stray debug line
MFC after:	1 week
Sponsored by:	Modirum MDPay
2021-07-20 21:13:22 +02:00
Mateusz Guzik
907257d696 pf: embed a pointer to the lock in struct pf_kstate
This shaves calculation which in particular helps on arm.

Note using the & hack instead would still be more work.

Reviewed by:	kp
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-07-20 16:11:31 +00:00
Kristof Provost
b972a7fa9e pf: fix LINT build
We failed to list the new pf_syncookies.c file in sys/conf/files. This
worked for the usual configurations, where pf is a module, but not for
LINT builds.

Reported by:	lwhsu
MFC after:	1 week
Sponsored by:	Modirum MDPay
2021-07-20 18:08:30 +02:00
Dmitry Chagin
75cb2382b8 linux(4): Factor out the futex_wait() op into linux_futex_wait().
MFC after:		2 weeks
2021-07-20 14:40:24 +03:00
Dmitry Chagin
ef4251e271 linux(4): Prevent an endless loop.
In futex_atomic_op(), encoded_op is a user-supplied parameter.
If the user specifies an incorrect value for this parameter paired with a valid
*uaddr parameter, the caller will go into an endless loop. To prevent this, check
the futex_atomic_op() result and break the loop in case of ENOSYS.

MFC after:		2 weeks
2021-07-20 14:40:08 +03:00
Dmitry Chagin
80b8d6b144 linux(4): Eliminate bogus comment.
There is no need for access checking here, as the caller must take care
of EFAULT handling. Moreover, this check would be superfluous, since EFAULT is
extremely rare, and we prefer the fast path.

MFC after:		2 weeks
2021-07-20 14:39:56 +03:00
Dmitry Chagin
cf8d74e3fe linux(4): Allow musl brand to use FUTEX_REQUEUE op.
Initial patch from submitter was adapted by me to prevent unconditional
FUTEX_REQUEUE use.

PR:			255947
Submitted by:		Philippe Michaud-Boudreault
Differential Revision:	https://reviews.freebsd.org/D30332
2021-07-20 14:39:20 +03:00
Dmitry Chagin
4c361d7a5a linux(4): Factor out the FUTEX_WAKE_OP op into linux_futex_wakeop().
MFC after:		2 weeks
2021-07-20 14:38:44 +03:00
Dmitry Chagin
bb62a91944 linux(4): Factor out the FUTEX_CMP_REQUEUE op into linux_futex_requeue().
MFC after:		2 weeks
2021-07-20 14:38:27 +03:00
Dmitry Chagin
19f7e2c2fb linux(4): Factor out the FUTEX_WAKE op into linux_futex_wake().
MFC after:		2 weeks
2021-07-20 14:38:05 +03:00
Dmitry Chagin
f6b0d275eb linux(4): Factor out the FUTEX_WAIT op into linux_futex_wait().
MFC after:		2 weeks
2021-07-20 14:37:51 +03:00
Dmitry Chagin
1866eef484 linux(4): Refactor the struct linux_futex_args.
Move flags and rtclock to the struct linux_futex_args. This will be used when
I split linux_futex() into separate futex op functions.

MFC after:		2 weeks
2021-07-20 14:37:37 +03:00
Andrew Turner
04f6015706 Split out the arm64 ID field comparison function
This will be useful in an update for finding which HWCAPS to set.

Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31200
2021-07-19 21:30:11 +00:00
Andrew Turner
b7a78d573a Start to clean up arm64 address space selection
On arm64 we should use bit 55 of the address to decide if an address
is a user or kernel address. Add a new macro with this check and a
second to ensure the address is in the canonical form, i.e.
the top bits are all zero or all one.

This will help with supporting future cpu features, including Top
Byte Ignore, Pointer Authentication, and Memory Tagging.

Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31179
2021-07-19 21:30:11 +00:00
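The two checks described in b7a78d573a can be sketched as plain C helpers. The names are hypothetical and the mask assumes 48-bit virtual addresses (so bits 63..48 must all agree); the macros actually added by the commit may be named and defined differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ADDR_KERNEL_BIT	(UINT64_C(1) << 55)
/* Assumption: 48-bit VAs, so the top 16 bits must be all zero or all one. */
#define ADDR_TOP_MASK	UINT64_C(0xffff000000000000)

static_assert((ADDR_TOP_MASK & ADDR_KERNEL_BIT) != 0,
    "bit 55 lies within the checked top bits");

/* Bit 55 selects the kernel (1) or user (0) half of the address space. */
static bool
addr_is_kernel(uint64_t va)
{
	return ((va & ADDR_KERNEL_BIT) != 0);
}

/* Canonical form: the top bits are all zero or all one. */
static bool
addr_is_canonical(uint64_t va)
{
	uint64_t top = va & ADDR_TOP_MASK;

	return (top == 0 || top == ADDR_TOP_MASK);
}
```

Keeping the two checks separate matters once the top byte can carry tags: an address may then select the kernel half via bit 55 without being in canonical form.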
Edward Tomasz Napierala
a40cf4175c Implement unprivileged chroot
This builds on recently introduced NO_NEW_PRIVS flag to implement
unprivileged chroot, enabled by `security.bsd.unprivileged_chroot`.
It allows non-root processes to chroot(2), provided they have the
NO_NEW_PRIVS flag set.

The chroot(8) utility gets a new flag, -n, which sets NO_NEW_PRIVS
before chrooting.

Reviewed By:	kib
Sponsored By:	EPSRC
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D30130
2021-07-20 08:57:53 +00:00
Kristof Provost
231e83d342 pf: syncookie ioctl interface
Kernel side implementation to allow switching between on and off modes,
and allow this configuration to be retrieved.

MFC after:	1 week
Sponsored by:	Modirum MDPay
Differential Revision:	https://reviews.freebsd.org/D31139
2021-07-20 10:36:13 +02:00
Kristof Provost
8e1864ed07 pf: syncookie support
Import OpenBSD's syncookie support for pf. This feature helps pf resist
TCP SYN floods by only creating states once the remote host completes
the TCP handshake rather than when the initial SYN packet is received.

This is accomplished by using the initial sequence numbers to encode a
cookie (hence the name) in the SYN+ACK response and verifying this on
receipt of the client ACK.

Reviewed by:	kbowling
Obtained from:	OpenBSD
MFC after:	1 week
Sponsored by:	Modirum MDPay
Differential Revision:	https://reviews.freebsd.org/D31138
2021-07-20 10:36:13 +02:00
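The cookie mechanism described in 8e1864ed07 can be illustrated with a toy model. This is a sketch of the general syncookie idea only: `struct flow`, the function names, and the FNV-style mixing step are all hypothetical, the mix is not cryptographically secure, and none of it reflects how pf or OpenBSD actually compute the cookie.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical connection identifier. */
struct flow {
	uint32_t	saddr, daddr;
	uint16_t	sport, dport;
};

/* Toy keyed mix over the flow and the client's initial sequence number. */
static uint32_t
make_cookie(const struct flow *f, uint32_t client_isn, uint32_t secret)
{
	uint32_t words[5];
	uint32_t h = UINT32_C(2166136261);
	size_t i;

	assert(f != NULL);
	words[0] = f->saddr;
	words[1] = f->daddr;
	words[2] = ((uint32_t)f->sport << 16) | f->dport;
	words[3] = client_isn;
	words[4] = secret;
	for (i = 0; i < sizeof(words) / sizeof(words[0]); i++) {
		h ^= words[i];
		h *= UINT32_C(16777619);
	}
	return (h);
}

/*
 * The SYN+ACK uses the cookie as its sequence number, so no state is kept.
 * The client's ACK carries seq = client_isn + 1 and ack = cookie + 1, which
 * is enough to recompute and verify the cookie.
 */
static bool
check_ack(const struct flow *f, uint32_t seq, uint32_t ack, uint32_t secret)
{
	return (ack - 1 == make_cookie(f, seq - 1, secret));
}
```

Only when `check_ack()` succeeds would a state be created, which is why a flood of bare SYN packets does not consume state entries.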
Kristof Provost
ee9c3d3803 pf: factor out pf_synproxy()
MFC after:	1 week
Sponsored by:	Modirum MDPay
Differential Revision:	https://reviews.freebsd.org/D31137
2021-07-20 10:36:13 +02:00
Navdeep Parhar
76c8902296 cxgbe(4): Initialize abs_id for ctrl and ofld queues.
MFC after:	1 week
Sponsored by:	Chelsio Communications
2021-07-20 00:54:13 -07:00
Kevin Bowling
9fd0cda92d e1000: Add missing branch prediction
I missed this edit from the ixgbe review (D30074)

Reviewed by:	gallatin
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D30073
2021-07-20 00:21:21 -07:00
Kevin Bowling
41f0225714 e1000: Clean up igb_txrx
The intention here is to reduce differences between em, igb, igc, ixgbe.

The main functional change is logical simplification in igb_rx_checksum
and getting interface caps from scctx instead of the ifp.

Reviewed by:	gallatin, markj
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D30073
2021-07-20 00:11:30 -07:00
Dmitry Chagin
2b38186330 Drop rdivacky@ "All rights reserved" from linux_event.
I got explicit permission from Roman.

Reviewed by:		imp
Differential Revision:	https://reviews.freebsd.org/D30913
MFC after:		2 weeks
2021-07-20 10:06:16 +03:00
Dmitry Chagin
1ca6b15bbd Drop "All rights reserved" from my copyright statements.
Add email and fixup years while here.

Reviewed by:		imp
Differential Revision:	https://reviews.freebsd.org/D30912
MFC after:		2 weeks
2021-07-20 10:05:50 +03:00
Dmitry Chagin
ae8330b448 linux(4): Add arch name to some printfs.
Reviewed by:		emaste
Differential revision:	https://reviews.freebsd.org/D30904
MFC after:		2 weeks
2021-07-20 10:05:08 +03:00
Dmitry Chagin
fe7409530c linprocfs: Fixup vDSO name in the procmaps after 9931033bbf.
As sv_shared_page_base now points to the native sharedpage and
the process VA layout has changed as follows:
VDSOPAGE	(2 * PAGE_SIZE)
SHAREDPAGE	(PAGE_SIZE)
USRSTACK
fixup the vDSO name by calculating the start of page relative to the
native sharedpage.

Differential revision:	https://reviews.freebsd.org/D30903
MFC after:		2 weeks
2021-07-20 10:04:20 +03:00
Dmitry Chagin
09cffde975 linux(4): Fixup the vDSO initialization order.
The vDSO initialisation order should be as follows:
- native abi init via exec_sysvec_init();
- vDSO symbols queued to the linux_vdso_syms list;
- linux_vdso_install();
- linux_exec_sysvec_init();

As exec_sysvec_init() is called with SI_ORDER_ANY (last) at SI_SUB_EXEC
order, move linux_vdso_install() and linux_exec_sysvec_init() to the
SI_SUB_EXEC+1 order.

Reviewed by:		trasz
Differential Revision:	https://reviews.freebsd.org/D30902
MFC after:		2 weeks
2021-07-20 10:02:34 +03:00
Dmitry Chagin
a543556c81 linux(4): Constify vdso install/deinstall.
In order to reduce the diff between arches, constify the vdso install/deinstall
functions as arm64 does.

Reviewed by:		emaste
Differential revision:	https://reviews.freebsd.org/D30901
MFC after:		2 weeks
2021-07-20 10:01:47 +03:00
Dmitry Chagin
9931033bbf linux(4): Almost complete the vDSO.
The vDSO (virtual dynamic shared object) is a small shared library that the
kernel maps R/O into the address space of all Linux processes on image
activation. The vDSO is a fully formed ELF image, shared by all processes
with the same ABI, and has no process private data.

The primary purpose of the vDSO:
- non-executable stack, signal trampolines not copied to the stack;
- signal trampolines unwind, mandatory for the NPTL;
- to avoid context-switch overhead, frequently used system calls can be
  implemented in the vDSO: for now gettimeofday, clock_gettime.

The first two have been implemented, so add the implementation of system
calls.

The system call implementation is based on native timekeeping code, with some
limitations:
- ifunc can't be used, as the vDSO is mapped r/o into the process VA and rtld
  can't relocate symbols;
- reading HPET memory is not implemented for now (TODO).

In case of any error, vDSO system calls fall back to the kernel system
calls. For unimplemented vDSO system calls, prototypes were added which call
the corresponding kernel system call.

Tested by:		trasz (arm64)
Differential revision:	https://reviews.freebsd.org/D30900
MFC after:		2 weeks
2021-07-20 10:01:18 +03:00
Dmitry Chagin
5fd9cd53d2 linux(4): Modify sv_onexec hook to return an error.
Temporarily add stubs to the Linux emulation layer which call the existing hook.

Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D30911
MFC after:		2 weeks
2021-07-20 09:56:25 +03:00
Dmitry Chagin
62ba4cd340 Call sv_onexec hook after the process VA is created.
For future use in the Linux emulation layer, call the sv_onexec hook right after
the new process address space is created. It's safe, as sv_onexec is used only
by the Linux abi and linux_on_exec() does not depend on the state of the process VA.

Reviewed by:		kib
Differential revision:	https://reviews.freebsd.org/D30899
MFC after:		2 weeks
2021-07-20 09:55:14 +03:00
Dmitry Chagin
b39fa4770d Remove bogus cast from exec_sysvec_init().
Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D30910
MFC after:		2 weeks
2021-07-20 09:54:09 +03:00
Dmitry Chagin
21629e2a45 Modify exec_sysvec_init() to allow non-native abi to setup their sysentvecs.
For future use in the Linux emulation layer modify the exec_sysvec_init()
to allow non-native abi to fill sv_timekeep_base and sv_shared_page_obj.

Reviewed by:		kib
Differential revision:	https://reviews.freebsd.org/D30898
MFC after:		2 weeks
2021-07-20 09:53:21 +03:00
Dmitry Chagin
815165be20 linux(4): Remove function prototypes from the vDSO.
In preparation for vDSO code revision get rid of incomplete vDSO methods
from locore, but leave .note.Linux section commented out.
The .note.Linux section is used by glibc rtld to get the kernel version, which
saves one system call. I'll try to implement it later, if I figure out
how to use it with jails.

MFC after:	2 weeks
2021-07-20 09:52:08 +03:00
Jessica Clarke
f221000127 elf: Remove R_RISCV_[GT]PREL_[IS] relocation defines
These were internal binutils relocations that have no way to be
generated in assembly nor will ever be seen in the output, and so should
never have been defined in the psABI in the first place. They have
therefore been removed from the spec as of [1], so do so here too.

[1] 44f98e0fd8
2021-07-20 06:13:43 +01:00
Rick Macklem
7685f8344d nfscl: Send stateid.seqid of 0 for NFSv4.1/4.2 mounts
For NFSv4.1/4.2, the client may set the "seqid" field of the
stateid to 0 in RPC requests.  This indicates to the server that
it should not check the "seqid" or return NFSERR_OLDSTATEID if the
"seqid" value is not up to date w.r.t. Open/Lock operations
on the stateid.  This "seqid" is incremented by the NFSv4 server
for each Open/OpenDowngrade/Lock/Locku operation done on the stateid.

Since a failure return of NFSERR_OLDSTATEID is of no use to
the client for I/O operations, it makes sense to set "seqid"
to 0 for the stateid argument for I/O operations.
This avoids server failure replies of NFSERR_OLDSTATEID,
although I am not aware of any case where this failure occurs.

This makes the FreeBSD NFSv4.1/4.2 client compatible with the
Linux NFSv4.1/4.2 client.

MFC after:	2 weeks
2021-07-19 17:35:39 -07:00
John Baldwin
b5e73dd952 cxgbei: Don't assert F for data completion PDUs.
If a data PDU encounters an error such as a digest error, the firmware
will report that data PDU when completion moderation is active even if
it is not the final data PDU in a burst.

Sponsored by:	Chelsio Communications
2021-07-19 15:36:31 -07:00