Commit Graph

148481 Commits

Author SHA1 Message Date
Mateusz Guzik
5eab523053 timerfd: compute fflags before calling falloc
While here dodge list locking in timerfd_adjust if empty.
2023-08-25 15:09:21 +00:00
Mateusz Guzik
02f534b57f timerfd: fix up a memory leak and missing locking
timerfd01 from ltp passes (and some others don't), but none of the tests
crash the kernel.

This is a bare minimum patch to fix up the immediate regression.

Reported by:	yasu
2023-08-25 14:46:48 +00:00
Simon J. Gerraty
1554ba03b6 Add mac_grantbylabel
This module allows controlled privilege escalation via mac labels
securely associated with a process via mac_veriexec.

There are over 700 PRIV_* but we can compress many of them into
a single GBL_* thus constraining the size of gbl labels.

The goal is to allow a daemon to run as an unprivileged process while
still being able to perform the set of privileged operations it needs.

We add APIs to libveriexec so that userland processes can check labels
and an exec_script API that allows a suitably labeled process to run
something like a python interpreter directly if necessary;
overcoming the 'indirect' flag applied to the interpreter.

Add -l option to sbin/veriexec to report labels.

Reviewed by:	stevek
Sponsored by:	Juniper Networks, Inc.
Differential Revision:	https://reviews.freebsd.org/D41431
2023-08-24 17:42:11 -07:00
Glen Barber
aee253d8a7 update main to 15
Approved by:	re (implicit)
Sponsored by:	GoFundMe https://www.gofundme.com/f/gjbbsd
Sponsored by:	PayPal https://paypal.me/gjbbsd
2023-08-24 19:10:35 -04:00
Mateusz Guzik
712806fc4b vfs: retried++ -> retried = true for the boolean
No real changes.

Noted by:	rpokala
2023-08-24 22:50:31 +00:00
Stephen J. Kiernan
30cdbb5833 freebsd32: Remove mac_syscall from the unimpl list
The mac_syscall system call works fine as long as any MAC module
that provides a mpo_syscall method handles compat32 appropriately.

Regenerate system call files for freebsd32.

Reviewed by:	sjg
Obtained from:	Juniper Networks, Inc.
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D41575
2023-08-24 18:45:31 -04:00
Mateusz Guzik
c1d85ac3df vfs: try harder to find free vnodes when recycling
The free vnode marker can slide past eligible entries.

Artificially reducing vnode limit to 300k and spawning 104 workers each
creating a million files results in all of them trying to recycle, which
often fails when it should not have to.

Because of the excessive traffic in this scenario, the trylock to
requeue is virtually guaranteed to fail, meaning nothing gets pushed
forward.

Since no vnodes were found, the most unfortunate sleep for 1 second is
induced (see vn_alloc_hard, the "vlruwk" msleep).

Without the fix the machine is mostly idle with almost everyone stuck
off CPU waiting for the sleep to finish. With the fix it is busy
creating files.

Unrelated to the above problem the marker could have landed in a
similarly problematic spot because of any failure in vtryrecycle.

Originally reported as poudriere builders stalling in a vnode-count
restricted setup.

Fixes:	138a5dafba ("vfs: trylock vnode requeue")
Reported by:	Mark Millard
2023-08-24 22:12:40 +00:00
John Hall
7ea28254ec smartpqi: update to version 4410.0.2005
This updates the smartpqi driver to Microsemi's latest code. This will
be the driver for FreeBSD 14 (with updates), but no MFC is planned.

Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D41550
2023-08-24 15:25:09 -06:00
Kevin Bowling
725e4008ef iflib: invert default restart on VLAN changes
In rS360398, a new iflib device method was added to opt out of VLAN
events needing an interface reset.

I am switching the default to not requiring a restart for:
* VLAN events
* unknown events

After fixing various bugs, I do not think this would be a common need
of hardware and it is undesirable from the user's perspective causing
link flaps and much slower VLAN configuration. Currently, there are no
other restart events besides VLAN events, and setting the
ifdi_needs_restart default to false will alleviate the need to churn
every driver if an odd event is added in the future for specific
hardware.

markj points out this could cause churn in the other direction; I will
solve that problem with an event registration system as he mentions in
the review should we need it in the future.

These drivers will opt into restart and need further inspection or work:
* ixv (needs code audit, 61a8231 fixed principal issue; re-init probably
not necessary)
* axgbe (needs code audit; re-init probably not necessary)
* iavf - (needs code audit; interaction with Malicious Driver Detection
mentioned in rS360398)
* mgb - no VLAN functions are currently implemented. Left a comment.

MFC after:	2 weeks
Sponsored by:	BBOX.io
Differential Revision:	https://reviews.freebsd.org/D41558
2023-08-24 13:48:19 -07:00
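The inverted default described in the commit above can be sketched in plain C. This is a minimal illustration, not iflib's code: the enum, the struct, and every function name below are hypothetical stand-ins for the ifdi_needs_restart method and its events, and the opt-in driver's answer for non-VLAN events is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event codes; the commit says only VLAN events exist today. */
enum restart_event { EVENT_VLAN_CONFIG, EVENT_UNKNOWN };

/* Hypothetical per-driver hook; NULL means "use the framework default". */
typedef bool (*needs_restart_fn)(enum restart_event);

struct driver {
	const char *name;
	needs_restart_fn needs_restart;
};

/* New default: no event forces an interface restart. */
static bool
default_needs_restart(enum restart_event ev)
{
	(void)ev;
	return (false);
}

/* A driver that opts back in to a restart on VLAN changes. */
static bool
optin_needs_restart(enum restart_event ev)
{
	switch (ev) {
	case EVENT_VLAN_CONFIG:
		return (true);
	default:
		return (false);	/* assumption: other events need no restart */
	}
}

/* Dispatch: prefer the driver's own answer, else fall back to the default. */
static bool
driver_needs_restart(const struct driver *drv, enum restart_event ev)
{
	if (drv->needs_restart != NULL)
		return (drv->needs_restart(ev));
	return (default_needs_restart(ev));
}
```

With this shape, a driver that does nothing gets no restart on any event, and only drivers that install their own hook pay for a re-init.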
Kevin Bowling
14a14e36ae ice: Don't restart on VLAN changes
In rS360398, a new iflib device method was added with default of opt out
for VLAN events needing an interface reset.

This re-init is unnecessary for ice(4).

MFC after:	2 weeks
Sponsored by:	BBOX.io
Differential Revision:	https://reviews.freebsd.org/D41558
2023-08-24 13:46:57 -07:00
Kevin Bowling
1d6c12c511 iavf: Add explicit ifdi_needs_reset for VLAN changes
In rS360398, a new iflib device method was added with default of opt out
for VLAN events needing an interface reset.

iavf(4) was the original need for this, because VLAN filter changes
currently have negative interactions with Malicious Driver Detection.

Add iavf_if_needs_restart and explicitly enable VLAN re-init.

MFC after:	2 weeks
Sponsored by:	BBOX.io
Differential Revision:	https://reviews.freebsd.org/D41558
2023-08-24 13:46:56 -07:00
Kevin Bowling
fe6c4e214d enic: Don't restart on VLAN changes
In rS360398, a new iflib device method was added with default of opt out
for VLAN events needing an interface reset.

This re-init is unintentional for enic(4).

MFC after:	2 weeks
Sponsored by:	BBOX.io
Differential Revision:	https://reviews.freebsd.org/D41558
2023-08-24 13:46:56 -07:00
Kevin Bowling
b6b75424c5 vmxnet3: Don't restart on VLAN changes
In rS360398, a new iflib device method was added with default of opt out
for VLAN events needing an interface reset.

This re-init is unintentional for vmxnet3(4).

MFC after:	2 weeks
Sponsored by:	BBOX.io
Differential Revision:	https://reviews.freebsd.org/D41558
2023-08-24 13:46:56 -07:00
Kevin Bowling
f9e0a790ae enetc: Don't restart on VLAN changes
In rS360398, a new iflib device method was added with default of opt out
for VLAN events needing an interface reset.

This re-init is unintentional for enetc(4).

MFC after:	2 weeks
Sponsored by:	BBOX.io
Differential Revision:	https://reviews.freebsd.org/D41558
2023-08-24 13:46:56 -07:00
Kevin Bowling
bce864d1c2 bnxt: Don't restart on VLAN changes
In rS360398, a new iflib device method was added with default of opt out
for VLAN events needing an interface reset.

This is unintentional for bnxt(4) and is causing another bug in its VLAN
initialization code to affect the common case of adding and removing
VLANs on an existing interface.

PR:		269133
Tested by:	kp
MFC after:	2 weeks
Sponsored by:	BBOX.io
Differential Revision:	https://reviews.freebsd.org/D41558
2023-08-24 13:46:56 -07:00
Jake Freeland
af93fea710 timerfd: Move implementation from linux compat to sys/kern
Move the timerfd implementation from linux compat code to sys/kern. Use
it to implement the new system calls for timerfd. Add a hook to kern_tc
to allow timerfd to know when the system time has stepped. Add kqueue
support to timerfd. Adjust a few names to be less Linux centric.

RelNotes: YES
Reviewed by: markj (on irc), imp, kib (with reservations), jhb (slack)
Differential Revision: https://reviews.freebsd.org/D38459
2023-08-24 14:28:56 -06:00
Michael Tuexen
847fa61fad sctp: improve handling of socket shutdown for reading
If a socket is marked as cannot read anymore, drop chunks which
should be added to a control element in the receive queue.
This is consistent with dropping control elements instead of
adding them in the same situation.

Reported by:	syzbot+291f6581cecb77097b16@syzkaller.appspotmail.com
MFC after:	1 week
2023-08-24 15:52:55 +02:00
Kajetan Staszkiewicz
d10de21f2f pf: Access r->rpool.cur->kif under mutex protection
pf_route() sends traffic to a specified next hop over a specific
interface. The next hop is obtained in pf_map_addr() but the interface
is obtained directly via r->rpool.cur->kif outside of the lock held in
pf_map_addr() in multiple places around pf. The chosen interface is not
stored in source node.

Move the interface selection into pf_map_addr(), have the function
return it together with the chosen IP address and ensure it's stored
in struct pf_ksrc_node, store it in the source node and use the stored
value when needed.

Sponsored by:	InnoGames GmbH
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D41570
2023-08-24 13:05:33 +02:00
Warner Losh
d9fee1d021 cam/scsi_da: Bump deprecation one release.
These are still used in a quick poll that I've done, so we can't remove
them in 14. Reset the removal to FreeBSD 15.

Sponsored by:		Netflix
2023-08-23 22:34:41 -06:00
John Baldwin
0677f5ccbb cxgbe ddp: Trim stale function prototype
Sponsored by:	Chelsio Communications
2023-08-23 14:30:16 -07:00
Emmanuel Vadot
0fd310c83c arm64: Remove duplicate fdt build option
Reported by:	andrew
Sponsored by:	Beckhoff Automation GmbH & Co. KG
2023-08-23 20:18:38 +02:00
Andrew Turner
7d2dd08d01 gicv3: Add checks for the device ID
Add checks that the device ID is supported by the hardware and is
within the range allocated when the driver attaches.

Reviewed by:	gallatin, imp
Sponsored by:	Arm Ltd
Differential Revision:	https://reviews.freebsd.org/D41554
2023-08-23 17:38:20 +01:00
Andrew Turner
629734783d gicv3: Add a verbose message for unknown tables
Add a message under bootverbose when we find a gicv3 its table type
that is unknown.

Reviewed by:	gallatin, imp
Sponsored by:	Arm Ltd
Differential Revision:	https://reviews.freebsd.org/D41553
2023-08-23 17:38:20 +01:00
Andrew Turner
2f11b2abfc gicv3: Stop setting the esize field
The GITS_BASER esize field is read-only, there is no need to change it.

Reviewed by:	gallatin, imp
Sponsored by:	Arm Ltd
Differential Revision:	https://reviews.freebsd.org/D41552
2023-08-23 17:38:20 +01:00
Andrew Turner
43d74fcac0 Revert "gicv3: Stop setting the esize field"
This reverts commit 47a4b8ca96.

It has the wrong differential review link
2023-08-23 17:38:20 +01:00
Andrew Turner
b9cdb04f4e Revert "gicv3: Add a verbose message for unknown tables"
This reverts commit 7f9694ad7e.

It has the wrong differential review link
2023-08-23 17:38:20 +01:00
Andrew Turner
160919c864 Revert "gicv3: Add checks for the device ID"
This reverts commit 950421e231.

It has the wrong differential review link
2023-08-23 17:38:20 +01:00
Andrew Turner
676386b556 Support dynamically sized register sets
We don't always know the size of the register set at compile time,
e.g. on arm64 the size of the SVE registers needs to be queried on boot.
To support register sets whose size needs to be calculated at run time,
query the correct size when it is zero.

Reviewed by:	markj, kib (earlier version)
Sponsored by:	Arm Ltd
Differential Revision:	https://reviews.freebsd.org/D41302
2023-08-23 15:32:56 +01:00
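The "query the size when it is zero" rule from the commit above can be sketched as follows. This is an illustration under stated assumptions: struct regset_desc, its fields, and the query function are hypothetical names, and the sizes are placeholder values, not real register-set sizes.

```c
#include <stddef.h>

/* Hypothetical register-set descriptor. */
struct regset_desc {
	const char *name;
	size_t size;			/* 0: size unknown at compile time */
	size_t (*query_size)(void);	/* run-time size query, may be NULL */
};

/* Return the static size, or ask at run time when the static size is 0. */
static size_t
regset_size(const struct regset_desc *rs)
{
	if (rs->size == 0 && rs->query_size != NULL)
		return (rs->query_size());
	return (rs->size);
}

/* Hypothetical run-time query; 64 is a placeholder, not a real SVE size. */
static size_t
example_dynamic_size(void)
{
	return (64);
}
```

A descriptor with a non-zero size never calls its query function, so existing fixed-size register sets behave exactly as before.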
Andrew Turner
950421e231 gicv3: Add checks for the device ID
Add checks that the device ID is supported by the hardware and is
within the range allocated when the driver attaches.

Reviewed by:	gallatin, imp
Sponsored by:	Arm Ltd
Differential Revision:	https://reviews.freebsd.org/D41551
2023-08-23 15:29:34 +01:00
Andrew Turner
7f9694ad7e gicv3: Add a verbose message for unknown tables
Add a message under bootverbose when we find a gicv3 its table type
that is unknown.

Reviewed by:	gallatin, imp
Sponsored by:	Arm Ltd
Differential Revision:	https://reviews.freebsd.org/D41551
2023-08-23 15:29:34 +01:00
Andrew Turner
47a4b8ca96 gicv3: Stop setting the esize field
The GITS_BASER esize field is read-only, there is no need to change it.

Reviewed by:	gallatin, imp
Sponsored by:	Arm Ltd
Differential Revision:	https://reviews.freebsd.org/D41551
2023-08-23 15:29:34 +01:00
Andrew Turner
3fc4f7c880 gicv3: Split out finding the page size
When adding indirect (2 level) tables we will need to know the page
size to calculate the size of the level 1 table. To allow for this find
the page size before entering the loop to calculate the final register
value.

Reviewed by:	gallatin, imp
Sponsored by:	Arm Ltd
Differential Revision:	https://reviews.freebsd.org/D41551
2023-08-23 15:29:34 +01:00
Piotr Kubaj
0834f13da9 iavf: remove compatibility code and address some warnings
Code for pre-11 FreeBSD versions is removed.
Also removed are macros that are no longer used, and the "i" variable
no longer shadows another "i" variable.

Differential Revision: https://reviews.freebsd.org/D41547
Approved by:	erj
2023-08-23 14:48:11 +02:00
Zhenlei Huang
838c8c4786 net: Do not overwrite if_vlan's PCP
In commit c7cffd65c5 the function ether_8021q_frame() was slightly
refactored to use pointer of struct ether_8021q_tag as parameter qtag to
include the new option proto.

It is wrong to write to qtag->pcp as it will effectively change the memory
that qtag points to. Unfortunately the transmit routine of if_vlan passes
a pointer to the member ifv_qtag of its softc, which stores the vlan
interface's PCP internally, so when transmitting mbufs that contain a PCP
the vlan interface's PCP will get overwritten.

Fix by operating on a local copy of qtag->pcp. Also mark 'struct ether_8021q_tag'
as const so that compilers can pick up this kind of bug.

PR:	273304
Reviewed by:	kp
Fixes:	c7cffd65c5 Add support for stacked VLANs (IEEE 802.1ad, AKA Q-in-Q)
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D39505
2023-08-23 17:53:48 +08:00
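The bug pattern and the fix in the commit above can be modelled outside the kernel. The sketch below is hypothetical throughout: struct qtag_model and both functions are illustrative stand-ins, not the real struct ether_8021q_tag or ether_8021q_frame(). The only fact taken from elsewhere is the IEEE 802.1Q TCI layout (3-bit PCP in the top bits, 12-bit VLAN ID in the low bits).

```c
#include <stdint.h>

/* Hypothetical stand-in for the tag kept in the vlan interface's softc. */
struct qtag_model {
	uint16_t vid;
	uint8_t pcp;
};

/*
 * Buggy shape: a per-packet PCP is written through the pointer, so the
 * interface-wide PCP is silently overwritten.
 */
static uint16_t
build_tci_buggy(struct qtag_model *qt, int pkt_has_pcp, uint8_t pkt_pcp)
{
	if (pkt_has_pcp)
		qt->pcp = pkt_pcp;
	return ((uint16_t)(((qt->pcp & 0x7) << 13) | (qt->vid & 0x0fff)));
}

/*
 * Fixed shape: operate on a local copy. The const qualifier makes the
 * compiler reject any write through qt.
 */
static uint16_t
build_tci_fixed(const struct qtag_model *qt, int pkt_has_pcp, uint8_t pkt_pcp)
{
	uint8_t pcp = qt->pcp;

	if (pkt_has_pcp)
		pcp = pkt_pcp;
	return ((uint16_t)(((pcp & 0x7) << 13) | (qt->vid & 0x0fff)));
}
```

Both functions produce the same tag for a given packet; the difference is that only the buggy one leaves the caller's structure changed afterwards.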
Michael Tuexen
d18c845f99 sctp: improve handling of SHUTDOWN and SHUTDOWN ACK chunks
When handling a SHUTDOWN or SHUTDOWN ACK chunk detect if the peer
is violating the protocol by not having made sure all user messages
are received by the peer. If this situation is detected, abort the
association.

MFC after:	1 week
2023-08-23 08:36:15 +02:00
Emmanuel Vadot
5fb94d0e16 arm64: xilinx: dwc3: Fix reset names
Use the correct resets and not the same one three times in a row.

Reported by:	rpokala
Sponsored by:	Beckhoff Automation GmbH & Co. KG
2023-08-23 09:42:40 +02:00
Emmanuel Vadot
29bfcb3a28 arm64: xilinx: Add glue driver for usb3 controller
Like other dwc3 controllers, on Xilinx ZynqMP the base node is just here
to provide resets, the main dwc3 controller node is a child node.

Sponsored by:	Beckhoff Automation GmbH & Co. KG
2023-08-23 09:00:05 +02:00
Konstantin Belousov
c7df872096 Regen
2023-08-23 03:02:21 +03:00
Konstantin Belousov
4a69fc16a5 Add membarrier(2)
This is an attempt at a clean-room implementation of the Linux
membarrier(2) syscall.  For documentation, you would need to read
the Linux membarrier(2) man page and the comments in the Linux
kernel/sched/membarrier.c implementation, and possibly look at
actual uses.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32360
2023-08-23 03:02:21 +03:00
Konstantin Belousov
74ccb8ecf6 Add cpu_sync_core()
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32360
2023-08-23 03:02:21 +03:00
Konstantin Belousov
8882b7852a add pmap_active_cpus()
For amd64, i386, arm, and riscv, i.e. all architectures except arm64,
the custom implementation is provided since we maintain the bitmask of
active CPUs anyway.

Arm64 uses a somewhat naive iteration over CPUs and matches the current
vmspace's pmap with the argument. It is not guaranteed that vmspace->pmap
is the same as the active pmap, but the inaccuracy should be tolerable.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32360
2023-08-23 03:02:21 +03:00
Bartosz Sobczak
c7f73a1588 ofed: mask seq_num identifier to occupy only 3 bytes
The seq_num among other things is used to assign rq_psn value, which is
a 24-bit identifier.  When the seq_num is a full 4-byte value, we are
usually receiving: '_ib_modify_qp rq_psn overflow, masking to 24 bits'
warning.

This is burdensome for running rdma traffic with a large number of
connections, because the number of logs is growing fast.

Signed-off-by: Bartosz Sobczak <bartosz.sobczak@intel.com>
Signed-off-by: Eric Joyner <erj@FreeBSD.org>

Reviewed by:	kib@, erj@
MFC after:	3 days
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D41531
2023-08-22 16:09:13 -07:00
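The masking described in the commit above amounts to keeping the low 24 bits of a 32-bit value. A minimal sketch, assuming hypothetical names (PSN_MASK and seq_num_to_psn are not taken from the ofed sources):

```c
#include <stdint.h>

/* A PSN is 24 bits wide, so only the low three bytes may be set. */
#define PSN_MASK	0x00ffffffu

/* Truncate a 32-bit sequence number so it fits in a 24-bit PSN field. */
static uint32_t
seq_num_to_psn(uint32_t seq_num)
{
	return (seq_num & PSN_MASK);
}
```

Masking once where the value is generated means later consumers never see more than 24 bits, so the overflow warning is not triggered.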
Jessica Clarke
c9b2751d76 arm: Add missing no-ctfconvert for fw_stub.awk target
This target produces a C file not an object file, so using ctfconvert on
it should not be attempted. This keeps it in sync with all other uses of
fw_stub.awk, squashes a warning seen during the build of TEGRA124 on
FreeBSD and avoids the same issue failing the build on non-FreeBSD (such
errors are #ifdef'ed into being warnings on FreeBSD in ctfconvert, which
should be revisited in the future).

Reviewed by:	manu
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D41542
2023-08-22 21:00:37 +01:00
Marius Strobl
dc485b968d tcp_info: Add and export more FreeBSD-specific fields
This change adds struct tcp_info fields corresponding to the following
struct tcpcb ones:
- snd_una
- snd_max
- rcv_numsacks
- rcv_adv
- dupacks

Note that while both tcp_fill_info() and fill_tcp_info_from_tcb() are
extended accordingly, no counterpart of rcv_numsacks is available in
the cxgbe(4) TOE PCB, though.

Sponsored by:	NetApp, Inc. (originally)
2023-08-22 20:34:01 +02:00
Marius Strobl
8c6104c48e tcp_fill_info(): Change lock assertion on INPCB to locked only
This function actually only ever reads from the TCP PCB. Consequently,
also make the pointer to its TCP PCB parameter const.

Sponsored by:	NetApp, Inc. (originally)
2023-08-22 20:33:49 +02:00
Kristof Provost
949491f2a6 if_ovpn: clear mbuf flags on rx
When we receive a packet and remove the encapsulating layer we should
also clear out protocol flags and any mbuf tags.

If we do not we risk confusing firewalls filtering the tunneled packet.

See also: 	https://redmine.pfsense.org/issues/14682#change-69073
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2023-08-22 20:30:11 +02:00
Zhenlei Huang
c941b82e1c geom_linux_lvm: Check the offset of physical volume header
The LVM label is stored on any of the first four sectors, and the
PV (physical volume) header is stored within the same sector following
the LVM label. The current implementation does not fully check the
offset of the PV header, so when attaching a badly formatted LVM PV the
kernel may crash due to an out-of-bounds memory read.

PR:	266562
Reviewed by:	jhb
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D36773
2023-08-22 17:20:10 +08:00
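The kind of bounds check the commit above describes can be sketched as follows. All three constants and the function name are assumptions made for illustration; they are not taken from geom_linux_lvm, and the real driver may validate different limits.

```c
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SIZE	512u	/* assumed sector size */
#define LABEL_HDR_LEN	32u	/* assumed size of the LVM label header */
#define PV_HDR_LEN	40u	/* assumed minimum size of the PV header */

/*
 * The PV header must start after the label header and must end inside
 * the same sector; anything else would read outside the sector buffer.
 */
static bool
pv_header_offset_ok(uint32_t offset)
{
	if (offset < LABEL_HDR_LEN)
		return (false);
	if (offset > SECTOR_SIZE - PV_HDR_LEN)
		return (false);
	return (true);
}
```

Comparing against SECTOR_SIZE - PV_HDR_LEN rather than adding PV_HDR_LEN to the offset avoids an integer overflow when the on-disk offset is close to UINT32_MAX.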
John Baldwin
7ccaf76a27 riscv db_trace: Ensure trapframe pointer is suitably aligned.
Suggested by:	jrtc27
Reviewed by:	jrtc27
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D41534
2023-08-21 21:00:26 -07:00
Warner Losh
682d5a87e5 Delete trailing whitespace from $FreeBSD$ removal
Fixes: d4bf8003ee
Sponsored by: Netflix
2023-08-21 19:37:28 -06:00
Ed Maste
4258eb5a0d x86: handle domains with no CPUs usable for intr delivery
We can end up with a domain having no CPUs capable of receiving I/O
interrupts.  This can occur, for example, when all APIC IDs in a given
domain are 256 or greater, and we have no IOMMU.

In this case disable per-domain interrupt support, effectively reverting
to the behaviour before commit a48de40bcc ("Only use CPUs in the
domain the device is attached to for default").  This has a performance
impact but at least allows the system to be functional.  It is a stop-
gap until we can rely on the presence of an IOMMU on all x86 platforms.

Thanks to AMD for providing the high-thread-count machine I used for
testing this change, and to cperciva for testing on other hardware.

Reviewed by:	jhb
Tested by:	cperciva, emaste
Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D41501
2023-08-21 15:52:10 -04:00