timerfd01 from ltp passes (and some other don't), but none of the tests
crash the kernel.
This is a bare minimum patch to fix up the immediate regression.
Reported by: yasu
This module allows controlled privilege escallation via mac labels
securely associated with a process via mac_veriexec.
There are over 700 PRIV_* but we can compress many of them into
a single GBL_* thus constraining the size of gbl labels.
The goal is to allow a daemon to run as an unprivileged process while
still being able a set of privileged operations needed.
We add APIs to libveriexec so that userland processes can check labels
and an exec_script API that allows a suitably labeled process to run
something like a python interpreter directly if necessary;
overcomming the 'indirect' flag applied to the interpreter.
Add -l option to sbin/veriexec to report labels.
Reviewed by: stevek
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D41431
The mac_syscall system call works fine as long as any MAC module
that provides a mpo_syscall method handles compat32 appropriately.
Regenerate system call files for freebsd32.
Reviewed by: sjg
Obtained from: Juniper Networks, Inc.
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D41575
The free vnode marker can slide past eligible entries.
Artificially reducing vnode limit to 300k and spawning 104 workers each
creating a million files results in all of them trying to recycle, which
often fails when it should not have to.
Because of the excessive traffic in this scenario, the trylock to
requeue is virtually guaranteed to fail, meaning nothing gets pushed
forward.
Since no vnodes were found, the most unfortunate sleep for 1 second is
induced (see vn_alloc_hard, the "vlruwk" msleep).
Without the fix the machine is mostly idle with almost everyone stuck
off CPU waiting for the sleep to finish. With the fix it is busy
creating files.
Unrelated to the above problem the marker could have landed in a
similarly problematic spot for because of any failure in vtryrecycle.
Originally reported as poudriere builders stalling in a vnode-count
restricted setup.
Fixes: 138a5dafba ("vfs: trylock vnode requeue")
Reported by: Mark Millard
This updates the smartpqi driver to Microsemi's latest code. This will
be the driver for FreeBSD 14 (with updates), but no MFC is planned.
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D41550
In rS360398, a new iflib device method was added to opt out of VLAN
events needing an interface reset.
I am switching the default to not requiring a restart for:
* VLAN events
* unknown events
After fixing various bugs, I do not think this would be a common need
of hardware and it is undesirable from the user's perspective causing
link flaps and much slower VLAN configuration. Currently, there are no
other restart events besides VLAN events, and setting the
ifdi_needs_restart default to false will alleviate the need to churn
every driver if an odd event is added in the future for specific
hardware.
markj points out this could cause churn in the other direction; I will
solve that problem with an event registration system as he mentions in
the review should we need it in the future.
These drivers will opt into restart and need further inspection or work:
* ixv (needs code audit, 61a8231 fixed principal issue; re-init probably
not necessary)
* axgbe (needs code audit; re-init probably not necessary)
* iavf - (needs code audit; interaction with Malicious Driver Detection
mentioned in rS360398)
* mgb - no VLAN functions are currently implemented. Left a comment.
MFC after: 2 weeks
Sponsored by: BBOX.io
Differential Revision: https://reviews.freebsd.org/D41558
In rS360398, a new iflib device method was added with default of opt out
for VLAN events needing an interface reset.
This re-init is unnecessary for ice(4).
MFC after: 2 weeks
Sponsored by: BBOX.io
Differential Revision: https://reviews.freebsd.org/D41558
In rS360398, a new iflib device method was added with default of opt out
for VLAN events needing an interface reset.
iavf(4) was the original need for this, because VLAN filter changes
currently have negative interactions with Malicious Driver Detection.
Add iavf_if_needs_restart and explicitly enable VLAN re-init.
MFC after: 2 weeks
Sponsored by: BBOX.io
Differential Revision: https://reviews.freebsd.org/D41558
In rS360398, a new iflib device method was added with default of opt out
for VLAN events needing an interface reset.
This re-init is unintentional for vmxnet3(4).
MFC after: 2 weeks
Sponsored by: BBOX.io
Differential Revision: https://reviews.freebsd.org/D41558
In rS360398, a new iflib device method was added with default of opt out
for VLAN events needing an interface reset.
This re-init is unintentional for vmxnet3(4).
MFC after: 2 weeks
Sponsored by: BBOX.io
Differential Revision: https://reviews.freebsd.org/D41558
In rS360398, a new iflib device method was added with default of opt out
for VLAN events needing an interface reset.
This re-init is unintentional for enetc(4).
MFC after: 2 weeks
Sponsored by: BBOX.io
Differential Revision: https://reviews.freebsd.org/D41558
In rS360398, a new iflib device method was added with default of opt out
for VLAN events needing an interface reset.
This is unintentional for bnxt(4) and is causing another bug in its VLAN
initialization code to affect the common case of adding and removing
VLANs on an existing interface.
PR: 269133
Tested by: kp
MFC after: 2 weeks
Sponsored by: BBOX.io
Differential Revision: https://reviews.freebsd.org/D41558
Move the timerfd impelemntation from linux compat code to sys/kern. Use
it to implement the new system calls for timerfd. Add a hook to kern_tc
to allow timerfd to know when the system time has stepped. Add kqueue
support to timerfd. Adjust a few names to be less Linux centric.
RelNotes: YES
Reviewed by: markj (on irc), imp, kib (with reservations), jhb (slack)
Differential Revision: https://reviews.freebsd.org/D38459
If a socket is marked as cannot read anymore, drop chunks which
should be added to a control element in the receive queue.
This is consistent with dropping control elements instead of
adding them in the same situation.
Reported by: syzbot+291f6581cecb77097b16@syzkaller.appspotmail.com
MFC after: 1 week
pf_route() sends traffic to a specified next hop over a specific
interface. The next hop is obtained in pf_map_addr() but the interface
is obtained directly via r->rpool.cur->kif` outside of the lock held in
pf_map_addr() in multiple places around pf. The chosen interface is not
stored in source node.
Move the interface selection into pf_map_addr(), have the function
return it together with the chosen IP address and ensure its stored
in struct pf_ksrc_node, store it in the source node and use the stored
value when needed.
Sponsored by: InnoGames GmbH
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D41570
Add checks that the device ID is supported by the hardware and is
within the range allocated when the driver attaches.
Reviewed by: gallatin, imp
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D41554
Add a message under bootverbose when we find a gicv3 its table type
that is unknown.
Reviewed by: gallatin, imp
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D41553
The GITS_BASER esize field is read-only, there is no need to change it.
Reviewed by: gallatin, imp
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D41552
We don't always know the size of the register set at compile time,
e.g. on arm64 the size of the SVE registers need to be queried on boot.
To support register sets that needs to be calculated at run time
query the correct size when it is zero.
Reviewed by: markj, kib (earlier version)
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D41302
Add checks that the device ID is supported by the hardware and is
within the range allocated when the driver attaches.
Reviewed by: gallatin, imp
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D41551
Add a message under bootverbose when we find a gicv3 its table type
that is unknown.
Reviewed by: gallatin, imp
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D41551
The GITS_BASER esize field is read-only, there is no need to change it.
Reviewed by: gallatin, imp
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D41551
When adding indirect (2 level) tabled we will need to know the page
size to calculate the size of the level 1 table. To allow for this find
the page size before entering the loop to calculate the final register
value.
Reviewed by: gallatin, imp
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D41551
Code for pre-11 FreeBSD versions is removed.
Also removed are macros that are not used anymore and "i" variable
does not shadow anymore other "i" variable.
Differential Revision: https://reviews.freebsd.org/D41547
Approved by: erj
In commit c7cffd65c5 the function ether_8021q_frame() was slightly
refactored to use pointer of struct ether_8021q_tag as parameter qtag to
include the new option proto.
It is wrong to write to qtag->pcp as it will effectively change the memory
that qtag points to. Unfortunately the transmit routine of if_vlan parses
pointer of the member ifv_qtag of its softc which stores vlan interface's
PCP internally, when transmitting mbufs that contains PCP the vlan
interface's PCP will get overwritten.
Fix by operating on a local copy of qtag->pcp. Also mark 'struct ether_8021q_tag'
as const so that compilers can pick up such kind of bug.
PR: 273304
Reviewed by: kp
Fixes: c7cffd65c5 Add support for stacked VLANs (IEEE 802.1ad, AKA Q-in-Q)
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D39505
When handling a SHUTDOWN or SHUTDOWN ACK chunk detect if the peer
is violating the protocol by not having made sure all user messages
are reveived by the peer. If this situation is detected, abort the
association.
MFC after: 1 week
Like other dwc3 controller, on Xilinx ZynqMP the base node is just here
to provide resets, the main dwc3 controller node is a child node.
Sponsored by: Beckhoff Automation GmbH & Co. KG
This is an attempt at clean-room implementation of the Linux'
membarrier(2) syscall. For documentation, you would need to read
both membarrier(2) Linux man page, the comments in Linux
kernel/sched/membarrier.c implementation and possibly look at
actual uses.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D32360
For amd64, i386, arm, and riscv, i.e. all architectures except arm64,
the custom implementation is provided since we maintain the bitmask of
active CPUs anyway.
Arm64 uses somewhat naive iteration over CPUs and match current vmspace'
pmap with the argument. It is not guaranteed that vmspace->pmap is the
same as the active pmap, but the inaccuracy should be toleratable.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D32360
The seq_num among other things is used to assign rq_psn value, which is
a 24-bit identifier. When the seq_num is full 4-byte value, we are
usually receiving: '_ib_modify_qp rq_psn overflow, masking to 24 bits'
warning.
This is burdensome for running rdma traffic with large number of
connections, because the number of logs is growing fast.
Signed-off-by: Bartosz Sobczak <bartosz.sobczak@intel.com>
Signed-off-by: Eric Joyner <erj@FreeBSD.org>
Reviewed by: kib@, erj@
MFC after: 3 days
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D41531
This target produces a C file not an object file, so using ctfconvert on
it should not be attempted. This keeps it in sync with all other uses of
fw_stub.awk, squashes a warning seen during the build of TEGRA124 on
FreeBSD and avoids the same issue failing the build on non-FreeBSD (such
errors are #ifdef'ed into being warnings on FreeBSD in ctfconvert, which
should be revisited in the future).
Reviewed by: manu
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D41542
This change adds struct tcp_info fields corresponding to the following
struct tcpcb ones:
- snd_una
- snd_max
- rcv_numsacks
- rcv_adv
- dupacks
Note that while both tcp_fill_info() and fill_tcp_info_from_tcb() are
extended accordingly, no counterpart of rcv_numsacks is available in
the cxgbe(4) TOE PCB, though.
Sponsored by: NetApp, Inc. (originally)
This function actually only ever reads from the TCP PCB. Consequently,
also make the pointer to its TCP PCB parameter const.
Sponsored by: NetApp, Inc. (originally)
When we receive a packet and remove the encapsulating layer we should
also clear out protocol flags and any mbuf tags.
If we do not we risk confusing firewalls filtering the tunneled packet.
See also: https://redmine.pfsense.org/issues/14682#change-69073
Sponsored by: Rubicon Communications, LLC ("Netgate")
The LVM label is stored on any of the first four sectors, and the
PV (physical volume) header is stored within the same sector following
the LVM label. The current implementation does not fully check the
offset of PV header, when attaching a bad formatted LVM PV the kernel
may crash due to out-of-bounds memory read.
PR: 266562
Reviewed by: jhb
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D36773
We can end up with a domain having no CPUs capable of receiving I/O
interrupts. This can occur, for example, when all APIC IDs in a given
domain are 256 or greater, and we have no IOMMU.
In this case disable per-domain interrupt support, effectively reverting
to the behaviour before commit a48de40bcc ("Only use CPUs in the
domain the device is attached to for default"). This has a performance
impact but at least allows the system to be functional. It is a stop-
gap until we can rely on the presence of an IOMMU on all x86 platforms.
Thanks to AMD for providing the high-thread-count machine I used for
testing this change, and to cperciva for testing on other hardware.
Reviewed by: jhb
Tested by: cperciva, emaste
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D41501