Next phase of error recovery: Eliminate the REOVERY_START phase, since
we don't need to wait to start recovery. Eliminate the RECOVERY_RESET
phase since it is transient, we now transition from RECOVERY_NORMAL into
RECOVERY_WAITING.
In normal mode, read the status of the controller. If it is in failed
state, or appears to be hot-plugged, jump directly to reset which will
sort out the proper things to do. This will cause all pending I/O to
complete with an abort status before the reset.
When in the NORMAL state, call the interrupt handler. This will complete
all pending transactions when interrupts are broken or temporarily
misbehaving. We then check all the pending completions for timeouts. If
we have abort enabled, then we'll send an abort. Otherwise we'll assume
the controller is wedged and needs a reset. By calling the interrupt
handler here, we'll avoid an issue with the current code where we
transitioned to RECOVERY_START which prevented any completions from
happening. Now completions happen. In addition and follow-on I/O that is
scheduled in the completion routines will be submitted, rather than
queued, because the recovery state is correct. This also fixes a problem
where I/O would timeout, but never complete, leading to hung I/O.
Resetting remains the same as before, just when we chose to reset has
changed.
A nice side effect of these changes is that we now do I/O when
interrupts to the card are totally broken. Followon commits will improve
the error reporting and logging when this happens. Performance will be
aweful, but will at least be minimally functional.
There is a small race when we're checking the completions if interrupts
are working, but this is handled in a future commit.
Sponsored by: Netflix
MFC After: 2 weeks
Differential Revision: https://reviews.freebsd.org/D36922
When we went to having a shared timeout routine, failing the timed-out
transaction code was inadvertantly dropped. Reinstate it.
Fixes: 502dc84a8b
Sponsored by: Netflix
MFC After: 2 weeks
Reviewed by: chuck, jhb
Differential Revision: https://reviews.freebsd.org/D36921
Make inline functions defined in a header file as static inline.
Mark inline functions used only in the compilation unit they are
defined in as merely static (the compiler can figure out it these
days).
Sponsored by: Netflix
In particular, don't use a socket level flag, use the inp level one.
After adding appropriate locking, this will close a race condition.
MFC after: 1 week
timerfd01 from ltp passes (and some other don't), but none of the tests
crash the kernel.
This is a bare minimum patch to fix up the immediate regression.
Reported by: yasu
This module allows controlled privilege escallation via mac labels
securely associated with a process via mac_veriexec.
There are over 700 PRIV_* but we can compress many of them into
a single GBL_* thus constraining the size of gbl labels.
The goal is to allow a daemon to run as an unprivileged process while
still being able a set of privileged operations needed.
We add APIs to libveriexec so that userland processes can check labels
and an exec_script API that allows a suitably labeled process to run
something like a python interpreter directly if necessary;
overcomming the 'indirect' flag applied to the interpreter.
Add -l option to sbin/veriexec to report labels.
Reviewed by: stevek
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D41431
The mac_syscall system call works fine as long as any MAC module
that provides a mpo_syscall method handles compat32 appropriately.
Regenerate system call files for freebsd32.
Reviewed by: sjg
Obtained from: Juniper Networks, Inc.
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D41575
The free vnode marker can slide past eligible entries.
Artificially reducing vnode limit to 300k and spawning 104 workers each
creating a million files results in all of them trying to recycle, which
often fails when it should not have to.
Because of the excessive traffic in this scenario, the trylock to
requeue is virtually guaranteed to fail, meaning nothing gets pushed
forward.
Since no vnodes were found, the most unfortunate sleep for 1 second is
induced (see vn_alloc_hard, the "vlruwk" msleep).
Without the fix the machine is mostly idle with almost everyone stuck
off CPU waiting for the sleep to finish. With the fix it is busy
creating files.
Unrelated to the above problem the marker could have landed in a
similarly problematic spot for because of any failure in vtryrecycle.
Originally reported as poudriere builders stalling in a vnode-count
restricted setup.
Fixes: 138a5dafba ("vfs: trylock vnode requeue")
Reported by: Mark Millard
This updates the smartpqi driver to Microsemi's latest code. This will
be the driver for FreeBSD 14 (with updates), but no MFC is planned.
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D41550
In rS360398, a new iflib device method was added to opt out of VLAN
events needing an interface reset.
I am switching the default to not requiring a restart for:
* VLAN events
* unknown events
After fixing various bugs, I do not think this would be a common need
of hardware and it is undesirable from the user's perspective causing
link flaps and much slower VLAN configuration. Currently, there are no
other restart events besides VLAN events, and setting the
ifdi_needs_restart default to false will alleviate the need to churn
every driver if an odd event is added in the future for specific
hardware.
markj points out this could cause churn in the other direction; I will
solve that problem with an event registration system as he mentions in
the review should we need it in the future.
These drivers will opt into restart and need further inspection or work:
* ixv (needs code audit, 61a8231 fixed principal issue; re-init probably
not necessary)
* axgbe (needs code audit; re-init probably not necessary)
* iavf - (needs code audit; interaction with Malicious Driver Detection
mentioned in rS360398)
* mgb - no VLAN functions are currently implemented. Left a comment.
MFC after: 2 weeks
Sponsored by: BBOX.io
Differential Revision: https://reviews.freebsd.org/D41558
In rS360398, a new iflib device method was added with default of opt out
for VLAN events needing an interface reset.
This re-init is unnecessary for ice(4).
MFC after: 2 weeks
Sponsored by: BBOX.io
Differential Revision: https://reviews.freebsd.org/D41558
In rS360398, a new iflib device method was added with default of opt out
for VLAN events needing an interface reset.
iavf(4) was the original need for this, because VLAN filter changes
currently have negative interactions with Malicious Driver Detection.
Add iavf_if_needs_restart and explicitly enable VLAN re-init.
MFC after: 2 weeks
Sponsored by: BBOX.io
Differential Revision: https://reviews.freebsd.org/D41558
In rS360398, a new iflib device method was added with default of opt out
for VLAN events needing an interface reset.
This re-init is unintentional for vmxnet3(4).
MFC after: 2 weeks
Sponsored by: BBOX.io
Differential Revision: https://reviews.freebsd.org/D41558
In rS360398, a new iflib device method was added with default of opt out
for VLAN events needing an interface reset.
This re-init is unintentional for vmxnet3(4).
MFC after: 2 weeks
Sponsored by: BBOX.io
Differential Revision: https://reviews.freebsd.org/D41558
In rS360398, a new iflib device method was added with default of opt out
for VLAN events needing an interface reset.
This re-init is unintentional for enetc(4).
MFC after: 2 weeks
Sponsored by: BBOX.io
Differential Revision: https://reviews.freebsd.org/D41558
In rS360398, a new iflib device method was added with default of opt out
for VLAN events needing an interface reset.
This is unintentional for bnxt(4) and is causing another bug in its VLAN
initialization code to affect the common case of adding and removing
VLANs on an existing interface.
PR: 269133
Tested by: kp
MFC after: 2 weeks
Sponsored by: BBOX.io
Differential Revision: https://reviews.freebsd.org/D41558
Move the timerfd impelemntation from linux compat code to sys/kern. Use
it to implement the new system calls for timerfd. Add a hook to kern_tc
to allow timerfd to know when the system time has stepped. Add kqueue
support to timerfd. Adjust a few names to be less Linux centric.
RelNotes: YES
Reviewed by: markj (on irc), imp, kib (with reservations), jhb (slack)
Differential Revision: https://reviews.freebsd.org/D38459
It takes a non-optional parameter string, one of "up", "down", or "both"
that can request tree traversal in the chosen directions. This adds PIDs
from the paths to the selection of PIDs and can be used together with -d
to draw a subset of the process tree.
Differential Revision: https://reviews.freebsd.org/D41231
This reverts commit ca8c0d5e81.
By commiting ca8c0d5e81 I was hoping that the existing option -d
could just be extended to work with -p to implement a feature that was
and I think is still needed, that is to show all descendant processes
of a given process id or a set of process ids.
After a complaint from -current which may represent a wider
dissatisfaction with this change in the program's behavior, I think it
will be better to revert ca8c0d5e81 and reintroduce this feature
using a separate option -D.
If a socket is marked as cannot read anymore, drop chunks which
should be added to a control element in the receive queue.
This is consistent with dropping control elements instead of
adding them in the same situation.
Reported by: syzbot+291f6581cecb77097b16@syzkaller.appspotmail.com
MFC after: 1 week
pf_route() sends traffic to a specified next hop over a specific
interface. The next hop is obtained in pf_map_addr() but the interface
is obtained directly via r->rpool.cur->kif` outside of the lock held in
pf_map_addr() in multiple places around pf. The chosen interface is not
stored in source node.
Move the interface selection into pf_map_addr(), have the function
return it together with the chosen IP address and ensure its stored
in struct pf_ksrc_node, store it in the source node and use the stored
value when needed.
Sponsored by: InnoGames GmbH
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D41570
The mount subcommand currently produces output such as:
# bectl mount <bootenv>
Successfully mounted <bootenv> at <mountpoint>
This commit changes it to only print the mountpoint:
# bectl mount <bootenv>
<mountpoint>
This makes it easier to script the mount subcommand. If an error occurs
while mounting, an error message is printed to stderr and bectl will
exit with a non-zero value.
PR: 273180
Reviewed by: kevans, asomers
Differential Revision: https://reviews.freebsd.org/D41562
We currently pass MACHINE and MACHINE_ARCH as TARGET and TARGET_ARCH
respectively for universe-toolchain, but on non-FreeBSD these may not
have values that we understand (e.g. on Linux it will be x86_64 rather
than amd64) for TARGET/TARGET_ARCH (note that we do support them for
MACHINE/MACHINE_ARCH). Since the choice is a bit arbitrary and merely
determines what LLVM's default triple will be, use amd64 on non-FreeBSD
as a known-good default.
Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D41545
The top-level Makefile passes -m to its sub-makes in order to ensure
they use the in-tree mk files in share/mk, but the top-level make itself
has to rely on whatever environment the bmake used has. For FreeBSD, we
configure the system bmake with .../share/mk:/usr/share/mk, which means
it will pick up src's share/mk whenever run from within the src tree,
but currently for non-FreeBSD we configure our bootstrap bmake only with
bmake's own mk files. This is mostly compatible, with two exceptions:
1. "targets" runs at the top level, but needs TARGET_MACHINE_LIST and
the corresponding MACHINE_ARCH_LIST_${target}, otherwise it will just
print an empty list.
2. "universe" and "universe-toolchain", when run at the top level (i.e.
not via the various wrappers around universe like tinderbox), end up
failing in universe-toolchain itself with:
bmake[1]: "/path/to/freebsd/share/mk/src.sys.obj.mk" line 112: Cannot use MAKEOBJDIR=
Unset MAKEOBJDIR to get default: MAKEOBJDIR='${.CURDIR:S,^${SRCTOP},${OBJTOP},}'
By including .../share/mk in the default sys path like FreeBSD's system
bmake we ensure that we get the in-tree mk files for the top-level make,
not just sub-makes, and avoid such issues.
Note that we cannot (yet) stop using the installed mk files, since the
MAKEOBJDIRPREFIX check in Makefile runs in the object directory and uses
env -i, thereby losing the MAKESYSPATH exported by src.sys.env.mk. Other
such issues may also exist, though are likely rare if so.
Reviewed by: sjg
Differential Revision: https://reviews.freebsd.org/D41544
We currently assume that any existing bootstrapped bmake binary will
work, but this means it never gets updated as contrib/bmake is, and
similarly we won't rebuild it as and when the configure arguments given
to boot-strap change. Whilst the former isn't necessarily a huge problem
given WANT_MAKE_VERSION rarely gets bumped in Makefile, having fewer
variables is a good thing, and so it's easiest if we just always keep it
up-to-date rather than trying to do something similar to what's already
in Makefile (which may or may not be accurate, given updating FreeBSD
gives you an updated bmake, but nothing does so for our bootstrapped
bmake on non-FreeBSD). The latter is more problematic, though, and the
next commit will be changing this configuration.
We thus now add in two checks. The first is to compare MAKE_VERSION
against _MAKE_VERSION from contrib/bmake/VERSION. The second is to
record at bootstrap time the exact configuration used, and compare that
against what we would bootstrap with.
Reviewed by: arichardson, sjg
Differential Revision: https://reviews.freebsd.org/D41556
Add checks that the device ID is supported by the hardware and is
within the range allocated when the driver attaches.
Reviewed by: gallatin, imp
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D41554
Add a message under bootverbose when we find a gicv3 its table type
that is unknown.
Reviewed by: gallatin, imp
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D41553
The GITS_BASER esize field is read-only, there is no need to change it.
Reviewed by: gallatin, imp
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D41552
We don't always know the size of the register set at compile time,
e.g. on arm64 the size of the SVE registers need to be queried on boot.
To support register sets that needs to be calculated at run time
query the correct size when it is zero.
Reviewed by: markj, kib (earlier version)
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D41302
Add checks that the device ID is supported by the hardware and is
within the range allocated when the driver attaches.
Reviewed by: gallatin, imp
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D41551
Add a message under bootverbose when we find a gicv3 its table type
that is unknown.
Reviewed by: gallatin, imp
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D41551