When it was inline it made sense to depend on the existing nested check
in KTRUSERRET() rather than adding a new td_flags flag. However, since
we now have a TDA_KTRACE flag anyway, we might as well check it and
avoid the call.
Suggested by: jhb
Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D35888
Explicitly pass the struct thread argument.
Move the function prototype from sys/systm.h to geom/geom.h. Almost no
kernel source needs to see the prototype: outside geom/geom_event.c,
where the function is defined, it is now used only by
kern/vfs_mountroot.c.
Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D35888
Make most AST handlers dynamically registered. This allows
subsystem-specific handler source to live in the subsystem files,
instead of making subr_trap.c aware of it. For instance, signal
delivery code on return to userspace is now moved to kern_sig.c.
Also, it allows some handlers to be designated as the cleanup (kclear)
type, which are called both at AST and on thread/process exit. For
instance, ast(), exit1(), and the NFS server no longer need to be aware
of UFS softdep processing.
The dynamic registration also allows third-party modules to register AST
handlers if needed. There is one caveat with loadable modules: the
code makes no effort to ensure that the module is not unloaded
before all threads have finished running the AST handlers in it. In fact,
this is already the existing behavior for hwpmc.ko and ufs.ko. I do not
think it is worth the effort and the runtime overhead to try to fix it.
Reviewed by: markj
Tested by: emaste (arm64), pho
Discussed with: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D35888
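
A minimal user-space sketch of the registration idea described above. It
is not the kernel KPI: every sk_* name, the flag value, and the handler
ids are hypothetical and exist only for illustration. Handlers are stored
in a table; all pending handlers run at AST time, and only those marked
as cleanup (kclear) also run on exit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model; none of these names are the real kernel interface. */
enum { SK_AST_SIG, SK_AST_SOFTDEP, SK_AST_MAX };

#define SK_ASTR_KCLEAR 0x1      /* cleanup type: also run on thread exit */

struct sk_thread {
	unsigned pending;       /* bitmask of handler ids with work queued */
	int log[8];             /* ids of handlers that ran, in order */
	int nlog;
};

typedef void sk_ast_fn(struct sk_thread *td, int id);

static struct {
	sk_ast_fn *fn;
	int flags;
} sk_ast_tab[SK_AST_MAX];

/* A subsystem registers its own handler instead of editing central code. */
void
sk_ast_register(int id, int flags, sk_ast_fn *fn)
{
	assert(id >= 0 && id < SK_AST_MAX && sk_ast_tab[id].fn == NULL);
	sk_ast_tab[id].fn = fn;
	sk_ast_tab[id].flags = flags;
}

/* Mark a handler as having work to do for this thread. */
void
sk_ast_sched(struct sk_thread *td, int id)
{
	td->pending |= 1u << id;
}

/* Run pending handlers; when kclear_only is set, skip non-cleanup ones. */
static void
sk_ast_run(struct sk_thread *td, int kclear_only)
{
	for (int id = 0; id < SK_AST_MAX; id++) {
		unsigned bit = 1u << id;

		if ((td->pending & bit) == 0 || sk_ast_tab[id].fn == NULL)
			continue;
		if (kclear_only && (sk_ast_tab[id].flags & SK_ASTR_KCLEAR) == 0)
			continue;
		td->pending &= ~bit;
		sk_ast_tab[id].fn(td, id);
	}
}

void sk_ast(struct sk_thread *td) { sk_ast_run(td, 0); }
void sk_thread_exit(struct sk_thread *td) { sk_ast_run(td, 1); }

/* Example handler: record which handler id ran. */
void
sk_record(struct sk_thread *td, int id)
{
	if (td->nlog < 8)
		td->log[td->nlog++] = id;
}
```

In this model the exit path does not need to know which subsystems have
cleanup work; it only walks the table.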
Otherwise, on kernel entry from usermode, we use whatever value was left
over from the previously run thread. Typically that is the desired value
anyway, but it is not guaranteed.
Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D35888
For an augmented rb_tree, allow a faster alternative to removing an
element from the tree, tweaking it slightly, and inserting it back
into the tree, knowing that its relative position in the tree is
unchanged. Instead, just change the element and invoke
RB_UPDATE_AUGMENT to fix the augmentation data for all the nodes in
the tree.
Reviewed by: kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D36010
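
To illustrate the idea, here is a self-contained sketch that does not use
the sys/tree.h macros. The sk_* names, the node layout, and the choice of
"maximum value in the subtree" as the augmentation are all assumptions
made for the example. The element is changed in place and the
augmentation is recomputed from that node up to the root, with no
removal or reinsertion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node layout, not the one generated by sys/tree.h. */
struct sk_node {
	struct sk_node *parent, *left, *right;
	int key;        /* ordering key; unchanged by the in-place tweak */
	int val;        /* per-node value summarized by the augmentation */
	int max_val;    /* augmentation: largest val in this subtree */
};

/* Recompute one node's augmentation from itself and its children. */
static int
sk_recompute(const struct sk_node *n)
{
	int m = n->val;

	if (n->left != NULL && n->left->max_val > m)
		m = n->left->max_val;
	if (n->right != NULL && n->right->max_val > m)
		m = n->right->max_val;
	return (m);
}

/* Refresh augmentation data on the path from n up to the root. */
void
sk_update_augment(struct sk_node *n)
{
	for (; n != NULL; n = n->parent)
		n->max_val = sk_recompute(n);
}

/* Change a node's value without touching its position in the tree. */
void
sk_set_val(struct sk_node *n, int val)
{
	n->val = val;
	sk_update_augment(n);
}
```

This is only valid when the tweak leaves the ordering key, and therefore
the node's relative position, unchanged.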
Add some of the missing sysctls to tcp.4, using references to other
man pages where they exist. Added sysctls include recvbuf and sendbuf
controls for automatic buffer sizing. Updated recvspace and sendspace.
Add sysctl.8 to "see also" and intro to variable section. Rename
"MIB Variables" section to "MIB (sysctl) Variables", as most people
will associate them with sysctl.
Reviewed by: manpages(pauamma), tuexen
Differential Revision: https://reviews.freebsd.org/D36004
Add missing sysctls to inet.4 and icmp.4, using references to ip.4
for variables and groups documented there. Add sysctl.8 to "see also"
and intro to variable section. Rename "MIB Variables" section to
"MIB (sysctl) Variables", as most people will associate them with sysctl.
Revise history: the ICMP implementation was in 4.2BSD.
Reviewed by: manpages(pauamma)
Differential Revision: https://reviews.freebsd.org/D36003
In in_pcb_lport_dest(), if an IPv6 socket does not match any other IPv6
socket using in6_pcblookup_local(), and if the socket can also connect
to IPv4 (the INP_IPV4 vflag is set), check for IPv4 matches as well.
Otherwise, we can allocate a port that is used by an IPv4 socket
(possibly one created from IPv6 via the same procedure), and then
connect() can fail with EADDRINUSE, when it could have succeeded if
the bound port was not in use.
PR: 265064
Submitted by: firk at cantconnect.ru (with modifications)
Reviewed by: bz, melifaro
Differential Revision: https://reviews.freebsd.org/D36012
While there, also fix the setting of the SYN related flag.
Reviewed by: rrs
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D35862
Originally, wtap_node_write() got the wrong softc: it iterated over V_inet,
found the ifp by string comparison, and then took the softc from
ifp->if_softc. However, ifp->if_softc does not point to the correct softc
owned by ieee80211com, which causes a kernel panic.
Fix it by assigning the softc to the cdev's si_drv1 in wtap_vap_create()
and getting the softc directly via dev->si_drv1 in wtap_node_write().
The cdev created by wtap_vap_create() uses the name of the ieee80211com
rather than the vap's name. As a result, a second vap based on the same
ieee80211com as the first vap fails to create a device node, because
the device node already exists. Fix it by assigning
vap->iv_ifp->if_xname to the cdev's name.
Sponsored by: Google, Inc. (GSoC 2022)
Reviewed by: adrian, cy, bz
Differential Revision: https://reviews.freebsd.org/D35752
* Make nhgrp_get_nhops() return const struct weightened_nhop to
indicate that the list is immutable
* Make nhgrp_get_group() return the actual group, instead of
group+weight.
MFC after: 2 weeks
The test was failing on its assert because the MSG_TRUNC flag was missing
from the output flags of recvmsg().
The code passed MSG_TRUNC to recvmsg(), along with a buffer large enough
to hold the message to be received, and expected MSG_TRUNC to be
returned as well.
This is not exactly correct, as a) MSG_TRUNC was not even a supported
recvmsg() flag before be1f485d7d and b) it violates POSIX, which
states it should be set only "If a message is too long to fit in
the supplied buffers".
The test was working before as the kernel copied input flags to the
output flags. be1f485d7d changed that behaviour to clear MSG_TRUNC
if it was present on the input.
Fix the test by checking POSIX-defined behaviour.
Discussed with: glebius
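
The POSIX-defined behaviour can be sketched in user space as follows.
This is not the test from the tree: sk_truncated() and the buffer sizes
are hypothetical. It sends one datagram over an AF_UNIX socketpair,
receives it with no input flags, and reports whether recvmsg() set
MSG_TRUNC in the output flags.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if recvmsg() reported MSG_TRUNC for a
 * datagram of msglen bytes read into a buflen-byte buffer, 0 if it did
 * not, and -1 on error.
 */
int
sk_truncated(size_t msglen, size_t buflen)
{
	char out[256], in[256];
	int sv[2], ret = -1;

	assert(msglen <= sizeof(out) && buflen <= sizeof(in));
	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) != 0)
		return (-1);
	memset(out, 'x', sizeof(out));
	if (send(sv[0], out, msglen, 0) == (ssize_t)msglen) {
		struct iovec iov = { .iov_base = in, .iov_len = buflen };
		struct msghdr msg;

		memset(&msg, 0, sizeof(msg));
		msg.msg_iov = &iov;
		msg.msg_iovlen = 1;
		/* No input flags: MSG_TRUNC is checked only on output. */
		if (recvmsg(sv[1], &msg, 0) >= 0)
			ret = (msg.msg_flags & MSG_TRUNC) != 0;
	}
	close(sv[0]);
	close(sv[1]);
	return (ret);
}
```

With a sufficiently large buffer the flag should be clear; with a buffer
shorter than the datagram it should be set.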
Convert the last remaining pieces of old-style debug messages
to the new debugging framework.
Differential Revision: https://reviews.freebsd.org/D35994
MFC after: 2 weeks
Currently, rt_addrinfo(info) serves as the main "transport" moving
state between various functions inside the routing subsystem.
As all of the fields are filled in directly by the callers, it
is problematic to maintain consistency, resulting in repeated checks
inside many functions. Additionally, there are multiple ways of
specifying the same value (RTAX_IFP vs rti_ifp / rti_ifa) and so on.
With the upcoming nhop(9) kpi it is possible to store all of the
required state in the nexthops in a consistent fashion, reducing the
need to use "info" in the KPI calls.
Finally, rt_addrinfo structure format was derived from the rtsock wire
format, which is different from other kernel routing users or netlink.
This cleanup simplifies upcoming nhop(9) kpi and netlink introduction.
Reviewed by: zlei.huang@gmail.com
Differential Revision: https://reviews.freebsd.org/D35972
MFC after: 2 weeks
Mark the dst/mask parameters of public API functions as const to
clearly indicate that they are neither modified nor stored in
the data structure.
Differential Revision: https://reviews.freebsd.org/D35971
MFC after: 2 weeks
Expiration time is actually a path property, not a route property.
Move its storage to nexthop to simplify upcoming nhop(9) KPI changes
and netlink introduction.
Differential Revision: https://reviews.freebsd.org/D35970
MFC after: 2 weeks
Identify each of the superblock validation checks as either a
warning or a fatal error. Any integrity check that can cause a
system hang or crash is marked as fatal. Those that may simply
lead to poor file layout or other suboptimal operating conditions
are marked as warning.
Normally both fatal and warning are treated as errors and prevent
the superblock from being loaded. A new flag, UFS_NOWARNFAIL, is
added. When passed to ffs_sbget() it will note warnings that it
finds, but will still proceed with loading the superblock. Note
that when UFS_NOWARNFAIL is used, it also includes UFS_NOHASHFAIL.
No legitimate superblocks should fail as a result of these changes.
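
A rough sketch of the warning/fatal split, under stated assumptions: the
sk_* names, the three fields, and the choice of which check is fatal or
a warning are hypothetical and are not taken from ffs_subr.c. Only the
policy follows the description above: both kinds normally reject the
superblock, while a "no warning fail" flag lets warnings through.

```c
#include <assert.h>

#define SK_NOWARNFAIL 0x1       /* hypothetical stand-in for UFS_NOWARNFAIL */

enum sk_sev { SK_WARN, SK_FATAL };

/* Hypothetical, heavily reduced superblock. */
struct sk_sb {
	int bsize;
	int fsize;
	int maxcontig;
};

struct sk_result {
	int warnings;
	int fatals;
};

/* Record one failed check under its severity. */
static void
sk_check(int ok, enum sk_sev sev, struct sk_result *r)
{
	if (ok)
		return;
	if (sev == SK_FATAL)
		r->fatals++;
	else
		r->warnings++;
}

/* Returns 0 if the superblock may be loaded, -1 otherwise. */
int
sk_validate(const struct sk_sb *sb, int flags, struct sk_result *r)
{
	r->warnings = r->fatals = 0;
	/* Assumed fatal: could lead to a hang or crash later on. */
	sk_check(sb->bsize > 0 && (sb->bsize & (sb->bsize - 1)) == 0,
	    SK_FATAL, r);
	sk_check(sb->fsize > 0 && sb->fsize <= sb->bsize, SK_FATAL, r);
	/* Assumed warning: would only lead to poor layout. */
	sk_check(sb->maxcontig >= 1, SK_WARN, r);

	if (r->fatals > 0)
		return (-1);
	if (r->warnings > 0 && (flags & SK_NOWARNFAIL) == 0)
		return (-1);
	return (0);
}
```

All checks run before the verdict is computed, so the caller sees the
full count of warnings and fatal errors either way.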
Further updates based on analysis of the way the fields are used
in the various filesystem macros defined in fs.h.
Eliminate several checks for non-negative values where the fields
are already checked for specific values. Since those specific values
are non-negative, a field verified to hold one of them cannot be
negative, so the extra check is redundant.
No legitimate superblocks should fail as a result of these changes.
- new sentence, new line
- unknown AT&T UNIX version: At v7
- no blank before trailing delimiter
- reference the ASCII(8) manual page
MFC after: 5 days
Rather than trying to shoehorn flags into the requested superblock
address, create a separate flags parameter to the ffs_sbget()
function in sys/ufs/ffs/ffs_subr.c. The ffs_sbget() function is
used both in the kernel and in user-level utilities through export
to the sbget() function in the libufs(3) library (see sbget(3)
for details). The kernel uses ffs_sbget() when mounting UFS
filesystems, in the glabel(8) and gjournal(8) GEOM utilities,
and in the standalone library used when booting the system
from a UFS root filesystem.
The ffs_sbget() function reads the superblock located at the byte
offset specified by its sblockloc parameter. The value UFS_STDSB
may be specified for sblockloc to request that the standard
location for the superblock be read.
The two existing options are now flags:
UFS_NOHASHFAIL will note if the check hash is wrong but will still
return the superblock. This is used by the bootstrap code to
give the system a chance to come up so that fsck can be run to
correct the problem.
UFS_NOMSG indicates that superblock inconsistency error messages
should not be printed. It is used by programs like fsck that
want to print their own error message and programs like glabel(8)
that just want to know if a UFS filesystem exists on a partition.
One additional flag is added:
UFS_NOCSUM causes only the superblock itself to be returned, but does
not read in any auxiliary data structures like the cylinder group
summary information. It is used by clients like glabel(8) that
just want to check for possible filesystem types. Using UFS_NOCSUM
skips the superblock checks for csum data which allows superblocks
that have corrupted csum data to be read and used.
The validate_sblock() function checks that the superblock has not
been corrupted in a way that can crash or hang the system. Unless
the UFS_NOMSG flag is specified, it will print out any errors that
it finds. Prior to this commit, validate_sblock() returned as soon
as it found an inconsistency, so it printed at most one message.
It now performs all of its checks, so when UFS_NOMSG has not been
specified it prints everything that it finds inconsistent.
Sponsored by: The FreeBSD Foundation
The single line comment indicator '//' is only detected at the
beginning of a line or when following white space to allow URLs
in calendar entries.
MFC after: 3 days
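
The rule can be sketched as below. This is not the calendar(1) source:
sk_strip_comment() is a hypothetical helper, and the sample entries are
made up. A '//' starts a comment only at the beginning of the line or
right after white space, so the '//' inside a URL is left alone.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper: truncate line at the first '//' that is at the
 * start of the line or preceded by white space.
 */
void
sk_strip_comment(char *line)
{
	for (char *p = line; (p = strstr(p, "//")) != NULL; p++) {
		if (p == line || isspace((unsigned char)p[-1])) {
			*p = '\0';
			return;
		}
	}
}
```

For example, "01/01 New Year // note" is cut before the '//', while
"01/02 see http://example.org/" is kept whole.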
Reorder a few checks to ensure fields have been checked before
using them to check other fields.
Add eight new checks mostly checking for non-negative values.
No legitimate superblocks should fail as a result of these changes.
Update iwlwifi 22000 firmware to -73 and rebuilds for 9000/9260.
Update the driver to accept the newer version.
Firmware was obtained from linux-firmware at
150864a4d73e8c448eb1e2c68e65f07635fe1a66.
Sponsored by: The FreeBSD Foundation
MFC after: 23 days
By convention, kernel threads must call kthread_exit() instead of
blindly returning from the thread function. We have a safety measure
in fork_exit(), which checks for the P_KPROC p_flag and calls
kthread_exit() for a kernel thread that forgot to do so itself.
But this workaround only works for kernel threads belonging to the
kernel process. If a kernel thread is attached to a normal process
with live userspace, and does not call kthread_exit(), then the
workaround is not activated, and for amd64 at least, the return from the
thread function/fork_exit() results in a return to userspace with a
copy of the frame from the thread that did kthread_add().
Practically for smrstress, this destroys the user stack of the still
active frame in the other thread, which was the caller of kthread_add().
Fix it by adding kthread_exit() to the thread function.
Reported and tested by: pho
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D35999
Since IOMMU map entries store a reference to the domain in which they
reside, there is no need to pass the domain to iommu_gas_free_entry(),
iommu_gas_free_space(), and iommu_gas_free_region().
Push down the acquisition and release of the IOMMU domain lock into
iommu_gas_free_space() and iommu_gas_free_region().
Both of these changes allow for simplifications in the callers of the
functions without really complicating the functions themselves.
Moreover, the latter change eliminates the direct use of the IOMMU
domain lock from the x86-specific DMAR code.
Reviewed by: kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D35995
Implement the Linux variant of the MSG_TRUNC input flag used in recv(),
recvfrom() and recvmsg().
POSIX defines MSG_TRUNC as an output flag indicating packet/datagram
truncation. Linux extended it a while (~15+ years) ago to also act as an
input flag, which makes the call return the full packet size regardless
of the input buffer size.
It is a (relatively) popular pattern to call recvmsg(MSG_PEEK | MSG_TRUNC)
to get the packet size, allocate the buffer, and issue another call to
fetch the packet. In particular, it is popular in userland netlink code,
which is the primary driving factor of this change.
This commit implements MSG_TRUNC support for SOCK_DGRAM sockets (udp,
unix and all soreceive_generic() users).
PR: kern/176322
Reviewed by: pauamma(doc)
Differential Revision: https://reviews.freebsd.org/D35909
MFC after: 1 month
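
The peek-then-read pattern mentioned above might look like this in
userland. sk_recv_whole() is a hypothetical helper, and the sketch
assumes a datagram socket on a kernel that honours MSG_TRUNC as an input
flag; without that support the first recv() would return at most the
probe size and the datagram would be truncated.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: return a malloc()ed copy of the next datagram on
 * fd and store its length in *lenp, or return NULL on error.
 */
char *
sk_recv_whole(int fd, size_t *lenp)
{
	char probe, *buf;
	ssize_t n, got;

	/* Ask for the full datagram size without consuming the datagram. */
	n = recv(fd, &probe, sizeof(probe), MSG_PEEK | MSG_TRUNC);
	if (n < 0)
		return (NULL);
	buf = malloc(n > 0 ? (size_t)n : 1);
	if (buf == NULL)
		return (NULL);
	/* Second call actually dequeues the datagram. */
	got = recv(fd, buf, (size_t)n, 0);
	if (got != n) {
		free(buf);
		return (NULL);
	}
	*lenp = (size_t)got;
	return (buf);
}
```

The caller owns the returned buffer and releases it with free().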