Adding to zombie list can be perfomed by idle threads, which on ppc64 leads to
panics as it requires a sleepable lock.
Reported by: alfredo
Reviewed by: kib, markj
Fixes: r367842 ("thread: numa-aware zombie reaping")
Differential Revision: https://reviews.freebsd.org/D27288
Exec and exit are same as corresponding eventhandler hooks.
Thread exit hook is called somewhat earlier, while thread is still
owned by the process and enough context is available. Note that the
process lock is owned when the hook is called.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D27309
As far as I can tell, this has been the case since initially committed in
2008. cpuset_setproc is the executor of cpuset reassignment; note this
excerpt from the description:
* 1) Set is non-null. This reparents all anonymous sets to the provided
* set and replaces all non-anonymous td_cpusets with the provided set.
However, reviewing cpuset_setproc_setthread() for some jail related work
unearthed the error: if tdset was not anonymous, we were replacing it with
`set`. If it was anonymous, then we'd rebase it onto `set` (i.e. copy the
thread's mask over and AND it with `set`) but give the new anonymous set
the original tdset as the parent (i.e. the base of the set we're supposed to
be leaving behind).
The primary visible consequences were that:
1.) cpuset_getid() following such assignment returns the wrong result, the
setid that we left behind rather than the one we joined.
2.) When a process attached to the jail, the base set of any anonymous
threads was a set outside of the jail.
This was initially bundled in D27298, but it's a minor fix that's fairly
easy to verify the correctness of.
A test is included in D27307 ("badparent"), which demonstrates the issue
with, effectively:
osetid = cpuset_getid()
newsetid = cpuset()
cpuset_setaffinity(thread)
cpuset_setid(osetid)
cpuset_getid(thread) -> observe that it matches newsetid instead of osetid.
MFC after: 1 week
Providing these in freebsd32.h facilitates local testing/measuring of the
structs rather than forcing one to locally recreate them. Sanity checking
offsets/sizes remains in kern_umtx.c where these are typically used.
oldfde may be invalidated if the table has grown due to the operation that
we're performing, either via fdalloc() or a direct fdgrowtable_exp().
This was technically OK before rS367927 because the old table remained valid
until the filedesc became unused, but now it may be freed immediately if
it's an unshared table in a single-threaded process, so it is no longer a
good assumption to make.
This fixes dup/dup2 invocations that grow the file table; in the initial
report, it manifested as a kernel panic in devel/gmake's configure script.
Reported by: Guy Yur <guyyur gmail com>
Reviewed by: rew
Differential Revision: https://reviews.freebsd.org/D27319
* Make rib_walk() order of arguments consistent with the rest of RIB api
* Add rib_walk_ext() allowing to exec callback before/after iteration.
* Rename rt_foreach_fib_walk_del -> rib_foreach_table_walk_del
* Rename rt_forach_fib_walk -> rib_foreach_table_walk
* Move rib_foreach_table_walk{_del} to route/route_helpers.c
* Slightly refactor rib_foreach_table_walk{_del} to make the implementation
consistent and prepare for upcoming iterator optimizations.
Differential Revision: https://reviews.freebsd.org/D27219
Do not hardcode what we setup for the DMA engine configuration but
lookup the fdt properties and configuring accordingly.
Use a default value of 8 for the burst dma length for both TX and
RX, this is what we used for TX before.
This patch takes advantage of the consolidation that happened to provide two
flags that can be used with the native _umtx_op(2): UMTX_OP___32BIT and
UMTX_OP__I386.
UMTX_OP__32BIT iindicates that we are being provided with 32-bit structures.
Note that this flag alone indicates a 64bit time_t, since this is the
majority case.
UMTX_OP__I386 has been provided so that we can emulate i386 as well,
regardless of whether the host is amd64 or not.
Both imply a different set of copyops in sysumtx_op. freebsd32__umtx_op
simply ignores the flags, since it's already doing a 32-bit operation and
it's unlikely we'll be running an emulator under compat32. Future work
could consider it, but the author sees little benefit.
This will be used by qemu-bsd-user to pass on all _umtx_op calls to the
native interface as long as the host/target endianness matches, effectively
eliminating most if not all of the remaining unresolved deadlocks for most.
This version changed a fair amount from what was under review, mostly in
response to refactoring of the prereq reorganization and battle-testing
it with qemu-bsd-user. The main changes are as follows:
1.) The i386 flag got renamed to omit '32BIT' since this is redundant.
2.) The flags are now properly handled on 32-bit platforms to emulate other
32-bit platforms.
3.) Robust list handling was fixed, and the 32-bit functionality that was
previously gated by COMPAT_FREEBSD32 is now unconditional.
4.) Robust list handling was also improved, including the error reported
when a process has already registered 32-bit ABI lists and also
detecting if native robust lists have already been registered. Both
scenarios now return EBUSY rather than EINVAL, because the input is
technically valid but we're too busy with another ABI's lists.
libsysdecode/kdump/truss support will go into review soon-ish, along with
the associated manpage update.
Reviewed by: kib (earlier version)
MFC after: 3 weeks
During the life of a process, new file descriptor tables may be allocated. When
a new table is allocated, the old table is placed in a free list and held onto
until all processes referencing them exit.
When a new file descriptor table is allocated, the old file descriptor table
can be freed when the current process has a single-thread and the file
descriptor table is not being shared with any other processes.
Reviewed by: kevans
Approved by: kevans (mentor)
Differential Revision: https://reviews.freebsd.org/D18617
- Allocate 256 handlers more than payload commands for management purposes.
- Increase maximum number of handlers from 8K to 16K by tuning the format.
- Just to be safe limit the number of payload commands to 16K - 256.
- Limit number of target exchanges in mixed mode to the number of atpds.
- If we still somehow get out of atpds -- return BUSY, since we really are.
There is not much to inherit any more, may create more problems than solve.
Instead parent them all directly to upstream.
While there, add missed payload tag and tune scratch tag destructions.
While there, do some minor cleanup for kclocks. They are only
registered from kern_time.c, make registration function static.
Remove event hooks, they are not used by both registered kclocks.
Add some consts.
Perhaps we can stop registering kclocks at all and statically
initialize them.
Reviewed by: mjg
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D27305
There is no point in dynamic registration, umtx hook is there always.
Reviewed by: mjg
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D27303
The netmap application using the driver is responsible for replenishing
the receive freelists and they may be totally depleted when the
application exits. Packets in flight, if any, might block the pipeline
in case there aren't enough buffers left in the freelist. Avoid this by
filling up the freelists with a driver allocated buffer.
MFC after: 1 week
Sponsored by: Chelsio Communications
Qlogic chips store S/G lists in the same queue as requests themselves. In
the worst case 1MB I/O may require up to 52 IOCBs, that means queue of 1024
IOCBs can store only 19 of such requests. The increase reduces chances of
overflow, while we should be able to afford additional 512KB of RAM per HBA.
The Linux driver uses comparable numbers.
While there, decouple ATIO queue size from response queue size. There is
no reason for them to be equal.
- Make isp_start() to set all the IOCB fields aside of S/G list, removing
extra information from isp_send_cmd(), now only doing S/G lists and sending.
- Turn DMA setup/free from being card and PCI-specific into OS-specific,
instead add new card-specific method for isp_send_cmd(). Previously this
function was a monster handling all the cards.
- Remove double error code translation.
Ensure we initialize the static environment when not booting via
loader(8), and provide a static buffer if this is the case. This fixes
two issues.
First, performing the initialization ensures that kenv variables set in
the kernel's config file are honored. Previously, any new or overridden
values were ignored.
Second, providing the static buffer allows variables to be set in the
device tree's bootargs property of the chosen node. This can be set by
u-boot or by QEMU's '-append' flag. Attempting to this prior to this
change resulted in an early panic, since the static environment had no
buffer backing it.
Submitted by: syrinx (earlier version)
Reviewed by: kp
Differential Revision: https://reviews.freebsd.org/D25034
This also eliminates unsafe use of VFS_SYNC(MNT_WAIT).
Requested by: mckusick
Discussed with: imp
Tested by: pho (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D27269
Don't use "new" as an identifier, and add explicit casts from void *.
As a general policy, FreeBSD doesn't make any C++ compatibility
guarantees for kernel headers like it does for userland, but it is a
small effort to do so in this case, to the benefit of a downstream
consumer (NetApp).
Reviewed by: rscheff
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D27286
with to == NULL for SYN segments. So don't assume tp != NULL.
Thanks to jhb@ for reporting and suggesting a fix.
PR: 250499
MFC after: 1 week
XMFC-with: r367530
Sponsored by: Netflix, Inc.