A problem was reported via email, where a large (130000+) accumulation
of NFSv4 opens on an NFSv4 mount caused significant lock contention
on the mutex used to protect the client mount's open/lock state.
Although the root cause for the accumulation of opens was not
resolved, it is obvious that the NFSv4 client is not designed to
handle 100000+ opens efficiently. When searching for an open,
usually for a match by file handle, a linear search of all opens
is done.
This patch adds a table of hash lists for the opens, hashed on
file handle. This table will be used by future commits to
search for an open based on file handle more efficiently.
MFC after: 2 weeks
In order to compare upcoming changes for their effectivness, measure
performance by counting opertions and the runtime of each operation
over the time. Accumulate all tests in a single instance, so make it
complicated over the time. If you wait long enough, you will notice
the expiry of old flows.
Reviewed by: kp (earlier version)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D30379
The previous:
if ((uoff_t)uio->uio_offset + uio->uio_resid > lim)
signal(....);
was replaced with:
if ((uoff_t)uio->uio_offset + uio->uio_resid < lim)
return;
signal(....);
Making (uoff_t)uio->uio_offset + uio->uio_resid == lim trip over the
limit, when it did not previously.
Unbreaks running 13.0 buildworld.
It was possible that termination of ktrace session occured during some
record write, in which case write occured after the close of the vnode.
Use ktr_io_params refcounting to avoid this situation, by taking the
reference on the structure instead of vnode.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D30400
Instead of trying to partially ifdef out ktrace handling, define the
missing identifier to 0. Without this fix lack of ktrace in the kernel
also means there is no SIGXFSZ signal delivery.
This allows tracking all wait times with much smaller runtime impact.
For example when doing -j 104 buildkernel on tmpfs:
no profiling: 2921.70s user 282.72s system 6598% cpu 48.562 total
all acquires: 2926.87s user 350.53s system 6656% cpu 49.237 total
contested only: 2919.64s user 290.31s system 6583% cpu 48.756 total
Only print buffer cache debug message when a cache lookup has been done.
When running `fsck_ffs -d` on a gjournal'ed filesystem, it's possible
that totalreads is greater than zero when no cache lookup has been
done - causing a divide by zero. This commit fixes the following error:
Floating point exception (core dumped)
Reviewed by: mckusick
Differential Revision: https://reviews.freebsd.org/D30370
The default mb_use_ext_pgs value was toggled in commit 52cd25eb1a.
Reviewed by: jhb
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D30393
These are not needed when including ktls.h to get sockopt definitions.
Reviewed by: gallatin, jhb
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D30392
sys_ktrace() calls namei(), which may call ktrnamei(). But sys_ktrace()
also calls ktrace_enter() first, so if the caller is itself being
traced, the assertion in ktrace_enter() is triggered. And, ktrnamei()
does not check for recursion like most other ktrace ops do.
Fix the bug by simply deferring the ktrace_enter() call.
Also make the parameter to ktrnamei() const and convert to ANSI.
Reported by: syzbot+d0a4de45e58d3c08af4b@syzkaller.appspotmail.com
Reviewed by: kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D30340
Handle the case where during socket option processing, the user
switches a stack such that processing the stack specific socket
option does not make sense anymore. Return an error in this case.
MFC after: 1 week
Reviewed by: markj
Reported by: syzbot+a6e1d91f240ad5d72cd1@syzkaller.appspotmail.com
Sponsored by: Netflix, Inc.
Differential revision: https://reviews.freebsd.org/D30395
When enabled, writes to ktrace.out that exceed the max file size limit
cause SIGXFSZ as it should be, but note that the limit is taken from
the process that initiated ktrace. When disabled, write is blocked,
but signal is not send.
Note that in either case ktrace for the affected process is stopped.
Requested and reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D30257
Other processes might still be able to write, make the decision to stop
based on the per-process situation.
Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D30257
and use the mark to stop applying file size limits on the write of
the accounting record. This allows to remove hack to clear process
limits in acct_process(), and avoids the bug with the clearing being
ineffective because limits are also cached in the thread structure.
Reported and reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D30257
Wrap too long lines.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D30257
Otherwise pages are cleaned some time later when the lower fs decides
that it is time to do it. This mostly manifests itself as delayed
mtime update, e.g. breaking make-like programs.
Reported by: mav
Tested by: mav, pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
There is no need to own vnode interlock, since v_object is type stable
and can only change to/from NULL, and no other checks in the function
access fields protected by the interlock. Remove the need variable, the
result of the test is directly usable as return value.
Tested by: mav, pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
This makes it possible to use core_write(), core_output(),
and sbuf_drain_core_output(), in Linux coredump code. Moving
them out of imgact_elf.c is necessary because of the weird way
it's being built.
Reviewed By: kib
Sponsored By: EPSRC
Differential Revision: https://reviews.freebsd.org/D30369
Makes pkg-gen quit after having received N packets, the same way it
already supports doing for sent packets.
Reviewed by: vmaffione
Sponsored by: Klara Inc.
MFC after: 4 weeks
Differential Revision: https://reviews.freebsd.org/D30266
The correct character to add to the intername name is *, not +
Reviewed by: vmaffione, bcr
Sponsored By: Klara Inc.
Differential Revision: https://reviews.freebsd.org/D30324
While partially reverting D24237 with D29690, due to introducing some
unintended effects for in-kernel TCP consumers, the preexisting lock
on the socket send buffer was not considered properly.
Found by: markj
MFC after: 2 weeks
Reviewed By: tuexen, #transport
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D30390
PRUS_NOTREADY indicates that the caller has not yet populated the chain
with data, and so it is not ready for transmission. This is used by
sendfile (for async I/O) and KTLS (for encryption). In particular, if
pru_send returns an error, the caller is responsible for freeing the
chain since other implicit references to the data buffers exist.
For async sendfile, it happens that an error will only be returned if
the connection was dropped, in which case tcp_usr_ready() will handle
freeing the chain. But since KTLS can be used in conjunction with the
regular socket I/O system calls, many more error cases - which do not
result in the connection being dropped - are reachable. In these cases,
KTLS was effectively assuming success.
So:
- Change sosend_generic() to free the mbuf chain if
pru_send(PRUS_NOTREADY) fails. Nothing else owns a reference to the
chain at that point.
- Similarly, in vn_sendfile() change the !async I/O && KTLS case to free
the chain.
- If async I/O is still outstanding when pru_send fails in
vn_sendfile(), set an error in the sfio structure so that the
connection is aborted and the mbuf chain is freed.
Reviewed by: gallatin, tuexen
Discussed with: jhb
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D30349
- Free the input mbuf in a single place instead of in every error path.
- Handle PRUS_NOTREADY consistently.
- Flush the socket's send buffer if an implicit connect fails. At that
point the mbuf has already been enqueued but we don't want to keep it
in the send buffer.
Reviewed by: gallatin, tuexen
Discussed with: jhb
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D30349
The segfault was being hit in ckfini() (sbin/fsck_ffs/fsutil.c)
while attempting to traverse the buffer cache to flush dirty buffers.
The tail queue used for the buffer cache was not initialized before
dropping into gjournal_check(). Move the buffer initialization earlier
so that it has been done before calling gjournal_check().
Reported by: crypt47, nvass
Fix by: Robert Wing
Tested by: Robert Wing
PR: 255030
PR: 255979
MFC after: 3 days
Sponsored by: Netflix
Move initialization of the mutex/condition variables required by the
save/restore feature to their own function.
The unix domain socket that facilitates communication between bhyvectl
and bhyve doesn't rely on these variables in order to be functional.
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D30281
ELF ldconfig only maintains the search list, there is no hints
Reviewed by: emaste
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D30272
If a regulator hasn't been enable by a driver but is enabled in hardware
(most likely enabled by U-Boot), regulator_status will returns that it
is enabled and so any call to regulator_disable will panic as it wasn't
enabled by one of our drivers.
Sponsored by: Diablotin Systems
Differential Revision: https://reviews.freebsd.org/D30293
This allow us to powerup/down the card and enabling/disabling the
regulators if any.
Sponsored by: Diablotin Systems
Differential Revision: https://reviews.freebsd.org/D30292
This helper can be used to enable/disable the regulator and starting
the power sequence of sd/sdio/eMMC cards.
Sponsored by: Diablotin Systems
Differential Revision: https://reviews.freebsd.org/D30291
This method is used to know if a regulator is enabled or not.
Sponsored by: Diablotin Systems
Differential Revision: https://reviews.freebsd.org/D30290
If a sd/emmc node have a pwrseq property parse it and get the corresponding
driver.
This can later be used to powerup/powerdown the SDIO card or eMMC.
Sponsored by: Diablotin Systems
Differential Revision: https://reviews.freebsd.org/D30289
This driver is used to power up sdio card or eMMC.
It handle the reset-gpio, clocks and needed delays for powerup/powerdown.
Sponsored by: Diablotin Systems
Differential Revision: https://reviews.freebsd.org/D30288
By default name the gpio P<bank><bankpin>
This make it easier to find the gpio when reading schematics or DTS.
Sponsored by: Diablotin Systems
Differential Revision: https://reviews.freebsd.org/D30287
For the discovery phase of SD/eMMC we need to do some transaction in a async
way.
The classic CAM XPT_{GET,SET}_TRAN_SETTING cannot be used in a async way.
This also allow us to split the discovery phase into a more complete state
machine and we don't mtx_sleep with a random number to wait for completion
of the tasks.
For mmc_sim we now do the SET_TRAN_SETTING in a taskqueue so we can call
the needed function for regulators/clocks without the cam lock(s). This part is
still needed to be done for sdhci.
We also now save the host OCR in the discovery phase as it wasn't done before and
only worked because the same ccb was reused.
Reviewed by: imp, kibab, bz
Differential Revision: https://reviews.freebsd.org/D30038
When starting, ntpd calls setrlimit(2) to limit maximum size of its
stack. The stack limit chosen by ntpd is 200K, so when stack gap
is enabled, the stack gap is larger than this limit, which results
in ntpd crashing.
Submitted by: Dawid Gorecki <dgr@semihalf.com>
Reviewed by: cy, imp
Obtained from: Semihalf
Sponsored by: Stormshield
Differential Revision: https://reviews.freebsd.org/D29553