Originally it was added in order to prevent trashing of objects with
INVARIANTS enabled. The same effect is now provided with mere UMA_ZONE_NOFREE.
This reverts r286921.
Discussed with: kib
Objects obtained from such zones are supposed to retain type stability,
which was violated by aforementioned trashing.
This is a follow-up to r284861.
Discussed with: kib
broke in two ways. One, the pacing variable was accessed in multiple
threads in an unsafe way. Two, since large numbers of I/O could come
down from the buf layer at one time, large numbers of allocation
failures could happen all at once, resulting in a huge pace value that
would limit I/Os to 10 IOPS for minutes (or even hours) at a
time. While a real solution to these problems requires substantial
work (to go to a no-allocation-after-the-first model, or to have some
way to wait for more memory with some kind of reserve for pager and
swapper requests), it is relatively easy to make this simplistic
pacing less pathological.
First, move to a volatile variable with loads and stores. While this is
a little racy, losing the race is safe: either you get memory and
proceed, or you don't and queue. Second, sleep for 1ms (or one tick, whichever
is larger) instead of 100ms. This removes the artificial 10 IOPS limit
while still easing up on new I/Os during memory shortages. Remove
tying the amount of time we do this to the number of failed requests
and do it only as long as we keep failing requests.
Finally, to avoid needless recursion when memory is tight (start ->
g_io_deliver() -> g_io_request() -> start -> ... until we use 1/2 the
stack), don't do direct dispatch while pacing. This should be a rare
event (not steady state) so the performance hit here is worth the
extra safety of not starving g_down() with directly dispatched I/O.
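
The pacing rules described above can be sketched in userland C. This is a
minimal illustration, not the committed GEOM code: the names pace_flag,
pace_sleep_ticks(), pace_note_failure(), pace_consume() and
pace_allows_direct_dispatch() are hypothetical, and hz is passed in as a
parameter instead of being read from the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the shared pacing flag (not the real GEOM name). */
static volatile int pace_flag;

/* Sleep length in ticks: 1ms or one tick, whichever is larger. */
static int
pace_sleep_ticks(int hz)
{
	int ticks_per_ms = hz / 1000;

	return (ticks_per_ms > 1 ? ticks_per_ms : 1);
}

/* Called when an allocation fails: a plain store, no lock taken. */
static void
pace_note_failure(void)
{
	pace_flag = 1;
}

/* Direct dispatch is skipped while pacing to avoid deep recursion. */
static bool
pace_allows_direct_dispatch(void)
{
	return (pace_flag == 0);
}

/*
 * Called before issuing more I/O: returns the number of ticks to sleep
 * (0 when not pacing) and clears the flag, so pacing lasts only as long
 * as requests keep failing.
 */
static int
pace_consume(int hz)
{
	if (pace_flag == 0)
		return (0);
	pace_flag = 0;
	return (pace_sleep_ticks(hz));
}
```

Losing a race on pace_flag in this sketch costs at most one extra or one
missed short sleep, which matches the "losing the race is safe" reasoning.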
Differential Revision: https://reviews.freebsd.org/D3546
Resetting some generations of the I/OAT hardware (just BDXDE for now)
resets the corresponding MSI-X registers. So, tear down and
re-initialize interrupts after resetting the hardware.
Reviewed by: jimharris
Approved by: markj (mentor)
Sponsored by: EMC / Isilon Storage Division
Differential Revision: https://reviews.freebsd.org/D3549
and exit events. procfs stop events for system call tracing report these
values (argument count for system call entry and code for system call exit),
but ptrace() does not provide this information. (Note that while the system
call code can be determined in an ABI-specific manner during system call
entry, it is not generally available during system call exit.)
The values are exported via new fields at the end of struct ptrace_lwpinfo
available via PT_LWPINFO.
Reviewed by: kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D3536
If net.link.bridge.pfil_bridge is set we can end up thinking we're forwarding in
pf_test6() because the rcvif and the ifp (output interface) are different.
In that case we're bridging though, and the rcvif is the bridge member on which
the packet was received and ifp is the bridge itself.
If we'd set dir to PF_FWD we'd end up calling ip6_forward() which is incorrect.
Instead check if the rcvif is a member of the ifp bridge. (In other words, the
if_bridge is the ifp's softc). If that's the case we're not forwarding but
bridging.
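
The membership test can be sketched as follows. This is a simplified
userland model, not the pf code: struct ifnet_sketch, enum fwd_dir,
classify_dir() and demo_dir() are hypothetical, and only the two pointer
fields mentioned above are modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced model of an interface. */
struct ifnet_sketch {
	void *if_bridge;	/* softc of the bridge we are a member of, or NULL */
	void *if_softc;		/* driver state; for a bridge, its own softc */
};

enum fwd_dir { DIR_OUT, DIR_FWD };

/* Decide whether an outbound packet is being forwarded or merely bridged. */
static enum fwd_dir
classify_dir(const struct ifnet_sketch *rcvif, const struct ifnet_sketch *ifp)
{
	if (rcvif == NULL || rcvif == ifp)
		return (DIR_OUT);
	/* rcvif is a member of the ifp bridge: bridging, not forwarding. */
	if (rcvif->if_bridge != NULL && rcvif->if_bridge == ifp->if_softc)
		return (DIR_OUT);
	return (DIR_FWD);
}

/* Usage: build a bridge plus one receiving interface and classify. */
static enum fwd_dir
demo_dir(int rcvif_is_member)
{
	int bridge_state;	/* stands in for the bridge softc */
	struct ifnet_sketch bridge = { NULL, &bridge_state };
	struct ifnet_sketch rcvif = { NULL, NULL };

	if (rcvif_is_member)
		rcvif.if_bridge = &bridge_state;
	return (classify_dir(&rcvif, &bridge));
}
```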
PR: 202351
Reviewed by: eri
Differential Revision: https://reviews.freebsd.org/D3534
SoC is used in the HiKey board from 96boards.
Currently only the SD card is working on the HiKey, as such devices 0 and 2
will need to be disabled, for example by adding the following to
loader.conf:
hint.hisi_dwmmc.0.disabled=1
hint.hisi_dwmmc.2.disabled=1
Relnotes: yes (HiKey board booting)
Sponsored by: ABT Systems Ltd
particular, this invalidates the knote kn_link linkage, making the
SLIST_FOREACH() loop access undefined values (e.g. trashed by
QUEUE_MACRO_DEBUG). If the knote is freed by another thread when the kq
lock is released or when influx is cleared, e.g. by knote_scan() for
kqueue owning the knote, the iteration step would access freed memory.
Use SLIST_FOREACH_SAFE() to fix iteration.
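
A minimal userland sketch of why the _SAFE variant matters: it caches the
next pointer before the loop body runs, so the body may unlink and free the
current element. The struct node, drain_list() and build_and_drain() names
are hypothetical, and the fallback macro definition is only there for
systems whose <sys/queue.h> lacks SLIST_FOREACH_SAFE().

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>

#ifndef SLIST_FOREACH_SAFE
#define	SLIST_FOREACH_SAFE(var, head, field, tvar)			\
	for ((var) = SLIST_FIRST((head));				\
	    (var) && ((tvar) = SLIST_NEXT((var), field), 1);		\
	    (var) = (tvar))
#endif

struct node {
	int id;
	SLIST_ENTRY(node) link;
};
SLIST_HEAD(node_list, node);

/* Free every element; safe because the next pointer is saved in tmp. */
static int
drain_list(struct node_list *head)
{
	struct node *n, *tmp;
	int freed = 0;

	SLIST_FOREACH_SAFE(n, head, link, tmp) {
		SLIST_REMOVE(head, n, node, link);
		free(n);	/* n's linkage is gone; tmp is still valid */
		freed++;
	}
	return (freed);
}

/* Usage: build a list of count nodes, then drain it. */
static int
build_and_drain(int count)
{
	struct node_list head = SLIST_HEAD_INITIALIZER(head);
	struct node *n;
	int i, freed;

	for (i = 0; i < count; i++) {
		n = malloc(sizeof(*n));
		if (n == NULL)
			break;
		n->id = i;
		SLIST_INSERT_HEAD(&head, n, link);
	}
	freed = drain_list(&head);
	assert(SLIST_EMPTY(&head));
	return (freed);
}
```

Note that the cached pointer only protects against the current element going
away; it does not help if the saved next element itself is freed.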
Diagnosed by: avg
Tested by: avg, lstewart, pawel
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Explain why it is fine to not check for M_NOWAIT failures in
kqueue_register(). Remove unneeded check for NULL result from
waitable allocation in kqueue_scan(). uma_free(9) handles a NULL
argument correctly, so remove the checks for NULL. Remove a useless cast and
adjust style in knote_alloc().
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
of the D_NEWBLK kinds of dependencies (i.e. D_ALLOCDIRECT and
D_ALLOCINDIR), which can exhaust kmem.
Handle excess of D_NEWBLK in the same way as excess of D_INODEDEP and
D_DIRREM, by scheduling an AST to flush dependencies after the thread
that created the new dependency has left the VFS/FFS innards. For D_NEWBLK, the
only way to get rid of them is to do full sync, since items are
attached to data blocks of arbitrary vnodes. The check for D_NEWBLK
excess in softdep_ast_cleanup_proc() is unlocked.
For 32-bit arches, reduce the total amount of allowed dependencies by
two. Increasing the limit for 64-bit platforms with direct maps could
be considered.
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
The typo was introduced in r278469 / 344ecf88af2dfb.
As a result of the bug there was a timing window where callout_reset()
would fail to cancel a concurrent execution of a callout that is about
to start and would schedule the callout again.
The callout would fire more times than it was scheduled.
That would happen even if the callout is initialized with a lock.
For example, the bug triggered the "Stray timeout" assertion in
taskqueue_timeout_func().
MFC after: 5 days
This makes it possible to analyze the performance of the new ZFS
write throttle with DTrace.
PR: 200316
Submitted by: Lacey Powers <lacey.leanne@gmail.com>
Reviewed by: avg, smh, delphij (no objection)
Approved by: bapt (mentor)
MFC after: 1 month
Sponsored by: ScaleEngine Inc.
Differential Revision: https://reviews.freebsd.org/D3472
command called 'uga' to show whether UGA is implemented by the
firmware and what the settings are. It also includes filling
the efi_fb structure from the UGA information when GOP isn't
implemented by the firmware.
Since UGA does not provide information about the stride, we
set the stride to the horizontal resolution. This is likely
not correct and we should determine the stride by trial and
error. For now, this should show something on the console
rather than nothing.
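
As a rough illustration of the stride assumption, the sketch below fills a
reduced framebuffer description from a UGA-style mode and computes a pixel
index from it. The struct fb_sketch layout and the fb_from_uga() and
fb_pixel_index() helpers are hypothetical; the stride is assumed to be
expressed in pixels per scan line.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, reduced framebuffer description. */
struct fb_sketch {
	uint32_t width;		/* horizontal resolution in pixels */
	uint32_t height;	/* vertical resolution in pixels */
	uint32_t stride;	/* pixels per scan line */
};

/* UGA reports no stride, so assume scan lines carry no padding. */
static struct fb_sketch
fb_from_uga(uint32_t hres, uint32_t vres)
{
	struct fb_sketch fb;

	fb.width = hres;
	fb.height = vres;
	fb.stride = hres;	/* assumption: may be wrong on real hardware */
	return (fb);
}

/* Index of pixel (x, y) counted from the start of the framebuffer. */
static uint64_t
fb_pixel_index(struct fb_sketch fb, uint32_t x, uint32_t y)
{
	return ((uint64_t)y * fb.stride + x);
}
```

If the real stride is larger than the horizontal resolution, every row after
the first lands at the wrong offset, which is why the guess is only a
stopgap.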
Refactor this file to maximize code reuse.
PR: 202730
pins, they specify the bank and the pin in two separate cells.
This allows the use of vendor's DTS definitions by adding a gpio map
routine that copes with that.
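
A map routine of this kind might fold the two cells into one flat pin
number. The sketch below is an assumption, not the committed driver code:
gpio_flat_pin() is a hypothetical helper, and the 32 pins per bank and
16 banks are illustrative values only.

```c
#include <assert.h>
#include <stdint.h>

#define	PINS_PER_BANK_SK	32u	/* hypothetical bank width */
#define	NBANKS_SK		16u	/* hypothetical number of banks */

/*
 * Translate a (bank, pin) pair taken from two DTS cells into a flat pin
 * number, or return -1 when either cell is out of range.
 */
static int
gpio_flat_pin(uint32_t bank, uint32_t pin)
{
	if (bank >= NBANKS_SK || pin >= PINS_PER_BANK_SK)
		return (-1);
	return ((int)(bank * PINS_PER_BANK_SK + pin));
}
```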
scheduler types. It was intended to be used there, compare with the
min value, and with the test for correctness in ksched_setscheduler().
Note that P1B_PRIO_MAX and RTP_PRIO_MAX have the same numerical
values, so the change is cosmetic.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
kernel configuration to A20.
There are other boards (namely the banana pi) that use exactly the same
devices.
Additionally, we are moving from static FDT support (DTB compiled
in-kernel) to a DTB passed to the kernel by the boot loader (ubldr). U-Boot
for these boards is already available in ports, and as the crochet support
for these boards isn't committed yet, this should not cause any issues.
Discussed with: ian
it helps only the TCP timers callout(9) usage. As no consensus was
reached on the benefit for other callout(9) usages, the historical
usage should prevail.
Differential Revision: https://reviews.freebsd.org/D3078
workaround for a callout(9) issue, it turns out it is instead the right
way to use callout in mpsafe mode without using callout_drain().
r284245 commit message:
Fix a callout race condition introduced in TCP timers callouts with r281599.
In TCP timer context, it is not enough to check callout_stop() return value
to decide if a callout is still running or not, previous callout_reset()
return values have also to be checked.
Differential Revision: https://reviews.freebsd.org/D2763
command has the following sub-commands:
list - list all possible modes (paged)
get - return the current mode
set <mode> - set the current mode to <mode>
Previously such LUNs were silently ignored. But while they are indeed unable
to process most SCSI commands, they can still handle some, such as RTPG.
MFC after: 1 month
r286951 by reinstating changes in r274628.
In l2arc_compress_buf(), we allocate a buffer to stash away the compressed
data in 'cdata', allocated with a size of l2hdr->b_asize bytes.
We then ask zio_compress_data() to compress the buffer, b_l1hdr.b_tmp_cdata,
which is of l2hdr->b_asize bytes, and have the compressed size (or original
size, if compress didn't gain enough) stored in csize.
To pad the buffer to fit the optimal write size, we round up the compressed
size to the L2 device's vdev_ashift.
Illumos code rounds up the size by at most SPA_MINBLOCKSIZE. Because we
know csize <= b_asize, and b_asize is integer multiple of SPA_MINBLOCKSIZE,
we are guaranteed that the rounded up csize would be <= b_asize. However,
this is not necessarily true when we round up to 1 << vdev_ashift, because
it could be larger than SPA_MINBLOCKSIZE.
So, in the worst case scenario, we are overwriting at most
((1 << vdev_ashift) - SPA_MINBLOCKSIZE)
bytes of memory next to the compressed data buffer.
Andriy's original change in r274628 reorganized the code a little bit,
by moving the padding to after we determined that the compression was
beneficial, at which point we would check the rounded size against the
allocated buffer size, so the buffer overrun would not be possible.
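
The rounding arithmetic can be checked with a small userland sketch. It is
not the ZFS code: roundup_p2(), padded_size_fits() and worst_case_overrun()
are hypothetical helpers, and MINBLOCKSIZE_SK mirrors SPA_MINBLOCKSIZE
(1 << 9, i.e. 512 bytes).

```c
#include <assert.h>
#include <stdint.h>

#define	MINBLOCKSIZE_SK	512ULL	/* same value as SPA_MINBLOCKSIZE */

/* Round x up to a multiple of align, which must be a power of two. */
static uint64_t
roundup_p2(uint64_t x, uint64_t align)
{
	return ((x + align - 1) & ~(align - 1));
}

/* Does csize, padded to 1 << ashift, still fit in a b_asize buffer? */
static int
padded_size_fits(uint64_t csize, uint64_t b_asize, unsigned ashift)
{
	return (roundup_p2(csize, 1ULL << ashift) <= b_asize);
}

/* Largest possible overrun when padding without the size check. */
static uint64_t
worst_case_overrun(unsigned ashift)
{
	uint64_t align = 1ULL << ashift;

	return (align > MINBLOCKSIZE_SK ? align - MINBLOCKSIZE_SK : 0);
}
```

For example, with b_asize = 1536 and csize = 1100, rounding to 512 bytes
gives 1536 and fits, while rounding for an ashift of 12 gives 4096 and
overruns the buffer.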
as parent. In the case of a send or receive, the curproc would be the
userland application that issues the ioctl. This would trigger an assertion
failure introduced in Solaris compatibility shims in r196458 when kernel is
compiled with INVARIANTS.
Fix this by using p0 (proc0 or kernel) as the parent thread when creating
the kernel threads.