Previously, iscsi_poll() just panicked. This meant if you got a panic
on a box when using the iSCSI initiator, the attempt to shutdown would
trigger a nested panic and never write out a core. Now, CCB's sent to
iSCSI devices (such as the sychronize-cache request in dashutdown())
just fail with a timeout during a panic shutdown.
Reviewed by: scottl, mav
MFC after: 2 weeks
Sponsored by: Chelsio
Differential Revision: https://reviews.freebsd.org/D28455
If a disk's SIM doesn't support polling, then it can't be used to
store crashdumps. Leave d_dump NULL in that case so that dumpon(8)
fails gracefully rather than having dumps fail at crash time.
Reviewed by: scottl, mav, imp
MFC after: 2 weeks
Sponsored by: Chelsio
Differential Revision: https://reviews.freebsd.org/D28454
Some CAM sim drivers do not support polling (notably iscsi(4)).
Rather than using a no-op poll routine that always times out requests,
permit a SIM to set a NULL poll callback. cam_periph_runccb() will
fail polled requests non-pollable sims immediately as if they had
timed out.
Reviewed by: scottl, mav (earlier version)
Reviewed by: imp
MFC after: 2 weeks
Sponsored by: Chelsio
Differential Revision: https://reviews.freebsd.org/D28453
Widen the ifnet_detach_sxlock to cover the entire vnet sysuninit code.
This ensures that we can't end up having the vnet_sysuninit free the UDP
pcb while the detach code is running and trying to purge the UDP pcb.
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D28530
Since commit 8fa6abb6f4f64f ("Expose clang's alignment builtins and use
them for roundup2/rounddown2"), clang emits warnings for several
alignment operations in these drivers because the operation is a no-op.
The compiler is arguably being too strict here, but in the meantime
let's silence the warnings by conditionally compiling the alignment
operations.
Reviewed by: arichardson, hselasky
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D28576
In error case we can leave `inp' locked, also we need to free
mbuf chain `m' in the same case. Release the lock and use `badunlocked'
label to exit with freed mbuf. Also modify UDP error statistic to
match the IPv6 code.
Remove redundant INP_RUNLOCK() from the `if (last == NULL)' block,
there are no ways to reach this point with locked `inp'.
Obtained from: Yandex LLC
MFC after: 3 days
Sponsored by: Yandex LLC
The lookup for a IPv6 multicast addresses corresponding to
the destination address in the datagram is protected by the
NET_EPOCH section. Access to each PCB is protected by INP_RLOCK
during comparing. But access to socket's so_options field is
not protected. And in some cases it is possible, that PCB
pointer is still valid, but inp_socket is not. The patch wides
lock holding to protect access to inp_socket. It copies locking
strategy from IPv4 UDP handling.
PR: 232192
Obtained from: Yandex LLC
MFC after: 3 days
Sponsored by: Yandex LLC
Differential Revision: https://reviews.freebsd.org/D28232
For broadcast, multicast and unknown unicast, the replication loop
sends a copy of the packet to each link, beside the first one. This
special path is handled later, but the counters are not updated.
Factor out the common send and count actions as a function.
Reviewed by: kp
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D28537
This flag indicates that the page should be enqueued near the head of
the inactive queue, skipping the LRU queue. It is used when unwiring
pages from the buffer cache following direct I/O or after I/O when
POSIX_FADV_NOREUSE or _DONTNEED advice was specified, or when
sendfile(SF_NOCACHE) completes. For the direct I/O and sendfile cases
we only enqueue the page if we decide not to free it, typically because
it's mapped.
Pass "noreuse" through to vm_page_release_toq() so that we actually
honour the desired LRU policy for these scenarios.
Reported by: bdrewery
Reviewed by: alc, kib
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D28555
57785538c6e0d7e8ca0f161ab95bae10fd304047 change the test for FreeBSD
from __FreeBSD_version to __FreeBSD__. However this test was performed
before sys/param.h was included, therefore __FreeBSD_version was never
defined. As the test was never true opt_random_ip_id.h was never included.
Submitted by: bdragon
Reported by: bdragon
MFC after: 1 week
X-MFC with: 57785538c6e0d7e8ca0f161ab95bae10fd304047
In the data path of ng_bridge(4), the only value of the host struct,
which needs to be modified, is the staleness, which is reset every
time a frame is received. It's save to leave the code as it is.
This patch is part of a series to make ng_bridge(4) multithreaded.
Reviewed by: kp
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D28546
In a earlier version of ng_bridge(4) the exernal visible host entry
structure was a strict subset of the internal one. So internal view
was a direct annotation of the external structure. This strict
inheritance was lost many versions ago. There is no need to
encapsulate a part of the internal represntation as a separate
structure.
This patch is a preparation to make the internal structure read only
in the data path in order to make ng_bridge(4) multithreaded.
Reviewed by: kp
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D28545
Apply https://github.com/openzfs/zfs/pull/11576
Direct commit from upstream openzfs. Full commit message below:
Set file mode during zfs_write
3d40b65 refactored zfs_vnops.c, which shared much code verbatim between
Linux and BSD. After a successful write, the suid/sgid bits are reset,
and the mode to be written is stored in newmode. On Linux, this was
propagated to both the in-memory inode and znode, which is then updated
with sa_update.
3d40b65 accidentally removed the initialization of newmode, which
happened to occur on the same line as the inode update (which has been
moved out of the function).
The uninitialized newmode can be saved to disk, leading to a crash on
stat() of that file, in addition to a merely incorrect file mode.
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Antonio Russo <aerusso@aerusso.net>
Closes#11474Closes#11576
Obtained from: openzfs/zfs@f8ce8aed0
MFC after: 0 days
Sponsored by: iXsystems, Inc.
This reverts commit af366d353b84bdc4e730f0fc563853abc338271c.
Trips over '\xa4' byte and terminates early, as found in
lib/libc/gen/setdomainname_test:setdomainname_basic testcase
However, keep moving libkern/strlen.c out of conf/files.
Reported by: lwhsu
BORDER_PIXELS is left over from picking up the source from illumos
port. Since FreeBSD VT does not use border in terminal size
calculation, there is no reason why should loader use it.
MFC after: 1 week
Protocol attachment has historically been able to observe and modify
so->so_options as needed, and it still can for newly created sockets.
779f106aa169 moved this to after pru_attach() when we re-acquire the
lock on the listening socket.
Restore the historical behavior so that pru_attach implementations can
consistently use it. Note that some pru_attach() do currently rely on
this, though that may change in the future. D28265 contains a change to
remove the use in TCP and IB/SDP bits, as resetting the requested linger
time on incoming connections seems questionable at best.
This does move the assignment out from under the head's listen lock, but
glebius notes that head won't be going away and applications cannot
assume any specific ordering with a race between a connection coming in
and the application changing socket options anyways.
Discussed-with: glebius
MFC-after: 1 week
AFAICT, this was an oversight from
9e5787d2284e187abb5b654d924394a65772e004 (svn r364746). That revision
inadvertently disabled assertions unconditionally.
Reviewed by: freqlabs
MFC after: 3 days
Sponsored by: Axcient
Differential Revision: https://reviews.freebsd.org/D28256
My YOGA requires a minimum of 7 to parse w/o an error. Since the memory savings
are trivial and the yoga a popular system, bump the default up to 8. There's no
API/ABI issues in doing this. This hid_item struct isn't exported to userland
and the one libusbhid has is different and only shares a name...
MFC After: 3 days
Reviewed by: wulf@
Differential Revision: https://reviews.freebsd.org/D28543
It appears that production versions of EPYC firmware get the _STA method right
for these nodes. In fact, this workaround breaks on production hardware by
including too many uart nodes. This work around was for pre-release hardware
that wound up not having a large deployment. Move this work around to a kernel
option since the machines that needed it have been powered off and are difficult
to resurrect. Should there be a more significant deployment than is understood,
we can restrict it based on smbios strings.
Discussed with: mmacy@, seanc@, jhb@
MFC After: 3 days
Historically receive buffer overflows have been ignored and programs
could not tell if they missed messages or messages had been truncated
because of overflows. Since programs historically do not expect to get
receive overflow errors, this behavior is not the default.
This is really really important for programs that use route(4) to keep in sync
with the system. If we loose a message then we need to reload the full system
state, otherwise the behaviour from that point is undefined and can lead
to chasing bogus bug reports.
This adds a new sysctl to Wellspring Touchpad driver for controlling
Z-Axis (2-finger vertical scroll) direction "hw.usb.wsp.z_invert".
Submitted by: James Wright <james.wright_AT_digital-chaos_DOT_com>
Reviewed by: wulf
PR: 253321
Differential revision: https://reviews.freebsd.org/D28521
A BIOS bug may apparently cause the BSP to report that it does not
implement CMCI, with some APs reporting that they do. In this scenario,
avoid a NULL pointer dereference that occurs in cmci_monitor() because
cmc_state was not allocated by the BSP.
PR: 253272
Reported by: asomers, mmacy
Reviewed by: kib (previous version)
MFC after: 1 week
The C variant in libkern performs excessive branching to find the
non-zero byte instead of using the bsfq instruction. The same code
patched to use it is still slower than the routine implemented here
as the compiler keeps neglecting to perform certain optimizations
(like using leaq).
On top of that the routine can is a starting point for copyinstr
which operates on words instead of bytes.
Tested with glibc test suite.
Sample results (calls/s):
Haswell:
$(perl -e "print 'A' x 3"):
stock: 211198039
patched:338626619
asm: 465609618
$(perl -e "print 'A' x 100"):
stock: 83151997
patched: 98285919
asm: 120719888
AMD EPYC 7R32:
$(perl -e "print 'A' x 3"):
stock: 282523617
asm: 491498172
$(perl -e "print 'A' x 100"):
stock: 114857172
asm: 112082057
vt is using static buffers for on screen data, the buffer size is
calculated based on maximum supported screen size and 8x16 font.
When using hi-res graphics and very smaller than 8x16 font, we
need to be careful not to overflow static buffers in vt.
Testing: I did test by building smaller buffers than vt currently is using,
royger was testing on actual 4k capable hardware.
MFC after: 1 week
Tested by: royger
When performing encryption in software, the KTLS crypto callback always
locks the session to deliver a wakeup. But, if we're handling the
operation synchronously this is wasted effort and can result in
sleepqueue lock contention on large systems.
Use CRYPTO_SESS_SYNC() to determine whether the operation will be
completed asynchronously or not, and select a callback appropriately.
Avoid locking the session to check for completion if the session handles
requests synchronously.
Reviewed by: jhb
Sponsored by: Ampere Computing
Submitted by: Klara, Inc.
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D28195
Currently, OpenCrypto consumers can request asynchronous dispatch by
setting a flag in the cryptop. (Currently only IPSec may do this.) I
think this is a bit confusing: we (conditionally) set cryptop flags to
request async dispatch, and then crypto_dispatch() immediately examines
those flags to see if the consumer wants async dispatch. The flag names
are also confusing since they don't specify what "async" applies to:
dispatch or completion.
Add a new KPI, crypto_dispatch_async(), rather than encoding the
requested dispatch type in each cryptop. crypto_dispatch_async() falls
back to crypto_dispatch() if the session's driver provides asynchronous
dispatch. Get rid of CRYPTOP_ASYNC() and CRYPTOP_ASYNC_KEEPORDER().
Similarly, add crypto_dispatch_batch() to request processing of a tailq
of cryptops, rather than encoding the scheduling policy using cryptop
flags. Convert GELI, the only user of this interface (disabled by
default) to use the new interface.
Add CRYPTO_SESS_SYNC(), which can be used by consumers to determine
whether crypto requests will be dispatched synchronously. This is just
a helper macro. Use it instead of looking at cap flags directly.
Fix style in crypto_done(). Also get rid of CRYPTO_RETW_EMPTY() and
just check the relevant queues directly. This could result in some
unnecessary wakeups but I think it's very uncommon to be using more than
one queue per worker in a given workload, so checking all three queues
is a waste of cycles.
Reviewed by: jhb
Sponsored by: Ampere Computing
Submitted by: Klara, Inc.
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D28194
This makes it easier to refactor the GCM code to operate on
crypto_buffer_cursors rather than plain contiguous buffers, with the aim
of minimizing the amount of copying and zeroing done today.
No functional change intended.
Reviewed by: jhb
MFC after: 1 week
Sponsored by: Ampere Computing
Submitted by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D28500
- We were only hashing up to the first 16 bytes of the AAD.
- When computing the digest during decryption, handle the case where
len == trailer, i.e., len < AES_BLOCK_LEN, properly.
While here:
- trailer is always smaller than AES_BLOCK_LEN, so remove a pair of
unnecessary modulus operations.
- Replace some byte-by-byte loops with memcpy() and memset() calls.
In particular, zero the full block before copying a partial block into
it since we do that elsewhere and it means that the memset() length is
known at compile time.
Reviewed by: jhb
Sponsored by: Ampere Computing
Submitted by: Klara, Inc.
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D28501
This makes it a bit more straightforward to add new counters when
debugging. No functional change intended.
Reviewed by: jhb
Sponsored by: Ampere Computing
Submitted by: Klara, Inc.
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D28498
Allocate the necessary memory for the conversion dynamically starting
with a value which is sufficient for almost all normal cases.
PR: 187835
Reviewed by: kp
Differential Revision: https://reviews.freebsd.org/D23840
Microoptimize set_syscall_retval() for arm64 by predicting
the return value to be zero. This is similar to what has
been done for other architectures
Reviewed By: emaste, mhorne
Differential Revision: https://reviews.freebsd.org/D26991
Current epoll implementation stores udata fields of epoll_event
structure in special dynamically-sized table rather than in udata field
of backing kevent structure because of 2 reasons:
1. Kevent's udata size is smaller than epoll's on 32-bit archs.
2. Kevent's udata can be clobbered on execution EPOLL_CTL_ADD as kqueue
modifies existing event while epoll returns error in this case.
After r320043 has introduced four new 64bit user data members (ext[]),
we can store epoll udata in one of them and drop aforementioned table.
According to kqueue_register() source code ext members are not updated
when existing kevent is modified that fixes p.2.
As a side effect the patch fixes PR/252582.
Reviewed by: trasz
MFC after: 1 month
Differential revision: https://reviews.freebsd.org/D28169
In 78599c32efed3247d165302a1fbe8d9203e38974, CFI endproc decoration was
added to locore64.S. However, it missed the subtle detail that
__restartkernel_virtual() falls through to __restartkernel(). This was
causing boot failure on PowerMac G5, as it tried to execute the
epilogue as code.
Fix this by branching to __restartkernel() instead of intentionally
running off the end of the function.
While here, add some additional notes on how the virtual mode restart
works.
MFC after: 3 days