This allows compatibility translation to take place on the stack
(md_ioctl is too big) and is more suitable as a public interface within
the kernel than the kern_ioctl interface.
Except for the initialization of the md_req from the md_ioctl
(including detection of kernel md_file pointers) and the updating
of the md_ioctl prior to return, this is a mechanical replacment
of md_ioctl and mdio with md_req and mdr.
Reviewed by: markj, cem, kib (assorted versions)
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14704
vmd_free_count manipulation. Reduce the scope of the free lock by
using a pageout lock to synchronize sleep and wakeup. Only trigger
the pageout daemon on transitions between states. Drive all wakeup
operations directly as side-effects from freeing memory rather than
requiring an additional function call.
Reviewed by: markj, kib
Tested by: pho
Sponsored by: Netflix, Dell/EMC Isilon
Differential Revision: https://reviews.freebsd.org/D14612
r328013 introduced a new error code from fsck_ffs that indicates that
it could not completely fix the file system; this happens when it
prints the message PLEASE RERUN FSCK. However, this status can happen
when fsck is run in "preen" mode and the rc.d/fsck script does not
handle that error code. Modify rc.d/fsck so that if "fsck -p"
("preen") returns the new status code (16) it will run "fsck -y", as
it currently does for a status code of 8 (the "standard error exit").
Reported by: markj
Reviewed by: mckusick, markj, ian, rgrimes
MFC after: 3 days
Sponsored by: Dell EMC
Differential Revision: https://reviews.freebsd.org/D14679
Move locks from outside ioctl to the individual implementations.
This is the first step of changing the implementations to act on a
kernel-internal request struct rather than on struct md_ioctl and to
removing the use of kern_ioctl in mountroot.
Reviewed by: cem, kib, markj (prior version)
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14700
The crash scenario goes like this: there's a thread waiting on "reinstate";
because it doesn't update the timeout counter it gets terminated by the
callout; at this point the maintenance thread starts the termination routine.
The first thread finishes waiting, proceeds to icl_conn_handoff(), and drops
the refcount, which allows the maintenance thread to free its resources. At
this point another thread receives a PDU. Boom.
PR: 222898, 219866
Reported by: Eugene M. Zheganin <emz at norma.perm.ru>
Tested by: Eugene M. Zheganin <emz at norma.perm.ru>
Reviewed by: mav@ (earlier version)
MFC after: 2 weeks
Sponsored by: playkey.net
unconditionally incrementing i in the loop;
Reported by: cem
MFC with: r330880
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14685
Improve clarity of a comment and style(9) some areas.
No functional change.
Reported by: markj (on review of a mostly-copied driver)
Sponsored by: Dell EMC Isilon
document details of salen in getnameinfo(3) manual page.
getnameinfo(3) returned EAI_FAIL when salen was not equal to
the length corresponding to the value specified by sa->sa_family.
However, POSIX or RFC 3493 does not require it and RFC 4038
Sec.6.2.3 shows an example passing sizeof(struct sockaddr_storage)
to salen.
This change makes the requirement less strict by accepting
salen up to sizeof(struct sockaddr_storage). It also includes
two more changes: one is to fix return values because both SUSv4
and RFC 3493 require EAI_FAMILY when the address length is invalid,
another is to fix sa_len dependency in PF_LOCAL.
Pointed out by: Christophe Beauval
Reviewed by: ae
Differential Revision: https://reviews.freebsd.org/D14585
The code assumed that disks (devices) used for testing are always named
like /dev/foo, but there is no reason for that restriction and we can
easily support paths like /dev/stripe/bar.
The change is to account for a different order in which the recursive
destroy may be attempted. If we first try a dataset that can be destroyed
then it will be destroyed, but if we first try a dataset that cannot be
destroyed then we will not attempt to destroy the other dataset.
The problem is that g_access() must be called with the GEOM topology
lock held. And that gives a false impression that the lock is indeed
held across the call. But this isn't always true because many classes,
ZVOL being one of the many, need to drop the lock. It's either to
perform an I/O on the first open or to acquire a different lock (like in
g_mirror_access).
That, of course, can break many assumptions. For example,
g_slice_access() adds an extra exclusive count on the first open. As
described above, an underlying geom may drop the topology lock and that
would open a race with another thread that would also request another
extra exclusive count. In general, two consumers may be granted
incompatible accesses.
To avoid this problem the code is changed to mark a geom with special
flag before calling its access method and clear the flag afterwards. If
another thread sees that flag, then it means that the topology lock has
been dropped (either by the geom in question or downstream from it), so
it is not safe to make another access call. So, the second thread would
use g_topology_sleep() to wait until the flag is cleared and only then
would it proceed with the access.
Also see http://docs.freebsd.org/cgi/mid.cgi?809d9254-ee56-59d8-69a4-08838e985cea
PR: 225960
Reported by: asomers
Reviewed by: markj, mav
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D14533
illumos/illumos-gate@5f5913bb835f5913bb83https://www.illumos.org/issues/9164
This issue has been reported by Alan Somers as
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225877
dmu_objset_refresh_ownership() first disowns a dataset (and releases
it) and then owns it again. There is an assert that the new dataset
object is the same as the old dataset object. When running ZFS Test
Suite on FreeBSD we see this panic from zpool_upgrade_007_pos test:
panic: solaris assert: newds == os->os_dsl_dataset (0xfffff80045f4c000
== 0xfffff80021ab4800)
I see that the old dataset has dsl_dataset_evict_async() pending in
ds_dbu.dbu_tqent and its ds_dbuf is NULL.
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Don Brady <don.brady@delphix.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Andriy Gapon <avg@FreeBSD.org>
PR: 225877
Reported by: asomers
MFC after: 1 week
It seems default timeout of 100ms is not enough for my 2694L card,
while it was perfectly fine for others, even for full-height 2694.
MFC after: 1 week
Sponsored by: iXsystems, Inc.
The NVME standard has required in section 7.2.6, since at least 1.1,
that a clean shutdown is signalled by deleting the subission and the
completion queues before setting the shutdown bit in CC. The 1.0
standard, apparently, did not and many of the early Intel cards didn't
care. Some newer cards care, at least one whose beta firmware can
scramble the card on an unclean shutdown. Linux has done this for some
time. To make it possible to move forward with an evaluation of this
pre-release card with wonky firmware, delete the queues on the card
when we delete the qpair structures.
Sponsored by: Netflix
We'll need to delete namespaces soon, so go ahead and stop making
these devices eternal. It doesn't help much, and will be getting in
the way soon.
Sponsored by: Netflix
It is believed that the conditions Coverity indicated were actually
impossible to hit. So this patch just adds a cleanup to only compute
v_mount once in brelse(), and in vfs_bio_getpages() always initializes error
to zero to appease the static analyzer.
No functional change intended.
Submitted by: Darrick Lew <darrick.freebsd AT gmail.com>
Reviewed by: kib
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D14613
During shutdown mps waits for its SSU requests to complete however when
performing a reboot after handling a panic the scheduler is stopped so
getmicrotime which is used can be non-functional.
Switch to using the same method as shutdown_panic to ensure we actually
complete.
In addition reduce the timeout when RB_NOSYNC is set in howto as we expect
this to fail.
Reviewed by: slm
MFC after: 1 week
Sponsored by: Multiplay
Differential Revision: https://reviews.freebsd.org/D12776
Update the ZFS TRIM code to ensure it respects VTOC8 partition headers as
documented by the ZFS On-Disk Specification section 1.3
Before this a zpool create on a VTOC8 partitioned device would overwrite the
partition metadata.
Reported by: marius
Reviewed by: marius agv
MFC after: 1 week
Sponsored by: Multiplay
Compare sbavail() with the cached sb_off of already-sent data instead of
always comparing with zero. This will correctly close the connection and
send the FIN if the socket buffer contains some previously-sent data but
no unsent data.
Reported by: Harsh Jain @ Chelsio
Sponsored by: Chelsio Communications
- Remove the one use of is_tls_offload() and the function. AIO special
handling only needs to be disabled when a TOE socket is actively doing
TLS offload on transmit. The TOE socket's mode (which affects receive
operation) doesn't matter, so remove the check for the socket's mode and
only check if a TOE socket has TLS transmit keys configured to determine
if an AIO write request should fall back to the normal socket handling
instead of the TOE fast path.
- Move can_tls_offload() into t4_tls.c. It is not used in critical paths,
so inlining isn't that important. Change return type to bool while here.
Sponsored by: Chelsio Communications
Before = dpv: <__func__>: posix_spawnp(3): No such file or directory
After = dpv: <path/cmd>: No such file or directory
Most notably, show the 2nd argument being passed to posix_spawnp(3)
so we know what path/cmd failed.
Also, we don't need to have "posix_spawnp(3)" in the error message
nor the function because that can [a] change and [b] traversed using
a debugger if necessary.
being marked "standard", which is less confusing than having it conditional
on AIM CPUs here, and then picked up through options FDT from conf/files
on Book-E.
Request by: jhibbits
kern.cam.{,a,n}da.X.invalidate=1 forces *daX to detach by calling
cam_periph_invalidate on the underlying periph. This is for testing
purposes only. Include only with options CAM_TEST_FAILURE and rename
the former [AN]DA_TEST_FAILURE, and fix nda to compile with it set.
We're using it at work to harden geom and the buffer cache to be
resilient in the face of drive failure. Today, it far too often
results in a panic. While much work was done on SIM initiated removal
for the USB thumnb drive removal work, little has been done for periph
initiated removal. This simulates what *daerror() does for some errors
nicely: we get the same panics with it that we do with failing drives.
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D14581