The bhyve virtual local APIC uses an instance-global flag to indicate
when an error LVT is being delivered to prevent infinite recursion.
Use a function argument instead to reduce the amount of instance-global
state.
This was inspired by reviewing the bhyve save/restore work, which
saves a copy of the instance-global state for each vlapic.
Smart OS bug: https://smartos.org/bugview/OS-7777
Submitted by: Patrick Mooney
Reviewed by: markj, rgrimes
Obtained from: SmartOS / Joyent
Differential Revision: https://reviews.freebsd.org/D20365
When the pwm.9 manual was removed, a symlink between pwmbus.9 and pwm.9 was
created, but there's an entry in ObsoleteFiles.inc to remove pwn.9, meaning
that on every installation pwm.9 is created, and make delete-old deletes it.
Remove the entry from ObsoleteFiles.inc, the symlink is clearly intentional
and shouldn't be removed.
Reviewed by: imp, ian
Approved by: imp (implicit, review OK)
Differential Revision: https://reviews.freebsd.org/D21198
XPT_DEV_ADVINFO call should be protected by the lock of the specific
device it is addressed to, not the lock of SES device. In some weird
case, probably with hardware violating standards, it sometimes caused
NULL dereference due to race.
To protect from it further, add lock assertion to *_dev_advinfo().
MFC after: 1 week
Sponsored by: iXsystems, Inc.
This fixes a "timed sleep before timers are working" panic seen
while attaching jedec_dimm(4) instances too early in the boot.
Submitted by: ian
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D21452
This makes the media check process asynchronous, so we no longer block
in cdstrategy() to check for media.
PR: 219857
Obtained from: ken
MFC after: 3 weeks
Current implementation of vnode_create_vobject() and
vnode_destroy_vobject() is written so that it prepared to handle the
vm object destruction for live vnode. Practically, no filesystems use
this, except for some remnants that were present in UFS till today.
One of the consequences of that model is that each filesystem must
call vnode_destroy_vobject() in VOP_RECLAIM() or earlier, as result
all of them get rid of the v_object in reclaim.
Move the call to vnode_destroy_vobject() to vgonel() before
VOP_RECLAIM(). This makes v_object stable: either the object is NULL,
or it is valid vm object till the vnode reclamation. Remove code from
vnode_create_vobject() to handle races with the parallel destruction.
Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D21412
In ffs_valloc(), force reclaim existing vnode on inode reuse, instead
of trying to re-initialize the same vnode for new purposes. This is
done in preparation of changes to the vp->v_object lifecycle handling.
A new FFSV_REPLACE flag to ffs_vgetf() directs the function to
vgone(9) the vnode if found in vfs hash, instead of returning it.
Reviewed by: markj, mckusick
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D21412
Many extern struct pcpu <something>__pcpu declarations were
copied/pasted in sources. The issue is that the definition is MD, but
it cannot be provided by machine/pcpu.h due to actual struct pcpu
defined in sys/pcpu.h later than the inclusion of machine/pcpu.h.
This forced the copying when other code needed direct access to
__pcpu. There is no way around it, due to machine/pcpu.h supplying
part of struct pcpu fields.
To work around the problem, add a new machine/pcpu_aux.h header, which
should fill any needed MD definitions after struct pcpu definition is
completed. This allows to remove copies of __pcpu spread around the
source. Also on x86 it makes it possible to remove work arounds like
OFFSETOF_CURTHREAD or clang specific warnings supressions.
Reported and tested by: lwhsu, bcran
Reviewed by: imp, markj (previous version)
Discussed with: jhb
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D21418
Previously, the permissions were checked on the pool which was obviously
incorrect.
After this change, zfs_check_userprops() only validates the properties
without any permission checks. The permissions are checked individually
for each snapshotted dataset.
This was also committed to ZoL: zfsonlinux/zfs@e6203d2
Reported by: CyberSecure
MFC after: 1 week
Sponsored by: CyberSecure
The libxo xml feature of adding an annotation with the "original"
address from the utmpx file if it is different than the final "from"
field was broken by r351379. This was pointed out by the gcc error
that save_p might be used uninitialized. Save the original address
as needed in each entry, don't just use the last one from the previous
loop.
Reviewed by: marcel@
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D21390
vnode usecount drops to 0 all the time (e.g. for directories during path lookup).
When that happens the kernel would always lock the exclusive lock for the vnode
in order to call vinactive(). This blocks other threads who want to use the vnode
for looukp.
vinactive is very rarely needed and can be tested for without the vnode lock held.
This patch gives filesytems an opportunity to do it, sample total wait time for
tmpfs over 500 minutes of poudriere -j 104:
before: 557563641706 (lockmgr:tmpfs)
after: 46309603301 (lockmgr:tmpfs)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D21371
- LK macro (conditional on SMP for the lock prefix) is unused
- SETLK unnecessarily performs xchg. obtained value is never used and the
implicit lock prefix adds avoidable cost. Barrier provided by it does
not appear to be of any use.
- the lock waited for is almost never blocked, yet the loop starts with
a pause. Move it out of the common case.
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D19563
In compare(), all error cases set the error code to EPIPE, so when an
error is set, the correct assertion to make is that the error is EPIPE,
not EINVAL.
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@freqlabs.com>
Closes#8743zfsonlinux/zfs@9dc41a769d
Submitted by: Ryan Moeller <ryan@freqlabs.com>
MFC after: 1 week
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D20118
It is not needed by anything in the kernel and it slightly drives up contention
on both proctree and allproc locks.
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D21447
uiomove_object_page() and exec_map_first_page() would previously wire a
page after having grabbed it. Ask vm_page_grab() to perform the wiring
instead: this removes some redundant code, and is cheaper in the case
where the requested page is not resident since the page allocator can be
asked to initialize the page as wired, whereas a separate vm_page_wire()
call requires the page lock.
In vm_imgact_hold_page(), use vm_page_unwire_noq() instead of
vm_page_unwire(PQ_NONE). The latter ensures that the page is dequeued
before returning, but this is unnecessary since vm_page_free() will
trigger a batched dequeue of the page.
Reviewed by: alc, kib
Tested by: pho (part of a larger patch)
MFC after: 1 week
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D21440
* A small error in r338152 let to the returned size always being exactly
eight bytes too large.
* The FUSE_LISTXATTR operation works like Linux's listxattr(2): if the
caller does not provide enough space, then the server should return ERANGE
rather than return a truncated list. That's true even though in FUSE's
case the kernel doesn't provide space to the client at all; it simply
requests a maximum size for the list. We previously weren't handling the
case where the server returns ERANGE even though the kernel requested as
much size as the server had told us it needs; that can happen due to a
race.
* We also need to ensure that a pathological server that always returns
ERANGE no matter what size we request in FUSE_LISTXATTR won't cause an
infinite loop in the kernel. As of this commit, it will instead cause an
infinite loop that exits and enters the kernel on each iteration, allowing
signals to be processed.
Reviewed by: cem
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D21287
This field was not initialized in the !KERN_TLS case triggering an
assertion failure when using sendfile(2).
Reported by: pho, asomers
Sponsored by: Netflix
Warn when actual operations are performed instead of when sessions are
created. The /dev/crypto engine in OpenSSL 1.0.x tries to create
sessions for all possible algorithms each time it is initialized
resulting in spurious warnings.
Reported by: Mike Tancsa
MFC after: 3 days
Sponsored by: Chelsio Communications
This is part of the preparation to remove flags argument from VOP_UNLOCK.
Also has a side effect of fixing stacking on top of nullfs broken by r351472.
Reported by: cy
Sponsored by: The FreeBSD Foundation
The plan is to drop the flags argument. There is also a temporary bug
now that nullfs ignores the flag.
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D21252
Fix a problem which prevented -OServerSSLOptions or -OClientSSLOptions
specified in the command-line option from working.
This patch has been accepted by the upstream.
Reviewed by and discussed with: gshapiro
Since r351187 the pinctrl driver need to know the gpio bank as it
directly attach the gpio driver to handle some setup that might
be present in the dts, add the gpio banks table for rk3399.
While here fix some IOMUX definition that prevented to boot
on RK3399 as pinctrl wasn't configured correctly.
Submitted by: mmel (original version)
MFC after: 2 weeks
MFC With: r351187
Since r351187 the pinctrl driver need to know the gpio bank as it
directly attach the gpio driver to handle some setup that might
be present in the dts, add the gpio banks table for rk3328.
While here fix some IOMUX definition that prevented to boot
on RK3328 as pinctrl wasn't configured correctly.
Submitted by: mmel (original version)
MFC after: 2 weeks
MFC With: r351187
Even if we do not expect retries, we better be sure, since otherwise it
may result in use after free kernel panic. I've noticed that it retries
SCSI_STATUS_BUSY even with SF_NO_RECOVERY | SF_NO_RETRY.
MFC after: 1 week
Sponsored by: iXsystems, Inc.
- Don't add 1 to the result of DOMAINSET_FLS.
- Do not modify domainsets containing only empty domains.
- Always flatten a _PREFER policy to _ROUNDROBIN if the preferred
domain is empty. Previously we were doing this only when ds_cnt > 1.
These bugs could cause hangs during boot if a VM domain is empty.
Tested by: hselasky
Reviewed by: hselasky, kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D21420
This makes it possible to perform mathematical operations on
fractional values without using floating point. It operates on Q
numbers, which are integer-sized, opaque structures initialized
to hold a chosen number of integer and fractional bits.
For a general description of the Q number system, see the "Fixed Point
Representation & Fractional Math" whitepaper[1]; for the actual
API see the qmath(3) man page.
This is one of dependencies for the upcoming stats(3) framework[2]
that will be applied to the TCP stack in a later commit.
1. https://www.superkits.net/whitepapers/Fixed%20Point%20Representation%20&%20Fractional%20Math.pdf
2. https://reviews.freebsd.org/D20477
Reviewed by: bcr (man pages, earlier version), sef (earlier version)
Discussed with: cem, dteske, imp, lstewart
Sponsored By: Klara Inc, Netflix
Obtained from: Netflix
Differential Revision: https://reviews.freebsd.org/D20116