illumos/illumos-gate@d8584ba6fbd8584ba6fbhttps://www.illumos.org/issues/7990
The snapspec_cb() callback function in libzfs does not need to call zfs_strdup().
Reviewed by: Yuri Pankov <yuri.pankov@gmail.com>
Reviewed by: Toomas Soome <tsoome@me.com>
Approved by: Matthew Ahrens <mahrens@delphix.com>
Author: Marcel Telka <marcel@telka.sk>
illumos/illumos-gate@7c13517fff7c13517fffhttps://www.illumos.org/issues/7745
The problem is that consumers of `libZFS_Core` that forget to call
`libzfs_core_init()` before calling any other function of the library
are having a hard time realizing their mistake. The library's internal
file descriptor is declared as global static, which is ok, but it is not
initialized explicitly; therefore, it defaults to 0, which is a valid
file descriptor. If `libzfs_core_init()`, which explicitly initializes
the correct fd, is skipped, the ioctl functions return errors that do
not have anything to do with `libZFS_Core`, where the problem is
actually located.
Even though assertions for that existed within `libZFS_Core` for debug
builds, they were never enabled because the `-DDEBUG` flag was missing
from the compiler flags.
This patch applies the following changes:
1. It adds `-DDEBUG` for debug builds of `libZFS_Core` and `libzfs`,
to enable their assertions on debug builds.
2. It corrects an assertion within `libzfs`, where a function had
been spelled incorrectly (`zpool_prop_unsupported()`) and nobody
knew because the `-DDEBUG` flag was missing, and the preprocessor
was taking that part of the code away.
3. The library's internal fd is initialized to `-1` and `VERIFY`
assertions have been placed to check that the fd is not equal to
`-1` before issuing any ioctl. It is important here to note, that
the `VERIFY` assertions exist in both debug and non-debug builds.
4. In `libzfs_core_fini` we make sure to never increment the
refcount of our fd below 0, and also reset the fd to `-1` when no
one refers to it. The reason for this, is for the rare case that
the consumer closes all references but then calls one of the
library's functions without using `libzfs_core_init()` first, and
in the mean time, a previous call to `open()` decided to reuse
our previous fd. This scenario would have passed our assertion in
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Serapheim Dimitropoulos <serapheim@delphix.com>
7604 if volblocksize property is the default, it displays as "-" rather than 8K
illumos/illumos-gate@4d86c0eab24d86c0eab2https://www.illumos.org/issues/7604
If a zvol has the default setting for the "volblocksize" property, it is
8KB. However, it is displayed as "-" (not present), rather than "8K".
The problem was introduced by:
commit 25228e830e86924a41243343b1de9daf2d7dd43a
Author: Matthew Ahrens <mahrens@delphix.com>
Date: Thu Nov 17 14:37:24 2016 -0800
7571 non-present readonly numeric ZFS props do not have default value
which changed changed get_numeric_property() to indicate that readonly
default properties are not present. However, zfs_prop_readonly() returns
TRUE for both readonly and set-once properties (e.g. volblocksize).
Amusingly, that commit essentially reverted:
6900484 default volblocksize is no longer being reported correctly
from November 2009. However, that change was not correct either; the
correct solution is to only do this check for "truly readonly" (i.e. not
setonce) properties.
$ zfs list -t volume -o name,volblocksize
NAME
VOLBLOCK
domain0/group-100/appdata_container-101/appdata_windows_timeflow-102/
archive -
domain0/group-100/appdata_container-101/appdata_windows_timeflow-102/
datafile -
domain0/group-100/appdata_container-101/appdata_windows_timeflow-102/
external -
rpool/dump
128K
rpool/swap
4K
rpool/swap1
===============================================================================
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>
illumos/illumos-gate@09c9e6dc9b09c9e6dc9bhttps://www.illumos.org/issues/7542
libshare keeps a cached copy of the sharetab listing in memory, which can
become out of date if shares are destroyed or created while leaving a libzfs
handle open. This results in a spurious unmounting failure when an NFS share
exists but isn't in the stale libshare cache.
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matt Amdur <matt.amdur@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Chris Williamson <chris.williamson@delphix.com>
illumos/illumos-gate@873c4903a5873c4903a5https://www.illumos.org/issues/7336
We can run into a problem where we call into zfs_mount, which in turn calls
is_dir_empty, which opens the directory to try and make sure it's empty. The
issue with the current approach is that it holds the directory open while it
traverses it with readdir, which, due to subtle interaction with the Java JVM,
vfork, and exec can cause a tricky race condition resulting in zfs_mount
failures.
The approach to resolving the issue in this patch is to drop the usage of
readdir altogether, and instead rely on the fact that ZFS stores the number of
entries contained in a directory using the st_size field of the stat structure.
Thus, if the directory in question is a ZFS directory, we can check to see if
it's empty by calling stat() and inspecting the st_size field of structure
returned.
===============================================================================
The root cause appears to be an interesting race between vfork, exec, and
zfs_mount's usage of O_CLOEXEC when calling openat. Here's what is going on:
1. We call zfs_mount, and this in turn calls openat to check if the directory
is empty, which results in opening the directory we're trying to mount onto,
and increment v_count.
2. As we're in the middle of reading the directory, vfork is called by the JVM
and proceeds to exec the jspawnhelper utility. As a result of the vfork, we
take an additional hold on the directory, which increments v_count a second
time. The semantics of vfork mean the parent process will wait for the child
process to exit or exec before the parent can continue; at this point the
parent is in the middle of zfs_mount, reading the directory to determine if
it's empty or not.
3. The child process exec-ing jspawnhelper gets to the relvm call within
exec_args (which is called by exec_common). relvm is the function that releases
the parent process, allowing the parent to proceed. The problem is, at this
point of calling relvm, the child hasn't yet called close_exec which is
responsible for closing the file descriptors inherited from the parent process
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Prakash Surya <prakash.surya@delphix.com>
illumos/illumos-gate@d420209d9cd420209d9chttps://www.illumos.org/issues/7233
This fixes a race where one thread is executing zfs_mount() while another
thread forks and execs. If the fork occurs while the directory is open, the
child process will inherit (but not necessarily close immediately) the open fd
for the directory, preventing the mount.
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Alex Reece <alex@delphix.com>
illumos/illumos-gate@c3c65d17f7c3c65d17f7https://www.illumos.org/issues/7502
Right now ztest executes zdb without -G, so when it has errors, the messages
are often not very helpful:
Executing zdb -bccsv -d -U /rpool/tmp/zpool.cache ztest
zdb: can't open 'ztest': Operation not supported
ztest: '/usr/sbin/amd64/zdb -bccsv -d -U /rpool/tmp/zpool.cache ztest' exit
code 1
With -G, we'd have:
/usr/sbin/amd64/zdb -bccsv -d -U /rpool/tmp/zpool.cache -G ztest
zdb: can't open 'ztest': Operation not supported
ZFS_DBGMSG(zdb):
spa_open_common: opening ztest
spa_load(ztest): LOADING
spa_load(ztest): FAILED: unable to parse config [error=48]
spa_load(ztest): UNLOADING
Which indicates where the error came from
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Approved by: Gordon Ross <gordon.w.ross@gmail.com>
Author: Pavel Zakharov <pavel.zakharov@delphix.com>
Dmamap is created only on IFC attach. If we remove it on
buffer release, we won't be able to do ifconfig down&up. Only destroy
when in detach.
Reported by: wma
Reviewed by: wma
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D14060
Provisioned from MACHINE/MACHINE_ARCH on the system, expose loader.machine
and loader.machine_arch respectively.
These may be used to hide ACPI option on non-applicable archs.
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D14446
In the remaining error case, return a 3-tuple consistent with the other
error return case.
Document how to invoke lfs.attributes() and detect/decode error return in
example comments.
Reviewed by: kevans
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D14451
Remove the unused syscall_(de)register() functions in favor of the
better documented and easier to use syscall_helper_(un)register(9)
functions.
The default and freebsd32 versions differed in which array of struct
sysents they used and a few missing updates to the 32-bit code as
features were added to the main code.
Reviewed by: cem
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14337
Carousel storage doesn't need to happen in the menu module, and indeed
storing it there introduces a circular reference between drawer and menu
that only works because of global pollution in loader.lua.
Carousel choices generally map to config entries anyways, making it as good
of place as any to store these. Move {get,set}CarouselIndex functionality
out into config so that drawer and menu may both use it. If we had more
carousel functionality, it might make sense to create a carousel module, but
this is not the case.
If we failed to execute the input line as pure lua, run the command through
parse for consistent argument parsing. Pass the parsed arguments through to
a global "cli_execute" written in Lua, which is expected to either handle it
or pass it back through to interp_builtin_cmd (via loader.command).
lua-handled cli commands will then exist as globals in whatever module they
most belong in, and invocations at the loader prompt will magically dispatch
to them if they exist.
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D14450
We follow pretty closely the following structure of a module:
1. Copyright notice
2. Module requires
3. Module local declarations
4. Module local definitions
5. Module exports
6. return
Re-organize the one-offs (config/drawer) and denote the start of module
exports with a comment.
Declare these adjacent to the local definitions at the top of the module,
and make sure they're actually declared local to pollute global namespace a
little bit less.
Make vm_wait() take the vm_object argument which specifies the domain
set to wait for the min condition pass. If there is no object
associated with the wait, use curthread' policy domainset. The
mechanics of the wait in vm_wait() and vm_wait_domain() is supplied by
the new helper vm_wait_doms(), which directly takes the bitmask of the
domains to wait for passing min condition.
Eliminate pagedaemon_wait(). vm_domain_clear() handles the same
operations.
Eliminate VM_WAIT and VM_WAITPFAULT macros, the direct functions calls
are enough.
Eliminate several control state variables from vm_domain, unneeded
after the vm_wait() conversion.
Scetched and reviewed by: jeff
Tested by: pho
Sponsored by: The FreeBSD Foundation, Mellanox Technologies
Differential revision: https://reviews.freebsd.org/D14384
On POWER8 architecture there is a timer with 512Mhz frequency.
It has about 1,95ns period, but it is rounded to 1ns which is not accurate.
Submitted by: Patryk Duda <pdk@semihalf.com>
Obtained from: Semihalf
Reviewed by: wma
Sponsored by: IBM, QCM Technologies
Differential revision: https://reviews.freebsd.org/D14433
Currently Hypervisor Emulation Assistance interrupt is unhandled.
Executing an undefined instruction in userland triggers kernel panic.
Handle this the same way as Facility Unavailable Interrupt - send
SIGILL signal to userspace.
Submitted by: Michal Stanek <mst@semihalf.com>
Obtained from: Semihalf
Reviewed by: nwhitehorn, pdk@semihalf.com, wma
Sponsored by: IBM, QCM Technologies
Differential revision: https://reviews.freebsd.org/D14437
The intent of this guideline is to avoid creating global variables in module
scope. Its main purpose is to serve as a reminder that variables at module
scope also need to be declared.
We want to avoid global variables in general, but this is easier to mess up
when designing things in the module scope.
illumos/illumos-gate@48bbca816848bbca8168https://www.illumos.org/issues/7812
This change removes all gendered language that did not refer specifically
to an individual person or pet. The convention taken was to use
variations on "they" when referring to users and/or human beings, while
using "it" when referring to code, functions, and/or libraries.
Additionally, we took the liberty to fix up any whitespace issues that
were found in any files that were already being modified.
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
Reviewed by: Chris Williamson <chris.williamson@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Igor Kozhukhov <igor@dilos.org>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Daniel Hoffman <dj.hoffman@delphix.com>
This refactor makes it straightforward to add new logos for drawing and
organizes logo definitions in a logical manner.
The graphic to be drawn for each logo may again be modified outside of
drawer, but it must be done on a case-by-case basis as a modification to the
loader_logo.
As part of an effort to slowly reduce our exports overall to a set of stable
properties/functions, go ahead and reduce what drawer exposes to others.
The graphics should generally not be modified on their own, but their
position could be modified if additional grahics also need to be drawn.
Export position/shift information, but leave the actual graphic local to
the module.
The next step will be to create a 'menudef' that gets exported instead, with
each entry in the menudef table describing the graphic to be drawn along
with specific positioning information.
Pull out specialized naming behavior into drawer.menu_name_handlers. This is
currently only needed for the carousel entry, where naming is based on the
current choice and the menu item purposefully does not store this state.
The setup was designed so that every type can have a special name handler,
and the default action is to simply use the result of entry.name().
This is a bit cleaner than our former method of an if ... else chain of
handlers. Store handlers in the menu.handlers table so that they may be
added to or removed dynamically.
All handlers take the current menu and selected entry as parameters, and
their return value indicates whether the menu processor should continue or
not. An omitted return value or 'true' will indicate that we should
continue, while returning 'false' will indicate that we should exit the
current menu.
The omitted return value behavior is due to continuing the loop being the
more common situation.
There is a proctree -> allproc ordering established.
Most of the time it is either xlock -> xlock or slock -> slock.
On fork however there is a slock -> xlock pair which results in
pathological wait times due to threads keeping proctree held for
reading and all waiting on allproc. Switch this to xlock -> xlock.
Longer term fix would get rid of proctree in this place to begin with.
Right now it is necessary to walk the session/process group lists to
determine which id is free. The walk can be avoided e.g. with bitmaps.
The exit path used to have one place which dealt with allproc and
then with proctree. Move the allproc acquire into the section protected
by proctree. This reduces contention against threads waiting on proctree
in the fork codepath - the fork proctree holder does not have to wait
for allproc as often.
Finally, move tidhash manipulation outside of the area protected by
either of these locks. The removal from the hash was already unprotected.
There is no legitimate reason to look up thread ids for a process still
under construction.
This results in about 50% wait time reduction during -j 128 package build.
Provide multiple clean queues partitioned into 'domains'. Each domain manages
its own bufspace and has its own bufspace daemon. Each domain has a set of
subqueues indexed by the current cpuid to reduce lock contention on the cleanq.
Refine the sleep/wakeup around the bufspace daemon to use atomics as much as
possible.
Add a B_REUSE flag that is used to requeue bufs during the scan to approximate
LRU rather than locking the queue on every use of a frequently accessed buf.
Implement bufspace_reserve with only atomic_fetchadd to avoid loop restarts.
Reviewed by: markj
Tested by: pho
Sponsored by: Netflix, Dell/EMC Isilon
Differential Revision: https://reviews.freebsd.org/D14274