zstd is kernel code that was not supposed to be in libzfs.
libzpool provides userland shims for kernel code and is where the
zstd code needs to be included.
Reported by: John Kennedy
Discussed with: mmacy
Sponsored by: iXsystems, Inc.
The primary benefit is maintaining a completely shared
code base with the community allowing FreeBSD to receive
new features sooner and with less effort.
I would advise against doing 'zpool upgrade'
or creating indispensable pools using new
features until this change has had a month+
to soak.
Work on merging FreeBSD support in to what was
at the time "ZFS on Linux" began in August 2018.
I first publicly proposed transitioning FreeBSD
to (new) OpenZFS on December 18th, 2018. FreeBSD
support in OpenZFS was finally completed in December
2019. A CFT for downstreaming OpenZFS support in
to FreeBSD was first issued on July 8th. All issues
that were reported have been addressed or, for
a couple of less critical matters there are
pull requests in progress with OpenZFS. iXsystems
has tested and dogfooded extensively internally.
The TrueNAS 12 release is based on OpenZFS with
some additional features that have not yet made
it upstream.
Improvements include:
project quotas, encrypted datasets,
allocation classes, vectorized raidz,
vectorized checksums, various command line
improvements, zstd compression.
Thanks to those who have helped along the way:
Ryan Moeller, Allan Jude, Zack Welch, and many
others.
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D25872
To define USDT probes, dtrace -G makes use of relocations for undefined
symbols: the target address is overwritten with NOPs and the location is
recorded in the DOF section of the output object file. To avoid link
errors, the original relocation is destroyed. However, this means that
the same input object file cannot be processed multiple times, as
happens during incremental rebuilds. Instead, only set the relocation
type to NONE, so that all information required to reconstruct USDT
probes is preserved.
Reported by: bdrewery
MFC after: 3 weeks
Sponsored by: The FreeBSD Foundation
This file is too complicated as it is and has diverged a fair bit from
illumos due to toolchain differences, so just drop unused code
(including SPARC support).
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
ZFS unmounts a dataset while receiving into it and remounts it afterwards.
But if ZFS is resuming an incomplete receive, it screws up and ends up with
a dataset that is mounted, but returns EIO for every access. This commit
fixes that condition.
While the vulnerable code also exists in OpenZFS, the problem is not
reproducible there. Apparently OpenZFS doesn't unmount the destination
dataset during receive, like FreeBSD does.
PR: 248606
Reviewed by: mmacy
MFC after: 2 weeks
Sponsored by: Axcient
Differential Revision: https://reviews.freebsd.org/D26034
When zsh runs in POSIX sh mode it does not support the -e flag to echo.
Use printf instead of echo to avoid the "-e" characters being printed.
Obtained from: CheriBSD
Reviewed By: markj
Differential Revision: https://reviews.freebsd.org/D26026
Some of the scripts used for libdtrace invoke nawk instead of awk
(for example cddl/contrib/opensolaris/lib/libdtrace/common/mknames.sh).
When bootstrapping all tools, we get the nawk -> awk link while building
usr.bin/awk, but when linking/copying the dependencies from the host we
were only adding awk but not nawk.
This was silently generating invalid files when building libdtrace with
BUILD_WITH_STRICT_TMPPATH=1 since those scripts invoke nawk instead of
awk. In addition to adding the missing link this commit also adds
set -e to those scripts to catch errors like this in the future.
Reviewed By: markj, emaste
Differential Revision: https://reviews.freebsd.org/D26025
This does not appear to matter on FreeBSD or Linux, but when building an
amd64 kernel on macOS I was seeing infinite loops in ctfmerge.
It turns out the loop in wip_save_work() was looping forever due to
pthread_cond_wait() always returning -EINVAL.
Reviewed By: markj, brooks
Differential Revision: https://reviews.freebsd.org/D25973
We are building new bootonce mechanism (previously zfs bootnext) and it is
based on this OpenZFS change. Since this patch is nicely self contained,
I am commiting it as is, and we can stack our changes.
Original patch description follows:
Modern bootloaders leverage data stored in the root filesystem to
enable some of their powerful features. GRUB specifically has a grubenv
file which can store large amounts of configuration data that can be
read and written at boot time and during normal operation. This allows
sysadmins to configure useful features like automated failover after
failed boot attempts. Unfortunately, due to the Copy-on-Write nature
of ZFS, the standard behavior of these tools cannot handle writing to
ZFS files safely at boot time. We need an alternative way to store
data that allows the bootloader to make changes to the data.
This work is very similar to work that was done on Illumos to enable
similar functionality in the FreeBSD bootloader. This patch is different
in that the data being stored is a raw grubenv file; this file can store
arbitrary variables and values, and the scripting provided by grub is
powerful enough that special structures are not required to implement
advanced behavior.
We repurpose the second padding area in each label to store the grubenv
file, protected by an embedded checksum. We add two ioctls to get and
set this data, and libzfs_core and libzfs functions to access them more
easily. There are no direct command line interfaces to these functions;
these will be added directly to the bootloader utilities.
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes#10009
Obtained from: OpenZFS
Sponsored by: Netflix, Klara Inc.
In original implementation, zpool history will read the whole history
before printing anything, causing memory usage goes unbounded. We fix
this by breaking it into read-print iterations.
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes#9516
Note, this change changes the libzfs.so ABI by modifying the prototype
of zpool_get_history(). Since libzfs is effectively private to the base
system it is anticipated that this will not be a problem.
PR: 247557
Obtained from: OpenZFS
Reported and tested by: Sam Vaughan <samjvaughan@gmail.com>
Discussed with: freqlabs
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D25745openzfs/zfs@7125a109dc
Since r304321 (-current: Aug 18, 2016) and r328866 (stable/11: Feb 5, 2018)
the FreeBSD loader has supported reading from datasets with the
large_blocks feature active.
PR: 247992
Reported by: Anton Saietskii <vsasjason@gmail.com>
MFC after: 2 weeks
Sponsored by: Klara Inc.
This pattern is used in callbacks with void * data arguments and seems
both relatively uncommon and relatively harmless. Silence the warning
by casting through uintptr_t.
This warning is on by default in Clang 11.
MFC after: 3 days
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D24425
We want newer versions of libzfs_core to run against an existing
zfs kernel module (i.e. a deferred reboot or module reload after
an update).
Programmatically document, via a zfs_ioc_key_t, the valid arguments
for the ioc commands that rely on nvpair input arguments (i.e. non
legacy commands from libzfs_core). Automatically verify the expected
pairs before dispatching a command.
This initial phase focuses on the non-legacy ioctls. A follow-on
change can address the legacy ioctl input from the zfs_cmd_t.
The zfs_ioc_key_t for zfs_keys_channel_program looks like:
static const zfs_ioc_key_t zfs_keys_channel_program[] = {
{"program", DATA_TYPE_STRING, 0},
{"arg", DATA_TYPE_UNKNOWN, 0},
{"sync", DATA_TYPE_BOOLEAN_VALUE, ZK_OPTIONAL},
{"instrlimit", DATA_TYPE_UINT64, ZK_OPTIONAL},
{"memlimit", DATA_TYPE_UINT64, ZK_OPTIONAL},
};
Introduce four input errors to identify specific input failures
(in addition to generic argument value errors like EINVAL, ERANGE,
EBADF, and E2BIG).
ZFS_ERR_IOC_CMD_UNAVAIL the ioctl number is not supported by kernel
ZFS_ERR_IOC_ARG_UNAVAIL an input argument is not supported by kernel
ZFS_ERR_IOC_ARG_REQUIRED a required input argument is missing
ZFS_ERR_IOC_ARG_BADTYPE an input argument has an invalid type
Reviewed by: allanjude
Obtained from: OpenZFS
Sponsored by: Netflix, Klara Inc.
Differential Revision: https://reviews.freebsd.org/D25393
Keep link_map l_addr binary layout compatible, rename l_addr to l_base
where rtld returns map base. Provide relocbase in newly added l_addr.
This effectively reverts the patch to the initial version of D24918.
Reported by: antoine (portmgr)
Reviewed by: jhb, markj
Tested by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D24946
And that should work even (especially) if there is no matching user or
group name. This change allows to see and modify delegations for
deleted groups and users.
The change is originally by Xin Li.
illumos report: https://www.illumos.org/issues/6037
OpenZFS (ZoL) PR: https://github.com/openzfs/zfs/pull/10280
Obtained from: delphij
MFC after: 2 weeks
Currently when the dataset is in use we can't receive snapshots.
zfs send test/1@asd | zfs recv -FM test/2
cannot unmount '/test/2': Device busy
This commits add option 'M' which attempts to forcibly unmount the
dataset. Thanks to this we can enforce receiving snapshots in a
single step.
Note that this functionality is not supported on Linux because the
VFS will prevent active mounted filesystems from being unmounted,
even with the force option. This is the intended VFS behavior.
Discussed-with: Pawel Jakub Dawidek <pjd@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Allan Jude <allanjude@freebsd.org>
Differential Revision: https://reviews.freebsd.org/D22306openzfs/zfs@a57d3d45d6
zfs create, receive and rename can bypass this hierarchy rule. Update
both userland and kernel module to prevent this issue and use pyzfs
unit tests to exercise the ioctls directly.
Note: this commit slightly changes zfs_ioc_create() ABI. This allow to
differentiate a generic error (EINVAL) from the specific case where we
tried to create a dataset below a ZVOL (ZFS_ERR_WRONG_PARENT).
Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Approved by: mav (mentor)
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
openzfs/zfs@d8d418ff0c
We've observed that on some highly fragmented pools, most metaslab
allocations are small (~2-8KB), but there are some large, 128K
allocations. The large allocations are for ZIL blocks. If there is a
lot of fragmentation, the large allocations can be hard to satisfy.
The most common impact of this is that we need to check (and thus load)
lots of metaslabs from the ZIL allocation code path, causing sync writes
to wait for metaslabs to load, which can take a second or more. In the
worst case, we may not be able to satisfy the allocation, in which case
the ZIL will resort to txg_wait_synced() to ensure the change is on
disk.
To provide a workaround for this, this change adds a tunable that can
reduce the size of ZIL blocks.
External-issue: DLPX-61719
Reviewed-by: George Wilson <george.wilson@delphix.com>
Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes#8865openzfs/zfs@b8738257c2
MFC after: 2 weeks
This was the intent of the existing code, but instead it would
unconditionally load dtraceall.ko because of a stale errno value.
Reported by: pho
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
a single instance: use snd_recover also where sack_newdata was used.
Submitted by: Richard Scheffenegger
Differential Revision: https://reviews.freebsd.org/D18811
PPC64 ELFv2 acts like a "normal" platform in that it no longer needs
function descriptors. So, ensure we are only enabling them on ELFv1.
Additionally, ELFv2 requires that the ELF header have a nonzero e_flags,
so ensure that the synthesized ELF header in dt_link.c is setting it.
Reviewed by: jhibbits, markj
Approved by: gnn
Differential Revision: https://reviews.freebsd.org/D22403
By default, zpools may not be backed by zvols (that can be changed with the
"vfs.zfs.vol.recursive" sysctl). When that sysctl is set to 0, the kernel
does not attempt to read zvols when looking for vdevs. But the zpool command
still does. This change brings the zpool command into line with the kernel's
behavior. It speeds "zpool import" when an already imported pool has many
zvols, or a zvol with many snapshots.
PR: 241083
Reported by: Martin Birgmeier <d8zNeCFG@aon.at>
Reviewed by: mav, Ryan Moeller <ryan@freqlabs.com>
MFC after: 2 weeks
Sponsored by: Axcient
Differential Revision: https://reviews.freebsd.org/D22077
This will be used in libbe in place of the internal zmount(); libbe only
wants to be able to mount a dataset at an arbitrary mountpoint without
altering dataset/pool properties. The natural way to do this in a portable
way is by creating a zfs_mount_at() interface that's effectively zfs_mount()
+ a mountpoint parameter. zfs_mount() is now a light wrapper around the new
method.
The interface and implementation have already been accepted into ZFS On
Linux, and the next commit to switch libbe() over to this new interface will
solve the last compatibility issue with ZoL. The next sysutils/openzfs
rebase against ZoL should be able to build libbe/bectl with only minor
adjustments to build glue.
Reviewed by: Ryan Moeller <ryan freqlabs com>
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D23132
Previously libdtrace used ftok(3), which hashes the inode number of the
input object file. To increase reproducibility of builds that embed
USDT probes, include a hash of the object file path in the symbol name
instead.
Reported and tested by: bdrewery
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
arm64 is still lacking a fasttrap implementation, which is required to
actually enable userland probes, but this at least allows USDT probes to
be linked into userland applications.
Submitted by: Klaus Küchemann <maciphone2@googlemail.com> (original)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D22360
gcc9 grew a new warning for unbounded allocas, such as the one in
dt_options_load. Remove both uses of alloca in dt_options.c.
Reviewed by: markj
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D22880
Update a bunch of Makefile.depend files as
a result of adding Makefile.depend.options files
Reviewed by: bdrewery
MFC after: 1 week
Sponsored by: Juniper Networks
Differential Revision: https://reviews.freebsd.org/D22494
illumos/illumos-gate@555d674d5d555d674d5dhttps://www.illumos.org/issues/10592
This is a collection of recent fixes from ZoL:
8eef997679 Error path in metaslab_load_impl() forgets to drop ms_sync_lock
928e8ad47d Introduce auxiliary metaslab histograms
425d3237ee Get rid of space_map_update() for ms_synced_length
6c926f426a Simplify log vdev removal code
21e7cf5da8 zdb -L should skip leak detection altogether
df72b8bebe Rename range_tree_verify to range_tree_verify_not_present
75058f3303 Remove unused vdev_t fields
Portions contributed by: Jerry Jelinek <jerry.jelinek@joyent.com>
Author: Serapheim Dimitropoulos <serapheim@delphix.com>
MFC after: 4 weeks