a1d477c24c
OpenZFS 7614 - zfs device evacuation/removal
OpenZFS 9064 - remove_mirror should wait for device removal to complete

This project allows top-level vdevs to be removed from the storage pool
with "zpool remove", reducing the total amount of storage in the pool.
This operation copies all allocated regions of the device to be removed
onto other devices, recording the mapping from old to new location.
After the removal is complete, read and free operations to the removed
(now "indirect") vdev must be remapped and performed at the new location
on disk. The indirect mapping table is kept in memory whenever the pool
is loaded, so there is minimal performance overhead when doing operations
on the indirect vdev.

The size of the in-memory mapping table will be reduced when its entries
become "obsolete" because they are no longer used by any block pointers
in the pool. An entry becomes obsolete when all the blocks that use it
are freed. An entry can also become obsolete when all the snapshots
that reference it are deleted, and the block pointers that reference it
have been "remapped" in all filesystems/zvols (and clones). Whenever an
indirect block is written, all the block pointers in it will be
"remapped" to their new (concrete) locations if possible. This process
can be accelerated by using the "zfs remap" command to proactively
rewrite all indirect blocks that reference indirect (removed) vdevs.

Note that when a device is removed, we do not verify the checksum of the
data that is copied. This makes the process much faster, but if it were
used on redundant vdevs (i.e. mirror or raidz vdevs), it would be
possible to copy the wrong data, when we have the correct data on e.g.
the other side of the mirror.

At the moment, only mirrors and simple top-level vdevs can be removed
and no removal is allowed if any of the top-level vdevs are raidz.

Porting Notes:

* Avoid zero-sized kmem_alloc() in vdev_compact_children().

  The device evacuation code adds a dependency that
  vdev_compact_children() be able to properly empty the vdev_child
  array by setting it to NULL and zeroing vdev_children. Under Linux,
  kmem_alloc() and related functions return a sentinel pointer rather
  than NULL for zero-sized allocations.

* Remove comment regarding "mpt" driver where zfs_remove_max_segment
  is initialized to SPA_MAXBLOCKSIZE.

  Change zfs_condense_indirect_commit_entry_delay_ticks to
  zfs_condense_indirect_commit_entry_delay_ms for consistency with
  most other tunables in which delays are specified in ms.

* ZTS changes:

    Use set_tunable rather than mdb
    Use zpool sync as appropriate
    Use sync_pool instead of sync
    Kill jobs during test_removal_with_operation to allow unmount/export
    Don't add non-disk names such as "mirror" or "raidz" to $DISKS
    Use $TEST_BASE_DIR instead of /tmp
    Increase HZ from 100 to 1000 which is more common on Linux

    removal_multiple_indirection.ksh
      Reduce iterations in order to not time out on the code
      coverage builders.

    removal_resume_export:
      Functionally, the test case is correct but there exists a race
      where the kernel thread hasn't been fully started yet and is
      not visible. Wait for up to 1 second for the removal thread
      to be started before giving up on it. Also, increase the
      amount of data copied in order that the removal not finish
      before the export has a chance to fail.

* MMP compatibility: the concept of concrete versus non-concrete devices
  has slightly changed the semantics of vdev_writeable(). Update
  mmp_random_leaf_impl() accordingly.

* Updated dbuf_remap() to handle the org.zfsonlinux:large_dnode pool
  feature which is not supported by OpenZFS.

* Added support for new vdev removal tracepoints.

* Test cases removal_with_zdb and removal_condense_export have been
  intentionally disabled. When run manually they pass as intended,
  but when running in the automated test environment they produce
  unreliable results on the latest Fedora release.

  They may work better once the upstream pool import refactoring is
  merged into ZoL at which point they will be re-enabled.

Authored by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Alex Reece <alex@delphix.com>
Reviewed-by: George Wilson <george.wilson@delphix.com>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Richard Laager <rlaager@wiktel.com>
Reviewed by: Tim Chase <tim@chase2k.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Approved by: Garrett D'Amore <garrett@damore.org>
Ported-by: Tim Chase <tim@chase2k.com>
Signed-off-by: Tim Chase <tim@chase2k.com>

OpenZFS-issue: https://www.illumos.org/issues/7614
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/f539f1eb
Closes #6900
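The remapping of reads and frees described in this commit message can be illustrated with a small C sketch. This is not the ZFS implementation: the `remap_entry_t` layout and the `remap_find()` and `remap_offset()` helpers are hypothetical names chosen for illustration, and the table is assumed to be an in-memory array sorted by source offset with non-overlapping regions. Under those assumptions, an offset on the removed vdev is translated by binary-searching for the covering entry and adding the distance into that region to the entry's destination offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified mapping entry; not the on-disk ZFS format. */
typedef struct remap_entry {
	uint64_t re_src_offset;	/* start offset on the removed vdev */
	uint64_t re_size;	/* length of the copied region */
	uint64_t re_dst_vdev;	/* id of the vdev that now holds the data */
	uint64_t re_dst_offset;	/* start offset on that vdev */
} remap_entry_t;

/*
 * Find the entry covering 'offset' in a table sorted by re_src_offset
 * whose regions do not overlap. Returns NULL if no entry covers it.
 */
const remap_entry_t *
remap_find(const remap_entry_t *table, size_t count, uint64_t offset)
{
	size_t lo = 0, hi = count;

	assert(table != NULL || count == 0);

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		const remap_entry_t *re = &table[mid];

		if (offset < re->re_src_offset)
			hi = mid;		/* look in the lower half */
		else if (offset - re->re_src_offset >= re->re_size)
			lo = mid + 1;		/* look in the upper half */
		else
			return (re);		/* offset is inside this region */
	}
	return (NULL);
}

/*
 * Translate an offset on the removed vdev to its new location.
 * Returns 0 on success, or -1 if no mapping entry covers the offset.
 */
int
remap_offset(const remap_entry_t *table, size_t count, uint64_t offset,
    uint64_t *dst_vdev, uint64_t *dst_offset)
{
	const remap_entry_t *re = remap_find(table, count, offset);

	if (re == NULL)
		return (-1);
	*dst_vdev = re->re_dst_vdev;
	*dst_offset = re->re_dst_offset + (offset - re->re_src_offset);
	return (0);
}
```

Because the sketch keeps the whole table sorted in memory, each lookup costs a logarithmic number of comparisons, which matches the commit message's point that operations on the indirect vdev add little overhead. Dropping obsolete entries would shrink `count` in this model.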
1232 lines
36 KiB
C
/*
 * CDDL HEADER START
 *
 * The contents of this file are subject to the terms of the
 * Common Development and Distribution License (the "License").
 * You may not use this file except in compliance with the License.
 *
 * You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
 * or http://www.opensolaris.org/os/licensing.
 * See the License for the specific language governing permissions
 * and limitations under the License.
 *
 * When distributing Covered Code, include this CDDL HEADER in each
 * file and include the License file at usr/src/OPENSOLARIS.LICENSE.
 * If applicable, add the following below this CDDL HEADER, with the
 * fields enclosed by brackets "[]" replaced with your own identifying
 * information: Portions Copyright [yyyy] [name of copyright owner]
 *
 * CDDL HEADER END
 */

/*
 * Copyright (c) 2012, 2017 by Delphix. All rights reserved.
 * Copyright (c) 2013 Steven Hartland. All rights reserved.
 * Copyright (c) 2017 Datto Inc.
 * Copyright 2017 RackTop Systems.
 * Copyright (c) 2017 Open-E, Inc. All Rights Reserved.
 */

/*
 * LibZFS_Core (lzc) is intended to replace most functionality in libzfs.
 * It has the following characteristics:
 *
 *  - Thread Safe. libzfs_core is accessible concurrently from multiple
 *    threads. This is accomplished primarily by avoiding global data
 *    (e.g. caching). Since it's thread-safe, there is no reason for a
 *    process to have multiple libzfs "instances". Therefore, we store
 *    our few pieces of data (e.g. the file descriptor) in global
 *    variables. The fd is reference-counted so that the libzfs_core
 *    library can be "initialized" multiple times (e.g. by different
 *    consumers within the same process).
 *
 *  - Committed Interface. The libzfs_core interface will be committed,
 *    therefore consumers can compile against it and be confident that
 *    their code will continue to work on future releases of this code.
 *    Currently, the interface is Evolving (not Committed), but we intend
 *    to commit to it once it is more complete and we determine that it
 *    meets the needs of all consumers.
 *
 *  - Programmatic Error Handling. libzfs_core communicates errors with
 *    defined error numbers, and doesn't print anything to stdout/stderr.
 *
 *  - Thin Layer. libzfs_core is a thin layer, marshaling arguments
 *    to/from the kernel ioctls. There is generally a 1:1 correspondence
 *    between libzfs_core functions and ioctls to /dev/zfs.
 *
 *  - Clear Atomicity. Because libzfs_core functions are generally 1:1
 *    with kernel ioctls, and kernel ioctls are generally atomic, each
 *    libzfs_core function is atomic. For example, creating multiple
 *    snapshots with a single call to lzc_snapshot() is atomic -- it
 *    can't fail with only some of the requested snapshots created, even
 *    in the event of power loss or system crash.
 *
 *  - Continued libzfs Support. Some higher-level operations (e.g.
 *    support for "zfs send -R") are too complicated to fit the scope of
 *    libzfs_core. This functionality will continue to live in libzfs.
 *    Where appropriate, libzfs will use the underlying atomic operations
 *    of libzfs_core. For example, libzfs may implement "zfs send -R |
 *    zfs receive" by using individual "send one snapshot", rename,
 *    destroy, and "receive one snapshot" operations in libzfs_core.
 *    /sbin/zfs and /sbin/zpool will link with both libzfs and
 *    libzfs_core. Other consumers should aim to use only libzfs_core,
 *    since that will be the supported, stable interface going forwards.
 */

#include <libzfs_core.h>
#include <ctype.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <sys/nvpair.h>
#include <sys/param.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/zfs_ioctl.h>

static int g_fd = -1;
static pthread_mutex_t g_lock = PTHREAD_MUTEX_INITIALIZER;
static int g_refcount;

int
libzfs_core_init(void)
{
	(void) pthread_mutex_lock(&g_lock);
	if (g_refcount == 0) {
		g_fd = open("/dev/zfs", O_RDWR);
		if (g_fd < 0) {
			(void) pthread_mutex_unlock(&g_lock);
			return (errno);
		}
	}
	g_refcount++;
	(void) pthread_mutex_unlock(&g_lock);
	return (0);
}

void
libzfs_core_fini(void)
{
	(void) pthread_mutex_lock(&g_lock);
	ASSERT3S(g_refcount, >, 0);

	if (g_refcount > 0)
		g_refcount--;

	if (g_refcount == 0 && g_fd != -1) {
		(void) close(g_fd);
		g_fd = -1;
	}
	(void) pthread_mutex_unlock(&g_lock);
}

static int
lzc_ioctl(zfs_ioc_t ioc, const char *name,
    nvlist_t *source, nvlist_t **resultp)
{
	zfs_cmd_t zc = {"\0"};
	int error = 0;
	char *packed = NULL;
	size_t size = 0;

	ASSERT3S(g_refcount, >, 0);
	VERIFY3S(g_fd, !=, -1);

	if (name != NULL)
		(void) strlcpy(zc.zc_name, name, sizeof (zc.zc_name));

	if (source != NULL) {
		packed = fnvlist_pack(source, &size);
		zc.zc_nvlist_src = (uint64_t)(uintptr_t)packed;
		zc.zc_nvlist_src_size = size;
	}

	if (resultp != NULL) {
		*resultp = NULL;
		if (ioc == ZFS_IOC_CHANNEL_PROGRAM) {
			zc.zc_nvlist_dst_size = fnvlist_lookup_uint64(source,
			    ZCP_ARG_MEMLIMIT);
		} else {
			zc.zc_nvlist_dst_size = MAX(size * 2, 128 * 1024);
		}
		zc.zc_nvlist_dst = (uint64_t)(uintptr_t)
		    malloc(zc.zc_nvlist_dst_size);
		if (zc.zc_nvlist_dst == (uint64_t)0) {
			error = ENOMEM;
			goto out;
		}
	}

	while (ioctl(g_fd, ioc, &zc) != 0) {
		/*
		 * If ioctl exited with ENOMEM, we retry the ioctl after
		 * increasing the size of the destination nvlist.
		 *
		 * Channel programs that exit with ENOMEM ran over the
		 * lua memory sandbox; they should not be retried.
		 */
		if (errno == ENOMEM && resultp != NULL &&
		    ioc != ZFS_IOC_CHANNEL_PROGRAM) {
			free((void *)(uintptr_t)zc.zc_nvlist_dst);
			zc.zc_nvlist_dst_size *= 2;
			zc.zc_nvlist_dst = (uint64_t)(uintptr_t)
			    malloc(zc.zc_nvlist_dst_size);
			if (zc.zc_nvlist_dst == (uint64_t)0) {
				error = ENOMEM;
				goto out;
			}
		} else {
			error = errno;
			break;
		}
	}
	if (zc.zc_nvlist_dst_filled) {
		*resultp = fnvlist_unpack((void *)(uintptr_t)zc.zc_nvlist_dst,
		    zc.zc_nvlist_dst_size);
	}

out:
	if (packed != NULL)
		fnvlist_pack_free(packed, size);
	free((void *)(uintptr_t)zc.zc_nvlist_dst);
	return (error);
}

int
lzc_create(const char *fsname, enum lzc_dataset_type type, nvlist_t *props,
    uint8_t *wkeydata, uint_t wkeylen)
{
	int error;
	nvlist_t *hidden_args = NULL;
	nvlist_t *args = fnvlist_alloc();

	fnvlist_add_int32(args, "type", (dmu_objset_type_t)type);
	if (props != NULL)
		fnvlist_add_nvlist(args, "props", props);

	if (wkeydata != NULL) {
		hidden_args = fnvlist_alloc();
		fnvlist_add_uint8_array(hidden_args, "wkeydata", wkeydata,
		    wkeylen);
		fnvlist_add_nvlist(args, ZPOOL_HIDDEN_ARGS, hidden_args);
	}

	error = lzc_ioctl(ZFS_IOC_CREATE, fsname, args, NULL);
	nvlist_free(hidden_args);
	nvlist_free(args);
	return (error);
}

int
lzc_clone(const char *fsname, const char *origin, nvlist_t *props)
{
	int error;
	nvlist_t *hidden_args = NULL;
	nvlist_t *args = fnvlist_alloc();

	fnvlist_add_string(args, "origin", origin);
	if (props != NULL)
		fnvlist_add_nvlist(args, "props", props);
	error = lzc_ioctl(ZFS_IOC_CLONE, fsname, args, NULL);
	nvlist_free(hidden_args);
	nvlist_free(args);
	return (error);
}

int
lzc_promote(const char *fsname, char *snapnamebuf, int snapnamelen)
{
	/*
	 * The promote ioctl is still legacy, so we need to construct our
	 * own zfs_cmd_t rather than using lzc_ioctl().
	 */
	zfs_cmd_t zc = { "\0" };

	ASSERT3S(g_refcount, >, 0);
	VERIFY3S(g_fd, !=, -1);

	(void) strlcpy(zc.zc_name, fsname, sizeof (zc.zc_name));
	if (ioctl(g_fd, ZFS_IOC_PROMOTE, &zc) != 0) {
		int error = errno;
		if (error == EEXIST && snapnamebuf != NULL)
			(void) strlcpy(snapnamebuf, zc.zc_string, snapnamelen);
		return (error);
	}
	return (0);
}

int
lzc_remap(const char *fsname)
{
	int error;
	nvlist_t *args = fnvlist_alloc();
	error = lzc_ioctl(ZFS_IOC_REMAP, fsname, args, NULL);
	nvlist_free(args);
	return (error);
}

/*
 * Creates snapshots.
 *
 * The keys in the snaps nvlist are the snapshots to be created.
 * They must all be in the same pool.
 *
 * The props nvlist is properties to set. Currently only user properties
 * are supported. { user:prop_name -> string value }
 *
 * The returned results nvlist will have an entry for each snapshot that failed.
 * The value will be the (int32) error code.
 *
 * The return value will be 0 if all snapshots were created, otherwise it will
 * be the errno of an (unspecified) snapshot that failed.
 */
int
lzc_snapshot(nvlist_t *snaps, nvlist_t *props, nvlist_t **errlist)
{
	nvpair_t *elem;
	nvlist_t *args;
	int error;
	char pool[ZFS_MAX_DATASET_NAME_LEN];

	*errlist = NULL;

	/* determine the pool name */
	elem = nvlist_next_nvpair(snaps, NULL);
	if (elem == NULL)
		return (0);
	(void) strlcpy(pool, nvpair_name(elem), sizeof (pool));
	pool[strcspn(pool, "/@")] = '\0';

	args = fnvlist_alloc();
	fnvlist_add_nvlist(args, "snaps", snaps);
	if (props != NULL)
		fnvlist_add_nvlist(args, "props", props);

	error = lzc_ioctl(ZFS_IOC_SNAPSHOT, pool, args, errlist);
	nvlist_free(args);

	return (error);
}

/*
 * Destroys snapshots.
 *
 * The keys in the snaps nvlist are the snapshots to be destroyed.
 * They must all be in the same pool.
 *
 * Snapshots that do not exist will be silently ignored.
 *
 * If 'defer' is not set, and a snapshot has user holds or clones, the
 * destroy operation will fail and none of the snapshots will be
 * destroyed.
 *
 * If 'defer' is set, and a snapshot has user holds or clones, it will be
 * marked for deferred destruction, and will be destroyed when the last hold
 * or clone is removed/destroyed.
 *
 * The return value will be 0 if all snapshots were destroyed (or marked for
 * later destruction if 'defer' is set) or didn't exist to begin with.
 *
 * Otherwise the return value will be the errno of an (unspecified) snapshot
 * that failed, no snapshots will be destroyed, and the errlist will have an
 * entry for each snapshot that failed. The value in the errlist will be
 * the (int32) error code.
 */
int
lzc_destroy_snaps(nvlist_t *snaps, boolean_t defer, nvlist_t **errlist)
{
	nvpair_t *elem;
	nvlist_t *args;
	int error;
	char pool[ZFS_MAX_DATASET_NAME_LEN];

	/* determine the pool name */
	elem = nvlist_next_nvpair(snaps, NULL);
	if (elem == NULL)
		return (0);
	(void) strlcpy(pool, nvpair_name(elem), sizeof (pool));
	pool[strcspn(pool, "/@")] = '\0';

	args = fnvlist_alloc();
	fnvlist_add_nvlist(args, "snaps", snaps);
	if (defer)
		fnvlist_add_boolean(args, "defer");

	error = lzc_ioctl(ZFS_IOC_DESTROY_SNAPS, pool, args, errlist);
	nvlist_free(args);

	return (error);
}

int
lzc_snaprange_space(const char *firstsnap, const char *lastsnap,
    uint64_t *usedp)
{
	nvlist_t *args;
	nvlist_t *result;
	int err;
	char fs[ZFS_MAX_DATASET_NAME_LEN];
	char *atp;

	/* determine the fs name */
	(void) strlcpy(fs, firstsnap, sizeof (fs));
	atp = strchr(fs, '@');
	if (atp == NULL)
		return (EINVAL);
	*atp = '\0';

	args = fnvlist_alloc();
	fnvlist_add_string(args, "firstsnap", firstsnap);

	err = lzc_ioctl(ZFS_IOC_SPACE_SNAPS, lastsnap, args, &result);
	nvlist_free(args);
	if (err == 0)
		*usedp = fnvlist_lookup_uint64(result, "used");
	fnvlist_free(result);

	return (err);
}

boolean_t
lzc_exists(const char *dataset)
{
	/*
	 * The objset_stats ioctl is still legacy, so we need to construct our
	 * own zfs_cmd_t rather than using lzc_ioctl().
	 */
	zfs_cmd_t zc = {"\0"};

	ASSERT3S(g_refcount, >, 0);
	VERIFY3S(g_fd, !=, -1);

	(void) strlcpy(zc.zc_name, dataset, sizeof (zc.zc_name));
	return (ioctl(g_fd, ZFS_IOC_OBJSET_STATS, &zc) == 0);
}

/*
 * outnvl is unused.
 * It was added to preserve the function signature in case it is
 * needed in the future.
 */
/*ARGSUSED*/
int
lzc_sync(const char *pool_name, nvlist_t *innvl, nvlist_t **outnvl)
{
	return (lzc_ioctl(ZFS_IOC_POOL_SYNC, pool_name, innvl, NULL));
}

/*
 * Create "user holds" on snapshots. If there is a hold on a snapshot,
 * the snapshot can not be destroyed. (However, it can be marked for deletion
 * by lzc_destroy_snaps(defer=B_TRUE).)
 *
 * The keys in the nvlist are snapshot names.
 * The snapshots must all be in the same pool.
 * The value is the name of the hold (string type).
 *
 * If cleanup_fd is not -1, it must be the result of open("/dev/zfs", O_EXCL).
 * In this case, when the cleanup_fd is closed (including on process
 * termination), the holds will be released. If the system is shut down
 * uncleanly, the holds will be released when the pool is next opened
 * or imported.
 *
 * Holds for snapshots which don't exist will be skipped and have an entry
 * added to errlist, but will not cause an overall failure.
 *
 * The return value will be 0 if all holds, for snapshots that existed,
 * were successfully created.
 *
 * Otherwise the return value will be the errno of an (unspecified) hold that
 * failed and no holds will be created.
 *
 * In all cases the errlist will have an entry for each hold that failed
 * (name = snapshot), with its value being the error code (int32).
 */
int
lzc_hold(nvlist_t *holds, int cleanup_fd, nvlist_t **errlist)
{
	char pool[ZFS_MAX_DATASET_NAME_LEN];
	nvlist_t *args;
	nvpair_t *elem;
	int error;

	/* determine the pool name */
	elem = nvlist_next_nvpair(holds, NULL);
	if (elem == NULL)
		return (0);
	(void) strlcpy(pool, nvpair_name(elem), sizeof (pool));
	pool[strcspn(pool, "/@")] = '\0';

	args = fnvlist_alloc();
	fnvlist_add_nvlist(args, "holds", holds);
	if (cleanup_fd != -1)
		fnvlist_add_int32(args, "cleanup_fd", cleanup_fd);

	error = lzc_ioctl(ZFS_IOC_HOLD, pool, args, errlist);
	nvlist_free(args);
	return (error);
}

/*
 * Release "user holds" on snapshots. If the snapshot has been marked for
 * deferred destroy (by lzc_destroy_snaps(defer=B_TRUE)), it does not have
 * any clones, and all the user holds are removed, then the snapshot will be
 * destroyed.
 *
 * The keys in the nvlist are snapshot names.
 * The snapshots must all be in the same pool.
 * The value is an nvlist whose keys are the holds to remove.
 *
 * Holds which failed to release because they didn't exist will have an entry
 * added to errlist, but will not cause an overall failure.
 *
 * The return value will be 0 if the holds nvlist was empty or all holds that
 * existed were successfully removed.
 *
 * Otherwise the return value will be the errno of an (unspecified) hold that
 * failed to release and no holds will be released.
 *
 * In all cases the errlist will have an entry for each hold that failed
 * to release.
 */
int
lzc_release(nvlist_t *holds, nvlist_t **errlist)
{
	char pool[ZFS_MAX_DATASET_NAME_LEN];
	nvpair_t *elem;

	/* determine the pool name */
	elem = nvlist_next_nvpair(holds, NULL);
	if (elem == NULL)
		return (0);
	(void) strlcpy(pool, nvpair_name(elem), sizeof (pool));
	pool[strcspn(pool, "/@")] = '\0';

	return (lzc_ioctl(ZFS_IOC_RELEASE, pool, holds, errlist));
}

/*
 * Retrieve list of user holds on the specified snapshot.
 *
 * On success, *holdsp will be set to an nvlist which the caller must free.
 * The keys are the names of the holds, and the value is the creation time
 * of the hold (uint64) in seconds since the epoch.
 */
int
lzc_get_holds(const char *snapname, nvlist_t **holdsp)
{
	return (lzc_ioctl(ZFS_IOC_GET_HOLDS, snapname, NULL, holdsp));
}

/*
 * Generate a zfs send stream for the specified snapshot and write it to
 * the specified file descriptor.
 *
 * "snapname" is the full name of the snapshot to send (e.g. "pool/fs@snap")
 *
 * If "from" is NULL, a full (non-incremental) stream will be sent.
 * If "from" is non-NULL, it must be the full name of a snapshot or
 * bookmark to send an incremental from (e.g. "pool/fs@earlier_snap" or
 * "pool/fs#earlier_bmark"). If non-NULL, the specified snapshot or
 * bookmark must represent an earlier point in the history of "snapname".
 * It can be an earlier snapshot in the same filesystem or zvol as "snapname",
 * or it can be the origin of "snapname"'s filesystem, or an earlier
 * snapshot in the origin, etc.
 *
 * "fd" is the file descriptor to write the send stream to.
 *
 * If "flags" contains LZC_SEND_FLAG_LARGE_BLOCK, the stream is permitted
 * to contain DRR_WRITE records with drr_length > 128K, and DRR_OBJECT
 * records with drr_blksz > 128K.
 *
 * If "flags" contains LZC_SEND_FLAG_EMBED_DATA, the stream is permitted
 * to contain DRR_WRITE_EMBEDDED records with drr_etype==BP_EMBEDDED_TYPE_DATA,
 * which the receiving system must support (as indicated by support
 * for the "embedded_data" feature).
 */
int
lzc_send(const char *snapname, const char *from, int fd,
    enum lzc_send_flags flags)
{
	return (lzc_send_resume(snapname, from, fd, flags, 0, 0));
}

int
lzc_send_resume(const char *snapname, const char *from, int fd,
    enum lzc_send_flags flags, uint64_t resumeobj, uint64_t resumeoff)
{
	nvlist_t *args;
	int err;

	args = fnvlist_alloc();
	fnvlist_add_int32(args, "fd", fd);
	if (from != NULL)
		fnvlist_add_string(args, "fromsnap", from);
	if (flags & LZC_SEND_FLAG_LARGE_BLOCK)
		fnvlist_add_boolean(args, "largeblockok");
	if (flags & LZC_SEND_FLAG_EMBED_DATA)
		fnvlist_add_boolean(args, "embedok");
	if (flags & LZC_SEND_FLAG_COMPRESS)
		fnvlist_add_boolean(args, "compressok");
	if (flags & LZC_SEND_FLAG_RAW)
		fnvlist_add_boolean(args, "rawok");
	if (resumeobj != 0 || resumeoff != 0) {
		fnvlist_add_uint64(args, "resume_object", resumeobj);
		fnvlist_add_uint64(args, "resume_offset", resumeoff);
	}
	err = lzc_ioctl(ZFS_IOC_SEND_NEW, snapname, args, NULL);
	nvlist_free(args);
	return (err);
}

/*
 * "from" can be NULL, a snapshot, or a bookmark.
 *
 * If from is NULL, a full (non-incremental) stream will be estimated. This
 * is calculated very efficiently.
 *
 * If from is a snapshot, lzc_send_space uses the deadlists attached to
 * each snapshot to efficiently estimate the stream size.
 *
 * If from is a bookmark, the indirect blocks in the destination snapshot
 * are traversed, looking for blocks with a birth time since the creation TXG of
 * the snapshot this bookmark was created from. This will result in
 * significantly more I/O and be less efficient than a send space estimation on
 * an equivalent snapshot.
 */
int
lzc_send_space(const char *snapname, const char *from,
    enum lzc_send_flags flags, uint64_t *spacep)
{
	nvlist_t *args;
	nvlist_t *result;
	int err;

	args = fnvlist_alloc();
	if (from != NULL)
		fnvlist_add_string(args, "from", from);
	if (flags & LZC_SEND_FLAG_LARGE_BLOCK)
		fnvlist_add_boolean(args, "largeblockok");
	if (flags & LZC_SEND_FLAG_EMBED_DATA)
		fnvlist_add_boolean(args, "embedok");
	if (flags & LZC_SEND_FLAG_COMPRESS)
		fnvlist_add_boolean(args, "compressok");
	if (flags & LZC_SEND_FLAG_RAW)
		fnvlist_add_boolean(args, "rawok");
	err = lzc_ioctl(ZFS_IOC_SEND_SPACE, snapname, args, &result);
	nvlist_free(args);
	if (err == 0)
		*spacep = fnvlist_lookup_uint64(result, "space");
	nvlist_free(result);
	return (err);
}

static int
recv_read(int fd, void *buf, int ilen)
{
	char *cp = buf;
	int rv;
	int len = ilen;

	do {
		rv = read(fd, cp, len);
		cp += rv;
		len -= rv;
	} while (rv > 0);

	if (rv < 0 || len != 0)
		return (EIO);

	return (0);
}

/*
 * Linux adds ZFS_IOC_RECV_NEW for resumable and raw streams and preserves the
 * legacy ZFS_IOC_RECV user/kernel interface. The new interface supports all
 * stream options but is currently only used for resumable streams. This way
 * updated user space utilities will interoperate with older kernel modules.
 *
 * Non-Linux OpenZFS platforms have opted to modify the legacy interface.
 */
static int
recv_impl(const char *snapname, nvlist_t *recvdprops, nvlist_t *localprops,
    const char *origin, boolean_t force, boolean_t resumable, boolean_t raw,
    int input_fd, const dmu_replay_record_t *begin_record, int cleanup_fd,
    uint64_t *read_bytes, uint64_t *errflags, uint64_t *action_handle,
    nvlist_t **errors)
{
	dmu_replay_record_t drr;
	char fsname[MAXPATHLEN];
	char *atp;
	int error;

	ASSERT3S(g_refcount, >, 0);
	VERIFY3S(g_fd, !=, -1);

	/* Set 'fsname' to the name of containing filesystem */
	(void) strlcpy(fsname, snapname, sizeof (fsname));
	atp = strchr(fsname, '@');
	if (atp == NULL)
		return (EINVAL);
	*atp = '\0';

	/* If the fs does not exist, try its parent. */
	if (!lzc_exists(fsname)) {
		char *slashp = strrchr(fsname, '/');
		if (slashp == NULL)
			return (ENOENT);
		*slashp = '\0';
	}

	/*
	 * The begin_record is normally a non-byteswapped BEGIN record.
	 * For resumable streams it may be set to any non-byteswapped
	 * dmu_replay_record_t.
	 */
	if (begin_record == NULL) {
		error = recv_read(input_fd, &drr, sizeof (drr));
		if (error != 0)
			return (error);
	} else {
		drr = *begin_record;
	}

	if (resumable || raw) {
		nvlist_t *outnvl = NULL;
		nvlist_t *innvl = fnvlist_alloc();

		fnvlist_add_string(innvl, "snapname", snapname);

		if (recvdprops != NULL)
			fnvlist_add_nvlist(innvl, "props", recvdprops);

		if (localprops != NULL)
			fnvlist_add_nvlist(innvl, "localprops", localprops);

		if (origin != NULL && strlen(origin))
			fnvlist_add_string(innvl, "origin", origin);

		fnvlist_add_byte_array(innvl, "begin_record",
		    (uchar_t *)&drr, sizeof (drr));

		fnvlist_add_int32(innvl, "input_fd", input_fd);

		if (force)
			fnvlist_add_boolean(innvl, "force");

		if (resumable)
			fnvlist_add_boolean(innvl, "resumable");

		if (cleanup_fd >= 0)
			fnvlist_add_int32(innvl, "cleanup_fd", cleanup_fd);

		if (action_handle != NULL)
			fnvlist_add_uint64(innvl, "action_handle",
			    *action_handle);

		error = lzc_ioctl(ZFS_IOC_RECV_NEW, fsname, innvl, &outnvl);

		if (error == 0 && read_bytes != NULL)
			error = nvlist_lookup_uint64(outnvl, "read_bytes",
			    read_bytes);

		if (error == 0 && errflags != NULL)
			error = nvlist_lookup_uint64(outnvl, "error_flags",
			    errflags);

		if (error == 0 && action_handle != NULL)
			error = nvlist_lookup_uint64(outnvl, "action_handle",
			    action_handle);

		if (error == 0 && errors != NULL) {
			nvlist_t *nvl;
			error = nvlist_lookup_nvlist(outnvl, "errors", &nvl);
			if (error == 0)
				*errors = fnvlist_dup(nvl);
		}

		fnvlist_free(innvl);
		fnvlist_free(outnvl);
	} else {
		zfs_cmd_t zc = {"\0"};
		char *packed = NULL;
		size_t size;

		ASSERT3S(g_refcount, >, 0);

		/* Bound each copy by the size of its own destination field. */
		(void) strlcpy(zc.zc_name, fsname, sizeof (zc.zc_name));
		(void) strlcpy(zc.zc_value, snapname, sizeof (zc.zc_value));

		if (recvdprops != NULL) {
			packed = fnvlist_pack(recvdprops, &size);
			zc.zc_nvlist_src = (uint64_t)(uintptr_t)packed;
			zc.zc_nvlist_src_size = size;
		}

		if (localprops != NULL) {
			packed = fnvlist_pack(localprops, &size);
			zc.zc_nvlist_conf = (uint64_t)(uintptr_t)packed;
			zc.zc_nvlist_conf_size = size;
		}

		if (origin != NULL)
			(void) strlcpy(zc.zc_string, origin,
			    sizeof (zc.zc_string));

		ASSERT3S(drr.drr_type, ==, DRR_BEGIN);
		zc.zc_begin_record = drr.drr_u.drr_begin;
		zc.zc_guid = force;
		zc.zc_cookie = input_fd;
		zc.zc_cleanup_fd = -1;
		zc.zc_action_handle = 0;

		if (cleanup_fd >= 0)
			zc.zc_cleanup_fd = cleanup_fd;

		if (action_handle != NULL)
			zc.zc_action_handle = *action_handle;

		zc.zc_nvlist_dst_size = 128 * 1024;
		zc.zc_nvlist_dst = (uint64_t)(uintptr_t)
		    malloc(zc.zc_nvlist_dst_size);

		error = ioctl(g_fd, ZFS_IOC_RECV, &zc);
		if (error != 0) {
			error = errno;
		} else {
			if (read_bytes != NULL)
				*read_bytes = zc.zc_cookie;

			if (errflags != NULL)
				*errflags = zc.zc_obj;

			if (action_handle != NULL)
				*action_handle = zc.zc_action_handle;

			if (errors != NULL)
				VERIFY0(nvlist_unpack(
				    (void *)(uintptr_t)zc.zc_nvlist_dst,
				    zc.zc_nvlist_dst_size, errors, KM_SLEEP));
		}

		if (packed != NULL)
			fnvlist_pack_free(packed, size);
		free((void *)(uintptr_t)zc.zc_nvlist_dst);
	}

	return (error);
}

/*
|
|
* The simplest receive case: receive from the specified fd, creating the
|
|
* specified snapshot. Apply the specified properties as "received" properties
|
|
* (which can be overridden by locally-set properties). If the stream is a
|
|
* clone, its origin snapshot must be specified by 'origin'. The 'force'
|
|
* flag will cause the target filesystem to be rolled back or destroyed if
|
|
* necessary to receive.
|
|
*
|
|
* Return 0 on success or an errno on failure.
|
|
*
|
|
* Note: this interface does not work on dedup'd streams
|
|
* (those with DMU_BACKUP_FEATURE_DEDUP).
|
|
*/
|
|
int
|
|
lzc_receive(const char *snapname, nvlist_t *props, const char *origin,
|
|
boolean_t force, boolean_t raw, int fd)
|
|
{
|
|
return (recv_impl(snapname, props, NULL, origin, force, B_FALSE, raw,
|
|
fd, NULL, -1, NULL, NULL, NULL, NULL));
|
|
}
|
|
|
|
/*
 * Like lzc_receive, but if the receive fails due to premature stream
 * termination, the intermediate state will be preserved on disk. In this
 * case, ECKSUM will be returned. The receive may subsequently be resumed
 * with a resuming send stream generated by lzc_send_resume().
 */
int
lzc_receive_resumable(const char *snapname, nvlist_t *props, const char *origin,
    boolean_t force, boolean_t raw, int fd)
{
	return (recv_impl(snapname, props, NULL, origin, force, B_TRUE, raw,
	    fd, NULL, -1, NULL, NULL, NULL, NULL));
}

/*
 * Like lzc_receive, but allows the caller to read the begin record and then to
 * pass it in. That could be useful if the caller wants to derive, for example,
 * the snapname or the origin parameters based on the information contained in
 * the begin record.
 * The begin record must be in its original form as read from the stream,
 * in other words, it should not be byteswapped.
 *
 * The 'resumable' parameter provides the same behavior as
 * lzc_receive_resumable.
 */
int
lzc_receive_with_header(const char *snapname, nvlist_t *props,
    const char *origin, boolean_t force, boolean_t resumable, boolean_t raw,
    int fd, const dmu_replay_record_t *begin_record)
{
	if (begin_record == NULL)
		return (EINVAL);

	return (recv_impl(snapname, props, NULL, origin, force, resumable, raw,
	    fd, begin_record, -1, NULL, NULL, NULL, NULL));
}

/*
 * Like lzc_receive, but allows the caller to pass all supported arguments
 * and retrieve all values returned. The only additional input parameter
 * is 'cleanup_fd' which is used to set a cleanup-on-exit file descriptor.
 *
 * The following parameters all provide return values. Several may be set
 * in the failure case and will contain additional information.
 *
 * The 'read_bytes' value will be set to the total number of bytes read.
 *
 * The 'errflags' value will contain zprop_errflags_t flags which are
 * used to describe any failures.
 *
 * The 'action_handle' is used to pass the handle for this guid/ds mapping.
 * It should be set to zero on the first call and will contain an updated
 * handle on success, which should be passed in subsequent calls.
 *
 * The 'errors' nvlist contains an entry for each unapplied received
 * property. Callers are responsible for freeing this nvlist.
 */
int
lzc_receive_one(const char *snapname, nvlist_t *props,
    const char *origin, boolean_t force, boolean_t resumable, boolean_t raw,
    int input_fd, const dmu_replay_record_t *begin_record, int cleanup_fd,
    uint64_t *read_bytes, uint64_t *errflags, uint64_t *action_handle,
    nvlist_t **errors)
{
	return (recv_impl(snapname, props, NULL, origin, force, resumable,
	    raw, input_fd, begin_record, cleanup_fd, read_bytes, errflags,
	    action_handle, errors));
}

/*
 * Like lzc_receive_one, but allows the caller to pass an additional 'cmdprops'
 * argument.
 *
 * The 'cmdprops' nvlist contains both override ('zfs receive -o') and
 * exclude ('zfs receive -x') properties. Callers are responsible for freeing
 * this nvlist.
 */
int
lzc_receive_with_cmdprops(const char *snapname, nvlist_t *props,
    nvlist_t *cmdprops, const char *origin, boolean_t force,
    boolean_t resumable, boolean_t raw, int input_fd,
    const dmu_replay_record_t *begin_record, int cleanup_fd,
    uint64_t *read_bytes, uint64_t *errflags, uint64_t *action_handle,
    nvlist_t **errors)
{
	return (recv_impl(snapname, props, cmdprops, origin, force, resumable,
	    raw, input_fd, begin_record, cleanup_fd, read_bytes, errflags,
	    action_handle, errors));
}

/*
 * Roll back this filesystem or volume to its most recent snapshot.
 * If snapnamebuf is not NULL, it will be filled in with the name
 * of the most recent snapshot.
 * Note that the latest snapshot may change if a new one is concurrently
 * created or the current one is destroyed. lzc_rollback_to can be used
 * to roll back to a specific latest snapshot.
 *
 * Return 0 on success or an errno on failure.
 */
int
lzc_rollback(const char *fsname, char *snapnamebuf, int snapnamelen)
{
	nvlist_t *args;
	nvlist_t *result;
	int err;

	args = fnvlist_alloc();
	err = lzc_ioctl(ZFS_IOC_ROLLBACK, fsname, args, &result);
	nvlist_free(args);
	if (err == 0 && snapnamebuf != NULL) {
		const char *snapname = fnvlist_lookup_string(result, "target");
		(void) strlcpy(snapnamebuf, snapname, snapnamelen);
	}
	nvlist_free(result);

	return (err);
}

/*
 * Roll back this filesystem or volume to the specified snapshot,
 * if possible.
 *
 * Return 0 on success or an errno on failure.
 */
int
lzc_rollback_to(const char *fsname, const char *snapname)
{
	nvlist_t *args;
	nvlist_t *result;
	int err;

	args = fnvlist_alloc();
	fnvlist_add_string(args, "target", snapname);
	err = lzc_ioctl(ZFS_IOC_ROLLBACK, fsname, args, &result);
	nvlist_free(args);
	nvlist_free(result);
	return (err);
}

/*
 * Creates bookmarks.
 *
 * The bookmarks nvlist maps from name of the bookmark (e.g. "pool/fs#bmark") to
 * the name of the snapshot (e.g. "pool/fs@snap"). All the bookmarks and
 * snapshots must be in the same pool.
 *
 * The returned results nvlist will have an entry for each bookmark that failed.
 * The value will be the (int32) error code.
 *
 * The return value will be 0 if all bookmarks were created, otherwise it will
 * be the errno of an (undetermined) bookmark that failed.
 */
int
lzc_bookmark(nvlist_t *bookmarks, nvlist_t **errlist)
{
	nvpair_t *elem;
	int error;
	char pool[ZFS_MAX_DATASET_NAME_LEN];

	/* determine the pool name */
	elem = nvlist_next_nvpair(bookmarks, NULL);
	if (elem == NULL)
		return (0);
	(void) strlcpy(pool, nvpair_name(elem), sizeof (pool));
	pool[strcspn(pool, "/#")] = '\0';

	error = lzc_ioctl(ZFS_IOC_BOOKMARK, pool, bookmarks, errlist);

	return (error);
}

/*
 * Retrieve bookmarks.
 *
 * Retrieve the list of bookmarks for the given file system. The props
 * parameter is an nvlist of property names (with no values) that will be
 * returned for each bookmark.
 *
 * The following are valid properties on bookmarks, all of which are numbers
 * (represented as uint64 in the nvlist):
 *
 * "guid" - globally unique identifier of the snapshot it refers to
 * "createtxg" - txg when the snapshot it refers to was created
 * "creation" - timestamp when the snapshot it refers to was created
 *
 * The format of the returned nvlist is as follows:
 * <short name of bookmark> -> {
 *     <name of property> -> {
 *         "value" -> uint64
 *     }
 * }
 */
int
lzc_get_bookmarks(const char *fsname, nvlist_t *props, nvlist_t **bmarks)
{
	return (lzc_ioctl(ZFS_IOC_GET_BOOKMARKS, fsname, props, bmarks));
}

/*
 * Destroys bookmarks.
 *
 * The keys in the bmarks nvlist are the bookmarks to be destroyed.
 * They must all be in the same pool. Bookmarks are specified as
 * <fs>#<bmark>.
 *
 * Bookmarks that do not exist will be silently ignored.
 *
 * The return value will be 0 if all bookmarks that existed were destroyed.
 *
 * Otherwise the return value will be the errno of an (undetermined) bookmark
 * that failed, no bookmarks will be destroyed, and the errlist will have an
 * entry for each bookmark that failed. The value in the errlist will be
 * the (int32) error code.
 */
int
lzc_destroy_bookmarks(nvlist_t *bmarks, nvlist_t **errlist)
{
	nvpair_t *elem;
	int error;
	char pool[ZFS_MAX_DATASET_NAME_LEN];

	/* determine the pool name */
	elem = nvlist_next_nvpair(bmarks, NULL);
	if (elem == NULL)
		return (0);
	(void) strlcpy(pool, nvpair_name(elem), sizeof (pool));
	pool[strcspn(pool, "/#")] = '\0';

	error = lzc_ioctl(ZFS_IOC_DESTROY_BOOKMARKS, pool, bmarks, errlist);

	return (error);
}

static int
lzc_channel_program_impl(const char *pool, const char *program, boolean_t sync,
    uint64_t instrlimit, uint64_t memlimit, nvlist_t *argnvl, nvlist_t **outnvl)
{
	int error;
	nvlist_t *args;

	args = fnvlist_alloc();
	fnvlist_add_string(args, ZCP_ARG_PROGRAM, program);
	fnvlist_add_nvlist(args, ZCP_ARG_ARGLIST, argnvl);
	fnvlist_add_boolean_value(args, ZCP_ARG_SYNC, sync);
	fnvlist_add_uint64(args, ZCP_ARG_INSTRLIMIT, instrlimit);
	fnvlist_add_uint64(args, ZCP_ARG_MEMLIMIT, memlimit);
	error = lzc_ioctl(ZFS_IOC_CHANNEL_PROGRAM, pool, args, outnvl);
	fnvlist_free(args);

	return (error);
}

/*
 * Executes a channel program.
 *
 * If this function returns 0 the channel program was successfully loaded and
 * ran without failing. Note that individual commands the channel program ran
 * may have failed and the channel program is responsible for reporting such
 * errors through outnvl if they are important.
 *
 * This method may also return:
 *
 * EINVAL   The program contains syntax errors, or an invalid memory or time
 *          limit was given. No part of the channel program was executed.
 *          If caused by syntax errors, 'outnvl' contains information about the
 *          errors.
 *
 * ECHRNG   The program was executed, but encountered a runtime error, such as
 *          calling a function with incorrect arguments, invoking the error()
 *          function directly, failing an assert() command, etc. Some portion
 *          of the channel program may have executed and committed changes.
 *          Information about the failure can be found in 'outnvl'.
 *
 * ENOMEM   The program fully executed, but the output buffer was not large
 *          enough to store the returned value. No output is returned through
 *          'outnvl'.
 *
 * ENOSPC   The program was terminated because it exceeded its memory usage
 *          limit. Some portion of the channel program may have executed and
 *          committed changes to disk. No output is returned through 'outnvl'.
 *
 * ETIME    The program was terminated because it exceeded its Lua instruction
 *          limit. Some portion of the channel program may have executed and
 *          committed changes to disk. No output is returned through 'outnvl'.
 */
int
lzc_channel_program(const char *pool, const char *program, uint64_t instrlimit,
    uint64_t memlimit, nvlist_t *argnvl, nvlist_t **outnvl)
{
	return (lzc_channel_program_impl(pool, program, B_TRUE, instrlimit,
	    memlimit, argnvl, outnvl));
}

/*
 * Executes a read-only channel program.
 *
 * A read-only channel program works programmatically the same way as a
 * normal channel program executed with lzc_channel_program(). The only
 * difference is it runs exclusively in open-context and therefore can
 * return faster. The downside is that the program cannot change
 * on-disk state by calling functions from the zfs.sync submodule.
 *
 * The return values of this function (and their meaning) are exactly the
 * same as the ones described in lzc_channel_program().
 */
int
lzc_channel_program_nosync(const char *pool, const char *program,
    uint64_t instrlimit, uint64_t memlimit, nvlist_t *argnvl, nvlist_t **outnvl)
{
	return (lzc_channel_program_impl(pool, program, B_FALSE, instrlimit,
	    memlimit, argnvl, outnvl));
}

/*
 * Performs key management functions.
 *
 * crypto_cmd should be a value from zfs_ioc_crypto_cmd_t. If the command
 * specifies to load or change a wrapping key, the key should be specified in
 * the hidden_args nvlist so that it is not logged.
 */
int
lzc_load_key(const char *fsname, boolean_t noop, uint8_t *wkeydata,
    uint_t wkeylen)
{
	int error;
	nvlist_t *ioc_args;
	nvlist_t *hidden_args;

	if (wkeydata == NULL)
		return (EINVAL);

	ioc_args = fnvlist_alloc();
	hidden_args = fnvlist_alloc();
	fnvlist_add_uint8_array(hidden_args, "wkeydata", wkeydata, wkeylen);
	fnvlist_add_nvlist(ioc_args, ZPOOL_HIDDEN_ARGS, hidden_args);
	if (noop)
		fnvlist_add_boolean(ioc_args, "noop");
	error = lzc_ioctl(ZFS_IOC_LOAD_KEY, fsname, ioc_args, NULL);
	nvlist_free(hidden_args);
	nvlist_free(ioc_args);

	return (error);
}

int
lzc_unload_key(const char *fsname)
{
	return (lzc_ioctl(ZFS_IOC_UNLOAD_KEY, fsname, NULL, NULL));
}

int
lzc_change_key(const char *fsname, uint64_t crypt_cmd, nvlist_t *props,
    uint8_t *wkeydata, uint_t wkeylen)
{
	int error;
	nvlist_t *ioc_args = fnvlist_alloc();
	nvlist_t *hidden_args = NULL;

	fnvlist_add_uint64(ioc_args, "crypt_cmd", crypt_cmd);

	if (wkeydata != NULL) {
		hidden_args = fnvlist_alloc();
		fnvlist_add_uint8_array(hidden_args, "wkeydata", wkeydata,
		    wkeylen);
		fnvlist_add_nvlist(ioc_args, ZPOOL_HIDDEN_ARGS, hidden_args);
	}

	if (props != NULL)
		fnvlist_add_nvlist(ioc_args, "props", props);

	error = lzc_ioctl(ZFS_IOC_CHANGE_KEY, fsname, ioc_args, NULL);
	nvlist_free(hidden_args);
	nvlist_free(ioc_args);

	return (error);
}

int
lzc_reopen(const char *pool_name, boolean_t scrub_restart)
{
	nvlist_t *args = fnvlist_alloc();
	int error;

	fnvlist_add_boolean_value(args, "scrub_restart", scrub_restart);

	error = lzc_ioctl(ZFS_IOC_POOL_REOPEN, pool_name, args, NULL);
	nvlist_free(args);
	return (error);
}