freebsd-nq/man/man8/zstream.8
Matthew Ahrens c618f87cd2
Add zstream redup command to convert deduplicated send streams
Deduplicated send and receive is deprecated.  To ease migration to the
new dedup-send-less world, this commit adds a `zstream redup` utility to
convert deduplicated send streams to normal streams, so that they can
continue to be received indefinitely.

The new `zstream` command also replaces the functionality of
`zstreamdump`, by way of the `zstream dump` subcommand.  The
`zstreamdump` command is replaced by a shell script which invokes
`zstream dump`.

Under the hood, `zstream redup` works as follows: as we read the send
stream, we build up a hash table which maps from `<GUID, object,
offset> -> <file_offset>`.

Whenever we see a WRITE record, we add a new entry to the hash table,
which indicates where in the stream file to find the WRITE record for
this block. (The key is `drr_toguid, drr_object, drr_offset`.)

For records other than WRITE_BYREF, we pass them through unchanged
(except for the running checksum, which is recalculated).

For WRITE_BYREF records, we change them to WRITE records.  We find the
referenced WRITE record by looking in the hash table (for the record
with key `drr_refguid, drr_refobject, drr_refoffset`), and then reading
the record header and payload from the specified offset in the stream
file.  This is why the stream cannot be a pipe.  The found WRITE record
replaces the WRITE_BYREF record, with its `drr_toguid`, `drr_object`,
and `drr_offset` fields changed to be the same as the WRITE_BYREF's
(i.e. we are writing the same logical block, but with the data supplied
by the previous WRITE record).
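
The steps above can be sketched in a few lines of Python.  This is a toy
model, not the actual implementation: records are encoded as JSON lines
rather than the real send stream format, the `redup` and `demo` function
names and the lowercase field names are made up for illustration, and the
running checksum is not recalculated.  It only shows the hash table
lookup and the seek back into the stream file.

```python
import io
import json


def redup(stream, out):
    """Convert a toy deduplicated stream into a plain one.

    `stream` must be seekable (a regular file, not a pipe).
    Records are JSON lines, a stand-in for the real record format.
    Returns the number of WRITE_BYREF records converted.
    """
    # (guid, object, offset) -> file offset of the matching WRITE record
    write_offsets = {}
    converted = 0
    while True:
        pos = stream.tell()
        line = stream.readline()
        if not line:
            break
        rec = json.loads(line)
        if rec["type"] == "WRITE":
            key = (rec["toguid"], rec["object"], rec["offset"])
            write_offsets[key] = pos
        elif rec["type"] == "WRITE_BYREF":
            ref_key = (rec["refguid"], rec["refobject"], rec["refoffset"])
            resume = stream.tell()
            # Seek back to the referenced WRITE record and re-read it.
            stream.seek(write_offsets[ref_key])
            found = json.loads(stream.readline())
            stream.seek(resume)
            # Keep the referenced payload, but address the BYREF's block.
            found["toguid"] = rec["toguid"]
            found["object"] = rec["object"]
            found["offset"] = rec["offset"]
            rec = found
            converted += 1
        # Every other record type passes through unchanged.
        out.write(json.dumps(rec).encode() + b"\n")
    return converted


def demo():
    records = [
        {"type": "WRITE", "toguid": 1, "object": 5, "offset": 0,
         "data": "aaaa"},
        {"type": "FREE", "toguid": 1, "object": 5, "offset": 8192},
        {"type": "WRITE_BYREF", "toguid": 1, "object": 6, "offset": 4096,
         "refguid": 1, "refobject": 5, "refoffset": 0},
    ]
    src = io.BytesIO(b"".join(json.dumps(r).encode() + b"\n"
                              for r in records))
    dst = io.BytesIO()
    count = redup(src, dst)
    return count, [json.loads(l) for l in dst.getvalue().splitlines()]
```

Calling `demo()` runs the sketch on a three-record stream; the third
output record is a WRITE for object 6 that carries the payload of the
first WRITE.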

This algorithm requires memory proportional to the number of WRITE
records (same as `zfs send -D`), but the size per WRITE record is
relatively low (40 bytes, vs. 72 for `zfs send -D`).  A 1TB send stream
with 8KB blocks (`recordsize=8k`) would use around 5GB of RAM to
"redup".

Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #10124 
Closes #10156
2020-04-10 10:39:55 -07:00

.\"
.\" CDDL HEADER START
.\"
.\" The contents of this file are subject to the terms of the
.\" Common Development and Distribution License (the "License").
.\" You may not use this file except in compliance with the License.
.\"
.\" You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
.\" or http://www.opensolaris.org/os/licensing.
.\" See the License for the specific language governing permissions
.\" and limitations under the License.
.\"
.\" When distributing Covered Code, include this CDDL HEADER in each
.\" file and include the License file at usr/src/OPENSOLARIS.LICENSE.
.\" If applicable, add the following below this CDDL HEADER, with the
.\" fields enclosed by brackets "[]" replaced with your own identifying
.\" information: Portions Copyright [yyyy] [name of copyright owner]
.\"
.\" CDDL HEADER END
.\"
.\"
.\" Copyright (c) 2020 by Delphix. All rights reserved.
.Dd March 25, 2020
.Dt ZSTREAM 8
.Os Linux
.Sh NAME
.Nm zstream
.Nd manipulate zfs send streams
.Sh SYNOPSIS
.Nm
.Cm dump
.Op Fl Cvd
.Op Ar file
.Nm
.Cm redup
.Op Fl v
.Ar file
.Sh DESCRIPTION
The
.Nm
utility manipulates zfs send streams, which are the output of the
.Nm zfs Cm send
command.
.Bl -tag -width ""
.It Xo
.Nm
.Cm dump
.Op Fl Cvd
.Op Ar file
.Xc
Print information about the specified send stream, including headers and
record counts.
The send stream may either be in the specified
.Ar file ,
or provided on standard input.
.Bl -tag -width "-D"
.It Fl C
Suppress the validation of checksums.
.It Fl v
Verbose.
Print metadata for each record.
.It Fl d
Dump data contained in each record.
Implies verbose.
.El
.It Xo
.Nm
.Cm redup
.Op Fl v
.Ar file
.Xc
Deduplicated send streams can be generated by using the
.Nm zfs Cm send Fl D
command.
The ability to send deduplicated send streams is deprecated.
In the future, the ability to receive a deduplicated send stream with
.Nm zfs Cm receive
will be removed.
However, deduplicated send streams can still be received by first
converting them with
.Nm zstream Cm redup .
.Pp
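The
.Nm zstream Cm redup
command reads a deduplicated send stream from
.Ar file
and writes an equivalent non-deduplicated send stream to standard output.
The
.Ar file
must be a regular file rather than a pipe, because the conversion seeks
back to earlier records in the stream.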
Therefore, a deduplicated send stream can be received by running:
.Bd -literal
# zstream redup DEDUP_STREAM_FILE | zfs receive ...
.Ed
.Bl -tag -width "-D"
.It Fl v
Verbose.
Print summary of converted records.
.El
.El
.Sh SEE ALSO
.Xr zfs 8 ,
.Xr zfs-send 8 ,
.Xr zfs-receive 8