This PR adds two new compression types, based on ZStandard: - zstd: A basic ZStandard compression algorithm Available compression. Levels for zstd are zstd-1 through zstd-19, where the compression increases with every level, but speed decreases. - zstd-fast: A faster version of the ZStandard compression algorithm zstd-fast is basically a "negative" level of zstd. The compression decreases with every level, but speed increases. Available compression levels for zstd-fast: - zstd-fast-1 through zstd-fast-10 - zstd-fast-20 through zstd-fast-100 (in increments of 10) - zstd-fast-500 and zstd-fast-1000 For more information check the man page. Implementation details: Rather than treat each level of zstd as a different algorithm (as was done historically with gzip), the block pointer `enum zio_compress` value is simply zstd for all levels, including zstd-fast, since they all use the same decompression function. The compress= property (a 64bit unsigned integer) uses the lower 7 bits to store the compression algorithm (matching the number of bits used in a block pointer, as the 8th bit was borrowed for embedded block pointers). The upper bits are used to store the compression level. It is necessary to be able to determine what compression level was used when later reading a block back, so the concept used in LZ4, where the first 32bits of the on-disk value are the size of the compressed data (since the allocation is rounded up to the nearest ashift), was extended, and we store the version of ZSTD and the level as well as the compressed size. This value is returned when decompressing a block, so that if the block needs to be recompressed (L2ARC, nop-write, etc), that the same parameters will be used to result in the matching checksum. All of the internal ZFS code ( `arc_buf_hdr_t`, `objset_t`, `zio_prop_t`, etc.) uses the separated _compress and _complevel variables. Only the properties ZAP contains the combined/bit-shifted value. The combined value is split when the compression_changed_cb() callback is called, and sets both objset members (os_compress and os_complevel). The userspace tools all use the combined/bit-shifted value. Additional notes: zdb can now also decode the ZSTD compression header (flag -Z) and inspect the size, version and compression level saved in that header. For each record, if it is ZSTD compressed, the parameters of the decoded compression header get printed. ZSTD is included with all current tests and new tests are added as-needed. Per-dataset feature flags now get activated when the property is set. If a compression algorithm requires a feature flag, zfs activates the feature when the property is set, rather than waiting for the first block to be born. This is currently only used by zstd but can be extended as needed. Portions-Sponsored-By: The FreeBSD Foundation Co-authored-by: Allan Jude <allanjude@freebsd.org> Co-authored-by: Brian Behlendorf <behlendorf1@llnl.gov> Co-authored-by: Sebastian Gottschall <s.gottschall@dd-wrt.com> Co-authored-by: Kjeld Schouten-Lebbing <kjeld@schouten-lebbing.nl> Co-authored-by: Michael Niewöhner <foss@mniewoehner.de> Signed-off-by: Allan Jude <allan@klarasystems.com> Signed-off-by: Allan Jude <allanjude@freebsd.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Sebastian Gottschall <s.gottschall@dd-wrt.com> Signed-off-by: Kjeld Schouten-Lebbing <kjeld@schouten-lebbing.nl> Signed-off-by: Michael Niewöhner <foss@mniewoehner.de> Closes #6247 Closes #9024 Closes #10277 Closes #10278
ZSTD-On-ZFS Library Manual
Introduction
This subtree contains the ZSTD library used in ZFS. It is heavily cut-down by dropping any unneeded files, and combined into a single file, but otherwise is intentionally unmodified. Please do not alter the file containing the zstd library, besides upgrading to a newer ZSTD release.
Tree structure:
zfs_zstd.c
is the actualzzstd
kernel module.lib/
contains the the unmodified, "amalgamated" version of theZstandard
library, generated from our template filezstd-in.c
is our template file for generating the libraryinclude/
: This directory contains supplemental includes for platform compatibility, which are not expected to be used by ZFS elsewhere in the future. Thus we keep them private to ZSTD.
Updating ZSTD
To update ZSTD the following steps need to be taken:
- Grab the latest release of ZSTD.
- Update
module/zstd/zstd-in.c
if required. (seezstd/contrib/single_file_libs/zstd-in.c
in the zstd repository) - Generate the "single-file-library" and put it to
module/zstd/lib/
. - Copy the following files to
module/zstd/lib/
:zstd/lib/zstd.h
zstd/lib/common/zstd_errors.h
This can be done using a few shell commands from inside the zfs repo:
cd PATH/TO/ZFS
url="https://github.com/facebook/zstd"
release="$(curl -s "${url}"/releases/latest | grep -oP '(?<=v)[\d\.]+')"
zstd="/tmp/zstd-${release}/"
wget -O /tmp/zstd.tar.gz \
"${url}/releases/download/v${release}/zstd-${release}.tar.gz"
tar -C /tmp -xzf /tmp/zstd.tar.gz
cp ${zstd}/lib/zstd.h module/zstd/lib/
cp ${zstd}/lib/zstd_errors.h module/zstd/lib/
${zstd}/contrib/single_file_libs/combine.sh \
-r ${zstd}/lib -o module/zstd/lib/zstd.c module/zstd/zstd-in.c
Altering ZSTD and breaking changes
If ZSTD made changes that break compatibility or you need to make breaking changes to the way we handle ZSTD, it is required to maintain backwards compatibility.
We already save the ZSTD version number within the block header to be used to add future compatibility checks and/or fixes. However, currently it is not actually used in such a way.