freebsd-dev

d/freebsd-dev

Fork 0

Commit Graph

Author	SHA1	Message	Date
Conrad Meyer	eefd8f96fb	geom_uzip(4), mkuzip(8): Add Zstd image mode The Zstd format bumps the CLOOP major number to 4 to avoid incompatibility with older systems. Support in geom_uzip(4) is conditional on the ZSTDIO kernel option, which is enabled in amd64 GENERIC, but not all in-tree configurations. mkuzip(8) was modified slightly to always initialize the nblocks + 1'th offset in the CLOOP file format. Previously, it was only initialized in the case where the final compressed block happened to be unaligned w.r.t. DEV_BSIZE. The "Fake" last+1 block change in r298619 means that the final compressed block's 'blen' was never correct unless the compressed uzip image happened to be BSIZE-aligned. This happened in about 1 out of every 512 cases. The zlib and lzma decompressors are probably tolerant of extra trash following the frame they were told to decode, but Zstd complains that the input size is incorrect. Correspondingly, geom_uzip(4) was modified slightly to avoid trashing the nblocks + 1'th offset when it is known to be initialized to a good value. This corrects the calculated final real cluster compressed length to match that printed by mkuzip(8). mkuzip(8) was refactored somewhat to reduce code duplication and increase ease of adding other compression formats. * Input block size validation was pulled out of individual compression init routines into main(). * Init routines now validate a user-provided compression level or select an algorithm-specific default, if none was provided. * A new interface for calculating the maximal compressed size of an incompressible input block was added for each driver. The generic code uses it to validate against MAXPHYS as well as to allocate compression result buffers in the generic code. * Algorithm selection is now driven by a table lookup, to increase ease of adding other formats in the future. mkuzip(8) gained the ability to explicitly specify a compression level with '-C'. The prior defaults -- 9 for zlib and 6 for lzma -- are maintained. The new zstd default is 9, to match zlib. Rather than select lzma or zlib with '-L' or its absense, respectively, a new argument '-A <algorithm>' is provided to select 'zlib', 'lzma', or 'zstd'. '-L' is considered deprecated, but will probably never be removed. All of the new features were documented in mkuzip.8; the page was also cleaned up slightly. Relnotes: yes	2019-08-13 23:32:56 +00:00
Maxim Sobolev	4fc55e3e46	Improve performance in a few key areas: o Split the compression across several worker threads. By default, "several" matches number of CPUs, capped at 24 for sanity when running on a very big hardwares. Provide option to set that number manually; o Fix bug inherited from the mkulzma (R.I.P) which degraded already slow LZMA compression even further by calling function to release compression state after processing each block. It is neither documented as required nor actually required by the LZMA library. This caused spree of system calls to release memory and then map it again for every block. LZMA compression is more than 2x faster after this change alone; o Record time it takes to do compression and report throughput achieved. o Add simple first-level 256 entry hash table for de-dup code, so it's not becoming a bottleneck at big files.	2016-04-23 07:23:43 +00:00

Author

SHA1

Message

Date

Conrad Meyer

eefd8f96fb

geom_uzip(4), mkuzip(8): Add Zstd image mode

The Zstd format bumps the CLOOP major number to 4 to avoid incompatibility
with older systems.  Support in geom_uzip(4) is conditional on the ZSTDIO
kernel option, which is enabled in amd64 GENERIC, but not all in-tree
configurations.

mkuzip(8) was modified slightly to always initialize the nblocks + 1'th
offset in the CLOOP file format.  Previously, it was only initialized in the
case where the final compressed block happened to be unaligned w.r.t.
DEV_BSIZE.  The "Fake" last+1 block change in r298619 means that the final
compressed block's 'blen' was never correct unless the compressed uzip image
happened to be BSIZE-aligned.  This happened in about 1 out of every 512
cases.  The zlib and lzma decompressors are probably tolerant of extra trash
following the frame they were told to decode, but Zstd complains that the
input size is incorrect.

Correspondingly, geom_uzip(4) was modified slightly to avoid trashing the
nblocks + 1'th offset when it is known to be initialized to a good value.
This corrects the calculated final real cluster compressed length to match
that printed by mkuzip(8).

mkuzip(8) was refactored somewhat to reduce code duplication and increase
ease of adding other compression formats.

  * Input block size validation was pulled out of individual compression
    init routines into main().

  * Init routines now validate a user-provided compression level or select
    an algorithm-specific default, if none was provided.

  * A new interface for calculating the maximal compressed size of an
    incompressible input block was added for each driver.  The generic code
    uses it to validate against MAXPHYS as well as to allocate compression
    result buffers in the generic code.

  * Algorithm selection is now driven by a table lookup, to increase ease of
    adding other formats in the future.

mkuzip(8) gained the ability to explicitly specify a compression level with
'-C'.  The prior defaults -- 9 for zlib and 6 for lzma -- are maintained.
The new zstd default is 9, to match zlib.

Rather than select lzma or zlib with '-L' or its absense, respectively, a
new argument '-A <algorithm>' is provided to select 'zlib', 'lzma', or
'zstd'.  '-L' is considered deprecated, but will probably never be removed.

All of the new features were documented in mkuzip.8; the page was also
cleaned up slightly.

Relnotes:	yes

2019-08-13 23:32:56 +00:00

Maxim Sobolev

4fc55e3e46

Improve performance in a few key areas:

o Split the compression across several worker threads. By default, "several"
   matches number of CPUs, capped at 24 for sanity when running on a very big
   hardwares. Provide option to set that number manually;

 o Fix bug inherited from the mkulzma (R.I.P) which degraded already slow LZMA
   compression even further by calling function to release compression state
   after processing each block.

   It is neither documented as required nor actually required by the LZMA
   library. This caused spree of system calls to release memory and then map
   it again for every block. LZMA compression is more than 2x faster after this
   change alone;

 o Record time it takes to do compression and report throughput achieved.

 o Add simple first-level 256 entry hash table for de-dup code, so it's not
   becoming a bottleneck at big files.

2016-04-23 07:23:43 +00:00

2 Commits