Commit Graph

47 Commits

Author SHA1 Message Date
cem
604e65334e geom_uzip(4), mkuzip(8): Add Zstd image mode
The Zstd format bumps the CLOOP major number to 4 to avoid incompatibility
with older systems.  Support in geom_uzip(4) is conditional on the ZSTDIO
kernel option, which is enabled in amd64 GENERIC, but not all in-tree
configurations.

mkuzip(8) was modified slightly to always initialize the nblocks + 1'th
offset in the CLOOP file format.  Previously, it was only initialized in the
case where the final compressed block happened to be unaligned w.r.t.
DEV_BSIZE.  The "Fake" last+1 block change in r298619 means that the final
compressed block's 'blen' was never correct unless the compressed uzip image
happened to be BSIZE-aligned.  This happened in about 1 out of every 512
cases.  The zlib and lzma decompressors are probably tolerant of extra trash
following the frame they were told to decode, but Zstd complains that the
input size is incorrect.

Correspondingly, geom_uzip(4) was modified slightly to avoid trashing the
nblocks + 1'th offset when it is known to be initialized to a good value.
This corrects the calculated final real cluster compressed length to match
that printed by mkuzip(8).

mkuzip(8) was refactored somewhat to reduce code duplication and increase
ease of adding other compression formats.

  * Input block size validation was pulled out of individual compression
    init routines into main().

  * Init routines now validate a user-provided compression level or select
    an algorithm-specific default, if none was provided.

  * A new interface for calculating the maximal compressed size of an
    incompressible input block was added for each driver.  The generic code
    uses it to validate against MAXPHYS as well as to allocate compression
    result buffers in the generic code.

  * Algorithm selection is now driven by a table lookup, to increase ease of
    adding other formats in the future.

mkuzip(8) gained the ability to explicitly specify a compression level with
'-C'.  The prior defaults -- 9 for zlib and 6 for lzma -- are maintained.
The new zstd default is 9, to match zlib.

Rather than select lzma or zlib with '-L' or its absense, respectively, a
new argument '-A <algorithm>' is provided to select 'zlib', 'lzma', or
'zstd'.  '-L' is considered deprecated, but will probably never be removed.

All of the new features were documented in mkuzip.8; the page was also
cleaned up slightly.

Relnotes:	yes
2019-08-13 23:32:56 +00:00
kib
b648c57632 Minor cleanup for mkuzip(8) man page.
List all single-letter options in summary.
Order options alphabetically.

Sponsored by:	Mellanox Technologies
MFC after:	3 days
2019-02-19 20:26:03 +00:00
pfg
7551d83c35 various: general adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

No functional change intended.
2017-11-27 15:37:16 +00:00
bdrewery
a598c4b809 DIRDEPS_BUILD: Update dependencies.
Sponsored by:	Dell EMC Isilon
2017-10-31 00:07:04 +00:00
sobomax
2e2334027d Doh, fix some botched "fix" in r320277.
Reported by:	cem
MFC after:	6 weeks
2017-06-23 23:11:05 +00:00
sobomax
8f9a541591 Don't leak file descriptor in some cases.
Reported by:	cem
MFC after:	6 weeks
2017-06-23 17:39:00 +00:00
sobomax
c9cda0bfe0 o Move logic that determines size of the input image into its own
file. That logic has grown quite significantly now;

o add a special handling for the snapshot images. Those have some
  extra headers at the end of the image and we don't need those
  in the output image really.

MFC after:	6 weeks
2017-06-17 02:58:31 +00:00
asomers
0ef1036c0b Don't depend on assert(3) getting evaluated
Reported by:	imp
MFC after:	3 weeks
X-MFC-With:	318141, 318143
Sponsored by:	Spectra Logic Corp
2017-05-10 16:06:22 +00:00
asomers
b2359e58e4 strcpy => strlcpy
Reported by:	Coverity
CID:		1352771
MFC after:	3 weeks
Sponsored by:	Spectra Logic Corp
2017-05-10 15:27:36 +00:00
bapt
deab5b8b88 Remove spaces at EOL and sort correctly the SEE ALSO section
Reported by:	make manlint
MFC after:	2 days
2017-02-11 23:40:57 +00:00
bdrewery
3d31e0f30a DIRDEPS_BUILD: Update dependencies.
Sponsored by:	EMC / Isilon Storage Division
2016-05-04 03:14:34 +00:00
bz
8f86ed9783 Try to make gcc builds happy again by removing a redundant declaration. 2016-04-25 13:20:35 +00:00
sobomax
0b911c8b6b GC duplicate define. 2016-04-23 07:28:32 +00:00
sobomax
5be3733785 Improve performance in a few key areas:
o Split the compression across several worker threads. By default, "several"
   matches number of CPUs, capped at 24 for sanity when running on a very big
   hardwares. Provide option to set that number manually;

 o Fix bug inherited from the mkulzma (R.I.P) which degraded already slow LZMA
   compression even further by calling function to release compression state
   after processing each block.

   It is neither documented as required nor actually required by the LZMA
   library. This caused spree of system calls to release memory and then map
   it again for every block. LZMA compression is more than 2x faster after this
   change alone;

 o Record time it takes to do compression and report throughput achieved.

 o Add simple first-level 256 entry hash table for de-dup code, so it's not
   becoming a bottleneck at big files.
2016-04-23 07:23:43 +00:00
sobomax
bd7cbc7f20 In the de-duplication mode, when found matching md5 checksum also read
back block and compare actual content. Just output original block
instead of back reference in the unlikely event of collision.
2016-03-13 21:09:08 +00:00
sobomax
0e0b4ac1f0 When -S is specified dump summary to stdout, not stderr, so it's
easier to capture and process it with external tools via pipe.
2016-03-10 23:19:35 +00:00
sobomax
773a64f003 Add -S option to print out summary after compression has been
completed.

MFC after: 	2 weeks
2016-03-10 21:36:24 +00:00
bdrewery
b6e8f9c3c9 DIRDEPS_BUILD: Update dependencies.
Sponsored by:	EMC / Isilon Storage Division
2016-02-24 17:18:35 +00:00
sobomax
8abe971b5e Improve mkuzip(8) and geom_uzip(4), merge in LZMA support from mkulzma(8)
and geom_uncompress(4):

1. mkuzip(8):

 - Proper support for eliminating all-zero blocks when compressing an
   image. This feature is already supported by the geom_uzip(4) module
   and CLOOP format in general, so it's just a matter of making mkuzip(8)
   match. It should be noted, however that this feature while it sounds
   great, results in very slight improvement in the overall compression
   ratio, since compressing default 16k all-zero block produces only 39
   bytes compressed output block, which is 99.8% compression ratio. With
   typical average compression ratio of amd64 binaries and data being
   around 60-70% the difference between 99.8% and 100.0% is not that
   great further diluted by the ratio of number of zero blocks in the
   uncompressed image to the overall number of blocks being less than
   0.5 (typically). However, this may be important from performance
   standpoint, so that kernel are not spinning its wheels decompressing
   those empty blocks every time this zero region is read. It could also
   be important when you create huge image mostly filled with zero
   blocks for testing purposes.

 - New feature allowing to de-duplicate output image. It turns out that
   if you twist CLOOP format a bit you can do that as well. And unlike
   zero-blocks elimination, this gives a noticeable improvement in the
   overall compression ratio, reducing output image by something like
   3-4% on my test UFS2 3GB image consisting of full FreeBSD base system
   plus some of the packages (openjdk, apache etc), about 2.3GB worth of
   file data (800+MB compressed). The only caveat is that images created
   with this feature "on" would not work on older versions of FeeBSDxi
   kernel, hence it's turned off by default.

 - provide options to control both features and document them in manual
   page.

 - merge in all relevant LZMA compression support from the mkulzma(8),
   add new option to select between both.

 - switch license from ad-hoc beerware into standard 2-clause BSD.

2. geom_uzip(4):

 - implement support for de-duplicated images;

 - optimize some code paths to handle "all-zero" blocks without reading
   any compressed data;

 - beef up manual page to explain that geom_uzip(4) is not limited only
   to md(4) images. The compressed data can be written to the block
   device and accessed directly via magic of GEOM(4) and devfs(4),
   including to mount root fs from a compressed drive.

 - convert debug log code from being compiled in conditionally into
   being present all the time and provide two sysctls to turn it on or
   off. Due to intended use of the module, it can be used in
   environments where there may not be a luxury to put new kernel with
   debug code enabled. Having those options handy allows debug issues
   without as much problem by just having access to serial console or
   network shell access to a box/appliance. The resulting additional
   CPU cycles are just few int comparisons and branches, and those are
   minuscule when compared to data decompression which is the main
   feature of the module.

 - hopefully improve robustness and resiliency of the geom_uzip(4) by
   performing some of the data validation / range checking on the TOC
   entries and rejecting to attach to an image if those checks fail.

 - merge in all relevant LZMA decompression support from the
   geom_uncompress(4), enable automatically when appropriate format is
   indicated in the header.

 - move compilation work into its own worker thread so that it does not
   clog g_up. This allows multiple instances work in parallel utilizing
   smp cores.

 - document new knobs in the manual page.

Reviewed by:		adrian
MFC after:		1 month
Differential Revision:	https://reviews.freebsd.org/D5333
2016-02-23 23:59:08 +00:00
sjg
008d7c831f Add META_MODE support.
Off by default, build behaves normally.
WITH_META_MODE we get auto objdir creation, the ability to
start build from anywhere in the tree.

Still need to add real targets under targets/ to build packages.

Differential Revision:       D2796
Reviewed by: brooks imp
2015-06-13 19:20:56 +00:00
sjg
75a137820d dirdeps.mk now sets DEP_RELDIR 2015-06-08 23:35:17 +00:00
sjg
65145fa4c8 Merge sync of head 2015-05-27 01:19:58 +00:00
bapt
8d6c7a49a6 Convert to usr.bin/ to LIBADD
Reduce overlinking
2014-11-25 14:29:10 +00:00
sjg
d7cd1d425c Merge head from 7/28 2014-08-19 06:50:54 +00:00
bapt
1f77f137dc use .Mt to mark up email addresses consistently (part3)
PR:		191174
Submitted by:	Franco Fichtner  <franco at lastsummer.de>
2014-06-23 08:23:05 +00:00
sjg
5860f0d106 Updated dependencies 2014-05-16 14:09:51 +00:00
sjg
1a7e48acf1 Updated dependencies 2014-05-10 05:16:28 +00:00
sjg
6d37b86f2b Updated dependencies 2013-03-11 17:21:52 +00:00
sjg
0ee5295509 Updated dependencies 2013-02-16 01:23:54 +00:00
marcel
9dd41e3647 Sync FreeBSD's bmake branch with Juniper's internal bmake branch.
Requested by: Simon Gerraty <sjg@juniper.net>
2012-08-22 19:25:57 +00:00
ru
bacb593f25 Fixed an embedded shell script.
Reviewed by:	sobomax
2011-05-13 09:55:48 +00:00
uqs
e644199c18 mdoc: consistently spell our email addresses <foo@FreeBSD.org>
Reviewed by:	ru
2010-05-19 08:57:53 +00:00
ed
9b380e30d4 Build usr.bin/ with WARNS=6 by default.
Also add some missing $FreeBSD$ to keep svn happy.
2010-01-02 10:27:05 +00:00
fjoe
b67850e438 Support character device as input file.
PR:             103500
2007-03-06 17:04:15 +00:00
ru
33e34aeeb5 Markup fixes. 2006-09-29 15:20:48 +00:00
sobomax
9003203e3b A few minor corrections to the mkuzip.8 man page.
PR:		92576
Submitted by:	Stefan Bethke
2006-03-17 20:48:10 +00:00
pjd
6964e18a97 Tell the user exactly where the problem was. 2006-01-30 23:00:48 +00:00
fjoe
80c76b58f6 - check for geom_uzip module presence using kldstat -m.
kldstat -m finds geom_uzip module even if it is compiled in statically.
- create output file with x bit set.
- build mkuzip on all architectures (verified with "make universe").
- fix typo in info message.
2005-05-11 17:02:38 +00:00
sobomax
415d965096 Make WARNS=6 clean, which should make it compiling on amd64.
Submitted by:	Matteo Riondato <rionda@gufi.org>
2005-05-02 17:38:49 +00:00
ru
7f3c7f0d46 Sort sections. 2005-01-18 13:43:56 +00:00
ru
6cc4b6c220 Added the EXIT STATUS section where appropriate. 2005-01-17 07:44:44 +00:00
marcel
b083044ab4 Fix build: s/mkunzip.8/mkuzip.8/ 2004-09-12 00:32:35 +00:00
ru
24866f1a1e Normalize the manpage.
Reviewed by:	sobomax
2004-09-11 18:39:01 +00:00
ru
73e2ab9fdb Normalize the makefile.
Reviewed by:	sobomax
2004-09-11 18:38:26 +00:00
sobomax
11ecdd7768 o Print more info in the verbose mode;
o use zlib(3) function which computes maximum length of the output
  buffer instead of rolling own version;

o allow size of input file to be not multiple of cluster size by applying
  zero padding.
2004-09-10 23:16:05 +00:00
sobomax
b9945320c2 Clarify/extend in several places and make sure that everything matches reality. 2004-09-10 22:26:31 +00:00
sobomax
f292884004 Add mkuzip(8), non-GPL utility to compress filesystem images for use with
geom_uzip module. This is based on utility I wrote some 3 years ago for a
hack for md(4), which functionally was close to what geom_uzip does today.

Since I don't have a time to test that it compiles/works on other arches,
stick it to i386 only. Will do it later.

Unlike original cloop util, this one embedds FreeBSD-compatible shell code
into the generated image, not Linux one. Unfortunately severe space
restriction imposed by the CLOOP format doesn't allow to put conditional
code which will work both on Linux and FreeBSD. In fact it was quite a
challenge to fit necessary FreeBSD code into 127 bytes. ;-)
2004-09-10 20:17:31 +00:00