Improve mkuzip(8) and geom_uzip(4), merge in LZMA support from mkulzma(8)
and geom_uncompress(4):
1. mkuzip(8):
- Proper support for eliminating all-zero blocks when compressing an
image. This feature is already supported by the geom_uzip(4) module
and CLOOP format in general, so it's just a matter of making mkuzip(8)
match. It should be noted, however that this feature while it sounds
great, results in very slight improvement in the overall compression
ratio, since compressing default 16k all-zero block produces only 39
bytes compressed output block, which is 99.8% compression ratio. With
typical average compression ratio of amd64 binaries and data being
around 60-70% the difference between 99.8% and 100.0% is not that
great further diluted by the ratio of number of zero blocks in the
uncompressed image to the overall number of blocks being less than
0.5 (typically). However, this may be important from performance
standpoint, so that the kernel is not spinning its wheels decompressing
those empty blocks every time this zero region is read. It could also
be important when you create huge image mostly filled with zero
blocks for testing purposes.
- New feature allowing to de-duplicate output image. It turns out that
if you twist CLOOP format a bit you can do that as well. And unlike
zero-blocks elimination, this gives a noticeable improvement in the
overall compression ratio, reducing output image by something like
3-4% on my test UFS2 3GB image consisting of full FreeBSD base system
plus some of the packages (openjdk, apache etc), about 2.3GB worth of
file data (800+MB compressed). The only caveat is that images created
with this feature "on" would not work on older versions of the FreeBSD
kernel, hence it's turned off by default.
- provide options to control both features and document them in manual
page.
- merge in all relevant LZMA compression support from the mkulzma(8),
add new option to select between both.
- switch license from ad-hoc beerware into standard 2-clause BSD.
2. geom_uzip(4):
- implement support for de-duplicated images;
- optimize some code paths to handle "all-zero" blocks without reading
any compressed data;
- beef up manual page to explain that geom_uzip(4) is not limited only
to md(4) images. The compressed data can be written to the block
device and accessed directly via magic of GEOM(4) and devfs(4),
including to mount root fs from a compressed drive.
- convert debug log code from being compiled in conditionally into
being present all the time and provide two sysctls to turn it on or
off. Due to intended use of the module, it can be used in
environments where there may not be a luxury to put new kernel with
debug code enabled. Having those options handy allows debug issues
without as much problem by just having access to serial console or
network shell access to a box/appliance. The resulting additional
CPU cycles are just few int comparisons and branches, and those are
minuscule when compared to data decompression which is the main
feature of the module.
- hopefully improve robustness and resiliency of the geom_uzip(4) by
performing some of the data validation / range checking on the TOC
entries and rejecting to attach to an image if those checks fail.
- merge in all relevant LZMA decompression support from the
geom_uncompress(4), enable automatically when appropriate format is
indicated in the header.
- move decompression work into its own worker thread so that it does not
clog g_up. This allows multiple instances work in parallel utilizing
smp cores.
- document new knobs in the manual page.
Reviewed by: adrian
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D5333
2016-02-23 23:59:08 +00:00
.\"-
.\" Copyright (c) 2004-2016 Maxim Sobolev <sobomax@FreeBSD.org>
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd March 17, 2006
.Dt MKUZIP 8
.Os
.Sh NAME
.Nm mkuzip
.Nd compress disk image for use with
.Xr geom_uzip 4
class
.Sh SYNOPSIS
.Nm
.Op Fl dLSvZ
.Op Fl o Ar outfile
.Op Fl s Ar cluster_size
.Op Fl j Ar compression_jobs
.Ar infile
.Sh DESCRIPTION
The
.Nm
utility compresses a disk image file so that the
.Xr geom_uzip 4
class will be able to decompress the resulting image at run-time.
This allows for a significant reduction in the size of the disk image at
the expense of some CPU time required to decompress the data each
time it is read.
The
.Nm
utility
works in two phases:
.Bl -enum
.It
An
.Ar infile
image is split into clusters; each cluster is compressed using
.Xr zlib 3
or
.Xr lzma 3 .
.It
The resulting set of compressed clusters along with headers that allow
locating each individual cluster is written to the output file.
.El
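.Pp
As an illustration only, the two phases above can be sketched in Python.
This is not the
.Nm
implementation: the function names are hypothetical, only
.Xr zlib 3
compression is shown, and the output layout (a table of 64-bit big-endian
offsets followed by the compressed clusters) is a simplified stand-in for
the real image headers.
.Bd -literal
```python
import struct
import zlib


def compress_clusters(image, cluster_size=16384):
    """Phase 1: split the image into clusters and compress each one."""
    if cluster_size <= 0 or cluster_size % 512 != 0:
        raise ValueError("cluster_size should be a multiple of 512 bytes")
    clusters = []
    for start in range(0, len(image), cluster_size):
        chunk = image[start:start + cluster_size]
        # Assumption: a short final cluster is padded with zero bytes.
        chunk += bytes(cluster_size - len(chunk))
        clusters.append(zlib.compress(chunk, 9))
    return clusters


def build_output(clusters):
    """Phase 2: prepend a table that locates each compressed cluster."""
    table_size = 8 * (len(clusters) + 1)
    offsets = [table_size]
    for blob in clusters:
        offsets.append(offsets[-1] + len(blob))
    # One extra entry marks the end of the last cluster.
    table = struct.pack(">%dQ" % len(offsets), *offsets)
    return table + b"".join(clusters)
```
.Ed
.Pp
In this sketch the length of cluster N is the difference between table
entries N+1 and N, which is why the table carries one entry more than the
number of clusters.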
.Pp
The options are:
.Bl -tag -width indent
.It Fl o Ar outfile
Name of the output file
.Ar outfile .
The default is to use the input name with the suffix
.Pa .uzip
for the
.Xr zlib 3
compression or
.Pa .ulzma
for the
.Xr lzma 3 .
.It Fl L
Use the
.Xr lzma 3
compression algorithm instead of the default
.Xr zlib 3 .
The
.Xr lzma 3
algorithm provides noticeably better compression on the same data set
at the expense of much slower compression (10-20x) and somewhat slower
decompression (2-3x).
.It Fl s Ar cluster_size
Split the image into clusters of
.Ar cluster_size
bytes, 16384 bytes by default.
The
.Ar cluster_size
should be a multiple of 512 bytes.
.It Fl v
Display verbose messages.
.It Fl Z
Disable zero-block detection and elimination.
When this option is set,
.Nm
compresses empty blocks (i.e., clusters that consist of only zero bytes)
just as it would any other block.
When the option is not set,
.Nm
detects such blocks and omits them from the output.
Setting
.Fl Z
results in a slight increase in compressed image size, typically less than 0.1%
of the final size of the compressed image.
.It Fl d
Enable de-duplication.
When the option is enabled,
.Nm
detects identical blocks in the input and replaces each subsequent occurrence
of such a block with a pointer to the very first one in the output.
Setting this option results in a moderate decrease in compressed image size,
typically around 3-5% of the final size of the compressed image.
.It Fl S
Print a summary of the compression ratio and the output
file size after the file has been processed.
.It Fl j Ar compression_jobs
Specify the number of compression jobs that
.Nm
runs in parallel to speed up compression.
When this option is not specified, the number of jobs is set
to the value of the
.Va hw.ncpu
.Xr sysctl 8
variable.
.El
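.Pp
The zero-block handling controlled by
.Fl Z
can be pictured with the following Python sketch.
It is an illustration under stated assumptions, not the
.Nm
code: the function names are hypothetical, and representing an eliminated
cluster as an empty record is a convention chosen for this example only.
.Bd -literal
```python
import zlib

CLUSTER_SIZE = 16384  # default cluster size


def is_zero_cluster(cluster):
    """Return True when the cluster consists of only zero bytes."""
    return cluster.count(0) == len(cluster)


def encode_cluster(cluster, eliminate_zeros=True):
    """Compress one cluster, or emit nothing for an all-zero cluster.

    Passing eliminate_zeros=False models the behaviour described for -Z.
    The empty-record convention is an assumption of this sketch.
    """
    if eliminate_zeros and is_zero_cluster(cluster):
        return b""
    return zlib.compress(cluster, 9)


zeros = bytes(CLUSTER_SIZE)
kept = len(encode_cluster(zeros, eliminate_zeros=False))
saved = len(encode_cluster(zeros))
print("compressed zero cluster: %d bytes, eliminated: %d bytes" % (kept, saved))
```
.Ed
.Pp
Because a compressed all-zero cluster is already tiny compared to its
uncompressed size, eliminating it saves little space; the main benefit is
that nothing has to be decompressed when such a region is read.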
.Sh NOTES
The compression ratio largely depends on the cluster size used.
.\" The following two sentences are unclear: how can gzip(1) be
.\" used in a comparable fashion, and wouldn't a gzip-compressed
.\" image suffer from larger cluster sizes as well?
For large cluster sizes (16K and higher), typical compression ratios
are only 1-2% less than those achieved with
.Xr gzip 1 .
However, it should be kept in mind that larger cluster
sizes lead to higher overhead in the
.Xr geom_uzip 4
class, as the class has to decompress the whole cluster even if
only a few bytes from that cluster have to be read.
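.Pp
The read overhead described above can be made concrete with a small Python
model.
It is a sketch only and does not reflect the
.Xr geom_uzip 4
code; the function name and the in-memory list of compressed clusters are
assumptions made for the example.
.Bd -literal
```python
import zlib


def read_bytes(clusters, cluster_size, offset, length):
    """Read length bytes starting at offset from compressed clusters.

    Returns the data and the number of clusters that were decompressed.
    """
    out = bytearray()
    inflated = 0
    while length > 0:
        index, skip = divmod(offset, cluster_size)
        if index >= len(clusters):
            break  # read past the end of the image
        # The whole cluster is decompressed even if a few bytes are needed.
        data = zlib.decompress(clusters[index])
        inflated += 1
        piece = data[skip:skip + length]
        if not piece:
            break
        out += piece
        offset += len(piece)
        length -= len(piece)
    return bytes(out), inflated
```
.Ed
.Pp
In this model a read of a few bytes always costs at least one full cluster
decompression, and a read that straddles a cluster boundary costs two, so
the work per small read grows with the cluster size.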
.Pp
The
.Nm
utility
inserts a short shell script at the beginning of the generated image,
which makes it possible to
.Dq run
the image just like any other shell script.
The script tries to load the
.Xr geom_uzip 4
class if it is not loaded, configure the image as an
.Xr md 4
disk device using
.Xr mdconfig 8 ,
and automatically mount it using
.Xr mount_cd9660 8
on the mount point provided as the first argument to the script.
.Pp
The de-duplication is a
.Fx
specific feature.
It does not require any changes to the on-disk compressed image format,
but it did require matching changes to
.Xr geom_uzip 4
to handle the resulting images correctly.
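.Pp
The idea behind de-duplication can be sketched in Python as follows.
This is a hedged illustration, not the
.Nm
algorithm: the function name is hypothetical, and the use of SHA-256 over
the uncompressed cluster to find duplicates is an assumption of the example.
.Bd -literal
```python
import hashlib
import zlib


def deduplicate(clusters):
    """Store each distinct cluster once.

    Returns (blobs, index): the unique compressed blobs and, for every
    input cluster, the position of the blob that holds its data.
    """
    seen = {}
    blobs = []
    index = []
    for cluster in clusters:
        # Assumption: identical clusters are found by their digest.
        digest = hashlib.sha256(cluster).digest()
        if digest not in seen:
            seen[digest] = len(blobs)
            blobs.append(zlib.compress(cluster, 9))
        # Later occurrences point at the very first stored copy.
        index.append(seen[digest])
    return blobs, index
```
.Ed
.Pp
Several entries of the index may refer to the same blob, which is the
situation a reader of a de-duplicated image has to be prepared for.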
.Sh EXIT STATUS
.Ex -std
.Sh SEE ALSO
.Xr gzip 1 ,
.Xr xz 1 ,
.Xr lzma 3 ,
.Xr zlib 3 ,
.Xr geom 4 ,
.Xr geom_uzip 4 ,
.Xr md 4 ,
.Xr mdconfig 8 ,
.Xr mount_cd9660 8
.Sh AUTHORS
.An Maxim Sobolev Aq Mt sobomax@FreeBSD.org