freebsd-dev/sys/geom
Maxim Sobolev 8f8cb840b0 Improve mkuzip(8) and geom_uzip(4), merge in LZMA support from mkulzma(8)
and geom_uncompress(4):

1. mkuzip(8):

 - Proper support for eliminating all-zero blocks when compressing an
   image. This feature is already supported by the geom_uzip(4) module
   and the CLOOP format in general, so it's just a matter of making
   mkuzip(8) match. It should be noted, however, that this feature,
   while it sounds great, yields only a very slight improvement in the
   overall compression ratio: compressing a default 16k all-zero block
   produces a compressed output block of only 39 bytes, which is a
   99.8% compression ratio. With the typical average compression ratio
   of amd64 binaries and data being around 60-70%, the difference
   between 99.8% and 100.0% is not that great, and it is further
   diluted by the ratio of zero blocks in the uncompressed image to
   the overall number of blocks being less than 0.5 (typically).
   However, this may be important from a performance standpoint, so
   that the kernel is not spinning its wheels decompressing those
   empty blocks every time a zero region is read. It could also be
   important when you create a huge image mostly filled with zero
   blocks for testing purposes.

 - New feature allowing de-duplication of the output image. It turns
   out that if you twist the CLOOP format a bit you can do that as
   well. And unlike zero-block elimination, this gives a noticeable
   improvement in the overall compression ratio, reducing the output
   image by something like 3-4% on my 3GB UFS2 test image consisting
   of a full FreeBSD base system plus some packages (openjdk, apache
   etc), about 2.3GB worth of file data (800+MB compressed). The only
   caveat is that images created with this feature "on" would not work
   on older versions of the FreeBSD kernel, hence it's turned off by
   default.

 - provide options to control both features and document them in the
   manual page.

 - merge in all relevant LZMA compression support from mkulzma(8), add
   a new option to select between the two.

 - switch license from ad-hoc beerware into standard 2-clause BSD.
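
To illustrate the two mkuzip(8) block-level features described above,
here is a minimal sketch of zero-block elimination and de-duplication
over fixed-size blocks. It is not the mkuzip(8) implementation and does
not produce the CLOOP on-disk format: the function name `plan_blocks`,
the tuple layout, and the use of SHA-256 to detect identical blocks are
all assumptions made for this example.

```python
import hashlib
import zlib

BLOCK_SIZE = 16384  # the default 16k block size mentioned above


def plan_blocks(image: bytes, block_size: int = BLOCK_SIZE):
    """Classify each block as 'zero', 'dup', or 'data' (hypothetical helper).

    Returns a list of (kind, payload) tuples. The payload is None for an
    all-zero block, the index of the first identical block for a duplicate,
    and the zlib-compressed bytes otherwise.
    """
    zero = bytes(block_size)
    seen = {}  # digest -> index of the first block with that content
    plan = []
    for idx, off in enumerate(range(0, len(image), block_size)):
        # Pad a short trailing block with zeros up to the block size.
        block = image[off:off + block_size].ljust(block_size, b"\0")
        if block == zero:
            plan.append(("zero", None))  # nothing is stored for this block
            continue
        digest = hashlib.sha256(block).digest()
        if digest in seen:
            plan.append(("dup", seen[digest]))  # reuse the earlier block
            continue
        seen[digest] = idx
        plan.append(("data", zlib.compress(block, 9)))
    return plan
```

As a check on the figures quoted above, 39 bytes out of 16384 is a
ratio of 1 - 39/16384, or about 0.998, which matches the stated 99.8%.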

2. geom_uzip(4):

 - implement support for de-duplicated images;

 - optimize some code paths to handle "all-zero" blocks without reading
   any compressed data;

 - beef up the manual page to explain that geom_uzip(4) is not limited
   to md(4) images. The compressed data can be written to a block
   device and accessed directly via the magic of GEOM(4) and devfs(4),
   including mounting the root fs from a compressed drive.

 - convert the debug log code from being compiled in conditionally to
   being present all the time, and provide two sysctls to turn it on
   or off. Due to the intended use of the module, it can end up in
   environments where there may not be the luxury of installing a new
   kernel with debug code enabled. Having those options handy allows
   debugging issues with much less trouble, given just a serial
   console or network shell access to a box/appliance. The resulting
   additional CPU cycles are just a few int comparisons and branches,
   and those are minuscule compared to data decompression, which is
   the main function of the module.

 - hopefully improve robustness and resiliency of geom_uzip(4) by
   performing some data validation / range checking on the TOC
   entries and refusing to attach to an image if those checks fail.

 - merge in all relevant LZMA decompression support from
   geom_uncompress(4), enabled automatically when the appropriate
   format is indicated in the header.

 - move decompression work into its own worker thread so that it does
   not clog g_up. This allows multiple instances to work in parallel,
   utilizing SMP cores.

 - document new knobs in the manual page.
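
The TOC range checking mentioned in the list above could look roughly
like the sketch below. This is only an illustration of the idea, not
the checks that geom_uzip(4) performs: the `(offset, length)` entry
layout, the parameter names, and the convention that a zero length
marks an all-zero block are assumptions made for this example.

```python
from typing import List, Tuple


def validate_toc(toc: List[Tuple[int, int]], data_start: int,
                 image_size: int, max_blen: int) -> bool:
    """Return True only if every (offset, length) entry is in range.

    Hypothetical layout: offset and length are in bytes, data_start is
    where compressed data may begin, and max_blen is the largest
    compressed block the reader is willing to accept.
    """
    for offset, length in toc:
        if length == 0:
            continue  # assumed here: an all-zero block with no stored data
        if length < 0 or length > max_blen:
            return False
        if offset < data_start or offset + length > image_size:
            return False
    return True
```

With de-duplication in mind, the sketch deliberately does not require
offsets to be strictly increasing, since two entries may point at the
same stored block. A caller would refuse to attach when it returns
False.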

Reviewed by:		adrian
MFC after:		1 month
Differential Revision:	https://reviews.freebsd.org/D5333
2016-02-23 23:59:08 +00:00
..
bde Replace sys/crypto/sha2/sha2.c with lib/libmd/sha512c.c 2015-12-27 17:33:59 +00:00
cache Unsigned values can never be less than 0. 2014-08-07 21:56:37 +00:00
concat Pull in r267961 and r267973 again. Fix for issues reported will follow. 2014-06-28 03:56:17 +00:00
eli Make additional parts of sys/geom/eli more usable in userspace 2016-01-07 05:47:34 +00:00
gate CALLOUT_MPSAFE has lost its meaning since r141428, i.e., for more than ten 2015-05-22 17:05:21 +00:00
journal Create an API to reset a struct bio (g_reset_bio). This is mandatory 2016-02-17 17:16:02 +00:00
label Fix off-by-one error in fstyp(8) and geom_label(4) that made them use 2015-06-18 21:55:55 +00:00
linux_lvm Pull in r267961 and r267973 again. Fix for issues reported will follow. 2014-06-28 03:56:17 +00:00
mirror Create an API to reset a struct bio (g_reset_bio). This is mandatory 2016-02-17 17:16:02 +00:00
mountver
multipath Prevent g_access calls to bad multipath members 2015-12-15 21:11:41 +00:00
nop Make geom_nop(4) collect statistics on all types of BIOs, not just 2015-10-10 09:03:31 +00:00
part Add some additional GPT partition types 2015-12-27 18:12:13 +00:00
raid Create an API to reset a struct bio (g_reset_bio). This is mandatory 2016-02-17 17:16:02 +00:00
raid3 Create an API to reset a struct bio (g_reset_bio). This is mandatory 2016-02-17 17:16:02 +00:00
sched It turns out that it's OK to sleep in this context, so use M_WAITOK 2015-12-18 14:10:00 +00:00
shsec Pull in r267961 and r267973 again. Fix for issues reported will follow. 2014-06-28 03:56:17 +00:00
stripe Pull in r267961 and r267973 again. Fix for issues reported will follow. 2014-06-28 03:56:17 +00:00
uncompress Make some debug printf's into DPRINTF's to reduce noise on attach/detach 2015-08-09 06:58:06 +00:00
uzip Improve mkuzip(8) and geom_uzip(4), merge in LZMA support from mkulzma(8) 2016-02-23 23:59:08 +00:00
vinum Pull in r267961 and r267973 again. Fix for issues reported will follow. 2014-06-28 03:56:17 +00:00
virstor Pull in r267961 and r267973 again. Fix for issues reported will follow. 2014-06-28 03:56:17 +00:00
zero Merge GEOM direct dispatch changes from the projects/camlock branch. 2013-10-22 08:22:19 +00:00
geom_aes.c
geom_bsd_enc.c
geom_bsd.c Remove old ioctl use and support, once and for all. 2015-01-06 05:28:37 +00:00
geom_ccd.c
geom_ctl.c Always free sbuf in gctl_free(). 2014-01-23 21:30:31 +00:00
geom_ctl.h
geom_dev.c Fix early kernel dump via dumpdev env 2015-11-17 20:55:50 +00:00
geom_disk.c Add rotationrate to geom disk dumpconf 2016-01-14 21:52:21 +00:00
geom_disk.h Reject attempts to attach a disk device that has the old NEEDSGIANT 2013-10-25 19:19:12 +00:00
geom_dump.c Report withered providers as such alike to GEOMs. 2015-03-26 11:19:24 +00:00
geom_event.c We have two functions from where a geom orphan method could be called: 2014-05-19 16:05:42 +00:00
geom_flashmap.c Teach the flashmap code about the SPI flash. 2016-01-23 05:26:29 +00:00
geom_fox.c
geom_int.h Escape special XML chars, returned by some devices, confusing XML parsers. 2013-11-27 14:25:06 +00:00
geom_io.c Use the right size for zeroing. 2016-02-17 18:28:38 +00:00
geom_kern.c Fix multiple incorrect SYSCTL arguments in the kernel: 2014-10-21 07:31:21 +00:00
geom_map.c Fix incorrect error message in geom map 2015-12-27 17:09:23 +00:00
geom_mbr_enc.c
geom_mbr.c
geom_pc98_enc.c
geom_pc98.c
geom_redboot.c
geom_slice.c Make sure we don't free memory that's already been freed by setting 2014-04-06 02:20:42 +00:00
geom_slice.h
geom_subr.c When searching for provider by name, prefer non-withered one. 2015-03-26 11:02:29 +00:00
geom_sunlabel_enc.c
geom_sunlabel.c
geom_vfs.c Merge GEOM direct dispatch changes from the projects/camlock branch. 2013-10-22 08:22:19 +00:00
geom_vfs.h
geom_vol_ffs.c
geom.h Create an API to reset a struct bio (g_reset_bio). This is mandatory 2016-02-17 17:16:02 +00:00
notes