5c13551143
This brings support for multi-threaded compression. This brings close N times faster compression where N is the number of CPU cores. Because of this, liblzma now depends on libthr. Soon libarchive will be modified to use the new lzma API. Thanks to antoine@ for the exp-run. Differential Revision: https://reviews.freebsd.org/D1786 Reviewed by: bapt
112 lines
3.9 KiB
Plaintext
112 lines
3.9 KiB
Plaintext
|
|
XZ Utils To-Do List
|
|
===================
|
|
|
|
Known bugs
|
|
----------
|
|
|
|
The test suite is too incomplete.
|
|
|
|
If the memory usage limit is less than about 13 MiB, xz is unable to
|
|
automatically scale down the compression settings enough even though
|
|
it would be possible by switching from BT2/BT3/BT4 match finder to
|
|
HC3/HC4.
|
|
|
|
XZ Utils compress some files significantly worse than LZMA Utils.
|
|
This is due to faster compression presets used by XZ Utils, and
|
|
can often be worked around by using "xz --extreme". With some files
|
|
--extreme isn't enough though: it's most likely with files that
|
|
compress extremely well, so going from compression ratio of 0.003
|
|
to 0.004 means big relative increase in the compressed file size.
|
|
|
|
xz doesn't quote unprintable characters when it displays file names
|
|
given on the command line.
|
|
|
|
tuklib_exit() doesn't block signals => EINTR is possible.
|
|
|
|
SIGTSTP is not handled. If xz is stopped, the estimated remaining
|
|
time and calculated (de)compression speed won't make sense in the
|
|
progress indicator (xz --verbose).
|
|
|
|
If liblzma has created threads and fork() gets called, liblzma
|
|
code will break in the child process unless it calls exec() and
|
|
doesn't touch liblzma.
|
|
|
|
|
|
Missing features
|
|
----------------
|
|
|
|
Add support for storing metadata in .xz files. A preliminary
|
|
idea is to create a new Stream type for metadata. When both
|
|
metadata and data are wanted in the same .xz file, two or more
|
|
Streams would be concatenated.
|
|
|
|
The state stored in lzma_stream should be cloneable, which would
|
|
be mostly useful when using a preset dictionary in LZMA2, but
|
|
it may have other uses too. Compare to deflateCopy() in zlib.
|
|
|
|
Support LZMA_FINISH in raw decoder to indicate end of LZMA1 and
|
|
other streams that don't have an end of payload marker.
|
|
|
|
Adjust dictionary size when the input file size is known.
|
|
Maybe do this only if an option is given.
|
|
|
|
xz doesn't support copying extended attributes, access control
|
|
lists etc. from source to target file.
|
|
|
|
Multithreaded compression:
|
|
- Reduce memory usage of the current method.
|
|
- Implement threaded match finders.
|
|
- Implement pigz-style threading in LZMA2.
|
|
|
|
Multithreaded decompression
|
|
|
|
Buffer-to-buffer coding could use less RAM (especially when
|
|
decompressing LZMA1 or LZMA2).
|
|
|
|
I/O library is not implemented (similar to gzopen() in zlib).
|
|
It will be a separate library that supports uncompressed, .gz,
|
|
.bz2, .lzma, and .xz files.
|
|
|
|
Support changing lzma_options_lzma.mode with lzma_filters_update().
|
|
|
|
Support LZMA_FULL_FLUSH for lzma_stream_decoder() to stop at
|
|
Block and Stream boundaries.
|
|
|
|
lzma_strerror() to convert lzma_ret to human readable form?
|
|
This is tricky, because the same error codes are used with
|
|
slightly different meanings, and this cannot be fixed anymore.
|
|
|
|
Make it possible to adjust LZMA2 options in the middle of a Block
|
|
so that the encoding speed vs. compression ratio can be optimized
|
|
when the compressed data is streamed over network.
|
|
|
|
Improved BCJ filters. The current filters are small but they aren't
|
|
so great when compressing binary packages that contain various file
|
|
types. Specifically, they make things worse if there are static
|
|
libraries or Linux kernel modules. The filtering could also be
|
|
more effective (without getting overly complex), for example,
|
|
streamable variant BCJ2 from 7-Zip could be implemented.
|
|
|
|
Filter that autodetects specific data types in the input stream
|
|
and applies appropriate filters for the corrects parts of the input.
|
|
Perhaps combine this with the BCJ filter improvement point above.
|
|
|
|
Long-range LZ77 method as a separate filter or as a new LZMA2
|
|
match finder.
|
|
|
|
|
|
Documentation
|
|
-------------
|
|
|
|
More tutorial programs are needed for liblzma.
|
|
|
|
Document the LZMA1 and LZMA2 algorithms.
|
|
|
|
|
|
Miscellaneous
|
|
------------
|
|
|
|
Try to get the media type for .xz registered at IANA.
|
|
|