59 lines
2.6 KiB
Plaintext
59 lines
2.6 KiB
Plaintext
TODO file for gzip.
|
|
|
|
Some of the planned features include:
|
|
|
|
- Structure the sources so that the compression and decompression code
|
|
form a library usable by any program, and write both gzip and zip on
|
|
top of this library. This would ideally be a reentrant (thread safe)
|
|
library, but this would degrade performance. In the meantime, you can
|
|
look at the sample program zread.c.
|
|
|
|
The library should have one mode in which compressed data is sent
|
|
as soon as input is available, instead of waiting for complete
|
|
blocks. This can be useful for sending compressed data to/from interactive
|
|
programs.
|
|
|
|
- Make it convenient to define alternative user interfaces (in
|
|
particular for windowing environments).
|
|
|
|
- Support in-memory compression for arbitrarily large amounts of data
|
|
(zip currently supports in-memory compression only for a single buffer.)
|
|
|
|
- Map files in memory when possible, this is generally much faster
|
|
than read/write. (zip currently maps entire files at once, this
|
|
should be done in chunks to reduce memory usage.)
|
|
|
|
- Add a super-fast compression method, suitable for implementing
|
|
file systems with transparent compression. One problem is that the
|
|
best candidate (lzrw1) is patented twice (Waterworth 4,701,745
|
|
and Gibson & Graybill 5,049,881). The lzrw series of algorithms
|
|
are available by ftp in ftp.adelaide.edu.au:/pub/compression/lzrw*.
|
|
|
|
- Add a super-tight (but slow) compression method, suitable for long
|
|
term archives. One problem is that the best versions of arithmetic
|
|
coding are patented (4,286,256 4,295,125 4,463,342 4,467,317
|
|
4,633,490 4,652,856 4,891,643 4,905,297 4,935,882 4,973,961
|
|
5,023,611 5,025,258).
|
|
|
|
Note: I will introduce new compression methods only if they are
|
|
significantly better in either speed or compression ratio than the
|
|
existing method(s). So the total number of different methods should
|
|
reasonably not exceed 3. (The current 9 compression levels are just
|
|
tuning parameters for a single method, deflation.)
|
|
|
|
- Add optional error correction. One problem is that the current version
|
|
of ecc cannot recover from inserted or missing bytes. It would be
|
|
nice to recover from the most common error (transfer of a binary
|
|
file in ascii mode).
|
|
|
|
- Add a block size (-b) option to improve error recovery in case of
|
|
failure of a complete sector. Each block could be extracted
|
|
independently, but this reduces the compression ratio.
|
|
|
|
- Use a larger window size to deal with some large redundant files that
|
|
'compress' currently handles better than gzip.
|
|
|
|
- Implement the -e (encrypt) option.
|
|
|
|
Send comments to Jean-loup Gailly <jloup@chorus.fr>.
|