Even though I believe this is a good change, it does
have the potential to break certain clients, so it's
good to document the reasoning behind the change.
write a new test to exercise the hardlink strategies used
by different archive formats (tar, old cpio, new cpio).
This uncovered two problems, both fixed by this commit:
1) Enforce file size when writing files to disk.
2) When restoring hardlink entries, if they have data associated, go
ahead and open the file so we can write the data.
In particular, this fixes bsdtar/bsdcpio extraction of new cpio
formats where the "original" is empty and the subsequent "hardlink"
entry actually carries the data. It also provides correct behavior
for old cpio archives where hardlinked entries have their bodies
stored multiple times in the archive; the last body should always be
the one that ends up in the final file. The new pax format also
permits (but does not require) hardlinks to carry file data; again,
the last contents should always win.
Note that with any of these, a size of zero on a hardlink simply means
that the hardlink carries no data; it does not mean that the file has
zero size. A non-zero size on a hardlink does provide the file size.
Thanks to: John Baldwin, for reminding me about this long-standing bug
and sending me a simple example archive that prompted this test case
one part" by simply ignoring the marker at the beginning
of the file. (Zip archivers reserve four bytes at the beginning
of each part of a multi-part archive, if it happens to only
require one part, those four bytes get filled with a placeholder
that can be ignored.)
Thanks to: Marius Nuennerich,
for pointing me to a Zip archive that libarchive couldn't handle
MFC after: 7 days
doesn't need to compensate for this situation.
While here, fix a minor longstanding bug that empty tar archives
(which begin with at least 512 zero bytes) never properly reported
their format. In particular, this fixes the output of:
bsdtar tvvf /dev/zero
And, of course, a new test to verify that libarchive correctly
recognizes the format of such files.
the number of bytes read is actually not important as long as we have at
least what we ask for. Illustrate its benefits by using it throughout
the ZIP support code, except for the few cases where it doesn't apply.
Approved by: kientzle
exercises and verifies the libarchive APIs:
* Improved error reporting; hexdumps are now provided for
many file/memory content differences.
* Overall status more clearly counts "tests" and "assertions"
* Reference files can now be stored on disk instead of having
to be compiled into the test program itself. A couple of
tests have been converted to this more natural structure.
* Several memory leaks corrected so that leaks within libarchive
itself can be more easily detected and diagnosed.
* New test: GNU tar compatibility
* New test: Zip compatibility
* New test: Zero-byte writes to a compressed archive entry
* New test: archive_entry_strmode() format verification
* New test: mtree reader
* New test: write/read of large (2G - 1TB) entries to tar archives
(thanks to recent performance work, this test only requires a few seconds)
* New test: detailed format verification of cpio odc and newc writers
* Many minor additions/improvements to existing tests as well.
that I've been working on but put off committing until after the
RELENG_7 branch, including:
* New manpages: cpio.5 mtree.5
* New archive_entry_strmode()
* New archive_entry_link_resolver()
* New read support: mtree format
* Internal API change: read format auction only runs once
* Running the auction only once allowed simplifying a lot of bid logic.
* Cpio robustness: search for next header after a sync error
* Support device nodes on ISO9660 images
* Eliminate a lot of unnecessary copies for uncompressed archives
* Corrected handling of new GNU --sparse --posix formats
* Correctly handle a zero-byte write to a compressed archive
* Fixed memory leaks
Many of these improvements were motivated by the upcoming bsdcpio
front-end.
There have also been extensive improvements to the libarchive_test
test harness, which I'll commit separately.
a length field of zero; it does not mean the body is empty.
Thanks to: Lapo Luchini for sending me a JAR archive that demonstrated this bug
MFC after: 3 days
This can only happen on 32-bit systems when you're reading
an uncompressed archive and the skip request is an exact
multiple of 4G (e.g., skipping a tar entry with an 8G body).
The symptom is that the read_ahead() ends up returning zero
bytes, and the extraction stops with a premature end-of-file.
Using '1' here is more correct anyway, as it allows read_ahead()
to function opportunistically and minimize copying.
MFC after: 5 days
In particular, the previous code led to archives that had
non-empty bodies following directory entries. Not a fatal
problem, as bsdtar and GNU cpio are both happy to just skip
this bogus data, but it still shouldn't be there.
MFC after: 3 days
Return EOF immediately if an entry in a ZIP archive has no body.
In particular, the latter issue was causing bsdtar to emit spurious
warnings when extracting directory entries from ZIP archives.
MFC after: 3 days
number of bytes written, even when used to write files to
disk. Extend the test suite to verify the correct return
values for archive_write_data() and archive_write_data_block().
Thanks to: Bruce Mah, for stepping in promptly to back out the
earlier broken version of this fix
Thanks to: Colin Percival, for pointing out the correct fix
MFC after: 5 days
Approved by: re (ksmith)
Pointy hat: \me
most noticably the incorrect extraction of files by bsdtar.
This commit reverts:
src/lib/libarchive/archive_write_disk.c 1.15
src/lib/libarchive/test/test_write_disk.c 1.4
Approved by: re (implicitly)
(when used to restore files to disk) to match:
* The documentation
* The return values of this function when used
to write files into an archive.
Approved by: re (bmah)
Pointy hat: \me
MFC after: 5 days
GNU tar 1.17's implementation of --posix --sparse,
at the cost of losing compatibility with GNU tar 1.16.
Fortunately, the 1.17 implementation actually makes sense,
so the libarchive code is now a bit more straightforward
than before.
Background: GNU tar 1.16 defined a new way to store
sparse files in --posix archives. Unfortunately,
the implementation incorrectly inserted several
blocks of null padding after each such entry.
As a result, non-GNU tar implementations saw the
archive as truncated after any sparse entry.
This was fixed in GNU tar 1.17 at the cost of
losing compatibility with GNU tar 1.16 for this
new format (which is not the default, so hopefully
rarely used). Libarchive recently gained support
for reading the GNU tar 1.16 formats; this commit
updates it to read the GNU tar 1.17 variant instead.
Approved by: re (ksmith for libarchive portion)
Approved by: re (blanket for libarchive_test portion)
MFC after: 5 days
owner restore is not requested. If you ask
for permissions to be restored but not owner,
you will now get no error if suid/sgid bits
cannot be set. (It's a security hole to restore
suid/sgid bits if the owner/group aren't restored.)
This fixes an obscure problem where a simple
"tar -xf" with no other options will sometimes
fail gratuitously because of suid/sgid bits.
This is causing occasional problems for people
using bsdtar as a drop-in replacement for
"that other tar program." ;-)
Note: If you do ask for owner restore, then suid/sgid
restore failures still issue an error. This
only suppresses the error in the case where an
suid/sgid bit restore fails because of an owner
mismatch and owner restore was not requested.
Approved by: re (bmah)
MFC after: 7 days
In particular:
* Include a second entry in all of the test archives (to catch errors
with intermediate padding)
* Test the GNU tar 1.17 version of "posix sparse format 1.0"
instead of the GNU tar 1.16 version (the latter is no longer
supported by GNU tar).
Right now, libarchive fails this test because I originally
implemented the GNU tar 1.16 version of "posix sparse format 1.0".
I'll fix libarchive shortly.
Approved by: re (blanket, libarchive testing)
* Allow libarchive_test to compile on Interix again.
* Track the test name (not just line number) when counting skipped tests.
Thanks to: Joerg Sonnenberger
Approved by: re (blanket; libarchive testing)
couldn't allocate more memory for a string. Change
this so it returns NULL in that case, and update
all of its callers to handle the error. Some of
those callers can now return errors back to the
client instead of calling exit(3).
Approved by: re (bmah)
if there was more than one. In particular, this simplifies
test_tar_filenames.c, which has a tendency to be very noisy otherwise.
Approved by: re (blanket, libarchive testing)
it now verifies that the returned blocks have the correct data
at the correct file offsets, ignoring any null padding that
may exist.
Approved by: re (blanket, libarchive test suite)
behavior with truncated or damaged pax archives. This
tests most of the cases covered by the recent security advisory.
Approved by: re (blanket, libarchive test suite)
archive_read_open_memory.c that tries to test border
cases. In particular, it copies over each returned block
so that formats or decompressors that read past the end
of a returned block will break.
Approved by: re (blanket, libarchive test suite)
tar archives, including a potentially exploitable buffer overflow.
Approved by: re (kensmith, security blanket)
Reviewed by: kientzle
Security: FreeBSD-SA-07:05.libarchive
ARCHIVE_VERSION_STAMP to selectively disable tests that don't
apply to that version; new "skipping()" function reports skipped
tests; modify final summary to report component test failures and
skips.
Note: I don't currently intend to MFC the test suite itself;
anyone interested should just checkout and use this version
of the test suite, which should work for any library version.
Approved by: re (Ken Smith, blanket)
of libarchive being used. I've been taking advantage of this
with a recent round of updates to libarchive_test so that it
can test older and newer versions of the library.
Approved by: re (Ken Smith)
skip() callback to skip over data when reading uncompressed
archives. This gets invoked, for example, during tar -t
or tar -x with a filename argument. The revised code
only calls [lf]seek() on regular files, instead of depending
on the kernel to return an error.
Thanks to: bde for explaining the implementation of lseek()
Thanks to: Daniel O'Connor for testing
Approved by: re (Ken Smith)
MFC after: 5 days
assume yes unless seek has previously failed, but I fear I'll have to
avoid seeks under other circumstances. (For instance, tape drives on
FreeBSD seem to return garbage from lseek().) Also, optimize away
zero-byte skips.