Restoring POSIX.1e Extended Attributes on FreeBSD, part 1
This implements the basic ability to restore extended attributes
on FreeBSD, including a test suite.
From libarchive.googlecode.com: Add a new "archive_read_disk" API
that provides the important service of reading metadata from the
disk. In particular, this will make it possible to remove all
knowledge of extended attributes, ACLs, etc., from clients such
as bsdtar and bsdcpio.
Closely related, this API also provides pluggable uid->uname
and gid->gname lookup and caching services similar to
the uname->uid and gname->gid services provided by archive_write_disk.
Remember that these lookups are also required for correct ACL management.
Documentation is still pending...
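A minimal sketch of what a pluggable, caching uid->uname lookup could look like. This is not libarchive's code: name_cache, name_lookup_fn, and demo_lookup are hypothetical names, and the fixed-size direct-mapped table is an assumption chosen for brevity.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout; illustration only. */
#define NAME_CACHE_SLOTS 127

/* Pluggable backend: returns the name for id, or NULL if there is none. */
typedef const char *(*name_lookup_fn)(void *user_data, int64_t id);

struct name_cache {
    name_lookup_fn lookup;
    void *user_data;
    struct {
        int used;
        int64_t id;
        char *name;          /* NULL caches "no such id" */
    } slot[NAME_CACHE_SLOTS];
};

static void name_cache_init(struct name_cache *c, name_lookup_fn fn,
    void *user_data)
{
    size_t i;

    c->lookup = fn;
    c->user_data = user_data;
    for (i = 0; i < NAME_CACHE_SLOTS; i++) {
        c->slot[i].used = 0;
        c->slot[i].id = 0;
        c->slot[i].name = NULL;
    }
}

/* The returned pointer stays valid until the slot is evicted or freed. */
static const char *name_cache_get(struct name_cache *c, int64_t id)
{
    size_t i = (size_t)((uint64_t)id % NAME_CACHE_SLOTS);
    const char *name;

    if (c->slot[i].used && c->slot[i].id == id)
        return c->slot[i].name;          /* cache hit */

    name = c->lookup(c->user_data, id);  /* cache miss: ask the backend */
    free(c->slot[i].name);               /* evict the previous occupant */
    c->slot[i].name = NULL;
    c->slot[i].used = 0;
    if (name != NULL) {
        size_t n = strlen(name) + 1;
        c->slot[i].name = malloc(n);
        if (c->slot[i].name == NULL)
            return NULL;                 /* out of memory: cache nothing */
        memcpy(c->slot[i].name, name, n);
    }
    c->slot[i].id = id;
    c->slot[i].used = 1;
    return c->slot[i].name;
}

static void name_cache_free(struct name_cache *c)
{
    size_t i;

    for (i = 0; i < NAME_CACHE_SLOTS; i++) {
        free(c->slot[i].name);
        c->slot[i].name = NULL;
        c->slot[i].used = 0;
    }
}

/* Example backend standing in for getpwuid(); counts how often it runs. */
static const char *demo_lookup(void *user_data, int64_t id)
{
    int *calls = user_data;

    (*calls)++;
    return id == 0 ? "root" : NULL;
}
```

A caller would initialize the cache once with name_cache_init(&c, demo_lookup, &calls) and then call name_cache_get() per entry; repeated ids never reach the backend again, including ids that have no name.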
In archive_write_disk: If archive_write_header() fails to create
the file, that's a failure and should return ARCHIVE_FAILED.
Metadata restore failures still return ARCHIVE_WARN, because
they are non-critical. Fix the test_write_disk_secure test to
verify the correct return code in one case; add test_write_disk_failures
to do another very simple test of restore failure.
This should fix cpio coredumping when it tries to restore to
a write-protected directory.
Thanks to: Giorgos Keramidas
MFC after: 30 days
* support for bzip2 file with multiple concatenated bzip2 streams
* support for bzip2 file with junk after bzip2 stream
* support for gzip file with junk after gzip stream
* "fuzz" tester randomly modifies a bunch of input files in order to try
to crash libarchive (this found an amusing hang in the ISO9660 code
when trying to read images that advertised a zero blocksize).
This test is implemented, but commented out for now:
* support for gzip file with multiple concatenated gzip streams
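The core of a fuzz tester like the one mentioned above can be sketched as follows. This is an illustration, not the actual test: fuzz_buffer and count_differences are hypothetical names, and the use of rand() with an explicit seed is an assumption made so that a crashing input can be reproduced.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Overwrite `count` randomly chosen bytes of buf with random values.
 * The same seed always produces the same damage, so a failure can be
 * replayed. Hypothetical helper, for illustration only.
 */
static void fuzz_buffer(unsigned char *buf, size_t len, size_t count,
    unsigned seed)
{
    size_t i;

    if (len == 0)
        return;
    srand(seed);
    for (i = 0; i < count; i++) {
        size_t pos = (size_t)rand() % len;
        buf[pos] = (unsigned char)(rand() & 0xff);
    }
}

/* Count the bytes that differ between two buffers of equal length. */
static size_t count_differences(const unsigned char *a,
    const unsigned char *b, size_t len)
{
    size_t i, n = 0;

    for (i = 0; i < len; i++)
        if (a[i] != b[i])
            n++;
    return n;
}
```

A driver would copy a known-good archive image into memory, damage the copy with fuzz_buffer(), and then hand it to the reader; the only requirement is that the reader returns (with or without an error) instead of crashing or hanging.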
feedback, but the 2.5 branch is shaping up nicely.)
In addition to many small bug fixes and code improvements:
* Another iteration of versioning; I think I've got it right now.
* Portability: A lot of progress on Windows support (though I'm
not committing all of the Windows support files to FreeBSD CVS)
* Explicit tracking of MBS, WCS, and UTF-8 versions of strings
in archive_entry. The archive_entry routines now correctly return
NULL only when something is unset, and setting NULL properly clears
string values. Most charset conversions have been pushed down to
archive_string.
* Better handling of charset conversion failure when writing or
reading UTF-8 headers in pax archives
* archive_entry_linkify() provides multiple strategies for
hardlink matching to suit different format expectations
* More accurate bzip2 format detection
* Joerg Sonnenberger's extensive improvements to mtree support
* Rough support for self-extracting ZIP archives. Not an ideal
approach, but it works for the archives I've tried.
* New "sparsify" option in archive_write_disk converts blocks of nulls
into seeks.
* Better default behavior for the test harness; it now reports
all failures by default instead of coredumping at the first one.
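The idea behind the "sparsify" option in the list above (turning blocks of nulls into seeks) can be sketched with plain POSIX calls. This is not the archive_write_disk implementation: write_sparse and is_all_zero are hypothetical names, and the sketch assumes a newly created, empty regular file.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <unistd.h>

/* Return 1 if all n bytes at p are zero. */
static int is_all_zero(const char *p, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (p[i] != 0)
            return 0;
    return 1;
}

/*
 * Write len bytes to fd, seeking over every all-zero block instead of
 * writing it. Assumes fd refers to a newly created, empty regular file.
 * Returns 0 on success, -1 on error. Hypothetical helper.
 */
static int write_sparse(int fd, const char *buf, size_t len, size_t blocksize)
{
    size_t off = 0;
    off_t end;

    if (blocksize == 0)
        return -1;
    while (off < len) {
        size_t n = len - off < blocksize ? len - off : blocksize;

        if (is_all_zero(buf + off, n)) {
            /* Leave a hole; the filesystem reads it back as zeros. */
            if (lseek(fd, (off_t)n, SEEK_CUR) == (off_t)-1)
                return -1;
        } else {
            size_t done = 0;

            while (done < n) {
                ssize_t w = write(fd, buf + off + done, n - done);
                if (w <= 0)
                    return -1;
                done += (size_t)w;
            }
        }
        off += n;
    }
    /* A trailing hole does not extend the file, so set the length. */
    end = lseek(fd, 0, SEEK_CUR);
    if (end == (off_t)-1)
        return -1;
    return ftruncate(fd, end);
}
```

Reading the file back yields the original bytes; whether the skipped blocks actually save disk space depends on the filesystem.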
(including pathname, gname, uname) be stored in UTF-8. This usually
doesn't cause problems on FreeBSD because the "C" locale on FreeBSD
can convert any byte to Unicode/wchar_t and from there to UTF-8. In
other locales (including the "C" locale on Linux, which is really
ASCII), you can get into trouble with pathnames that cannot be
converted to UTF-8.
Libarchive's pax writer truncated pathnames and other strings at the
first nonconvertible character. (ouch!) Other archivers have worked
around this by storing unconvertible pathnames as raw binary, a
practice which has been sanctioned by the Austin group. However,
libarchive's pax reader would segfault reading headers that weren't
proper UTF-8. (ouch!) Since bsdtar defaults to pax format, this
affects bsdtar rather heavily.
To correctly support the new "hdrcharset" header that is going into
SUS and to handle conversion failures in general, libarchive's pax reader
and writer have been overhauled fairly extensively. They used to do
most of the pax header processing using wchar_t (Unicode); they now do
most of it using char so that common logic applies to either UTF-8 or
"binary" strings.
As a bonus, a number of extraneous conversions to/from wchar_t have
been eliminated, which should speed things up just a tad.
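To illustrate the UTF-8 versus "binary" decision described above, here is a rough sketch that checks whether a char string is well-formed UTF-8 and picks a hdrcharset value accordingly. It is not the overhauled pax code: is_valid_utf8 and pax_hdrcharset_for are hypothetical names, and the sketch assumes the input is already a raw byte string rather than something that still needs conversion from the current locale. The two hdrcharset values are the ones defined for pax in POSIX.

```c
#include <stddef.h>

/* Return 1 if the len bytes at s are well-formed UTF-8, else 0. */
static int is_valid_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0, k, need;
    unsigned long cp, min;

    while (i < len) {
        unsigned char c = s[i];

        if (c < 0x80) {                    /* plain ASCII */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {   /* 2-byte sequence */
            need = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {   /* 3-byte sequence */
            need = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {   /* 4-byte sequence */
            need = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return 0;   /* stray continuation or invalid lead byte */
        }
        if (len - i - 1 < need)
            return 0;   /* sequence truncated at end of string */
        for (k = 1; k <= need; k++) {
            if ((s[i + k] & 0xC0) != 0x80)
                return 0;
            cp = (cp << 6) | (unsigned long)(s[i + k] & 0x3F);
        }
        /* Reject overlong forms, surrogates, and out-of-range values. */
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return 0;
        i += need + 1;
    }
    return 1;
}

/* Pick the pax hdrcharset value under which a string can be stored as-is. */
static const char *pax_hdrcharset_for(const char *s, size_t len)
{
    return is_valid_utf8((const unsigned char *)s, len)
        ? "ISO-IR 10646 2000 UTF-8"
        : "BINARY";
}
```

The point of the fallback is that a name such as "caf\xE9" (Latin-1) is stored whole under hdrcharset=BINARY instead of being cut off at the byte that cannot be treated as UTF-8.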
Thanks to: Bjoern Jacke for originally reporting this to me
Thanks to: Joerg Sonnenberger for noting a bad typo in my first draft of this
Thanks to: Gunnar Ritter for getting the standard fixed
MFC after: 5 days
uudecode into the main test driver and invoking it just-in-time
within the various tests.
Also, incorporate a number of improvements to the main test support
code that have proven useful on other projects where I've used this
framework.
write a new test to exercise the hardlink strategies used
by different archive formats (tar, old cpio, new cpio).
This uncovered two problems, both fixed by this commit:
1) Enforce file size when writing files to disk.
2) When restoring hardlink entries, if they have data associated, go
ahead and open the file so we can write the data.
In particular, this fixes bsdtar/bsdcpio extraction of new cpio
formats where the "original" is empty and the subsequent "hardlink"
entry actually carries the data. It also provides correct behavior
for old cpio archives where hardlinked entries have their bodies
stored multiple times in the archive; the last body should always be
the one that ends up in the final file. The new pax format also
permits (but does not require) hardlinks to carry file data; again,
the last contents should always win.
Note that with any of these, a size of zero on a hardlink simply means
that the hardlink carries no data; it does not mean that the file has
zero size. A non-zero size on a hardlink does provide the file size.
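The size rules above can be written down as two small helpers. This is a sketch of the rule only, not the archive_write_disk code; link_action, hardlink_action, and update_known_size are hypothetical names.

```c
#include <stdint.h>

enum link_action {
    LINK_ONLY,       /* create the link; the entry carries no body */
    LINK_AND_WRITE   /* create the link, then open it and write the body */
};

/* What to do when restoring a hardlink entry of the given size. */
static enum link_action hardlink_action(int64_t entry_size)
{
    return entry_size > 0 ? LINK_AND_WRITE : LINK_ONLY;
}

/*
 * Track the size of one on-disk file across a series of entries that
 * all name it. A zero-size hardlink says nothing about the file size;
 * any other entry replaces the contents, so the last body wins.
 */
static int64_t update_known_size(int64_t known_size, int is_hardlink,
    int64_t entry_size)
{
    if (is_hardlink && entry_size == 0)
        return known_size;
    return entry_size;
}
```

For a new cpio archive, the empty "original" followed by a hardlink carrying 100 bytes ends at 100; for tar, a 100-byte original followed by a zero-size hardlink stays at 100.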
Thanks to: John Baldwin, for reminding me about this long-standing bug
and sending me a simple example archive that prompted this test case
exercises and verifies the libarchive APIs:
* Improved error reporting; hexdumps are now provided for
many file/memory content differences.
* Overall status more clearly counts "tests" and "assertions"
* Reference files can now be stored on disk instead of having
to be compiled into the test program itself. A couple of
tests have been converted to this more natural structure.
* Several memory leaks corrected so that leaks within libarchive
itself can be more easily detected and diagnosed.
* New test: GNU tar compatibility
* New test: Zip compatibility
* New test: Zero-byte writes to a compressed archive entry
* New test: archive_entry_strmode() format verification
* New test: mtree reader
* New test: write/read of large (2G - 1TB) entries to tar archives
(thanks to recent performance work, this test only requires a few seconds)
* New test: detailed format verification of cpio odc and newc writers
* Many minor additions/improvements to existing tests as well.
behavior with truncated or damaged pax archives. This
tests most of the cases covered by the recent security advisory.
Approved by: re (blanket, libarchive test suite)
archive_read_open_memory.c that tries to test border
cases. In particular, it copies over each returned block
so that formats or decompressors that read past the end
of a returned block will break.
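The copy-each-block trick described above might look roughly like this. It is a sketch, not the test's actual source: block_reader and its functions are hypothetical names, and the shape only loosely follows a read callback that returns a byte count plus a pointer to the block.

```c
#include <stdlib.h>
#include <string.h>

struct block_reader {
    const unsigned char *src;   /* unread part of the in-memory archive */
    size_t remaining;
    size_t block_size;
    unsigned char *copy;        /* private copy of the current block */
    size_t copy_len;
};

static void block_reader_init(struct block_reader *r, const void *data,
    size_t len, size_t block_size)
{
    r->src = data;
    r->remaining = len;
    r->block_size = block_size > 0 ? block_size : 1;
    r->copy = NULL;
    r->copy_len = 0;
}

/*
 * Hand out the next block as an exact-size heap copy. The previous copy
 * is scribbled over and freed first, so a consumer that keeps using an
 * old block, or reads past the end of the current one, is likely to be
 * caught (especially under a memory debugger).
 * Returns the block length, 0 at end of data, or -1 if malloc fails.
 */
static long block_reader_next(struct block_reader *r, const void **out)
{
    size_t n = r->remaining < r->block_size ? r->remaining : r->block_size;

    if (r->copy != NULL) {
        memset(r->copy, 0xA5, r->copy_len);
        free(r->copy);
        r->copy = NULL;
        r->copy_len = 0;
    }
    *out = NULL;
    if (n == 0)
        return 0;
    r->copy = malloc(n);
    if (r->copy == NULL)
        return -1;
    memcpy(r->copy, r->src, n);
    r->copy_len = n;
    r->src += n;
    r->remaining -= n;
    *out = r->copy;
    return (long)n;
}
```

Because each block lives in its own allocation with no slack after it, an overread lands outside the buffer rather than in the next block of the same archive, which is what makes such bugs visible.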
Approved by: re (blanket, libarchive test suite)
- Add and document the KVM and KVM_SUPPORT options that
are needed for the ifmcstats(3) makefile
- Garbage collect unused variables
- Add missing inclusion of bsd.own.mk where needed
Approved by: kan (mentor)
Reviewed by: ru
* "compression_program" support uses an external program
* Portability: no longer uses "struct stat" as a primary
data interchange structure internally
* Part of the above: refactor archive_entry to separate
out copy_stat() and stat() functions
* More complete tests for archive_entry
* Finish archive_entry_clone()
* Isolate major()/minor()/makedev() in archive_entry; remove
these from everywhere else.
* Bug fix: properly handle decompression look-ahead at end-of-data
* Bug fixes to 'ar' support
* Fix memory leak in ZIP reader
* Portability: better timegm() emulation in iso9660 reader
* New write_disk flags to suppress auto dir creation and not
overwrite newer files (for future cpio front-end)
* Simplify trailing-'/' fixup when writing tar and pax
* Test enhancements: fix various compiler warnings, improve
portability, add lots of new tests.
* Documentation: document new functions, first draft of
libarchive_internals.3
MFC after: 14 days
Thanks to: Joerg Sonnenberger (compression_program)
Thanks to: Kai Wang (ar)
Thanks to: Colin Percival (many small fixes)
Thanks to: Many others who sent me various patches and problem reports.
for directories. bsdtar used to add this, but that recently got
lost somehow. So now I'm adding it back in libarchive.
The only odd part of doing this in libarchive: Adding a directory to
a tar archive and then reading it back again can yield a different name.
Add a test case to exercise some boundary conditions with
tar filenames and ensure that trailing slashes are added to
dir names only as necessary.
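The "only as necessary" rule can be sketched as a small helper. This is an illustration, not the code in the tar writer; tar_dir_name is a hypothetical name, and the caller-supplied buffer size stands in for whatever name-field limit the format imposes.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy path into out, appending '/' when the entry is a directory and
 * the name does not already end in one. An empty path is left empty.
 * Returns 0 on success, -1 if the result (with its terminating NUL)
 * does not fit in outsize bytes. Hypothetical helper.
 */
static int tar_dir_name(const char *path, int is_dir, char *out,
    size_t outsize)
{
    size_t len = strlen(path);
    int need_slash = is_dir && len > 0 && path[len - 1] != '/';

    if (len + (size_t)need_slash + 1 > outsize)
        return -1;
    memcpy(out, path, len);
    if (need_slash)
        out[len++] = '/';
    out[len] = '\0';
    return 0;
}
```

The boundary case worth testing is a directory name that fits the buffer exactly without the slash but not with it, which is where the added character changes the outcome.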
Thanks to: Oliver Lehmann for bringing this regression to my attention.
These tests verify that archive_entry objects can store and return
ACL data and that pax format archives can read and write ACL
information. These do not (yet) test that ACL data is read or
written to disk correctly. (And hence would not have caught the
recent snafu about ACL read-from-disk being turned off.)
* libarchive_test program exercises many of the core features
* Refactored old "read_extract" into new "archive_write_disk", which
uses archive_write methods to put entries onto disk. In particular,
you can now use archive_write_disk to create objects on disk
without having an archive available.
* Pushed some security checks from bsdtar down into libarchive, where
they can be better optimized.
* Rearchitected the logic for creating objects on disk to reduce
the number of system calls. Several common cases now use a
minimum number of system calls.
* Virtualized some internal interfaces to provide a clearer separation
of read and write handling and make it simpler to override key
methods.
* New "empty" format reader.
* Corrected return types (this ABI breakage required the "2.0" version bump)
* Many bug fixes.