1) Avoid an infinite loop in the header resync for certain malformed
archives.
2) Don't try to match hardlinks if the nlinks count is < 2. This
reduces the likelihood of a false hardlink match due to ino truncation.
MFC after: 7 days
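A minimal sketch of why the nlink check in item 2 helps, assuming a
link table that compares inode numbers truncated to 32 bits. The
file_id structure and both function names are hypothetical and are
not taken from libarchive.

```c
#include <stdint.h>

/* Hypothetical identity record; not libarchive's actual structure. */
struct file_id {
    uint64_t dev;
    uint64_t ino;
    unsigned int nlink;
};

/* Compare identities the way a format with a 32-bit ino field would. */
static int
same_truncated_id(const struct file_id *a, const struct file_id *b)
{
    return a->dev == b->dev && (uint32_t)a->ino == (uint32_t)b->ino;
}

/* Only entries with at least two links are considered for matching. */
static int
hardlink_match(const struct file_id *a, const struct file_id *b)
{
    if (a->nlink < 2 || b->nlink < 2)
        return 0;
    return same_truncated_id(a, b);
}
```

Two unrelated files with inodes 0x100000005 and 5 collide after
truncation, but with nlink == 1 they are never treated as links.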
(incorrect handling of zero-length reads before the copy buffer is
allocated) is masked by the iso9660 taster. Tar and cpio both enable
that taster so were protected from the bug; unzip is susceptible.
This both fixes the bug and updates the test harness to exercise
this case.
Submitted by: Ed Schouten (diagnosed the bug and drafted a patch)
MFC after: 7 days
on iso9660 images were returned. While I'm poking around, update
some comments around this area to try to clarify what's going on and
what still remains to be improved.
but returned them incorrectly, causing tar to actually
erase the resulting file while trying to restore the
link. This one-line fix corrects the hardlink descriptions
to avoid this problem.
Thanks to Jung-uk Kim for pointing this out.
Approved by: re (kib)
preparation for 8.0-RELEASE. Add the previous version of those
libraries to ObsoleteFiles.inc and bump __FreeBSD_Version.
Reviewed by: kib
Approved by: re (rwatson)
field when computing the length of the gzip header.
Thanks to Dag-Erling for pointing me to the OpenSSH tarballs,
which are the first files I've seen that actually used this field.
to eliminate some duplicated code. In particular,
archive_read_open_filename() has different close
handling than archive_read_open_fd(), so delegating
the former to the latter in the degenerate case
(a NULL filename is treated as stdin) broke reading
from pipelines. In particular, this fixes occasional
port failures that were seen when using "gunzip | tar"
pipelines under /bin/csh.
Thanks to Alexey Shuvaev for reporting this failure and
patiently helping me to track down the cause.
Unfortunately, liblzma itself is GPLed, so unlikely to become part of
the FreeBSD base system.
However, the core lzma compression/decompression code is public
domain, so it should be feasible for someone to create a compatible
library without the GPL strings.
read_support_format_raw() allows people to exploit libarchive's
automatic decompression support by simply stubbing out the
archive format handler.
The raw handler is not enabled by support_format_all(), of course.
It bids 1 on any non-empty input and always returns a single
entry named "data" with no properties set.
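A rough sketch of how a bid of 1 interacts with other format
handlers, assuming the highest positive bid wins. The bid_fn type,
toy_cpio_bid() and best_bidder() are hypothetical stand-ins written
for illustration, not libarchive internals.

```c
#include <stddef.h>

/* Hypothetical bidder signature: a score, where <= 0 means "not mine". */
typedef int (*bid_fn)(const unsigned char *buf, size_t len);

/* Raw handler as described above: bid 1 on any non-empty input. */
static int
raw_bid(const unsigned char *buf, size_t len)
{
    (void)buf;
    return len > 0 ? 1 : 0;
}

/* Toy bidder standing in for a real format: recognizes "070707". */
static int
toy_cpio_bid(const unsigned char *buf, size_t len)
{
    static const char magic[] = "070707";
    size_t i;

    if (len < 6)
        return 0;
    for (i = 0; i < 6; i++)
        if (buf[i] != (unsigned char)magic[i])
            return 0;
    return 48; /* 6 bytes * 8 bits matched */
}

/* Return the index of the highest positive bid, or -1 if nobody bids. */
static int
best_bidder(const bid_fn *bidders, size_t n,
    const unsigned char *buf, size_t len)
{
    int best = -1, best_bid = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        int bid = bidders[i](buf, len);
        if (bid > best_bid) {
            best_bid = bid;
            best = (int)i;
        }
    }
    return best;
}
```

Under this assumption the raw handler only wins when no other
enabled format recognizes the input.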
Fix reading big-endian binary cpio archives, and add a test.
While I'm here, add a note about Solaris ACL extension for cpio,
which should be relatively straightforward to support.
Thanks to: Edward Napierala, who sent me a big-endian cpio archive
from a Solaris system he's been playing with.
Pointy hat: me
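For illustration, a sketch of detecting the byte order of a binary
cpio header from its 16-bit magic value (octal 070707) and decoding
16-bit fields accordingly. The enum and function names are
hypothetical; this is not the code from the commit.

```c
#include <stddef.h>
#include <stdint.h>

enum cpio_order { CPIO_UNKNOWN = 0, CPIO_LE, CPIO_BE };

/* Inspect the first two bytes and decide how the magic was stored. */
static enum cpio_order
cpio_bin_order(const unsigned char *p, size_t len)
{
    if (len < 2)
        return CPIO_UNKNOWN;
    if (p[0] + p[1] * 256 == 070707)
        return CPIO_LE;
    if (p[0] * 256 + p[1] == 070707)
        return CPIO_BE;
    return CPIO_UNKNOWN;
}

/* Decode one 16-bit header field using the detected byte order. */
static uint16_t
cpio_u16(const unsigned char *p, enum cpio_order order)
{
    if (order == CPIO_BE)
        return (uint16_t)((p[0] << 8) | p[1]);
    return (uint16_t)((p[1] << 8) | p[0]);
}
```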
Make test_fuzz a bit more sensitive by actually reading the body
of each entry instead of skipping it.
While I'm here, move the "UnsupportedCompress" macro into the
only file that still uses it.
* Fix parsing of POSIX.1e ACLs from Solaris tar archives
* Test the above
* Preserve the order of POSIX.1e ACL entries
* Update tests whose results depended on the order of ACL entries
* Identify NFSv4 ACLs in Solaris tar archives and warn that
they're not yet supported. (In particular, don't try to parse
them as POSIX.1e ACLs.)
Thanks to: Edward Napierala, who sent me some Solaris 10 tar archives to test
* Split whiny skip function to create a new best-effort skip_lenient()
* Correctly increment the top-level file position only for the top filter
* Simulate skip by reading against the current filter, not the top filter
The latter two bugs aren't currently visible because no existing
filter delegates skip operations.
access to the file data (if the file exists on
disk). This was broken for the first regular
file; fix it and add a test so it won't break again.
In particular, this fixes the following idiom for creating
a tar archive in which every file is owned by root:
    tar cf - --format=mtree . \
      | sed -e 's/uname=[a-z]*/uname=root/' -e 's/uid=[0-9]*/uid=0/' \
      | tar cf - @-
descriptions of the GNU tar "posix-style" sparse format,
clarification of the Solaris tar ACL storage,
and a few comments about Mac OS X tar's resource storage.
Not an issue for FreeBSD, since the base system has the necessary libraries.
Since all decompressors are always available now, we can unconditionally
enable them in archive_read_support_compression_all().
Since FreeBSD doesn't have liblzma in the base system, the
read side will always fall back to the unxz/unlzma commands for now.
(Which will in turn fail if those commands are not currently
installed.) The write side does not yet have a fallback, so
that will just fail.
fixes to read_support_compression_program. In particular, failure of
the external program is detected a lot earlier, which gives much more
reasonable error handling.
corrections to the Windows support to reconcile differences
between Visual Studio and Cygwin. Includes parts of
revisions 757, 774, 787, 815, 817, 819, 820, 844, and 886.
Of particular note, r886 overhauled the UTF-8/Unicode conversions to
work correctly regardless of whether the local system uses 16-bit
or 32-bit wchar_t. (I assume that systems with 16-bit wchar_t
use UTF-16 and those with 32-bit wchar_t use UCS-4.) This revision
also added a preference for wcrtomb() (which is thread-safe) on
platforms that support it.
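A sketch of the wchar_t width distinction described above, under the
same assumption (16-bit wchar_t holds UTF-16, 32-bit wchar_t holds
UCS-4). The function names are hypothetical and this is not the r886
code.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one code point as UTF-16; returns units written, 0 if invalid. */
static size_t
cp_to_utf16(uint32_t cp, uint16_t out[2])
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;
        return 1;
    }
    cp -= 0x10000;
    out[0] = (uint16_t)(0xD800 + (cp >> 10));   /* high surrogate */
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF)); /* low surrogate */
    return 2;
}

/* Store one code point in wchar_t units, whatever their width is. */
static size_t
cp_to_wchar(uint32_t cp, wchar_t out[2])
{
    uint16_t u[2];
    size_t i, n;

    if (sizeof(wchar_t) >= 4) {
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return 0;
        out[0] = (wchar_t)cp; /* UCS-4: one unit per code point */
        return 1;
    }
    n = cp_to_utf16(cp, u);   /* UTF-16: may need a surrogate pair */
    for (i = 0; i < n; i++)
        out[i] = (wchar_t)u[i];
    return n;
}
```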
r751: Change __archive_strncat() to use a void * source, which reduces
the amount of casting needed to use this with "char", "signed char"
and "unsigned char".
r752: Use additions instead of multiplications when growing buffer;
faster and less chance of overflow.
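One way to read the r752 note, sketched with a hypothetical
grow_capacity() helper: doubling by addition makes the overflow
check a single comparison. The 32-byte minimum is an arbitrary
choice for the example, and this is not the actual libarchive
routine.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return a capacity >= needed, grown by repeated addition,
 * or 0 if the next doubling step would overflow size_t.
 */
static size_t
grow_capacity(size_t current, size_t needed)
{
    size_t cap = current < 32 ? 32 : current;

    while (cap < needed) {
        if (cap > SIZE_MAX - cap)
            return 0;   /* cap + cap would wrap around */
        cap += cap;
    }
    return cap;
}
```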
conditioning tests on HAVE_ZLIB, etc, just ask libarchive for the
service and handle the failure coming back from libarchive. This
gives us better test coverage of common client usage where clients
simply try to use libarchive services and handle the errors coming
back instead of trying to second-guess which libarchive services are
compiled in.
Refactor the read_compression_program to add two new abilities:
* Public API: You can now include a signature string when you
register a program; the program will run only on input that
matches the signature string.
* Internal API: You can use the init() function to instantiate
an external program as part of a filter pipeline. This
can be used for graceful fallback (if zlib is unavailable, use
external gzip instead) and to use external programs with
bidders that are more sophisticated than a static signature check.
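A sketch of a static signature check of the kind described in the
first bullet. The program_filter structure and program_bid() are
hypothetical; treating an empty signature as "always run" is an
assumption made for this example.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical registration record for an external program. */
struct program_filter {
    const char *cmd;
    const void *signature;
    size_t signature_len;
};

/* Bid on the first bytes of input; 0 means "do not run the program". */
static int
program_bid(const struct program_filter *f, const void *head, size_t avail)
{
    if (f->signature_len == 0)
        return INT_MAX;     /* assumption: no signature, always run */
    if (avail < f->signature_len)
        return 0;
    if (memcmp(head, f->signature, f->signature_len) != 0)
        return 0;
    return (int)(f->signature_len * 8); /* number of bits matched */
}
```

With the two-byte gzip magic (0x1f 0x8b) as the signature, input
that starts with anything else gets a bid of 0.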
Support Joliet extensions. This currently ignores Rock Ridge extensions
if both exist on the same disk unless the '!joliet' option is provided.
e.g.: tar -xvf example.iso --options '!joliet'
Thanks to: Andreas Henriksson
as the compression name when no other read filter bid. Add some
assertions to various tests to verify that read filters are properly
setting the textual name as well as the compression code.
information to error strings. This caused a lot of unnecessary
duplication in error messages; in particular, there are a few cases
where error messages get copied from one archive object to another
and this would cause the strerror() info to get appended each time.
Restoring POSIX.1e Extended Attributes on FreeBSD, part 1
This implements the basic ability to restore extended attributes
on FreeBSD, including a test suite.
Zip entries that are zero length but stored with deflate. This
is arguably a silly thing to do (deflating a zero-length file actually
makes it bigger) but apparently quite a few Zip writers do this.
This was broken in two places: archive_write_disk disliked being asked
to write data to zero-length files (even if the write was zero-length)
and zip_read_file_header tripped over itself when non-regular files
had compressed bodies.
from libarchive.googlecode.com: Add a new "archive_read_disk" API
that provides the important service of reading metadata from the
disk. In particular, this will make it possible to remove all
knowledge of extended attributes, ACLs, etc, from clients such
as bsdtar and bsdcpio.
Closely related, this API also provides pluggable uid->uname
and gid->gname lookup and caching services similar to
the uname->uid and gname->gid services provided by archive_write_disk.
Remember this is also required for correct ACL management.
Documentation is still pending...
into the debugger on test setup failures (otherwise, the console window
just goes away and you can't see what went wrong). On all platforms,
clean up a stray buffer before exiting.
This is the last phase of the "big decompression refactor" that
puts a lazy reblocking layer between each pair of read filters.
I've also changed the terminology for this area---the two kinds
of objects are now called "read filters" and "read filter bidders"---and
moved ownership of these objects to the archive_read core.
This greatly simplifies implementing new read filters, which
can now use peek/consume I/O semantics both for bidding (arbitrary
look-ahead!) and for reading streams (look-ahead simplifies handling
concatenated streams, for instance).
The first merge here is the overhaul proper; the remainder are small
fixes to correct errors in the initial implementation.
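To illustrate what peek/consume semantics mean here, a minimal
sketch over an in-memory source. The mem_source structure and the
src_peek()/src_consume() names are hypothetical; the real internal
API differs.

```c
#include <stddef.h>

/* Hypothetical in-memory upstream source. */
struct mem_source {
    const unsigned char *data;
    size_t len;
    size_t pos;
};

/*
 * Look ahead without advancing: return a pointer to at least `min`
 * bytes, or NULL if fewer remain. *avail reports what is available.
 */
static const unsigned char *
src_peek(const struct mem_source *s, size_t min, size_t *avail)
{
    size_t left = s->len - s->pos;

    if (avail != NULL)
        *avail = left;
    return left >= min ? s->data + s->pos : NULL;
}

/* Advance past bytes the caller has finished with. */
static size_t
src_consume(struct mem_source *s, size_t n)
{
    size_t left = s->len - s->pos;

    if (n > left)
        n = left;
    s->pos += n;
    return n;
}
```

A bidder can peek at a header and consume nothing, which leaves the
same bytes in place for whichever filter ends up reading the stream.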
locale-based failures on systems where the "C" locale is so permissive
that it cannot possibly fail. In particular, this fixes a test
problem on Cygwin.
In archive_write_disk: If archive_write_header() fails to create
the file, that's a failure and should return ARCHIVE_FAILED.
Metadata restore failures still return ARCHIVE_WARN, because
that's non-critical. Fix test_write_disk_secure test to
verify the correct return code in one case; add test_write_disk_failures
to do another very simple test of restore failure.
This should fix cpio coredumping when it tries to restore to
a write-protected directory.
Thanks to: Giorgos Keramidas
MFC after: 30 days
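A sketch of the return-code policy described above. The SK_*
constants mirror the numeric values of ARCHIVE_OK, ARCHIVE_WARN and
ARCHIVE_FAILED from archive.h; restore_status() and can_write_body()
are hypothetical helpers, not library functions.

```c
/* Local copies of the archive.h status values, renamed for the sketch. */
enum {
    SK_OK = 0,
    SK_WARN = -20,
    SK_FAILED = -25
};

/* Failing to create the file is a failure; metadata problems only warn. */
static int
restore_status(int created, int metadata_restored)
{
    if (!created)
        return SK_FAILED;
    if (!metadata_restored)
        return SK_WARN;
    return SK_OK;
}

/* A client should only try to write file data if the entry exists. */
static int
can_write_body(int status)
{
    return status == SK_OK || status == SK_WARN;
}
```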
end of the compressed stream. This is desirable behavior,
but the implementation here is very broken and causes strange
problems, so disable it for now.
Thanks to Simon L. Nielsen for reporting this problem.
* support for bzip2 file with multiple concatenated bzip2 streams
* support for bzip2 file with junk after bzip2 stream
* support for gzip file with junk after gzip stream
* "fuzz" tester randomly modifies a bunch of input files in order to try
to crash libarchive (this found an amusing hang in the ISO9660 code
when trying to read images that advertised a zero blocksize).
This test is implemented, but commented out for now:
* support for gzip file with multiple concatenated gzip streams
This is an attempt to eliminate a lot of redundant
code from the read ("decompression") filters by
changing them to juggle arbitrary-sized blocks
and consolidate reblocking code at a single point
in archive_read.c.
Along the way, I've changed the internal read/consume
API used by the format handlers to a slightly
different style originally suggested by des@. It
does seem to simplify a lot of common cases.
The most dramatic change is, of course, to
archive_read_support_compression_none(), which
has just evaporated into a no-op as the blocking
code this used to hold has all been moved up
a level.
There's at least one more big round of refactoring
yet to come before the individual filters are as
straightforward as I think they should be...
* Wrap long declarations to fit 80 chars
* #undef macros that shouldn't be exported
* Organize the version-dependent conditionals a
bit more consistently
Speculative:
* libarchive 3.0 will (eventually) use int64_t
instead of off_t. This is an attempt to avoid
some of the headaches caused by Linux LFS. (I'll
still have to do ugly things for the struct stat
references in archive_entry.h, of course.)
If it's not a regular file, don't return any data, even if the size is unknown.
Update the Zip test with a hand-tweaked Zip archive that has a
directory (with length-at-end set), a regular file without
length-at-end set, and a regular file with length-at-end set and a bad
CRC. Update the test code to verify that the file size is unset
for the regular file with length-at-end.
MFC after: 7 days
logic here gets a little complex, but the net effect is that the
SECURE_SYMLINKS flag will prevent us from ever following a symlink.
Without it, we'll only follow symlinks to dirs. bsdtar specifies
SECURE_SYMLINKS by default and suppresses it for -P.
I've also beefed up the write_disk_secure test to verify this
behavior.
PR: bin/126849
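The net effect described above, written out as a small decision
function. The names are hypothetical and the sketch only restates
the stated rule; it is not the archive_write_disk logic itself.

```c
enum target_kind { TARGET_FILE, TARGET_DIR };

/* With the flag set, never follow; without it, follow only to dirs. */
static int
may_follow_symlink(int secure_symlinks, enum target_kind target)
{
    if (secure_symlinks)
        return 0;
    return target == TARGET_DIR;
}

/* bsdtar default as described: secure unless -P was given. */
static int
bsdtar_secure_symlinks(int option_P)
{
    return !option_P;
}
```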
unspecified size are "unlimited" (required by Zip reader, which
sometimes does not know the uncompressed size of an entry until it
gets to the end). Also, hardlinks with unspecified (or zero) size do
not overwrite the data on disk nor do they set metadata. This is
compatible with GNU tar and NetBSD pax behavior.
This generalizes the existing set/unset tracking for hardlink/symlink
fields and extends it to cover non-string fields. Eventually, this
will be further extended to cover most fields.
In particular, this is needed to correctly detect when time fields
are missing (for example, reading ustar archives doesn't set atime or
ctime) for proper time restore and is helpful when trying to determine
whether to overwrite data when restoring hardlinks.
This commit updates the tests but not the docs.
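A minimal sketch of set/unset tracking for non-string fields, using
a bitmask beside the values. The sk_entry structure and sk_*
functions are hypothetical and much simpler than the real entry
object.

```c
#include <stdint.h>

#define SK_ATIME_SET (1u << 0)
#define SK_MTIME_SET (1u << 1)

/* Hypothetical entry: values plus a mask of which ones are valid. */
struct sk_entry {
    unsigned int set;
    int64_t atime;
    int64_t mtime;
};

static void
sk_set_atime(struct sk_entry *e, int64_t t)
{
    e->atime = t;
    e->set |= SK_ATIME_SET;
}

static void
sk_set_mtime(struct sk_entry *e, int64_t t)
{
    e->mtime = t;
    e->set |= SK_MTIME_SET;
}

static void
sk_unset_atime(struct sk_entry *e)
{
    e->atime = 0;
    e->set &= ~SK_ATIME_SET;
}

static int
sk_atime_is_set(const struct sk_entry *e)
{
    return (e->set & SK_ATIME_SET) != 0;
}

static int
sk_mtime_is_set(const struct sk_entry *e)
{
    return (e->set & SK_MTIME_SET) != 0;
}
```

The mask is what lets a timestamp of 0 (the epoch) be told apart
from a field that the archive format never supplied.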
Since various 'find' incantations can emit container directories
in various orders, we cannot refuse to update a dir because it's
apparently the same age.
MFC after: 3 days
understand which code paths aren't possible.
This commit eliminates 117 false positive bug reports of the form
"allocate memory; error out if pointer is NULL; use pointer".
schedule a chmod() fixup for directories. In particular, this fixes
sgid handling on systems where the sgid bit is inherited from the
parent directory (which means that the actual mode of the dir
does not match the mode used in the mkdir() system call).
It may be possible to tighten this condition a bit. In
working through this, I also found a few other places where
it looks like we can avoid a redundant syscall or two. I've
commented those here but not yet tried to address them.