In particular, this check avoids a warning when
extracting directory entries from certain GNU tar
archives that store directory contents.
MFC after: 3 days
feedback, but the 2.5 branch is shaping up nicely.)
In addition to many small bug fixes and code improvements:
* Another iteration of versioning; I think I've got it right now.
* Portability: A lot of progress on Windows support (though I'm
not committing all of the Windows support files to FreeBSD CVS)
* Explicit tracking of MBS, WCS, and UTF-8 versions of strings
in archive_entry; the archive_entry routines now correctly return
NULL only when something is unset, setting NULL properly clears
string values. Most charset conversions have been pushed down to
archive_string.
* Better handling of charset conversion failure when writing or
reading UTF-8 headers in pax archives
* archive_entry_linkify() provides multiple strategies for
hardlink matching to suit different format expectations
* More accurate bzip2 format detection
* Joerg Sonnenberger's extensive improvements to mtree support
* Rough support for self-extracting ZIP archives. Not an ideal
approach, but it works for the archives I've tried.
* New "sparsify" option in archive_write_disk converts blocks of nulls
into seeks.
* Better default behavior for the test harness; it now reports
all failures by default instead of coredumping at the first one.
* "compression_program" support uses an external program
* Portability: no longer uses "struct stat" as a primary
data interchange structure internally
* Part of the above: refactor archive_entry to separate
out copy_stat() and stat() functions
* More complete tests for archive_entry
* Finish archive_entry_clone()
* Isolate major()/minor()/makedev() in archive_entry; remove
these from everywhere else.
* Bug fix: properly handle decompression look-ahead at end-of-data
* Bug fixes to 'ar' support
* Fix memory leak in ZIP reader
* Portability: better timegm() emulation in iso9660 reader
* New write_disk flags to suppress auto dir creation and not
overwrite newer files (for future cpio front-end)
* Simplify trailing-'/' fixup when writing tar and pax
* Test enhancements: fix various compiler warnings, improve
portability, add lots of new tests.
* Documentation: document new functions, first draft of
libarchive_internals.3
MFC after: 14 days
Thanks to: Joerg Sonnenberger (compression_program)
Thanks to: Kai Wang (ar)
Thanks to: Colin Percival (many small fixes)
Thanks to: Many others who sent me various patches and problem reports.
occur on the write side of extracting a file to ARCHIVE_WARN errors
when returning them from archive_read_extract.
In bsdtar: Use the return code from archive_read_data_into_fd and
archive_read_extract to determine whether we should continue trying to
extract an archive after one of the entries fails.
This commit makes extracting a truncated tarball complain once about
the archive being truncated, instead of complaining twice (once when
trying to extract an entry, and once when trying to seek to the next
entry).
Discussed with: kientzle
message in the reader to the error message from the writer if the
error which occurred was in the writer. This avoids error messages
of "Empty error message" when extracting truncated archives.
* The ACL formatter was mis-formatting entries which had a
user/group ID but no name. Make the parser tolerant of
these, so that old archives can be correctly restored;
fix the formatter to generate correct entries.
* Fix overwrite detection by introducing a new "FAILED" return
code that indicates the current entry cannot be continued
but the archive as a whole is still sound.
* Header cleanup: Remove some unused headers, add some that
are required with new Linux systems.
* libarchive_test program exercises many of the core features
* Refactored old "read_extract" into new "archive_write_disk", which
uses archive_write methods to put entries onto disk. In particular,
you can now use archive_write_disk to create objects on disk
without having an archive available.
* Pushed some security checks from bsdtar down into libarchive, where
they can be better optimized.
* Rearchitected the logic for creating objects on disk to reduce
the number of system calls. Several common cases now use a
minimum number of system calls.
* Virtualized some internal interfaces to provide a clearer separation
of read and write handling and make it simpler to override key
methods.
* New "empty" format reader.
* Corrected return types (this ABI breakage required the "2.0" version bump)
* Many bug fixes.
a vanilla 2-clause BSD license, but somehow some confusing
extra verbage get copied from somewhere.
Also, update the copyright dates to 2007 for all of the files.
Prompted by: several questions about what those extra words really mean
* Actually use the HAVE_<header>_H macros to conditionally include
system headers. They've been defined for a long time, but only
used in a few places. Now they're used pretty consistently
throughout.
* Fill in a lot of missing casts for conversions from void*.
Although Standard C doesn't require this, some people have been
trying to use C++ compilers with this code, and they do require it.
Bit-for-bit, the compiled object files are identical, except for
one assert() whose line number changed, so I'm pretty confident I
didn't break anything. ;-)
restore it directly and skip chmod() during the post-extract fixup.
In particular, bsdtar -xm now completely skips the post-extract fixup
for directories, which produces a noticable speedup in that case.
* Expose functions for setting the "skip file" dev/ino information
* Expose functions for setting/querying the block size on reads
* Correctly propagate errors out of archive_read_close/archive_write_close
* Update manpage with information about new functions
in part by OpenBSD's not-quite-standard-compliant
standard libraries. (No loss of functionality,
just minor recoding to not rely on certain "standard"
facilities that weren't actually needed.)
it's only a failure if there were actually attributes to be restored.
In particular, this fixes the problem where tar -xp always returned
a failure code on FreeBSD (which doesn't yet have all of the extended
attribute support).
Thanks to: Diego "Flameeyes" Petteno
This commit implements storing/reading POSIX.1e-style extended
attribute information in "pax" format archives. An outline of the
storage format is in the tar.5 manpage. The archive_read_extract()
function has code to restore those archives to disk for Linux; FreeBSD
implementation is forthcoming.
Many thanks to Jaakko Heinonen for finding flaws in earlier
proposals and doing the bulk of the coding in this work.
and restoring the metadata. In particular, the metadata-restore
functions now all accept a file descriptor and a pathname. If the
file descriptor is set and the platform supports the appropriate
syscall, restore the metadata through the file descriptor. Otherwise,
restore it through the pathname. This is complicated by varying
syscall support (FreeBSD has an fchmod(2) but no fchflags(2), for
example) and because non-file entries don't have an fd to use in
restoring attributes (for example, mknod(2) doesn't return a file
handle).
MFC after: 14 days
This should provide a big performance boost for folks using NIS or LDAP.
MFC after: 3 days
Thanks to: Jun Kuriyama (for reminding me that this was still on my TODO list)
This closes a security hole. Otherwise, libarchive will happily
extract into directories to which it lacks write permissions by
resetting the permissions during the extract.
Thanks to: Kris Kennaway
is present for FreeBSD. If you "make distfile" on FreeBSD, you will
soon have a tar.gz file suitable for deploying to other systems
(complete with the expected "configure" script, etc). This latter
relies (at least for now) on the GNU auto??? tools. (I like autoconf
okay, but someday I hope to write a custom Makefile.in and dispense
with automake, which is somewhat odious.)
As part of this, I've cleaned up some of the conditional
compilation options, added make-foo to construct archive.h dynamically
(it now contains some version constants), and added some useful
informational files.
Thanks to: David O'Brien for pointing this out.
Also, add in a few additional portability tweaks and make a few
more things conditional on features (HAVE_XXXX macros) rather
than platform.
code. <whew!> This version handles all of the following edge cases:
* Restoring explicit dirs with 000 permissions (star fails this test)
* Restore of implicit or explicit dirs when umask=777
(gtar and star both fail this test)
* Restoring dir paths containing "." and ".." components
This version initially creates all dirs with permission 700 (ignoring
umask), then does a post-extract "fixup" pass to set the correct
permissions (which may or may not depend on umask, depending on the
restore flags and whether it's an explicit or implicit dir).
Permissions are restored depth-first so that permissions within
non-writable dirs can be correctly restored. (The depth-sorting does
correctly account for dirs with ".." components.)
applied to their permissions. Just calculate the
default dir mode once and use it consistently, rather than
trying to remember to calculate it everywhere it's needed.
* Rename some variables/functions/etc to try to make things clearer.
* Add separate flags to control fflag/acl restore
* Collect metadata restore into a single function for clarity
* Propagate errors in metadata restore back out to the client
* Fix some places where errors were being returned when they
shouldn't and vice-versa
* Modes are now always restored; ARCHIVE_EXTRACT_PERM just controls
whether or not umask is obeyed.
* Restore suid/sgid bits only if user/group matches archive
* Cache the last stat results to try to reduce the number of stat calls
Mostly, these were being used correctly even though a lot of
variables and function names were mis-named.
In the process, I found and fixed a couple of latent bugs and
added a guard against adding an archive to itself.
a/././b/../b/../c/./../d/e/f now work correctly. And yes, a/b and a/c
both get created in this example; if you want, you can create an
entire dir heirarchy from a tar archive with only one entry.
More tweaks to umask support: umasks are now obeyed for all objects,
not just directories; the umask used is now the one in effect at the
corresponding call to archive_read_extract(), so clients that want to
tinker with umask during extract should get the expected behavior.
umask in effect when the archive is closed
* Correct a typo that broke implicit dir creation for non-directories.
Thanks to: Garret A Wollman for pointing out my umask oversight
read_extract_dir (which creates directories in the archive). This
brings a number of advantages:
* FINALLY fix the problems creating dirs ending in "/." <sigh>
* Missing parent dirs now get created securely, just like explicit dirs.
(Created 0700 initially, then edited to 0755 at end of extraction.)
* Eliminate some duplicate code and some weird special cases.
While I'm cleaning, inline the regular-file creation code as well.
This change also pointed out one API deficiency: the
archive_read_data_into_XXX functions were originally defined to return
the total bytes read. This is, of course, ambiguous when dealing with
non-contiguous files. Change it to just return a status value.
file already exists on disk.
Pointed out by: www/resin3 port (whose distfile contains the same file
twice with different permissions and relies on the permissions associated
with the second instance)
Thanks again to: Kris Kennaway
* Restore directories with 0700 permissions initially,
then use the fixup pass to correct the permissions
* Trim trailing "/" and "/." in mkdirpath()
Suggested by: Garrett Wollman
push extract data down into archive_read_extract.c and out
of the library-global archive_private.h; push dir-specific
mode/time fixup down into dir restore function; now that the
fixup list is file-local, I can use somewhat more natural
naming.
Oh, yeah, update a bunch of comments to match current reality.
certain flags set (e.g., schg or uappend) would fail because the flags
were restored before the hardlink was created.
To address this, I've generalized the existing machinery for deferring
directory timestamp/mode restoration and used it to defer the
restoration of highly-restrictive flags to the end of the extraction,
after any links have been created.
Pointed out by: Pawel Jakub Dawidek (pjd@)