This should provide a big performance boost for folks using NIS or LDAP.
MFC after: 3 days
Thanks to: Jun Kuriyama (for reminding me that this was still on my TODO list)
This closes a security hole. Otherwise, libarchive will happily
extract into directories to which it lacks write permissions by
resetting the permissions during the extract.
Thanks to: Kris Kennaway
is present for FreeBSD. If you "make distfile" on FreeBSD, you will
soon have a tar.gz file suitable for deploying to other systems
(complete with the expected "configure" script, etc). This latter
relies (at least for now) on the GNU auto??? tools. (I like autoconf
okay, but someday I hope to write a custom Makefile.in and dispense
with automake, which is somewhat odious.)
As part of this, I've cleaned up some of the conditional
compilation options, added make-foo to construct archive.h dynamically
(it now contains some version constants), and added some useful
informational files.
Thanks to: David O'Brien for pointing this out.
Also, add in a few additional portability tweaks and make a few
more things conditional on features (HAVE_XXXX macros) rather
than platform.
code. <whew!> This version handles all of the following edge cases:
* Restoring explicit dirs with 000 permissions (star fails this test)
* Restore of implicit or explicit dirs when umask=777
(gtar and star both fail this test)
* Restoring dir paths containing "." and ".." components
This version initially creates all dirs with permission 700 (ignoring
umask), then does a post-extract "fixup" pass to set the correct
permissions (which may or may not depend on umask, depending on the
restore flags and whether it's an explicit or implicit dir).
Permissions are restored depth-first so that permissions within
non-writable dirs can be correctly restored. (The depth-sorting does
correctly account for dirs with ".." components.)
applied to their permissions. Just calculate the
default dir mode once and use it consistently, rather than
trying to remember to calculate it everywhere it's needed.
* Rename some variables/functions/etc to try to make things clearer.
* Add separate flags to control fflag/acl restore
* Collect metadata restore into a single function for clarity
* Propagate errors in metadata restore back out to the client
* Fix some places where errors were being returned when they
shouldn't and vice-versa
* Modes are now always restored; ARCHIVE_EXTRACT_PERM just controls
whether or not umask is obeyed.
* Restore suid/sgid bits only if user/group matches archive
* Cache the last stat results to try to reduce the number of stat calls
Mostly, these were being used correctly even though a lot of
variables and function names were mis-named.
In the process, I found and fixed a couple of latent bugs and
added a guard against adding an archive to itself.
a/././b/../b/../c/./../d/e/f now work correctly. And yes, a/b and a/c
both get created in this example; if you want, you can create an
entire dir heirarchy from a tar archive with only one entry.
More tweaks to umask support: umasks are now obeyed for all objects,
not just directories; the umask used is now the one in effect at the
corresponding call to archive_read_extract(), so clients that want to
tinker with umask during extract should get the expected behavior.
umask in effect when the archive is closed
* Correct a typo that broke implicit dir creation for non-directories.
Thanks to: Garret A Wollman for pointing out my umask oversight
read_extract_dir (which creates directories in the archive). This
brings a number of advantages:
* FINALLY fix the problems creating dirs ending in "/." <sigh>
* Missing parent dirs now get created securely, just like explicit dirs.
(Created 0700 initially, then edited to 0755 at end of extraction.)
* Eliminate some duplicate code and some weird special cases.
While I'm cleaning, inline the regular-file creation code as well.
This change also pointed out one API deficiency: the
archive_read_data_into_XXX functions were originally defined to return
the total bytes read. This is, of course, ambiguous when dealing with
non-contiguous files. Change it to just return a status value.
file already exists on disk.
Pointed out by: www/resin3 port (whose distfile contains the same file
twice with different permissions and relies on the permissions associated
with the second instance)
Thanks again to: Kris Kennaway
* Restore directories with 0700 permissions initially,
then use the fixup pass to correct the permissions
* Trim trailing "/" and "/." in mkdirpath()
Suggested by: Garrett Wollman
push extract data down into archive_read_extract.c and out
of the library-global archive_private.h; push dir-specific
mode/time fixup down into dir restore function; now that the
fixup list is file-local, I can use somewhat more natural
naming.
Oh, yeah, update a bunch of comments to match current reality.
certain flags set (e.g., schg or uappend) would fail because the flags
were restored before the hardlink was created.
To address this, I've generalized the existing machinery for deferring
directory timestamp/mode restoration and used it to defer the
restoration of highly-restrictive flags to the end of the extraction,
after any links have been created.
Pointed out by: Pawel Jakub Dawidek (pjd@)
-U flag to bsdtar. Essentially, this option breaks existing hard
links. According to SUSv2, tar is supposed to overwrite existing
files on extract by default which, in particular, preserves
existing hard links. Note that this is yet another bug in gtar; it
appears to always break existing links. (Maybe gtar's -U is broken?)
I'm unsure about how to handle this for other file types; the current
code always unlinks first unless the NO_OVERWRITE flag is specified.
I've commented this issue liberally and will come back to it later.
The new fflags support in archive_entry supports Linux and FreeBSD
file flags and is a bit more gracious about unrecognized flag names
than strtofflags(3). This involves some minor API breakage.
The default tar format ("restricted pax") now enables pax extensions
when archiving files that have flags. In particular, copying dir
heirarchies with 'bsdtar cf - -C src . | bsdtar xpf - -C dest' now
preserves file flags. (Note the "p" on extract!)
While I'm here, fill in some additional explanation in the
archive_entry.3 manpage, fill in some missing MLINKS, mark some
overlooked internal functions 'static', and make a few minor style
fixes.
High-resolution mtime/ctime/atime is not POSIX-standard, so hide
set/get of high-resolution time fields behind easily-mutable macros.
That makes it easier to change how those fields are accessed.
try to set ACLs even if fflag restore fails, first cut at reading
Solaris tar ACLs
Code improvement: merge gnu tar read support into main tar reader;
this eliminates a lot of duplicate code and generalizes the tar
reader to handle formats with GNU-like extensions.
Style: Makefile cleanup, eliminate 'dmalloc' references, remove 'tartype'
from archive_entry (this makes archive_entry more format-agnostic)
Thanks to: David Magda for providing Solaris tar test files
* ACL storage is no longer erased before a group of entries are added.
* ACL text creation no longer tries to skip over non-existent text.
* UTF8 encoder no longer blows up on invalid wide characters.
* Fixed ACL state management for default ACLs.
Also, publicize function for obtaining text-format ACL in various
formats. The interface is now extensible through a "flags" argument
that allows you to select a variant format.
with 'star' ACL handling, though there's still a
bit more work needed in this area.
Added 'write_open_fd' and 'read_open_fd' to simplify, e.g.,
tar's u and r modes. Eliminated old 'write_open_file_position'
as a bad idea. (It required closing/reopening files to
do updates, which led to unpleasant implications.)
Various other minor fixes, API tweaks, etc.
Portability: Thanks to Juergen Lock, libarchive now compiles cleanly
on Linux. Along the way, I cleaned up a lot of error return codes and
reorganized some code to simplify conditional compilation of certain
sections.
Bug fixes:
* pax format now actually stores filenames that are 101-154
characters long.
* pax format now allows newline characters in extended attributes
(this fixes a long-standing bug in ACL handling)
* mtime/atime are now restored for directories
* directory list is now sorted prior to fix-up to permit
correct restore of non-writable dir heirarchies
What it is:
A library for reading and writing various streaming archive
formats, especially tar and cpio. Being a library, it should
be easy to incorporate into pkg_* tools, sysinstall, and any
other place that needs to read or write such archives.
Features:
* Full automatic detection of both compression and archive format.
* Extensible internal architecture to make it easy to add new formats.
* Support for "pax interchange format," a new POSIX-standard tar format
that eliminates essentially all of the restrictions of historic formats.
* BSD license
Thanks to: jkh for pushing me to start this work, gordon for
encouraging me to commit it, bde for answering endless style
questions, and many others for feedback and encouragement.
Status: Pretty good overall, though there are still a few rough edges and
the library could always use more testing. Feedback eagerly solicited.