code. <whew!> This version handles all of the following edge cases:
* Restoring explicit dirs with 000 permissions (star fails this test)
* Restore of implicit or explicit dirs when umask=777
(gtar and star both fail this test)
* Restoring dir paths containing "." and ".." components
This version initially creates all dirs with permission 700 (ignoring
umask), then does a post-extract "fixup" pass to set the correct
permissions (which may or may not depend on umask, depending on the
restore flags and whether it's an explicit or implicit dir).
Permissions are restored depth-first so that permissions within
non-writable dirs can be correctly restored. (The depth-sorting does
correctly account for dirs with ".." components.)
applied to their permissions. Just calculate the
default dir mode once and use it consistently, rather than
trying to remember to calculate it everywhere it's needed.
* Rename some variables/functions/etc to try to make things clearer.
* Add separate flags to control fflag/acl restore
* Collect metadata restore into a single function for clarity
* Propagate errors in metadata restore back out to the client
* Fix some places where errors were being returned when they
shouldn't and vice-versa
* Modes are now always restored; ARCHIVE_EXTRACT_PERM just controls
whether or not umask is obeyed.
* Restore suid/sgid bits only if user/group matches archive
* Cache the last stat results to try to reduce the number of stat calls
archive_entry.
Update the Makefile MLINKS and manpage to bring it up-to-date with
the current status of archive_entry. At least the manpage actually
lists all of the functions now, even if it doesn't really yet explain
them all.
Mostly, these were being used correctly even though a lot of
variables and function names were mis-named.
In the process, I found and fixed a couple of latent bugs and
added a guard against adding an archive to itself.
a/././b/../b/../c/./../d/e/f now work correctly. And yes, a/b and a/c
both get created in this example; if you want, you can create an
entire dir heirarchy from a tar archive with only one entry.
More tweaks to umask support: umasks are now obeyed for all objects,
not just directories; the umask used is now the one in effect at the
corresponding call to archive_read_extract(), so clients that want to
tinker with umask during extract should get the expected behavior.
umask in effect when the archive is closed
* Correct a typo that broke implicit dir creation for non-directories.
Thanks to: Garret A Wollman for pointing out my umask oversight
read_extract_dir (which creates directories in the archive). This
brings a number of advantages:
* FINALLY fix the problems creating dirs ending in "/." <sigh>
* Missing parent dirs now get created securely, just like explicit dirs.
(Created 0700 initially, then edited to 0755 at end of extraction.)
* Eliminate some duplicate code and some weird special cases.
While I'm cleaning, inline the regular-file creation code as well.
This change also pointed out one API deficiency: the
archive_read_data_into_XXX functions were originally defined to return
the total bytes read. This is, of course, ambiguous when dealing with
non-contiguous files. Change it to just return a status value.
file already exists on disk.
Pointed out by: www/resin3 port (whose distfile contains the same file
twice with different permissions and relies on the permissions associated
with the second instance)
Thanks again to: Kris Kennaway
* Restore directories with 0700 permissions initially,
then use the fixup pass to correct the permissions
* Trim trailing "/" and "/." in mkdirpath()
Suggested by: Garrett Wollman
distinguish files from dirs (trailing '/' indicated a dir). Since
POSIX.1-1987, this convention is no longer necessary. However, there
are current tar programs that pretend to write POSIX-compliant
archives, yet store directories as "regular files", relying on this
old filename convention to save them. <sigh> So, move the check for
this old convention so it applies to all tar archives, not just those
identified as "old."
Pointed out by: Broken distfile for audio/faad port
it sees a truncated input the first time it gets called.
(In particular, files shorter than 512 bytes cannot be tar archives.)
This allows the top-level archive_read_next_header code to
generate a proper error message for unrecognized file types.
Pointed out by: numerous ports that expect tar to extract non-tar files ;-(
Thanks to: Kris Kennaway
version called the higher-level archive_read_data and
archive_read_data_skip functions, which screwed up state management of
those functions. This bit of mis-design has existed for a long time,
but became a serious issue with the recent changes to the
archive_read_data APIs, which added more internal state to the
high-level archive_read_data function. Most common symptom was a
failure to correctly read 'L' entries (long filename) from GNU-style
archives, causing the message ": Can't open: No such file or
directory" with an empty filename.
Pointed out by: Numerous port build failures
Thanks to: Kris Kennaway
that as end-of-archive. Otherwise, a short read at this point
generates an error. This accomodates broken tar writers (such as the
one apparently in use at AT&T Labs) that don't even write a single
end-of-archive block.
Note that both star and pdtar behave this way as well.
In contrast, gtar doesn't complain in either case, and as a
result, will generate no warning for a lot of trashed archives.
Pointed out by: shells/ksh93 port (Thanks to Kris Kennaway)
push extract data down into archive_read_extract.c and out
of the library-global archive_private.h; push dir-specific
mode/time fixup down into dir restore function; now that the
fixup list is file-local, I can use somewhat more natural
naming.
Oh, yeah, update a bunch of comments to match current reality.
* New read_data_block is both sparse-file aware and uses zero-copy semantics
* Push read_data_block down into specific formats (opens door to
various encoded entry bodies, such as zip or gtar -S)
* Reimplement read_data, read_data_skip, read_data_into_fd in terms
of new read_data_block.
* Update documentation
It's unfortunate that I couldn't just call the new interface
archive_read_data, but didn't want to upset the API that much.
there's no need to enable support for it separately
from 'tar.' (The call to enable gnutar support is
now just an alias for the tar support, left in to
avoid API breakage.)
certain flags set (e.g., schg or uappend) would fail because the flags
were restored before the hardlink was created.
To address this, I've generalized the existing machinery for deferring
directory timestamp/mode restoration and used it to defer the
restoration of highly-restrictive flags to the end of the extraction,
after any links have been created.
Pointed out by: Pawel Jakub Dawidek (pjd@)
character, as some tar implementations incorrectly include a '/' with
the prefix.
Thanks to: Divacky Roman for the UnixWare 7 tarfile that
demonstrated this issue.
code available at tuhs.org, and found out that my chronology
is a bit off. In particular, Seventh Edition already used
the "linkflag" and "linkname" fields. Also, it appears that
there was no tar in Sixth Edition, contrary to what an earlier
tar.1 manpage claimed.
A few mdoc fixes also crept in here.
the size field for a hardlink entry. Specifically, ensure that
we do obey the size field for archives that we know are pax interchange
format archives, as required by POSIX.
Also, clarify the comment explaining why this is necessary and explain
the (very unusual) conditions under which it might fail.