message in the reader to the error message from the writer if the
error which occurred was in the writer. This avoids error messages
of "Empty error message" when extracting truncated archives.
implementation, and mark it as deprecated. It will be removed entirely
in libarchive 3.0 (in FreeBSD 8.0?) but there's no reason for anyone to
use it instead of archive_read_data.
Approved by: kientzle
skip over the end-of-entry padding instead of reading and discarding
it.
Considering that tar files normally have a block size of 10kB, this
isn't likely to avoid reading any data, but at least it makes the code
simpler and clearer.
discards it, for use when the compression layer code doesn't know how to
skip data (e.g., everything other than the "none" compressor). This makes
format level code simpler because that code can now assume that the
compression layer always knows how to skip and will always skip exactly
the requested number of bytes.
Discussed with: kientzle (3 months ago)
Don't change permissions on an existing dir unless _EXTRACT_PERM
is requested.
In particular, bsdtar -x should not edit mode of existing dirs
now; bsdtar -xp will.
* Only try to remove the existing item if we're not restoring a directory.
* If unlink fails, try rmdir next.
This should fix the broken --unlink option in bsdtar.
Thanks again to: Kris Kennaway, for beating up bsdtar on pointyhat.
* The ACL formatter was mis-formatting entries which had a
user/group ID but no name. Make the parser tolerant of
these, so that old archives can be correctly restored;
fix the formatter to generate correct entries.
* Fix overwrite detection by introducing a new "FAILED" return
code that indicates the current entry cannot be continued
but the archive as a whole is still sound.
* Header cleanup: Remove some unused headers, add some that
are required with new Linux systems.
These tests verify that archive_entry objects can store and return
ACL data and that pax format archives can read and write ACL
information. These do not (yet) test that ACL data is read or
written to disk correctly. (And hence would not have caught the
recent snafu about ACL read-from-disk being turned off.)
ACL data from the archive entry. This doesn't impact
archive_read_extract or archive_write_disk since they only
check for != ARCHIVE_OK when calling this function. (Though
they should be more careful.)
* libarchive_test program exercises many of the core features
* Refactored old "read_extract" into new "archive_write_disk", which
uses archive_write methods to put entries onto disk. In particular,
you can now use archive_write_disk to create objects on disk
without having an archive available.
* Pushed some security checks from bsdtar down into libarchive, where
they can be better optimized.
* Rearchitected the logic for creating objects on disk to reduce
the number of system calls. Several common cases now use a
minimum number of system calls.
* Virtualized some internal interfaces to provide a clearer separation
of read and write handling and make it simpler to override key
methods.
* New "empty" format reader.
* Corrected return types (this ABI breakage required the "2.0" version bump)
* Many bug fixes.
copy the symlink target name, not just copy the reference.
This problem sometimes caused crashes when extracting
symlinks from ISO9660 images.
Thanks to: Diego "Flameeyes" Pettenò
Fallout from changing the skip API to use off_t instead of size_t: Print
the skip length using %jd and cast to (intmax_t) instead of %d / (int),
and if ARCHIVE_API_VERSION >= 2, allow the client skipper to be called
for requests longer than SSIZE_MAX. [2]
Approved by: kientzle
Pointy hats to: kientzle [1], cperciva [2]
MFC after: 3 days
a vanilla 2-clause BSD license, but somehow some confusing
extra verbage get copied from somewhere.
Also, update the copyright dates to 2007 for all of the files.
Prompted by: several questions about what those extra words really mean
wrap this within #if/#else/#endif so that it will only take effect once
ARCHIVE_API_VERSION is increased (which should happen on HEAD some time
between now and when RELENG_7 is branched).
returning the length skipped in a ssize_t to using off_t for both. This
does not break any A[BP]Is, since compression_skip is entirely internal
to libarchive.
If a skip request is > SSIZE_MAX, don't pass it down to the client layer
skip function, since those still uses size_t / ssize_t. Instead, just
read the data and throw it away.
With this commit, libarchive/bsdtar should now successfully skip archive
entries of >2GB on 32-bit systems, but does so slower than necessary.
The performance will improve with a future A[BP]I breaking commit which
makes client layer skip functions use off_t.
Discussed with: kientzle
MFC after: 1 week
functions are required to skip the requested distance, so we can avoid
lots of bookkeeping which would otherwise be necessary.
Reviewed by: kientzle
MFC after: 1 week
config_freebsd.h. archive_platform.h decides which config file
to bring in and uses some of those selectors to define wrapper
macros and other compatibility glue.
* Correct a signed/unsigned problem that broke handling of files >2G.
* Implement "skip" support for much faster "tar -t".
Thanks to: Robert Sciuk for sending me a DVD that illustrated the first problem
* If write block size is zero, don't block at all.
This supports the unusual requirement of applications
that need "no-delay" writes.
* Expose _write_finish_entry() to give such applications more
control over write boundaries. (Normal applications do not
need this, as entries are completed automatically.)
* Correct the type of write callbacks; this is a minor API
change that does not affect the ABI.
* Correct the error handling in _write_next_header() around
completing the previous entry.
* Correct the documentation for block-size markers: Remove
docs for the long-defunct _read_set_block_size(); document
all of the write block size manipulators.
MFC after: 14 days
traditional shortcut of defining on-disk layouts using structures of
character arrays. Unfortunately, as recently discussed on cvs-all@,
this usage is not actually sanctioned by the standards and
specifically fails on GCC/arm (unless your data structures happen to
be "naturally aligned").
The new code defines offsets/sizes for data fields and accesses
them using explicit pointer arithmetic, instead of casting to
a structure and accessing structure fields. In particular,
the new code is now clean with WARNS=6 on arm.
MFC after: 14 days
and correct the use of unary minus with an unsigned value. (The unary
minus here is actually being used as a bitwise operation, which is
unusual enough to deserve a clarifying cast.)
archive_{read,write}_open_filename():
* Update Makefile to build the files using the new name.
* Update docs to document the new names, mentioning the
old ones as "deprecated synonyms."
* The old filenames will be reconnected to the build soon;
I'll soon recyce those files for a slightly different purpose.
internal format-specific functions return the same as the public
function, so that the public API layer doesn't have to guess the
correct return value. This addresses an obscure problem that occurs
when someone tries to write more data than the size of the entry (as
indicated in the entry header). In this case, the return value from
archive_write_data() was incorrect, reflecting the requested write
rather than the amount actually written.
MFC after: 15 days
* Use public API, don't access struct archive directly. (People should be able to copy these into their applications as a template for custom I/O callbacks.)
* Set "skip" only for regular files. ("skip" allows the low-level library to catch attempts to add an archive to itself or extract over itself.)
* Simplify the write_open functions by just calling stat() at the beginning. Somehow, these functions had acquired some complex logic that tried to avoid the stat() call but never succeeded.
MFC after: 10 days
file. This doesn't happen in normal use, because the file I/O and
decompression layers only pass through smaller blocks. It can happen
with custom read functions that block I/O in larger blocks.
* Actually use the HAVE_<header>_H macros to conditionally include
system headers. They've been defined for a long time, but only
used in a few places. Now they're used pretty consistently
throughout.
* Fill in a lot of missing casts for conversions from void*.
Although Standard C doesn't require this, some people have been
trying to use C++ compilers with this code, and they do require it.
Bit-for-bit, the compiled object files are identical, except for
one assert() whose line number changed, so I'm pretty confident I
didn't break anything. ;-)
restore it directly and skip chmod() during the post-extract fixup.
In particular, bsdtar -xm now completely skips the post-extract fixup
for directories, which produces a noticable speedup in that case.
* Expose functions for setting the "skip file" dev/ino information
* Expose functions for setting/querying the block size on reads
* Correctly propagate errors out of archive_read_close/archive_write_close
* Update manpage with information about new functions
increases performance when extracting a single entry from a large
uncompressed archive, especially on slow devices such as USB hard
drives.
Requires a number of changes:
* New archive_read_open2() supports a 'skip' client function
* Old archive_read_open() is implemented as a wrapper now, to
continue supporting the old API/ABI.
* _read_open_fd and _read_open_file sprout new 'skip' functions.
* compression layer gets a new 'skip' operation.
* compression_none passes skip requests through to client.
* compression_{gzip,bzip2,compress} simply ignore skip requests.
Thanks to: Benjamin Lutz, who designed and implemented the whole thing.
I'm just committing it. ;-)
TODO: Need to update the documentation a little bit.
pax wasn't introduced until the 1993 (?) revision.
(I need to double-check when pax was introduced and
clarify some of the history here. In particular,
I should explain that the 'pax' standard now owns the
'ustar' format spec.)
in part by OpenBSD's not-quite-standard-compliant
standard libraries. (No loss of functionality,
just minor recoding to not rely on certain "standard"
facilities that weren't actually needed.)
it's only a failure if there were actually attributes to be restored.
In particular, this fixes the problem where tar -xp always returned
a failure code on FreeBSD (which doesn't yet have all of the extended
attribute support).
Thanks to: Diego "Flameeyes" Petteno
This commit implements storing/reading POSIX.1e-style extended
attribute information in "pax" format archives. An outline of the
storage format is in the tar.5 manpage. The archive_read_extract()
function has code to restore those archives to disk for Linux; FreeBSD
implementation is forthcoming.
Many thanks to Jaakko Heinonen for finding flaws in earlier
proposals and doing the bulk of the coding in this work.
routine fails or the first read fails), invoke the client close
routine immediately so the client can clean up. Also, don't store the
client pointers in this case, so that the client close routine can't
accidentally get called more than once.
A minor style fix to archive_read_open_fd.c while I'm here.
PR: 86453
Thanks to: Andrew Turner for reporting this and suggesting a fix.
archiver for Fourth Edition through Sixth Edition Unix; it was
replaced by tar in Seventh Edition. (First Edition through
Third Edition used "tap.")
Unfortunately, tp was not so very standard; there were a
few different variants. The code here attempts to support
what I believe were the most common variants.
tp support is not yet enabled by archive_read_support_format_all(),
as I'm not yet entirely comfortable with the detection
heuristics. People interested in experimenting can
add archive_read_support_format_tp() just after any calls
to archive_read_support_format_all() in bsdtar to see how
well this works.
TODO: tp format is roughly similar in structure to dump/restore
archive formats used by many systems. It should be possible
to generalize this code to handle many dump/restore variants.
Format detection heuristics are going to be rough, though.
Thanks to: Warren Toomey, whose very basic tp extraction programs
and documentation made this possible.
libarchive doesn't make malloc(0) requests, so the autoconf
checks aren't needed and the autoconf workarounds for
broken malloc(0) just create problems.
Thanks to: Dan Nelson, who reports that this fixes libarchive on AIX 5.2
expr and printf are not available during installworld, so
use /bin/sh arithmetic expansion instead of expr and simply
give up on vanity formatting. ;-)
systems (or on FreeBSD systems when using ports).
2) Overhaul the versioning logic. In particular,
SHLIB_MAJOR number is now computed as "major+minor",
which ensures library versions are the same for
the FreeBSD build system and the portable
libtool/autoconf/automake build system.
link names, usernames, or group names that contain
non-ASCII characters.
In particular, this corrects an inconsistency reported
by Ed Maste when archiving symlinks with odd characters:
long symlinks would get preserved, short ones would
be changed.
"HEADER" unless the open is successful. Instead, leave the state as
"NEW." In particular, if archive_read_open() fails, a subsequent call
to archive_read_next_header() will now cause an explicit assertion
failure instead of a silent segmentation fault.
This may need a little more work to fully realize the intention: If
archive_read_open() fails, you should be able to call it again on the
same archive handle to open a different archive (or the same archive
using a different mechanism).
(wchar_t is defined in stddef.h, and only two files need more than that.)
Portability: Since the wchar requirements are really quite modest,
it's easy to define basic replacements for wcslen, wcscmp, wcscpy,
etc, for use on systems that lack <wchar.h>. In particular, this allows
libarchive to be used on older OpenBSD systems.
extracted from tar archives. Otherwise, converting tar archives to
cpio format (with "bsdtar -cf out.cpio @in.tar") convert every entry
into a hard link to a single file. This simple logic breaks hard
links, but that's better than the alternative.
MFC after: 7 days
header of the pax extension entry, clip them to ustar limits. In particular,
this prevents an internal panic for very old files.
Thanks to: Chris Spiegel
MFC after: 7 days
GNU tar sparse files, people have extended cpio) and clarify an
important detail about pax format (that ustar-compliant archivers
can mostly read pax archives correctly).
MFC after: 7 days
compiling on IRIX and Solaris. Remove the "archive_check_magic" macro
that existed only to provide __func__ to the underlying __archive_check_magic
function.
Thanks to: Darin Broady
MFC after: 14 days
from mode before using mode for extended attributes entry, copy
mtime/atime/ctime to extended attributes entry so it's a little more
clear that it corresponds to the like-named regular entry.
MFC after: 14 days
and restoring the metadata. In particular, the metadata-restore
functions now all accept a file descriptor and a pathname. If the
file descriptor is set and the platform supports the appropriate
syscall, restore the metadata through the file descriptor. Otherwise,
restore it through the pathname. This is complicated by varying
syscall support (FreeBSD has an fchmod(2) but no fchflags(2), for
example) and because non-file entries don't have an fd to use in
restoring attributes (for example, mknod(2) doesn't return a file
handle).
MFC after: 14 days
(symlink or hardlink) is already set. Instead, it was always setting
the hardlink field. In particular, this caused GNU tar format long
symlinks to be interpreted as hardlinks.
Thanks to: Brooks Davis
MFC after: 7 days
internal error if pax extended attributes were being generated. Being
< 255 characters, the first-pass path editing (to generate a
ustar-compatible name for the main entry) wouldn't occur, and the
second-pass path editing (to generate a ustar name for the pax
attributes entry) assumed the input was already < 245 chars.
The core problem here was using an abbreviated algorithm for the
second pass that relied on the first pass having already run. The
rewritten code is much simpler: It just uses the full path-shortening
algorithm for building both ustar pathnames. This way, the second
ustar pathname will always be short enough.
Thanks to: Mark Cammidge
Related to: bin/74385
* Handles entries with compressed size >2GB (signed/unsigned cleanup)
* Handles entries with compressed size >4GB ("ZIP64" extension)
* Handles Unix extensions (ctime, atime, mtime, mode, uid, etc)
* Format-specific "skip data" override allows ZIP reader to skip
entries without decompressing them, which makes "tar -t"
a lot faster.
* Handles "length-at-end" entries generated by, e.g., "zip -r - foo"
Many thanks to: Dan Nelson, who contributed the code and test files for
the first three items above and suggested the fourth.
return a generic text message instead.
(Someday, I'll track down all the places that
are generating errors but not recording messages. ;-/
Thanks to: Jaakko Heinonen
occurred with large read-ahead requests. This only affected
formats that incorrectly make large requests (ZIP did this until
recently) or with block sizes over 32k.