Various improvements to the tar.5 manpage, including

descriptions of the GNU tar "posix-style" sparse format,
clarification of the Solaris tar ACL storage,
and a few comments about Mac OS X tar's resource storage.
This commit is contained in:
Tim Kientzle 2009-04-26 18:46:40 +00:00
parent 6388433b62
commit 9a4ac3e81e

View File

@ -1,4 +1,4 @@
.\" Copyright (c) 2003-2007 Tim Kientzle
.\" Copyright (c) 2003-2009 Tim Kientzle
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
@ -24,8 +24,8 @@
.\"
.\" $FreeBSD$
.\"
.Dd May 20, 2004
.Dt TAR 5
.Dd April 19, 2009
.Dt tar 5
.Os
.Sh NAME
.Nm tar
@ -71,7 +71,7 @@ necessary.
This section describes the variant implemented by the tar command
included in
.At v7 ,
which is one of the earliest widely-used versions of the tar program.
which seems to be the earliest widely-used version of the tar program.
.Pp
The header record for an old-style
.Nm
@ -390,6 +390,11 @@ user or group information available (such as when NIS/YP or LDAP services
are temporarily unavailable).
.It Cm SCHILY.devminor , Cm SCHILY.devmajor
The full minor and major numbers for device nodes.
.It Cm SCHILY.fflags
The file flags.
.It Cm SCHILY.realsize
The full size of the file on disk.
XXX explain? XXX
.It Cm SCHILY.dev, Cm SCHILY.ino , Cm SCHILY.nlinks
The device number, inode number, and link count for the entry.
In particular, note that a pax interchange format archive using Joerg
@ -560,9 +565,6 @@ plus
When extracting, GNU tar checks that the header file name is the one it is
expecting, that the header offset is in the correct sequence, and that
the sum of offset and size is equal to realsize.
FreeBSD's version of GNU tar does not handle the corner case of an
archive's being continued in the middle of a long name or other
extension header.
.It "N"
Type "N" records are no longer generated by GNU tar.
They contained a
@ -575,6 +577,8 @@ or
.Dq Symlink %s to %s\en ;
in either case, both
filenames are escaped using K&R C syntax.
Due to security concerns, "N" records are now generally ignored
when reading archives.
.It "S"
This is a
.Dq sparse
@ -644,6 +648,66 @@ entry; the
.Va realsize
field will indicate the total size of the file.
.El
.Ss GNU tar pax archives
GNU tar 1.14 (XXX check this XXX) and later will write
pax interchange format archives when you specify the
.Fl -posix
flag.
This format uses custom keywords to store sparse file information.
There have been three iterations of this support, referred to
as
.Dq 0.0 ,
.Dq 0.1 ,
and
.Dq 1.0 .
.Bl -tag -width indent
.It Cm GNU.sparse.numblocks , Cm GNU.sparse.offset , Cm GNU.sparse.numbytes , Cm GNU.sparse.size
The
.Dq 0.0
format used an initial
.Cm GNU.sparse.numblocks
attribute to indicate the number of blocks in the file, a pair of
.Cm GNU.sparse.offset
and
.Cm GNU.sparse.numbytes
to indicate the offset and size of each block,
and a single
.Cm GNU.sparse.size
to indicate the full size of the file.
This is not the same as the size in the tar header because the
latter value does not include the size of any holes.
This format required that the order of attributes be preserved and
relied on readers accepting multiple appearances of the same attribute
names, which is not officially permitted by the standards.
.It Cm GNU.sparse.map
The
.Dq 0.1
format used a single attribute that stored a comma-separated
list of decimal numbers.
Each pair of numbers indicated the offset and size, respectively,
of a block of data.
This does not work well if the archive is extracted by an archiver
that does not recognize this extension, since many pax implementations
simply discard unrecognized attributes.
.It Cm GNU.sparse.major , Cm GNU.sparse.minor , Cm GNU.sparse.name , Cm GNU.sparse.realsize
The
.Dq 1.0
format stores the sparse block map in one or more 512-byte blocks
prepended to the file data in the entry body.
The pax attributes indicate the existence of this map
(via the
.Cm GNU.sparse.major
and
.Cm GNU.sparse.minor
fields)
and the full size of the file.
The
.Cm GNU.sparse.name
holds the true name of the file.
To avoid confusion, the name stored in the regular tar header
is a modified name so that extraction errors will be apparent
to users.
.El
.Ss Solaris Tar
XXX More Details Needed XXX
.Pp
@ -667,16 +731,42 @@ An additional
.Cm A
entry is used to store an ACL for the following regular entry.
The body of this entry contains a seven-digit octal number
(whose value is 01000000 plus the number of ACL entries)
followed by a zero byte, followed by the
textual ACL description.
The octal value is the number of ACL entries
plus a constant that indicates the ACL type: 01000000
for POSIX.1e ACLs and 03000000 for NFSv4 ACLs.
.El
.Ss AIX Tar
XXX More details needed XXX
.Ss Mac OS X Tar
The tar distributed with Apple's Mac OS X stores most regular files
as two separate entries in the tar archive.
The two entries have the same name except that the first
one has
.Dq ._
added to the beginning of the name.
This first entry stores the
.Dq resource fork
with additional attributes for the file.
The Mac OS X
.Fn CopyFile
API is used to separate a file on disk into separate
resource and data streams and to reassemble those separate
streams when the file is restored to disk.
.Ss Other Extensions
One common extension, utilized by GNU tar, star, and other newer
One obvious extension to increase the size of files is to
eliminate the terminating characters from the various
numeric fields.
For example, the standard only allows the size field to contain
11 octal digits, reserving the twelfth byte for a trailing
NUL character.
Allowing 12 octal digits allows file sizes up to 64 GB.
.Pp
Another extension, utilized by GNU tar, star, and other newer
.Nm
implementations, permits binary numbers in the standard numeric
fields.
This is flagged by setting the high bit of the first character.
implementations, permits binary numbers in the standard numeric fields.
This is flagged by setting the high bit of the first byte.
This permits 95-bit values for the length and time fields
and 63-bit values for the uid, gid, and device numbers.
GNU tar supports this extension for the
@ -686,12 +776,9 @@ all numeric fields.
Note that this extension is largely obsoleted by the extended attribute
record provided by the pax interchange format.
.Pp
Another early GNU extension allowed base-64 values rather
than octal.
This extension was short-lived and such archives are almost never seen.
However, there is still code in GNU tar to support them; this code is
responsible for a very cryptic warning message that is sometimes seen when
GNU tar encounters a damaged archive.
Another early GNU extension allowed base-64 values rather than octal.
This extension was short-lived and is no longer supported by any
implementation.
.Sh SEE ALSO
.Xr ar 1 ,
.Xr pax 1 ,