The libarchive library provides a flexible interface for reading and
writing streaming archive files such as tar and cpio. The library is
inherently stream-oriented: readers iterate serially through the archive,
and writers serially add entries to the archive.
In particular, note that there is no built-in support for random access
or for in-place modification.

When reading an archive, the library automatically detects the format and
the compression. The library currently has read support for: old-style tar
archives, most variants of the POSIX "ustar" format, the POSIX "pax
interchange" format, GNU-format tar archives, most common cpio archive
formats, ISO9660 CD images (with or without RockRidge extensions), and Zip
archives.
The library automatically detects archives compressed with gzip(1),
bzip2(1), or compress(1) and decompresses them transparently.

When writing an archive, you can specify the compression and the format to
use. The library can write POSIX-standard "ustar" archives, POSIX "pax
interchange format" archives, POSIX octet-oriented cpio archives, and two
different variants of shar archives.

Pax interchange format is an extension of the tar archive format that
eliminates essentially all of the limitations of historic tar formats in a
standard fashion that is supported by POSIX-compliant pax(1) implementations
on many systems as well as several newer implementations of tar(1). Note
that the default write format will suppress the pax extended attributes for
most entries; explicitly requesting pax format will enable those attributes
for all entries.

The read and write APIs are accessed through the archive_read_XXX()
functions and the archive_write_XXX() functions, respectively, and either
can be used independently of the other.

The rest of this manual page provides an overview of the library operation.
More detailed information can be found in the individual manual pages for
each API or utility function.
To read an archive, you must first obtain an initialized struct archive
object from archive_read_new(). You can then modify this object for the
desired operations with the various archive_read_set_XXX() and
archive_read_support_XXX() functions. In particular, you will need to invoke
the appropriate archive_read_support_XXX() functions to enable the
corresponding compression and format support. Note that these latter
functions perform two distinct operations: they cause the corresponding
support code to be linked into your program, and they enable the
corresponding auto-detect code. Unless you have specific constraints, you
will generally want to invoke archive_read_support_compression_all() and
archive_read_support_format_all() to enable auto-detect for all formats and
compression types currently supported by the library.

Once you have prepared the struct archive object, you call
archive_read_open() to actually open the archive and prepare it for reading.
There are several variants of this function; the most basic expects you to
provide pointers to several functions that can provide blocks of bytes from
the archive. There are convenience forms that allow you to specify a
filename, file descriptor, FILE * object, or a block of memory from which to
read the archive data. Note that the core library makes no assumptions about
the size of the blocks read; callback functions are free to read whatever
block size is most appropriate for the medium.

Each archive entry consists of a header followed by a certain amount of
data. You can obtain the next header with archive_read_next_header(), which
returns a pointer to a struct archive_entry structure with information about
the current archive element. If the entry is a regular file, then the header
will be followed by the file data. You can use archive_read_data() (which
works much like the read(2) system call) to read this data from the archive.
You may prefer to use the higher-level archive_read_data_skip(), which reads
and discards the data for this entry, archive_read_data_into_buffer(), which
reads the data into an in-memory buffer, archive_read_data_into_fd(), which
copies the data to the provided file descriptor, or archive_read_extract(),
which recreates the specified entry on disk and copies data from the
archive. Note that archive_read_extract() uses the struct archive_entry
structure that you provide it, which may differ from the entry just read
from the archive. In particular, many applications will want to override the
pathname, file permissions, or ownership.

Once you have finished reading data from the archive, you should call
archive_read_close() to close the archive, then call archive_read_finish()
to release all resources, including all memory allocated by the library.

The archive_read(3) manual page provides more detailed calling information
for this API.
You use a similar process to write an archive. The archive_write_new()
function creates an archive object useful for writing, the various
archive_write_set_XXX() functions are used to set parameters for writing the
archive, and archive_write_open() completes the setup and opens the archive
for writing.

Individual archive entries are written in a three-step process. You first
initialize a struct archive_entry structure with information about the new
entry. At a minimum, you should set the pathname of the entry and provide a
struct stat with a valid st_mode field, which specifies the type of object,
and a valid st_size field, which specifies the size of the data portion of
the object. The archive_write_header() function then writes the header data
to the archive. Finally, you can use archive_write_data() to write the
actual data.

After all entries have been written, use the archive_write_finish() function
to release all resources.

The archive_write(3) manual page provides more detailed calling information
for this API.
Detailed descriptions of each function are provided by the corresponding
manual pages.

All of the functions utilize an opaque struct archive datatype that provides
access to the archive contents.

The struct archive_entry structure contains a complete description of a
single archive entry. It uses an opaque interface that is fully documented
in archive_entry(3).

Users familiar with historic formats should be aware that the newer variants
have eliminated most restrictions on the length of textual fields. Clients
should not assume that filenames, link names, user names, or group names are
limited in length. In particular, pax interchange format can easily
accommodate pathnames in arbitrary character sets that exceed PATH_MAX.
Most functions return zero on success, non-zero on error. The return value
indicates the general severity of the error, ranging from ARCHIVE_WARN,
which indicates a minor problem that should probably be reported to the
user, to ARCHIVE_FATAL, which indicates a serious problem that will prevent
any further operations on this archive. On error, the archive_errno()
function can be used to retrieve a numeric error code (see errno(2)). The
archive_error_string() function returns a textual error message suitable for
display.

archive_read_new() and archive_write_new() return pointers to an allocated
and initialized struct archive object.

archive_read_data() and archive_write_data() return a count of the number of
bytes actually read or written. A value of zero indicates the end of the
data for this entry. A negative value indicates an error, in which case the
archive_errno() and archive_error_string() functions can be used to obtain
more information.
There are character set conversions within the archive_entry(3) functions
that are impacted by the currently selected locale.
The libarchive library first appeared in FreeBSD 5.3.
The libarchive library was written by Tim Kientzle.
Some archive formats support information that is not supported by struct
archive_entry. Such information cannot be fully archived or restored using
this library. This includes, for example, comments, character sets, or the
arbitrary key/value pairs that can appear in pax interchange format
archives.

Conversely, of course, not all of the information that can be stored in a
struct archive_entry is supported by all formats. For example, cpio formats
do not support nanosecond timestamps; old tar formats do not support large
device numbers.