Add Caldera license. Approved by: David Taylor <davidt@caldera.com> Make buildable under FreeBSD.
.\" Copyright (C) Caldera International Inc. 2001-2002. All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions are
.\" met:
.\"
.\" Redistributions of source code and documentation must retain the above
.\" copyright notice, this list of conditions and the following
.\" disclaimer.
.\"
.\" Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\"
.\" All advertising materials mentioning features or use of this software
.\" must display the following acknowledgement:
.\"
.\" This product includes software developed or owned by Caldera
.\" International, Inc. Neither the name of Caldera International, Inc.
.\" nor the names of other contributors may be used to endorse or promote
.\" products derived from this software without specific prior written
.\" permission.
.\"
.\" USE OF THE SOFTWARE PROVIDED FOR UNDER THIS LICENSE BY CALDERA
.\" INTERNATIONAL, INC. AND CONTRIBUTORS ``AS IS'' AND ANY EXPRESS OR
.\" IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
.\" WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
.\" DISCLAIMED. IN NO EVENT SHALL CALDERA INTERNATIONAL, INC. BE LIABLE
.\" FOR ANY DIRECT, INDIRECT INCIDENTAL, SPECIAL, EXEMPLARY, OR
.\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
.\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
.\" BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
.\" WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
.\" OR OTHERWISE) RISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN
.\" IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.\" @(#)p4 8.1 (Berkeley) 6/8/93
.\"
.NH
LOW-LEVEL I/O
.PP
This section describes the
bottom level of I/O on the
.UC UNIX
system.
The lowest level of I/O in
.UC UNIX
provides no buffering or any other services;
it is in fact a direct entry into the operating system.
You are entirely on your own,
but on the other hand,
you have the most control over what happens.
And since the calls and usage are quite simple,
this isn't as bad as it sounds.
.NH 2
File Descriptors
.PP
In the
.UC UNIX
operating system,
all input and output is done
by reading or writing files,
because all peripheral devices, even the user's terminal,
are files in the file system.
This means that a single, homogeneous interface
handles all communication between a program and peripheral devices.
.PP
In the most general case,
before reading or writing a file,
it is necessary to inform the system
of your intent to do so,
a process called
``opening'' the file.
If you are going to write on a file,
it may also be necessary to create it.
The system checks your right to do so
(Does the file exist?
Do you have permission to access it?),
and if all is well,
returns a small positive integer
called a
.ul
file descriptor.
Whenever I/O is to be done on the file,
the file descriptor is used instead of the name to identify the file.
(This is roughly analogous to the use of
.UC READ(5,...)
and
.UC WRITE(6,...)
in Fortran.)
All
information about an open file is maintained by the system;
the user program refers to the file
only
by the file descriptor.
.PP
The file pointers discussed in section 3
are similar in spirit to file descriptors,
but file descriptors are more fundamental.
A file pointer is a pointer to a structure that contains,
among other things, the file descriptor for the file in question.
.PP
Since input and output involving the user's terminal
are so common,
special arrangements exist to make this convenient.
When the command interpreter (the
``shell'')
runs a program,
it opens
three files, with file descriptors 0, 1, and 2,
called the standard input,
the standard output, and the standard error output.
All of these are normally connected to the terminal,
so if a program reads file descriptor 0
and writes file descriptors 1 and 2,
it can do terminal I/O
without worrying about opening the files.
.PP
If I/O is redirected
to and from files with
.UL <
and
.UL > ,
as in
.P1
prog <infile >outfile
.P2
the shell changes the default assignments for file descriptors
0 and 1
from the terminal to the named files.
Similar observations hold if the input or output is associated with a pipe.
Normally file descriptor 2 remains attached to the terminal,
so error messages can go there.
In all cases,
the file assignments are changed by the shell,
not by the program.
The program does not need to know where its input
comes from nor where its output goes,
so long as it uses file 0 for input and 1 and 2 for output.
.NH 2
Read and Write
.PP
All input and output is done by
two functions called
.UL read
and
.UL write .
For both, the first argument is a file descriptor.
The second argument is a buffer in your program where the data is to
come from or go to.
The third argument is the number of bytes to be transferred.
The calls are
.P1
n_read = read(fd, buf, n);

n_written = write(fd, buf, n);
.P2
Each call returns a byte count
which is the number of bytes actually transferred.
On reading,
the number of bytes returned may be less than
the number asked for,
because fewer than
.UL n
bytes remained to be read.
(When the file is a terminal,
.UL read
normally reads only up to the next newline,
which is generally less than what was requested.)
A return value of zero bytes implies end of file,
and
.UL -1
indicates an error of some sort.
For writing, the returned value is the number of bytes
actually written;
it is generally an error if this isn't equal
to the number supposed to be written.
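.PP
A caller that needs every byte delivered often wraps
.UL write
in a loop, since a single call may transfer fewer bytes than requested.
The following is a minimal sketch in present-day C, assuming a POSIX system;
the name
.UL write_all
is hypothetical and is not part of the system,
and the sketch does not retry a call that was interrupted by a signal.

```c
#include <stddef.h>
#include <unistd.h>

/* write_all: hypothetical helper, not a system call.
   Keeps calling write until all n bytes of buf are transferred.
   Returns 0 on success, -1 on error. */
int write_all(int fd, const char *buf, size_t n)
{
	while (n > 0) {
		ssize_t w = write(fd, buf, n);

		if (w <= 0)		/* error, or no progress at all */
			return -1;
		buf += w;		/* skip what was written */
		n -= (size_t)w;
	}
	return 0;
}
```

.PP
A call such as write_all(1, "hello", 5) then either delivers all five bytes
to the standard output or reports failure.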
.PP
The number of bytes to be read or written is quite arbitrary.
The two most common values are
1,
which means one character at a time
(``unbuffered''),
and
512,
which corresponds to a physical blocksize on many peripheral devices.
This latter size will be most efficient,
but even character at a time I/O
is not inordinately expensive.
.PP
Putting these facts together,
we can write a simple program to copy
its input to its output.
This program will copy anything to anything,
since the input and output can be redirected to any file or device.
.P1
#define BUFSIZE 512 /* best size for PDP-11 UNIX */

main() /* copy input to output */
{
	char buf[BUFSIZE];
	int n;

	while ((n = read(0, buf, BUFSIZE)) > 0)
		write(1, buf, n);
	exit(0);
}
.P2
If the file size is not a multiple of
.UL BUFSIZE ,
some
.UL read
will return a smaller number of bytes
to be written by
.UL write ;
the next call to
.UL read
after that
will return zero.
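.PP
The same loop can be packaged as a function that works on any pair of
descriptors and reports how much it copied.
This is a sketch under stated assumptions, not the program above:
it is written in present-day C for a POSIX system, and the name
.UL copy_fd
is hypothetical.
Unlike the program above, it checks the value returned by
.UL write .

```c
#include <unistd.h>

#define BUFSIZE 512	/* same block size as the program above */

/* copy_fd: hypothetical helper, not a system call.
   Copies everything readable from descriptor in to descriptor out.
   Returns the number of bytes copied, or -1 if a read or write fails. */
long copy_fd(int in, int out)
{
	char buf[BUFSIZE];
	long total = 0;
	ssize_t n;

	while ((n = read(in, buf, BUFSIZE)) > 0) {
		ssize_t off = 0;

		while (off < n) {	/* a write may be short too */
			ssize_t w = write(out, buf + off, (size_t)(n - off));

			if (w <= 0)
				return -1;
			off += w;
		}
		total += n;
	}
	return n < 0 ? -1 : total;	/* n == 0 means end of file */
}
```

.PP
With this helper, the body of the copy program reduces to a single call,
copy_fd(0, 1).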
.PP
It is instructive to see how
.UL read
and
.UL write
can be used to construct
higher level routines like
.UL getchar ,
.UL putchar ,
etc.
For example,
here is a version of
.UL getchar
which does unbuffered input.
.P1
#define CMASK 0377 /* for making char's > 0 */

getchar() /* unbuffered single character input */
{
	char c;

	return((read(0, &c, 1) > 0) ? c & CMASK : EOF);
}
.P2
.UL c
.ul
must
be declared
.UL char ,
because
.UL read
accepts a character pointer.
The character being returned must be masked with
.UL 0377
to ensure that it is positive;
otherwise sign extension may make it negative.
(The constant
.UL 0377
is appropriate for the
.UC PDP -11
but not necessarily for other machines.)
.PP
The second version of
.UL getchar
does input in big chunks,
and hands out the characters one at a time.
.P1
#define CMASK 0377 /* for making char's > 0 */
#define BUFSIZE 512

getchar() /* buffered version */
{
	static char buf[BUFSIZE];
	static char *bufp = buf;
	static int n = 0;

	if (n == 0) { /* buffer is empty */
		n = read(0, buf, BUFSIZE);
		bufp = buf;
	}
	return((--n >= 0) ? *bufp++ & CMASK : EOF);
}
.P2
.NH 2
Open, Creat, Close, Unlink
.PP
Other than the default
standard input, output and error files,
you must explicitly open files in order to
read or write them.
There are two system entry points for this,
.UL open
and
.UL creat
[sic].
.PP
.UL open
is rather like the
.UL fopen
discussed in the previous section,
except that instead of returning a file pointer,
it returns a file descriptor,
which is just an
.UL int .
.P1
int fd;

fd = open(name, rwmode);
.P2
As with
.UL fopen ,
the
.UL name
argument
is a character string corresponding to the external file name.
The access mode argument
is different, however:
.UL rwmode
is 0 for read, 1 for write, and 2 for read and write access.
.UL open
returns
.UL -1
if any error occurs;
otherwise it returns a valid file descriptor.
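.PP
On present-day systems the three access modes have names.
The sketch below assumes a POSIX system, where the header fcntl.h supplies
O_RDONLY, O_WRONLY and O_RDWR for the read, write, and read and write modes;
the helper name
.UL can_read
is hypothetical.
It shows the usual pattern of testing the descriptor against -1
before using it.

```c
#include <fcntl.h>
#include <unistd.h>

/* can_read: hypothetical helper, not a system call.
   Reports whether the file called name can be opened for reading.
   Returns 1 if so, 0 if not. */
int can_read(const char *name)
{
	int fd = open(name, O_RDONLY);	/* plays the role of rwmode 0 */

	if (fd == -1)		/* any error: missing file, no permission, ... */
		return 0;
	close(fd);		/* give the descriptor back */
	return 1;
}
```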
.PP
It is an error to
try to
.UL open
a file that does not exist.
The entry point
.UL creat
is provided to create new files,
or to re-write old ones.
.P1
fd = creat(name, pmode);
.P2
returns a file descriptor
if it was able to create the file
called
.UL name ,
and
.UL -1
if not.
If the file
already exists,
.UL creat
will truncate it to zero length;
it is not an error to
.UL creat
a file that already exists.
.PP
If the file is brand new,
.UL creat
creates it with the
.ul
protection mode
specified by
the
.UL pmode
argument.
In the
.UC UNIX
file system,
there are nine bits of protection information
associated with a file,
controlling read, write and execute permission for
the owner of the file,
for the owner's group,
and for all others.
Thus a three-digit octal number
is most convenient for specifying the permissions.
For example,
0755
specifies read, write and execute permission for the owner,
and read and execute permission for the group and everyone else.
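.PP
The truncating behavior of
.UL creat
can be seen by writing the same file twice.
The following is a minimal sketch in present-day C, assuming a POSIX system
where
.UL creat
is declared in fcntl.h;
the helper name
.UL recreate
is hypothetical.
On such systems the permissions actually given to a new file are
the requested mode with the bits of the process's file creation mask removed.

```c
#include <fcntl.h>
#include <unistd.h>

#define PMODE 0644	/* RW for owner, R for group and others */

/* recreate: hypothetical helper, not a system call.
   Creates name (or truncates it if it already exists), writes n bytes
   from buf, and closes the file.
   Returns 0 on success, -1 on any failure. */
int recreate(const char *name, const char *buf, size_t n)
{
	int fd = creat(name, PMODE);
	int ok;

	if (fd == -1)
		return -1;
	ok = write(fd, buf, n) == (ssize_t)n;
	if (close(fd) == -1)
		ok = 0;
	return ok ? 0 : -1;
}
```

.PP
Calling recreate twice on the same name leaves only the bytes
from the second call, because the second
.UL creat
discards the old contents.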
.PP
To illustrate,
here is a simplified version of
the
.UC UNIX
utility
.IT cp ,
a program which copies one file to another.
(The main simplification is that our version
copies only one file,
and does not permit the second argument
to be a directory.)
.P1
#define NULL 0
#define BUFSIZE 512
#define PMODE 0644 /* RW for owner, R for group, others */

main(argc, argv) /* cp: copy f1 to f2 */
int argc;
char *argv[];
{
	int f1, f2, n;
	char buf[BUFSIZE];

	if (argc != 3)
		error("Usage: cp from to", NULL);
	if ((f1 = open(argv[1], 0)) == -1)
		error("cp: can't open %s", argv[1]);
	if ((f2 = creat(argv[2], PMODE)) == -1)
		error("cp: can't create %s", argv[2]);

	while ((n = read(f1, buf, BUFSIZE)) > 0)
		if (write(f2, buf, n) != n)
			error("cp: write error", NULL);
	exit(0);
}
.P2
.P1
error(s1, s2) /* print error message and die */
char *s1, *s2;
{
	printf(s1, s2);
	printf("\en");
	exit(1);
}
.P2
.PP
As we said earlier,
there is a limit (typically 15-25)
on the number of files which a program
may have open simultaneously.
Accordingly, any program which intends to process
many files must be prepared to re-use
file descriptors.
The routine
.UL close
breaks the connection between a file descriptor
and an open file,
and frees the
file descriptor for use with some other file.
Termination of a program
via
.UL exit
or return from the main program closes all open files.
.PP
The function
.UL unlink(filename)
removes the file
.UL filename
from the file system.
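.PP
Both routines can be observed from a small test.
The sketch below is written in present-day C and assumes a POSIX system,
where
.UL open
returns the lowest descriptor not currently in use;
the helper names
.UL reuses_descriptor
and
.UL gone_after_unlink
are hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

/* reuses_descriptor: hypothetical helper.
   Opens name, closes it, and opens it again.
   Returns 1 if the second open got the same descriptor number,
   0 if it did not, -1 if name could not be opened. */
int reuses_descriptor(const char *name)
{
	int first, second;

	if ((first = open(name, O_RDONLY)) == -1)
		return -1;
	close(first);			/* frees the descriptor */
	if ((second = open(name, O_RDONLY)) == -1)
		return -1;
	close(second);
	return first == second;
}

/* gone_after_unlink: hypothetical helper.
   Creates name, closes it, and unlinks it.
   Returns 1 if a later open fails, 0 if the file can still be opened,
   -1 if the file could not be created or unlinked. */
int gone_after_unlink(const char *name)
{
	int fd = creat(name, 0600);

	if (fd == -1)
		return -1;
	close(fd);
	if (unlink(name) == -1)
		return -1;
	fd = open(name, O_RDONLY);	/* expected to fail now */
	if (fd != -1) {
		close(fd);
		return 0;
	}
	return 1;
}
```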
.NH 2
Random Access \(em Seek and Lseek
.PP
File I/O is normally sequential:
each
.UL read
or
.UL write
takes place at a position in the file
right after the previous one.
When necessary, however,
a file can be read or written in any arbitrary order.
The
system call
.UL lseek
provides a way to move around in
a file without actually reading
or writing:
.P1
lseek(fd, offset, origin);
.P2
forces the current position in the file
whose descriptor is
.UL fd
to move to position
.UL offset ,
which is taken relative to the location
specified by
.UL origin .
Subsequent reading or writing will begin at that position.
.UL offset
is
a
.UL long ;
.UL fd
and
.UL origin
are
.UL int 's.
.UL origin
can be 0, 1, or 2 to specify that
.UL offset
is to be
measured from
the beginning, from the current position, or from the
end of the file respectively.
For example,
to append to a file,
seek to the end before writing:
.P1
lseek(fd, 0L, 2);
.P2
To get back to the beginning (``rewind''),
.P1
lseek(fd, 0L, 0);
.P2
Notice the
.UL 0L
argument;
it could also be written as
.UL (long)\ 0 .
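.PP
The append idiom can be sketched as a complete function.
The code below is present-day C and assumes a POSIX system, where
.UL lseek
returns the resulting position (or -1 on error)
and unistd.h names the three origins SEEK_SET, SEEK_CUR and SEEK_END;
the helper name
.UL append_bytes
is hypothetical.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* append_bytes: hypothetical helper, not a system call.
   Opens name for writing, seeks to the end, writes n bytes from buf,
   and returns the resulting file length, or -1 on failure. */
long append_bytes(const char *name, const char *buf, size_t n)
{
	int fd = open(name, O_WRONLY);
	off_t end = -1;

	if (fd == -1)
		return -1;
	if (lseek(fd, 0L, SEEK_END) != (off_t)-1 &&	/* origin 2 */
	    write(fd, buf, n) == (ssize_t)n)
		end = lseek(fd, 0L, SEEK_CUR);	/* position after the write */
	close(fd);
	return (long)end;
}
```

.PP
Appending four bytes to a three-byte file, for instance,
leaves the position, and hence the length, at seven.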
.PP
With
.UL lseek ,
it is possible to treat files more or less like large arrays,
at the price of slower access.
For example, the following simple function reads any number of bytes
from any arbitrary place in a file.
.P1
get(fd, pos, buf, n) /* read n bytes from position pos */
int fd, n;
long pos;
char *buf;
{
	lseek(fd, pos, 0); /* get to pos */
	return(read(fd, buf, n));
}
.P2
.PP
In pre-version 7
.UC UNIX ,
the basic entry point to the I/O system
is called
.UL seek .
.UL seek
is identical to
.UL lseek ,
except that its
.UL offset
argument is an
.UL int
rather than a
.UL long .
Accordingly,
since
.UC PDP -11
integers have only 16 bits,
the
.UL offset
specified
for
.UL seek
is limited to 65,535;
for this reason,
.UL origin
values of 3, 4, 5 cause
.UL seek
to multiply the given offset by 512
(the number of bytes in one physical block)
and then interpret
.UL origin
as if it were 0, 1, or 2 respectively.
Thus to get to an arbitrary place in a large file
requires two seeks, first one which selects
the block, then one which
has
.UL origin
equal to 1 and moves to the desired byte within the block.
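.PP
The arithmetic behind the two seeks is a division by the block size.
The sketch below shows only that arithmetic, in present-day C;
it does not call
.UL seek ,
which current systems generally do not provide,
and the names
.UL two_seeks
and
.UL split_position
are hypothetical.

```c
#define BLOCK 512L	/* bytes in one physical block */

/* two_seeks: hypothetical structure holding the two offsets. */
struct two_seeks {
	int block;	/* offset for the first seek, with origin 3 */
	int within;	/* offset for the second seek, with origin 1 */
};

/* split_position: hypothetical helper.
   Splits a byte position into a block number and a byte offset
   within that block, so that block * 512 + within == pos. */
struct two_seeks split_position(long pos)
{
	struct two_seeks s;

	s.block = (int)(pos / BLOCK);
	s.within = (int)(pos % BLOCK);
	return s;
}
```

.PP
For byte 100000, the result is block 195 and offset 160,
since 195 times 512 is 99840;
the old calls would then be seek(fd, 195, 3) followed by seek(fd, 160, 1).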
.NH 2
Error Processing
.PP
The routines discussed in this section,
and in fact all the routines which are direct entries into the system
can incur errors.
Usually they indicate an error by returning a value of \-1.
Sometimes it is nice to know what sort of error occurred;
for this purpose all these routines, when appropriate,
leave an error number in the external cell
.UL errno .
The meanings of the various error numbers are
listed
in the introduction to Section II
of the
.I
.UC UNIX
Programmer's Manual,
.R
so your program can, for example, determine if
an attempt to open a file failed because it did not exist
or because the user lacked permission to read it.
Perhaps more commonly,
you may want to print out the
reason for failure.
The routine
.UL perror
will print a message associated with the value
of
.UL errno ;
more generally,
.UL sys\_errlist
is an array of character strings which can be indexed
by
.UL errno
and printed by your program.
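.PP
Telling one failure from another looks like this in present-day C.
The sketch assumes a POSIX system, where errno.h declares
.UL errno
and names the error numbers (ENOENT for a missing file,
EACCES for a permission failure);
the helper name
.UL why_open_failed
is hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* why_open_failed: hypothetical helper, not a system call.
   Tries to open name for reading.
   Returns 0 on success, or the error number left in errno by open. */
int why_open_failed(const char *name)
{
	int fd = open(name, O_RDONLY);

	if (fd == -1)
		return errno;	/* for example ENOENT or EACCES */
	close(fd);
	return 0;
}
```

.PP
A caller can compare the result with ENOENT or EACCES,
or simply call
.UL perror
right after the failing call to print the corresponding message.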