dcef4f65ae
Historically, we've allowed read() of a directory and some filesystems will accommodate (e.g. ufs/ffs, msdosfs). From the history department staffed by Warner: <<EOF

pdp-7 unix seemed to allow reading directories, but they were weird, special things there so I'm unsure (my pdp-7 assembler sucks).

1st Edition's sources are lost, mostly. The kernel allows it. The reconstructed sources from 2nd or 3rd edition read it though.

V6 to V7 changed the filesystem format, and should have been a warning, but reading directories wasn't materially changed.

4.1b BSD introduced readdir because of UFS. UFS broke all directory reading programs in 1983. ls, du, find, etc all had to be rewritten. readdir() and friends were introduced here.

SysVr3 picked up readdir() in 1987 for the AT&T fork of Unix. SysVr4 updated all the directory reading programs in 1988 because different filesystem types were introduced.

In the 90s, these interfaces became completely ubiquitous as PDP-11s running V7 faded from view and all the folks that initially started on V7 upgraded to SysV. Linux never supported this (though I've not done the software archeology to check) because it has always had a pathological diversity of filesystems.
EOF

Disallowing read(2) on a directory has the side effect of masking bugs in applications that rely on other implementations' behavior (e.g. Linux) of rejecting such reads with EISDIR across the board, but allowing it has been a vector for at least one stack disclosure bug in the past[0].

Per POSIX, it is implementation-defined whether read() handles directories. Popular implementations have chosen to reject them, and this seems sensible: the data you're reading from a directory is not structured in some unified way across filesystem implementations like with readdir(3), so it is impossible for applications to portably rely on this.

With this patch, we will reject most read(2) of a dirfd with EISDIR. Users that know what they're doing can conscientiously set security.bsd.allow_read_dir=1 to allow read(2) of directories, as it has proven useful for debugging or recovery. A future commit will further limit the sysctl to allow only the system root to read(2) directories, to make it at least relatively safe to leave on for longer periods of time.

While we're adding logic pertaining to directory vnodes to vn_io_fault, an additional assertion has also been added to ensure that we're not reaching vn_io_fault with any write request on a directory vnode. Such a request would be a logical error in the kernel, and must be debugged rather than allowing it to potentially silently error out.

Commented out shell aliases have been placed in root's cshrc/shrc to promote awareness that grep may become noisy after this change, depending on your usage.

A tentative MFC plan has been put together to try and make it as trivial as possible to identify issues and collect reports; note that this will be strongly re-evaluated. Tentatively, I will MFC this knob with the default as it is in HEAD to improve our odds of actually getting reports. The future priv(9) to further restrict the sysctl WILL NOT BE MERGED BACK, so the knob will be a faithful reversion on stable/12. We will go into the merge acknowledging that the sysctl default may be flipped back to restore historical behavior at *any* point if it's warranted.

[0] https://www.freebsd.org/security/advisories/FreeBSD-SA-19:10.ufs.asc

PR: 246412
Reviewed by: mckusick, kib, emaste, jilles, cy, phk, imp (all previous)
Reviewed by: rgrimes (latest version)
MFC after: 1 month (note the MFC plan mentioned above)
Relnotes: absolutely, but will amend previous RELNOTES entry
Differential Revision: https://reviews.freebsd.org/D24596
311 lines
7.2 KiB
Groff
.\" Copyright (c) 1980, 1991, 1993
.\" The Regents of the University of California. All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\" may be used to endorse or promote products derived from this software
.\" without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" @(#)read.2 8.4 (Berkeley) 2/26/94
.\" $FreeBSD$
.\"
.Dd June 4, 2020
.Dt READ 2
.Os
.Sh NAME
.Nm read ,
.Nm readv ,
.Nm pread ,
.Nm preadv
.Nd read input
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In unistd.h
.Ft ssize_t
.Fn read "int fd" "void *buf" "size_t nbytes"
.Ft ssize_t
.Fn pread "int fd" "void *buf" "size_t nbytes" "off_t offset"
.In sys/uio.h
.Ft ssize_t
.Fn readv "int fd" "const struct iovec *iov" "int iovcnt"
.Ft ssize_t
.Fn preadv "int fd" "const struct iovec *iov" "int iovcnt" "off_t offset"
.Sh DESCRIPTION
The
.Fn read
system call
attempts to read
.Fa nbytes
of data from the object referenced by the descriptor
.Fa fd
into the buffer pointed to by
.Fa buf .
The
.Fn readv
system call
performs the same action, but scatters the input data
into the
.Fa iovcnt
buffers specified by the members of the
.Fa iov
array: iov[0], iov[1], ..., iov[iovcnt\|\-\|1].
The
.Fn pread
and
.Fn preadv
system calls
perform the same functions, but read from the specified position in
the file without modifying the file pointer.
.Pp
For
.Fn readv
and
.Fn preadv ,
the
.Fa iovec
structure is defined as:
.Pp
.Bd -literal -offset indent -compact
struct iovec {
	void   *iov_base;  /* Base address. */
	size_t iov_len;    /* Length. */
};
.Ed
.Pp
Each
.Fa iovec
entry specifies the base address and length of an area
in memory where data should be placed.
The
.Fn readv
system call
will always fill an area completely before proceeding
to the next.
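As a minimal sketch of the scatter behavior described above, the following C fragment reads a fixed-size header and a body with a single readv() call.
The helper name read_header_and_body and its parameters are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of any library): read a fixed-size header
 * and a body from fd with one readv() call.  Returns the total number of
 * bytes read, 0 at end-of-file, or -1 with errno set.
 */
ssize_t
read_header_and_body(int fd, char *hdr, size_t hdrlen,
    char *body, size_t bodylen)
{
	struct iovec iov[2];

	assert(hdr != NULL && body != NULL);

	iov[0].iov_base = hdr;		/* Filled first. */
	iov[0].iov_len = hdrlen;
	iov[1].iov_base = body;		/* Used only once iov[0] is full. */
	iov[1].iov_len = bodylen;

	return (readv(fd, iov, 2));
}
```

Because the first area is filled before the second is touched, a return value smaller than hdrlen means the body buffer received no data at all; callers should compare the return value against hdrlen before using body.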
.Pp
On objects capable of seeking, the
.Fn read
system call starts at the position
given by the pointer associated with
.Fa fd
(see
.Xr lseek 2 ) .
Upon return from
.Fn read ,
the pointer is incremented by the number of bytes actually read.
.Pp
Objects that are not capable of seeking always read from the current
position.
The value of the pointer associated with such an
object is undefined.
.Pp
Upon successful completion,
.Fn read ,
.Fn readv ,
.Fn pread
and
.Fn preadv
return the number of bytes actually read and placed in the buffer.
The system guarantees to read the number of bytes requested if
the descriptor references a normal file that has that many bytes left
before the end-of-file, but in no other case.
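Since a full-length read is guaranteed only for normal files, code that reads from pipes, sockets or terminals usually loops until it has what it needs.
The following is one possible sketch of such a loop, assuming a blocking descriptor; the helper name read_full is hypothetical and not part of libc.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: keep calling read() until nbytes have been read,
 * end-of-file is reached, or an error other than EINTR occurs.
 * Returns the number of bytes placed in buf (less than nbytes only at
 * end-of-file), or -1 with errno set.
 */
ssize_t
read_full(int fd, void *buf, size_t nbytes)
{
	char *p = buf;
	size_t done = 0;

	assert(buf != NULL);

	while (done < nbytes) {
		ssize_t n = read(fd, p + done, nbytes - done);

		if (n == 0)		/* End-of-file. */
			break;
		if (n == -1) {
			if (errno == EINTR)	/* Interrupted; retry. */
				continue;
			return (-1);
		}
		done += (size_t)n;
	}
	return ((ssize_t)done);
}
```

This sketch discards the partial count when an error follows a short read; a caller that needs those bytes would have to report the count and the error separately.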
.Pp
In accordance with
.St -p1003.1-2004 ,
both
.Xr read 2
and
.Xr write 2
syscalls are atomic with respect to each other in the effects on file
content, when they operate on regular files.
If two threads each call one of the
.Xr read 2
or
.Xr write 2
syscalls, each call will see either all of the changes of the other call,
or none of them.
The
.Fx
kernel implements this guarantee by locking the file ranges affected by
the calls.
.Sh RETURN VALUES
If successful, the
number of bytes actually read is returned.
Upon reading end-of-file,
zero is returned.
Otherwise, a -1 is returned and the global variable
.Va errno
is set to indicate the error.
.Sh ERRORS
The
.Fn read ,
.Fn readv ,
.Fn pread
and
.Fn preadv
system calls
will succeed unless:
.Bl -tag -width Er
.It Bq Er EBADF
The
.Fa fd
argument
is not a valid file or socket descriptor open for reading.
.It Bq Er ECONNRESET
The
.Fa fd
argument refers to a socket, and the remote socket end is
forcibly closed.
.It Bq Er EFAULT
The
.Fa buf
argument
points outside the allocated address space.
.It Bq Er EIO
An I/O error occurred while reading from the file system.
.It Bq Er EINTEGRITY
Corrupted data was detected while reading from the file system.
.It Bq Er EBUSY
Failed to read from a file, e.g.,
.Pa /proc/<pid>/regs
while <pid> is not stopped.
.It Bq Er EINTR
A read from a slow device
(i.e.\& one that might block for an arbitrary amount of time)
was interrupted by the delivery of a signal
before any data arrived.
.It Bq Er EINVAL
The pointer associated with
.Fa fd
was negative.
.It Bq Er EAGAIN
The file was marked for non-blocking I/O,
and no data were ready to be read.
.It Bq Er EISDIR
The file descriptor is associated with a directory.
Directories may only be read directly if the filesystem supports it and
the
.Dv security.bsd.allow_read_dir
sysctl MIB is set to a non-zero value.
For most scenarios, the
.Xr readdir 3
function should be used instead.
.It Bq Er EOPNOTSUPP
The file descriptor is associated with a file system and file type that
do not allow regular read operations on it.
.It Bq Er EOVERFLOW
The file descriptor is associated with a regular file,
.Fa nbytes
is greater than 0,
.Fa offset
is before the end-of-file, and
.Fa offset
is greater than or equal to the offset maximum established
for this file system.
.It Bq Er EINVAL
The value
.Fa nbytes
is greater than
.Dv INT_MAX .
.El
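To illustrate the EISDIR case from the list above, the following sketch tries a plain read() on a directory and, separately, walks the same directory with readdir(3).
Both helper names (try_read_directory and count_entries) are hypothetical, and whether the read succeeds depends on the filesystem and on the sysctl setting described under EISDIR.

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: attempt a plain read() on a directory.
 * Returns 0 if the read succeeded, or the errno value otherwise
 * (EISDIR on systems that reject directory reads).
 */
int
try_read_directory(const char *path)
{
	char buf[512];
	int fd, error = 0;

	assert(path != NULL);

	fd = open(path, O_RDONLY);
	if (fd == -1)
		return (errno);
	if (read(fd, buf, sizeof(buf)) == -1)
		error = errno;		/* Save before close() can change it. */
	close(fd);
	return (error);
}

/*
 * Portable alternative: count entries with readdir(3).  Counting stops
 * when readdir() returns NULL, which signals end of directory or an error.
 * Returns the count, or -1 if the directory cannot be opened.
 */
long
count_entries(const char *path)
{
	DIR *dir;
	long count = 0;

	assert(path != NULL);

	dir = opendir(path);
	if (dir == NULL)
		return (-1);
	while (readdir(dir) != NULL)
		count++;
	closedir(dir);
	return (count);
}
```

A program that only needs the entries of a directory can use the second helper's approach on any filesystem, without caring whether raw directory reads are permitted.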
.Pp
In addition,
.Fn readv
and
.Fn preadv
may return one of the following errors:
.Bl -tag -width Er
.It Bq Er EINVAL
The
.Fa iovcnt
argument
was less than or equal to 0, or greater than
.Dv IOV_MAX .
.It Bq Er EINVAL
One of the
.Fa iov_len
values in the
.Fa iov
array was negative.
.It Bq Er EINVAL
The sum of the
.Fa iov_len
values in the
.Fa iov
array overflowed a 32-bit integer.
.It Bq Er EFAULT
Part of the
.Fa iov
array points outside the process's allocated address space.
.El
.Pp
The
.Fn pread
and
.Fn preadv
system calls may also return the following errors:
.Bl -tag -width Er
.It Bq Er EINVAL
The
.Fa offset
value was negative.
.It Bq Er ESPIPE
The file descriptor is associated with a pipe, socket, or FIFO.
.El
.Sh SEE ALSO
.Xr dup 2 ,
.Xr fcntl 2 ,
.Xr getdirentries 2 ,
.Xr open 2 ,
.Xr pipe 2 ,
.Xr select 2 ,
.Xr socket 2 ,
.Xr socketpair 2 ,
.Xr fread 3 ,
.Xr readdir 3
.Sh STANDARDS
The
.Fn read
system call is expected to conform to
.St -p1003.1-90 .
The
.Fn readv
and
.Fn pread
system calls are expected to conform to
.St -xpg4.2 .
.Sh HISTORY
The
.Fn preadv
system call appeared in
.Fx 6.0 .
The
.Fn pread
function appeared in
.At V.4 .
The
.Fn readv
system call appeared in
.Bx 4.2 .
The
.Fn read
function appeared in
.At v1 .