freebsd-skq/lib/libc/sys/posix_fadvise.2
jhb 78c075174e Add the posix_fadvise(2) system call. It is somewhat similar to
madvise(2) except that it operates on a file descriptor instead of a
memory region.  It is currently only supported on regular files.

Just as with madvise(2), the advice given to posix_fadvise(2) can be
divided into two types.  The first type provides hints about data access
patterns and is used in the file read and write routines to modify the
I/O flags passed down to VOP_READ() and VOP_WRITE().  These modes are
thus filesystem independent.  Note that to ease implementation (and
since this API is only advisory anyway), only a single non-normal
range is allowed per file descriptor.

The second type of hint tells the OS that data will or
will not be used.  These hints are implemented via a new VOP_ADVISE().
A default implementation is provided which does nothing for the WILLNEED
request and attempts to move any clean pages to the cache page queue for
the DONTNEED request.  This latter case required two other changes.
First, a new V_CLEANONLY flag was added to vinvalbuf().  This requests
vinvalbuf() to only flush clean buffers for the vnode from the buffer
cache and to not remove any backing pages from the vnode.  This is
used to ensure clean pages are not wired into the buffer cache before
attempting to move them to the cache page queue.  The second change adds
a new vm_object_page_cache() method.  This method is somewhat similar to
vm_object_page_remove() except that instead of freeing each page in the
specified range, it attempts to move clean pages to the cache queue if
possible.

To preserve the ABI of struct file, the f_cdevpriv pointer is now reused
in a union to point to the currently active advice region if one is
present for regular files.

Reviewed by:	jilles, kib, arch@
Approved by:	re (kib)
MFC after:	1 month
2011-11-04 04:02:50 +00:00

.\" Copyright (c) 1991, 1993
.\" The Regents of the University of California. All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\" 4. Neither the name of the University nor the names of its contributors
.\" may be used to endorse or promote products derived from this software
.\" without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" @(#)madvise.2 8.1 (Berkeley) 6/9/93
.\" $FreeBSD$
.\"
.Dd October 26, 2011
.Dt POSIX_FADVISE 2
.Os
.Sh NAME
.Nm posix_fadvise
.Nd give advice about use of file data
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In fcntl.h
.Ft int
.Fn posix_fadvise "int fd" "off_t offset" "off_t len" "int advice"
.Sh DESCRIPTION
The
.Fn posix_fadvise
system call
allows a process to describe to the system its data access behavior for an
open file descriptor
.Fa fd .
The advice covers the data starting at offset
.Fa offset
and continuing for
.Fa len
bytes.
If
.Fa len
is zero,
all data from
.Fa offset
to the end of the file is covered.
.Pp
The behavior is specified by the
.Fa advice
parameter and may be one of:
.Bl -tag -width POSIX_FADV_SEQUENTIAL
.It Dv POSIX_FADV_NORMAL
Tells the system to revert to the default data access behavior.
.It Dv POSIX_FADV_RANDOM
Is a hint that file data will be accessed randomly,
and prefetching is likely not advantageous.
.It Dv POSIX_FADV_SEQUENTIAL
Tells the system that file data will be accessed sequentially.
This currently does nothing as the default behavior uses heuristics to
detect sequential behavior.
.It Dv POSIX_FADV_WILLNEED
Tells the system that the specified data will be accessed in the near future.
The system may initiate an asynchronous read of the data if it is not already
present in memory.
.It Dv POSIX_FADV_DONTNEED
Tells the system that the specified data will not be accessed in the near
future.
The system may decrease the in-memory priority of clean data within the
specified range and future access to this data may require a read operation.
.It Dv POSIX_FADV_NOREUSE
Tells the system that the specified data will only be accessed once and
then not reused.
Accesses to data within the specified range are treated as if the file
descriptor has the
.Dv O_DIRECT
flag enabled.
.El
.Pp
.Sh RETURN VALUES
.Rv -std posix_fadvise
.Sh ERRORS
The
.Fn posix_fadvise
system call will fail if:
.Bl -tag -width Er
.It Bq Er EBADF
The
.Fa fd
argument is not a valid file descriptor.
.It Bq Er EINVAL
The
.Fa advice
argument is not valid.
.It Bq Er EINVAL
The
.Fa offset
or
.Fa len
argument is negative,
or
.Fa offset
+
.Fa len
is greater than the maximum file size.
.It Bq Er ENODEV
The
.Fa fd
argument does not refer to a regular file.
.It Bq Er ESPIPE
The
.Fa fd
argument is associated with a pipe or FIFO.
.El
.Sh SEE ALSO
.Xr madvise 2
.Sh STANDARDS
The
.Fn posix_fadvise
interface conforms to
.St -p1003.1-2001 .
.Sh HISTORY
The
.Fn posix_fadvise
system call first appeared in
.Fx 10.0 .