b66fb2c648
madvise(). This feature prevents the update daemon from gratuitously flushing dirty pages associated with a mapped file-backed region of memory. The system pager will still page the memory as necessary and the VM system will still be fully coherent with the filesystem. Modifications made by other means to the same area of memory, for example by write(), are unaffected. The feature works on a page-granularity basis. MAP_NOSYNC allows one to use mmap() to share memory between processes without incuring any significant filesystem overhead, putting it in the same performance category as SysV Shared memory and anonymous memory. Reviewed by: julian, alc, dg
291 lines
8.7 KiB
Groff
291 lines
8.7 KiB
Groff
.\" Copyright (c) 1991, 1993
|
|
.\" The Regents of the University of California. All rights reserved.
|
|
.\"
|
|
.\" Redistribution and use in source and binary forms, with or without
|
|
.\" modification, are permitted provided that the following conditions
|
|
.\" are met:
|
|
.\" 1. Redistributions of source code must retain the above copyright
|
|
.\" notice, this list of conditions and the following disclaimer.
|
|
.\" 2. Redistributions in binary form must reproduce the above copyright
|
|
.\" notice, this list of conditions and the following disclaimer in the
|
|
.\" documentation and/or other materials provided with the distribution.
|
|
.\" 3. All advertising materials mentioning features or use of this software
|
|
.\" must display the following acknowledgement:
|
|
.\" This product includes software developed by the University of
|
|
.\" California, Berkeley and its contributors.
|
|
.\" 4. Neither the name of the University nor the names of its contributors
|
|
.\" may be used to endorse or promote products derived from this software
|
|
.\" without specific prior written permission.
|
|
.\"
|
|
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
|
|
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
|
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
|
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
|
|
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
|
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
|
|
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
|
|
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
|
|
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
|
|
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
|
|
.\" SUCH DAMAGE.
|
|
.\"
|
|
.\" @(#)mmap.2 8.4 (Berkeley) 5/11/95
|
|
.\" $FreeBSD$
|
|
.\"
|
|
.Dd May 11, 1995
|
|
.Dt MMAP 2
|
|
.Os BSD 4
|
|
.Sh NAME
|
|
.Nm mmap
|
|
.Nd map files or devices into memory
|
|
.Sh SYNOPSIS
|
|
.Fd #include <sys/types.h>
|
|
.Fd #include <sys/mman.h>
|
|
.Ft void *
|
|
.Fn mmap "void * addr" "size_t len" "int prot" "int flags" "int fd" "off_t offset"
|
|
.Sh DESCRIPTION
|
|
The
|
|
.Fn mmap
|
|
function causes the pages starting at
|
|
.Fa addr
|
|
and continuing for at most
|
|
.Fa len
|
|
bytes to be mapped from the object described by
|
|
.Fa fd ,
|
|
starting at byte offset
|
|
.Fa offset .
|
|
If
|
|
.Fa len
|
|
is not a multiple of the pagesize, the mapped region may extend past the
|
|
specified range.
|
|
Any such extension beyond the end of the mapped object will be zero-filled.
|
|
.Pp
|
|
If
|
|
.Fa addr
|
|
is non-zero, it is used as a hint to the system.
|
|
(As a convenience to the system, the actual address of the region may differ
|
|
from the address supplied.)
|
|
If
|
|
.Fa addr
|
|
is zero, an address will be selected by the system.
|
|
The actual starting address of the region is returned.
|
|
A successful
|
|
.Fa mmap
|
|
deletes any previous mapping in the allocated address range.
|
|
.Pp
|
|
The protections (region accessibility) are specified in the
|
|
.Fa prot
|
|
argument by
|
|
.Em or Ns 'ing
|
|
the following values:
|
|
.Pp
|
|
.Bl -tag -width MAP_FIXEDX
|
|
.It Dv PROT_EXEC
|
|
Pages may be executed.
|
|
.It Dv PROT_READ
|
|
Pages may be read.
|
|
.It Dv PROT_WRITE
|
|
Pages may be written.
|
|
.El
|
|
.Pp
|
|
The
|
|
.Fa flags
|
|
parameter specifies the type of the mapped object, mapping options and
|
|
whether modifications made to the mapped copy of the page are private
|
|
to the process or are to be shared with other references.
|
|
Sharing, mapping type and options are specified in the
|
|
.Fa flags
|
|
argument by
|
|
.Em or Ns 'ing
|
|
the following values:
|
|
.Pp
|
|
.Bl -tag -width MAP_FIXEDX
|
|
.It Dv MAP_ANON
|
|
Map anonymous memory not associated with any specific file.
|
|
The file descriptor used for creating
|
|
.Dv MAP_ANON
|
|
must be \-1.
|
|
The
|
|
.Fa offset
|
|
parameter is ignored.
|
|
.\".It Dv MAP_FILE
|
|
.\"Mapped from a regular file or character-special device memory.
|
|
.It Dv MAP_FIXED
|
|
Do not permit the system to select a different address than the one
|
|
specified.
|
|
If the specified address cannot be used,
|
|
.Fn mmap
|
|
will fail.
|
|
If MAP_FIXED is specified,
|
|
.Fa addr
|
|
must be a multiple of the pagesize.
|
|
Use of this option is discouraged.
|
|
.It Dv MAP_HASSEMAPHORE
|
|
Notify the kernel that the region may contain semaphores and that special
|
|
handling may be necessary.
|
|
.It Dv MAP_INHERIT
|
|
Permit regions to be inherited across
|
|
.Xr execve 2
|
|
system calls.
|
|
.It Dv MAP_PRIVATE
|
|
Modifications are private.
|
|
.It Dv MAP_SHARED
|
|
Modifications are shared.
|
|
.It Dv MAP_STACK
|
|
This option is only available if your system has been compiled with
|
|
VM_STACK defined when compiling the kernel. This is the default for
|
|
i386 only. Consider adding -DVM_STACK to COPTFLAGS in your /etc/make.conf
|
|
to enable this option for other architechures. MAP_STACK implies
|
|
MAP_ANON, and
|
|
.Fa offset
|
|
of 0.
|
|
.Fa fd
|
|
must be -1 and
|
|
.Fa prot
|
|
must include at least PROT_READ and PROT_WRITE. This option creates
|
|
a memory region that grows to at most
|
|
.Fa len
|
|
bytes in size, starting from the stack top and growing down. The
|
|
stack top is the starting address returned by the call, plus
|
|
.Fa len
|
|
bytes. The bottom of the stack at maximum growth is the starting
|
|
address returned by the call.
|
|
.It Dv MAP_NOSYNC
|
|
Causes data dirtied via this VM map to be flushed to physical media
|
|
only when necessary (usually by the pager) rather then gratuitously.
|
|
Typically this prevents the update daemons from flushing pages dirtied
|
|
through such maps and thus allows efficient sharing of memory across
|
|
unassociated processes using a file-backed shared memory map. Without
|
|
this option any VM pages you dirty may be flushed to disk every so often
|
|
(every 30-60 seconds usually) which can create performance problems if you
|
|
do not need that to occur (such as when you are using shared file-backed
|
|
mmap regions for IPC purposes). Note that VM/filesystem coherency is
|
|
maintained whether you use MAP_NOSYNC or not. This option is not portable
|
|
across UNIX platforms (yet), though some may implement the same behavior
|
|
by default.
|
|
.Pp
|
|
The
|
|
.Xr fsync 2
|
|
function will flush all dirty data and metadata associated with a file,
|
|
including dirty NOSYNC VM data, to physical media. The
|
|
.Xr sync 1
|
|
command and
|
|
.Xr sync 2
|
|
system call generally do not flush dirty NOSYNC VM data.
|
|
The
|
|
.Xr msync 2
|
|
system call is obsolete since
|
|
.Os BSD
|
|
implements a coherent filesystem buffer cache. However, it may be
|
|
used to associate dirty VM pages with filesystem buffers and thus cause
|
|
them to be flushed to physical media sooner rather then later.
|
|
.El
|
|
.Pp
|
|
The
|
|
.Xr close 2
|
|
function does not unmap pages, see
|
|
.Xr munmap 2
|
|
for further information.
|
|
.Pp
|
|
The current design does not allow a process to specify the location of
|
|
swap space.
|
|
In the future we may define an additional mapping type,
|
|
.Dv MAP_SWAP ,
|
|
in which
|
|
the file descriptor argument specifies a file or device to which swapping
|
|
should be done.
|
|
.Sh RETURN VALUES
|
|
Upon successful completion,
|
|
.Fn mmap
|
|
returns a pointer to the mapped region.
|
|
Otherwise, a value of MAP_FAILED is returned and
|
|
.Va errno
|
|
is set to indicate the error.
|
|
.Sh ERRORS
|
|
.Fn Mmap
|
|
will fail if:
|
|
.Bl -tag -width Er
|
|
.It Bq Er EACCES
|
|
The flag
|
|
.Dv PROT_READ
|
|
was specified as part of the
|
|
.Fa prot
|
|
parameter and
|
|
.Fa fd
|
|
was not open for reading.
|
|
The flags
|
|
.Dv MAP_SHARED
|
|
and
|
|
.Dv PROT_WRITE
|
|
were specified as part of the
|
|
.Fa flags
|
|
and
|
|
.Fa prot
|
|
parameters and
|
|
.Fa fd
|
|
was not open for writing.
|
|
.It Bq Er EBADF
|
|
.Fa Fd
|
|
is not a valid open file descriptor.
|
|
.It Bq Er EINVAL
|
|
.Dv MAP_FIXED
|
|
was specified and the
|
|
.Fa addr
|
|
parameter was not page aligned, or part of the desired address space
|
|
resides out of the valid address space for a user process.
|
|
.It Bq Er EINVAL
|
|
.Fa Len
|
|
was negative.
|
|
.It Bq Er EINVAL
|
|
.Dv MAP_ANON
|
|
was specified and the
|
|
.Fa fd
|
|
parameter was not -1.
|
|
.It Bq Er EINVAL
|
|
.Dv MAP_ANON
|
|
has not been specified and
|
|
.Fa fd
|
|
did not reference a regular or character special file.
|
|
.It Bq Er EINVAL
|
|
.Fa Offset
|
|
was not page-aligned. (See BUGS below.)
|
|
.It Bq Er ENOMEM
|
|
.Dv MAP_FIXED
|
|
was specified and the
|
|
.Fa addr
|
|
parameter wasn't available.
|
|
.Dv MAP_ANON
|
|
was specified and insufficient memory was available.
|
|
.Sh "SEE ALSO"
|
|
.Xr madvise 2 ,
|
|
.Xr mincore 2 ,
|
|
.Xr mlock 2 ,
|
|
.Xr mprotect 2 ,
|
|
.Xr msync 2 ,
|
|
.Xr munlock 2 ,
|
|
.Xr munmap 2 ,
|
|
.Xr getpagesize 3
|
|
.Sh BUGS
|
|
.Ar len
|
|
is limited to 2GB. Mmapping slightly more than 2GB doesn't work, but
|
|
it is possible to map a window of size (filesize % 2GB) for file sizes
|
|
of slightly less than 2G, 4GB, 6GB and 8GB.
|
|
.Pp
|
|
The limit is imposed for a variety of reasons. Most of them have to do
|
|
with
|
|
.Tn FreeBSD
|
|
not wanting to use 64 bit offsets in the VM system due to
|
|
the extreme performance penalty. So
|
|
.Tn FreeBSD
|
|
uses 32bit page indexes and
|
|
this gives
|
|
.Tn FreeBSD
|
|
a maximum of 8TB filesizes. It's actually bugs in
|
|
the filesystem code that causes the limit to be further restricted to
|
|
1TB (loss of precision when doing blockno calculations).
|
|
.Pp
|
|
Another reason for the 2GB limit is that filesystem metadata can
|
|
reside at negative offsets.
|
|
.Pp
|
|
We currently can only deal with page aligned file offsets.
|