fa9896e082
Remove /^\.\\"\n\.\\"\s*\$FreeBSD\$$\n/
623 lines
15 KiB
Groff
623 lines
15 KiB
Groff
.\"
|
|
.\" Copyright 2000 Massachusetts Institute of Technology
|
|
.\"
|
|
.\" Permission to use, copy, modify, and distribute this software and
|
|
.\" its documentation for any purpose and without fee is hereby
|
|
.\" granted, provided that both the above copyright notice and this
|
|
.\" permission notice appear in all copies, that both the above
|
|
.\" copyright notice and this permission notice appear in all
|
|
.\" supporting documentation, and that the name of M.I.T. not be used
|
|
.\" in advertising or publicity pertaining to distribution of the
|
|
.\" software without specific, written prior permission. M.I.T. makes
|
|
.\" no representations about the suitability of this software for any
|
|
.\" purpose. It is provided "as is" without express or implied
|
|
.\" warranty.
|
|
.\"
|
|
.\" THIS SOFTWARE IS PROVIDED BY M.I.T. ``AS IS''. M.I.T. DISCLAIMS
|
|
.\" ALL EXPRESS OR IMPLIED WARRANTIES WITH REGARD TO THIS SOFTWARE,
|
|
.\" INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
|
|
.\" MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. IN NO EVENT
|
|
.\" SHALL M.I.T. BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
|
|
.\" SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
|
|
.\" LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
|
|
.\" USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
|
|
.\" ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
|
|
.\" OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
|
|
.\" OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
|
|
.\" SUCH DAMAGE.
|
|
.\"
|
|
.Dd January 30, 2023
|
|
.Dt SHM_OPEN 2
|
|
.Os
|
|
.Sh NAME
|
|
.Nm memfd_create , shm_create_largepage , shm_open , shm_rename, shm_unlink
|
|
.Nd "shared memory object operations"
|
|
.Sh LIBRARY
|
|
.Lb libc
|
|
.Sh SYNOPSIS
|
|
.In sys/types.h
|
|
.In sys/mman.h
|
|
.In fcntl.h
|
|
.Ft int
|
|
.Fn memfd_create "const char *name" "unsigned int flags"
|
|
.Ft int
|
|
.Fo shm_create_largepage
|
|
.Fa "const char *path"
|
|
.Fa "int flags"
|
|
.Fa "int psind"
|
|
.Fa "int alloc_policy"
|
|
.Fa "mode_t mode"
|
|
.Fc
|
|
.Ft int
|
|
.Fn shm_open "const char *path" "int flags" "mode_t mode"
|
|
.Ft int
|
|
.Fn shm_rename "const char *path_from" "const char *path_to" "int flags"
|
|
.Ft int
|
|
.Fn shm_unlink "const char *path"
|
|
.Sh DESCRIPTION
|
|
The
|
|
.Fn shm_open
|
|
function opens (or optionally creates) a
|
|
POSIX
|
|
shared memory object named
|
|
.Fa path .
|
|
The
|
|
.Fa flags
|
|
argument contains a subset of the flags used by
|
|
.Xr open 2 .
|
|
An access mode of either
|
|
.Dv O_RDONLY
|
|
or
|
|
.Dv O_RDWR
|
|
must be included in
|
|
.Fa flags .
|
|
The optional flags
|
|
.Dv O_CREAT ,
|
|
.Dv O_EXCL ,
|
|
and
|
|
.Dv O_TRUNC
|
|
may also be specified.
|
|
.Pp
|
|
If
|
|
.Dv O_CREAT
|
|
is specified,
|
|
then a new shared memory object named
|
|
.Fa path
|
|
will be created if it does not exist.
|
|
In this case,
|
|
the shared memory object is created with mode
|
|
.Fa mode
|
|
subject to the process' umask value.
|
|
If both the
|
|
.Dv O_CREAT
|
|
and
|
|
.Dv O_EXCL
|
|
flags are specified and a shared memory object named
|
|
.Fa path
|
|
already exists,
|
|
then
|
|
.Fn shm_open
|
|
will fail with
|
|
.Er EEXIST .
|
|
.Pp
|
|
Newly created objects start off with a size of zero.
|
|
If an existing shared memory object is opened with
|
|
.Dv O_RDWR
|
|
and the
|
|
.Dv O_TRUNC
|
|
flag is specified,
|
|
then the shared memory object will be truncated to a size of zero.
|
|
The size of the object can be adjusted via
|
|
.Xr ftruncate 2
|
|
and queried via
|
|
.Xr fstat 2 .
|
|
.Pp
|
|
The new descriptor is set to close during
|
|
.Xr execve 2
|
|
system calls;
|
|
see
|
|
.Xr close 2
|
|
and
|
|
.Xr fcntl 2 .
|
|
.Pp
|
|
The constant
|
|
.Dv SHM_ANON
|
|
may be used for the
|
|
.Fa path
|
|
argument to
|
|
.Fn shm_open .
|
|
In this case, an anonymous, unnamed shared memory object is created.
|
|
Since the object has no name,
|
|
it cannot be removed via a subsequent call to
|
|
.Fn shm_unlink ,
|
|
or moved with a call to
|
|
.Fn shm_rename .
|
|
Instead,
|
|
the shared memory object will be garbage collected when the last reference to
|
|
the shared memory object is removed.
|
|
The shared memory object may be shared with other processes by sharing the
|
|
file descriptor via
|
|
.Xr fork 2
|
|
or
|
|
.Xr sendmsg 2 .
|
|
Attempting to open an anonymous shared memory object with
|
|
.Dv O_RDONLY
|
|
will fail with
|
|
.Er EINVAL .
|
|
All other flags are ignored.
|
|
.Pp
|
|
The
|
|
.Fn shm_create_largepage
|
|
function behaves similarly to
|
|
.Fn shm_open ,
|
|
except that the
|
|
.Dv O_CREAT
|
|
flag is implicitly specified, and the returned
|
|
.Dq largepage
|
|
object is always backed by aligned, physically contiguous chunks of memory.
|
|
This ensures that the object can be mapped using so-called
|
|
.Dq superpages ,
|
|
which can improve application performance in some workloads by reducing the
|
|
number of translation lookaside buffer (TLB) entries required to access a
|
|
mapping of the object,
|
|
and by reducing the number of page faults performed when accessing a mapping.
|
|
This happens automatically for all largepage objects.
|
|
.Pp
|
|
An existing largepage object can be opened using the
|
|
.Fn shm_open
|
|
function.
|
|
Largepage shared memory objects behave slightly differently from non-largepage
|
|
objects:
|
|
.Bl -bullet -offset indent
|
|
.It
|
|
Memory for a largepage object is allocated when the object is
|
|
extended using the
|
|
.Xr ftruncate 2
|
|
system call, whereas memory for regular shared memory objects is allocated
|
|
lazily and may be paged out to a swap device when not in use.
|
|
.It
|
|
The size of a mapping of a largepage object must be a multiple of the
|
|
underlying large page size.
|
|
Most attributes of such a mapping can only be modified at the granularity
|
|
of the large page size.
|
|
For example, when using
|
|
.Xr munmap 2
|
|
to unmap a portion of a largepage object mapping, or when using
|
|
.Xr mprotect 2
|
|
to adjust protections of a mapping of a largepage object, the starting address
|
|
must be large page size-aligned, and the length of the operation must be a
|
|
multiple of the large page size.
|
|
If not, the corresponding system call will fail and set
|
|
.Va errno
|
|
to
|
|
.Er EINVAL .
|
|
.El
|
|
.Pp
|
|
The
|
|
.Fa psind
|
|
argument to
|
|
.Fn shm_create_largepage
|
|
specifies the size of large pages used to back the object.
|
|
This argument is an index into the page sizes array returned by
|
|
.Xr getpagesizes 3 .
|
|
In particular, all large pages backing a largepage object must be of the
|
|
same size.
|
|
For example, on a system with large page sizes of 2MB and 1GB, a 2GB largepage
|
|
object will consist of either 1024 2MB pages, or 2 1GB pages, depending on
|
|
the value specified for the
|
|
.Fa psind
|
|
argument.
|
|
The
|
|
.Fa alloc_policy
|
|
parameter specifies what happens when an attempt to use
|
|
.Xr ftruncate 2
|
|
to allocate memory for the object fails.
|
|
The following values are accepted:
|
|
.Bl -tag -offset indent -width SHM_
|
|
.It Dv SHM_LARGEPAGE_ALLOC_DEFAULT
|
|
If the (non-blocking) memory allocation fails because there is insufficient free
|
|
contiguous memory, the kernel will attempt to defragment physical memory and
|
|
try another allocation.
|
|
The subsequent allocation may or may not succeed.
|
|
If this subsequent allocation also fails,
|
|
.Xr ftruncate 2
|
|
will fail and set
|
|
.Va errno
|
|
to
|
|
.Er ENOMEM .
|
|
.It Dv SHM_LARGEPAGE_ALLOC_NOWAIT
|
|
If the memory allocation fails,
|
|
.Xr ftruncate 2
|
|
will fail and set
|
|
.Va errno
|
|
to
|
|
.Er ENOMEM .
|
|
.It Dv SHM_LARGEPAGE_ALLOC_HARD
|
|
The kernel will attempt defragmentation until the allocation succeeds,
|
|
or an unblocked signal is delivered to the thread.
|
|
However, it is possible for physical memory to be fragmented such that the
|
|
allocation will never succeed.
|
|
.El
|
|
.Pp
|
|
The
|
|
.Dv FIOSSHMLPGCNF
|
|
and
|
|
.Dv FIOGSHMLPGCNF
|
|
.Xr ioctl 2
|
|
commands can be used with a largepage shared memory object to get and set
|
|
largepage object parameters.
|
|
Both commands operate on the following structure:
|
|
.Bd -literal
|
|
struct shm_largepage_conf {
|
|
int psind;
|
|
int alloc_policy;
|
|
};
|
|
|
|
.Ed
|
|
The
|
|
.Dv FIOGSHMLPGCNF
|
|
command populates this structure with the current values of these parameters,
|
|
while the
|
|
.Dv FIOSSHMLPGCNF
|
|
command modifies the largepage object.
|
|
Currently only the
|
|
.Va alloc_policy
|
|
parameter may be modified.
|
|
Internally,
|
|
.Fn shm_create_largepage
|
|
works by creating a regular shared memory object using
|
|
.Fn shm_open ,
|
|
and then converting it into a largepage object using the
|
|
.Dv FIOSSHMLPGCNF
|
|
ioctl command.
|
|
.Pp
|
|
The
|
|
.Fn shm_rename
|
|
system call atomically removes a shared memory object named
|
|
.Fa path_from
|
|
and relinks it at
|
|
.Fa path_to .
|
|
If another object is already linked at
|
|
.Fa path_to ,
|
|
that object will be unlinked, unless one of the following flags are provided:
|
|
.Bl -tag -offset indent -width Er
|
|
.It Er SHM_RENAME_EXCHANGE
|
|
Atomically exchange the shms at
|
|
.Fa path_from
|
|
and
|
|
.Fa path_to .
|
|
.It Er SHM_RENAME_NOREPLACE
|
|
Return an error if an shm exists at
|
|
.Fa path_to ,
|
|
rather than unlinking it.
|
|
.El
|
|
.Pp
|
|
The
|
|
.Fn shm_unlink
|
|
system call removes a shared memory object named
|
|
.Fa path .
|
|
.Pp
|
|
The
|
|
.Fn memfd_create
|
|
function creates an anonymous shared memory object, identical to that created
|
|
by
|
|
.Fn shm_open
|
|
when
|
|
.Dv SHM_ANON
|
|
is specified.
|
|
Newly created objects start off with a size of zero.
|
|
The size of the new object must be adjusted via
|
|
.Xr ftruncate 2 .
|
|
.Pp
|
|
The
|
|
.Fa name
|
|
argument must not be
|
|
.Dv NULL ,
|
|
but it may be an empty string.
|
|
The length of the
|
|
.Fa name
|
|
argument may not exceed
|
|
.Dv NAME_MAX
|
|
minus six characters for the prefix
|
|
.Dq memfd: ,
|
|
which will be prepended.
|
|
The
|
|
.Fa name
|
|
argument is intended solely for debugging purposes and will never be used by the
|
|
kernel to identify a memfd.
|
|
Names are therefore not required to be unique.
|
|
.Pp
|
|
The following
|
|
.Fa flags
|
|
may be specified to
|
|
.Fn memfd_create :
|
|
.Bl -tag -width MFD_ALLOW_SEALING
|
|
.It Dv MFD_CLOEXEC
|
|
Set
|
|
.Dv FD_CLOEXEC
|
|
on the resulting file descriptor.
|
|
.It Dv MFD_ALLOW_SEALING
|
|
Allow adding seals to the resulting file descriptor using the
|
|
.Dv F_ADD_SEALS
|
|
.Xr fcntl 2
|
|
command.
|
|
.It Dv MFD_HUGETLB
|
|
This flag is currently unsupported.
|
|
.El
|
|
.Sh RETURN VALUES
|
|
If successful,
|
|
.Fn memfd_create
|
|
and
|
|
.Fn shm_open
|
|
both return a non-negative integer,
|
|
and
|
|
.Fn shm_rename
|
|
and
|
|
.Fn shm_unlink
|
|
return zero.
|
|
All functions return -1 on failure, and set
|
|
.Va errno
|
|
to indicate the error.
|
|
.Sh COMPATIBILITY
|
|
The
|
|
.Fn shm_create_largepage
|
|
and
|
|
.Fn shm_rename
|
|
functions are
|
|
.Fx
|
|
extensions, as is support for the
|
|
.Dv SHM_ANON
|
|
value in
|
|
.Fn shm_open .
|
|
.Pp
|
|
The
|
|
.Fa path ,
|
|
.Fa path_from ,
|
|
and
|
|
.Fa path_to
|
|
arguments do not necessarily represent a pathname (although they do in
|
|
most other implementations).
|
|
Two processes opening the same
|
|
.Fa path
|
|
are guaranteed to access the same shared memory object if and only if
|
|
.Fa path
|
|
begins with a slash
|
|
.Pq Ql \&/
|
|
character.
|
|
.Pp
|
|
Only the
|
|
.Dv O_RDONLY ,
|
|
.Dv O_RDWR ,
|
|
.Dv O_CREAT ,
|
|
.Dv O_EXCL ,
|
|
and
|
|
.Dv O_TRUNC
|
|
flags may be used in portable programs.
|
|
.Pp
|
|
POSIX
|
|
specifications state that the result of using
|
|
.Xr open 2 ,
|
|
.Xr read 2 ,
|
|
or
|
|
.Xr write 2
|
|
on a shared memory object, or on the descriptor returned by
|
|
.Fn shm_open ,
|
|
is undefined.
|
|
However, the
|
|
.Fx
|
|
kernel implementation explicitly includes support for
|
|
.Xr read 2
|
|
and
|
|
.Xr write 2 .
|
|
.Pp
|
|
.Fx
|
|
also supports zero-copy transmission of data from shared memory
|
|
objects with
|
|
.Xr sendfile 2 .
|
|
.Pp
|
|
Neither shared memory objects nor their contents persist across reboots.
|
|
.Pp
|
|
Writes do not extend shared memory objects, so
|
|
.Xr ftruncate 2
|
|
must be called before any data can be written.
|
|
See
|
|
.Sx EXAMPLES .
|
|
.Sh EXAMPLES
|
|
This example fails without the call to
|
|
.Xr ftruncate 2 :
|
|
.Bd -literal -compact
|
|
|
|
uint8_t buffer[getpagesize()];
|
|
ssize_t len;
|
|
int fd;
|
|
|
|
fd = shm_open(SHM_ANON, O_RDWR | O_CREAT, 0600);
|
|
if (fd < 0)
|
|
err(EX_OSERR, "%s: shm_open", __func__);
|
|
if (ftruncate(fd, getpagesize()) < 0)
|
|
err(EX_IOERR, "%s: ftruncate", __func__);
|
|
len = pwrite(fd, buffer, getpagesize(), 0);
|
|
if (len < 0)
|
|
err(EX_IOERR, "%s: pwrite", __func__);
|
|
if (len != getpagesize())
|
|
errx(EX_IOERR, "%s: pwrite length mismatch", __func__);
|
|
.Ed
|
|
.Sh ERRORS
|
|
.Fn memfd_create
|
|
fails with these error codes for these conditions:
|
|
.Bl -tag -width Er
|
|
.It Bq Er EBADF
|
|
The
|
|
.Fa name
|
|
argument was NULL.
|
|
.It Bq Er EINVAL
|
|
The
|
|
.Fa name
|
|
argument was too long.
|
|
.Pp
|
|
An invalid or unsupported flag was included in
|
|
.Fa flags .
|
|
.It Bq Er EMFILE
|
|
The process has already reached its limit for open file descriptors.
|
|
.It Bq Er ENFILE
|
|
The system file table is full.
|
|
.It Bq Er ENOSYS
|
|
In
|
|
.Fa memfd_create ,
|
|
.Dv MFD_HUGETLB
|
|
was specified in
|
|
.Fa flags ,
|
|
and this system does not support forced hugetlb mappings.
|
|
.El
|
|
.Pp
|
|
.Fn shm_open
|
|
fails with these error codes for these conditions:
|
|
.Bl -tag -width Er
|
|
.It Bq Er EINVAL
|
|
A flag other than
|
|
.Dv O_RDONLY ,
|
|
.Dv O_RDWR ,
|
|
.Dv O_CREAT ,
|
|
.Dv O_EXCL ,
|
|
or
|
|
.Dv O_TRUNC
|
|
was included in
|
|
.Fa flags .
|
|
.It Bq Er EMFILE
|
|
The process has already reached its limit for open file descriptors.
|
|
.It Bq Er ENFILE
|
|
The system file table is full.
|
|
.It Bq Er EINVAL
|
|
.Dv O_RDONLY
|
|
was specified while creating an anonymous shared memory object via
|
|
.Dv SHM_ANON .
|
|
.It Bq Er EFAULT
|
|
The
|
|
.Fa path
|
|
argument points outside the process' allocated address space.
|
|
.It Bq Er ENAMETOOLONG
|
|
The entire pathname exceeds 1023 characters.
|
|
.It Bq Er EINVAL
|
|
The
|
|
.Fa path
|
|
does not begin with a slash
|
|
.Pq Ql \&/
|
|
character.
|
|
.It Bq Er ENOENT
|
|
.Dv O_CREAT
|
|
is not specified and the named shared memory object does not exist.
|
|
.It Bq Er EEXIST
|
|
.Dv O_CREAT
|
|
and
|
|
.Dv O_EXCL
|
|
are specified and the named shared memory object does exist.
|
|
.It Bq Er EACCES
|
|
The required permissions (for reading or reading and writing) are denied.
|
|
.It Bq Er ECAPMODE
|
|
The process is running in capability mode (see
|
|
.Xr capsicum 4 )
|
|
and attempted to create a named shared memory object.
|
|
.El
|
|
.Pp
|
|
.Fn shm_create_largepage
|
|
can fail for the reasons listed above.
|
|
It also fails with these error codes for the following conditions:
|
|
.Bl -tag -width Er
|
|
.It Bq Er ENOTTY
|
|
The kernel does not support large pages on the current platform.
|
|
.El
|
|
.Pp
|
|
The following errors are defined for
|
|
.Fn shm_rename :
|
|
.Bl -tag -width Er
|
|
.It Bq Er EFAULT
|
|
The
|
|
.Fa path_from
|
|
or
|
|
.Fa path_to
|
|
argument points outside the process' allocated address space.
|
|
.It Bq Er ENAMETOOLONG
|
|
The entire pathname exceeds 1023 characters.
|
|
.It Bq Er ENOENT
|
|
The shared memory object at
|
|
.Fa path_from
|
|
does not exist.
|
|
.It Bq Er EACCES
|
|
The required permissions are denied.
|
|
.It Bq Er EEXIST
|
|
An shm exists at
|
|
.Fa path_to ,
|
|
and the
|
|
.Dv SHM_RENAME_NOREPLACE
|
|
flag was provided.
|
|
.El
|
|
.Pp
|
|
.Fn shm_unlink
|
|
fails with these error codes for these conditions:
|
|
.Bl -tag -width Er
|
|
.It Bq Er EFAULT
|
|
The
|
|
.Fa path
|
|
argument points outside the process' allocated address space.
|
|
.It Bq Er ENAMETOOLONG
|
|
The entire pathname exceeds 1023 characters.
|
|
.It Bq Er ENOENT
|
|
The named shared memory object does not exist.
|
|
.It Bq Er EACCES
|
|
The required permissions are denied.
|
|
.Fn shm_unlink
|
|
requires write permission to the shared memory object.
|
|
.El
|
|
.Sh SEE ALSO
|
|
.Xr posixshmcontrol 1 ,
|
|
.Xr close 2 ,
|
|
.Xr fstat 2 ,
|
|
.Xr ftruncate 2 ,
|
|
.Xr ioctl 2 ,
|
|
.Xr mmap 2 ,
|
|
.Xr munmap 2 ,
|
|
.Xr sendfile 2
|
|
.Sh STANDARDS
|
|
The
|
|
.Fn memfd_create
|
|
function is expected to be compatible with the Linux system call of the same
|
|
name.
|
|
.Pp
|
|
The
|
|
.Fn shm_open
|
|
and
|
|
.Fn shm_unlink
|
|
functions are believed to conform to
|
|
.St -p1003.1b-93 .
|
|
.Sh HISTORY
|
|
The
|
|
.Fn memfd_create
|
|
function appeared in
|
|
.Fx 13.0 .
|
|
.Pp
|
|
The
|
|
.Fn shm_open
|
|
and
|
|
.Fn shm_unlink
|
|
functions first appeared in
|
|
.Fx 4.3 .
|
|
The functions were reimplemented as system calls using shared memory objects
|
|
directly rather than files in
|
|
.Fx 8.0 .
|
|
.Pp
|
|
.Fn shm_rename
|
|
first appeared in
|
|
.Fx 13.0
|
|
as a
|
|
.Fx
|
|
extension.
|
|
.Sh AUTHORS
|
|
.An Garrett A. Wollman Aq Mt wollman@FreeBSD.org
|
|
(C library support and this manual page)
|
|
.Pp
|
|
.An Matthew Dillon Aq Mt dillon@FreeBSD.org
|
|
.Pq Dv MAP_NOSYNC
|
|
.Pp
|
|
.An Matthew Bryan Aq Mt matthew.bryan@isilon.com
|
|
.Pq Dv shm_rename implementation
|