shm: Document shm_create_largepage()

While here, move notes about FreeBSD-specific functionality to the
COMPATIBILITY section, and document the ECAPMODE error for shm_open().

Reviewed by:	pauamma, kib
MFC after:	2 weeks
Sponsored by:	Klara, Inc.
Sponsored by:	Juniper Networks, Inc.
Differential Revision:	https://reviews.freebsd.org/D38282
Mark Johnston 2023-02-03 10:55:30 -05:00
parent a2286a1f46
commit 5f03f96fbe
2 changed files with 162 additions and 10 deletions


@ -482,6 +482,7 @@ MLINKS+=setuid.2 setegid.2 \
setuid.2 setgid.2
MLINKS+=shmat.2 shmdt.2
MLINKS+=shm_open.2 memfd_create.3 \
shm_open.2 shm_create_largepage.3 \
shm_open.2 shm_unlink.2 \
shm_open.2 shm_rename.2
MLINKS+=sigwaitinfo.2 sigtimedwait.2


@ -28,11 +28,11 @@
.\"
.\" $FreeBSD$
.\"
-.Dd June 25, 2021
+.Dd January 30, 2023
.Dt SHM_OPEN 2
.Os
.Sh NAME
-.Nm memfd_create , shm_open , shm_rename, shm_unlink
+.Nm memfd_create , shm_create_largepage , shm_open , shm_rename, shm_unlink
.Nd "shared memory object operations"
.Sh LIBRARY
.Lb libc
@ -43,6 +43,14 @@
.Ft int
.Fn memfd_create "const char *name" "unsigned int flags"
.Ft int
.Fo shm_create_largepage
.Fa "const char *path"
.Fa "int flags"
.Fa "int psind"
.Fa "int alloc_policy"
.Fa "mode_t mode"
.Fc
.Ft int
.Fn shm_open "const char *path" "int flags" "mode_t mode"
.Ft int
.Fn shm_rename "const char *path_from" "const char *path_to" "int flags"
@ -51,7 +59,7 @@
.Sh DESCRIPTION
The
.Fn shm_open
-system call opens (or optionally creates) a
+function opens (or optionally creates) a
POSIX
shared memory object named
.Fa path .
@ -114,9 +122,7 @@ see
and
.Xr fcntl 2 .
.Pp
-As a
-.Fx
-extension, the constant
+The constant
.Dv SHM_ANON
may be used for the
.Fa path
@ -143,6 +149,131 @@ will fail with
All other flags are ignored.
.Pp
The
.Fn shm_create_largepage
function behaves similarly to
.Fn shm_open ,
except that the
.Dv O_CREAT
flag is implicitly specified, and the returned
.Dq largepage
object is always backed by aligned, physically contiguous chunks of memory.
This ensures that the object can be mapped using so-called
.Dq superpages ,
which can improve application performance in some workloads by reducing the
number of translation lookaside buffer (TLB) entries required to access a
mapping of the object,
and by reducing the number of page faults performed when accessing a mapping.
This happens automatically for all largepage objects.
.Pp
An existing largepage object can be opened using the
.Fn shm_open
function.
Largepage shared memory objects behave slightly differently from non-largepage
objects:
.Bl -bullet -offset indent
.It
Memory for a largepage object is allocated when the object is
extended using the
.Xr ftruncate 2
system call, whereas memory for regular shared memory objects is allocated
lazily and may be paged out to a swap device when not in use.
.It
The size of a mapping of a largepage object must be a multiple of the
underlying large page size.
Most attributes of such a mapping can only be modified at the granularity
of the large page size.
For example, when using
.Xr munmap 2
to unmap a portion of a largepage object mapping, or when using
.Xr mprotect 2
to adjust protections of a mapping of a largepage object, the starting address
must be large page size-aligned, and the length of the operation must be a
multiple of the large page size.
If not, the corresponding system call will fail and set
.Va errno
to
.Er EINVAL .
.El
.Pp
The
.Fa psind
argument to
.Fn shm_create_largepage
specifies the size of large pages used to back the object.
This argument is an index into the page sizes array returned by
.Xr getpagesizes 3 .
In particular, all large pages backing a largepage object must be of the
same size.
For example, on a system with large page sizes of 2MB and 1GB, a 2GB largepage
object will consist of either 1024 2MB pages, or 2 1GB pages, depending on
the value specified for the
.Fa psind
argument.
The
.Fa alloc_policy
parameter specifies what happens when an attempt to use
.Xr ftruncate 2
to allocate memory for the object fails.
The following values are accepted:
.Bl -tag -offset indent -width SHM_
.It Dv SHM_LARGEPAGE_ALLOC_DEFAULT
If the (non-blocking) memory allocation fails because there is insufficient free
contiguous memory, the kernel will attempt to defragment physical memory and
try another allocation.
The subsequent allocation may or may not succeed.
If this subsequent allocation also fails,
.Xr ftruncate 2
will fail and set
.Va errno
to
.Er ENOMEM .
.It Dv SHM_LARGEPAGE_ALLOC_NOWAIT
If the memory allocation fails,
.Xr ftruncate 2
will fail and set
.Va errno
to
.Er ENOMEM .
.It Dv SHM_LARGEPAGE_ALLOC_HARD
The kernel will attempt defragmentation until the allocation succeeds,
or an unblocked signal is delivered to the thread.
However, it is possible for physical memory to be fragmented such that the
allocation will never succeed.
.El
.Pp
The
.Dv FIOSSHMLPGCNF
and
.Dv FIOGSHMLPGCNF
.Xr ioctl 2
commands can be used with a largepage shared memory object to get and set
largepage object parameters.
Both commands operate on the following structure:
.Bd -literal
struct shm_largepage_conf {
int psind;
int alloc_policy;
};
.Ed
The
.Dv FIOGSHMLPGCNF
command populates this structure with the current values of these parameters,
while the
.Dv FIOSSHMLPGCNF
command modifies the largepage object.
Currently only the
.Va alloc_policy
parameter may be modified.
Internally,
.Fn shm_create_largepage
works by creating a regular shared memory object using
.Fn shm_open ,
and then converting it into a largepage object using the
.Dv FIOSSHMLPGCNF
ioctl command.
.Pp
The
.Fn shm_rename
system call atomically removes a shared memory object named
.Fa path_from
@ -162,10 +293,6 @@ Return an error if an shm exists at
.Fa path_to ,
rather than unlinking it.
.El
-.Fn shm_rename
-is also a
-.Fx
-extension.
.Pp
The
.Fn shm_unlink
@ -235,6 +362,17 @@ All functions return -1 on failure, and set
to indicate the error.
.Sh COMPATIBILITY
The
.Fn shm_create_largepage
and
.Fn shm_rename
functions are
.Fx
extensions, as is support for the
.Dv SHM_ANON
value in
.Fn shm_open .
.Pp
The
.Fa path ,
.Fa path_from ,
and
@ -377,6 +515,18 @@ and
are specified and the named shared memory object does exist.
.It Bq Er EACCES
The required permissions (for reading or reading and writing) are denied.
.It Bq Er ECAPMODE
The process is running in capability mode (see
.Xr capsicum 4 )
and attempted to create a named shared memory object.
.El
.Pp
.Fn shm_create_largepage
can fail for the reasons listed above.
It also fails with these error codes for the following conditions:
.Bl -tag -width Er
.It Bq Er ENOTTY
The kernel does not support large pages on the current platform.
.El
.Pp
The following errors are defined for
@ -425,6 +575,7 @@ requires write permission to the shared memory object.
.Xr close 2 ,
.Xr fstat 2 ,
.Xr ftruncate 2 ,
.Xr ioctl 2 ,
.Xr mmap 2 ,
.Xr munmap 2 ,
.Xr sendfile 2
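
The page-size arithmetic and the alignment rule described in the new manual page text can be sketched in portable C. This is only an illustration: `largepage_count()` and `largepage_range_ok()` are hypothetical helpers, not part of libc or of this commit, and the first one assumes the object size is a whole number of large pages. In a real program the page size would presumably come from the `getpagesizes(3)` array entry selected by `psind`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helpers (not part of libc). They restate rules from the
 * manual page text as plain arithmetic and make no system calls.
 */

/*
 * Number of large pages of size pgsz needed to back an object of objsz
 * bytes. Returns 0 if pgsz is zero or objsz is not a multiple of pgsz
 * (an assumption made for this sketch).
 */
uint64_t
largepage_count(uint64_t objsz, uint64_t pgsz)
{
	if (pgsz == 0 || objsz % pgsz != 0)
		return (0);
	return (objsz / pgsz);
}

/*
 * True if the range [addr, addr + len) meets the documented requirement
 * for munmap(2)/mprotect(2) on a largepage mapping: the start address is
 * aligned to the large page size and the length is a multiple of it.
 */
bool
largepage_range_ok(uintptr_t addr, size_t len, size_t pgsz)
{
	if (pgsz == 0)
		return (false);
	return (addr % pgsz == 0 && len % pgsz == 0);
}
```

With the sizes used in the text, a 2GB object gives `largepage_count()` results of 1024 for 2MB pages and 2 for 1GB pages. A range that fails `largepage_range_ok()` is one for which the text says the system call fails with `EINVAL`.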