b74d3e0c37
potential extreme contention in the kernel for multi-threaded applications on SMP systems. Reported by: kris
585 lines
18 KiB
Groff
585 lines
18 KiB
Groff
.\" Copyright (c) 1980, 1991, 1993
|
|
.\" The Regents of the University of California. All rights reserved.
|
|
.\"
|
|
.\" This code is derived from software contributed to Berkeley by
|
|
.\" the American National Standards Committee X3, on Information
|
|
.\" Processing Systems.
|
|
.\"
|
|
.\" Redistribution and use in source and binary forms, with or without
|
|
.\" modification, are permitted provided that the following conditions
|
|
.\" are met:
|
|
.\" 1. Redistributions of source code must retain the above copyright
|
|
.\" notice, this list of conditions and the following disclaimer.
|
|
.\" 2. Redistributions in binary form must reproduce the above copyright
|
|
.\" notice, this list of conditions and the following disclaimer in the
|
|
.\" documentation and/or other materials provided with the distribution.
|
|
.\" 3. Neither the name of the University nor the names of its contributors
|
|
.\" may be used to endorse or promote products derived from this software
|
|
.\" without specific prior written permission.
|
|
.\"
|
|
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
|
|
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
|
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
|
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
|
|
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
|
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
|
|
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
|
|
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
|
|
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
|
|
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
|
|
.\" SUCH DAMAGE.
|
|
.\"
|
|
.\" @(#)malloc.3 8.1 (Berkeley) 6/4/93
|
|
.\" $FreeBSD$
|
|
.\"
|
|
.Dd August 26, 2008
|
|
.Dt MALLOC 3
|
|
.Os
|
|
.Sh NAME
|
|
.Nm malloc , calloc , realloc , free , reallocf , malloc_usable_size
|
|
.Nd general purpose memory allocation functions
|
|
.Sh LIBRARY
|
|
.Lb libc
|
|
.Sh SYNOPSIS
|
|
.In stdlib.h
|
|
.Ft void *
|
|
.Fn malloc "size_t size"
|
|
.Ft void *
|
|
.Fn calloc "size_t number" "size_t size"
|
|
.Ft void *
|
|
.Fn realloc "void *ptr" "size_t size"
|
|
.Ft void *
|
|
.Fn reallocf "void *ptr" "size_t size"
|
|
.Ft void
|
|
.Fn free "void *ptr"
|
|
.Ft const char *
|
|
.Va _malloc_options ;
|
|
.Ft void
|
|
.Fo \*(lp*_malloc_message\*(rp
|
|
.Fa "const char *p1" "const char *p2" "const char *p3" "const char *p4"
|
|
.Fc
|
|
.In malloc_np.h
|
|
.Ft size_t
|
|
.Fn malloc_usable_size "const void *ptr"
|
|
.Sh DESCRIPTION
|
|
The
|
|
.Fn malloc
|
|
function allocates
|
|
.Fa size
|
|
bytes of uninitialized memory.
|
|
The allocated space is suitably aligned (after possible pointer coercion)
|
|
for storage of any type of object.
|
|
.Pp
|
|
The
|
|
.Fn calloc
|
|
function allocates space for
|
|
.Fa number
|
|
objects,
|
|
each
|
|
.Fa size
|
|
bytes in length.
|
|
The result is identical to calling
|
|
.Fn malloc
|
|
with an argument of
|
|
.Dq "number * size" ,
|
|
with the exception that the allocated memory is explicitly initialized
|
|
to zero bytes.
|
|
.Pp
|
|
The
|
|
.Fn realloc
|
|
function changes the size of the previously allocated memory referenced by
|
|
.Fa ptr
|
|
to
|
|
.Fa size
|
|
bytes.
|
|
The contents of the memory are unchanged up to the lesser of the new and
|
|
old sizes.
|
|
If the new size is larger,
|
|
the contents of the newly allocated portion of the memory are undefined.
|
|
Upon success, the memory referenced by
|
|
.Fa ptr
|
|
is freed and a pointer to the newly allocated memory is returned.
|
|
Note that
|
|
.Fn realloc
|
|
and
|
|
.Fn reallocf
|
|
may move the memory allocation, resulting in a different return value than
|
|
.Fa ptr .
|
|
If
|
|
.Fa ptr
|
|
is
|
|
.Dv NULL ,
|
|
the
|
|
.Fn realloc
|
|
function behaves identically to
|
|
.Fn malloc
|
|
for the specified size.
|
|
.Pp
|
|
The
|
|
.Fn reallocf
|
|
function is identical to the
|
|
.Fn realloc
|
|
function, except that it
|
|
will free the passed pointer when the requested memory cannot be allocated.
|
|
This is a
|
|
.Fx
|
|
specific API designed to ease the problems with traditional coding styles
|
|
for realloc causing memory leaks in libraries.
|
|
.Pp
|
|
The
|
|
.Fn free
|
|
function causes the allocated memory referenced by
|
|
.Fa ptr
|
|
to be made available for future allocations.
|
|
If
|
|
.Fa ptr
|
|
is
|
|
.Dv NULL ,
|
|
no action occurs.
|
|
.Pp
|
|
The
|
|
.Fn malloc_usable_size
|
|
function returns the usable size of the allocation pointed to by
|
|
.Fa ptr .
|
|
The return value may be larger than the size that was requested during
|
|
allocation.
|
|
The
|
|
.Fn malloc_usable_size
|
|
function is not a mechanism for in-place
|
|
.Fn realloc ;
|
|
rather it is provided solely as a tool for introspection purposes.
|
|
Any discrepancy between the requested allocation size and the size reported by
|
|
.Fn malloc_usable_size
|
|
should not be depended on, since such behavior is entirely
|
|
implementation-dependent.
|
|
.Sh TUNING
|
|
Once, when the first call is made to one of these memory allocation
|
|
routines, various flags will be set or reset, which affects the
|
|
workings of this allocator implementation.
|
|
.Pp
|
|
The
|
|
.Dq name
|
|
of the file referenced by the symbolic link named
|
|
.Pa /etc/malloc.conf ,
|
|
the value of the environment variable
|
|
.Ev MALLOC_OPTIONS ,
|
|
and the string pointed to by the global variable
|
|
.Va _malloc_options
|
|
will be interpreted, in that order, from left to right as flags.
|
|
.Pp
|
|
Each flag is a single letter, optionally prefixed by a non-negative base 10
|
|
integer repetition count.
|
|
For example,
|
|
.Dq 3N
|
|
is equivalent to
|
|
.Dq NNN .
|
|
Some flags control parameter magnitudes, where uppercase increases the
|
|
magnitude, and lowercase decreases the magnitude.
|
|
Other flags control boolean parameters, where uppercase indicates that a
|
|
behavior is set, or on, and lowercase means that a behavior is not set, or off.
|
|
.Bl -tag -width indent
|
|
.It A
|
|
All warnings (except for the warning about unknown
|
|
flags being set) become fatal.
|
|
The process will call
|
|
.Xr abort 3
|
|
in these cases.
|
|
.It B
|
|
Double/halve the per-arena lock contention threshold at which a thread is
|
|
randomly re-assigned to an arena.
|
|
This dynamic load balancing tends to push threads away from highly contended
|
|
arenas, which avoids worst case contention scenarios in which threads
|
|
disproportionately utilize arenas.
|
|
However, due to the highly dynamic load that applications may place on the
|
|
allocator, it is impossible for the allocator to know in advance how sensitive
|
|
it should be to contention over arenas.
|
|
Therefore, some applications may benefit from increasing or decreasing this
|
|
threshold parameter.
|
|
This option is not available for some configurations (non-PIC).
|
|
.It C
|
|
Double/halve the size of the maximum size class that is a multiple of the
|
|
cacheline size (64).
|
|
Above this size, subpage spacing (256 bytes) is used for size classes.
|
|
The default value is 512 bytes.
|
|
.It D
|
|
Use
|
|
.Xr sbrk 2
|
|
to acquire memory in the data storage segment (DSS).
|
|
This option is enabled by default.
|
|
See the
|
|
.Dq M
|
|
option for related information and interactions.
|
|
.It F
|
|
Double/halve the per-arena maximum number of dirty unused pages that are
|
|
allowed to accumulate before informing the kernel about at least half of those
|
|
pages via
|
|
.Xr madvise 2 .
|
|
This provides the kernel with sufficient information to recycle dirty pages if
|
|
physical memory becomes scarce and the pages remain unused.
|
|
The default is 512 pages per arena;
|
|
.Ev MALLOC_OPTIONS=10f
|
|
will prevent any dirty unused pages from accumulating.
|
|
.It G
|
|
When there are multiple threads, use thread-specific caching for objects that
|
|
are smaller than one page.
|
|
This option is enabled by default.
|
|
Thread-specific caching allows many allocations to be satisfied without
|
|
performing any thread synchronization, at the cost of increased memory use.
|
|
See the
|
|
.Dq R
|
|
option for related tuning information.
|
|
This option is not available for some configurations (non-PIC).
|
|
.It J
|
|
Each byte of new memory allocated by
|
|
.Fn malloc ,
|
|
.Fn realloc
|
|
or
|
|
.Fn reallocf
|
|
will be initialized to 0xa5.
|
|
All memory returned by
|
|
.Fn free ,
|
|
.Fn realloc
|
|
or
|
|
.Fn reallocf
|
|
will be initialized to 0x5a.
|
|
This is intended for debugging and will impact performance negatively.
|
|
.It K
|
|
Double/halve the virtual memory chunk size.
|
|
The default chunk size is 1 MB.
|
|
.It M
|
|
Use
|
|
.Xr mmap 2
|
|
to acquire anonymously mapped memory.
|
|
This option is enabled by default.
|
|
If both the
|
|
.Dq D
|
|
and
|
|
.Dq M
|
|
options are enabled, the allocator prefers anonymous mappings over the DSS,
|
|
but allocation only fails if memory cannot be acquired via either method.
|
|
If neither option is enabled, then the
|
|
.Dq M
|
|
option is implicitly enabled in order to assure that there is a method for
|
|
acquiring memory.
|
|
.It N
|
|
Double/halve the number of arenas.
|
|
The default number of arenas is two times the number of CPUs, or one if there
|
|
is a single CPU.
|
|
.It P
|
|
Various statistics are printed at program exit via an
|
|
.Xr atexit 3
|
|
function.
|
|
This has the potential to cause deadlock for a multi-threaded process that exits
|
|
while one or more threads are executing in the memory allocation functions.
|
|
Therefore, this option should only be used with care; it is primarily intended
|
|
as a performance tuning aid during application development.
|
|
.It Q
|
|
Double/halve the size of the maximum size class that is a multiple of the
|
|
quantum (8 or 16 bytes, depending on architecture).
|
|
Above this size, cacheline spacing is used for size classes.
|
|
The default value is 128 bytes.
|
|
.It R
|
|
Double/halve magazine size, which approximately doubles/halves the number of
|
|
rounds in each magazine.
|
|
Magazines are used by the thread-specific caching machinery to acquire and
|
|
release objects in bulk.
|
|
Increasing the magazine size decreases locking overhead, at the expense of
|
|
increased memory usage.
|
|
This option is not available for some configurations (non-PIC).
|
|
.It U
|
|
Generate
|
|
.Dq utrace
|
|
entries for
|
|
.Xr ktrace 1 ,
|
|
for all operations.
|
|
Consult the source for details on this option.
|
|
.It V
|
|
Attempting to allocate zero bytes will return a
|
|
.Dv NULL
|
|
pointer instead of
|
|
a valid pointer.
|
|
(The default behavior is to make a minimal allocation and return a
|
|
pointer to it.)
|
|
This option is provided for System V compatibility.
|
|
This option is incompatible with the
|
|
.Dq X
|
|
option.
|
|
.It X
|
|
Rather than return failure for any allocation function,
|
|
display a diagnostic message on
|
|
.Dv stderr
|
|
and cause the program to drop
|
|
core (using
|
|
.Xr abort 3 ) .
|
|
This option should be set at compile time by including the following in
|
|
the source code:
|
|
.Bd -literal -offset indent
|
|
_malloc_options = "X";
|
|
.Ed
|
|
.It Z
|
|
Each byte of new memory allocated by
|
|
.Fn malloc ,
|
|
.Fn realloc
|
|
or
|
|
.Fn reallocf
|
|
will be initialized to 0.
|
|
Note that this initialization only happens once for each byte, so
|
|
.Fn realloc
|
|
and
|
|
.Fn reallocf
|
|
calls do not zero memory that was previously allocated.
|
|
This is intended for debugging and will impact performance negatively.
|
|
.El
|
|
.Pp
|
|
The
|
|
.Dq J
|
|
and
|
|
.Dq Z
|
|
options are intended for testing and debugging.
|
|
An application which changes its behavior when these options are used
|
|
is flawed.
|
|
.Sh IMPLEMENTATION NOTES
|
|
Traditionally, allocators have used
|
|
.Xr sbrk 2
|
|
to obtain memory, which is suboptimal for several reasons, including race
|
|
conditions, increased fragmentation, and artificial limitations on maximum
|
|
usable memory.
|
|
This allocator uses both
|
|
.Xr sbrk 2
|
|
and
|
|
.Xr mmap 2
|
|
by default, but it can be configured at run time to use only one or the other.
|
|
If resource limits are not a primary concern, the preferred configuration is
|
|
.Ev MALLOC_OPTIONS=dM
|
|
or
|
|
.Ev MALLOC_OPTIONS=DM .
|
|
When so configured, the
|
|
.Ar datasize
|
|
resource limit has little practical effect for typical applications; use
|
|
.Ev MALLOC_OPTIONS=Dm
|
|
if that is a concern.
|
|
Regardless of allocator configuration, the
|
|
.Ar vmemoryuse
|
|
resource limit can be used to bound the total virtual memory used by a
|
|
process, as described in
|
|
.Xr limits 1 .
|
|
.Pp
|
|
This allocator uses multiple arenas in order to reduce lock contention for
|
|
threaded programs on multi-processor systems.
|
|
This works well with regard to threading scalability, but incurs some costs.
|
|
There is a small fixed per-arena overhead, and additionally, arenas manage
|
|
memory completely independently of each other, which means a small fixed
|
|
increase in overall memory fragmentation.
|
|
These overheads are not generally an issue, given the number of arenas normally
|
|
used.
|
|
Note that using substantially more arenas than the default is not likely to
|
|
improve performance, mainly due to reduced cache performance.
|
|
However, it may make sense to reduce the number of arenas if an application
|
|
does not make much use of the allocation functions.
|
|
.Pp
|
|
In addition to multiple arenas, this allocator supports thread-specific
|
|
caching for small objects (smaller than one page), in order to make it
|
|
possible to completely avoid synchronization for most small allocation requests.
|
|
Such caching allows very fast allocation in the common case, but it increases
|
|
memory usage and fragmentation, since a bounded number of objects can remain
|
|
allocated in each thread cache.
|
|
.Pp
|
|
Memory is conceptually broken into equal-sized chunks, where the chunk size is
|
|
a power of two that is greater than the page size.
|
|
Chunks are always aligned to multiples of the chunk size.
|
|
This alignment makes it possible to find metadata for user objects very
|
|
quickly.
|
|
.Pp
|
|
User objects are broken into three categories according to size: small, large,
|
|
and huge.
|
|
Small objects are smaller than one page.
|
|
Large objects are smaller than the chunk size.
|
|
Huge objects are a multiple of the chunk size.
|
|
Small and large objects are managed by arenas; huge objects are managed
|
|
separately in a single data structure that is shared by all threads.
|
|
Huge objects are used by applications infrequently enough that this single
|
|
data structure is not a scalability issue.
|
|
.Pp
|
|
Each chunk that is managed by an arena tracks its contents as runs of
|
|
contiguous pages (unused, backing a set of small objects, or backing one large
|
|
object).
|
|
The combination of chunk alignment and chunk page maps makes it possible to
|
|
determine all metadata regarding small and large allocations in constant time.
|
|
.Pp
|
|
Small objects are managed in groups by page runs.
|
|
Each run maintains a bitmap that tracks which regions are in use.
|
|
Allocation requests that are no more than half the quantum (8 or 16, depending
|
|
on architecture) are rounded up to the nearest power of two.
|
|
Allocation requests that are more than half the quantum, but no more than the
|
|
minimum cacheline-multiple size class (see the
|
|
.Dq Q
|
|
option) are rounded up to the nearest multiple of the quantum.
|
|
Allocation requests that are more than the minumum cacheline-multiple size
|
|
class, but no more than the minimum subpage-multiple size class (see the
|
|
.Dq C
|
|
option) are rounded up to the nearest multiple of the cacheline size (64).
|
|
Allocation requests that are more than the minimum subpage-multiple size class
|
|
are rounded up to the nearest multiple of the subpage size (256).
|
|
Allocation requests that are more than one page, but small enough to fit in
|
|
an arena-managed chunk (see the
|
|
.Dq K
|
|
option), are rounded up to the nearest run size.
|
|
Allocation requests that are too large to fit in an arena-managed chunk are
|
|
rounded up to the nearest multiple of the chunk size.
|
|
.Pp
|
|
Allocations are packed tightly together, which can be an issue for
|
|
multi-threaded applications.
|
|
If you need to assure that allocations do not suffer from cacheline sharing,
|
|
round your allocation requests up to the nearest multiple of the cacheline
|
|
size.
|
|
.Sh DEBUGGING MALLOC PROBLEMS
|
|
The first thing to do is to set the
|
|
.Dq A
|
|
option.
|
|
This option forces a coredump (if possible) at the first sign of trouble,
|
|
rather than the normal policy of trying to continue if at all possible.
|
|
.Pp
|
|
It is probably also a good idea to recompile the program with suitable
|
|
options and symbols for debugger support.
|
|
.Pp
|
|
If the program starts to give unusual results, coredump or generally behave
|
|
differently without emitting any of the messages mentioned in the next
|
|
section, it is likely because it depends on the storage being filled with
|
|
zero bytes.
|
|
Try running it with the
|
|
.Dq Z
|
|
option set;
|
|
if that improves the situation, this diagnosis has been confirmed.
|
|
If the program still misbehaves,
|
|
the likely problem is accessing memory outside the allocated area.
|
|
.Pp
|
|
Alternatively, if the symptoms are not easy to reproduce, setting the
|
|
.Dq J
|
|
option may help provoke the problem.
|
|
.Pp
|
|
In truly difficult cases, the
|
|
.Dq U
|
|
option, if supported by the kernel, can provide a detailed trace of
|
|
all calls made to these functions.
|
|
.Pp
|
|
Unfortunately this implementation does not provide much detail about
|
|
the problems it detects; the performance impact for storing such information
|
|
would be prohibitive.
|
|
There are a number of allocator implementations available on the Internet
|
|
which focus on detecting and pinpointing problems by trading performance for
|
|
extra sanity checks and detailed diagnostics.
|
|
.Sh DIAGNOSTIC MESSAGES
|
|
If any of the memory allocation/deallocation functions detect an error or
|
|
warning condition, a message will be printed to file descriptor
|
|
.Dv STDERR_FILENO .
|
|
Errors will result in the process dumping core.
|
|
If the
|
|
.Dq A
|
|
option is set, all warnings are treated as errors.
|
|
.Pp
|
|
The
|
|
.Va _malloc_message
|
|
variable allows the programmer to override the function which emits
|
|
the text strings forming the errors and warnings if for some reason
|
|
the
|
|
.Dv stderr
|
|
file descriptor is not suitable for this.
|
|
Please note that doing anything which tries to allocate memory in
|
|
this function is likely to result in a crash or deadlock.
|
|
.Pp
|
|
All messages are prefixed by
|
|
.Dq Ao Ar progname Ac Ns Li : (malloc) .
|
|
.Sh RETURN VALUES
|
|
The
|
|
.Fn malloc
|
|
and
|
|
.Fn calloc
|
|
functions return a pointer to the allocated memory if successful; otherwise
|
|
a
|
|
.Dv NULL
|
|
pointer is returned and
|
|
.Va errno
|
|
is set to
|
|
.Er ENOMEM .
|
|
.Pp
|
|
The
|
|
.Fn realloc
|
|
and
|
|
.Fn reallocf
|
|
functions return a pointer, possibly identical to
|
|
.Fa ptr ,
|
|
to the allocated memory
|
|
if successful; otherwise a
|
|
.Dv NULL
|
|
pointer is returned, and
|
|
.Va errno
|
|
is set to
|
|
.Er ENOMEM
|
|
if the error was the result of an allocation failure.
|
|
The
|
|
.Fn realloc
|
|
function always leaves the original buffer intact
|
|
when an error occurs, whereas
|
|
.Fn reallocf
|
|
deallocates it in this case.
|
|
.Pp
|
|
The
|
|
.Fn free
|
|
function returns no value.
|
|
.Pp
|
|
The
|
|
.Fn malloc_usable_size
|
|
function returns the usable size of the allocation pointed to by
|
|
.Fa ptr .
|
|
.Sh ENVIRONMENT
|
|
The following environment variables affect the execution of the allocation
|
|
functions:
|
|
.Bl -tag -width ".Ev MALLOC_OPTIONS"
|
|
.It Ev MALLOC_OPTIONS
|
|
If the environment variable
|
|
.Ev MALLOC_OPTIONS
|
|
is set, the characters it contains will be interpreted as flags to the
|
|
allocation functions.
|
|
.El
|
|
.Sh EXAMPLES
|
|
To dump core whenever a problem occurs:
|
|
.Pp
|
|
.Bd -literal -offset indent
|
|
ln -s 'A' /etc/malloc.conf
|
|
.Ed
|
|
.Pp
|
|
To specify in the source that a program does no return value checking
|
|
on calls to these functions:
|
|
.Bd -literal -offset indent
|
|
_malloc_options = "X";
|
|
.Ed
|
|
.Sh SEE ALSO
|
|
.Xr limits 1 ,
|
|
.Xr madvise 2 ,
|
|
.Xr mmap 2 ,
|
|
.Xr sbrk 2 ,
|
|
.Xr alloca 3 ,
|
|
.Xr atexit 3 ,
|
|
.Xr getpagesize 3 ,
|
|
.Xr memory 3 ,
|
|
.Xr posix_memalign 3
|
|
.Sh STANDARDS
|
|
The
|
|
.Fn malloc ,
|
|
.Fn calloc ,
|
|
.Fn realloc
|
|
and
|
|
.Fn free
|
|
functions conform to
|
|
.St -isoC .
|
|
.Sh HISTORY
|
|
The
|
|
.Fn reallocf
|
|
function first appeared in
|
|
.Fx 3.0 .
|
|
.Pp
|
|
The
|
|
.Fn malloc_usable_size
|
|
function first appeared in
|
|
.Fx 7.0 .
|