Commit Graph

671 Commits

Author SHA1 Message Date
David Schultz
ac48ad2e5e Changing 'r' to a size_t in the previous commit turned quicksort
into slowsort for some sequences because different parts of the
code used 'r' to store two different things, one of which was
signed. Clean things up by splitting 'r' into two variables, and
use a more meaningful name.
2008-01-14 09:21:34 +00:00
David Schultz
badf97cd55 Use size_t to avoid overflow when sorting arrays larger than 2 GB.
PR:		111085
MFC after:	2 weeks
2008-01-13 02:11:10 +00:00
Jason Evans
f38512f4af Enable both sbrk(2)- and mmap(2)-based memory acquisition methods by
default.  This has the disadvantage of rendering the datasize resource
limit irrelevant, but without this change, legitimate uses of more
memory than will fit in the data segment are thwarted by default.

Fix chunk_alloc_mmap() to work correctly if initial mapping is not
chunk-aligned and mapping extension fails.
2008-01-03 23:22:13 +00:00
Jason Evans
36ac4cc502 Fix a major chunk-related memory leak in chunk_dealloc_dss_record(). [1]
Clean up DSS-related locking and protect all pertinent variables with
dss_mtx (remove dss_chunks_mtx).  This fixes race conditions that could
cause chunk leaks.

Reported by:	[1] kris
2007-12-31 06:19:48 +00:00
Jason Evans
07aa172f11 Fix a bug related to sbrk() calls that could cause address space leaks.
This is a long-standing bug, but until recent changes it was difficult
to trigger, and even then its impact was non-catastrophic, with the
exception of revision 1.157.

Optimize chunk_alloc_mmap() to avoid the need for unmapping pages in the
common case.  Thanks go to Kris Kennaway for a patch that inspired this
change.

Do not maintain a record of previously mmap'ed chunk address ranges.
The original intent was to avoid the extra system call overhead in
chunk_alloc_mmap(), which is no longer a concern.  This also allows some
simplifications for the tree of unused DSS chunks.

Introduce huge_mtx and dss_chunks_mtx to replace chunks_mtx.  There was
no compelling reason to use the same mutex for these disjoint purposes.

Avoid memset() for huge allocations when possible.

Maintain two trees instead of one for tracking unused DSS address
ranges.  This allows scalable allocation of multi-chunk huge objects in
the DSS.  Previously, multi-chunk huge allocation requests failed if the
DSS could not be extended.
2007-12-31 00:59:16 +00:00
Jason Evans
14a7e7b5e1 Back out premature commit of previous version. 2007-12-28 09:21:12 +00:00
Jason Evans
03947063d0 Maintain two trees instead of one (old_chunks --> old_chunks_{ad,szad}) in
order to support re-use of multi-chunk unused regions within the DSS for
huge allocations.  This generalization is important to correct function
when mmap-based allocation is disabled.

Avoid zeroing re-used memory in the DSS unless it really needs to be
zeroed.
2007-12-28 07:24:19 +00:00
Jason Evans
3762647250 Release chunks_mtx for all paths through chunk_dealloc().
Reported by:	kris
2007-12-28 02:15:08 +00:00
Jason Evans
ebc87e7e0b Add the 'D' and 'M' run time options, and use them to control whether
memory is acquired from the system via sbrk(2) and/or mmap(2).  By default,
use sbrk(2) only, in order to support traditional use of resource limits.
Additionally, when both options are enabled, prefer the data segment to
anonymous mappings, in order to coexist better with large file mappings
in applications on 32-bit platforms.  This change has the potential to
increase memory fragmentation due to the linear nature of the data
segment, but from a performance perspective this is mitigated by the use
of madvise(2). [1]

Add the ability to interpret integer prefixes in MALLOC_OPTIONS
processing.  For example, MALLOC_OPTIONS=lllllllll can now be specified as
MALLOC_OPTIONS=9l.

Reported by:	[1] rwatson
Design review:	[1] alc, peter, rwatson
2007-12-27 23:29:44 +00:00
John Baldwin
d32324f64f Clean up some of the pts(4) vs pty(4) stuff in grantpt(3) and friends:
- Use PTY* for all pty(4) related constants.
- Use PTMX* for all pts(4) related constants.
- Consistently use _PATH_DEV PTMX rather than "/dev/ptmx".
- Revert 1.7 and properly fix it by using the correct prefix string for
  pts(4) masters.

MFC after:	3 days
2007-12-21 21:26:08 +00:00
Jason Evans
a0a474aed6 Use fixed point integer math instead of floating point math when
calculating run sizes.  Use of the floating point unit was a potential
pessimization to context switching for applications that do not otherwise
use floating point math. [1]

Reformat cpp macro-related comments to improve consistency.

Submitted by:	das
2007-12-18 05:27:57 +00:00
Jason Evans
d55bd6236f Refactor features a bit in order to make it possible to disable lazy
deallocation and dynamic load balancing via the MALLOC_LAZY_FREE and
MALLOC_BALANCE knobs.  This is a non-functional change, since these
features are still enabled when possible.

Clean up a few things that more pedantic compiler settings would cause
complaints over.
2007-12-17 01:20:04 +00:00
David Schultz
4b6b574455 Implement and document nan(), nanf(), and nanl(). This commit
adds two new directories in msun: ld80 and ld128. These are for
long double functions specific to the 80-bit long double format
used on x86-derived architectures, and the 128-bit format used on
sparc64, respectively.
2007-12-16 21:19:28 +00:00
John Baldwin
ca81364fb1 Update posix_openpt(3) to handle 512 ptys. This was missed in the earlier
pty(4) changes.

MFC after:	3 days
2007-12-13 00:08:59 +00:00
Andrey A. Chernov
192b5193c7 Fix typo in the comment 2007-12-11 20:39:32 +00:00
Jason Evans
7e42e29b9b Only zero large allocations when necessary (for calloc()). 2007-11-28 00:17:34 +00:00
Jason Evans
77cfb3fec2 Document the B and L MALLOC_OPTIONS. 2007-11-27 03:18:26 +00:00
Jason Evans
5ea8413d0a Implement dynamic load balancing of thread-->arena mapping, based on lock
contention.  The intent is to dynamically adjust to load imbalances, which
can cause severe contention.

Use pthread mutexes where possible instead of libc "spinlocks" (they aren't
actually spin locks).  Conceptually, this change is meant only to support
the dynamic load balancing code by enabling the use of spin locks, but it
has the added apparent benefit of substantially improving performance due to
reduced context switches when there is moderate arena lock contention.

Proper tuning parameter configuration for this change is a finicky business,
and it is very much machine-dependent.  One seemingly promising solution
would be to run a tuning program during operating system installation that
computes appropriate settings for load balancing.  (The pthreads adaptive
spin locks should probably be similarly tuned.)
2007-11-27 03:17:30 +00:00
Jason Evans
26b5e3a18e Implement lazy deallocation of small objects. For each arena, maintain a
vector of slots for lazily freed objects.  For each deallocation, before
doing the hard work of locking the arena and deallocating, try several times
to randomly insert the object into the vector using atomic operations.

This approach is particularly effective at reducing contention for
multi-threaded applications that use the producer-consumer model, wherein
one producer thread allocates objects, then multiple consumer threads
deallocate those objects.
2007-11-27 03:13:15 +00:00
Jason Evans
bcd3523138 Avoid re-zeroing memory in calloc() when possible. 2007-11-27 03:12:15 +00:00
Jason Evans
1bbd1b8613 Fix stats printing of the amount of memory currently consumed by huge
allocations. [1]

Fix calculation of the number of arenas when 'n' is specified via
MALLOC_OPTIONS.

Clean up various style inconsistencies.

Obtained from:	[1] NetBSD
2007-11-27 03:09:23 +00:00
David Xu
c5081fcd35 Remove out of date notes, the atoi code is thread-safe and async-cancel
safe.

Discussed with: desichen
2007-10-19 06:23:39 +00:00
Sean Farley
8e5b20fa9c The precision for a string argument in a call to warnx() needs to be cast
to an int to remove the warning from using a size_t variable on 64-bit
platforms.

Submitted by:	Xin LI <delphij@FreeBSD.org>
Approved by:	wes
Approved by:	re (kensmith)
2007-09-22 02:30:44 +00:00
Sean Farley
21c376969a Skip rebuilding environ in setenv() only upon reuse of an active variable;
inactive variables should cause a rebuild of environ, otherwise, exec()'d
processes will be missing a variable in environ that has been unset then
set.

Submitted by:	Taku Yamamoto <taku@tackymt.homeip.net>
Reviewed by:	ache
Approved by:	wes (mentor)
Approved by:	re (kensmith)
2007-09-15 21:48:54 +00:00
Sean Farley
9bab236702 Added environ-replacement detection. For programs that "clean" (i.e., su)
or replace (i.e., zdump) the environment after a call to setenv(), putenv()
or unsetenv() has been made, a few changes were made.
  - getenv() will return the value from the new environ array.
  - setenv() was split into two functions:  __setenv() which is most of the
    previous setenv() without checks on the name and setenv() which
    contains the checks before calling __setenv().
  - setenv(), putenv() and unsetenv() will unset all previous values and
    call __setenv() on all entries in the new environ array which in turn
    adds them to the end of the envVars array.  Calling __setenv() instead
    of setenv() is done to avoid the temporary replacement of the '=' in a
    string with a NUL byte.  Some strings may be read-only data.

Added more regression checks for clearing the environment array.

Replaced gettimeofday() with getrusage() in timing regression check for
better accuracy.

Fixed an off-by-one bug in __remove_putenv() in the use of memmove().  This
went unnoticed due to the allocation of double the number of environ
entries when building envVars.

Fixed a few spelling mistakes in the comments.

Reviewed by:	ache
Approved by:	wes
Approved by:	re (kensmith)
2007-07-20 23:30:13 +00:00
Sean Farley
2966d28c32 Significantly reduce the memory leak as noted in BUGS section for
setenv(3) by tracking the size of the memory allocated instead of using
strlen() on the current value.

Convert all calls to POSIX from historic BSD API:
 - unsetenv returns an int.
 - putenv takes a char * instead of const char *.
 - putenv no longer makes a copy of the input string.
 - errno is set appropriately for POSIX.  Exceptions involve bad environ
   variable and internal initialization code.  These both set errno to
   EFAULT.

Several patches to base utilities to handle the POSIX changes from
Andrey Chernov's previous commit.  A few I re-wrote to use setenv()
instead of putenv().

New regression module for tools/regression/environ to test these
functions.  It also can be used to test the performance.

Bump __FreeBSD_version to 700050 due to API change.

PR:		kern/99826
Approved by:	wes
Approved by:	re (kensmith)
2007-07-04 00:00:41 +00:00
Jason Evans
0061e03d7f Add information about the implications of using mmap(2) instead of sbrk(2).
Submitted by:	bmah, jhb
2007-06-15 22:32:33 +00:00
Jason Evans
76507741ab Fix junk/zero filling for realloc(). Junk filling was missing in one case,
and zero filling was broken in a way that could cause memory corruption.

Update comments.
2007-06-15 22:00:16 +00:00
Jonathan Chen
959496efbf Backout 1.5 as requested by deischen 2007-05-22 05:28:40 +00:00
Jonathan Chen
81d8304713 __cleanup() is needed for ports/devel/valgrind, export it. 2007-05-22 03:03:28 +00:00
Andrey A. Chernov
ba174a5e38 Back out all POSIXified *env() changes.
Not because I admit they are technically wrong and not because of bug
reports (I receive nothing). But because I surprisingly meets so
strong opposition and resistance so lost any desire to continue that.

Anyone who interested in POSIX can dig out what changes and how
through cvs diffs.
2007-05-01 16:02:44 +00:00
Andrey A. Chernov
fad6917924 Bump .Dd
Suggested by: Henrik Brix Andersen <henrik@brixandersen.dk>
2007-04-30 19:37:10 +00:00
Andrey A. Chernov
a7b27253d0 Add phrase
"so altering the argument shall change the environment."
into putenv description.
2007-04-30 18:01:51 +00:00
Andrey A. Chernov
15fdb055e5 Make putenv() fully conforms to Open Group specs Issue 6
(also IEEE Std 1003.1-2001)

The specs explicitly says that altering passed string
should change the environment, i.e. putenv() directly puts its arg
into environment (unlike setenv() which just copies it there).
It means that putenv() can't be implemented via setenv()
(like we have before) at all. Putenv() value lives (allows modifying)
up to the next putenv() or setenv() call.
2007-04-30 16:56:18 +00:00
Andrey A. Chernov
00f8652278 Remove special case skipping initial '=' of the setenv() value "for
compatibility with the different environment conventions" (man page).
With the standards, we don't have them different anymore and
IEEE Std 1003.1-2001 says that

"The values that the environment variables may be assigned are not
restricted except that they are considered to end with a null byte"
2007-04-30 03:47:31 +00:00
Andrey A. Chernov
bdda893471 Make setenv, putenv, getenv and unsetenv conforming to Open Group specs
Issue 6 (also IEEE Std 1003.1-2001) in following areas:
args, return, errors.

Putenv still needs rewriting because specs explicitly says that
altering passed string later should change the environment (currently we
copy the string so can't provide that).
2007-04-30 02:25:02 +00:00
Daniel Eischen
5f864214bb Use C comments since we now preprocess these files with CPP. 2007-04-29 14:05:22 +00:00
Ruslan Ermilov
41e0cfabe9 Swap "underflow"/"overflow" in the table header.
Submitted by:	Ricardo Nabinger Sanchez
MFC after:	3 days
2007-04-10 11:17:00 +00:00
Jason Evans
d33f4690ba Use size_t instead of unsigned for pagesize-related values, in order to
avoid downcasting issues.  In particular, this change fixes
posix_memalign(3) for alignments greater than 2^31 on LP64 systems.

Make sure that NDEBUG is always set to be compatible with MALLOC_DEBUG. [1]

Reported by:	[1] Lee Hyo geol <hyogeollee@gmail.com>
2007-03-29 21:07:17 +00:00
Jason Evans
eaf8d73212 Remove the run promotion/demotion machinery. Replace it with red-black
trees that track all non-full runs for each bin.  Use the red-black
trees to be able to guarantee that each new allocation is placed in the
lowest address available in any non-full run.  This change completes the
transition to allocating from low addresses in order to reduce the
retention of sparsely used chunks.

If the run in current use by a bin becomes empty, deallocate the run
rather than retaining it for later use.  The previous behavior had the
tendency to spread empty runs across multiple chunks, thus preventing
the release of chunks that were completely unused.

Generalize base_chunk_alloc() (and rename it to base_pages_alloc()) to
handle allocation sizes larger than the chunk size, so that it is
possible to support chunk sizes that are smaller than an arena object.

Reduce the minimum chunk size from 64kB to 8kB.

Optimize tracking of addresses for deleted chunks.

Fix a statistics bug for huge allocations.
2007-03-28 19:55:07 +00:00
Jason Evans
0a19939042 Update the IMPLEMENTATION NOTES section to reflect recent malloc
enhancements.
2007-03-28 04:34:19 +00:00
Jason Evans
cec95ce278 Add a HISTORY section. 2007-03-28 04:32:51 +00:00
Jason Evans
12fbf47cfb Fix some subtle bugs for posix_memalign() having to do with integer
rounding and overflow.  Carefully document what the various overflow
tests actually detect.

The bugs mostly canceled out, such that the worst possible failure
cases resulted in non-fatal over-allocations.
2007-03-24 20:44:06 +00:00
Jason Evans
e3da012f00 Fix posix_memalign() for large objects. Now that runs are extents rather
than binary buddies, the alignment guarantees are weaker, which requires
a more complex aligned allocation algorithm, similar to that used for
alignment greater than the chunk size.

Reported by:	matteo
2007-03-23 22:58:15 +00:00
Jason Evans
bb99793a2b Use extents rather than binary buddies to track free pages within
chunks.  This allows runs to be any multiple of the page size.  The
primary advantage is that large objects are no longer constrained to be
2^n pages, which can dramatically decrease internal fragmentation for
large objects.  This also allows the sizes for runs that back small
objects to be more finely tuned.

Free runs are searched for linearly using the chunk page map (with the
help of some heuristic optimizations).  This changes the allocation
policy from "first best fit" to "first fit".  A prototype red-black tree
implementation for tracking free runs that implemented "first best fit"
did not cause a measurable speed or memory usage difference for
realistic chunk sizes (though of course it is possible to construct
benchmarks that favor one allocation policy over another).

Refine the handling of fullness constraints for small runs to be more
tunable.

Restructure the per chunk page map to contain only two fields per entry,
rather than four.  Also, increase each entry from 4 to 8 bytes, since it
allows for 32-bit integers, without increasing the number of chunk
header pages.

Relax the maximum chunk size constraint.  This is of no practical
interest; it is merely fallout from the chunk page map restructuring.

Revamp statistics gathering and reporting to be faster, clearer and more
informative.  Statistics gathering is fast enough now to have little
to no impact on application speed, but it still requires approximately
two extra pages of memory per arena (per process).  This memory overhead
may be acceptable for most systems, but we still need to leave
statistics gathering disabled by default in RELENG branches.

Rename NO_MALLOC_EXTRAS to MALLOC_PRODUCTION in order to make its intent
clearer (i.e. it should be defined in RELENG branches).
2007-03-23 05:05:48 +00:00
Jason Evans
c9f0c8fd74 Avoid using vsnprintf(3) unless MALLOC_STATS is defined, in order to
avoid substantial potential bloat for static binaries that do not
otherwise use any printf(3)-family functions. [1]

Rearrange arena_run_t so that the region bitmask can be minimally sized
according to constraints related to each bin's size class.  Previously,
the region bitmask was the same size for all run headers, which wasted
a measurable amount of memory.

Rather than making runs for small objects as large as possible, make
runs as small as possible such that header overhead stays below a
certain bound.  There are two exceptions that override the header
overhead bound:

	1) If the bound is impossible to honor, it is relaxed on a
	   per-size-class basis.  Since there is one bit of header
	   overhead per object (plus a constant), it is impossible to
	   achieve a header overhead less than or equal to 1/(# of bits
	   per object).  For the current setting of maximum 0.5% header
	   overhead, this relaxation comes into play for {2, 4, 8,
	   16}-byte objects, for which header overhead is (on 64-bit
	   systems) {7.1, 4.3, 2.2, 1.2}%, respectively.

	2) There is still a cap on small run size, still set to 64kB.
	   This comes into play for {1024, 2048}-byte objects, for which
	   header overhead is {1.6, 3.1}%, respectively.

In practice, this reduces the run sizes, which makes worst case
low-water memory usage due to fragmentation less bad.  It also reduces
worst case high-water run fragmentation due to non-full runs, but this
is only a constant improvement (most important to small short-lived
processes).

Reduce the default chunk size from 2MB to 1MB.  Benchmarks indicate that
the external fragmentation reduction makes 1MB the new sweet spot (as
small as possible without adversely affecting performance).

Reported by:	[1] kientzle
2007-03-20 03:44:10 +00:00
Jason Evans
a326064e24 Modify chunk_alloc() to prefer mmap()ed memory over sbrk()ed memory.
This has no impact unless USE_BRK is defined (32-bit platforms), in
which case user allocations are allocated via mmap() if at all possible,
in order to avoid the possibility of unreclaimable chunks in the data
segment.

Fix an obscure bug in base_alloc() that could have allowed undefined
behavior if an application were to use sbrk() in conjunction with a
USE_BRK-enabled malloc.
2007-02-22 19:10:30 +00:00
Jason Evans
38cc6e0a82 Fix a utrace(2)-related bug in calloc(3).
Integrate various pedantic cleanups.

Submitted by:	Andrew Doran <ad@netbsd.org>
2007-01-31 22:54:19 +00:00
Warner Losh
c879ae3536 Per Regents of the University of Calfornia letter, remove advertising
clause.

# If I've done so improperly on a file, please let me know.
2007-01-09 00:28:16 +00:00
Jason Evans
ee0ab7cd86 Implement chunk allocation/deallocation hysteresis by caching one spare
chunk per arena, rather than immediately deallocating all unused chunks.
This fixes a potential performance issue when allocating/deallocating
an object of size (4kB..1MB] in a loop.

Reported by:	davidxu
2006-12-23 00:18:51 +00:00
Tom Rhodes
79d9a182ab Note that the value from getenv() should not be modified by applications.
PR:		60544
Reviewed by:	ru
2006-10-12 08:39:24 +00:00
Tom Rhodes
0b0ea94893 getenv.3: Put "is" on a line with other words
getobjformat.3: "takes precedence over" is not an envrionment variable.

PR:		75545
Submitted by:	n-kogane@syd.odn.ne.jp
MFC after:	3 days
2006-10-07 21:27:21 +00:00
Ruslan Ermilov
ad136d1e29 Revise markup in recently added manpages. 2006-09-30 10:34:13 +00:00
Andrey A. Chernov
d3ff3b5f2f Keep compatible parts in sync with OpenBSD v1.21, add some comments.
No functional changes.
2006-09-23 14:48:31 +00:00
Andrey A. Chernov
6843acf8ac Remove code #ifndef'ed in prev. commit to stay in sync with OpenBSD
v1.21 which just do that.
2006-09-22 18:59:03 +00:00
Andrey A. Chernov
f27c7b4713 Be more GNU compatible:
don't be greedy on the GNU "::" extension when arg separated by whitespace
and POSIX_CORRECTLY is set. From POSIX point of view this is unclear
situation, so minimal assumption looks right.
2006-09-22 17:01:38 +00:00
Ruslan Ermilov
a73a3ab56b Markup fixes. 2006-09-17 21:27:35 +00:00
Jason Evans
820e03699c Change the way base allocation is done for internal malloc data
structures, in order to avoid the possibility of attempted recursive
lock acquisition for chunks_mtx.

Reported by:	Slawa Olhovchenkov <slw@zxy.spb.ru>
2006-09-08 17:52:15 +00:00
Ruslan Ermilov
4d6ff03d50 alloca() cannot check if the allocation is valid; mention the consequences.
Obtained from:	OpenBSD
2006-09-05 16:30:11 +00:00
Marcel Moolenaar
ce2dfbd199 Enable TLS on PowerPC. 2006-09-01 19:14:14 +00:00
Marcel Moolenaar
bc14049e96 Enable TLS on ia64. 2006-09-01 06:18:43 +00:00
Colin Percival
e981a4e863 Correctly handle the case in calloc(num, size) where
(size_t)(num * size) == 0
but both num and size are nonzero.

Reported by:	Ilja van Sprundel
Approved by:	jasone
Security:	Integer overflow; calloc was allocating 1 byte in
		response to a request for a multiple of 2^32 (or 2^64)
		bytes instead of returning NULL.
2006-08-13 21:54:47 +00:00
Marcel Moolenaar
5011eea82f Define NO_TLS on PowerPC.
See also: PR ia64/91846
2006-08-09 19:01:27 +00:00
Jason Evans
b3dcb52814 Conditionally expand the size_invs lookup table in arena_run_reg_dalloc()
so that architectures with a quantum of 8 (rather than 16) work.

Restore arm's quantum to 8.

Submitted by:	jmg
2006-07-27 19:09:32 +00:00
Olivier Houchard
4cfa5e0135 Use 4 as QUANTUM_2POW_MIN on arm as it is on any other architecture, to avoid
triggering an assertion later.
2006-07-27 14:36:28 +00:00
Jason Evans
b8f9774731 Fix cpp logic in arena_malloc() to adjust size when assertions are enabled,
even if stats gathering is disabled. [1]

Remove 'size' parameter from several functions that do not use it.

Reported by:	[1] ache
2006-07-27 04:00:12 +00:00
Jason Evans
5355c74026 Use some math tricks in arena_run_reg_dalloc() to avoid actual division, as
well as avoiding a switch statement.  This change has no significant impact
to performance when branch prediction is successful at predicting the sizes
of objects passed to free(), but in the case that the object sizes are
semi-random, this change has the potential to prevent many branch prediction
misses, thus improving performance substantially.

Take advantage of alignment guarantees in ipalloc(), and pad object sizes to
something less than a power of two when possible.  This has the potential
to substantially reduce internal fragmentation for objects allocated via
posix_memalign().

Avoid an unnecessary pow2_ceil() call in arena_ralloc().

Submitted by:	djam8193ah@hotmail.com
2006-07-01 16:51:10 +00:00
Jason Evans
00d8242c2b Make the behavior of malloc(0) standards-compliant by getting rid of nil,
and instead creating a small allocation for each malloc(0) call.  The
optional SysV compatibility behavior remains unchanged.

Add a couple of assertions.

Fix a couple of typos in error message strings.
2006-06-30 20:54:15 +00:00
Giorgos Keramidas
1d3a1c8bce twalk() expects an `action' function not a comparison function.
The text is correct in the "DESCRIPTION" section, so fix "SYNOPSIS"
to use the correct name.

PR:		docs/90498
Submitted by:	Vasil Dimov
MFC after:	3 days
2006-06-23 13:36:33 +00:00
Jason Evans
0fc8aff0c4 Add a missing case for the switch statement in arena_run_reg_dalloc(). [1]
Fix a leak in chunk_dealloc(). [2]

Reported by:	[1] djam8193ah@hotmail.com,
		[2] Ville-Pertti Keinonen <will@exomi.com>
2006-06-20 20:38:25 +00:00
Maxim Konovalov
3953c11715 o .Xr strtonum(3).
MFC after:	1 week
2006-05-20 21:11:35 +00:00
Jung-uk Kim
1761ec1040 Correct decoding a string containing '/'.
PR:		97485
Submitted by:	Mikko Tyolajarvi < mbsd at pacbell dot net >
2006-05-19 19:06:38 +00:00
Jason Evans
3212b810d8 Increase the minimum chunk size by a power of two (32kB --> 64kB, assuming
4kB pages), in order to avoid dangerous rounding error when calculating
fullness limits during run promotion/demotion.

Convert a structure bitfield to a normal field in areana_run_t.  This should
have been changed along with the other fields in revision 1.120.
2006-05-10 00:07:45 +00:00
Jason Evans
f7768b9f34 Change the semantics of brk_max to dynamically deal with data segment
bounds. [1]

Modify logic for utilizing the data segment, such that it is possible to
create huge allocations there.

Shrink the data segment when deallocating a chunk, if it is at the end of
the data segment.

Rename chunk_size to csize in huge_malloc(), in order to avoid masking a
static variable of the same name. [1]

Reported by:	Paul Allen <nospam@ugcs.caltech.edu>
2006-04-27 01:03:00 +00:00
Jens Schweikhardt
e4b2624f46 s/soley/solely 2006-04-13 18:19:44 +00:00
Jason Evans
f90cbdf17f Add an unreachable return statement, in order to avoid a compiler warning
for non-standard optimization levels.

Reported by:	Michael Zach <zach@webges.com>
2006-04-05 18:46:24 +00:00
Jason Evans
50ff9670e2 Only initialize the first per-chunk page map element for free runs. This
makes run split/coalesce operations of complexity lg(n) rather than n.
2006-04-05 04:15:12 +00:00
Jason Evans
94fc7dc0d5 Add malloc_usable_size() to the RETURN VALUES section. 2006-04-04 20:27:53 +00:00
Jason Evans
cf01f0d7c5 Add init_lock, and use it to protect against allocator initialization
races.  This isn't currently necessary for libpthread or libthr, but
without it external threads libraries like the linuxthreads port are
not safe to use.

Reported by:	ganbold@micom.mng.net
2006-04-04 19:46:28 +00:00
Jason Evans
1c6d5bde6c Refactor per-run bitmap manipulation functions so that bitmap offsets only
have to be calculated once per allocator operation.

Make nil const.

Update various comments.

Remove/avoid division where possible.

For the one division operation that remains in the critical path, add a
switch statement that has a case for each small size class, and do division
with a constant divisor in each case.  This allows the compiler to generate
optimized code that does not use hardware division [1].

Obtained from:	peter [1]
2006-04-04 03:51:47 +00:00
Jason Evans
cd70100e5d Optimize runtime performance, primary using the following techniques:
* Avoid choosing an arena until it's certain that an arena is needed
    for allocation.

  * Convert division/multiplication to bitshifting where possible.

  * Avoid accessing TLS variables in single-threaded code.

  * Reduce the amount of pointer dereferencing.

  * Move lock acquisition in critical paths to only protect the the code
    that requires synchronization, and completely remove locking where
    possible.
2006-03-30 20:25:52 +00:00
Jason Evans
6b2c15da6a Add malloc_usable_size(3).
Discussed with:		arch@
2006-03-28 22:16:04 +00:00
Jason Evans
9f9bc9367c Allow the 'n' option to decrease the number of arenas below the default,
to as little as one arena.  Also, limit the number of arenas to avoid a
potential invariant violation in base_alloc().
2006-03-26 23:41:35 +00:00
Jason Evans
4328edf534 Add comments and reformat/rearrange code. There are no significant
functional changes in this commit.
2006-03-26 23:37:25 +00:00
Jason Evans
0c21f9eda7 Convert TINY_MIN_2POW from a cpp macro to tiny_min_2pow (a variable), and
determine its value at run time according to other relevant values.  This
avoids the creation of runs that are incompletely utilized, as long as
pagesize isn't too large (>32kB, given the current RUN_MIN_REGS_2POW
setting).

Increase the size of several structure bitfields in arena_run_t in order
to avoid integer overflow in the case that a run's header does not overlap
with the space that is usable as application allocation regions.  Given
the tiny_min_2pow change, this fix has no additional impact unless
pagesize is >32kB.

Reported by:	kris
2006-03-24 22:13:49 +00:00
Jason Evans
efafcfa7fb Add USE_BRK-specific code in malloc_init_hard() to allow the first
internally used chunk to start at the beginning of the heap, rather
than at a chunk-aligned address.  This reduces mapped memory somewhat
for 32-bit architectures.

Add the arena_run_link_t type and use it wherever a run object is only
used as a ring 'header'.  This saves approximately 40 kB of memory per
arena.

Remove an obsolete (no longer used) code path from base_alloc(), which
supported the internal allocation of objects larger than the chunk
size.

Enhance chunk_dealloc() to cache chunk addresses for all deallocated
chunks.  This has no impact for most programs, but has the potential
to reduce VM map fragmentation for programs that use huge
allocations.
2006-03-24 00:28:08 +00:00
Jason Evans
c07ee180bc Separate completely full runs from runs that are merely almost full, so
that no linear searching is necessary if we resort to allocating from a
run that is known to be mostly full.  There are pathological edge cases
that could have caused severely degraded performance, and this change
fixes that.
2006-03-20 04:05:05 +00:00
Jason Evans
bd6a7799c4 Optimize realloc() to reallocate in place if the old and new sizes are
close enough to each other that reallocation would allocate a new region
of the same size.  This improves the performance of repeated incremental
reallocations by up to three orders of magnitude. [1]

Fix arena_new() to properly constrain run size if a small chunk size was
specified during runtime configuration.

Suggested by:	se [1]
2006-03-19 18:28:06 +00:00
Jason Evans
2d07e432d4 Modify allocation policy, in order to avoid excessive fragmentation for
allocation patterns that involve a relatively even mixture of many
different size classes.

Reduce the chunk size from 16 MB to 2 MB.  Since chunks are now carved up
using an address-ordered first best fit policy, VM map fragmentation is
much less likely, which makes smaller chunks not as much of a risk.  This
reduces the virtual memory size of most applications.

Remove redzones, since program buffer overruns are no longer as likely to
corrupt malloc data structures.

Remove the C MALLOC_OPTIONS flag, and add H and S.
2006-03-17 09:00:27 +00:00
Ruslan Ermilov
91545fccf9 Add a non-optional newline after ".Bx". 2006-03-15 14:45:45 +00:00
Andre Oppermann
7727f485de Revert previous changes as we do support the .Ox macro for OpenBSD.
Pointed out by:	ceri, ru, delphij
2006-03-15 14:05:41 +00:00
Andrey A. Chernov
7768950fe3 POSIXed strtoll() (and ours one too) can set errno to EINVAL, so check
it first.

Approved by:    andre
2006-03-14 19:53:03 +00:00
Andre Oppermann
b0b2326781 Fix HISTORY and point to OpenBSD. 2006-03-14 17:01:21 +00:00
Andre Oppermann
c74dfa2faf Import of OpenBSD's strtonum(3) which is a nicer version of strtoll(3)
providing proper error checking and other improvements.

Obtained from:	OpenBSD
Requested by:	flz (to port Open[BGP|OSPF]D)
MFC after:	3 days
2006-03-14 16:57:30 +00:00
Daniel Eischen
6fad3aaf15 Add each directory's symbol map file to SYM_MAPS. 2006-03-13 01:15:01 +00:00
Daniel Eischen
cce72e8860 Add symbol maps and initial symbol version definitions to libc.
Reviewed by:	davidxu
2006-03-13 00:53:21 +00:00
Wojciech A. Koszek
9d0e4617f3 Fix typo in manual page reference.
Approved by:	cognet (mentor)
MFC after:	3 days
2006-02-26 23:01:11 +00:00
Alexander Kabaev
129d4752a0 Remove extra slash from pty slave device name returned by ptsname. 2006-02-13 00:04:04 +00:00
Jason Evans
d8a1377b1b Fix calculation of the number of arenas to use on multi-processor systems. 2006-02-04 01:11:30 +00:00
Joel Dahl
fbf9b468d5 Expand contractions. 2006-02-01 14:33:14 +00:00
Olivier Houchard
9b1fa2482e If the sysctl kern.pts.enable doesn't exist, check that /dev/ptmx is there,
and if so, use the pts system.

Suggested by:	rwatson
2006-01-29 00:02:57 +00:00
Jason Evans
4fae5e8fda Remove unwarranted uses of 'goto'. 2006-01-27 07:46:22 +00:00
Jason Evans
a3d0ab47a6 Add NO_MALLOC_EXTRAS, so that various extra features that can cause
performance degradation can be disabled via something like the following
in /etc/malloc.conf:

	CFLAGS+=-DNO_MALLOC_EXTRAS

Suggested by:	deischen
2006-01-27 04:42:10 +00:00
Jason Evans
7138ef5b1d Fix the type of a statistics counter (unsigned --> unsigned long). 2006-01-27 04:36:39 +00:00
Jason Evans
842e5e3d91 Clean up statistics gathering and printing. 2006-01-27 02:36:44 +00:00
Jason Evans
499168546f Optimize arena_bin_pop() to reduce the number of separator operations.
Remove the block of code that tries to use delayed regions in LIFO order,
since from a policy perspective, it conflicts with LRU caching of newly
coalesced regions in arena_undelay().  There are numerous policy
alternatives, and it isn't readily obvious which (if any) is superior;
this change at least has the virtue of being consistent with policy.
2006-01-26 08:11:23 +00:00
Olivier Houchard
67c7201e18 ptsname() bits for pts. 2006-01-26 01:33:55 +00:00
Jason Evans
0653ddb655 Remove a redundant variable assignment in arena_reg_frag_alloc(). 2006-01-25 05:41:02 +00:00
Jason Evans
b97aec1d61 If no coalesced exact-fit small regions are available, but delayed exact-
fit regions are available, use the delayed regions in LIFO order, in order
to increase locality of reference.  We might expect this to cause delayed
regions to be removed from the delay ring buffer more often (since we're
now re-using more recently buffered regions), but numerous tests indicate
that the overall impact on memory usage tends to be good (reduced
fragmentation).

Re-work arena_frag_reg_alloc() so that when large free regions are
exhausted, it uses small regions in a way that favors contiguous allocation
of sequentially allocated small regions.  Use arena_frag_reg_alloc() in
this capacity, rather than directly attempting over-fitting of small
requests when no large regions are available.

Remove the bin overfit statistic, since it is no longer relevant due to
the arena_frag_reg_alloc() changes.

Do not specify arena_frag_reg_alloc() as an inline function.  It is too
large to benefit much from being inlined, and it is also called in two
places, only one of which is in the critical path (the other call bloated
arena_reg_alloc()).

Call arena_coalesce() for a region before caching it with
arena_mru_cache().

Add assertions that detect the attempted caching of adjacent free regions,
so that we notice this problem when it is first created, rather than in
arena_coalesce(), when it's too late to know how the problem arose.

Reported by:    Hans Blancke
2006-01-25 04:21:22 +00:00
Jason Evans
ad4e4c676f Make the 'C' and 'c' malloc options consistent with other options; 'C'
doubles the cache size, and 'c' halves the cache size.
2006-01-23 03:32:38 +00:00
Jason Evans
5531d7fdc6 In arena_chunk_reg_alloc(), try to avoid touching the last page in the
chunk during initialization, in order to avoid physically backing the
page unless data are allocated there.
2006-01-23 03:19:01 +00:00
Jason Evans
677bc78b39 Use uintptr_t rather than size_t when casting pointers to integers. Also,
fix the few remaining casting style(9) errors that remained after the
functional change.

Reported by:	jmallett
2006-01-20 03:11:11 +00:00
Jason Evans
5d11758a9f Revert addtion of assertions in revision 1.99. These assertions cause
problems in cases where regions are faked up for the purposes of red-black
tree searches, since those faked region headers reside on the stack, rather
than in a malloc chunk.
2006-01-19 19:20:42 +00:00
Jason Evans
ea41be77ba Add assertions that detect some forms of region separator corruption. 2006-01-19 19:08:11 +00:00
Jason Evans
a3bb22bc8e Remove loops in arena_coalesce(). They are no longer necessary, now that
internal allocation does not rely on recursive arena use (base_arena was
removed in revision 1.95).
2006-01-19 18:37:30 +00:00
Jason Evans
a4922fdaf5 Make all internal variables and functions static.
Reported by:	ache
2006-01-19 07:23:13 +00:00
Jason Evans
2addd81287 Return NULL if there is an OOM error during initialization, rather than
allowing the error to be fatal.

Move a label in order to make sure to properly handle errors in malloc(0).

Reported by:	Alastair D'Silva, Saneto Takanori
2006-01-19 02:11:05 +00:00
Jason Evans
3842ca4db5 Add a separate simple internal base allocator and remove base_arena, so that
there is never any need to recursively call the main allocation functions.

Remove recursive spinlock support, since it is no longer needed.

Allow chunks to be as small as the page size.

Correctly propagate OOM errors from arena_new().
2006-01-16 05:13:49 +00:00
Marcel Moolenaar
707ca316b6 Define NO_TLS on ia64. The dynamic TLS implementation on ia64 is
broken for non-threaded shared processes in that __tls_get_addr()
assumes the thread pointer is always initialized. This is not the
case. When arenas_map is referenced in choose_arena() and it is
defined as a thread-local variable, it will result in a SIGSEGV.

PR: ia64/91846 (describes the TLS/ia64 bug).
2006-01-16 00:32:46 +00:00
Jason Evans
24b6d11c34 Replace malloc(), calloc(), posix_memalign(), realloc(), and free() with
a scalable concurrent allocator implementation.

Reviewed by:	current@
Approved by:	phk, markm (mentor)
2006-01-13 18:38:56 +00:00
Jason Evans
352219015d Fix a bitwise logic error in posix_memalign().
Reported by:	glebius
2006-01-12 18:09:25 +00:00
Jason Evans
52828c0e9c In preparation for a new malloc implementation:
* Add posix_memalign().

  * Move calloc() from calloc.c to malloc.c.  Add a calloc() implementation in
    rtld-elf in order to make the loader happy (even though calloc() isn't
    used in rtld-elf).

  * Add _malloc_prefork() and _malloc_postfork(), and use them instead of
    directly manipulating __malloc_lock.

Approved by:	phk, markm (mentor)
2006-01-12 07:28:21 +00:00
Tom Rhodes
257551c6a0 Add a64l(), l64a(), and l64a_r() XSI extentions. These functions convert
between a 32-bit integer and a radix-64 ASCII string.  The l64a_r() function
is a NetBSD addition.

PR:		51209 (based on submission, but very different)
Reviewed by:	bde, ru
2005-12-24 22:37:59 +00:00
Ruslan Ermilov
4ca0505435 Fix prototype. 2005-11-23 20:34:37 +00:00
Stefan Farfeleder
613100918d Include a couple of headers to ensure consistency between the prototype and
the function definition.
2005-09-12 19:52:42 +00:00
Stefan Farfeleder
2ba64027bc Move the declaration of __cleanup to libc_private.h as it is used in both
stdio/ and stdlib/.  Don't define __cleanup twice.
2005-09-12 13:46:32 +00:00
Joe Marcus Clarke
a617a18a23 Fix ptsname(3) by converting it to use devname(3) to obtain the name of
a tty device instead of the legacy minor number approach.  This is known to
fix gnome-vfs' sftp module as well as kio_sftp and kdesu on -CURRENT.

Thanks to scottl for the snprintf() approach idea.

Reviewed by:	phk
Tested by:	pav
		mich
Approved by:	re (scottl)
2005-07-07 17:48:40 +00:00
Brian Feldman
91320d17cc Do not require the pty(4) majors to be anything in particular. 2005-03-04 20:23:32 +00:00
Xin LI
2dcb9ce484 Remove the check about whether MALLOC_EXTRA_SANITY is defined,
surrounding the undef'ing it.  It does not seem necessary to
undef some symbol that is not exist, and gcc does not complain
about whether a symbol is exist before #undef'ing it out.

Spotted by:	mingyanguo via ChinaUnix.net forum
Reviewed by:	phk
2005-02-27 17:16:16 +00:00
Andrey A. Chernov
d308373710 Especially mention that setting errno to EINVAL in "no conversion" case
is not portable.

Asked by:       joerg
2005-01-22 18:02:58 +00:00
Andrey A. Chernov
db0e25eeb9 Whitespace/style tweaking of prev. commit.
Noted by:       bde
2005-01-21 13:31:02 +00:00
Andrey A. Chernov
2571c7f720 POSIX says that 0[xX] prefix is _optional_ even in base 16 case, make it
really so.

"If the value of base is 16, the characters 0x or 0X may optionally
precede the sequence of letters and digits, following the sign if
present."

Found by:       joerg
2005-01-21 00:42:13 +00:00
Ruslan Ermilov
24a0682c64 Sort sections. 2005-01-20 09:17:07 +00:00
Ruslan Ermilov
629a7369d7 Markup fixes. 2005-01-14 21:07:56 +00:00
Brian Somers
41843e7135 Fix some signed/unsigned comparisons. Fix prototypes while I'm here.
PR:		28890
Submitted by:	matthias.andree at web dot de
MFC after:	7 days
2005-01-12 03:39:34 +00:00
Warner Losh
dfcc91e219 sranddev() is not magic pixie dust. While it gives a good random
seed, the random number generator rand(3) still sucks and is unlikely
sufficient for crypto use.  Correct what appears to be a cut and paste
error from the srandomdev() man page.

Submitted by: Ben Mesander
2004-11-10 17:25:49 +00:00
Alfred Perlstein
65da79c4be Reword recent addition about memory moving.
Requested by: keramida

Bump .Dd

Requested by: ru
2004-08-19 16:34:31 +00:00
Alfred Perlstein
09a12d75cf Clarify that realloc and reallocf may move the memory allocation. 2004-08-18 21:13:15 +00:00
Warner Losh
d68c1e59a8 Use #include <unistd.h> rather than the explicit externs in the
example.  The externs haven't been needed in about 10 years, so
there's no reason to have them other than for hysterical raisins.  And
the California Rasins haven't been around for a long time...
2004-07-31 01:00:50 +00:00
Ruslan Ermilov
2410103c1d mdoc(7) fixes. 2004-07-07 19:57:16 +00:00
Hiten Pandya
af73aa7cce Move the return value information about the getenv(3) library function
under the RETURN VALUES section so it is consistent with others.

Cleanup the return value text for getenv(3) a little while I am here.

PR:     	docs/58033
MFC after:	3 days
2004-07-06 23:21:36 +00:00
Andrey A. Chernov
42aeacc4d4 Keep it sync with OpenBSD:
An optional argument cannot start with '-', even if permutation is
disabled.

Obtained from: OpenBSD getopt_long.c v1.17
2004-07-06 13:58:45 +00:00
Ruslan Ermilov
1c85060a13 Sort SEE ALSO references (in dictionary order, ignoring case). 2004-07-04 20:55:50 +00:00
Stefan Farfeleder
5908d366fb Consistently use __inline instead of __inline__ as the former is an empty macro
in <sys/cdefs.h> for compilers without support for inline.
2004-07-04 16:11:03 +00:00
Ruslan Ermilov
30950a21e1 Eliminate double whitespace. 2004-07-03 22:30:10 +00:00
Ruslan Ermilov
1a0a934547 Mechanically kill hard sentence breaks. 2004-07-02 23:52:20 +00:00
Olivier Houchard
57734c02cd Define malloc_pageshift and malloc_minsize for arm. 2004-05-14 11:50:51 +00:00
Ruslan Ermilov
cb30b9c545 Link radixsort(3) to sradixsort(3), make the latter appear in
the whatis(1) output.
2004-05-12 08:13:40 +00:00
Andrey A. Chernov
f853699a55 Simplify one condition in prev. commit:
short_too already assumes FLAG_LONGONLY
2004-04-01 22:32:28 +00:00
Andrey A. Chernov
ed4fbbd5e3 Fix parsing of ambiguous options, whole loop must be processed 2004-04-01 22:09:07 +00:00