491 Commits

Author SHA1 Message Date
Jason Evans
cd70100e5d Optimize runtime performance, primary using the following techniques:
* Avoid choosing an arena until it's certain that an arena is needed
    for allocation.

  * Convert division/multiplication to bitshifting where possible.

  * Avoid accessing TLS variables in single-threaded code.

  * Reduce the amount of pointer dereferencing.

  * Move lock acquisition in critical paths to only protect the the code
    that requires synchronization, and completely remove locking where
    possible.
2006-03-30 20:25:52 +00:00
Jason Evans
6b2c15da6a Add malloc_usable_size(3).
Discussed with:		arch@
2006-03-28 22:16:04 +00:00
Jason Evans
9f9bc9367c Allow the 'n' option to decrease the number of arenas below the default,
to as little as one arena.  Also, limit the number of arenas to avoid a
potential invariant violation in base_alloc().
2006-03-26 23:41:35 +00:00
Jason Evans
4328edf534 Add comments and reformat/rearrange code. There are no significant
functional changes in this commit.
2006-03-26 23:37:25 +00:00
Jason Evans
0c21f9eda7 Convert TINY_MIN_2POW from a cpp macro to tiny_min_2pow (a variable), and
determine its value at run time according to other relevant values.  This
avoids the creation of runs that are incompletely utilized, as long as
pagesize isn't too large (>32kB, given the current RUN_MIN_REGS_2POW
setting).

Increase the size of several structure bitfields in arena_run_t in order
to avoid integer overflow in the case that a run's header does not overlap
with the space that is usable as application allocation regions.  Given
the tiny_min_2pow change, this fix has no additional impact unless
pagesize is >32kB.

Reported by:	kris
2006-03-24 22:13:49 +00:00
Jason Evans
efafcfa7fb Add USE_BRK-specific code in malloc_init_hard() to allow the first
internally used chunk to start at the beginning of the heap, rather
than at a chunk-aligned address.  This reduces mapped memory somewhat
for 32-bit architectures.

Add the arena_run_link_t type and use it wherever a run object is only
used as a ring 'header'.  This saves approximately 40 kB of memory per
arena.

Remove an obsolete (no longer used) code path from base_alloc(), which
supported the internal allocation of objects larger than the chunk
size.

Enhance chunk_dealloc() to cache chunk addresses for all deallocated
chunks.  This has no impact for most programs, but has the potential
to reduce VM map fragmentation for programs that use huge
allocations.
2006-03-24 00:28:08 +00:00
Jason Evans
c07ee180bc Separate completely full runs from runs that are merely almost full, so
that no linear searching is necessary if we resort to allocating from a
run that is known to be mostly full.  There are pathological edge cases
that could have caused severely degraded performance, and this change
fixes that.
2006-03-20 04:05:05 +00:00
Jason Evans
bd6a7799c4 Optimize realloc() to reallocate in place if the old and new sizes are
close enough to each other that reallocation would allocate a new region
of the same size.  This improves the performance of repeated incremental
reallocations by up to three orders of magnitude. [1]

Fix arena_new() to properly constrain run size if a small chunk size was
specified during runtime configuration.

Suggested by:	se [1]
2006-03-19 18:28:06 +00:00
Jason Evans
2d07e432d4 Modify allocation policy, in order to avoid excessive fragmentation for
allocation patterns that involve a relatively even mixture of many
different size classes.

Reduce the chunk size from 16 MB to 2 MB.  Since chunks are now carved up
using an address-ordered first best fit policy, VM map fragmentation is
much less likely, which makes smaller chunks not as much of a risk.  This
reduces the virtual memory size of most applications.

Remove redzones, since program buffer overruns are no longer as likely to
corrupt malloc data structures.

Remove the C MALLOC_OPTIONS flag, and add H and S.
2006-03-17 09:00:27 +00:00
Ruslan Ermilov
91545fccf9 Add a non-optional newline after ".Bx". 2006-03-15 14:45:45 +00:00
Andre Oppermann
7727f485de Revert previous changes as we do support the .Ox macro for OpenBSD.
Pointed out by:	ceri, ru, delphij
2006-03-15 14:05:41 +00:00
Andrey A. Chernov
7768950fe3 POSIXed strtoll() (and ours one too) can set errno to EINVAL, so check
it first.

Approved by:    andre
2006-03-14 19:53:03 +00:00
Andre Oppermann
b0b2326781 Fix HISTORY and point to OpenBSD. 2006-03-14 17:01:21 +00:00
Andre Oppermann
c74dfa2faf Import of OpenBSD's strtonum(3) which is a nicer version of strtoll(3)
providing proper error checking and other improvements.

Obtained from:	OpenBSD
Requested by:	flz (to port Open[BGP|OSPF]D)
MFC after:	3 days
2006-03-14 16:57:30 +00:00
Daniel Eischen
6fad3aaf15 Add each directory's symbol map file to SYM_MAPS. 2006-03-13 01:15:01 +00:00
Daniel Eischen
cce72e8860 Add symbol maps and initial symbol version definitions to libc.
Reviewed by:	davidxu
2006-03-13 00:53:21 +00:00
Wojciech A. Koszek
9d0e4617f3 Fix typo in manual page reference.
Approved by:	cognet (mentor)
MFC after:	3 days
2006-02-26 23:01:11 +00:00
Alexander Kabaev
129d4752a0 Remove extra slash from pty slave device name returned by ptsname. 2006-02-13 00:04:04 +00:00
Jason Evans
d8a1377b1b Fix calculation of the number of arenas to use on multi-processor systems. 2006-02-04 01:11:30 +00:00
Joel Dahl
fbf9b468d5 Expand contractions. 2006-02-01 14:33:14 +00:00
Olivier Houchard
9b1fa2482e If the sysctl kern.pts.enable doesn't exist, check that /dev/ptmx is there,
and if so, use the pts system.

Suggested by:	rwatson
2006-01-29 00:02:57 +00:00
Jason Evans
4fae5e8fda Remove unwarranted uses of 'goto'. 2006-01-27 07:46:22 +00:00
Jason Evans
a3d0ab47a6 Add NO_MALLOC_EXTRAS, so that various extra features that can cause
performance degradation can be disabled via something like the following
in /etc/malloc.conf:

	CFLAGS+=-DNO_MALLOC_EXTRAS

Suggested by:	deischen
2006-01-27 04:42:10 +00:00
Jason Evans
7138ef5b1d Fix the type of a statistics counter (unsigned --> unsigned long). 2006-01-27 04:36:39 +00:00
Jason Evans
842e5e3d91 Clean up statistics gathering and printing. 2006-01-27 02:36:44 +00:00
Jason Evans
499168546f Optimize arena_bin_pop() to reduce the number of separator operations.
Remove the block of code that tries to use delayed regions in LIFO order,
since from a policy perspective, it conflicts with LRU caching of newly
coalesced regions in arena_undelay().  There are numerous policy
alternatives, and it isn't readily obvious which (if any) is superior;
this change at least has the virtue of being consistent with policy.
2006-01-26 08:11:23 +00:00
Olivier Houchard
67c7201e18 ptsname() bits for pts. 2006-01-26 01:33:55 +00:00
Jason Evans
0653ddb655 Remove a redundant variable assignment in arena_reg_frag_alloc(). 2006-01-25 05:41:02 +00:00
Jason Evans
b97aec1d61 If no coalesced exact-fit small regions are available, but delayed exact-
fit regions are available, use the delayed regions in LIFO order, in order
to increase locality of reference.  We might expect this to cause delayed
regions to be removed from the delay ring buffer more often (since we're
now re-using more recently buffered regions), but numerous tests indicate
that the overall impact on memory usage tends to be good (reduced
fragmentation).

Re-work arena_frag_reg_alloc() so that when large free regions are
exhausted, it uses small regions in a way that favors contiguous allocation
of sequentially allocated small regions.  Use arena_frag_reg_alloc() in
this capacity, rather than directly attempting over-fitting of small
requests when no large regions are available.

Remove the bin overfit statistic, since it is no longer relevant due to
the arena_frag_reg_alloc() changes.

Do not specify arena_frag_reg_alloc() as an inline function.  It is too
large to benefit much from being inlined, and it is also called in two
places, only one of which is in the critical path (the other call bloated
arena_reg_alloc()).

Call arena_coalesce() for a region before caching it with
arena_mru_cache().

Add assertions that detect the attempted caching of adjacent free regions,
so that we notice this problem when it is first created, rather than in
arena_coalesce(), when it's too late to know how the problem arose.

Reported by:    Hans Blancke
2006-01-25 04:21:22 +00:00
Jason Evans
ad4e4c676f Make the 'C' and 'c' malloc options consistent with other options; 'C'
doubles the cache size, and 'c' halves the cache size.
2006-01-23 03:32:38 +00:00
Jason Evans
5531d7fdc6 In arena_chunk_reg_alloc(), try to avoid touching the last page in the
chunk during initialization, in order to avoid physically backing the
page unless data are allocated there.
2006-01-23 03:19:01 +00:00
Jason Evans
677bc78b39 Use uintptr_t rather than size_t when casting pointers to integers. Also,
fix the few remaining casting style(9) errors that remained after the
functional change.

Reported by:	jmallett
2006-01-20 03:11:11 +00:00
Jason Evans
5d11758a9f Revert addtion of assertions in revision 1.99. These assertions cause
problems in cases where regions are faked up for the purposes of red-black
tree searches, since those faked region headers reside on the stack, rather
than in a malloc chunk.
2006-01-19 19:20:42 +00:00
Jason Evans
ea41be77ba Add assertions that detect some forms of region separator corruption. 2006-01-19 19:08:11 +00:00
Jason Evans
a3bb22bc8e Remove loops in arena_coalesce(). They are no longer necessary, now that
internal allocation does not rely on recursive arena use (base_arena was
removed in revision 1.95).
2006-01-19 18:37:30 +00:00
Jason Evans
a4922fdaf5 Make all internal variables and functions static.
Reported by:	ache
2006-01-19 07:23:13 +00:00
Jason Evans
2addd81287 Return NULL if there is an OOM error during initialization, rather than
allowing the error to be fatal.

Move a label in order to make sure to properly handle errors in malloc(0).

Reported by:	Alastair D'Silva, Saneto Takanori
2006-01-19 02:11:05 +00:00
Jason Evans
3842ca4db5 Add a separate simple internal base allocator and remove base_arena, so that
there is never any need to recursively call the main allocation functions.

Remove recursive spinlock support, since it is no longer needed.

Allow chunks to be as small as the page size.

Correctly propagate OOM errors from arena_new().
2006-01-16 05:13:49 +00:00
Marcel Moolenaar
707ca316b6 Define NO_TLS on ia64. The dynamic TLS implementation on ia64 is
broken for non-threaded shared processes in that __tls_get_addr()
assumes the thread pointer is always initialized. This is not the
case. When arenas_map is referenced in choose_arena() and it is
defined as a thread-local variable, it will result in a SIGSEGV.

PR: ia64/91846 (describes the TLS/ia64 bug).
2006-01-16 00:32:46 +00:00
Jason Evans
24b6d11c34 Replace malloc(), calloc(), posix_memalign(), realloc(), and free() with
a scalable concurrent allocator implementation.

Reviewed by:	current@
Approved by:	phk, markm (mentor)
2006-01-13 18:38:56 +00:00
Jason Evans
352219015d Fix a bitwise logic error in posix_memalign().
Reported by:	glebius
2006-01-12 18:09:25 +00:00
Jason Evans
52828c0e9c In preparation for a new malloc implementation:
* Add posix_memalign().

  * Move calloc() from calloc.c to malloc.c.  Add a calloc() implementation in
    rtld-elf in order to make the loader happy (even though calloc() isn't
    used in rtld-elf).

  * Add _malloc_prefork() and _malloc_postfork(), and use them instead of
    directly manipulating __malloc_lock.

Approved by:	phk, markm (mentor)
2006-01-12 07:28:21 +00:00
Tom Rhodes
257551c6a0 Add a64l(), l64a(), and l64a_r() XSI extentions. These functions convert
between a 32-bit integer and a radix-64 ASCII string.  The l64a_r() function
is a NetBSD addition.

PR:		51209 (based on submission, but very different)
Reviewed by:	bde, ru
2005-12-24 22:37:59 +00:00
Ruslan Ermilov
4ca0505435 Fix prototype. 2005-11-23 20:34:37 +00:00
Stefan Farfeleder
613100918d Include a couple of headers to ensure consistency between the prototype and
the function definition.
2005-09-12 19:52:42 +00:00
Stefan Farfeleder
2ba64027bc Move the declaration of __cleanup to libc_private.h as it is used in both
stdio/ and stdlib/.  Don't define __cleanup twice.
2005-09-12 13:46:32 +00:00
Joe Marcus Clarke
a617a18a23 Fix ptsname(3) by converting it to use devname(3) to obtain the name of
a tty device instead of the legacy minor number approach.  This is known to
fix gnome-vfs' sftp module as well as kio_sftp and kdesu on -CURRENT.

Thanks to scottl for the snprintf() approach idea.

Reviewed by:	phk
Tested by:	pav
		mich
Approved by:	re (scottl)
2005-07-07 17:48:40 +00:00
Brian Feldman
91320d17cc Do not require the pty(4) majors to be anything in particular. 2005-03-04 20:23:32 +00:00
Xin LI
2dcb9ce484 Remove the check about whether MALLOC_EXTRA_SANITY is defined,
surrounding the undef'ing it.  It does not seem necessary to
undef some symbol that is not exist, and gcc does not complain
about whether a symbol is exist before #undef'ing it out.

Spotted by:	mingyanguo via ChinaUnix.net forum
Reviewed by:	phk
2005-02-27 17:16:16 +00:00
Andrey A. Chernov
d308373710 Especially mention that setting errno to EINVAL in "no conversion" case
is not portable.

Asked by:       joerg
2005-01-22 18:02:58 +00:00