Commit Graph

613 Commits

Author SHA1 Message Date
jasone
f4190c00f8 Use PAGE_{SIZE,MASK,SHIFT} from machine/param.h rather than hard-coding
page size and using sysconf(3).

Suggested by:	marcel
2008-09-10 14:27:34 +00:00
marcel
af87cdf28e Unbreak ia64: pges are 8KB. 2008-09-06 05:26:31 +00:00
jasone
a734052e9c Add thread-specific caching for small size classes, based on magazines.
This caching allows for completely lock-free allocation/deallocation in the
steady state, at the expense of likely increased memory use and
fragmentation.

Reduce the default number of arenas to 2*ncpus, since thread-specific
caching typically reduces arena contention.

Modify size class spacing to include ranges of 2^n-spaced, quantum-spaced,
cacheline-spaced, and subpage-spaced size classes.  The advantages are:
fewer size classes, reduced false cacheline sharing, and reduced internal
fragmentation for allocations that are slightly over 512, 1024, etc.

Increase RUN_MAX_SMALL, in order to limit fragmentation for the
subpage-spaced size classes.

Add a size-->bin lookup table for small sizes to simplify translating sizes
to size classes.  Include a hard-coded constant table that is used unless
custom size class spacing is specified at run time.

Add the ability to disable tiny size classes at compile time via
MALLOC_TINY.
2008-08-27 02:00:53 +00:00
ed
f827c5e9cb Remove grantpt.c, which should have been deleted in the MPSAFE TTY commit.
The routines in grantpt.c have been moved to ptsname.c in the MPSAFE TTY
layer, because grantpt() is now effectively a no-op. I forgot to remove
the corresponding source file from libc.
2008-08-20 09:43:46 +00:00
ed
cc3116a938 Integrate the new MPSAFE TTY layer to the FreeBSD operating system.
The last half year I've been working on a replacement TTY layer for the
FreeBSD kernel. The new TTY layer was designed to improve the following:

- Improved driver model:

  The old TTY layer has a driver model that is not abstract enough to
  make it friendly to use. A good example is the output path, where the
  device drivers directly access the output buffers. This means that an
  in-kernel PPP implementation must always convert network buffers into
  TTY buffers.

  If a PPP implementation would be built on top of the new TTY layer
  (still needs a hooks layer, though), it would allow the PPP
  implementation to directly hand the data to the TTY driver.

- Improved hotplugging:

  With the old TTY layer, it isn't entirely safe to destroy TTY's from
  the system. This implementation has a two-step destructing design,
  where the driver first abandons the TTY. After all threads have left
  the TTY, the TTY layer calls a routine in the driver, which can be
  used to free resources (unit numbers, etc).

  The pts(4) driver also implements this feature, which means
  posix_openpt() will now return PTY's that are created on the fly.

- Improved performance:

  One of the major improvements is the per-TTY mutex, which is expected
  to improve scalability when compared to the old Giant locking.
  Another change is the unbuffered copying to userspace, which is both
  used on TTY device nodes and PTY masters.

Upgrading should be quite straightforward. Unlike previous versions,
existing kernel configuration files do not need to be changed, except
when they reference device drivers that are listed in UPDATING.

Obtained from:		//depot/projects/mpsafetty/...
Approved by:		philip (ex-mentor)
Discussed:		on the lists, at BSDCan, at the DevSummit
Sponsored by:		Snow B.V., the Netherlands
dcons(4) fixed by:	kan
2008-08-20 08:31:58 +00:00
jasone
463ec8e02d Move CPU_SPINWAIT into the innermost spin loop, in order to allow faster
preemption while busy-waiting.

Submitted by:	Mike Schuster <schuster@adobe.com>
2008-08-14 17:31:42 +00:00
jasone
db987f53e4 Re-order the terms of an expression in arena_run_reg_dalloc() to correctly
detect whether the integer division table is large enough to handle the
divisor.  Before this change, the last two table elements were never used,
thus causing the slow path to be used for those divisors.
2008-08-14 17:03:29 +00:00
cperciva
0b041e8f96 Remove variables which are assigned values and never used thereafter.
Found by:	LLVM/Clang Static Checker
Approved by:	jasone
2008-08-08 20:42:42 +00:00
scf
d2bab1788d Restructure and use different variables in the tests that involve
environ[0] to be more obvious that environ is not NULL before environ[0]
is tested.  Although I believe the previous code worked, this change
improves code maintainability.

Reviewed by:	ache
MFC after:	3 days
2008-08-03 22:47:23 +00:00
scf
e0b5c971c2 Detect if the application has cleared the environ variable by setting
the first value (environ[0]) to NULL.  This is in addition to the
current detection of environ being replaced, which includes being set to
NULL.  Without this fix, the environment is not truly wiped, but appears
to be by getenv() until an *env() call is made to alter the enviroment.

This change is necessary to support those applications that use this
method for clearing environ such as Dovecot and Postfix.  Applications
such as Sendmail and the base system's env replace environ (already
detected).  While neither of these methods are defined by SUSv3, it is
best to support them due to historic reasons and in lieu of a clean,
defined method.

Add extra units tests for clearing environ using four different methods:
1. Set environ to NULL pointer.
2. Set environ[0] to NULL pointer.
3. Set environ to calloc()'d NULL-terminated array.
4. Set environ to static NULL-terminated array.

Noticed by:	Timo Sirainen

MFC after:	3 days
2008-08-02 02:34:35 +00:00
jasone
1e148a8be3 Enhance arena_chunk_map_t to directly support run coalescing, and use
the chunk map instead of red-black trees where possible.  Remove the
red-black trees and node objects that are obsoleted by this change.  The
net result is a ~1-2% memory savings, and a substantial allocation speed
improvement.
2008-07-18 19:35:44 +00:00
danger
64ff5656f8 - This code was intially obtained from NetBSD, but it's missing licence
statement. Add the one from the current NetBSD version.
- Also bump a date to reflect my content changes I have done in previous
  revision

Approved by:	imp
MFC after:	3 days
2008-07-06 17:03:37 +00:00
danger
518dda6b8f - Add description about a missing return value
PR:		docs/75995
Submitted by:	Tarc <tarc@po.cs.msu.su>
MFC after:	3 days
2008-07-06 12:17:53 +00:00
danger
3fba957487 - remove superfluous word
- remove contractions

MFC after:	3 days
2008-07-06 11:31:20 +00:00
danger
ca3c788812 Mark the section describing return values with an appropriate section flag.
PR:		docs/122818
MFC after:	3 days
2008-06-26 08:24:59 +00:00
ed
9e714aa901 Don't export the unused __use_pts() routine.
The __use_pts() routine was once probably used by libutil to determine
if we are using BSD or UNIX98 style PTY device names. It doesn't seem to
be used outside grantpt.c, which means we can make it static and remove
it from the Symbol.map.

Reviewed by:	cognet, kib
Approved by:	philip (mentor)
2008-06-17 14:05:03 +00:00
jasone
ab071c304d In the error path through base_alloc(), release base_mtx [1].
Fix bit vector initialization for run headers.

Submitted by:	[1] Mike Schuster <schuster@adobe.com>
2008-06-10 15:46:18 +00:00
jasone
1869e07b5f Clean up cpp logic and comments. 2008-05-14 18:33:13 +00:00
jasone
d859583ec5 Fix a comment. 2008-05-03 17:49:16 +00:00
jasone
8482ff3aff Add a separate tree to track arena chunks that contain dirty pages.
This substantially improves worst case allocation performance, since
O(lg n) tree search can be used instead of O(n) tree iteration.

Use rb_wrap() instead of directly calling rb_*() macros.
2008-05-01 17:25:55 +00:00
jasone
57d95000d8 Add rb_wrap(), which creates C function wrappers for most rb_*()
macros.

Add rb_foreach_next() and rb_foreach_reverse_prev(), which make it
possible to re-synchronize tree iteration after the tree has been
modified.

Rename rb_tree_new() to rb_new().
2008-05-01 17:24:37 +00:00
gonzo
1a49460b16 Set QUANTUM_2POW_MIN and SIZEOF_PTR_2POW parameters for MIPS
Approved by: imp
2008-04-29 22:56:05 +00:00
jasone
138e8f0fdc Check for integer overflow before calling sbrk(2), since it uses a
signed increment argument, but the size is an unsigned integer.
2008-04-29 01:32:42 +00:00
ru
c17c108c2a Stricter check for integer overflow. 2008-04-24 07:49:00 +00:00
jasone
14c9906df5 Implement red-black trees without using parent pointers, and store the
color bit in the least significant bit of the right child pointer, in
order to reduce red-black tree linkage overhead by ~2X as compared to
sys/tree.h.

Use the new red-black tree implementation in malloc, which drops
memory usage by ~0.5 or ~1%, for 32- and 64-bit systems, respectively.
2008-04-23 16:09:18 +00:00
ru
27e53e835a Don't forget to free() currency_symbol and asciivalue when multiple
conversion specifiers for them are present.

Submitted by:	Maxim Dounin <mdounin@mdounin.ru>
Obtained from:	NetBSD (partially)
MFC after:	3 days
2008-04-19 07:22:58 +00:00
ru
b15775c48a Better strfmon(3) conversion specifiers sanity checking.
There were no checks for left and right precisions at all, and
a check for field width had integer overflow bug.

Reported by:	Maksymilian Arciemowicz
Security:	http://securityreason.com/achievement_securityalert/53
Submitted by:	Maxim Dounin <mdounin@mdounin.ru>
MFC after:	3 days
2008-04-19 07:18:22 +00:00
delphij
b612c0d58d Use calloc() instaed of zeroing memory ourselves. 2008-04-13 08:05:08 +00:00
jasone
423cb10cb4 Remove stale #include <machine/atomic.h>, which as needed by lazy
deallocation.
2008-03-07 16:54:03 +00:00
scf
7ee4756ce9 Replace the use of warnx() with direct output to stderr using _write().
This reduces the size of a statically-linked binary by approximately 100KB
in a trivial "return (0)" test application.  readelf -S was used to verify
that the .text section was reduced and that using strlen() saved a few
more bytes over using sizeof().  Since the section of code is only called
when environ is corrupt (program bug), I went with fewer bytes over fewer
cycles.

I made minor edits to the submitted patch to make the output resemble
warnx().

Submitted by:	kib bz
Approved by:	wes (mentor)
MFC after:	5 days
2008-02-28 04:09:08 +00:00
jasone
2bc29a1530 Fix a race condition in arena_ralloc() for shrinking in-place large
reallocation, when junk filling is enabled.  Junk filling must occur
prior to shrinking, since any deallocated trailing pages are immediately
available for use by other threads.

Reported by:	Mats Palmgren <mats.palmgren@bredband.net>
2008-02-17 18:34:17 +00:00
jasone
b08b976e68 Remove support for lazy deallocation. Benchmarks across a wide range of
allocation patterns, number of CPUs, and MALLOC_OPTIONS settings indicate
that lazy deallocation has the potential to worsen throughput dramatically.
Performance degradation occurs when multiple threads try to clear the lazy
free cache simultaneously.  Various experiments to avoid this bottleneck
failed to completely solve this problem, while adding yet more complexity.
2008-02-17 17:09:24 +00:00
jasone
f6ce9fe601 Fix a bug in lazy deallocation that was introduced when
arena_dalloc_lazy_hard() was split out of arena_dalloc_lazy() in revision
1.162.

Reduce thundering herd problems in lazy deallocation by randomly varying
how many probes a thread does before taking the slow path.
2008-02-08 08:02:34 +00:00
jasone
c614695539 Clean up manipulation of chunk page map elements to remove some tenuous
assumptions about whether bits are set at various times.  This makes
adding other flags safe.

Reorganize functions in order to inline i{m,c,p,s,re}alloc().  This
allows the entire fast-path call chains for malloc() and free() to be
inlined. [1]

Suggested by:	[1] Stuart Parmenter <stuart@mozilla.com>
2008-02-08 00:35:56 +00:00
jasone
44c343f8fa Track dirty unused pages so that they can be purged if they exceed a
threshold, according to the 'F' MALLOC_OPTIONS flag.  This obsoletes the
'H' flag.

Try to realloc() large objects in place.  This substantially speeds up
incremental large reallocations in the common case.

Fix a bug in arena_ralloc() that caused relocation of sub-page objects
even if the old and new sizes were in the same size class.

Maintain trees of runs and simplify the per-chunk page map.  This allows
logarithmic-time searching for sufficiently large runs in
arena_run_alloc(), whereas the previous algorithm required linear time
in the worst case.

Break various large functions into smaller sub-functions, and inline
only the functions that are in the fast path for small object
allocation/deallocation.

Remove an unnecessary check in base_pages_alloc_mmap().

Avoid integer division in choose_arena() for the NO_TLS case on
single-CPU systems.
2008-02-06 02:59:54 +00:00
jhb
184b0a421c Remove some now-unused macros.
MFC after:	1 week
2008-01-15 18:55:52 +00:00
jhb
c02890da0b Put back the openpty(3) and ptsname(3) fixes but don't disable ptsname(3)
on pts(4) devices this time.  This fixes the issues while leaving pts(4)
enabled on HEAD.
2008-01-15 15:36:23 +00:00
cperciva
2f49f42d98 Back out last commit, since it accidentally broke pts.
The security fix will be re-committed soon, hopefully without breaking
anything.
2008-01-15 13:59:13 +00:00
cperciva
533f13b8b2 Fix issues which allow snooping on ptys. [08:01]
Fix an off-by-one error in inet_network(3). [08:02]

Security: FreeBSD-SA-08:01.pty
Security: FreeBSD-SA-08:02.libc
2008-01-14 22:56:05 +00:00
das
00c36da743 Changing 'r' to a size_t in the previous commit turned quicksort
into slowsort for some sequences because different parts of the
code used 'r' to store two different things, one of which was
signed. Clean things up by splitting 'r' into two variables, and
use a more meaningful name.
2008-01-14 09:21:34 +00:00
das
1daf1db8d4 Use size_t to avoid overflow when sorting arrays larger than 2 GB.
PR:		111085
MFC after:	2 weeks
2008-01-13 02:11:10 +00:00
jasone
6aea4f4c16 Enable both sbrk(2)- and mmap(2)-based memory acquisition methods by
default.  This has the disadvantage of rendering the datasize resource
limit irrelevant, but without this change, legitimate uses of more
memory than will fit in the data segment are thwarted by default.

Fix chunk_alloc_mmap() to work correctly if initial mapping is not
chunk-aligned and mapping extension fails.
2008-01-03 23:22:13 +00:00
jasone
573b21b457 Fix a major chunk-related memory leak in chunk_dealloc_dss_record(). [1]
Clean up DSS-related locking and protect all pertinent variables with
dss_mtx (remove dss_chunks_mtx).  This fixes race conditions that could
cause chunk leaks.

Reported by:	[1] kris
2007-12-31 06:19:48 +00:00
jasone
1f55b95c0b Fix a bug related to sbrk() calls that could cause address space leaks.
This is a long-standing bug, but until recent changes it was difficult
to trigger, and even then its impact was non-catastrophic, with the
exception of revision 1.157.

Optimize chunk_alloc_mmap() to avoid the need for unmapping pages in the
common case.  Thanks go to Kris Kennaway for a patch that inspired this
change.

Do not maintain a record of previously mmap'ed chunk address ranges.
The original intent was to avoid the extra system call overhead in
chunk_alloc_mmap(), which is no longer a concern.  This also allows some
simplifications for the tree of unused DSS chunks.

Introduce huge_mtx and dss_chunks_mtx to replace chunks_mtx.  There was
no compelling reason to use the same mutex for these disjoint purposes.

Avoid memset() for huge allocations when possible.

Maintain two trees instead of one for tracking unused DSS address
ranges.  This allows scalable allocation of multi-chunk huge objects in
the DSS.  Previously, multi-chunk huge allocation requests failed if the
DSS could not be extended.
2007-12-31 00:59:16 +00:00
jasone
138c627354 Back out premature commit of previous version. 2007-12-28 09:21:12 +00:00
jasone
c907683663 Maintain two trees instead of one (old_chunks --> old_chunks_{ad,szad}) in
order to support re-use of multi-chunk unused regions within the DSS for
huge allocations.  This generalization is important to correct function
when mmap-based allocation is disabled.

Avoid zeroing re-used memory in the DSS unless it really needs to be
zeroed.
2007-12-28 07:24:19 +00:00
jasone
93c7f7517b Release chunks_mtx for all paths through chunk_dealloc().
Reported by:	kris
2007-12-28 02:15:08 +00:00
jasone
15ff969441 Add the 'D' and 'M' run time options, and use them to control whether
memory is acquired from the system via sbrk(2) and/or mmap(2).  By default,
use sbrk(2) only, in order to support traditional use of resource limits.
Additionally, when both options are enabled, prefer the data segment to
anonymous mappings, in order to coexist better with large file mappings
in applications on 32-bit platforms.  This change has the potential to
increase memory fragmentation due to the linear nature of the data
segment, but from a performance perspective this is mitigated by the use
of madvise(2). [1]

Add the ability to interpret integer prefixes in MALLOC_OPTIONS
processing.  For example, MALLOC_OPTIONS=lllllllll can now be specified as
MALLOC_OPTIONS=9l.

Reported by:	[1] rwatson
Design review:	[1] alc, peter, rwatson
2007-12-27 23:29:44 +00:00
jhb
590aeb53cc Clean up some of the pts(4) vs pty(4) stuff in grantpt(3) and friends:
- Use PTY* for all pty(4) related constants.
- Use PTMX* for all pts(4) related constants.
- Consistently use _PATH_DEV PTMX rather than "/dev/ptmx".
- Revert 1.7 and properly fix it by using the correct prefix string for
  pts(4) masters.

MFC after:	3 days
2007-12-21 21:26:08 +00:00
jasone
b720912697 Use fixed point integer math instead of floating point math when
calculating run sizes.  Use of the floating point unit was a potential
pessimization to context switching for applications that do not otherwise
use floating point math. [1]

Reformat cpp macro-related comments to improve consistency.

Submitted by:	das
2007-12-18 05:27:57 +00:00