Commit Graph

132 Commits

Author SHA1 Message Date
Alan Cox
e2997fea72 Simplify the invocation of vm_fault(). Specifically, eliminate the flag
VM_FAULT_DIRTY.  The information provided by this flag can be trivially
inferred by vm_fault().

Discussed with:	kib
2009-11-27 20:24:11 +00:00
Alan Cox
2db65ab46e Simplify both the invocation and the implementation of vm_fault() for wiring
pages.

(Note: Claims made in the comments about the handling of breakpoints in
wired pages have been false for roughly a decade.  This and another bug
involving breakpoints will be fixed in coming changes.)

Reviewed by:	kib
2009-11-18 18:05:54 +00:00
Konstantin Belousov
3364c323e6 Implement global and per-uid accounting of the anonymous memory. Add
rlimit RLIMIT_SWAP that limits the amount of swap that may be reserved
for the uid.

The accounting information (charge) is associated with either map entry,
or vm object backing the entry, assuming the object is the first one
in the shadow chain and entry does not require COW. Charge is moved
from entry to object on allocation of the object, e.g. during the mmap,
assuming the object is allocated, or on the first page fault on the
entry. It moves back to the entry on forks due to COW setup.

The per-entry granularity of accounting makes the charge process fair
for processes that change uid during lifetime, and decrements charge
for proper uid when region is unmapped.

The interface of vm_pager_allocate(9) is extended by adding struct ucred *,
that is used to charge appropriate uid when allocation if performed by
kernel, e.g. md(4).

Several syscalls, among them is fork(2), may now return ENOMEM when
global or per-uid limits are enforced.

In collaboration with:	pho
Reviewed by:	alc
Approved by:	re (kensmith)
2009-06-23 20:45:22 +00:00
Konstantin Belousov
6d7e809123 When vm_map_wire(9) is allowed to skip holes in the wired region, skip
the mappings without any of read and execution rights, in particular,
the PROT_NONE entries. This makes mlockall(2) work for the process
address space that has such mappings.

Since protection mode of the entry may change between setting
MAP_ENTRY_IN_TRANSITION and final pass over the region that records
the wire status of the entries, allocate new map entry flag
MAP_ENTRY_WIRE_SKIPPED to mark the skipped PROT_NONE entries.

Reported and tested by:	Hans Ottevanger <fbsdhackers beasties demon nl>
Reviewed by:	alc
MFC after:	3 weeks
2009-04-10 10:16:03 +00:00
Konstantin Belousov
655c349022 Revert the addition of the freelist argument for the vm_map_delete()
function, done in r188334. Instead, collect the entries that shall be
freed, in the deferred_freelist member of the map. Automatically purge
the deferred freelist when map is unlocked.

Tested by:	pho
Reviewed by:	alc
2009-02-24 20:57:43 +00:00
Konstantin Belousov
897d81a020 Do not call vm_object_deallocate() from vm_map_delete(), because we
hold the map lock there, and might need the vnode lock for OBJT_VNODE
objects. Postpone object deallocation until caller of vm_map_delete()
drops the map lock. Link the map entries to be freed into the freelist,
that is released by the new helper function vm_map_entry_free_freelist().

Reviewed by:	tegge, alc
Tested by:	pho
2009-02-08 20:39:17 +00:00
Alan Cox
05a8c41419 Resurrect shared map locks allowing greater concurrency during some map
operations, such as page faults.

An earlier version of this change was ...

Reviewed by:	kib
Tested by:	pho
MFC after:	6 weeks
2009-01-01 00:31:46 +00:00
Alan Cox
e2abaaaa2b Update or eliminate some stale comments. 2008-12-31 05:44:05 +00:00
Alan Cox
26c538ffcd Generalize vm_map_find(9)'s parameter "find_space". Specifically, add
support for VMFS_ALIGNED_SPACE, which requests the allocation of an
address range best suited to superpages.  The old options TRUE and FALSE
are mapped to VMFS_ANY_SPACE and VMFS_NO_SPACE, so that there is no
immediate need to update all of vm_map_find(9)'s callers.

While I'm here, correct a misstatement about vm_map_find(9)'s return
values in the man page.
2008-05-10 18:55:35 +00:00
Alan Cox
b8ca4ef2e3 vm_map_fixed(), unlike vm_map_find(), does not update "addr", so it can be
passed by value.
2008-04-28 05:30:23 +00:00
Marcel Moolenaar
8775db6f50 Make the vm_pmap field of struct vmspace the last field in the
structure. This allows per-CPU variations of struct pmap on a
single architecture without affecting the machine-independent
fields. As such, the PMAP variations don't affect the ABI. They
become part of it.
2008-03-01 22:54:42 +00:00
Pawel Jakub Dawidek
8ce2d00a04 Change unused 'user_wait' argument to 'timo' argument, which will be
used to specify timeout for msleep(9).

Discussed with:	alc
Reviewed by:	alc
2007-11-07 21:56:58 +00:00
Konstantin Belousov
d239bd3ccc Do not drop vm_map lock between doing vm_map_remove() and vm_map_insert().
For this, introduce vm_map_fixed() that does that for MAP_FIXED case.

Dropping the lock allowed for parallel thread to occupy the freed space.

Reported by:	Tijl Coosemans <tijl ulyssis org>
Reviewed by:	alc
Approved by:	re (kensmith)
MFC after:	2 weeks
2007-08-20 12:05:45 +00:00
Tor Egge
57051fdc4b Close race between vmspace_exitfree() and exit1() and races between
vmspace_exitfree() and vmspace_free() which could result in the same
vmspace being freed twice.

Factor out part of exit1() into new function vmspace_exit().  Attach
to vmspace0 to allow old vmspace to be freed earlier.

Add new function, vmspace_acquire_ref(), for obtaining a vmspace
reference for a vmspace belonging to another process.  Avoid changing
vmspace refcount from 0 to 1 since that could also lead to the same
vmspace being freed twice.

Change vmtotal() and swapout_procs() to use vmspace_acquire_ref().

Reviewed by:	alc
2006-05-29 21:28:56 +00:00
Alan Cox
51016cdfd7 Eliminate unneeded preallocation at initialization.
Reviewed by: tegge
2005-12-03 22:41:15 +00:00
Warner Losh
60727d8b86 /* -> /*- for license, minor formatting changes 2005-01-07 02:29:27 +00:00
Alan Cox
0164e05781 Replace the linear search in vm_map_findspace() with an O(log n)
algorithm built into the map entry splay tree.  This replaces the
first_free hint in struct vm_map with two fields in vm_map_entry:
adj_free, the amount of free space following a map entry, and
max_free, the maximum amount of free space in the entry's subtree.
These fields make it possible to find a first-fit free region of a
given size in one pass down the tree, so O(log n) amortized using
splay trees.

This significantly reduces the overhead in vm_map_findspace() for
applications that mmap() many hundreds or thousands of regions, and
has a negligible slowdown (0.1%) on buildworld.  See, for example, the
discussion of a micro-benchmark titled "Some mmap observations
compared to Linux 2.6/OpenBSD" on -hackers in late October 2003.

OpenBSD adopted this approach in March 2002, and NetBSD added it in
November 2003, both with Red-Black trees.

Submitted by: Mark W. Krentel
2004-08-13 08:06:34 +00:00
Tor Egge
19dc560756 The vm map lock is needed in vm_fault() after the page has been found,
to avoid later changes before pmap_enter() and vm_fault_prefault()
has completed.

Simplify deadlock avoidance by not blocking on vm map relookup.

In collaboration with: alc
2004-08-12 20:14:49 +00:00
Brian Feldman
9689d5e5ee Revamp VM map wiring.
* Allow no-fault wiring/unwiring to succeed for consistency;
  however, the wired count remains at zero, so it's a special case.

* Fix issues inside vm_map_wire() and vm_map_unwire() where the
  exact state of user wiring (one or zero) and system wiring
  (zero or more) could be confused; for example, system unwiring
  could succeed in removing a user wire, instead of being an
  error.

* Require all mappings to be unwired before they are deleted.
  When VM space is still wired upon deletion, it will be waited
  upon for the following unwire.  This makes vslock(9) work
  rather than allowing kernel-locked memory to be deleted
  out from underneath of its consumer as it would before.
2004-08-09 19:52:29 +00:00
Maxime Henrion
12c649749c Get rid of another lockmgr(9) consumer by using sx locks for the user
maps.  We always acquire the sx lock exclusively here, but we can't
use a mutex because we want to be able to sleep while holding the
lock.  This is completely equivalent to what we were doing with the
lockmgr(9) locks before.

Approved by:	alc
2004-07-30 09:10:28 +00:00
Alan Cox
51ab6c2890 Simplify vmspace initialization. The bcopy() of fields from the old
vmspace to the new vmspace in vmspace_exec() is mostly wasted effort.  With
one exception, vm_swrss, the copied fields are immediately overwritten.
Instead, initialize these fields to zero in vmspace_alloc(), eliminating a
bcopy() from vmspace_exec() and a bzero() from vmspace_fork().
2004-07-24 07:40:35 +00:00
Alan Cox
fd2d354908 Micro-optimize vmspace for 64-bit architectures: Colocate vm_refcnt and
vm_exitingcnt so that alignment does not result in wasted space.
2004-07-06 17:35:10 +00:00
Alan Cox
d9dd6bfb56 Remove an unused field from the vmspace structure. 2004-06-26 19:16:35 +00:00
Alan Cox
4da4d293df In cases where a file was resident in memory mmap(..., PROT_NONE, ...)
would actually map the file with read access enabled.  According to
http://www.opengroup.org/onlinepubs/007904975/functions/mmap.html this is
an error.  Similarly, an madvise(..., MADV_WILLNEED) would enable read
access on a virtual address range that was PROT_NONE.

The solution implemented herein is (1) to pass a vm_prot_t to
vm_map_pmap_enter() describing the allowed access and (2) to make
vm_map_pmap_enter() responsible for understanding the limitations of
pmap_enter_quick().

Submitted by:	"Mark W. Krentel" <krentel@dreamscape.com>
PR:		kern/64573
2004-04-24 03:46:44 +00:00
Warner Losh
05eb3785e7 Remove advertising clause from University of California Regent's license,
per letter dated July 22, 1999.

Approved by: core
2004-04-06 20:15:37 +00:00
Peter Wemm
2965c04576 Part 2 of rev 1.68. Update comment to match reality now that vm_endcopy
exists and we no longer copy to the end of the struct.

Forgotten by:  alfred and green
2004-03-12 00:16:48 +00:00
Alan Cox
950f8459d4 - Rename vm_map_clean() to vm_map_sync(). This better reflects the fact
that msync(2) is its only caller.
 - Migrate the parts of the old vm_map_clean() that examined the internals
   of a vm object to a new function vm_object_sync() that is implemented in
   vm_object.c.  At the same, introduce the necessary vm object locking so
   that vm_map_sync() and vm_object_sync() can be called without Giant.

Reviewed by:	tegge
2003-11-09 05:25:35 +00:00
Dag-Erling Smørgrav
a86fa82659 Whitespace cleanup. 2003-11-03 16:14:45 +00:00
Bruce M Simpson
2bc7dd5661 Move pmap_resident_count() from the MD pmap.h to the MI pmap.h.
Add a definition of pmap_wired_count().
Add a definition of vmspace_wired_count().

Reviewed by:	truckman
Discussed with:	peter
2003-10-06 01:47:12 +00:00
Marcel Moolenaar
fd75d71049 Part 2 of implementing rstacks: add the ability to create rstacks and
use the ability on ia64 to map the register stack. The orientation of
the stack (i.e. its grow direction) is passed to vm_map_stack() in the
overloaded cow argument. Since the grow direction is represented by
bits, it is possible and allowed to create bi-directional stacks.
This is not an advertised feature, more of a side-effect.

Fix a bug in vm_map_growstack() that's specific to rstacks and which
we could only find by having the ability to create rstacks: when
the mapped stack ends at the faulting address, we have not actually
mapped the faulting address. we need to include or cover the faulting
address.

Note that at this time mmap(2) has not been extended to allow the
creation of rstacks by processes. If such a need arises, this can
be done.

Tested on: alpha, i386, ia64, sparc64
2003-09-27 22:28:14 +00:00
Marcel Moolenaar
b21a0008ba Introduce MAP_ENTRY_GROWS_DOWN and MAP_ENTRY_GROWS_UP to allow for
growable (stack) entries that not only grow down, but also grow up.
Have vm_map_growstack() take these flags into account when growing
an entry.

This is the first step in adding support for upward growable stacks.
It is a required feature on ia64 to support the register stack (or
rstack as I like to call it -- it also means reverse stack). We do
not currently create rstacks, so the upward growing is not exercised
and the change should be a functional no-op.

Reviewed by: alc
2003-08-30 21:25:23 +00:00
Alan Cox
46add12552 Reduce the size of the vm map (and by inclusion the vm space) on 64-bit
architectures by moving a field within the structure.
2003-08-13 03:13:22 +00:00
Bruce M Simpson
abd498aa71 Add the mlockall() and munlockall() system calls.
- All those diffs to syscalls.master for each architecture *are*
   necessary. This needed clarification; the stub code generation for
   mlockall() was disabled, which would prevent applications from
   linking to this API (suggested by mux)
 - Giant has been quoshed. It is no longer held by the code, as
   the required locking has been pushed down within vm_map.c.
 - Callers must specify VM_MAP_WIRE_HOLESOK or VM_MAP_WIRE_NOHOLES
   to express their intention explicitly.
 - Inspected at the vmstat, top and vm pager sysctl stats level.
   Paging-in activity is occurring correctly, using a test harness.
 - The RES size for a process may appear to be greater than its SIZE.
   This is believed to be due to mappings of the same shared library
   page being wired twice. Further exploration is needed.
 - Believed to back out of allocations and locks correctly
   (tested with WITNESS, MUTEX_PROFILING, INVARIANTS and DIAGNOSTIC).

PR:             kern/43426, standards/54223
Reviewed by:    jake, alc
Approved by:    jake (mentor)
MFC after:	2 weeks
2003-08-11 07:14:08 +00:00
Alan Cox
0551c08dee Introduce vm_map_pmap_enter(). Presently, this is a stub calling the MD
pmap_object_init_pt().
2003-06-29 23:32:55 +00:00
Alan Cox
bf5f21b622 Remove an unnecessary forward declaration. 2003-06-15 07:28:33 +00:00
David Schultz
72d97679ff - When the VM daemon is out of swap space and looking for a
process to kill, don't block on a map lock while holding the
  process lock.  Instead, skip processes whose map locks are held
  and find something else to kill.
- Add vm_map_trylock_read() to support the above.

Reviewed by:	alc, mike (mentor)
2003-03-12 23:13:16 +00:00
Alan Cox
09c80124a3 Remove ENABLE_VFS_IOOPT. It is a long unfinished work-in-progress.
Discussed on:	arch@
2003-03-06 03:41:02 +00:00
Alan Cox
ea0081b61e Add a needed #include.
Reported by:	ia64 tinderbox
2003-01-01 00:13:01 +00:00
Alan Cox
36daaecd04 Implement a variant locking scheme for vm maps: Access to system maps
is now synchronized by a mutex, whereas access to user maps is still
synchronized by a lockmgr()-based lock.  Why?  No single type of lock,
including sx locks, meets the requirements of both types of vm map.
Sometimes we sleep while holding the lock on a user map.  Thus, a
a mutex isn't appropriate.  On the other hand, both lockmgr()-based
and sx locks release Giant when a thread/process blocks during
contention for a lock.  This could lead to a race condition in a legacy
driver (that relies on Giant for synchronization) if it attempts to
kmem_malloc() and fails to immediately obtain the lock.  Fortunately,
we never sleep while holding a system map lock.
2002-12-31 19:38:04 +00:00
Matthew Dillon
389d2b6e21 Fix a refcount race with the vmspace structure. In order to prevent
resource starvation we clean-up as much of the vmspace structure as we
can when the last process using it exits.  The rest of the structure
is cleaned up when it is reaped.  But since exit1() decrements the ref
count it is possible for a double-free to occur if someone else, such as
the process swapout code, references and then dereferences the structure.
Additionally, the final cleanup of the structure should not occur until
the last process referencing it is reaped.

This commit solves the problem by introducing a secondary reference count,
calling 'vm_exitingcnt'.  The normal reference count is decremented on exit
and vm_exitingcnt is incremented.  vm_exitingcnt is decremented when the
process is reaped.  When both vm_exitingcnt and vm_refcnt are 0, the
structure is freed for real.

MFC after:	3 weeks
2002-12-15 18:50:04 +00:00
Alan Cox
e94ce82689 o Update some comments. 2002-09-22 04:33:43 +00:00
Alfred Perlstein
8209f090f1 Change struct vmspace->vm_shm from void * to struct shmmap_state *, this
removes the need for casts in several cases.
2002-07-22 16:22:27 +00:00
Alfred Perlstein
2cc593fd8e Remove caddr_t. 2002-07-22 16:12:55 +00:00
Alan Cox
9688f93163 o Add a "needs wakeup" flag to the vm_map for use by kmem_alloc_wait()
and kmem_free_wakeup().  Previously, kmem_free_wakeup() always
   called wakeup().  In general, no one was sleeping.
 o Export vm_map_unlock_and_wait() and vm_map_wakeup() from vm_map.c
   for use in vm_kern.c.
2002-07-11 02:39:24 +00:00
Alan Cox
366838ddfe o Eliminate vmspace::vm_minsaddr. It's initialized but never used.
o Replace stale comments in vmspace by "const until freed" annotations
   on some fields.
2002-06-25 18:14:38 +00:00
Alan Cox
1d7cf06c8c o Use vm_map_wire() and vm_map_unwire() in place of vm_map_pageable() and
vm_map_user_pageable().
 o Remove vm_map_pageable() and vm_map_user_pageable().
 o Remove vm_map_clear_recursive() and vm_map_set_recursive().  (They were
   only used by vm_map_pageable() and vm_map_user_pageable().)

Reviewed by:	tegge
2002-06-14 18:21:01 +00:00
Alan Cox
e27e17b711 o Remove an unnecessary call to vm_map_wakeup() from vm_map_unwire().
o Add a stub for vm_map_wire().

Note: the description of the previous commit had an error.  The in-
transition flag actually blocks the deallocation of a vm_map_entry by
vm_map_delete() and vm_map_simplify_entry().
2002-06-08 07:32:38 +00:00
Alan Cox
acd9a301ec o Add vm_map_unwire() for unwiring contiguous regions of either kernel
or user vm_maps.  In accordance with the standards for munlock(2),
   and in contrast to vm_map_user_pageable(), this implementation does not
   allow holes in the specified region.  This implementation uses the
   "in transition" flag described below.
 o Introduce a new flag, "in transition," to the vm_map_entry.
   Eventually, vm_map_delete() and vm_map_simplify_entry() will respect
   this flag by deallocating in-transition vm_map_entrys, allowing
   the vm_map lock to be safely released in vm_map_unwire() and (the
   forthcoming) vm_map_wire().
 o Modify vm_map_simplify_entry() to respect the in-transition flag.

In collaboration with:	tegge
2002-06-07 18:34:23 +00:00
Alan Cox
61c075b67f o Remove GIANT_REQUIRED from vm_map_zfini(), vm_map_zinit(),
vm_map_create(), and vm_map_submap().
 o Make further use of a local variable in vm_map_entry_splay()
   that caches a reference to one of a vm_map_entry's children.
   (This reduces code size somewhat.)
 o Revert a part of revision 1.66, deinlining vmspace_pmap().
   (This function is MPSAFE.)
2002-06-01 22:41:43 +00:00
Alan Cox
794316a866 o Revert a part of revision 1.66, contrary to what that commit message says,
deinlining vm_map_entry_behavior() and vm_map_entry_set_behavior()
   actually increases the kernel's size.
 o Make vm_map_entry_set_behavior() static and add a comment describing
   its purpose.
 o Remove an unnecessary initialization statement from vm_map_entry_splay().
2002-06-01 16:59:30 +00:00