freebsd-skq

Author	SHA1	Message	Date
kib	fa686c638e	Implement global and per-uid accounting of the anonymous memory. Add rlimit RLIMIT_SWAP that limits the amount of swap that may be reserved for the uid. The accounting information (charge) is associated with either map entry, or vm object backing the entry, assuming the object is the first one in the shadow chain and entry does not require COW. Charge is moved from entry to object on allocation of the object, e.g. during the mmap, assuming the object is allocated, or on the first page fault on the entry. It moves back to the entry on forks due to COW setup. The per-entry granularity of accounting makes the charge process fair for processes that change uid during lifetime, and decrements charge for proper uid when region is unmapped. The interface of vm_pager_allocate(9) is extended by adding struct ucred *, that is used to charge appropriate uid when allocation if performed by kernel, e.g. md(4). Several syscalls, among them is fork(2), may now return ENOMEM when global or per-uid limits are enforced. In collaboration with: pho Reviewed by: alc Approved by: re (kensmith)	2009-06-23 20:45:22 +00:00
kib	81638d9884	When vm_map_wire(9) is allowed to skip holes in the wired region, skip the mappings without any of read and execution rights, in particular, the PROT_NONE entries. This makes mlockall(2) work for the process address space that has such mappings. Since protection mode of the entry may change between setting MAP_ENTRY_IN_TRANSITION and final pass over the region that records the wire status of the entries, allocate new map entry flag MAP_ENTRY_WIRE_SKIPPED to mark the skipped PROT_NONE entries. Reported and tested by: Hans Ottevanger <fbsdhackers beasties demon nl> Reviewed by: alc MFC after: 3 weeks	2009-04-10 10:16:03 +00:00
kib	66c697aade	Revert the addition of the freelist argument for the vm_map_delete() function, done in r188334. Instead, collect the entries that shall be freed, in the deferred_freelist member of the map. Automatically purge the deferred freelist when map is unlocked. Tested by: pho Reviewed by: alc	2009-02-24 20:57:43 +00:00
kib	ae7c4cfa11	Do not call vm_object_deallocate() from vm_map_delete(), because we hold the map lock there, and might need the vnode lock for OBJT_VNODE objects. Postpone object deallocation until caller of vm_map_delete() drops the map lock. Link the map entries to be freed into the freelist, that is released by the new helper function vm_map_entry_free_freelist(). Reviewed by: tegge, alc Tested by: pho	2009-02-08 20:39:17 +00:00
alc	a2ba509efc	Resurrect shared map locks allowing greater concurrency during some map operations, such as page faults. An earlier version of this change was ... Reviewed by: kib Tested by: pho MFC after: 6 weeks	2009-01-01 00:31:46 +00:00
alc	81e94e0a4c	Update or eliminate some stale comments.	2008-12-31 05:44:05 +00:00
alc	5a23437099	Generalize vm_map_find(9)'s parameter "find_space". Specifically, add support for VMFS_ALIGNED_SPACE, which requests the allocation of an address range best suited to superpages. The old options TRUE and FALSE are mapped to VMFS_ANY_SPACE and VMFS_NO_SPACE, so that there is no immediate need to update all of vm_map_find(9)'s callers. While I'm here, correct a misstatement about vm_map_find(9)'s return values in the man page.	2008-05-10 18:55:35 +00:00
alc	e919c282cd	vm_map_fixed(), unlike vm_map_find(), does not update "addr", so it can be passed by value.	2008-04-28 05:30:23 +00:00
marcel	7834123faf	Make the vm_pmap field of struct vmspace the last field in the structure. This allows per-CPU variations of struct pmap on a single architecture without affecting the machine-independent fields. As such, the PMAP variations don't affect the ABI. They become part of it.	2008-03-01 22:54:42 +00:00
pjd	f372a74b85	Change unused 'user_wait' argument to 'timo' argument, which will be used to specify timeout for msleep(9). Discussed with: alc Reviewed by: alc	2007-11-07 21:56:58 +00:00
kib	77766ce03f	Do not drop vm_map lock between doing vm_map_remove() and vm_map_insert(). For this, introduce vm_map_fixed() that does that for MAP_FIXED case. Dropping the lock allowed for parallel thread to occupy the freed space. Reported by: Tijl Coosemans <tijl ulyssis org> Reviewed by: alc Approved by: re (kensmith) MFC after: 2 weeks	2007-08-20 12:05:45 +00:00
tegge	0d5f191162	Close race between vmspace_exitfree() and exit1() and races between vmspace_exitfree() and vmspace_free() which could result in the same vmspace being freed twice. Factor out part of exit1() into new function vmspace_exit(). Attach to vmspace0 to allow old vmspace to be freed earlier. Add new function, vmspace_acquire_ref(), for obtaining a vmspace reference for a vmspace belonging to another process. Avoid changing vmspace refcount from 0 to 1 since that could also lead to the same vmspace being freed twice. Change vmtotal() and swapout_procs() to use vmspace_acquire_ref(). Reviewed by: alc	2006-05-29 21:28:56 +00:00
alc	55eca1baec	Eliminate unneeded preallocation at initialization. Reviewed by: tegge	2005-12-03 22:41:15 +00:00
imp	f0bf889d0d	/* -> /*- for license, minor formatting changes	2005-01-07 02:29:27 +00:00
alc	482b6818af	Replace the linear search in vm_map_findspace() with an O(log n) algorithm built into the map entry splay tree. This replaces the first_free hint in struct vm_map with two fields in vm_map_entry: adj_free, the amount of free space following a map entry, and max_free, the maximum amount of free space in the entry's subtree. These fields make it possible to find a first-fit free region of a given size in one pass down the tree, so O(log n) amortized using splay trees. This significantly reduces the overhead in vm_map_findspace() for applications that mmap() many hundreds or thousands of regions, and has a negligible slowdown (0.1%) on buildworld. See, for example, the discussion of a micro-benchmark titled "Some mmap observations compared to Linux 2.6/OpenBSD" on -hackers in late October 2003. OpenBSD adopted this approach in March 2002, and NetBSD added it in November 2003, both with Red-Black trees. Submitted by: Mark W. Krentel	2004-08-13 08:06:34 +00:00
tegge	c5a462b4d9	The vm map lock is needed in vm_fault() after the page has been found, to avoid later changes before pmap_enter() and vm_fault_prefault() has completed. Simplify deadlock avoidance by not blocking on vm map relookup. In collaboration with: alc	2004-08-12 20:14:49 +00:00
green	a5ad2c7311	Revamp VM map wiring. * Allow no-fault wiring/unwiring to succeed for consistency; however, the wired count remains at zero, so it's a special case. * Fix issues inside vm_map_wire() and vm_map_unwire() where the exact state of user wiring (one or zero) and system wiring (zero or more) could be confused; for example, system unwiring could succeed in removing a user wire, instead of being an error. * Require all mappings to be unwired before they are deleted. When VM space is still wired upon deletion, it will be waited upon for the following unwire. This makes vslock(9) work rather than allowing kernel-locked memory to be deleted out from underneath of its consumer as it would before.	2004-08-09 19:52:29 +00:00
mux	cccb9bacb0	Get rid of another lockmgr(9) consumer by using sx locks for the user maps. We always acquire the sx lock exclusively here, but we can't use a mutex because we want to be able to sleep while holding the lock. This is completely equivalent to what we were doing with the lockmgr(9) locks before. Approved by: alc	2004-07-30 09:10:28 +00:00
alc	2057234305	Simplify vmspace initialization. The bcopy() of fields from the old vmspace to the new vmspace in vmspace_exec() is mostly wasted effort. With one exception, vm_swrss, the copied fields are immediately overwritten. Instead, initialize these fields to zero in vmspace_alloc(), eliminating a bcopy() from vmspace_exec() and a bzero() from vmspace_fork().	2004-07-24 07:40:35 +00:00
alc	4f5d8d196f	Micro-optimize vmspace for 64-bit architectures: Colocate vm_refcnt and vm_exitingcnt so that alignment does not result in wasted space.	2004-07-06 17:35:10 +00:00
alc	3121dcf134	Remove an unused field from the vmspace structure.	2004-06-26 19:16:35 +00:00
alc	ff037b9220	In cases where a file was resident in memory mmap(..., PROT_NONE, ...) would actually map the file with read access enabled. According to http://www.opengroup.org/onlinepubs/007904975/functions/mmap.html this is an error. Similarly, an madvise(..., MADV_WILLNEED) would enable read access on a virtual address range that was PROT_NONE. The solution implemented herein is (1) to pass a vm_prot_t to vm_map_pmap_enter() describing the allowed access and (2) to make vm_map_pmap_enter() responsible for understanding the limitations of pmap_enter_quick(). Submitted by: "Mark W. Krentel" <krentel@dreamscape.com> PR: kern/64573	2004-04-24 03:46:44 +00:00
imp	04c9462c02	Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999. Approved by: core	2004-04-06 20:15:37 +00:00
peter	2f1d89c4f5	Part 2 of rev 1.68. Update comment to match reality now that vm_endcopy exists and we no longer copy to the end of the struct. Forgotten by: alfred and green	2004-03-12 00:16:48 +00:00
alc	269cf5aa09	- Rename vm_map_clean() to vm_map_sync(). This better reflects the fact that msync(2) is its only caller. - Migrate the parts of the old vm_map_clean() that examined the internals of a vm object to a new function vm_object_sync() that is implemented in vm_object.c. At the same, introduce the necessary vm object locking so that vm_map_sync() and vm_object_sync() can be called without Giant. Reviewed by: tegge	2003-11-09 05:25:35 +00:00
des	3120373a25	Whitespace cleanup.	2003-11-03 16:14:45 +00:00
bms	d8d01a1fa7	Move pmap_resident_count() from the MD pmap.h to the MI pmap.h. Add a definition of pmap_wired_count(). Add a definition of vmspace_wired_count(). Reviewed by: truckman Discussed with: peter	2003-10-06 01:47:12 +00:00
marcel	d75cf98307	Part 2 of implementing rstacks: add the ability to create rstacks and use the ability on ia64 to map the register stack. The orientation of the stack (i.e. its grow direction) is passed to vm_map_stack() in the overloaded cow argument. Since the grow direction is represented by bits, it is possible and allowed to create bi-directional stacks. This is not an advertised feature, more of a side-effect. Fix a bug in vm_map_growstack() that's specific to rstacks and which we could only find by having the ability to create rstacks: when the mapped stack ends at the faulting address, we have not actually mapped the faulting address. we need to include or cover the faulting address. Note that at this time mmap(2) has not been extended to allow the creation of rstacks by processes. If such a need arises, this can be done. Tested on: alpha, i386, ia64, sparc64	2003-09-27 22:28:14 +00:00
marcel	8ce42b9bdd	Introduce MAP_ENTRY_GROWS_DOWN and MAP_ENTRY_GROWS_UP to allow for growable (stack) entries that not only grow down, but also grow up. Have vm_map_growstack() take these flags into account when growing an entry. This is the first step in adding support for upward growable stacks. It is a required feature on ia64 to support the register stack (or rstack as I like to call it -- it also means reverse stack). We do not currently create rstacks, so the upward growing is not exercised and the change should be a functional no-op. Reviewed by: alc	2003-08-30 21:25:23 +00:00
alc	23077fe3b5	Reduce the size of the vm map (and by inclusion the vm space) on 64-bit architectures by moving a field within the structure.	2003-08-13 03:13:22 +00:00
bms	44aa51e3ae	Add the mlockall() and munlockall() system calls. - All those diffs to syscalls.master for each architecture are necessary. This needed clarification; the stub code generation for mlockall() was disabled, which would prevent applications from linking to this API (suggested by mux) - Giant has been quoshed. It is no longer held by the code, as the required locking has been pushed down within vm_map.c. - Callers must specify VM_MAP_WIRE_HOLESOK or VM_MAP_WIRE_NOHOLES to express their intention explicitly. - Inspected at the vmstat, top and vm pager sysctl stats level. Paging-in activity is occurring correctly, using a test harness. - The RES size for a process may appear to be greater than its SIZE. This is believed to be due to mappings of the same shared library page being wired twice. Further exploration is needed. - Believed to back out of allocations and locks correctly (tested with WITNESS, MUTEX_PROFILING, INVARIANTS and DIAGNOSTIC). PR: kern/43426, standards/54223 Reviewed by: jake, alc Approved by: jake (mentor) MFC after: 2 weeks	2003-08-11 07:14:08 +00:00
alc	65c6a689bc	Introduce vm_map_pmap_enter(). Presently, this is a stub calling the MD pmap_object_init_pt().	2003-06-29 23:32:55 +00:00
alc	ff1c2851e3	Remove an unnecessary forward declaration.	2003-06-15 07:28:33 +00:00
das	96f973e7de	- When the VM daemon is out of swap space and looking for a process to kill, don't block on a map lock while holding the process lock. Instead, skip processes whose map locks are held and find something else to kill. - Add vm_map_trylock_read() to support the above. Reviewed by: alc, mike (mentor)	2003-03-12 23:13:16 +00:00
alc	c50367da67	Remove ENABLE_VFS_IOOPT. It is a long unfinished work-in-progress. Discussed on: arch@	2003-03-06 03:41:02 +00:00
alc	c28c69ebf9	Add a needed #include. Reported by: ia64 tinderbox	2003-01-01 00:13:01 +00:00
alc	1842adc340	Implement a variant locking scheme for vm maps: Access to system maps is now synchronized by a mutex, whereas access to user maps is still synchronized by a lockmgr()-based lock. Why? No single type of lock, including sx locks, meets the requirements of both types of vm map. Sometimes we sleep while holding the lock on a user map. Thus, a a mutex isn't appropriate. On the other hand, both lockmgr()-based and sx locks release Giant when a thread/process blocks during contention for a lock. This could lead to a race condition in a legacy driver (that relies on Giant for synchronization) if it attempts to kmem_malloc() and fails to immediately obtain the lock. Fortunately, we never sleep while holding a system map lock.	2002-12-31 19:38:04 +00:00
dillon	2925e70a14	Fix a refcount race with the vmspace structure. In order to prevent resource starvation we clean-up as much of the vmspace structure as we can when the last process using it exits. The rest of the structure is cleaned up when it is reaped. But since exit1() decrements the ref count it is possible for a double-free to occur if someone else, such as the process swapout code, references and then dereferences the structure. Additionally, the final cleanup of the structure should not occur until the last process referencing it is reaped. This commit solves the problem by introducing a secondary reference count, calling 'vm_exitingcnt'. The normal reference count is decremented on exit and vm_exitingcnt is incremented. vm_exitingcnt is decremented when the process is reaped. When both vm_exitingcnt and vm_refcnt are 0, the structure is freed for real. MFC after: 3 weeks	2002-12-15 18:50:04 +00:00
alc	8dcc127f87	o Update some comments.	2002-09-22 04:33:43 +00:00
alfred	c683b04586	Change struct vmspace->vm_shm from void * to struct shmmap_state *, this removes the need for casts in several cases.	2002-07-22 16:22:27 +00:00
alfred	2d7017c840	Remove caddr_t.	2002-07-22 16:12:55 +00:00
alc	9d3ad1a89f	o Add a "needs wakeup" flag to the vm_map for use by kmem_alloc_wait() and kmem_free_wakeup(). Previously, kmem_free_wakeup() always called wakeup(). In general, no one was sleeping. o Export vm_map_unlock_and_wait() and vm_map_wakeup() from vm_map.c for use in vm_kern.c.	2002-07-11 02:39:24 +00:00
alc	698aabc226	o Eliminate vmspace::vm_minsaddr. It's initialized but never used. o Replace stale comments in vmspace by "const until freed" annotations on some fields.	2002-06-25 18:14:38 +00:00
alc	42cf959f18	o Use vm_map_wire() and vm_map_unwire() in place of vm_map_pageable() and vm_map_user_pageable(). o Remove vm_map_pageable() and vm_map_user_pageable(). o Remove vm_map_clear_recursive() and vm_map_set_recursive(). (They were only used by vm_map_pageable() and vm_map_user_pageable().) Reviewed by: tegge	2002-06-14 18:21:01 +00:00
alc	75c335726c	o Remove an unnecessary call to vm_map_wakeup() from vm_map_unwire(). o Add a stub for vm_map_wire(). Note: the description of the previous commit had an error. The in- transition flag actually blocks the deallocation of a vm_map_entry by vm_map_delete() and vm_map_simplify_entry().	2002-06-08 07:32:38 +00:00
alc	ee9168748e	o Add vm_map_unwire() for unwiring contiguous regions of either kernel or user vm_maps. In accordance with the standards for munlock(2), and in contrast to vm_map_user_pageable(), this implementation does not allow holes in the specified region. This implementation uses the "in transition" flag described below. o Introduce a new flag, "in transition," to the vm_map_entry. Eventually, vm_map_delete() and vm_map_simplify_entry() will respect this flag by deallocating in-transition vm_map_entrys, allowing the vm_map lock to be safely released in vm_map_unwire() and (the forthcoming) vm_map_wire(). o Modify vm_map_simplify_entry() to respect the in-transition flag. In collaboration with: tegge	2002-06-07 18:34:23 +00:00
alc	2abbbe7b8a	o Remove GIANT_REQUIRED from vm_map_zfini(), vm_map_zinit(), vm_map_create(), and vm_map_submap(). o Make further use of a local variable in vm_map_entry_splay() that caches a reference to one of a vm_map_entry's children. (This reduces code size somewhat.) o Revert a part of revision 1.66, deinlining vmspace_pmap(). (This function is MPSAFE.)	2002-06-01 22:41:43 +00:00
alc	b55171a0a3	o Revert a part of revision 1.66, contrary to what that commit message says, deinlining vm_map_entry_behavior() and vm_map_entry_set_behavior() actually increases the kernel's size. o Make vm_map_entry_set_behavior() static and add a comment describing its purpose. o Remove an unnecessary initialization statement from vm_map_entry_splay().	2002-06-01 16:59:30 +00:00
alc	480071f11a	o Replace the vm_map's hint by the root of a splay tree. By design, the last accessed datum is moved to the root of the splay tree. Therefore, on lookups in which the hint resulted in O(1) access, the splay tree still achieves O(1) access. In contrast, on lookups in which the hint failed miserably, the splay tree achieves amortized logarithmic complexity, resulting in dramatic improvements on vm_maps with a large number of entries. For example, the execution time for replaying an access log from www.cs.rice.edu against the thttpd web server was reduced by 23.5% due to the large number of files simultaneously mmap()ed by this server. (The machine in question has enough memory to cache most of this workload.) Nothing comes for free: At present, I see a 0.2% slowdown on "buildworld" due to the overhead of maintaining the splay tree. I believe that some or all of this can be eliminated through optimizations to the code. Developed in collaboration with: Juan E Navarro <jnavarro@cs.rice.edu> Reviewed by: jeff	2002-05-24 01:33:24 +00:00
alc	94ec8e207f	o Header files shouldn't depend on options: Provide prototypes for uiomoveco(), uioread(), and vm_uiomove() regardless of whether ENABLE_VFS_IOOPT is defined or not. Submitted by: bde	2002-05-06 06:20:04 +00:00

1 2 3

130 Commits