freebsd-skq/sys/vm/vm_unix.c

247 lines
6.6 KiB
C
Raw Normal View History

/*-
1994-05-24 10:09:53 +00:00
* Copyright (c) 1988 University of Utah.
* Copyright (c) 1991, 1993
* The Regents of the University of California. All rights reserved.
*
* This code is derived from software contributed to Berkeley by
* the Systems Programming Group of the University of Utah Computer
* Science Department.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
* 4. Neither the name of the University nor the names of its contributors
* may be used to endorse or promote products derived from this software
* without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*
* from: Utah $Hdr: vm_unix.c 1.1 89/11/07$
*
* @(#)vm_unix.c 8.1 (Berkeley) 6/11/93
*/
#include "opt_compat.h"
1994-05-24 10:09:53 +00:00
/*
* Traditional sbrk/grow interface to VM
*/
2003-06-11 23:50:51 +00:00
#include <sys/cdefs.h>
__FBSDID("$FreeBSD$");
1994-05-24 10:09:53 +00:00
#include <sys/param.h>
#include <sys/lock.h>
#include <sys/mutex.h>
1994-05-24 10:09:53 +00:00
#include <sys/proc.h>
#include <sys/racct.h>
1994-05-24 10:09:53 +00:00
#include <sys/resourcevar.h>
#include <sys/sysent.h>
#include <sys/sysproto.h>
#include <sys/systm.h>
1994-05-24 10:09:53 +00:00
#include <vm/vm.h>
#include <vm/vm_param.h>
#include <vm/pmap.h>
#include <vm/vm_map.h>
#ifndef _SYS_SYSPROTO_H_
1994-05-24 10:09:53 +00:00
struct obreak_args {
char *nsize;
1994-05-24 10:09:53 +00:00
};
#endif
/*
* MPSAFE
*/
1994-05-24 10:09:53 +00:00
/* ARGSUSED */
int
sys_obreak(td, uap)
struct thread *td;
1994-05-24 10:09:53 +00:00
struct obreak_args *uap;
{
struct vmspace *vm = td->td_proc->p_vmspace;
vm_map_t map = &vm->vm_map;
vm_offset_t new, old, base;
rlim_t datalim, lmemlim, vmemlim;
int prot, rv;
int error = 0;
boolean_t do_map_wirefuture;
Locking for the per-process resource limits structure. - struct plimit includes a mutex to protect a reference count. The plimit structure is treated similarly to struct ucred in that is is always copy on write, so having a reference to a structure is sufficient to read from it without needing a further lock. - The proc lock protects the p_limit pointer and must be held while reading limits from a process to keep the limit structure from changing out from under you while reading from it. - Various global limits that are ints are not protected by a lock since int writes are atomic on all the archs we support and thus a lock wouldn't buy us anything. - All accesses to individual resource limits from a process are abstracted behind a simple lim_rlimit(), lim_max(), and lim_cur() API that return either an rlimit, or the current or max individual limit of the specified resource from a process. - dosetrlimit() was renamed to kern_setrlimit() to match existing style of other similar syscall helper functions. - The alpha OSF/1 compat layer no longer calls getrlimit() and setrlimit() (it didn't used the stackgap when it should have) but uses lim_rlimit() and kern_setrlimit() instead. - The svr4 compat no longer uses the stackgap for resource limits calls, but uses lim_rlimit() and kern_setrlimit() instead. - The ibcs2 compat no longer uses the stackgap for resource limits. It also no longer uses the stackgap for accessing sysctl's for the ibcs2_sysconf() syscall but uses kernel_sysctl() instead. As a result, ibcs2_sysconf() no longer needs Giant. - The p_rlimit macro no longer exists. Submitted by: mtm (mostly, I only did a few cleanups and catchups) Tested on: i386 Compiled on: alpha, amd64
2004-02-04 21:52:57 +00:00
PROC_LOCK(td->td_proc);
datalim = lim_cur(td->td_proc, RLIMIT_DATA);
lmemlim = lim_cur(td->td_proc, RLIMIT_MEMLOCK);
Locking for the per-process resource limits structure. - struct plimit includes a mutex to protect a reference count. The plimit structure is treated similarly to struct ucred in that is is always copy on write, so having a reference to a structure is sufficient to read from it without needing a further lock. - The proc lock protects the p_limit pointer and must be held while reading limits from a process to keep the limit structure from changing out from under you while reading from it. - Various global limits that are ints are not protected by a lock since int writes are atomic on all the archs we support and thus a lock wouldn't buy us anything. - All accesses to individual resource limits from a process are abstracted behind a simple lim_rlimit(), lim_max(), and lim_cur() API that return either an rlimit, or the current or max individual limit of the specified resource from a process. - dosetrlimit() was renamed to kern_setrlimit() to match existing style of other similar syscall helper functions. - The alpha OSF/1 compat layer no longer calls getrlimit() and setrlimit() (it didn't used the stackgap when it should have) but uses lim_rlimit() and kern_setrlimit() instead. - The svr4 compat no longer uses the stackgap for resource limits calls, but uses lim_rlimit() and kern_setrlimit() instead. - The ibcs2 compat no longer uses the stackgap for resource limits. It also no longer uses the stackgap for accessing sysctl's for the ibcs2_sysconf() syscall but uses kernel_sysctl() instead. As a result, ibcs2_sysconf() no longer needs Giant. - The p_rlimit macro no longer exists. Submitted by: mtm (mostly, I only did a few cleanups and catchups) Tested on: i386 Compiled on: alpha, amd64
2004-02-04 21:52:57 +00:00
vmemlim = lim_cur(td->td_proc, RLIMIT_VMEM);
PROC_UNLOCK(td->td_proc);
do_map_wirefuture = FALSE;
new = round_page((vm_offset_t)uap->nsize);
vm_map_lock(map);
1994-05-24 10:09:53 +00:00
base = round_page((vm_offset_t) vm->vm_daddr);
old = base + ctob(vm->vm_dsize);
if (new > base) {
/*
* Check the resource limit, but allow a process to reduce
* its usage, even if it remains over the limit.
*/
Locking for the per-process resource limits structure. - struct plimit includes a mutex to protect a reference count. The plimit structure is treated similarly to struct ucred in that is is always copy on write, so having a reference to a structure is sufficient to read from it without needing a further lock. - The proc lock protects the p_limit pointer and must be held while reading limits from a process to keep the limit structure from changing out from under you while reading from it. - Various global limits that are ints are not protected by a lock since int writes are atomic on all the archs we support and thus a lock wouldn't buy us anything. - All accesses to individual resource limits from a process are abstracted behind a simple lim_rlimit(), lim_max(), and lim_cur() API that return either an rlimit, or the current or max individual limit of the specified resource from a process. - dosetrlimit() was renamed to kern_setrlimit() to match existing style of other similar syscall helper functions. - The alpha OSF/1 compat layer no longer calls getrlimit() and setrlimit() (it didn't used the stackgap when it should have) but uses lim_rlimit() and kern_setrlimit() instead. - The svr4 compat no longer uses the stackgap for resource limits calls, but uses lim_rlimit() and kern_setrlimit() instead. - The ibcs2 compat no longer uses the stackgap for resource limits. It also no longer uses the stackgap for accessing sysctl's for the ibcs2_sysconf() syscall but uses kernel_sysctl() instead. As a result, ibcs2_sysconf() no longer needs Giant. - The p_rlimit macro no longer exists. Submitted by: mtm (mostly, I only did a few cleanups and catchups) Tested on: i386 Compiled on: alpha, amd64
2004-02-04 21:52:57 +00:00
if (new - base > datalim && new > old) {
error = ENOMEM;
goto done;
}
if (new > vm_map_max(map)) {
error = ENOMEM;
goto done;
}
} else if (new < base) {
/*
* This is simply an invalid value. If someone wants to
* do fancy address space manipulations, mmap and munmap
* can do most of what the user would want.
*/
error = EINVAL;
goto done;
}
if (new > old) {
if (!old_mlock && map->flags & MAP_WIREFUTURE) {
if (ptoa(pmap_wired_count(map->pmap)) +
(new - old) > lmemlim) {
error = ENOMEM;
goto done;
}
}
if (map->size + (new - old) > vmemlim) {
error = ENOMEM;
goto done;
}
#ifdef RACCT
PROC_LOCK(td->td_proc);
error = racct_set(td->td_proc, RACCT_DATA, new - base);
if (error != 0) {
PROC_UNLOCK(td->td_proc);
error = ENOMEM;
goto done;
}
error = racct_set(td->td_proc, RACCT_VMEM,
map->size + (new - old));
if (error != 0) {
racct_set_force(td->td_proc, RACCT_DATA, old - base);
PROC_UNLOCK(td->td_proc);
error = ENOMEM;
goto done;
}
if (!old_mlock && map->flags & MAP_WIREFUTURE) {
error = racct_set(td->td_proc, RACCT_MEMLOCK,
ptoa(pmap_wired_count(map->pmap)) + (new - old));
if (error != 0) {
racct_set_force(td->td_proc, RACCT_DATA,
old - base);
racct_set_force(td->td_proc, RACCT_VMEM,
map->size);
PROC_UNLOCK(td->td_proc);
error = ENOMEM;
goto done;
}
}
PROC_UNLOCK(td->td_proc);
#endif
prot = VM_PROT_RW;
#ifdef COMPAT_FREEBSD32
#if defined(__amd64__)
if (i386_read_exec && SV_PROC_FLAG(td->td_proc, SV_ILP32))
prot |= VM_PROT_EXECUTE;
#endif
#endif
rv = vm_map_insert(map, NULL, 0, old, new, prot, VM_PROT_ALL, 0);
1994-05-24 10:09:53 +00:00
if (rv != KERN_SUCCESS) {
#ifdef RACCT
PROC_LOCK(td->td_proc);
racct_set_force(td->td_proc, RACCT_DATA, old - base);
racct_set_force(td->td_proc, RACCT_VMEM, map->size);
if (!old_mlock && map->flags & MAP_WIREFUTURE) {
racct_set_force(td->td_proc, RACCT_MEMLOCK,
ptoa(pmap_wired_count(map->pmap)));
}
PROC_UNLOCK(td->td_proc);
#endif
error = ENOMEM;
goto done;
1994-05-24 10:09:53 +00:00
}
vm->vm_dsize += btoc(new - old);
/*
* Handle the MAP_WIREFUTURE case for legacy applications,
* by marking the newly mapped range of pages as wired.
* We are not required to perform a corresponding
* vm_map_unwire() before vm_map_delete() below, as
* it will forcibly unwire the pages in the range.
*
* XXX If the pages cannot be wired, no error is returned.
*/
if ((map->flags & MAP_WIREFUTURE) == MAP_WIREFUTURE) {
if (bootverbose)
printf("obreak: MAP_WIREFUTURE set\n");
do_map_wirefuture = TRUE;
}
} else if (new < old) {
rv = vm_map_delete(map, new, old);
1994-05-24 10:09:53 +00:00
if (rv != KERN_SUCCESS) {
error = ENOMEM;
goto done;
1994-05-24 10:09:53 +00:00
}
vm->vm_dsize -= btoc(old - new);
#ifdef RACCT
PROC_LOCK(td->td_proc);
racct_set_force(td->td_proc, RACCT_DATA, new - base);
racct_set_force(td->td_proc, RACCT_VMEM, map->size);
if (!old_mlock && map->flags & MAP_WIREFUTURE) {
racct_set_force(td->td_proc, RACCT_MEMLOCK,
ptoa(pmap_wired_count(map->pmap)));
}
PROC_UNLOCK(td->td_proc);
#endif
1994-05-24 10:09:53 +00:00
}
done:
vm_map_unlock(map);
if (do_map_wirefuture)
(void) vm_map_wire(map, old, new,
VM_MAP_WIRE_USER|VM_MAP_WIRE_NOHOLES);
return (error);
1994-05-24 10:09:53 +00:00
}
#ifndef _SYS_SYSPROTO_H_
1994-05-24 10:09:53 +00:00
struct ovadvise_args {
These changes embody the support of the fully coherent merged VM buffer cache, much higher filesystem I/O performance, and much better paging performance. It represents the culmination of over 6 months of R&D. The majority of the merged VM/cache work is by John Dyson. The following highlights the most significant changes. Additionally, there are (mostly minor) changes to the various filesystem modules (nfs, msdosfs, etc) to support the new VM/buffer scheme. vfs_bio.c: Significant rewrite of most of vfs_bio to support the merged VM buffer cache scheme. The scheme is almost fully compatible with the old filesystem interface. Significant improvement in the number of opportunities for write clustering. vfs_cluster.c, vfs_subr.c Upgrade and performance enhancements in vfs layer code to support merged VM/buffer cache. Fixup of vfs_cluster to eliminate the bogus pagemove stuff. vm_object.c: Yet more improvements in the collapse code. Elimination of some windows that can cause list corruption. vm_pageout.c: Fixed it, it really works better now. Somehow in 2.0, some "enhancements" broke the code. This code has been reworked from the ground-up. vm_fault.c, vm_page.c, pmap.c, vm_object.c Support for small-block filesystems with merged VM/buffer cache scheme. pmap.c vm_map.c Dynamic kernel VM size, now we dont have to pre-allocate excessive numbers of kernel PTs. vm_glue.c Much simpler and more effective swapping code. No more gratuitous swapping. proc.h Fixed the problem that the p_lock flag was not being cleared on a fork. swap_pager.c, vnode_pager.c Removal of old vfs_bio cruft to support the past pseudo-coherency. Now the code doesn't need it anymore. machdep.c Changes to better support the parameter values for the merged VM/buffer cache scheme. machdep.c, kern_exec.c, vm_glue.c Implemented a seperate submap for temporary exec string space and another one to contain process upages. This eliminates all map fragmentation problems that previously existed. ffs_inode.c, ufs_inode.c, ufs_readwrite.c Changes for merged VM/buffer cache. Add "bypass" support for sneaking in on busy buffers. Submitted by: John Dyson and David Greenman
1995-01-09 16:06:02 +00:00
int anom;
1994-05-24 10:09:53 +00:00
};
#endif
/*
* MPSAFE
*/
1994-05-24 10:09:53 +00:00
/* ARGSUSED */
int
sys_ovadvise(td, uap)
struct thread *td;
1994-05-24 10:09:53 +00:00
struct ovadvise_args *uap;
{
/* START_GIANT_OPTIONAL */
/* END_GIANT_OPTIONAL */
1994-05-24 10:09:53 +00:00
return (EINVAL);
}