freebsd-skq

Author	SHA1	Message	Date
alfred	a3f0842419	Introduce a global lock for the vm subsystem (vm_mtx). vm_mtx does not recurse and is required for most low level vm operations. faults can not be taken without holding Giant. Memory subsystems can now call the base page allocators safely. Almost all atomic ops were removed as they are covered under the vm mutex. Alpha and ia64 now need to catch up to i386's trap handlers. FFS and NFS have been tested, other filesystems will need minor changes (grabbing the vm lock when twiddling page properties). Reviewed (partially) by: jake, jhb	2001-05-19 01:28:09 +00:00
markm	b831760191	Putting sys/lockmgr.h in here allows us to depollute userland includes a bit. OK'ed by: bde	2001-05-03 11:33:51 +00:00
alfred	cd79f17cad	Fix the botched rev 1.59 where I made it such that without INVARIANTS the map is never locked. Submitted by: tegge	2001-04-18 05:30:24 +00:00
alfred	bcfbf5a27d	use %p for pointer printf, include sys/systm.h for printf proto	2001-04-13 10:22:14 +00:00
alfred	7dcb59378d	Use a macro wrapper over printf along with KASSERT to reduce the amount of code here.	2001-04-13 08:07:37 +00:00
dillon	be9d069cf0	Fix a lock reversal problem in the VM subsystem related to threaded programs. There is a case during a fork() which can cause a deadlock. From Tor - The workaround that consists of setting a flag in the vm map that indicates that a fork is in progress and using that mark in the page fault handling to force a revalidation failure. That change will only affect (pessimize) page fault handling during fork for threaded (linuxthreads style) applications and applications using aio_*(). Submited by: tegge	2001-03-14 06:48:53 +00:00
bmilekic	f364d4ac36	Change and clean the mutex lock interface. mtx_enter(lock, type) becomes: mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks) mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized) similarily, for releasing a lock, we now have: mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN. We change the caller interface for the two different types of locks because the semantics are entirely different for each case, and this makes it explicitly clear and, at the same time, it rids us of the extra `type' argument. The enter->lock and exit->unlock change has been made with the idea that we're "locking data" and not "entering locked code" in mind. Further, remove all additional "flags" previously passed to the lock acquire/release routines with the exception of two: MTX_QUIET and MTX_NOSWITCH The functionality of these flags is preserved and they can be passed to the lock/unlock routines by calling the corresponding wrappers: mtx_{lock, unlock}_flags(lock, flag(s)) and mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN locks, respectively. Re-inline some lock acq/rel code; in the sleep lock case, we only inline the _obtain_lock()s in order to ensure that the inlined code fits into a cache line. In the spin lock case, we inline recursion and actually only perform a function call if we need to spin. This change has been made with the idea that we generally tend to avoid spin locks and that also the spin locks that we do have and are heavily used (i.e. sched_lock) do recurse, and therefore in an effort to reduce function call overhead for some architectures (such as alpha), we inline recursion for this case. Create a new malloc type for the witness code and retire from using the M_DEV type. The new type is called M_WITNESS and is only declared if WITNESS is enabled. Begin cleaning up some machdep/mutex.h code - specifically updated the "optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently need those. Finally, caught up to the interface changes in all sys code. Contributors: jake, jhb, jasone (in no particular order)	2001-02-09 06:11:45 +00:00
jasone	db944ecf00	For lockmgr mutex protection, use an array of mutexes that are allocated and initialized during boot. This avoids bloating sizeof(struct lock). As a side effect, it is no longer necessary to enforce the assumtion that lockinit()/lockdestroy() calls are paired, so the LK_VALID flag has been removed. Idea taken from: BSD/OS.	2000-10-12 22:37:28 +00:00
jasone	4e290e67b7	Convert lockmgr locks from using simple locks to using mutexes. Add lockdestroy() and appropriate invocations, which corresponds to lockinit() and must be called to clean up after a lockmgr lock is no longer needed.	2000-10-04 01:29:17 +00:00
ps	c3800346ab	Add MAP_NOCORE to mmap(2), and MADV_NOCORE and MADV_CORE to madvise(2). This This feature allows you to specify if mmap'd data is included in an application's corefile. Change the type of eflags in struct vm_map_entry from u_char to vm_eflags_t (an unsigned int). Reviewed by: dillon,jdp,alfred Approved by: jkh	2000-02-28 04:10:35 +00:00
peter	d53e4c1d80	Change #ifdef KERNEL to #ifdef _KERNEL in the public headers. "KERNEL" is an application space macro and the applications are supposed to be free to use it as they please (but cannot). This is consistant with the other BSD's who made this change quite some time ago. More commits to come.	1999-12-29 05:07:58 +00:00
dillon	b66fb2c648	Add MAP_NOSYNC feature to mmap(), and MADV_NOSYNC and MADV_AUTOSYNC to madvise(). This feature prevents the update daemon from gratuitously flushing dirty pages associated with a mapped file-backed region of memory. The system pager will still page the memory as necessary and the VM system will still be fully coherent with the filesystem. Modifications made by other means to the same area of memory, for example by write(), are unaffected. The feature works on a page-granularity basis. MAP_NOSYNC allows one to use mmap() to share memory between processes without incuring any significant filesystem overhead, putting it in the same performance category as SysV Shared memory and anonymous memory. Reviewed by: julian, alc, dg	1999-12-12 03:19:33 +00:00
dillon	37bee3bb3f	cleanup madvise code, add a few more sanity checks. Reviewed by: Alan Cox <alc@cs.rice.edu>, dg@root.com	1999-09-21 05:00:48 +00:00
dillon	d1666d7f52	Add 'lastr' field to vm_map_entry in preparation for its removal from the vnode. (The changeover is undergoing final testing and will be committed soon). Reviewed by: Alan Cox <alc@cs.rice.edu>, David Greenman <dg@root.com>	1999-09-17 05:40:17 +00:00
peter	3b842d34e8	$Id$ -> $FreeBSD$	1999-08-28 01:08:13 +00:00
alc	42583848d0	Correct the inconsistent formatting in struct vm_map. Addendum to rev 1.47: submitted by dillon.	1999-08-23 18:16:05 +00:00
alc	980da3c115	struct vm_map: The lock structure cannot be the first element of the vm_map because this can result in livelock between two or more system processes trying to kmem_alloc_wait.	1999-08-23 18:08:34 +00:00
mjacob	c3587e5dcb	Fix breakage - an extra brace got inserted where DIAGNOSTIC was defined but MAP_LOCK_DIAGNOSTIC wasn't.	1999-08-18 03:56:57 +00:00
alc	708e372c9a	vm_map_lock*: Remove semicolons or add "do { } while (0)" as necessary to enable the use of these macros in arbitrary statements. (There are no functional changes.) Submitted by: dillon	1999-08-16 18:21:09 +00:00
alc	33da09bf48	Move the memory access behavior information provided by madvise from the vm_object to the vm_map. Submitted by: dillon	1999-08-01 06:05:09 +00:00
alc	bcd454da95	Remove unused function prototypes.	1999-07-10 18:16:08 +00:00
alc	c9ce3ad902	Remove some unused function and variable declarations.	1999-06-19 18:42:53 +00:00
alc	936a553033	Add the options MAP_PREFAULT and MAP_PREFAULT_PARTIAL to vm_map_find/insert, eliminating the need for the pmap_object_init_pt calls in imgact_* and mmap. Reviewed by: David Greenman <dg@root.com>	1999-05-17 00:53:56 +00:00
alc	83a9863079	Remove prototypes for functions that don't exist anymore (vm_map.h). Remove a useless argument from vm_map_madvise's interface (vm_map.c, vm_map.h, and vm_mmap.c). Remove a redundant test in vm_uiomove (vm_map.c). Make two changes to vm_object_coalesce: 1. Determine whether the new range of pages actually overlaps the existing object's range of pages before calling vm_object_page_remove. (Prior to this change almost 90% of the calls to vm_object_page_remove were to remove pages that were beyond the end of the object.) 2. Free any swap space allocated to removed pages.	1999-05-16 05:07:34 +00:00
alc	e79d7db00b	Simplify vm_map_find/insert's interface: remove the MAP_COPY_NEEDED option. It never makes sense to specify MAP_COPY_NEEDED without also specifying MAP_COPY_ON_WRITE, and vice versa. Thus, MAP_COPY_ON_WRITE suffices. Reviewed by: David Greenman <dg@root.com>	1999-05-14 23:09:34 +00:00
alc	4e7ebf3dd0	Upgrading a map's lock to exclusive status should increment the map's timestamp. In general, whenever an exclusive lock is acquired the timestamp should be incremented.	1999-03-06 07:11:33 +00:00
alc	5149bf6666	Remove the last of the share map code: struct vm_map::is_main_map. Reviewed by: Matthew Dillon <dillon@apollo.backplane.com>	1999-03-02 05:43:18 +00:00
luoqi	082d37c1ac	Hide access to vmspace:vm_pmap with inline function vmspace_pmap(). This is the preparation step for moving pmap storage out of vmspace proper. Reviewed by: Alan Cox <alc@cs.rice.edu> Matthew Dillion <dillon@apollo.backplane.com>	1999-02-19 14:25:37 +00:00
dillon	98732ec693	Remove MAP_ENTRY_IS_A_MAP 'share' maps. These maps were once used to attempt to optimize forks but were essentially given-up on due to problems and replaced with an explicit dup of the vm_map_entry structure. Prior to the removal, they were entirely unused.	1999-02-07 21:48:23 +00:00
julian	4b7738dba1	Mostly remove the VM_STACK OPTION. This changes the definitions of a few items so that structures are the same whether or not the option itself is enabled. This allows people to enable and disable the option without recompilng the world. As the author says: \|I ran into a problem pulling out the VM_STACK option. I was aware of this \|when I first did the work, but then forgot about it. The VM_STACK stuff \|has some code changes in the i386 branch. There need to be corresponding \|changes in the alpha branch before it can come out completely. what is done: \| \|1) Pull the VM_STACK option out of the header files it appears in. This \|really shouldn't affect anything that executes with or without the rest \|of the VM_STACK patches. The vm_map_entry will then always have one \|extra element (avail_ssize). It just won't be used if the VM_STACK \|option is not turned on. \| \|I've also pulled the option out of vm_map.c. This shouldn't harm anything, \|since the routines that are enabled as a result are not called unless \|the VM_STACK option is enabled elsewhere. \| \|2) Add what appears to be appropriate code the the alpha branch, still \|protected behind the VM_STACK switch. I don't have an alpha machine, \|so we would need to get some testers with alpha machines to try it out. \| \|Once there is some testing, we can consider making the change permanent \|for both i386 and alpha. \| [..] \| \|Once the alpha code is adequately tested, we can pull VM_STACK out \|everywhere. \| Submitted by: "Richard Seaman, Jr." <dick@tar.com>	1999-01-26 02:49:52 +00:00
julian	4666ac5027	Add (but don't activate) code for a special VM option to make downward growing stacks more general. Add (but don't activate) code to use the new stack facility when running threads, (specifically the linux threads support). This allows people to use both linux compiled linuxthreads, and also the native FreeBSD linux-threads port. The code is conditional on VM_STACK. Not using this will produce the old heavily tested system. Submitted by: Richard Seaman <dick@tar.com>	1999-01-06 23:05:42 +00:00
dyson	197bd655c4	VM level code cleanups. 1) Start using TSM. Struct procs continue to point to upages structure, after being freed. Struct vmspace continues to point to pte object and kva space for kstack. u_map is now superfluous. 2) vm_map's don't need to be reference counted. They always exist either in the kernel or in a vmspace. The vmspaces are managed by reference counts. 3) Remove the "wired" vm_map nonsense. 4) No need to keep a cache of kernel stack kva's. 5) Get rid of strange looking ++var, and change to var++. 6) Change more data structures to use our "zone" allocator. Added struct proc, struct vmspace and struct vnode. This saves a significant amount of kva space and physical memory. Additionally, this enables TSM for the zone managed memory. 7) Keep ioopt disabled for now. 8) Remove the now bogus "single use" map concept. 9) Use generation counts or id's for data structures residing in TSM, where it allows us to avoid unneeded restart overhead during traversals, where blocking might occur. 10) Account better for memory deficits, so the pageout daemon will be able to make enough memory available (experimental.) 11) Fix some vnode locking problems. (From Tor, I think.) 12) Add a check in ufs_lookup, to avoid lots of unneeded calls to bcmp. (experimental.) 13) Significantly shrink, cleanup, and make slightly faster the vm_fault.c code. Use generation counts, get rid of unneded collpase operations, and clean up the cluster code. 14) Make vm_zone more suitable for TSM. This commit is partially as a result of discussions and contributions from other people, including DG, Tor Egge, PHK, and probably others that I have forgotten to attribute (so let me know, if I forgot.) This is not the infamous, final cleanup of the vnode stuff, but a necessary step. Vnode mgmt should be correct, but things might still change, and there is still some missing stuff (like ioopt, and physical backing of non-merged cache files, debugging of layering concepts.)	1998-01-22 17:30:44 +00:00
dyson	b130b30c96	Tie up some loose ends in vnode/object management. Remove an unneeded config option in pmap. Fix a problem with faulting in pages. Clean-up some loose ends in swap pager memory management. The system should be much more stable, but all subtile bugs aren't fixed yet.	1998-01-17 09:17:02 +00:00
dyson	cb2800cd94	Make our v_usecount vnode reference count work identically to the original BSD code. The association between the vnode and the vm_object no longer includes reference counts. The major difference is that vm_object's are no longer freed gratuitiously from the vnode, and so once an object is created for the vnode, it will last as long as the vnode does. When a vnode object reference count is incremented, then the underlying vnode reference count is incremented also. The two "objects" are now more intimately related, and so the interactions are now much less complex. When vnodes are now normally placed onto the free queue with an object still attached. The rundown of the object happens at vnode rundown time, and happens with exactly the same filesystem semantics of the original VFS code. There is absolutely no need for vnode_pager_uncache and other travesties like that anymore. A side-effect of these changes is that SMP locking should be much simpler, the I/O copyin/copyout optimizations work, NFS should be more ponderable, and further work on layered filesystems should be less frustrating, because of the totally coherent management of the vnode objects and vnodes. Please be careful with your system while running this code, but I would greatly appreciate feedback as soon a reasonably possible.	1998-01-06 05:26:17 +00:00
dyson	6bd1f74dcf	Some performance improvements, and code cleanups (including changing our expensive OFF_TO_IDX to btoc whenever possible.)	1997-12-19 09:03:37 +00:00
dyson	cc823b6e73	Fix kern_lock so that it will work. Additionally, clean-up some of the VM systems usage of the kernel lock (lockmgr) code. This is a first pass implementation, and is expected to evolve as needed. The API for the lock manager code has not changed, but the underlying implementation has changed significantly. This change should not materially affect our current SMP or UP code without non-standard parameters being used.	1997-08-18 02:06:35 +00:00
dyson	8fa8ae3d0d	Get rid of the ad-hoc memory allocator for vm_map_entries, in lieu of a simple, clean zone type allocator. This new allocator will also be used for machine dependent pmap PV entries.	1997-08-05 00:02:08 +00:00
peter	ecf50a7463	The biggie: Get rid of the UPAGES from the top of the per-process address space. (!) Have each process use the kernel stack and pcb in the kvm space. Since the stacks are at a different address, we cannot copy the stack at fork() and allow the child to return up through the function call tree to return to user mode - create a new execution context and have the new process begin executing from cpu_switch() and go to user mode directly. In theory this should speed up fork a bit. Context switch the tss_esp0 pointer in the common tss. This is a lot simpler since than swithching the gdt[GPROC0_SEL].sd.sd_base pointer to each process's tss since the esp0 pointer is a 32 bit pointer, and the sd_base setting is split into three different bit sections at non-aligned boundaries and requires a lot of twiddling to reset. The 8K of memory at the top of the process space is now empty, and unmapped (and unmappable, it's higher than VM_MAXUSER_ADDRESS). Simplity the pmap code to manage process contexts, we no longer have to double map the UPAGES, this simplifies and should measuably speed up fork(). The following parts came from John Dyson: Set PG_G on the UPAGES that are now in kernel context, and invalidate them when swapping them out. Move the upages object (upobj) from the vmspace to the proc structure. Now that the UPAGES (pcb and kernel stack) are out of user space, make rfork(..RFMEM..) do what was intended by sharing the vmspace entirely via reference counting rather than simply inheriting the mappings.	1997-04-07 07:16:06 +00:00
dyson	22d3427970	Fix the gdb executable modify problem. Thanks to the detective work by Alan Cox <alc@cs.rice.edu>, and his description of the problem. The bug was primarily in procfs_mem, but the mistake likely happened due to the lack of vm system support for the operation. I added better support for selective marking of page dirty flags so that vm_map_pageable(wiring) will not cause this problem again. The code in procfs_mem is now less bogus (but maybe still a little so.)	1997-04-06 02:29:45 +00:00
peter	94b6d72794	Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not ready for it yet.	1997-02-22 09:48:43 +00:00
bde	6a229a98d5	Removed vestiges of Mach lock types. vm_map.h: Removed #include of <sys/proc.h>. curproc is only used in some macros and users of the macros already include <sys/proc.h>.	1997-02-18 14:07:03 +00:00
dyson	10f666af84	This is the kernel Lite/2 commit. There are some requisite userland changes, so don't expect to be able to run the kernel as-is (very well) without the appropriate Lite/2 userland changes. The system boots and can mount UFS filesystems. Untested: ext2fs, msdosfs, NFS Known problems: Incorrect Berkeley ID strings in some files. Mount_std mounts will not work until the getfsent library routine is changed. Reviewed by: various people Submitted by: Jeffery Hsu <hsu@freebsd.org>	1997-02-10 02:22:35 +00:00
dyson	52f682b582	Change the map entry flags from bitfields to bitmasks. Allows for some code simplification.	1997-01-16 04:16:22 +00:00
jkh	808a36ef65	Make the long-awaited change from $Id$ to $FreeBSD$ This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long. Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.	1997-01-14 07:20:47 +00:00
dyson	fb5d9384f8	Eliminate the redundancy due to the similarity between the routines vm_map_simplify and vm_map_simplify_entry. Make vm_map_simplify_entry handle wired maps so that we can get rid of vm_map_simplify. Modify the callers of vm_map_simplify to properly use vm_map_simplify_entry. Submitted by: Alan Cox <alc@cs.rice.edu>	1996-12-28 23:07:49 +00:00
dyson	765e5fd282	Implement closer-to POSIX mlock semantics. The major difference is that we do allow mlock to span unallocated regions (of course, not mlocking them.) We also allow mlocking of RO regions (which the old code couldn't.) The restriction there is that once a RO region is wired (mlocked), it cannot be debugged (or EVER written to.) Under normal usage, the new mlock code will be a significant improvement over our old stuff.	1996-12-14 17:54:17 +00:00
dyson	468189da8d	Make vm_map_insert much more intelligent in the MAP_NOFAULT case so that map entries are coalesced when appropriate. Also, conditionalize some code that is currently not used in vm_map_insert. This mod has been added to eliminate unnecessary map entries in buffer map. Additionally, there were some cases where map coalescing could be done when it shouldn't. That problem has been resolved.	1996-12-07 00:03:43 +00:00
dyson	7a58275f33	Implement a new totally dynamic (up to MAXPHYS) buffer kva allocation scheme. Additionally, add the capability for checking for unexpected kernel page faults. The maximum amount of kva space for buffers hasn't been decreased from where it is, but it will now be possible to do so. This scheme manages the kva space similar to the buffers themselves. If there isn't enough kva space because of usage or fragementation, buffers will be reclaimed until a buffer allocation is successful. This scheme should be very resistant to fragmentation problems until/if the LFS code is fixed and uses the bogus buffer locking scheme -- but a 'fixed' LFS is not likely to use such a scheme. Now there should be NO problem allocating buffers up to MAXPHYS.	1996-11-30 22:41:49 +00:00
dyson	01ce9d323a	Backed out the recent changes/enhancements to the VM code. The problem with the 'shell scripts' was found, but there was a 'strange' problem found with a 486 laptop that we could not find. This commit backs the code back to 25-jul, and will be re-entered after the snapshot in smaller (more easily tested) chunks.	1996-07-30 03:08:57 +00:00
dyson	293abd3564	This commit is meant to solve a couple of VM system problems or performance issues. 1) The pmap module has had too many inlines, and so the object file is simply bigger than it needs to be. Some common code is also merged into subroutines. 2) Removal of some evil PHYS_TO_VM_PAGE macro calls. Unfortunately, a few have needed to be added also. The removal caused the need for more vm_page_lookups. I added lookup hints to minimize the need for the page table lookup operations. 3) Removal of some bogus performance improvements, that mostly made the code more complex (tracking individual page table page updates unnecessarily). Those improvements actually hurt 386 processors perf (not that people who worry about perf use 386 processors anymore :-)). 4) Changed pv queue manipulations/structures to be TAILQ's. 5) The pv queue code has had some performance problems since day one. Some significant scalability issues are resolved by threading the pv entries from the pmap AND the physical address instead of just the physical address. This makes certain pmap operations run much faster. This does not affect most micro-benchmarks, but should help loaded system performance significantly. DG helped and came up with most of the solution for this one. 6) Most if not all pmap bit operations follow the pattern: pmap_test_bit(); pmap_clear_bit(); That made for twice the necessary pv list traversal. The pmap interface now supports only pmap_tc_bit type operations: pmap_[test/clear]_modified, pmap_[test/clear]_referenced. Additionally, the modified routine now takes a vm_page_t arg instead of a phys address. This eliminates a PHYS_TO_VM_PAGE operation. 7) Several rewrites of routines that contain redundant code to use common routines, so that there is a greater likelihood of keeping the cache footprint smaller.	1996-07-27 03:24:10 +00:00

1 2

63 Commits