freebsd-skq

Author	SHA1	Message	Date
peter	d247469429	Add brackets to silence egcs and help clarity.	1999-05-06 22:06:45 +00:00
phk	f57a01ebfc	remove b_proc from struct buf, it's (now) unused. Reviewed by: dillon, bde	1999-05-06 20:00:34 +00:00
luoqi	148df5d8ab	Don't ignore mmap() address hint below the text section.	1999-05-06 00:46:19 +00:00
billf	dd35516544	Add sysctl descriptions to many SYSCTL_XXXs PR: kern/11197 Submitted by: Adrian Chadd <adrian@FreeBSD.org> Reviewed by: billf(spelling/style/minor nits) Looked at by: bde(style)	1999-05-03 23:57:32 +00:00
alc	5cb08a2652	The VFS/BIO subsystem contained a number of hacks in order to optimize piecemeal, middle-of-file writes for NFS. These hacks have caused no end of trouble, especially when combined with mmap(). I've removed them. Instead, NFS will issue a read-before-write to fully instantiate the struct buf containing the write. NFS does, however, optimize piecemeal appends to files. For most common file operations, you will not notice the difference. The sole remaining fragment in the VFS/BIO system is b_dirtyoff/end, which NFS uses to avoid cache coherency issues with read-merge-write style operations. NFS also optimizes the write-covers-entire-buffer case by avoiding the read-before-write. There is quite a bit of room for further optimization in these areas. The VM system marks pages fully-valid (AKA vm_page_t->valid = VM_PAGE_BITS_ALL) in several places, most notably in vm_fault. This is not correct operation. The vm_pager_get_pages() code is now responsible for marking VM pages all-valid. A number of VM helper routines have been added to aid in zeroing-out the invalid portions of a VM page prior to the page being marked all-valid. This operation is necessary to properly support mmap(). The zeroing occurs most often when dealing with file-EOF situations. Several bugs have been fixed in the NFS subsystem, including bits handling file and directory EOF situations and buf->b_flags consistency issues relating to clearing B_ERROR & B_INVAL, and handling B_DONE. getblk() and allocbuf() have been rewritten. B_CACHE operation is now formally defined in comments and more straightforward in implementation. B_CACHE for VMIO buffers is based on the validity of the backing store. B_CACHE for non-VMIO buffers is based simply on whether the buffer is B_INVAL or not (B_CACHE set if B_INVAL clear, and vice versa). biodone() is now responsible for setting B_CACHE when a successful read completes. B_CACHE is also set when a bdwrite() is initiated and when a bwrite() is initiated. VFS VOP_BWRITE routines (there are only two - nfs_bwrite() and bwrite()) are now expected to set B_CACHE. This means that bowrite() and bawrite() also set B_CACHE indirectly. There are a number of places in the code which were previously using buf->b_bufsize (which is DEV_BSIZE aligned) when they should have been using buf->b_bcount. These have been fixed. getblk() now clears B_DONE on return because the rest of the system is so bad about dealing with B_DONE. Major fixes to NFS/TCP have been made. A server-side bug could cause requests to be lost by the server due to nfs_realign() overwriting other rpc's in the same TCP mbuf chain. The server's kernel must be recompiled to get the benefit of the fixes. Submitted by: Matthew Dillon <dillon@apollo.backplane.com>	1999-05-02 23:57:16 +00:00
dt	ba8c622703	s/static foo_devsw_installed = 0;/static int foo_devsw_installed;/. (Edited automatically)	1999-04-28 10:54:24 +00:00
phk	16e3fbd2c1	Suser() simplification: 1: s/suser/suser_xxx/ 2: Add new function: suser(struct proc *), prototyped in <sys/proc.h>. 3: s/suser_xxx(\([a-zA-Z0-9_]*\)->p_ucred, \&\1->p_acflag)/suser(\1)/ The remaining suser_xxx() calls will be scrutinized and dealt with later. There may be some unneeded #include <sys/cred.h>, but they are left as an exercise for Bruce. More changes to the suser() API will come along with the "jail" code.	1999-04-27 11:18:52 +00:00
dt	9efe75f7a6	Make pmap_collect() an official pmap interface.	1999-04-23 20:29:58 +00:00
peter	a74bdeb7d1	unifdef -DVM_STACK - it's been on for a while for x86 and was checked and appeared to be working for the Alpha some time ago.	1999-04-19 14:14:14 +00:00
peter	dfdcc62332	Move the declaration of faultin() from the vm headers to proc.h, since it is now referenced from a macro there (PHOLD()).	1999-04-13 19:17:15 +00:00
eivind	29478a96d3	Staticize	1999-04-11 02:16:27 +00:00
dt	80578d3e92	Convert usage of vm_page_bits() to the new convention ("Inputs are required to range within a page").	1999-04-10 20:52:11 +00:00
eivind	b790d856f9	Lock vnode correctly for VOP_OPEN. Discussed with: alc, dillon	1999-04-10 17:54:43 +00:00
peter	100f4abd46	Don't forcibly kill processes that are locked in-core via PHOLD - it was just checking P_NOSWAP before.	1999-04-06 03:14:56 +00:00
peter	bc820937dc	Only use p->p_lock (managed by PHOLD()/PRELE()) - P_NOSWAP/P_PHYSIO is no longer set.	1999-04-06 03:11:34 +00:00
julian	0ed09d2ad5	Catch a case spotted by Tor where files mmapped could leave garbage in the unallocated parts of the last page when the file ended on a frag but not a page boundary. Delimited by tags PRE_MATT_MMAP_EOF and POST_MATT_MMAP_EOF, in files alpha/alpha/pmap.c i386/i386/pmap.c nfs/nfs_bio.c vm/pmap.h vm/vm_page.c vm/vm_page.h vm/vnode_pager.c miscfs/specfs/spec_vnops.c ufs/ufs/ufs_readwrite.c kern/vfs_bio.c Submitted by: Matt Dillon <dillon@freebsd.org> Reviewed by: Alan Cox <alc@freebsd.org>	1999-04-05 19:38:30 +00:00
alc	ad1fbba2a9	Two changes to vm_map_delete: 1. Don't bother checking object->ref_count == 1 in order to set OBJ_ONEMAPPING. It's a waste of time. If object->ref_count == 1, vm_map_entry_delete will "run-down" the object and its pages. 2. If object->ref_count == 1, ignore OBJ_ONEMAPPING. Wait for vm_map_entry_delete to "run-down" the object and its pages. Otherwise, we're calling two different procedures to delete the object's pages. Note: "vmstat -s" will once again show a non-zero value for "pages freed by exiting processes".	1999-04-04 07:11:02 +00:00
alc	a976359db5	Mainly, eliminate the comments about share maps. (We don't have share maps any more.) Also, eliminate an incorrect comment that says that we don't coalesce vm_map_entry's. (We do.)	1999-03-27 23:46:04 +00:00
eivind	fdc0436c85	Correct a comment.	1999-03-27 02:39:01 +00:00
alc	9b15de3986	Two changes: Remove more (redundant) map timestamp increments from properly synchronized routines. (Changed: vm_map_entry_link, vm_map_entry_unlink, and vm_map_pageable.) Micro-optimize vm_map_entry_link and vm_map_entry_unlink, eliminating unnecessary dereferences. At the same time, converted them from macros to inline functions.	1999-03-21 23:37:00 +00:00
alc	4bdf1d66da	Construct the free queue(s) in descending order (by physical address) so that the first 16MB of physical memory is allocated last rather than first. On large-memory machines, this avoids the exhaustion of low physical memory before isa_dmainit has run.	1999-03-19 05:21:03 +00:00
alc	57d921a394	Correct a problem in kmem_malloc: A kmem_malloc allowing "wait" may block (VM_WAIT) holding the map lock. This is bad. For example, a subsequent kmem_malloc by an interrupt handler on the same map may find the lock held and panic in the lockmgr.	1999-03-16 07:39:07 +00:00
alc	8baf85480b	Two changes: In general, vm_map_simplify_entry should be performed INSIDE the loop that traverses the map, not outside. (Changed: vm_map_inherit, vm_map_pageable.) vm_fault_unwire doesn't acquire the map lock (or block holding it). Thus, vm_map_set/clear_recursive shouldn't be called. (Changed: vm_map_user_pageable, vm_map_pageable.)	1999-03-15 06:24:52 +00:00
julian	ec27b516c8	Fix breakage in last commit Submitted by: Brian Feldman <green@unixhelp.org>	1999-03-15 05:09:48 +00:00
julian	8ad9ed65a4	A bit of a hack, but allows the vn device to be a module again. Submitted by: Matt Dillon <dillon@freebsd.org>	1999-03-14 20:40:15 +00:00
julian	0c3f3973d2	Submitted by: Matt Dillon <dillon@freebsd.org> The old VN device broke in -4.x when the definition of B_PAGING changed. This patch fixes this plus implements additional capabilities. The new VN device can be backed by a file ( as per normal ), or it can be directly backed by swap. Due to dependencies in VM include files (on opt_xxx options) the new vn device cannot be a module yet. This will be fixed in a later commit. This commit delimited by tags {PRE,POST}_MATT_VNDEV	1999-03-14 09:20:01 +00:00
alc	aa8bb4e29a	Correct two optimization errors in vm_object_page_remove: 1. The size of vm_object::memq is vm_object::resident_page_count, not vm_object::size. 2. The "size > 4" test sometimes results in the traversal of a ~1000 page memq in order to locate ~10 pages.	1999-03-14 06:36:00 +00:00
alc	2d75a5cc4c	Remove vm_page_frees from kmem_malloc that are performed by vm_map_delete/vm_object_page_remove anyway.	1999-03-12 08:05:49 +00:00
julian	4726cfcda9	Stop the mfs from trying to swap out crucial bits of the mfs as this can lead to deadlock. Submitted by: Matt Dillon <dillon@freebsd.org>	1999-03-12 00:44:03 +00:00
alc	143686d0c8	Remove (redundant) map timestamp increments from some properly synchronized routines.	1999-03-09 08:00:17 +00:00
alc	118d31f1dd	Remove an unused variable from vmspace_fork.	1999-03-08 03:53:07 +00:00
alc	65b8ae0944	Change vm_map_growstack to acquire and hold a read lock (instead of a write lock) until it actually needs to modify the vm_map. Note: it is legal to modify vm_map::hint without holding a write lock. Submitted by: "Richard Seaman, Jr." <dick@tar.com> with minor changes by myself.	1999-03-07 21:25:42 +00:00
alc	4e7ebf3dd0	Upgrading a map's lock to exclusive status should increment the map's timestamp. In general, whenever an exclusive lock is acquired the timestamp should be incremented.	1999-03-06 07:11:33 +00:00
alc	9e3d479b9d	To avoid a conflict for the vm_map's lock with vm_fault, release the read lock around the subyte operations in mincore. After the lock is reacquired, use the map's timestamp to determine if we need to restart the scan.	1999-03-02 22:55:02 +00:00
alc	5149bf6666	Remove the last of the share map code: struct vm_map::is_main_map. Reviewed by: Matthew Dillon <dillon@apollo.backplane.com>	1999-03-02 05:43:18 +00:00
alc	4d728cf4b3	mincore doesn't modify the vm_map. Therefore, it doesn't require an exclusive lock. A read lock will suffice.	1999-03-01 20:42:16 +00:00
alc	440397c816	Reviewed by: "John S. Dyson" <dyson@iquest.net> Submitted by: Matthew Dillon <dillon@apollo.backplane.com> To prevent a deadlock, if we are extremely low on memory, force synchronous operation by the VOP_PUTPAGES in vnode_pager_putpages.	1999-02-27 23:39:28 +00:00
alc	11ba367bdf	Reviewed by: Matthew Dillon <dillon@apollo.backplane.com> Corrected the computation of cnt.v_ozfod in vm_fault: vm_fault was counting the number of unoptimized rather than optimized zero-fill faults.	1999-02-25 06:00:52 +00:00
dillon	25079ce905	Comment swstrategy() routine.	1999-02-25 05:37:18 +00:00
dillon	3ad14cacdc	Remove unnecessary page protects on map_split and collapse operations. Fix bug where an object's OBJ_WRITEABLE/OBJ_MIGHTBEDIRTY flags do not get set under certain circumstances ( page rename case ). Reviewed by: Alan Cox <alc@cs.rice.edu>, John Dyson	1999-02-24 21:26:26 +00:00
dillon	9f7c64c6ce	Removed ENOMEM error on swap_pager_full condition which ignored the availability of physical memory. As per original bug report by Bruce. Reviewed by: Alan Cox <alc@cs.rice.edu>	1999-02-22 08:42:16 +00:00
dillon	fa06d8f968	Remove conditional sysctl's Leave swap_async_max sysctl intact, remove swap_cluster_max sysctl. Reviewed by: Alan Cox <alc@cs.rice.edu>	1999-02-21 08:34:15 +00:00
dillon	338a9c530d	Reviewed by: Alan Cox <alc@cs.rice.edu> Fix problem w/ low-swap/low-memory handling as reported by Bruce Evans.	1999-02-21 08:30:49 +00:00
luoqi	e0559c2622	Eliminate a possible numerical overflow.	1999-02-19 19:14:48 +00:00
luoqi	082d37c1ac	Hide access to vmspace:vm_pmap with inline function vmspace_pmap(). This is the preparation step for moving pmap storage out of vmspace proper. Reviewed by: Alan Cox <alc@cs.rice.edu> Matthew Dillon <dillon@apollo.backplane.com>	1999-02-19 14:25:37 +00:00
dillon	c950305edd	Submitted by: Alan Cox <alc@cs.rice.edu> Remove remaining share map garbage from vm_map_lookup() and clean out old #if 0 stuff.	1999-02-19 03:11:37 +00:00
dillon	03d8f2679e	Limit number of simultaneous asynchronous swap pager I/Os that can be in progress at any given moment. Add two swap tunables to sysctl: vm.swap_async_max: 4 vm.swap_cluster_max: 16 Recommended values are a cluster size of 8 or 16 pages. async_max is about right for 1-4 swap devices. Reduce to 2 if swap is eating too much bandwidth, or even 1 if swap is both eating too much bandwidth and sitting on a slow network (10BaseT). The defaults work well across a broad range of configurations and should normally be left alone.	1999-02-18 19:57:33 +00:00
dillon	e819b6214d	Submitted by: Luoqi Chen <luoqi@watermarkgroup.com> Unlock vnode before messing with map to avoid deadlock between map and vnode ( e.g. with exec_map and underlying program binary vnode ). Solves a deadlock that most often occurs during a large -j# buildworld reported by three people.	1999-02-17 09:08:29 +00:00
dillon	1fbfe938f2	Minor reorganization of vm_page_alloc(). No functional changes have been made but the code has been reorganized and documented to make it more readable, reduce the size of the code, and optimize the branch path caching capabilities that most modern processors have.	1999-02-15 06:52:14 +00:00
dillon	0a57647424	Fix a bug in the new madvise() code that would possibly (improperly) free swap space out from under a busy page. This is not legal because the swap may be reallocated and I/O issued while I/O is still in progress on the same swap page from the madvise()'d object. This bug could only occur under extreme paging conditions but might not cause an error until much later. As a side-benefit, madvise() is now even smaller.	1999-02-15 02:03:40 +00:00
dillon	aadfe1d833	Minor optimization to madvise() MADV_FREE to make page as freeable as possible without actually unmapping it from the process. As of now, I declare madvise() on OBJT_DEFAULT/OBJT_SWAP objects to be 'working and complete'.	1999-02-12 20:42:19 +00:00
dillon	e38d19126b	Fix non-fatal bug in vm_map_insert() which improperly cleared OBJ_ONEMAPPING in the case where an object is extended but an additional vm_map_entry must be allocated. In vm_object_madvise(), remove call to vm_page_cache() in MADV_FREE case in order to avoid a page fault on page reuse. However, we still mark the page as clean and destroy any swap backing store. Submitted by: Alan Cox <alc@cs.rice.edu>	1999-02-12 09:51:43 +00:00
dillon	76e195c050	Addendum to vm_map coalesce optimization. Also, this was backed-out because there was a consensus on current in regards to leaving bss r+w+x instead of r+w. This is in order to maintain reasonable compatibility with existing JIT compilers (e.g. kaffe) and possibly other programs.	1999-02-09 01:39:29 +00:00
dillon	139adb1b8f	Revamp vm_object_[q]collapse(). Despite the complexity of this patch, no major operational changes were made. The three core object->memq loops were moved into a single inline procedure and various operational characteristics of the collapse function were documented.	1999-02-08 19:00:15 +00:00
dillon	8bdc42c0a1	General cleanup. Remove #if 0's and remove useless register qualifiers.	1999-02-08 05:15:54 +00:00
dillon	b7a0b99c31	Rip out PQ_ZERO queue. PQ_ZERO functionality is now combined in with PQ_FREE. There is little operational difference other than the kernel being a few kilobytes smaller and the code being more readable. * vm_page_select_free() has been greatly simplified. * The PQ_ZERO page queue and supporting structures have been removed * vm_page_zero_idle() revamped (see below) PG_ZERO setting and clearing has been migrated from vm_page_alloc() to vm_page_free[_zero]() and will eventually be guaranteed to remain tracked throughout a page's life ( if it isn't already ). When a page is freed, PG_ZERO pages are appended to the appropriate tailq in the PQ_FREE queue while non-PG_ZERO pages are prepended. When locating a new free page, PG_ZERO selection operates from within vm_page_list_find() ( get page from end of queue instead of beginning of queue ) and then only occurs in the nominal critical path case. If the nominal case misses, both normal and zero-page allocation devolves into the same _vm_page_list_find() select code without any specific zero-page optimizations. Additionally, vm_page_zero_idle() has been revamped. Hysteresis has been added and zero-page tracking adjusted to conform with the other changes. Currently hysteresis is set at 1/3 (lo) and 1/2 (hi) the number of free pages. We may wish to increase both parameters as time permits. The hysteresis is designed to avoid silly zeroing in borderline allocation/free situations.	1999-02-08 00:37:36 +00:00
dillon	3e732af5ea	Backed out vm_map coalesce optimization - it resulted in 22% more page faults for reasons unknown ( under investigation ). /usr/bin/time -l make in /usr/src/bin went from 67000 faults to 90000 faults.	1999-02-08 00:27:56 +00:00
dillon	98732ec693	Remove MAP_ENTRY_IS_A_MAP 'share' maps. These maps were once used to attempt to optimize forks but were essentially given-up on due to problems and replaced with an explicit dup of the vm_map_entry structure. Prior to the removal, they were entirely unused.	1999-02-07 21:48:23 +00:00
dillon	08bf7e9a93	Remove L1 cache coloring optimization ( leave L2 cache coloring opt ). Rewrite vm_page_list_find() and vm_page_select_free() - make inline out of nominal case.	1999-02-07 20:45:15 +00:00
dillon	7a0029ee87	When shadowing objects, adjust the page coloring of the shadowing object such that pages in the combined/shadowed object are consistently colored. Submitted by: "John S. Dyson" <dyson@iquest.net>	1999-02-07 08:44:53 +00:00
dillon	638895c103	Add hysteresis to the 'swap_pager_getswapspace; failed' console message. Also widen the hysteresis levels a little ( these really should be dynamically configured ).	1999-02-06 07:22:21 +00:00
dillon	eb4dbd2f37	The elf loader sets the permissions on bss to VM_PROT_READ|VM_PROT_WRITE rather than VM_PROT_ALL. obreak, on the other hand, uses VM_PROT_ALL. This prevents vm_map_insert() from being able to coalesce the heap and creates an extra map entry. Since current architectures ignore VM_PROT_EXECUTE anyway, and since not having VM_PROT_EXECUTE on data/bss may provide protection in the future, obreak now uses read+write rather than all (r+w+x). This is an optimization, not a bug fix. Submitted by: Alan Cox <alc@cs.rice.edu>	1999-02-05 07:49:29 +00:00
dillon	1648ae3fae	Fix bug in a KASSERT I introduced in vm_page_qcollapse() rev 1.139. Since paging is in progress, page scan in vm_page_qcollapse() must be protected at at least splbio() to prevent pages from being ripped out from under the scan.	1999-02-04 17:47:52 +00:00
dillon	fdc78db606	Submitted by: Alan Cox The vm_map_insert()/vm_object_coalesce() optimization has been extended to include OBJT_SWAP objects as well as OBJT_DEFAULT objects. This is possible because it costs nothing to extend an OBJT_SWAP object with the new swapper. We can't do this with the old swapper. The old swapper used a linear array that would have had to have been reallocated, costing time as well as a potential low-memory deadlock.	1999-02-03 01:57:17 +00:00
dillon	56683bbe5a	This patch eliminates a pointless test from appearing twice in vm_map_simplify_entry. Basically, once you've verified that the objects in the adjacent vm_map_entry's are the same, either NULL or the same vm_object, there's no point in checking that the objects have the same behavior. Obtained from: Alan Cox <alc@cs.rice.edu>	1999-02-01 08:49:30 +00:00
julian	df7c58af81	Submitted by: Alan Cox <alc@cs.rice.edu> Checked by: "Richard Seaman, Jr." <dick@tar.com> Fix the following problem: As the code stands now, growing any stack, and not just the process's main stack, modifies vm->vm_ssize. This is inconsistent with the code earlier in the same procedure.	1999-01-31 14:09:25 +00:00
dillon	975fba8a24	Fix warnings in preparation for adding -Wall -Wcast-qual to the kernel compile	1999-01-28 00:57:57 +00:00
dillon	ca8ef4ff13	Remove unintended trigraph sequences in comments for -Wall	1999-01-27 18:19:53 +00:00
julian	4b7738dba1	Mostly remove the VM_STACK OPTION. This changes the definitions of a few items so that structures are the same whether or not the option itself is enabled. This allows people to enable and disable the option without recompiling the world. As the author says: |I ran into a problem pulling out the VM_STACK option. I was aware of this |when I first did the work, but then forgot about it. The VM_STACK stuff |has some code changes in the i386 branch. There need to be corresponding |changes in the alpha branch before it can come out completely. what is done: | |1) Pull the VM_STACK option out of the header files it appears in. This |really shouldn't affect anything that executes with or without the rest |of the VM_STACK patches. The vm_map_entry will then always have one |extra element (avail_ssize). It just won't be used if the VM_STACK |option is not turned on. | |I've also pulled the option out of vm_map.c. This shouldn't harm anything, |since the routines that are enabled as a result are not called unless |the VM_STACK option is enabled elsewhere. | |2) Add what appears to be appropriate code to the alpha branch, still |protected behind the VM_STACK switch. I don't have an alpha machine, |so we would need to get some testers with alpha machines to try it out. | |Once there is some testing, we can consider making the change permanent |for both i386 and alpha. | [..] | |Once the alpha code is adequately tested, we can pull VM_STACK out |everywhere. | Submitted by: "Richard Seaman, Jr." <dick@tar.com>	1999-01-26 02:49:52 +00:00
julian	05a2232887	Enable Linux threads support by default. This takes the conditionals out of the code that has been tested by various people for a while. ps and friends (libkvm) will need a recompile as some proc structure changes are made. Submitted by: "Richard Seaman, Jr." <dick@tar.com>	1999-01-26 02:38:12 +00:00
dillon	e06ca031d7	Undo last commit - not a bug, just duplicate code. PG_MAPPED and PG_WRITEABLE are already cleared by vm_page_protect().	1999-01-24 07:06:52 +00:00
dillon	a4c067a459	Change all manual settings of vm_page_t->dirty = VM_PAGE_BITS_ALL to use the vm_page_dirty() inline. The inline can thus do sanity checks ( or not ) over all cases.	1999-01-24 06:04:52 +00:00
dillon	facf2fdb5a	vm_map_split() used to dirty the page manually after calling vm_page_rename(), but never pulled the page off PQ_CACHE if it was on PQ_CACHE. Dirty pages in PQ_CACHE are not allowed and a KASSERT was added in -4.x to test for this... and got hit. In -4.x, vm_page_rename() automatically dirties the page. This commit also has it deal with the PQ_CACHE case, deactivating the page in that case.	1999-01-24 06:00:31 +00:00
dillon	f4215e2355	Add vm_page_dirty() inline with PQ_CACHE sanity check	1999-01-24 05:57:50 +00:00
dillon	9ed8ff237b	vm_pager_put_pages() is passed an rcval array to hold per-page return values. The 'int' return value for the procedure was never used and not well defined in any case when there are mixed errors on pages, so it has been removed. vm_pager_put_pages() and associated vm_pager functions now return void.	1999-01-24 02:32:15 +00:00
dillon	90b1bd7723	Clear PG_MAPPED as well as PG_WRITEABLE when a page is moved to the cache.	1999-01-24 02:29:26 +00:00
dillon	555a2c90ad	Added warning printf ( needs INVARIANTS ) when busy cache page is found while trying to free memory.	1999-01-24 01:33:22 +00:00
dillon	5ba2ffcf3a	It is possible for a page in the cache to be busy. vm_pageout.c was not checking for this condition while it tried to free cache pages. Fixed.	1999-01-24 01:06:31 +00:00
dillon	84015ba9cb	Add invariants to vm_page_busy() and vm_page_wakeup() to check for PG_BUSY stupidity.	1999-01-24 01:05:15 +00:00
dillon	e854781bc8	Clear PG_WRITEABLE in vm_page_cache(). This may or may not be a bug, but the bit should definitely be cleared.	1999-01-24 01:04:04 +00:00
dillon	44c2f47425	Deprecate vm_object_pmap_copy() - nobody uses it. Everyone uses vm_object_pmap_copy_1() now, apparently.	1999-01-24 01:01:38 +00:00
dillon	e554adda08	Get rid of unused old_m in vm_fault. Add INVARIANTS to test whether page is still busy after all the hell vm_fault goes through.. it is supposed to be, and printf() if it isn't. don't panic, though.	1999-01-24 00:55:04 +00:00
dillon	3a81644d73	Reenable John Dyson's low-memory VM_WAIT code for page reactivations out of PQ_CACHE. Add comments explaining what it accomplishes and its limitations.	1999-01-23 06:00:27 +00:00
dillon	be14f0e426	Mainly changes to support the new swapper. The big adjustment is that swap blocks are now in PAGE_SIZE'd increments instead of DEV_BSIZE'd increments. We still convert to DEV_BSIZE'd increments for the backing store I/O, but everything else is in PAGE_SIZE increments.	1999-01-21 10:17:12 +00:00
dillon	67eada315f	Move many of the vm_pager_*() functions from vm_pager.c to inlines in vm_pager.h	1999-01-21 10:15:47 +00:00
dillon	c11718f005	Move many of the vm_pager_*() functions from vm_pager.c to inlines in vm_pager.h Added argument to getpbuf() and relpbuf() to allow each subsystem to specify a different hard limit on the number of simultaneous physical buffers that said subsystem may allocate. Without this feature, one subsystem ( e.g. the vfs clustering code ) could hog *ALL* the pbufs, causing a deadlock in the pager in a low memory situation. Same for trypbuf().	1999-01-21 10:15:24 +00:00
dillon	58521d75ed	Reorganized some of the low memory testing code to make it more useful. Removed call to vm_object_collapse(), which can block. This was being called without the pageout code holding any sort of reference on the vm_object or vm_page_t structures being manipulated. Since this code can block, it was possible for other kernel code to shred the state the pageout code was assuming remained intact. Fixed potential blocking condition in vm_pageout_page_free() ( which could cause a deadlock in a low-memory situation ). Currently there is a hack in-place to deal with clean filesystem meta-data polluting the inactive page queue. John doesn't like the hack, and neither do I. Revamped and commented a portion of the pageout loop. Added protection against potential memory deadlocks with OBJT_VNODE when using VOP_ISLOCKED(). The problem is that vp->v_data can be NULL which causes VOP_ISLOCKED() to return a less informed answer. remove vm_pager_sync() -- none of the pagers use it any more ( the old swapper used to. The new one does not ).	1999-01-21 10:12:54 +00:00
dillon	fb433b53df	The TAILQ hashq has been turned into a singly-linked list link, reducing the size of vm_page_t. SWAPBLK_NONE and SWAPBLK_MASK are defined here. These actually are more generalized than their names imply, but their placement is somewhat of a legacy issue from a prior test version of this code that put the swapblk in the vm_page_t structure. That test code was eventually thrown away. The legacy remains. Added vm_page_flash() inline. Similar to vm_page_wakeup() except that it does not clear PG_BUSY ( one assumes that PG_BUSY is already clear ). Used by a number of routines to wakeup waiters. Collapsed some of the code in inline calls to make other inline calls. GCC will optimize this well and it reduces duplication. vm_page_free() and vm_page_free_zero() inlines added to convert to the proper vm_page_free_toq() call. vm_page_sleep_busy() inline added, replacing vm_page_sleep() ( which has been removed ). This implements a much more optimizable page-waiting function.	1999-01-21 10:06:24 +00:00
dillon	1ceb9c9b9e	The hash table used to be a table of doubly-linked list headers ( two pointers per entry ). The table has been changed to a singly linked list of vm_page_t pointers. The table has been doubled in size, but the entries only take half the space so a net-zero change in memory use. The hash function has been changed, hopefully for the better. The combination of the larger hash table size and changed function should keep the chain length down to a reasonable number (0-3, average 1). vm_object->page_hint has been removed. This 'optimization' was not only never needed, but costs as much as a hash chain link to implement. While having page_hint in vm_object might result in better locality of reference, the cost is not worth the space in vm_object or the extra instructions in my view. vm_page_alloc*() functions have been inlined and call a generalized non-inlined vm_page_alloc_toq() which combines the standard alloc and zero-page alloc functions together, reducing code size and the L1 cache footprint. Some reordering has been done... not much. The delinking code should be faster ( because unlinking a doubly-linked list requires four memory ops and unlinking a singly linked list only requires two ), and we get a hash consistency check for free. vm_page_rename() now automatically sets the page's dirty bits. vm_page_alloc() does not try to manually inline freeing a cache page. Instead, it now properly calls vm_page_free(m) ... vm_page_free() is really too complex to manually inline. vm_await(), supporting asleep(), has been added.	1999-01-21 10:01:49 +00:00
dillon	04052d89f8	The vm_object structure is now somewhat smaller due to the removal of most of the swap-pager-specific fields, the removal of the id, and the removal of paging_offset. A new inline, vm_object_pip_wakeupn() has been added to subtract an arbitrary number n from the paging_in_progress count and then wakeup waiters as necessary. n may be 0, resulting in a 'flash'.	1999-01-21 09:51:21 +00:00
dillon	3f5f4e54ca	object->id was badly implemented. It has simply been removed. object->paging_offset has been removed - it was used to optimize a single OBJT_SWAP collapse case yet introduced massive confusion throughout vm_object.c. The optimization was inconsequential except for the claim that it didn't have to allocate any memory. The optimization has been removed. madvise() has been fixed. The old madvise() could be made to operate on shared objects which is a big no-no. The new one is much more careful in what it modifies. MADV_FREE was totally broken and has now been fixed. vm_page_rename() now automatically dirties a page, so explicit dirtying of the page prior to calling vm_page_rename() has been removed.	1999-01-21 09:46:55 +00:00
dillon	5e7ceee4a0	Objects associated with raw devices are no longer counted in the VM stats total because they may contain absurd numbers ( like the size of all of physical memory if you mmap() /dev/mem ).	1999-01-21 09:41:52 +00:00
dillon	bb85a38eff	General cleanup related to the new pager. We no longer have to worry about conversions of objects to OBJT_SWAP, it is done automatically now. Replaced manually inserted code with inline calls for busy waiting on pages, which also incidentally fixes a potential PG_BUSY race due to the code not running at splvm(). vm_objects no longer have a paging_offset field ( see vm/vm_object.c )	1999-01-21 09:40:48 +00:00
dillon	8716ba4543	Potential bug fix, do not just clear PG_BUSY... call vm_page_wakeup() instead to properly handle any waiters. Added comments, added support for M_ASLEEP. Generally treat M_ flags as flags instead of constants to compare against.	1999-01-21 09:38:20 +00:00
dillon	f89474ca7b	Removed low-memory blockages at fork. This is the wrong place to put this sort of test. We need to fix the low-memory handling in general.	1999-01-21 09:36:23 +00:00
dillon	fae5210519	Mainly cleanup. Removed some inappropriate low-memory handling code and added lots of comments. Add tie-in to vm_pager ( and thus the new swapper ) to deallocate backing swap for dirtied pages on the fly.	1999-01-21 09:35:38 +00:00
dillon	e3c11331d3	The default_pager's interaction with the swap_pager has been reorganized, and the swap_pager has been completely replaced. The new swap pager uses the new blist radix-tree based bitmap allocator for low level swap allocation and deallocation. The new allocator is effectively O(5) while the old one was O(N), and the new allocator allocates all required memory at init time rather than allocating memory on the fly at run time. Swap metadata is allocated in clusters and stored in a hash table, eliminating linearly allocated structures. Many, many features have been rewritten or added. Swap space is now reallocated on the fly, providing a poor man's auto defragmentation of swap space. Swap space that is no longer needed is freed on a timely basis so no garbage collection is necessary. Swap I/O is marked B_ASYNC and NFS has been fixed to do the right thing with it, so NFS-based paging now has around 10x the performance it had before ( previously NFS enforced synchronous I/O for paging ).	1999-01-21 09:33:07 +00:00
dillon	df24433bbe	This is a rather large commit that encompasses the new swapper, changes to the VM system to support the new swapper, VM bug fixes, several VM optimizations, and some additional revamping of the VM code. The specific bug fixes will be documented with additional forced commits. This commit is somewhat rough in regards to code cleanup issues. Reviewed by: "John S. Dyson" <root@dyson.iquest.net>, "David Greenman" <dg@root.com>	1999-01-21 08:29:12 +00:00
eivind	89e1199534	KNFize, by bde.	1999-01-10 01:58:29 +00:00
eivind	a8dc66f457	Split DIAGNOSTIC -> DIAGNOSTIC, INVARIANTS, and INVARIANT_SUPPORT as discussed on -hackers. Introduce 'KASSERT(assertion, ("panic message", args))' for simple check + panic. Reviewed by: msmith	1999-01-08 17:31:30 +00:00
julian	a7b385889e	Changes to the LINUX_THREADS support to only allocate extra memory for shared signal handling when there is shared signal handling being used. This removes the main objection to making the shared signal handling a standard ability in rfork() and friends and 'unconditionalising' this code. (i.e. the allocation of an extra 328 bytes per process). Signal handling information remains in the U area until such a time as its reference count would be incremented to > 1. At that point a new struct is malloc'd and maintained in KVM so that it can be shared between the processes (threads) using it. A function to check the reference count and move the struct back to the U area when it drops back to 1 is also supplied. Signal information is therefore now swappable for all processes that are not sharing that information with other processes. This should address the concerns raised by Garrett and others. Submitted by: "Richard Seaman, Jr." <dick@tar.com>	1999-01-07 21:23:50 +00:00
julian	4666ac5027	Add (but don't activate) code for a special VM option to make downward growing stacks more general. Add (but don't activate) code to use the new stack facility when running threads, (specifically the linux threads support). This allows people to use both linux compiled linuxthreads, and also the native FreeBSD linux-threads port. The code is conditional on VM_STACK. Not using this will produce the old heavily tested system. Submitted by: Richard Seaman <dick@tar.com>	1999-01-06 23:05:42 +00:00
bde	734d13314e	Ifdefed conditionally used simplock variables.	1999-01-02 11:34:57 +00:00
dt	065be55870	Don't free swap in swap_pager_getpages(): this code probably causes the "dying daemons" problem. (I thought this code was introduced in rev.1.80, but it just relaxed the condition.) Also, kill related "suggest more swap space" warning (also introduced in 1.80). It was confusing, to say the least... Requested by: msmith Not objected by: dg	1998-12-29 22:53:51 +00:00
dillon	52a9d9d6e6	Update comments to routines in vm_page.c, most especially whether a routine can block or not as part of a general effort to carefully document blocking/non-blocking calls in the kernel.	1998-12-23 01:52:47 +00:00
julian	d718e5c06d	Fix two bogons created by 'patch(1)' in my last commit.	1998-12-19 08:23:31 +00:00
julian	61490236bc	Reviewed by: Luoqi Chen, Jordan Hubbard Submitted by: "Richard Seaman, Jr." <lists@tar.com> Obtained from: linux :-) Code to allow Linux Threads to run under FreeBSD. By default not enabled This code is dependent on the conditional COMPAT_LINUX_THREADS (suggested by Garret) This is not yet a 'real' option but will be within some number of hours.	1998-12-19 02:55:34 +00:00
dt	b35cc94e30	Don't disable mmap with large file offset.	1998-12-09 20:22:21 +00:00
archie	60d13c7a9d	The "easy" fixes for compiling the kernel -Wunused: remove unreferenced static and local variables, goto labels, and functions declared but not defined.	1998-12-07 21:58:50 +00:00
archie	982e80577d	Examine all occurrences of sprintf(), strcat(), and str[n]cpy() for possible buffer overflow problems. Replaced most sprintf()'s with snprintf(); for others cases, added terminating NUL bytes where appropriate, replaced constants like "16" with sizeof(), etc. These changes include several bug fixes, but most changes are for maintainability's sake. Any instance where it wasn't "immediately obvious" that a buffer overflow could not occur was made safer. Reviewed by: Bruce Evans <bde@zeta.org.au> Reviewed by: Matthew Dillon <dillon@apollo.backplane.com> Reviewed by: Mike Spengler <mks@networkcs.com>	1998-12-04 22:54:57 +00:00
rvb	8d21664d46	In vnode_pager_input_old, set auio.uio_procp = curproc vs auio.uio_procp = (struct proc *) 0	1998-12-04 18:39:44 +00:00
dg	3d709a2bc7	Add missing splvm protection around unqueue call. Without this, the page queues would eventually get corrupted.	1998-11-25 07:40:49 +00:00
bde	9094f7c05e	Fixed a null pointer panic in spc_free(). swap_pager_putpages() almost always causes this panic for the curproc != pageproc case. This case apparently doesn't happen in normal operation, but it happens when vm_page_alloc_contig() is called when there is a memory hogging application that hasn't already been paged out. PR: 8632 Reviewed by: info@opensound.com (Dev Mazumdar), dg Broken in: rev.1.89 (1998/02/23)	1998-11-19 06:20:42 +00:00
dg	ba58877007	Closed a small race condition between wiring/unwiring pages that involved the page's wire_count.	1998-11-11 15:07:57 +00:00
peter	73192d8050	add #include <sys/kernel.h> where it's needed by MALLOC_DEFINE()	1998-11-10 09:16:29 +00:00
dfr	b6d9e06815	* Fix a couple of places in the device pager where an address was truncated to 32 bits. * Change the calling convention of the device mmap entry point to pass a vm_offset_t instead of an int for the offset allowing devices with a larger memory map than (1<<32) to be supported on the alpha (/dev/mem is one such). These changes are required to allow the X server to mmap the various I/O regions used for device port and memory access on the alpha.	1998-11-08 12:39:07 +00:00
dg	b178f74f12	Implemented zero-copy TCP/IP extensions via sendfile(2) - send a file to a stream socket. sendfile(2) is similar to implementations in HP-UX, Linux, and other systems, but the API is more extensive and addresses many of the complaints that the Apache Group and others have had with those other implementations. Thanks to Marc Slemko of the Apache Group for helping me work out the best API for this. Anyway, this has the "net" result of speeding up sends of files over TCP/IP sockets by about 10X (that is to say, uses 1/10th of the CPU cycles) when compared to a traditional read/write loop.	1998-11-05 14:28:26 +00:00
peter	e5c6a4fa5e	Add John Dyson's SYSCTL descriptions, and an export of more stats to a sysctl hierarchy (vm.stats.*). SYSCTL descriptions are only present in source, they do not get compiled into the binaries taking up memory.	1998-10-31 17:21:31 +00:00
peter	8ef35acf90	Use TAILQ macros for clean/dirty block list processing. Set b_xflags rather than abusing the list next pointer with a magic number.	1998-10-31 15:31:29 +00:00
dg	304c46fa2c	Fixed wrong comments in and about vm_page_deactivate().	1998-10-28 13:41:43 +00:00
dg	20b2c33d9a	Added a second argument, "activate" to the vm_page_unwire() call so that the caller can select either inactive or active queue to put the page on.	1998-10-28 13:37:02 +00:00
dg	7850189506	Added needed splvm() protection around object page traversal in vm_object_terminate().	1998-10-27 13:22:51 +00:00
bde	9fafc47653	Don't follow null bdevsw pointers. The `major(dev) < nblkdev' test rotted when bdevsw[] became sparse. We still depend on magic to avoid having to check that (v_rdev) device numbers in vnodes are not NODEV. Removed a redundant `major(dev) < nblkdev' test instead of updating it. Don't follow a garbage bdevsw pointer for attempts to swap on empty regular files. This case currently can't happen. Swapping on regular files is ifdefed out in swapon() and isn't attempted for empty files in nfs_mountroot().	1998-10-25 19:24:04 +00:00
phk	13c66194f4	Nitpicking and dusting performed on a train. Removes trivial warnings about unused variables, labels and other lint.	1998-10-25 17:44:59 +00:00
dg	b898ae170b	Oops, revert part of last fix. vm_pager_dealloc() can't be called until after the pages are removed from the object...so fix the problem by not printing the diagnostic for wired fictitious pages (which is normal).	1998-10-23 05:43:13 +00:00
dg	599836ef43	Fixed two bugs in recent commit: in vm_object_terminate, vm_pager_dealloc needs to be called prior to freeing remaining pages in the object so that the device pager has an opportunity to grab its "fake" pages. Also, in the case of wired pages, the page must be made busy prior to calling vm_page_remove. This is a difference from 2.2.x that I overlooked when I brought these changes forward.	1998-10-23 05:25:49 +00:00
dg	b8a68d9fd9	Make the VM system handle the case where a terminating object contains legitimately wired pages. Currently we print a diagnostic when this happens, but this will be removed soon when it will be common for this to occur with zero-copy TCP/IP buffers.	1998-10-22 02:16:53 +00:00
dg	e51a9e30ea	Convert fake page allocs to use the zone allocator, thus eliminating the private pool management code in here.	1998-10-22 01:45:29 +00:00
dg	268ea3fc13	Set m->object to NULL in dev_pager_getfake().	1998-10-21 23:06:50 +00:00
dg	92891f8e3d	Nuked PG_TABLED flag. Replaced with m->object != NULL.	1998-10-21 14:46:42 +00:00
dg	bbfdc21592	Add a diagnostic printf for freeing a wired page. This will eventually be turned into a panic, but I want to make sure that all cases of freeing pages with wire_count==1 (which is/was allowed) have first been fixed.	1998-10-21 11:43:04 +00:00
dg	3defb6d13f	Fixed two potentially serious classes of bugs: 1) The vnode pager wasn't properly tracking the file size due to "size" being page rounded in some cases and not in others. This sometimes resulted in corrupted files. First noticed by Terry Lambert. Fixed by changing the "size" pager_alloc parameter to be a 64bit byte value (as opposed to a 32bit page index) and changing the pagers and their callers to deal with this properly. 2) Fixed a bogus type cast in round_page() and trunc_page() that caused some 64bit offsets and sizes to be scrambled. Removing the cast required adding casts at a few dozen callers. There may be problems with other bogus casts in close-by macros. A quick check seemed to indicate that those were okay, however.	1998-10-13 08:24:45 +00:00
jdp	2846983609	Fix a panic on SMP systems, caused by sleeping while holding a simple-lock. The reviewer raises the following caveat: "I believe these changes open a non-critical race condition when adding memory to the pool for the zone. I think what will happen is that you could have two threads that are simultaneously adding additional memory when the pool runs out. This appears to not be a problem, however, since the re-aquisition of the lock will protect the list pointers." The submitter agrees that the race is non-critical, and points out that it already existed for the non-SMP case. He suggests that perhaps a sleep lock (using the lock manager) should be used to close that race. This might be worth revisiting after 3.0 is released. Reviewed by: dg (David Greenman) Submitted by: tegge (Tor Egge)	1998-10-09 00:24:49 +00:00
jdp	317967a273	Fix a bug in which a page index was used where a byte offset was expected. This bug caused builds of Modula-3 to fail in mysterious ways on SMP kernels. More precisely, such builds failed on systems with kern.fast_vfork equal to 0, the default and only supported value for SMP kernels. PR: kern/7468 Submitted by: tegge (Tor Egge)	1998-10-01 20:46:41 +00:00
abial	121218d024	Make #define NO_SWAPPING a normal kernel config option. Reviewed by: jkh	1998-09-29 17:33:59 +00:00
rvb	32f1573bbe	John Dyson approved of this solution; make vnode_pager_input_old set m->valid	1998-09-28 23:58:10 +00:00
dg	dc15100c5d	Be more selective about when we clear p->valid. Submitted by: John Dyson <toor@dyson.iquest.net>	1998-09-28 02:40:11 +00:00
bde	4d4fe42f59	Removed unused file.	1998-09-20 06:28:10 +00:00
bde	a84a2dedfc	Instantiate `nfs_mount_type' in a standard file so that it is present when nfs is an LKM. Declare it in a header file. Don't forget to use it in non-Lite2 code. Initialize it to -1 instead of to 0, since 0 will soon be the mount type number for the first vfs loaded. NetBSD uses strcmp() to avoid this ugly global.	1998-09-05 15:17:34 +00:00
dfr	e2df972eb1	Cosmetic changes to the PAGE_XXX macros to make them consistent with the other objects in vm.	1998-09-04 08:06:57 +00:00
wollman	c97cc8ee06	Separate wakeup conditions for page I/O count (pg_busy) and lock (PG_BUSY). This is not a complete solution to the deadlock, but the additional wakeups have helped in my observation. Suggested by: John Dyson	1998-09-01 17:12:19 +00:00
luoqi	920e5f64ff	Fix a rounding problem that causes vnode pager to fail to remove the last partially filled page during a truncation. PR: kern/7422	1998-08-25 13:47:37 +00:00
dfr	5fdaeb281d	Change various syscalls to use size_t arguments instead of u_int. Add some overflow checks to read/write (from bde). Change all modifications to vm_page::flags, vm_page::busy, vm_object::flags and vm_object::paging_in_progress to use operations which are not interruptable. Reviewed by: Bruce Evans <bde@zeta.org.au>	1998-08-24 08:39:39 +00:00
mckay	acd489515b	Correct/clarify some comments.	1998-08-22 15:24:09 +00:00
dfr	a1b2079000	Protect all modifications to paging_in_progress with splvm().	1998-08-13 08:05:13 +00:00
dfr	0864bef679	Protect all modifications to paging_in_progress with splvm(). The i386 managed to avoid corruption of this variable by luck (the compiler used a memory read-modify-write instruction which wasn't interruptable) but other architectures cannot. With this change, I am now able to 'make buildworld' on the alpha (sfx: the crowd goes wild...)	1998-08-06 08:33:19 +00:00
bde	d7aa77e789	Fixed two spl nesting bugs. They caused (at least) the entire pageout daemon to run at splvm() forever after swap_pager_putpages() is called from vm_pageout_scan(). Broken in: rev.1.189 (1998/02/23)	1998-07-28 15:30:01 +00:00
dfr	9c96ae361d	Notify pmap when a page is freed on the alpha to allow it to clean up its emulated modified/referenced bits.	1998-07-26 18:15:20 +00:00
dg	76fd38da9c	Improved pager input failure message.	1998-07-22 09:38:04 +00:00
phk	101e6d7c92	There is a comment in vm_param.h which doesn't belong to the code still left in there. The macros it describes disappeared sometime since 4.4BSD lite. PR: 7246 Reviewed by: phk Submitted by: Stefan Eggers <seggers@semyam.dinoco.de>	1998-07-22 06:21:55 +00:00
bde	bd9ef8a24a	Cast pointers to [u]intptr_t instead of to [unsigned] long.	1998-07-15 04:17:55 +00:00
bde	863d5c8b68	Cast pointers to uintptr_t/intptr_t instead of to u_long/long, respectively. Most of the longs should probably have been u_longs, but this change is just to prevent warnings about casts between pointers and integers of different sizes, not to fix poorly chosen types.	1998-07-15 02:32:35 +00:00
bde	faa4d9c3da	Print pointers using %p instead of attempting to print them by casting them to long, etc. Fixed some nearby printf bogons (sign errors not warned about by gcc, and style bugs, but not truncation of vm_ooffset_t's).	1998-07-14 12:26:15 +00:00
bde	6b64f2fed4	Print pointers using %p instead of attempting to print them by casting them to long, etc. Fixed some nearby printf bogons (sign errors not warned about by gcc, and style bugs, but not truncation of vm_ooffset_t's). Use slightly less bogus casts for passing pointers to ddb command functions.	1998-07-14 12:14:58 +00:00
bde	9a46e507bb	Fixed printf format errors.	1998-07-11 12:07:52 +00:00
bde	0bd5cff687	Fixed printf format errors.	1998-07-11 11:30:46 +00:00
bde	f0b863f4b5	Fixed printf format errors.	1998-07-11 07:46:16 +00:00
alex	4cddfb10b4	Removed no longer valid comment about swb_block being int instead of daddr_t. PR: 7238 Submitted by: Stefan Eggers <seggers@semyam.dinoco.de>	1998-07-10 21:50:17 +00:00
alex	afecce40f4	Removed unnecessary test from if/else construct. PR: 7233 Submitted by: Stefan Eggers <seggers@semyam.dinoco.de>	1998-07-10 17:58:35 +00:00
dfr	918a149d3c	Don't truncate the return value of mmap to sizeof(int).	1998-07-05 11:56:52 +00:00
julian	0262543b5f	There is no such thing any more as "struct bdevsw". There is only cdevsw (which should be renamed in a later edit to deventry or something). cdevsw contains the union of what were in both bdevsw and cdevsw entries. The bdevsw[] table still exists and is a second pointer to the cdevsw entry of the device. Its major is in d_bmaj rather than d_maj. Some cleanup still to happen (e.g. dsopen now gets two pointers to the same cdevsw struct instead of one to a bdevsw and one to a cdevsw). rawread()/rawwrite() went away as part of this though it's not strictly the same patch, just that it involves all the same lines in the drivers. cdroms no longer have write() entries (they did have rawwrite (?)). tapes no longer have support for bdev operations. Reviewed by: Eivind Eklund and Mike Smith Changes suggested by eivind.	1998-07-04 22:30:26 +00:00
julian	4363221ba2	VOP_STRATEGY grows an (struct vnode *) argument as the value in b_vp is often not really what you want. (and needs to be frobbed). more cleanups will follow this. Reviewed by: Bruce Evans <bde@freebsd.org>	1998-07-04 20:45:42 +00:00
jmg	92fb44f7fe	document some VM paging options for cache sizes: PQ_NOOPT no coloring PQ_LARGECACHE used for 512k/16k cache PQ_HUGECACHE used for 1024k/16k cache	1998-06-30 08:01:30 +00:00
phk	267ffb2428	Remove bdevsw_add(), change the only two users to use bdevsw_add_generic(). Extend cdevsw to be superset of bdevsw. Remove non-functional bdev lkm support. Teach wcd what the open() args mean.	1998-06-25 11:28:07 +00:00
bde	9e868cbb1a	Removed unused includes.	1998-06-21 18:02:50 +00:00
bde	403bdcb97b	Removed unused includes.	1998-06-21 14:53:44 +00:00
dfr	1d5f38ac22	This commit fixes various 64bit portability problems required for FreeBSD/alpha. The most significant item is to change the command argument to ioctl functions from int to u_long. This change brings us in line with various other BSD versions. Driver writers may like to use (__FreeBSD_version == 300003) to detect this change. The prototype FreeBSD/alpha machdep will follow in a couple of days time.	1998-06-07 17:13:14 +00:00
dg	3baa046254	Changed the log() of "Out of mbuf clusters - increase maxusers" to a printf() of "Out of mbuf clusters - adjust NMBCLUSTERS or increase maxusers" so that the message is more informative and so that it will appear in the kernel message buffer.	1998-06-05 21:48:45 +00:00
dyson	adec643e58	Cleanup and remove some dead code from the initialization.	1998-06-02 05:50:08 +00:00
dyson	5dbf701901	Correct sleep priority.	1998-06-02 05:39:13 +00:00
dyson	46ceef1fad	Support a 16K first level cache for 512K 2nd level. Also, add support for 1MB 2nd level cache.	1998-05-24 04:25:27 +00:00
dyson	d26bce6481	Make flushing dirty pages work correctly on filesystems that unexpectedly do not complete writes even with sync I/O requests. This should help the behavior of mmaped files when using softupdates (and perhaps in other circumstances also.)	1998-05-21 07:47:58 +00:00
peter	34e429e59e	Make the previous commit compile..	1998-05-19 07:13:21 +00:00
guido	6f968c2fa2	Plug hole reported on Bugtraq: do not allow mmap with WRITE privs for append-only and immutable files. Obtained from: OpenBSD (partly)	1998-05-18 18:26:27 +00:00
dyson	9b04841c1b	An important fix for proper inheritance of backing objects for object splits. Another excellent detective job by Tor. Submitted by: Tor Egge <Tor.Egge@idi.ntnu.no>	1998-05-16 23:03:20 +00:00
dyson	65fbc3a74d	Fix the shm panic. I mistakenly used the shadow_count to keep the object from being split, and instead added an OBJ_NOSPLIT.	1998-05-04 17:12:53 +00:00
dyson	dfdb369a7d	Work around some VM bugs, the worst being an overly aggressive swap space free calculation. More complete fixes will be forthcoming, in a week.	1998-05-04 03:01:44 +00:00
dyson	d1eb14c5f6	Another minor cleanup of the split code. Make sure that pages are busied during the entire time, so that the waits for pages being unbusy don't make the objects inconsistent.	1998-05-02 06:36:16 +00:00
peter	fc934d7995	Seatbelts for vm_page_bits() in case a file offset is passed in rather than the page offset. If a large file offset was passed in, a large negative array index could be generated which could cause page faults etc at worst and file corruption at the least. (Pages are allocated within file space on page alignment boundaries, so a file offset being passed in here is harmless to DTRT. The case where this was happening has already been fixed though, this is in case it happens again). Reviewed by: dyson	1998-05-02 03:02:13 +00:00
dyson	7835943279	Fix minor bug with new over used swap fix.	1998-05-01 02:25:29 +00:00
dyson	a7cf05f1b7	Add a needed prototype, and fix a panic problem with the new memory code.	1998-04-29 06:59:08 +00:00
dyson	b5a79794cd	Tighten up management of memory and swap space during map allocation, deallocation cycles. This should provide a measurable improvement on swap and memory allocation on loaded systems. It is unlikely to be a complete solution. Also, provide more map info with procfs. Chuck Cranor spurred on this improvement.	1998-04-29 04:28:22 +00:00
dyson	36e092b938	Fix a pseudo-swap leak problem. This mitigates "leaks" due to freeing partial objects, not freeing entire objects didn't free any of it. Simple fix to the map code. Reviewed by: dg	1998-04-28 05:54:47 +00:00
dyson	888e1a851b	Correct copyright.	1998-04-25 04:50:03 +00:00
bde	b598f559b2	Support compiling with `gcc -ansi'.	1998-04-15 17:47:40 +00:00
phk	9b703b1455	Eradicate the variable "time" from the kernel, using various measures. "time" wasn't an atomic variable, so splfoo() protection was needed around any access to it, unless you just wanted the seconds part. Most uses of time.tv_sec now use the new variable time_second instead. gettime() changed to getmicrotime(). Remove a couple of unneeded splfoo() protections, the new getmicrotime() is atomic, (until Bruce sets a breakpoint in it). A couple of places needed random data, so use read_random() instead of mucking about with time which isn't random. Add a new nfs_curusec() function. Mark a couple of bogosities involving the now disappeared time variable. Update ffs_update() to avoid the weird "== &time" checks, by fixing the one remaining call that passed &time as args. Change profiling in ncr.c to use ticks instead of time. Resolution is the same. Add new function "tvtohz()" to avoid the bogus "splfoo(), add time, call hzto() which subtracts time" sequences. Reviewed by: bde	1998-03-30 09:56:58 +00:00
bde	cd450d6714	Moved some #includes from <sys/param.h> nearer to where they are actually used.	1998-03-28 10:33:27 +00:00
dyson	6e92f5716b	Some VM improvements, including elimination of a lot of Sig-11 problems. Tor Egge and others have helped with various VM bugs lately, but don't blame him -- blame me!!! pmap.c: 1) Create an object for kernel page table allocations. This fixes a bogus allocation method previously used for such, by grabbing pages from the kernel object, using bogus pindexes. (This was a code cleanup, and perhaps a minor system stability issue.) pmap.c: 2) Pre-set the modify and accessed bits when prudent. This will decrease bus traffic under certain circumstances. vfs_bio.c, vfs_cluster.c: 3) Rather than calculating the beginning virtual byte offset multiple times, stick the offset into the buffer header, so that the calculated offset can be reused. (Long long multiplies are often expensive, and this is a probably unmeasurable performance improvement, and code cleanup.) vfs_bio.c: 4) Handle write recursion more intelligently (but not perfectly) so that it is less likely to cause a system panic, and is also much more robust. vfs_bio.c: 5) getblk incorrectly wrote out blocks that are incorrectly sized. The problem is fixed, and writes blocks out ONLY when B_DELWRI is true. vfs_bio.c: 6) Check that already constituted buffers have fully valid pages. If not, then make sure that the B_CACHE bit is not set. (This was a major source of Sig-11 type problems.) vfs_bio.c: 7) Fix a potential system deadlock due to an incorrectly specified sleep priority while waiting for a buffer write operation. The change that I made opens the system up to serious problems, and we need to examine the issue of process sleep priorities. vfs_cluster.c, vfs_bio.c: 8) Make clustered reads work more correctly (and more completely) when buffers are already constituted, but not fully valid. (This was another system reliability issue.) vfs_subr.c, ffs_inode.c: 9) Create a vtruncbuf function, which is used by filesystems that can truncate files. 
The vinvalbuf forced a file sync type operation, while vtruncbuf only invalidates the buffers past the new end of file, and also invalidates the appropriate pages. (This was a system reliability and performance issue.) 10) Modify FFS to use vtruncbuf. vm_object.c: 11) Make the object rundown mechanism for OBJT_VNODE type objects work more correctly. Included in that fix, create pager entries for the OBJT_DEAD pager type, so that paging requests that might slip in during race conditions are properly handled. (This was a system reliability issue.) vm_page.c: 12) Make some of the page validation routines be a little less picky about arguments passed to them. Also, support page invalidation change the object generation count so that we handle generation counts a little more robustly. vm_pageout.c: 13) Further reduce pageout daemon activity when the system doesn't need help from it. There should be no additional performance decrease even when the pageout daemon is running. (This was a significant performance issue.) vnode_pager.c: 14) Teach the vnode pager to handle race conditions during vnode deallocations.	1998-03-16 01:56:03 +00:00
guido	7549a1c33e	Fix for mmap of char devices bug as described in OpenBSD advisory of 1998/02/20 Reviewed by: John Dyson Submitted by: "Cy Schubert" <cschuber@uumail.gov.bc.ca>	1998-03-12 19:36:18 +00:00
msmith	d1a1b4e9a9	Complement diagnostic messages about missing per-FS VOP page operations, but don't make their absence fatal. Submitted by: terry	1998-03-09 08:58:53 +00:00
dyson	af690687d2	Quell unneeded pageout daemon activity.	1998-03-08 18:19:17 +00:00
dyson	6620bf5710	Remove a very ill advised vm_page_protect. This was being called for a non-managed page. That is a big no-no.	1998-03-08 18:05:59 +00:00
dyson	d3b4226c4a	Some cruft left over from my megacommit. A page rotation optimization was a good idea, but can cause instability. That optimization is now removed.	1998-03-08 06:27:30 +00:00
dyson	b99df11fc8	Several minor fixes: 1) When freeing pages, it is a good idea to protect them off. (This is probably gratuitous, but good form.) 2) Allow collapsing pages in the backing object that are PQ_CACHE. This will improve memory utilization. 3) Correct the collapse code so that pages that were on the cache queue are moved to the inactive queue. This is done when pages are marked dirty (so that those pages will be properly paged out instead of freed), so that cached pages will not be paradoxically marked dirty.	1998-03-08 06:25:59 +00:00
dyson	8ceb6160f4	This mega-commit is meant to fix numerous interrelated problems. There has been some bitrot and incorrect assumptions in the vfs_bio code. These problems have manifested themselves worse on NFS type filesystems, but can still affect local filesystems under certain circumstances. Most of the problems have involved mmap consistency, and as a side-effect broke the vfs.ioopt code. This code might have been committed separately, but almost everything is interrelated. 1) Allow (pmap_object_init_pt) prefaulting of buffer-busy pages that are fully valid. 2) Rather than deactivating erroneously read initial (header) pages in kern_exec, we now free them. 3) Fix the rundown of non-VMIO buffers that are in an inconsistent (missing vp) state. 4) Fix the disassociation of pages from buffers in brelse. The previous code had rotted and was faulty in a couple of important circumstances. 5) Remove a gratuitous buffer wakeup in vfs_vmio_release. 6) Remove a crufty and currently unused cluster mechanism for VBLK files in vfs_bio_awrite. When the code is functional, I'll add back a cleaner version. 7) The page busy count wakeups associated with the buffer cache usage were incorrectly cleaned up in a previous commit by me. Revert to the original, correct version, but with a cleaner implementation. 8) The cluster read code now tries to keep data associated with buffers more aggressively (without breaking the heuristics) when it is presumed that the read data (buffers) will be soon needed. 9) Change to filesystem lockmgr locks so that they use LK_NOPAUSE. The delay loop waiting is not useful for filesystem locks, due to the length of the time intervals. 10) Correct and clean-up spec_getpages. 11) Implement a fully functional nfs_getpages, nfs_putpages. 12) Fix nfs_write so that modifications are coherent with the NFS data on the server disk (at least as well as NFS seems to allow.) 13) Properly support MS_INVALIDATE on NFS. 
14) Properly pass down MS_INVALIDATE to lower levels of the VM code from vm_map_clean. 15) Better support the notion of pages being busy but valid, so that fewer in-transit waits occur. (use p->busy more for pageouts instead of PG_BUSY.) Since the page is fully valid, it is still usable for reads. 16) It is possible (in error) for cached pages to be busy. Make the page allocation code handle that case correctly. (It should probably be a printf or panic, but I want the system to handle coding errors robustly. I'll probably add a printf.) 17) Correct the design and usage of vm_page_sleep. It didn't handle consistency problems very well, so make the design a little less lofty. After vm_page_sleep, if it ever blocked, it is still important to relookup the page (if the object generation count changed), and verify its status (always.) 18) In vm_pageout.c, vm_pageout_clean had rotted, so clean that up. 19) Push the page busy for writes and VM_PROT_READ into vm_pageout_flush. 20) Fix vm_pager_put_pages and its descendants to support an int flag instead of a boolean, so that we can pass down the invalidate bit.	1998-03-07 21:37:31 +00:00
dyson	20b22506b2	Make vm_fault much cleaner by removing the evil macro inlines, and put a lot of its context into a data structure. This allows significant shortening of its codepath, and will significantly decrease its cache footprint. Also, add some stats to vmmeter. Note that you'll have to rebuild/recompile vmstat, systat, etc... Otherwise, you'll get "very interesting" paging stats.	1998-03-07 20:45:47 +00:00
dufault	e28788f2a4	Reviewed by: msmith, bde long ago POSIX.4 headers and sysctl variables. Nothing should change unless POSIX4 is defined or _POSIX_VERSION is set to 199309.	1998-03-04 10:27:00 +00:00
dyson	69e5a1e9f5	1) Use a more consistent page wait methodology. 2) Do not unnecessarily force page blocking when paging pages out. 3) Further improve swap pager performance and correctness, including fixing the paging in progress deadlock (except in severe I/O error conditions.) 4) Enable vfs_ioopt=1 as a default. 5) Fix and enable the page prezeroing in SMP mode. All in all, SMP systems especially should show a significant improvement in "snappyness."	1998-03-01 04:18:54 +00:00
msmith	15e6194107	In the author's words: These diffs implement the first stage of a VOP_{GET|PUT}PAGES pushdown for local media FS's. See ffs_putpages in /sys/ufs/ufs/ufs_readwrite.c for implementation details for generic _{get|put}pages for local media FS's. Support is trivial to add for any FS that formerly relied on the default behaviour of the vnode_pager in EOPNOTSUPP cases (just copy the ffs_getpages() code for the FS in question's _{get|put}pages). Obviously, it would be better if each local media FS implemented a more optimal method, instead of calling an exported interface from the /sys/vm/vnode_pager.c, but this is a necessary first step in getting the FS's to a point where they can be supplied with better implementations on a case-by-case basis. Obviously, the cd9660_putpages() can be rather trivial (since it is a read-only FS type 8-)). A slight (temporary) modification is made to print a diagnostic message in the case where the underlying filesystem attempts to engage in the previous behaviour. Failure is likely to be ungraceful. Submitted by: terry@freebsd.org (Terry Lambert)	1998-02-26 06:39:59 +00:00
dyson	014146c040	Fix page prezeroing for SMP, and fix some potential paging-in-progress hangs. The paging-in-progress diagnosis was a result of Tor Egge's excellent detective work. Submitted by: Partially from Tor Egge.	1998-02-25 03:56:15 +00:00
dyson	4730cf91f6	Correct some severe VM tuning problems for small systems (<=16MB), and improve tuning on larger systems. (A couple of the VM tuning params for small systems were so badly chosen that the system could hang under load.) The broken tuning was originally my fault.	1998-02-24 10:16:23 +00:00
dyson	c4e82fbab0	Significantly improve the efficiency of the swap pager, which appears to have declined due to code-rot over time. The swap pager rundown code has been cleaned up, and unneeded wakeups removed. Lots of splbio's are changed to splvm's. Also, set the dynamic tunables for the pageout daemon to be more sane for larger systems (thereby decreasing the daemon overhead.)	1998-02-23 08:22:48 +00:00
dyson	b77de22650	Try to dynamically size the VM_KMEM_SIZE (it can still be overridden in the same way as before.) I had problems with the system properly handling the number of vnodes when there is a lot of system memory, and the default VM_KMEM_SIZE. Two new options "VM_KMEM_SIZE_SCALE" and "VM_KMEM_SIZE_MAX" have been added to support better auto-sizing for systems with greater than 128MB. Add some accounting for vm_zone memory allocations, and provide properly for vm_zone allocations out of the kmem_map. Also move the vm_zone allocation stats to the VM OID tree from the KERN OID tree.	1998-02-23 07:42:43 +00:00
bde	9fca072392	Removed unused #includes.	1998-02-20 13:11:54 +00:00
msmith	a82f842875	Move the 'sw' device off block major #1 , which is now occupied by 'wfd'.	1998-02-19 12:15:06 +00:00
eivind	d7a6ab2803	Staticize.	1998-02-09 06:11:36 +00:00
dyson	8cf33a55bf	Fix an argument to vn_lock. It appears that a lot of the vn_lock usage is a bit undisciplined, and should be checked carefully.	1998-02-08 14:55:13 +00:00
eivind	4547a09753	Back out DIAGNOSTIC changes.	1998-02-06 12:14:30 +00:00
dyson	ebccbfc1ff	1) Start using a cleaner and more consistent page allocator instead of the various ad-hoc schemes. 2) When bringing in UPAGES, the pmap code needs to do another vm_page_lookup. 3) When appropriate, set the PG_A or PG_M bits a-priori to both avoid some processor errata, and to minimize redundant processor updating of page tables. 4) Modify pmap_protect so that it can only remove permissions (as it originally supported.) The additional capability is not needed. 5) Streamline read-only to read-write page mappings. 6) For pmap_copy_page, don't enable write mapping for source page. 7) Correct and clean-up pmap_incore. 8) Cluster initial kern_exec pagein. 9) Removal of some minor lint from kern_malloc. 10) Correct some ioopt code. 11) Remove some dead code from the MI swapout routine. 12) Correct vm_object_deallocate (to remove backing_object ref.) 13) Fix dead object handling, that had problems under heavy memory load. 14) Add minor vm_page_lookup improvements. 15) Some pages are not in objects, and make sure that the vm_page.c can properly support such pages. 16) Add some more page deficit handling. 17) Some minor code readability improvements.	1998-02-05 03:32:49 +00:00
eivind	c552a9a1c3	Turn DIAGNOSTIC into a new-style option.	1998-02-04 22:34:03 +00:00
bde	ffbb93a37a	Added #include of <sys/queue.h> so that this file is more "self"-sufficient.	1998-02-03 22:19:35 +00:00
dyson	7fde1e8379	This fix should help the panic problems in -current. There were some errors in "interval" management. Due to the clustering mechanism, the code is necessarily complex and error prone.	1998-02-03 00:50:36 +00:00
bde	2adc6309fe	Forward declare more structs that are used in prototypes here - don't depend on <sys/types.h> forward declaring common ones.	1998-02-01 20:08:39 +00:00
dyson	ef6e7f7b8d	Fix a performance problem caused by an earlier commit.	1998-02-01 02:00:20 +00:00
dyson	44cc663f3d	contigalloc doesn't place the allocated page(s) into an object, and now this breaks vm_page_wire (due to wired page accounting per object.) This should fix a problem as described by Donald Maddox.	1998-01-31 20:30:18 +00:00
dyson	2aacd1ab4f	Change the busy page mgmt, so that when pages are freed, they MUST be PG_BUSY. It is bogus to free a page that isn't busy, because it is in a state of being "unavailable" when being freed. The additional advantage is that the page_remove code has a better cross-check that the page should be busy and unavailable for other use. There were some minor problems with the collapse code, and this plugs those subtle "holes." Also, the vfs_bio code wasn't checking correctly for PG_BUSY pages. I am going to develop a more consistent scheme for grabbing pages, busy or otherwise. For now, we are stuck with the current morass.	1998-01-31 11:56:53 +00:00
eivind	3e199e2bf3	Turn NSWAPDEV into a new-style option.	1998-01-25 04:13:25 +00:00
eivind	71ddd31390	Make all file-system (MFS, FFS, NFS, LFS, DEVFS) related option new-style. This introduce an xxxFS_BOOT for each of the rootable filesystems. (Presently not required, but encouraged to allow a smooth move of option *FS to opt_dontuse.h later.) LFS is temporarily disabled, and will be re-enabled tomorrow.	1998-01-24 02:54:56 +00:00
dyson	8726294764	Add better support for larger I/O clusters, including larger physical I/O. The support is not mature yet, and some of the underlying implementation needs help. However, support does exist for IDE devices now.	1998-01-24 02:01:46 +00:00
dyson	197bd655c4	VM level code cleanups. 1) Start using TSM. Struct procs continue to point to upages structure, after being freed. Struct vmspace continues to point to pte object and kva space for kstack. u_map is now superfluous. 2) vm_map's don't need to be reference counted. They always exist either in the kernel or in a vmspace. The vmspaces are managed by reference counts. 3) Remove the "wired" vm_map nonsense. 4) No need to keep a cache of kernel stack kva's. 5) Get rid of strange looking ++var, and change to var++. 6) Change more data structures to use our "zone" allocator. Added struct proc, struct vmspace and struct vnode. This saves a significant amount of kva space and physical memory. Additionally, this enables TSM for the zone managed memory. 7) Keep ioopt disabled for now. 8) Remove the now bogus "single use" map concept. 9) Use generation counts or id's for data structures residing in TSM, where it allows us to avoid unneeded restart overhead during traversals, where blocking might occur. 10) Account better for memory deficits, so the pageout daemon will be able to make enough memory available (experimental.) 11) Fix some vnode locking problems. (From Tor, I think.) 12) Add a check in ufs_lookup, to avoid lots of unneeded calls to bcmp. (experimental.) 13) Significantly shrink, cleanup, and make slightly faster the vm_fault.c code. Use generation counts, get rid of unneeded collapse operations, and clean up the cluster code. 14) Make vm_zone more suitable for TSM. This commit is partially as a result of discussions and contributions from other people, including DG, Tor Egge, PHK, and probably others that I have forgotten to attribute (so let me know, if I forgot.) This is not the infamous, final cleanup of the vnode stuff, but a necessary step. Vnode mgmt should be correct, but things might still change, and there is still some missing stuff (like ioopt, and physical backing of non-merged cache files, debugging of layering concepts.)	1998-01-22 17:30:44 +00:00
dyson	f3e61df0fe	Allow gdb to work again.	1998-01-21 12:18:00 +00:00
dyson	b130b30c96	Tie up some loose ends in vnode/object management. Remove an unneeded config option in pmap. Fix a problem with faulting in pages. Clean-up some loose ends in swap pager memory management. The system should be much more stable, but all subtle bugs aren't fixed yet.	1998-01-17 09:17:02 +00:00
dyson	d9d8bf6d30	Fix some vnode management problems, and better mgmt of vnode free list. Fix the UIO optimization code. Fix an assumption in vm_map_insert regarding allocation of swap pagers. Fix an spl problem in the collapse handling in vm_object_deallocate. When pages are freed from vnode objects, and the criteria for putting the associated vnode onto the free list is reached, either put the vnode onto the list, or put it onto an interrupt safe version of the list, for further transfer onto the actual free list. Some minor syntax changes changing pre-decs, pre-incs to post versions. Remove a bogus timeout (that I added for debugging) from vn_lock. PHK will likely still have problems with the vnode list management, and so do I, but it is better than it was.	1998-01-12 01:46:33 +00:00
dyson	cbf538c65f	Turn off the VTEXT flag when an object is no longer referenced, so that an executable that is no longer running can be written to. Also, clear the OBJ_OPT flag more often, when appropriate.	1998-01-07 03:12:19 +00:00
dyson	cb2800cd94	Make our v_usecount vnode reference count work identically to the original BSD code. The association between the vnode and the vm_object no longer includes reference counts. The major difference is that vm_object's are no longer freed gratuitously from the vnode, and so once an object is created for the vnode, it will last as long as the vnode does. When a vnode object reference count is incremented, then the underlying vnode reference count is incremented also. The two "objects" are now more intimately related, and so the interactions are now much less complex. Vnodes are now normally placed onto the free queue with an object still attached. The rundown of the object happens at vnode rundown time, and happens with exactly the same filesystem semantics of the original VFS code. There is absolutely no need for vnode_pager_uncache and other travesties like that anymore. A side-effect of these changes is that SMP locking should be much simpler, the I/O copyin/copyout optimizations work, NFS should be more ponderable, and further work on layered filesystems should be less frustrating, because of the totally coherent management of the vnode objects and vnodes. Please be careful with your system while running this code, but I would greatly appreciate feedback as soon as reasonably possible.	1998-01-06 05:26:17 +00:00
alex	e87ef9d530	caddr_t --> void *	1997-12-31 02:35:29 +00:00
dyson	8ab3ac77d2	Fix the decl of vfs_ioopt, allow LFS to compile again, fix a minor problem with the object cache removal.	1997-12-29 01:03:55 +00:00
dyson	cd67bb82fe	Lots of improvements, including restructuring the caching and management of vnodes and objects. There are some metadata performance improvements that come along with this. There are also a few prototypes added when the need is noticed. Changes include: 1) Cleaning up vref, vget. 2) Removal of the object cache. 3) Nuke vnode_pager_uncache and friends, because they aren't needed anymore. 4) Correct some missing LK_RETRY's in vn_lock. 5) Correct the page range in the code for msync. Be gentle, and please give me feedback asap.	1997-12-29 00:25:11 +00:00
dyson	74cfdda9c1	The ioopt code is still buggy, but wasn't fully disabled.	1997-12-25 20:55:15 +00:00
dyson	d97fabbb53	Support running with inadequate swap space. Additionally, the code will complain with a suggestion of increasing it.	1997-12-24 15:05:25 +00:00
dyson	2a4aef5cd4	Improve my copyright.	1997-12-22 11:48:13 +00:00
dyson	4c44bbc963	Change bogus usage of btoc to atop. The incorrect usage of btoc was pointed out by bde.	1997-12-19 15:31:13 +00:00
dyson	6bd1f74dcf	Some performance improvements, and code cleanups (including changing our expensive OFF_TO_IDX to btoc whenever possible.)	1997-12-19 09:03:37 +00:00
eivind	01dd6091ed	Make COMPAT_43 and COMPAT_SUNOS new-style options.	1997-12-16 17:40:42 +00:00
dyson	fc1a352788	Fix a recursive kernel_map lock problem in vm_zone allocator. PR: 5298	1997-12-15 05:16:09 +00:00
dyson	e86cd387f8	Slight improvement to the vm_zone stats output. Also, some other superficial cleanups.	1997-12-14 05:17:44 +00:00
dyson	738872cad6	After one of my analysis passes to evaluate methods for SMP TLB mgmt, I noticed some major enhancements available for UP situations. The number of UP TLB flushes is decreased much more than significantly with these changes. Since a TLB flush appears to cost minimally approx 80 cycles, this is a "nice" enhancement, equiv to eliminating between 40 and 160 instructions per TLB flush. Changes include making sure that kernel threads all use the same PTD, and eliminate unneeded PTD switches at context switch time.	1997-12-14 02:11:23 +00:00
dyson	043ed4f1ba	Fix the prototype for swapout_procs(); Submitted by: dima@best.net	1997-12-11 02:10:55 +00:00
dyson	9821c09585	Support an optional, sysctl enabled feature of idle process swapout. This is apparently useful for large shell systems, or systems with long running idle processes. To enable the feature: sysctl -w vm.swap_idle_enabled=1 Please note that some of the other vm sysctl variables have been renamed to be more accurate. Submitted by: Much of it from Matt Dillon <dillon@best.net>	1997-12-06 02:23:36 +00:00
bde	efd51d84cf	Don't include <sys/lock.h> in headers when only `struct simplelock' is required. Fixed everything that depended on the pollution.	1997-12-05 19:55:52 +00:00
dyson	b1d65d1edc	Add new (very useful) tunable for pageout daemon. The flag changes the maximum pageout rate: sysctl -w vm.vm_maxlaunder=n 1 < n < inf. If paging heavily on large systems, it is likely that a performance improvement can be achieved by increasing the parameter. On a large system, the parm is 32, but numbers as large as 128 can make a big difference. If paging is expensive, you might try decreasing the number to 1-8.	1997-12-05 05:41:06 +00:00
dyson	cd720ec73b	Support applications that need to resist or deny use of swap space. sysctl -w vm.defer_swap_pageouts=1 Causes the system to resist the use of swap space. In low memory conditions, performance will decrease. sysctl -w vm.disable_swap_pageouts=1 Causes the system to mostly disable the use of swap space. In low memory conditions, the system will likely start killing processes.	1997-12-04 19:00:56 +00:00
phk	a1bfb618d9	In all such uses of struct buf: 's/b_un.b_addr/b_data/g'	1997-12-02 21:07:20 +00:00
bde	17589a0467	Removed all traces of P_IDLEPROC. It was tested but never set.	1997-11-24 15:15:33 +00:00
bde	c729ff1df2	Don't #define max() to get a version that works with vm_ooffset's. Just use qmax(). This should be fixed more generally using overloaded functions.	1997-11-24 15:03:13 +00:00
bde	e38ebd73b3	Removed unused #include of <sys/malloc.h>. This file now uses only zalloc(). Many more cases like this are probably obscured by not including <vm/zone.h> explicitly (it is spammed into <sys/malloc.h>).	1997-11-18 11:02:19 +00:00
tegge	08d3982b7d	Simplify map entries during user page wire and user page unwire operations in vm_map_user_pageable(). Check return value of vm_map_lock_upgrade() during a user page wire operation.	1997-11-14 23:42:10 +00:00
phk	ccc7e7fa9f	Rename some local variables to avoid shadowing other local variables. Found by: -Wshadow	1997-11-07 09:21:01 +00:00
phk	4d26888936	Remove a bunch of variables which were unused both in GENERIC and LINT. Found by: -Wunused	1997-11-07 08:53:44 +00:00
phk	4c8218a5c7	Move the "retval" (3rd) parameter from all syscall functions and put it in struct proc instead. This fixes a boatload of compiler warnings, and removes a lot of cruft from the sources. I have not removed the /ARGSUSED/, they will require some looking at. libkvm, ps and other userland struct proc frobbing programs will need to be recompiled.	1997-11-06 19:29:57 +00:00
dyson	bae55d2661	Fix the "missing page" problem. Also, improve the performance of page allocation in common cases.	1997-11-06 08:35:50 +00:00
bde	fb826377ff	Removed unused #includes.	1997-10-28 15:59:26 +00:00
dyson	56bd787cbb	Support garbage collecting the pmap pv entries. The management doesn't happen until the system would have nearly failed anyway, so no significant overhead is added. This helps large systems with lots of processes.	1997-10-25 02:41:56 +00:00
dyson	bcae676793	Decrease the initial allocation for the zone allocations.	1997-10-24 23:41:04 +00:00
phk	36e7a51ea1	Last major round (Unless Bruce thinks of something :-) of malloc changes. Distribute all but the most fundamental malloc types. This time I also remembered the trick to making things static: Put "static" in front of them. A couple of finer points by: bde	1997-10-12 20:26:33 +00:00
phk	645e7b2ab6	Distribute and staticize a lot of the malloc M_* types. Substantial input from: bde	1997-10-11 18:31:40 +00:00
peter	34c09ca6a9	Attempt to fix the previous fix to the contigmalloc1 prototype. struct malloc_type isn't defined in all cases (eg: from ddb), and the line wrapping was very badly mangled.	1997-10-11 10:39:19 +00:00
phk	a8f67509a6	Fix contigmalloc() and contigmalloc1() arguments.	1997-10-10 18:18:47 +00:00
dyson	9b2d1fdb85	Improve management of pages moving from the inactive to active queue. Additionally, add some much needed comments.	1997-10-06 02:48:16 +00:00
dyson	df56983676	Relax the vnode locking for read only operations.	1997-10-06 02:38:30 +00:00
peter	9cb126eb6a	Fix some style(9) and formatting problems. tabsize 4 formatting doesn't look too great with 'more' etc. Approved by: dyson (with a minor grumble :-)	1997-09-21 11:41:12 +00:00
dyson	e64b1984f9	Change the M_NAMEI allocations to use the zone allocator. This change plus the previous changes to use the zone allocator decrease the usage of malloc by half. The Zone allocator will be upgradeable to be able to use per CPU-pools, and has more intelligent usage of SPLs. Additionally, it has reasonable stats gathering capabilities, while making most calls inline.	1997-09-21 04:24:27 +00:00
peter	796eb5ce0a	Update select -> poll in drivers.	1997-09-14 03:19:42 +00:00
peter	0a52445cc3	Print correct function name in panics	1997-09-13 15:04:52 +00:00
jlemon	0eba88d0be	Do not consider VM_PROT_OVERRIDE_WRITE to be part of the protection entry when handling a fault. This is set by procfs whenever it wants to write to a page, as a means of overriding `r-x COW' entries, but causes failures in the `rwx' case. Submitted by: bde	1997-09-12 15:58:47 +00:00
bde	bcade9a903	Removed yet more vestiges of config-time swap configuration and/or cleaned up nearby cruft.	1997-09-07 16:21:11 +00:00
bde	a08aff2d02	Removed unused #includes.	1997-09-01 03:17:34 +00:00
bde	e11885cf92	Some staticized variables were still declared to be extern.	1997-09-01 02:55:50 +00:00
bde	98fcb3f476	Print a device number in hex instead of decimal.	1997-09-01 02:28:32 +00:00
phk	0b3a12b83e	Change the 0xdeadb hack to a flag called VDOOMED. Introduce VFREE which indicates that vnode is on freelist. Rename vholdrele() to vdrop(). Create vfree() and vbusy() to add/delete vnode from freelist. Add vfree()/vbusy() to keep (v_holdcnt != 0 || v_usecount != 0) vnodes off the freelist. Generalize vhold()/v_holdcnt to mean "do not recycle". Fix reassignbuf()s lack of use of vhold(). Use vhold() instead of checking v_cache_src list. Remove vtouch(), the vnodes are always vget'ed soon enough after for it to have any measurable effect. Add sysctl debug.freevnodes to keep track of things. Move cache_purge() up in getnewvnodes to avoid race. Decrement v_usecount after VOP_INACTIVE(), put a vhold() on it during VOP_INACTIVE() Unmacroize vhold()/vdrop() Print out VDOOMED and VFREE flags (XXX: should use %b) Reviewed by: dyson	1997-08-31 07:32:39 +00:00
peter	29e2c84e7a	Allow non-page aligned file offset mmap's, providing that the system is allowed to choose the address, or that the MAP_FIXED address has the same remainder when modulo PAGE_SIZE as the file offset. Apparently this is posix1003.1b specified behavior. SVR4 and the other *BSD's allow it too. It costs us nothing to support and means we don't get EINVAL on some mmap code that works perfectly elsewhere. Obtained from: NetBSD	1997-08-30 18:50:06 +00:00
bde	c978fb3652	Fixed type mismatches for functions with args of type vm_prot_t and/or vm_inherit_t. These types are smaller than ints, so the prototypes should have used the promoted type (int) to match the old-style function definitions. They use just vm_prot_t and/or vm_inherit_t. This depends on gcc features to work. I fixed the definitions since this is easiest. The correct fix may be to change the small types to u_int, to optimize for time instead of space.	1997-08-25 22:15:31 +00:00
dyson	042ae4067b	This is a trial improvement for the vnode reference count while on the vnode free list problem. Also, the vnode age flag is no longer used by the vnode pager. (It is actually incorrect to use then.) Constructive feedback welcome -- just be kind.	1997-08-22 03:56:37 +00:00
bde	6be005551f	#include <machine/limits.h> explicitly in the few places that it is required.	1997-08-21 20:33:42 +00:00
fsmp	24a2d0d38a	Added includes of smp.h for SMP. This eliminates a bazillion warnings about implicit s_lock & friends.	1997-08-18 03:29:21 +00:00
dyson	cc823b6e73	Fix kern_lock so that it will work. Additionally, clean-up some of the VM systems usage of the kernel lock (lockmgr) code. This is a first pass implementation, and is expected to evolve as needed. The API for the lock manager code has not changed, but the underlying implementation has changed significantly. This change should not materially affect our current SMP or UP code without non-standard parameters being used.	1997-08-18 02:06:35 +00:00
dyson	a8d01f6338	The "cutsie" register parameter passing that I had mistakenly used breaks profiling. Since it doesn't really improve perf much, I have backed it out.	1997-08-10 00:12:13 +00:00
dyson	85f902e519	More vm_zone cleanup. The sysctl now accounts for items better, and counts the number of allocations.	1997-08-07 03:52:55 +00:00
dyson	5f9cb6429d	Add exposure of some vm_zone allocation stats by sysctl. Also, change the initialization parameters of some zones in VM map. This contains only optimizations and not bugfixes.	1997-08-06 04:58:05 +00:00
dyson	e150d815cc	Fixed the commit botch that was causing crashes soon after system startup. Due to the error, the initialization of the zone for pv_entries was missing. The system should be usable again.	1997-08-05 23:03:24 +00:00
dyson	2649bd0b26	Another attempt at cleaning up the new memory allocator.	1997-08-05 22:24:31 +00:00
dyson	55205b3be5	Fix some bugs, document vm_zone better. Add copyright to vm_zone.h. Use the new zone code in pmap.c so that we can get rid of the ugly ad-hoc allocations in pmap.c.	1997-08-05 22:07:27 +00:00
dyson	96f688be11	Modify pmap to use our new memory allocator. Also, change the vm_map_entry allocations to be interrupt safe.	1997-08-05 01:32:52 +00:00
dyson	54005d6ed9	A very simple zone allocator.	1997-08-05 00:07:31 +00:00
dyson	8fa8ae3d0d	Get rid of the ad-hoc memory allocator for vm_map_entries, in lieu of a simple, clean zone type allocator. This new allocator will also be used for machine dependent pmap PV entries.	1997-08-05 00:02:08 +00:00
bde	9195bd1ec7	Removed unused #includes.	1997-08-02 14:33:27 +00:00
dyson	5e05983d33	Add the ability for the pageout daemon to measure stats on memory usage before the system is out of memory. The daemon does a minimal amount of work that increases as the system becomes more likely to run out of memory and page in/out. The default tuning is fairly low in background CPU usage, and sysctl variables have been added to enable flexible operation. This is an experimental feature that will likely be changed and improved over time.	1997-07-27 04:49:19 +00:00
dyson	e011371c82	Fix a very subtle problem that causes unnecessary numbers of objects backing a single logical object. Submitted by: Alan Cox <alc@cs.rice.edu>	1997-07-27 04:44:12 +00:00
dyson	b39089e3e9	Add support for 4MB pages. This includes the .text, .data, .data parts of the kernel, and also most of the dynamic parts of the kernel. Additionally, 4MB pages will be allocated for display buffers as appropriate (only.) The 4MB support for SMP isn't complete, but doesn't interfere with operation either.	1997-07-17 04:34:03 +00:00
tegge	4b413d416e	Don't try upgrading an existing exclusive lock in vm_map_user_pageable. This should close PR kern/3180. Also remove a bogus unconditional call to vm_map_unlock_read in vm_map_lookup.	1997-06-23 21:51:03 +00:00
peter	e0245a10b2	Kill some stale leftovers from the earlier attempts at SMP per-cpu pages	1997-06-22 15:47:16 +00:00
dyson	8786565a86	Remove a window during running down a file vnode. Also, the OBJ_DEAD flag wasn't being respected during vref(), et. al. Note that this isn't the eventual fix for the locking problem. Fine grained SMP in the VM and VFS code will require (lots) more work.	1997-06-22 03:00:24 +00:00
dyson	db14cfe28c	Correct the return code for the mlock system call. Also add the stubs for mlockall and munlockall.	1997-06-15 23:35:32 +00:00
dyson	2e39fb736c	Fix a reference problem with maps. Only appears to manifest itself when sharing address spaces.	1997-06-15 23:33:52 +00:00
peter	6f94abef97	Update the #include "opt_smpxxx.h" includes - opt_smp.h isn't needed very much in the generic parts of the kernel now.	1997-05-29 02:57:22 +00:00
dfr	d7e320b30e	Fix a few bugs with NFS and mmap caused by NFS' use of b_validoff and b_validend. The changes to vfs_bio.c are a bit ugly but hopefully can be tidied up later by a slight redesign. PR: kern/2573, kern/2754, kern/3046 (possibly) Reviewed by: dyson	1997-05-19 14:36:56 +00:00
dyson	8d05a00726	Check the correct queue for waking up the pageout daemon. Specifically, the pageout daemon wasn't always being woken up appropriately when the (cache + free) queues were depleted. Submitted by: David S. Miller <davem@jenolan.rutgers.edu>	1997-05-01 14:36:01 +00:00
peter	6323aa10bf	Man the liferafts! Here comes the long awaited SMP -> -current merge! There are various options documented in i386/conf/LINT, there is more to come over the next few days. The kernel should run pretty much "as before" without the options to activate SMP mode. There are a handful of known "loose ends" that need to be fixed, but have been put off since the SMP kernel is in a moderately good condition at the moment. This commit is the result of the tinkering and testing over the last 14 months by many people. A special thanks to Steve Passe for implementing the APIC code!	1997-04-26 11:46:25 +00:00
peter	4997aa77f1	Send this to the Attic so there's no mixups over which kern_lock.c is in use in -current.	1997-04-21 13:39:56 +00:00
peter	05ac2f5194	Unused variable (upobj is now purely handled within pmap)	1997-04-14 03:40:42 +00:00
dyson	61955ab830	Fully implement vfork. Vfork is now much much faster than even our fork. (On my machine, fork is about 240usecs, vfork is 78usecs.) Implement rfork(!RFPROC !RFMEM), which allows a thread to divorce its memory from the other threads of a group. Implement rfork(!RFPROC RFCFDG), which closes all file descriptors, eliminating possible existing shares with other threads/processes. Implement rfork(!RFPROC RFFDG), which divorces the file descriptors for a thread from the rest of the group. Fix the case where a thread does an exec. It is almost nonsense for a thread to modify the other threads address space by an exec, so we now automatically divorce the address space before modifying it.	1997-04-13 01:48:35 +00:00
peter	ecf50a7463	The biggie: Get rid of the UPAGES from the top of the per-process address space. (!) Have each process use the kernel stack and pcb in the kvm space. Since the stacks are at a different address, we cannot copy the stack at fork() and allow the child to return up through the function call tree to return to user mode - create a new execution context and have the new process begin executing from cpu_switch() and go to user mode directly. In theory this should speed up fork a bit. Context switch the tss_esp0 pointer in the common tss. This is a lot simpler than switching the gdt[GPROC0_SEL].sd.sd_base pointer to each process's tss since the esp0 pointer is a 32 bit pointer, and the sd_base setting is split into three different bit sections at non-aligned boundaries and requires a lot of twiddling to reset. The 8K of memory at the top of the process space is now empty, and unmapped (and unmappable, it's higher than VM_MAXUSER_ADDRESS). Simplify the pmap code to manage process contexts, we no longer have to double map the UPAGES, this simplifies and should measurably speed up fork(). The following parts came from John Dyson: Set PG_G on the UPAGES that are now in kernel context, and invalidate them when swapping them out. Move the upages object (upobj) from the vmspace to the proc structure. Now that the UPAGES (pcb and kernel stack) are out of user space, make rfork(..RFMEM..) do what was intended by sharing the vmspace entirely via reference counting rather than simply inheriting the mappings.	1997-04-07 07:16:06 +00:00
peter	07c3ab609e	Commit a typo fix that's been sitting in my tree for ages, quite forgotten. The typo was detected once upon a time with the -Wunused compile option. The result was that a block of code for implementing madvise(.. MADV_SEQUENTIAL..) behavior was "dead" and unused, probably negating the effect of activating the option. Reviewed by: dyson	1997-04-06 16:16:11 +00:00
dyson	f304c6bda9	Make vm_map_protect be more complete about map simplification. This is useful when a process changes its page range protections very much. Submitted by: Alan Cox <alc@cs.rice.edu>	1997-04-06 03:04:31 +00:00
dyson	54fd4a3d42	Correction to the prototype for vm_fault.	1997-04-06 02:30:56 +00:00
dyson	22d3427970	Fix the gdb executable modify problem. Thanks to the detective work by Alan Cox <alc@cs.rice.edu>, and his description of the problem. The bug was primarily in procfs_mem, but the mistake likely happened due to the lack of vm system support for the operation. I added better support for selective marking of page dirty flags so that vm_map_pageable(wiring) will not cause this problem again. The code in procfs_mem is now less bogus (but maybe still a little so.)	1997-04-06 02:29:45 +00:00
bde	278256e73a	Removed potentially harmful garbage <vm/lock.h> and fixed bogus use of it. It was actually harmless because the use was null due to fortuitous include orders and identical (wrong) idempotency macros.	1997-04-01 08:39:07 +00:00
dg	1543ecae88	Changed the way that the exec image header is read to be filesystem- centric rather than VM-centric to fix a problem with errors not being detectable when the header is read. Killed exech_map as a result of these changes. There appears to be no performance difference with this change.	1997-03-31 11:11:26 +00:00
bde	0d3591bdbd	Don't #include <sys/fcntl.h> in <sys/file.h> if KERNEL is defined. Fixed everything that depended on getting fcntl.h stuff from the wrong place. Most things don't depend on file.h stuff at all.	1997-03-23 03:37:54 +00:00
dyson	9c1cce114f	Fix a significant error in the accounting for pre-zeroed pages. This is a candidate for RELENG_2_2...	1997-03-23 02:44:54 +00:00
dyson	3f5747589b	When removing IN_RECURSE support during the Lite/2 merge, read/write to/from mmaped regions was broken. This commit fixes the breakage, and uses the new Lite/2 locking mechanisms.	1997-03-08 04:33:47 +00:00
bde	61157dd0d7	Removed a wrong LK_INTERLOCK flag.	1997-02-27 15:38:41 +00:00
peter	94b6d72794	Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not ready for it yet.	1997-02-22 09:48:43 +00:00
bde	6a229a98d5	Removed vestiges of Mach lock types. vm_map.h: Removed #include of <sys/proc.h>. curproc is only used in some macros and users of the macros already include <sys/proc.h>.	1997-02-18 14:07:03 +00:00
wollman	cb442e2038	Provide an alternative interface to contigmalloc() which allows a specific map to be used when allocating the kernel va (e.g., mb_map). The VM gurus may want to look this over.	1997-02-13 19:37:40 +00:00
dyson	10f666af84	This is the kernel Lite/2 commit. There are some requisite userland changes, so don't expect to be able to run the kernel as-is (very well) without the appropriate Lite/2 userland changes. The system boots and can mount UFS filesystems. Untested: ext2fs, msdosfs, NFS Known problems: Incorrect Berkeley ID strings in some files. Mount_std mounts will not work until the getfsent library routine is changed. Reviewed by: various people Submitted by: Jeffery Hsu <hsu@freebsd.org>	1997-02-10 02:22:35 +00:00
dyson	0ebe30bff0	Another fix to inheriting shared segments. Do the copy on write thing if needed. Submitted by: Alan Cox <alc@cs.rice.edu>	1997-01-31 04:10:41 +00:00
dg	5479ba63a9	Added a check/panic for v_usecount being 0 (no vnode reference) in vnode_pager_alloc().	1997-01-24 22:20:23 +00:00
dyson	7a02d1469f	Fix two problems where a NULL object is dereferenced. One problem was in the VM_INHERIT_SHARE case of vmspace_fork, and also in vm_map_madvise. Submitted by: Alan Cox <alc@cs.rice.edu>	1997-01-22 01:34:48 +00:00
dyson	7a84712547	Make MADV_FREE work better. Specifically, it did not wait for the page to be unbusy, and it caused some algorithmic problems as a result. There were some other problems with it also, so this is a general cleanup of the code. Submitted by: Douglas Crosher <dtc@scrooge.ee.swin.oz.au> and myself.	1997-01-20 02:25:14 +00:00
dyson	52f682b582	Change the map entry flags from bitfields to bitmasks. Allows for some code simplification.	1997-01-16 04:16:22 +00:00
dg	78afd808d5	Fix bug related to map entry allocations where a sleep might be attempted when allocating memory for network buffers at interrupt time. This is due to inadequate checking for the new mcl_map. Fixed by merging mb_map and mcl_map into a single mb_map. Reviewed by: wollman	1997-01-15 20:46:02 +00:00
bde	dd18dffcc8	Removed redundant spl0()'s from kernel processes. They were work-arounds for a bug in fork().	1997-01-15 19:05:08 +00:00
jkh	808a36ef65	Make the long-awaited change from $Id$ to $FreeBSD$ This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long. Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.	1997-01-14 07:20:47 +00:00
dyson	9215dc1c36	Slightly correct the code that moves pages from the active to the inactive queue. This is only a minor performance improvement, but will not affect perf on machines that don't have ref bits.	1997-01-11 07:22:24 +00:00
dyson	73dafb0b2c	Prepare better for multi-platform by eliminating another required pmap routine (pmap_is_referenced.) Upper level recoded to use pmap_ts_referenced.	1997-01-11 07:19:02 +00:00
dyson	b68b333a77	Undo the collapse breakage (swap space usage problem.)	1997-01-03 17:02:28 +00:00
dyson	3bb3295727	Guess what? We left a lot of the old collapse code that is not needed anymore with the "full" collapse fix that we added about 1yr ago!!! The code has been removed by optioning it out for now, so we can put it back in ASAP if any problems are found.	1997-01-01 04:45:05 +00:00
dyson	b7fce42185	A very significant improvement in the management of process maps and objects. Previously, "fancy" memory management techniques such as that used by the M3 RTS would have the tendency of chopping up a process's allocated memory into lots of little objects. Alan has come up with some improvements to mitigate the situation to the point where even the M3 RTS only has one object for bss and its managed memory (when running CVSUP.) (There are still cases where the situation isn't improved when the system pages -- but this is much much better for the vast majority of cases.) The system will now be able to much more effectively merge map entries. Submitted by: Alan Cox <alc@cs.rice.edu>	1996-12-31 16:23:38 +00:00
dyson	c232302d3f	Let the VM system know that on certain arch's that VM_PROT_READ also implies VM_PROT_EXEC. We support it that way for now, since the break system call by default gives VM_PROT_ALL. Now we have a better chance of coalescing map entries when mixing mmap/break type operations. This was contributing to excessive numbers of map entries on the modula-3 runtime system. The problem is still not "solved", but the situation makes more sense. Eventually, when we work on architectures where VM_PROT_READ is orthogonal to VM_PROT_EXEC, we will have to visit this issue carefully (esp. regarding security issues.)	1996-12-30 05:31:21 +00:00
dyson	bc06bff430	EEEK!!! useracc and kernacc didn't lock their respective maps. Additionally, eliminate the map->hint distortion associated with useracc. That may/may-not be the "right" thing to do -- but time will tell. Submitted by: Partially by Alan Cox <alc@cs.rice.edu>	1996-12-30 03:56:11 +00:00
dyson	a23fe88830	Superficial cleanup of comment.	1996-12-29 02:33:12 +00:00
dyson	fb5d9384f8	Eliminate the redundancy due to the similarity between the routines vm_map_simplify and vm_map_simplify_entry. Make vm_map_simplify_entry handle wired maps so that we can get rid of vm_map_simplify. Modify the callers of vm_map_simplify to properly use vm_map_simplify_entry. Submitted by: Alan Cox <alc@cs.rice.edu>	1996-12-28 23:07:49 +00:00
dyson	527a08777f	The code unnecessarily created an object with no handle up-front, which has the negative effect of disabling some map optimizations. This patch defers the creation of the object until it needs to be at fault time. Submitted by: Alan Cox <alc@cs.rice.edu>	1996-12-28 22:40:44 +00:00
joerg	63b6a05776	Make DFLDSIZ and MAXDSIZ fully-supported options. "Don't forget to do a ``make depend''" :-)	1996-12-22 23:17:09 +00:00
dyson	765e5fd282	Implement closer-to POSIX mlock semantics. The major difference is that we do allow mlock to span unallocated regions (of course, not mlocking them.) We also allow mlocking of RO regions (which the old code couldn't.) The restriction there is that once a RO region is wired (mlocked), it cannot be debugged (or EVER written to.) Under normal usage, the new mlock code will be a significant improvement over our old stuff.	1996-12-14 17:54:17 +00:00
dyson	86b1c9f6b9	Expunge inlines...	1996-12-07 07:44:05 +00:00
dyson	16bfdb75a7	Fix a map entry leak problem found by DG. Also, de-inline a function vm_map_entry_dispose, because it won't help being inlined.	1996-12-07 06:19:37 +00:00
dyson	468189da8d	Make vm_map_insert much more intelligent in the MAP_NOFAULT case so that map entries are coalesced when appropriate. Also, conditionalize some code that is currently not used in vm_map_insert. This mod has been added to eliminate unnecessary map entries in buffer map. Additionally, there were some cases where map coalescing could be done when it shouldn't. That problem has been resolved.	1996-12-07 00:03:43 +00:00
dyson	7a58275f33	Implement a new totally dynamic (up to MAXPHYS) buffer kva allocation scheme. Additionally, add the capability for checking for unexpected kernel page faults. The maximum amount of kva space for buffers hasn't been decreased from where it is, but it will now be possible to do so. This scheme manages the kva space similar to the buffers themselves. If there isn't enough kva space because of usage or fragmentation, buffers will be reclaimed until a buffer allocation is successful. This scheme should be very resistant to fragmentation problems until/if the LFS code is fixed and uses the bogus buffer locking scheme -- but a 'fixed' LFS is not likely to use such a scheme. Now there should be NO problem allocating buffers up to MAXPHYS.	1996-11-30 22:41:49 +00:00
dyson	f573ad0ab2	Make the kernel smaller with at worst a neutral effect on perf by de-inlining some VM calls. (Actually, I measured a small improvement.)	1996-11-28 23:15:07 +00:00
dyson	2383152fd5	Improve the locality of reference for variables in vm_page and vm_kern by moving them from .bss to .data. With this change, there is a measurable perf improvement in fork/exec.	1996-11-17 02:38:31 +00:00
dyson	f459fb4443	Vastly improved contigmalloc routine. It does not solve the problem of allocating contiguous buffer memory in general, but makes it much more likely to work at boot-up time. The best chance for an LKM-type load of a sound driver is immediately after the mount of the root filesystem. This appears to work for a 64K allocation on an 8MB system.	1996-11-05 04:19:08 +00:00
dyson	70bcdaf44a	Change mmap to use OBJT_DEFAULT instead of OBJT_SWAP by default for anonymous objects. The system will automatically change the type to SWAP if needed (for size or pageout reasons.)	1996-10-29 22:07:11 +00:00
phk	ffa9f8fecc	The way we get a vnode for swapdev is not quite kosher. In particular it breaks in the DEVFS_ROOT case. replicate a bit too much of bdevvp() in here to circumvent the problem. The real problem is the magic that lives in bdevsw[1].	1996-10-27 22:31:00 +00:00
dyson	30a1549ad1	Remove a bogus optimization in the mmap code. It is superfluous, and at best is the same speed as the unoptimized code. At worst, it slows down trivial programs.	1996-10-24 02:56:23 +00:00
dyson	ea70b20311	Make processes that have been woken up eligible for immediate swap-in.	1996-10-17 02:58:20 +00:00
dyson	576dd5e9f6	Clean up the rundown of the object backing a vnode. This should fix NFS problems associated with forcible dismounts.	1996-10-17 02:49:35 +00:00
bde	5beb18abc1	Removed nested include of <sys/proc.h> from <vm/vm_object.h> and fixed the one place that depended on it. wakeup() is now prototyped in <sys/systm.h> so that it is normally visible. Added nested include of <sys/queue.h> in <vm/vm_object.h>. The queue macros are a more fundamental prerequisite for <vm/vm_object.h> than the wakeup prototype and previously happened to be included by namespace pollution from <sys/proc.h> or elsewhere.	1996-10-15 18:24:34 +00:00
dyson	4af9184e0b	Move much of the machine dependent code from vm_glue.c into pmap.c. Along with the improved organization, small proc fork performance is now about 5%-10% faster.	1996-10-15 03:16:45 +00:00

1042 Commits