freebsd-nq

Author	SHA1	Message	Date
Poul-Henning Kamp	ec38b344cb	Move the implementation of the vmspace_swap_count() (used only in the "toss the largest process" emergency handling) from vm_map.c to swap_pager.c. The quantity calculated depends strongly on the internals of the swap_pager and by moving it, we no longer need to expose the internal metrics of the swap_pager to the world.	2003-07-18 10:47:58 +00:00
Poul-Henning Kamp	567104a148	Add a new function swap_pager_status() which reports the total size of the paging space and how much of it is in use (in pages). Use this interface from the Linuxolator instead of groping around in the internals of the swap_pager.	2003-07-18 10:26:09 +00:00
Poul-Henning Kamp	e9c0cc157b	Merge swap_pager.c and vm_swap.c into swap_pager.c, the separation is not natural and needlessly exposes a lot of dirty laundry. Move private interfaces between the two from swap_pager.h to swap_pager.c and staticize as much as possible. No functional change.	2003-07-18 10:02:44 +00:00
Poul-Henning Kamp	116b3c2af9	Make sure that SWP_NPAGES always has the same value in all source files, so that SWAP_META_PAGES does not vary either. swap_pager.c ended up with a value of 16, everybody else 8. Go with the 16 for now. This should only have any effect in the "kill processes because we are out of swap" scenario, where it will make some sort of estimate of something more precise.	2003-07-17 21:58:43 +00:00
Alan Cox	dd5e55f872	Maintain the lock on a vm object when calling vm_page_grab().	2003-06-25 04:53:56 +00:00
Alan Cox	5ea4972cd4	Make swap_pager_haspages() static; remove unused function prototypes.	2003-06-20 20:20:06 +00:00
Poul-Henning Kamp	b94b853bf1	This file was ignored by CVS in my last commit for some reason: Remove pointless initialization of b_spc field, which now no longer exists.	2003-06-16 09:31:15 +00:00
Alan Cox	33a609ece0	Extend the scope of the vm object lock in swp_pager_async_iodone() to cover a vm_page_free().	2003-06-13 06:17:42 +00:00
Alan Cox	8630c1173e	Add vm object locking to various pagers' "get pages" methods, i386 stack management functions, and a u area management function.	2003-06-13 03:02:28 +00:00
David E. O'Brien	874651b13c	Use __FBSDID().	2003-06-11 23:50:51 +00:00
Alan Cox	19ba4c8e49	Assert that the vm object is locked on entry to swap_pager_freespace().	2003-06-07 20:43:16 +00:00
Alan Cox	658ad5fff5	Lock the vm_object when performing vm_pager_deallocate().	2003-05-06 02:45:28 +00:00
Alan Cox	17cd3642fe	- Lock the vm_object when performing swap_pager_isswapped(). - Assert that the vm_object is locked in swap_pager_isswapped().	2003-04-28 17:13:53 +00:00
Alan Cox	1ca5895341	- Convert vm_object_pip_wait() from using tsleep() to msleep(). - Make vm_object_pip_sleep() static. - Lock the vm_object when performing vm_object_pip_wait().	2003-04-26 18:33:18 +00:00
Alan Cox	d68d828b43	- Lock the vm_object when performing vm_object_pip_add(). - Remove an unnecessary variable.	2003-04-20 07:08:30 +00:00
Alan Cox	d22bc7101c	- Lock the vm_object when performing vm_object_pip_add().	2003-04-20 03:41:21 +00:00
Alan Cox	0fa05eae77	- Lock the vm_object when performing vm_object_pip_subtract(). - Assert that the vm_object lock is held in vm_object_pip_subtract().	2003-04-19 22:11:41 +00:00
Alan Cox	0d420ad3e6	- Lock the vm_object when performing vm_object_pip_wakeupn(). - Assert that the vm_object lock is held in vm_object_pip_wakeupn(). - Add a new macro VM_OBJECT_LOCK_ASSERT().	2003-04-19 21:15:44 +00:00
Warner Losh	a163d034fa	Back out M_* changes, per decision of the TRB. Approved by: trb	2003-02-19 05:47:46 +00:00
Alfred Perlstein	44956c9863	Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0. Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.	2003-01-21 08:56:16 +00:00
Poul-Henning Kamp	c410df597b	Avoid extern decls in .c files by putting them in the vm/swap_pager.h include file where they belong. Share the dmmax_mask variable.	2003-01-03 14:30:46 +00:00
Poul-Henning Kamp	862702306b	Convert calls to BUF_STRATEGY to VOP_STRATEGY calls. This is a no-op since all BUF_STRATEGY did in the first place was call VOP_STRATEGY.	2003-01-03 06:32:15 +00:00
Alan Cox	b365ea9e30	Hold the page queues lock when performing vm_page_flag_set().	2002-12-18 04:02:02 +00:00
Matthew Dillon	92da00bb24	This is David Schultz's swapoff code which I am finally able to commit. This should be considered highly experimental for the moment. Submitted by: David Schultz <dschultz@uclink.Berkeley.EDU> MFC after: 3 weeks	2002-12-15 19:17:57 +00:00
Alan Cox	a12cc0e489	Remove vm_page_protect(). Instead, use pmap_page_protect() directly.	2002-11-18 04:05:22 +00:00
Olivier Houchard	f64e99baa2	Remove extra #include<sys/vmmeter.h>.	2002-11-11 13:57:50 +00:00
Poul-Henning Kamp	37c841831f	Be consistent about "static" functions: if the function is marked static in its prototype, mark it static at the definition too. Inspired by: FlexeLint warning #512	2002-09-28 17:15:38 +00:00
Jeff Roberson	6a2eac8acc	- Lock access to numoutput on the swap devices.	2002-09-25 01:24:17 +00:00
Matthew Dillon	ec61f55d42	Reduce the maximum KVA reserved for swap meta structures from 70 to 32 MB. Reduce the swap meta calculation by a factor of 2, it's still massive overkill. X-MFC after: immediately	2002-08-31 21:15:29 +00:00
Alan Cox	ab9abe5d7e	o Lock page queue accesses by vm_page_free().	2002-07-21 20:38:45 +00:00
Alan Cox	40eab1e944	o Lock page queue accesses by vm_page_try_to_cache(). (The accesses in kern/vfs_bio.c are already locked.) o Assert that the page queues lock is held in vm_page_try_to_cache().	2002-07-20 20:58:46 +00:00
Ian Dowse	23f09d50bb	Avoid using the 64-bit vm_pindex_t in a few places where 64-bit types are not required, as the overhead is unnecessary: o In the i386 pmap_protect(), `sindex' and `eindex' represent page indices within the 32-bit virtual address space. o In swp_pager_meta_build() and swp_pager_meta_ctl(), use a temporary variable to store the low few bits of a vm_pindex_t that gets used as an array index. o vm_uiomove() uses `osize' and `idx' for page offsets within a map entry. o In vm_object_split(), `idx' is a page offset within a map entry.	2002-06-26 20:32:51 +00:00
Ian Dowse	5125fe4f45	Use an explicit cast to avoid relying on sign extension to do the right thing in code such as `vm_pindex_t x = ~SWAP_META_MASK'. Reviewed by: dillon	2002-06-26 19:18:14 +00:00
Alan Cox	24c46d036d	o Replace GIANT_REQUIRED in swap_pager_alloc() by the acquisition and release of Giant. (Annotate as MPSAFE.)	2002-06-22 08:03:21 +00:00
John Baldwin	6008862bc2	Change callers of mtx_init() to pass in an appropriate lock type name. In most cases NULL is passed, but in some cases such as network driver locks (which use the MTX_NETWORK_LOCK macro) and UMA zone locks, a name is used. Tested on: i386, alpha, sparc64	2002-04-04 21:03:38 +00:00
Jeff Roberson	670d17b5c0	Remove references to vm_zone.h and switch over to the new uma API.	2002-03-20 04:02:59 +00:00
Alfred Perlstein	11caded34f	Remove __P.	2002-03-19 22:20:14 +00:00
Jeff Roberson	8355f576a9	This is the first part of the new kernel memory allocator. This replaces malloc(9) and vm_zone with a slab like allocator. Reviewed by: arch@	2002-03-19 09:11:49 +00:00
Eivind Eklund	a128794977	- Remove a number of extra newlines that do not belong here according to style(9) - Minor space adjustment in cases where we have "( ", " )", if(), return(), while(), for(), etc. - Add /* SYMBOL */ after a few #endifs. Reviewed by: alc	2002-03-10 21:52:48 +00:00
John Baldwin	fdcc1cc09f	Use thread0.td_ucred instead of proc0.p_ucred. This change is cosmetic and isn't strictly required. However, it lowers the number of false positives found when grep'ing the kernel sources for p_ucred to ensure proper locking.	2002-02-27 19:18:10 +00:00
Poul-Henning Kamp	57c10583aa	GC: BIO_ORDERED, various infrastructure dealing with BIO_ORDERED.	2002-02-22 09:26:35 +00:00
Tor Egge	d6844b6bf6	Don't use an uninitialized field reserved for callers in the bio structure passed to swap_pager_strategy(). Instead, use a field reserved for drivers and initialize it before usage. Reviewed by: dillon	2001-10-15 23:02:54 +00:00
John Baldwin	bd78cece5d	Change the kernel's ucred API as follows: - crhold() returns a reference to the ucred whose refcount it bumps. - crcopy() now simply copies the credentials from one credential to another and has no return value. - a new crshared() primitive is added which returns true if a ucred's refcount is > 1 and false (0) otherwise.	2001-10-11 23:38:17 +00:00
Matthew Dillon	2f9e4e8025	Limit the amount of KVM reserved for the buffer cache and for swap-meta information. The default limits only effect machines with > 1GB of ram and can be overriden with two new kernel conf variables VM_SWZONE_SIZE_MAX and VM_BCACHE_SIZE_MAX, or with loader variables kern.maxswzone and kern.maxbcache. This has the effect of leaving more KVM available for sizing NMBCLUSTERS and 'maxusers' and should avoid tripups where a sysad adds memory to a machine and then sees the kernel panic on boot due to running out of KVM. Also change the default swap-meta auto-sizing calculation to allocate half of what it was previously allocating. The prior defaults were way too high. Note that we cannot afford to run out of swap-meta structures so we still stay somewhat conservative here.	2001-08-20 00:41:12 +00:00
Alfred Perlstein	61ce6eeee3	Fixups for the initial allocation by dillon: 1) allocate fewer buckets 2) when failing to allocate swap zone, keep reducing the zone by a third rather than a half in order to reduce the chance of allocating way too little. I also moved around some code for readability. Suggested by: dillon Reviewed by: dillon	2001-08-02 07:54:58 +00:00
Matthew Dillon	54d9214595	whitespace / register cleanup	2001-07-04 19:00:13 +00:00
Matthew Dillon	0cddd8f023	With Alfred's permission, remove vm_mtx in favor of a fine-grained approach (this commit is just the first stage). Also add various GIANT_ macros to formalize the removal of Giant, making it easy to test in a more piecemeal fashion. These macros will allow us to test fine-grained locks to a degree before removing Giant, and also after, and to remove Giant in a piecemeal fashion via sysctl's on those subsystems which the authors believe can operate without Giant.	2001-07-04 16:20:28 +00:00
John Baldwin	6d541bf1ae	- Protect all accesses to nsw_[rw]count{,_{,a}sync} with the pbuf mutex. - Don't drop the vm mutex while grabbing the pbuf mutex to manipulate said variables.	2001-06-22 21:12:19 +00:00
John Baldwin	b608320d4a	- Fix the sw_alloc_interlock to actually lock itself when the lock is acquired. - Assert Giant is held in the strategy, getpages, and putpages methods and the getchainbuf, flushchainbuf, and waitchainbuf functions. - Always call flushchainbuf() w/o the VM lock.	2001-05-23 22:31:15 +00:00
Alfred Perlstein	c5e62505ad	aquire Giant when playing with the buffercache and doing IO. use msleep against the vm mutex while waiting for a page IO to complete.	2001-05-23 10:28:11 +00:00
Alfred Perlstein	240e0fdd93	aquire vm mutex in swp_pager_async_iodone. Don't call swp_pager_async_iodone with the mutex held.	2001-05-22 19:01:26 +00:00
Alfred Perlstein	2395531439	Introduce a global lock for the vm subsystem (vm_mtx). vm_mtx does not recurse and is required for most low level vm operations. faults can not be taken without holding Giant. Memory subsystems can now call the base page allocators safely. Almost all atomic ops were removed as they are covered under the vm mutex. Alpha and ia64 now need to catch up to i386's trap handlers. FFS and NFS have been tested, other filesystems will need minor changes (grabbing the vm lock when twiddling page properties). Reviewed (partially) by: jake, jhb	2001-05-19 01:28:09 +00:00
Poul-Henning Kamp	a468031ce8	Actually biofinish(struct bio , struct devstat , int error) is more general than the bioerror(). Most of this patch is generated by scripts.	2001-05-06 20:00:03 +00:00
Alfred Perlstein	a9fa2c05fc	Protect pager object creation with sx locks. Protect pager object list manipulation with a mutex. It doesn't look possible to combine them under a single sx lock because creation may block and we can't have the object list manipulation block on anything other than a mutex because of interrupt requests.	2001-04-18 20:24:16 +00:00
Alfred Perlstein	2a758ebe58	protect pbufs and associated counts with a mutex	2001-04-13 10:23:32 +00:00
Robert Watson	edfa785a8e	Introduce per-swap area accounting in the VM system, and export this information via the vm.nswapdev sysctl (number of swap areas) and vm.swapdevX nodes (where X is the device), which contain the MIBs dev, blocks, used, and flags. These changes are required to allow top and other userland swap-monitoring utilities to run without setgid kmem. Submitted by: Thomas Moestl <tmoestl@gmx.net> Reviewed by: freebsd-audit	2001-02-23 18:46:21 +00:00
Seigo Tanimura	21cd6e6232	- If swap metadata does not fit into the KVM, reduce the number of struct swblock entries by dividing the number of the entries by 2 until the swap metadata fits. - Reject swapon(2) upon failure of swap_zone allocation. This is just a temporary fix. Better solutions include: (suggested by: dillon) o reserving swap in SWAP_META_PAGES chunks, and o swapping the swblock structures themselves. Reviewed by: alfred, dillon	2000-12-13 10:01:00 +00:00
David Malone	7cc0979fd6	Convert more malloc+bzero to malloc+M_ZERO. Submitted by: josh@zipperup.org Submitted by: Robert Drehmel <robd@gmx.net>	2000-12-08 21:51:06 +00:00
Robert Watson	cee313c431	o Export dmmax ("Maximum size of a swap block") using SYSCTL_INT. This removes a reason that systat requires setgid kmem. More to come.	2000-11-20 00:39:04 +00:00
Matthew Dillon	936524aa02	Implement a low-memory deadlock solution. Removed most of the hacks that were trying to deal with low-memory situations prior to now. The new code is based on the concept that I/O must be able to function in a low memory situation. All major modules related to I/O (except networking) have been adjusted to allow allocation out of the system reserve memory pool. These modules now detect a low memory situation but rather then block they instead continue to operate, then return resources to the memory pool instead of cache them or leave them wired. Code has been added to stall in a low-memory situation prior to a vnode being locked. Thus situations where a process blocks in a low-memory condition while holding a locked vnode have been reduced to near nothing. Not only will I/O continue to operate, but many prior deadlock conditions simply no longer exist. Implement a number of VFS/BIO fixes (found by Ian): in biodone(), bogus-page replacement code, the loop was not properly incrementing loop variables prior to a continue statement. We do not believe this code can be hit anyway but we aren't taking any chances. We'll turn the whole section into a panic (as it already is in brelse()) after the release is rolled. In biodone(), the foff calculation was incorrectly clamped to the iosize, causing the wrong foff to be calculated for pages in the case of an I/O error or biodone() called without initiating I/O. The problem always caused a panic before. Now it doesn't. The problem is mainly an issue with NFS. Fixed casts for ~PAGE_MASK. This code worked properly before only because the calculations use signed arithmatic. Better to properly extend PAGE_MASK first before inverting it for the 64 bit masking op. In brelse(), the bogus_page fixup code was improperly throwing away the original contents of 'm' when it did the j-loop to fix the bogus pages. The result was that it would potentially invalidate parts of the WRONG page(!), leading to corruption. There may still be cases where a background bitmap write is being duplicated, causing potential corruption. We have identified a potentially serious bug related to this but the fix is still TBD. So instead this patch contains a KASSERT to detect the problem and panic the machine rather then continue to corrupt the filesystem. The problem does not occur very often.. it is very hard to reproduce, and it may or may not be the cause of the corruption people have reported. Review by: (VFS/BIO: mckusick, Ian Dowse <iedowse@maths.tcd.ie>) Testing by: (VM/Deadlock) Paul Saab <ps@yahoo-inc.com>	2000-11-18 23:06:26 +00:00
Matthew Dillon	279d722604	This patchset fixes a large number of file descriptor race conditions. Pre-rfork code assumed inherent locking of a process's file descriptor array. However, with the advent of rfork() the file descriptor table could be shared between processes. This patch closes over a dozen serious race conditions related to one thread manipulating the table (e.g. closing or dup()ing a descriptor) while another is blocked in an open(), close(), fcntl(), read(), write(), etc... PR: kern/11629 Discussed with: Alexander Viro <viro@math.psu.edu>	2000-11-18 21:01:04 +00:00
Matthew Dillon	64bcb9c815	The swap bitmap allocator was not calculating the bitmap size properly in the face of non-stripe-aligned swap areas. The bug could cause a panic during boot. Refuse to configure a swap area that is too large (67 GB or so) Properly document the power-of-2 requirement for SWB_NPAGES. The patch is slightly different then the one Tor enclosed in the P.R., but accomplishes the same thing. PR: kern/20273 Submitted by: Tor.Egge@fast.no	2000-10-13 16:44:34 +00:00
Peter Wemm	0385347c1a	Implement an optimization of the VM<->pmap API. Pass vm_page_t's directly to various pmap_*() functions instead of looking up the physical address and passing that. In many cases, the first thing the pmap code was doing was going to a lot of trouble to get back the original vm_page_t, or it's shadow pv_table entry. Inspired by: John Dyson's 1998 patches. Also: Eliminate pv_table as a seperate thing and build it into a machine dependent part of vm_page_t. This eliminates having a seperate set of structions that shadow each other in a 1:1 fashion that we often went to a lot of trouble to translate from one to the other. (see above) This happens to save 4 bytes of physical memory for each page in the system. (8 bytes on the Alpha). Eliminate the use of the phys_avail[] array to determine if a page is managed (ie: it has pv_entries etc). Store this information in a flag. Things like device_pager set it because they create vm_page_t's on the fly that do not have pv_entries. This makes it easier to "unmanage" a page of physical memory (this will be taken advantage of in subsequent commits). Add a function to add a new page to the freelist. This could be used for reclaiming the previously wasted pages left over from preloaded loader(8) files. Reviewed by: dillon	2000-05-21 12:50:18 +00:00
Poul-Henning Kamp	9626b608de	Separate the struct bio related stuff out of <sys/buf.h> into <sys/bio.h>. <sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall not be made a nested include according to bdes teachings on the subject of nested includes. Diskdrivers and similar stuff below specfs::strategy() should no longer need to include <sys/buf.> unless they need caching of data. Still a few bogus uses of struct buf to track down. Repocopy by: peter	2000-05-05 09:59:14 +00:00
Poul-Henning Kamp	0b441832f3	Convert the vm_pager_strategy() interface to take a struct bio instead of a struct buf. Don't try to examine B_ASYNC, it is a layering violation to do so. The only current user of this interface is vn(4) which, since it emulates a disk interface, operates on struct bio already.	2000-05-03 07:47:46 +00:00
Poul-Henning Kamp	e4057dbdb0	Move and staticize the bufchain functions so they become local to the only piece of code using them. This will ease a rewrite of them.	2000-05-01 19:38:51 +00:00
Poul-Henning Kamp	8177437d85	Complete the bio/buf divorce for all code below devfs::strategy Exceptions: Vinum untouched. This means that it cannot be compiled. Greg Lehey is on the case. CCD not converted yet, casts to struct buf (still safe) atapi-cd casts to struct buf to examine B_PHYS	2000-04-15 05:54:02 +00:00
Poul-Henning Kamp	c244d2de43	Move B_ERROR flag to b_ioflags and call it BIO_ERROR. (Much of this done by script) Move B_ORDERED flag to b_ioflags and call it BIO_ORDERED. Move b_pblkno and b_iodone_chain to struct bio while we transition, they will be obsoleted once bio structs chain/stack. Add bio_queue field for struct bio aware disksort. Address a lot of stylistic issues brought up by bde.	2000-04-02 15:24:56 +00:00
Matthew Dillon	25db2c5417	Add necessary spl protection for swapper. The problem was located by Alfred while testing his SPLASSERT stuff. This is not a complete fix, more protections are probably needed.	2000-03-27 21:33:32 +00:00
Philippe Charnier	5929bcfaba	Revert spelling mistake I made in the previous commit Requested by: Alan and Bruce	2000-03-27 20:41:17 +00:00
Philippe Charnier	956f31353c	Spelling	2000-03-26 15:20:23 +00:00
Poul-Henning Kamp	912e4ae9f7	Fix one place which knew that B_WRITE was zero. Fix a stylistic mistake of mine while here. Found by: Stephen Hocking <shocking@prth.pgs.com>	2000-03-22 08:40:13 +00:00
Poul-Henning Kamp	b99c307a21	Rename the existing BUF_STRATEGY() to DEV_STRATEGY() substitute BUF_WRITE(foo) for VOP_BWRITE(foo->b_vp, foo) substitute BUF_STRATEGY(foo) for VOP_STRATEGY(foo->b_vp, foo) This patch is machine generated except for the ccd.c and buf.h parts.	2000-03-20 11:29:10 +00:00
Poul-Henning Kamp	21144e3bf1	Remove B_READ, B_WRITE and B_FREEBUF and replace them with a new field in struct buf: b_iocmd. The b_iocmd is enforced to have exactly one bit set. B_WRITE was bogusly defined as zero giving rise to obvious coding mistakes. Also eliminate the redundant struct buf flag B_CALL, it can just as efficiently be done by comparing b_iodone to NULL. Should you get a panic or drop into the debugger, complaining about "b_iocmd", don't continue. It is likely to write on your disk where it should have been reading. This change is a step in the direction towards a stackable BIO capability. A lot of this patch were machine generated (Thanks to style(9) compliance!) Vinum users: Greg has not had time to test this yet, be careful.	2000-03-20 10:44:49 +00:00
Poul-Henning Kamp	db5f635acc	Eliminate the undocumented, experimental, non-delivering and highly dangerous MAX_PERF option.	2000-03-16 08:51:55 +00:00
Peter Wemm	ea3aecf5b4	Fix the swap backed vn case - this was broken by my rev 1.128 to swap_pager.c and related commits. Essentially swap_pager.c is backed out to before the changes, but swapdev_vp is converted into a real vnode with just VOP_STRATEGY(). It no longer abuses specfs vnops and no longer needs a dev_t and /dev/drum (or /dev/swapdev) for the intermediate layer. This essentially restores the vnode interface as the interface to the bottom of the swap pager, and vm_swap.c provides a clean vnode interface. This will need to be revisited when we swap to files (vnodes) - which is the other reason for keeping the vnode interface between the swap pager and the swap devices. OK'ed by: dillon	1999-12-28 07:30:55 +00:00
Poul-Henning Kamp	24e7ab7c0b	Isolate the swapdev_vp "not quite" vnode in the only source file which needs it now that /dev/drum is gone. Reviewed by: eivind, peter	1999-11-22 15:27:09 +00:00
Peter Wemm	cdacc6ab42	Remove the non-functional "swap device" userland front-end to the multiplexed underlying swap devices (/dev/drum). The only thing it did was to allow root to open /dev/drum, but not do anything with it. Various utilities used to grovel around in here, but Matt has written a much nicer (and clean) front-end to this for libkvm, and nothing uses the old system any more. The VM system was calling VOP_STRATEGY() on the vp of the first underlying swap device (not the /dev/drum one, the first real device), and using the VOP system to indirectly (and only) call swstrategy() to choose an underlying device and enqueue it on that device. I have changed it to avoid diverting through the VOP system and to call the only possible target directly, saving a little bit of time and some complexity. In all, nothing much changes, except some scaffolding to support the roundabout way of calling swstrategy() is gone. Matt gave me the ok to do this some time ago, and I apologize for taking so long to get around to it.	1999-11-18 06:55:40 +00:00
Poul-Henning Kamp	923502ff91	useracc() the prequel: Merge the contents (less some trivial bordering the silly comments) of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>. This puts the #defines for the vm_inherit_t and vm_prot_t types next to their typedefs. This paves the road for the commit to follow shortly: change useracc() to use VM_PROT_{READ\|WRITE} rather than B_{READ\|WRITE} as argument.	1999-10-29 18:09:36 +00:00
Matthew Dillon	4dcc5c2d1d	Fix a number of spl bugs related to reserving and freeing swap space. Swap space can be freed from an interrupt and so swap reservation and freeing must occur at splvm. Add swap_pager_reserve() code to support a new swap pre-reservation capability for the VN device. Generally cleanup the swap code by simplifying the swp_pager_meta_build() static function and consolidating the SWAPBLK_NONE test from a bit test to an absolute compare. The bit test was left over from a rejected swap allocation scheme that was not ultimately committed. A few other minor cleanups were also made. Reorganize the swap strategy code, again for VN support, to not reallocate swap when writing as this messes up pre-reservation and can fragment I/O unnecessarily as VN-baesd disk is messed around with. Reviewed by: Alan Cox <alc@cs.rice.edu>, David Greenman <dg@root.com>	1999-09-17 05:09:24 +00:00
Peter Wemm	c3aac50f28	$Id$ -> $FreeBSD$	1999-08-28 01:08:13 +00:00
Bruce Evans	af647dde33	Use devtoname to print dev_t's instead of casting them to u_long for misprinting with %lx. Cast pointers to intptr_t instead of casting them to long. Cosmetic.	1999-08-23 23:55:03 +00:00
Alan Cox	c52e7044cf	Correct an accidental omission of one "vm_page_undirty" replacement from the previous commit.	1999-08-17 05:56:00 +00:00
Alan Cox	2c28a10540	Add the (inline) function vm_page_undirty for clearing the dirty bitmask of a vm_page. Use it. Submitted by: dillon	1999-08-17 04:02:34 +00:00
Alan Cox	9b21395a95	Remove vm_object::last_read. It is used by the old swap pager, but not by the new one, i.e., vm/swap_pager.c rev 1.108. Reviewed by: dillon@backplane.com	1999-07-16 05:11:37 +00:00
Peter Wemm	b890cb2cbd	Kirk missed a required BUF_KERNPROC(). Even though this is a non-async transfer, the b_iodone hook causes biodone() to release it from interrupt context.	1999-06-27 22:08:38 +00:00
Kirk McKusick	67812eacd7	Convert buffer locking from using the B_BUSY and B_WANTED flags to using lockmgr locks. This commit should be functionally equivalent to the old semantics. That is, all buffer locking is done with LK_EXCLUSIVE requests. Changes to take advantage of LK_SHARED and LK_RECURSIVE will be done in future commits.	1999-06-26 02:47:16 +00:00
Poul-Henning Kamp	b0eeea2042	remove b_proc from struct buf, it's (now) unused. Reviewed by: dillon, bde	1999-05-06 20:00:34 +00:00
Julian Elischer	a5296b05b4	Submitted by: Matt Dillon <dillon@freebsd.org> The old VN device broke in -4.x when the definition of B_PAGING changed. This patch fixes this plus implements additional capabilities. The new VN device can be backed by a file ( as per normal ), or it can be directly backed by swap. Due to dependencies in VM include files (on opt_xxx options) the new vn device cannot be a module yet. This will be fixed in a later commit. This commit delimitted by tags {PRE,POST}_MATT_VNDEV	1999-03-14 09:20:01 +00:00
Matthew Dillon	ad3cce2041	Remove conditional sysctl's Leave swap_async_max sysctl intact, remove swap_cluster_max sysctl. Reviewed by: Alan Cox <alc@cs.rice.edu>	1999-02-21 08:34:15 +00:00
Matthew Dillon	20d3034f39	Reviewed by: Alan Cox <alc@cs.rice.edu> Fix problem w/ low-swap/low-memory handling as reported by Bruce Evans.	1999-02-21 08:30:49 +00:00
Matthew Dillon	327f4e8394	Limit number of simultanious asynchronous swap pager I/Os that can be in progress at any given moment. Add two swap tuneables to sysctl: vm.swap_async_max: 4 vm.swap_cluster_max: 16 Recommended values are a cluster size of 8 or 16 pages. async_max is about right for 1-4 swap devices. Reduce to 2 if swap is eating too much bandwidth, or even 1 if swap is both eating too much bandwidth and sitting on a slow network (10BaseT). The defaults work well across a broad range of configurations and should normally be left alone.	1999-02-18 19:57:33 +00:00
Matthew Dillon	2b0d37a4f8	Add hysteresis to the 'swap_pager_getswapspace; failed' console message. Also widen the hysteresis levels a little ( these really should be dynamically configured ).	1999-02-06 07:22:21 +00:00
Matthew Dillon	5e24f1a2f6	Remove unintended trigraph sequences in comments for -Wall	1999-01-27 18:19:53 +00:00
Matthew Dillon	7dbf82dc13	Change all manual settings of vm_page_t->dirty = VM_PAGE_BITS_ALL to use the vm_page_dirty() inline. The inline can thus do sanity checks ( or not ) over all cases.	1999-01-24 06:04:52 +00:00
Matthew Dillon	e4542174b0	vm_pager_put_pages() is passed an rcval array to hold per-page return values. The 'int' return value for the procedure was never used and not well defined in any case when there are mixed errors on pages, so it has been removed. vm_pager_put_pages() and associated vm_pager functions now return void.	1999-01-24 02:32:15 +00:00
Matthew Dillon	9f6fed9017	The default_pager's interaction with the swap_pager has been reorganized, and the swap_pager has been completely replaced. The new swap pager uses the new blist radix-tree based bitmap allocator for low level swap allocation and deallocation. The new allocator is effectively O(5) while the old one was O(N), and the new allocator allocates all required memory at init time rather then at allocate memory on the fly at run time. Swap metadata is allocated in clusters and stored in a hash table, eliminating linearly allocated structures. Many, many features have been rewritten or added. Swap space is now reallocated on the fly providing a poor-mans auto defragmentation of swap space. Swap space that is no longer needed is freed on a timely basis so no garbage collection is necessary. Swap I/O is marked B_ASYNC and NFS has been fixed to do the right thing with it, so NFS-based paging now has around 10x the performance as it did before ( previously NFS enforced synchronous I/O for paging ).	1999-01-21 09:33:07 +00:00
Matthew Dillon	1c7c3c6a86	This is a rather large commit that encompasses the new swapper, changes to the VM system to support the new swapper, VM bug fixes, several VM optimizations, and some additional revamping of the VM code. The specific bug fixes will be documented with additional forced commits. This commit is somewhat rough in regards to code cleanup issues. Reviewed by: "John S. Dyson" <root@dyson.iquest.net>, "David Greenman" <dg@root.com>	1999-01-21 08:29:12 +00:00
Eivind Eklund	219cbf59f2	KNFize, by bde.	1999-01-10 01:58:29 +00:00
Eivind Eklund	5526d2d920	Split DIAGNOSTIC -> DIAGNOSTIC, INVARIANTS, and INVARIANT_SUPPORT as discussed on -hackers. Introduce 'KASSERT(assertion, ("panic message", args))' for simple check + panic. Reviewed by: msmith	1999-01-08 17:31:30 +00:00
Dmitrij Tejblum	7a91724556	Don't free swap in swap_pager_getpages(): this code probably cause the "dying daemons" problem. (I thought this code was introduced in rev.1.80, but it just relaxed the condition.) Also, kill related "suggest more swap space" warning (also introduced in 1.80). It was confusing, to say the least... Requested by: msmith Not objected by: dg	1998-12-29 22:53:51 +00:00
Bruce Evans	04258de351	Fixed a null pointer panic in spc_free(). swap_pager_putpages() almost always causes this panic for the curproc != pageproc case. This case apparently doesn't happen in normal operation, but it happens when vm_page_alloc_contig() is called when there is a memory hogging application that hasn't already been paged out. PR: 8632 Reviewed by: info@opensound.com (Dev Mazumdar), dg Broken in: rev.1.89 (1998/02/23)	1998-11-19 06:20:42 +00:00
Peter Wemm	40c8cfe552	Use TAILQ macros for clean/dirty block list processing. Set b_xflags rather than abusing the list next pointer with a magic number.	1998-10-31 15:31:29 +00:00
David Greenman	6cde7a165f	Fixed two potentially serious classes of bugs: 1) The vnode pager wasn't properly tracking the file size due to "size" being page rounded in some cases and not in others. This sometimes resulted in corrupted files. First noticed by Terry Lambert. Fixed by changing the "size" pager_alloc parameter to be a 64bit byte value (as opposed to a 32bit page index) and changing the pagers and their callers to deal with this properly. 2) Fixed a bogus type cast in round_page() and trunc_page() that caused some 64bit offsets and sizes to be scrambled. Removing the cast required adding casts at a few dozen callers. There may be problems with other bogus casts in close-by macros. A quick check seemed to indicate that those were okay, however.	1998-10-13 08:24:45 +00:00
Doug Rabson	e69763a315	Cosmetic changes to the PAGE_XXX macros to make them consistent with the other objects in vm.	1998-09-04 08:06:57 +00:00
Doug Rabson	069e9bc1b4	Change various syscalls to use size_t arguments instead of u_int. Add some overflow checks to read/write (from bde). Change all modifications to vm_page::flags, vm_page::busy, vm_object::flags and vm_object::paging_in_progress to use operations which are not interruptable. Reviewed by: Bruce Evans <bde@zeta.org.au>	1998-08-24 08:39:39 +00:00
Doug Rabson	196e9a52ec	Protect all modifications to paging_in_progress with splvm().	1998-08-13 08:05:13 +00:00
Bruce Evans	ccbbd9271b	Fixed two spl nesting bugs. They caused (at least) the entire pageout daemon to run at splvm() forever after swap_pager_putpages() is called from vm_pageout_scan(). Broken in: rev.1.189 (1998/02/23)	1998-07-28 15:30:01 +00:00
Bruce Evans	ac1e407b32	Fixed printf format errors.	1998-07-11 07:46:16 +00:00
Julian Elischer	fd5d1124e2	VOP_STRATEGY grows an (struct vnode *) argument as the value in b_vp is often not really what you want. (and needs to be frobbed). more cleanups will follow this. Reviewed by: Bruce Evans <bde@freebsd.org>	1998-07-04 20:45:42 +00:00
John Dyson	cbd8ec0902	Work around some VM bugs, the worst being an overly aggressive swap space free calculation. More complete fixes will be forthcoming, in a week.	1998-05-04 03:01:44 +00:00
John Dyson	c0877f103f	Tighten up management of memory and swap space during map allocation, deallocation cycles. This should provide a measurable improvement on swap and memory allocation on loaded systems. It is unlikely a complete solution. Also, provide more map info with procfs. Chuck Cranor spurred on this improvement.	1998-04-29 04:28:22 +00:00
Bruce Evans	c1087c1324	Support compiling with `gcc -ansi'.	1998-04-15 17:47:40 +00:00
John Dyson	8f9110f6a1	This mega-commit is meant to fix numerous interrelated problems. There has been some bitrot and incorrect assumptions in the vfs_bio code. These problems have manifest themselves worse on NFS type filesystems, but can still affect local filesystems under certain circumstances. Most of the problems have involved mmap consistancy, and as a side-effect broke the vfs.ioopt code. This code might have been committed seperately, but almost everything is interrelated. 1) Allow (pmap_object_init_pt) prefaulting of buffer-busy pages that are fully valid. 2) Rather than deactivating erroneously read initial (header) pages in kern_exec, we now free them. 3) Fix the rundown of non-VMIO buffers that are in an inconsistent (missing vp) state. 4) Fix the disassociation of pages from buffers in brelse. The previous code had rotted and was faulty in a couple of important circumstances. 5) Remove a gratuitious buffer wakeup in vfs_vmio_release. 6) Remove a crufty and currently unused cluster mechanism for VBLK files in vfs_bio_awrite. When the code is functional, I'll add back a cleaner version. 7) The page busy count wakeups assocated with the buffer cache usage were incorrectly cleaned up in a previous commit by me. Revert to the original, correct version, but with a cleaner implementation. 8) The cluster read code now tries to keep data associated with buffers more aggressively (without breaking the heuristics) when it is presumed that the read data (buffers) will be soon needed. 9) Change to filesystem lockmgr locks so that they use LK_NOPAUSE. The delay loop waiting is not useful for filesystem locks, due to the length of the time intervals. 10) Correct and clean-up spec_getpages. 11) Implement a fully functional nfs_getpages, nfs_putpages. 12) Fix nfs_write so that modifications are coherent with the NFS data on the server disk (at least as well as NFS seems to allow.) 13) Properly support MS_INVALIDATE on NFS. 14) Properly pass down MS_INVALIDATE to lower levels of the VM code from vm_map_clean. 15) Better support the notion of pages being busy but valid, so that fewer in-transit waits occur. (use p->busy more for pageouts instead of PG_BUSY.) Since the page is fully valid, it is still usable for reads. 16) It is possible (in error) for cached pages to be busy. Make the page allocation code handle that case correctly. (It should probably be a printf or panic, but I want the system to handle coding errors robustly. I'll probably add a printf.) 17) Correct the design and usage of vm_page_sleep. It didn't handle consistancy problems very well, so make the design a little less lofty. After vm_page_sleep, if it ever blocked, it is still important to relookup the page (if the object generation count changed), and verify it's status (always.) 18) In vm_pageout.c, vm_pageout_clean had rotted, so clean that up. 19) Push the page busy for writes and VM_PROT_READ into vm_pageout_flush. 20) Fix vm_pager_put_pages and it's descendents to support an int flag instead of a boolean, so that we can pass down the invalidate bit.	1998-03-07 21:37:31 +00:00
John Dyson	ffc82b0a70	1) Use a more consistent page wait methodology. 2) Do not unnecessarily force page blocking when paging pages out. 3) Further improve swap pager performance and correctness, including fixing the paging in progress deadlock (except in severe I/O error conditions.) 4) Enable vfs_ioopt=1 as a default. 5) Fix and enable the page prezeroing in SMP mode. All in all, SMP systems especially should show a significant improvement in "snappyness."	1998-03-01 04:18:54 +00:00
John Dyson	660957521c	Fix page prezeroing for SMP, and fix some potential paging-in-progress hangs. The paging-in-progress diagnosis was a result of Tor Egge's excellent detective work. Submitted by: Partially from Tor Egge.	1998-02-25 03:56:15 +00:00
John Dyson	e47ed70b0f	Significantly improve the efficiency of the swap pager, which appears to have declined due to code-rot over time. The swap pager rundown code has been clean-up, and unneeded wakeups removed. Lots of splbio's are changed to splvm's. Also, set the dynamic tunables for the pageout daemon to be more sane for larger systems (thereby decreasing the daemon overheadla.)	1998-02-23 08:22:48 +00:00
Eivind Eklund	303b270b0a	Staticize.	1998-02-09 06:11:36 +00:00
Eivind Eklund	0b08f5f737	Back out DIAGNOSTIC changes.	1998-02-06 12:14:30 +00:00
Eivind Eklund	47cfdb166d	Turn DIAGNOSTIC into a new-style option.	1998-02-04 22:34:03 +00:00
John Dyson	e736cd05cb	This fix should help the panic problems in -current. There were some errors in "interval" management. Due to the clustering mechanism, the code is necessarily complex and error prone.	1998-02-03 00:50:36 +00:00
John Dyson	1f13bdaa97	Fix a performance problem caused by an earlier commit.	1998-02-01 02:00:20 +00:00
John Dyson	eaf13dd73a	Change the busy page mgmt, so that when pages are freed, they MUST be PG_BUSY. It is bogus to free a page that isn't busy, because it is in a state of being "unavailable" when being freed. The additional advantage is that the page_remove code has a better cross-check that the page should be busy and unavailable for other use. There were some minor problems with the collapse code, and this plugs those subtile "holes." Also, the vfs_bio code wasn't checking correctly for PG_BUSY pages. I am going to develop a more consistant scheme for grabbing pages, busy or otherwise. For now, we are stuck with the current morass.	1998-01-31 11:56:53 +00:00
John Dyson	2d8acc0f4a	VM level code cleanups. 1) Start using TSM. Struct procs continue to point to upages structure, after being freed. Struct vmspace continues to point to pte object and kva space for kstack. u_map is now superfluous. 2) vm_map's don't need to be reference counted. They always exist either in the kernel or in a vmspace. The vmspaces are managed by reference counts. 3) Remove the "wired" vm_map nonsense. 4) No need to keep a cache of kernel stack kva's. 5) Get rid of strange looking ++var, and change to var++. 6) Change more data structures to use our "zone" allocator. Added struct proc, struct vmspace and struct vnode. This saves a significant amount of kva space and physical memory. Additionally, this enables TSM for the zone managed memory. 7) Keep ioopt disabled for now. 8) Remove the now bogus "single use" map concept. 9) Use generation counts or id's for data structures residing in TSM, where it allows us to avoid unneeded restart overhead during traversals, where blocking might occur. 10) Account better for memory deficits, so the pageout daemon will be able to make enough memory available (experimental.) 11) Fix some vnode locking problems. (From Tor, I think.) 12) Add a check in ufs_lookup, to avoid lots of unneeded calls to bcmp. (experimental.) 13) Significantly shrink, cleanup, and make slightly faster the vm_fault.c code. Use generation counts, get rid of unneded collpase operations, and clean up the cluster code. 14) Make vm_zone more suitable for TSM. This commit is partially as a result of discussions and contributions from other people, including DG, Tor Egge, PHK, and probably others that I have forgotten to attribute (so let me know, if I forgot.) This is not the infamous, final cleanup of the vnode stuff, but a necessary step. Vnode mgmt should be correct, but things might still change, and there is still some missing stuff (like ioopt, and physical backing of non-merged cache files, debugging of layering concepts.)	1998-01-22 17:30:44 +00:00
John Dyson	4722175765	Tie up some loose ends in vnode/object management. Remove an unneeded config option in pmap. Fix a problem with faulting in pages. Clean-up some loose ends in swap pager memory management. The system should be much more stable, but all subtile bugs aren't fixed yet.	1998-01-17 09:17:02 +00:00
John Dyson	b44e4b7a2b	Support running with inadequate swap space. Additionally, the code will complain with a suggestion of increasing it.	1997-12-24 15:05:25 +00:00
Poul-Henning Kamp	ab3f746966	In all such uses of struct buf: 's/b_un.b_addr/b_data/g'	1997-12-02 21:07:20 +00:00
Bruce Evans	79624e2147	Removed unused #includes.	1997-09-01 03:17:34 +00:00
Bruce Evans	dfeca1b8ae	Print a device number in hex instead of decimal.	1997-09-01 02:28:32 +00:00
Bruce Evans	b9dcd593ff	Fixed type mismatches for functions with args of type vm_prot_t and/or vm_inherit_t. These types are smaller than ints, so the prototypes should have used the promoted type (int) to match the old-style function definitions. They use just vm_prot_t and/or vm_inherit_t. This depends on gcc features to work. I fixed the definitions since this is easiest. The correct fix may be to change the small types to u_int, to optimize for time instead of space.	1997-08-25 22:15:31 +00:00
Peter Wemm	6875d25465	Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not ready for it yet.	1997-02-22 09:48:43 +00:00
Jordan K. Hubbard	1130b656e5	Make the long-awaited change from $Id$ to $FreeBSD$ This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long. Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.	1997-01-14 07:20:47 +00:00
John Dyson	9b5a5d81be	Prepare better for multi-platform by eliminating another required pmap routine (pmap_is_referenced.) Upper level recoded to use pmap_ts_referenced.	1997-01-11 07:19:02 +00:00
Bruce Evans	8ba0c490a9	Removed __pure's and __pure2's. __pure is a no-op for recent versions of gcc by definition, and __pure2 is a no-op in effect (presumably the compiler can see when an inline function has no side effects).	1996-10-12 20:09:48 +00:00
John Dyson	5070c7f8c5	Addition of page coloring support. Various levels of coloring are afforded. The default level works with minimal overhead, but one can also enable full, efficient use of a 512K cache. (Parameters can be generated to support arbitrary cache sizes also.)	1996-09-08 20:44:49 +00:00
John Dyson	67bf686897	Backed out the recent changes/enhancements to the VM code. The problem with the 'shell scripts' was found, but there was a 'strange' problem found with a 486 laptop that we could not find. This commit backs the code back to 25-jul, and will be re-entered after the snapshot in smaller (more easily tested) chunks.	1996-07-30 03:08:57 +00:00
John Dyson	4f4d35edf0	This commit is meant to solve a couple of VM system problems or performance issues. 1) The pmap module has had too many inlines, and so the object file is simply bigger than it needs to be. Some common code is also merged into subroutines. 2) Removal of some evil PHYS_TO_VM_PAGE macro calls. Unfortunately, a few have needed to be added also. The removal caused the need for more vm_page_lookups. I added lookup hints to minimize the need for the page table lookup operations. 3) Removal of some bogus performance improvements, that mostly made the code more complex (tracking individual page table page updates unnecessarily). Those improvements actually hurt 386 processors perf (not that people who worry about perf use 386 processors anymore :-)). 4) Changed pv queue manipulations/structures to be TAILQ's. 5) The pv queue code has had some performance problems since day one. Some significant scalability issues are resolved by threading the pv entries from the pmap AND the physical address instead of just the physical address. This makes certain pmap operations run much faster. This does not affect most micro-benchmarks, but should help loaded system performance significantly. DG helped and came up with most of the solution for this one. 6) Most if not all pmap bit operations follow the pattern: pmap_test_bit(); pmap_clear_bit(); That made for twice the necessary pv list traversal. The pmap interface now supports only pmap_tc_bit type operations: pmap_[test/clear]_modified, pmap_[test/clear]_referenced. Additionally, the modified routine now takes a vm_page_t arg instead of a phys address. This eliminates a PHYS_TO_VM_PAGE operation. 7) Several rewrites of routines that contain redundant code to use common routines, so that there is a greater likelihood of keeping the cache footprint smaller.	1996-07-27 03:24:10 +00:00
John Dyson	3091ee0955	Mostly superficial code improvements, add a diagnostic. The code improvements include significant simplification of the reservation of the swap pager control blocks for reads. Add a panic for an inconsistent swap pager control block count.	1996-06-10 04:58:48 +00:00
John Dyson	0a47b48b9f	Initial support for MADV_FREE, support for pages that we don't care about the contents anymore. This gives us alot of the advantage of freeing individual pages through munmap, but with almost none of the overhead.	1996-05-23 00:45:58 +00:00
John Dyson	b18bfc3da7	This set of commits to the VM system does the following, and contain contributions or ideas from Stephen McKay <syssgm@devetir.qld.gov.au>, Alan Cox <alc@cs.rice.edu>, David Greenman <davidg@freebsd.org> and me: More usage of the TAILQ macros. Additional minor fix to queue.h. Performance enhancements to the pageout daemon. Addition of a wait in the case that the pageout daemon has to run immediately. Slightly modify the pageout algorithm. Significant revamp of the pmap/fork code: 1) PTE's and UPAGES's are NO LONGER in the process's map. 2) PTE's and UPAGES's reside in their own objects. 3) TOTAL elimination of recursive page table pagefaults. 4) The page directory now resides in the PTE object. 5) Implemented pmap_copy, thereby speeding up fork time. 6) Changed the pv entries so that the head is a pointer and not an entire entry. 7) Significant cleanup of pmap_protect, and pmap_remove. 8) Removed significant amounts of machine dependent fork code from vm_glue. Pushed much of that code into the machine dependent pmap module. 9) Support more completely the reuse of already zeroed pages (Page table pages and page directories) as being already zeroed. Performance and code cleanups in vm_map: 1) Improved and simplified allocation of map entries. 2) Improved vm_map_copy code. 3) Corrected some minor problems in the simplify code. Implemented splvm (combo of splbio and splimp.) The VM code now seldom uses splhigh. Improved the speed of and simplified kmem_malloc. Minor mod to vm_fault to avoid using pre-zeroed pages in the case of objects with backing objects along with the already existant condition of having a vnode. (If there is a backing object, there will likely be a COW... With a COW, it isn't necessary to start with a pre-zeroed page.) Minor reorg of source to perhaps improve locality of ref.	1996-05-18 03:38:05 +00:00
Poul-Henning Kamp	aa8de40ae5	Another sweep over the pmap/vm macros, this time with more focus on the usage. I'm not satisfied with the naming, but now at least there is less bogus stuff around.	1996-05-03 21:01:54 +00:00
Poul-Henning Kamp	e911eafcba	removed: CLBYTES PD_SHIFT PGSHIFT NBPG PGOFSET CLSIZELOG2 CLSIZE pdei() ptei() kvtopte() ptetov() ispt() ptetoav() &c &c new: NPDEPG Major macro cleanup.	1996-05-02 14:21:14 +00:00
John Dyson	45952afcc7	Fix a problem in the swap pager that caused some of the pages that were paged in under low swap space conditions to both loose their backing store and their dirty bits. This would cause pages to be demand zeroed under certain conditions in low VM space conditions and consequential sig-11's or sig-10's. This situation was made worse lately when the level for swap space reclaim threshold was increased.	1996-03-06 04:31:46 +00:00
John Dyson	836e5d1360	In order to fix some concurrency problems with the swap pager early on in the FreeBSD development, I had made a global lock around the rlist code. This was bogus, and now the lock is maintained on a per resource list basis. This now allows the rlist code to be used for almost any non-interrupt level application.	1996-03-03 21:11:08 +00:00
John Dyson	de5f6a7765	1) Eliminate unnecessary bzero of UPAGES. 2) Eliminate unnecessary copying of pages during/after forks. 3) Add user map simplification.	1996-03-02 02:54:24 +00:00
David Greenman	1af87c9263	"out of space" -> "out of swap space".	1996-01-31 13:14:21 +00:00
John Dyson	bd7e5f992e	Eliminated many redundant vm_map_lookup operations for vm_mmap. Speed up for vfs_bio -- addition of a routine bqrelse to greatly diminish overhead for merged cache. Efficiency improvement for vfs_cluster. It used to do alot of redundant calls to cluster_rbuild. Correct the ordering for vrele of .text and release of credentials. Use the selective tlb update for 486/586/P6. Numerous fixes to the size of objects allocated for files. Additionally, fixes in the various pagers. Fixes for proper positioning of vnode_pager_setsize in msdosfs and ext2fs. Fixes in the swap pager for exhausted resources. The pageout code will not as readily thrash. Change the page queue flags (PG_ACTIVE, PG_INACTIVE, PG_FREE, PG_CACHE) into page queue indices (PQ_ACTIVE, PQ_INACTIVE, PQ_FREE, PQ_CACHE), thereby improving efficiency of several routines. Eliminate even more unnecessary vm_page_protect operations. Significantly speed up process forks. Make vm_object_page_clean more efficient, thereby eliminating the pause that happens every 30seconds. Make sequential clustered writes B_ASYNC instead of B_DELWRI even in the case of filesystems mounted async. Fix a panic with busy pages when write clustering is done for non-VMIO buffers.	1996-01-19 04:00:31 +00:00
Bruce Evans	f2c6b65bab	Fixed 1TB filesize changes. Some pindexes had bogus names and types but worked because vm_pindex_t is indistinuishable from vm_offset_t.	1995-12-17 07:19:58 +00:00
Poul-Henning Kamp	f708ef1b9e	Another mega commit to staticize things.	1995-12-14 09:55:16 +00:00
Poul-Henning Kamp	87b6de2b76	A Major staticize sweep. Generates a couple of warnings that I'll deal with later. A number of unused vars removed. A number of unused procs removed or #ifdefed.	1995-12-14 08:32:45 +00:00
John Dyson	cb6962cdee	Some new anti-deadlock code ended up messing up the paging stats. A modified version of the code is now in place, and gausspage performance is back up to where it should be.	1995-12-11 15:43:33 +00:00
John Dyson	a316d390bd	Changes to support 1Tb filesizes. Pages are now named by an (object,index) pair instead of (object,offset) pair.	1995-12-11 04:58:34 +00:00
David Greenman	efeaf95a41	Untangled the vm.h include file spaghetti.	1995-12-07 12:48:31 +00:00
Bruce Evans	cac597e4c0	Completed function declarations and/or added prototypes. Staticized some functions. __purified some functions. Some functions were bogusly declared as returning `const'. This hasn't done anything since gcc-2.5. For later versions of gcc, the equivalent is __attribute__((const)) at the end of function declarations.	1995-12-03 12:18:39 +00:00
Poul-Henning Kamp	3af768903d	Remove unused vars & funcs, make things static, protoize a little bit.	1995-11-20 12:20:02 +00:00
Bruce Evans	ff98689d2d	Fixed recent staticizations. Some protypes for static functions were left in headers and not staticized.	1995-11-16 09:51:22 +00:00
Poul-Henning Kamp	f5a1271110	staticize.	1995-11-14 20:53:20 +00:00
David Greenman	23922ccaca	Move page fixups (pmap_clear_modify, etc) that happen after paging input completes out of vm_fault and into the pagers. This get rid of some redundancy and improves the architecture. Reviewed by: John Dyson <dyson>	1995-11-02 06:42:47 +00:00
David Greenman	2f82e60467	Check that the swap block is valid before including it in a cluster. Submitted by: John Dyson	1995-09-24 04:40:19 +00:00
John Dyson	894048d71c	Make sure that the prezero flag is cleared when needed.	1995-09-11 00:47:17 +00:00
John Dyson	ca56715fd8	Fixed a sign reversal problem -- might have cause some Sig-11s that people have been seeing.	1995-09-06 07:08:45 +00:00
John Dyson	170db9c63a	Allow the fault code to use additional clustering info from both bmap and the swap pager. Improved fault clustering performance.	1995-09-04 04:44:26 +00:00
David Greenman	2a4895f4bb	1) Merged swpager structure into vm_object. 2) Changed swap_pager internal interfaces to cope w/#1. 3) Eliminated object->copy as we no longer have copy objects. 4) Minor stylistic changes.	1995-07-16 13:28:37 +00:00
David Greenman	24a1cce34f	NOTE: libkvm, w, ps, 'top', and any other utility which depends on struct proc or any VM system structure will have to be rebuilt!!! Much needed overhaul of the VM system. Included in this first round of changes: 1) Improved pager interfaces: init, alloc, dealloc, getpages, putpages, haspage, and sync operations are supported. The haspage interface now provides information about clusterability. All pager routines now take struct vm_object's instead of "pagers". 2) Improved data structures. In the previous paradigm, there is constant confusion caused by pagers being both a data structure ("allocate a pager") and a collection of routines. The idea of a pager structure has escentially been eliminated. Objects now have types, and this type is used to index the appropriate pager. In most cases, items in the pager structure were duplicated in the object data structure and thus were unnecessary. In the few cases that remained, a un_pager structure union was created in the object to contain these items. 3) Because of the cleanup of #1 & #2, a lot of unnecessary layering can now be removed. For instance, vm_object_enter(), vm_object_lookup(), vm_object_remove(), and the associated object hash list were some of the things that were removed. 4) simple_lock's removed. Discussion with several people reveals that the SMP locking primitives used in the VM system aren't likely the mechanism that we'll be adopting. Even if it were, the locking that was in the code was very inadequate and would have to be mostly re-done anyway. The locking in a uni-processor kernel was a no-op but went a long way toward making the code difficult to read and debug. 5) Places that attempted to kludge-up the fact that we don't have kernel thread support have been fixed to reflect the reality that we are really dealing with processes, not threads. The VM system didn't have complete thread support, so the comments and mis-named routines were just wrong. We now use tsleep and wakeup directly in the lock routines, for instance. 6) Where appropriate, the pagers have been improved, especially in the pager_alloc routines. Most of the pager_allocs have been rewritten and are now faster and easier to maintain. 7) The pagedaemon pageout clustering algorithm has been rewritten and now tries harder to output an even number of pages before and after the requested page. This is sort of the reverse of the ideal pagein algorithm and should provide better overall performance. 8) Unnecessary (incorrect) casts to caddr_t in calls to tsleep & wakeup have been removed. Some other unnecessary casts have also been removed. 9) Some almost useless debugging code removed. 10) Terminology of shadow objects vs. backing objects straightened out. The fact that the vm_object data structure escentially had this backwards really confused things. The use of "shadow" and "backing object" throughout the code is now internally consistent and correct in the Mach terminology. 11) Several minor bug fixes, including one in the vm daemon that caused 0 RSS objects to not get purged as intended. 12) A "default pager" has now been created which cleans up the transition of objects to the "swap" type. The previous checks throughout the code for swp->pg_data != NULL were really ugly. This change also provides the rudiments for future backing of "anonymous" memory by something other than the swap pager (via the vnode pager, for example), and it allows the decision about which of these pagers to use to be made dynamically (although will need some additional decision code to do this, of course). 13) (dyson) MAP_COPY has been deprecated and the corresponding "copy object" code has been removed. MAP_COPY was undocumented and non- standard. It was furthermore broken in several ways which caused its behavior to degrade to MAP_PRIVATE. Binaries that use MAP_COPY will continue to work correctly, but via the slightly different semantics of MAP_PRIVATE. 14) (dyson) Sharing maps have been removed. It's marginal usefulness in a threads design can be worked around in other ways. Both #12 and #13 were done to simplify the code and improve readability and maintain- ability. (As were most all of these changes) TODO: 1) Rewrite most of the vnode pager to use VOP_GETPAGES/PUTPAGES. Doing this will reduce the vnode pager to a mere fraction of its current size. 2) Rewrite vm_fault and the swap/vnode pagers to use the clustering information provided by the new haspage pager interface. This will substantially reduce the overhead by eliminating a large number of VOP_BMAP() calls. The VOP_BMAP() filesystem interface should be improved to provide both a "behind" and "ahead" indication of contiguousness. 3) Implement the extended features of pager_haspage in swap_pager_haspage(). It currently just says 0 pages ahead/behind. 4) Re-implement the swap device (swstrategy) in a more elegant way, perhaps via a much more general mechanism that could also be used for disk striping of regular filesystems. 5) Do something to improve the architecture of vm_object_collapse(). The fact that it makes calls into the swap pager and knows too much about how the swap pager operates really bothers me. It also doesn't allow for collapsing of non-swap pager objects ("unnamed" objects backed by other pagers).	1995-07-13 08:48:48 +00:00
Rodney W. Grimes	9b2e535452	Remove trailing whitespace.	1995-05-30 08:16:23 +00:00
David Greenman	5f55e84104	Accessing pages beyond the end of a mapped file results in internal inconsistencies in the VM system that eventually lead to a panic. These changes fix the behavior to conform to the behavior in SunOS, which is to deny faults to pages beyond the EOF (returning SIGBUS). Internally, this is implemented by requiring faults to be within the object size boundaries. These changes exposed another bug, namely that passing in an offset to mmap when trying to map an unnamed anonymous region also results in internal inconsistencies. In this case, the offset is forced to zero. Reviewed by: John Dyson and others	1995-05-18 02:59:26 +00:00
David Greenman	a401ebbe32	Changed swap partition handling/allocation so that it doesn't require specific partitions be mentioned in the kernel config file ("swap on foo" is now obsolete). From Poul-Henning: The visible effect is this: As default, unless options "NSWAPDEV=23" is in your config, you will have four swap-devices. You can swapon(2) any block device you feel like, it doesn't have to be in the kernel config. There is a performance/resource win available by getting the NSWAPDEV right (but only if you have just one swap-device ??), but using that as default would be too restrictive. The invisible effect is that: Swap-handling disappears from the $arch part of the kernel. It gets a lot simpler (-145 lines) and cleaner. Reviewed by: John Dyson, David Greenman Submitted by: Poul-Henning Kamp, with minor changes by me.	1995-05-14 03:00:10 +00:00
David Greenman	ee3a64c992	Changed "handle" from type caddr_t to void ; "handle" is several different types of pointers, and "char " is a bad choice for the type.	1995-05-10 18:56:09 +00:00
John Dyson	11fda60b40	Another error in the correction for trimming swap allocation for small objects. (This code needs to be revisited.)	1995-05-07 06:36:59 +00:00
John Dyson	85b67b98fd	Fixed a calculation that would once-in-a-while cause the swap_pager to emit spurious page outside of object type messages. It is not a fatal condition anyway, so the message will be omitted for release. Also, the code that "clips" the allocation size, associated with the above problem, was fixed.	1995-05-07 03:48:54 +00:00
David Greenman	aba8f38eb9	New flag: B_PAGING. Added as part of the vn driver hack.	1995-04-19 10:32:11 +00:00
David Greenman	64abb5a522	Removed obsolete/unused variable declarations. Removed some extern declarations and included the correct include files.	1995-04-16 13:58:42 +00:00
David Greenman	c3cb3e12f5	Moved some zero-initialized variables into .bss. Made code intended to be called only from DDB #ifdef DDB. Removed some completely unused globals.	1995-04-16 12:56:22 +00:00
David Greenman	c419d77e29	Added a check for wrong object size; print a warning, but deal with it correctly. The warning will tell us that there is a bug somewhere else in sizing the object correctly. Submitted by: John Dyson	1995-03-22 05:12:18 +00:00
David Greenman	edf8a81561	Removed redundant newlines that were in some panic strings.	1995-03-19 14:29:26 +00:00
David Greenman	63635f5ae2	Clear OBJ_INTERNAL flag for device pager objects and named anonymous objects.	1995-03-11 22:25:02 +00:00
David Greenman	f919ebde54	Various changes from John and myself that do the following: New functions create - vm_object_pip_wakeup and pagedaemon_wakeup that are used to reduce the actual number of wakeups. New function vm_page_protect which is used in conjuction with some new page flags to reduce the number of calls to pmap_page_protect. Minor changes to reduce unnecessary spl nesting. Rewrote vm_page_alloc() to improve readability. Various other mostly cosmetic changes.	1995-03-01 23:30:04 +00:00
David Greenman	c3a1e4250f	Fixed severely broken printf (arguments out of order, no newline).	1995-02-25 17:02:48 +00:00
David Greenman	c0503609a0	Only do object paging_in_progress wakeups if someone is waiting on this condition. Submitted by: John Dyson	1995-02-22 09:15:35 +00:00
David Greenman	7fb0c17ed2	Deprecated remaining use of vm_deallocate. Deprecated vm_allocate_with_ pager(). Almost completely rewrote vm_mmap(); when John gets done with the bottom half, it will be a complete rewrite. Deprecated most use of vm_object_setpager(). Removed side effect of setting object persist in vm_object_enter and moved this into the pager(s). A few other cosmetic changes.	1995-02-21 01:22:48 +00:00
David Greenman	a1f6d91cc2	swap_pager.c: Fixed long standing bug in freeing swap space during object collapses. Fixed 'out of space' messages from printing out too often. Modified to use new kmem_malloc() calling convention. Implemented an additional stat in the swap pager struct to count the amount of space allocated to that pager. This may be removed at some point in the future. Minimized unnecessary wakeups. vm_fault.c: Don't try to collect fault stats on 'swapped' processes - there aren't any upages to store the stats in. Changed read-ahead policy (again!). vm_glue.c: Be sure to gain a reference to the process's map before swapping. Be sure to lose it when done. kern_malloc.c: Added the ability to specify if allocations are at interrupt time or are 'safe'; this affects what types of pages can be allocated. vm_map.c: Fixed a variety of map lock problems; there's still a lurking bug that will eventually bite. vm_object.c: Explicitly initialize the object fields rather than bzeroing the struct. Eliminated the 'rcollapse' code and folded it's functionality into the "real" collapse routine. Moved an object_unlock() so that the backing_object is protected in the qcollapse routine. Make sure nobody fools with the backing_object when we're destroying it. Added some diagnostic code which can be called from the debugger that looks through all the internal objects and makes certain that they all belong to someone. vm_page.c: Fixed a rather serious logic bug that would result in random system crashes. Changed pagedaemon wakeup policy (again!). vm_pageout.c: Removed unnecessary page rotations on the inactive queue. Changed the number of pages to explicitly free to just free_reserved level. Submitted by: John Dyson	1995-02-02 09:09:15 +00:00
David Greenman	6d40c3d394	Added ability to detect sequential faults and DTRT. (swap_pager.c) Added hook for pmap_prefault() and use symbolic constant for new third argument to vm_page_alloc() (vm_fault.c, various) Changed the way that upages and page tables are held. (vm_glue.c) Fixed architectural flaw in allocating pages at interrupt time that was introduced with the merged cache changes. (vm_page.c, various) Adjusted some algorithms to acheive better paging performance and to accomodate the fix for the architectural flaw mentioned above. (vm_pageout.c) Fixed pbuf handling problem, changed policy on handling read-behind page. (vnode_pager.c) Submitted by: John Dyson	1995-01-24 10:14:09 +00:00
David Greenman	480dff540b	Fixed some formatting weirdness that I overlooked in the previous commit.	1995-01-10 07:32:52 +00:00
David Greenman	0d94caffca	These changes embody the support of the fully coherent merged VM buffer cache, much higher filesystem I/O performance, and much better paging performance. It represents the culmination of over 6 months of R&D. The majority of the merged VM/cache work is by John Dyson. The following highlights the most significant changes. Additionally, there are (mostly minor) changes to the various filesystem modules (nfs, msdosfs, etc) to support the new VM/buffer scheme. vfs_bio.c: Significant rewrite of most of vfs_bio to support the merged VM buffer cache scheme. The scheme is almost fully compatible with the old filesystem interface. Significant improvement in the number of opportunities for write clustering. vfs_cluster.c, vfs_subr.c Upgrade and performance enhancements in vfs layer code to support merged VM/buffer cache. Fixup of vfs_cluster to eliminate the bogus pagemove stuff. vm_object.c: Yet more improvements in the collapse code. Elimination of some windows that can cause list corruption. vm_pageout.c: Fixed it, it really works better now. Somehow in 2.0, some "enhancements" broke the code. This code has been reworked from the ground-up. vm_fault.c, vm_page.c, pmap.c, vm_object.c Support for small-block filesystems with merged VM/buffer cache scheme. pmap.c vm_map.c Dynamic kernel VM size, now we dont have to pre-allocate excessive numbers of kernel PTs. vm_glue.c Much simpler and more effective swapping code. No more gratuitous swapping. proc.h Fixed the problem that the p_lock flag was not being cleared on a fork. swap_pager.c, vnode_pager.c Removal of old vfs_bio cruft to support the past pseudo-coherency. Now the code doesn't need it anymore. machdep.c Changes to better support the parameter values for the merged VM/buffer cache scheme. machdep.c, kern_exec.c, vm_glue.c Implemented a seperate submap for temporary exec string space and another one to contain process upages. This eliminates all map fragmentation problems that previously existed. ffs_inode.c, ufs_inode.c, ufs_readwrite.c Changes for merged VM/buffer cache. Add "bypass" support for sneaking in on busy buffers. Submitted by: John Dyson and David Greenman	1995-01-09 16:06:02 +00:00
David Greenman	7609ab12e4	Initialize b_vnbuf.le_next before returning a new buffer in getpbuf and trypbuf. Move a couple of splbio's to be slightly less conservative.	1994-12-23 04:56:51 +00:00
David Greenman	6f7bc393bb	Fixed a benign off by one error.	1994-12-22 05:18:12 +00:00
David Greenman	6185408336	Don't ever clear B_BUSY on a pbuf (or any other flag for that matter). This appears to be the cause of some buffer confusion that leads to a panic during heavy paging. Submitted by: John Dyson	1994-12-19 00:02:56 +00:00
David Greenman	24ea4a9671	Fixed bugs in accounting of swap space that resulted in the pager thinking it was out of space when it really wasn't. Submitted by: John Dyson	1994-11-13 15:36:48 +00:00
David Greenman	a83c285c7e	Fixed return status from pagers. Ahem...the previous method would manufacture data when it couldn't get it legitimately. :-( Submitted by: John Dyson	1994-11-06 09:55:31 +00:00
David Greenman	1b119d9d17	Improved I/O error reporting.	1994-10-25 07:06:20 +00:00
David Greenman	5663e6de1c	Various changes to allow operation without any swapspace configured. Note that this is intended for use only in floppy situations and is done at the sacrifice of performance in that case (in ther words, this is not the best solution, but works okay for this exceptional situation). Submitted by: John Dyson	1994-10-22 02:18:03 +00:00
David Greenman	976e77fc9c	1) Some of the counters in the vmmeter struct don't fit well into the Mach VM scheme of things, so I've changed them to be more appropriate. page in/ous are now associated with the pager that did them. Nuked v_fault as the only fault of interest that wouldn't be already counted in v_trap is a VM fault, and this is counted seperately. 2) Implemented most of the remaining counters and corrected the counting of some that were done wrong. They are all almost correct now...just a few minor ones left to fix.	1994-10-15 13:33:09 +00:00
David Greenman	b73f3b1da2	Got rid of redundant declaration warnings.	1994-10-14 12:26:18 +00:00
David Greenman	2e1e24dd28	Fixed bug where page modifications would be lost when swap space was almost depleted. Reviewed by: John Dyson	1994-10-14 01:58:52 +00:00
David Greenman	35c10d2239	Got rid of map.h. It's a leftover from the rmap code, and we use rlists. Changed swapmap into swaplist.	1994-10-09 07:35:18 +00:00
Poul-Henning Kamp	05f0fdd26a	Cosmetics: unused vars, ()'s, #include's &c &c to silence gcc. Reviewed by: davidg	1994-10-09 01:52:19 +00:00
David Greenman	426de76026	Disabled swap anti-fragmentation code. It reduces swap paging performance by 20% in my tests, and it appears to be the cause of a swap leak. Submitted by: John Dyson	1994-09-25 04:02:10 +00:00
David Greenman	fff93ab600	Patches from John Dyson to improve swap code efficiency. Religiously add back pmap_clear_modify() in vnode_pager_input until we figure out why system performance isn't what we expect. Submitted by: John Dyson (swap_pager) & David Greenman (vnode_pager)	1994-08-29 06:23:19 +00:00
Garrett Wollman	f23b4c91c4	Fix up some sloppy coding practices: - Delete redundant declarations. - Add -Wredundant-declarations to Makefile.i386 so they don't come back. - Delete sloppy COMMON-style declarations of uninitialized data in header files. - Add a few prototypes. - Clean up warnings resulting from the above. NB: ioconf.c will still generate a redundant-declaration warning, which is unavoidable unless somebody volunteers to make `config' smarter.	1994-08-18 22:36:09 +00:00
David Greenman	a481f20029	Provide support for upcoming merged VM/buffer cache, and fixed a few bugs that haven't appeared to manifest themselves (yet). Submitted by: John Dyson	1994-08-07 13:10:43 +00:00

... 2 3 4 5 6 ...

355 Commits