Commit Graph

777 Commits

Author SHA1 Message Date
Julian Elischer
8d17e69460 Catch a case spotted by Tor where files mmapped could leave garbage in the
unallocated parts of the last page when the file ended on a frag
but not a page boundary.
Delimitted by tags PRE_MATT_MMAP_EOF and POST_MATT_MMAP_EOF,
in files alpha/alpha/pmap.c i386/i386/pmap.c nfs/nfs_bio.c vm/pmap.h
    vm/vm_page.c vm/vm_page.h vm/vnode_pager.c miscfs/specfs/spec_vnops.c
    ufs/ufs/ufs_readwrite.c kern/vfs_bio.c

Submitted by: Matt Dillon <dillon@freebsd.org>
Reviewed by: Alan Cox <alc@freebsd.org>
1999-04-05 19:38:30 +00:00
Alan Cox
876318eca0 Two changes to vm_map_delete:
1. Don't bother checking object->ref_count == 1 in order to set
OBJ_ONEMAPPING.  It's a waste of time.  If object->ref_count == 1,
vm_map_entry_delete will "run-down" the object and its pages.

2. If object->ref_count == 1, ignore OBJ_ONEMAPPING.  Wait for
vm_map_entry_delete to "run-down" the object and its pages.
Otherwise, we're calling two different procedures to delete
the object's pages.

Note: "vmstat -s" will once again show a non-zero value
for "pages freed by exiting processes".
1999-04-04 07:11:02 +00:00
Alan Cox
ad5fca3b4a Mainly, eliminate the comments about share maps. (We don't have share maps
any more.)  Also, eliminate an incorrect comment that says that we don't
coalesce vm_map_entry's.  (We do.)
1999-03-27 23:46:04 +00:00
Eivind Eklund
4491ea9111 Correct a comment. 1999-03-27 02:39:01 +00:00
Alan Cox
99c81ca94d Two changes:
Remove more (redundant) map timestamp increments from properly
synchronized routines.  (Changed: vm_map_entry_link, vm_map_entry_unlink,
and vm_map_pageable.)

Micro-optimize vm_map_entry_link and vm_map_entry_unlink, eliminating
unnecessary dereferences.  At the same time, converted them from macros
to inline functions.
1999-03-21 23:37:00 +00:00
Alan Cox
61fc5ee627 Construct the free queue(s) in descending order (by physical
address) so that the first 16MB of physical memory is allocated
last rather than first.  On large-memory machines, this avoids
the exhaustion of low physical memory before isa_dmainit has run.
1999-03-19 05:21:03 +00:00
Alan Cox
c7003c6991 Correct a problem in kmem_malloc: A kmem_malloc allowing "wait" may
block (VM_WAIT) holding the map lock.  This is bad.  For example, a subsequent
kmem_malloc by an interrupt handler on the same map may find the lock held
and panic in the lockmgr.
1999-03-16 07:39:07 +00:00
Alan Cox
44428f621d Two changes:
In general, vm_map_simplify_entry should be performed INSIDE
the loop that traverses the map, not outside.  (Changed:
vm_map_inherit, vm_map_pageable.)

vm_fault_unwire doesn't acquire the map lock (or block holding
it).  Thus, vm_map_set/clear_recursive shouldn't be called.
(Changed: vm_map_user_pageable, vm_map_pageable.)
1999-03-15 06:24:52 +00:00
Julian Elischer
811c2e1a76 Fix breakage in last commit
Submitted by: Brian Feldman <green@unixhelp.org>
1999-03-15 05:09:48 +00:00
Julian Elischer
0237469f43 A bit of a hack, but allows the vn device to be a module again.
Submitted by: Matt Dillon <dillon@freebsd.org>
1999-03-14 20:40:15 +00:00
Julian Elischer
a5296b05b4 Submitted by: Matt Dillon <dillon@freebsd.org>
The old VN device broke in -4.x when the definition of B_PAGING
changed. This patch fixes this plus implements additional capabilities.
The new VN device can be backed by a file ( as per normal ), or it can
be directly backed by swap.

Due to dependencies in VM include files  (on opt_xxx options) the new
vn device cannot be a module yet. This will be fixed in a later commit.
This commit delimitted by tags {PRE,POST}_MATT_VNDEV
1999-03-14 09:20:01 +00:00
Alan Cox
a1a54e9fc1 Correct two optimization errors in vm_object_page_remove:
1. The size of vm_object::memq is vm_object::resident_page_count,
not vm_object::size.

2. The "size > 4" test sometimes results in the traversal of a ~1000 page
memq in order to locate ~10 pages.
1999-03-14 06:36:00 +00:00
Alan Cox
b73d0eb905 Remove vm_page_frees from kmem_malloc that are performed
by vm_map_delete/vm_object_page_remove anyway.
1999-03-12 08:05:49 +00:00
Julian Elischer
51df594922 Stop the mfs from trying to swap out crucial bits of the mfs
as this can lead to deadlock.
Submitted by: Mat dillon <dillon@freebsd.org>
1999-03-12 00:44:03 +00:00
Alan Cox
00d4f4a5f4 Remove (redundant) map timestamp increments from some properly
synchronized routines.
1999-03-09 08:00:17 +00:00
Alan Cox
da3a3026b9 Remove an unused variable from vmspace_fork. 1999-03-08 03:53:07 +00:00
Alan Cox
9de3dd734e Change vm_map_growstack to acquire and hold a read lock (instead of a write
lock) until it actually needs to modify the vm_map.

Note: it is legal to modify vm_map::hint without holding a write lock.

Submitted by:	"Richard Seaman, Jr." <dick@tar.com> with minor changes
		by myself.
1999-03-07 21:25:42 +00:00
Alan Cox
f59e8eb9b1 Upgrading a map's lock to exclusive status should increment
the map's timestamp.  In general, whenever an exclusive lock is
acquired the timestamp should be incremented.
1999-03-06 07:11:33 +00:00
Alan Cox
dd2622a8cd To avoid a conflict for the vm_map's lock with vm_fault, release
the read lock around the subyte operations in mincore.  After the lock is
reacquired, use the map's timestamp to determine if we need to restart
the scan.
1999-03-02 22:55:02 +00:00
Alan Cox
e5f251d2d3 Remove the last of the share map code: struct vm_map::is_main_map.
Reviewed by:	Matthew Dillon <dillon@apollo.backplane.com>
1999-03-02 05:43:18 +00:00
Alan Cox
eff50fcd4c mincore doesn't modify the vm_map. Therefore, it doesn't require
an exclusive lock.  A read lock will suffice.
1999-03-01 20:42:16 +00:00
Alan Cox
0e3cdf2cf8 Reviewed by: "John S. Dyson" <dyson@iquest.net>
Submitted by:	Matthew Dillon <dillon@apollo.backplane.com>
To prevent a deadlock, if we are extremely low on memory, force synchronous
operation by the VOP_PUTPAGES in vnode_pager_putpages.
1999-02-27 23:39:28 +00:00
Alan Cox
14286e5e8f Reviewed by: Matthew Dillon <dillon@apollo.backplane.com>
Corrected the computation of cnt.v_ozfod in vm_fault: vm_fault
was counting the number of unoptimized rather than optimized
zero-fill faults.
1999-02-25 06:00:52 +00:00
Matthew Dillon
82e5072fcd Comment swstrategy() routine. 1999-02-25 05:37:18 +00:00
Matthew Dillon
d1bf5d56b6 Remove unnecessary page protects on map_split and collapse operations.
Fix bug where an object's OBJ_WRITEABLE/OBJ_MIGHTBEDIRTY flags do
    not get set under certain circumstances ( page rename case ).

Reviewed by:	Alan Cox <alc@cs.rice.edu>, John Dyson
1999-02-24 21:26:26 +00:00
Matthew Dillon
c4812f564a Removed ENOMEM error on swap_pager_full condition which ignored the
availability of physical memory.  As per original bug report by
    Bruce.

Reviewed by:	Alan Cox <alc@cs.rice.edu>
1999-02-22 08:42:16 +00:00
Matthew Dillon
ad3cce2041 Remove conditional sysctl's
Leave swap_async_max sysctl intact, remove swap_cluster_max sysctl.

Reviewed by:	     Alan Cox <alc@cs.rice.edu>
1999-02-21 08:34:15 +00:00
Matthew Dillon
20d3034f39 Reviewed by: Alan Cox <alc@cs.rice.edu>
Fix problem w/ low-swap/low-memory handling as reported by Bruce Evans.
1999-02-21 08:30:49 +00:00
Luoqi Chen
fe2144fd5a Eliminate a possible numerical overflow. 1999-02-19 19:14:48 +00:00
Luoqi Chen
b1028ad122 Hide access to vmspace:vm_pmap with inline function vmspace_pmap(). This
is the preparation step for moving pmap storage out of vmspace proper.

Reviewed by:	Alan Cox	<alc@cs.rice.edu>
		Matthew Dillion	<dillon@apollo.backplane.com>
1999-02-19 14:25:37 +00:00
Matthew Dillon
9b09b6c73f Submitted by: Alan Cox <alc@cs.rice.edu>
Remove remaining share map garbage from vm_map_lookup() and clean out
    old #if 0 stuff.
1999-02-19 03:11:37 +00:00
Matthew Dillon
327f4e8394 Limit number of simultanious asynchronous swap pager I/Os that can
be in progress at any given moment.

    Add two swap tuneables to sysctl:

	vm.swap_async_max: 4
	vm.swap_cluster_max: 16

    Recommended values are a cluster size of 8 or 16 pages.  async_max is
    about right for 1-4 swap devices.  Reduce to 2 if swap is eating too much
    bandwidth, or even 1 if swap is both eating too much bandwidth and sitting
    on a slow network (10BaseT).

    The defaults work well across a broad range of configurations and should
    normally be left alone.
1999-02-18 19:57:33 +00:00
Matthew Dillon
b33fb764f1 Submitted by: Luoqi Chen <luoqi@watermarkgroup.com>
Unlock vnode before messing with map to avoid deadlock between map and
    vnode ( e.g. with exec_map and underlying program binary vnode ).  Solves
    a deadlock that most often occurs during a large -j# buildworld reported
    by three people.
1999-02-17 09:08:29 +00:00
Matthew Dillon
efcae3d355 Minor reorganization of vm_page_alloc(). No functional changes have
been made but the code has been reorganized and documented to make
    it more readable, reduce the size of the code, and optimize the branch
    path caching capabilities that most modern processors have.
1999-02-15 06:52:14 +00:00
Matthew Dillon
1ce137be82 Fix a bug in the new madvise() code that would possibly (improperly)
free swap space out from under a busy page.  This is not legal because
    the swap may be reallocated and I/O issued while I/O is still in
    progress on the same swap page from the madvise()'d object.  This bug
    could only occur under extreme paging conditions but might not cause
    an error until much later.  As a side-benefit, madvise() is now even
    smaller.
1999-02-15 02:03:40 +00:00
Matthew Dillon
41c67e12bd Minor optimization to madvise() MADV_FREE to make page as freeable as
possible without actually unmapping it from the process.

    As of now, I declare madvise() on OBJT_DEFAULT/OBJT_SWAP objects to be
    'working and complete'.
1999-02-12 20:42:19 +00:00
Matthew Dillon
2aaeadf8d9 Fix non-fatal bug in vm_map_insert() which improperly cleared
OBJ_ONEMAPPING in the case where an object is extended by an
    additional vm_map_entry must be allocated.

    In vm_object_madvise(), remove calll to vm_page_cache() in MADV_FREE
    case in order to avoid a page fault on page reuse.  However, we still
    mark the page as clean and destroy any swap backing store.

Submitted by:	Alan Cox <alc@cs.rice.edu>
1999-02-12 09:51:43 +00:00
Matthew Dillon
b4f8f16e56 Addendum to vm_map coalesce optimization. Also, this was backed-out
because there was a concensus on current in regards to leaving bss r+w+x
    instead of r+w.  This is in order to maintain reasonable compatibility
    with existing JIT compilers (e.g. kaffe) and possibly other programs.
1999-02-09 01:39:29 +00:00
Matthew Dillon
2ad1a3f729 Revamp vm_object_[q]collapse(). Despite the complexity of this patch,
no major operational changes were made.  The three core object->memq loops
    were moved into a single inline procedure and various operational
    characteristics of the collapse function were documented.
1999-02-08 19:00:15 +00:00
Matthew Dillon
d031cff181 General cleanup. Remove #if 0's and remove useless register qualifiers. 1999-02-08 05:15:54 +00:00
Matthew Dillon
faa273d5c2 Rip out PQ_ZERO queue. PQ_ZERO functionality is now combined in with
PQ_FREE.  There is little operational difference other then the kernel
    being a few kilobytes smaller and the code being more readable.

    * vm_page_select_free() has been *greatly* simplified.
    * The PQ_ZERO page queue and supporting structures have been removed
    * vm_page_zero_idle() revamped (see below)

    PG_ZERO setting and clearing has been migrated from vm_page_alloc()
    to vm_page_free[_zero]() and will eventually be guarenteed to remain
    tracked throughout a page's life ( if it isn't already ).

    When a page is freed, PG_ZERO pages are appended to the appropriate
    tailq in the PQ_FREE queue while non-PG_ZERO pages are prepended.
    When locating a new free page, PG_ZERO selection operates from within
    vm_page_list_find() ( get page from end of queue instead of beginning
    of queue ) and then only occurs in the nominal critical path case.  If
    the nominal case misses, both normal and zero-page allocation devolves
    into the same _vm_page_list_find() select code without any specific
    zero-page optimizations.

    Additionally, vm_page_zero_idle() has been revamped.  Hysteresis has been
    added and zero-page tracking adjusted to conform with the other changes.
    Currently hysteresis is set at 1/3 (lo) and 1/2 (hi) the number of free
    pages.  We may wish to increase both parameters as time permits.  The
    hysteresis is designed to avoid silly zeroing in borderline allocation/free
    situations.
1999-02-08 00:37:36 +00:00
Matthew Dillon
5313b05fe0 Backed out vm_map coalesce optimization - it resulted in 22% more page
faults for reasons unknown ( under investigation ).
    /usr/bin/time -l make in /usr/src/bin went from 67000 faults to 90000
    faults.
1999-02-08 00:27:56 +00:00
Matthew Dillon
9fdfe602fc Remove MAP_ENTRY_IS_A_MAP 'share' maps. These maps were once used to
attempt to optimize forks but were essentially given-up on due to
    problems and replaced with an explicit dup of the vm_map_entry structure.
    Prior to the removal, they were entirely unused.
1999-02-07 21:48:23 +00:00
Matthew Dillon
a0e7b3e5ce Remove L1 cache coloring optimization ( leave L2 cache coloring opt ).
Rewrite vm_page_list_find() and vm_page_select_free() - make inline out
    of nominal case.
1999-02-07 20:45:15 +00:00
Matthew Dillon
9b09fe24a4 When shadowing objects, adjust the page coloring of the shadowing object
such that pages in the combined/shadowed object are consistantly
    colored.

Submitted by:	"John S. Dyson" <dyson@iquest.net>
1999-02-07 08:44:53 +00:00
Matthew Dillon
2b0d37a4f8 Add hysteresis to the 'swap_pager_getswapspace; failed' console message.
Also widen the hysteresis levels a little ( these really should be
    dynamically configured ).
1999-02-06 07:22:21 +00:00
Matthew Dillon
5b02fcc3d0 The elf loader sets the permissions on bss to VM_PROT_READ|VM_PROT_WRITE
rather then VM_PROT_ALL.  obreak, on the otherhand, uses VM_PROT_ALL.
    This prevents vm_map_insert() from being able to coalesce the heap and
    creates an extra map entry.  Since current architectures ignore
    VM_PROT_EXECUTE anyway, and since not having VM_PROT_EXECUTE on data/bss
    may provide protection in the future, obreak now uses read+write rather
    then all (r+w+x).

    This is an optimization, not a bug fix.

Submitted by:	Alan Cox <alc@cs.rice.edu>
1999-02-05 07:49:29 +00:00
Matthew Dillon
588059bea0 Fix bug in a KASSERT I introduced in vm_page_qcollapse() rev 1.139.
Since paging is in progress, page scan in vm_page_qcollapse() must be
    protected at atleast splbio() to prevent pages from being ripped out from
    under the scan.
1999-02-04 17:47:52 +00:00
Matthew Dillon
4112823fc7 Submitted by: Alan Cox
The vm_map_insert()/vm_object_coalesce() optimization has been extended
    to include OBJT_SWAP objects as well as OBJT_DEFAULT objects.  This is
    possible because it costs nothing to extend an OBJT_SWAP object with
    the new swapper.  We can't do this with the old swapper.  The old swapper
    used a linear array that would have had to have been reallocated, costing
    time as well as a potential low-memory deadlock.
1999-02-03 01:57:17 +00:00
Matthew Dillon
b406c0f55c This patch eliminates a pointless test from appearing twice
in vm_map_simplify_entry.  Basically, once you've verified that
    the objects in the adjacent vm_map_entry's are the same, either
    NULL or the same vm_object, there's no point in checking that the
    objects have the same behavior.

Obtained from:  Alan Cox <alc@cs.rice.edu>
1999-02-01 08:49:30 +00:00
Julian Elischer
287457c2e7 Submitted by: Alan Cox <alc@cs.rice.edu>
Checked by: "Richard Seaman, Jr." <dick@tar.com>
Fix the following problem:
As the code stands now, growing any stack, and not just the process's
main stack, modifies vm->vm_ssize.  This is inconsistent with the code
earlier in the same procedure.
1999-01-31 14:09:25 +00:00
Matthew Dillon
8aef171243 Fix warnings in preparation for adding -Wall -Wcast-qual to the
kernel compile
1999-01-28 00:57:57 +00:00
Matthew Dillon
5e24f1a2f6 Remove unintended trigraph sequences in comments for -Wall 1999-01-27 18:19:53 +00:00
Julian Elischer
2907af2a96 Mostly remove the VM_STACK OPTION.
This changes the definitions of a few items so that structures are the
same whether or not the option itself is enabled. This allows
people to enable and disable the option without recompilng the world.

As the author says:

|I ran into a problem pulling out the VM_STACK option.  I was aware of this
|when I first did the work, but then forgot about it.  The VM_STACK stuff
|has some code changes in the i386 branch.  There need to be corresponding
|changes in the alpha branch before it can come out completely.

what is done:
|
|1) Pull the VM_STACK option out of the header files it appears in.  This
|really shouldn't affect anything that executes with or without the rest
|of the VM_STACK patches.  The vm_map_entry will then always have one
|extra element (avail_ssize).  It just won't be used if the VM_STACK
|option is not turned on.
|
|I've also pulled the option out of vm_map.c.  This shouldn't harm anything,
|since the routines that are enabled as a result are not called unless
|the VM_STACK option is enabled elsewhere.
|
|2) Add what appears to be appropriate code the the alpha branch, still
|protected behind the VM_STACK switch.  I don't have an alpha machine,
|so we would need to get some testers with alpha machines to try it out.
|
|Once there is some testing, we can consider making the change permanent
|for both i386 and alpha.
|
[..]
|
|Once the alpha code is adequately tested, we can pull VM_STACK out
|everywhere.
|

Submitted by:	"Richard Seaman, Jr." <dick@tar.com>
1999-01-26 02:49:52 +00:00
Julian Elischer
88c5ea4574 Enable Linux threads support by default.
This takes the conditionals out of the code that has been tested by
various people for a while.
ps and friends (libkvm) will need a recompile as some proc structure
changes are made.

Submitted by:	"Richard Seaman, Jr." <dick@tar.com>
1999-01-26 02:38:12 +00:00
Matthew Dillon
2f586e1b2c Undo last commit - not a bug, just duplicate code. PG_MAPPED and
PG_WRITEABLE are already cleared by vm_page_protect().
1999-01-24 07:06:52 +00:00
Matthew Dillon
7dbf82dc13 Change all manual settings of vm_page_t->dirty = VM_PAGE_BITS_ALL
to use the vm_page_dirty() inline.

    The inline can thus do sanity checks ( or not ) over all cases.
1999-01-24 06:04:52 +00:00
Matthew Dillon
68af6d169b vm_map_split() used to dirty the page manually after calling
vm_page_rename(), but never pulled the page off PQ_CACHE if it was on
    PQ_CACHE.  Dirty pages in PQ_CACHE are not allowed and a KASSERT was
    added in -4.x to test for this... and got hit.

    In -4.x, vm_page_rename() automatically dirties the page.  This commit
    also has it deal with the PQ_CACHE case, deactivating the page in that
    case.
1999-01-24 06:00:31 +00:00
Matthew Dillon
161a8f6519 Add vm_page_dirty() inline with PQ_CACHE sanity check 1999-01-24 05:57:50 +00:00
Matthew Dillon
e4542174b0 vm_pager_put_pages() is passed an rcval array to hold per-page return
values.  The 'int' return value for the procedure was never used and
    not well defined in any case when there are mixed errors on pages, so
    it has been removed.  vm_pager_put_pages() and associated vm_pager
    functions now return void.
1999-01-24 02:32:15 +00:00
Matthew Dillon
e1a4feafd0 Clear PG_MAPPED as well as PG_WRITEABLE when a page is moved to the
cache.
1999-01-24 02:29:26 +00:00
Matthew Dillon
d044d7bfb6 Added warning printf ( needs INVARIANTS ) when busy cache page is found
while trying to free memory.
1999-01-24 01:33:22 +00:00
Matthew Dillon
aaba53da90 It is possible for a page in the cache to be busy. vm_pageout.c was not
checking for this condition while it tried to free cache pages.  Fixed.
1999-01-24 01:06:31 +00:00
Matthew Dillon
a7039a1d42 Add invariants to vm_page_busy() and vm_page_wakeup() to check for
PG_BUSY stupidity.
1999-01-24 01:05:15 +00:00
Matthew Dillon
c9fa34cf07 Clear PG_WRITEABLE in vm_page_cache(). This may or may not be a bug,
but the bit should definitely be cleared.
1999-01-24 01:04:04 +00:00
Matthew Dillon
8e3ad7c918 Depreciate vm_object_pmap_copy() - nobody uses it. Everyone uses
vm_object_pmap_copt_1() now, apparently.
1999-01-24 01:01:38 +00:00
Matthew Dillon
bc6d84a6a3 Get rid of unused old_m in vm_fault. Add INVARIANTS to test whether
page is still busy after all the hell vm_fault goes through.. it is
    supposed to be, and printf() if it isn't.  don't panic, though.
1999-01-24 00:55:04 +00:00
Matthew Dillon
7615edaa49 Reenable John Dyson's low-memory VM_WAIT code for page reactivations out
of PQ_CACHE.  Add comments explaining what it accomplishes and its
    limitations.
1999-01-23 06:00:27 +00:00
Matthew Dillon
04d986a5fc Mainly changes to support the new swapper. The big adjustment is that
swap blocks are now in PAGE_SIZE'd increments instead of DEV_BSIZE'd
    increments.  We still convert to DEV_BSIZE'd increments for the
    backing store I/O, but everything else is in PAGE_SIZE increments.
1999-01-21 10:17:12 +00:00
Matthew Dillon
aa91dfc69b Move many of the vm_pager_*() functions from vm_pager.c to inlines in
vm_pager.h
1999-01-21 10:15:47 +00:00
Matthew Dillon
f85f2fa903 Move many of the vm_pager_*() functions from vm_pager.c to inlines in
vm_pager.h

     Added argument to getpbuf() and relpbuf() to allow each subsystem to
     specify a different hard limit on the number of simultanious physical
     bufferes that said subsystem may allocate.  Without this feature, one
     subsystem ( e.g. the vfs clustering code ) could hog *ALL* the pbufs,
     causing a deadlock in the pager in a low memory situation.

     Same for trypbuf().
1999-01-21 10:15:24 +00:00
Matthew Dillon
a489f7614b Reorganized some of the low memory testing code to make it more useful.
Removed call to vm_object_collapse(), which can block.  This was being
    called without the pageout code holding any sort of reference on the
    vm_object or vm_page_t structures being manipulated.  Since this code
    can block, it was possible for other kernel code to shred the state
    the pageout code was assuming remained intact.

    Fixed potential blocking condition in vm_pageout_page_free() ( which
    could cause a deadlock in a low-memory situation ).

    Currently there is a hack in-place to deal with clean filesystem meta-data
    polluting the inactive page queue.  John doesn't like the hack, and neither
    do I.

    Revamped and commented a portion of the pageout loop.

    Added protection against potential memory deadlocks with OBJT_VNODE
    when using VOP_ISLOCKED().  The problem is that vp->v_data can be NULL
    which causes VOP_ISLOCKED() to return a less informed answer.

    remove vm_pager_sync() -- none of the pagers use it any more ( the old
    swapper used to.  The new one does not ).
1999-01-21 10:12:54 +00:00
Matthew Dillon
41dbefba28 The TAILQ hashq has been turned into a singly-linked=list link,
reducing the size of vm_page_t.

    SWAPBLK_NONE and SWAPBLK_MASK are defined here.  These actually are
    more generalized then their names imply, but their placement is somewhat
    of a legacy issue from a prior test version of this code that put
    the swapblk in the vm_page_t structure.  That test code was eventually
    thrown away.  The legacy remains.

    Added vm_page_flash() inline.  Similar to vm_page_wakeup() except that
    it does not clear PG_BUSY ( one assumes that PG_BUSY is already clear ).
    Used by a number of routines to wakeup waiters.

    Collapsed some of the code in inline calls to make other inline calls.
    GCC will optimize this well and it reduces duplication.

    vm_page_free() and vm_page_free_zero() inlines added to convert to
    the proper vm_page_free_toq() call.

    vm_page_sleep_busy() inline added, replacing vm_page_sleep() ( which has
    been removed ).  This implements a much more optimizable page-waiting
    function.
1999-01-21 10:06:24 +00:00
Matthew Dillon
060282de8a The hash table used to be a table of doubly-link list headers ( two
pointers per entry ).  The table has been changed to a singly linked
    list of vm_page_t pointers.  The table has been doubled in size, but
    the entries only take half the space so a net-zero change in memory use.

    The hash function has been changed, hopefully for the better.  The
    combination of the larger hash table size of changed function should
    keep the chain length down to a reasonable number (0-3, average 1).

    vm_object->page_hint has been removed.  This 'optimization' was not
    only never needed, but costs as much as a hash chain link to implement.
    While having page_hint in vm_object might result in better locality
    of reference, the cost is not worth the space in vm_object or the
    extra instructions in my view.

    vm_page_alloc*() functions have been inlined and call a generalized
    non-inlined vm_page_alloc_toq() which combines the standard alloc
    and zero-page alloc functions together, reducing code size and the L1
    cache footprint.  Some reordering has been done... not much.  The
    delinking code should be faster ( because unlinking a doubly-linked list
    requires four memory ops and unlinking a singly linked list only requires
    two ), and we get a hash consistancy check for free.

    vm_page_rename() now automatically sets the page's dirty bits.

    vm_page_alloc() does not try to manually inline freeing a cache page.
    Instead, it now properly calls vm_page_free(m) ... vm_page_free() is
    really too complex to manually inline.

    vm_await(), supporting asleep(), has been added.
1999-01-21 10:01:49 +00:00
Matthew Dillon
48f1335479 The vm_object structure is now somewhat smaller due to the removal
of most of the swap-pager-specific fields, the removal of the id,
    and the removal of paging_offset.

    A new inline, vm_object_pip_wakeupn() has been added to subtract an
    arbitrary number n from the paging_in_progress count and then wakeup
    waiters as necessary.  n may be 0, resulting in a 'flash'.
1999-01-21 09:51:21 +00:00
Matthew Dillon
7bc9e80ecc object->id was badly implemented. It has simply been removed.
object->paging_offset has been removed - it was used to optimize a
    single OBJT_SWAP collapse case yet introduced massive confusion throughout
    vm_object.c.  The optimization was inconsequential except for the
    claim that it didn't have to allocate any memory.  The optimization
    has been removed.

    madvise() has been fixed.  The old madvise() could be made to operate
    on shared objects which is a big no-no.  The new one is much more careful
    in what it modifies.  MADV_FREE was totally broken and has now been fixed.

    vm_page_rename() now automatically dirties a page, so explicit dirtying
    of the page prior to calling vm_page_rename() has been removed.
1999-01-21 09:46:55 +00:00
Matthew Dillon
e4ba1db60a Objects associated with raw devices are no longer counted in the VM stats
total because they may contain absurd numbers ( like the size of all
    of physical memory if you mmap() /dev/mem ).
1999-01-21 09:41:52 +00:00
Matthew Dillon
81522c62fa General cleanup related to the new pager. We no longer have to worry
about conversions of objects to OBJT_SWAP, it is done automatically
    now.

    Replaced manually inserted code with inline calls for busy waiting on
    pages, which also incidently fixes a potential PG_BUSY race due to
    the code not running at splvm().

    vm_objects no longer have a paging_offset field ( see vm/vm_object.c )
1999-01-21 09:40:48 +00:00
Matthew Dillon
1d58b2bc7d Potential bug fix, do not just clear PG_BUSY... call vm_page_wakeup()
instead to properly handle any waiters.

    Added comments, added support for M_ASLEEP.  Generally treat M_ flags
    as flags instead of constants to compare against.
1999-01-21 09:38:20 +00:00
Matthew Dillon
6de6079300 Removed low-memory blockages at fork. This is the wrong place to put
this sort of test.  We need to fix the low-memory handling in general.
1999-01-21 09:36:23 +00:00
Matthew Dillon
4c23ae0916 Mainly cleanup. Removed some inappropriate low-memory handling code
and added lots of comments.  Add tie-in to vm_pager ( and thus the
    new swapper ) to deallocate backing swap for dirtied pages on the fly.
1999-01-21 09:35:38 +00:00
Matthew Dillon
9f6fed9017 The default_pager's interaction with the swap_pager has been reorganized,
and the swap_pager has been completely replaced.

    The new swap pager uses the new blist radix-tree based bitmap allocator
    for low level swap allocation and deallocation.   The new allocator
    is effectively O(5) while the old one was O(N), and the new allocator
    allocates all required memory at init time rather then at allocate
    memory on the fly at run time.

    Swap metadata is allocated in clusters and stored in a hash table,
    eliminating linearly allocated structures.

    Many, many features have been rewritten or added.  Swap space is now
    reallocated on the fly providing a poor-mans auto defragmentation of
    swap space.  Swap space that is no longer needed is freed on a timely
    basis so no garbage collection is necessary.

    Swap I/O is marked B_ASYNC and NFS has been fixed to do the right
    thing with it, so NFS-based paging now has around 10x the performance
    as it did before ( previously NFS enforced synchronous I/O for paging ).
1999-01-21 09:33:07 +00:00
Matthew Dillon
1c7c3c6a86 This is a rather large commit that encompasses the new swapper,
changes to the VM system to support the new swapper, VM bug
    fixes, several VM optimizations, and some additional revamping of the
    VM code.  The specific bug fixes will be documented with additional
    forced commits.  This commit is somewhat rough in regards to code
    cleanup issues.

Reviewed by:	"John S. Dyson" <root@dyson.iquest.net>, "David Greenman" <dg@root.com>
1999-01-21 08:29:12 +00:00
Eivind Eklund
219cbf59f2 KNFize, by bde. 1999-01-10 01:58:29 +00:00
Eivind Eklund
5526d2d920 Split DIAGNOSTIC -> DIAGNOSTIC, INVARIANTS, and INVARIANT_SUPPORT as
discussed on -hackers.

Introduce 'KASSERT(assertion, ("panic message", args))' for simple
check + panic.

Reviewed by:	msmith
1999-01-08 17:31:30 +00:00
Julian Elischer
dc9c271aa1 Changes to the LINUX_THREADS support to only allocate extra memory for
shared signal handling when there is shared signal handling being
used.

This removes the main objection to making the shared signal handling
a standard ability in rfork() and friends and 'unconditionalising'
this code. (i.e. the allocation of an extra 328 bytes per process).

Signal handling information remains in the U area until such a time as
it's reference count would be incremented to > 1. At that point a new
struct is malloc'd and maintained in KVM so that it can be shared between
the processes (threads) using it.

A function to check the reference count and move the struct back to the U
area when it drops back to 1 is also supplied. Signal information is
therefore now swapable for all processes that are not sharing that
information with other processes. THis should addres the concerns raised
by Garrett and others.

Submitted by:	"Richard Seaman, Jr." <dick@tar.com>
1999-01-07 21:23:50 +00:00
Julian Elischer
2267af789e Add (but don't activate) code for a special VM option to make
downward growing stacks more general.
Add (but don't activate) code to use the new stack facility
when running threads, (specifically the linux threads support).
This allows people to use both linux compiled linuxthreads, and also the
native FreeBSD linux-threads port.

The code is conditional on VM_STACK. Not using this will
produce the old heavily tested system.

Submitted by: Richard Seaman <dick@tar.com>
1999-01-06 23:05:42 +00:00
Bruce Evans
289bdf33d3 Ifdefed conditionally used simplock variables. 1999-01-02 11:34:57 +00:00
Dmitrij Tejblum
7a91724556 Don't free swap in swap_pager_getpages(): this code probably cause the
"dying daemons" problem. (I thought this code was introduced in rev.1.80,
but it just relaxed the condition.)

Also, kill related "suggest more swap space" warning (also introduced in
1.80). It was confusing, to say the least...

Requested by:		msmith
Not objected by:	dg
1998-12-29 22:53:51 +00:00
Matthew Dillon
9858fcda2e Update comments to routines in vm_page.c, most especially whether a
routine can block or not as part of a general effort to carefully
    document blocking/non-blocking calls in the kernel.
1998-12-23 01:52:47 +00:00
Julian Elischer
39fb8e6b3e Fix two bogons created by 'patch(1)' in my last commit. 1998-12-19 08:23:31 +00:00
Julian Elischer
6626c6045c Reviewed by: Luoqi Chen, Jordan Hubbard
Submitted by:	 "Richard Seaman, Jr." <lists@tar.com>
Obtained from:	linux :-)

Code to allow Linux Threads to run under FreeBSD.

By default not enabled
This code is dependent on the conditional
COMPAT_LINUX_THREADS (suggested by Garret)
This is not yet a 'real' option but will be within some number of hours.
1998-12-19 02:55:34 +00:00
Dmitrij Tejblum
fc56545639 Don't disable mmap with large file offset. 1998-12-09 20:22:21 +00:00
Archie Cobbs
f1d19042b0 The "easy" fixes for compiling the kernel -Wunused: remove unreferenced static
and local variables, goto labels, and functions declared but not defined.
1998-12-07 21:58:50 +00:00
Archie Cobbs
2127f26023 Examine all occurrences of sprintf(), strcat(), and str[n]cpy()
for possible buffer overflow problems. Replaced most sprintf()'s
with snprintf(); for others cases, added terminating NUL bytes where
appropriate, replaced constants like "16" with sizeof(), etc.

These changes include several bug fixes, but most changes are for
maintainability's sake. Any instance where it wasn't "immediately
obvious" that a buffer overflow could not occur was made safer.

Reviewed by:	Bruce Evans <bde@zeta.org.au>
Reviewed by:	Matthew Dillon <dillon@apollo.backplane.com>
Reviewed by:	Mike Spengler <mks@networkcs.com>
1998-12-04 22:54:57 +00:00
Robert V. Baron
af1f63c7eb In vnode_pager_input_old, set auio.uio_procp = curproc
vs auio.uio_procp = (struct proc *) 0
1998-12-04 18:39:44 +00:00
David Greenman
c699f45e35 Add missing splvm protection around unqueue call. Without this, the page
queues would eventually get corrupted.
1998-11-25 07:40:49 +00:00
Bruce Evans
04258de351 Fixed a null pointer panic in spc_free(). swap_pager_putpages()
almost always causes this panic for the curproc != pageproc case.
This case apparently doesn't happen in normal operation, but it
happens when vm_page_alloc_contig() is called when there is a memory
hogging application that hasn't already been paged out.

PR:		8632
Reviewed by:	info@opensound.com (Dev Mazumdar), dg
Broken in:	rev.1.89 (1998/02/23)
1998-11-19 06:20:42 +00:00
David Greenman
4f6e1f8bfc Closed a small race condition between wiring/unwiring pages that involved
the page's wire_count.
1998-11-11 15:07:57 +00:00
Peter Wemm
1c5bb3eaa1 add #include <sys/kernel.h> where it's needed by MALLOC_DEFINE() 1998-11-10 09:16:29 +00:00