Commit Graph

745 Commits

Author SHA1 Message Date
Matthew Dillon
b33fb764f1 Submitted by: Luoqi Chen <luoqi@watermarkgroup.com>
Unlock vnode before messing with map to avoid deadlock between map and
    vnode ( e.g. with exec_map and underlying program binary vnode ).  Solves
    a deadlock that most often occurs during a large -j# buildworld reported
    by three people.
1999-02-17 09:08:29 +00:00
Matthew Dillon
efcae3d355 Minor reorganization of vm_page_alloc(). No functional changes have
been made but the code has been reorganized and documented to make
    it more readable, reduce the size of the code, and optimize the branch
    path caching capabilities that most modern processors have.
1999-02-15 06:52:14 +00:00
Matthew Dillon
1ce137be82 Fix a bug in the new madvise() code that would possibly (improperly)
free swap space out from under a busy page.  This is not legal because
    the swap may be reallocated and I/O issued while I/O is still in
    progress on the same swap page from the madvise()'d object.  This bug
    could only occur under extreme paging conditions but might not cause
    an error until much later.  As a side-benefit, madvise() is now even
    smaller.
1999-02-15 02:03:40 +00:00
Matthew Dillon
41c67e12bd Minor optimization to madvise() MADV_FREE to make page as freeable as
possible without actually unmapping it from the process.

    As of now, I declare madvise() on OBJT_DEFAULT/OBJT_SWAP objects to be
    'working and complete'.
1999-02-12 20:42:19 +00:00
Matthew Dillon
2aaeadf8d9 Fix non-fatal bug in vm_map_insert() which improperly cleared
OBJ_ONEMAPPING in the case where an object is extended by an
    additional vm_map_entry must be allocated.

    In vm_object_madvise(), remove calll to vm_page_cache() in MADV_FREE
    case in order to avoid a page fault on page reuse.  However, we still
    mark the page as clean and destroy any swap backing store.

Submitted by:	Alan Cox <alc@cs.rice.edu>
1999-02-12 09:51:43 +00:00
Matthew Dillon
b4f8f16e56 Addendum to vm_map coalesce optimization. Also, this was backed-out
because there was a concensus on current in regards to leaving bss r+w+x
    instead of r+w.  This is in order to maintain reasonable compatibility
    with existing JIT compilers (e.g. kaffe) and possibly other programs.
1999-02-09 01:39:29 +00:00
Matthew Dillon
2ad1a3f729 Revamp vm_object_[q]collapse(). Despite the complexity of this patch,
no major operational changes were made.  The three core object->memq loops
    were moved into a single inline procedure and various operational
    characteristics of the collapse function were documented.
1999-02-08 19:00:15 +00:00
Matthew Dillon
d031cff181 General cleanup. Remove #if 0's and remove useless register qualifiers. 1999-02-08 05:15:54 +00:00
Matthew Dillon
faa273d5c2 Rip out PQ_ZERO queue. PQ_ZERO functionality is now combined in with
PQ_FREE.  There is little operational difference other then the kernel
    being a few kilobytes smaller and the code being more readable.

    * vm_page_select_free() has been *greatly* simplified.
    * The PQ_ZERO page queue and supporting structures have been removed
    * vm_page_zero_idle() revamped (see below)

    PG_ZERO setting and clearing has been migrated from vm_page_alloc()
    to vm_page_free[_zero]() and will eventually be guarenteed to remain
    tracked throughout a page's life ( if it isn't already ).

    When a page is freed, PG_ZERO pages are appended to the appropriate
    tailq in the PQ_FREE queue while non-PG_ZERO pages are prepended.
    When locating a new free page, PG_ZERO selection operates from within
    vm_page_list_find() ( get page from end of queue instead of beginning
    of queue ) and then only occurs in the nominal critical path case.  If
    the nominal case misses, both normal and zero-page allocation devolves
    into the same _vm_page_list_find() select code without any specific
    zero-page optimizations.

    Additionally, vm_page_zero_idle() has been revamped.  Hysteresis has been
    added and zero-page tracking adjusted to conform with the other changes.
    Currently hysteresis is set at 1/3 (lo) and 1/2 (hi) the number of free
    pages.  We may wish to increase both parameters as time permits.  The
    hysteresis is designed to avoid silly zeroing in borderline allocation/free
    situations.
1999-02-08 00:37:36 +00:00
Matthew Dillon
5313b05fe0 Backed out vm_map coalesce optimization - it resulted in 22% more page
faults for reasons unknown ( under investigation ).
    /usr/bin/time -l make in /usr/src/bin went from 67000 faults to 90000
    faults.
1999-02-08 00:27:56 +00:00
Matthew Dillon
9fdfe602fc Remove MAP_ENTRY_IS_A_MAP 'share' maps. These maps were once used to
attempt to optimize forks but were essentially given-up on due to
    problems and replaced with an explicit dup of the vm_map_entry structure.
    Prior to the removal, they were entirely unused.
1999-02-07 21:48:23 +00:00
Matthew Dillon
a0e7b3e5ce Remove L1 cache coloring optimization ( leave L2 cache coloring opt ).
Rewrite vm_page_list_find() and vm_page_select_free() - make inline out
    of nominal case.
1999-02-07 20:45:15 +00:00
Matthew Dillon
9b09fe24a4 When shadowing objects, adjust the page coloring of the shadowing object
such that pages in the combined/shadowed object are consistantly
    colored.

Submitted by:	"John S. Dyson" <dyson@iquest.net>
1999-02-07 08:44:53 +00:00
Matthew Dillon
2b0d37a4f8 Add hysteresis to the 'swap_pager_getswapspace; failed' console message.
Also widen the hysteresis levels a little ( these really should be
    dynamically configured ).
1999-02-06 07:22:21 +00:00
Matthew Dillon
5b02fcc3d0 The elf loader sets the permissions on bss to VM_PROT_READ|VM_PROT_WRITE
rather then VM_PROT_ALL.  obreak, on the otherhand, uses VM_PROT_ALL.
    This prevents vm_map_insert() from being able to coalesce the heap and
    creates an extra map entry.  Since current architectures ignore
    VM_PROT_EXECUTE anyway, and since not having VM_PROT_EXECUTE on data/bss
    may provide protection in the future, obreak now uses read+write rather
    then all (r+w+x).

    This is an optimization, not a bug fix.

Submitted by:	Alan Cox <alc@cs.rice.edu>
1999-02-05 07:49:29 +00:00
Matthew Dillon
588059bea0 Fix bug in a KASSERT I introduced in vm_page_qcollapse() rev 1.139.
Since paging is in progress, page scan in vm_page_qcollapse() must be
    protected at atleast splbio() to prevent pages from being ripped out from
    under the scan.
1999-02-04 17:47:52 +00:00
Matthew Dillon
4112823fc7 Submitted by: Alan Cox
The vm_map_insert()/vm_object_coalesce() optimization has been extended
    to include OBJT_SWAP objects as well as OBJT_DEFAULT objects.  This is
    possible because it costs nothing to extend an OBJT_SWAP object with
    the new swapper.  We can't do this with the old swapper.  The old swapper
    used a linear array that would have had to have been reallocated, costing
    time as well as a potential low-memory deadlock.
1999-02-03 01:57:17 +00:00
Matthew Dillon
b406c0f55c This patch eliminates a pointless test from appearing twice
in vm_map_simplify_entry.  Basically, once you've verified that
    the objects in the adjacent vm_map_entry's are the same, either
    NULL or the same vm_object, there's no point in checking that the
    objects have the same behavior.

Obtained from:  Alan Cox <alc@cs.rice.edu>
1999-02-01 08:49:30 +00:00
Julian Elischer
287457c2e7 Submitted by: Alan Cox <alc@cs.rice.edu>
Checked by: "Richard Seaman, Jr." <dick@tar.com>
Fix the following problem:
As the code stands now, growing any stack, and not just the process's
main stack, modifies vm->vm_ssize.  This is inconsistent with the code
earlier in the same procedure.
1999-01-31 14:09:25 +00:00
Matthew Dillon
8aef171243 Fix warnings in preparation for adding -Wall -Wcast-qual to the
kernel compile
1999-01-28 00:57:57 +00:00
Matthew Dillon
5e24f1a2f6 Remove unintended trigraph sequences in comments for -Wall 1999-01-27 18:19:53 +00:00
Julian Elischer
2907af2a96 Mostly remove the VM_STACK OPTION.
This changes the definitions of a few items so that structures are the
same whether or not the option itself is enabled. This allows
people to enable and disable the option without recompilng the world.

As the author says:

|I ran into a problem pulling out the VM_STACK option.  I was aware of this
|when I first did the work, but then forgot about it.  The VM_STACK stuff
|has some code changes in the i386 branch.  There need to be corresponding
|changes in the alpha branch before it can come out completely.

what is done:
|
|1) Pull the VM_STACK option out of the header files it appears in.  This
|really shouldn't affect anything that executes with or without the rest
|of the VM_STACK patches.  The vm_map_entry will then always have one
|extra element (avail_ssize).  It just won't be used if the VM_STACK
|option is not turned on.
|
|I've also pulled the option out of vm_map.c.  This shouldn't harm anything,
|since the routines that are enabled as a result are not called unless
|the VM_STACK option is enabled elsewhere.
|
|2) Add what appears to be appropriate code the the alpha branch, still
|protected behind the VM_STACK switch.  I don't have an alpha machine,
|so we would need to get some testers with alpha machines to try it out.
|
|Once there is some testing, we can consider making the change permanent
|for both i386 and alpha.
|
[..]
|
|Once the alpha code is adequately tested, we can pull VM_STACK out
|everywhere.
|

Submitted by:	"Richard Seaman, Jr." <dick@tar.com>
1999-01-26 02:49:52 +00:00
Julian Elischer
88c5ea4574 Enable Linux threads support by default.
This takes the conditionals out of the code that has been tested by
various people for a while.
ps and friends (libkvm) will need a recompile as some proc structure
changes are made.

Submitted by:	"Richard Seaman, Jr." <dick@tar.com>
1999-01-26 02:38:12 +00:00
Matthew Dillon
2f586e1b2c Undo last commit - not a bug, just duplicate code. PG_MAPPED and
PG_WRITEABLE are already cleared by vm_page_protect().
1999-01-24 07:06:52 +00:00
Matthew Dillon
7dbf82dc13 Change all manual settings of vm_page_t->dirty = VM_PAGE_BITS_ALL
to use the vm_page_dirty() inline.

    The inline can thus do sanity checks ( or not ) over all cases.
1999-01-24 06:04:52 +00:00
Matthew Dillon
68af6d169b vm_map_split() used to dirty the page manually after calling
vm_page_rename(), but never pulled the page off PQ_CACHE if it was on
    PQ_CACHE.  Dirty pages in PQ_CACHE are not allowed and a KASSERT was
    added in -4.x to test for this... and got hit.

    In -4.x, vm_page_rename() automatically dirties the page.  This commit
    also has it deal with the PQ_CACHE case, deactivating the page in that
    case.
1999-01-24 06:00:31 +00:00
Matthew Dillon
161a8f6519 Add vm_page_dirty() inline with PQ_CACHE sanity check 1999-01-24 05:57:50 +00:00
Matthew Dillon
e4542174b0 vm_pager_put_pages() is passed an rcval array to hold per-page return
values.  The 'int' return value for the procedure was never used and
    not well defined in any case when there are mixed errors on pages, so
    it has been removed.  vm_pager_put_pages() and associated vm_pager
    functions now return void.
1999-01-24 02:32:15 +00:00
Matthew Dillon
e1a4feafd0 Clear PG_MAPPED as well as PG_WRITEABLE when a page is moved to the
cache.
1999-01-24 02:29:26 +00:00
Matthew Dillon
d044d7bfb6 Added warning printf ( needs INVARIANTS ) when busy cache page is found
while trying to free memory.
1999-01-24 01:33:22 +00:00
Matthew Dillon
aaba53da90 It is possible for a page in the cache to be busy. vm_pageout.c was not
checking for this condition while it tried to free cache pages.  Fixed.
1999-01-24 01:06:31 +00:00
Matthew Dillon
a7039a1d42 Add invariants to vm_page_busy() and vm_page_wakeup() to check for
PG_BUSY stupidity.
1999-01-24 01:05:15 +00:00
Matthew Dillon
c9fa34cf07 Clear PG_WRITEABLE in vm_page_cache(). This may or may not be a bug,
but the bit should definitely be cleared.
1999-01-24 01:04:04 +00:00
Matthew Dillon
8e3ad7c918 Depreciate vm_object_pmap_copy() - nobody uses it. Everyone uses
vm_object_pmap_copt_1() now, apparently.
1999-01-24 01:01:38 +00:00
Matthew Dillon
bc6d84a6a3 Get rid of unused old_m in vm_fault. Add INVARIANTS to test whether
page is still busy after all the hell vm_fault goes through.. it is
    supposed to be, and printf() if it isn't.  don't panic, though.
1999-01-24 00:55:04 +00:00
Matthew Dillon
7615edaa49 Reenable John Dyson's low-memory VM_WAIT code for page reactivations out
of PQ_CACHE.  Add comments explaining what it accomplishes and its
    limitations.
1999-01-23 06:00:27 +00:00
Matthew Dillon
04d986a5fc Mainly changes to support the new swapper. The big adjustment is that
swap blocks are now in PAGE_SIZE'd increments instead of DEV_BSIZE'd
    increments.  We still convert to DEV_BSIZE'd increments for the
    backing store I/O, but everything else is in PAGE_SIZE increments.
1999-01-21 10:17:12 +00:00
Matthew Dillon
aa91dfc69b Move many of the vm_pager_*() functions from vm_pager.c to inlines in
vm_pager.h
1999-01-21 10:15:47 +00:00
Matthew Dillon
f85f2fa903 Move many of the vm_pager_*() functions from vm_pager.c to inlines in
vm_pager.h

     Added argument to getpbuf() and relpbuf() to allow each subsystem to
     specify a different hard limit on the number of simultanious physical
     bufferes that said subsystem may allocate.  Without this feature, one
     subsystem ( e.g. the vfs clustering code ) could hog *ALL* the pbufs,
     causing a deadlock in the pager in a low memory situation.

     Same for trypbuf().
1999-01-21 10:15:24 +00:00
Matthew Dillon
a489f7614b Reorganized some of the low memory testing code to make it more useful.
Removed call to vm_object_collapse(), which can block.  This was being
    called without the pageout code holding any sort of reference on the
    vm_object or vm_page_t structures being manipulated.  Since this code
    can block, it was possible for other kernel code to shred the state
    the pageout code was assuming remained intact.

    Fixed potential blocking condition in vm_pageout_page_free() ( which
    could cause a deadlock in a low-memory situation ).

    Currently there is a hack in-place to deal with clean filesystem meta-data
    polluting the inactive page queue.  John doesn't like the hack, and neither
    do I.

    Revamped and commented a portion of the pageout loop.

    Added protection against potential memory deadlocks with OBJT_VNODE
    when using VOP_ISLOCKED().  The problem is that vp->v_data can be NULL
    which causes VOP_ISLOCKED() to return a less informed answer.

    remove vm_pager_sync() -- none of the pagers use it any more ( the old
    swapper used to.  The new one does not ).
1999-01-21 10:12:54 +00:00
Matthew Dillon
41dbefba28 The TAILQ hashq has been turned into a singly-linked=list link,
reducing the size of vm_page_t.

    SWAPBLK_NONE and SWAPBLK_MASK are defined here.  These actually are
    more generalized then their names imply, but their placement is somewhat
    of a legacy issue from a prior test version of this code that put
    the swapblk in the vm_page_t structure.  That test code was eventually
    thrown away.  The legacy remains.

    Added vm_page_flash() inline.  Similar to vm_page_wakeup() except that
    it does not clear PG_BUSY ( one assumes that PG_BUSY is already clear ).
    Used by a number of routines to wakeup waiters.

    Collapsed some of the code in inline calls to make other inline calls.
    GCC will optimize this well and it reduces duplication.

    vm_page_free() and vm_page_free_zero() inlines added to convert to
    the proper vm_page_free_toq() call.

    vm_page_sleep_busy() inline added, replacing vm_page_sleep() ( which has
    been removed ).  This implements a much more optimizable page-waiting
    function.
1999-01-21 10:06:24 +00:00
Matthew Dillon
060282de8a The hash table used to be a table of doubly-link list headers ( two
pointers per entry ).  The table has been changed to a singly linked
    list of vm_page_t pointers.  The table has been doubled in size, but
    the entries only take half the space so a net-zero change in memory use.

    The hash function has been changed, hopefully for the better.  The
    combination of the larger hash table size of changed function should
    keep the chain length down to a reasonable number (0-3, average 1).

    vm_object->page_hint has been removed.  This 'optimization' was not
    only never needed, but costs as much as a hash chain link to implement.
    While having page_hint in vm_object might result in better locality
    of reference, the cost is not worth the space in vm_object or the
    extra instructions in my view.

    vm_page_alloc*() functions have been inlined and call a generalized
    non-inlined vm_page_alloc_toq() which combines the standard alloc
    and zero-page alloc functions together, reducing code size and the L1
    cache footprint.  Some reordering has been done... not much.  The
    delinking code should be faster ( because unlinking a doubly-linked list
    requires four memory ops and unlinking a singly linked list only requires
    two ), and we get a hash consistancy check for free.

    vm_page_rename() now automatically sets the page's dirty bits.

    vm_page_alloc() does not try to manually inline freeing a cache page.
    Instead, it now properly calls vm_page_free(m) ... vm_page_free() is
    really too complex to manually inline.

    vm_await(), supporting asleep(), has been added.
1999-01-21 10:01:49 +00:00
Matthew Dillon
48f1335479 The vm_object structure is now somewhat smaller due to the removal
of most of the swap-pager-specific fields, the removal of the id,
    and the removal of paging_offset.

    A new inline, vm_object_pip_wakeupn() has been added to subtract an
    arbitrary number n from the paging_in_progress count and then wakeup
    waiters as necessary.  n may be 0, resulting in a 'flash'.
1999-01-21 09:51:21 +00:00
Matthew Dillon
7bc9e80ecc object->id was badly implemented. It has simply been removed.
object->paging_offset has been removed - it was used to optimize a
    single OBJT_SWAP collapse case yet introduced massive confusion throughout
    vm_object.c.  The optimization was inconsequential except for the
    claim that it didn't have to allocate any memory.  The optimization
    has been removed.

    madvise() has been fixed.  The old madvise() could be made to operate
    on shared objects which is a big no-no.  The new one is much more careful
    in what it modifies.  MADV_FREE was totally broken and has now been fixed.

    vm_page_rename() now automatically dirties a page, so explicit dirtying
    of the page prior to calling vm_page_rename() has been removed.
1999-01-21 09:46:55 +00:00
Matthew Dillon
e4ba1db60a Objects associated with raw devices are no longer counted in the VM stats
total because they may contain absurd numbers ( like the size of all
    of physical memory if you mmap() /dev/mem ).
1999-01-21 09:41:52 +00:00
Matthew Dillon
81522c62fa General cleanup related to the new pager. We no longer have to worry
about conversions of objects to OBJT_SWAP, it is done automatically
    now.

    Replaced manually inserted code with inline calls for busy waiting on
    pages, which also incidently fixes a potential PG_BUSY race due to
    the code not running at splvm().

    vm_objects no longer have a paging_offset field ( see vm/vm_object.c )
1999-01-21 09:40:48 +00:00
Matthew Dillon
1d58b2bc7d Potential bug fix, do not just clear PG_BUSY... call vm_page_wakeup()
instead to properly handle any waiters.

    Added comments, added support for M_ASLEEP.  Generally treat M_ flags
    as flags instead of constants to compare against.
1999-01-21 09:38:20 +00:00
Matthew Dillon
6de6079300 Removed low-memory blockages at fork. This is the wrong place to put
this sort of test.  We need to fix the low-memory handling in general.
1999-01-21 09:36:23 +00:00
Matthew Dillon
4c23ae0916 Mainly cleanup. Removed some inappropriate low-memory handling code
and added lots of comments.  Add tie-in to vm_pager ( and thus the
    new swapper ) to deallocate backing swap for dirtied pages on the fly.
1999-01-21 09:35:38 +00:00
Matthew Dillon
9f6fed9017 The default_pager's interaction with the swap_pager has been reorganized,
and the swap_pager has been completely replaced.

    The new swap pager uses the new blist radix-tree based bitmap allocator
    for low level swap allocation and deallocation.   The new allocator
    is effectively O(5) while the old one was O(N), and the new allocator
    allocates all required memory at init time rather then at allocate
    memory on the fly at run time.

    Swap metadata is allocated in clusters and stored in a hash table,
    eliminating linearly allocated structures.

    Many, many features have been rewritten or added.  Swap space is now
    reallocated on the fly providing a poor-mans auto defragmentation of
    swap space.  Swap space that is no longer needed is freed on a timely
    basis so no garbage collection is necessary.

    Swap I/O is marked B_ASYNC and NFS has been fixed to do the right
    thing with it, so NFS-based paging now has around 10x the performance
    as it did before ( previously NFS enforced synchronous I/O for paging ).
1999-01-21 09:33:07 +00:00
Matthew Dillon
1c7c3c6a86 This is a rather large commit that encompasses the new swapper,
changes to the VM system to support the new swapper, VM bug
    fixes, several VM optimizations, and some additional revamping of the
    VM code.  The specific bug fixes will be documented with additional
    forced commits.  This commit is somewhat rough in regards to code
    cleanup issues.

Reviewed by:	"John S. Dyson" <root@dyson.iquest.net>, "David Greenman" <dg@root.com>
1999-01-21 08:29:12 +00:00
Eivind Eklund
219cbf59f2 KNFize, by bde. 1999-01-10 01:58:29 +00:00
Eivind Eklund
5526d2d920 Split DIAGNOSTIC -> DIAGNOSTIC, INVARIANTS, and INVARIANT_SUPPORT as
discussed on -hackers.

Introduce 'KASSERT(assertion, ("panic message", args))' for simple
check + panic.

Reviewed by:	msmith
1999-01-08 17:31:30 +00:00
Julian Elischer
dc9c271aa1 Changes to the LINUX_THREADS support to only allocate extra memory for
shared signal handling when there is shared signal handling being
used.

This removes the main objection to making the shared signal handling
a standard ability in rfork() and friends and 'unconditionalising'
this code. (i.e. the allocation of an extra 328 bytes per process).

Signal handling information remains in the U area until such a time as
it's reference count would be incremented to > 1. At that point a new
struct is malloc'd and maintained in KVM so that it can be shared between
the processes (threads) using it.

A function to check the reference count and move the struct back to the U
area when it drops back to 1 is also supplied. Signal information is
therefore now swapable for all processes that are not sharing that
information with other processes. THis should addres the concerns raised
by Garrett and others.

Submitted by:	"Richard Seaman, Jr." <dick@tar.com>
1999-01-07 21:23:50 +00:00
Julian Elischer
2267af789e Add (but don't activate) code for a special VM option to make
downward growing stacks more general.
Add (but don't activate) code to use the new stack facility
when running threads, (specifically the linux threads support).
This allows people to use both linux compiled linuxthreads, and also the
native FreeBSD linux-threads port.

The code is conditional on VM_STACK. Not using this will
produce the old heavily tested system.

Submitted by: Richard Seaman <dick@tar.com>
1999-01-06 23:05:42 +00:00
Bruce Evans
289bdf33d3 Ifdefed conditionally used simplock variables. 1999-01-02 11:34:57 +00:00
Dmitrij Tejblum
7a91724556 Don't free swap in swap_pager_getpages(): this code probably cause the
"dying daemons" problem. (I thought this code was introduced in rev.1.80,
but it just relaxed the condition.)

Also, kill related "suggest more swap space" warning (also introduced in
1.80). It was confusing, to say the least...

Requested by:		msmith
Not objected by:	dg
1998-12-29 22:53:51 +00:00
Matthew Dillon
9858fcda2e Update comments to routines in vm_page.c, most especially whether a
routine can block or not as part of a general effort to carefully
    document blocking/non-blocking calls in the kernel.
1998-12-23 01:52:47 +00:00
Julian Elischer
39fb8e6b3e Fix two bogons created by 'patch(1)' in my last commit. 1998-12-19 08:23:31 +00:00
Julian Elischer
6626c6045c Reviewed by: Luoqi Chen, Jordan Hubbard
Submitted by:	 "Richard Seaman, Jr." <lists@tar.com>
Obtained from:	linux :-)

Code to allow Linux Threads to run under FreeBSD.

By default not enabled
This code is dependent on the conditional
COMPAT_LINUX_THREADS (suggested by Garret)
This is not yet a 'real' option but will be within some number of hours.
1998-12-19 02:55:34 +00:00
Dmitrij Tejblum
fc56545639 Don't disable mmap with large file offset. 1998-12-09 20:22:21 +00:00
Archie Cobbs
f1d19042b0 The "easy" fixes for compiling the kernel -Wunused: remove unreferenced static
and local variables, goto labels, and functions declared but not defined.
1998-12-07 21:58:50 +00:00
Archie Cobbs
2127f26023 Examine all occurrences of sprintf(), strcat(), and str[n]cpy()
for possible buffer overflow problems. Replaced most sprintf()'s
with snprintf(); for others cases, added terminating NUL bytes where
appropriate, replaced constants like "16" with sizeof(), etc.

These changes include several bug fixes, but most changes are for
maintainability's sake. Any instance where it wasn't "immediately
obvious" that a buffer overflow could not occur was made safer.

Reviewed by:	Bruce Evans <bde@zeta.org.au>
Reviewed by:	Matthew Dillon <dillon@apollo.backplane.com>
Reviewed by:	Mike Spengler <mks@networkcs.com>
1998-12-04 22:54:57 +00:00
Robert V. Baron
af1f63c7eb In vnode_pager_input_old, set auio.uio_procp = curproc
vs auio.uio_procp = (struct proc *) 0
1998-12-04 18:39:44 +00:00
David Greenman
c699f45e35 Add missing splvm protection around unqueue call. Without this, the page
queues would eventually get corrupted.
1998-11-25 07:40:49 +00:00
Bruce Evans
04258de351 Fixed a null pointer panic in spc_free(). swap_pager_putpages()
almost always causes this panic for the curproc != pageproc case.
This case apparently doesn't happen in normal operation, but it
happens when vm_page_alloc_contig() is called when there is a memory
hogging application that hasn't already been paged out.

PR:		8632
Reviewed by:	info@opensound.com (Dev Mazumdar), dg
Broken in:	rev.1.89 (1998/02/23)
1998-11-19 06:20:42 +00:00
David Greenman
4f6e1f8bfc Closed a small race condition between wiring/unwiring pages that involved
the page's wire_count.
1998-11-11 15:07:57 +00:00
Peter Wemm
1c5bb3eaa1 add #include <sys/kernel.h> where it's needed by MALLOC_DEFINE() 1998-11-10 09:16:29 +00:00
Doug Rabson
7095ee912b * Fix a couple of places in the device pager where an address was
truncated to 32 bits.
* Change the calling convention of the device mmap entry point to
  pass a vm_offset_t instead of an int for the offset allowing
  devices with a larger memory map than (1<<32) to be supported
  on the alpha (/dev/mem is one such).

These changes are required to allow the X server to mmap the various
I/O regions used for device port and memory access on the alpha.
1998-11-08 12:39:07 +00:00
David Greenman
dd0b2081f4 Implemented zero-copy TCP/IP extensions via sendfile(2) - send a
file to a stream socket. sendfile(2) is similar to implementations in
HP-UX, Linux, and other systems, but the API is more extensive and
addresses many of the complaints that the Apache Group and others have
had with those other implementations. Thanks to Marc Slemko of the
Apache Group for helping me work out the best API for this.
Anyway, this has the "net" result of speeding up sends of files over
TCP/IP sockets by about 10X (that is to say, uses 1/10th of the CPU
cycles) when compared to a traditional read/write loop.
1998-11-05 14:28:26 +00:00
Peter Wemm
b0359e2c11 Add John Dyson's SYSCTL descriptions, and an export of more stats to
a sysctl hierarchy (vm.stats.*).  SYSCTL descriptions are only present
in source, they do not get compiled into the binaries taking up memory.
1998-10-31 17:21:31 +00:00
Peter Wemm
40c8cfe552 Use TAILQ macros for clean/dirty block list processing. Set b_xflags
rather than abusing the list next pointer with a magic number.
1998-10-31 15:31:29 +00:00
David Greenman
c8d14c765f Fixed wrong comments in and about vm_page_deactivate(). 1998-10-28 13:41:43 +00:00
David Greenman
730075613a Added a second argument, "activate" to the vm_page_unwire() call so that
the caller can select either inactive or active queue to put the page on.
1998-10-28 13:37:02 +00:00
David Greenman
e4b7635de2 Added needed splvm() protection around object page traversal in
vm_object_terminate().
1998-10-27 13:22:51 +00:00
Bruce Evans
9cd93b3aec Don't follow null bdevsw pointers. The `major(dev) < nblkdev' test rotted
when bdevsw[] became sparse.  We still depend on magic to avoid having to
check that (v_rdev) device numbers in vnodes are not NODEV.

Removed a redundant `major(dev) < nblkdev' test instead of updating it.

Don't follow a garbage bdevsw pointer for attempts to swap on empty
regular files.  This case currently can't happen.  Swapping on regular
files is ifdefed out in swapon() and isn't attempted for empty files
in nfs_mountroot().
1998-10-25 19:24:04 +00:00
Poul-Henning Kamp
f5ef029e92 Nitpicking and dusting performed on a train. Removes trivial warnings
about unused variables, labels and other lint.
1998-10-25 17:44:59 +00:00
David Greenman
9fcfb650d1 Oops, revert part of last fix. vm_pager_dealloc() can't be called until
after the pages are removed from the object...so fix the problem by
not printing the diagnostic for wired fictitious pages (which is normal).
1998-10-23 05:43:13 +00:00
David Greenman
356863eb01 Fixed two bugs in recent commit: in vm_object_terminate, vm_pager_dealloc
needs to be called prior to freeing remaining pages in the object so that
the device pager has an opportunity to grab its "fake" pages. Also, in
the case of wired pages, the page must be made busy prior to calling
vm_page_remove. This is a difference from 2.2.x that I overlooked when
I brought these changes forward.
1998-10-23 05:25:49 +00:00
David Greenman
0b10ba9822 Make the VM system handle the case where a terminating object contains
legitimately wired pages. Currently we print a diagnostic when this
happens, but this will be removed soon when it will be common for this
to occur with zero-copy TCP/IP buffers.
1998-10-22 02:16:53 +00:00
David Greenman
24166bb84b Convert fake page allocs to use the zone allocator, thus eliminating the
private pool management code in here.
1998-10-22 01:45:29 +00:00
David Greenman
bb7db2c011 Set m->object to NULL in dev_pager_getfake(). 1998-10-21 23:06:50 +00:00
David Greenman
300ee8246e Nuked PG_TABLED flag. Replaced with m->object != NULL. 1998-10-21 14:46:42 +00:00
David Greenman
12d534d2f9 Add a diagnostic printf for freeing a wired page. This will eventually
be turned into a panic, but I want to make sure that all cases of freeing
pages with wire_count==1 (which is/was allowed) have first been fixed.
1998-10-21 11:43:04 +00:00
David Greenman
6cde7a165f Fixed two potentially serious classes of bugs:
1) The vnode pager wasn't properly tracking the file size due to
   "size" being page rounded in some cases and not in others.
   This sometimes resulted in corrupted files. First noticed by
   Terry Lambert.
   Fixed by changing the "size" pager_alloc parameter to be a 64bit
   byte value (as opposed to a 32bit page index) and changing the
   pagers and their callers to deal with this properly.
2) Fixed a bogus type cast in round_page() and trunc_page() that
   caused some 64bit offsets and sizes to be scrambled. Removing
   the cast required adding casts at a few dozen callers.
   There may be problems with other bogus casts in close-by
   macros. A quick check seemed to indicate that those were okay,
   however.
1998-10-13 08:24:45 +00:00
John Polstra
9b35a0d694 Fix a panic on SMP systems, caused by sleeping while holding a
simple-lock.

The reviewer raises the following caveat: "I believe these changes
open a non-critical race condition when adding memory to the pool
for the zone. I think what will happen is that you could have two
threads that are simultaneously adding additional memory when the
pool runs out. This appears to not be a problem, however, since
the re-aquisition of the lock will protect the list pointers."
The submitter agrees that the race is non-critical, and points out
that it already existed for the non-SMP case.  He suggests that
perhaps a sleep lock (using the lock manager) should be used to
close that race.  This might be worth revisiting after 3.0 is
released.

Reviewed by:	dg (David Greenman)
Submitted by:	tegge (Tor Egge)
1998-10-09 00:24:49 +00:00
John Polstra
a0fce82724 Fix a bug in which a page index was used where a byte offset was
expected.  This bug caused builds of Modula-3 to fail in mysterious
ways on SMP kernels.  More precisely, such builds failed on systems
with kern.fast_vfork equal to 0, the default and only supported
value for SMP kernels.

PR:		kern/7468
Submitted by:	tegge (Tor Egge)
1998-10-01 20:46:41 +00:00
Andrzej Bialecki
faa5f8d8da Make #define NO_SWAPPING a normal kernel config option.
Reviewed by:	jkh
1998-09-29 17:33:59 +00:00
Robert V. Baron
6e3a3f387c John Dyson approved of this solution; make vnode_pager_input_old set m->valid 1998-09-28 23:58:10 +00:00
David Greenman
ce65e68c03 Be more selctive about when we clear p->valid.
Submitted by:	John Dyson <toor@dyson.iquest.net>
1998-09-28 02:40:11 +00:00
Bruce Evans
4af2cddffe Removed unused file. 1998-09-20 06:28:10 +00:00
Bruce Evans
500b04a257 Instantiate `nfs_mount_type' in a standard file so that it is present
when nfs is an LKM.  Declare it in a header file.  Don't forget to use
it in non-Lite2 code.  Initialize it to -1 instead of to 0, since 0
will soon be the mount type number for the first vfs loaded.

NetBSD uses strcmp() to avoid this ugly global.
1998-09-05 15:17:34 +00:00
Doug Rabson
e69763a315 Cosmetic changes to the PAGE_XXX macros to make them consistent with
the other objects in vm.
1998-09-04 08:06:57 +00:00
Garrett Wollman
85e7f5492b Separate wakeup conditions for page I/O count (pg_busy) and lock (PG_BUSY).
This is not sa completely solution to the deadlock, but the additional wakeups
have helped in my observation.

Suggested by: John Dyson
1998-09-01 17:12:19 +00:00
Luoqi Chen
c576d12115 Fix a rounding problem that causes vnode pager to fail to remove the last
partially filled page during a truncation.

PR:		kern/7422
1998-08-25 13:47:37 +00:00
Doug Rabson
069e9bc1b4 Change various syscalls to use size_t arguments instead of u_int.
Add some overflow checks to read/write (from bde).

Change all modifications to vm_page::flags, vm_page::busy, vm_object::flags
and vm_object::paging_in_progress to use operations which are not
interruptable.

Reviewed by: Bruce Evans <bde@zeta.org.au>
1998-08-24 08:39:39 +00:00
Stephen McKay
9e80236560 Correct/clarify some comments. 1998-08-22 15:24:09 +00:00
Doug Rabson
196e9a52ec Protect all modifications to paging_in_progress with splvm(). 1998-08-13 08:05:13 +00:00
Doug Rabson
d474eaaa5f Protect all modifications to paging_in_progress with splvm(). The i386
managed to avoid corruption of this variable by luck (the compiler used a
memory read-modify-write instruction which wasn't interruptable) but other
architectures cannot.

With this change, I am now able to 'make buildworld' on the alpha (sfx: the
crowd goes wild...)
1998-08-06 08:33:19 +00:00
Bruce Evans
ccbbd9271b Fixed two spl nesting bugs. They caused (at least) the entire pageout
daemon to run at splvm() forever after swap_pager_putpages() is called
from vm_pageout_scan().

Broken in: rev.1.189 (1998/02/23)
1998-07-28 15:30:01 +00:00