Commit Graph

238 Commits

Author SHA1 Message Date
Philippe Charnier
956f31353c Spelling 2000-03-26 15:20:23 +00:00
Paul Saab
9730a5daab Add MAP_NOCORE to mmap(2), and MADV_NOCORE and MADV_CORE to madvise(2).
This
This feature allows you to specify if mmap'd data is included in
an application's corefile.

Change the type of eflags in struct vm_map_entry from u_char to
vm_eflags_t (an unsigned int).

Reviewed by:	dillon,jdp,alfred
Approved by:	jkh
2000-02-28 04:10:35 +00:00
Matthew Dillon
1f6889a1eb Fix null-pointer dereference crash when the system is intentionally
run out of KVM through a mmap()/fork() bomb that allocates hundreds
    of thousands of vm_map_entry structures.

    Add panic to make null-pointer dereference crash a little more verbose.

    Add a new sysctl, vm.max_proc_mmap, which specifies the maximum number
    of mmap()'d spaces (discrete vm_map_entry's in the process).  The value
    defaults to around 9000 for a 128MB machine.  The test is scaled for the
    number of processes sharing a vmspace (aka linux threads).  Setting
    the value to 0 disables the feature.

PR: kern/16573
Approved by: jkh
2000-02-16 21:11:33 +00:00
Matthew Dillon
ff359f84c9 Fix a deadlock between msync(..., MS_INVALIDATE) and vm_fault. The
invalidation code cannot wait for paging to complete while holding a
    vnode lock, so we don't wait.  Instead we simply allow the lower level
    code to simply block on any busy pages it encounters.  I think Yahoo
    may be the only entity in the entire world that actually uses this
    msync feature :-).

Bug reported by:  Paul Saab <paul@mu.org>
2000-01-21 20:17:01 +00:00
Matthew Dillon
4f79d873c1 Add MAP_NOSYNC feature to mmap(), and MADV_NOSYNC and MADV_AUTOSYNC to
madvise().

    This feature prevents the update daemon from gratuitously flushing
    dirty pages associated with a mapped file-backed region of memory.  The
    system pager will still page the memory as necessary and the VM system
    will still be fully coherent with the filesystem.  Modifications made
    by other means to the same area of memory, for example by write(), are
    unaffected.  The feature works on a page-granularity basis.

    MAP_NOSYNC allows one to use mmap() to share memory between processes
    without incuring any significant filesystem overhead, putting it in
    the same performance category as SysV Shared memory and anonymous memory.

Reviewed by: julian, alc, dg
1999-12-12 03:19:33 +00:00
Alan Cox
2b71c841f5 Remove nonsensical vm_map_{clear,set}_recursive() calls
from vm_map_pageable().  At the point they called, vm_map_pageable()
holds a read (or shared) lock on the map.  The purpose
of vm_map_{clear,set}_recursive() is to disable/enable repeated
write (or exclusive) lock requests by the same process.
1999-11-25 20:21:52 +00:00
Alan Cox
2ed14a92db Correct the following error: vm_map_pageable() on a COW'ed (post-fork)
vm_map always failed because vm_map_lookup() looked at
 "vm_map_entry->wired_count" instead of "(vm_map_entry->eflags &
 MAP_ENTRY_USER_WIRED)".  The effect was that many page
 wiring operations by sysctl were (silently) failing.
1999-11-23 06:51:28 +00:00
Alan Cox
79e1e3b9b4 Remove unused #include's.
Submitted by:	phk
1999-11-07 20:03:54 +00:00
Alan Cox
1ab41ed97c The functions declared by this header file no longer exist.
Submitted by:	phk (in part)
1999-11-07 06:46:48 +00:00
Poul-Henning Kamp
923502ff91 useracc() the prequel:
Merge the contents (less some trivial bordering the silly comments)
of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>.  This puts
the #defines for the vm_inherit_t and vm_prot_t types next to their
typedefs.

This paves the road for the commit to follow shortly: change
useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE}
as argument.
1999-10-29 18:09:36 +00:00
Matthew Dillon
b430905573 cleanup madvise code, add a few more sanity checks.
Reviewed by:	Alan Cox <alc@cs.rice.edu>,  dg@root.com
1999-09-21 05:00:48 +00:00
Peter Wemm
c3aac50f28 $Id$ -> $FreeBSD$ 1999-08-28 01:08:13 +00:00
Alan Cox
f7fc307ade vm_map_madvise:
A complete rewrite by dillon and myself to separate
	the implementation of behaviors that effect the vm_map_entry
	from those that effect the vm_object.

	A result of this change is that madvise(..., MADV_FREE);
	is much cheaper.
1999-08-13 17:45:34 +00:00
Alan Cox
5abfdd1eef vm_map_madvise:
Now that behaviors are stored in the vm_map_entry rather than
	the vm_object, it's no longer necessary to instantiate a vm_object
	just to hold the behavior.

Reviewed by:	dillon
1999-08-10 04:50:20 +00:00
Alan Cox
7f866e4b29 Move the memory access behavior information provided by madvise
from the vm_object to the vm_map.

Submitted by:	dillon
1999-08-01 06:05:09 +00:00
Alan Cox
d4da2dbae6 Fix the following problem:
When creating new processes (or performing exec), the new page
directory is initialized too early.  The kernel might grow before
p_vmspace is initialized for the new process.  Since pmap_growkernel
doesn't yet know about the new page directory, it isn't updated, and
subsequent use causes a failure.

The fix is (1) to clear p_vmspace early, to stop pmap_growkernel
from stomping on memory, and (2) to defer part of the initialization
of new page directories until p_vmspace is initialized.

PR:		kern/12378
Submitted by:	tegge
Reviewed by:	dfr
1999-07-21 18:02:27 +00:00
Alan Cox
32b76dfa8a Cleanup OBJ_ONEMAPPING management.
vm_map.c:
	Don't set OBJ_ONEMAPPING on arbitrary vm objects.  Only default
	and swap type vm objects should have it set.  vm_object_deallocate
	already handles these cases.

vm_object.c:
	If OBJ_ONEMAPPING isn't already clear in vm_object_shadow,
	we are in trouble.  Instead of clearing it, make it
	an assertion that it is already clear.
1999-07-11 18:30:32 +00:00
Peter Wemm
3efc015bae Fix some int/long printf problems for the Alpha 1999-07-01 19:53:43 +00:00
Alan Cox
6389da78d5 vm_map_growstack uses vmspace::vm_ssize as though it contained
the stack size in bytes when in fact it is the stack size in pages.
1999-06-17 21:29:38 +00:00
Alan Cox
29b45e9e99 vm_map_insert sometimes extends an existing vm_map entry, rather than
creating a new entry.  vm_map_stack and vm_map_growstack can panic when
a new entry isn't created.  Fixed vm_map_stack and vm_map_growstack.

Also, when extending the stack, always set the protection to VM_PROT_ALL.
1999-06-17 05:49:00 +00:00
Alan Cox
94f7e29a2a Move vm_map_stack and vm_map_growstack after the definition
of the vm_map_clip_end macro.  (The next commit will modify
vm_map_stack and vm_map_growstack to use vm_map_clip_end.)
1999-06-17 00:39:26 +00:00
Alan Cox
1fc43fd11d Remove some unused declarations and duplicate initialization. 1999-06-17 00:27:39 +00:00
Alan Cox
1c85e3df24 vm_map_protect:
The wrong vm_map_entry is used to determine if writes must not be
	allowed due to COW.
1999-06-12 23:10:38 +00:00
Alan Cox
9a2f6362a7 Avoid the creation of unnecessary shadow objects. 1999-05-28 03:39:44 +00:00
Alan Cox
4e045f937b vm_map_insert:
General cleanup.  Eliminate coalescing checks that are duplicated
	by vm_object_coalesce.
1999-05-18 05:38:48 +00:00
Alan Cox
e972780a11 Add the options MAP_PREFAULT and MAP_PREFAULT_PARTIAL to vm_map_find/insert,
eliminating the need for the pmap_object_init_pt calls in imgact_* and
mmap.

Reviewed by:	David Greenman <dg@root.com>
1999-05-17 00:53:56 +00:00
Alan Cox
ea41812fe5 Remove prototypes for functions that don't exist anymore (vm_map.h).
Remove a useless argument from vm_map_madvise's interface (vm_map.c,
	vm_map.h, and vm_mmap.c).

Remove a redundant test in vm_uiomove (vm_map.c).

Make two changes to vm_object_coalesce:

1. Determine whether the new range of pages actually overlaps
the existing object's range of pages before calling vm_object_page_remove.
(Prior to this change almost 90% of the calls to vm_object_page_remove
were to remove pages that were beyond the end of the object.)

2. Free any swap space allocated to removed pages.
1999-05-16 05:07:34 +00:00
Alan Cox
e5f13bdd09 Simplify vm_map_find/insert's interface: remove the MAP_COPY_NEEDED option.
It never makes sense to specify MAP_COPY_NEEDED without also specifying
MAP_COPY_ON_WRITE, and vice versa.  Thus, MAP_COPY_ON_WRITE suffices.

Reviewed by:	David Greenman <dg@root.com>
1999-05-14 23:09:34 +00:00
Alan Cox
876318eca0 Two changes to vm_map_delete:
1. Don't bother checking object->ref_count == 1 in order to set
OBJ_ONEMAPPING.  It's a waste of time.  If object->ref_count == 1,
vm_map_entry_delete will "run-down" the object and its pages.

2. If object->ref_count == 1, ignore OBJ_ONEMAPPING.  Wait for
vm_map_entry_delete to "run-down" the object and its pages.
Otherwise, we're calling two different procedures to delete
the object's pages.

Note: "vmstat -s" will once again show a non-zero value
for "pages freed by exiting processes".
1999-04-04 07:11:02 +00:00
Alan Cox
ad5fca3b4a Mainly, eliminate the comments about share maps. (We don't have share maps
any more.)  Also, eliminate an incorrect comment that says that we don't
coalesce vm_map_entry's.  (We do.)
1999-03-27 23:46:04 +00:00
Alan Cox
99c81ca94d Two changes:
Remove more (redundant) map timestamp increments from properly
synchronized routines.  (Changed: vm_map_entry_link, vm_map_entry_unlink,
and vm_map_pageable.)

Micro-optimize vm_map_entry_link and vm_map_entry_unlink, eliminating
unnecessary dereferences.  At the same time, converted them from macros
to inline functions.
1999-03-21 23:37:00 +00:00
Alan Cox
44428f621d Two changes:
In general, vm_map_simplify_entry should be performed INSIDE
the loop that traverses the map, not outside.  (Changed:
vm_map_inherit, vm_map_pageable.)

vm_fault_unwire doesn't acquire the map lock (or block holding
it).  Thus, vm_map_set/clear_recursive shouldn't be called.
(Changed: vm_map_user_pageable, vm_map_pageable.)
1999-03-15 06:24:52 +00:00
Alan Cox
00d4f4a5f4 Remove (redundant) map timestamp increments from some properly
synchronized routines.
1999-03-09 08:00:17 +00:00
Alan Cox
da3a3026b9 Remove an unused variable from vmspace_fork. 1999-03-08 03:53:07 +00:00
Alan Cox
9de3dd734e Change vm_map_growstack to acquire and hold a read lock (instead of a write
lock) until it actually needs to modify the vm_map.

Note: it is legal to modify vm_map::hint without holding a write lock.

Submitted by:	"Richard Seaman, Jr." <dick@tar.com> with minor changes
		by myself.
1999-03-07 21:25:42 +00:00
Alan Cox
e5f251d2d3 Remove the last of the share map code: struct vm_map::is_main_map.
Reviewed by:	Matthew Dillon <dillon@apollo.backplane.com>
1999-03-02 05:43:18 +00:00
Matthew Dillon
d1bf5d56b6 Remove unnecessary page protects on map_split and collapse operations.
Fix bug where an object's OBJ_WRITEABLE/OBJ_MIGHTBEDIRTY flags do
    not get set under certain circumstances ( page rename case ).

Reviewed by:	Alan Cox <alc@cs.rice.edu>, John Dyson
1999-02-24 21:26:26 +00:00
Luoqi Chen
b1028ad122 Hide access to vmspace:vm_pmap with inline function vmspace_pmap(). This
is the preparation step for moving pmap storage out of vmspace proper.

Reviewed by:	Alan Cox	<alc@cs.rice.edu>
		Matthew Dillion	<dillon@apollo.backplane.com>
1999-02-19 14:25:37 +00:00
Matthew Dillon
9b09b6c73f Submitted by: Alan Cox <alc@cs.rice.edu>
Remove remaining share map garbage from vm_map_lookup() and clean out
    old #if 0 stuff.
1999-02-19 03:11:37 +00:00
Matthew Dillon
2aaeadf8d9 Fix non-fatal bug in vm_map_insert() which improperly cleared
OBJ_ONEMAPPING in the case where an object is extended by an
    additional vm_map_entry must be allocated.

    In vm_object_madvise(), remove calll to vm_page_cache() in MADV_FREE
    case in order to avoid a page fault on page reuse.  However, we still
    mark the page as clean and destroy any swap backing store.

Submitted by:	Alan Cox <alc@cs.rice.edu>
1999-02-12 09:51:43 +00:00
Matthew Dillon
9fdfe602fc Remove MAP_ENTRY_IS_A_MAP 'share' maps. These maps were once used to
attempt to optimize forks but were essentially given-up on due to
    problems and replaced with an explicit dup of the vm_map_entry structure.
    Prior to the removal, they were entirely unused.
1999-02-07 21:48:23 +00:00
Matthew Dillon
4112823fc7 Submitted by: Alan Cox
The vm_map_insert()/vm_object_coalesce() optimization has been extended
    to include OBJT_SWAP objects as well as OBJT_DEFAULT objects.  This is
    possible because it costs nothing to extend an OBJT_SWAP object with
    the new swapper.  We can't do this with the old swapper.  The old swapper
    used a linear array that would have had to have been reallocated, costing
    time as well as a potential low-memory deadlock.
1999-02-03 01:57:17 +00:00
Matthew Dillon
b406c0f55c This patch eliminates a pointless test from appearing twice
in vm_map_simplify_entry.  Basically, once you've verified that
    the objects in the adjacent vm_map_entry's are the same, either
    NULL or the same vm_object, there's no point in checking that the
    objects have the same behavior.

Obtained from:  Alan Cox <alc@cs.rice.edu>
1999-02-01 08:49:30 +00:00
Julian Elischer
287457c2e7 Submitted by: Alan Cox <alc@cs.rice.edu>
Checked by: "Richard Seaman, Jr." <dick@tar.com>
Fix the following problem:
As the code stands now, growing any stack, and not just the process's
main stack, modifies vm->vm_ssize.  This is inconsistent with the code
earlier in the same procedure.
1999-01-31 14:09:25 +00:00
Matthew Dillon
8aef171243 Fix warnings in preparation for adding -Wall -Wcast-qual to the
kernel compile
1999-01-28 00:57:57 +00:00
Julian Elischer
2907af2a96 Mostly remove the VM_STACK OPTION.
This changes the definitions of a few items so that structures are the
same whether or not the option itself is enabled. This allows
people to enable and disable the option without recompilng the world.

As the author says:

|I ran into a problem pulling out the VM_STACK option.  I was aware of this
|when I first did the work, but then forgot about it.  The VM_STACK stuff
|has some code changes in the i386 branch.  There need to be corresponding
|changes in the alpha branch before it can come out completely.

what is done:
|
|1) Pull the VM_STACK option out of the header files it appears in.  This
|really shouldn't affect anything that executes with or without the rest
|of the VM_STACK patches.  The vm_map_entry will then always have one
|extra element (avail_ssize).  It just won't be used if the VM_STACK
|option is not turned on.
|
|I've also pulled the option out of vm_map.c.  This shouldn't harm anything,
|since the routines that are enabled as a result are not called unless
|the VM_STACK option is enabled elsewhere.
|
|2) Add what appears to be appropriate code the the alpha branch, still
|protected behind the VM_STACK switch.  I don't have an alpha machine,
|so we would need to get some testers with alpha machines to try it out.
|
|Once there is some testing, we can consider making the change permanent
|for both i386 and alpha.
|
[..]
|
|Once the alpha code is adequately tested, we can pull VM_STACK out
|everywhere.
|

Submitted by:	"Richard Seaman, Jr." <dick@tar.com>
1999-01-26 02:49:52 +00:00
Matthew Dillon
7dbf82dc13 Change all manual settings of vm_page_t->dirty = VM_PAGE_BITS_ALL
to use the vm_page_dirty() inline.

    The inline can thus do sanity checks ( or not ) over all cases.
1999-01-24 06:04:52 +00:00
Matthew Dillon
81522c62fa General cleanup related to the new pager. We no longer have to worry
about conversions of objects to OBJT_SWAP, it is done automatically
    now.

    Replaced manually inserted code with inline calls for busy waiting on
    pages, which also incidently fixes a potential PG_BUSY race due to
    the code not running at splvm().

    vm_objects no longer have a paging_offset field ( see vm/vm_object.c )
1999-01-21 09:40:48 +00:00
Matthew Dillon
1c7c3c6a86 This is a rather large commit that encompasses the new swapper,
changes to the VM system to support the new swapper, VM bug
    fixes, several VM optimizations, and some additional revamping of the
    VM code.  The specific bug fixes will be documented with additional
    forced commits.  This commit is somewhat rough in regards to code
    cleanup issues.

Reviewed by:	"John S. Dyson" <root@dyson.iquest.net>, "David Greenman" <dg@root.com>
1999-01-21 08:29:12 +00:00
Julian Elischer
2267af789e Add (but don't activate) code for a special VM option to make
downward growing stacks more general.
Add (but don't activate) code to use the new stack facility
when running threads, (specifically the linux threads support).
This allows people to use both linux compiled linuxthreads, and also the
native FreeBSD linux-threads port.

The code is conditional on VM_STACK. Not using this will
produce the old heavily tested system.

Submitted by: Richard Seaman <dick@tar.com>
1999-01-06 23:05:42 +00:00
Poul-Henning Kamp
f5ef029e92 Nitpicking and dusting performed on a train. Removes trivial warnings
about unused variables, labels and other lint.
1998-10-25 17:44:59 +00:00
David Greenman
6cde7a165f Fixed two potentially serious classes of bugs:
1) The vnode pager wasn't properly tracking the file size due to
   "size" being page rounded in some cases and not in others.
   This sometimes resulted in corrupted files. First noticed by
   Terry Lambert.
   Fixed by changing the "size" pager_alloc parameter to be a 64bit
   byte value (as opposed to a 32bit page index) and changing the
   pagers and their callers to deal with this properly.
2) Fixed a bogus type cast in round_page() and trunc_page() that
   caused some 64bit offsets and sizes to be scrambled. Removing
   the cast required adding casts at a few dozen callers.
   There may be problems with other bogus casts in close-by
   macros. A quick check seemed to indicate that those were okay,
   however.
1998-10-13 08:24:45 +00:00
John Polstra
a0fce82724 Fix a bug in which a page index was used where a byte offset was
expected.  This bug caused builds of Modula-3 to fail in mysterious
ways on SMP kernels.  More precisely, such builds failed on systems
with kern.fast_vfork equal to 0, the default and only supported
value for SMP kernels.

PR:		kern/7468
Submitted by:	tegge (Tor Egge)
1998-10-01 20:46:41 +00:00
Doug Rabson
e69763a315 Cosmetic changes to the PAGE_XXX macros to make them consistent with
the other objects in vm.
1998-09-04 08:06:57 +00:00
Doug Rabson
069e9bc1b4 Change various syscalls to use size_t arguments instead of u_int.
Add some overflow checks to read/write (from bde).

Change all modifications to vm_page::flags, vm_page::busy, vm_object::flags
and vm_object::paging_in_progress to use operations which are not
interruptable.

Reviewed by: Bruce Evans <bde@zeta.org.au>
1998-08-24 08:39:39 +00:00
Doug Rabson
d474eaaa5f Protect all modifications to paging_in_progress with splvm(). The i386
managed to avoid corruption of this variable by luck (the compiler used a
memory read-modify-write instruction which wasn't interruptable) but other
architectures cannot.

With this change, I am now able to 'make buildworld' on the alpha (sfx: the
crowd goes wild...)
1998-08-06 08:33:19 +00:00
Bruce Evans
101eeb7f9f Print pointers using %p instead of attempting to print them by
casting them to long, etc.  Fixed some nearby printf bogons (sign
errors not warned about by gcc, and style bugs, but not truncation
of vm_ooffset_t's).

Use slightly less bogus casts for passing pointers to ddb command
functions.
1998-07-14 12:14:58 +00:00
Bruce Evans
fc62ef1fb5 Fixed printf format errors. 1998-07-11 11:30:46 +00:00
Bruce Evans
ac1e407b32 Fixed printf format errors. 1998-07-11 07:46:16 +00:00
Bruce Evans
e5b19842ef Removed unused includes. 1998-06-21 14:53:44 +00:00
Doug Rabson
ecbb00a262 This commit fixes various 64bit portability problems required for
FreeBSD/alpha.  The most significant item is to change the command
argument to ioctl functions from int to u_long.  This change brings us
inline with various other BSD versions.  Driver writers may like to
use (__FreeBSD_version == 300003) to detect this change.

The prototype FreeBSD/alpha machdep will follow in a couple of days
time.
1998-06-07 17:13:14 +00:00
John Dyson
cf2819ccb8 Make flushing dirty pages work correctly on filesystems that
unexpectedly do not complete writes even with sync I/O requests.
This should help the behavior of mmaped files when using
softupdates (and perhaps in other circumstances also.)
1998-05-21 07:47:58 +00:00
John Dyson
bd6be9150d An important fix for proper inheritance of backing objects for
object splits.  Another excellent detective job by Tor.
Submitted by:	Tor Egge <Tor.Egge@idi.ntnu.no>
1998-05-16 23:03:20 +00:00
John Dyson
96fb8cf258 Fix the shm panic. I mistakenly used the shadow_count to keep the object
from being split, and instead added an OBJ_NOSPLIT.
1998-05-04 17:12:53 +00:00
John Dyson
cbd8ec0902 Work around some VM bugs, the worst being an overly aggressive
swap space free calculation.  More complete fixes will be forthcoming,
in a week.
1998-05-04 03:01:44 +00:00
John Dyson
86524867d1 Another minor cleanup of the split code. Make sure that pages are
busied during the entire time, so that the waits for pages being
unbusy don't make the objects inconsistant.
1998-05-02 06:36:16 +00:00
John Dyson
e493d28abc Fix minor bug with new over used swap fix. 1998-05-01 02:25:29 +00:00
John Dyson
dda6b17151 Add a needed prototype, and fix a panic problem with the new
memory code.
1998-04-29 06:59:08 +00:00
John Dyson
c0877f103f Tighten up management of memory and swap space during map allocation,
deallocation cycles.  This should provide a measurable improvement
on swap and memory allocation on loaded systems.  It is unlikely a
complete solution.  Also, provide more map info with procfs.
Chuck Cranor spurred on this improvement.
1998-04-29 04:28:22 +00:00
John Dyson
2dbea5d2e3 Fix a pseudo-swap leak problem. This mitigates "leaks" due to
freeing partial objects, not freeing entire objects didn't
free any of it.  Simple fix to the map code.
Reviewed by:	dg
1998-04-28 05:54:47 +00:00
John Dyson
8f9110f6a1 This mega-commit is meant to fix numerous interrelated problems. There
has been some bitrot and incorrect assumptions in the vfs_bio code.  These
problems have manifest themselves worse on NFS type filesystems, but can
still affect local filesystems under certain circumstances.  Most of
the problems have involved mmap consistancy, and as a side-effect broke
the vfs.ioopt code.  This code might have been committed seperately, but
almost everything is interrelated.

1)	Allow (pmap_object_init_pt) prefaulting of buffer-busy pages that
	are fully valid.
2)	Rather than deactivating erroneously read initial (header) pages in
	kern_exec, we now free them.
3)	Fix the rundown of non-VMIO buffers that are in an inconsistent
	(missing vp) state.
4)	Fix the disassociation of pages from buffers in brelse.  The previous
	code had rotted and was faulty in a couple of important circumstances.
5)	Remove a gratuitious buffer wakeup in vfs_vmio_release.
6)	Remove a crufty and currently unused cluster mechanism for VBLK
	files in vfs_bio_awrite.  When the code is functional, I'll add back
	a cleaner version.
7)	The page busy count wakeups assocated with the buffer cache usage were
	incorrectly cleaned up in a previous commit by me.  Revert to the
	original, correct version, but with a cleaner implementation.
8)	The cluster read code now tries to keep data associated with buffers
	more aggressively (without breaking the heuristics) when it is presumed
	that the read data (buffers) will be soon needed.
9)	Change to filesystem lockmgr locks so that they use LK_NOPAUSE.  The
	delay loop waiting is not useful for filesystem locks, due to the
	length of the time intervals.
10)	Correct and clean-up spec_getpages.
11)	Implement a fully functional nfs_getpages, nfs_putpages.
12)	Fix nfs_write so that modifications are coherent with the NFS data on
	the server disk (at least as well as NFS seems to allow.)
13)	Properly support MS_INVALIDATE on NFS.
14)	Properly pass down MS_INVALIDATE to lower levels of the VM code from
	vm_map_clean.
15)	Better support the notion of pages being busy but valid, so that
	fewer in-transit waits occur.  (use p->busy more for pageouts instead
	of PG_BUSY.)  Since the page is fully valid, it is still usable for
	reads.
16)	It is possible (in error) for cached pages to be busy.  Make the
	page allocation code handle that case correctly.  (It should probably
	be a printf or panic, but I want the system to handle coding errors
	robustly.  I'll probably add a printf.)
17)	Correct the design and usage of vm_page_sleep.  It didn't handle
	consistancy problems very well, so make the design a little less
	lofty.  After vm_page_sleep, if it ever blocked, it is still important
	to relookup the page (if the object generation count changed), and
	verify it's status (always.)
18)	In vm_pageout.c, vm_pageout_clean had rotted, so clean that up.
19)	Push the page busy for writes and VM_PROT_READ into vm_pageout_flush.
20)	Fix vm_pager_put_pages and it's descendents to support an int flag
	instead of a boolean, so that we can pass down the invalidate bit.
1998-03-07 21:37:31 +00:00
John Dyson
660957521c Fix page prezeroing for SMP, and fix some potential paging-in-progress
hangs.  The paging-in-progress diagnosis was a result of Tor Egge's
excellent detective work.
Submitted by:	Partially from Tor Egge.
1998-02-25 03:56:15 +00:00
John Dyson
e47ed70b0f Significantly improve the efficiency of the swap pager, which appears to
have declined due to code-rot over time.  The swap pager rundown code
has been clean-up, and unneeded wakeups removed.  Lots of splbio's
are changed to splvm's.  Also, set the dynamic tunables for the
pageout daemon to be more sane for larger systems (thereby decreasing
the daemon overheadla.)
1998-02-23 08:22:48 +00:00
Bruce Evans
39e4376ba7 Removed unused #includes. 1998-02-20 13:11:54 +00:00
Eivind Eklund
303b270b0a Staticize. 1998-02-09 06:11:36 +00:00
John Dyson
157ac55f97 Fix an argument to vn_lock. It appears that alot of the vn_lock usage
is a bit undisciplined, and should be checked carefully.
1998-02-08 14:55:13 +00:00
Eivind Eklund
0b08f5f737 Back out DIAGNOSTIC changes. 1998-02-06 12:14:30 +00:00
John Dyson
95461b450d 1) Start using a cleaner and more consistant page allocator instead
of the various ad-hoc schemes.
2)	When bringing in UPAGES, the pmap code needs to do another vm_page_lookup.
3)	When appropriate, set the PG_A or PG_M bits a-priori to both avoid some
	processor errata, and to minimize redundant processor updating of page
	tables.
4)	Modify pmap_protect so that it can only remove permissions (as it
	originally supported.)  The additional capability is not needed.
5)	Streamline read-only to read-write page mappings.
6)	For pmap_copy_page, don't enable write mapping for source page.
7)	Correct and clean-up pmap_incore.
8)	Cluster initial kern_exec pagin.
9)	Removal of some minor lint from kern_malloc.
10)	Correct some ioopt code.
11)	Remove some dead code from the MI swapout routine.
12)	Correct vm_object_deallocate (to remove backing_object ref.)
13)	Fix dead object handling, that had problems under heavy memory load.
14)	Add minor vm_page_lookup improvements.
15)	Some pages are not in objects, and make sure that the vm_page.c can
	properly support such pages.
16)	Add some more page deficit handling.
17)	Some minor code readability improvements.
1998-02-05 03:32:49 +00:00
Eivind Eklund
47cfdb166d Turn DIAGNOSTIC into a new-style option. 1998-02-04 22:34:03 +00:00
John Dyson
eaf13dd73a Change the busy page mgmt, so that when pages are freed, they
MUST be PG_BUSY.  It is bogus to free a page that isn't busy,
because it is in a state of being "unavailable" when being
freed.  The additional advantage is that the page_remove code
has a better cross-check that the page should be busy and
unavailable for other use.  There were some minor problems
with the collapse code, and this plugs those subtile "holes."

Also, the vfs_bio code wasn't checking correctly for PG_BUSY
pages.  I am going to develop a more consistant scheme for
grabbing pages, busy or otherwise.  For now, we are stuck
with the current morass.
1998-01-31 11:56:53 +00:00
John Dyson
2d8acc0f4a VM level code cleanups.
1)	Start using TSM.
	Struct procs continue to point to upages structure, after being freed.
	Struct vmspace continues to point to pte object and kva space for kstack.
	u_map is now superfluous.
2)	vm_map's don't need to be reference counted.  They always exist either
	in the kernel or in a vmspace.  The vmspaces are managed by reference
	counts.
3)	Remove the "wired" vm_map nonsense.
4)	No need to keep a cache of kernel stack kva's.
5)	Get rid of strange looking ++var, and change to var++.
6)	Change more data structures to use our "zone" allocator.  Added
	struct proc, struct vmspace and struct vnode.  This saves a significant
	amount of kva space and physical memory.  Additionally, this enables
	TSM for the zone managed memory.
7)	Keep ioopt disabled for now.
8)	Remove the now bogus "single use" map concept.
9)	Use generation counts or id's for data structures residing in TSM, where
	it allows us to avoid unneeded restart overhead during traversals, where
	blocking might occur.
10)	Account better for memory deficits, so the pageout daemon will be able
	to make enough memory available (experimental.)
11)	Fix some vnode locking problems. (From Tor, I think.)
12)	Add a check in ufs_lookup, to avoid lots of unneeded calls to bcmp.
	(experimental.)
13)	Significantly shrink, cleanup, and make slightly faster the vm_fault.c
	code.  Use generation counts, get rid of unneded collpase operations,
	and clean up the cluster code.
14)	Make vm_zone more suitable for TSM.

This commit is partially as a result of discussions and contributions from
other people, including DG, Tor Egge, PHK, and probably others that I
have forgotten to attribute (so let me know, if I forgot.)

This is not the infamous, final cleanup of the vnode stuff, but a necessary
step.  Vnode mgmt should be correct, but things might still change, and
there is still some missing stuff (like ioopt, and physical backing of
non-merged cache files, debugging of layering concepts.)
1998-01-22 17:30:44 +00:00
John Dyson
480ba2f552 Allow gdb to work again. 1998-01-21 12:18:00 +00:00
John Dyson
4722175765 Tie up some loose ends in vnode/object management. Remove an unneeded
config option in pmap.  Fix a problem with faulting in pages.  Clean-up
some loose ends in swap pager memory management.

The system should be much more stable, but all subtile bugs aren't fixed yet.
1998-01-17 09:17:02 +00:00
John Dyson
925a3a419a Fix some vnode management problems, and better mgmt of vnode free list.
Fix the UIO optimization code.
Fix an assumption in vm_map_insert regarding allocation of swap pagers.
Fix an spl problem in the collapse handling in vm_object_deallocate.
When pages are freed from vnode objects, and the criteria for putting
the associated vnode onto the free list is reached, either put the
vnode onto the list, or put it onto an interrupt safe version of the
list, for further transfer onto the actual free list.
Some minor syntax changes changing pre-decs, pre-incs to post versions.
Remove a bogus timeout (that I added for debugging) from vn_lock.

PHK will likely still have problems with the vnode list management, and
so do I, but it is better than it was.
1998-01-12 01:46:33 +00:00
John Dyson
95e5e988e0 Make our v_usecount vnode reference count work identically to the
original BSD code.  The association between the vnode and the vm_object
no longer includes reference counts.  The major difference is that
vm_object's are no longer freed gratuitiously from the vnode, and so
once an object is created for the vnode, it will last as long as the
vnode does.

When a vnode object reference count is incremented, then the underlying
vnode reference count is incremented also.  The two "objects" are now
more intimately related, and so the interactions are now much less
complex.

When vnodes are now normally placed onto the free queue with an object still
attached.  The rundown of the object happens at vnode rundown time, and
happens with exactly the same filesystem semantics of the original VFS
code.  There is absolutely no need for vnode_pager_uncache and other
travesties like that anymore.

A side-effect of these changes is that SMP locking should be much simpler,
the I/O copyin/copyout optimizations work, NFS should be more ponderable,
and further work on layered filesystems should be less frustrating, because
of the totally coherent management of the vnode objects and vnodes.

Please be careful with your system while running this code, but I would
greatly appreciate feedback as soon a reasonably possible.
1998-01-06 05:26:17 +00:00
John Dyson
60f8d46448 Fix the decl of vfs_ioopt, allow LFS to compile again, fix a minor problem
with the object cache removal.
1997-12-29 01:03:55 +00:00
John Dyson
2be70f79f6 Lots of improvements, including restructring the caching and management
of vnodes and objects.  There are some metadata performance improvements
that come along with this.  There are also a few prototypes added when
the need is noticed.  Changes include:

1) Cleaning up vref, vget.
2) Removal of the object cache.
3) Nuke vnode_pager_uncache and friends, because they aren't needed anymore.
4) Correct some missing LK_RETRY's in vn_lock.
5) Correct the page range in the code for msync.

Be gentle, and please give me feedback asap.
1997-12-29 00:25:11 +00:00
John Dyson
6d1756a948 The ioopt code is still buggy, but wasn't fully disabled. 1997-12-25 20:55:15 +00:00
John Dyson
c2e11a039d Change bogus usage of btoc to atop. The incorrect usage of btoc was
pointed out by bde.
1997-12-19 15:31:13 +00:00
John Dyson
1efb74fbcc Some performance improvements, and code cleanups (including changing our
expensive OFF_TO_IDX to btoc whenever possible.)
1997-12-19 09:03:37 +00:00
Bruce Evans
5270ecea67 Don't #define max() to get a version that works with vm_ooffset's.
Just use qmax().

This should be fixed more generally using overloaded functions.
1997-11-24 15:03:13 +00:00
Tor Egge
b44959ce49 Simplify map entries during user page wire and user page unwire operations in
vm_map_user_pageable().

Check return value of vm_map_lock_upgrade() during a user page wire operation.
1997-11-14 23:42:10 +00:00
Poul-Henning Kamp
4a11ca4e29 Remove a bunch of variables which were unused both in GENERIC and LINT.
Found by:	-Wunused
1997-11-07 08:53:44 +00:00
Bruce Evans
55b211e3af Removed unused #includes. 1997-10-28 15:59:26 +00:00
John Dyson
0a80f406b3 Decrease the initial allocation for the zone allocations. 1997-10-24 23:41:04 +00:00
Poul-Henning Kamp
a1c995b626 Last major round (Unless Bruce thinks of somthing :-) of malloc changes.
Distribute all but the most fundamental malloc types.  This time I also
remembered the trick to making things static:  Put "static" in front of
them.

A couple of finer points by:	bde
1997-10-12 20:26:33 +00:00
Poul-Henning Kamp
55166637cd Distribute and statizice a lot of the malloc M_* types.
Substantial input from:	bde
1997-10-11 18:31:40 +00:00
John Dyson
99448ed11d Change the M_NAMEI allocations to use the zone allocator. This change
plus the previous changes to use the zone allocator decrease the useage
of malloc by half.  The Zone allocator will be upgradeable to be able
to use per CPU-pools, and has more intelligent usage of SPLs.  Additionally,
it has reasonable stats gathering capabilities, while making most calls
inline.
1997-09-21 04:24:27 +00:00
Jonathan Lemon
987b847efc Do not consider VM_PROT_OVERRIDE_WRITE to be part of the protection
entry when handling a fault.  This is set by procfs whenever it wants
to write to a page, as a means of overriding `r-x COW' entries, but
causes failures in the `rwx' case.

Submitted by:	 bde
1997-09-12 15:58:47 +00:00
Bruce Evans
79624e2147 Removed unused #includes. 1997-09-01 03:17:34 +00:00