Commit Graph

37 Commits

Author SHA1 Message Date
green
cf7423d4fe Remove extraneous locks on the VM free page queue mutex; it is not
meant to be recursed upon, and could cauuse a deadlock inside the
new contigmalloc (vm.old_contigmalloc=0) code.

Submitted by:	alc
2004-07-19 23:29:36 +00:00
green
c4b4a8f048 Reimplement contigmalloc(9) with an algorithm which stands a greatly-
improved chance of working despite pressure from running programs.
Instead of trying to throw a bunch of pages out to swap and hope for
the best, only a range that can potentially fulfill contigmalloc(9)'s
request will have its contents paged out (potentially, not forcibly)
at a time.

The new contigmalloc operation still operates in three passes, but it
could potentially be tuned to more or less.  The first pass only looks
at pages in the cache and free pages, so they would be thrown out
without having to block.  If this is not enough, the subsequent passes
page out any unwired memory.  To combat memory pressure refragmenting
the section of memory being laundered, each page is removed from the
systems' free memory queue once it has been freed so that blocking
later doesn't cause the memory laundered so far to get reallocated.

The page-out operations are now blocking, as it would make little sense
to try to push out a page, then get its status immediately afterward
to remove it from the available free pages queue, if it's unlikely to
have been freed.  Another change is that if KVA allocation fails, the
allocated memory segment will be freed and not leaked.

There is a sysctl/tunable, defaulting to on, which causes the old
contigmalloc() algorithm to be used.  Nonetheless, I have been using
vm.old_contigmalloc=0 for over a month.  It is safe to switch at
run-time to see the difference it makes.

A new interface has been used which does not require mapping the
allocated pages into KVA: vm_page.h functions vm_page_alloc_contig()
and vm_page_release_contig().  These are what vm.old_contigmalloc=0
uses internally, so the sysctl/tunable does not affect their operation.

When using the contigmalloc(9) and contigfree(9) interfaces, memory
is now tracked with malloc(9) stats.  Several functions have been
exported from kern_malloc.c to allow other subsystems to use these
statistics, as well.  This invalidates the BUGS section of the
contigmalloc(9) manpage.
2004-07-19 06:21:27 +00:00
green
52f66d0879 Make contigmalloc() more reliable:
1. Remove a race whereby contigmalloc() would deadlock against the
   running processes in the system if they kept reinstantiating
   the memory on the active and inactive page queues that it was
   trying to flush out.  The process doing the contigmalloc() would
   sit in "swwrt" forever and the swap pager would be going at full
   force, but never get anywhere.  Instead of doing it until the
   queues are empty, launder for as many iterations as there are
   pages in the queue.
2. Do all laundering to swap synchronously; previously, the vnode
   laundering was synchronous and the swap laundering not.
3. Increase the number of launder-or-allocate passes to three, from
   two, while failing without bothering to do all the laundering on
   the third pass if allocation was not possible.  This effectively
   gives exactly two chances to launder enough contiguous memory,
   helpful with high memory churn where a lot of memory from one pass
   to the next (and during a single laundering loop) becomes dirtied
   again.

I can now reliably hot-plug hardware requiring a 256KB contigmalloc()
without having the kldload/cbb ithread sit around failing to make
progress, while running a busy X session.  Previously, it took killing
X to get contigmalloc() to get further (that is, quiescing the system),
and even then contigmalloc() returned failure.
2004-06-15 01:02:00 +00:00
imp
04c9462c02 Remove advertising clause from University of California Regent's license,
per letter dated July 22, 1999.

Approved by: core
2004-04-06 20:15:37 +00:00
alc
4fad4b8d26 Remove GIANT_REQUIRED from contigfree(). 2004-03-13 07:09:15 +00:00
alc
9687ab882b In the last revision, I introduced a physical contiguity check that is both
unnecessary and wrong.  While it is necessary to verify that the page is
still free after dropping and reacquiring the free page queue lock, the
physical contiguity of the page can not change, making this check
unnecessary.  This check was wrong in that it could cause an out-of-bounds
array access.

Tested by:	rwatson
2004-03-05 04:46:32 +00:00
alc
163a9eabd1 Modify contigmalloc1() so that the free page queues lock is not held when
vm_page_free() is called.  The problem with holding this lock is that it is
a spin lock and vm_page_free() may attempt the acquisition of a different
default-type lock.
2004-03-02 08:25:58 +00:00
alc
600caa157e Correct a long-standing race condition in vm_contig_launder() that could
result in a panic "vm_page_cache: caching a dirty page, ...": Access to the
page must be restricted or removed before calling vm_page_cache().  This
race condition is identical in nature to that which was addressed by
vm_pageout.c's revision 1.251 and vm_page.c's revision 1.275.

MFC after:	7 days
2004-02-16 03:43:57 +00:00
alc
24162d19e2 Remove vm_page_alloc_contig(). It's now unused. 2004-01-14 06:21:38 +00:00
alc
9aa6374009 - Unmanage pages allocated by contigmalloc1(). (There is no point in
having PV entries for these pages.)
 - Remove splvm() and splx() calls.
2004-01-10 21:17:53 +00:00
alc
9f7878e05a - Enable recursive acquisition of the mutex synchronizing access to the
free pages queue.  This is presently needed by contigmalloc1().
 - Move a sanity check against attempted double allocation of two pages
   to the same vm object offset from vm_page_alloc() to vm_page_insert().
   This provides better protection because double allocation could occur
   through a direct call to vm_page_insert(), such as that by
   vm_page_rename().
 - Modify contigmalloc1() to hold the mutex synchronizing access to the
   free pages queue while it scans vm_page_array in search of free pages.
 - Correct a potential leak of pages by contigmalloc1() that I introduced
   in revision 1.20: We must convert all cache queue pages to free pages
   before we begin removing free pages from the free queue.  Otherwise,
   if we have to restart the scan because we are unable to acquire the
   vm object lock that is necessary to convert a cache queue page to a
   free page, we leak those free pages already removed from the free queue.
2004-01-08 20:48:26 +00:00
alc
90dc293a41 Don't bother clearing PG_ZERO in contigmalloc1(), kmem_alloc(), or
kmem_malloc().  It serves no purpose.
2004-01-06 20:52:55 +00:00
alc
bccf1d15ab - Increase the object lock's scope in vm_contig_launder() so that access
to the object's type field and the call to vm_pageout_flush() are
   synchronized.
 - The above change allows for the eliminaton of the last parameter
   to vm_pageout_flush().
 - Synchronize access to the page's valid field in vm_pageout_flush()
   using the containing object's lock.
2003-10-18 21:09:21 +00:00
bms
44aa51e3ae Add the mlockall() and munlockall() system calls.
- All those diffs to syscalls.master for each architecture *are*
   necessary. This needed clarification; the stub code generation for
   mlockall() was disabled, which would prevent applications from
   linking to this API (suggested by mux)
 - Giant has been quoshed. It is no longer held by the code, as
   the required locking has been pushed down within vm_map.c.
 - Callers must specify VM_MAP_WIRE_HOLESOK or VM_MAP_WIRE_NOHOLES
   to express their intention explicitly.
 - Inspected at the vmstat, top and vm pager sysctl stats level.
   Paging-in activity is occurring correctly, using a test harness.
 - The RES size for a process may appear to be greater than its SIZE.
   This is believed to be due to mappings of the same shared library
   page being wired twice. Further exploration is needed.
 - Believed to back out of allocations and locks correctly
   (tested with WITNESS, MUTEX_PROFILING, INVARIANTS and DIAGNOSTIC).

PR:             kern/43426, standards/54223
Reviewed by:    jake, alc
Approved by:    jake (mentor)
MFC after:	2 weeks
2003-08-11 07:14:08 +00:00
mux
e7241bd66a Use pmap_zero_page() to zero pages instead of bzero() because
they haven't been vm_map_wire()'d yet.
2003-07-27 10:41:33 +00:00
alc
d63c1dd2b8 Acquire Giant rather than asserting it is held in contigmalloc(). This is
a prerequisite to removing further uses of Giant from UMA.
2003-07-26 21:48:46 +00:00
mux
a3fee15cba Add support for the M_ZERO flag to contigmalloc().
Reviewed by:	jeff
2003-07-25 21:02:25 +00:00
alc
bab2fb677f Lock a vm object when freeing a page from it. 2003-07-05 20:51:22 +00:00
mux
ed5355f79d Fix a few style(9) nits. 2003-07-02 01:47:47 +00:00
obrien
b0678d7a44 Use __FBSDID(). 2003-06-11 23:50:51 +00:00
alc
87da2c3cf3 - Acquire the vm_object's lock when performing vm_object_page_clean().
- Add a parameter to vm_pageout_flush() that tells vm_pageout_flush()
   whether its caller has locked the vm_object.  (This is a temporary
   measure to bootstrap vm_object locking.)
2003-04-24 04:31:25 +00:00
alc
a05b4b3347 Update locking on the kernel_object to use the new macros. 2003-04-14 00:36:53 +00:00
jake
783ae539c3 - Add vm_paddr_t, a physical address type. This is required for systems
where physical addresses larger than virtual addresses, such as i386s
  with PAE.
- Use this to represent physical addresses in the MI vm system and in the
  i386 pmap code.  This also changes the paddr parameter to d_mmap_t.
- Fix printf formats to handle physical addresses >4G in the i386 memory
  detection code, and due to kvtop returning vm_paddr_t instead of u_long.

Note that this is a name change only; vm_paddr_t is still the same as
vm_offset_t on all currently supported platforms.

Sponsored by:	DARPA, Network Associates Laboratories
Discussed with:	re, phk (cdevsw change)
2003-03-25 00:07:06 +00:00
alc
8b4847e64e - Hold the kernel_object's lock around vm_page_insert(..., kernel_object,
...).
2002-12-23 20:39:15 +00:00
alc
9565c29f19 o Extend the scope of the page queues lock in contigmalloc1().
o Replace vm_page_sleep_busy() with vm_page_sleep_if_busy()
   in vm_contig_launder().
2002-08-04 07:07:34 +00:00
alc
d048dca979 o Require that the page queues lock is held on entry to vm_pageout_clean()
and vm_pageout_flush().
 o Acquire the page queues lock before calling vm_pageout_clean()
   or vm_pageout_flush().
2002-07-27 23:20:32 +00:00
alc
478ee061bb o Lock page queue accesses by vm_page_cache() in vm_contig_launder().
o Micro-optimize the control flow in vm_contig_launder().
2002-07-20 06:11:16 +00:00
alc
dd3f128981 o Create vm_contig_launder() to replace code that appears twice
in contigmalloc1().
2002-07-15 06:33:31 +00:00
alc
f5dfaef158 o Lock some (unfortunately, not yet all) accesses to the page queues. 2002-07-12 03:17:22 +00:00
alc
596ff14965 o Lock accesses to the free page queues in contigmalloc1(). 2002-07-05 06:43:32 +00:00
alc
42cf959f18 o Use vm_map_wire() and vm_map_unwire() in place of vm_map_pageable() and
vm_map_user_pageable().
 o Remove vm_map_pageable() and vm_map_user_pageable().
 o Remove vm_map_clear_recursive() and vm_map_set_recursive().  (They were
   only used by vm_map_pageable() and vm_map_user_pageable().)

Reviewed by:	tegge
2002-06-14 18:21:01 +00:00
alc
53ca2106e4 o Make contigmalloc1() static. 2002-05-22 01:01:37 +00:00
alc
e56a9ee7ae Call vm_pageq_remove_nowakeup() rather than duplicating it. 2002-03-03 22:36:14 +00:00
dillon
cbc26091b2 contigmalloc1() could cause the vm_page_zero_count to become incorrect.
Properly track the count.

Submitted by:	mark tinguely <tinguely@web.cs.ndsu.nodak.edu>
2001-10-17 17:34:34 +00:00
dillon
d96ac0398e Makes contigalloc[1]() create the vm_map / underlying wired pages in the
kernel map and object in a manner that contigfree() is actually able to
free.  Previously contigfree() freed up the KVA space but could not
unwire & free the underlying VM pages due to mismatched pageability between
the map entry and the VM pages.

Submitted by:	Thomas Moestl <tmoestl@gmx.net>
Testing by: mark tinguely <tinguely@web.cs.ndsu.nodak.edu>
MFC after:	3 days
2001-10-13 04:23:37 +00:00
julian
5596676e6c KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after:    ha ha ha ha
2001-09-12 08:38:13 +00:00
dillon
93369f554a Reorg vm_page.c into vm_page.c, vm_pageq.c, and vm_contig.c (for contigmalloc).
Also removed some spl's and added some VM mutexes, but they are not actually
used yet, so this commit does not really make any operational changes
to the system.

vm_page.c relates to vm_page_t manipulation, including high level deactivation,
activation, etc...  vm_pageq.c relates to finding free pages and aquiring
exclusive access to a page queue (exclusivity part not yet implemented).
And the world still builds... :-)
2001-07-04 23:27:09 +00:00