Commit Graph

2297 Commits

Author SHA1 Message Date
Mohan Srinivasan
6c125b8df6 Fix for problems that occur when all mbuf clusters migrate to the mbuf packet
zone. Cluster allocations fail when this happens. Also processes that may have
blocked on cluster allocations will never be woken up. Thanks to rwatson for
an overview of the issue and pointers to the mbuma paper and his tool to dump
out UMA zones.

Reviewed by: andre@
2007-01-25 01:05:23 +00:00
Mohan Srinivasan
7738029183 Fix for a bug where only one process (of multiple) blocked on
maxpages on a zone is woken up, with the rest never being woken up as
a result of the ZFLAG_FULL flag being cleared. Wakeup all such blocked
procsses instead. This change introduces a thundering herd, but since
this should be relatively infrequent, optimizing this (by introducing
a count of blocked processes, for example) may be premature.

Reviewd by: ups@
2007-01-24 22:49:11 +00:00
Jeff Roberson
f0393f063a - Remove setrunqueue and replace it with direct calls to sched_add().
setrunqueue() was mostly empty.  The few asserts and thread state
   setting were moved to the individual schedulers.  sched_add() was
   chosen to displace it for naming consistency reasons.
 - Remove adjustrunqueue, it was 4 lines of code that was ifdef'd to be
   different on all three schedulers where it was only called in one place
   each.
 - Remove the long ifdef'd out remrunqueue code.
 - Remove the now redundant ts_state.  Inspect the thread state directly.
 - Don't set TSF_* flags from kern_switch.c, we were only doing this to
   support a feature in one scheduler.
 - Change sched_choose() to return a thread rather than a td_sched.  Also,
   rely on the schedulers to return the idlethread.  This simplifies the
   logic in choosethread().  Aside from the run queue links kern_switch.c
   mostly does not care about the contents of td_sched.

Discussed with:	julian

 - Move the idle thread loop into the per scheduler area.  ULE wants to
   do something different from the other schedulers.

Suggested by:	jhb

Tested on:	x86/amd64 sched_{4BSD, ULE, CORE}.
2007-01-23 08:46:51 +00:00
Xin LI
f67af5c918 Use FOREACH_PROC_IN_SYSTEM instead of using its unrolled form. 2007-01-17 15:05:52 +00:00
Robert Watson
635fd50514 Remove uma_zalloc_arg() hack, which coerced M_WAITOK to M_NOWAIT when
allocations were made using improper flags in interrupt context.
Replace with a simple WITNESS warning call.  This restores the
invariant that M_WAITOK allocations will always succeed or die
horribly trying, which is relied on by many UMA consumers.

MFC after:	3 weeks
Discussed with:	jhb
2007-01-10 21:04:43 +00:00
Alan Cox
e6eaadba43 Declare the map entry created by kmem_init() for the range from
VM_MIN_KERNEL_ADDRESS to the end of the kernel's bootstrap data as
MAP_NOFAULT.
2007-01-07 07:32:04 +00:00
John Baldwin
663b416f16 - Add a new function uma_zone_exhausted() to see if a zone is full.
- Add a printf in swp_pager_meta_build() to warn if the swapzone becomes
  exhausted so that there's at least a warning before a box that runs out
  of swapzone space before running out of swap space deadlocks.

MFC after:	1 week
Reviwed by:	alc
2007-01-05 19:09:01 +00:00
Alan Cox
73000556e8 Optimize vm_object_split(). Specifically, make the number of iterations
equal to the number of physical pages that are renamed to the new object
rather than the new object's virtual size.
2006-12-17 20:14:43 +00:00
Alan Cox
95442adf05 Simplify the computation of the new object's size in vm_object_split(). 2006-12-16 08:17:07 +00:00
Kip Macy
35d10226b7 Remove the requirement that phys_avail be sorted in ascending order
by explicitly finding the lowest and highest addresses when calculating
the size of the vm_pages array

Reviewed by :alc
2006-12-08 08:44:47 +00:00
Julian Elischer
ad1e7d285a Threading cleanup.. part 2 of several.
Make part of John Birrell's KSE patch permanent..
Specifically, remove:
Any reference of the ksegrp structure. This feature was
never fully utilised and made things overly complicated.
All code in the scheduler that tried to make threaded programs
fair to unthreaded programs.  Libpthread processes will already
do this to some extent and libthr processes already disable it.

Also:
Since this makes such a big change to the scheduler(s), take the opportunity
to rename some structures and elements that had to be moved anyhow.
This makes the code a lot more readable.

The ULE scheduler compiles again but I have no idea if it works.

The 4bsd scheduler still reqires a little cleaning and some functions that now do
ALMOST nothing will go away, but I thought I'd do that as a separate commit.

Tested by David Xu, and Dan Eischen using libthr and libpthread.
2006-12-06 06:34:57 +00:00
Ruslan Ermilov
9bed18a493 The clean_map has been made local to vm_init.c long ago. 2006-11-20 16:23:34 +00:00
Ruslan Ermilov
ef1b7c4804 Remove a redundant pointer-type variable. 2006-11-20 08:33:55 +00:00
Ruslan Ermilov
276096bb3e When counting vm totals, skip unreferenced objects, including
vnodes representing mounted file systems.

Reviewed by:	alc
MFC after:	3 days
2006-11-20 00:16:00 +00:00
Alan Cox
0f3b612a06 There is no point in setting PG_REFERENCED on kmem_object pages because
they are "unmanaged", i.e., non-pageable, pages.

Remove a stale comment.
2006-11-13 00:27:02 +00:00
Alan Cox
44b8bd66f9 Make pmap_enter() responsible for setting PG_WRITEABLE instead
of its caller.  (As a beneficial side-effect, a high-contention
acquisition of the page queues lock in vm_fault() is eliminated.)
2006-11-12 21:48:34 +00:00
Alan Cox
49c3b92531 I misplaced the assertion that was added to vm_page_startup() in the
previous change.  Correct its placement.
2006-11-08 19:11:54 +00:00
Alan Cox
9ad3296a25 Simplify the construction of the free queues in vm_page_startup(). Add
an assertion to test a hypothesis concerning other redundant computation
in vm_page_startup().
2006-11-08 18:43:47 +00:00
Alan Cox
815bc69fb0 Ensure that the page's oflags field is initialized by contigmalloc(). 2006-11-08 06:23:29 +00:00
Robert Watson
acd3428b7d Sweep kernel replacing suser(9) calls with priv(9) calls, assigning
specific privilege names to a broad range of privileges.  These may
require some future tweaking.

Sponsored by:           nCircle Network Security, Inc.
Obtained from:          TrustedBSD Project
Discussed on:           arch@
Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri,
                        Alex Lyashkov <umka at sevcity dot net>,
                        Skip Ford <skip dot ford at verizon dot net>,
                        Antoine Brodin <antoine dot brodin at laposte dot net>
2006-11-06 13:42:10 +00:00
John Birrell
8460a577a4 Make KSE a kernel option, turned on by default in all GENERIC
kernel configs except sun4v (which doesn't process signals properly
with KSE).

Reviewed by:	davidxu@
2006-10-26 21:42:22 +00:00
Robert Watson
ae4e9636ac Better align output of "show uma" by moving from displaying the basic
counters of allocs/frees/use for each zone to the same statistics
shown by userspace "vmstat -z".

MFC after:	3 days
2006-10-26 12:55:32 +00:00
Alan Cox
66bdd5d619 The page queues lock is no longer required by vm_page_wakeup(). 2006-10-23 05:27:31 +00:00
Alan Cox
2a53696fb8 The page queues lock is no longer required by vm_page_busy() or
vm_page_wakeup().  Reduce or eliminate its use accordingly.
2006-10-22 21:18:48 +00:00
Robert Watson
aed5570872 Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h
begun with a repo-copy of mac.h to mac_framework.h.  sys/mac.h now
contains the userspace and user<->kernel API and definitions, with all
in-kernel interfaces moved to mac_framework.h, which is now included
across most of the kernel instead.

This change is the first step in a larger cleanup and sweep of MAC
Framework interfaces in the kernel, and will not be MFC'd.

Obtained from:	TrustedBSD Project
Sponsored by:	SPARTA
2006-10-22 11:52:19 +00:00
Alan Cox
9af80719db Replace PG_BUSY with VPO_BUSY. In other words, changes to the page's
busy flag, i.e., VPO_BUSY, are now synchronized by the per-vm object
lock instead of the global page queues lock.
2006-10-22 04:28:14 +00:00
Alan Cox
9fea8cad08 Eliminate unnecessary PG_BUSY tests. They originally served a purpose
that is now handled by vm object locking.
2006-10-21 21:02:04 +00:00
Alan Cox
7e2393ff51 Long ago, revision 1.22 of vm/vm_pager.h introduced a bug. Specifically,
it introduced a check after the call to file system's get pages method
that assumes that the get pages method does not change the array of pages
that is passed to it.  In the case of vnode_pager_generic_getpages(),
this assumption has been incorrect.  The contents of the array of pages
may be shifted by vnode_pager_generic_getpages().  Likely, the problem
has been hidden by vnode_pager_haspage() limiting the set of pages that
are passed to vnode_pager_generic_getpages() such that a shift never
occurs.

The fix implemented herein is to adjust the pointer to the array of pages
rather than shifting the pages within the array.

MFC after: 3 weeks
Fix suggested by: tegge
2006-10-14 23:21:48 +00:00
Alan Cox
bff763439b Change vnode_pager_addr() such that on returning it distinguishes between
an error returned by VOP_BMAP() and a hole in the file.

Change the callers to vnode_pager_addr() such that they return
VM_PAGER_ERROR when VOP_BMAP fails instead of a zero-filled page.

Reviewed by: tegge
MFC after: 3 weeks
2006-10-14 22:09:03 +00:00
Kip Macy
600c53adf9 sun4v requires TSBs (translation storage buffers) to be contiguous and be
size aligned requiring heavy usage of vm_page_alloc_contig

This change makes vm_page_alloc_contig SMP safe

Approved by: scottl (acting as backup for mentor rwatson)
2006-10-12 04:41:39 +00:00
Alan Cox
1de11f1af3 Distinguish between two distinct kinds of errors from VOP_BMAP() in
vnode_pager_generic_getpages(): (1) that VOP_BMAP() is unsupported by the
underlying file system and (2) an error in performing the VOP_BMAP().
Previously, vnode_pager_generic_getpages() assumed that all errors were
of the first type.  If, in fact, the error was of the second type, the
likely outcome was for the process to become permanently blocked on a busy
page.

MFC after: 3 weeks
Reviewed by: tegge
2006-10-10 18:26:18 +00:00
Alan Cox
f4f83da02d Change vnode_pager_generic_getpages() so that it does not panic if the
given file is sparse.  Instead, it zeroes the requested page.

Reviewed by: tegge
PR: kern/98116
MFC after: 3 days
2006-10-08 20:26:16 +00:00
Ken Smith
a9a5d47c85 Fix two minor style(9) nits in v1.313 which were noticed during an
MFC review.  alc@ will be MFCing V1.313 plus style fix to RELENG_6.
2006-09-29 00:20:56 +00:00
Alan Cox
e1cb7bc081 Make vm_page_release_contig() static. 2006-09-03 22:24:08 +00:00
Alan Cox
eb4bbba83a Refactor vm_page_sleep_if_busy() so that the test for a busy page is
inlined and a procedure call is made in the rare case, i.e., when it is
necessary to sleep.  In this case, inlining the test actually makes the
kernel smaller.
2006-08-27 19:50:13 +00:00
Alan Cox
1f081553cc Prevent a call to contigmalloc() that asks for more physical memory than
the machine has from causing a panic.

Submitted by: Michael Plass
PR: 101668
MFC after: 3 days
2006-08-26 02:43:23 +00:00
Alan Cox
09ef0d6e0c The return value from vm_pageq_add_new_page() is not used. Eliminate it. 2006-08-25 04:36:19 +00:00
Alan Cox
b276ae6f6a Add _vm_stats and _vm_stats_misc to the sysctl declarations in sysctl.h and
eliminate their declarations from various source files.
2006-08-21 06:27:28 +00:00
Alan Cox
38498c2123 vm_page_zero_idle()'s return value serves no purpose. Eliminate it. 2006-08-21 00:55:05 +00:00
Alan Cox
4f9d17d8ab Page flags are reset on (re)allocation. There is no need to clear any
flags except for PG_ZERO in vm_page_free_toq().
2006-08-21 00:34:31 +00:00
Alan Cox
b146f9e5d2 Reimplement the page's NOSYNC flag as an object-synchronized instead of a
page queues-synchronized flag.  Reduce the scope of the page queues lock in
vm_fault() accordingly.

Move vm_fault()'s call to vm_object_set_writeable_dirty() outside of the
scope of the page queues lock.  Reviewed by: tegge
Additionally, eliminate an unnecessary dereference in computing the
argument that is passed to vm_object_set_writeable_dirty().
2006-08-13 00:11:09 +00:00
Alan Cox
25017df472 Ensure that the page's new field for object-synchronized flags is always
initialized to zero.

Call vm_page_sleep_if_busy() instead of duplicating its implementation in
vm_page_grab().
2006-08-11 17:18:58 +00:00
Alan Cox
75db2abb2e Change vm_page_cowfault() so that it doesn't allocate a pre-busied page. 2006-08-10 04:48:29 +00:00
Alan Cox
5786be7cc7 Introduce a field to struct vm_page for storing flags that are
synchronized by the lock on the object containing the page.

Transition PG_WANTED and PG_SWAPINPROG to use the new field,
eliminating the need for holding the page queues lock when setting
or clearing these flags.  Rename PG_WANTED and PG_SWAPINPROG to
VPO_WANTED and VPO_SWAPINPROG, respectively.

Eliminate the assertion that the page queues lock is held in
vm_page_io_finish().

Eliminate the acquisition and release of the page queues lock
around calls to vm_page_io_finish() in kern_sendfile() and
vfs_unbusy_pages().
2006-08-09 17:43:27 +00:00
Alan Cox
e7e56b2889 Eliminate the acquisition and release of the page queues lock around a call
to vm_page_sleep_if_busy().
2006-08-06 00:17:17 +00:00
Alan Cox
e74814b66a Change vm_page_sleep_if_busy() so that it no longer requires the caller to
hold the page queues lock.
2006-08-06 00:15:40 +00:00
Alan Cox
ab1661cb76 Remove a stale comment. 2006-08-05 19:07:07 +00:00
Alan Cox
91449ce98c When sleeping on a busy page, use the lock from the containing object
rather than the global page queues lock.
2006-08-03 23:56:11 +00:00
Alan Cox
78985e424a Complete the transition from pmap_page_protect() to pmap_remove_write().
Originally, I had adopted sparc64's name, pmap_clear_write(), for the
function that is now pmap_remove_write().  However, this function is more
like pmap_remove_all() than like pmap_clear_modify() or
pmap_clear_reference(), hence, the name change.

The higher-level rationale behind this change is described in
src/sys/amd64/amd64/pmap.c revision 1.567.  The short version is that I'm
trying to clean up and fix our support for execute access.

Reviewed by: marcel@ (ia64)
2006-08-01 19:06:06 +00:00
Alan Cox
604c2bbc34 Export the number of object bypasses and collapses through sysctl. 2006-07-22 22:31:57 +00:00