Commit Graph

2297 Commits

Author SHA1 Message Date
mohans
83064ec323 Fix for problems that occur when all mbuf clusters migrate to the mbuf packet
zone. Cluster allocations fail when this happens. Also processes that may have
blocked on cluster allocations will never be woken up. Thanks to rwatson for
an overview of the issue and pointers to the mbuma paper and his tool to dump
out UMA zones.

Reviewed by: andre@
2007-01-25 01:05:23 +00:00
mohans
9799fcf93c Fix for a bug where only one process (of multiple) blocked on
maxpages on a zone is woken up, with the rest never being woken up as
a result of the ZFLAG_FULL flag being cleared. Wakeup all such blocked
procsses instead. This change introduces a thundering herd, but since
this should be relatively infrequent, optimizing this (by introducing
a count of blocked processes, for example) may be premature.

Reviewd by: ups@
2007-01-24 22:49:11 +00:00
jeff
474b917526 - Remove setrunqueue and replace it with direct calls to sched_add().
setrunqueue() was mostly empty.  The few asserts and thread state
   setting were moved to the individual schedulers.  sched_add() was
   chosen to displace it for naming consistency reasons.
 - Remove adjustrunqueue, it was 4 lines of code that was ifdef'd to be
   different on all three schedulers where it was only called in one place
   each.
 - Remove the long ifdef'd out remrunqueue code.
 - Remove the now redundant ts_state.  Inspect the thread state directly.
 - Don't set TSF_* flags from kern_switch.c, we were only doing this to
   support a feature in one scheduler.
 - Change sched_choose() to return a thread rather than a td_sched.  Also,
   rely on the schedulers to return the idlethread.  This simplifies the
   logic in choosethread().  Aside from the run queue links kern_switch.c
   mostly does not care about the contents of td_sched.

Discussed with:	julian

 - Move the idle thread loop into the per scheduler area.  ULE wants to
   do something different from the other schedulers.

Suggested by:	jhb

Tested on:	x86/amd64 sched_{4BSD, ULE, CORE}.
2007-01-23 08:46:51 +00:00
delphij
9856d14ea1 Use FOREACH_PROC_IN_SYSTEM instead of using its unrolled form. 2007-01-17 15:05:52 +00:00
rwatson
d6f9a991c6 Remove uma_zalloc_arg() hack, which coerced M_WAITOK to M_NOWAIT when
allocations were made using improper flags in interrupt context.
Replace with a simple WITNESS warning call.  This restores the
invariant that M_WAITOK allocations will always succeed or die
horribly trying, which is relied on by many UMA consumers.

MFC after:	3 weeks
Discussed with:	jhb
2007-01-10 21:04:43 +00:00
alc
57699b6f9c Declare the map entry created by kmem_init() for the range from
VM_MIN_KERNEL_ADDRESS to the end of the kernel's bootstrap data as
MAP_NOFAULT.
2007-01-07 07:32:04 +00:00
jhb
db17c1d997 - Add a new function uma_zone_exhausted() to see if a zone is full.
- Add a printf in swp_pager_meta_build() to warn if the swapzone becomes
  exhausted so that there's at least a warning before a box that runs out
  of swapzone space before running out of swap space deadlocks.

MFC after:	1 week
Reviwed by:	alc
2007-01-05 19:09:01 +00:00
alc
143ffef93b Optimize vm_object_split(). Specifically, make the number of iterations
equal to the number of physical pages that are renamed to the new object
rather than the new object's virtual size.
2006-12-17 20:14:43 +00:00
alc
71d9f62bca Simplify the computation of the new object's size in vm_object_split(). 2006-12-16 08:17:07 +00:00
kmacy
d0a44b7b7c Remove the requirement that phys_avail be sorted in ascending order
by explicitly finding the lowest and highest addresses when calculating
the size of the vm_pages array

Reviewed by :alc
2006-12-08 08:44:47 +00:00
julian
396ed947f6 Threading cleanup.. part 2 of several.
Make part of John Birrell's KSE patch permanent..
Specifically, remove:
Any reference of the ksegrp structure. This feature was
never fully utilised and made things overly complicated.
All code in the scheduler that tried to make threaded programs
fair to unthreaded programs.  Libpthread processes will already
do this to some extent and libthr processes already disable it.

Also:
Since this makes such a big change to the scheduler(s), take the opportunity
to rename some structures and elements that had to be moved anyhow.
This makes the code a lot more readable.

The ULE scheduler compiles again but I have no idea if it works.

The 4bsd scheduler still reqires a little cleaning and some functions that now do
ALMOST nothing will go away, but I thought I'd do that as a separate commit.

Tested by David Xu, and Dan Eischen using libthr and libpthread.
2006-12-06 06:34:57 +00:00
ru
01fad2a3f6 The clean_map has been made local to vm_init.c long ago. 2006-11-20 16:23:34 +00:00
ru
825e064fde Remove a redundant pointer-type variable. 2006-11-20 08:33:55 +00:00
ru
e58221ee6f When counting vm totals, skip unreferenced objects, including
vnodes representing mounted file systems.

Reviewed by:	alc
MFC after:	3 days
2006-11-20 00:16:00 +00:00
alc
ce5848ab99 There is no point in setting PG_REFERENCED on kmem_object pages because
they are "unmanaged", i.e., non-pageable, pages.

Remove a stale comment.
2006-11-13 00:27:02 +00:00
alc
6093953d36 Make pmap_enter() responsible for setting PG_WRITEABLE instead
of its caller.  (As a beneficial side-effect, a high-contention
acquisition of the page queues lock in vm_fault() is eliminated.)
2006-11-12 21:48:34 +00:00
alc
375373276d I misplaced the assertion that was added to vm_page_startup() in the
previous change.  Correct its placement.
2006-11-08 19:11:54 +00:00
alc
763826feff Simplify the construction of the free queues in vm_page_startup(). Add
an assertion to test a hypothesis concerning other redundant computation
in vm_page_startup().
2006-11-08 18:43:47 +00:00
alc
1f317d8152 Ensure that the page's oflags field is initialized by contigmalloc(). 2006-11-08 06:23:29 +00:00
rwatson
10d0d9cf47 Sweep kernel replacing suser(9) calls with priv(9) calls, assigning
specific privilege names to a broad range of privileges.  These may
require some future tweaking.

Sponsored by:           nCircle Network Security, Inc.
Obtained from:          TrustedBSD Project
Discussed on:           arch@
Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri,
                        Alex Lyashkov <umka at sevcity dot net>,
                        Skip Ford <skip dot ford at verizon dot net>,
                        Antoine Brodin <antoine dot brodin at laposte dot net>
2006-11-06 13:42:10 +00:00
jb
f82c799735 Make KSE a kernel option, turned on by default in all GENERIC
kernel configs except sun4v (which doesn't process signals properly
with KSE).

Reviewed by:	davidxu@
2006-10-26 21:42:22 +00:00
rwatson
d4be3ff623 Better align output of "show uma" by moving from displaying the basic
counters of allocs/frees/use for each zone to the same statistics
shown by userspace "vmstat -z".

MFC after:	3 days
2006-10-26 12:55:32 +00:00
alc
f395a2d02c The page queues lock is no longer required by vm_page_wakeup(). 2006-10-23 05:27:31 +00:00
alc
5d9c66a3f8 The page queues lock is no longer required by vm_page_busy() or
vm_page_wakeup().  Reduce or eliminate its use accordingly.
2006-10-22 21:18:48 +00:00
rwatson
7beaaf5cd2 Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h
begun with a repo-copy of mac.h to mac_framework.h.  sys/mac.h now
contains the userspace and user<->kernel API and definitions, with all
in-kernel interfaces moved to mac_framework.h, which is now included
across most of the kernel instead.

This change is the first step in a larger cleanup and sweep of MAC
Framework interfaces in the kernel, and will not be MFC'd.

Obtained from:	TrustedBSD Project
Sponsored by:	SPARTA
2006-10-22 11:52:19 +00:00
alc
cbcb760109 Replace PG_BUSY with VPO_BUSY. In other words, changes to the page's
busy flag, i.e., VPO_BUSY, are now synchronized by the per-vm object
lock instead of the global page queues lock.
2006-10-22 04:28:14 +00:00
alc
7d7a43f1b4 Eliminate unnecessary PG_BUSY tests. They originally served a purpose
that is now handled by vm object locking.
2006-10-21 21:02:04 +00:00
alc
31fbaf5d5c Long ago, revision 1.22 of vm/vm_pager.h introduced a bug. Specifically,
it introduced a check after the call to file system's get pages method
that assumes that the get pages method does not change the array of pages
that is passed to it.  In the case of vnode_pager_generic_getpages(),
this assumption has been incorrect.  The contents of the array of pages
may be shifted by vnode_pager_generic_getpages().  Likely, the problem
has been hidden by vnode_pager_haspage() limiting the set of pages that
are passed to vnode_pager_generic_getpages() such that a shift never
occurs.

The fix implemented herein is to adjust the pointer to the array of pages
rather than shifting the pages within the array.

MFC after: 3 weeks
Fix suggested by: tegge
2006-10-14 23:21:48 +00:00
alc
07a6c3ab4e Change vnode_pager_addr() such that on returning it distinguishes between
an error returned by VOP_BMAP() and a hole in the file.

Change the callers to vnode_pager_addr() such that they return
VM_PAGER_ERROR when VOP_BMAP fails instead of a zero-filled page.

Reviewed by: tegge
MFC after: 3 weeks
2006-10-14 22:09:03 +00:00
kmacy
e728d44745 sun4v requires TSBs (translation storage buffers) to be contiguous and be
size aligned requiring heavy usage of vm_page_alloc_contig

This change makes vm_page_alloc_contig SMP safe

Approved by: scottl (acting as backup for mentor rwatson)
2006-10-12 04:41:39 +00:00
alc
26e34ffad0 Distinguish between two distinct kinds of errors from VOP_BMAP() in
vnode_pager_generic_getpages(): (1) that VOP_BMAP() is unsupported by the
underlying file system and (2) an error in performing the VOP_BMAP().
Previously, vnode_pager_generic_getpages() assumed that all errors were
of the first type.  If, in fact, the error was of the second type, the
likely outcome was for the process to become permanently blocked on a busy
page.

MFC after: 3 weeks
Reviewed by: tegge
2006-10-10 18:26:18 +00:00
alc
bd713f224f Change vnode_pager_generic_getpages() so that it does not panic if the
given file is sparse.  Instead, it zeroes the requested page.

Reviewed by: tegge
PR: kern/98116
MFC after: 3 days
2006-10-08 20:26:16 +00:00
kensmith
0c209e1877 Fix two minor style(9) nits in v1.313 which were noticed during an
MFC review.  alc@ will be MFCing V1.313 plus style fix to RELENG_6.
2006-09-29 00:20:56 +00:00
alc
9fce925349 Make vm_page_release_contig() static. 2006-09-03 22:24:08 +00:00
alc
f2ccfe9525 Refactor vm_page_sleep_if_busy() so that the test for a busy page is
inlined and a procedure call is made in the rare case, i.e., when it is
necessary to sleep.  In this case, inlining the test actually makes the
kernel smaller.
2006-08-27 19:50:13 +00:00
alc
8f32cfe8b1 Prevent a call to contigmalloc() that asks for more physical memory than
the machine has from causing a panic.

Submitted by: Michael Plass
PR: 101668
MFC after: 3 days
2006-08-26 02:43:23 +00:00
alc
d108c1d6d1 The return value from vm_pageq_add_new_page() is not used. Eliminate it. 2006-08-25 04:36:19 +00:00
alc
fe4d04c555 Add _vm_stats and _vm_stats_misc to the sysctl declarations in sysctl.h and
eliminate their declarations from various source files.
2006-08-21 06:27:28 +00:00
alc
24209e2264 vm_page_zero_idle()'s return value serves no purpose. Eliminate it. 2006-08-21 00:55:05 +00:00
alc
ce3ad47700 Page flags are reset on (re)allocation. There is no need to clear any
flags except for PG_ZERO in vm_page_free_toq().
2006-08-21 00:34:31 +00:00
alc
cc1f2c465b Reimplement the page's NOSYNC flag as an object-synchronized instead of a
page queues-synchronized flag.  Reduce the scope of the page queues lock in
vm_fault() accordingly.

Move vm_fault()'s call to vm_object_set_writeable_dirty() outside of the
scope of the page queues lock.  Reviewed by: tegge
Additionally, eliminate an unnecessary dereference in computing the
argument that is passed to vm_object_set_writeable_dirty().
2006-08-13 00:11:09 +00:00
alc
b787cad1e0 Ensure that the page's new field for object-synchronized flags is always
initialized to zero.

Call vm_page_sleep_if_busy() instead of duplicating its implementation in
vm_page_grab().
2006-08-11 17:18:58 +00:00
alc
bc546843d7 Change vm_page_cowfault() so that it doesn't allocate a pre-busied page. 2006-08-10 04:48:29 +00:00
alc
b98eae58a6 Introduce a field to struct vm_page for storing flags that are
synchronized by the lock on the object containing the page.

Transition PG_WANTED and PG_SWAPINPROG to use the new field,
eliminating the need for holding the page queues lock when setting
or clearing these flags.  Rename PG_WANTED and PG_SWAPINPROG to
VPO_WANTED and VPO_SWAPINPROG, respectively.

Eliminate the assertion that the page queues lock is held in
vm_page_io_finish().

Eliminate the acquisition and release of the page queues lock
around calls to vm_page_io_finish() in kern_sendfile() and
vfs_unbusy_pages().
2006-08-09 17:43:27 +00:00
alc
bc6eabeb88 Eliminate the acquisition and release of the page queues lock around a call
to vm_page_sleep_if_busy().
2006-08-06 00:17:17 +00:00
alc
84c8fb9bd2 Change vm_page_sleep_if_busy() so that it no longer requires the caller to
hold the page queues lock.
2006-08-06 00:15:40 +00:00
alc
3f8b2509e4 Remove a stale comment. 2006-08-05 19:07:07 +00:00
alc
cbc0dafbb2 When sleeping on a busy page, use the lock from the containing object
rather than the global page queues lock.
2006-08-03 23:56:11 +00:00
alc
a152234cf9 Complete the transition from pmap_page_protect() to pmap_remove_write().
Originally, I had adopted sparc64's name, pmap_clear_write(), for the
function that is now pmap_remove_write().  However, this function is more
like pmap_remove_all() than like pmap_clear_modify() or
pmap_clear_reference(), hence, the name change.

The higher-level rationale behind this change is described in
src/sys/amd64/amd64/pmap.c revision 1.567.  The short version is that I'm
trying to clean up and fix our support for execute access.

Reviewed by: marcel@ (ia64)
2006-08-01 19:06:06 +00:00
alc
89dea53ec2 Export the number of object bypasses and collapses through sysctl. 2006-07-22 22:31:57 +00:00