Commit Graph

292 Commits

Author SHA1 Message Date
attilio
9bd4fdf7ce Do proper "locking" for missing vmmeters part.
Now, we assume no more sched_lock protection for some of them and use the
distribuited loads method for vmmeter (distribuited through CPUs).

Reviewed by: alc, bde
Approved by: jeff (mentor)
2007-06-04 21:45:18 +00:00
attilio
7dd8ed88a9 Revert VMCNT_* operations introduction.
Probabilly, a general approach is not the better solution here, so we should
solve the sched_lock protection problems separately.

Requested by: alc
Approved by: jeff (mentor)
2007-05-31 22:52:15 +00:00
kib
f13486a222 Revert UF_OPENING workaround for CURRENT.
Change the VOP_OPEN(), vn_open() vnode operation and d_fdopen() cdev operation
argument from being file descriptor index into the pointer to struct file.

Proposed and reviewed by:	jhb
Reviewed by:	daichi (unionfs)
Approved by:	re (kensmith)
2007-05-31 11:51:53 +00:00
jeff
e1996cb960 - define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating
vmcnts.  This can be used to abstract away pcpu details but also changes
   to use atomics for all counters now.  This means sched lock is no longer
   responsible for protecting counts in the switch routines.

Contributed by:		Attilio Rao <attilio@FreeBSD.org>
2007-05-18 07:10:50 +00:00
rwatson
bd3cfaaef8 Audit pathnames looked up in swapon(2) and swapoff(2).
MFC after:	2 weeks
Obtained from:	TrustedBSD Project
2007-04-23 14:41:34 +00:00
jhb
9081d44243 Use pause() rather than tsleep() on stack variables and function pointers. 2007-02-27 17:23:29 +00:00
jhb
9c764c7fc3 - Move 'struct swdevt' back into swap_pager.h and expose it to userland.
- Restore support for fetching swap information from crash dumps via
  kvm_get_swapinfo(3) to fix pstat -T/-s on crash dumps.

Reviewed by:	arch@, phk
MFC after:	1 week
2007-02-07 17:43:11 +00:00
jhb
db17c1d997 - Add a new function uma_zone_exhausted() to see if a zone is full.
- Add a printf in swp_pager_meta_build() to warn if the swapzone becomes
  exhausted so that there's at least a warning before a box that runs out
  of swapzone space before running out of swap space deadlocks.

MFC after:	1 week
Reviwed by:	alc
2007-01-05 19:09:01 +00:00
rwatson
10d0d9cf47 Sweep kernel replacing suser(9) calls with priv(9) calls, assigning
specific privilege names to a broad range of privileges.  These may
require some future tweaking.

Sponsored by:           nCircle Network Security, Inc.
Obtained from:          TrustedBSD Project
Discussed on:           arch@
Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri,
                        Alex Lyashkov <umka at sevcity dot net>,
                        Skip Ford <skip dot ford at verizon dot net>,
                        Antoine Brodin <antoine dot brodin at laposte dot net>
2006-11-06 13:42:10 +00:00
alc
f395a2d02c The page queues lock is no longer required by vm_page_wakeup(). 2006-10-23 05:27:31 +00:00
rwatson
7beaaf5cd2 Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h
begun with a repo-copy of mac.h to mac_framework.h.  sys/mac.h now
contains the userspace and user<->kernel API and definitions, with all
in-kernel interfaces moved to mac_framework.h, which is now included
across most of the kernel instead.

This change is the first step in a larger cleanup and sweep of MAC
Framework interfaces in the kernel, and will not be MFC'd.

Obtained from:	TrustedBSD Project
Sponsored by:	SPARTA
2006-10-22 11:52:19 +00:00
alc
b98eae58a6 Introduce a field to struct vm_page for storing flags that are
synchronized by the lock on the object containing the page.

Transition PG_WANTED and PG_SWAPINPROG to use the new field,
eliminating the need for holding the page queues lock when setting
or clearing these flags.  Rename PG_WANTED and PG_SWAPINPROG to
VPO_WANTED and VPO_SWAPINPROG, respectively.

Eliminate the assertion that the page queues lock is held in
vm_page_io_finish().

Eliminate the acquisition and release of the page queues lock
around calls to vm_page_io_finish() in kern_sendfile() and
vfs_unbusy_pages().
2006-08-09 17:43:27 +00:00
alc
3f8b2509e4 Remove a stale comment. 2006-08-05 19:07:07 +00:00
alc
cbc0dafbb2 When sleeping on a busy page, use the lock from the containing object
rather than the global page queues lock.
2006-08-03 23:56:11 +00:00
pjd
5af5ea6363 Use better order here. 2006-05-10 06:50:44 +00:00
pjd
ca3f23ca34 On shutdown try to turn off all swap devices. This way GEOM providers are
properly closed on shutdown.

Requested by:	ru
Reviewed by:	alc
MFC after:	2 weeks
2006-04-10 10:03:41 +00:00
imp
f1947eff71 Remove leading __ from __(inline|const|signed|volatile). They are
obsolete.  This should reduce diffs to NetBSD as well.
2006-03-08 06:31:46 +00:00
cognet
c2e2425399 Make sure b_vp and b_bufobj are NULL before calling relpbuf(), as it asserts
they are. They should be NULL at this point, except if we're coming from
swapdev_strategy().
It should only affect the case where we're swapping directly on a file over
NFS.
2006-01-27 21:11:50 +00:00
cognet
2ccdc16083 Make sure we have a bufobj before calling bstrategy().
I'm not sure this is the right thing to do, but at least I don't panic
anymore when swapping on a NFS file without using md(4).

X-MFC after:      proper review
2005-09-21 15:01:09 +00:00
alc
38bf328ab8 Eliminate inconsistency in the setting of the B_DONE flag. Specifically,
make the b_iodone callback responsible for setting it if it is needed.
Previously, it was set unconditionally by bufdone() without holding
whichever lock is shared by the b_iodone callback and the corresponding
top-half function.  Consequently, in a race, the top-half function could
conclude that operation was done before the b_iodone callback finished.
See, for example, aio_physwakeup() and aio_fphysio().

Note: I don't believe that the other, more widely-used b_iodone callbacks
are affected.

Discussed with: jeff
Reviewed by: phk
MFC after: 2 weeks
2005-07-20 19:06:06 +00:00
alc
b351552abd Reduce the number of times that we acquire and release locks in
swap_pager_getpages().

MFC after: 1 week
2005-05-20 21:26:05 +00:00
alc
45eab788de Remove calls to spl*(). 2005-05-19 06:11:13 +00:00
alc
0e9f42833e Revert revision 1.270: swp_pager_async_iodone() need not perform
VM_LOCK_GIANT().

Discussed with: jeff
2005-05-18 17:48:04 +00:00
jeff
5adae6c622 - VM_LOCK_GIANT in the swap pager's iodone routine as VFS will soon call it
without Giant.

Sponsored by:	Isilon Systems, Inc.
2005-04-30 11:25:49 +00:00
jeff
f869be5c72 - Pass the ISOPEN flag to namei so filesystems will know we're about to
open them or otherwise access the data.
2005-04-27 09:05:19 +00:00
das
fb7ab34022 Move the swap_zone == NULL check earlier (i.e. before we dereference
the pointer.)

Found by:	Coverity Prevent analysis tool
2005-03-18 21:22:48 +00:00
imp
f0bf889d0d /* -> /*- for license, minor formatting changes 2005-01-07 02:29:27 +00:00
phk
3542218858 When allocating bio's in the swap_pager use M_WAITOK since the
alternative is much worse.
2005-01-03 13:28:56 +00:00
das
af608beb40 Disable U area swapping and remove the routines that create, destroy,
copy, and swap U areas.

Reviewed by:	arch@
2004-11-20 02:29:00 +00:00
das
9d935df169 Fix the last known race in swapoff(), which could lead to a spurious panic:
swapoff: failed to locate %d swap blocks

The race occurred because putpages() can block between the time it
allocates swap space and the time it updates the swap metadata to
associate that space with a vm_object, so swapoff() would complain
about the temporary inconsistency.  I hoped to fix this by making
swp_pager_getswapspace() and swp_pager_meta_build() a single atomic
operation, but that proved to be inconvenient.  With this change,
swapoff() simply doesn't attempt to be so clever about detecting when
all the pageout activity to the target device should have drained.
2004-11-06 07:17:50 +00:00
das
2beb616ced Close a race in swapoff(). Here are the gory details:
In order to avoid livelock, swapoff() skips over objects with a
  nonzero pip count and makes another pass if necessary.  Since it is
  impossible to know which objects we care about, it would choose an
  arbitrary object with a nonzero pip count and wait for it before
  making another pass, the theory being that this object would finish
  paging about as quickly as the ones we care about.  Unfortunately,
  we may have slept since we acquired a reference to this object.
  Hack around this problem by tsleep()ing on the pointer anyway, but
  timeout after a fixed interval.  More elegant solutions are possible,
  but the ones I considered unnecessarily complicate this rare case.

Also, kill some nits that seem to have crept into the swapoff() code
in the last 75 revisions or so:

- Don't pass both sp and sp->sw_used to swap_pager_swapoff(), since
  the latter can be derived from the former.

- Replace swp_pager_find_dev() with something simpler.  There's no
  need to iterate over the entire list of swap devices just to determine
  if a given block is assigned to the one we're interested in.

- Expand the scope of the swhash_mtx in a couple of places so that it
  isn't released and reacquired once for every hash bucket.

- Don't drop the swhash_mtx while holding a reference to an object.
  We need to lock the object first.  Unfortunately, doing so would
  violate the established lock order, so use VM_OBJECT_TRYLOCK() and
  try again on a subsequent pass if the object is already locked.

- Refactor swp_pager_force_pagein() and swap_pager_swapoff() a bit.
2004-11-05 05:36:56 +00:00
phk
50168ede53 De-couple our I/O bio request from the embedded bio in buf by explicitly
copying the fields.
2004-11-04 08:38:07 +00:00
phk
1e4caea88c Remove buf->b_dev field. 2004-11-04 07:59:57 +00:00
phk
1b25a59886 Move the buffer method vector (buf->b_op) to the bufobj.
Extend it with a strategy method.

Add bufstrategy() which do the usual VOP_SPECSTRATEGY/VOP_STRATEGY
song and dance.

Rename ibwrite to bufwrite().

Move the two NFS buf_ops to more sensible places, add bufstrategy
to them.

Add inlines for bwrite() and bstrategy() which calls through
buf->b_bufobj->b_ops->b_{write,strategy}().

Replace almost all VOP_STRATEGY()/VOP_SPECSTRATEGY() calls with bstrategy().
2004-10-24 20:03:41 +00:00
phk
52a089c526 Add b_bufobj to struct buf which eventually will eliminate the need for b_vp.
Initialize b_bufobj for all buffers.

Make incore() and gbincore() take a bufobj instead of a vnode.

Make inmem() local to vfs_bio.c

Change a lot of VI_[UN]LOCK(bp->b_vp) to BO_[UN]LOCK(bp->b_bufobj)
also VI_MTX() to BO_MTX(),

Make buf_vlist_add() take a bufobj instead of a vnode.

Eliminate other uses of bp->b_vp where bp->b_bufobj will do.

Various minor polishing: remove "register", turn panic into KASSERT,
use new function declarations, TAILQ_FOREACH_SAFE() etc.
2004-10-22 08:47:20 +00:00
phk
3833976d12 Move the VI_BWAIT flag into no bo_flag element of bufobj and call it BO_WWAIT
Add bufobj_wref(), bufobj_wdrop() and bufobj_wwait() to handle the write
count on a bufobj.  Bufobj_wdrop() replaces vwakeup().

Use these functions all relevant places except in ffs_softdep.c where
the use if interlocked_sleep() makes this impossible.

Rename b_vnbufs to b_bobufs now that we touch all the relevant files anyway.
2004-10-21 15:53:54 +00:00
das
2e3453ac22 Don't look for swap blocks in objects that aren't swap-backed.
I expect that this will fix the following panic, reported by Jun:
	swap_pager_isswapped: failed to locate all swap meta blocks

MT5 candidate
2004-09-24 16:04:20 +00:00
phk
d8d2b01380 Tag all geom classes in the tree with a version number. 2004-08-08 07:57:53 +00:00
alc
c7df7afd46 - Change uma_zone_set_obj() to call kmem_alloc_nofault() instead of
kmem_alloc_pageable().  The difference between these is that an errant
   memory access to the zone will be detected sooner with
   kmem_alloc_nofault().

The following changes serve to eliminate the following lock-order
reversal reported by witness:

 1st 0xc1a3c084 vm object (vm object) @ vm/swap_pager.c:1311
 2nd 0xc07acb00 swap_pager swhash (swap_pager swhash) @ vm/swap_pager.c:1797
 3rd 0xc1804bdc vm object (vm object) @ vm/uma_core.c:931

There is no potential deadlock in this case.  However, witness is unable
to recognize this because vm objects used by UMA have the same type as
ordinary vm objects.  To remedy this, we make the following changes:

 - Add a mutex type argument to VM_OBJECT_LOCK_INIT().
 - Use the mutex type argument to assign distinct types to special
   vm objects such as the kernel object, kmem object, and UMA objects.
 - Define a static swap zone object for use by UMA.  (Only static
   objects are assigned a special mutex type.)
2004-07-22 19:44:49 +00:00
bms
f91be5fba2 Properly brucify a string by outdenting it. 2004-07-06 02:27:30 +00:00
bms
72f48d5467 In swap_pager_getpages(), bp->b_dev can be NULL, particularly for the
case of NFS mounted swap, so do not try to dereference it.

While we're here, brucify the printf() call which happens when we
time out on acquisition of vm_page_queue_mtx.

PR:		kern/67898
Submitted by:	bde (style)
2004-06-23 15:15:07 +00:00
phk
40dd98a3bd Second half of the dev_t cleanup.
The big lines are:
	NODEV -> NULL
	NOUDEV -> NODEV
	udev_t -> dev_t
	udev2dev() -> findcdev()

Various minor adjustments including handling of userland access to kernel
space struct cdev etc.
2004-06-17 17:16:53 +00:00
phk
dfd1f7fd50 Do the dreaded s/dev_t/struct cdev */
Bump __FreeBSD_version accordingly.
2004-06-16 09:47:26 +00:00
alc
b57e5e03fd Make vm_page's PG_ZERO flag immutable between the time of the page's
allocation and deallocation.  This flag's principal use is shortly after
allocation.  For such cases, clearing the flag is pointless.  The only
unusual use of PG_ZERO is in vfs_bio_clrbuf().  However, allocbuf() never
requests a prezeroed page.  So, vfs_bio_clrbuf() never sees a prezeroed
page.

Reviewed by:	tegge@
2004-05-06 05:03:23 +00:00
alc
2d844ba403 - Substitute bdone() and bwait() from vfs_bio.c for
swap_pager_putpages()'s buffer completion code.  Note: the only
   difference between swp_pager_sync_iodone() and bdone(), aside from
   the locking in the latter, was the unnecessary clearing of B_ASYNC.
 - Remove an unnecessary pmap_page_protect() from
   swp_pager_async_iodone().

Reviewed by:	tegge
2004-02-23 03:15:13 +00:00
phk
124977fbb6 Remove the absolute count g_access_abs() function since experience has
shown that it is not useful.

Rename the relative count g_access_rel() function to g_access(), only
the name has changed.

Change all g_access_rel() calls in our CVS tree to call g_access() instead.

Add an #ifndef BURN_BRIDGES #define of g_access_rel() for source
code compatibility.
2004-02-12 22:42:11 +00:00
alc
8a8d62e1aa swp_pager_async_iodone() no longer requires Giant. Modify bufdone()
and swapgeom_done() to perform swp_pager_async_iodone() without Giant.

Reviewed by:	tegge
2004-02-07 08:54:50 +00:00
phk
62a971f78f Check error return from g_clone_bio(). (netchild@)
Add XXX comment about why this is still not optimal. (phk@)

Submitted by:	netchild@
2004-02-02 13:08:03 +00:00
alc
60b2122af9 1. Statically initialize swap_pager_full and swap_pager_almost_full to the
full state.  (When swap is added their state will change appropriately.)
2. Set swap_pager_full and swap_pager_almost_full to the full state when
   the last swap device is removed.
Combined these changes eliminate nonsense messages from the kernel on swap-
less machines.

Item 2 submitted by:	Divacky Roman <xdivac02@stud.fit.vutbr.cz>
Prodding by:		phk
2004-01-24 21:31:06 +00:00
alc
8c0190db49 Simplify the various pager allocation routines by computing the desired
object size once and assigning that value to a local variable.
2004-01-04 20:55:15 +00:00