302 Commits

Author SHA1 Message Date
kib
fa686c638e Implement global and per-uid accounting of the anonymous memory. Add
rlimit RLIMIT_SWAP that limits the amount of swap that may be reserved
for the uid.

The accounting information (charge) is associated with either map entry,
or vm object backing the entry, assuming the object is the first one
in the shadow chain and entry does not require COW. Charge is moved
from entry to object on allocation of the object, e.g. during the mmap,
assuming the object is allocated, or on the first page fault on the
entry. It moves back to the entry on forks due to COW setup.

The per-entry granularity of accounting makes the charge process fair
for processes that change uid during lifetime, and decrements charge
for proper uid when region is unmapped.

The interface of vm_pager_allocate(9) is extended by adding a struct ucred *
argument, which is used to charge the appropriate uid when the allocation is
performed by the kernel, e.g. md(4).

Several syscalls, among them fork(2), may now return ENOMEM when
global or per-uid limits are enforced.

In collaboration with:	pho
Reviewed by:	alc
Approved by:	re (kensmith)
2009-06-23 20:45:22 +00:00
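
A minimal sketch of the reservation check described in the commit above, assuming a simplified model in which one counter is kept globally and one per uid. The structures and the names `global_acct`, `uid_acct`, `swap_reserve_sketch` and `swap_release_sketch` are hypothetical and are not the kernel's; the sketch only shows why a caller such as fork(2) can see ENOMEM once either limit is enforced.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the kernel's data structures. */
struct global_acct {
	size_t reserved;	/* bytes of swap reserved system-wide */
	size_t limit;		/* global cap */
};

struct uid_acct {
	size_t reserved;	/* bytes charged to this uid */
	size_t limit;		/* per-uid cap, playing the role of RLIMIT_SWAP */
};

/*
 * Charge len bytes against both the global and the per-uid counter.
 * Returns 0 on success, or ENOMEM if either limit would be exceeded,
 * in which case neither counter is changed.
 */
static int
swap_reserve_sketch(struct global_acct *g, struct uid_acct *u, size_t len)
{
	assert(g->reserved <= g->limit && u->reserved <= u->limit);
	if (len > g->limit - g->reserved)
		return (ENOMEM);
	if (len > u->limit - u->reserved)
		return (ENOMEM);
	g->reserved += len;
	u->reserved += len;
	return (0);
}

/* Undo a previous charge, e.g. when the region is unmapped. */
static void
swap_release_sketch(struct global_acct *g, struct uid_acct *u, size_t len)
{
	assert(len <= g->reserved && len <= u->reserved);
	g->reserved -= len;
	u->reserved -= len;
}
```

Because the release takes the same uid record that was charged, a process that changes uid during its lifetime still credits the uid that originally paid, which is the fairness property the message attributes to per-entry accounting.
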
alc
919e3cbf28 Eliminate unnecessary obfuscation when testing a page's valid bits. 2009-06-07 19:38:26 +00:00
alc
30d072f507 Eliminate an incorrect comment. 2009-05-07 05:44:13 +00:00
alc
b9963f1636 Eliminate an archaic band-aid. The immediately preceding comment already
explains why the band-aid is unnecessary.

Suggested by:	tegge
2009-04-26 20:54:57 +00:00
alc
b13621e4e2 Allow valid pages to be mapped for read access when they have a non-zero
busy count.  Only mappings that allow write access should be prevented by
a non-zero busy count.

(The prohibition on mapping pages for read access when they have a non-
zero busy count originated in revision 1.202 of i386/i386/pmap.c when
this code was a part of the pmap.)

Reviewed by:	tegge
2009-04-19 00:34:34 +00:00
alc
8a8f5251fa Prior to r188331 a map entry's last read offset was only updated by a hard
fault.  In r188331 this update was relocated because of synchronization
changes to a place where it would occur on both hard and soft faults.  This
change again restricts the update to hard faults.
2009-02-25 07:52:53 +00:00
alc
e3d0161279 Avoid some cases of unnecessary page queues locking by vm_fault's delete-
behind heuristic.
2009-02-09 06:23:21 +00:00
alc
9513bac196 Eliminate OBJ_NEEDGIANT. After r188331, OBJ_NEEDGIANT's only use is by a
redundant assertion in vm_fault().

Reviewed by:	kib
2009-02-08 22:17:24 +00:00
kib
9988f9e959 Remove no longer valid comment.
Submitted by:	alc
2009-02-08 21:20:13 +00:00
kib
379838428b Do not sleep for vnode lock while holding map lock in vm_fault. Try to
acquire vnode lock for OBJT_VNODE object after map lock is dropped.
Because we have the busy page(s) in the object, sleeping there would
result in deadlock with vnode resize. Try to get lock without sleeping,
and, if the attempt failed, drop the state, lock the vnode, and restart
the fault handler from the start with already locked vnode.

Because the vnode_pager_lock() function is inlined in vm_fault(),
axe it.

Based on suggestion by:	alc
Reviewed by:	tegge, alc
Tested by:	pho
2009-02-08 20:23:46 +00:00
kib
b798264c6d Style. 2009-02-08 19:37:01 +00:00
alc
0de51cf047 Simplify the inner loop of vm_fault()'s delete-behind heuristic.
Instead of checking each page for PG_UNMANAGED, perform a one-time
check whether the object is OBJT_PHYS.  (PG_UNMANAGED pages only
belong to OBJT_PHYS objects.)
2008-03-16 17:37:19 +00:00
alc
160b9af7de Eliminate an unnecessary test from vm_fault's delete-behind heuristic.
Specifically, since the delete-behind heuristic is never applied to a
device-backed object, there is no point in checking whether each of the
object's pages is fictitious.  (Only device-backed objects have
fictitious pages.)
2008-03-09 06:08:58 +00:00
alc
545d26e30b Add an access type parameter to pmap_enter(). It will be used to implement
superpage promotion.

Correct a style error in kmem_malloc(): pmap_enter()'s last parameter is
a Boolean.
2008-01-03 07:34:34 +00:00
alc
4565fa1697 Add the superpage reservation system. This is "part 2 of 2" of the
machine-independent support for superpages.  (The earlier part was
the rewrite of the physical memory allocator.)  The remainder of the
code required for superpages support is machine-dependent and will
be added to the various pmap implementations at a later date.

Initially, I am only supporting one large page size per architecture.
Moreover, I am only enabling the reservation system on amd64.  (In
an emergency, it can be disabled by setting VM_NRESERVLEVELS to 0
in amd64/include/vmparam.h or your kernel configuration file.)
2007-12-29 19:53:04 +00:00
kib
53bbfe99fb Do not dereference NULL pointer.
Reported by:	Peter Holm
Reviewed by:	alc
Approved by:	re (kensmith)
2007-10-08 20:09:53 +00:00
alc
d1bce06c64 Change the management of cached pages (PQ_CACHE) in two fundamental
ways:

(1) Cached pages are no longer kept in the object's resident page
splay tree and memq.  Instead, they are kept in a separate per-object
splay tree of cached pages.  However, access to this new per-object
splay tree is synchronized by the _free_ page queues lock, not to be
confused with the heavily contended page queues lock.  Consequently, a
cached page can be reclaimed by vm_page_alloc(9) without acquiring the
object's lock or the page queues lock.

This solves a problem independently reported by tegge@ and Isilon.
Specifically, they observed the page daemon consuming a great deal of
CPU time because of pages bouncing back and forth between the cache
queue (PQ_CACHE) and the inactive queue (PQ_INACTIVE).  The source of
this problem turned out to be a deadlock avoidance strategy employed
when selecting a cached page to reclaim in vm_page_select_cache().
However, the root cause was really that reclaiming a cached page
required the acquisition of an object lock while the page queues lock
was already held.  Thus, this change addresses the problem at its
root, by eliminating the need to acquire the object's lock.

Moreover, keeping cached pages in the object's primary splay tree and
memq was, in effect, optimizing for the uncommon case.  Cached pages
are reclaimed far, far more often than they are reactivated.  Instead,
this change makes reclamation cheaper, especially in terms of
synchronization overhead, and reactivation more expensive, because
reactivated pages will have to be reentered into the object's primary
splay tree and memq.

(2) Cached pages are now stored alongside free pages in the physical
memory allocator's buddy queues, increasing the likelihood that large
allocations of contiguous physical memory (i.e., superpages) will
succeed.

Finally, as a result of this change long-standing restrictions on when
and where a cached page can be reclaimed and returned by
vm_page_alloc(9) are eliminated.  Specifically, calls to
vm_page_alloc(9) specifying VM_ALLOC_INTERRUPT can now reclaim and
return a formerly cached page.  Consequently, a call to malloc(9)
specifying M_NOWAIT is less likely to fail.

Discussed with: many over the course of the summer, including jeff@,
   Justin Husted @ Isilon, peter@, tegge@
Tested by: an earlier version by kris@
Approved by: re (kensmith)
2007-09-25 06:25:06 +00:00
alc
8765bda351 Two changes to vm_fault_additional_pages():
1. Rewrite the backward scan.  Specifically, reverse the order in which
   pages are allocated so that upon failure it is never necessary to
   free pages that were just allocated.  Moreover, any allocated pages
   can be put to use.  This makes the backward scan behave just like the
   forward scan.

2. Eliminate an explicit, unsynchronized check for low memory before
   calling vm_page_alloc().  It serves no useful purpose.  It is, in
   effect, optimizing the uncommon case at the expense of the common
   case.

Approved by:	re (hrs)
MFC after:	3 weeks
2007-07-20 06:55:11 +00:00
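
A minimal sketch of the reversed backward scan from item 1 of the commit above, under the assumption that "reversing the order" means allocating from the page nearest the faulting page outward. The function name and the `can_alloc` array (standing in for whether an allocation at that index would succeed) are hypothetical. Whatever prefix is obtained is contiguous with the required page, so nothing has to be freed on failure.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Try to obtain up to rbehind pages immediately before index reqidx.
 * can_alloc[i] models whether allocating page i succeeds (an assumption
 * of this sketch); allocated[i] is set for each page actually obtained.
 * Returns the number of read-behind pages obtained.
 */
static int
read_behind_sketch(const bool *can_alloc, int reqidx, int rbehind,
    bool *allocated)
{
	int got, i;

	assert(reqidx >= 0 && rbehind >= 0);
	got = 0;
	/* Walk outward from the required page: reqidx - 1, reqidx - 2, ... */
	for (i = reqidx - 1; i >= 0 && i >= reqidx - rbehind; i--) {
		if (!can_alloc[i])
			break;		/* stop; pages already taken stay usable */
		allocated[i] = true;
		got++;
	}
	return (got);
}
```

With this ordering a failure simply shortens the read-behind run, which matches the behavior the message describes for the forward scan.
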
alc
f58d26b291 Eliminate the special case handling of OBJT_DEVICE objects in
vm_fault_additional_pages() that was introduced in revision 1.47.  Then
as now, it is unnecessary because dev_pager_haspage() returns zero for
both the number of pages to read ahead and read behind, producing the
same exact behavior by vm_fault_additional_pages() as the special case
handling.

Approved by: re (rwatson)
2007-07-08 19:42:52 +00:00
alc
0985df88fc When a cached page is reactivated in vm_fault(), update the counter that
tracks the total number of reactivated pages.  (We have not been
counting reactivations by vm_fault() since revision 1.46.)

Correct a comment in vm_fault_additional_pages().

Approved by:	re (kensmith)
MFC after:	1 week
2007-07-06 21:25:21 +00:00
mjacob
cd7cf5829b Initialize reqpage to zero. 2007-06-17 04:14:27 +00:00
attilio
e333d0ff0e Rework the PCPU_* (MD) interface:
- Rename PCPU_LAZY_INC into PCPU_INC
- Add the PCPU_ADD interface which just does an add on the pcpu member
  given a specific value.

Note that for most architectures PCPU_INC and PCPU_ADD are not safe.
This is a point that needs some discussions/work in the next days.

Reviewed by: alc, bde
Approved by: jeff (mentor)
2007-06-04 21:38:48 +00:00
jeff
a7a8bac81f - Move rusage from being per-process in struct pstats to per-thread in
td_ru.  This removes the requirement for per-process synchronization in
   statclock() and mi_switch().  This was previously supported by
   sched_lock which is going away.  All modifications to rusage are now
   done in the context of the owning thread.  Reads proceed without locks.
 - Aggregate exiting threads rusage in thread_exit() such that the exiting
   thread's rusage is not lost.
 - Provide a new routine, rufetch() to fetch an aggregate of all rusage
   structures from all threads in a process.  This routine must be used
   in any place requiring a rusage from a process prior to its exit.  The
   exited process's rusage is still available via p_ru.
 - Aggregate tick statistics only on demand via rufetch() or when a thread
   exits.  Tick statistics are kept in the thread and protected by sched_lock
   until it exits.

Initial patch by:	attilio
Reviewed by:		attilio, bde (some objections), arch (mostly silent)
2007-06-01 01:12:45 +00:00
attilio
7dd8ed88a9 Revert VMCNT_* operations introduction.
Probably, a general approach is not the best solution here, so we should
solve the sched_lock protection problems separately.

Requested by: alc
Approved by: jeff (mentor)
2007-05-31 22:52:15 +00:00
alc
08b2128056 Eliminate the reactivation of cached pages in vm_fault_prefault() and
vm_map_pmap_enter() unless the caller is madvise(MADV_WILLNEED).  With
the exception of calls to vm_map_pmap_enter() from
madvise(MADV_WILLNEED), vm_fault_prefault() and vm_map_pmap_enter()
are both used to create speculative mappings.  Thus, always
reactivating cached pages is a mistake.  In principle, cached pages
should only be reactivated by an actual access.  Otherwise, the
following misbehavior can occur.  On a hard fault for a text page the
clustering algorithm fetches not only the required page but also
several of the adjacent pages.  Now, suppose that one or more of the
adjacent pages are never accessed.  Ultimately, these unused pages
become cached pages through the efforts of the page daemon.  However,
the next activation of the executable reactivates and maps these
unused pages.  Consequently, they are never replaced.  In effect, they
become pinned in memory.
2007-05-22 04:45:59 +00:00
jeff
e1996cb960 - define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating
vmcnts.  This can be used to abstract away pcpu details but also changes
   to use atomics for all counters now.  This means sched lock is no longer
   responsible for protecting counts in the switch routines.

Contributed by:		Attilio Rao <attilio@FreeBSD.org>
2007-05-18 07:10:50 +00:00
pjd
b6f1c3fccc Fix a problem for file systems that don't implement VOP_BMAP() operation.
The problem is this: vm_fault_additional_pages() calls vm_pager_has_page(),
which calls vnode_pager_haspage(). Now when VOP_BMAP() returns an error (e.g.,
EOPNOTSUPP), vnode_pager_haspage() returns TRUE without initializing 'before'
and 'after' arguments, so we have some accidental values there. This basically
was causing this condition to be met:

	if ((rahead + rbehind) >
	    ((cnt.v_free_count + cnt.v_cache_count) - cnt.v_free_reserved)) {
		pagedaemon_wakeup();
		[...]
	}

(we have some random values in rahead and rbehind variables)

I'm not entirely sure this is the right fix, maybe we should just return FALSE
in vnode_pager_haspage() when VOP_BMAP() fails?

alc@ knows about this problem, maybe he will be able to come up with a better
fix if this is not the right one.
2007-04-05 20:49:46 +00:00
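
A minimal sketch of one defensive pattern for the problem described in the commit above: give the output arguments defined values before calling anything that may fail. This is an illustration only and is not claimed to be the fix that was committed. `toy_bmap` and `toy_haspage` are hypothetical stand-ins for a VOP_BMAP()-style call and its caller.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical block-mapping call; on error it leaves its outputs untouched. */
static int
toy_bmap(bool supported, int *run_after, int *run_before)
{
	if (!supported)
		return (EOPNOTSUPP);
	*run_after = 3;
	*run_before = 2;
	return (0);
}

/*
 * Hypothetical "has page" query.  As in the message, an error from the
 * mapping call is still reported as "page present", but the outputs are
 * zeroed first so the caller never reads indeterminate values.
 */
static bool
toy_haspage(bool supported, int *before, int *after)
{
	assert(before != NULL && after != NULL);
	*before = 0;
	*after = 0;
	(void)toy_bmap(supported, after, before);
	return (true);
}
```

With zeroed outputs, a read-ahead plus read-behind sum computed by the caller is 0 on the error path instead of a random value.
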
alc
584a970755 vm_page_busy() no longer requires the page queues lock to be held. Reduce
the scope of the page queues lock in vm_fault() accordingly.
2007-03-23 06:11:25 +00:00
alc
2210798637 Use PCPU_LAZY_INC() to update page fault statistics. 2007-03-05 18:55:14 +00:00
alc
6093953d36 Make pmap_enter() responsible for setting PG_WRITEABLE instead
of its caller.  (As a beneficial side-effect, a high-contention
acquisition of the page queues lock in vm_fault() is eliminated.)
2006-11-12 21:48:34 +00:00
alc
f395a2d02c The page queues lock is no longer required by vm_page_wakeup(). 2006-10-23 05:27:31 +00:00
alc
cbcb760109 Replace PG_BUSY with VPO_BUSY. In other words, changes to the page's
busy flag, i.e., VPO_BUSY, are now synchronized by the per-vm object
lock instead of the global page queues lock.
2006-10-22 04:28:14 +00:00
alc
7d7a43f1b4 Eliminate unnecessary PG_BUSY tests. They originally served a purpose
that is now handled by vm object locking.
2006-10-21 21:02:04 +00:00
alc
cc1f2c465b Reimplement the page's NOSYNC flag as an object-synchronized instead of a
page queues-synchronized flag.  Reduce the scope of the page queues lock in
vm_fault() accordingly.

Move vm_fault()'s call to vm_object_set_writeable_dirty() outside of the
scope of the page queues lock.  Reviewed by: tegge
Additionally, eliminate an unnecessary dereference in computing the
argument that is passed to vm_object_set_writeable_dirty().
2006-08-13 00:11:09 +00:00
alc
bc6eabeb88 Eliminate the acquisition and release of the page queues lock around a call
to vm_page_sleep_if_busy().
2006-08-06 00:17:17 +00:00
alc
b5b274360a Retire debug.mpsafevm. None of the architectures supported in CVS require
it any longer.
2006-07-21 23:22:49 +00:00
ups
b3a7439a45 Remove mpte optimization from pmap_enter_quick().
There is a race with the current locking scheme and removing
it should have no measurable performance impact.
This fixes page faults leading to panics in pmap_enter_quick_locked()
on amd64/i386.

Reviewed by: alc,jhb,peter,ps
2006-06-15 01:01:06 +00:00
alc
951cb63293 Simplify the implementation of vm_fault_additional_pages() based upon the
object's memq being ordered.  Specifically, replace repeated calls to
vm_page_lookup() by two simple constant-time operations.

Reviewed by: tegge
2006-05-13 20:05:44 +00:00
imp
f1947eff71 Remove leading __ from __(inline|const|signed|volatile). They are
obsolete.  This should reduce diffs to NetBSD as well.
2006-03-08 06:31:46 +00:00
tegge
724ef57f1f Adjust old comment (present in rev 1.1) to match changes in rev 1.82.
PR:	kern/92509
Submitted by:   "Bryan Venteicher" <bryanv@daemoninthecloset.org>
2006-02-02 21:55:38 +00:00
alc
8db5cab2c2 Use the new macros abstracting the page coloring/queues implementation.
(There are no functional changes.)
2006-01-27 08:35:32 +00:00
netchild
507a9b3e93 MI changes:
- provide an interface (macros) to the page coloring part of the VM system,
   this allows trying different coloring algorithms without the need to
   touch every file [1]
 - make the page queue tuning values readable: sysctl vm.stats.pagequeue
 - autotuning of the page coloring values based upon the cache size instead
   of options in the kernel config (disabling of the page coloring as a
   kernel option is still possible)

MD changes:
 - detection of the cache size: only IA32 and AMD64 (untested) contain
   cache size detection code, every other arch just comes with a dummy
   function (this results in the use of default values like it was the
   case without the autotuning of the page coloring)
 - print some more info on Intel CPU's (like we do on AMD and Transmeta
   CPU's)

Note to AMD owners (IA32 and AMD64): please run "sysctl vm.stats.pagequeue"
and report if the cache* values are zero (= bug in the cache detection code)
or not.

Based upon work by:	Chad David <davidc@acns.ab.ca> [1]
Reviewed by:		alc, arch (in 2004)
Discussed with:		alc, Chad David, arch (in 2004)
2005-12-31 14:39:20 +00:00
tegge
7245d518e8 Don't access fs->first_object after dropping reference to it.
The result could be a missed or extra giant unlock.

Reviewed by:	alc
2005-12-20 12:27:59 +00:00
alc
a5d0ac5faf Remove unneeded calls to pmap_remove_all(). The given page is not mapped.
Reviewed by: tegge
2005-12-11 22:06:57 +00:00
alc
dd6197a72f Eliminate an incorrect cast. 2005-09-07 01:42:30 +00:00
alc
39788de49e Pass a value of type vm_prot_t to pmap_enter_quick() so that it can determine
whether the mapping should permit execute access.
2005-09-03 18:20:20 +00:00
jhb
841c5ac424 Convert a remaining !fs.map->system_map to
fs.first_object->flags & OBJ_NEEDGIANT test that was missed in an earlier
revision.  This fixes mutex assertion failures in the debug.mpsafevm=0
case.

Reported by:	ps
MFC after:	3 days
2005-07-14 21:18:07 +00:00
grehan
a442ec4d3f The final test in unlock_and_deallocate() to determine if GIANT needs to be
unlocked wasn't updated to check for OBJ_NEEDGIANT. This caused a WITNESS
panic when debug_mpsafevm was set to 0.

Approved by:	jeffr
2005-05-12 04:09:41 +00:00
jeff
d62d255d2e - Add a new object flag "OBJ_NEEDSGIANT". We set this flag if the
underlying vnode requires Giant.
 - In vm_fault only acquire Giant if the underlying object has NEEDSGIANT
   set.
 - In vm_object_shadow inherit the NEEDSGIANT flag from the backing object.
2005-05-03 11:11:26 +00:00
jeff
1dd5432139 - Remove GIANT_REQUIRED where giant is no longer required.
- Use VFS_LOCK_GIANT() rather than directly acquiring giant in places
   where giant is only held because vfs requires it.

Sponsored By:   Isilon Systems, Inc.
2005-01-24 10:48:29 +00:00
imp
f0bf889d0d /* -> /*- for license, minor formatting changes 2005-01-07 02:29:27 +00:00
alc
f16b9f1b30 Continue the transition from synchronizing access to the page's PG_BUSY
flag and busy field with the global page queues lock to synchronizing their
access with the containing object's lock.  Specifically, acquire the
containing object's lock before reading the page's PG_BUSY flag and busy
field in vm_fault().

Reviewed by: tegge@
2004-12-24 19:31:54 +00:00
alc
a618275b13 Modify pmap_enter_quick() so that it expects the page queues to be locked
on entry and it assumes the responsibility for releasing the page queues
lock if it must sleep.

Remove a bogus comment from pmap_enter_quick().

Using the first change, modify vm_map_pmap_enter() so that the page queues
lock is acquired and released once, rather than each time that a page
is mapped.
2004-12-23 20:16:11 +00:00
alc
ede2fb9751 In the common case, pmap_enter_quick() completes without sleeping.
In such cases, the busying of the page and the unlocking of the
containing object by vm_map_pmap_enter() and vm_fault_prefault() is
unnecessary overhead.  To eliminate this overhead, this change
modifies pmap_enter_quick() so that it expects the object to be locked
on entry and it assumes the responsibility for busying the page and
unlocking the object if it must sleep.  Note: alpha, amd64, i386 and
ia64 are the only implementations optimized by this change; arm,
powerpc, and sparc64 still conservatively busy the page and unlock the
object within every pmap_enter_quick() call.

Additionally, this change is the first case where we synchronize
access to the page's PG_BUSY flag and busy field using the containing
object's lock rather than the global page queues lock.  (Modifications
to the page's PG_BUSY flag and busy field have asserted both locks for
several weeks, enabling an incremental transition.)
2004-12-15 19:55:05 +00:00
alc
46e3ee9584 Remove unnecessary check for curthread == NULL. 2004-10-17 20:29:28 +00:00
alc
96c3a115d5 System maps are prohibited from mapping vnode-backed objects. Take
advantage of this restriction to avoid acquiring and releasing Giant when
wiring pages within a system map.

In collaboration with: tegge@
2004-09-11 18:49:59 +00:00
alc
82e55fdf76 Push Giant deep into vm_forkproc(), acquiring it only if the process has
mapped System V shared memory segments (see shmfork_myhook()) or requires
the allocation of an ldt (see vm_fault_wire()).
2004-09-03 05:11:32 +00:00
alc
38af5e4b6b In vm_fault_unwire() eliminate the acquisition and release of Giant in the
case of non-kernel pmaps.
2004-09-01 19:18:59 +00:00
alc
069d1661bd In the previous revision, I failed to condition an early release of Giant
in vm_fault() on debug_mpsafevm.  If debug_mpsafevm was not set, the result
was an assertion failure early in the boot process.

Reported by: green@
2004-08-22 00:08:43 +00:00
alc
bdaf27d7e6 Further reduce the use of Giant by vm_fault(): Giant is held only when
manipulating a vnode, e.g., calling vput().  This reduces contention for
Giant during many copy-on-write faults, resulting in some additional
speedup on SMPs.

Note: debug_mpsafevm must be enabled for this optimization to take effect.
2004-08-21 19:20:21 +00:00
alc
336d354baa - Introduce and use a new tunable "debug.mpsafevm". At present, setting
"debug.mpsafevm" results in (almost) Giant-free execution of zero-fill
   page faults.  (Giant is held only briefly, just long enough to determine
   if there is a vnode backing the faulting address.)

   Also, condition the acquisition and release of Giant around calls to
   pmap_remove() on "debug.mpsafevm".

   The effect on performance is significant.  On my dual Opteron, I see a
   3.6% reduction in "buildworld" time.

 - Use atomic operations to update several counters in vm_fault().
2004-08-16 06:16:12 +00:00
tegge
c5a462b4d9 The vm map lock is needed in vm_fault() after the page has been found,
to avoid later changes before pmap_enter() and vm_fault_prefault()
has completed.

Simplify deadlock avoidance by not blocking on vm map relookup.

In collaboration with: alc
2004-08-12 20:14:49 +00:00
alc
8c107931a7 Make two changes to vm_fault().
1. Move a comment to its proper place, updating it.  (Except for white-
   space, this comment had been unchanged since revision 1.1!)
2. Remove spl calls.
2004-08-09 18:46:39 +00:00
alc
eabee22ac5 Make two changes to vm_fault().
1. Retain the map lock until after the calls to pmap_enter() and
   vm_fault_prefault().
2. Remove a stale comment.  Submitted by: tegge@
2004-08-09 06:01:46 +00:00
alc
5d0912f6d8 To date, unwiring a fictitious page has produced a panic. The reason
being that PHYS_TO_VM_PAGE() returns the wrong vm_page for fictitious
pages but unwiring uses PHYS_TO_VM_PAGE().  The resulting panic
reported an unexpected wired count.  Rather than attempting to fix
PHYS_TO_VM_PAGE(), this fix takes advantage of the properties of
fictitious pages.  Specifically, fictitious pages will never be
completely unwired.  Therefore, we can keep a fictitious page's wired
count forever set to one and thereby avoid the use of
PHYS_TO_VM_PAGE() when we know that we're working with a fictitious
page, just not which one.

In collaboration with: green@, tegge@
PR: kern/29915
2004-05-22 04:53:51 +00:00
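
A minimal sketch of the idea in the commit above, assuming a toy page structure: a fictitious page's wired count is set to one when the page is created and is never changed afterwards, so wire and unwire operations do not need to locate the specific page. `struct toy_page` and the `toy_*` functions are hypothetical and much simpler than the kernel's vm_page.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified page descriptor. */
struct toy_page {
	unsigned wire_count;
	bool fictitious;
};

static void
toy_page_init(struct toy_page *m, bool fictitious)
{
	m->fictitious = fictitious;
	/* Fictitious pages start, and stay, at a wired count of one. */
	m->wire_count = fictitious ? 1 : 0;
}

static void
toy_wire(struct toy_page *m)
{
	if (m->fictitious) {
		assert(m->wire_count == 1);
		return;			/* count is pinned; nothing to do */
	}
	m->wire_count++;
}

static void
toy_unwire(struct toy_page *m)
{
	if (m->fictitious) {
		assert(m->wire_count == 1);
		return;			/* never completely unwired */
	}
	assert(m->wire_count > 0);
	m->wire_count--;
}
```
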
alc
b57e5e03fd Make vm_page's PG_ZERO flag immutable between the time of the page's
allocation and deallocation.  This flag's principal use is shortly after
allocation.  For such cases, clearing the flag is pointless.  The only
unusual use of PG_ZERO is in vfs_bio_clrbuf().  However, allocbuf() never
requests a prezeroed page.  So, vfs_bio_clrbuf() never sees a prezeroed
page.

Reviewed by:	tegge@
2004-05-06 05:03:23 +00:00
alc
dbdc402421 - Make the acquisition of Giant in vm_fault_unwire() conditional on the
pmap.  For the kernel pmap, Giant is not required.  In general, for
   other pmaps, Giant is required by i386's pmap_pte() implementation.
   Specifically, the use of PMAP2/PADDR2 is synchronized by Giant.
   Note: In principle, updates to the kernel pmap's wired count could be
   lost without Giant.  However, in practice, we never use the kernel
   pmap's wired count.  This will be resolved when pmap locking appears.
 - With the above change, cpu_thread_clean() and uma_large_free() need
   not acquire Giant.  (The first case is simply the revival of
   i386/i386/vm_machdep.c's revision 1.226 by peter.)
2004-03-10 04:44:43 +00:00
alc
aae81a61cf Correct a long-standing race condition in vm_fault() that could result in a
panic "vm_page_cache: caching a dirty page, ...": Access to the page must
be restricted or removed before calling vm_page_cache().  This race
condition is identical in nature to that which was addressed by
vm_pageout.c's revision 1.251 and vm_page.c's revision 1.275.

Reviewed by:	tegge
MFC after:	7 days
2004-02-15 00:42:26 +00:00
alc
cabea24620 - Locking for the per-process resource limits structure has eliminated
the need for Giant in vm_map_growstack().
 - Use the proc * that is passed to vm_map_growstack() rather than
   curthread->td_proc.
2004-02-05 06:33:18 +00:00
alc
5eb32a5d39 - Reduce Giant's scope in vm_fault().
- Use vm_object_reference_locked() instead of vm_object_reference()
   in vm_fault().
2003-12-26 23:33:37 +00:00
mini
918610ef5e NFC: Update stale comments.
Reviewed by:	alc
2003-11-10 00:44:00 +00:00
alc
b722f9a630 - vm_fault_copy_entry() should not assume that the source object contains
every page.  If the source entry was read-only, one or more wired pages
   could be in backing objects.
 - vm_fault_copy_entry() should not set the PG_WRITEABLE flag on the page
   unless the destination entry is, in fact, writeable.
2003-10-15 08:00:45 +00:00
alc
352d9382c0 Lock the destination object in vm_fault_copy_entry(). 2003-10-08 07:11:19 +00:00
alc
76f6c3b059 Retire vm_page_copy(). Its reason for being ended when peter@ modified
pmap_copy_page() et al. to accept a vm_page_t rather than a physical
address.  Also, this change will facilitate locking access to the vm page's
valid field.
2003-10-08 05:35:12 +00:00
alc
3272fbe303 Synchronize access to a vm page's valid field using the containing
vm object's lock.
2003-10-04 21:35:48 +00:00
alc
b1691aebe4 Migrate pmap_prefault() into the machine-independent virtual memory layer.
A small helper function pmap_is_prefaultable() is added.  This function
encapsulates the few lines of pmap_prefault() that actually vary from
machine to machine.  Note: pmap_is_prefaultable() and pmap_mincore() have
much in common.  Going forward, it's worth considering their merger.
2003-10-03 22:46:53 +00:00
alc
1644dd5fce Add vm object locking to vnode_pager_lock(). (This triggers the movement
of a VM_OBJECT_LOCK() in vm_fault().)
2003-09-18 02:26:03 +00:00
alc
06fbefe190 To implement the sequential access optimization, vm_fault() may need to
reacquire the "first" object's lock while a backing object's lock is held.
Since this is a lock-order reversal, vm_fault() uses trylock to acquire
the first object's lock, skipping the sequential access optimization in
the unlikely event that the trylock fails.
2003-08-23 06:52:32 +00:00
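
A minimal sketch of the trylock pattern described in the commit above, using a toy single-threaded lock rather than a real mutex. All names here (`toy_lock`, `toy_trylock`, `toy_unlock`, `seq_access_update_sketch`) are hypothetical. The point is that the optional work is skipped when the lock is unavailable, so the caller never blocks while acquiring locks in the reverse order.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock; a stand-in for a real mutex, not a kernel API. */
struct toy_lock {
	bool held;
};

static bool
toy_trylock(struct toy_lock *l)
{
	if (l->held)
		return (false);
	l->held = true;
	return (true);
}

static void
toy_unlock(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

/*
 * Imagined to run while a backing object's lock is already held.
 * Taking the first object's lock here would reverse the usual order,
 * so only a non-blocking attempt is made.  Returns true if the
 * optional bookkeeping was performed.
 */
static bool
seq_access_update_sketch(struct toy_lock *first_lock, long *last_read,
    long offset)
{
	if (!toy_trylock(first_lock))
		return (false);		/* skip the optimization */
	*last_read = offset;
	toy_unlock(first_lock);
	return (true);
}
```

Skipping is acceptable here only because the work is a heuristic; correctness of the fault does not depend on it.
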
alc
fa54a6610e Maintain a lock on the vm object of interest throughout vm_fault(),
releasing the lock only if we are about to sleep (e.g., vm_pager_get_pages()
or vm_pager_has_page()).  If we sleep, we have marked the vm object with
the paging-in-progress flag.
2003-06-22 21:35:41 +00:00
alc
5d0aaa2a87 As vm_fault() descends the chain of backing objects, set paging-in-
progress on the next object before clearing it on the current object.
2003-06-22 05:36:53 +00:00
alc
752ed0a2b9 Make some style and white-space changes to the copy-on-write path through
vm_fault(); remove a pointless assignment statement from that path.
2003-06-22 00:00:11 +00:00
alc
ed79b4d625 Lock one of the vm objects involved in an optimized copy-on-write fault. 2003-06-21 06:31:42 +00:00
alc
893b54638f The so-called "optimized copy-on-write fault" case should not require
the vm map lock.  What's really needed is vm object locking, which
is (for the moment) provided by Giant.

Reviewed by:	tegge
2003-06-20 04:20:36 +00:00
alc
29c6e6376c Fix a vm object reference leak in the page-based copy-on-write mechanism
used by the zero-copy sockets implementation.

Reviewed by:	gallatin
2003-06-19 01:40:44 +00:00
obrien
b0678d7a44 Use __FBSDID(). 2003-06-11 23:50:51 +00:00
jhb
d5cf4c5275 Prefer the proc lock to sched_lock when testing PS_INMEM now that it is
safe to do so.
2003-04-22 20:01:56 +00:00
alc
033a6f0bc7 - Lock the vm_object when performing vm_object_pip_wakeup(). 2003-04-20 19:25:28 +00:00
alc
5990076d78 - Lock the vm_object when performing vm_object_pip_add(). 2003-04-20 03:41:21 +00:00
jake
783ae539c3 - Add vm_paddr_t, a physical address type. This is required for systems
where physical addresses are larger than virtual addresses, such as i386s
  with PAE.
- Use this to represent physical addresses in the MI vm system and in the
  i386 pmap code.  This also changes the paddr parameter to d_mmap_t.
- Fix printf formats to handle physical addresses >4G in the i386 memory
  detection code, and due to kvtop returning vm_paddr_t instead of u_long.

Note that this is a name change only; vm_paddr_t is still the same as
vm_offset_t on all currently supported platforms.

Sponsored by:	DARPA, Network Associates Laboratories
Discussed with:	re, phk (cdevsw change)
2003-03-25 00:07:06 +00:00
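
A minimal sketch of why a separate physical address type matters, assuming a hypothetical machine with 32-bit virtual and 64-bit physical addresses. The typedefs `toy_vaddr_t` and `toy_paddr_t` and both functions are illustrative names, not the kernel's definitions (the commit itself notes that vm_paddr_t was still the same as vm_offset_t at the time). Casting to `uintmax_t` and printing with `%jx` is one portable way to keep the format correct whatever width the typedef has.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical widths: 32-bit virtual, 64-bit physical addresses. */
typedef uint32_t toy_vaddr_t;
typedef uint64_t toy_paddr_t;

/* True if the physical address could be held in a virtual-address-sized type. */
static bool
paddr_fits_vaddr(toy_paddr_t pa)
{
	return (pa <= UINT32_MAX);
}

/*
 * Format a physical address as hexadecimal into buf.  Returns what
 * snprintf() returns: the length the full string needs, excluding the
 * terminating NUL.
 */
static int
format_paddr(char *buf, size_t len, toy_paddr_t pa)
{
	assert(buf != NULL && len > 0);
	return (snprintf(buf, len, "%#jx", (uintmax_t)pa));
}
```

An address at the 4G boundary no longer fits the narrower type, which is the case the printf format fixes in the message are about.
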
ken
471eab1868 Zero copy send and receive fixes:
- On receive, vm_map_lookup() needs to trigger the creation of a shadow
  object.  To make that happen, call vm_map_lookup() with PROT_WRITE
  instead of PROT_READ in vm_pgmoveco().

- On send, a shadow object will be created by the vm_map_lookup() in
  vm_fault(), but vm_page_cowfault() will delete the original page from
  the backing object rather than simply letting the legacy COW mechanism
  take over.  In other words, the new page should be added to the shadow
  object rather than replacing the old page in the backing object.  (i.e.
  vm_page_cowfault() should not be called in this case.)  We accomplish
  this by making sure fs.object == fs.first_object before calling
  vm_page_cowfault() in vm_fault().

Submitted by:	gallatin, alc
Tested by:	ken
2003-03-08 06:58:18 +00:00
alc
c50367da67 Remove ENABLE_VFS_IOOPT. It is a long unfinished work-in-progress.
Discussed on:	arch@
2003-03-06 03:41:02 +00:00
dillon
aba727244c Merge all the various copies of vm_fault_quick() into a single
portable copy.
2003-01-16 00:02:21 +00:00
alc
e45ec3c803 vm_fault_copy_entry() needn't clear PG_ZERO because it didn't pass
VM_ALLOC_ZERO to vm_page_alloc().
2003-01-12 07:33:16 +00:00
alc
3f894298eb Reduce the number of times that we acquire and release the page queues
lock by making vm_page_rename()'s caller, rather than vm_page_rename(),
responsible for acquiring it.
2002-12-29 07:17:06 +00:00
alc
43045f3d3b - Hold the page queues lock around calls to vm_page_flag_clear(). 2002-12-24 19:02:03 +00:00
alc
3f31cbe67c - Hold the page queues lock when performing vm_page_busy() or
vm_page_flag_set().
 - Replace vm_page_sleep_busy() with proper page queues locking
   and vm_page_sleep_if_busy().
2002-12-19 01:20:24 +00:00
alc
5e336b1d19 Now that pmap_remove_all() is exported by our pmap implementations
use it directly.
2002-11-16 07:44:25 +00:00
alc
fc8a5bc419 When prot is VM_PROT_NONE, call pmap_page_protect() directly rather than
indirectly through vm_page_protect().  The one remaining page flag that
is updated by vm_page_protect() is already being updated by our various
pmap implementations.

Note: A later commit will similarly change the VM_PROT_READ case and
eliminate vm_page_protect().
2002-11-10 07:12:04 +00:00
alc
22918c79b0 Complete the page queues locking needed for the page-based copy-
on-write (COW) mechanism.  (This mechanism is used by the zero-copy
TCP/IP implementation.)
 - Extend the scope of the page queues lock in vm_fault()
   to cover vm_page_cowfault().
 - Modify vm_page_cowfault() to release the page queues lock
   if it sleeps.
2002-10-19 18:34:39 +00:00
alc
d5f256dae2 o Retire pmap_pageable(). It's an advisory routine that none
of our platforms implements.
2002-08-25 04:20:05 +00:00