freebsd-skq

Author	SHA1	Message	Date
Alan Cox	ec96dca788	Simplify the inner loop of vm_fault()'s delete-behind heuristic. Instead of checking each page for PG_UNMANAGED, perform a one-time check whether the object is OBJT_PHYS. (PG_UNMANAGED pages only belong to OBJT_PHYS objects.)	2008-03-16 17:37:19 +00:00
Alan Cox	593e717ec9	Eliminate an unnecessary test from vm_fault's delete-behind heuristic. Specifically, since the delete-behind heuristic is never applied to a device-backed object, there is no point in checking whether each of the object's pages is fictitious. (Only device-backed objects have fictitious pages.)	2008-03-09 06:08:58 +00:00
Alan Cox	eb2a051720	Add an access type parameter to pmap_enter(). It will be used to implement superpage promotion. Correct a style error in kmem_malloc(): pmap_enter()'s last parameter is a Boolean.	2008-01-03 07:34:34 +00:00
Alan Cox	f8a47341fe	Add the superpage reservation system. This is "part 2 of 2" of the machine-independent support for superpages. (The earlier part was the rewrite of the physical memory allocator.) The remainder of the code required for superpages support is machine-dependent and will be added to the various pmap implementations at a later date. Initially, I am only supporting one large page size per architecture. Moreover, I am only enabling the reservation system on amd64. (In an emergency, it can be disabled by setting VM_NRESERVLEVELS to 0 in amd64/include/vmparam.h or your kernel configuration file.)	2007-12-29 19:53:04 +00:00
Konstantin Belousov	4ab8ab9285	Do not dereference NULL pointer. Reported by: Peter Holm Reviewed by: alc Approved by: re (kensmith)	2007-10-08 20:09:53 +00:00
Alan Cox	7bfda801a8	Change the management of cached pages (PQ_CACHE) in two fundamental ways: (1) Cached pages are no longer kept in the object's resident page splay tree and memq. Instead, they are kept in a separate per-object splay tree of cached pages. However, access to this new per-object splay tree is synchronized by the _free_ page queues lock, not to be confused with the heavily contended page queues lock. Consequently, a cached page can be reclaimed by vm_page_alloc(9) without acquiring the object's lock or the page queues lock. This solves a problem independently reported by tegge@ and Isilon. Specifically, they observed the page daemon consuming a great deal of CPU time because of pages bouncing back and forth between the cache queue (PQ_CACHE) and the inactive queue (PQ_INACTIVE). The source of this problem turned out to be a deadlock avoidance strategy employed when selecting a cached page to reclaim in vm_page_select_cache(). However, the root cause was really that reclaiming a cached page required the acquisition of an object lock while the page queues lock was already held. Thus, this change addresses the problem at its root, by eliminating the need to acquire the object's lock. Moreover, keeping cached pages in the object's primary splay tree and memq was, in effect, optimizing for the uncommon case. Cached pages are reclaimed far, far more often than they are reactivated. Instead, this change makes reclamation cheaper, especially in terms of synchronization overhead, and reactivation more expensive, because reactivated pages will have to be reentered into the object's primary splay tree and memq. (2) Cached pages are now stored alongside free pages in the physical memory allocator's buddy queues, increasing the likelihood that large allocations of contiguous physical memory (i.e., superpages) will succeed. 
Finally, as a result of this change long-standing restrictions on when and where a cached page can be reclaimed and returned by vm_page_alloc(9) are eliminated. Specifically, calls to vm_page_alloc(9) specifying VM_ALLOC_INTERRUPT can now reclaim and return a formerly cached page. Consequently, a call to malloc(9) specifying M_NOWAIT is less likely to fail. Discussed with: many over the course of the summer, including jeff@, Justin Husted @ Isilon, peter@, tegge@ Tested by: an earlier version by kris@ Approved by: re (kensmith)	2007-09-25 06:25:06 +00:00
Alan Cox	806453645a	Two changes to vm_fault_additional_pages(): 1. Rewrite the backward scan. Specifically, reverse the order in which pages are allocated so that upon failure it is never necessary to free pages that were just allocated. Moreover, any allocated pages can be put to use. This makes the backward scan behave just like the forward scan. 2. Eliminate an explicit, unsynchronized check for low memory before calling vm_page_alloc(). It serves no useful purpose. It is, in effect, optimizing the uncommon case at the expense of the common case. Approved by: re (hrs) MFC after: 3 weeks	2007-07-20 06:55:11 +00:00
Alan Cox	d1974c0df1	Eliminate the special case handling of OBJT_DEVICE objects in vm_fault_additional_pages() that was introduced in revision 1.47. Then as now, it is unnecessary because dev_pager_haspage() returns zero for both the number of pages to read ahead and read behind, producing the same exact behavior by vm_fault_additional_pages() as the special case handling. Approved by: re (rwatson)	2007-07-08 19:42:52 +00:00
Alan Cox	65ea29a690	When a cached page is reactivated in vm_fault(), update the counter that tracks the total number of reactivated pages. (We have not been counting reactivations by vm_fault() since revision 1.46.) Correct a comment in vm_fault_additional_pages(). Approved by: re (kensmith) MFC after: 1 week	2007-07-06 21:25:21 +00:00
Matt Jacob	9dae729081	Initialize reqpage to zero.	2007-06-17 04:14:27 +00:00
Attilio Rao	6759608248	Rework the PCPU_* (MD) interface: - Rename PCPU_LAZY_INC into PCPU_INC - Add the PCPU_ADD interface which just does an add on the pcpu member given a specific value. Note that for most architectures PCPU_INC and PCPU_ADD are not safe. This is a point that needs some discussions/work in the next days. Reviewed by: alc, bde Approved by: jeff (mentor)	2007-06-04 21:38:48 +00:00
Jeff Roberson	1c4bcd050a	- Move rusage from being per-process in struct pstats to per-thread in td_ru. This removes the requirement for per-process synchronization in statclock() and mi_switch(). This was previously supported by sched_lock which is going away. All modifications to rusage are now done in the context of the owning thread. Reads proceed without locks. - Aggregate exiting threads' rusage in thread_exit() such that the exiting thread's rusage is not lost. - Provide a new routine, rufetch(), to fetch an aggregate of all rusage structures from all threads in a process. This routine must be used in any place requiring a rusage from a process prior to its exit. The exited process's rusage is still available via p_ru. - Aggregate tick statistics only on demand via rufetch() or when a thread exits. Tick statistics are kept in the thread and protected by sched_lock until it exits. Initial patch by: attilio Reviewed by: attilio, bde (some objections), arch (mostly silent)	2007-06-01 01:12:45 +00:00
Attilio Rao	2feb50bf7d	Revert the introduction of the VMCNT_* operations. A general approach is probably not the best solution here, so we should solve the sched_lock protection problems separately. Requested by: alc Approved by: jeff (mentor)	2007-05-31 22:52:15 +00:00
Alan Cox	cf4682ae23	Eliminate the reactivation of cached pages in vm_fault_prefault() and vm_map_pmap_enter() unless the caller is madvise(MADV_WILLNEED). With the exception of calls to vm_map_pmap_enter() from madvise(MADV_WILLNEED), vm_fault_prefault() and vm_map_pmap_enter() are both used to create speculative mappings. Thus, always reactivating cached pages is a mistake. In principle, cached pages should only be reactivated by an actual access. Otherwise, the following misbehavior can occur. On a hard fault for a text page the clustering algorithm fetches not only the required page but also several of the adjacent pages. Now, suppose that one or more of the adjacent pages are never accessed. Ultimately, these unused pages become cached pages through the efforts of the page daemon. However, the next activation of the executable reactivates and maps these unused pages. Consequently, they are never replaced. In effect, they become pinned in memory.	2007-05-22 04:45:59 +00:00
Jeff Roberson	222d01951f	- define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating vmcnts. This can be used to abstract away pcpu details but also changes to use atomics for all counters now. This means sched lock is no longer responsible for protecting counts in the switch routines. Contributed by: Attilio Rao <attilio@FreeBSD.org>	2007-05-18 07:10:50 +00:00
Pawel Jakub Dawidek	fcdd9721e4	Fix a problem for file systems that don't implement the VOP_BMAP() operation. The problem is this: vm_fault_additional_pages() calls vm_pager_has_page(), which calls vnode_pager_haspage(). Now when VOP_BMAP() returns an error (e.g., EOPNOTSUPP), vnode_pager_haspage() returns TRUE without initializing the 'before' and 'after' arguments, so we have some accidental values there. This basically was causing this condition to be met: if ((rahead + rbehind) > ((cnt.v_free_count + cnt.v_cache_count) - cnt.v_free_reserved)) { pagedaemon_wakeup(); [...] } (we have some random values in the rahead and rbehind variables) I'm not entirely sure this is the right fix, maybe we should just return FALSE in vnode_pager_haspage() when VOP_BMAP() fails? alc@ knows about this problem, maybe he will be able to come up with a better fix if this is not the right one.	2007-04-05 20:49:46 +00:00
Alan Cox	768131d293	vm_page_busy() no longer requires the page queues lock to be held. Reduce the scope of the page queues lock in vm_fault() accordingly.	2007-03-23 06:11:25 +00:00
Alan Cox	d8810d894d	Use PCPU_LAZY_INC() to update page fault statistics.	2007-03-05 18:55:14 +00:00
Alan Cox	44b8bd66f9	Make pmap_enter() responsible for setting PG_WRITEABLE instead of its caller. (As a beneficial side-effect, a high-contention acquisition of the page queues lock in vm_fault() is eliminated.)	2006-11-12 21:48:34 +00:00
Alan Cox	66bdd5d619	The page queues lock is no longer required by vm_page_wakeup().	2006-10-23 05:27:31 +00:00
Alan Cox	9af80719db	Replace PG_BUSY with VPO_BUSY. In other words, changes to the page's busy flag, i.e., VPO_BUSY, are now synchronized by the per-vm object lock instead of the global page queues lock.	2006-10-22 04:28:14 +00:00
Alan Cox	9fea8cad08	Eliminate unnecessary PG_BUSY tests. They originally served a purpose that is now handled by vm object locking.	2006-10-21 21:02:04 +00:00
Alan Cox	b146f9e5d2	Reimplement the page's NOSYNC flag as an object-synchronized instead of a page queues-synchronized flag. Reduce the scope of the page queues lock in vm_fault() accordingly. Move vm_fault()'s call to vm_object_set_writeable_dirty() outside of the scope of the page queues lock. Reviewed by: tegge Additionally, eliminate an unnecessary dereference in computing the argument that is passed to vm_object_set_writeable_dirty().	2006-08-13 00:11:09 +00:00
Alan Cox	e7e56b2889	Eliminate the acquisition and release of the page queues lock around a call to vm_page_sleep_if_busy().	2006-08-06 00:17:17 +00:00
Alan Cox	2cf139527c	Retire debug.mpsafevm. None of the architectures supported in CVS require it any longer.	2006-07-21 23:22:49 +00:00
Stephan Uphoff	2053c12705	Remove mpte optimization from pmap_enter_quick(). There is a race with the current locking scheme and removing it should have no measurable performance impact. This fixes page faults leading to panics in pmap_enter_quick_locked() on amd64/i386. Reviewed by: alc,jhb,peter,ps	2006-06-15 01:01:06 +00:00
Alan Cox	8f8790a76d	Simplify the implementation of vm_fault_additional_pages() based upon the object's memq being ordered. Specifically, replace repeated calls to vm_page_lookup() by two simple constant-time operations. Reviewed by: tegge	2006-05-13 20:05:44 +00:00
Warner Losh	62a59e8f0d	Remove leading __ from __(inline\|const\|signed\|volatile). They are obsolete. This should reduce diffs to NetBSD as well.	2006-03-08 06:31:46 +00:00
Tor Egge	44ed341759	Adjust old comment (present in rev 1.1) to match changes in rev 1.82. PR: kern/92509 Submitted by: "Bryan Venteicher" <bryanv@daemoninthecloset.org>	2006-02-02 21:55:38 +00:00
Alan Cox	82eedee4a4	Use the new macros abstracting the page coloring/queues implementation. (There are no functional changes.)	2006-01-27 08:35:32 +00:00
Alexander Leidinger	ef39c05baa	MI changes: - provide an interface (macros) to the page coloring part of the VM system; this allows trying different coloring algorithms without the need to touch every file [1] - make the page queue tuning values readable: sysctl vm.stats.pagequeue - autotuning of the page coloring values based upon the cache size instead of options in the kernel config (disabling of the page coloring as a kernel option is still possible) MD changes: - detection of the cache size: only IA32 and AMD64 (untested) contain cache size detection code, every other arch just comes with a dummy function (this results in the use of default values, as was the case without the autotuning of the page coloring) - print some more info on Intel CPUs (like we do on AMD and Transmeta CPUs) Note to AMD owners (IA32 and AMD64): please run "sysctl vm.stats.pagequeue" and report if the cache* values are zero (= bug in the cache detection code) or not. Based upon work by: Chad David <davidc@acns.ab.ca> [1] Reviewed by: alc, arch (in 2004) Discussed with: alc, Chad David, arch (in 2004)	2005-12-31 14:39:20 +00:00
Tor Egge	b898bb1be3	Don't access fs->first_object after dropping reference to it. The result could be a missed or extra giant unlock. Reviewed by: alc	2005-12-20 12:27:59 +00:00
Alan Cox	05406e6f33	Remove unneeded calls to pmap_remove_all(). The given page is not mapped. Reviewed by: tegge	2005-12-11 22:06:57 +00:00
Alan Cox	57b5187b16	Eliminate an incorrect cast.	2005-09-07 01:42:30 +00:00
Alan Cox	ba8bca610c	Pass a value of type vm_prot_t to pmap_enter_quick() so that it can determine whether the mapping should permit execute access.	2005-09-03 18:20:20 +00:00
John Baldwin	b9d3b80521	Convert a remaining !fs.map->system_map to fs.first_object->flags & OBJ_NEEDGIANT test that was missed in an earlier revision. This fixes mutex assertion failures in the debug.mpsafevm=0 case. Reported by: ps MFC after: 3 days	2005-07-14 21:18:07 +00:00
Peter Grehan	10b00dd4f3	The final test in unlock_and_deallocate() to determine if GIANT needs to be unlocked wasn't updated to check for OBJ_NEEDGIANT. This caused a WITNESS panic when debug_mpsafevm was set to 0. Approved by: jeffr	2005-05-12 04:09:41 +00:00
Jeff Roberson	ed4fe4f4f5	- Add a new object flag "OBJ_NEEDSGIANT". We set this flag if the underlying vnode requires Giant. - In vm_fault only acquire Giant if the underlying object has NEEDSGIANT set. - In vm_object_shadow inherit the NEEDSGIANT flag from the backing object.	2005-05-03 11:11:26 +00:00
Jeff Roberson	ae51ff1127	- Remove GIANT_REQUIRED where giant is no longer required. - Use VFS_LOCK_GIANT() rather than directly acquiring giant in places where giant is only held because vfs requires it. Sponsored By: Isilon Systems, Inc.	2005-01-24 10:48:29 +00:00
Warner Losh	60727d8b86	/* -> /*- for license, minor formatting changes	2005-01-07 02:29:27 +00:00
Alan Cox	a51b084059	Continue the transition from synchronizing access to the page's PG_BUSY flag and busy field with the global page queues lock to synchronizing their access with the containing object's lock. Specifically, acquire the containing object's lock before reading the page's PG_BUSY flag and busy field in vm_fault(). Reviewed by: tegge@	2004-12-24 19:31:54 +00:00
Alan Cox	1f70d62298	Modify pmap_enter_quick() so that it expects the page queues to be locked on entry and it assumes the responsibility for releasing the page queues lock if it must sleep. Remove a bogus comment from pmap_enter_quick(). Using the first change, modify vm_map_pmap_enter() so that the page queues lock is acquired and released once, rather than each time that a page is mapped.	2004-12-23 20:16:11 +00:00
Alan Cox	85f5b24573	In the common case, pmap_enter_quick() completes without sleeping. In such cases, the busying of the page and the unlocking of the containing object by vm_map_pmap_enter() and vm_fault_prefault() is unnecessary overhead. To eliminate this overhead, this change modifies pmap_enter_quick() so that it expects the object to be locked on entry and it assumes the responsibility for busying the page and unlocking the object if it must sleep. Note: alpha, amd64, i386 and ia64 are the only implementations optimized by this change; arm, powerpc, and sparc64 still conservatively busy the page and unlock the object within every pmap_enter_quick() call. Additionally, this change is the first case where we synchronize access to the page's PG_BUSY flag and busy field using the containing object's lock rather than the global page queues lock. (Modifications to the page's PG_BUSY flag and busy field have asserted both locks for several weeks, enabling an incremental transition.)	2004-12-15 19:55:05 +00:00
Alan Cox	950d5f7a99	Remove unnecessary check for curthread == NULL.	2004-10-17 20:29:28 +00:00
Alan Cox	5e4bdb57cb	System maps are prohibited from mapping vnode-backed objects. Take advantage of this restriction to avoid acquiring and releasing Giant when wiring pages within a system map. In collaboration with: tegge@	2004-09-11 18:49:59 +00:00
Alan Cox	94ddc7076d	Push Giant deep into vm_forkproc(), acquiring it only if the process has mapped System V shared memory segments (see shmfork_myhook()) or requires the allocation of an ldt (see vm_fault_wire()).	2004-09-03 05:11:32 +00:00
Alan Cox	1a95d74419	In vm_fault_unwire() eliminate the acquisition and release of Giant in the case of non-kernel pmaps.	2004-09-01 19:18:59 +00:00
Alan Cox	3268a1bf75	In the previous revision, I failed to condition an early release of Giant in vm_fault() on debug_mpsafevm. If debug_mpsafevm was not set, the result was an assertion failure early in the boot process. Reported by: green@	2004-08-22 00:08:43 +00:00
Alan Cox	b99e61353f	Further reduce the use of Giant by vm_fault(): Giant is held only when manipulating a vnode, e.g., calling vput(). This reduces contention for Giant during many copy-on-write faults, resulting in some additional speedup on SMPs. Note: debug_mpsafevm must be enabled for this optimization to take effect.	2004-08-21 19:20:21 +00:00
Alan Cox	c1fbc251cd	- Introduce and use a new tunable "debug.mpsafevm". At present, setting "debug.mpsafevm" results in (almost) Giant-free execution of zero-fill page faults. (Giant is held only briefly, just long enough to determine if there is a vnode backing the faulting address.) Also, condition the acquisition and release of Giant around calls to pmap_remove() on "debug.mpsafevm". The effect on performance is significant. On my dual Opteron, I see a 3.6% reduction in "buildworld" time. - Use atomic operations to update several counters in vm_fault().	2004-08-16 06:16:12 +00:00
Tor Egge	19dc560756	The vm map lock is needed in vm_fault() after the page has been found, to avoid later changes before pmap_enter() and vm_fault_prefault() have completed. Simplify deadlock avoidance by not blocking on vm map relookup. In collaboration with: alc	2004-08-12 20:14:49 +00:00
Alan Cox	8673599662	Make two changes to vm_fault(). 1. Move a comment to its proper place, updating it. (Except for white- space, this comment had been unchanged since revision 1.1!) 2. Remove spl calls.	2004-08-09 18:46:39 +00:00
Alan Cox	eebf3286a6	Make two changes to vm_fault(). 1. Retain the map lock until after the calls to pmap_enter() and vm_fault_prefault(). 2. Remove a stale comment. Submitted by: tegge@	2004-08-09 06:01:46 +00:00
Alan Cox	4be14af9cf	To date, unwiring a fictitious page has produced a panic. The reason being that PHYS_TO_VM_PAGE() returns the wrong vm_page for fictitious pages but unwiring uses PHYS_TO_VM_PAGE(). The resulting panic reported an unexpected wired count. Rather than attempting to fix PHYS_TO_VM_PAGE(), this fix takes advantage of the properties of fictitious pages. Specifically, fictitious pages will never be completely unwired. Therefore, we can keep a fictitious page's wired count forever set to one and thereby avoid the use of PHYS_TO_VM_PAGE() when we know that we're working with a fictitious page, just not which one. In collaboration with: green@, tegge@ PR: kern/29915	2004-05-22 04:53:51 +00:00
Alan Cox	5a32489377	Make vm_page's PG_ZERO flag immutable between the time of the page's allocation and deallocation. This flag's principal use is shortly after allocation. For such cases, clearing the flag is pointless. The only unusual use of PG_ZERO is in vfs_bio_clrbuf(). However, allocbuf() never requests a prezeroed page. So, vfs_bio_clrbuf() never sees a prezeroed page. Reviewed by: tegge@	2004-05-06 05:03:23 +00:00
Alan Cox	5d328ed44b	- Make the acquisition of Giant in vm_fault_unwire() conditional on the pmap. For the kernel pmap, Giant is not required. In general, for other pmaps, Giant is required by i386's pmap_pte() implementation. Specifically, the use of PMAP2/PADDR2 is synchronized by Giant. Note: In principle, updates to the kernel pmap's wired count could be lost without Giant. However, in practice, we never use the kernel pmap's wired count. This will be resolved when pmap locking appears. - With the above change, cpu_thread_clean() and uma_large_free() need not acquire Giant. (The first case is simply the revival of i386/i386/vm_machdep.c's revision 1.226 by peter.)	2004-03-10 04:44:43 +00:00
Alan Cox	c6d9ef2e1f	Correct a long-standing race condition in vm_fault() that could result in a panic "vm_page_cache: caching a dirty page, ...": Access to the page must be restricted or removed before calling vm_page_cache(). This race condition is identical in nature to that which was addressed by vm_pageout.c's revision 1.251 and vm_page.c's revision 1.275. Reviewed by: tegge MFC after: 7 days	2004-02-15 00:42:26 +00:00
Alan Cox	bfee999d6a	- Locking for the per-process resource limits structure has eliminated the need for Giant in vm_map_growstack(). - Use the proc * that is passed to vm_map_growstack() rather than curthread->td_proc.	2004-02-05 06:33:18 +00:00
Alan Cox	a976eb5e46	- Reduce Giant's scope in vm_fault(). - Use vm_object_reference_locked() instead of vm_object_reference() in vm_fault().	2003-12-26 23:33:37 +00:00
Jonathan Mini	8f101a2f31	NFC: Update stale comments. Reviewed by: alc	2003-11-10 00:44:00 +00:00
Alan Cox	c5b65a6723	- vm_fault_copy_entry() should not assume that the source object contains every page. If the source entry was read-only, one or more wired pages could be in backing objects. - vm_fault_copy_entry() should not set the PG_WRITEABLE flag on the page unless the destination entry is, in fact, writeable.	2003-10-15 08:00:45 +00:00
Alan Cox	8afcf0cc36	Lock the destination object in vm_fault_copy_entry().	2003-10-08 07:11:19 +00:00
Alan Cox	669890eaeb	Retire vm_page_copy(). Its reason for being ended when peter@ modified pmap_copy_page() et al. to accept a vm_page_t rather than a physical address. Also, this change will facilitate locking access to the vm page's valid field.	2003-10-08 05:35:12 +00:00
Alan Cox	cbfbaad8be	Synchronize access to a vm page's valid field using the containing vm object's lock.	2003-10-04 21:35:48 +00:00
Alan Cox	566526a957	Migrate pmap_prefault() into the machine-independent virtual memory layer. A small helper function pmap_is_prefaultable() is added. This function encapsulates the few lines of pmap_prefault() that actually vary from machine to machine. Note: pmap_is_prefaultable() and pmap_mincore() have much in common. Going forward, it's worth considering their merger.	2003-10-03 22:46:53 +00:00
Alan Cox	417a26a154	Add vm object locking to vnode_pager_lock(). (This triggers the movement of a VM_OBJECT_LOCK() in vm_fault().)	2003-09-18 02:26:03 +00:00
Alan Cox	8d8b9c6e70	To implement the sequential access optimization, vm_fault() may need to reacquire the "first" object's lock while a backing object's lock is held. Since this is a lock-order reversal, vm_fault() uses trylock to acquire the first object's lock, skipping the sequential access optimization in the unlikely event that the trylock fails.	2003-08-23 06:52:32 +00:00
Alan Cox	f29ba63ec9	Maintain a lock on the vm object of interest throughout vm_fault(), releasing the lock only if we are about to sleep (e.g., vm_pager_get_pages() or vm_pager_has_pages()). If we sleep, we have marked the vm object with the paging-in-progress flag.	2003-06-22 21:35:41 +00:00
Alan Cox	c8567c3a77	As vm_fault() descends the chain of backing objects, set paging-in- progress on the next object before clearing it on the current object.	2003-06-22 05:36:53 +00:00
Alan Cox	d98ddc4615	Make some style and white-space changes to the copy-on-write path through vm_fault(); remove a pointless assignment statement from that path.	2003-06-22 00:00:11 +00:00
Alan Cox	ebf7512532	Lock one of the vm objects involved in an optimized copy-on-write fault.	2003-06-21 06:31:42 +00:00
Alan Cox	e50346b5e0	The so-called "optimized copy-on-write fault" case should not require the vm map lock. What's really needed is vm object locking, which is (for the moment) provided by Giant. Reviewed by: tegge	2003-06-20 04:20:36 +00:00
Alan Cox	d18e8afe99	Fix a vm object reference leak in the page-based copy-on-write mechanism used by the zero-copy sockets implementation. Reviewed by: gallatin	2003-06-19 01:40:44 +00:00
David E. O'Brien	874651b13c	Use __FBSDID().	2003-06-11 23:50:51 +00:00
John Baldwin	eeec6bab2e	Prefer the proc lock to sched_lock when testing PS_INMEM now that it is safe to do so.	2003-04-22 20:01:56 +00:00
Alan Cox	b009d5a0af	- Lock the vm_object when performing vm_object_pip_wakeup().	2003-04-20 19:25:28 +00:00
Alan Cox	d22bc7101c	- Lock the vm_object when performing vm_object_pip_add().	2003-04-20 03:41:21 +00:00
Jake Burkholder	227f9a1c58	- Add vm_paddr_t, a physical address type. This is required for systems where physical addresses are larger than virtual addresses, such as i386s with PAE. - Use this to represent physical addresses in the MI vm system and in the i386 pmap code. This also changes the paddr parameter to d_mmap_t. - Fix printf formats to handle physical addresses >4G in the i386 memory detection code, and due to kvtop returning vm_paddr_t instead of u_long. Note that this is a name change only; vm_paddr_t is still the same as vm_offset_t on all currently supported platforms. Sponsored by: DARPA, Network Associates Laboratories Discussed with: re, phk (cdevsw change)	2003-03-25 00:07:06 +00:00
Kenneth D. Merry	9b80d344ec	Zero copy send and receive fixes: - On receive, vm_map_lookup() needs to trigger the creation of a shadow object. To make that happen, call vm_map_lookup() with PROT_WRITE instead of PROT_READ in vm_pgmoveco(). - On send, a shadow object will be created by the vm_map_lookup() in vm_fault(), but vm_page_cowfault() will delete the original page from the backing object rather than simply letting the legacy COW mechanism take over. In other words, the new page should be added to the shadow object rather than replacing the old page in the backing object. (i.e. vm_page_cowfault() should not be called in this case.) We accomplish this by making sure fs.object == fs.first_object before calling vm_page_cowfault() in vm_fault(). Submitted by: gallatin, alc Tested by: ken	2003-03-08 06:58:18 +00:00
Alan Cox	09c80124a3	Remove ENABLE_VFS_IOOPT. It is a long unfinished work-in-progress. Discussed on: arch@	2003-03-06 03:41:02 +00:00
Matthew Dillon	e3669cee72	Merge all the various copies of vm_fault_quick() into a single portable copy.	2003-01-16 00:02:21 +00:00
Alan Cox	1761f1829d	vm_fault_copy_entry() needn't clear PG_ZERO because it didn't pass VM_ALLOC_ZERO to vm_page_alloc().	2003-01-12 07:33:16 +00:00
Alan Cox	a28cc55e5b	Reduce the number of times that we acquire and release the page queues lock by making vm_page_rename()'s caller, rather than vm_page_rename(), responsible for acquiring it.	2002-12-29 07:17:06 +00:00
Alan Cox	82ea080d88	- Hold the page queues lock around calls to vm_page_flag_clear().	2002-12-24 19:02:03 +00:00
Alan Cox	9a96b6382a	- Hold the page queues lock when performing vm_page_busy() or vm_page_flag_set(). - Replace vm_page_sleep_busy() with proper page queues locking and vm_page_sleep_if_busy().	2002-12-19 01:20:24 +00:00
Alan Cox	4fec79bef8	Now that pmap_remove_all() is exported by our pmap implementations use it directly.	2002-11-16 07:44:25 +00:00
Alan Cox	d154fb4fe6	When prot is VM_PROT_NONE, call pmap_page_protect() directly rather than indirectly through vm_page_protect(). The one remaining page flag that is updated by vm_page_protect() is already being updated by our various pmap implementations. Note: A later commit will similarly change the VM_PROT_READ case and eliminate vm_page_protect().	2002-11-10 07:12:04 +00:00
Alan Cox	f4ecdf056e	Complete the page queues locking needed for the page-based copy- on-write (COW) mechanism. (This mechanism is used by the zero-copy TCP/IP implementation.) - Extend the scope of the page queues lock in vm_fault() to cover vm_page_cowfault(). - Modify vm_page_cowfault() to release the page queues lock if it sleeps.	2002-10-19 18:34:39 +00:00
Alan Cox	6508a194aa	o Retire pmap_pageable(). It's an advisory routine that none of our platforms implements.	2002-08-25 04:20:05 +00:00
Alan Cox	fff6062ab6	o Retire vm_page_zero_fill() and vm_page_zero_fill_area(). Ever since pmap_zero_page() and pmap_zero_page_area() were modified to accept a struct vm_page * instead of a physical address, vm_page_zero_fill() and vm_page_zero_fill_area() have served no purpose.	2002-08-25 00:22:31 +00:00
Alan Cox	a9911f9a0f	o Move a call to vm_page_wakeup() inside the scope of the page queues lock.	2002-08-10 23:27:06 +00:00
Alan Cox	db44450b11	o Remove the setting and clearing of the PG_MAPPED flag. (This flag is obsolete.)	2002-08-10 07:11:16 +00:00
Alan Cox	4abd55b296	o Lock page queue accesses by vm_page_activate().	2002-07-27 07:20:27 +00:00
Alan Cox	ef594d3186	o Merge vm_fault_wire() and vm_fault_user_wire() by adding a new parameter, user_wire.	2002-07-24 19:47:56 +00:00
Alan Cox	2ad9827349	o Lock page queue accesses by vm_page_free() and vm_page_deactivate().	2002-07-21 21:20:57 +00:00
Alan Cox	15a5d2108e	o Lock page queue accesses by vm_page_cache() in vm_fault() and vm_pageout_scan(). (The others are already locked.) o Assert that the page queues lock is held in vm_page_cache().	2002-07-20 19:34:21 +00:00
Alan Cox	2d09a6ad97	o Lock some page queue accesses, in particular, those by vm_page_unwire().	2002-07-13 19:24:04 +00:00
Kenneth D. Merry	98cb733c67	At long last, commit the zero copy sockets code. MAKEDEV: Add MAKEDEV glue for the ti(4) device nodes. ti.4: Update the ti(4) man page to include information on the TI_JUMBO_HDRSPLIT and TI_PRIVATE_JUMBOS kernel options, and also include information about the new character device interface and the associated ioctls. man9/Makefile: Add jumbo.9 and zero_copy.9 man pages and associated links. jumbo.9: New man page describing the jumbo buffer allocator interface and operation. zero_copy.9: New man page describing the general characteristics of the zero copy send and receive code, and what an application author should do to take advantage of the zero copy functionality. NOTES: Add entries for ZERO_COPY_SOCKETS, TI_PRIVATE_JUMBOS, TI_JUMBO_HDRSPLIT, MSIZE, and MCLSHIFT. conf/files: Add uipc_jumbo.c and uipc_cow.c. conf/options: Add the 5 options mentioned above. kern_subr.c: Receive side zero copy implementation. This takes "disposable" pages attached to an mbuf, gives them to a user process, and then recycles the user's page. This is only active when ZERO_COPY_SOCKETS is turned on and the kern.ipc.zero_copy.receive sysctl variable is set to 1. uipc_cow.c: Send side zero copy functions. Takes a page written by the user and maps it copy on write and assigns it kernel virtual address space. Removes copy on write mapping once the buffer has been freed by the network stack. uipc_jumbo.c: Jumbo disposable page allocator code. This allocates (optionally) disposable pages for network drivers that want to give the user the option of doing zero copy receive. uipc_socket.c: Add kern.ipc.zero_copy.{send,receive} sysctls that are enabled if ZERO_COPY_SOCKETS is turned on. Add zero copy send support to sosend() -- pages get mapped into the kernel instead of getting copied if they meet size and alignment restrictions. uipc_syscalls.c:Un-staticize some of the sf* functions so that they can be used elsewhere. 
(uipc_cow.c) if_media.c: In the SIOCGIFMEDIA ioctl in ifmedia_ioctl(), avoid calling malloc() with M_WAITOK. Return an error if the M_NOWAIT malloc fails. The ti(4) driver and the wi(4) driver, at least, call this with a mutex held. This causes witness warnings for 'ifconfig -a' with a wi(4) or ti(4) board in the system. (I've only verified for ti(4)). ip_output.c: Fragment large datagrams so that each segment contains a multiple of PAGE_SIZE amount of data plus headers. This allows the receiver to potentially do page flipping on receives. if_ti.c: Add zero copy receive support to the ti(4) driver. If TI_PRIVATE_JUMBOS is not defined, it now uses the jumbo(9) buffer allocator for jumbo receive buffers. Add a new character device interface for the ti(4) driver for the new debugging interface. This allows (a patched version of) gdb to talk to the Tigon board and debug the firmware. There are also a few additional debugging ioctls available through this interface. Add header splitting support to the ti(4) driver. Tweak some of the default interrupt coalescing parameters to more useful defaults. Add hooks for supporting transmit flow control, but leave it turned off with a comment describing why it is turned off. if_tireg.h: Change the firmware rev to 12.4.11, since we're really at 12.4.11 plus fixes from 12.4.13. Add defines needed for debugging. Remove the ti_stats structure, it is now defined in sys/tiio.h. ti_fw.h: 12.4.11 firmware. ti_fw2.h: 12.4.11 firmware, plus selected fixes from 12.4.13, and my header splitting patches. Revision 12.4.13 doesn't handle 10/100 negotiation properly. (This firmware is the same as what was in the tree previously, with the addition of header splitting support.) sys/jumbo.h: Jumbo buffer allocator interface. sys/mbuf.h: Add a new external mbuf type, EXT_DISPOSABLE, to indicate that the payload buffer can be thrown away / flipped to a userland process. socketvar.h: Add prototype for socow_setup. 
tiio.h: ioctl interface to the character portion of the ti(4) driver, plus associated structure/type definitions. uio.h: Change prototype for uiomoveco() so that we'll know whether the source page is disposable. ufs_readwrite.c: Update for new prototype of uiomoveco(). vm_fault.c: In vm_fault(), check to see whether we need to do a page based copy on write fault. vm_object.c: Add a new function, vm_object_allocate_wait(). This does the same thing that vm_object_allocate() does, except that it gives the caller the opportunity to specify whether it should wait on the uma_zalloc() of the object structure. This allows vm objects to be allocated while holding a mutex. (Without generating WITNESS warnings.) vm_object_allocate() is implemented as a call to vm_object_allocate_wait() with the malloc flag set to M_WAITOK. vm_object.h: Add prototype for vm_object_allocate_wait(). vm_page.c: Add page-based copy on write setup, clear and fault routines. vm_page.h: Add page based COW function prototypes and variable in the vm_page structure. Many thanks to Drew Gallatin, who wrote the zero copy send and receive code, and to all the other folks who have tested and reviewed this code over the years.	2002-06-26 03:37:47 +00:00
Alan Cox	042bb29940	o Remove GIANT_REQUIRED from vm_fault_user_wire(). o Move pmap_pageable() outside of Giant in vm_fault_unwire(). (pmap_pageable() is a no-op on all supported architectures.) o Remove the acquisition and release of Giant from mlock().	2002-06-16 20:42:29 +00:00
Alan Cox	4b9fdc2bce	o Acquire and release Giant around pmap operations in vm_fault_unwire() and vm_map_delete(). Assert GIANT_REQUIRED in vm_map_delete() only if operating on the kernel_object or the kmem_object. o Remove GIANT_REQUIRED from vm_map_remove(). o Remove the acquisition and release of Giant from munmap().	2002-05-26 04:54:56 +00:00

291 Commits