freebsd-skq

Author	SHA1	Message	Date
kib	6796656333	Remove zero-copy sockets code. It only worked for anonymous memory, and the equivalent functionality is now provided by sendfile(2) over posix shared memory filedescriptor. Remove the cow member of struct vm_page, and rearrange the remaining members. While there, make hold_count unsigned. Requested and reviewed by: alc Tested by: pho Sponsored by: The FreeBSD Foundation Approved by: re (delphij)	2013-09-16 06:25:54 +00:00
kib	889b9d0e0b	If the last page of the file is partially full and whole valid portion is invalidated, invalidate the whole page. Otherwise, partially valid page appears on a page queue, which is wrong. This could only happen for the last page, because only then buffer which triggered invalidation could not cover the whole page. Reported and tested by: pho (previous version) Reviewed by: alc Sponsored by: The FreeBSD Foundation Approved by: re (delphij) MFC after: 2 weeks	2013-09-14 10:11:38 +00:00
kib	7ab18d4990	The vm_page_trysbusy() should not fail when shared busy counter or VPB_BIT_WAITERS flag were changed between reading of busy_lock and the cas. The vm_page_sbusy(), which is the only user of vm_page_trysbusy() in the tree, panics on the failure, which in these cases is transient and do not mean that the current page state prevents sbusying. Retry the operation inside vm_page_trysbusy() if cas failed, only return a failure when VPB_BIT_SHARED is cleared. Reported and tested by: pho Reviewed by: attilio Sponsored by: The FreeBSD Foundation	2013-09-05 12:54:40 +00:00
alc	aa9a7bb9e6	Significantly reduce the cost, i.e., run time, of calls to madvise(..., MADV_DONTNEED) and madvise(..., MADV_FREE). Specifically, introduce a new pmap function, pmap_advise(), that operates on a range of virtual addresses within the specified pmap, allowing for a more efficient implementation of MADV_DONTNEED and MADV_FREE. Previously, the implementation of MADV_DONTNEED and MADV_FREE relied on per-page pmap operations, such as pmap_clear_reference(). Intuitively, the problem with this implementation is that the pmap-level locks are acquired and released and the page table traversed repeatedly, once for each resident page in the range that was specified to madvise(2). A more subtle flaw with the previous implementation is that pmap_clear_reference() would clear the reference bit on all mappings to the specified page, not just the mapping in the range specified to madvise(2). Since our malloc(3) makes heavy use of madvise(2), this change can have a measureable impact. For example, the system time for completing a parallel "buildworld" on a 6-core amd64 machine was reduced by about 1.5% to 2.0%. Note: This change only contains pmap_advise() implementations for a subset of our supported architectures. I will commit implementations for the remaining architectures after further testing. For now, a stub function is sufficient because of the advisory nature of pmap_advise(). Discussed with: jeff, jhb, kib Tested by: pho (i386), marcel (ia64) Sponsored by: EMC / Isilon Storage Division	2013-08-29 15:49:05 +00:00
glebius	088bcbe3ed	Remove comment that is no longer relevant since r254182.	2013-08-26 14:14:25 +00:00
alc	1a535523cd	Addendum to r254141: The call to vm_radix_insert() in vm_page_cache() can reclaim the last preexisting cached page in the object, resulting in a call to vdrop(). Detect this scenario so that the vnode's hold count is correctly maintained. Otherwise, we panic. Reported by: scottl Tested by: pho Discussed with: attilio, jeff, kib	2013-08-23 17:27:12 +00:00
kib	ba12eedccd	Remove the deprecated VM_ALLOC_RETRY flag for the vm_page_grab(9). The flag was mandatory since r209792, where vm_page_grab(9) was changed to only support the alloc retry semantic. Suggested and reviewed by: alc Sponsored by: The FreeBSD Foundation	2013-08-22 07:39:53 +00:00
alc	42d76a02b5	Addendum to r254141: Allow recursion on the free pages queues lock in vm_page_alloc_freelist(). Reported and tested by: sbruno Sponsored by: EMC / Isilon Storage Division	2013-08-21 15:31:43 +00:00
attilio	ae49aeaba6	On the recovery path for vm_page_alloc(), if a page had been requested wired, unwind back the wiring bits otherwise we can end up freeing a page that is considered wired. Sponsored by: EMC / Isilon storage division Reported by: alc	2013-08-15 11:01:25 +00:00
jeff	bc00d6df57	Improve pageout flow control to wakeup more frequently and do less work while maintaining better LRU of active pages. - Change v_free_target to include the quantity previously represented by v_cache_min so we don't need to add them together everywhere we use them. - Add a pageout_wakeup_thresh that sets the free page count trigger for waking the page daemon. Set this 10% above v_free_min so we wakeup before any phase transitions in vm users. - Adjust down v_free_target now that we're willing to accept more pagedaemon wakeups. This means we process fewer pages in one iteration as well, leading to shorter lock hold times and less overall disruption. - Eliminate vm_pageout_page_stats(). This was a minor variation on the PQ_ACTIVE segment of the normal pageout daemon. Instead we now process 1 / vm_pageout_update_period pages every second. This causes us to visit the whole active list every 60 seconds. Previously we would only maintain the active LRU when we were short on pages which would mean it could be woefully out of date. Reviewed by: alc (slight variant of this) Discussed with: alc, kib, jhb Sponsored by: EMC / Isilon Storage Division	2013-08-13 21:56:16 +00:00
attilio	f2a180739c	Correct the recovery logic in vm_page_alloc_contig: what is really needed on this code snipped is that all the pages that are already fully inserted gets fully freed, while for the others the object removal itself might be skipped, hence the object might be set to NULL. Sponsored by: EMC / Isilon storage division Reported by: alc, kib Reviewed by: alc	2013-08-11 21:15:04 +00:00
kib	4675fcfce0	Different consumers of the struct vm_page abuse pageq member to keep additional information, when the page is guaranteed to not belong to a paging queue. Usually, this results in a lot of type casts which make reasoning about the code correctness harder. Sometimes m->object is used instead of pageq, which could cause real and confusing bugs if non-NULL m->object is leaked. See r141955 and r253140 for examples. Change the pageq member into a union containing explicitly-typed members. Use them instead of type-punning or abusing m->object in x86 pmaps, uma and vm_page_alloc_contig(). Requested and reviewed by: alc Sponsored by: The FreeBSD Foundation	2013-08-10 17:36:42 +00:00
jhb	8f3909e991	Revert the addition of VPO_BUSY and instead update vm_page_replace() to properly unbusy the page. Submitted by: alc	2013-08-09 21:14:55 +00:00
attilio	e9f37cac74	On all the architectures, avoid to preallocate the physical memory for nodes used in vm_radix. On architectures supporting direct mapping, also avoid to pre-allocate the KVA for such nodes. In order to do so make the operations derived from vm_radix_insert() to fail and handle all the deriving failure of those. vm_radix-wise introduce a new function called vm_radix_replace(), which can replace a leaf node, already present, with a new one, and take into account the possibility, during vm_radix_insert() allocation, that the operations on the radix trie can recurse. This means that if operations in vm_radix_insert() recursed vm_radix_insert() will start from scratch again. Sponsored by: EMC / Isilon storage division Reviewed by: alc (older version) Reviewed by: jeff Tested by: pho, scottl	2013-08-09 11:28:55 +00:00
attilio	16c7563cf4	The soft and hard busy mechanism rely on the vm object lock to work. Unify the 2 concept into a real, minimal, sxlock where the shared acquisition represent the soft busy and the exclusive acquisition represent the hard busy. The old VPO_WANTED mechanism becames the hard-path for this new lock and it becomes per-page rather than per-object. The vm_object lock becames an interlock for this functionality: it can be held in both read or write mode. However, if the vm_object lock is held in read mode while acquiring or releasing the busy state, the thread owner cannot make any assumption on the busy state unless it is also busying it. Also: - Add a new flag to directly shared busy pages while vm_page_alloc and vm_page_grab are being executed. This will be very helpful once these functions happen under a read object lock. - Move the swapping sleep into its own per-object flag The KPI is heavilly changed this is why the version is bumped. It is very likely that some VM ports users will need to change their own code. Sponsored by: EMC / Isilon storage division Discussed with: alc Reviewed by: jeff, kib Tested by: gavin, bapt (older version) Tested by: pho, scottl	2013-08-09 11:11:11 +00:00
kib	8de1718b60	Split the pagequeues per NUMA domains, and split pageademon process into threads each processing queue in a single domain. The structure of the pagedaemons and queues is kept intact, most of the changes come from the need for code to find an owning page queue for given page, calculated from the segment containing the page. The tie between NUMA domain and pagedaemon thread/pagequeue split is rather arbitrary, the multithreaded daemon could be allowed for the single-domain machines, or one domain might be split into several page domains, to further increase concurrency. Right now, each pagedaemon thread tries to reach the global target, precalculated at the start of the pass. This is not optimal, since it could cause excessive page deactivation and freeing. The code should be changed to re-check the global page deficit state in the loop after some number of iterations. The pagedaemons reach the quorum before starting the OOM, since one thread inability to meet the target is normal for split queues. Only when all pagedaemons fail to produce enough reusable pages, OOM is started by single selected thread. Launder is modified to take into account the segments layout with regard to the region for which cleaning is performed. Based on the preliminary patch by jeff, sponsored by EMC / Isilon Storage Division. Reviewed by: alc Tested by: pho Sponsored by: The FreeBSD Foundation	2013-08-07 16:36:38 +00:00
kib	c2cfac4ffc	In the vm_page_set_invalid() function, do not assert that the page is not busy, since its only caller brelse() can legitimately call it on busy page. This happens for VOP_PUTPAGES() on filesystems that use buffers and which VOP_WRITE() method marked the buffer containing page as non-cacheable. Reported and tested by: pho Reviewed by: alc Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2013-07-11 05:38:39 +00:00
neel	44e6cbd7c8	vm_phys_fictitious_reg_range() was losing the 'memattr' because it would be reset by pmap_page_init() right after being initialized in vm_page_initfake(). The statement above is with reference to the amd64 implementation of pmap_page_init(). Fix this by calling 'pmap_page_init()' in 'vm_page_initfake()' before changing the 'memattr'. Reviewed by: kib MFC after: 2 weeks	2013-07-03 23:38:37 +00:00
glebius	fd69e3f2df	Typo in comment.	2013-06-24 13:36:16 +00:00
alc	53ffec3a56	Revise the interface between vm_object_madvise() and vm_page_dontneed() so that pointless calls to pmap_is_modified() can be easily avoided when performing madvise(..., MADV_FREE). Sponsored by: EMC / Isilon Storage Division	2013-06-10 01:48:21 +00:00
kib	dba0637421	Remove irrelevant comments. Discussed with: alc MFC after: 3 days	2013-06-03 17:30:40 +00:00
alc	17993ced1b	Require that the page lock is held, instead of the object lock, when clearing the page's PGA_REFERENCED flag. Since we are typically manipulating the page's act_count field when we are clearing its PGA_REFERENCED flag, the page lock is already held everywhere that we clear the PGA_REFERENCED flag. So, in fact, this revision only changes some comments and an assertion. Nonetheless, it will enable later changes to object locking in the pageout code. Introduce vm_page_assert_locked(), which completely hides the implementation details of the page lock from the caller, and use it in vm_page_aflag_clear(). (The existing vm_page_lock_assert() could not be used in vm_page_aflag_clear().) Over the coming weeks, I expect that we'll either eliminate or replace the various uses of vm_page_lock_assert() with vm_page_assert_locked(). Reviewed by: attilio Sponsored by: EMC / Isilon Storage Division	2013-06-03 01:22:54 +00:00
alc	ece39b8d03	Now that access to the page's "act_count" field is synchronized by the page lock instead of the object lock, there is no reason for vm_page_activate() to assert that the object is locked for either read or write access. (The "VPO_UNMANAGED" flag never changes after page allocation.) Sponsored by: EMC / Isilon Storage Division	2013-06-01 20:32:34 +00:00
attilio	fdf82ef9cf	o Relax locking assertions for vm_page_find_least() o Relax locking assertions for pmap_enter_object() and add them also to architectures that currently don't have any o Introduce VM_OBJECT_LOCK_DOWNGRADE() which is basically a downgrade operation on the per-object rwlock o Use all the mechanisms above to make vm_map_pmap_enter() to work mostl of the times only with readlocks. Sponsored by: EMC / Isilon storage division Reviewed by: alc	2013-05-21 20:38:19 +00:00
kib	f8c66c9055	Add ddb command 'show pginfo' which provides useful information about a vm page, denoted either by an address of the struct vm_page, or, if the '/p' modifier is specified, by a physical address of the corresponding frame. Reviewed by: jhb Sponsored by: The FreeBSD Foundation MFC after: 1 week	2013-05-21 11:04:00 +00:00
alc	585b1bf4b4	Relax the object locking assertion in vm_page_lookup(). Now that a radix tree is used to maintain the object's collection of resident pages, vm_page_lookup() no longer needs an exclusive lock. Reviewed by: attilio Sponsored by: EMC / Isilon Storage Division	2013-05-17 18:49:43 +00:00
peter	46ecea3da0	Bandaid for compiling with gcc, which happens to be the default compiler for a number of platforms still.	2013-05-13 07:09:31 +00:00
alc	7d20e37fb6	Refactor vm_page_alloc()'s interactions with vm_reserv_alloc_page() and vm_page_insert() so that (1) vm_radix_lookup_le() is never called while the free page queues lock is held and (2) vm_radix_lookup_le() is called at most once. This change reduces the average time that the free page queues lock is held by vm_page_alloc() as well as vm_page_alloc()'s average overall running time. Sponsored by: EMC / Isilon Storage Division	2013-05-12 16:50:18 +00:00
alc	9e48bd7ba9	Most allocation of pages to objects proceeds from lower to higher indices. Consequentially, vm_page_insert() should use vm_radix_lookup_le() instead of vm_radix_lookup_ge(). Here's why. In the expected case, vm_radix_lookup_le() will quickly find a page less than the specified key at the same radix node. In contrast, vm_radix_lookup_ge() is expected to return NULL, but to do that it must examine every slot in the radix tree that is greater than the key. Prior to this change, the average cost of a vm_page_insert() call on my test machine was 992 cycles. After this change, the average cost is only 532 cycles, a reduction of 46%. Reviewed by: attilio Sponsored by: EMC / Isilon Storage Division	2013-03-17 16:23:19 +00:00
alc	b346e448af	Simplify the interface to vm_radix_insert() by eliminating the parameter "index". The content of a radix tree leaf, or at least its "key", is not opaque to the other radix tree operations. Specifically, they know how to extract the "key" from a leaf. So, eliminating the parameter "index" isn't breaking the abstraction. Moreover, eliminating the parameter "index" effectively prevents the caller from passing an inconsistent "index" and leaf to vm_radix_insert(). Reviewed by: attilio Sponsored by: EMC / Isilon Storage Division	2013-03-17 16:06:03 +00:00
attilio	3c52979cb4	MFC	2013-03-12 13:26:12 +00:00
alc	07fc599921	When transferring the page from one object to another, don't insert the page into its new object until the page's pindex has been updated. Otherwise, one code path within vm_radix_insert() may use the wrong pindex value. Sponsored by: EMC / Isilon Storage Division	2013-03-12 06:14:31 +00:00
attilio	32a3275e77	MFC	2013-03-11 10:49:02 +00:00
alc	2c9c761886	Introduce vm_radix_is_empty(), and use it in place of vm_object_cache_is_empty() where the caller is aware of the page cache's implementation as a radix trie. Sponsored by: EMC / Isilon Storage Division	2013-03-10 17:30:57 +00:00
alc	854d4fd5e6	Update a comment: The object lock is no longer a mutex.	2013-03-09 21:32:24 +00:00
attilio	76954ad68a	Merge from vmcontention.	2013-03-09 03:19:53 +00:00
attilio	16a80466e5	MFC	2013-03-09 02:51:51 +00:00
attilio	72f7f3e528	Switch the vm_object mutex to be a rwlock. This will enable in the future further optimizations where the vm_object lock will be held in read mode most of the time the page cache resident pool of pages are accessed for reading purposes. The change is mostly mechanical but few notes are reported: * The KPI changes as follow: - VM_OBJECT_LOCK() -> VM_OBJECT_WLOCK() - VM_OBJECT_TRYLOCK() -> VM_OBJECT_TRYWLOCK() - VM_OBJECT_UNLOCK() -> VM_OBJECT_WUNLOCK() - VM_OBJECT_LOCK_ASSERT(MA_OWNED) -> VM_OBJECT_ASSERT_WLOCKED() (in order to avoid visibility of implementation details) - The read-mode operations are added: VM_OBJECT_RLOCK(), VM_OBJECT_TRYRLOCK(), VM_OBJECT_RUNLOCK(), VM_OBJECT_ASSERT_RLOCKED(), VM_OBJECT_ASSERT_LOCKED() * The vm/vm_pager.h namespace pollution avoidance (forcing requiring sys/mutex.h in consumers directly to cater its inlining functions using VM_OBJECT_LOCK()) imposes that all the vm/vm_pager.h consumers now must include also sys/rwlock.h. * zfs requires a quite convoluted fix to include FreeBSD rwlocks into the compat layer because the name clash between FreeBSD and solaris versions must be avoided. At this purpose zfs redefines the vm_object locking functions directly, isolating the FreeBSD components in specific compat stubs. The KPI results heavilly broken by this commit. Thirdy part ports must be updated accordingly (I can think off-hand of VirtualBox, for example). Sponsored by: EMC / Isilon storage division Reviewed by: jeff Reviewed by: pjd (ZFS specific review) Discussed with: alc Tested by: pho	2013-03-09 02:32:23 +00:00
attilio	754f3790b8	Merge from vmc-playground: Introduce a new KPI that verifies if the page cache is empty for a specified vm_object. This KPI does not make assumptions about the locking in order to be used also for building assertions at init and destroy time. It is mostly used to hide implementation details of the page cache. Sponsored by: EMC / Isilon storage division Reviewed by: jeff Reviewed by: alc (vm_radix based version) Tested by: flo, pho, jhb, davide	2013-03-09 02:05:29 +00:00
attilio	709ad55889	Evaluations on the likelyhood of empty object cache cannot be made in general way but must be evaluated case by case. Embedd the decision in the caller themselves rather than in a general purpose KPI. Sponsored by: EMC / Isilon storage division Reported by: alc Reviewed by: alc	2013-03-04 12:33:40 +00:00
alc	a855741cf1	A Boolean is more appropriate than an int here. Use what I think is a slightly better variable name. Sponsored by: EMC / Isilon Storage Division	2013-03-04 07:20:59 +00:00
alc	317a9584fb	Two out of three times that vm_page_find_least() is called, it's going to return the vm object's first page. In those cases, there is no need to traverse the trie. Sponsored by: EMC / Isilon Storage Division	2013-03-03 01:36:31 +00:00
attilio	210a93e7f7	Merge from vmcontention	2013-02-26 18:18:39 +00:00
attilio	134623836d	MFC	2013-02-26 18:11:43 +00:00
attilio	49f99b7251	Wrap the sleeps synchronized by the vm_object lock into the specific macro VM_OBJECT_SLEEP(). This hides some implementation details like the usage of the msleep() primitive and the necessity to access to the lock address directly. For this reason VM_OBJECT_MTX() macro is now retired. Sponsored by: EMC / Isilon storage division Reviewed by: alc Tested by: pho	2013-02-26 17:22:08 +00:00
attilio	e590e8091e	As VM_OBJECT_SLEEP() is a vm_object_t specific function, make the passed object as the first argument of the function for consistency. Sponsored by: EMC / Isilon storage revision	2013-02-26 01:38:12 +00:00
attilio	fc0ecac2f8	Revert wrongly added asserts: lookup and remove from the collection of cached pages doesn't require the object lock to be held. Sponsored by: EMC / Isilon storage division	2013-02-26 00:34:52 +00:00
attilio	905e648d42	Hide the details for the assertion for VM_OBJECT_LOCK operations. Rename current VM_OBJECT_LOCK_ASSERT(foo, RA_WLOCKED) into VM_OBJECT_ASSERT_WLOCKED(foo) Sponsored by: EMC / Isilon storage division Requested by: alc	2013-02-21 21:54:53 +00:00
attilio	15bf891afe	Rename VM_OBJECT_LOCK(), VM_OBJECT_UNLOCK() and VM_OBJECT_TRYLOCK() to their "write" versions. Sponsored by: EMC / Isilon storage division	2013-02-20 12:03:20 +00:00
attilio	658534ed5a	Switch vm_object lock to be a rwlock. * VM_OBJECT_LOCK and VM_OBJECT_UNLOCK are mapped to write operations * VM_OBJECT_SLEEP() is introduced as a general purpose primitve to get a sleep operation using a VM_OBJECT_LOCK() as protection * The approach must bear with vm_pager.h namespace pollution so many files require including directly rwlock.h	2013-02-20 10:38:34 +00:00
attilio	47d4527f8f	Remove an unuseful check as looking up into an empty trie should be as fast as checking a NULL ptr. Sponsored by: EMC / Isilon storage division	2013-02-15 17:22:57 +00:00
attilio	ea8f6ef283	Remove whitespace.	2013-02-15 17:21:41 +00:00
attilio	af55a9fb46	- Fix style in vm_page_lookup(): there is no whiteline between assertions and other code in this file. - Reinsert some comments that were lost during the work but which are actual yet, reducing differences with HEAD. Sponsoed by: EMC / Isilon storage division	2013-02-15 12:35:16 +00:00
attilio	642ba82fdf	Remove an unuseful check on resident_page_count. vm_radix_lookup_ge() of an empty trie is as fast as checking a NULL pointer.	2013-02-14 15:25:31 +00:00
attilio	eafe26c8a6	The radix preallocation pages can overfow the biggestone segment, so use a different scheme for preallocation: reserve few KB of nodes to be used to cater page allocations before the memory can be efficiently pre-allocated by UMA. This at all effects remove boot_pages further carving and along with this modifies to the boot_pages allocation system and necessity to initialize the UMA zone before pmap_init(). Reported by: pho, jhb	2013-02-14 15:23:00 +00:00
attilio	53f78d1a7d	Implement a new algorithm for managing the radix trie which also includes path-compression. This greatly helps with sparsely populated tries, where an uncompressed trie may end up by having a lot of intermediate nodes for very little leaves. The new algorithm introduces 2 main concepts: the node level and the node owner. Every node represents a branch point where the leaves share the key up to the level specified in the node-level (current level excluded, of course). Such key partly shared is the one contained in the owner. Of course, the root branch is exempted to keep a valid owner, because theoretically all the keys are contained in the space designed by the root branch node. The search algorithm seems very intuitive and that is where one should start reading to understand the full approach. In the end, the algorithm ends up by demanding only one node per insert and this is not necessary in all the cases. To stay safe, we basically preallocate as many nodes as the number of physical pages are in the system, using uma_preallocate(). However, this raises 2 concerns: * As pmap_init() needs to kmem_alloc(), the nodes must be pre-allocated when vm_radix_init() is currently called, which is much before UMA is fully initialized. This means that uma_prealloc() will dig into the UMA_BOOT_PAGES pool of pages, which is often not enough to keep track of such large allocations. In order to fix this, change a bit the concept of UMA_BOOT_PAGES and vm.boot_pages. More specifically make the UMA_BOOT_PAGES an initial "value" as long as vm.boot_pages and extend the boot_pages physical area by as many bytes as needed with the information returned by vm_radix_allocphys_size(). * A small amount of pages will be held in per-cpu buckets and won't be accessible from curcpu, so the vm_radix_node_get() could really panic when the pre-allocation pool is close to be exhausted. In theory we could pre-allocate more pages than the number of physical frames to satisfy such request, but as many insert would happen without a node allocation anyway, I think it is safe to assume that the over-allocation is already compensating for such problem. On the field testing can stand me correct, of course. This could be further helped by the case where we allow a single-page insert to not require a complete root node. The use of pre-allocation gets rid all the non-direct mapping trickery and introduced lock recursion allowance for vm_page_free_queue. The nodes children are reduced in number from 32 -> 16 and from 16 -> 8 (for respectively 64 bits and 32 bits architectures). This would make the children to fit into cacheline for amd64 case, for example, and in general spawn less cacheline, which may be helpful in lookup_ge() case. Also, path-compression cames to help in cases where there are many levels, making the fallouts of such change less hurting. Sponsored by: EMC / Isilon storage division Reviewed by: jeff (partially) Tested by: flo	2013-02-13 01:19:31 +00:00
attilio	87d8d2eec7	Fix style.	2013-02-10 16:00:14 +00:00
attilio	59f669fb82	Style.	2013-02-07 15:06:04 +00:00
attilio	4c22b4bafe	Cleanup vm_radix KPI: - Avoid the return value for vm_radix_insert() - Name the functions argument per-style(9) - Avoid to get and return opaque objects but use vm_page_t as vm_radix is thought to not really be general code but to cater specifically page cache and resident cache.	2013-02-06 18:37:46 +00:00
attilio	d3fb98bfb4	Enrich comments on newly added assertions.	2013-02-06 17:47:24 +00:00
attilio	22b0de04b3	Reduce diffs against HEAD.	2013-02-06 17:17:11 +00:00
attilio	9d88c5279c	Now that vm_page_cache_free() and vm_page_cache_transfer() are reimplemented as ranged operations, sync vm_page_is_cached() semantic with HEAD.	2013-02-06 14:50:34 +00:00
attilio	44f85cd1a5	Reduce diffs against HEAD: Reimplement vm_page_cache_free() as a range operation.	2013-02-06 14:29:05 +00:00
attilio	439c0b8cf1	Reduce diffs against HEAD: - Reimplement vm_page_cache_transfer() properly - Remove vm_page_cache_rename() as a subsequent change	2013-02-05 00:09:33 +00:00
attilio	d61cd60feb	Reduce differences with HEAD.	2013-02-04 22:05:22 +00:00
attilio	b972b67ed7	Merge from vmcontention	2013-02-04 15:44:42 +00:00
attilio	be719e9167	MFC	2012-12-11 00:07:19 +00:00
alc	02094caa2c	In the past four years, we've added two new vm object types. Each time, similar changes had to be made in various places throughout the machine- independent virtual memory layer to support the new vm object type. However, in most of these places, it's actually not the type of the vm object that matters to us but instead certain attributes of its pages. For example, OBJT_DEVICE, OBJT_MGTDEVICE, and OBJT_SG objects contain fictitious pages. In other words, in most of these places, we were testing the vm object's type to determine if it contained fictitious (or unmanaged) pages. To both simplify the code in these places and make the addition of future vm object types easier, this change introduces two new vm object flags that describe attributes of the vm object's pages, specifically, whether they are fictitious or unmanaged. Reviewed and tested by: kib	2012-12-09 00:32:38 +00:00
alc	e12b2ad698	Correct an error in r230623. When both VM_ALLOC_NODUMP and VM_ALLOC_ZERO were specified to vm_page_alloc(), PG_NODUMP wasn't being set on the allocated page when it happened to be pre-zeroed. MFC after: 5 days	2012-11-21 06:26:18 +00:00
alc	ff7333d33f	Replace the single, global page queues lock with per-queue locks on the active and inactive paging queues. Reviewed by: kib	2012-11-13 02:50:39 +00:00
alc	60d5a532fb	In general, we call pmap_remove_all() before calling vm_page_cache(). So, the call to pmap_remove_all() within vm_page_cache() is usually redundant. This change eliminates that call to pmap_remove_all() and introduces a call to pmap_remove_all() before vm_page_cache() in the one place where it didn't already exist. When iterating over a paging queue, if the object containing the current page has a zero reference count, then the page can't have any managed mappings. So, a call to pmap_remove_all() is pointless. Change a panic() call in vm_page_cache() to a KASSERT(). MFC after: 6 weeks	2012-11-01 16:20:02 +00:00
attilio	d38d7bb245	Rework the known mutexes to benefit about staying on their own cache line in order to avoid manual frobbing but using struct mtx_padalign. The sole exception being nvme and sxfge drivers, where the author redefined CACHE_LINE_SIZE manually, so they need to be analyzed and dealt with separately. Reviwed by: jimharris, alc	2012-10-31 18:07:18 +00:00
alc	77582e8298	Replace the page hold queue, PQ_HOLD, by a new page flag, PG_UNHOLDFREE, because the queue itself serves no purpose. When a held page is freed, inserting the page into the hold queue has the side effect of setting the page's "queue" field to PQ_HOLD. Later, when the page is unheld, it will be freed because the "queue" field is PQ_HOLD. In other words, PQ_HOLD is used as a flag, not a queue. So, this change replaces it with a flag. To accomodate the new page flag, make the page's "flags" field wider and "oflags" field narrower. Reviewed by: kib	2012-10-29 06:15:04 +00:00
attilio	64eaf39fd7	MFC	2012-10-22 21:26:36 +00:00
alc	1df941a7f3	Move vm_page_requeue() to the only file that uses it. MFC after: 3 weeks	2012-10-13 20:19:43 +00:00
alc	c84b1820ea	Eliminate the conditional for releasing the page queues lock in vm_page_sleep(). vm_page_sleep() is no longer called with this lock held. Eliminate assertions that the page queues lock is NOT held. These assertions won't translate well to having distinct locks on the active and inactive page queues, and they really aren't that useful. MFC after: 3 weeks	2012-10-13 18:46:46 +00:00
alc	271cefc5f6	Tidy up a bit: Update some of the comments. In particular, use "sleep" in preference to "block" where appropriate. Eliminate some unnecessary casts. Make a few whitespace changes for consistency. Reviewed by: kib MFC after: 3 days	2012-10-03 05:06:45 +00:00
attilio	d3c5a80b69	MFC	2012-08-27 11:59:04 +00:00
kib	a3d0fb0175	Do not leave invalid pages in the object after the short read for a network file systems (not only NFS proper). Short reads cause pages other then the requested one, which were not filled by read response, to stay invalid. Change the vm_page_readahead_finish() interface to not take the error code, but instead to make a decision to free or to (de)activate the page only by its validity. As result, not requested invalid pages are freed even if the read RPC indicated success. Noted and reviewed by: alc MFC after: 1 week	2012-08-14 11:45:47 +00:00
kib	4259905d31	Reduce code duplication and exposure of direct access to struct vm_page oflags by providing helper function vm_page_readahead_finish(), which handles completed reads for pages with indexes other then the requested one, for VOP_GETPAGES(). Reviewed by: alc MFC after: 1 week	2012-08-04 18:16:43 +00:00
attilio	c52a057b19	MFC	2012-08-03 15:58:05 +00:00
alc	5b4712b5a1	Inline vm_page_aflags_clear() and vm_page_aflags_set(). Add comments stating that neither these functions nor the flags that they are used to manipulate are part of the KBI.	2012-08-03 01:48:15 +00:00
alc	ad2692aed9	Correct vm_page_alloc_contig()'s implementation of VM_ALLOC_NODUMP.	2012-07-17 02:36:59 +00:00
attilio	e237fbcafa	Merge from vmcontention	2012-07-08 16:12:59 +00:00
attilio	ffa3f082ff	- Split the cached and resident pages tree into 2 distinct ones. This makes the RED/BLACK support go away and simplifies a lot vmradix functions used here. This happens because with patricia trie support the trie will be little enough that keeping 2 diffetnt will be efficient too. - Reduce differences with head, in places like backing scan where the optimizazions used shuffled the code a little bit around. Tested by: flo, Andrea Barberio	2012-07-08 14:01:25 +00:00
attilio	d659f0f6bf	MFC	2012-07-07 17:55:27 +00:00
alc	c5e6daff9d	Add new pmap layer locks to the predefined lock order. Change the names of a few existing VM locks to follow a consistent naming scheme.	2012-06-27 03:45:25 +00:00
attilio	5d9dd820d8	MFC	2012-06-23 02:08:15 +00:00
alc	4d96d753fe	Selectively inline vm_page_dirty().	2012-06-20 23:25:47 +00:00
alc	6eeaee04e4	The page flag PGA_WRITEABLE is set and cleared exclusively by the pmap layer, but it is read directly by the MI VM layer. This change introduces pmap_page_is_write_mapped() in order to completely encapsulate all direct access to PGA_WRITEABLE in the pmap layer. Aesthetics aside, I am making this change because amd64 will likely begin using an alternative method to track write mappings, and having pmap_page_is_write_mapped() in place allows me to make such a change without further modification to the MI VM layer. As an added bonus, tidy up some nearby comments concerning page flags. Reviewed by: kib MFC after: 6 weeks	2012-06-16 18:56:19 +00:00
attilio	807db03f96	Revert r231027 and fix the prototype for vm_radix_remove(). The target of this is getting at the point where the recovery path is completely removed as we could count on pre-allocation once the path compressed trie is implemented.	2012-06-08 18:44:54 +00:00
attilio	e761e0c4bc	MFC	2012-06-01 14:57:55 +00:00
andrew	39fa59acb9	Fix booting on ARM. In PHYS_TO_VM_PAGE() when VM_PHYSSEG_DENSE is set the check if we are past the end of vm_page_array was incorrect causing it to return NULL. This value is then used in vm_phys_add_page causing a data abort. Reviewed by: alc, kib, imp Tested by: stas	2012-05-22 07:04:23 +00:00
nwhitehorn	e83623fb1f	Replace the list of PVOs owned by each PMAP with an RB tree. This simplifies range operations like pmap_remove() and pmap_protect() as well as allowing simple operations like pmap_extract() not to involve any global state. This substantially reduces lock coverages for the global table lock and improves concurrency.	2012-05-20 14:33:28 +00:00
kib	9ff1ec42a4	Add a facility to register a range of physical addresses to be used for allocation of fictitious pages, for which PHYS_TO_VM_PAGE() returns proper fictitious vm_page_t. The range should be de-registered after consumer stopped using it. De-inline the PHYS_TO_VM_PAGE() since it now carries code to iterate over registered ranges. A hash container might be developed instead of range registration interface, and fake pages could be put automatically into the hash, were PHYS_TO_VM_PAGE() could look them up later. This should be considered before the MFC of the commit is done. Sponsored by: The FreeBSD Foundation Reviewed by: alc MFC after: 1 month	2012-05-12 20:42:56 +00:00
kib	1dfd5258de	Split the code from vm_page_getfake() to initialize the fake page struct vm_page into new interface vm_page_initfake(). Handle the case of fake page re-initialization with changed memattr. Sponsored by: The FreeBSD Foundation Reviewed by: alc MFC after: 1 month	2012-05-12 20:34:22 +00:00
kib	e5df51c93a	Assert that the page passed to vm_page_putfake() is unmanaged. Sponsored by: The FreeBSD Foundation Reviewed by: alc MFC after: 1 month	2012-05-12 20:27:51 +00:00
kib	84b22557f6	Make the vm_page_array_size long. Remove redundand zero initialization for vm_page_array_size and nearby variablees. Sponsored by: The FreeBSD Foundation Reviewed by: alc MFC after: 1 month	2012-05-12 20:03:06 +00:00
attilio	c7b668e647	MFC	2012-05-05 21:40:32 +00:00
nwhitehorn	d394621163	Avoid a lock order reversal in pmap_extract_and_hold() from relocking the page. This PMAP requires an additional lock besides the PMAP lock in pmap_extract_and_hold(), which vm_page_pa_tryrelock() did not release. Suggested by: kib MFC after: 4 days	2012-04-22 17:58:30 +00:00

1 2 3 4 5 ...

628 Commits