freebsd-skq

Author	SHA1	Message	Date
kib	41802e2c86	Adapt vm_page_aflag_set(PGA_WRITEABLE) to the locking of pmap_enter(PMAP_ENTER_NOSLEEP). The PGA_WRITEABLE flag can be set when either the page is busied, or the owner object is locked. Update comments, move all assertions about page state when PGA_WRITEABLE flag is set, into new helper vm_page_assert_pga_writeable(). Reviewed by: alc Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2014-08-09 05:00:34 +00:00
attilio	2802c525ad	- Modify vm_page_unwire() and vm_page_enqueue() to directly accept the queue where to enqueue pages that are going to be unwired. - Add stronger checks to the enqueue/dequeue for the pagequeues when adding and removing pages to them. Of course, for unmanaged pages the queue parameter of vm_page_unwire() will be ignored, just as the active parameter today. This makes adding new pagequeues quicker. This change effectively modifies the KPI. __FreeBSD_version will be, however, bumped just when the full cache of free pages will be evicted. Sponsored by: EMC / Isilon storage division Reviewed by: alc Tested by: pho	2014-06-16 18:15:27 +00:00
alc	39548e640f	Add a page size field to struct vm_page. Increase the page size field when a partially populated reservation becomes fully populated, and decrease this field when a fully populated reservation becomes partially populated. Use this field to simplify the implementation of pmap_enter_object() on amd64, arm, and i386. On all architectures where we support superpages, the cost of creating a superpage mapping is roughly the same as creating a base page mapping. For example, both kinds of mappings entail the creation of a single PTE and PV entry. With this in mind, use the page size field to make the implementation of vm_map_pmap_enter(..., MAP_PREFAULT_PARTIAL) a little smarter. Previously, if MAP_PREFAULT_PARTIAL was specified to vm_map_pmap_enter(), that function would only map base pages. Now, it will create up to 96 base page or superpage mappings. Reviewed by: kib Sponsored by: EMC / Isilon Storage Division	2014-06-07 17:12:26 +00:00
alc	d0ccfff2c5	Since the introduction of the popmap to reservations in r259999, there is no longer any need for the page's PG_CACHED and PG_FREE flags to be set and cleared while the free page queues lock is held. Thus, vm_page_alloc(), vm_page_alloc_contig(), and vm_page_alloc_freelist() can wait until after the free page queues lock is released to clear the page's flags. Moreover, the PG_FREE flag can be retired. Now that the reservation system no longer uses it, its only uses are in a few assertions. Eliminating these assertions is no real loss. Other assertions catch the same types of misbehavior, like doubly freeing a page (see r260032) or dirtying a free page (free pages are invalid and only valid pages can be dirtied). Eliminate an unneeded variable from vm_page_alloc_contig(). Sponsored by: EMC / Isilon Storage Division	2013-12-31 18:25:15 +00:00
kib	8ca067efb2	PG_SLAB no longer serves a useful purpose, since m->object is no longer abused to store pointer to slab. Remove it. Reviewed by: alc Sponsored by: The FreeBSD Foundation Approved by: re (hrs)	2013-09-17 07:35:26 +00:00
kib	6796656333	Remove zero-copy sockets code. It only worked for anonymous memory, and the equivalent functionality is now provided by sendfile(2) over posix shared memory filedescriptor. Remove the cow member of struct vm_page, and rearrange the remaining members. While there, make hold_count unsigned. Requested and reviewed by: alc Tested by: pho Sponsored by: The FreeBSD Foundation Approved by: re (delphij)	2013-09-16 06:25:54 +00:00
kib	ba12eedccd	Remove the deprecated VM_ALLOC_RETRY flag for the vm_page_grab(9). The flag was mandatory since r209792, where vm_page_grab(9) was changed to only support the alloc retry semantic. Suggested and reviewed by: alc Sponsored by: The FreeBSD Foundation	2013-08-22 07:39:53 +00:00
jeff	bc00d6df57	Improve pageout flow control to wakeup more frequently and do less work while maintaining better LRU of active pages. - Change v_free_target to include the quantity previously represented by v_cache_min so we don't need to add them together everywhere we use them. - Add a pageout_wakeup_thresh that sets the free page count trigger for waking the page daemon. Set this 10% above v_free_min so we wakeup before any phase transitions in vm users. - Adjust down v_free_target now that we're willing to accept more pagedaemon wakeups. This means we process fewer pages in one iteration as well, leading to shorter lock hold times and less overall disruption. - Eliminate vm_pageout_page_stats(). This was a minor variation on the PQ_ACTIVE segment of the normal pageout daemon. Instead we now process 1 / vm_pageout_update_period pages every second. This causes us to visit the whole active list every 60 seconds. Previously we would only maintain the active LRU when we were short on pages which would mean it could be woefully out of date. Reviewed by: alc (slight variant of this) Discussed with: alc, kib, jhb Sponsored by: EMC / Isilon Storage Division	2013-08-13 21:56:16 +00:00
kib	4675fcfce0	Different consumers of the struct vm_page abuse pageq member to keep additional information, when the page is guaranteed to not belong to a paging queue. Usually, this results in a lot of type casts which make reasoning about the code correctness harder. Sometimes m->object is used instead of pageq, which could cause real and confusing bugs if non-NULL m->object is leaked. See r141955 and r253140 for examples. Change the pageq member into a union containing explicitly-typed members. Use them instead of type-punning or abusing m->object in x86 pmaps, uma and vm_page_alloc_contig(). Requested and reviewed by: alc Sponsored by: The FreeBSD Foundation	2013-08-10 17:36:42 +00:00
jhb	8f3909e991	Revert the addition of VPO_BUSY and instead update vm_page_replace() to properly unbusy the page. Submitted by: alc	2013-08-09 21:14:55 +00:00
obrien	8b37b80e65	Add missing 'VPO_BUSY' from r254141 to fix kernel build break.	2013-08-09 16:43:50 +00:00
attilio	e9f37cac74	On all the architectures, avoid to preallocate the physical memory for nodes used in vm_radix. On architectures supporting direct mapping, also avoid to pre-allocate the KVA for such nodes. In order to do so make the operations derived from vm_radix_insert() to fail and handle all the deriving failure of those. vm_radix-wise introduce a new function called vm_radix_replace(), which can replace a leaf node, already present, with a new one, and take into account the possibility, during vm_radix_insert() allocation, that the operations on the radix trie can recurse. This means that if operations in vm_radix_insert() recursed vm_radix_insert() will start from scratch again. Sponsored by: EMC / Isilon storage division Reviewed by: alc (older version) Reviewed by: jeff Tested by: pho, scottl	2013-08-09 11:28:55 +00:00
attilio	16c7563cf4	The soft and hard busy mechanisms rely on the vm object lock to work. Unify the 2 concepts into a real, minimal, sxlock where the shared acquisition represents the soft busy and the exclusive acquisition represents the hard busy. The old VPO_WANTED mechanism becomes the hard-path for this new lock and it becomes per-page rather than per-object. The vm_object lock becomes an interlock for this functionality: it can be held in both read or write mode. However, if the vm_object lock is held in read mode while acquiring or releasing the busy state, the thread owner cannot make any assumption on the busy state unless it is also busying it. Also: - Add a new flag to directly shared busy pages while vm_page_alloc and vm_page_grab are being executed. This will be very helpful once these functions happen under a read object lock. - Move the swapping sleep into its own per-object flag The KPI is heavily changed; this is why the version is bumped. It is very likely that some VM ports users will need to change their own code. Sponsored by: EMC / Isilon storage division Discussed with: alc Reviewed by: jeff, kib Tested by: gavin, bapt (older version) Tested by: pho, scottl	2013-08-09 11:11:11 +00:00
kib	8de1718b60	Split the pagequeues per NUMA domains, and split pagedaemon process into threads each processing queue in a single domain. The structure of the pagedaemons and queues is kept intact, most of the changes come from the need for code to find an owning page queue for given page, calculated from the segment containing the page. The tie between NUMA domain and pagedaemon thread/pagequeue split is rather arbitrary, the multithreaded daemon could be allowed for the single-domain machines, or one domain might be split into several page domains, to further increase concurrency. Right now, each pagedaemon thread tries to reach the global target, precalculated at the start of the pass. This is not optimal, since it could cause excessive page deactivation and freeing. The code should be changed to re-check the global page deficit state in the loop after some number of iterations. The pagedaemons reach the quorum before starting the OOM, since one thread inability to meet the target is normal for split queues. Only when all pagedaemons fail to produce enough reusable pages, OOM is started by single selected thread. Launder is modified to take into account the segments layout with regard to the region for which cleaning is performed. Based on the preliminary patch by jeff, sponsored by EMC / Isilon Storage Division. Reviewed by: alc Tested by: pho Sponsored by: The FreeBSD Foundation	2013-08-07 16:36:38 +00:00
alc	53ffec3a56	Revise the interface between vm_object_madvise() and vm_page_dontneed() so that pointless calls to pmap_is_modified() can be easily avoided when performing madvise(..., MADV_FREE). Sponsored by: EMC / Isilon Storage Division	2013-06-10 01:48:21 +00:00
alc	7c42edf9b8	Update a comment.	2013-06-04 05:44:52 +00:00
alc	17993ced1b	Require that the page lock is held, instead of the object lock, when clearing the page's PGA_REFERENCED flag. Since we are typically manipulating the page's act_count field when we are clearing its PGA_REFERENCED flag, the page lock is already held everywhere that we clear the PGA_REFERENCED flag. So, in fact, this revision only changes some comments and an assertion. Nonetheless, it will enable later changes to object locking in the pageout code. Introduce vm_page_assert_locked(), which completely hides the implementation details of the page lock from the caller, and use it in vm_page_aflag_clear(). (The existing vm_page_lock_assert() could not be used in vm_page_aflag_clear().) Over the coming weeks, I expect that we'll either eliminate or replace the various uses of vm_page_lock_assert() with vm_page_assert_locked(). Reviewed by: attilio Sponsored by: EMC / Isilon Storage Division	2013-06-03 01:22:54 +00:00
alc	5c982b2ff7	Simplify the definition of vm_page_lock_assert(). There is no compelling reason to inline the implementation of vm_page_lock_assert() in the !KLD_MODULES case. Use the same implementation for both KLD_MODULES and !KLD_MODULES. Reviewed by: kib	2013-05-31 16:00:42 +00:00
attilio	3975276634	The per-page act_count can be made very-easily protected by the per-page lock rather than vm_object lock, without any further overhead. Make the formal switch. Sponsored by: EMC / Isilon storage division Reviewed by: alc Tested by: pho	2013-04-08 20:02:27 +00:00
attilio	9d88c5279c	Now that vm_page_cache_free() and vm_page_cache_transfer() are reimplemented as ranged operations, sync vm_page_is_cached() semantic with HEAD.	2013-02-06 14:50:34 +00:00
attilio	44f85cd1a5	Reduce diffs against HEAD: Reimplement vm_page_cache_free() as a range operation.	2013-02-06 14:29:05 +00:00
attilio	439c0b8cf1	Reduce diffs against HEAD: - Reimplement vm_page_cache_transfer() properly - Remove vm_page_cache_rename() as a subsequent change	2013-02-05 00:09:33 +00:00
attilio	b972b67ed7	Merge from vmcontention	2013-02-04 15:44:42 +00:00
attilio	be719e9167	MFC	2012-12-11 00:07:19 +00:00
alc	1284a0383d	Update a comment to reflect the elimination of the hold queue in r242300.	2012-11-17 04:00:19 +00:00
kib	bc5bfde14d	Move the declaration of vm_phys_paddr_to_vm_page() from vm/vm_page.h to vm/vm_phys.h, where it belongs. Requested and reviewed by: alc MFC after: 2 weeks	2012-11-16 05:55:56 +00:00
kib	75f2aa672f	Explicitly state that M_USE_RESERVE requires M_NOWAIT, using assertion. Reviewed by: alc MFC after: 2 weeks	2012-11-16 05:49:56 +00:00
kib	e8ae50d444	Flip the semantic of M_NOWAIT to only require the allocation to not sleep, and perform the page allocations with VM_ALLOC_SYSTEM class. Previously, the allocation was also allowed to completely drain the reserve of the free pages, being translated to VM_ALLOC_INTERRUPT request class for vm_page_alloc() and similar functions. Allow the caller of malloc* to request the 'deep drain' semantic by providing M_USE_RESERVE flag, now translated to VM_ALLOC_INTERRUPT class. Previously, it resulted in less aggressive VM_ALLOC_SYSTEM allocation class. Centralize the translation of the M_* malloc(9) flags in the single inline function malloc2vm_flags(). Discussion started by: "Sears, Steven" <Steven.Sears@netapp.com> Reviewed by: alc, mdf (previous version) Tested by: pho (previous version) MFC after: 2 weeks	2012-11-14 20:01:40 +00:00
alc	ff7333d33f	Replace the single, global page queues lock with per-queue locks on the active and inactive paging queues. Reviewed by: kib	2012-11-13 02:50:39 +00:00
attilio	d38d7bb245	Rework the known mutexes to benefit from staying on their own cache line in order to avoid manual frobbing but using struct mtx_padalign. The sole exception being nvme and sfxge drivers, where the author redefined CACHE_LINE_SIZE manually, so they need to be analyzed and dealt with separately. Reviewed by: jimharris, alc	2012-10-31 18:07:18 +00:00
alc	77582e8298	Replace the page hold queue, PQ_HOLD, by a new page flag, PG_UNHOLDFREE, because the queue itself serves no purpose. When a held page is freed, inserting the page into the hold queue has the side effect of setting the page's "queue" field to PQ_HOLD. Later, when the page is unheld, it will be freed because the "queue" field is PQ_HOLD. In other words, PQ_HOLD is used as a flag, not a queue. So, this change replaces it with a flag. To accommodate the new page flag, make the page's "flags" field wider and "oflags" field narrower. Reviewed by: kib	2012-10-29 06:15:04 +00:00
attilio	64eaf39fd7	MFC	2012-10-22 21:26:36 +00:00
alc	1df941a7f3	Move vm_page_requeue() to the only file that uses it. MFC after: 3 weeks	2012-10-13 20:19:43 +00:00
attilio	d3c5a80b69	MFC	2012-08-27 11:59:04 +00:00
kib	a3d0fb0175	Do not leave invalid pages in the object after the short read for network file systems (not only NFS proper). Short reads cause pages other than the requested one, which were not filled by read response, to stay invalid. Change the vm_page_readahead_finish() interface to not take the error code, but instead to make a decision to free or to (de)activate the page only by its validity. As a result, not requested invalid pages are freed even if the read RPC indicated success. Noted and reviewed by: alc MFC after: 1 week	2012-08-14 11:45:47 +00:00
kib	cac2fe116f	After the PHYS_TO_VM_PAGE() function was de-inlined, the main reason to pull vm_param.h was removed. Other big dependency of vm_page.h on vm_param.h are PA_LOCK* definitions, which are only needed for in-kernel code, because modules use KBI-safe functions to lock the pages. Stop including vm_param.h into vm_page.h. Include vm_param.h explicitly for the kernel code which needs it. Suggested and reviewed by: alc MFC after: 2 weeks	2012-08-05 14:11:42 +00:00
kib	4259905d31	Reduce code duplication and exposure of direct access to struct vm_page oflags by providing helper function vm_page_readahead_finish(), which handles completed reads for pages with indexes other than the requested one, for VOP_GETPAGES(). Reviewed by: alc MFC after: 1 week	2012-08-04 18:16:43 +00:00
attilio	c52a057b19	MFC	2012-08-03 15:58:05 +00:00
alc	5b4712b5a1	Inline vm_page_aflags_clear() and vm_page_aflags_set(). Add comments stating that neither these functions nor the flags that they are used to manipulate are part of the KBI.	2012-08-03 01:48:15 +00:00
alc	ceefb8bf17	Eliminate an unneeded declaration. (I should have removed this as part of r227568.)	2012-07-30 20:38:37 +00:00
attilio	e237fbcafa	Merge from vmcontention	2012-07-08 16:12:59 +00:00
attilio	ffa3f082ff	- Split the cached and resident pages tree into 2 distinct ones. This makes the RED/BLACK support go away and simplifies a lot vmradix functions used here. This happens because with patricia trie support the trie will be little enough that keeping 2 different ones will be efficient too. - Reduce differences with head, in places like backing scan where the optimizations used shuffled the code a little bit around. Tested by: flo, Andrea Barberio	2012-07-08 14:01:25 +00:00
attilio	5d9dd820d8	MFC	2012-06-23 02:08:15 +00:00
alc	4d96d753fe	Selectively inline vm_page_dirty().	2012-06-20 23:25:47 +00:00
alc	6eeaee04e4	The page flag PGA_WRITEABLE is set and cleared exclusively by the pmap layer, but it is read directly by the MI VM layer. This change introduces pmap_page_is_write_mapped() in order to completely encapsulate all direct access to PGA_WRITEABLE in the pmap layer. Aesthetics aside, I am making this change because amd64 will likely begin using an alternative method to track write mappings, and having pmap_page_is_write_mapped() in place allows me to make such a change without further modification to the MI VM layer. As an added bonus, tidy up some nearby comments concerning page flags. Reviewed by: kib MFC after: 6 weeks	2012-06-16 18:56:19 +00:00
attilio	807db03f96	Revert r231027 and fix the prototype for vm_radix_remove(). The target of this is getting at the point where the recovery path is completely removed as we could count on pre-allocation once the path compressed trie is implemented.	2012-06-08 18:44:54 +00:00
attilio	e761e0c4bc	MFC	2012-06-01 14:57:55 +00:00
kib	9ff1ec42a4	Add a facility to register a range of physical addresses to be used for allocation of fictitious pages, for which PHYS_TO_VM_PAGE() returns proper fictitious vm_page_t. The range should be de-registered after consumer stopped using it. De-inline the PHYS_TO_VM_PAGE() since it now carries code to iterate over registered ranges. A hash container might be developed instead of range registration interface, and fake pages could be put automatically into the hash, were PHYS_TO_VM_PAGE() could look them up later. This should be considered before the MFC of the commit is done. Sponsored by: The FreeBSD Foundation Reviewed by: alc MFC after: 1 month	2012-05-12 20:42:56 +00:00
kib	1dfd5258de	Split the code from vm_page_getfake() to initialize the fake page struct vm_page into new interface vm_page_initfake(). Handle the case of fake page re-initialization with changed memattr. Sponsored by: The FreeBSD Foundation Reviewed by: alc MFC after: 1 month	2012-05-12 20:34:22 +00:00
kib	d67fed001c	Commit the change forgotten in r235356. Sponsored by: The FreeBSD Foundation Reviewed by: alc MFC after: 1 month	2012-05-12 20:10:18 +00:00

259 Commits