freebsd-dev

Author	SHA1	Message	Date
Jeff Roberson	fff5403f84	(5/6) Move the VPO_NOSYNC to PGA_NOSYNC to eliminate the dependency on the object lock in vm_page_set_validclean(). Reviewed by: kib, markj Tested by: pho Sponsored by: Netflix, Intel Differential Revision: https://reviews.freebsd.org/D21595	2019-10-15 03:48:22 +00:00
Jeff Roberson	0012f373e4	(4/6) Protect page valid with the busy lock. Atomics are used for page busy and valid state when the shared busy is held. The details of the locking protocol and valid and dirty synchronization are in the updated vm_page.h comments. Reviewed by: kib, markj Tested by: pho Sponsored by: Netflix, Intel Differential Revision: https://reviews.freebsd.org/D21594	2019-10-15 03:45:41 +00:00
Jeff Roberson	205be21d99	(3/6) Add a shared object busy synchronization mechanism that blocks new page busy acquires while held. This allows code that would need to acquire and release a very large number of page busy locks to use the old mechanism where busy is only checked and not held. This comes at the cost of false positives but never false negatives which the single consumer, vm_fault_soft_fast(), handles. Reviewed by: kib Tested by: pho Sponsored by: Netflix, Intel Differential Revision: https://reviews.freebsd.org/D21592	2019-10-15 03:41:36 +00:00
Jeff Roberson	8da1c09853	(2/6) Don't release xbusy in vm_page_remove(), defer to vm_page_free_prep(). This persists busy state across operations like rename and replace. Reviewed by: kib, markj Tested by: pho Sponsored by: Netflix, Intel Differential Revision: https://reviews.freebsd.org/D21549	2019-10-15 03:38:02 +00:00
Jeff Roberson	63e9755548	(1/6) Replace busy checks with acquires where it is trivial to do so. This is the first in a series of patches that promotes the page busy field to a first class lock that no longer requires the object lock for consistency. Reviewed by: kib, markj Tested by: pho Sponsored by: Netflix, Intel Differential Revision: https://reviews.freebsd.org/D21548	2019-10-15 03:35:11 +00:00
Leandro Lupori	0ecc478b74	[PPC64] Initial kernel minidump implementation Based on POWER9BSD implementation, with all POWER9 specific code removed and addition of new methods in PPC64 MMU interface, to isolate platform specific code. Currently, the new methods are implemented on pseries and PowerNV (D21643). Reviewed by: jhibbits Differential Revision: https://reviews.freebsd.org/D21551	2019-10-14 13:04:04 +00:00
Mark Johnston	4090e2170d	Assert that the PGA_{WRITEABLE,EXECUTABLE} flags do not leak. Reviewed by: alc, kib MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D21783	2019-10-07 23:31:17 +00:00
Mateusz Guzik	7b1fbc424a	vm: stop trylocking page queues in vm_page_pqbatch_submit About 11 minutes of poudriere -s -j 104 and probing on return value of trylocks reveals that over 10% of attempts fail, which in turn means there are more atomics performed than necessary. Trylocking was there to try preventing migration, but it's not very likely to happen if the lock is uncontested. Reviewed by: markj Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21925	2019-10-07 23:19:09 +00:00
Mark Johnston	7cc833c598	Fix a race in vm_page_swapqueue(). vm_page_swapqueue() atomically transitions a page between queues. To do so, it must hold the page queue lock for the old queue. However, once the queue index has been updated, the queue lock no longer protects the page's queue state. Thus, we must speculatively remove the page from the old queue before committing the queue state update, and roll back if the update fails. Reported and tested by: pho Reviewed by: kib Sponsored by: Intel, Netflix Differential Revision: https://reviews.freebsd.org/D21791	2019-09-27 16:46:08 +00:00
Mark Johnston	2b93f779d2	Add some counters for per-VM page events. For now, just count batched page queue state operations. vm.stats.page.queue_ops counts the number of batch entries that successfully completed, while queue_nops counts entries that had no effect, which occurs when the queue operation had been completed before the batch entry was processed. Reviewed by: alc, kib MFC after: 1 week Sponsored by: Intel, Netflix Differential Revision: https://reviews.freebsd.org/D21782	2019-09-25 17:08:35 +00:00
Mark Johnston	923da43e7c	Fix a race in vm_page_dequeue_deferred_free() after r352110. This function loaded the page's queue index before setting PGA_DEQUEUE. In this window the page daemon may have deactivated the page, updating its queue index. Make the operation atomic using vm_page_pqstate_cmpset(); the page daemon will not modify the page once it observes that PGA_DEQUEUE is set. Reported and tested by: pho Reviewed by: alc, kib Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D21639	2019-09-16 15:12:49 +00:00
Mark Johnston	47aef898ea	Fix a page leak in vm_page_reclaim_run(). After r352110 the attempt to remove mappings of the page being replaced may fail if the page is wired. In this case we must free the replacement page. Reviewed by: alc, kib Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D21639	2019-09-16 15:09:31 +00:00
Mark Johnston	e8bcf6966b	Revert r352406, which contained changes I didn't intend to commit.	2019-09-16 15:04:45 +00:00
Mark Johnston	41fd4b9422	Fix a couple of nits in r352110. - Remove a dead variable from the amd64 pmap_extract_and_hold(). - Fix grammar in the vm_page_wire man page. Reported by: alc Reviewed by: alc, kib Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D21639	2019-09-16 15:03:12 +00:00
Jeff Roberson	c75757481f	Replace redundant code with a few new vm_page_grab facilities: - VM_ALLOC_NOCREAT will grab without creating a page. - vm_page_grab_valid() will grab and page in if necessary. - vm_page_busy_acquire() automates some busy acquire loops. Discussed with: alc, kib, markj Tested by: pho (part of larger branch) Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D21546	2019-09-10 19:08:01 +00:00
Jeff Roberson	4cdea4a853	Use the sleepq lock rather than the page lock to protect against wakeup races with page busy state. The object lock is still used as an interlock to ensure that the identity stays valid. Most callers should use vm_page_sleep_if_busy() to handle the locking particulars. Reviewed by: alc, kib, markj Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D21255	2019-09-10 18:27:45 +00:00
Mark Johnston	fee2a2fa39	Change synchronization rules for vm_page reference counting. There are several mechanisms by which a vm_page reference is held, preventing the page from being freed back to the page allocator. In particular, holding the page's object lock is sufficient to prevent the page from being freed; holding the busy lock or a wiring is sufficient as well. These references are protected by the page lock, which must therefore be acquired for many per-page operations. This results in false sharing since the page locks are external to the vm_page structures themselves and each lock protects multiple structures. Transition to using an atomically updated per-page reference counter. The object's reference is counted using a flag bit in the counter. A second flag bit is used to atomically block new references via pmap_extract_and_hold() while removing managed mappings of a page. Thus, the reference count of a page is guaranteed not to increase if the page is unbusied, unmapped, and the object's write lock is held. As a consequence of this, the page lock no longer protects a page's identity; operations which move pages between objects are now synchronized solely by the objects' locks. The vm_page_wire() and vm_page_unwire() KPIs are changed. The former requires that either the object lock or the busy lock is held. The latter no longer has a return value and may free the page if it releases the last reference to that page. vm_page_unwire_noq() behaves the same as before; the caller is responsible for checking its return value and freeing or enqueuing the page as appropriate. vm_page_wire_mapped() is introduced for use in pmap_extract_and_hold(). It fails if the page is concurrently being unmapped, typically triggering a fallback to the fault handler. vm_page_wire() no longer requires the page lock and vm_page_unwire() now internally acquires the page lock when releasing the last wiring of a page (since the page lock still protects a page's queue state). In particular, synchronization details are no longer leaked into the caller. The change excises the page lock from several frequently executed code paths. In particular, vm_object_terminate() no longer bounces between page locks as it releases an object's pages, and direct I/O and sendfile(SF_NOCACHE) completions no longer require the page lock. In these latter cases we now get linear scalability in the common scenario where different threads are operating on different files. __FreeBSD_version is bumped. The DRM ports have been updated to accommodate the KPI changes. Reviewed by: jeff (earlier version) Tested by: gallatin (earlier version), pho Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D20486	2019-09-09 21:32:42 +00:00
Mark Johnston	7cdeaf3309	Add preliminary support for atomic updates of per-page queue state. Queue operations on a page use the page lock when updating the page to reflect the desired queue state, and the page queue lock when physically enqueuing or dequeuing a page. Multiple pages share a given page lock, but queue state is per-page; this false sharing results in heavy lock contention. Take a small step towards the use of atomic_cmpset to synchronize updates to per-page queue state by introducing vm_page_pqstate_cmpset() and using it in the page daemon. In the longer term the plan is to stop using the page lock to protect page identity and rely only on the object and page busy locks. However, since the page daemon avoids acquiring the object lock except when necessary, some synchronization with a concurrent free of the page is required. vm_page_pqstate_cmpset() can be used to ensure that queue state updates are successful only if the page is not scheduled for a dequeue, which is sufficient for the page daemon. Add vm_page_swapqueue(), which moves a page from one queue to another using vm_page_pqstate_cmpset(). Use it in the active queue scan, which does not use the object lock. Modify vm_page_dequeue_deferred() to use vm_page_pqstate_cmpset() as well. Reviewed by: kib Discussed with: jeff Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D21257	2019-09-03 14:29:58 +00:00
Mark Johnston	9d75f0dc75	Map the vm_page array into KVA on amd64. r351198 allows the kernel to use domain-local memory to back the vm_page array (up to 2MB boundaries) and reserves a separate PML4 entry for that purpose. One consequence of that change is that the vm_page array is no longer present in minidumps, which only adds pages mapped above VM_MIN_KERNEL_ADDRESS. To avoid the friction caused by having kernel data structures mapped below VM_MIN_KERNEL_ADDRESS, map the vm_page array starting at VM_MIN_KERNEL_ADDRESS instead of using a dedicated PML4 entry. Reviewed by: kib Discussed with: jeff Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21491	2019-09-03 13:18:51 +00:00
Mark Johnston	a70e17eeca	Fix a few nits in vm_pqbatch_process_page(). - Don't bother masking off non-queue state flags when loading the page's atomic state, since it is only required for one of the function's assertions. Update the assertion instead. - Remove an incorrect comment regarding synchronization with the page daemon. The page daemon only ever checks for PGA_ENQUEUED with the page queue lock held. - When clearing requeue flags, only clear the flags that have been acted upon. Reviewed by: kib (previous version) Discussed with: alc Tested by: pho (part of a larger patch) MFC after: 1 week Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D21368	2019-08-26 20:20:10 +00:00
Mark Johnston	f93670b7b9	Stop clearing page flags in vm_page_pqbatch_submit(). All existing callers guarantee that the page does not have a pre-existing dequeue pending. Thus, if the page is dequeued before pqbatch_submit() acquires the page queue lock, we do not need to do anything since vm_page_dequeue_complete() takes care of clearing all page queue state flags for us. With this change, vm_page_pqbatch_submit() has the nice property that it does not directly modify any fields in the page structure. Reviewed by: alc, kib Tested by: pho (part of a larger change) MFC after: 1 week Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D21372	2019-08-23 19:53:11 +00:00
Mark Johnston	386eba08bd	Make vm_pqbatch_submit_page() externally visible. It will become useful for the page daemon to be able to directly create a batch queue entry for a page, and without modifying the page structure. Rename vm_pqbatch_submit_page() to vm_page_pqbatch_submit() to keep the namespace consistent. No functional change intended. Reviewed by: alc, kib MFC after: 1 week Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D21369	2019-08-23 19:49:29 +00:00
Mark Johnston	8b90607f20	Simplify vm_page_dequeue() and fix an assertion. - Add a vm_pagequeue_remove() function to physically remove a page from its queue and update the queue length. - Remove vm_page_pagequeue_lockptr() and let vm_page_pagequeue() return NULL for dequeued pages. - Avoid unnecessarily reloading the queue index if vm_page_dequeue() loses a race with a concurrent queue operation. - Correct an always-true assertion: vm_page_dequeue() may be called from the page allocator with the page unlocked. The assertion m->order == VM_NFREEORDER simply tests whether the page has been removed from the vm_phys free lists; instead, check whether the page belongs to an object. Reviewed by: kib MFC after: 1 week Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D21341	2019-08-21 16:11:12 +00:00
Jeff Roberson	3e5e1b5135	Allocate amd64's page array using pages and page directory pages from the NUMA domain that the pages describe. Patch original from gallatin. Reviewed by: kib Tested by: pho Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D21252	2019-08-18 23:07:56 +00:00
Jeff Roberson	b7565d44df	Encapsulate phys_avail manipulation in a set of simple routines. Add a NUMA aware boot time memory allocator that will be used to allocate early domain correct structures. Code partially submitted by gallatin. Reviewed by: gallatin, kib Tested by: pho Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D21251	2019-08-18 07:06:31 +00:00
Konstantin Belousov	245139c69d	Fix OOM handling of some corner cases. In addition to pagedaemon initiating OOM, also do it from the vm_fault() internals. Namely, if the thread waits for a free page to satisfy page fault some preconfigured amount of time, trigger OOM. These triggers are rate-limited, due to a usual case of several threads of the same multi-threaded process to enter fault handler simultaneously. The faults from pagedaemon threads participate in the calculation of OOM rate, but are not under the limit. Reviewed by: markj (previous version) Tested by: pho Discussed with: alc Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D13671	2019-08-16 09:43:49 +00:00
Mark Johnston	98549e2dc6	Centralize the logic in vfs_vmio_unwire() and sendfile_free_page(). Both of these functions atomically unwire a page, optionally attempt to free the page, and enqueue or requeue the page. Add functions vm_page_release() and vm_page_release_locked() to perform the same task. The latter must be called with the page's object lock held. As a side effect of this refactoring, the buffer cache will no longer attempt to free mapped pages when completing direct I/O. This is consistent with the handling of pages by sendfile(SF_NOCACHE). Reviewed by: alc, kib MFC after: 2 weeks Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D20986	2019-07-29 22:01:28 +00:00
Mark Johnston	b16e57a6c9	Rename vm_page_{import,release}() to vm_page_zone_{import,release}(). I would like to use the name vm_page_release() for a different purpose, and vm_page_{import,release}() are local to vm_page.c. Reviewed by: kib MFC after: 1 week	2019-07-20 18:25:41 +00:00
Mark Johnston	eeacb3b02f	Merge the vm_page hold and wire mechanisms. The hold_count and wire_count fields of struct vm_page are separate reference counters with similar semantics. The remaining essential differences are that holds are not counted as a reference with respect to LRU, and holds have an implicit free-on-last unhold semantic whereas vm_page_unwire() callers must explicitly determine whether to free the page once the last reference to the page is released. This change removes the KPIs which directly manipulate hold_count. Functions such as vm_fault_quick_hold_pages() now return wired pages instead. Since r328977 the overhead of maintaining LRU for wired pages is lower, and in many cases vm_fault_quick_hold_pages() callers would swap holds for wirings on the returned pages anyway, so with this change we remove a number of page lock acquisitions. No functional change is intended. __FreeBSD_version is bumped. Reviewed by: alc, kib Discussed with: jeff Discussed with: jhb, np (cxgbe) Tested by: pho (previous version) Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D19247	2019-07-08 19:46:20 +00:00
Mark Johnston	46736e306c	Elide the vm_reserv_free_page() call when PG_PCPU_CACHE is set. Pages with PG_PCPU_CACHE set cannot have been allocated from a reservation, so as an optimization, skip the call to vm_reserv_free_page() in this case. Otherwise, the access of the corresponding reservation structure often results in a cache miss. Reviewed by: alc, kib Discussed with: jeff MFC after: 2 weeks Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D20859	2019-07-08 19:02:40 +00:00
Mark Johnston	d9a73522e3	Add a per-CPU page cache per VM free pool. Some workloads benefit from having a per-CPU cache for VM_FREEPOOL_DIRECT pages. Reviewed by: dougm, kib Discussed with: alc, jeff MFC after: 2 weeks Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D20858	2019-07-08 18:56:30 +00:00
Mark Johnston	9f74cdbf78	Mark pages allocated from the per-CPU cache. Only free pages to the cache when they were allocated from that cache. This mitigates rapid fragmentation of physical memory seen during poudriere's dependency calculation phase. In particular, pages belonging to broken reservations are no longer freed to the per-CPU cache, so they get a chance to coalesce with freed pages during the break. Otherwise, the optimized CoW handler may create object chains in which multiple objects contain pages from the same reservation, and the order in which we do object termination means that the reservation is broken before all of those pages are freed, so some of them end up in the per-CPU cache and thus permanently fragment physical memory. The flag may also be useful for eliding calls to vm_reserv_free_page(), thus avoiding memory accesses for data that is likely not present in the CPU caches. Reviewed by: alc Discussed with: jeff MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D20763	2019-07-02 19:51:40 +00:00
Mark Johnston	0fd977b3fa	Add a return value to vm_page_remove(). Use it to indicate whether the page may be safely freed following its removal from the object. Also change vm_page_remove() to assume that the page's object pointer is non-NULL, and have callers perform this check instead. This is a step towards an implementation of an atomic reference counter for each physical page structure. Reviewed by: alc, dougm, kib MFC after: 1 week Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D20758	2019-06-26 17:37:51 +00:00
Mark Johnston	ee1f168540	Group vm_page_activate()'s definition with other related functions. No functional change intended. MFC after: 3 days	2019-06-19 21:36:00 +00:00
Mark Johnston	2d2748710a	Remove an outdated header comment for vm_page.c. The listed rules were incomplete and outdated. There is a much more comprehensive comment in vm_page.h. Reviewed by: alc, kib MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D20503	2019-06-04 18:38:27 +00:00
Alan Cox	2d5039db18	Retire vm_reserv_extend_{contig,page}(). These functions were introduced as part of a false start toward fine-grained reservation locking. In the end, they were not needed, so eliminate them. Order the parameters to vm_reserv_alloc_{contig,page}() consistently with the vm_page functions that call them. Update the comments about the locking requirements for vm_reserv_alloc_{contig,page}(). They no longer require a free page queues lock. Wrap several lines that became too long after the "req" and "domain" parameters were added to vm_reserv_alloc_{contig,page}(). Reviewed by: kib, markj MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D20492	2019-06-03 05:15:36 +00:00
Mark Johnston	d842aa5114	Add a vm_page_wired() predicate. Use it instead of accessing the wire_count field directly. No functional change intended. Reviewed by: alc, kib MFC after: 1 week Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D20485	2019-06-02 01:00:17 +00:00
Doug Moore	b8590dae50	The function vm_phys_free_contig invokes vm_phys_free_pages for every power-of-two page block it frees, launching an unsuccessful search for a buddy to pair up with each time. The only possible buddy-up mergers are across the boundaries of the freed region, so change vm_phys_free_contig simply to enqueue the freed interior blocks, via a new function vm_phys_enqueue_contig, and then call vm_phys_free_pages on the bounding blocks to create as big a cross-boundary block as possible after buddy-merging. The only callers of vm_phys_free_contig at the moment call it in situations where merging blocks across the boundary is clearly impossible, so just call vm_phys_enqueue_contig in those places and avoid trying to buddy-up at all. One beneficiary of this change is in breaking reservations. For the case where memory is freed in breaking a reservation with only the first and last pages allocated, the number of cycles consumed by the operation drops about 11% with this change. Suggested by: alc Reviewed by: alc Approved by: kib, markj (mentors) Differential Revision: https://reviews.freebsd.org/D16901	2019-05-31 21:02:42 +00:00
Mark Johnston	42447bb506	Remove a redundant vm_page_remove() call. vm_page_free_prep() removes the page from its object. No functional change intended. Reviewed by: alc, kib MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D20469	2019-05-31 14:59:40 +00:00
Mark Johnston	3b5b20292b	Implement minidump support for RISC-V. Submitted by: Mitchell Horne <mhorne063@gmail.com> Differential Revision: https://reviews.freebsd.org/D18320	2019-03-06 00:01:06 +00:00
Mark Johnston	1e2b3e6f92	Allow vm_page_free_prep() to dequeue pages without the page lock. This is a step towards being able to free pages without the page lock held. The approach is simply to add an implementation of vm_page_dequeue_deferred() which does not assert that the page lock is held. Formally, the page lock is required to set PGA_DEQUEUE, but in the case of vm_page_free_prep() we get the same mutual exclusion for free by virtue of the fact that no other references to the page may exist. No functional change intended. Reviewed by: kib (previous version) MFC after: 2 weeks Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D19065	2019-02-03 18:43:20 +00:00
Mark Johnston	d0488e698f	Fix a race in vm_page_dequeue_deferred(). To detect the case where the page is already marked for a deferred dequeue, we must read the "queue" and "aflags" fields in a precise order. Otherwise, a race with a concurrent vm_page_dequeue_complete() could leave the page with PGA_DEQUEUE set despite it already having been dequeued. Fix the problem by using vm_page_queue() to check the queue state, which correctly handles the race. Reviewed by: kib Tested by: pho MFC after: 3 days Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D19039	2019-02-03 18:38:58 +00:00
Gleb Smirnoff	bb15d1c778	o Move zone limit from keg level up to zone level. This means that now two zones sharing a keg may have different limits. Now this is going to work: zone = uma_zcreate(); uma_zone_set_max(zone, limit); zone2 = uma_zsecond_create(zone); uma_zone_set_max(zone2, limit2); Kegs no longer have uk_maxpages field, but zones have uz_items. When set, it may be rounded up to minimum possible CPU bucket cache size. For small limits bucket cache can also be reconfigured to be smaller. Counter uz_items is updated whenever items transition from keg to a bucket cache or directly to a consumer. If zone has uz_maxitems set and it is reached, then we are going to sleep. o Since new limits don't play well with multi-keg zones, remove them. The idea of multi-keg zones was introduced exactly 10 years ago, and never have had a practical usage. In discussion with Jeff we came to a wild agreement that if we ever want to reintroduce the idea of a smart allocator that would be able to choose between two (or more) totally different backing stores, that choice should be made one level higher than UMA, e.g. in malloc(9) or in mget(), or whatever and choice should be controlled by the caller. o Sleeping code is improved to account number of sleepers and wake them one by one, to avoid thundering herd problem. o Flag UMA_ZONE_NOBUCKETCACHE removed, instead uma_zone_set_maxcache() KPI added. Having no bucket cache basically means setting maxcache to 0. o Now with many fields added and many removed (no multi-keg zones!) make sure that struct uma_zone is perfectly aligned. Reviewed by: markj, jeff Tested by: pho Differential Revision: https://reviews.freebsd.org/D17773	2019-01-15 00:02:06 +00:00
Gleb Smirnoff	9cc36b3dab	Fix regression in r331368, that broke dumping of UMA startup pages when WITNESS is present. Discussed with: markj	2019-01-07 23:17:09 +00:00
Konstantin Belousov	7af4985245	Add 'v' modifier to the ddb 'show pginfo' command to display vm_page backing the provided kernel virtual address. Tested by: pho MFC after: 1 week Sponsored by: The FreeBSD Foundation	2018-12-30 15:58:18 +00:00
Mark Johnston	e31fc3ab13	Update the free page count when blacklisting pages. Otherwise the free page count will not accurately reflect the physical page allocator's state. On 11 this can trigger panics in vm_page_alloc() since the allocator state and free page count are updated atomically and we expect them to stay in sync. On 12 the bug would manifest as threads looping in vm_page_alloc(). PR: 231296 Reported by: mav, wollman, Rainer Duffner, Josh Gitlin Reviewed by: alc, kib, mav MFC after: 3 days Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D18374	2018-11-29 16:31:01 +00:00
Mark Johnston	920239efde	Fix some problems that manifest when NUMA domain 0 is empty. - In uma_prealloc(), we need to check for an empty domain before the first allocation attempt, not after. Fix this by switching uma_prealloc() to use a vm_domainset iterator, which addresses the secondary issue of using a signed domain identifier in round-robin iteration. - Don't automatically create a page daemon for domain 0. - In domainset_empty_vm(), recompute ds_cnt and ds_order after excluding empty domains; otherwise we may frequently specify an empty domain when calling in to the page allocator, wasting CPU time. Convert DOMAINSET_PREF() policies for empty domains to round-robin. - When freeing bootstrap pages, don't count them towards the per-domain total page counts for now: some vm_phys segments are created before the SRAT is parsed and are thus always identified as being in domain 0 even when they are not. Then, when bootstrap pages are freed, they are added to a domain that we had previously thought was empty. Until this is corrected, we simply exclude them from the per-domain page count. Reported and tested by: Rajesh Kumar <rajfbsd@gmail.com> Reviewed by: gallatin MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D17704	2018-10-30 17:57:40 +00:00
Mark Johnston	4c29d2de67	Refactor domainset iterators for use by malloc(9) and UMA. Before this change we had two flavours of vm_domainset iterators: "page" and "malloc". The latter was only used for kmem_() and hard-coded its behaviour based on kernel_object's policy. Moreover, its use contained a race similar to that fixed by r338755 since the kernel_object's iterator was being run without the object lock. In some cases it is useful to be able to explicitly specify a policy (domainset) or policy+iterator (domainset_ref) when performing memory allocations. To that end, refactor the vm_domainset_ KPI to permit this, and get rid of the "malloc" domainset_iter KPI in the process. Reviewed by: jeff (previous version) Tested by: pho (part of a larger patch) MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D17417	2018-10-23 16:35:58 +00:00
Mark Johnston	463406ac4a	Add more NUMA-specific low memory predicates. Use these predicates instead of inline references to vm_min_domains. Also add a global all_domains set, akin to all_cpus. Reviewed by: alc, jeff, kib Approved by: re (gjb) Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D17278	2018-09-24 19:24:17 +00:00
Mark Johnston	7a364d458a	Split some checks in vm_page_activate() to make it easier to read. No functional change intended. Reviewed by: alc, kib Approved by: re (gjb) Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D17028	2018-09-10 18:59:23 +00:00
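
Illustration for fee2a2fa39 (reference counting): the message describes an atomically updated per-page counter in which one flag bit records the object's reference and a second flag bit blocks new references while mappings are removed. The following is a minimal userspace C11 sketch of that idea, not the kernel code. struct page, the REF_* constants and the page_* helpers are hypothetical names chosen for this example, and the bit positions are assumptions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: low bits count wirings, two high bits are flags. */
#define	REF_BLOCKED	0x40000000u	/* new mapped references are refused */
#define	REF_OBJREF	0x80000000u	/* the owning object holds a reference */
#define	REF_COUNT(c)	((c) & ~(REF_BLOCKED | REF_OBJREF))

struct page {
	_Atomic uint32_t ref_count;
};

/* Refuse new references, e.g. while mappings of the page are removed. */
static void
page_block_refs(struct page *p)
{
	atomic_fetch_or(&p->ref_count, REF_BLOCKED);
}

/* Allow new references again. */
static void
page_unblock_refs(struct page *p)
{
	atomic_fetch_and(&p->ref_count, ~REF_BLOCKED);
}

/* Try to take a wiring; fail if new references are currently blocked. */
static bool
page_wire_mapped(struct page *p)
{
	uint32_t old = atomic_load(&p->ref_count);

	do {
		if ((old & REF_BLOCKED) != 0)
			return (false);
		/* On failure the CAS reloads 'old' and the check repeats. */
	} while (!atomic_compare_exchange_weak(&p->ref_count, &old, old + 1));
	return (true);
}

/* Drop a wiring; return true if no wiring or object reference remains. */
static bool
page_unwire(struct page *p)
{
	uint32_t old = atomic_fetch_sub(&p->ref_count, 1);

	assert(REF_COUNT(old) > 0);
	return (((old - 1) & ~REF_BLOCKED) == 0);
}
```

Because the flag test and the increment happen in one compare-and-swap, a caller that sets the blocked bit is guaranteed that no later page_wire_mapped() call succeeds until the bit is cleared, without any external lock.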

712 Commits