freebsd-dev

Author	SHA1	Message	Date
Konstantin Belousov	ba7c64d17b	Typo in comment.	2016-01-24 13:38:41 +00:00
John Baldwin	8a4dc40ff4	Various cleanups to the main function for AIO kernel processes: - Pull the vmspace logic out into helper functions and reduce duplication. Operations on the vmspace are all isolated to vm_map.c, but it now exports a new 'vmspace_switch_aio' for use by AIO kernel processes. - When an AIO kernel process wants to exit, break out of the main loop and perform cleanup after the loop end. This reduces a lot of indentation and allows cleanup to more closely mirror setup actions before the loop starts. - Convert a DIAGNOSTIC to KASSERT(). - Replace mycp with more typical 'p'. Reviewed by: kib Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D4990	2016-01-19 21:37:51 +00:00
Alan Cox	477bffbe4d	A fix to r292469: Iterate over the physical segments in descending rather than ascending order in vm_phys_alloc_contig() so that, for example, a sequence of contigmalloc(low=0, high=4GB) calls doesn't exhaust the supply of low physical memory resulting in a later contigmalloc(low=0, high=1MB) failure. Reported by: cy Tested by: cy Sponsored by: EMC / Isilon Storage Division	2016-01-16 04:41:40 +00:00
Adrian Chadd	54de56f3b2	Fix the domain iterator to not try the first-touch / fixed domain more than once when doing round-robin. This led to a panic because the iterator was trying the same domain twice and not trying one of the other domains. Reported by: pho Tested by: pho	2016-01-10 17:53:43 +00:00
Konstantin Belousov	ff64a90ed9	Add missed relpbuf() for a smallfs page-in. Reported by: Shawn Webb Tested by: pho Sponsored by: The FreeBSD Foundation	2015-12-27 14:42:39 +00:00
Jonathan T. Looney	54503a13d8	Add a safety net to reclaim mbufs when one of the mbuf zones becomes exhausted. It is possible for a bug in the code (or, theoretically, even unusual network conditions) to exhaust all possible mbufs or mbuf clusters. When this occurs, things can grind to a halt fairly quickly. However, we currently do not call mb_reclaim() unless the entire system is experiencing a low-memory condition. While it is best to try to prevent exhaustion of one of the mbuf zones, it would also be useful to have a mechanism to attempt to recover from these situations by freeing "expendable" mbufs. This patch makes two changes: a) The patch adds a generic API to the UMA zone allocator to set a function that should be called when an allocation fails because the zone limit has been reached. Because of the way this function can be called, it really should do minimal work. b) The patch uses this API to try to free mbufs when an allocation fails from one of the mbuf zones because the zone limit has been reached. The function schedules a callout to run mb_reclaim(). Differential Revision: https://reviews.freebsd.org/D3864 Reviewed by: gnn Comments by: rrs, glebius MFC after: 2 weeks Sponsored by: Juniper Networks	2015-12-20 02:05:33 +00:00
Alan Cox	c869e67208	Introduce a new mechanism for relocating virtual pages to a new physical address and use this mechanism when: 1. kmem_alloc_{attr,contig}() can't find suitable free pages in the physical memory allocator's free page lists. This replaces the long-standing approach of scanning the active and inactive queues, converting clean pages into PG_CACHED pages and laundering dirty pages. In contrast, the new mechanism does not use PG_CACHED pages nor does it trigger a large number of I/O operations. 2. on 32-bit MIPS processors, uma_small_alloc() and the pmap can't find free pages in the physical memory allocator's free page lists that are covered by the direct map. Tested by: adrian 3. ttm_bo_global_init() and ttm_vm_page_alloc_dma32() can't find suitable free pages in the physical memory allocator's free page lists. In the coming months, I expect that this new mechanism will be applied in other places. For example, balloon drivers should use relocation to minimize fragmentation of the guest physical address space. Make vm_phys_alloc_contig() a little smarter (and more efficient in some cases). Specifically, use vm_phys_segs[] earlier to avoid scanning free page lists that can't possibly contain suitable pages. Reviewed by: kib, markj Glanced at: jhb Discussed with: jeff Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D4444	2015-12-19 18:42:50 +00:00
Conrad Meyer	8170d6e52b	vm_page_replace: add wrapper to KASSERT about old page It turns out the callers of vm_page_replace know exactly which page they are replacing and would like to assert about it. Change those from hard panics to KASSERTs, and provide them with a wrapper so they don't have to deal with warnings from an INVARIANTS-dependent dead store of the return value of vm_page_replace. Submitted by: Ryan Libby <rlibby@gmail.com> Reviewed by: alc, kib (earlier version) Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D4497	2015-12-17 17:48:57 +00:00
Conrad Meyer	dc62d55929	vm_page.h: page busy macro fixups Minor changes to: - delete extraneous trailing semicolons from macro definitions, and - correct spelling of "busying" in panic messages Submitted by: Ryan Libby <rlibby@gmail.com> Reviewed by: alc, kib Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D4577	2015-12-16 23:23:12 +00:00
Gleb Smirnoff	b0cd20172d	A change to KPI of vm_pager_get_pages() and underlying VOP_GETPAGES(). o With new KPI consumers can request contiguous ranges of pages, and unlike before, all pages will be kept busied on return, like it was done before with the 'reqpage' only. Now the reqpage goes away. With new interface it is easier to implement code protected from race conditions. Such arrayed requests for now should be preceded by a call to vm_pager_haspage() to make sure that request is possible. This could be improved later, making vm_pager_haspage() obsolete. Strengthening the promises on the busyness of the array of pages allows us to remove such hacks as swp_pager_free_nrpage() and vm_pager_free_nonreq(). o New KPI accepts two integer pointers that may optionally point at values for read ahead and read behind, that a pager may do, if it can. These pages are completely owned by pager, and not controlled by the caller. This shifts the UFS-specific readahead logic from vm_fault.c, which should be file system agnostic, into vnode_pager.c. It also removes one VOP_BMAP() request per hard fault. Discussed with: kib, alc, jeff, scottl Sponsored by: Nginx, Inc. Sponsored by: Netflix	2015-12-16 21:30:45 +00:00
Mark Johnston	d9e2e68d38	Don't make assertions about td_critnest when the scheduler is stopped. A panicking thread always executes with a critical section held, so any attempt to allocate or free memory while dumping will otherwise cause a second panic. This can occur, for example, if xpt_polled_action() completes non-dump I/O that was pending at the time of the panic. The fact that this can occur is itself a bug, but asserting in this case does little but reduce the reliability of kernel dumps. Suggested by: kib Reported by: pho	2015-12-11 20:05:07 +00:00
Conrad Meyer	5e09bdc821	vm_page_replace: remove redundant radix lookup Remove redundant lookup of the old page from vm_page_replace. Verification that the old page exists is already done by vm_radix_replace. Submitted by: Ryan Libby <rlibby@gmail.com> Reviewed by: alc, kib Sponsored by: EMC / Isilon Storage Division Follow-up to: https://reviews.freebsd.org/D4326 Differential Revision: https://reviews.freebsd.org/D4471	2015-12-10 22:57:27 +00:00
Conrad Meyer	6fee422ed5	vm_fault_hold: handle vm_page_rename failure On vm_page_rename failure, fix a missing object unlock and a double free of a page. First remove the old page, then rename the other page into first_object, then free the old page. This avoids the problem on rename failure. This is a little ugly but seems to be the most straightforward solution. Tested with: $ sysctl debug.fail_point.uma_zalloc_arg="1%return" $ kyua test -k /usr/tests/sys/Kyuafile Submitted by: Ryan Libby <rlibby@gmail.com> Reviewed by: kib Seen by: alc Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D4326	2015-12-06 17:46:12 +00:00
Conrad Meyer	4cc8daf782	Pull vm_object_scan_all_shadowed out of vm_object_backing_scan These two functions were largely unrelated, they just used the same loop logic to walk through a backing object's memq. Pull out the all_shadowed test as its own function and eliminate OBSC_TEST_ALL_SHADOWED. Rename vm_object_backing_scan to vm_object_collapse_scan. No functional change. Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D4335	2015-12-03 17:21:10 +00:00
Konstantin Belousov	99a1570a25	r221714 fixed the situation when the collapse scan improperly handled invalid (busy) page supposedly inserted by the vm_fault(), in the OBSC_COLLAPSE_NOWAIT case. As a continuation to r221714, fix a case when invalid page is found by the object scan in OBSC_COLLAPSE_WAIT case as well. But, since this is waitable scan, we should wait for the termination of the busy state and restart from the beginning of the backing object's page queue. [] Do not free the shadow page swap space when the parent page is invalid, otherwise this action potentially corrupts user data. Combine all instances of the collapse scan sleep code fragments into the new helper vm_object_backing_scan_wait(). Improve style compliance and comments. Change the return type of vm_object_backing_scan() to bool. Initial submission by: cem, https://reviews.freebsd.org/D4103 [] Reviewed by: alc, cem Tested by: cem Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D4146	2015-12-01 09:06:09 +00:00
Konstantin Belousov	b89def80f5	Minor cleanup. Systematically use ANSI C functions definitions. Correct type of the flags argument to the dev_pager_putpages() function. Use vm_pager_free_nonreq(). Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-11-29 11:37:25 +00:00
Konstantin Belousov	2eb2f0d5e3	In vm_pageout_grow_cache(), do not re-try the inactive queue when active queue scan initiated write. Re-trying from the inactive queue when doing active scan makes the loop never end if number of domains is greater than 1 and inactive or active scan cannot reach the target. Reported and tested by: Andrew Gallatin <gallatin@netflix.com> Reviewed by: alc Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2015-11-27 19:43:36 +00:00
Alan Cox	67b7e4345e	Correct an error in vm_reserv_reclaim_contig(). In the highly unusual case that the reservation contained "low", the starting position in the popmap for the free page search was incorrectly calculated. The most likely (and visible) symptom of this error was the assertion failure, "vm_reserv_reclaim_contig: pa is too low".	2015-11-26 19:12:18 +00:00
Konstantin Belousov	9af50b0126	Record proper commit message for r291157. The r289895 revision did not account for the block containing the requested page, when calculating the run of pages. Include the pages before/after the requested page, that fit into the reqblock, into the calculation. Noted by: glebius Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-11-22 09:50:13 +00:00
Konstantin Belousov	4586820a07	Noted by: glebius Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-11-22 09:48:03 +00:00
Mark Johnston	7672ca059a	Remove unneeded includes of opt_kdtrace.h. As of r258541, KDTRACE_HOOKS is defined in opt_global.h, so opt_kdtrace.h is not needed when defining SDT(9) probes.	2015-11-22 02:01:01 +00:00
Gleb Smirnoff	09c837b897	Remove remnants of the old NFS from vnode pager. Reviewed by: kib Sponsored by: Netflix	2015-11-20 23:52:27 +00:00
Jonathan T. Looney	1067a2ba68	Consistently enforce the restriction against calling malloc/free when in a critical section. uma_zalloc_arg()/uma_zalloc_free() may acquire a sleepable lock on the zone. The malloc() family of functions may call uma_zalloc_arg() or uma_zalloc_free(). The malloc(9) man page currently claims that free() will never sleep. It also implies that the malloc() family of functions will not sleep when called with M_NOWAIT. However, it is more correct to say that these functions will not sleep indefinitely. Indeed, they may acquire a sleepable lock. However, a developer may overlook this restriction because the WITNESS check that catches attempts to call the malloc() family of functions within a critical section is inconsistently applied. This change clarifies the language of the malloc(9) man page to clarify the restriction against calling the malloc() family of functions while in a critical section or holding a spin lock. It also adds KASSERTs at appropriate points to make the enforcement of this restriction more consistent. PR: 204633 Differential Revision: https://reviews.freebsd.org/D4197 Reviewed by: markj Approved by: gnn (mentor) Sponsored by: Juniper Networks	2015-11-19 14:04:53 +00:00
Konstantin Belousov	76386c7ecd	Rework the test which raises OOM condition. Right now, the code checks for the swap space consumption plus checks that the amount of the free pages exceeds some limit, in case pagedaemon did not cope with the page shortage in one of the late passes. This is wrong because it does not account for the presence of the reclaimable pages in the queues which are not selectable for reclaim immediately. E.g., on the swap-less systems, large active queue easily triggered OOM. Instead, only raise OOM when pagedaemon is unable to produce a free page in several back-to-back passes. Track the failed passes per pagedaemon thread. The number of passes to trigger OOM was selected empirically and tested both on small (32M-64M i386 VM) and large (32G amd64) configurations. If the specifics of the load require tuning, sysctl vm.pageout_oom_seq sets the number of back-to-back passes which must fail before OOM is raised. Each pass takes half a second. The lower the value, the more sensitive the pagedaemon is to the page shortage. In future, some heuristic to calculate the value of the tunable might be designed based on the system configuration and load. But before it can be done, the i/o system must be fixed to reliably time-out pagedaemon writes, even if waiting for the memory to proceed. Then, code can account for the in-flight page-outs and postpone OOM until all of them finished, which should reduce the need for tuning. Right now, ignoring the in-flight writes and the counter allows to break deadlocks due to write path doing sleepable memory allocations. Reported by: Dmitry Sivachenko, bde, many others Tested by: pho, bde, tuexen (arm) Reviewed by: alc Discussed with: bde, imp Sponsored by: The FreeBSD Foundation MFC after: 3 weeks	2015-11-16 06:26:26 +00:00
Konstantin Belousov	3949873f7a	Do not use vmspace_resident_count() for the OOM process selection. Residency count tracks the number of pte entries installed into the current pmap, which does not reflect the consumption of the physical memory by the address map. Due to several mechanisms like pv entries reclamation, copy on write etc. the resident pte entries count may be much less than the amount of physical memory kept by the process. Provide the OOM-specific vm_pageout_oom_pagecount() function which estimates the amount of reclaimable memory which could be stolen if the process is killed. Reported and tested by: pho Reviewed by: alc Comments text by: alc Sponsored by: The FreeBSD Foundation MFC after: 3 weeks	2015-11-16 06:02:11 +00:00
Konstantin Belousov	b98acc0a1b	VM daemon works in parallel with the pagedaemon threads, and, among other actions, swaps out kernel stacks of the processes. On the other hand, the current OOM logic, which selects a process to kill in the critical condition, skips a process with a swapped-out thread. Under some loads, this results in the big(gest) process being ignored by OOM. Do not skip a process which has inhibited thread due to the swap-out, in the OOM selection loop. Note that killing such process requires the thread stack page-in, but sometimes this is the only way to recover. Reported and tested by: pho Reviewed by: alc Sponsored by: The FreeBSD Foundation MFC after: 3 weeks	2015-11-16 05:52:04 +00:00
John Baldwin	645743ea99	Export various helper variables describing the layout and size of certain kernel structures for use by debuggers. This mostly aids in examining cores from a kernel without debug symbols as a debugger can infer these values if debug symbols are available. One set of variables describes the layout of 'struct linker_file' to walk the list of loaded kernel modules. A second set of variables describes the layout of 'struct proc' and 'struct thread' to walk the list of processes in the kernel and the threads in each process. The 'pcb_size' variable is used to index into the stoppcbs[] array. The 'vm_maxuser_address' is used to distinguish kernel virtual addresses from user addresses. This doesn't have to be perfect, and 'vm_maxuser_address' is a cheap and simple way to differentiate kernel pointers from simple values like TIDs and PIDs. While here, annotate the fields in struct pcb used by kgdb on amd64 and i386 to note that their ABI should be preserved. Annotations for other platforms will be added in the future. Reviewed by: kib MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D3773	2015-11-12 22:00:59 +00:00
Mark Johnston	7e78597f04	Ensure that deactivated pages that are not expected to be reused are reclaimed in FIFO order by the pagedaemon. Previously we would enqueue such pages at the head of the inactive queue, yielding a LIFO reclaim order. Reviewed by: alc MFC after: 2 weeks Sponsored by: EMC / Isilon Storage Division	2015-11-08 01:36:18 +00:00
Konstantin Belousov	eac91e326a	Reduce the amount of calls to VOP_BMAP() made from the local vnode pager. It is enough to execute VOP_BMAP() once to obtain both the disk block address for the requested page, and the before/after limits for the contiguous run. The clipping of the vm_page_t array passed to the vnode_pager_generic_getpages() and the disk address for the first page in the clipped array can be deduced from the call results. While there, remove some noise (like if (1) {...}) and adjust nearby code. Reviewed by: alc Discussed with: glebius Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 3 weeks	2015-10-24 21:59:22 +00:00
Jason A. Harmening	7c989c156f	Fix capitalization	2015-10-23 12:06:06 +00:00
Jason A. Harmening	a50730587b	Remove unclear comment about address truncation in busdma. Add (hopefully much clearer) comment at declaration of PHYS_TO_VM_PAGE(). Noted by: avg	2015-10-23 12:03:25 +00:00
Konstantin Belousov	69b8585e79	Only marker is guaranteed to be present on the queue after the relock in vm_pageout_fallback_object_lock() and vm_pageout_page_lock(). The check for the m->queue == queue assumes that the page does belong to a queue. Modify the 'unchanged' calculation by dereferencing the marker tailq pointers, which is known to belong to the queue. Since for a page m linked to the queue, m->queue must be equal to the queue index, assert this instead of checking. In collaboration with: alc Sponsored by: The FreeBSD Foundation (kib) MFC after: 2 weeks	2015-10-18 09:33:28 +00:00
Konstantin Belousov	8748f58cde	Revert r289302, invalid pages can be queued, e.g. by vfs_vmio_unwire(). Found by: alc Tested by: pho Sponsored by: The FreeBSD Foundation	2015-10-15 19:07:38 +00:00
Konstantin Belousov	12a73f207a	Invalid pages should not appear on the inactive queue. Change the check into an assertion. Reviewed by: alc Tested by: pho Sponsored by: The FreeBSD Foundation	2015-10-14 09:03:32 +00:00
Jeff Roberson	21fae96123	Parallelize the buffer cache and rewrite getnewbuf(). This results in an 8x performance improvement in a micro benchmark on a 4 socket machine. - Get buffer headers from a per-cpu uma cache that sits in front of the free queue. - Use a per-cpu quantum cache in vmem to eliminate contention for kva. - Use multiple clean queues according to buffer cache size to eliminate clean queue lock contention. - Introduce a bufspace daemon that attempts to prevent getnewbuf() callers from blocking or doing direct recycling. - Close some bufspace allocation races that could lead to endless recycling. - Further the transition to a more modern style of small functions grouped by prefix in order to improve growing complexity. Sponsored by: EMC / Isilon Reviewed by: kib Tested by: pho	2015-10-14 02:10:07 +00:00
Alan Cox	e595970add	Exploit r288122 to avoid pointlessly enqueueing a page that is about to be freed. Submitted by: kmacy Differential Revision: https://reviews.freebsd.org/D1674	2015-10-09 03:38:58 +00:00
Alan Cox	27e9ed8a5a	Exploit r288122 to address a cosmetic issue. Pages belonging to either the kernel or kmem object can't be paged out. Since they can't be paged out, they are never enqueued in a paging queue. Nonetheless, passing PQ_INACTIVE to vm_page_unwire() in kmem_unback() creates the appearance that these pages are being enqueued in the inactive queue. As of r288122, we can avoid giving this false impression by passing PQ_NONE. Submitted by: kmacy Differential Revision: https://reviews.freebsd.org/D1674	2015-10-06 05:49:00 +00:00
Warner Losh	d635a37ffa	Mark swap_pager_putpages static at its definition. It was already static at its declaration. Remove needless swapdev_strategy forward declaration. MFC After: 3 days	2015-10-05 21:29:17 +00:00
Alan Cox	bc7275964c	Reduce the scope of a variable to the only file where it is used.	2015-10-03 19:27:52 +00:00
Mark Johnston	3138cd3670	As a step towards the elimination of PG_CACHED pages, rework the handling of POSIX_FADV_DONTNEED so that it causes the backing pages to be moved to the head of the inactive queue instead of being cached. This affects the implementation of POSIX_FADV_NOREUSE as well, since it works by applying POSIX_FADV_DONTNEED to file ranges after they have been read or written. At that point the corresponding buffers may still be dirty, so the previous implementation would coalesce successive ranges and apply POSIX_FADV_DONTNEED to the result, ensuring that pages backing the dirty buffers would eventually be cached. To preserve this behaviour in an efficient manner, this change adds a new buf flag, B_NOREUSE, which causes the pages backing a VMIO buf to be placed at the head of the inactive queue when the buf is released. POSIX_FADV_NOREUSE then works by setting this flag in bufs that underlie the specified range. Reviewed by: alc, kib Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D3726	2015-09-30 23:06:29 +00:00
Alan Cox	9e829b2272	The conversion of kmem_alloc_attr() from operating on a vm map to a vmem arena in r254025 introduced a bug in the case when an allocation is only partially successful. Specifically, the vm object lock was not being acquired before freeing the allocated pages. To address this bug, replace the existing code by a call to kmem_unback(). Change the type of a variable in kmem_alloc_attr() so that an allocation of two or more gigabytes won't fail. Replace the error handling code in kmem_back() by a call to kmem_unback(). Reviewed by: kib (an earlier version) MFC after: 1 week Sponsored by: EMC / Isilon Storage Division	2015-09-26 22:57:10 +00:00
Alan Cox	087a613247	Exploit r288122 to address a cosmetic issue. Since the pages allocated by noobj_alloc() don't belong to a vm object, they can't be paged out. Since they can't be paged out, they are never enqueued in a paging queue. Nonetheless, passing PQ_INACTIVE to vm_page_unwire() creates the appearance that these pages are being enqueued in the inactive queue. As of r288122, we can avoid giving this false impression by passing PQ_NONE. Submitted by: kmacy Differential Revision: https://reviews.freebsd.org/D1674	2015-09-26 17:45:10 +00:00
Alan Cox	15aaea7892	Change vm_page_unwire() such that it (1) accepts PQ_NONE as the specified queue and (2) returns a Boolean indicating whether the page's wire count transitioned to zero. Exploit this change in vfs_vmio_release() to avoid pointlessly enqueueing a page that is about to be freed. (An earlier version of this change was developed by attilio@ and kmacy@. Any errors in this version are my own.) Reviewed by: kib Sponsored by: EMC / Isilon Storage Division	2015-09-22 18:16:52 +00:00
Alan Cox	d9347bca9a	Correct a non-fatal error in vm_pageout_worker(). vm_pageout_worker() should not assume that vm_pages_needed will remain set while it sleeps. Other threads can clear vm_pages_needed by performing a sufficient number of vm_page_free() calls, e.g., process termination. The effect of this error was that vm_pageout_worker() would free and/or launder pages when, in fact, there was no shortage of free pages. Rewrite a nearby comment to describe all of the possible cases and not just the most common case. The problem being that the comment made the most common case seem like the only case. Reviewed by: kib MFC after: 1 week Sponsored by: EMC / Isilon Storage Division	2015-09-20 19:20:03 +00:00
Alan Cox	c9af644e5c	Eliminate (many) unnecessary calls to pmap_remove_all(). Pages from objects with a reference count of zero can't possibly be mapped, so there is never a need for vm_page_set_invalid() to call pmap_remove_all() on them. Reviewed by: kib MFC after: 1 week Sponsored by: EMC / Isilon Storage Division	2015-09-17 22:28:38 +00:00
Mark Johnston	d73ce4c698	Remove the v_cache_min and v_cache_max sysctls. They are unused and have no effect. Reviewed by: alc Sponsored by: EMC / Isilon Storage Division	2015-09-11 03:00:20 +00:00
Konstantin Belousov	b8db977617	Remove a check which caused spurious SIGSEGV on usermode access to the mapped address without valid pte installed, when parallel wiring of the entry happens. The entry must be copy on write. If entry is COW but was already copied, and parallel wiring set MAP_ENTRY_IN_TRANSITION, vm_fault() would sleep waiting for the MAP_ENTRY_IN_TRANSITION flag to clear. After that, the fault handler is restarted and vm_map_lookup() or vm_map_lookup_locked() trip over the check. Note that this is a race: if the address is accessed after the wiring is done, the entry does not fault at all. There is no reason in the current kernel to disallow write access to the COW wired entry if the entry permissions allow it. Initially this was done in r24666, since that kernel did not support proper copy-on-write for wired text, which was fixed in r199869. The r251901 revision re-introduced the r24666 fix for the current VM. Note that write access must clear MAP_ENTRY_NEEDS_COPY entry flag by performing COW. In reverse, when MAP_ENTRY_NEEDS_COPY is set in vmspace_fork(), the MAP_ENTRY_USER_WIRED flag is cleared. Put the assert stating the invariant, instead of returning the error. Reported and debugging help by: peter Reviewed by: alc Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-09-09 06:19:33 +00:00
Warner Losh	9e3e3fe5b3	The swap pager is compatible with direct dispatch. It does its own locking and doesn't sleep. Flag the consumer we create as such. In addition, decrement the in flight index when we have an out of memory error after having incremented it previously. This would have prevented swapoff from working if the swap pager ever hit a resource shortage trying to swap out something (the swap in path always waits for a bio, so won't have this issue). Simplify the close logic by abandoning the use of private and initializing the index to 1 and dropping that reference when we previously set private. Also, set sw_id only while sw_dev_mtx is held. This should only affect swapping to a vnode, as opposed to a geom whose close always sets it to NULL with sw_dev_mtx held. Differential Review: https://reviews.freebsd.org/D3547	2015-09-08 17:47:56 +00:00
Alan Cox	27a9fb2fc2	To simplify upcoming changes to the inactive queue scan, change the code so that there is only one place where pages are freed and only one place where pages are moved to the tail of the queue. Reviewed by: kib Sponsored by: EMC / Isilon Storage Division	2015-09-08 04:18:57 +00:00
Alan Cox	960810ccea	Eliminate pointless requeueing of pages from terminated objects. These pages will have left the inactive queue before the page daemon performs its next scan. Also, ignore references to pages from terminated objects. This allows the clean pages to be freed a little sooner. Move some comments to their proper place, i.e., next to the code that they describe, and update other nearby comments. Reviewed by: kib Sponsored by: EMC / Isilon Storage Division	2015-09-05 17:34:49 +00:00

3439 Commits