Refactor ZFS ARC reclaim logic to be more VM cooperative
MFC r270861:
Ensure that ZFS ARC free memory checks include cached pages
MFC r272483:
Refactor ZFS ARC reclaim checks and limits
Sponsored by: Multiplay
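For illustration, a minimal sketch of the idea behind the free-memory
check, assuming the 10.x struct vmmeter fields v_free_count,
v_cache_count and v_free_target; the macro and function names here are
invented, not the committed code:

    #include <sys/vmmeter.h>

    /*
     * Sketch only: count cached pages as reclaimable when deciding
     * whether the ARC must shrink.  "cnt" is the 10.x global struct
     * vmmeter (renamed vm_cnt on head).
     */
    #define arc_free_pages()  (cnt.v_free_count + cnt.v_cache_count)

    static int
    arc_reclaim_needed_sketch(void)
    {
            return (arc_free_pages() < cnt.v_free_target);
    }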
Remove stray uma_mtx lock/unlock in zone_drain_wait()
Callers of zone_drain_wait(M_WAITOK) do not need to hold the uma_mtx
(and were not holding it), but we would attempt to unlock and relock
the mutex if we had to sleep because the zone was already draining.
The M_NOWAIT callers may hold the uma_mtx, but we do not sleep in that
case.
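A condensed sketch of the corrected sleep path (paraphrased from the
uma_core.c of that era, with the drain logic elided):

    static void
    zone_drain_wait(uma_zone_t zone, int waitok)
    {

            ZONE_LOCK(zone);
            while (zone->uz_flags & UMA_ZFLAG_DRAINING) {
                    if (waitok == M_NOWAIT)
                            goto out;
                    /*
                     * Sleep on the zone lock only.  The stray
                     * mtx_unlock(&uma_mtx)/mtx_lock(&uma_mtx) pair that
                     * bracketed this msleep() is gone: M_WAITOK callers
                     * never held uma_mtx in the first place.
                     */
                    msleep(zone, zone->uz_lockptr, PVM, "zonedrain", 1);
            }
            zone->uz_flags |= UMA_ZFLAG_DRAINING;
            /* ... drain the zone ... */
    out:
            ZONE_UNLOCK(zone);
    }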
Back in the days when the kernel was single threaded, testing
"vm_paging_target() > 0" was a reasonable way of determining if the
inactive queue scan met its target. However, now that other threads
can be allocating pages while the inactive queue scan is running, it's
an unreliable method. The effect of it being unreliable is that we
can start swapping out processes when we didn't intend to.
This issue has existed since the kernel was multithreaded, but the
changes to the inactive queue target in 10.0-RELEASE have made its
effects visible.
This change introduces a more direct method for determining if the
inactive queue scan met its target that is not affected by the actions
of other threads.
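A sketch of the more direct method, assuming the variable names used
in vm_pageout_scan() (control flow condensed):

    /*
     * Compute the target once, up front; the inactive queue scan
     * decrements page_shortage for every page it frees.  Whether the
     * scan met its target is then a local fact, not a re-sampling of
     * global paging state that other threads perturb.
     */
    page_shortage = vm_paging_target() + deficit;
    /* ... inactive queue scan runs, decrementing page_shortage ... */
    if (page_shortage > 0)
            vm_req_vmdaemon(VM_SWAP_NORMAL); /* only now consider swap */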
wired region. Rework the handling of unwire to do it in batch,
both at the pmap and the object level.
All commits below are by alc.
MFC r268327:
Introduce pmap_unwire().
MFC r268591:
Implement pmap_unwire() for powerpc.
MFC r268776:
Implement pmap_unwire() for arm.
MFC r268806:
pmap_unwire(9) man page.
MFC r269134:
When unwiring a region of an address space, do not assume that the
underlying physical pages are mapped by the pmap. This fixes a leak
of the wired pages on the unwiring of the region mapped with no access
allowed.
MFC r269339:
In the implementation of the new function pmap_unwire(), the call to
MOEA64_PVO_TO_PTE() must be performed before any changes are made to the
PVO. Otherwise, MOEA64_PVO_TO_PTE() will panic.
MFC r269365:
Correct a long-standing problem in moea{,64}_pvo_enter() that was revealed
by the combination of r268591 and r269134: When we attempt to add the
wired attribute to an existing mapping, moea{,64}_pvo_enter() do nothing.
(They only set the wired attribute on newly created mappings.)
MFC r269433:
Handle wiring failures in vm_map_wire() with the new functions
pmap_unwire() and vm_object_unwire().
Retire vm_fault_{un,}wire(), since they are no longer used.
MFC r269438:
Rewrite a loop in vm_map_wire() so that gcc doesn't think that the variable
"rv" is uninitialized.
MFC r269485:
Retire pmap_change_wiring().
Reviewed by: alc
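The new KPI operates on address ranges rather than single pages. A
hedged sketch of the batched backout on a wiring failure (the
pmap_unwire() and vm_object_unwire() signatures match the committed
ones; the surrounding logic is condensed):

    /*
     * Undo the wiring of a whole map entry at once instead of calling
     * the retired pmap_change_wiring() page by page.
     */
    pmap_unwire(vm_map_pmap(map), entry->start, entry->end);
    vm_object_unwire(entry->object.vm_object, entry->offset,
        entry->end - entry->start, PQ_ACTIVE);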
Implement 'fast path' for the vm page fault handler.
MFC r270387 (by alc):
Relax one of the conditions for mapping a page on the fast path.
Approved by: re (gjb)
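Conceptually, the fast path installs an already-resident, fully valid
page without entering the heavyweight fault state machine. A rough
sketch; the actual condition set is larger than shown here:

    /* Inside vm_fault(), after the object and page lookup (sketch). */
    if (m != NULL && m->valid == VM_PAGE_BITS_ALL &&
        !vm_page_busied(m)) {
            /* No allocation, no I/O, no retry loop. */
            rv = pmap_enter(vm_map_pmap(fs.map), vaddr, m, prot,
                fault_type | PMAP_ENTER_NOSLEEP, 0);
            if (rv == KERN_SUCCESS)
                    return (KERN_SUCCESS);
    }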
Merge the changes to pmap_enter(9) for sleep-less operation (requested
by flag). The ia64 pmap.c changes are a direct commit, since ia64 is
removed on head.
MFC r269368 (by alc):
Retire PVO_EXECUTABLE.
MFC r269728:
Change pmap_enter(9) interface to take flags parameter and superpage
mapping size (currently unused).
MFC r269759 (by alc):
Update the text of a KASSERT() to reflect the changes in r269728.
MFC r269822 (by alc):
Change {_,}pmap_allocpte() so that they look for the flag
PMAP_ENTER_NOSLEEP instead of M_NOWAIT/M_WAITOK when deciding whether
to sleep on page table page allocation.
MFC r270151 (by alc):
Replace KASSERT that no PV list locks are held with a conditional
unlock.
Reviewed by: alc
Approved by: re (gjb)
Sponsored by: The FreeBSD Foundation
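After r269728 the interface reads (see pmap_enter(9)):

    int pmap_enter(pmap_t pmap, vm_offset_t va, vm_page_t m,
        vm_prot_t prot, u_int flags, int8_t psind);

A hedged example of a caller that must not sleep on page table page
allocation, which used to be expressed with M_NOWAIT:

    rv = pmap_enter(pmap, va, m, VM_PROT_READ,
        VM_PROT_READ | PMAP_ENTER_NOSLEEP, 0);
    if (rv == KERN_RESOURCE_SHORTAGE) {
            /* No page table page could be allocated; retry later. */
    }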
Add OBJ_TMPFS_NODE flag.
MFC r268616:
Set the OBJ_TMPFS_NODE flag for vm_object of VREG tmpfs node.
MFC r269053:
Correct an assertion: the tmpfs vm object is always at the bottom of
the shadow chain.
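A short sketch of both pieces; the assertion message is invented, not
the committed text:

    /* Mark the vm_object backing a tmpfs VREG node. */
    vm_object_set_flag(object, OBJ_TMPFS_NODE);

    /* The corrected assertion: a tmpfs object never shadows another. */
    KASSERT(object->backing_object == NULL,
        ("tmpfs object %p is not at the bottom of the shadow chain",
        object));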
Assert that the new entry is inserted into the right location in the
map entries list, and that it does not overlap with the previous and
next entries.
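In the spirit of vm_map_entry_link(), the checks look roughly like
this (messages and exact form are mine):

    KASSERT(prev_entry->end <= new_entry->start,
        ("map %p: new entry %p overlaps prev %p",
        map, new_entry, prev_entry));
    KASSERT(new_entry->end <= prev_entry->next->start,
        ("map %p: new entry %p overlaps next %p",
        map, new_entry, prev_entry->next));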
When allocating a new bucket for a bucket zone, never take it from the
zone itself, since that will almost certainly fail. Take the next
bigger zone instead.
This situation should not happen with the original bucket zone
configuration: the "32 Bucket" zone uses "64 Bucket" and vice versa.
But if the "64 Bucket" zone lock is congested, the zone may grow its
bucket size and start biting itself.
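A sketch of the fix in bucket allocation, assuming the uma_core.c
helper bucket_zone_lookup() and the UMA_ZFLAG_BUCKET flag:

    ubz = bucket_zone_lookup(zone->uz_count);
    if (zone->uz_flags & UMA_ZFLAG_BUCKET)
            ubz++;          /* never feed a bucket zone from itself */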
With the new-and-improved vm_fault_copy_entry() (r265843), we can always
avoid soft page faults when adding write access to user wired entries in
vm_map_protect(). Previously, we only avoided the soft page fault when
the underlying pages were copy-on-write. In other words, we avoided the
page faults that might sleep on page allocation, but not the trivial
page faults to update the physical map.
On a fork allow read-only wired pages to be copy-on-write shared between
the parent and child processes. Previously, we copied these pages even
though they are read only. However, the reason for copying them is
historical and no longer exists. In recent times, vm_map_protect() has
developed the ability to copy pages when write access is added to wired
copy-on-write pages. So, in this case, copy-on-write sharing of wired
pages is not to be feared. It is not going to lead to copy-on-write
faults on wired memory.
msync(2) must return ENOMEM, not EINVAL, when the address is outside
the allowed range or when one or more pages are not mapped, as
required by The Open Group Base Specifications, Issue 7.
Sponsored by: EMC / Isilon storage division
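A small standalone userland check of the required behavior:

    #include <sys/mman.h>

    #include <err.h>
    #include <errno.h>
    #include <stdio.h>
    #include <unistd.h>

    int
    main(void)
    {
            long ps = sysconf(_SC_PAGESIZE);
            char *p;

            /* Map two pages, then punch a hole: page two is unmapped. */
            p = mmap(NULL, 2 * ps, PROT_READ | PROT_WRITE,
                MAP_ANON | MAP_PRIVATE, -1, 0);
            if (p == MAP_FAILED)
                    err(1, "mmap");
            if (munmap(p + ps, ps) == -1)
                    err(1, "munmap");

            /* msync() across the hole must fail: ENOMEM, not EINVAL. */
            if (msync(p, 2 * ps, MS_SYNC) == -1)
                    printf("errno %d (ENOMEM is %d, EINVAL is %d)\n",
                        errno, ENOMEM, EINVAL);
            return (0);
    }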
For the upgrade case in vm_fault_copy_entry(), when the entry does not
need COW and is writeable, do not create a new backing object for the entry.
MFC r265887:
Fix locking.
Prior to r254304, a separate function, vm_pageout_page_stats(), was used
to periodically update the reference status of the active pages. This
function was called, instead of vm_pageout_scan(), when memory was not
scarce. The objective was to provide up to date reference status for
active pages in case memory did become scarce and active pages needed to
be deactivated.
The active page queue scan performed by vm_pageout_page_stats() was
virtually identical to that performed by vm_pageout_scan(), and so r254304
eliminated vm_pageout_page_stats(). Instead, vm_pageout_scan() is
called with the parameter "pass" set to zero. The intention was that when
pass is zero, vm_pageout_scan() would only scan the active queue.
However, the variable page_shortage can still be greater than zero when
memory is not scarce and vm_pageout_scan() is called with pass equal to
zero. Consequently, the inactive queue may be scanned and dirty pages
laundered even though that was not intended by r254304. This revision
fixes that.
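The guard amounts to the following (paraphrasing vm_pageout_scan()):

    /*
     * A pass of zero is only the active-queue reference-bit scan;
     * it must not compute an inactive-queue target, or pages get
     * laundered when memory is not scarce.
     */
    if (pass > 0) {
            deficit = atomic_readandclear_int(&vm_pageout_deficit);
            page_shortage = vm_paging_target() + deficit;
    } else
            page_shortage = deficit = 0;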
Correctly update the count of stuck pages, "addl_page_shortage", in
vm_pageout_scan(). There were missing increments in two less common
cases.
Don't conflate the count of stuck pages and the pageout deficit provided
by vm_page_alloc{,_contig}().
Handle held pages consistently in the inactive queue scan. In the more
common case, we did not move the page to the tail of the queue,
whereas in the less common case we did. There's no particular reason
to move the page in the less common case, so stop doing so.
Perform the calculation of the page shortage for the active queue scan a
little earlier, before the active queue lock is acquired. The correctness
of this calculation doesn't depend on the active queue lock being held.
Eliminate a redundant variable, "pcount". Use the more descriptive
variable, "maxscan", in its place.
Apply a few nearby style fixes, e.g., eliminate stray whitespace and
excess parentheses.
Fix two issues with /dev/mem access on amd64, both causing kernel page
faults.
First, accesses to the direct map region must be checked against the
limit up to which the direct map is instantiated.
Second, for accesses to the kernel map, use a new thread private flag
TDP_DEVMEMIO, which instructs vm_fault() to return error when fault
happens on the MAP_ENTRY_NOFAULT entry, instead of panicking.
MFC r263498:
Add change forgotten in r263475. Make dmaplimit accessible outside
amd64/pmap.c.
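A condensed sketch of both checks in the amd64 memrw() path (error
handling and loop structure elided):

    /* 1. Bound direct map accesses by the instantiated part of it. */
    if (pa >= dmaplimit)
            return (EFAULT);
    error = uiomove((void *)PHYS_TO_DMAP(pa), c, uio);

    /* 2. For kernel map accesses, fail faults instead of panicking. */
    curthread->td_pflags |= TDP_DEVMEMIO;
    error = uiomove((void *)va, c, uio);
    curthread->td_pflags &= ~TDP_DEVMEMIO;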