The arch required two-pages alignment due to single TLB entry caching
two consequtive mappings.
Reviewed by: imp
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D33763
Move an assignment back to where it was before, to turn the
defined-but-not-used error back into a set-but-not-used warning.
Fixes: 01e115ab83a4 vm_phys: #include vm_extern
Arm64 and powerpc don't include vm_extern.h indirectly in vm_phys.c, which
means that for the sake of those architectures, it must be included explicitly.
Also, fix a set-unused warning that jenkins also found.
Reported by: Jenkins
Fixes: c606ab59e7f9 vm_extern: use standard address checkers everywhere
Define simple functions for alignment and boundary checks and use them
everywhere instead of having slightly different implementations
scattered about. Define them in vm_extern.h and use them where
possible where vm_extern.h is included.
Reviewed by: kib, markj
Differential Revision: https://reviews.freebsd.org/D33685
It is only called in the file that defines it, so make it static and
remove the declaration from the header.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D33688
Handle specially the boundary==0 case of vm_reserv_reclaim_config,
by turning off boundary adjustment in that case.
Reviewed by: alc
Tested by: pho, madpilot
fb38b29b5609 (page_alloc_br) vm_page: Remove extra test, dup code from page alloc
should have moved a comment block when it moved the function call that followed it.
Move the comment block now.
Function vm_reserv_reclaim_contig breaks a reservation with enough
free space to satisfy an allocation request and returns the free space
to the buddy allocator. Change the function to allocate the request
memory from the reservation before breaking it, and return that memory
to the caller. That avoids a second call to the buddy allocator and
guarantees successful allocation after breaking the reservation, where
that success is not currently guaranteed.
Reviewed by: alc, kib (previous version)
Differential Revision: https://reviews.freebsd.org/D33644
Fix a very recent change that introduced a page accounting error in
case of a reserveration being broken.
Reviewed by: alc
Fixes: fb38b29b5609 (page_alloc_br) vm_page: Remove extra test, dup code from page alloc
Differential Revision: https://reviews.freebsd.org/D33645
Extract code common to functions vm_page_alloc_contig_domain and
vm_page_alloc_noobj_contig_domain into a new function. Do so in a way
that eliminates a bound-to-fail reservation test after a reservation
is broken by a call from vm_page_alloc_contig_domain.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D33551
Summary:
One was required to press a key to continue after every 18 lines of
output. This requirement had been in the "show vmopag" command since it
was introduced, which was many years before paging was added to DDB.
With paging, this explict key check is no longer necessary.
Obtained from: Juniper Networks, Inc.
MFC after: 1 week
Test Plan:
Run "show vmopag" from db> prompt and see that it does not need additional
keypresses other than the ones needed for the pager.
Subscribers: imp, #contributor_reviews_base
Differential Revision: https://reviews.freebsd.org/D33550
Commit 867c27c23a5c modified the NFS client so that
it does IO_APPEND writes directly to the NFS server,
bypassing the buffer cache. However, this could result
in stale data in client pages when the file is mmap(2)'d.
As such, the NFS client needs to call is_object_active()
to check if the file is mmap(2)'d.
This patch renames is_object_active() to vm_object_is_active(),
moves it to sys/vm/vm_object.c and makes it global, so that
the NFS client can call it in a future commit.
Reviewed by: kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D33520
Function vm_reserv_test_contig has incorrectly used its alignment
and boundary parameters to find a well-positioned range of empty pages
in a reservation. Consequently, a reservation could be broken
mistakenly when it was unable to provide a satisfactory set of pages.
Rename the function, correct the errors, and add assertions to detect
the error in case it appears again.
Reviewed by: alc, markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D33344
vm_map_wire() works by calling vm_fault(VM_FAULT_WIRE) on each page in
the rage. (For largepage mappings, it calls vm_fault() once per large
page.)
A pager's populate method may return more than one page to be mapped.
If VM_FAULT_WIRE is also specified, we'd wire each page in the run, not
just the fault page. Consider an object with two pages mapped in a
vm_map_entry, and suppose vm_map_wire() is called on the entry. Then,
the first vm_fault() would allocate and wire both pages, and the second
would encounter a valid page upon lookup and wire it again in the
regular fault handler. So the second page is wired twice and will be
leaked when the object is destroyed.
Fix the problem by modify vm_fault_populate() to wire only the fault
page. Also modify the error handler for pmap_enter(psind=1) to not test
fs->wired, since it must be false.
PR: 260347
Reviewed by: alc, kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33416
In vm_reserv_init, set all the marker popmap bits in vm_reserv_init,
and not just the bits of the first popmap entry.
Reviewed by: markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D33258
A page must not become invalid while vm_fault_soft_fast() is attempting
to map unbusied pages for reading.
Note that all callers hold the object write lock already, and
vm_page_set_invalid() asserts the object write lock.
Reviewed by: kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33250
The flag requests skipping the heuristic which tries to avoid leaving
system with more allocated memory than available from RAM and remanining
swap.
Reviewed by: markj
Discussed with: alc
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D33165
For compatibility, add a placeholder pointer to the start of the
added struct swapoff_new_args, and use it to distinguish old vs. new
style of syscall invocation.
Reviewed by: markj
Discussed with: alc
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D33165
When swap is turned off due to system shutdown or reboot, ignore the
check. Problem is that the check is not accurate by any means, free
page count can legitimately be low while system still able to page in
everything from the swap. Then, we turn swap off if swapping on
real file or some non-standard geom provider, and typically panic
when system appears to actually need to unavailable page.
For syscall, it is better to be safe than sorry.
Reported and tested by: peterj
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D33147
VOP_STRATEGY() requires locked vnode. Note that we lock the swap vnode
while pages are busy, but this would only cause real LoR if pages belong
to the swap vnode, which must not be the case for correct use.
Reported and tested by: peterj
Reviewed by: markj
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D33119
to cover VOP_GETATTR() call in sys_swapon(). Move locking from inside
swapongeom() and swaponvp() into sys_swapon().
Reported by and tested by: peterj
Reviewed by: markj
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D33119
Rather than overloading the meanings of the Mach statuses, introduce a
new set for use internally in the fault code. This makes the control
flow easier to follow and provides some extra error checking when a
fault status variable is used in a switch statement.
vm_fault_lookup() and vm_fault_relookup() continue to use Mach statuses
for now, as there isn't much benefit to converting them and they
effectively pass through a status from vm_map_lookup().
Obtained from: jeff (object_concurrency patches)
Reviewed by: kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D33017
This makes it easier to factor out pieces of vm_fault(). No functional
change intended.
Obtained from: jeff (object_concurrency patches)
Reviewed by: kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D33016
When constructing the set of dumpable pages, use the bitset provided by
the state argument, rather than assuming vm_page_dump invariably. For
normal kernel minidumps this will be a pointer to vm_page_dump, but when
dumping the live system it will not.
To do this, the functions in vm_dumpset.h are extended to accept the
desired bitset as an argument. Note that this provided bitset is assumed
to be derived from vm_page_dump, and therefore has the same size.
Reviewed by: kib, markj, jhb
MFC after: 2 weeks
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D31992
4.3 BSD's mmap took an int len and long pos. Reject negative lengths
and in freebsd32 sign-extend pos correctly rather than mis-handling
negative positions as large positive ones.
Reviewed by: kib
This will be used to break a deadlock in ZFS between the per-mountpoint
teardown lock and page busy locks. In particular, when purging data
from the page cache during dataset rollback, we want to avoid blocking
on the busy state of invalid pages since the busying thread may be
blocked on the teardown lock in zfs_getpages().
Add a helper, vn_pages_remove_valid(), for use by filesystems. Bump
__FreeBSD_version so that the OpenZFS port can make use of the new
helper.
PR: 258208
Reviewed by: avg, kib, sef
Tested by: pho (part of a larger patch)
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32931
They are unused today and cannot be safely used in the face of unlocked
lookup, in which pages may be busied without the object lock held.
Obtained from: jeff (object_concurrency patches)
Reviewed by: kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D32948
- Modify vm_page_busy_sleep() and vm_page_busy_sleep_unlocked() to take
a VM_ALLOC_* flag indicating whether to sleep on shared-busy, and fix
up callers.
- Modify vm_page_busy_sleep() to return a status indicating whether the
object lock was dropped, and fix up callers.
- Convert callers of vm_page_sleep_if_busy() to use vm_page_busy_sleep()
instead.
- Remove vm_page_sleep_if_(x)busy().
No functional change intended.
Obtained from: jeff (object_concurrency patches)
Reviewed by: kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D32947
We added _NORECLAIM to request that kmem_alloc_contig_pages() not spend
time scanning physical memory for candidates to reclaim. In some
situations the scanning can induce large amounts of undesirable latency,
and it's less important that the request be satisfied than it is that we
not spend many milliseconds scanning.
The problem extends to vm_reserv_reclaim_contig(), which unlike
vm_reserv_reclaim() may have to scan the entire list of partially
populated reservations. Use VM_ALLOC_NORECLAIM to request that this
scan not be executed.[1]
As a side effect, this fixes a regression in 02fb0585e7b3 ("vm_page:
Drop handling of VM_ALLOC_NOOBJ in vm_page_alloc_contig_domain()")
where VM_ALLOC_CONTIG was not included in VPAC_FLAGS or VPANC_FLAGS even
though it is not masked by kmem_alloc_contig_pages().[2]
Reported by: gallatin [1], glebius [2]
Reviewed by: alc, glebius, kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32899
Kegs with no items reserved have uk_reserve = 0. So the check
keg->uk_reserve >= dom->ud_free_items will be true once all slabs are
depleted. Then, rather than go and allocate a fresh slab, we return to
the cache layer.
The intent was to do this only when the keg actually has a reserve, so
modify the check to verify this first. Another approach would be to
make uk_reserve signed and set it to -1 until uma_zone_reserve() is
called, but this requires a few casts elsewhere.
Fixes: 1b2dcc8c54a8 ("uma: Avoid depleting keg reserves when filling a bucket")
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32516
M_USE_RESERVE is used in a couple of places in the VM to avoid unbounded
recursion when the direct map is not available, as is the case on 32-bit
platforms or when certain kernel sanitizers (KASAN and KMSAN) are
enabled. For example, to allocate KVA, the kernel might allocate a
kernel map entry, which might require a new slab, which requires KVA.
For these zones, we use uma_prealloc() to populate a reserve of items,
and then in certain serialized contexts M_USE_RESERVE can be used to
guarantee a successful allocation. uma_prealloc() allocates the
requested number of items, distributing them evenly among NUMA domains.
Thus, in a first-touch zone, to satisfy an M_USE_RESERVE allocation we
might have to check the slab lists of other domains than the current one
to provide the semantics expected by consumers.
So, try harder to find an item if M_USE_RESERVE is specified and the keg
doesn't have anything for current (first-touch) domain. Specifically,
fall back to a round-robin slab allocation. This change fixes boot-time
panics on NUMA systems with KASAN or KMSAN enabled.[1]
Alternately we could have uma_prealloc() allocate the requested number
of items for each domain, but for some existing consumers this would be
quite wasteful. In general I think keg_fetch_slab() should try harder
to find free slabs in other domains before trying to allocate fresh
ones, but let's limit this to M_USE_RESERVE for now.
Also fix a separate problem that I noticed: in a non-round-robin slab
allocation with M_WAITOK, rather than sleeping after a failed slab
allocation we simply try again. Call vm_wait_domain() before retrying.
Reported by: mjg, tuexen [1]
Reviewed by: alc
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32515
For making the call faster, do not count active/inactive object queues,
and do not report vnode info if any (for tmpfs).
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D31163
vm_reserv_reclaim_*() will release pages to the default freepool, not
the direct freepool from which noobj allocations are drawn. But if both
pools are empty, the noobj allocator variants must break reservations to
make progress.
Reported by: cy
Reviewed by: kib (previous version)
Fixes: b498f71bc56a ("vm_page: Add a new page allocator interface for unnamed pages")
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32592