In my tests during buildkernel fs->m was always NULL at that stage.
Note the change has no impact on vm obj contention during said workload.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D39027
This is almost the simplest patch which manages to avoid write locking
for backing objects, as a result mostly fixing vm object contention
problems.
What is not fixed:
1. cacheline ping pong due to read-locks
2. cacheline ping pong due to pip
3. cacheling ping pong due to object busying
4. write locking on first object
On top of it the use of VM_OBJECT_UNLOCK instead of explicitly tracking
the state is slower multithreaded that it needs to be, done for
simplicity for the time being.
Sample lock profiling results doing -j 104 buildkernel on tmpfs:
before:
71446200 (rw:vmobject)
14689706 (sx:vm map (user))
4166251 (rw:pmap pv list)
2799924 (spin mutex:turnstile chain)
after:
19940411 (rw:vmobject)
8166012 (rw:pmap pv list)
6017608 (sx:vm map (user))
1151416 (sleep mutex:pipe mutex)
Reviewed by: kib
Reviewed by: markj
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D38964
Lock names are shown in top as a `*` followed by the first five
characters of the name. `*vmobj` a little more obvious and easier to
search for than `*vm ob`.
Differential Revision: https://reviews.freebsd.org/D36264
When vm_fault_soft_fast() creates a mapping, it release the VM map lock
before unbusying the top-level object. Without the map lock, however,
nothing prevents the VM object from being deallocated while still busy.
Fix the problem by unbusying the object before releasing the VM map
lock. If vm_fault_soft_fast() fails to create a mapping, the VM map
lock is not released, so those cases don't need to change.
Reported by: syzkaller
Reviewed by: kib (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D38527
Remove the platform-specific definitions of VM_BATCHQUEUE_SIZE
for amd64 and powerpc64, and instead treat all 64-bit platforms
identically. This has the effect of increasing the arm64
and riscv VM_BATCHQUEUE_SIZE to match that of other platforms.
Reviewed by: jhb, markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D37707
We need to repeat the operation if the vnode was relocked.
Reported and reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D38114
getattr is very expensive and in important cases only gets called to get
the size. This can be optimized with a dedicated routine which obtains
that statistic.
As a step towards that goal make size-only consumers use a dedicated
routine.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D37885
Same as it is done in other error return cases. Callers depend on error
case returning NULL, e.g. vm_imgact_hold_page().
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D37719
Rather than waiting until the batchqueue is full to acquire the lock &
process the queue, we now start trying to acquire the lock using trylocks
when the batchqueue is 1/2 full. This removes almost all contention on the
vm pagequeue mutex for for our busy sendfile() based web workload.
It also greadly reduces the amount of time a network driver ithread
remains blocked on a mutex, and eliminates some packet drops under
heavy load.
So that the system does not loose the benefit of processing large
batchqueues, I've doubled the size of the batchqueues. This way, when
there is no contention, we process the same batch size as before.
This has been run for several months on a busy Netflix server, as well
as on my personal desktop.
Reviewed by: markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D37305
For dynamically allocated pager type, which inherits the parent's alloc
method, type of the returned object is set to the parent's type
otherwise.
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D37097
The vnode could be reclaimed and allocated again during the lifecycle of
the node, but the node cannot. Also, referencing the node would allow
to reach it and tmpfs mount data from the object, regardless of the
state of the possibly absent vnode.
Still use swp_tmpfs for back-pointer, instead of using handle. Use of
named swap objects would incur taking the sw_alloc_sx on node allocation
and deallocation.
swp_tmpfs is renamed to swp_priv to remove the last bit of tmpfs in vm/.
Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D37097
also make it return the count of the swap pages freed, which are not
simultaneously resident in the object.
Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D37097
This command already prints a tremendous amount of output, and properly
obeys the pager. It no longer makes sense to arbitrarily limit the pages
that are printed, as the reader will not be aware that this has
happened.
Reviewed by: markj
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D37361
markj says:
...the assertion is incorrect and should simply be removed.
It has been racy since we removed the use of the page hash
lock to synchronize wiring of pages.
PR: 267621
Reviewed by: markj, Anton Rang <rang@acm.org>
MFC after: 1 week
Sponsored by: Dell Inc.
Differential Revision: https://reviews.freebsd.org/D37320
Use it and several other vm_page_*_valid() functions in more places.
Suggested and reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D37024
As an optimization, vm_page_activate() avoids requeuing a page that's
already in the active queue. A page's location in the active queue is
mostly unimportant.
When a page is unwired and placed back in the page queues,
vm_page_unwire() avoids moving pages out of PQ_ACTIVE to honour the
request, the idea being that they're likely mapped and so will simply
get bounced back in to PQ_ACTIVE during a queue scan.
In both cases, if the page was logically in PQ_ACTIVE but had not yet
been physically enqueued (i.e., the page is in a per-CPU batch), we
would end up clearing PGA_REQUEUE from the page. Then, batch processing
would ignore the page, so it would end up unwired and not in any queues.
This can arise, for example, when a page is allocated and then
vm_page_activate() is called multiple times in quick succession. The
result is that the page is hidden from the page daemon, so while it will
be freed when its VM object is destroyed, it cannot be reclaimed under
memory pressure.
Fix the bug: when checking if a page is in PQ_ACTIVE, only perform the
optimization if the page is physically enqueued.
PR: 256507
Fixes: f3f38e2580 ("Start implementing queue state updates using fcmpset loops.")
Reviewed by: alc, kib
MFC after: 1 week
Sponsored by: E-CARD Ltd.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D36839
GCC warns about the mismatched sizes on i386 where vm_paddr_t is 64
bits.
Reviewed by: imp, markj
Differential Revision: https://reviews.freebsd.org/D36750
Move declarations for in-kernel function and variables to _KERNEL
section.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D36612
The method was called for two different conditions: 1) the VM layer is
low on pages or 2) one of UMA zones of mbuf allocator exhausted.
This change 2) into a new event handler, but all affected network
subsystems modified to subscribe to both, so this change shall not
bring functional changes under different low memory situations.
There were three subsystems still using pr_drain: TCP, SCTP and frag6.
The latter had its protosw entry for the only reason to register its
pr_drain method.
Reviewed by: tuexen, melifaro
Differential revision: https://reviews.freebsd.org/D36164
This reverts commit f6ffed44a8.
fetchadd will fail the waiters flag, which can cause other
code to wait when it should not with nothing clear it
Revert until I sort this out.
Reported by: markj
This also fixes a bug where not-last unbusy failed to post a release
fence.
Reviewed by: markj (previous version), kib (previous version)
Differential Revision: https://reviews.freebsd.org/D36084
The functions pbuf_init, kva_alloc, and keg_alloc_slab are significant
contributors to the kernel boot time when FreeBSD boots inside the
Firecracker VMM. Instrument them so they show up on flamecharts.
As in vm_fault_cow(), it's possible, albeit rare, for multiple vm_maps
to share a shadow object. When copying a page from a backing object
into the shadow, all mappings of the source page must therefore be
removed. Otherwise, future operations on the object tree may detect
that the source page is fully shadowed and thus can be freed.
Approved by: so
Security: FreeBSD-SA-22:11.vm
Reviewed by: alc, kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35635
We don't need to hold the object lock while allocating swap space, so
don't.
Reviewed by: dougm, kib, markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D35839
This is not completely exhaustive, but covers a large majority of
commands in the tree.
Reviewed by: markj
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D35583
Store the shared page address in struct vmspace.
Also instead of storing absolute addresses of various shared page
segments save their offsets with respect to the shared page address.
This will be more useful when the shared page address is randomized.
Approved by: mw(mentor)
Sponsored by: Stormshield
Obtained from: Semihalf
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D35393
Keep the definition around since it's used by userspace.
Reviewed by: alc, imp, kib
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35791
It's unused now. Keep the OBJ_DEFAULT identifier, but make it an alias
of OBJT_SWAP for the benefit of out-of-tree code.
Reviewed by: alc, imp, kib
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35790
With the removal of OBJT_DEFAULT, OBJ_ANON implies OBJ_SWAP.
Note, this means that vm_object_split() is more expensive than it used
to be, as it holds busy locks until the end of the range is reached,
even if the object has no swap blocks allocated.
Reviewed by: alc, kib
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35789
Now that OBJT_DEFAULT objects can't be instantiated, we can simplify
checks of the form object->type == OBJT_DEFAULT || (object->flags &
OBJ_SWAP) != 0. No functional change intended.
Reviewed by: alc, kib
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35788
With the removal of OBJT_DEFAULT, we can simply handle this in
swap_pager_dealloc(). No functional change intended.
Suggested by: alc
Reviewed by: alc, kib
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35787
With the removal of OBJT_DEFAULT, we can assume that pager operations
provide an object with OBJ_SWAP set. Also, we do not need to convert
objects from type OBJT_DEFAULT. Thus, remove checks for OBJ_SWAP and
remove code which modifies the object type. In some places, replace the
check for OBJ_SWAP with a check for whether any swap blocks are
assigned.
Reviewed by: alc, kib
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35786
With this change, OBJT_DEFAULT objects are no longer allocated.
Instead, anonymous objects are always of type OBJT_SWAP and always have
OBJ_SWAP set.
Modify the page fault handler to check the swap block radix tree in
places where it checked for objects of type OBJT_DEFAULT. In
particular, there's no need to invoke getpages for an OBJT_SWAP object
with no swap blocks assigned.
Reviewed by: alc, kib
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35785
In preparation for removing OBJT_DEFAULT, eliminate some stale/unhelpful
comments from vm_mmap(), and remove an unused case. In particular, the
remaining callers of vm_mmap() in the tree do not specify OBJT_DEFAULT.
It's much more common to use vm_map_find() to map an object into user
memory, so rather than adjusting vm_mmap() to handle OBJT_SWAP objects,
let's further discourage its use and simply remove OBJT_DEFAULT
handling.
Reviewed by: dougm, alc, kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35778