Commit Graph

4638 Commits

Author SHA1 Message Date
Konstantin Belousov
d950c5898a vm/vm_extern.h, vm/vm_page.h: use sys/kassert.h
instead of fatty sys/systm.h.

Suggested by:	jhb
Reviewed by:	alc, imp, jhb (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D34089
2022-02-01 05:55:35 +02:00
Konstantin Belousov
f4cdb9d7c3 vm/vm_pager.h: use sys/systm.h header
it is needed for __read_mostly attribute definition, which right now
comes from vm/vm_page.h including sys/systm.h

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D34089
2022-02-01 05:55:35 +02:00
Konstantin Belousov
531f8cfea0 Use dedicated lock name for pbufs
Also remove a pointer to array variable, use array address directly.

Reviewed by:	markj, mckusick
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D34072
2022-01-31 04:46:14 +02:00
John Baldwin
29d481ae6a Make <vm/vm_extern.h> more self-contained.
Add a nested include of <sys/systm.h> for recently added assertions.
Without this, existing code (such as in drm-kmod) needs to be patched
to add the newly required header.

While here, rewrite the assertions using KASSERT().

Reviewed by:	dougm, alc, imp, kib
Differential Revision:	https://reviews.freebsd.org/D34070
2022-01-28 13:14:03 -08:00
Konstantin Belousov
3de96d664a vm_pageout_scans: correct detection of active object
For non-anonymous swap objects, there is always a reference from the
owner to the object to keep it from recycling.  Account for it when
deciding should we query pmap for hardware active references for the
page.

As result, we avoid unneeded calls to pmap_ts_referenced(), which for
non-mapped page means avoiding unneccessary lock and unlock of the pv list.

Reviewed by:	markj
Discussed with:	alc
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33924
2022-01-22 19:34:32 +02:00
Doug Moore
0ce7909cd0 vm_phys: add essential segment bounds check
A lower-bound segment check is necessary in vm_phys_alloc_seg_contig.
Add one.

Reported by:	jenkins
Reviewed by:	alc
Fixes:	da92ecbc0d vm_phys: fix seg->end test in alloc_seg_contig
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D33945
2022-01-19 00:42:39 -06:00
Doug Moore
da92ecbc0d vm_phys: fix seg->end test in alloc_seg_contig
In vm_phys_alloc_seg_contig, in allocating multiple memory blocks for
a huge allocation, ensure that the end of the allocated range does not
exceed the upper segment limit.

Reorder a couple of checks to improve code layout.

Reviewed by:	alc
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D33870
2022-01-18 12:49:09 -06:00
Mark Johnston
46d35d415a fork: Copy the vm_stacktop field into the new vmspace
Fixes:	1811c1e957 ("exec: Reimplement stack address randomization")
Reported by:	pho
Reported by:	syzbot+0446312a51bc13ead834@syzkaller.appspotmail.com
Sponsored by:	The FreeBSD Foundation
2022-01-18 10:51:49 -05:00
Mark Johnston
1811c1e957 exec: Reimplement stack address randomization
The approach taken by the stack gap implementation was to insert a
random gap between the top of the fixed stack mapping and the true top
of the main process stack.  This approach was chosen so as to avoid
randomizing the previously fixed address of certain process metadata
stored at the top of the stack, but had some shortcomings.  In
particular, mlockall(2) calls would wire the gap, bloating the process'
memory usage, and RLIMIT_STACK included the size of the gap so small
(< several MB) limits could not be used.

There is little value in storing each process' ps_strings at a fixed
location, as only very old programs hard-code this address; consumers
were converted decades ago to use a sysctl-based interface for this
purpose.  Thus, this change re-implements stack address randomization by
simply breaking the convention of storing ps_strings at a fixed
location, and randomizing the location of the entire stack mapping.
This implementation is simpler and avoids the problems mentioned above,
while being unlikely to break compatibility anywhere the default ASLR
settings are used.

The kern.elfN.aslr.stack_gap sysctl is renamed to kern.elfN.aslr.stack,
and is re-enabled by default.

PR:		260303
Reviewed by:	kib
Discussed with:	emaste, mw
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33704
2022-01-17 16:12:36 -05:00
Mark Johnston
a04ce833f9 uma: Avoid polling for an invalid SMR sequence number
Buckets in an SMR-enabled zone can legitimately be tagged with
SMR_SEQ_INVALID.  This effectively means that the zone destructor (if
any) was invoked on all items in the bucket, and the contained memory is
safe to reuse.  If the first bucket in the full bucket list was tagged
this way, UMA would unnecessarily poll per-CPU state before attempting
to fetch a full bucket from the list.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2022-01-14 15:38:02 -05:00
Mark Johnston
4a864f624a vm_pageout: Print a more accurate message to the console before an OOM kill
Previously we'd always print "out of swap space."  This can be
misleading, as there are other reasons an OOM kill can be triggered.  In
particular, it's entirely possible to trigger an OOM kill on a system
with plenty of free swap space.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33810
2022-01-14 15:04:21 -05:00
Brooks Davis
0910a41ef3 Revert "syscallarg_t: Add a type for system call arguments"
Missed issues in truss on at least armv7 and powerpcspe need to be
resolved before recommit.

This reverts commit 3889fb8af0.
This reverts commit 1544e0f5d1.
2022-01-12 23:29:20 +00:00
Brooks Davis
1544e0f5d1 syscallarg_t: Add a type for system call arguments
This more clearly differentiates system call arguments from integer
registers and return values. On current architectures it has no effect,
but on architectures where pointers are not integers (CHERI) and may
not even share registers (CHERI-MIPS) it is necessiary to differentiate
between system call arguments (syscallarg_t) and integer register values
(register_t).

Obtained from:	CheriBSD

Reviewed by:	imp, kib
Differential Revision:	https://reviews.freebsd.org/D33780
2022-01-12 22:51:25 +00:00
Doug Moore
84e2ae64c5 vm_reserv: use enhanced bitstring for popmaps
vm_reserv.c uses its own bitstring implemenation for popmaps. Using
the bitstring_t type from a standard header eliminates the code
duplication, allows some bit-at-a-time operations to be replaced with
more efficient bitstring range operations, and, in
vm_reserv_test_contig, allows bit_ffc_area_at to more efficiently
search for a big-enough set of consecutive zero-bits.

Make bitstring changes improve the vm_reserv code.  Define a bit_ntest
method to test whether a range of bits is all set, or all clear.
Define bit_ff_at and bit_ff_area_at to implement the ffs and ffc
versions with a parameter to choose between set- and clear- bits.
Improve the area_at implementation.  Modify the bit_nset and
bit_nclear implementations to allow code optimization in the cases
when start or end are multiples of _BITSTR_BITS.

Add a few new cases to bitstring_test.

Discussed with:	alc
Reviewed by:	markj
Tested by:	pho (earlier version)
Differential Revision:	https://reviews.freebsd.org/D33312
2022-01-12 11:03:53 -06:00
Mark Johnston
c4a25e0713 vm_pageout: Group sysctl variables together with sysctl definitions
Fix some style bugs while here.  No functional change intended.

Reviewed by:	alc, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33811
2022-01-11 09:27:45 -05:00
Mark Johnston
43b3b8e52d swap_pager: uma_zcreate() doesn't fail
Remove always-false checks for UMA zone creation failure.  No functional
change intended.

Reviewed by:	alc, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33809
2022-01-11 09:27:45 -05:00
Doug Moore
ae13829ddc vm_addr_ok: add power2 invariant check
With INVARIANTS defined, have vm_addr_align_ok and vm_addr_bound_ok
panic when passed an alignment/boundary parameter that is not a power
of two.

Reviewed by:	alc
Suggested by:	kib, se
Differential Revision:	https://reviews.freebsd.org/D33725
2022-01-10 01:17:25 -06:00
Konstantin Belousov
c25a30e255 Dump page tracking no longer needed on mips
Reviewed by:	imp
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D33763
2022-01-06 06:00:39 +02:00
Konstantin Belousov
f54882a862 Remove special kstack allocation code for mips.
The arch required two-pages alignment due to single TLB entry caching
two consequtive mappings.

Reviewed by:	imp
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D33763
2022-01-06 04:43:56 +02:00
Doug Moore
f76916c095 vm_reserv: #include vm_extern.h explicitly, for arm.
Fixes:	c606ab59e7 vm_extern: use standard address checkers everywhere
2021-12-31 00:40:25 -06:00
Doug Moore
e6930b1c5f vm_phys: convert error back to warning
Move an assignment back to where it was before, to turn the
defined-but-not-used error back into a set-but-not-used warning.

Fixes:	01e115ab83 vm_phys: #include vm_extern
2021-12-31 00:23:46 -06:00
Doug Moore
01e115ab83 vm_phys: #include vm_extern
Arm64 and powerpc don't include vm_extern.h indirectly in vm_phys.c, which
means that for the sake of those architectures, it must be included explicitly.

Also, fix a set-unused warning that jenkins also found.

Reported by:	Jenkins
Fixes:	c606ab59e7 vm_extern: use standard address checkers everywhere
2021-12-30 23:31:18 -06:00
Doug Moore
c606ab59e7 vm_extern: use standard address checkers everywhere
Define simple functions for alignment and boundary checks and use them
everywhere instead of having slightly different implementations
scattered about. Define them in vm_extern.h and use them where
possible where vm_extern.h is included.

Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D33685
2021-12-30 22:09:08 -06:00
Gleb Smirnoff
841e0a8757 uma: with KTR trace allocs/frees from SMR zones 2021-12-29 23:08:33 -08:00
Gleb Smirnoff
28782f73df uma: with KTR report item being freed in uma_zfree_arg() 2021-12-29 23:08:15 -08:00
Doug Moore
8119cdd38b vm_phys: hide vm_phys_set_pool
It is only called in the file that defines it, so make it static and
remove the declaration from the header.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D33688
2021-12-29 11:17:33 -06:00
John Baldwin
d90e41a154 sys/vm: Use C99 fixed-width integer types.
No functional change.

Reviewed by:	imp, kib, emaste
Differential Revision:	https://reviews.freebsd.org/D33641
2021-12-28 09:43:21 -08:00
Doug Moore
49fd2d51f0 vm_reserv: fix zero-boundary error
Handle specially the boundary==0 case of vm_reserv_reclaim_config,
by turning off boundary adjustment in that case.

Reviewed by:	alc
Tested by:	pho, madpilot
2021-12-26 11:40:27 -06:00
Doug Moore
4bae154fe8 vm_page: Move a comment
fb38b29b56 (page_alloc_br) vm_page: Remove extra test, dup code from page alloc
should have moved a comment block when it moved the function call that followed it.

Move the comment block now.
2021-12-24 16:10:30 -06:00
Doug Moore
0d5fac2872 vm: alloc pages from reserv before breaking it
Function vm_reserv_reclaim_contig breaks a reservation with enough
free space to satisfy an allocation request and returns the free space
to the buddy allocator. Change the function to allocate the request
memory from the reservation before breaking it, and return that memory
to the caller. That avoids a second call to the buddy allocator and
guarantees successful allocation after breaking the reservation, where
that success is not currently guaranteed.

Reviewed by:	alc, kib (previous version)
Differential Revision:	https://reviews.freebsd.org/D33644
2021-12-24 12:59:16 -06:00
Doug Moore
184c63db3c Fix clerical error in page alloc
Fix a very recent change that introduced a page accounting error in
case of a reserveration being broken.
Reviewed by:	alc
Fixes:	fb38b29b56 (page_alloc_br) vm_page: Remove extra test, dup code from page alloc
Differential Revision:	https://reviews.freebsd.org/D33645
2021-12-24 02:47:21 -06:00
Doug Moore
fb38b29b56 vm_page: Remove extra test, dup code from page alloc
Extract code common to functions vm_page_alloc_contig_domain and
vm_page_alloc_noobj_contig_domain into a new function.  Do so in a way
that eliminates a bound-to-fail reservation test after a reservation
is broken by a call from vm_page_alloc_contig_domain.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D33551
2021-12-23 22:45:47 -06:00
Stephen J. Kiernan
18048b6e3c Eliminate key press requirement "show vmopag" command output.
Summary:
One was required to press a key to continue after every 18 lines of
output. This requirement had been in the "show vmopag" command since it
was introduced, which was many years before paging was added to DDB.
With paging, this explict key check is no longer necessary.

Obtained from:	Juniper Networks, Inc.
MFC after:	1 week

Test Plan:
Run "show vmopag" from db> prompt and see that it does not need additional
keypresses other than the ones needed for the pager.

Subscribers: imp, #contributor_reviews_base

Differential Revision: https://reviews.freebsd.org/D33550
2021-12-19 19:40:52 -05:00
Rick Macklem
cd37afd8b6 vm_object: Make is_object_active() global
Commit 867c27c23a modified the NFS client so that
it does IO_APPEND writes directly to the NFS server,
bypassing the buffer cache.  However, this could result
in stale data in client pages when the file is mmap(2)'d.
As such, the NFS client needs to call is_object_active()
to check if the file is mmap(2)'d.

This patch renames is_object_active() to vm_object_is_active(),
moves it to sys/vm/vm_object.c and makes it global, so that
the NFS client can call it in a future commit.

Reviewed by:	kib
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D33520
2021-12-19 16:11:44 -08:00
Doug Moore
f7aa44763d Correct type size format error in KASSERT.
Reported by:	jenkins
Fixes:	6f1c890827 vm: Don't break vm reserv that can't meet align reqs
2021-12-16 13:48:58 -06:00
Doug Moore
6f1c890827 vm: Don't break vm reserv that can't meet align reqs
Function vm_reserv_test_contig has incorrectly used its alignment
and boundary parameters to find a well-positioned range of empty pages
in a reservation.  Consequently, a reservation could be broken
mistakenly when it was unable to provide a satisfactory set of pages.

Rename the function, correct the errors, and add assertions to detect
the error in case it appears again.

Reviewed by:	alc, markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D33344
2021-12-16 12:20:56 -06:00
Mark Johnston
88642d978a vm_fault: Fix vm_fault_populate()'s handling of VM_FAULT_WIRE
vm_map_wire() works by calling vm_fault(VM_FAULT_WIRE) on each page in
the rage.  (For largepage mappings, it calls vm_fault() once per large
page.)

A pager's populate method may return more than one page to be mapped.
If VM_FAULT_WIRE is also specified, we'd wire each page in the run, not
just the fault page.  Consider an object with two pages mapped in a
vm_map_entry, and suppose vm_map_wire() is called on the entry.  Then,
the first vm_fault() would allocate and wire both pages, and the second
would encounter a valid page upon lookup and wire it again in the
regular fault handler.  So the second page is wired twice and will be
leaked when the object is destroyed.

Fix the problem by modify vm_fault_populate() to wire only the fault
page.  Also modify the error handler for pmap_enter(psind=1) to not test
fs->wired, since it must be false.

PR:		260347
Reviewed by:	alc, kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33416
2021-12-14 15:10:46 -05:00
Konstantin Belousov
5346570276 swapoff: add one more variant of the syscall
Requested and reviewed by:	brooks
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33343
2021-12-09 02:48:46 +02:00
Doug Moore
9f32cb5b1c Set uninitialized popmap bits in vm_reserv_init
In vm_reserv_init, set all the marker popmap bits in vm_reserv_init,
and not just the bits of the first popmap entry.

Reviewed by:	markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D33258
2021-12-05 17:17:25 -06:00
Gleb Smirnoff
2cb67bd798 uma: remove unused *item argument from cache_free()
Reviewed by:		markj
Differential revision:	https://reviews.freebsd.org/D33272
2021-12-05 10:44:47 -08:00
Mark Johnston
39a7396f5d vm_page: Tighten the object lock assertion in vm_page_invalid()
A page must not become invalid while vm_fault_soft_fast() is attempting
to map unbusied pages for reading.

Note that all callers hold the object write lock already, and
vm_page_set_invalid() asserts the object write lock.

Reviewed by:	kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33250
2021-12-05 10:51:11 -05:00
Konstantin Belousov
e8dc2ba29c swapoff(2): add a SWAPOFF_FORCE flag
The flag requests skipping the heuristic which tries to avoid leaving
system with more allocated memory than available from RAM and remanining
swap.

Reviewed by:	markj
Discussed with:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33165
2021-12-05 00:20:58 +02:00
Konstantin Belousov
a4e4132fa3 swapoff(2): replace special device name argument with a structure
For compatibility, add a placeholder pointer to the start of the
added struct swapoff_new_args, and use it to distinguish old vs. new
style of syscall invocation.

Reviewed by:	markj
Discussed with:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33165
2021-12-05 00:20:58 +02:00
Konstantin Belousov
6df359449f swap_pager.c: Remove MPSAFE and ARGSUSED annotations
Reviewed by:	markj
Discussed with:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33165
2021-12-05 00:20:58 +02:00
Konstantin Belousov
0190c38b9d swapoff_one(): only check free pages count manually turning swap off
When swap is turned off due to system shutdown or reboot, ignore the
check.  Problem is that the check is not accurate by any means, free
page count can legitimately be low while system still able to page in
everything from the swap.  Then, we turn swap off if swapping on
real file or some non-standard geom provider, and typically panic
when system appears to actually need to unavailable page.

For syscall, it is better to be safe than sorry.

Reported and tested by:	peterj
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33147
2021-11-29 18:38:02 +02:00
Mateusz Guzik
7e1d3eefd4 vfs: remove the unused thread argument from NDINIT*
See b4a58fbf64 ("vfs: remove cn_thread")

Bump __FreeBSD_version to 1400043.
2021-11-25 22:50:42 +00:00
Konstantin Belousov
b19740f4ce swap_pager: lock vnode in swapdev_strategy()
VOP_STRATEGY() requires locked vnode.  Note that we lock the swap vnode
while pages are busy, but this would only cause real LoR if pages belong
to the swap vnode, which must not be the case for correct use.

Reported and tested by:	peterj
Reviewed by:	markj
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33119
2021-11-25 21:34:50 +02:00
Konstantin Belousov
6ddf41faa6 swapon: extend the region where the swap vnode is locked
to cover VOP_GETATTR() call in sys_swapon().  Move locking from inside
swapongeom() and swaponvp() into sys_swapon().

Reported by and tested by:	peterj
Reviewed by:	markj
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33119
2021-11-25 21:34:44 +02:00
Konstantin Belousov
a6d04f34a4 swap pager: lock vnode around VOP_CLOSE()
Reported and tested by:	peterj
Reviewed by:	markj
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33119
2021-11-25 21:34:39 +02:00
Mark Johnston
d47d3a94bb vm_fault: Factor out per-object operations into vm_fault_object()
No functional change intended.

Obtained from:	jeff (object_concurrency patches)
Reviewed by:	kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D33018
2021-11-24 14:02:56 -05:00