freebsd-skq

Author	SHA1	Message	Date
Mark Johnston	dc2b320563	Allocate UMA per-CPU counters earlier. Otherwise anything counted before SI_SUB_VM_CONF is discarded. However, it is useful to be able to see stats from allocations done early during boot. Reviewed by: kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D24756	2020-05-14 16:06:54 +00:00
Kyle Evans	c79cee7136	kernel: provide panicky version of __unreachable __builtin_unreachable doesn't raise any compile-time warnings/errors on its own, so problems with its usage can't be easily detected. While it would be nice for this situation to change and compilers to at least add a warning for trivial cases where local state means the instruction can't be reached, this isn't the case at the moment and likely will not happen. This commit adds an __assert_unreachable, whose intent is incredibly clear: it asserts that this instruction is unreachable. On INVARIANTS builds, it's a panic(), and on non-INVARIANTS it expands to __unreachable(). Existing users of __unreachable() are converted to __assert_unreachable, to improve debuggability if this assumption is violated. Reviewed by: mjg Differential Revision: https://reviews.freebsd.org/D23793	2020-05-13 18:07:37 +00:00
Justin Hibbits	65bbba25d2	powerpc64: Implement Radix MMU for POWER9 CPUs Summary: POWER9 supports two MMU formats: traditional hashed page tables, and Radix page tables, similar to what's presesnt on most other architectures. The PowerISA also specifies a process table -- a table of page table pointers-- which on the POWER9 is only available with the Radix MMU, so we can take advantage of it with the Radix MMU driver. Written by Matt Macy. Differential Revision: https://reviews.freebsd.org/D19516	2020-05-11 02:33:37 +00:00
Mark Johnston	a9ea09e548	Re-check for wirings after busying the page in vm_page_release_locked(). A concurrent unlocked lookup can wire the page after vm_page_release_locked() releases the last wiring, in which case vm_page_release_locked() must not free the page. Once the xbusy lock is acquired, that, the object lock and the fact that the page is unmapped ensure that the wire count cannot increase, so re-check for new wirings after the page is xbusied. Update the comment above vm_page_wired() to reflect the new synchronization rules. Reported by: glebius Reviewed by: alc, jeff, kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D24592	2020-04-28 13:51:41 +00:00
Mark Johnston	f13fa9df05	Use a single VM object for kernel stacks. Previously we allocated a separate VM object for each kernel stack. However, fully constructed kernel stacks are cached by UMA, so there is no harm in using a single global object for all stacks. This reduces memory consumption and makes it easier to define a memory allocation policy for kernel stack pages, with the aim of reducing physical memory fragmentation. Add a global kstack_object, and use the stack KVA address to index into the object like we do with kernel_object. Reviewed by: kib Tested by: pho Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D24473	2020-04-26 20:08:57 +00:00
Mark Johnston	33655d9546	Factor out the kmem contig page alloc and reclamation code. kmem_alloc_attr_domain() and kmem_alloc_contig_domain() duplicated each other's page allocation and reclamation logic. Place it in a single function to make it easier to add additional consumers. No functional change intended. Reviewed by: jeff, kib MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D24475	2020-04-21 16:01:44 +00:00
Mark Johnston	303b77029b	Minimize conditional compilation for handling of M_EXEC. This simplifies some planned changes. No functional change intended. Reviewed by: kib MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D24474	2020-04-21 15:55:28 +00:00
Mark Johnston	70e68b19a4	Handle trashed queue pointers in vm_page_acquire_unlocked(). vm_page_acquire_unlocked() relies on type-stability of vm_page structures and assumes that the listq linkage pointers always point to a vm_page or are NULL. QUEUE_MACRO_DEBUG_TRASH breaks that assumption, so add an explicit check for a trashed queue pointer before dereferencing. Reported and tested by: pho Reviewed by: kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D24472	2020-04-20 14:45:17 +00:00
Bryan Drewery	adc0388117	Remove dead code leftover from r331018. Sponsored by: Dell EMC	2020-03-31 01:12:53 +00:00
Konstantin Belousov	abfdf76791	VOP_GETPAGES_ASYNC(): consistently call iodone() callback in case of error. Reviewed by: glebius, markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D24038	2020-03-30 21:44:30 +00:00
Konstantin Belousov	a7c55b3e1b	ddb show pginfo: print pages reference value in hex. It is more useful this way after the VPRC_ flags were introduced. Sponsored by: The FreeBSD Foundation	2020-03-28 12:21:52 +00:00
Jeff Roberson	d1105e9441	Check for busy or wired in vm_page_relookup(). Some callers will only keep a page wired and expect it to still be present. Reported by: delphij@FreeBSD.org Reviewed by: kib	2020-03-11 22:25:45 +00:00
Mark Johnston	54007ce8ae	Clean up uma_int.h a bit. This makes it easier to write libkvm programs that access UMA data structures. - Remove a couple of unused slab functions and make others local to uma_core.c. Similarly move SLAB_BITSETS, which affects the layout of slab structures, to uma_core.c. - Stop defining the slab structures under _KERNEL. There's no real reason they can't be visible to userspace like the rest of UMA's structures are. - Group KEG_ASSERT_COLD with other keg macros. - Convert an assertion about MAXMEMDOM to use _Static_assert. No functional change intended. Discussed with: jeff Reviewed by: rlibby Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D23980	2020-03-07 15:37:23 +00:00
Mark Johnston	3fba886874	Move SMR pointer type definition and access macros to smr_types.h. The intent is to provide a header that can be included by other headers without introducing too much pollution. smr.h depends on various headers and will likely grow over time, but is less likely to be required by system headers. Rename SMR_TYPE_DECLARE() to SMR_POINTER(): - One might use SMR to protect more than just pointers; it could be used for resizeable arrays, for example, so TYPE seems too generic. - It is useful to be able to define anonymous SMR-protected pointer types and the _DECLARE suffix makes that look wrong. Reviewed by: jeff, mjg, rlibby Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D23988	2020-03-07 00:55:46 +00:00
Brooks Davis	3823a5990a	Remove an apparently incorrect assertion. Without this change mips64 fails to boot. Discussed with: markj Sponsored by: DARPA	2020-03-06 23:31:09 +00:00
Mark Johnston	d869a17e62	Use COUNTER_U64_DEFINE_EARLY() in places where it simplifies things. Reviewed by: kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D23978	2020-03-06 19:10:00 +00:00
Brooks Davis	d718de812f	Introduce kern_mmap_req(). This presents an extensible interface to the generic mmap(2) implementation via a struct pointer intended to use a designated initializer or compount literal. We take advantage of the mandatory zeroing of fields not listed in the initializer. Remove kern_mmap_fpcheck() and use kern_mmap_req(). The motivation for this change is a desire to keep the core implementation from growing an ever-increasing number of arguments that must be specified in the correct order for the lowest-level implementations. In CheriBSD we have already added two more arguments. Reviewed by: kib Discussed with: kevans Obtained from: CheriBSD Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D23164	2020-03-04 21:27:12 +00:00
Mark Johnston	1ed42f6fdd	Avoid doubly wiring a newly allocated page in vm_page_grab_valid(). This fixes a regression from r358363. Reported by: manu, jbeich Tested by: jbeich	2020-03-01 22:09:11 +00:00
Mateusz Guzik	7f746c9fcc	vm: add debug to uma_zone_set_smr Reviewed by: markj, rlibby Differential Revision: https://reviews.freebsd.org/D23902	2020-03-01 21:49:16 +00:00
Jeff Roberson	6be21eb778	Provide a lock free alternative to resolve bogus pages. This is not likely to be much of a perf win, just a nice code simplification. Reviewed by: markj, kib Differential Revision: https://reviews.freebsd.org/D23866	2020-02-28 21:42:48 +00:00
Jeff Roberson	7aaf252c96	Convert a few triviail consumers to the new unlocked grab API. Reviewed by: kib, markj Differential Revision: https://reviews.freebsd.org/D23847	2020-02-28 20:34:30 +00:00
Jeff Roberson	3f39f80ab3	Support the NOCREAT flag for grab_valid_unlocked. Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D23865	2020-02-28 20:32:35 +00:00
Jeff Roberson	1a0c234eb2	Simplify vref() code in object_reference. The local temporary is no longer necessary. Fix formatting errors. Reported by: mjg Discussed with: kib	2020-02-28 20:30:53 +00:00
Mark Johnston	c99d0c5801	Add a blocking counter KPI. refcount(9) was recently extended to support waiting on a refcount to drop to zero, as this was needed for a lockless VM object paging-in-progress counter. However, this adds overhead to all uses of refcount(9) and doesn't really match traditional refcounting semantics: once a counter has dropped to zero, the protected object may be freed at any point and it is not safe to dereference the counter. This change removes that extension and instead adds a new set of KPIs, blockcount_*, for use by VM object PIP and busy. Reviewed by: jeff, kib, mjg Tested by: pho Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D23723	2020-02-28 16:05:18 +00:00
Jeff Roberson	fe835cbf5f	A pair of performance improvements. Swap buckets on free as well as alloc so that alloc is always the most cache-hot data. When selecting a zone domain for the round-robin bucket cache use the local domain unless there is a severe imbalance. This does not affinitize memory, only locks and queues. Reviewed by: markj, rlibby Differential Revision: https://reviews.freebsd.org/D23824	2020-02-27 08:23:10 +00:00
Jeff Roberson	c49be4f1c6	Add unlocked grab* function variants that use lockless radix code to lookup pages. These variants will fall back to their locked counterparts if the page is not present. Discussed with: kib, markj Differential Revision: https://reviews.freebsd.org/D23449	2020-02-27 02:37:27 +00:00
Ed Maste	acb8858f05	Return ENOTSUP for mmap/mprotect if prot not subset of prot_max From POSIX, [ENOTSUP] The implementation does not support the combination of accesses requested in the prot argument. This fits the case that prot contains permissions which are not a subset of prot_max. Reviewed by: brooks, cem Relnotes: Yes Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D23843	2020-02-26 20:03:43 +00:00
Pawel Biernacki	7029da5c36	Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many) r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are still not MPSAFE (or already are but aren’t properly marked). Use it in preparation for a general review of all nodes. This is non-functional change that adds annotations to SYSCTL_NODE and SYSCTL_PROC nodes using one of the soon-to-be-required flags. Mark all obvious cases as MPSAFE. All entries that haven't been marked as MPSAFE before are by default marked as NEEDGIANT Approved by: kib (mentor, blanket) Commented by: kib, gallatin, melifaro Differential Revision: https://reviews.freebsd.org/D23718	2020-02-26 14:26:36 +00:00
Doug Moore	36b01270d1	The last argument to swp_pager_getswapspace is always 1. Remove that argument. Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D23810	2020-02-24 04:01:09 +00:00
Mark Johnston	7ca5539285	Allow swap_pager_putpages() to allocate one block at a time. The minimum allocation size of 4 blocks is an old policy that came with the "new" swap pager in r42957. Since then the blist allocator has gotten better at reducing fragmentation; for example, with r349777 it can return a range that spans multiple leaves. When swap space is close to being exhaused, the minimum of 4 blocks most likely exacerbates memory pressure, so reduce it to 1. Reported by: alc Tested by: pho Reviewed by: alc, dougm, kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D23763	2020-02-23 17:59:51 +00:00
Ryan Libby	eaa17d4291	sys/vm: quiet -Wwrite-strings Discussed with: kib Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D23796	2020-02-23 03:32:04 +00:00
Mark Johnston	0464f16e91	Constify uma_zcache_create() and uma_zsecond_create()'s "name" argument. It is already internally handled as a pointer to a const string, in particular by uma_zcreate(). Fix indentation while here. MFC after: 1 week	2020-02-22 17:44:28 +00:00
Kyle Evans	cef81f8f01	vm_radix: prefer __builtin_unreachable() to an unreachable panic() This provides the needed hint to GCC and offers an annotation for readers to observe that it's in-fact impossible to hit this point. We'll get hit with a a -Wswitch error if the enum applicable to the switch above were to get expanded without the new value(s) being handled.	2020-02-22 16:20:04 +00:00
Jeff Roberson	226dd6db47	Add an atomic-free tick moderated lazy update variant of SMR. This enables very cheap read sections with free-to-use latencies and memory overhead similar to epoch. On a recent AMD platform a read section cost 1ns vs 5ns for the default SMR. On Xeon the numbers should be more like 1 ns vs 11. The memory consumption should be proportional to the product of the free rate and 2*1/hz while normal SMR consumption is proportional to the product of free rate and maximum read section time. While here refactor the code to make future additions more straightforward. Name the overall technique Global Unbound Sequences (GUS) and adjust some comments accordingly. This helps distinguish discussions of the general technique (SMR) vs this specific implementation (GUS). Discussed with: rlibby, markj	2020-02-22 03:44:10 +00:00
Warner Losh	cafbf0c664	Don't convert all lower-layer errors to EIO. Don't convert all lower layer errors to EIO. Instead, pass the actual error up the stack. This will allow the upper layers that look for ENXIO to react properly to that signal from the lower layers and, for UFS, unmount the filesystem. Reviewed by: kib@ Differential Revision: https://reviews.freebsd.org/D23755	2020-02-20 01:33:01 +00:00
Warner Losh	65252dc903	Don't spam the console with an additional, and useless, error message. There's no need to spam the console with this error message. If there's an I/O error, the disk/cam driver will report it at the lower levels. If that's an actual problem, the upper layers will report that. Reviewed by: kib@ Differential Revision: https://reviews.freebsd.org/D23756	2020-02-20 00:34:46 +00:00
Jeff Roberson	4b3dac72b3	Silence a gcc warning about no return from a function that handles every possible enum in a switch statement. I verified that this emits nothing as expected on clang. radix relies on constant propagation to eliminate any branching from these access routines. Reported by: lwhsu/tinderbox	2020-02-19 22:34:22 +00:00
Jeff Roberson	1ddda2eb24	Use SMR to provide a safe unlocked lookup for vm_radix. The tree is kept correct for readers with store barriers and careful ordering. The existing object lock serializes writers. Consumers will be introduced in later commits. Reviewed by: markj, kib Differential Revision: https://reviews.freebsd.org/D23446	2020-02-19 19:58:31 +00:00
Jeff Roberson	c6fd3e23f7	Use per-domain locks for the bucket cache. This gives much better concurrency when there are a large number of cores per-domain and multiple domains. Avoid taking the lock entirely if it will not be productive. ROUNDROBIN domains will have mixed memory in each domain and will load balance to all domains. While here refactor the zone/domain separation and bucket limits to simplify callers. Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D23673	2020-02-19 18:48:46 +00:00
Jeff Roberson	e9ceb9dd11	Don't release xbusy on kmem pages. After lockless page lookup we will not be able to guarantee that they can be racquired without blocking. Reviewed by: kib Discussed with: markj Differential Revision: https://reviews.freebsd.org/D23506	2020-02-19 09:10:11 +00:00
Jeff Roberson	6c5f36ff30	Eliminate some unnecessary uses of UMA_ZONE_VM. Only zones involved in virtual address or physical page allocation need to be marked with this flag. Reviewed by: markj Tested by: pho Differential Revision: https://reviews.freebsd.org/D23712	2020-02-19 08:17:27 +00:00
Mark Johnston	34e2051faf	Remove swblk_t. It was used only to store the bounds of each swap device. However, since swblk_t is a signed 32-bit int and daddr_t is a signed 64-bit int, swp_pager_isondev() may return an invalid result if swap devices are repeatedly added and removed and sw_end for a device ends up becoming a negative number. Note that the removed comment about maximum swap size still applies. Reviewed by: jeff, kib Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D23666	2020-02-17 15:11:07 +00:00
Mark Johnston	725b4ff001	Fix a swap block allocation race. putpages' allocation of swap blocks is done under the global sw_dev lock. Previously it would drop that lock before inserting the allocated blocks into the object's trie, creating a window in which swap blocks are allocated but are not visible to swapoff. This can cause swp_pager_strategy() to fail and panic the system. Fix the problem bluntly, by allocating swap blocks under the object lock. Reviewed by: jeff, kib Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D23665	2020-02-17 15:10:41 +00:00
Mark Johnston	c90d075be4	Fix object locking races in swapoff(2). swap_pager_swapoff_object()'s goal is to allocate pages for all valid swap blocks belonging to the object, for which there is no resident page. If the page corresponding to a block is already resident and valid, the block can simply be discarded. The existing implementation tries to minimize the number of I/Os used. For each cluster of swap blocks, it finds maximal runs of valid swap blocks not resident in memory, and valid resident pages. During this processing, the object lock may be dropped in several places: when calling getpages, or when blocking on a busy page in vm_page_grab_pages(). While the lock is dropped, another thread may free swap blocks, causing getpages to page in stale data. Fix the problem following a suggestion from Jeff: use getpages' readahead capability to perform clustering rather than doing it ourselves. The simplies the code a bit without reintroducing the old behaviour of performing one I/O per page. Reviewed by: jeff Reported by: dhw, gallatin Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D23664	2020-02-17 15:09:40 +00:00
Jeff Roberson	ed581bf68f	Add a simple accessor that returns the bytes of memory consumed by a zone.	2020-02-17 01:59:55 +00:00
Jeff Roberson	f212367b42	Refactor _vm_page_busy_sleep to reduce the delta between the various sleep routines and introduce a variant that supports lockless sleep. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D23612	2020-02-17 01:08:00 +00:00
Jeff Roberson	70260874ac	UMA has become more particular about zone types. Use the right allocator calls in uma_zwait().	2020-02-17 01:06:18 +00:00
Jeff Roberson	6d88d784f8	Slightly restructure uma_zalloc* to generate better code from clang and reduce duplication among zalloc functions. Reviewed by: markj Discussed with: mjg Differential Revision: https://reviews.freebsd.org/D23672	2020-02-16 01:07:19 +00:00
Mateusz Guzik	3379d2f926	vm: use new capsicum helpers	2020-02-15 01:29:07 +00:00
Mateusz Guzik	23ed568caa	vm: remove no longer needed atomic_load_ptr casts	2020-02-14 23:16:29 +00:00

1 2 3 4 5 ...

4385 Commits