freebsd-dev

Author	SHA1	Message	Date
Ed Maste	575a4437a9	uma: fix KTR message after r366840 Reported by: bz Sponsored by: The FreeBSD Foundation	2020-10-19 18:54:44 +00:00
Mark Johnston	f09cbea31a	uma: Respect uk_reserve in keg_drain() When a reserve of free items is configured for a zone, the reserve must not be reclaimed under memory pressure. Modify keg_drain() to simply respect the reserved pool. While here remove an always-false uk_freef == NULL check (kegs that shouldn't be drained should set _NOFREE instead), and make sure that the keg_drain() KTR statement does not reference an uninitialized variable. Reviewed by: alc, rlibby Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D26772	2020-10-19 16:57:40 +00:00
Mark Johnston	1b2dcc8c54	uma: Avoid depleting keg reserves when filling a bucket zone_import() fetches a free or partially free slab from the keg and then uses its items to populate an array, typically filling a bucket. If a single allocation causes the keg to drop below its minimum reserve, the inner loop ends. However, if the bucket is still not full and M_USE_RESERVE is specified, the outer loop will continue to fetch items from the keg. If M_USE_RESERVE is specified and the number of free items is below the reserved limit, we should return only a single item. Otherwise, if the bucket size is larger than the reserve, all of the reserved items may end up in a single per-CPU bucket, invisible to other CPUs. Reviewed by: rlibby MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D26771	2020-10-19 16:55:03 +00:00
Konstantin Belousov	6f3b523c9a	Avoid dump_avail[] redefinition. Move dump_avail[] extern declaration and inlines into a new header vm/vm_dumpset.h. This fixes default gcc build for mips. Reviewed by: alc, scottph Tested by: kevans (previous version) Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D26741	2020-10-14 22:51:40 +00:00
Mark Johnston	06d8bdcbf7	uma: Use the bucket cache for cross-domain allocations uma_zalloc_domain() allocates from the requested domain instead of following a first-touch policy (the default for most zones). Currently it is only used by malloc_domainset(), and consumers free returned items with free(9) since r363834. Previously uma_zalloc_domain() worked by always going to the keg for an item. As a result, the use of UMA zone caches was unbalanced: we free items to the caches, but always allocate from the keg, skipping the caches. Make some effort to allocate from the UMA caches when performing a cross-domain allocation. This avoids blowing up the caches when something is performing many transient allocations with malloc_domainset(). Reported and tested by: dhw, glebius Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D26427	2020-10-02 19:04:29 +00:00
Mark Johnston	5afdf5c1ca	uma: Use LIFO for non-SMR bucket caches When SMR was introduced, zone_put_bucket() was changed to always place full buckets at the end of the queue. However, it is generally preferable to use recently used buckets since their items are more likely to be resident in cache. So, for buckets that have no constraint on item reuse, use a last-in-first-out ordering as we did before. Reviewed by: rlibby Tested by: dhw, glebius Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D26426	2020-10-02 19:04:09 +00:00
Mark Johnston	952c8964ba	uma: Remove newlines from panic messages Sponsored by: The FreeBSD Foundation	2020-10-02 19:03:42 +00:00
Konstantin Belousov	89d2fb14d5	Add interruptible variant of vm_wait(9), vm_wait_intr(9). Also add msleep flags argument to vm_wait_doms(9). Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D24652	2020-09-08 23:28:09 +00:00
Mateusz Guzik	c3aa3bf97c	vm: clean up empty lines in .c and .h files	2020-09-01 21:20:45 +00:00
Andrew Gallatin	791dda877f	uma: record allocation failures due to zone limits The zone limit mechanism was recently reworked, and allocation failures due to limits being exceeded were inadvertently no longer being recorded. This would lead to, for example, mbuf allocation failures not being indicated in netstat -m or vmstat -z Reviewed by: markj Sponsored by: Netflix	2020-08-21 18:31:57 +00:00
Mark Johnston	b21b022a81	Revert r364310. Some of the resulting fallout in CAM does not appear straightforward to fix, so simply revert the commit for now in the absence of a better solution. Discussed with: mjg Reported by: dhw	2020-08-18 14:09:49 +00:00
Gleb Smirnoff	1921bb7b68	With INVARIANTS panic immediately if M_WAITOK is requested in a non-sleepable context. Previously only _sleep() would panic. This will catch misuse of M_WAITOK at development stage rather than at stress load stage. Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D26027	2020-08-17 15:37:08 +00:00
Mark Johnston	af32cefd7c	Check the UMA zone's full bucket cache before short-circuiting an alloc. The global "bucketdisable" flag indicates that we are in a low memory situation and should avoid allocating buckets. However, in the allocation path we were checking it before the full bucket cache and bailing even if the cache is non-empty. Defer the check so that we have a shot at allocating from the cache. This came up because M_NOWAIT allocations from the buf trie node zone must always succeed. In one scenario, all of the preallocated trie nodes were in the bucket list, and a new slab allocation could not succeed due to a memory shortage. The short-circuiting caused an allocation failure which triggered a panic. Reported by: pho Reviewed by: cem Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D25980	2020-08-10 20:34:45 +00:00
Mark Johnston	96ad26eefb	Remove free_domain() and uma_zfree_domain(). These functions were introduced before UMA started ensuring that freed memory gets placed in domain-local caches. They no longer serve any purpose since UMA now provides their functionality by default. Remove them to simplify the kernel memory allocator interfaces a bit. Reviewed by: cem, kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D25937	2020-08-04 13:58:36 +00:00
Mark Johnston	8c277118d8	Fix UMA's first-touch policy on systems with empty domains. Suppose a thread is running on a CPU in a NUMA domain with no physical RAM. When an item is freed to a first-touch zone, it ends up in the cross-domain bucket. When the bucket is full, it gets placed in another domain's bucket queue. However, when allocating an item, UMA will always go to the keg upon a per-CPU cache miss because the empty domain's bucket queue will always be empty. This means that a non-empty domain's bucket queues can grow very rapidly on such systems. For example, it can easily cause mbuf allocation failures when the zone limit is reached. Change cache_alloc() to follow a round-robin policy when running on an empty domain. Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D25355	2020-06-28 21:35:04 +00:00
Jeff Roberson	c8b0a88b8d	Clarify some language. Favor primary where both master and primary were used in conjunction with secondary.	2020-06-20 20:21:04 +00:00
Mateusz Guzik	1c58c09f5a	uma: hide item_domain under ifdef NUMA Fixes build warnings on mips.	2020-05-29 08:30:35 +00:00
Mark Johnston	81302f1d77	Fix boot on systems where NUMA domain 0 is unpopulated. - Add vm_phys_early_add_seg(), complementing vm_phys_early_alloc(), to ensure that segments registered during hammer_time() are placed in the right domain. Otherwise, since the SRAT is not parsed at that point, we just add them to domain 0, which may be incorrect and results in a domain with only several MB worth of memory. - Fix uma_startup1() to try allocating memory for zones from any domain. If domain 0 is unpopulated, the allocation will simply fail, resulting in a page fault slightly later during boot. - Change _vm_phys_domain() to return -1 for addresses not covered by the affinity table, and change vm_phys_early_alloc() to handle wildcard domains. This is necessary on amd64, where the page array is dense and pmap_page_array_startup() may allocate page table pages for non-existent page frames. Reported and tested by: Rafael Kitover <rkitover@gmail.com> Reviewed by: cem (earlier version), kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D25001	2020-05-28 19:41:00 +00:00
Mark Johnston	dc2b320563	Allocate UMA per-CPU counters earlier. Otherwise anything counted before SI_SUB_VM_CONF is discarded. However, it is useful to be able to see stats from allocations done early during boot. Reviewed by: kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D24756	2020-05-14 16:06:54 +00:00
Mark Johnston	54007ce8ae	Clean up uma_int.h a bit. This makes it easier to write libkvm programs that access UMA data structures. - Remove a couple of unused slab functions and make others local to uma_core.c. Similarly move SLAB_BITSETS, which affects the layout of slab structures, to uma_core.c. - Stop defining the slab structures under _KERNEL. There's no real reason they can't be visible to userspace like the rest of UMA's structures are. - Group KEG_ASSERT_COLD with other keg macros. - Convert an assertion about MAXMEMDOM to use _Static_assert. No functional change intended. Discussed with: jeff Reviewed by: rlibby Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D23980	2020-03-07 15:37:23 +00:00
Brooks Davis	3823a5990a	Remove an apparently incorrect assertion. Without this change mips64 fails to boot. Discussed with: markj Sponsored by: DARPA	2020-03-06 23:31:09 +00:00
Mateusz Guzik	7f746c9fcc	vm: add debug to uma_zone_set_smr Reviewed by: markj, rlibby Differential Revision: https://reviews.freebsd.org/D23902	2020-03-01 21:49:16 +00:00
Jeff Roberson	fe835cbf5f	A pair of performance improvements. Swap buckets on free as well as alloc so that alloc is always the most cache-hot data. When selecting a zone domain for the round-robin bucket cache use the local domain unless there is a severe imbalance. This does not affinitize memory, only locks and queues. Reviewed by: markj, rlibby Differential Revision: https://reviews.freebsd.org/D23824	2020-02-27 08:23:10 +00:00
Pawel Biernacki	7029da5c36	Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many) r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are still not MPSAFE (or already are but aren’t properly marked). Use it in preparation for a general review of all nodes. This is a non-functional change that adds annotations to SYSCTL_NODE and SYSCTL_PROC nodes using one of the soon-to-be-required flags. Mark all obvious cases as MPSAFE. All entries that haven't been marked as MPSAFE before are by default marked as NEEDGIANT. Approved by: kib (mentor, blanket) Commented by: kib, gallatin, melifaro Differential Revision: https://reviews.freebsd.org/D23718	2020-02-26 14:26:36 +00:00
Ryan Libby	eaa17d4291	sys/vm: quiet -Wwrite-strings Discussed with: kib Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D23796	2020-02-23 03:32:04 +00:00
Mark Johnston	0464f16e91	Constify uma_zcache_create() and uma_zsecond_create()'s "name" argument. It is already internally handled as a pointer to a const string, in particular by uma_zcreate(). Fix indentation while here. MFC after: 1 week	2020-02-22 17:44:28 +00:00
Jeff Roberson	226dd6db47	Add an atomic-free tick moderated lazy update variant of SMR. This enables very cheap read sections with free-to-use latencies and memory overhead similar to epoch. On a recent AMD platform a read section cost 1ns vs 5ns for the default SMR. On Xeon the numbers should be more like 1 ns vs 11. The memory consumption should be proportional to the product of the free rate and 2*1/hz while normal SMR consumption is proportional to the product of free rate and maximum read section time. While here refactor the code to make future additions more straightforward. Name the overall technique Global Unbound Sequences (GUS) and adjust some comments accordingly. This helps distinguish discussions of the general technique (SMR) vs this specific implementation (GUS). Discussed with: rlibby, markj	2020-02-22 03:44:10 +00:00
Jeff Roberson	c6fd3e23f7	Use per-domain locks for the bucket cache. This gives much better concurrency when there are a large number of cores per-domain and multiple domains. Avoid taking the lock entirely if it will not be productive. ROUNDROBIN domains will have mixed memory in each domain and will load balance to all domains. While here refactor the zone/domain separation and bucket limits to simplify callers. Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D23673	2020-02-19 18:48:46 +00:00
Jeff Roberson	ed581bf68f	Add a simple accessor that returns the bytes of memory consumed by a zone.	2020-02-17 01:59:55 +00:00
Jeff Roberson	70260874ac	UMA has become more particular about zone types. Use the right allocator calls in uma_zwait().	2020-02-17 01:06:18 +00:00
Jeff Roberson	6d88d784f8	Slightly restructure uma_zalloc* to generate better code from clang and reduce duplication among zalloc functions. Reviewed by: markj Discussed with: mjg Differential Revision: https://reviews.freebsd.org/D23672	2020-02-16 01:07:19 +00:00
Mark Johnston	cefc92e1a2	Update the zone-global count of cached items in bucket_cache_reclaim(). This was missed in r351673. The count is used to enforce cache limits, which are rarely used. Discussed with: jeff Sponsored by: The FreeBSD Foundation	2020-02-13 23:15:21 +00:00
Jeff Roberson	543117bed8	Fix a case where ub_seq would fail to be set if the cross bucket was flushed due to memory pressure. Reviewed by: markj Differential Revision: http://reviews.freebsd.org/D23614	2020-02-13 20:58:51 +00:00
Mateusz Guzik	3acb6572fc	Store offset into zpcpu allocations in the per-cpu area. This shortens zpcpu_get and allows more optimizations. Reviewed by: jeff Differential Revision: https://reviews.freebsd.org/D23570	2020-02-12 11:11:22 +00:00
Mark Johnston	4ab3aee8fb	Reduce lock hold time in keg_drain(). Maintain a count of free slabs in the per-domain keg structure and use that to clear the free slab list in constant time for most cases. This helps minimize lock contention induced by reclamation, in preparation for proactive trimming of excesses of free memory. Reviewed by: jeff, rlibby Tested by: pho Differential Revision: https://reviews.freebsd.org/D23532	2020-02-11 20:06:33 +00:00
Ryan Libby	bae55c4aec	uma: remove UMA_ZFLAG_CACHEONLY flag UMA_ZFLAG_CACHEONLY was essentially the same thing as UMA_ZONE_VM, but with a more confusing name. Remove the flag, make UMA_ZONE_VM an inherit flag, and replace all references. Reviewed by: markj Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D23516	2020-02-06 08:32:25 +00:00
Ryan Libby	33e5a1ea3b	uma: multipage chicken switch Add a switch to allow disabling multipage slabs, in order to facilitate measuring memory usage and performance effects. The tunable vm.debug.uma_multipage_slabs defaults to 1 and can be set to 0 to disable. The name may change soon. Reviewed by: markj (previous version) Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D23487	2020-02-04 22:40:45 +00:00
Ryan Libby	27ca37acb7	uma: grow slabs to enforce minimum memory efficiency Memory efficiency can be poor with awkward item sizes (e.g. 1/2 or 1 page size + epsilon). In order to achieve a minimum memory efficiency, select a slab size with a potentially larger number of pages if it yields a lower portion of waste. This may mean using page_alloc instead of uma_small_alloc, which could be more costly. Discussed with: jeff, mckusick Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D23239	2020-02-04 22:40:34 +00:00
Ryan Libby	ec0d828071	uma: add UMA_ZONE_CONTIG, and a default contig_alloc For now, copy the mbuf allocator. Reviewed by: jeff, markj (previous version) Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D23237	2020-02-04 22:40:11 +00:00
Ryan Libby	5ba16cf3d7	uma: pcpu_page_free needs to startup_free pages from startup_alloc After r357392, it is apparent that we do have some early-boot PCPU zones. Make it so we can safely free pages from them if they are actually used during early boot. Reviewed by: jeff, markj Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D23496	2020-02-04 22:39:58 +00:00
Jeff Roberson	e84130a0c0	Use literal bucket sizes for smaller buckets rather than the rounding system. Small bucket sizes already pack well even if they are an odd number of words. This prevents any potential new instances of the problem fixed in r357463 as well as making the system easier to understand. Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D23494	2020-02-04 20:28:06 +00:00
Jeff Roberson	dc3915c8c6	Use STAILQ instead of TAILQ for bucket lists. We only need FIFO behavior and this is more space efficient. Stop queueing recently used buckets to the head of the list. If the bucket goes to a different processor the cache coherency will be more expensive. We already try to encourage cache-hot behavior in the per-cpu layer. Reviewed by: rlibby Differential Revision: https://reviews.freebsd.org/D23493	2020-02-04 02:41:24 +00:00
Mark Johnston	36cb95c736	Disable the smallest UMA bucket size on 32-bit platforms. With r357314, sizeof(struct uma_bucket) grew to 16 bytes on 32-bit platforms, so BUCKET_SIZE(4) is 0. This resulted in the creation of a bucket zone for buckets with zero capacity. A more general fix is planned, but for now this bandaid allows 32-bit platforms to boot again. PR: 243837 Discussed with: jeff Reported by: pho, Jenkins via lwhsu Tested by: pho Sponsored by: The FreeBSD Foundation	2020-02-03 19:29:02 +00:00
Jeff Roberson	f96d4157a7	Fix a bug in r356776 where the page allocator was not properly restored to the percpu page allocator after it had been temporarily overridden by startup_alloc. Reported by: pho, bdragon	2020-02-01 23:46:30 +00:00
Jeff Roberson	9e47b34110	Fix LINT build with MEMGUARD.	2020-01-31 02:03:22 +00:00
Jeff Roberson	d4665eaa66	Implement a safe memory reclamation feature that is tightly coupled with UMA. This is in the same family of algorithms as Epoch/QSBR/RCU/PARSEC but is a unique algorithm. This has 3x the performance of epoch in a write heavy workload with less than half of the read side cost. The memory overhead is significantly lessened by limiting the free-to-use latency. A synthetic test uses 1/20th of the memory vs Epoch. There is significant further discussion in the comments and code review. This code should be considered experimental. I will write a man page after it has settled. After further validation the VM will begin using this feature to permit lockless page lookups. Both markj and cperciva tested on arm64 at large core counts to verify fences on weaker ordering architectures. I will commit a stress testing tool in a follow-up. Reviewed by: mmacy, markj, rlibby, hselasky Discussed with: sbahara Differential Revision: https://reviews.freebsd.org/D22586	2020-01-31 00:49:51 +00:00
Ryan Libby	8d1c459ae5	uma: fix zone domain overlaying pcpu cache with disabled cpus UMA zone structures have two arrays at the end which are sized according to the machine: an array of CPU count length, and an array of NUMA domain count length. The CPU counting was wrong in the case where some CPUs are disabled (when mp_ncpus != mp_maxid + 1), and this caused the second array to be overlaid with the first. Reported by: olivier Reviewed by: jeff, markj Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D23318	2020-01-23 04:56:38 +00:00
Ryan Libby	7e2406774e	uma: report leaks more accurately Previously UMA had some false negatives in the leak report at keg destruction time, where it only reported leaks if there were free items in the slab layer (rather than allocated items), which notably would not be true for single-item slabs (large items). Now, report a leak if there are any allocated pages, and calculate and report the number of allocated items rather than free items. Reviewed by: jeff, markj Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D23275	2020-01-23 04:56:34 +00:00
Jeff Roberson	530cc6a25d	Some architectures with DMAP still consume boot kva. Simplify the test for claiming kva in uma_startup2() to handle this. Reported by: bdragon	2020-01-23 03:37:35 +00:00
Andrew Gallatin	2052680238	pcpu_page_alloc: guard against empty NUMA domains Some systems, such as higher end Threadripper, may have NUMA domains with no physical memory. Don't allocate from these domains. This fixes a "panic: vm_wait in early boot" on my 2990WX desktop. Reviewed by: jeff Sponsored by: Netflix	2020-01-18 18:25:37 +00:00
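
Note on 27ca37acb7 ("uma: grow slabs to enforce minimum memory efficiency"): the idea described in that message can be illustrated with a small standalone sketch. This is not the code from the commit. The page size, the cap on slab pages, the percentage threshold, and every `sketch_*` name are assumptions made up for illustration, and the real keg layout logic also accounts for slab headers and alignment, which are ignored here.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size */
#define SKETCH_MAX_PAGES 8u	/* hypothetical cap on pages per slab */

/* Percentage of a slab of `pages` pages that is occupied by whole items. */
static unsigned
sketch_efficiency(size_t item_size, unsigned pages)
{
	size_t slab = (size_t)pages * SKETCH_PAGE_SIZE;
	size_t items = slab / item_size;

	return ((unsigned)((items * item_size * 100) / slab));
}

/*
 * Return the smallest page count whose efficiency reaches min_pct.
 * If no count up to the cap qualifies, return the most efficient one seen.
 */
static unsigned
sketch_pick_slab_pages(size_t item_size, unsigned min_pct)
{
	unsigned best_pages = 1, best_eff = 0;

	for (unsigned p = 1; p <= SKETCH_MAX_PAGES; p++) {
		unsigned eff = sketch_efficiency(item_size, p);

		if (eff >= min_pct)
			return (p);
		if (eff > best_eff) {
			best_eff = eff;
			best_pages = p;
		}
	}
	return (best_pages);
}
```

With these assumed constants, an item of half a page plus one byte (2049 bytes) wastes about half of a one-page slab, while a three-page slab holds five items and passes an 80% threshold. An item that divides the page evenly stays on a single page.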


419 Commits