Commit Graph

67 Commits

Author SHA1 Message Date
jeff
c2c436e241 Use the ticks since the last update to reduce hysteresis in the partpopq and
contention on the vm_reserv_domain lock.

This gives a roughly 8x speedup on will-it-scale fault1 on a 16 core machine.

Reviewed by:	alc, kib, markj
2018-07-07 01:54:45 +00:00
jeff
1b05a1062f Fix two compliation problems on non-amd64 architectures. 2018-03-23 18:24:02 +00:00
cy
80da77353e Fix build on i386 without INVARIANTS following r331369.
--- vm_reserv.o ---
In file included from /opt/src/svn-current/sys/vm/vm_reserv.c:48:
In file included from /opt/src/svn-current/sys/sys/counter.h:37:
./machine/counter.h:174:3: error: implicit declaration of function
'critical_enter' is invalid in C99 [-Werror,-Wimplicit-function-declarat
ion]
                critical_enter();

Reviewed by:	jeff@
2018-03-23 03:22:30 +00:00
jeff
78fd3a94bb Lock reservations with a dedicated lock in each reservation. Protect the
vmd_free_count with atomics.

This allows us to allocate and free from reservations without the free lock
except where a superpage is allocated from the physical layer, which is
roughly 1/512 of the operations on amd64.

Use the counter api to eliminate cache conention on counters.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	Netflix, Dell/EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D14707
2018-03-22 19:21:11 +00:00
jeff
1dfd513751 Eliminate pageout wakeup races. Take another step towards lockless
vmd_free_count manipulation.  Reduce the scope of the free lock by
using a pageout lock to synchronize sleep and wakeup.  Only trigger
the pageout daemon on transitions between states.  Drive all wakeup
operations directly as side-effects from freeing memory rather than
requiring an additional function call.

Reviewed by:	markj, kib
Tested by:	pho
Sponsored by:	Netflix, Dell/EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D14612
2018-03-15 19:23:07 +00:00
jeff
ff01cfd694 Don't assert that the domain free lock is held until we're certain that
there is a valid reservation.  This can trip erroneously when memory
falls within a domain but doesn't have the reservation initialized because
it does not meet size or alignment requirements.

Reported by:	pho, mjg
Sponsored by:	Netflix, Dell/EMC Isilon
2018-03-07 22:04:27 +00:00
markj
8359115798 Correct some comments after r328954.
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D14486
2018-02-23 23:27:53 +00:00
kib
b93d5395e3 Cleanup unused page argument for vm_reserv_break().
Reviewed by:	markj
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D14364
2018-02-14 00:34:02 +00:00
kib
0c20f07bdd Do not leak rv->psind in some specific situations.
Suppose that we have an object with a mapped superpage, and that all
pages in the superpages are held (by some driver).  Additionally,
suppose that the object is terminated, e.g. because the only process
mapping it is exiting.  Then the reservation is broken, but the pages
cannot be freed until later, when they are unheld.  In this situation,
the reservation code cannot clean psind, since no pages are freed, and
the page is freed and then reused with invalid psind.

Clean psind on vm_reserv_break() to avoid the situation.

Reported and tested by:	Slava Shwartsman
Reviewed by:	markj
Sponsored by:	Mellanox Technologies
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D14335
2018-02-13 15:36:28 +00:00
jeff
e67ec0d694 Use per-domain locks for vm page queue free. Move paging control from
global to per-domain state.  Protect reservations with the free lock
from the domain that they belong to.  Refactor to make vm domains more
of a first class object.

Reviewed by:    markj, kib, gallatin
Tested by:      pho
Sponsored by:   Netflix, Dell/EMC Isilon
Differential Revision:  https://reviews.freebsd.org/D14000
2018-02-06 22:10:07 +00:00
jeff
e7c9f84113 Implement NUMA policy for kmem_*(9). This maintains compatibility with
reservations by giving each memory domain its own KVA space in vmem that
is naturally aligned on superpage boundaries.

Reviewed by:	alc, markj, kib  (some objections)
Sponsored by:	Netflix, Dell/EMC Isilon
Tested by;	pho
Differential Revision:	https://reviews.freebsd.org/D13289
2018-01-12 23:13:55 +00:00
jeff
f93de233c6 Move domain iterators into the page layer where domain selection should take
place.  This makes the majority of the phys layer explicitly domain specific.

Reviewed by:	markj, kib (some objections)
Discussed with:	alc
Tested by:	pho
Sponsored by:	Netflix & Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D13014
2017-11-28 23:18:35 +00:00
pfg
78a6b08618 sys: general adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

No functional change intended.
2017-11-27 15:23:17 +00:00
alc
1179a1717e Utilize pmap_enter(..., psind=1) in vm_fault_soft_fast() on amd64. (The
Differential Revision discusses the benefits of this change.)

Add a function, vm_reserv_to_superpage(), that returns the superpage
containing the specified base page.

Reviewed by:	kib, markj
Tested by:	pho
MFC after:	10 days
Differential Revision:	https://reviews.freebsd.org/D11556
2017-07-23 16:28:13 +00:00
glebius
5763443023 All these files need sys/vmmeter.h, but now they got it implicitly
included via sys/pcpu.h.
2017-04-17 17:07:00 +00:00
alc
d619a90d54 Relax the object type restrictions on vm_page_alloc_contig(). Specifically,
add support for object types that were previously prohibited because they
could contain PG_CACHED pages.

Roughly halve the number of radix trie operations performed by
vm_page_alloc_contig() using the same approach that is employed by
vm_page_alloc().  Also, eliminate the radix trie lookup performed with the
free page queues lock held.

Tidy up the handling of radix trie insert failures in vm_page_alloc() and
vm_page_alloc_contig().

Reviewed by:	kib, markj
Tested by:	pho
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D8878
2016-12-28 18:32:13 +00:00
alc
023ed8da8d Eliminate every mention of PG_CACHED pages from the comments in the machine-
independent layer of the virtual memory system.  Update some of the nearby
comments to eliminate redundancy and improve clarity.

In vm/vm_reserv.c, do not use hyphens after adverbs ending in -ly per
The Chicago Manual of Style.

Update the comment in vm/vm_page.h defining the four types of page queues to
reflect the elimination of PG_CACHED pages and the introduction of the
laundry queue.

Reviewed by:	kib, markj
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D8752
2016-12-12 17:47:09 +00:00
alc
2fa3607305 Remove most of the code for implementing PG_CACHED pages. (This change does
not remove user-space visible fields from vm_cnt or all of the references to
cached pages from comments.  Those changes will come later.)

Reviewed by:	kib, markj
Tested by:	pho
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D8497
2016-11-15 18:22:50 +00:00
alc
8343c406db Introduce a new mechanism for relocating virtual pages to a new physical
address and use this mechanism when:

1. kmem_alloc_{attr,contig}() can't find suitable free pages in the physical
   memory allocator's free page lists.  This replaces the long-standing
   approach of scanning the inactive and inactive queues, converting clean
   pages into PG_CACHED pages and laundering dirty pages.  In contrast, the
   new mechanism does not use PG_CACHED pages nor does it trigger a large
   number of I/O operations.

2. on 32-bit MIPS processors, uma_small_alloc() and the pmap can't find
   free pages in the physical memory allocator's free page lists that are
   covered by the direct map.  Tested by: adrian

3. ttm_bo_global_init() and ttm_vm_page_alloc_dma32() can't find suitable
   free pages in the physical memory allocator's free page lists.

In the coming months, I expect that this new mechanism will be applied in
other places.  For example, balloon drivers should use relocation to
minimize fragmentation of the guest physical address space.

Make vm_phys_alloc_contig() a little smarter (and more efficient in some
cases).  Specifically, use vm_phys_segs[] earlier to avoid scanning free
page lists that can't possibly contain suitable pages.

Reviewed by:	kib, markj
Glanced at:	jhb
Discussed with:	jeff
Sponsored by:	EMC / Isilon Storage Division
Differential Revision:	https://reviews.freebsd.org/D4444
2015-12-19 18:42:50 +00:00
alc
75944832ce Correct an error in vm_reserv_reclaim_contig(). In the highly unusual
case that the reservation contained "low", the starting position in the
popmap for the free page search was incorrectly calculated.  The most
likely (and visible) symptom of this error was the assertion failure,
"vm_reserv_reclaim_contig: pa is too low".
2015-11-26 19:12:18 +00:00
alc
51df0b58b1 Introduce a sysctl for reporting the number of fully populated reservations. 2015-08-06 21:27:50 +00:00
alc
263927b83e Retire VM_FREEPOOL_CACHE as the next step in eliminating PG_CACHE pages.
Differential Revision:	https://reviews.freebsd.org/D2712
Reviewed by:	kib
Sponsored by:	EMC / Isilon Storage Division
2015-06-08 04:59:32 +00:00
alc
f116ec9943 Correct an off-by-one error in vm_reserv_reclaim_contig() that results in
an infinite loop.

Submitted by:	Svatopluk Kraus
MFC after:	1 week
2015-04-11 22:57:13 +00:00
ian
eae109babf Revert r279932; this is going to be fixed in the sbuf code instead.
PR:		195668
2015-03-14 13:00:37 +00:00
ian
037188bda9 Nullterminate strings returned via sysctl.
PR:		195668
2015-03-12 18:06:30 +00:00
kib
c539cecb43 Fix function name in the panic message.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-03-08 02:13:46 +00:00
alc
f4c161ccb6 By the time that vm_reserv_init() runs, vm_phys_segs[] is initialized. Use
it instead of phys_avail[].

Discussed with:	Svatopluk Kraus
2014-11-22 17:46:30 +00:00
alc
996a2e8d68 Fix a boundary case error in vm_reserv_alloc_contig(): If a reservation
isn't being allocated for the last of the requested pages, because a
reservation won't fit in the gap between allocated pages, then the
reservation structure shouldn't be initialized.

While I'm here, improve the nearby comments.

Reported by:	jeff, pho
MFC after:	1 week
Sponsored by:	EMC / Isilon Storage Division
2014-09-10 05:52:30 +00:00
alc
f133c95804 Correct a bug in the management of the population map on big-endian
machines.  Specifically, there was a mismatch between how the routine
allocation and deallocation operations accessed the population map
and how the aggressively optimized reservation-breaking operation
accessed it.  So, problems only occurred when reservations were broken.
This change makes the routine operations access the population map in
the same way as the reservation breaking operation.

This bug was introduced in r259999.

PR:		187080
Tested by:	jmg (on an "armeb" machine)
Sponsored by:	EMC / Isilon Storage Division
2014-06-11 16:11:12 +00:00
alc
39548e640f Add a page size field to struct vm_page. Increase the page size field when
a partially populated reservation becomes fully populated, and decrease this
field when a fully populated reservation becomes partially populated.

Use this field to simplify the implementation of pmap_enter_object() on
amd64, arm, and i386.

On all architectures where we support superpages, the cost of creating a
superpage mapping is roughly the same as creating a base page mapping.  For
example, both kinds of mappings entail the creation of a single PTE and PV
entry.  With this in mind, use the page size field to make the
implementation of vm_map_pmap_enter(..., MAP_PREFAULT_PARTIAL) a little
smarter.  Previously, if MAP_PREFAULT_PARTIAL was specified to
vm_map_pmap_enter(), that function would only map base pages.  Now, it will
create up to 96 base page or superpage mappings.

Reviewed by:	kib
Sponsored by:	EMC / Isilon Storage Division
2014-06-07 17:12:26 +00:00
alc
4e0447c7bb Add "popmap" assertions: The page being freed isn't already free, and the
page being allocated isn't already allocated.

Sponsored by:	EMC / Isilon Storage Division
2013-12-29 04:54:52 +00:00
alc
0850ce80b6 MFp4 alc_popmap
Change the way that reservations keep track of which pages are in use.
  Instead of using the page's PG_CACHED and PG_FREE flags, maintain a bit
  vector within the reservation.  This approach has a couple benefits.
  First, it makes breaking reservations much cheaper because there are
  fewer cache misses to identify the unused pages.  Second, it is a pre-
  requisite for supporting two or more reservation sizes.
2013-12-28 04:28:35 +00:00
kib
8ca067efb2 PG_SLAB no longer serves a useful purpose, since m->object is no
longer abused to store pointer to slab. Remove it.

Reviewed by:    alc
Sponsored by:   The FreeBSD Foundation
Approved by:	re (hrs)
2013-09-17 07:35:26 +00:00
alc
7d20e37fb6 Refactor vm_page_alloc()'s interactions with vm_reserv_alloc_page() and
vm_page_insert() so that (1) vm_radix_lookup_le() is never called while the
free page queues lock is held and (2) vm_radix_lookup_le() is called at most
once.  This change reduces the average time that the free page queues lock
is held by vm_page_alloc() as well as vm_page_alloc()'s average overall
running time.

Sponsored by:	EMC / Isilon Storage Division
2013-05-12 16:50:18 +00:00
alc
6cbd8f24b9 The calls to vm_radix_lookup_ge() by vm_reserv_alloc_{contig,page}() can
be eliminated.  If the calls to vm_radix_lookup_le() return NULL, then
the page at the head of the object's memq must be the page with the least
pindex greater than the specified pindex.

Reviewed by:	attilio
Sponsored by:	EMC / Isilon Storage Division
2013-03-17 20:40:31 +00:00
attilio
76954ad68a Merge from vmcontention. 2013-03-09 03:19:53 +00:00
attilio
16a80466e5 MFC 2013-03-09 02:51:51 +00:00
attilio
6b1291b4d1 Do not call vm_radix_lookup_ge() in the reservation system unless
it is absolutely necessary.

Sponsored by:	EMC / Isilon storage division
Submitted by:	alc
2013-02-24 16:10:43 +00:00
alc
96feae12e9 Correctly assert that no page already exists at the offset within the
object that is currently being allocated.

Sponsored by:	EMC / Isilon Storage Division
2013-02-23 19:28:31 +00:00
attilio
905e648d42 Hide the details for the assertion for VM_OBJECT_LOCK operations.
Rename current VM_OBJECT_LOCK_ASSERT(foo, RA_WLOCKED) into
VM_OBJECT_ASSERT_WLOCKED(foo)

Sponsored by:	EMC / Isilon storage division
Requested by:	alc
2013-02-21 21:54:53 +00:00
attilio
658534ed5a Switch vm_object lock to be a rwlock.
* VM_OBJECT_LOCK and VM_OBJECT_UNLOCK are mapped to write operations
* VM_OBJECT_SLEEP() is introduced as a general purpose primitve to
  get a sleep operation using a VM_OBJECT_LOCK() as protection
* The approach must bear with vm_pager.h namespace pollution so many
  files require including directly rwlock.h
2013-02-20 10:38:34 +00:00
attilio
702feea4c3 Correctly complete r246474. 2013-02-07 15:08:35 +00:00
attilio
5ab232ef16 Strengten checks. 2013-02-07 15:06:45 +00:00
attilio
3bbca3bce2 Fixup r246423 by adding vm_radix.h includes where it is not present
currently.
2013-02-06 18:33:32 +00:00
attilio
b972b67ed7 Merge from vmcontention 2013-02-04 15:44:42 +00:00
attilio
c52a057b19 MFC 2012-08-03 15:58:05 +00:00
alc
8f708ce433 Correct an off-by-one error in vm_reserv_alloc_contig() that resulted in
the last reservation of a multi-reservation allocation not being
initialized.
2012-07-15 21:46:19 +00:00
attilio
ffa3f082ff - Split the cached and resident pages tree into 2 distinct ones.
This makes the RED/BLACK support go away and simplifies a lot vmradix
  functions used here. This happens because with patricia trie support
  the trie will be little enough that keeping 2 diffetnt will be
  efficient too.
- Reduce differences with head, in places like backing scan where the
  optimizazions used shuffled the code a little bit around.

Tested by:	flo, Andrea Barberio
2012-07-08 14:01:25 +00:00
attilio
af795c2c10 MFC 2012-04-11 14:54:06 +00:00
alc
a317fe0618 If a page belonging a reservation is cached, then mark the reservation so
that it will be freed to the cache pool rather than the default pool.
Otherwise, the cached pages within the reservation may be recycled sooner
than necessary.

Reported by:	Andrey Zonov
2012-04-08 17:00:46 +00:00