Commit Graph

13 Commits

Author SHA1 Message Date
John Baldwin
c0ae66888b Create a cpuset mask for each NUMA domain that is available in the
kernel via the global cpuset_domain[] array. To export these to userland,
add a CPU_WHICH_DOMAIN level that can be used to fetch the mask for a
specific domain. Add a -d flag to cpuset(1) that can be used to fetch
the mask for a given domain.

Differential Revision:	https://reviews.freebsd.org/D1232
Submitted by:	jeff (kernel bits)
Reviewed by:	adrian, jeff
2015-01-08 15:53:13 +00:00
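
The per-domain masks described in commit c0ae66888b can be pictured with a small user-space sketch. This is not the kernel code: the function name, the `SKETCH_MAXMEMDOM` limit, and the use of a single `uint64_t` per mask (instead of `cpuset_t`) are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAXMEMDOM 4   /* hypothetical limit for this sketch */

/*
 * Build one CPU bitmask per domain from a cpu -> domain table.
 * masks[d] has bit c set when CPU c belongs to domain d.
 * Limited to 64 CPUs because each mask is a single uint64_t.
 */
static void
sketch_build_domain_masks(const int *cpu_domain, int ncpus,
    uint64_t masks[SKETCH_MAXMEMDOM])
{
    assert(ncpus >= 0 && ncpus <= 64);
    memset(masks, 0, sizeof(uint64_t) * SKETCH_MAXMEMDOM);
    for (int cpu = 0; cpu < ncpus; cpu++) {
        int dom = cpu_domain[cpu];

        assert(dom >= 0 && dom < SKETCH_MAXMEMDOM);
        masks[dom] |= (uint64_t)1 << cpu;
    }
}
```

With CPUs 0 and 1 in domain 0 and CPUs 2 and 3 in domain 1, the sketch yields the masks 0x3 and 0xC; a domain with no CPUs keeps an empty mask.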
Adrian Chadd
fc4f524a6e Missing from previous commit - keep the VM domain -> PXM mapping
array and use it to map PXM -> VM domain when needed.

Differential Revision:	D906
Reviewed by:	jhb
2014-10-09 05:34:28 +00:00
John Baldwin
e07ef9b0f6 Move <machine/apicvar.h> to <x86/apicvar.h>. 2014-01-23 20:10:22 +00:00
Konstantin Belousov
449c2e92c9 Split the pagequeues per NUMA domain, and split the pagedaemon process
into threads, each processing the queue of a single domain.  The structure
of the pagedaemons and queues is kept intact, most of the changes come
from the need for code to find an owning page queue for given page,
calculated from the segment containing the page.

The tie between NUMA domain and pagedaemon thread/pagequeue split is
rather arbitrary, the multithreaded daemon could be allowed for the
single-domain machines, or one domain might be split into several page
domains, to further increase concurrency.

Right now, each pagedaemon thread tries to reach the global target,
precalculated at the start of the pass.  This is not optimal, since it
could cause excessive page deactivation and freeing.  The code should
be changed to re-check the global page deficit state in the loop after
some number of iterations.

The pagedaemons reach a quorum before starting the OOM, since one
thread's inability to meet the target is normal for split queues.  Only
when all pagedaemons fail to produce enough reusable pages is OOM
started, by a single selected thread.

Launder is modified to take into account the segments layout with
regard to the region for which cleaning is performed.

Based on the preliminary patch by jeff, sponsored by EMC / Isilon
Storage Division.

Reviewed by:	alc
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
2013-08-07 16:36:38 +00:00
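
The quorum rule from commit 449c2e92c9 (OOM starts only once every pagedaemon thread has failed to meet its target) can be sketched as a simple vote counter. The structure, the function name, and the fixed array size are hypothetical, and the sketch is single-threaded; the real code would need atomic or locked updates.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAXDOM 8      /* arbitrary bound for this sketch */

struct sketch_oom_state {
    int  ndomains;               /* one pagedaemon thread per domain */
    int  votes;                  /* threads currently reporting a shortfall */
    bool voted[SKETCH_MAXDOM];   /* per-thread vote flag */
};

/*
 * Record the result of one pass by the thread for domain dom.
 * Returns true only for the thread whose vote completes the quorum,
 * i.e. the single thread that should start the OOM handling.
 */
static bool
sketch_oom_vote(struct sketch_oom_state *st, int dom, bool met_target)
{
    assert(st->ndomains <= SKETCH_MAXDOM);
    assert(dom >= 0 && dom < st->ndomains);

    if (met_target) {
        /* A successful pass withdraws any earlier vote. */
        if (st->voted[dom]) {
            st->voted[dom] = false;
            st->votes--;
        }
        return false;
    }
    if (st->voted[dom])
        return false;            /* already counted */
    st->voted[dom] = true;
    st->votes++;
    return st->votes == st->ndomains;
}
```

A single failing domain never triggers OOM on its own; only the vote that brings the count up to `ndomains` returns true.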
Attilio Rao
7e226537c7 o Add accessor functions to add and remove pages from a specific
freelist.
o Split the pool of free pages queues really by domain and not rely on
  definition of VM_RAW_NFREELIST.
o For MAXMEMDOM > 1, wrap the RR allocation logic into a specific
  function that is called when calculating the allocation domain.
  The RR counter is kept, currently, per-thread.
  In the future it is expected that such function evolves into a real
  policy decision referee, based on specific information retrieved from
  per-thread and per-vm_object attributes.
o Add the concept of "probed domains" under the form of vm_ndomains.
  It is the responsibility of every architecture willing to support multiple
  memory domains to correctly probe vm_ndomains along with mem_affinity
  segments attributes.  Those two values are supposed to remain always
  consistent.
  Please also note that vm_ndomains and td_dom_rr_idx are both int
  because segments already store domains as int.  Ideally u_int would
  make much more sense.  Probably this should be cleaned up in the
  future.
o Apply RR domain selection also to vm_phys_zero_pages_idle().

Sponsored by:	EMC / Isilon storage division
Partly obtained from:	jeff
Reviewed by:	alc
Tested by:	jeff
2013-05-13 15:40:51 +00:00
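
The per-thread round-robin (RR) domain selection mentioned in commit 7e226537c7 could look roughly like the following. Only the names `td_dom_rr_idx` and `vm_ndomains` come from the commit message; the structure, the function name, and the exact update rule are assumptions for illustration.

```c
#include <assert.h>

/* Stand-in for the per-thread state; only the RR counter is modelled. */
struct sketch_thread {
    int td_dom_rr_idx;   /* next domain to try, assumed non-negative */
};

/*
 * Pick the domain for the next allocation and advance the
 * per-thread counter so successive calls cycle through all
 * probed domains.
 */
static int
sketch_rr_domain(struct sketch_thread *td, int vm_ndomains)
{
    assert(vm_ndomains > 0);
    assert(td->td_dom_rr_idx >= 0);

    int dom = td->td_dom_rr_idx % vm_ndomains;

    td->td_dom_rr_idx = (dom + 1) % vm_ndomains;
    return dom;
}
```

Keeping the counter per thread means each thread spreads its own allocations across domains without touching shared state.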
Attilio Rao
ab13ed1e45 Revert r250339 as apparently it is more clutter than help.
Sponsored by:	EMC / Isilon storage division
Requested by:	jhb
2013-05-08 21:06:47 +00:00
Attilio Rao
16e073e57a Add functions to do ACPI System Locality Information Table parsing
and printing at boot.
For details on the table's contents and purpose, please review the ACPI specs.

Sponsored by:	EMC / Isilon storage division
Obtained from:	jeff
Reviewed by:	jhb (earlier version)
2013-05-07 22:49:56 +00:00
Attilio Rao
941646f5ec Rename VM_NDOMAIN into MAXMEMDOM and move it into machine/param.h in
order to match the MAXCPU concept.  The change should also be useful
for consolidation and consistency.

Sponsored by:	EMC / Isilon storage division
Obtained from:	jeff
Reviewed by:	alc
2013-05-07 22:46:24 +00:00
John Baldwin
174b5f3850 Make VM_NDOMAIN a kernel option so that it can be enabled from a kernel
config file.

Requested by:	phk (ages ago)
MFC after:	1 month
2013-02-14 19:38:04 +00:00
John Baldwin
289908743e Fix a few bugs in the SRAT parsing code:
- Actually increment ndomain when building our list of known domains
  so that we can properly renumber them to be 0-based and dense.
- If the number of domains exceeds the configured maximum (VM_NDOMAIN),
  bail out of processing the SRAT and disable NUMA rather than hitting an
  obscure panic later.
- Don't bother parsing the SRAT at all if VM_NDOMAIN is set to 1 to
  disable NUMA (the default).

Reported by:	phk (2)
MFC after:	1 week
2012-01-03 20:53:58 +00:00
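
The first two fixes in commit 289908743e (counting known domains so they can be renumbered 0-based and dense, and bailing out when the configured maximum is exceeded) can be sketched as follows. The function name and `SKETCH_MAXDOM` are hypothetical, and the sketch assigns dense IDs in first-seen order as a simplification; the real parser may order them differently.

```c
#include <assert.h>

#define SKETCH_MAXDOM 4   /* stands in for the configured maximum */

/*
 * Map raw domain IDs to dense, 0-based IDs.
 * raw[i] is the ID found for entry i; dense[i] receives the new ID.
 * Returns the number of distinct domains, or -1 if there are more
 * than SKETCH_MAXDOM, in which case the caller should disable NUMA.
 */
static int
sketch_renumber_domains(const int *raw, int n, int *dense)
{
    int known[SKETCH_MAXDOM];
    int ndomain = 0;

    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        int j;

        for (j = 0; j < ndomain; j++)
            if (known[j] == raw[i])
                break;
        if (j == ndomain) {
            if (ndomain == SKETCH_MAXDOM)
                return -1;           /* too many domains */
            known[ndomain++] = raw[i];   /* actually count it */
        }
        dense[i] = j;
    }
    return ndomain;
}
```

Raw IDs 10, 3, 10, 7 become 0, 1, 0, 2 with three domains reported; a fifth distinct ID makes the function return -1 instead of overflowing the table.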
John Baldwin
4d99cfb313 Ignore SRAT memory entries if the memory range does not overlap with an
existing phys_avail[] table.  If a hw.physmem setting causes a memory
domain to not be present in phys_avail[], the SRAT table will now be
ignored rather than triggering a panic when a CPU in the missing domain
tries to allocate a page.

MFC after:	1 week
2011-10-05 16:03:47 +00:00
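
The overlap test implied by commit 4d99cfb313 can be sketched as an interval check against a list of available physical ranges. The function name is hypothetical, and the table layout assumed here (start/end pairs, half-open ranges, terminated by a pair whose end is 0) is an assumption modelled on phys_avail[], not a copy of the kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if [start, end) overlaps any range in avail[].
 * avail[] holds {start, end} pairs and is terminated by a pair
 * whose end is 0.
 */
static bool
sketch_overlaps_avail(const uint64_t *avail, uint64_t start, uint64_t end)
{
    assert(start < end);
    for (int i = 0; avail[i + 1] != 0; i += 2) {
        /* Two half-open ranges overlap when each starts before the other ends. */
        if (avail[i] < end && start < avail[i + 1])
            return true;
    }
    return false;
}
```

A memory entry for which this returns false would be skipped, so a domain hidden by a reduced hw.physmem setting never reaches the allocator.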
John Baldwin
6676877bd9 When performing a sanity check on the SRAT table to ensure that each
memory domain has an assigned CPU, ignore disabled CPUs.  Previously
disabled CPUs were counted as being in domain 0.

Reported by:	mdf
2010-07-29 17:37:35 +00:00
John Baldwin
dd540b4623 Add a parser for the ACPI SRAT table for amd64 and i386. It sets
PCPU(domain) for each CPU and populates a mem_affinity array suitable
for the NUMA support in the physical memory allocator.

Reviewed by:	alc
2010-07-27 20:40:46 +00:00