Commit Graph

96 Commits

Author SHA1 Message Date
vangyzen
70ab07a30b memstat_kvm_uma: fix reading of uma_zone_domain structures
Coverity flagged the scaling by sizeof(uzd).  That is the type
of the pointer, so the scaling was already done by pointer arithmetic.
However, this was also passing a stack frame pointer to kvm_read,
so it was doubly wrong.

Move ZDOM_GET into the !_KERNEL section and use it in libmemstat.

Reported by:	Coverity
Reviewed by:	markj
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D26213
2020-08-28 19:50:40 +00:00
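The double scaling described in 70ab07a30b can be illustrated with a minimal sketch. The structure and function names below are hypothetical stand-ins, not the real uma_zone_domain or ZDOM_GET definitions; the sketch only assumes that the buggy code multiplied an index by the element size before adding it to a typed pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-domain record; not the real struct uma_zone_domain. */
struct zdom_example {
	long	nitems;
	long	imax;
};

/* Correct: pointer arithmetic already advances by sizeof(*base) per index. */
static uintptr_t
zdom_addr_ok(const struct zdom_example *base, size_t i)
{
	return ((uintptr_t)(base + i));
}

/* Buggy: the index is scaled by hand and then again by the compiler. */
static uintptr_t
zdom_addr_double_scaled(const struct zdom_example *base, size_t i)
{
	return ((uintptr_t)(base + i * sizeof(*base)));
}
```

With a typed pointer, the second helper lands on element i * sizeof(*base) rather than element i, which is the kind of error a static analyzer can flag.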
jeff
72deafb875 Use per-domain locks for the bucket cache.
This gives much better concurrency when there are a large number of
cores per-domain and multiple domains.  Avoid taking the lock entirely
if it will not be productive.  ROUNDROBIN domains will have mixed
memory in each domain and will load balance to all domains.

While here refactor the zone/domain separation and bucket limits to
simplify callers.

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D23673
2020-02-19 18:48:46 +00:00
markj
b4a046c049 libmemstat: Catch up with r357776.
Reported by:	O. Hartmann <ohartmann@walstatt.org>
2020-02-11 20:15:49 +00:00
jeff
64902f3fb7 Fix libmemstat_uma build after r357485.
Submitted by:	cy
2020-02-04 05:27:45 +00:00
jeff
5c24f16c23 Use per-domain keg locks. This provides both a lock and separate space
accounting for each NUMA domain.  Independent keg domain locks are important
with cross-domain frees.  Hashed zones are non-numa and use a single keg
lock to protect the hash table.

Reviewed by:	markj, rlibby
Differential Revision:	https://reviews.freebsd.org/D22829
2020-01-04 03:30:08 +00:00
jeff
652639ff25 Optimize fast path allocations by storing bucket headers in the per-cpu
cache area.  This allows us to check on bucket space for all per-cpu
buckets with a single cacheline access and fewer branches.

Reviewed by:	markj, rlibby
Differential Revision:	https://reviews.freebsd.org/D22825
2019-12-25 20:50:53 +00:00
rlibby
50394a786d Revert r355706 & r355710
The quick fix didn't work.  I'll sort it out tomorrow.

Revert r355710: "libmemstat: unbreak build"
Revert r355706: "uma dbg: flexible size for slab debug bitset too"
2019-12-13 11:21:28 +00:00
rlibby
39dee0d088 libmemstat: unbreak build
r355706 added an instance of offsetof() to the UMA private kernel header
file uma_int.h.  Userspace memstat_uma.c includes that header, and
chokes on offsetof() because apparently the definition in sys/types.h is
ifdef _KERNEL.  Now, include sys/stddef.h which has an identical
definition.

Pointyhat to:	rlibby
Sponsored by:	Dell EMC Isilon
2019-12-13 10:34:19 +00:00
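As a rough illustration of why 39dee0d088 needed a userland-visible offsetof(), the sketch below sizes a structure that ends in a flexible bitset. The structure and helper are hypothetical and are not taken from uma_int.h; in portable userland C the macro comes from <stddef.h>.

```c
#include <assert.h>
#include <stddef.h>	/* userland definition of offsetof() */

/* Hypothetical slab header followed by a variable-length bitset. */
struct slab_example {
	unsigned char	hdr[8];
	unsigned long	bits[];
};

/* Bytes needed for a slab whose bitset holds nwords words. */
static size_t
slab_example_size(size_t nwords)
{
	return (offsetof(struct slab_example, bits) +
	    nwords * sizeof(unsigned long));
}
```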
sjg
16923f2426 Update Makefile.depend files
Update a bunch of Makefile.depend files as
a result of adding Makefile.depend.options files

Reviewed by:	 bdrewery
MFC after:	1 week
Sponsored by:   Juniper Networks
Differential Revision:  https://reviews.freebsd.org/D22494
2019-12-11 17:37:53 +00:00
manu
a77ce411ec pkgbase: Create a FreeBSD-utilities package and make it the default one
The default package used to be FreeBSD-runtime, but it should only contain
binaries and libs enough to boot to single user and repair the system; it
is also very handy to have a package that can be transformed into a small mfsroot.
So create a new package named FreeBSD-utilities and make it the default one.
Also move a few binaries and libs into this package when it makes sense.
Reviewed by:	bapt, gjb
Differential Revision:	https://reviews.freebsd.org/D21506
2019-09-05 14:15:47 +00:00
markj
5451b35f06 Extend uma_reclaim() to permit different reclamation targets.
The page daemon periodically invokes uma_reclaim() to reclaim cached
items from each zone when the system is under memory pressure.  This
is important since the size of these caches is unbounded by default.
However it also results in bursts of high latency when allocating from
heavily used zones as threads miss in the per-CPU caches and must
access the keg in order to allocate new items.

With r340405 we maintain an estimate of each zone's usage of its
(per-NUMA domain) cache of full buckets.  Start making use of this
estimate to avoid reclaiming the entire cache when under memory
pressure.  In particular, introduce TRIM, DRAIN and DRAIN_CPU
verbs for uma_reclaim() and uma_zone_reclaim().  When trimming, only
items in excess of the estimate are reclaimed.  Draining a zone
reclaims all of the cached full buckets (the previous behaviour of
uma_reclaim()), and may further drain the per-CPU caches in extreme
cases.

Now, when under memory pressure, the page daemon will trim zones
rather than draining them.  As a result, heavily used zones do not incur
bursts of bucket cache misses following reclamation, but large, unused
caches will be reclaimed as before.

Reviewed by:	jeff
Tested by:	pho (an earlier version)
MFC after:	2 months
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D16667
2019-09-01 22:22:43 +00:00
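The difference between trimming and draining in 5451b35f06 can be modelled with a toy userland sketch. All names here are hypothetical and the model counts items rather than buckets; it is not the kernel implementation, and the DRAIN_CPU case (per-CPU caches) is left out.

```c
#include <assert.h>

/* Hypothetical reclaim requests, modelled on the verbs named in the commit. */
enum reclaim_req { RECLAIM_TRIM, RECLAIM_DRAIN };

/* Toy model of one zone's cache of full buckets, counted in items. */
struct zone_cache {
	long	nitems;		/* items currently cached */
	long	estimate;	/* estimated usage of the cache, in items */
};

/* Release cached items and return how many were released. */
static long
cache_reclaim(struct zone_cache *zc, enum reclaim_req req)
{
	long freed, target;

	/* Trimming keeps the estimated usage; draining keeps nothing. */
	target = (req == RECLAIM_TRIM) ? zc->estimate : 0;
	if (zc->nitems <= target)
		return (0);
	freed = zc->nitems - target;
	zc->nitems = target;
	return (freed);
}
```

In this model a heavily used zone keeps the part of its cache it is expected to need after a trim, while a drain empties the cache regardless of the estimate.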
jeff
807c696ddc Add two new kernel options to control memory locality on NUMA hardware.
 - UMA_XDOMAIN enables an additional per-cpu bucket for freed memory that
   was freed on a different domain from where it was allocated.  This is
   only used for UMA_ZONE_NUMA (first-touch) zones.
 - UMA_FIRSTTOUCH sets the default UMA policy to be first-touch for all
   zones.  This tries to maintain locality for kernel memory.

Reviewed by:	gallatin, alc, kib
Tested by:	pho, gallatin
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20929
2019-08-06 21:50:34 +00:00
glebius
3c1186c5c7 The KVM code also needs a fix similar to r344269.
Reported by:	pho
2019-05-29 03:14:46 +00:00
glebius
7897f6746e With r343051 UMA switched from atomic counts to counter(9) and now the kernel
reports snapshot counts of how much a zone allocated and how much it freed.  It
may happen that the snapshot values don't match, e.g. alloced - freed < 0.
Work around that in the memstat library.

Reported by:	pho
2019-02-18 21:27:13 +00:00
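One plausible shape for the workaround in 7897f6746e is a clamped subtraction. The helper below is a hypothetical sketch, not the libmemstat source; it assumes the two counters are sampled independently, so the free count can momentarily exceed the allocation count.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of live items given independently sampled
 * allocation and free counts.
 */
static uint64_t
live_count(uint64_t allocs, uint64_t frees)
{
	/* Clamp to zero instead of letting the unsigned value wrap. */
	return (allocs > frees ? allocs - frees : 0);
}
```

Without the clamp, an unsigned subtraction would wrap to a very large value and be reported as an enormous number of items in use.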
glebius
b3ea1313c6 This was missed in r343051: make uz_allocs, uz_frees and uz_fails counter(9). 2019-01-15 18:47:19 +00:00
glebius
f1a8621cf2 o Move zone limit from keg level up to zone level. This means that now
  two zones sharing a keg may have different limits. Now this is going
  to work:

  zone = uma_zcreate();
  uma_zone_set_max(zone, limit);
  zone2 = uma_zsecond_create(zone);
  uma_zone_set_max(zone2, limit2);

  Kegs no longer have uk_maxpages field, but zones have uz_items. When
  set, it may be rounded up to minimum possible CPU bucket cache size.
  For small limits bucket cache can also be reconfigured to be smaller.
  Counter uz_items is updated whenever items transition from keg to a
  bucket cache or directly to a consumer. If zone has uz_maxitems set and
  it is reached, then we are going to sleep.

o Since new limits don't play well with multi-keg zones, remove them. The
  idea of multi-keg zones was introduced exactly 10 years ago, and has never
  had a practical use. In discussion with Jeff we came to a wild
  agreement that if we ever want to reintroduce the idea of a smart allocator
  that would be able to choose between two (or more) totally different
  backing stores, that choice should be made one level higher than UMA,
  e.g. in malloc(9) or in mget(), or wherever, and the choice should be controlled
  by the caller.

o Sleeping code is improved to account for the number of sleepers and wake them one
  by one, to avoid the thundering herd problem.

o Flag UMA_ZONE_NOBUCKETCACHE removed, instead uma_zone_set_maxcache()
  KPI added. Having no bucket cache basically means setting maxcache to 0.

o Now with many fields added and many removed (no multi-keg zones!) make
  sure that struct uma_zone is perfectly aligned.

Reviewed by:	markj, jeff
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D17773
2019-01-15 00:02:06 +00:00
mjg
dd8ab6470b libmemstat: adjust for per-cpu stats after r338899
Reported by:	yuripv
Reviewed by:	kib, markj
Approved by:	re (gjb)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17490
2018-10-11 23:25:14 +00:00
des
58d2db41a5 Reduce <sys/queue.h> pollution.
While <sys/sysctl.h> includes <sys/queue.h> unconditionally, it is only
actually used in code which is conditional on _KERNEL.  Make the #include
itself conditional as well, and fix userland code that uses <sys/queue.h>
for other purposes but relied on <sys/sysctl.h> to bring it in.

MFC after:	1 week
2018-05-11 00:01:43 +00:00
jeff
f375b4dd66 Implement NUMA support in uma(9) and malloc(9). Allocations from specific
domains can be done by the _domain() API variants.  UMA also supports a
first-touch policy via the NUMA zone flag.

The slab layer is now segregated by VM domains and is precise.  It handles
iteration for round-robin directly.  The per-cpu cache layer remains
a mix of domains according to where memory is allocated and freed.  Well
behaved clients can achieve perfect locality with no performance penalty.

The direct domain allocation functions have to visit the slab layer and
so require per-zone locks which come at some expense.

Reviewed by:	Attilio (a slightly older version)
Tested by:	pho
Sponsored by:	Netflix, Dell/EMC Isilon
2018-01-12 23:25:05 +00:00
pfg
260ba0bff1 lib: further adoption of SPDX licensing ID tags.
Mainly focus on files that use the BSD 2-Clause license; however, the tool I
was using misidentified many licenses, so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
open source licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
supersede or replace the license texts.
2017-11-26 02:00:33 +00:00
bdrewery
a598c4b809 DIRDEPS_BUILD: Update dependencies.
Sponsored by:	Dell EMC Isilon
2017-10-31 00:07:04 +00:00
jhibbits
8d17c296fe Fix buildworld for powerpc.
vmpage requires struct pmap to exist and contain a pm_stats field.  As of
r308817, either AIM or BOOKE is required to be set in order to get their
respective pmap structs.  Rather than expose them both, or try to unify them
unnecessarily, add a third option which contains only a pm_stats field, and
change the two existing pmap structures to place the common fields at the
beginning of the struct.  This actually fixes the stats collection by libkvm on
AIM hardware, because before it was accessing a possibly different offset, which
would cause it to read garbage.

Bump __FreeBSD_version to denote this ABI change, so that ports which depend on
libkvm can be rebuilt.
2016-11-20 06:10:12 +00:00
gjb
e4997c6184 MFH
Sponsored by:	The FreeBSD Foundation
2016-02-10 04:20:39 +00:00
glebius
b3c4f0ddbf Include sys/_task.h into uma_int.h, so that taskqueue.h isn't a
requirement for uma_int.h.

Suggested by:	jhb
2016-02-09 20:22:35 +00:00
gjb
a44dc347a7 MFH
Sponsored by:	The FreeBSD Foundation
2016-02-08 12:16:01 +00:00
gjb
fef2698edf First pass through library packaging.
Sponsored by:	The FreeBSD Foundation
2016-02-04 21:16:35 +00:00
glebius
c805a3354e Fix build. 2016-02-04 00:23:21 +00:00
bdrewery
e13d6f8b3f META MODE: Prefer INSTALL=tools/install.sh to lessen the need for xinstall.host.
This both avoids some dependencies on xinstall.host and allows
bootstrapping on older releases to work due to lack of at least 'install -l'
support.

Sponsored by:	EMC / Isilon Storage Division
2015-11-25 19:10:28 +00:00
sjg
008d7c831f Add META_MODE support.
Off by default, so the build behaves normally.
With WITH_META_MODE we get automatic objdir creation and the ability to
start a build from anywhere in the tree.

Still need to add real targets under targets/ to build packages.

Differential Revision:       D2796
Reviewed by: brooks imp
2015-06-13 19:20:56 +00:00
sjg
75a137820d dirdeps.mk now sets DEP_RELDIR 2015-06-08 23:35:17 +00:00
sjg
65145fa4c8 Merge sync of head 2015-05-27 01:19:58 +00:00
bapt
6adce30d28 Convert libraries to use LIBADD
While here, reduce overlinking a bit
2014-11-25 11:07:26 +00:00
sjg
d7cd1d425c Merge head from 7/28 2014-08-19 06:50:54 +00:00
bapt
1f77f137dc use .Mt to mark up email addresses consistently (part3)
PR:		191174
Submitted by:	Franco Fichtner  <franco at lastsummer.de>
2014-06-23 08:23:05 +00:00
sjg
5860f0d106 Updated dependencies 2014-05-16 14:09:51 +00:00
sjg
1a7e48acf1 Updated dependencies 2014-05-10 05:16:28 +00:00
sjg
0c7e03a54c Merge head 2014-04-27 08:13:43 +00:00
glebius
665c1c0919 Expose real size of UMA allocations via libmemstat(3).
Sponsored by:	Nginx, Inc.
2014-02-10 20:09:10 +00:00
sjg
62bb106222 Merge from head 2013-09-05 20:18:59 +00:00
jeff
cca9ad5b94 Refine UMA bucket allocation to reduce space consumption and improve
performance.

 - Always free to the alloc bucket if there is space.  This gives LIFO
   allocation order to improve hot-cache performance.  This also allows
   for zones with a single bucket per-cpu rather than a pair if the entire
   working set fits in one bucket.
 - Enable per-cpu caches of buckets.  To prevent recursive bucket
   allocation one bucket zone still has per-cpu caches disabled.
 - Pick the initial bucket size based on a table driven maximum size
   per-bucket rather than the number of items per-page.  This gives
   more sane initial sizes.
 - Only grow the bucket size when we face contention on the zone lock; this
   causes bucket sizes to grow more slowly.
 - Adjust the number of items per-bucket to account for the header space.
   This packs the buckets more efficiently per-page while making them
   not quite powers of two.
 - Eliminate the per-zone free bucket list.  Always return buckets back
   to the bucket zone.  This ensures that as zones grow into larger
   bucket sizes they eventually discard the smaller sizes.  It persists
   fewer buckets in the system.  The locking is slightly trickier.
 - Only switch buckets in zalloc, not zfree; this eliminates pathological
   cases where we ping-pong between two buckets.
 - Ensure that the thread that fills a new bucket gets to allocate from
   it to give a better upper bound on allocation time.

Sponsored by:	EMC / Isilon Storage Division
2013-06-18 04:50:20 +00:00
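The first item in cca9ad5b94, freeing into the allocation bucket to get LIFO order, can be sketched with a toy fixed-size bucket. The names and the capacity are hypothetical; the real UMA bucket layout and locking are not modelled.

```c
#include <assert.h>
#include <stddef.h>

#define	BUCKET_CAP	4	/* hypothetical capacity */

/* Toy bucket: a small stack of item pointers. */
struct bucket_example {
	int	 cnt;
	void	*items[BUCKET_CAP];
};

/* Free into the bucket if there is space; returns 1 on success. */
static int
bucket_free(struct bucket_example *b, void *item)
{
	if (b->cnt == BUCKET_CAP)
		return (0);
	b->items[b->cnt++] = item;
	return (1);
}

/* Allocate the most recently freed item, or NULL if the bucket is empty. */
static void *
bucket_alloc(struct bucket_example *b)
{
	return (b->cnt == 0 ? NULL : b->items[--b->cnt]);
}
```

Because allocation pops the item that was freed last, the next allocation tends to reuse memory that is still hot in the CPU cache.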
sjg
6d37b86f2b Updated dependencies 2013-03-11 17:21:52 +00:00
sjg
0ee5295509 Updated dependencies 2013-02-16 01:23:54 +00:00
sjg
9f7bd28e77 Updated/new Makefile.depend 2012-11-08 21:24:17 +00:00
sjg
778e93c51a Sync from head 2012-11-04 02:52:03 +00:00
mdf
1bc1b805d7 Const-ify the zone name argument to uma_zcreate(9).
MFC after:	3 days
2012-10-26 17:51:05 +00:00
marcel
9dd41e3647 Sync FreeBSD's bmake branch with Juniper's internal bmake branch.
Requested by: Simon Gerraty <sjg@juniper.net>
2012-08-22 19:25:57 +00:00
gjb
9761e3fdaf Fix various typos in manual pages.
Submitted by:	amdmi3
PR:		165431
MFC after:	1 week
2012-02-25 14:31:25 +00:00
pluknet
0660d162e1 Cosmetic cleanup: remove #define LIBMEMSTAT used to prevent a nested
include of opt_vmpage.h from vm/vm_page.h.  opt_vmpage.h was retired
before 7.0 together with options PQ_NOOPT.

Approved by:	re (kib)
MFC after:	3 days
2011-09-02 14:10:42 +00:00
pluknet
3ec0e5bcb6 Get rid of MAXCPU knowledge used for internal needs only. Switch to
dynamic memory allocation to hold per-CPU memory types data (sized to
mp_maxid for UMA, and to mp_maxcpus for malloc to match the kernel).

That fixes libmemstat with arbitrarily large MAXCPU values and therefore
eliminates the MEMSTAT_ERROR_TOOMANYCPUS error type.

Reviewed by:	jhb
Approved by:	re (kib)
2011-08-01 09:43:35 +00:00
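The switch from a compile-time MAXCPU array to dynamic allocation in 3ec0e5bcb6 might look roughly like the sketch below. The types and function names are hypothetical, not the libmemstat ones; the CPU count is assumed to be obtained at run time by the caller.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-CPU counters for one memory type. */
struct percpu_example {
	uint64_t	allocs;
	uint64_t	frees;
};

/* Hypothetical memory type record with a runtime-sized per-CPU array. */
struct mem_type_example {
	int			 ncpus;
	struct percpu_example	*percpu;
};

/* Allocate a record sized to the CPU count; returns NULL on failure. */
static struct mem_type_example *
mem_type_example_alloc(int ncpus)
{
	struct mem_type_example *mt;

	if (ncpus <= 0)
		return (NULL);
	mt = malloc(sizeof(*mt));
	if (mt == NULL)
		return (NULL);
	/* calloc() zeroes the counters and checks the size multiplication. */
	mt->percpu = calloc((size_t)ncpus, sizeof(*mt->percpu));
	if (mt->percpu == NULL) {
		free(mt);
		return (NULL);
	}
	mt->ncpus = ncpus;
	return (mt);
}

/* Release a record created by mem_type_example_alloc(). */
static void
mem_type_example_free(struct mem_type_example *mt)
{
	if (mt == NULL)
		return;
	free(mt->percpu);
	free(mt);
}
```

Sizing the array at run time means no fixed upper bound has to be compiled into the library, which is why a "too many CPUs" error is no longer needed.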
attilio
27825059cd Revert r222363, as bde@ pointed out the initial solution was far more
correct.
2011-05-31 20:59:53 +00:00