Commit Graph

55 Commits

Author SHA1 Message Date
markj
58afbd3942 Revise the page cache size policy.
In r353734 the use of the page caches was limited to systems with a
relatively large amount of RAM per CPU.  This was to mitigate some
issues reported with the system not able to keep up with memory pressure
in cases where it had been able to do so prior to the addition of the
direct free pool cache.  This change re-enables those caches.

The change modifies uma_zone_set_maxcache(), which was introduced
specifically for the page cache zones.  Rather than using it to limit
only the full bucket cache, have it also set uz_count_max to provide an
upper bound on the per-CPU cache size that is consistent with the number
of items requested.  Remove its return value since it has no use.

Enable the page cache zones unconditionally, and limit them to 0.1% of
the domain's pages.  The limit can be overridden by the
vm.pgcache_zone_max tunable as before.

Change the item size parameter passed to uma_zcache_create() to the
correct size, and stop setting UMA_ZONE_MAXBUCKET.  This allows the page
cache buckets to be adaptively sized, like the rest of UMA's caches.
This also causes the initial bucket size to be small, so only systems
which benefit from large caches will get them.

Reviewed by:	gallatin, jeff
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22393
2019-11-22 16:30:47 +00:00
markj
5451b35f06 Extend uma_reclaim() to permit different reclamation targets.
The page daemon periodically invokes uma_reclaim() to reclaim cached
items from each zone when the system is under memory pressure.  This
is important since the size of these caches is unbounded by default.
However it also results in bursts of high latency when allocating from
heavily used zones as threads miss in the per-CPU caches and must
access the keg in order to allocate new items.

With r340405 we maintain an estimate of each zone's usage of its
(per-NUMA domain) cache of full buckets.  Start making use of this
estimate to avoid reclaiming the entire cache when under memory
pressure.  In particular, introduce TRIM, DRAIN and DRAIN_CPU
verbs for uma_reclaim() and uma_zone_reclaim().  When trimming, only
items in excess of the estimate are reclaimed.  Draining a zone
reclaims all of the cached full buckets (the previous behaviour of
uma_reclaim()), and may further drain the per-CPU caches in extreme
cases.

Now, when under memory pressure, the page daemon will trim zones
rather than draining them.  As a result, heavily used zones do not incur
bursts of bucket cache misses following reclamation, but large, unused
caches will be reclaimed as before.

Reviewed by:	jeff
Tested by:	pho (an earlier version)
MFC after:	2 months
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D16667
2019-09-01 22:22:43 +00:00
markj
3cffe5128e Update and clean up the UMA man page.
- Fix warnings from igor and mandoc.
- Provide a brief description of the separation between zones and their
  backend slab allocators.
- Document cache zones and secondary zones.
- Document the kernel config options added in r350659.
- Document the uma_zalloc_pcpu() and uma_zfree_pcpu() wrappers.
- Document uma_zone_reserve(), uma_zone_reserve_kva() and
  uma_zone_prealloc().
- Document uma_zone_alloc() and uma_zone_freef().
- Add some missing MLINKs and Xrefs.

MFC after:	2 weeks
2019-08-30 19:35:44 +00:00
jtl
8222f5cb7c Make UMA and malloc(9) return non-executable memory in most cases.
Most kernel memory that is allocated after boot does not need to be
executable.  There are a few exceptions.  For example, kernel modules
do need executable memory, but they don't use UMA or malloc(9).  The
BPF JIT compiler also needs executable memory and did use malloc(9)
until r317072.

(Note that a side effect of r316767 was that the "small allocation"
path in UMA on amd64 already returned non-executable memory.  This
meant that some calls to malloc(9) or the UMA zone(9) allocator could
return executable memory, while others could return non-executable
memory.  This change makes the behavior consistent.)

This change makes malloc(9) return non-executable memory unless the new
M_EXEC flag is specified.  After this change, the UMA zone(9) allocator
will always return non-executable memory, and a KASSERT will catch
attempts to use the M_EXEC flag to allocate executable memory using
uma_zalloc() or its variants.

Allocations that do need executable memory have various choices.  They
may use the M_EXEC flag to malloc(9), or they may use a different VM
interfact to obtain executable pages.

Now that malloc(9) again allows executable allocations, this change also
reverts most of r317072.

PR:		228927
Reviewed by:	alc, kib, markj, jhb (previous version)
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D15691
2018-06-13 17:04:41 +00:00
jeff
3e6c614462 Document new NUMA related syscalls and utility options.
Sponsored by:	Netflix, Dell/EMC Isilon
2018-03-24 23:58:44 +00:00
trasz
023e12f8a9 Fix formatting errors that resulted in apropos(1) output looking weird.
MFC after:	2 weeks
2018-03-17 11:41:06 +00:00
glebius
dce26e08f3 UMA_ZONE_REFCNT was removed.
PR:		209715
Submitted by:	Fabian Keil <fk fabiankeil.de>
MFC after:	3 days
2017-04-26 17:55:43 +00:00
trasz
989dbbd065 Fix a bunch of "sentence not on new line" warnings in section 9.
MFC after:	1 month
2016-06-08 09:19:47 +00:00
wblock
0506ba8df2 Spelling fixes supplied by pfg@, detected with codespell, plus
additional misspellings detected by igor.

MFC after:	1 week
2016-05-01 22:00:41 +00:00
imp
9def4cf348 Read-only is hyphenated when it modifies a noun. 2016-01-16 00:37:27 +00:00
jtl
94d8d1452b Add a safety net to reclaim mbufs when one of the mbuf zones become
exhausted.

It is possible for a bug in the code (or, theoretically, even unusual
network conditions) to exhaust all possible mbufs or mbuf clusters.
When this occurs, things can grind to a halt fairly quickly. However,
we currently do not call mb_reclaim() unless the entire system is
experiencing a low-memory condition.

While it is best to try to prevent exhaustion of one of the mbuf zones,
it would also be useful to have a mechanism to attempt to recover from
these situations by freeing "expendable" mbufs.

This patch makes two changes:

a) The patch adds a generic API to the UMA zone allocator to set a
function that should be called when an allocation fails because the
zone limit has been reached. Because of the way this function can be
called, it really should do minimal work.

b) The patch uses this API to try to free mbufs when an allocation
fails from one of the mbuf zones because the zone limit has been
reached. The function schedules a callout to run mb_reclaim().

Differential Revision:	https://reviews.freebsd.org/D3864
Reviewed by:	gnn
Comments by:	rrs, glebius
MFC after:	2 weeks
Sponsored by:	Juniper Networks
2015-12-20 02:05:33 +00:00
brueffer
3d2f95b7b4 Fix various mdoc issues and some EOL whitespace.
Found with:	mandoc -Tlint
2014-12-21 10:57:42 +00:00
bapt
21f6fe7ae4 use .Mt to mark up email addresses consistently (part6)
PR:		191174
Submitted by:	Franco Fichtner <franco at lastsummer.de>
2014-06-26 21:44:30 +00:00
glebius
e8c2426587 Provide macros that allow easily export uma(9) zone limits and
current usage via sysctl(9):

  SYSCTL_UMA_MAX()
  SYSCTL_ADD_UMA_MAX()
  SYSCTL_UMA_CUR()
  SYSCTL_ADD_UMA_CUR()

Sponsored by:	Nginx, Inc.
2014-02-07 14:29:03 +00:00
joel
d5b5017793 Remove contractions. 2013-04-11 18:46:41 +00:00
glebius
7f9db020a2 Merge from projects/counters: UMA_ZONE_PCPU zones.
These zones have slab size == sizeof(struct pcpu), but request from VM
enough pages to fit (uk_slabsize * mp_ncpus). An item allocated from such
zone would have a separate twin for each CPU in the system, and these twins
are at a distance of sizeof(struct pcpu) from each other. This magic value
of distance would allow us to make some optimizations later.

  To address private item from a CPU simple arithmetics should be used:

  item = (type *)((char *)base + sizeof(struct pcpu) * curcpu)

  These arithmetics are available as zpcpu_get() macro in pcpu.h.

  To introduce non-page size slabs a new field had been added to uma_keg
uk_slabsize. This shifted some frequently used fields of uma_keg to the
fourth cache line on amd64. To mitigate this pessimization, uma_keg fields
were a bit rearranged and least frequently used uk_name and uk_link moved
down to the fourth cache line. All other fields, that are dereferenced
frequently fit into first three cache lines.

Sponsored by:	Nginx, Inc.
2013-04-08 19:10:45 +00:00
glebius
22bd645df5 Document some flags to the uma_zcreate(). Not all flags are documented,
only those that at least are used in the kernel, or that definitely
work.
2013-03-21 16:19:46 +00:00
glebius
04d26633fa Document uma_find_refcnt(). 2013-03-21 16:04:34 +00:00
pjd
a585ca9ec8 Implemented uma_zone_set_warning(9) function that sets a warning, which
will be printed once the given zone becomes full and cannot allocate an
item. The warning will not be printed more often than every five minutes.

All UMA warnings can be globally turned off by setting sysctl/tunable
vm.zone_warnings to 0.

Discussed on:	arch
Obtained from:	WHEEL Systems
MFC after:	2 weeks
2012-12-07 22:27:13 +00:00
trasz
61a7d2c215 Make it clear that NULL can only be returned when M_NOWAIT was used. 2012-10-28 21:01:32 +00:00
gjb
9761e3fdaf Fix various typos in manual pages.
Submitted by:	amdmi3
PR:		165431
MFC after:	1 week
2012-02-25 14:31:25 +00:00
ed
23524b572c Globally replace u_int*_t from (non-contributed) man pages.
The reasoning behind this, is that if we are consistent in our
documentation about the uint*_t stuff, people will be less tempted to
write new code that uses the non-standard types.

I am not going to bump the man page dates, as these changes can be
considered style nits. The meaning of the man pages is unaffected.

MFC after:	1 month
2012-02-12 18:29:56 +00:00
uqs
5179964e55 Re-encode files from ISO-8859-1 to UTF-8 2011-05-22 14:03:30 +00:00
mdf
3f66b92677 uma_zfree(zone, NULL) should do nothing, to match free(9).
Noticed by:	Ron Steinke <rsteinke at isilon dot com>
MFC after:	3 days
2010-10-19 16:06:00 +00:00
lstewart
4171305db0 Change uma_zone_set_max to return the effective value of "nitems" after
rounding. The same value can also be obtained with uma_zone_get_max, but this
change avoids a caller having to make two back-to-back calls.

Sponsored by:	FreeBSD Foundation
Reviewed by:	gnn, jhb
2010-10-16 04:41:45 +00:00
lstewart
2770ad22da - Simplify implementation of uma_zone_get_max.
- Add uma_zone_get_cur which returns the current approximate occupancy of
  a zone. This is useful for providing stats via sysctl amongst other things.

Sponsored by:	FreeBSD Foundation
Reviewed by:	gnn, jhb
MFC after:	2 weeks
2010-10-16 04:14:45 +00:00
remko
78396ae0b6 Document the _arg versions of the uma_zalloc and uma_zfree functions.
PR:		docs/120357
Submitted by:	gahr
MFC after:	3 days
2008-06-19 18:33:38 +00:00
ru
3bdce07112 Bump document date for the previous change. 2006-10-21 16:08:21 +00:00
kib
ff66a30202 Remove long untrue note about storing state information inside free items.
OKed by:	rwatson, tegge
Approved by:	pjd (mentor)
MFC after:	1 week
2006-10-02 07:27:00 +00:00
des
42f562f3a8 I don't normally use my middle name, so remove it from attributions in
man pages (though not from copyright notices).  While I'm here, add email
addresses where appropriate.
2004-01-25 11:39:42 +00:00
harti
0914303fac Document uma_zone_set_max and its non-obvious behaviour.
Reviewed by:	bmilekic
2003-07-21 14:20:58 +00:00
hmp
d535a2bede Various mdoc(7) fixes:
Add devfs(5) reference                   - make_dev.9
	 Change .Xr from VFS_VGET(9) to vget(9)   - vnode.9
	 Spelling fix, 'useage' to 'usage'        - zone.9

Approved by:	des (mentor)
2003-05-31 14:20:30 +00:00
ru
6d3a461a4f mdoc(7) police: scheduled sweep.
Approved by:	re
2002-11-29 11:39:20 +00:00
alfred
b80f7ad39f Flesh out the description of the uma_zcreate callback function arguements
a bit.  As there may be changes soon we're still a bit vague unfortunatly.
2002-11-18 01:11:58 +00:00
ru
1b01e6fb63 mdoc(7) police: Fix SYNOPSIS, bump document date. 2002-05-30 11:37:39 +00:00
asmodai
06b04bc450 Add description for uma_zcreate().
Submitted by:	arr
2002-05-18 11:12:02 +00:00
asmodai
3292015fc5 Chase the sources and document the change of wait to flags, which are
the normal malloc(9) flags.

Submitted by:	arr
2002-04-30 16:30:19 +00:00
asmodai
1fe12f5d34 Remove references to zinit() which does not exist anymore. 2002-04-30 15:04:41 +00:00
asmodai
e5f4e75d0c Document the zone allocator is now a slab allocator.
Show Jeff's work and your's truly manual page updates.
2002-04-30 14:56:44 +00:00
asmodai
410ec4b478 Document uma_zalloc() behaviour. 2002-04-30 14:26:22 +00:00
asmodai
b9890f2065 Update function arguments to what is current used. 2002-04-30 13:03:28 +00:00
asmodai
4d8fcc6041 Prefix the remaining functions with uma_ as is now the case in UMA. 2002-04-30 12:45:31 +00:00
asmodai
32b1462881 zinit() does not exist anymore. 2002-04-30 12:29:59 +00:00
asmodai
0b2fc084f5 Remove references to zbootinit() and zinitna(). 2002-04-30 09:47:50 +00:00
asmodai
8adefc573a Do not use a contraction, aren't -> are not. 2002-04-30 09:38:52 +00:00
asmodai
b6b2470fee Remove wrong include, one is supposed to include vm/uma.h instead of
vm_zone.h.
2002-04-30 08:51:42 +00:00
ru
04417a8c35 mdoc(7) police: get rid of WEOL and HSB introduced in rev 1.6. 2002-01-10 13:02:55 +00:00
davidc
6b067561b6 Update function definitions and required include files to reflect
the current state of the system.

Approved by: alfred
2001-12-26 23:14:04 +00:00
julian
9e5d1d1b57 Make the man page reflec t the code a bit better.
Specifically, note the condition of the memory on initial
and subsequent allocations is different.
2001-12-14 19:19:31 +00:00
ru
623da62a5a mdoc(7) police: Use the new .In macro for #include statements. 2001-10-01 16:09:29 +00:00