2005-07-14 17:40:02 +00:00
|
|
|
/*-
|
2017-11-26 02:00:33 +00:00
|
|
|
* SPDX-License-Identifier: BSD-2-Clause-FreeBSD
|
|
|
|
*
|
2006-02-11 19:21:39 +00:00
|
|
|
* Copyright (c) 2005-2006 Robert N. M. Watson
|
2005-07-14 17:40:02 +00:00
|
|
|
* All rights reserved.
|
|
|
|
*
|
|
|
|
* Redistribution and use in source and binary forms, with or without
|
|
|
|
* modification, are permitted provided that the following conditions
|
|
|
|
* are met:
|
|
|
|
* 1. Redistributions of source code must retain the above copyright
|
|
|
|
* notice, this list of conditions and the following disclaimer.
|
|
|
|
* 2. Redistributions in binary form must reproduce the above copyright
|
|
|
|
* notice, this list of conditions and the following disclaimer in the
|
|
|
|
* documentation and/or other materials provided with the distribution.
|
|
|
|
*
|
|
|
|
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
|
|
|
|
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
|
|
|
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
|
|
|
* ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
|
|
|
|
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
|
|
|
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
|
|
|
|
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
|
|
|
|
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
|
|
|
|
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
|
|
|
|
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
|
|
|
|
* SUCH DAMAGE.
|
|
|
|
*
|
|
|
|
* $FreeBSD$
|
|
|
|
*/
|
|
|
|
|
2021-12-05 22:27:33 +01:00
|
|
|
#define _WANT_FREEBSD_BITSET
|
|
|
|
|
2005-07-14 17:40:02 +00:00
|
|
|
#include <sys/param.h>
|
2019-01-15 18:47:19 +00:00
|
|
|
#include <sys/counter.h>
|
2011-05-08 14:45:53 +00:00
|
|
|
#include <sys/cpuset.h>
|
2005-07-14 17:40:02 +00:00
|
|
|
#include <sys/sysctl.h>
|
|
|
|
|
|
|
|
#include <vm/uma.h>
|
2005-08-01 19:07:39 +00:00
|
|
|
#include <vm/uma_int.h>
|
2005-07-14 17:40:02 +00:00
|
|
|
|
|
|
|
#include <err.h>
|
|
|
|
#include <errno.h>
|
2005-08-01 19:07:39 +00:00
|
|
|
#include <kvm.h>
|
|
|
|
#include <nlist.h>
|
2006-02-11 19:19:29 +00:00
|
|
|
#include <stddef.h>
|
2005-07-14 17:40:02 +00:00
|
|
|
#include <stdio.h>
|
|
|
|
#include <stdlib.h>
|
|
|
|
#include <string.h>
|
Commit the support for removing cpumask_t and replacing it directly with
cpuset_t objects.
That is going to offer the underlying support for a simple bump of
MAXCPU and then support for number of cpus > 32 (as it is today).
Right now, cpumask_t is an int, 32 bits on all our supported architectures.
cpuset_t, on the other hand, is implemented as an array of longs, and
is easily extensible by definition.
The architectures touched by this commit are the following:
- amd64
- i386
- pc98
- arm
- ia64
- XEN
while the others are still missing.
Userland is believed to be fully converted with the changes contained
here.
Some technical notes:
- This commit may be considered an ABI nop for all the architectures
different from amd64 and ia64 (and sparc64 in the future)
- per-cpu members, which are now converted to cpuset_t, need to be
accessed avoiding migration, because the size of cpuset_t should be
considered unknown
- size of cpuset_t objects is different between kernel and userland (this is
primarily done in order to leave some more space in userland to cope
with KBI extensions). If you need to access kernel cpuset_t from
userland, please refer to the example in this patch on how to do that
correctly (kgdb may be a good source, for example).
- Support for other architectures is going to be added soon
- Only MAXCPU for amd64 is bumped now
The patch has been tested by sbruno and Nicholas Esborn on opteron
4 x 12 pack CPUs. More testing on big SMP is expected to come soon.
pluknet tested the patch with his 8-ways on both amd64 and i386.
Tested by: pluknet, sbruno, gianni, Nicholas Esborn
Reviewed by: jeff, jhb, sbruno
2011-05-05 14:39:14 +00:00
|
|
|
#include <unistd.h>
|
2005-07-14 17:40:02 +00:00
|
|
|
|
|
|
|
#include "memstat.h"
|
|
|
|
#include "memstat_internal.h"
|
|
|
|
|
2005-08-01 19:07:39 +00:00
|
|
|
static struct nlist namelist[] = {
|
|
|
|
#define X_UMA_KEGS 0
|
|
|
|
{ .n_name = "_uma_kegs" },
|
|
|
|
#define X_MP_MAXID 1
|
|
|
|
{ .n_name = "_mp_maxid" },
|
2006-02-11 18:44:37 +00:00
|
|
|
#define X_ALL_CPUS 2
|
|
|
|
{ .n_name = "_all_cpus" },
|
2018-01-12 23:25:05 +00:00
|
|
|
#define X_VM_NDOMAINS 3
|
|
|
|
{ .n_name = "_vm_ndomains" },
|
2005-08-01 19:07:39 +00:00
|
|
|
{ .n_name = "" },
|
|
|
|
};
|
|
|
|
|
2005-07-14 17:40:02 +00:00
|
|
|
/*
|
|
|
|
* Extract uma(9) statistics from the running kernel, and store all memory
|
|
|
|
* type information in the passed list. For each type, check the list for an
|
|
|
|
* existing entry with the right name/allocator -- if present, update that
|
|
|
|
* entry. Otherwise, add a new entry. On error, the entire list will be
|
|
|
|
* cleared, as entries will be in an inconsistent state.
|
|
|
|
*
|
|
|
|
* To reduce the level of work for a list that starts empty, we keep around a
|
|
|
|
* hint as to whether it was empty when we began, so we can avoid searching
|
|
|
|
* the list for entries to update. Updates are O(n^2) due to searching for
|
|
|
|
* each entry before adding it.
|
|
|
|
*/
|
|
|
|
int
|
|
|
|
memstat_sysctl_uma(struct memory_type_list *list, int flags)
|
|
|
|
{
|
|
|
|
struct uma_stream_header *ushp;
|
|
|
|
struct uma_type_header *uthp;
|
|
|
|
struct uma_percpu_stat *upsp;
|
|
|
|
struct memory_type *mtp;
|
2011-08-01 09:43:35 +00:00
|
|
|
int count, hint_dontsearch, i, j, maxcpus, maxid;
|
2005-07-14 17:40:02 +00:00
|
|
|
char *buffer, *p;
|
|
|
|
size_t size;
|
|
|
|
|
2005-07-24 01:28:54 +00:00
|
|
|
hint_dontsearch = LIST_EMPTY(&list->mtl_list);
|
2005-07-14 17:40:02 +00:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Query the maximum CPU ID and the number of UMA zones so that we can
|
|
|
|
* guess an initial buffer size. We loop until we succeed or really
|
|
|
|
* fail. Note that the CPU count implied by the maxid we query using sysctl is not
|
|
|
|
* the version we use when processing the real data -- that is read
|
|
|
|
* from the header.
|
|
|
|
*/
|
|
|
|
retry:
|
2011-08-01 09:43:35 +00:00
|
|
|
size = sizeof(maxid);
|
|
|
|
if (sysctlbyname("kern.smp.maxid", &maxid, &size, NULL, 0) < 0) {
|
2005-07-24 01:28:54 +00:00
|
|
|
if (errno == EACCES || errno == EPERM)
|
|
|
|
list->mtl_error = MEMSTAT_ERROR_PERMISSION;
|
|
|
|
else
|
|
|
|
list->mtl_error = MEMSTAT_ERROR_DATAERROR;
|
2005-07-14 17:40:02 +00:00
|
|
|
return (-1);
|
|
|
|
}
|
2011-08-01 09:43:35 +00:00
|
|
|
if (size != sizeof(maxid)) {
|
2005-07-24 01:28:54 +00:00
|
|
|
list->mtl_error = MEMSTAT_ERROR_DATAERROR;
|
2005-07-14 17:40:02 +00:00
|
|
|
return (-1);
|
|
|
|
}
|
|
|
|
|
|
|
|
size = sizeof(count);
|
|
|
|
if (sysctlbyname("vm.zone_count", &count, &size, NULL, 0) < 0) {
|
2005-07-24 01:28:54 +00:00
|
|
|
if (errno == EACCES || errno == EPERM)
|
|
|
|
list->mtl_error = MEMSTAT_ERROR_PERMISSION;
|
|
|
|
else
|
|
|
|
list->mtl_error = MEMSTAT_ERROR_VERSION;
|
2005-07-14 17:40:02 +00:00
|
|
|
return (-1);
|
|
|
|
}
|
|
|
|
if (size != sizeof(count)) {
|
2005-07-24 01:28:54 +00:00
|
|
|
list->mtl_error = MEMSTAT_ERROR_DATAERROR;
|
2005-07-14 17:40:02 +00:00
|
|
|
return (-1);
|
|
|
|
}
|
|
|
|
|
|
|
|
size = sizeof(*ushp) + count * (sizeof(*uthp) + sizeof(*upsp) *
|
2011-08-01 09:43:35 +00:00
|
|
|
(maxid + 1));
|
2005-07-14 17:40:02 +00:00
|
|
|
|
|
|
|
buffer = malloc(size);
|
|
|
|
if (buffer == NULL) {
|
2005-07-24 01:28:54 +00:00
|
|
|
list->mtl_error = MEMSTAT_ERROR_NOMEMORY;
|
2005-07-14 17:40:02 +00:00
|
|
|
return (-1);
|
|
|
|
}
|
|
|
|
|
|
|
|
if (sysctlbyname("vm.zone_stats", buffer, &size, NULL, 0) < 0) {
|
|
|
|
/*
|
|
|
|
* XXXRW: ENOMEM is an ambiguous return, we should bound the
|
|
|
|
* number of loops, perhaps.
|
|
|
|
*/
|
|
|
|
if (errno == ENOMEM) {
|
|
|
|
free(buffer);
|
|
|
|
goto retry;
|
|
|
|
}
|
2005-07-24 01:28:54 +00:00
|
|
|
if (errno == EACCES || errno == EPERM)
|
|
|
|
list->mtl_error = MEMSTAT_ERROR_PERMISSION;
|
|
|
|
else
|
|
|
|
list->mtl_error = MEMSTAT_ERROR_VERSION;
|
2005-07-14 17:40:02 +00:00
|
|
|
free(buffer);
|
|
|
|
return (-1);
|
|
|
|
}
|
|
|
|
|
|
|
|
if (size == 0) {
|
|
|
|
free(buffer);
|
|
|
|
return (0);
|
|
|
|
}
|
|
|
|
|
|
|
|
if (size < sizeof(*ushp)) {
|
2005-07-24 01:28:54 +00:00
|
|
|
list->mtl_error = MEMSTAT_ERROR_VERSION;
|
2005-07-14 17:40:02 +00:00
|
|
|
free(buffer);
|
|
|
|
return (-1);
|
|
|
|
}
|
|
|
|
p = buffer;
|
|
|
|
ushp = (struct uma_stream_header *)p;
|
|
|
|
p += sizeof(*ushp);
|
|
|
|
|
|
|
|
if (ushp->ush_version != UMA_STREAM_VERSION) {
|
2005-07-24 01:28:54 +00:00
|
|
|
list->mtl_error = MEMSTAT_ERROR_VERSION;
|
2005-07-14 17:40:02 +00:00
|
|
|
free(buffer);
|
|
|
|
return (-1);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* For the remainder of this function, we are quite trusting about
|
|
|
|
* the layout of structures and sizes, since we've determined we have
|
|
|
|
* a matching version and acceptable CPU count.
|
|
|
|
*/
|
|
|
|
maxcpus = ushp->ush_maxcpus;
|
|
|
|
count = ushp->ush_count;
|
|
|
|
for (i = 0; i < count; i++) {
|
|
|
|
uthp = (struct uma_type_header *)p;
|
|
|
|
p += sizeof(*uthp);
|
|
|
|
|
|
|
|
if (hint_dontsearch == 0) {
|
|
|
|
mtp = memstat_mtl_find(list, ALLOCATOR_UMA,
|
|
|
|
uthp->uth_name);
|
|
|
|
} else
|
|
|
|
mtp = NULL;
|
|
|
|
if (mtp == NULL)
|
2005-07-23 21:17:15 +00:00
|
|
|
mtp = _memstat_mt_allocate(list, ALLOCATOR_UMA,
|
2011-08-01 09:43:35 +00:00
|
|
|
uthp->uth_name, maxid + 1);
|
2005-07-14 17:40:02 +00:00
|
|
|
if (mtp == NULL) {
|
2005-08-01 13:18:21 +00:00
|
|
|
_memstat_mtl_empty(list);
|
2005-07-14 17:40:02 +00:00
|
|
|
free(buffer);
|
2005-07-24 01:28:54 +00:00
|
|
|
list->mtl_error = MEMSTAT_ERROR_NOMEMORY;
|
2005-07-14 17:40:02 +00:00
|
|
|
return (-1);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Reset the statistics on a current node.
|
|
|
|
*/
|
2011-08-01 09:43:35 +00:00
|
|
|
_memstat_mt_reset_stats(mtp, maxid + 1);
|
2005-07-14 17:40:02 +00:00
|
|
|
|
2005-07-14 20:01:04 +00:00
|
|
|
mtp->mt_numallocs = uthp->uth_allocs;
|
|
|
|
mtp->mt_numfrees = uthp->uth_frees;
|
2005-07-15 23:39:21 +00:00
|
|
|
mtp->mt_failures = uthp->uth_fails;
|
2010-06-15 19:28:37 +00:00
|
|
|
mtp->mt_sleeps = uthp->uth_sleeps;
|
2019-08-06 21:50:34 +00:00
|
|
|
mtp->mt_xdomain = uthp->uth_xdomain;
|
2005-07-14 20:01:04 +00:00
|
|
|
|
2005-07-14 17:40:02 +00:00
|
|
|
for (j = 0; j < maxcpus; j++) {
|
|
|
|
upsp = (struct uma_percpu_stat *)p;
|
|
|
|
p += sizeof(*upsp);
|
|
|
|
|
|
|
|
mtp->mt_percpu_cache[j].mtp_free =
|
|
|
|
upsp->ups_cache_free;
|
|
|
|
mtp->mt_free += upsp->ups_cache_free;
|
|
|
|
mtp->mt_numallocs += upsp->ups_allocs;
|
|
|
|
mtp->mt_numfrees += upsp->ups_frees;
|
|
|
|
}
|
|
|
|
|
2019-02-18 21:27:13 +00:00
|
|
|
/*
|
|
|
|
* Values for uth_allocs and uth_frees are snapshots.
|
|
|
|
* It may happen that the kernel reports that the number of frees
|
|
|
|
* is greater than number of allocs. See counter(9) for
|
|
|
|
* details.
|
|
|
|
*/
|
|
|
|
if (mtp->mt_numallocs < mtp->mt_numfrees)
|
|
|
|
mtp->mt_numallocs = mtp->mt_numfrees;
|
|
|
|
|
2005-07-14 17:40:02 +00:00
|
|
|
mtp->mt_size = uthp->uth_size;
|
2014-02-10 20:09:10 +00:00
|
|
|
mtp->mt_rsize = uthp->uth_rsize;
|
2005-07-14 20:01:04 +00:00
|
|
|
mtp->mt_memalloced = mtp->mt_numallocs * uthp->uth_size;
|
|
|
|
mtp->mt_memfreed = mtp->mt_numfrees * uthp->uth_size;
|
2005-07-14 17:40:02 +00:00
|
|
|
mtp->mt_bytes = mtp->mt_memalloced - mtp->mt_memfreed;
|
|
|
|
mtp->mt_countlimit = uthp->uth_limit;
|
|
|
|
mtp->mt_byteslimit = uthp->uth_limit * uthp->uth_size;
|
|
|
|
|
|
|
|
mtp->mt_count = mtp->mt_numallocs - mtp->mt_numfrees;
|
UMA supports "secondary" zones, in which a second zone can be layered
on top of a primary zone, sharing the same allocation "keg". When
reporting statistics for zones, do not report the free items in the
keg as part of the free items in the zone, or those free items will
be reported more than once: for the primary zone, and then any
secondary zones off the primary zone. Separately record and maintain
a kegfree statistic, and export via memstat_get_kegfree(), which is
available for use if needed. Since items free'd back to the keg are
not fully initialized, and hence may not actually be available (since
secondary zone ctor-time initialization can fail), this makes some
amount of sense.
This change corrects a bug made visible in the libmemstat(3)
modifications to netstat: mbufs freed back to the keg from the
packet zone would be counted twice, resulting in negative values
being printed in the mbuf free count.
Some further refinement of reporting relating to secondary zones may
still be required.
Reported by: ssouhlal
MFC after: 3 days
2005-07-20 09:17:40 +00:00
|
|
|
mtp->mt_zonefree = uthp->uth_zone_free;
|
2005-07-25 09:52:59 +00:00
|
|
|
|
|
|
|
/*
|
|
|
|
* UMA secondary zones share a keg with the primary zone. To
|
|
|
|
* avoid double-reporting of free items, report keg free
|
|
|
|
* items only in the primary zone.
|
|
|
|
*/
|
|
|
|
if (!(uthp->uth_zone_flags & UTH_ZONE_SECONDARY)) {
|
|
|
|
mtp->mt_kegfree = uthp->uth_keg_free;
|
2005-08-01 13:18:21 +00:00
|
|
|
mtp->mt_free += mtp->mt_kegfree;
|
2005-07-25 09:52:59 +00:00
|
|
|
}
|
2005-07-14 17:40:02 +00:00
|
|
|
mtp->mt_free += mtp->mt_zonefree;
|
|
|
|
}
|
|
|
|
|
|
|
|
free(buffer);
|
|
|
|
|
|
|
|
return (0);
|
|
|
|
}
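The buffer walk in memstat_sysctl_uma() can be illustrated in isolation. The following is a minimal sketch, not the library's code: the demo_* structures and demo_sum_allocs() are hypothetical, simplified stand-ins for the real uma_stream_header, uma_type_header and uma_percpu_stat records. It assumes the same layout (one stream header, then for each type a type header followed by one per-CPU record per CPU), and unlike the function above it checks the remaining length before every read instead of trusting the header.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified record layouts (not the real <vm/uma.h> ones). */
struct demo_stream_header {
	uint32_t dsh_version;
	uint32_t dsh_maxcpus;	/* per-CPU records following each type header */
	uint32_t dsh_count;	/* number of type headers in the stream */
};

struct demo_type_header {
	char     dth_name[16];
	uint64_t dth_allocs;
	uint64_t dth_frees;
};

struct demo_percpu_stat {
	uint64_t dps_allocs;
	uint64_t dps_frees;
};

/*
 * Walk the stream and sum allocations over all types and CPUs.
 * Returns 0 on success, or -1 if the buffer is shorter than the
 * header claims.  Records are copied out with memcpy() so that the
 * buffer does not need any particular alignment.
 */
int
demo_sum_allocs(const char *buf, size_t len, uint64_t *total)
{
	struct demo_stream_header sh;
	struct demo_type_header th;
	struct demo_percpu_stat ps;
	size_t off = 0;
	uint32_t i, j;

	if (len < sizeof(sh))
		return (-1);
	memcpy(&sh, buf, sizeof(sh));
	off += sizeof(sh);

	*total = 0;
	for (i = 0; i < sh.dsh_count; i++) {
		if (len - off < sizeof(th))
			return (-1);
		memcpy(&th, buf + off, sizeof(th));
		off += sizeof(th);
		*total += th.dth_allocs;

		for (j = 0; j < sh.dsh_maxcpus; j++) {
			if (len - off < sizeof(ps))
				return (-1);
			memcpy(&ps, buf + off, sizeof(ps));
			off += sizeof(ps);
			*total += ps.dps_allocs;
		}
	}
	return (0);
}
```

A caller would pass the buffer and the length actually returned by the data source; a truncated buffer then produces -1 rather than a read past the end.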
|
2005-08-01 19:07:39 +00:00
|
|
|
|
|
|
|
static int
|
|
|
|
kread(kvm_t *kvm, void *kvm_pointer, void *address, size_t size,
|
|
|
|
size_t offset)
|
|
|
|
{
|
|
|
|
ssize_t ret;
|
|
|
|
|
|
|
|
ret = kvm_read(kvm, (unsigned long)kvm_pointer + offset, address,
|
|
|
|
size);
|
|
|
|
if (ret < 0)
|
|
|
|
return (MEMSTAT_ERROR_KVM);
|
|
|
|
if ((size_t)ret != size)
|
|
|
|
return (MEMSTAT_ERROR_KVM_SHORTREAD);
|
|
|
|
return (0);
|
|
|
|
}
|
|
|
|
|
|
|
|
static int
|
2012-10-26 17:51:05 +00:00
|
|
|
kread_string(kvm_t *kvm, const void *kvm_pointer, char *buffer, int buflen)
|
2005-08-01 19:07:39 +00:00
|
|
|
{
|
|
|
|
ssize_t ret;
|
|
|
|
int i;
|
|
|
|
|
|
|
|
for (i = 0; i < buflen; i++) {
|
|
|
|
ret = kvm_read(kvm, (unsigned long)kvm_pointer + i,
|
|
|
|
&(buffer[i]), sizeof(char));
|
|
|
|
if (ret < 0)
|
|
|
|
return (MEMSTAT_ERROR_KVM);
|
|
|
|
if ((size_t)ret != sizeof(char))
|
|
|
|
return (MEMSTAT_ERROR_KVM_SHORTREAD);
|
|
|
|
if (buffer[i] == '\0')
|
|
|
|
return (0);
|
|
|
|
}
|
|
|
|
/* Truncate. */
|
|
|
|
buffer[i-1] = '\0';
|
|
|
|
return (0);
|
|
|
|
}
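The byte-at-a-time copy with truncation used by kread_string() can be sketched without libkvm. This is an illustration under stated assumptions, not the library's code: struct demo_mem and demo_read_byte() are hypothetical stand-ins for a kvm handle and kvm_read(), backed by a local array. The sketch also rejects a non-positive buflen up front, a case in which the function above would index buffer[-1].

```c
#include <stddef.h>

/* Hypothetical stand-in for a kvm handle: a bounded local byte array. */
struct demo_mem {
	const char *dm_base;
	size_t      dm_len;
};

/* Hypothetical stand-in for kvm_read() of one byte; -1 if out of range. */
static int
demo_read_byte(const struct demo_mem *mem, size_t addr, char *out)
{
	if (addr >= mem->dm_len)
		return (-1);
	*out = mem->dm_base[addr];
	return (0);
}

/*
 * Copy a NUL-terminated string starting at addr into buffer, keeping at
 * most buflen - 1 characters.  Returns 0 on success, or -1 on a failed
 * read or a non-positive buflen.
 */
int
demo_read_string(const struct demo_mem *mem, size_t addr, char *buffer,
    int buflen)
{
	int i;

	if (buflen <= 0)
		return (-1);
	for (i = 0; i < buflen; i++) {
		if (demo_read_byte(mem, addr + (size_t)i, &buffer[i]) != 0)
			return (-1);
		if (buffer[i] == '\0')
			return (0);
	}
	/* No terminator within buflen bytes: truncate. */
	buffer[buflen - 1] = '\0';
	return (0);
}
```

Reading one byte at a time is slow but never reads beyond the terminator, which matters when the string sits near the end of a mapped region.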
|
|
|
|
|
|
|
|
static int
|
|
|
|
kread_symbol(kvm_t *kvm, int index, void *address, size_t size,
|
|
|
|
size_t offset)
|
|
|
|
{
|
|
|
|
ssize_t ret;
|
|
|
|
|
|
|
|
ret = kvm_read(kvm, namelist[index].n_value + offset, address, size);
|
|
|
|
if (ret < 0)
|
|
|
|
return (MEMSTAT_ERROR_KVM);
|
|
|
|
if ((size_t)ret != size)
|
|
|
|
return (MEMSTAT_ERROR_KVM_SHORTREAD);
|
|
|
|
return (0);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* memstat_kvm_uma() is similar to memstat_sysctl_uma(), only it extracts
|
|
|
|
* UMA(9) statistics from a kernel core/memory file.
|
|
|
|
*/
|
|
|
|
int
|
|
|
|
memstat_kvm_uma(struct memory_type_list *list, void *kvm_handle)
|
|
|
|
{
|
2006-01-16 00:37:20 +00:00
|
|
|
LIST_HEAD(, uma_keg) uma_kegs;
|
2005-08-01 19:07:39 +00:00
|
|
|
struct memory_type *mtp;
|
2018-01-12 23:25:05 +00:00
|
|
|
struct uma_zone_domain uzd;
|
2020-01-04 03:30:08 +00:00
|
|
|
struct uma_domain ukd;
|
2005-08-01 19:07:39 +00:00
|
|
|
struct uma_bucket *ubp, ub;
|
2006-02-11 19:19:29 +00:00
|
|
|
struct uma_cache *ucp, *ucp_array;
|
2005-08-01 19:07:39 +00:00
|
|
|
struct uma_zone *uzp, uz;
|
|
|
|
struct uma_keg *kzp, kz;
|
2020-01-04 03:30:08 +00:00
|
|
|
uint64_t kegfree;
|
2018-01-12 23:25:05 +00:00
|
|
|
int hint_dontsearch, i, mp_maxid, ndomains, ret;
|
2005-08-01 19:07:39 +00:00
|
|
|
char name[MEMTYPE_MAXNAME];
|
2011-05-02 17:13:40 +00:00
|
|
|
cpuset_t all_cpus;
|
2011-05-05 14:39:14 +00:00
|
|
|
long cpusetsize;
|
2005-08-01 19:07:39 +00:00
|
|
|
kvm_t *kvm;
|
|
|
|
|
|
|
|
kvm = (kvm_t *)kvm_handle;
|
|
|
|
hint_dontsearch = LIST_EMPTY(&list->mtl_list);
|
|
|
|
if (kvm_nlist(kvm, namelist) != 0) {
|
|
|
|
list->mtl_error = MEMSTAT_ERROR_KVM;
|
|
|
|
return (-1);
|
|
|
|
}
|
|
|
|
if (namelist[X_UMA_KEGS].n_type == 0 ||
|
|
|
|
namelist[X_UMA_KEGS].n_value == 0) {
|
|
|
|
list->mtl_error = MEMSTAT_ERROR_KVM_NOSYMBOL;
|
|
|
|
return (-1);
|
|
|
|
}
|
|
|
|
ret = kread_symbol(kvm, X_MP_MAXID, &mp_maxid, sizeof(mp_maxid), 0);
|
|
|
|
if (ret != 0) {
|
|
|
|
list->mtl_error = ret;
|
|
|
|
return (-1);
|
|
|
|
}
|
2018-01-12 23:25:05 +00:00
|
|
|
ret = kread_symbol(kvm, X_VM_NDOMAINS, &ndomains,
|
|
|
|
sizeof(ndomains), 0);
|
|
|
|
if (ret != 0) {
|
|
|
|
list->mtl_error = ret;
|
|
|
|
return (-1);
|
|
|
|
}
|
2005-08-01 19:07:39 +00:00
|
|
|
ret = kread_symbol(kvm, X_UMA_KEGS, &uma_kegs, sizeof(uma_kegs), 0);
|
|
|
|
if (ret != 0) {
|
|
|
|
list->mtl_error = ret;
|
|
|
|
return (-1);
|
|
|
|
}
|
2011-05-05 14:39:14 +00:00
|
|
|
cpusetsize = sysconf(_SC_CPUSET_SIZE);
|
2011-05-31 20:59:53 +00:00
|
|
|
if (cpusetsize == -1 || (u_long)cpusetsize > sizeof(cpuset_t)) {
|
2011-05-05 14:39:14 +00:00
|
|
|
list->mtl_error = MEMSTAT_ERROR_KVM_NOSYMBOL;
|
|
|
|
return (-1);
|
|
|
|
}
|
|
|
|
CPU_ZERO(&all_cpus);
|
|
|
|
ret = kread_symbol(kvm, X_ALL_CPUS, &all_cpus, cpusetsize, 0);
|
2006-02-11 18:44:37 +00:00
|
|
|
if (ret != 0) {
|
|
|
|
list->mtl_error = ret;
|
|
|
|
return (-1);
|
|
|
|
}
|
2006-02-11 19:19:29 +00:00
|
|
|
ucp_array = malloc(sizeof(struct uma_cache) * (mp_maxid + 1));
|
|
|
|
if (ucp_array == NULL) {
|
|
|
|
list->mtl_error = MEMSTAT_ERROR_NOMEMORY;
|
|
|
|
return (-1);
|
|
|
|
}
|
2005-08-01 19:07:39 +00:00
|
|
|
for (kzp = LIST_FIRST(&uma_kegs); kzp != NULL; kzp =
|
|
|
|
LIST_NEXT(&kz, uk_link)) {
|
|
|
|
ret = kread(kvm, kzp, &kz, sizeof(kz), 0);
|
|
|
|
if (ret != 0) {
|
2006-02-11 19:19:29 +00:00
|
|
|
free(ucp_array);
|
2005-08-01 19:07:39 +00:00
|
|
|
_memstat_mtl_empty(list);
|
|
|
|
list->mtl_error = ret;
|
|
|
|
return (-1);
|
|
|
|
}
|
|
|
|
for (uzp = LIST_FIRST(&kz.uk_zones); uzp != NULL; uzp =
|
|
|
|
LIST_NEXT(&uz, uz_link)) {
|
|
|
|
ret = kread(kvm, uzp, &uz, sizeof(uz), 0);
|
|
|
|
if (ret != 0) {
|
2006-02-11 19:19:29 +00:00
|
|
|
free(ucp_array);
|
|
|
|
_memstat_mtl_empty(list);
|
|
|
|
list->mtl_error = ret;
|
|
|
|
return (-1);
|
|
|
|
}
|
|
|
|
ret = kread(kvm, uzp, ucp_array,
|
|
|
|
sizeof(struct uma_cache) * (mp_maxid + 1),
|
|
|
|
offsetof(struct uma_zone, uz_cpu[0]));
|
|
|
|
if (ret != 0) {
|
|
|
|
free(ucp_array);
|
2005-08-01 19:07:39 +00:00
|
|
|
_memstat_mtl_empty(list);
|
|
|
|
list->mtl_error = ret;
|
|
|
|
return (-1);
|
|
|
|
}
|
|
|
|
ret = kread_string(kvm, uz.uz_name, name,
|
|
|
|
MEMTYPE_MAXNAME);
|
|
|
|
if (ret != 0) {
|
2006-02-11 19:19:29 +00:00
|
|
|
free(ucp_array);
|
2005-08-01 19:07:39 +00:00
|
|
|
_memstat_mtl_empty(list);
|
|
|
|
list->mtl_error = ret;
|
|
|
|
return (-1);
|
|
|
|
}
|
|
|
|
if (hint_dontsearch == 0) {
|
|
|
|
mtp = memstat_mtl_find(list, ALLOCATOR_UMA,
|
|
|
|
name);
|
|
|
|
} else
|
|
|
|
mtp = NULL;
|
|
|
|
if (mtp == NULL)
|
|
|
|
mtp = _memstat_mt_allocate(list, ALLOCATOR_UMA,
|
2011-08-01 09:43:35 +00:00
|
|
|
name, mp_maxid + 1);
|
2005-08-01 19:07:39 +00:00
|
|
|
if (mtp == NULL) {
|
2006-02-11 19:19:29 +00:00
|
|
|
free(ucp_array);
|
2005-08-01 19:07:39 +00:00
|
|
|
_memstat_mtl_empty(list);
|
|
|
|
list->mtl_error = MEMSTAT_ERROR_NOMEMORY;
|
|
|
|
return (-1);
|
|
|
|
}
|
|
|
|
/*
|
|
|
|
* Reset the statistics on a current node.
|
|
|
|
*/
|
2011-08-01 09:43:35 +00:00
|
|
|
_memstat_mt_reset_stats(mtp, mp_maxid + 1);
|
2019-01-15 18:47:19 +00:00
|
|
|
mtp->mt_numallocs = kvm_counter_u64_fetch(kvm,
|
|
|
|
(unsigned long)uz.uz_allocs);
|
|
|
|
mtp->mt_numfrees = kvm_counter_u64_fetch(kvm,
|
|
|
|
(unsigned long)uz.uz_frees);
|
|
|
|
mtp->mt_failures = kvm_counter_u64_fetch(kvm,
|
|
|
|
(unsigned long)uz.uz_fails);
|
2020-02-19 18:48:46 +00:00
|
|
|
mtp->mt_xdomain = kvm_counter_u64_fetch(kvm,
|
|
|
|
(unsigned long)uz.uz_xdomain);
|
2010-06-15 19:28:37 +00:00
|
|
|
mtp->mt_sleeps = uz.uz_sleeps;
|
2019-05-29 03:14:46 +00:00
|
|
|
/* See comment above in memstat_sysctl_uma(). */
|
|
|
|
if (mtp->mt_numallocs < mtp->mt_numfrees)
|
|
|
|
mtp->mt_numallocs = mtp->mt_numfrees;
|
|
|
|
|
2005-08-01 19:07:39 +00:00
|
|
|
if (kz.uk_flags & UMA_ZFLAG_INTERNAL)
|
|
|
|
goto skip_percpu;
|
|
|
|
for (i = 0; i < mp_maxid + 1; i++) {
|
2011-05-02 17:13:40 +00:00
|
|
|
if (!CPU_ISSET(i, &all_cpus))
|
2006-02-11 18:44:37 +00:00
|
|
|
continue;
|
2006-02-11 19:19:29 +00:00
|
|
|
ucp = &ucp_array[i];
|
2005-08-01 19:07:39 +00:00
|
|
|
mtp->mt_numallocs += ucp->uc_allocs;
|
|
|
|
mtp->mt_numfrees += ucp->uc_frees;
|
|
|
|
|
2019-12-25 20:50:53 +00:00
|
|
|
mtp->mt_free += ucp->uc_allocbucket.ucb_cnt;
|
|
|
|
mtp->mt_free += ucp->uc_freebucket.ucb_cnt;
|
|
|
|
mtp->mt_free += ucp->uc_crossbucket.ucb_cnt;
|
2005-08-01 19:07:39 +00:00
|
|
|
}
|
|
|
|
skip_percpu:
|
|
|
|
mtp->mt_size = kz.uk_size;
|
2014-02-10 20:09:10 +00:00
|
|
|
mtp->mt_rsize = kz.uk_rsize;
|
2005-08-01 19:07:39 +00:00
|
|
|
mtp->mt_memalloced = mtp->mt_numallocs * mtp->mt_size;
|
|
|
|
mtp->mt_memfreed = mtp->mt_numfrees * mtp->mt_size;
|
2006-02-11 16:54:00 +00:00
|
|
|
mtp->mt_bytes = mtp->mt_memalloced - mtp->mt_memfreed;
|
o Move zone limit from keg level up to zone level. This means that now
two zones sharing a keg may have different limits. Now this is going
to work:
zone = uma_zcreate();
uma_zone_set_max(zone, limit);
zone2 = uma_zsecond_create(zone);
uma_zone_set_max(zone2, limit2);
Kegs no longer have uk_maxpages field, but zones have uz_items. When
set, it may be rounded up to minimum possible CPU bucket cache size.
For small limits bucket cache can also be reconfigured to be smaller.
Counter uz_items is updated whenever items transition from keg to a
bucket cache or directly to a consumer. If zone has uz_maxitems set and
it is reached, then we are going to sleep.
o Since new limits don't play well with multi-keg zones, remove them. The
idea of multi-keg zones was introduced exactly 10 years ago, and never
have had a practical usage. In discussion with Jeff we came to a wild
agreement that if we ever want to reintroduce the idea of a smart allocator
that would be able to choose between two (or more) totally different
backing stores, that choice should be made one level higher than UMA,
e.g. in malloc(9) or in mget(), or whatever and choice should be controlled
by the caller.
o Sleeping code is improved to account number of sleepers and wake them one
by one, to avoid thundering herd problem.
o Flag UMA_ZONE_NOBUCKETCACHE removed, instead uma_zone_set_maxcache()
KPI added. Having no bucket cache basically means setting maxcache to 0.
o Now with many fields added and many removed (no multi-keg zones!) make
sure that struct uma_zone is perfectly aligned.
Reviewed by: markj, jeff
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D17773
2019-01-15 00:02:06 +00:00
|
|
|
mtp->mt_countlimit = uz.uz_max_items;
|
2005-08-01 19:07:39 +00:00
|
|
|
mtp->mt_byteslimit = mtp->mt_countlimit * mtp->mt_size;
|
|
|
|
mtp->mt_count = mtp->mt_numallocs - mtp->mt_numfrees;
|
2018-01-12 23:25:05 +00:00
|
|
|
for (i = 0; i < ndomains; i++) {
|
2020-08-28 19:50:40 +00:00
|
|
|
ret = kread(kvm, ZDOM_GET(uzp, i), &uzd,
|
|
|
|
sizeof(uzd), 0);
|
2020-01-04 03:30:08 +00:00
|
|
|
if (ret != 0)
|
|
|
|
continue;
|
2018-01-12 23:25:05 +00:00
|
|
|
for (ubp =
|
2020-02-04 05:27:45 +00:00
|
|
|
STAILQ_FIRST(&uzd.uzd_buckets);
|
2018-01-12 23:25:05 +00:00
|
|
|
ubp != NULL;
|
2020-02-04 05:27:45 +00:00
|
|
|
ubp = STAILQ_NEXT(&ub, ub_link)) {
|
2018-01-12 23:25:05 +00:00
|
|
|
ret = kread(kvm, ubp, &ub,
|
|
|
|
sizeof(ub), 0);
|
2020-01-04 03:30:08 +00:00
|
|
|
if (ret != 0)
|
|
|
|
continue;
|
2018-01-12 23:25:05 +00:00
|
|
|
mtp->mt_zonefree += ub.ub_cnt;
|
|
|
|
}
|
2005-08-01 19:07:39 +00:00
|
|
|
}
|
|
|
|
if (!((kz.uk_flags & UMA_ZONE_SECONDARY) &&
|
|
|
|
LIST_FIRST(&kz.uk_zones) != uzp)) {
|
2020-01-04 03:30:08 +00:00
|
|
|
kegfree = 0;
|
|
|
|
for (i = 0; i < ndomains; i++) {
|
|
|
|
ret = kread(kvm, &kzp->uk_domain[i],
|
|
|
|
&ukd, sizeof(ukd), 0);
|
|
|
|
if (ret == 0)
|
2020-02-11 20:15:49 +00:00
|
|
|
kegfree += ukd.ud_free_items;
|
2020-01-04 03:30:08 +00:00
|
|
|
}
|
|
|
|
mtp->mt_kegfree = kegfree;
|
2005-08-01 19:07:39 +00:00
|
|
|
mtp->mt_free += mtp->mt_kegfree;
|
|
|
|
}
|
|
|
|
mtp->mt_free += mtp->mt_zonefree;
|
|
|
|
}
|
|
|
|
}
|
2006-02-11 19:19:29 +00:00
|
|
|
free(ucp_array);
|
2005-08-01 19:07:39 +00:00
|
|
|
return (0);
|
|
|
|
}
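Both functions above apply the same two accounting rules: the allocation count is raised to the free count when a racy snapshot shows more frees than allocations, and keg free items are attributed only to the primary zone so that secondary zones sharing the keg do not report them again. The following minimal sketch restates those rules on a hypothetical struct demo_zone_stats; the type, its field names and the two helper functions are illustrative assumptions and are not part of libmemstat.

```c
#include <stdint.h>

/* Hypothetical per-zone snapshot; field names are illustrative only. */
struct demo_zone_stats {
	uint64_t dz_allocs;
	uint64_t dz_frees;
	uint64_t dz_cache_free;	/* items held in per-CPU caches */
	uint64_t dz_zone_free;	/* items held in the zone's bucket cache */
	uint64_t dz_keg_free;	/* items free in the shared backing keg */
	int      dz_secondary;	/* nonzero for a secondary zone */
};

/*
 * Live item count.  Counters are sampled without a global lock, so the
 * snapshot may show more frees than allocations; clamp so that the
 * unsigned subtraction cannot wrap around.
 */
uint64_t
demo_live_count(const struct demo_zone_stats *z)
{
	uint64_t allocs = z->dz_allocs;

	if (allocs < z->dz_frees)
		allocs = z->dz_frees;
	return (allocs - z->dz_frees);
}

/*
 * Free items attributed to this zone.  Keg free items are counted for
 * the primary zone only, to avoid reporting them once per zone.
 */
uint64_t
demo_free_count(const struct demo_zone_stats *z)
{
	uint64_t nfree = z->dz_cache_free + z->dz_zone_free;

	if (!z->dz_secondary)
		nfree += z->dz_keg_free;
	return (nfree);
}
```

With a primary zone of 100 allocations, 40 frees, 5 cached, 7 zone-free and 20 keg-free items, the sketch reports 60 live and 32 free items; a secondary zone on the same keg with 1 cached and 2 zone-free items reports 3 free items, leaving the 20 keg items counted once.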
|