Commit Graph

206 Commits

Author SHA1 Message Date
gjb
fc21f40567 Revert r267961, r267973:
These changes prevent sysctl(8) from returning proper output,
such as:

 1) no output from sysctl(8)
 2) erroneously returning ENOMEM with tools like truss(1)
    or uname(1)
 truss: can not get etype: Cannot allocate memory
2014-06-27 22:05:21 +00:00
hselasky
bd1ed65f0f Extend the meaning of the CTLFLAG_TUN flag to automatically check if
there is an environment variable which shall initialize the SYSCTL
during early boot. This works for all SYSCTL types both statically and
dynamically created ones, except for the SYSCTL NODE type and SYSCTLs
which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to
be used in the case a tunable sysctl has a custom initialisation
function allowing the sysctl to still be marked as a tunable. The
kernel SYSCTL API is mostly the same, with a few exceptions for some
special operations like iterating childrens of a static/extern SYSCTL
node. This operation should probably be made into a factored out
common macro, hence some device drivers use this. The reason for
changing the SYSCTL API was the need for a SYSCTL parent OID pointer
and not only the SYSCTL parent OID list pointer in order to quickly
generate the sysctl path. The motivation behind this patch is to avoid
parameter loading cludges inside the OFED driver subsystem. Instead of
adding special code to the OFED driver subsystem to post-load tunables
into dynamically created sysctls, we generalize this in the kernel.

Other changes:
- Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask"
to "hw.pcic.intr_mask".
- Removed redundant TUNABLE statements throughout the kernel.
- Some minor code rewrites in connection to removing not needed
TUNABLE statements.
- Added a missing SYSCTL_DECL().
- Wrapped two very long lines.
- Avoid malloc()/free() inside sysctl string handling, in case it is
called to initialize a sysctl from a tunable, hence malloc()/free() is
not ready when sysctls from the sysctl dataset are registered.
- Bumped FreeBSD version to indicate SYSCTL API change.

MFC after:	2 weeks
Sponsored by:	Mellanox Technologies
2014-06-27 16:33:43 +00:00
kib
23d48b19e1 For Xeon 7500 and 48XX (Nehalem EX and Westmere EX) variants of the
Core i7 and Westmere processors, the uncore PMC subsystem is
completely different from the uncore PMC on smaller versions of CPUs.
Disable existing uncore hwpmc code for EX, otherwise non-existing MSRs
are accessed.

The cores PMCs seems to be identical for non-EX and EX, according to
the SDM.

Reviewed by:	davide, fabient
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-06-04 16:06:38 +00:00
gnn
17a2ad14fc Add missing Ivy Bridge and Haswell events.
Submitted by: Anton Rang <rang@mac.com>
MFC: 2 weeks
2014-06-02 20:50:08 +00:00
markj
1ee93b78d4 Remove some prototypes for undefined functions.
MFC after:	3 days
2014-05-15 21:19:13 +00:00
jhibbits
c76bc003ed Enable and disable the PMC unit at load/unload time, respectively.
MFC after:	3 weeks
2014-04-18 06:39:00 +00:00
hiren
2babaa36d7 Update hwpmc to support core events for Atom Silvermont microarchitecture.
(Model 0x4D as per Intel document 330061-001 01/2014)

Tested by:	Olivier Cochard-Labbe <olivier@cochatrd.me>
MFC after:	4 weeks
2014-03-20 20:51:08 +00:00
rwatson
33fdc14c0c Update kernel inclusions of capability.h to use capsicum.h instead; some
further refinement is required as some device drivers intended to be
portable over FreeBSD versions rely on __FreeBSD_version to decide whether
to include capability.h.

MFC after:	3 weeks
2014-03-16 10:55:57 +00:00
eadler
1130a4041b Fix pointer type in call to malloc
Submitted by:	Meyer, Conrad conrad.meyer@isilon.com
2014-03-13 16:51:40 +00:00
eadler
448a71e0c6 Fix pointer type in call to malloc
Submitted by:	Meyer, Conrad conrad.meyer@isilon.com
2014-03-13 16:51:01 +00:00
kib
801489b5e7 Use correct types for sizeof() in the calculations for the malloc(9) sizes [1].
While there, remove unneeded checks for failed allocations with M_WAITOK flag.

Submitted by:	Conrad Meyer <cemeyer@uw.edu> [1]
MFC after:	1 week
2014-03-12 10:25:26 +00:00
jhibbits
5873400b7c Fix callchain capture for hwpmc(4). While here, some style(9) fixes, too.
MFC after:	2 weeks
2014-02-27 04:45:29 +00:00
jhibbits
859fb3bb22 Add hwpmc(4) support for the PowerPC 970 class processors, direct events.
This also fixes asserts on removal of the module for the mpc74xx.

The PowerPC 970 processors have two different types of events: direct events
and indirect events.  Thus far only direct events are supported.  I included
some documentation in the driver on how indirect events work, but support is
for the future.

MFC after:	1 month
2014-02-01 02:03:50 +00:00
jhibbits
c7db1aef3d MPC74xx should not fall through, to the error case.
MFC after:	1 week
2014-01-25 22:50:42 +00:00
jhb
b2533ec507 Move <machine/apicvar.h> to <x86/apicvar.h>. 2014-01-23 20:10:22 +00:00
gnn
39ab4f7cb2 Add another Haswell model (0x45) to the set of supported chips.
Model 0x45 appears, for example, in late 2013 Mac Book Pro models
and is properly emulated by VMware.
2013-12-20 20:22:10 +00:00
attilio
10b4f63c53 o Remove assertions on ipa_version as sometimes the version detection
using cpuid can be quirky (this is the case of VMWare without the
  vPMC support) but fail to probe hwpmc.
o Apply the fix for XEON family of processors as established by
  315338-020 document (bug AJ85).

Sponsored by:	EMC / Isilon storage division
Reviewed by:	fabient
2013-12-20 14:03:56 +00:00
jhibbits
c4ffb19933 Add userland PMC backtracing, and use the PMC trapframe macros for kernel
backtraces.

MFC after:	1 week
2013-12-14 20:12:28 +00:00
eadler
44c01df173 Fix undefined behavior: (1 << 31) is not defined as 1 is an int and this
shifts into the sign bit.  Instead use (1U << 31) which gets the
expected result.

This fix is not ideal as it assumes a 32 bit int, but does fix the issue
for most cases.

A similar change was made in OpenBSD.

Discussed with:	-arch, rdivacky
Reviewed by:	cperciva
2013-11-30 22:17:27 +00:00
davide
023fd8bc67 Remove local change leftover, this should never have been part of
r255745.

Pointy-hat to:	davide
Approved by:	re (implicit)
2013-09-20 23:10:52 +00:00
davide
5273d359fd Fix lc_lock/lc_unlock() support for rmlocks held in shared mode. With
current lock classes KPI it was really difficult because there was no
way to pass an rmtracker object to the lock/unlock routines. In order
to accomplish the task, modify the aforementioned functions so that
they can return (or pass as argument) an uinptr_t, which is in the rm
case used to hold a pointer to struct rm_priotracker for current
thread. As an added bonus, this fixes rm_sleep() in the rm shared
case, which right now can communicate priotracker structure between
lc_unlock()/lc_lock().

Suggested by:	jhb
Reviewed by:	jhb
Approved by:	re (delphij)
2013-09-20 23:06:21 +00:00
jhibbits
2764ddeb0a Fix the build. 2013-09-05 01:13:26 +00:00
pjd
029a6f5d92 Change the cap_rights_t type from uint64_t to a structure that we can extend
in the future in a backward compatible (API and ABI) way.

The cap_rights_t represents capability rights. We used to use one bit to
represent one right, but we are running out of spare bits. Currently the new
structure provides place for 114 rights (so 50 more than the previous
cap_rights_t), but it is possible to grow the structure to hold at least 285
rights, although we can make it even larger if 285 rights won't be enough.

The structure definition looks like this:

	struct cap_rights {
		uint64_t	cr_rights[CAP_RIGHTS_VERSION + 2];
	};

The initial CAP_RIGHTS_VERSION is 0.

The top two bits in the first element of the cr_rights[] array contain total
number of elements in the array - 2. This means if those two bits are equal to
0, we have 2 array elements.

The top two bits in all remaining array elements should be 0.
The next five bits in all array elements contain array index. Only one bit is
used and bit position in this five-bits range defines array index. This means
there can be at most five array elements in the future.

To define new right the CAPRIGHT() macro must be used. The macro takes two
arguments - an array index and a bit to set, eg.

	#define	CAP_PDKILL	CAPRIGHT(1, 0x0000000000000800ULL)

We still support aliases that combine few rights, but the rights have to belong
to the same array element, eg:

	#define	CAP_LOOKUP	CAPRIGHT(0, 0x0000000000000400ULL)
	#define	CAP_FCHMOD	CAPRIGHT(0, 0x0000000000002000ULL)

	#define	CAP_FCHMODAT	(CAP_FCHMOD | CAP_LOOKUP)

There is new API to manage the new cap_rights_t structure:

	cap_rights_t *cap_rights_init(cap_rights_t *rights, ...);
	void cap_rights_set(cap_rights_t *rights, ...);
	void cap_rights_clear(cap_rights_t *rights, ...);
	bool cap_rights_is_set(const cap_rights_t *rights, ...);

	bool cap_rights_is_valid(const cap_rights_t *rights);
	void cap_rights_merge(cap_rights_t *dst, const cap_rights_t *src);
	void cap_rights_remove(cap_rights_t *dst, const cap_rights_t *src);
	bool cap_rights_contains(const cap_rights_t *big, const cap_rights_t *little);

Capability rights to the cap_rights_init(), cap_rights_set(),
cap_rights_clear() and cap_rights_is_set() functions are provided by
separating them with commas, eg:

	cap_rights_t rights;

	cap_rights_init(&rights, CAP_READ, CAP_WRITE, CAP_FSTAT);

There is no need to terminate the list of rights, as those functions are
actually macros that take care of the termination, eg:

	#define	cap_rights_set(rights, ...)				\
		__cap_rights_set((rights), __VA_ARGS__, 0ULL)
	void __cap_rights_set(cap_rights_t *rights, ...);

Thanks to using one bit as an array index we can assert in those functions that
there are no two rights belonging to different array elements provided
together. For example this is illegal and will be detected, because CAP_LOOKUP
belongs to element 0 and CAP_PDKILL to element 1:

	cap_rights_init(&rights, CAP_LOOKUP | CAP_PDKILL);

Providing several rights that belongs to the same array's element this way is
correct, but is not advised. It should only be used for aliases definition.

This commit also breaks compatibility with some existing Capsicum system calls,
but I see no other way to do that. This should be fine as Capsicum is still
experimental and this change is not going to 9.x.

Sponsored by:	The FreeBSD Foundation
2013-09-05 00:09:56 +00:00
jhibbits
6fc9e86bed Fix hwpmc(4) for 32-bit PowerPC. 2013-09-04 04:11:38 +00:00
jhibbits
241d6ad5a0 Refactor PowerPC hwpmc(4) driver into generic and specific. More refactoring
will likely be done as more drivers are added, since AIM-compatible processors
have similar PMC configuration logic.
2013-09-03 00:34:18 +00:00
davide
0920b58072 Complete r250105. Do not zero fields if M_ZERO flag is specified to
malloc(9).

Reported by:	pluknet, glebius
2013-09-01 21:44:43 +00:00
adrian
41d7ccf605 Remove the duplicate LLC_MISS event and put it in the right order. 2013-08-29 13:52:51 +00:00
adrian
1f5f1f4e50 Update the mis-predicted branch PMC names (for sandy bridge) to not clash.
The SDM (June 2013) tables on these are rather confusing.  Yes, they
assign the same name (BR_MISP_RETIRED.ALL_BRANCHES) to two codes
(C5H/00H and C5H/04H.) The latter however is the PEBS version.

So, to make it easier to see the difference - and yes, we can use
both without having to actually enable the PEBS specific bits! -
just rename the PEBS one to _PS so there's no clashing.

Tested:

* Sandy bridge
2013-08-25 12:58:34 +00:00
adrian
99136ad2af Fix a >80 character long line, introduced in my previous commit.
Noticed by: hiren
2013-08-25 12:02:20 +00:00
adrian
055650f8c7 Update the MEM_UOP_RETIRED PMC operation for sandy bridge and sandy
bridge Xeon.

Summary: These are PEBS events but they're also available as normal
counter/sample events.  The source table (Table 19-2) lists the
base versions (LOAD, STLB_MISS, SPLIT, ALL) but it says they must
be qualified with other values.  This particular commit fleshes
out those umask values.

Source:

* Linux; SDM June 2013, Volume 3B, Table 19-2 and 18-21.

Tested:

* Sandy Bridge (non-Xeon)
2013-08-25 02:07:28 +00:00
markj
3541d8b143 Rename the kld_unload event handler to kld_unload_try, and add a new
kld_unload event handler which gets invoked after a linker file has been
successfully unloaded. The kld_unload and kld_load event handlers are now
invoked with the shared linker lock held, while kld_unload_try is invoked
with the lock exclusively held.

Convert hwpmc(4) to use these event handlers instead of having
kern_kldload() and kern_kldunload() invoke hwpmc(4) hooks whenever files are
loaded or unloaded. This has no functional effect, but simplifes the linker
code somewhat.

Reviewed by:	jhb
2013-08-24 21:13:38 +00:00
adrian
6d47a8c959 Change the name of this particular event to reflect the name used in
Linux and Intel examples.

Sourced:

* https://github.com/andikleen/pmu-tools/blob/master/snb-client.csv
* http://software.intel.com/en-us/comment/1747932#comment-1747932

Note:

* It's not currently in the Intel SDM; I need to chase down what's
  going on.

Tested:

* Sandy Bridge
2013-08-21 21:47:56 +00:00
bz
197a6108f9 Correct a typo in the event mask mnemonic.
Reviewed by:	gnn
MFC after:	3 days
2013-08-20 14:59:31 +00:00
adrian
e405686dc7 Add in missing events for Sandy Bridge Xeon.
* Add in MEM_LOAD_UOPS_LLC_HIT_RETIRED for both sandy bridge and sandy
  bridge Xeon.  Right now it only is enabled for Sandy Bridge.
* D2/0F is actually a combination rather than a separate counter, so
  just flip that on for the CPU types that support it.

There's an errata for using this on SB Xeon hardware - I've documented
it in kern/181346.

Tested:

* Sandy Bridge
* Sandy Bridge Xeon

Sponsored by:	Netflix, Inc.
2013-08-18 06:08:52 +00:00
alc
b4fae70474 Relax the vm object locking. Use a read lock.
Sponsored by:	EMC / Isilon Storage Division
2013-06-05 17:00:10 +00:00
davide
0a4bbe752e Suppress a GCC warning. This warning is actually bogus and newer GCC
versions than the one in base (dim@ mentioned he tried on 4.7.3 and 4.8.1)
do not whine about it, so, at some point this workaround will be reverted.

Reported by:	ache
Discussed with:	dim
2013-05-02 14:55:21 +00:00
davide
6fde656c55 malloc(9) cannot return NULL if M_WAITOK flag is specified. 2013-04-30 15:59:22 +00:00
davide
e45f362b6b The Intel PMC architectural events have encodings which are identical to
those of some non-architectural core events. This is not a problem in the
general case as long as there's an 1:1 mapping between the two, but there
are few exceptions. For example, 3CH_01H on Nehalem/Westmere represents
both unhalted-reference-cycles and CPU_CLK_UNHALTED.REF_P.
CPU_CLK_UNHALTED.REF_P on the aforementioned architectures does not measure
reference (i.e. bus) but TSC, so there's the need to disambiguate.
In order to avoid the namespace collision rename all the architectural
events in a way they cannot be ambigous and refactor the architectural
events handling function to reflect this change.
While here, per Jim Harris request, rename
iap_architectural_event_is_unsupported() to iap_event_is_architectural().

Discussed with:	jimharris
Reviewed by:	jimharris, gnn
2013-04-30 15:31:45 +00:00
davide
70ee3bdcde Complete r250097:
Do not change the initialization order in pmc_intel_initialize().
2013-04-30 14:56:41 +00:00
davide
cce56d6bd8 When hwpmc(4) module is unloaded it reports a double leakage. This happens
at least if FreeBSD is ran under VirtualBox. In order to avoid the leakage,
properly deallocate structures in case CPU claims that hw performance
monitoring counters are not supported.

Reported by:	hiren
2013-04-30 08:33:38 +00:00
davide
3569f976bf Fixup Westmere hwpmc(4) support: add missing CPU flag so that
intrucion-retired, llc-misses and llc-reference events can now be
allocated.

Reviewed by:	jimharris, gnn
2013-04-30 08:18:08 +00:00
hiren
67955559b1 Improve/correct a comment. We now support a lot more cpu types.
PR:	kern/177496
Approved by:	sbruno (mentor)
2013-04-14 02:26:12 +00:00
rstone
3343598ba8 Cosmetic change: make a comment reference Sandy Bridge *Xeon*
Reviewed by:	sbruno
MFC after:	1 week
2013-04-12 20:43:14 +00:00
sbruno
20a2c3e096 Trailing whitespace cleanup along with 80 column enforcemnt.
Submitted by:	hiren.panchasara@gmail.com
Reviewed by:	sbruno@freebsd.org
Obtained from:	Yahoo! Inc.
MFC after:	2 weeks
2013-04-03 21:34:35 +00:00
sbruno
18d941830f Update hwpmc to support Haswell class processors.
0x3C:      /* Per Intel document 325462-045US 01/2013. */

Add manpage to document all the goodness that is available in this
processor model.

Submitted by:	hiren panchasara <hiren.panchasara@gmail.com>
Reviewed by:	jimharris, sbruno
Obtained from:	Yahoo! Inc.
MFC after:	2 weeks
2013-03-28 19:15:54 +00:00
attilio
bf1dc90446 MFC 2013-03-08 00:03:07 +00:00
fabient
5e81647255 Add a generic way to call per event allocate / release function.
Reviewed by:	mav
MFC after:	1 month
2013-03-05 10:18:48 +00:00
attilio
820ab571ec MFC 2013-02-26 21:09:35 +00:00
mav
fdde785247 Add support for good old 8192Hz profiling clock to software PMC.
Reviewed by:	fabient
2013-02-26 18:13:42 +00:00
attilio
afe5ce0c13 MFC 2013-02-26 17:33:18 +00:00