supported, use full-width aliases MSRs for writes. This fixes the
"[pmc,X] negative increment" assertion on the context switch when
clipped counter value is sign-extended.
Add definitions for the MSR IA32_PERF_CAPABILITIES needed to detect
the feature.
PR: 207068
Submitted by: joss.upton@yahoo.com
MFC after: 2 weeks
IAP_F_FM's as well as incorrect umask specifications for
some of the new Broadwell/Skylake PMC's. Also silvermont
had a *lot* of missing IAP_F_FM.
Sponsored by: Netflix Inc.
tested on the Broadwell-Xeon with a hacked up version of pmcstudy -T. I still need
to circle back and add in to pmcstudy all the new tests from the Broadwell Vtune
guide (for the hacked up version I just made it so I could run the -T option). The
Skylake CPU is not yet available (even though Intel is advertising it .. imagine that).
The Skylake PMC's will need to be tested once we can get a sample skylake CPU :-)
Sponsored by: Netflix Inc.
In both cases, the the effect of the bug was that a very small positive
number was written to the counter. This means that a large number of
events needed to occur before the next sampling interrupt would trigger.
Even with very frequently occurring events like clock cycles wrapping all
the way around could take a long time. Both bugs occurred when updating
the saved reload count for an outgoing thread on a context switch.
First, the counter-independent code compares the current reload count
against the count set when the thread switched in and generates a delta
to apply to the saved count. If this delta causes the reload counter
to go negative, it would add a full reload interval to wrap it around to
a positive value. The fix is to add the full reload interval if the
resulting counter is zero.
Second, occasionally the raw counter value read during a context switch
has actually wrapped, but an interrupt has not yet triggered. In this
case the existing logic would return a very large reload count (e.g.
2^48 - 2 if the counter had overflowed by a count of 2). This was seen
both for fixed-function and programmable counters on an E5-2643.
Workaround this case by returning a reload count of zero.
PR: 198149
Differential Revision: https://reviews.freebsd.org/D2557
Reviewed by: emaste
MFC after: 1 week
Sponsored by: Norse Corp, Inc.
The MEM_UOPS_RETIRED actually work the same way as the Sandy
Bridge counters, but the counters were documented in a different
way and that seemed to cause the Ivy Bridge counters to be
implemented incorrectly. Use the same counter definitions as
Sandy Bridge. While I'm here, rename the counters to match
what's documented in the datasheet.
Differential Revision: https://reviews.freebsd.org/D1590
MFC after: 1 month
Sponsored by: Sandvine Inc.
On Sandy Bridge and later, to count branch-related events you
have to or together a mask indicating the type of branch
instruction to count (e.g. direct jump, branch, etc) and a bits
indicating whether to count taken and not-taken branches. The
current counter definitions where defining this bits individually,
so the counters never worked and always just counted 0.
Fix the counter definitions to instead contain the proper
combination of masks. Also update the man pages to reflect the
new counters.
Differential Revision: https://reviews.freebsd.org/D1587
MFC after: 1 month
Sponsored by: Sandvine Inc.
A couple of pmc counters did not work because there were being
restricted to the wrong PMC unit. I've verified that these
counters now work and match the documented restrictions.
Differential Revision: https://reviews.freebsd.org/D1586
MFC after: 1 month
Sponsored by: Sandvine Inc
go back through HASWELL, IVY_BRIDGE, IVY_BRIDGE_XEON and SANDY_BRIDGE
to straighten out all the missing PMCs. We also add a new pmc tool
pmcstudy, this allows one to run the various formulas from
the documents "Using Intel Vtune Amplifier XE on XXX Generation platforms" for
IB/SB and Haswell. The tool also allows one to postulate your own
formulas with any of the various PMC's. At some point I will enahance
this to work with Brendan Gregg's flame-graphs so we can flamegraph
various PMC interactions. Note the manual page also needs some
work (lots of work) but gnn has committed to help me with that ;-)
Reviewed by: gnn
MFC after:1 month
Sponsored by: Netflix Inc.
events we have actually counted 'Branch Instruction Retired' when people
asked for 'Unhalted core cycles' using the 'unhalted-core-cycles' event mask
mnemonic.
Reviewed by: jimharris
Discussed with: gnn, rwatson
MFC after: 3 days
Sponsored by: DARPA/AFRL
Core i7 and Westmere processors, the uncore PMC subsystem is
completely different from the uncore PMC on smaller versions of CPUs.
Disable existing uncore hwpmc code for EX, otherwise non-existing MSRs
are accessed.
The cores PMCs seems to be identical for non-EX and EX, according to
the SDM.
Reviewed by: davide, fabient
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
using cpuid can be quirky (this is the case of VMWare without the
vPMC support) but fail to probe hwpmc.
o Apply the fix for XEON family of processors as established by
315338-020 document (bug AJ85).
Sponsored by: EMC / Isilon storage division
Reviewed by: fabient
bridge Xeon.
Summary: These are PEBS events but they're also available as normal
counter/sample events. The source table (Table 19-2) lists the
base versions (LOAD, STLB_MISS, SPLIT, ALL) but it says they must
be qualified with other values. This particular commit fleshes
out those umask values.
Source:
* Linux; SDM June 2013, Volume 3B, Table 19-2 and 18-21.
Tested:
* Sandy Bridge (non-Xeon)
* Add in MEM_LOAD_UOPS_LLC_HIT_RETIRED for both sandy bridge and sandy
bridge Xeon. Right now it only is enabled for Sandy Bridge.
* D2/0F is actually a combination rather than a separate counter, so
just flip that on for the CPU types that support it.
There's an errata for using this on SB Xeon hardware - I've documented
it in kern/181346.
Tested:
* Sandy Bridge
* Sandy Bridge Xeon
Sponsored by: Netflix, Inc.
versions than the one in base (dim@ mentioned he tried on 4.7.3 and 4.8.1)
do not whine about it, so, at some point this workaround will be reverted.
Reported by: ache
Discussed with: dim
those of some non-architectural core events. This is not a problem in the
general case as long as there's an 1:1 mapping between the two, but there
are few exceptions. For example, 3CH_01H on Nehalem/Westmere represents
both unhalted-reference-cycles and CPU_CLK_UNHALTED.REF_P.
CPU_CLK_UNHALTED.REF_P on the aforementioned architectures does not measure
reference (i.e. bus) but TSC, so there's the need to disambiguate.
In order to avoid the namespace collision rename all the architectural
events in a way they cannot be ambigous and refactor the architectural
events handling function to reflect this change.
While here, per Jim Harris request, rename
iap_architectural_event_is_unsupported() to iap_event_is_architectural().
Discussed with: jimharris
Reviewed by: jimharris, gnn
0x3C: /* Per Intel document 325462-045US 01/2013. */
Add manpage to document all the goodness that is available in this
processor model.
Submitted by: hiren panchasara <hiren.panchasara@gmail.com>
Reviewed by: jimharris, sbruno
Obtained from: Yahoo! Inc.
MFC after: 2 weeks
case 0x3E: /* Per Intel document 325462-045US 01/2013. */
Add manpage to document all the goodness that is available in this
processor model.
No support for uncore events at this time.
Submitted by: hiren panchasara <hiren.panchasara@gmail.com>
Reviewed by: davide, jimharris, sbruno
Obtained from: Yahoo! Inc.
MFC after: 2 weeks
(Model 0x2D /* Per Intel document 253669-044US 08/2012. */)
Add manpage to document all the goodness that is available in this
processor model.
No support for uncore events at this time.
Submitted by: hiren panchasara <hiren.panchasara@gmail.com>
Reviewed by: jimharris@ fabient@
Obtained from: Yahoo! Inc.
MFC after: 2 weeks
New kernel events can be added at various location for sampling or counting.
This will for example allow easy system profiling whatever the processor is
with known tools like pmcstat(8).
Simultaneous usage of software PMC and hardware PMC is possible, for example
looking at the lock acquire failure, page fault while sampling on
instructions.
Sponsored by: NETASQ
MFC after: 1 month
- Do not cover error returned by pmc_core_initialize with the
result of pmc_uncore_initialize, fail right away.
- Give a user something to report instead failing silently
Reported by: Alexandr Kovalenko <never@nevermind.kiev.ua>
processors with CPUID signature 06_1AH, 06_1EH, and 06_1FH.
Refuse to allocate them on unsupported model.
Submitted by: Davide Italiano <davide.italiano@gmail.com>
MFC after: 1 month
a PMC. It was possible that we could have turned a bit on but
never cleared it.
Extend the calls to rdmsr() to all necessary functions, not
just those which previously caused a panic.
Pointed out by: jhb@
MFC after: 1 week
All of the necessary wrmsr calls are now preceded by a rdmsr
and we leave the reserved bits alone.
Document the bits in the relevant registers for future reference.
Tested by: mdf
MFC after: 1 week
domain clock, 8 programmable PMC.
- Westmere based CPU (Xeon 5600, Corei7 980X) support.
- New man pages with events list for core and uncore.
- Updated Corei7 events with Intel 253669-033US December 2009 doc.
There is some removed events in the documentation, they have been
kept in the code but documented in the man page as obsolete.
- Offcore response events can be setup with rsp token.
Sponsored by: NETASQ
* Correct a group of typos: for Core2 programmable events, check
user supplied umask values against the correct event descriptor
field.
Submitted by: Ryan Stone <rysto32 at gmail dot com>