go back through HASWELL, IVY_BRIDGE, IVY_BRIDGE_XEON and SANDY_BRIDGE
to straighten out all the missing PMCs. We also add a new pmc tool
pmcstudy, this allows one to run the various formulas from
the documents "Using Intel Vtune Amplifier XE on XXX Generation platforms" for
IB/SB and Haswell. The tool also allows one to postulate your own
formulas with any of the various PMC's. At some point I will enahance
this to work with Brendan Gregg's flame-graphs so we can flamegraph
various PMC interactions. Note the manual page also needs some
work (lots of work) but gnn has committed to help me with that ;-)
Reviewed by: gnn
MFC after:1 month
Sponsored by: Netflix Inc.
New kernel events can be added at various location for sampling or counting.
This will for example allow easy system profiling whatever the processor is
with known tools like pmcstat(8).
Simultaneous usage of software PMC and hardware PMC is possible, for example
looking at the lock acquire failure, page fault while sampling on
instructions.
Sponsored by: NETASQ
MFC after: 1 month
- Provide lapic_disable_pmc(), lapic_enable_pmc(), and lapic_reenable_pmc()
routines in the local APIC code that the hwpmc(4) driver can use to
manage the local APIC PMC interrupt vector.
- Do not enable the local APIC PMC interrupt vector by default when
HWPMC_HOOKS is enabled. Instead, the hwpmc(4) driver explicitly
enables the interrupt when it is succesfully initialized and disables
the interrupt when it is unloaded. This avoids enabling the interrupt
on unsupported CPUs which may result in spurious NMIs.
Reported by: rnoland
Reviewed by: jkoshy
Approved by: re (kib)
MFC after: 2 weeks
dependencies. A 'struct pmc_classdep' structure describes operations
on PMCs; 'struct pmc_mdep' contains one or more 'struct pmc_classdep'
structures depending on the CPU in question.
Inside PMC class dependent code, row indices are relative to the
PMCs supported by the PMC class; MI code in "hwpmc_mod.c" translates
global row indices before invoking class dependent operations.
- Augment the OP_GETCPUINFO request with the number of PMCs present
in a PMC class.
- Move code common to Intel CPUs to file "hwpmc_intel.c".
- Move TSC handling to file "hwpmc_tsc.c".
Group mutexes used in hwpmc(4) into 3 "types" in the sense of
witness(4):
- leaf spin mutexes---only one of these should be held at a time,
so these mutexes are specified as belonging to a single witness
type "pmc-leaf".
- `struct pmc_owner' descriptors are protected by a spin mutex of
witness type "pmc-owner-proc". Since we call wakeup_one() while
holding these mutexes, the witness type of these mutexes needs
to dominate that of "sleepq chain" mutexes.
- logger threads use a sleep mutex, of type "pmc-sleep".
Submitted by: wkoszek (earlier patch)
- Update driver interrupt statistics correctly.
sys/sys/pmc.h, sys/dev/hwpmc/hwpmc_mod.c:
- Fix a bug affecting debug printfs.
- Move the 'stalled' flag from being in a bit in the
'pm_flags' field of a 'struct pmc' to a field of its own in the
same structure. This flag is updated from the NMI handler and
keeping it separate makes it easier to avoid races with other
parts of the code.
sys/dev/hwpmc/hwpmc_logging.c:
- Do arithmetic with 'uintptr_t' types rather that casting
to and from 'char *'.
Approved by: re (scottl)
- Implement sampling modes and logging support in hwpmc(4).
- Separate MI and MD parts of hwpmc(4) and allow sharing of
PMC implementations across different architectures.
Add support for P4 (EMT64) style PMCs to the amd64 code.
- New pmcstat(8) options: -E (exit time counts) -W (counts
every context switch), -R (print log file).
- pmc(3) API changes, improve our ability to keep ABI compatibility
in the future. Add more 'alias' names for commonly used events.
- bug fixes & documentation.
Have pmcstat(8) and pmccontrol(8) use these APIs.
Return PMC class-related constants (PMC widths and capabilities)
with the OP GETCPUINFO call leaving OP PMCINFO to return only the
dynamic information associated with a PMC (i.e., whether enabled,
owner pid, reload count etc.).
Allow pmc_read() (i.e., OPS PMCRW) on active self-attached PMCs to
get upto-date values from hardware since we can guarantee that the
hardware is running the correct PMC at the time of the call.
Bug fixes:
- (x86 class processors) Fix a bug that prevented an RDPMC
instruction from being recognized as permitted till after the
attached process had context switched out and back in again after
a pmc_start() call.
Tighten the rules for using RDPMC class instructions: a GETMSR
OP is now allowed only after an OP ATTACH has been done by the
PMC's owner to itself. OP GETMSR is not allowed for PMCs that
track descendants, for PMCs attached to processes other than
their owner processes.
- (P4/HTT processors only) Fix a bug that caused the MI and MD
layers to get out of sync. Add a new MD operation 'get_config()'
as part of this fix.
- Allow multiple system-mode PMCs at the same row-index but on
different CPUs to be allocated.
- Reject allocation of an administratively disabled PMC.
Misc. code cleanups and refactoring. Improve a few comments.
Only allow a process to use the x86 RDPMC instruction if it has
allocated and attached a PMC to itself.
Inform the MD layer of the "pseudo context switch out" that needs
to be done when the last thread of a process is exiting.
includes the MD header for us. Do not include <machine/specialreg.h>
as it is not a header file that can be included from MI files. It
is included from <machine/pmc_mdep.h> if so needed and possible.
Ok'd: jkoshy@