Move PMC documentation to separate manual pages, one per PMC class.
parent 9ed4f7cf04
commit 9ec48f3bd4
239
lib/libpmc/pmc.k7.3
Normal file
@@ -0,0 +1,239 @@
.\" Copyright (c) 2003-2008 Joseph Koshy. All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\"
.\" This software is provided by Joseph Koshy ``as is'' and
.\" any express or implied warranties, including, but not limited to, the
.\" implied warranties of merchantability and fitness for a particular purpose
.\" are disclaimed. in no event shall Joseph Koshy be liable
.\" for any direct, indirect, incidental, special, exemplary, or consequential
.\" damages (including, but not limited to, procurement of substitute goods
.\" or services; loss of use, data, or profits; or business interruption)
.\" however caused and on any theory of liability, whether in contract, strict
.\" liability, or tort (including negligence or otherwise) arising in any way
.\" out of the use of this software, even if advised of the possibility of
.\" such damage.
.\"
.\" $FreeBSD$
.\"
.Dd September 16, 2008
.Os
.Dt PMC.K7 3
.Sh NAME
.Nm pmc.k7
.Nd measurement events for
.Tn AMD
.Tn Athlon
(K7 family) CPUs
.Sh LIBRARY
.Lb libpmc
.Sh SYNOPSIS
.In pmc.h
.Sh DESCRIPTION
AMD K7 PMCs are present in the
.Tn "AMD Athlon"
series of CPUs and are documented in:
.Rs
.%B "AMD Athlon Processor x86 Code Optimization Guide"
.%N "Publication No. 22007"
.%D "February 2002"
.%Q "Advanced Micro Devices, Inc."
.Re
.Ss PMC Features
AMD K7 PMCs are 48 bits wide.
Each K7 CPU contains 4 PMCs with the following capabilities:
.Bl -column "PMC_CAP_INTERRUPT" "Support"
.It Em Capability Ta Em Support
.It PMC_CAP_CASCADE Ta \&No
.It PMC_CAP_EDGE Ta Yes
.It PMC_CAP_INTERRUPT Ta Yes
.It PMC_CAP_INVERT Ta Yes
.It PMC_CAP_READ Ta Yes
.It PMC_CAP_PRECISE Ta \&No
.It PMC_CAP_SYSTEM Ta Yes
.It PMC_CAP_TAGGING Ta \&No
.It PMC_CAP_THRESHOLD Ta Yes
.It PMC_CAP_USER Ta Yes
.It PMC_CAP_WRITE Ta Yes
.El
.Ss Event Qualifiers
.Pp
Event specifiers for AMD K7 PMCs can have the following optional
qualifiers:
.Bl -tag -width indent
.It Li count= Ns Ar value
Configure the counter to increment only if the number of configured
events measured in a cycle is greater than or equal to
.Ar value .
.It Li edge
Configure the counter to only count negated-to-asserted transitions
of the conditions expressed by the other qualifiers.
In other words, the counter will increment only once whenever a given
condition becomes true, irrespective of the number of clocks during
which the condition remains true.
.It Li inv
Invert the sense of comparison when the
.Dq Li count
qualifier is present, making the counter increment when the
number of events per cycle is less than the value specified by
the
.Dq Li count
qualifier.
.It Li os
Configure the PMC to count events happening at privilege level 0.
.It Li unitmask= Ns Ar mask
This qualifier is used to further qualify a select few events,
.Dq Li k7-dc-refills-from-l2 ,
.Dq Li k7-dc-refills-from-system
and
.Dq Li k7-dc-writebacks .
Here
.Ar mask
is a string of the following characters optionally separated by
.Ql +
characters:
.Pp
.Bl -tag -width indent -compact
.It Li m
Count operations for lines in the
.Dq Modified
state.
.It Li o
Count operations for lines in the
.Dq Owner
state.
.It Li e
Count operations for lines in the
.Dq Exclusive
state.
.It Li s
Count operations for lines in the
.Dq Shared
state.
.It Li i
Count operations for lines in the
.Dq Invalid
state.
.El
.Pp
If no
.Dq Li unitmask
qualifier is specified, the default is to count events for cache
lines in any of the above states.
.It Li usr
Configure the PMC to count events occurring at privilege levels 1, 2
or 3.
.El
.Pp
If neither of the
.Dq Li os
or
.Dq Li usr
qualifiers is specified, the default is to enable both.
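As an illustration of how the qualifiers above combine into one event specifier string, the following Python sketch joins an event name and its qualifiers with commas and checks a unitmask against the five state characters listed above.
The helper k7_event_spec and its keyword arguments are hypothetical names introduced for this example; they are not part of libpmc, and the sketch only builds the string without talking to hwpmc(4).

```python
# Hypothetical helper (not part of libpmc): compose a K7 event specifier.

# State characters accepted by the unitmask qualifier, per the list above.
K7_UNITMASK_STATES = set("moesi")

# The only events documented as accepting a unitmask qualifier.
K7_UNITMASK_EVENTS = {
    "k7-dc-refills-from-l2",
    "k7-dc-refills-from-system",
    "k7-dc-writebacks",
}


def k7_event_spec(event, *, count=None, edge=False, inv=False,
                  os=False, usr=False, unitmask=None):
    """Return "event,qualifier,..." for the given K7 event and qualifiers."""
    parts = [event]
    if count is not None:
        parts.append(f"count={count}")
    if edge:
        parts.append("edge")
    if inv:
        parts.append("inv")
    if os:
        parts.append("os")
    if unitmask is not None:
        if event not in K7_UNITMASK_EVENTS:
            raise ValueError(f"{event} does not take a unitmask")
        # The '+' separators are optional, so drop them before checking.
        states = unitmask.replace("+", "")
        if not states or not set(states) <= K7_UNITMASK_STATES:
            raise ValueError(f"bad unitmask: {unitmask!r}")
        parts.append(f"unitmask={unitmask}")
    if usr:
        parts.append("usr")
    return ",".join(parts)


# Count user-mode refills from L2 for Modified and Owner lines only.
spec = k7_event_spec("k7-dc-refills-from-l2", unitmask="m+o", usr=True)
print(spec)  # k7-dc-refills-from-l2,unitmask=m+o,usr
```

Leaving out both os and usr produces a specifier with neither qualifier, which the text above describes as counting at all privilege levels.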
.Ss AMD K7 Event Specifiers
The event specifiers supported on AMD K7 PMCs are:
.Bl -tag -width indent
.It Li k7-dc-accesses
Count data cache accesses.
.It Li k7-dc-misses
Count data cache misses.
.It Li k7-dc-refills-from-l2 Op Li ,unitmask= Ns Ar mask
Count data cache refills from L2 cache.
This event may be further qualified using the
.Dq Li unitmask
qualifier.
.It Li k7-dc-refills-from-system Op Li ,unitmask= Ns Ar mask
Count data cache refills from system memory.
This event may be further qualified using the
.Dq Li unitmask
qualifier.
.It Li k7-dc-writebacks Op Li ,unitmask= Ns Ar mask
Count data cache writebacks.
This event may be further qualified using the
.Dq Li unitmask
qualifier.
.It Li k7-l1-dtlb-miss-and-l2-dtlb-hits
Count L1 DTLB misses that are L2 DTLB hits.
.It Li k7-l1-and-l2-dtlb-misses
Count L1 and L2 DTLB misses.
.It Li k7-misaligned-references
Count misaligned data references.
.It Li k7-ic-fetches
Count instruction cache fetches.
.It Li k7-ic-misses
Count instruction cache misses.
.It Li k7-l1-itlb-misses
Count L1 ITLB misses that are L2 ITLB hits.
.It Li k7-l1-l2-itlb-misses
Count L1 (and L2) ITLB misses.
.It Li k7-retired-instructions
Count all retired instructions.
.It Li k7-retired-ops
Count retired ops.
.It Li k7-retired-branches
Count all retired branches (conditional, unconditional, exceptions
and interrupts).
.It Li k7-retired-branches-mispredicted
Count all mispredicted retired branches.
.It Li k7-retired-taken-branches
Count retired taken branches.
.It Li k7-retired-taken-branches-mispredicted
Count mispredicted taken branches that were retired.
.It Li k7-retired-far-control-transfers
Count retired far control transfers.
.It Li k7-retired-resync-branches
Count retired resync branches (non control transfer branches).
.It Li k7-interrupts-masked-cycles
Count the number of cycles when the processor's
.Va IF
flag was zero.
.It Li k7-interrupts-masked-while-pending-cycles
Count the number of cycles interrupts were masked while pending due
to the processor's
.Va IF
flag being zero.
.It Li k7-hardware-interrupts
Count the number of taken hardware interrupts.
.El
.Ss Event Name Aliases
The following table shows the mapping between the PMC-independent
aliases supported by
.Lb libpmc
and the underlying hardware events used.
.Bl -column "branch-mispredicts" "Description"
.It Em Alias Ta Em Event
.It Li branches Ta Li k7-retired-branches
.It Li branch-mispredicts Ta Li k7-retired-branches-mispredicted
.It Li dc-misses Ta Li k7-dc-misses
.It Li ic-misses Ta Li k7-ic-misses
.It Li instructions Ta Li k7-retired-instructions
.It Li interrupts Ta Li k7-hardware-interrupts
.It Li unhalted-cycles Ta (unsupported)
.El
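The alias table above can be read as a simple lookup.
The Python sketch below mirrors it as a dictionary; the names K7_ALIASES and resolve_k7_alias are hypothetical and only illustrate the mapping, they do not reflect how libpmc stores or resolves aliases internally.

```python
# Illustrative copy of the alias table above; not libpmc's own data structure.
K7_ALIASES = {
    "branches": "k7-retired-branches",
    "branch-mispredicts": "k7-retired-branches-mispredicted",
    "dc-misses": "k7-dc-misses",
    "ic-misses": "k7-ic-misses",
    "instructions": "k7-retired-instructions",
    "interrupts": "k7-hardware-interrupts",
    "unhalted-cycles": None,  # listed as unsupported on K7
}


def resolve_k7_alias(name):
    """Map a PMC-independent alias to its K7 event name.

    Names that are not aliases are returned unchanged; an alias with no
    K7 equivalent raises ValueError.
    """
    if name not in K7_ALIASES:
        return name
    event = K7_ALIASES[name]
    if event is None:
        raise ValueError(f"alias {name!r} is unsupported on K7")
    return event


print(resolve_k7_alias("instructions"))
```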
.Sh SEE ALSO
.Xr pmc 3 ,
.Xr pmc.k8 3 ,
.Xr pmc.p4 3 ,
.Xr pmc.p5 3 ,
.Xr pmc.p6 3 ,
.Xr pmc.tsc 3 ,
.Xr pmclog 3 ,
.Xr hwpmc 4
.Sh HISTORY
The
.Nm pmc
library first appeared in
.Fx 6.0 .
.Sh AUTHORS
The
.Lb libpmc
library was written by
.An "Joseph Koshy"
.Aq jkoshy@FreeBSD.org .
703
lib/libpmc/pmc.k8.3
Normal file
@@ -0,0 +1,703 @@
.\" Copyright (c) 2003-2008 Joseph Koshy. All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\"
.\" This software is provided by Joseph Koshy ``as is'' and
.\" any express or implied warranties, including, but not limited to, the
.\" implied warranties of merchantability and fitness for a particular purpose
.\" are disclaimed. in no event shall Joseph Koshy be liable
.\" for any direct, indirect, incidental, special, exemplary, or consequential
.\" damages (including, but not limited to, procurement of substitute goods
.\" or services; loss of use, data, or profits; or business interruption)
.\" however caused and on any theory of liability, whether in contract, strict
.\" liability, or tort (including negligence or otherwise) arising in any way
.\" out of the use of this software, even if advised of the possibility of
.\" such damage.
.\"
.\" $FreeBSD$
.\"
.Dd September 17, 2008
.Os
.Dt PMC.K8 3
.Sh NAME
.Nm pmc.k8
.Nd measurement events for
.Tn AMD
.Tn Athlon 64
(K8 family) CPUs
.Sh LIBRARY
.Lb libpmc
.Sh SYNOPSIS
.In pmc.h
.Sh DESCRIPTION
AMD K8 PMCs are present in the
.Tn "AMD Athlon64"
and
.Tn "AMD Opteron"
series of CPUs.
They are documented in:
.Rs
.%B "BIOS and Kernel Developer's Guide for the AMD Athlon(tm) 64 and AMD Opteron Processors"
.%N "Publication No. 26094"
.%D "April 2004"
.%Q "Advanced Micro Devices, Inc."
.Re
.Ss PMC Features
AMD K8 PMCs are 48 bits wide.
Each CPU contains 4 PMCs with the following capabilities:
.Bl -column "PMC_CAP_INTERRUPT" "Support"
.It Em Capability Ta Em Support
.It PMC_CAP_CASCADE Ta \&No
.It PMC_CAP_EDGE Ta Yes
.It PMC_CAP_INTERRUPT Ta Yes
.It PMC_CAP_INVERT Ta Yes
.It PMC_CAP_READ Ta Yes
.It PMC_CAP_PRECISE Ta \&No
.It PMC_CAP_SYSTEM Ta Yes
.It PMC_CAP_TAGGING Ta \&No
.It PMC_CAP_THRESHOLD Ta Yes
.It PMC_CAP_USER Ta Yes
.It PMC_CAP_WRITE Ta Yes
.El
.Ss Event Qualifiers
.Pp
Event specifiers for AMD K8 PMCs can have the following optional
qualifiers:
.Bl -tag -width indent
.It Li count= Ns Ar value
Configure the counter to increment only if the number of configured
events measured in a cycle is greater than or equal to
.Ar value .
.It Li edge
Configure the counter to only count negated-to-asserted transitions
of the conditions expressed by the other fields.
In other words, the counter will increment only once whenever a given
condition becomes true, irrespective of the number of clocks during
which the condition remains true.
.It Li inv
Invert the sense of comparison when the
.Dq Li count
qualifier is present, making the counter increment when the
number of events per cycle is less than the value specified by
the
.Dq Li count
qualifier.
.It Li mask= Ns Ar qualifier
Many event specifiers for AMD K8 PMCs need to be additionally
qualified using a mask qualifier.
These additional qualifiers are event-specific and are documented
along with their associated event specifiers below.
.It Li os
Configure the PMC to count events happening at privilege level 0.
.It Li usr
Configure the PMC to count events occurring at privilege levels 1, 2
or 3.
.El
.Pp
If neither of the
.Dq Li os
or
.Dq Li usr
qualifiers is specified, the default is to enable both.
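The interaction of the count, inv and edge qualifiers described above can be modelled on a made-up trace of events per cycle.
The Python sketch below is a simplified software model, not a description of the hardware: the function name thresholded_count is hypothetical, and the model assumes one increment per qualifying cycle and that the condition was false before the first cycle.

```python
def thresholded_count(events_per_cycle, count, inv=False, edge=False):
    """Model a counter configured with count=<count> and optional inv/edge.

    events_per_cycle: iterable with the number of events seen in each cycle.
    """
    total = 0
    previous = False  # assumed state of the condition before the trace starts
    for n in events_per_cycle:
        # count: events >= threshold; inv flips this to events < threshold.
        condition = (n < count) if inv else (n >= count)
        # edge: only a false-to-true transition of the condition counts.
        if condition and not (edge and previous):
            total += 1
        previous = condition
    return total


trace = [0, 2, 3, 1, 0, 2, 2]  # invented per-cycle event counts
print(thresholded_count(trace, 2))             # cycles with >= 2 events
print(thresholded_count(trace, 2, inv=True))   # cycles with < 2 events
print(thresholded_count(trace, 2, edge=True))  # times the condition became true
```

With edge set, the two runs of consecutive qualifying cycles in the trace each contribute a single increment, matching the "only once whenever a given condition becomes true" wording above.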
.Ss AMD K8 Event Specifiers
The event specifiers supported on AMD K8 PMCs are:
.Bl -tag -width indent
.It Li k8-bu-cpu-clk-unhalted
Count the number of clock cycles when the CPU is not in the HLT or
STPCLK states.
.It Li k8-bu-fill-request-l2-miss Op Li ,mask= Ns Ar qualifier
Count fill requests that missed in the L2 cache.
This event may be further qualified using
.Ar qualifier ,
which is a
.Ql +
separated set of the following keywords:
.Pp
.Bl -tag -width indent -compact
.It Li dc-fill
Count data cache fill requests.
.It Li ic-fill
Count instruction cache fill requests.
.It Li tlb-reload
Count TLB reloads.
.El
.Pp
The default is to count all types of requests.
.It Li k8-bu-internal-l2-request Op Li ,mask= Ns Ar qualifier
Count internally generated requests to the L2 cache.
This event may be further qualified using
.Ar qualifier ,
which is a
.Ql +
separated set of the following keywords:
.Pp
.Bl -tag -width indent -compact
.It Li cancelled
Count cancelled requests.
.It Li dc-fill
Count data cache fill requests.
.It Li ic-fill
Count instruction cache fill requests.
.It Li tag-snoop
Count tag snoop requests.
.It Li tlb-reload
Count TLB reloads.
.El
.Pp
The default is to count all types of requests.
.It Li k8-dc-access
Count data cache accesses including microcode scratchpad accesses.
.It Li k8-dc-copyback Op Li ,mask= Ns Ar qualifier
Count data cache copyback operations.
This event may be further qualified using
.Ar qualifier ,
which is a
.Ql +
separated set of the following keywords:
.Pp
.Bl -tag -width indent -compact
.It Li exclusive
Count operations for lines in the
.Dq exclusive
state.
.It Li invalid
Count operations for lines in the
.Dq invalid
state.
.It Li modified
Count operations for lines in the
.Dq modified
state.
.It Li owner
Count operations for lines in the
.Dq owner
state.
.It Li shared
Count operations for lines in the
.Dq shared
state.
.El
.Pp
The default is to count operations for lines in all the
above states.
.It Li k8-dc-dcache-accesses-by-locks Op Li ,mask= Ns Ar qualifier
Count data cache accesses by lock instructions.
This event is supported in revision C and later CPUs.
This event may be further qualified using
.Ar qualifier ,
which is a
.Ql +
separated set of the following keywords:
.Pp
.Bl -tag -width indent -compact
.It Li accesses
Count data cache accesses by lock instructions.
.It Li misses
Count data cache misses by lock instructions.
.El
.Pp
The default is to count all accesses.
.It Li k8-dc-dispatched-prefetch-instructions Op Li ,mask= Ns Ar qualifier
Count the number of dispatched prefetch instructions.
This event may be further qualified using
.Ar qualifier ,
which is a
.Ql +
separated set of the following keywords:
.Pp
.Bl -tag -width indent -compact
.It Li load
Count load operations.
.It Li nta
Count non-temporal operations.
.It Li store
Count store operations.
.El
.Pp
The default is to count all operations.
.It Li k8-dc-l1-dtlb-miss-and-l2-dtlb-hit
Count L1 DTLB misses that are L2 DTLB hits.
.It Li k8-dc-l1-dtlb-miss-and-l2-dtlb-miss
Count L1 DTLB misses that are also misses in the L2 DTLB.
.It Li k8-dc-microarchitectural-early-cancel-of-an-access
Count microarchitectural early cancels of data cache accesses.
.It Li k8-dc-microarchitectural-late-cancel-of-an-access
Count microarchitectural late cancels of data cache accesses.
.It Li k8-dc-misaligned-data-reference
Count misaligned data references.
.It Li k8-dc-miss
Count data cache misses.
.It Li k8-dc-one-bit-ecc-error Op Li ,mask= Ns Ar qualifier
Count one bit ECC errors found by the scrubber.
This event may be further qualified using
.Ar qualifier ,
which is a
.Ql +
separated set of the following keywords:
.Pp
.Bl -tag -width indent -compact
.It Li scrubber
Count scrubber detected errors.
.It Li piggyback
Count piggyback scrubber errors.
.El
.Pp
The default is to count both kinds of errors.
.It Li k8-dc-refill-from-l2 Op Li ,mask= Ns Ar qualifier
Count data cache refills from L2 cache.
This event may be further qualified using
.Ar qualifier ,
which is a
.Ql +
separated set of the following keywords:
.Pp
.Bl -tag -width indent -compact
.It Li exclusive
Count operations for lines in the
.Dq exclusive
state.
.It Li invalid
Count operations for lines in the
.Dq invalid
state.
.It Li modified
Count operations for lines in the
.Dq modified
state.
.It Li owner
Count operations for lines in the
.Dq owner
state.
.It Li shared
Count operations for lines in the
.Dq shared
state.
.El
.Pp
The default is to count operations for lines in all the
above states.
.It Li k8-dc-refill-from-system Op Li ,mask= Ns Ar qualifier
Count data cache refills from system memory.
This event may be further qualified using
.Ar qualifier ,
which is a
.Ql +
separated set of the following keywords:
.Pp
.Bl -tag -width indent -compact
.It Li exclusive
Count operations for lines in the
.Dq exclusive
state.
.It Li invalid
Count operations for lines in the
.Dq invalid
state.
.It Li modified
Count operations for lines in the
.Dq modified
state.
.It Li owner
Count operations for lines in the
.Dq owner
state.
.It Li shared
Count operations for lines in the
.Dq shared
state.
.El
.Pp
The default is to count operations for lines in all the
above states.
.It Li k8-fp-dispatched-fpu-ops Op Li ,mask= Ns Ar qualifier
Count the number of dispatched FPU ops.
This event is supported in revision B and later CPUs.
This event may be further qualified using
.Ar qualifier ,
which is a
.Ql +
separated set of the following keywords:
.Pp
.Bl -tag -width indent -compact
.It Li add-pipe-excluding-junk-ops
Count add pipe ops excluding junk ops.
.It Li add-pipe-junk-ops
Count junk ops in the add pipe.
.It Li multiply-pipe-excluding-junk-ops
Count multiply pipe ops excluding junk ops.
.It Li multiply-pipe-junk-ops
Count junk ops in the multiply pipe.
.It Li store-pipe-excluding-junk-ops
Count store pipe ops excluding junk ops.
.It Li store-pipe-junk-ops
Count junk ops in the store pipe.
.El
.Pp
The default is to count all types of ops.
.It Li k8-fp-cycles-with-no-fpu-ops-retired
Count cycles when no FPU ops were retired.
This event is supported in revision B and later CPUs.
.It Li k8-fp-dispatched-fpu-fast-flag-ops
Count dispatched FPU ops that use the fast flag interface.
This event is supported in revision B and later CPUs.
.It Li k8-fr-decoder-empty
Count cycles when there was nothing to dispatch (i.e., the decoder
was empty).
.It Li k8-fr-dispatch-stalls
Count all dispatch stalls.
.It Li k8-fr-dispatch-stall-for-segment-load
Count dispatch stalls for segment loads.
.It Li k8-fr-dispatch-stall-for-serialization
Count dispatch stalls for serialization.
.It Li k8-fr-dispatch-stall-from-branch-abort-to-retire
Count dispatch stalls from branch abort to retirement.
.It Li k8-fr-dispatch-stall-when-fpu-is-full
Count dispatch stalls when the FPU is full.
.It Li k8-fr-dispatch-stall-when-ls-is-full
Count dispatch stalls when the load/store unit is full.
.It Li k8-fr-dispatch-stall-when-reorder-buffer-is-full
Count dispatch stalls when the reorder buffer is full.
.It Li k8-fr-dispatch-stall-when-reservation-stations-are-full
Count dispatch stalls when reservation stations are full.
.It Li k8-fr-dispatch-stall-when-waiting-for-all-to-be-quiet
Count dispatch stalls when waiting for all to be quiet.
.\" XXX What does "waiting for all to be quiet" mean?
.It Li k8-fr-dispatch-stall-when-waiting-far-xfer-or-resync-branch-pending
Count dispatch stalls when a far control transfer or a resync branch
is pending.
.It Li k8-fr-fpu-exceptions Op Li ,mask= Ns Ar qualifier
Count FPU exceptions.
This event is supported in revision B and later CPUs.
This event may be further qualified using
.Ar qualifier ,
which is a
.Ql +
separated set of the following keywords:
.Pp
.Bl -tag -width indent -compact
.It Li sse-and-x87-microtraps
Count SSE and x87 microtraps.
.It Li sse-reclass-microfaults
Count SSE reclass microfaults.
.It Li sse-retype-microfaults
Count SSE retype microfaults.
.It Li x87-reclass-microfaults
Count x87 reclass microfaults.
.El
.Pp
The default is to count all types of exceptions.
.It Li k8-fr-interrupts-masked-cycles
Count cycles when interrupts were masked (i.e., when CPU RFLAGS field IF was zero).
.It Li k8-fr-interrupts-masked-while-pending-cycles
Count cycles when interrupts were masked while pending (i.e., cycles
when INTR was asserted while CPU RFLAGS field IF was zero).
.It Li k8-fr-number-of-breakpoints-for-dr0
Count the number of breakpoints for DR0.
.It Li k8-fr-number-of-breakpoints-for-dr1
Count the number of breakpoints for DR1.
.It Li k8-fr-number-of-breakpoints-for-dr2
Count the number of breakpoints for DR2.
.It Li k8-fr-number-of-breakpoints-for-dr3
Count the number of breakpoints for DR3.
.It Li k8-fr-retired-branches
Count retired branches including exceptions and interrupts.
.It Li k8-fr-retired-branches-mispredicted
Count mispredicted retired branches.
.It Li k8-fr-retired-far-control-transfers
Count retired far control transfers (which are always mispredicted).
.It Li k8-fr-retired-fastpath-double-op-instructions Op Li ,mask= Ns Ar qualifier
Count retired fastpath double op instructions.
This event is supported in revision B and later CPUs.
This event may be further qualified using
.Ar qualifier ,
which is a
.Ql +
separated set of the following keywords:
.Pp
.Bl -tag -width indent -compact
.It Li low-op-pos-0
Count instructions with the low op in position 0.
.It Li low-op-pos-1
Count instructions with the low op in position 1.
.It Li low-op-pos-2
Count instructions with the low op in position 2.
.El
.Pp
The default is to count all types of instructions.
.It Li k8-fr-retired-fpu-instructions Op Li ,mask= Ns Ar qualifier
Count retired FPU instructions.
This event is supported in revision B and later CPUs.
This event may be further qualified using
.Ar qualifier ,
which is a
.Ql +
separated set of the following keywords:
.Pp
.Bl -tag -width indent -compact
.It Li mmx-3dnow
Count MMX and 3DNow!\& instructions.
.It Li packed-sse-sse2
Count packed SSE and SSE2 instructions.
.It Li scalar-sse-sse2
Count scalar SSE and SSE2 instructions.
.It Li x87
Count x87 instructions.
.El
.Pp
The default is to count all types of instructions.
.It Li k8-fr-retired-near-returns
Count retired near returns.
.It Li k8-fr-retired-near-returns-mispredicted
Count mispredicted near returns.
.It Li k8-fr-retired-resyncs
Count retired resyncs (non-control transfer branches).
.It Li k8-fr-retired-taken-hardware-interrupts
Count retired taken hardware interrupts.
.It Li k8-fr-retired-taken-branches
Count retired taken branches.
.It Li k8-fr-retired-taken-branches-mispredicted
Count retired taken branches that were mispredicted.
.It Li k8-fr-retired-taken-branches-mispredicted-by-addr-miscompare
Count retired taken branches that were mispredicted only due to an
address miscompare.
.It Li k8-fr-retired-uops
Count retired uops.
.It Li k8-fr-retired-x86-instructions
Count retired x86 instructions including exceptions and interrupts.
.It Li k8-ic-fetch
Count instruction cache fetches.
.It Li k8-ic-instruction-fetch-stall
Count cycles in stalls due to instruction fetch.
.It Li k8-ic-l1-itlb-miss-and-l2-itlb-hit
Count L1 ITLB misses that are L2 ITLB hits.
.It Li k8-ic-l1-itlb-miss-and-l2-itlb-miss
Count ITLB misses that miss in both L1 and L2 ITLBs.
.It Li k8-ic-microarchitectural-resync-by-snoop
Count microarchitectural resyncs caused by snoops.
.It Li k8-ic-miss
Count instruction cache misses.
.It Li k8-ic-refill-from-l2
Count instruction cache refills from L2 cache.
.It Li k8-ic-refill-from-system
Count instruction cache refills from system memory.
.It Li k8-ic-return-stack-hits
Count hits to the return stack.
.It Li k8-ic-return-stack-overflow
Count overflows of the return stack.
.It Li k8-ls-buffer2-full
Count load/store buffer2 full events.
|
||||||
|
.It Li k8-ls-locked-operation Op Li ,mask= Ns Ar qualifier
|
||||||
|
Count locked operations.
|
||||||
|
For revision C and later CPUs, the following qualifiers are supported:
|
||||||
|
.Pp
|
||||||
|
.Bl -tag -width indent -compact
|
||||||
|
.It Li cycles-in-request
|
||||||
|
Count the number of cycles in the lock request/grant stage.
|
||||||
|
.It Li cycles-to-complete
|
||||||
|
Count the number of cycles a lock takes to complete once it is
|
||||||
|
non-speculative and is the older load/store operation.
|
||||||
|
.It Li locked-instructions
|
||||||
|
Count the number of lock instructions executed.
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
The default is to count the number of lock instructions executed.
|
||||||
|
.It Li k8-ls-microarchitectural-late-cancel
|
||||||
|
Count microarchitectural late cancels of operations in the load/store
|
||||||
|
unit.
|
||||||
|
.It Li k8-ls-microarchitectural-resync-by-self-modifying-code
|
||||||
|
Count microarchitectural resyncs caused by self-modifying code.
|
||||||
|
.It Li k8-ls-microarchitectural-resync-by-snoop
|
||||||
|
Count microarchitectural resyncs caused by snoops.
|
||||||
|
.It Li k8-ls-retired-cflush-instructions
|
||||||
|
Count retired CLFLUSH instructions.
|
||||||
|
.It Li k8-ls-retired-cpuid-instructions
|
||||||
|
Count retired CPUID instructions.
|
||||||
|
.It Li k8-ls-segment-register-load Op Li ,mask= Ns Ar qualifier
|
||||||
|
Count segment register loads.
|
||||||
|
This event may be further qualified using
|
||||||
|
.Ar qualifier ,
|
||||||
|
which is a
|
||||||
|
.Ql +
|
||||||
|
separated set of the following keywords:
|
||||||
|
.Bl -tag -width indent -compact
|
||||||
|
.It Li cs
|
||||||
|
Count CS register loads.
|
||||||
|
.It Li ds
|
||||||
|
Count DS register loads.
|
||||||
|
.It Li es
|
||||||
|
Count ES register loads.
|
||||||
|
.It Li fs
|
||||||
|
Count FS register loads.
|
||||||
|
.It Li gs
|
||||||
|
Count GS register loads.
|
||||||
|
.\" .It Li hs
|
||||||
|
.\" Count HS register loads.
|
||||||
|
.\" XXX "HS" register?
|
||||||
|
.It Li ss
|
||||||
|
Count SS register loads.
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
The default is to count all types of loads.
|
||||||
|
.It Li k8-nb-memory-controller-bypass-saturation Op Li ,mask= Ns Ar qualifier
|
||||||
|
Count memory controller bypass counter saturation events.
|
||||||
|
This event may be further qualified using
|
||||||
|
.Ar qualifier ,
|
||||||
|
which is a
|
||||||
|
.Ql +
|
||||||
|
separated set of the following keywords:
|
||||||
|
.Pp
|
||||||
|
.Bl -tag -width indent -compact
|
||||||
|
.It Li dram-controller-interface-bypass
|
||||||
|
Count DRAM controller interface bypass.
|
||||||
|
.It Li dram-controller-queue-bypass
|
||||||
|
Count DRAM controller queue bypass.
|
||||||
|
.It Li memory-controller-hi-pri-bypass
|
||||||
|
Count memory controller high priority bypasses.
|
||||||
|
.It Li memory-controller-lo-pri-bypass
|
||||||
|
Count memory controller low priority bypasses.
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
.It Li k8-nb-memory-controller-dram-slots-missed
|
||||||
|
Count memory controller DRAM command slots missed (in MemClks).
|
||||||
|
.It Li k8-nb-memory-controller-page-access-event Op Li ,mask= Ns Ar qualifier
|
||||||
|
Count memory controller page access events.
|
||||||
|
This event may be further qualified using
|
||||||
|
.Ar qualifier ,
|
||||||
|
which is a
|
||||||
|
.Ql +
|
||||||
|
separated set of the following keywords:
|
||||||
|
.Pp
|
||||||
|
.Bl -tag -width indent -compact
|
||||||
|
.It Li page-conflict
|
||||||
|
Count page conflicts.
|
||||||
|
.It Li page-hit
|
||||||
|
Count page hits.
|
||||||
|
.It Li page-miss
|
||||||
|
Count page misses.
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
The default is to count all types of events.
|
||||||
|
.It Li k8-nb-memory-controller-page-table-overflow
|
||||||
|
Count memory controller page table overflow events.
|
||||||
|
.It Li k8-nb-probe-result Op Li ,mask= Ns Ar qualifier
|
||||||
|
Count probe events.
|
||||||
|
This event may be further qualified using
|
||||||
|
.Ar qualifier ,
|
||||||
|
which is a
|
||||||
|
.Ql +
|
||||||
|
separated set of the following keywords:
|
||||||
|
.Pp
|
||||||
|
.Bl -tag -width indent -compact
|
||||||
|
.It Li probe-hit
|
||||||
|
Count all probe hits.
|
||||||
|
.It Li probe-hit-dirty-no-memory-cancel
|
||||||
|
Count probe hits to dirty lines without memory cancels.
|
||||||
|
.It Li probe-hit-dirty-with-memory-cancel
|
||||||
|
Count probe hits to dirty lines with memory cancels.
|
||||||
|
.It Li probe-miss
|
||||||
|
Count probe misses.
|
||||||
|
.El
|
||||||
|
.It Li k8-nb-sized-commands Op Li ,mask= Ns Ar qualifier
|
||||||
|
Count sized commands issued.
|
||||||
|
This event may be further qualified using
|
||||||
|
.Ar qualifier ,
|
||||||
|
which is a
|
||||||
|
.Ql +
|
||||||
|
separated set of the following keywords:
|
||||||
|
.Pp
|
||||||
|
.Bl -tag -width indent -compact
|
||||||
|
.It Li nonpostwrszbyte
|
||||||
|
.It Li nonpostwrszdword
|
||||||
|
.It Li postwrszbyte
|
||||||
|
.It Li postwrszdword
|
||||||
|
.It Li rdszbyte
|
||||||
|
.It Li rdszdword
|
||||||
|
.It Li rdmodwr
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
The default is to count all types of commands.
|
||||||
|
.It Li k8-nb-memory-controller-turnaround Op Li ,mask= Ns Ar qualifier
|
||||||
|
Count memory controller turnaround events.
|
||||||
|
This event may be further qualified using
|
||||||
|
.Ar qualifier ,
|
||||||
|
which is a
|
||||||
|
.Ql +
|
||||||
|
separated set of the following keywords:
|
||||||
|
.Pp
|
||||||
|
.Bl -tag -width indent -compact
|
||||||
|
.\" XXX doc is unclear whether these are cycle counts or event counts
|
||||||
|
.It Li dimm-turnaround
|
||||||
|
Count DIMM turnarounds.
|
||||||
|
.It Li read-to-write-turnaround
|
||||||
|
Count read to write turnarounds.
|
||||||
|
.It Li write-to-read-turnaround
|
||||||
|
Count write to read turnarounds.
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
The default is to count all types of events.
|
||||||
|
.It Li k8-nb-ht-bus0-bandwidth Op Li ,mask= Ns Ar qualifier
|
||||||
|
.It Li k8-nb-ht-bus1-bandwidth Op Li ,mask= Ns Ar qualifier
|
||||||
|
.It Li k8-nb-ht-bus2-bandwidth Op Li ,mask= Ns Ar qualifier
|
||||||
|
Count events on the HyperTransport(tm) buses.
|
||||||
|
These events may be further qualified using
|
||||||
|
.Ar qualifier ,
|
||||||
|
which is a
|
||||||
|
.Ql +
|
||||||
|
separated set of the following keywords:
|
||||||
|
.Pp
|
||||||
|
.Bl -tag -width indent -compact
|
||||||
|
.It Li buffer-release
|
||||||
|
Count buffer release messages sent.
|
||||||
|
.It Li command
|
||||||
|
Count command messages sent.
|
||||||
|
.It Li data
|
||||||
|
Count data messages sent.
|
||||||
|
.It Li nop
|
||||||
|
Count nop messages sent.
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
The default is to count all types of messages.
|
||||||
|
.El
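The qualifier syntax used by the events above (an event name, optionally followed by
.Ql ,mask=
and a
.Ql +
separated keyword list) can be illustrated with a small string-building sketch.
The helper name build_event_spec is hypothetical and is not part of libpmc; it only assembles the specifier text and does not validate keywords against any event.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not a libpmc function): write
 * "event,mask=kw1+kw2+..." into buf.  With no keywords the
 * result is just the event name, so the event's default applies.
 * Returns 0 on success and -1 if the text does not fit in buf.
 */
static int
build_event_spec(char *buf, size_t buflen, const char *event,
    const char *const *keywords, size_t nkeywords)
{
	size_t i, used;
	int n;

	assert(buf != NULL && event != NULL);

	n = snprintf(buf, buflen, "%s", event);
	if (n < 0 || (size_t)n >= buflen)
		return (-1);

	for (i = 0; i < nkeywords; i++) {
		used = strlen(buf);
		/* The first keyword is introduced by ",mask=", the rest by "+". */
		n = snprintf(buf + used, buflen - used, "%s%s",
		    i == 0 ? ",mask=" : "+", keywords[i]);
		if (n < 0 || (size_t)n >= buflen - used)
			return (-1);
	}
	return (0);
}
```

Called with the event k8-ls-segment-register-load and the keywords cs and ds, this sketch yields the text k8-ls-segment-register-load,mask=cs+ds.
How such a specifier is handed to the library is described in
.Xr pmc 3 .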
|
||||||
|
.Ss Event Name Aliases
|
||||||
|
The following table shows the mapping between the PMC-independent
|
||||||
|
aliases supported by
|
||||||
|
.Lb libpmc
|
||||||
|
and the underlying hardware events used.
|
||||||
|
.Bl -column "branch-mispredicts" "Description"
|
||||||
|
.It Em Alias Ta Em Event
|
||||||
|
.It Li branches Ta Li k8-fr-retired-taken-branches
|
||||||
|
.It Li branch-mispredicts Ta Li k8-fr-retired-taken-branches-mispredicted
|
||||||
|
.It Li dc-misses Ta Li k8-dc-miss
|
||||||
|
.It Li ic-misses Ta Li k8-ic-miss
|
||||||
|
.It Li instructions Ta Li k8-fr-retired-x86-instructions
|
||||||
|
.It Li interrupts Ta Li k8-fr-taken-hardware-interrupts
|
||||||
|
.It Li unhalted-cycles Ta Li k8-by-cpu-clk-unhalted
|
||||||
|
.El
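The alias mapping in the table above can be pictured as a simple lookup.
The following is a minimal sketch only: the structure and function names are hypothetical, the table contents are copied from this page, and nothing here is claimed to reflect how libpmc stores or resolves aliases internally.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record pairing a PMC-independent alias with a K8 event. */
struct k8_alias {
	const char *alias;
	const char *event;
};

/* Entries transcribed from the alias table in this manual page. */
static const struct k8_alias k8_aliases[] = {
	{ "branches",           "k8-fr-retired-taken-branches" },
	{ "branch-mispredicts", "k8-fr-retired-taken-branches-mispredicted" },
	{ "dc-misses",          "k8-dc-miss" },
	{ "ic-misses",          "k8-ic-miss" },
	{ "instructions",       "k8-fr-retired-x86-instructions" },
	{ "interrupts",         "k8-fr-taken-hardware-interrupts" },
	{ "unhalted-cycles",    "k8-by-cpu-clk-unhalted" },
};

/*
 * Hypothetical helper: return the K8 event name for an alias,
 * or NULL if the name is not one of the aliases listed above.
 */
static const char *
k8_resolve_alias(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof(k8_aliases) / sizeof(k8_aliases[0]); i++) {
		if (strcmp(k8_aliases[i].alias, name) == 0)
			return (k8_aliases[i].event);
	}
	return (NULL);
}
```

A linear scan is used because the table has only seven entries.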
|
||||||
|
.Sh SEE ALSO
|
||||||
|
.Xr pmc 3 ,
|
||||||
|
.Xr pmc.k7 3 ,
|
||||||
|
.Xr pmc.p4 3 ,
|
||||||
|
.Xr pmc.p5 3 ,
|
||||||
|
.Xr pmc.p6 3 ,
|
||||||
|
.Xr pmc.tsc 3 ,
|
||||||
|
.Xr pmclog 3 ,
|
||||||
|
.Xr hwpmc 4
|
||||||
|
.Sh HISTORY
|
||||||
|
The
|
||||||
|
.Nm pmc
|
||||||
|
library first appeared in
|
||||||
|
.Fx 6.0 .
|
||||||
|
.Sh AUTHORS
|
||||||
|
The
|
||||||
|
.Lb libpmc
|
||||||
|
library was written by
|
||||||
|
.An "Joseph Koshy"
|
||||||
|
.Aq jkoshy@FreeBSD.org .
|
1222
lib/libpmc/pmc.p4.3
Normal file
File diff suppressed because it is too large
402
lib/libpmc/pmc.p5.3
Normal file
@ -0,0 +1,402 @@
|
|||||||
|
.\" Copyright (c) 2003-2008 Joseph Koshy. All rights reserved.
|
||||||
|
.\"
|
||||||
|
.\" Redistribution and use in source and binary forms, with or without
|
||||||
|
.\" modification, are permitted provided that the following conditions
|
||||||
|
.\" are met:
|
||||||
|
.\" 1. Redistributions of source code must retain the above copyright
|
||||||
|
.\" notice, this list of conditions and the following disclaimer.
|
||||||
|
.\" 2. Redistributions in binary form must reproduce the above copyright
|
||||||
|
.\" notice, this list of conditions and the following disclaimer in the
|
||||||
|
.\" documentation and/or other materials provided with the distribution.
|
||||||
|
.\"
|
||||||
|
.\" This software is provided by Joseph Koshy ``as is'' and
|
||||||
|
.\" any express or implied warranties, including, but not limited to, the
|
||||||
|
.\" implied warranties of merchantability and fitness for a particular purpose
|
||||||
|
.\" are disclaimed. in no event shall Joseph Koshy be liable
|
||||||
|
.\" for any direct, indirect, incidental, special, exemplary, or consequential
|
||||||
|
.\" damages (including, but not limited to, procurement of substitute goods
|
||||||
|
.\" or services; loss of use, data, or profits; or business interruption)
|
||||||
|
.\" however caused and on any theory of liability, whether in contract, strict
|
||||||
|
.\" liability, or tort (including negligence or otherwise) arising in any way
|
||||||
|
.\" out of the use of this software, even if advised of the possibility of
|
||||||
|
.\" such damage.
|
||||||
|
.\"
|
||||||
|
.\" $FreeBSD$
|
||||||
|
.\"
|
||||||
|
.Dd September 16, 2008
|
||||||
|
.Os
|
||||||
|
.Dt PMC.P5 3
|
||||||
|
.Sh NAME
|
||||||
|
.Nm pmc.p5
|
||||||
|
.Nd measurement events for Intel Pentium and Pentium MMX CPUs
|
||||||
|
.Sh LIBRARY
|
||||||
|
.Lb libpmc
|
||||||
|
.Sh SYNOPSIS
|
||||||
|
.In pmc.h
|
||||||
|
.Sh DESCRIPTION
|
||||||
|
Intel Pentium PMCs are present in Intel
|
||||||
|
.Tn Pentium
|
||||||
|
and
|
||||||
|
.Tn "Pentium MMX"
|
||||||
|
processors.
|
||||||
|
These PMCs are documented in
|
||||||
|
.Rs
|
||||||
|
.%B "Intel(R) 64 and IA-32 Architectures Software Developer's Manual"
|
||||||
|
.%T "Volume 3B: System Programming Guide, Part 2"
|
||||||
|
.%N "Order Number 253669-024US"
|
||||||
|
.%D "August 2007"
|
||||||
|
.%Q "Intel Corporation"
|
||||||
|
.Re
|
||||||
|
.Ss PMC Features
|
||||||
|
These CPUs contain two PMCs, each 40 bits wide.
|
||||||
|
These PMCs support the following capabilities:
|
||||||
|
.Bl -column "PMC_CAP_INTERRUPT" "Support"
|
||||||
|
.It Em Capability Ta Em Support
|
||||||
|
.It PMC_CAP_CASCADE Ta \&No
|
||||||
|
.It PMC_CAP_EDGE Ta \&No
|
||||||
|
.It PMC_CAP_INTERRUPT Ta \&No
|
||||||
|
.It PMC_CAP_INVERT Ta \&No
|
||||||
|
.It PMC_CAP_READ Ta Yes
|
||||||
|
.It PMC_CAP_PRECISE Ta \&No
|
||||||
|
.It PMC_CAP_SYSTEM Ta Yes
|
||||||
|
.It PMC_CAP_TAGGING Ta \&No
|
||||||
|
.It PMC_CAP_THRESHOLD Ta \&No
|
||||||
|
.It PMC_CAP_USER Ta Yes
|
||||||
|
.It PMC_CAP_WRITE Ta Yes
|
||||||
|
.El
|
||||||
|
.Ss Event Qualifiers
|
||||||
|
Event specifiers for Intel Pentium PMCs can have the following common
|
||||||
|
qualifiers:
|
||||||
|
.Bl -tag -width indent
|
||||||
|
.It Li duration
|
||||||
|
Count duration (in clocks) of events.
|
||||||
|
The default is to count events.
|
||||||
|
.It Li os
|
||||||
|
Measure events at privilege levels 0, 1 and 2.
|
||||||
|
.It Li overflow
|
||||||
|
Assert the external processor pin associated with a counter on counter
|
||||||
|
overflow.
|
||||||
|
.It Li usr
|
||||||
|
Measure events at privilege level 3.
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
If neither of the
|
||||||
|
.Dq Li os
|
||||||
|
or
|
||||||
|
.Dq Li usr
|
||||||
|
qualifiers are specified, the default is to enable both.
|
||||||
|
.Pp
|
||||||
|
Some events may only be used on specific counters and some events
|
||||||
|
are defined only on processors supporting the MMX instruction set.
|
||||||
|
Note that these PMCs do not have the ability to interrupt the CPU.
|
||||||
|
.Ss Intel Pentium Event Specifiers
|
||||||
|
The event specifiers supported by Intel Pentium PMCs are:
|
||||||
|
.Bl -tag -width indent
|
||||||
|
.It Li p5-any-segment-register-loaded
|
||||||
|
The number of writes to any segment register, including the LDTR,
|
||||||
|
GDTR, TR and IDTR.
|
||||||
|
Far control transfers and task switches that involve privilege
|
||||||
|
level changes will count this event twice.
|
||||||
|
.It Li p5-bank-conflicts
|
||||||
|
The number of actual bank conflicts.
|
||||||
|
.It Li p5-branches
|
||||||
|
The number of taken and not taken branches including branches, jumps, calls,
|
||||||
|
software interrupts and interrupt returns.
|
||||||
|
.It Li p5-breakpoint-match-on-dr0-register
|
||||||
|
The number of matches on the DR0 breakpoint register.
|
||||||
|
.It Li p5-breakpoint-match-on-dr1-register
|
||||||
|
The number of matches on the DR1 breakpoint register.
|
||||||
|
.It Li p5-breakpoint-match-on-dr2-register
|
||||||
|
The number of matches on the DR2 breakpoint register.
|
||||||
|
.It Li p5-breakpoint-match-on-dr3-register
|
||||||
|
The number of matches on the DR3 breakpoint register.
|
||||||
|
.It Li p5-btb-false-entries
|
||||||
|
.Pq Tn Pentium MMX
|
||||||
|
The number of false entries in the BTB.
|
||||||
|
This event is only allocated on counter 0.
|
||||||
|
.It Li p5-btb-hits
|
||||||
|
The number of branches executed that hit in the branch target buffer.
|
||||||
|
.It Li p5-btb-miss-prediction-on-not-taken-branch
|
||||||
|
.Pq Tn Pentium MMX
|
||||||
|
The number of times the BTB predicted a not-taken branch as taken.
|
||||||
|
This event is only allocated on counter 1.
|
||||||
|
.It Li p5-bus-cycle-duration
|
||||||
|
The number of cycles while a bus cycle was in progress.
|
||||||
|
.It Li p5-bus-ownership-latency
|
||||||
|
.Pq Tn Pentium MMX
|
||||||
|
The time from bus ownership being requested to ownership being granted.
|
||||||
|
This event is only allocated on counter 0.
|
||||||
|
.It Li p5-bus-ownership-transfers
|
||||||
|
.Pq Tn Pentium MMX
|
||||||
|
The number of bus ownership transfers.
|
||||||
|
This event is only allocated on counter 1.
|
||||||
|
.It Li p5-bus-utilization-due-to-processor-activity
|
||||||
|
.Pq Tn Pentium MMX
|
||||||
|
The number of clocks the bus is busy due to the processor's own
|
||||||
|
activity.
|
||||||
|
This event is only allocated on counter 0.
|
||||||
|
.It Li p5-cache-line-sharing
|
||||||
|
.Pq Tn Pentium MMX
|
||||||
|
The number of shared data lines in L1 cache.
|
||||||
|
This event is only allocated on counter 1.
|
||||||
|
.It Li p5-cache-m-state-line-sharing
|
||||||
|
.Pq Tn Pentium MMX
|
||||||
|
The number of hits to an M-state line due to a memory access by
|
||||||
|
another processor.
|
||||||
|
This event is only allocated on counter 0.
|
||||||
|
.It Li p5-code-cache-miss
|
||||||
|
The number of instruction reads that miss the internal code cache.
|
||||||
|
Both cacheable and uncacheable misses are counted.
|
||||||
|
.It Li p5-code-read
|
||||||
|
The number of instruction reads to both cacheable and uncacheable regions.
|
||||||
|
.It Li p5-code-tlb-miss
|
||||||
|
The number of instruction reads that miss the instruction TLB.
|
||||||
|
Both cacheable and uncacheable reads are counted.
|
||||||
|
.It Li p5-d1-starvation-and-fifo-is-empty
|
||||||
|
.Pq Tn Pentium MMX
|
||||||
|
The number of times the D1 stage cannot issue any instructions because
|
||||||
|
the FIFO was empty.
|
||||||
|
This event is only allocated on counter 0.
|
||||||
|
.It Li p5-d1-starvation-and-only-one-instruction-in-fifo
|
||||||
|
.Pq Tn Pentium MMX
|
||||||
|
The number of times the D1 stage could issue only one instruction
|
||||||
|
because the FIFO had one instruction ready.
|
||||||
|
This event is only allocated on counter 1.
|
||||||
|
.It Li p5-data-cache-lines-written-back
|
||||||
|
The number of data cache lines that are written back, including
|
||||||
|
those caused by internal and external snoops.
|
||||||
|
.It Li p5-data-cache-tlb-miss-stall-duration
|
||||||
|
.Pq Tn Pentium MMX
|
||||||
|
The number of clocks the pipeline is stalled due to a data cache
|
||||||
|
TLB miss.
|
||||||
|
This event is only allocated on counter 1.
|
||||||
|
.It Li p5-data-read
|
||||||
|
The number of memory data reads, counting internal data cache hits and
|
||||||
|
misses.
|
||||||
|
I/O and data memory accesses due to TLB miss processing are
|
||||||
|
not included.
|
||||||
|
Split cycle reads are counted individually.
|
||||||
|
.It Li p5-data-read-miss
|
||||||
|
The number of memory read accesses that miss the data cache, counting
|
||||||
|
both cacheable and uncacheable accesses.
|
||||||
|
Data accesses that are part of TLB miss processing are not included.
|
||||||
|
I/O accesses are not included.
|
||||||
|
.It Li p5-data-read-miss-or-write-miss
|
||||||
|
The number of data reads and writes that miss the internal data cache,
|
||||||
|
counting uncacheable accesses.
|
||||||
|
Data accesses due to TLB miss processing are not counted.
|
||||||
|
.It Li p5-data-read-or-write
|
||||||
|
The number of data reads and writes including internal data cache hits
|
||||||
|
and misses.
|
||||||
|
Data reads due to TLB miss processing are not counted.
|
||||||
|
.It Li p5-data-tlb-miss
|
||||||
|
The number of misses to the data cache translation lookaside buffer.
|
||||||
|
.It Li p5-data-write
|
||||||
|
The number of memory data writes, counting internal data cache hits
|
||||||
|
and misses.
|
||||||
|
I/O is not included and split cycle writes are counted individually.
|
||||||
|
.It Li p5-data-write-miss
|
||||||
|
The number of memory write accesses that miss the data cache, counting
|
||||||
|
both cacheable and uncacheable accesses.
|
||||||
|
I/O accesses are not counted.
|
||||||
|
.It Li p5-emms-instructions-executed
|
||||||
|
.Pq Tn Pentium MMX
|
||||||
|
The number of EMMS instructions executed.
|
||||||
|
This event is only allocated on counter 0.
|
||||||
|
.It Li p5-external-data-cache-snoop-hits
|
||||||
|
The number of external snoops to the data cache that hit a valid line,
|
||||||
|
or the data line fill buffer, or one of the write back buffers.
|
||||||
|
.It Li p5-external-snoops
|
||||||
|
The number of external snoop requests accepted, including snoops that
|
||||||
|
hit in the code cache, the data cache and that hit in neither.
|
||||||
|
.It Li p5-floating-point-stalls-duration
|
||||||
|
.Pq Tn Pentium MMX
|
||||||
|
The number of cycles the pipeline is stalled due to a floating point
|
||||||
|
freeze.
|
||||||
|
This event is only allocated on counter 0.
|
||||||
|
.It Li p5-flops
|
||||||
|
The number of floating point adds, subtracts, multiplies, divides and
|
||||||
|
square roots.
|
||||||
|
Transcendental instructions trigger this event multiple times.
|
||||||
|
Instructions generating divide-by-zero, negative square root, special
|
||||||
|
operand and stack exceptions are not counted.
|
||||||
|
Integer multiply instructions that use the x87 FPU are counted.
|
||||||
|
.It Li p5-full-write-buffer-stall-duration-while-executing-mmx-instructions
|
||||||
|
.Pq Tn Pentium MMX
|
||||||
|
The number of clocks the pipeline has stalled due to full write
|
||||||
|
buffers when executing MMX instructions.
|
||||||
|
This event is only allocated on counter 0.
|
||||||
|
.It Li p5-hardware-interrupts
|
||||||
|
The number of taken INTR and NMI interrupts.
|
||||||
|
.It Li p5-instructions-executed
|
||||||
|
The number of instructions executed.
|
||||||
|
Repeat prefixed instructions are counted only once.
|
||||||
|
The HLT instruction is counted only once, irrespective of the number
|
||||||
|
of cycles spent in the halted state.
|
||||||
|
All hardware and software exceptions are counted as instructions, and
|
||||||
|
fault handler invocations are also counted as instructions.
|
||||||
|
.It Li p5-instructions-executed-v-pipe
|
||||||
|
The number of instructions that executed in the V pipe.
|
||||||
|
.It Li p5-io-read-or-write-cycle
|
||||||
|
The number of bus cycles directed to I/O space.
|
||||||
|
.It Li p5-locked-bus-cycle
|
||||||
|
The number of locked bus cycles that occur on account of the lock
|
||||||
|
prefixes, LOCK instructions, page table updates and descriptor table
|
||||||
|
updates.
|
||||||
|
.It Li p5-memory-accesses-in-both-pipes
|
||||||
|
The number of data memory reads or writes that are paired in both pipes.
|
||||||
|
.It Li p5-misaligned-data-memory-on-mmx-instructions
|
||||||
|
.Pq Tn Pentium MMX
|
||||||
|
The number of misaligned data memory references when executing MMX
|
||||||
|
instructions.
|
||||||
|
This event is only allocated on counter 0.
|
||||||
|
.It Li p5-misaligned-data-memory-or-io-references
|
||||||
|
The number of memory or I/O reads or writes that are not aligned on
|
||||||
|
natural boundaries.
|
||||||
|
2- and 4-byte accesses are counted as misaligned if they cross a 4
|
||||||
|
byte boundary.
|
||||||
|
.It Li p5-mispredicted-or-unpredicted-returns
|
||||||
|
.Pq Tn Pentium MMX
|
||||||
|
The number of returns predicted incorrectly or not at all, only
|
||||||
|
counting RET instructions.
|
||||||
|
This event is only allocated on counter 0.
|
||||||
|
.It Li p5-mmx-instruction-data-read-misses
|
||||||
|
.Pq Tn Pentium MMX
|
||||||
|
The number of MMX instruction data read misses.
|
||||||
|
This event is only allocated on counter 1.
|
||||||
|
.It Li p5-mmx-instruction-data-reads
|
||||||
|
.Pq Tn Pentium MMX
|
||||||
|
The number of MMX instruction data reads.
|
||||||
|
This event is only allocated on counter 0.
|
||||||
|
.It Li p5-mmx-instruction-data-write-misses
|
||||||
|
.Pq Tn Pentium MMX
|
||||||
|
The number of data write misses caused by MMX instructions.
|
||||||
|
This event is only allocated on counter 1.
|
||||||
|
.It Li p5-mmx-instruction-data-writes
|
||||||
|
.Pq Tn Pentium MMX
|
||||||
|
The number of data writes caused by MMX instructions.
|
||||||
|
This event is only allocated on counter 0.
|
||||||
|
.It Li p5-mmx-instructions-executed-u-pipe
|
||||||
|
.Pq Tn Pentium MMX
|
||||||
|
The number of MMX instructions executed in the U pipe.
|
||||||
|
This event is only allocated on counter 0.
|
||||||
|
.It Li p5-mmx-instructions-executed-v-pipe
|
||||||
|
The number of MMX instructions executed in the V pipe.
|
||||||
|
This event is only allocated on counter 1.
|
||||||
|
.It Li p5-mmx-multiply-unit-interlock
|
||||||
|
.Pq Tn Pentium MMX
|
||||||
|
The number of clocks the pipeline is stalled because the destination
|
||||||
|
of a prior MMX multiply is not ready.
|
||||||
|
This event is only allocated on counter 0.
|
||||||
|
.It Li p5-movd-movq-store-stall-due-to-previous-mmx-operation
|
||||||
|
.Pq Tn Pentium MMX
|
||||||
|
The number of clocks a MOVD/MOVQ instruction stalled in the D2 stage
|
||||||
|
of the pipeline due to a previous MMX instruction.
|
||||||
|
This event is only allocated on counter 1.
|
||||||
|
.It Li p5-noncacheable-memory-reads
|
||||||
|
The number of bus cycles for non-cacheable instruction or data reads,
|
||||||
|
including cycles caused by TLB misses.
|
||||||
|
.It Li p5-number-of-cycles-not-in-halt-state
|
||||||
|
.Pq Tn Pentium MMX
|
||||||
|
The number of cycles the processor is not idle due to the HLT
|
||||||
|
instruction.
|
||||||
|
This event is only allocated on counter 0.
|
||||||
|
.It Li p5-pipeline-agi-stalls
|
||||||
|
The number of address generation interlock stalls.
|
||||||
|
An AGI that occurs in both the U and V pipelines in the same clock
|
||||||
|
signals the event twice.
|
||||||
|
.It Li p5-pipeline-flushes
|
||||||
|
The number of pipeline flushes that occur.
|
||||||
|
Pipeline flushes are caused by branch mispredicts, exceptions,
|
||||||
|
interrupts, some segment register loads, and BTB misses.
|
||||||
|
Prefetch queue flushes due to serializing instructions are not
|
||||||
|
counted.
|
||||||
|
.It Li p5-pipeline-flushes-due-to-wrong-branch-predictions
|
||||||
|
.Pq Tn Pentium MMX
|
||||||
|
The number of pipeline flushes due to wrong branch predictions
|
||||||
|
resolved in either the E- or WB-stage of the pipeline.
|
||||||
|
This event is only allocated on counter 0.
|
||||||
|
.It Li p5-pipeline-flushes-due-to-wrong-branch-predictions-resolved-in-wb-stage
|
||||||
|
.Pq Tn Pentium MMX
|
||||||
|
The number of pipeline flushes due to wrong branch predictions
|
||||||
|
resolved in the WB-stage of the pipeline.
|
||||||
|
This event is only allocated on counter 1.
|
||||||
|
.It Li p5-pipeline-stall-for-mmx-instruction-data-memory-reads
|
||||||
|
.Pq Tn Pentium MMX
|
||||||
|
The number of clocks the pipeline is stalled waiting for MMX data
|
||||||
|
memory reads.
|
||||||
|
This event is only allocated on counter 0.
|
||||||
|
.It Li p5-predicted-returns
|
||||||
|
.Pq Tn Pentium MMX
|
||||||
|
The number of predicted returns, whether correct or incorrect.
|
||||||
|
This counter only counts RET instructions.
|
||||||
|
This event is only allocated on counter 1.
|
||||||
|
.It Li p5-returns
|
||||||
|
.Pq Tn Pentium MMX
|
||||||
|
The number of RET instructions executed.
|
||||||
|
This event is only allocated on counter 0.
|
||||||
|
.It Li p5-saturating-mmx-instructions-executed
|
||||||
|
.Pq Tn Pentium MMX
|
||||||
|
The number of saturating MMX instructions executed.
|
||||||
|
This event is only allocated on counter 0.
|
||||||
|
.It Li p5-saturations-performed
|
||||||
|
.Pq Tn Pentium MMX
|
||||||
|
The number of saturating MMX instructions executed when at least one
|
||||||
|
of their results was actually saturated.
|
||||||
|
This event is only allocated on counter 1.
|
||||||
|
.It Li p5-stall-on-mmx-instruction-write-to-e-o-m-state-line
|
||||||
|
.Pq Tn Pentium MMX
|
||||||
|
The number of clocks during stalls on MMX instructions writing to
|
||||||
|
E- or M-state cache lines.
|
||||||
|
This event is only allocated on counter 1.
|
||||||
|
.It Li p5-stall-on-write-to-an-e-or-m-state-line
|
||||||
|
The number of stalls on a write to an exclusive or modified data cache
|
||||||
|
line.
|
||||||
|
.It Li p5-taken-branch-or-btb-hit
|
||||||
|
The number of events that may cause a hit in the BTB, namely either
|
||||||
|
taken branches or BTB hits.
|
||||||
|
.It Li p5-taken-branches
|
||||||
|
.Pq Tn Pentium MMX
|
||||||
|
The number of taken branches.
|
||||||
|
This event is only allocated on counter 1.
|
||||||
|
.It Li p5-transitions-between-mmx-and-fp-instructions
|
||||||
|
.Pq Tn Pentium MMX
|
||||||
|
The number of transitions between MMX and floating-point instructions
|
||||||
|
and vice-versa.
|
||||||
|
This event is only allocated on counter 1.
|
||||||
|
.It Li p5-waiting-for-data-memory-read-stall-duration
|
||||||
|
The number of clocks the pipeline was stalled waiting for data
|
||||||
|
memory reads.
|
||||||
|
Data TLB miss processing is included in this count.
|
||||||
|
.It Li p5-write-buffer-full-stall-duration
|
||||||
|
The number of clocks while the pipeline was stalled due to write
|
||||||
|
buffers being full.
|
||||||
|
.It Li p5-write-hit-to-m-or-e-state-lines
|
||||||
|
The number of writes that hit exclusive or modified lines in the data
|
||||||
|
cache.
|
||||||
|
.It Li p5-writes-to-noncacheable-memory
|
||||||
|
.Pq Tn Pentium MMX
|
||||||
|
The number of writes to non-cacheable memory, including write cycles
|
||||||
|
caused by TLB misses and I/O writes.
|
||||||
|
This event is only allocated on counter 1.
|
||||||
|
.El
|
||||||
|
.Sh SEE ALSO
|
||||||
|
.Xr pmc 3 ,
|
||||||
|
.Xr pmc.k7 3 ,
|
||||||
|
.Xr pmc.k8 3 ,
|
||||||
|
.Xr pmc.p4 3 ,
|
||||||
|
.Xr pmc.p6 3 ,
|
||||||
|
.Xr pmc.tsc 3 ,
|
||||||
|
.Xr pmclog 3 ,
|
||||||
|
.Xr hwpmc 4
|
||||||
|
.Sh HISTORY
|
||||||
|
The
|
||||||
|
.Nm pmc
|
||||||
|
library first appeared in
|
||||||
|
.Fx 6.0 .
|
||||||
|
.Sh AUTHORS
|
||||||
|
The
|
||||||
|
.Lb libpmc
|
||||||
|
library was written by
|
||||||
|
.An "Joseph Koshy"
|
||||||
|
.Aq jkoshy@FreeBSD.org .
|
954
lib/libpmc/pmc.p6.3
Normal file
@ -0,0 +1,954 @@
|
|||||||
|
.\" Copyright (c) 2003-2008 Joseph Koshy. All rights reserved.
|
||||||
|
.\"
|
||||||
|
.\" Redistribution and use in source and binary forms, with or without
|
||||||
|
.\" modification, are permitted provided that the following conditions
|
||||||
|
.\" are met:
|
||||||
|
.\" 1. Redistributions of source code must retain the above copyright
|
||||||
|
.\" notice, this list of conditions and the following disclaimer.
|
||||||
|
.\" 2. Redistributions in binary form must reproduce the above copyright
|
||||||
|
.\" notice, this list of conditions and the following disclaimer in the
|
||||||
|
.\" documentation and/or other materials provided with the distribution.
|
||||||
|
.\"
|
||||||
|
.\" This software is provided by Joseph Koshy ``as is'' and
|
||||||
|
.\" any express or implied warranties, including, but not limited to, the
|
||||||
|
.\" implied warranties of merchantability and fitness for a particular purpose
|
||||||
|
.\" are disclaimed. in no event shall Joseph Koshy be liable
|
||||||
|
.\" for any direct, indirect, incidental, special, exemplary, or consequential
|
||||||
|
.\" damages (including, but not limited to, procurement of substitute goods
|
||||||
|
.\" or services; loss of use, data, or profits; or business interruption)
|
||||||
|
.\" however caused and on any theory of liability, whether in contract, strict
|
||||||
|
.\" liability, or tort (including negligence or otherwise) arising in any way
|
||||||
|
.\" out of the use of this software, even if advised of the possibility of
|
||||||
|
.\" such damage.
|
||||||
|
.\"
|
||||||
|
.\" $FreeBSD$
|
||||||
|
.\"
|
||||||
|
.Dd September 16, 2008
|
||||||
|
.Os
|
||||||
|
.Dt PMC.P6 3
|
||||||
|
.Sh NAME
|
||||||
|
.Nm pmc.p6
|
||||||
|
.Nd measurement events for
|
||||||
|
.Tn Intel
|
||||||
|
Pentium Pro, P-II, P-III family CPUs
|
||||||
|
.Sh LIBRARY
|
||||||
|
.Lb libpmc
|
||||||
|
.Sh SYNOPSIS
|
||||||
|
.In pmc.h
|
||||||
|
.Sh DESCRIPTION
|
||||||
|
Intel P6 PMCs are present in Intel
|
||||||
|
.Tn "Pentium Pro" ,
|
||||||
|
.Tn "Pentium II" ,
|
||||||
|
.Tn Celeron ,
|
||||||
|
.Tn "Pentium III"
|
||||||
|
and
|
||||||
|
.Tn "Pentium M"
|
||||||
|
processors.
|
||||||
|
.Pp
|
||||||
|
They are documented in
|
||||||
|
.Rs
|
||||||
|
.%B "IA-32 Intel(R) Architecture Software Developer's Manual"
|
||||||
|
.%T "Volume 3: System Programming Guide"
|
||||||
|
.%N "Order Number 245472-012"
|
||||||
|
.%D 2003
|
||||||
|
.%Q "Intel Corporation"
|
||||||
|
.Re
|
||||||
|
.Pp
|
||||||
|
Some of these events are affected by processor errata described in
|
||||||
|
.Rs
|
||||||
|
.%B "Intel(R) Pentium(R) III Processor Specification Update"
|
||||||
|
.%N "Document Number: 244453-054"
|
||||||
|
.%D "April 2005"
|
||||||
|
.%Q "Intel Corporation"
|
||||||
|
.Re
|
||||||
|
.Ss PMC Features
|
||||||
|
These CPUs have two counters, each 40 bits wide.
|
||||||
|
Some events may only be used on specific counters and some events are
|
||||||
|
defined only on specific processor models.
|
||||||
|
These PMCs support the following capabilities:
|
||||||
|
.Bl -column "PMC_CAP_INTERRUPT" "Support"
|
||||||
|
.It Em Capability Ta Em Support
|
||||||
|
.It PMC_CAP_CASCADE Ta \&No
|
||||||
|
.It PMC_CAP_EDGE Ta Yes
|
||||||
|
.It PMC_CAP_INTERRUPT Ta Yes
|
||||||
|
.It PMC_CAP_INVERT Ta Yes
|
||||||
|
.It PMC_CAP_READ Ta Yes
|
||||||
|
.It PMC_CAP_PRECISE Ta \&No
|
||||||
|
.It PMC_CAP_SYSTEM Ta Yes
|
||||||
|
.It PMC_CAP_TAGGING Ta \&No
|
||||||
|
.It PMC_CAP_THRESHOLD Ta Yes
|
||||||
|
.It PMC_CAP_USER Ta Yes
|
||||||
|
.It PMC_CAP_WRITE Ta Yes
|
||||||
|
.El
|
||||||
|
.Ss Event Qualifiers
|
||||||
|
Event specifiers for Intel P6 PMCs can have the following common
|
||||||
|
qualifiers:
|
||||||
|
.Bl -tag -width indent
|
||||||
|
.It Li cmask= Ns Ar value
|
||||||
|
Configure the PMC to increment only if the number of configured
|
||||||
|
events measured in a cycle is greater than or equal to
|
||||||
|
.Ar value .
|
||||||
|
.It Li edge
|
||||||
|
Configure the PMC to count the number of deasserted to asserted
|
||||||
|
transitions of the conditions expressed by the other qualifiers.
|
||||||
|
If specified, the counter will increment only once whenever a
|
||||||
|
condition becomes true, irrespective of the number of clocks during
|
||||||
|
which the condition remains true.
|
||||||
|
.It Li inv
|
||||||
|
Invert the sense of comparison when the
|
||||||
|
.Dq Li cmask
|
||||||
|
qualifier is present, making the counter increment when the number of
|
||||||
|
events per cycle is less than the value specified by the
|
||||||
|
.Dq Li cmask
|
||||||
|
qualifier.
|
||||||
|
.It Li os
|
||||||
|
Configure the PMC to count events happening at processor privilege
|
||||||
|
level 0.
|
||||||
|
.It Li umask= Ns Ar value
|
||||||
|
This qualifier is used to further qualify the event selected (see
|
||||||
|
below).
|
||||||
|
.It Li usr
|
||||||
|
Configure the PMC to count events occurring at privilege levels 1, 2
|
||||||
|
or 3.
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
If neither of the
|
||||||
|
.Dq Li os
|
||||||
|
or
|
||||||
|
.Dq Li usr
|
||||||
|
qualifiers is specified, the default is to enable both.
|
||||||
|
.Pp
|
||||||
|
The event specifiers supported by Intel P6 PMCs are:
|
||||||
|
.Bl -tag -width indent
|
||||||
|
.It Li p6-baclears
|
||||||
|
Count the number of times a static branch prediction was made by the
|
||||||
|
branch decoder because the BTB did not have a prediction.
|
||||||
|
.It Li p6-br-bac-missp-exec
|
||||||
|
.Pq Tn "Pentium M"
|
||||||
|
Count the number of branch instructions executed that were
|
||||||
|
mispredicted at the Front End (BAC).
|
||||||
|
.It Li p6-br-bogus
|
||||||
|
Count the number of bogus branches.
|
||||||
|
.It Li p6-br-call-exec
|
||||||
|
.Pq Tn "Pentium M"
|
||||||
|
Count the number of call instructions executed.
|
||||||
|
.It Li p6-br-call-missp-exec
|
||||||
|
.Pq Tn "Pentium M"
|
||||||
|
Count the number of call instructions executed that were mispredicted.
|
||||||
|
.It Li p6-br-cnd-exec
|
||||||
|
.Pq Tn "Pentium M"
|
||||||
|
Count the number of conditional branch instructions executed.
|
||||||
|
.It Li p6-br-cnd-missp-exec
|
||||||
|
.Pq Tn "Pentium M"
|
||||||
|
Count the number of conditional branch instructions executed that were
|
||||||
|
mispredicted.
|
||||||
|
.It Li p6-br-ind-call-exec
|
||||||
|
.Pq Tn "Pentium M"
|
||||||
|
Count the number of indirect call instructions executed.
|
||||||
|
.It Li p6-br-ind-exec
|
||||||
|
.Pq Tn "Pentium M"
|
||||||
|
Count the number of indirect branch instructions executed.
|
||||||
|
.It Li p6-br-ind-missp-exec
|
||||||
|
.Pq Tn "Pentium M"
|
||||||
|
Count the number of indirect branch instructions executed that were
|
||||||
|
mispredicted.
|
||||||
|
.It Li p6-br-inst-decoded
|
||||||
|
Count the number of branch instructions decoded.
|
||||||
|
.It Li p6-br-inst-exec
|
||||||
|
.Pq Tn "Pentium M"
|
||||||
|
Count the number of branch instructions executed but not necessarily retired.
|
||||||
|
.It Li p6-br-inst-retired
|
||||||
|
Count the number of branch instructions retired.
|
||||||
|
.It Li p6-br-miss-pred-retired
|
||||||
|
Count the number of mispredicted branch instructions retired.
|
||||||
|
.It Li p6-br-miss-pred-taken-ret
|
||||||
|
Count the number of taken mispredicted branches retired.
|
||||||
|
.It Li p6-br-missp-exec
|
||||||
|
.Pq Tn "Pentium M"
|
||||||
|
Count the number of branch instructions executed that were
|
||||||
|
mispredicted at execution.
|
||||||
|
.It Li p6-br-ret-bac-missp-exec
|
||||||
|
.Pq Tn "Pentium M"
|
||||||
|
Count the number of return instructions executed that were
|
||||||
|
mispredicted at the Front End (BAC).
|
||||||
|
.It Li p6-br-ret-exec
|
||||||
|
.Pq Tn "Pentium M"
|
||||||
|
Count the number of return instructions executed.
|
||||||
|
.It Li p6-br-ret-missp-exec
|
||||||
|
.Pq Tn "Pentium M"
|
||||||
|
Count the number of return instructions executed that were
|
||||||
|
mispredicted at execution.
|
||||||
|
.It Li p6-br-taken-retired
|
||||||
|
Count the number of taken branches retired.
|
||||||
|
.It Li p6-btb-misses
|
||||||
|
Count the number of branches for which the BTB did not produce a
|
||||||
|
prediction.
|
||||||
|
.It Li p6-bus-bnr-drv
|
||||||
|
Count the number of bus clock cycles during which this processor is
|
||||||
|
driving the BNR# pin.
|
||||||
|
.It Li p6-bus-data-rcv
|
||||||
|
Count the number of bus clock cycles during which this processor is
|
||||||
|
receiving data.
|
||||||
|
.It Li p6-bus-drdy-clocks Op Li ,umask= Ns Ar qualifier
|
||||||
|
Count the number of clocks during which DRDY# is asserted.
|
||||||
|
An additional qualifier may be specified, and comprises one of the
|
||||||
|
following keywords:
|
||||||
|
.Pp
|
||||||
|
.Bl -tag -width indent -compact
|
||||||
|
.It Li any
|
||||||
|
Count transactions generated by any agent on the bus.
|
||||||
|
.It Li self
|
||||||
|
Count transactions generated by this processor.
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
The default is to count operations generated by this processor.
|
||||||
|
.It Li p6-bus-hit-drv
|
||||||
|
Count the number of bus clock cycles during which this processor is
|
||||||
|
driving the HIT# pin.
|
||||||
|
.It Li p6-bus-hitm-drv
|
||||||
|
Count the number of bus clock cycles during which this processor is
|
||||||
|
driving the HITM# pin.
|
||||||
|
.It Li p6-bus-lock-clocks Op Li ,umask= Ns Ar qualifier
|
||||||
|
Count the number of clocks during which LOCK# is asserted on the
|
||||||
|
external system bus.
|
||||||
|
An additional qualifier may be specified and comprises one of the following
|
||||||
|
keywords:
|
||||||
|
.Pp
|
||||||
|
.Bl -tag -width indent -compact
|
||||||
|
.It Li any
|
||||||
|
Count transactions generated by any agent on the bus.
|
||||||
|
.It Li self
|
||||||
|
Count transactions generated by this processor.
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
The default is to count operations generated by this processor.
|
||||||
|
.It Li p6-bus-req-outstanding
|
||||||
|
Count the number of bus requests outstanding in any given cycle.
|
||||||
|
.It Li p6-bus-snoop-stall
|
||||||
|
Count the number of clock cycles during which the bus is snoop stalled.
|
||||||
|
.It Li p6-bus-tran-any Op Li ,umask= Ns Ar qualifier
|
||||||
|
Count the number of completed bus transactions of any kind.
|
||||||
|
An additional qualifier may be specified and comprises one of the following
|
||||||
|
keywords:
|
||||||
|
.Pp
|
||||||
|
.Bl -tag -width indent -compact
|
||||||
|
.It Li any
|
||||||
|
Count transactions generated by any agent on the bus.
|
||||||
|
.It Li self
|
||||||
|
Count transactions generated by this processor.
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
The default is to count operations generated by this processor.
|
||||||
|
.It Li p6-bus-tran-brd Op Li ,umask= Ns Ar qualifier
|
||||||
|
Count the number of burst read transactions.
|
||||||
|
An additional qualifier may be specified and comprises one of the following
|
||||||
|
keywords:
|
||||||
|
.Pp
|
||||||
|
.Bl -tag -width indent -compact
|
||||||
|
.It Li any
|
||||||
|
Count transactions generated by any agent on the bus.
|
||||||
|
.It Li self
|
||||||
|
Count transactions generated by this processor.
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
The default is to count operations generated by this processor.
|
||||||
|
.It Li p6-bus-tran-burst Op Li ,umask= Ns Ar qualifier
|
||||||
|
Count the number of completed burst transactions.
|
||||||
|
An additional qualifier may be specified and comprises one of the following
|
||||||
|
keywords:
|
||||||
|
.Pp
|
||||||
|
.Bl -tag -width indent -compact
|
||||||
|
.It Li any
|
||||||
|
Count transactions generated by any agent on the bus.
|
||||||
|
.It Li self
|
||||||
|
Count transactions generated by this processor.
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
The default is to count operations generated by this processor.
|
||||||
|
.It Li p6-bus-tran-def Op Li ,umask= Ns Ar qualifier
|
||||||
|
Count the number of completed deferred transactions.
|
||||||
|
An additional qualifier may be specified and comprises one of the following
|
||||||
|
keywords:
|
||||||
|
.Pp
|
||||||
|
.Bl -tag -width indent -compact
|
||||||
|
.It Li any
|
||||||
|
Count transactions generated by any agent on the bus.
|
||||||
|
.It Li self
|
||||||
|
Count transactions generated by this processor.
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
The default is to count operations generated by this processor.
|
||||||
|
.It Li p6-bus-tran-ifetch Op Li ,umask= Ns Ar qualifier
|
||||||
|
Count the number of completed instruction fetch transactions.
|
||||||
|
An additional qualifier may be specified and comprises one of the following
|
||||||
|
keywords:
|
||||||
|
.Pp
|
||||||
|
.Bl -tag -width indent -compact
|
||||||
|
.It Li any
|
||||||
|
Count transactions generated by any agent on the bus.
|
||||||
|
.It Li self
|
||||||
|
Count transactions generated by this processor.
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
The default is to count operations generated by this processor.
|
||||||
|
.It Li p6-bus-tran-inval Op Li ,umask= Ns Ar qualifier
|
||||||
|
Count the number of completed invalidate transactions.
|
||||||
|
An additional qualifier may be specified and comprises one of the following
|
||||||
|
keywords:
|
||||||
|
.Pp
|
||||||
|
.Bl -tag -width indent -compact
|
||||||
|
.It Li any
|
||||||
|
Count transactions generated by any agent on the bus.
|
||||||
|
.It Li self
|
||||||
|
Count transactions generated by this processor.
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
The default is to count operations generated by this processor.
|
||||||
|
.It Li p6-bus-tran-mem Op Li ,umask= Ns Ar qualifier
|
||||||
|
Count the number of completed memory transactions.
|
||||||
|
An additional qualifier may be specified and comprises one of the following
|
||||||
|
keywords:
|
||||||
|
.Pp
|
||||||
|
.Bl -tag -width indent -compact
|
||||||
|
.It Li any
|
||||||
|
Count transactions generated by any agent on the bus.
|
||||||
|
.It Li self
|
||||||
|
Count transactions generated by this processor.
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
The default is to count operations generated by this processor.
|
||||||
|
.It Li p6-bus-tran-pwr Op Li ,umask= Ns Ar qualifier
|
||||||
|
Count the number of completed partial write transactions.
|
||||||
|
An additional qualifier may be specified and comprises one of the following
|
||||||
|
keywords:
|
||||||
|
.Pp
|
||||||
|
.Bl -tag -width indent -compact
|
||||||
|
.It Li any
|
||||||
|
Count transactions generated by any agent on the bus.
|
||||||
|
.It Li self
|
||||||
|
Count transactions generated by this processor.
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
The default is to count operations generated by this processor.
|
||||||
|
.It Li p6-bus-tran-rfo Op Li ,umask= Ns Ar qualifier
|
||||||
|
Count the number of completed read-for-ownership transactions.
|
||||||
|
An additional qualifier may be specified and comprises one of the following
|
||||||
|
keywords:
|
||||||
|
.Pp
|
||||||
|
.Bl -tag -width indent -compact
|
||||||
|
.It Li any
|
||||||
|
Count transactions generated by any agent on the bus.
|
||||||
|
.It Li self
|
||||||
|
Count transactions generated by this processor.
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
The default is to count operations generated by this processor.
|
||||||
|
.It Li p6-bus-trans-io Op Li ,umask= Ns Ar qualifier
|
||||||
|
Count the number of completed I/O transactions.
|
||||||
|
An additional qualifier may be specified and comprises one of the following
|
||||||
|
keywords:
|
||||||
|
.Pp
|
||||||
|
.Bl -tag -width indent -compact
|
||||||
|
.It Li any
|
||||||
|
Count transactions generated by any agent on the bus.
|
||||||
|
.It Li self
|
||||||
|
Count transactions generated by this processor.
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
The default is to count operations generated by this processor.
|
||||||
|
.It Li p6-bus-trans-p Op Li ,umask= Ns Ar qualifier
|
||||||
|
Count the number of completed partial transactions.
|
||||||
|
An additional qualifier may be specified and comprises one of the following
|
||||||
|
keywords:
|
||||||
|
.Pp
|
||||||
|
.Bl -tag -width indent -compact
|
||||||
|
.It Li any
|
||||||
|
Count transactions generated by any agent on the bus.
|
||||||
|
.It Li self
|
||||||
|
Count transactions generated by this processor.
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
The default is to count operations generated by this processor.
|
||||||
|
.It Li p6-bus-trans-wb Op Li ,umask= Ns Ar qualifier
|
||||||
|
Count the number of completed write-back transactions.
|
||||||
|
An additional qualifier may be specified and comprises one of the following
|
||||||
|
keywords:
|
||||||
|
.Pp
|
||||||
|
.Bl -tag -width indent -compact
|
||||||
|
.It Li any
|
||||||
|
Count transactions generated by any agent on the bus.
|
||||||
|
.It Li self
|
||||||
|
Count transactions generated by this processor.
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
The default is to count operations generated by this processor.
|
||||||
|
.It Li p6-cpu-clk-unhalted
|
||||||
|
Count the number of cycles during which the processor was not halted.
|
||||||
|
.Pp
|
||||||
|
.Pq Tn "Pentium M"
|
||||||
|
Count the number of cycles during which the processor was not halted
|
||||||
|
and not in a thermal trip.
|
||||||
|
.It Li p6-cycles-div-busy
|
||||||
|
Count the number of cycles during which the divider is busy and cannot
|
||||||
|
accept new divides.
|
||||||
|
This event is only allocated on counter 0.
|
||||||
|
.It Li p6-cycles-int-pending-and-masked
|
||||||
|
Count the number of processor cycles for which interrupts were
|
||||||
|
disabled and interrupts were pending.
|
||||||
|
.It Li p6-cycles-int-masked
|
||||||
|
Count the number of processor cycles for which interrupts were
|
||||||
|
disabled.
|
||||||
|
.It Li p6-data-mem-refs
|
||||||
|
Count all loads and all stores using any memory type, including
|
||||||
|
internal retries.
|
||||||
|
Each part of a split store is counted separately.
|
||||||
|
.It Li p6-dcu-lines-in
|
||||||
|
Count the total lines allocated in the data cache unit.
|
||||||
|
.It Li p6-dcu-m-lines-in
|
||||||
|
Count the number of M state lines allocated in the data cache unit.
|
||||||
|
.It Li p6-dcu-m-lines-out
|
||||||
|
Count the number of M state lines evicted from the data cache unit.
|
||||||
|
.It Li p6-dcu-miss-outstanding
|
||||||
|
Count the weighted number of cycles while a data cache unit miss is
|
||||||
|
outstanding, incremented by the number of outstanding cache misses at
|
||||||
|
any time.
|
||||||
|
.It Li p6-div
|
||||||
|
Count the number of integer and floating-point divides including
|
||||||
|
speculative divides.
|
||||||
|
This event is only allocated on counter 1.
|
||||||
|
.It Li p6-emon-esp-uops
|
||||||
|
.Pq Tn "Pentium M"
|
||||||
|
Count the total number of micro-ops.
|
||||||
|
.It Li p6-emon-est-trans Op Li ,umask= Ns Ar qualifier
|
||||||
|
.Pq Tn "Pentium M"
|
||||||
|
Count the number of
|
||||||
|
.Tn "Enhanced Intel SpeedStep"
|
||||||
|
transitions.
|
||||||
|
An additional qualifier may be specified, and can be one of the
|
||||||
|
following keywords:
|
||||||
|
.Pp
|
||||||
|
.Bl -tag -width indent -compact
|
||||||
|
.It Li all
|
||||||
|
Count all transitions.
|
||||||
|
.It Li freq
|
||||||
|
Count only frequency transitions.
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
The default is to count all transitions.
|
||||||
|
.It Li p6-emon-fused-uops-ret Op Li ,umask= Ns Ar qualifier
|
||||||
|
.Pq Tn "Pentium M"
|
||||||
|
Count the number of retired fused micro-ops.
|
||||||
|
An additional qualifier may be specified, and may be one of the
|
||||||
|
following keywords:
|
||||||
|
.Pp
|
||||||
|
.Bl -tag -width indent -compact
|
||||||
|
.It Li all
|
||||||
|
Count all fused micro-ops.
|
||||||
|
.It Li loadop
|
||||||
|
Count only load and op micro-ops.
|
||||||
|
.It Li stdsta
|
||||||
|
Count only STD/STA micro-ops.
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
The default is to count all fused micro-ops.
|
||||||
|
.It Li p6-emon-kni-comp-inst-ret Op Li ,umask= Ns Ar qualifier
|
||||||
|
.Pq Tn "Pentium III"
|
||||||
|
Count the number of SSE computational instructions retired.
|
||||||
|
An additional qualifier may be specified, and comprises one of the
|
||||||
|
following keywords:
|
||||||
|
.Pp
|
||||||
|
.Bl -tag -width indent -compact
|
||||||
|
.It Li packed-and-scalar
|
||||||
|
Count packed and scalar operations.
|
||||||
|
.It Li scalar
|
||||||
|
Count scalar operations only.
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
The default is to count packed and scalar operations.
|
||||||
|
.It Li p6-emon-kni-inst-retired Op Li ,umask= Ns Ar qualifier
|
||||||
|
.Pq Tn "Pentium III"
|
||||||
|
Count the number of SSE instructions retired.
|
||||||
|
An additional qualifier may be specified, and comprises one of the
|
||||||
|
following keywords:
|
||||||
|
.Pp
|
||||||
|
.Bl -tag -width indent -compact
|
||||||
|
.It Li packed-and-scalar
|
||||||
|
Count packed and scalar operations.
|
||||||
|
.It Li scalar
|
||||||
|
Count scalar operations only.
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
The default is to count packed and scalar operations.
|
||||||
|
.It Li p6-emon-kni-pref-dispatched Op Li ,umask= Ns Ar qualifier
|
||||||
|
.Pq Tn "Pentium III"
|
||||||
|
Count the number of SSE prefetch or weakly ordered instructions
|
||||||
|
dispatched (including speculative prefetches).
|
||||||
|
An additional qualifier may be specified, and comprises one of the
|
||||||
|
following keywords:
|
||||||
|
.Pp
|
||||||
|
.Bl -tag -width indent -compact
|
||||||
|
.It Li nta
|
||||||
|
Count non-temporal prefetches.
|
||||||
|
.It Li t1
|
||||||
|
Count prefetches to L1.
|
||||||
|
.It Li t2
|
||||||
|
Count prefetches to L2.
|
||||||
|
.It Li wos
|
||||||
|
Count weakly ordered stores.
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
The default is to count non-temporal prefetches.
|
||||||
|
.It Li p6-emon-kni-pref-miss Op Li ,umask= Ns Ar qualifier
|
||||||
|
.Pq Tn "Pentium III"
|
||||||
|
Count the number of prefetch or weakly ordered instructions that miss
|
||||||
|
all caches.
|
||||||
|
An additional qualifier may be specified, and comprises one of the
|
||||||
|
following keywords:
|
||||||
|
.Pp
|
||||||
|
.Bl -tag -width indent -compact
|
||||||
|
.It Li nta
|
||||||
|
Count non-temporal prefetches.
|
||||||
|
.It Li t1
|
||||||
|
Count prefetches to L1.
|
||||||
|
.It Li t2
|
||||||
|
Count prefetches to L2.
|
||||||
|
.It Li wos
|
||||||
|
Count weakly ordered stores.
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
The default is to count non-temporal prefetches.
|
||||||
|
.It Li p6-emon-pref-rqsts-dn
|
||||||
|
.Pq Tn "Pentium M"
|
||||||
|
Count the number of downward prefetches issued.
|
||||||
|
.It Li p6-emon-pref-rqsts-up
|
||||||
|
.Pq Tn "Pentium M"
|
||||||
|
Count the number of upward prefetches issued.
|
||||||
|
.It Li p6-emon-simd-instr-retired
|
||||||
|
.Pq Tn "Pentium M"
|
||||||
|
Count the number of retired
|
||||||
|
.Tn MMX
|
||||||
|
instructions.
|
||||||
|
.It Li p6-emon-sse-sse2-comp-inst-retired Op Li ,umask= Ns Ar qualifier
|
||||||
|
.Pq Tn "Pentium M"
|
||||||
|
Count the number of computational SSE instructions retired.
|
||||||
|
An additional qualifier may be specified and can be one of the
|
||||||
|
following keywords:
|
||||||
|
.Pp
|
||||||
|
.Bl -tag -width indent -compact
|
||||||
|
.It Li sse-packed-single
|
||||||
|
Count SSE packed-single instructions.
|
||||||
|
.It Li sse-scalar-single
|
||||||
|
Count SSE scalar-single instructions.
|
||||||
|
.It Li sse2-packed-double
|
||||||
|
Count SSE2 packed-double instructions.
|
||||||
|
.It Li sse2-scalar-double
|
||||||
|
Count SSE2 scalar-double instructions.
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
The default is to count SSE packed-single instructions.
|
||||||
|
.It Li p6-emon-sse-sse2-inst-retired Op Li ,umask= Ns Ar qualifier
|
||||||
|
.Pq Tn "Pentium M"
|
||||||
|
Count the number of SSE instructions retired.
|
||||||
|
An additional qualifier can be specified, and can be one of the
|
||||||
|
following keywords:
|
||||||
|
.Pp
|
||||||
|
.Bl -tag -width indent -compact
|
||||||
|
.It Li sse-packed-single
|
||||||
|
Count SSE packed-single instructions.
|
||||||
|
.It Li sse-packed-single-scalar-single
|
||||||
|
Count SSE packed-single and scalar-single instructions.
|
||||||
|
.It Li sse2-packed-double
|
||||||
|
Count SSE2 packed-double instructions.
|
||||||
|
.It Li sse2-scalar-double
|
||||||
|
Count SSE2 scalar-double instructions.
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
The default is to count SSE packed-single instructions.
|
||||||
|
.It Li p6-emon-synch-uops
|
||||||
|
.Pq Tn "Pentium M"
|
||||||
|
Count the number of sync micro-ops.
|
||||||
|
.It Li p6-emon-thermal-trip
|
||||||
|
.Pq Tn "Pentium M"
|
||||||
|
Count the duration or occurrences of thermal trips.
|
||||||
|
Use the
|
||||||
|
.Dq Li edge
|
||||||
|
qualifier to count occurrences of thermal trips.
|
||||||
|
.It Li p6-emon-unfusion
|
||||||
|
.Pq Tn "Pentium M"
|
||||||
|
Count the number of unfusion events in the reorder buffer.
|
||||||
|
.It Li p6-flops
|
||||||
|
Count the number of computational floating point operations retired.
|
||||||
|
This event is only allocated on counter 0.
|
||||||
|
.It Li p6-fp-assist
|
||||||
|
Count the number of floating point exceptions handled by microcode.
|
||||||
|
This event is only allocated on counter 1.
|
||||||
|
.It Li p6-fp-comps-ops-exe
|
||||||
|
Count the number of computational floating point operations executed.
|
||||||
|
This event is only allocated on counter 0.
|
||||||
|
.It Li p6-fp-mmx-trans Op Li ,umask= Ns Ar qualifier
|
||||||
|
.Pq Tn "Pentium II" , Tn "Pentium III"
|
||||||
|
Count the number of transitions between MMX and floating-point
|
||||||
|
instructions.
|
||||||
|
An additional qualifier may be specified, and comprises one of the
|
||||||
|
following keywords:
|
||||||
|
.Pp
|
||||||
|
.Bl -tag -width indent -compact
|
||||||
|
.It Li mmxtofp
|
||||||
|
Count transitions from MMX instructions to floating-point instructions.
|
||||||
|
.It Li fptommx
|
||||||
|
Count transitions from floating-point instructions to MMX instructions.
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
The default is to count MMX to floating-point transitions.
|
||||||
|
.It Li p6-hw-int-rx
|
||||||
|
Count the number of hardware interrupts received.
|
||||||
|
.It Li p6-ifu-fetch
|
||||||
|
Count the number of instruction fetches, both cacheable and non-cacheable.
|
||||||
|
.It Li p6-ifu-fetch-miss
|
||||||
|
Count the number of instruction fetch misses (i.e., those that produce
|
||||||
|
memory accesses).
|
||||||
|
.It Li p6-ifu-mem-stall
|
||||||
|
Count the number of cycles during which instruction fetch is stalled for any reason.
|
||||||
|
.It Li p6-ild-stall
|
||||||
|
Count the number of cycles during which the instruction length decoder is stalled.
|
||||||
|
.It Li p6-inst-decoded
|
||||||
|
Count the number of instructions decoded.
|
||||||
|
.It Li p6-inst-retired
|
||||||
|
Count the number of instructions retired.
|
||||||
|
.It Li p6-itlb-miss
|
||||||
|
Count the number of instruction TLB misses.
|
||||||
|
.It Li p6-l2-ads
|
||||||
|
Count the number of L2 address strobes.
|
||||||
|
.It Li p6-l2-dbus-busy
|
||||||
|
Count the number of cycles during which the L2 cache data bus was busy.
|
||||||
|
.It Li p6-l2-dbus-busy-rd
|
||||||
|
Count the number of cycles during which the L2 cache data bus was busy
|
||||||
|
transferring read data from L2 to the processor.
|
||||||
|
.It Li p6-l2-ifetch Op Li ,umask= Ns Ar qualifier
|
||||||
|
Count the number of L2 instruction fetches.
|
||||||
|
An additional qualifier may be specified and comprises a list of the following
|
||||||
|
keywords separated by
|
||||||
|
.Ql +
|
||||||
|
characters:
|
||||||
|
.Pp
|
||||||
|
.Bl -tag -width indent -compact
|
||||||
|
.It Li e
|
||||||
|
Count operations affecting E (exclusive) state lines.
|
||||||
|
.It Li i
|
||||||
|
Count operations affecting I (invalid) state lines.
|
||||||
|
.It Li m
|
||||||
|
Count operations affecting M (modified) state lines.
|
||||||
|
.It Li s
|
||||||
|
Count operations affecting S (shared) state lines.
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
The default is to count operations affecting all (MESI) state lines.
|
||||||
|
.It Li p6-l2-ld Op Li ,umask= Ns Ar qualifier
|
||||||
|
Count the number of L2 data loads.
|
||||||
|
An additional qualifier may be specified and comprises a list of the following
|
||||||
|
keywords separated by
|
||||||
|
.Ql +
|
||||||
|
characters:
|
||||||
|
.Pp
|
||||||
|
.Bl -tag -width indent -compact
|
||||||
|
.It Li both
|
||||||
|
.Pq Tn "Pentium M"
|
||||||
|
Count both hardware-prefetched lines and non-hardware-prefetched lines.
|
||||||
|
.It Li e
|
||||||
|
Count operations affecting E (exclusive) state lines.
|
||||||
|
.It Li hw
|
||||||
|
.Pq Tn "Pentium M"
|
||||||
|
Count hardware-prefetched lines only.
|
||||||
|
.It Li i
|
||||||
|
Count operations affecting I (invalid) state lines.
|
||||||
|
.It Li m
|
||||||
|
Count operations affecting M (modified) state lines.
|
||||||
|
.It Li nonhw
|
||||||
|
.Pq Tn "Pentium M"
|
||||||
|
Exclude hardware-prefetched lines.
|
||||||
|
.It Li s
|
||||||
|
Count operations affecting S (shared) state lines.
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
The default on processors other than
|
||||||
|
.Tn "Pentium M"
|
||||||
|
processors is to count operations affecting all (MESI) state lines.
|
||||||
|
The default on
|
||||||
|
.Tn "Pentium M"
|
||||||
|
processors is to count both hardware-prefetched and
|
||||||
|
non-hardware-prefetched operations on all (MESI) state lines.
|
||||||
|
.Pq Errata
|
||||||
|
This event is affected by processor erratum E53.
|
||||||
|
.It Li p6-l2-lines-in Op Li ,umask= Ns Ar qualifier
|
||||||
|
Count the number of L2 lines allocated.
|
||||||
|
An additional qualifier may be specified and comprises a list of the following
|
||||||
|
keywords separated by
|
||||||
|
.Ql +
|
||||||
|
characters:
|
||||||
|
.Pp
|
||||||
|
.Bl -tag -width indent -compact
|
||||||
|
.It Li both
|
||||||
|
.Pq Tn "Pentium M"
|
||||||
|
Count both hardware-prefetched lines and non-hardware-prefetched lines.
|
||||||
|
.It Li e
|
||||||
|
Count operations affecting E (exclusive) state lines.
|
||||||
|
.It Li hw
|
||||||
|
.Pq Tn "Pentium M"
|
||||||
|
Count hardware-prefetched lines only.
|
||||||
|
.It Li i
|
||||||
|
Count operations affecting I (invalid) state lines.
|
||||||
|
.It Li m
|
||||||
|
Count operations affecting M (modified) state lines.
|
||||||
|
.It Li nonhw
|
||||||
|
.Pq Tn "Pentium M"
|
||||||
|
Exclude hardware-prefetched lines.
|
||||||
|
.It Li s
|
||||||
|
Count operations affecting S (shared) state lines.
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
The default on processors other than
|
||||||
|
.Tn "Pentium M"
|
||||||
|
processors is to count operations affecting all (MESI) state lines.
|
||||||
|
The default on
|
||||||
|
.Tn "Pentium M"
|
||||||
|
processors is to count both hardware-prefetched and
|
||||||
|
non-hardware-prefetched operations on all (MESI) state lines.
|
||||||
|
.Pq Errata
|
||||||
|
This event is affected by processor erratum E45.
|
||||||
|
.It Li p6-l2-lines-out Op Li ,umask= Ns Ar qualifier
|
||||||
|
Count the number of L2 lines evicted.
|
||||||
|
An additional qualifier may be specified and comprises a list of the following
|
||||||
|
keywords separated by
|
||||||
|
.Ql +
|
||||||
|
characters:
|
||||||
|
.Pp
|
||||||
|
.Bl -tag -width indent -compact
|
||||||
|
.It Li both
|
||||||
|
.Pq Tn "Pentium M"
|
||||||
|
Count both hardware-prefetched lines and non-hardware-prefetched lines.
|
||||||
|
.It Li e
|
||||||
|
Count operations affecting E (exclusive) state lines.
|
||||||
|
.It Li hw
|
||||||
|
.Pq Tn "Pentium M"
|
||||||
|
Count hardware-prefetched lines only.
|
||||||
|
.It Li i
|
||||||
|
Count operations affecting I (invalid) state lines.
|
||||||
|
.It Li m
|
||||||
|
Count operations affecting M (modified) state lines.
|
||||||
|
.It Li nonhw
|
||||||
|
.Pq Tn "Pentium M"
|
||||||
|
Exclude hardware-prefetched lines.
|
||||||
|
.It Li s
|
||||||
|
Count operations affecting S (shared) state lines.
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
The default on processors other than
|
||||||
|
.Tn "Pentium M"
|
||||||
|
processors is to count operations affecting all (MESI) state lines.
|
||||||
|
The default on
|
||||||
|
.Tn "Pentium M"
|
||||||
|
processors is to count both hardware-prefetched and
|
||||||
|
non-hardware-prefetched operations on all (MESI) state lines.
|
||||||
|
.Pq Errata
|
||||||
|
This event is affected by processor erratum E45.
|
||||||
|
.It Li p6-l2-m-lines-inm
|
||||||
|
Count the number of modified lines allocated in L2 cache.
|
||||||
|
.It Li p6-l2-m-lines-outm Op Li ,umask= Ns Ar qualifier
|
||||||
|
Count the number of L2 M-state lines evicted.
|
||||||
|
.Pp
|
||||||
|
.Pq Tn "Pentium M"
|
||||||
|
On these processors an additional qualifier may be specified and
|
||||||
|
comprises a list of the following keywords separated by
|
||||||
|
.Ql +
|
||||||
|
characters:
|
||||||
|
.Pp
|
||||||
|
.Bl -tag -width indent -compact
|
||||||
|
.It Li both
|
||||||
|
Count both hardware-prefetched lines and non-hardware-prefetched lines.
|
||||||
|
.It Li hw
|
||||||
|
Count hardware-prefetched lines only.
|
||||||
|
.It Li nonhw
|
||||||
|
Exclude hardware-prefetched lines.
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
The default is to count both hardware-prefetched and
|
||||||
|
non-hardware-prefetched operations.
|
||||||
|
.Pq Errata
|
||||||
|
This event is affected by processor erratum E53.
|
||||||
|
.It Li p6-l2-rqsts Op Li ,umask= Ns Ar qualifier
|
||||||
|
Count the total number of L2 requests.
|
||||||
|
An additional qualifier may be specified and comprises a list of the following
|
||||||
|
keywords separated by
|
||||||
|
.Ql +
|
||||||
|
characters:
|
||||||
|
.Pp
|
||||||
|
.Bl -tag -width indent -compact
|
||||||
|
.It Li e
|
||||||
|
Count operations affecting E (exclusive) state lines.
|
||||||
|
.It Li i
|
||||||
|
Count operations affecting I (invalid) state lines.
|
||||||
|
.It Li m
|
||||||
|
Count operations affecting M (modified) state lines.
|
||||||
|
.It Li s
|
||||||
|
Count operations affecting S (shared) state lines.
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
The default is to count operations affecting all (MESI) state lines.
|
||||||
|
.It Li p6-l2-st Op Li ,umask= Ns Ar qualifier
|
||||||
|
Count the number of L2 data stores.
|
||||||
|
An additional qualifier may be specified and comprises a list of the following
|
||||||
|
keywords separated by
|
||||||
|
.Ql +
|
||||||
|
characters:
|
||||||
|
.Pp
|
||||||
|
.Bl -tag -width indent -compact
|
||||||
|
.It Li e
|
||||||
|
Count operations affecting E (exclusive) state lines.
|
||||||
|
.It Li i
|
||||||
|
Count operations affecting I (invalid) state lines.
|
||||||
|
.It Li m
|
||||||
|
Count operations affecting M (modified) state lines.
|
||||||
|
.It Li s
|
||||||
|
Count operations affecting S (shared) state lines.
|
||||||
|
.El
|
||||||
|
.Pp
|
||||||
|
The default is to count operations affecting all (MESI) state lines.
|
||||||
|
.It Li p6-ld-blocks
|
||||||
|
Count the number of load operations delayed due to store buffer blocks.
|
||||||
|
.It Li p6-misalign-mem-ref
|
||||||
|
Count the number of misaligned data memory references (crossing a
64-bit boundary).
|
||||||
|
.It Li p6-mmx-assist
|
||||||
|
.Pq Tn "Pentium II" , Tn "Pentium III"
|
||||||
|
Count the number of MMX assists executed.
|
||||||
|
.It Li p6-mmx-instr-exec
|
||||||
|
.Pq Tn Celeron , Tn "Pentium II"
|
||||||
|
Count the number of MMX instructions executed, except MOVQ and MOVD
|
||||||
|
stores from register to memory.
|
||||||
|
.It Li p6-mmx-instr-ret
|
||||||
|
.Pq Tn "Pentium II"
|
||||||
|
Count the number of MMX instructions retired.
|
||||||
|
.It Li p6-mmx-instr-type-exec Op Li ,umask= Ns Ar qualifier
.Pq Tn "Pentium II" , Tn "Pentium III"
Count the number of MMX instructions executed.
An additional qualifier may be specified and comprises a list of
the following keywords separated by
.Ql +
characters:
.Pp
.Bl -tag -width indent -compact
.It Li pack
Count MMX pack operation instructions.
.It Li packed-arithmetic
Count MMX packed arithmetic instructions.
.It Li packed-logical
Count MMX packed logical instructions.
.It Li packed-multiply
Count MMX packed multiply instructions.
.It Li packed-shift
Count MMX packed shift instructions.
.It Li unpack
Count MMX unpack operation instructions.
.El
.Pp
The default is to count all operations.
.It Li p6-mmx-sat-instr-exec
.Pq Tn "Pentium II" , Tn "Pentium III"
Count the number of MMX saturating instructions executed.
.It Li p6-mmx-uops-exec
.Pq Tn "Pentium II" , Tn "Pentium III"
Count the number of MMX micro-ops executed.
.It Li p6-mul
Count the number of integer and floating-point multiplies, including
speculative multiplies.
This event is only allocated on counter 1.
.It Li p6-partial-rat-stalls
Count the number of cycles or events for partial stalls.
.It Li p6-resource-stalls
Count the number of cycles during which there was a resource-related
stall of any kind.
.It Li p6-ret-seg-renames
.Pq Tn "Pentium II" , Tn "Pentium III"
Count the number of segment register rename events retired.
.It Li p6-sb-drains
Count the number of cycles the store buffer is draining.
.It Li p6-seg-reg-renames Op Li ,umask= Ns Ar qualifier
.Pq Tn "Pentium II" , Tn "Pentium III"
Count the number of segment register renames.
An additional qualifier may be specified, and comprises a list of the
following keywords separated by
.Ql +
characters:
.Pp
.Bl -tag -width indent -compact
.It Li ds
Count renames for segment register DS.
.It Li es
Count renames for segment register ES.
.It Li fs
Count renames for segment register FS.
.It Li gs
Count renames for segment register GS.
.El
.Pp
The default is to count operations affecting all segment registers.
.It Li p6-seg-rename-stalls Op Li ,umask= Ns Ar qualifier
.Pq Tn "Pentium II" , Tn "Pentium III"
Count the number of segment register renaming stalls.
An additional qualifier may be specified, and comprises a list of the
following keywords separated by
.Ql +
characters:
.Pp
.Bl -tag -width indent -compact
.It Li ds
Count stalls for segment register DS.
.It Li es
Count stalls for segment register ES.
.It Li fs
Count stalls for segment register FS.
.It Li gs
Count stalls for segment register GS.
.El
.Pp
The default is to count stalls for all segment registers.
.It Li p6-segment-reg-loads
Count the number of segment register loads.
.It Li p6-uops-retired
Count the number of micro-ops retired.
.El
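.Pp
As an illustration of the qualifier syntax used by several of the events
above, the following C sketch joins an event name and a list of qualifier
keywords into a single specifier string of the form
.Ql event,umask=k1+k2 .
The function
.Fn build_event_spec
is a hypothetical helper written for this example; it is not part of
.Lb libpmc ,
and it does not check that the keywords are valid for the chosen event.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not a libpmc function): write
 * "event,umask=k1+k2+..." into buf.  With no keywords, only the
 * event name is written.  Returns 0 on success, or -1 if the
 * result does not fit in buflen bytes.
 */
static int
build_event_spec(char *buf, size_t buflen, const char *event,
    const char *const *keywords, size_t nkeywords)
{
	size_t i, used;
	int n;

	n = snprintf(buf, buflen, "%s", event);
	if (n < 0 || (size_t)n >= buflen)
		return (-1);

	for (i = 0; i < nkeywords; i++) {
		used = strlen(buf);
		/* The first keyword follows ",umask=", later ones follow "+". */
		n = snprintf(buf + used, buflen - used, "%s%s",
		    i == 0 ? ",umask=" : "+", keywords[i]);
		if (n < 0 || (size_t)n >= buflen - used)
			return (-1);
	}
	return (0);
}
```

Called with the event
.Li p6-seg-reg-renames
and the keywords
.Li ds
and
.Li es ,
the sketch produces
.Ql p6-seg-reg-renames,umask=ds+es .
See
.Xr pmc 3
for how event specifiers are handed to the library.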
.Ss Event Name Aliases
The following table shows the mapping between the PMC-independent
aliases supported by
.Lb libpmc
and the underlying hardware events used.
.Bl -column "branch-mispredicts" "p6-br-miss-pred-retired"
.It Em Alias Ta Em Event
.It Li branches Ta Li p6-br-inst-retired
.It Li branch-mispredicts Ta Li p6-br-miss-pred-retired
.It Li dc-misses Ta Li p6-dcu-lines-in
.It Li ic-misses Ta Li p6-ifu-fetch-miss
.It Li instructions Ta Li p6-inst-retired
.It Li interrupts Ta Li p6-hw-int-rx
.It Li unhalted-cycles Ta Li p6-cpu-clk-unhalted
.El
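.Pp
The alias table above can be pictured as a simple lookup from alias to
event name.
The C sketch below encodes the same pairs in a static array; the array
.Va p6_aliases
and the function
.Fn p6_alias_to_event
are hypothetical names chosen for this example and do not describe how
.Lb libpmc
resolves aliases internally.

```c
#include <stddef.h>
#include <string.h>

/* Alias and event pairs copied from the table above. */
static const struct {
	const char *alias;
	const char *event;
} p6_aliases[] = {
	{ "branches",           "p6-br-inst-retired" },
	{ "branch-mispredicts", "p6-br-miss-pred-retired" },
	{ "dc-misses",          "p6-dcu-lines-in" },
	{ "ic-misses",          "p6-ifu-fetch-miss" },
	{ "instructions",       "p6-inst-retired" },
	{ "interrupts",         "p6-hw-int-rx" },
	{ "unhalted-cycles",    "p6-cpu-clk-unhalted" },
};

/*
 * Hypothetical helper (not a libpmc function): return the event
 * name for an alias, or NULL if the alias is not in the table.
 */
static const char *
p6_alias_to_event(const char *alias)
{
	size_t i;

	for (i = 0; i < sizeof(p6_aliases) / sizeof(p6_aliases[0]); i++) {
		if (strcmp(p6_aliases[i].alias, alias) == 0)
			return (p6_aliases[i].event);
	}
	return (NULL);
}
```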
.Sh SEE ALSO
.Xr pmc 3 ,
.Xr pmc.k7 3 ,
.Xr pmc.k8 3 ,
.Xr pmc.p4 3 ,
.Xr pmc.p5 3 ,
.Xr pmc.tsc 3 ,
.Xr pmclog 3 ,
.Xr hwpmc 4
.Sh HISTORY
The
.Nm pmc
library first appeared in
.Fx 6.0 .
.Sh AUTHORS
The
.Lb libpmc
library was written by
.An "Joseph Koshy"
.Aq jkoshy@FreeBSD.org .