.\" Copyright (c) 2008 Joseph Koshy. All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\"
.\" This software is provided by Joseph Koshy ``as is'' and
.\" any express or implied warranties, including, but not limited to, the
.\" implied warranties of merchantability and fitness for a particular purpose
.\" are disclaimed. in no event shall Joseph Koshy be liable
.\" for any direct, indirect, incidental, special, exemplary, or consequential
.\" damages (including, but not limited to, procurement of substitute goods
.\" or services; loss of use, data, or profits; or business interruption)
.\" however caused and on any theory of liability, whether in contract, strict
.\" liability, or tort (including negligence or otherwise) arising in any way
.\" out of the use of this software, even if advised of the possibility of
.\" such damage.
.\"
.\" $FreeBSD$
.\"
.Dd November 12, 2008
.Os
.Dt PMC.CORE 3
.Sh NAME
.Nm pmc.core
.Nd measurement events for
.Tn Intel
.Tn Core Solo
and
.Tn Core Duo
family CPUs
.Sh LIBRARY
.Lb libpmc
.Sh SYNOPSIS
.In pmc.h
.Sh DESCRIPTION
.Tn Intel
.Tn "Core Solo"
and
.Tn "Core Duo"
CPUs contain PMCs conforming to version 1 of the
.Tn Intel
performance measurement architecture.
.Pp
These PMCs are documented in
.Rs
.%B "IA-32 Intel(R) Architecture Software Developer's Manual"
.%T "Volume 3: System Programming Guide"
.%N "Order Number 253669-027US"
.%D July 2008
.%Q "Intel Corporation"
.Re
.Ss PMC Features
CPUs conforming to version 1 of the
.Tn Intel
performance measurement architecture contain two programmable PMCs of
class
.Li PMC_CLASS_IAP .
The PMCs are 40 bits wide and offer the following capabilities:
.Bl -column "PMC_CAP_INTERRUPT" "Support"
.It Em Capability Ta Em Support
.It PMC_CAP_CASCADE Ta \&No
.It PMC_CAP_EDGE Ta Yes
.It PMC_CAP_INTERRUPT Ta Yes
.It PMC_CAP_INVERT Ta Yes
.It PMC_CAP_READ Ta Yes
.It PMC_CAP_PRECISE Ta \&No
.It PMC_CAP_SYSTEM Ta Yes
.It PMC_CAP_TAGGING Ta \&No
.It PMC_CAP_THRESHOLD Ta Yes
.It PMC_CAP_USER Ta Yes
.It PMC_CAP_WRITE Ta Yes
.El
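.Pp
Because the counters are 40 bits wide, a raw hardware reading wraps
around after 2^40 events.
The following C fragment is a minimal sketch of computing the
difference between two raw readings of the same counter.
The names
.Li IAP_COUNTER_BITS
and
.Li iap_counter_delta
are illustrative only and are not provided by the library.
The sketch assumes at most one wrap between the two readings.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stdint.h>

/* Counter width stated in this manual page for these CPUs. */
#define IAP_COUNTER_BITS 40
#define IAP_COUNTER_MASK ((UINT64_C(1) << IAP_COUNTER_BITS) - 1)

/*
 * Return the number of events between two raw readings of the
 * same counter, assuming it wrapped at most once in between.
 * (Hypothetical helper; not part of libpmc.)
 */
static uint64_t
iap_counter_delta(uint64_t before, uint64_t after)
{
	/* Raw readings must fit in the 40-bit counter. */
	assert(before <= IAP_COUNTER_MASK);
	assert(after <= IAP_COUNTER_MASK);

	/* Unsigned subtraction plus masking handles the wrap. */
	return ((after - before) & IAP_COUNTER_MASK);
}
```
.Ed
Whether values returned through the library have already been widened
by the driver is not covered here, so the masking may be unnecessary
for such values.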
.Ss Event Qualifiers
Event specifiers for these PMCs support the following common
qualifiers:
.Bl -tag -width indent
.It Li cmask= Ns Ar value
Configure the PMC to increment only if the number of configured
events measured in a cycle is greater than or equal to
.Ar value .
.It Li edge
Configure the PMC to count the number of deasserted to asserted
transitions of the conditions expressed by the other qualifiers.
If specified, the counter will increment only once whenever a
condition becomes true, irrespective of the number of clocks during
which the condition remains true.
.It Li inv
Invert the sense of comparison when the
.Dq Li cmask
qualifier is present, making the counter increment when the number of
events per cycle is less than the value specified by the
.Dq Li cmask
qualifier.
.It Li os
Configure the PMC to count events happening at processor privilege
level 0.
.It Li usr
Configure the PMC to count events occurring at privilege levels 1, 2
or 3.
.El
.Pp
If neither the
.Dq Li os
nor the
.Dq Li usr
qualifier is specified, the default is to enable both.
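.Pp
As a rough illustration, the common qualifiers can be pictured as
fields of an event-select register value.
The sketch below assumes the architectural IA32_PERFEVTSELx layout
described in the Intel manual cited above: event select in bits 7:0,
unit mask in bits 15:8, USR in bit 16, OS in bit 17, edge detect in
bit 18, INV in bit 23 and CMASK in bits 31:24.
The structure and function names are hypothetical, and this is not
the encoding routine used by the library.
.Bd -literal -offset indent
```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the common qualifiers. */
struct iap_quals {
	uint8_t	cmask;	/* cmask=value; 0 means no threshold */
	bool	edge;	/* edge */
	bool	inv;	/* inv */
	bool	os;	/* os */
	bool	usr;	/* usr */
};

/*
 * Build an event-select value from an event code, a unit mask
 * and the common qualifiers.  The enable and interrupt bits are
 * left clear; a real driver would manage those itself.
 */
static uint32_t
iap_evtsel(uint8_t event, uint8_t umask, struct iap_quals q)
{
	uint32_t v;

	v = (uint32_t)event | ((uint32_t)umask << 8);

	/* If neither os nor usr is given, count in both modes. */
	if (!q.os && !q.usr)
		q.os = q.usr = true;

	if (q.usr)
		v |= UINT32_C(1) << 16;
	if (q.os)
		v |= UINT32_C(1) << 17;
	if (q.edge)
		v |= UINT32_C(1) << 18;
	if (q.inv)
		v |= UINT32_C(1) << 23;
	v |= (uint32_t)q.cmask << 24;

	return (v);
}
```
.Ed
For example, under these assumptions the event
.Li Instr_Ret
(Event C0H, Umask 00H) restricted with
.Dq Li usr
would yield the value 000100C0H.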
.Pp
Events that require core-specificity to be specified use an
additional qualifier
.Dq Li core= Ns Ar value ,
where argument
.Ar value
is one of:
.Bl -tag -width indent -compact
.It Li all
Measure event conditions on all cores.
.It Li this
Measure event conditions on this core.
.El
The default is
.Dq Li this .
.Pp
Events that require an agent qualifier to be specified use an
additional qualifier
.Dq Li agent= Ns Ar value ,
where argument
.Ar value
is one of:
.Bl -tag -width indent -compact
.It Li this
Measure events associated with this bus agent.
.It Li any
Measure events caused by any bus agent.
.El
The default is
.Dq Li this .
.Pp
Events that require a hardware prefetch qualifier to be specified use an
additional qualifier
.Dq Li prefetch= Ns Ar value ,
where argument
.Ar value
is one of:
.Bl -tag -width "exclude" -compact
.It Li both
Include all prefetches.
.It Li only
Only count hardware prefetches.
.It Li exclude
Exclude hardware prefetches.
.El
The default is
.Dq Li both .
.Pp
Events that require a cache coherence qualifier to be specified use an
additional qualifier
.Dq Li cachestate= Ns Ar value ,
where argument
.Ar value
contains one or more of the following letters:
.Bl -tag -width indent -compact
.It Li e
Count cache lines in the exclusive state.
.It Li i
Count cache lines in the invalid state.
.It Li m
Count cache lines in the modified state.
.It Li s
Count cache lines in the shared state.
.El
The default is
.Dq Li eims .
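.Pp
The letters accepted by the
.Dq Li cachestate
qualifier can be thought of as a small bit set.
The following C sketch parses such a value into a mask.
The bit values (M = 08H, E = 04H, S = 02H, I = 01H within the unit
mask) are an assumption taken from the MESI qualification encoding in
the Intel manual and should be checked against it; the function name
is hypothetical and this is not the parser used by the library.
Only the lowercase letters listed above are accepted.
.Bd -literal -offset indent
```c
#include <stdint.h>

/* Assumed unit-mask bits for MESI qualification. */
#define CS_M 0x08	/* modified */
#define CS_E 0x04	/* exclusive */
#define CS_S 0x02	/* shared */
#define CS_I 0x01	/* invalid */

/*
 * Convert a cachestate value such as "ms" into a bit mask.
 * A null or empty value selects the documented default, "eims".
 * Returns -1 if the value contains an unknown letter.
 */
static int
parse_cachestate(const char *value)
{
	int mask = 0;

	if (value == 0 || *value == 0)
		return (CS_M | CS_E | CS_S | CS_I);

	for (; *value != 0; value++) {
		switch (*value) {
		case 'e':
			mask |= CS_E;
			break;
		case 'i':
			mask |= CS_I;
			break;
		case 'm':
			mask |= CS_M;
			break;
		case 's':
			mask |= CS_S;
			break;
		default:
			return (-1);
		}
	}
	return (mask);
}
```
.Ed
Repeated letters are harmless in this sketch because the bits are
simply combined.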
.Ss Event Specifiers
The following event names are case insensitive.
Whitespace, hyphens and underscore characters in these names are
ignored.
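.Pp
The matching rule above can be sketched in C as a comparison that
folds case and skips whitespace, hyphens and underscores.
The function names below are hypothetical, and the code only
illustrates the rule; it is not the matcher used by the library.
.Bd -literal -offset indent
```c
#include <ctype.h>
#include <stdbool.h>

/* Characters that are ignored when comparing event names. */
static bool
is_ignored(char c)
{
	return (c == '_' || c == '-' || isspace((unsigned char)c));
}

/*
 * Compare two event names, ignoring case as well as whitespace,
 * hyphens and underscores.  Returns true if they name the same
 * event under that rule.
 */
static bool
event_name_equal(const char *a, const char *b)
{
	for (;;) {
		/* Skip ignored characters on both sides. */
		while (*a != 0 && is_ignored(*a))
			a++;
		while (*b != 0 && is_ignored(*b))
			b++;

		/* Equal only if both names end together. */
		if (*a == 0 || *b == 0)
			return (*a == *b);

		if (tolower((unsigned char)*a) !=
		    tolower((unsigned char)*b))
			return (false);
		a++;
		b++;
	}
}
```
.Ed
Under this rule
.Dq Li Br_Instr_Ret
and
.Dq Li "br instr-ret"
compare equal.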
.Pp
Core PMCs support the following events:
.Bl -tag -width indent
.It Li BAClears
.Pq Event E6H , Umask 00H
The number of BAClear conditions asserted.
.It Li BTB_Misses
.Pq Event E2H , Umask 00H
The number of branches for which the branch target buffer did not
produce a prediction.
.It Li Br_BAC_Missp_Exec
.Pq Event 8AH , Umask 00H
The number of branch instructions executed that were mispredicted at
the front end.
.It Li Br_Bogus
.Pq Event E4H , Umask 00H
The number of bogus branches.
.It Li Br_Call_Exec
.Pq Event 92H , Umask 00H
The number of
.Li CALL
instructions executed.
.It Li Br_Call_Missp_Exec
.Pq Event 93H , Umask 00H
The number of
.Li CALL
instructions executed that were mispredicted.
.It Li Br_Cnd_Exec
.Pq Event 8BH , Umask 00H
The number of conditional branch instructions executed.
.It Li Br_Cnd_Missp_Exec
.Pq Event 8CH , Umask 00H
The number of conditional branch instructions executed that were mispredicted.
.It Li Br_Ind_Call_Exec
.Pq Event 94H , Umask 00H
The number of indirect
.Li CALL
instructions executed.
.It Li Br_Ind_Exec
.Pq Event 8DH , Umask 00H
The number of indirect branches executed.
.It Li Br_Ind_Missp_Exec
.Pq Event 8EH , Umask 00H
The number of indirect branch instructions executed that were mispredicted.
.It Li Br_Inst_Exec
.Pq Event 88H , Umask 00H
The number of branch instructions executed including speculative branches.
.It Li Br_Instr_Decoded
.Pq Event E0H , Umask 00H
The number of branch instructions decoded.
.It Li Br_Instr_Ret
.Pq Event C4H , Umask 00H
.Pq Alias Qq "Branch Instruction Retired"
The number of branch instructions retired.
This is an architectural performance event.
.It Li Br_MisPred_Ret
.Pq Event C5H , Umask 00H
.Pq Alias Qq "Branch Misses Retired"
The number of mispredicted branch instructions retired.
This is an architectural performance event.
.It Li Br_MisPred_Taken_Ret
.Pq Event CAH , Umask 00H
The number of taken and mispredicted branches retired.
.It Li Br_Missp_Exec
.Pq Event 89H , Umask 00H
The number of branch instructions executed and mispredicted at
execution including branches that were not predicted.
.It Li Br_Ret_BAC_Missp_Exec
.Pq Event 91H , Umask 00H
The number of return branch instructions that were mispredicted at the
front end.
.It Li Br_Ret_Exec
.Pq Event 8FH , Umask 00H
The number of return branch instructions executed.
.It Li Br_Ret_Missp_Exec
.Pq Event 90H , Umask 00H
The number of return branch instructions executed that were mispredicted.
.It Li Br_Taken_Ret
.Pq Event C9H , Umask 00H
The number of taken branches retired.
.It Li Bus_BNR_Clocks
.Pq Event 61H , Umask 00H
The number of external bus cycles while BNR (bus not ready) was asserted.
.It Li Bus_DRDY_Clocks Op ,agent= Ns Ar agent
.Pq Event 62H , Umask 00H
The number of external bus cycles while DRDY was asserted.
.It Li Bus_Data_Rcv
.Pq Event 64H , Umask 40H
.\" XXX Using the description in Core2 PMC documentation.
The number of cycles during which the processor is busy receiving data.
.It Li Bus_Locks_Clocks Op ,core= Ns Ar core
.Pq Event 63H
The number of external bus cycles while the bus lock signal was asserted.
.It Li Bus_Not_In_Use Op ,core= Ns Ar core
.Pq Event 7DH
The number of cycles when there is no transaction from the core.
.It Li Bus_Req_Outstanding Xo
.Op ,agent= Ns Ar agent
.Op ,core= Ns Ar core
.Xc
.Pq Event 60H
The weighted cycles of cacheable bus data read requests
from the data cache unit or hardware prefetcher.
.It Li Bus_Snoop_Stall
.Pq Event 7EH , Umask 00H
The number of bus cycles while a bus snoop is stalled.
.It Li Bus_Snoops Xo
.Op ,agent= Ns Ar agent
.Op ,cachestate= Ns Ar mesi
.Xc
.Pq Event 77H
.\" XXX Using the description in Core2 PMC documentation.
The number of snoop responses to bus transactions.
.It Li Bus_Trans_Any Op ,agent= Ns Ar agent
.Pq Event 70H
The number of completed bus transactions.
.It Li Bus_Trans_Brd Op ,core= Ns Ar core
.Pq Event 65H
The number of read bus transactions.
.It Li Bus_Trans_Burst Op ,agent= Ns Ar agent
.Pq Event 6EH
The number of completed burst transactions.
Retried transactions may be counted more than once.
.It Li Bus_Trans_Def Op ,core= Ns Ar core
.Pq Event 6DH
The number of completed deferred transactions.
.It Li Bus_Trans_IO Xo
.Op ,agent= Ns Ar agent
.Op ,core= Ns Ar core
.Xc
.Pq Event 6CH
The number of completed I/O transactions counting both reads and
writes.
.It Li Bus_Trans_Ifetch Xo
.Op ,agent= Ns Ar agent
.Op ,core= Ns Ar core
.Xc
.Pq Event 68H
The number of completed instruction fetch transactions.
.It Li Bus_Trans_Inval Xo
.Op ,agent= Ns Ar agent
.Op ,core= Ns Ar core
.Xc
.Pq Event 69H
The number of completed invalidate transactions.
.It Li Bus_Trans_Mem Op ,agent= Ns Ar agent
.Pq Event 6FH
The number of completed memory transactions.
.It Li Bus_Trans_P Xo
.Op ,agent= Ns Ar agent
.Op ,core= Ns Ar core
.Xc
.Pq Event 6BH
The number of completed partial transactions.
.It Li Bus_Trans_Pwr Xo
.Op ,agent= Ns Ar agent
.Op ,core= Ns Ar core
.Xc
.Pq Event 6AH
The number of completed partial write transactions.
.It Li Bus_Trans_RFO Xo
.Op ,agent= Ns Ar agent
.Op ,core= Ns Ar core
.Xc
.Pq Event 66H
The number of completed read-for-ownership transactions.
.It Li Bus_Trans_WB Op ,agent= Ns Ar agent
.Pq Event 67H
The number of completed writeback transactions from the data cache
unit, excluding L2 writebacks.
.It Li Cycles_Div_Busy
.Pq Event 14H , Umask 00H
The number of cycles the divider is busy.
The event is only available on PMC0.
.It Li Cycles_Int_Masked
.Pq Event C6H , Umask 00H
The number of cycles while interrupts were disabled.
.It Li Cycles_Int_Pending_Masked
.Pq Event C7H , Umask 00H
The number of cycles while interrupts were disabled and interrupts
were pending.
.It Li DCU_Snoop_To_Share Op ,core= Ns Ar core
.Pq Event 78H
The number of data cache unit snoops to L1 cache lines in the shared
state.
.It Li DCache_Cache_Lock Op ,cachestate= Ns Ar mesi
.\" XXX needs clarification
.Pq Event 42H
The number of cacheable locked read operations to invalid state.
.It Li DCache_Cache_LD Op ,cachestate= Ns Ar mesi
.Pq Event 40H
The number of cacheable L1 data read operations.
.It Li DCache_Cache_ST Op ,cachestate= Ns Ar mesi
.Pq Event 41H
The number of cacheable L1 data write operations.
.It Li DCache_M_Evict
.Pq Event 47H , Umask 00H
The number of M state data cache lines that were evicted.
.It Li DCache_M_Repl
.Pq Event 46H , Umask 00H
The number of M state data cache lines that were allocated.
.It Li DCache_Pend_Miss
.Pq Event 48H , Umask 00H
The weighted cycles an L1 miss was outstanding.
.It Li DCache_Repl
.Pq Event 45H , Umask 0FH
The number of data cache line replacements.
.It Li Data_Mem_Cache_Ref
.Pq Event 44H , Umask 02H
The number of cacheable read and write operations to L1 data cache.
.It Li Data_Mem_Ref
.Pq Event 43H , Umask 01H
The number of L1 data reads and writes, both cacheable and
uncacheable.
.It Li Dbus_Busy Op ,core= Ns Ar core
.Pq Event 22H
The number of core cycles during which the data bus was busy.
.It Li Dbus_Busy_Rd Op ,core= Ns Ar core
.Pq Event 23H
The number of cycles during which the data bus was busy transferring
data to a core.
.It Li Div
.Pq Event 13H , Umask 00H
The number of divide operations including speculative operations for
integer and floating point divides.
This event can only be counted on PMC1.
.It Li Dtlb_Miss
.Pq Event 49H , Umask 00H
The number of data references that missed the TLB.
.It Li ESP_Uops
.Pq Event D7H , Umask 00H
The number of ESP folding instructions decoded.
.It Li EST_Trans Op ,trans= Ns Ar transition
.Pq Event 3AH
Count the number of Intel Enhanced SpeedStep transitions.
The argument
.Ar transition
can be one of the following values:
.Bl -tag -width indent -compact
.It Li any
(Umask 00H) Count all transitions.
.It Li frequency
(Umask 01H) Count frequency transitions.
.El
The default is
.Dq Li any .
.It Li FP_Assist
.Pq Event 11H , Umask 00H
The number of floating point operations that required microcode
assists.
The event is only available on PMC1.
.It Li FP_Comp_Instr_Ret
.Pq Event C1H , Umask 00H
The number of X87 floating point compute instructions retired.
The event is only available on PMC0.
.It Li FP_Comps_Op_Exe
.Pq Event 10H , Umask 00H
The number of floating point computational instructions executed.
.It Li FP_MMX_Trans
.Pq Event CCH , Umask 01H
The number of transitions from X87 to MMX.
.It Li Fused_Ld_Uops_Ret
.Pq Event DAH , Umask 01H
The number of fused load uops retired.
.It Li Fused_St_Uops_Ret
.Pq Event DAH , Umask 02H
The number of fused store uops retired.
.It Li Fused_Uops_Ret
.Pq Event DAH , Umask 00H
The number of fused uops retired.
.It Li HW_Int_Rx
.Pq Event C8H , Umask 00H
The number of hardware interrupts received.
.It Li ICache_Misses
.Pq Event 81H , Umask 00H
The number of instruction fetch misses in the instruction cache and
streaming buffers.
.It Li ICache_Reads
.Pq Event 80H , Umask 00H
The number of instruction fetches from the instruction cache and
streaming buffers counting both cacheable and uncacheable fetches.
.It Li IFU_Mem_Stall
.Pq Event 86H , Umask 00H
The number of cycles the instruction fetch unit was stalled while
waiting for data from memory.
.It Li ILD_Stall
.Pq Event 87H , Umask 00H
The number of instruction length decoder stalls.
.It Li ITLB_Misses
.Pq Event 85H , Umask 00H
The number of instruction TLB misses.
.It Li Instr_Decoded
.Pq Event D0H , Umask 00H
The number of instructions decoded.
.It Li Instr_Ret
.Pq Event C0H , Umask 00H
.Pq Alias Qq "Instruction Retired"
The number of instructions retired.
This is an architectural performance event.
.It Li L1_Pref_Req
.Pq Event 4FH , Umask 00H
The number of L1 prefetch requests due to data cache misses.
.It Li L2_ADS Op ,core= Ns Ar core
.Pq Event 21H
The number of L2 address strobes.
.It Li L2_IFetch Xo
.Op ,cachestate= Ns Ar mesi
.Op ,core= Ns Ar core
.Xc
.Pq Event 28H
The number of instruction fetches by the instruction fetch unit from
L2 cache including speculative fetches.
.It Li L2_LD Xo
.Op ,cachestate= Ns Ar mesi
.Op ,core= Ns Ar core
.Xc
.Pq Event 29H
The number of L2 cache reads.
.It Li L2_Lines_In Xo
.Op ,core= Ns Ar core
.Op ,prefetch= Ns Ar prefetch
.Xc
.Pq Event 24H
The number of L2 cache lines allocated.
.It Li L2_Lines_Out Xo
.Op ,core= Ns Ar core
.Op ,prefetch= Ns Ar prefetch
.Xc
.Pq Event 26H
The number of L2 cache lines evicted.
.It Li L2_M_Lines_In Op ,core= Ns Ar core
.Pq Event 25H
The number of L2 M state cache lines allocated.
.It Li L2_M_Lines_Out Xo
.Op ,core= Ns Ar core
.Op ,prefetch= Ns Ar prefetch
.Xc
.Pq Event 27H
The number of L2 M state cache lines evicted.
.It Li L2_No_Request_Cycles Xo
.Op ,cachestate= Ns Ar mesi
.Op ,core= Ns Ar core
.Op ,prefetch= Ns Ar prefetch
.Xc
.Pq Event 32H
The number of cycles there was no request to access L2 cache.
.It Li L2_Reject_Cycles Xo
.Op ,cachestate= Ns Ar mesi
.Op ,core= Ns Ar core
.Op ,prefetch= Ns Ar prefetch
.Xc
.Pq Event 30H
The number of cycles the L2 cache was busy and rejecting new requests.
.It Li L2_Rqsts Xo
.Op ,cachestate= Ns Ar mesi
.Op ,core= Ns Ar core
.Op ,prefetch= Ns Ar prefetch
.Xc
.Pq Event 2EH
The number of L2 cache requests.
.It Li L2_ST Xo
.Op ,cachestate= Ns Ar mesi
.Op ,core= Ns Ar core
.Xc
.Pq Event 2AH
The number of L2 cache writes including speculative writes.
.It Li LD_Blocks
.Pq Event 03H , Umask 00H
The number of load operations delayed due to store buffer blocks.
.It Li LLC_Misses
.Pq Event 2EH , Umask 41H
The number of cache misses for references to the last level cache,
excluding misses due to hardware prefetches.
This is an architectural performance event.
.It Li LLC_Reference
.Pq Event 2EH , Umask 4FH
The number of references to the last level cache,
excluding those due to hardware prefetches.
This is an architectural performance event.
.It Li MMX_Assist
.Pq Event CDH , Umask 00H
The number of EMMS instructions executed.
.It Li MMX_FP_Trans
.Pq Event CCH , Umask 00H
The number of transitions from MMX to X87.
.It Li MMX_Instr_Exec
.Pq Event B0H , Umask 00H
The number of MMX instructions executed excluding
.Li MOVQ
and
.Li MOVD
stores.
.It Li MMX_Instr_Ret
.Pq Event CEH , Umask 00H
The number of MMX instructions retired.
.It Li Misalign_Mem_Ref
.Pq Event 05H , Umask 00H
The number of misaligned data memory references, counting loads and
stores.
.It Li Mul
.Pq Event 12H , Umask 00H
The number of multiply operations including speculative floating point
and integer multiplies.
This event is available on PMC1 only.
.It Li NonHlt_Ref_Cycles
.Pq Event 3CH , Umask 01H
.Pq Alias Qq "Unhalted Reference Cycles"
The number of non-halted bus cycles.
This is an architectural performance event.
.It Li Pref_Rqsts_Dn
.Pq Event F8H , Umask 00H
The number of hardware prefetch requests issued in backward streams.
.It Li Pref_Rqsts_Up
.Pq Event F0H , Umask 00H
The number of hardware prefetch requests issued in forward streams.
.It Li Resource_Stall
.Pq Event A2H , Umask 00H
The number of cycles where there is a resource related stall.
.It Li SD_Drains
.Pq Event 04H , Umask 00H
The number of cycles while draining store buffers.
.It Li SIMD_FP_DP_P_Ret
.Pq Event D8H , Umask 02H
The number of SSE/SSE2 packed double precision instructions retired.
.It Li SIMD_FP_DP_P_Comp_Ret
.Pq Event D9H , Umask 02H
The number of SSE/SSE2 packed double precision compute instructions
retired.
.It Li SIMD_FP_DP_S_Ret
.Pq Event D8H , Umask 03H
The number of SSE/SSE2 scalar double precision instructions retired.
.It Li SIMD_FP_DP_S_Comp_Ret
.Pq Event D9H , Umask 03H
The number of SSE/SSE2 scalar double precision compute instructions
retired.
.It Li SIMD_FP_SP_P_Comp_Ret
.Pq Event D9H , Umask 00H
The number of SSE/SSE2 packed single precision compute instructions
retired.
.It Li SIMD_FP_SP_Ret
.Pq Event D8H , Umask 00H
The number of SSE/SSE2 single precision instructions retired,
both packed and scalar.
.It Li SIMD_FP_SP_S_Ret
.Pq Event D8H , Umask 01H
The number of SSE/SSE2 scalar single precision instructions retired.
.It Li SIMD_FP_SP_S_Comp_Ret
.Pq Event D9H , Umask 01H
The number of SSE/SSE2 scalar single precision compute instructions
retired.
.It Li SIMD_Int_128_Ret
.Pq Event D8H , Umask 04H
The number of SSE2 128-bit integer instructions retired.
.It Li SIMD_Int_Pari_Exec
.Pq Event B3H , Umask 20H
The number of SIMD integer packed arithmetic instructions executed.
.It Li SIMD_Int_Pck_Exec
.Pq Event B3H , Umask 04H
The number of SIMD integer pack instructions executed.
.It Li SIMD_Int_Plog_Exec
.Pq Event B3H , Umask 10H
The number of SIMD integer packed logical instructions executed.
.It Li SIMD_Int_Pmul_Exec
.Pq Event B3H , Umask 01H
The number of SIMD integer packed multiply instructions executed.
.It Li SIMD_Int_Psft_Exec
.Pq Event B3H , Umask 02H
The number of SIMD integer packed shift instructions executed.
.It Li SIMD_Int_Sat_Exec
.Pq Event B1H , Umask 00H
The number of SIMD integer saturating instructions executed.
.It Li SIMD_Int_Upck_Exec
.Pq Event B3H , Umask 08H
The number of SIMD integer unpack instructions executed.
.It Li SMC_Detected
.Pq Event C3H , Umask 00H
The number of times self-modifying code was detected.
.It Li SSE_NTStores_Miss
.Pq Event 4BH , Umask 03H
The number of times an SSE streaming store instruction missed all caches.
.It Li SSE_NTStores_Ret
.Pq Event 07H , Umask 03H
The number of SSE streaming store instructions executed.
.It Li SSE_PrefNta_Miss
.Pq Event 4BH , Umask 00H
The number of times
.Li PREFETCHNTA
missed all caches.
.It Li SSE_PrefNta_Ret
.Pq Event 07H , Umask 00H
The number of
.Li PREFETCHNTA
instructions retired.
.It Li SSE_PrefT1_Miss
.Pq Event 4BH , Umask 01H
The number of times
.Li PREFETCHT1
missed all caches.
.It Li SSE_PrefT1_Ret
.Pq Event 07H , Umask 01H
The number of
.Li PREFETCHT1
instructions retired.
.It Li SSE_PrefT2_Miss
.Pq Event 4BH , Umask 02H
The number of times
.Li PREFETCHT2
missed all caches.
.It Li SSE_PrefT2_Ret
.Pq Event 07H , Umask 02H
The number of
.Li PREFETCHT2
instructions retired.
.It Li Seg_Reg_Loads
.Pq Event 06H , Umask 00H
The number of segment register loads.
.It Li Serial_Execution_Cycles
.Pq Event 3CH , Umask 02H
The number of non-halted bus cycles of this core while the other core
was halted.
.It Li Thermal_Trip
.Pq Event 3BH , Umask C0H
The time spent in a thermal trip, measured in cycles of the current
core clock.
.It Li Unfusion
.Pq Event DBH , Umask 00H
The number of unfusion events.
.It Li Unhalted_Core_Cycles
.Pq Event 3CH , Umask 00H
The number of core clock cycles when the clock signal on a specific
core is not halted.
This is an architectural performance event.
.It Li Uops_Ret
.Pq Event C2H , Umask 00H
The number of micro-ops retired.
.El
.Ss Event Name Aliases
The following table shows the mapping between the PMC-independent
aliases supported by
.Lb libpmc
and the underlying hardware events used.
.Bl -column "branch-mispredicts" "Description"
.It Em Alias Ta Em Event
.It Li branches Ta Li Br_Instr_Ret
.It Li branch-mispredicts Ta Li Br_MisPred_Ret
.It Li dc-misses Ta (unsupported)
.It Li ic-misses Ta Li ICache_Misses
.It Li instructions Ta Li Instr_Ret
.It Li interrupts Ta Li HW_Int_Rx
.It Li unhalted-cycles Ta (unsupported)
.El
.Sh PROCESSOR ERRATA
The following errata affect performance measurement on these
processors.
These errata are documented in
.Rs
.%T "Intel(R) Core(TM) Duo Processor and Intel(R) Core(TM) Solo Processor on 65 nm Process"
.%B "Specification Update"
.%N "Order Number 309222-017"
.%D July 2008
.%Q "Intel Corporation"
.Re
.Bl -tag -width indent -compact
.It AE19
Data prefetch performance monitoring events can only be enabled
on a single core.
.It AE25
Performance monitoring counters that count external bus events
may report incorrect values after processor power state transitions.
.It AE28
Performance monitoring events for retired floating point operations
(C1H) may not be accurate.
.It AE29
DR3 address match on MOVD/MOVQ/MOVNTQ memory store
instruction may incorrectly increment performance monitoring count
for saturating SIMD instructions retired (Event CFH).
.It AE33
Hardware prefetch performance monitoring events may be counted
inaccurately.
.It AE36
The
.Li CPU_CLK_UNHALTED
performance monitoring event (Event 3CH) counts
clocks when the processor is in the C1/C2 processor power states.
.It AE39
Certain performance monitoring counters related to bus, L2 cache
and power management are inaccurate.
.It AE51
Performance monitoring events for retired instructions (Event C0H) may
not be accurate.
.It AE67
Performance monitoring event
.Li FP_ASSIST
may not be accurate.
.It AE78
Performance monitoring event for hardware prefetch requests (Event
4EH) and hardware prefetch request cache misses (Event 4FH) may not be
accurate.
.It AE82
Performance monitoring event
.Li FP_MMX_TRANS_TO_MMX
may not count some transitions.
.El
.Sh SEE ALSO
.Xr pmc 3 ,
.Xr pmc.atom 3 ,
.Xr pmc.core2 3 ,
.Xr pmc.iaf 3 ,
.Xr pmc.k7 3 ,
.Xr pmc.k8 3 ,
.Xr pmc.p4 3 ,
.Xr pmc.p5 3 ,
.Xr pmc.p6 3 ,
.Xr pmc.tsc 3 ,
.Xr pmclog 3 ,
.Xr hwpmc 4
.Sh HISTORY
The
.Nm pmc
library first appeared in
.Fx 6.0 .
.Sh AUTHORS
The
.Lb libpmc
library was written by
.An "Joseph Koshy"
.Aq jkoshy@FreeBSD.org .