.\" Copyright (c) 2008 Joseph Koshy. All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\"
.\" This software is provided by Joseph Koshy ``as is'' and
.\" any express or implied warranties, including, but not limited to, the
.\" implied warranties of merchantability and fitness for a particular purpose
.\" are disclaimed. in no event shall Joseph Koshy be liable
.\" for any direct, indirect, incidental, special, exemplary, or consequential
.\" damages (including, but not limited to, procurement of substitute goods
.\" or services; loss of use, data, or profits; or business interruption)
.\" however caused and on any theory of liability, whether in contract, strict
.\" liability, or tort (including negligence or otherwise) arising in any way
.\" out of the use of this software, even if advised of the possibility of
.\" such damage.
.\"
.\" $FreeBSD$
.\"
.Dd November 12, 2008
.Dt PMC.CORE 3
.Os
.Sh NAME
.Nm pmc.core
.Nd measurement events for
.Tn Intel
.Tn Core Solo
and
.Tn Core Duo
family CPUs
.Sh LIBRARY
.Lb libpmc
.Sh SYNOPSIS
.In pmc.h
.Sh DESCRIPTION
.Tn Intel
.Tn "Core Solo"
and
.Tn "Core Duo"
CPUs contain PMCs conforming to version 1 of the
.Tn Intel
performance measurement architecture.
.Pp
These PMCs are documented in
.Rs
.%B "IA-32 Intel(R) Architecture Software Developer's Manual"
.%T "Volume 3: System Programming Guide"
.%N "Order Number 253669-027US"
.%D July 2008
.%Q "Intel Corporation"
.Re
.Ss PMC Features
CPUs conforming to version 1 of the
.Tn Intel
performance measurement architecture contain two programmable PMCs of
class
.Li PMC_CLASS_IAP .
The PMCs are 40 bits wide and offer the following capabilities:
.Bl -column "PMC_CAP_INTERRUPT" "Support"
.It Em Capability Ta Em Support
.It PMC_CAP_CASCADE Ta \&No
.It PMC_CAP_EDGE Ta Yes
.It PMC_CAP_INTERRUPT Ta Yes
.It PMC_CAP_INVERT Ta Yes
.It PMC_CAP_READ Ta Yes
.It PMC_CAP_PRECISE Ta \&No
.It PMC_CAP_SYSTEM Ta Yes
.It PMC_CAP_TAGGING Ta \&No
.It PMC_CAP_THRESHOLD Ta Yes
.It PMC_CAP_USER Ta Yes
.It PMC_CAP_WRITE Ta Yes
.El
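.Pp
Because the counters are 40 bits wide, a raw hardware count wraps
around after 2^40 events.
The following C fragment is a minimal sketch, not code taken from the
library, of how the difference between two raw readings could be
computed.
The names
.Li CORE_PMC_WIDTH ,
.Li CORE_PMC_MASK
and
.Li core_pmc_delta
are hypothetical, and the sketch assumes that the counter wrapped at
most once between the two readings:
.Bd -literal -offset indent
#include <stdint.h>

/* Assumed width of a Core PMC, as stated above. */
#define CORE_PMC_WIDTH 40
#define CORE_PMC_MASK ((UINT64_C(1) << CORE_PMC_WIDTH) - 1)

/*
 * Events counted between two raw 40-bit readings, assuming the
 * counter wrapped at most once in between.
 */
static uint64_t
core_pmc_delta(uint64_t before, uint64_t after)
{
	return ((after - before) & CORE_PMC_MASK);
}
.Ed
.Pp
For example, a reading of 0xFFFFFFFFF0 followed by a reading of 0x10
yields a difference of 0x20.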
.Ss Event Qualifiers
Event specifiers for these PMCs support the following common
qualifiers:
.Bl -tag -width indent
.It Li cmask= Ns Ar value
Configure the PMC to increment only if the number of configured
events measured in a cycle is greater than or equal to
.Ar value .
.It Li edge
Configure the PMC to count the number of de-asserted to asserted
transitions of the conditions expressed by the other qualifiers.
If specified, the counter will increment only once whenever a
condition becomes true, irrespective of the number of clocks during
which the condition remains true.
.It Li inv
Invert the sense of comparison when the
.Dq Li cmask
qualifier is present, making the counter increment when the number of
events per cycle is less than the value specified by the
.Dq Li cmask
qualifier.
.It Li os
Configure the PMC to count events happening at processor privilege
level 0.
.It Li usr
Configure the PMC to count events occurring at privilege levels 1, 2
or 3.
.El
.Pp
If neither the
.Dq Li os
nor the
.Dq Li usr
qualifier is specified, the default is to enable both.
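.Pp
The interaction of the
.Dq Li cmask ,
.Dq Li inv
and
.Dq Li edge
qualifiers can be illustrated with a small software model.
The C function below is a hypothetical sketch, not code from the
library; the name
.Li core_model_count
and its parameters are illustrative only.
It takes the number of events seen in each cycle and returns the value
a counter configured with these qualifiers would be expected to report.
It assumes that the counter increments by one in each qualifying cycle
and that the condition is de-asserted before the first cycle.
As a simplification, a
.Dq Li cmask
of zero is treated as plain event accumulation, ignoring
.Dq Li inv
and
.Dq Li edge :
.Bd -literal -offset indent
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Software model of the cmask, inv and edge qualifiers.
 * events[i] is the number of events observed in cycle i.
 */
static uint64_t
core_model_count(const unsigned *events, size_t ncycles, unsigned cmask,
    bool inv, bool edge)
{
	uint64_t count = 0;
	bool prev = false;	/* assumption: starts de-asserted */
	size_t i;

	for (i = 0; i < ncycles; i++) {
		bool cond;

		if (cmask == 0) {
			/* Simplification: no threshold, add events. */
			count += events[i];
			continue;
		}
		/* True when events >= cmask; inv flips the test. */
		cond = inv ? (events[i] < cmask) : (events[i] >= cmask);
		if (edge) {
			/* Count de-asserted to asserted transitions. */
			if (cond && !prev)
				count++;
		} else if (cond)
			count++;
		prev = cond;
	}
	return (count);
}
.Ed
.Pp
For per-cycle event counts of 0, 2, 3, 1, 0 and 4 with
.Dq Li cmask=2 ,
the model returns 3; adding
.Dq Li edge
reduces this to 2, because cycles two and three form a single
assertion of the condition.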
.Pp
Events that require core-specificity to be specified use an
additional qualifier
.Dq Li core= Ns Ar value ,
where argument
.Ar value
is one of:
.Bl -tag -width indent -compact
.It Li all
Measure event conditions on all cores.
.It Li this
Measure event conditions on this core.
.El
The default is
.Dq Li this .
.Pp
Events that require an agent qualifier to be specified use an
additional qualifier
.Dq Li agent= Ns Ar value ,
where argument
.Ar value
is one of:
.Bl -tag -width indent -compact
.It Li this
Measure events associated with this bus agent.
.It Li any
Measure events caused by any bus agent.
.El
The default is
.Dq Li this .
.Pp
Events that require a hardware prefetch qualifier to be specified use an
additional qualifier
.Dq Li prefetch= Ns Ar value ,
where argument
.Ar value
is one of:
.Bl -tag -width "exclude" -compact
.It Li both
Include all prefetches.
.It Li only
Only count hardware prefetches.
.It Li exclude
Exclude hardware prefetches.
.El
The default is
.Dq Li both .
.Pp
Events that require a cache coherence qualifier to be specified use an
additional qualifier
.Dq Li cachestate= Ns Ar value ,
where argument
.Ar value
contains one or more of the following letters:
.Bl -tag -width indent -compact
.It Li e
Count cache lines in the exclusive state.
.It Li i
Count cache lines in the invalid state.
.It Li m
Count cache lines in the modified state.
.It Li s
Count cache lines in the shared state.
.El
The default is
.Dq Li eims .
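.Pp
Qualifiers are appended to an event name as a comma-separated list.
The C fragment below is a minimal sketch of how an application might
assemble and validate such a string for an event that accepts both the
.Dq Li cachestate
and
.Dq Li core
qualifiers.
The function
.Li core_build_spec
is hypothetical and not part of the library; it assumes that qualifier
values are written in lower case, as listed above:
.Bd -literal -offset indent
#include <stdio.h>
#include <string.h>

/*
 * Build "<event>,cachestate=<letters>,core=<core>" into buf.
 * Returns 0 on success, -1 on a bad argument or a short buffer.
 */
static int
core_build_spec(char *buf, size_t len, const char *event,
    const char *cachestate, const char *core)
{
	int n;

	/* cachestate must be one or more of the letters e, i, m, s. */
	if (*cachestate == 0 ||
	    strspn(cachestate, "eims") != strlen(cachestate))
		return (-1);
	/* core must be one of the two documented values. */
	if (strcmp(core, "all") != 0 && strcmp(core, "this") != 0)
		return (-1);
	n = snprintf(buf, len, "%s,cachestate=%s,core=%s", event,
	    cachestate, core);
	if (n < 0 || (size_t)n >= len)
		return (-1);
	return (0);
}
.Ed
.Pp
Called with the arguments
.Dq Li L2_LD ,
.Dq Li ms
and
.Dq Li all ,
the sketch produces the string
.Dq Li L2_LD,cachestate=ms,core=all .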
.Ss Event Specifiers
The following event names are case insensitive.
Whitespace, hyphens and underscore characters in these names are
ignored.
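.Pp
The matching rule above can be sketched in C as follows.
This is an illustration only and is not the comparison routine used by
the library; the names
.Li core_name_skip
and
.Li core_event_name_equal
are hypothetical:
.Bd -literal -offset indent
#include <ctype.h>
#include <stdbool.h>

/* True for characters that are ignored in event names. */
static bool
core_name_skip(char c)
{
	return (c == '-' || c == '_' || isspace((unsigned char)c) != 0);
}

/*
 * Compare two event names, ignoring case, whitespace, hyphens
 * and underscores.
 */
static bool
core_event_name_equal(const char *a, const char *b)
{
	for (;;) {
		while (core_name_skip(*a))
			a++;
		while (core_name_skip(*b))
			b++;
		if (tolower((unsigned char)*a) !=
		    tolower((unsigned char)*b))
			return (false);
		if (*a == 0)
			return (true);
		a++;
		b++;
	}
}
.Ed
.Pp
Under this rule the spellings
.Dq Li Br_Instr_Ret ,
.Dq Li br-instr-ret
and
.Dq Li BRINSTRRET
all name the same event.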
.Pp
Core PMCs support the following events:
.Bl -tag -width indent
.It Li BAClears
.Pq Event E6H , Umask 00H
The number of BAClear conditions asserted.
.It Li BTB_Misses
.Pq Event E2H , Umask 00H
The number of branches for which the branch target buffer did not
produce a prediction.
.It Li Br_BAC_Missp_Exec
.Pq Event 8AH , Umask 00H
The number of branch instructions executed that were mispredicted at
the front end.
.It Li Br_Bogus
.Pq Event E4H , Umask 00H
The number of bogus branches.
.It Li Br_Call_Exec
.Pq Event 92H , Umask 00H
The number of
.Li CALL
instructions executed.
.It Li Br_Call_Missp_Exec
.Pq Event 93H , Umask 00H
The number of
.Li CALL
instructions executed that were mispredicted.
.It Li Br_Cnd_Exec
.Pq Event 8BH , Umask 00H
The number of conditional branch instructions executed.
.It Li Br_Cnd_Missp_Exec
.Pq Event 8CH , Umask 00H
The number of conditional branch instructions executed that were mispredicted.
.It Li Br_Ind_Call_Exec
.Pq Event 94H , Umask 00H
The number of indirect
.Li CALL
instructions executed.
.It Li Br_Ind_Exec
.Pq Event 8DH , Umask 00H
The number of indirect branches executed.
.It Li Br_Ind_Missp_Exec
.Pq Event 8EH , Umask 00H
The number of indirect branch instructions executed that were mispredicted.
.It Li Br_Inst_Exec
.Pq Event 88H , Umask 00H
The number of branch instructions executed including speculative branches.
.It Li Br_Instr_Decoded
.Pq Event E0H , Umask 00H
The number of branch instructions decoded.
.It Li Br_Instr_Ret
.Pq Event C4H , Umask 00H
.Pq Alias Qq "Branch Instruction Retired"
The number of branch instructions retired.
This is an architectural performance event.
.It Li Br_MisPred_Ret
.Pq Event C5H , Umask 00H
.Pq Alias Qq "Branch Misses Retired"
The number of mispredicted branch instructions retired.
This is an architectural performance event.
.It Li Br_MisPred_Taken_Ret
.Pq Event CAH , Umask 00H
The number of taken and mispredicted branches retired.
.It Li Br_Missp_Exec
.Pq Event 89H , Umask 00H
The number of branch instructions executed and mispredicted at
execution including branches that were not predicted.
.It Li Br_Ret_BAC_Missp_Exec
.Pq Event 91H , Umask 00H
The number of return branch instructions that were mispredicted at the
front end.
.It Li Br_Ret_Exec
.Pq Event 8FH , Umask 00H
The number of return branch instructions executed.
.It Li Br_Ret_Missp_Exec
.Pq Event 90H , Umask 00H
The number of return branch instructions executed that were mispredicted.
.It Li Br_Taken_Ret
.Pq Event C9H , Umask 00H
The number of taken branches retired.
.It Li Bus_BNR_Clocks
.Pq Event 61H , Umask 00H
The number of external bus cycles while BNR (bus not ready) was asserted.
.It Li Bus_DRDY_Clocks Op ,agent= Ns Ar agent
.Pq Event 62H , Umask 00H
The number of external bus cycles while DRDY was asserted.
.It Li Bus_Data_Rcv
.Pq Event 64H , Umask 40H
.\" XXX Using the description in Core2 PMC documentation.
The number of cycles during which the processor is busy receiving data.
.It Li Bus_Locks_Clocks Op ,core= Ns Ar core
.Pq Event 63H
The number of external bus cycles while the bus lock signal was asserted.
.It Li Bus_Not_In_Use Op ,core= Ns Ar core
.Pq Event 7DH
The number of cycles when there is no transaction from the core.
.It Li Bus_Req_Outstanding Xo
.Op ,agent= Ns Ar agent
.Op ,core= Ns Ar core
.Xc
.Pq Event 60H
The weighted cycles of cacheable bus data read requests
from the data cache unit or hardware prefetcher.
.It Li Bus_Snoop_Stall
.Pq Event 7EH , Umask 00H
The number of bus cycles while a bus snoop is stalled.
.It Li Bus_Snoops Xo
.Op ,agent= Ns Ar agent
.Op ,cachestate= Ns Ar mesi
.Xc
.Pq Event 77H
.\" XXX Using the description in Core2 PMC documentation.
The number of snoop responses to bus transactions.
.It Li Bus_Trans_Any Op ,agent= Ns Ar agent
.Pq Event 70H
The number of completed bus transactions.
.It Li Bus_Trans_Brd Op ,core= Ns Ar core
.Pq Event 65H
The number of read bus transactions.
.It Li Bus_Trans_Burst Op ,agent= Ns Ar agent
.Pq Event 6EH
The number of completed burst transactions.
Retried transactions may be counted more than once.
.It Li Bus_Trans_Def Op ,core= Ns Ar core
.Pq Event 6DH
The number of completed deferred transactions.
.It Li Bus_Trans_IO Xo
.Op ,agent= Ns Ar agent
.Op ,core= Ns Ar core
.Xc
.Pq Event 6CH
The number of completed I/O transactions counting both reads and
writes.
.It Li Bus_Trans_Ifetch Xo
.Op ,agent= Ns Ar agent
.Op ,core= Ns Ar core
.Xc
.Pq Event 68H
The number of completed instruction fetch transactions.
.It Li Bus_Trans_Inval Xo
.Op ,agent= Ns Ar agent
.Op ,core= Ns Ar core
.Xc
.Pq Event 69H
The number of completed invalidate transactions.
.It Li Bus_Trans_Mem Op ,agent= Ns Ar agent
.Pq Event 6FH
The number of completed memory transactions.
.It Li Bus_Trans_P Xo
.Op ,agent= Ns Ar agent
.Op ,core= Ns Ar core
.Xc
.Pq Event 6BH
The number of completed partial transactions.
.It Li Bus_Trans_Pwr Xo
.Op ,agent= Ns Ar agent
.Op ,core= Ns Ar core
.Xc
.Pq Event 6AH
The number of completed partial write transactions.
.It Li Bus_Trans_RFO Xo
.Op ,agent= Ns Ar agent
.Op ,core= Ns Ar core
.Xc
.Pq Event 66H
The number of completed read-for-ownership transactions.
.It Li Bus_Trans_WB Op ,agent= Ns Ar agent
.Pq Event 67H
The number of completed write-back transactions from the data cache
unit, excluding L2 write-backs.
.It Li Cycles_Div_Busy
.Pq Event 14H , Umask 00H
The number of cycles the divider is busy.
The event is only available on PMC0.
.It Li Cycles_Int_Masked
.Pq Event C6H , Umask 00H
The number of cycles while interrupts were disabled.
.It Li Cycles_Int_Pending_Masked
.Pq Event C7H , Umask 00H
The number of cycles while interrupts were disabled and interrupts
were pending.
.It Li DCU_Snoop_To_Share Op ,core= Ns Ar core
.Pq Event 78H
The number of data cache unit snoops to L1 cache lines in the shared
state.
.It Li DCache_Cache_Lock Op ,cachestate= Ns Ar mesi
.\" XXX needs clarification
.Pq Event 42H
The number of cacheable locked read operations to invalid state.
.It Li DCache_Cache_LD Op ,cachestate= Ns Ar mesi
.Pq Event 40H
The number of cacheable L1 data read operations.
.It Li DCache_Cache_ST Op ,cachestate= Ns Ar mesi
.Pq Event 41H
The number of cacheable L1 data write operations.
.It Li DCache_M_Evict
.Pq Event 47H , Umask 00H
The number of M state data cache lines that were evicted.
.It Li DCache_M_Repl
.Pq Event 46H , Umask 00H
The number of M state data cache lines that were allocated.
.It Li DCache_Pend_Miss
.Pq Event 48H , Umask 00H
The weighted cycles an L1 miss was outstanding.
.It Li DCache_Repl
.Pq Event 45H , Umask 0FH
The number of data cache line replacements.
.It Li Data_Mem_Cache_Ref
.Pq Event 44H , Umask 02H
The number of cacheable read and write operations to L1 data cache.
.It Li Data_Mem_Ref
.Pq Event 43H , Umask 01H
The number of L1 data reads and writes, both cacheable and
un-cacheable.
.It Li Dbus_Busy Op ,core= Ns Ar core
.Pq Event 22H
The number of core cycles during which the data bus was busy.
.It Li Dbus_Busy_Rd Op ,core= Ns Ar core
.Pq Event 23H
The number of cycles during which the data bus was busy transferring
data to a core.
.It Li Div
.Pq Event 13H , Umask 00H
The number of divide operations including speculative operations for
integer and floating point divides.
This event can only be counted on PMC1.
.It Li Dtlb_Miss
.Pq Event 49H , Umask 00H
The number of data references that missed the TLB.
.It Li ESP_Uops
.Pq Event D7H , Umask 00H
The number of ESP folding instructions decoded.
.It Li EST_Trans Op ,trans= Ns Ar transition
.Pq Event 3AH
Count the number of Intel Enhanced SpeedStep transitions.
The argument
.Ar transition
can be one of the following values:
.Bl -tag -width indent -compact
.It Li any
(Umask 00H) Count all transitions.
.It Li frequency
(Umask 01H) Count frequency transitions.
.El
The default is
.Dq Li any .
.It Li FP_Assist
.Pq Event 11H , Umask 00H
The number of floating point operations that required microcode
assists.
The event is only available on PMC1.
.It Li FP_Comp_Instr_Ret
.Pq Event C1H , Umask 00H
The number of X87 floating point compute instructions retired.
The event is only available on PMC0.
.It Li FP_Comps_Op_Exe
.Pq Event 10H , Umask 00H
The number of floating point computational instructions executed.
.It Li FP_MMX_Trans
.Pq Event CCH , Umask 01H
The number of transitions from X87 to MMX.
.It Li Fused_Ld_Uops_Ret
.Pq Event DAH , Umask 01H
The number of fused load uops retired.
.It Li Fused_St_Uops_Ret
.Pq Event DAH , Umask 02H
The number of fused store uops retired.
.It Li Fused_Uops_Ret
.Pq Event DAH , Umask 00H
The number of fused uops retired.
.It Li HW_Int_Rx
.Pq Event C8H , Umask 00H
The number of hardware interrupts received.
.It Li ICache_Misses
.Pq Event 81H , Umask 00H
The number of instruction fetch misses in the instruction cache and
streaming buffers.
.It Li ICache_Reads
.Pq Event 80H , Umask 00H
The number of instruction fetches from the instruction cache and
streaming buffers counting both cacheable and un-cacheable fetches.
.It Li IFU_Mem_Stall
.Pq Event 86H , Umask 00H
The number of cycles the instruction fetch unit was stalled while
waiting for data from memory.
.It Li ILD_Stall
.Pq Event 87H , Umask 00H
The number of instruction length decoder stalls.
.It Li ITLB_Misses
.Pq Event 85H , Umask 00H
The number of instruction TLB misses.
.It Li Instr_Decoded
.Pq Event D0H , Umask 00H
The number of instructions decoded.
.It Li Instr_Ret
.Pq Event C0H , Umask 00H
.Pq Alias Qq "Instruction Retired"
The number of instructions retired.
This is an architectural performance event.
.It Li L1_Pref_Req
.Pq Event 4FH , Umask 00H
The number of L1 prefetch requests due to data cache misses.
.It Li L2_ADS Op ,core= Ns Ar core
.Pq Event 21H
The number of L2 address strobes.
.It Li L2_IFetch Xo
.Op ,cachestate= Ns Ar mesi
.Op ,core= Ns Ar core
.Xc
.Pq Event 28H
The number of instruction fetches by the instruction fetch unit from
L2 cache including speculative fetches.
.It Li L2_LD Xo
.Op ,cachestate= Ns Ar mesi
.Op ,core= Ns Ar core
.Xc
.Pq Event 29H
The number of L2 cache reads.
.It Li L2_Lines_In Xo
.Op ,core= Ns Ar core
.Op ,prefetch= Ns Ar prefetch
.Xc
.Pq Event 24H
The number of L2 cache lines allocated.
.It Li L2_Lines_Out Xo
.Op ,core= Ns Ar core
.Op ,prefetch= Ns Ar prefetch
.Xc
.Pq Event 26H
The number of L2 cache lines evicted.
.It Li L2_M_Lines_In Op ,core= Ns Ar core
.Pq Event 25H
The number of L2 M state cache lines allocated.
.It Li L2_M_Lines_Out Xo
.Op ,core= Ns Ar core
.Op ,prefetch= Ns Ar prefetch
.Xc
.Pq Event 27H
The number of L2 M state cache lines evicted.
.It Li L2_No_Request_Cycles Xo
.Op ,cachestate= Ns Ar mesi
.Op ,core= Ns Ar core
.Op ,prefetch= Ns Ar prefetch
.Xc
.Pq Event 32H
The number of cycles there was no request to access L2 cache.
.It Li L2_Reject_Cycles Xo
.Op ,cachestate= Ns Ar mesi
.Op ,core= Ns Ar core
.Op ,prefetch= Ns Ar prefetch
.Xc
.Pq Event 30H
The number of cycles the L2 cache was busy and rejecting new requests.
.It Li L2_Rqsts Xo
.Op ,cachestate= Ns Ar mesi
.Op ,core= Ns Ar core
.Op ,prefetch= Ns Ar prefetch
.Xc
.Pq Event 2EH
The number of L2 cache requests.
.It Li L2_ST Xo
.Op ,cachestate= Ns Ar mesi
.Op ,core= Ns Ar core
.Xc
.Pq Event 2AH
The number of L2 cache writes including speculative writes.
.It Li LD_Blocks
.Pq Event 03H , Umask 00H
The number of load operations delayed due to store buffer blocks.
.It Li LLC_Misses
.Pq Event 2EH , Umask 41H
The number of cache misses for references to the last level cache,
excluding misses due to hardware prefetches.
This is an architectural performance event.
.It Li LLC_Reference
.Pq Event 2EH , Umask 4FH
The number of references to the last level cache,
excluding those due to hardware prefetches.
This is an architectural performance event.
.It Li MMX_Assist
.Pq Event CDH , Umask 00H
The number of EMMS instructions executed.
.It Li MMX_FP_Trans
.Pq Event CCH , Umask 00H
The number of transitions from MMX to X87.
.It Li MMX_Instr_Exec
.Pq Event B0H , Umask 00H
The number of MMX instructions executed excluding
.Li MOVQ
and
.Li MOVD
stores.
.It Li MMX_Instr_Ret
.Pq Event CEH , Umask 00H
The number of MMX instructions retired.
.It Li Misalign_Mem_Ref
.Pq Event 05H , Umask 00H
The number of misaligned data memory references, counting loads and
stores.
.It Li Mul
.Pq Event 12H , Umask 00H
The number of multiply operations including speculative floating point
and integer multiplies.
This event is available on PMC1 only.
.It Li NonHlt_Ref_Cycles
.Pq Event 3CH , Umask 01H
.Pq Alias Qq "Unhalted Reference Cycles"
The number of non-halted bus cycles.
This is an architectural performance event.
.It Li Pref_Rqsts_Dn
.Pq Event F8H , Umask 00H
The number of hardware prefetch requests issued in backward streams.
.It Li Pref_Rqsts_Up
.Pq Event F0H , Umask 00H
The number of hardware prefetch requests issued in forward streams.
.It Li Resource_Stall
.Pq Event A2H , Umask 00H
The number of cycles where there is a resource related stall.
.It Li SD_Drains
.Pq Event 04H , Umask 00H
The number of cycles while draining store buffers.
.It Li SIMD_FP_DP_P_Ret
.Pq Event D8H , Umask 02H
The number of SSE/SSE2 packed double precision instructions retired.
.It Li SIMD_FP_DP_P_Comp_Ret
.Pq Event D9H , Umask 02H
The number of SSE/SSE2 packed double precision compute instructions
retired.
.It Li SIMD_FP_DP_S_Ret
.Pq Event D8H , Umask 03H
The number of SSE/SSE2 scalar double precision instructions retired.
.It Li SIMD_FP_DP_S_Comp_Ret
.Pq Event D9H , Umask 03H
The number of SSE/SSE2 scalar double precision compute instructions
retired.
.It Li SIMD_FP_SP_P_Comp_Ret
.Pq Event D9H , Umask 00H
The number of SSE/SSE2 packed single precision compute instructions
retired.
.It Li SIMD_FP_SP_Ret
.Pq Event D8H , Umask 00H
The number of SSE/SSE2 single precision instructions retired,
both packed and scalar.
.It Li SIMD_FP_SP_S_Ret
.Pq Event D8H , Umask 01H
The number of SSE/SSE2 scalar single precision instructions retired.
.It Li SIMD_FP_SP_S_Comp_Ret
.Pq Event D9H , Umask 01H
The number of SSE/SSE2 scalar single precision compute instructions
retired.
.It Li SIMD_Int_128_Ret
.Pq Event D8H , Umask 04H
The number of SSE2 128-bit integer instructions retired.
.It Li SIMD_Int_Pari_Exec
.Pq Event B3H , Umask 20H
The number of SIMD integer packed arithmetic instructions executed.
.It Li SIMD_Int_Pck_Exec
.Pq Event B3H , Umask 04H
The number of SIMD integer pack instructions executed.
.It Li SIMD_Int_Plog_Exec
.Pq Event B3H , Umask 10H
The number of SIMD integer packed logical instructions executed.
.It Li SIMD_Int_Pmul_Exec
.Pq Event B3H , Umask 01H
The number of SIMD integer packed multiply instructions executed.
.It Li SIMD_Int_Psft_Exec
.Pq Event B3H , Umask 02H
The number of SIMD integer packed shift instructions executed.
.It Li SIMD_Int_Sat_Exec
.Pq Event B1H , Umask 00H
The number of SIMD integer saturating instructions executed.
.It Li SIMD_Int_Upck_Exec
.Pq Event B3H , Umask 08H
The number of SIMD integer unpack instructions executed.
.It Li SMC_Detected
.Pq Event C3H , Umask 00H
The number of times self-modifying code was detected.
.It Li SSE_NTStores_Miss
.Pq Event 4BH , Umask 03H
The number of times an SSE streaming store instruction missed all caches.
.It Li SSE_NTStores_Ret
.Pq Event 07H , Umask 03H
The number of SSE streaming store instructions executed.
.It Li SSE_PrefNta_Miss
.Pq Event 4BH , Umask 00H
The number of times
.Li PREFETCHNTA
missed all caches.
.It Li SSE_PrefNta_Ret
.Pq Event 07H , Umask 00H
The number of
.Li PREFETCHNTA
instructions retired.
.It Li SSE_PrefT1_Miss
.Pq Event 4BH , Umask 01H
The number of times
.Li PREFETCHT1
missed all caches.
.It Li SSE_PrefT1_Ret
.Pq Event 07H , Umask 01H
The number of
.Li PREFETCHT1
instructions retired.
.It Li SSE_PrefT2_Miss
.Pq Event 4BH , Umask 02H
The number of times
.Li PREFETCHT2
missed all caches.
.It Li SSE_PrefT2_Ret
.Pq Event 07H , Umask 02H
The number of
.Li PREFETCHT2
instructions retired.
.It Li Seg_Reg_Loads
.Pq Event 06H , Umask 00H
The number of segment register loads.
.It Li Serial_Execution_Cycles
.Pq Event 3CH , Umask 02H
The number of non-halted bus cycles of this core while the other core
was halted.
.It Li Thermal_Trip
.Pq Event 3BH , Umask C0H
The duration spent in a thermal trip, measured using the current core
clock.
.It Li Unfusion
.Pq Event DBH , Umask 00H
The number of unfusion events.
.It Li Unhalted_Core_Cycles
.Pq Event 3CH , Umask 00H
The number of core clock cycles when the clock signal on a specific
core is not halted.
This is an architectural performance event.
.It Li Uops_Ret
.Pq Event C2H , Umask 00H
The number of micro-ops retired.
.El
.Ss Event Name Aliases
The following table shows the mapping between the PMC-independent
aliases supported by
.Lb libpmc
and the underlying hardware events used.
.Bl -column "branch-mispredicts" "Description"
.It Em Alias Ta Em Event
.It Li branches Ta Li Br_Instr_Ret
.It Li branch-mispredicts Ta Li Br_MisPred_Ret
.It Li dc-misses Ta (unsupported)
.It Li ic-misses Ta Li ICache_Misses
.It Li instructions Ta Li Instr_Ret
.It Li interrupts Ta Li HW_Int_Rx
.It Li unhalted-cycles Ta (unsupported)
.El
.Sh PROCESSOR ERRATA
The following errata affect performance measurement on these
processors.
These errata are documented in
.Rs
.%T "Intel(R) Core(TM) Duo Processor and Intel(R) Core(TM) Solo Processor on 65 nm Process"
.%B "Specification Update"
.%N "Order Number 309222-017"
.%D July 2008
.%Q "Intel Corporation"
.Re
.Bl -tag -width indent -compact
.It AE19
Data prefetch performance monitoring events can only be enabled
on a single core.
.It AE25
Performance monitoring counters that count external bus events
may report incorrect values after processor power state transitions.
.It AE28
Performance monitoring events for retired floating point operations
(C1H) may not be accurate.
.It AE29
DR3 address match on MOVD/MOVQ/MOVNTQ memory store
instruction may incorrectly increment performance monitoring count
for saturating SIMD instructions retired (Event CFH).
.It AE33
Hardware prefetch performance monitoring events may be counted
inaccurately.
.It AE36
The
.Li CPU_CLK_UNHALTED
performance monitoring event (Event 3CH) counts
clocks when the processor is in the C1/C2 processor power states.
.It AE39
Certain performance monitoring counters related to bus, L2 cache
and power management are inaccurate.
.It AE51
Performance monitoring events for retired instructions (Event C0H) may
not be accurate.
.It AE67
Performance monitoring event
.Li FP_ASSIST
may not be accurate.
.It AE78
Performance monitoring event for hardware prefetch requests (Event
4EH) and hardware prefetch request cache misses (Event 4FH) may not be
accurate.
.It AE82
Performance monitoring event
.Li FP_MMX_TRANS_TO_MMX
may not count some transitions.
.El
.Sh SEE ALSO
.Xr pmc 3 ,
.Xr pmc.atom 3 ,
.Xr pmc.core2 3 ,
.Xr pmc.iaf 3 ,
.Xr pmc.k7 3 ,
.Xr pmc.k8 3 ,
.Xr pmc.p4 3 ,
.Xr pmc.p5 3 ,
.Xr pmc.p6 3 ,
.Xr pmc.tsc 3 ,
.Xr pmclog 3 ,
.Xr hwpmc 4
.Sh HISTORY
The
.Nm pmc
library first appeared in
.Fx 6.0 .
.Sh AUTHORS
The
.Lb libpmc
library was written by
.An "Joseph Koshy"
.Aq jkoshy@FreeBSD.org .