freebsd-skq/lib/libpmc/pmc.p5.3

403 lines
15 KiB
Groff
Raw Normal View History

.\" Copyright (c) 2003-2008 Joseph Koshy. All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\"
.\" This software is provided by Joseph Koshy ``as is'' and
.\" any express or implied warranties, including, but not limited to, the
.\" implied warranties of merchantability and fitness for a particular purpose
.\" are disclaimed. in no event shall Joseph Koshy be liable
.\" for any direct, indirect, incidental, special, exemplary, or consequential
.\" damages (including, but not limited to, procurement of substitute goods
.\" or services; loss of use, data, or profits; or business interruption)
.\" however caused and on any theory of liability, whether in contract, strict
.\" liability, or tort (including negligence or otherwise) arising in any way
.\" out of the use of this software, even if advised of the possibility of
.\" such damage.
.\"
.\" $FreeBSD$
.\"
.Dd September 16, 2008
.Os
.Dt PMC 3
.Sh NAME
.Nm pmc
.Nd library for accessing hardware performance monitoring counters
.Sh LIBRARY
.Lb libpmc
.Sh SYNOPSIS
.In pmc.h
.Sh DESCRIPTION
Intel Pentium PMCs are present in Intel
.Tn Pentium
and
.Tn "Pentium MMX"
processors.
These PMCs are documented in the
.Rs
.%B "Intel 64 and IA-32 Intel(R) Architectures Software Developer's Manual"
.%T "Volume 3B: System Programming Guide, Part 2"
.%N "Order Number 253669-024US"
.%D "August 2007"
.%Q "Intel Corporation"
.Re
.Ss PMC Features
These CPUs contain two PMCs, each 40 bits wide.
These PMCs support the following capabilities:
.Bl -column "PMC_CAP_INTERRUPT" "Support"
.It Em Capability Ta Em Support
.It PMC_CAP_CASCADE Ta \&No
.It PMC_CAP_EDGE Ta \&No
.It PMC_CAP_INTERRUPT Ta \&No
.It PMC_CAP_INVERT Ta \&No
.It PMC_CAP_READ Ta Yes
.It PMC_CAP_PRECISE Ta \&No
.It PMC_CAP_SYSTEM Ta Yes
.It PMC_CAP_TAGGING Ta \&No
.It PMC_CAP_THRESHOLD Ta \&No
.It PMC_CAP_USER Ta Yes
.It PMC_CAP_WRITE Ta Yes
.El
.Ss Event Qualifiers
Event specifiers for Intel Pentium PMCs can have the following common
qualifiers:
.Bl -tag -width indent
.It Li duration
Count duration (in clocks) of events.
The default is to count events.
.It Li os
Measure events at privilege levels 0, 1 and 2.
.It Li overflow
Assert the external processor pin associated with a counter on counter
overflow.
.It Li usr
Measure events at privilege level 3.
.El
.Pp
If neither of the
.Dq Li os
or
.Dq Li usr
qualifiers are specified, the default is to enable both.
.Pp
Some events may only be used on specific counters and some events
are defined only on processors supporting the MMX instruction set.
Note that these PMCs do not have the ability to interrupt the CPU.
.Ss Intel Pentium Event Specifiers
The event specifiers supported by Intel Pentium PMCs are:
.Bl -tag -width indent
.It Li p5-any-segment-register-loaded
The number of writes to any segment register, including the LDTR,
GDTR, TR and IDTR.
Far control transfers and task switches that involve privilege
level changes will count this event twice.
.It Li p5-bank-conflicts
The number of actual bank conflicts.
.It Li p5-branches
The number of taken and not taken branches including branches, jumps, calls,
software interrupts and interrupt returns.
.It Li p5-breakpoint-match-on-dr0-register
The number of matches on the DR0 breakpoint register.
.It Li p5-breakpoint-match-on-dr1-register
The number of matches on the DR1 breakpoint register.
.It Li p5-breakpoint-match-on-dr2-register
The number of matches on the DR2 breakpoint register.
.It Li p5-breakpoint-match-on-dr3-register
The number of matches on the DR3 breakpoint register.
.It Li p5-btb-false-entries
.Pq Tn Pentium MMX
The number of false entries in the BTB.
This event is only allocated on counter 0.
.It Li p5-btb-hits
The number of branches executed that hit in the branch table buffer.
.It Li p5-btb-miss-prediction-on-not-taken-branch
.Pq Tn Pentium MMX
The number of times the BTB predicted a not-taken branch as taken.
This event is only allocated on counter 1.
.It Li p5-bus-cycle-duration
The number of cycles while a bus cycle was in progress.
.It Li p5-bus-ownership-latency
.Pq Tn Pentium MMX
The time from bus ownership being requested to ownership being granted.
This event is only allocated on counter 0.
.It Li p5-bus-ownership-transfers
.Pq Tn Pentium MMX
The number of bus ownership transfers.
This event is only allocated on counter 1.
.It Li p5-bus-utilization-due-to-processor-activity
.Pq Tn Pentium MMX
The number of clocks the bus is busy due to the processor's own
activity.
This event is only allocated on counter 0.
.It Li p5-cache-line-sharing
.Pq Tn Pentium MMX
The number of shared data lines in L1 cache.
This event is only allocated on counter 1.
.It Li p5-cache-m-state-line-sharing
.Pq Tn Pentium MMX
The number of hits to an M- state line due to a memory access by
another processor.
This event is only allocated on counter 0.
.It Li p5-code-cache-miss
The number of instruction reads that miss the internal code cache.
Both cacheable and uncacheable misses are counted.
.It Li p5-code-read
The number of instruction reads to both cacheable and uncacheable regions.
.It Li p5-code-tlb-miss
The number of instruction reads that miss the instruction TLB.
Both cacheable and uncacheable unreads are counted.
.It Li p5-d1-starvation-and-fifo-is-empty
.Pq Tn Pentium MMX
The number of times the D1 stage cannot issue any instructions because
the FIFO was empty.
This event is only allocated on counter 0.
.It Li p5-d1-starvation-and-only-one-instruction-in-fifo
.Pq Tn Pentium MMX
The number of times the D1 stage could issue only one instruction
because the FIFO had one instruction ready.
This event is only allocated on counter 1.
.It Li p5-data-cache-lines-written-back
The number of data cache lines that are written back, including
those caused by internal and external snoops.
.It Li p5-data-cache-tlb-miss-stall-duration
.Pq Tn Pentium MMX
The number of clocks the pipeline is stalled due to a data cache
TLB miss.
This event is only allocated on counter 1.
.It Li p5-data-read
The number of memory data reads, counting internal data cache hits and
misses.
I/O and data memory accesses due to TLB miss processing are
not included.
Split cycle reads are counted individually.
.It Li p5-data-read-miss
The number of memory read accesses that miss the data cache, counting
both cacheable and uncacheable accesses.
Data accesses that are part of TLB miss processing are not included.
I/O accesses are not included.
.It Li p5-data-read-miss-or-write-miss
The number of data reads and writes that miss the internal data cache,
counting uncacheable accesses.
Data accesses due to TLB miss processing are not counted.
.It Li p5-data-read-or-write
The number of data reads and writes including internal data cache hits
and misses.
Data reads due to TLB miss processing are not counted.
.It Li p5-data-tlb-miss
The number of misses to the data cache translation lookaside buffer.
.It Li p5-data-write
The number of memory data writes, counting internal data cache hits
and misses.
I/O is not included and split cycle writes are counted individually.
.It Li p5-data-write-miss
The number of memory write accesses that miss the data cache, counting
both cacheable and uncacheable accesses.
I/O accesses are not counted.
.It Li p5-emms-instructions-executed
.Pq Tn Pentium MMX
The number of EMMS instructions executed.
This event is only allocated on counter 0.
.It Li p5-external-data-cache-snoop-hits
The number of external snoops to the data cache that hit a valid line,
or the data line fill buffer, or one of the write back buffers.
.It Li p5-external-snoops
The number of external snoop requests accepted, including snoops that
hit in the code cache, the data cache and that hit in neither.
.It Li p5-floating-point-stalls-duration
.Pq Tn Pentium MMX
The number of cycles the pipeline is stalled due to a floating point
freeze.
This event is only allocated on counter 0.
.It Li p5-flops
The number of floating point adds, subtracts, multiples, divides and
square roots.
Transcendental instructions trigger this event multiple times.
Instructions generating divide-by-zero, negative square root, special
operand and stack exceptions are not counted.
Integer multiply instructions that use the x87 FPU are counted.
.It Li p5-full-write-buffer-stall-duration-while-executing-mmx-instructions
.Pq Tn Pentium MMX
The number of clocks the pipeline has stalled due to full write
buffers when executing MMX instructions.
This event is only allocated on counter 0.
.It Li p5-hardware-interrupts
The number of taken INTR and NMI interrupts.
.It Li p5-instructions-executed
The number of instructions executed.
Repeat prefixed instructions are counted only once.
The HLT instruction is counted only once, irrespective of the number
of cycles spent in the halted state.
All hardware and software exceptions are counted as instructions, and
fault handler invocations are also counted as instructions.
.It Li p5-instructions-executed-v-pipe
The number of instructions that executed in the V pipe.
.It Li p5-io-read-or-write-cycle
The number of bus cycles directed to I/O space.
.It Li p5-locked-bus-cycle
The number of locked bus cycles that occur on account of the lock
prefixes, LOCK instructions, page table updates and descriptor table
updates.
.It Li p5-memory-accesses-in-both-pipes
The number of data memory reads or writes that are paired in both pipes.
.It Li p5-misaligned-data-memory-on-mmx-instructions
.Pq Tn Pentium MMX
The number of misaligned data memory references when executing MMX
instructions.
This event is only allocated on counter 0.
.It Li p5-misaligned-data-memory-or-io-references
The number of memory or I/O reads or writes that are not aligned on
natural boundaries.
2- and 4-byte accesses are counted as misaligned if they cross a 4
byte boundary.
.It Li p5-mispredicted-or-unpredicted-returns
.Pq Tn Pentium MMX
The number of returns predicted incorrectly or not at all, only
counting RET instructions.
This event is only allocated on counter 0.
.It Li p5-mmx-instruction-data-read-misses
.Pq Tn Pentium MMX
The number of MMX instruction data read misses.
This event is only allocated on counter 1.
.It Li p5-mmx-instruction-data-reads
.Pq Tn Pentium MMX
The number of MMX instruction data reads.
This event is only allocated on counter 0.
.It Li p5-mmx-instruction-data-write-misses
.Pq Tn Pentium MMX
The number of data write misses caused by MMX instructions.
This event is only allocated on counter 1.
.It Li p5-mmx-instruction-data-writes
.Pq Tn Pentium MMX
The number of data writes caused by MMX instructions.
This event is only allocated on counter 0.
.It Li p5-mmx-instructions-executed-u-pipe
.Pq Tn Pentium MMX
The number of MMX instructions executed in the U pipe.
This event is only allocated on counter 0.
.It Li p5-mmx-instructions-executed-v-pipe
The number of MMX instructions executed in the V pipe.
This event is only allocated on counter 1.
.It Li p5-mmx-multiply-unit-interlock
.Pq Tn Pentium MMX
The number of clocks the pipeline is stalled because the destination
of a prior MMX multiply is not ready.
This event is only allocated on counter 0.
.It Li p5-movd-movq-store-stall-due-to-previous-mmx-operation
.Pq Tn Pentium MMX
The number of clocks a MOVD/MOVQ instruction stalled in the D2 stage
of the pipeline due to a previous MMX instruction.
This event is only allocated on counter 1.
.It Li p5-noncacheable-memory-reads
The number of bus cycles for non-cacheable instruction or data reads,
including cycles caused by TLB misses.
.It Li p5-number-of-cycles-not-in-halt-state
.Pq Tn Pentium MMX
The number of cycles the processor is not idle due to the HLT
instruction.
This event is only allocated on counter 0.
.It Li p5-pipeline-agi-stalls
The number of address generation interlock stalls.
An AGI that occurs in both the U and V pipelines in the same clock
signals the event twice.
.It Li p5-pipeline-flushes
The number of pipeline flushes that occur.
Pipeline flushes are caused by branch mispredicts, exceptions,
interrupts, some segment register loads, and BTB misses.
Prefetch queue flushes due to serializing instructions are not
counted.
.It Li p5-pipeline-flushes-due-to-wrong-branch-predictions
.Pq Tn Pentium MMX
The number of pipeline flushes due to wrong branch predictions
resolved in either the E- or WB- stage of the pipeline.
This event is only allocated on counter 0.
.It Li p5-pipeline-flushes-due-to-wrong-branch-predictions-resolved-in-wb-stage
.Pq Tn Pentium MMX
The number of pipeline flushes due to wrong branch predictions
resolved in the stage of the pipeline.
This event is only allocated on counter 1.
.It Li p5-pipeline-stall-for-mmx-instruction-data-memory-reads
.Pq Tn Pentium MMX
The number of clocks during pipeline stalls caused by waiting MMX data
memory reads.
This event is only allocated on counter 0.
.It Li p5-predicted-returns
.Pq Tn Pentium MMX
The number of predicted returns, whether correct or incorrect.
This counter only counts RET instructions.
This event is only allocated on counter 1.
.It Li p5-returns
.Pq Tn Pentium MMX
The number of RET instructions executed.
This event is only allocated on counter 0.
.It Li p5-saturating-mmx-instructions-executed
.Pq Tn Pentium MMX
The number of saturating MMX instructions executed.
This event is only allocated on counter 0.
.It Li p5-saturations-performed
.Pq Tn Pentium MMX
The number of saturating MMX instructions executed when at least one
of its results were actually saturated.
This event is only allocated on counter 1.
.It Li p5-stall-on-mmx-instruction-write-to-e-o-m-state-line
.Pq Tn Pentium MMX
The number of clocks during stalls on MMX instructions writing to
E- or M- state cache lines.
This event is only allocated on counter 1.
.It Li p5-stall-on-write-to-an-e-or-m-state-line
The number of stalls on a write to an exclusive or modified data cache
line.
.It Li p5-taken-branch-or-btb-hit
The number of events that may cause a hit in the BTB, namely either
taken branches or BTB hits.
.It Li p5-taken-branches
.Pq Tn Pentium MMX
The number of taken branches.
This event is only allocated on counter 1.
.It Li p5-transitions-between-mmx-and-fp-instructions
.Pq Tn Pentium MMX
The number of transitions between MMX and floating-point instructions
and vice-versa.
This event is only allocated on counter 1.
.It Li p5-waiting-for-data-memory-read-stall-duration
The number of clocks the pipeline was stalled waiting for data
memory reads.
Data TLB misses processing is included in this count.
.It Li p5-write-buffer-full-stall-duration
The number of clocks while the pipeline was stalled due to write
buffers being full.
.It Li p5-write-hit-to-m-or-e-state-lines
The number of writes that hit exclusive or modified lines in the data
cache.
.It Li p5-writes-to-noncacheable-memory
.Pq Tn Pentium MMX
The number of writes to non-cacheable memory, including write cycles
caused by TLB misses and I/O writes.
This event is only allocated on counter 1.
.El
.Sh SEE ALSO
.Xr pmc 3 ,
.Xr pmc.k7 3 ,
.Xr pmc.k8 3 ,
.Xr pmc.p4 3 ,
.Xr pmc.p6 3 ,
.Xr pmc.tsc 3 ,
.Xr pmclog 3 ,
.Xr hwpmc 4
.Sh HISTORY
The
.Nm pmc
library first appeared in
.Fx 6.0 .
.Sh AUTHORS
The
.Lb libpmc
library was written by
.An "Joseph Koshy"
.Aq jkoshy@FreeBSD.org .