.\" Copyright (c) 2003-2005 Joseph Koshy
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd Apr 15, 2005
.Dt HWPMC 4
.Os
.Sh NAME
.Nm hwpmc
.Nd Hardware performance monitoring counter support
.Sh SYNOPSIS
.Cd options PMC_HOOKS
.br
.Cd device hwpmc
.Sh DESCRIPTION
The
.Nm
driver virtualizes the hardware performance monitoring facilities in
modern CPUs and provides support for using these facilities from
user level processes.
.Pp
The driver supports multi-processor systems.
.Pp
PMCs are allocated using the
.Ic PMC_OP_PMCALLOCATE
request.
A successful
.Ic PMC_OP_PMCALLOCATE
request will return an integer handle (typically a small integer) to the
requesting process.
Subsequent operations on the allocated PMC use this handle to denote
the specific PMC.
A process that has successfully allocated a PMC is termed an
.Dq "owner process" .
.Pp
PMCs may be allocated to operate in process-private or in system-wide
modes.
.Bl -hang -width "XXXXXXXXXXXXXXX"
.It Em Process-private
In process-private mode, a PMC is active only when a thread belonging
to a process it is attached to is scheduled on a CPU.
.It Em System-wide
In system-wide mode, a PMC operates independently of processes and
measures hardware events for the system as a whole.
.El
.Pp
The
.Nm
driver supports the use of hardware PMCs for counting or for sampling:
.Bl -hang -width "XXXXXXXXX"
.It Em Counting
In counting modes, the PMCs count hardware events.
These counts are retrievable using the
.Ic PMC_OP_PMCRW
system call on all architectures, though some architectures like the
x86 and amd64 offer faster methods of reading these counts.
.It Em Sampling
In sampling modes, the PMCs are configured to sample the CPU
instruction pointer after a configurable number of hardware events
have been observed.
These instruction pointer samples are directed to a log file for
subsequent analysis.
.El
.Pp
These modes of operation are orthogonal; a PMC may be configured to
operate in one of four modes:
.Bl -tag -width indent
.It Process-private, counting
These PMCs count hardware events whenever a thread in their attached
process is scheduled on a CPU.
These PMCs normally count from zero, but the initial count may be
set using the
.Ic PMC_OP_PMCSETCOUNT
operation.
Applications can read the value of the PMC at any time using the
.Ic PMC_OP_PMCRW
operation.
.It Process-private, sampling
These PMCs sample the target process's instruction pointer after they
have seen the configured number of hardware events.
The PMCs only count events when a thread belonging to their attached
process is active.
The desired frequency of sampling is set using the
.Ic PMC_OP_PMCSETCOUNT
operation prior to starting the PMC.
Log files are configured using the
.Ic PMC_OP_CONFIGURELOG
operation.
.It System-wide, counting
These PMCs count hardware events seen by them independent of the
processes that are executing.
The current count on these PMCs can be read using the
.Ic PMC_OP_PMCRW
request.
These PMCs normally count from zero, but the initial count may be
set using the
.Ic PMC_OP_PMCSETCOUNT
operation.
.It System-wide, sampling
These PMCs will periodically sample the instruction pointer of the CPU
they are allocated on, and will write the sample to a log for further
processing.
The desired frequency of sampling is set using the
.Ic PMC_OP_PMCSETCOUNT
operation prior to starting the PMC.
Log files are configured using the
.Ic PMC_OP_CONFIGURELOG
operation.
.Pp
System-wide statistical sampling can only be enabled by a process with
super-user privileges.
.El
.Pp
Processes are allowed to allocate as many PMCs as the hardware and
current operating conditions permit.
Processes may mix allocations of system-wide and process-private
PMCs.
Multiple processes may use the facilities of the
.Nm
driver concurrently.
.Pp
Allocated PMCs are started using the
.Ic PMC_OP_PMCSTART
operation, and stopped using the
.Ic PMC_OP_PMCSTOP
operation.
Stopping and starting a PMC is permitted at any time the owner process
has a valid handle to the PMC.
.Pp
Process-private PMCs need to be attached to a target process before
they can be used.
Attaching a process to a PMC is done using the
.Ic PMC_OP_PMCATTACH
operation.
An already attached PMC may be detached from its target process
using the converse
.Ic PMC_OP_PMCDETACH
operation.
Issuing a
.Ic PMC_OP_PMCSTART
operation on an as yet unattached PMC will cause it to be attached
to its owner process.
The following rules determine whether a given process may attach
a PMC to another target process:
.Bl -bullet -compact
.It
A non-jailed process with super-user privileges is allowed to attach
to any other process in the system.
.It
Other processes are only allowed to attach to targets that they would
be able to attach to for debugging (as determined by
.Xr p_candebug 9 ) .
.El
.Pp
PMCs are released using
.Ic PMC_OP_PMCRELEASE .
After a successful
.Ic PMC_OP_PMCRELEASE
operation the handle to the PMC will become invalid.
.Ss MODIFIER FLAGS
The
.Ic PMC_OP_PMCALLOCATE
operation supports the following flags that modify the behavior
of an allocated PMC:
.Bl -tag -width indent
.It Dv PMC_F_DESCENDANTS
This flag is valid only for a PMC being allocated in process-private
mode.
It signifies that the PMC will track hardware events for its
target process and the target's current and future descendants.
.El
.Ss SIGNALS
The
.Nm
driver may deliver signals to processes that have allocated PMCs:
.Bl -tag -width indent
.It Bq SIGIO
A
.Ic PMC_OP_PMCRW
operation was attempted on a process-private PMC that does not have
attached target processes.
.It Bq SIGBUS
The
.Nm
driver is being unloaded from the kernel.
.El
.Sh PROGRAMMING API
The recommended way for application programs to use the facilities of
the
.Nm
driver is through the API provided by the library
.Xr pmc 3 .
.Pp
The
.Nm
driver operates using a system call number that is dynamically
allotted to it when it is loaded into the kernel.
.Pp
The
.Nm
driver supports the following operations:
.Bl -tag -width indent
.It Ic PMC_OP_CONFIGURELOG
Configure a log file for sampling mode PMCs.
.It Ic PMC_OP_GETCPUINFO
Retrieve information about the number of CPUs on the system and
the number of hardware performance monitoring counters available per
CPU.
.It Ic PMC_OP_GETDRIVERSTATS
Retrieve module statistics (for analyzing the behavior of
.Nm
itself).
.It Ic PMC_OP_GETMODULEVERSION
Retrieve the version number of the API.
.It Ic PMC_OP_GETPMCINFO
Retrieve information about the current state of the PMCs on a
given CPU.
.It Ic PMC_OP_PMCADMIN
Set the administrative state (i.e., whether enabled or disabled) for
the hardware PMCs managed by the
.Nm
driver.
.It Ic PMC_OP_PMCALLOCATE
Allocate and configure a PMC.
On successful allocation, a handle to the PMC (a small integer)
is returned.
.It Ic PMC_OP_PMCATTACH
Attach a process-private PMC to a target process.
The PMC will be active whenever a thread in the target process is
scheduled on a CPU.
.Pp
If the
.Dv PMC_F_DESCENDANTS
flag was specified at PMC allocation time, then the PMC is
attached to all current and future descendants of the target process.
.It Ic PMC_OP_PMCDETACH
Detach a PMC from its target process.
.It Ic PMC_OP_PMCRELEASE
Release a PMC.
.It Ic PMC_OP_PMCRW
Read and write a PMC.
This operation is valid only for PMCs configured in counting modes.
.It Ic PMC_OP_PMCSETCOUNT
Set the initial count (for counting mode PMCs) or the desired sampling
rate (for sampling mode PMCs).
.It Ic PMC_OP_PMCSTART
Start a PMC.
.It Ic PMC_OP_PMCSTOP
Stop a PMC.
.It Ic PMC_OP_WRITELOG
Insert a timestamped user record into the log file.
.El
.Ss i386 SPECIFIC API
Some i386 family CPUs support the RDPMC instruction, which allows a
user process to read a PMC value without needing to invoke a
.Ic PMC_OP_PMCRW
operation.
On such CPUs, the machine address associated with an allocated PMC is
retrievable using the
.Ic PMC_OP_PMCX86GETMSR
system call.
.Bl -tag -width indent
.It Ic PMC_OP_PMCX86GETMSR
Retrieve the MSR (machine-specific register) number associated with
the given PMC handle.
.Pp
The PMC needs to be in process-private mode and allocated without the
.Dv PMC_F_DESCENDANTS
modifier flag, and should be attached only to its owner process at the
time of the call.
.El
.Ss amd64 SPECIFIC API
AMD64 CPUs support the RDPMC instruction, which allows a
user process to read a PMC value without needing to invoke a
.Ic PMC_OP_PMCRW
operation.
The machine address associated with an allocated PMC is
retrievable using the
.Ic PMC_OP_PMCX86GETMSR
system call.
.Bl -tag -width indent
.It Ic PMC_OP_PMCX86GETMSR
Retrieve the MSR (machine-specific register) number associated with
the given PMC handle.
.Pp
The PMC needs to be in process-private mode and allocated without the
.Dv PMC_F_DESCENDANTS
modifier flag, and should be attached only to its owner process at the
time of the call.
.El
.Sh SYSCTL TUNABLES
The behavior of
.Nm
is influenced by the following
.Xr sysctl 8
tunables:
.Bl -tag -width indent
.It Va kern.hwpmc.debugflags
(Only available if the
.Nm
driver was compiled with
.Fl DDEBUG ) .
Control the verbosity of debug messages from the
.Nm
driver.
.It Va kern.hwpmc.hashsize
The number of rows in the hash tables used to keep track of owner and
target processes.
.It Va kern.hwpmc.mtxpoolsize
The size of the spin mutex pool used by the PMC driver.
.It Va kern.hwpmc.pcpubuffersize
The size of the per-CPU hash table used when performing system-wide
statistical profiling.
.It Va security.bsd.unprivileged_syspmcs
If set to non-zero, allow unprivileged processes to allocate system-wide
PMCs.
The default value is 0.
.It Va security.bsd.unprivileged_proc_debug
If set to 0, the
.Nm
driver will only allow privileged processes to attach PMCs to other
processes.
.El
.Pp
These variables may be set in the kernel environment using
.Xr kenv 1
before
.Nm
is loaded.
.Sh SECURITY CONSIDERATIONS
PMCs may be used to monitor the actual behavior of the system on hardware.
In situations where this constitutes an undesirable information leak,
the following options are available:
.Bl -enum
.It
Set the
.Xr sysctl 8
tunable
.Va "security.bsd.unprivileged_syspmcs"
to 0.
.Pp
This ensures that unprivileged processes cannot allocate system-wide
PMCs and thus cannot observe the hardware behavior of the system
as a whole.
.Pp
This tunable may also be set at boot time using
.Xr loader 8 ,
or with
.Xr kenv 1
prior to loading the
.Nm
driver into the kernel.
.It
Set the
.Xr sysctl 8
tunable
.Va "security.bsd.unprivileged_proc_debug"
to 0.
.Pp
This will ensure that an unprivileged process cannot attach a PMC
to any process other than itself and thus cannot observe the hardware
behavior of other processes with the same credentials.
.El
.Pp
System administrators should note that on IA-32 platforms
.Fx
makes the content of the IA-32 TSC counter available to all processes
via the RDTSC instruction.
.Sh IMPLEMENTATION NOTES
.Ss i386 TSC Handling
Historically, on the x86 architecture,
.Fx
has permitted user processes running at a processor CPL of 3 to
read the TSC using the RDTSC instruction.
The
.Nm
driver preserves this behavior.
.Pp
TSCs are treated as shared, read-only counters and hence are only
allowed to be allocated in system-wide counting mode.
.Ss Intel P4/HTT Handling
On CPUs with HTT support, Intel P4 PMCs are capable of qualifying
only a subset of hardware events on a per-logical CPU basis.
Consequently, if HTT is enabled on a system with Intel Pentium 4
PMCs, then the
.Nm
driver will reject allocation requests for process-private PMCs that
request counting of hardware events that cannot be counted separately
for each logical CPU.
.Sh ERRORS
A command issued to the
.Nm
driver may fail with the following errors:
.Bl -tag -width Er
.It Bq Er EBUSY
A
.Ic PMC_OP_CONFIGURELOG
operation was requested while an existing log was active.
.It Bq Er EBUSY
A
.Ic DISABLE
operation was requested using the
.Ic PMC_OP_PMCADMIN
request for a set of hardware resources currently in use for
process-private PMCs.
.It Bq Er EBUSY
A
.Ic PMC_OP_PMCADMIN
operation was requested on an active system-wide PMC.
.It Bq Er EBUSY
A
.Ic PMC_OP_PMCATTACH
operation was requested for a target process that already had another
PMC using the same hardware resources attached to it.
.It Bq Er EBUSY
A
.Ic PMC_OP_PMCRW
request writing a new value was issued on a PMC that was active.
.It Bq Er EBUSY
A
.Ic PMC_OP_PMCSETCOUNT
request was issued on a PMC that was active.
.It Bq Er EEXIST
A
.Ic PMC_OP_PMCATTACH
request was reissued for a target process that already is the target
of this PMC.
.It Bq Er EFAULT
A bad address was passed in to the driver.
.It Bq Er EINVAL
A process specified an invalid PMC handle.
.It Bq Er EINVAL
An invalid CPU number was passed in for a
.Ic PMC_OP_GETPMCINFO
operation.
.It Bq Er EINVAL
An invalid CPU number was passed in for a
.Ic PMC_OP_PMCADMIN
operation.
.It Bq Er EINVAL
An invalid operation request was passed in for a
.Ic PMC_OP_PMCADMIN
operation.
.It Bq Er EINVAL
An invalid PMC id was passed in for a
.Ic PMC_OP_PMCADMIN
operation.
.It Bq Er EINVAL
A suitable PMC matching the parameters passed in to a
.Ic PMC_OP_PMCALLOCATE
request could not be allocated.
.It Bq Er EINVAL
An invalid PMC mode was requested during a
.Ic PMC_OP_PMCALLOCATE
request.
.It Bq Er EINVAL
An invalid CPU number was specified during a
.Ic PMC_OP_PMCALLOCATE
request.
.It Bq Er EINVAL
A CPU other than
.Li PMC_CPU_ANY
was specified in a
.Ic PMC_OP_PMCALLOCATE
request for a process-private PMC.
.It Bq Er EINVAL
A CPU number of
.Li PMC_CPU_ANY
was specified in a
.Ic PMC_OP_PMCALLOCATE
request for a system-wide PMC.
.It Bq Er EINVAL
The
.Ar pm_flags
argument to a
.Ic PMC_OP_PMCALLOCATE
request contained unknown flags.
.It Bq Er EINVAL
A PMC allocated for system-wide operation was specified with a
.Ic PMC_OP_PMCATTACH
request.
.It Bq Er EINVAL
The
.Ar pm_pid
argument to a
.Ic PMC_OP_PMCATTACH
request specified an illegal process id.
.It Bq Er EINVAL
A
.Ic PMC_OP_PMCDETACH
request was issued for a PMC not attached to the target process.
.It Bq Er EINVAL
Argument
.Ar pm_flags
to a
.Ic PMC_OP_PMCRW
request contained illegal flags.
.It Bq Er EINVAL
A
.Ic PMC_OP_PMCX86GETMSR
operation was requested for a PMC not in process-private mode, or
for a PMC that is not solely attached to its owner process, or for
a PMC that was allocated with flag
.Dv PMC_F_DESCENDANTS .
.It Bq Er EINVAL
(On Intel Pentium 4 CPUs with HTT support) An allocation request for
a process-private PMC was issued for an event that does not support
counting on a per-logical CPU basis.
.It Bq Er ENOMEM
The system was not able to allocate kernel memory.
.It Bq Er ENOSYS
(i386 architectures) A
.Ic PMC_OP_PMCX86GETMSR
operation was requested for hardware that does not support reading
PMCs directly with the RDPMC instruction.
.It Bq Er ENXIO
A
.Ic PMC_OP_GETPMCINFO
operation was requested for a disabled CPU.
.It Bq Er ENXIO
A system-wide PMC on a disabled CPU was requested to be allocated with
.Ic PMC_OP_PMCALLOCATE .
.It Bq Er ENXIO
A
.Ic PMC_OP_PMCSTART
or
.Ic PMC_OP_PMCSTOP
request was issued for a system-wide PMC that was allocated on a
currently disabled CPU.
.It Bq Er EPERM
A
.Ic PMC_OP_PMCADMIN
request was issued by a process without super-user
privilege or by a jailed super-user process.
.It Bq Er EPERM
A
.Ic PMC_OP_PMCATTACH
operation was issued for a target process that the current process
does not have permission to attach to.
.It Bq Er EPERM
.Pq "i386 and amd64 architectures"
A
.Ic PMC_OP_PMCATTACH
operation was issued on a PMC whose MSR has been retrieved using
.Ic PMC_OP_PMCX86GETMSR .
.It Bq Er ESRCH
A process issued a PMC operation request without having allocated any
PMCs.
.It Bq Er ESRCH
A
.Ic PMC_OP_PMCATTACH
request specified a non-existent process id.
.It Bq Er ESRCH
The target process for a
.Ic PMC_OP_PMCDETACH
operation is not being monitored by the
.Nm
driver.
.El
.Sh BUGS
The kernel driver requires all CPUs in an SMP system to be symmetric
with respect to their performance monitoring counter resources.
.Pp
The driver samples the state of the kernel's logical processor support
at the time of initialization (i.e., at module load time).
On CPUs supporting logical processors, the driver could misbehave if
logical processors are subsequently enabled or disabled while the
driver is active.
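.Sh EXAMPLES
The restrictions described under
.Sx SECURITY CONSIDERATIONS
might be applied as sketched below.
This is an illustrative sequence, not a tested procedure.
It assumes a root shell on a system where
.Nm
is built as a loadable module (see
.Xr kldload 8 ) ,
and that the tunables named in this manual are accepted by
.Xr kenv 1
and
.Xr sysctl 8
in the usual
.Ar name Ns = Ns Ar value
form.
.Bd -literal -offset indent
# Assumption: run as root, before hwpmc has been loaded.
# Seed the kernel environment so that unprivileged processes
# cannot allocate system-wide PMCs once the driver loads.
kenv security.bsd.unprivileged_syspmcs=0

# Load the driver.
kldload hwpmc

# Prevent unprivileged processes from attaching PMCs to
# processes other than themselves.
sysctl security.bsd.unprivileged_proc_debug=0

# Review the values now in effect.
sysctl security.bsd.unprivileged_syspmcs
sysctl security.bsd.unprivileged_proc_debug
.Ed
.Pp
Seeding the first tunable with
.Xr kenv 1
before the module is loaded follows the ordering given under
.Sx SYSCTL TUNABLES ,
which does not state whether the value can also be changed after the
driver is loaded.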
.Sh SEE ALSO
.Xr kenv 1 ,
.Xr pmc 3 ,
.Xr kldload 8 ,
.Xr pmccontrol 8 ,
.Xr pmcstat 8 ,
.Xr sysctl 8 ,
.Xr p_candebug 9