freebsd-dev/cddl/contrib/dtracetoolkit/Notes/cputimes_notes.txt
2012-05-12 21:25:48 +00:00

139 lines
4.6 KiB
Plaintext

**************************************************************************
* The following are additional notes on the cputimes command.
*
* $Id: cputimes_notes.txt 44 2007-09-17 07:47:20Z brendan $
*
* COPYRIGHT: Copyright (c) 2007 Brendan Gregg.
**************************************************************************
* How cputimes works
cputimes measures time consumed by the kernel, idle therads and processes,
by tracking the activity of the schedular. In particular we track on-cpu
and off-cpu events for kernel therads, measuring the timestamps at each event.
* Why cputimes?
If you are interested in how much time processes are consuming, the data
given by "prstat" or "prstat -m" is fine. However there is no easy way to
see kernel consumed time, which is the idea behind cputimes.
* What does it mean?
The output shows categories of threads by the sum of time, in nanoseconds.
A nanosecond is 10^-9, or 0.000000001 of a second. This program uses
nanoseconds as units, but does not have nanosecond accuracy. It would be
reasonable to assume that this has microsecond accuracy (10^-6), so in
practise ignore the last three digits of the times.
The sections reported are,
PROCESSES - the sum of all the process time on the CPU.
KERNEL - the sum of the time spent in the kernel.
IDLE - the time the kernel spent in the idle thread, waiting for some work.
If your system isn't doing much, then the idle time will be quite large. If
your system is running many applications, then there may be no idle time
at all - instead most of the time appearing under processes.
* When is there a problem?
Expect to see most of the time in processes or idle, depending on how busy
your server is. Seeing a considerable amout of time in kernel would
definately be interesting.
The kernel generally doesn't use much CPU time, usually less than 5%.
If it were using more, that may indicate heavy activity from an interrupt
thread, or activity caused by DTrace.
For example,
# cputimes 1
2005 Apr 27 23:49:32,
THREADS TIME (ns)
IDLE 28351679
KERNEL 436022725
PROCESS 451304688
In this sample the kernel is using a massive amount of the CPUs, around 47%.
This sample was taken during heavy network utilisation, the time consumed
by the TCP/IP and network driver threads (and DTrace). The "intrstat" command
could be used for further analysis of the interrupt threads responsible
for servicing the network interface.
* Problems with cputimes
The way cputimes measures schedular activity turns out to be a lot of work.
There are many scheduling events per second where one thread steps onto a
CPU and another leaves. It turns out that cputimes itself causes some degree
of kernel load.
Here we run 1 cputimes,
# cputimes 1
2005 May 15 12:00:41,
THREADS TIME (ns)
KERNEL 12621985
PROCESS 982751579
2005 May 15 12:00:42,
THREADS TIME (ns)
KERNEL 12267577
PROCESS 983513765
[...]
Now a second cputimes is run at the same time,
# cputimes 1
2005 May 15 12:02:06,
THREADS TIME (ns)
KERNEL 17366426
PROCESS 978804165
2005 May 15 12:02:07,
THREADS TIME (ns)
KERNEL 17614829
PROCESS 978671601
[...]
And now a third,
# cputimes 1
2005 May 15 12:03:09,
THREADS TIME (ns)
KERNEL 21303089
PROCESS 974925124
2005 May 15 12:03:10,
THREADS TIME (ns)
KERNEL 21222992
PROCESS 975152727
[...]
Each extra cputimes is consuming an extra 4 to 5 ms of the CPU as kernel time.
Around 0.5%. This can be used as an estimate of the kernel load caused by
running cputimes, and a similar strategy could be used to measure the kernel
load of other DTrace scripts.
However the following CPU characteristics must be taken into consideration,
# psrinfo -v
Status of virtual processor 0 as of: 05/15/2005 12:06:05
on-line since 04/30/2005 13:32:32.
The i386 processor operates at 867 MHz,
and has an i387 compatible floating point processor.
as well as the type of activity that was also running on the system, which
cputimes was monitoring (frequency of scheduling events).
A system with a slower CPU will use a larger proportion of kernel time to
perform the same tasks. Also, a system that is context switching more
(switching between different processes) is likely to consume more kernel time
as well.