process. We don't *quite* pull that number out of our backside, as
the actual number is difficult to determine without modifying the VM
system to report it, but it's still useful to get an idea of what's
going on when a machine unexpectedly starts swapping.
MFC after: 1 week
more obvious imprecision in the previous top changes.
Specifically, top uses a delta of clock_gettime() calls right after
invoking the kern.proc sysctl to fetch the process/thread list to
compute the time delta between the fetches. However, the kern.proc
sysctl handler does not run in constant time. It can spin on locks,
be preempted by an interrupt handler, etc. As a result, the time
between the gathering of stats for individual processes or threads
between subsequent kern.proc handlers can vary. If a "slow" kern.proc
run is followed by a "fast" kern.proc run, then the threads/processes
at the start of the "slow" run will have a longer time delta than the
threads/processes at the end. If the clock_gettime() time delta is
not itself skewed by preemption, then the delta may be too short for
a given thread/process resulting in a higher percent CPU than actual.
However, there is no good way to calculate the exact amount of overage,
nor to know which threads to subtract the overage from. Instead, just
punt and fix the definitely-wrong case of an individual thread having
more than 100% CPU.
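
The clamp described above can be sketched as follows. This is only an illustration, not top's actual code: the helper name, its parameters, and the microsecond units are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: percent CPU for one thread over one refresh
 * interval, clamped so a single thread never reports more than 100%.
 */
static double
clamped_pctcpu(uint64_t runtime_delta_us, uint64_t elapsed_us)
{
	double pct;

	assert(elapsed_us > 0);		/* caller supplies a nonzero interval */
	pct = 100.0 * (double)runtime_delta_us / (double)elapsed_us;
	if (pct > 100.0)		/* one thread cannot exceed one CPU */
		pct = 100.0;
	return (pct);
}
```

A too-short measured interval (for example 1.2 s of runtime over a measured 1.0 s) is reported as 100% rather than 120%; smaller overages remain undetected, as the message notes.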
Discussed with: zonk
displays after a pause, use the difference in runtime divided by the
length of the pause as the percentage of CPU used instead of the value
calculated by the kernel. In addition, when determining if a process or
thread is idle or not, treat any process or thread that has used any
runtime or performed any context switches during the interval as busy.
Note that the percent CPU is calculated as a double and stored in an
array to avoid recalculating the value multiple times in the comparison
method used to sort processes in the CPU display.
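
A minimal sketch of the approach, assuming hypothetical structures (struct sample, struct row) and microsecond units that are not taken from top's source: the percentage is computed once per refresh and stored, the busy test looks at both runtime and context switches, and the sort comparator only reads the stored value.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct sample {			/* hypothetical per-thread snapshot */
	uint64_t runtime_us;	/* accumulated CPU time */
	uint64_t ctx_switches;	/* context switches so far */
};

struct row {			/* hypothetical display row */
	int	id;
	double	pctcpu;		/* computed once, reused by the comparator */
	int	busy;		/* nonzero if not idle during the interval */
};

static void
fill_row(struct row *r, int id, const struct sample *prev,
    const struct sample *cur, uint64_t pause_us)
{
	uint64_t dr = cur->runtime_us - prev->runtime_us;
	uint64_t dc = cur->ctx_switches - prev->ctx_switches;

	assert(pause_us > 0);
	r->id = id;
	r->pctcpu = 100.0 * (double)dr / (double)pause_us;
	/* any runtime or any context switch counts as busy */
	r->busy = (dr != 0 || dc != 0);
}

/* qsort() comparator: highest stored percentage first. */
static int
compare_cpu(const void *a, const void *b)
{
	const struct row *ra = a, *rb = b;

	if (ra->pctcpu < rb->pctcpu)
		return (1);
	if (ra->pctcpu > rb->pctcpu)
		return (-1);
	return (0);
}
```

Storing the double in the row keeps the comparator cheap, since qsort() may call it many times per element.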
Tested by: Jamie Landeg-Jones <jamie@dyslexicfish.net>
Reviewed by: emaste (earlier version)
MFC after: 1 week
to 999.99% CPU. It still won't be aligned if you have a multithreaded
process using more than 1000% CPU (e.g. idle process on an idle 12-way
system), but 100% is a common case.
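
To illustrate the alignment limit, here is a sketch with an assumed format string. The "%6.2f%%" width and the helper name are guesses for the example, not the format top actually uses.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical formatter: with a 6-character numeric field, every value
 * up to 999.99 occupies the same width; larger values overflow it.
 * Returns the number of characters the full string needs.
 */
static int
format_pctcpu(char *buf, size_t buflen, double pct)
{
	assert(buf != NULL && buflen > 0);
	return (snprintf(buf, buflen, "%6.2f%%", pct));
}
```

Under this assumption 5.25, 100.00 and 999.99 all format to 7 characters, while 1200.00 needs 8 and pushes the following columns to the right.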
Submitted by: Jeremy Chadwick (partial)
MFC after: 1 week
in the manpage by having it display the current CPU (ki_oncpu) rather
than the previously used CPU (ki_lastcpu). ki_lastcpu is still used for
all other thread states.
Reported by: Chris Ross <cross+freebsd@distal.com>
MFC after: 1 week
cmdlengthdelta is the size of the header and we were using it to
allocate a buffer to store the command line. This would mean that
the cmdbuf could be too short. In practice this was rarely noticed unless
you regularly run top -a. On a stock FreeBSD system you can see the
problem by running sendmail and then running top -a on a big terminal
window. In practice this doubles the size available to cmdbuf since the
header is around 65-68 bytes.
Reviewed by: adrian
usage on hosts using ZFS. The new line displays the total amount of RAM
used by the ARC along with the size of MFU, MRU, anonymous (in flight),
headers, and other (miscellaneous) sub-categories. The line is not
displayed on systems that are not using ZFS.
Reviewed by: avg, fs@
MFC after: 3 days
to the maximum number of CPUs to ensure that the lcpustates[] array is always
allocated to the maximum size. Previously, if top was started without
per-CPU stats it would allocate a smaller lcpustates[] array. When
per-CPU stats were then enabled, it would overflow the array and trash
the cpustates_columns[] array causing the CPU stats to be printed in the
wrong locations.
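
The sizing difference can be sketched like this. The function names and the element type are assumptions for the example; states_len_old() only models the old behavior as the message describes it.

```c
#include <assert.h>
#include <stdlib.h>

#define CPUSTATES 5	/* user, nice, system, interrupt, idle */

/* Model of the old sizing: one CPU's worth unless per-CPU mode was on. */
static size_t
states_len_old(int maxcpu, int pcpu_stats)
{
	return ((size_t)(pcpu_stats ? maxcpu : 1) * CPUSTATES);
}

/* New sizing: always room for every possible CPU. */
static size_t
states_len_new(int maxcpu)
{
	assert(maxcpu > 0);
	return ((size_t)maxcpu * CPUSTATES);
}

/* Allocate the state array once, large enough for either display mode. */
static int *
alloc_states(int maxcpu)
{
	return (calloc(states_len_new(maxcpu), sizeof(int)));
}
```

With 8 CPUs the old sizing yields 5 entries when per-CPU stats start disabled, so a later switch to per-CPU mode would write 40 entries into a 5-entry array.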
Approved by: re (kib)
MFC after: 1 week
ki_rusage member when KERN_PROC_INC_THREAD is passed to one of the
process sysctls.
- Correctly account for the current thread's cputime in the thread when
doing the runtime fixup in calcru().
- Use TIDs as the key to look up the previous thread to compute IO stat
deltas in IO mode in top when thread display is enabled.
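
The last item can be sketched as a lookup keyed on thread ID. Threads of one process share a PID, so a PID key could match the wrong previous record. The struct and function names below are hypothetical, and bsearch() over a sorted array is just one plausible way to do the lookup.

```c
#include <assert.h>
#include <stdlib.h>

struct thr_io {			/* hypothetical saved snapshot of one thread */
	int	tid;		/* thread ID, unique per thread */
	long	inblock;	/* block reads so far */
	long	oublock;	/* block writes so far */
};

static int
cmp_tid(const void *a, const void *b)
{
	const struct thr_io *ta = a, *tb = b;

	return ((ta->tid > tb->tid) - (ta->tid < tb->tid));
}

/* prev must be sorted by tid (for example with qsort() and cmp_tid). */
static const struct thr_io *
find_prev(const struct thr_io *prev, size_t n, int tid)
{
	struct thr_io key = { .tid = tid };

	assert(prev != NULL);
	return (bsearch(&key, prev, n, sizeof(*prev), cmp_tid));
}
```

The IO delta for a thread is then its current counter minus the counter in the record returned by find_prev(); a NULL result means the thread is new.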
Reviewed by: kib
Approved by: re (kib)
idle threads). The process is displayed by default (subject to whether or
not system processes are displayed) to preserve existing behavior. The
system idle process can be hidden via the '-z' command line argument or the
'z' key while top is running. When it is hidden, top more closely matches
the behavior of FreeBSD <= 4.x where idle time was not accounted to any
process.
MFC after: 2 weeks
The bug went unnoticed on non-i386 because mp_maxid is
initialized differently and kern.cp_times doesn't print
zeroes for non-existing CPUs, so no "writing outside of
array bounds" happens.
MFC after: 3 days
kthread_add() takes the same parameters as the old kthread_create()
plus a pointer to a process structure, and adds a kernel thread
to that process.
kproc_kthread_add() takes the parameters for kthread_add,
plus a process name and a pointer to a pointer to a process instead of just
a pointer, and if the proc * is NULL, it creates the process to the
specifications required, before adding the thread to it.
All other old kthread_xxx() calls return, but act on (struct thread *)
instead of (struct proc *). One reason to change the name is so that
any old kernel modules that are lying around and expect kthread_create()
to make a process will not just accidentally link.
fix top to show kernel threads by their thread name in -SH mode
add a tdnam formatting option to ps to show thread names.
make all idle threads actual kthreads and put them into their own idled process.
make all interrupt threads kthreads and put them in an interd process
(mainly for aesthetic and accounting reasons)
rename proc 0 to be 'kernel' and its swapper thread is now 'swapper'
man page fixes to follow.
- p_sflag was mostly protected by PROC_LOCK rather than the PROC_SLOCK or
previously the sched_lock. These bugs have existed for some time.
- Allow swapout to try each thread in a process individually and then
swapin the whole process if any of these fail. This allows us to move
most scheduler related swap flags into td_flags.
- Keep ki_sflag for backwards compat but change all in-source tools to
use the new and more correct location of P_INMEM.
Reported by: pho
Reviewed by: attilio, kib
Approved by: re (kensmith)
priorities, etc.) in the NICE field:
Use a combination of pri_native and pri_user instead of pri_level to
guess the original realtime priority. Using pri_level here has been
wrong since 2001/02/12. Using only pri_native here would be correct
if the kernel actually initialized it reasonably. (The kernel exports
its raw td_base_priority as pri_native, but userland mostly wants a
refined base priority). Give up on waiting for pri_native to work correctly
and only use it when there is nothing better (for kthreads).
This should reduce printing of bizarre pseudo-nice values. Bizarre
values are still printed if we observe a transient borrowed priority
for a kthread (transient borrowing is the main thing that makes the
raw td_base_priority almost useless in userland), or if there is a
kernel bug. One current kernel bug involves the kernel idprio thread
pagezero permanently changing its priority from PRI_MAX_IDLE (255) to
PUSER (160). Then the bizarre value "ki-6" is printed instead of
"ki31". Here "-6" is PUSER - PRI_MIN_IDLE = -64 truncated to 2
characters. We are observing a transient borrowed priority that has
become permanent due to a bug.
ps/print.c:priorityr() needs similar changes (including ones in stage 2
here).
titles extracted from argv vector instead of the real executable names.
This is useful when you want to watch applications that set their status
information via setproctitle(3).
Approved by: alfred
MFC after: 2 weeks
priority class and use this to:
- print "-" instead of a garbage value for ithreads. Print "-" instead
of the unused nice value for kthreads which are (mis)classified as
PRI_TIMESHARE. For such threads, the nice value can be set to nonzero
by root, but it is never used (at least by the 4bsd scheduler). For
ithreads, we didn't even print the unused value.
- print "i<priority>" and "r<priority>" instead of a biased "<priority>"
for idletime and realtime threads. Here <priority> is the priority
parameter to idprio/rtprio(1). Just add the prefix and remove the
bias for now. <priority> has been stored indirectly in the kernel
since 2001/02/12, and even the kernel cannot recover the original
value in all cases. Here we need to handle more cases than pri_to_rtp(),
but actually handle fewer cases, and end up printing garbage after
a thread changes its current priority while in the kernel.
- for idletime and realtime threads, if they are kthreads then add a prefix
of "k" to the previous string.
- for idletime and realtime threads, if they are in the FIFO scheduling class
then add a suffix of "F" to the previous string (if it fits; the other
parts of the string are sure to fit unless <priority> is garbage).
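
The string assembly in the last three items can be sketched as below. The function, its parameters, and the 4-character field width are assumptions for the example (the width is inferred from priorities being cut to 2 characters after a 2-character prefix); it is not the code from machine.c.

```c
#include <assert.h>
#include <stdio.h>

#define NICE_WIDTH 4	/* assumed width of the NICE column */

/*
 * Hypothetical formatter. cls is 'i' for idletime or 'r' for realtime;
 * priority is the unbiased idprio/rtprio(1) parameter.
 */
static void
format_nice(char out[NICE_WIDTH + 1], char cls, int priority,
    int is_kthread, int is_fifo)
{
	assert(cls == 'i' || cls == 'r');
	/*
	 * snprintf() cuts the result to NICE_WIDTH characters, so the "F"
	 * suffix (or the tail of a garbage priority) is dropped when the
	 * string does not fit.
	 */
	snprintf(out, NICE_WIDTH + 1, "%s%c%d%s",
	    is_kthread ? "k" : "", cls, priority, is_fifo ? "F" : "");
}
```

Under these assumptions an idle-class kthread at priority 31 yields "ki31", a FIFO realtime thread at priority 5 yields "r5F", and a garbage priority of -64 on an idle-class kthread yields the truncated "ki-6".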
machine.c. The traditional condition was (pctcpu > 0 || SRUN), but the
negation of the condition logic (from select to skip) made this come
out as (pctcpu > 0 && SRUN), leading to a very erratic display, except
for purely CPU bound processes.
This has been discussed in the mail lists some time ago and I have used
top with this patch on my systems for more than a year without problems
(just forgot to commit it earlier, since my systems were all fixed ...).
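
The negation error described above can be reproduced in a few lines. The function names and the SRUN stand-in are hypothetical; the three predicates model the traditional select condition, the incorrectly negated skip condition, and the correct negation by De Morgan's law.

```c
#include <assert.h>
#include <stdbool.h>

enum { SRUN_SKETCH = 2 };	/* hypothetical stand-in for the SRUN state */

/* Traditional "select" form: show if using CPU or runnable. */
static bool
select_proc(double pctcpu, int state)
{
	return (pctcpu > 0 || state == SRUN_SKETCH);
}

/* Incorrect negation: same as selecting on (pctcpu > 0 && SRUN). */
static bool
skip_proc_broken(double pctcpu, int state)
{
	return (pctcpu <= 0 || state != SRUN_SKETCH);
}

/* Correct negation: both terms negated and || turned into &&. */
static bool
skip_proc_fixed(double pctcpu, int state)
{
	return (!(pctcpu > 0) && state != SRUN_SKETCH);
}

/* Check that the fixed skip test is the exact complement of select. */
static void
check_equivalence(void)
{
	const double pcts[] = { 0.0, 12.5 };
	const int states[] = { SRUN_SKETCH, 3 };

	for (int i = 0; i < 2; i++)
		for (int j = 0; j < 2; j++)
			assert(skip_proc_fixed(pcts[i], states[j]) ==
			    !select_proc(pcts[i], states[j]));
}
```

With the broken form, a sleeping process that used CPU during the interval, or a runnable one with 0% so far, is skipped, which matches the erratic display for anything that is not purely CPU bound.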
so that it can be more easily unbroken and extended.
Try to use `static', `const' (as appropriate), prototypes declared together,
and parameter names in prototypes for all private functions, not just the
new one.