appear to be never called:
(1) If a function is never called according to its call count but it
must have been called because its child time is nonzero, then print
it in the flat profile. Previously, if its call count was zero
then we only printed it in the flat profile if its self time was
nonzero.
(2) If a function has a zero call count but has a nonzero self or child
time, then print its total self time in the self time per call
column as a percentage of the total (self + child) time. It is
not possible to print the times per call in this case because the
call count is zero. Previously, this was handled by leaving both
per-call columns blank. The self time is printed in another column
but there was no way to recover the total time.
(1) partially fixes the case of the "never called" function main() and
prepares for (2) to apply to main() and other functions. Profiling
of main() was lost in the conversion from a.out to ELF, so main()'s
call count has always been zero for many years; then in the common
case where main() is a tiny function, it gets no profiling ticks, so
main() was completely lost in the flat profile.
(2) improves mainly cases like kernel threads. Most kernel threads
appear to be never called because they are always started before
userland can run to turn on profiling. As for main(), the fact that
they are called is not very interesting and their callers are
uninteresting, but their relative self time is interesting since they
are long-running.
Almost always printing percentages in the per-call columns would be
more useful than almost always printing 0.0ms. 0.1ms is now a long
time, so only very large functions take that long per call. The accuracy
per call can approach 1-10 nsec provided programs are run for about
100000 times as long as is necessary to get this accuracy with high
resolution kernel profiling.
on allocation failure instead of displaying a warning and deferencing NULL
pointer after. Spelling. Add prototypes. Add list of option in synopsis section
of man page, -d is not referenced because available as a compile option. It
should be made a runtime option btw.
things when sizeof(UNIT) becomes a runtime parameter. The relevant 2
is the one in profil(2)'s scaling of pc's to bucket numbers:
bucket = (pc - offset) / 2 * profil_scale / 65536
gprof(1) must duplicate this scaling, bug for bug compatibly, so it
must first do an integer division by 2 although this mainly makes
scales larger than 65536 useless. sizeof(UNIT) was already wrong in
gprof4, but there were no problems because the fake profil scale is a
multiple of 2.
There are also some rounding bugs in the scaling, but these are only
problems if profil(2) is used directly to create unusual (and not
useful) scales.
without possibly losing lots of precision, and print the scale using
%g instead of %d in case it is non-integral. %g might not be the best
format for this.
resolution profiling on Pentiums. On a 100MHz Pentium, the resolution
is at best 10 ns and actually a few hundred ns, but units of 10's or
100's of ns would be inconvenient and the current units of 1 us are a
bit too coarse.
underscore. Use it to avoid seeing badsw when profiling the kernel.
Print times more accurately (e.g. usec in %8.0f format instead of
msec in %8.2f format for averages) if hz >= 10000. This should have
no effect now since profhz is only 1024.