a running event each time it executes a callout function. The event
includes the function pointer, argument, and whether or not it was run from
hardware interrupt context. The callwheel is marked idle when each handler
completes. This effectively logs the duration of each callout routine in
the graph.
- Print human readable time as a float with two digits of precision. Use
ns now as well since clock periods are well into the hundreds of
picoseconds now.
- Show the average duration in the stats frame. This is often more useful
than total duration.
about invalid timestamps. Nehalem CPUs seem to be synchronized but only
within a fraction of a microsecond.
- Make the Counter code more tolerant of poor timestamps. In general we
now complain a lot but render as much as we can.
- Change the scaler behavior so it works better with very long and very
short traces. We now set the maximum scale such that it properly
displays the entire file by default and doesn't permit zooming out
beyond the file. This improves other awkward navigation behavior.
The interval is now set very small which can't be achieved by simply
dragging the mouse. Clicking to the left of or right of the scaler bar
will produce increments of a single, very small, interval now.
Sponsored by: Nokia
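
A minimal sketch of the time formatting described above (two digits of
precision, with ns as the smallest unit). The function name format_time
and the unit thresholds are assumptions for illustration, not the code
that was committed.

```python
def format_time(ns):
    """Render a duration given in nanoseconds as a short string.

    Hypothetical helper: picks the largest unit that keeps the value
    at or above 1 and prints it with two digits after the decimal point.
    """
    units = (("s", 1e9), ("ms", 1e6), ("us", 1e3))
    for suffix, scale in units:
        if ns >= scale:
            return "%.2f%s" % (ns / scale, suffix)
    # Anything below a microsecond is reported in nanoseconds.
    return "%.2fns" % ns
```

For example, format_time(1500) gives "1.50us" and format_time(0.35)
gives "0.35ns".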
printing it to the terminal. Now only parse errors go to the terminal.
- Speed up drawing by raising and lowering tags only once everything has
been drawn. Surprisingly, it now takes a little longer to parse than
it does to draw.
- Parameterize the layout with X_ and Y_ defines that determine the sizes
of various things.
- Remove unnecessary tags.
optimized single pass function for each. This reduces the number of
tkinter calls required to the minimum.
- Add a right-click context menu for sources. Supported commands hide
the source, hide the whole group the source is in, and bring up a stat
window.
- Add a source stat frame that gives an event frequency table as well as
the total duration for each event type that has a duration. This can
be used to see, for example, the total time a thread spent running or
blocked by a wchan or lock.
displaying sources.
- Add functions to the main SchedGraph to facilitate source hiding. The
source is simply moved off screen and all other sources are moved to
compensate.
This no longer requires any custom classes or parsers to support new
event types.
- Add an optional command line argument for specifying the clock frequency
in GHz. This is useful for traces that do not include KTR_SCHED.
Sponsored by: Nokia
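
The per-source statistics mentioned above (an event frequency table plus
the total duration for event types that have one) could be computed
roughly as follows. This is a sketch under assumptions: the summarize
name and the (name, duration) input shape are hypothetical and are not
taken from schedgraph itself.

```python
from collections import defaultdict


def summarize(events):
    """Build a per-event-type table of (count, total, average).

    events is an iterable of (name, duration) pairs (assumed shape);
    duration is None for point events that have no duration.
    """
    counts = defaultdict(int)
    totals = defaultdict(float)
    timed = defaultdict(int)
    for name, duration in events:
        counts[name] += 1
        if duration is not None:
            totals[name] += duration
            timed[name] += 1
    table = {}
    for name in counts:
        if timed[name]:
            # Average only over the events that actually carry a duration.
            average = totals[name] / timed[name]
            table[name] = (counts[name], totals[name], average)
        else:
            table[name] = (counts[name], None, None)
    return table
```

The average is kept next to the total because it is often the more
useful of the two when event types have very different frequencies.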
- Add support for sorting rows by clicking and dragging them to their new
position.
- Add support for configuring the cpu background colors.
- Improve the scaling so a better center is maintained as you zoom. This
is not perfect due to precision loss with floats used in the window
views.
- Add new colors and a random assignment for unknown event types. A table
is used for known event types. This is the only event specific
information.
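
One way to combine a table of known event colors with a random
assignment for unknown types is sketched below. The ColorMap class, the
event names, and the palette are all hypothetical; the point is only
that an unknown type gets a random color once and keeps it afterwards.

```python
import random


class ColorMap:
    """Known event types use a fixed table; others get a random color."""

    def __init__(self, known, palette, seed=None):
        self.colors = dict(known)      # event type -> color name
        self.palette = list(palette)   # choices for unknown event types
        self.rng = random.Random(seed)

    def lookup(self, event_type):
        if event_type not in self.colors:
            # Remember the choice so the type is drawn consistently.
            self.colors[event_type] = self.rng.choice(self.palette)
        return self.colors[event_type]


# Illustrative values only; these are not schedgraph's actual colors.
colormap = ColorMap({"running": "green", "sleep": "blue"},
                    ["orange", "cyan", "magenta"], seed=1)
```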
- Callwheels traced via KTR_CALLOUT. Each CPU is assigned a callwheel
source. The events on this source are the execution of individual callout
routines. Each routine shows up as a green rectangle while it is executed
and the event details include the function pointer and argument.
- Locks traced via KTR_LOCK. Currently, each lock name is assigned an event
source (since the existing KTR_LOCK traces only include lock names and
not pointers). This does mean that if multiple locks of the same name are
manipulated, the source line for that name may be confusing. However, for
many cases this can be useful. Locks are blue when they are held and
purple when contended. The contention support is a bit weak due to
limitations in the rw_rlock() and mtx_lock_spin() logging messages
currently. I also have not added support for contention on lockmgr,
sx, or rmlocks yet. What is there now can be profitably used to examine
activity on Giant however.
- Expand the width of the event source names column a bit to allow for some
of the longer names of these new source types.
(threads, CPU load counters, etc.). Each source is tagged with a group
and an order similar to the SYSINIT SI_SUB_* and SI_ORDER_*. After
the file is parsed, all the sources are then sorted. Currently, the only
effect of this is that the CPU loads are now sorted by CPU ID (so
CPU 0 is always first). However, this makes it easier to add new types
of event sources in the future and have them all clustered together
instead of intertwined with threads.
- Python lists perform insertions at the tail much faster than insertions
at the head. For a trace that had a lot of events for a single event
source, the constant insertions of new events to the head of the
per-source event list caused a noticeable slowdown. To compensate,
append new events to the end of the list during parsing and then
reverse the list prior to drawing.
- Somewhere in the tkinter internals the coordinates of a canvas are
stored in a signed 32-bit integer. As a result, if the box for
an event spans 2^31, it would actually end up having a negative
X offset at one end. The result was a single box that covered the
entire event source. Kris worked around this for some traces by
bumping up the initial ticks/pixel ratio from 1 to 10. However, a
divisor of 10 can still be too small for large tracefiles (e.g.
with 4 million entries). Instead of hardcoding the initial scaling
ratio, calculate it from the time span of the trace file.
- Add support for using the mouse wheel to scroll the graph window
up and down.
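
Two of the items above can be sketched in a few lines: appending during
parsing and reversing once before drawing, and deriving the initial
ticks/pixel ratio from the time span of the trace. The function names
and the max_pixels budget are assumptions for illustration; the limit
actually chosen in schedgraph may differ.

```python
MAX_CANVAS_COORD = 2**31 - 1  # largest value of a signed 32-bit integer


def build_event_list(parsed):
    """Collect events with cheap tail appends, then reverse once."""
    events = []
    for ev in parsed:
        events.append(ev)  # amortized O(1); insert(0, ev) is O(n) each time
    events.reverse()       # single O(n) pass before drawing
    return events


def initial_ratio(first_tick, last_tick, max_pixels=MAX_CANVAS_COORD):
    """Smallest integer ticks/pixel ratio that keeps the trace in range.

    max_pixels is a hypothetical coordinate budget; the result is the
    ceiling of span / max_pixels, and never less than 1.
    """
    span = last_tick - first_tick
    return max(1, -(-span // max_pixels))
```

With a budget of 1000 pixels, for instance, a span of 10000 ticks gives
a ratio of 10 and a span of 10001 ticks gives 11.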
post collection. This is too error prone and introduces uncertainty into
the timing. We'll simply have to require synchronized TSCs to run
schedgraph on MP.
Sponsored by: Nokia
* Explain why 32768 entries is usually not enough
* Increase the scaling ratio to 10 to deal with 32-bit overflows that
can occur in calculating the canvas offsets
o add things i want to TODO list
o add Record entry to each event which back-maps to the line # in the ktr file;
useful for finding local context when the ktr file has lots of items that
schedgraph doesn't grok
o add missing KTR_SCHED event handlers
o expose Counter max value through a ymax method for widget building
o show timestamps in records rejected 'cuz time goes backwards
This only works if there is no significant drift and all processors are
running at the same frequency. Fortunately, schedgraph traces on MP
machines tend to cover less than a second so drift shouldn't be an issue.
- KTRFile::synchstamp() iterates once over the whole list to determine the
lowest tsc value and adjusts all other values to match. We assume
that the first tick recorded on all cpus happened at the same instant to
start with.
- KTRFile::monostamp() iterates again over the whole file and checks for
a cpu agnostic monotonically increasing clock. If the time ever goes
backwards the cpu responsible is adjusted further to fit. This will
make the possible incorrect delta between cpus as small as the shortest
time between two events. This time can be fairly large due to sched_lock
essentially protecting all events.
- KTRFile::checkstamp() now returns an adjusted timestamp.
- StateEvent::draw() detects states that occur out of order in time and
draws them as 0 pixels after printing a warning.
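
The two passes described above can be illustrated with standalone
functions. This is only a sketch of the idea, not the KTRFile methods
themselves: it assumes events are (cpu, tsc) pairs listed in the order
they were logged, and the function bodies are reconstructed from the
description.

```python
def synchstamp(events):
    """Shift each CPU so its first timestamp matches the lowest one seen."""
    if not events:
        return []
    first = {}
    for cpu, tsc in events:
        if cpu not in first or tsc < first[cpu]:
            first[cpu] = tsc
    lowest = min(first.values())
    # Assume the first tick on every CPU happened at the same instant.
    return [(cpu, tsc - (first[cpu] - lowest)) for cpu, tsc in events]


def monostamp(events):
    """Force a CPU-agnostic, monotonically non-decreasing clock."""
    adjust = {}  # extra per-CPU correction accumulated so far
    out = []
    last = None
    for cpu, tsc in events:
        tsc += adjust.get(cpu, 0)
        if last is not None and tsc < last:
            # Time went backwards: push this CPU forward from here on.
            adjust[cpu] = adjust.get(cpu, 0) + (last - tsc)
            tsc = last
        out.append((cpu, tsc))
        last = tsc
    return out
```

As noted above, the remaining error between CPUs after the second pass
can be as large as the gap between two adjacent events.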
from the tsc.
- Set skipnext = 1 for yielding and preempted events so we don't show the
event that adds us back to the run queue. It used to be 2 so we would
skip the ksegrp run queue addition and the system run queue addition
but the ksegrp run queue has gone away.
- Don't display down to nanosecond resolution for scheduling events right
now. This can sometimes cause a division by zero.
as they are the setrunqueue() and sched_add() calls. Since they happen
immediately before the thread is placed on a run queue they would normally
dwarf the more informative preemption or yield event and it is implicitly
understood that a thread is back on the run queue as part of these events.
python and tkinter. Schedgraph takes input from files produced by
ktrdump -ct when KTR_SCHED is compiled into the kernel. The output
represents the states of each thread with colored line segments as well
as colored points for non-state scheduler events. Each line segment and
point is clickable to obtain extra detail.