malloc(9) statistics from kernel memory or a kernel coredump, to catch
up with recent changes to adopt per-CPU malloc(9) statistics. The new
routines walk the per-CPU statistics pools and coalesce them for
presentation to the user.
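
The coalescing step can be sketched as follows. This is a minimal
illustration, not the routines from this change: the struct layouts and
the name coalesce_stats() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-CPU counters for one malloc type (not the kernel layout). */
struct percpu_stats {
	uint64_t allocs;	/* allocations performed on this CPU */
	uint64_t frees;		/* frees performed on this CPU */
	uint64_t bytes_alloc;	/* bytes allocated on this CPU */
	uint64_t bytes_freed;	/* bytes freed on this CPU */
};

/* Totals presented to the user after coalescing. */
struct total_stats {
	uint64_t allocs;
	uint64_t frees;
	uint64_t in_use;	/* allocs - frees */
	uint64_t mem_use;	/* bytes_alloc - bytes_freed */
};

struct total_stats
coalesce_stats(const struct percpu_stats *cpus, size_t ncpu)
{
	struct total_stats t = { 0, 0, 0, 0 };
	uint64_t balloc = 0, bfreed = 0;
	size_t i;

	assert(cpus != NULL || ncpu == 0);
	for (i = 0; i < ncpu; i++) {
		t.allocs += cpus[i].allocs;
		t.frees += cpus[i].frees;
		balloc += cpus[i].bytes_alloc;
		bfreed += cpus[i].bytes_freed;
	}
	/*
	 * Memory may be freed on a different CPU than the one that
	 * allocated it, so only the summed values are meaningful.
	 */
	t.in_use = t.allocs - t.frees;
	t.mem_use = balloc - bfreed;
	return (t);
}
```

The differences are taken only after summing because a single CPU's
frees can exceed its own allocations.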
Fixed a nearby bug. The "play it safe" code in dosysctl() was unsafe
because it overran the buffer by 1 if sysctl() filled all of the buffer.
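
One way to avoid this kind of overrun is to never offer the last byte
of the buffer to the call that fills it. The sketch below assumes a
hypothetical fetch callback standing in for sysctl(); fetch_terminated()
and example_fetch() are illustrative names, not code from dosysctl().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for sysctl(): on entry *lenp is the space
 * available, on success it is the number of bytes written.  Returns
 * -1 if the buffer is too small.
 */
typedef int (*fetch_fn)(char *buf, size_t *lenp);

/*
 * Fetch into a growing buffer and NUL-terminate the result without
 * writing past the end.  Returns a malloc()ed string, or NULL.
 */
char *
fetch_terminated(fetch_fn fetch, size_t initial)
{
	char *buf = NULL, *nbuf;
	size_t bufsize = initial, len;

	assert(fetch != NULL && initial > 1);
	for (;;) {
		nbuf = realloc(buf, bufsize);
		if (nbuf == NULL) {
			free(buf);
			return (NULL);
		}
		buf = nbuf;
		/* Offer one byte less than we own, so buf[len] stays valid. */
		len = bufsize - 1;
		if (fetch(buf, &len) == 0)
			break;
		bufsize *= 2;		/* double exactly once per retry */
	}
	buf[len] = '\0';
	return (buf);
}

/* Example source: 11 bytes with no terminator. */
int
example_fetch(char *buf, size_t *lenp)
{
	static const char data[] =
	    { 'h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd' };

	if (*lenp < sizeof(data))
		return (-1);
	memcpy(buf, data, sizeof(data));
	*lenp = sizeof(data);
	return (0);
}
```

Even when the callback fills every byte it was offered, the terminator
lands on the reserved byte rather than one past the allocation.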
Fixed a nearby style bug in output. Not just 1, but 2 extra newlines
were printed at the end by "vmstat -m" and "vmstat -z". Don't print
any newlines explicitly. This depends on 2 of the many formatting
bugs in the corresponding sysctls. First, the sysctls return an extra
newline at the end of the strings. This also messes up output from
sysctl(8). Second, the sysctls return an extra newline at the beginning
of the strings. This is good for separating the 2 tables output by
"vmstat -mz" and for starting the header on a new line in plain sysctl
output, but gives a bogus extra newline at the beginning for "vmstat -[m | z]"
and "sysctl -n [kern.malloc | vm.zone]".
Fixed some nearby style bugs in the source code:
- the same line that misspelled 0 as NULL also spelled NULL as 0.
- the size was doubled twice in the realloc loop.
- the "play it safe" comment was misleading. Terminating the buffer
is bogus because dosysctl() is only meant to work with sysctls that
return strings and the terminator is part of a string. However, the
kern.malloc sysctl has more than style bugs. It also doesn't return
a string. Termination is needed to work around this bug.
- Replace overly-complicated (and buggy) -a logic with a much simpler
version: -a causes all interrupts to be displayed, otherwise only
those that have occurred are displayed. This removes the need for
any MD code.
- Instead of just making sure intrcnt is large enough, figure out the
exact size it needs to be. We derive nintr from this number, and we
don't want to risk printing garbage. Note that on sparc64, we end up
printing garbage anyway because the names of non-existent interrupts
are left uninitialized by the kernel.
Tested on: alpha, i386, sparc64
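
The simplified -a rule and the derivation of nintr from the exact size
can be sketched like this. intr_lines() is a hypothetical helper that
only counts the lines that would be printed; it is not vmstat's code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count the interrupts that would be listed.  "counts" holds one
 * counter per interrupt and "size" is its exact size in bytes, from
 * which the number of interrupts is derived.
 */
size_t
intr_lines(const unsigned long *counts, size_t size, int aflag)
{
	size_t nintr = size / sizeof(counts[0]);
	size_t i, shown = 0;

	assert(counts != NULL || size == 0);
	for (i = 0; i < nintr; i++) {
		/* -a lists everything; otherwise skip idle interrupts. */
		if (aflag || counts[i] != 0)
			shown++;
	}
	return (shown);
}
```

Because nintr comes from the exact size, the loop never reads entries
beyond what was actually returned.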
o nintr and inamlen must be of type size_t, not int,
o Remove now unnecessary casts,
o Handle the aflag differently, because the intr. names have a
fixed width and almost always have trailing spaces.
The use of libkvm for post-mortem analysis is still supported (though it
could use more testing). We can now remove vmstat's setgid bit.
While I'm here, hack the interrupt listing code to not display interrupts
that haven't occurred unless the -a option was given on the command line,
and document this change.
Kernel:
Change statistics to use the *uptime() timescale (i.e., relative to
boottime) rather than the UTC aligned timescale. This makes the
device statistics code oblivious to clock steps.
Change timestamps to bintime format; they are cheaper.
Remove the "busy_count", and replace it with two counter fields:
"start_count" and "end_count", which are updated in the down and
up paths respectively. This removes the locking constraint on
devstat.
Add a timestamp argument to devstat_start_transaction(); this will
normally be a timestamp set by the *_bio() function in bp->bio_t0.
Use this field to calculate duration of I/O operations.
Add two timestamp arguments to devstat_end_transaction(): one is
the current time (a NULL pointer means "take timestamp yourself"),
the other is the timestamp of when this transaction started (see
above).
Change calculation of busy_time to operate on "the salami principle":
Only when we are idle, which we can determine by the start+end
counts being identical, do we update the "busy_from" field in the
down path. In the up path we accumulate the timeslice in busy_time
and update busy_from.
Change the byte_* and num_* fields into two arrays: bytes[] and
operations[].
Userland:
Rename the misleading "busy_time" to "snap_time" and make it a long
double, since that is what most users need anyway. Fill it using
clock_gettime(CLOCK_MONOTONIC) to put it on the same timescale as
the kernel fields.
Change devstat_compute_etime() to operate on struct bintime.
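
Computing an elapsed time from two bintime-style values might look like
the sketch below. struct bt is a local stand-in for struct bintime
(whole seconds plus a 64-bit binary fraction), and bt_etime() is a
hypothetical name; the NULL handling is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for struct bintime. */
struct bt {
	int64_t sec;
	uint64_t frac;		/* units of 2^-64 second */
};

/*
 * Elapsed seconds from "prev" to "cur".  In this sketch a NULL prev
 * means "since time zero".
 */
long double
bt_etime(const struct bt *cur, const struct bt *prev)
{
	struct bt d;

	assert(cur != NULL);
	d = *cur;
	if (prev != NULL) {
		if (d.frac < prev->frac)
			d.sec--;		/* borrow from the seconds */
		d.frac -= prev->frac;		/* wraps modulo 2^64 */
		d.sec -= prev->sec;
	}
	/* 18446744073709551616 is 2^64. */
	return ((long double)d.sec +
	    (long double)d.frac / 18446744073709551616.0L);
}
```

The conversion to floating point happens only once, after the exact
integer subtraction.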
Remove the version 2 legacy interface: the change to bintime makes
compatibility far too expensive.
Fix a bug in systat's "vm" page where boot relative busy times would
be bogus.
Bump __FreeBSD_version to 500107
Review & Collaboration by: ken
Updated the kmemzones logic such that the ks_size bitmap can be used as an
index into it to report the size of the zone used.
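
Using a size bitmap as an index can be sketched as follows. The table
contents and the name sizes_used() are hypothetical; the assumption is
only that bit i of the bitmap refers to entry i of the zone table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical zone sizes; bit i of the bitmap selects zone_sizes[i]. */
static const unsigned int zone_sizes[] =
    { 16, 32, 64, 128, 256, 512, 1024, 2048 };
#define NZONES	(sizeof(zone_sizes) / sizeof(zone_sizes[0]))

/*
 * Store the sizes selected by "bitmap" in out[] (room for NZONES
 * entries) and return how many were stored.
 */
size_t
sizes_used(unsigned long bitmap, unsigned int *out)
{
	size_t i, n = 0;

	assert(out != NULL);
	for (i = 0; i < NZONES; i++) {
		if (bitmap & (1UL << i))
			out[n++] = zone_sizes[i];
	}
	return (n);
}
```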
Create the kern.malloc sysctl which replaces the kvm mechanism to report
similar data. This will provide an easy place for statistics aggregation if
malloc_type statistics become per-CPU data.
Add some code ifdef'd under MALLOC_PROFILING to facilitate a tool for sizing
the malloc buckets.
to unsigned long long.
Don't be too overzealous with the printing of ks_calls in the total
statistics, cut back from 20 to 13 positions to print (which should last
a couple of years easily (20 digits is enough for 3168 years of calls at a
measly billion (10^9) calls per second.)).
Submitted by: bde
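
The arithmetic behind the field widths can be checked with a small
helper. years_to_overflow() is a hypothetical name, and the 365.25-day
year is an assumption chosen because it reproduces the figure quoted
above.

```c
#include <assert.h>

/*
 * Years until a counter printed in "digits" decimal positions no
 * longer fits, at a steady "calls_per_sec", using 365.25-day years.
 */
double
years_to_overflow(unsigned int digits, double calls_per_sec)
{
	double limit = 1.0;
	unsigned int i;

	assert(calls_per_sec > 0.0);
	for (i = 0; i < digits; i++)
		limit *= 10.0;	/* 10^digits: first value that does not fit */
	return (limit / calls_per_sec / (365.25 * 86400.0));
}
```

For 20 digits at 10^9 calls per second this gives a little over 3168
years. For 13 digits, an illustrative rate of 10^5 calls per second
gives a little over 3 years.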
make sense to me) and change the printf argument from %8ld to %20llu to
accompany the printing of the totals.
Realigned the header printed above it as well.
PR: 32342
Submitted by: ryan beasley <ryanb@goddamnbastard.org>
Reviewed by: jeff, Tim J Robbins