Introduce pmcstat(8) changes for summarizing hwpmc(4) callchain records
in textual form and in gmon.out format.

Update manual page.

Sponsored by:	FreeBSD Foundation and Google Inc.
Joseph Koshy 2007-12-07 08:26:21 +00:00
parent d07f36b075
commit b6010f9e61
5 changed files with 1079 additions and 213 deletions

@ -6,7 +6,7 @@ PROG= pmcstat
MAN= pmcstat.8
DPADD= ${LIBKVM} ${LIBPMC} ${LIBM}
LDADD= -lkvm -lpmc -lm
LDADD= -lelf -lkvm -lpmc -lm
WARNS?= 6

@ -1,4 +1,6 @@
.\" Copyright (c) 2003-2007 Joseph Koshy. All rights reserved.
.\" Copyright (c) 2003-2007 Joseph Koshy
.\" Copyright (c) 2007 The FreeBSD Foundation
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
@ -34,7 +36,9 @@
.Op Fl C
.Op Fl D Ar pathname
.Op Fl E
.Op Fl G Ar pathname
.Op Fl M Ar mapfilename
.Op Fl N
.Op Fl O Ar logfilename
.Op Fl P Ar event-spec
.Op Fl R Ar logfilename
@ -53,6 +57,7 @@
.Op Fl t Ar process-spec
.Op Fl v
.Op Fl w Ar secs
.Op Fl z Ar graphdepth
.Op Ar command Op Ar args
.Sh DESCRIPTION
The
@ -123,6 +128,16 @@ complex pipeline of processes when used in conjunction with the
.Fl d
option.
The default is to not enable per-process tracking.
.It Fl G Ar pathname
Print callchain information to file
.Ar pathname .
If argument
.Ar pathname
is a
.Dq Li -
this information is sent to the output file specified by the
.Fl o
option.
.It Fl M Ar mapfilename
Write the mapping between executable objects encountered in the event
log and the abbreviated pathnames used for
@ -138,6 +153,9 @@ in which case this mapping information is sent to the output
file configured by the
.Fl o
option.
.It Fl N
Toggle capturing callchain information for subsequent sampling PMCs.
The default is for sampling PMCs to capture callchain information.
.It Fl O Ar logfilename
Send logging output to file
.Ar logfilename .
@ -192,14 +210,15 @@ Argument
is a comma separated list of CPU numbers, or the literal
.Sq *
denoting all CPUs.
The default is to allocate system mode PMCs on all CPUs.
The default is to allocate system mode PMCs on all active CPUs in
the system.
.It Fl d
Toggle between process mode PMCs measuring events for the target
process' current and future children or only measuring events for
the target process.
The default is to measure events for the target process alone.
.It Fl g
Produce flat execution profiles in a format compatible with
Produce profiles in a format compatible with
.Xr gprof 1 .
A separate profile file is generated for each executable object
encountered.
@ -223,7 +242,10 @@ Send counter readings and textual representations of logged data
to file
.Ar outputfile .
The default is to send output to
.Pa stderr .
.Pa stderr
when collecting live data and to
.Pa stdout
when processing a pre-existing logfile.
.It Fl p Ar event-spec
Allocate a process mode counting PMC measuring hardware events
specified in
@ -257,6 +279,10 @@ The argument
.Ar secs
may be a fractional value.
The default interval is 5 seconds.
.It Fl z Ar graphdepth
When printing system-wide callgraphs, limit callgraphs to the depth
specified by argument
.Ar graphdepth .
.El
.Pp
If
@ -286,9 +312,15 @@ To count instruction tlb-misses on CPUs 0 and 2 on a Intel
Pentium Pro/Pentium III SMP system use:
.Dl "pmcstat -c 0,2 -s p6-itlb-miss"
.Pp
To collect profiling information for a specific process with pid 1234
based on instruction cache misses seen by it use:
.Dl "pmcstat -P ic-misses -t 1234 -O /tmp/sample.out"
.Pp
To perform system-wide sampling on all configured processors
based on processor instructions retired use:
.Dl "pmcstat -S instructions -O /tmp/sample.out"
If callgraph capture is not desired use:
.Dl "pmcstat -N -S instructions -O /tmp/sample.out"
.Pp
To send the generated event log to a remote machine use:
.Dl "pmcstat -S instructions -O remotehost:port"
@ -298,10 +330,27 @@ On the remote machine, the sample log can be collected using
.Pp
To generate
.Xr gprof 1
compatible flat profiles from a sample file use:
compatible profiles from a sample file use:
.Dl "pmcstat -R /tmp/sample.out -g"
.Pp
To print a system-wide profile with callgraphs to file
.Pa "foo.graph"
use:
.Dl "pmcstat -R /tmp/sample.out -G foo.graph"
.Sh DIAGNOSTICS
.Ex -std
.Sh COMPATIBILITY
Due to the limitations of the
.Pa gmon.out
file format,
.Xr gprof 1
compatible profiles generated by the
.Fl g
option do not contain information about calls that cross executable
boundaries.
The generated
.Pa gmon.out
files are also only meaningful for native executables.
.Sh SEE ALSO
.Xr gprof 1 ,
.Xr nc 1 ,
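The manual text above says that a pathname of "-" given to -G sends the callgraph to the file selected by -o. A minimal C sketch of that selection rule follows. The helper name select_graph_file is hypothetical and is not part of pmcstat; error reporting is left to the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of pmcstat): choose where callgraph
 * output goes.  A pathname of "-" reuses the already-open print file;
 * any other pathname is opened for writing.  Returns NULL if the file
 * cannot be opened.
 */
static FILE *
select_graph_file(const char *pathname, FILE *printfile)
{
	assert(pathname != NULL && printfile != NULL);

	if (strcmp(pathname, "-") == 0)
		return (printfile);
	return (fopen(pathname, "w"));
}
```

With this shape, a caller that gets NULL back can report the failure in whatever way suits it, and a caller that gets the print file back should take care not to close it twice.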

@ -1,7 +1,11 @@
/*-
* Copyright (c) 2003-2007, Joseph Koshy
* Copyright (c) 2007 The FreeBSD Foundation
* All rights reserved.
*
* Portions of this software were developed by A. Joseph Koshy under
* sponsorship from the FreeBSD Foundation and Google, Inc.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
@ -486,7 +490,9 @@ pmcstat_show_usage(void)
"\t -C\t\t (toggle) show cumulative counts\n"
"\t -D path\t create profiles in directory \"path\"\n"
"\t -E\t\t (toggle) show counts at process exit\n"
"\t -G file\t write a system-wide callgraph to \"file\"\n"
"\t -M file\t print executable/gmon file map to \"file\"\n"
"\t -N\t\t (toggle) capture callchains\n"
"\t -O file\t send log output to \"file\"\n"
"\t -P spec\t allocate a process-private sampling PMC\n"
"\t -R file\t read events from \"file\"\n"
@ -504,7 +510,8 @@ pmcstat_show_usage(void)
"\t -s spec\t allocate a system-wide counting PMC\n"
"\t -t pid\t\t attach to running process with pid \"pid\"\n"
"\t -v\t\t increase verbosity\n"
"\t -w secs\t set printing time interval"
"\t -w secs\t set printing time interval\n"
"\t -z depth\t limit callchain display depth"
);
}
@ -516,16 +523,17 @@ int
main(int argc, char **argv)
{
double interval;
int option, npmc, ncpu;
int option, npmc, ncpu, haltedcpus;
int c, check_driver_stats, current_cpu, current_sampling_count;
int do_print, do_descendants;
int do_logproccsw, do_logprocexit;
int do_callchain, do_descendants, do_logproccsw, do_logprocexit;
int do_print;
size_t dummy;
int graphdepth;
int pipefd[2];
int use_cumulative_counts;
uint32_t cpumask;
char *end, *tmp;
const char *errmsg;
const char *errmsg, *graphfilename;
enum pmcstat_state runstate;
struct pmc_driverstats ds_start, ds_end;
struct pmcstat_ev *ev;
@ -538,10 +546,12 @@ main(int argc, char **argv)
check_driver_stats = 0;
current_cpu = 0;
current_sampling_count = DEFAULT_SAMPLE_COUNT;
do_callchain = 1;
do_descendants = 0;
do_logproccsw = 0;
do_logprocexit = 0;
use_cumulative_counts = 0;
graphfilename = "-";
args.pa_required = 0;
args.pa_flags = 0;
args.pa_verbosity = 1;
@ -550,21 +560,33 @@ main(int argc, char **argv)
args.pa_kernel = strdup("/boot/kernel");
args.pa_samplesdir = ".";
args.pa_printfile = stderr;
args.pa_graphdepth = DEFAULT_CALLGRAPH_DEPTH;
args.pa_graphfile = NULL;
args.pa_interval = DEFAULT_WAIT_INTERVAL;
args.pa_mapfilename = NULL;
args.pa_inputpath = NULL;
args.pa_outputpath = NULL;
STAILQ_INIT(&args.pa_events);
SLIST_INIT(&args.pa_targets);
bzero(&ds_start, sizeof(ds_start));
bzero(&ds_end, sizeof(ds_end));
ev = NULL;
dummy = sizeof(ncpu);
/*
* The initial CPU mask specifies all non-halted CPUs in the
* system.
*/
dummy = sizeof(int);
if (sysctlbyname("hw.ncpu", &ncpu, &dummy, NULL, 0) < 0)
err(EX_OSERR, "ERROR: Cannot determine #cpus");
err(EX_OSERR, "ERROR: Cannot determine the number of CPUs");
cpumask = (1 << ncpu) - 1;
if (sysctlbyname("machdep.hlt_cpus", &haltedcpus, &dummy,
NULL, 0) < 0)
err(EX_OSERR, "ERROR: Cannot determine which CPUs are halted");
cpumask &= ~haltedcpus;
while ((option = getopt(argc, argv,
"CD:EM:O:P:R:S:Wc:dgk:n:o:p:qr:s:t:vw:")) != -1)
"CD:EG:M:NO:P:R:S:Wc:dgk:n:o:p:qr:s:t:vw:z:")) != -1)
switch (option) {
case 'C': /* cumulative values */
use_cumulative_counts = !use_cumulative_counts;
@ -598,6 +620,11 @@ main(int argc, char **argv)
args.pa_required |= FLAG_HAS_PROCESS_PMCS;
break;
case 'G': /* produce a system-wide callgraph */
args.pa_flags |= FLAG_DO_CALLGRAPHS;
graphfilename = optarg;
break;
case 'g': /* produce gprof compatible profiles */
args.pa_flags |= FLAG_DO_GPROF;
break;
@ -605,7 +632,7 @@ main(int argc, char **argv)
case 'k': /* pathname to the kernel */
free(args.pa_kernel);
args.pa_kernel = strdup(optarg);
args.pa_required |= FLAG_DO_GPROF;
args.pa_required |= FLAG_DO_ANALYSIS;
args.pa_flags |= FLAG_HAS_KERNELPATH;
break;
@ -619,6 +646,11 @@ main(int argc, char **argv)
args.pa_mapfilename = optarg;
break;
case 'N':
do_callchain = !do_callchain;
args.pa_required |= FLAG_HAS_SAMPLING_PMCS;
break;
case 'p': /* process virtual counting PMC */
case 's': /* system-wide counting PMC */
case 'P': /* process virtual sampling PMC */
@ -664,6 +696,8 @@ main(int argc, char **argv)
ev->ev_cpu = PMC_CPU_ANY;
ev->ev_flags = 0;
if (do_callchain)
ev->ev_flags |= PMC_F_CALLCHAIN;
if (do_descendants)
ev->ev_flags |= PMC_F_DESCENDANTS;
if (do_logprocexit)
@ -725,7 +759,7 @@ main(int argc, char **argv)
break;
case 'R': /* read an existing log file */
if (args.pa_logparser != NULL)
if (args.pa_inputpath != NULL)
errx(EX_USAGE, "ERROR: option -R may only be "
"specified once.");
args.pa_inputpath = optarg;
@ -761,6 +795,15 @@ main(int argc, char **argv)
FLAG_HAS_COUNTING_PMCS | FLAG_HAS_OUTPUT_LOGFILE);
break;
case 'z':
graphdepth = strtod(optarg, &end);
if (*end != '\0' || graphdepth <= 0)
errx(EX_USAGE, "ERROR: Illegal callchain "
"depth \"%s\".", optarg);
args.pa_graphdepth = graphdepth;
args.pa_required |= FLAG_DO_CALLGRAPHS;
break;
case '?':
default:
pmcstat_show_usage();
@ -771,9 +814,14 @@ main(int argc, char **argv)
args.pa_argc = (argc -= optind);
args.pa_argv = (argv += optind);
args.pa_cpumask = cpumask; /* For selecting CPUs using -R. */
if (argc) /* command line present */
args.pa_flags |= FLAG_HAS_COMMANDLINE;
if (args.pa_flags & (FLAG_DO_GPROF | FLAG_DO_CALLGRAPHS))
args.pa_flags |= FLAG_DO_ANALYSIS;
/*
* Check invocation syntax.
*/
@ -822,9 +870,10 @@ main(int argc, char **argv)
errx(EX_USAGE, "ERROR: options -d, -E, and -W require a "
"process mode PMC to be specified.");
/* check for -c cpu and not system mode PMCs */
/* check for -c cpu with no system mode PMCs or logfile. */
if ((args.pa_required & FLAG_HAS_SYSTEM_PMCS) &&
(args.pa_flags & FLAG_HAS_SYSTEM_PMCS) == 0)
(args.pa_flags & FLAG_HAS_SYSTEM_PMCS) == 0 &&
(args.pa_flags & FLAG_READ_LOGFILE) == 0)
errx(EX_USAGE, "ERROR: option -c requires at least one "
"system mode PMC to be specified.");
@ -837,14 +886,14 @@ main(int argc, char **argv)
/* check for sampling mode options without a sampling PMC spec */
if ((args.pa_required & FLAG_HAS_SAMPLING_PMCS) &&
(args.pa_flags & FLAG_HAS_SAMPLING_PMCS) == 0)
errx(EX_USAGE, "ERROR: options -n and -O require at least "
"one sampling mode PMC to be specified.");
errx(EX_USAGE, "ERROR: options -N, -n and -O require at "
"least one sampling mode PMC to be specified.");
/* check if -g is being used correctly */
if ((args.pa_flags & FLAG_DO_GPROF) &&
/* check if -g/-G are being used correctly */
if ((args.pa_flags & FLAG_DO_ANALYSIS) &&
!(args.pa_flags & (FLAG_HAS_SAMPLING_PMCS|FLAG_READ_LOGFILE)))
errx(EX_USAGE, "ERROR: option -g requires sampling PMCs or -R "
"to be specified.");
errx(EX_USAGE, "ERROR: options -g/-G require sampling PMCs "
"or -R to be specified.");
/* check if -O was spuriously specified */
if ((args.pa_flags & FLAG_HAS_OUTPUT_LOGFILE) &&
@ -853,16 +902,16 @@ main(int argc, char **argv)
"ERROR: option -O is used only with options "
"-E, -P, -S and -W.");
/* -D dir and -k kernel path require -g or -R */
/* -k kernel path requires -g/-G or -R */
if ((args.pa_flags & FLAG_HAS_KERNELPATH) &&
(args.pa_flags & FLAG_DO_GPROF) == 0 &&
(args.pa_flags & FLAG_DO_ANALYSIS) == 0 &&
(args.pa_flags & FLAG_READ_LOGFILE) == 0)
errx(EX_USAGE, "ERROR: option -k is only used with -g/-R.");
/* -D only applies to gprof output mode (-g) */
if ((args.pa_flags & FLAG_HAS_SAMPLESDIR) &&
(args.pa_flags & FLAG_DO_GPROF) == 0 &&
(args.pa_flags & FLAG_READ_LOGFILE) == 0)
errx(EX_USAGE, "ERROR: option -D is only used with -g/-R.");
(args.pa_flags & FLAG_DO_GPROF) == 0)
errx(EX_USAGE, "ERROR: option -D is only used with -g.");
/* -M mapfile requires -g or -R */
if (args.pa_mapfilename != NULL &&
@ -882,9 +931,9 @@ main(int argc, char **argv)
"sampling PMCs are specified together.");
/*
* Check if "-k kerneldir" was specified, and if whether 'kerneldir'
* actually refers to a a file. If so, use `dirname path` to determine
* the kernel directory.
* Check whether "-k kerneldir" was specified and, if it was,
* whether 'kerneldir' actually refers to a file. If so, use
* `dirname path` to determine the kernel directory.
*/
if (args.pa_flags & FLAG_HAS_KERNELPATH) {
(void) snprintf(buffer, sizeof(buffer), "%s%s", args.pa_fsroot,
@ -910,13 +959,27 @@ main(int argc, char **argv)
}
}
/*
* If we have a callgraph to be created, select the output file.
*/
if (args.pa_flags & FLAG_DO_CALLGRAPHS) {
if (strcmp(graphfilename, "-") == 0)
args.pa_graphfile = args.pa_printfile;
else {
args.pa_graphfile = fopen(graphfilename, "w");
if (args.pa_graphfile == NULL)
err(EX_OSERR, "ERROR: cannot open \"%s\" "
"for writing", graphfilename);
}
}
/* if we've been asked to process a log file, do that and exit */
if (args.pa_flags & FLAG_READ_LOGFILE) {
/*
* Print the log in textual form if we haven't been
* asked to generate gmon.out files.
* asked to generate profiling information.
*/
if ((args.pa_flags & FLAG_DO_GPROF) == 0)
if ((args.pa_flags & FLAG_DO_ANALYSIS) == 0)
args.pa_flags |= FLAG_DO_PRINT;
pmcstat_initialize_logging(&args);
@ -1162,7 +1225,7 @@ main(int argc, char **argv)
FLAG_HAS_PIPE)) {
runstate = pmcstat_close_log(&args);
if (args.pa_flags &
(FLAG_DO_PRINT|FLAG_DO_GPROF))
(FLAG_DO_PRINT|FLAG_DO_ANALYSIS))
pmcstat_process_log(&args);
}
do_print = 1; /* print PMCs at exit */
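The initial CPU mask computed in main() above (all CPUs reported by hw.ncpu, minus those reported by machdep.hlt_cpus) can be sketched in isolation. This is a sketch under stated assumptions, not the commit's code: the helper name active_cpu_mask is hypothetical, the two sysctl values are passed in as plain parameters, and the guard for 32 or more CPUs is an addition that the diff does not contain.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of pmcstat): build a bitmask with one
 * bit per CPU that is present and not halted.
 */
static uint32_t
active_cpu_mask(int ncpu, uint32_t haltedcpus)
{
	uint32_t mask;

	assert(ncpu > 0);

	/*
	 * ASSUMPTION: saturate at 32 CPUs.  Shifting a 32-bit value by
	 * 32 or more bits is undefined behaviour in C.
	 */
	if (ncpu >= 32)
		mask = UINT32_MAX;
	else
		mask = ((uint32_t)1 << ncpu) - 1;

	/* Clear the bits of CPUs that are currently halted. */
	return (mask & ~haltedcpus);
}
```

For example, four CPUs with CPU 1 halted would give a mask of 0xD under this sketch.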

@ -1,7 +1,11 @@
/*-
* Copyright (c) 2005-2007, Joseph Koshy
* Copyright (c) 2007 The FreeBSD Foundation
* All rights reserved.
*
* Portions of this software were developed by A. Joseph Koshy under
* sponsorship from the FreeBSD Foundation and Google, Inc.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
@ -43,11 +47,14 @@
#define FLAG_HAS_SAMPLESDIR 0x00000800 /* -D dir */
#define FLAG_HAS_KERNELPATH 0x00001000 /* -k kernel */
#define FLAG_DO_PRINT 0x00002000 /* -o */
#define FLAG_DO_CALLGRAPHS 0x00004000 /* -G */
#define FLAG_DO_ANALYSIS 0x00008000 /* -g or -G */
#define DEFAULT_SAMPLE_COUNT 65536
#define DEFAULT_WAIT_INTERVAL 5.0
#define DEFAULT_DISPLAY_HEIGHT 23
#define DEFAULT_BUFFER_SIZE 4096
#define DEFAULT_CALLGRAPH_DEPTH 4
#define PRINT_HEADER_PREFIX "# "
#define READPIPEFD 0
@ -68,9 +75,9 @@
#define PMCSTAT_LDD_COMMAND "/usr/bin/ldd"
#define PMCSTAT_PRINT_ENTRY(A,T,...) do { \
fprintf((A)->pa_printfile, "%-8s", T); \
fprintf((A)->pa_printfile, " " __VA_ARGS__); \
fprintf((A)->pa_printfile, "\n"); \
(void) fprintf((A)->pa_printfile, "%-9s", T); \
(void) fprintf((A)->pa_printfile, " " __VA_ARGS__); \
(void) fprintf((A)->pa_printfile, "\n"); \
} while (0)
enum pmcstat_state {
@ -112,7 +119,10 @@ struct pmcstat_args {
char *pa_kernel; /* pathname of the kernel */
const char *pa_samplesdir; /* directory for profile files */
const char *pa_mapfilename;/* mapfile name */
FILE *pa_graphfile; /* where to send the callgraph */
int pa_graphdepth; /* print depth for callgraphs */
double pa_interval; /* printing interval in seconds */
uint32_t pa_cpumask; /* filter for CPUs analysed */
int pa_argc;
char **pa_argv;
STAILQ_HEAD(, pmcstat_ev) pa_events;
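The relationship between the new flags can be illustrated with a short sketch: FLAG_DO_ANALYSIS is a derived bit that is set whenever either -g or -G was requested, so later checks only need to test one flag. The values of FLAG_DO_CALLGRAPHS and FLAG_DO_ANALYSIS below are copied from the header hunk above. SKETCH_FLAG_DO_GPROF is a placeholder, since the real definition of FLAG_DO_GPROF does not appear in this diff, and derive_analysis_flag is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the header hunk above. */
#define FLAG_DO_CALLGRAPHS	0x00004000	/* -G */
#define FLAG_DO_ANALYSIS	0x00008000	/* -g or -G */

/* ASSUMPTION: placeholder bit; the real FLAG_DO_GPROF value is not shown. */
#define SKETCH_FLAG_DO_GPROF	0x00000001	/* -g */

/*
 * Hypothetical helper (not part of pmcstat): set the derived analysis
 * bit if either profile-producing option was selected.
 */
static uint32_t
derive_analysis_flag(uint32_t flags)
{
	/* The placeholder bit must not collide with the real flags. */
	assert((SKETCH_FLAG_DO_GPROF &
	    (FLAG_DO_CALLGRAPHS | FLAG_DO_ANALYSIS)) == 0);

	if (flags & (SKETCH_FLAG_DO_GPROF | FLAG_DO_CALLGRAPHS))
		flags |= FLAG_DO_ANALYSIS;
	return (flags);
}
```

Deriving the bit once, after option parsing, keeps the later syntax checks independent of how many analysis modes exist.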

File diff suppressed because it is too large