Commit Graph

41 Commits

Author SHA1 Message Date
Alexander Kabaev
8b5ae4db0d Use newly added __used attribute to keep static function symbol from
being eliminated.
2004-07-29 18:02:28 +00:00
Peter Wemm
6d05d7c75a Make profiling work for varargs functions. %al is an additional argument
which indicates the number of xmm registers used in the varargs.  This
stops the explosion that happened when profiling printf() etc.
2004-06-10 22:00:58 +00:00
Bruce Evans
5a8f125ad9 MFi386 (1.37: GUPROF calibration macros; only routine adjustments needed). 2004-05-20 16:22:57 +00:00
Bruce Evans
8693960479 Fixed the type of fptrdiff_t. It needs to be 64 bits in theory, and in
practice too since kernel addresses are almost 2^64 higher than most
user addresses.
2004-05-19 16:19:11 +00:00
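The width argument in the commit above can be checked with a little arithmetic. This is a minimal sketch, not the FreeBSD definition: the helper name and the two example addresses used in the note below are assumptions chosen only to resemble a high kernel text address and a low user text address.

```c
#include <stdint.h>

/*
 * Hypothetical helper: returns nonzero when the distance from `lo` up to
 * `hi` cannot be represented in 32 bits, i.e. when a 32-bit difference
 * type would truncate it.
 */
static int diff_needs_64_bits(uint64_t hi, uint64_t lo)
{
    return (hi - lo) > (uint64_t)UINT32_MAX;
}
```

With an assumed kernel address of 0xffffffff80100000 and an assumed user address of 0x400000, the helper reports that the difference needs 64 bits, while two addresses a few kilobytes apart do not.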
Bruce Evans
19b5915afa Fixed some style bugs (mainly misalignment of backslashes). 2004-05-19 16:04:26 +00:00
Bruce Evans
b2321e7cdb Moved most of the "MI" definitions and declarations from <machine/profile.h>
to <sys/gmon.h>.  Cleaned them up a little by not attempting to ifdef
for incomplete and out of date support for GUPROF in userland, as in
the sparc64 version.
2004-05-19 15:41:26 +00:00
Peter Wemm
2079cde964 The 'call mcount' hooks that gcc inserts when profiling are in a place that
cannot handle the scratch registers being trashed.  So we have to preserve
them ourselves.
2004-05-18 22:52:32 +00:00
Warner Losh
29ae923f44 Remove advertising clause from University of California Regent's license,
per letter dated July 22, 1999.

Approved by: core
2004-04-05 21:29:41 +00:00
Jacques Vidrine
3f6f39ff54 Remove `static' prototype from header file. 2004-01-06 20:36:21 +00:00
David E. O'Brien
69bb404192 Use C99 compatible asm statements. 2003-06-02 00:29:35 +00:00
Peter Wemm
afa8862328 Commit MD parts of a loosely functional AMD64 port. This is based on
a heavily stripped down FreeBSD/i386 (brutally stripped down actually) to
attempt to get a stable base to start from.  There is a lot missing still.
Worth noting:
- The kernel runs at 1GB in order to cheat with the pmap code.  pmap uses
  a variation of the PAE code in order to avoid having to worry about 4
  levels of page tables yet.
- It boots in 64 bit "long mode" with a tiny trampoline embedded in the
  i386 loader.  This simplifies locore.s greatly.
- There are still quite a few fragments of i386-specific code that have
  not been translated yet, and some that I cheated and wrote dumb C
  versions of (bcopy etc).
- It has both int 0x80 for syscalls (but using registers for argument
  passing, as is native on the amd64 ABI), and the 'syscall' instruction
  for syscalls.  int 0x80 preserves all registers, 'syscall' does not.
- I have tried to minimize looking at the NetBSD code, except in a couple
  of places (eg: to find which register they use to replace the trashed
  %rcx register in the syscall instruction).  As a result, there is not a
  lot of similarity.  I did look at NetBSD a few times while debugging to
  get some ideas about what I might have done wrong in my first attempt.
2003-05-01 01:05:25 +00:00
Mark Murray
82e5cdeb6e Fix a declaration that is actually supposed to be a macro definition.
Submitted by:	marius@alchemy.franken.de
2002-09-25 13:46:23 +00:00
Peter Wemm
66422f5b7a Initiate deorbit burn for the i386-only a.out related support. Moves are
under way to move the remnants of the a.out toolchain to ports.  As the
comment in src/Makefile said, this stuff is deprecated and one should not
expect this to remain beyond 4.0-REL.  It has already lasted WAY beyond
that.

Notable exceptions:
gcc - I have not touched the a.out generation stuff there.
ldd/ldconfig - still have some code to interface with a.out rtld.
old as/ld/etc - I have not removed these yet, pending their move to ports.
some includes - necessary for ldd/ldconfig for now.

Tested on: i386 (extensively), alpha
2002-09-17 01:49:00 +00:00
Mark Murray
db8f2e326c Stylify (mainly line up macro EOL-continuation \'s), and add a dummy
alternative for lint.
2002-04-21 10:49:00 +00:00
Alfred Perlstein
b63dc6ad47 Remove __P. 2002-03-20 05:48:58 +00:00
Bruce Evans
92fd4795fa Finish revs.1.23 and 1.24 so that MCOUNT_ENTER really actually compiles
for SMP in the plain profiling case.  It seems to work too.

This error was not detected by LINT because LINT only compiles the
GUPROF profiling case, which is a superset of the plain profiling
case for !SMP but which is so broken for SMP that the buggy code is
not compiled.
2002-01-31 13:49:55 +00:00
Brian Feldman
4a44bd4b4a Add kmupetext(), a function that expands the range of memory covered
by the profiler on a running system.  This is not done sparsely, as
memory is cheaper than processor speed and each gprof mcount() and
mexitcount() operation is already very expensive.

Obtained from:	NAI Labs CBOSS project
Funded by:	DARPA
2001-10-30 15:04:57 +00:00
John Baldwin
ce11a18f0e Fix MCOUNT_ENTER() so it actually compiles in the profiling case.
Pointy hat to:	me
Submitted by:	Danny J. Zerkel <dzerkel@columbus.rr.com>
2001-07-14 21:40:53 +00:00
John Baldwin
25142c5ea1 Get kernel profiling on SMP systems closer to working by replacing the
mcount spin mutex with a very simple non-recursive spinlock implemented
using atomic operations.
2001-06-28 04:03:29 +00:00
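A very simple non-recursive spinlock built on atomic operations might look like the sketch below. It uses C11 `<stdatomic.h>` rather than the kernel's own atomic primitives, and every `sketch_` name is hypothetical; it illustrates the general technique, not the code from this commit.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type: flag clear means unlocked, flag set means held. */
typedef struct {
    atomic_flag held;
} sketch_spinlock;

#define SKETCH_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Spin until the flag is observed clear and atomically set by this caller. */
static inline void sketch_spin_lock(sketch_spinlock *l)
{
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ; /* busy-wait; taking the lock again while holding it would deadlock */
}

/* Single attempt: returns true if the lock was acquired. */
static inline bool sketch_spin_trylock(sketch_spinlock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Release the lock so that another spinner can take it. */
static inline void sketch_spin_unlock(sketch_spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

Because the lock keeps no owner or recursion count, it stays small enough to use inside the profiling hooks themselves, at the cost of deadlocking if the holder tries to take it a second time.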
Bosko Milekic
9ed346bab0 Change and clean the mutex lock interface.
mtx_enter(lock, type) becomes:

mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)

similarly, for releasing a lock, we now have:

mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.
We change the caller interface for the two different types of locks
because the semantics are entirely different for each case, and this
makes it explicitly clear and, at the same time, it rids us of the
extra `type' argument.

The enter->lock and exit->unlock change has been made with the idea
that we're "locking data" and not "entering locked code" in mind.

Further, remove all additional "flags" previously passed to the
lock acquire/release routines with the exception of two:

MTX_QUIET and MTX_NOSWITCH

The functionality of these flags is preserved and they can be passed
to the lock/unlock routines by calling the corresponding wrappers:

mtx_{lock, unlock}_flags(lock, flag(s)) and
mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN
locks, respectively.

Re-inline some lock acq/rel code; in the sleep lock case, we only
inline the _obtain_lock()s in order to ensure that the inlined code
fits into a cache line. In the spin lock case, we inline recursion and
actually only perform a function call if we need to spin. This change
has been made with the idea that we generally tend to avoid spin locks
and that also the spin locks that we do have and are heavily used
(i.e. sched_lock) do recurse, and therefore in an effort to reduce
function call overhead for some architectures (such as alpha), we
inline recursion for this case.

Create a new malloc type for the witness code and retire from using
the M_DEV type. The new type is called M_WITNESS and is only declared
if WITNESS is enabled.

Begin cleaning up some machdep/mutex.h code - specifically updated the
"optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN
and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently
need those.

Finally, caught up to the interface changes in all sys code.

Contributors: jake, jhb, jasone (in no particular order)
2001-02-09 06:11:45 +00:00
Jason Evans
1b367556b5 Convert all simplelocks to mutexes and remove the simplelock implementations. 2001-01-24 12:35:55 +00:00
Peter Wemm
664a31e496 Change #ifdef KERNEL to #ifdef _KERNEL in the public headers. "KERNEL"
is an application space macro and the applications are supposed to be free
to use it as they please (but cannot).  This is consistent with the other
BSDs who made this change quite some time ago.  More commits to come.
1999-12-29 04:46:21 +00:00
Peter Wemm
c3aac50f28 $Id$ -> $FreeBSD$ 1999-08-28 01:08:13 +00:00
John Polstra
5584f22bb3 Make profiling work for ELF. gprof now autodetects the format of
the executable file, so it will work for both a.out and ELF format
files.  I have split the object format specific code into separate
source files.  It's cleaner than it was before, but it's still
pretty crufty.

Don't cheat on your make world for this update.  A lot of things
have to be rebuilt for it to work, including the compiler and all
of the profiled libraries.
1998-09-07 23:32:00 +00:00
Bruce Evans
37889b394a Changed to the C9x draft spelling of the (unsigned) integral type
suitable for holding object pointers (ptrint_t -> uintptr_t).
Added corresponding signed type (intptr_t).  Changed/added
corresponding non-C9x types for function pointers to match.  Don't
use nonstandard types to implement these types, and don't comment
on them in <machine/types.h>.
1998-07-14 05:09:48 +00:00
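For readers unfamiliar with the C9x names, the sketch below shows what `uintptr_t` is for: holding an object pointer as an unsigned integer and converting it back. The helper names are hypothetical, and the alignment check assumes the usual flat address mapping, which the C standard leaves implementation-defined.

```c
#include <stdint.h>

/* Hypothetical helper: view an object pointer as an unsigned integer. */
static inline uintptr_t ptr_to_bits(const void *p)
{
    return (uintptr_t)p;
}

/* Hypothetical helper: recover the pointer from its integer form. */
static inline void *bits_to_ptr(uintptr_t bits)
{
    return (void *)bits;
}

/*
 * Hypothetical helper: test whether p is aligned to `align`, which must be
 * a power of two. Assumes pointer-to-integer conversion preserves the low
 * address bits, as it does on common flat-memory platforms.
 */
static inline int is_aligned(const void *p, uintptr_t align)
{
    return (ptr_to_bits(p) & (align - 1)) == 0;
}
```

Converting a `void *` to `uintptr_t` and back yields a pointer that compares equal to the original, which is the property the older `ptrint_t` typedef was meant to provide.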
Bruce Evans
930a642372 Oops, fptrint_t still needs to be declared in <machine/profile.h> in the
!KERNEL case.  The kludge to get it declared in libc/gmon/mcount.c wasn't
sufficient because fptrint_t is used in <sys/gmon.h>.
1998-07-10 09:26:41 +00:00
Bruce Evans
2e480d34aa Added a kernel-only typedef (ptrint_t) giving an integral type that is
least unsuitable for holding an object pointer.  This should have been
used to fix warnings about casts between pointers and ints on alphas.

Moved corresponding existing general typedef (fptrint_t) for function
pointers from the i386 <machine/profile.h> to a kernel-only typedef
in <machine/types.h>.  Kludged libc/gmon/mcount.c so that it can
still see this typedef.
1998-07-10 02:27:16 +00:00
Bruce Evans
7a1a679ecb Ifdefed use of a GNU feature. 1998-02-03 20:32:38 +00:00
Tor Egge
5c623cb649 Add support for low resolution SMP kernel profiling.
- A nonprofiling version of s_lock (called s_lock_np) is used
  by mcount.
- When profiling is active, more registers are clobbered in
  seemingly simple assembly routines. This means that some
  callers needed to save/restore extra registers.
- The stack pointer must have space for a 'fake' return address
  in idle, to avoid stack underflow.
1997-12-15 02:18:35 +00:00
Steve Passe
78292efeef Another round of lock pushdown.
Add a simplelock to deal with disable_intr()/enable_intr() as used in UP kernel.
UP kernel expects that this is enough to guarantee exclusive access to
regions of code bracketed by these 2 functions.
Add a simplelock to bracket clock accesses in clock.c: clock_lock.

Help from:	Bruce Evans <bde@zeta.org.au>
1997-08-30 08:08:10 +00:00
Peter Wemm
6875d25465 Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.
1997-02-22 09:48:43 +00:00
Bruce Evans
a7d00b5bf6 Moved definition of FUNCTION_ALIGNMENT to a machine-dependent place.
Changed it from 4 to 16 for i386's.  It can be anything for i386's,
but compiler options limit it to a power of 2, and assembler and
linker deficiencies limit it to a small power of 2 (<= 16).
We use 16 in the kernel to get smaller tables (see Makefile.i386 and
<machine/asmacros.h>).  We still use the default of 4 in user mode.

Use HISTCOUNTER instead of (*kcount) in the definition of KCOUNT()
for consistency with other macros.
1997-02-13 10:47:29 +00:00
Jordan K. Hubbard
1130b656e5 Make the long-awaited change from $Id$ to $FreeBSD$
This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore.  This update would have been
insane otherwise.
1997-01-14 07:20:47 +00:00
Bruce Evans
d6b9e17eb5 Improved non-statistical (GUPROF) profiling:
- use a more accurate and more efficient method of compensating for
  overheads.  The old method counted too much time against leaf
  functions.
- normally use the Pentium timestamp counter if available.
  On Pentiums, the times are now accurate to within a couple of cpu
  clock cycles per function call in the (unlikely) event that there
  are no cache misses in or caused by the profiling code.
- optionally use an arbitrary Pentium event counter if available.
- optionally regress to using the i8254 counter.
- scaled the i8254 counter by a factor of 128.  Now the i8254 counters
  overflow slightly faster than the TSC counters for a 150MHz Pentium :-)
  (after about 16 seconds).  This is to avoid fractional overheads.

files.i386:
perfmon.c temporarily has to be classified as a profiling-routine
because a couple of functions in it may be called from profiling code.

options.i386:
- I586_CTR_GUPROF is currently unused (oops).
- I586_PMC_GUPROF should be something like 0x70000 to enable (but not
  use unless prof_machdep.c is changed) support for Pentium event
  counters.  7 is a control mode and the counter number 0 is somewhere
  in the 0000 bits (see perfmon.h for the encoding).

profile.h:
- added declarations.
- cleaned up separation of user mode declarations.

prof_machdep.c:
Mostly clock-select changes.  The default clock can be changed by
editing kmem.  There should be a sysctl for this.

subr_prof.c:
- added copyright.
- calibrate overheads for the new method.
- documented new method.
- fixed races and machine dependencies in start/stop code.

mcount.c:
Use the new overhead compensation method.

gmon.h:
- changed GPROF4 counter type from unsigned to int.  Oops, this should
  be machine-dependent and/or int32_t.
- reorganized overhead counters.

Submitted by:	Pentium event counter changes mostly by wollman
1996-10-17 19:32:31 +00:00
Bruce Evans
1f403fcfbf Cleaned up interrupt masking by declaring the state variable in a
machine-dependent macro and passing it to all machine-dependent
macros.

Eliminated the state variable for the GUPROF case.
1996-08-28 20:15:32 +00:00
Bruce Evans
e5171bbec0 Fixed user-mode mcount which I broke in the previous revision.
Do it the old way for now.

Moved recent additions around a lot to minimise ifdefs.

Added prototypes.
1996-01-01 17:11:21 +00:00
Bruce Evans
912e603778 Implemented non-statistical kernel profiling. This is based on
looking at a high resolution clock for each of the following events:
function call, function return, interrupt entry, interrupt exit,
and interesting branches.  The differences between the times of
these events are added at appropriate places in an ordinary histogram
(as if very fast statistical profiling sampled the pc at those
places) so that ordinary gprof can be used to analyze the times.

gmon.h:
Histogram counters need to be 4 bytes for microsecond resolutions.
They will need to be larger for the 586 clock.
The comments were vax-centric and wrong even on vaxes.  Does anyone
disagree?

gprof4.c:
The standard gprof should support counters of all integral sizes
and the size of the counter should be in the gmon header.  This
hack will do until then.  (Use gprof4 -u to examine the results
of non-statistical profiling.)

config/*:
Non-statistical profiling is configured with `config -pp'.
`config -p' still gives ordinary profiling.

kgmon/*:
Non-statistical profiling is enabled with `kgmon -B'.  `kgmon -b'
still enables ordinary profiling (and disables non-statistical
profiling) if non-statistical profiling is configured.
1995-12-29 15:30:05 +00:00
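The idea in the commit above, charging the time between two events to a histogram bucket so that ordinary gprof can read the result, can be sketched as follows. All names, the bucket count, and the bucket granularity are hypothetical, and the choice of which address gets charged is simplified; this is not the kernel's mcount code.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NBUCKETS     64  /* hypothetical histogram size */
#define SKETCH_BUCKET_SHIFT 4   /* hypothetical: 16 bytes of text per bucket */

struct sketch_prof {
    uintptr_t lowpc;                  /* lowest profiled text address */
    uint64_t  last_time;              /* clock reading at the previous event */
    uint32_t  hist[SKETCH_NBUCKETS];  /* 4-byte counters, as the commit requires */
};

/*
 * Record one event (call, return, interrupt entry or exit). The time that
 * has passed since the previous event is added to the bucket covering
 * running_pc, the address of the code that was executing during that
 * interval. Addresses outside the profiled range are not charged.
 */
static void sketch_prof_event(struct sketch_prof *p, uintptr_t running_pc,
    uint64_t now)
{
    if (running_pc >= p->lowpc) {
        size_t bucket = (size_t)((running_pc - p->lowpc) >> SKETCH_BUCKET_SHIFT);

        if (bucket < SKETCH_NBUCKETS)
            p->hist[bucket] += (uint32_t)(now - p->last_time);
    }
    p->last_time = now;
}
```

The resulting array has the same shape as a statistical profile, only filled with measured time instead of sample counts, which is why the existing gprof tooling can analyze it.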
Paul Richards
8db02de884 Added MCOUNT_ENTER and MCOUNT_EXIT macros to profile.h
Removed inb function since it's more correctly in pio.h

Copied write_eflags and read_eflags over from npx.c

(Some changes to the macros suggested by Bruce were not made at this
time since his suggestions probably apply to all the macros and
these inlined/macro definitions need a lot of cleaning up at some
point in the future.)

Reviewed by:	Bruce
1994-09-15 16:27:14 +00:00
Paul Richards
836dc83b6a Made idempotent.
Reviewed by:
Submitted by:
1994-08-21 04:55:31 +00:00
David Greenman
3c4dd3568f Added $Id$ 1994-08-02 07:55:43 +00:00
Rodney W. Grimes
6cda32c071 BSD 4.4 Lite Kernel Sources 1994-05-25 01:34:38 +00:00