Commit Graph

223 Commits

Author SHA1 Message Date
Bruce Evans
a7d00b5bf6 Moved definition of FUNCTION_ALIGNMENT to a machine-dependent place.
Changed it from 4 to 16 for i386's.  It can be anything for i386's,
but compiler options limit it to a power of 2, and assembler and
linker deficiencies limit it to a small power of 2 (<= 16).
We use 16 in the kernel to get smaller tables (see Makefile.i386 and
<machine/asmacros.h>).  We still use the default of 4 in user mode.

Use HISTCOUNTER instead of (*kcount) in the definition of KCOUNT()
for consistency with other macros.
1997-02-13 10:47:29 +00:00
Bruce Evans
47cac30b9a Align text to 16-byte boundaries if profiling is enabled. This will
allow a fourfold reduction in the size of the profiling buffers.  This
goes with rev.1.91 of Makefile.i386 which does the same thing for C
functions.
1997-02-13 08:31:53 +00:00
Poul-Henning Kamp
b7a652ab84 I have no idea what this is all about, but it works and Bruce hasn't
complained so it cannot be entirely bad :-)

I include the email that probably explains it for people who already know:

> >Compiling with -O3 inlines functions.  However the function that is being
> >inlined in makeinfo.c (add_word_args()) is a vararg function and must not be
> >inlined.
> >
> >The code in question is K&R style, and AFIK, there is no way for the compiler
> >to determine that the function uses vararg.  Either change the code to use
> >prototypes, or use stdarg, or add a directive to prevent inlining.
>
> Not declaring a varargs function as varargs before it is used gives
> undefined behaviour.
>
> However, in practice the bug is probably in FreeBSD's <varargs.h>, which
> doesn't use gcc's __builtin_next_arg().  gcc should notice that it is
> used and not inline functions that have it.  <stdarg.h.> uses it, but I
> think there's another gcc builtin that it should be using.

Patch attached.  The ellipsis causes gcc to flag this as a varargs function,
and the name "__builtin_va_alist" is special cased in gcc to hide the last
argument in the arglist.

Reviewed by:	bde & phk
Submitted by:	jlemon@americantv.com (Jonathan Lemon)
1997-02-07 20:22:15 +00:00
KATO Takenori
d5605f2a31 Deleted i386_cpus[]. i386_cpus[] is a static variable in identcpu.c.
Found-by: lint
1997-02-02 10:43:35 +00:00
Bruce Evans
7d350e7256 Fixed printing of small offsets. E.g., -4(%ebp) is now printed
as -0x4(%ebp) instead of as _APTD+0xffc(%ebp), and if GUPROF is
defined, 8(%ebp) is now printed as 0x8(%ebp) instead of as
GMON_PROF_HIRES+0x4(%ebp).
1997-01-16 11:27:11 +00:00
Jordan K. Hubbard
1130b656e5 Make the long-awaited change from $Id$ to $FreeBSD$
This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore.  This update would have been
insane otherwise.
1997-01-14 07:20:47 +00:00
KATO Takenori
af4342c037 Staticize the functions rtc_inb, rtc_outb, rtc_serialcombit, and
rtc_serialcom.  These functions are only used by PC98.
1997-01-10 17:11:09 +00:00
John Dyson
d0aea04fe0 Let the VM system know that on certain arch's that VM_PROT_READ
also implies VM_PROT_EXEC.  We support it that way for now,
since the break system call by default gives VM_PROT_ALL.  Now
we have a better chance of coalesing map entries when mixing
mmap/break type operations.  This was contributing to excessive
numbers of map entries on the modula-3 runtime system.  The
problem is still not "solved", but the situation makes more
sense.

Eventually, when we work on architectures where VM_PROT_READ
is orthogonal to VM_PROT_EXEC, we will have to visit this
issue carefully (esp. regarding security issues.)
1996-12-30 05:31:21 +00:00
John Dyson
d22671dcce Support the PG_G flag on Pentium-Pro processors. This pretty
much eliminates the unnecessary unmapping of the kernel during
context switches and during invtlb...
1996-11-11 04:20:19 +00:00
Julian Elischer
d13d3630fd Further improved version of hadling a HALT when there is no console. 1996-10-31 00:57:28 +00:00
Satoshi Asami
e30f001135 More merge and update.
(1) deleted #if 0

    pc98/pc98/mse.c

(2) hold per-unit I/O ports in ed_softc

    pc98/pc98/if_ed.c
    pc98/pc98/if_ed98.h

(3) merge more files by segregating changes into headers.

  new file (moved from pc98/pc98):

    i386/isa/aic_98.h

  deleted:

    well, it's already in the commit message so I won't repeat the
    long list here ;)

Submitted by:	The FreeBSD(98) Development Team
1996-10-30 22:41:46 +00:00
Bruce Evans
835bd1ce62 Improved biasing of i586 clock by adjusting for hardclock() latency.
I decided to do this for every hardclock() call instead of lazily
in microtime().  The lazy method is simpler but has more overhead
if microtime() is called a lot.

CPU_THISTICKLEN() is now a no-op and should probably go away.
Previously it did nothing directly but had the side effect of
setting i586_last_tick for CPU_CLOCKUPDATE() and i586_avg_tick for
debugging.  CPU_CLOCKUPDATE() now uses a better method and
i586_avg_tick is too much trouble to maintain.

Reduced nesting of #includes in the usual case.

Increased nesting of #includes when CLOCK_HAIR is defined.  This
is a kludge to get typedefs for inline functions only when the
inline functions are used.  Normally only kern_clock.c defines
this.  kern_clock.c can't include the i386 headers directly.

Removed unused LOCORE support.
1996-10-25 13:01:56 +00:00
Bruce Evans
d6b9e17eb5 Improved non-statistical (GUPROF) profiling:
- use a more accurate and more efficient method of compensating for
  overheads.  The old method counted too much time against leaf
  functions.
- normally use the Pentium timestamp counter if available.
  On Pentiums, the times are now accurate to within a couple of cpu
  clock cycles per function call in the (unlikely) event that there
  are no cache misses in or caused by the profiling code.
- optionally use an arbitrary Pentium event counter if available.
- optionally regress to using the i8254 counter.
- scaled the i8254 counter by a factor of 128.  Now the i8254 counters
  overflow slightly faster than the TSC counters for a 150MHz Pentium :-)
  (after about 16 seconds).  This is to avoid fractional overheads.

files.i386:
permon.c temporarily has to be classified as a profiling-routine
because a couple of functions in it may be called from profiling code.

options.i386:
- I586_CTR_GUPROF is currently unused (oops).
- I586_PMC_GUPROF should be something like 0x70000 to enable (but not
  use unless prof_machdep.c is changed) support for Pentium event
  counters.  7 is a control mode and the counter number 0 is somewhere
  in the 0000 bits (see perfmon.h for the encoding).

profile.h:
- added declarations.
- cleaned up separation of user mode declarations.

prof_machdep.c:
Mostly clock-select changes.  The default clock can be changed by
editing kmem.  There should be a sysctl for this.

subr_prof.c:
- added copyright.
- calibrate overheads for the new method.
- documented new method.
- fixed races and and machine dependencies in start/stop code.

mcount.c:
Use the new overhead compensation method.

gmon.h:
- changed GPROF4 counter type from unsigned to int.  Oops, this should
  be machine-dependent and/or int32_t.
- reorganized overhead counters.

Submitted by:	Pentium event counter changes mostly by wollman
1996-10-17 19:32:31 +00:00
Bruce Evans
b40a35f250 Added missing extern declaration of timer_freq.
Sorted declarations of scalars.
1996-10-17 17:31:25 +00:00
Bruce Evans
435929a84b Added macros CROSSJUMP(), CROSSJUMP_LABEL() and GPROF_RET. These will
be used to fix some benign(?) bugs in GUPROF profiling.

Fixed stale comments and long lines.
1996-10-16 18:13:25 +00:00
John Dyson
8de30f117f Pmap_resident_count was mistakenly removed from pmap.h, thereby
disabling the RSS listing in ps and ^T.  This commit re-inserts
the macro defn.
1996-10-13 03:14:57 +00:00
John Dyson
9d3fbbb5f4 Performance optimizations. One of which was meant to go in before the
previous snap.  Specifically, kern_exit and kern_exec now makes a
call into the pmap module to do a very fast removal of pages from the
address space.  Additionally, the pmap module now updates the PG_MAPPED
and PG_WRITABLE flags.  This is an optional optimization, but helpful
on the X86.
1996-10-12 21:35:25 +00:00
Bruce Evans
da2186afa3 Cleaned up:
- fixed a sloppy common-style declaration.
- removed an unused macro.
- moved once-used macros to the one file where they are used.
- removed unused forward struct declarations.
- removed __pure.
- declared inline functions as inline in their prototype as well
  as in theire definition (gcc unfortunately allows the prototype
  to be inconsistent).
- staticized.
1996-10-12 20:36:15 +00:00
Bruce Evans
a0ea75ecbd Don't include "opt_cpu.h" in <machine/clock.h>, since this breaks lkm's.
The change breaks kern_clock.c; fix that temporarily by including
"opt_cpu.h" there.
1996-10-10 10:25:26 +00:00
Bruce Evans
c20b324bb6 Put I*86_CPU defines in opt_cpu.h. 1996-10-09 19:47:44 +00:00
Bruce Evans
ece15d78ea Added "memory" to clobber list in invlpg(). It needs it if invltlb()
needs it.

Fixed style in invlpg().

Sorted recently renamed functions.

Added prototypes in the non-gcc section for recently added/renamed
functions.
1996-09-29 18:35:07 +00:00
John Dyson
27e9b35e07 Essentially rename pmap_update to be invltlb. It is a very machine
dependent operation, and not really a correct name.  invltlb and invlpg
are more descriptive, and in the case of invlpg, a real opcode.

Additionally, fix the tlb management code for 386 machines.
1996-09-28 22:37:57 +00:00
John Dyson
9299d3c4bb Move pmap_update_1pg to cpufunc.h. Additionally,
use the invlpg opcode instead of the nasty looking .byte directives.
There are some other minor micro-level code improvements to pmap.c
1996-09-28 04:22:46 +00:00
Peter Wemm
90af01bef3 Apparently, BSDI have a new system call gate. I was experimenting
with this quite a while ago when somebody reported a BSD/OS 2.1 binary
that wouldn't run.  I'm pretty sure they tried it and I'm pretty sure
they mentioned to me that the patch worked.
1996-09-27 13:33:49 +00:00
Bruce Evans
388dfa7112 Fixed a few hundred warnings (2400 in LINT) for signed vs unsigned
comparisons in the inb() and outb() macros.  I decided that int args
are OK here.  Any type that can hold a u_int16_t without overflow
is correct, and 32-bit types are optimal.

Introduced a few tens of warnings (100 in LINT) for use of pessimized
(short) types for the port arg.  Only a few drivers are affected by
this.  u_short pessimizations aren't detected.

Added `__extension__' before the statement-expression in inb() so
that it can be compiled without warnings by gcc -pedantic.
1996-09-24 17:47:59 +00:00
Satoshi Asami
0e408c25a1 Another round of merge/update.
(1) Add PC98 support to apm_bios.h and ns16550.h, remove pc98/pc98/ic
(2) Move PC98 specific code out of cpufunc.h (to pc98.h)
(3) Let the boot subtrees look more alike

Submitted by:	The FreeBSD(98) Development Team
		<freebsd98-hackers@jp.freebsd.org>
1996-09-12 11:12:18 +00:00
John Dyson
b8e251a56d Improve the scalability of certain pmap operations. 1996-09-08 16:57:53 +00:00
Bruce Evans
1f403fcfbf Cleaned up interrupt masking by declaring the state variable in a
machine-dependent macro and passing it to all machine-dependent
macros.

Eliminated the state variable for the GUPROF case.
1996-08-28 20:15:32 +00:00
David Greenman
150022d8fc Defined T_MCHK exception for i686; renumbered T_RESERVED to 29. 1996-08-11 17:29:39 +00:00
Bruce Evans
d9927d1118 Eliminated i586_ctr_rate. Use i586_ctr_freq instead.
Changed i586_ctr_bias from long long to u_int.  Only the low 32 bits
are used now that microtime uses a multiplication to do the scaling.
Previously the high 32 bits had to match those of rdtsc() to prevent
overflow traps and invalid timeval adjustments.
1996-08-02 21:16:13 +00:00
Garrett Wollman
13f588f83e Add an fls() inline function which does the opposite operation to
ffs().  (That is to say, it searches in the opposite direction.)
1996-08-01 20:29:28 +00:00
Bruce Evans
85acc6887d Eliminated pcb_inl. It was always 0 because context switches don't occur
in interrupt handlers.
1996-07-31 12:36:11 +00:00
Bruce Evans
33ded19fc2 Fixed the machdep.i8254_freq and machdep.i586_freq sysctls. Writes were
handled bogusly.

Centralized the setting of all the frequency variables.  Set these
variables atomically.  Some new ones aren't used yet.
1996-07-30 19:26:55 +00:00
John Dyson
67bf686897 Backed out the recent changes/enhancements to the VM code. The
problem with the 'shell scripts' was found, but there was a 'strange'
problem found with a 486 laptop that we could not find.  This commit
backs the code back to 25-jul, and will be re-entered after the snapshot
in smaller (more easily tested) chunks.
1996-07-30 03:08:57 +00:00
John Dyson
4f4d35edf0 This commit is meant to solve a couple of VM system problems or
performance issues.

	1) The pmap module has had too many inlines, and so the
	   object file is simply bigger than it needs to be.
	   Some common code is also merged into subroutines.
	2) Removal of some *evil* PHYS_TO_VM_PAGE macro calls.
	   Unfortunately, a few have needed to be added also.
	   The removal caused the need for more vm_page_lookups.
	   I added lookup hints to minimize the need for the
	   page table lookup operations.
	3) Removal of some bogus performance improvements, that
	   mostly made the code more complex (tracking individual
	   page table page updates unnecessarily).  Those improvements
	   actually hurt 386 processors perf (not that people who
	   worry about perf use 386 processors anymore :-)).
	4) Changed pv queue manipulations/structures to be TAILQ's.
	5) The pv queue code has had some performance problems since
	   day one.  Some significant scalability issues are resolved
	   by threading the pv entries from the pmap AND the physical
	   address instead of just the physical address.  This makes
	   certain pmap operations run much faster.  This does
	   not affect most micro-benchmarks, but should help loaded system
	   performance *significantly*.  DG helped and came up with most
	   of the solution for this one.
	6) Most if not all pmap bit operations follow the pattern:
		pmap_test_bit();
		pmap_clear_bit();
	   That made for twice the necessary pv list traversal.   The
	   pmap interface now supports only pmap_tc_bit type operations:
	   pmap_[test/clear]_modified, pmap_[test/clear]_referenced.
	   Additionally, the modified routine now takes a vm_page_t arg
	   instead of a phys address.  This eliminates a PHYS_TO_VM_PAGE
	   operation.
	7) Several rewrites of routines that contain redundant code to
	   use common routines, so that there is a greater likelihood of
	   keeping the cache footprint smaller.
1996-07-27 03:24:10 +00:00
Satoshi Asami
92b4f2e0df Update to current state of PC98 world.
Submitted by:	The FreeBSD(98) development team
1996-07-23 07:46:59 +00:00
Bruce Evans
7baccf64d3 Fixed lots of warnings about unportable casts of pointers to volatile
variables: don't depend on the compiler generating atomic code to set
the variables - use inline asm to specify the atomic instruction(s)
explicitly.
1996-07-01 20:16:10 +00:00
Bruce Evans
a111a7f827 Moved declarations of non-cpu things from <machine/cpufunc.h> to better
places.
1996-07-01 18:12:24 +00:00
Bruce Evans
79df6d8597 trap.c:
Fixed profiling of system times.  It was pre-4.4Lite and didn't support
statclocks.  System times were too small by a factor of 8.

Handle deferred profiling ticks the 4.4Lite way: use addupc_task() instead
of addupc().  Call addupc_task() directly instead of using the ADDUPC()
macro.

Removed vestigial support for PROFTIMER.

switch.s:
Removed addupc().

resourcevar.h:
Removed ADDUPC() and declarations of addupc().

cpu.h:
Updated a comment.  i386's never were tahoe's, and the deferred profiling
tick became (possibly) multiple ticks in 4.4Lite.

Obtained from:	mostly from NetBSD
1996-06-25 20:02:16 +00:00
Satoshi Asami
ad63a118b2 The Great PC98 Merge.
All new code is "#ifdef PC98"ed so this should make no difference to
PC/AT (and its clones) users.

Ok'd by:	core
Submitted by:	FreeBSD(98) development team
1996-06-14 11:02:28 +00:00
Bruce Evans
87bc5d1973 Removed unnecessary forward declarations of incomplete structs. 1996-06-08 11:21:19 +00:00
Søren Schmidt
70013b5739 Added missing CR0_NW define for Cyrix 486DLC support. It's still not
stable on my hardware, but its better... *sigh*

Obtained from: NetBSD
1996-06-03 19:37:38 +00:00
Peter Wemm
ee323f62ad Jump some hoops to have the *.s code being able to be run through both an
ansi and traditional cpp.

The nesting rules of macros are different, which required some changes.
Use __CONCAT(x,y) instead of /**/.
Redo some comments to use /* */ rather than "# comment" because the ansi
  cpp cares about those, and also cares about quote matching.
1996-05-31 01:08:08 +00:00
John Dyson
b18bfc3da7 This set of commits to the VM system does the following, and contain
contributions or ideas from Stephen McKay <syssgm@devetir.qld.gov.au>,
Alan Cox <alc@cs.rice.edu>, David Greenman <davidg@freebsd.org> and me:

	More usage of the TAILQ macros.  Additional minor fix to queue.h.
	Performance enhancements to the pageout daemon.
		Addition of a wait in the case that the pageout daemon
		has to run immediately.
		Slightly modify the pageout algorithm.
	Significant revamp of the pmap/fork code:
		1) PTE's and UPAGES's are NO LONGER in the process's map.
		2) PTE's and UPAGES's reside in their own objects.
		3) TOTAL elimination of recursive page table pagefaults.
		4) The page directory now resides in the PTE object.
		5) Implemented pmap_copy, thereby speeding up fork time.
		6) Changed the pv entries so that the head is a pointer
		   and not an entire entry.
		7) Significant cleanup of pmap_protect, and pmap_remove.
		8) Removed significant amounts of machine dependent
		   fork code from vm_glue.  Pushed much of that code into
		   the machine dependent pmap module.
		9) Support more completely the reuse of already zeroed
		   pages (Page table pages and page directories) as being
		   already zeroed.
	Performance and code cleanups in vm_map:
		1) Improved and simplified allocation of map entries.
		2) Improved vm_map_copy code.
		3) Corrected some minor problems in the simplify code.
	Implemented splvm (combo of splbio and splimp.)  The VM code now
		seldom uses splhigh.
	Improved the speed of and simplified kmem_malloc.
	Minor mod to vm_fault to avoid using pre-zeroed pages in the case
		of objects with backing objects along with the already
		existant condition of having a vnode.  (If there is a backing
		object, there will likely be a COW...  With a COW, it isn't
		necessary to start with a pre-zeroed page.)
	Minor reorg of source to perhaps improve locality of ref.
1996-05-18 03:38:05 +00:00
Poul-Henning Kamp
5084d10dd0 Move atdevbase out of locore.s and into machdep.c
Macroize locore.s' page table setup even more, now it's almost readable.
Rename PG_U to PG_A (so that I can...)
Rename PG_u to PG_U.  "PG_u" was just too ugly...
Remove some unused vars in pmap.c
Remove PG_KR and PG_KW
Remove SSIZE
Remove SINCR
Remove BTOPKERNBASE

This concludes my spring cleaning, modulus any bug fixes for messes I
have made on the way.

(Funny to be back here in pmap.c, that's where my first significant
contribution to 386BSD was... :-)
1996-05-02 22:25:18 +00:00
Poul-Henning Kamp
e911eafcba removed:
CLBYTES PD_SHIFT PGSHIFT NBPG PGOFSET CLSIZELOG2 CLSIZE pdei()
        ptei() kvtopte() ptetov() ispt() ptetoav() &c &c
new:
        NPDEPG

Major macro cleanup.
1996-05-02 14:21:14 +00:00
Bruce Evans
2dafbfcbab Added calibration the i8254 and the i586 clocks agains the RTC at boot
time.  The results are currently ignored unless certain temporary options
are used.

Added sysctls to support reading and writing the clock frequency variables
(not the frequencies themselves).  Writing is supposed to atomically
adjust all related variables.

machdep.c:
Fixed spelling of a function name in a comment so that I can log this
message which should have been with the previous commit.

Initialize `cpu_class' earlier so that it can be used in startrtclock()
instead of in calibrate_cyclecounter() (which no longer exists).

Removed range checking of `cpu'.  It is always initialized to CPU_XXX
so it is less likely to be out of bounds than most variables.

clock.h:
Removed I586_CYCLECTR().  Use rdtsc() instead.

clock.c:
TIMER_FREQ is now a variable timer_freq that defaults to the old value of
TIMER_FREQ.  #define'ing TIMER_FREQ should still work and may be the best
way of setting the frequency.

Calibration involves counting cycles while watching the RTC for one second.
This gives values correct to within (a few ppm) + (the innaccuracy of the
RTC) on my systems.
1996-05-01 08:39:02 +00:00
Bruce Evans
eabe0f9f57 Don't return unused values in cpu_switch() or savectx().
Don't preserve unused registers in the NPX case in savectx().
1996-05-01 03:47:04 +00:00
Poul-Henning Kamp
68c1eb1215 pte.h: Add the VADDR(pdi,pti) macro to construct virtual address from
page dir+table index.
pmap.h: remove NUPDE, it was wrong and not used.  Sanitize KSTKPTEOFF.
vmparam.h: Calculate virtual addr from PDI+PTI from pmap.h rather than
using magic math.  Remove UPDT, not used.
1996-04-30 12:02:12 +00:00
Poul-Henning Kamp
68832d3037 Fix cpu_fork for real.
Suggested by:	 bde
1996-04-25 06:20:19 +00:00