212 Commits

Author SHA1 Message Date
Bruce Evans
835bd1ce62 Improved biasing of i586 clock by adjusting for hardclock() latency.
I decided to do this for every hardclock() call instead of lazily
in microtime().  The lazy method is simpler but has more overhead
if microtime() is called a lot.

CPU_THISTICKLEN() is now a no-op and should probably go away.
Previously it did nothing directly but had the side effect of
setting i586_last_tick for CPU_CLOCKUPDATE() and i586_avg_tick for
debugging.  CPU_CLOCKUPDATE() now uses a better method and
i586_avg_tick is too much trouble to maintain.

Reduced nesting of #includes in the usual case.

Increased nesting of #includes when CLOCK_HAIR is defined.  This
is a kludge to get typedefs for inline functions only when the
inline functions are used.  Normally only kern_clock.c defines
this.  kern_clock.c can't include the i386 headers directly.

Removed unused LOCORE support.
1996-10-25 13:01:56 +00:00
Bruce Evans
d6b9e17eb5 Improved non-statistical (GUPROF) profiling:
- use a more accurate and more efficient method of compensating for
  overheads.  The old method counted too much time against leaf
  functions.
- normally use the Pentium timestamp counter if available.
  On Pentiums, the times are now accurate to within a couple of cpu
  clock cycles per function call in the (unlikely) event that there
  are no cache misses in or caused by the profiling code.
- optionally use an arbitrary Pentium event counter if available.
- optionally regress to using the i8254 counter.
- scaled the i8254 counter by a factor of 128.  Now the i8254 counters
  overflow slightly faster than the TSC counters for a 150MHz Pentium :-)
  (after about 16 seconds).  This is to avoid fractional overheads.

files.i386:
permon.c temporarily has to be classified as a profiling-routine
because a couple of functions in it may be called from profiling code.

options.i386:
- I586_CTR_GUPROF is currently unused (oops).
- I586_PMC_GUPROF should be something like 0x70000 to enable (but not
  use unless prof_machdep.c is changed) support for Pentium event
  counters.  7 is a control mode and the counter number 0 is somewhere
  in the 0000 bits (see perfmon.h for the encoding).

profile.h:
- added declarations.
- cleaned up separation of user mode declarations.

prof_machdep.c:
Mostly clock-select changes.  The default clock can be changed by
editing kmem.  There should be a sysctl for this.

subr_prof.c:
- added copyright.
- calibrate overheads for the new method.
- documented new method.
- fixed races and and machine dependencies in start/stop code.

mcount.c:
Use the new overhead compensation method.

gmon.h:
- changed GPROF4 counter type from unsigned to int.  Oops, this should
  be machine-dependent and/or int32_t.
- reorganized overhead counters.

Submitted by:	Pentium event counter changes mostly by wollman
1996-10-17 19:32:31 +00:00
Bruce Evans
b40a35f250 Added missing extern declaration of timer_freq.
Sorted declarations of scalars.
1996-10-17 17:31:25 +00:00
Bruce Evans
435929a84b Added macros CROSSJUMP(), CROSSJUMP_LABEL() and GPROF_RET. These will
be used to fix some benign(?) bugs in GUPROF profiling.

Fixed stale comments and long lines.
1996-10-16 18:13:25 +00:00
John Dyson
8de30f117f Pmap_resident_count was mistakenly removed from pmap.h, thereby
disabling the RSS listing in ps and ^T.  This commit re-inserts
the macro defn.
1996-10-13 03:14:57 +00:00
John Dyson
9d3fbbb5f4 Performance optimizations. One of which was meant to go in before the
previous snap.  Specifically, kern_exit and kern_exec now makes a
call into the pmap module to do a very fast removal of pages from the
address space.  Additionally, the pmap module now updates the PG_MAPPED
and PG_WRITABLE flags.  This is an optional optimization, but helpful
on the X86.
1996-10-12 21:35:25 +00:00
Bruce Evans
da2186afa3 Cleaned up:
- fixed a sloppy common-style declaration.
- removed an unused macro.
- moved once-used macros to the one file where they are used.
- removed unused forward struct declarations.
- removed __pure.
- declared inline functions as inline in their prototype as well
  as in theire definition (gcc unfortunately allows the prototype
  to be inconsistent).
- staticized.
1996-10-12 20:36:15 +00:00
Bruce Evans
a0ea75ecbd Don't include "opt_cpu.h" in <machine/clock.h>, since this breaks lkm's.
The change breaks kern_clock.c; fix that temporarily by including
"opt_cpu.h" there.
1996-10-10 10:25:26 +00:00
Bruce Evans
c20b324bb6 Put I*86_CPU defines in opt_cpu.h. 1996-10-09 19:47:44 +00:00
Bruce Evans
ece15d78ea Added "memory" to clobber list in invlpg(). It needs it if invltlb()
needs it.

Fixed style in invlpg().

Sorted recently renamed functions.

Added prototypes in the non-gcc section for recently added/renamed
functions.
1996-09-29 18:35:07 +00:00
John Dyson
27e9b35e07 Essentially rename pmap_update to be invltlb. It is a very machine
dependent operation, and not really a correct name.  invltlb and invlpg
are more descriptive, and in the case of invlpg, a real opcode.

Additionally, fix the tlb management code for 386 machines.
1996-09-28 22:37:57 +00:00
John Dyson
9299d3c4bb Move pmap_update_1pg to cpufunc.h. Additionally,
use the invlpg opcode instead of the nasty looking .byte directives.
There are some other minor micro-level code improvements to pmap.c
1996-09-28 04:22:46 +00:00
Peter Wemm
90af01bef3 Apparently, BSDI have a new system call gate. I was experimenting
with this quite a while ago when somebody reported a BSD/OS 2.1 binary
that wouldn't run.  I'm pretty sure they tried it and I'm pretty sure
they mentioned to me that the patch worked.
1996-09-27 13:33:49 +00:00
Bruce Evans
388dfa7112 Fixed a few hundred warnings (2400 in LINT) for signed vs unsigned
comparisons in the inb() and outb() macros.  I decided that int args
are OK here.  Any type that can hold a u_int16_t without overflow
is correct, and 32-bit types are optimal.

Introduced a few tens of warnings (100 in LINT) for use of pessimized
(short) types for the port arg.  Only a few drivers are affected by
this.  u_short pessimizations aren't detected.

Added `__extension__' before the statement-expression in inb() so
that it can be compiled without warnings by gcc -pedantic.
1996-09-24 17:47:59 +00:00
Satoshi Asami
0e408c25a1 Another round of merge/update.
(1) Add PC98 support to apm_bios.h and ns16550.h, remove pc98/pc98/ic
(2) Move PC98 specific code out of cpufunc.h (to pc98.h)
(3) Let the boot subtrees look more alike

Submitted by:	The FreeBSD(98) Development Team
		<freebsd98-hackers@jp.freebsd.org>
1996-09-12 11:12:18 +00:00
John Dyson
b8e251a56d Improve the scalability of certain pmap operations. 1996-09-08 16:57:53 +00:00
Bruce Evans
1f403fcfbf Cleaned up interrupt masking by declaring the state variable in a
machine-dependent macro and passing it to all machine-dependent
macros.

Eliminated the state variable for the GUPROF case.
1996-08-28 20:15:32 +00:00
David Greenman
150022d8fc Defined T_MCHK exception for i686; renumbered T_RESERVED to 29. 1996-08-11 17:29:39 +00:00
Bruce Evans
d9927d1118 Eliminated i586_ctr_rate. Use i586_ctr_freq instead.
Changed i586_ctr_bias from long long to u_int.  Only the low 32 bits
are used now that microtime uses a multiplication to do the scaling.
Previously the high 32 bits had to match those of rdtsc() to prevent
overflow traps and invalid timeval adjustments.
1996-08-02 21:16:13 +00:00
Garrett Wollman
13f588f83e Add an fls() inline function which does the opposite operation to
ffs().  (That is to say, it searches in the opposite direction.)
1996-08-01 20:29:28 +00:00
Bruce Evans
85acc6887d Eliminated pcb_inl. It was always 0 because context switches don't occur
in interrupt handlers.
1996-07-31 12:36:11 +00:00
Bruce Evans
33ded19fc2 Fixed the machdep.i8254_freq and machdep.i586_freq sysctls. Writes were
handled bogusly.

Centralized the setting of all the frequency variables.  Set these
variables atomically.  Some new ones aren't used yet.
1996-07-30 19:26:55 +00:00
John Dyson
67bf686897 Backed out the recent changes/enhancements to the VM code. The
problem with the 'shell scripts' was found, but there was a 'strange'
problem found with a 486 laptop that we could not find.  This commit
backs the code back to 25-jul, and will be re-entered after the snapshot
in smaller (more easily tested) chunks.
1996-07-30 03:08:57 +00:00
John Dyson
4f4d35edf0 This commit is meant to solve a couple of VM system problems or
performance issues.

	1) The pmap module has had too many inlines, and so the
	   object file is simply bigger than it needs to be.
	   Some common code is also merged into subroutines.
	2) Removal of some *evil* PHYS_TO_VM_PAGE macro calls.
	   Unfortunately, a few have needed to be added also.
	   The removal caused the need for more vm_page_lookups.
	   I added lookup hints to minimize the need for the
	   page table lookup operations.
	3) Removal of some bogus performance improvements, that
	   mostly made the code more complex (tracking individual
	   page table page updates unnecessarily).  Those improvements
	   actually hurt 386 processors perf (not that people who
	   worry about perf use 386 processors anymore :-)).
	4) Changed pv queue manipulations/structures to be TAILQ's.
	5) The pv queue code has had some performance problems since
	   day one.  Some significant scalability issues are resolved
	   by threading the pv entries from the pmap AND the physical
	   address instead of just the physical address.  This makes
	   certain pmap operations run much faster.  This does
	   not affect most micro-benchmarks, but should help loaded system
	   performance *significantly*.  DG helped and came up with most
	   of the solution for this one.
	6) Most if not all pmap bit operations follow the pattern:
		pmap_test_bit();
		pmap_clear_bit();
	   That made for twice the necessary pv list traversal.   The
	   pmap interface now supports only pmap_tc_bit type operations:
	   pmap_[test/clear]_modified, pmap_[test/clear]_referenced.
	   Additionally, the modified routine now takes a vm_page_t arg
	   instead of a phys address.  This eliminates a PHYS_TO_VM_PAGE
	   operation.
	7) Several rewrites of routines that contain redundant code to
	   use common routines, so that there is a greater likelihood of
	   keeping the cache footprint smaller.
1996-07-27 03:24:10 +00:00
Satoshi Asami
92b4f2e0df Update to current state of PC98 world.
Submitted by:	The FreeBSD(98) development team
1996-07-23 07:46:59 +00:00
Bruce Evans
7baccf64d3 Fixed lots of warnings about unportable casts of pointers to volatile
variables: don't depend on the compiler generating atomic code to set
the variables - use inline asm to specify the atomic instruction(s)
explicitly.
1996-07-01 20:16:10 +00:00
Bruce Evans
a111a7f827 Moved declarations of non-cpu things from <machine/cpufunc.h> to better
places.
1996-07-01 18:12:24 +00:00
Bruce Evans
79df6d8597 trap.c:
Fixed profiling of system times.  It was pre-4.4Lite and didn't support
statclocks.  System times were too small by a factor of 8.

Handle deferred profiling ticks the 4.4Lite way: use addupc_task() instead
of addupc().  Call addupc_task() directly instead of using the ADDUPC()
macro.

Removed vestigial support for PROFTIMER.

switch.s:
Removed addupc().

resourcevar.h:
Removed ADDUPC() and declarations of addupc().

cpu.h:
Updated a comment.  i386's never were tahoe's, and the deferred profiling
tick became (possibly) multiple ticks in 4.4Lite.

Obtained from:	mostly from NetBSD
1996-06-25 20:02:16 +00:00
Satoshi Asami
ad63a118b2 The Great PC98 Merge.
All new code is "#ifdef PC98"ed so this should make no difference to
PC/AT (and its clones) users.

Ok'd by:	core
Submitted by:	FreeBSD(98) development team
1996-06-14 11:02:28 +00:00
Bruce Evans
87bc5d1973 Removed unnecessary forward declarations of incomplete structs. 1996-06-08 11:21:19 +00:00
Søren Schmidt
70013b5739 Added missing CR0_NW define for Cyrix 486DLC support. It's still not
stable on my hardware, but its better... *sigh*

Obtained from: NetBSD
1996-06-03 19:37:38 +00:00
Peter Wemm
ee323f62ad Jump some hoops to have the *.s code being able to be run through both an
ansi and traditional cpp.

The nesting rules of macros are different, which required some changes.
Use __CONCAT(x,y) instead of /**/.
Redo some comments to use /* */ rather than "# comment" because the ansi
  cpp cares about those, and also cares about quote matching.
1996-05-31 01:08:08 +00:00
John Dyson
b18bfc3da7 This set of commits to the VM system does the following, and contain
contributions or ideas from Stephen McKay <syssgm@devetir.qld.gov.au>,
Alan Cox <alc@cs.rice.edu>, David Greenman <davidg@freebsd.org> and me:

	More usage of the TAILQ macros.  Additional minor fix to queue.h.
	Performance enhancements to the pageout daemon.
		Addition of a wait in the case that the pageout daemon
		has to run immediately.
		Slightly modify the pageout algorithm.
	Significant revamp of the pmap/fork code:
		1) PTE's and UPAGES's are NO LONGER in the process's map.
		2) PTE's and UPAGES's reside in their own objects.
		3) TOTAL elimination of recursive page table pagefaults.
		4) The page directory now resides in the PTE object.
		5) Implemented pmap_copy, thereby speeding up fork time.
		6) Changed the pv entries so that the head is a pointer
		   and not an entire entry.
		7) Significant cleanup of pmap_protect, and pmap_remove.
		8) Removed significant amounts of machine dependent
		   fork code from vm_glue.  Pushed much of that code into
		   the machine dependent pmap module.
		9) Support more completely the reuse of already zeroed
		   pages (Page table pages and page directories) as being
		   already zeroed.
	Performance and code cleanups in vm_map:
		1) Improved and simplified allocation of map entries.
		2) Improved vm_map_copy code.
		3) Corrected some minor problems in the simplify code.
	Implemented splvm (combo of splbio and splimp.)  The VM code now
		seldom uses splhigh.
	Improved the speed of and simplified kmem_malloc.
	Minor mod to vm_fault to avoid using pre-zeroed pages in the case
		of objects with backing objects along with the already
		existant condition of having a vnode.  (If there is a backing
		object, there will likely be a COW...  With a COW, it isn't
		necessary to start with a pre-zeroed page.)
	Minor reorg of source to perhaps improve locality of ref.
1996-05-18 03:38:05 +00:00
Poul-Henning Kamp
5084d10dd0 Move atdevbase out of locore.s and into machdep.c
Macroize locore.s' page table setup even more, now it's almost readable.
Rename PG_U to PG_A (so that I can...)
Rename PG_u to PG_U.  "PG_u" was just too ugly...
Remove some unused vars in pmap.c
Remove PG_KR and PG_KW
Remove SSIZE
Remove SINCR
Remove BTOPKERNBASE

This concludes my spring cleaning, modulus any bug fixes for messes I
have made on the way.

(Funny to be back here in pmap.c, that's where my first significant
contribution to 386BSD was... :-)
1996-05-02 22:25:18 +00:00
Poul-Henning Kamp
e911eafcba removed:
CLBYTES PD_SHIFT PGSHIFT NBPG PGOFSET CLSIZELOG2 CLSIZE pdei()
        ptei() kvtopte() ptetov() ispt() ptetoav() &c &c
new:
        NPDEPG

Major macro cleanup.
1996-05-02 14:21:14 +00:00
Bruce Evans
2dafbfcbab Added calibration the i8254 and the i586 clocks agains the RTC at boot
time.  The results are currently ignored unless certain temporary options
are used.

Added sysctls to support reading and writing the clock frequency variables
(not the frequencies themselves).  Writing is supposed to atomically
adjust all related variables.

machdep.c:
Fixed spelling of a function name in a comment so that I can log this
message which should have been with the previous commit.

Initialize `cpu_class' earlier so that it can be used in startrtclock()
instead of in calibrate_cyclecounter() (which no longer exists).

Removed range checking of `cpu'.  It is always initialized to CPU_XXX
so it is less likely to be out of bounds than most variables.

clock.h:
Removed I586_CYCLECTR().  Use rdtsc() instead.

clock.c:
TIMER_FREQ is now a variable timer_freq that defaults to the old value of
TIMER_FREQ.  #define'ing TIMER_FREQ should still work and may be the best
way of setting the frequency.

Calibration involves counting cycles while watching the RTC for one second.
This gives values correct to within (a few ppm) + (the innaccuracy of the
RTC) on my systems.
1996-05-01 08:39:02 +00:00
Bruce Evans
eabe0f9f57 Don't return unused values in cpu_switch() or savectx().
Don't preserve unused registers in the NPX case in savectx().
1996-05-01 03:47:04 +00:00
Poul-Henning Kamp
68c1eb1215 pte.h: Add the VADDR(pdi,pti) macro to construct virtual address from
page dir+table index.
pmap.h: remove NUPDE, it was wrong and not used.  Sanitize KSTKPTEOFF.
vmparam.h: Calculate virtual addr from PDI+PTI from pmap.h rather than
using magic math.  Remove UPDT, not used.
1996-04-30 12:02:12 +00:00
Poul-Henning Kamp
68832d3037 Fix cpu_fork for real.
Suggested by:	 bde
1996-04-25 06:20:19 +00:00
Nate Williams
e597b4972e - add apm to the GENERIC kernel (disabled by default), and add some comments
regarding apm to LINT
- Disabled the statistics clock on machines which have an APM BIOS and
  have the options "APM_BROKEN_STATCLOCK" enabled (which is default
  in GENERIC now)
- move around some of the code in clock.c dealing with the rtc to make
  it more obvios the effects of disabling the statistics clock

Reviewed by:	bde
1996-04-22 19:40:28 +00:00
Poul-Henning Kamp
6f1e6c97de savectx returns through cpu_switch in case of the child, so it must
return void just like cpu_switch.  Fix prototype and usage from machdep.c
1996-04-19 07:28:04 +00:00
Poul-Henning Kamp
7d2392141d Fix a bogon. cpu_fork & savectx ecpected cpu_switch to restore %eax,
they shouldn't.
1996-04-18 21:34:53 +00:00
Nate Williams
9368fc2012 hp300 -> i386 1996-04-10 05:27:11 +00:00
Bruce Evans
5dbd168e2e Changed bdb() to breakpoint() and always enable it.
Made the style more consistent, especially for the new Pentium functions.
1996-04-07 18:30:56 +00:00
Bruce Evans
73dc05d67c Moved declaration of bootverbose to a better place. It isn't
machine-dependent.

Moved declaration of cpu_fork() to a better place.  Only its
implementation is machine-dependent.
1996-04-07 16:44:28 +00:00
Andrey A. Chernov
fe0d5f43c5 Add wall_cmos_clock sysctl variable, needed to manage adjkerntz even for
UTC cmos clocks (needed for Local Timezone FSes)
1996-04-05 03:36:31 +00:00
John Dyson
030ad08012 Fixed a problem that the UPAGES of a process were being run down
in a suboptimal manner.  I had also noticed some panics that appeared
to be at least superficially caused by this problem.  Also, included
are some minor mods to support more general handling of page table page
faulting.  More details in a future commit.
1996-04-03 05:23:44 +00:00
Bruce Evans
048cd610ad Finished removing NOP macros. 1996-03-31 04:17:25 +00:00
Bruce Evans
ef9805a3c8 Moved rtcin() to clock.c.
Always delay using one inb(0x84) after each i/o in rtcin() - don't
do this conditional on the bogus option DUMMY_NOPS not being defined.
If you want an optionally slightly faster rtcin() again, then inline
it and use a better named option or sysctl variable.  It only needs
to be fast in rtcintr().
1996-03-31 04:05:36 +00:00
Bruce Evans
78966e20d9 Parenthesized macros.
Fixed munged tabs.
1996-03-29 14:14:07 +00:00