130 Commits

Author SHA1 Message Date
Peter Wemm
eb1443c8dd Create inlines for ltr(sel), lldt(sel), lidt(addr) rather than
functions that have one instruction.
2002-09-22 04:45:21 +00:00
Mark Murray
d7ee442578 Provide an inline function for the (GNUC) assembler "hlt" instruction. 2002-09-21 18:26:53 +00:00
Peter Wemm
e344afe7c9 Move SWTCH_OPTIM_STATS related code out of cpufunc.h. (This sort of stat
gathering is not an x86 cpu feature)
2002-07-21 05:22:16 +00:00
Mark Murray
7e622d3c84 Cast to prevent "signed/unsigned comparison" warnings. 2002-07-15 13:27:43 +00:00
Peter Wemm
f1b665c8fe Revive backed out pmap related changes from Feb 2002. The highlights are:
- It actually works this time, honest!
- Fine grained TLB shootdowns for SMP on i386.  IPI's are very expensive,
  so try and optimize things where possible.
- Introduce ranged shootdowns that can be done as a single IPI.
- PG_G support for i386
- Specific-cpu targeted shootdowns.  For example, there is no sense in
  globally purging the TLB cache for where we are stealing a page from
  the local unshared process on the local cpu.  Use pm_active to track
  this.
- Add some instrumentation for the tlb shootdown code.
- Rip out SMP code from <machine/cpufunc.h>
- Try and fix some very bogus PG_G and PG_PS interactions that were bad
  enough to cause vm86 bios calls to break.  vm86 depended on our existing
  bugs and this was the cause of the VESA panics last time.
- Fix the silly one-line error that caused the 'panic: bad pte' last time.
- Fix a couple of other silly one-line errors that should have caused more
  pain than they did.

Some more work is needed:
- pmap_{zero,copy}_page[_idle].  These can be done without IPI's if we
  have a hook in cpu_switch.
- The IPI handlers need some cleanup.  I have a bogus %ds load that can
  be avoided.
- APTD handling is rather bogus and appears to be a large source of
  global TLB IPI shootdowns for no really good reason.

I see speedups of between 1.5% and ~4% on buildworlds in a while 1 loop.
I expect to see a bigger difference when there is significant pageout
activity or the system otherwise has memory shortages.

I have backed out a few optimizations that I had been using over the last
few days in order to be a little more conservative.  I'll revisit these
again over the next few days as the dust settles.

New option:  DISABLE_PG_G - In case I missed something.
2002-07-12 07:56:11 +00:00
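
The "specific-cpu targeted shootdowns" item above can be illustrated with a small sketch. This is not the pmap code: cpumask_sketch_t, shootdown_targets_sketch() and needs_ipi_sketch() are hypothetical names, and the one-bit-per-CPU layout of pm_active is an assumption made for the example.

```c
#include <stdint.h>

typedef uint32_t cpumask_sketch_t;	/* one bit per CPU (assumed layout) */

/*
 * CPUs that must receive a shootdown IPI: every CPU on which the pmap is
 * active, except the CPU doing the invalidation, which flushes locally.
 */
static cpumask_sketch_t
shootdown_targets_sketch(cpumask_sketch_t pm_active, int self_cpu)
{
	return (pm_active & ~((cpumask_sketch_t)1 << self_cpu));
}

/* Nonzero if at least one remote CPU has to be interrupted. */
static int
needs_ipi_sketch(cpumask_sketch_t pm_active, int self_cpu)
{
	return (shootdown_targets_sketch(pm_active, self_cpu) != 0);
}
```

With a pmap that is active only on the local CPU the target mask is empty, so no IPI is sent at all, which is the case the commit message singles out.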
John Baldwin
6b8c698908 Rename pause() to ia32_pause() so it doesn't conflict with the pause()
function defined in <unistd.h>.  I didn't #ifdef _KERNEL it because the
mutex implementation in libpthread will probably need this.
2002-05-22 20:32:39 +00:00
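
A rough sketch of the kind of spin-wait loop that would use such a pause inline. The names pause_sketch() and spin_until_set_sketch() are hypothetical, the bounded spin count is an addition for the example, and the instruction is only emitted on x86 compilers that accept GNU-style inline assembly.

```c
#include <stdatomic.h>

/* Hint to the CPU that we are in a spin-wait loop; a no-op elsewhere. */
static inline void
pause_sketch(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
	__asm__ __volatile__("pause");
#endif
}

/*
 * Spin until *flag becomes nonzero.  Returns the number of failed polls,
 * or -1 if max_spins polls all saw zero.
 */
static int
spin_until_set_sketch(atomic_int *flag, int max_spins)
{
	int spins = 0;

	while (atomic_load_explicit(flag, memory_order_acquire) == 0) {
		if (++spins >= max_spins)
			return (-1);
		pause_sketch();
	}
	return (spins);
}
```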
John Baldwin
07508f90b6 Debug registers aren't selectors, so use saner names for the variables in
the inline functions for reading and writing the debug registers.
2002-05-22 13:29:18 +00:00
John Baldwin
2be69f326a - Sort the pause() inline into the appropriate location.
- Add many missing prototypes to the non-GCC section.
2002-05-22 13:27:05 +00:00
John Baldwin
0228ea4e0b Rename cpu_pause() to pause(). Originally I was going to make this an
MI API with empty cpu_pause() functions on other arch's, but this
functionality is definitely unique to IA-32, so I decided to leave it
as i386-only and wrap it in #ifdef's.  I should have dropped the cpu_
prefix when I made that decision.

Requested by:	bde
2002-05-22 13:19:22 +00:00
John Baldwin
bb0d293f15 Add an inline function cpu_pause() for the IA32 'pause' instruction. 2002-05-21 20:21:53 +00:00
David Malone
a983fdfe4c Move do_cpuid into the correct place in this file and make
the indentation more like the other multi-line assembly in
this file.

Someone who understands gcc constraints could update the
constraints for do_cpuid.
2002-04-10 21:18:46 +00:00
Matthew Dillon
182da8209d Stage-2 commit of the critical*() code. This re-inlines cpu_critical_enter()
and cpu_critical_exit() and moves associated critical prototypes into their
own header file, <arch>/<arch>/critical.h, which is only included by the
three MI source files that need it.

Backout and re-apply improperly committed syntactical cleanups made to files
that were still under active development.  Backout improperly committed program
structure changes that moved localized declarations to the top of two
procedures.  Partially re-apply one of the program structure changes to
move 'mask' into an intermediate block rather than in three separate
sub-blocks to make the code more readable.  Re-integrate bug fixes that Jake
made to the sparc64 code.

Note: In general, developers should not gratuitously move declarations out
of sub-blocks.  They are where they are for reasons of structure, grouping,
readability, compiler-localizability, and to avoid developer-introduced bugs
similar to several found in recent years in the VFS and VM code.

Reviewed by:	jake
2002-04-01 23:51:23 +00:00
Matthew Dillon
d74ac6819b Compromise for critical*()/cpu_critical*() recommit. Cleanup the interrupt
disablement assumptions in kern_fork.c by adding another API call,
cpu_critical_fork_exit().  Cleanup the td_savecrit field by moving it
from MI to MD.  Temporarily move cpu_critical*() from <arch>/include/cpufunc.h
to <arch>/<arch>/critical.c (stage-2 will clean this up).

Implement interrupt deferral for i386 that allows interrupts to remain
enabled inside critical sections.  This also fixes an IPI interlock bug,
and requires uses of icu_lock to be enclosed in a true interrupt disablement.

This is the stage-1 commit.  Stage-2 will occur after stage-1 has stabilized,
and will move cpu_critical*() into its own header file(s) + other things.
This commit may break non-i386 architectures in trivial ways.  This should
be temporary.

Reviewed by:	core
Approved by:	core
2002-03-27 05:39:23 +00:00
Bruce Evans
809dbbc99b Fixed some style bugs in the removal of __P(()). The main ones were
not removing tabs before "__P((", and not outdenting continuation lines
to preserve non-KNF lining up of code with parentheses.  Switch to KNF
formatting and/or rewrap the whole prototype in some cases.
2002-03-23 15:09:35 +00:00
Warner Losh
ba74981e71 Fix abuses of cpu_critical_{enter,exit} by converting to
intr_{disable,restore} as well as providing an implementation of
intr_{disable,restore}.

Reviewed by: jake, rwatson, jhb
2002-03-21 06:19:08 +00:00
Warner Losh
e7b110dcf7 Fix minor style(9) violation in de__Ping 2002-03-20 19:04:56 +00:00
Alfred Perlstein
b63dc6ad47 Remove __P. 2002-03-20 05:48:58 +00:00
Mark Murray
f5d9a10b94 Make it a bit clearer where this file is to be used and where it
should not be. (Comments only)

Inspired by:	bde
2002-02-28 18:26:30 +00:00
Peter Wemm
d1693e1701 Back out all the pmap related stuff I've touched over the last few days.
There is some unresolved badness that has been eluding me, particularly
affecting uniprocessor kernels.  Turning off PG_G helped (which is a bad
sign) but didn't solve it entirely.  Userland programs still crashed.
2002-02-27 09:51:33 +00:00
Matthew Dillon
181df8c9d4 revert last commit temporarily due to whining on the lists. 2002-02-26 20:33:41 +00:00
Matthew Dillon
f96ad4c223 STAGE-1 of 3 commit - allow (but do not require) interrupts to remain
enabled in critical sections and streamline critical_enter() and
critical_exit().

This commit allows an architecture to leave interrupts enabled inside
critical sections if it so wishes.  Architectures that do not wish to do
this are not affected by this change.

This commit implements the feature for the I386 architecture and provides
a sysctl, debug.critical_mode, which defaults to 1 (use the feature).  For
now you can turn the sysctl on and off at any time in order to test the
architectural changes or track down bugs.

This commit is just the first stage.  Some areas of the code, specifically
the MACHINE_CRITICAL_ENTER #ifdef'd code, is strictly temporary and will
be cleaned up in the STAGE-2 commit when the critical_*() functions are
moved entirely into MD files.

The following changes have been made:

	* critical_enter() and critical_exit() for I386 now simply increment
	  and decrement curthread->td_critnest.  They no longer disable
	  hard interrupts.  When critical_exit() decrements the counter to
	  0 it effectively calls a routine to deal with whatever interrupts
	  were deferred during the time the code was operating in a critical
	  section.

	  Other architectures are unaffected.

	* fork_exit() has been conditionalized to remove MD assumptions for
	  the new code.  Old code will still use the old MD assumptions
	  in regards to hard interrupt disablement.  In STAGE-2 this will
	  be turned into a subroutine call into MD code rather than hardcoded
	  in MI code.

	  The new code places the burden of entering the critical section
	  in the trampoline code where it belongs.

	* I386: interrupts are now enabled while we are in a critical section.
	  The interrupt vector code has been adjusted to deal with the fact.
	  If it detects that we are in a critical section it currently defers
	  the interrupt by adding the appropriate bit to an interrupt mask.

	* In order to accomplish the deferral, icu_lock is required.  This
	  is i386-specific.  Thus icu_lock can only be obtained by mainline
	  i386 code while interrupts are hard disabled.  This change has been
	  made.

	* Because interrupts may or may not be hard disabled during a
	  context switch, cpu_switch() can no longer simply assume that
	  PSL_I will be in a consistent state.  Therefore, it now saves and
	  restores eflags.

	* FAST INTERRUPT PROVISION.  Fast interrupts are currently deferred.
	  The intention is to eventually allow them to operate either while
	  we are in a critical section or, if we are able to restrict the
	  use of sched_lock, while we are not holding the sched_lock.

	* ICU and APIC vector assembly for I386 cleaned up.  The ICU code
	  has been cleaned up to match the APIC code in regards to format
	  and macro availability.  Additionally, the code has been adjusted
	  to deal with deferred interrupts.

	* Deferred interrupts use a per-cpu boolean int_pending, and
	  masks ipending, spending, and fpending.  Being per-cpu variables
	  it is not currently necessary to lock; bus cycles modifying them.

	  Note that the same mechanism will enable preemption to be
	  incorporated as a true software interrupt without having to
	  further hack up the critical nesting code.

	* Note: the old critical_enter() code in kern/kern_switch.c is
	  currently #ifdef to be compatible with both the old and new
	  methodology.  In STAGE-2 it will be moved entirely to MD code.

Performance issues:

	One of the purposes of this commit is to enhance critical section
	performance, specifically to greatly reduce bus overhead to allow
	the critical section code to be used to protect per-cpu caches.
	These caches, such as Jeff's slab allocator work, can potentially
	operate very quickly making the effective savings of the new
	critical section code's performance very significant.

	The second purpose of this commit is to allow architectures to
	enable certain interrupts while in a critical section.  Specifically,
	the intention is to eventually allow certain FAST interrupts to
	operate rather than defer.

	The third purpose of this commit is to begin to clean up the
	critical_enter()/critical_exit()/cpu_critical_enter()/
	cpu_critical_exit() API which currently has serious cross pollution
	in MI code (in fork_exit() and ast() for example).

	The fourth purpose of this commit is to provide a framework that
	allows kernel-preempting software interrupts to be implemented
	cleanly.  This is currently used for two forward interrupts in I386.
	Other architectures will have the choice of using this infrastructure
	or building the functionality directly into critical_enter()/
	critical_exit().

	Finally, this commit is designed to greatly improve the flexibility
	of various architectures to manage critical section handling,
	software interrupts, preemption, and other highly integrated
	architecture-specific details.
2002-02-26 17:06:21 +00:00
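
The deferral scheme described in this commit message can be modeled in a few lines of user-space C. This is only a sketch under stated assumptions: the variables mirror the names td_critnest, int_pending and ipending from the message, but every function name ending in _sketch is hypothetical, the "handled" mask exists only to make the effect observable, and real interrupt vectors, icu_lock and per-cpu storage are left out.

```c
#include <stdint.h>

static int	td_critnest;	/* stand-in for curthread->td_critnest */
static int	int_pending;	/* set when at least one interrupt was deferred */
static uint32_t	ipending;	/* one bit per deferred interrupt source */
static uint32_t	handled;	/* sources whose handler has run (illustration) */

static void
run_handler_sketch(int irq)
{
	handled |= (uint32_t)1 << irq;
}

/* Entry point for a simulated interrupt (irq in the range 0..31). */
static void
interrupt_arrives_sketch(int irq)
{
	if (td_critnest > 0) {
		/* Inside a critical section: record the source and return. */
		ipending |= (uint32_t)1 << irq;
		int_pending = 1;
		return;
	}
	run_handler_sketch(irq);
}

/* Run every handler that was deferred while the section was held. */
static void
unpend_sketch(void)
{
	int irq;

	int_pending = 0;
	for (irq = 0; irq < 32; irq++) {
		if (ipending & ((uint32_t)1 << irq)) {
			ipending &= ~((uint32_t)1 << irq);
			run_handler_sketch(irq);
		}
	}
}

static void
critical_enter_sketch(void)
{
	td_critnest++;
}

static void
critical_exit_sketch(void)
{
	if (--td_critnest == 0 && int_pending)
		unpend_sketch();
}
```

Entering and leaving the section costs only an increment and a decrement in the common case; the deferred work is paid for only when something actually arrived in between.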
Peter Wemm
6bd95d70db Work-in-progress commit syncing up pmap cleanups that I have been working
on for a while:
- fine grained TLB shootdown for SMP on i386
- ranged TLB shootdowns.. eg: specify a range of pages to shoot down with
  a single IPI, since the IPI is very expensive.  Adjust some callers
  that used to trigger this inside tight loops to do a ranged shootdown
  at the end instead.
- PG_G support for SMP on i386 (options ENABLE_PG_G)
- defer PG_G activation till after we decide what we are going to do with
  PSE and the 4MB pages at the start of the kernel.  This should solve
  some rumored strangeness about stale PG_G entries getting stuck
  underneath the 4MB pages.
- add some instrumentation for the fine TLB shootdown
- convert some asm instruction wrappers from functions to inlines.  gcc
  seems to do a fair bit better with this.
- [temporarily!] pessimize the tlb shootdown IPI handlers.  I will fix
  this again shortly.

This has been working fairly well for me for a while, but I have tweaked
it again prior to commit since my last major testing round.  The only
outstanding problem that I know of is PG_G related, which is why there
is an option for it (not on by default for SMP).  I have seen world build
speedups of a few percent (as much as 4 or 5% in one case) but I have
*not* accurately measured this - I am a bit sceptical of these numbers.
2002-02-25 23:49:51 +00:00
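
To illustrate why a ranged shootdown helps, the sketch below counts simulated IPIs for the two approaches. It is a toy model, not the pmap code: the names ending in _sketch, the struct shoot_range_sketch layout and the 4096-byte page size are all assumptions made for the example.

```c
#define	PAGE_SIZE_SKETCH	4096UL

static unsigned long ipi_count_sketch;	/* simulated IPIs sent so far */

static void
send_ipi_sketch(void)
{
	ipi_count_sketch++;
}

/* Per-page approach: one IPI for every page in [sva, eva). */
static void
shoot_each_page_sketch(unsigned long sva, unsigned long eva)
{
	unsigned long va;

	for (va = sva; va < eva; va += PAGE_SIZE_SKETCH)
		send_ipi_sketch();
}

/* Ranged approach: publish the range once, then send a single IPI. */
struct shoot_range_sketch {
	unsigned long sva;
	unsigned long eva;
};

static struct shoot_range_sketch pending_range_sketch;

static void
shoot_range_sketch(unsigned long sva, unsigned long eva)
{
	pending_range_sketch.sva = sva;
	pending_range_sketch.eva = eva;
	send_ipi_sketch();
}
```

For an eight-page range the first function sends eight IPIs and the second sends one, with the receiving CPUs expected to walk the published range themselves.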
David Malone
34221a4505 Move do_cpuid() from identcpu.c into cpufunc.h. 2002-02-12 21:06:48 +00:00
John Baldwin
3f9a462fb9 Various assembly fixes mostly in the form of using the "+" modifier for
output operands to mark them as both input and output rather than listing
operands twice.

Reviewed by:	bde
2001-12-18 08:54:39 +00:00
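
A minimal sketch of the "+" modifier, assuming a GNU-compatible compiler. The function add_sketch() is hypothetical and is not taken from cpufunc.h; on non-x86 targets it falls back to plain C so the example still builds.

```c
#include <stdint.h>

/* Add delta to acc using a single read-modify-write operand. */
static inline uint32_t
add_sketch(uint32_t acc, uint32_t delta)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
	/*
	 * "+r" declares acc as both an input and an output of the
	 * instruction, so it does not have to be listed a second time
	 * among the input operands.
	 */
	__asm__("addl %1, %0" : "+r" (acc) : "r" (delta) : "cc");
#else
	acc += delta;
#endif
	return (acc);
}
```

The older style would list acc once as an "=r" output and again as a matching "0" input; the "+" form says the same thing with one operand.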
John Baldwin
7e1f6dfe9d Modify the critical section API as follows:
- The MD functions critical_enter/exit are renamed to start with a cpu_
  prefix.
- MI wrapper functions critical_enter/exit maintain a per-thread nesting
  count and a per-thread critical section saved state set when entering
  a critical section while at nesting level 0 and restored when exiting
  to nesting level 0.  This moves the saved state out of spin mutexes so
  that interlocking spin mutexes works properly.
- Most low-level MD code that used critical_enter/exit now use
  cpu_critical_enter/exit.  MI code such as device drivers and spin
  mutexes use the MI wrappers.  Note that since the MI wrappers store
  the state in the current thread, they do not have any return values or
  arguments.
- mtx_intr_enable() is replaced with a constant CRITICAL_FORK which is
  assigned to curthread->td_savecrit during fork_exit().

Tested on:	i386, alpha
2001-12-18 00:27:18 +00:00
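
The MI wrapper scheme listed above can be sketched as follows. This is a user-space model, not the kernel code: struct thread_sketch, the fake_intr_enabled flag and all function names ending in _sketch are hypothetical, and the shape of critical_t is an assumption.

```c
#include <assert.h>

typedef unsigned int critical_t;	/* saved state (assumed shape) */

struct thread_sketch {
	int		td_critnest;	/* nesting level */
	critical_t	td_savecrit;	/* state saved at nesting level 0 */
};

/* Stand-in for the MD layer: a fake "interrupts enabled" flag. */
static critical_t fake_intr_enabled = 1;

static critical_t
cpu_critical_enter_sketch(void)
{
	critical_t saved = fake_intr_enabled;

	fake_intr_enabled = 0;
	return (saved);
}

static void
cpu_critical_exit_sketch(critical_t saved)
{
	fake_intr_enabled = saved;
}

static struct thread_sketch curthread_sketch;

/* MI wrapper: no arguments, no return value; state lives in the thread. */
static void
critical_enter_sketch(void)
{
	struct thread_sketch *td = &curthread_sketch;

	if (td->td_critnest == 0)
		td->td_savecrit = cpu_critical_enter_sketch();
	td->td_critnest++;
}

static void
critical_exit_sketch(void)
{
	struct thread_sketch *td = &curthread_sketch;

	assert(td->td_critnest > 0);
	if (--td->td_critnest == 0)
		cpu_critical_exit_sketch(td->td_savecrit);
}
```

Only the outermost enter saves state and only the outermost exit restores it, so nested sections cannot overwrite each other's saved state.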
Brian S. Dean
6eda157eaa Provide access to the IA32 hardware debug registers from the ddb
kernel debugger.  Proper use of these registers allows setting
hardware watchpoints for use in kernel debugging.

MFC after: 2 weeks
2001-06-28 02:08:13 +00:00
Warner Losh
a5e25da40d Back out 1.103. It wasn't approved by the owner of the file and
introduced style bugs.

Submitted by: bde
2001-04-18 20:57:43 +00:00
Warner Losh
884c6f61f4 De __P() while I'm here. Done as a separate commit since it is just
stylistic.

# Yes, this breaks K&R, but this file already used so many gcc extensions
# keeping K&R support seemed too anachronistic for me.

Didn't fix the bug where functions that can only be used in the kernel
are exported to userland.
2001-04-03 18:50:55 +00:00
Warner Losh
29d5de8ad0 Make this file C++ safe. It defines many useful functions (inb, outb)
that people use from userland in C++ programs.  I've had this in my
tree for ages and just got bit by it not being in the real tree again.

This is a MFC candidate.
2001-04-03 18:19:49 +00:00
John Baldwin
034dc442ad - Add the new critical_t type used to save state inside of critical
sections.
- Add implementations of the critical_enter() and critical_exit() functions
  and remove restore_intr() and save_intr().
- Remove the somewhat bogus disable_intr() and enable_intr() functions on
  the alpha as the alpha actually uses a priority level and not simple bit
  flag on the CPU.
2001-03-28 02:31:54 +00:00
Mark Murray
39413503a4 Assembler fixes.
Fix opcodes that were typed as ".byte 0xNN, 0xMM" when an older
assembler could not recognise the newer Pentium instructions.
Reviewed by:	jhb
2000-11-21 20:16:49 +00:00
Bruce Evans
4d448fc0ea Removed unused include of <machine/lock.h>. The locking interface stopped
being (ab)used here in rev.1.97.
2000-10-12 17:05:33 +00:00
John Baldwin
12e8a79ce1 Replace loadandclear() with atomic_readandclear_int(). 2000-10-05 22:22:31 +00:00
Jason Evans
0384fff8c5 Major update to the way synchronization is done in the kernel. Highlights
include:

* Mutual exclusion is used instead of spl*().  See mutex(9).  (Note: The
  alpha port is still in transition and currently uses both.)

* Per-CPU idle processes.

* Interrupts are run in their own separate kernel threads and can be
  preempted (i386 only).

Partially contributed by:	BSDi (BSD/OS)
Submissions by (at least):	cp, dfr, dillon, grog, jake, jhb, sheldonh
2000-09-07 01:33:02 +00:00
Brian S. Dean
80275388cb Fix an __asm operand constraint which broke the -O3 and -O0 builds.
Submitted by:	Seigo Tanimura <tanimura@freebsd.org>
Approved by:	jkh
2000-02-21 13:06:50 +00:00
Brian S. Dean
de8050f9b8 Don't forget to reset the hardware debug registers when a process that
was using them exits.

Don't allow a user process to cause the kernel to take a TRCTRAP on a
user space address.

Reviewed by:	jlemon, sef
Approved by:	jkh
2000-02-20 20:51:23 +00:00
Bruce Evans
c83b1328f1 Fixed style bugs related to the access functions for the bsfl and bsrl
i386 instructions.
2000-01-09 16:46:03 +00:00
Peter Wemm
664a31e496 Change #ifdef KERNEL to #ifdef _KERNEL in the public headers. "KERNEL"
is an application space macro and the applications are supposed to be free
to use it as they please (but cannot).  This is consistent with the other
BSDs, which made this change quite some time ago.  More commits to come.
1999-12-29 04:46:21 +00:00
Luoqi Chen
e870e9b278 Segment registers can be read(write) to(from) memory locations as well as
general registers.
1999-11-15 19:45:19 +00:00
Peter Wemm
c3aac50f28 $Id$ -> $FreeBSD$ 1999-08-28 01:08:13 +00:00
Peter Wemm
264c3d8738 Undo my previous commit and do it differently. Break the ffs() etc macros
into two parts - one to do the bsfl and the other to convert the result
(base 0) to ffs()-like (base 1) in inline C.  This enables the optimizer
to be a lot smarter in certain cases, like where it knows that the argument
is non-zero and we want ffs(known non zero arg) - 1.  This appears to
produce identical code to the old inline when the argument is unknown.
1999-08-19 14:54:40 +00:00
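
The two-part split described in this commit can be sketched roughly as follows. It is an illustration only: bsf_sketch() and ffs_sketch() are hypothetical names, and a portable loop stands in for the bsfl instruction so the example builds on any target.

```c
/*
 * Part one: base-0 index of the lowest set bit.  A portable loop stands in
 * for bsfl here.  As with bsfl, the result is not meaningful when mask is
 * zero, so callers must rule that case out first.
 */
static inline int
bsf_sketch(unsigned int mask)
{
	int bit;

	for (bit = 0; bit < (int)(sizeof(mask) * 8) - 1; bit++) {
		if (mask & (1U << bit))
			break;
	}
	return (bit);
}

/*
 * Part two: convert to ffs() semantics in plain C (base 1, 0 for "no bit
 * set").  Because the "+ 1" is visible to the compiler, it can cancel
 * against a caller's "- 1" in expressions such as ffs_sketch(x) - 1.
 */
static inline int
ffs_sketch(int mask)
{
	return (mask == 0 ? 0 : bsf_sketch((unsigned int)mask) + 1);
}
```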
Peter Wemm
bb41d37104 Try using the builtin ffs() for egcs, it (by random inspection)
generates slightly better code and avoids the incl then subl when
using ffs(foo) - 1.
1999-08-19 00:32:48 +00:00
Alan Cox
03e3bc8e62 atomic.h:
Change "void *" to "volatile TYPE *", improving type safety
	and eliminating some warnings (e.g., mp_machdep.c rev 1.106).

cpufunc.h:
	Eliminate setbits.  As defined, it's not precisely correct;
	and it's redundant.  (Use atomic_set_int instead.)

ipl_funcs.c:
	Use atomic_set_int instead of setbits.

systm.h:
	Include atomic.h.

Reviewed by:	bde
1999-07-23 23:45:50 +00:00
Peter Wemm
0264a0ebd1 loadandclear() uses an atomic instruction (even on SMP, where it's an
implicitly LOCK'ed instruction), so there shouldn't be any harm in making
it volatile pointer compatible for one of the users of it.  It seems to
generate the same code regardless.
1999-05-09 23:30:01 +00:00
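
The read-and-clear operation discussed here amounts to an atomic exchange with zero. A minimal sketch using C11 atomics rather than the i386 instruction; readandclear_sketch() is a hypothetical name and is not the kernel's loadandclear().

```c
#include <stdatomic.h>

/*
 * Atomically fetch the old value of *p and leave zero behind, so that no
 * bit set by another CPU between the read and the clear can be lost.
 */
static inline unsigned int
readandclear_sketch(atomic_uint *p)
{
	return (atomic_exchange(p, 0U));
}
```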
Luoqi Chen
5206bca10a Enable vmspace sharing on SMP. Major changes are,
- %fs register is added to trapframe and saved/restored upon kernel entry/exit.
- Per-cpu pages are no longer mapped at the same virtual address.
- Each cpu now has a separate gdt selector table. A new segment selector
  is added to point to per-cpu pages, per-cpu global variables are now
  accessed through this new selector (%fs). The selectors in gdt table are
  rearranged for cache line optimization.
- fast_vfork is now on as default for both UP and SMP.
- Some aio code cleanup.

Reviewed by:	Alan Cox	<alc@cs.rice.edu>
		John Dyson	<dyson@iquest.net>
		Julian Elischer	<julian@whistel.com>
		Bruce Evans	<bde@zeta.org.au>
		David Greenman	<dg@root.com>
1999-04-28 01:04:33 +00:00
Bruce Evans
896763fa9e Don't put operands in clobber lists, since this is dubious for old
versions of gcc and broken for current versions of egcs.

Submitted by:	"John S. Dyson" <dyson@iquest.net> but rewritten by me
1999-01-09 13:00:27 +00:00
Bruce Evans
f48bbd5fb8 Fixed some style bugs. Clarified a comment. 1999-01-08 19:51:02 +00:00
Bruce Evans
2a32c15f45 Unspammed includes in <machine/cpufunc.h> in the !SMP case. Partially
unspammed them in the SMP case.
1999-01-08 19:17:49 +00:00
Bruce Evans
68ba369606 Moved declarations related to copying and zeroing to the right place. 1999-01-08 16:29:59 +00:00
Doug Rabson
e31fa854a0 Add macros for accessing device memory. 1998-08-17 08:57:05 +00:00