1671 Commits

Author SHA1 Message Date
jhb
41cadaa11e Divorce critical sections from spinlocks. Critical sections as denoted by
critical_enter() and critical_exit() are now solely a mechanism for
deferring kernel preemptions.  They no longer have any affect on
interrupts.  This means that standalone critical sections are now very
cheap as they are simply unlocked integer increments and decrements for the
common case.

Spin mutexes now use a separate KPI implemented in MD code: spinlock_enter()
and spinlock_exit().  This KPI is responsible for providing whatever MD
guarantees are needed to ensure that a thread holding a spin lock won't
be preempted by any other code that will try to lock the same lock.  For
now all archs continue to block interrupts in a "spinlock section" as they
did formerly in all critical sections.  Note that I've also taken this
opportunity to push a few things into MD code rather than MI.  For example,
critical_fork_exit() no longer exists.  Instead, MD code ensures that new
threads have the correct state when they are created.  Also, we no longer
try to fixup the idlethreads for APs in MI code.  Instead, each arch sets
the initial curthread and adjusts the state of the idle thread it borrows
in order to perform the initial context switch.

This change is largely a big NOP, but the cleaner separation it provides
will allow for more efficient alternative locking schemes in other parts
of the kernel (bare critical sections rather than per-CPU spin mutexes
for per-CPU data for example).

Reviewed by:	grehan, cognet, arch@, others
Tested on:	i386, alpha, sparc64, powerpc, arm, possibly more
2005-04-04 21:53:56 +00:00
imp
8a1e88ee77 Move pc98 specific parts to the pc98 specific file. 2005-04-03 23:27:11 +00:00
imp
b8c5a800b0 With pc98/include, we can have pc98 and i386 specific bus space
implementations in their own files named $MACHINE/include/bus.h.  Copy
the contents appropriately.
2005-04-03 17:47:03 +00:00
netchild
02f22e6e2c The file machine/ieeefp.h needs sys/cdefs.h on amd64 and i386 after my
compiler features tests. This is ok, since machine/ieeefp.h is an internal
interface. But floatingpoint.h is a public interface and some ports use it,
so include sys/cdefs.h in the amd64 and i386 version of floatingpoint.h.

Note: some architectures don't provide recursive inclusion protection in
floatingpoint.h, namely alpha and ia64. Except for this part and now the
include of sys/cdefs.h, all those files are equal (from a compiler POV),
so they could be moved to only one version in src/include/.

Approved by:	joerg
2005-04-02 17:31:42 +00:00
das
148ae38eee Initialize the mxcsr properly, so the initial value in a process isn't
just the value that was left over from some other application.
2005-03-17 22:21:36 +00:00
das
a84bfd6e04 Remove fpsetsticky(). This was added for SysV compatibility, but due
to mistakes from day 1, it has always had semantics inconsistent with
SVR4 and its successors.  In particular, given argument M:

- On Solaris and FreeBSD/{alpha,sparc64}, it clobbers the old flags
  and *sets* the new flag word to M.  (NetBSD, too?)
- On FreeBSD/{amd64,i386}, it *clears* the flags that are specified in M
  and leaves the remaining flags unchanged (modulo a small bug on amd64.)
- On FreeBSD/ia64, it is not implemented.

There is no way to fix fpsetsticky() to DTRT for both old FreeBSD apps
and apps ported from other operating systems, so the best approach
seems to be to kill the function and fix any apps that break.  I
couldn't find any ports that use it, and any such ports would already
be broken on FreeBSD/ia64 and Linux anyway.

By the way, the routine has always been undocumented in FreeBSD,
except for an MLINK to a manpage that doesn't describe it.  This
manpage has stated since 5.3-RELEASE that the functions it describes
are deprecated, so that must mean that functions that it is *supposed*
to describe but doesn't are even *more* deprecated.  ;-)

Note that fpresetsticky() has been retained on FreeBSD/i386.  As far
as I can tell, no other operating systems or ports of FreeBSD
implement it, so there's nothing for it to be inconsistent with.

PR:		75862
Suggested by:	bde
2005-03-15 15:53:39 +00:00
scottl
7be505a035 Refactor the bus_dma header files so that the interface is described in
sys/bus_dma.h instead of being copied in every single arch.  This slightly
reorders a flag that was specific to AXP and thus changes the ABI there.
The interface still relies on bus_space definitions found in <machine/bus.h>
so it cannot be included on its own yet, but that will be fixed at a later
date.  Add an MD <machine/bus_dma.h> for ever arch for consistency and to
allow for future MD augmentation of the API.  sparc64 makes heavy use of
this right now due to its different bus_dma implemenation.
2005-03-14 16:46:28 +00:00
peter
985987350f Remove an OBE set of comments, fix a minor whitespace nit while here. 2005-03-11 21:42:11 +00:00
jhb
35e48efc22 - Remove the BURN_BRIDGES marked support for hooking into the ISA timer 0
interrupt.
- Remove the timer_func variable as it now has a static value of
  hardclock() and is only used in one place.

Axe borrowed from:	phk
2005-03-09 15:33:58 +00:00
joerg
c85a3e95f7 netchild's mega-patch to isolate compiler dependencies into a central
place.

This moves the dependency on GCC's and other compiler's features into
the central sys/cdefs.h file, while the individual source files can
then refer to #ifdef __COMPILER_FEATURE_FOO where they by now used to
refer to #if __GNUC__ > 3.1415 && __BARC__ <= 42.

By now, GCC and ICC (the Intel compiler) have been actively tested on
IA32 platforms by netchild.  Extension to other compilers is supposed
to be possible, of course.

Submitted by:	netchild
Reviewed by:	various developers on arch@, some time ago
2005-03-02 21:33:29 +00:00
ru
6cc6926066 Use a common multi-inclusion protection, and add such a
protection to alpha/include/exec.h.
2005-02-19 21:16:48 +00:00
marius
63e23f028e Together with the changes to compile kernels with the Intel C/C++ compiler
preliminary support for using the GCC-compatibility of ICC was committed
but couldn't be tested at that time due to problems with ICC itself. Since
ICC 8.1 it's possible to use its GCC-compatibility under FreeBSD and it
turned out that a typedef for __gnuc_va_list is required in that case.
Revert the part of rev. 1.8 which #ifdef'ed out __gnuc_va_list for ICC.

MFC after:	1 week
2005-02-19 13:46:40 +00:00
alc
d0bed09103 Implement support for CPU private mappings within sf_buf_alloc(). 2005-02-13 06:23:13 +00:00
jhb
ad1ee10f6d Use the local APIC timer to drive the various kernel clocks on SMP machines
rather than forwarding interrupts from the clock devices around using IPIs:
- Add an IDT vector that pushes a clock frame and calls
  lapic_handle_timer().
- Add functions to program the local APIC timer including setting the
  divisor, and setting up the timer to either down a periodic countdown
  or one-shot countdown.
- Add a lapic_setup_clock() function that the BSP calls from
  cpu_init_clocks() to setup the local APIC timer if it is going to be
  used.  The setup uses a one-shot countdown to calibrate the timer.  We
  then program the timer on each CPU to fire at a frequency of hz * 3.
  stathz is defined as freq / 23 (hz * 3 / 23), and profhz is defined as
  freq / 2 (hz * 3 / 2).  This gives the clocks relatively prime divisors
  while keeping a low LCM for the frequency of the clock interrupts.
  Thanks to Peter Jeremy for suggesting this approach.
- Remove the hardclock and statclock forwarding code including the two
  associated IPIs.  The bitmap IPI handler has now effectively degenerated
  to just IPI_AST.
- When the local APIC timer is used we don't turn the RTC on at all, but
  we still enable interrupts on the ISA timer 0 (i8254) for timecounting
  purposes.
2005-02-08 20:25:07 +00:00
sobomax
ef41053770 o Move copyin()/copyout() out of i386_{get,set}_ldt() and
i386_{get,set}_ioperm() and make those APIs visible in the kernel namespace;

o use i386_{get,set}_ldt() and i386_{get,set}_ioperm() instead of sysarch()
  in the linuxlator, which allows to kill another two stackgaps.

MFC after:	2 weeks
2005-01-26 13:59:46 +00:00
jhb
5a3bc2892e Tweak the ELCR support slightly. Explicitly probe the ELCR during boot
instead of burying that in the atpic(4) code as atpic(4) is not the only
user of elcr(4).  Change the elcr(4) code to export a global elcr_found
variable that other code can check to see if a valid ELCR was found.

MFC after:	1 month
2005-01-18 20:24:47 +00:00
scottl
d7842c4d70 Introduce bus_dmamap_load_mbuf_sg(). Instead of taking a callback arg, this
cuts to the chase and fills in a provided s/g list.  This is meant to optimize
out the cost of the callback since the callback doesn't serve much purpose for
mbufs since mbuf loads will never be deferred.  This is just for amd64 and
i386 at the moment, other arches will be coming shortly.
2005-01-07 07:57:18 +00:00
imp
8d58b9df12 /* -> /*- for copyright notices, minor format tweaks as necessary 2005-01-06 22:18:23 +00:00
imp
3a5dc12cb2 Remove left over include file from stallion driver. 2005-01-06 22:07:20 +00:00
imp
0bfe4ce99e Expand indirect reference to BSD license with the current one. 2005-01-06 22:05:28 +00:00
imp
1df02b4d34 This doesn't seem to have been used since 386BSD days 2005-01-06 22:00:50 +00:00
imp
90af8e5e9a These appear to be unused in our tree, so remove them. 2005-01-05 20:50:31 +00:00
jhb
1666faecb2 Add some constants for the local APIC timer. 2004-12-23 20:35:07 +00:00
jhb
b74bf1946f Add a simple 'intrcnt_add' function that other MD code can use to add a
single named counter to the interrupt counts without having to fake up an
entire interrupt source.
2004-12-23 20:34:18 +00:00
jhb
8aad935790 - Add a function to set the Task Priority Register (TPR) of the local APIC.
Currently this is only used to initiailize the TPR to 0 during initial
  setup.
- Reallocate vectors for the local APIC timer, error, and thermal LVT
  entries.  The timer entry is allocated from the top of the I/O interrupt
  range reducing the number of vectors available for hardware interrupts
  to 191.  Linux happens to use the same exact vector for its timer
  interrupt as well.  If the timer vector shared the same priority queue
  as the IPI handlers, then the frequency that the timer vector will
  eventually be firing at can interact badly with the IPIs resulting in
  the queue filling and the dreaded IPI stuck panics, hence it being located
  at the top of the previous priority queue instead.
- Fixup various minor nits in comments.
2004-12-23 19:47:59 +00:00
ups
0e900aeba0 Avoid more than two pending IPI interrupt vectors per local APIC
as this may cause deadlocks.

This should fix kern/72123.

Discussed with: jhb
Tested by: Nik Azim Azam, Andy Farkas, Flack Man, Aykut KARA
           Izzet BESKARDES, Jens Binnewies, Karl Keusgen
Approved by:    sam (mentor)
2004-12-07 20:15:01 +00:00
marcel
c106bd9120 Change gdb_cpu_setreg() to not take the value to which to set the
specified register, but a pointer to the in-memory representation of
that value. The reason for this is twofold:
1. Not all registers can be represented by a register_t. In particular
   FP registers fall in that category. Passing the new register value
   by reference instead of by value makes this point moot.
2. When we receive a G or P packet, both are for writing a register,
   the packet will have the register value in target-byte order and
   in the memory representation (modulo the fact that bytes are sent
   as 2 printable hexadecimal numbers of course). We only need to
   decode the packet to have a pointer to the register value.

This change fixes the bug of extracting the register value of the P
packet as a hexadecimal number instead of as a bit array. The quick
(and dirty) fix to bswap the register value in gdb_cpu_setreg() as
it has been added on i386 and amd64 can therefore be removed and has
in fact been that.

Tested on: alpha, amd64, i386, ia64, sparc64
2004-12-01 06:40:35 +00:00
das
90a65a896e Remove UAREA_PAGES.
Reviewed by:	arch@
2004-11-20 02:29:50 +00:00
jhb
cc03c9de7d Initiate deorbit burn sequence for 80386 support in FreeBSD: Remove
80386 (I386_CPU) support from the kernel.
2004-11-16 20:42:32 +00:00
jhb
747be736a2 Spell _KERNEL correctly so that UP kernels are actually optimized again.
Submitted by:	pjd
2004-11-12 19:18:46 +00:00
jhb
ba0a3a16d6 - Use the SMP style ops for atomic_load/store() in userland so that
libraries and binaries will work on both UP and SMP machines.
- Remove unnecessary gcc memory barrier from the UP atomic_store() op.

Submitted by:	bde
2004-11-12 18:40:22 +00:00
jhb
dabf0d0c4f - Place the gcc memory barrier hint in the right place in the 80386 version
of atomic_store_rel().
- Use the 80386 versions of atomic_load_acq() and atomic_store_rel() that
  do not use serializing instructions on all UP kernels since a UP machine
  does need to synchronize with other CPUs.  This trims lots of cycles from
  spin locks on UP kernels among other things.

Benchmarked by:	rwatson
2004-11-11 22:42:25 +00:00
peter
9534fe12fd Begin an invasion of i386-land by amd64.
Expose some of the amd64-specific sysarch functions to allow alternative
implementations of the %fs/%gs code for TLS, threads, etc.  USER_LDT does
not exist on the amd64 kernel, so we have to implement things other ways.
2004-11-06 03:23:36 +00:00
njl
abd4abd5bd Move the code for halting the CPU (acpi_cpu_c1) into machdep files.
This removes the last MD portion of acpi_cpu.c.

MFC after:	2 weeks
2004-10-11 05:39:15 +00:00
alc
417a40f2bf Make pte_load_store() an atomic operation in all cases, not just i386 PAE.
Restructure pmap_enter() to prevent the loss of a page modified (PG_M) bit
in a race between processors.  (This restructuring assumes the newly atomic
pte_load_store() for correct operation.)

Reviewed by: tegge@
PR: i386/61852
2004-10-08 08:23:43 +00:00
alc
118a3c283b Prevent the unexpected deallocation of a page table page while performing
pmap_copy().  This entails additional locking in pmap_copy() and the
addition of a "flags" parameter to the page table page allocator for
specifying whether it may sleep when memory is unavailable.  (Already,
pmap_copy() checks the availability of memory, aborting if it is scarce.
In theory, another CPU could, however, allocate memory between
pmap_copy()'s check and the call to the page table page allocator,
causing the current thread to release its locks and sleep.  This change
makes this scenario impossible.)

Reviewed by: tegge@
2004-09-29 19:20:40 +00:00
julian
3f6dedb449 Fix breakpoint handling for i386.
not sure yet about 5.x... MFC if needed.
Also fixes small problems with examining some registers and
some specific gdb transfer problems.

	As the patch says:
	This is not a pretty patch and only meant as a temporary
	fix until a better solution is committed.

PR:		i386/71715
Submitted by:	Stephan Uphoff <ups@tree.com>
MFC after:	1 week
2004-09-15 23:26:49 +00:00
scottl
136dd28a2e Double the number of kernel page tables for amd64 and for i386/PAE. The old
value was only enough for 8GB of RAM, the new value can do 16GB.  This still
isn't optimal since it doesn't scale.  Fixing this for amd64 looks to be
fairly easy, but for i386 will be quite difficult.

Reviewed by: peter
2004-09-11 01:31:26 +00:00
scottl
d9af98161a Turn PREEMPTION into a kernel option. Make sure that it's defined if
FULL_PREEMPTION is defined.  Add a runtime warning to ULE if PREEMPTION is
enabled (code inspired by the PREEMPTION warning in kern_switch.c).  This
is a possible MT5 candidate.
2004-09-02 18:59:15 +00:00
julian
2c3ff98534 Give up trying to make preemption dependent on SCHED_4BSD
the list of breakages was getting too long
2004-09-01 20:41:18 +00:00
julian
ecf87479de Don't ask for this for modules. no modules need to know about preemption at the moment 2004-09-01 18:29:57 +00:00
scottl
c0ab9d7bd6 Protect the PREEMPTION logic with #ifdef _KERNEL to fix the build. 2004-09-01 10:12:08 +00:00
julian
5a545bc8bc Only turn preemption for 4bsd.
it's still poison for ULE.
2004-09-01 09:01:32 +00:00
julian
8354ba9e3a Give the 4bsd scheduler the ability to wake up idle processors
when there is new work to be done.

MFC after:	5 days
2004-09-01 06:42:02 +00:00
marcel
01fd13440d Move the kernel-specific logic to adjust frompc from MI to MD. For
these two reasons:
1. On ia64 a function pointer does not hold the address of the first
   instruction of a functions implementation. It holds the address
   of a function descriptor. Hence the user(), btrap(), eintr() and
   bintr() prototypes are wrong for getting the actual code address.
2. The logic forces interrupt, trap and exception entry points to
   be layed-out contiguously. This can not be achieved on ia64 and is
   generally just bad programming.

The MCOUNT_FROMPC_USER macro is used to set the frompc argument to
some kernel address which represents any frompc that falls outside
the kernel text range. The macro can expand to ~0U to bail out in
that case.
The MCOUNT_FROMPC_INTR macro is used to set the frompc argument to
some kernel address to represent a call to a trap or interrupt
handler. This to avoid that the trap or interrupt handler appear to
be called from everywhere in the call graph. The macro can expand
to ~0U to prevent adjusting frompc. Note that the argument is selfpc,
not frompc.

This commit defines the macros on all architectures equivalently to
the original code in sys/libkern/mcount.c. People can take it from
here...

Compile-tested on: alpha, amd64, i386, ia64 and sparc64
Boot-tested on: i386
2004-08-27 19:42:35 +00:00
obrien
cb4ed35612 Fix a bug in in_cksum_hdr w/o -O.
The C code assumes that the carry bit is always kept from the previous
operation. However, the pointer indexing requires another add operation.
Thus, the carry bit from the first operation is tromped over by the
"addl" operation that ends up following it, so the "adcl" that follows
that has no effect because the carry bit is cleared before it.
The result is checksum failure on received packets.

The larger issue is that there isn't any other way of preventing the compiler
inserting arbitrary instructions between different __asm statements (and
that the commit message in revision 1.13 of in_cksum.h is wrong on
this point).  From
http://developer.apple.com/documentation/DeveloperTools/gcc-3.3/gcc/Extended-Asm.html
	---8<---8<---8<---
	You can't expect a sequence of volatile asm instructions to remain
	perfectly consecutive. If you want consecutive output, use a single
	asm.  Also, GCC will perform some optimizations across a volatile
	asm instruction; GCC does not "forget everything" when it encounters
	a volatile asm instruction the way some other compilers do.
	---8<---8<---8<---

Also, this change also makes the ASM code much easier to read.

PR:		69257
Submitted by:	Mike Bristow <mike@urgle.com>, Qing Li <qing.li@bluecoat.com>
2004-08-25 18:28:15 +00:00
obrien
01b52fbc3e Increase the scaling of VM_KMEM_SIZE_MAX.
Submitted by:	alc
2004-08-16 08:35:22 +00:00
rwatson
abf6ea2973 Add an "options MP_WATCHDOG" to i386. This option allows one of the
logical CPUs on a system to be used as a dedicated watchdog to cause a
drop to the debugger and/or generate an NMI to the boot processor if
the kernel ceases to respond.  A sysctl enables the watchdog running
out of the processor's idle thread; a callout is launched to reset a
timer in the watchdog.  If the callout fails to reset the timer for ten
seconds, the watchdog will fire.  The sysctl allows you to select which
CPU will run the watchdog.

A sample "debug.leak_schedlock" is included, which causes a sysctl to
spin holding sched_lock in order to trigger the watchdog.  On my Xeons,
the watchdog is able to detect this failure mode and break into the
debugger, which cannot otherwise be done without an NMI button.

This option does not currently work with sched_ule due to ule's push
notion of scheduling, similar to machdep.hlt_logical_cpus failing to
work with that scheduler.

On face value, this might seem somewhat inefficient, but there are a
lot of dual-processor Xeons with HTT around, so using one as a watchdog
for testing is not as inefficient as one might fear.
2004-08-15 18:02:09 +00:00
mux
35780dc21a Instead of calling ia32_pause() conditionally on __i386__ or __amd64__
being defined, define and use a new MD macro, cpu_spinwait().  It only
expands to something on i386 and amd64, so the compiled code should be
identical.

Name of the macro found by:	jhb
Reviewed by:	jhb
2004-08-03 18:44:27 +00:00
dfr
090a3e72f1 Add definitions for TLS relocations. 2004-08-02 19:12:17 +00:00