Commit Graph

353 Commits

Author SHA1 Message Date
John Baldwin
48fd1f38ee - Change all callers of addupc_task() to check PS_PROFIL explicitly and
  remove the check from addupc_task().  It would need sched_lock while
  testing the flag anyways.
- Always read sticks while holding sched_lock using a temporary variable
  where needed.
- Always init prticks to 0 in ast() to quiet a warning.
2001-12-18 09:06:10 +00:00
John Baldwin
7e1f6dfe9d Modify the critical section API as follows:
- The MD functions critical_enter/exit are renamed to start with a cpu_
  prefix.
- MI wrapper functions critical_enter/exit maintain a per-thread nesting
  count and a per-thread critical section saved state set when entering
  a critical section while at nesting level 0 and restored when exiting
  to nesting level 0.  This moves the saved state out of spin mutexes so
  that interlocking spin mutexes works properly.
- Most low-level MD code that used critical_enter/exit now use
  cpu_critical_enter/exit.  MI code such as device drivers and spin
  mutexes use the MI wrappers.  Note that since the MI wrappers store
  the state in the current thread, they do not have any return values or
  arguments.
- mtx_intr_enable() is replaced with a constant CRITICAL_FORK which is
  assigned to curthread->td_savecrit during fork_exit().

Tested on:	i386, alpha
2001-12-18 00:27:18 +00:00
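The nesting behaviour described in 7e1f6dfe9d can be sketched in userland C. This is only an illustration under stated assumptions, not the committed kernel code: the interrupt flag is simulated by a plain variable, `critical_t` and the field name `td_critnest` are assumed for the example, and only `td_savecrit`, `critical_enter/exit` and `cpu_critical_enter/exit` are names taken from the message.

```c
#include <assert.h>

/* ASSUMPTION: stand-in for the MD saved-state type (e.g. saved interrupt flags). */
typedef unsigned int critical_t;

/* Simulated CPU interrupt-enable state; 1 means enabled. Illustrative only. */
static unsigned int sim_intr_enabled = 1;

/* MD layer: disable "interrupts" and hand back the previous state. */
static critical_t cpu_critical_enter(void)
{
    critical_t saved = sim_intr_enabled;
    sim_intr_enabled = 0;
    return saved;
}

/* MD layer: restore whatever state cpu_critical_enter() returned. */
static void cpu_critical_exit(critical_t saved)
{
    sim_intr_enabled = saved;
}

struct thread {
    unsigned int td_critnest;  /* nesting count (field name assumed) */
    critical_t   td_savecrit;  /* saved state, named in the commit message */
};

static struct thread thread0;
static struct thread *curthread = &thread0;

/* MI wrapper: no arguments or return value, state lives in curthread. */
void critical_enter(void)
{
    /* Only the outermost entry saves the MD state. */
    if (curthread->td_critnest == 0)
        curthread->td_savecrit = cpu_critical_enter();
    curthread->td_critnest++;
}

void critical_exit(void)
{
    assert(curthread->td_critnest > 0);
    /* Only the exit back to level 0 restores the MD state. */
    if (--curthread->td_critnest == 0)
        cpu_critical_exit(curthread->td_savecrit);
}
```

Because the saved state lives in the thread rather than in each spin mutex, two nested sections can be released in any order without one restoring the other's state, which appears to be the interlocking problem the message refers to.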
John Baldwin
8e2e767b1f Add a per-thread ucred reference for syscalls and synchronous traps from
userland.  The per thread ucred reference is immutable and thus needs no
locks to be read.  However, until all the proc locking associated with
writes to p_ucred are completed, it is still not safe to use the per-thread
reference.

Tested on:	x86 (SMP), alpha, sparc64
2001-10-26 08:12:54 +00:00
John Baldwin
278da5113f Remove a bogus comment. "atomic" doesn't mean that the operation is done
as a physical atomic operation.  That would require the code to use the
atomic API, which it does not.  Instead, the operation is made pseudo
atomic (hence the quotes) by use of the lock to protect clearing all of the
flags in question.
2001-09-21 19:26:57 +00:00
Julian Elischer
b40ce4165d KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to the previous -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after:    ha ha ha ha
2001-09-12 08:38:13 +00:00
Matthew Dillon
356861db03 Remove the MPSAFE keyword from the parser for syscalls.master.
Instead introduce the [M] prefix to existing keywords.  e.g.
MSTD is the MP SAFE version of STD.  This is preparatory for a
massive Giant lock pushdown.  The old MPSAFE keyword made
syscalls.master too messy.

Begin comments MP-Safe procedures with the comment:
/*
 * MPSAFE
 */
This comment means that the procedure may be called without
Giant held (The procedure itself may still need to obtain
Giant temporarily to do its thing).

sv_prepsyscall() is now MP SAFE and assumed to be MP SAFE
sv_transtrap() is now MP SAFE and assumed to be MP SAFE

ktrsyscall() and ktrsysret() are now MP SAFE (Giant Pushdown)
trapsignal() is now MP SAFE (Giant Pushdown)

Places which used to do the if (mtx_owned(&Giant)) mtx_unlock(&Giant)
test in syscall[2]() in */*/trap.c now do not.  Instead they
explicitly unlock Giant if they previously obtained it, and then
assert that it is no longer held to catch broken system calls.

Rebuild syscall tables.
2001-08-30 18:50:57 +00:00
John Baldwin
688ebe120c - Close races with signals and other AST's being triggered while we are in
  the process of exiting the kernel.  The ast() function now loops as long
  as the PS_ASTPENDING or PS_NEEDRESCHED flags are set.  It returns with
  preemption disabled so that any further AST's that arrive via an
  interrupt will be delayed until the low-level MD code returns to user
  mode.
- Use u_int's to store the tick counts for profiling purposes so that we
  do not need sched_lock just to read p_sticks.  This also closes a
  problem where the call to addupc_task() could screw up the arithmetic
  due to non-atomic reads of p_sticks.
- Axe need_proftick(), aston(), astoff(), astpending(), need_resched(),
  clear_resched(), and resched_wanted() in favor of direct bit operations
  on p_sflag.
- Fix up locking with sched_lock some.  In addupc_intr(), use sched_lock
  to ensure pr_addr and pr_ticks are updated atomically with setting
  PS_OWEUPC.  In ast() we clear pr_ticks atomically with clearing
  PS_OWEUPC.  We also do not grab the lock just to test a flag.
- Simplify the handling of Giant in ast() slightly.

Reviewed by:	bde (mostly)
2001-08-10 22:53:32 +00:00
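The loop described in the first item of 688ebe120c can be sketched as follows. This is a simplified userland model, not the kernel's ast(): the flag bit values, the `p_reposts` field and the `handle_ast()` helper are hypothetical and exist only to simulate an AST arriving while an earlier one is being handled; locking and the preemption-disabled return are omitted.

```c
#include <assert.h>

/* Flag names are from the commit message; the bit values are ASSUMED. */
#define PS_ASTPENDING  0x1u
#define PS_NEEDRESCHED 0x2u

struct proc {
    unsigned int p_sflag;
    int          p_reposts;  /* hypothetical: simulated ASTs still to arrive */
};

/* Simulated AST work; another AST may be posted while it runs. */
static void handle_ast(struct proc *p)
{
    if (p->p_reposts > 0) {
        p->p_reposts--;
        p->p_sflag |= PS_ASTPENDING;
    }
}

/* Keep handling until neither flag is set; returns the number of passes. */
int ast_sketch(struct proc *p)
{
    int passes = 0;

    assert(p != 0);
    while (p->p_sflag & (PS_ASTPENDING | PS_NEEDRESCHED)) {
        /* Clear the flags first so a new request posted during handling is seen. */
        p->p_sflag &= ~(PS_ASTPENDING | PS_NEEDRESCHED);
        handle_ast(p);
        passes++;
    }
    return passes;
}
```

A single test-and-handle pass would miss the ASTs posted by `handle_ast()`; re-testing the flags after each pass is what closes that window in this model.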
Matthew Dillon
085be199c6 postsig() currently requires Giant to be held. Giant is held properly at
the first postsig() call, but not always held at the second place,
resulting in an occasional panic.
2001-07-04 15:36:30 +00:00
John Baldwin
64acb05b1c Grab Giant around postsig() since sendsig() can call into the vm to
grow the stack and we already needed Giant for KTRACE.
2001-07-03 05:27:53 +00:00
John Baldwin
7aa7260e4a Move ast() and userret() to sys/kern/subr_trap.c now that they are MI.
2001-06-29 19:51:37 +00:00
John Baldwin
6be523bca7 Add a new MI pointer to the process' trapframe p_frame instead of using
various differently named pointers buried under p_md.

Reviewed by:	jake (in principle)
2001-06-29 11:10:41 +00:00
John Baldwin
92809bc001 Grab Giant around trap_pfault() for now.
2001-06-29 04:18:10 +00:00
John Baldwin
06c836bbca - Grab the proc lock around CURSIG and postsig(). Don't release the proc
  lock until after grabbing the sched_lock to avoid CURSIG racing with
  psignal.
- Don't grab Giant for addupc_task() as it isn't needed.

Reported by:	tegge (signal race), bde (addupc_task a while back)
2001-06-22 23:05:11 +00:00
John Baldwin
262c9f8a3b Don't hold sched_lock across addupc_task().
Reported by:	David Taylor <davidt@yadt.co.uk>
Submitted by:	bde
2001-06-06 00:57:24 +00:00
John Baldwin
0dfefe6829 Don't acquire Giant just to call trap_fatal(), we are about to panic
anyway so we'd rather see the printf's than block if the system is
hosed.
2001-05-23 22:58:09 +00:00
Bruce Evans
1c1771cb5b Convert npx interrupts into traps instead of vice versa. This is much
simpler for npx exceptions that start as traps (no assembly required...)
and works better for npx exceptions that start as interrupts (there is
no longer a problem for nested interrupts).

Submitted by:	original (pre-SMPng) version by luoqi
2001-05-22 21:20:49 +00:00
Alfred Perlstein
2395531439 Introduce a global lock for the vm subsystem (vm_mtx).
vm_mtx does not recurse and is required for most low level
vm operations.

faults can not be taken without holding Giant.

Memory subsystems can now call the base page allocators safely.

Almost all atomic ops were removed as they are covered under the
vm mutex.

Alpha and ia64 now need to catch up to i386's trap handlers.

FFS and NFS have been tested, other filesystems will need minor
changes (grabbing the vm lock when twiddling page properties).

Reviewed (partially) by: jake, jhb
2001-05-19 01:28:09 +00:00
John Baldwin
8bd57f8fc2 Remove unneeded includes of sys/ipl.h and machine/ipl.h.
2001-05-15 23:22:29 +00:00
John Baldwin
1efb92b7ca Simplify the vm fault trap handling code a bit by using if-else instead of
duplicating code in the then case and then using a goto to jump around
the else case.
2001-05-11 23:50:08 +00:00
John Baldwin
6caa8a1501 Overhaul of the SMP code. Several portions of the SMP kernel support have
been made machine independent and various other adjustments have been made
to support Alpha SMP.

- It splits the per-process portions of hardclock() and statclock() off
  into hardclock_process() and statclock_process() respectively.  hardclock()
  and statclock() call the *_process() functions for the current process so
  that UP systems will run as before.  For SMP systems, it is simply necessary
  to ensure that all other processors execute the *_process() functions when the
  main clock functions are triggered on one CPU by an interrupt.  For the alpha
  4100, clock interrupts are delivered in a staggered broadcast fashion, so
  we simply call hardclock/statclock on the boot CPU and call the *_process()
  functions on the secondaries.  For x86, we call statclock and hardclock as
  usual and then call forward_hardclock/statclock in the MD code to send an IPI
  to cause the AP's to execute forwarded_hardclock/statclock which then call the
  *_process() functions.
- forward_signal() and forward_roundrobin() have been reworked to be MI and to
  involve less hackery.  Now the cpu doing the forward sets any flags, etc. and
  sends a very simple IPI_AST to the other cpu(s).  AST IPIs now just basically
  return so that they can execute ast() and don't bother with setting the
  astpending or needresched flags themselves.  This also removes the loop in
  forward_signal() as sched_lock closes the race condition that the loop worked
  around.
- need_resched(), resched_wanted() and clear_resched() have been changed to take
  a process to act on rather than assuming curproc so that they can be used to
  implement forward_roundrobin() as described above.
- Various other SMP variables have been moved to a MI subr_smp.c and a new
  header sys/smp.h declares MI SMP variables and API's.   The IPI API's from
  machine/ipl.h have moved to machine/smp.h which is included by sys/smp.h.
- The globaldata_register() and globaldata_find() functions as well as the
  SLIST of globaldata structures has become MI and moved into subr_smp.c.
  Also, the globaldata list is only available if SMP support is compiled in.

Reviewed by:	jake, peter
Looked over by:	eivind
2001-04-27 19:28:25 +00:00
John Baldwin
f227364a17 - Release Giant a bit earlier on syscall exit.
- Don't try to grab Giant before postsig() in userret() as it is no longer
  needed.
- Don't grab Giant before psignal() in ast() but get the proc lock instead.
2001-03-07 03:53:39 +00:00
Jake Burkholder
631d7bf3da - Rename the lcall system call handler from Xsyscall to Xlcall_syscall
  to be more like Xint0x80_syscall and less like the C function syscall().
- Reduce code duplication between the int0x80 and lcall handlers by
  shuffling the eflags into the right place, saving the sizeof the
  instruction in tf_err and jumping into the common int0x80 code.

Reviewed by:	peter
2001-02-25 02:53:06 +00:00
John Baldwin
feb43c5f37 The p_md.md_regs member of proc is used in signal handling to reference
the original trapframe of the syscall, trap, or interrupt that entered
the kernel.  Before SMPng, ast's were handled via a pseudo trap at the
end of doreti.  With the SMPng commit, ast's were broken out into a
separate ast() function that was called from doreti to match the behavior
of other architectures.  Unfortunately, when this was done, the
p_md.md_regs member of curproc was not updated in ast(), thus when
signals are handled by userret() after an interrupt that returns to
userland, we end up using a stale trapframe that will result in the
registers from the old trapframe overwriting the real trapframe and
smashing all the registers right before we return to usermode.  The saved
%cs:%eip from where we were in usermode are saved in the trapframe for
example.
2001-02-22 19:35:20 +00:00
John Baldwin
f308e0d714 - Change ast() to take a pointer to a trapframe like other architectures.
- Don't use an atomic operation to update cnt.v_soft in ast().  This is
  the only place the variable is written to, and sched_lock is always
  held when it is written, so it is already protected and the mutex release
  of sched_lock asserts a memory barrier that ensures the value will be
  updated in a timely fashion.
2001-02-22 18:05:15 +00:00
John Baldwin
26f9f5c7c7 - Use TRAPF_PC() on the alpha to access the PC in the trap frame.
- Don't hold sched_lock around addupc_task() as this apparently breaks
  profiling badly due to sched_lock being held across copyin().

Reported by:	bde (2)
2001-02-22 16:23:12 +00:00
John Baldwin
5813dc03bd - Don't call clear_resched() in userret(), instead, clear the resched flag
  in mi_switch() just before calling cpu_switch() so that the first switch
  after a resched request will satisfy the request.
- While I'm at it, move a few things into mi_switch() and out of
  cpu_switch(), specifically set the p_oncpu and p_lastcpu members of
  proc in mi_switch(), and handle the sched_lock state change across a
  context switch in mi_switch().
- Since cpu_switch() no longer handles the sched_lock state change, we
  have to setup an initial state for sched_lock in fork_exit() before we
  release it.
2001-02-20 05:26:15 +00:00
Bruce Evans
0ad74739ac Removed all traces of T_ASTFLT (except for gaps where it was). It became
unused except in dead code when ast() was split off from trap().
2001-02-19 15:47:38 +00:00
Bruce Evans
866546105a Changed the aston() family to operate on a specified process instead of
always on curproc.  This is needed to implement signal delivery properly
(see a future log message for kern_sig.c).

Debogotified the definition of aston().  aston() was defined in terms
of signotify() (perhaps because only the latter already operated on
a specified process), but aston() is the primitive.

Similar changes are needed in the ia64 versions of cpu.h and trap.c.
I didn't make them because the ia64 is missing the prerequisite changes
to make astpending and need_resched per-process and those changes are
too large to make without testing.
2001-02-19 04:15:59 +00:00
Jake Burkholder
d5a08a6065 Implement a unified run queue and adjust priority levels accordingly.
- All processes go into the same array of queues, with different
  scheduling classes using different portions of the array.  This
  allows user processes to have their priorities propagated up into
  interrupt thread range if need be.
- I chose 64 run queues as an arbitrary number that is greater than
  32.  We used to have 4 separate arrays of 32 queues each, so this
  may not be optimal.  The new run queue code was written with this
  in mind; changing the number of run queues only requires changing
  constants in runq.h and adjusting the priority levels.
- The new run queue code takes the run queue as a parameter.  This
  is intended to be used to create per-cpu run queues.  Implement
  wrappers for compatibility with the old interface which pass in
  the global run queue structure.
- Group the priority level, user priority, native priority (before
  propagation) and the scheduling class into a struct priority.
- Change any hard coded priority levels that I found to use
  symbolic constants (TTIPRI and TTOPRI).
- Remove the curpriority global variable and use that of curproc.
  This was used to detect when a process' priority had lowered and
  it should yield.  We now effectively yield on every interrupt.
- Activate propogate_priority().  It should now have the desired
  effect without needing to also propogate the scheduling class.
- Temporarily comment out the call to vm_page_zero_idle() in the
  idle loop.  It interfered with propogate_priority() because
  the idle process needed to do a non-blocking acquire of Giant
  and then other processes would try to propagate their priority
  onto it.  The idle process should not do anything except idle.
  vm_page_zero_idle() will return in the form of an idle priority
  kernel thread which is woken up at appropriate times by the vm
  system.
- Update struct kinfo_proc to the new priority interface.  Deliberately
  change its size by adjusting the spare fields.  It remained the same
  size, but the layout has changed, so userland processes that use it
  would parse the data incorrectly.  The size constraint should really
  be changed to an arbitrary version number.  Also add a debug.sizeof
  sysctl node for struct kinfo_proc.
2001-02-12 00:20:08 +00:00
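A rough model of the unified run queue from d5a08a6065 is sketched below. It is a sketch under assumptions, not the code in runq.h: the figure of 4 priorities per queue assumes 256 priority levels spread over the 64 queues, numerically lower priorities are assumed to run first, and per-queue counters stand in for the real process lists. The one property taken directly from the message is that every function receives the run queue as a parameter.

```c
#include <assert.h>
#include <stdint.h>

#define RQ_NQS 64  /* number of run queues, per the commit message */
#define RQ_PPQ 4   /* priorities per queue: ASSUMED (256 levels / 64 queues) */

struct runq {
    uint64_t rq_status;       /* bit i set means queue i is non-empty */
    int      rq_len[RQ_NQS];  /* counters stand in for real process lists */
};

/* Map a priority level to the index of the queue that holds it. */
static int runq_index(int pri)
{
    assert(pri >= 0 && pri < RQ_NQS * RQ_PPQ);
    return pri / RQ_PPQ;
}

/* The run queue is a parameter, so per-CPU queues could reuse this code. */
void runq_add(struct runq *rq, int pri)
{
    int i = runq_index(pri);

    rq->rq_len[i]++;
    rq->rq_status |= (uint64_t)1 << i;
}

void runq_remove(struct runq *rq, int pri)
{
    int i = runq_index(pri);

    assert(rq->rq_len[i] > 0);
    if (--rq->rq_len[i] == 0)
        rq->rq_status &= ~((uint64_t)1 << i);
}

/* Return the lowest-numbered non-empty queue, or -1 if all are empty. */
int runq_choose(const struct runq *rq)
{
    for (int i = 0; i < RQ_NQS; i++)
        if (rq->rq_status & ((uint64_t)1 << i))
            return i;
    return -1;
}
```

Changing the number of queues in this model only means changing the two constants, which mirrors the claim in the message that resizing requires adjusting constants and priority levels rather than the queue code itself.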
Jake Burkholder
3cbe75a414 Clear the reschedule flag after finding it set in userret(). This
used to be in cpu_switch(), but I don't see any difference between
doing it here.
2001-02-10 20:33:35 +00:00
John Baldwin
142ba5f3d7 - Make astpending and need_resched process attributes rather than CPU
  attributes.  This is needed for AST's to be properly posted in a preemptive
  kernel.  They are backed by two new flags in p_sflag: PS_ASTPENDING and
  PS_NEEDRESCHED.  They are still accessed by their old macros:
  aston(), astoff(), etc.  For completeness, an astpending() macro has been
  added to check for a pending AST, and clear_resched() has been added to
  clear need_resched().
- Rename syscall2() on the x86 back to syscall() to be consistent with
  other architectures.
2001-02-10 02:20:34 +00:00
Bosko Milekic
9ed346bab0 Change and clean the mutex lock interface.
mtx_enter(lock, type) becomes:

mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)

similarly, for releasing a lock, we now have:

mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.
We change the caller interface for the two different types of locks
because the semantics are entirely different for each case, and this
makes it explicitly clear and, at the same time, it rids us of the
extra `type' argument.

The enter->lock and exit->unlock change has been made with the idea
that we're "locking data" and not "entering locked code" in mind.

Further, remove all additional "flags" previously passed to the
lock acquire/release routines with the exception of two:

MTX_QUIET and MTX_NOSWITCH

The functionality of these flags is preserved and they can be passed
to the lock/unlock routines by calling the corresponding wrappers:

mtx_{lock, unlock}_flags(lock, flag(s)) and
mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN
locks, respectively.

Re-inline some lock acq/rel code; in the sleep lock case, we only
inline the _obtain_lock()s in order to ensure that the inlined code
fits into a cache line. In the spin lock case, we inline recursion and
actually only perform a function call if we need to spin. This change
has been made with the idea that we generally tend to avoid spin locks
and that also the spin locks that we do have and are heavily used
(i.e. sched_lock) do recurse, and therefore in an effort to reduce
function call overhead for some architectures (such as alpha), we
inline recursion for this case.

Create a new malloc type for the witness code and retire from using
the M_DEV type. The new type is called M_WITNESS and is only declared
if WITNESS is enabled.

Begin cleaning up some machdep/mutex.h code - specifically updated the
"optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN
and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently
need those.

Finally, caught up to the interface changes in all sys code.

Contributors: jake, jhb, jasone (in no particular order)
2001-02-09 06:11:45 +00:00
John Baldwin
297c46b68c Don't enable interrupts for a kernel breakpoint or trace trap. Otherwise,
this negates the explicit disabling of interrupts when entering the
debugger in Debugger().
2001-02-08 00:10:07 +00:00
Jeroen Ruigrok van der Werven
1a6e52d0e9 Fix typo: seperate -> separate.
Seperate does not exist in the english language.
2001-02-06 11:21:58 +00:00
Peter Wemm
03927d3c33 Send "#if NISA > 0" to the bit-bucket and replace it with an option.
These were compile-time "is the isa code present?" tests and not
'how many isa busses' tests.
2001-01-29 09:38:39 +00:00
Jake Burkholder
28df158b49 Push Giant down into the trap handlers that need it, instead of
acquiring it unconditionally.

Reviewed by:	jhb
2001-01-26 04:16:16 +00:00
John Baldwin
625c76db3a - Kill the have_giant parameter to userret() along with all instances of
  that name as a variable.  Use mtx_owned(&Giant) where appropriate
  instead.
- Proc locking.
- P_FOO -> PS_FOO.
- Update comments about enable interrupts during trap and why this may be
  bad if we trap while holding a spin mutex.
- Don't bother resetting p to curproc in syscall() in case we are the child
  returning from fork.  The child hasn't returned from fork through syscall
  in a while.
- Remove fork_return() as it has been superseded by the MI version.
2001-01-24 09:53:49 +00:00
Jake Burkholder
a448b62ac9 Make intr_nesting_level per-process, rather than per-cpu. Setup
interrupt threads to run with it always >= 1, so that malloc can
detect M_WAITOK from "interrupt" context.  This is also necessary
in order to context switch from sched_ithd() directly.

Reviewed By:	peter
2001-01-21 19:25:07 +00:00
Peter Wemm
558226eae7 Use #ifdef DEV_NPX from opt_npx.h instead of #if NNPX > 0 from npx.h
2001-01-19 13:19:02 +00:00
Jake Burkholder
ef73ae4b0c Use PCPU_GET, PCPU_PTR and PCPU_SET to access all per-cpu variables
other than curproc.
2001-01-10 04:43:51 +00:00
John Baldwin
05f9877c15 If we fail to emulate a vm86 trap in kernel mode, then we use
vm86_trap() to return to the calling program directly.  vm86_trap()
doesn't return, thus it was never returning to trap() to release
Giant.  Thus, release Giant before calling vm86_trap().
2000-12-13 18:57:15 +00:00
Jake Burkholder
92cf772d8d - Add code to detect if a system call returns with locks other than Giant
  held and panic if so (conditional on witness).
- Change witness_list to return the number of locks held so this is easier.
- Add kern/syscalls.c to the kernel build if witness is defined so that the
  panic message can contain the name of the offending system call.
- Add assertions that Giant and sched_lock are not held when returning from
  a system call, which were missing for alpha and ia64.
2000-12-12 01:14:32 +00:00
Jake Burkholder
7da6f97772 - Split the run queue and sleep queue linkage, so that a process
  may block on a mutex while on the sleep queue without corrupting
  it.
- Move dropping of Giant to after the acquire of sched_lock.

Tested by:	John Hay <jhay@icomtek.csir.co.za>
		jhb
2000-11-17 18:09:18 +00:00
John Baldwin
20cdcc5b73 Don't release and acquire Giant in mi_switch(). Instead, release and
acquire Giant as needed in functions that call mi_switch().  The releases
need to be done outside of the sched_lock to avoid potential deadlocks
from trying to acquire Giant while interrupts are disabled.

Submitted by:	witness
2000-11-16 02:16:44 +00:00
John Baldwin
35e0e5b311 Catch up to moving headers:
- machine/ipl.h -> sys/ipl.h
- machine/mutex.h -> sys/mutex.h
2000-10-20 07:58:15 +00:00
John Baldwin
6c56727456 - Change fast interrupts on x86 to push a full interrupt frame and to
  return through doreti to handle ast's.  This is necessary for the
  clock interrupts to work properly.
- Change the clock interrupts on the x86 to be fast instead of threaded.
  This is needed because both hardclock() and statclock() need to run in
  the context of the current process, not in a separate thread context.
- Kill the prevproc hack as it is no longer needed.
- We really need Giant when we call psignal(), but we don't want to block
  during the clock interrupt.  Instead, use two p_flag's in the proc struct
  to mark the current process as having a pending SIGVTALRM or a SIGPROF
  and let them be delivered during ast() when hardclock() has finished
  running.
- Remove CLKF_BASEPRI, which was #ifdef'd out on the x86 anyways.  It was
  broken on the x86 if it was turned on since cpl is gone.  Its only use
  was to bogusly run softclock() directly during hardclock() rather than
  scheduling an SWI.
- Remove the COM_LOCK simplelock and replace it with a clock_lock spin
  mutex.  Since the spin mutex already handles disabling/restoring
  interrupts appropriately, this also lets us axe all the *_intr() fu.
- Back out the hacks in the APIC_IO x86 cpu_initclocks() code to use
  temporary fast interrupts for the APIC trial.
- Add two new process flags P_ALRMPEND and P_PROFPEND to mark the pending
  signals in hardclock() that are to be delivered in ast().

Submitted by:	jakeb (making statclock safe in a fast interrupt)
Submitted by:	cp (concept of delaying signals until ast())
2000-10-06 02:20:21 +00:00
John Baldwin
a91b7dc11b Various whitespace cleanups after the SMPng commit, which jumbled things
around a bit in the trap handling code.
2000-10-06 01:55:07 +00:00
John Baldwin
0e2aab1237 Don't treat a kernel stack fault the same as a general protect fault or
a segment not present fault in the non-vm86 case.
2000-10-06 01:50:43 +00:00
Bruce Evans
9c15b3c143 Fixed hang on booting with -d. mtx_enter() was called on an uninitialized
lock.  The quick fix in trap.c was not quite the version tested and had no
effect; back it out.
2000-09-13 12:40:43 +00:00
Bruce Evans
bbbb2579b4 Quick fix for hang on booting with -d. mtx_enter() was called before
curproc was initialized.  curproc == NULL was interpreted as matching
the process holding Giant...  Just skip mtx_enter() and mtx_exit() in
trap() if (curproc == NULL && cold) (&& cold for safety).
2000-09-12 18:41:56 +00:00
Jason Evans
0384fff8c5 Major update to the way synchronization is done in the kernel. Highlights
include:

* Mutual exclusion is used instead of spl*().  See mutex(9).  (Note: The
  alpha port is still in transition and currently uses both.)

* Per-CPU idle processes.

* Interrupts are run in their own separate kernel threads and can be
  preempted (i386 only).

Partially contributed by:	BSDi (BSD/OS)
Submissions by (at least):	cp, dfr, dillon, grog, jake, jhb, sheldonh
2000-09-07 01:33:02 +00:00
Paul Saab
c206a8609e Change the behavior of isa_nmi to log an error message instead of
panicking and return a status so that we can decide whether to drop
into DDB or panic.  If the status from isa_nmi is true, panic the
kernel based on machdep.panic_on_nmi, otherwise if DDB is
enabled, drop to DDB based on machdep.ddb_on_nmi.

Reviewed by:	peter, phk
2000-08-06 14:17:21 +00:00
Luoqi Chen
3fb50adb4c Handle write page faults (both write only or read-modify-write) as MI vm
write-only faults.  This would allow write-only mmapped regions to function
correctly.
2000-07-31 14:47:14 +00:00
Paul Saab
88f675ba30 Change the way NMI's are handled. Before, if DDB was enabled and
a NMI occurred, you could type continue in DDB and the kernel would
not attempt to detect what type of NMI was received.  Now we check
for the type of NMI first and then go to DDB if it is enabled.

This will solve the problem with having DDB enabled and getting an
NMI due to some possibly bad error and being able to continue the
operation of the kernel when you really want to panic and know
what happened.

Submitted by:	jhb
2000-07-14 11:49:44 +00:00
Brian S. Dean
c6d3f3bfc1 Fix my own style bugs (use of spaces instead of tabs for indentation).
This is a style-only change.
2000-07-01 02:40:13 +00:00
Matthew Dillon
36e9f877df Commit major SMP cleanups and move the BGL (big giant lock) in the
syscall path inward.  A system call may select whether it needs the MP
    lock or not (the default being that it does need it).

    A great deal of conditional SMP code for various deadended experiments
    has been removed.  'cil' and 'cml' have been removed entirely, and the
    locking around the cpl has been removed.  The conditional
    separately-locked fast-interrupt code has been removed, meaning that
    interrupts must hold the CPL now (but they pretty much had to anyway).
    Another reason for doing this is that the original separate-lock for
    interrupts just doesn't apply to the interrupt thread mechanism being
    contemplated.

    Modifications to the cpl may now ONLY occur while holding the MP
    lock.  For example, if an otherwise MP safe syscall needs to mess with
    the cpl, it must hold the MP lock for the duration and must (as usual)
    save/restore the cpl in a nested fashion.

    This is precursor work for the real meat coming later: avoiding having
    to hold the MP lock for common syscalls and I/O's and interrupt threads.
    It is expected that the spl mechanisms and new interrupt threading
    mechanisms will be able to run in tandem, allowing a slow piecemeal
    transition to occur.

    This patch should result in a moderate performance improvement due to
    the considerable amount of code that has been removed from the critical
    path, especially the simplification of the spl*() calls.  The real
    performance gains will come later.

Approved by: jkh
Reviewed by: current, bde (exception.s)
Some work taken from: luoqi's patch
2000-03-28 07:16:37 +00:00
Peter Dufault
6d9a8d3e8f I applied the wrong patch set. Back out anything associated
with the known bogus currtpriority.  This undoes the previous changes to
sys/i386/i386/trap.c, sys/alpha/alpha/trap.c, sys/sys/systm.h

Now we have the patch set approved by bde.

Approved by:	bde
2000-03-02 22:03:49 +00:00
Peter Dufault
383774c417 Patches that eliminate extra context switches in FIFO case.
Fixes p1003_1b regression test in the simple case of no RR and
FIFO processes competing.

Reviewed by:	jkh, bde
2000-03-02 16:20:07 +00:00
Brian S. Dean
de8050f9b8 Don't forget to reset the hardware debug registers when a process that
was using them exits.

Don't allow a user process to cause the kernel to take a TRCTRAP on a
user space address.

Reviewed by:	jlemon, sef
Approved by:	jkh
2000-02-20 20:51:23 +00:00
Kazutaka YOKOTA
35e61cbd71 Add a new mechanism, cndbctl(), to tell the console driver that
ddb is entered.  Don't refer to `in_Debugger' to see if we
are in the debugger.  (The variable used to be static in Debugger()
and wasn't updated if ddb is entered via traps and panic anyway.)

- Don't refer to `in_Debugger'.
- Add `db_active' to i386/i386/db_interface.c (as in
  alpha/alpha/db_interface.c).
- Remove cnpollc() stub from ddb/db_input.c.
- Add the dbctl function to syscons, pcvt, and sio. (The function for
  pcvt and sio is noop at the moment.)

Jointly developed by: bde and me

(The final version was tweaked by me and not reviewed by bde.  Thus,
if there is any error in this commit, that is entirely of mine, not
his.)

Some changes were obtained from: NetBSD
2000-01-11 14:54:01 +00:00
Alan Cox
b561683329 Passing "0" or "FALSE" as the fourth argument to vm_fault is wrong. It
should be "VM_FAULT_NORMAL".
1999-11-09 01:44:28 +00:00
Poul-Henning Kamp
923502ff91 useracc() the prequel:
Merge the contents (less some trivial bordering the silly comments)
of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>.  This puts
the #defines for the vm_inherit_t and vm_prot_t types next to their
typedefs.

This paves the road for the commit to follow shortly: change
useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE}
as argument.
1999-10-29 18:09:36 +00:00
Peter Wemm
c3aac50f28 $Id$ -> $FreeBSD$
1999-08-28 01:08:13 +00:00
Martin Cracauer
a7674320e9 On FPU exceptions, pass a useful error code (one of the FPE_...
macros) to the signal handler, for old-style BSD signal handlers as
the second (int) argument, for SA_SIGINFO signal handlers as
siginfo_t->si_code. This is source-compatible with Solaris, except
that we have no <siginfo.h> (which isn't even mentioned in POSIX
1003.1b).

A rather complete example program is at
  http://www3.cons.org/cracauer/freebsd-signal.c
This will be added to the regression tests in src/.

This commit also adds code to disable the (hardware) FPU from
userconfig, so that you can use a software FP emulator on a machine
that has hardware floating point. See LINT.
1999-07-25 13:16:09 +00:00
Bruce Evans
50045fbc7c Changed the global `idt' from an array to a pointer so that npx.c
automatically hacks on the active copy of the IDT if f00f_hack()
has changed it.  This also allows simplifications in setidt().
This fixes breakage of FP exception handling by rev.1.55 of
sys/kernel.h.  FP exceptions were sent to npx.c's probe handlers
because npx.c "restored" the old handlers to the wrong copy of the
IDT.  The SYSINIT for f00f_hack() was purposely run quite late to
avoid problems like this, but it is bogusly associated with the
SYSINIT for proc0 so it was moved with the latter.

Problem reported and fix tested by:  Martin Cracauer <cracauer@cons.org>
1999-06-18 14:32:21 +00:00
Jonathan Lemon
eb9d435ae7 Unifdef VM86.
Reviewed by:	silence on -current
1999-06-01 18:20:36 +00:00
Peter Wemm
dfd5dee1b0 Add sufficient braces to keep egcs happy about potentially ambiguous
if/else nesting.
1999-05-06 18:13:11 +00:00
Luoqi Chen
5206bca10a Enable vmspace sharing on SMP. Major changes are,
- %fs register is added to trapframe and saved/restored upon kernel entry/exit.
- Per-cpu pages are no longer mapped at the same virtual address.
- Each cpu now has a separate gdt selector table. A new segment selector
  is added to point to per-cpu pages, per-cpu global variables are now
  accessed through this new selector (%fs). The selectors in gdt table are
  rearranged for cache line optimization.
- fast_vfork is now on as default for both UP and SMP.
- Some aio code cleanup.

Reviewed by:	Alan Cox	<alc@cs.rice.edu>
		John Dyson	<dyson@iquest.net>
		Julian Elischer	<julian@whistel.com>
		Bruce Evans	<bde@zeta.org.au>
		David Greenman	<dg@root.com>
1999-04-28 01:04:33 +00:00
Peter Wemm
db42d90829 unifdef -DVM_STACK - it's been on for a while for x86 and was checked
and appeared to be working for the Alpha some time ago.
1999-04-19 14:14:14 +00:00
Poul-Henning Kamp
a2210fe12b Make TIMER_FREQ a normal, undocumented option. Raise confusion to
a higher level with example in LINT.

Clarify comment about PPS_SYNC.  Ignore for now that it doesn't
work in FLL mode, it will in a few days.
1999-03-09 20:20:09 +00:00
Julian Elischer
2267af789e Add (but don't activate) code for a special VM option to make
downward growing stacks more general.
Add (but don't activate) code to use the new stack facility
when running threads, (specifically the linux threads support).
This allows people to use both linux compiled linuxthreads, and also the
native FreeBSD linux-threads port.

The code is conditional on VM_STACK. Not using this will
produce the old heavily tested system.

Submitted by: Richard Seaman <dick@tar.com>
1999-01-06 23:05:42 +00:00
Mike Smith
9959b1a882 Improved DDB_UNATTENDED behaviour. From the submitter:
There's something that's been bugging me for a while, so I decided to fix it.
FreeBSD now will DTRT WRT DDB and DDB_UNATTENDED (!debugger_on_panic), at least
in my opinion. The behavior change is such that:

	1. Nothing changes when debugger_on_panic != 0.
	2. When DDB_UNATTENDED (!debugger_on_panic), if a panic occurs, the
		machine will reboot. Also, if a trap occurs, the machine will
		panic and reboot, unlike how it broke to DDB before. HOWEVER,
		a trap inside DDB will not cause a panic, allowing full use
		of DDB without having to worry about the machine being stuck
		at a DDB prompt if something goes wrong during the day.
		Patches for this behavior follow my signature, and it would
		be a boon to anyone (like me) who uses DDB_UNATTENDED, but
		actually wants the machine to panic on a trap (otherwise,
		what's the use, if the machine causes a fatal trap rather than
		a true panic, of debugger_on_panic?). The changes cause no
		adverse behavior, but do involve two symbols becoming global

Submitted by:	Brian Feldman <green@unixhelp.org>
1998-12-28 23:03:00 +00:00
Bruce Evans
4f2129fa86 Removed bogus casts of USRSTACK and/or the other operand in binary
expressions involving USRSTACK.
1998-12-16 15:21:51 +00:00
Archie Cobbs
2326715f79 Avoid compiler warning (printf arg type mismatch) when compiling #ifdef DEBUG 1998-12-06 00:03:30 +00:00
KATO Takenori
9ad861edee - For some old Cyrix CPUs, %cr2 is clobbered by interrupts. This
   problem is worked around by using an interrupt gate for the page
   fault handler.  This code was originally made for NetBSD/pc98 by
   Naofumi Honda <honda@kururu.math.sci.hokudai.ac.jp> and is already
   in the PC98 tree.  Because of this bug, trap_fatal cannot show
   the correct page fault address if %cr2 is obtained in this function.
   Therefore, trap_fatal uses the value from the trap() function.
-  The trap handler always enables interrupts when a buggy application
   or kernel code has disabled interrupts and then trapped.  This code
   was prepared by Bruce Evans <bde@FreeBSD.org>.

Submitted by:	Bruce Evans <bde@FreeBSD.org>
		Naofumi Honda <honda@kururu.math.sci.hokudai.ac.jp>
1998-12-02 08:15:17 +00:00
Bruce Evans
1fcee46997 Fixed printf format errors. 1998-08-23 10:16:26 +00:00
Eivind Eklund
288078be0f Translate T_PROTFLT to SIGSEGV instead of SIGBUS when running under
Linux emulation.  This makes Allegro Common Lisp 4.3 work under
FreeBSD!

Submitted by: Fred Gilham <gilham@csl.sri.com>
Commented on by: bde, dg, msmith, tg
Hoping he got everything right:  eivind
1998-04-28 18:15:08 +00:00
Bruce Evans
c1087c1324 Support compiling with `gcc -ansi'. 1998-04-15 17:47:40 +00:00
Poul-Henning Kamp
227ee8a188 Eradicate the variable "time" from the kernel, using various measures.
"time" wasn't an atomic variable, so splfoo() protection was needed
around any access to it, unless you just wanted the seconds part.

Most uses of time.tv_sec now use the new variable time_second instead.

gettime() changed to getmicrotime().

Remove a couple of unneeded splfoo() protections, the new getmicrotime()
is atomic, (until Bruce sets a breakpoint in it).

A couple of places needed random data, so use read_random() instead
of mucking about with time which isn't random.

Add a new nfs_curusec() function.

Mark a couple of bogosities involving the now disappeared time variable.

Update ffs_update() to avoid the weird "== &time" checks, by fixing the
one remaining call that passed &time as args.

Change profiling in ncr.c to use ticks instead of time.  Resolution is
the same.

Add new function "tvtohz()" to avoid the bogus "splfoo(), add time, call
hzto() which subtracts time" sequences.

Reviewed by:	bde
1998-03-30 09:56:58 +00:00
Bruce Evans
08637435f2 Moved some #includes from <sys/param.h> nearer to where they are actually
used.
1998-03-28 10:33:27 +00:00
Jonathan Lemon
640c4313af Add the ability to make real-mode BIOS calls from the kernel. Currently,
everything is contained inside #ifdef VM86, so this option must be
present in the config file to use this functionality.

Thanks to Tor Egge, these changes should work on SMP machines.  However,
it may not be thoroughly SMP-safe.

Currently, the only BIOS calls made are memory-sizing routines at bootup;
these replace reading the RTC values.
1998-03-23 19:52:59 +00:00
Eivind Eklund
0b08f5f737 Back out DIAGNOSTIC changes. 1998-02-06 12:14:30 +00:00
Eivind Eklund
47cfdb166d Turn DIAGNOSTIC into a new-style option. 1998-02-04 22:34:03 +00:00
Eivind Eklund
e0d781f3a5 Make POWERFAIL_NMI, PPS_SYNC and NATM new style options.
This also fixes a couple of defunct options; submitted by bde.
1998-01-31 05:00:21 +00:00
Sean Eric Fagan
2a024a2b05 Changes to allow event-based process monitoring and control. 1997-12-06 04:11:14 +00:00
John-Mark Gurney
4d9deedb49 document and make the NO_F00F_HACK a proper option...
also, sort some option includes while I'm here..

Forgotten by:	sef
1997-12-04 21:21:26 +00:00
Jordan K. Hubbard
e41b6f2db7 After consultation with David, change
#ifndef NO_F00F_HACK
to
#if defined(I586_CPU) && !defined(NO_F00F_HACK)
1997-12-04 14:35:40 +00:00
Sean Eric Fagan
c4fbf2774d Work around for the Intel Pentium F00F bug; this is Intel's recommended
workaround.  Note that this currently eats up two pages extra in the system;
this could be alleviated by aligning idt correctly, and then only dealing with
that (as opposed to the current method of allocated two pages and copying the
IDT table to that, and then setting that to be the IDT table).
1997-12-03 02:45:50 +00:00
Bruce Evans
21e5241572 Fixed some #include messes.
Hid the check of the user %cs in syscall() under `#ifdef DIAGNOSTIC'.
1997-11-24 13:25:37 +00:00
Poul-Henning Kamp
cb226aaa62 Move the "retval" (3rd) parameter from all syscall functions and put
it in struct proc instead.

This fixes a boatload of compiler warnings, and removes a lot of cruft
from the sources.

I have not removed the /*ARGSUSED*/, they will require some looking at.

libkvm, ps and other userland struct proc frobbing programs will need
to be recompiled.
1997-11-06 19:29:57 +00:00
Peter Wemm
b67dffdad2 Compensate for pcb.h tweaks.
(Bruce pointed out the nesting)
1997-10-10 12:42:54 +00:00
Peter Wemm
98823b2366 Convert the VM86 option from a global option to an option only depended
on by the files that use it.  Changing the VM86 option now only causes
a recompile of a dozen files or so rather than the entire kernel.
1997-10-10 09:44:12 +00:00
Justin T. Gibbs
919429034e autoconf.c:
	Add cpu_rootconf and cpu_dumpconf so that configuring these
	two devices can be better controlled by the MI configuration
	code.

machdep.c:
	MD initialization code for the new callout interface.

trap.c:
	Add support for printing out whether cam interrupts are masked
	during a panic.
1997-09-21 21:38:05 +00:00
Peter Wemm
279a69322c Cosmetic adjustment for the trap/double fault/panic cpu id listing.
It now prints the apic id in hex rather than decimal.
1997-09-05 08:54:55 +00:00
Jonathan Lemon
5f07393373 Remove the vm86 support as an LKM, and link it directly into the kernel
if 'options "VM86"' is in the config file.  The LKM was really for
development, and has probably outlived its usefulness.
1997-08-28 14:36:56 +00:00
Peter Wemm
9a3b3e8bce Clean up the SMP AP bootstrap and eliminate the wretched idle procs.
- We now have enough per-cpu idle context, the real idle loop has been
revived (cpu's halt now with nothing to do).
- Some preliminary support for running some operations outside the
global lock (eg: zeroing "free but not yet zeroed pages") is present
but appears to cause problems.  Off by default.
- the smp_active sysctl now behaves differently. It's merely a 'true/false'
option.  Setting smp_active to zero causes the AP's to halt in the idle
loop and stop scheduling processes.
- bootstrap is a lot safer.  Instead of sharing a statically compiled in
stack a number of times (which has caused lots of problems) and then
abandoning it, we use the idle context to boot the AP's directly.  This
should help >2 cpu support since the bootlock stuff was in doubt.
- print physical apic id in traps.. helps identify private pages getting
out of sync.  (You don't want to know how much hair I tore out with this!)

More cleanup to follow, this is more of a checkpoint than a
'finished' thing.
1997-08-26 18:10:38 +00:00
Philippe Charnier
40d5099441 Revert my previous commit about using CS_SECURE macro.
Requested by:	Bruce.
1997-08-21 06:33:04 +00:00
Steve Passe
7b185ef809 Preparation for moving cpl into critical region access.
Several new fine-grained locks.
New FAST_INTR() methods:
 - separate simplelock for FAST_INTR, no more giant lock.
 - FAST_INTR()s no longer check ipending on way out of ISR.
sio made MP-safe (I hope).
1997-08-20 05:25:48 +00:00
Philippe Charnier
15f3549108 Use CS_SECURE macro.
Reviewed by:	John Dyson
1997-08-18 06:58:59 +00:00
John Dyson
0b6e0f74f9 Back out a part of the disk scheduling "improvements" :-(. Let me know
how the system works now!!!
1997-08-12 19:07:42 +00:00
John Dyson
c0ecffb96b Modify the scheduling policy to take into account disk I/O waits
as chargeable CPU usage.  This should mitigate the problem of processes
doing disk I/O hogging the CPU.  Various users have reported the
problem, and test code shows that the problem should now be gone.
1997-08-09 10:13:32 +00:00
John Dyson
48a09cf276 VM86 kernel support.
Work done by BSDI, Jonathan Lemon <jlemon@americantv.com>,
	Mike Smith <msmith@gsoft.com.au>, Sean Eric Fagan <sef@kithrup.com>,
	and probably a lot of others.
Submitted by:	Jonathan Lemon <jlemon@americantv.com>
1997-08-09 00:04:06 +00:00
Bruce Evans
e31521c3dd Removed unused #includes. 1997-07-20 08:37:24 +00:00
Peter Wemm
b3196e4b9f Preliminary support for per-cpu data pages.
This eliminates a lot of #ifdef SMP type code.  Things like _curproc reside
in a data page that is unique on each cpu, eliminating the expensive macros
like:    #define curproc (SMPcurproc[cpunumber()])

There are some unresolved bootstrap and address space sharing issues at
present, but Steve is waiting on this for other work.  There is still some
strictly temporary code present that isn't exactly pretty.

This is part of a larger change that has run into some bumps, this part is
standalone so it should be safe.  The temporary code goes away when the
full idle cpu support is finished.

Reviewed by: fsmp, dyson
1997-06-22 16:04:22 +00:00
Bruce Evans
7b3c84247b Preserve %fs and %gs across context switches. This has a relatively low
cost since it is only done in cpu_switch(), not for every exception.
The extra state is kept in the pcb, and handled much like the npx state,
with similar deficiencies (the state is not preserved across signal
handlers, and error handling loses state).
1997-06-07 04:36:10 +00:00
Doug Rabson
683523378c Move interrupt handling code from isa.c to a new file. This should make
isa.c (slightly) more portable and will make my life developing the really
portable version much easier.

Reviewed by:	peter, fsmp
1997-06-02 08:19:06 +00:00
Peter Wemm
5400ed3b2f Include file updates.. <machine/spl.h> -> <machine/ipl.h>, add
<machine/ipl.h> to those files that were depending on getting SWI_*
implicitly via <machine/cpufunc.h>
1997-05-31 09:27:31 +00:00
Peter Wemm
ae9249615f remove opt_smp.h and fix the reason it was needed. 1997-05-29 05:04:30 +00:00
Peter Wemm
835834c085 md_regs is now a struct trapframe * 1997-05-07 20:08:53 +00:00
John Dyson
b332d9a66e Make sure that *fork() always returns with %edx == 1 in the
child.  This was sometimes not happening correctly during my
threads code work.
1997-05-05 04:08:12 +00:00
Peter Wemm
477a642cee Man the liferafts! Here comes the long awaited SMP -> -current merge!
There are various options documented in i386/conf/LINT, there is more to
come over the next few days.

The kernel should run pretty much "as before" without the options to
activate SMP mode.

There are a handful of known "loose ends" that need to be fixed, but
have been put off since the SMP kernel is in a moderately good condition
at the moment.

This commit is the result of the tinkering and testing over the last 14
months by many people.  A special thanks to Steve Passe for implementing
the APIC code!
1997-04-26 11:46:25 +00:00
Bruce Evans
58611a61ed Fixed printing of registers in dblfault_handler(). The registers
were always in a tss; that tss just changed from the one in the
pcb to common_tss (who knows where it was when there was no curpcb?).
Not using the pcb also fixed the problem that there is no pcb in
idle(), so we now always get useful register values.
1997-04-14 13:52:52 +00:00
Peter Wemm
a2a1c95c10 The biggie: Get rid of the UPAGES from the top of the per-process address
space. (!)

Have each process use the kernel stack and pcb in the kvm space.  Since
the stacks are at a different address, we cannot copy the stack at fork()
and allow the child to return up through the function call tree to return
to user mode - create a new execution context and have the new process
begin executing from cpu_switch() and go to user mode directly.
In theory this should speed up fork a bit.

Context switch the tss_esp0 pointer in the common tss.  This is a lot
simpler than switching the gdt[GPROC0_SEL].sd.sd_base pointer
to each process's tss since the esp0 pointer is a 32 bit pointer, and the
sd_base setting is split into three different bit sections at non-aligned
boundaries and requires a lot of twiddling to reset.
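
To illustrate why resetting a descriptor base takes so much twiddling, here
is a minimal sketch in C that packs a 32-bit base into a raw 64-bit x86
segment descriptor.  The helper names and the use of a plain uint64_t are
assumptions made for illustration; this is not the kernel's own descriptor
structure or code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: reassemble the base from its three pieces. */
static uint32_t
desc_get_base(uint64_t desc)
{
	return ((uint32_t)((desc >> 16) & 0xffff) |
	    (uint32_t)((desc >> 32) & 0xff) << 16 |
	    (uint32_t)((desc >> 56) & 0xff) << 24);
}

/* Hypothetical helper: store a 32-bit base in a raw x86 descriptor. */
static uint64_t
desc_set_base(uint64_t desc, uint32_t base)
{
	/* Base 15:0 lives in bits 16..31, 23:16 in 32..39, 31:24 in 56..63. */
	desc &= ~((UINT64_C(0xffff) << 16) | (UINT64_C(0xff) << 32) |
	    (UINT64_C(0xff) << 56));
	desc |= (uint64_t)(base & 0xffff) << 16;
	desc |= (uint64_t)((base >> 16) & 0xff) << 32;
	desc |= (uint64_t)((base >> 24) & 0xff) << 56;
	assert(desc_get_base(desc) == base);	/* round-trip sanity check */
	return (desc);
}
```

By contrast, updating an esp0-style field is a single 32-bit store, which is
the simplification the paragraph above describes.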

The 8K of memory at the top of the process space is now empty, and unmapped
(and unmappable, it's higher than VM_MAXUSER_ADDRESS).

Simplify the pmap code to manage process contexts: we no longer have to
double map the UPAGES, which simplifies and should measurably speed up fork().

The following parts came from John Dyson:

Set PG_G on the UPAGES that are now in kernel context, and invalidate
them when swapping them out.

Move the upages object (upobj) from the vmspace to the proc structure.

Now that the UPAGES (pcb and kernel stack) are out of user space, make
rfork(..RFMEM..) do what was intended by sharing the vmspace
entirely via reference counting rather than simply inheriting the mappings.
1997-04-07 07:16:06 +00:00
Peter Wemm
271b264e4c No longer use an i386tss as the basis of our pcb - it wasn't particularly
convenient and makes life difficult for my next commit.  We still need
an i386tss to point to for the tss slot in the gdt, so we use a common
tss shared between all processes.

Note that this is going to break debugging until this series of commits
is finished.  core dumps will change again too. :-(  we really need
a more modern core dump format that doesn't depend on the pcb/upages.

This change makes VM86 mode harder, but the following commits will remove
a lot of constraints for the VM86 system, including the possibility of
extending the pcb for an IO port map etc.

Obtained from: bde
1997-04-07 06:45:18 +00:00
John Dyson
a04c970a7a Fix the gdb executable modify problem. Thanks to the detective work
by Alan Cox <alc@cs.rice.edu>, and his description of the problem.

The bug was primarily in procfs_mem, but the mistake likely happened
due to the lack of vm system support for the operation.  I added
better support for selective marking of page dirty flags so that
vm_map_pageable(wiring) will not cause this problem again.

The code in procfs_mem is now less bogus (but maybe still a little
so.)
1997-04-06 02:29:45 +00:00
Peter Wemm
6875d25465 Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.
1997-02-22 09:48:43 +00:00
John Dyson
996c772f58 This is the kernel Lite/2 commit. There are some requisite userland
changes, so don't expect to be able to run the kernel as-is (very well)
without the appropriate Lite/2 userland changes.

The system boots and can mount UFS filesystems.

Untested: ext2fs, msdosfs, NFS
Known problems: Incorrect Berkeley ID strings in some files.
		Mount_std mounts will not work until the getfsent
		library routine is changed.

Reviewed by:	various people
Submitted by:	Jeffery Hsu <hsu@freebsd.org>
1997-02-10 02:22:35 +00:00
John Dyson
7e64cb7a96 Remove some dead code from trapwrite.
Submitted by:	Stephen McKay <syssgm@devetir.qld.gov.au>
1997-01-23 01:30:59 +00:00
Jordan K. Hubbard
1130b656e5 Make the long-awaited change from $Id$ to $FreeBSD$
This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore.  This update would have been
insane otherwise.
1997-01-14 07:20:47 +00:00
Bruce Evans
959c02787e Only handle copyin/out/etc faults when not in an interrupt handler.
This makes unexpected faults (in an interrupt handler) more likely
to crash properly.  It could be done even better (more robustly and
more efficiently) using lazy fault handling.
1996-12-18 19:12:01 +00:00
Bruce Evans
f313170d3c Updated #includes to 4.4Lite style. 1996-09-10 08:32:01 +00:00
David Greenman
eaed89032e Change an splclock that needs to be an splhigh into an splhigh.
Reviewed by:	bde
1996-09-01 10:10:12 +00:00
David Greenman
11282a57ce Add support for i686 machine check trap. 1996-08-11 17:41:25 +00:00
Bruce Evans
f3460ead96 Fixed cloned comments about npx traps to match context. 1996-07-12 06:03:14 +00:00
Bruce Evans
79df6d8597 trap.c:
Fixed profiling of system times.  It was pre-4.4Lite and didn't support
statclocks.  System times were too small by a factor of 8.

Handle deferred profiling ticks the 4.4Lite way: use addupc_task() instead
of addupc().  Call addupc_task() directly instead of using the ADDUPC()
macro.

Removed vestigial support for PROFTIMER.

switch.s:
Removed addupc().

resourcevar.h:
Removed ADDUPC() and declarations of addupc().

cpu.h:
Updated a comment.  i386's never were tahoe's, and the deferred profiling
tick became (possibly) multiple ticks in 4.4Lite.

Obtained from:	mostly from NetBSD
1996-06-25 20:02:16 +00:00
Satoshi Asami
d7629dff3b A fast memory copy for Pentiums using floating point registers.
It is called from copyin and copyout.

The new routine is conditioned on I586_CPU and I586_FAST_BCOPY, so you
need

options "I586_FAST_BCOPY"

(quotes essential) in your kernel config file.

Also, if you have other kernel types configured in your kernel, an
additional check to make sure it is running on a Pentium is inserted.
(It is not clear why it doesn't help on P6s, it may be just that the
 Orion chipset doesn't prefetch as efficiently as Tritons and friends.)

Bruce can now hack this away. :)
1996-06-13 07:17:21 +00:00
Gary Palmer
c23670e294 Clean up -Wunused warnings.
Reviewed by:		bde
1996-06-12 05:11:41 +00:00
John Dyson
b18bfc3da7 This set of commits to the VM system does the following, and contains
contributions or ideas from Stephen McKay <syssgm@devetir.qld.gov.au>,
Alan Cox <alc@cs.rice.edu>, David Greenman <davidg@freebsd.org> and me:

	More usage of the TAILQ macros.  Additional minor fix to queue.h.
	Performance enhancements to the pageout daemon.
		Addition of a wait in the case that the pageout daemon
		has to run immediately.
		Slightly modify the pageout algorithm.
	Significant revamp of the pmap/fork code:
		1) PTE's and UPAGES's are NO LONGER in the process's map.
		2) PTE's and UPAGES's reside in their own objects.
		3) TOTAL elimination of recursive page table pagefaults.
		4) The page directory now resides in the PTE object.
		5) Implemented pmap_copy, thereby speeding up fork time.
		6) Changed the pv entries so that the head is a pointer
		   and not an entire entry.
		7) Significant cleanup of pmap_protect, and pmap_remove.
		8) Removed significant amounts of machine dependent
		   fork code from vm_glue.  Pushed much of that code into
		   the machine dependent pmap module.
		9) Support more completely the reuse of already zeroed
		   pages (Page table pages and page directories) as being
		   already zeroed.
	Performance and code cleanups in vm_map:
		1) Improved and simplified allocation of map entries.
		2) Improved vm_map_copy code.
		3) Corrected some minor problems in the simplify code.
	Implemented splvm (combo of splbio and splimp.)  The VM code now
		seldom uses splhigh.
	Improved the speed of and simplified kmem_malloc.
	Minor mod to vm_fault to avoid using pre-zeroed pages in the case
		of objects with backing objects along with the already
		existent condition of having a vnode.  (If there is a backing
		object, there will likely be a COW...  With a COW, it isn't
		necessary to start with a pre-zeroed page.)
	Minor reorg of source to perhaps improve locality of ref.
1996-05-18 03:38:05 +00:00
John Dyson
4e489ec421 Remove a now unnecessary prototype from pmap.c. Also remove now
unnecessary vm_fault's of page table pages in trap.c.
1996-03-28 05:40:58 +00:00
Bruce Evans
ba00d77a82 Print stack pointer and frame pointer in trap messages.
Fixed "trace/trap" message.

Reviewed by:	davidg
1996-03-27 17:33:39 +00:00
Peter Wemm
d66a506616 Mega-commit for Linux emulator update.. This has been stress tested under
netscape-2.0 for Linux running all the Java stuff.  The scrollbars are now
working, at least on my machine. (whew! :-)

I'm uncomfortable with the size of this commit, but it's too
interdependent to easily separate out.

The main changes:

COMPAT_LINUX is *GONE*.  Most of the code has been moved out of the i386
machine dependent section into the linux emulator itself.  The int 0x80
syscall code was almost identical to the lcall 7,0 code and a minor tweak
allows them to both be used with the same C code.  All kernels can now
just modload the lkm and it'll DTRT without having to rebuild the kernel
first.  Like IBCS2, you can statically compile it in with "options LINUX".

A pile of new syscalls implemented, including getdents(), llseek(),
readv(), writev(), msync(), personality().  The Linux-ELF libraries want
to use some of these.

linux_select() now obeys Linux semantics, ie: returns the time remaining
of the timeout value rather than leaving it the original value.
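
The "time remaining" semantics can be sketched as plain timeval arithmetic.
This is only an illustration under assumptions: timeout_remaining() is a
hypothetical helper, not the emulator's code, and both inputs are assumed to
be normalized (0 <= tv_usec < 1000000).

```c
#include <assert.h>
#include <sys/time.h>

/*
 * Hypothetical helper: given the original timeout and the time actually
 * spent waiting, return what is left, clamped at zero.
 */
static struct timeval
timeout_remaining(struct timeval timeout, struct timeval elapsed)
{
	struct timeval left;

	assert(timeout.tv_usec >= 0 && timeout.tv_usec < 1000000);
	assert(elapsed.tv_usec >= 0 && elapsed.tv_usec < 1000000);

	left.tv_sec = timeout.tv_sec - elapsed.tv_sec;
	left.tv_usec = timeout.tv_usec - elapsed.tv_usec;
	if (left.tv_usec < 0) {		/* borrow one second */
		left.tv_sec--;
		left.tv_usec += 1000000;
	}
	if (left.tv_sec < 0) {		/* waited the full timeout or longer */
		left.tv_sec = 0;
		left.tv_usec = 0;
	}
	return (left);
}
```

A caller following the Linux convention would write this result back to the
user's timeout argument instead of leaving the original value in place.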

Quite a few bugs removed, including incorrect arguments being used in
syscalls..  eg:  mixups between passing the sigset as an int, vs passing
it as a pointer and doing a copyin(), missing return values, unhandled
cases, SIOC* ioctls, etc.

The build for the code has changed.  i386/conf/files now knows how
to build linux_genassym and generate linux_assym.h on the fly.

Supporting changes elsewhere in the kernel:

The user-mode signal trampoline has moved from the U area to immediately
below the top of the stack (below PS_STRINGS).  This allows the different
binary emulations to have their own signal trampoline code (which gets rid
of the hardwired syscall 103 (sigreturn on BSD, syslog on Linux)) and so
that the emulator can provide the exact "struct sigcontext *" argument to
the program's signal handlers.

The sigstack's "ss_flags" now uses SS_DISABLE and SS_ONSTACK flags, which
have the same values as the re-used SA_DISABLE and SA_ONSTACK which are
intended for sigaction only.  This enables the support of a SA_RESETHAND
flag to sigaction to implement the gross SYSV and Linux SA_ONESHOT signal
semantics where the signal handler is reset when it's triggered.
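
From userland, the one-shot behaviour looks roughly like the sketch below.
It uses only the standard sigaction() interface; on_usr1() and
oneshot_was_reset() are hypothetical names chosen for this example, and
SIGUSR1 is an arbitrary choice of signal.

```c
#define _XOPEN_SOURCE 700	/* ask for sigaction() and SA_RESETHAND */
#include <assert.h>		/* only needed if the caller asserts on the result */
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t hits;

static void
on_usr1(int sig)
{
	(void)sig;
	hits++;
}

/*
 * Install a one-shot handler, deliver the signal once, and report whether
 * the disposition was reset to SIG_DFL afterwards.
 * Returns 1 if it was, 0 if not, -1 on a system call failure.
 */
static int
oneshot_was_reset(void)
{
	struct sigaction sa, cur;

	hits = 0;
	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_usr1;
	sa.sa_flags = SA_RESETHAND;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGUSR1, &sa, NULL) != 0)
		return (-1);
	if (raise(SIGUSR1) != 0)	/* handler runs before raise() returns */
		return (-1);
	if (sigaction(SIGUSR1, NULL, &cur) != 0)
		return (-1);
	return (hits == 1 && cur.sa_handler == SIG_DFL);
}
```

A second raise(SIGUSR1) after this point would take the default action
(process termination), which is exactly the behaviour SA_RESETHAND asks for.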

makesyscalls.sh no longer appends the struct sysentvec on the end of the
generated init_sysent.c code.  It's a lot saner to have it in a separate
file rather than trying to update the structure inside the awk script. :-)

At exec time, the dozen bytes or so of signal trampoline code are copied
to the top of the user's stack, rather than obtaining the trampoline code
the old way by getting a clone of the parent's user area.  This allows
Linux and native binaries to freely exec each other without getting
trampolines mixed up.
1996-03-02 19:38:20 +00:00
John Dyson
3eb77c8302 Fix a problem with tracking the modified bit. Eliminate the
ugly inline-asm code, and speed up the page-table-page tracking.
1996-02-25 03:02:53 +00:00
John Dyson
bd7e5f992e Eliminated many redundant vm_map_lookup operations for vm_mmap.
Speed up for vfs_bio -- addition of a routine bqrelse to greatly diminish
	overhead for merged cache.
Efficiency improvement for vfs_cluster.  It used to do a lot of redundant
	calls to cluster_rbuild.
Correct the ordering for vrele of .text and release of credentials.
Use the selective tlb update for 486/586/P6.
Numerous fixes to the size of objects allocated for files.  Additionally,
	fixes in the various pagers.
Fixes for proper positioning of vnode_pager_setsize in msdosfs and ext2fs.
Fixes in the swap pager for exhausted resources.  The pageout code
	will not as readily thrash.
Change the page queue flags (PG_ACTIVE, PG_INACTIVE, PG_FREE, PG_CACHE) into
	page queue indices (PQ_ACTIVE, PQ_INACTIVE, PQ_FREE, PQ_CACHE),
	thereby improving efficiency of several routines.
Eliminate even more unnecessary vm_page_protect operations.
Significantly speed up process forks.
Make vm_object_page_clean more efficient, thereby eliminating the pause
	that happens every 30 seconds.
Make sequential clustered writes B_ASYNC instead of B_DELWRI even in the
	case of filesystems mounted async.
Fix a panic with busy pages when write clustering is done for non-VMIO
	buffers.
1996-01-19 04:00:31 +00:00
Garrett Wollman
0e41ee3037 Convert DDB to new-style option. 1996-01-04 21:13:23 +00:00
Garrett Wollman
db6a20e23e Converted two options over to the new scheme: USER_LDT and KTRACE. 1996-01-03 21:42:35 +00:00
David Greenman
c96819306b Corrected a typo in a comment. 1995-12-19 14:47:41 +00:00
David Greenman
2838c9682a Implemented a (sorely needed for years) double fault handler to catch stack
overflows.
It sure would be nice if there was an unmapped page between the PCB and
the stack (and that the size of the stack was configurable!). With the
way things are now, the PCB will get clobbered before the double fault
handler gets control, making somewhat of a mess of things. Despite this,
it is still fairly easy to poke around in the overflowed stack to figure
out the cause.
1995-12-19 14:30:50 +00:00
Peter Wemm
b1529bda75 GENERIC/LINT: Remove redundant quoting on some option lines.
LINT: add a couple of new/missing/undocumented options
files.i386: add linux code so that you can compile a kernel with static
linux emulation ("options LINUX")
i386/*: use #if defined(COMPAT_LINUX) || defined(LINUX) to enable static
support of linux emulation (just like "IBCS2" makes ibcs2 static)

The main thing this is going to make obvious, is that the LINUX code
(when compiled from LINT) has a lot of warnings, some of which don't look
too pleasant..
1995-12-14 14:35:36 +00:00
Poul-Henning Kamp
5e46340891 Make math_emulators LKMable. 1995-12-14 08:21:33 +00:00
Poul-Henning Kamp
7dfe504fe2 Remove various unused symbols and procedures. 1995-12-09 20:40:43 +00:00
David Greenman
efeaf95a41 Untangled the vm.h include file spaghetti. 1995-12-07 12:48:31 +00:00
Poul-Henning Kamp
4ccc87c594 Remove unused functions and variables, make things static, and other cleanups. 1995-10-28 15:39:31 +00:00
Bruce Evans
029b0fc88f Fix tracing of syscalls. The previous fix required the undocumented
option DDB_NO_LCALLS to stop ddb getting control and broke all ddb
tracing.  Now there is no option and no way for ddb to trace at
address _Xsyscall or to _Xsyscall, but tracing everywhere else
works.  The previous fix did unnecessary things for Linux syscalls.

Don't bother checking that syscall frames are for user mode.

Make debugger traps inside the kernel (except at addresses _Xsyscall
and _Xsyscall+1) fatal if ddb is not configured.  They "can't happen".

Add prototypes.

Remove stupid comments, e.g., /*ARGSUSED*/ for args that are used.
1995-10-09 04:36:01 +00:00
Julian Elischer
00c6cadad3 Submitted by: Juergen Lock <nox@jelal.hb.north.de>
Obtained from: other people on the net ?

1. stepping over syscalls (gdb ni) sends you to DDB, and returned
to the wrong address afterwards, with or without DDB.  patch in
i386/i386/trap.c below.

2. the linux emulator (modload'ed) still causes panics with DIAGNOSTIC,
re-applied a patch posted to one of the lists...
1995-10-04 07:08:04 +00:00
David Greenman
4219d2b2ac A couple of micro optimizations to improve NULL syscall performance by
about 2%.
1995-08-21 18:06:48 +00:00
David Greenman
a705fd3e48 Fix a bug in my disabled version of trap_pfault()...curpcb may be NULL even
when curproc isn't. This condition occurs at system startup and perhaps
at other times.
1995-07-30 17:49:24 +00:00
Peter Wemm
1174c7d121 This fixes a compiler warning, and a cosmetic problem with the linux
emul code when compiling with "options KTRACE".
ktrsyscall() was expecting an array of integers, this was passing the
address of a structure containing an array of integers..
The cosmetic problem was that it was calling the "enter syscall"
trace hook twice - this looks like a cut/paste error/typo.
1995-07-16 14:10:55 +00:00
Joerg Wunsch
446cee6e6d Include ``options POWERFAIL_NMI'' for owners of older (non-apm)
notebooks where a powerfail condition (external power drop; battery
state low) is signalled by an NMI.  Makes it beep instead of panicking.

Reviewed by:	davidg
1995-07-16 10:31:26 +00:00
David Greenman
9e951f36f1 Truncate the fault address to a page boundary when calling vm_fault(). The
last change to fix the fault-twice bug with page tables wasn't quite
complete.
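
Truncating an address to a page boundary amounts to masking off the low
bits.  The sketch below assumes a 4096-byte page; the SKETCH_ macros and
sketch_trunc_page() are hypothetical names for illustration and are not
taken from the kernel headers.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE	4096u	/* assumed i386 page size */
#define SKETCH_PAGE_MASK	(SKETCH_PAGE_SIZE - 1)

/* Hypothetical helper: round an address down to the start of its page. */
static uintptr_t
sketch_trunc_page(uintptr_t va)
{
	return (va & ~(uintptr_t)SKETCH_PAGE_MASK);
}
```

Any address inside a page maps to the same result, so a fault handler that
passes the truncated value always names the page itself rather than an
offset within it.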
1995-07-16 05:39:22 +00:00
David Greenman
4a67eb7121 Fixed bug that caused page tables to be faulted twice instead of once.
Submitted by:	John Dyson
1995-07-14 09:25:51 +00:00
Rodney W. Grimes
d3628763db Merge RELENG_2_0_5 into HEAD 1995-06-11 19:33:05 +00:00
Rodney W. Grimes
9b2e535452 Remove trailing whitespace. 1995-05-30 08:16:23 +00:00
David Greenman
f550a707bd Added a new version of trap_pfault() that disallows kernel page faults
to the user address space unless pcb_onfault is set. The code is currently
commented out because iBCS2 and process debugging parts of the kernel
need to be changed/fixed first.
1995-03-21 07:16:12 +00:00
David Greenman
c6d5f3ac3e Changed some #ifdef DIAGNOSTIC code that I added to be #ifdef DEBUG. 1995-03-21 07:02:51 +00:00
Bruce Evans
b5e8ce9f12 Add and move declarations to fix all of the warnings from `gcc -Wimplicit'
(except in netccitt, netiso and netns) and most of the warnings from
`gcc -Wnested-externs'.  Fix all the bugs found.  There were no serious
ones.
1995-03-16 18:17:34 +00:00
Søren Schmidt
1e1e0b4463 First attempt to run linux binaries. This is only the changes needed to
the generic kernel. The actual emulator is a separate LKM. (not finished
yet, sorry).
Submitted by:	sos@freebsd.org & sef@kithrup.com
1995-02-14 19:23:22 +00:00
David Greenman
b1e4a738e0 Removed unnecessary check for pr_scale in the AST/OWEUPC case. 1995-02-10 06:43:47 +00:00
David Greenman
5a32829d98 Check P_PROFIL flag for profiling rather than pr_scale as it makes more
sense.
1995-02-10 06:25:14 +00:00
David Greenman
fbdfe8ac22 Changed buffer allocation policy (machdep.c)
Moved various pmap 'bit' test/set functions back into real functions; gcc
generates better code at the expense of more of it. (pmap.c)
Fixed a deadlock problem with pv entry allocations (pmap.c)
Added a new, optional function 'pmap_prefault' that does clustered page
table preloading (pmap.c)
Changed the way that page tables are held onto (trap.c).

Submitted by:	John Dyson
1995-01-24 09:56:33 +00:00
Bruce Evans
20415301cd Fix security holes in sigreturn(), ptrace() and procfs. sigreturn()
attempted to check for insecure and fatal eflags and segment
selectors, but missed many cases and got the IOPL check back to
front.  The other syscalls didn't check at all.

sys_process.c, machdep.c:
Only allow PT_WRITE_U to write to the registers (ordinary and FP).

psl.h, locore.s, machdep.c:
Eliminate PSL_MBZ, PSL_MBO and PSL_USERCLR.  We are not supposed
to assume anything about the reserved bits.  Use PSL_USERCHANGE
and PSL_KERNEL instead.  Rename PSL_USERSET to PSL_USER.

exception.s:
Define a private label for use by doreti when returning to user
mode fails.

machdep.c:
In syscalls, allow changing only the eflags that can be changed on
486's in user mode (no longer attempt to allow benign IOPL changes;
allow changing the nasty PSL_NT; don't allow changing the i586
bits).

Don't attempt to check all the cases involving invalid selectors
and %eip's.  Just check for privilege violations and let the invalid
things cause a trap.

procfs_machdep.c:
Call the ptrace register functions to do all the work for reading
and writing ordinary registers and for single stepping.

trap.c:
Ignore traps caused by PSL_NT being set.  Previously, users could
cause a fatal trap in user mode by setting PSL_NT and executing an
iret, and a fatal trap in kernel mode by setting PSL_NT and making
a syscall.  PSL_NT was cleared too late and not in enough modes to
fix the problem.

Make all traps in user mode (except T_NMI) nonfatal.

Recover from traps caused by attempting to load invalid user
registers in doreti by restarting the traps so that they appear to
occur in user mode.
---

Fix bogons that I noticed while fixing the above:

psl.h:
Fix some comments.

Uniformize idempotency ifdef.

exception.s, machdep.c:
Remove rsvd[0-14].  rsvd0 hasn't been reserved since the 486 came
out.  Replace rsvd0 by `align'.  rsvd[0-11] used wrong (magic
non-unique) trap numbers.  Replace rsvd[1-14] by rsvd.

locore.s:
Enable alignment check flag on 486's and 586's.

machdep.c:
Use a better type for kstack[].

Use TFREGP() to find the registers.

Reformat ptrace functions from SEF to something closer to KNF.

procfs_machdep.c:
The wrong pointer to the registers got fixed as a side effect.

Implement reading and writing of FP registers.

/proc/*/*regs now work (only) for processes that are in memory.

Clean up comments.

trap.c, trap.h:
Remove unused trap types.
1995-01-14 13:20:26 +00:00
David Greenman
0d94caffca These changes embody the support of the fully coherent merged VM buffer cache,
much higher filesystem I/O performance, and much better paging performance. It
represents the culmination of over 6 months of R&D.

The majority of the merged VM/cache work is by John Dyson.

The following highlights the most significant changes. Additionally, there are
(mostly minor) changes to the various filesystem modules (nfs, msdosfs, etc) to
support the new VM/buffer scheme.

vfs_bio.c:
Significant rewrite of most of vfs_bio to support the merged VM buffer cache
scheme.  The scheme is almost fully compatible with the old filesystem
interface.  Significant improvement in the number of opportunities for write
clustering.

vfs_cluster.c, vfs_subr.c
Upgrade and performance enhancements in vfs layer code to support merged
VM/buffer cache.  Fixup of vfs_cluster to eliminate the bogus pagemove stuff.

vm_object.c:
Yet more improvements in the collapse code.  Elimination of some windows that
can cause list corruption.

vm_pageout.c:
Fixed it, it really works better now.  Somehow in 2.0, some "enhancements"
broke the code.  This code has been reworked from the ground-up.

vm_fault.c, vm_page.c, pmap.c, vm_object.c
Support for small-block filesystems with merged VM/buffer cache scheme.

pmap.c vm_map.c
Dynamic kernel VM size, now we don't have to pre-allocate excessive numbers of
kernel PTs.

vm_glue.c
Much simpler and more effective swapping code.  No more gratuitous swapping.

proc.h
Fixed the problem that the p_lock flag was not being cleared on a fork.

swap_pager.c, vnode_pager.c
Removal of old vfs_bio cruft to support the past pseudo-coherency.  Now the
code doesn't need it anymore.

machdep.c
Changes to better support the parameter values for the merged VM/buffer cache
scheme.

machdep.c, kern_exec.c, vm_glue.c
Implemented a separate submap for temporary exec string space and another one
to contain process upages. This eliminates all map fragmentation problems
that previously existed.

ffs_inode.c, ufs_inode.c, ufs_readwrite.c
Changes for merged VM/buffer cache.  Add "bypass" support for sneaking in on
busy buffers.

Submitted by:	John Dyson and David Greenman
1995-01-09 16:06:02 +00:00
Bruce Evans
13d7c724ca Obtained from: 1.1.5
Fix single-stepping of emulated FPU instructions.

Don't panic if an FPU instruction is attempted but there is no FPU
and no FPU emulator is configured.
1994-12-24 07:22:58 +00:00
Bruce Evans
ab4bc4b293 Fix selector arg to match the (missing) prototype for sdtossd().
Cosmetic.

Return from trap() if trap_fatal() returns.  trap_fatal() isn't
fatal if you have ddb.  Returning from trap() is usually the right
thing to do and much better than falling through.
1994-10-30 20:25:21 +00:00
Garrett Wollman
09f7992adf Make my ALLDEVS kernel compile (basically, LINT minus a lot of options). 1994-10-21 01:18:38 +00:00
Søren Schmidt
fabbd9b7ce Ouch, fixed bug in errno translation (ibcs2 support). 1994-10-11 22:37:14 +00:00
Søren Schmidt
76d121f2b4 Hmm, only translate errno when doing an actual return.
Reviewed by:	sef@freefall.cdrom.com
1994-10-10 07:33:01 +00:00
Søren Schmidt
c96f129304 Updated to convert errno return in syscall if conversion table present. 1994-10-09 22:02:06 +00:00
Poul-Henning Kamp
3fb3086e98 db_disasm.c: Unused var zapped.
pmap.c: tons of unused vars zapped, various other warnings silenced.
trap.c: unused vars zapped.
vm_machdep.c:  A wrong argument, which by chance did the right thing, was
corrected.
1994-10-08 22:19:51 +00:00
David Greenman
22414e535a Laptop Advanced Power Management support by HOSOKAWA Tatsumi.
Submitted by:	HOSOKAWA Tatsumi
1994-10-01 02:56:21 +00:00
David Greenman
f7d6afc696 Be more careful about dereferencing curproc, p_vmspace, and curpcb,
otherwise the machine will overflow the stack in a recursive fault loop
(causing the machine to spontaneously reboot because of the stack fault
that ultimately happens).

Submitted by:	Inspired by Bruce Evans, but this change is different
		than what he suggested.
1994-09-11 11:26:18 +00:00
Bruce Evans
fe7bb84c74 Remove <machine/eflags.h> and all dependencies on it. eflags.h is just
the Mach/i386 version of the BSD/vax(?) <machine/psl.h>.  The Mach
version has slightly better names for many macros but is now out of
date and little used.  It was originally used even less (for spelling
PSL_T as EFL_TF in <machine/db_machdep.h>).
1994-09-08 11:49:04 +00:00
Bruce Evans
406d45e059 Don't test if a u_int is < 0. The remaining test is sufficient and the
extra one caused a warning.
1994-08-28 16:16:33 +00:00
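Illustration (not part of the original commit): a comparison of the form `x < 0` can never be true for an unsigned type, which is why compilers warn about it and why the upper-bound test alone is enough. The function name and parameters below are hypothetical.

```c
/*
 * Hypothetical bounds check on an unsigned index.  A negative value
 * converted to unsigned becomes a very large number, so the single
 * "idx < limit" comparison already rejects it.
 */
static int
sketch_index_ok(unsigned int idx, unsigned int limit)
{
	return (idx < limit);
}
```

Passing `(unsigned int)-1` with any realistic limit fails the check, so no separate lower-bound test is needed.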
David Greenman
8a129caed5 1) Changed ddb into a option rather than a pseudo-device (use options DDB
in your kernel config now).
2) Added ps ddb function from 1.1.5. Cleaned it up a bit and moved into its
   own file.
3) Added \r handling in db_printf.
4) Added missing memory usage stats to statclock().
5) Added dummy function to pseudo_set so it will be emitted if there
   are no other pseudo declarations.
1994-08-27 16:14:39 +00:00
Søren Schmidt
f3f0ca6051 Changes preparing for iBCS support
Reviewed by:
Submitted by:
1994-08-24 11:52:21 +00:00
Garrett Wollman
f23b4c91c4 Fix up some sloppy coding practices:
- Delete redundant declarations.
- Add -Wredundant-declarations to Makefile.i386 so they don't come back.
- Delete sloppy COMMON-style declarations of uninitialized data in
  header files.
- Add a few prototypes.
- Clean up warnings resulting from the above.

NB: ioconf.c will still generate a redundant-declaration warning, which
is unavoidable unless somebody volunteers to make `config' smarter.
1994-08-18 22:36:09 +00:00
Garrett Wollman
5c8b38d41d Handle NMI's in accordance with data in van Gilluwe book. 1994-08-10 04:39:52 +00:00
David Greenman
03e6c2532f Removed all code related to the pagescan daemon, and changed 'act_count'
adjustments to compensate for a world without the pagescan daemon.
1994-08-01 11:25:45 +00:00
David Greenman
8c481329f7 Fixed minor spelling error. 1994-06-11 05:13:33 +00:00
David Greenman
3c256f5395 trap.c:
Vastly improved trap.c from me. This rewritten version has a variety of
features, among them: higher performance and much higher code quality.

support.s, cpufunc.h:
No longer use gs override to enforce range limits - compare directly
against VM_MAXUSER_ADDRESS instead. The old way caused problems in
preserving the gs selector...and this method is just as fast or faster.
1994-06-06 14:54:41 +00:00
Rodney W. Grimes
26f9a76710 The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch.
Reviewed by:	Rodney W. Grimes
Submitted by:	John Dyson and David Greenman
1994-05-25 09:21:21 +00:00
Gary Clark II
9f82ad3dfb Added ifdef for GPL_MATH_EMULATE to keep the system from panicking when
using it.
1994-04-29 21:39:55 +00:00
David Greenman
2862674874 Make Bruce happy: silently enter ddb on a BPT or trace trap if ddb is
configured in the kernel.
1994-04-07 10:51:00 +00:00
David Greenman
d230622648 New interrupt code from Bruce Evans. In addition to Bruce's attached
list of changes, I've made the following additional changes:

1) i386/include/ipl.h renamed to spl.h as the name conflicts with the
   file of the same name in i386/isa/ipl.h.
2) changed all use of *mask (i.e. netmask, biomask, ttymask, etc) to
   *_imask (net_imask, etc).
3) changed vestige of splnet use in if_is to splimp.
4) got rid of "impmask" completely (Bruce had gotten rid of netmask),
   and are now using net_imask instead.
5) dozens of minor cruft to glue in Bruce's changes.

   These require changes I made to config(8) as well, and thus it must
be rebuilt.

-DG

from Bruce Evans:

sio:
	o No diff is supplied.  Remove the define of setsofttty().  I hope
	  that is enough.

*.s:
	o i386/isa/debug.h no longer exists.  The event counters became too
	  much trouble to maintain.  All function call entry and exception
	  entry counters can be recovered by using profiling kernel (the new
	  profiling supports all entry points; however, it is too slow to
	  leave enabled all the time; it also).  Only BDBTRAP() from debug.h
	  is now used.  That is moved to exception.s.  It might be worth
	  preserving SHOW_BITS() and calling it from _mcount() (if enabled).
	o T_ASTFLT is now only set just before calling trap().
	o All exception handlers set SWI_AST_MASK in cpl as soon as possible
	  after entry and arrange for _doreti to restore it atomically with
	  exiting.  It is not possible to set it atomically with entering
	  the kernel, so it must be checked against the user mode bits in
	  the trap frame before committing to using it.  There is no place
	  to store the old value of cpl for syscalls or traps, so there are
	  some complications restoring it.

Profiling stuff (mostly in *.s):
	o Changes to kern/subr_mcount.c, gcc and gprof are not supplied yet.
	o All interesting labels `foo' are renamed `_foo' and all
	  uninteresting labels `_bar' are renamed `bar'.  A small change
	  to gprof allows ignoring labels not starting with underscores.
	o MCOUNT_LABEL() is to provide names for counters for times spent
	  in exception handlers.
	o FAKE_MCOUNT() is a version of MCOUNT() suitable for exception
	  handlers.  Its arg is the pc where the exception occurred.  The
	  new mcount() pretends that this was a call from that pc to a
	  suitable MCOUNT_LABEL().
	o MEXITCOUNT is to turn off any timer started by MCOUNT().

/usr/src/sys/i386/i386/exception.s:
	o The non-BDB BPTTRAP() macros were doing a sti even when interrupts
	  were disabled when the trap occurred.  The (fixed) sti is
	  actually a no-op unless you have my changes to machdep.c that make
	  the debugger trap gates interrupt gates, but fixing that would
	  make the ifdefs messier.  ddb seems to be unharmed by both
	  interrupts always disabled and always enabled (I had the branch in
	  the fix back to front for some time :-().
	o There is no known pushal bug.
	o tf_err can be left as garbage for syscalls.

/usr/src/sys/i386/i386/locore.s:
	o Fix and update BDE_DEBUGGER support.
	o ENTRY(btext) before initialization was dangerous.
	o Warm boot shot was longer than intended.

/usr/src/sys/i386/i386/machdep.c:
	o DON'T APPLY ALL OF THIS DIFF.  It's what I'm using, but may require
	  other changes.
	  Use the following:
		o Remove aston() and setsoftclock().
	  Maybe use the following:
		o No netisr.h.
		o Spelling fix.
		o Delay to read the Rebooting message.
		o Fix for vm system unmapping a reduced area of memory
		  after bounds_check_with_label() reduces the size of
		  a physical i/o for a partition boundary.  A similar
		  fix is required in kern_physio.c.
		o Correct use of __CONCAT.  It never worked here for non-
		  ANSI cpp's.  Is it time to drop support for non-ANSI?
		o gdt_segs init.  0xffffffffUL is bogus because ssd_limit
		  is not 32 bits.  The replacement may have the same
		  value :-), but is more natural.
		o physmem was one page too low.  Confusing variable names.
	  Don't use the following:
		o Better numbers of buffers.  Each 8K page requires up to
		  16 buffer headers.  On my system, this results in 5576
		  buffers containing [up to] 2854912 bytes of memory.
		  The usual allocation of about 384 buffers only holds
		  192K of disk if you use it on an fs with a block size
		  of 512.
		o gdt changes for bdb.
		o *TGT -> *IDT changes for bdb.
		o #ifdefed changes for bdb.

/usr/src/sys/i386/i386/microtime.s:
	o Use the correct asm macros.  I think asm.h was copied from Mach
	  just for microtime and isn't used now.  It certainly doesn't
	  belong in <sys>.  Various macros are also duplicated in
	  sys/i386/boot.h and libc/i386/*.h.
	o Don't switch to and from the IRR; it is guaranteed to be selected
	  (default after ICU init and explicitly selected in isa.c too, and
	  never changed until the old microtime clobbered it).

/usr/src/sys/i386/i386/support.s:
	o Non-essential changes (none related to spls or profiling).
	o Removed slow loads of %gs again.  The LDT support may require
	  not relying on %gs, but loading it is not the way to fix it!
	  Some places (copyin ...) forgot to load it.  Loading it clobbers
	  the user %gs.  trap() still loads it after certain types of
	  faults so that fuword() etc can rely on it without loading it
	  explicitly.  Exception handlers don't restore it.  If we want
	  to preserve the user %gs, then the fastest method is to not
	  touch it except for context switches.  Comparing with
	  VM_MAXUSER_ADDRESS and branching takes only 2 or 4 cycles on
	  a 486, while loading %gs takes 9 cycles and using it takes
	  another.
	o Fixed a signed branch to unsigned.

/usr/src/sys/i386/i386/swtch.s:
	o Move spl0() outside of idle loop.
	o Remove cli/sti from idle loop.  sw1 does a cli, and in the
	  unlikely event of an interrupt occurring and whichqs becoming
	  zero, sw1 will just jump back to _idle.
	o There's no spl0() function in asm any more, so use splz().
	o swtch() doesn't need to be superaligned, at least with the
	  new mcounting.
	o Fixed a signed branch to unsigned.
	o Removed astoff().

/usr/src/sys/i386/i386/trap.c:
	o The decentralized extern decls were inconsistent, of course.
	o Fixed typo MATH_EMULTATE in comments. */
	o Removed unused variables.
	o Old netmask is now impmask; print it instead.  Perhaps we
	  should print some of the new masks.
	o BTW, trap() should not print anything for normal debugger
	  traps.

/usr/src/sys/i386/include/asmacros.h:
	o DON'T APPLY ALL OF THIS DIFF.  Just use some of the null macros
	  as necessary.

/usr/src/sys/i386/include/cpu.h:
	o CLKF_BASEPRI() changes since cpl == SWI_AST_MASK is now normal
	  while the kernel is running.
	o Don't use var++ to set boolean variables.  It fails after a mere
	  4G times :-) and is slower than storing a constant on [3-4]86s.
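
	Illustration (not part of the original commit): the wraparound is easier to see with a narrower type. The sketch below uses an `unsigned char`, so the flag wraps to zero after 256 increments rather than after 2^32; the function names are hypothetical.

```c
/* Setting a flag by incrementing: wraps back to 0 once the type overflows. */
static unsigned char
sketch_set_by_increment(unsigned char flag, unsigned int times)
{
	while (times-- > 0)
		flag++;
	return (flag);
}

/* Setting a flag by storing a constant: stays 1 no matter how often. */
static unsigned char
sketch_set_by_store(unsigned char flag, unsigned int times)
{
	while (times-- > 0)
		flag = 1;
	return (flag);
}
```

	After 256 sets, the incremented flag reads as false again while the stored flag still reads as true.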

/usr/src/sys/i386/include/cpufunc.h:
	o DON'T APPLY ALL OF THIS DIFF.  You need mainly the include of
	  <machine/ipl.h>.  Unfortunately, <machine/ipl.h> is needed by
	  almost everything for the inlines.

/usr/src/sys/i386/include/ipl.h:
	o New file.  Defines spl inlines and SWI macros and declares most
	  variables related to hard and soft interrupt masks.

/usr/src/sys/i386/isa/icu.h:
	o Moved definitions to <machine/ipl.h>

/usr/src/sys/i386/isa/icu.s:
	o Software interrupts (SWIs) and delayed hardware interrupts (HWIs)
	  are now handled uniformly, and dispatching them from splx() is
	  more like dispatching them from _doreti.  The dispatcher is
	  essentially (*handler[ffs(ipending & ~cpl)])().
	o More care (not quite enough) is taken to avoid unbounded nesting
	  of interrupts.
	o The interface to softclock() is changed so that a trap frame is
	  not required.
	o Fast interrupt handlers are now handled more uniformly.
	  Configuration is still too early (new handlers would require
	  bits in <machine/ipl.h> and functions to vector.s).
	o splnnn() and splx() are no longer here; they are inline functions
	  (could be macros for other compilers).  splz() is the nontrivial
	  part of the old splx().
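
	Illustration (not part of the original commit): the dispatch idea described for icu.s can be sketched in C as a loop over pending bits that are not masked by the current priority level. Everything here is an assumption made for illustration: the `sketch_` names, the 32-entry table, the zero-based indexing (hence the `- 1` after `ffs()`), and the two logging handlers. The real code is assembly and differs in detail.

```c
#include <stddef.h>
#include <strings.h>	/* ffs() */

#define SKETCH_NIRQ 32

typedef void (*sketch_handler_t)(void);

static unsigned int sketch_ipending;	/* pending interrupt bits */
static unsigned int sketch_cpl;		/* currently masked bits */
static sketch_handler_t sketch_handler[SKETCH_NIRQ];

/* Hypothetical handlers that record the order in which they ran. */
static int sketch_log[SKETCH_NIRQ];
static int sketch_nlog;
static void sketch_h0(void) { sketch_log[sketch_nlog++] = 0; }
static void sketch_h3(void) { sketch_log[sketch_nlog++] = 3; }

/* Run every pending, unmasked handler, lowest-numbered bit first. */
static void
sketch_dispatch(void)
{
	unsigned int ready;
	int bit;

	while ((ready = sketch_ipending & ~sketch_cpl) != 0) {
		bit = ffs((int)ready) - 1;	/* ffs() is 1-based */
		sketch_ipending &= ~(1u << bit);
		if (sketch_handler[bit] != NULL)
			sketch_handler[bit]();
	}
}
```

	Bits that are set in `sketch_cpl` stay pending and are only dispatched once the mask is lowered, which mirrors the role the message assigns to splz().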

/usr/src/sys/i386/isa/ipl.h
	o New file.  Supposed to have only bus-dependent stuff.  Perhaps
	  the h/w masks should be declared here.

/usr/src/sys/i386/isa/isa.c:
	o DON'T APPLY ALL OF THIS DIFF.  You need only things involving
	  *mask and *MASK and comments about them.  netmask is now a pure
	  software mask.  It works like the softclock mask.

/usr/src/sys/i386/isa/vector.s:
	o Reorganize AUTO_EOI* macros.
	o Option FAST_INTR_HANDLER_USERS_ES for people who don't trust
	  fastintr handlers.
	o fastintr handlers need to metamorphose into ordinary interrupt
	  handlers if their SWI bit has become set.  Previously, sio had
	  unintended latency for handling output completions and input
	  of SLIP framing characters because this was not done.

/usr/src/sys/net/netisr.h:
	o The machine-dependent stuff is now imported from <machine/ipl.h>.

/usr/src/sys/sys/systm.h
	o DON'T APPLY ALL OF THIS DIFF.  You need mainly the different
	  splx() prototype.  The spl*() prototypes are duplicated as
	  inlines in <machine/ipl.h> but they need to be duplicated here
	  in case there are no inlines.  I sent systm.h and cpufunc.h
	  to Garrett.  We agree that spl0 should be replaced by splnone
	  and not the other way around like I've done.

/usr/src/sys/kern/kern_clock.c
	o splsoftclock() now lowers cpl so the direct call to softclock()
	  works as intended.
	o softclock() interface changed to avoid passing the whole frame
	  (some machines may need another change for profile_tick()).
	o profiling renamed _profiling to avoid ANSI namespace pollution.
	  (I had to improve the mcount() interface and may as well fix it.)
	  The GUPROF variant doesn't actually reference profiling here,
	  but the 'U' in GUPROF should mean to select the microtimer
	  mcount() and not change the interface.
1994-04-02 07:00:53 +00:00
David Greenman
ed7fcbd079 From John Dyson: performance improvements to the new bounce buffer
code.
1994-03-24 23:12:48 +00:00
David Greenman
943a66f340 Performance improvements from John Dyson.
1) A new mechanism has been added to prevent pages from being paged
	out called "vm_page_hold". Similar to vm_page_wire, but
	much lower overhead.
2) Scheduling algorithm has been changed to improve interactive
	performance.
3) Paging algorithm improved.
4) Some vnode and swap pager bugs fixed.
1994-03-14 21:54:03 +00:00
David Greenman
04f1835605 1) "Pre-faulting" in of pages into process address space
		Eliminates vm_fault overhead on process startup and
		mmap referenced data for in-memory pages.

		(process startup time using in-memory segments *much* faster)

	2)	Even more efficient pmap code.  Code partially cleaned up.
		More comments yet to follow.

		(generally more efficient pte management)

	3)	Pageout clustering ( in addition to the FreeBSD V1.1 pagein
		clustering.)

		(much faster paging performance on non-write behind disk
		subsystems, slightly faster performance on other systems.)

	4)	Slightly changed vm_pageout code for more efficiency and
		better statistics.  Also, resist swapout a little more.

		(less likely to pageout a recently used page)

	5)	Slight improvement to the page table page trap efficiency.

		(generally faster system VM fault performance)

	6)	Defer creation of unnamed anonymous regions pager until needed.

		(speeds up shared memory bss creation)

	7)	Remove possible deadlock from swap_pager initialization.

	8)	Enhanced procfs to provide "vminfo" about vm objects and user
		pmaps.

	9)	Increased MCLSHIFT/MCLBYTES from 2K to 4K to improve net &
		socket performance and to prepare for things to come.

John Dyson
dyson@implode.root.com
David Greenman
davidg@root.com
1994-03-07 11:38:49 +00:00
David Greenman
b9d60b3f59 Fixed bugs in stack grow code, and moved it back into a separate function
like it was originally. Also added back call to "grow" in sendsig now
that this routine actually works.
1994-02-08 09:26:04 +00:00
David Greenman
0172c219f1 Minor cleanup. Decode state information better in the case of a fatal
trap.
1994-02-01 23:07:35 +00:00
David Greenman
d64f660fac Improvements mostly from John Dyson, with a little bit from me.
* Removed pmap_is_wired
* added extra cli/sti protection in idle (swtch.s)
* slight code improvement in trap.c
* added lots of comments
* improved paging and other algorithms in VM system
1994-01-17 09:32:32 +00:00
David Greenman
7f8cb36869 "New" VM system from John Dyson & myself. For a run-down of the
major changes, see the log of any affected file in the sys/vm
directory (swap_pager.c for instance).
1994-01-14 16:25:31 +00:00
David Greenman
c8a13ecd00 Convert syscall to trapframe. Based on work done by John Brezak. 1994-01-03 07:55:47 +00:00
Garrett Wollman
aaf08d94ca Make everything compile with -Wtraditional. Make it easier to distribute
a binary link-kit.  Make all non-optional options (pagers, procfs) standard,
and update LINT to reflect new symtab requirements.

NB: -Wtraditional will henceforth be forgotten.  This editing pass was
primarily intended to detect any constructions where the old code might
have been relying on traditional C semantics or syntax.  These were all
fixed, and the result of fixing some of them means that -Wall is now a
realistic possibility within a few weeks.
1993-12-19 00:55:01 +00:00
David Greenman
6d01f02e51 1) Added proc file system from Paul Kranenburg with changes from
John Dyson to make it reliably work under FreeBSD.
2) Added and enabled PROCFS in the GENERICxx and LINT kernels.
3) New execve() from me. Still work to be done here, but this version
	works well and is needed before other changes can be made. For
	a description of the design behind this, see freebsd-arch or
	ask me.
4) Rewrote stack fault code; made user stack VM grow as needed rather
	than all up front; improves performance a little and reduces
	process memory requirements.
5) Incorporated fix from Gene Stark to fault/wire a user page table
	page to fix a problem in copyout. This is a temporary fix and
	is not appropriate for pageable page tables. For a description
	of the problem, see Gene's post to the freebsd-hackers mailing
	list.
6) Tighten up vm_page struct to reduce memory requirements for it. ifdef
	pager page lock code as it's not being used currently.
7) Introduced new element to vmspace struct - vm_minsaddr; initial
	(minimum) stack address. Complement to vm_maxsaddr.
8) Added a panic if the allocation for process u-pages fails.
9) Improve performance and accuracy of kernel profiling by putting in
	a little inline assembly instead of spl().
10) Made serial console with sio driver work. Still has problems with
	serial input, but is almost useable.
11) Added -Bstatic to SYSTEM_LD in Makefile.i386 so that kernels will
	build properly with the new ld.
1993-12-12 12:22:57 +00:00
Andrew Moore
05e634ef64 From: Jeffrey Hsu <hsu@soda.berkeley.edu>
The following patch adds the addr argument to signal handlers.

The kernel with the patch is no more and no less in compliance or in
violation of POSIX and ANSI C than the kernel before the patch.

The added functionality this addr argument provides is quite useful.  It
enables an entire class of algorithms which use mprotect to trace memory
references.  Beside garbage collectors, I have heard of this technique being
applied to debuggers and profilers.  The only benchmarking I've performed is
using akcl to compile maxima:  without the kernel patch, it takes 7 hours to
compile maxima, while with stratified garbage collection, it only takes 50
minutes.

Basically, I can't think of a reason not to add the addr argument and there
is a compelling need for it.

If you find the patch acceptable, please let me know so I can send my
FreeBSD akcl config files to wfs for inclusion in the core akcl release.
The old 386BSD config files there won't work on either NetBSD or FreeBSD.
1993-12-03 05:10:08 +00:00
David Greenman
a9627169cd Patch from Gene Stark:
Subject: Page fault in PTE area fails in copyout
Index: sys/i386/i386/trap.c FreeBSD-1.0.2

Description:
	Reading files of several megabytes into Emacs, or many small
	files all at once, would fail with "IO error - bad address".

Repeat-By:
	The bug can be exercised by a test program that malloc()'s
	a 5MB chunk of memory, and then, without accessing the memory
	first, filling it with data from a file using read().
	(I read 64k chunks from /dev/wd0d into successive 64k regions
	of the 5MB chunk.)  The read() will fail with EFAULT at the first
	virtual address boundary that is a multiple of 0x400000.

Fix:
	The problem was code in sys/i386/i386/trap.c that tries to
	figure out what kind of trap occurred and to handle it appropriately.
	It was interpreting any page fault with virtual address
	>= vm->vm_maxsaddr as being a user stack segment fault.
	In fact, addresses >= USRSTACK are in the user structure/PTE area,
	and if they are handled as stack faults, the proper PTE will
	not be paged in when it is supposed to be.  This situation comes
	up in copyout() and copyoutstr(), if PTE's are accessed for the
	first time ever.  The page fault on accessing the nonexistent PTE
	is mishandled as a stack fault, and then the fault that occurs on
	the subsequent access to the page itself causes copyout to fail
	with EFAULT.
1993-11-28 09:28:54 +00:00
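Illustration (not part of the original commit): the fix described above narrows the stack-fault test from "at or above vm_maxsaddr" to the range that actually belongs to the user stack. The sketch below is an assumption about the shape of that test, since the patch itself is not shown here. `maxsaddr` stands in for vm->vm_maxsaddr and `usrstack` for USRSTACK; the function name is hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical classifier: treat a fault as a stack fault only when
 * the address lies inside [maxsaddr, usrstack).  The earlier test,
 * "va >= maxsaddr" alone, also matched the user structure/PTE area
 * at and above usrstack.
 */
static int
sketch_is_stack_fault(uint32_t va, uint32_t maxsaddr, uint32_t usrstack)
{
	return (va >= maxsaddr && va < usrstack);
}
```

Under this test, a fault in the PTE area falls through to the ordinary page-fault path, so the page table page can be brought in before copyout() retries.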
Garrett Wollman
381fe1aaf4 Make the LINT kernel compile with -W -Wreturn-type -Wcomment -Werror, and
add same (sans -Werror) to Makefile for future compilations.
1993-11-25 01:38:01 +00:00
David Greenman
0967373e1c First steps in rewriting locore.s, and making info useful
when the machine panics.

i386/i386/locore.s:
1) got rid of most .set directives that were being used like
	#define's, and replaced them with appropriate #define's in
	the appropriate header files (accessed via genassym).
2) added comments to header inclusions and global definitions,
	and global variables
3) replaced some hardcoded constants with cpp defines (such as
	PDESIZE and others)
4) aligned all comments to the same column to make them easier to
	read
5) moved macro definitions for ENTRY, ALIGN, NOP, etc. to
	/sys/i386/include/asmacros.h
6) added #ifdef BDE_DEBUGGER around all of Bruce's debugger code
7) added new global '_KERNend' to store last location+1 of kernel
8) cleaned up zeroing of bss so that only bss is zeroed
9) fix zeroing of page tables so that it really does zero them all
	- not just if they follow the bss.
10) rewrote page table initialization code so that 1) works correctly
	and 2) write protects the kernel text by default
11) properly initialize the kernel page directory, upages, p0stack PT,
	and page tables. The previous scheme was more than a bit
	screwy.
12) change allocation of virtual area of IO hole so that it is
	fixed at KERNBASE + 0xa0000. The previous scheme put it
	right after the kernel page tables and then later expected
	it to be at KERNBASE +0xa0000
13) change multiple bogus settings of user read/write of various
	areas of kernel VM - including the IO hole; we should never
	be accessing the IO hole in user mode through the kernel
	page tables
14) split kernel support routines such as bcopy, bzero, copyin,
	copyout, etc. into a separate file 'support.s'
15) split swtch and related routines into a separate 'swtch.s'
16) split routines related to traps, syscalls, and interrupts
	into a separate file 'exception.s'
17) remove some unused global variables from locore that got
	inserted by Garrett when he pulled them out of some .h
	files.

i386/isa/icu.s:
1) clean up global variable declarations
2) move in declaration of astpending and netisr

i386/i386/pmap.c:
1) fix calculation of virtual_avail. It previously was calculated
	to be right in the middle of the kernel page tables - not
	a good place to start allocating kernel VM.
2) properly allocate kernel page dir/tables etc out of kernel map
	- previously only took out 2 pages.

i386/i386/machdep.c:
1) modify boot() to print a warning that the system will reboot in
	PANIC_REBOOT_WAIT_TIME amount of seconds, and let the user
	abort with a key on the console. The machine will wait
	forever if a key is typed before the reboot. The default is
	15 seconds, but can be set to 0 to mean don't wait at all,
	-1 to mean wait forever, or any positive value to wait for
	that many seconds.
2) print "Rebooting..." just before doing it.
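
Illustration (not part of the original commit): the PANIC_REBOOT_WAIT_TIME semantics listed above can be summarized as a small decision function. The enum, the function name, and the treatment of negative values other than -1 (handled here like -1) are assumptions; the message does not say what those values do.

```c
enum sketch_wait {
	SKETCH_NO_WAIT,		/* reboot immediately */
	SKETCH_WAIT_FOREVER,	/* never reboot automatically */
	SKETCH_WAIT_TIMED	/* reboot after wait_time seconds */
};

/*
 * wait_time follows the message: 0 means don't wait, -1 means wait
 * forever, a positive value is a delay in seconds.  key_pressed
 * models the user typing a key on the console during the delay.
 */
static enum sketch_wait
sketch_reboot_wait(int wait_time, int key_pressed)
{
	if (wait_time == 0)
		return (SKETCH_NO_WAIT);
	if (wait_time < 0 || key_pressed)
		return (SKETCH_WAIT_FOREVER);
	return (SKETCH_WAIT_TIMED);
}
```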

kern/subr_prf.c:
1) remove PANICWAIT as it is deprecated by the change to machdep.c

i386/i386/trap.c:
1) add table of trap type strings and use it to print a real trap/
	panic message rather than just a number. Lots of work to
	be done here, but this is the first step. Symbolic traceback
	is in the TODO.
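
Illustration (not part of the original commit): a table of trap type strings is typically an array indexed by trap number with a bounds check for unknown values. The entries and their numbering below are made up for the example and do not reproduce the kernel's table; the `sketch_` names are hypothetical.

```c
#include <stddef.h>

/* Illustrative entries only; not the kernel's real trap numbering. */
static const char *const sketch_trap_msg[] = {
	"",			/* 0: unused in this sketch */
	"breakpoint",		/* 1 */
	"page fault",		/* 2 */
	"arithmetic trap"	/* 3 */
};

#define SKETCH_NTRAPS (sizeof(sketch_trap_msg) / sizeof(sketch_trap_msg[0]))

/* Map a trap number to a message, falling back for unknown values. */
static const char *
sketch_trap_name(unsigned int type)
{
	if (type == 0 || type >= SKETCH_NTRAPS)
		return ("unknown trap");
	return (sketch_trap_msg[type]);
}
```

A panic path could then print the returned string alongside the numeric trap type instead of the number alone.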

i386/i386/Makefile.i386:
1) add support in to build support.s, exception.s and swtch.s

...and various changes to various header files to make all of the
	above happen.
1993-11-13 02:25:21 +00:00
David Greenman
d19eeaea76 splnone()'s in the trap code can be deadly. Save/restore previous priority
instead.
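
Why unconditionally dropping the priority is dangerous can be shown
with a small user-space simulation. Everything here is a stand-in: the
sim_* functions, the integer level, and TRAP_IPL are hypothetical and
only mimic the save/restore pattern; they are not the kernel's spl
routines.

```c
/* Simulated interrupt priority level; a stand-in for the real mask. */
static int cur_ipl = 0;

#define TRAP_IPL	5	/* hypothetical level needed by the handler */

void sim_set_ipl(int level) { cur_ipl = level; }
int sim_current_ipl(void) { return (cur_ipl); }

/* Raise the level if needed and return the previous level. */
static int
sim_splraise(int level)
{
	int old = cur_ipl;

	if (level > cur_ipl)
		cur_ipl = level;
	return (old);
}

/* Restore a level previously returned by sim_splraise(). */
static void
sim_splx(int old)
{
	cur_ipl = old;
}

/* Risky pattern: drops to level 0 even if the caller had it raised. */
void
trap_body_splnone(void)
{
	(void)sim_splraise(TRAP_IPL);
	/* ... handler work ... */
	cur_ipl = 0;
}

/* Safer pattern: put back whatever level was in force on entry. */
void
trap_body_restoring(void)
{
	int s = sim_splraise(TRAP_IPL);

	/* ... handler work ... */
	sim_splx(s);
}
```

If the trap arrives while the level is 3, the restoring variant leaves
it at 3, while the other variant leaves it at 0 and exposes the
interrupted code to interrupts it had blocked.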
1993-11-04 15:05:41 +00:00
Christoph Robitschko
d73546248a Modified the "rude stack hack" so that it only applies to addresses within
the stack area and not memory above VM_MAXUSER_ADDRESS.
That way, copyout and friends now work for pages whose page table entries
have not yet been allocated/been paged out.
1993-11-01 11:51:29 +00:00
Rodney W. Grimes
960173b9b2 genassym.c:
Remove NKMEMCLUSTERS, it is no longer defined or used.

locore.s:
	Fix comment on PTDpde and APTDpde to be pde instead of pte
	Add new equation for calculating location of Sysmap
	Remove Bill's old #ifdef garbage for counting up memory,
	that stuff will never be made to work and was just cluttering
	up the file.

	Add code that places the PTD, page table pages, and kernel
	stack below the 640k ISA hole if there is room for it, otherwise
	put this stuff all at 1MB.  This fixes the 28K bogusity in
	the boot blocks, that can now go away!

	Fix the calculation of where first is to be dependent on
	NKPDE so that we can skip over the above mentioned areas.
	The 28K thing is now 44K in size due to the increase in
	kernel virtual memory space, but since we no longer have
	to worry about that this is no big deal.

	Use if NNPX > 0 instead of ifdef NPX for floating point code.

machdep.c
	Change the calculation for the buffer cache to be
	20% of all memory above 2MB and add back the upper limit
	of 2/5 of the VM_KMEM_SIZE so that we do not eat ALL
	of the kernel memory space on large memory machines, note
	that this will not even come into effect unless you have
	more than 32MB.  The current buffer cache limit is 6.7MB
	due to this calculation.
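
	The sizing rule above works out as in the following sketch. The
	function name bufcache_bytes() and its parameters are hypothetical,
	the result for machines with 2MB or less is an assumption (the
	message does not say), and VM_KMEM_SIZE is passed in rather than
	taken from the kernel headers.

```c
#include <stddef.h>

#define MB	(1024UL * 1024UL)

/*
 * Buffer cache size in bytes: 20% of memory above 2MB, capped at
 * 2/5 of the kernel memory size.  Returning 0 for machines with
 * 2MB or less is an assumption of this sketch.
 */
size_t
bufcache_bytes(size_t physmem, size_t vm_kmem_size)
{
	size_t want, cap;

	if (physmem <= 2 * MB)
		return (0);

	want = (physmem - 2 * MB) / 5;	/* 20% of memory above 2MB */
	cap = vm_kmem_size / 5 * 2;	/* 2/5 of VM_KMEM_SIZE */

	return (want < cap ? want : cap);
}
```

	Assuming a 16MB kernel memory size, the cap is 6710886 bytes
	(about 6.7 million), and the 20% term only reaches it once
	physical memory is around 34MB, which is consistent with the
	"more than 32MB" remark.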

	It seems that we were erroneously allocating bufpages pages
	for buffer_map.  buffer_map is UNUSED in this implementation
	of the buffer cache, but since the map is referenced in
	several if statements a quick fix was to simply allocate
	1 vm page (but no real memory) to it.

pmap.h
	Remove rcsid, don't want them in the kernel files!

	Removed some cruft inside an #ifdef DEBUGx that caused
	compiler errors if you were compiling this for debug.

	Use the #defines for PD_SHIFT and PG_SHIFT in place of
	constants.
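
	As an illustration of using the shift #defines rather than bare
	numbers, the sketch below splits a 32-bit virtual address into
	its page directory and page table indices. The values 22 and 12
	are the standard i386 two-level paging shifts and are assumed
	here rather than copied from pmap.h; the helper names are
	hypothetical.

```c
#include <stdint.h>

#define PG_SHIFT	12	/* assumed: 4K pages */
#define PD_SHIFT	22	/* assumed: each page directory entry maps 4MB */

/* Number of page table entries covered by one page directory entry. */
#define PTES_PER_PDE	(1u << (PD_SHIFT - PG_SHIFT))

/* Index into the page directory for a virtual address. */
static inline uint32_t
pde_index(uint32_t va)
{
	return (va >> PD_SHIFT);
}

/* Index into the page table selected by that directory entry. */
static inline uint32_t
pte_index(uint32_t va)
{
	return ((va >> PG_SHIFT) & (PTES_PER_PDE - 1));
}
```

	Written this way, a change to the page or directory size only
	touches the two #defines.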

trap.c:
	Remove patch kit header and rcsid, fix $Id$.
	Now include "npx.h" and use NNPX for controlling the
	floating point code.

	Remove a now completely invalid check for a maximum virtual
	address, the virtual address now ends at 0xFFFFFFFF so
	there is no more MAX!!  (Thanks David, I completely missed
	that one!)

vm_machdep.c
	Remove patch kit header and rcsid, fix $Id$.
	Now include "npx.h" and use NNPX for controlling the
	floating point code.

	Replace several 0xFE00000 constants with KERNBASE
1993-10-15 10:34:29 +00:00