46 Commits

Author SHA1 Message Date
gallatin
1e1aef3713 remove bogus check -- for kernel threads we fork off of proc0, not curproc
This was causing panics when modules which create kthreads were loaded
after boot.

pointed out by: jake, jhb
2001-03-15 02:32:26 +00:00
jhb
83d74ad162 Use the proc lock to protect p_pptr when waking up our parent in cpu_exit()
and remove the mpfixme() message that is now fixed.
2001-03-07 03:20:15 +00:00
jhb
5eea0dc69d Rename switch_trampoline() to fork_trampoline() on the alpha and ia64.
Suggested by:	dfr
2001-02-22 16:56:53 +00:00
jhb
27efeb0d30 - Don't call clear_resched() in userret(), instead, clear the resched flag
in mi_switch() just before calling cpu_switch() so that the first switch
  after a resched request will satisfy the request.
- While I'm at it, move a few things into mi_switch() and out of
  cpu_switch(), specifically set the p_oncpu and p_lastcpu members of
  proc in mi_switch(), and handle the sched_lock state change across a
  context switch in mi_switch().
- Since cpu_switch() no longer handles the sched_lock state change, we
  have to setup an initial state for sched_lock in fork_exit() before we
  release it.
2001-02-20 05:26:15 +00:00
bmilekic
f364d4ac36 Change and clean the mutex lock interface.
mtx_enter(lock, type) becomes:

mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)

similarily, for releasing a lock, we now have:

mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.
We change the caller interface for the two different types of locks
because the semantics are entirely different for each case, and this
makes it explicitly clear and, at the same time, it rids us of the
extra `type' argument.

The enter->lock and exit->unlock change has been made with the idea
that we're "locking data" and not "entering locked code" in mind.

Further, remove all additional "flags" previously passed to the
lock acquire/release routines with the exception of two:

MTX_QUIET and MTX_NOSWITCH

The functionality of these flags is preserved and they can be passed
to the lock/unlock routines by calling the corresponding wrappers:

mtx_{lock, unlock}_flags(lock, flag(s)) and
mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN
locks, respectively.

Re-inline some lock acq/rel code; in the sleep lock case, we only
inline the _obtain_lock()s in order to ensure that the inlined code
fits into a cache line. In the spin lock case, we inline recursion and
actually only perform a function call if we need to spin. This change
has been made with the idea that we generally tend to avoid spin locks
and that also the spin locks that we do have and are heavily used
(i.e. sched_lock) do recurse, and therefore in an effort to reduce
function call overhead for some architectures (such as alpha), we
inline recursion for this case.

Create a new malloc type for the witness code and retire from using
the M_DEV type. The new type is called M_WITNESS and is only declared
if WITNESS is enabled.

Begin cleaning up some machdep/mutex.h code - specifically updated the
"optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN
and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently
need those.

Finally, caught up to the interface changes in all sys code.

Contributors: jake, jhb, jasone (in no particular order)
2001-02-09 06:11:45 +00:00
jhb
17984f33e4 Update some comments, s0 in the pcb of a child returning from fork1() is
now passed in as a0 to fork_exit() and and s2 is passed in as a1.
2001-01-26 23:32:38 +00:00
jhb
af00dc8eef - Change fork_exit() to take a pointer to a trapframe as its 3rd argument
instead of a trapframe directly.  (Requested by bde.)
- Convert the alpha switch_trampoline to call fork_exit() and use the MI
  fork_return() instead of child_return().
- Axe child_return().
2001-01-24 21:59:25 +00:00
jake
fa7a58ab48 Protect proc.p_pptr and proc.p_children/p_sibling with the
proctree_lock.

linprocfs not locked pending response from informal maintainer.

Reviewed by:	jhb, -smp@
2000-12-23 19:43:10 +00:00
gallatin
d874e25bc3 acquire/release Giant in vm_page_zero_idle(), like on i386
Discused with: jhb
2000-12-01 18:55:58 +00:00
dfr
da6f7f0f11 Convert various calls to splhigh() to disable_intr() since splhigh() is
now a no-op.
2000-11-19 12:28:42 +00:00
jhb
4b3e3264c7 Don't perform an mi_switch() when we release Giant during cpu_exit(). We
are about to call cpu_switch() anyways.

Found by:	witness
2000-11-15 19:44:38 +00:00
jhb
ff18363a3e - Overhaul the software interrupt code to use interrupt threads for each
type of software interrupt.  Roughly, what used to be a bit in spending
  now maps to a swi thread.  Each thread can have multiple handlers, just
  like a hardware interrupt thread.
- Instead of using a bitmask of pending interrupts, we schedule the specific
  software interrupt thread to run, so spending, NSWI, and the shandlers
  array are no longer needed.  We can now have an arbitrary number of
  software interrupt threads.  When you register a software interrupt
  thread via sinthand_add(), you get back a struct intrhand that you pass
  to sched_swi() when you wish to schedule your swi thread to run.
- Convert the name of 'struct intrec' to 'struct intrhand' as it is a bit
  more intuitive.  Also, prefix all the members of struct intrhand with
  'ih_'.
- Make swi_net() a MI function since there is now no point in it being
  MD.

Submitted by:	cp
2000-10-25 05:19:40 +00:00
jhb
d944886e4d Catch up to moving headers:
- machine/ipl.h -> sys/ipl.h
- machine/mutex.h -> sys/mutex.h
2000-10-20 07:58:15 +00:00
dfr
0e1317469e Clear pcb_schednest in cpu_fork() for the child process. This is
is necessary since the child's call stack only includes one recursive
hold of sched_lock.
2000-10-03 08:03:03 +00:00
jasone
769e0f974d Major update to the way synchronization is done in the kernel. Highlights
include:

* Mutual exclusion is used instead of spl*().  See mutex(9).  (Note: The
  alpha port is still in transition and currently uses both.)

* Per-CPU idle processes.

* Interrupts are run in their own separate kernel threads and can be
  preempted (i386 only).

Partially contributed by:	BSDi (BSD/OS)
Submissions by (at least):	cp, dfr, dillon, grog, jake, jhb, sheldonh
2000-09-07 01:33:02 +00:00
gallatin
ccfeb47d98 Support bounce buffers for ISA DMA on the alpha. This is required for the
irongate chipset (used in the UP1000) which does not support scatter/gather
DMA.  We'll still use scatter gather if the core logic chipset supports it.

Reviewed by: dfr
2000-06-19 18:41:27 +00:00
alc
4fc801a857 cpu_fork(): Check "flags" before dereferencing "p2". Otherwise,
the call "vm_fork(p1, 0, flags);" early in fork1 can cause a kernel
panic.
2000-06-11 06:22:01 +00:00
phk
36c3965ff9 Separate the struct bio related stuff out of <sys/buf.h> into
<sys/bio.h>.

<sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall
not be made a nested include according to bdes teachings on the
subject of nested includes.

Diskdrivers and similar stuff below specfs::strategy() should no
longer need to include <sys/buf.> unless they need caching of data.

Still a few bogus uses of struct buf to track down.

Repocopy by:    peter
2000-05-05 09:59:14 +00:00
phk
a246e10f55 Remove B_READ, B_WRITE and B_FREEBUF and replace them with a new
field in struct buf: b_iocmd.  The b_iocmd is enforced to have
exactly one bit set.

B_WRITE was bogusly defined as zero giving rise to obvious coding
mistakes.

Also eliminate the redundant struct buf flag B_CALL, it can just
as efficiently be done by comparing b_iodone to NULL.

Should you get a panic or drop into the debugger, complaining about
"b_iocmd", don't continue.  It is likely to write on your disk
where it should have been reading.

This change is a step in the direction towards a stackable BIO capability.

A lot of this patch were machine generated (Thanks to style(9) compliance!)

Vinum users:  Greg has not had time to test this yet, be careful.
2000-03-20 10:44:49 +00:00
gallatin
e9bf4a720b The kernel side of per-process unaligned access control (printing, fixing &
delivering SIGBUS).  This will allow a non-superuser to control unaligned
access behaviour on a per-process basis once a userland control program
(uac) is written.

Reviewed by: obrien
Tested by:   obrien
2000-01-16 07:07:33 +00:00
peter
3524c347b0 Make this compile again. (missing #include for RFPROC) 1999-12-06 18:12:29 +00:00
luoqi
5c9244cd12 User ldt sharing. 1999-12-06 04:53:08 +00:00
dfr
6dfb400106 Re-organise the code which manages the owner of the FP state (fpcurproc).
The old code was spread out through the machdep code and was sloppy about
enabling and disabling the FEN bit (which controls access to the FP
register set). This caused a DIAGNOSTIC warning "DANGER WILL ROBINSON:
FEN SET IN cpu_fork!" sometimes when operating under high loads and could
conceivably lead to processes getting incorrect FP results.

The new code is much more strict about the FEN bit and makes sure that
*only* fpcurproc ever has it enabled. This also allows us to remove a
section of code from the exception_return path which might improve
performance marginally.

Reviewed by: gallatin
1999-11-10 21:14:25 +00:00
alc
a301513e80 The core of this patch is to vm/vm_page.h. The effects are two-fold: (1) to
eliminate an extra (useless) level of indirection in half of the page
queue accesses and (2) to use a single name for each queue throughout,
instead of, e.g., "vm_page_queue_active" in some places and
"vm_page_queues[PQ_ACTIVE]" in others.

Reviewed by:	dillon
1999-10-30 07:37:14 +00:00
phk
8e3c3eafed useracc() the prequel:
Merge the contents (less some trivial bordering the silly comments)
of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>.  This puts
the #defines for the vm_inherit_t and vm_prot_t types next to their
typedefs.

This paves the road for the commit to follow shortly: change
useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE}
as argument.
1999-10-29 18:09:36 +00:00
dillon
e8761b4e05 Fix bug in pipe code relating to writes of mmap'd but illegal address
spaces which cross a segment boundry in the page table.  pmap_kextract()
    is not designed for access to the user space portion of the page
    table and cannot handle the null-page-directory-entry case.

    The fix is to have vm_fault_quick() return a success or failure which
    is then used to avoid calling pmap_kextract().
1999-09-20 19:08:48 +00:00
peter
3b842d34e8 $Id$ -> $FreeBSD$ 1999-08-28 01:08:13 +00:00
gallatin
ffd6399600 Fix the child's return path from fork so that fork will return 0
in the child.  This corrects a problem where linux/alpha binaries see
the child's return value of fork as the parent's pid.  This happens because
linux/alpha binaries apparently check the return value directly, rather
than looking for a non-zero value in a4, as *BSD & OSF/1 do.

Reviewed by:dfr@nlsystems.com
1999-08-27 14:47:23 +00:00
jdp
86197f5aad Sync with alc's revision 1.125 of i386/i386/vm_machdep.c. This
fixes the kernel build breakage.
1999-08-05 23:38:13 +00:00
alc
9cda945475 Reduce the number of "magic constants" used for page coloring
by one: PQ_PRIME2 and PQ_PRIME3 are used to accomplish the same
thing at different places in the kernel.  Drop PQ_PRIME3.
1999-07-22 06:04:17 +00:00
peter
6cb5fe6c6b Slight reorganization of kernel thread/process creation. Instead of using
SYSINIT_KT() etc (which is a static, compile-time procedure), use a
NetBSD-style kthread_create() interface.  kproc_start is still available
as a SYSINIT() hook.  This allowed simplification of chunks of the
sysinit code in the process.  This kthread_create() is our old kproc_start
internals, with the SYSINIT_KT fork hooks grafted in and tweaked to work
the same as the NetBSD one.

One thing I'd like to do shortly is get rid of nfsiod as a user initiated
process.  It makes sense for the nfs client code to create them on the
fly as needed up to a user settable limit.  This means that nfsiod
doesn't need to be in /sbin and is always "available".  This is a fair bit
easier to do outside of the SYSINIT_KT() framework.
1999-07-01 13:21:46 +00:00
dt
16f5a97188 Replace my previous fix of saving the FP state with a much simpler one: when
we swap out fpcurproc, save its FP state.

Suggested by:	bde
1999-06-10 20:40:59 +00:00
dt
6f2e7884b6 Keep fpcurproc locked in memory, so that we always can save the FP state
correctly.

This should fix the "pmap_changebit didn't" panic that some people see.

Reviewed by:	dfr
1999-06-08 16:42:19 +00:00
dt
a1d50daf71 Fixed several (not all) warnings. 1999-04-23 19:53:38 +00:00
dt
9eacfd4188 Added consts to cpu_set_fork_handler prototype. (Follow i386 version.) 1999-04-20 22:53:54 +00:00
peter
a74bdeb7d1 unifdef -DVM_STACK - it's been on for a while for x86 and was checked
and appeared to be working for the Alpha some time ago.
1999-04-19 14:14:14 +00:00
dillon
e402e06da9 Adjust idle zero-page fill hysteresis based on tests. Use 2/3 and 4/5
zero-fill levels.

    Adjust comment for ozfod in vmmeter.h - this counter represents
    non-optimal ( on the fly ) zero fills, not prefills.
1999-02-08 02:42:13 +00:00
dillon
f403a5fe02 Add hysteresis to alpha version of vm_page_zero_idle(). 1999-02-08 00:47:32 +00:00
dillon
b7a0b99c31 Rip out PQ_ZERO queue. PQ_ZERO functionality is now combined in with
PQ_FREE.  There is little operational difference other then the kernel
    being a few kilobytes smaller and the code being more readable.

    * vm_page_select_free() has been *greatly* simplified.
    * The PQ_ZERO page queue and supporting structures have been removed
    * vm_page_zero_idle() revamped (see below)

    PG_ZERO setting and clearing has been migrated from vm_page_alloc()
    to vm_page_free[_zero]() and will eventually be guarenteed to remain
    tracked throughout a page's life ( if it isn't already ).

    When a page is freed, PG_ZERO pages are appended to the appropriate
    tailq in the PQ_FREE queue while non-PG_ZERO pages are prepended.
    When locating a new free page, PG_ZERO selection operates from within
    vm_page_list_find() ( get page from end of queue instead of beginning
    of queue ) and then only occurs in the nominal critical path case.  If
    the nominal case misses, both normal and zero-page allocation devolves
    into the same _vm_page_list_find() select code without any specific
    zero-page optimizations.

    Additionally, vm_page_zero_idle() has been revamped.  Hysteresis has been
    added and zero-page tracking adjusted to conform with the other changes.
    Currently hysteresis is set at 1/3 (lo) and 1/2 (hi) the number of free
    pages.  We may wish to increase both parameters as time permits.  The
    hysteresis is designed to avoid silly zeroing in borderline allocation/free
    situations.
1999-02-08 00:37:36 +00:00
julian
4b7738dba1 Mostly remove the VM_STACK OPTION.
This changes the definitions of a few items so that structures are the
same whether or not the option itself is enabled. This allows
people to enable and disable the option without recompilng the world.

As the author says:

|I ran into a problem pulling out the VM_STACK option.  I was aware of this
|when I first did the work, but then forgot about it.  The VM_STACK stuff
|has some code changes in the i386 branch.  There need to be corresponding
|changes in the alpha branch before it can come out completely.

what is done:
|
|1) Pull the VM_STACK option out of the header files it appears in.  This
|really shouldn't affect anything that executes with or without the rest
|of the VM_STACK patches.  The vm_map_entry will then always have one
|extra element (avail_ssize).  It just won't be used if the VM_STACK
|option is not turned on.
|
|I've also pulled the option out of vm_map.c.  This shouldn't harm anything,
|since the routines that are enabled as a result are not called unless
|the VM_STACK option is enabled elsewhere.
|
|2) Add what appears to be appropriate code the the alpha branch, still
|protected behind the VM_STACK switch.  I don't have an alpha machine,
|so we would need to get some testers with alpha machines to try it out.
|
|Once there is some testing, we can consider making the change permanent
|for both i386 and alpha.
|
[..]
|
|Once the alpha code is adequately tested, we can pull VM_STACK out
|everywhere.
|

Submitted by:	"Richard Seaman, Jr." <dick@tar.com>
1999-01-26 02:49:52 +00:00
dfr
9aad9d912f Various changes to support OSF1 emulation:
* Move the user stack from VM_MAXUSER_ADDRESS to a place below the 32bit
  boundary (needed to support 32bit OSF programs).  This should also save
  one pagetable per process.
* Add cvtqlsv to the set of instructions handled by the floating point
  software completion code.
* Disable all floating point exceptions by default.
* A minor change to execve to allow the OSF1 image activator to support
  dynamic loading.
1998-12-30 10:38:59 +00:00
bde
9dd9cb4cb2 Removed bogus casts of USRSTACK and/or the other operand in binary
expressions involving USRSTACK.
1998-12-16 15:21:51 +00:00
dfr
8f01fff32f Implement 'software completion' for floating point arithmetic. On the
alpha, operations involving non-finite numbers or denormalised numbers
or operations which should generate such numbers will cause an arithmetic
exception.  For programs which follow some strict code generation rules,
the kernel trap handler can then 'complete' the operation by emulating
the faulting instruction.

To use software completion, a program must be compiled with the arguments
'-mtrap-precision=i' and '-mfp-trap-mode=su' or '-mfp-trap-mode=sui'.
Programs compiled in this way can use non-finite and denormalised numbers
at the expense of slightly less efficient code generation of floating
point instructions.  Programs not compiled with these options will receive
a SIGFPE signal when non-finite or denormalised numbers are used or
generated.

Reviewed by: John Polstra <jdp@polstra.com>
1998-12-04 10:52:48 +00:00
dfr
87a0ba4fba Change a bogus cast to the correct one. 1998-10-15 09:53:27 +00:00
dfr
f2d6c22423 Don't bother calling pmap_emulate_reference() from cpu_fork(). It isn't
needed and it panics a DIAGNOSTIC kernel.
1998-07-12 16:30:58 +00:00
dfr
8bc7b7e51a Add missing copyrights. Thanks to Jason Thorpe for politely noting the
mistake...
1998-06-10 19:59:41 +00:00