1130 Commits

Author SHA1 Message Date
marcel
6c553bc634 Remove a gratuitous align directive after the endp directive for
IVT entries.
2003-07-11 08:49:26 +00:00
marcel
14925f9a36 Don't call malloc() and free() while in the debugger and unwinding
to get a stacktrace. This does not work even with M_NOWAIT when we
have WITNESS and is generally a bad idea (pointed out by bde@). We
allocate an 8K heap for use by the unwinder when ddb is active. A
stack trace roughly takes up half of that in any case, so we have
some room for complex unwind situations. We don't want to waste too
much space though. Due to the nature of unwinding, we don't worry
too much about fragmentation or performance of unwinding while in
the debugger. For now we have our own heap management, but we may
be able to leverage from existing code at some later time.

While here:
o  Make sure we actually free the unwind environment after unwinding.
   This fixes a memory leak.
o  Replace Doug's license with mine in unwind.c and unwind.h. Both
   files don't have much, if any, of Doug's code left since the EPC
   syscall overhaul and the import of the unwinder.
o  Remove dead code.
o  Replace M_NOWAIT with M_WAITOK for all remaining malloc() calls.
2003-07-05 23:21:58 +00:00
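A minimal sketch of a fixed 8K heap of the kind the commit above describes, for illustration only. It assumes a simple bump allocator that is reset once the unwind is done; the commit does not say which scheme the real heap management uses, and every name here (unw_heap_alloc, unw_heap_reset, UNW_HEAP_SIZE) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UNW_HEAP_SIZE 8192u	/* 8K, as stated in the commit message */

/* Statically reserved storage, so no malloc() is needed in the debugger. */
static _Alignas(16) unsigned char unw_heap[UNW_HEAP_SIZE];
static size_t unw_heap_used;

/* Hand out the next 16-byte aligned chunk, or NULL when the heap is full. */
static void *
unw_heap_alloc(size_t size)
{
	size_t need = (size + 15u) & ~(size_t)15u;
	void *p;

	/* The first test catches wrap-around of the rounded-up size. */
	if (need < size || need > UNW_HEAP_SIZE - unw_heap_used)
		return (NULL);
	p = &unw_heap[unw_heap_used];
	unw_heap_used += need;
	return (p);
}

/* Release everything at once when the unwind environment is torn down. */
static void
unw_heap_reset(void)
{
	unw_heap_used = 0;
}
```

A bump allocator ignores fragmentation entirely, which matches the commit's remark that fragmentation and performance are not a concern while in the debugger.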
alc
0699f7e17f Background: pmap_object_init_pt() premaps the pages of an object in
order to avoid the overhead of later page faults.  In general, it
implements two cases: one for vnode-backed objects and one for
device-backed objects.  Only the device-backed case is really
machine-dependent, belonging in the pmap.

This commit moves the vnode-backed case into the (relatively) new
function vm_map_pmap_enter().  On amd64 and i386, this commit only
amounts to code rearrangement.  On alpha and ia64, the new machine
independent (MI) implementation of the vnode case is smaller and more
efficient than their pmap-based implementations.  (The MI
implementation takes advantage of the fact that objects in -CURRENT
are ordered collections of pages.)  On sparc64, pmap_object_init_pt()
hadn't (yet) been implemented.
2003-07-03 20:18:02 +00:00
ru
ceee3c7367 The .s files were repo-copied to .S files.
Approved by:	marcel
Repocopied by:	joe
2003-07-02 12:57:07 +00:00
marcel
d5e294a2e0 The use of SYSINIT requires the inclusion of <sys/kernel.h>
2003-07-02 01:22:29 +00:00
mux
3e14cb60b5 Make this even closer to other busdma backends.
2003-07-01 21:21:45 +00:00
mux
7f5998c707 Sync bounce pages support with the alpha backend. More precisely:
	o use a mutex to protect the bounce pages structure.
	o use a SYSINIT function to initialize the bounce pages structures
	  and thus avoid a race condition in alloc_bounce_pages().
	o add support for the BUS_DMA_NOWAIT flag in bus_dmamap_load().
	o remove obsolete splhigh()/splx() calls.
	o remove printf() about incorrect locking in busdma_swi() and sync
	  busdma_swi() with the one of the alpha backend.
	o use __FBSDID.
2003-07-01 18:08:05 +00:00
mux
152160211a Honor the boundary of the busdma tag when allocating bounce pages.
This was fixed in revision 1.5 of alpha/alpha/busdma_machdep.c and
was never fixed in other busdma backends using bounce pages.
2003-07-01 16:54:54 +00:00
scottl
4d495abb9d Mega busdma API commit.
Add two new arguments to bus_dma_tag_create(): lockfunc and lockfuncarg.
Lockfunc allows a driver to provide a function for managing its locking
semantics while using busdma.  At the moment, this is used for the
asynchronous busdma_swi and callback mechanism.  Two lockfunc implementations
are provided: busdma_lock_mutex() performs standard mutex operations on the
mutex that is specified from lockfuncarg.  dflt_lock() is a panic
implementation and is defaulted to when NULL, NULL are passed to
bus_dma_tag_create().  The only time that NULL, NULL should ever be used is
when the driver ensures that bus_dmamap_load() will not be deferred.
Drivers that do not provide their own locking can pass
busdma_lock_mutex,&Giant args in order to preserve the former behaviour.

sparc64 and powerpc do not provide real busdma_swi functions, so this is
largely a noop on those platforms.  The busdma_swi on ia64 is not properly
locked yet, so warnings will be emitted on this platform when busdma
callback deferrals happen.

If anyone gets panics or warnings from dflt_lock() being called, please
let me know right away.

Reviewed by:	tmm, gibbs
2003-07-01 15:52:06 +00:00
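The lockfunc idea in the commit above can be sketched in plain userland C. This is a simplified model under stated assumptions, not the kernel API: the types and names (sk_lockfunc_t, sk_lock_mutex, sk_run_deferred, struct sk_tag) are hypothetical, the "mutex" is just a flag, and the panicking default used for NULL, NULL is not modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lock operations passed to a driver-supplied lock function. */
typedef enum { SK_DMA_LOCK, SK_DMA_UNLOCK } sk_lock_op_t;
typedef void sk_lockfunc_t(void *arg, sk_lock_op_t op);

/* Stand-in for a mutex: only records whether it is held. */
struct sk_mutex {
	int held;
};

/* Lock function that performs plain lock/unlock on the mutex in arg. */
static void
sk_lock_mutex(void *arg, sk_lock_op_t op)
{
	struct sk_mutex *m = arg;

	if (op == SK_DMA_LOCK) {
		assert(!m->held);
		m->held = 1;
	} else {
		assert(m->held);
		m->held = 0;
	}
}

/* A tag remembers the lock function and its argument from creation time. */
struct sk_tag {
	sk_lockfunc_t *lockfunc;
	void *lockfuncarg;
};

/* Deferred path: take the driver's lock around the driver's callback. */
static void
sk_run_deferred(struct sk_tag *tag, void (*callback)(void *), void *cbarg)
{
	tag->lockfunc(tag->lockfuncarg, SK_DMA_LOCK);
	callback(cbarg);
	tag->lockfunc(tag->lockfuncarg, SK_DMA_UNLOCK);
}

/* Example callback that records whether the lock was held when it ran. */
struct sk_probe {
	struct sk_mutex *mtx;
	int saw_lock_held;
};

static void
sk_probe_callback(void *arg)
{
	struct sk_probe *p = arg;

	p->saw_lock_held = p->mtx->held;
}
```

The point of the indirection is that the busdma code never needs to know which lock a driver uses; it only calls whatever function the tag was created with.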
alc
44509f207f - Export pmap_enter_quick() to the MI VM. This will permit the
   implementation of a largely MI pmap_object_init_pt() for vnode-backed
   objects.  pmap_enter_quick() is implemented via pmap_enter() on sparc64
   and powerpc.
 - Correct a mismatch between pmap_object_init_pt()'s prototype and its
   various implementations.  (I plan to keep pmap_object_init_pt() as
   the MD hook for device-backed objects on i386 and amd64.)
 - Correct an error in ia64's pmap_enter_quick() and adjust its interface
   to match the other versions.  Discussed with: marcel
2003-06-29 21:20:04 +00:00
alc
4418bf544e - Remove the calls to pmap_install() from pmap_object_init_pt(); they are
   redundant.  Discussed with: marcel
 - MFi386: Add vm object locking to pmap_object_init_pt().
2003-06-29 06:10:32 +00:00
marcel
96d5411913 Implement cpu_set_upcall_kse(). Elementary testing shows that this
function behaves correctly in principle, but is not expected to be
100% complete. In any case, with this commit we have KSE ported
enough to start runtime testing with threaded applications and fix
whatever bugs or omissions we encounter. Yay!
2003-06-28 09:22:25 +00:00
davidxu
bb3ae5a363 Add a machine-dependent function thread_siginfo, SA signal code
will use the function to construct a siginfo structure and use
the result to export to userland.

Reviewed by: julian
2003-06-28 06:34:08 +00:00
scottl
d68d16eebb Do the first and mostly mechanical step of adding mutex support to the
bus_dma async callback scheme.  Note that sparc64 does not seem to do
async callbacks.  Note that ia64 callbacks might not be MPSAFE at the
moment.  Note that powerpc doesn't seem to do async callbacks due to
the implementation being incomplete.

Reviewed by:	mostly silence on arch@
2003-06-27 08:31:48 +00:00
marcel
2ef56d2c5b Add TLS related relocation.
2003-06-19 06:51:43 +00:00
alc
9dcd110789 Fix a performance bug in all of the various implementations of
uma_small_alloc(): They always zeroed the page regardless of what the
caller requested.
2003-06-18 02:57:38 +00:00
davidxu
abb4420bbe Rename P_THREADED to P_SA. P_SA means a process is using scheduler
activations.
2003-06-15 00:31:24 +00:00
alc
83f108b04d Migrate the thread stack management functions from the machine-dependent
to the machine-independent parts of the VM.  At the same time, this
introduces vm object locking for the non-i386 platforms.

Two details:

1. KSTACK_GUARD has been removed in favor of KSTACK_GUARD_PAGES.  The
different machine-dependent implementations used various combinations
of KSTACK_GUARD and KSTACK_GUARD_PAGES.  To disable guard pages, set
KSTACK_GUARD_PAGES to 0.

2. Remove the (unnecessary) clearing of PG_ZERO in vm_thread_new.  In
5.x, (but not 4.x,) PG_ZERO can only be set if VM_ALLOC_ZERO is passed
to vm_page_alloc() or vm_page_grab().
2003-06-14 23:23:55 +00:00
alc
d20c30720b Move the *_new_altkstack() and *_dispose_altkstack() functions out of the
various pmap implementations into the machine-independent vm.  They were
all identical.
2003-06-14 06:20:25 +00:00
marcel
8898e92876 Remove kernel event tracing. The overhead is significant when running
under ski.
2003-06-14 00:01:24 +00:00
marcel
4d9ed138d1 Make sure pcpu->pc_pcb is pointing to a 16-byte aligned address. The
PCB contains FP registers, whose alignment must be 16 bytes at least.
Since the PCB pointed to by pc_pcb is immediately after the PCPU
itself, round-up the size of the PCPU to a multiple of 16 bytes. The
PCPU is page aligned.

This fixes a misalignment trap caused by stopping a CPU in a SMP
kernel, such as is done when entering the debugger.

Reported by: Alan Robinson <alan.robinson@fujitsu-siemens.com>
2003-06-12 00:15:18 +00:00
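The rounding described in the commit above can be sketched as follows. The helper name round_up_16 is hypothetical and the commit does not show how the kernel expresses the rounding; only the 16-byte requirement comes from the message.

```c
#include <assert.h>
#include <stddef.h>

#define PCB_ALIGN 16u	/* FP registers in the PCB need 16-byte alignment */

/*
 * Round size up to the next multiple of 16. If the PCPU starts on a page
 * boundary and its size is a multiple of 16, whatever immediately follows
 * it (here, the PCB) is 16-byte aligned as well.
 */
static size_t
round_up_16(size_t size)
{
	return ((size + PCB_ALIGN - 1) & ~(size_t)(PCB_ALIGN - 1));
}
```

For example, a 200-byte PCPU would be padded to 208 bytes, so a PCB placed right after a page-aligned PCPU starts at an offset divisible by 16.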
peter
fda03b7cfc GC unused cpu_wait() function
2003-06-11 05:20:33 +00:00
jmallett
9e0997a080 Note that scbus is required for SCSI, not just "required" in general.
Submitted by:	Edward Kaplan (tmbg37 on IRC)
Reviewed by:	rwatson (in principle)
2003-06-08 02:03:02 +00:00
marcel
4c5771dae0 pmap_find_vhpt() has been observed to return a NULL pointer when
the caller assumes this to not happen by means of performing an
indirection without checking the return value. Add KASSERTs to
force a kernel with INVARIANTS to panic. This is a short-term
measure. The pmap code is scheduled to be overhauled.
2003-06-07 04:17:39 +00:00
marcel
67620c84fd If we get a fault in the gateway page, which would happen if we try
to deliver a signal and the RSE backing store has been exhausted or
the backing store pointer has been clobbered, we need to make sure
we call userret() and do_ast() when we exit from trap(). Not adjusting
the local variable 'user' in this case will prevent the faulty process
from being terminated and we end up in an infinite fault repetition.

Faulty process provided by: bento
2003-06-07 04:10:07 +00:00
marcel
8f82b3111e Use TRAPF_USERMODE() to replace an equivalent check in trap(). While
here, amend the related comment.
2003-06-06 23:44:05 +00:00
marcel
ef38b07248 Have TRAPF_USERMODE() take into account that the gateway page is not
always kernel space. It should be treated as user space when run with
user privileges (which is the case for the signal trampolines). This
fixes its only use in a KASSERT in subr_trap.c.
2003-06-06 23:27:18 +00:00
marcel
23fb7b1d8a Fix the dreaded double counting that was present on alpha as well and
got fixed two weeks after the ia64 version was copied from the alpha
version (see rev 1.32 of sys/alpha/alpha/mem.c). As such, we were
missing the same continue as on alpha.

While here, add a default case for the device minor switch and do
some general style(9) cleanups.

WARNING: this file still has bugs. When reading from region 6 or
region 7, we don't validate the physical address. One can trivially
cause a machine check by trying to read from address 0xFFFFFFFFFFFFFFF0
or something that uses the unimplemented physical address bits.

Reported by: Alan Robinson <alan.robinson@fujitsu-siemens.com>
2003-06-04 21:56:10 +00:00
marcel
482a35058c Change the second (and last) argument of cpu_set_upcall(). Previously
we were passing in a void* representing the PCB of the parent thread.
Now we pass a pointer to the parent thread itself.
The prime reason for this change is to allow cpu_set_upcall() to copy
(parts of) the trapframe instead of having it done in MI code in each
caller of cpu_set_upcall(). Copying the trapframe cannot always be
done with a simple bcopy() or may not always be optimal that way. On
ia64 specifically the trapframe contains information that is specific
to an entry into the kernel and can only be used by the corresponding
exit from the kernel. A trapframe copied verbatim from another frame
is in most cases useless without some additional normalization.

Note that this change removes the assignment to td->td_frame in some
implementations of cpu_set_upcall(). The assignment is redundant.
A previous call to cpu_thread_setup() already did the exact same
assignment. An added benefit of removing the redundant assignment is
that we can now change td_pcb without nasty side-effects.

This change officially marks the ability on ia64 for 1:1 threading.

Not tested on: amd64, powerpc
Compile & boot tested on: alpha, sparc64
Functionally tested on: i386, ia64
2003-06-04 21:13:21 +00:00
marcel
6f0fa0f8a5 Improve set_mcontext:
o  Don't copy psr verbatim from the user supplied context. Only allow
   userland to change the processor settings that are part of the user
   mask.
2003-06-01 23:22:56 +00:00
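The masking rule in the commit above amounts to a standard bit merge. The sketch below is illustrative only: merge_user_psr is a hypothetical name, and the mask values in the usage note are made up rather than the real ia64 user mask.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Take only the bits covered by user_mask from the user-supplied value and
 * keep every other bit from the current (kernel-controlled) value.
 */
static uint64_t
merge_user_psr(uint64_t current, uint64_t requested, uint64_t user_mask)
{
	return ((current & ~user_mask) | (requested & user_mask));
}
```

With current = 0xFF0A, requested = 0x0005 and a mask of 0x000F, the result is 0xFF05: the low four bits follow the request and the privileged bits are untouched.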
marcel
617972a658 Improve on cpu_set_upcall:
o  Use pcb and tf for the new pcb and the new trapframe and use pcb0
   for the old (current) pcb. The mix of pcb, pcb2 and tf was slightly
   confusing.
o  Don't define td->td_frame here. It has already been set previously
   by cpu_thread_setup. Add a KASSERT to make sure pcb and tf are both
   non-NULL.
o  Make sure the number of dirty registers is 0 for the new thread.
   There are no user registers on the backing store because we haven't
   entered userland yet.
2003-06-01 23:19:21 +00:00
marcel
10b3bd530b Implement cpu_thread_setup(). This is mostly the same as on i386,
except for the fact that trapframes have a size recorded in it
that we set here too. We need this for proper thread setup.

Pointed out by: mtm
2003-06-01 08:29:43 +00:00
marcel
536604960e Now that we have the signal trampolines in the gateway page and the
gateway page is considered kernel space, we can panic when we should
only SIGSEGV. Hence, add the additional constraint that for page
faults we also require running with kernel privileges. The gateway
page is the only kernel code running with user privileges, so this
is a correct way to exclude the gateway page from kernel land.

We do not currently exclude the gateway page for other faults as it
is not always the right way to do it. Further tuning will happen on
a case by case basis.
2003-05-31 21:21:35 +00:00
marcel
2fe9074be4 Implement cpu_set_upcall(). Required by libthr and used by
thr_create(2). This implementation is so far only compile tested.
But since this is also the last of the functions required to
support libthr, we're now functionally complete (for some weird
definition of functionally; and complete). Runtime testing can
commence.
2003-05-31 21:14:25 +00:00
marcel
bf7c437f68 Implement set_mcontext() and get_mcontext(). Just as for sendsig() and
sigreturn(), we cheat and assume the preserved registers are still
on-chip and unmodified. This is actually the case, but more by accident
than by design. We need to use unwinding eventually or explicitly
compile the kernel in a way that the compiler steers clear from using
the preserved registers completely.
2003-05-31 21:07:08 +00:00
marcel
dc5393a5ef Make the regset pointers const pointers for the context restore functions.
This works better with set_mcontext() and is more precise in general.
2003-05-31 21:02:18 +00:00
marcel
e4d5efee39 Some ia32 related finetuning for the EPC syscall path:
o  The SDM states that flushing the RSE in the cycle prior to the
   call to ia32 code yields the best performance. We don't really
   care too much about performance here, but we do the same anyway.
   I'm being paranoid and conservative here.
o  Only initialize the ia32 state registers, not the registers used
   as scratch by the ia32 engine. This saves a couple of loads from
   the trapframe, but also helps debugging: we don't clobber useful
   debugging data (engineering hints :-)
o  Make sure all general registers constituting ia32 state have been
   initialized. If there's nothing useful to be loaded from the trapframe,
   clear the register. This avoids accidentally leaking NaT bits.
o  Make sure we set ar.k6 prior to clobbering ar.bspstore and also
   set ar.k7 prior to setting sp. This fixes a race seen for ia64
   native code as well (and previously fixed too).
2003-05-31 20:57:26 +00:00
marcel
bf9a37ed83 Make sure we have all the dirty registers in user frames on the
backing store before we discard them. It is possible that we
enter the kernel (due to an execve in this case) with a lot of
dirty user registers and that the RSE has only partially spilled
them (to make room for new frames). We cannot move the backing
store pointer down (to discard user registers) when not all of
the user registers are on the backing store.
So, we flush the register stack IFF this happens. Unconditionally
doing the flush is too costly, because the condition in which we
need to flush is very rare.

This change appears to fix the SIGSEGV that sometimes happens for
newly executed processes and so far also appears to fix the last
of the corruption. It is possible, although not likely, that this
change prevents some other bug from happening, even though it is
itself not a fix. Hence the uncertainty. We'll know in a couple
of months I guess :-)
2003-05-31 20:42:35 +00:00
hmp
d48f3818ad Rename BUS_DMAMEM_NOSYNC to BUS_DMA_COHERENT.
The current name is confusing, because it indicates to
the client that a bus_dmamap_sync() operation is not
necessary when the flag is specified, which is wrong.

The main purpose of this flag is to hint the underlying
architecture that DMA memory should be mapped in a coherent
way, but the architecture can ignore it.  But if the
architecture does support coherent mapping of memory, then
it makes bus_dmamap_sync() calls cheap.

This flag is the same as the one in NetBSD's Bus DMA.

Reviewed by: gibbs, scottl, des (implicitly)
Approved by: re@ (jhb)
2003-05-30 20:40:33 +00:00
marcel
355d7ef9c5 Move the sysctls of the misalignment handler to where they belong
and use OID_AUTO instead of fixed IDs.

Approved by: re@ (blanket)
2003-05-29 06:30:36 +00:00
marcel
c050190f48 Fix what I think is a cut-n-paste bug: use OID_AUTO for the
print_usertrap sysctl instead of CPU_UNALIGNED_PRINT. The
latter is used already.

Approved by: re@ (blanket)
2003-05-29 05:09:15 +00:00
marcel
cae8aad886 A flushrs must be the first in an instruction group.
Approved by: re@ (blanket)
2003-05-27 07:10:58 +00:00
scottl
f26aca7b71 Bring back bus_dmasync_op_t. It is now a typedef to an int, though the
BUS_DMASYNC_ definitions remain as before.  This does not change the ABI,
and reverts the API to be a bit more compatible and flexible.  This has
survived a full 'make universe'.

Approved by:	re (bmah)
2003-05-27 04:59:59 +00:00
marcel
5fd5bdfb84 Have the unwinder allocate memory with M_NOWAIT. The unwinder is
used by DDB and we cannot know in advance whether it's safe to
sleep. It often enough isn't. We may want to pre-allocate space
to cover the most common cases without having to use malloc at
all, but that requires some analysis. We leave that for later.

Approved by: re@ (blanket)
2003-05-27 01:15:16 +00:00
marcel
a5442fbe5b Fix fu{byte|word*} and su{byte|word*}:
o  If the address was not within user space we jumped to fusufault
   where we would clear pcb_onfault and return 0. There are two
   bugs here:
   1. We never got to the point where we assigned the address of
      pcb_onfault to r15, which means that we would clobber some
      random memory location, including I/O space or ROM.
   2. We're supposed to return -1 on error.
o  Make sure we have proper memory ordering for setting pcb_onfault,
   doing the memory access to user space and clearing pcb_onfault.
   For the fu* family of functions this means that we need a mf
   instruction, because we don't have acquire semantics on stores
   and release semantics on loads (hence st;ld cannot be ordered
   without intermediate mf).

While here, implement casuptr() so that we are a (small) step
closer to supporting libthr and deobfuscate the non-implementation
of {f|s}uswintr.

Approved by: re@ (blanket)
2003-05-27 01:00:12 +00:00
marcel
2e3e224616 Revision 1.99 of this file changed the allocation request from
VM_ALLOC_INTERRUPT to VM_ALLOC_SYSTEM. There was no mention of
this in commit log as it was considered harmless. Guess what:
it does harm. WITNESS showed that we can not safely grab the
page queue lock in vm_page_alloc() in all cases as we may have
to sleep on it. Revert the request to VM_ALLOC_INTERRUPT to
circumvent this. We panic if vm_page_alloc returns 0. I'm not
entirely happy about this, but we have bigger fish to fry.

Approved by: re@ (blanket)
2003-05-26 22:54:18 +00:00
marcel
58b1c667e7 Now that we define user mode as any IP address that isn't in the
kernel's VA regions, we cannot limit the use of break-based
syscalls to user mode only. The signal trampolines are in the
gateway page, which is mapped into the process address space in
region 5 and thus is kernel space.

We don't special case the gateway page here. Allow break-based
syscalls from anywhere in the kernel VA space.

Approved by: re@ (blanket)
2003-05-25 01:01:28 +00:00
marcel
ca381c2e5a Fix a source of instability specific to an EPC userland. We return
to userland with interrupts disabled until we restore PSR. However,
it has been observed that interrupts do actually happen before they
are enabled again. This is a bit surprising and I don't know yet
what's going on exactly. Nevertheless, the code was not crafted
carefully enough to allow interrupts to happen and we could
clobber the kernel stack of another thread when interrupts did
happen.

This is what happens: we restore the (memory) stack pointer (sp)
and the register stack base prior to restoring ar.k6 and ar.k7.
This is not a problem if interrupts don't happen between setting
sp/ar.bspstore and ar.k6/ar.k7. Alas, interrupts can happen.
Since sp/ar.bspstore already point to the userland stacks, we
need to switch to the kernel stack in interrupt. However, ar.k6
and ar.k7 have not been set, which means that we were switching
to some unrelated kstack and happily clobbered the trapframe
present there if the thread to which the kstack belonged was
in kernel mode or otherwise we could have our trapframe clobbered
if that other thread enters the kernel. Nasty either way.

We now carefully restore ar.k6 prior to restoring ar.bspstore and
likewise for ar.k7 and sp. All we need is the guarantee that an
interrupt does not clobber ar.k6 or ar.k7 before we're back in
userland. That has been achieved by restoring ar.k6/ar.k7
unconditionally (see exception.s)

While here, remove the disabling of interrupts on EPC entry. It
was added as a way to "resolve" the crashes until it was understood
what was going on. I think I achieved the latter, so we can remove
the patch. Note that setting up a trapframe with interrupts
enabled has its own share of corner cases, but it's better to
properly fix those than to keep a mostly wrong patch around
because we're afraid to remove it...

Approved by: re@ (blanket)
2003-05-24 22:53:10 +00:00
marcel
d19c2253df Be more careful how we restore interrupts. Don't rewrite most of the
PSR only to achieve setting PSR.i back to its previous value. It
makes it impossible to change any of the 30+ other unrelated bits
when done between intr_disable() and intr_restore(). That's bad.

Instead have intr_disable() return 1 when interrupts were previously
enabled and 0 otherwise and only enable interrupts in intr_restore()
when given a non-0 value.

This change specifically disallows using intr_restore() to disable
interrupts. The reason is simple: interrupts only need to be restored
after they have been disabled, which means that intr_restore() is
called with interrupts disabled and we only need to enable them if
they were previously enabled.

This change does not fix any bugs, other than that it bugged me...

Approved by: re@ (blanket)
2003-05-24 21:44:24 +00:00
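The contract described in the commit above can be modeled in userland C. This is a sketch with a simulated status register, not the kernel code: sim_psr, sim_intr_disable and sim_intr_restore are hypothetical names, and SIM_PSR_I is simply a bit chosen for the illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SIM_PSR_I UINT64_C(0x4000)	/* interrupt-enable bit in this model */

/* Simulated processor status register; interrupts start out enabled. */
static uint64_t sim_psr = SIM_PSR_I;

/* Disable interrupts; return 1 if they were enabled before, 0 otherwise. */
static int
sim_intr_disable(void)
{
	int was_enabled = (sim_psr & SIM_PSR_I) != 0;

	sim_psr &= ~SIM_PSR_I;
	return (was_enabled);
}

/*
 * Re-enable interrupts only when the saved value is non-zero. The function
 * touches no other bit and can never be used to disable interrupts.
 */
static void
sim_intr_restore(int saved)
{
	if (saved != 0)
		sim_psr |= SIM_PSR_I;
}
```

Because only the one bit is ever written, any other status bit changed between the disable and the restore survives, which is the behaviour the commit is after.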
marcel
7b4ed28b3c Consistently use the same metric to differentiate between kernel mode
and user mode. We need to take into account that the EPC syscall path
introduces a grey area in which one can argue either way, including a
third: neither.

We now use the region in which the IP address lies. Regions 5, 6 and 7
are kernel VA regions and if the IP lies in any of those regions we
assume we're in kernel mode. Hence, we can be in kernel mode even if
we're not on the kernel stack and/or have user privileges. There're
gremlins living in the twilight zone :-)

For the EPC syscall path this particularly means that the process
leaves user mode the moment it calls into the gateway page. This
makes the most sense because from a process' point of view the call
represents a request to the kernel for some service and that service
has been performed if the call returns. With the metric we picked,
this also means that we're back in user mode IFF the call returns.

Approved by: re@ (blanket)
2003-05-24 21:16:19 +00:00
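The region test from the commit above can be sketched in a few lines. It relies on the ia64 convention that the top three bits of a 64-bit virtual address select one of eight regions; the function names are hypothetical and the real kernel macro may be written differently.

```c
#include <assert.h>
#include <stdint.h>

/* The region number is held in bits 63..61 of the virtual address. */
static unsigned
va_region(uint64_t va)
{
	return ((unsigned)(va >> 61));
}

/* Per the commit message, regions 5, 6 and 7 are kernel VA regions. */
static int
ip_in_kernel_region(uint64_t ip)
{
	return (va_region(ip) >= 5);
}
```

Under this metric an IP inside the gateway page, which the log places in region 5, counts as kernel mode no matter which stack or privilege level is in use.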
marcel
36470ce9b8 Unconditionally restore ar.k7 (memory stack) and ar.k6 (register stack)
when returning from an interrupt. Both registers are used on interrupt
to switch to the right kernel stack, but other than that they are not
used. This means we only have to make sure they contain proper values
while in user mode. As such, we conditionally restored these registers
based on whether we returned to userland or not. A nice property of
conditionally restoring ar.k6 and ar.k7 is that it introduces two
invariants: ar.k6 always points to the bottom of the kernel stack and
ar.k7 always points to the top of the kernel stack (immediately below
the PCB we have there).

However, the EPC syscall path introduces an irregularity: there's no
"thin red line" between user and kernel. There's a grey area that's a
couple of instructions wide. Any interruption in that grey area is
bound to see an inconsistent state. One such state is that we're in
kernel space for all practical purposes, but we still need to have
ar.k6 and ar.k7 restored as if we're in userland.

Thus: restore ar.k6 and ar.k7 unconditionally at the cost of losing
a valuable invariant. Both registers now hold the extent of the
usable portion of the kernel stack at any interrupt nesting, which
when in userland means the bottom and the top of the kstack.
2003-05-24 20:51:55 +00:00
marcel
d2359553a8 Fix an alpha inheritance bug:
On alpha, PAL is involved in context management and after wiring
the CPU (in alpha_init()) a context switch was performed to tell
PAL about the context. This was bogusly brought over to ia64
where it introduced bugs, because we restored the context from
a mostly uninitialized PCB.

The cleanup constitutes:
o  Remove the unused arguments from ia64_init().
o  Don't return from ia64_init(), but instead call mi_startup()
   directly. This reduces the amount of muckery in assembly and
   also allows for the next bullet:
o  Save our current context prior to calling mi_startup(). The
   reason for this is that many threads are created from thread0
   by cloning the PCB. By saving our context in the PCB, we have
   something sane to clone. It also ensures that a cloned thread
   that does not alter the context in any way will return to
   the saved context, where we're ready for the eventuality with
   a nice, user unfriendly panic().

The cleanup fixes at least the following bugs:
o  Entering mi_startup() with the RSE in enforced lazy mode.
o  Re-execution of ia64_init() in certain "lab" conditions.

While here, add proper unwind directives to __start() so that
the unwinder knows it has reached the bottom of the (call) stack.

Approved by: re@ (blanket)
2003-05-24 00:17:34 +00:00
marcel
23cc51994d Fix a (new) source of instability:
When interrupting a kernel context, we don't need to switch stacks
(memory nor register). As such, we were also not restoring the
register stack pointer (ar.bspstore). This, however, fails to be
valid in 1 situation: when we interrupt a register stack switch as
is being done in restorectx(). The problem is that restorectx()
needs to have ar.bsp == ar.bspstore before it can assign the new
value to ar.bspstore. This is achieved by doing a loadrs prior to
assigning to ar.bspstore. If we take an interrupt in between the
loadrs and the assignment and we don't make sure we restore the
ar.bspstore prior to returning from the interrupt, we switch
stacks with possibly non-zero dirty registers, which means that
the new frame pointer (ar.bsp) will be invalid.

So, instead of jumping over the restoration of the register frame
pointer and related registers, we conditionalize it based on whether
we return to kernel context or user context. A future performance
tweak is possible by only restoring ar.bspstore when returning to
kernel mode *and* when the RSE is in enforced lazy mode. One cannot
assume ar.bsp == ar.bspstore if the RSE is not in enforced lazy mode
anyway.

While here (well, not quite) don't unconditionally assign to
ar.bspstore in exception_save. Only do that when we actually switch
stacks. It can only harm us to do it unconditionally.

Approved by: re@ (blanket)
2003-05-23 23:55:31 +00:00
marcel
7cd751dbd6 In swapctx(), put the RSE in enforced lazy mode before we flush the
register stack. There's nothing really wrong with flushing before
putting the RSE in enforced lazy mode, provided you don't depend on
ar.bspstore being equal to ar.bsp when the RSE has been put in
enforced lazy mode. The small window between the flush and setting
the RSE may be sufficient to have the RSE eagerly increase the dirty
region (and hence cause ar.bspstore != ar.bsp) or have an interrupt
that may even get the laziest RSE to do something.

Anyway: we don't depend on ar.bspstore being equal to ar.bsp, so
nothing was and is broken. But the code was non-intuitive and
easily confuses. This is a source of future bugs.

Note: the advantage of not depending on ar.bspstore is that there's
some resilience against an interrupted flushrs. Clobbering is limited
to stacked register contents only, not to RSE address clobbering.

Approved: re@ (blanket)
2003-05-23 23:16:43 +00:00
marcel
bdb2f38ea8 o  Fix a definite bogon: the dirty bit fault, instruction access
   fault and data access fault install the PTE in question into
   the VHPT table. However, a post-increment was missing and we
   wrote the raw PTE data into the pagesize/access key field.
   This leaves a corrupt VHPT entry.
o  While here, remove the explicit cache purge. Insertion into
   the translation implicitly purges any overlapping entries.
o  Make sure there's a cycle break between the itc and the rfi.
o  Whitespace fixes.
2003-05-20 06:57:20 +00:00
marcel
3e51b25670 Rename the "IA64 ITC" counter to "ITC" counter. We don't call the
"TSC" counter on i386 "I386 TSC".

Approved by: re@ (blanket)
2003-05-20 06:51:20 +00:00
marcel
460100722d Prevent corruption of the VHPT collision chain by protecting it with
a mutex. The only volatile chain operations are insertion and deletion
but since updating an existing PTE also updates the VHPT entry itself,
and we have the VHPT mutex in both other cases, we also lock when we
update an existing PTE even though no chain operation is involved.
Note that we perform the insertion and deletion carefully enough that
we don't need to lock traversals. If we need to lock traversals, we
also need to lock from the exception handler, which we can't without
creating a trapframe.

We're now able to withstand a -j8 buildworld. More work is needed to
withstand Murphy fields. In other words: we still have a bogon...

Approved by: re@ (blanket)
2003-05-20 02:52:41 +00:00
kan
f35a6040c1 sys/sys/limits.h:
- Fix visibility test for LONG_BIT and WORD_BIT.  `#if defined(__FOO_VISIBLE)'
   is always wrong because __FOO_VISIBLE is always defined (to 0 for
   invisibility).

sys/<arch>/include/limits.h
sys/<arch>/include/_limits.h:

 - Style fixes.

Submitted by:	bde
Reviewed by:	bsdmike
Approved by:	re (scottl)
2003-05-19 20:29:07 +00:00
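The preprocessor pitfall fixed by the commit above is easy to reproduce. In this sketch __FOO_VISIBLE is the placeholder name from the commit message; the two result macros are hypothetical and exist only to expose what each test concludes.

```c
#include <assert.h>

/* Visibility macros are always defined; 0 means "not visible". */
#define __FOO_VISIBLE 0

/* Wrong test: defined() is true for any definition, including 0. */
#if defined(__FOO_VISIBLE)
#define DEFINED_TEST_SAYS_VISIBLE 1
#else
#define DEFINED_TEST_SAYS_VISIBLE 0
#endif

/* Right test: look at the value, not at whether the macro exists. */
#if __FOO_VISIBLE
#define VALUE_TEST_SAYS_VISIBLE 1
#else
#define VALUE_TEST_SAYS_VISIBLE 0
#endif

/* Compile-time confirmation that the two tests disagree. */
static_assert(DEFINED_TEST_SAYS_VISIBLE == 1, "defined() sees the macro");
static_assert(VALUE_TEST_SAYS_VISIBLE == 0, "the value says not visible");
```

The defined() form exposes the hidden declarations in every configuration, which is why the commit calls it always wrong.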
marcel
eb68623668 Turn pmap_install_pte() into a critical section. We better not get
interrupted while writing into the VHPT table. While here, make sure
memory accesses are properly ordered. Tag invalidation must happen
first so that the hardware VHPT walker will not be able to match
this entry while we're updating it and we have to make sure the
new tag gets written only after the PTE is completely updated.

Approved by: re (blanket)
2003-05-19 08:02:36 +00:00
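The write ordering described in the commit above (invalidate the tag, update the entry, then publish the new tag) can be sketched with C11 atomics. This models only the ordering; the critical section and the hardware walker are not modeled, and the entry layout, SK_TAG_INVALID and both function names are assumptions made for the illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical "invalid" marker: a tag value no lookup can ever match. */
#define SK_TAG_INVALID (UINT64_C(1) << 63)

struct sk_vhpt_entry {
	uint64_t pte;
	uint64_t itir;
	_Atomic uint64_t tag;
};

static void
sk_install_pte(struct sk_vhpt_entry *e, uint64_t tag, uint64_t pte,
    uint64_t itir)
{
	assert((tag & SK_TAG_INVALID) == 0);

	/* 1. Invalidate the tag so a half-written entry cannot match. */
	atomic_store_explicit(&e->tag, SK_TAG_INVALID, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);

	/* 2. Update the body of the entry. */
	e->pte = pte;
	e->itir = itir;

	/* 3. Publish the new tag only after the body is complete. */
	atomic_store_explicit(&e->tag, tag, memory_order_release);
}

/* A lookup matches only when the tag is valid and equal to the one sought. */
static int
sk_vhpt_match(struct sk_vhpt_entry *e, uint64_t tag)
{
	return (atomic_load_explicit(&e->tag, memory_order_acquire) == tag);
}
```

The release store in step 3 pairs with the acquire load in the lookup, so a reader that sees the new tag also sees the new entry contents.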
marcel
5d98f3c472 Unconditionally set pcb_current_pmap. WIP versions of the code
previously committed cleared pcb_current_pmap prior to changing
the region registers, but that was removed before committing.
Since we don't normally (at all?) pass a NULL pointer, the bug
was mostly harmless. Fix it while I'm here...

I'm here because we need to have data serialization after writing
to the region registers. Not doing so was likely the cause of the
hangs we were experiencing. General exceptions in cpu_switch may
also be caused by the lack of serialization.

Approved by: re (blanket)
2003-05-19 06:05:30 +00:00
marcel
cd7086c532 pmap_install() needs to be atomic WRT to context switching. Protect
switching user regions (region 0-4) with schedlock. Avoid unnecessary
recursion on schedlock by moving the core functionality to another
function (pmap_switch()) where we assert schedlock is held. Turn
pmap_install() into a wrapper that grabs schedlock. This minimizes
the number of callsites that need to be changed.
Since we already have schedlock in cpu_switch() and cpu_throw(),
have them call pmap_switch() directly. These were also the only two
calls to pmap_install() outside pmap.c, so make pmap_install() static
and remove its prototype from pmap.h

Approved by: re (blanket)
2003-05-19 04:16:30 +00:00
marcel
899a43e474 Remove unused files. cpu_switch() and cpu_throw(), normally in swtch.s,
can be found in machdep.c.

Approved: re@
2003-05-17 04:55:04 +00:00
marcel
5d3af2c5ab Revamp of the syscall path, exception and context handling. The
prime objectives are:
o  Implement a syscall path based on the epc instruction (see
   sys/ia64/ia64/syscall.s).
o  Revisit the places where we need to save and restore registers
   and define those contexts in terms of the register sets (see
   sys/ia64/include/_regset.h).

Secondary objectives:
o  Remove the requirement to use contigmalloc for kernel stacks.
o  Better handling of the high FP registers for SMP systems.
o  Switch to the new cpu_switch() and cpu_throw() semantics.
o  Add a good unwinder to reconstruct contexts for the rare
   cases we need to (see sys/contrib/ia64/libuwx)

Many files are affected by this change. Functionally it boils
down to:
o  The EPC syscall doesn't preserve registers it does not need
   to preserve and places the arguments differently on the stack.
   This affects libc and truss.
o  The address of the kernel page directory (kptdir) had to
   be unstaticized for use by the nested TLB fault handler.
   The name has been changed to ia64_kptdir to avoid conflicts.
   The renaming affects libkvm.
o  The trapframe only contains the special registers and the
   scratch registers. For syscalls using the EPC syscall path
   no scratch registers are saved. This affects all places where
   the trapframe is accessed. Most notably the unaligned access
   handler, the signal delivery code and the debugger.
o  Context switching only partly saves the special registers
   and the preserved registers. This affects cpu_switch() and
   triggered the move to the new semantics, which additionally
   affects cpu_throw().
o  The high FP registers are either in the PCB or on some
   CPU. context switching for them is done lazily. This affects
   trap().
o  The mcontext has room for all registers, but not all of them
   have to be defined in all cases. This mostly affects signal
   delivery code now. The *context syscalls are as of yet still
   unimplemented.

Many details went into the removal of the requirement to use
contigmalloc for kernel stacks. The details are mostly CPU
specific and limited to exception_save() and exception_restore().
The few places where we create, destroy or switch stacks were
mostly simplified by not having to construct physical addresses
and additionally saving the virtual addresses for later use.

Besides more efficient context saving and restoring, which of
course yields a noticeable speedup, this also fixes the dreaded
SMP bootup problem as a side-effect. The details of which are
still not fully understood.

This change includes all the necessary backward compatibility
code to have it handle older userland binaries that use the
break instruction for syscalls. Support for break-based syscalls
has been pessimized in favor of a clean implementation. Due to
the overall better performance of the kernel, this will still
be noticed as an improvement if it's noticed at all.

Approved by: re@ (jhb)
2003-05-16 21:26:42 +00:00
marcel
7658c8da7d o In pmap_install, don't prevent switching the pmap if we're
switching to kernel_pmap. The pmap is not special enough.
o  Clear the active bit on the pmap we're switching out.
o  Fix some nearby style(9) bugs.

Approved by: re@
2003-05-16 07:57:44 +00:00
marcel
99b4e67709 Indent a comment. This makes 1.100.
Still approved by: re@ (blanket)
2003-05-16 07:05:08 +00:00
marcel
ea1f6119a1 Turn pmap_growkernel() into a critical section. While here, initialize
kernel_vm_end in pmap_bootstrap. Don't delay the initialization until
we need to grow the kernel VM space. This BTW happens twice before
we enter either single- or multi-user mode. Don't adjust kernel_vm_end
while growing based on whether the KPT contains a non-NULL entry. We
trust kernel_vm_end to be correct and we make sure it's still correct
after growing.
Define virtual_avail and virtual_end in terms of VM_MIN_KERNEL_ADDRESS
and VM_MAX_KERNEL_ADDRESS (resp). Don't hardcode region knowledge.
2003-05-16 07:03:15 +00:00
marcel
1acf9e2b81 Revamp the RID allocation code:
o  Limit the size of the region ID map to 64KB. This gives a bitmap
   that is large enough to keep track of 2^19 numbers. The minimal map
   size is 32KB. The reason we limit the map size is that processor
   models may have implemented a 24-bit region ID, which would give
   a 2MB bitmap while the maximum number of allocations is always
   less than PID_MAX*5, which is less than 2^19.
o  Allocate all region IDs up-front. The slight downside of reserving
   more RIDs than a process needs (3 for ia64 native and 1 for ia32)
   is preferable over the call to pmap_ensure_rid() where RIDs are
   allocated on demand. On SMP systems this may lead to a race
   condition.
o  When allocating a region ID, don't use arc4random(). We're not
   interested in randomness or uniform distribution across the
   spectrum. We only need uniqueness. Random numbers may easily
   collide when the number of allocated RIDs is high, creating a
   possibly unbounded retry rate.
2003-05-16 06:40:40 +00:00
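The arithmetic and the "uniqueness, not randomness" point from the RID commit above can be illustrated with a small bitmap allocator. This is a sketch under stated assumptions, not the pmap code: the demo_* names are invented, there is no locking, no RIDs are reserved, and the bound check assumes PID_MAX is 99999.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_RID_MAP_BYTES	(64 * 1024)		/* 64KB bitmap */
#define DEMO_RID_COUNT		(DEMO_RID_MAP_BYTES * 8)	/* 2^19 IDs */
#define DEMO_RID_WORDS		(DEMO_RID_MAP_BYTES / sizeof(uint64_t))

/* Assumption: PID_MAX is 99999, so PID_MAX * 5 fits in the map. */
static_assert(99999 * 5 < DEMO_RID_COUNT, "map covers all allocations");

static uint64_t demo_rid_map[DEMO_RID_WORDS];

/* Return the lowest free ID, or -1 when the map is full. */
static int
demo_rid_alloc(void)
{
	for (size_t w = 0; w < DEMO_RID_WORDS; w++) {
		if (demo_rid_map[w] == UINT64_MAX)
			continue;	/* every ID in this word is taken */
		for (int b = 0; b < 64; b++) {
			uint64_t mask = UINT64_C(1) << b;
			if ((demo_rid_map[w] & mask) == 0) {
				demo_rid_map[w] |= mask;
				return ((int)(w * 64 + b));
			}
		}
	}
	return (-1);
}

static void
demo_rid_free(int rid)
{
	assert(rid >= 0 && rid < DEMO_RID_COUNT);
	uint64_t mask = UINT64_C(1) << (rid % 64);
	assert(demo_rid_map[rid / 64] & mask);	/* must be allocated */
	demo_rid_map[rid / 64] &= ~mask;
}
```

A linear scan always terminates after at most one pass over the map, which is the property a random probe cannot guarantee when most IDs are in use.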
marcel
ee46c327a1 Move the conditional definition of KSTACK_MAX_PAGES up ahead where
it's more visible.

Approved by: re@ (blanket)
2003-05-16 06:17:34 +00:00
marcel
2983398f57 This file creates register sets based on the runtime specification.
The advantage of using register sets is that you don't focus on each
register separately, but instead introduce a level of abstraction.
This reduces the chance of errors, and also simplifies the code.
The register sets form the basis of everything register related.
The sets in this file are:

struct _special
contains all of the control related registers, such as instruction
pointer and stack pointer. It also contains interrupt specific registers
like the faulting address. The set is roughly split in 3 groups. The
first contains the registers that define a context or thread. This is
the only group that the kernel needs to switch threads.  The second group
contains the registers needed, in addition to the first group, to switch
userland threads. This group contains the thread pointer and the FP control
register. The third group contains those registers we need for exception
handling and are used on top of the first two groups.

struct _callee_saved, struct _callee_saved_fp
These sets contain the preserved registers, including the NaT after
spilling. The general registers (including branch registers) are
separated from the FP registers for ptrace(2).

struct _caller_saved, struct _caller_saved_fp
These sets contain the scratch registers based on SDM 2.1. This means that
both ar.csd and ar.ccd are included here, even though they contain ia32
segment register descriptions. We keep separate NaT bits for scratch and
preserved registers, because they are never saved/restored at the same
time.

struct _high_fp
The upper 96 FP registers that can be enabled/disabled separately on
the CPU from the lower 32 FP registers. Due to the size of this set,
we treat them specially, even though they are defined as scratch
registers.

2003-05-15 08:36:03 +00:00
marcel
08a1cc9dd4 This file contains elementary context related functions used to
save and restore "sets" of registers in various places.
The restorectx and swapctx functions are used by cpu_switch()
and deal with the special registers, as well as the preserved
registers.
The *callee_saved* functions are used to save and restore the
preserved registers (integer and floating-point). They are
useful for signal delivery and ptrace support.
The save_high_fp and restore_high_fp functions are used to
"load" and "unload" to and from the CPU as part of lazy context
switching.
The ia32 specific context functions have been kept with the ia32
code.

Approved by: re@ (blanket)
2003-05-15 08:08:32 +00:00
marcel
606bf22520 This file contains the code that implements the syscall path based
on the epc instruction. The epc instruction, given the permissions
of the page in which the epc is located, allows the privilege level
to be increased with little or no overhead. The previous privilege
level is recorded in the current frame marker and is restored by
a regular (function) return.
Since the epc instruction has to live in a page with non-standard
properties, we hardwire a "gateway" page in the address space. The
address of the gateway page is exported to userland in ar.k7. This
allows us to rewire the page without breaking the ABI.
The syscall stubs in libc are regular function calls that slightly
differ from the normal runtime. The difference is mostly to simplify
the stubs themselves by moving some of the logic to the kernel.
The libc stubs call into the gateway page (offset 0), from where the
kernel trampolines to the code that sets up a minimal trapframe and
arranges to execute from the kernel stack.
The way back is basically the same. The kernel returns to the gateway
page, whereby privilege is dropped, and jumps back to the syscall
stub.
Only the special registers are saved in the trapframe. None of the
scratch registers are preserved and since the kernel follows the
same runtime model, none of the preserved registers are saved.
Future enhancements can include the implementation of lightweight
syscalls, where kernel functions are performed without setting up
a trapframe. Good candidates are the *context syscalls for example.

Now that there's a gateway page from which code can be executed in
a non-privileged context, we also have the ideal place to put the
signal trampolines. By moving the signal trampolines from the user
stack to the gateway page, we open up the doors to unexecutable
stacks. The gateway page contains signal trampolines for both the
"legacy" break-based syscall code and the new and improved epc-
based syscall code.

Approved: re@ (blanket)
2003-05-15 07:51:22 +00:00
jhb
89a4eb17de - Merge struct procsig with struct sigacts.
- Move struct sigacts out of the u-area and malloc() it using the
  M_SUBPROC malloc bucket.
- Add a small sigacts_*() API for managing sigacts structures: sigacts_alloc(),
  sigacts_free(), sigacts_copy(), sigacts_share(), and sigacts_shared().
- Remove the p_sigignore, p_sigacts, and p_sigcatch macros.
- Add a mutex to struct sigacts that protects all the members of the struct.
- Add sigacts locking.
- Remove Giant from nosys(), kill(), killpg(), and kern_sigaction() now
  that sigacts is locked.
- Several in-kernel functions such as psignal(), tdsignal(), trapsignal(),
  and thread_stopped() are now MP safe.

Reviewed by:	arch@
Approved by:	re (rwatson)
2003-05-13 20:36:02 +00:00
kan
9328ad6bf8 Style fixes.
Remove DBL_DIG, DBL_MIN, DBL_MAX and their FLT_ counterparts, they
were marked for deprecation ever since SUSv1 at least.
Only define ULLONG_MIN/MAX and LLONG_MAX if long long type is
supported.
Restore a lost comment in MI _limits.h file and remove it from
sys/limits.h where it does not belong.
2003-05-04 22:13:04 +00:00
marcel
750391fd90 Fix c99 victim: the accepted character '0 must now be typed as '0'. 2003-05-03 23:05:16 +00:00
marcel
6b35b9cda0 Option KADB does not exist. It came from alpha, where it still exists. 2003-05-02 20:34:15 +00:00
marcel
02105f8221 Kill MID_MACHINE, it's a.out specific, the only platform that supports
it is i386. All of the other platforms should remove it too.
	-- peter@
2003-04-30 23:16:33 +00:00
jhb
09adcd8b3e Range check the syscall number before looking it up in the syscallnames[]
array.

Submitted by:	pho
2003-04-30 17:59:27 +00:00
kan
9468fdaf14 Deprecate machine/limits.h in favor of new sys/limits.h.
Change all in-tree consumers to include <sys/limits.h>

Discussed on:	standards@
Partially submitted by: Craig Rodrigues <rodrigc@attbi.com>
2003-04-29 13:36:06 +00:00
marcel
397de08118 Revamp the newbus functions:
o  do not use the in* and out* functions. These functions are used by
   legacy drivers and thus must have ia32 compatible behaviour. Hence,
   they need to have fences. Using these functions for newbus would
   then pessimize performance.
o  remove the conditional compilation of PIO and/or MEMIO support. It's
   a PITA without having any significant benefit. We always support them
   both. Since there are no I/O ports on ia64 (they are simulated by the
   chipset by translating memory mapped I/O to predefined uncacheable
   memory regions) the only difference between PIO and MEMIO is in the
   address calculation. There should be enough ILP that can be exploited
   here that making these computations compile-time conditional is not
   worth it. We now also don't use the read* and write* functions.
o  Add the missing *_8 variants. They were missing, although not missed.
   It's for completeness.
o  Do not add the fences that were present in the low-level support
   functions here. We're using uncacheable memory, which means that
   accesses are in program order. Change the barrier implementation
   to not only do a memory fence, but also an acceptance fence. This
   should more reliably synchronize drivers with the hardware. The
   memory fence enforces ordering, but does not imply visibility (ie
   the access does not necessarily have happened). This is what the
   acceptance deals with.

cpufunc.h cleanup:
o  Remove the low-level memory mapped I/O support functions. They are
   not used. Keep the low-level I/O port access functions for legacy
   drivers and add fences to ensure ia32 compatibility.
o  Remove the syscons specific functions now that we have moved the
   proper definitions where they belong.
o  Replace the ia64_port_address() and ia64_memory_address() functions
   with macros. There's a bigger chance inline functions get inlined
   when there aren't function calls and the calculations are simple
   enough to do it with macros.

Replace the one reference to ia64_memory_address() in mp_machdep.c to
use the macro.
2003-04-29 09:50:03 +00:00
jhb
ec7071fcb8 - Push down Giant into the sysarch() calls that still need Giant.
- Standardize on EINVAL rather than EOPNOTSUPP if the sysarch op value is
  invalid.
2003-04-25 20:04:02 +00:00
jhb
dcf45ed625 Regen. 2003-04-25 15:59:44 +00:00
jhb
011ef0f3d8 Oops, the thr_* and jail_attach() syscall entries should be NOPROTO rather
than STD.
2003-04-25 15:59:18 +00:00
deischen
3d51b3a280 Add an argument to get_mcontext() which specifies whether the
syscall return values should be cleared.  The system calls
getcontext() and swapcontext() want to return 0 on success
but these contexts can be switched to at a later time so
the return values need to be cleared in the saved register
sets.  Other callers of get_mcontext() would normally want
the context without clearing the return values.

Remove the i386-specific context saving from the KSE code.
get_mcontext() is not i386-specific any more.

Fix a bad pointer in the alpha get_mcontext() code.  The
context was being bcopy()'d from &td->tf_frame, but tf_frame
is itself a pointer, so the thread was being copied instead.
Spotted by jake.

Glanced at by:  jake
Reviewed by:    bde (months ago)
2003-04-25 01:50:30 +00:00
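The pointer mistake fixed in the alpha get_mcontext() code above is easy to make whenever a structure member is itself a pointer. The sketch below uses invented types (struct demo_thread, struct demo_frame) rather than the kernel's, and memcpy() in place of bcopy(); it shows only the corrected form, with the buggy form left in a comment because running it would read past the end of the thread structure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct demo_frame {
	long regs[4];
};

struct demo_thread {
	int td_id;
	struct demo_frame *td_frame;	/* a pointer, not an embedded frame */
};

/*
 * Buggy form (do not use):
 *	memcpy(out, &td->td_frame, sizeof(*out));
 * That copies starting at the address of the pointer member, so the
 * bytes of the thread structure are copied instead of the frame.
 */
static void
demo_get_frame(const struct demo_thread *td, struct demo_frame *out)
{
	assert(td != NULL && td->td_frame != NULL);
	/* Correct: copy what the pointer refers to. */
	memcpy(out, td->td_frame, sizeof(*out));
}
```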
jhb
968ad4dbc6 Regen. 2003-04-24 20:50:57 +00:00
jhb
4d35246c8d Fix the thr_create() entry by adding a trailing \. Also, sync up the
MP safe flag for thr_* with the main table.
2003-04-24 20:49:46 +00:00
kan
b86b779077 Add a new sys/limits.h file which in turn depends on machine/_limits.h
to get actual constant values. This is in preparation for machine/limits.h
retirement.

Discussed on:	standards@
Submitted by:	Craig Rodrigues <rodrigc@attbi.com>  (*)
Modified by:	kan
2003-04-23 21:41:59 +00:00
jhb
146e8aecec - Replace inline implementations of sigprocmask() with calls to
kern_sigprocmask() in the various binary compatibility emulators.
- Replace calls to sigsuspend(), sigaltstack(), sigaction(), and
  sigprocmask() that used the stackgap with calls to the corresponding
  kern_sig*() functions instead without using the stackgap.
2003-04-22 18:23:49 +00:00
davidxu
4f1ed41d01 Remove the single-threading detection code. This code really should be
replaced by thread_user_enter(), but currently we don't want to enable
this in trap.
2003-04-22 03:17:41 +00:00
marcel
bbf5306d5f Don't use the tpa instruction to implement pmap_kextract. The tpa
instruction requires that a translation is present in the TC. This
may trigger a TLB miss and a subsequent call to vm_fault().
This implementation is deliberately non-inline for debugging and
profiling purposes. Partial or full inlining should eventually be
done.

Valuable insights by: jake
2003-04-22 01:48:43 +00:00
simokawa
9f7fbe4b69 Add FireWire drivers to GENERIC. 2003-04-21 16:44:05 +00:00
jhb
8b7a3b47d1 Use the proc lock to protect p_singlethread and a P_WEXIT test. This
fixes a couple of potential KSE panics on non-i386 arch's that weren't
holding the proc lock when calling thread_exit().
2003-04-18 20:20:00 +00:00
marcel
64dcd8e861 Add the EHCI host controller. 2003-04-16 01:29:08 +00:00
mux
37f577805d I deserve a big pointy hat for having missed all those references
to bus_dmasync_op_t in my last commit.
2003-04-10 23:50:06 +00:00
mux
ea793948f7 Change the operation parameter of bus_dmamap_sync() from an
enum to an int and redefine the BUS_DMASYNC_* constants as
flags.  This allows us to specify several operations in one
call to bus_dmamap_sync() as in NetBSD.
2003-04-10 23:03:33 +00:00
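What the switch from an enum to flag bits buys can be shown with a toy dispatcher. The DEMO_DMASYNC_* names and their values are illustrative assumptions, not the values in the busdma headers, and demo_dmamap_sync() merely counts the operations it was asked to perform.

```c
#include <assert.h>

/* Illustrative flag values; each operation gets its own bit. */
#define DEMO_DMASYNC_PREREAD	0x1
#define DEMO_DMASYNC_POSTREAD	0x2
#define DEMO_DMASYNC_PREWRITE	0x4
#define DEMO_DMASYNC_POSTWRITE	0x8
#define DEMO_DMASYNC_ALL	0xf

/* Handle every requested operation in one call; return how many ran. */
static int
demo_dmamap_sync(int op)
{
	int steps = 0;

	assert((op & ~DEMO_DMASYNC_ALL) == 0);
	if (op & DEMO_DMASYNC_PREREAD)
		steps++;	/* e.g. invalidate before the device writes */
	if (op & DEMO_DMASYNC_PREWRITE)
		steps++;	/* e.g. flush before the device reads */
	if (op & DEMO_DMASYNC_POSTREAD)
		steps++;
	if (op & DEMO_DMASYNC_POSTWRITE)
		steps++;
	return (steps);
}
```

With an enum parameter the caller would need two calls for a buffer that is both read and written by the device; with flags, `demo_dmamap_sync(DEMO_DMASYNC_PREREAD | DEMO_DMASYNC_PREWRITE)` covers both.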
mike
75859ca578 o In struct prison, add an allprison linked list of prisons (protected
by allprison_mtx), a unique prison/jail identifier field, two path
  fields (pr_path for reporting and pr_root vnode instance) to store
  the chroot() point of each jail.
o Add jail_attach(2) to allow a process to bind to an existing jail.
o Add change_root() to perform the chroot operation on a specified
  vnode.
o Generalize change_dir() to accept a vnode, and move namei() calls
  to callers of change_dir().
o Add a new sysctl (security.jail.list) which is a group of
  struct xprison instances that represent a snapshot of active jails.

Reviewed by:	rwatson, tjr
2003-04-09 02:55:18 +00:00
des
567ac2b268 Introduce an M_ASSERTPKTHDR() macro which performs the very common task
of asserting that an mbuf has a packet header.  Use it instead of hand-
rolled versions wherever applicable.

Submitted by:	Hiten Pandya <hiten@unixdaemons.com>
2003-04-08 14:25:47 +00:00
marcel
8b0dbc5c03 Remove COMPAT_FREEBSD4. It's impossible because FreeBSD 4 does not
run on ia64 at all.
2003-04-08 08:32:00 +00:00
marcel
6246b15b31 Remove the 32KB VHPT section from the kernel image. We don't really
use it because we allocate a VHPT based on the size of the physical
memory and even if the allocated VHPT is 32KB, we don't use the in-
image section for it. Since the VHPT must be naturally aligned, we
save 48K on average (due to alignment).
Consequently, we start off with the VHPT disabled (it is assumed
the VHPT is disabled because the EFI loader runs without memory
address translation and thus has no need to setup the VHPT). It's
probably a good idea to explicitly disable the VHPT if we make the
use of the VHPT optional.
2003-04-06 21:31:26 +00:00
marcel
5dd3e0b439 Also set the access bit in the PTE when we get a data dirty bit fault.
This avoids an immediate access bit fault when we serviced the dirty
bit fault in case the access bit is unset. This typically happens for
newly allocated memory that's being zeroed and thus very common.
2003-04-06 05:55:36 +00:00
marcel
437a4f18b2 Include <geom/geom_disk.h> and stop including <sys/disk.h>. The
former gives us 'struct disk'.
2003-04-05 21:14:05 +00:00
des
5468286a89 Define ovbcopy() as a macro which expands to the equivalent bcopy() call,
to take care of the KAME IPv6 code which needs ovbcopy() because NetBSD's
bcopy() doesn't handle overlap like ours.

Remove all implementations of ovbcopy().

Previously, bzero was a function pointer on i386, to save a jmp to
bzero_vector.  Get rid of this microoptimization as it only confuses
things, adds machine-dependent code to an MD header, and doesn't really
save all that much.

This commit does not add my pagezero() / pagecopy() code.
2003-04-04 17:29:55 +00:00
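The ovbcopy() change above amounts to a one-line macro. The sketch below is a userland approximation: the macro follows the description in the commit message, while demo_shift_right() is a hypothetical helper added only to exercise an overlapping copy. It relies on bcopy() handling overlap, which is the property the commit message attributes to FreeBSD's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h>	/* bcopy() */

/* ovbcopy() expands to the equivalent bcopy() call. */
#define ovbcopy(f, t, l)	bcopy((f), (t), (l))

/*
 * Shift the first (len - by) bytes of buf right by `by` positions.
 * Source and destination overlap, which is exactly the case
 * ovbcopy() exists for.
 */
static void
demo_shift_right(char *buf, size_t len, size_t by)
{
	assert(by <= len);
	ovbcopy(buf, buf + by, len - by);
}
```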
phk
c235e25328 Use bioq_flush() to drain a bio queue with a specific error code.
Retain the mistake of not updating the devstat API for now.

Spell bioq_disksort() consistently with the remaining bioq_*().

#include <geom/geom_disk.h> where this is more appropriate.
2003-04-01 15:06:26 +00:00
jeff
5f8f1497c8 - Add thr and umtx system calls. 2003-04-01 01:15:56 +00:00
jeff
420a77ecd5 - Define a new md function 'casuptr'. This atomically compares and sets
a pointer that is in user space.  It will be used as the basic primitive
   for a kernel supported user space lock implementation.
 - Implement this function in x86's support.s
 - Provide stubs that return -1 in all other architectures.  Implementations
   will follow along shortly.

Reviewed by:	jake
2003-04-01 00:18:55 +00:00
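The semantics of an atomic compare-and-set on a pointer-sized word, as described for casuptr above, can be sketched with C11 atomics. This is a userland illustration only: demo_casuptr() is a made-up name, it does not deal with faults on user addresses, and the return convention (the value observed at *p) is an assumption made for this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Atomically replace *p with newval if it currently equals old.
 * Returns the value that was observed at *p, so the store happened
 * if and only if the return value equals old.
 */
static intptr_t
demo_casuptr(_Atomic intptr_t *p, intptr_t old, intptr_t newval)
{
	intptr_t expected = old;

	assert(p != NULL);
	/* On failure, expected is overwritten with the current value. */
	atomic_compare_exchange_strong(p, &expected, newval);
	return (expected);
}
```

A lock built on such a primitive would treat "return value equals old" as having acquired the lock word, and retry or sleep otherwise.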
jeff
23844ff023 - Add a placeholder for sigwait 2003-03-31 23:36:40 +00:00
jeff
46e6ba39f1 - Move p->p_sigmask to td->td_sigmask. Signal masks will be per thread with
a follow on commit to kern_sig.c
 - signotify() now operates on a thread since unmasked pending signals are
   stored in the thread.
 - PS_NEEDSIGCHK moves to TDF_NEEDSIGCHK.
2003-03-31 22:49:17 +00:00
jeff
4a3718fb25 - Change trapsignal() to accept a thread and not a proc.
- Change all consumers to pass in a thread.

Right now this does not cause any functional changes but it will be important
later when signals can be delivered to specific threads.
2003-03-31 22:02:38 +00:00
jeff
848087b9b0 - Use sigexit() instead of twiddling the signal mask, catch, ignore, and
action bits to allow SIGILL to work as expected.  This brings this file in
   line with other architectures.
2003-03-31 21:40:47 +00:00
das
72b54236b9 Correct LDBL_* constants based on values from i386. 2003-03-27 20:38:22 +00:00
jake
783ae539c3 - Add vm_paddr_t, a physical address type. This is required for systems
where physical addresses are larger than virtual addresses, such as i386s
  with PAE.
- Use this to represent physical addresses in the MI vm system and in the
  i386 pmap code.  This also changes the paddr parameter to d_mmap_t.
- Fix printf formats to handle physical addresses >4G in the i386 memory
  detection code, and due to kvtop returning vm_paddr_t instead of u_long.

Note that this is a name change only; vm_paddr_t is still the same as
vm_offset_t on all currently supported platforms.

Sponsored by:	DARPA, Network Associates Laboratories
Discussed with:	re, phk (cdevsw change)
2003-03-25 00:07:06 +00:00
ru
3e93151335 Remove bitrot associated with `maxusers'.
Submitted by:	bde
2003-03-22 14:18:23 +00:00
mux
0977536812 Use atomic operations to increment and decrement the refcount
in busdma tags.  There are currently no tags shared across
different drivers so this isn't needed at the moment, but it
will be required when we'll have a proper newbus method to get
the parent busdma tag.
2003-03-20 19:45:26 +00:00
jake
c1838df603 Made the prototypes for pmap_kenter and pmap_kremove MD. These functions
are machine dependent because they are not required to update the tlb when
mappings are added or removed, and doing so is machine dependent.
In addition, an implementation may require that pages mapped with pmap_kenter
have a backing vm_page_t, which is not necessarily true of all physical
pages, and so may choose to pass the vm_page_t to pmap_kenter instead of the
physical address in order to make this requirement clear.
2003-03-16 04:16:03 +00:00
mux
700905b523 Bah, get it right this time and add sys/lock.h before sys/mutex.h. 2003-03-14 13:30:31 +00:00
mux
48ca93061d Oops, add missing includes. Pass me the pointy hat.
Reported by:	jake
2003-03-14 00:04:37 +00:00
mux
15b2d31e35 Grab Giant around calls to contigmalloc() and contigfree() so
that drivers converted to be MP safe don't have to deal with it.
2003-03-13 17:18:48 +00:00
mux
d3ce48cb48 Memory allocated with contigmalloc() should be freed with
contigfree(), not with free().
2003-03-13 17:10:54 +00:00
marcel
14ed623069 Fix two rounds of breakages and cleanup. Remove the sccdebug sysctl
while I'm here and garbage collect dead code (ssc_clone). Define
d_maxsize as DFLTPHYS for now because that's what it will be if we
don't define it.
2003-03-10 01:58:31 +00:00
phk
e01fc931cf Centralize the devstat handling for all GEOM disk device drivers
in geom_disk.c.

As a side effect this makes a lot of #include <sys/devicestat.h>
lines not needed and some biofinish() calls can be reduced to
biodone() again.
2003-03-08 08:01:31 +00:00
marcel
d4ee62b07a Fix threaded applications on ia64 that are linked dynamically. We did
not save (restore) the global pointer (GP) in the jmpbuf in setjmp
(longjmp) because it's not needed in general. GP is considered a
scratch register at callsites and hence is always restored after a
call (when it's possible that the call resolves to a symbol in a
different loadmodule; otherwise GP does not have to be saved and
restored at all), including calls to setjmp/longjmp. There's just
one problem with this now that we use setjmp/longjmp for context
switching: A new context must have GP defined properly for the
thread's entry point. This means that we need to put GP in the
jmpbuf and consequently that we have to restore it in longjmp.
This automatically requires us to save it as well.

When setjmp/longjmp isn't used for context switching, this can be
reverted again.
2003-03-05 04:39:24 +00:00
marcel
55f069454e ABI breaker: Move the J_SIGMASK field in the jmpbuf before
the J_SIG0 field. While here, rename J_SIG0 to J_SIGSET and
remove J_SIG1. The main reason for this change is that the
128-bit sigset_t is now aligned on a 16-byte boundary, which
allows us to use 16-byte atomic loads and stores on CPUs that
support it. The removal of J_SIG1 is done to avoid confusion:
it is never accessed and should not be. Renaming J_SIG0 to
J_SIGSET is the icing on the cake that's better done now than
later.
2003-03-05 03:30:54 +00:00
jhb
e4bcd25517 Replace calls to WITNESS_SLEEP() and witness_list() with equivalent calls
to WITNESS_WARN().
2003-03-04 21:03:05 +00:00
phk
0ae911eb0e Gigacommit to improve device-driver source compatibility between
branches:

Initialize struct cdevsw using C99 sparse initialization and remove
all initializations to default values.

This patch is automatically generated and has been tested by compiling
LINT with all the fields in struct cdevsw in reverse order on alpha,
sparc64 and i386.

Approved by:    re(scottl)
2003-03-03 12:15:54 +00:00
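C99 sparse (designated) initialization, as used in the cdevsw commit above, names only the members that differ from the default and leaves the rest zeroed, so the initializer keeps working if members are reordered or added. The structure below is a cut-down, hypothetical stand-in for struct cdevsw, not the real layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced switch table. */
struct demo_cdevsw {
	int		(*d_open)(void);
	int		(*d_close)(void);
	int		(*d_read)(void);
	const char	*d_name;
	int		d_maj;
};

static int
demo_open(void)
{
	return (0);
}

/* Only non-default members are named; the rest become NULL or 0. */
static const struct demo_cdevsw demo_sw = {
	.d_open =	demo_open,
	.d_name =	"demo",
};

/* Callers fall back to a default when a method was left unset. */
static int
demo_dev_close(const struct demo_cdevsw *sw)
{
	assert(sw != NULL);
	return (sw->d_close != NULL ? sw->d_close() : 0);
}
```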
alc
821fc9fc7a MFi386 revision 1.88
Remove some long unused declarations.
2003-03-01 10:02:11 +00:00
davidxu
f7f8ecaac2 Needn't kse.h 2003-02-27 03:16:35 +00:00
julian
3fc9836d46 Change the process flags P_KSES to be P_THREADED.
This is just a cosmetic change but I've been meaning to do it for about a year.
2003-02-27 02:05:19 +00:00
mux
186c547c81 Correctly set BUS_SPACE_MAXSIZE in all the busdma backends.
It was bogusly set to 64 * 1024 or 128 * 1024 because it was
bogusly reused in the BUS_DMAMAP_NSEGS definition.
2003-02-26 02:16:06 +00:00
mux
541937cf73 Cleanup of the d_mmap_t interface.
- Get rid of the useless atop() / pmap_phys_address() detour.  The
  device mmap handlers must now give back the physical address
  without atop()'ing it.
- Don't borrow the physical address of the mapping in the returned
  int.  Now we properly pass a vm_offset_t * and expect it to be
  filled by the mmap handler when the mapping was successful.  The
  mmap handler must now return 0 when successful, any other value
  is considered as an error.  Previously, returning -1 was the only
  way to fail.  This change thus accidentally fixes some devices
  which were bogusly returning errno constants which would have been
  considered as addresses by the device pager.
- Garbage collect the poorly named pmap_phys_address() now that it's
  no longer used.
- Convert all the d_mmap_t consumers to the new API.

I'm still not sure whether we need a __FreeBSD_version bump for this,
since we didn't guarantee API/ABI stability until 5.1-RELEASE.

Discussed with:		alc, phk, jake
Reviewed by:		peter
Compile-tested on:	LINT (i386), GENERIC (alpha and sparc64)
Runtime-tested on:	i386
2003-02-25 03:21:22 +00:00
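The calling convention described in the d_mmap_t cleanup above (return 0 on success and hand back the physical address through a pointer) can be sketched as follows. The handler, the address type and the device window are all hypothetical; the real handler signature has more parameters.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t demo_paddr_t;		/* stand-in for the address type */

#define DEMO_DEV_BASE	UINT64_C(0xfe000000)	/* made-up device window */
#define DEMO_DEV_SIZE	UINT64_C(0x10000)

/*
 * New-style handler: the return value is purely an error code and the
 * address travels through *paddr, so an errno value can no longer be
 * mistaken for an address.
 */
static int
demo_mmap(uint64_t offset, demo_paddr_t *paddr)
{
	assert(paddr != NULL);
	if (offset >= DEMO_DEV_SIZE)
		return (EINVAL);
	*paddr = DEMO_DEV_BASE + offset;	/* no atop() conversion */
	return (0);
}
```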
phk
72688ad7fe Change the console interface to pass a "struct consdev *" instead of a
dev_t to the method functions.

The dev_t can still be found at struct consdev *->cn_dev.

Add a void *cn_arg element to struct consdev which the drivers can use
for retrieving their softc.
2003-02-20 20:54:45 +00:00
imp
cf874b345d Back out M_* changes, per decision of the TRB.
Approved by: trb
2003-02-19 05:47:46 +00:00
julian
8900ca0cc3 Fix missed patch in last commit 2003-02-17 10:21:32 +00:00
julian
af55753a06 Move a bunch of flags from the KSE to the thread.
I was in two minds as to where to put them in the first case..
I should have listened to the other mind.

Submitted by:	 parts by davidxu@
Reviewed by:	jeff@ mini@
2003-02-17 09:55:10 +00:00
marcel
653bc68f53 Define _ALIGNBYTES to be 15. This should have been done right away. 2003-02-17 09:53:29 +00:00
marcel
a75d478b87 Print two new processor features:
o  Spontaneous deferral (A feature required by dutch railways :-)
o  16-byte atomic operations (ld, st, cmpxchg)
2003-02-17 08:17:26 +00:00
jeff
590a39e29b - Split the struct kse into struct upcall and struct kse. struct kse will
soon be visible only to schedulers.  This greatly simplifies much of the
   KSE code.

Submitted by:	davidxu
2003-02-17 05:14:26 +00:00
jeff
aa384c931f - Move ke_sticks, ke_iticks, ke_uticks, ke_uu, ke_su, and ke_iu back into
the proc.  These counters are only examined through calcru.

Submitted by:	davidxu
Tested on:	x86, alpha, UP/SMP
2003-02-17 02:19:58 +00:00
phk
4bfb37f22e Remove #include <sys/dkstat.h> 2003-02-16 14:13:23 +00:00
marcel
1141b745fe Fix misuse of Maxmem in the calculation of the VHPT size. Maxmem
is already in pages, so we should not convert from bytes to pages.
The result of this bug was bad scaling of the VHPT relative to the
available memory.

Submitted by: Arun Sharma <arun@sharma-home.net>
2003-02-15 20:58:32 +00:00
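The unit mix-up behind the Maxmem fix above is a plain pages-versus-bytes error. The sketch below does not reproduce the VHPT sizing heuristic; it only shows how applying a bytes-to-pages conversion to a value that is already a page count shrinks it by a factor of the page size. The 8KB page size and the demo_* names are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT	13	/* assumed 8KB pages */

static uint64_t
demo_pages_to_bytes(uint64_t pages)
{
	assert(pages <= (UINT64_MAX >> DEMO_PAGE_SHIFT));
	return (pages << DEMO_PAGE_SHIFT);
}

static uint64_t
demo_bytes_to_pages(uint64_t bytes)
{
	return (bytes >> DEMO_PAGE_SHIFT);
}
```

For a machine with 131072 pages (1GB at 8KB per page), treating that count as bytes and converting it again yields 16 "pages", which is the kind of bad scaling the commit message describes.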
obrien
85bd6322e6 Fix the style of the SCHED_4BSD commit. 2003-02-13 22:24:44 +00:00
alc
f1cd81fb95 MFi386
Remove kptobj.  Instead, use VM_ALLOC_NOOBJ.
2003-02-13 07:03:44 +00:00
mike
b4e3f2f94a Implement fpclassify():
o Add a MD header private to libc called _fpmath.h; this header
  contains bitfield layouts of MD floating-point types.
o Add a MI header private to libc called fpmath.h; this header
  contains bitfield layouts of MI floating-point types.
o Add private libc variables to lib/libc/$arch/gen/infinity.c for
  storing NaN values.
o Add __double_t and __float_t to <machine/_types.h>, and provide
  double_t and float_t typedefs in <math.h>.
o Add some C99 manifest constants (FP_ILOGB0, FP_ILOGBNAN, HUGE_VALF,
  HUGE_VALL, INFINITY, NAN, and return values for fpclassify()) to
  <math.h> and others (FLT_EVAL_METHOD, DECIMAL_DIG) to <float.h> via
  <machine/float.h>.
o Add C99 macro fpclassify() which calls __fpclassify{d,f,l}() based
  on the size of its argument.  __fpclassifyl() is never called on
  alpha because (sizeof(long double) == sizeof(double)), which is good
  since __fpclassifyl() can't deal with such a small `long double'.

This was developed by David Schultz and myself with input from bde and
fenner.

PR:		23103
Submitted by:	David Schultz <dschultz@uclink.Berkeley.EDU>
		(significant portions)
Reviewed by:	bde, fenner (earlier versions)
2003-02-08 20:37:55 +00:00
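The size-based dispatch mentioned for the fpclassify() macro can be sketched as follows. This is an illustration under assumptions, not the libc implementation: the three helper functions are hypothetical stand-ins for __fpclassify{f,d,l}() and return a tag naming the helper that was picked instead of a real classification.

```c
/* Tags naming which hypothetical helper the macro selected. */
enum { SK_FLOAT = 1, SK_DOUBLE, SK_LDOUBLE };

static int sk_classify_f(float x) { (void)x; return (SK_FLOAT); }
static int sk_classify_d(double x) { (void)x; return (SK_DOUBLE); }
static int sk_classify_l(long double x) { (void)x; return (SK_LDOUBLE); }

/*
 * Pick a helper from the size of the argument.  When long double has the
 * same size as double, the double helper is chosen and the long double
 * helper is never reached.
 */
#define	sketch_dispatch(x)						\
	((sizeof(x) == sizeof(float)) ? sk_classify_f(x) :		\
	 (sizeof(x) == sizeof(double)) ? sk_classify_d(x) :		\
	 sk_classify_l(x))
```

Because the double test comes before the long double fallback, a platform where the two types have the same size never calls the long double helper, which matches the alpha remark in the commit message.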
harti
570156f24c Fix a problem in bus_dmamap_load_{mbuf,uio} when the first mbuf or the first
uio segment is empty. In this case no dma segment is created by
bus_dmamap_load_buffer, but the calling routine clears the first flag.
Under certain combinations of addresses of the first and second mbuf/uio
buffer this leads to corrupted DMA segment descriptors. This was already
fixed by tmm in sparc64/sparc64/iommu.c.

PR:		kern/47733
Reviewed by:	sam
Approved by:	jake (mentor)
2003-02-04 16:30:27 +00:00
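The problem with the "first" flag can be sketched in isolation. The structure and function names below are hypothetical stand-ins for an mbuf chain and its loader, and the fix shown (clearing the flag only after a non-empty buffer has been loaded) is an assumption about the shape of the change, not a copy of it.

```c
#include <stddef.h>

/* Hypothetical stand-in for an mbuf: a length and a next pointer. */
struct sketch_buf {
	size_t len;
	struct sketch_buf *next;
};

/* Buggy: "first" is cleared even when the buffer was empty and skipped. */
static int
sketch_first_index_buggy(const struct sketch_buf *b)
{
	int first = 1, idx = -1, i;

	for (i = 0; b != NULL; b = b->next, i++) {
		if (b->len > 0 && first)
			idx = i;	/* loaded as the first segment */
		first = 0;
	}
	return (idx);
}

/* Fixed: "first" stays set until a non-empty buffer has been loaded. */
static int
sketch_first_index_fixed(const struct sketch_buf *b)
{
	int first = 1, idx = -1, i;

	for (i = 0; b != NULL; b = b->next, i++) {
		if (b->len == 0)
			continue;	/* nothing to load; keep "first" set */
		if (first)
			idx = i;
		first = 0;
	}
	return (idx);
}
```

Each function returns the index of the buffer that is loaded with the flag still set, or -1 if none is. For a chain whose first buffer is empty, the buggy variant never treats any buffer as the first one, so later buffers are appended to a segment that was never started.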
jake
6b3763a173 Split statclock into statclock and profclock, and made the method for driving
statclock based on profhz when profiling is enabled MD, since most platforms
don't use this anyway.  This removes the need for statclock_process, whose
only purpose was to subdivide profhz, and gets the profiling clock running
outside of sched_lock on platforms that implement suswintr.
Also changed the interface for starting and stopping the profiling clock to
do just that, instead of changing the rate of statclock, since they can now
be separate.

Reviewed by:	jhb, tmm
Tested on:	i386, sparc64
2003-02-03 17:53:15 +00:00
marcel
75bf49b53c Don't use the 'c' partition for mounting root. A disklabel is very
likely not present under the simulator. If multiple partitions are
present on the virtual disk, then the 'a' partition would be the
most logical choice. Nowadays partitions are GPT based, which would
make the assumption of a disklabel even more questionable. Given
all the possible scenarios, assuming a raw "device" seems best.
2003-02-03 01:10:01 +00:00
alfred
b5c0015ac9 Consolidate MIN/MAX macros into one place (param.h).
Submitted by: Hiten Pandya <hiten@unixdaemons.com>
2003-02-02 13:17:30 +00:00
phk
0e3c8673cc We don't need sscopen() and sscclose().
Register sscstrategy directly, instead of using a cdevsw{} for the purpose.

Tested by:	marcel
2003-02-02 10:22:34 +00:00
marcel
3ddb056ffb Export IA32 from opt_ia32.h to assembly so that we can eliminate
saving and restoring ia32 specific registers when switching
context and ia32 support has not been compiled-in. The primary
reason for this change is that one of the ia32 registers (ar.fcr)
is wrongly marked as invalid by the simulator. Now that we avoid
using the register when possible, usability is improved. The
secondary reason is that it saves us 7 loads and stores.

Note that the PCB will continue to have room for these registers,
irrespective of the IA32 option. There are no benefits that make
it worthwhile.
2003-02-02 09:07:15 +00:00
marcel
389e4c3a2a Remove special casing for running in the simulator from the kernel
and instead add platform, firmware and EFI stubs to the loader.
The net effect of this change is that besides a special console and
disk driver, the kernel has no knowledge of the simulator. This has
the following advantages:
o  Simulator support is much harder to break,
o  It's easier to make use of more feature complete simulators.
   This would only need a change in the simulator specific loader,
o  Running SMP kernels within the simulator. Note that ski at this
   time does not simulate IPIs, so there's no way to start APs.

The platform, firmware and EFI stubs describe the following hardware:
o  4 CPU Itanium,
o  128 MB RAM within the 4GB address space,
o  64 MB RAM above the 4GB address space.

NOTE: The stubs in the skiloader describe a machine that should in
parts be defined by the simulator. Things like processor interrupt
block and AP wakeup vector cannot be chosen at random because they
require interpretation by the simulator. Currently the simulator is
ignorant of this.

This change introduces an unofficial SSC call SSC_SAL_SET_VECTORS
which is ignored by the simulator.

Tested with: ski (version 0.943 for linux)
2003-02-01 22:50:09 +00:00
joe
457d099669 Replace spaces with tabs in keeping with the rest of the file. 2003-02-01 18:45:18 +00:00
julian
e8efa7328e Reversion of commit by Davidxu plus fixes since applied.
I'm not convinced there is anything major wrong with the patch but
them's the rules..

I am using my "David's mentor" hat to revert this as he's
offline for a while.
2003-02-01 12:17:09 +00:00
phk
ae21d4debd Remove D_CANFREE from sscdisk.c.
I believe it got here by copy&paste and I see no signs in the source
code that BIO_DELETE was dealt with correctly and can only wonder
what kind of trouble this may have caused.
2003-01-30 11:48:50 +00:00
julian
0558ae7971 Unbreak SMP cases for these architectures.
statclock_process() changed arguments.
note: it may be worth checking if curkse is needed on these architectures..
(and if so, why?)
2003-01-27 00:00:06 +00:00
davidxu
4b9b549ca2 Move UPCALL related data structures out of kse and introduce a new
data structure called kse_upcall to manage UPCALL. All KSE binding
and loaning code is gone.

A thread that owns an upcall can collect all completed syscall contexts in
its ksegrp, turn itself into UPCALL mode, and take those contexts back
to userland. Any thread without an upcall structure has to export its
context and exit at the user boundary.

Any thread running in user mode owns an upcall structure. When it enters
the kernel and the kse mailbox's current thread pointer is not NULL, then
when the thread blocks in the kernel, a new UPCALL thread is created and
the upcall structure is transferred to the new UPCALL thread. If the kse
mailbox's current thread pointer is NULL, then no UPCALL thread is created
when the thread blocks in the kernel.

Each upcall always has an owner thread. Userland can remove an upcall by
calling kse_exit; when all upcalls in a ksegrp are removed, the group is
automatically shut down. An upcall owner thread also exits when the process
is in the exiting state. When an owner thread exits, the upcall it owns is
also removed.

KSE is a pure scheduler entity. It represents a virtual cpu. When a thread
is running, it always has a KSE associated with it. The scheduler is free to
assign a KSE to a thread according to thread priority; if thread priority
changes, the KSE can be moved from one thread to another.

When a ksegrp is created, N KSEs are always created in the group, where
N is the number of physical cpus in the current system. This makes it
possible for threads in the kernel to execute on different cpus in parallel
even when the userland UTS is only single-CPU safe. Userland calls kse_create
to add more upcall structures to a ksegrp to increase concurrency in userland
itself; the kernel is not restricted by the number of upcalls userland provides.

The code hasn't been tested under SMP by author due to lack of hardware.

Reviewed by: julian
2003-01-26 11:41:35 +00:00
jeff
8d3838b535 - Introduce the SCHED_ULE and SCHED_4BSD options for compile time selection
of the scheduler.
 - Add SCHED_4BSD as the scheduler for all kernel config files in cvs.
2003-01-26 05:29:12 +00:00
dfr
e15ccd7613 Fix pmap_extract so that it doesn't panic if the user types
'cat /proc/pid/map'

Submitted by: Arun Sharma <arun.sharma@intel.com>
2003-01-24 09:58:32 +00:00
alfred
bf8e8a6e8f Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
2003-01-21 08:56:16 +00:00
jeff
18309dfabe - Add a VM_WAIT in the appropriate cases where vm_page_alloc() fails and flags
indicate that uma_small_alloc should not.  This code should be refactored so
   that there is not so much cross arch duplication.

Reviewed by:	jake
Spotted by:	tmm
Tested on:	alpha, sparc64
Pointy hat to:	jeff and everyone who cut and pasted the bad code. :-)
2003-01-21 05:44:52 +00:00
jake
c1e42cc8bb Resolve relative relocations in klds before trying to parse the module's
metadata.  This fixes module dependency resolution by the kernel linker on
sparc64, where the relocations for the metadata are different than on other
architectures; the relative offset is in the addend of an Elf_Rela record
instead of the original value of the location being patched.
Also fix printf formats in debug code.

Submitted by:	Hartmut Brandt <brandt@fokus.gmd.de>
PR:		46732
Tested on:	alpha (obrien), i386, sparc64
2003-01-21 02:42:44 +00:00
phk
2b485632c1 We need neither <sys/diskslice.h> nor <sys/disklabel.h> here. 2003-01-20 11:11:51 +00:00
mux
baad1fb8f8 Don't try to free() map in bus_dmamap_destroy() when it's
set to &nobounce_dmamap.  A similar bug was fixed by wpaul
in revision 1.19 of sys/alpha/alpha/busdma_machdep.c.
2003-01-18 18:33:56 +00:00
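The guard described in this commit can be sketched as below. The type and names are hypothetical; the point is only that a pointer to a statically allocated sentinel map must never be handed to free().

```c
#include <stdlib.h>

/* Hypothetical stand-in for a DMA map. */
struct sketch_dmamap {
	int unused;
};

/* Statically allocated sentinel shared by mappings that need no bouncing. */
static struct sketch_dmamap sketch_nobounce_dmamap;

/*
 * Release a map.  Returns 1 if memory was freed, 0 if the map was NULL or
 * the static sentinel, which was never obtained from malloc().
 */
static int
sketch_dmamap_destroy(struct sketch_dmamap *map)
{
	if (map == NULL || map == &sketch_nobounce_dmamap)
		return (0);
	free(map);
	return (1);
}
```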
dillon
aba727244c Merge all the various copies of vm_fault_quick() into a single
portable copy.
2003-01-16 00:02:21 +00:00
dillon
bd6fdb8977 Merge all the various copies of vmapbuf() and vunmapbuf() into a single
portable copy.  Note that pmap_extract() must be used instead of
pmap_kextract().

This is precursor work to a reorganization of vmapbuf() to close remaining
user/kernel races (which can lead to a panic).
2003-01-15 23:54:35 +00:00
marcel
007da5a53d Move ia64_sapics and ia64_sapic_count from interrupt.c to sapic.c
and declare them extern in interrupt.c. This eliminates the need
for ia64_add_sapic(), which is called from sapic.c.
While here, reformat ia64_enable() in interrupt.c to improve
indentation and add a sysctl (machdep.apic) to dump the I/O APIC
entries currently programmed into all I/O APICs. The latter can
help analyze interrupt problems.
Note that the sysctl is not intended as a userland (software)
interface. It may be changed in the future to include counters
so that vmstat -i can make use of it. It may also be removed...
2003-01-06 02:09:08 +00:00
peter
1d946ebfc2 Move the itm reload to a single place rather than having two identical
copies of the reload.  Note that we use the precomputed itm_reload value
so that we can avoid a division in the kernel.  The ia64 cpu does not
have integer divide, so this would have been done by a floating point
operation.
2003-01-06 01:53:55 +00:00
marcel
584a6a5f1c Replace the hardcoding of 255 as the clock interrupt vector with
CLOCK_VECTOR and define it as 254, not 255. Vector 255 is already
in use as the AP wakeup vector on the HP rx2600.

This needs to be made more dynamic. The likelihood of vector 254
being in use is pretty small, but we already have code to assign
vectors to IPIs (see sal.c) and it's probably better to have a
centralized "vector manager" that hands out vectors based on
some input (like priority).
2003-01-06 01:39:25 +00:00
marcel
89da141f5e Manually inline handleclock(). There's only a single caller and
handleclock itself is trivial.

While here, replace (itc_frequency+hz/2)/hz with itm_reload for
consistency. There's now a single place where we determine the
ITM reload value.
2003-01-06 00:38:35 +00:00
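The expression (itc_frequency+hz/2)/hz quoted above is an integer division rounded to the nearest value. A minimal sketch, with a hypothetical function name and the computation done once so the result can be reused instead of dividing on every clock interrupt:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Interval counter ticks per clock interrupt, rounded to nearest.
 * Adding hz / 2 before dividing turns truncation into rounding.
 */
static uint64_t
sketch_itm_reload(uint64_t itc_frequency, uint64_t hz)
{
	assert(hz > 0);
	return ((itc_frequency + hz / 2) / hz);
}
```

For example, sketch_itm_reload(5, 2) yields 3, whereas plain 5 / 2 truncates to 2.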
marcel
95f59f22f7 Count interrupts as soon as possible. This makes sure interrupts are
counted even when there are no handlers.
2003-01-06 00:25:31 +00:00
marcel
d16b835650 Don't hardcode the address of the local (S)APIC (aka processor
interrupt block). We use the previously hardcoded address as a
default only, but will otherwise use whatever ACPI tells us.
The address can be found in the MADT table header or in the
LAPIC override table entry.
2003-01-05 22:14:30 +00:00
marcel
b94ae4e3db Bump the number of interrupts from 65 to 257. This is a waste of
space most of the time, but handles machines with lots of I/O
(S)APICs. We cannot make this more dynamic without breaking the
interface with vmstat. Hence, we need to fix the interface first.
2003-01-05 22:00:19 +00:00
marcel
1116de1ec0 Handle 3-digit interrupt numbers (vectors). While here, change the
name of unused entries from "intr XXX" to "#XXX". This makes it
easier to debug interrupt problems, because vmstat can be hacked
more easily to dump all interrupt entries that are in use and not
those that have had interrupts.
2003-01-05 21:48:33 +00:00
marcel
d692da2da6 Make all memory I/O addresses (explicitly) 64-bit. Memory mapped
devices aren't necessarily mapped within 4GB. I/O port addresses
are offsets into the memory mapped I/O port space, which is not
larger than 16MB. No need to convert those to 64 bit types.
2003-01-05 21:40:45 +00:00
marcel
260320dd29 Provide a null-implementation for bus_space_unmap, like i386.
bus_space_unmap is required for puc(4).
2003-01-05 21:34:05 +00:00
marcel
921fff3737 Adopt, adapt and improve:
o  Make the URL of the handbook match reality
o  Improve some comments (either wording or formatting)
o  Sync with i386: comment-out DDB, INVARIANTS, INVARIANT_SUPPORT
o  Add some more SCSI/RAID controllers:
	ahd, mpt, asr, ciss, dpt, iir, mly, ida
o  Remove support for the parallel port
o  Add NICs: em, bge
o  Remove NICs: ste, tl, tx, vr, wb
o  Enable USB support again, except for the UHCI host controller.
   UHCI still hangs the BigSur (=HP i2000) machines, and makes
   them useless. The OHCI controller works fine. Note that newer
   ia64 boxes based on the Intel host controllers (UHCI or EHCI)
   still won't have USB support. We really need to import the
   EHCI host controller from NetBSD...
2003-01-05 00:04:28 +00:00
alc
e3e519cb43 Hold the page queues lock around pmap_remove_pte() in pmap_enter().
Submitted by:	Arun Sharma <adsharma@unix-os.sc.intel.com>
2003-01-04 06:49:52 +00:00
marcel
9a4b1f6baa Make this build and sync-up:
o  Add COMPAT_FREEBSD4
o  Remove NO_GEOM
o  Remove commented out options.
2003-01-03 23:10:47 +00:00
schweikh
d3367c5f5d Correct typos, mostly s/ a / an / where appropriate. Some whitespace cleanup,
especially in troff files.
2003-01-01 18:49:04 +00:00
rwatson
22c41db3e5 Synchronize to kern/syscalls.master:1.139.
Obtained from:	TrustedBSD Project
2002-12-29 20:33:26 +00:00
tjr
0fa1ae4aca MB_LEN_MAX is not MD, move it to the MI limits.h. 2002-12-22 06:38:45 +00:00
marcel
5aac02aea3 More MFp4: DIG64 structures. 2002-12-18 18:52:20 +00:00
marcel
6317601ef1 Export the physical address of the RSDP to userland by means
of the `machdep.acpi_root' sysctl. This is required on ia64
because the root pointer hardly ever, if at all, lives in the
first MB of memory and also because scanning the first MB of
memory can cause machine checks.
This provides a safe and reliable way for ACPI tools to work
with the tables if ACPI support is present in the kernel. On
ia64 ACPI is non-optional.
2002-12-18 08:47:07 +00:00
marcel
527408760d Check that the dump device is large enough. Otherwise we could
end up with a dump offset that's smaller than the start of the
dump device and either clobber data in preceding partitions or
try to write beyond the end of the medium (unsigned wrap).

Implement legacy behaviour to never write to the first 64KB as
that is where metadata (ie disklabels) may reside.
2002-12-17 02:51:56 +00:00
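The size check and the unsigned wrap it prevents can be sketched as follows. The sketch assumes the dump is placed at the end of the device, which the commit message implies but does not spell out; the function and macro names are hypothetical.

```c
#include <stdint.h>

/* Leading area that is never written (metadata such as disklabels). */
#define	SKETCH_RESERVED	(64 * 1024)

/*
 * Compute where a dump of dump_size bytes would start when it is placed at
 * the end of a device that begins at media_offset and is media_size bytes
 * long.  Returns 0 and stores the offset on success, or -1 if the device
 * is too small.
 */
static int
sketch_dump_offset(uint64_t media_offset, uint64_t media_size,
    uint64_t dump_size, uint64_t *offset)
{
	/* Check before subtracting: unsigned subtraction would wrap around. */
	if (media_size < SKETCH_RESERVED ||
	    dump_size > media_size - SKETCH_RESERVED)
		return (-1);
	*offset = media_offset + media_size - dump_size;
	return (0);
}
```

Without the check, a dump larger than the device would make media_size - dump_size wrap to a huge value, and the resulting offset would land either before the start of the device or far beyond the end of the medium.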
marcel
3c86a795f0 Regen: swapoff 2002-12-16 00:49:36 +00:00
marcel
4451a382e7 Change swapoff from MNOPROTO to UNIMPL. The former doesn't work. 2002-12-16 00:48:52 +00:00
dillon
b43fb3e920 This is David Schultz's swapoff code which I am finally able to commit.
This should be considered highly experimental for the moment.

Submitted by:	David Schultz <dschultz@uclink.Berkeley.EDU>
MFC after:	3 weeks
2002-12-15 19:17:57 +00:00
alfred
d070c0a52d SCARGS removal take II. 2002-12-14 01:56:26 +00:00
alfred
4f48184fb2 Backout removal SCARGS, the code freeze is only "selectively" over. 2002-12-13 22:41:47 +00:00
alfred
d19b4e039d Remove SCARGS.
Reviewed by: md5
2002-12-13 22:27:25 +00:00
julian
9868d96f1f Unbreak the KSE code. Keep track of zombie threads using the Per-CPU storage
during the context switch. Rearrange thread cleanups
to avoid problems with Giant. Clean threads when freed or
when recycled.

Approved by:	re (jhb)
2002-12-10 02:33:45 +00:00
marcel
0c941a7611 Use one of the bi_spare entries for the DIG64 HCDP table address.
The HCDP table is one (non-proprietary) way for the platform to
inform the OS about headless operation. This field would normally
hold the address as can be found by scanning the EFI system table,
which we also pass to the kernel. The apparent duplication allows
us to synthesize a HCDP table in the loader by whatever means we
can think of, including relocating the platform table into pre-
mapped address space. In short: it gives us more freedom.

Approved by: re (blanket)
2002-12-08 20:32:56 +00:00
marcel
9d0493b32c Disable SMP. It reduces the chance that the kernel boots. On top
of that, there's some nasty process corruption when running with
SMP.

Note that this was already in effect for the 5.0-RC1 kernels in
the form of a local patch.

Approved by: re (blanket)
2002-12-08 20:14:04 +00:00
alc
d5387889cf MFi386
Hold the page queues lock around vm_page_unhold() in vunmapbuf().

Approved by:	re (blanket)
2002-12-02 01:12:05 +00:00
marcel
fddfacd0f0 Implement bus_space_subregion(). Identical to i386.
Approved by: re (carte blanche)
2002-11-29 20:14:03 +00:00
marcel
e31313a4a8 Better handle sparse physical memory: Don't use the address range
as a measure for available memory to scale the VHPT. Instead, use
the previously determined Maxmem.

Approved by: re (carte blanche)
2002-11-29 20:10:21 +00:00
marcel
923bcb0860 MFp4:
Add function map_port_space() to map the memory mapped I/O port
range as uncacheable virtual memory and call it prior to probing
for a console. This removes the dependency on the loader to have
done this for us. Note that this change does not include doing
the same for APs.

Approved by: re (blanket)
2002-11-24 20:15:08 +00:00
marcel
3f1e360689 Fix comparison that caused a 1-off bug. This appeared harmless for
the kernel itself, but SAL on Itanium2 machines spontaneously
rebooted the machine.

Approved by: re (blanket)
Submitted by: Arun Sharma <adsharma@unix-os.sc.intel.com>
2002-11-24 20:07:23 +00:00
mux
8169a213d9 Under certain circumstances, we were calling kmem_free() from
i386 cpu_thread_exit().  This resulted in a panic with WITNESS
since we need to hold Giant to call kmem_free(), and we weren't
holding it anymore in cpu_thread_exit().  We now do this from a
new MD function, cpu_thread_dtor(), called by thread_dtor().

Approved by:	re@
Suggested by:	jhb
2002-11-22 23:57:02 +00:00
alc
9f35c304e0 MFi386 r1.369
- Clear the PG_WRITEABLE flag in pmap_page_protect() if write access is
   being removed.  Return immediately if write access is being removed and
   PG_WRITEABLE is already clear.
2002-11-17 21:48:42 +00:00
deischen
54d9a4c0f7 Regenerate after adding syscalls. 2002-11-16 23:48:14 +00:00
deischen
280e9bbfe8 Add *context() syscalls to ia64 32-bit compatibility table as requested
in kern/syscalls.master.
2002-11-16 15:15:17 +00:00
deischen
31ea801074 Add getcontext, setcontext, and swapcontext as system calls.
Previously these were libc functions but were requested to
be made into system calls for atomicity and to coalesce what
might be two entrances into the kernel (signal mask setting
and floating point trap) into one.

A few style nits and comments from bde are also included.

Tested on alpha by: gallatin
2002-11-16 06:35:53 +00:00