Commit Graph

1182 Commits

Author SHA1 Message Date
des
849bd5ea0c Whitespace nit. 2004-01-13 15:30:36 +00:00
nectar
11f80dcf0d Provide sysarch(2) prototypes in the MD sysarch.h headers. While I'm
at it, use the ANSI C generic pointer type for the second argument,
thus matching the documentation.

Remove the now extraneous (and now conflicting) function declarations
in various libc sources.  Remove now unnecessary casts.

Reviewed by:	bde
2004-01-09 16:52:09 +00:00
davidxu
f39653dda8 Make sigaltstack as per-threaded, because per-process sigaltstack state
is useless for threaded programs, multiple threads can not share same
stack.
The alternative signal stack is private for thread, no lock is needed,
the orignal P_ALTSTACK is now moved into td_pflags and renamed to
TDP_ALTSTACK.
For single thread or Linux clone() based threaded program, there is no
semantic changed, because those programs only have one kernel thread
in every process.

Reviewed by: deischen, dfr
2004-01-03 02:02:26 +00:00
silby
a7d8091ae5 Track three new sendfile-related statistics:
- The number of times sendfile had to do disk I/O
- The number of times sfbuf allocation failed
- The number of times sfbuf allocation had to wait
2003-12-28 08:57:09 +00:00
silby
a58bddbe36 Move the declaration of sfbufspeak and sfbufsused to mbuf.h,
and use imax instead of max, as sfbufspeak and sfbufsused
are signed.

Submitted by:   bde
2003-12-28 01:43:22 +00:00
silby
5c5418dd6e Track current and peak sfbuf usage, export the values via sysctl. 2003-12-27 07:52:47 +00:00
marcel
52c1801016 Don't use NULL with integral types. 2003-12-24 19:55:07 +00:00
peter
b1487e9a60 Return AE_OK for stub functions returning ACPI_STATUS, not NULL 2003-12-24 05:26:26 +00:00
peter
daf42805ba GC the unused <machine/kse.h> file. 2003-12-24 00:51:30 +00:00
peter
998b79089f Add an additional field to the elf brandinfo structure to support
quicker exec-time replacement of the elf interpreter on an emulation
environment where an entire /compat/* tree isn't really warranted.
2003-12-23 02:42:39 +00:00
peter
293d5a3efd Add missing #include "opt_compat.h" so that the compatability function
freebsd4_freebsd32_sigreturn() is defined when expected.  This should
unbreak the tinderbox. Sorry.
2003-12-18 06:59:18 +00:00
marcel
6b2a4c99ba In set_mcontext(), take into account that kse_switchin(2) will
eventually be passed an async. context as well as a syscall
context.
While here, fix a serious bug in that if the trapframe is a
syscall frame, but we're restoring an async context, we need
to clear the FRAME_SYSCALL flag so that we leave the kernel
via exception_restore.
2003-12-14 01:59:31 +00:00
peter
90628de204 Assimilate ia64 back into the fold with the common freebsd32/ia32 code.
The split-up code is derived from the ia64 code originally.

Note that I have only compile-tested this, not actually run-tested it.
The ia64 side of the force is missing some significant chunks of signal
delivery code.
2003-12-11 01:05:09 +00:00
peter
0fad8ea5b0 Fix last second typo. 2003-12-10 22:59:03 +00:00
peter
6c7e20736e Use gcc's superior ffs() builtin. 2003-12-10 22:51:40 +00:00
peter
f5f05f9b78 Use ffs(x) == popcnt(x ^ (x - 1)) to implement 64 bit ffsl(). gcc's
ffs() builtin uses this already but truncates the upper 32 bits.
2003-12-10 22:47:02 +00:00
marcel
b8e9d2beb0 Don't panic for misalignment traps when the onfault handler is set.
Not all transfers between kernel and user space are byte oriented
and thus alignment safe. Especially fuword*() and suword*() are
sensitive to alignment but in general more optimal than block copies.
By catching the misalignment trap we avoid pessimizing the common
case of properly aligned memory accesses which we would do if we
were to use byte copies or adding tests for proper alignment.

Note that the expectation that the kernel produces aligned pointers
is unchanged. This change therefore relates to possible unaligned
pointers generated in userland.
2003-12-09 09:52:14 +00:00
njl
efa66ad0f3 Use the ACPI-CA definitions for the various APIC tables instead of our
own.
2003-12-09 03:04:19 +00:00
obrien
4867d63660 Move the bktr(4) <arch>/include/ioctl_{bt848,meteor}.h files to dev/bktr
as these ioctl's aren't MD.  This also means they are installed in
/usr/include/dev/bktr now.  Also provide compatability wrappers for
where these headers lived in 4.x.
2003-12-08 07:22:42 +00:00
marcel
4b6eafd82f Simplify the contexts created by the kernel and remove the related
flags. We now create asynchronous contexts or syscall contexts only.
Syscall contexts differ from the minimal ABI dictated contexts by
having the scratch registers saved and restored because that's where
we keep the syscall arguments and syscall return values.
Since this change affects KSE, have it use kse_switchin(2) for the
"new" syscall context.
2003-12-07 20:47:33 +00:00
imp
ee4277ab02 Ooops. These are still used by the bktr driver. David O'Brien has
plans for dealing, but I'll let him deal.

Pointy hat to: imp@
2003-12-07 06:37:32 +00:00
imp
a7899e4b16 Remote meteor driver. It hasn't compiled in over 3 years. If someone
makes it compile again, and can test it, we can restore the driver to
the tree.
2003-12-07 04:41:11 +00:00
jhb
bbe7d290ea - Split cpu_mp_probe() into two parts. cpu_mp_setmaxid() is still called
very early (SI_SUB_TUNABLES - 1) and is responsible for setting mp_maxid.
  cpu_mp_probe() is now called at SI_SUB_CPU and determines if SMP is
  actually present and sets mp_ncpus and all_cpus.  Splitting these up
  allows an architecture to probe CPUs later than SI_SUB_TUNABLES by just
  setting mp_maxid to MAXCPU in cpu_mp_setmaxid().  This could allow the
  CPU probing code to live in a module, for example, since modules
  sysinit's in modules cannot be invoked prior to SI_SUB_KLD.  This is
  needed to re-enable the ACPI module on i386.
- For the alpha SMP probing code, use LOCATE_PCS() instead of duplicating
  its contents in a few places.  Also, add a smp_cpu_enabled() function
  to avoid duplicating some code.  There is room for further code
  reduction later since much of this code is also present in cpu_mp_start().
- All archs besides i386 still set mp_maxid to the same values they set it
  to before this change.  i386 now sets mp_maxid to MAXCPU.

Tested on:	alpha, amd64, i386, ia64, sparc64
Approved by:	re (scottl)
2003-11-21 22:23:26 +00:00
marcel
683673d640 Set the ACPI processor Id in the PCPU structure so that CPU idling
on SMP systems has a chance of working. This was a loose end of the
implementation of the ACPI Cx idle states. Since our logical CPU Id
is the ACPI processor Id, we do not need to jump through hoops to
obtain it.

Approved: re@ (jhb)
2003-11-20 16:42:39 +00:00
peter
d29883b254 Widen the enable/disable helper function's argument in line with the
ithread_create() changes etc.  This should be mostly a NOP.
2003-11-17 06:10:15 +00:00
bde
59742d249e Fixed a pedantic syntax error (a stray semicolon at the end of
PCPU_MD_FIELDS).
2003-11-17 03:40:41 +00:00
alc
aea6af995e - Remove unnecessary synchronization from sf_buf_init(). (There is only
one active CPU when sf_buf_init() is performed.)
2003-11-16 23:40:06 +00:00
alc
74614e7f63 - Modify alpha's sf_buf implementation to use the direct virtual-to-
physical mapping.
 - Move the sf_buf API to its own header file; make struct sf_buf's
   definition machine dependent.  In this commit, we remove an
   unnecessary field from struct sf_buf on the alpha, amd64, and ia64.
   Ultimately, we may eliminate struct sf_buf on those architecures
   except as an opaque pointer that references a vm page.
2003-11-16 06:11:26 +00:00
njl
1f1358c8e0 Add the pc_acpi_id PCPU member. The new acpi_cpu driver uses this to
dereference the softc.
2003-11-15 18:58:29 +00:00
marcel
7a2fc00406 Remove ia64_highfp_load() now that it's unused. 2003-11-12 03:24:34 +00:00
marcel
19c69bbd49 Further work-out the handling of the high FP registers. The most
important change is in cpu_switch() where we disable the high FP
registers for the thread that we switch-out if the CPU currently
has its high FP registers. This avoids that the high FP registers
remain enabled for the thread even when the CPU has unloaded them
or the thread migrated to another processor.
Likewise, when we switch-in a thread of that has its high FP
registers on the CPU, we enable them. This avoids an otherwise
harmless, but unnecessary trap to have them enabled.

The code that handles the disabled high FP trap (in trap()) has
been turned into a critical section for the most part to avoid
being preempted. If there's a race, we bail out and have the
processor trap again if necessary.

Avoid using the generic ia64_highfp_save() function when the
context is predictable. The function adds unnecessary overhead.
Don't use ia64_highfp_load() for the same reason. The function
is now unused and can be removed.

These changes make the lazy context switching of the high FP
registers in an UP kernel functional.
2003-11-12 01:26:02 +00:00
marcel
12457aa888 Save and restore the high FP registers in {g|s}_mcontext(). Note
that we currently do not keep track of whether the thread has
actually used the high FP registers before. If not, we should
not save them in the context which automaticly means that we
also would not restore them from the context. For now, do it
unconditionally so that we can reach functional completeness.
2003-11-11 09:53:37 +00:00
marcel
b097722b0b Fix a nasty bug that got exposed when the sendsig() and sigreturn()
functions switched to using {g|s}et_mcontext(). The problem is that
sigreturn(), being a syscall, can be given an async. context (i.e.
one corresponding to an interrupt or trap). When this happens, we
try to return to user mode via epc_syscall_return with a trapframe
that can only be used to return to user mode via exception_restore.

To fix this, we check the frame's flags immediately prior to
epc_syscall_return and branch to exception_restore for non-syscall
frames. Modify the assertion in set_mcontext() to check that if
there's a mismatch, it's because of sigreturn().
2003-11-11 09:25:19 +00:00
marcel
2322554b50 In get_mcontext(), do not update bspstore and ndirty in the trapframe.
Only update them in the newly created context to reflect the state
after copying the dirty registers onto the user stack. If we were to
update the trapframe, we lose the state at entry into the kernel. We
may need that after we create the context, such as for KSE upcalls.

We have to update the trapframe after writing the dirty registers to
the user stack for signal delivery to work. But this is best done in
sendsig() itself where it applies, not in get_mcontext() where it's
done unconditionally.
2003-11-10 05:28:05 +00:00
marcel
e8baed96b5 When a thread is being swapped-out, save the high FP registers. We
have a pointer in the PCPU to the PCB of the thread that currently
has its high FP registers loaded.
2003-11-09 23:13:23 +00:00
marcel
b4eb6c7533 Use get_mcontext() to construct the signal context in sendsig() and
use set_mcontext() to restore the context in sigreturn(). Since we
put the syscall number and the syscall arguments in the trapframe
(we don't save the scratch registers for syscalls, which allows us
to reuse the space to our advantage), create a MD specific flag so
that we save the scratch registers even for syscalls. We would not
be able to restart a syscall otherwise.

The signal trampoline does not need to flush the regiters anymore,
because get_mcontext() already handles that. In fact, if we set up
the context correctly, we do not need to have a trampoline at all.
This change however only minimally changes the trampoline code. In
follow-up commits this can be further optimized.

Note that normally we preserve cfm and iip in the trapframe created
by the EPC syscall path when we restore a context in set_mcontext()
because those fields are not normally set for a synchronuous context.
The kernel puts the return address and frame info of the syscall
stub in there. By preserving these fields we hide this detail from
userland which allows us to use setcontext(2) for user created
contexts. However, sigreturn() is commonly called from the trampoline,
which means that if we preserve cfm and iip in all cases, we would
return to the trampoline after the sigreturn(), which means we hit
the safety net: we call exit(2). So, we do not preserve cfm and iip
when we have a synchronous context that also has scratch registers
(the uncommon context created by sendsig() only), under the assumption
that if such a context is created in userland, something special is
going on and the use of cfm and iip is then just another quirk. All
this is invisible in the common case.
2003-11-09 22:17:36 +00:00
marcel
21340f30b3 Change the clear_ret argument of get_mcontext() to be a flags argument.
Since all callers either passed 0 or 1 for clear_ret, define bit 0 in
the flags for use as clear_ret. Reserve bits 1, 2 and 3 for use by MI
code for possible (but unlikely) future use. The remaining bits are for
use by MD code.

This change is triggered by a need on ia64 to have another knob for
get_mcontext().
2003-11-09 20:31:04 +00:00
marcel
1cd9ce158f Remove the atkbd, psm, sc and vga devices. Most ia64 boxes out there
are zx1 based machines and they don't particularly like it when we
poke at them with PC legacy code. The atkbd and psm devices were
disabled in the hints file so that one could enable them on machines
that support legacy devices, but that's not really something you can
expect from a first-time installer. This still leaves syscons (sc)
and the vga device, which were enabled by default and wrecking havoc
anyway. We could disable them by default like the atkbd and psm
devices, but there's really no point in pretending we're in a better
shape that way.
2003-11-08 23:19:13 +00:00
scottl
e3855085d1 Document the lockfunc and lockfuncarg arguments to bus_dma_tag_create() in
the busdma headers.
2003-11-07 23:29:42 +00:00
jhb
b6b663a274 Regen. 2003-11-07 20:30:30 +00:00
jhb
e5691bdd94 Sync with global syscalls.master. ptrace(), dup(), pipe(), ktrace(),
ia32_sigaltstack(), sysarch(), issetugid(), utrace(), and ia32_sigaction()
are MP safe.
2003-11-07 20:27:16 +00:00
marcel
636849aec5 Add support for unaligned ld2, st2, st4 and st8. While here, make
sure we handle stacked registers properly by taking into account
that:
1. bspstore points after the frame (due to cover),
2. we need to adjust for intermediate NaT collections.
2003-11-06 04:26:40 +00:00
marcel
f3021513b4 Handle unaligned 4-byte loads. While in the neighborhood, remove the
cr.isr sanity check. We actually encounter insanities, which very
likely means that the insanity check itself is insane. Remove an empty
comment while I'm at it.
2003-11-03 08:04:04 +00:00
marcel
4c4077d63b Add a bogus definition of __va_list for use by lint. Make it visible
only when lint is defined to protect builds with non-GNU compilers.
2003-11-03 05:04:09 +00:00
marcel
69c81d4892 Remove headers copied from i386 and either useless or wrong on ia64.
An example of useless is bios.h. An example of wrong is msdos.h (due
to the use of long for 32-bit fields).

display.h cannot be removed because it's used by syscons. That header
however has no platform dependency and shouldn't really be here.

Removal if these headers may cause build failures in the ports tree.
It's the ports that need fixing in that case.

Tested with: buildworld, LINT
2003-11-02 09:19:07 +00:00
marcel
ba29587a94 When switching the RSE to use the kernel stack as backing store, keep
the RNAT bit index constant. The net effect of this is that there's
no discontinuity WRT NaT collections which greatly simplifies certain
operations. The cost of this is that there can be up to 504 bytes of
unused stack between the true base of the kernel stack and the start
of the RSE backing store. The cost of adjusting the backing store
pointer to keep the RNAT bit index constant, for each kernel entry,
is negligible.

The primary reasons for this change are:
1. Asynchronuous contexts in KSE processes have the disadvantage of
   having to copy the dirty registers from the kernel stack onto the
   user stack. The implementation we had so far copied the registers
   one at a time without calculating NaT collection values. A process
   that used speculation would not work. Now that the RNAT bit index
   is constant, we can block-copy the registers from the kernel stack
   to the user stack without having to worry about NaT collections.
   They will be in the right place on the user stack.
2. The ndirty field in the trapframe is now also usable in userland.
   This was previously not the case because ndirty also includes the
   space occupied by NaT collections. The value could be off by 8,
   depending on the discontinuity. Now that the RNAT bit index is
   contants, we have exactly the same number of NaT collection points
   on the kernel stack as we would have had on the user stack if we
   didn't switch backing stores.
3. Debuggers and other applications that use ptrace(2) can now copy
   the dirty registers from the kernel stack (using ptrace(2)) and
   copy them whereever they want them (onto the user stack of the
   inferior as might be the case for gdb) without having to worry
   about NaT collections in the same way the kernel doesn't have to
   worry about them.

There's a second order effect caused by the randomization of the
base of the backing store, for it depends on the number of dirty
registers the processor happened to have at the time of entry into
the kernel. The second order effect is that the RSE will have a
better cache utilization as compared to having the backing store
always aligned at page boundaries. This has not been measured and
may be in practice only minimally beneficial, if at all measurable.
2003-10-28 19:38:26 +00:00
marcel
23e5537e11 The previous commit removed both clause 3 and clause 4 from the UCB
license. Only clause 3 has been revoked. Restore the fourth clause
as clause 3.

Pointed out by: das@

Remove my name as a copyright holder since I don't use a BSD license
compatible or comparable to the UCB license. I choose not to add a
complete second license for my work for aesthetic reasons, nor to
replace the UCB license on grounds of rewriting more than 90% of the
source files. The rewrite can also be seen as an enhancement and since
the files were practically empty, it's rather trivial to have changed
90% of the files.
2003-10-27 22:54:34 +00:00
marcel
367436bcad Add support for userland to access I/O port space. This is primarily
added for XFree86. There are 2 reasons for doing this with sysarch():
1. The memory mapped I/O space is not at a fixed physical address. An
   application has to use some interface to get the base address. It
   gets worse if the machine has multiple memory mapped I/O spaces.
2. Access to the memory mapped I/O space needs to happen through a
   translation that is flagged as uncachable. There's no interface
   that allows a process to do uncached memory I/O, other than though
   /dev/mem (possibly).

So, until we either disallow direct access to I/O or bus space from
userland or have a better way of doing this, sysarch() has the least
negative impact on existing interfaces.
2003-10-27 05:45:35 +00:00
marcel
c703d9a33e Remove unused header. See also ia64/disasm/disasm.h. 2003-10-24 06:53:43 +00:00
marcel
a7758e9138 Remove ia64_pack_bundle() and ia64_unpack_bundle(). They are not
used anymore.
2003-10-24 06:52:21 +00:00
marcel
5039faef62 Remove unused file. db_disasm() has been implemented in db_interface.c
now.
2003-10-24 06:48:41 +00:00
marcel
2a08abfe7d Implement db_disasm() by using the new disassembler. Temporarily
unimplement db_write_breakpoint() and db_clear_breakpoint().
2003-10-24 06:42:03 +00:00
arun
05b0056432 Use a TR of size 1 << IA64_ID_PAGE_SHIFT instead of 16M to avoid
overlapping TR/TC entries (which results in a machine check). Note
that we don't look at the size of the memory descriptor, because
it doesn't guarantee non-overlap.

With this change, a UP kernel could boot on a Intel Tiger4 machine
with the following options:

options         LOG2_ID_PAGE_SIZE=26		# 64M
options         LOG2_PAGE_SIZE=14               # 16K

Approved by: marcel
2003-10-24 04:56:58 +00:00
marcel
9dca341e06 Don't use fuword() or suword() unconditionally. They explicitly
disallow reading or writing.
2003-10-24 02:33:26 +00:00
marcel
911fff3816 Remove two unused fields in the operand structure (o_read & o_write). 2003-10-24 02:05:53 +00:00
marcel
ad1ef3fe10 Cleanup. Remove the md_flags for threads. It's not used. The flags
we had were bogus.
While here, reassign the copyright to the Project. There's nothing
in this files that originates from NetBSD, especially now that the
FreeBSD/alpha bits have been removed, but even then the amount of
inherited code that we actually used was nil.
2003-10-23 06:41:59 +00:00
marcel
c3d0eca2a7 Reimplement unaligned_fixup() using the new disassembler and a
mcontext_t for the register values. Currently only ld8 and ldfd
instructions are handled as those are the ones we need now (a
misaligned ld8 occurs 4 times in ntpd(8) and a misaligned ldfd
occurs once in mozilla 1.4 and 1.5). Other instructions are added
when needed.
2003-10-23 06:32:34 +00:00
marcel
670b5bf535 Remove unused include of <machine/inst.h> 2003-10-23 06:23:55 +00:00
marcel
92a56e0f1b Remove prototype of unaligned_fixup() and fix a nearby style(9)
bug.
2003-10-23 06:21:44 +00:00
marcel
74d6568906 Add prototypes for spillfd() and unaligned_fixup(). 2003-10-23 06:20:38 +00:00
marcel
9298bcdbb4 Add spillfd(). This function loads a double-precision FP register
at the first address and spills it to the second address. This
allows unaligned_fixup() to update the context of the process in
a way that assures proper rounding.
Similar functions for single-and extended-precision are added when
needed.
2003-10-23 06:19:06 +00:00
marcel
4f18f4daa0 Add a new disassembler that improves over the previous disassembler
in that it provides an abstract (intermediate) representation for
instructions. This significantly improves working with instructions
such as emulation of instructions that are not implemented by the
hardware (e.g. long branch) or enhancing implemented instructions
(e.g. handling of misaligned memory accesses). Not to mention that
it's much easier to print instructions.

Functions are included that provide a textual representation for
opcodes, completers and operands.

The disassembler supports all ia64 instructions defined by revision
2.1 of the SDM (Oct 2002).
2003-10-23 06:01:52 +00:00
marcel
7df6e35964 Remove md_bspstore from the MD fields of struct thread. Now that
the backing store is at a fixed address, there's no need for a
per-thread variable.
2003-10-21 01:13:49 +00:00
marcel
87282b0dcb Put the RSE backing store at a fixed address. This change is triggered
by libguile that needs to know the base of the RSE backing store. We
currently do not export the fixed address to userland by means of a
sysctl so user code needs to hardcode it for now. This will be revisited
later.

The RSE backing store is now at the bottom of region 4. The memory stack
is at the top of region 4. This means that the whole region is usable
for the stacks, giving a 61-bit stack space.

Port: lang/guile (depended of x11/gnome2)
2003-10-20 05:34:10 +00:00
njl
8689796a4b Add the cpu_idle_hook() function pointer so that other idlers can be
hooked at runtime.  Make C1 sleep (e.g., HLT) be the default.  This
prepares the way for further ACPI sleep states.
2003-10-18 22:25:07 +00:00
marcel
1d3178bbc0 Implement cpu_idle() on ia64. We put the processor in a lightweight
halt state that minimizes power consumption while still preserving
cache and TLB coherency. Halting the processor is not conditional at
this time. Tested with UP and SMP kernels.
2003-10-17 02:24:59 +00:00
robert
8519aa2ff0 Implement preliminary support for the PT_SYSCALL command to ptrace(2). 2003-10-09 10:17:16 +00:00
marcel
98e6d4d80f With BETA 5 of libuwx some of the application registers are renamed
from UWX_REG_MUMBLE to UWX_REG_AR_MUMBLE. Compatibility defines are
present in libuwx. Change the names here so that we don't depend on
compatibility defines.

Note that there's now an UWX_REG_PFS and an UWX_REG_AR_PFS and the
former is not a compatibility define for the latter AFAICT. Change
to UWX_REG_AR_PFS as that seems to be the one we need to handle.
2003-10-09 03:11:37 +00:00
marcel
e54e3cef6f Include <sys/smp.h> for the prototype of smp_rendezvous(). 2003-10-08 19:55:45 +00:00
bms
d8d01a1fa7 Move pmap_resident_count() from the MD pmap.h to the MI pmap.h.
Add a definition of pmap_wired_count().
Add a definition of vmspace_wired_count().

Reviewed by:	truckman
Discussed with:	peter
2003-10-06 01:47:12 +00:00
alc
b1691aebe4 Migrate pmap_prefault() into the machine-independent virtual memory layer.
A small helper function pmap_is_prefaultable() is added.  This function
encapsulate the few lines of pmap_prefault() that actually vary from
machine to machine.  Note: pmap_is_prefaultable() and pmap_mincore() have
much in common.  Going forward, it's worth considering their merger.
2003-10-03 22:46:53 +00:00
marcel
1d961a2a8f Swap the syscall caller frame info (i.e. the return pointer and
frame marker) and the syscall stub frame info in the trap frame.
Previously we stored the stub frame info in (rp,pfs) and the
caller frame info in (iip,cfm). This ends up being suboptimal
for the following reasons:
1. When we create a new context, such as for an execve(2), we had
   to set the (rp,pfs) pair for the entry point when using the
   syscall path out of the kernel but we need to set the (iip,cfm)
   pair when we take the interrupt way out. This is mostly just
   an inconsistency from the kernel's point of view, but an ugly
   irregularity from gdb(1)'s point of view.
2. The getcontext(2) and setcontext(2) syscalls had to swap the
   (rp,pfs) and (iip,cfm) pairs to make the context compatible
   with one created purely in userland.

Swapping the (rp,pfs) and (iip,cfm) pairs is visible to signal
handlers that actually peek at the mcontext_t and to gdb(1).
Since this change is made for gdb(1) and we don't care about
signal handlers that peek at the mcontext_t because we're still
a tier 2 platform, this ABI breakage is academic at this moment
in time.

Note that there was no real reason to save the caller frame info
in (iip,cfm) and the stub frame info in (rp,pfs).
2003-10-03 03:50:29 +00:00
marcel
cf9458da70 Drop any and all support for varargs. There's no history to worry
about because we're still tier 2 and our current compiler, as well
as future compilers will not support varargs. This is mostly a
no-op in practice, because <sys/varargs.h> should already cause
compile failures.
2003-09-28 05:34:07 +00:00
phk
6e7e1fbcfa Set cn_name, not cn_dev 2003-09-26 10:37:16 +00:00
peter
8ecb3577d8 Add sysentvec->sv_fixlimits() hook so that we can catch cases on 64 bit
systems where the data/stack/etc limits are too big for a 32 bit process.

Move the 5 or so identical instances of ELF_RTLD_ADDR() into imgact_elf.c.

Supply an ia32_fixlimits function.  Export the clip/default values to
sysctl under the compat.ia32 heirarchy.

Have mmap(0, ...) respect the current p->p_limits[RLIMIT_DATA].rlim_max
value rather than the sysctl tweakable variable.  This allows mmap to
place mappings at sensible locations when limits have been reduced.

Have the imgact_elf.c ld-elf.so.1 placement algorithm use the same
method as mmap(0, ...) now does.

Note that we cannot remove all references to the sysctl tweakable
maxdsiz etc variables because /etc/login.conf specifies a datasize
of 'unlimited'.  And that causes exec etc to fail since it can no
longer find space to mmap things.
2003-09-25 01:10:26 +00:00
nyan
2ae09c9ace Implement the bus_space_map() function to allocate resources and initialize
a bus_handle, but currently it does only initializing a bus_handle.
2003-09-23 08:22:34 +00:00
marcel
db44a810da Fix the last remaining problem encountered by KSE: apparently it is
not guaranteed that the RSE writes the NaT collection immediately,
sort of atomically, to the backing store when it writes the register
immediately prior to the NaT collection point. This means that we
cannot assume that the low 9 bits of the backingstore pointer do not
point to the NaT collection. This is rather a surprise and I don't
know at this time if it's a bug in the Merced or that it's actually
a valid condition of the architecture. A quick scan over the sources
does not indicate that we depend on the false assumption elsewhere,
but it's something to keep in mind.

The fix is to write the saved contents of the ar.rnat register to
the backingstore prior to entering the loop that copies the dirty
registers from the kernel stack to the user stack.
2003-09-20 20:34:58 +00:00
marcel
10c85423c8 Move uma_small_alloc() and uma_small_free() to uma_machdep.c. These
functions reference UMA internals from <vm/uma_int.h>, which makes
them highly unwanted in non-UMA specific files.

While here, prune the includes in pmap.c and use __FBSDID(). Move
the includes above the descriptive comment.

The copyright of uma_machdep.c is assigned to the project and can
be reassigned to the foundation if and when when such is preferrable.
2003-09-20 19:27:48 +00:00
marcel
901f7b2cb7 Fix the most significant KSE breakage caused by not restoring the
restart instruction bits in the PSR. As such, we were returning
from interrupt to the instruction in the bundle that caused us
to enter the kernel, only now we're returning to a completely
different bundle.

While close here: add two KASSERTs to make sure that we restore
sync contexts only when entered the kernel through a syscall and
restore an async context only when entered the kernel through an
interrupt, trap or fault.

While not exactly here, but close enough: use suword64() when we
copy the dirty registers from the kernel stack to the user stack.
The code was intended to be be replaced shortly after being added,
but that was a couple of weeks ago. I might as well avoid that it
is a source for panics until it's replaced.
2003-09-19 22:51:26 +00:00
marcel
77b076f87c Revamp trap(): make it more explicit which kinds of traps/faults we
can get (or not) and what we do with them. This fixes the behaviour
for NaT consumption and speculation faults in that we now don't panic
for user faults.

Remove the dopanic label and move the code to a function. This makes
it easier in the simulator to set a breakpoint.

While here, remove the special handling of the old break-based syscall
path and move it to where we handle the break vector. While here,
reserve a new break immediate for KSE. We currently use the old break-
based syscall to deal with restoring async contexts. However, it has
the side-effect of also setting the signal mask and callong ast() on
the way out. The new break immediate simply restores the context and
returns without calling ast().
2003-09-19 22:41:52 +00:00
marcel
758f95abc6 Change TRAPF_USERMODE and CLOCKF_USERMODE to not test for CPL == 3,
but for CPL != 0. For some reason yet unknown it is possible for the
CPL to be 2. This would previously be counted as kernel mode, which
resulted in nasty panics. By changing the test it is now treated as
user mode, which is more correct. We still need to figure out how it
is possible that the privilege level can be 2 (or 1 for that matter),
because it's not used by us. We only use 3 (user mode) and 0 (kernel
mode).
2003-09-19 07:48:22 +00:00
marcel
4ab0c8936b Include "opt_kstack_pages.h". We export KSTACK_PAGES to assembly and
better have the right value.
2003-09-19 00:37:41 +00:00
alc
76fcb264a0 Add a new parameter to pmap_extract_and_hold() that is needed to eliminate
Giant from vmapbuf().

Idea from:	tegge
2003-09-12 07:07:49 +00:00
marcel
6dce2e8fe5 Rewrite the SAPIC initialization to always program the RTEs with what
we think is the correct trigger mode and polarity. This allows us to
implement BUS_CONFIG_INTR() as an update of the RTE in question.
Consequently, we can trust the RTE when we enable an interrupt and
avoids that we need to know about the trigger mode and polarity at
that time.
2003-09-10 22:49:38 +00:00
jhb
5fcbea6e19 Move the definitions for ACPI MADT table entries not present in the ACPICA
distribution to a MI header so it can be shared with other architectures.
2003-09-10 06:32:27 +00:00
marcel
bdbd90ef2b Introduce IA64_ID_PAGE_{MASK|SHIFT|SIZE} and LOG2_ID_PAGE_SIZE. The
latter is a kernel option for IA64_ID_PAGE_SHIFT, which in turn
determines IA64_ID_PAGE_MASK and IA64_ID_PAGE_SIZE.

The constants are used instead of the literal hardcoding (in its
various forms) of the size of the direct mappings created in region
6 and 7. The default and probably only workable size is still 256M,
but for kicks we use 128M for LINT.
2003-09-09 05:59:09 +00:00
alc
a81d9ad0b9 Introduce a new pmap function, pmap_extract_and_hold(). This function
atomically extracts and holds the physical page that is associated with the
given pmap and virtual address.  Such a function is needed to make the
memory mapping optimizations used by, for example, pipes and raw disk I/O
MP-safe.

Reviewed by:	tegge
2003-09-08 02:45:03 +00:00
wpaul
ce0ede96f1 Take the support for the 8139C+/8169/8169S/8110S chips out of the
rl(4) driver and put it in a new re(4) driver. The re(4) driver shares
the if_rlreg.h file with rl(4) but is a separate module. (Ultimately
I may change this. For now, it's convenient.)

rl(4) has been modified so that it will never attach to an 8139C+
chip, leaving it to re(4) instead. Only re(4) has the PCI IDs to
match the 8169/8169S/8110S gigE chips. if_re.c contains the same
basic code that was originally bolted onto if_rl.c, with the
following updates:

- Added support for jumbo frames. Currently, there seems to be
  a limit of approximately 6200 bytes for jumbo frames on transmit.
  (This was determined via experimentation.) The 8169S/8110S chips
  apparently are limited to 7.5K frames on transmit. This may require
  some more work, though the framework to handle jumbo frames on RX
  is in place: the re_rxeof() routine will gather up frames than span
  multiple 2K clusters into a single mbuf list.

- Fixed bug in re_txeof(): if we reap some of the TX buffers,
  but there are still some pending, re-arm the timer before exiting
  re_txeof() so that another timeout interrupt will be generated, just
  in case re_start() doesn't do it for us.

- Handle the 'link state changed' interrupt

- Fix a detach bug. If re(4) is loaded as a module, and you do
  tcpdump -i re0, then you do 'kldunload if_re,' the system will
  panic after a few seconds. This happens because ether_ifdetach()
  ends up calling the BPF detach code, which notices the interface
  is in promiscuous mode and tries to switch promisc mode off while
  detaching the BPF listner. This ultimately results in a call
  to re_ioctl() (due to SIOCSIFFLAGS), which in turn calls re_init()
  to handle the IFF_PROMISC flag change. Unfortunately, calling re_init()
  here turns the chip back on and restarts the 1-second timeout loop
  that drives re_tick(). By the time the timeout fires, if_re.ko
  has been unloaded, which results in a call to invalid code and
  blows up the system.

  To fix this, I cleared the IFF_UP flag before calling ether_ifdetach(),
  which stops the ioctl routine from trying to reset the chip.

- Modified comments in re_rxeof() relating to the difference in
  RX descriptor status bit layout between the 8139C+ and the gigE
  chips. The layout is different because the frame length field
  was expanded from 12 bits to 13, and they got rid of one of the
  status bits to make room.

- Add diagnostic code (re_diag()) to test for the case where a user
  has installed a broken 32-bit 8169 PCI NIC in a 64-bit slot. Some
  NICs have the REQ64# and ACK64# lines connected even though the
  board is 32-bit only (in this case, they should be pulled high).
  This fools the chip into doing 64-bit DMA transfers even though
  there is no 64-bit data path. To detect this, re_diag() puts the
  chip into digital loopback mode and sets the receiver to promiscuous
  mode, then initiates a single 64-byte packet transmission. The
  frame is echoed back to the host, and if the frame contents are
  intact, we know DMA is working correctly, otherwise we complain
  loudly on the console and abort the device attach. (At the moment,
  I don't know of any way to work around the problem other than
  physically modifying the board, so until/unless I can think of a
  software workaround, this will have do to.)

- Created re(4) man page

- Modified rlphy.c to allow re(4) to attach as well as rl(4).

Note that this code works for the sample 8169/Marvell 88E1000 NIC
that I have, but probably won't work for the 8169S/8110S chips.
RealTek has sent me some sample NICs, but they haven't arrived yet.
I will probably need to add an rlgphy driver to handle the on-board
PHY in the 8169S/8110S (it needs special DSP initialization).
2003-09-08 02:11:25 +00:00
marcel
ae81d949d1 Untangle the code in this file to improve understandability. Both
ia64_count_cpus() and ia64_probe_sapics() called a single function
to do the the actual work. The difference in behaviour was handled
in that function and was further complicated by adding bootverbose
related code. As such, even the simplest of changes was hard to
comprehend.

Untangling has been done by increasing code duplication and using
a more naive style of coding. FWIW, the object file is slightly
smaller than before, so things aren't as bad as it may seem.

Triggered by: a simple fix on the P4 branch that never got merged.
2003-09-07 23:09:08 +00:00
alc
e443888faa MFamd64/i386
Add necessary page locking to pmap_mincore().
2003-09-07 20:02:38 +00:00
marcel
f7a2b4ef59 MFp4: Revamped GENERIC (and hints). This is some much more pleasant
to look at...
2003-09-07 06:39:51 +00:00
marcel
d45904e351 Replace sio(4) with uart(4). Remove the sio(4) hints and only add
those hints used by uart(4) for the determination of the serial
console in the absence of the HCDP table.
2003-09-07 05:47:10 +00:00
marcel
b7c5acb1f2 Fix a place where I forgot to change the code that checks whether
we return to kernel or userland. This triggered a panic in a KSE
application when TDF_USTATCLOCK was set in the case userland was
interrupted, but we never called ast() on our way out. As such,
we called ast() at some other time. Unfortunately, TDF_USTATCLOCK
handling assumes running in the interrupt thread. This was not
the case anymore.

To avoid making the same mistake later, interrupt() now returns
to its caller whether we interrupted userland or not. This avoids
that we have to duplicate the check in assembly, where it's bound
to fall off the scope. Now we simply check the return value and
call ast() if appropriate.

Run into this: davidxu
2003-09-05 22:50:10 +00:00
marcel
0d2e39083f Use pmap_steal_memory() for the msgbuf instead of trying to squeeze
it in the last chunk (phys_avail block). The last chunk very often is
not larger than one or two pages, resulting in a msgbuf that's too
small to hold a complete verbose boot.
Note that pmap_steal_memory() will bzero the memory it "allocates".
Consequently, ia64 will never preserve previous msgbufs. This is not
a noticable difference in practice. If the msgbuf could be reused,
it was invariably too small to have anything preserved anyway.
2003-09-01 07:06:57 +00:00
marcel
f11904081f Use direct mapped KVA for the sf_buf allocator, as made possible
by the previous commit. While here, fix a typo, reformat comments
and fix a long line.

Tested with: ftpd
2003-09-01 00:12:27 +00:00
alc
8b0114def1 Migrate the sf_buf allocator that is used by sendfile(2) and zero-copy
sockets into machine-dependent files.  The rationale for this
migration is illustrated by the modified amd64 allocator.  It uses the
amd64's direct map to avoid emphemeral mappings in the kernel's
address space.  On an SMP, the emphemeral mappings result in an IPI
for TLB shootdown for each transmitted page.  Yuck.

Maintainers of other 64-bit platforms with direct maps should be able
to use the amd64 allocator as a reference implementation.
2003-08-29 20:04:10 +00:00
njl
638644189e Minor style cleanups. 2003-08-28 16:30:31 +00:00
marcel
d6144c3ba3 Change LOG2_PAGE_SIZE from 14 to 15 bits. This will cause the CTASSERT
in vm_page.h to be reached and thus slightly increases the overall
coverage of LINT on ia64.
2003-08-25 20:02:18 +00:00
marcel
4270721f5e Add the bits for a LINT kernel. It has been verified to compile. We
may need to polish this.
2003-08-23 21:47:33 +00:00
marcel
0a0d516cca Remove PAGE_SIZE_4K, PAGE_SIZE_8K and PAGE_SIZE_16K and replace them
with LOG2_PAGE_SIZE. A single option is better to LINT than multiple
mutual exclusive ones.
2003-08-23 03:39:55 +00:00