Commit Graph

219 Commits

Author SHA1 Message Date
dillon
3ad295d416 Stage-2 commit of the critical*() code. This re-inlines cpu_critical_enter()
and cpu_critical_exit() and moves associated critical prototypes into their
own header file, <arch>/<arch>/critical.h, which is only included by the
three MI source files that need it.

Backout and re-apply improperly comitted syntactical cleanups made to files
that were still under active development.  Backout improperly comitted program
structure changes that moved localized declarations to the top of two
procedures.  Partially re-apply one of the program structure changes to
move 'mask' into an intermediate block rather then in three separate
sub-blocks to make the code more readable.  Re-integrate bug fixes that Jake
made to the sparc64 code.

Note: In general, developers should not gratuitously move declarations out
of sub-blocks.  They are where they are for reasons of structure, grouping,
readability, compiler-localizability, and to avoid developer-introduced bugs
similar to several found in recent years in the VFS and VM code.

Reviewed by:	jake
2002-04-01 23:51:23 +00:00
jake
15dc222b82 Move the CTASSERT macro from MD code to systm.h alongside KASSERT so other
code can use it.  This takes a single constant argument and fails to compile
if it is 0 (false).  The main application of this is to make assertions about
structure sizes at compile time, in order to validate assumptions made in
other code.  Examples:

CTASSERT(sizeof(struct foo) == FOO_SIZEOF);
CTASSERT(sizeof(struct foo) == (1 << FOO_SHIFT));

Requested by:	jhb, phk
2002-04-01 21:55:00 +00:00
dillon
dc5aafeb94 Compromise for critical*()/cpu_critical*() recommit. Cleanup the interrupt
disablement assumptions in kern_fork.c by adding another API call,
cpu_critical_fork_exit().  Cleanup the td_savecrit field by moving it
from MI to MD.  Temporarily move cpu_critical*() from <arch>/include/cpufunc.h
to <arch>/<arch>/critical.c (stage-2 will clean this up).

Implement interrupt deferral for i386 that allows interrupts to remain
enabled inside critical sections.  This also fixes an IPI interlock bug,
and requires uses of icu_lock to be enclosed in a true interrupt disablement.

This is the stage-1 commit.  Stage-2 will occur after stage-1 has stabilized,
and will move cpu_critical*() into its own header file(s) + other things.
This commit may break non-i386 architectures in trivial ways.  This should
be temporary.

Reviewed by:	core
Approved by:	core
2002-03-27 05:39:23 +00:00
tmm
b562cca381 Add missing declarations. 2002-03-25 04:53:18 +00:00
obrien
8842976cdd Guard against redefining __gnuc_va_list. 2002-03-24 11:25:46 +00:00
tmm
6f021cf47c Revamp the busdma implementation a bit:
- change the IOMMU support code so that it supports overcommittting the
  available DVMA memory, while still allocating as lazily as possible.
  This is achieved by limiting the preallocation, and deferring the
  allocation to map load time when it fails. In the latter case, the
  DVMA memory reserved for unloaded maps can be stolen to free up enough
  memory for loading a map.
- allow NULL settings in the method tables, and search the parent tags
  until an appropriate implementation is found. This allows to remove some
  kluges in the old implementation.
2002-03-24 02:50:53 +00:00
tmm
521e80e700 Make the OpenFirmware interrupt mapping code more generic, to reduce
the bus-dependent code and to be able to support more systems. The core
of the new code is mostly obtained from NetBSD.
Kluge the interrupt routing methods of the psycho and apb drivers so
that an intline of 0 can be handled for now; real routing is still not
possible (all intline registers are preinitialized instead); this will
require a sparc64-specific adaption of the driver for generic PCI-PCI
bridges with a custom routing method to work right.
2002-03-24 02:11:06 +00:00
tmm
b0d3f0a8e8 Add code to print the fault virtual address for uncorrectable DMA errors
caused by IOMMU misses to aid debugging. This will only work on
UltraSPARC-IIi and IIe.
2002-03-23 20:42:23 +00:00
tmm
5421e07c01 De-__P(), de-K&R, remove superfluous comments and prototypes, some
style fixes. No functional changes.
2002-03-23 20:27:32 +00:00
jake
f11b0f8cfb Fix a deadlock condition with tlb shootdown ipi delivery. Since ipis are
not blocked by raising the pil, a reciever may be interrupted while holding
a spinlock.  If the sender does not defer interrupts throughout the entire
operation it may be interrupted and try to acquire a spinlock held by a
reciever, leading to a deadlock due to the synchronization used by the
ipi handlers themselves.

Submitted by:	tmm
2002-03-23 04:20:00 +00:00
obrien
4c2f517045 ASM versions of __FBSDID. 2002-03-23 02:01:27 +00:00
imp
7ca3d3c8ba intr_disable returns register_t 2002-03-21 06:21:32 +00:00
jeff
70ff425bc3 Remove references to vm_zone.h and switch over to the new uma API.
Reviewed by:	jake
2002-03-21 02:30:27 +00:00
alfred
e1ec4d77dc Remove __P.
profile.h and bus.h were excluded because there is currently WIP.

Reviewed by: tmm
2002-03-21 00:06:55 +00:00
jeff
2923687da3 This is the first part of the new kernel memory allocator. This replaces
malloc(9) and vm_zone with a slab like allocator.

Reviewed by:	arch@
2002-03-19 09:11:49 +00:00
des
a032109782 Move the definition of PT_[GS]ET{,DB,FP}REGS from the MD ptrace.h to the
MI ptrace.h, since all platforms define them.  Keep the MD ptrace.h around
for FIX_SSTEP (which is currently only needed on Alpha).
2002-03-16 00:25:53 +00:00
jake
87daf4df50 Fix ifdef LOCORE protection. 2002-03-13 06:04:36 +00:00
jake
97430a03ac Add support for starting and stopping cpus with ipis.
Stop the other cpus when shutting down or entering the debugger.

Submitted by:	tmm
2002-03-13 04:59:01 +00:00
jake
2b8f2f82cf Add support for driving the clocks on secondary cpus.
Submitted by:	tmm
2002-03-13 04:38:33 +00:00
jake
df6db29bae Make IPI_WAIT use a bit mask of the cpus that a pmap is active on and only
wait for those cpus, instead of all of them by using a count.  Oops.
Make the pointer to the mask that the primary cpu spins on volatile, so
gcc doesn't optimize out an important load.  Oops again.
Activate tlb shootdown ipi synchronization now that it works.  We have
all involved cpus wait until all the others are done.  This may not be
necessary, it is mostly for sanity.
Make the trigger level interrupt ipi handler work.

Submitted by:	tmm
2002-03-13 03:43:00 +00:00
jake
6ee80641df Add an ATOMIC_CLEAR_INT macro.
Submitted by:	tmm
2002-03-13 03:28:47 +00:00
tmm
aababfffd5 Fix the type of some constants, and make some macros safer by casting
the argument.
2002-03-11 03:04:28 +00:00
tmm
4667cf8132 Add convenience macros to extract the cc0 and cc1 from format 2 and 3
instructions.
2002-03-11 03:03:35 +00:00
jake
5f2da45bc7 Increase VM_KMEM_SIZE to 16 megs from 12. Define VM_KMEM_SIZE_SCALE so that
the number of physical pages per KVA page allocated scales properly with
memory size.  This fixes problems with kmem_map being too small.

Noticed by:	mike, wollman
Submitted by:	tmm
2002-03-09 23:35:50 +00:00
mike
b8cc0d1207 o Don't require long long support in bswap64() functions.
o In i386's <machine/endian.h>, macros have some advantages over
  inlines, so change some inlines to macros.
o In i386's <machine/endian.h>, ungarbage collect word_swap_int()
  (previously __uint16_swap_uint32), it has some uses on i386's with
  PDP endianness.

Submitted by:	bde

o Move a comment up in <machine/endian.h> that was accidentially moved
  down a few revisions ago.
o Reenable userland's use of optimized inline-asm versions of
  byteorder(3) functions.
o Fix ordering of prototypes vs. redefinition of byteorder(3)
  functions, so that the non-GCC (libc asm) case has proper
  prototypes.
o Add proper prototypes for byteorder(3) functions in <sys/param.h>.
o Prevent redundant duplicate prototypes by making use of the
  _BYTEORDER_PROTOTYPED define.
o Move the bswap16(), bswap32(), bswap64() C functions into MD space
  for platforms in which asm versions don't exist.  This significantly
  reduces the complexity of some things at the cost of duplicate code.

Reviewed by:	bde
2002-03-09 21:02:16 +00:00
jake
33d439a7d0 Implement delivery of tlb shootdown ipis. This is currently more fine grained
than the other implementations; we have complete control over the tlb, so we
only demap specific pages.  We take advantage of the ranged tlb flush api
to send one ipi for a range of pages, and due to the pm_active optimization
we rarely send ipis for demaps from user pmaps.

Remove now unused routines to load the tlb; this is only done once outside
of the tlb fault handlers.
Minor cleanups to the smp startup code.

This boots multi user with both cpus active on a dual ultra 60 and on a
dual ultra 2.
2002-03-07 06:01:40 +00:00
jake
951cf2831e Modify the tlb demap API to take a pmap instead of a tlb context number.
Due to allocating tlb contexts on the fly, we only ever need to demap the
primary context, non-primary contexts have already been implicitly flushed
by context switching.  All we really need to tell is if its a kernel demap
or not, and its easier just to compare against the kernel_pmap which is a
constant.
2002-03-07 05:25:15 +00:00
jake
4adfe1f199 Add support for starting secondary cpus in kernel, as opposed to relying
on the loader to do it.  Improve smp startup code to be less racy and to
defer certain things until the right time.  This almost boots single user
on my dual ultra 60, it is still very fragile:

SMP: AP CPU #1 Launched!
Enter full pathname of shell or RETURN for /bin/sh:
# ls
Debugger("trapsig")
Stopped at      Debugger+0x1c:  ta              %xcc, 1
db> heh
No such command
db>
2002-03-04 07:12:36 +00:00
jake
c87ee2427d Dig the information about which tlb slots were used to map the kernel out
of the metadata passed by the loader.
2002-03-04 07:07:10 +00:00
jake
8322761809 Allocate tlb contexts on the fly in cpu_switch, instead of statically 1 to 1
with pmaps.  When the context numbers wrap around we flush all user mappings
from the tlb.  This makes use of the array indexed by cpuid to allow a pmap
to have a different context number on a different cpu.  If the context numbers
are then divided evenly among cpus such that none are shared, we can avoid
sending tlb shootdown ipis in an smp system for non-shared pmaps.  This also
removes a limit of 8192 processes (pmaps) that could be active at any given
time due to running out of tlb contexts.

Inspired by:		the brown book
Crucial bugfix from:	tmm
2002-03-04 05:20:29 +00:00
tmm
3ed05b7b89 Add the following functions/macros to support byte order conversions and
device drivers for bus system with other endinesses than the CPU (using
interfaces compatible to NetBSD):

- bwap16() and bswap32(). These have optimized implementations on some
  architectures; for those that don't, there exist generic implementations.
- macros to convert from a certain byte order to host byte order and vice
  versa, using a naming scheme like le16toh(), htole16().
  These are implemented using the bswap functions.
- stream bus space access functions, which do not perform a byte order
  conversion (while the normal access functions would if the bus endianess
  differs from the CPU endianess).

htons(), htonl(), ntohs() and ntohl() are implemented using the new
functions above for kernel usage. None of the above interfaces is currently
exported to user land.

Make use of the new functions in a few places where local implementations
of the same functionality existed.

Reviewed by:	mike, bde
Tested on alpha by:	mike
2002-02-27 17:16:18 +00:00
jake
0f3fdcbf9d Minimal testing has shown that a 4 page tsb is a nice sweet spot for current
work loads.  It tapers off after that as gcc's working set generally just fits.

compiling bin/csh:

TSB_PAGES = 2
	213.33 real        77.59 user       110.01 sys
TSB_PAGES = 4
	116.43 real        75.78 user        19.16 sys
TSB_PAGES = 8
	119.27 real        76.38 user        18.12 sys

Testing by:	tmm
2002-02-27 06:18:02 +00:00
jake
e4a45ab17b Parameterize the number of pages to allocate for the per-cpu area on
PCPU_PAGES.
2002-02-27 06:08:13 +00:00
jake
c584d5961c Make cpu_identify take the value of the ver register and cpuid as arguments
so we can print nice things about non-current cpus.
2002-02-27 06:05:50 +00:00
jake
0f47dc5152 Wrap long lines. 2002-02-27 00:28:35 +00:00
jake
aec950ed91 Add a macro for shift of an integer (1 << shift == sizeof). Move the pointer
define to live alongside it.  For kicks assert at compile time that they are
correct.  Use these instead of magic numbers.
2002-02-27 00:21:04 +00:00
obrien
8449ab85ee Define basic macros required by GDB. 2002-02-26 21:49:46 +00:00
jake
8319be1fd2 Convert pmap.pm_context to an array of contexts indexed by cpuid. This
doesn't make sense for SMP right now, but it is a means to an end.
2002-02-26 06:57:30 +00:00
jake
84d0ef9268 Allow the user tsb to span multiple pages. Make the default 2 pages for now
until we do some testing to see what's best.  This gives a massive reduction
in system time for processes with a relatively large working set.  The size
of the tsb directly affects the rss size that a user process can keep mapped.
When it starts to get full replacements occur and the process takes a lot of
soft vm faults.  Increasing the default from 1 page to 2 gives the following
before and after numbers for compiling vfs_bio.c:

before:
       14.27 real         6.56 user         5.69 sys
after:
        8.57 real         6.11 user         1.62 sys

This should make self hosted builds more tolerable.
2002-02-26 02:37:43 +00:00
jake
2a0b1812b2 Remove code to lock the user tsb into the tlb. We can handle faults on it
now, as we do for normal wired kernel memory.
2002-02-25 22:58:41 +00:00
jake
11e9d44ed7 Implement a nested window state. This avoids attempting to spill a user
window to the user stack while in a nested kernel trap.  We do this for
entry to the kernel from user mode, but if we get an interrupt in kernel
mode while there are still user windows in the cpu, and we attempt to spill
to the user stack, we may take too many nested traps and overflow the trap
stack, causing a red state exception.  This is needed by upcoming changes
to allow the user tsb to not be locked in the tlb.

Reviewed by:	tmm
2002-02-25 18:37:17 +00:00
jake
7eea55cfea Modify the tte format to not include the tlb context number and to store the
virtual page number in a much more convenient way; all in one piece.  This
greatly simplifies the comparison for a matching tte, and allows the fault
handlers to be much simpler due to not having to load wierd masks.
Rewrite the tlb fault handlers to account for the new format.  These are also
written to allow faults on the user tsb inside of the fault handlers; the
kernel fault handler must be aware of this and not clobber the other's
registers.  The faults do not yet occur due to other support that is needed
(and still under my desk).

Bug fixes from:	tmm
2002-02-25 04:56:50 +00:00
jake
63d6dd6dde Add inlines for demapping a range of pages from the itlb and dtlb. This
will be used to reduce the number of tlb shootdown ipis in an smp system
by sending one ipi for a whole range of pages, instead of one per page.
Munge the context demap operations slightly to support demapping a non-primary
context.
2002-02-23 21:10:06 +00:00
jake
b0c5fb0be3 Use intr_disable/intr_restore instead of TLB_ATOMIC_START/END.
Submitted by:	tmm
2002-02-23 20:59:35 +00:00
jake
ed147b733c Adapt the tsb_foreach interface to take a source and a destination pmap so
that it can be used for pmap_copy.  Other consumers ignore the second pmap.
Add statistics gathering for tsb_foreach.
Implement pmap_copy.
2002-02-23 20:25:20 +00:00
jake
d715e97c9b Add macros to extract the UPA module id from the UPA config register.
This is the hardware cpuid.
2002-02-23 19:54:34 +00:00
jake
fc548e14d2 Include intr_machdep.h only for !LOCORE. 2002-02-23 18:41:34 +00:00
jake
3ad7e30eb5 Add metadata types for dtlb and itlb data, and number of slots used. 2002-02-23 17:43:44 +00:00
mike
bcee06d42c o Move NTOHL() and associated macros into <sys/param.h>. These are
deprecated in favor of the POSIX-defined lowercase variants.
o Change all occurrences of NTOHL() and associated marcros in the
  source tree to use the lowercase function variants.
o Add missing license bits to sparc64's <machine/endian.h>.
  Approved by: jake
o Clean up <machine/endian.h> files.
o Remove unused __uint16_swap_uint32() from i386's <machine/endian.h>.
o Remove prototypes for non-existent bswapXX() functions.
o Include <machine/endian.h> in <arpa/inet.h> to define the
  POSIX-required ntohl() family of functions.
o Do similar things to expose the ntohl() family in libstand, <netinet/in.h>,
  and <sys/param.h>.
o Prepend underscores to the ntohl() family to help deal with
  complexities associated with having MD (asm and inline) versions, and
  having to prevent exposure of these functions in other headers that
  happen to make use of endian-specific defines.
o Create weak aliases to the canonical function name to help deal with
  third-party software forgetting to include an appropriate header.
o Remove some now unneeded pollution from <sys/types.h>.
o Add missing <arpa/inet.h> includes in userland.

Tested on:	alpha, i386
Reviewed by:	bde, jake, tmm
2002-02-18 20:35:27 +00:00
wollman
7d29dd10d3 Resurrect one of the easiest changes from my big include files roll-up
patch from a year ago: give file flags their own type.  This does not
(yet) change the type used by system calls or library functions.
The underlying type was chosen to match what is returned by stat().
2002-02-15 22:15:39 +00:00