Commit Graph

192 Commits

Author SHA1 Message Date
Jake Burkholder
eb5b9c0be3 Add support for starting secondary cpus in kernel, as opposed to relying
on the loader to do it.  Improve smp startup code to be less racy and to
defer certain things until the right time.  This almost boots single user
on my dual ultra 60, it is still very fragile:

SMP: AP CPU #1 Launched!
Enter full pathname of shell or RETURN for /bin/sh:
# ls
Debugger("trapsig")
Stopped at      Debugger+0x1c:  ta              %xcc, 1
db> heh
No such command
db>
2002-03-04 07:12:36 +00:00
Jake Burkholder
907164660a Dig the information about which tlb slots were used to map the kernel out
of the metadata passed by the loader.
2002-03-04 07:07:10 +00:00
Jake Burkholder
5573db3f0b Allocate tlb contexts on the fly in cpu_switch, instead of statically 1 to 1
with pmaps.  When the context numbers wrap around we flush all user mappings
from the tlb.  This makes use of the array indexed by cpuid to allow a pmap
to have a different context number on a different cpu.  If the context numbers
are then divided evenly among cpus such that none are shared, we can avoid
sending tlb shootdown ipis in an smp system for non-shared pmaps.  This also
removes a limit of 8192 processes (pmaps) that could be active at any given
time due to running out of tlb contexts.

Inspired by:		the brown book
Crucial bugfix from:	tmm
2002-03-04 05:20:29 +00:00
Thomas Moestl
90ce56c287 Add the following functions/macros to support byte order conversions and
device drivers for bus system with other endinesses than the CPU (using
interfaces compatible to NetBSD):

- bwap16() and bswap32(). These have optimized implementations on some
  architectures; for those that don't, there exist generic implementations.
- macros to convert from a certain byte order to host byte order and vice
  versa, using a naming scheme like le16toh(), htole16().
  These are implemented using the bswap functions.
- stream bus space access functions, which do not perform a byte order
  conversion (while the normal access functions would if the bus endianess
  differs from the CPU endianess).

htons(), htonl(), ntohs() and ntohl() are implemented using the new
functions above for kernel usage. None of the above interfaces is currently
exported to user land.

Make use of the new functions in a few places where local implementations
of the same functionality existed.

Reviewed by:	mike, bde
Tested on alpha by:	mike
2002-02-27 17:16:18 +00:00
Jake Burkholder
df38f87be1 Minimal testing has shown that a 4 page tsb is a nice sweet spot for current
work loads.  It tapers off after that as gcc's working set generally just fits.

compiling bin/csh:

TSB_PAGES = 2
	213.33 real        77.59 user       110.01 sys
TSB_PAGES = 4
	116.43 real        75.78 user        19.16 sys
TSB_PAGES = 8
	119.27 real        76.38 user        18.12 sys

Testing by:	tmm
2002-02-27 06:18:02 +00:00
Jake Burkholder
95a44511f3 Parameterize the number of pages to allocate for the per-cpu area on
PCPU_PAGES.
2002-02-27 06:08:13 +00:00
Jake Burkholder
62ad058292 Make cpu_identify take the value of the ver register and cpuid as arguments
so we can print nice things about non-current cpus.
2002-02-27 06:05:50 +00:00
Jake Burkholder
d689965cb4 Wrap long lines. 2002-02-27 00:28:35 +00:00
Jake Burkholder
5d60dc233a Add a macro for shift of an integer (1 << shift == sizeof). Move the pointer
define to live alongside it.  For kicks assert at compile time that they are
correct.  Use these instead of magic numbers.
2002-02-27 00:21:04 +00:00
David E. O'Brien
db3442f54d Define basic macros required by GDB. 2002-02-26 21:49:46 +00:00
Jake Burkholder
de3fee8992 Convert pmap.pm_context to an array of contexts indexed by cpuid. This
doesn't make sense for SMP right now, but it is a means to an end.
2002-02-26 06:57:30 +00:00
Jake Burkholder
6e5f3d0f0f Allow the user tsb to span multiple pages. Make the default 2 pages for now
until we do some testing to see what's best.  This gives a massive reduction
in system time for processes with a relatively large working set.  The size
of the tsb directly affects the rss size that a user process can keep mapped.
When it starts to get full replacements occur and the process takes a lot of
soft vm faults.  Increasing the default from 1 page to 2 gives the following
before and after numbers for compiling vfs_bio.c:

before:
       14.27 real         6.56 user         5.69 sys
after:
        8.57 real         6.11 user         1.62 sys

This should make self hosted builds more tolerable.
2002-02-26 02:37:43 +00:00
Jake Burkholder
07d99740b6 Remove code to lock the user tsb into the tlb. We can handle faults on it
now, as we do for normal wired kernel memory.
2002-02-25 22:58:41 +00:00
Jake Burkholder
4c4a1a19e8 Implement a nested window state. This avoids attempting to spill a user
window to the user stack while in a nested kernel trap.  We do this for
entry to the kernel from user mode, but if we get an interrupt in kernel
mode while there are still user windows in the cpu, and we attempt to spill
to the user stack, we may take too many nested traps and overflow the trap
stack, causing a red state exception.  This is needed by upcoming changes
to allow the user tsb to not be locked in the tlb.

Reviewed by:	tmm
2002-02-25 18:37:17 +00:00
Jake Burkholder
3c997c536c Modify the tte format to not include the tlb context number and to store the
virtual page number in a much more convenient way; all in one piece.  This
greatly simplifies the comparison for a matching tte, and allows the fault
handlers to be much simpler due to not having to load wierd masks.
Rewrite the tlb fault handlers to account for the new format.  These are also
written to allow faults on the user tsb inside of the fault handlers; the
kernel fault handler must be aware of this and not clobber the other's
registers.  The faults do not yet occur due to other support that is needed
(and still under my desk).

Bug fixes from:	tmm
2002-02-25 04:56:50 +00:00
Jake Burkholder
4d6fc96f19 Add inlines for demapping a range of pages from the itlb and dtlb. This
will be used to reduce the number of tlb shootdown ipis in an smp system
by sending one ipi for a whole range of pages, instead of one per page.
Munge the context demap operations slightly to support demapping a non-primary
context.
2002-02-23 21:10:06 +00:00
Jake Burkholder
e87e98c1d6 Use intr_disable/intr_restore instead of TLB_ATOMIC_START/END.
Submitted by:	tmm
2002-02-23 20:59:35 +00:00
Jake Burkholder
919f71f0fc Adapt the tsb_foreach interface to take a source and a destination pmap so
that it can be used for pmap_copy.  Other consumers ignore the second pmap.
Add statistics gathering for tsb_foreach.
Implement pmap_copy.
2002-02-23 20:25:20 +00:00
Jake Burkholder
502a371561 Add macros to extract the UPA module id from the UPA config register.
This is the hardware cpuid.
2002-02-23 19:54:34 +00:00
Jake Burkholder
40e8552ea0 Include intr_machdep.h only for !LOCORE. 2002-02-23 18:41:34 +00:00
Jake Burkholder
47fc42e6a0 Add metadata types for dtlb and itlb data, and number of slots used. 2002-02-23 17:43:44 +00:00
Mike Barcroft
fd8e4ebc8c o Move NTOHL() and associated macros into <sys/param.h>. These are
deprecated in favor of the POSIX-defined lowercase variants.
o Change all occurrences of NTOHL() and associated marcros in the
  source tree to use the lowercase function variants.
o Add missing license bits to sparc64's <machine/endian.h>.
  Approved by: jake
o Clean up <machine/endian.h> files.
o Remove unused __uint16_swap_uint32() from i386's <machine/endian.h>.
o Remove prototypes for non-existent bswapXX() functions.
o Include <machine/endian.h> in <arpa/inet.h> to define the
  POSIX-required ntohl() family of functions.
o Do similar things to expose the ntohl() family in libstand, <netinet/in.h>,
  and <sys/param.h>.
o Prepend underscores to the ntohl() family to help deal with
  complexities associated with having MD (asm and inline) versions, and
  having to prevent exposure of these functions in other headers that
  happen to make use of endian-specific defines.
o Create weak aliases to the canonical function name to help deal with
  third-party software forgetting to include an appropriate header.
o Remove some now unneeded pollution from <sys/types.h>.
o Add missing <arpa/inet.h> includes in userland.

Tested on:	alpha, i386
Reviewed by:	bde, jake, tmm
2002-02-18 20:35:27 +00:00
Garrett Wollman
3b7a4c4b1d Resurrect one of the easiest changes from my big include files roll-up
patch from a year ago: give file flags their own type.  This does not
(yet) change the type used by system calls or library functions.
The underlying type was chosen to match what is returned by stat().
2002-02-15 22:15:39 +00:00
Thomas Moestl
5dc4151c36 Add a delta missed in the last iommu.c commit. This unbreaks the sparc64
kernel build.

Pointy hat to:	tmm
2002-02-15 14:48:54 +00:00
Thomas Moestl
9a60579c15 Avoid crashing in early boot when WITNESS is enabled by moving the
mtx_init() for intr_table_lock after the globaldata pointer
initialization.
2002-02-13 16:36:44 +00:00
Thomas Moestl
fd2ee897e4 Use stxa_sync() when accessing the diagnostic registers to invalidate
caches; this is needed to avoid undefined behaviour.
Clean up a bit.
2002-02-13 16:20:38 +00:00
Thomas Moestl
bb2b9370c7 Add support for the counter-timer which is included in the Sun U2S and
U2P bridges as a time counter.
2002-02-13 16:16:36 +00:00
Thomas Moestl
e6c7af37bc Merge r1.42 of iommu.c and r1.9 of iommuvar.h from NetBSD (this adds
support for managing both streaming caches on psycho pairs).
Use explicit bus space accesses instead of mapping the device memory into
kva.
Move DVMA allocation to the map creation/dma memory allocation functions.
2002-02-13 15:59:17 +00:00
Thomas Moestl
64bc899300 Clean up bus space debugging support; change sparc64_bus_mem_map() to
take a bus tag and handle as argument instead of a i/o space id and a
physical address, now that nexus handles device memory resource
allocations.
2002-02-13 15:51:57 +00:00
Thomas Moestl
68716de3ec Define constants for the CPU implementation id; export the dectected id
as cpu_impl.
2002-02-13 15:47:12 +00:00
Thomas Moestl
9fb2b0d55e Add a few new functions/macros: intr_disable() and intr_restore() to
disable interrupts completely, and stxa_sync(), which performs a store
immediately followed by a membar #Sync with interrupts disabled (this
is needed for writes to diagnostic registers).
2002-02-13 15:40:05 +00:00
David E. O'Brien
837d5e7485 Add this FreeBSD standard header. 2002-02-10 14:27:20 +00:00
Jake Burkholder
d2d3e8a55a Add extern to avoid sloppy common style declarations.
Tripped over by:	jhb, mux@sneakerz.org
2002-01-16 14:28:50 +00:00
Thomas Moestl
d7ff50a055 Add upa.h, which I had previously forgotten, to unbreak the sparc64
kernel build.

Pointy hat to:	tmm
2002-01-08 16:25:51 +00:00
Jake Burkholder
6deb695c1d Add initial smp support. This gets as far as allowing the secondary
cpu(s) into the kernel, and sync-ing them up to "kernel" mode so we can
send them ipis, which also work.

Thanks to John Baldwin for providing me with access to the hardware
that made this possible.

Parts obtained from:	bsd/os
2002-01-08 05:50:26 +00:00
Jake Burkholder
01cad5de12 Add a macro for getting the tlbs (itlb and/or dtlb) which the given
tte may be mapped by.
2002-01-08 05:07:58 +00:00
Jake Burkholder
70f550ae58 Prototype pmap_map_tsb(). 2002-01-08 05:06:39 +00:00
Jake Burkholder
dff34dde25 Remove PANIC_STACK_PAGES which is no longer used.
Redefine the compile time assertion macro to take one parameter.
2002-01-08 05:05:42 +00:00
Jake Burkholder
d2746ddb32 Add declarations needed by last commit. 2002-01-08 05:03:36 +00:00
Jake Burkholder
338f7e3e47 Add a md field to pcpu for the upa module id.
Remove the alt_stack field.
Use the defines for the register variables declared in C, so that they
don't get out of sync with the assembler.
2002-01-08 04:40:13 +00:00
Jake Burkholder
f25af321eb Define CKLF_PC in terms of TRAPF_PC. 2002-01-08 04:36:53 +00:00
Jake Burkholder
c0e12e9356 Add a mov() macro, which is used in conjunction with the register defines
for setting reserved global registers from c.
2002-01-08 04:36:01 +00:00
Jake Burkholder
2a32c6cea1 Update comments and defines to reflect that normal and alternate g6 point
to the current pcb.
Remove interrupt global defines; they use PCPU_REG now.
Move ATOMIC_INC_INT here from exception.s, add ATOMIC_DEC_INT.
Add a KASSERT macro for use in assembler.
2002-01-08 04:34:20 +00:00
Jake Burkholder
3a3b7ddc5a Add asis for the upa config reg, which contains the hardware cpu id, and
for the interrupt send register, which is used for dispatching ipis.
2002-01-08 04:29:50 +00:00
Thomas Moestl
62ec4add59 1. Implement an optimization for pmap_remove() and pmap_protect(): if a
substantial fraction of the number of entries of tte's in the tsb
   would need to be looked up, traverse the tsb instead. This is crucial
   in some places, e.g. when swapping out a process, where a certain
   pmap_remove() call would take very long time to complete without this.
2. Implement pmap_qenter_flags(), which will become used later
3. Reactivate the instruction cache flush done when mapping as executable.
   This is required e.g. when executing files via NFS, but is known to
   cause problems on UltraSPARC-IIe CPU's. If you have such a CPU, you
   will need to comment this call out for now.

Submitted by:	jake (3)
2002-01-02 18:49:20 +00:00
Thomas Moestl
ca8712b098 Correct the defintion of struct ofw_upa_regs, and use it instead of
struct ofw_nexus_reg. Implement UPA device memory management in the
nexus driver.
Adapt the psycho driver to these changes, and do some minor cleanup work
while being there.
2002-01-02 18:27:13 +00:00
Jake Burkholder
2f3a74c449 Define __ASM__ so that libc will know not to define C things. 2002-01-01 21:21:05 +00:00
Jake Burkholder
947f30e25e Add a define for the fp restore soft trap type.
Only declare C things if __ASM__ is not defined.
2002-01-01 21:19:46 +00:00
Jake Burkholder
1ca0488670 Implement sysarch(SPARC_UTRAP_INSTALL).
Forgot this file in last commit.
2002-01-01 20:57:51 +00:00
Jake Burkholder
cbe522d307 Implement user trap delivery as specified by the sparc abi. This provides
an efficient way for the kernel to bounce certain mundane traps back to
userland for handling there.  A user trap handler returns directly to the
trapping user code, rather than going through the kernel again.  Only a
handful of instructions are actually executed in kernel mode.
Implement sysarch(SPARC_UTRAP_INSTALL).
Add code to handle sharing of the user trap table across forks and unsharing
at exec.

This can be used to implement efficient tracking of floating point register
usage in userland, fe by a thread library, and to handle alignment fault
fixups and instruction emulation in userland, for which the code may need
to be different for 32bit and 64bit binaries.
2002-01-01 20:56:28 +00:00