Commit Graph

352 Commits

Author SHA1 Message Date
Mike Silbersack
7f3a40933b Fix a horribly suboptimal algorithm in the vm_daemon.
In order to determine what to page out, the vm_daemon checks
reference bits on all pages belonging to all processes.  Unfortunately,
the algorithm used reacted badly with shared pages; each shared page
would be checked once per process sharing it; this caused an O(N^2)
growth of tlb invalidations.  The algorithm has been changed so that
each page will be checked only 16 times.

Prior to this change, a fork/sleepbomb of 1300 processes could cause
the vm_daemon to take over 60 seconds to complete, effectively
freezing the system for that time period.  With this change
in place, the vm_daemon completes in less than a second.  Any system
with hundreds of processes sharing pages should benefit from this change.

Note that the vm_daemon is only run when the system is under extreme
memory pressure.  It is likely that many people with loaded systems saw
no symptoms of this problem until they reached the point where swapping
began.

Special thanks go to dillon, peter, and Chuck Cranor, who helped me
get up to speed with vm internals.

PR:		33542, 20393
Reviewed by:	dillon
MFC after:	1 week
2002-02-27 18:03:02 +00:00
Thomas Moestl
90ce56c287 Add the following functions/macros to support byte order conversions and
device drivers for bus system with other endinesses than the CPU (using
interfaces compatible to NetBSD):

- bwap16() and bswap32(). These have optimized implementations on some
  architectures; for those that don't, there exist generic implementations.
- macros to convert from a certain byte order to host byte order and vice
  versa, using a naming scheme like le16toh(), htole16().
  These are implemented using the bswap functions.
- stream bus space access functions, which do not perform a byte order
  conversion (while the normal access functions would if the bus endianess
  differs from the CPU endianess).

htons(), htonl(), ntohs() and ntohl() are implemented using the new
functions above for kernel usage. None of the above interfaces is currently
exported to user land.

Make use of the new functions in a few places where local implementations
of the same functionality existed.

Reviewed by:	mike, bde
Tested on alpha by:	mike
2002-02-27 17:16:18 +00:00
Jake Burkholder
df38f87be1 Minimal testing has shown that a 4 page tsb is a nice sweet spot for current
work loads.  It tapers off after that as gcc's working set generally just fits.

compiling bin/csh:

TSB_PAGES = 2
	213.33 real        77.59 user       110.01 sys
TSB_PAGES = 4
	116.43 real        75.78 user        19.16 sys
TSB_PAGES = 8
	119.27 real        76.38 user        18.12 sys

Testing by:	tmm
2002-02-27 06:18:02 +00:00
Jake Burkholder
95a44511f3 Parameterize the number of pages to allocate for the per-cpu area on
PCPU_PAGES.
2002-02-27 06:08:13 +00:00
Jake Burkholder
62ad058292 Make cpu_identify take the value of the ver register and cpuid as arguments
so we can print nice things about non-current cpus.
2002-02-27 06:05:50 +00:00
Jake Burkholder
ad414cb452 Minor cleanup. 2002-02-27 00:31:31 +00:00
Jake Burkholder
d689965cb4 Wrap long lines. 2002-02-27 00:28:35 +00:00
Jake Burkholder
9d16662aa3 Use pcpu.pc_cpumask instead of computing 1 << cpuid. 2002-02-27 00:27:05 +00:00
Jake Burkholder
5d60dc233a Add a macro for shift of an integer (1 << shift == sizeof). Move the pointer
define to live alongside it.  For kicks assert at compile time that they are
correct.  Use these instead of magic numbers.
2002-02-27 00:21:04 +00:00
Jake Burkholder
7a8ee66881 Wrap long lines. 2002-02-27 00:03:01 +00:00
David E. O'Brien
db3442f54d Define basic macros required by GDB. 2002-02-26 21:49:46 +00:00
Jake Burkholder
c7f9e1fdbf Apparently gcc3.1 is now using deprcated v8 instructions in v9 code
due to them being faster in certain cases.  Therefore we need to save
and restore the v8 %y register around traps in kernel mode as well as
traps in usermode.

Tested by:	obrien, tmm
2002-02-26 17:09:24 +00:00
Jake Burkholder
de3fee8992 Convert pmap.pm_context to an array of contexts indexed by cpuid. This
doesn't make sense for SMP right now, but it is a means to an end.
2002-02-26 06:57:30 +00:00
Jake Burkholder
200e6309c9 Pu back a call to pmap_context_destroy which was accidentily removed
in the previous commit.

Spotted by:	tmm
2002-02-26 06:39:38 +00:00
Jake Burkholder
6e5f3d0f0f Allow the user tsb to span multiple pages. Make the default 2 pages for now
until we do some testing to see what's best.  This gives a massive reduction
in system time for processes with a relatively large working set.  The size
of the tsb directly affects the rss size that a user process can keep mapped.
When it starts to get full replacements occur and the process takes a lot of
soft vm faults.  Increasing the default from 1 page to 2 gives the following
before and after numbers for compiling vfs_bio.c:

before:
       14.27 real         6.56 user         5.69 sys
after:
        8.57 real         6.11 user         1.62 sys

This should make self hosted builds more tolerable.
2002-02-26 02:37:43 +00:00
Jake Burkholder
07d99740b6 Remove code to lock the user tsb into the tlb. We can handle faults on it
now, as we do for normal wired kernel memory.
2002-02-25 22:58:41 +00:00
David E. O'Brien
65955377e3 I was able to boot this kernel using the latest WIP kernel sources.
I don't believe anyone is quite using the sparc64 kernel sources in CVS
yet -- things aren't just quite ready (but almost).  So this commit should
be OK to make.
2002-02-25 22:13:44 +00:00
Jake Burkholder
4c4a1a19e8 Implement a nested window state. This avoids attempting to spill a user
window to the user stack while in a nested kernel trap.  We do this for
entry to the kernel from user mode, but if we get an interrupt in kernel
mode while there are still user windows in the cpu, and we attempt to spill
to the user stack, we may take too many nested traps and overflow the trap
stack, causing a red state exception.  This is needed by upcoming changes
to allow the user tsb to not be locked in the tlb.

Reviewed by:	tmm
2002-02-25 18:37:17 +00:00
Jake Burkholder
3c997c536c Modify the tte format to not include the tlb context number and to store the
virtual page number in a much more convenient way; all in one piece.  This
greatly simplifies the comparison for a matching tte, and allows the fault
handlers to be much simpler due to not having to load wierd masks.
Rewrite the tlb fault handlers to account for the new format.  These are also
written to allow faults on the user tsb inside of the fault handlers; the
kernel fault handler must be aware of this and not clobber the other's
registers.  The faults do not yet occur due to other support that is needed
(and still under my desk).

Bug fixes from:	tmm
2002-02-25 04:56:50 +00:00
David E. O'Brien
44bec8227d Sync with the Alpha's GENERIC configuration.
Most of the contents are commented out as they are as-yet untested.
However, I wanted the contents to match our other arches, so that when
people make changes to {i386,alpha,ia64}, they will also make the same
changes here.
2002-02-24 18:49:38 +00:00
Jake Burkholder
27886bcca9 Make use of the ranged tlb demap operations where ever possible. Use
pmap_qenter and pmap_qremove in preference to pmap_kenter/pmap_kremove.
The former maps in multiple pages at a time, and so can do a ranged
flush.  Don't assume that pmap_kenter and pmap_kremove will flush the tlb,
even though they still do.  It will not once the MI code is updated to use
pmap_qenter and pmap_qremove.
2002-02-23 22:18:15 +00:00
Jake Burkholder
4852de3022 Add needed include of ucontext.h. 2002-02-23 22:03:25 +00:00
Jake Burkholder
4d6fc96f19 Add inlines for demapping a range of pages from the itlb and dtlb. This
will be used to reduce the number of tlb shootdown ipis in an smp system
by sending one ipi for a whole range of pages, instead of one per page.
Munge the context demap operations slightly to support demapping a non-primary
context.
2002-02-23 21:10:06 +00:00
Jake Burkholder
e87e98c1d6 Use intr_disable/intr_restore instead of TLB_ATOMIC_START/END.
Submitted by:	tmm
2002-02-23 20:59:35 +00:00
Jake Burkholder
b7e93803fb Use PCB_REG instead of loading the pcb from curthread. This fixes a bug
where %g6 could be inconsistent for 1 instruction.
2002-02-23 20:54:01 +00:00
Jake Burkholder
919f71f0fc Adapt the tsb_foreach interface to take a source and a destination pmap so
that it can be used for pmap_copy.  Other consumers ignore the second pmap.
Add statistics gathering for tsb_foreach.
Implement pmap_copy.
2002-02-23 20:25:20 +00:00
Jake Burkholder
7cc66bde35 Add statistic gathering for various tsb operations.
Submitted by:	tmm
2002-02-23 20:11:11 +00:00
Jake Burkholder
d250bf3668 Remove debug code. 2002-02-23 20:08:06 +00:00
Jake Burkholder
b07078429e Add statistic gathering for various pmap operations.
Submitted by:	tmm
2002-02-23 20:06:19 +00:00
Jake Burkholder
ca0d9d125e Remove CADDR1 and CADDR2 which are no longer used. On other architectures
these are used for copy and zeroing physical pages; we use physical addresses
directly.
2002-02-23 20:00:33 +00:00
Jake Burkholder
502a371561 Add macros to extract the UPA module id from the UPA config register.
This is the hardware cpuid.
2002-02-23 19:54:34 +00:00
Jake Burkholder
120a25d68a 1. Setup the user stack pointer before returning to a user trap handler.
If we don't do this here there's a 1 instruction race where an interrupt
   could come in and crash the user process due to having no stack.
2. Pass %fsr to the user trap handler in %l4.  Since %fsr can only be loaded
   from or stored to memory, we need to do some contortions and temporarily
   save it to the alternate global stack.
3. Reload the pcb and pcpu registers for traps in kernel mode, for sanity.

Submitted by:	tmm (1, 2)
2002-02-23 18:55:21 +00:00
Jake Burkholder
40e8552ea0 Include intr_machdep.h only for !LOCORE. 2002-02-23 18:41:34 +00:00
Jake Burkholder
4d4b329999 Add needed include of ucontext.h. Fix braino setting curpcb. 2002-02-23 18:39:09 +00:00
Jake Burkholder
47fc42e6a0 Add metadata types for dtlb and itlb data, and number of slots used. 2002-02-23 17:43:44 +00:00
Julian Elischer
77c4066424 Add some DIAGNOSTIC code.
While in userland, keep the thread's ucred reference in a shadow
field so that the usual place to store it is NULL.
If DIAGNOSTIC is not set, the thread ucred is kept valid until the next
kernel entry, at which time it is checked against the process cred
and possibly corrected. Produces a BIG speedup in
kernels with INVARIANTS set. (A previous commit corrected it
for the non INVARIANTS case already)

Reviewed by:	dillon@freebsd.org
2002-02-22 23:58:22 +00:00
Poul-Henning Kamp
1cbb9c3b03 Convert p->p_runtime and PCPU(switchtime) to bintime format. 2002-02-22 13:32:01 +00:00
Julian Elischer
5b8c0a2993 Catch up with i386 change I forgot to commit. 2002-02-19 03:23:28 +00:00
Mike Barcroft
fd8e4ebc8c o Move NTOHL() and associated macros into <sys/param.h>. These are
deprecated in favor of the POSIX-defined lowercase variants.
o Change all occurrences of NTOHL() and associated marcros in the
  source tree to use the lowercase function variants.
o Add missing license bits to sparc64's <machine/endian.h>.
  Approved by: jake
o Clean up <machine/endian.h> files.
o Remove unused __uint16_swap_uint32() from i386's <machine/endian.h>.
o Remove prototypes for non-existent bswapXX() functions.
o Include <machine/endian.h> in <arpa/inet.h> to define the
  POSIX-required ntohl() family of functions.
o Do similar things to expose the ntohl() family in libstand, <netinet/in.h>,
  and <sys/param.h>.
o Prepend underscores to the ntohl() family to help deal with
  complexities associated with having MD (asm and inline) versions, and
  having to prevent exposure of these functions in other headers that
  happen to make use of endian-specific defines.
o Create weak aliases to the canonical function name to help deal with
  third-party software forgetting to include an appropriate header.
o Remove some now unneeded pollution from <sys/types.h>.
o Add missing <arpa/inet.h> includes in userland.

Tested on:	alpha, i386
Reviewed by:	bde, jake, tmm
2002-02-18 20:35:27 +00:00
Garrett Wollman
3b7a4c4b1d Resurrect one of the easiest changes from my big include files roll-up
patch from a year ago: give file flags their own type.  This does not
(yet) change the type used by system calls or library functions.
The underlying type was chosen to match what is returned by stat().
2002-02-15 22:15:39 +00:00
Thomas Moestl
5dc4151c36 Add a delta missed in the last iommu.c commit. This unbreaks the sparc64
kernel build.

Pointy hat to:	tmm
2002-02-15 14:48:54 +00:00
Thomas Moestl
89a87417d6 Calculate physmem before calling init_param2().
Submitted by:	jake
2002-02-13 17:05:56 +00:00
Thomas Moestl
9a60579c15 Avoid crashing in early boot when WITNESS is enabled by moving the
mtx_init() for intr_table_lock after the globaldata pointer
initialization.
2002-02-13 16:36:44 +00:00
Thomas Moestl
54a7de96bc Add the counter-timer node to the exclusion list, as it is handled
specially. While being there, sort that list.
2002-02-13 16:28:40 +00:00
Thomas Moestl
fac409f6bf Use stxa_sync() when accessing the LSU control register to avoid undefined
behaviour.
2002-02-13 16:25:33 +00:00
Thomas Moestl
fd2ee897e4 Use stxa_sync() when accessing the diagnostic registers to invalidate
caches; this is needed to avoid undefined behaviour.
Clean up a bit.
2002-02-13 16:20:38 +00:00
Thomas Moestl
bb2b9370c7 Add support for the counter-timer which is included in the Sun U2S and
U2P bridges as a time counter.
2002-02-13 16:16:36 +00:00
Thomas Moestl
633541ac6e Add support for the SBus, which is used in early Sun UltraSPARC machines.
Ported from NetBSD.
2002-02-13 16:11:36 +00:00
Thomas Moestl
e37d222c43 Merge r1.39 from NetBSD (manage both streaming caches for psycho pairs).
Use explicit bus space accesses instead of mapping the device memory
into kva.
Fix support for psycho pairs, and catch up with iommu code changes.
2002-02-13 16:07:59 +00:00
Thomas Moestl
e6c7af37bc Merge r1.42 of iommu.c and r1.9 of iommuvar.h from NetBSD (this adds
support for managing both streaming caches on psycho pairs).
Use explicit bus space accesses instead of mapping the device memory into
kva.
Move DVMA allocation to the map creation/dma memory allocation functions.
2002-02-13 15:59:17 +00:00