Commit Graph

126 Commits

Author SHA1 Message Date
David Greenman
24a1cce34f NOTE: libkvm, w, ps, 'top', and any other utility which depends on struct
proc or any VM system structure will have to be rebuilt!!!

Much needed overhaul of the VM system. Included in this first round of
changes:

1) Improved pager interfaces: init, alloc, dealloc, getpages, putpages,
   haspage, and sync operations are supported. The haspage interface now
   provides information about clusterability. All pager routines now take
   struct vm_object's instead of "pagers".

2) Improved data structures. In the previous paradigm, there is constant
   confusion caused by pagers being both a data structure ("allocate a
   pager") and a collection of routines. The idea of a pager structure has
   escentially been eliminated. Objects now have types, and this type is
   used to index the appropriate pager. In most cases, items in the pager
   structure were duplicated in the object data structure and thus were
   unnecessary. In the few cases that remained, a un_pager structure union
   was created in the object to contain these items.

3) Because of the cleanup of #1 & #2, a lot of unnecessary layering can now
   be removed. For instance, vm_object_enter(), vm_object_lookup(),
   vm_object_remove(), and the associated object hash list were some of the
   things that were removed.

4) simple_lock's removed. Discussion with several people reveals that the
   SMP locking primitives used in the VM system aren't likely the mechanism
   that we'll be adopting. Even if it were, the locking that was in the code
   was very inadequate and would have to be mostly re-done anyway. The
   locking in a uni-processor kernel was a no-op but went a long way toward
   making the code difficult to read and debug.

5) Places that attempted to kludge-up the fact that we don't have kernel
   thread support have been fixed to reflect the reality that we are really
   dealing with processes, not threads. The VM system didn't have complete
   thread support, so the comments and mis-named routines were just wrong.
   We now use tsleep and wakeup directly in the lock routines, for instance.

6) Where appropriate, the pagers have been improved, especially in the
   pager_alloc routines. Most of the pager_allocs have been rewritten and
   are now faster and easier to maintain.

7) The pagedaemon pageout clustering algorithm has been rewritten and
   now tries harder to output an even number of pages before and after
   the requested page. This is sort of the reverse of the ideal pagein
   algorithm and should provide better overall performance.

8) Unnecessary (incorrect) casts to caddr_t in calls to tsleep & wakeup
   have been removed. Some other unnecessary casts have also been removed.

9) Some almost useless debugging code removed.

10) Terminology of shadow objects vs. backing objects straightened out.
    The fact that the vm_object data structure escentially had this
    backwards really confused things. The use of "shadow" and "backing
    object" throughout the code is now internally consistent and correct
    in the Mach terminology.

11) Several minor bug fixes, including one in the vm daemon that caused
    0 RSS objects to not get purged as intended.

12) A "default pager" has now been created which cleans up the transition
    of objects to the "swap" type. The previous checks throughout the code
    for swp->pg_data != NULL were really ugly. This change also provides
    the rudiments for future backing of "anonymous" memory by something
    other than the swap pager (via the vnode pager, for example), and it
    allows the decision about which of these pagers to use to be made
    dynamically (although will need some additional decision code to do
    this, of course).

13) (dyson) MAP_COPY has been deprecated and the corresponding "copy
    object" code has been removed. MAP_COPY was undocumented and non-
    standard. It was furthermore broken in several ways which caused its
    behavior to degrade to MAP_PRIVATE. Binaries that use MAP_COPY will
    continue to work correctly, but via the slightly different semantics
    of MAP_PRIVATE.

14) (dyson) Sharing maps have been removed. It's marginal usefulness in a
    threads design can be worked around in other ways. Both #12 and #13
    were done to simplify the code and improve readability and maintain-
    ability. (As were most all of these changes)

TODO:

1) Rewrite most of the vnode pager to use VOP_GETPAGES/PUTPAGES. Doing
   this will reduce the vnode pager to a mere fraction of its current size.

2) Rewrite vm_fault and the swap/vnode pagers to use the clustering
   information provided by the new haspage pager interface. This will
   substantially reduce the overhead by eliminating a large number of
   VOP_BMAP() calls. The VOP_BMAP() filesystem interface should be
   improved to provide both a "behind" and "ahead" indication of
   contiguousness.

3) Implement the extended features of pager_haspage in swap_pager_haspage().
   It currently just says 0 pages ahead/behind.

4) Re-implement the swap device (swstrategy) in a more elegant way, perhaps
   via a much more general mechanism that could also be used for disk
   striping of regular filesystems.

5) Do something to improve the architecture of vm_object_collapse(). The
   fact that it makes calls into the swap pager and knows too much about
   how the swap pager operates really bothers me. It also doesn't allow
   for collapsing of non-swap pager objects ("unnamed" objects backed by
   other pagers).
1995-07-13 08:48:48 +00:00
Bruce Evans
943c18018b Fix standards conformance bugs in <signal.h>:
include/signal.h:
There was massive namespace pollution from including <sys/types.h>.
POSIX functions were declared even when _ANSI_SOURCE is defined.

sys.sys/signal.h:
NSIG was declared even if _ANSI_SOURCE or _POSIX_SOURCE is defined.
sig_atomic_t wasn't declared if _POSIX_SOURCE is defined.
Declare a typedef for signal handling functions and use it to
unobfuscate declarations and to avoid half-baked function types
that cause unwanted compiler warnings at certain warning levels.
Fix confusing comment about SA_RESTART.

sys/i386/include/signal.h:
This has to be included to get the declaration of sig_atomic_t even
when _ANSI_SOURCE is defined, so be more careful about polluting
the ANSI namespace.

Uniformize idempotency ifdefs.
1995-06-28 02:14:13 +00:00
Rodney W. Grimes
9b2e535452 Remove trailing whitespace. 1995-05-30 08:16:23 +00:00
David Greenman
b64b660cd3 Made "NMBCLUSTERS" calculation dynamic and fixed bogus use of "NMBCLUSTERS"
in machdep.c (it should use the global nmbclusters). Moved the calculation
of nmbclusters into conf/param.c (same place where nmbclusters has always
been assigned), and made the calculation include an extra amount based
on "maxusers". NMBCLUSTERS can still be overrided in the kernel config
file as always, but this change will make that generally unnecessary. This
fixes the "bug" reports from people who have misconfigured kernels seeing
the network hang when the mbuf cluster pool runs out.

Reviewed by:	John Dyson
1995-05-25 07:41:28 +00:00
David Greenman
30fd0561cd Added apersand constraint to make sure that the source and destination
registers aren't combined.

Reviewed by:	Bruce Evans and David Greenman
Submitted by:	John Dyson
1995-05-14 22:25:11 +00:00
Bruce Evans
0ee893eb32 Add loadandclear(). It atomically loads a value from memory, clears the
value in memory and returns the original value.
1995-05-11 07:24:35 +00:00
David Greenman
85eaa94715 Correct the definition for the (unused) cpu_setstack(). 1995-05-04 07:50:06 +00:00
Bruce Evans
3aa12267a5 Add and move declarations to fix all of the warnings from `gcc -Wimplicit'
(except in netccitt, netiso and netns) that I didn't notice when I fixed
"all" such warnings before.
1995-03-28 07:58:53 +00:00
David Greenman
1ae89ac36a Removed declaration of pmap_changebit()...it is no longer exported.
Submitted by:	John Dyson
1995-03-26 23:42:55 +00:00
Bruce Evans
b5e8ce9f12 Add and move declarations to fix all of the warnings from `gcc -Wimplicit'
(except in netccitt, netiso and netns) and most of the warnings from
`gcc -Wnested-externs'.  Fix all the bugs found.  There were no serious
ones.
1995-03-16 18:17:34 +00:00
David Greenman
3b7517f887 Preserve reverse link integraty while doing the queue insertion. 1995-03-03 22:14:42 +00:00
Bruce Evans
ceb91b3e4a Fix syntax errors in #ifdefed out code. 1995-02-16 13:21:47 +00:00
Søren Schmidt
1e1e0b4463 First attempt to run linux binaries. This is only the changes needed to
the generic kernel. The actual emulator is a separate LKM. (not finished
yet, sorry).
Submitted by:	sos@freebsd.org & sef@kithrup.com
1995-02-14 19:23:22 +00:00
Poul-Henning Kamp
86ca01bcee Whoops! back out last commit partly. 1995-02-14 06:57:45 +00:00
Poul-Henning Kamp
b53d84607c YFfix. 1995-02-14 06:55:42 +00:00
Poul-Henning Kamp
f026fea644 susword -> systm.h 1995-02-14 06:51:31 +00:00
David Greenman
fff3cea1a9 Moved various pmap 'bit' test/set functions back into real functions; gcc
generates better code at the expense of more of it.

Submitted by:	John Dyson
1995-01-24 09:57:39 +00:00
Bruce Evans
20415301cd Fix security holes in sigreturn(), ptrace() and procfs. sigreturn()
attempted to check for insecure and fatal eflags and segment
selectors, but missed many cases and got the IOPL check back to
front.  The other syscalls didn't check at all.

sys_process.c, machdep.c:
Only allow PT_WRITE_U to write to the registers (ordinary and FP).

psl.h, locore.s, machdep.c:
Eliminate PSL_MBZ, PSL_MBO and PSL_USERCLR.  We are not supposed
to assume anything about the reserved bits.  Use PSL_USERCHANGE
and PSL_KERNEL instead.  Rename PSL_USERSET to PSL_USER.

exception.s:
Define a private label for use by doreti when returning to user
mode fails.

machdep.c:
In syscalls, allow changing only the eflags that can be changed on
486's in user mode (no longer attempt to allow benign IOPL changes;
allow changing the nasty PSL_NT; don't allow changing the i586
bits).

Don't attempt to check all the cases involving invalid selectors
and %eip's.  Just check for privilege violations and let the invalid
things cause a trap.

procfs_machdep.c:
Call the ptrace register functions to do all the work for reading
and writing ordinary registers and for single stepping.

trap.c:
Ignore traps caused by PSL_NT being set.  Previously, users could
cause a fatal trap in user mode by setting PSL_NT and executing an
iret, and a fatal trap in kernel mode by setting PSL_NT and making
a syscall.  PSL_NT was cleared too late and not in enough modes to
fix the problem.

Make all traps in user mode (except T_NMI) nonfatal.

Recover from traps caused by attempting to load invalid user
registers in doreti by restarting the traps so that they appear to
occur in user mode.
---

Fix bogons that I noticed while fixing the above:

psl.h:
Fix some comments.

Uniformize idempotency ifdef.

exception.s, machdep.c:
Remove rsvd[0-14].  rsvd0 hasn't been reserved since the 486 came
out.  Replace rsvd0 by `align'.  rsvd[0-11] used wrong (magic
non-unique) trap numbers.  Replace rsvd[1-14] by rsvd.

locore.s:
Enable alignment check flag on 486's and 586's.

machdep.c:
Use a better type for kstack[].

Use TFREGP() to find the registers.

Reformat ptrace functions from SEF to something closer to KNF.

procfs_machdep.c:
The wrong pointer to the registers got fixed as a side effect.

Implement reading and writing of FP registers.

/proc/*/*regs now work (only) for processes that are in memory.

Clean up comments.

trap.c, trap.h:
Remove unused trap types.
1995-01-14 13:20:26 +00:00
Bruce Evans
3117fbd98e Enable define of CR0_AM to prepare for implementing alignment checking.
Uniformize idempotency ifdef.
1995-01-14 10:44:55 +00:00
Bruce Evans
c277f99332 Declare a real `struct fpreg' to prepare for implementing reading and
writing of FP regs for procfs.

Uniformize idempotency ifdef.
1995-01-14 10:41:41 +00:00
Bruce Evans
1e1a3d012d Remove reference to impossible trap type T_KDBTRAP. We don't support
watchpoints.

Uniformize idempotency ifdef.
1995-01-14 10:34:52 +00:00
David Greenman
0d94caffca These changes embody the support of the fully coherent merged VM buffer cache,
much higher filesystem I/O performance, and much better paging performance. It
represents the culmination of over 6 months of R&D.

The majority of the merged VM/cache work is by John Dyson.

The following highlights the most significant changes. Additionally, there are
(mostly minor) changes to the various filesystem modules (nfs, msdosfs, etc) to
support the new VM/buffer scheme.

vfs_bio.c:
Significant rewrite of most of vfs_bio to support the merged VM buffer cache
scheme.  The scheme is almost fully compatible with the old filesystem
interface.  Significant improvement in the number of opportunities for write
clustering.

vfs_cluster.c, vfs_subr.c
Upgrade and performance enhancements in vfs layer code to support merged
VM/buffer cache.  Fixup of vfs_cluster to eliminate the bogus pagemove stuff.

vm_object.c:
Yet more improvements in the collapse code.  Elimination of some windows that
can cause list corruption.

vm_pageout.c:
Fixed it, it really works better now.  Somehow in 2.0, some "enhancements"
broke the code.  This code has been reworked from the ground-up.

vm_fault.c, vm_page.c, pmap.c, vm_object.c
Support for small-block filesystems with merged VM/buffer cache scheme.

pmap.c vm_map.c
Dynamic kernel VM size, now we dont have to pre-allocate excessive numbers of
kernel PTs.

vm_glue.c
Much simpler and more effective swapping code.  No more gratuitous swapping.

proc.h
Fixed the problem that the p_lock flag was not being cleared on a fork.

swap_pager.c, vnode_pager.c
Removal of old vfs_bio cruft to support the past pseudo-coherency.  Now the
code doesn't need it anymore.

machdep.c
Changes to better support the parameter values for the merged VM/buffer cache
scheme.

machdep.c, kern_exec.c, vm_glue.c
Implemented a seperate submap for temporary exec string space and another one
to contain process upages. This eliminates all map fragmentation problems
that previously existed.

ffs_inode.c, ufs_inode.c, ufs_readwrite.c
Changes for merged VM/buffer cache.  Add "bypass" support for sneaking in on
busy buffers.

Submitted by:	John Dyson and David Greenman
1995-01-09 16:06:02 +00:00
David Greenman
b5ba45f6f3 Corrected the list of volatile registers for outsb, outsw, and outsl.
This bug caused my ethernet driver to break, among other things no doubt.
1995-01-04 20:42:25 +00:00
Bruce Evans
d5ebbddc5e Replace sv_ex_tw by padding (it is no longer used; the tag word in sv_env
is valid).

Expand comment about bogus padding for emulators.

Update prototpe for npxinit().
1995-01-03 03:57:46 +00:00
David Greenman
d9b026fcbd Add two more page table pages to keep 64MB machines happy. 1994-12-18 03:11:46 +00:00
Bruce Evans
91290462f6 Disable CLKF_BASEPRI() again. I forgot to edit an unwanted change out of
the diffs for the previous commit.
1994-12-03 10:18:24 +00:00
Bruce Evans
b39b673d37 i386/exception.s,
Keep track of interrupt nesting level.  It is normally 0
	for syscalls and traps, but is fudged to 1 for their exit
	processing in case they metamorphose into an interrupt
	handler.

i386/genassym.c;
	Remove support for the obsolete pcb_iml and pcb_cmap2.

	Add support for pcb_inl.

i386/swtch.s:
	Fudge the interrupt nesting level across context switches and in
	the idle loop so that the work for preemptive context switches
	gets counted as interrupt time, the work for voluntary context
	switches gets counted mostly as system time (the part when
	curproc == 0 gets counted as interrupt time), and only truly idle
	time gets counted as idle time.

	Remove obsolete support (commented out and otherwise) for pcb_iml.

	Load curpcb just before curproc instead of just after so that
	curpcb is always valid if curproc is.  A few more changes like
	this may fix tracing through context switches.

	Remove obsolete function swtch_to_inactive().

include/cpu.h:
	Use the new interrupt nesting level variable to implement a
	non-fake CLF_INTR() so that accounting for the interrupt state
	works.

	You can use top, iostat or (best) an up to date systat to see
	interrupt overheads.  I see the expected huge interrupt overheads
	for ISA devices (on a 486DX/33, about 55% for an IDE drive
	transferring 1250K/sec and the same for a WD8013EBT network card
	transferring 1100K/sec).  The huge interrupt overheads for serial
	devices are unfortunately normally invisible.

include/pcb.h:
	Remove the obsolete pcb_iml and pcb_cmap2.  Replace them by
	padding to preserve binary compatibility.

	Use part of the new padding for pcb_inl.

isa/icu.s:
isa/vector.s:
	Keep track of interrupt nesting level.
1994-12-03 10:03:19 +00:00
Poul-Henning Kamp
0a6a925d04 Declare "extern int bootverbose", so that device-drivers and others
easily can find it.
1994-11-26 09:27:58 +00:00
Bruce Evans
ff030ea17d Add prototype for Debugger(). 1994-11-15 14:55:25 +00:00
Bruce Evans
b0d1e6de04 Make gdt_segs[] public again for APM.
Make ldt[] public again and restore currentldt and _default_ldt for
USER_LDT.
1994-11-15 14:12:55 +00:00
Bruce Evans
004bedeb68 Rewrite almost everything.
Alphabetize.

Write all i/o functions in sleep so that we don't use anything from
NetBSD.

Restore the correct type of u_int for ports.  This saves a whole cycle
per i/o on 486's.

Change `inline' back to __inline to avoid compiler warnings with
-Wreally-all.

Don't implement bdb() unless BDE_DEBUGGER is defined.  Declare bdb_exists
outside the function to avoid hundreds of compiler warnings.

Let the compiler pick the register in asms if possible.

Implement ffs() using inline asm().  gcc provides a slightly different
one.  It was broken in gcc-2.4.5 but works now.  Declaring a correct
version inline ensures getting a correct version.  FreeBSD-1.1.5 has
an slow inline version but FreeBSD-2.0 has a library version (which
probably never gets used).

Do inb() and outb() without using %edx for constant ports below 0x100.

Remove casts to the same type in queue functions.

Declare prototypes for everything implemented i386/*.s and also for
everything that is normally implemented as an inline here (I don't
like the current complete dependency on gcc).  Ifdef out the prototypes
that are declared elsewhere.  THere should be a separate header to
declare things implemented in i386/*.s, but then it would be harder
to override declarations with inlines.

${UII}
1994-11-14 15:04:06 +00:00
Bruce Evans
040f100044 Remove 1.5+K of bloat for unused idt entries.
Partly support BDE_DEBUGGER.  Still broken by conflict with APM.  Does
nothing if BDE_DEBUGGER is not defined.

Clean up prototypes and data declarations.  Declare most of the segment
functions that are implemented in support.s.  Make data private in
machdep.c if possible.

Parenthesize expressions in macros properly!

${Uniformize idempotency ifdef}.
1994-11-14 14:18:15 +00:00
Bruce Evans
3bbb00e1a3 Declare inline functions as __inline and with new-style parameter lists
to avoid compiler warnings.

Clean up prototypes: alphabetize; don't use redundant `extern' or
meaningless `extern inline'.

Uniformize idempotency ifdef.
1994-11-14 14:12:24 +00:00
Bruce Evans
86a8bb8a33 Don't declare DELAY() here. Callers should include <machine/clock.h>. 1994-11-09 00:51:38 +00:00
Bruce Evans
a1ca704e29 Declare all functions exported by the npx driver.
Uniformize idempotency ifdefs.
1994-11-05 22:59:09 +00:00
Bruce Evans
65af765646 Declare the full uglyness of the interfaces to the clock driver (except
things declared in machine-independent files).
1994-11-05 22:51:17 +00:00
Bruce Evans
c342b9faa3 Disable the direct call from hardclock() to softclock(). Support
for it is incomplete and buggy.  There is no problem unless Xintr0()
is reentered or should be reentered, but high clock interrupt
frequencies for pcaudio cause Xintr0() to be reentered (or clock
ticks to be lost when Xintr0() should have been reentered but
wasn't), and we lose little by delaying the call to softclock().

Move declarations related to the clock driver to clock.h.

Move declarations related to the npx driver to npx.h.

Clean up the remaining declarations.
1994-11-05 22:44:34 +00:00
Jordan K. Hubbard
fb59d6ab65 __386BSD__ -> __FreeBSD__
I know that many of these entries are bogus and need to be revisited,
but let's get the tree working again for now and then do a pass through
looking at all the __FreeBSD__ entries, shall we?
1994-11-04 02:14:13 +00:00
Bruce Evans
0bf495e561 Fix the test for the code segment being the usual one. Unusual code
segments can still cause panics.  Their pc is converted to 0 and 0
is only checked for in one place before use.
1994-10-19 21:13:51 +00:00
Andrey A. Chernov
37b28ca421 Remove CPU_COLORDISP, GIO_COLOR now exists 1994-10-18 03:42:18 +00:00
Andrey A. Chernov
9d40918f0f CPU_COLORDISP sysctl added for console display type 1994-10-15 21:18:11 +00:00
Poul-Henning Kamp
a12dee4de7 Cosmetics. Added a prototype. 1994-10-10 01:06:48 +00:00
Poul-Henning Kamp
50a1a05445 Added prototypes. 1994-10-08 22:21:34 +00:00
Andrey A. Chernov
f80d8a2e88 CPU_DISRTCSET added to disable resettodr(), needed in adjkerntz -i,
per Bruce suggestion
1994-10-04 18:25:51 +00:00
Poul-Henning Kamp
45a0b89468 Avoid ddb getting a panic if the code-segment isn't the usual one... 1994-10-02 19:36:30 +00:00
Poul-Henning Kamp
abd358cd49 apm_bios.h: removed the equiv-stuff. Not needed now that the kernel module
works correctly.

clock.h & reg.h: prototypes.
1994-10-02 17:31:29 +00:00
David Greenman
22414e535a Laptop Advanced Power Management support by HOSOKAWA Tatsumi.
Submitted by:	HOSOKAWA Tatsumi
1994-10-01 02:56:21 +00:00
David Greenman
7bfaa9cdaf Inlined ins/outs functions.
Obtained from:	NetBSD
1994-09-25 21:31:55 +00:00
David Greenman
d5c97aea74 Undo last change: the ins/outs functions DO NOT return a pointer! 1994-09-25 20:03:41 +00:00
Poul-Henning Kamp
bb56ec4a05 While in the real world, I had a bad case of being swapped out for a lot of
cycles.  While waiting there I added a lot of the extra ()'s I have, (I have
never used LISP to any extent).  So I compiled the kernel with -Wall and
shut up a lot of "suggest you add ()'s", removed a bunch of unused var's
and added a couple of declarations here and there.  Having a lap-top is
highly recommended.  My kernel still runs, yell at me if you kernel breaks.
1994-09-25 19:34:02 +00:00