much higher filesystem I/O performance, and much better paging performance. It
represents the culmination of over 6 months of R&D.
The majority of the merged VM/cache work is by John Dyson.
The following highlights the most significant changes. Additionally, there are
(mostly minor) changes to the various filesystem modules (nfs, msdosfs, etc) to
support the new VM/buffer scheme.
vfs_bio.c:
Significant rewrite of most of vfs_bio to support the merged VM buffer cache
scheme. The scheme is almost fully compatible with the old filesystem
interface. Significant improvement in the number of opportunities for write
clustering.
vfs_cluster.c, vfs_subr.c
Upgrade and performance enhancements in vfs layer code to support merged
VM/buffer cache. Fixup of vfs_cluster to eliminate the bogus pagemove stuff.
vm_object.c:
Yet more improvements in the collapse code. Elimination of some windows that
can cause list corruption.
vm_pageout.c:
Fixed it, it really works better now. Somehow in 2.0, some "enhancements"
broke the code. This code has been reworked from the ground-up.
vm_fault.c, vm_page.c, pmap.c, vm_object.c
Support for small-block filesystems with merged VM/buffer cache scheme.
pmap.c vm_map.c
Dynamic kernel VM size, now we dont have to pre-allocate excessive numbers of
kernel PTs.
vm_glue.c
Much simpler and more effective swapping code. No more gratuitous swapping.
proc.h
Fixed the problem that the p_lock flag was not being cleared on a fork.
swap_pager.c, vnode_pager.c
Removal of old vfs_bio cruft to support the past pseudo-coherency. Now the
code doesn't need it anymore.
machdep.c
Changes to better support the parameter values for the merged VM/buffer cache
scheme.
machdep.c, kern_exec.c, vm_glue.c
Implemented a seperate submap for temporary exec string space and another one
to contain process upages. This eliminates all map fragmentation problems
that previously existed.
ffs_inode.c, ufs_inode.c, ufs_readwrite.c
Changes for merged VM/buffer cache. Add "bypass" support for sneaking in on
busy buffers.
Submitted by: John Dyson and David Greenman
Keep track of interrupt nesting level. It is normally 0
for syscalls and traps, but is fudged to 1 for their exit
processing in case they metamorphose into an interrupt
handler.
i386/genassym.c;
Remove support for the obsolete pcb_iml and pcb_cmap2.
Add support for pcb_inl.
i386/swtch.s:
Fudge the interrupt nesting level across context switches and in
the idle loop so that the work for preemptive context switches
gets counted as interrupt time, the work for voluntary context
switches gets counted mostly as system time (the part when
curproc == 0 gets counted as interrupt time), and only truly idle
time gets counted as idle time.
Remove obsolete support (commented out and otherwise) for pcb_iml.
Load curpcb just before curproc instead of just after so that
curpcb is always valid if curproc is. A few more changes like
this may fix tracing through context switches.
Remove obsolete function swtch_to_inactive().
include/cpu.h:
Use the new interrupt nesting level variable to implement a
non-fake CLF_INTR() so that accounting for the interrupt state
works.
You can use top, iostat or (best) an up to date systat to see
interrupt overheads. I see the expected huge interrupt overheads
for ISA devices (on a 486DX/33, about 55% for an IDE drive
transferring 1250K/sec and the same for a WD8013EBT network card
transferring 1100K/sec). The huge interrupt overheads for serial
devices are unfortunately normally invisible.
include/pcb.h:
Remove the obsolete pcb_iml and pcb_cmap2. Replace them by
padding to preserve binary compatibility.
Use part of the new padding for pcb_inl.
isa/icu.s:
isa/vector.s:
Keep track of interrupt nesting level.
Alphabetize.
Write all i/o functions in sleep so that we don't use anything from
NetBSD.
Restore the correct type of u_int for ports. This saves a whole cycle
per i/o on 486's.
Change `inline' back to __inline to avoid compiler warnings with
-Wreally-all.
Don't implement bdb() unless BDE_DEBUGGER is defined. Declare bdb_exists
outside the function to avoid hundreds of compiler warnings.
Let the compiler pick the register in asms if possible.
Implement ffs() using inline asm(). gcc provides a slightly different
one. It was broken in gcc-2.4.5 but works now. Declaring a correct
version inline ensures getting a correct version. FreeBSD-1.1.5 has
an slow inline version but FreeBSD-2.0 has a library version (which
probably never gets used).
Do inb() and outb() without using %edx for constant ports below 0x100.
Remove casts to the same type in queue functions.
Declare prototypes for everything implemented i386/*.s and also for
everything that is normally implemented as an inline here (I don't
like the current complete dependency on gcc). Ifdef out the prototypes
that are declared elsewhere. THere should be a separate header to
declare things implemented in i386/*.s, but then it would be harder
to override declarations with inlines.
${UII}
Partly support BDE_DEBUGGER. Still broken by conflict with APM. Does
nothing if BDE_DEBUGGER is not defined.
Clean up prototypes and data declarations. Declare most of the segment
functions that are implemented in support.s. Make data private in
machdep.c if possible.
Parenthesize expressions in macros properly!
${Uniformize idempotency ifdef}.
to avoid compiler warnings.
Clean up prototypes: alphabetize; don't use redundant `extern' or
meaningless `extern inline'.
Uniformize idempotency ifdef.
for it is incomplete and buggy. There is no problem unless Xintr0()
is reentered or should be reentered, but high clock interrupt
frequencies for pcaudio cause Xintr0() to be reentered (or clock
ticks to be lost when Xintr0() should have been reentered but
wasn't), and we lose little by delaying the call to softclock().
Move declarations related to the clock driver to clock.h.
Move declarations related to the npx driver to npx.h.
Clean up the remaining declarations.
I know that many of these entries are bogus and need to be revisited,
but let's get the tree working again for now and then do a pass through
looking at all the __FreeBSD__ entries, shall we?
cycles. While waiting there I added a lot of the extra ()'s I have, (I have
never used LISP to any extent). So I compiled the kernel with -Wall and
shut up a lot of "suggest you add ()'s", removed a bunch of unused var's
and added a couple of declarations here and there. Having a lap-top is
highly recommended. My kernel still runs, yell at me if you kernel breaks.
matter, but similar bogusness in npx.c causes compiling without -O to fail.
Use __volatile in all asms.
Parenthesize macro args.
Change the names of the macros to avoid namespace pollution.
Remove unnecessary "#ifdef __i386__".
Sort #defines.
Add comments.
This code is mostly taken from the 1.1 port (which was in turn taken from
Dave Mills's kern.tar.Z example). A few significant differences:
1) ntp_gettime() is now a MIB variable rather than a system call. A few
fiddles are done in libc to make it behave the same.
2) mono_time does not participate in the PLL adjustments.
3) A new interface has been defined (in <machine/clock.h>) for doing
possibly machine-dependent things around the time of the clock update.
This is used in Pentium kernels to disable interrupts, set `time', and
reset the CPU cycle counter as quickly as possible to avoid jitter in
microtime(). Measurements show an apparent resolution of a bit more than
8.14usec, which is reasonable given system-call overhead.
Removed inb function since it's more correctly in pio.h
Copied write_eflags and read_eflags over from npx.c
(Some changes to the macros suggested by Bruce were not made at this
time since his suggestions probably apply to all the macros and
these inlined/macro definitions need a lot of cleaning up at some
point in the future.)
Reviewed by: Bruce
if KERNEL is not defined. lib/msun/i387/*.S include asmacros.h to
get the definitions of ENTRY(), etc. This is bogus since asmacros.h
is only supposed to give definitions suitable for the kernel. The
current definitions for the kernel almost worked but are missing
the ".type" declarations. This caused the linker to print warnings
about doubtful relocations for almost anything linked to libm[sun].
Uniformize name and use of idempotence identifier.
the Mach/i386 version of the BSD/vax(?) <machine/psl.h>. The Mach
version has slightly better names for many macros but is now out of
date and little used. It was originally used even less (for spelling
PSL_T as EFL_TF in <machine/db_machdep.h>).
negation whenever we access memory between 640k and 1M.
Original code from NetBSD 1.0-BETA. The exact origins are unclear but
Theo de Raadt, Charles, and Michael V. may have contributed to it.
Submitted by: pst
Changed u_int_inb to just inb and deleted define.
The code generated is identical to that generated with the cast so
the problem was obviously fixed at some point after gcc 1.4
Reviewed by:
Submitted by:
hosing syscons. Doesn anyone know anything about this
or can we just delete it now?
/*
* This roundabout method of returning a u_char helps stop gcc-1.40 from
* generating unnecessary movzbl's.
*/
#ifdef disable_for_gcc-2_6_0
#define inb(port) ((u_char) u_int_inb(port))
#endif
static inline u_int
u_int_inb(u_int port)
{
u_char data;
/*
* We use %%dx and not %1 here because i/o is done at %dx and
not at
* %edx, while gcc-2.2.2 generates inferior code (movw instead
of movl)
* if we tell it to load (u_short) port.
*/
__asm __volatile("inb %%dx,%0" : "=a" (data) : "d" (port));
return data;
}
Reviewed by:
Submitted by: