just thinking about it.
Two changes need to be made to allow 'config kernel swap generic' to
work properly without requiring any compile-time flags:
/usr/src/usr.sbin/config/mkswapconf.c: we need to define a dummy stub
for the setconf() function to replace the one in swapgeneric.c that
isn't available in non-generic configurations.
/usr/src/sys/i386/i386/autoconf.c: the -a boot flag causes setroot()
to be skipped and lets setconf() prompt the user for a root device.
If you skip setroot() in a non-generic kernel, you could get severely
hosed. To avoid this, we silently ignore the -a flag if rootdev != NODEV.
(rootdev is always initialized to NODEV in swapgeneric.c, so if
we find that rootdev is something other than NODEV, we know we're
not using a generic configuration.)
briefly over it, and see some serious architectural issues in this stuff.
On the other hand, I doubt that we will have any solution to these issues
before 2.1, so we might as well leave this in.
Most of the stuff is bracketed by #ifdef's so it shouldn't matter too much
in the normal case.
Reviewed by: phk
Submitted by: HOSOKAWA, Tatsumi <hosokawa@mt.cs.keio.ac.jp>
entire kernel.
Unfortunately we didn't send him a copy of the style guide before he did it.
I'm trying to find all the benign and downright sound bits and will commit
them without any other explanation than "YF fix" if they are merely cosmetic.
Reviewed by: phk
Submitted by: yves@dutncp8.tn.tudelft.nl (Yves Fonk)
some comparisons as it is more correct (we want the kernel page tables
included).
Reorganized some of the expressions for efficiency.
Fixed the new pmap_prefault() routine - it would sometimes pick up the
wrong page if the page in the shadow was present but the page in object
was paged out. The routine remains unused and commented out, however.
Explicitly free zero reference count page tables (rather than waiting
for the pagedaemon to do it).
Submitted by: John Dyson
Now it matches the man page and also the only other commercial implementation
i have found so far ( Solaris 2.x).
Changed the name from ss_base to ss_sp.
Handles at least Trantor T130 and ProAudioSpectrum adapters.
The pas driver has consequently been removed.
This driver can be configured without without interrupts.
Manpage to follow when PAS16 has been edited in.
Reviewed by: phk
Submitted by: Serge Vakulenko, <vak@cronyx.ru>
(Boot with the -D flag if you want symbols.)
Make it easier to extend `struct bootinfo' without losing either forwards
or backwards compatibility.
ddb_aout.c:
Get the symbol table from wherever the loader put it.
Nuke db_symtab[SYMTAB_SPACE].
boot.c:
Enable loading of symbols. Align them on a page boundary. Add printfs
about the symbol table sizes.
Pass the memory sizes to the kernel.
Fix initialization of `unit' (it got moved out of the loop).
Fix adding the bss size (it got moved inside an ifdef).
Initialize serial port when RB_SERIAL is toggled on.
Fix comments.
Clean up formatting of recently added code.
io.c:
Clean up formatting of recently added code.
netboot/main.c, machdep.c, wd.c:
Change names of bootinfo fields.
LINT:
Nuke SYMTAB_SPACE.
Fix comment about DODUMP.
Makefile.i386:
Nuke use of dbsym.
Exclude gcc symbols from kernel unless compiling with -g.
Remove unused macro.
Fix comments and formatting.
genassym.c:
Generate defines for some new bootinfo fields. Change names of old ones.
locore.s:
Copy only the valid part of the `struct bootinfo' passed by the loader.
Reserve space for symbol table, if any.
machdep.c:
Check the memory sizes passed by the loader, if any. Don't use them yet.
bootinfo.h:
Add a size field so that we can resolve some mismatches between the loader
bootinfo and the kernel boot info. The version number is not so good for
this because of historical botches and because it's harder to maintain.
Add memory size and symbol table fields. Change the names of everything.
Hacks to save a few bytes:
asm.S, boot.c, boot2.S:
Replace `ouraddr' by `(BOOTSEG << 4)'.
boot.c:
Don't statically initialize `loadflags' to 0. Disable the "REDUNDANT"
code that skips the BIOS variables. Eliminate `total'. Combine some
more printfs.
boot.h, disk.c, io.c, table.c:
Move all statically initialzed data to table.c.
io.c:
Don't put the A20 gate bits in a variable.
Moved various pmap 'bit' test/set functions back into real functions; gcc
generates better code at the expense of more of it. (pmap.c)
Fixed a deadlock problem with pv entry allocations (pmap.c)
Added a new, optional function 'pmap_prefault' that does clustered page
table preloading (pmap.c)
Changed the way that page tables are held onto (trap.c).
Submitted by: John Dyson
work (mi_switch() counted the last timeslice again but this didn't affect
the exiting process' rusage because the rusage has already been finalized).
Remove stale comment.
Put in the much shorter and cleaner version for the calibrate_cycle_counter
for the Pentium that Bruce suggested. Tested here on my Pentium and
it works okay.
sigreturn() sometimes failed for ordinary returns from signal handlers.
Failures of ordinary returns "can't happen" and are badly handled.
"Temporary" fix: allow users to corrupt PSL_RF. This is fairly
harmless. A correct fix would involve saving the old %eflags (and
perhaps the old segment registers) where the user can't get at them.
attempted to check for insecure and fatal eflags and segment
selectors, but missed many cases and got the IOPL check back to
front. The other syscalls didn't check at all.
sys_process.c, machdep.c:
Only allow PT_WRITE_U to write to the registers (ordinary and FP).
psl.h, locore.s, machdep.c:
Eliminate PSL_MBZ, PSL_MBO and PSL_USERCLR. We are not supposed
to assume anything about the reserved bits. Use PSL_USERCHANGE
and PSL_KERNEL instead. Rename PSL_USERSET to PSL_USER.
exception.s:
Define a private label for use by doreti when returning to user
mode fails.
machdep.c:
In syscalls, allow changing only the eflags that can be changed on
486's in user mode (no longer attempt to allow benign IOPL changes;
allow changing the nasty PSL_NT; don't allow changing the i586
bits).
Don't attempt to check all the cases involving invalid selectors
and %eip's. Just check for privilege violations and let the invalid
things cause a trap.
procfs_machdep.c:
Call the ptrace register functions to do all the work for reading
and writing ordinary registers and for single stepping.
trap.c:
Ignore traps caused by PSL_NT being set. Previously, users could
cause a fatal trap in user mode by setting PSL_NT and executing an
iret, and a fatal trap in kernel mode by setting PSL_NT and making
a syscall. PSL_NT was cleared too late and not in enough modes to
fix the problem.
Make all traps in user mode (except T_NMI) nonfatal.
Recover from traps caused by attempting to load invalid user
registers in doreti by restarting the traps so that they appear to
occur in user mode.
---
Fix bogons that I noticed while fixing the above:
psl.h:
Fix some comments.
Uniformize idempotency ifdef.
exception.s, machdep.c:
Remove rsvd[0-14]. rsvd0 hasn't been reserved since the 486 came
out. Replace rsvd0 by `align'. rsvd[0-11] used wrong (magic
non-unique) trap numbers. Replace rsvd[1-14] by rsvd.
locore.s:
Enable alignment check flag on 486's and 586's.
machdep.c:
Use a better type for kstack[].
Use TFREGP() to find the registers.
Reformat ptrace functions from SEF to something closer to KNF.
procfs_machdep.c:
The wrong pointer to the registers got fixed as a side effect.
Implement reading and writing of FP registers.
/proc/*/*regs now work (only) for processes that are in memory.
Clean up comments.
trap.c, trap.h:
Remove unused trap types.
unreachable case label in kdb_trap().
Use the correct case labels in kdb_trap() so that normal ddb entry doesn't
print a message.
Change all printf's to db_printf's. Now you can put a breakpoint at printf,
and ddb entry messages don't spam the syslog output.
Cosmetic:
Use ISPL() instead of magic numbers.
Don't compile the unused function kdb_kbd_trap().
Improve some asms.
Print the arg to Debugger().
much higher filesystem I/O performance, and much better paging performance. It
represents the culmination of over 6 months of R&D.
The majority of the merged VM/cache work is by John Dyson.
The following highlights the most significant changes. Additionally, there are
(mostly minor) changes to the various filesystem modules (nfs, msdosfs, etc) to
support the new VM/buffer scheme.
vfs_bio.c:
Significant rewrite of most of vfs_bio to support the merged VM buffer cache
scheme. The scheme is almost fully compatible with the old filesystem
interface. Significant improvement in the number of opportunities for write
clustering.
vfs_cluster.c, vfs_subr.c
Upgrade and performance enhancements in vfs layer code to support merged
VM/buffer cache. Fixup of vfs_cluster to eliminate the bogus pagemove stuff.
vm_object.c:
Yet more improvements in the collapse code. Elimination of some windows that
can cause list corruption.
vm_pageout.c:
Fixed it, it really works better now. Somehow in 2.0, some "enhancements"
broke the code. This code has been reworked from the ground-up.
vm_fault.c, vm_page.c, pmap.c, vm_object.c
Support for small-block filesystems with merged VM/buffer cache scheme.
pmap.c vm_map.c
Dynamic kernel VM size, now we dont have to pre-allocate excessive numbers of
kernel PTs.
vm_glue.c
Much simpler and more effective swapping code. No more gratuitous swapping.
proc.h
Fixed the problem that the p_lock flag was not being cleared on a fork.
swap_pager.c, vnode_pager.c
Removal of old vfs_bio cruft to support the past pseudo-coherency. Now the
code doesn't need it anymore.
machdep.c
Changes to better support the parameter values for the merged VM/buffer cache
scheme.
machdep.c, kern_exec.c, vm_glue.c
Implemented a seperate submap for temporary exec string space and another one
to contain process upages. This eliminates all map fragmentation problems
that previously existed.
ffs_inode.c, ufs_inode.c, ufs_readwrite.c
Changes for merged VM/buffer cache. Add "bypass" support for sneaking in on
busy buffers.
Submitted by: John Dyson and David Greenman
shifting. Also correct the original code as Garrett noticed it in mail.
Leave the mishandled code in to use it later if future versions of gcc
are correct. The code was part of the calibrate_cyclecounter routine to
get the speed of the pentium chip.
Remove bogus input operands for fnsave(), fnstcw() and fnstsw().
Change all fwait's to fnop's. This might help avoid hardware bugs.
Wait after fninit with an fnop. This should be safer now.
Fix some spelling and formatting errors.
Use natural sizes for control and status words (u_short, promotes to int).
Don't clobber the SWI_CLOCK_MASK bits in npx0_imask when using IRQ13.
Set the devconf state correctly (always busy, if configured). Improve
code for npx_registerdev() a little (gcc can't keep id->id_unit in a
register for some reason). Don't register a nonexistent npx device.
Print a useful message in npxattach() again (delete references to errors
and not the whole message). Don't print "387 emulator" if there is no
emulator in the kernel.
Use %p for pointers in error messages.
Don't clobber the FPU state when there is an FPU exception. Just clear
the exception flags (after saving the flags as before). This allows
debuggers and SIGFPE handlers to look at the full exception state.
SIGFPE handlers should normally return via longjmp(), which restores a
good FPU state (as before). Returning from a SIGFPE handler may leave
the FPU in the wrong state (as before).
Clear the busy latch _after_ clearing the exception flags so that there
is less chance of getting a bogus h/w interrupt for a control operation.
Clear the saved exception status word when the next FPU instruction is
excuted so that it doesn't stick around until the next exception.
Clear the busy latch after fnsave() in npxsave() in case it was set when
npxsave() was called.
- /sys/i386/i386/swapgeneric.c is just plain broke. But fear not, for I
have unbroken it. One thing that swapgeneric.c does is walk through the
list of configured devices searching for a boot device. The only easy
way to accomplish this in 2.0 is to use Garret Wollman's kern_devconf
stuff. *BUT*, the head of the kern_devconf linked list (dc_list) is declared
static in /sys/kern/kern_devconf.c. This means that swapgeneric.c can't
see it at link time. I had to remove the 'static' keyword to get around
this little problem. I hope this doesn't break anything anywhere.
*Furthermore,* there's a small matter of making the call to setconf()
in swapgeneric.c disappear when 'config kernel swap generic' isn't used.
You could change /sbin/config to create a dummy setconf() function in
swapkernel.c, but that seems messy somehow. (It's also someting of an
'it isn't broken, why are you fixing it' situation.) My solution was to
do what the NetBSD people did and put an #ifdef GENERIC around the call
to setconf(). If your kernel is called GENERIC or you define 'options
GENERIC,' then you can use 'config kernel swap generic' and it'll work.
That aside, the upshot is that: a) swapgeneric.c actually works, and
and b) the -a boot flag now works as well. If you boot with -a, as in
"Boot: wd(0,a)/kernel -a" you will be presented with a 'root device?'
prompt after the autoconfig phase, at which point you can specify what
device you want mounted as root. Regrettably, you can't specify an NFS
filesystem. Yet. Three files are affected: /sys/i386/i386/swapgeneric.c,
/sys/i386/i386/autoconf.c and /sys/kern/kern_devconf.c.
Submitted by: wpaul
Move definition of `stat_imask' to clock.c.
clock.c:
Rename `rtcmask' to `stat_imask' and export it. Rename `clkmask' to
`clk_imask' for consistency.
Only calculate TIMER_DIV(hz) once.
Merge debugging and "garbage" code to produce debugging code and format the
output better.
Make writertc() static inline and use it everywhere. Now all accesses to
the clock registers go through rtcin() and writertc().
Move rtc initialization to cpu_initclocks().
Merge enablertclock() with cpu_initclocks() and remove enablertclock().
The extra entry point was just a leftover from 1.1.5.
Fix single-stepping of emulated FPU instructions.
Don't panic if an FPU instruction is attempted but there is no FPU
and no FPU emulator is configured.
Keep track of interrupt nesting level. It is normally 0
for syscalls and traps, but is fudged to 1 for their exit
processing in case they metamorphose into an interrupt
handler.
i386/genassym.c;
Remove support for the obsolete pcb_iml and pcb_cmap2.
Add support for pcb_inl.
i386/swtch.s:
Fudge the interrupt nesting level across context switches and in
the idle loop so that the work for preemptive context switches
gets counted as interrupt time, the work for voluntary context
switches gets counted mostly as system time (the part when
curproc == 0 gets counted as interrupt time), and only truly idle
time gets counted as idle time.
Remove obsolete support (commented out and otherwise) for pcb_iml.
Load curpcb just before curproc instead of just after so that
curpcb is always valid if curproc is. A few more changes like
this may fix tracing through context switches.
Remove obsolete function swtch_to_inactive().
include/cpu.h:
Use the new interrupt nesting level variable to implement a
non-fake CLF_INTR() so that accounting for the interrupt state
works.
You can use top, iostat or (best) an up to date systat to see
interrupt overheads. I see the expected huge interrupt overheads
for ISA devices (on a 486DX/33, about 55% for an IDE drive
transferring 1250K/sec and the same for a WD8013EBT network card
transferring 1100K/sec). The huge interrupt overheads for serial
devices are unfortunately normally invisible.
include/pcb.h:
Remove the obsolete pcb_iml and pcb_cmap2. Replace them by
padding to preserve binary compatibility.
Use part of the new padding for pcb_inl.
isa/icu.s:
isa/vector.s:
Keep track of interrupt nesting level.
of config so YOU MUST RECOMPILE CONFIG. Modifying config was the cleanest
solution to integrating this driver into the tree which will become more
obvious in the next commit.
was supposed to have already been made, but got botched somewhere.
Don't clobber the last page of memory (where the message buffer is). Some
BIOS don't gratuitously wipe it out on reboot.
Alphabetize.
Write all i/o functions in sleep so that we don't use anything from
NetBSD.
Restore the correct type of u_int for ports. This saves a whole cycle
per i/o on 486's.
Change `inline' back to __inline to avoid compiler warnings with
-Wreally-all.
Don't implement bdb() unless BDE_DEBUGGER is defined. Declare bdb_exists
outside the function to avoid hundreds of compiler warnings.
Let the compiler pick the register in asms if possible.
Implement ffs() using inline asm(). gcc provides a slightly different
one. It was broken in gcc-2.4.5 but works now. Declaring a correct
version inline ensures getting a correct version. FreeBSD-1.1.5 has
an slow inline version but FreeBSD-2.0 has a library version (which
probably never gets used).
Do inb() and outb() without using %edx for constant ports below 0x100.
Remove casts to the same type in queue functions.
Declare prototypes for everything implemented i386/*.s and also for
everything that is normally implemented as an inline here (I don't
like the current complete dependency on gcc). Ifdef out the prototypes
that are declared elsewhere. THere should be a separate header to
declare things implemented in i386/*.s, but then it would be harder
to override declarations with inlines.
${UII}
with the current default exception (un)mask. There should be no such
processes unless you change the mask. Someday the mask should be
changed to the IEEE default of everything masked. The npx state
gets saved so that it can be checked and this may have the side effect
of fixing a bug that was reported for 1.1.5. (npx exceptions may
sometimes leak across exits and clobber another process. I can't see
how this can happen.)
Get some missing/wrong declarations from headers now that the headers
have them.
the following.
Move declarations to and from <machine/segments.h>. Make segment stuff
static if possible.
Remove unused (although initialized) global variables _default_ldt,
currentldt, _gsel_tss (rename the latter to the auto variable
gtel_tss).
Use "correct" and consistent types for interrupt handlers.
Remove a mailing address from the code.
Fix type mismatches found by adding prototypes.
Partly support BDE_DEBUGGER. Still broken by conflict with APM. Does
nothing if BDE_DEBUGGER is not defined.
Clean up prototypes and data declarations. Declare most of the segment
functions that are implemented in support.s. Make data private in
machdep.c if possible.
Parenthesize expressions in macros properly!
${Uniformize idempotency ifdef}.
to avoid compiler warnings.
Clean up prototypes: alphabetize; don't use redundant `extern' or
meaningless `extern inline'.
Uniformize idempotency ifdef.
Somebody should make a mib variable for it.
Just now it is pointless to dump the kernel, since we have nothing which
can read the dump.
Furthermore is should never be the default to dump.
options DODUMP
will enable dumps.
for all reasonable HZ's. HZ > 1000 doesn't work because of sloppy
conversions in hzto() (division by (tick / 1000) == 0). This was
fixed in 1.1.5.
Eliminate some extern declarations by including the appropriate header
files that now contain appropriate declarations.
doesn't have to calculate it every call.
Rename `timer0_prescale' to `timer0_prescaler_count' and maintain it
correctly. Previously we lost a few 8253 cycles for every "prescaled"
clock interrupt, and the lossage grows rapidly at 16 KHz. Now we
only lose a few cycles for every standard clock interrupt.
Rename `*_divisor' to `*_max_count'.
Do the calculation of TIMER_DIV(rate) only once instead of 3 times each
time the rate is changed.
Don't allow preposterously large interrupt rates. Bug fixes elsewhere
should allow the system to survive rates that saturate the system, however.
Clean up declarations.
Include <machine/clock.h> to check our own declarations.
for it is incomplete and buggy. There is no problem unless Xintr0()
is reentered or should be reentered, but high clock interrupt
frequencies for pcaudio cause Xintr0() to be reentered (or clock
ticks to be lost when Xintr0() should have been reentered but
wasn't), and we lose little by delaying the call to softclock().
Move declarations related to the clock driver to clock.h.
Move declarations related to the npx driver to npx.h.
Clean up the remaining declarations.
I know that many of these entries are bogus and need to be revisited,
but let's get the tree working again for now and then do a pass through
looking at all the __FreeBSD__ entries, shall we?
Cosmetic.
Return from trap() if trap_fatal() returns. trap_fatal() isn't
fatal if you have ddb. Returning from trap() is usually the right
thing to do and much better than falling through.
Build a dummy frame at the top of tmpstk to help debuggers trace the stack
when the system is idle.
swtch.s: idle():
Initialize the frame pointer so that debuggers don't try to trace a bogus
stack.
Load the frame pointer, load the stack pointer and switch out the old
stack in the unique order that never leaves one of the pointers pointers
invalid so that debuggers can trace idle(). Disabling interrupts
provides sufficient validity for normal operation, but debuggers use
(trace) traps.
Changed the fifth parameter to register_intr() from u_int mask into
u_int *maskptr in preparation for new features (shared interrupts and
removable devices, eg. for PCMCIA).
Changed the fifth parameter to register_intr() from u_int mask into
u_int *maskptr in preparation for new features (shared interrupts and
removable devices, eg. for PCMCIA).
of memory to work without running out of kernel VM (and increasing it to
even more than it is now (96MB) is out of the question. Changed bufpages
calculation to allocation a little less bufer cache (16% of mem-2MB instead
of 20%); this is simply a better figure for most systems.
text. Fixed rounding bug that caused the last page of kernel text to be
read/write instead of read-only. This is important now that tmpstk can
crash into it. Removed +4 bias of tmpstk because it screws up ddb's
ability to traceback correctly.
and all SCSI devices (except that it's not done quite the way I want). New
information added includes:
- A text description of the device
- A ``state''---unknown, unconfigured, idle, or busy
- A generic parent device (with support in the m.i. code)
- An interrupt mask type field (which will hopefully go away) so that
. ``doconfig'' can be written
This requires a new version of the `lsdev' program as well (next commit).