(1) deleted #if 0
pc98/pc98/mse.c
(2) hold per-unit I/O ports in ed_softc
pc98/pc98/if_ed.c
pc98/pc98/if_ed98.h
(3) merge more files by segregating changes into headers.
new file (moved from pc98/pc98):
i386/isa/aic_98.h
deleted:
well, it's already in the commit message so I won't repeat the
long list here ;)
Submitted by: The FreeBSD(98) Development Team
AHC_FORCE_PIO - This forces the driver to use PIO even on systems that
say they have memory mapped the controller's registers. This
seems to fix Ken Lam's problems. I've also placed this option
in the GENERIC kernel file so that we are guaranteed to install
even on these flakey machines.
AHC_SHARE_SCBS - This option attempts to share the external SCB SRAM on
the 398X controllers allowing a total of 255 non-paged SCBs.
This doesn't work quite yet, so this option is mostly here to
help 398X owners to experiment and give me feedback until this
works properly.
I decided to do this for every hardclock() call instead of lazily
in microtime(). The lazy method is simpler but has more overhead
if microtime() is called a lot.
CPU_THISTICKLEN() is now a no-op and should probably go away.
Previously it did nothing directly but had the side effect of
setting i586_last_tick for CPU_CLOCKUPDATE() and i586_avg_tick for
debugging. CPU_CLOCKUPDATE() now uses a better method and
i586_avg_tick is too much trouble to maintain.
Reduced nesting of #includes in the usual case.
Increased nesting of #includes when CLOCK_HAIR is defined. This
is a kludge to get typedefs for inline functions only when the
inline functions are used. Normally only kern_clock.c defines
this. kern_clock.c can't include the i386 headers directly.
Removed unused LOCORE support.
- use a more accurate and more efficient method of compensating for
overheads. The old method counted too much time against leaf
functions.
- normally use the Pentium timestamp counter if available.
On Pentiums, the times are now accurate to within a couple of cpu
clock cycles per function call in the (unlikely) event that there
are no cache misses in or caused by the profiling code.
- optionally use an arbitrary Pentium event counter if available.
- optionally regress to using the i8254 counter.
- scaled the i8254 counter by a factor of 128. Now the i8254 counters
overflow slightly faster than the TSC counters for a 150MHz Pentium :-)
(after about 16 seconds). This is to avoid fractional overheads.
files.i386:
perfmon.c temporarily has to be classified as a profiling-routine
because a couple of functions in it may be called from profiling code.
options.i386:
- I586_CTR_GUPROF is currently unused (oops).
- I586_PMC_GUPROF should be something like 0x70000 to enable (but not
use unless prof_machdep.c is changed) support for Pentium event
counters. 7 is a control mode and the counter number 0 is somewhere
in the 0000 bits (see perfmon.h for the encoding).
profile.h:
- added declarations.
- cleaned up separation of user mode declarations.
prof_machdep.c:
Mostly clock-select changes. The default clock can be changed by
editing kmem. There should be a sysctl for this.
subr_prof.c:
- added copyright.
- calibrate overheads for the new method.
- documented new method.
- fixed races and machine dependencies in start/stop code.
mcount.c:
Use the new overhead compensation method.
gmon.h:
- changed GPROF4 counter type from unsigned to int. Oops, this should
be machine-dependent and/or int32_t.
- reorganized overhead counters.
Submitted by: Pentium event counter changes mostly by wollman
- use CROSSJUMP() and CROSSJUMP_LABEL() for conditional jumps from idle()
into cpu_switch() and vice versa.
- moved badsw code to after cpu_switch().
Cosmetic changes:
- moved sw0 string to be immediately after its caller (badsw).
- removed unused #include.
previous snap. Specifically, kern_exit and kern_exec now make a
call into the pmap module to do a very fast removal of pages from the
address space. Additionally, the pmap module now updates the PG_MAPPED
and PG_WRITABLE flags. This is an optional optimization, but helpful
on the X86.
- fixed a sloppy common-style declaration.
- removed an unused macro.
- moved once-used macros to the one file where they are used.
- removed unused forward struct declarations.
- removed __pure.
- declared inline functions as inline in their prototype as well
as in their definition (gcc unfortunately allows the prototype
to be inconsistent).
- staticized.
<net/if_arp.h> and fixed the things that depended on it. The nested
include just allowed unportable programs to compile and made my
simple #include checking program report that networking code doesn't
need to include <sys/socket.h>.
I586_OPTIMIZED_BCOPY is configured.
Similarly for bzero/I586_OPTIMIZED_BZERO.
Fake 586's had better have a hardware FPU with non-broken exception
handling (we mask exceptions, but broken exception handling may trap
on the instructions that do the masking). I guess this means that
the routines won't work on most 386's or FPUless 486's even when they
have a h/w FPU.
These are based on using the FPU to do 64-bit stores. They also
use i586-optimized instruction ordering, i586-optimized cache
management and a couple of other tricks. They should work on any
i*86 with a h/w FPU, but are slower on at least i386's and i486's.
They come close to saturating the memory bus on i586's. bzero()
can maintain a 3-3-3-3 burst cycle to 66 MHz non-EDO main memory
on a P133 (but is too slow to keep up with a 2-2-2-2 burst cycle
for EDO - someone with EDO should fix this). bcopy() is several
cycles short of keeping up with a 3-3-3-3 cycle for writing. For
a P133 writing to 66 MHz main memory, it just manages an N-3-3-3,
3-3-3-3 pair of burst cycles, where N is typically 6.
The new routines are not used by default. They are always configured
and can be enabled at runtime using a debugger or an lkm to change
their function pointer, or at compile time using new options (see
another log message).
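A minimal sketch of the function-pointer arrangement described above. All
names here (bzero_vector, select_wide_bzero, wide_bzero) are hypothetical,
and the wide routine uses plain 64-bit integer stores as a portable
stand-in for the FPU stores; it is not the kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Baseline routine: one byte per store. */
static void generic_bzero(void *buf, size_t len)
{
    unsigned char *p = buf;

    while (len-- > 0)
        *p++ = 0;
}

/*
 * Stand-in for the wide-store routine: 64 bits per store.  memcpy() of a
 * fixed 8-byte object keeps the sketch free of alignment assumptions.
 */
static void wide_bzero(void *buf, size_t len)
{
    unsigned char *p = buf;
    const uint64_t zero = 0;

    for (; len >= sizeof(zero); len -= sizeof(zero), p += sizeof(zero))
        memcpy(p, &zero, sizeof(zero));
    while (len-- > 0)
        *p++ = 0;
}

/* Hypothetical vector; callers always go through it. */
static void (*bzero_vector)(void *, size_t) = generic_bzero;

/* Hypothetical switch, standing in for the debugger/lkm/option paths. */
static void select_wide_bzero(void)
{
    bzero_vector = wide_bzero;
}

/* Helper for experiments: returns 1 if the first len bytes are all zero. */
static int all_zero(const unsigned char *p, size_t len)
{
    assert(p != NULL);
    while (len-- > 0)
        if (*p++ != 0)
            return 0;
    return 1;
}
```

Because every caller goes through the pointer, the default stays untouched
until something deliberately stores a different routine into it.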
Removed old, dead i586_bzero() and i686_bzero(). Read-before-write is
usually bad for i586's. It doubles the memory traffic unless the data
is already cached, and data is (or should be) very rarely cached for
large bzero()s (the system should prefer uncached pages for cleaning),
and the amount of data handled by small bzero()s is relatively small
in the kernel.
Improved comments about overlapping copies.
Removed unused #include.
First, change sysinstall and the Makefile rules to not build the kernel
nlist directly into sysinstall now. Instead, spit it out as an ascii
file in /stand and parse it from sysinstall later. This solves the chicken-n-
egg problem of building sysinstall into the fsimage before BOOTMFS is built
and can have its symbols extracted. Now we generate the symbol file in
release.8.
Second, add Poul-Henning's USERCONFIG_BOOT changes. These have two
effects:
1. Userconfig is always entered, rather than only after a -c
(don't scream yet, it's not as bad as it sounds).
2. Userconfig reads a message string which can optionally be
written just past the boot blocks. This string "preloads"
the userconfig input buffer and is parsed as user input.
If the first command is not "USERCONFIG", userconfig will
treat this as an implied "quit" (which is why you don't need
to scream - you never even know you went through userconfig
and back out again if you don't specifically ask for it),
otherwise it will read and execute the following commands
until a "quit" is seen or the end is reached, in which case
the normal userconfig command prompt will then be presented.
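The preload rules above can be sketched roughly as follows. This is an
illustration only: scan_preload() and the PRELOAD_* names are made up
here, the input is assumed to be newline-separated, and commands are
counted rather than executed.

```c
#include <assert.h>
#include <string.h>

enum preload_result {
    PRELOAD_SKIP,    /* first line was not "USERCONFIG": implied quit */
    PRELOAD_QUIT,    /* an explicit "quit" ended the script */
    PRELOAD_PROMPT   /* script ran out: fall through to the prompt */
};

/*
 * Walk a newline-separated preload string.  *nrun receives the number of
 * commands that would have been executed (the "USERCONFIG" marker and the
 * terminating "quit" are not counted).
 */
static enum preload_result scan_preload(const char *text, int *nrun)
{
    const char *line = text;
    int first = 1;

    assert(text != NULL && nrun != NULL);
    *nrun = 0;
    while (*line != '\0') {
        size_t len = strcspn(line, "\n");

        if (first) {
            if (len != 10 || strncmp(line, "USERCONFIG", 10) != 0)
                return PRELOAD_SKIP;
            first = 0;
        } else if (len == 4 && strncmp(line, "quit", 4) == 0) {
            return PRELOAD_QUIT;
        } else if (len > 0) {
            (*nrun)++;    /* a real implementation would execute it here */
        }
        line += len;
        if (*line == '\n')
            line++;
    }
    return first ? PRELOAD_SKIP : PRELOAD_PROMPT;
}
```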
How to create your own startup sequences, using any boot.flp image
from the next snap forward (not yet, but soon):
% dd of=/dev/rfd0 seek=1 bs=512 count=1 conv=sync <<WAKKA_WAKKA_DOO
USERCONFIG
irq ed0 10
iomem ed0 0xcc000
disable ed1
quit
WAKKA_WAKKA_DOO
Third, add an intro screen to UserConfig so that users aren't just thrown
into this strange screen if userconfig is auto-launched. The default
boot.flp startup sequence is now, in fact, this:
USERCONFIG
intro
visual
(Since visual never returns, we don't need a following "quit").
Submitted by: phk & jkh
dependent operation, and not really a correct name. invltlb and invlpg
are more descriptive, and in the case of invlpg, a real opcode.
Additionally, fix the tlb management code for 386 machines.
lcall 7,0 (ie: ldt slot 0) and lcall 0x87,0 (ldt slot 16, it's shifted
three bits to the left). I was fiddling with this so long ago, I don't
recall the specifics.
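For reference, the "shifted three bits" remark follows from the x86
segment selector layout: the low two bits are the requested privilege
level, bit 2 selects the LDT, and the remaining bits are the table index.
A small decoding sketch (the struct and function names are made up for
illustration):

```c
#include <assert.h>
#include <stdint.h>

struct selector_fields {
    unsigned index;   /* descriptor table slot: bits 15..3 */
    unsigned ldt;     /* table indicator, bit 2: 1 = LDT, 0 = GDT */
    unsigned rpl;     /* requested privilege level: bits 1..0 */
};

/* Split a 16-bit selector into its three fields. */
static struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;

    f.index = sel >> 3;
    f.ldt = (sel >> 2) & 1;
    f.rpl = sel & 3;
    assert(f.index < 8192);   /* 13 bits of index */
    return f;
}
```

With this layout 0x7 decodes to LDT slot 0 and 0x87 to LDT slot 16, both
with RPL 3.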
with this quite a while ago when somebody reported a BSD/OS 2.1 binary
that wouldn't run. I'm pretty sure they tried it and I'm pretty sure
they mentioned to me that the patch worked.
comparisons in the inb() and outb() macros. I decided that int args
are OK here. Any type that can hold a u_int16_t without overflow
is correct, and 32-bit types are optimal.
Introduced a few tens of warnings (100 in LINT) for use of pessimized
(short) types for the port arg. Only a few drivers are affected by
this. u_short pessimizations aren't detected.
Added `__extension__' before the statement-expression in inb() so
that it can be compiled without warnings by gcc -pedantic.
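A sketch of the `__extension__' idiom, assuming a GNU C compatible
compiler. CLAMP_PORT() is a hypothetical macro invented for this example
and is not the real inb(); it only shows where the keyword goes.

```c
#include <assert.h>

/*
 * Hypothetical macro (not the real inb()): a GNU C statement-expression
 * that evaluates its argument once.  The leading __extension__ tells gcc
 * not to issue -pedantic diagnostics for the construct that follows.
 */
#define CLAMP_PORT(port) __extension__ ({          \
    unsigned int _p = (unsigned int)(port);        \
    _p > 0xffffU ? 0xffffU : _p;                   \
})

/* Wrapper so the macro can be exercised like an ordinary function. */
static unsigned int clamp_port(unsigned int port)
{
    return CLAMP_PORT(port);
}
```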
the prototype.
Put the jump table for i486_bzero() in the data section. This
speeds up i486_bzero() a little on Pentiums without significantly
affecting its speed on 486's.
Don't waste time falling through 14 nop's to return from do1 in
i486_bzero().
Use fastmove() for counts >= 1024 (was > 1024). Cosmetic.
Fixed profiling of fastmove().
Restored meaningful labels from the pre-1.1 version in fastmove().
Local labels are evil.
Fixed (high resolution non-) profiling of __bb_init_func().
I maintain that it saves more power to simply "hlt" the CPU than to
spend tons of time trying to tell the APM bios to do the same.
In particular if you do it 100 times a second...
instead of 0 if there is no input.
syscons.c:
Added missing spl locking in sccncheckc(). Return the same value as
sccngetc() would. It is wrong for sccngetc() to return non-ASCII, but
stripping the non-ASCII bits doesn't help.
(1) Add PC98 support to apm_bios.h and ns16550.h, remove pc98/pc98/ic
(2) Move PC98 specific code out of cpufunc.h (to pc98.h)
(3) Let the boot subtrees look more alike
Submitted by: The FreeBSD(98) Development Team
<freebsd98-hackers@jp.freebsd.org>
modified. Pages that are removed by the pageout daemon were
the worst affected. Additionally, numerous minor cleanups,
including better handling of busy page table pages. This
commit fixes the worst of the pmap problems recently introduced.
biosextmem > 65536, but biosextmem is a 16-bit quantity so it is
guaranteed to be < 65536. Related cruft for biosbasemem was
mostly cleaned up in rev.1.26.
The default level works with minimal overhead, but one can also enable
full, efficient use of a 512K cache. (Parameters can be generated
to support arbitrary cache sizes also.)
returned by the RTC, use the bootblock supplied value. Also, map the
'stolen by BIOS' memory in the same manner as the ISA-hole memory, since
it is really an extension of the BIOS. This is necessary for 32-bit
BIOS functions such as APM support on laptops, and the loss of memory
for non-necessary functions seems to be at most 4k.
Reviewed by: phk
Obtained from: email conversation with jtk@atria.com
nearest .01 MHz rather than simply truncating it downwards.
This hack makes this 89.999928 MHz clock correctly round to the closer
90.00-MHz rather than 89.99-MHz:
> i586 clock: 89999928 Hz, i8254 clock: 1193152 Hz
> CPU: Pentium (90.00-MHz 586-class CPU)
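The rounding can be illustrated with a few lines of integer arithmetic.
This is a sketch, not the code from the commit; the helper names are
invented and the +5000 bias (half of one .01 MHz step) is one way to get
round-to-nearest.

```c
#include <assert.h>
#include <stdio.h>

/* Old behaviour: drop everything below .01 MHz (10000 Hz). */
static unsigned long centi_mhz_truncated(unsigned long hz)
{
    return hz / 10000;
}

/* New behaviour: add half a step first, so the division rounds to nearest. */
static unsigned long centi_mhz_rounded(unsigned long hz)
{
    return (hz + 5000) / 10000;
}

/* Render hundredths of a MHz as "NN.NN"; returns snprintf()'s result. */
static int format_mhz(char *buf, size_t len, unsigned long centi)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%lu.%02lu", centi / 100, centi % 100);
}
```

For 89999928 Hz the truncating version yields 8999 hundredths (89.99) and
the rounding version 9000 (90.00).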
Changed i586_ctr_bias from long long to u_int. Only the low 32 bits
are used now that microtime uses a multiplication to do the scaling.
Previously the high 32 bits had to match those of rdtsc() to prevent
overflow traps and invalid timeval adjustments.
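Why 32 bits are enough can be seen in a small sketch: unsigned
subtraction wraps modulo 2^32, so the difference stays correct across a
counter wrap as long as less than 2^32 cycles have elapsed. The function
name, the multiplier format and the shift of 32 are assumptions made for
this example, not the kernel's actual values.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Microseconds elapsed since the bias was recorded.  ctr_low is the low
 * 32 bits of the cycle counter; mult is a hypothetical fixed-point factor
 * equal to 2^32 / (cycles per microsecond).
 */
static uint32_t ticks_to_usec(uint32_t ctr_low, uint32_t bias, uint32_t mult)
{
    uint32_t delta = ctr_low - bias;    /* wraps modulo 2^32 */

    assert(mult != 0);
    return (uint32_t)(((uint64_t)delta * mult) >> 32);
}
```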
early enough when the dump device is specified in the config file.
Removed stale comment about configuration root and swap devices.
Don't bother clearing dumplo when dumpdev is set to NODEV. Everything
is controlled by dumpdev.
Fixed the kern.dumpdev sysctl. Writes were handled bogusly.
problem with the 'shell scripts' was found, but there was a 'strange'
problem with a 486 laptop whose cause we could not find. This commit
backs the code out to its 25-jul state; the changes will be re-entered
after the snapshot in smaller (more easily tested) chunks.
performance issues.
1) The pmap module has had too many inlines, and so the
object file is simply bigger than it needs to be.
Some common code is also merged into subroutines.
2) Removal of some *evil* PHYS_TO_VM_PAGE macro calls.
Unfortunately, a few have needed to be added also.
The removal caused the need for more vm_page_lookups.
I added lookup hints to minimize the need for the
page table lookup operations.
3) Removal of some bogus performance improvements, that
mostly made the code more complex (tracking individual
page table page updates unnecessarily). Those improvements
actually hurt perf on 386 processors (not that people who
worry about perf use 386 processors anymore :-)).
4) Changed pv queue manipulations/structures to be TAILQ's.
5) The pv queue code has had some performance problems since
day one. Some significant scalability issues are resolved
by threading the pv entries from the pmap AND the physical
address instead of just the physical address. This makes
certain pmap operations run much faster. This does
not affect most micro-benchmarks, but should help loaded system
performance *significantly*. DG helped and came up with most
of the solution for this one.
6) Most if not all pmap bit operations follow the pattern:
pmap_test_bit();
pmap_clear_bit();
That made for twice the necessary pv list traversal. The
pmap interface now supports only pmap_tc_bit type operations:
pmap_[test/clear]_modified, pmap_[test/clear]_referenced.
Additionally, the modified routine now takes a vm_page_t arg
instead of a phys address. This eliminates a PHYS_TO_VM_PAGE
operation.
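The saving in item 6 comes from folding the test and the clear into one
walk of the list. A much simplified sketch, with a hypothetical
pv_sketch structure and flag names standing in for the real pv entries
and page table bits:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much simplified stand-in for a pv list entry. */
struct pv_sketch {
    unsigned flags;
    struct pv_sketch *next;
};

#define SK_MODIFIED   0x1u
#define SK_REFERENCED 0x2u

/* One pass: report whether any entry had the bit set, clearing it as we go. */
static int test_and_clear_bit(struct pv_sketch *head, unsigned bit)
{
    struct pv_sketch *pv;
    int was_set = 0;

    assert(bit != 0);
    for (pv = head; pv != NULL; pv = pv->next) {
        if (pv->flags & bit) {
            was_set = 1;
            pv->flags &= ~bit;
        }
    }
    return was_set;
}
```

A separate test followed by a separate clear would visit every entry
twice; here each entry is touched once.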
7) Several rewrites of routines that contain redundant code to
use common routines, so that there is a greater likelihood of
keeping the cache footprint smaller.
make it more intelligible, improve the partially bogus locking, and
allow for a ``quick re-acquisition'' from a pending release of timer 0
that happened ``recently'', so it was not processed yet by clkintr().
This latter modification finally makes it possible to play XBoing over
pcaudio without losing sounds or getting complaints. ;-) (XBoing
opens/writes/closes the sound device all day long.)
Correct locking for sysbeep().
Extensively (:-) reviewed by: bde
Finished the constant poisoning that was begun in rev.1.14. Consts
aren't very poisonous (or useful) unless -Wcast-qual is in CFLAGS,
and it isn't in the default CFLAGS.
type identification code out of machdep.c and into a new file of its
own. Hopefully other grot can be moved out of machdep.c as well
(by other people) into more descriptively-named files.
Testing with the high frequency of 20000 Hz (to find problems) only found
the problem that this frequency is too high for slow i386's.
Disable interrupts while setting the timer frequency. This was unnecessary
before rev.1.57 and forgotten in rev.1.57. The critical (i8254) interrupts
are disabled in another way at boot time but not in the sysctl to change
the frequency.
Fixed profiling of system times. It was pre-4.4Lite and didn't support
statclocks. System times were too small by a factor of 8.
Handle deferred profiling ticks the 4.4Lite way: use addupc_task() instead
of addupc(). Call addupc_task() directly instead of using the ADDUPC()
macro.
Removed vestigial support for PROFTIMER.
switch.s:
Removed addupc().
resourcevar.h:
Removed ADDUPC() and declarations of addupc().
cpu.h:
Updated a comment. i386's never were tahoe's, and the deferred profiling
tick became (possibly) multiple ticks in 4.4Lite.
Obtained from: mostly from NetBSD
multiplication in addupc() overflowed for addresses >= 256K, assuming
the usual profil(2) scale parameter of 0x8000. addupc() will go away
soon.
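A sketch of this kind of overflow. It assumes the offset is halved
before being multiplied by the scale; that is a guess which happens to
match the 256K figure, not something taken from the addupc() source.
The function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit intermediate: the product wraps once half * scale reaches 2^32. */
static uint32_t scale_narrow(uint32_t offset, uint32_t scale)
{
    uint32_t half = offset >> 1;

    return (half * scale) >> 15;
}

/* 64-bit intermediate: the same computation without the wrap. */
static uint32_t scale_wide(uint32_t offset, uint32_t scale)
{
    uint64_t half = offset >> 1;

    assert(scale <= 0x10000);
    return (uint32_t)((half * scale) >> 15);
}
```

With scale 0x8000 the two agree just below 256K, but at offset 0x40000
the narrow product is exactly 2^32 and wraps to 0.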
Submitted by: John Polstra <jdp@polstra.com>