I586_OPTIMIZED_BCOPY is configured.
Similarly for bzero/I586_OPTIMIZED_BZERO.
Fake 586's had better have a hardware FPU with non-broken exception
handling (we mask exceptions, but broken exception handling may trap
on the instructions that do the masking). I guess this means that
the routines won't work on most 386's or FPUless 486's even when they
have a h/w FPU.
These are based on using the FPU to do 64-bit stores. They also
use i586-optimized instruction ordering, i586-optimized cache
management and a couple of other tricks. They should work on any
i*86 with a h/w FPU, but are slower on at least i386's and i486's.
They come close to saturating the memory bus on i586's. bzero()
can maintain a 3-3-3-3 burst cycle to 66 MHz non-EDO main memory
on a P133 (but is too slow to keep up with a 2-2-2-2 burst cycle
for EDO - someone with EDO should fix this). bcopy() is several
cycles short of keeping up with a 3-3-3-3 cycle for writing. For
a P133 writing to 66 MHz main memory, it just manages an N-3-3-3,
3-3-3-3 pair of burst cycles, where N is typically 6.
The new routines are not used by default. They are always configured
and can be enabled at runtime using a debugger or an lkm to change
their function pointer, or at compile time using new options (see
another log message).
Removed old, dead i586_bzero() and i686_bzero(). Read-before-write is
usually bad for i586's. It doubles the memory traffic unless the data
is already cached, and data is (or should be) very rarely cached for
large bzero()s (the system should prefer uncached pages for cleaning),
and the amount of data handled by small bzero()s is relatively small
in the kernel.
Improved comments about overlapping copies.
Removed unused #include.
I have only tested the ABP5140 card and only with a single CDROM drive
but it seems to work fine. This driver relies on features found only in
the SCSI branch so will not work in -current until those changes
are brought in. It also doesn't have any error handling code *yet*.
The goal is to use this driver as the development platform for the new
generic SCSI layer error recovery/handling code.
PCI and EISA front ends will show up as soon as I get my hands on
the cards. There are also a few issues in the driver that I need
to clear up with AdvanSys before I can suggest sticking one of
these cards in your server. 8-)
Thanks to AdvanSys for releasing this code under a suitable copyright.
Obtained from: Ported from the Linux driver writen by
bobf@advansys.com (Bob Frey).
[long long] results when I last worked on them, but they are normally
used together with to daddr_t's and off_t's which are signed, so the
unsigned results did little except cause warnings.
The main change is from unsigned long unsigned int. It just needs to
be a 32-bit type and unsigned int is most natural. Using a non-long
type has the "advantage" of hiding bugs in the "machine-independent"
code where it prints foo_t's using %d or %x. These bugs are currently
hidden bug not compiling with -Wformat.
I tried changing vm_ooffset_t from long long to unsigned long long, but
that was wrong because vm_ooffset_t needs to be long to match off_t,
although file offsets are never negative.
Reviewed by: dyson
Staticized it in userconfig. The one in pcvt is unused.
Removed bogus unused arg to getchar(). This should not have compiled
in the USERCONFIG_BOOT case, but the getchar() was also non-prototyped
and defined in K&R style.
Staticized the badly named global variable `next'. Even static variables
should have a unique module-specific prefix so that they can be referenced
easily in debuggers, etc.
First, change sysinstall and the Makefile rules to not build the kernel
nlist directly into sysinstall now. Instead, spit it out as an ascii
file in /stand and parse it from sysinstall later. This solves the chicken-n-
egg problem of building sysinstall into the fsimage before BOOTMFS is built
and can have its symbols extracted. Now we generate the symbol file in
release.8.
Second, add Poul-Henning's USERCONFIG_BOOT changes. These have two
effects:
1. Userconfig is always entered, rather than only after a -c
(don't scream yet, it's not as bad as it sounds).
2. Userconfig reads a message string which can optionally be
written just past the boot blocks. This string "preloads"
the userconfig input buffer and is parsed as user input.
If the first command is not "USERCONFIG", userconfig will
treat this as an implied "quit" (which is why you don't need
to scream - you never even know you went through userconfig
and back out again if you don't specifically ask for it),
otherwise it will read and execute the following commands
until a "quit" is seen or the end is reached, in which case
the normal userconfig command prompt will then be presented.
How to create your own startup sequences, using any boot.flp image
from the next snap forward (not yet, but soon):
% dd of=/dev/rfd0 seek=1 bs=512 count=1 conv=sync <<WAKKA_WAKKA_DOO
USERCONFIG
irq ed0 10
iomem ed0 0xcc000
disable ed1
quit
WAKKA_WAKKA_DOO
Third, add an intro screen to UserConfig so that users aren't just thrown
into this strange screen if userconfig is auto-launched. The default
boot.flp startup sequence is now, in fact, this:
USERCONFIG
intro
visual
(Since visual never returns, we don't need a following "quit").
Submitted-By: phk & jkh
divisor latch registers if the registers wouldn't change.
Use the default console cfcr setting while setting the divisor
latch registers for console i/o. Input may be messed up by
transiently changing the cfcr.
Use a usual cfcr setting while setting the divisor latch registers
in the probe. This shouldn't matter, but this is not the place to
test the UART's handling of 5 bit words.
Removed a stale devfs comment.
was always disabled because "pci.h" wasn't included. Now the configured
pci devices are listed and you can edit bogus flags for them.
Fixed bitrot in the disabled code. A used #include was removed and const
poisoning wasn't fixed.
Removed unused #include.
dependent operation, and not really a correct name. invltlb and invlpg
are more descriptive, and in the case of invlpg, a real opcode.
Additionally, fix the tlb management code for 386 machines.
for headers in the compile directory work unsurprisingly. Without
-I-, the search for "foo.h" begins in the directory of the file
that includes it, and the compile directory is only searched because
`-I.' is in ${INCLUDES}.
Removed -I$S/sys from ${INCLUDES}. It was once necessary to find
things like "param.h" in $S/sys. Now <sys/param.h> is found in $S.
lcall 7,0 (ie: ldt slot 0) and lcall 0x87,0 (ldt slot 16, it's shifted
three bits to the left). I was fiddling with this so long ago, I don't
recall the specifics.
with this quite a while ago when somebody reported a BSD/OS 2.1 binary
that wouldn't run. I'm pretty sure they tried it and I'm pretty sure
they mentioned to me that the patch worked.
comparisons in the inb() and outb() macros. I decided that int args
are OK here. Any type that can hold a u_int16_t without overflow
is correct, and 32-bit types are optimal.
Introduced a few tens of warnings (100 in LINT) for use of pessimized
(short) types for the port arg. Only a few drivers are affected by
this. u_short pessimizations aren't detected.
Added `__extension__' before the statement-expression in inb() so
that it can be compiled without warnings by gcc -pedantic.
- don't include <sys/ioctl.h> in any header. Include <sys/ioccom.h>
instead. This was already done in 4.4Lite for the most important
ioctl headers. Header spam currently increases kernel build
times by 10-20%. There are more than 30000 #includes (not counting
duplicates) for compiling LINT.
- include <sys/types.h> if and only it is necessary to make the header
almost self-sufficient (some ioctl headers still need structs from
elsewhere).
- uniformized idempotency ifdefs. Copied the style in the 4.4Lite
ioctl headers.
It is needed for implementation details but very little of it is
needed for the interface. Include it in the few places that didn't
already include it.
Include <sys/ioccom.h> in <sys/disklabel.h> (as already in
<sys/diskslice.h>) so that all the disk-related headers are almost
self-sufficient.
the prototype.
Put the jump table for i486_bzero() in the data section. This
speeds up i486_bzero() a little on Pentiums without significantly
affecting its speed on 486's.
Don't waste time falling through 14 nop's to return from do1 in
i486_bzero().
Use fastmove() for counts >= 1024 (was > 1024). Cosmetic.
Fixed profiling of fastmove().
Restored meaningful labels from the pre-1.1 version in fastmove().
Local labels are evil.
Fixed (high resolution non-) profiling of __bb_init_func().
I maintain that it saves more power to simply "hlt" the CPU than to
spend tons of time trying to tell the APM bios to do the same.
In particular if you do it 100 times a second...
Saved a few bytes by copying `dosdev' and/or `name' to local variables.
This optimization (for dosdev) was done in one place before but this
was lost in the devread() cleanup. This optimization (for dosdev)
can almost be done by bogusly declaring dosdev as const, but gcc still
often space-pessimizes code like the following:
extern const int dosdev; ... foo(dosdev); bar(dosdev);
gcc often doesn't bother to copy dosdev to a temporary local because
the local would have to be preserved in memory across the call to
foo(). OTOH, for
extern int dosdev; ... auto int dosdev_copy = dosdev; ...
foo(dosdev_copy); bar(dosdev_copy);
the copy must be made because foo() might alter dosdev.
the pointer to the string "/kernel". This pointer was once only
statically to once save space, but it has had to be dynamically
initialized for some time, so the static initialization just wastes
space. The string gets moved to the text section, so the actual
savings may be negative due to padding.
instead of 0 if there is no input.
pcvt_drv.c:
Partially fixed pccncheckc(). It returned a boolean value instead of
the character that it fetches from the input fifo (if any). I think
it still discards characters after the first for multi-char input.
instead of 0 if there is no input.
syscons.c:
Added missing spl locking in sccncheckc(). Return the same value as
sccngetc() would. It is wrong for sccngetc() to return non-ASCII, but
stripping the non-ASCII bits doesn't help.
still being used just to support printing of the device name in the
probe. Restored the method used in rev.1.6 and changed it to print
the same strings as the previous revision.
Reviewed by: Paul Richards
(1) Add PC98 support to apm_bios.h and ns16550.h, remove pc98/pc98/ic
(2) Move PC98 specific code out of cpufunc.h (to pc98.h)
(3) Let the boot subtrees look more alike
Submitted by: The FreeBSD(98) Development Team
<freebsd98-hackers@jp.freebsd.org>
modified. Pages that are removed by the pageout daemon were
the worst affected. Additionally, numerous minor cleanups,
including better handling of busy page table pages. This
commit fixes the worst of the pmap problems recently introduced.
biosextmem > 65536, but biosextmem is a 16-bit quantity so it is
guaranteed to be < 65536. Related cruft for biosbasemem was
mostly cleaned up in rev.1.26.
It worked because it is spelled correctly in LINT.
Added old obscure syscons options MAXCONS, SLOW_VGA and XT_KEYBOARD.
This file should be sorted both alphabetically and on the module
name by using a consistent prefix for each module, but there is no
consistency in the old options. E.g., MAXCONS is spelled PCVT_NSCREENS
for pcvt.
and xdm, possibly in general.
What was happening was that the server was doing a tcsetattr(.. TCSADRAIN)
on the mouse fd after a write. Since /dev/sysmouse had a null t_oproc,
the drain failed with EIO. Somehow this spammed XFree86 (!@&^#%*& binary
release!!), and the driver was left in a bogus state (ie: switch_in_progress
permanently TRUE).
The simplest way out was to implement a dummy scmousestart() routine to
accept any characters from the tty system and toss them into the void.
It would probably be more correct to intercept scwrite()'s to the mouse
device, but that's executed for every single write to the screen.
Supplying a start routine to eat the characters is only executed for the
mouse port during startup/shutdown, so it should be faster.
-I- to CFLAGS. <sb.h> must currently be used to give the version
of sb.h in the current directory, while "sb.h" in the buggy version
gave the (wrong) version in the source directory. Searching in the
source directory first is normal, but is the reverse of the order
suggested by the 4.4Lite2 #include style. -I- will remove the
ambiguities.
This enables other consumers of the mouse, to get it info via
moused/syscons.
In order to use it run moused (from sysconfig), and then tell
your Xserver that it should use /dev/sysmouse (mknod sysmouse c 12 128)
and it a mousesystems mouse. Everybody will be happy then :)
Remember that moused still needs to know what kind of mouse you
have..
Comments welcome, as is test results...
The default level works with minimal overhead, but one can also enable
full, efficient use of a 512K cache. (Parameters can be generated
to support arbitrary cache sizes also.)
(A pointer to a const was misused to avoid loading loading the same
value twice, but gcc does exactly the same optimization automatically.
It can see that the value hasn't changed.)
- avoiding strcmp("?" saved 12 bytes. gcc inlined the strcmp()
but this takes as much or more code as a function call. The
inlining was bogus because the strcmp() in the bootstrap isn't
standard.
- using a char instead of an int for the boolean `last_only' saved 8
bytes. Booleans should usually be represented as chars on the i386.
- simplifying the return tests saved 9 bytes.
- using putc instead of printf to print a newline saved 3 bytes of code
and 2 bytes of const data.
- avoiding `else's by always doing the else clause and fixing it up
saved 4+8 bytes.
gcc always generates large code for accesses to globals. For locals
it only generates large code if there are more than 128 bytes of
locals. It sorts scalar locals after array locals to pessimize for
space in the usual case when there are more (static) references to
scalars than to arrays.
Saved another 16 bytes (13 before padding) by adding a `continue'.
Fall-through tests normally save space, but here one of them made
gcc do space-unoptimal register allocation (it allocates ch in %bl
because preserving this register across function calls is "free",
but comparisions with %bl take one byte fewer than comparsions with
%bl).