824 Commits

Author SHA1 Message Date
bde
3cbf52b741 Attempt to fix build breakage in r344458.
Non-x86 arches use an inconsistently named header for the file containing
"pc" attributes, and the ifdef messes to include the right header were out
of date in the 2 files that I added to the MI files list.

Only amd64, arm, i386, mips, powerpc and sparc64 are supposed to support
syscons.  Only arm and mips were out of date in the ifdef.  Test
coverage for of syscons in arm is broken (turned off) in NOTES, but
syscons is in some other arm config files which universe detects as broken.
arm64 and riscv remain broken due to the opposite bug of not turning off
sc in NOTES, the same as before r344458 (see r344443).

The header is MD to contain possibly-non-"pc" encodings of attributes, but
since the attributes are essentially virtual in graphics mode and non-x86
arches only support graphics mode, the header has always been the same on
all arches except for different style bugs, so there should be only 1 MI
copy of it for syscons' use.  It was used in pcvt and still gives an an
API and an ABI, so it should be public and MI near or in sys/consio.h.
2019-02-26 09:44:10 +00:00
bde
543fd0f706 Fix the dumb and sc terminal emulators to compile and work.
First remove ifdefs of the unsupported option SC_DUMB_TERMINAL which
prevented building using both in the same kernel and broke regression
tests.  This option will be replaced by per-emulator supported options.

The dumb emulator rotted with KSE in r83366, but usually compiled since
it is ifdefed to nothing unless SC_DUMB_TERMINAL is defined.  The type
of an unused function parameter changed.

Both emulators rotted when 2 new methods were added while the emulators
were removed.  Only null methods are needed, but null function pointers
give panics instead.

The wildcard in the default for the unsupported option SC_DFLT_TERM
never really worked.  It tends to prefer the dumb emulator when multiple
emulators are configured.  Change it to prefer scteken for compatibility.
2019-02-21 19:19:30 +00:00
bde
61c06858a1 Restore syscons' terminal emulators. The trivial fixes to make them compile
will be committed later.

The "sc" emulator has the advantages of full support for cons25 and running
about 8 times faster than teken (for writing to the frame buffer).

The "dumb" emulator has the advantage of being simple.

Runtime choice of the emulator is good, but compile time choice is bad.
2019-02-21 08:37:39 +00:00
bde
711105c7ed My recent fix for programmable function keys in syscons only worked
when TEKEN_CONS25 is configured.  Fix this by adding a function to
set the flag that enables the fix and always calling this function
for syscons.

Expand the man page for teken_set_cons25().  This function is not
very useful since it can only set but not clear 1 flag.  In practice,
it is only used when TEKEN_CONS25 is configured and all that does is
choose the the default emulation for syscons at compile time.
2019-02-05 16:59:29 +00:00
kib
af881ec390 i386: Merge PAE and non-PAE pmaps into same kernel.
Effectively all i386 kernels now have two pmaps compiled in: one
managing PAE pagetables, and another non-PAE. The implementation is
selected at cold time depending on the CPU features. The vm_paddr_t is
always 64bit now. As result, nx bit can be used on all capable CPUs.

Option PAE only affects the bus_addr_t: it is still 32bit for non-PAE
configs, for drivers compatibility. Kernel layout, esp. max kernel
address, low memory PDEs and max user address (same as trampoline
start) are now same for PAE and for non-PAE regardless of the type of
page tables used.

Non-PAE kernel (when using PAE pagetables) can handle physical memory
up to 24G now, larger memory requires re-tuning the KVA consumers and
instead the code caps the maximum at 24G. Unfortunately, a lot of
drivers do not use busdma(9) properly so by default even 4G barrier is
not easy. There are two tunables added: hw.above4g_allow and
hw.above24g_allow, the first one is kept enabled for now to evaluate
the status on HEAD, second is only for dev use.

i386 now creates three freelists if there is any memory above 4G, to
allow proper bounce pages allocation. Also, VM_KMEM_SIZE_SCALE changed
from 3 to 1.

The PAE_TABLES kernel config option is retired.

In collaboarion with: pho
Discussed with:	emaste
Reviewed by:	markj
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D18894
2019-01-30 02:07:13 +00:00
markm
d8723e8b03 Remove the Yarrow PRNG algorithm option in accordance with due notice
given in random(4).

This includes updating of the relevant man pages, and no-longer-used
harvesting parameters.

Ensure that the pseudo-unit-test still does something useful, now also
with the "other" algorithm instead of Yarrow.

PR:		230870
Reviewed by:	cem
Approved by:	so(delphij,gtetlow)
Approved by:	re(marius)
Differential Revision:	https://reviews.freebsd.org/D16898
2018-08-26 12:51:46 +00:00
wulf
df1e48fddf Drop MOUSE_GETVARS and MOUSE_SETVARS ioctls support.
These ioctls are not documented and only stubbed in a few drivers: mse(4),
psm(4) and syscon's sysmouse(4). The only exception is MOUSE_GETVARS
implemented in psm(4)

Given the fact that they were introduced 20 years ago and implementation
has never been completed, remove any related code.

PR:		228718 (exp-run)
Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D15726
2018-06-10 10:23:31 +00:00
bde
107463a1fd Improve defaults for per-CPU kernel console colors, especially with 2
or 4 CPUs.  Add a compile-time option SC_KERNEL_CONS_ATTRS to control the
defaults.

Default to color numbers in reverse order to CPU numbers (instead of
in the same order with white first and wrapping to dark grey), so that
the brightest bright colors are used first.  Don't use dark grey at all;
replace it by dark green.

Syscons has too many compile-time options, but this one is needed in
in case the defaults give something like white on white, or the user
really hates this feature and can't wait to turn it off in rc.

MFC after:	next release?
2018-06-02 14:07:27 +00:00
bde
070d6e2563 Use per-CPU attributes earlier.
The per-CPU ts is not initialized early, so the global kernel ts is used
early, but it ony has 1 (normal) attribute.  Switch this to the per-CPU
attribute.

The difference is most visible with EARLY_AP_STARTUP.

Change to using the curcpu macro instead of PCPU_GET(cpuid) in 2 places for
the above and in 1 other place in my old code in syscons.  The function-like
spelling is perhaps better for indicating that curcpu is volatile (unlike
curthread), but for CPU attributes volatility is a feature.
2018-06-02 10:36:30 +00:00
bde
14f65346d3 Fix low-level locking during panics.
The SCHEDULER_STOPPED() hack breaks locking generally, and
mtx_trylock_*() especially.  When mtx_trylock_*() returns nonzero,
naive code version here trusts it to have worked.  But when
SCHEDULER_STOPPED() is true, mtx_trylock_*() returns 1 without doing
anything.  Then mtx_unlock_*() crashes especially badly attempting to
unlock iff the error is detected, since mutex unlocking functions don't
check SCHEDULER_STOPPED().

syscons already didn't trust mtx_trylock_spin(), but it was missing the
logic to turn on sp->kdb_locked when turning off sp->mtx_locked during
panics.  It also used panicstr instead of SCHEDULER_LOCKED because I
thought that panicstr was more fragile.  They only differ for a window
of lines in panic(), and in broken cases where stop_cpus_hard() in panic()
didn't work.
2018-06-02 08:38:59 +00:00
kib
e3089a0318 i386 4/4G split.
The change makes the user and kernel address spaces on i386
independent, giving each almost the full 4G of usable virtual addresses
except for one PDE at top used for trampoline and per-CPU trampoline
stacks, and system structures that must be always mapped, namely IDT,
GDT, common TSS and LDT, and process-private TSS and LDT if allocated.

By using 1:1 mapping for the kernel text and data, it appeared
possible to eliminate assembler part of the locore.S which bootstraps
initial page table and KPTmap.  The code is rewritten in C and moved
into the pmap_cold(). The comment in vmparam.h explains the KVA
layout.

There is no PCID mechanism available in protected mode, so each
kernel/user switch forth and back completely flushes the TLB, except
for the trampoline PTD region. The TLB invalidations for userspace
becomes trivial, because IPI handlers switch page tables. On the other
hand, context switches no longer need to reload %cr3.

copyout(9) was rewritten to use vm_fault_quick_hold().  An issue for
new copyout(9) is compatibility with wiring user buffers around sysctl
handlers. This explains two kind of locks for copyout ptes and
accounting of the vslock() calls.  The vm_fault_quick_hold() AKA slow
path, is only tried after the 'fast path' failed, which temporary
changes mapping to the userspace and copies the data to/from small
per-cpu buffer in the trampoline.  If a page fault occurs during the
copy, it is short-circuit by exception.s to not even reach C code.

The change was motivated by the need to implement the Meltdown
mitigation, but instead of KPTI the full split is done.  The i386
architecture already shows the sizing problems, in particular, it is
impossible to link clang and lld with debugging.  I expect that the
issues due to the virtual address space limits would only exaggerate
and the split gives more liveness to the platform.

Tested by: pho
Discussed with:	bde
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
Differential revision:	https://reviews.freebsd.org/D14633
2018-04-13 20:30:49 +00:00
brooks
9d79658aab Move most of the contents of opt_compat.h to opt_global.h.
opt_compat.h is mentioned in nearly 180 files. In-progress network
driver compabibility improvements may add over 100 more so this is
closer to "just about everywhere" than "only some files" per the
guidance in sys/conf/options.

Keep COMPAT_LINUX32 in opt_compat.h as it is confined to a subset of
sys/compat/linux/*.c.  A fake _COMPAT_LINUX option ensure opt_compat.h
is created on all architectures.

Move COMPAT_LINUXKPI to opt_dontuse.h as it is only used to control the
set of compiled files.

Reviewed by:	kib, cem, jhb, jtl
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14941
2018-04-06 17:35:35 +00:00
imp
b1d831d84d Revert r331298
Normally, shutdown_nice() just signals init. However, sometimes it
calls kern_reboot directly. For that case, r331298 dropped the Giant
lock before calling it. This turns out to be incorrect for the more
common case where init exists and we just signal it. Restore the old
behavior. The direct call to kern_reboot() doesn't sync buffers to the
disk, so should work with Giant held, so we don't need to drop locks
here for that.

Noticed by: bde@
Sponsored by: Netflix
2018-03-22 15:11:53 +00:00
imp
17b7d8a1eb Unlock giant when calling shutdown_nice() 2018-03-21 14:47:12 +00:00
pfg
ced875130d Revert r327828, r327949, r327953, r328016-r328026, r328041:
Uses of mallocarray(9).

The use of mallocarray(9) has rocketed the required swap to build FreeBSD.
This is likely caused by the allocation size attributes which put extra pressure
on the compiler.

Given that most of these checks are superfluous we have to choose better
where to use mallocarray(9). We still have more uses of mallocarray(9) but
hopefully this is enough to bring swap usage to a reasonable level.

Reported by:	wosch
PR:		225197
2018-01-21 15:42:36 +00:00
pfg
86c1e7ab7b dev: make some use of mallocarray(9).
Focus on code where we are doing multiplications within malloc(9). None of
these is likely to overflow, however the change is still useful as some
static checkers can benefit from the allocation attributes we use for
mallocarray.

This initial sweep only covers malloc(9) calls with M_NOWAIT. No good
reason but I started doing the changes before r327796 and at that time it
was convenient to make sure the sorrounding code could handle NULL values.
2018-01-13 22:30:30 +00:00
eadler
421a929b1e kernel: Fix several typos and minor errors
- duplicate words
- typos
- references to old versions of FreeBSD

Reviewed by:	imp, benno
2017-12-27 03:23:21 +00:00
pfg
1537078d8f sys/dev: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
2017-11-27 14:52:40 +00:00
wulf
a9e0169879 sysmouse(4): Fix ums(4)-style T-axis reporting via evdev protocol
- Do not report T-axis wheel events as button presses
- Reverse T-axis to match Linux
- Remove wrong comment. T-axis buttons state should be checked by level not
    by edge to allow continuous wheel tilt reporting

Reviewed by:		gonzo
Approved by:		gonzo (mentor)
MFC after:		2 weeks
Differential Revision:	https://reviews.freebsd.org/D12676
2017-11-01 22:30:36 +00:00
bde
3ae8c07ecd Fix bugs in (mostly) not-yet-activated parts of early/emergency output:
- map the hard-coded frame buffer address above KERNBASE.  Using the
  physical address only worked because of larger mapping bugs.

  The hard-coded frame buffer address only works on x86.  Use messy ifdefs
  to try to avoid warnings about unused code for other arches.

- remove the sysctl for reading and writing the table kernel console
  attributes.  Writing only worked for emergency output since normal
  output uses unalterd copies.

- fix the test for the emergency console being usable

- explain why a hard-coded attribute is used very early.  Emergency output
  works on x86 even before the pcpu pointer is initialized.
2017-08-25 10:57:17 +00:00
bde
27405ee693 Support setting the colors of cursors for the VGA renderer.
Advertise this by changing the defaults to mostly red.  If you don't like
this, change them (almost) back using:
   vidcontrol -c charcolors,base=7,height=0
   vidcontrol -c mousecolors,base=0[,height=15]

The (graphics mode only) mouse cursor colors were hard-coded to a black
border and lightwhite interior.  Black for the border is the worst
possible default, since it is the same as the default black background
and not good for any dark background.  Reversing this gives the better
default of X Windows.  Coloring everything works better still.  Now
the coloring defaults to a lightwhite border and red interior.

Coloring for the character cursor is more complicated and mode
dependent.  The new coloring doesn't apply for hardware cursors.  For
non-block cursors, it only applies in graphics mode.  In text mode,
the cursor color was usually a hard-coded (dull)white for the background
only, unless the foreground was white when it was a hard-coded black
for the background only, unless the foreground was white and the
background was black it was reverse video.  In graphics mode, it was
always reverse video for the block cursor.  Reverse video is worse,
especially over cutmarking regions, since cutmarking still uses simple
reverse video (nothing better is possible in text mode) and double
reverse video for the cursor gives normal video.  Now, graphics mode
uses the same algorithm as the best case for text mode in all cases
for graphics mode.  The hard-coded sequence { white, black, } for the
background is now { red, white, blue, } where the first 2 colors can
be configured.  The blue color at the end is a sentinel which prevents
reverse video being used in most cases but breaks the compatibility
setting for white on black and black on white characters.  This will
be fixed later.  The compatibility setting is most needed for mono modes.

The previous commit to syscons.c changed sc_cnterm() to be more careful.
It followed null pointers in some cases.  But sc_cnterm() has been
unreachable for 15+ years since changes for multiple consoles turned
off calls to the the cnterm destructor for all console drivers.  Before
them, it was only called at boot time.  So no driver with an attached
console has ever been unloadable and not even the non-console destructors
have been tested much.
2017-08-25 07:04:41 +00:00
bde
a887544cd3 Oops, the previous commit was missing 1 line. 2017-08-25 02:41:01 +00:00
bde
b3f7bab923 Fix missing switching of the terminal emulator when switching the
terminal state for kernel console output.

r56043 in 2000 added many complications to support dynamic selection
of the terminal emulator using modules and the ioctl CONS_SETTERM.
This was never completed.  There are still no modules, but it is easy
to restore the scterm and dumb emulators at compile time.  Then
boot-time configuration for the preferred one doesn't work right, but
CONS_SETTERM almost works after fixing this bug.  CONS_SETTERM only
switches the emulator for the user state, leaving the kernel state(s)
still using the boot-time emulator.  The fix is especially important
when switching from sc to scteken, since the scteken state has pointers
in it.

Rename kernel_console_ts to sc_kts.
2017-08-25 02:37:32 +00:00
bde
d61f277f2c Fix setting of defaults for the text cursor.
There was already a per-vty defaults field, but it was useless since it was
only initialized when propagating the global settings and thus no different
from the current global settings and not per-vty.  The global defaults field
was also invariant after boot time, but not quite so useless.

Fix this by adding a second selection bit the the control flags of the
relevant ioctl().  vidcontrol doesn't support this yet.  Setting either
default propagates the change to the current setting for the same level
and then to all lower levels.

Improve the 3-way escape sequence used by termcap to control the cursor.
The "normal" (ve) case has always used reset, so the user could set
it to anything, but since the reset is to a global value this is not
very useful, especially since the "very visible" (vs) case doesn't
reset but inconsistently forces to a blinking block.  Change vs to
first reset and then XOR the blinking bit so that it is predictably
different from ve.
2017-08-19 23:13:33 +00:00
bde
1eec2f6095 Rename curr_curs_attr to base_curr_attr. The actual current cursor
attribute field is curs_attr.  The base field holds user data translated
in a reversible way and is needed because current field holds this in
an irreversible way for efficiency.

Factor out some common code for the reversible translation.  This is
slightly simpler now, and much easier to expand.

Translate the magic flags value -1 to a single control flag internally
up front so other flags can be trusted later.  This can be used for the
relevant ioctl() too.

Remove CONS_CURSOR_FLAGS which contained all the control flags.  It was
unused and not useful.  After adding more flags, there will be tests on
a couple at a time but never on them all.  This API should have used this
to disallow unknown flags.
2017-08-19 21:40:42 +00:00
bde
fce552fb16 Use better hard-coded defaults for the cursor shape, and remove nearby
redundant initializations.

Hard-code base = 0, height = (approx. 1/8 of the boot-time font height)
in all cases, and remove the BIOS/MD support for setting these values.
This asks for an underline cursor sized for the boot-time font instead
of various less hard-coded but worse values.  I used that think that
the x86 BIOS always gave the same values as the above hard-coding, but
on 1 of my systems it gives the wrong value of base = 1.

The remaining BIOS fields are shift_state and bell_pitch.  These are now
consistently not explicitly reinitialized to 0.  All sc_get_bios_value()
functions except x86's are now empty, and the only useful thing that x86
returns is shift_state.  This really belongs in atkbdc, but heavier
use of the BIOS to read the more useful typematic rate has been removed
there.  fb still makes much heavier use of the BIOS.
2017-08-19 19:33:16 +00:00
bde
9a96357487 Fix syscons escape sequence for setting the local cursor type. This sequence
was aliased to a vt sequence, causing and fixing various bugs.

For syscons, this restores support for arg 2 which sets blinking block
too forcefully, and restores bugs for arg 0 and 1.  Arg 2 is used for
vs in the cons25 entry in termcap, but I've never noticed an application
that uses this.  The bugs involve replacing local settings by global
ones and need better handling of defaults to fix.

For vt, this requires moving the aliasing code from teken to vt where
it belongs.  This sequences is very important for cons25 compatibility
in vt since it is used by the cons25 termcap entries for ve, vi and
vs.  vt can't properly support vs for either cons25 or xterm since it
doesn't support blinking.  For xterm, the termcap entry for vs asks
for something different using 12;25h instead of 25h.

Rename C25CURS for this to C25LCT and change its description to be closer
to echoing the old comment about it.  CURS is too generic.

Fix missing syscons escape sequence for setting the global cursor shape
(and type).  Only support this in syscons since vt can't emulate anything
in it.
2017-08-18 15:40:40 +00:00
bde
05782903dc Fix vt100 escape sequence for showing and hiding the cursor in syscons.
It should toggle between 2 states, but it used a cut-down version of
support for a related 3-state syscons escape sequence and inherited
bugs from that.  The usual misbehaviour was that hiding and showing
the cursor reset it to a global default.

Support for the 3-state sequence remains broken by aliasing to the 2-state
sequence.  This works better but incompatibly for the 2 cases that it
supports.
2017-08-18 12:45:00 +00:00
bde
7962e7e7dc Fix missing syscons escape sequence for setting the border color. 2017-08-18 10:38:49 +00:00
bde
84d82b383a Undeprecate the CONS_CURSORTYPE ioctl. It was "deprecated" in 2001,
but it was actually extended then and it is still used (just once) in
/usr/src by its primary user (vidcontrol), while its replacement is
still not used in /usr/src.

yokota became inactive soon after deprecating CONS_CURSORTYPE (this
was part of a large change to make cursor attributes per-vty).

vidcontrol has incomplete support even for the old ioctl.  I will
update it soon.  Then there are many broken escape sequences to fix.
This is just to prepare for setting cursor colors using vidcontrol.
2017-08-16 10:59:37 +00:00
bde
d48d00cb17 Fix attribute flipping for cut marking in pixel mode. The text-mode
code was used, so the lightness bit was not flipped, so the flipping
was unnecessarily null in some cases.  E.g., the unusal color scheme
of lightwhite on white (white = lightgrey in kernelspeak) is not
completely unusable, except null flipping of it gave no visible marks
for cut marking.  Now flipping it works in pixel mode only.

Fix text cursor attribute adjustment over cut marking in text mode for
the usual cursor type (non-blinking full block).  Apply the flipping
for cut marking first and adjust that instead of vice versa.  This
gives a uniform color scheme for the usual text cursor type in text
mode: a white block background with no change to the character
foreground except for variations to avoid collisions.  The old order
gave a white character fg with no change in the bg in non-colliding
cases.  Versions before r316636 changed the bg to the non-cut-marked
one about half the time using a saveunder bug; this accidentally gave
something resembling a block cursor half the time.
2017-07-10 09:00:35 +00:00
bde
23cedd336c Move open coding of construction of attributes for cut regions and
text cursors to functions so that it is easier to fix and improve.
This commit doesn't fix anything except for removing unnecessary
complications and adding comments.
2017-07-09 12:13:37 +00:00
bde
823f504338 Add many bitmaps (now there are 13) for mouse cursors and logic to try
to choose the best one.

The old 9x13 cursor was was sort of correct for CGA 640x200 text mode,
but distorted for all other modes.  This mode is still available on
all systems with VGA, but stopped being useful in ~1985.  It has very
unsquare pixels with an aspect ratio of 240:100 on 4:3 monitors.  On
16:9 monitors, the unsquareness in this mode is reduced to only 180:100
iff the monitor stretches the pixels to the full screen.

Newer modes and systems have smaller distortions, but with many more
variations.  Square pixels first became common with VGA 640x480 mode
on 4:3 monitors.  However, standard VGA text mode also has 9-bit wide
characters and only 25 lines, so it has 720x400 pixels.  This has
unsquare pixels with an aspect ratio of 135:100 on 4:3 monitors.  On
16:9 monitors, it gives almost-square pixels with an aspect ration of
101:100 iff the monitor stretches, but in modes that were square on
4:3 monitors square similar monitor stretching breaks the squareness.

Guess the physical aspect ratio using heuristics.  The old version of
X that I use is further from doing this using info from PnP monitors
that is unavailable in syscons (X doesn't understand if the monitor
is doing stretching and doesn't even understand how its its own mode
changes affect the pixel size).  Monitors with aspect ratio control
should be configured to _not_ stretch 4:3 modes to 16:9.  Otherwise,
use the machdep.vga_aspect_scale sysctl to compensate.  Only 1 of my
4 monitors/laptops requires this.  It always stretches to 16:9.

The mouse data has new aspect ratio fields for selecting the best
cursor and a new name field for display in debugging messages.

Selecting the mouse cursor is now a slow operation so it is not done
for every drawing of the cursor.  To avoid a new initialization method,
it is done whenever the text cursor is set or changed.  Also remove
dead code in settings of text cursors.

Use larger mouse cursors (sometimes the full 10x16 one) for 8x8 fonts
in cases where this works better (mostly in graphics mode).
2017-07-08 17:30:33 +00:00
bde
a262d2cd7c Add files to help manage the (vga) syscons mouse cursor.
To mostly fix distortion of mouse cursors by non-square pixels, I
needed 8 variants of the same cursor shape for large fonts and
another 7 variants for small fonts.  Some variants are shared,
leaving only 13 variants in 26 glyphs altogether.  Keep these in
the BDF source file cursor.bdf.  cursor.bdf has another 5 unused
experimental cursors in 10 glyphs.  cursor.awk is a simple awk
script for converting this and similar bdf files into C declarations
for copying into scvgarndr.c.  syscons doesn't use any of this yet.
2017-07-08 15:01:55 +00:00
bde
04b8d88f13 Change the drawing method for the mouse cursor in planar mode to support
colors.

Colors are still hard-coded as 15 (normally lightwhite) for the interior
and 0 (normally black) for the border, but these are now values used in
2 expressions instead of built in to the algorithm.  The algorithm used
a fancy and/or method, but this gives no control over the colors except
and'ing all color planes off gives black and or'ing all color planes on
gives lightwhite.  Just draw the border and interior in separate colors
using the same method as for characters, including its complications to
optimize for VGA adaptors.  Optimization is not really needed here, but
for the VGA case it avoids being slower than the and/or method.  The
optimization is worth about 30%.
2017-04-23 08:59:35 +00:00
bde
a31946f8ed Optimize setting of the foreground color in the main planar method much
like for the background color.

This is a about 5% faster for output that actually reaches the screen.
2017-04-21 17:57:23 +00:00
bde
1753d0e8f9 Merge the main ega drawing method into the main vga planar method and
remove the former.

All other EGA/VGA methods were already shared, with VGA-only features
mostly not used and no decisions in inner loops to optimize fof VGA,
but this method was split up because it is the only important one and
using VGA methods if possible is about twice as fast.  The speed is
mostly not from splitting to reduce branches but from doing half as
many bus accesses, so make this easier to maintain by not splitting.
There is now 1 extra branch in an inner loop where it costs less than
1% of the bus access overhead on Haswell even if the compiler schedules
it poorly.
2017-04-21 15:12:43 +00:00
bde
120c3bfb69 Oops, the previous commit swapped the main ega method with the main
vga planar method (for testing that was supposed to be local that the
former still works).  The ega method works on vga but is about twice
as slow.  The vga method doesn't work on ega.

Optimize the main vga planar method a little.  For changing the
background color (which was otherwise optimized better than most
things), don't switch the write mode from 3 to 0 just to select
the pixel mask of 0xff obscurely by writing 0.  Just write 0xff
directly.
2017-04-21 06:55:17 +00:00
bde
f980795b00 Eliminate the ega renderer switch. It did nothing useful except hold
a pointer to the main ega drawing method which is misoptimized be in
a different function than the main vga planar mode drawing method.
Vga initialization handles everything with no extra code except for
selecting the different function.
2017-04-20 17:22:03 +00:00
bde
9a3d9516b6 When the character width is 9, remove vertical lines in the mouse cursor
corresponding to the gaps between characters.  This fixes distortion
of the cursor due to expanding it across the gaps.

Again for character width 9, when the cursor characters are not in the
graphics range (0xb0-0xdf), the gaps were always there (filled in the
background color for the previous char).  They still look strange, but
don't cause distortion.  When the cursor characters are in the graphics
range, the gaps are filled by repeating the previous line.  This gives
distortion with cilia.  Removing vertical lines reduces the distortion
to vertical cilia.

Move the default for the cursor characters out of the graphics range.
With character width 9, this gives gaps instead of distortion and
other problems.  With character width 8, it just fixes a smaller set
of other problems.  Some distortion and other problems can be recovered
using vidcontrol -M.  Presumably the default was to fill the gaps
intentionally, but it is much better to leave gaps.  The gaps can even
be considered as a feature for text processing -- they give sub-pointers
to character boundaries.  The other problems are: (1) with character
width 9, characters near the cursor are moved into the graphics range
and thus distorted if any of their 8th bits is set; (2) conflicts with
national characters in the graphics range.

The default range for the graphics cursor characters is now 8-11.  This
doesn't conflict with anything, since the glyphs for the characters in
this range are unreachable.

Use the 10x16 mouse cursor in text mode too (if the font size is >= 14).

When the character width is 9, removal of 1 or 2 vertical lines makes
10x16 cursor no wider than the 9x13 one usually was.  We could even
handle cursors 1 pixel wider in 2 character cells and gaps without
more clipping than given by the gaps (the worst case is 1 pixel in the
left cell, 1 removed in the middle gap, 8 in the right cell and 1
removed in the right gap.  The pixel in the right gap is removed so
it doesn't matter if it is in the font).

When the character width is 8, we now clip the 10-wide cursor by 1
pixel in the worst case.  This clipping is usually invisible since it
is of the border and and the border usually merges with the background
so is invisible.  There should be an option to use reverse video to
highlight the border and its tip instead of the interior (graphics
mode can do better using separate colors).  This needs the 9x13 cursor
again.

Ideas from: ache (especially about the bad default character range)
2017-04-20 16:34:09 +00:00
glebius
6132d41d49 Fix build without SC_PIXEL_MODE defined. 2017-04-19 22:48:27 +00:00
bde
65c9c293c8 Fix missing support for drawing the mouse cursor in depth 24 of direct
mode.

Use the general DRAWPIXEL() macro with its bigger case statement
(twice) instead of our big case statement (once).  DRAWPIXEL() is more
complicated since it is not missing support for depth 24 or
complications for colors in depth 16 (we currently hard-code black and
white so the complications for colors are not needed).  DRAWPIXEL()
also does the bpp calculation in the inner loop.  Compilers optimize
DRAWPIXEL() well enough, and the main text drawing method always
depended on this.  In direct mode, mouse cursor drawing is now similar
to normal text drawing except it draws in 2 hard-coded colors instead
of 1 variable color.

This also fixes a nested hard-coding of colors.  DRAWPIXEL() uses the
palette in all cases, but the direct code didn't use the palette for
its hard-coded black.  This only had an effect in depth 8, since
changing the palette is not supported in other depths.
2017-04-19 18:35:34 +00:00
bde
8ba8702bad Stop using a saveunder method for mouse cursor drawing in the vga
direct mode renderer.  I thought that reads were not much slower than
writes, so that the method only tripled the time for the whole function,
but I recently measured that video memory reads can be up to 53 times
slower than writes in tighter loops than here.  Loop overheap here
reduces the multiplier to only 16-20 on Haswell.

Start cleaning up and fixing larger bugs in this function.  Only replace
the 22-line removal loop by a 3-line one for now, since adjusting the
old loop would have required many palette calculations which are better
done in the DRAW_PIXEL() macro.  This also fixes missing support for
depth 24, but only for removal.

Removal is currently sloppy at the right bottom corner.  It sometimes
leaks border color into the text window.  This is soon cleaned up by the
caller.  The planar renderer has complications to clip at the corner.
2017-04-19 16:24:51 +00:00
bde
1470a1016b Add a 10x16 mouse cursor and use it in all graphics (strictly, pixel)
modes if the font size is >= 14.

This is the X cursor XC_left_ptr (#68) (glyph #45 in an X cursor font).
Also found in vt.  The old 9x13 cursor is the 10x16 one trimmed not very
well.

8x8 fonts need a smaller cursor instead of a larger one, except when
the pixel size is small.  Text mode is still limited to width and height
1 more than the font (so the 9x13 is already 4 pixels too high for it).
2017-04-15 20:03:50 +00:00
bde
a6c7051d95 Structure the mouse cursor data so that it is easier to switch, and
access it via pointers (still to only 1 instance, now with a less
generic name).

Restructure the "and" and "or" masks as border and interior masks
(where the "and" mask was for the union of the border and the interior).
"and" and "or" were only a detail in a not very good implementation,
and after fixing that the union was only used to calculate the border
at runtime.

Use the metric data in more places to clip to active pixels earlier.
2017-04-15 19:27:39 +00:00
bde
815bf10d96 Oops, the previous revision was missing the update of the shift variable. 2017-04-14 17:38:43 +00:00
bde
16de330c1e Adjust shifting so that cursor widths up to 17 (was 9) work in vga planar
mode.

Direct mode always supported widths up to 32, except for its hard-coded
16s matching the pixmap size.  Text mode is still limited to 9 its 2x2
character cell method and missing adjustments for the gap between
characters, if any.

Cursor heights can be almost anything in graphics modes.
2017-04-14 17:02:24 +00:00
bde
635fe3c642 Optimize drawing of the mouse cursor in vga planar mode almost as
much as possible, by avoiding null ANDs and ORs to the frame buffer.

Mouse cursors are fairly sparse, especially for their frame.  Pixels
are written in groups of 8 in planar mode and the per-group sparseness
is not as large, but it still averages about 40% with the current
9x13 mouse cursor.  The average drawing time is reduced by about this
amount (from 22 usec constant to 12.5 usec average on Haswell).

This optimization is relatively larger with larger cursors.  Width 10
requires 6 frame buffer accesses per line instead of 4 if not done
sparsely, but rarely more than 4 if done sparsely.
2017-04-14 14:00:13 +00:00
bde
4172610070 Further unobfuscate the method of drawing the mouse cursor in vga planar
mode.

Don't manually unroll the 2 inner loops.  On Haswell, doing so gave a
speedup of about 0.5% (about 4 cycles per iteration out of 1400), but
hard-coded a limit of width 9 and made better better optimizations
harder to see.  gcc-4.2.1 -O does the unrolling anyway, unless tricked
with a volatile hack.  gcc's unrolling is not very good and gives a
a speedup of about half as much (about 2 cycles per iteration).  (All
timing on i386.)

Manual unrolling was only feasible because the inner loop only iterates
once or twice.  Usually twice, but a dynamic check is needed to decide,
and was not moved from the second-innermost loop manually or by gcc.
This commit basically adds another dynamic check in the inner loop.

Cursor widths of 10-17 require 3 iterations in the inner loop and this
is not so easy to unroll -- even gcc stops at 2.
2017-04-14 12:03:34 +00:00
bde
93efbe5f35 Improve drawing of the vga planar mode mouse image a little. Unobfuscate
the method a lot.

Reduce the AND mask to the complement of the cursor's frame, so that area
inside the frame is not drawn first in black and then in lightwhite.  The
AND-OR method is only directly suitable for the text mouse image, since
it doesn't go to the hardware there.  Planar mode Mouse cursor drawing
takes 10-20 usec on my Haswell system (approx. 100 graphics accesses
at 130 nsec each), so the transient was not visible.

The method used the fancy read mode 1 and its color compare and color
don't care registers with value 0 in them so that all colors matched.
All that this did was make byte reads of frame buffer memory return 0xff,
so that the x86 case could obfuscate read+write as "and".  The read must
be done for its side effect on the graphics controller but is not used,
except it must return 0xff to avoid affecting the write when the write
is obfuscated as a read-modify-write "and".  Perhaps that was a good
optimization for 8088 CPUs where each extra instruction byte took as
long as a byte memory access.

Just use read+write after removing the fancy read mode.  Remove x86
ifdefs that did the "and".  After removing the "and" in the non-x86
part of the ifdefs, fix 4 of 6 cases where the shift was wrong.
2017-04-12 20:18:38 +00:00