colors.
Colors are still hard-coded as 15 (normally lightwhite) for the interior
and 0 (normally black) for the border, but these are now values used in
2 expressions instead of built in to the algorithm. The algorithm used
a fancy and/or method, but this gives no control over the colors except
and'ing all color planes off gives black and or'ing all color planes on
gives lightwhite. Just draw the border and interior in separate colors
using the same method as for characters, including its complications to
optimize for VGA adaptors. Optimization is not really needed here, but
for the VGA case it avoids being slower than the and/or method. The
optimization is worth about 30%.
remove the former.
All other EGA/VGA methods were already shared, with VGA-only features
mostly not used and no decisions in inner loops to optimize fof VGA,
but this method was split up because it is the only important one and
using VGA methods if possible is about twice as fast. The speed is
mostly not from splitting to reduce branches but from doing half as
many bus accesses, so make this easier to maintain by not splitting.
There is now 1 extra branch in an inner loop where it costs less than
1% of the bus access overhead on Haswell even if the compiler schedules
it poorly.
vga planar method (for testing that was supposed to be local that the
former still works). The ega method works on vga but is about twice
as slow. The vga method doesn't work on ega.
Optimize the main vga planar method a little. For changing the
background color (which was otherwise optimized better than most
things), don't switch the write mode from 3 to 0 just to select
the pixel mask of 0xff obscurely by writing 0. Just write 0xff
directly.
a pointer to the main ega drawing method which is misoptimized be in
a different function than the main vga planar mode drawing method.
Vga initialization handles everything with no extra code except for
selecting the different function.
corresponding to the gaps between characters. This fixes distortion
of the cursor due to expanding it across the gaps.
Again for character width 9, when the cursor characters are not in the
graphics range (0xb0-0xdf), the gaps were always there (filled in the
background color for the previous char). They still look strange, but
don't cause distortion. When the cursor characters are in the graphics
range, the gaps are filled by repeating the previous line. This gives
distortion with cilia. Removing vertical lines reduces the distortion
to vertical cilia.
Move the default for the cursor characters out of the graphics range.
With character width 9, this gives gaps instead of distortion and
other problems. With character width 8, it just fixes a smaller set
of other problems. Some distortion and other problems can be recovered
using vidcontrol -M. Presumably the default was to fill the gaps
intentionally, but it is much better to leave gaps. The gaps can even
be considered as a feature for text processing -- they give sub-pointers
to character boundaries. The other problems are: (1) with character
width 9, characters near the cursor are moved into the graphics range
and thus distorted if any of their 8th bits is set; (2) conflicts with
national characters in the graphics range.
The default range for the graphics cursor characters is now 8-11. This
doesn't conflict with anything, since the glyphs for the characters in
this range are unreachable.
Use the 10x16 mouse cursor in text mode too (if the font size is >= 14).
When the character width is 9, removal of 1 or 2 vertical lines makes
10x16 cursor no wider than the 9x13 one usually was. We could even
handle cursors 1 pixel wider in 2 character cells and gaps without
more clipping than given by the gaps (the worst case is 1 pixel in the
left cell, 1 removed in the middle gap, 8 in the right cell and 1
removed in the right gap. The pixel in the right gap is removed so
it doesn't matter if it is in the font).
When the character width is 8, we now clip the 10-wide cursor by 1
pixel in the worst case. This clipping is usually invisible since it
is of the border and and the border usually merges with the background
so is invisible. There should be an option to use reverse video to
highlight the border and its tip instead of the interior (graphics
mode can do better using separate colors). This needs the 9x13 cursor
again.
Ideas from: ache (especially about the bad default character range)
mode.
Use the general DRAWPIXEL() macro with its bigger case statement
(twice) instead of our big case statement (once). DRAWPIXEL() is more
complicated since it is not missing support for depth 24 or
complications for colors in depth 16 (we currently hard-code black and
white so the complications for colors are not needed). DRAWPIXEL()
also does the bpp calculation in the inner loop. Compilers optimize
DRAWPIXEL() well enough, and the main text drawing method always
depended on this. In direct mode, mouse cursor drawing is now similar
to normal text drawing except it draws in 2 hard-coded colors instead
of 1 variable color.
This also fixes a nested hard-coding of colors. DRAWPIXEL() uses the
palette in all cases, but the direct code didn't use the palette for
its hard-coded black. This only had an effect in depth 8, since
changing the palette is not supported in other depths.
direct mode renderer. I thought that reads were not much slower than
writes, so that the method only tripled the time for the whole function,
but I recently measured that video memory reads can be up to 53 times
slower than writes in tighter loops than here. Loop overheap here
reduces the multiplier to only 16-20 on Haswell.
Start cleaning up and fixing larger bugs in this function. Only replace
the 22-line removal loop by a 3-line one for now, since adjusting the
old loop would have required many palette calculations which are better
done in the DRAW_PIXEL() macro. This also fixes missing support for
depth 24, but only for removal.
Removal is currently sloppy at the right bottom corner. It sometimes
leaks border color into the text window. This is soon cleaned up by the
caller. The planar renderer has complications to clip at the corner.
modes if the font size is >= 14.
This is the X cursor XC_left_ptr (#68) (glyph #45 in an X cursor font).
Also found in vt. The old 9x13 cursor is the 10x16 one trimmed not very
well.
8x8 fonts need a smaller cursor instead of a larger one, except when
the pixel size is small. Text mode is still limited to width and height
1 more than the font (so the 9x13 is already 4 pixels too high for it).
access it via pointers (still to only 1 instance, now with a less
generic name).
Restructure the "and" and "or" masks as border and interior masks
(where the "and" mask was for the union of the border and the interior).
"and" and "or" were only a detail in a not very good implementation,
and after fixing that the union was only used to calculate the border
at runtime.
Use the metric data in more places to clip to active pixels earlier.
mode.
Direct mode always supported widths up to 32, except for its hard-coded
16s matching the pixmap size. Text mode is still limited to 9 its 2x2
character cell method and missing adjustments for the gap between
characters, if any.
Cursor heights can be almost anything in graphics modes.
much as possible, by avoiding null ANDs and ORs to the frame buffer.
Mouse cursors are fairly sparse, especially for their frame. Pixels
are written in groups of 8 in planar mode and the per-group sparseness
is not as large, but it still averages about 40% with the current
9x13 mouse cursor. The average drawing time is reduced by about this
amount (from 22 usec constant to 12.5 usec average on Haswell).
This optimization is relatively larger with larger cursors. Width 10
requires 6 frame buffer accesses per line instead of 4 if not done
sparsely, but rarely more than 4 if done sparsely.
mode.
Don't manually unroll the 2 inner loops. On Haswell, doing so gave a
speedup of about 0.5% (about 4 cycles per iteration out of 1400), but
hard-coded a limit of width 9 and made better better optimizations
harder to see. gcc-4.2.1 -O does the unrolling anyway, unless tricked
with a volatile hack. gcc's unrolling is not very good and gives a
a speedup of about half as much (about 2 cycles per iteration). (All
timing on i386.)
Manual unrolling was only feasible because the inner loop only iterates
once or twice. Usually twice, but a dynamic check is needed to decide,
and was not moved from the second-innermost loop manually or by gcc.
This commit basically adds another dynamic check in the inner loop.
Cursor widths of 10-17 require 3 iterations in the inner loop and this
is not so easy to unroll -- even gcc stops at 2.
the method a lot.
Reduce the AND mask to the complement of the cursor's frame, so that area
inside the frame is not drawn first in black and then in lightwhite. The
AND-OR method is only directly suitable for the text mouse image, since
it doesn't go to the hardware there. Planar mode Mouse cursor drawing
takes 10-20 usec on my Haswell system (approx. 100 graphics accesses
at 130 nsec each), so the transient was not visible.
The method used the fancy read mode 1 and its color compare and color
don't care registers with value 0 in them so that all colors matched.
All that this did was make byte reads of frame buffer memory return 0xff,
so that the x86 case could obfuscate read+write as "and". The read must
be done for its side effect on the graphics controller but is not used,
except it must return 0xff to avoid affecting the write when the write
is obfuscated as a read-modify-write "and". Perhaps that was a good
optimization for 8088 CPUs where each extra instruction byte took as
long as a byte memory access.
Just use read+write after removing the fancy read mode. Remove x86
ifdefs that did the "and". After removing the "and" in the non-x86
part of the ifdefs, fix 4 of 6 cases where the shift was wrong.
in unusual cases. Optimize and significantly clean up removal in this
renderer. Optimize removal in the vga direct renderer.
Removal only needs to be done in the border (the part with pixels) in
both cases. The planar renderer used the condition scp->xoff > 0 to
test whether a right border exists. This actually tests for a left
border, and when the total horizontal border is 8 pixels, rounding gives
only a right border. This was the unusual broken case. An example
is easy to configure using something like "vidcontrol -f 8x16 iso-8x16
-g 79x25 MODE_27".
Optimize the planar case a little by only removing 9x13 active pixels
out of 16x16. Optimize it a lot by not doing anything if there is no
overlap with the border. Don't unroll the main loop or hard-code so
many assumptions about font sizes in it. On my Haswell system, graphics
memory and i/o accesses takes about 520 cycles each so optimizations from
unrolling are in the noise.
Optimize the direct case to not do anything if there is no overlap with
the border. Do a sanity check on the saveunder's coordinates. This
requires a previous change to pass non-rounded coordinates.
The previous commit also fixed the coordinates passed to the mouse
removal renderer. The coordinates were rounded down to a character
boundary, and thus essentially unusable. The renderer had to keep
track of the previous position, or clear a larger area. The latter
is only safe in the border, which is all that needs special handling
anyway.
I think no renderer depends on the bug. They have the following
handling:
- gfb sparc64: this seems to assume non-rounded coordinates
- gfb other: does nothing (seems to be missing border handling)
- vga text: does nothing (doesn't need border handling)
- vga planar: clears extras in the border, with some bugs. The fixes
will use the precise coordinates to optimize.
- vga direct: clears at the previous position with no check that it
is active, and clears everything. Checking finds this bug.
- others: are there any?
reduce hard-coded assumptions on font sizes so that the cursor size
can be more independent of the font size. Moving the mouse in the
buggy mode left trails of garbage.
The mouse cursor currently has size 9x13 in all modes. This can occupy
2x3 character cells with 8x8 fonts, but the algorithm was hard-coded
for only 2x2 character cells. Rearrange to hard-code only a maximum
cursor size (now 10x16) and to not hard-code in the logic. The number
of cells needed is now over-estimated in some cases.
2x3 character cells should also be used in text mode with 8x8 fonts
(except with large pixels, the cursor size should be reduced), but
hard-coding for 2x2 in the implementation makes it not very easy to
expand, and it practice it shifts out bits to reduce to 2x2.
In graphics modes, expansion is easier and there is no shifting out
for 9x13 cursors (but 9 is a limit for hard-coding for 2 8-bit VGA
cells wide). A previous commit removed the same buggy hard-coding for
removal at a lower level in planar mode. Another previous commit fixed
the much larger code for lower-level removal in direct mode; this is
independent of the font size so worked for 8x8 fonts. Text mode always
depended on the higher-level removal here, and always worked since
everything was hard-coded consistently for 2x2 character cells.
scteken_init(). Move the internals of scteken_sync() into a local
function to help do this.
scteken_init() reset or adjusted the default attribute and screen
position at least 3 and 5 times, respectively. Warm init shouldn't
do any more than reset the "input" state.
(scterm-sc.c (which still works after minor editing), only resets
the escape state and the saved cursor position, and then does a
nearly-null sync of the current color.)
This mainly broke mode changes, and was most noticeable when the
background color is not teken's default (usually black). Then the
screen gets cleared in the wrong color. vidcontrol restores the
default normal attribute and tries to restore the default reverse
attribute. vidcontrol doesn't clear the screen again after restoring
the attribute(s), and it is too late to do it there without flicker.
Now the default normal attribute is restored before the change affects
the rendering.
When the foreground color is not teken's default, clearing with the
wrong attributes gave strange cursor colors for some cursor types.
The default reverse attribute is not restored since it is unsupported.
2/3 of the clobbering was from 2 resetting window resizing calls. The
second one is needed to restore the size, but must not reset. Window
resizing also sanitizes the cursor position, and after the main reset
resets the window size, the cursor row would often be adjusted from
24 to 23 if it were not already reset to 0. scteken_sync() is good
for restoring the window size and the cursor position in the correct
order, but was unusable at init time since scp->ts is not always
initialized then. Adjust to use its internals.
I didn't notice any problems from the cursor reset. The cursor should
be reset, and a previous fix was to reset it consistently a little
later.
Doing nothing for warm init works almost as well, if not better. It
is not very useful to reset the escape state for mode changes, since
the reset is especially likely to be null then. The escape state is
most likely to be non-initial and corrupted by its most normal uses
-- sloppy non-atomic output where a context switch or just mixing
stdout with stderr splits up escape sequences.
like I hoped, since they are needed for removing parts over the border.
Continue fixing bugs in them.
In the vga planar mode renderer, remove removal of the part of the
image over the text window. This was hard-coded for nearly 8x16 fonts
and in practice didn't remove enough for 8x8 fonts. This used the
wrong attribute over cutmarked regions. The caller refreshes with the
correct attribute later, so the attribute bug only caused flicker.
The caller uses the same hard-coding, so the refreshes fix up all the
spots with the wrong attribute, but keep missing the missed spots.
This still gives trails of bits of cursors for cursor motions in the
affected configurations (mainly depth 4 modes with 8x8) fonts. 8x14
fonts barely escape the problem since although the cursor is drawn
as 16x16, its active part is only 9x13 and the active part fits in
the hard-coded 2x2 character cell window for 8x14 fonts. 8x8 fonts
need a 2x3 window.
In the fb non-sparc64 renderer, the buggy image removal was buggier
and was already avoided by returning before it. Remove it completely
and fix nearby style bugs. It was essentially the same as for the vga
planar mode renderer (obfuscated by swapping x and y). This was buggier
since fb should handle more types of hardware so the hard-coding is
wronger.
The remaining fb image removal is also buggier. It never supported
software cursors drawn into the border, and the hardware cursor is
probably broken by other bugs to be fixed soon.
(that is, in all supported 8, 15, 16 and 24-color modes). Moving the
mouse cursor while holding down a button (giving cut marking) left a
trail of garbage from misremoved mouse cursors (usually colored
rectangles and not cursor shapes). Cases with a button not held down
worked better and may even have worked.
No renderer support for removing (software) mouse cursors is needed
(and many renderers don't have any), since sc_remove_mouse_image()
marks for update the region containing the image and usually much
more. The mouse cursor can be (partially) over as many as 4 character
cells, and removing it in only the 1-4 cells occupied by it would be
best for efficiency and for avoiding flicker. However,
sc_remove_mouse_image() can only mark a single linear region and
usually marks a full row of cells and 1 more to be sure to cover the
4 cells. It always does this, so using the special rendering method
just wastes even more time and gives even more flicker. The special
methods will be removed soon.
The general method always works. vga_pxlmouse_direct() appeared to
defer to it by returning immediately if !on. However,
vga_pxlmouse_direct() actually did foot-shooting using a disguised
saveunder method. Normal order near a mouse move is:
(1) remove the mouse cursor in the renderer (optional)
(2) remove the mouse cursor again and refresh the screen over the
mouse cursor and much more from the vtb. When the mouse has
actually moved and a button is down, many attributes in this
region are changed to be up to date with the new cut marking
(3) draw the keyboard cursor again if it was clobbered by the update
(4) draw the mouse cursor image in its new position.
The bug was to remove the mouse cursor again in step (4), before the
drawing it again in (4), using a saveunder that was valid in step (1)
at best. The quick fix is to use the saveunder in step (1) and not
in step (4). Using it in step (4) also used it before it was
initialized, initially and after mode and screen switches.
in the vga renderer. Removal used stale attributes and didn't try to
merge with the current attribute for cut marking, so special rendering
of cut marking was lost in many cases. The gfb renderer is too broken
to support special rendering of cut marking at all, so this change is
supposed to be just a style fix for it. Remove all traces of the
saveunder method which was used to implement this bug.
Fix drawing of the cursor image in text mode, only in the vga
renderer. This used a stale attribute from the frame buffer instead
of from the saveunder, but did merge with the current attribute for
cut marking so it caused less obvious bugs (subtle misrendering for
the character under the cursor).
The saveunder method may be good in simpler drivers, but in syscons
the 'under' is already saved in a better way in the vtb. Just redraw
it from there, with visible complications for cut marking and
invisible complications for mouse cursors. Almost all drawing
requests are passed a flag 'flip' which currently means to flip to
reverse video for characters in the cut marking region, but should
mean that the the characters are in the cut marking regions so should
be rendered specially, preferably using something better than reverse
video. The gfb renderer always ignores this flag. The vga renderer
ignored it for removal of the text cursor -- the saveunder gave the
stale rendering at the time the cursor was drawn. Mouse cursors need
even more complicated methods. They are handled by drawing them last
and removing them first. Removing them usually redraws many other
characters with the correct cut marking (but transiently loses the
keyboard cursor, which is redrawn soon). This tended to hide the
saveunder bug for forward motions of the keyboard cursor. But slow
backward motions of the keyboard cursor always lost the cut marking,
and fast backwards motions lost in for about 4 in every 5 characters,
depending on races with the scrn_update() timeout handler. This is
because the forward motions are usually into the region redrawn for
the mouse cursor, while backwards motions rarely are.
Text cursor drawing in the vga renderer used also used a
possibly-stale copy of the character and its attribute. The vga
render has the "optimization" of sometimes reading characters from the
screen instead of from the vtb (this was not so good even in 1990 when
main memory was only a few times faster than video RAM). Due to care
in update orders, the character is never stale, but its attribute
might be (just the cut marking part, again due to care in order).
gfb doesn't have the scp->scr pointer used for the "optimization", and
vga only uses this pointer for text mode. So most cases have to
refresh from the vtb, and we can be sure that the ordering of vtb
updates and drawing is as required for this to work.
position. Especially the screen size, and potentially everything except
the input state and attributes. Do this by changing the cursor position
setting method to a general syncing method.
Use proper constructors instead of copying to create kernel terminal
contexts. We really want clones and not new instances, but there is
no method for cloning and there is nothing in the active instance that
needs to be cloned exactly.
Add proper destructors for kernel terminal contexts. I doubt that the
destructor code has every been reached, but if it was then it leaked the
memory of the clones.
Remove freeing of statically allocated memory for the non-kernel terminal
context for the same terminal as the kernel. This is in the nearly
unreachable code. This used to not happen because delicate context
swapping made the user context use the dynamic memory and kernel
context the static memory. I didn't restore this swapping since it
would have been unnatural to have all kernel contexts except 1 dynamic.
The constructor for terminal context has bad layering for reasons
related to the bug. It has to return static memory early before
malloc() works. Callers also can't allocate memory until after the
first constructor selects an emulator and tells upper layers the size
of its context. After that, the cloning hack required the cloning
code to allocate the memory, but for all other constructors it would
be better for the terminal layer to allocate and deallocate the
memory in all cases.
Zero the memory when allocating terminal contexts dynamically.
it to a separate state for each CPU.
Terminal "input" is user or kernel output. Its state includes the current
parser state for escape sequences and multi-byte characters, and some
results of previous parsing (mainly attributes), and in teken the cursor
position, but not completed output. This state must be switched for kernel
output since the kernel can preempt anything, including itself, and this
must not affect the preempted state more than necessary. Since vty0 is
shared, it is necessary to affect the frame buffer and cursor position and
history, but escape sequences must not be affected and attributes for
further output must not be affected.
This used to work. The syscons terminal state contained mainly the parser
state for escape sequences and attributes, but not the cursor position,
and was switched. This was first broken by SMP and/or preemptive kernels.
Then there should really be a separate state for each thread, and one more
for ddb, or locking to prevent preemption. Serialization of printf() helps.
But it is arcane that full syscons escape sequences mostly work in kernel
printf(), and I have never seen them used except by me to test this fix.
They worked perfectly except for the races, since "input" from the kernel
was not special in any way.
This was broken to use teken. The general switch was removed, and the
kernel normal attribute was switched specially. The kernel reverse
attribute (config option SC_CONS_REVERSE_ATTR) became unused, and is
still unusable because teken doesn't support default reverse attributes
(it used to only be used via the ANSI escape sequence to set reverse
video).
The only new difficulty for using teken seems to be that the cursor
position is in the "input" state, so it must be updated in the active
input state for each half of the switch. Do this to complete the
restoration.
The per-CPU state is mainly to make per-CPU coloring work cleanly, at
a cost of some space. Each CPU gets its own full set of attribute
(not just the current attribute) maintained in the usual way. This
also reduces races from unserialized printf()s. However, this gives
races for serialized printf()s that otherwise have none. Nothing
prevents the CPU doing the a printf() changing in the middle of an
escape sequence.
for vt. Restore syscons' rendering of background (bg) brightness as
foreground (fg) blinking and vice versa, and add rendering of blinking
as background brightness to vt.
Bright/saturated is conflated with light/white in the implementation
and in this description.
Bright colors were broken in all cases, but appeared to work in the
only case shown by "vidcontrol show". A boldness hack was applied
only in 1 layering-violation place (for some syscons sequences) where
it made some cases seem to work but was undone by clearing bold using
ANSI sequences, and more seriously was not undone when setting
ANSI/xterm dark colors so left them bright. Move this hack to drivers.
The boldness hack is only for fg brightness. Restore/add a similar hack
for bg brightness rendered as fg blinking and vice versa. This works
even better for vt, since vt changes the default text mode to give the
more useful bg brightness instead of fg blinking.
The brightness bit in colors was unnecessarily removed by the boldness
hack. In other cases, it was lost later by teken_256to8(). Use
teken_256to16() to not lose it. teken_256to8() was intended to be
used for bg colors to allow finer or bg-specific control for the more
difficult reduction to 8; however, since 16 bg colors actually work
on VGA except in syscons text mode and the conversion isn't subtle
enough to significantly in that mode, teken_256to8() is not used now.
There are still bugs, especially in vidcontrol, if bright/blinking
background colors are set.
Restore XOR logic for bold/bright fg in syscons (don't change OR
logic for vt). Remove broken ifdef on FG_UNDERLINE and its wrong
or missing bit and restore the correct hard-coded bit. FG_UNDERLINE
is only for mono mode which is not really supported.
Restore XOR logic for blinking/bright bg in syscons (in vt, add
OR logic and render as bright bg). Remove related broken ifdef
on BG_BLINKING and its missing bit and restore the correct
hard-coded bit. The same bit means blinking or bright bg depending
on the mode, and we want to ignore the difference everywhere.
Simplify conversions of attributes in syscons. Don't pretend to
support bold fonts. Don't support unusual encodings of brightness.
It is as good as possible to map 16 VGA colors to 16 xterm-16
colors. E.g., VGA brown -> xterm-16 Olive will be converted back
to VGA brown, so we don't need to convert to xterm-256 Brown. Teken
cons25 compatibility code already does the same, and duplicates some
small tables. This is mostly for the sc -> te direction. The other
direction uses teken_256to16() which is too generic.
Fix this by using more dynamic initialization with simpler ifdefs for
the machine dependencies. Find a frame buffer address in a more
portable way that at least compiles on sparc64.
user default normal attribute to the current attribute).
This change only fixes a logic error. scterm_clear() used to be
used for terminal reset, but teken uses a general fill function for
that, leaving scterm_clear() only used for initialization and mode
change, when using the user default attribute is correct. It is not
really a terminal function, but needs to sync its changes with the
terminal layer. Syncing of the attribute is currently broken for
terminal reset, but works for initialization and mode change.
some cases of initialization and resetting of the teken cursor position.
(This bad name is consistent with others, but it is too easy to confuse
with scteken_cursor() which goes in the opposite direction.)
The following cases were broken:
- for booting without a syscons console, the teken and sc positions for
ttyv0 were (0, 0), but are supposed to be somewhere in the middle of
the screen (after carefully preserved BIOS and loader messages) (at
least if there is no mode switch that loses the messages).
- after mode switches, the screen is cleared and the cursor is supposed to
be moved to (0, 0), but it was only moved there for sc.
The following case was hacked to work:
- for booting with a syscons console, it was arranged that scteken_init()
for the console could see a nonzero cursor position and adjust, although
this broke the sc seeing it in the non-console case above.
looked like it might handle reverse attributes, but it actually handles
conversion of attributes in the direction indicated by the new name.
Reverse attributes are just broken.
Rename scteken_attr() to scteken_te_to_sc_attr(). scteken_attr() looked
like it might give teken attributes, but it actually gives sc attributes.
Change scteken_te_to_sc_attr() to return int instead of unsigned int.
u_char would be enough, and it promotes to int, and syscons uses int
or u_short for its attributes everywhere else (u_short holds a shifted
form and it promotes to int too).
This change just does cleanups missed in r56043 17 years ago. The
default attributes were still stored in structs for the purpose of
changing them and passing around pointers to the defaults, but r56043
added another layer that made the defaults invariant and only used for
initialization and reset. Just use the defaults directly. This was
already done for the kernel defaults. The defaults for reverse
attributes aren't actually used, but are ignored in layers that no
longer support them.
is unavailable on sparc64 only. This makes the new ec_putc() a non-op
on sparc64 but still calls it. On other non-x86 arches, it should
compile but might not work.
Reported by: gjb
it in emergency in sc_cnputc().
Locking fixes in sc_cnputc() previously turned off normal output in
near-deadlock conditions and added deferred output which might never
be completed. Emergency output goes to the frame buffer using
sufficiently atomic non-blocking writes if the console is in text
mode (in graphics mode, nothing is done, modulo races setting the
graphics mode bit). Screen updates overwrite the emergency output
if the emergency condition clears enough to reach them.
ec_putc() also works for "early" console output in normal x86 text
mode as soon as this mode is initialized (if ever). This uses a
hard-coded x86 frame buffer address before cninit() and a hopefully
MI address after cninit(). But non-x86 is more likely to not support
text mode, when ec_putc() will be null. ec_putc() has no dependencies
of syscons before cninit(), and only has them later to track syscons'
mode changes. This commit doesn't attach ec_putc() for early use.
To test emergency use, put a breakpoint in central syscons output code
like sc_puts() and do some user output. The system used to race or
deadlock in ddb output soon after entry to ddb. The locking fixes
deferred the output until after leaving ddb, so ddb was unusable and
you had to try typing c[ontinue] blindly until it exited, or better use
a serial console in parallel. Now the output goes to a window in the
middle 2/3 of the screen. Scrolling is circular and there is no cursor,
but otherwise ec_putc() provides full dumb terminal functionality and
very fast output that hides artificates from dumb overwrites.
by the CPU number.
This was originally for debugging near-deadlock conditions where
multiple CPUs either deadlock or scramble each other's output trying
to report the problem, but I found it interesting and sometimes
useful for ordinary kernel messages. Ordinary kernel messages
shouldn't be interleaved, but if they are then the colorization
makes them readable even if the interleaving is for every character
(provided the CPU printing each message doesn't change).
The default colors are 8-15 starting at 15 (bright white on black)
for CPU 0 and repeating every 8 CPUs. This works best with 8 CPUs.
Non-bright colors and nonzero background colors need special
configuration to avoid unreadable and ugly combinations so are not
configured by default. The next bright color after 15 is 8 (bright
black = dark gray) is not very readable but is the only other color
used with 2 CPUs. After that the next bright color is 9 (bright
blue) which is not much brighter than bright black, but is used with
3+ CPUs. Other bright colors are brighter.
Colorization is configured by default so that it gets tested. It can
only be turned off by configuring SC_KERNEL_CONS_ATTR to anything other
than FG_WHITE. After booting, all colors can be changed using the
syscons.kattr sysctl. This is a SYSCTL_OPAQUE, and no utility is
provided to change it (sysctl only displays it).
The default colors work in all VGA modes that I could test. In 2-color
graphics modes, all 8 bright colors are displayed as bright white, so
the colorization has no effect, but anything with a nonzero background
gives white on white unless the foreground is zero. I don't have an
mono or VGA grayscale hardware to test on. Support for mono mode seems
to have never worked right in syscons (I think bright white gives white
underline with either bold or bright), but VGA grayscale should work
better than 2-color graphics.
For horizontal (T-axis) wheel reporting which is not supported by
sysmouse protocol kern.evdev.sysmouse_t_axis sysctl is introduced.
It can take following values:
0 - no T-axis events (default)
1 - T-axis events are originated in ums(4) driver.
2 - T-axis events are originated in psm(4) driver.
Submitted by: Vladimir Kondratiev <wulf@cicgroup.ru>
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D8597
important detail that sc_cngetc() now opens and closes the keyboard
on every call again. This was moved from sc_cngetc() to scn_cngrab/
ungrab() in r228644, but the change wasn't quite complete. After
fixes for nesting in kbdd_poll() in ukbd and kbdmux, these opens
and closes should have no significant effect if done while grabbed.
They fix unusual cases when cngetc() is called while not grabbed.
This commit is the main fix for screen locking in sc_cnputc():
detect deadlock or likely-deadlock and handle it by buffering the
output atomically and printing it later if the deadlock condition
clears (and sc_cnputc() is called).
The most common deadlock is when the screen lock is held by ourself.
Then it would be safe to acquire the lock recursively if the console
driver is calling printf() in a safe context, but we don't know when
that is. It is not safe to ignore the lock even in kdb or panic mode.
But ignore it in panic mode. The only other known case of deadlock
is when another thread holds the lock but is running on a stopped CPU.
Detect that case approximately by using trylock and retrying for 1000
usec. On a 4 GHz CPU, 100 usec is almost long enough -- screen switches
take slightly longer than that. Not retrying at all is good enough
except for stress tests, and planned future versions will extend the
timeout so that the stress tests work better.
To see the behaviour when deadlock is detected, single step through
sctty_outwakeup() (or sc_puts() to start with deadlock). Another
(serial) console is needed to the buffered-only output, but the
keyboard works in this context to continue or step out of the
deadlocked region. The buffer is not large enough to hold all the
output for this.
Keyboard input needs Giant locking, and that is not possible to do
correctly here. Use mtx_trylock() and proceed unlocked as before if
we can't acquire Giant (non-recursively), except in kdb mode don't
even try to acquire Giant. Everything here is a hack, but it often
works. Even if mtx_trylock() succeeds, this might be a LOR.
Keyboard input also needs screen locking, to handle screen updates
and switches. Add this, using the same simplistic screen locking
as for sc_cnputc().
Giant must be acquired before the screen lock, and the screen lock
must be dropped when calling the keyboard driver (else it would get a
harmless LOR if it tries to acquire Giant). It was intended that sc
cn open/close hide the locking calls, and they do for i/o functions
functions except for this complication.
Non-console keyboard input is still only Giant-locked, with screen
locking in some called functions. This is correct for the keyboard
parts only.
When Giant cannot be acquired properly, atkbd and kbdmux tend to race
and work (they assume that the caller acquired Giant properly and don't
try to acquire it again or check that it has been acquired, and the
races rarely matter), while ukbd tends to deadlock or panic (since it
does the opposite, and has other usb threads to deadlock with).
The keyboard (Giant) locking here does very little, but the screen
locking completes screen locking for console mode except for not
detecting or handling deadlock.
Restore an splx() lost in r228644. We aren't nearly ready to remove
spl's. They give hints about missing locking. This lost one was
misplaced. Dropping it early for convenience gave race windows for
accesses to the fkey buffer. Giant locking accidentally fixed this
for non-console cases.
Put the spl's around the whole function. Since there are many returns
that would need splx() just before them for a direct fix, split the
function into a wrapper that does the spl's and a "locked" function
that does the work.
Return earlier when no keyboard is attached to match the ordering in a
planned version. This breaks the dubious feature of returning keys
from the fkey buffer after the keyboard has gone away. Losing the keys
wouldn't matter, but we keep them too long now.
just use the same mutex locking as sc cn putc so they have the same
defects.
The locking calls to acquire the lock are actually in sc cn open and close.
Ungrab has to unlock, although this opens a race window.
Change the direct mutex lock calls in sc cn putc to the new locking
functions via the open and close functions. Putc also has to unlock, but
doesn't keep the screen open like grab. Screen open and close reduce to
locking, except screen open for grab also attempts to switch the screen.
Keyboard locking is more difficult and still null, even when keyboard
input calls screen functions, except some of the functions have locks
too deep to work right.
This organization gives a single place to fix some of the locking.
syscons spinlock for the output routine alone. It is better to extend
the coverage of the first syscons spinlock added in r162285. 2 locks
might work with complicated juggling, but no juggling was done. What
the 2 locks actually did was to cover some of the missing locking in
each other and deadlock less often against each other than a single
lock with larger coverage would against itself. Races are preferable
to deadlocks here, but 2 locks are still worse since they are harder
to understand and fix.
Prefer deadlocks to races and merge the second lock into the first one.
Extend the scope of the spinlocking to all of sc_cnputc() instead of
just the sc_puts() part. This further prefers deadlocks to races.
Extend the kdb_active hack from sc_puts() internals for the second lock
to all spinlocking. This reduces deadlocks much more than the other
changes increases them. The s/p,10* test in ddb gets much further now.
Hide this detail in the SC_VIDEO_LOCK() macro. Add namespace pollution
in 1 nested #include and reduce namespace pollution in other nested
#includes to pay for this.
Move the first lock higher in the witness order. The second lock was
unnaturally low and the first lock was unnaturally high. The second
lock had to be above "sleepq chain" and/or "callout" to avoid spurious
LORs for visual bells in sc_puts(). Other console driver locks are
already even higher (but not adjacent like they should be) except when
they are missing from the table. Audio bells also benefit from the
syscons lock being high so that audio mutexes have chance of being
lower. Otherwise, console drviver locks should be as low as possible.
Non-spurious LORs now occur if the bell code calls printf() or is
interrupted (perhaps by an NMI) and the interrupt handler calls
printf(). Previous commits turned off many bells in console i/o but
missed ones done by the teken layer.
indicate (potentially partial) success of the open. Use these to
decide what to close in sccnclose(). Only grab/ungrab use open/close
so far.
Add a per-sc variable to count successful keyboard opens and use
this instead of the grab count to decide if the keyboad state has
been switched.
Start fixing the locking by using atomic ops for the most important
counter -- the grab level one. Other racy counting will eventually
be fixed by normal mutex or kdb locking in most cases.
Use a 2-entry per-sc stack of states for grabbing. 2 is just enough
to debug grabbing, e.g., for gets(). gets() grabs once and might not
be able to do a full (or any) state switch. ddb grabs again and has
a better chance of doing a full state switch and needs a place to
stack the previous state. For more than 3 levels, grabbing just
changes the count. Console drivers should try to switch on every i/o
in case lower levels of nesting failed to switch but the current level
succeeds, but then the switch (back) must be completed on every i/o
and this flaps the state unless the switch is null. The main point
of grabbing is to make it null quite often. Syscons grabbing also
does a carefully chosen screen focus that is not done on every i/o.
Add a large comment about grabbing.
Restore some small lost comments.
- in sccnopen(), open the keyboard before the screen. The keyboard
currently requires Giant (although it must be spinlocked to work
correctly as a console), so the previous order would be a LOR if
it has any semblance of locking.
- add a (currently dummy) state arg to scgetc().
close functions. Scattered calls to sc_cnputc() and sc_cngetc() were
broken by turning the semi-reentrant inline context-switching code in
these functions into the grabbing functions. cncheckc() calls for
panic dumps are the main broken case. The grabbing functions have
special behaviour (mainly screen switching in sc_cngrab()) which makes
them unsuitable as replacements for the inline code.
Simply change the mode to K_XLATE using a local variable and use the
grab level as a flag to tell screen switches not to change it again,
so that we don't need to switch scp->kbd_mode. We did the latter,
but didn't have the complications to update the keyboard mode switch
for every screen switch. sc->kbd_mode remains at its user setting
for all scp's and ungrabbing restores to it.
Like scr_lock, the grab count needs to be per-physical-device to work.
This bug corrupted the grab count on both vtys if the ungrabbed vty is
different from the console, and failed to restore the keyboard state
on the ungrabbed vty, but not restoring it usually left the keyboard
mode part of the keyboard state uncorrupted at 1 (K_XLATE), while
after this fix the keyboard mode part is usually corrupted to 0 (K_RAW).
While here, rename the grab count from grabbed to grab_level.
This bug corrupted the grab count on both vtys if the ungrabbed vty is
different from the console, and failed to restore the keyboard state
on the ungrabbed vty, but not restoring the latter usually left the
keyboard mode part of it uncorrupted at 1 (K_XLATE), while after this
fix the keyboard mode part is usually corrupted to 0 (K_RAW).
While here, rename the grab count from 'grabbed' to grab_level.
- never call up to the tty layer to restart output for keyboard input in
console mode. This was already disallowed in kdb mode. Other cases
are rarely reached.
- disable the reboot, halt and powerdown keys in console mode. The suspend,
standby and panic keys are still allowed, and aren't even conditonal
on excessive configuration options. Some of these actions are still
available in ddb mode as ddb commands which are equally unsafe. Some
are useful at input prompts and should be restored when the locking is
fixed.
- disallow bells in kdb mode (should be in console mode, but the flag for
that is not available). Visual bell gives very alarming behaviour by
trying to use callouts which don't work in kdb mode. Audio bell uses
timeouts and hardware resources with mutexes that can deadlock in
reasonable use of ddb.
Screen switches in kdb mode are not very safe, but they are important
functionality and there is a lot of code to make them sort of work.