bde 5d04e0e4d3 Use a shadow buffer and never read from the frame buffer. Remove large slow
code for reading from the frame buffer.

Reading from the frame buffer is usually much slower than writing to
the frame buffer.  Typically 10 to 100 times slower.  It old modes,
it takes many more PIOs, and in newer modes with no PIOs writes are
often write-combined while reads remain uncached.

Reading from the frame buffer is not very common, so this change doesn't
give speedups of 10 to 100 times.  My main test case is a floodfill()
function that reads about as many pixels as it writes.  The speedups
are typically a factor of 2 to 4.

Duplicating writes to the shadow buffer is slower when no reads from the
frame buffer are done, but reads are often done for the pixels under the
mouse cursor, and doing these reads from the shadow buffer more than
compensates for the overhead of writing the shadow buffer in at least the
slower modes.  Management of the mouse cursor also becomes simpler.

The shadow buffer doesn't take any extra memory, except twice as much
in old 4-plane modes.  A buffer for holding a copy of the frame buffer
was allocated up front for use in the screen switching signal handler.
This wasn't changed when the handler was made async-signal safe.  Use
the same buffer the shadow (but make it twice as large in the 4-plane
modes), and remove large special code for writing it as well as large
special code for reading ut.  It used to have a rawer format in the
4-plane modes.  Now it has a bitmap format which takes twice as much
memory but can be written almost as fast without special code.

VIDBUFs that are not the whole frame buffer were never supported, and the
change depends on this.  Check for invalid VIDBUFs in some places and do
nothing.  The removed code did something not so good.
2019-04-21 16:17:35 +00:00
..