on i486's (and probably i386's), but it has had very little effect
since gcc-2.7 or gcc-2.95. With gcc-3.3, it gave a small
pessimization for at least i386's, athlon-xp's and pentium4's, a
small optimization (I think) for pentium1's, and made no difference
for i386's. (movzbl is best for all the later processors, and the
micro-optimization was to stop it being used on i486's.)
This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.
Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.
(on an i486, 10 cycles (+ cache misses) instead of 15). The
change should be a no-op if the compiler is any good. The best
possible i*86 code for the same algorithm is only 1 more cycle
faster on i486's so I don't want to bother implementing an
assembler version.
scanc() is a bottleneck for OPOST processing. It is naturally
about 4 times as slow as bcopy() on 32-bit systems.