2916ad3e28
This makes little difference in float precision, but in double precision it gives a speedup of about 30% on amd64 (A64 CPU) and i386 (A64). The gain depends on fabs[f]() being inline and efficient. The bit fiddling (or any use of SET_HIGH_WORD(), which libm does too much because it was best on old 32-bit machines) always causes packing overheads and sometimes causes stalls in the packing, since in the double precision case it operates on only part of a variable. It apparently did cause stalls in a critical path here.
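The contrast described in the commit message can be illustrated with a minimal sketch. This is not the code changed by the commit: the function names `abs_by_high_word` and `abs_by_fabs` are hypothetical, and the first one only imitates the SET_HIGH_WORD() style by rewriting the upper 32 bits of a double, using `memcpy` rather than libm's private macros.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical illustration of the "bit fiddling" style: clear the sign
 * bit by modifying only the high 32-bit word of the double. Touching part
 * of the variable is what can force it to be unpacked and repacked.
 */
static double abs_by_high_word(double x)
{
    uint64_t bits;
    uint32_t hi;

    memcpy(&bits, &x, sizeof bits);            /* unpack the double */
    hi = (uint32_t)(bits >> 32);               /* take the high word */
    hi &= 0x7fffffffu;                         /* clear the sign bit */
    bits = ((uint64_t)hi << 32) | (bits & 0xffffffffu);
    memcpy(&x, &bits, sizeof x);               /* repack the double */
    return x;
}

/*
 * Hypothetical illustration of the alternative: let the compiler handle
 * the whole value through fabs(), which it can usually inline.
 */
static double abs_by_fabs(double x)
{
    return fabs(x);
}
```

Both functions return the same value for any input. Whether the second is actually faster depends on the compiler and CPU, as the commit message notes for fabs[f]() being inline and efficient.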
| Name |
|---|
| amd64 |
| arm |
| bsdsrc |
| i387 |
| ia64 |
| ld80 |
| ld128 |
| man |
| powerpc |
| sparc64 |
| src |
| Makefile |
| Symbol.map |