registers as volatile. Instructions that *wrote* to FP state were
already marked volatile, but apparently gcc has license to move
non-volatile asms past volatile asms. This broke amd64's feupdateenv
at -O2 due to a WAR conflict between fnstsw and fldenv there.
avoid easily avoidable loss of precision when |x| is nearly 1.
Extended (64-bit) precision only moves the meaning of "nearly" here.
This probably could be done better by splitting up the range into
|x| <= 0.5 and |x| > 0.5 like the C version. However, ucbtest
does't report any errors in this version. Perhaps the C version
should be used anyway. It's only 25% slower now on a P5, provided
the C version of sqrt() isn't used, and the C version could be
optimized better.
Errors checked by: ucbtest
This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.
Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.
The fyl2xp1 instruction has such a limited range:
-(1 - (sqrt(2) / 2)) <= x <= sqrt(2) - 1
it's not worth trying to use it.
Also, I'm not sure fyl2xp1's extra precision will
matter once the result is converted from extended
real (80 bits) back to double real (64 bits).
Reviewed by: jkh
Submitted by: jtc
-- Begin comments from J.T. Conklin:
The most significant improvement is the addition of "float" versions
of the math functions that take float arguments, return floats, and do
all operations in floating point. This doesn't help (performance)
much on the i386, but they are still nice to have.
The float versions were orginally done by Cygnus' Ian Taylor when
fdlibm was integrated into the libm we support for embedded systems.
I gave Ian a copy of my libm as a starting point since I had already
fixed a lot of bugs & problems in Sun's original code. After he was
done, I cleaned it up a bit and integrated the changes back into my
libm.
-- End comments
Reviewed by: jkh
Submitted by: jtc