Oops, on amd64 (and probably on all non-i386 systems), the previous

commit broke the 2**24 cases where |x| > DBL_MAX/2.  There are exponent
range problems not just for denormals (underflow) but for large values
(overflow).  Doubles have more than enough exponent range to avoid the
problems, but I forgot to convert enough terms to double, so there was
an x+x term which was sometimes evaluated in float precision.

Unfortunately, this is a pessimization with some combinations of systems
and compilers (it makes no difference on Athlon XP's, but on Athlon64's
it gives a 5% pessimization with gcc-3.4 but not with gcc-3.3).

Exlain the problem better in comments.
This commit is contained in:
Bruce Evans 2006-01-05 09:18:48 +00:00
parent 79a7950c48
commit fd2891004d

View File

@ -53,16 +53,21 @@ cbrtf(float x)
} else
SET_FLOAT_WORD(t,sign|(hx/3+B1));
/* first step Newton iteration (solving t*t-x/t == 0) to 16 bits */
/* in double precision to avoid problems with denormals */
/*
* First step Newton iteration (solving t*t-x/t == 0) to 16 bits. In
* double precision so that its terms can be arranged for efficiency
* without causing overflow or underflow.
*/
T=t;
r=T*T*T;
T=T*(x+x+r)/(x+r+r);
T=T*((double)x+x+r)/(x+r+r);
/* second step Newton iteration to 47 bits */
/* in double precision for accuracy */
/*
* Second step Newton iteration to 47 bits. In double precision for
* efficiency and accuracy.
*/
r=T*T*T;
T=T*(x+x+r)/(x+r+r);
T=T*((double)x+x+r)/(x+r+r);
/* rounding to 24 bits is perfect in round-to-nearest mode */
return(T);