As usual, use a minimax polynomial that is specialized for float

precision. The new polynomial has degree 4 instead of 10, and a maximum error of 2**-30.04 ulps instead of 2**-33.15. This doesn't affect the final error significantly; the maximum error was and is about 0.5015 ulps on i386 -O1, and the number of cases with an error of > 0.5 ulps is increased from 13851 to 14407. Note that the error is only this close to 0.5 ulps due to excessive extra precision caused by compiler bugs on i386. The extra precision could be obtained intentionally, and is useful for keeping the error of the hyperbolic float functions below 1 ulp, since these functions are implemented using expm1f. My recent change for scaling by 2**k had the unintentional side effect of retaining extra precision for longer, so callers of expm1f see errors of more like 0.0015 ulps than 0.5015 ulps, and for the hyperbolic functions this reduces the maximum error from nearly about 2 ulps to about 0.75 ulps. This is about 10% faster on i386 (A64). expm1* is still very slow, but now the float version is actually significantly faster. The algorithm is very sophisticated but not very good except on machines with fast division.
svn path=/head/; revision=176132
2008-02-09 12:53:15 +00:00 · 2008-02-09 12:53:15 +00:00 · 52453261e9 · 2020-12-20 02:59:44 +00:00
commit 52453261e9
parent d57786ec68
1 changed files with 8 additions and 7 deletions
--- a/lib/msun/src/s_expm1f.c
+++ b/lib/msun/src/s_expm1f.c
@ -27,12 +27,13 @@ o_threshold	= 8.8721679688e+01,/* 0x42b17180 */
 ln2_hi		= 6.9313812256e-01,/* 0x3f317180 */
 ln2_lo		= 9.0580006145e-06,/* 0x3717f7d1 */
 invln2		= 1.4426950216e+00,/* 0x3fb8aa3b */
-	/* scaled coefficients related to expm1 */
-Q1  =  -3.3333335072e-02, /* 0xbd088889 */
-Q2  =   1.5873016091e-03, /* 0x3ad00d01 */
-Q3  =  -7.9365076090e-05, /* 0xb8a670cd */
-Q4  =   4.0082177293e-06, /* 0x36867e54 */
-Q5  =  -2.0109921195e-07; /* 0xb457edbb */
+/*
+ * Domain [-0.34568, 0.34568], range ~[-6.694e-10, 6.696e-10]:
+ * |6 / x * (1 + 2 * (1 / (exp(x) - 1) - 1 / x)) - q(x)| < 2**-30.04
+ * Scaled coefficients: Qn_here = 2**n * Qn_for_q (see s_expm1.c):
+ */
+Q1 = -3.3333212137e-2,		/* -0x888868.0p-28 */
+Q2 =  1.5807170421e-3;		/*  0xcf3010.0p-33 */

 float
 expm1f(float x)
@ -86,7 +87,7 @@ expm1f(float x)
    /* x is now in primary range */
 	hfx = (float)0.5*x;
 	hxs = x*hfx;
-	r1 = one+hxs*(Q1+hxs*(Q2+hxs*(Q3+hxs*(Q4+hxs*Q5))));
+	r1 = one+hxs*(Q1+hxs*Q2);
 	t  = (float)3.0-r1*hfx;
 	e  = hxs*((r1-t)/((float)6.0 - x*t));
 	if(k==0) return x - (x*e-hxs);		/* c is 0 */