Fixed range reduction near (but not very near) +-pi/2. A bug caused

a maximum error of 2.905 ulps for cosf(), but the algorithm for cosf()
is good for < 1 ulps and happens to give perfect rounding (< 0.5 ulps)
near +-pi/2 except for the bug.  The extra relative errors for tanf()
were similar (slightly larger).  The bug didn't affect sinf() since
sinf'(+-pi/2) is 0.

For range reduction in ~[-3pi/4, -pi/4] and ~[pi/4, 3pi/4] we must
subtract +-pi/2 and the only complication is that this must be done
in extra precision.  We have handy 17+24-bit and 17+17+24-bit
approximations to pi/2.  If we always used the former then we would
lose up to 24 bits of accuracy due to cancelation of leading bits, but
we need to keep at least 24 bits plus a guard digit or 2, and should
keep as many guard bits as efficiency permits.  So we used the
less-precise pi/2 not very near +-pi/2 and switched to using the
more-precise pi/2 very near +-pi/2.  However, we got the threshold for
the switch wrong by allowing 19 bits to cancel, so we ended up with
only 21 or 22 bits of accuracy in some cases, which is even worse than
naively subtracting pi/2 would have done.

Exhaustive checking shows that allowing only 17 bits to cancel (min.
accuracy ~24 bits) is sufficient to reduce the maximum error for cosf()
near +-pi/2 to 0.726 ulps, but allowing only 6 bits to cancel (min.
accuracy ~35-bits) happens to give perfect rounding for cosf() at
little extra cost so we prefer that.

We actually (in effect) allow 0 bits to cancel and always use the
17+17+24-bit pi/2 (min. accuracy ~41 bits).  This is simpler and
probably always more efficient too.  Classifying args to avoid using
this pi/2 when it is not needed takes several extra integer operations
and a branch, but just using it takes only 1 FP operation.

The patch also fixes misspelling of 17 as 24 in many comments.

For the double-precision version, the magic numbers include 33+53 bits
for the less-precise pi/2 and (53-32-1 = 20) bits being allowed to
cancel, so there are ~33-20 = 13 guard bits.  This is sufficient except
probably for perfect rounding.  The more-precise pi/2 has 33+33+53
bits and we still waste time classifying args to avoid using it.

The bug is apparently from mistranslation of the magic 32 in 53-32-1.
The number of bits allowed to cancel is not critical and we use 32 for
double precision because it allows efficient classification using a
32-bit comparison.  For float precision, we must use an explicit mask,
and there are fewer bits so there is less margin for error in their
allocation.  The 32 got reduced to 4 but should have been reduced
almost in proportion to the reduction of mantissa bits.
This commit is contained in:
Bruce Evans 2005-10-08 22:43:55 +00:00
parent 8eeb2ca6bf
commit a7b8acac04
Notes: svn2git 2020-12-20 02:59:44 +00:00
svn path=/head/; revision=151110

View File

@ -98,29 +98,17 @@ pio2_3t = 6.1232342629e-17; /* 0x248d3132 */
if(ix<=0x3f490fd8) /* |x| ~<= pi/4 , no need for reduction */
{y[0] = x; y[1] = 0; return 0;}
if(ix<0x4016cbe4) { /* |x| < 3pi/4, special case with n=+-1 */
/* 17+17+24 bit pi has sufficient precision and best efficiency */
if(hx>0) {
z = x - pio2_1;
if((ix&0xfffffff0)!=0x3fc90fd0) { /* 24+24 bit pi OK */
y[0] = z - pio2_1t;
y[1] = (z-y[0])-pio2_1t;
} else { /* near pi/2, use 24+24+24 bit pi */
z -= pio2_2;
y[0] = z - pio2_2t;
y[1] = (z-y[0])-pio2_2t;
}
z = (x - pio2_1) - pio2_2;
y[0] = z - pio2_2t;
y[1] = (z-y[0])-pio2_2t;
return 1;
} else { /* negative x */
z = x + pio2_1;
if((ix&0xfffffff0)!=0x3fc90fd0) { /* 24+24 bit pi OK */
y[0] = z + pio2_1t;
y[1] = (z-y[0])+pio2_1t;
} else { /* near pi/2, use 24+24+24 bit pi */
z += pio2_2;
y[0] = z + pio2_2t;
y[1] = (z-y[0])+pio2_2t;
}
z = (x + pio2_1) + pio2_2;
y[0] = z + pio2_2t;
y[1] = (z-y[0])+pio2_2t;
return -1;
}
}
if(ix<=0x43490f80) { /* |x| ~<= 2^7*(pi/2), medium size */
t = fabsf(x);