so that it can be faster for tiny x and avoided for reduced x.
This improves things a little differently than for cosine and sine.
We still need to reclassify x in the "kernel" functions, but we get
an extra optimization for tiny x, and an overall optimization since
tiny reduced x rarely happens. We also get optimizations for space
and style. A large block of poorly duplicated code to fix a special
case is no longer needed. This supersedes the fixes in k_sin.c revs
1.9 and 1.11 and k_sinf.c 1.8 and 1.10.
Fixed wrong constant for the cutoff for "tiny" in tanf(). It was
2**-28, but should be almost the same as the cutoff in sinf() (2**-12).
The incorrect cutoff protected us from the bugs fixed in k_sinf.c 1.8
and 1.10, except 4 cases of reduced args passed the cutoff and needed
special handling in theory although not in practice. Now we essentially
use a cutoff of 0 for the case of reduced args, so we now have 0 special
args instead of 4.
This change makes no difference to the results for sinf() (since it
only changes the algorithm for the 4 special args and the results for
those happen not to change), but it changes lots of results for sin().
Exhaustive testing is impossible for sin(), but exhaustive testing
for sinf() (relative to a version with the old algorithm and a fixed
cutoff) shows that the changes in the error are either reductions or
from 0.5-epsilon ulps to 0.5+epsilon ulps. The new method just uses
some extra terms in approximations so it tends to give more accurate
results, and there are apparently no problems from having extra
accuracy. On amd64 with -O1, on all float args the error range in ulps
is reduced from (0.500, 0.665] to [0.335, 0.500) in 24168 cases and
increased from 0.500-epsilon to 0.500+epsilon in 24 cases. Non-
exhaustive testing by ucbtest shows no differences.