This defeats the point of log1p(). ucbtest reports errors of +-5e+15
ULPs. A correct version would use the i387 fyl2xp1 instruction for
small x and maybe scale to small x. The C version does the scaling
reasonably efficiently, and fyl2px1 is slow (at least on P5s), so not
much is lost by always using the C version (only 25% for small x even
with the broken i387 version; 50% for large x).