Bruce Evans c5964538b7 Use a minimax polynomial approximation instead of a Pade rational
function approximation for the second step.  The polynomial has degree
2 for cbrtf() and 4 for cbrt().  These degrees are minimal for the final
accuracy to be essentially the same as before (slightly smaller).
Adjust the rounding between steps 2 and 3 to match.  Unfortunately,
for cbrt(), this breaks the claimed accuracy slightly although incorrect
rounding doesn't.  Claim less accuracy since its not worth pessimizing
the polynomial or relying on exhaustive testing to get insignificantly
more accuracy.

This saves about 30 cycles on Athlons (mainly by avoiding 2 divisions)
so it gives an overall optimization in the 10-25% range (a larger
percentage for float precision, especially in 32-bit mode, since other
overheads are more dominant for double precision, surprisingly more
in 32-bit mode).
2005-12-19 00:22:03 +00:00
..
2005-11-24 09:25:10 +00:00