freebsd-dev

Bruce Evans 16638b5585 Optimized by eliminating the special case for 0.67434 <= |x| < pi/4. A single polynomial approximation for tan(x) works in infinite precision up to |x| < pi/2, but in finite precision, to restrict the accumulated roundoff error to < 1 ulp, |x| must be restricted to less than about sqrt(0.5/((1.5+1.5)/3)) ~= 0.707. We restricted it a bit more to give a safety margin including some slop for optimizations. Now that we use double precision for the calculations, the accumulated roundoff error is in double-precision ulps so it can easily be made almost 2**29 times smaller than a single-precision ulp. Near x = pi/4 its maximum is about 0.5+(1.5+1.5)*x**2/3 ~= 1.117 double-precision ulps. The minimax polynomial needs to be different to work for the larger interval. I didn't increase its degree; the old degree is just large enough to keep the final error less than 1 ulp and increasing the degree would be a pessimization. The maximum error is now ~0.80 ulps instead of ~0.53 ulps. The speedup from this optimization for uniformly distributed args in [-2pi, 2pi] is 28-43% on athlons, depending on how badly gcc selected and scheduled the instructions in the old version. The old version has some int-to-float conversions that are apparently difficult to schedule well, but gcc-3.3 somehow did everything ~10 cycles or ~10% faster than gcc-3.4, with the difference especially large on AXPs. On A64s, the problem seems to be related to documented penalties for moving single precision data to undead xmm registers. With this version, the speed in cycles is almost independent of the athlon and gcc version despite the large differences in instruction selection to use the FPU on AXPs and SSE on A64s.		2005-11-24 02:04:26 +00:00
..
alpha	Replace fegetmask() and fesetmask() with feenableexcept(),	2005-03-16 19:03:46 +00:00
amd64	Add a missing ldexpf() alias for amd64.	2005-09-12 20:54:00 +00:00
arm	Replace fegetmask() and fesetmask() with feenableexcept(),	2005-03-16 19:03:46 +00:00
bsdsrc	Removed an unused declaration which was so old that it wasn't a prototype	2005-11-18 05:03:12 +00:00
i387	Fixed some comments added in rev.1.5.	2005-10-30 12:21:02 +00:00
ia64	Replace fegetmask() and fesetmask() with feenableexcept(),	2005-03-16 19:03:46 +00:00
man	-mdoc sweep.	2005-11-17 13:00:00 +00:00
powerpc	Replace fegetmask() and fesetmask() with feenableexcept(),	2005-03-16 19:03:46 +00:00
sparc64	Replace fegetmask() and fesetmask() with feenableexcept(),	2005-03-16 19:03:46 +00:00
src	Optimized by eliminating the special case for 0.67434 <= |x| < pi/4.	2005-11-24 02:04:26 +00:00
Makefile	Detach k_rem_pio2f.c from the build since it is now unused. It is a libm	2005-11-06 17:59:40 +00:00
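
The roundoff arithmetic quoted in the latest commit message can be checked numerically. The following is only a sketch in Python, not code from this tree: the function and constant names are hypothetical, the error model 0.5+(1.5+1.5)*x**2/3 is taken from the commit message as written, and the 2**29 factor is assumed to come from the difference between the 52 fraction bits of IEEE 754 double precision and the 23 of single precision.

```python
import math

# Fraction (explicit mantissa) bits of IEEE 754 binary32 and binary64.
FLOAT_FRACTION_BITS = 23
DOUBLE_FRACTION_BITS = 52

# How many double-precision ulps fit in one single-precision ulp
# at the same magnitude: 2**(52 - 23) = 2**29.
ULP_RATIO = 2 ** (DOUBLE_FRACTION_BITS - FLOAT_FRACTION_BITS)


def roundoff_bound_ulps(x: float) -> float:
    """Error model quoted in the commit message, in ulps of the
    precision used for the calculation (hypothetical helper name)."""
    return 0.5 + (1.5 + 1.5) * x**2 / 3


def max_arg_for_one_ulp() -> float:
    """Largest |x| for which the model above stays at or below 1 ulp,
    i.e. the sqrt(0.5/((1.5+1.5)/3)) expression from the commit message."""
    return math.sqrt(0.5 / ((1.5 + 1.5) / 3))


if __name__ == "__main__":
    limit = max_arg_for_one_ulp()
    at_quarter_pi = roundoff_bound_ulps(math.pi / 4)
    print(f"|x| limit for < 1 ulp:           {limit:.3f}")
    print(f"bound at pi/4 (double ulps):     {at_quarter_pi:.3f}")
    # The same bound expressed in single-precision ulps is tiny.
    print(f"bound at pi/4 (single ulps):     {at_quarter_pi / ULP_RATIO:.3e}")
```

Under these assumptions the model exceeds 1 ulp at pi/4 when the ulps are those of the working precision, which matches the stated reason for the old 0.67434 cutoff; once the work is done in double precision, the same bound is a negligible fraction of a single-precision ulp.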