freebsd-nq

Author	SHA1	Message	Date
Steve Kargl	8cc74771f2	ld80/s_expl.c: * Use integral numerical constants, and let the compiler do the conversion to long double. ld128/s_expl.c: * Use integral numerical constants, and let the compiler do the conversion to long double. * Use the ENTERI/RETURNI macros, which are no-ops on ld128. This however makes the ld80 and ld128 identical. Reviewed by: bde (as part of larger diff)	2013-06-03 19:13:44 +00:00
Steve Kargl	35cbca6a7f	Micro-optimization: move the unary minus operator to operate on a literal constant. Obtained from: bde	2013-06-03 18:57:35 +00:00
Steve Kargl	a3f70b4ed8	Add a comment to note that bde supplied most, if not all, of the optimizations.	2013-06-03 18:53:40 +00:00
Steve Kargl	1783063f18	ld80/s_expl.c: * In the special case x = -Inf or -NaN, use a micro-optimization to eliminate the need to access u.xbits.man. * Fix an off-by-one for small arguments |x| < 0x1p-65. ld128/s_expl.c: * In the special case x = -Inf or -NaN, use a micro-optimization to eliminate the need to access u.xbits.manh and u.xbits.manl. * Fix an off-by-one for small arguments |x| < 0x1p-114. Obtained from: bde	2013-06-03 18:51:34 +00:00
Steve Kargl	31407861b8	ld80/s_expl.c: * Update the evaluation of the polynomial. This allows the removal of the now unused variables t23 and t45. ld128/s_expl.c: * Update the evaluation of the polynomial and the intermediate result t. This update allows several numerical constants to be written as double rather than long double constants. Update the constants as appropriate. Obtained from: bde	2013-06-03 18:40:00 +00:00
Steve Kargl	f3049ab5f3	Update a comment to reflect that we are using an endpoint of an interval instead of a midpoint.	2013-06-03 18:14:18 +00:00
Steve Kargl	4aa8c9453f	Introduce the macro LOG2_INTERVAL, which is log2(number of intervals). Use the macro as a micro-optimization to convert a subtraction and division to a shift. Obtained from: bde	2013-06-03 17:51:08 +00:00
Steve Kargl	03e1315345	Whitespace.	2013-06-03 17:40:52 +00:00
Steve Kargl	bb23de67bb	* Rename the polynomial coefficients from P2, P3, ... to A2, A3, .... The names now coincide with the name used in PTP Tang's paper. * Rename the variable from s to tbl to better reflect that this is a table, and to be consistent with the naming scheme in s_exp2l.c Reviewed by: bde (as part of larger diff)	2013-06-03 17:36:26 +00:00
Steve Kargl	a1d69112c1	ld80/s_expl.c: * Update Copyright years to include 2013. ld128/s_expl.c: * Correct and update Copyright years. This code originated from the ld80 version, so it should reflect the same time period. Reviewed by: bde (as part of larger diff)	2013-06-03 17:21:43 +00:00
David Schultz	25a4d6bfda	Add logl, log2l, log10l, and log1pl. Submitted by: bde	2013-06-03 09:14:31 +00:00
David Schultz	7dbbb6dde3	Fix some regressions caused by the switch from gcc to clang. The fixes are workarounds for various symptoms of the problem described in clang bugs 3929, 8100, 8241, 10409, and 12958. The regression tests did their job: they failed, someone brought it up on the mailing lists, and then the issue got ignored for 6 months. Oops. There may still be some regressions for functions we don't have test coverage for yet.	2013-05-27 08:50:10 +00:00
Steve Kargl	dba466c344	* ld80/s_expl.c: . Fix the threshold for expl(x) where |x| is small. . Also update the previously incorrect comment to match the new threshold. * ld128/s_expl.c: . Re-order logic in exceptional cases to match the logic used in other long double functions. . Fix the threshold for expl(x) where |x| is small. . Also update the previously incorrect comment to match the new threshold. Submitted by: bde Approved by: das (mentor)	2012-09-23 18:32:03 +00:00
Steve Kargl	8f647ffd7f	* ld80/s_expl.c: . Guard a comment from reformatting by indent(1). . Re-order variables in declarations to alphabetical order. . Remove a banal comment. * ld128/s_expl.c: . Add a comment to point to ld80/s_expl.c for implementation details. . Move the #define of INTERVAL to reduce the diff with ld80/s_expl.c. . twom10000 does not need to be volatile, so move its declaration. . Re-order variables in declarations to alphabetical order. . Add a comment that describes the argument reduction. . Remove the same banal comment found in ld80/s_expl.c. Reviewed by: bde Approved by: das (mentor)	2012-09-23 18:06:27 +00:00
Steve Kargl	ca50c4b871	Whitespace. Submitted by: bde Approved by: das (pre-approved)	2012-07-30 21:55:49 +00:00
Steve Kargl	8345cbd275	Replace the macro name NUM with INTERVALS. This change provides compatibility with the INTERVALS macro used in the soon-to-be-committed expm1l() and someday-to-be-committed log*l() functions. Add a comment into ld128/s_expl.c noting a gcc issue that was deleted when rewriting ld80/e_expl.c as ld128/s_expl.c. Requested by: bde Approved by: das (mentor)	2012-07-26 04:05:08 +00:00
Steve Kargl	f7cfe68f59	* ld80/expl.c: . Remove a few #ifdefs that should have been removed in the initial commit. . Sort fpmath.h to its rightful place. * ld128/s_expl.c: . Replace EXPMASK with its actual value. . Sort fpmath.h to its rightful place. Requested by: bde Approved by: das (mentor)	2012-07-26 03:59:33 +00:00
Steve Kargl	b83ccea32c	Compute the exponential of x for Intel 80-bit format and IEEE 128-bit format. These implementations are based on PTP Tang, "Table-driven implementation of the exponential function in IEEE floating-point arithmetic," ACM Trans. Math. Soft., 15, 144-157 (1989). PR: standards/152415 Submitted by: kargl Reviewed by: bde, das Approved by: das (mentor)	2012-07-23 19:13:55 +00:00
Steve Kargl	9aa461b570	Clean up the unneeded cpp macro INLINE_REM_PIO2L. Reviewed by: das Approved by: das (mentor)	2011-05-30 19:41:28 +00:00
Steve Kargl	c273267e83	Improve the accuracy from a max ULP of ~2000 to max ULP < 0.79 on i386-class hardware for sinl and cosl. The hand-rolled argument reductions have been replaced by e_rem_pio2l() implementations. To preserve history the following commands have been executed: svn cp src/e_rem_pio2.c ld80/e_rem_pio2l.h mv ${HOME}/bde/ld80/e_rem_pio2l.c ld80/e_rem_pio2l.h svn cp src/e_rem_pio2.c ld128/e_rem_pio2l.h mv ${HOME}/bde/ld128/e_rem_pio2l.c ld128/e_rem_pio2l.h The ld80 version has been tested by bde, das, and kargl over the last few years (bde, das) and few months (kargl). An older ld128 version was tested by das. The committed version has only been compile tested via 'make universe'. Approved by: das (mentor) Obtained from: bde	2011-04-29 23:13:43 +00:00
David Schultz	17303c626f	Add implementations of acosl(), asinl(), atanl(), atan2l(), and cargl(). Reviewed by: bde sparc64 testing resources from: remko	2008-07-31 22:41:26 +00:00
David Schultz	3e13dd37ff	1 << 47 needs to be written 1ULL << 47.	2008-03-02 20:16:55 +00:00
David Schultz	61f955827d	Add kernel functions for 128-bit long doubles. These could be improved a bit, but access to a freebsd/sparc64 machine is needed. Submitted by: bde and Steve Kargl <sgk@apl.washington.edu> (earlier version)	2008-02-17 07:32:31 +00:00
Bruce Evans	f01bfe5c6d	Fix exp2*(x) on signaling NaNs by returning x+x as usual. This has the side effect of confusing gcc-4.2.1's optimizer into more often doing the right thing. When it does the wrong thing here, it seems to be mainly making too many copies of x with dependency chains. This effect is tiny on amd64, but in some cases on i386 it is enormous. E.g., on i386 (A64) with -O1, the current version of exp2() should take about 50 cycles, but took 83 cycles before this change and 66 cycles after this change. exp2f() with -O1 only speeded up from 51 to 47 cycles. (exp2f() should take about 40 cycles, on an Athlon in either i386 or amd64 mode, and now takes 42 on amd64). exp2l() with -O1 slowed down from 155 cycles to 123 for some args; this is unimportant since the i386 exp2l() is a fake; the wrong thing for it seems to involve branch misprediction.	2008-02-13 10:44:44 +00:00
Bruce Evans	a373e66b85	Use a better method of scaling by 2**k. Instead of adding to the exponent bits of the reduced result, construct 2**k (hopefully in parallel with the construction of the reduced result) and multiply by it. This tends to be much faster if the construction of 2**k is actually in parallel, and might be faster even with no parallelism since adjustment of the exponent requires a read-modify-write at an unfortunate time for pipelines. In some cases involving exp2 on amd64 (A64), this change saves about 40 cycles or 30%. I think it is inherently only about 12 cycles faster in these cases and the rest of the speedup is from partly-accidentally avoiding compiler pessimizations (the construction of 2**k is now manually scheduled for good results, and -O2 doesn't always mess this up). In most cases on amd64 (A64) and i386 (A64) the speedup is about 20 cycles. The worst case that I found is expf on ia64 where this change is a pessimization of about 10 cycles or 5%. The manual scheduling for plain exp[f] is harder and not as tuned. This change to ld128/s_exp2l.c has not been tested.	2008-02-07 03:17:05 +00:00
David Schultz	968b39e3b9	Implement exp2l(). There is one version for machines with 80-bit long doubles (i386, amd64, ia64) and one for machines with 128-bit long doubles (sparc64). Other platforms use the double version. I've only done runtime testing on i386. Thanks to bde@ for helpful discussions and bugfixes.	2008-01-18 21:42:46 +00:00
David Schultz	7cd4a83267	Since nan() is supposed to work the same as strtod("nan(...)", NULL), my original implementation made both use the same code. Unfortunately, this meant libm depended on a vendor header at compile time and previously-unexposed vendor bits in libc at runtime. Hence, I just wrote my own version of the relevant vendor routine. As it turns out, mine has a factor of 8 fewer lines of code, and is a bit more readable anyway. The strtod() and *scanf() routines still use vendor code. Reviewed by: bde	2007-12-18 23:46:32 +00:00
David Schultz	4b6b574455	Implement and document nan(), nanf(), and nanl(). This commit adds two new directories in msun: ld80 and ld128. These are for long double functions specific to the 80-bit long double format used on x86-derived architectures, and the 128-bit format used on sparc64, respectively.	2007-12-16 21:19:28 +00:00

28 Commits
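
Illustration for commit a373e66b85 (scaling by 2**k): a minimal sketch of the idea described in that message, not the code from the tree. Instead of adjusting the exponent bits of the reduced result, it builds 2**k as its own value and multiplies by it. The helper names pow2_from_bits() and scale_by_pow2() are hypothetical, and the sketch assumes an IEEE 754 binary64 double and a k that keeps 2**k in the normal range.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: build the double 2**k directly from its bit
 * pattern. Assumes IEEE 754 binary64; valid only for -1022 <= k <= 1023,
 * where 2**k is a normal number.
 */
static double
pow2_from_bits(int k)
{
	uint64_t bits;
	double twopk;

	assert(k >= -1022 && k <= 1023);
	/* Biased exponent in bits 52..62, sign and mantissa all zero. */
	bits = (uint64_t)(k + 1023) << 52;
	memcpy(&twopk, &bits, sizeof(twopk));
	return (twopk);
}

/*
 * Hypothetical helper: scale a reduced result r by 2**k with a single
 * multiply. The construction of 2**k does not depend on r, so it can
 * overlap with the computation of r.
 */
static double
scale_by_pow2(double r, int k)
{
	return (r * pow2_from_bits(k));
}
```

Because the multiplier is an exact power of two, the product is exact whenever the result stays in the normal range; handling of subnormal results and overflow is left out of this sketch.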