Commit Graph

55 Commits

Author SHA1 Message Date
Dimitry Andric
5a4c3b831b Recommit r336497: Fix powl, cpow, cpowf, and cpowl imports from OpenBSD
This is a follow-up to r336299.

* lib/msun/Makefile:
  . Remove polevll.c

* lib/msun/ld80/e_powl.c:
  . Copy contents of polevll.c to here.  This is the only consumer of
    these functions.  Make functions 'static inline'.
  . Make reducl a 'static inline' function.

* lib/msun/man/exp.3:
  . Remove BUGS section that no longer applies.

* lib/msun/src/math_private.h:
  . Remove prototypes of __p1evll() and __polevll()

* lib/msun/src/s_cpow.c:
* lib/msun/src/s_cpowf.c:
* lib/msun/src/s_cpowl.c
  . Include math_private.h.
  . Use the CMPLX macro from either C99 or math_private.h (depends on
    compiler support) instead of the problematic use of complex I.

Submitted by:	Steve Kargl <sgk@troutmask.apl.washington.edu>
PR:		229876
MFC after:	1 week
2018-07-20 18:27:30 +00:00
Bruce Evans
27aa844253 Centralize the complications for special efficient rounding to integers.
This was open-coded in range reduction for trig and exp functions.  Now
there are 3 static inline functions rnint[fl]() that replace open-coded
expressions, and type-generic irint() and i64rint() macros that hide the
complications for efficiently using non-generic irint() and irintl()
functions and casts.

Special details:

ld128/e_rem_pio2l.h needs to use i64rint() since it needs a 46-bit integer
result.  Everything else only needs a (less than) 32-bit integer result so
uses irint().

Float and double cases now use float_t and double_t locally instead of
STRICT_ASSIGN() to avoid bugs in extra precision.

On amd64, inline asm is now only used for irint() on long doubles.  The SSE
asm for irint() on amd64 only existed because the ifdef tangles made the
correct method of simply casting to int for this case non-obvious.
2018-07-20 12:42:24 +00:00
Dimitry Andric
c422fbac00 Revert r336497 for now, as it breaks on architectures using gcc, with:
cc1: warnings being treated as errors
/usr/src/lib/msun/src/s_cpow.c: In function 'cpow':
/usr/src/lib/msun/src/s_cpow.c:63: warning: implicit declaration of function 'CMPLX'
2018-07-19 19:07:25 +00:00
Dimitry Andric
2ae9055f49 Fix powl, cpow, cpowf, and cpowl imports from OpenBSD
This is a follow-up to r336299.

* lib/msun/Makefile:
  . Remove polevll.c

* lib/msun/ld80/e_powl.c:
  . Copy contents of polevll.c to here.  This is the only consumer of
    these functions.  Make functions 'static inline'.
  . Make reducl a 'static inline' function.

* lib/msun/man/exp.3:
  . Remove BUGS section that no longer applies.

* lib/msun/src/math_private.h:
  . Remove prototypes of __p1evll() and __polevll()

* lib/msun/src/s_cpow.c:
* lib/msun/src/s_cpowf.c:
* lib/msun/src/s_cpowl.c
  . Use the CMPLX macro from either C99 or math_private.h (depends of
    compiler support) instead of the problematic use of complex I.

Submitted by:	Steve Kargl <sgk@troutmask.apl.washington.edu>
PR:		229876
MFC after:	1 week
2018-07-19 18:44:10 +00:00
Bruce Evans
6f1b8a0792 Add a macro nan_mix() and use it to get NaN results that are (bitwise)
independent of the precision in most cases.  This is mainly to simplify
checking for errors.  r176266 did this for e_pow[f].c using a less
refined expression that often didn't work.  r176276 fixes an error in
the log message for r176266.  The main refinement is to always expand
to long double precision.  See old log messages (especially these 2)
and the comment on the macro for more general details.

Specific details:
- using nan_mix() consistently for the new and old pow*() functions was
  the only thing needed to make my consistency test for powl() vs pow()
  pass on amd64.

- catrig[fl].c already had all the refinements, but open-coded.

- e_atan2[fl].c, e_fmod[fl].c and s_remquo[fl] only had primitive NaN
  mixing.

- e_hypot[fl].c already had a different refined version of r176266.  Refine
  this further.  nan_mix() is not directly usable here since we want to
  clear the sign bit.

- e_remainder[f].c already had an earlier version of r176266.

- s_ccosh[f].c,/s_csinh[f].c already had a version equivalent to r176266.
  Refine this further.  nan_mix() is not directly usable here since the
  expression has to handle some non-NaN cases.

- s_csqrt.[fl]: the mixing was special and mostly wrong.  Partially fix the
  special version.

- s_ctanh[f].c already had a version of r176266.
2018-07-17 07:42:14 +00:00
Matt Macy
6813d08ff5 msun: add ld80/ld128 powl, cpow, cpowf, cpowl from openbsd
This corresponds to the latest status (hasn't changed in 9+
years) from openbsd of ld80/ld128 powl, and source cpowf, cpow,
cpowl (the complex power functions for float complex, double
complex, and long double complex) which are required for C99
compliance and were missing from FreeBSD. Also required for
some numerical codes using complex numbered Hamiltonians.

Thanks to jhb for tracking down the issue with making
weak_reference compile on powerpc.

When asked to review, bde said "I don't like it" - but
provided no actionable feedback or superior implementations.

Discussed with: jhb
Submitted by: jmd
Differential Revision: https://reviews.freebsd.org/D15919
2018-07-15 00:23:10 +00:00
Pedro F. Giffuni
5e53a4f90f lib: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using mis-identified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
2017-11-26 02:00:33 +00:00
Ed Maste
dba5d1ca17 libm: add braces around initialization of subobjects
This cleans up a warning when building libm at higher WARNS levels and
makes the intent more clear. By the C standard the values are assigned
to subobject members in order so this change introduces no functional
change. (6.7.9 20)

Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D8333
2016-11-01 15:11:10 +00:00
Ed Maste
a01a51a352 libm: remove unused variables
Sponsored by:	The FreeBSD Foundation
2016-10-05 17:04:58 +00:00
Ed Schouten
2cec876a59 Rename cpack*() to CMPLX*().
The C11 standard introduced a set of macros (CMPLX, CMPLXF, CMPLXL) that
can be used to construct complex numbers from a pair of real and
imaginary numbers. Unfortunately, they require some compiler support,
which is why we only define them for Clang and GCC>=4.7.

The cpack() function in libm performs the same task as CMPLX(), but
cannot be used to generate compile-time constants. This means that all
invocations of cpack() can safely be replaced by C11's CMPLX(). To keep
the code building with GCC 4.2, provide copies of CMPLX() that can at
least be used to generate run-time complex numbers.

This makes it easier to build some of the functions outside of libm.
2014-12-16 09:21:56 +00:00
Steve Kargl
a4e4b355f4 The value small=2**-(p+3), where p is the precision, can be determine from
lgamma(x) = -log(x) - log(1+x) + x*(1-g) + x**2*P(x) with g = 0.57...
being the Euler constant and P(x) a polynomial.  Substitution of small
into the RHS shows that the last 3 terms are negligible in comparison to
the leading term.  The choice of 3 may be conservative.

The value large=2**(p+3) is detemined from Stirling's approximation
lgamma(x) = x*(log(x)-1) - log(x)/2 + log(2*pi)/2 + P(1/x)/x
Again, substitution of large into the RHS reveals the last 3 terms
are negligible in comparison to the leading term.

Move the x=+-0 special case into the |x|<small block.

In the ld80 and ld128 implementaion, use fdlibm compatible comparisons
involving ix, lx, and llx.  This replaces several floating point
comparisons (some involving fabsl()) and also fixes the special cases
x=1 and x=2.

While here
  . Remove unnecessary parentheses.
  . Fix/improve comments due to the above changes.
  . Fix nearby whitespace.

* src/e_lgamma_r.c:
  . Sort declaration.
  . Remove unneeded explicit cast for type conversion.
  . Replace a double literal constant by an integer literal constant.

* src/e_lgammaf_r.c:
  . Sort declaration.

* ld128/e_lgammal_r.c:
  . Replace a long double literal constant by a double literal constant.

* ld80/e_lgammal_r.c:
  . Remove unused '#include float.h'
  . Replace a long double literal constant by a double literal constant.

Requested by:	bde
2014-10-09 22:39:52 +00:00
Steve Kargl
f382031d34 For targets that have a signed zero, lgamma_r(-0, &signgamp) should
set signgamp = -1.

Submitted by:	enh at google dot com (e_lgamma[f]_r.c)
2014-09-17 19:01:22 +00:00
Steve Kargl
f7efd14df1 * Makefile:
. Hook e_lgammal[_r].c to the build.
  . Create man page links for lgammal[-r].3.

* Symbol.map:
  . Sort lgammal to its rightful place.
  . Add FBSD_1.4 section for the new lgamal_r symbol.

* ld128/e_lgammal_r.c:
  . 128-bit implementataion of lgammal_r().

* ld80/e_lgammal_r.c:
  . Intel 80-bit format implementation of lgammal_r().

* src/e_lgamma.c:
  . Expose lgammal as a weak reference to lgamma for platforms
    where long double is mapped to double.

* src/e_lgamma_r.c:
  . Use integer literal constants instead of real literal constants.
    Let compiler(s) do the job of conversion to the appropriate type.
  . Expose lgammal_r as a weak reference to lgamma_r for platforms
    where long double is mapped to double.

* src/e_lgammaf_r.c:
  . Fixed the Cygnus Support conversion of e_lgamma_r.c to float.
    This includes the generation of new polynomial and rational
    approximations with fewer terms.  For each approximation, include
    a comment on an estimate of the accuracy over the relevant domain.
  . Use integer literal constants instead of real literal constants.
    Let compiler(s) do the job of conversion to the appropriate type.
    This allows the removal of several explicit casts of double values
    to float.

* src/e_lgammal.c:
  . Wrapper for lgammal() about lgammal_r().

* src/imprecise.c:
  . Remove the lgamma.

* src/math.h:
  . Add a prototype for lgammal_r().

* man/lgamma.3:
  . Document the new functions.

Reviewed by:	bde
2014-09-15 23:21:57 +00:00
Steve Kargl
3b5e0d0f96 * Makefile:
. Add s_erfl.c to building libm.
  . Add MLINKS for erfl.3 and erfcl.3.

* Symbol.map:
  . Move erfl and erfcl to their proper location.

* ld128/s_erfl.c:
  . Implementations of erfl and erfcl in the IEEE 754 128-bit format.

* ld80/s_erfl.c:
  . Implementations of erfl and erfcl in the Intel 80-bit format.

* man/erf.3:
  . Document the new functions.
  . While here, remove an incomplete sentence.

* src/imprecise.c:
  . Remove the stupidity of mapping erfl and erfcl to erf and erfc.

* src/math.h:
  . Move the declarations of erfl and erfcl to their proper place.

* src/s_erf.c:
  . For architectures where double and long double are the same
    floating point format, use weak references to map erfl to
    erf and ercl to erfc.

Reviewed by:	bde (many earlier versions)
2014-07-13 17:05:03 +00:00
Steve Kargl
5f63fbd67f * ld80/k_expl.h:
* ld128/k_expl.h:
  . Split out a computational kernel,__k_expl(x, &hi, &lo, &k) from expl(x).
    x must be finite and not tiny or huge.  The kernel returns hi and lo
    values for extra precision and an exponent k for a 2**k scale factor.
  . Define additional kernels k_hexpl() and hexpl() that include a 1/2
    scaling and are used by the hyperbolic functions.

* ld80/s_expl.c:
* ld128/s_expl.c:
  . Use the __k_expl() kernel.

Obtained from:	bde
2013-12-30 00:51:25 +00:00
Steve Kargl
3ffff4bad5 ld80 and ld128 implementations of expm1l(). This code started life
as a fairly faithful implementation of the algorithm found in

PTP Tang, "Table-driven implementation of the Expm1 function
in IEEE floating-point arithmetic," ACM Trans. Math. Soft., 18,
211-222 (1992).

Over the last 18-24 months, the code has under gone significant
optimization and testing.

Reviewed by:	bde
Obtained from:	bde (most of the optimizations)
2013-06-03 19:51:32 +00:00
Steve Kargl
8cc74771f2 ld80/s_expl.c:
* Use integral numerical constants, and let the compiler do the
  conversion to long double.

ld128/s_expl.c:

* Use integral numerical constants, and let the compiler do the
  conversion to long double.
* Use the ENTERI/RETURNI macros, which are no-ops on ld128.  This
  however makes the ld80 and ld128 identical.

Reviewed by:	bde (as part of larger diff)
2013-06-03 19:13:44 +00:00
Steve Kargl
35cbca6a7f Micro-optimization: move the unary mius operator to operate
on a literal constant.

Obtained from:	bde
2013-06-03 18:57:35 +00:00
Steve Kargl
1783063f18 ld80/s_expl.c:
* In the special case x = -Inf or -NaN, use a micro-optimization
  to eliminate the need to access u.xbits.man.

* Fix an off-by-one for small arguments |x| < 0x1p-65.

ld128/s_expl.c:

* In the special case x = -Inf or -NaN, use a micro-optimization
  to eliminate the need to access u.xbits.manh and u.xbits.manl.

* Fix an off-by-one for small arguments |x| < 0x1p-114.

Obtained from:	bde
2013-06-03 18:51:34 +00:00
Steve Kargl
31407861b8 ld80/s_expl.c:
* Update the evaluation of the polynomial.  This allows the removal
  of the now unused variables t23 and t45.

ld128/s_expl.c:

* Update the evaluation of the polynomial and the intermediate
  result t.  This update allows several numerical constants to be
  written as double rather than long double constants.   Update
  the constants as appropriate.

Obtained from:	bde
2013-06-03 18:40:00 +00:00
Steve Kargl
199b8e343d Rename a few P2, P3, ... coefficients to A2, A3, ... missed in
my previous commit.
2013-06-03 18:18:08 +00:00
Steve Kargl
f3049ab5f3 Update a comment to reflect that we are using an endpoint of
an interval instead of a midpoint.
2013-06-03 18:14:18 +00:00
Steve Kargl
ad36b00fcb Add a u suffix to the IEEEl2bits unions o_threshold and u_threshold,
and use macros to access the e component of the unions.  This allows
the portions of the code in ld80 to be identical to the ld128 code.

Obtained from:	bde
2013-06-03 18:07:04 +00:00
Steve Kargl
4aa8c9453f Introduce the macro LOG2_INTERVAL, which is log2(number of intervals).
Use the macroi as a micro-optimization to convert a subtraction and
division to a shift.

Obtained from:	bde
2013-06-03 17:51:08 +00:00
Steve Kargl
03e1315345 Whitespace. 2013-06-03 17:40:52 +00:00
Steve Kargl
bb23de67bb * Rename the polynomial coefficients from P2, P3, ... to A2, A3, ....
The names now coincide with the name used in PTP Tang's paper.

* Rename the variable from s to tbl to better reflect that
  this is a table, and to be consistent with the naming scheme
  in s_exp2l.c

Reviewed by:	bde (as part of larger diff)
2013-06-03 17:36:26 +00:00
Steve Kargl
b419a5506a * Style(9). Start non-Copyright fancy formatted comments with /**.
Reviewed by:	bde (as part of larger diff)
2013-06-03 17:24:46 +00:00
Steve Kargl
a1d69112c1 ld80/s_expl.c:
* Update Copyright years to include 2013.

ld128/s_expl.c:

* Correct and update Copyright years.  This code originated from
  the ld80 version, so it should reflect the same time period.

Reviewed by:	bde (as part of larger diff)
2013-06-03 17:21:43 +00:00
David Schultz
25a4d6bfda Add logl, log2l, log10l, and log1pl.
Submitted by:	bde
2013-06-03 09:14:31 +00:00
Steve Kargl
ad600fe1aa Style(9)
Approved by:	das (implicit)
Reported by:	jh
2013-05-27 22:45:05 +00:00
Steve Kargl
532fd61b45 * Update polynomial coefficients.
* Use ENTERI/RETURNI to allow the use of FP_PE on i386 target.

Reviewed by:	das (and bde a long time ago)
Approved by:	das (mentor)
Obtained from:	bde (polynomial coefficients)
2013-05-27 20:43:16 +00:00
David Schultz
7dbbb6dde3 Fix some regressions caused by the switch from gcc to clang. The fixes
are workarounds for various symptoms of the problem described in clang
bugs 3929, 8100, 8241, 10409, and 12958.

The regression tests did their job: they failed, someone brought it
up on the mailing lists, and then the issue got ignored for 6 months.
Oops. There may still be some regressions for functions we don't have
test coverage for yet.
2013-05-27 08:50:10 +00:00
Steve Kargl
f81d134e7e * Update the comment that explains the choice of values in the
table and the requirement on trailing zero bits.

* Remove the __aligned() compiler directives as these were found
  to have a negative effect on the produced code.

Submitted by:	bde
Approved by:	das (mentor)
2012-10-13 19:53:11 +00:00
Steve Kargl
a077586c53 * src/math_private.h:
. Change the API for the LD80C by removing the explicit passing
    of the sign bit.  The sign can be determined from the last
    parameter of the macro.
  . On i386, load long double by bit manipulations to work around
    at least a gcc compiler issue.  On non-i386 ld80 architectures,
    use a simple assignment.

* ld80/s_expl.c:
  . Update the only consumer of LD80C.

Submitted by:	bde
Approved by:	das (mentor)
2012-09-29 16:40:12 +00:00
Steve Kargl
dba466c344 * ld80/s_expl.c:
. Fix the threshold for expl(x) where |x| is small.
  . Also update the previously incorrect comment to match the
    new threshold.

* ld128/s_expl.c:
  . Re-order logic in exceptional cases to match the logic used in
    other long double functions.
  . Fix the threshold for expl(x) where is |x| is small.
  . Also update the previously incorrect comment to match the
    new threshold.

Submitted by:	bde
Approved by:	das (mentor)
2012-09-23 18:32:03 +00:00
Steve Kargl
724c1ee29f Fix whitespace issue.
Approved by:	das (mentor, implicit)
2012-09-23 18:13:46 +00:00
Steve Kargl
8f647ffd7f * ld80/s_expl.c:
. Guard a comment from reformatting by indent(1).
  . Re-order variables in declarations to alphabetical order.
  . Remove a banal comment.

* ld128/s_expl.c:
  . Add a comment to point to ld80/s_expl.c for implementation details.
  . Move the #define of INTERVAL to reduce the diff with ld80/s_expl.c.
  . twom10000 does not need to be volatile, so move its declaration.
  . Re-order variables in declarations to alphabetical order.
  . Add a comment that describes the argument reduction.
  . Remove the same banal comment found in ld80/s_expl.c.

Reviewed by:	bde
Approved by:	das (mentor)
2012-09-23 18:06:27 +00:00
Steve Kargl
c1a077829a * Update the lookup table to use 53-bit high and low values.
Also, update the comment to describe the choice of using
  a high and low decomposition of 2^(i/INTERNVAL) for
  0 <= i <= INTERVAL in preparation for an implementation of
  expm1l.

* Move the #define of INTERVAL above the comment, because the
  comment refers to INTERVAL.

Reviewed by:	bde
Approved by:	das (mentor)
2012-09-23 17:36:01 +00:00
Steve Kargl
ca50c4b871 Whitespace.
Submitted by:	bde
Approved by:	das (pre-approved)
2012-07-30 21:55:49 +00:00
Steve Kargl
8345cbd275 Replace the macro name NUM with INTERVALS. This change provides
compatibility with the INTERVALS macro used in the soon-to-be-commmitted
expm1l() and someday-to-be-committed log*l() functions.

Add a comment into ld128/s_expl.c noting at gcc issue that was
deleted when rewriting ld80/e_expl.c as ld128/s_expl.c.

Requested by:	bde
Approved by:	das (mentor)
2012-07-26 04:05:08 +00:00
Steve Kargl
f7cfe68f59 * ld80/expl.c:
. Remove a few #ifdefs that should have been removed in the initial
    commit.
  . Sort fpmath.h to its rightful place.

* ld128/s_expl.c:
  . Replace EXPMASK with its actual value.
  . Sort fpmath.h to its rightful place.

Requested by:	bde
Approved by:	das (mentor)
2012-07-26 03:59:33 +00:00
Steve Kargl
b83ccea32c Compute the exponential of x for Intel 80-bit format and IEEE 128-bit
format.  These implementations are based on

PTP Tang, "Table-driven implementation of the exponential function
in IEEE floating-point arithmetic," ACM Trans. Math. Soft., 15,
144-157 (1989).

PR: standards/152415
Submitted by: kargl
Reviewed by: bde, das
Approved by: das (mentor)
2012-07-23 19:13:55 +00:00
Ben Laurie
5f301949ef Fix clang warnings.
Approved by:	philip (mentor)
2011-06-18 13:56:33 +00:00
Steve Kargl
9aa461b570 Clean up the unneeded cpp macro INLINE_REM_PIO2L.
Reviewed by:	das
Approved by:	das (mentor)
2011-05-30 19:41:28 +00:00
Steve Kargl
c273267e83 Improve the accuracy from a max ULP of ~2000 to max ULP < 0.79
on i386-class hardware for sinl and cosl.  The hand-rolled argument
reduction have been replaced by e_rem_pio2l() implementations.  To
preserve history the following commands have been executed:

svn cp src/e_rem_pio2.c ld80/e_rem_pio2l.h
mv ${HOME}/bde/ld80/e_rem_pio2l.c ld80/e_rem_pio2l.h

svn cp src/e_rem_pio2.c ld128/e_rem_pio2l.h
mv ${HOME}/bde/ld128/e_rem_pio2l.c ld128/e_rem_pio2l.h

The ld80 version has been tested by bde, das, and kargl over the
last few years (bde, das) and few months (kargl).  An older ld128
version was tested by das.  The committed version has only been
compiled tested via 'make universe'.

Approved by: das (mentor)
Obtained from: bde
2011-04-29 23:13:43 +00:00
David Schultz
1192a80ed1 On i386, gcc truncates long double constants to double precision
at compile time regardless of the dynamic precision, and there's
no way to disable this misfeature at compile time. Hence, it's
impossible to generate the appropriate tables of constants for the
long double inverse trig functions in a straightforward way on i386;
this change hacks around the problem by encoding the underlying bits
in the table.

Note that these functions won't pass the regression test on i386,
even with the FPU set to extended precision, because the regression
test is similarly damaged by gcc. However, the tests all pass when
compiled with a modified version of gcc.

Reported by:  	bde
2008-08-02 03:56:22 +00:00
David Schultz
17303c626f Add implementations of acosl(), asinl(), atanl(), atan2l(),
and cargl().

Reviewed by:			bde
sparc64 testing resources from:	remko
2008-07-31 22:41:26 +00:00
Bruce Evans
be396b71c1 2 long double constants were missing L suffixes. This helped break tanl()
on !(amd64 || i386).  It gave slightly worse than double precision in some
cases.  tanl() now passes tests of 2^24 values on ia64.
2008-02-18 15:39:52 +00:00
Bruce Evans
19a9e1bb1c Fix a typo which broke k_tanl.c on !(amd64 || i386). 2008-02-18 14:09:41 +00:00
David Schultz
de336b0c5e Add kernel functions for 80-bit long doubles. Many thanks to Steve and
Bruce for putting lots of effort into these; getting them right isn't
easy, and they went through many iterations.

Submitted by:	Steve Kargl <sgk@apl.washington.edu> with revisions from bde
2008-02-17 07:32:14 +00:00