Some of the functions in msun can be implemented using a compiler
builtin function to generate a small number of instructions. Implement
this support in fma, fmax, fmin, and sqrt on arm64.
Care must be taken as the builtin can be implemented as a function
call on some architectures that lack direct support. In these cases
we need to use the original code path.
As we don't set errno on failure build with -fno-math-errno so the
toolchain doesn't convert a builtin into a function call when it
detects a failure, e.g. gcc will add a call to sqrt when the input
is negative leading to an infinite loop.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32801
In ac76bc1145dd, I added a few volatiles to work around ctrig_test
failures with {inf,inf}. This is not necessary anymore now, since in
3b00222f156d we added -fp-exception-behavior=maytrap for clang >= 10 in
libm's Makefile. (The flag tells clang to use stricter floating point
semantics, which libm depends on.)
PR: 244732, 254911
Fixes: ac76bc1145dd
MFC after: 3 days
The change implements cexpl() for both ld80 and ld128 architectures.
Testing was done on x86_64 and aarch64 systems.
Along the way sincos[fl]() use an optimization that reduces the argument
to being done one rather than twice. This optimization actually pointed
to a bug in the ld128 version of sincosl(), which is now fixed. In
addition, the minmax polynomial coefficients for sincosl() have been
updated.
A concise log of the file-by-file changes follows.
* include/complex.h:
. Add a prototype for cexpl().
* lib/msun/Makefile:
. Add s_cexpl.c to the build.
. Setup a link for cexpl.3 to cexp.3.
* lib/msun/Symbol.map:
. Expose cexpl symbol in libm shared library.
* lib/msun/ld128/s_cexpl.c:
* Implementation of cexpl() for 128-bit long double architectures.
Tested on an aarch64 system.
* lib/msun/ld80/s_cexpl.c:
* Implementation of cexpl() for Intel 80-bit long double.
* lib/msun/man/cexp.3:
. Document cexpl().
* lib/msun/man/complex.3:
. Add a BUGS section about cpow[fl].
* lib/msun/src/s_cexp.c:
. Include float.h for weak references on 53-bit long double targets.
. Use sincos() to reduce argument reduction cost.
* lib/msun/src/s_cexpf.c:
. Use sincosf() to reduce argument reduction cost.
* lib/msun/src/k_sincosl.h:
. Catch up with the new minmax polynomial coefficients for the kernel for
the 128-bit cosl() implementation.
. BUG FIX: *cs was used where *sn should have been. This means that sinl()
was no computed correctly when iy != 0.
* lib/msun/src/s_cosl.c:
. Include fpmath.h to get access to IEEEl2bits.
. Replace M_PI_4 with pio4, a 64-bit or 113-bit approximation for pi / 4.
PR: 216862
MFC after: 1 week
As mention previously, the minmax polynomial approximation
in the kernel for cosl() seem to have a bad set of coefficients.
In testing, cosl() in the interval [0.785, pi/4] for 1 million
values and pi/4 written to 37 decimal digits. The old version
on an aarch64 system gave
% tlibm/tlibm_lmath -l -x 0.78 -X
7.85398163397448309615660845819875721e-1L cos
Interval tested for cosl: [0.78,0.785398]
count: 1000000
xm = 7.80213913234863919029058821396125599e-01L
libm = 7.10763080972549562455058499280609083e-01L
mpfr = 7.10763080972549562455058499280608983e-01L
ULP = 1.04431
The max ULP exceeds 1, which is not good. So, I rinsed off a 10
year code and recomputed coefficients. The new minmax polynomial
now yields
% tlibm/tlibm_lmath -l -x 0.78 -X
7.85398163397448309615660845819875721e-1L cos
Interval tested for cosl: [0.78,0.785398]
count: 1000000
xm = 7.82916198746768272588844890973704219e-01L
libm = 7.08859615479571058183956453286628396e-01L
mpfr = 7.08859615479571058183956453286628469e-01L
ULP = 0.75407
which is very good.
PR: 218514
MFC after: 1 week
Mark Murray graciously provided access to an aarch64 system
to test the ld128 implementations. This patch address
* Misuses of copysignl() in sinpil() and tanpil().
* Redo the splitting of argument 'x' into an integer part and
remainder. The remainder must satify 0 <= r < 1.
* Update the reduction of the integer part to something that can
easily be seen as even or odd, e.g., sin(pi*x) = (-1)^n*sin(pi*r)
with n <= 2^112 and we an reduce n by subtracting integer powers
of 2.
* In s_cospil.c, fix typos where 'x' is used where 'ax', the
remainder, is required.
* In tanpil(), fix the use of an uninitialized variable, ax = fabsl(ax),
ax should be x in fabsl().
One item of note, in the limited tested on aarch64, the max ULP
for sinpil() and cospil() were less than 1.1 ULP, which is higher
that the desired max ULP less than 1. This was traced to the
kernel for cosl() in the fundamental interval [0,pi/4].
The coefficients in the minmax polynomial likely need refinement.
PR: 218514
MFC after: 1 week
The patch fixes the omission of '#include <float.h>', which is needed for
the weak reference on systems with LDBL_MANT_DIG == DBL_MANT_DIG.
PR: 218514
MFC after: 1 week
Both IEEE-754 2008 and ISO/IEC TS 18661-4 define the half-cycle
trignometric functions cospi, sinpi, and tanpi. The attached
patch implements cospi[fl], sinpi[fl], and tanpi[fl]. Limited
testing on the cospi and sinpi reveal a max ULP less than 0.89;
while tanpi is more problematic with a max ULP less than 2.01
in the interval [0,0.5]. The algorithms used in these functions
are documented in {ks}_cospi.c, {ks}_sinpi.c, and s_tanpi.c.
Note. I no longer have access to a system with ld128 and
adequate support to compile and test the ld128 implementations
of these functions. Given the almost complete lack of input from
others on improvements to libm, I doubt that anyone cares. If
someone does care, the ld128 files contain a number of FIXME comments,
and in particular, while the polynomial coefficients are given
I did not update the polynomial algorithms to properly use the
coefficients.
PR: 218514
MFC after: 2 weeks
These files were copied from MUSL. Add the standard copyright notice and
SPDX-License-Identifier: MIT consistent with our new draft license
policy. It reads word for word the same as the MIT license on the SPDX
web site. Add a pointer to the MUSL COPYIRGHT file which contains a list
of all authors of MUSL.
Sponsored by: Netflix
Noticed by: Steve Kargl
Summary:
From Steve Kargl:
Paul Zimmermann has identified a bug in Openlibm's powf(),
which is identical to FreeBSD's libm. Both derived from
fdlibm. https://github.com/JuliaMath/openlibm/issues/212.
Consider
% cat h.c
int
main(void)
{
float x, y, z;
x = 0x1.ffffecp-1F;
y = -0x1.000002p+27F;
z = 0x1.557a86p115F;
printf("%e %e %e <-- should be %e\n", x, y, powf(x,y), z);
return 0;
}
% cc -o h -fno-builtin h.c -lm && ./h
9.999994e-01 -1.342177e+08 inf <-- should be 5.540807e+34
Reviewers: manu
Subscribers: imp, andrew, emaste
Differential Revision: https://reviews.freebsd.org/D31865
.Fa is the suitable macro for functions in comparsion to the
.Ar macro, which should be used for commandline arguments.
While here, fix some mandoc warnings.
Reviewed by: imp (earlier version)
Obtained from: OpenBSD (in partial)
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D31090
After df3b437c1e073eb83e9a93af1c417f3ee8d0de3b, older gcc's such as
4.2.1 (still used on earlier branches for e.g. mips and powerpc) and
6.3.0 (still used for some cross-builds) started throwing bogus errors
like:
In file included from /workspace/src/lib/msun/src/s_llround.c:11:0:
/workspace/src/lib/msun/src/s_lround.c:54:31: error: initializer element is not constant
static const type dtype_min = type_min - 0.5;
^~~~~~~~
/workspace/src/lib/msun/src/s_lround.c:55:31: error: initializer element is not constant
static const type dtype_max = type_max + 0.5;
^~~~~~~~
Since 'type_min' and 'type_max' are constants declared just above these
lines this error is nonsensical, but older gcc's are not smart enough.
Work around the error by reusing the (type)DTYPE_MIN and (type)DTYPE_MAX
macros, so I can MFC this right away, unbreaking a few stable builds.
MFC after: immediately
It turned out that the (type)DTYPE_MAX conversions at the top of
s_lround.c are now emitted as cvtsi2sd instructions, at least on SSE
capable CPUs. This caused the FE_INEXACT flag to always be set, at least
for the double and float variants. Under clang 11, the whole INRANGE()
comparisons were still optimized away, but this has "improved" in clang
12, due to stricter adherence to the -ffp-exception-behavior=maytrap
compiler flag.
To avoid run-time integer to float conversions, use static constants
instead, so they are computed at compile time, and the INRANGE()
statements are optimized away again, if applicable.
While here, use an integer instead of a floating type to store the test
results in lround_test.c, as this is more appropriate, and we can also
drop the volatile hack.
Reported by: arichardson
MFC after: 3 days
Summary:
Fix FPU exception management for powerpcspe. Bits are in a different place from
the standard FPSCR, so we need to handle the shifting differences. Also,
there's no concept of a "software exception" raise, so we need to do exceptional
math to trigger the exception from software.
Reviewed By: alfredo
Differential Revision: https://reviews.freebsd.org/D22824
For some reason the ld128 log1pl() implementation is less accurate than
logl(), but does at least guarantee precision >= the ld80 implementation.
Mark log1p_accuracy_tests as XFAIL for ld128 and increase the log1p tolerance
to the ld80 equivalent in accuracy_tests to avoid losing test coverage for
the other functions.
PR: 253984
Reviewed By: ngie, dim
Differential Revision: https://reviews.freebsd.org/D29039
This avoids build failures due to the clang 12 warning:
'#pragma FENV_ACCESS' is not supported on this target - ignored
Clang 12 currently emits this warning for all non-x86 architectures.
While this can result in incorrect code generation (e.g. on AArch64 some
exceptions are not raised as expected), this is a pre-existing issue and
we should not fail the build due to this warning.
Reviewed By: dim, emaste
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D29743
After 3b00222f156d, it turns out that clang only supports strict
floating point semantics for SystemZ and x86 at the moment, while for
other architectures it is still experimental.
Therefore, only use -fp-exception-behavior=maytrap on x86 for now,
otherwise this option results in "error: overriding currently
unsupported use of floating point exceptions on this target
[-Werror,-Wunsupported-floating-point-opt]" on other architectures.
Fixes: 3b00222f156d
PR: 254911
MFC after: 1 week
When using clang with x86_64 CPUs that support AVX, some floating point
transformations may raise exceptions that would not have been raised by
the original code. To avoid this, use the -fp-exception-behavior=maytrap
flag, introduced in clang 10.0.0.
In particular, this fixes a number of test failures with ctanhf(3) and
ctanf(3), when libm is compiled with -mavx. An unexpected FE_INVALID
exception is then raised, because clang emits vdivps instructions to
perform certain divides. (The vdivps instruction operates on multiple
single-precision float operands simultaneously, but the exceptions may
be influenced by unused parts of the XMM registers. In this particular
case, it was calculating 0 / 0, which results in FE_INVALID.)
If -fp-exception-behavior=maytrap is specified however, clang uses
vdivss instructions instead, which work on one operand, and should not
raise unexpected exceptions.
Reported by: olivier
Reviewed by: arichardson
PR: 254911
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D29686
When compiling parts of math.h with clang using a C standard before C11,
and using -pedantic, it will result in warnings similar to:
bug254714.c:5:11: warning: '_Generic' is a C11 extension [-Wc11-extensions]
return !isfinite(1.0);
^
/usr/include/math.h:111:21: note: expanded from macro 'isfinite'
^
/usr/include/math.h:82:39: note: expanded from macro '__fp_type_select'
^
This is because the block that enables use of _Generic is conditional
not only on C11, but also on whether the compiler advertises support for
C generic selections via __has_extension(c_generic_selections).
To work around the warning without having to pessimize the code, use the
__extension__ keyword, which is supported by both clang and gcc. While
here, remove the check for __clang__, as _Generic has been supported for
a long time by gcc too now.
Reported by: yuri
PR: 254714
MFC after: 1 week
The man page says "The feenableexcept(), fedisableexcept(), and
fegetexcept() functions return a bitmap of the exceptions that were
unmasked prior to the call.", so we should return zero not -1.
Reviewed By: mhorne
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D29386
After increasing the lib/msun/tests WARNS to 6, this triggers a
compilation error for RISC-V.
Fixes: 87d65c747a43 ("lib/msun: Allow building tests with WARNS=6")
Reported by: Jenkins
The LLVM bug was fixed a long time ago and with D29076 this test actually
passes now.
Reviewed By: andrew
Differential Revision: https://reviews.freebsd.org/D29092
If long double has more than 64 mantissa bits, using uint64_t to hold the
mantissa bits will truncate the value and result in test failures. To fix
this problem use __uint128_t since all platforms that have
__LDBL_MANT_DIG__ > 64 also have compiler support for 128-bit integers.
Reviewed By: rlibby
Differential Revision: https://reviews.freebsd.org/D29076
I only tested the WARNS=6 change on AArch64 and AMD64, but this file has
unused functions for architectures with LDBL_PREC == 53.
While touching this file change the LDBL_PREC == 53 checks to i386 checks.
The long double tests should only be disabled for i386 (due to the rather
odd rounding mode that it uses) not all architectures where long double
is the same as double.
PR: 205449
Fixes: 87d65c747a43 ("lib/msun: Allow building tests with WARNS=6")
Reported by: Jenkins
Some CPUs (e.g. AArch64 QEMU) cannot trap on floating point exceptions and
therefore ignore the writes to the floating point control register inside
feenableexcept(). If no exceptions are enabled after
feenableexcept(FE_ALL_EXCEPT), we can assume that the CPU does not
support exceptions and we can then skip the test.
Reviewed By: dim
Differential Revision: https://reviews.freebsd.org/D29095
These appears to have been resolved by compiling the test with -fno-builtin
and/or using a newer compiler.
PR: 208703
Reviewed By: ngie
Differential Revision: https://reviews.freebsd.org/D28884
This provides better error messages that just an assertion failure and
also makes it easier to mark individual tests as XFAIL.
It was also helpful when coming up with D28786 and D28787.
Differential Revision: https://reviews.freebsd.org/D28798
Apparently GCC only supports arithmetic expressions that use static
const variables in initializers starting with GCC8. To keep older
versions happy use a macro instead.
Fixes: 221622ec0c ("lib/msun: Avoid FE_INEXACT for x86 log2l/log10l")
Reported by: Jenkins
Reviewed By: imp
Differential Revision: https://reviews.freebsd.org/D29233
The existing code masked off all bits that it didn't know about. To be
future-proof, we should save and restore the entire fpcr/fpsr registers.
Additionally, the existing fesetenv() was incorrectly setting the rounding
mode in fpsr instead of fpcr.
This patch stores fpcr in the high 32 bits of fenv_t and fpsr in the low
bits instead of trying to interleave them in a single 32-bit field.
Technically, this is an ABI break if you re-compile parts of your code or
pass a fenv_t between DSOs that were compiled with different versions
of fenv.h. However, I believe we should fix this since the existing code
was broken and passing fenv_t across DSOs should rarely happen.
Reviewed By: andrew
Differential Revision: https://reviews.freebsd.org/D29160
This fixes tests/lib/msun/logarithm_test after compiling the test with
-fno-builtin (D28577). Adding invln10_lo + invln10_10 results in
FE_INEXACT (for all inputs) and the same for the log2l invln2_lo + invln2_hi.
This patch avoids FE_INEXACT (for exact results such as 0) by defining a
constant and using that.
Reviewed By: dim
Differential Revision: https://reviews.freebsd.org/D28786
This caused LDBL_MANT_DIG to not be defined and therefore the scalbnl
alias was not being emitted for double==long double platforms.
Fixes: 760b2ffc ("Update scalbn* functions to the musl versions")
Reported by: Jenkins
The only diff compared to musl is a minor change to scalbnl() to replace
musl's union ldshape with union IEEEl2bits.
This fixes the scalbn tests on non-x86 (since x86 has an assembly version
that is used instead).
Musl commit messages:
commit 8c44a060243f04283ca68dad199aab90336141db
Author: Szabolcs Nagy <nsz@port70.net>
Date: Mon Apr 3 02:38:13 2017 +0200
fix scalbn when result is in the subnormal range
in nearest rounding mode scalbn could introduce double rounding error
when an intermediate value and the final result were both in the
subnormal range e.g.
scalbn(0x1.7ffffffffffffp-1, -1073)
returned 0x1p-1073 instead of 0x1p-1074, because the intermediate
computation got rounded to 0x1.8p-1023.
with the fix an intermediate value can only be in the subnormal range
if the final result is 0 which is correct even after double rounding.
(there still can be two roundings so signals may be raised twice, but
that's only observable with trapping exceptions which is not supported.)
commit 2eaed464e2080d8321d3903b71086a1ecfc4ee4a
Author: Szabolcs Nagy <nsz@port70.net>
Date: Wed Sep 4 15:52:54 2013 +0000
math: use float_t and double_t in scalbnf and scalbn
remove STRICT_ASSIGN (c99 semantics is assumed) and use the conventional
union to prepare the scaling factor (so libm.h is no longer needed)
commit 1b77b9072f374bd26eb0574b83a0d5f18d75ec60
Author: Szabolcs Nagy <nsz@port70.net>
Date: Thu Aug 15 10:07:46 2013 +0000
math: minor scalbn*.c simplification
commit c4359e01303da2755fe7e8033826b132eb3659b1
Author: Szabolcs Nagy <nsz@port70.net>
Date: Tue Nov 13 10:55:35 2012 +0100
math: excess precision fix modf, modff, scalbn, scalbnf
old code was correct only if the result was stored (without the
excess precision) or musl was compiled with -ffloat-store.
now we use STRICT_ASSIGN to work around the issue.
(see note 160 in c11 section 6.8.6.4)
commit 666271c105e4137bdfa195e217799d74143370d4
Author: Szabolcs Nagy <nsz@port70.net>
Date: Tue Nov 13 10:30:40 2012 +0100
math: fix scalbn and scalbnf on overflow/underflow
old code was correct only if the result was stored (without the
excess precision) or musl was compiled with -ffloat-store.
(see note 160 in n1570.pdf section 6.8.6.4)
commit 8051e08e10d2b739fcfcbc6bc7466e8d77fa49f1
Author: nsz <nsz@port70.net>
Date: Mon Mar 19 10:54:07 2012 +0100
simplify scalbn*.c implementations
The old scalbn.c was wrong and slow, the new one is just slow.
(scalbn(0x1p+1023,-2097) should give 0x1p-1074, but the old code gave 0)
Reviewed By: dim
Differential Revision: https://reviews.freebsd.org/D28872
This forces the compiler to emit calls to libm functions, instead of
possibly substituting pre-calculated results at compile time, which
should help to actually test those functions.
Reviewed by: emaste, arichardson, ngie
Differential Revision: https://reviews.freebsd.org/D28577
MFC after: 3 days
I did this without a full vendor update since that would cause too many
conflicts. Since these files now almost match the NetBSD sources the
next git subtree merge should work just fine.
Reviewed By: lwhsu
Differential Revision: https://reviews.freebsd.org/D28797
This applies musl commit b02eed9c4841913d690a2d0029737d72615384fe by
Szabolcs Nagy and updates the tests accordingly. This also allows
removing an XFAIL from the test.
musl commit message:
complex: fix ctanh(+-0+i*nan) and ctanh(+-0+-i*inf)
These cases were incorrect in C11 as described by
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1886.htm
PR: 217528
Reviewed By: dim
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D28578
This adjusts the factor used to scale the subnormal numbers, so it
becomes the right value after adjusting its exponent. Thanks to Steve
Kargl for finding the most elegant fix.
Also enable the hypot tests, and add a test case for this bug.
PR: 253313
MFC after: 1 week
This sprinkles a few strategic volatiles in an attempt to defeat clang's
optimization interfering with the expected floating-point exception
flags.
Reported by: lwhsu
PR: 244732
MFC after: 3 days
This tests fork()s, so if there is still data in the stdout buffer on fork
it will print it again in the child process. This was happening in the
CheriBSD CI and caused the test to complain about malformed TAP output.
Reviewed By: ngie
Differential Revision: https://reviews.freebsd.org/D28397