Merge the projects/clang-sparc64 branch back to head. This brings in
several updates from the llvm and clang trunks to make the sparc64
backend fully functional.
Apart from one patch to sys/sparc64/include/pcpu.h which is still under
discussion, this makes it possible to let clang fully build world and
kernel for sparc64.
Any assistance with testing this on actual sparc64 hardware is greatly
appreciated, as there will unavoidably be bugs left.
Many thanks go to Roman Divacky for his upstream work on getting the
sparc64 backend into shape.
MFC r262985:
Repair a few minor mismerges from r262261 in the clang-sparc64 project
branch. This is also to minimize differences with upstream.
preprocessor) gives the following error:
--- Version.map ---
<stdin>:287:4: error: invalid preprocessing directive
# Implemented as weak aliases for imprecise versions
^
1 error generated.
Change the comment to a C-style one, to prevent this error.
Approved by: re (hrs)
These are weak and so can be replaced by other versions in applications
that choose to do so, and will give a linker warning when used so that
applications that rely on the extra precision can avoid them.
Note that since the C/C++ specs only guarantee that long double has
precision equal to double, code that actually relies on these functions
having greater precision is unportable at best and broken at worst.
. Use integer literal constants instead of double literal constants.
* s_erff.c:
. Use integer literal constants instead of casting double literal
constants to float.
. Update the threshold values from those carried over from erf() to
values appropriate for float.
. New sets of polynomial coefficients for the rational approximations.
These coefficients have little, but positive, effect on the maximum
error in ULP in the four intervals, but do improve the overall
speed of execution.
. Remove redundant GET_FLOAT_WORD(ix,x) as hx already contained the
contents that is packed into ix.
. Update the mask that is used to zero-out lower-order bits in x in
the intervals [1.25, 2.857143] and [2.857143, 12]. In tests on
amd64, this change improves the maximum error in ULP from 6.27739
and 63.8095 to 3.16774 and 2.92095 on these intervals for erffc().
Reviewed by: bde
as a fairly faithful implementation of the algorithm found in
PTP Tang, "Table-driven implementation of the Expm1 function
in IEEE floating-point arithmetic," ACM Trans. Math. Soft., 18,
211-222 (1992).
Over the last 18-24 months, the code has under gone significant
optimization and testing.
Reviewed by: bde
Obtained from: bde (most of the optimizations)
* Use integral numerical constants, and let the compiler do the
conversion to long double.
ld128/s_expl.c:
* Use integral numerical constants, and let the compiler do the
conversion to long double.
* Use the ENTERI/RETURNI macros, which are no-ops on ld128. This
however makes the ld80 and ld128 identical.
Reviewed by: bde (as part of larger diff)
* In the special case x = -Inf or -NaN, use a micro-optimization
to eliminate the need to access u.xbits.man.
* Fix an off-by-one for small arguments |x| < 0x1p-65.
ld128/s_expl.c:
* In the special case x = -Inf or -NaN, use a micro-optimization
to eliminate the need to access u.xbits.manh and u.xbits.manl.
* Fix an off-by-one for small arguments |x| < 0x1p-114.
Obtained from: bde
* Update the evaluation of the polynomial. This allows the removal
of the now unused variables t23 and t45.
ld128/s_expl.c:
* Update the evaluation of the polynomial and the intermediate
result t. This update allows several numerical constants to be
written as double rather than long double constants. Update
the constants as appropriate.
Obtained from: bde
and use macros to access the e component of the unions. This allows
the portions of the code in ld80 to be identical to the ld128 code.
Obtained from: bde
The names now coincide with the name used in PTP Tang's paper.
* Rename the variable from s to tbl to better reflect that
this is a table, and to be consistent with the naming scheme
in s_exp2l.c
Reviewed by: bde (as part of larger diff)
* Update Copyright years to include 2013.
ld128/s_expl.c:
* Correct and update Copyright years. This code originated from
the ld80 version, so it should reflect the same time period.
Reviewed by: bde (as part of larger diff)
* Use ENTERI/RETURNI to allow the use of FP_PE on i386 target.
Reviewed by: das (and bde a long time ago)
Approved by: das (mentor)
Obtained from: bde (polynomial coefficients)
are workarounds for various symptoms of the problem described in clang
bugs 3929, 8100, 8241, 10409, and 12958.
The regression tests did their job: they failed, someone brought it
up on the mailing lists, and then the issue got ignored for 6 months.
Oops. There may still be some regressions for functions we don't have
test coverage for yet.
libc.a and libc_p.a. In addition, define isnan in libm.a and libm_p.a,
but not in libm.so.
This makes it possible to statically link executables using both isnan
and isnanf with libc and libm.
Tested by: kargl
MFC after: 1 week
table and the requirement on trailing zero bits.
* Remove the __aligned() compiler directives as these were found
to have a negative effect on the produced code.
Submitted by: bde
Approved by: das (mentor)
. Change the API for the LD80C by removing the explicit passing
of the sign bit. The sign can be determined from the last
parameter of the macro.
. On i386, load long double by bit manipulations to work around
at least a gcc compiler issue. On non-i386 ld80 architectures,
use a simple assignment.
* ld80/s_expl.c:
. Update the only consumer of LD80C.
Submitted by: bde
Approved by: das (mentor)
. Fix the threshold for expl(x) where |x| is small.
. Also update the previously incorrect comment to match the
new threshold.
* ld128/s_expl.c:
. Re-order logic in exceptional cases to match the logic used in
other long double functions.
. Fix the threshold for expl(x) where is |x| is small.
. Also update the previously incorrect comment to match the
new threshold.
Submitted by: bde
Approved by: das (mentor)