freebsd-skq

Author	SHA1	Message	Date
Bruce Evans	f776d19f07	Oops, the previous i386 version of e_fmodf.S and e_fmodl.S was actually the amd64 version.	2016-09-04 15:08:14 +00:00
Bruce Evans	5432c3bec3	Disconnect the "optimized" asm variants of cos(), sin() and tan() from the build on i386. Leave them in the source tree for regression tests. The asm functions were always much less accurate (by a factor of more than 10**18 in the worst case). They were faster on old CPUs. But with each new generation of CPUs they get relatively slower. The double precision C version's average advantage is about a factor of 2 on Haswell. The asm functions were already intentionally avoided in float and long double precision on i386 and in all precisions on amd64. Float precision and amd64 give larger advantages to the C version. The long double precision C code and compilers' understanding of long double precision are not so good, so the i387 is still slightly faster for long double precision, except for the unimportant subcase of huge args where the sub-optimal C code now somehow beats the i387 by about a factor of 2.	2016-09-04 14:12:19 +00:00
Bruce Evans	83e449a402	Add asm versions of fmod(), fmodf() and fmodl() on amd64. Add asm versions of fmodf() and fmodl() on i387. fmod is similar to remainder, and the C versions are 3 to 9 times slower than the asm versions on x86 for both, but we had the strange mixture of all 6 variants of remainder in asm and only 1 of 6 variants of fmod in asm.	2016-09-04 12:22:14 +00:00
Konstantin Belousov	826549e53d	Merge the 386 and amd64 versions of the fenv.h, to make cc -m32 compilations which use fenv.h work. Reviewed by: tjil Sponsored by: The FreeBSD Foundation	2013-04-21 13:31:55 +00:00
Tijl Coosemans	71dad5d6ad	Optimise i387 trigonometric functions. Replace "andw 0x400,%ax \ jnz" with "sahf \ jp", "fprem1" with "fprem" and "fstsw %ax" with "fnstsw %ax".	2012-09-16 16:58:49 +00:00
David Schultz	741ae1d017	Bugfix: feenableexcept() and fedisableexcept() should just return the old exception mask, not mask | ~FE_ALL_EXCEPT. MFC after: 2 weeks	2011-10-21 06:25:31 +00:00
David Schultz	5d9fefacf2	Use #include "fenv.h" instead of #include <fenv.h>. This makes it more convenient to compile the math library by itself. Requested by: bde	2011-10-16 05:37:56 +00:00
David Schultz	90a83ac60a	Replace two lines accidentally removed in r226218. Thanks to bde for noticing this.	2011-10-15 04:17:20 +00:00
David Schultz	d78e594bc9	Provide external definitions of all of the standardized functions in fenv.h that are currently inlined. The definitions are provided in fenv.c via 'extern inline' declarations. This assumes the compiler handles 'extern inline' as specified in C99, which has been true under FreeBSD since 8.0. The goal is to eventually remove the 'static' keyword from the inline definitions in fenv.h, so that non-inlined references all wind up pointing to the same external definition like they're supposed to. I am deferring the second step to provide a window where newly-compiled apps will still link against old math libraries. (This isn't supported, but there's no need to cause undue breakage.) Reviewed by: stefanf, bde	2011-10-10 15:43:09 +00:00
Konstantin Belousov	8997563c9a	Add section .note.GNU-stack for assembly files used by 386 and amd64.	2011-01-07 16:13:12 +00:00
Dimitry Andric	b3b74eb4e1	Use __FBSDID() instead of RCSID() in most .S files under lib/msun/i386, and one under lib/msun/amd64. This avoids adding the identifiers to the .text section, and moves them to the .comment section instead. Suggested by: bde Approved by: rpaulo (mentor)	2010-10-01 20:14:36 +00:00
Konstantin Belousov	60d818ef9c	Placate new binutils, by using 16-bit %ax instead of 32-bit %eax as an argument for fnstsw. Explicitly specify sizes for the XMM control and status word and X87 control and status words. Reviewed by: das Tested by: avg MFC after: 2 weeks	2010-02-03 20:23:47 +00:00
Attilio Rao	9235ed7199	Use, in uncovered part, the END() macro in order to improve debugging. In this specific case, Valgrind won't get confused when analyzing such functions. Sponsored by: Sandvine Incorporated Tested by: emaste MFC: 3 days	2009-05-25 14:37:10 +00:00
David Schultz	1192a80ed1	On i386, gcc truncates long double constants to double precision at compile time regardless of the dynamic precision, and there's no way to disable this misfeature at compile time. Hence, it's impossible to generate the appropriate tables of constants for the long double inverse trig functions in a straightforward way on i386; this change hacks around the problem by encoding the underlying bits in the table. Note that these functions won't pass the regression test on i386, even with the FPU set to extended precision, because the regression test is similarly damaged by gcc. However, the tests all pass when compiled with a modified version of gcc. Reported by: bde	2008-08-02 03:56:22 +00:00
David Schultz	074fb64d9a	Add assembly versions of remquol() and remainderl().	2008-03-30 21:21:53 +00:00
David Schultz	e43c8f6acc	Hook up sqrtl() to the build.	2008-03-02 01:48:17 +00:00
David Schultz	c6f56f9f41	MD implementations of sqrtl().	2008-03-02 01:48:08 +00:00
David Schultz	d3f9671a7d	Implement rintl(), nearbyintl(), lrintl(), and llrintl(). Thanks to bde@ for feedback and testing of rintl().	2008-01-14 02:12:07 +00:00
David Schultz	6821aba9e5	Add logbl(3) to libm.	2007-12-17 03:53:38 +00:00
Daniel Eischen	5f864214bb	Use C comments since we now preprocess these files with CPP.	2007-04-29 14:05:22 +00:00
David Schultz	8185b32b5a	Fix a problem relating to fesetenv() clobbering i387 register stack. Details: As a side-effect of restoring a saved FP environment, fesetenv() overwrites the tag word, which indicates which i387 registers are in use. Normally this isn't a problem because the calling convention requires the register stack to be empty on function entry and exit. However, fesetenv() is inlined, so we need to tell gcc explicitly that the i387 registers get clobbered. PR: 85101	2007-01-06 21:46:23 +00:00
David Schultz	3cb636ce18	Remove an unneeded fnstcw instruction. Noticed by: bde	2007-01-05 07:15:26 +00:00
Bruce Evans	fae6222bdb	Moved __BEGIN_DECLS up a little so that it covers __test_sse() and C++ isn't broken, PR: 104425	2006-10-14 20:35:56 +00:00
Bruce Evans	3454a5a101	Removed the optimized asm versions of scalb() and scalbf(). These functions are only for compatibility with obsolete standards. They shouldn't be used, so they shouldn't be optimized. Use the generic versions instead. This fixes scalbf() as a side effect. The optimized asm version left garbage on the FP stack. I fixed the corresponding bug in the optimized asm scalb() and scalbn() in 1996. NetBSD fixed it in scalb(), scalbn() and scalbnf() in 1999 but missed fixing it in scalbf(). Then in 2005 the bug was reimplemented in FreeBSD by importing NetBSD's scalbf(). The generic versions have slightly different error handling: - the asm versions blindly round the second parameter to a (floating point) integer and proceed, while the generic versions return NaN if this rounding changes the value. POSIX permits both behaviours (these functions are XSI extensions and the behaviour for a bogus non-integral second parameter is unspecified). Apart from this and the bug in scalbf(), the behaviour of the generic versions seems to be identical. (I only exhaustively tested generic_scalbf(1.0F, anyfloat) == asm_scalb(1.0F, anyfloat). This covers many representative corner cases involving NaNs and Infs but doesn't test exception flags. The brokenness of scalbf() showed up as weird behaviour after testing just 7 integer cases sequentially.)	2006-07-05 20:06:42 +00:00
Daniel Eischen	d7eda46253	Add symbol versioning to libm.	2006-03-27 23:59:45 +00:00
Bruce Evans	f964c6ecfb	Fixed some comments added in rev.1.5. The log message for 1.5 said that some small (one or two ulp) inaccuracies were fixed, and a comment implied that the critical change is to switch the rounding mode to to-nearest, with a switch of the precision to extended at no extra cost. Actually, the errors are very large (ucbtest finds ones of several hundred ulps), and it is the switch of the precision that is critical. Another comment was wrong about NaNs being handled sloppily.	2005-10-30 12:21:02 +00:00
Daniel Eischen	7f8fa2cf47	Prevent these functions from using stack outside of their frame. Reported by: Marc Olzheim <marcolz at stack dot nl> OK'd by: das	2005-05-06 15:44:20 +00:00
David Schultz	a4ca7ca8ac	More optimized math functions.	2005-04-16 21:12:55 +00:00
David Schultz	3b9141ee91	Implement and document remquo() and remquof().	2005-03-25 04:40:44 +00:00
David Schultz	9233b45ad9	Make the fenv.h routines work for programs that use SSE for floating-point arithmetic on i386. Now I'm going to make excuses for why this code is kinda scary: - To avoid breaking the ABI with 5.3-RELEASE, we can't change sizeof(fenv_t). I stuck the saved mxcsr in some discontiguous reserved bits in the existing structure. - Attempting to access the mxcsr on older processors results in an illegal instruction exception, so support for SSE must be detected at runtime. (The extra baggage is optimized away if either the application or libm is compiled with -msse{,2}.) I didn't run tests to ensure that this doesn't SIGILL on older 486's lacking the cpuid instruction or on other processors lacking SSE. Results from running the fenv regression test on these processors would be appreciated. (You'll need to compile the test with -DNO_STRICT_DFL_ENV.) If you have an 80386, or if your processor supports SSE but the kernel didn't enable it, then you're probably out of luck. Also, I un-inlined some of the functions that grew larger as a result of this change, moving them from fenv.h to fenv.c.	2005-03-17 22:21:46 +00:00
David Schultz	10b01832c3	Replace fegetmask() and fesetmask() with feenableexcept(), fedisableexcept(), and fegetexcept(). These two sets of routines provide the same functionality. I implemented the former as an undocumented internal interface to make the regression test easier to write. However, fe(enable|disable|get)except() is already part of glibc, and I would like to avoid gratuitous differences. The only major flaw in the glibc API is that there's no good way to report errors on processors that don't support all the unmasked exceptions.	2005-03-16 19:03:46 +00:00
David Schultz	99401fa2e9	- Define the LDBL_PREC to be the number of significant bits in a long double's mantissa. - Add an assembly version of scalbnl.	2005-03-07 04:53:48 +00:00
David Schultz	cd7d05b5a2	Add scalbnl, also known as ldexpl.	2005-03-07 04:52:58 +00:00
David Schultz	4b2011300b	Alias scalbnf as ldexpf. The two are identical in binary floating-point formats.	2005-03-07 04:52:43 +00:00
David Schultz	f674c13c78	Remove the i387 versions of atan(), atan2(), and atan2f(). They are slower than the MI routines on modern hardware, except for degenerate cases such as the Pentium 4. PR: 67469	2005-02-21 16:04:23 +00:00
David Schultz	c4691a5da9	Remove i387 versions of asin() and acos(). Although the hardware instruction was faster on the 486, it's slower than our MD version on modern processors. Determined by: bde PR: 67469	2005-02-20 22:51:08 +00:00
David Schultz	dab1571b90	Remove the float versions of the i387 trig functions obtained from NetBSD. They're buggy, particularly for inputs larger in magnitude than 2**63. Noticed by: bde PR: 67469	2005-02-20 22:50:40 +00:00
David Schultz	79b990338f	Move machine-dependent crud to its own makefile.	2005-02-04 14:33:39 +00:00
David Schultz	e1b61b5b93	Remove wrappers and other cruft intended to support SVID, mistakes in C90, and other arcana. Most of these features were never fully supported or enabled by default. Ok: bde, stefanf	2005-02-04 14:08:32 +00:00
David Schultz	f365db00e5	Mark all inline asms that read the floating-point control or status registers as volatile. Instructions that wrote to FP state were already marked volatile, but apparently gcc has license to move non-volatile asms past volatile asms. This broke amd64's feupdateenv at -O2 due to a WAR conflict between fnstsw and fldenv there.	2005-01-14 07:09:23 +00:00
David Schultz	fe69257da2	Import the subset of J.T. Conklin's single-precision x86-optimized math routines that appear to be (a) correct and (b) faster than their MI counterparts on my Pentium 4. Obtained from: NetBSD	2005-01-13 18:58:25 +00:00
David Schultz	3cdb8115d7	Things that are broken, unneeded, and unused since 1997 belong in the attic.	2005-01-13 15:43:22 +00:00
David Schultz	439e59cf85	Faster lrint() and llrint() implementations for x86.	2005-01-11 23:10:53 +00:00
Stefan Farfeleder	c8764bba5a	Completely remove s_ilogb.S as the assembler implementation gives very little speed improvement to none at all over the MI version. Submitted by: bde	2004-06-20 10:42:23 +00:00
Stefan Farfeleder	b6161bb16a	Return the same result as the MI version for 0.0, INFINITY and NaN. Reviewed by: standards@	2004-06-19 09:30:00 +00:00
David Schultz	0b71a226d1	Add an fenv.h implementation for the i386 port. Reviewed by: standards@	2004-06-06 10:04:17 +00:00
Bruce Evans	46d31a8b36	Removed bogus 'l' suffixes in FP register to register instructions.	2000-06-06 12:12:36 +00:00
Peter Wemm	7f3dea244c	$Id$ -> $FreeBSD$	1999-08-28 00:22:10 +00:00
Bruce Evans	9970814b3e	Fixed wrong mnemonic `setnel' that gas happened to generate correct object code for. Obtained from: a slightly different fix in NetBSD	1997-04-30 20:37:52 +00:00
Bruce Evans	6b04d9918b	Include <machine/asm.h> instead of kernel-only <machine/asmacros.h>.	1997-03-09 14:01:11 +00:00

62 Commits