Commit Graph

11872 Commits

Author SHA1 Message Date
imp
09d99400d7 FreeBSD/mips libc support. Merged from perforce mips2-jnpr branch. 2008-04-16 05:06:11 +00:00
davidxu
6a349c6771 _vfork is not in libthr, remove the reference. 2008-04-16 03:19:11 +00:00
cperciva
4ee0999e20 Fix one-byte buffer overflow: NUL gets written to the buffer, but isn't
counted in the width specification in scanf.

This is not a security problem, since this function is only used to
parse a user's configuration file.

Submitted by:	Joerg Sonnenberger
Obtained from:	dragonflybsd
MFC after:	1 week
2008-04-15 23:29:51 +00:00
davidxu
a19eeb1bb9 Implement POSIX function tcgetsid() which returns session id.
PR: stand/107561
2008-04-15 08:33:32 +00:00
davidxu
a1371575f4 don't include pthread_np.h, it is not used. 2008-04-14 08:08:40 +00:00
delphij
b612c0d58d Use calloc() instaed of zeroing memory ourselves. 2008-04-13 08:05:08 +00:00
das
e7c98f4fd1 Unbreak the build for arm and powerpc.
Pointy hat to yours truly.
2008-04-12 14:53:52 +00:00
das
61773c1dea Updates for changes in the way printf() handles hex floating point
numbers.
2008-04-12 03:11:56 +00:00
das
244a0bcf48 Make several changes to the way printf handles hex floating point (%a):
1. Previously, printing the number 1.0 could produce 0x1p+0, 0x2p-1,
   0x4p-2, or 0x8p-3, depending on what happened to be convenient. This
   meant that printing a value as a double and printing the same value
   as a long double could produce different (but equivalent) results.
   The change is to always make the leading digit a 1, unless the
   number is 0. This solves the aforementioned problem and has
   several other advantages.

2. Use the FPU to do rounding. This is far simpler and more portable
   than manipulating the bits, and it fixes an obsure round-to-even
   bug. It also raises the exceptions now required by IEEE 754R.
   The drawbacks are that it is usually slightly slower, and it makes
   printf less effective as a debugging tool when the FPU is hosed
   (e.g., due to a buggy softfloat implementation).

3. On i386, twiddle the rounding precision so that (2) works properly
   for long doubles.

4. Make several simplifications that are now possible due to (2).

5. Split __hldtoa() into a separate file.

Thanks to remko for access to a sparc64 box for testing.
2008-04-12 03:11:36 +00:00
das
23ad2eb9b8 Fix some bugs that caused sparc64's quad precision sqrt to get
the wrong answer for virtually all inputs.

Thanks to remko for access to a sparc64 box for testing.
2008-04-12 03:10:13 +00:00
das
cb01b0e6f4 Make the software emulator for long doubles set the FPU exception
flags appropriately. The next step is to make it raise a SIGFPE if
any exceptions are unmasked.

Thanks to remko for access to a sparc64 box for testing.
2008-04-12 03:09:51 +00:00
delphij
b65786c8a8 Add memrchr(3).
Obtained from:	OpenBSD
2008-04-10 00:12:44 +00:00
deischen
bdfc7667fc Move the cpuset functions from FBSD_1.0 to FBSD_1.1. All symbols added
to 8.0 belong in the FBSD_1.1 symbol namespace.
2008-04-07 13:53:51 +00:00
dfr
a38ef19667 On i386, don't try to do network-type stuff if the device name is'nt pxeN. 2008-04-05 15:03:29 +00:00
dfr
fd80ed9056 Add some compatibility code so that software which is built to use the new
struct flock with l_sysid member can work properly on an an old kernel which
doesn't support l_sysid.

Sponsored by:	Isilon Systems
2008-04-04 09:43:03 +00:00
imp
cc89dca5ca Minor style(9) nit: move to using ANSI definition of functions. 2008-04-03 20:36:44 +00:00
ru
c60a524cb1 Fix descriptions of "struct msqid_ds and "struct ipc_perm" to match
harsh reality.
2008-04-03 16:21:43 +00:00
das
6a8cdf6fed Fix some corner cases:
- fma(x, y, z) returns z, not NaN, if z is infinite, x and y are finite,
  x*y overflows, and x*y and z have opposite signs.
- fma(x, y, z) doesn't generate an overflow, underflow, or inexact exception
  if z is NaN or infinite, as per IEEE 754R.
- If the rounding mode is set to FE_DOWNWARD, fma(1.0, 0.0, -0.0) is -0.0,
  not +0.0.
2008-04-03 06:14:51 +00:00
davidxu
8d9f007088 put THR_CRITICAL_LEAVE into do .. while statement. 2008-04-03 02:47:35 +00:00
kevlo
762f60a112 style(9) cleanup 2008-04-03 02:41:54 +00:00
davidxu
2d5bf7e6fc add __hidden suffix to _umtx_op_err, this eliminates PLT. 2008-04-03 02:13:51 +00:00
davidxu
f1df18eb48 Non-portable functions are in pthread_np.h, fix compiling problem. 2008-04-02 11:41:12 +00:00
davidxu
d00a98a4e8 Add pthread_setaffinity_np and pthread_getaffinity_np to libc namespace. 2008-04-02 08:53:18 +00:00
davidxu
aa2019ec00 Remove unused functions. 2008-04-02 08:33:42 +00:00
davidxu
570834290d Replace function _umtx_op with _umtx_op_err, the later function directly
returns errno, because errno can be mucked by user's signal handler and
most of pthread api heavily depends on errno to be correct, this change
should improve stability of the thread library.
2008-04-02 07:41:25 +00:00
davidxu
8d50c29c72 Replace userland rwlock with a pure kernel based rwlock, the new
implementation does not switch pointers when it resumes waiters.

Asked by: jeff
2008-04-02 04:32:31 +00:00
davidxu
09e95ab21c Normally, we are often reading local time rather than setting time zone,
replace mutex with rwlock, this should eliminate lock contention in
most cases.
2008-04-01 06:56:11 +00:00
davidxu
111802a962 Restore normal pthread_cond_signal path to avoid some obscure races. 2008-04-01 06:23:08 +00:00
davidxu
664ef9f365 return EAGAIN early rather than running bunch of code later, micro optimize
static branch prediction.
2008-04-01 00:21:49 +00:00
das
017bbc352e Remove a (bogus) remnant of debugging this on sparc64. 2008-03-31 13:11:45 +00:00
kib
2ad0eb2d91 Add the libc glue and headers definitions for the *at() syscalls.
Based on the submission by rdivacky,
	sponsored by Google Summer of Code 2007
Reviewed by:	rwatson, rdivacky
Tested by:	pho
2008-03-31 12:14:04 +00:00
kientzle
bd1adacdf1 Include an extra byte for the trailing NUL. <sigh>
Pointy hat: Me
2008-03-31 06:24:39 +00:00
davidxu
220584ad61 Rewrite rwlock to user atomic operations to change rwlock state, this
eliminates internal mutex lock contention when most rwlock operations
are read.

Orignal patch provided by: jeff
2008-03-31 02:55:49 +00:00
das
39170f049a Add assembly versions of remquol() and remainderl(). 2008-03-30 21:21:53 +00:00
das
f487ba5286 Hook remquol() and remainderl() up to the build. 2008-03-30 20:48:02 +00:00
das
b4b933cd52 Implement remainderl() as a wrapper around remquol(). The extra work
remquol() performs to compute the quotient is negligible.
2008-03-30 20:47:42 +00:00
das
8bd589e900 Implement remquol() based on remquo(). 2008-03-30 20:47:26 +00:00
das
7e1a7394d9 Implement csqrtl(). 2008-03-30 20:07:15 +00:00
das
4e563b6d98 Hook hypotl() and cabsl() up to the build. 2008-03-30 20:03:46 +00:00
das
fed973ef18 Document hypotl().
Submitted by:	Steve Kargl <sgk@troutmask.apl.washington.edu>
2008-03-30 20:03:29 +00:00
das
ca69e4f334 Alias hypotl() and cabsl() for platforms where long double is the same
as double.
2008-03-30 20:03:06 +00:00
das
b48d845e62 Implement cabsl() in terms of hypotl().
Submitted by:	Steve Kargl <sgk@troutmask.apl.washington.edu>
2008-03-30 20:02:03 +00:00
das
927d9b1d58 Implement hypotl(). This is bde's conversion of fdlibm hypot(), with minor
fixes for ld128 by me.
2008-03-30 20:01:50 +00:00
bde
2916ad3e28 Use fabs[f]() instead of bit fiddling for setting absolute values.
This makes little difference in float precision, but in double
precision gives a speedup of about 30% on amd64 (A64 CPU) and i386
(A64).  This depends on fabs[f]() being inline and efficient.  The
bit fiddling (or any use of SET_HIGH_WORD(), which libm does too
much because it was best on old 32-bit machines) always causes
packing overheads and sometimes causes stalls in the packing, since
it operates on only part of a variable in the double precision case.
It apparently did cause stalls in a critical path here.
2008-03-30 18:07:12 +00:00
bde
b06e3a074e Use the expression fabs(x+0.0)-fabs(y+0.0) instead of
fabs(x+0.0)+fabs(y+0.0) when mixing NaNs.  This improves
consistency of the result by making it harder for the compiler to reorder
the operands.  (FP addition is not necessarily commutative because the
order of operands makes a difference on some machines iff the operands are
both NaNs.)
2008-03-30 17:28:27 +00:00
bde
95436ce20d Fix a missing mask in a hi+lo decomposition. Thus bug made the extra
precision in software useless, so hypotf() had some errors in the 1-2
ulp range unless there is extra precision in hardware (as happens on
i386).
2008-03-30 17:17:42 +00:00
dfr
256e041f3b Don't call xdrrec_skiprecord in the non-blocking case. If
__xdrrec_getrec has returned TRUE, then we have a complete request in
the buffer - calling xdrrec_skiprecord is not necessary. In particular,
if there is another record already buffered on the stream,
xdrrec_skiprecord will discard both this request and the next
one, causing the call to xdr_callmsg to fail and the stream to be
closed.

Sponsored by:	Isilon Systems
2008-03-30 09:36:17 +00:00
dfr
6eacca7a06 Don't assume that there is readable data on the stream after the
fragment header.
2008-03-30 09:35:04 +00:00
ru
0f0375e36a Remove options MK_LIBKSE and DEFAULT_THREAD_LIB now that we no longer
build libkse.  This should fix WITHOUT_LIBTHR builds as a side effect.
2008-03-29 17:44:40 +00:00
das
ff9d959b80 Include math.h for the fmaf() prototype. 2008-03-29 16:38:29 +00:00
das
a1fc7d5578 Fix some rather obscene code that has ambiguous if...if...else...
constructs in it.
2008-03-29 16:37:59 +00:00
das
ede66e5daf Document modff() and modfl(). Technically, modff() and modfl()
live in libm, while modf() lives in libc due to historical
mistakes. I'm claiming in the manpage that they all live in libm,
since programmers should not rely on the mistake.
2008-03-29 16:19:35 +00:00
jeff
bcb0495c0e - Add a man page for cpuset_getaffinity() and cpuset_setaffinity() and
hook it up to the build.

Reviewed by:	brueffer (skeleton and formatting assistance)
2008-03-29 10:26:29 +00:00
jeff
7ab14d8a96 - Add a man page for cpuset(), cpuset_setid(), and cpuset_getid() and hook
it up to the build.

Reviewed by:	brueffer (skeleton and formatting assistance)
2008-03-29 10:06:30 +00:00
ps
41d5b26ff8 Add support to mincore for detecting whether a page is part of a
"super" page or not.

Reviewed by:	alc, ups
2008-03-28 04:29:27 +00:00
ru
dd08a4dd1f Removed no longer existing CTL_MACHDEP defines.
Inspired by:	phk
2008-03-26 23:02:17 +00:00
dfr
79d2dfdaa6 Add the new kernel-mode NFS Lock Manager. To use it instead of the
user-mode lock manager, build a kernel with the NFSLOCKD option and
add '-k' to 'rpc_lockd_flags' in rc.conf.

Highlights include:

* Thread-safe kernel RPC client - many threads can use the same RPC
  client handle safely with replies being de-multiplexed at the socket
  upcall (typically driven directly by the NIC interrupt) and handed
  off to whichever thread matches the reply. For UDP sockets, many RPC
  clients can share the same socket. This allows the use of a single
  privileged UDP port number to talk to an arbitrary number of remote
  hosts.

* Single-threaded kernel RPC server. Adding support for multi-threaded
  server would be relatively straightforward and would follow
  approximately the Solaris KPI. A single thread should be sufficient
  for the NLM since it should rarely block in normal operation.

* Kernel mode NLM server supporting cancel requests and granted
  callbacks. I've tested the NLM server reasonably extensively - it
  passes both my own tests and the NFS Connectathon locking tests
  running on Solaris, Mac OS X and Ubuntu Linux.

* Userland NLM client supported. While the NLM server doesn't have
  support for the local NFS client's locking needs, it does have to
  field async replies and granted callbacks from remote NLMs that the
  local client has contacted. We relay these replies to the userland
  rpc.lockd over a local domain RPC socket.

* Robust deadlock detection for the local lock manager. In particular
  it will detect deadlocks caused by a lock request that covers more
  than one blocking request. As required by the NLM protocol, all
  deadlock detection happens synchronously - a user is guaranteed that
  if a lock request isn't rejected immediately, the lock will
  eventually be granted. The old system allowed for a 'deferred
  deadlock' condition where a blocked lock request could wake up and
  find that some other deadlock-causing lock owner had beaten them to
  the lock.

* Since both local and remote locks are managed by the same kernel
  locking code, local and remote processes can safely use file locks
  for mutual exclusion. Local processes have no fairness advantage
  compared to remote processes when contending to lock a region that
  has just been unlocked - the local lock manager enforces a strict
  first-come first-served model for both local and remote lockers.

Sponsored by:	Isilon Systems
PR:		95247 107555 115524 116679
MFC after:	2 weeks
2008-03-26 15:23:12 +00:00
brueffer
b64d211df2 Fix some "in in" typos in comments.
PR:		121490
Submitted by:	Anatoly Borodin <anatoly.borodin@gmail.com>
Approved by:	rwatson (mentor), jkoshy
MFC after:	3 days
2008-03-26 07:32:08 +00:00
ru
5398c833ef Compile libthr with warnings.
(Somehow this file sneaked from initial commit.)
2008-03-25 15:33:00 +00:00
ru
e8df07e5aa Compile libthr with warnings. 2008-03-25 13:28:12 +00:00
ru
e2f2131976 Fixed mis-implementation of pthread_mutex_get{spin,yield}loops_np().
Reviewed by:	davidxu
2008-03-25 09:48:10 +00:00
jeff
e1e2efa7be - Restore kse.h in this directory so other tools don't find it by mistake.
- Restore the ability to debug kse coredumps in 8.0.

Suggested by:	marcel
2008-03-23 09:38:11 +00:00
davidxu
1423f22a4c Add POSIX pthread API pthread_getcpuclockid() to get a thread's cpu
time clock id.
2008-03-22 09:59:20 +00:00
davidxu
3367c267b3 Use linker set to collection all target operations. 2008-03-22 05:40:44 +00:00
kaiw
2030a16c64 Add MLINK for archive_write_close.
Approved by:	jkoshy(mentor), kientzle
2008-03-21 11:10:20 +00:00
davidxu
8420f5505a Resolve __error()'s PLT early so that it needs not to be resolved again,
otherwise rwlock is recursivly called when signal happens and the __error
was never resolved before.
2008-03-21 02:31:55 +00:00
ru
735c62ce43 pthread_mutexattr_destroy() was accidentally broken in last revision,
unbreak it.  We should really start compiling this with warnings.
2008-03-20 11:47:08 +00:00
des
0b6cf6e8e4 s/wait/delta/ to avoid namespace collision.
MFC after:	2 weeks
2008-03-20 09:55:27 +00:00
davidxu
8326c06222 Preserve application code's errno in rtld locking code, it attemps to keep
any case safe.
2008-03-20 09:35:44 +00:00
davidxu
1bc96ca9dc Make pthread_mutexattr_settype to return error number directly and
conformant to POSIX specification.

Bug reported by: modelnine at modelnine dt org
2008-03-20 08:27:14 +00:00
davidxu
5e11ba1ac4 don't reduce new thread's refcount if current thread can not set cpuset
for it, since the new thread will reduce it by itself.
2008-03-19 09:33:07 +00:00
davidxu
a6c50de140 - Trim trailing spaces.
- Use a different sigmask variable name to avoid confusing.
2008-03-19 08:13:04 +00:00
davidxu
e846a1612b if passed thread pointer is equal to current thread, pass -1 to kernel
to speed up searching.
2008-03-19 06:38:21 +00:00
jkoshy
f8600f40e7 Ensure that the section header table is written out in an order
consistent with the section indices returned to the application by
elf_ndxscn().

Submitted by:		kaiw
2008-03-19 06:06:34 +00:00
jkoshy
af7049a2cd Clarify that the ELF library only sets the sh_entsize field of a
section header entry if the application is not taking charge of ELF
object layout.

Update (c) years, and bump the manual page's date.

Submitted by:		kaiw
2008-03-19 05:07:49 +00:00
emax
7704e14e90 Add mandatory "security description" SDP parameter to the PANU profile
Pointed-out by:	Iain Hibbert < plunky at rya-online dot net >
MFC after:	3 days
2008-03-19 00:06:30 +00:00
emax
a55b468d9c Add PSM and Load Factor SDP parameters to the BNEP based profiles
(NAP, GN and PANU). No reason to not to support them.

Separate SDP parameters data structures for the BNEP based profiles.

Generalize Service Availability SDP parameter creation.

Requested by:	Iain Hibbert < plunky at rya-online dot net >
MFC after:	3 days
2008-03-18 18:21:39 +00:00
davidxu
9dbfb036ab - Copy signal mask out before THR_UNLOCK(), because THR_UNLOCK() may call
_thr_suspend_check() which messes sigmask saved in thread structure.
- Don't suspend a thread has force_exit set.
- In pthread_exit(), if there is a suspension flag set, wake up waiting-
  thread after setting PS_DEAD, this causes waiting-thread to break loop
  in suspend_common().
2008-03-18 02:06:51 +00:00
antoine
73a86dc575 Don't allocate the constant array "props" on the stack in wctype.
PR:		74743
Submitted by:	knut st. osmundsen
Approved by:	rwatson (mentor)
MFC after:	1 month
2008-03-17 18:22:23 +00:00
das
6f407f2920 scandir(3) previously used st_size to obtain an initial estimate
of the array length needed to store all the directory entries.
Although BSD has historically guaranteed that st_size is the size
of the directory file, POSIX does not, and more to the point, some
recent filesystems such as ZFS use st_size to mean something else.

The fix is to not stat the directory at all, set the initial
array size to 32 entries, and realloc it in powers of 2 if that
proves insufficient.

PR:	113668
2008-03-16 19:08:53 +00:00
davidxu
3da38e35b4 Actually delete SIGCANCEL mask for suspended thread, so the signal will not
be masked when it is resumed.
2008-03-16 03:22:38 +00:00
kientzle
b9046423ec Update a comment: the format bid only runs once per archive; it no
longer runs once per entry.
2008-03-15 11:09:16 +00:00
kientzle
c37b281ebc Free up the entry objects allocated during this test. 2008-03-15 11:06:15 +00:00
kientzle
ee09b4a896 Release the buffers used for exercising the compress code. 2008-03-15 11:05:49 +00:00
kientzle
c7c16fc8e2 Remove the duplicate "archive_format" and "archive_format_name" fields
from the private archive_write structure and fix up all writers to use
the format fields in the base "archive" structure.  This error made it
impossible to query the format after setting up a writer because the
write format was stored in an inaccessible place.
2008-03-15 11:04:45 +00:00
kientzle
7ff9ac24d8 Correct a sign mismatch that only showed up on 64-bit systems.
Pointy hat: me
2008-03-15 11:02:47 +00:00
kientzle
3f30727604 Refactor the mtree code a bit to make the layering clearer: Each
"file" is described by multiple "lines" each possibly containing
multiple "keywords."  Incorporate some additions from Joerg Sonnenberger
to handle linked files and correctly deal with backing files on disk.
2008-03-15 07:10:24 +00:00
kientzle
1e4445bea8 FreeBSD does have fstat().
Correct the nasty typo this uncovers.
2008-03-15 04:20:50 +00:00
kientzle
74f604455f Testability is more important than standards conformance.
Disable the use of PaxHeader.<pid> for the fake pax extension pathname
until I can make the name here settable.  Otherwise, tests that try
to compare output to static pre-generated reference files break.
2008-03-15 03:49:18 +00:00
kientzle
4c7ea30e33 Ignore a few more common files. 2008-03-15 02:31:28 +00:00
kientzle
f863f57f14 Resolve a minor nit in SUS compliance by including the PID in the
fake directory name used for pax extended headers.
2008-03-15 02:30:42 +00:00
kientzle
c6463b768d GC a reference to the defunct TESTFILES variable. 2008-03-15 02:22:08 +00:00
kientzle
4f3f5f46dd A subtle point: "pax interchange format" mandates that all strings
(including pathname, gname, uname) be stored in UTF-8.  This usually
doesn't cause problems on FreeBSD because the "C" locale on FreeBSD
can convert any byte to Unicode/wchar_t and from there to UTF-8.  In
other locales (including the "C" locale on Linux which is really
ASCII), you can get into trouble with pathnames that cannot be
converted to UTF-8.

Libarchive's pax writer truncated pathnames and other strings at the
first nonconvertible character.  (ouch!)  Other archivers have worked
around this by storing unconvertible pathnames as raw binary, a
practice which has been sanctioned by the Austin group.  However,
libarchive's pax reader would segfault reading headers that weren't
proper UTF-8.  (ouch!)  Since bsdtar defaults to pax format, this
affects bsdtar rather heavily.

To correctly support the new "hdrcharset" header that is going into
SUS and to handle conversion failures in general, libarchive's pax reader
and writer have been overhauled fairly extensively.  They used to do
most of the pax header processing using wchar_t (Unicode); they now do
most of it using char so that common logic applies to either UTF-8 or
"binary" strings.

As a bonus, a number of extraneous conversions to/from wchar_t have
been eliminated, which should speed things up just a tad.

Thanks to: Bjoern Jacke for originally reporting this to me
Thanks to: Joerg Sonnenberger for noting a bad typo in my first draft of this
Thanks to: Gunnar Ritter for getting the standard fixed
MFC after: 5 days
2008-03-15 01:43:59 +00:00
kientzle
2e83d280ed Ignore some built files. 2008-03-15 00:52:22 +00:00
kientzle
f5ba8800a4 Don't lie. If a string can't be converted to a wide (Unicode) string,
return a NULL instead of an incomplete string.  Expand the test coverage
to verify the correct behavior here.
2008-03-14 23:19:46 +00:00
kientzle
794d55fb64 Don't advertise the default block size as a constant; don't
rely on a deprecated value to set the default.  This is also
related to a longer-term goal of setting the default block
size based on format and possibly other factors, which makes
it a bad idea to tie this to a published constant.
2008-03-14 23:09:02 +00:00
kientzle
f9f0592c21 New public functions archive_entry_copy_link() and archive_entry_copy_link_w()
override the currently set link value, whether that's a hardlink
or a symlink.  Plus documentation update and tests.
2008-03-14 23:00:53 +00:00
kientzle
8229d34b1a Update some comments, comment out argument names to guard against
namespace problems.
2008-03-14 22:47:38 +00:00
kientzle
baaf82407d Since "length" computes the length of a string and is used as an
argument to malloc(3), it should be size_t, not int.
2008-03-14 22:44:07 +00:00
kientzle
205982039c Let archive_entry_clear() accept a NULL pointer and simply do nothing.
In particular, this allows archive_entry_free() to work correctly
for a NULL pointer, which makes it parallel with free(3).
2008-03-14 22:40:36 +00:00
kientzle
474ea5abd8 Rework the versioning implementation and test to match the
new interface.  Mark the functions that are going away in
libarchive 3.0.

In particular, archive_version_string() now computes the
string rather than assuming that it will be created by the
build infrastructure.  Eventually, this will allow some
simplification of the build infrastructure.
2008-03-14 22:31:57 +00:00
kientzle
495bb86033 Rework the versioning information, hopefully for the last time.
* There are now only two public version identifiers:  "number" is
   a single integer that combines Major/minor/release in a single
   value of the form Mmmmrrr.  This is easy to compare against for
   checking feature support.  "string" is a displayable text string
   of the form "libarchive M.mm.rr".
 * The number is present both as a macro (version of the installed header)
   and a function (version of the shared library).  The string form
   is available only as a function.
 * Retain the older version definitions for now, but mark them all
   as deprecated, to disappear in libarchive 3.0 (whenever that happens).
 * Rework the various deprecation conditionals to use ARCHIVE_VERSION_NUMBER.

An ancillary goal is to reduce the number of @...@ substitutions that
are required.  Someday, I might even be able to avoid build-time
processing of archive.h entirely.
2008-03-14 22:19:50 +00:00
kientzle
5e403ce978 Add a useful sprintf()-style wrapper around
archive_string_vsprintf().  (Which is built
on top of libarchive's internal resizable string
support.)
2008-03-14 22:00:09 +00:00
kientzle
0614ffca95 Support for writing 'compress' format, thanks to Joerg Sonnenberger. 2008-03-14 20:35:38 +00:00
kientzle
98633157bb A block in a tar file is 512 bytes. Period.
Remove the entirely pointless symbolic constant
and sizeof(unsigned char).  (The constant
here is doubly wrong, since not only does
it obscure a basic format constant, it was
never intended to be a tar-specific value,
so could conceivably be changed at some point
in the future.)
2008-03-14 20:32:20 +00:00
jkoshy
d51a5310b2 - Document Pentium and Pentium MMX events.
- Update (c) years and the manual page's date.
2008-03-14 06:22:03 +00:00
ru
5fad0ab914 Fix bugs in previous revision (missing comma, misspelled syscall name). 2008-03-13 10:33:24 +00:00
ru
346fbfb32e Remove trailing whitespace. 2008-03-13 10:26:17 +00:00
ru
0fbb165835 Add missing section number. 2008-03-13 10:25:30 +00:00
davidxu
f4d90c5978 In file sem_timewait.3, remove reference to SYSV semphore in SEE ALSO
section, sync it with sem_wait.3.
2008-03-13 01:53:28 +00:00
kaiw
5b0dd8ae24 Current 'ar' read support in libarchive can only handle a GNU/SVR4
filename table whose size is less than 65536 bytes.

The original intention was to not consume the filename table, so the
client will have a chance to look at it. To achieve that, the library
call decompressor->read_ahead to read(look ahead) but do not call
decompressor->consume to consume the data, thus a limit was raised
since read_ahead call can only look ahead at most BUFFER_SIZE(65536)
bytes at the moment, and you can not "look any further" before you
consume what you already "saw".

This commit will turn GNU/SVR4 filename table into "archive format
data", i.e., filename table will be consumed by libarchive, so the
65536-bytes limit will be gone, but client can no longer have access
to the content of filename table.

'ar' support test suite is changed accordingly. BSD ar(1) is not
affected by this change since it doesn't look at the filename table.

Reported by:	erwin
Discussed with:	jkoshy, kientzle
Reviewed by:	jkoshy, kientzle
Approved by:	jkoshy(mentor), kientzle
2008-03-12 21:10:26 +00:00
jkoshy
9efc2e038a Bring the behaviour of pmc_capabilities() and pmc_width() in line with
documentation: set 'errno' and return -1 in case of an error.

Update (c) years.
2008-03-12 15:51:32 +00:00
jkoshy
dc87dccab6 Describe return values from pmc_ncpu() and pmc_npmc() better. 2008-03-12 15:48:59 +00:00
piso
0c792dea70 -Don't pass down the entire pkt to ProtoAliasIn, ProtoAliasOut, FragmentIn
and FragmentOut.
-Axe the old PacketAlias API: it has been deprecated since 5.x.
2008-03-12 11:58:29 +00:00
jeff
9d33d28fb7 - Remove kse syscall symbols and man pages. 2008-03-12 10:12:22 +00:00
jeff
d98a1ab70e - Don't inspect the P_SA flag. It's being removed. 2008-03-12 10:00:33 +00:00
jeff
0409b8b057 - Remove libkse and related support code in libpthread from the build.
Don't remove the files yet.  Kernel support will be removed shortly.
2008-03-12 09:49:39 +00:00
kientzle
ce12a09ced Portability: Eliminate the need for uudecode by incorporating
uudecode into the main test driver and invoking it just-in-time
within the various tests.

Also, incorporate a number of improvements to the main test support
code that have proven useful on other projects where I've used this
framework.
2008-03-12 05:12:23 +00:00
kientzle
84e4492c0b Remove some unused fields from the private archive_read structure
(left over from when the unified read/write structure was copied
to form separate read and write structures) and eliminate the
pointless initialization of a couple of the unused fields.
2008-03-12 04:58:32 +00:00
kientzle
9f91e7f3cf Tighten up the semantics of acl_next() and xattr_next() when you
hit the end of the ACL or xattr list.

Thanks to: Jeff Johnson for pointing out the obvious typo
2008-03-12 04:47:37 +00:00
kientzle
06452801dd Typo, thanks to: Jeff Johnson.
MFC after: 3 days
2008-03-12 04:26:44 +00:00
davidxu
f2039f468f Add missing comma. 2008-03-12 02:37:31 +00:00
davidxu
20682c9e8d Add manual for function sem_timedwait().
Reviewed by: ru, deischen
2008-03-12 02:33:17 +00:00
davidxu
19ffe8108d If a thread is cancelled, it may have already consumed a umtx_wake,
check waiter and semphore counter to see if we may wake up next thread.
2008-03-11 03:26:47 +00:00
emax
e644cad509 Add structures to hold SDP parameters for the NAP, GN and PANU profiles.
It should be mentioned that a somewhat similar patch was submitted by
Rako < rako29 at gmail dot com >

MFC after:	1 week
2008-03-11 00:08:40 +00:00
jkoshy
7fcad1def3 Use .Fo/.Fc and .Xo/.Xc to bring the line widths below 79 columns.
Correct a typo [a misplaced comma].

Reviewed by:		ru
2008-03-10 14:45:29 +00:00
jkoshy
3064e5aa7b Use .Fo/.Fc and .Xo/.Xc to bring the line widths below 79 columns.
Reviewed by:		ru
2008-03-10 14:44:41 +00:00
rwatson
72cc21ea73 Add reference to kldunloadf system call, which was previously not
mentioned in the kldunload(2) man page.

MFC after:	3 days
Spotted by:	rink
2008-03-10 09:54:13 +00:00
antoine
514f31f40e Introduce a new F_DUP2FD command to fcntl(2), for compatibility with
Solaris and AIX.
fcntl(fd, F_DUP2FD, arg) and dup2(fd, arg) are functionnaly equivalent.
Document it.
Add some regression tests (identical to the dup2(2) regression tests).

PR:		120233
Submitted by:	Jukka Ukkonen
Approved by:	rwaston (mentor)
MFC after:	1 month
2008-03-08 22:02:21 +00:00
antoine
ea3f3b4bc0 Merge changes from NetBSD on humanize_number.c, 1.8 -> 1.13
Significant changes:
- rev. 1.11: Use PRId64 instead of a cast to long long and %lld to print
an int64_t.
- rev. 1.12: Fix a bug that humanize_number() produces "1000" where it
should be "1.0G" or "1.0M".  The bug reported by Greg Troxel.

PR:		118461
PR:		102694
Approved by:	rwatson (mentor)
Obtained from:	NetBSD
MFC after:	1 month
2008-03-08 21:55:59 +00:00
jasone
423cb10cb4 Remove stale #include <machine/atomic.h>, which as needed by lazy
deallocation.
2008-03-07 16:54:03 +00:00
rwatson
360d527360 Add __FBSDID() tags.
MFC after:	3 days
2008-03-07 15:25:56 +00:00
davidxu
cd00bbaa4b Fix a bug when calculating remnant size. 2008-03-06 03:24:03 +00:00
davidxu
6b41341850 Don't report death event to debugger if it is a forced exit. 2008-03-06 02:07:18 +00:00
davidxu
5f0ffdddf1 Restore code setting new thread's scheduler parameters, I was thinking
that there might be starvations, but because we have already locked the
thread, the cpuset settings will always be done before the new thread
does real-world work.
2008-03-06 01:59:08 +00:00
davidxu
1efa6566c1 Increase and decrease in_sigcancel_handler accordingly to avoid possible
error caused by nested SIGCANCEL stack, it is a bit complex.
2008-03-05 07:04:55 +00:00
davidxu
d6c532fa79 Use cpuset defined in pthread_attr for newly created thread, for now,
we set scheduling parameters and cpu binding fully in userland, and
because default scheduling policy is SCHED_RR (time-sharing), we set
default sched_inherit to PTHREAD_SCHED_INHERIT, this saves a system
call.
2008-03-05 07:01:20 +00:00
davidxu
b118d117f4 Add more cpu affinity function's symbols. 2008-03-05 06:56:35 +00:00
davidxu
adf8d28a8f Check actual size of cpuset kernel is using and define underscore version
of API.
2008-03-05 06:55:48 +00:00
davidxu
8ead1ed2f9 If a new thread is created, it inherits current thread's signal masks,
however if current thread is executing cancellation handler, signal
SIGCANCEL may have already been blocked, this is unexpected, unblock the
signal in new thread if this happens.

MFC after: 1 week
2008-03-04 04:28:59 +00:00
davidxu
e0d98325b4 Include cpuset.h, unbreak compiling. 2008-03-04 03:45:11 +00:00
davidxu
7046a9b037 implement pthread_attr_getaffinity_np and pthread_attr_setaffinity_np. 2008-03-04 03:03:24 +00:00
davidxu
f88b971008 Implement functions pthread_getaffinity_np and pthread_setaffinity_np to
get and set thread's cpu affinity mask.
2008-03-03 09:16:29 +00:00
jkoshy
35a6571045 - Fix an off-by-one bug in _libelf_insert_section(). [1]
- Update (c) years.

Submitted by:		kaiw [1]
2008-03-03 04:29:25 +00:00
das
245318776a 1 << 47 needs to be written 1ULL << 47. 2008-03-02 20:16:55 +00:00
jeff
694203dedd Add cpuset, an api for thread to cpu binding and cpu resource grouping
and assignment.
 - Add a reference to a struct cpuset in each thread that is inherited from
   the thread that created it.
 - Release the reference when the thread is destroyed.
 - Add prototypes for syscalls and macros for manipulating cpusets in
   sys/cpuset.h
 - Add syscalls to create, get, and set new numbered cpusets:
   cpuset(), cpuset_{get,set}id()
 - Add syscalls for getting and setting affinity masks for cpusets or
   individual threads: cpuid_{get,set}affinity()
 - Add types for the 'level' and 'which' parameters for the cpuset.  This
   will permit expansion of the api to cover cpu masks for other objects
   identifiable with an id_t integer.  For example, IRQs and Jails may be
   coming soon.
 - The root set 0 contains all valid cpus.  All thread initially belong to
   cpuset 1.  This permits migrating all threads off of certain cpus to
   reserve them for special applications.

Sponsored by:	Nokia
Discussed with:	arch, rwatson, brooks, davidxu, deischen
Reviewed by:	antoine
2008-03-02 07:39:22 +00:00
jkoshy
380ba89956 Translate the r_info field of ELF relocation records when converting
between 64 and 32 bit variants.

Submitted by:		kaiw
2008-03-02 06:33:10 +00:00
das
635be49304 Hook up sqrtl() to the build. 2008-03-02 01:48:17 +00:00
das
09521f824a MD implementations of sqrtl(). 2008-03-02 01:48:08 +00:00
das
40c2687372 MI implementation of sqrtl(). This is very slow and should
be overridden when hardware sqrt is available.
2008-03-02 01:47:58 +00:00
philip
a72a71deeb Use the easily-greppable copyright notice template from
src/share/examples/mdoc/POSIX-copyright.

Requested by:	ru
2008-02-29 17:48:25 +00:00
bde
d32d47f4d6 Fix and improve some magic numbers for the "medium size" case.
e_rem_pio2.c:
This case goes up to about 2**20pi/2, but the comment about it said that
it goes up to about 2**19pi/2.

It went too far above 2**pi/2, giving a multiplier fn with 21 significant
bits in some cases.  This would be harmful except for a numerical
accident.  It happens that the terms of the approximation to pi/2,
when rounded to 33 bits so that multiplications by 20-bit fn's are
exact, happen to be rounded to 32 bits so multiplications by 21-bit
fn's are exact too, so the bug only complicates the error analysis (we
might lose a bit of accuracy but have bits to spare).

e_rem_pio2f.c:
The bogus comment in e_rem_pio2.c was copied and the code was changed
to be bug-for-bug compatible with it, except the limit was made 90
ulps smaller than necessary.  The approximation to pi/2 was not
modified except for discarding some of it.

The same rough error analysis that justifies the limit of 2**20pi/2
for double precision only justifies a limit of 2**18pi/2 for float
precision.  We depended on exhaustive testing to check the magic numbers
for float precision.  More exaustive testing shows that we can go up
to 2**28pi/2 using a 53+25 bit approximation to pi/2 for float precision,
with a the maximum error for cosf() and sinf() unchanged at 0.5009
ulps despite the maximum error in rem_pio2f being ~0.25 ulps.  Implement
this.
2008-02-28 16:22:36 +00:00
scf
7ee4756ce9 Replace the use of warnx() with direct output to stderr using _write().
This reduces the size of a statically-linked binary by approximately 100KB
in a trivial "return (0)" test application.  readelf -S was used to verify
that the .text section was reduced and that using strlen() saved a few
more bytes over using sizeof().  Since the section of code is only called
when environ is corrupt (program bug), I went with fewer bytes over fewer
cycles.

I made minor edits to the submitted patch to make the output resemble
warnx().

Submitted by:	kib bz
Approved by:	wes (mentor)
MFC after:	5 days
2008-02-28 04:09:08 +00:00
jhb
8ee71003bd Add <limits.h> for SHRT_MAX.
Pointy hat to:	jhb
2008-02-27 21:25:19 +00:00
jhb
4c65fa8afd File descriptors are an int, but our stdio FILE object uses a short to hold
them.  Thus, any fd whose value is greater than SHRT_MAX is handled
incorrectly (the short value is sign-extended when converted to an int).
An unpleasant side effect is that if fopen() opens a file and gets a
backing fd that is greater than SHRT_MAX, fclose() will fail and the file
descriptor will be leaked.  Better handle this by fixing fopen(), fdopen(),
and freopen() to fail attempts to use a fd greater than SHRT_MAX with
EMFILE.

At some point in the future we should look at expanding the file descriptor
in FILE to an int, but that is a bit complicated due to ABI issues.

MFC after:	1 week
Discussed on:	arch
Reviewed by:	wollman
2008-02-27 19:02:02 +00:00
kientzle
a898e5bef8 Spelling correction, thanks to Joerg Sonnenberger. 2008-02-27 06:16:41 +00:00
kientzle
13c4f20c01 Optimize skipping over Zip entries.
Thanks to: Dan Nelson, who sent me the patch
MFC after: 7 days
2008-02-27 06:05:59 +00:00
wollman
e043fbfcde stdio is currently limited to file descriptors not greater than
{SHRT_MAX}, so {STREAM_MAX} should be no greater than that.  (This
does not exactly meet the letter of POSIX but comes reasonably close
to it in spirit.)

MFC after:	14 days
2008-02-27 05:56:57 +00:00
ru
f12be23c59 Added the "restrict" type-qualifier to the readlink() prototype. 2008-02-26 20:33:52 +00:00
kientzle
8d5e1fcfc0 Rename the archive_endian.h functions to avoid name clashes
with NetBSD's sys/endian.h file.

Pointed out by: Joerg Sonnenberger
2008-02-26 07:17:47 +00:00
bde
f77d7dfd70 Inline __ieee754__rem_pio2f(). On amd64 (A64) and i386 (A64), this
gives an average speedup of about 12 cycles or 17% for
9pi/4 < |x| <= 2**19pi/2 and a smaller speedup for larger x, and a
small speeddown for |x| <= 9pi/4 (only 1-2 cycles average, but that
is 4%).

Inlining this is less likely to bust caches than inlining the float
version since it is much smaller (about 220 bytes text and rodata) and
has many fewer branches.  However, the float version was already large
due to its manual inlining of the branches and also the polynomial
evaluations.
2008-02-25 22:19:17 +00:00
bde
49cb35343e Use a temporary array instead of the arg array y[] for calling
__kernel_rem_pio2().  This simplifies analysis of aliasing and thus
results in better code for the usual case where __kernel_rem_pio2()
is not called.  In particular, when __ieee854_rem_pio2[f]() is inlined,
it normally results in y[] being returned in registers.  I couldn't
get this to work using the restrict qualifier.

In float precision, this saves 2-3% in most cases on amd64 and i386
(A64) despite it not being inlined in float precision yet.  In double
precision, this has high variance, with an average gain of 2% for
amd64 and 0.7% for i386 (but a much larger gain for usual cases) and
some losses.
2008-02-25 18:28:58 +00:00
bde
83268c5f08 Change __ieee754_rem_pio2f() to return double instead of float so that
this function and its callers cosf(), sinf() and tanf() don't waste time
converting values from doubles to floats and back for |x| > 9pi/4.
All these functions were optimized a few years ago to mostly use doubles
internally and across the __kernel*() interfaces but not across the
__ieee754_rem_pio2f() interface.

This saves about 40 cycles in cosf(), sinf() and tanf() for |x| > 9pi/4
on amd64 (A64), and about 20 cycles on i386 (A64) (except for cosf()
and sinf() in the upper range).  40 cycles is about 35% for |x| < 9pi/4
<= 2**19pi/2 and about 5% for |x| > 2**19pi/2.  The saving is much
larger on amd64 than on i386 since the conversions are not easy to
optimize except on i386 where some of them are automatic and others
are optimized invalidly.  amd64 is still about 10% slower in cosf()
and tanf() in the lower range due to conversion overhead.

This also gives a tiny speedup for |x| <= 9pi/4 on amd64 (by simplifying
the code).  It also avoids compiler bugs and/or additional slowness
in the conversions on (not yet supported) machines where double_t !=
double.
2008-02-25 13:33:20 +00:00
brueffer
bcb6adff03 Add missing words.
MFC after:	3 days
2008-02-25 13:03:18 +00:00
bde
ce9e405fdb Fix some off-by-1 errors.
e_rem_pio2.c:
Float and double precision didn't work because init_jk[] was 1 too small.
It needs to be 2 larger than you might expect, and 1 larger than it was
for these precisions, since its test for recomputing needs a margin of
47 bits (almost 2 24-bit units).

init_jk[] seems to be barely enough for extended and quad precisions.
This hasn't been completely verified.  Callers now get about 24 bits
of extra precision for float, and about 19 for double, but only about
8 for extended and quad.  8 is not enough for callers that want to
produce extra-precision results, but current callers have rounding
errors of at least 0.8 ulps, so another 1/2**8 ulps of error from the
reduction won't affect them much.

Add a comment about some of the magic for init_jk[].

e_rem_pio2.c:
Double precision worked in practice because of a compensating off-by-1
error here.  Extended precision was asked for, and it executed exactly
the same code as the unbroken double precision.

e_rem_pio2f.c:
Float precision worked in practice because of a compensating off-by-1
error here.  Double precision was asked for, and was almost needed,
since the cosf() and sinf() callers want to produce extra-precision
results, at least internally so that their error is only 0.5009 ulps.
However, the extra precision provided by unbroken float precision is
enough, and the double-precision code has extra overheads, so the
off-by-1 error cost about 5% in efficiency on amd64 and i386.
2008-02-25 11:43:20 +00:00
raj
69575dab52 Let PowerPC world optionally build with -msoft-float. For FPU-less PowerPC
variations (e500 currently), this provides a gcc-level FPU emulation and is an
alternative approach to the recently introduced kernel-level emulation
(FPU_EMU).

Approved by:	cognet (mentor)
MFp4:		e500
2008-02-24 19:22:53 +00:00
bde
09a79b45a1 Optimize the 9pi/2 < |x| <= 2**19pi/2 case some more by avoiding an
fabs(), a conditional branch, and sign adjustments of 3 variables for
x < 0 when the branch is taken.  In double precision, even when the
branch is perfectly predicted, this saves about 10 cycles or 10% on
amd64 (A64) and i386 (A64) for the negative half of the range, but
makes little difference for the positive half of the range.  In float
precision, it also saves about 4 cycles for the positive half of the
range on i386, and many more cycles in both halves on amd64 (28 in the
negative half and 11 in the positive half for tanf), but the amd64
times for float precision are anomalously slow so the larger
improvement is only a side effect.

Previous commits arranged for the x < 0 case to be handled simply:
- one part of the rounding method uses the magic number 0x1.8p52
  instead of the usual 0x1.0p52.  The latter is required for large |x|,
  but it doesn't work for negative x and we don't need it for large |x|.
- another part of the rounding method no longer needs to add `half'.
  It would have needed to add -half for negative x.
- removing the "quick check no cancellation" in the double precision
  case removed the need to take the absolute value of the quadrant
  number.

Add my noncopyright in e_rem_pio2.c
2008-02-23 12:53:21 +00:00
bde
26ba55ab66 Avoid using FP-to-integer conversion for !(amd64 || i386) too. Use the
FP-to-FP method to round to an integer on all arches, and convert this
to an int using FP-to-integer conversion iff irint() is not available.
This is cleaner and works well on at least ia64, where it saves 20-30
cycles or about 10% on average for 9Pi/4 < |x| <= 32pi/2 (should be
similar up to 2**19pi/2, but I only tested the smaller range).

After the previous commit to e_rem_pio2.c removed the "quick check no
cancellation" non-optimization, the result of the FP-to-integer
conversion is not needed so early, so using irint() became a much
smaller optimization than when it was committed.

An earlier commit message said that cos, cosf, sin and sinf were equally
fast on amd64 and i386 except for cos and sin on i386.  Actually, cos
and sin on amd64 are equally fast to cosf and sinf on i386 (~88 cycles),
while cosf and sinf on amd64 are not quite equally slow to cos and sin
on i386 (average 115 cycles with more variance).
2008-02-22 18:43:23 +00:00
bde
e31bf4b688 Remove the "quick check no cancellation" optimization for
9pi/2 < |x| < 32pi/2 since it is only a small or negative optimation
and it gets in the way of further optimizations.  It did one more
branch to avoid some integer operations and to use a different
dependency on previous results.  The branches are fairly predictable
so they are usually not a problem, so whether this is a good
optimization depends mainly on the timing for the previous results,
which is very machine-dependent.  On amd64 (A64), this "optimization"
is a pessimization of about 1 cycle or 1%; on ia64, it is an
optimization of about 2 cycles or 1%; on i386 (A64), it is an
optimization of about 5 cycles or 4%; on i386 (Celeron P2) it is an
optimization of about 4 cycles or 3% for cos but a pessimization of
about 5 cycles for sin and 1 cycle for tan.  I think the new i386
(A64) slowness is due to an pipeline stall due to an avoidable
load-store mismatch (so the old timing was better), and the i386
(Celeron) variance is due to its branch predictor not being too good.
2008-02-22 17:26:24 +00:00
bde
37c23ae5ff Optimize the 9pi/2 < |x| <= 2**19pi/2 case on amd64 and i386 by avoiding
the the double to int conversion operation which is very slow on these
arches.  Assume that the current rounding mode is the default of
round-to-nearest and use rounding operations in this mode instead of
faking this mode using the round-towards-zero mode for conversion to
int.  Round the double to an integer as a double first and as an int
second since the double result is needed much earler.

Double rounding isn't a problem since we only need a rough approximation.
We didn't support other current rounding modes and produce much larger
errors than before if called in a non-default mode.

This saves an average about 10 cycles on amd64 (A64) and about 25 on
i386 (A64) for x in the above range.  In some cases the saving is over
25%.  Most cases with |x| < 1000pi now take about 88 cycles for cos
and sin (with certain CFLAGS, etc.), except on i386 where cos and sin
(but not cosf and sinf) are much slower at 111 and 121 cycles respectivly
due to the compiler only optimizing well for float precision.  A64
hardware cos and sin are slower at 105 cycles on i386 and 110 cycles
on amd64.
2008-02-22 15:55:14 +00:00
bde
af1dfd5050 Add an irint() function in inline asm for amd64 and i386. irint() is
the same as lrint() except it returns int instead of long.  Though the
extern lrint() is fairly fast on these arches, it still takes about
12 cycles longer than the inline version, and 12 cycles is a lot in
applications where [li]rint() is used to avoid slow conversions that
are only a couple of times slower.

This is only for internal use.  The libm versions of *rint*() should
also be inline, but that would take would take more header engineering.
Implementing irint() instead of lrint() also avoids a conflict with
the extern declaration of the latter.
2008-02-22 14:11:03 +00:00
bde
d3a4e4141f Optimize the conversion to bits a little (by about 11 cycles or 16%
on i386 (A64), 5 cycles on amd64 (A64), and 3 cycles on ia64).  gcc
tends to generate very bad code for accessing floating point values
as bits except when the integer accesses have the same width as the
floating point values, and direct accesses to bit-fields (as is common
only for long double precision) always gives such accesses.  Use the
expsign access method, which is good for 80-bit long doubles and
hopefully no worse for 128-bit long doubles.  Now the generated code
is less bad.  There is still unnecessary copying of the arg on amd64
and i386 and mysterious extra slowness on amd64.
2008-02-22 11:59:05 +00:00
bde
95a5ac1745 Optimize the fixup for +-0 by using better classification for this case
and by using a table lookup to avoid a branch when this case occurs.
On i386, this saves 1-4 cycles out of about 64 for non-large args.
2008-02-22 10:04:53 +00:00
bde
dc8c48731a Fix rintl() on signaling NaNs and unsupported formats. 2008-02-22 09:21:14 +00:00
das
8b6c2ddfd4 s/rcsid/__FBSDID/ 2008-02-22 02:30:36 +00:00
das
224826f963 Remove an unused variable. 2008-02-22 02:27:34 +00:00
das
d74b55ed2b Eliminate some warnings. 2008-02-22 02:26:51 +00:00
philip
9044373a13 Note, as required by our agreement with IEEE/The Open Group, that the message
queue manual pages excerpt the POSIX standard.

Spotted by:	Mindaugas Rasiukevicius <rmind -at- NetBSD.org>
Reviewed by:	imp
MFC after:	1 day
2008-02-21 19:16:57 +00:00
kientzle
40e6cafd9c Sanity-check the block size.
Thanks to: Joerg Sonnenberger
MFC after: 7 days
2008-02-21 03:21:50 +00:00
bde
f0e3007ba6 Merge cosmetic changes from e_rem_pio2.c 1.10 (convert to __FBSDID();
fix indentation and return type of __ieee754_rem_pio2()).

Remove unused variables.
2008-02-19 15:42:46 +00:00
bde
30565c600e Optimize for 3pi/4 <= |x| <= 9pi/4 in much the same way as for
pi/4 <= |x| <= 3pi/4.  Use the same branch ladder as for float precision.
Remove the optimization for |x| near pi/2 and don't do it near the
multiples of pi/2 in the newly optimized range, since it requires
fairly large code to handle only relativley few cases.  Ifdef out
optimization for |x| <= pi/4 since this case can't occur because it
is done in callers.

On amd64 (A64), for cos() and sin() with uniformly distributed args,
no cache misses, some parallelism in the caller, and good but not great
CC and CFLAGS, etc., this saves about 40 cycles or 38% in the newly
optimized range, or about 27% on average across the range |x| <= 2pi
(~65 cycles for most args, while the A64 hardware fcos and fsin take
~75 cycles for half the args and 125 cycles for the other half).  The
speedup for tan() is much smaller, especially relatively.  The speedup
on i386 (A64) is slightly smaller, especially relatively.  i386 is
still much slower than amd64 here (unlike in the float case where it
is slightly faster).
2008-02-19 15:30:58 +00:00
bde
e508bf1279 Rearrange the polynomial evaluation for better parallelism. This
saves an average of about 8 cycles or 5% on A64 (amd64 and i386 --
more in cycles but about the same percentage on i386, and more with
old versions of gcc) with good CFLAGS and some parallelism in the
caller.  As usual, it takes a couple more multiplications so it will
be slower on old machines.

Convert to __FBSDID().
2008-02-19 12:54:14 +00:00
kientzle
fa2b3c3128 Include O_BINARY in open() calls on platforms that support it. 2008-02-19 06:10:48 +00:00
kientzle
efdcbf021b Another tiny, tiny step towards Windows support. No, I don't plan to
ever commit the Windows support files to FreeBSD CVS.  That would just
be wrong.
2008-02-19 06:06:13 +00:00
kientzle
c47c10e462 Someday I might forgive the standards bodies for omitting timegm().
Maybe.  In the meantime, my workarounds for trying to coax UTC without
timegm() are getting uglier and uglier.  Apparently, some systems
don't support setenv()/unsetenv(), so you can't set the TZ env var and
hope thereby to coax mktime() into generating UTC.  Without that, I
don't see a really good alternative to just giving up and converting to
localtime with mktime().  (I suppose I should research the Perl library
approach for computing an inverse function to gmtime(); that might
actually be simpler than this growing list of hacks.)
2008-02-19 06:02:01 +00:00
kientzle
5b631adaa6 Simplify file type setting. 2008-02-19 05:54:24 +00:00
kientzle
c300e636ea The test_assert() function that backs my custom assert() macro
now returns a value, which supports such convenient
constructs as:
   if (assert(NULL != foo())) {
   }

Also be careful to setlocale("C") for each new test to
avoid locale pollution.

Also a couple of minor portability enhancements.
2008-02-19 05:52:30 +00:00
kientzle
88b1623cab Portability: Since the values are fixed and the symbolic names
are only present on some platforms, just use the values directly.
2008-02-19 05:49:02 +00:00
kientzle
1a2f1a0d3a Portability: Include O_BINARY if the local platform defines it. 2008-02-19 05:46:58 +00:00
kientzle
677b5b664a Correct a compile error when libbz2/zlib are unavailable. 2008-02-19 05:44:59 +00:00
kientzle
5a220e02da Mark a few additional functions that are/are not available on FreeBSD. 2008-02-19 05:40:28 +00:00
kientzle
ae947994a7 Portability improvements:
* If the platform can't restore char nodes, block nodes, or fifos,
don't try and just return error.
  * Include O_BINARY in most open() calls (define O_BINARY to 0 if the
platform doesn't provide a definition already)
  * Refactor the ownership restore to more cleanly support platforms
that don't have any form of {l,f,}chown() call.
  * Comment a lingering issue with older Unix-like systems that allow
root to hose the filesystem.  I don't (yet) have a good solution for
this, but I expect it will require adding more redundant stat()
calls. <sigh>

MFC after: 14 days
2008-02-19 05:39:35 +00:00
das
0a944b08e4 Document return values better. 2008-02-18 19:02:49 +00:00
das
11fca9d5f5 Add tgammaf() as a simple wrapper around tgamma(). 2008-02-18 17:27:11 +00:00
bde
3a3915219d 2 long double constants were missing L suffixes. This helped break tanl()
on !(amd64 || i386).  It gave slightly worse than double precision in some
cases.  tanl() now passes tests of 2^24 values on ia64.
2008-02-18 15:39:52 +00:00
bde
3fc58437c4 Fix a typo which broke k_tanl.c on !(amd64 || i386). 2008-02-18 14:09:41 +00:00
bde
ad78d66621 Inline __ieee754__rem_pio2(). With gcc4-2, this gives an average
optimization of about 10% for cos(x), sin(x) and tan(x) on
|x| < 2**19*pi/2.  We didn't do this before because __ieee754__rem_pio2()
is too large and complicated for gcc-3.3 to inline very well.  We don't
do this for float precision because it interferes with optimization
of the usual (?) case (|x| < 9pi/4) which is manually inlined for float
precision only.

This has some rough edges:
- some static data is duplicated unnecessarily.  There isn't much after
  the recent move of large tables to k_rem_pio2.c, and some static data
  is duplicated to good affect (all the data static const, so that the
  compiler can evaluate expressions like 2*pio2 at compile time and
  generate even more static data for the constant for this).
- extern inline is used (for the same reason as in previous inlining of
  k_cosf.c etc.), but C99 apparently doesn't allow extern inline
  functions with static data, and gcc will eventually warn about this.

Convert to __FBSDID().

Indent __ieee754_rem_pio2()'s declaration consistently (its style was
made inconsistent with fdlibm a while ago, so complete this).

Fix __ieee754_rem_pio2()'s return type to match its prototype.  Someone
changed too many ints to int32_t's when fixing the assumption that all
ints are int32_t's.
2008-02-18 14:02:12 +00:00
kevlo
c74ac9adc1 getopt(3) returns -1, not EOF. 2008-02-18 03:19:25 +00:00
das
2acea74331 Use volatile hacks to make sure exp() generates an underflow
exception when it's supposed to. Previously, gcc -O2 was optimizing
away the statement that generated it.
2008-02-17 21:53:19 +00:00
jasone
2bc29a1530 Fix a race condition in arena_ralloc() for shrinking in-place large
reallocation, when junk filling is enabled.  Junk filling must occur
prior to shrinking, since any deallocated trailing pages are immediately
available for use by other threads.

Reported by:	Mats Palmgren <mats.palmgren@bredband.net>
2008-02-17 18:34:17 +00:00