15260 Commits

Author SHA1 Message Date
Jeff Roberson
fc03d22b17 Refine UMA bucket allocation to reduce space consumption and improve
performance.

 - Always free to the alloc bucket if there is space.  This gives LIFO
   allocation order to improve hot-cache performance.  This also allows
   for zones with a single bucket per-cpu rather than a pair if the entire
   working set fits in one bucket.
 - Enable per-cpu caches of buckets.  To prevent recursive bucket
   allocation one bucket zone still has per-cpu caches disabled.
 - Pick the initial bucket size based on a table driven maximum size
   per-bucket rather than the number of items per-page.  This gives
   more sane initial sizes.
 - Only grow the bucket size when we face contention on the zone lock, this
   causes bucket sizes to grow more slowly.
 - Adjust the number of items per-bucket to account for the header space.
   This packs the buckets more efficiently per-page while making them
   not quite powers of two.
 - Eliminate the per-zone free bucket list.  Always return buckets back
   to the bucket zone.  This ensures that as zones grow into larger
   bucket sizes they eventually discard the smaller sizes.  It persists
   fewer buckets in the system.  The locking is slightly trickier.
 - Only switch buckets in zalloc, not zfree, this eliminates pathological
   cases where we ping-pong between two buckets.
 - Ensure that the thread that fills a new bucket gets to allocate from
   it to give a better upper bound on allocation time.

Sponsored by:	EMC / Isilon Storage Division
2013-06-18 04:50:20 +00:00
Andrew Turner
8b02079f36 Build __clear_cache on ARM with clang now it supports it. 2013-06-15 12:16:27 +00:00
Ed Schouten
2d5add2ae6 Let ARM use the custom tailored atomic intrinsics. 2013-06-15 09:04:10 +00:00
Ed Maste
580b4d185b Renumber clauses to reduce diffs to other versions
NetBSD, OpenBSD, and Android's Bionic number the clauses 1 through 3,
so follow suit to make comparison easier.
2013-06-13 00:19:30 +00:00
Jeremie Le Hen
54f82841d5 Turn libc.so into an ld script rather than a symlink pointing to the
real shared object and libssp_nonshared.a.

This was the last showstopper that prevented from enabling SSP for ports
by default.  portmgr@ performed a buildworld which showed no significant
breakage with this patch.

Details:

On i386 for PIC objects, gcc uses the __stack_chk_fail_local hidden
symbol instead of calling __stack_chk_fail directly [1].  This happen
not only with our gcc-4.2.1 but also with the latest gcc-4.8.  If you
want the very nasty details, see [2].

OTOH the problem doesn't exist on other architectures.  It also doesn't
exist with Clang as the latter will somehow manage to create the
function in the object file at compile time (contrary to only
referencing it through a symbol that will be brought in at link time).

In a perfect world, when an object file is compiled with
-fstack-protector, it will be linked into a binary or a DSO with this
same flag as well, so GCC will add libssp_nonshared.a to the linker
command-line.  Unfortunately, we don't control softwares in ports and we
may have such broken DSO.  This is the whole point of this patch.

You can reproduce the problem on i386 by compiling a source file into an
object file with "-fstack-protector-all -fPIE" and linking it
into a binary without "-fstack-protector".

This ld script automatically proposes libssp_nonshared.a along with the
real libc DSO to the linker.  It is important to understand that the
object file contained in this library will be pulled in the resulting
binary _only if_ the linker notices one of its symbols is needed (i.e.
one of the SSP symbol is missing).

A theorical performance impact could be when compiling, but my testing
showed less than 0.1% of difference.

[1] For 32-bit code gcc saves the PIC register setup by using
    __stack_chk_fail_local hidden function instead of calling
    __stack_chk_fail directly.  See comment line 19460 in:
    src/contrib/gcc/config/i386/i386.c

[2] When compiling a source file to an object file, if you use something
    which is external to the compilation unit, GCC doesn't know yet if
    this symbol will be inside or outside the DSO.  So it expects the
    worst case and routes the symbol through the GOT, which means
    additional space and extra relocation for rtld(1).

    Declaring a symbol has hidden tells GCC to use the optimal route (no
    GOT), but on the other hand this means the symbol has to be provided
    in the same DSO (namely libssp_nonshared.a).

    On i386, GCC actually uses an hidden symbol for SSP in PIC objects
    to save PIC register setup, as said in [1].

PR:		ports/138228
PR:		ports/168010
Reviewed by:	kib, kan
2013-06-12 21:12:05 +00:00
Dimitry Andric
284c197886 Upgrade our copy of llvm/clang to 3.3 release.
Release notes are still in the works, these will follow soon.

MFC after:	1 month
2013-06-12 18:48:53 +00:00
John Baldwin
608203fd94 Borrow the algorithm from kvm_getprocs() to fix procstat_getprocs() to
handle the case where the process tables grows in between the calls to
fetch the size and fetch the table.

MFC after:	1 week
2013-06-11 20:00:49 +00:00
Dimitry Andric
6a0372513e Vendor import of clang tags/RELEASE_33/final r183502 (effectively, 3.3
release):
http://llvm.org/svn/llvm-project/cfe/tags/RELEASE_33/final@183502
2013-06-10 20:45:12 +00:00
Dimitry Andric
59d6cff90e Vendor import of llvm tags/RELEASE_33/final r183502 (effectively, 3.3
release):
http://llvm.org/svn/llvm-project/llvm/tags/RELEASE_33/final@183502
2013-06-10 20:36:52 +00:00
David Schultz
998b640bcb Add implementations of acoshl(), asinhl(), and atanhl(). This is a
merge of the work done by bde and myself.
2013-06-10 06:04:58 +00:00
Jilles Tjoelker
42cb36d269 Make recv() and send() cancellation points, as required by POSIX.
Call the recvfrom() and sendto() functions overridden by libthr instead of
the _recvfrom() and _sendto() versions that are not cancellation points.
2013-06-09 14:31:59 +00:00
Joel Dahl
2a82581d9b Minor mdoc fixes. 2013-06-09 07:15:43 +00:00
Pedro F. Giffuni
0b4b96e6e5 libstand: Reset the seek pointer in ext2fs as done in UFS.
Based on r134760:

Reset the seek pointer to 0 when a file is successfully opened,
since otherwise the initial seek offset will contain the directory
offset of the filesystem block that contained its directory entry.
This bug was mostly harmless because typically the directory is
less than one filesystem block in size so the offset would be zero.
It did however generally break loading a kernel from the (large)
kernel compile directory.

Also reset the seek pointer when a new inode is opened in read_inode(),
though this is not actually necessary now because all callers set
it afterwards.

PR:		177328
Submitted by:	Eric van Gyzen
Reviewed by:	iedowse
MFC after:	5 days
2013-06-09 01:19:22 +00:00
Jilles Tjoelker
172886a93e sigaction(2): Document various non-POSIX functions as async-signal safe. 2013-06-08 13:45:43 +00:00
Gleb Smirnoff
6160e12c10 Add new system call - aio_mlock(). The name speaks for itself. It allows
to perform the mlock(2) operation, which can consume a lot of time, under
control of aio(4).

Reviewed by:	kib, jilles
Sponsored by:	Nginx, Inc.
2013-06-08 13:27:57 +00:00
Ed Schouten
e737464f59 Use improved __sync_*() intrinsics for MIPS in userspace as well.
r251524 introduced custom tailored versions for MIPS of these functions
for kernel-space code. We can just reuse them in userspace as well.
2013-06-08 13:22:53 +00:00
Andrew Turner
8e03dd5926 Finish pulling in the NetBSD setjmp/longjmp updates on ARM.
Store/restore the VFP registers in setjmp/longjmp on ARM EABI if VFP is
enabled in the kernel. It checks the hw.floatingpoint sysctl to see if
floating-point is available and uses this to determine if it should store
them. If it does it uses a different magic value so longjmp is able to know
if it should load them.
2013-06-07 22:01:06 +00:00
Andrew Turner
4e75169f43 Include machine/setjmp.h to get the definition of _JB_MAGIC__SETJMP. This
allows us to remove it from the ARM copy of machine/asm.h.
2013-06-07 21:13:28 +00:00
Andrew Turner
4ce3ba179a Remove an extra copy of _setjmp from libstand. We have used the libc version
of this function since r183876.
2013-06-07 21:06:19 +00:00
Ed Maste
a9205626a7 Add libusb_get_port_numbers
libusbx deprecated libusb_get_port_path and replaced it with
libusb_get_port_numbers.  The latter omits an extra parameter which was
unused in the FreeBSD implementation anyway.
2013-06-07 13:45:58 +00:00
Ed Maste
371df6c6ad Switch to 2-clause license and standard text
Approved by:	bms@
2013-06-06 21:09:27 +00:00
Andrew Turner
d2d3491d5a Remove part of the NetBSD longjmp code that was not ready to be merged. 2013-06-05 07:37:45 +00:00
David Schultz
0b8d0b5be9 Style fixes.
Submitted by:	bde
2013-06-05 05:33:01 +00:00
Andrew Turner
8ed717de58 Start to merge the updated ARM NetBSD setjump/longjmp functions. To begin
with merge the functions but leave out the code to save/load the VFP
registers as that requires other changes to ensure the VFP is enabled
first.

This removes storing the old fpa registers. These were never fully
supported, and the only user of this code I can find have moved to newer
CPUs which use a VFP.
2013-06-04 19:47:26 +00:00
Joel Dahl
580dbd6574 mdoc: convert .Fd to .In, which is much nicer. 2013-06-04 07:37:06 +00:00
David Schultz
c13c6c32b1 Add man links for expl(3) and expm1l(3). 2013-06-04 05:41:38 +00:00
Xin LI
21d1182ecf Fix a typo: XPORT_SPI should be tested against transport, nor protocol.
Submitted by:	Sascha Wildner <swildner dragonflybsd org>
Reviewed by:	mjacob
MFC after:	2 weeks
2013-06-03 21:52:19 +00:00
Steve Kargl
1a287d1ddf Change a comma to a semicolon.
Remove a blank line that crept into the declarations.

Fix a comment to show a sign on a NaN.
2013-06-03 20:09:22 +00:00
Steve Kargl
3ffff4bad5 ld80 and ld128 implementations of expm1l(). This code started life
as a fairly faithful implementation of the algorithm found in

PTP Tang, "Table-driven implementation of the Expm1 function
in IEEE floating-point arithmetic," ACM Trans. Math. Soft., 18,
211-222 (1992).

Over the last 18-24 months, the code has under gone significant
optimization and testing.

Reviewed by:	bde
Obtained from:	bde (most of the optimizations)
2013-06-03 19:51:32 +00:00
Steve Kargl
42e4111cab Fix two comments that got lost in the disentanglement of the larger diff. 2013-06-03 19:29:03 +00:00
Steve Kargl
8cc74771f2 ld80/s_expl.c:
* Use integral numerical constants, and let the compiler do the
  conversion to long double.

ld128/s_expl.c:

* Use integral numerical constants, and let the compiler do the
  conversion to long double.
* Use the ENTERI/RETURNI macros, which are no-ops on ld128.  This
  however makes the ld80 and ld128 identical.

Reviewed by:	bde (as part of larger diff)
2013-06-03 19:13:44 +00:00
Steve Kargl
35cbca6a7f Micro-optimization: move the unary mius operator to operate
on a literal constant.

Obtained from:	bde
2013-06-03 18:57:35 +00:00
Steve Kargl
a3f70b4ed8 Add a comment to note that bde supplied most, if not all,
of the optimizations.
2013-06-03 18:53:40 +00:00
Steve Kargl
1783063f18 ld80/s_expl.c:
* In the special case x = -Inf or -NaN, use a micro-optimization
  to eliminate the need to access u.xbits.man.

* Fix an off-by-one for small arguments |x| < 0x1p-65.

ld128/s_expl.c:

* In the special case x = -Inf or -NaN, use a micro-optimization
  to eliminate the need to access u.xbits.manh and u.xbits.manl.

* Fix an off-by-one for small arguments |x| < 0x1p-114.

Obtained from:	bde
2013-06-03 18:51:34 +00:00
Steve Kargl
31407861b8 ld80/s_expl.c:
* Update the evaluation of the polynomial.  This allows the removal
  of the now unused variables t23 and t45.

ld128/s_expl.c:

* Update the evaluation of the polynomial and the intermediate
  result t.  This update allows several numerical constants to be
  written as double rather than long double constants.   Update
  the constants as appropriate.

Obtained from:	bde
2013-06-03 18:40:00 +00:00
Steve Kargl
199b8e343d Rename a few P2, P3, ... coefficients to A2, A3, ... missed in
my previous commit.
2013-06-03 18:18:08 +00:00
Steve Kargl
f3049ab5f3 Update a comment to reflect that we are using an endpoint of
an interval instead of a midpoint.
2013-06-03 18:14:18 +00:00
Steve Kargl
ad36b00fcb Add a u suffix to the IEEEl2bits unions o_threshold and u_threshold,
and use macros to access the e component of the unions.  This allows
the portions of the code in ld80 to be identical to the ld128 code.

Obtained from:	bde
2013-06-03 18:07:04 +00:00
Steve Kargl
4aa8c9453f Introduce the macro LOG2_INTERVAL, which is log2(number of intervals).
Use the macroi as a micro-optimization to convert a subtraction and
division to a shift.

Obtained from:	bde
2013-06-03 17:51:08 +00:00
Steve Kargl
03e1315345 Whitespace. 2013-06-03 17:40:52 +00:00
Steve Kargl
bb23de67bb * Rename the polynomial coefficients from P2, P3, ... to A2, A3, ....
The names now coincide with the name used in PTP Tang's paper.

* Rename the variable from s to tbl to better reflect that
  this is a table, and to be consistent with the naming scheme
  in s_exp2l.c

Reviewed by:	bde (as part of larger diff)
2013-06-03 17:36:26 +00:00
Steve Kargl
b419a5506a * Style(9). Start non-Copyright fancy formatted comments with /**.
Reviewed by:	bde (as part of larger diff)
2013-06-03 17:24:46 +00:00
Steve Kargl
a1d69112c1 ld80/s_expl.c:
* Update Copyright years to include 2013.

ld128/s_expl.c:

* Correct and update Copyright years.  This code originated from
  the ld80 version, so it should reflect the same time period.

Reviewed by:	bde (as part of larger diff)
2013-06-03 17:21:43 +00:00
Ed Schouten
49111f0092 Add libiconv based versions of *c16*() and *c32*().
I initially thought wchar_t was locale independent, but this seems to be
only the case on Linux. This means that we cannot depend on the *wc*()
routines to implement *c16*() and *c32*(). Instead, use the Citrus
libiconv that is part of libc.

I'll see if there is anything I can do to make the existing functions
somewhat useful in case the system is built without libiconv in the
nearby future. If not, I'll simply remove the broken implementations.

Reviewed by:	jilles, gabor
2013-06-03 17:17:56 +00:00
Ed Maste
acbbd07aca Switch to 2-clause license
Approved by:	bms@
2013-06-03 12:43:09 +00:00
David Schultz
25a4d6bfda Add logl, log2l, log10l, and log1pl.
Submitted by:	bde
2013-06-03 09:14:31 +00:00
Konstantin Belousov
91ddaeb725 Since the cause of the problems with the __fillcontextx() was
identified, unify the code of check_deferred_signal() for all
architectures, making the variant under #ifdef x86 common.

Tested by:	marius (sparc64)
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2013-06-03 04:22:42 +00:00
Tijl Coosemans
53089271a1 Convert old make variable modifiers :U and :L to bmake :tu and :tl.
Reviewed by:	sjg
2013-06-02 11:44:23 +00:00
Jilles Tjoelker
4b08438c22 dup(2): Clarify return value, in particular of dup2(). 2013-05-31 22:09:31 +00:00
Jilles Tjoelker
4e3f0e45cf sigaction(2): *at system calls are async-signal safe. 2013-05-31 21:31:38 +00:00