exponent bits of the reduced result, construct 2**k (hopefully in
parallel with the construction of the reduced result) and multiply by
it. This tends to be much faster if the construction of 2**k is
actually in parallel, and might be faster even with no parallelism
since adjustment of the exponent requires a read-modify-wrtite at an
unfortunate time for pipelines.
In some cases involving exp2* on amd64 (A64), this change saves about
40 cycles or 30%. I think it is inherently only about 12 cycles faster
in these cases and the rest of the speedup is from partly-accidentally
avoiding compiler pessimizations (the construction of 2**k is now
manually scheduled for good results, and -O2 doesn't always mess this
up). In most cases on amd64 (A64) and i386 (A64) the speedup is about
20 cycles. The worst case that I found is expf on ia64 where this
change is a pessimization of about 10 cycles or 5%. The manual
scheduling for plain exp[f] is harder and not as tuned.
Details specific to expm1*:
- the saving is closer to 12 cycles than to 40 for expm1* on i386 (A64).
For some reason it is much larger for negative args.
- also convert to __FBSDID().
exponent bits of the reduced result, construct 2**k (hopefully in
parallel with the construction of the reduced result) and multiply by
it. This tends to be much faster if the construction of 2**k is
actually in parallel, and might be faster even with no parallelism
since adjustment of the exponent requires a read-modify-wrtite at an
unfortunate time for pipelines.
In some cases involving exp2* on amd64 (A64), this change saves about
40 cycles or 30%. I think it is inherently only about 12 cycles faster
in these cases and the rest of the speedup is from partly-accidentally
avoiding compiler pessimizations (the construction of 2**k is now
manually scheduled for good results, and -O2 doesn't always mess this
up). In most cases on amd64 (A64) and i386 (A64) the speedup is about
20 cycles. The worst case that I found is expf on ia64 where this
change is a pessimization of about 10 cycles or 5%. The manual
scheduling for plain exp[f] is harder and not as tuned.
This change ld128/s_exp2l.c has not been tested.
that is specialized for float precision. The new polynomial has degree
5 instead of 11, and a maximum error of 2**-27.74 ulps instead
of 2**-30.64. This doesn't affect the final error significantly; the
maximum error was and is about 0.9101 ulps on amd64 -01 and the number
of cases with an error of > 0.5 ulps is actually reduced by epsilon
despite the larger error in the polynomial.
This is about 15% faster on amd64 (A64), i386 (A64) and ia64. The asm
version is still used instead of this on i386 since it is faster and
more accurate.
threshold, according to the 'F' MALLOC_OPTIONS flag. This obsoletes the
'H' flag.
Try to realloc() large objects in place. This substantially speeds up
incremental large reallocations in the common case.
Fix a bug in arena_ralloc() that caused relocation of sub-page objects
even if the old and new sizes were in the same size class.
Maintain trees of runs and simplify the per-chunk page map. This allows
logarithmic-time searching for sufficiently large runs in
arena_run_alloc(), whereas the previous algorithm required linear time
in the worst case.
Break various large functions into smaller sub-functions, and inline
only the functions that are in the fast path for small object
allocation/deallocation.
Remove an unnecessary check in base_pages_alloc_mmap().
Avoid integer division in choose_arena() for the NO_TLS case on
single-CPU systems.
the semantics of pthread_mutex_islocked_np() to return true if and only if
the mutex is held by the current thread.
Obviously, change the regression test to match.
MFC after: 2 weeks
locked. This is intended primarily to support the userland equivalent
of the various *_ASSERT_LOCKED() macros we have in the kernel.
MFC after: 2 weeks
referencing the files VM pages are returned from the network stack,
making changes to the file safe.
This flag does not guarantee that the data has been transmitted to the
other end.
use. If it is in use, use the watched request, otherwise use the
lockuser's own request. Only allocate a lockuser request if both
requests are null.
PR: 119920
Tested by (6.x): Landon Fuller <landonf -at- bikemonkey -dot- org>
prerequisite for using this interface. However, the 'statinfo' struct
actually references CPUSTATES from <sys/resource.h>, so in fact it
requires <sys/resource.h> to compile. Use a nested include of
<sys/resource.h> to make the code match the docs.
Reported by: Pietro Cerutti gahr | gahr.ch
global header if nothing else has been written before the closing of
the archive. This will change the behaviour when creating archives
without members, i.e., instead of generating a 0-size archive file, an
archive with just the global header (8 bytes in total) will be created
and it is indeed a valid archive by the definition of libarchive, thus
subsequent operation on this archive will be accepted. This especially
solves the failure caused by following sequence: (several ports do)
% ar cru libfoo.a # without specifying obj files
% ranlib libfoo.a
Reviewed by: kientzle, jkoshy
Approved by: kientzle
Approved by: jkoshy (mentor)
Reported by: erwin
MFC after: 1 month
obey or ignore the size field on a hardlink entry. In particular,
if we're reading a non-POSIX archive, we should always ignore
the size field.
This should fix both the audio/xmcd port and the math/unixstat port.
Thanks to: Pav Lucistnik for pointing these two ports out to me.
MFC after: 7 days
fields in FTS and FTSENT structs being too narrow. In addition,
the narrow types creep from there into fts.c. As a result, fts(3)
consumers, e.g., find(1) or rm(1), can't handle file trees an ordinary
user can create, which can have security implications.
To fix the historic implementation of fts(3), OpenBSD and NetBSD
have already changed <fts.h> in somewhat incompatible ways, so we
are free to do so, too. This change is a superset of changes from
the other BSDs with a few more improvements. It doesn't touch
fts(3) functionality; it just extends integer types used by it to
match modern reality and the C standard.
Here are its points:
o For C object sizes, use size_t unless it's 100% certain that
the object will be really small. (Note that fts(3) can construct
pathnames _much_ longer than PATH_MAX for its consumers.)
o Avoid the short types because on modern platforms using them
results in larger and slower code. Change shorts to ints as
follows:
- For variables than count simple, limited things like states,
use plain vanilla `int' as it's the type of choice in C.
- For a limited number of bit flags use `unsigned' because signed
bit-wise operations are implementation-defined, i.e., unportable,
in C.
o For things that should be at least 64 bits wide, use long long
and not int64_t, as the latter is an optional type. See
FTSENT.fts_number aka FTS.fts_bignum. Extending fts_number `to
satisfy future needs' is pointless because there is fts_pointer,
which can be used to link to arbitrary data from an FTSENT.
However, there already are fts(3) consumers that require fts_number,
or fts_bignum, have at least 64 bits in it, so we must allow for them.
o For the tree depth, use `long'. This is a trade-off between making
this field too wide and allowing for 64-bit inode numbers and/or
chain-mounted filesystems. On the one hand, `long' is almost
enough for 32-bit filesystems on a 32-bit platform (our ino_t is
uint32_t now). On the other hand, platforms with a 64-bit (or
wider) `long' will be ready for 64-bit inode numbers, as well as
for several 32-bit filesystems mounted one under another. Note
that fts_level has to be signed because -1 is a magic value for it,
FTS_ROOTPARENTLEVEL.
o For the `nlinks' local var in fts_build(), use `long'. The logic
in fts_build() requires that `nlinks' be signed, but our nlink_t
currently is uint16_t. Therefore let's make the signed var wide
enough to be able to represent 2^16-1 in pure C99, and even 2^32-1
on a 64-bit platform. Perhaps the logic should be changed just
to use nlink_t, but it can be done later w/o breaking fts(3) ABI
any more because `nlinks' is just a local var.
This commit also inludes supporting stuff for the fts change:
o Preserve the old versions of fts(3) functions through libc symbol
versioning because the old versions appeared in all our former releases.
o Bump __FreeBSD_version just in case. There is a small chance that
some ill-written 3-rd party apps may fail to build or work correctly
if compiled after this change.
o Update the fts(3) manpage accordingly. In particular, remove
references to fts_bignum, which was a FreeBSD-specific hack to work
around the too narrow types of FTSENT members. Now fts_number is
at least 64 bits wide (long long) and fts_bignum is an undocumented
alias for fts_number kept around for compatibility reasons. According
to Google Code Search, the only big consumers of fts_bignum are in
our own source tree, so they can be fixed easily to use fts_number.
o Mention the change in src/UPDATING.
PR: bin/104458
Approved by: re (quite a while ago)
Discussed with: deischen (the symbol versioning part)
Reviewed by: -arch (mostly silence); das (generally OK, but we didn't
agree on some types used; assuming that no objections on
-arch let me to stick to my opinion)
Even though I believe this is a good change, it does
have the potential to break certain clients, so it's
good to document the reasoning behind the change.
cases which are used mainly by regression tests.
As usual, the cutoff for tiny args was not correctly translated to
float precision. It was 2**-54 but 2**-24 works. It must be about
2**-precision, since the error from approximating log(1+x) by x is
about the same as |x|. Exhaustive testing shows that 2**-24 gives
perfect rounding in round-to-nearest mode.
Similarly for the cutoff for being small, except this is not used by
so many other functions. It was 2**-29 but 2**-15 works. It must be
a bit smaller than sqrt(2**-precision), since the error from
approximating log(1+x) by x-x*x/2 is about the same as x*x. Exhaustive
testing shows that 2**-15 gives a maximum error of 0.5052 ulps in
round-to-nearest-mode. The algorithm for the general case is only good
for 0.8388 ulps, so this is sufficient (but it loses slightly on i386 --
then extra precision gives 0.5032 ulps for the general case).
While investigating this, I noticed that optimizing the usual case by
falling into a middle case involving a simple polynomial evaluation
(return x-x*x/2 instead of x here) is not such a good idea since it
gives an enormous pessimization of tinier args on machines for which
denormals are slow. Float x*x/2 is denormal when |x| ~< 2**-64 and
x*x/2 is evaluated in float precision, so it can easily be denormal
for normal x. This is even more interesting for general polynomial
evaluations. Multiplying out large powers of x is normally a good
optimization since it reduces dependencies, but it creates denormals
starting with quite large x.
forget to translate "float" to "double".
ucbtest didn't detect the bug, but exhaustive testing of the float
case relative to the double case eventually did. The bug only affects
args x with |x| ~> 2**19*(pi/2) on non-i386 (i386 is broken in a
different way for large args).
it should never have existed and it has not been used for many years
(floats are reduced faster using doubles). All relevant changes (just
the workaround for broken assignment) have been merged to the double
version.
there is a problem with non-floats (when i386 defaults to extra
precision). This essentially restores yesterday's behaviour for doubles
on i386 (since generic rint() isn't used and everywhere else assumed
working assignment), but for arches that use the generic rint() it
finishes restoring some of 1995's behaviour (don't waste time doing
unnecessary store/load).
variable hack for exp2f() only.
The volatile variable had a surprisingly large cost for exp2f() -- 19
cycles or 15% on i386 in the worst case observed. This is only partly
explained by there being several references to the variable, only one
of which benefited from it being volatile. Arches that have working
assignment are likely to benefit even more from not having any volatile
variable.
exp2() now has a chance of working with extra precision on i386.
exp2() has even more references to the variable, so it would have been
pessimized more by simply declaring the variable as volatile. Even
the temporary volatile variable for STRICT_ASSIGN costs 5-10% on i386,
(A64) so I will change STRICT_ASSIGN() to do an ordinary assignment
until i386 defaults to extra precision.
instead of a volatile cast hack for the float version only. The cast
hack broke with gcc-4, but this was harmless since the float version
hasn't been used for a few years. Merge from the float version so
that the double version has a chance of working on i386 with extra
precision.
See k_rem_pio2f.c rev.1.8 for the original hack.
Convert to _FBSDID().
hack for log1pf() only. The cast hack broke with gcc-4, resulting in
~1 million errors of more than 1 ulp, with a maximum error of ~1.5 ulps.
Now the maximum error for log1pf() on i386 is 0.5034 ulps again (this
depends on extra precision), and log1p() has a chance of working with
extra precision.
See s_log1pf.c 1.8 for the original hack. (It claims only 62343 large
errors).
Convert to _FBSDID(). Another thing broken with gcc-4 is the static
const hack used for rcsids.
around assignments not working for gcc on i386. Now volatile hacks
for rint() and rintf() don't needlessly pessimize so many arches
and the remaining pessimizations (for arm and powerpc) can be avoided
centrally.
This cleans up after s_rint.c 1.3 and 1.13 and s_rintf.c 1.3 and 1.9:
- s_rint.c 1.13 broke 1.3 by only using a volatile cast hack in 1 place
when it was needed in 2 places, and the volatile cast hack stopped
working with gcc-4. These bugs only affected correctness tests on
i386 since i386 normally uses asm rint() and doesn't support the
extra precision mode that would break assignments of doubles.
- s_rintf.c 1.9 improved(?) on 1.3 by using a volatile variable hack
instead of an extra-precision variable hack, but it declared 2
variables as volatile when only 1 variable needed to be volatile.
This only affected speed tests on i386 since i386 uses asm rintf().
long doubles (i386, amd64, ia64) and one for machines with 128-bit
long doubles (sparc64). Other platforms use the double version.
I've only done runtime testing on i386.
Thanks to bde@ for helpful discussions and bugfixes.
write a new test to exercise the hardlink strategies used
by different archive formats (tar, old cpio, new cpio).
This uncovered two problems, both fixed by this commit:
1) Enforce file size when writing files to disk.
2) When restoring hardlink entries, if they have data associated, go
ahead and open the file so we can write the data.
In particular, this fixes bsdtar/bsdcpio extraction of new cpio
formats where the "original" is empty and the subsequent "hardlink"
entry actually carries the data. It also provides correct behavior
for old cpio archives where hardlinked entries have their bodies
stored multiple times in the archive; the last body should always be
the one that ends up in the final file. The new pax format also
permits (but does not require) hardlinks to carry file data; again,
the last contents should always win.
Note that with any of these, a size of zero on a hardlink simply means
that the hardlink carries no data; it does not mean that the file has
zero size. A non-zero size on a hardlink does provide the file size.
Thanks to: John Baldwin, for reminding me about this long-standing bug
and sending me a simple example archive that prompted this test case
assignments and casts don't clip extra precision, if any. The
implementation is to assign to a temporary volatile variable and read
the result back to assign to the original lvalue.
lib/msun currently 2 different hard-coded hacks to avoid the problem
in just a few places and needs it in a few more places. One variant
uses volatile for the original lvalue. This works but is slower than
necessary. Another temporarily casts the lvalue to volatile. This
broke with gcc-4.2.1 or earlier (gcc now stores to the lvalue but
doesn't load from it).
instead of 32+32+15+1) on all arches that have such long doubles (amd64,
ia64 and i386). Large objects should be be accessed in large units,
and the 32+32+15+1[+padding] decomposition asks for almost the opposite
of that, sometimes resulting in very slow accesses depending on how
well the compiler ignores what we ask for and converts to the best
units for the given machine. E.g., on Athlons, there is a 10-20 cycle
penalty for accessing the middle 32-bit word immediately after an
80-bit store.
Whether actually using the alternative view is better is very machine-
dependent. A 32+32+16 view is probably best with old 32-bit systems
and gcc through 4.2.1. The compiler should mostly avoid the view and
generate best accesses, but gcc-4.2.1 is far from doing that. I think
64+16 is best for now. Similarly for doubles -- they should be using
64+0 especially on 64-bit machines, but fdlibm uses 32+32 extensively
for them. Fortunately, in 64-bit mode for doubles, gcc already ignores
the 32+32-bit view and generates best accesses in many cases.
one part" by simply ignoring the marker at the beginning
of the file. (Zip archivers reserve four bytes at the beginning
of each part of a multi-part archive, if it happens to only
require one part, those four bytes get filled with a placeholder
that can be ignored.)
Thanks to: Marius Nuennerich,
for pointing me to a Zip archive that libarchive couldn't handle
MFC after: 7 days
Specifically, remove the BUGS section and note that openpty(3) now always
does the various security-related steps. Also, update the error return
value section. The PR below is for the original bug rather than the doc
updates.
MFC after: 1 week
PR: bin/9770
into slowsort for some sequences because different parts of the
code used 'r' to store two different things, one of which was
signed. Clean things up by splitting 'r' into two variables, and
use a more meaningful name.
doesn't need to compensate for this situation.
While here, fix a minor longstanding bug that empty tar archives
(which begin with at least 512 zero bytes) never properly reported
their format. In particular, this fixes the output of:
bsdtar tvvf /dev/zero
And, of course, a new test to verify that libarchive correctly
recognizes the format of such files.
implement shm_open(2) and shm_unlink(2) in the kernel:
- Each shared memory file descriptor is associated with a swap-backed vm
object which provides the backing store. Each descriptor starts off with
a size of zero, but the size can be altered via ftruncate(2). The shared
memory file descriptors also support fstat(2). read(2), write(2),
ioctl(2), select(2), poll(2), and kevent(2) are not supported on shared
memory file descriptors.
- shm_open(2) and shm_unlink(2) are now implemented as system calls that
manage shared memory file descriptors. The virtual namespace that maps
pathnames to shared memory file descriptors is implemented as a hash
table where the hash key is generated via the 32-bit Fowler/Noll/Vo hash
of the pathname.
- As an extension, the constant 'SHM_ANON' may be specified in place of the
path argument to shm_open(2). In this case, an unnamed shared memory
file descriptor will be created similar to the IPC_PRIVATE key for
shmget(2). Note that the shared memory object can still be shared among
processes by sharing the file descriptor via fork(2) or sendmsg(2), but
it is unnamed. This effectively serves to implement the getmemfd() idea
bandied about the lists several times over the years.
- The backing store for shared memory file descriptors are garbage
collected when they are not referenced by any open file descriptors or
the shm_open(2) virtual namespace.
Submitted by: dillon, peter (previous versions)
Submitted by: rwatson (I based this on his version)
Reviewed by: alc (suggested converting getmemfd() to shm_open())
default. This has the disadvantage of rendering the datasize resource
limit irrelevant, but without this change, legitimate uses of more
memory than will fit in the data segment are thwarted by default.
Fix chunk_alloc_mmap() to work correctly if initial mapping is not
chunk-aligned and mapping extension fails.
the number of bytes read is actually not important as long as we have at
least what we ask for. Illustrate its benefits by using it throughout
the ZIP support code, except for the few cases where it doesn't apply.
Approved by: kientzle
exercises and verifies the libarchive APIs:
* Improved error reporting; hexdumps are now provided for
many file/memory content differences.
* Overall status more clearly counts "tests" and "assertions"
* Reference files can now be stored on disk instead of having
to be compiled into the test program itself. A couple of
tests have been converted to this more natural structure.
* Several memory leaks corrected so that leaks within libarchive
itself can be more easily detected and diagnosed.
* New test: GNU tar compatibility
* New test: Zip compatibility
* New test: Zero-byte writes to a compressed archive entry
* New test: archive_entry_strmode() format verification
* New test: mtree reader
* New test: write/read of large (2G - 1TB) entries to tar archives
(thanks to recent performance work, this test only requires a few seconds)
* New test: detailed format verification of cpio odc and newc writers
* Many minor additions/improvements to existing tests as well.
Clean up DSS-related locking and protect all pertinent variables with
dss_mtx (remove dss_chunks_mtx). This fixes race conditions that could
cause chunk leaks.
Reported by: [1] kris
This is a long-standing bug, but until recent changes it was difficult
to trigger, and even then its impact was non-catastrophic, with the
exception of revision 1.157.
Optimize chunk_alloc_mmap() to avoid the need for unmapping pages in the
common case. Thanks go to Kris Kennaway for a patch that inspired this
change.
Do not maintain a record of previously mmap'ed chunk address ranges.
The original intent was to avoid the extra system call overhead in
chunk_alloc_mmap(), which is no longer a concern. This also allows some
simplifications for the tree of unused DSS chunks.
Introduce huge_mtx and dss_chunks_mtx to replace chunks_mtx. There was
no compelling reason to use the same mutex for these disjoint purposes.
Avoid memset() for huge allocations when possible.
Maintain two trees instead of one for tracking unused DSS address
ranges. This allows scalable allocation of multi-chunk huge objects in
the DSS. Previously, multi-chunk huge allocation requests failed if the
DSS could not be extended.
that I've been working on but put off committing until after the
RELENG_7 branch, including:
* New manpages: cpio.5 mtree.5
* New archive_entry_strmode()
* New archive_entry_link_resolver()
* New read support: mtree format
* Internal API change: read format auction only runs once
* Running the auction only once allowed simplifying a lot of bid logic.
* Cpio robustness: search for next header after a sync error
* Support device nodes on ISO9660 images
* Eliminate a lot of unnecessary copies for uncompressed archives
* Corrected handling of new GNU --sparse --posix formats
* Correctly handle a zero-byte write to a compressed archive
* Fixed memory leaks
Many of these improvements were motivated by the upcoming bsdcpio
front-end.
There have also been extensive improvements to the libarchive_test
test harness, which I'll commit separately.
global list of all files.
- Mark kvm_getfiles() as broken since the live version exports struct xfile
with no filelist at the head and does so incorrectly and the deadfiles
version exports struct file with a filelist at the head. It is not known
if either version works or complies to the manpage.
order to support re-use of multi-chunk unused regions within the DSS for
huge allocations. This generalization is important to correct function
when mmap-based allocation is disabled.
Avoid zeroing re-used memory in the DSS unless it really needs to be
zeroed.
memory is acquired from the system via sbrk(2) and/or mmap(2). By default,
use sbrk(2) only, in order to support traditional use of resource limits.
Additionally, when both options are enabled, prefer the data segment to
anonymous mappings, in order to coexist better with large file mappings
in applications on 32-bit platforms. This change has the potential to
increase memory fragmentation due to the linear nature of the data
segment, but from a performance perspective this is mitigated by the use
of madvise(2). [1]
Add the ability to interpret integer prefixes in MALLOC_OPTIONS
processing. For example, MALLOC_OPTIONS=lllllllll can now be specified as
MALLOC_OPTIONS=9l.
Reported by: [1] rwatson
Design review: [1] alc, peter, rwatson
- Use PTY* for all pty(4) related constants.
- Use PTMX* for all pts(4) related constants.
- Consistently use _PATH_DEV PTMX rather than "/dev/ptmx".
- Revert 1.7 and properly fix it by using the correct prefix string for
pts(4) masters.
MFC after: 3 days
kick off any other users on the device line before using it since
openpty(3) is documented to do this. Note that grantpt(3) does not
call revoke(2), it only adjusts permissions and ownership.
MFC after: 3 days
my original implementation made both use the same code. Unfortunately,
this meant libm depended on a vendor header at compile time and previously-
unexposed vendor bits in libc at runtime.
Hence, I just wrote my own version of the relevant vendor routine. As it
turns out, mine has a factor of 8 fewer of lines of code, and is a bit more
readable anyway. The strtod() and *scanf() routines still use vendor code.
Reviewed by: bde
lynx, curl etc. Note that this patch differs significantly from that
in the PR, as the submitter refined it after submitting the PR.
PR: 110388
Submitted by: Alexander Pohoyda <alexander.pohoyda@gmx.net>
MFC after: 3 weeks
calculating run sizes. Use of the floating point unit was a potential
pessimization to context switching for applications that do not otherwise
use floating point math. [1]
Reformat cpp macro-related comments to improve consistency.
Submitted by: das
returned on a perfectly valid bzip2 stream whose decompressed size
is multiple of read-ahead buffer size. Reproduce the problem is easy:
create some power-of-two sized file (truncate -s 1m file will do),
bzip2 it and try to load it as md_image from loader. See how it fails.
The bug doesn't affect gzip code (which most of bzip2-reading code was
copied from) probably due to the fact that libgzip doesn't report
Z_STREAM_END with the last block, but requires extra call to inflate()
to retrieve it and has some extra data in the input stream at that time.
However, apply similar fix to gzipfs.c just in the case the API will
change in the future to do what bzip2 code does.
Add some ifdef'ed code to enable testing bzipfs.c from witin normal
FreeBSD environment as opposed to the restricted loader one, so that
one can use gdb and whatnot.
Sponsored by: Sippy Software, Inc., http://www.sippysoft.com/
MFC in: 7 days
someone thought it would be a good idea to copy z_abs() to libm in 1994.
However, it's never been declared or documented anywhere, and I'm
reasonably confident that nobody uses it.
Discussed with: bde, deischen, kan
I hope that this and the i386 version of it will not be needed, but
this is currently about 16 cycles or 36% faster than the C version,
and the i386 version is about 8 cycles or 19% faster than the C
version, due to poor optimization of the C version.
deallocation and dynamic load balancing via the MALLOC_LAZY_FREE and
MALLOC_BALANCE knobs. This is a non-functional change, since these
features are still enabled when possible.
Clean up a few things that more pedantic compiler settings would cause
complaints over.
adds two new directories in msun: ld80 and ld128. These are for
long double functions specific to the 80-bit long double format
used on x86-derived architectures, and the 128-bit format used on
sparc64, respectively.
loop count.
2. Add function pthread_mutex_setyieldloops_np to turn a mutex's yield
loop count.
3. Make environment variables PTHREAD_SPINLOOPS and PTHREAD_YIELDLOOPS
to be only used for turnning PTHREAD_MUTEX_ADAPTIVE_NP mutex.
default to the value of MK_KERBEROS unless set explicitly by
WITH_GSSAPI/WITHOUT_GSSAPI. (This introduces another type of
MK_* variables which itself is questionable.)
- Teach tools/build/options/makeman script that generates the
src.conf(5) manpage about the new type of MK_* variables.
- Fix broken logic in lib/Makefile.
when particular function can't be found in nsswitch-module. For
example, getgrouplist(3) will use module-supplied 'getgroupmembership'
function (which can work in an optimal way for such source as LDAP) and
will fall back to the stanard iterate-through-all-groups implementation
otherwise.
PR: ports/114655
Submitted by: Michael Hanselmann <freebsd AT hansmi DOT ch>
Reviewed by: brooks (mentor)
WITHOUT_KERBEROS knob. While GSS can be used for other things
some third party software (most notably ports/x11/kdelibs3)
takes the presence of libgssapi as an indication that kerberos
is available, and attempts to link with the kerberos libs. If
they are not available, the build will fail.
Because you might want to use GSS but not kerberos, add a knob
to re-enable it if WITHOUT_KERBEROS is present.
Document the new knob, and the new behavior of WITHOUT_KERBEROS.
Not objected and/or generally agreed to by: freebsd-arch
Problem discussed/analyzed in:
PR: ports/116484
is seems to be a problem for SUID applications, which we like to
prevent as much as possible.
PR: docs/39530
Submitted by: Soren Spies <sspies at apple dot com>
MFC After: 3 days
This protects against a race with an upcall in the parent during the
fork which can clobber the parent's tcb before the vm space is copied
in the child. The child then gets a corrupted tcb that is either null
or that points to another thread that doesn't exist in the child (after
a fork, only the fork()ing thread exists in the child).
Reported by: Arno J. Klaassen (arno at heho / snv / jussieu / fr)
a length field of zero; it does not mean the body is empty.
Thanks to: Lapo Luchini for sending me a JAR archive that demonstrated this bug
MFC after: 3 days
ia64, powerpc, and sparc64, use ANSI function headers and specifically
indicate the lack of arguments with 'void'. Otherwise, warnings are
generated at WARNS=3, leading to a compile failure with -Werror.
libkse in FreeBSD 8.0, do not build or install static versions of libkse
(i.e. libkse*.a) in the default case. Static versions will be built and
installed if libthr is not built or if libkse is the default threading
library.
Discussed on: freebsd-arch
MFC after: 3 days
contention. The intent is to dynamically adjust to load imbalances, which
can cause severe contention.
Use pthread mutexes where possible instead of libc "spinlocks" (they aren't
actually spin locks). Conceptually, this change is meant only to support
the dynamic load balancing code by enabling the use of spin locks, but it
has the added apparent benefit of substantially improving performance due to
reduced context switches when there is moderate arena lock contention.
Proper tuning parameter configuration for this change is a finicky business,
and it is very much machine-dependent. One seemingly promising solution
would be to run a tuning program during operating system installation that
computes appropriate settings for load balancing. (The pthreads adaptive
spin locks should probably be similarly tuned.)
vector of slots for lazily freed objects. For each deallocation, before
doing the hard work of locking the arena and deallocating, try several times
to randomly insert the object into the vector using atomic operations.
This approach is particularly effective at reducing contention for
multi-threaded applications that use the producer-consumer model, wherein
one producer thread allocates objects, then multiple consumer threads
deallocate those objects.
allocations. [1]
Fix calculation of the number of arenas when 'n' is specified via
MALLOC_OPTIONS.
Clean up various style inconsistencies.
Obtained from: [1] NetBSD
elf{32,64}_xlateto[fm]() translation functions. This change makes our
libelf compatible with other ELF(3) implementations. [1]
- Update manual page to reflect this change.
- Style fixes: wrap a long line.
Submitted by: jb [1]
Note that ULong in this code is actually defined as an unsigned integer across
all arches so that the gdtoa() function always processes 32 bit data
despite the unfortunate naming of "ULong".
libraries had not had their versions bumped relative to 6.3-REL but
had indeed been changed. We need to bump their version so they can be
properly added to the compat6x port:
libasn1.so.8 libgssapi.so.8 libhdb.so.8 libkadm5clnt.so.8
libkadm5srv.so.8 libkafs5.so.8 libkrb5.so.8 libobjc.so.2
MFC After: 1 day
doesn't use the default CFLAGS which contain -fno-strict-aliasing.
Until the code is cleaned up, just add -fno-strict-aliasing to the
CFLAGS of these for the tinderboxes' sake, allowing the rest of the
tree to have -Werror enabled again.
cause the build to fail because y.tab.c can have a more
recent modification time than y.tab.h, and the bad rule
relied on the opposite.
(The last write to y.tab.c by yacc(1) happens after the
last write to y.tab.h, according to truss(1).)
Reported by: kensmith
fixes a NULL-dereference of curthread when libstdc+ initializes
the exception handling globals on archs we can't use GNU TLS due
to lack of support in binutils 2.15 (i.e. arm and sparc64), yet,
thus making threaded C++ programs compiled with GCC 4.2.1 work
again on these archs.
Reviewed by: davidxu
MFC after: 3 days
to tune pthread mutex performance:
1. LIBPTHREAD_SPINLOOPS
If a pthread mutex is being locked by another thread, this environment
variable sets total number of spin loops before the current thread
sleeps in kernel, this saves a syscall overhead if the mutex will be
unlocked very soon (well written application code).
2. LIBPTHREAD_YIELDLOOPS
If a pthread mutex is being locked by other threads, this environment
variable sets total number of sched_yield() loops before the currrent
thread sleeps in kernel. if a pthread mutex is locked, the current thread
gives up cpu, but will not sleep in kernel, this means, current thread
does not set contention bit in mutex, but let lock owner to run again
if the owner is on kernel's run queue, and when lock owner unlocks the
mutex, it does not need to enter kernel and do lots of work to resume
mutex waiters, in some cases, this saves lots of syscall overheads for
mutex owner.
In my practice, sometimes LIBPTHREAD_YIELDLOOPS can massively improve performance
than LIBPTHREAD_SPINLOOPS, this depends on application. These two environments
are global to all pthread mutex, there is no interface to set them for each
pthread mutex, the default values are zero, this means spinning is turned off
by default.
is also implemented in glibc and is used by a number of existing
applications (mysql, firefox, etc).
This mutex type is a default mutex with the additional property that
it spins briefly when attempting to acquire a contested lock, doing
trylock operations in userland before entering the kernel to block if
eventually unsuccessful.
The expectation is that applications requesting this mutex type know
that the mutex is likely to be only held for very brief periods, so it
is faster to spin in userland and probably succeed in acquiring the
mutex, than to enter the kernel and sleep, only to be woken up almost
immediately. This can help significantly in certain cases when
pthread mutexes are heavily contended and held for brief durations
(such as mysql).
Spin up to 200 times before entering the kernel, which represents only
a few us on modern CPUs. No performance degradation was observed with
this value and it is sufficient to avoid a large performance drop in
mysql performance in the heavily contended pthread mutex case.
The libkse implementation is a NOP.
Reviewed by: jeff
MFC after: 3 days
This can only happen on 32-bit systems when you're reading
an uncompressed archive and the skip request is an exact
multiple of 4G (e.g., skipping a tar entry with an 8G body).
The symptom is that the read_ahead() ends up returning zero
bytes, and the extraction stops with a premature end-of-file.
Using '1' here is more correct anyway, as it allows read_ahead()
to function opportunistically and minimize copying.
MFC after: 5 days
kthread_add() takes the same parameters as the old kthread_create()
plus a pointer to a process structure, and adds a kernel thread
to that process.
kproc_kthread_add() takes the parameters for kthread_add,
plus a process name and a pointer to a pointer to a process instead of just
a pointer, and if the proc * is NULL, it creates the process to the
specifications required, before adding the thread to it.
All other old kthread_xxx() calls return, but act on (struct thread *)
instead of (struct proc *). One reason to change the name is so that
any old kernel modules that are lying around and expect kthread_create()
to make a process will not just accidentally link.
fix top to show kernel threads by their thread name in -SH mode
add a tdnam formatting option to ps to show thread names.
make all idle threads actual kthreads and put them into their own idled process.
make all interrupt threads kthreads and put them in an interd process
(mainly for aesthetic and accounting reasons)
rename proc 0 to be 'kernel' and it's swapper thread is now 'swapper'
man page fixes to follow.
on i386 and amd64 machines. The overall process is that /boot/pmbr lives
in the PMBR (similar to /boot/mbr for MBR disks) and is responsible for
locating and loading /boot/gptboot. /boot/gptboot is similar to /boot/boot
except that it groks GPT rather than MBR + bsdlabel. Unlike /boot/boot,
/boot/gptboot lives in its own dedicated GPT partition with a new
"FreeBSD boot" type. This partition does not have a fixed size in that
/boot/pmbr will load the entire partition into the lower 640k. However,
it is limited in that it can only be 545k. That's still a lot better than
the current 7.5k limit for boot2 on MBR. gptboot mostly acts just like
boot2 in that it reads /boot.config and loads up /boot/loader. Some more
details:
- Include uuid_equal() and uuid_is_nil() in libstand.
- Add a new 'boot' command to gpt(8) which makes a GPT disk bootable using
/boot/pmbr and /boot/gptboot. Note that the disk must have some free
space for the boot partition.
- This required exposing the backend of the 'add' function as a
gpt_add_part() function to the rest of gpt(8). 'boot' uses this to
create a boot partition if needed.
- Don't cripple cgbase() in the UFS boot code for /boot/gptboot so that
it can handle a filesystem > 1.5 TB.
- /boot/gptboot has a simple loader (gptldr) that doesn't do any I/O
unlike boot1 since /boot/pmbr loads all of gptboot up front. The
C portion of gptboot (gptboot.c) has been repocopied from boot2.c.
The primary changes are to parse the GPT to find a root filesystem
and to use 64-bit disk addresses. Currently gptboot assumes that the
first UFS partition on the disk is the / filesystem, but this algorithm
will likely be improved in the future.
- Teach the biosdisk driver in /boot/loader to understand GPT tables.
GPT partitions are identified as 'disk0pX:' (e.g. disk0p2:) which is
similar to the /dev names the kernel uses (e.g. /dev/ad0p2).
- Add a new "freebsd-boot" alias to g_part() for the new boot UUID.
MFC after: 1 month
Discussed with: marcel (some things might still change, but am committing
what I have so far)
a module was loaded might make the pathname inaccurate.
I wonder if an inode reference should be stored with the pathname
to allow a validity check?
Suggested by: rwatson@
threading library.
- Now that libpthread is a symlink, it's no longer possible
to link applications with libpthread and have libmap.conf(5)
select the desired threading library; applications will be
linked to the default threading library, libkse or libthr.
Remove an obsolete paragraph.
- Mention that improvements can be seen compared to libkse.
Reviewed by: deischen, davidxu
for kldstat(2).
This allows libdtrace to determine the exact file from which
a kernel module was loaded without having to guess.
The kldstat(2) API is versioned with the size of the
kld_file_stat structure, so this change creates version 2.
Add the pathname to the verbose output of kldstat(8) too.
MFC: 3 days
aligned, GCC 4.2.1 also generates code for sendudp() that assumes
this alignment. GCC 4.2.1 however doesn't 32-bit align wbuf, causing
the loader to crash due to an unaligned access of wbuf in sendudp()
when netbooting sparc64. Solve this by specifying wbuf as packed and
32-bit aligned, too. As for lastdata and readudp() this currently is
no issue when compiled with GCC 4.2.1, though give lastdata the same
treatment as wbuf for consistency and possibility of being affected
in the future. [1]
- Sprinkle const on a lookup table.
Reported by: marcel [1]
Submitted by: yongari [1]
Reviewed by: marcel [1]
MFC after: 5 days
test MK_INSTALLLIB, users can set WITHOUT_INSTALLLIB. The old
NO_INSTALLLIB is still supported as several makefiles set it.
- While here, fix an install when instructed not to install libs
(usr.bin/lex/lib/Makefile).
PR: bin/114200
Submitted by: Henrik Brix Andersen