want to adjust this code to just assume that all CPUs >= Esther should
be checked for the extended cpuid flags register.
MFC after: 3 days
PR: i386/119491
gives an average speedup of about 12 cycles or 17% for
9pi/4 < |x| <= 2**19pi/2 and a smaller speedup for larger x, and a
small speeddown for |x| <= 9pi/4 (only 1-2 cycles average, but that
is 4%).
Inlining this is less likely to bust caches than inlining the float
version since it is much smaller (about 220 bytes text and rodata) and
has many fewer branches. However, the float version was already large
due to its manual inlining of the branches and also the polynomial
evaluations.
The code seems pretty MPSAFE and Giant is held over kproc_exit() which
at lowel calls exit1(). exit1() requires Giant to be unowned so this
opens a window for races.
Reported by: Bryan Venteicher <bryanv at daemoninthecloset dot org>
Tested by: Bryan Venteicher <bryanv at daemoninthecloset dot org>
of the GNU utility. The default behavior of our original `du' is to
count hardlinked files only once for each invocation of the utility.
With the new -l option they count towards the final size every time
they are found.
PR: bin/117944
Submitted by: keramida
Reviewed by: des, obrien
MFC after: 2 weeks
always curthread.
As KPI gets broken by this patch, manpages and __FreeBSD_version will be
updated by further commits.
Tested by: Andrea Barberio <insomniac at slackware dot it>
__kernel_rem_pio2(). This simplifies analysis of aliasing and thus
results in better code for the usual case where __kernel_rem_pio2()
is not called. In particular, when __ieee854_rem_pio2[f]() is inlined,
it normally results in y[] being returned in registers. I couldn't
get this to work using the restrict qualifier.
In float precision, this saves 2-3% in most cases on amd64 and i386
(A64) despite it not being inlined in float precision yet. In double
precision, this has high variance, with an average gain of 2% for
amd64 and 0.7% for i386 (but a much larger gain for usual cases) and
some losses.
source upgrades by falling back to GNU ar(1) as necessary. Option
WITH_BSDAR is gone. Option _WITH_GNUAR to aid in upgrades is *not*
supposed to be set by the user.
Stop bootstrapping BSD ar(1) on the next __FreeBSD_version bump, as
there are no known bugs in it. Bump __FreeBSD_version to anticipate
this and to flag the switch to BSD ar(1), should it be needed for
something.
Input from: obrien, des, kaiw
first before they can be set to Explorer mode.
PR: kern/118578
Submitted by: Andriy Gapon <avg@icyb.net.ua> (I added some comments)
Reviewed by: philip
MFC after: 1 month
this function and its callers cosf(), sinf() and tanf() don't waste time
converting values from doubles to floats and back for |x| > 9pi/4.
All these functions were optimized a few years ago to mostly use doubles
internally and across the __kernel*() interfaces but not across the
__ieee754_rem_pio2f() interface.
This saves about 40 cycles in cosf(), sinf() and tanf() for |x| > 9pi/4
on amd64 (A64), and about 20 cycles on i386 (A64) (except for cosf()
and sinf() in the upper range). 40 cycles is about 35% for |x| < 9pi/4
<= 2**19pi/2 and about 5% for |x| > 2**19pi/2. The saving is much
larger on amd64 than on i386 since the conversions are not easy to
optimize except on i386 where some of them are automatic and others
are optimized invalidly. amd64 is still about 10% slower in cosf()
and tanf() in the lower range due to conversion overhead.
This also gives a tiny speedup for |x| <= 9pi/4 on amd64 (by simplifying
the code). It also avoids compiler bugs and/or additional slowness
in the conversions on (not yet supported) machines where double_t !=
double.
e_rem_pio2.c:
Float and double precision didn't work because init_jk[] was 1 too small.
It needs to be 2 larger than you might expect, and 1 larger than it was
for these precisions, since its test for recomputing needs a margin of
47 bits (almost 2 24-bit units).
init_jk[] seems to be barely enough for extended and quad precisions.
This hasn't been completely verified. Callers now get about 24 bits
of extra precision for float, and about 19 for double, but only about
8 for extended and quad. 8 is not enough for callers that want to
produce extra-precision results, but current callers have rounding
errors of at least 0.8 ulps, so another 1/2**8 ulps of error from the
reduction won't affect them much.
Add a comment about some of the magic for init_jk[].
e_rem_pio2.c:
Double precision worked in practice because of a compensating off-by-1
error here. Extended precision was asked for, and it executed exactly
the same code as the unbroken double precision.
e_rem_pio2f.c:
Float precision worked in practice because of a compensating off-by-1
error here. Double precision was asked for, and was almost needed,
since the cosf() and sinf() callers want to produce extra-precision
results, at least internally so that their error is only 0.5009 ulps.
However, the extra precision provided by unbroken float precision is
enough, and the double-precision code has extra overheads, so the
off-by-1 error cost about 5% in efficiency on amd64 and i386.
variations (e500 currently), this provides a gcc-level FPU emulation and is an
alternative approach to the recently introduced kernel-level emulation
(FPU_EMU).
Approved by: cognet (mentor)
MFp4: e500
check if it is invoked as 'bsdranlib'.
Reported by: Michael Plass <mfp49_freebsd [AT] plass-family [DOT] net>
Reviewed by: Michael Plass <mfp49_freebsd [AT] plass-family [DOT] net>
Reviewed by: jkoshy
Approved by: jkoshy (mentor)
only anonymous default (OBJT_DEFAULT) and swap (OBJT_SWAP) objects should
ever have OBJ_ONEMAPPING set. However, vm_object_deallocate() was
setting it on device (OBJT_DEVICE) objects. As a result,
vm_object_page_remove() could be called on a device object and if that
occurred pmap_remove_all() would be called on the device object's pages.
However, a device object's pages are fictitious, and fictitious pages do
not have an initialized pv list (struct md_page).
To date, fictitious pages have been allocated from zeroed memory,
effectively hiding this problem. Now, however, the conversion of rotting
diagnostics to invariants in the amd64 and i386 pmaps has revealed the
problem. Specifically, assertion failures have occurred during the
initialization phase of the X server on some hardware.
MFC after: 1 week
Discussed with: Kostik Belousov
Reported by: Michiel Boland
Do not mmap 0-size objects and do not try to extract symbol from
0-size objects, but do treat 0-size objects as qualified objects and
accept them as an archive member. (A member with only the header part)
Note that GNU binutils ar on FreeBSD ignores 0-size objects, but on
Linux it accepts them. [1] But, since this is a rare usage, we can
safely ignore the compatibility issue.
Reported by: Michael Plass <mfp49_freebsd [AT] plass-family [DOT] net>
Pointed out by: Michael Plass <mfp49_freebsd [AT] plass-family [DOT] net> [1]
Reviewed by: Michael Plass <mfp49_freebsd [AT] plass-family [DOT] net>
Reviewed by: jkoshy
Approved by: jkoshy (mentor)
computes the new path and the second one, updatepwd(), updates the variables
PWD, OLDPWD and the path used for the pwd builtin according to the new
directory. For a logical directory change, chdir() is now called between
those two functions, no longer causing wrong values to be stored in PWD etc. if
it fails.
PR: 64990, 101316, 120571
namespace in order to handle lockmgr fields in a controlled way instead
than spreading all around bogus stubs:
- VN_LOCK_AREC() allows lock recursion for a specified vnode
- VN_LOCK_ASHARE() allows lock sharing for a specified vnode
In FFS land:
- BUF_AREC() allows lock recursion for a specified buffer lock
- BUF_NOREC() disallows recursion for a specified buffer lock
Side note: union_subr.c::unionfs_node_update() is the only other function
directly handling lockmgr fields. As this is not simple to fix, it has
been left behind as "sole" exception.
the same order that FreeBSD 6 and before did. Doug
White and the other bloodhounds at ISC discovered that
while FreeBSD 7's ordering of options was more efficient,
it caused some cable modem routers to ignore the
SYN-ACKs ordered in this fashion.
The placement of sackOK after the timestamp option seems
to be the critical difference:
FreeBSD 6:
<mss 1460,nop,wscale 1,nop,nop,timestamp 3512155768 0,sackOK,eol>
FreeBSD 7.0:
<mss 1460,nop,wscale 3,sackOK,timestamp 1370692577 0>
FreeBSD 7.0 + this change:
<mss 1460,nop,wscale 3,nop,nop,timestamp 7371813 0,sackOK,eol>
MFC after: 1 week
the provided trailers. This has been broken since revision 1.240.
Submitted by: Dan Nelson
PR: kern/120948
"sounds ok to me" from: phk
MFC after: 3 days
itself, not on the type of the file. As such, do a readlink to get
the symbolic link's contents and fail to match if the path isn't a
symbolic link.
Pointed out by: des@
can run on processors that don't have a FPU. This is typically the
case for Book E processors. While a tuned system will probably want
to use soft-float (or use a processor that has a FPU if the usage is
FP intensive enough), allowing hard-float on FPU-less systems gives
great portability and flexibility.
Obtained from: NetBSD