a maximum error of 2.905 ulps for cosf(), but the algorithm for cosf()
is good for < 1 ulps and happens to give perfect rounding (< 0.5 ulps)
near +-pi/2 except for the bug. The extra relative errors for tanf()
were similar (slightly larger). The bug didn't affect sinf() since
sinf'(+-pi/2) is 0.
For range reduction in ~[-3pi/4, -pi/4] and ~[pi/4, 3pi/4] we must
subtract +-pi/2 and the only complication is that this must be done
in extra precision. We have handy 17+24-bit and 17+17+24-bit
approximations to pi/2. If we always used the former then we would
lose up to 24 bits of accuracy due to cancelation of leading bits, but
we need to keep at least 24 bits plus a guard digit or 2, and should
keep as many guard bits as efficiency permits. So we used the
less-precise pi/2 not very near +-pi/2 and switched to using the
more-precise pi/2 very near +-pi/2. However, we got the threshold for
the switch wrong by allowing 19 bits to cancel, so we ended up with
only 21 or 22 bits of accuracy in some cases, which is even worse than
naively subtracting pi/2 would have done.
Exhaustive checking shows that allowing only 17 bits to cancel (min.
accuracy ~24 bits) is sufficient to reduce the maximum error for cosf()
near +-pi/2 to 0.726 ulps, but allowing only 6 bits to cancel (min.
accuracy ~35-bits) happens to give perfect rounding for cosf() at
little extra cost so we prefer that.
We actually (in effect) allow 0 bits to cancel and always use the
17+17+24-bit pi/2 (min. accuracy ~41 bits). This is simpler and
probably always more efficient too. Classifying args to avoid using
this pi/2 when it is not needed takes several extra integer operations
and a branch, but just using it takes only 1 FP operation.
The patch also fixes misspelling of 17 as 24 in many comments.
For the double-precision version, the magic numbers include 33+53 bits
for the less-precise pi/2 and (53-32-1 = 20) bits being allowed to
cancel, so there are ~33-20 = 13 guard bits. This is sufficient except
probably for perfect rounding. The more-precise pi/2 has 33+33+53
bits and we still waste time classifying args to avoid using it.
The bug is apparently from mistranslation of the magic 32 in 53-32-1.
The number of bits allowed to cancel is not critical and we use 32 for
double precision because it allows efficient classification using a
32-bit comparison. For float precision, we must use an explicit mask,
and there are fewer bits so there is less margin for error in their
allocation. The 32 got reduced to 4 but should have been reduced
almost in proportion to the reduction of mantissa bits.
the modified interface that they use. Changes include:
- Register a different interrupt handler for the new interface. This one is
INTR_MPSAFE, not INTR_FAST, and directly processes completions and AIFs.
- Add an event registration and callback mechanism for the ioctl and CAM
modules can know when a resource shortage clears. This condition was
previously fatal in CAM due to programming oversights.
- Fix locking to play better with newbus.
- Provide access methods for talking to cards with the NEWCOMM interface.
- Fix up the CAM module to be better suited for dealing with newer firmware
on the PERC Si/Di series that requires talking to plain SCSI via aac.
- Add a whole slew of new PCI Id's.
Thanks to Adaptec for providing an initial version of this work and for
answering countless questions about it. There are still some rough edges in
this, but it works well enough to commit and test for now.
Obtained from: Adaptec, Inc.
o Rather than just try to turn off EXCA_INTR_RESET, set the entire register
to 0. This is slightly faster, and a better hammer.
o Move attempted clearing of the output enable (EXCA_PWRCTL_OE) back to
after we turn off the power. Modify it to write 0 so that we don't get
Bad Vcc messages on TI bridges (untested, but ru@ sent me a similar patch)
while at the same time avoiding interrupt storms on Ricoh bridges (tested
by me on my Sony).
# Many of my observations of 'breakage' for this patch are due to some bug
# in the load/unload of cbb.ko unlreated to this change. I'll be investigating
# and fixing that bug in the fullness of time.
an allocation. This fixes the malloc 'use after free' panic on boot that
many were seeing. It doesn't solve the problem of the allocations being
cached and then written past their bounds later. That will take more work.
Submitted by: kan
in 1993 in rev.1.5 of the i386 a.out version (csu/i386/crt0.c).
Profiling uses a magic label "eprol" to delimit the start of the part
of the text section covered by profiling. This label must be placed
before the call to main() to get main() properly profiled. It was
placed there in rev.1.1 of crt0.c. Rev.1.5 imported the initial
implementation of shared libraries in FreeBSD and misplaced the label.
Fortunately, the misplaced label was misspelled and the old label
wasn't removed, so the new label had no effect. Unfortunately, when
profiling was implemented for the ELF in 1998 in rev.1.2 of
csu/i386-elf/crt1.c, only the incorrectly placed label was copied
(after fixing its name). The bug was then copied to all other arches.
The label seems to be still misplaced in NetBSD for most arches. It
is in common.c for most arches so it is even further from being inside
the function that calls main().
I think "eprol" is short for "end of prologue", but it must be placed
before the end of the prologue so that it covers main(). crt0.c has
it before the calls atexit(_mcleanup) and monstartup(...), but it
cannot affect these calls so I moved it after the call to monstartup().
It now also covers the call to _init() but not the newer call to
_init_tls(). Profiling of _init() seems to be harmless, and the call
to _init_tls() seems to be misplaced.
Reviewed by: jdp (long ago, for a slightly different i386 version)
http://lists.freebsd.org/pipermail/cvs-src/2004-October/033496.html
The same problem applies to if_bridge(4), too.
- Copy-and-paste the if_bridge(4) related block from
if_ethersubr.c to ng_ether.c
- Add XXXs, so that copy-and-paste would be noticed by
any future editors of this code.
- Also add XXXs near if_bridge(4) declarations.
Silence from: thompsa
appear to be never called:
(1) If a function is never called according to its call count but it
must have been called because its child time is nonzero, then print
it in the flat profile. Previously, if its call count was zero
then we only printed it in the flat profile if its self time was
nonzero.
(2) If a function has a zero call count but has a nonzero self or child
time, then print its total self time in the self time per call
column as a percentage of the total (self + child) time. It is
not possible to print the times per call in this case because the
call count is zero. Previously, this was handled by leaving both
per-call columns blank. The self time is printed in another column
but there was no way to recover the total time.
(1) partially fixes the case of the "never called" function main() and
prepares for (2) to apply to main() and other functions. Profiling
of main() was lost in the conversion from a.out to ELF, so main()'s
call count has always been zero for many years; then in the common
case where main() is a tiny function, it gets no profiling ticks, so
main() was completely lost in the flat profile.
(2) improves mainly cases like kernel threads. Most kernel threads
appear to be never called because they are always started before
userland can run to turn on profiling. As for main(), the fact that
they are called is not very interesting and their callers are
uninteresting, but their relative self time is interesting since they
are long-running.
Almost always printing percentages in the per-call columns would be
more useful than almost always printing 0.0ms. 0.1ms is now a long
time, so only very large functions take that long per call. The accuracy
per call can approach 1-10 nsec provided programs are run for about
100000 times as long as is necessary to get this accuracy with high
resolution kernel profiling.
you want to see, e.g., sendmail arguments mail(1) will use.
-H is not an independent flag, it's a modifier. Also explicitly
say that -H will cause mail(1) to exit as soon as it prints the headers.
MFC after: 5 days
the 5GHz band.
o Enable 802.11a channels scanning for 2915ABG adapters.
o Fix a typo (negociated->negotiated).
With hints from NetBSD.
MFC after: 2 days
- Rename vxfoo() functions to vx_foo() to improve readability and
consistency with other drivers.
- Prefix most the softc members with 'vx_' (the other members already had
the prefix).
- Switch to using callout_init_mtx() and callout_*() rather than
timeout() and untimeout().
- Add some missing calls to if_free() in some failure cases in vx_attach().
- Use if_printf() and remove the unit number from the softc.
- Remove uses of the 'register' keyword and spls.
- Add locked variants of vx_init() and vx_start().
- Add a mutex to the softc and lock it in various appropriate places.
- Setup the interrupt handler last during attach.
Tested by: imp
MFC after: 1 week
It allows to specify options for NFS root file system.
Currently supported options are: soft, intr, conn, lockd.
I'm adding this functionality mostly for 'lockd' option, which is only
honored when performing the initial mount and will be silently ignored
if used while updating the mount options.
This will allow to use flock(2) without the need of using varmfs or
rpc.lockd and friends.
Example of use:
boot.nfsroot.options="intr,lockd"
MFC after: 2 weeks