1994-08-19 09:40:01 +00:00
|
|
|
.\" Copyright (c) 1985 Regents of the University of California.
|
|
|
|
.\" All rights reserved.
|
|
|
|
.\"
|
|
|
|
.\" Redistribution and use in source and binary forms, with or without
|
|
|
|
.\" modification, are permitted provided that the following conditions
|
|
|
|
.\" are met:
|
|
|
|
.\" 1. Redistributions of source code must retain the above copyright
|
|
|
|
.\" notice, this list of conditions and the following disclaimer.
|
|
|
|
.\" 2. Redistributions in binary form must reproduce the above copyright
|
|
|
|
.\" notice, this list of conditions and the following disclaimer in the
|
|
|
|
.\" documentation and/or other materials provided with the distribution.
|
|
|
|
.\" 3. All advertising materials mentioning features or use of this software
|
|
|
|
.\" must display the following acknowledgement:
|
|
|
|
.\" This product includes software developed by the University of
|
|
|
|
.\" California, Berkeley and its contributors.
|
|
|
|
.\" 4. Neither the name of the University nor the names of its contributors
|
|
|
|
.\" may be used to endorse or promote products derived from this software
|
|
|
|
.\" without specific prior written permission.
|
|
|
|
.\"
|
|
|
|
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
|
|
|
|
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
|
|
|
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
|
|
|
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
|
|
|
|
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
|
|
|
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
|
|
|
|
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
|
|
|
|
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
|
|
|
|
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
|
|
|
|
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
|
|
|
|
.\" SUCH DAMAGE.
|
|
|
|
.\"
|
|
|
|
.\" from: @(#)math.3 6.10 (Berkeley) 5/6/91
|
1999-08-28 00:22:10 +00:00
|
|
|
.\" $FreeBSD$
|
1994-08-19 09:40:01 +00:00
|
|
|
.\"
|
|
|
|
.TH MATH 3M "May 6, 1991"
|
|
|
|
.UC 4
|
|
|
|
.ds up \fIulp\fR
|
|
|
|
.ds nn \fINaN\fR
|
|
|
|
.de If
|
|
|
|
.if n \\
|
|
|
|
\\$1Infinity\\$2
|
|
|
|
.if t \\
|
|
|
|
\\$1\\(if\\$2
|
|
|
|
..
|
|
|
|
.SH NAME
|
|
|
|
math \- introduction to mathematical library functions
|
|
|
|
.SH DESCRIPTION
|
|
|
|
These functions constitute the C math library,
|
|
|
|
.I libm.
|
|
|
|
The link editor searches this library under the \*(lq\-lm\*(rq option.
|
|
|
|
Declarations for these functions may be obtained from the include file
|
|
|
|
.RI < math.h >.
|
1995-03-01 05:06:48 +00:00
|
|
|
.\" The Fortran math library is described in ``man 3f intro''.
|
1994-08-19 09:40:01 +00:00
|
|
|
.SH "LIST OF FUNCTIONS"
|
1995-03-01 05:06:48 +00:00
|
|
|
Each of the following double functions has a float counterpart with the
|
|
|
|
name ending in f, as an example the float counterpart of double acos(double
|
|
|
|
x) is float acosf(float x).
|
1994-08-19 09:40:01 +00:00
|
|
|
.sp 2
|
|
|
|
.nf
|
|
|
|
.ta \w'copysign'u+2n +\w'infnan.3m'u+10n +\w'inverse trigonometric func'u
|
|
|
|
\fIName\fP \fIAppears on Page\fP \fIDescription\fP \fIError Bound (ULPs)\fP
|
|
|
|
.ta \w'copysign'u+4n +\w'infnan.3m'u+4n +\w'inverse trigonometric function'u+6nC
|
|
|
|
.sp 5p
|
|
|
|
acos sin.3m inverse trigonometric function 3
|
|
|
|
acosh asinh.3m inverse hyperbolic function 3
|
|
|
|
asin sin.3m inverse trigonometric function 3
|
|
|
|
asinh asinh.3m inverse hyperbolic function 3
|
|
|
|
atan sin.3m inverse trigonometric function 1
|
|
|
|
atanh asinh.3m inverse hyperbolic function 3
|
|
|
|
atan2 sin.3m inverse trigonometric function 2
|
|
|
|
cabs hypot.3m complex absolute value 1
|
|
|
|
cbrt sqrt.3m cube root 1
|
|
|
|
ceil floor.3m integer no less than 0
|
|
|
|
copysign ieee.3m copy sign bit 0
|
|
|
|
cos sin.3m trigonometric function 1
|
|
|
|
cosh sinh.3m hyperbolic function 3
|
|
|
|
erf erf.3m error function ???
|
|
|
|
erfc erf.3m complementary error function ???
|
|
|
|
exp exp.3m exponential 1
|
|
|
|
expm1 exp.3m exp(x)\-1 1
|
|
|
|
fabs floor.3m absolute value 0
|
|
|
|
floor floor.3m integer no greater than 0
|
|
|
|
hypot hypot.3m Euclidean distance 1
|
|
|
|
ilogb ieee.3m exponent extraction 0
|
|
|
|
j0 j0.3m bessel function ???
|
|
|
|
j1 j0.3m bessel function ???
|
|
|
|
jn j0.3m bessel function ???
|
|
|
|
lgamma lgamma.3m log gamma function; (formerly gamma.3m)
|
|
|
|
log exp.3m natural logarithm 1
|
|
|
|
log10 exp.3m logarithm to base 10 3
|
|
|
|
log1p exp.3m log(1+x) 1
|
|
|
|
pow exp.3m exponential x**y 60\-500
|
|
|
|
remainder ieee.3m remainder 0
|
|
|
|
rint floor.3m round to nearest integer 0
|
|
|
|
scalbn ieee.3m exponent adjustment 0
|
|
|
|
sin sin.3m trigonometric function 1
|
|
|
|
sinh sinh.3m hyperbolic function 3
|
|
|
|
sqrt sqrt.3m square root 1
|
|
|
|
tan sin.3m trigonometric function 3
|
|
|
|
tanh sinh.3m hyperbolic function 3
|
|
|
|
y0 j0.3m bessel function ???
|
|
|
|
y1 j0.3m bessel function ???
|
|
|
|
yn j0.3m bessel function ???
|
|
|
|
.ta
|
|
|
|
.fi
|
|
|
|
.SH NOTES
|
|
|
|
In 4.3 BSD, distributed from the University of California
|
|
|
|
in late 1985, most of the foregoing functions come in two
|
|
|
|
versions, one for the double\-precision "D" format in the
|
|
|
|
DEC VAX\-11 family of computers, another for double\-precision
|
|
|
|
arithmetic conforming to the IEEE Standard 754 for Binary
|
|
|
|
Floating\-Point Arithmetic. The two versions behave very
|
|
|
|
similarly, as should be expected from programs more accurate
|
|
|
|
and robust than was the norm when UNIX was born. For
|
1998-01-15 23:28:18 +00:00
|
|
|
instance, the programs are accurate to within the numbers
|
1994-08-19 09:40:01 +00:00
|
|
|
of \*(ups tabulated above; an \*(up is one \fIU\fRnit in the \fIL\fRast
|
1998-01-15 23:28:18 +00:00
|
|
|
\fIP\fRlace. And the programs have been cured of anomalies that
|
|
|
|
afflicted the older math library \fIlibm\fR in which incidents like
|
|
|
|
the following had been reported:
|
1994-08-19 09:40:01 +00:00
|
|
|
.RS
|
|
|
|
sqrt(\-1.0) = 0.0 and log(\-1.0) = \-1.7e38.
|
|
|
|
.br
|
|
|
|
cos(1.0e\-11) > cos(0.0) > 1.0.
|
|
|
|
.br
|
|
|
|
pow(x,1.0)
|
|
|
|
.if n \
|
|
|
|
!=
|
|
|
|
.if t \
|
|
|
|
\(!=
|
|
|
|
x when x = 2.0, 3.0, 4.0, ..., 9.0.
|
|
|
|
.br
|
|
|
|
pow(\-1.0,1.0e10) trapped on Integer Overflow.
|
|
|
|
.br
|
|
|
|
sqrt(1.0e30) and sqrt(1.0e\-30) were very slow.
|
|
|
|
.RE
|
|
|
|
However the two versions do differ in ways that have to be
|
|
|
|
explained, to which end the following notes are provided.
|
|
|
|
.PP
|
|
|
|
\fBDEC VAX\-11 D_floating\-point:\fR
|
|
|
|
.PP
|
|
|
|
This is the format for which the original math library \fIlibm\fR
|
|
|
|
was developed, and to which this manual is still principally
|
|
|
|
dedicated. It is \fIthe\fR double\-precision format for the PDP\-11
|
|
|
|
and the earlier VAX\-11 machines; VAX\-11s after 1983 were
|
|
|
|
provided with an optional "G" format closer to the IEEE
|
|
|
|
double\-precision format. The earlier DEC MicroVAXs have no
|
2000-03-02 14:54:02 +00:00
|
|
|
D format, only G double\-precision.
|
|
|
|
(Why? Why not?)
|
1994-08-19 09:40:01 +00:00
|
|
|
.PP
|
|
|
|
Properties of D_floating\-point:
|
|
|
|
.RS
|
|
|
|
Wordsize: 64 bits, 8 bytes. Radix: Binary.
|
|
|
|
.br
|
|
|
|
Precision: 56
|
|
|
|
.if n \
|
|
|
|
sig.
|
|
|
|
.if t \
|
|
|
|
significant
|
|
|
|
bits, roughly like 17
|
|
|
|
.if n \
|
|
|
|
sig.
|
|
|
|
.if t \
|
|
|
|
significant
|
|
|
|
decimals.
|
|
|
|
.RS
|
|
|
|
If x and x' are consecutive positive D_floating\-point
|
|
|
|
numbers (they differ by 1 \*(up), then
|
|
|
|
.br
|
|
|
|
1.3e\-17 < 0.5**56 < (x'\-x)/x \(<= 0.5**55 < 2.8e\-17.
|
|
|
|
.RE
|
|
|
|
.nf
|
|
|
|
.ta \w'Range:'u+1n +\w'Underflow threshold'u+1n +\w'= 2.0**127'u+1n
|
|
|
|
Range: Overflow threshold = 2.0**127 = 1.7e38.
|
|
|
|
Underflow threshold = 0.5**128 = 2.9e\-39.
|
|
|
|
NOTE: THIS RANGE IS COMPARATIVELY NARROW.
|
|
|
|
.ta
|
|
|
|
.fi
|
|
|
|
.RS
|
|
|
|
Overflow customarily stops computation.
|
|
|
|
.br
|
|
|
|
Underflow is customarily flushed quietly to zero.
|
|
|
|
.br
|
|
|
|
CAUTION:
|
|
|
|
.RS
|
|
|
|
It is possible to have x
|
|
|
|
.if n \
|
|
|
|
!=
|
|
|
|
.if t \
|
|
|
|
\(!=
|
|
|
|
y and yet
|
|
|
|
x\-y = 0 because of underflow. Similarly
|
|
|
|
x > y > 0 cannot prevent either x\(**y = 0
|
|
|
|
or y/x = 0 from happening without warning.
|
|
|
|
.RE
|
|
|
|
.RE
|
|
|
|
Zero is represented ambiguously.
|
|
|
|
.RS
|
|
|
|
Although 2**55 different representations of zero are accepted by
|
|
|
|
the hardware, only the obvious representation is ever produced.
|
|
|
|
There is no \-0 on a VAX.
|
|
|
|
.RE
|
|
|
|
.If
|
|
|
|
is not part of the VAX architecture.
|
|
|
|
.br
|
|
|
|
Reserved operands:
|
|
|
|
.RS
|
|
|
|
of the 2**55 that the hardware
|
|
|
|
recognizes, only one of them is ever produced.
|
|
|
|
Any floating\-point operation upon a reserved
|
|
|
|
operand, even a MOVF or MOVD, customarily stops
|
|
|
|
computation, so they are not much used.
|
|
|
|
.RE
|
|
|
|
Exceptions:
|
|
|
|
.RS
|
|
|
|
Divisions by zero and operations that
|
|
|
|
overflow are invalid operations that customarily
|
|
|
|
stop computation or, in earlier machines, produce
|
|
|
|
reserved operands that will stop computation.
|
|
|
|
.RE
|
|
|
|
Rounding:
|
|
|
|
.RS
|
|
|
|
Every rational operation (+, \-, \(**, /) on a
|
|
|
|
VAX (but not necessarily on a PDP\-11), if not an
|
|
|
|
over/underflow nor division by zero, is rounded to
|
|
|
|
within half an \*(up, and when the rounding error is
|
|
|
|
exactly half an \*(up then rounding is away from 0.
|
|
|
|
.RE
|
|
|
|
.RE
|
|
|
|
.PP
|
|
|
|
Except for its narrow range, D_floating\-point is one of the
|
|
|
|
better computer arithmetics designed in the 1960's.
|
|
|
|
Its properties are reflected fairly faithfully in the elementary
|
|
|
|
functions for a VAX distributed in 4.3 BSD.
|
|
|
|
They over/underflow only if their results have to lie out of range
|
|
|
|
or very nearly so, and then they behave much as any rational
|
|
|
|
arithmetic operation that over/underflowed would behave.
|
|
|
|
Similarly, expressions like log(0) and atanh(1) behave
|
|
|
|
like 1/0; and sqrt(\-3) and acos(3) behave like 0/0;
|
|
|
|
they all produce reserved operands and/or stop computation!
|
|
|
|
The situation is described in more detail in manual pages.
|
|
|
|
.RS
|
|
|
|
.ll -0.5i
|
|
|
|
\fIThis response seems excessively punitive, so it is destined
|
|
|
|
to be replaced at some time in the foreseeable future by a
|
|
|
|
more flexible but still uniform scheme being developed to
|
|
|
|
handle all floating\-point arithmetic exceptions neatly.
|
1996-02-12 01:20:38 +00:00
|
|
|
.\" See infnan(3M) for the present state of affairs.\fR
|
1994-08-19 09:40:01 +00:00
|
|
|
.ll +0.5i
|
|
|
|
.RE
|
|
|
|
.PP
|
|
|
|
How do the functions in 4.3 BSD's new \fIlibm\fR for UNIX
|
|
|
|
compare with their counterparts in DEC's VAX/VMS library?
|
|
|
|
Some of the VMS functions are a little faster, some are
|
|
|
|
a little more accurate, some are more puritanical about
|
|
|
|
exceptions (like pow(0.0,0.0) and atan2(0.0,0.0)),
|
|
|
|
and most occupy much more memory than their counterparts in
|
|
|
|
\fIlibm\fR.
|
|
|
|
The VMS codes interpolate in large table to achieve
|
|
|
|
speed and accuracy; the \fIlibm\fR codes use tricky formulas
|
|
|
|
compact enough that all of them may some day fit into a ROM.
|
|
|
|
.PP
|
|
|
|
More important, DEC regards the VMS codes as proprietary
|
|
|
|
and guards them zealously against unauthorized use. But the
|
|
|
|
\fIlibm\fR codes in 4.3 BSD are intended for the public domain;
|
|
|
|
they may be copied freely provided their provenance is always
|
|
|
|
acknowledged, and provided users assist the authors in their
|
|
|
|
researches by reporting experience with the codes.
|
|
|
|
Therefore no user of UNIX on a machine whose arithmetic resembles
|
|
|
|
VAX D_floating\-point need use anything worse than the new \fIlibm\fR.
|
|
|
|
.PP
|
|
|
|
\fBIEEE STANDARD 754 Floating\-Point Arithmetic:\fR
|
|
|
|
.PP
|
|
|
|
This standard is on its way to becoming more widely adopted
|
|
|
|
than any other design for computer arithmetic.
|
|
|
|
VLSI chips that conform to some version of that standard have been
|
|
|
|
produced by a host of manufacturers, among them ...
|
|
|
|
.nf
|
|
|
|
.ta 0.5i +\w'Intel i8070, i80287'u+6n
|
|
|
|
Intel i8087, i80287 National Semiconductor 32081
|
|
|
|
Motorola 68881 Weitek WTL-1032, ... , -1165
|
|
|
|
Zilog Z8070 Western Electric (AT&T) WE32106.
|
|
|
|
.ta
|
|
|
|
.fi
|
|
|
|
Other implementations range from software, done thoroughly
|
|
|
|
in the Apple Macintosh, through VLSI in the Hewlett\-Packard
|
|
|
|
9000 series, to the ELXSI 6400 running ECL at 3 Megaflops.
|
|
|
|
Several other companies have adopted the formats
|
|
|
|
of IEEE 754 without, alas, adhering to the standard's way
|
|
|
|
of handling rounding and exceptions like over/underflow.
|
|
|
|
The DEC VAX G_floating\-point format is very similar to the IEEE
|
|
|
|
754 Double format, so similar that the C programs for the
|
|
|
|
IEEE versions of most of the elementary functions listed
|
|
|
|
above could easily be converted to run on a MicroVAX, though
|
|
|
|
nobody has volunteered to do that yet.
|
|
|
|
.PP
|
|
|
|
The codes in 4.3 BSD's \fIlibm\fR for machines that conform to
|
|
|
|
IEEE 754 are intended primarily for the National Semi. 32081
|
|
|
|
and WTL 1164/65. To use these codes with the Intel or Zilog
|
|
|
|
chips, or with the Apple Macintosh or ELXSI 6400, is to
|
|
|
|
forego the use of better codes provided (perhaps freely) by
|
|
|
|
those companies and designed by some of the authors of the
|
|
|
|
codes above.
|
|
|
|
Except for \fIatan\fR, \fIcabs\fR, \fIcbrt\fR, \fIerf\fR,
|
|
|
|
\fIerfc\fR, \fIhypot\fR, \fIj0\-jn\fR, \fIlgamma\fR, \fIpow\fR
|
|
|
|
and \fIy0\-yn\fR,
|
|
|
|
the Motorola 68881 has all the functions in \fIlibm\fR on chip,
|
|
|
|
and faster and more accurate;
|
|
|
|
it, Apple, the i8087, Z8070 and WE32106 all use 64
|
|
|
|
.if n \
|
|
|
|
sig.
|
|
|
|
.if t \
|
|
|
|
significant
|
|
|
|
bits.
|
|
|
|
The main virtue of 4.3 BSD's
|
|
|
|
\fIlibm\fR codes is that they are intended for the public domain;
|
|
|
|
they may be copied freely provided their provenance is always
|
|
|
|
acknowledged, and provided users assist the authors in their
|
|
|
|
researches by reporting experience with the codes.
|
|
|
|
Therefore no user of UNIX on a machine that conforms to
|
|
|
|
IEEE 754 need use anything worse than the new \fIlibm\fR.
|
|
|
|
.PP
|
|
|
|
Properties of IEEE 754 Double\-Precision:
|
|
|
|
.RS
|
|
|
|
Wordsize: 64 bits, 8 bytes. Radix: Binary.
|
|
|
|
.br
|
|
|
|
Precision: 53
|
|
|
|
.if n \
|
|
|
|
sig.
|
|
|
|
.if t \
|
|
|
|
significant
|
|
|
|
bits, roughly like 16
|
|
|
|
.if n \
|
|
|
|
sig.
|
|
|
|
.if t \
|
|
|
|
significant
|
|
|
|
decimals.
|
|
|
|
.RS
|
|
|
|
If x and x' are consecutive positive Double\-Precision
|
|
|
|
numbers (they differ by 1 \*(up), then
|
|
|
|
.br
|
|
|
|
1.1e\-16 < 0.5**53 < (x'\-x)/x \(<= 0.5**52 < 2.3e\-16.
|
|
|
|
.RE
|
|
|
|
.nf
|
|
|
|
.ta \w'Range:'u+1n +\w'Underflow threshold'u+1n +\w'= 2.0**1024'u+1n
|
|
|
|
Range: Overflow threshold = 2.0**1024 = 1.8e308
|
|
|
|
Underflow threshold = 0.5**1022 = 2.2e\-308
|
|
|
|
.ta
|
|
|
|
.fi
|
|
|
|
.RS
|
|
|
|
Overflow goes by default to a signed
|
|
|
|
.If "" .
|
|
|
|
.br
|
|
|
|
Underflow is \fIGradual,\fR rounding to the nearest
|
|
|
|
integer multiple of 0.5**1074 = 4.9e\-324.
|
|
|
|
.RE
|
|
|
|
Zero is represented ambiguously as +0 or \-0.
|
|
|
|
.RS
|
|
|
|
Its sign transforms correctly through multiplication or
|
|
|
|
division, and is preserved by addition of zeros
|
|
|
|
with like signs; but x\-x yields +0 for every
|
|
|
|
finite x. The only operations that reveal zero's
|
|
|
|
sign are division by zero and copysign(x,\(+-0).
|
|
|
|
In particular, comparison (x > y, x \(>= y, etc.)
|
|
|
|
cannot be affected by the sign of zero; but if
|
|
|
|
finite x = y then
|
|
|
|
.If
|
|
|
|
\&= 1/(x\-y)
|
|
|
|
.if n \
|
|
|
|
!=
|
|
|
|
.if t \
|
|
|
|
\(!=
|
|
|
|
\-1/(y\-x) =
|
|
|
|
.If \- .
|
|
|
|
.RE
|
|
|
|
.If
|
|
|
|
is signed.
|
|
|
|
.RS
|
|
|
|
it persists when added to itself
|
|
|
|
or to any finite number. Its sign transforms
|
|
|
|
correctly through multiplication and division, and
|
|
|
|
.If (finite)/\(+- \0=\0\(+-0
|
|
|
|
(nonzero)/0 =
|
|
|
|
.If \(+- .
|
|
|
|
But
|
|
|
|
.if n \
|
|
|
|
Infinity\-Infinity, Infinity\(**0 and Infinity/Infinity
|
|
|
|
.if t \
|
|
|
|
\(if\-\(if, \(if\(**0 and \(if/\(if
|
|
|
|
are, like 0/0 and sqrt(\-3),
|
|
|
|
invalid operations that produce \*(nn. ...
|
|
|
|
.RE
|
|
|
|
Reserved operands:
|
|
|
|
.RS
|
|
|
|
there are 2**53\-2 of them, all
|
|
|
|
called \*(nn (\fIN\fRot \fIa N\fRumber).
|
|
|
|
Some, called Signaling \*(nns, trap any floating\-point operation
|
|
|
|
performed upon them; they are used to mark missing
|
|
|
|
or uninitialized values, or nonexistent elements
|
|
|
|
of arrays. The rest are Quiet \*(nns; they are
|
|
|
|
the default results of Invalid Operations, and
|
|
|
|
propagate through subsequent arithmetic operations.
|
|
|
|
If x
|
|
|
|
.if n \
|
|
|
|
!=
|
|
|
|
.if t \
|
|
|
|
\(!=
|
|
|
|
x then x is \*(nn; every other predicate
|
|
|
|
(x > y, x = y, x < y, ...) is FALSE if \*(nn is involved.
|
|
|
|
.br
|
|
|
|
NOTE: Trichotomy is violated by \*(nn.
|
|
|
|
.RS
|
|
|
|
Besides being FALSE, predicates that entail ordered
|
|
|
|
comparison, rather than mere (in)equality,
|
|
|
|
signal Invalid Operation when \*(nn is involved.
|
|
|
|
.RE
|
|
|
|
.RE
|
|
|
|
Rounding:
|
|
|
|
.RS
|
|
|
|
Every algebraic operation (+, \-, \(**, /,
|
|
|
|
.if n \
|
|
|
|
sqrt)
|
|
|
|
.if t \
|
|
|
|
\(sr)
|
|
|
|
is rounded by default to within half an \*(up, and
|
|
|
|
when the rounding error is exactly half an \*(up then
|
|
|
|
the rounded value's least significant bit is zero.
|
|
|
|
This kind of rounding is usually the best kind,
|
|
|
|
sometimes provably so; for instance, for every
|
|
|
|
x = 1.0, 2.0, 3.0, 4.0, ..., 2.0**52, we find
|
|
|
|
(x/3.0)\(**3.0 == x and (x/10.0)\(**10.0 == x and ...
|
|
|
|
despite that both the quotients and the products
|
|
|
|
have been rounded. Only rounding like IEEE 754
|
|
|
|
can do that. But no single kind of rounding can be
|
|
|
|
proved best for every circumstance, so IEEE 754
|
|
|
|
provides rounding towards zero or towards
|
|
|
|
.If +
|
|
|
|
or towards
|
|
|
|
.If \-
|
|
|
|
at the programmer's option. And the
|
|
|
|
same kinds of rounding are specified for
|
|
|
|
Binary\-Decimal Conversions, at least for magnitudes
|
|
|
|
between roughly 1.0e\-10 and 1.0e37.
|
|
|
|
.RE
|
|
|
|
Exceptions:
|
|
|
|
.RS
|
|
|
|
IEEE 754 recognizes five kinds of floating\-point exceptions,
|
|
|
|
listed below in declining order of probable importance.
|
|
|
|
.RS
|
|
|
|
.nf
|
|
|
|
.ta \w'Invalid Operation'u+6n +\w'Gradual Underflow'u+2n
|
|
|
|
Exception Default Result
|
|
|
|
.tc \(ru
|
|
|
|
|
|
|
|
.tc
|
|
|
|
Invalid Operation \*(nn, or FALSE
|
|
|
|
.if n \{\
|
|
|
|
Overflow \(+-Infinity
|
|
|
|
Divide by Zero \(+-Infinity \}
|
|
|
|
.if t \{\
|
|
|
|
Overflow \(+-\(if
|
|
|
|
Divide by Zero \(+-\(if \}
|
|
|
|
Underflow Gradual Underflow
|
|
|
|
Inexact Rounded value
|
|
|
|
.ta
|
|
|
|
.fi
|
|
|
|
.RE
|
|
|
|
NOTE: An Exception is not an Error unless handled
|
|
|
|
badly. What makes a class of exceptions exceptional
|
|
|
|
is that no single default response can be satisfactory
|
|
|
|
in every instance. On the other hand, if a default
|
|
|
|
response will serve most instances satisfactorily,
|
|
|
|
the unsatisfactory instances cannot justify aborting
|
|
|
|
computation every time the exception occurs.
|
|
|
|
.RE
|
|
|
|
.PP
|
|
|
|
For each kind of floating\-point exception, IEEE 754
|
|
|
|
provides a Flag that is raised each time its exception
|
|
|
|
is signaled, and stays raised until the program resets
|
|
|
|
it. Programs may also test, save and restore a flag.
|
|
|
|
Thus, IEEE 754 provides three ways by which programs
|
|
|
|
may cope with exceptions for which the default result
|
|
|
|
might be unsatisfactory:
|
|
|
|
.IP 1) \w'\0\0\0\0'u
|
|
|
|
Test for a condition that might cause an exception
|
|
|
|
later, and branch to avoid the exception.
|
|
|
|
.IP 2) \w'\0\0\0\0'u
|
|
|
|
Test a flag to see whether an exception has occurred
|
|
|
|
since the program last reset its flag.
|
|
|
|
.IP 3) \w'\0\0\0\0'u
|
|
|
|
Test a result to see whether it is a value that only
|
|
|
|
an exception could have produced.
|
|
|
|
.RS
|
|
|
|
CAUTION: The only reliable ways to discover
|
|
|
|
whether Underflow has occurred are to test whether
|
|
|
|
products or quotients lie closer to zero than the
|
|
|
|
underflow threshold, or to test the Underflow
|
|
|
|
flag. (Sums and differences cannot underflow in
|
|
|
|
IEEE 754; if x
|
|
|
|
.if n \
|
|
|
|
!=
|
|
|
|
.if t \
|
|
|
|
\(!=
|
|
|
|
y then x\-y is correct to
|
|
|
|
full precision and certainly nonzero regardless of
|
|
|
|
how tiny it may be.) Products and quotients that
|
|
|
|
underflow gradually can lose accuracy gradually
|
|
|
|
without vanishing, so comparing them with zero
|
|
|
|
(as one might on a VAX) will not reveal the loss.
|
|
|
|
Fortunately, if a gradually underflowed value is
|
|
|
|
destined to be added to something bigger than the
|
|
|
|
underflow threshold, as is almost always the case,
|
|
|
|
digits lost to gradual underflow will not be missed
|
|
|
|
because they would have been rounded off anyway.
|
|
|
|
So gradual underflows are usually \fIprovably\fR ignorable.
|
|
|
|
The same cannot be said of underflows flushed to 0.
|
|
|
|
.RE
|
|
|
|
.PP
|
|
|
|
At the option of an implementor conforming to IEEE 754,
|
|
|
|
other ways to cope with exceptions may be provided:
|
|
|
|
.IP 4) \w'\0\0\0\0'u
|
|
|
|
ABORT. This mechanism classifies an exception in
|
|
|
|
advance as an incident to be handled by means
|
|
|
|
traditionally associated with error\-handling
|
|
|
|
statements like "ON ERROR GO TO ...". Different
|
|
|
|
languages offer different forms of this statement,
|
|
|
|
but most share the following characteristics:
|
|
|
|
.IP \(em \w'\0\0\0\0'u
|
|
|
|
No means is provided to substitute a value for
|
|
|
|
the offending operation's result and resume
|
|
|
|
computation from what may be the middle of an
|
|
|
|
expression. An exceptional result is abandoned.
|
|
|
|
.IP \(em \w'\0\0\0\0'u
|
|
|
|
In a subprogram that lacks an error\-handling
|
|
|
|
statement, an exception causes the subprogram to
|
|
|
|
abort within whatever program called it, and so
|
|
|
|
on back up the chain of calling subprograms until
|
|
|
|
an error\-handling statement is encountered or the
|
|
|
|
whole task is aborted and memory is dumped.
|
|
|
|
.IP 5) \w'\0\0\0\0'u
|
|
|
|
STOP. This mechanism, requiring an interactive
|
|
|
|
debugging environment, is more for the programmer
|
|
|
|
than the program. It classifies an exception in
|
|
|
|
advance as a symptom of a programmer's error; the
|
|
|
|
exception suspends execution as near as it can to
|
|
|
|
the offending operation so that the programmer can
|
|
|
|
look around to see how it happened. Quite often
|
|
|
|
the first several exceptions turn out to be quite
|
|
|
|
unexceptionable, so the programmer ought ideally
|
|
|
|
to be able to resume execution after each one as if
|
|
|
|
execution had not been stopped.
|
|
|
|
.IP 6) \w'\0\0\0\0'u
|
|
|
|
\&... Other ways lie beyond the scope of this document.
|
|
|
|
.RE
|
|
|
|
.PP
|
|
|
|
The crucial problem for exception handling is the problem of
|
|
|
|
Scope, and the problem's solution is understood, but not
|
|
|
|
enough manpower was available to implement it fully in time
|
|
|
|
to be distributed in 4.3 BSD's \fIlibm\fR. Ideally, each
|
|
|
|
elementary function should act as if it were indivisible, or
|
|
|
|
atomic, in the sense that ...
|
|
|
|
.IP i) \w'iii)'u+2n
|
|
|
|
No exception should be signaled that is not deserved by
|
|
|
|
the data supplied to that function.
|
|
|
|
.IP ii) \w'iii)'u+2n
|
|
|
|
Any exception signaled should be identified with that
|
|
|
|
function rather than with one of its subroutines.
|
|
|
|
.IP iii) \w'iii)'u+2n
|
|
|
|
The internal behavior of an atomic function should not
|
|
|
|
be disrupted when a calling program changes from
|
|
|
|
one to another of the five or so ways of handling
|
|
|
|
exceptions listed above, although the definition
|
|
|
|
of the function may be correlated intentionally
|
|
|
|
with exception handling.
|
|
|
|
.PP
|
|
|
|
Ideally, every programmer should be able \fIconveniently\fR to
|
|
|
|
turn a debugged subprogram into one that appears atomic to
|
|
|
|
its users. But simulating all three characteristics of an
|
|
|
|
atomic function is still a tedious affair, entailing hosts
|
|
|
|
of tests and saves\-restores; work is under way to ameliorate
|
|
|
|
the inconvenience.
|
|
|
|
.PP
|
|
|
|
Meanwhile, the functions in \fIlibm\fR are only approximately
|
|
|
|
atomic. They signal no inappropriate exception except
|
|
|
|
possibly ...
|
|
|
|
.RS
|
|
|
|
Over/Underflow
|
|
|
|
.RS
|
|
|
|
when a result, if properly computed, might have lain barely within range, and
|
|
|
|
.RE
|
|
|
|
Inexact in \fIcabs\fR, \fIcbrt\fR, \fIhypot\fR, \fIlog10\fR and \fIpow\fR
|
|
|
|
.RS
|
|
|
|
when it happens to be exact, thanks to fortuitous cancellation of errors.
|
|
|
|
.RE
|
|
|
|
.RE
|
|
|
|
Otherwise, ...
|
|
|
|
.RS
|
|
|
|
Invalid Operation is signaled only when
|
|
|
|
.RS
|
|
|
|
any result but \*(nn would probably be misleading.
|
|
|
|
.RE
|
|
|
|
Overflow is signaled only when
|
|
|
|
.RS
|
|
|
|
the exact result would be finite but beyond the overflow threshold.
|
|
|
|
.RE
|
|
|
|
Divide\-by\-Zero is signaled only when
|
|
|
|
.RS
|
|
|
|
a function takes exactly infinite values at finite operands.
|
|
|
|
.RE
|
|
|
|
Underflow is signaled only when
|
|
|
|
.RS
|
|
|
|
the exact result would be nonzero but tinier than the underflow threshold.
|
|
|
|
.RE
|
|
|
|
Inexact is signaled only when
|
|
|
|
.RS
|
|
|
|
greater range or precision would be needed to represent the exact result.
|
|
|
|
.RE
|
|
|
|
.RE
|
|
|
|
.SH BUGS
|
|
|
|
When signals are appropriate, they are emitted by certain
|
|
|
|
operations within the codes, so a subroutine\-trace may be
|
|
|
|
needed to identify the function with its signal in case
|
|
|
|
method 5) above is in use. And the codes all take the
|
|
|
|
IEEE 754 defaults for granted; this means that a decision to
|
|
|
|
trap all divisions by zero could disrupt a code that would
|
|
|
|
otherwise get correct results despite division by zero.
|
|
|
|
.SH SEE ALSO
|
1998-02-04 22:30:20 +00:00
|
|
|
\fBfpgetround\fR(3),
|
|
|
|
\fBfpsetround\fR(3),
|
|
|
|
\fBfpgetprec\fR(3),
|
|
|
|
\fBfpsetprec\fR(3),
|
|
|
|
\fBfpgetmask\fR(3),
|
|
|
|
\fBfpsetmask\fR(3),
|
|
|
|
\fBfpgetsticky\fR(3),
|
|
|
|
\fBfpresetsticky\fR(3) - IEEE floating point interface
|
|
|
|
.SH NOTES
|
1994-08-19 09:40:01 +00:00
|
|
|
An explanation of IEEE 754 and its proposed extension p854
|
|
|
|
was published in the IEEE magazine MICRO in August 1984 under
|
|
|
|
the title "A Proposed Radix\- and Word\-length\-independent
|
|
|
|
Standard for Floating\-point Arithmetic" by W. J. Cody et al.
|
|
|
|
The manuals for Pascal, C and BASIC on the Apple Macintosh
|
|
|
|
document the features of IEEE 754 pretty well.
|
|
|
|
Articles in the IEEE magazine COMPUTER vol. 14 no. 3 (Mar.
|
|
|
|
1981), and in the ACM SIGNUM Newsletter Special Issue of
|
|
|
|
Oct. 1979, may be helpful although they pertain to
|
|
|
|
superseded drafts of the standard.
|