6.5.8 Floating Point

For the rules used by the text interpreter for recognising floating-point numbers see Number Conversion.

Gforth has a separate floating-point stack, but the documentation uses the unified notation (see footnote 10).

Floating-point numbers have a number of unpleasant surprises for the unwary (e.g., floating-point addition is not associative) and even a few for the wary. You should not use them unless you know what you are doing or you don’t care that the results you get may be totally bogus. If you want to learn about the problems of floating-point numbers (and how to avoid them), you might start with David Goldberg, “What Every Computer Scientist Should Know About Floating-Point Arithmetic”, ACM Computing Surveys 23(1):5–48, March 1991.
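As a minimal sketch of the non-associativity mentioned above, the following adds the same three numbers in two different groupings. It assumes that floats are IEEE 754 double-precision numbers (the usual case for Gforth on current hardware); the values are chosen only for illustration.

```forth
\ (a + b) + c, with a = 1e16, b = -1e16, c = 1e
1e16 -1e16 f+ 1e f+ f. cr
\ a + (b + c): b + c is rounded back to -1e16 before a is added
1e16 -1e16 1e f+ f+ f. cr
```

Under that assumption the first line yields 1 and the second 0, because -1e16 + 1 is not representable as a double and rounds back to -1e16.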

Conversion between integers and floating-point:

s>f ( n – r ) floating-ext “s-to-f”
d>f ( d – r ) floating “d-to-f”
f>s ( r – n ) floating-ext “f-to-s”
f>d ( r – d ) floating “f-to-d”
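A short sketch of how these conversion words might be used. It sticks to values that are exactly representable, so the round trips return the original integers.

```forth
42 s>f f. cr                 \ single-cell integer to float
42 s>f f>s . cr              \ and back again: 42
1000000 s>d d>f f>d d. cr    \ double-cell round trip: 1000000
```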

Arithmetic:

f+ ( r1 r2 – r3 ) floating “f-plus”
f- ( r1 r2 – r3 ) floating “f-minus”
f* ( r1 r2 – r3 ) floating “f-star”
f/ ( r1 r2 – r3 ) floating “f-slash”
fnegate ( r1 – r2 ) floating “f-negate”
fabs ( r1 – r2 ) floating-ext “f-abs”
fcopysign ( r1 r2 – r3  ) gforth-1.0 “fcopysign”

r3 takes its absolute value from r1 and its sign from r2

fmax ( r1 r2 – r3 ) floating “f-max”
fmin ( r1 r2 – r3 ) floating “f-min”
floor ( r1 – r2 ) floating “floor”

Round towards the next smaller integral value, i.e., round toward negative infinity.

fround ( r1 – r2 ) floating “f-round”

Round to the nearest integral value.

ftrunc ( r1 – r2  ) floating-ext “f-trunc”

Round towards 0.
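The three rounding words differ only in the direction they round. A small sketch with a negative and a positive argument (values chosen to avoid ties); the comments give the resulting values, not the exact output format of f. :

```forth
-2.7e floor  f. cr   \ -3: towards negative infinity
-2.7e fround f. cr   \ -3: nearest integral value
-2.7e ftrunc f. cr   \ -2: towards zero
 2.7e floor  f. cr   \  2
 2.7e fround f. cr   \  3
 2.7e ftrunc f. cr   \  2
```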

f** ( r1 r2 – r3 ) floating-ext “f-star-star”

r3 = r1^r2

fsqrt ( r1 – r2 ) floating-ext “f-square-root”
fexp ( r1 – r2 ) floating-ext “f-e-x-p”

r2 = e^r1

fexpm1 ( r1 – r2 ) floating-ext “f-e-x-p-m-one”

r2 = e^(r1) - 1

fln ( r1 – r2 ) floating-ext “f-l-n”

Natural logarithm: r1 = e^r2

flnp1 ( r1 – r2 ) floating-ext “f-l-n-p-one”

Inverse of fexpm1: r1+1 = e^r2
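fexpm1 and flnp1 exist because the obvious formulations lose all significant digits for arguments close to zero. A minimal sketch, assuming IEEE 754 double-precision floats; the argument 1e-20 is just an illustrative value:

```forth
1e-20 fexp 1e f- f. cr   \ 0: e^r1 rounds to exactly 1 before the subtraction
1e-20 fexpm1     f. cr   \ about 1e-20
1e-20 flnp1      f. cr   \ about 1e-20; 1e-20 1e f+ fln would give 0
```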

flog ( r1 – r2 ) floating-ext “f-log”

The decimal logarithm: r1 = 10^r2

falog ( r1 – r2 ) floating-ext “f-a-log”

r2 = 10^r1

f2* ( r1 – r2  ) gforth-0.2 “f2*”

Multiply r1 by 2.0e0

f2/ ( r1 – r2  ) gforth-0.2 “f2/”

Multiply r1 by 0.5e0

1/f ( r1 – r2  ) gforth-0.2 “1/f”

Divide 1.0e0 by r1.

Vector arithmetic:

v* ( f-addr1 nstride1 f-addr2 nstride2 ucount – r ) gforth-0.5 “v-star”

Dot product: r = v1*v2. The first element of v1 is at f-addr1, the next at f-addr1+nstride1, and so on (similarly for v2). Both vectors have ucount elements.

faxpy ( ra f-x nstridex f-y nstridey ucount – ) gforth-0.5 “faxpy”

vy = ra*vx + vy, where vy is the vector starting at f-y with a stride of nstridey bytes, vx is the vector starting at f-x with a stride of nstridex bytes, and both vectors contain ucount elements.
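A minimal sketch of both vector words on two contiguous three-element vectors. The names vx and vy are made up for this example, and the stride is given as 1 floats on the assumption that the elements are stored back to back.

```forth
\ Two 3-element vectors stored contiguously, so the stride is 1 floats.
falign here  1e f, 2e f, 3e f,  constant vx
falign here  4e f, 5e f, 6e f,  constant vy

vx 1 floats  vy 1 floats  3 v* f. cr     \ 1*4 + 2*5 + 3*6 = 32

2e  vx 1 floats  vy 1 floats  3 faxpy    \ vy = 2*vx + vy
vy f@ f.  vy float+ f@ f.  vy 2 floats + f@ f. cr   \ 6 9 12
```

A larger stride (e.g., 2 floats) would pick every second element, which is one way to walk a column of a row-major matrix.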

Angles in floating point operations are given in radians (a full circle has 2 pi radians).

fsin ( r1 – r2 ) floating-ext “f-sine”
fcos ( r1 – r2 ) floating-ext “f-cos”
fsincos ( r1 – r2 r3 ) floating-ext “f-sine-cos”

r2=sin(r1), r3=cos(r1)

ftan ( r1 – r2 ) floating-ext “f-tan”
fasin ( r1 – r2 ) floating-ext “f-a-sine”
facos ( r1 – r2 ) floating-ext “f-a-cos”
fatan ( r1 – r2 ) floating-ext “f-a-tan”
fatan2 ( r1 r2 – r3 ) floating-ext “f-a-tan-two”

r1/r2=tan(r3). Forth-2012 does not require, but probably intends this to be the inverse of fsincos. In Gforth it is.
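A short sketch of working in radians and of fatan2 undoing fsincos. The word deg>rad is a hypothetical helper defined here for illustration, not a Gforth word; results are only approximate, as usual for floating point.

```forth
\ deg>rad is a hypothetical helper, not a Gforth word.
: deg>rad ( r1 -- r2 )  pi f* 180e f/ ;

90e deg>rad fsin f. cr            \ close to 1
1e 1e fatan2 f.  pi 4e f/ f. cr   \ both close to pi/4
0.5e fsincos fatan2 f. cr         \ recovers an angle close to 0.5
```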

fsinh ( r1 – r2 ) floating-ext “f-cinch”
fcosh ( r1 – r2 ) floating-ext “f-cosh”
ftanh ( r1 – r2 ) floating-ext “f-tan-h”
fasinh ( r1 – r2 ) floating-ext “f-a-cinch”
facosh ( r1 – r2 ) floating-ext “f-a-cosh”
fatanh ( r1 – r2 ) floating-ext “f-a-tan-h”
pi ( – r  ) gforth-0.2 “pi”

Fconstant: r is the value pi, the ratio of a circle’s circumference to its diameter.

One particular problem with floating-point arithmetic is that comparison for equality often fails when you would expect it to succeed. For this reason approximate equality is often preferred (but you still have to know what you are doing). Also note that IEEE NaNs may compare differently from what you might expect. The comparison words are:

f~rel ( r1 r2 r3 – flag  ) gforth-0.5 “f~rel”

Approximate equality with relative error: |r1-r2|<r3*|r1+r2|.

f~abs ( r1 r2 r3 – flag  ) gforth-0.5 “f~abs”

Approximate equality with absolute error: |r1-r2|<r3.

f~ ( r1 r2 r3 – flag  ) floating-ext “f-proximate”

Forth-2012 medley for comparing r1 and r2 for equality: r3>0: f~abs; r3=0: bitwise comparison; r3<0: fnegate f~rel.

f= ( r1 r2 – f ) gforth-0.2 “f-equals”
f<> ( r1 r2 – f ) gforth-0.2 “f-not-equals”
f< ( r1 r2 – f ) floating “f-less-than”
f<= ( r1 r2 – f ) gforth-0.2 “f-less-or-equal”
f> ( r1 r2 – f ) gforth-0.2 “f-greater-than”
f>= ( r1 r2 – f ) gforth-0.2 “f-greater-or-equal”
f0< ( r – f ) floating “f-zero-less-than”
f0<= ( r – f ) gforth-0.2 “f-zero-less-or-equal”
f0<> ( r – f ) gforth-0.2 “f-zero-not-equals”
f0= ( r – f ) floating “f-zero-equals”
f0> ( r – f ) gforth-0.2 “f-zero-greater-than”
f0>= ( r – f ) gforth-0.2 “f-zero-greater-or-equal”
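The following sketch contrasts exact and approximate comparison on the classic 0.1 + 0.2 example. It assumes IEEE 754 double-precision floats; the tolerance 1e-9 is an arbitrary illustrative choice, not a recommendation.

```forth
0.1e 0.2e f+ 0.3e f= . cr             \ 0 (false) with IEEE doubles
0.1e 0.2e f+ 0.3e 1e-9 f~abs . cr     \ -1 (true): |r1-r2| < 1e-9
0.1e 0.2e f+ 0.3e 1e-9 f~rel . cr     \ -1: |r1-r2| < 1e-9*|r1+r2|
0.1e 0.2e f+ 0.3e -1e-9 f~ . cr       \ -1: negative r3 selects the relative test
0.1e 0.2e f+ 0.3e 0e f~ . cr          \ 0: r3 = 0 demands identical bit patterns
```

Which tolerance is appropriate depends on how the compared values were computed; an absolute bound suits values of known magnitude, a relative bound suits values whose scale varies.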

Special values in IEEE 754 can be produced by, for example, dividing by zero. The most common ones are defined as floating-point constants for convenience.

infinity ( – r  ) gforth-1.0 “infinity”

Floating-point infinity.

inf ( – r  ) gforth-1.0 “inf”

Synonym of infinity for copy-paste from ... (see Examining data).

-infinity ( – r  ) gforth-1.0 “-infinity”

Floating-point -infinity.

-inf ( – r  ) gforth-1.0 “-inf”

Synonym of -infinity for copy-paste from ... (see Examining data).

NaN ( – r  ) gforth-1.0 “NaN”

Floating-point Not a Number.
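A minimal sketch of how these constants behave, assuming a Gforth 1.0 system with the default, non-trapping floating-point environment. The word fnan? is a hypothetical helper defined here for illustration, not a Gforth word.

```forth
\ fnan? is a hypothetical helper: a NaN is the only value unequal to itself.
: fnan? ( r -- flag )  fdup f= 0= ;

1e 0e f/  infinity f= . cr     \ -1: 1/0 gives infinity
-1e 0e f/ -infinity f= . cr    \ -1
NaN fdup f= . cr               \ 0: a NaN does not compare equal even to itself
NaN fnan? .  1e fnan? . cr     \ -1 0
```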


Footnotes

(10)

It’s easy to generate the separate notation from that by just separating the floating-point numbers out: e.g. ( n r1 u r2 -- r3 ) becomes ( n u -- ) ( F: r1 r2 -- r3 ).