Thread

Topic: Accurate formatting of floating-point numbers?

Author: ark@research.att.com (Andrew Koenig)
Date: 1997/11/21

In article <3474B4FE.5D6A@nz.eds.com>,
Ross Smith  <ross.smith@nz.eds.com> wrote:

> > The only alternative I can imagine that _might_ work is to do the best
> > conversion to decimal possible with n bits, and then tweak the result until the
> > conversion back to binary produces the original value. But this would be many
> > times as inefficient as the original imperfect conversion.

> Even that isn't certain to work, because there's no guarantee that the
> decimal-to-binary conversion algorithm is capable of producing every
> possible binary value.

The IEEE floating-point spec requires that conversions in both directions
must be off by at most 0.47 in the low-order bit position, and says
that that requirement is sufficient to guarantee that every binary
value can be written out in decimal and read back to give exactly
the same bit pattern.

I have heard that the number 0.47 is the main result of Jerome Coonen's
PhD thesis; perhaps someone more closely connected with that work
can confirm.

The Scheme language adopts a slightly different, but also useful
approach to floating-point formatting: It requires that input conversion
be perfectly accurate (that is, the binary result is always exactly
equal to the result of rounding the exact decimal input value to
the given number of bits) and that the result of output conversion
be the *shortest* string of decimal digits that, when converted back,
yield exactly the same bit pattern.  I think that falls within the
IEEE bounds.
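
As a rough illustration of the shortest-digits idea, a brute-force sketch in
modern C++ can try increasing precisions until the text converts back to the
same value. This is not the algorithm from the Scheme literature, only an
approximation of its effect. The name shortest_round_trip is made up for this
example, and the sketch assumes IEEE doubles and a C library whose printf and
strtod round correctly, which neither the C nor the C++ standard promises.

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical helper: returns the first "%.{p}g" rendering of x
// (p = 1..17) that strtod maps back to exactly x.
// Assumption: printf and strtod are correctly rounded on this platform.
std::string shortest_round_trip(double x) {
    assert(std::isfinite(x));  // infinities and NaNs are out of scope here
    char buf[64];
    for (int precision = 1; precision <= 17; ++precision) {
        std::snprintf(buf, sizeof buf, "%.*g", precision, x);
        if (std::strtod(buf, nullptr) == x) {
            return buf;  // fewest significant digits that still round-trip
        }
    }
    return buf;  // 17 significant digits, the last rendering tried
}
```

Under those assumptions, shortest_round_trip(0.1) gives "0.1" and
shortest_round_trip(0.1 + 0.2) gives "0.30000000000000004". Near powers of two
the result may differ from a true shortest-output algorithm, because the
correctly rounded p-digit string is not always the only p-digit candidate.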

But as far as C++ is concerned, the standards committee did not want
to impose accuracy requirements on C++ beyond those imposed by C.
Otherwise it might not be possible for a C++ implementation to behave
compatibly with a C implementation on the same machine.
--
    --Andrew Koenig
      ark@research.att.com
      http://www.research.att.com/info/ark
---
[ comp.std.c++ is moderated.  To submit articles: try just posting with      ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu         ]
[ FAQ:      http://reality.sgi.com/employees/austern_mti/std-c++/faq.html    ]
[ Policy:   http://reality.sgi.com/employees/austern_mti/std-c++/policy.html ]
[ Comments? mailto:std-c++-request@ncar.ucar.edu                             ]

Author: Pierre Baillargeon <pierre@jazzmail.com>
Date: 1997/11/21

I think the original problem has been lost in the discussion. The problem was
generating C++ code with floating-point constants that would yield the same
number (bit for bit) when compiled.

Well, the C++ standard does not require any precision on number reading (for
obvious portability reasons) nor any method (i.e. it does not prescribe that all
possible digits be used, or even that additional digits be used to resolve
round-off). Thus even if the floating-point number is output correctly, there is
no guarantee that the compiler will read it correctly.

Since bit-for-bit fidelity was required, I suspect a non-portable solution could
be used (since bits are not portable). So using a union of a bit-field and a
floating-point number could do the trick. It would work on all platforms with the
same number representation (and byte ordering).
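
A minimal sketch of the bit-pattern idea follows. It swaps the union for
std::memcpy, because reading a union member other than the one last written is
formally undefined in C++. The function names are invented for this example,
and the code assumes a 64-bit double; the printed pattern is only meaningful
between platforms that share the same floating-point representation.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>

static_assert(sizeof(double) == sizeof(std::uint64_t),
              "this sketch assumes a 64-bit double");

// Copy the object representation of a double into an integer.
std::uint64_t double_to_bits(double x) {
    std::uint64_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    return bits;
}

// Inverse of double_to_bits: rebuild the double from its bit pattern.
double bits_to_double(std::uint64_t bits) {
    double x;
    std::memcpy(&x, &bits, sizeof x);
    return x;
}

// Text that a code generator could emit in place of a decimal literal.
std::string bits_literal(double x) {
    char buf[32];
    int written = std::snprintf(buf, sizeof buf, "0x%016llx",
                                static_cast<unsigned long long>(double_to_bits(x)));
    assert(written == 18);  // "0x" plus 16 hex digits
    (void)written;
    return buf;
}
```

On an IEEE 754 machine, bits_literal(1.0) produces "0x3ff0000000000000", and
the generated code would pass that constant to bits_to_double instead of
relying on the compiler's decimal conversion.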
---

Author: "James Russell Kuyper Jr." <kuyper@wizard.net>
Date: 1997/11/22

Ross Smith wrote:
...
> Even that isn't certain to work, because there's no guarantee that the
> decimal-to-binary conversion algorithm is capable of producing every
> possible binary value.

Every real number representable in IEEE floating point format has an
exact representation as a finite-length decimal string. I would expect
that every such string is converted exactly to the corresponding binary
value by atof(). Furthermore, I would expect that every decimal string
representing a number that is not too close to being exactly halfway
between representable values, will be converted to the nearest
representable value. In all cases, I would expect it to be converted to
one of the two bracketing representable values. Do you have any specific
reason why those expectations might not hold up?
---

Author: "Paul D. DeRocco" <pderocco@ix.netcom.com>
Date: 1997/11/22

Igor Boukanov wrote:
 >
 > Paul D. DeRocco (pderocco@ix.netcom.com) wrote:
 > > Not true. For any binary floating point value, there may not be a precise
 > > representation of it in decimal.
 >
 > Maybe I am very wrong, but as I understand it, any binary float value
 > may be represented as a decimal value (but not vice versa).
 > That follows from the fact that 10 = 2*5, which also implies that the exact
 > representation of an N-digit fractional part of a binary float number contains
 > exactly N decimal digits:
 > 0.111B = 0.875
 > 1e-4B = 0.0001B = 0.0625
 > etc.

Yes, of course, but it takes nearly N decimal digits to represent a binary
floating-point number with an exponent of -N, which is impractical when talking
about very tiny numbers.

--

Ciao,
Paul
---

Author: "Paul D. DeRocco" <pderocco@ix.netcom.com>
Date: 1997/11/24

James Russell Kuyper Jr. wrote:
>
> Every real number representable in IEEE floating point format has an
> exact representation as a finite-length decimal string. I would expect
> that every such string is converted exactly to the corresponding binary
> value by atof(). Furthermore, I would expect that every decimal string
> representing a number that is not too close to being exactly halfway
> between representable values, will be converted to the nearest
> representable value. In all cases, I would expect it to be converted to
> one of the two bracketing representable values. Do you have any specific
> reason why those expectations might not hold up?

In order to do this for a floating point number with n bits of precision, I
think it takes more than n bits of precision in the decimal to binary
conversion to ensure this, at least in the naive algorithm. I'm not sure,
though.

--

Ciao,
Paul
---

Author: Jonathan Fry <jon@spss.com>
Date: 1997/11/19

David Bruce wrote:
>
> How/can I output floating-point numbers as ASCII/UniCode/... text in such a
> way that I can be sure of getting the same number (i.e., no rounding errors)
> when
>   (1) reading them in again (e.g., with an istream)?
>   (2) using the output as a literal in generated C++ source code?
>
> In Scheme it is required that printing a number and reading it in again
> yields the same value, and algorithms (at least for IEEE754 floating point)
> to achieve this have been published since 1990 at least.

If any of these algorithms are _not_ written in Scheme or some other
notation which assumes exact and unlimited arithmetic, I'd be very
interested to hear about those.

--
--------------------
Jonathan Fry
Developer
SPSS Inc.
jon@spss.com      (SPSS questions to support@spss.com)
---

Author: fjh@murlibobo.cs.mu.OZ.AU (Fergus Henderson)
Date: 1997/11/19

jensk@bbn.hp.com (Jens Kilian) writes:

>To accurately read floating-point numbers, unlimited precision arithmetic
>is required (at least in some cases).

I don't think that is correct, or at least it misses the point.
Since floating-point numbers have finite precision, you only need
to write out a finite number of decimal digits to distinguish between e.g.
all possible IEEE double-precision floating point numbers.
Similarly, computing the finite-precision binary floating point number
that is the closest approximation to any finite decimal in the input
stream should not require unbounded memory.
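
The finite digit count can be made concrete. A commonly quoted bound is that
ceil(1 + p * log10(2)) significant decimal digits are enough to tell apart all
values of a binary format with p bits of precision. The sketch below evaluates
that bound; round_trip_digits is a name chosen for this example, and the
comparison with numeric_limits<double>::max_digits10 assumes a C++11 or later
library, which postdates this discussion.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: significant decimal digits that suffice to
// distinguish every value of a binary format with `mantissa_bits` bits of
// precision, using the bound ceil(1 + mantissa_bits * log10(2)).
int round_trip_digits(int mantissa_bits) {
    assert(mantissa_bits > 0);
    return static_cast<int>(std::ceil(1.0 + mantissa_bits * std::log10(2.0)));
}
```

For the 53-bit precision of an IEEE double this gives 17, and for the 24-bit
precision of an IEEE float it gives 9.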

--
Fergus Henderson <fjh@cs.mu.oz.au>   |  "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh>   |  of excellence is a lethal habit"
PGP: finger fjh@128.250.37.3         |     -- the last words of T. S. Garp.
---

Author: "Paul D. DeRocco" <pderocco@ix.netcom.com>
Date: 1997/11/20

Jens Kilian wrote:
>
> To accurately read floating-point numbers, unlimited precision arithmetic
> is required (at least in some cases).

Not true. For any binary floating point value, there may not be a precise
representation of it in decimal. However, if you keep adding significant
digits, you'll arrive at a point where incrementing the least significant digit
will produce a change small enough that it doesn't show up in the binary
equivalent. That is, there is some finite number of decimal digits that can be
used to represent any possible binary value. There may be several decimal
values that convert to the same binary value, however. But if the binary to
decimal conversion generates one of these decimal values, the decimal to binary
conversion will produce the original value.
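
The point that several decimal strings convert to the same binary value can be
checked directly. The sketch below is only an illustration: same_double is a
name invented here, and the behaviour described afterwards assumes IEEE
doubles and a strtod that rounds to the nearest representable value.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper: true when two decimal strings convert to the
// same double on this implementation.
bool same_double(const char* a, const char* b) {
    assert(a != nullptr && b != nullptr);
    return std::strtod(a, nullptr) == std::strtod(b, nullptr);
}
```

Under those assumptions, "0.1" and "0.10000000000000001" land on the same
double, while "0.10000000000000002" lands on its upper neighbour.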

Now for some speculation. The reason that this is never done is that in order
to do this for, say, n bits of precision, I believe the computations need more
than n bits of precision. You could therefore do it for floats by using
doubles, and perhaps for doubles by using long doubles, but you could never do
it for long doubles without writing a complete floating point emulator for a
"long long double", which would rarely be worth it.

The only alternative I can imagine that _might_ work is to do the best
conversion to decimal possible with n bits, and then tweak the result until the
conversion back to binary produces the original value. But this would be many
times as inefficient as the original imperfect conversion.

--

Ciao,
Paul
---

Author: ark@research.att.com (Andrew Koenig)
Date: 1997/11/21

In article <3472e109.0@isoit370.bbn.hp.com>,
Jens Kilian <Jens_Kilian@bbn.hp.com> wrote:

> > In Scheme it is required that printing a number and reading it in again
> > yields the same value, and algorithms (at least for IEEE754 floating point)
> > to achieve this have been published since 1990 at least.

> > As far as I can tell, the new ISO Standard C++ provides no such guarantees.

> Neither does the IEEE arithmetic standard itself.

Yes it does.
--
    --Andrew Koenig
      ark@research.att.com
      http://www.research.att.com/info/ark
---

Author: boukanov@sentef1.fi.uib.no (Igor Boukanov)
Date: 1997/11/21

Paul D. DeRocco (pderocco@ix.netcom.com) wrote:
> Not true. For any binary floating point value, there may not be a precise
> representation of it in decimal.

Maybe I am very wrong, but as I understand it, any binary float value
may be represented as a decimal value (but not vice versa).
That follows from the fact that 10 = 2*5, which also implies that the exact
representation of an N-digit fractional part of a binary float number contains
exactly N decimal digits:
0.111B = 0.875
1e-4B = 0.0001B = 0.0625
etc.
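
The argument can be spelled out in integer arithmetic: k / 2^N equals
k * 5^N / 10^N, so the fraction is the integer k * 5^N written with N decimal
digits. The sketch below is a small illustration of that identity, limited to
N <= 19 so that everything fits in 64 bits; the function name is made up for
this example.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: exact decimal expansion of numerator / 2^n for
// 0 <= numerator < 2^n, using numerator / 2^n == numerator * 5^n / 10^n.
std::string binary_fraction_to_decimal(std::uint64_t numerator, unsigned n) {
    assert(n >= 1 && n <= 19);  // keeps numerator * 5^n below 10^19 < 2^64
    assert(numerator < (std::uint64_t{1} << n));
    std::uint64_t scaled = numerator;
    for (unsigned i = 0; i < n; ++i) {
        scaled *= 5;  // after the loop: numerator * 5^n
    }
    std::string digits = std::to_string(scaled);
    // Pad on the left so that exactly n fractional digits are shown.
    return "0." + std::string(n - digits.size(), '0') + digits;
}
```

binary_fraction_to_decimal(7, 3) yields "0.875" and
binary_fraction_to_decimal(1, 4) yields "0.0625", matching the two examples
above.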

--
Regards, Igor Boukanov.
igor.boukanov@fi.uib.no
http://www.fi.uib.no/~boukanov/
---

Author: Ross Smith <ross.smith@nz.eds.com>
Date: 1997/11/21

Paul D. DeRocco wrote:
>
> The only alternative I can imagine that _might_ work is to do the best
> conversion to decimal possible with n bits, and then tweak the result until the
> conversion back to binary produces the original value. But this would be many
> times as inefficient as the original imperfect conversion.

Even that isn't certain to work, because there's no guarantee that the
decimal-to-binary conversion algorithm is capable of producing every
possible binary value.

--
Ross Smith ............................. <mailto:ross.smith@nz.eds.com>
Internet and New Media, EDS (New Zealand) Ltd., Wellington, New Zealand
   "The first thing we do, let's kill all the language lawyers."
                             -- Henry VI Part II, by W. Shakespeare;
                                additional dialogue by B. Stroustrup
---

Author: jensk@bbn.hp.com (Jens Kilian)
Date: 1997/11/21

Paul D. DeRocco (pderocco@ix.netcom.com) wrote:
> Jens Kilian wrote:
> >
> > To accurately read floating-point numbers, unlimited precision arithmetic
> > is required (at least in some cases).

> Not true. For any binary floating point value, there may not be a precise
> representation of it in decimal. However, if you keep adding significant
> digits, you'll arrive at a point where incrementing the least significant
> digit will produce a change small enough that it doesn't show up in the
> binary equivalent.

Oops, you're right.  The "unlimited precision" statement came from Clinger's
paper[1], which describes how to read numbers written with a *minimal*
number of digits.

Greetings,

 Jens.

[1] William D. Clinger: How to Read Floating-Point Numbers Accurately.
    Proceedings of the ACM SIGPLAN'90 Conference on Programming Language
    Design and Implementation (PLDI), White Plains, New York, June 20-22, 1990.
    SIGPLAN Notices 25(6) (June 1990), pp. 92-101.
--
mailto:jjk@acm.org                 phone:+49-7031-14-7698 (HP TELNET 778-7698)
  http://www.bawue.de/~jjk/          fax:+49-7031-14-7351
PGP:       06 04 1C 35 7B DC 1F 26 As the air to a bird, or the sea to a fish,
0x555DA8B5 BB A2 F0 66 77 75 E1 08 so is contempt to the contemptible. [Blake]
---

Author: mfinney@lynchburg.net
Date: 1997/11/21

In <347339DA.31F8@spss.com>, Jonathan Fry <jon@spss.com> writes:
>If any of these algorithms are _not_ written in Scheme or some other
>notation which assumes exact and unlimited arithmetic, I'd be very
>interested to hear about those.

The algorithms of which I am aware are...

How to Read Floating Point Numbers Accurately
   William D. Clinger, ACM SIGPLAN'90, pages 92-101

   - This is, unfortunately, in Scheme

How to Print Floating-Point Numbers Accurately
   Guy L. Steele Jr., Jon L. White, ACM SIGPLAN'90, pages 112-126

   - This is in a pseudo-ALGOL

Printing Floating-Point Numbers Quickly and Accurately
   Robert G. Burger, R. Kent Dybvig, ACM SIGPLAN'96, pages 108-116

   - This is also, unfortunately, in Scheme

However, the discussions should make it possible to convert to a
different language.

Michael Lee Finney
---

Author: David Bruce <dib@dera.gov.uk>
Date: 1997/11/18

How/can I output floating-point numbers as ASCII/UniCode/... text in such a
way that I can be sure of getting the same number (i.e., no rounding errors)
when
  (1) reading them in again (e.g., with an istream)?
  (2) using the output as a literal in generated C++ source code?

In Scheme it is required that printing a number and reading it in again
yields the same value, and algorithms (at least for IEEE754 floating point)
to achieve this have been published since 1990 at least.

As far as I can tell, the new ISO Standard C++ provides no such guarantees.
[C++PL3, section 21.4.3] talks about formats and precisions, but none of
the options presented appears to offer much flexibility.
I'm prepared for the output to use as many digits as it needs, though I'd
rather not use more than necessary.  I suppose I want a floating point
analogue of <stream>.width(0).  The best I've come up with so far is
    <stream>.precision(numeric_limits<double>::digits10)
but that probably generates *lots* of trailing zeros, and may or may not
even give accurate output in the sense I require.  (And it's pretty horrid.)
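
Whether a given precision is enough can be tested by writing a value to a
stream and reading it back. The sketch below does that with string streams. It
assumes IEEE doubles and a library whose stream conversions round correctly,
and it uses numeric_limits<double>::max_digits10, which was added in C++11 and
is not part of the standard under discussion here. The function name is made
up for this example.

```cpp
#include <cassert>
#include <iomanip>
#include <limits>
#include <sstream>

// Hypothetical helper: writes x with the given number of significant
// digits, reads the text back, and reports whether the same value came out.
bool survives_stream_round_trip(double x, int significant_digits) {
    assert(significant_digits > 0);
    std::ostringstream out;
    out << std::setprecision(significant_digits) << x;
    std::istringstream in(out.str());
    double y = 0.0;
    in >> y;
    return in && y == x;
}
```

Under those assumptions, 0.1 + 0.2 does not survive with digits10 (15)
significant digits, since it is written as 0.3, but it does survive with
max_digits10 (17). In the default notation trailing zeros are not printed
unless showpoint is set.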

Obviously I could try to write a manipulator/locale/whatever of my own
using the Scheme algorithms, but surely there must be a better way.
What have I missed?

Sincerely,

    David Bruce
----
post: DERA Malvern, St Andrews Road, Malvern, WORCS WR14 3PS, ENGLAND
mailto:dib@dera.gov.uk ** phone: +44 1684 895112 ** fax: +44 1684 894389
[The views expressed above are entirely those of the writer and do not represent the views, policy or understanding of any other person or official body.]
---

Author: jensk@bbn.hp.com (Jens Kilian)
Date: 1997/11/19

David Bruce (dib@dera.gov.uk) wrote:
> How/can I output floating-point numbers as ASCII/UniCode/... text in such a
> way that I can be sure of getting the same number (i.e., no rounding errors)
> when
>   (1) reading them in again (e.g., with an istream)?
>   (2) using the output as a literal in generated C++ source code?

> In Scheme it is required that printing a number and reading it in again
> yields the same value, and algorithms (at least for IEEE754 floating point)
> to achieve this have been published since 1990 at least.

> As far as I can tell, the new ISO Standard C++ provides no such guarantees.

Neither does the IEEE arithmetic standard itself.

> Obviously I could try to write a manipulator/locale/whatever of my own
> using the Scheme algorithms, but surely there must be a better way.
> What have I missed?

To accurately read floating-point numbers, unlimited precision arithmetic
is required (at least in some cases).

Bye,
 Jens.
--
mailto:jjk@acm.org                 phone:+49-7031-14-7698 (HP TELNET 778-7698)
  http://www.bawue.de/~jjk/          fax:+49-7031-14-7351
PGP:       06 04 1C 35 7B DC 1F 26 As the air to a bird, or the sea to a fish,
0x555DA8B5 BB A2 F0 66 77 75 E1 08 so is contempt to the contemptible. [Blake]
---