Topic: full precision for float/double
Author: Barry Margolin <barmar@bbnplanet.com>
Date: 1999/01/19
In article <780pc0$rf9$1@nnrp1.dejanews.com>, <AllanW@my-dejanews.com> wrote:
>Good lord, man! SetFullPrecision? If there was such a thing, my job
>would be a *lot* easier. But you can't print the "full precision" of a
>floating-point number in base 10. This would be possible in base 2, 4,
>8, 16, or 32, but nothing else, and none of these is suitable for
>human readers of floating-point numbers.
Although it may require a few dozen digits after the decimal point, all
binary floating point numbers *do* have an exact representation
in base 10. Floating point expresses a number as mantissa*2^exponent,
where mantissa and exponent are integers. All powers of 2 (both positive
and negative) have exact representations in decimal (the negative powers
are .5, .25, .125, .0625, etc.), so this product also is exact.
Precision is lost going the other way, though: 1/10 is a repeating decimal
in binary. So if you print a floating point number in full precision, it
may not show up as what you expect. If you enter 0.1, it will print out as
something like 0.1000000000000000120546289 (this is not an actual number
that's likely, I just pounded on the number keys for the last few digits).
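Editorial sketch (not part of the original post): the argument above can be checked with integer arithmetic only. Since 2^-k equals 5^k / 10^k, a double written as mant * 2^exp2 has a finite decimal expansion. The helper names `multiply_digits` and `exact_decimal` are hypothetical, and the code assumes a finite, non-negative input and a significand of at most 64 bits.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>
#include <string>
#include <vector>

// Multiply a little-endian vector of decimal digits by a small factor in place.
static void multiply_digits(std::vector<int>& digits, int factor) {
    int carry = 0;
    for (int& d : digits) {
        const int v = d * factor + carry;
        d = v % 10;
        carry = v / 10;
    }
    while (carry > 0) {
        digits.push_back(carry % 10);
        carry /= 10;
    }
}

// Hypothetical helper: exact decimal expansion of a finite, non-negative double.
std::string exact_decimal(double x) {
    const int p = std::numeric_limits<double>::digits;   // 53 for IEEE double
    int exp2 = 0;
    const double frac = std::frexp(x, &exp2);            // x == frac * 2^exp2
    auto mant = static_cast<std::uint64_t>(std::ldexp(frac, p));  // exact integer
    exp2 -= p;                                            // x == mant * 2^exp2

    std::vector<int> digits;                              // least significant first
    do {
        digits.push_back(static_cast<int>(mant % 10));
        mant /= 10;
    } while (mant != 0);

    int frac_digits = 0;
    if (exp2 >= 0) {
        for (int i = 0; i < exp2; ++i) multiply_digits(digits, 2);
    } else {
        // 2^-k == 5^k / 10^k: multiply by 5 k times, then shift the point k places.
        frac_digits = -exp2;
        for (int i = 0; i < frac_digits; ++i) multiply_digits(digits, 5);
    }
    // Pad so that at least one digit precedes the decimal point.
    while (static_cast<int>(digits.size()) <= frac_digits) digits.push_back(0);

    std::string out;
    for (auto it = digits.rbegin(); it != digits.rend(); ++it)
        out += static_cast<char>('0' + *it);
    if (frac_digits > 0) {
        out.insert(out.size() - frac_digits, ".");
        while (out.back() == '0') out.pop_back();         // drop trailing zeros
        if (out.back() == '.') out.pop_back();
    }
    return out;
}
```

On an IEEE double implementation, `exact_decimal(0.5)` gives "0.5", while `exact_decimal(0.1)` gives a string with 55 digits after the decimal point: the stored value is exact in decimal, it just is not 0.1.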
See the Clinger and Steele/White papers I referenced in another post for
more information on how to print and read floating point to meet common
expectations.
--
Barry Margolin, barmar@bbnplanet.com
GTE Internetworking, Powered by BBN, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Don't bother cc'ing followups to me.
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html ]
Author: James.Kanze@dresdner-bank.com
Date: 1999/01/19
In article <36A29570.2093@cs.cornell.edu>,
vavasis@cs.cornell.edu wrote:
> Suppose I print a double onto a ostream:
> double d = /* ... */;
> os << d;
>
> Later I read the double back in:
> double d2;
> is >> d2;
>
> Is it guaranteed that d==d2? Is there any standard I/O manipulator or
> combination of manipulators to guarantee this? I couldn't find anything
> about this in Stroustrup (C++ Programming Lang., 3d ed, 2d printing).
There is no guarantee whatsoever. Actual precision is a quality of
implementation issue; a conforming implementation could always output
"0" in the first case, although I doubt many would consider this very
good quality.
On machines which use an IEEE format, the IEEE standard does require
reversibility if enough digits (17 for double, I think) are used. I
would *expect* any implementation using IEEE to meet this requirement in
its C++ implementation, but again, this is my expectation with regards
to minimum quality, and not something guaranteed by the standard.
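Editorial sketch (not part of the original post): in later C++ (C++11 onward, so after this thread) the digit count mentioned above is available as `std::numeric_limits<double>::max_digits10`. The helper names `round_trip_text` and `survives_text_round_trip` are hypothetical. Whether the value read back compares equal still depends on the library converting correctly in both directions, which is the quality-of-implementation point made above.

```cpp
#include <iomanip>
#include <limits>
#include <sstream>

// Hypothetical helper: write d as text with max_digits10 significant digits
// (17 for an IEEE double), then read it back with operator>>.
double round_trip_text(double d) {
    std::stringstream ss;
    ss << std::setprecision(std::numeric_limits<double>::max_digits10) << d;
    double d2 = 0.0;
    ss >> d2;
    return d2;
}

// True when the text round trip reproduces the original value.
bool survives_text_round_trip(double d) {
    return round_trip_text(d) == d;
}
```

With the default stream precision of 6 the same test would fail for a value such as 1.0 / 3.0, because too few digits are written.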
--
James Kanze GABI Software, Sàrl
Conseils en informatique orientée objet --
-- Beratung in industrieller Datenverarbeitung
mailto: kanze@gabi-soft.fr mailto: James.Kanze@dresdner-bank.com
-----------== Posted via Deja News, The Discussion Network ==----------
http://www.dejanews.com/ Search, Read, Discuss, or Start Your Own
Author: Jens Kilian <Jens_Kilian@bbn.hp.com>
Date: 1999/01/20
Barry Margolin <barmar@bbnplanet.com> writes:
> Interested readers should check the proceedings of the 1990 ACM Conference
> on Programming Language Design and Implementation, which was published as
> SIGPLAN Notices 25(6). It contains two papers:
>
> William Clinger. How to read floating point numbers accurately.
>
> Guy Lewis Steele Jr. and Jon L White. How to print floating point numbers
> accurately.
Also the following, which contains improvements on the Steele & White
algorithms:
Robert G. Burger and R. Kent Dybvig
Printing Floating-Point Numbers Quickly and Accurately
Proceedings of the ACM SIGPLAN Conference on Programming
Language Design and Implementation, pp. 108-116, ACM Press, May
21-24 1996.
Also published as:
Robert G. Burger and R. Kent Dybvig
Printing Floating-Point Numbers Quickly and Accurately
ACM SIGPLAN Notices, 31(5), pp. 108-116, May 1996.
The paper is online at
http://www.cs.indiana.edu/hyplan/dyb/FP-Printing-PLDI96.ps.gz
The references in this paper include
David M. Gay
Correctly rounded binary-decimal and decimal-binary conversions
Numerical Analysis Manuscript 90-10, AT&T Bell Laboratories,
Murray Hill, New Jersey 07974, November 1990.
http://cm.bell-labs.com/cm/cs/doc/90/4-10.ps.gz
David M. Gay's routines are freely available; see
http://cm.bell-labs.com/netlib/fp/index.html
(Thank God for the Web!)
Jens.
--
mailto:jjk@acm.org phone:+49-7031-14-7698 (HP TELNET 778-7698)
http://www.bawue.de/~jjk/ fax:+49-7031-14-7351
PGP: 06 04 1C 35 7B DC 1F 26 As the air to a bird, or the sea to a fish,
0x555DA8B5 BB A2 F0 66 77 75 E1 08 so is contempt to the contemptible. [Blake]
---
Author: Jerry Leichter <leichter@smarts.com>
Date: 1999/01/20
| >On the contrary. It *is* possible - even simple - to define I/O so
| >that one can make this guarantee. What you need to understand is
| >what the guarantee is.
| >
| >Neither the program that wrote the value, nor the program that reads
| >it back, has access to the "true" value of some real number. It has
| >access to d or d2, which is an approximation within the limits of the
| >machine's FP representation. All we can ask is that the
| >representation be the same.
|
| As far as I can see there are just two ways of doing this:
|
| floating point values must be constrained so that the result of
| writing them out in denary and reading them back must result in
| the same bit pattern. That would be a cost on every floating point
| operation.... or ... Every time a value is written to file (or any
| other form of persistent storage) the value in memory must be modified
| to meet your requirement. That would require modifying const
| qualified values....
Did you read the rest of my message?
Your speculation about the requirements is incorrect. Efficient
algorithms that have the property I describe are known. They do not
place any constraints on the FP values. They've been implemented, and
they work. Take a look at the documentation for the fromDecimal and
toDecimal procedures at:
http://www.research.digital.com/SRC/m3sources/html/float/src/Common/Float.ig.html
The papers on which the algorithms are based date back 10-15 years.
It's one of the shames of this industry that, even when efficient,
correct algorithms are known and published, too many old, incorrect
implementations not only continue to exist - but continue to be written.
-- Jerry
---
Author: Francis Glassborow <francis@robinton.demon.co.uk>
Date: 1999/01/18
In article <36A29570.2093@cs.cornell.edu>, Stephen Vavasis
<vavasis@cs.cornell.edu> writes
>Suppose I print a double onto a ostream:
> double d = /* ... */;
> os << d;
>
>Later I read the double back in:
> double d2;
> is >> d2;
>
>Is it guaranteed that d==d2? Is there any standard I/O manipulator or
>combination of manipulators to guarantee this? I couldn't find anything
>about this in Stroustrup (C++ Programming Lang., 3d ed, 2d printing).
>
>If there is nothing about this in the standard, I would like to propose
>(for some future standard) a manipulator for this purpose:
> os << setfullprecision << d;
I can think of no way that this can be done in general. Output of
doubles is in denary (base 10) and internal representations are in
binary, in general the two representations will not produce identical
values. To achieve what you want you must use a binary mode file. Even
then there are problems because the representation of a floating point
type in a register might not be exactly what it is on the stack. So if
you write:
os << d1*d2;
is >> d3;
d3 might not be exactly d1*d2. But then that problem exists anyway:
d3= d1*d2;
cout << ((d3 == (d1*d2))? 0 : 1);
can produce either 0 or 1 as output:(
Francis Glassborow Chair of Association of C & C++ Users
64 Southfield Rd
Oxford OX4 1PA +44(0)1865 246490
All opinions are mine and do not represent those of any organisation
---
Author: Jerry Leichter <leichter@smarts.com>
Date: 1999/01/18
| >Suppose I print a double onto a ostream:
| > double d = /* ... */;
| > os << d;
| >
| >Later I read the double back in:
| > double d2;
| > is >> d2;
| >
| >Is it guaranteed that d==d2? ...
|
| I can think of no way that this can be done in general. Output of
| doubles is in denary (base 10) and internal representations are in
| binary, in general the two representations will not produce identical
| values.
On the contrary. It *is* possible - even simple - to define I/O so that
one can make this guarantee. What you need to understand is what the
guarantee is.
Neither the program that wrote the value, nor the program that reads it
back, has access to the "true" value of some real number. It has access
to d or d2, which is an approximation within the limits of the machine's
FP representation. All we can ask is that the representation be the
same.
Suppose our input reader had the property that it could compute every
possible FP binary number for *some* base-10 input. Knowing the
algorithm that the reader will use, the writer can choose to write that
number, in base-10, that the reader will convert back to the binary
value d.
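Editorial sketch (not part of the original post): the idea above, that the writer picks a decimal string which the reader is known to convert back to d, can be shown by brute force. This is not the efficient algorithm referred to in the thread; it simply tries 1, 2, ... significant digits until the text reads back to the same value. The names `shortest_round_trip` and `reads_back_exactly` are hypothetical, and the sketch assumes the library's `strtod` and `snprintf` convert correctly.

```cpp
#include <cstdio>
#include <cstdlib>
#include <limits>
#include <string>

// Hypothetical helper: the shortest "%g"-style text, in significant digits,
// that std::strtod converts back to exactly d.
std::string shortest_round_trip(double d) {
    char buf[64];
    const int max_digits = std::numeric_limits<double>::max_digits10;
    for (int digits = 1; digits <= max_digits; ++digits) {
        std::snprintf(buf, sizeof buf, "%.*g", digits, d);
        if (std::strtod(buf, nullptr) == d) break;   // the reader recovers d
    }
    // If no length round-trips (for example a NaN), the longest attempt is returned.
    return buf;
}

// True when the reader recovers d from the text chosen by the writer.
bool reads_back_exactly(double d) {
    return std::strtod(shortest_round_trip(d).c_str(), nullptr) == d;
}
```

Under these assumptions 0.1 is written as "0.1" rather than as a 17-digit string, and the value read back is still bit-for-bit the double that was written.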
The Modula-3 FP I/O library actually defines its FP formatting and
parsing routines in exactly this way: If you write out a floating point
value in the default format, and read it back, you are certain to get
back the original bits. (Well, not if the original was an NaN, but
let's not get into that.) There is an efficient algorithm for doing
this, which satisfies the obvious correctness constraints (e.g., the
binary number a decimal value is converted to is no further from the
decimal value than any other representable binary number).
For this to work *within an implementation* (i.e., reader and writer use
the same library) requires only careful work by that implementer. If
the standard were to require this to work *across distinct
implementations*, it would have to specify the algorithm. (Further, it doesn't
even make sense to specify that you "get the same bits back" when
reading back to an implementation using a different FP implementation.)
| Even then there are problems because the representation of a floating
| point type in a register might not be exactly what it is on the stack.
| So if you write:
|
| os << d1*d2;
| is >> d3;
|
| d3 might not be exactly d1*d2. But then that problem exists anyway:
|
| d3= d1*d2;
| cout << ((d3 == (d1*d2))? 0 : 1);
|
| can produce either 0 or 1 as output:(
This isn't quite the same problem (and is a significant, if annoying,
issue in its own right).
It *is* possible to guarantee that, if d1==d2, then if you write the
values of d1 and d2 out and read them back in, the values you read back
in will *also* compare equal. (I don't think you could prove that from
the current standard, but that's another story.) It's also possible to
guarantee that if you write d1 and read the value back into d2, then
d1==d2. (This may take a bit of extra work on machines that might have
d1 in a register with extra significance bits, but it *can* be done, and
shouldn't even be that expensive, if you want it.)
-- Jerry
---
Author: stephen.clamage@sun.com (Steve Clamage)
Date: 1999/01/18
Stephen Vavasis <vavasis@cs.cornell.edu> writes:
>Suppose I print a double onto a ostream:
> double d = /* ... */;
> os << d;
>Later I read the double back in:
> double d2;
> is >> d2;
>Is it guaranteed that d==d2?
No.
>Is there any standard I/O manipulator or
>combination of manipulators to guarantee this?
No.
Very little about floating-point is assured in C or C++.
Algorithms exist that provide maximum accuracy when converting
floating-point values to text, allowing any written value to be
read back in exactly. (I don't have a reference handy. Maybe
someone else can provide it.) The standard does not require their use.
--
Steve Clamage, stephen.clamage@sun.com
Author: Stephen Vavasis <vavasis@cs.cornell.edu>
Date: 1999/01/18
Francis Glassborow wrote:
>
> In article <36A29570.2093@cs.cornell.edu>, Stephen Vavasis
> <vavasis@cs.cornell.edu> writes
> [question about printing a double d onto a stream, reading it back in,
> and getting d back exactly]
> I can think of no way that this can be done in general. Output of
> doubles is in denary (base 10) and internal representations are in
> binary, in general the two representations will not produce identical
> values.
A different representation does not necessarily prevent exact
conversion. For example, 3 binary bits can be converted to one decimal
digit and then back to the same three bits (.001<->.1, .010<->[either .2
or .3], .011<->.4, .100<->.5, .101<->.6, .110<->[either .7 or .8],
.111<->.9) using ordinary rounding for conversion.
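Editorial sketch (not part of the original post): the 3-bit example above can be checked exhaustively with integer rounding. The function names are hypothetical; ties are rounded up here, and the table above shows that rounding them down works as well.

```cpp
// Round the 3-bit binary fraction n/8 (n = 0..7) to one decimal digit,
// i.e. the nearest integer to 10*n/8, ties rounded up.
int to_decimal_digit(int n) {
    return (n * 10 + 4) / 8;
}

// Round the one-digit decimal fraction digit/10 back to the nearest n/8.
int to_three_bits(int digit) {
    return (digit * 8 + 5) / 10;
}

// True when every 3-bit fraction survives binary -> decimal -> binary.
bool three_bits_round_trip() {
    for (int n = 0; n <= 7; ++n) {
        if (to_three_bits(to_decimal_digit(n)) != n) return false;
    }
    return true;
}
```

The round trip works because neighbouring 3-bit values are 0.125 apart, which is wider than the 0.1 spacing of one-digit decimals, so no two binary values are ever rounded to the same decimal digit.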
A use for exact representation in streams is checkpointing. A
long-running computation might want to dump its intermediate values to a
file so that it can be restarted from that state.
> To achieve what you want you must use a binary mode file.
For the purpose of checkpointing, I prefer text mode so that I can
actually examine the intermediate state. But in any case, if I switch
to binary-mode files, do doubles always come out in full precision, or
do I still need a call to setprecision? There is only a very brief
mention of binary mode on p. 639 of Stroustrup's book so I don't know
how it works.
-- Steve Vavasis
Author: Francis Glassborow <francis@robinton.demon.co.uk>
Date: 1999/01/18
In article <36A37469.1555@smarts.com>, Jerry Leichter
<leichter@smarts.com> writes
>On the contrary. It *is* possible - even simple - to define I/O so that
>one can make this guarantee. What you need to understand is what the
>guarantee is.
>
>Neither the program that wrote the value, nor the program that reads it
>back, has access to the "true" value of some real number. It has access
>to d or d2, which is an approximation within the limits of the machine's
>FP representation. All we can ask is that the representation be the
>same.
As far as I can see there are just two ways of doing this:
floating point values must be constrained so that the result of writing
them out in denary and reading them back must result in the same
bit pattern. That would be a cost on every floating point operation.
I suspect that would be unacceptable.
or
Every time a value is written to file (or any other form of persistent
storage) the value in memory must be modified to meet your requirement.
That would require modifying const qualified values.
I guess that would be even less acceptable.
However reasonable your suggestion may seem, it does place a burden on
the implementation way beyond the immediate write/read cycle.
If you want this type of assurance you should use binary read/write
methods.
BTW the C Standards committees recently looked at the far simpler
requirement that programmers should be able to ensure that register
representations matched memory ones. I seem to remember that we were in
favour of saying that (double)d (where d is already a double) should
strip any extra precision that was only available at register level.
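Editorial sketch (not part of the original post): one common way to discard extra register precision is to force the value through a double-sized object in memory. The helper names below are hypothetical. `FLT_EVAL_METHOD` comes from `<cfloat>` (C99, and C++11 onward) and reports whether intermediate results may carry extra precision; how strictly a given compiler honours stores and casts is an assumption to verify for your toolchain.

```cpp
#include <cfloat>

// 0 means expressions are evaluated in their own type; other values indicate
// that intermediates may be wider (or that the method is indeterminate).
int evaluation_method() {
    return FLT_EVAL_METHOD;
}

// Hypothetical helper: round x to the storage format of double by forcing
// it through a volatile object, so it cannot stay in a wider register.
double strip_extra_precision(double x) {
    volatile double stored = x;
    return stored;
}

// Compare a stored product with a recomputed one after both have been
// reduced to storage precision. (Still false if the product is a NaN.)
bool product_compares_equal(double d1, double d2) {
    const double d3 = strip_extra_precision(d1 * d2);
    return d3 == strip_extra_precision(d1 * d2);
}
```

Comparing the two stripped values avoids the situation described earlier in the thread, where one operand of == sits in memory and the other is still held in a wider register.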
Francis Glassborow Chair of Association of C & C++ Users
64 Southfield Rd
Oxford OX4 1PA +44(0)1865 246490
All opinions are mine and do not represent those of any organisation
Author: Barry Margolin <barmar@bbnplanet.com>
Date: 1999/01/19
In article <36A37469.1555@smarts.com>,
Jerry Leichter <leichter@smarts.com> wrote:
>| I can think of no way that this can be done in general. Output of
>| doubles is in denary (base 10) and internal representations are in
>| binary, in general the two representations will not produce identical
>| values.
>
>On the contrary. It *is* possible - even simple - to define I/O so that
>one can make this guarantee. What you need to understand is what the
>guarantee is.
Interested readers should check the proceedings of the 1990 ACM Conference
on Programming Language Design and Implementation, which was published as
SIGPLAN Notices 25(6). It contains two papers:
William Clinger. How to read floating point numbers accurately.
Guy Lewis Steele Jr. and Jon L White. How to print floating point numbers
accurately.
--
Barry Margolin, barmar@bbnplanet.com
GTE Internetworking, Powered by BBN, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Don't bother cc'ing followups to me.
---
Author: Francis Glassborow <francis@robinton.demon.co.uk>
Date: 1999/01/19
In article <36A364D0.6040@cs.cornell.edu>, Stephen Vavasis
<vavasis@cs.cornell.edu> writes
>For the purpose of checkpointing, I prefer text mode so that I can
>actually examine the intermediate state. But in any case, if I switch
>to binary-mode files, do doubles always come out in full precision, or
>do I still need a call to setprecision? There is only a very brief
>mention of binary mode on p. 639 of Stroustrup's book so I don't know
>how it works.
Do it with read() and write().
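Editorial sketch (not part of the original post): a minimal illustration of the unformatted `write()` / `read()` approach, using a `std::stringstream` in place of a file opened with `std::ios::binary`. The helper names are hypothetical. The bytes are copied as they are, so no precision setting applies, but the result is only meaningful when reader and writer share the same floating-point format and byte order.

```cpp
#include <istream>
#include <ostream>
#include <sstream>

// Write the object representation of d, byte for byte.
void write_raw(std::ostream& os, double d) {
    os.write(reinterpret_cast<const char*>(&d), sizeof d);
}

// Read sizeof(double) bytes back into d; returns false on a short read.
bool read_raw(std::istream& is, double& d) {
    is.read(reinterpret_cast<char*>(&d), sizeof d);
    return static_cast<bool>(is);
}

// Round-trip one value through an in-memory binary stream.
bool raw_round_trip(double d) {
    std::stringstream ss(std::ios::in | std::ios::out | std::ios::binary);
    write_raw(ss, d);
    double d2 = 0.0;
    return read_raw(ss, d2) && d2 == d;
}
```

The trade-off raised in the thread still applies: a checkpoint written this way cannot be inspected as text.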
Francis Glassborow Chair of Association of C & C++ Users
64 Southfield Rd
Oxford OX4 1PA +44(0)1865 246490
All opinions are mine and do not represent those of any organisation
---
Author: AllanW@my-dejanews.com
Date: 1999/01/19
In article <36A29570.2093@cs.cornell.edu>,
vavasis@cs.cornell.edu wrote:
> Suppose I print a double onto a ostream:
> double d = /* ... */;
> os << d;
>
> Later I read the double back in:
> double d2;
> is >> d2;
>
> Is it guaranteed that d==d2? Is there any standard I/O manipulator or
> combination of manipulators to guarantee this? I couldn't find anything
> about this in Stroustrup (C++ Programming Lang., 3d ed, 2d printing).
Of course not. The insertion and the extraction both involve translation
from one base to another; you WILL lose some precision.
> If there is nothing about this in the standard, I would like to propose
> (for some future standard) a manipulator for this purpose:
> os << setfullprecision << d;
Good lord, man! SetFullPrecision? If there was such a thing, my job
would be a *lot* easier. But you can't print the "full precision" of a
floating-point number in base 10. This would be possible in base 2, 4,
8, 16, or 32, but nothing else, and none of these is suitable for
human readers of floating-point numbers.
As an exercise, try this handy program:
#include <iostream>
int main() {
    double d = 1.0, e = 1.0;
    int i;
    for (i=2; i<=100; ++i) d /= i;
    for (i=2; i<=100; ++i) d *= i;
    if (d<e || d>e)
    {
        // At this point, we've demonstrated that doubles don't
        // have full precision -- and we've done it without
        // using any stream handling. That's not where the lost
        // precision comes from.
        std::cout << (d-e) << ": welcome to reality." << std::endl;
    }
    else
    {
        std::cout <<
            "You have a (very good!) optimizing compiler;\n"
            "the divisions and multiplications never took place.\n"
            "Try it with optimizations off to see the problem."
            << std::endl;
    }
}
The reasons why have nothing to do with the C++ standard; it's the same
problem that your hand-held calculator has. Try this:
1 / 3 = / 3 = * 9 =
Did you get 1.0? You probably got 0.99999999 -- very very close, but
not quite right.
For more details, try one of the newsgroups about numerical
representations or algorithms.
----
AllanW@my-dejanews.com is a "Spam Magnet" -- never read.
Please reply in USENET only, sorry.
Author: James Kuyper <kuyper@wizard.net>
Date: 1999/01/19
Francis Glassborow wrote:
>
> In article <36A37469.1555@smarts.com>, Jerry Leichter
> <leichter@smarts.com> writes
> >On the contrary. It *is* possible - even simple - to define I/O so that
> >one can make this guarantee. What you need to understand is what the
> >guarantee is.
> >
> >Neither the program that wrote the value, nor the program that reads it
> >back, has access to the "true" value of some real number. It has access
> >to d or d2, which is an approximation within the limits of the machine's
> >FP representation. All we can ask is that the representation be the
> >same.
>
> As far as I can see there are just two ways of doing this:
>
> floating point values must be constrained so that the result of writing
> them out in denary and reading them back must result in the same
> bit pattern. That would be a cost on every floating point operation.
>
> I suspect that would be unacceptable.
>
> or
>
> Every time a value is written to file (or any other form of persistent
> storage) the value in memory must be modified to meet your requirement.
> That would require modifying const qualified values.
>
> I guess that would be even less acceptable.
>
> However reasonable your suggestion may seem, it does place a burden on
> the implementation way beyond the immediate write/read cycle.
>
> If you want this type of assurance you should use binary read/write
> methods.
For any of the IEEE binary floating point formats I'm familiar with,
there is a minimum number of decimal digits such that for every ordinary
representable binary number, there is at least one faithful decimal
representation with that number of significant digits. By faithful, I
mean that the decimal number is closer to that binary number than to any
other representable value. It is feasible to require that any such
number printed with at least that number of significant digits, if
converted back to binary, be converted to exactly the same binary
number. By restricting the discussion to 'ordinary' numbers, I'm
ignoring the problems represented by NaN's, Infinities, and signed
zeros.
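Editorial sketch (not part of the original post): the minimum digit count described above is commonly computed as ceil(1 + p * log10(2)) for a binary format with p significand bits. The function name `digits_for_round_trip` is hypothetical; the comparison against `std::numeric_limits<double>::max_digits10` uses a C++11 facility that post-dates this thread.

```cpp
#include <cmath>
#include <limits>

// Significant decimal digits sufficient to tell apart every value of a
// binary floating-point format with p significand bits.
int digits_for_round_trip(int p) {
    return static_cast<int>(std::ceil(1.0 + p * std::log10(2.0)));
}

// True when the formula agrees with the library's own constant for double.
bool matches_library_constant() {
    return digits_for_round_trip(std::numeric_limits<double>::digits)
        == std::numeric_limits<double>::max_digits10;
}
```

For p = 24 (IEEE single) this gives 9 digits and for p = 53 (IEEE double) it gives 17, which matches the figure quoted elsewhere in this thread.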
This guarantee is not part of the standard, because it puts some strains
on the conversion routines. The simplest way to achieve it requires
using internally a floating point format at least a few bits larger than
the one being converted. There are machines which can do this easily,
and others for which it would be quite difficult. For instance, some
implementations use 64 bit floating point for storing numbers, and an
80-bit format for temporaries.
Author: Stephen Vavasis <vavasis@cs.cornell.edu>
Date: 1999/01/18
Suppose I print a double onto a ostream:
double d = /* ... */;
os << d;
Later I read the double back in:
double d2;
is >> d2;
Is it guaranteed that d==d2? Is there any standard I/O manipulator or
combination of manipulators to guarantee this? I couldn't find anything
about this in Stroustrup (C++ Programming Lang., 3d ed, 2d printing).
If there is nothing about this in the standard, I would like to propose
(for some future standard) a manipulator for this purpose:
os << setfullprecision << d;
-- Steve Vavasis
---