Topic: Should Float(s) to string(stream) be re-input exactly the same?


Author: pbristow@hetp.u-net.com ("Paul A Bristow")
Date: Fri, 12 Nov 2004 00:20:31 GMT
""Eric Backus"" <eric_backus@alum.mit.edu> wrote in message
news:1098664795.487306@cswreg.cos.agilent.com...
> ""Andrew Koenig"" <ark@acm.org> wrote in message
> news:o6ved.751242$Gx4.433382@bgtnsc04-news.ops.worldnet.att.net...
> > <kanze@gabi-soft.fr> wrote in message
> > news:d6652001.0410220022.100e7a2b@posting.google.com...
>
> This doesn't sound right to me.  Let's take IEEE double-precision as an
> example.  I was under the impression that 17 significant decimal digits is
> enough to uniquely represent any IEEE double-precision number, and that it
> would allow for a "round trip" without any rounding errors.  Where does
> the unbounded-precision arithmetic come in?

My 'round-tripping' experience confirms this, for float exhaustively - ALL
legitimate 32-bit IEEE x86 float values round-tripped correctly with a
previous compiler version - and for double (64-bit IEEE x86) with only a
large number of random samples, because estimates suggested my PC, and I,
would be worn out/deceased before a full test completed ;-) .

For 32-bit IEEE float, 6 digits are guaranteed, but 9 may be significant.
For 64-bit IEEE double, 15 are guaranteed, and 17 may be significant.

I believe the following expression will provide these values fairly portably
(though not for radix 10, I suspect):

2 + std::numeric_limits<FloatingPointType>::digits * 3010/10000

(3010/10000 approximates log10(2) = 0.30103, which cannot (yet :-(( ) be
evaluated at compile time), and digits is the number of significand bits
(FLT_MANT_DIG, DBL_MANT_DIG, ...).

Links below (relevant to the above) may be of interest to some readers.
  1. http://http.cs.berkeley.edu/~wkahan/ - home page of William Kahan,
pioneer of the IEEE 754 specification.
  2. http://http.cs.berkeley.edu/~wkahan/imporber.pdf - "The Improbability
of Probabilistic Error Analyses for Numerical Computations".
  3. http://http.cs.berkeley.edu/~wkahan/names.pdf - gives useful definitions
and explanations of radix, exponent, etc.
  4. http://http.cs.berkley.edu/~wkahan/ieee754status/ieee754.ps - page 4
gives significant digits for real formats.

It will be obvious to many that numerous operations, including (non-binary)
serialization, depend on exact 'round-tripping' to avoid difficult-to-spot
minor inaccuracies creeping in.  (This assumes the same FP format, of
course.)

Paul

PS  Exercise for students: What is the formula for calculating the _maximum_
number of decimal digits required for a C/C++ 'exactly representable'
value?  Is it the number of significand bits?  Why?  Explain your answer by
means of diagrams or programs ;-)

Paul A Bristow
Prizet Farmhouse, Kendal LA8 8AB   UK
pbristow@hetp.u-net.com



---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: kanze@gabi-soft.fr
Date: Fri, 22 Oct 2004 17:37:28 GMT
AlbertoBarbati@libero.it (Alberto Barbati) wrote in message
news:<gXtdd.49586$N45.1462626@twister2.libero.it>...
> Paul A Bristow wrote:

> > Some of the discussion veered off onto the method of inputting in
> > hex, but I feel that it is another issue (and is, of course, tiresome
> > to make portable because it is dependent on the particular floating
> > point format, as is using exactly representable decimal digit
> > strings).

> You don't seem to have understood what the hex fp format is. The hex
> fp format *does not* depend on the implementation of floating point
> and is supposed to be as portable as the "regular" scientific
> format. As I wrote in my post, the hex fp format is a string like
> [−]0xh.hhhhp±d where "h.hhhh" is a hexadecimal number representing
> the mantissa and "d" is a decimal number representing the
> exponent. It's *exactly the same* as the scientific format, except
> that the mantissa is expressed as a hexadecimal instead of a decimal
> number. The advantage is that if FLT_RADIX is a power of 2, you can do
> exact I/O (both exact output *and* exact input!) using a finite number
> of digits.

You can also do that with decimal; 2 is a factor of 10.  The difference
is that with hex format, the conversion is a lot simpler, with less
risk of rounding errors in the conversion routines.  And with fewer
total digits.

> If FLT_RADIX is not a power of 2 it's no better and no worse than the
> "regular" scientific format (unless FLT_RADIX is either 5 or a
> multiple of 10... which I believe is rather rare in practice).

I've only seen three FLT_RADIX in practice: 2, 16 and 10.  Rare is a
matter of appreciation -- I'm pretty sure that 10 is less common than 2
or 16 today, but it is certainly more frequent than any other value.
(Actually, I'm not sure of that.  Most general purpose processors today
use IEEE, which is base 2, but I think that a lot of smaller, hand-held
processors use base 10.  I don't think any of them have a C++ compiler,
but who knows.)

--
James Kanze           GABI Software         http://www.gabi-soft.fr
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


Author: ark@acm.org ("Andrew Koenig")
Date: Sat, 23 Oct 2004 16:38:37 GMT
<kanze@gabi-soft.fr> wrote in message
news:d6652001.0410220022.100e7a2b@posting.google.com...
> AlbertoBarbati@libero.it (Alberto Barbati) wrote in message
> news:<gXtdd.49586$N45.1462626@twister2.libero.it>...

>> The advantage is that if FLT_RADIX is a power of 2, you can do
>> exact I/O (both exact output *and* exact input!) using a finite number
>> of digits.

> You can also do that with decimal; 2 is a factor of 10.  The difference
> is that with hex format, the conversion is a lot simpler, with less
> risk of rounding errors in the conversion routines.  And with fewer
> total digits.

Ummm...not quite.

It is true that every binary floating-point number has an exact decimal
representation.  However, it is not true that every decimal floating-point
number has an exact binary representation.

More generally, if you wish to read a decimal representation of a
floating-point number and compute the nearest binary floating-point value to
what you read, you need unbounded-precision arithmetic to do it correctly in
all cases.  If, on the other hand, the representation is binary or hex, you
don't.

It is true that you might need to examine all of the input bits to determine
the correct rounding of the result, but you don't need unbounded-precision
arithmetic to do that -- instead, you can (if needed) get into a mode where
you scan bits looking for a 1 or a zero (depending on rounding mode) and
stop when you find it.


Author: hyrosen@mail.com (Hyman Rosen)
Date: Sun, 24 Oct 2004 20:45:19 GMT
Alberto Barbati wrote:
 > The advantage is that if  FLT_RADIX is a power of 2,
 > you can do exact I/O (both exact output *and* exact input!)
 > using a finite number of digits.

Of course this is equally true for decimal I/O!
Every binary floating point number can be written
exactly using a finite number of decimal digits.

This whole topic comes up over and over. There is
a pair of classic papers, published in SIGPLAN
proceedings, which discuss how to write and
read floating point numbers. The upshot is that
there is a way to convert a decimal string to its
nearest floating point representation, and a way to
write a shortest decimal string such that the read
algorithm will convert it back to the original value.
That's all that's really needed.


Author: eric_backus@alum.mit.edu ("Eric Backus")
Date: Mon, 25 Oct 2004 08:06:45 GMT
""Andrew Koenig"" <ark@acm.org> wrote in message
news:o6ved.751242$Gx4.433382@bgtnsc04-news.ops.worldnet.att.net...
> <kanze@gabi-soft.fr> wrote in message
> news:d6652001.0410220022.100e7a2b@posting.google.com...
>> You can also do that with decimal; 2 is a factor of 10.  The difference
>> is that with hex format, the conversion is a lot simpler, with less
>> risk of rounding errors in the conversion routines.  And with fewer
>> total digits.
>
> Ummm...not quite.
>
> It is true that every binary floating-point number has an exact decimal
> representation.  However, it is not true that every decimal floating-point
> number has an exact binary representation.
>
> More generally, if you wish to read a decimal representation of a
> floating-point number and compute the nearest binary floating-point value
> to what you read, you need unbounded-precision arithmetic to do it
> correctly in all cases.

This doesn't sound right to me.  Let's take IEEE double-precision as an
example.  I was under the impression that 17 significant decimal digits is
enough to uniquely represent any IEEE double-precision number, and that it
would allow for a "round trip" without any rounding errors.  Where does the
unbounded-precision arithmetic come in?

--
Eric Backus


Author: v.Abazarov@comAcast.net (Victor Bazarov)
Date: Mon, 25 Oct 2004 18:22:01 GMT
Alberto Barbati wrote:
> Paul A Bristow wrote:
>
>> Some of the discussion veered off onto the method of inputting in hex,
>> but I feel that it is another issue (and is, of course, tiresome to make
>> portable because it is dependent on the particular floating point
>> format, as is using exactly representable decimal digit strings).
>
> You don't seem to have understood what the hex fp format is. The hex fp
> format *does not* depend on the implementation of floating point and is
> supposed to be as portable as the "regular" scientific format. As I
> wrote in my post, the hex fp format is a string like [−]0xh.hhhhp±d
> where "h.hhhh" is a hexadecimal number representing the mantissa and "d"
> is a decimal number representing the exponent. It's *exactly the same*
> as the scientific format, except that the mantissa is expressed as a
> hexadecimal instead of a decimal number. The advantage is that if
> FLT_RADIX is a power of 2, you can do exact I/O (both exact output *and*
> exact input!) using a finite number of digits. If FLT_RADIX is not a
> power of 2 it's no better and no worse than the "regular" scientific
> format (unless FLT_RADIX is either 5 or a multiple of 10... which I
> believe is rather rare in practice).

Alberto,

Calling hex fp format "portable" is definitely stretching it.  If on two
systems 'float', say, has different representations, the "same" number
will be output in a different way.  IOW, it will only be "both exact
output *and* exact input" on a system with the same internal
representation of floating point numbers.  AFAICT, there is no
_specific_ requirement for the implementation of floating point numbers
in C++.  Am I missing some part of the standard where floating point has
to have a particular (binary) representation?

V


Author: jjk@acm.org (Jens Kilian)
Date: Tue, 26 Oct 2004 02:45:30 GMT
eric_backus@alum.mit.edu ("Eric Backus") writes:
> This doesn't sound right to me.  Let's take IEEE double-precision as
> an example.  I was under the impression that 17 significant decimal
> digits is enough to uniquely represent any IEEE double-precision
> number, and that it would allow for a "round trip" without any
> rounding errors.  Where does the unbounded-precision arithmetic come
> in?

It comes in when you want to print/read the *shortest* decimal representation
which unambiguously represents a given floating-point number.  The papers
mentioned upthread are:

        http://portal.acm.org/citation.cfm?id=93557
        http://portal.acm.org/citation.cfm?id=93559
--
mailto:jjk@acm.org                 As the air to a bird, or the sea to a fish,
  http://www.bawue.de/~jjk/        so is contempt to the contemptible. [Blake]


Author: kuyper@wizard.net (James Kuyper)
Date: Tue, 26 Oct 2004 21:18:00 GMT
v.Abazarov@comAcast.net (Victor Bazarov) wrote in message news:<kgPdd.6032$Ae.1725@newsread1.dllstx09.us.to.verio.net>...
> Alberto Barbati wrote:
>
>> Paul A Bristow wrote:

..

>> You don't seem to have understood what the hex fp format is. The
>> hex fp format *does not* depend on the implementation of floating
>> point and is supposed to be as portable as the "regular" scientific
>> format. As I wrote in my post, the hex fp format is a string like
>> [−]0xh.hhhhp±d where "h.hhhh" is a hexadecimal number
>> representing the mantissa and "d" is a decimal number representing the
>> exponent. It's *exactly the same* as the scientific format, except
>> that the mantissa is expressed as a hexadecimal instead of a decimal
>> number. The advantage is that if FLT_RADIX is a power of 2, you can do
>> exact I/O (both exact output *and* exact input!) using a finite number
>> of digits. If FLT_RADIX is not a power of 2 it's no better and no
>> worse than the "regular" scientific format (unless FLT_RADIX is either
>> 5 or a multiple of 10... which I believe is rather rare in
>> practice).
>
>
>
> Alberto,
>
> Calling hex fp format "portable" is definitely stretching it.  If on two
> systems 'float', say, have different representations, the "same" number
> will be output in a different way.

Memory storage format definitely isn't portable, but I was under the
impression we were talking about the printing format, which can be
portable. For instance, in C99, which already has support for a
similar feature, I can write

    double x  = 0xd.eadbeef0123p-122;
    printf("%.11a", x);

For any conforming implementation of C99 where FLT_RADIX is a power of
2, that code must print out

    0xd.eadbeef0123p-122

regardless of how doubles are represented internally on that machine.


Author: AlbertoBarbati@libero.it (Alberto Barbati)
Date: Wed, 27 Oct 2004 02:32:45 GMT
Victor Bazarov wrote:
> Alberto Barbati wrote:
>
>> Paul A Bristow wrote:
>>
>>> Some of the discussion veered off onto the method of inputting in
>>> hex, but I feel that it is another issue (and is, of course, tiresome
>>> to make portable because it is dependent on the particular floating
>>> point format, as is using exactly representable decimal digit
>>> strings).
>>
>> You don't seem to have understood what the hex fp format is. The hex
>> fp format *does not* depend on the implementation of floating point
>> and is supposed to be as portable as the "regular" scientific format.
>> As I wrote in my post, the hex fp format is a string like
>> [−]0xh.hhhhp±d where "h.hhhh" is a hexadecimal number representing the
>> mantissa and "d" is a decimal number representing the exponent. It's
>> *exactly the same* as the scientific format, except that the mantissa
>> is expressed as a hexadecimal instead of a decimal number. The
>> advantage is that if FLT_RADIX is a power of 2, you can do exact I/O
>> (both exact output *and* exact input!) using a finite number of
>> digits. If FLT_RADIX is not a power of 2 it's no better and no worse
>> than the "regular" scientific format (unless FLT_RADIX is either 5 or
>> a multiple of 10... which I believe is rather rare in practice).
>
>
> Alberto,
>
> Calling hex fp format "portable" is definitely stretching it.  If on two
> systems 'float', say, has different representations, the "same" number
> will be output in a different way.

Take a real number x (make it strictly positive, for simplicity). Then
there exist two unique numbers, a positive real m < 10 and an integer e
(called mantissa and exponent) such that:

   x = m * 10 ^ e

The "regular" decimal scientific representation is defined by decimal
representations of m and e. For a normalized fp number, you'll agree
that this representation (given sufficient precision) is not dependent
on the internal representation.

Now, with the same argument, you can find two unique numbers, a real
number 8 <= m' < 16 and an integer e' such that

   x = m' * 2 ^ e'

The hexadecimal fp format is defined in terms of the hexadecimal
representation of m' and the decimal representation of e'. By
"hexadecimal representation of m'" I mean the output of this algorithm:

   double integral, fractional;
   std::cout << std::hex;
   fractional = std::modf(m, &integral);
   std::cout << static_cast<int>(integral) << '.';
   m = fractional * 16.0;
   for(; m > 0.0; m = fractional * 16.0)
   {
     fractional = std::modf(m, &integral);
     assert(integral < 16.0);
     std::cout << static_cast<int>(integral); // one single hex digit
   }

Please show me how this representation (which is essentially identical
to the "regular" scientific one) depends on the internal representation
of x, as I fail to see it.

Even in corner cases (such as non-normalized numbers, for which
certain details of the representation are unspecified) the format can
always be interpreted as a valid fp number. You just need to read a
hexadecimal number (with fractional part) and a decimal number, then
multiply the former by 2 raised to the latter.

It's true that on some fp implementations with FLT_RADIX == 2, m' and e'
can be directly extracted from the internal representation without any
computation and that's a bonus, but the format is not defined in such
terms, in order to achieve portability.

Alberto


Author: v.Abazarov@comAcast.net (Victor Bazarov)
Date: Wed, 27 Oct 2004 02:33:05 GMT
James Kuyper wrote:
> [...]
> Memory storage format definitely isn't portable, but I was under the
> impression we were talking about the printing format, which can be
> portable. For instance, in C99, which already has support for a
> similar feature, I can write
>
>     double x  = 0xd.eadbeef0123p-122;
>     printf("%.11a", x);
>
> For any conforming implementation of C99 where FLT_RADIX is a power of
> 2,

... and where the mantissa is long enough to accommodate 12 hex digits...

(The C99 standard only requires 10 decimal digits in a double or long double.)

> that code must print out
>
>     0xd.eadbeef0123p-122
>
> regardless of how doubles are represented internally on that machine.

I don't dispute that.  I thought that there is no specific requirement
in the Standard that FLT_RADIX is a power of 2.  If I am mistaken, please
slap me in the face with the section of the Standard.  If no such
section exists, then I'd appreciate an elaboration on "can be
portable" WRT the printing format.  If it can be portable only
_conditionally_, then it's not portable.  If it can be portable in
general, why isn't it?  (Of course the last question assumes that it
isn't yet.)

Victor


Author: Michael.Karcher@writeme.com (Michael Karcher)
Date: Wed, 27 Oct 2004 02:32:57 GMT
Victor Bazarov <v.Abazarov@comacast.net> wrote:
> Calling hex fp format "portable" is definitely stretching it.  If on two
> systems 'float', say, have different representations, the "same" number
> will be output in a different way.

No, it will not. This is the key point of the hex notation. As long as a
binary representation is used (IEEE or not), numbers with exactly the same
value will be output exactly the same way. For example:
1.5 = 1.1(binary) = 1100(binary)*2^{-3} = 0xCp-3
17.25 = 10001.01(binary) = 1000.1010(binary)*2^1 = 0x8.Ap+1

Michael Karcher


Author: v.Abazarov@comAcast.net ("Victor Bazarov")
Date: Wed, 27 Oct 2004 05:38:48 GMT
"Michael Karcher" <Michael.Karcher@writeme.com> wrote...
> Victor Bazarov <v.Abazarov@comacast.net> wrote:
>> Calling hex fp format "portable" is definitely stretching it.  If on two
>> systems 'float', say, have different representations, the "same" number
>> will be output in a different way.
>
> No, it will not. This is the key point of the hex notation. As long as a
> binary representation is used (IEEE or not), numbers with exactly the same
> value will be output exactly the same way. For example:
> 1.5 = 1.1(binary) = 1100(binary)*2^{-3} = 0xCp-3
> 17.25 = 10001.01(binary) = 1000.1010(binary)*2^1 = 0x8.Ap+1

Try looking beyond simple examples.  What if one representation is longer
than the other?  Output the longer representation using the hex conversion,
then input it into a shorter one.  Then output again.

V


Author: v.Abazarov@comAcast.net ("Victor Bazarov")
Date: Wed, 27 Oct 2004 05:39:06 GMT
"Alberto Barbati" <AlbertoBarbati@libero.it> wrote...
> Take a real number x (make it strictly positive, for simplicity). Then
> there exist two unique numbers, a positive real m < 10 and an integer e
> (called mantissa and exponent) such that:
>
>    x = m * 10 ^ e
>
> The "regular" decimal scientific representation is defined by decimal
> representations of m and e. For a normalized fp number, you'll agree
> that this representation (given sufficient precision) is not dependent
> on the internal representation.
>
> Now, with the same argument, you can find two unique numbers, a real
> number 8 <= m' < 16 and an integer e' such that
>
>    x = m' * 2 ^ e'
>
> The hexadecimal fp format is defined in terms of the hexadecimal
> representation of m' and the decimal representation of e'. By
> "hexadecimal representation of m'" I mean the output of this algorithm:
>
>    double integral, fractional;
>    std::cout << std::hex;
>    fractional = std::modf(m, &integral);
>    std::cout << static_cast<int>(integral) << '.';
>    m = fractional * 16.0;
>    for(; m > 0.0; m = fractional * 16.0)
>    {
>      fractional = std::modf(m, &integral);
>      assert(integral < 16.0);
>      std::cout << static_cast<int>(integral); // one single hex digit
>    }
>
> Please show me how this representation (which is essentially identical
> to the "regular" scientific one) depends on the internal representation
> of x, as I fail to see it.

It depends on the _length_ of it.  Let's say I want to represent 1.10 (dec)
in two different systems with binary internal representation.  One, which
has, say, 17 binary digits for the mantissa, gives me 1.0001100110011001p0
(binary), or 1.1999p0 (hex), and I output it.  It yields, for the sake of
argument, "1.1999p0".  The other binary form has fewer digits.  Let's say
it has 10.  I try inputting "1.1999p0" and get 1.000110011p0 (binary) as
the internal representation.  Now I output it again.  I get "1.198p0".  Is
hex format portable?  No.  No more than any other floating point format.

The portability of the hex form is only slightly better than decimal, if
at all.

Do you see a flaw in my explanation somewhere?

V


Author: kanze@gabi-soft.fr
Date: Wed, 27 Oct 2004 19:59:59 GMT
eric_backus@alum.mit.edu ("Eric Backus") wrote in message
news:<1098664795.487306@cswreg.cos.agilent.com>...
> ""Andrew Koenig"" <ark@acm.org> wrote in message
> news:o6ved.751242$Gx4.433382@bgtnsc04-news.ops.worldnet.att.net...
> > <kanze@gabi-soft.fr> wrote in message
> > news:d6652001.0410220022.100e7a2b@posting.google.com...
> >> You can also do that with decimal; 2 is a factor of 10.  The
> >> difference is that with hex format, the conversion is a lot
> >> simpler, with less risk of rounding errors in the conversion
> >> routines.  And with fewer total digits.

> > Ummm...not quite.

> > It is true that every binary floating-point number has an exact
> > decimal representation.  However, it is not true that every decimal
> > floating-point number has an exact binary representation.

> > More generally, if you wish to read a decimal representation of a
> > floating-point number and compute the nearest binary floating-point
> > value to what you read, you need unbounded-precision arithmetic to
> > do it correctly in all cases.

> This doesn't sound right to me.  Let's take IEEE double-precision as
> an example.  I was under the impression that 17 significant decimal
> digits is enough to uniquely represent any IEEE double-precision
> number, and that it would allow for a "round trip" without any
> rounding errors.  Where does the unbounded-precision arithmetic come
> in?

I don't think Andy was saying that the round trip conversion wasn't
possible.  The problem is obtaining the closest binary floating point
representation for an arbitrary decimal number.  (My intent was only to
make a claim concerning round trip conversions, but somewhere along the
line, I think I claimed more than I meant to.)

--
James Kanze           GABI Software         http://www.gabi-soft.fr
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


Author: kanze@gabi-soft.fr
Date: Thu, 28 Oct 2004 11:12:47 GMT
ark@acm.org ("Andrew Koenig") wrote in message
news:<o6ved.751242$Gx4.433382@bgtnsc04-news.ops.worldnet.att.net>...
> <kanze@gabi-soft.fr> wrote in message
> news:d6652001.0410220022.100e7a2b@posting.google.com...
> > AlbertoBarbati@libero.it (Alberto Barbati) wrote in message
> > news:<gXtdd.49586$N45.1462626@twister2.libero.it>...

> >> The advantage is that if FLT_RADIX is a power of 2, you can do
> >> exact I/O (both exact output *and* exact input!) using a finite
> >> number of digits.

> > You can also do that with decimal; 2 is a factor of 10.  The
> > difference is that with hex format, the conversion is a lot simpler,
> > with less risk of rounding errors in the conversion routines.  And
> > with fewer total digits.

> Ummm...not quite.

> It is true that every binary floating-point number has an exact
> decimal representation.  However, it is not true that every decimal
> floating-point number has an exact binary representation.

Yes.  I was thinking in terms of the original posting, where it was a
question of round-trip.  It's possible to output any floating point
number exactly in decimal, and of course, that set of decimal numbers
has an exact binary representation -- with only the number of bits
available.

From a quality of implementation point of view: if the decimal number
has an exact representation in the target format, it would be a mighty
poor implementation which converted it to anything else.

> More generally, if you wish to read a decimal representation of a
> floating-point number and compute the nearest binary floating-point
> value to what you read, you need unbounded-precision arithmetic to do
> it correctly in all cases.  If, on the other hand, the representation
> is binary or hex, you don't.

Unbounded, in the sense that you might need an infinite number of
digits, or unbounded only in the sense that the bound depends on the
number of digits in the decimal representation (which is itself
unbounded)?

Doing accurate conversions does require some sort of extended precision
arithmetic.  (I believe that this fact is established, and well known.)
If there is an upper bound based on the number of digits, I'm willing to
accept that an implementation throws bad_alloc when passed a decimal
representation with thousands of digits.  If there is really no upper
bound... I don't think I'd like it if the implementation threw bad_alloc
trying to convert 1.3, supposing that there was a reasonable amount of
free space available.

(But it is all a QoI issue, of course.  I think that an implementation
like:

    template< typename CharT, class InpIter >
    InpIter
    num_get< CharT, InpIter >::do_get(
        InpIter, InpIter, ios_base&, ios_base::iostate&, float& ) const
    {
        throw bad_alloc() ;
    }

is strictly conforming, although that's one compiler I wouldn't buy.)

> It is true that you might need to examine all of the input bits to
> determine the correct rounding of the result, but you don't need
> unbounded-precision arithmetic to do that -- instead, you can (if
> needed) get into a mode where you scan bits looking for a 1 or a zero
> (depending on rounding mode) and stop when you find it.

Aren't some similar tricks possible for decimal arithmetic?  I've seen
algorithms which claim to always convert to the closest bit, and which,
while using extended precision, didn't use unbounded precision.  See
e.g. "How to read floating point numbers accurately", by William
D. Clinger, ACM SIGPLAN, June 1990.  I'll admit that I've not yet found
the time to read and to study the article, but the abstract claims "The
author presents an efficient algorithm that always finds the best
approximation. The algorithm uses a few extra bits of precision to
compute an IEEE-conforming approximation while testing an intermediate
result to determine whether the approximation could be other than the
best. If the approximation might not be the best, then the best
approximation is determined by a few simple operations on
multiple-precision integers, where the precision is determined by the
input." This would seem sufficient, supposing that the author is right.

--
James Kanze           GABI Software         http://www.gabi-soft.fr
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: kuyper@wizard.net (James Kuyper)
Date: Thu, 28 Oct 2004 16:31:22 GMT
Raw View
v.Abazarov@comAcast.net (Victor Bazarov) wrote in message news:<HDzfd.6973$Ae.6503@newsread1.dllstx09.us.to.verio.net>...
..
> I don't dispute that.  I thought that there is no specific requirement
> in the Standard that FLT_RADIX is a power of 2.  If I am mistaken, please

There isn't; the assertion you were objecting to was predicated on "if
FLT_RADIX is a power of 2 ...".  You can object that this premise is
not universally true, and you'd be right, though the exceptions are
quite rare nowadays. But you can't use that as an argument against the
conclusion that was based upon that premise.

> portable" WRT printing format.  If it can be portable _conditionally_,
> then it's not portable.  If it can be portable in general, why isn't it?

That renders the word "portable" useless, because there's absolutely
no code that is unconditionally portable. The most that anyone can say
is "This code is portable to all systems that satisfy condition X".
"X" might be "systems that implement C++", but that means the code
isn't necessarily portable to C compilers. "X" could be "systems that
implement C, Fortran, Ada, or Pascal" (and yes, I have seen a program
that was portable to all of those different languages simultaneously -
I was quite impressed by the useless cleverness needed to achieve that
result). However, that still means it isn't necessarily portable to
systems which implement "Perl".

It's perfectly meaningful to say, for instance, that a given piece of
code is "portable to all systems that have a M$VC compiler". In fact,
that's a pretty important portability requirement, in some contexts.






Author: AlbertoBarbati@libero.it (Alberto Barbati)
Date: Sun, 17 Oct 2004 06:44:53 GMT
Raw View
Victor Bazarov wrote:
> Alberto Barbati wrote:
>>
>> It's a mathematical fact that not all floating point numbers
>> represented in a binary base can also be represented *exactly* in a
>> decimal base, no matter how many digits you use.
>
> I think it's the reverse that is true.  You cannot, for example, represent
> 0.1 (one tenth) precisely in binary because it becomes periodic:
>
> Binary:  0.00011001100110011001(1001)
> Hex:     0.19999999999999999999(9)
>
> However, any binary fraction can be represented in decimal notation
> exactly.  You just need a long enough (yet finite) set of decimal digits.
> The difference is that the base (1/2) for binary fractions is represented
> exactly in the decimal, but the reverse isn't so.

You're completely right. I apologize, I got it the wrong way round.

>> However, it seems that the problem has been acknowledged by the C
>> community, which has introduced a printf/scanf formatter ("%a" IIRC) to
>> output floating point numbers in a hexadecimal format that would
>> represent compactly and exactly any floating point number (assuming
>> that the underlying representation is binary).
>
> It seems that some folks have been doing something similar for ages.  It's
> basically a text form of the internal binary representation whether you
> use %a or just output the underlying chars as %02x.

Not exactly. Outputting the underlying chars would be unportable, as
the internal binary representation may differ among platforms. According
to the C standard, %a shall produce a text string formatted like
[−]0xh.hhhhp±d, where "h.hhhh" is a hexadecimal number representing the
mantissa, "d" is the (decimal) exponent, and "0x" and "p" are taken
literally.

> Under more portable I mean that you still are going to lose
> some precision if the platform that reads the external representation has
> fewer bits representing 'double' than the platform where it was written.

You're right about that. Yet, it seems that the hex format can be
useful, at least in the C committee's opinion.

>> It is reasonable to assume that in a few years all C/C++ compilers
>> with a conformant C9X library will have such a feature, so it may be
>> available to C++ programs also. I believe that the C++ community
>> should do its part and extend the iostream formatters to allow the
>> hexadecimal format. That could be quite easily obtained by adding a
>> new ios_base flag in the floatfield mask and requiring num_get/num_put
>> to behave accordingly. I cannot see a complex design issue to discuss,
>> so if it's not too late for that, I wish it could be considered for
>> inclusion in TR1.
>
> I think it's an implementation issue.  If you use the 'hex' modifier to
> output a floating point number, should it just give the analogous format
> to C99's %a?  The implementers might begin thinking about that already.

We can't "overload" the std::hex modifier, because that would silently
change the behaviour of a program that mixes integer and float formatting.
For example:

   int main()
   {
     std::cout << std::hex << 10000 << 10000.0 << "\n";
   }

There's more. On the input side, C99 changed the scanf "%g" specifier to
allow parsing of the hex fp format, and C++ istream parsing for floats is
defined in terms of said specifier (22.2.2.1.2/5). Currently, C++ only
includes the C95 library, but if it ever "upgrades" to C99 then C++
will get to parse the hex fp format too!

I think the committee should look into this and either clarify that the
hex fp format is disallowed on input or standardize the way such format
is produced on output. I have already located the changes that should be
made to introduce a new ios_base flag (a total of five places). Should I
make a formal proposal?

Alberto






Author: pjp@dinkumware.com ("P.J. Plauger")
Date: Sun, 17 Oct 2004 20:43:35 GMT
Raw View
"Alberto Barbati" <AlbertoBarbati@libero.it> wrote in message
news:%0kcd.44059$N45.1310622@twister2.libero.it...

>> It is reasonable to assume that in a few years all C/C++ compilers with a
>> conformant C9X library will have such feature, so it may be available to
>> C++ programs also. I believe that the C++ community should do its part
>> and extend the iostream formatters to allow the hexadecimal format. That
>> could be quite easily obtained by adding a new ios_base flag in the
>> floatfield mask and require num_get/num_put to behave accordingly. I
>> cannot see complex design issue to discuss, so if it's not too late for
>> that, I wish it could be considered for inclusion in TR1.
>
> I think it's an implementation issue.  If you use the 'hex' modifier to output
> a floating point number, should it just give the analogous format to C99's
> %a?  The implementers might begin thinking about that already.

We can't "overload" the std::hex modifier, because that would silently
change the behaviour of a program that mixes integer and float formatting.
For example:

   int main()
   {
     std::cout << std::hex << 10000 << 10000.0 << "\n";
   }

There's more. On the input side, C99 changed the scanf "%g" specifier to
allow parsing of the hex fp format and C++ istream parsing for floats is
defined in terms of said specifier (22.2.2.1.2/5). Currently, C++ only
includes the C95 library, but if it ever "upgrades" to C99 then C++
will get to parse the hex fp format too!

I think the committee should look into this and either clarify that the
hex fp format is disallowed on input or standardize the way such format
is produced on output. I have already located the changes that should be
made to introduce a new ios_base flag (a total of five places). Should I
make a formal proposal?

[pjp] You can if you want, but the (non-normative) library TR1 already
incorporates such a mechanism. It is based on the work we at
Dinkumware did several years ago in integrating C99 and C++.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com






Author: kanze@gabi-soft.fr
Date: 18 Oct 2004 16:40:01 GMT
Raw View
richtesi@informatik.tu-muenchen.de (Simon Richter) wrote in message
news:<ckod3g$p9d$02$1@news.t-online.com>...

> > My expectation is that f == rf for ALL possible float values

> I believe that operator==(float, float) and operator==(double, double)
> should be dropped from the standard, as there is no way this can be
> implemented correctly on all architectures and it places a heavy
> burden on implementors.

I can't speak for all architectures, but it is trivial to implement on
the two I know well, Intel IA32 and Sparc.  There's no more burden on
implementors here than for operator==( int, int ).

There is, of course, more burden on users to use them correctly.
Correct floating point is a lot trickier than correct integral
arithmetic.

> Seriously though, you can only check that the result you get is within
> a substantially small distance from the expected result, as for
> example the Intel architecture has longer floating point registers
> than the actual memory layout, so a value cached on the FP stack may
> already differ in the lower order bits when compared to a value from
> memory (where these bits are read as zero).

The original poster spoke of values actually assigned to float
variables.  The number of bits in a float is fixed.  Given everything
that has happened since the assignment to the first float, and the fact
that the second float is assigned through a non-const reference in a
library routine, I would be very surprised if either of the values
represented the results of a calculation which didn't get stored back in
memory.  It's a distant possibility, but when he speaks of the LSB being
off in 1/3 of the values, it's obvious that this cannot be his problem.

For the rest, with sufficient digits (and nine is sufficient for IEEE
single precision floating point), the result of the conversion, in both
directions, is exactly defined, AND the conversion should round-trip.

--
James Kanze           GABI Software         http://www.gabi-soft.fr
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34






Author: kanze@gabi-soft.fr
Date: Mon, 18 Oct 2004 17:25:14 CST
Raw View
v.Abazarov@comAcast.net (Victor Bazarov) wrote in message
news:<vPQbd.4940$Ae.4915@newsread1.dllstx09.us.to.verio.net>...
> Paul A Bristow wrote:
> > Whilst devising tests for the Boost lexical_cast function, I have
> > encountered (for one compiler version) some surprising (to me)
> > results outputting floats to decimal digit strings and reading them
> > back in.

> > Providing you use enough decimal digits, I expected the result of
> > this 'loopback' to be the same, for example:

> >   float f = any_float_value; // Similarly for other types, UDTs even?
> >   std::stringstream s;
> >   s.precision(float_significant_decimal_digits); // 9 decimal digits should
> > be enough (see Appendix below).
> >   s << f; // Output to decimal digit string(stream).
> >   float rf;
> >   s >> rf; // read string back into float.

> > My expectation is that f == rf for ALL possible float values (and
> > indeed this WAS true for an exhaustive test with a previous version
> > of a well-known compiler, and for a randomish sample of double and
> > long double values).

> > A recent version outputs the same decimal digit strings

> > - BUT the value read back in is 1 least significant binary digit
> >   different - suspiciously only for 1/3 of the float values.

> > (Nor does increasing the number of decimal digits output via
> > s.precision() change this).

> > But perhaps this is not a Standard expectation?

> Standard says nothing about such behaviour.  AFAIK the only
> expectation one might have is that a number output with certain
> precision, then read back, then this new number output with the same
> precision should produce the same output.

> IMO any other expectation is unreasonable.

The standard effectively sets no requirements concerning the precision
of the conversions on input or output.  Quality of implementation does,
however, as does IEEE (IIRC).  If the floating point format on the
machine he is using is IEEE, I would consider the behavior he describes
defective, if only from a quality of implementation point of view.

--
James Kanze           GABI Software         http://www.gabi-soft.fr
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34






Author: AlbertoBarbati@libero.it (Alberto Barbati)
Date: Tue, 19 Oct 2004 06:25:53 GMT
Raw View
P.J. Plauger wrote:
>
> [pjp] You can if you want, but the (non-normative) library TR1 already
> incorporates such a mechanism. It is based on the work we at
> Dinkumware did several years ago in integrating C99 and C++.
>

I see. I guess you refer to paper N1568. The paper chooses the value
ios_base::fixed | ios_base::scientific to require the hexadecimal format.

Sure, that approach requires the fewest changes and gets the job done,
but it looks a bit unnatural to me because %a is not a fixed format. In
fact the C standard says "if the precision is missing then the precision
is sufficient for an exact representation of the value", unlike the %f
specifier, which has a default of exactly 6 digits.

Moreover, the value ios_base::fixed | ios_base::scientific, although
very rarely used, fell back in pre-TR1 libraries to %g formatting, and
that raises a couple of compatibility issues:

1) A pre-TR1 program built with a TR1 library may experience a silent
behaviour change;

2) A TR1 program that relies on hexadecimal fp formatting will silently
build even with a pre-TR1 library, with a behaviour change.

My proposal is to add a new constant. Although more intrusive, the
modifications are still very few and none of the above objections would
hold. Here they are:

Clause 22.2.2.2.2

In table 58, after the line:

   floatfield == ios_base::scientific %E

add the lines:

   floatfield == ios_base::hexfloat && !uppercase %a
   floatfield == ios_base::hexfloat %A


Clause 27.4

In the <ios> header synopsis, after the line:

   ios_base& scientific (ios_base& str);

add the line:

   ios_base& hexfloat (ios_base& str);


Clause 27.4.2

In the ios base class synopsis, after the line:

   static const fmtflags scientific;

add the line:

   static const fmtflags hexfloat;


Clause 27.4.2.1.2

In table 83 (fmtflags effect) add the line:

   hexfloat generates floating-point output in hexadecimal notation;

In table 84 (fmtflags constants) replace the line:

   floatfield    scientific | fixed

with the line

   floatfield    scientific | fixed | hexfloat


Clause 27.4.5.4

Add the following paragraphs:

     ios_base& hexfloat(ios_base& str);

   5 Effects: Calls str.setf(ios_base::hexfloat, ios_base::floatfield).
   6 Returns: str.


The choice of the name "hexfloat" is quite arbitrary and is subject to
discussion.

Alberto






Author: pbristow@hetp.u-net.com ("Paul A Bristow")
Date: Wed, 20 Oct 2004 12:29:39 GMT
Raw View
Thanks to all who replied to my query.

Since the values output are the same as in a previously successful loopback
test, I think it is reasonable to assume that the problem is not with
output, but with INPUT.

Some of the discussion veered off onto the method of inputting in hex,
but I feel that is another issue (and it is, of course, tiresome to make
portable because it is dependent on the particular floating point format,
as is using exactly representable decimal digit strings).

The results are the same using an even simpler test

float f = 3.14590000f; // 9 decimal digits
cout.precision(9); // 9 significant decimal digits, as above
cout << f;
float rf;
cin >> rf;
cout << rf;
assert(f == rf);

And results are:
3.14590001

3.14590025

Assertion failed: f == rf, file .\Hello.cpp, line 50

These are the decimal digit string representations of binary values which
differ in the least significant significand bit.

f = 3.14590014F is the next to succeed.

Significantly, using "exactly representable" values, for example for pi

const float pi_l = 3.1415925025939941406250F;  // Upper and lower limits of pi using IEEE 24-bit significand float.

succeeds with 9 digits 3.14159250

but fails for the upper limit (1 bit more) 3.14159274

 const float pi_u = 3.1415927410125732421875F;

I conclude that whilst the Standard is not quite explicit about an
accuracy requirement, this counts as a bug rather than a feature.

Although apparently quite minor, I fear it may cause considerable confusion
from small differences in some computed results.

Thanks

Paul


""Paul A Bristow"" <pbristow@hetp.u-net.com> wrote in message
news:1097747958.7681.0@nnrp-t71-03.news.uk.clara.net...
> Whilst devising tests for the Boost lexical_cast function,
> I have encountered (for one compiler version) some surprising (to me)
> results outputting
> floats to decimal digit strings and reading them back in.
>
> Providing you use enough decimal digits, I expected the result of
> this 'loopback' to be the same, for example:
>
>   float f = any_float_value; // Similarly for other types, UDTs even?
>   std::stringstream s;
>   s.precision(float_significant_decimal_digits); // 9 decimal digits should
> be enough (see Appendix below).
>   s << f; // Output to decimal digit string(stream).
>   float rf;
>   s >> rf; // read string back into float.
>
> My expectation is that f == rf for ALL possible float values
> (and indeed this WAS true for an exhaustive test with a previous version of
> a well-known compiler,
> and for a randomish sample of double and long double values).
>
> A recent version outputs the same decimal digit strings
>
> - BUT the value read back in is 1 least significant binary digit different -
>
> suspiciously only for 1/3 of the float values.
>
> (Nor does increasing the number of decimal digits output via s.precision()
> change this).
>
> But perhaps this is not a Standard expectation?
>
> Paul Bristow







Author: AlbertoBarbati@libero.it (Alberto Barbati)
Date: Thu, 21 Oct 2004 07:07:38 GMT
Raw View
Paul A Bristow wrote:
>
> Some of the discussion veered off onto the method of inputting in hex, but I
> feel that is another issue (and is, of course, tiresome to make portable
> because it is dependent on the particular floating point format, as is using
> exactly representable decimal digit strings).
>

You don't seem to have understood what the hex fp format is. The hex fp
format *does not* depend on the implementation of floating point and is
supposed to be as portable as the "regular" scientific format. As I
wrote in my post, the hex fp format is a string like [−]0xh.hhhhp±d
where "h.hhhh" is a hexadecimal number representing the mantissa and "d"
is a decimal number representing the exponent. It's *exactly the same*
as the scientific format, except that the mantissa is expressed as a
hexadecimal instead of a decimal number. The advantage is that if
FLT_RADIX is a power of 2, you can do exact I/O (both exact output *and*
exact input!) using a finite number of digits. If FLT_RADIX is not a
power of 2 it's no better and no worse than the "regular" scientific
format (unless FLT_RADIX is either 5 or a multiple of 10... which I
believe is rather rare in practice).

Alberto






Author: pbristow@hetp.u-net.com ("Paul A Bristow")
Date: Fri, 15 Oct 2004 04:34:24 GMT
Raw View
Whilst devising tests for the Boost lexical_cast function, I have
encountered (for one compiler version) some surprising (to me) results
outputting floats to decimal digit strings and reading them back in.

Providing you use enough decimal digits, I expected the result of
this 'loopback' to be the same, for example:

  float f = any_float_value; // Similarly for other types, UDTs even?
  std::stringstream s;
  s.precision(float_significant_decimal_digits); // 9 decimal digits should be enough (see Appendix below).
  s << f; // Output to decimal digit string(stream).
  float rf;
  s >> rf; // read string back into float.

My expectation is that f == rf for ALL possible float values (and indeed
this WAS true for an exhaustive test with a previous version of a
well-known compiler, and for a randomish sample of double and long
double values).

A recent version outputs the same decimal digit strings

- BUT the value read back in is 1 least significant binary digit different -

suspiciously only for 1/3 of the float values.

(Nor does increasing the number of decimal digits output via s.precision()
change this).

But perhaps this is not a Standard expectation?

Paul Bristow



Appendix

For float, the number of significant binary digits is

  int float_significand_digits = std::numeric_limits<float>::digits;
  // FLT_MANT_DIG == 24 for 32-bit FP

the number of _guaranteed_ accurate decimal digits is given by

  int float_guaranteed_decimal_digits = std::numeric_limits<float>::digits10;

and is 6 for the MSVC 32-bit floating point format.

The maximum number of digits that _can_ be significant is given by the formula

  float const log10Two = 0.30102999566398119521373889472449F; // log10(2.)
  int float_significant_digits = int(ceil(1 + float_significand_digits * log10Two));

  // Note that a C++ compiler will NOT evaluate log10(2.) at compile time,
  // nor a floating point division, but it WILL perform an integer division,
  // so 301/1000 can be used as an approximation.
  // 3010/10000 is the nearest approximation using short int (10000 < max of 32767).

but this is conveniently numerically equivalent to

  int const float_significant_decimal_digits = 2 + std::numeric_limits<float>::digits * 3010/10000;

which CAN be calculated at compile time, and is 9 decimal digits for the
IEEE 32-bit floating point format.

To demonstrate, the following test asserts:

#include <cassert>
#include <limits>
#include <sstream>

int main()
{
  int const float_significant_decimal_digits = 2 +
    std::numeric_limits<float>::digits * 3010/10000; // == 9

  float f = 3.1459F; // a test value - on a hemidemisemi-random test 1/3 fail.
  float rf;          // for the recalculated value.
  std::stringstream s;
  s.precision(float_significant_decimal_digits); // 9 decimal digits is enough.

  s << f;  // Output to string.
  s >> rf; // Read back in to float.

  assert(f == rf); // Check we get back the same value.

  return 0;
} // int main()


--
Paul A Bristow
Prizet Farmhouse, Kendal LA8 8AB   UK
pbristow@hetp.u-net.com







Author: richtesi@informatik.tu-muenchen.de (Simon Richter)
Date: Fri, 15 Oct 2004 17:26:09 GMT
Raw View
Hi,

> My expectation is that f == rf for ALL possible float values

I believe that operator==(float, float) and operator==(double, double)
should be dropped from the standard, as there is no way this can be
implemented correctly on all architectures and it places a heavy burden
on implementors.

Seriously though, you can only check that the result you get is within a
substantially small distance from the expected result, as for example
the Intel architecture has longer floating point registers than the
actual memory layout, so a value cached on the FP stack may already
differ in the lower order bits when compared to a value from memory
(where these bits are read as zero).

    Simon






Author: AlbertoBarbati@libero.it (Alberto Barbati)
Date: Fri, 15 Oct 2004 17:26:24 GMT
Raw View
Paul A Bristow wrote:
>
> But perhaps this is not a Standard expectation?
>

It's a mathematical fact that not all floating point numbers represented
in a binary base can also be represented *exactly* in a decimal base, no
matter how many digits you use. So the standard does not (because it
can't) have this expectation.

However, it seems that the problem has been acknowledged by the C
community, which has introduced a printf/scanf formatter ("%a" IIRC) to
output floating point numbers in a hexadecimal format that would
represent compactly and exactly any floating point number (assuming that
the underlying representation is binary).

It is reasonable to assume that in a few years all C/C++ compilers with
a conformant C9X library will have such a feature, so it may be available
to C++ programs also. I believe that the C++ community should do its
part and extend the iostream formatters to allow the hexadecimal format.
That could be quite easily obtained by adding a new ios_base flag in the
floatfield mask and requiring num_get/num_put to behave accordingly. I
cannot see a complex design issue to discuss, so if it's not too late for
that, I wish it could be considered for inclusion in TR1.

Alberto






Author: v.Abazarov@comAcast.net (Victor Bazarov)
Date: Fri, 15 Oct 2004 15:06:29 GMT
Raw View
Paul A Bristow wrote:
> Whilst devising tests for the Boost lexical_cast function,
> I have encountered (for one compiler version) some surprising (to me)
> results outputting
> floats to decimal digit strings and reading them back in.
>
> Providing you use enough decimal digits, I expected the result of
> this 'loopback' to be the same, for example:
>
>   float f = any_float_value; // Similarly for other types, UDTs even?
>   std::stringstream s;
>   s.precision(float_significant_decimal_digits); // 9 decimal digits should
> be enough (see Appendix below).
>   s << f; // Output to decimal digit string(stream).
>   float rf;
>   s >> rf; // read string back into float.
>
> My expectation is that f == rf for ALL possible float values
> (and indeed this WAS true for an exhaustive test with a previous version of
> a well-known compiler,
> and for a randomish sample of double and long double values).
>
> A recent version outputs the same decimal digit strings
>
> - BUT the value read back in is 1 least significant binary digit different -
>
> suspiciously only for 1/3 of the float values.
>
> (Nor does increasing the number of decimal digits output via s.precision()
> change this).
>
> But perhaps this is not a Standard expectation?

The Standard says nothing about such behaviour.  AFAIK the only
expectation one might have is that a number output with a certain
precision, then read back and output again with the same precision,
should produce the same output.

IMO any other expectation is unreasonable.

V






Author: v.Abazarov@comAcast.net (Victor Bazarov)
Date: Sat, 16 Oct 2004 07:12:14 GMT
Raw View
Alberto Barbati wrote:
> Paul A Bristow wrote:
>
>>
>> But perhaps this is not a Standard expectation?
>>
>
> It's a mathematical fact that not all floating point numbers represented
> in a binary base can also be represented *exactly* in a decimal base, no
> matter how many digits you use.

I think it's the reverse that is true.  You cannot, for example, represent
0.1 (one tenth) precisely in binary because it becomes periodic:

Binary:  0.00011001100110011001(1001)
Hex:     0.19999999999999999999(9)

However, any binary fraction can be represented exactly in decimal
notation.  You just need a long enough (yet finite) string of decimal
digits.  The asymmetry is that 1/2, the base of binary fractions, has a
finite decimal expansion, whereas 1/10 has no finite binary one.

> So the standard does not (because it
> can't) have this expectation.

No.  It's that demanding an exact decimal rendition of what sits in the
computer's memory is unreasonable, combined with the fact that what sits
there is usually not the exact intended number anyway.  So why insist on
outputting the full set of decimal digits when the number is imprecise to
begin with?

FLT_EPSILON, for example, is 0.00000011920928955078125, i.e. 2^-23.
Precisely.

> However, it seems that the problem has been acknowledged by the C
> community, which has introduced a printf/scanf formatter ("%a" IIRC) to
> output floating point numbers in a hexadecimal format that represents
> compactly and exactly any floating point number (assuming that the
> underlying representation is binary).

It seems that some folks have been doing something similar for ages.  It's
basically a text form of the internal binary representation, whether you use
%a or just dump the underlying bytes with %02x.  Yes, %a is more readable
and portable.  By "more portable" I mean that you are still going to lose
some precision if the platform that reads the external representation has
fewer bits in its 'double' than the platform where it was written.

> It is reasonable to assume that in a few years all C/C++ compilers with
> a conformant C9X library will have such a feature, so it may be available
> to C++ programs also. I believe that the C++ community should do its
> part and extend the iostream formatters to allow the hexadecimal format.
> That could be quite easily obtained by adding a new ios_base flag in the
> floatfield mask and requiring num_get/num_put to behave accordingly. I
> cannot see any complex design issues to discuss, so if it's not too late
> for that, I wish it could be considered for inclusion in TR1.

I think it's an implementation issue.  If you use the 'hex' modifier to
output a floating point number, should it just produce a format analogous
to C99's %a?  Implementers might start thinking about that already.

V
