Topic: Problem reading floats in C++


Author: clamage@Eng.Sun.COM (Steve Clamage)
Date: 1998/07/25
>I ran into an annoying problem with the C++ compiler we are using.
> ...
>I spoke with some other people I work with about this "feature" of
>our local compiler.  One of them had run into the problem herself
>and investigated further.  Apparently, when the string "1." is read
>in as a floating point, the code sets the input stream into
>a not good state, and leaves the float unaffected.  All the subsequent
>reads also fail.  Apparently, entering "1." for a float
>is as bad as entering "banana".

>I was told that the compiler set this up this way to conform to
>the standard.  The reasoning is that it is impossible for
>"cout << n1" to output the string "1.", so it should not be
>possible for "cin >> n1" to interpret that string.

That explanation is complete nonsense.

First, you CAN produce the string "1." as output. Set
"precision" to 0, and set the "showpoint" and "fixed" flags.
(The same is true in stdio, by the way.)
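A minimal sketch of the settings just described, assuming a C++11
library. The function names are hypothetical, chosen only for this
illustration. The stdio version uses the '#' flag, which is the
printf counterpart of showpoint.

```cpp
#include <cassert>
#include <cstdio>
#include <iomanip>
#include <sstream>
#include <string>

// Format a value through an ostream with the fixed and showpoint
// flags set and precision 0.
std::string format_with_iostream(float value) {
    std::ostringstream out;
    out << std::fixed << std::showpoint << std::setprecision(0) << value;
    return out.str();
}

// The stdio counterpart: '#' forces the decimal point and ".0"
// requests zero digits after it.
std::string format_with_stdio(float value) {
    char buffer[64];
    std::snprintf(buffer, sizeof buffer, "%#.0f", value);
    return std::string(buffer);
}

// Hypothetical self-check: both routes produce "1." for 1.0f.
void example_usage() {
    assert(format_with_iostream(1.0f) == "1.");
    assert(format_with_stdio(1.0f) == "1.");
}
```

Calling example_usage() from main exercises both functions.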

Second, there is no requirement, implied or expressed, that
all input and output formats be completely reversible.
For example, you can produce numeric output with fill characters
between the sign and the first digit, but there is no input
format setting that will read the data back.

And of course, "1." is a perfectly valid floating-point
number for input by scanf and by an istream in any case.
Digits are not required following the decimal point.
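A small sketch that checks this on a given implementation. The
helper names are hypothetical. On a conforming library both helpers
should accept a number that ends with the decimal point.

```cpp
#include <cassert>
#include <cstdio>
#include <sstream>
#include <string>
#include <vector>

// Extract every float from the text with operator>>, stopping at
// the first extraction that fails.
std::vector<float> read_floats(const std::string& text) {
    std::istringstream in(text);
    std::vector<float> values;
    float value;
    while (in >> value) {
        values.push_back(value);
    }
    return values;
}

// Parse one float with sscanf; returns the number of items converted.
int scan_float(const char* text, float* value) {
    return std::sscanf(text, "%g", value);
}

// Hypothetical self-check using the input from the original post.
void example_usage() {
    std::vector<float> values = read_floats("1. 2. 3. 4. 5.");
    assert(values.size() == 5);
    float f = 0.0f;
    assert(scan_float("1.", &f) == 1);
    assert(f == 1.0f);
}
```

If read_floats returns fewer than five values for that input, the
library under test shows the behavior reported in this thread.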

Check with the compiler vendor's tech support. You may have
found a known bug for which there is a fix. If not, file a
bug report.

--
Steve Clamage, stephen.clamage@sun.com
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]





Author: AllanW@my-dejanews.com
Date: 1998/07/27
In article <199807241544.PAA03854@mit1.fnal.gov>,
  "I don't suffer from insanity. I enjoy every minute of it."
<kkelley@mit1.fnal.gov> wrote:

> #include <iostream>
> using namespace std;
> int main() {
>   float n1,n2,n3,n4,n5;
>   cin >> n1 >> n2 >> n3 >> n4 >> n5;
>   cout << n1 << " " << n2 << " " << n3 << " " << n4 << " " << n5 << endl;
>   return 0;
> }
[...]
> [input]  1. 2. 3. 4. 5.
> [output] 1.79921e-29 0 1.69007e-29 nan0x7fffffff 0

> I spoke with some other people I work with about this "feature" of
> our local compiler.  One of them had run into the problem herself
> and investigated further.  Apparently, when the string "1." is read
> in as a floating point, the code sets the input stream into
> a not good state, and leaves the float unaffected.  All the subsequent
> reads also fail.  Apparently, entering "1." for a float
> is as bad as entering "banana".
>
> I was told that the compiler set this up this way to conform to
> the standard.  The reasoning is that it is impossible for
> "cout << n1" to output the string "1.", so it should not be
> possible for "cin >> n1" to interpret that string.

Ridiculous. This goes against common sense.

> Is that what the standard specifies?  I suppose there is a very
> specific definition on what strings can be handled and which
> cannot, as undefined behavior from an input stream is a very
> bad thing.  So the standard must clearly specify something.

> Everyone understands why entering "banana" when the program asks
> for a float is bad.  Most experienced programmers will even
> understand why entering "017" when asked for an integer will give
> different behaviour from entering "17".  I suspect only very
> experienced C++ programmers will be able to guess why entering
> "1." for a float would fail.
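The "017" case mentioned in the quoted text can be sketched as
follows. The assumption here is that the difference comes from the
stream's basefield setting: with the default dec flag the digits
are read as decimal, and with basefield cleared the extractor
detects the base from the prefix, as scanf's %i does. The helper
name is hypothetical.

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>

// Read one int from the text. When detect_base is true, basefield
// is cleared so a leading "0" selects octal and "0x" selects hex.
int read_int(const std::string& text, bool detect_base) {
    std::istringstream in(text);
    if (detect_base) {
        in.unsetf(std::ios_base::basefield);
    }
    int value = 0;
    in >> value;
    return value;
}

// Hypothetical self-check: "017" is 17 in decimal mode and 15
// (octal 17) when the base is detected from the prefix.
void example_usage() {
    assert(read_int("017", false) == 17);
    assert(read_int("017", true) == 15);
}
```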

I would not expect this from any language. First, numbers of the
form "1." are valid in C++ source code. Second, the rules at
run-time should be much more liberal than those used in source
code. (Indeed, the iostreams library removes thousands separators
in the input stream -- it certainly won't do this in source code.)
Third, if the number "1" was a valid floating-point number, I would
expect that the trailing period was either part of the number
(in which case the value is equivalent to 1.0 with the trailing 0
omitted) or it was not part of the number (in which case the period
is left in the input stream for subsequent input).  So the first
value, at least, would always have to be valid.
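The remark about thousands separators can be sketched like this.
Note the assumption: separators are only removed when the stream's
locale supplies a numpunct facet with a non-empty grouping, so the
sketch installs a hypothetical comma_grouping facet rather than
relying on any named locale being present.

```cpp
#include <cassert>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: ',' separates groups of three digits.
class comma_grouping : public std::numpunct<char> {
protected:
    char do_thousands_sep() const { return ','; }
    std::string do_grouping() const { return "\3"; }
};

// Read one double from the text using a locale that carries the
// facet above. The locale takes ownership of the facet.
double read_grouped(const std::string& text) {
    std::istringstream in(text);
    in.imbue(std::locale(in.getloc(), new comma_grouping));
    double value = 0.0;
    in >> value;
    return value;
}

// Hypothetical self-check: the separators are dropped on input.
void example_usage() {
    assert(read_grouped("1,234,567.5") == 1234567.5);
}
```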

Perhaps, like your "very experienced C++ programmers," I could
guess why it happened once evidence had been shown to me.  But
I would never predict it, and I would never happily accept it.

> So, is it a bug in the compiler or a "feature" of the standard?

I tried it on Microsoft Visual C++ V5.0.  It behaved as I expected;
that is, the output was
    1 2 3 4 5

Of course, showing that it works on one platform is far different
from showing that it *should* work on all platforms.  You asked
about the standard; I don't have the FDIS, but I do have the
November 1997 draft.

In [lib.istream.formatted.arithmetic]:
  these extractors depend on the
  locale's num_get<> (_lib.locale.num.get_) object  to  perform  parsing
  the  input  stream data.  The conversion occurs as if performed by the
  following code fragment:
    typedef num_get< charT,istreambuf_iterator<charT,traits> > numget;
    iostate err = 0;
    use_facet< numget >(loc).get(*this, 0, *this, err, val);
    setstate(err);

So, we are referred to the description of the locale.
In [lib.locale.num.get]:

1 The details of this operation occur in two stages

  --Stage 1: Determine a conversion specifier

  --Stage 2: Extract characters from in and transform them into  char's,
    converting the value transformed characters according to the conver-
    sion specification determined in stage 1.

  --Stage 3: Store results
    The details of the stages are presented below.
  Stage 1:
[...]
    For conversions to a  floating type the specifier is %g.
[...Goes on to explain about size modifiers...]
  Stage 2:
[...Explain how we extract a group of characters, changing the
local decimal point to '.' and removing the thousands separator
if there is one...]
  Stage 3:
    The result of stage 2 processing can be one of

  --A  sequence  of  chars  has been accumulated in stage 2 that is con-
    verted (according to the rules of std::scanf) to a value of the type
    of val.  This value is stored in val and ios_base::goodbit is stored
    in err.

  --The sequence of chars accumulated in stage 2 would have caused scanf
    to report an input failure.  ios_base::failbit is assigned to
[...]
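The code fragment quoted from the draft can be tried directly, as a
rough sketch. The wrapper function is hypothetical, and it runs the
facet over a string stream instead of *this. Reaching the end of
the input may set eofbit in err, so only failbit is tested.

```cpp
#include <cassert>
#include <ios>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Parse a float by calling the locale's num_get facet directly.
// Returns true when failbit was not set in err.
bool parse_with_num_get(const std::string& text, float& val) {
    typedef std::num_get<char, std::istreambuf_iterator<char> > numget;
    std::istringstream in(text);
    std::ios_base::iostate err = std::ios_base::goodbit;
    std::istreambuf_iterator<char> first(in);
    std::istreambuf_iterator<char> last;
    std::use_facet<numget>(in.getloc()).get(first, last, in, err, val);
    return (err & std::ios_base::failbit) == 0;
}

// Hypothetical self-check: "1." is expected to parse as 1.0f on a
// conforming library, and "banana" is expected to fail.
void example_usage() {
    float value = 0.0f;
    assert(parse_with_num_get("1.", value));
    assert(value == 1.0f);
    float other = 0.0f;
    assert(!parse_with_num_get("banana", other));
}
```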

So, we are referred to scanf.  The specification for scanf() is
incorporated into the C++ standard only by reference to the C
standard. From [intro.refs]:

1 The following standards contain provisions which, through reference in
  this text, constitute provisions of this International  Standard.   At
  the time of publication, the editions indicated were valid.  All stan-
  dards are subject to revision, and parties to agreements based on this
  International  Standard  are encouraged to investigate the possibility
  of applying the most recent editions of the standards indicated below.
  Members  of IEC and ISO maintain registers of currently valid Interna-
  tional Standards.

  --ISO/IEC 2382 Dictionary for Information Processing Systems.
  --ISO/IEC 9899:1990 Programming Languages - C
  --ISO/IEC:1990 Programming Languages - C AMENDMENT 1: C Integrity

I don't have a copy of the 1990 C standard.  But we all know what the
%g specifier does in scanf(), right?  It reads a value of type float,
in standard or exponential format.  Ironically, one text that I have
notes:
    The floating-point numbers in the input field must follow the
    format:
        [+/-]ddddddddd[.]dddd[E|e][+/-]ddd
although obviously this isn't literally correct -- the text
    +123.456-789
follows this format, but it isn't a (single) valid floating-point
value.
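A short sketch of that last point, using strtod rather than scanf
because strtod reports where the conversion stopped. The helper
name is hypothetical. The conversion takes the longest valid
prefix and leaves the rest unconsumed.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Convert the longest valid floating-point prefix of text and store
// the unconsumed remainder in rest.
double parse_prefix(const std::string& text, std::string& rest) {
    const char* begin = text.c_str();
    char* end = 0;
    double value = std::strtod(begin, &end);
    rest.assign(end);
    return value;
}

// Hypothetical self-check: only "+123.456" is converted, and "-789"
// is left over as the start of a separate number.
void example_usage() {
    std::string rest;
    double value = parse_prefix("+123.456-789", rest);
    assert(value > 123.455 && value < 123.457);
    assert(rest == "-789");
}
```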






Author: "I don't suffer from insanity. I enjoy every minute of it." <kkelley@mit1.fnal.gov>
Date: 1998/07/25
I ran into an annoying problem with the C++ compiler we are using.
Here is a sample program which reproduces the effect.

>>>>>>>>>>>>>>>>>>>>>>>>>
% cat test.cc
#include <iostream>

using namespace std;

int main() {
  float n1,n2,n3,n4,n5;

  cin >> n1 >> n2 >> n3 >> n4 >> n5;
  cout << n1 << " " << n2 << " " << n3 << " " << n4 << " " << n5 << endl;

  return 0;
}

% ./a.out
1 2 3 4 5
1 2 3 4 5
% ./a.out
1.0 2.0 3.0 4.0 5.0
1 2 3 4 5
% ./a.out
1. 2. 3. 4. 5.
1.79921e-29 0 1.69007e-29 nan0x7fffffff 0
>>>>>>>>>>>>>>>>>>>>>>>>>

I spoke with some other people I work with about this "feature" of
our local compiler.  One of them had run into the problem herself
and investigated further.  Apparently, when the string "1." is read
in as a floating point, the code sets the input stream into
a not good state, and leaves the float unaffected.  All the subsequent
reads also fail.  Apparently, entering "1." for a float
is as bad as entering "banana".
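The sticky failure described here can be observed with a sketch
like the following. The names are hypothetical. The second variable
starts at a sentinel value: once failbit is set, the next
extraction does nothing and the sentinel survives. What the first
variable holds after its failed extraction differs between library
versions, so the sketch does not inspect it.

```cpp
#include <cassert>
#include <sstream>
#include <string>

struct outcome {
    bool first_ok;   // did the first extraction succeed?
    bool second_ok;  // did the second extraction succeed?
    float second;    // value of the second variable afterwards
};

// Try to read two floats and report the stream state after each.
outcome read_two(const std::string& text) {
    std::istringstream in(text);
    float first = 0.0f;
    float second = -1.0f;  // sentinel, never produced by the inputs below
    outcome result;
    in >> first;
    result.first_ok = !in.fail();
    in >> second;          // skipped entirely if failbit is already set
    result.second_ok = !in.fail();
    result.second = second;
    return result;
}

// Hypothetical self-check: after "banana" fails, the following read
// fails too and leaves its variable untouched.
void example_usage() {
    outcome bad = read_two("banana 2");
    assert(!bad.first_ok);
    assert(!bad.second_ok);
    assert(bad.second == -1.0f);
}
```

Testing the stream after every extraction, and calling clear()
before retrying, avoids printing uninitialized values as in the
output above.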

I was told that the compiler set this up this way to conform to
the standard.  The reasoning is that it is impossible for
"cout << n1" to output the string "1.", so it should not be
possible for "cin >> n1" to interpret that string.

Is that what the standard specifies?  I suppose there is a very
specific definition on what strings can be handled and which
cannot, as undefined behavior from an input stream is a very
bad thing.  So the standard must clearly specify something.

I ran into this problem because this program is reading in a formatted
file which, for its own arcane reasons, must have numbers in that
format.  The workaround I came up with was to subclass istream
and provide a different way for my new class to handle reading
in floats.  It now reads into a char* buffer, and then calls
atof(buffer), which works for the above string.
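A rough sketch of a workaround in the same spirit, not the code
actually used. It swaps in a free function instead of an istream
subclass, std::string instead of a char* buffer, and strtod instead
of atof so that bad input can be detected. The function name is
hypothetical.

```cpp
#include <cassert>
#include <cstdlib>
#include <istream>
#include <sstream>
#include <string>

// Read one whitespace-delimited token and convert it with strtod.
// Sets failbit when the token is not entirely a number.
std::istream& read_float_token(std::istream& in, float& value) {
    std::string token;
    if (in >> token) {
        char* end = 0;
        double parsed = std::strtod(token.c_str(), &end);
        if (end == token.c_str() || *end != '\0') {
            in.setstate(std::ios_base::failbit);
        } else {
            value = static_cast<float>(parsed);
        }
    }
    return in;
}

// Hypothetical self-check with the input from the test run above.
void example_usage() {
    std::istringstream in("1. 2. 3. 4. 5.");
    float values[5] = {0, 0, 0, 0, 0};
    for (int i = 0; i < 5; ++i) {
        read_float_token(in, values[i]);
        assert(!in.fail());
    }
    assert(values[0] == 1.0f);
    assert(values[4] == 5.0f);
}
```

Because strtod also accepts forms such as "inf", "nan", and
hexadecimal floats, this is more permissive than operator>>.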

Everyone understands why entering "banana" when the program asks
for a float is bad.  Most experienced programmers will even
understand why entering "017" when asked for an integer will give
different behaviour from entering "17".  I suspect only very
experienced C++ programmers will be able to guess why entering
"1." for a float would fail.

So, is it a bug in the compiler or a "feature" of the standard?


Thanks,
Ken


