Topic: Problems with std::basic_istream
Author: kanze@gabi-soft.de (James Kanze)
Date: Fri, 19 Oct 2001 15:00:28 GMT
I'm having some problems with std::basic_istream.
1. Consider the following case:
std::istringstream in( "123" ) ;
in >> anInt ;
In all normal implementations, we will now have:
in.eof() == true
in.fail() == false
in.bad() == false
in.good() == false
My question is what happens on the next input. In all
implementations I've ever seen, the next input will set failbit,
and in.fail() will become true. This is the behavior of the
classic iostream, and it is, I believe, the expected behavior by
the majority of istream users.
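As a rough sketch of the case above (the names StreamState, snapshot and readTwice are invented for this illustration and appear nowhere in the standard), the following records the four state queries after the first extraction and again after a second one. It assumes an implementation with the traditional behavior described here; it does not show what the standard requires.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Snapshot of the four state queries discussed above (hypothetical helper).
struct StreamState {
    bool eof;
    bool fail;
    bool bad;
    bool good;
};

StreamState snapshot(const std::istream& in) {
    return StreamState{in.eof(), in.fail(), in.bad(), in.good()};
}

struct TwoReads {
    int value;                // result of the first extraction
    StreamState afterFirst;   // state once "123" has been read
    StreamState afterSecond;  // state after a further extraction attempt
};

// Extracts one int, records the state, then attempts a second extraction.
TwoReads readTwice(const std::string& text) {
    std::istringstream in(text);
    int anInt = 0;
    in >> anInt;
    StreamState first = snapshot(in);

    int other = 0;
    in >> other;  // traditionally expected to set failbit
    return TwoReads{anInt, first, snapshot(in)};
}
```

Calling readTwice( "123" ) and comparing afterFirst with afterSecond shows whether a given library keeps the traditional guarantee.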
It is not, as far as I can tell, the behavior required by the
standard (although there is room to argue that the standard leaves
the behavior in this case fully undefined). Basically, if we look
at 27.6.1.2.1/1 and 27.6.1.3/1, we see that all input functions,
formatted or unformatted, start by constructing an
std::basic_istream<>::sentry object, and do *nothing* (not even
set failbit) if this object evaluates to false. If we look then
at 27.6.1.1.2 for the semantics of the sentry, we find first of
all that the standard only specifies the constructor behavior for
the case where is.good() is true. We are left to guess what
should happen if is.good() is false (as it will be in this case);
presumably, the constructor either does nothing, or calling the
function in such cases is undefined behavior. If the constructor
does nothing, we still have the problem of what the status of ok_
is, but in no case does the constructor set the failbit in the
owning stream. (In 27.6.1.1.2/5, it says that "During
preparation, the constructor may call setstate(failbit) [...]."
This suggests that the constructor may not call setstate(failbit)
otherwise. And of course, 27.6.1.1.2/2 commences by saying "If
is.good() is true, prepares for formatted or unformatted input."
If is.good() is false, there is no preparation, and the
constructor may *not* call the setstate(failbit).)
Depending on the status of ok_, the extraction routine will either
do nothing (and thus not set the failbit), or attempt to extract
data, probably by reading from the streambuf, which may or may not
return valid characters. In the first case, code such as:
std::istringstream in( "123" ) ;
while ( in >> anInt ) ...
results in an endless loop; in the second, we don't really know
what will happen, but the traditional guarantee (if in.eof() is
true, the next input will fail) is broken.
IMHO, 27.6.1.1.2/2 should contain an additional sentence : "If
is.good() is false, the constructor calls is.setstate(failbit),
and sets ok_ to false."
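To make the proposed wording concrete, here is a minimal sketch of a guard class with those semantics. ProposedSentry is a hypothetical stand-in written for this illustration, not std::basic_istream<>::sentry itself, and the good() branch leaves out everything the real constructor does (flushing tie(), skipping whitespace).

```cpp
#include <ios>
#include <istream>
#include <sstream>

// Hypothetical guard mirroring the proposed sentence; NOT the real sentry.
class ProposedSentry {
public:
    explicit ProposedSentry(std::istream& is) : ok_(false) {
        if (is.good()) {
            // The real sentry would prepare for input here; omitted in this sketch.
            ok_ = true;
        } else {
            // The proposed addition: report the failure and leave ok_ false.
            is.setstate(std::ios_base::failbit);
        }
    }
    explicit operator bool() const { return ok_; }

private:
    bool ok_;
};

// Returns true if guarding a stream that has only eofbit set leaves
// failbit set and the guard false.
bool guardedReadFailsAtEof() {
    std::istringstream in("123");
    int anInt = 0;
    in >> anInt;
    in.clear(std::ios_base::eofbit);  // force the state under discussion: eof only
    ProposedSentry guard(in);
    return !guard && in.fail();
}
```

With a guard like this, a loop such as while ( in >> anInt ) cannot spin forever once eofbit is set, since the next attempt always sets failbit.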
2. Consider the following code:
std::istringstream in( "123e-" ) ;
in >> aDouble ;
With the implementations I currently have access to, we now have:
in.eof() == true
in.fail() == true
in.bad() == false
in.good() == false
This seems compatible with what the standard says.
The problem is that for many years, we have been taught, by such
experts as Steve Clamage, that the correct way to do input was:
while ( in >> variable ) {
    // ...
}
if ( in.eof() ) {
    // End of data...
} else {
    // Formatting error...
}
The problem is that in this case, the code will consider the input
as an end of file, and not as the format error that it is.
So my question is: is this actually the desired behavior, and if
so, how do I distinguish between a real end of file, and
misformatted data? And if this is the desired behavior, of what
possible use is eof()?
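The misclassification can be reproduced with a small sketch of the idiom. LoopExit and classifyWithEofTest are names made up for this example, and the result for "123e-" assumes an implementation that sets both eofbit and failbit, as the ones mentioned above do.

```cpp
#include <sstream>
#include <string>

// How the traditional idiom interprets the end of the read loop (hypothetical type).
enum class LoopExit { EndOfData, FormatError };

// Applies the idiom: read until extraction fails, then test eof().
LoopExit classifyWithEofTest(const std::string& text) {
    std::istringstream in(text);
    double variable = 0.0;
    while (in >> variable) {
        // ... use variable ...
    }
    // Only eof() is consulted, so a malformed last token that runs up to
    // the end of the input is indistinguishable from a clean end of data.
    return in.eof() ? LoopExit::EndOfData : LoopExit::FormatError;
}
```

On such an implementation, classifyWithEofTest( "1.5 # 2.5" ) reports a format error, while classifyWithEofTest( "123e-" ) reports end of data, the same answer as for the well-formed "1.5 2.5".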
3. This is a somewhat more abstract problem, but what is the
relationship between std::basic_istream<>::peek and
std::basic_istream<>::putback? I'm particularly concerned by the
sequence:
int ch1 = in.get() ;
int ch2 = in.peek() ;
in.putback( static_cast< char >( ch1 ) ) ;
My impression is that if this is required to work, streambuf's
must support at least a two character buffer. We have effectively
given the application a way to ensure two character look-ahead.
(I'll admit that I've not looked at this one in detail yet, so
there are probably some aspects that I've missed. I'm just
throwing the idea out to see what others think.)
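For experimenting, the sequence can be wrapped in a small function. LookAhead and getPeekPutback are hypothetical names, and the sketch uses a std::istringstream, whose buffer holds the whole string, so it says nothing about streambufs with less room for putback.

```cpp
#include <sstream>
#include <string>

// Results of the get/peek/putback sequence (hypothetical helper type).
struct LookAhead {
    int first;       // character returned by get()
    int second;      // character returned by peek()
    bool putbackOk;  // whether putback() left the stream usable
    int reread;      // character read after the putback, or eof
};

LookAhead getPeekPutback(const std::string& text) {
    std::istringstream in(text);
    int ch1 = in.get();
    int ch2 = in.peek();
    in.putback(static_cast<char>(ch1));
    bool ok = !in.fail();  // putback() sets badbit on failure
    int again = ok ? in.get() : std::char_traits<char>::eof();
    return LookAhead{ch1, ch2, ok, again};
}
```

With a string buffer, getPeekPutback( "ab" ) sees 'a', then 'b', and reads 'a' again after the putback, which is the two character look-ahead in question.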
4. Are successive calls to std::basic_istream<>::peek guaranteed to
return the same value? They don't with Sun CC, and if memory
serves me right, they don't with VC++ either, although I don't
have access to the compiler right now to verify. The basic
problem is that if the input stream is good, peek just returns the
return value of rdbuf()->sgetc() (see 27.6.1.3/27), without
setting any flags, regardless of the return value.
This is generally only a problem when rdbuf()->sgetc() returns an
end of file indication. In many implementations of filebuf, an
end of file indication is not definitive; the next call to sgetc()
may return a valid character. While this is, IMHO, a debatable
policy in a streambuf, it doesn't seem to correspond with the rest
of the istream policy at all.
What is the intent of the standard here? Should peek set failbit
if it returns end of file, should sgetc be required to always
return end of file once it has returned it once, or do we accept
that peek is simply unreliable, and that if it returns end of
file, there is no guarantee that it will do so the next time, or
that the next input will fail?
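A sketch of the kind of streambuf meant here: NonStickyEofBuf is a contrived class written for this illustration (it is not modeled on any particular filebuf) whose underflow() reports end of file once and then produces a character, as a file that grows might. The example calls sgetc() directly; how peek() reacts on top of such a buffer depends on the library.

```cpp
#include <streambuf>
#include <utility>

// Contrived streambuf whose end-of-file indication is not definitive.
class NonStickyEofBuf : public std::streambuf {
protected:
    int_type underflow() override {
        if (!reportedEof_) {
            reportedEof_ = true;
            return traits_type::eof();  // first call: nothing available yet
        }
        return traits_type::to_int_type('x');  // later calls: data has "arrived"
    }

private:
    bool reportedEof_ = false;
};

// Returns the results of two successive sgetc() calls on the same buffer.
std::pair<int, int> twoSgetcCalls() {
    NonStickyEofBuf buf;
    int first = buf.sgetc();
    int second = buf.sgetc();
    return std::make_pair(first, second);
}
```

Here the first call returns end of file and the second a valid character, so a peek that merely forwards sgetc() without touching the state flags would give two different answers.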
--
James Kanze mailto:kanze@gabi-soft.de
Beratung in objektorientierter Datenverarbeitung --
-- Conseils en informatique orientée objet
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany, Tél.: +49 (0)69 19 86 27
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html ]
Author: Valentin.Bonnard@free.fr (Valentin Bonnard)
Date: Sun, 21 Oct 2001 18:45:17 GMT
> I'm having some problems with std::basic_istream.
> 2. Consider the following code:
>
> std::istringstream in( "123e-" ) ;
> in >> aDouble ;
>
> With the implementations I currently have access to, we now have:
>
> in.eof() == true
> in.fail() == true
> in.bad() == false
> in.good() == false
>
> This seems compatible with what the standard says.
>
> The problem is that for many years, we have been taught, by such
> experts as Steve Clamage, that the correct way to do input was:
>
> while ( in >> variable ) {
>     // ...
> }
> if ( in.eof() ) {
>     // End of data...
> } else {
>     // Formatting error...
> }
Then write it this way:
if (in.fail ())
    // format error
else
    // eot
> The problem is that in this case, the code will consider the input
> as an end of file, and not as the format error that it is.
It is both.
> So my question is: is this actually the desired behavior, and if
> so, how do I distinguish between a real end of file, and
> misformatted data? And if this is the desired behavior, of what
> possible use is eof()?
To know that at least one end-of-file has been seen.
The question is: what do you want to do when data appears
after an end-of-file?
> 3. This is a somewhat more abstract problem, but what is the
> relationship between std::basic_istream<>::peek and
> std::basic_istream<>::putback? I'm particularly concerned by the
> sequence:
>
> int ch1 = in.get() ;
> int ch2 = in.peek() ;
> in.putback( static_cast< char >( ch1 ) ) ;
>
> My impression is that if this is required to work,
putback isn't required to work, it depends on the
available room in the buffer.
For example, in a non-buffering streambuf, pbackfail
will be called. By default, pbackfail just fails.
> streambuf's must support at least a two character buffer. We
> have effectively given the application a way to ensure two
> character look-ahead. (I'll admit that I've not yet looked at
> this one in detail yet, so there are probably some aspects that
> I've missed. I'm just throwing the idea out to see what others
> think.)
Actually, one putback may fail, or one hundred putback
calls may succeed, depending on the streambuf.
The only guaranteed look-ahead is one character in underflow().
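A minimal sketch of the non-buffering case: UnbufferedBuf is a hypothetical streambuf written for this example that serves characters through underflow() and uflow() without ever setting up a get area, and does not override pbackfail.

```cpp
#include <cstddef>
#include <istream>
#include <streambuf>
#include <string>
#include <utility>

// Hypothetical streambuf with no get area, so there is never a putback position.
class UnbufferedBuf : public std::streambuf {
public:
    explicit UnbufferedBuf(std::string data) : data_(std::move(data)), pos_(0) {}

protected:
    // Look at the current character without consuming it.
    int_type underflow() override {
        return pos_ < data_.size() ? traits_type::to_int_type(data_[pos_])
                                   : traits_type::eof();
    }
    // Return the current character and advance.
    int_type uflow() override {
        return pos_ < data_.size() ? traits_type::to_int_type(data_[pos_++])
                                   : traits_type::eof();
    }
    // pbackfail() is deliberately not overridden: the default just fails.

private:
    std::string data_;
    std::size_t pos_;
};

// Runs the get/peek/putback sequence and reports whether putback set badbit.
bool putbackFailsWithoutBuffer() {
    UnbufferedBuf buf("ab");
    std::istream in(&buf);
    int ch1 = in.get();  // 'a', supplied by uflow()
    in.peek();           // 'b', supplied by underflow()
    in.putback(static_cast<char>(ch1));
    return in.bad();
}
```

With this buffer the putback in the sequence fails and the stream ends up with badbit set, even though get() and peek() both worked.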
-- VB