Topic: Problems with std::basic_istream


Author: kanze@gabi-soft.de (James Kanze)
Date: Fri, 19 Oct 2001 15:00:28 GMT
I'm having some problems with std::basic_istream.

1.  Consider the following case:

        std::istringstream in( "123" ) ;
        in >> anInt ;

    In all normal implementations, we will now have:

        in.eof()  == true
        in.fail() == false
        in.bad()  == false
        in.good() == false

    My question is what happens on the next input.  In all
    implementations I've ever seen, the next input will set failbit,
    and in.fail() will become true.  This is the behavior of the
    classic iostream, and it is, I believe, the expected behavior by
    the majority of istream users.

    It is not, as far as I can tell, the behavior required by the
    standard (although there is room to argue that the standard leaves
    the behavior in this case fully undefined).  Basically, if we look
    at 27.6.1.2.1/1 and 27.6.1.3/1, we see that all input functions,
    formatted or unformatted, start by constructing an
    std::basic_istream<>::sentry object, and do *nothing* (not even
    set failbit) if this object evaluates to false.  If we look then
    at 27.6.1.1.2 for the semantics of the sentry, we find first of
    all that the standard only specifies the constructor behavior for
    the case where is.good() is true.  We are left to guess what
    should happen if is.good() is false (as it will be in this case);
    presumably, the constructor either does nothing, or calling the
    function in such cases is undefined behavior.  If the constructor
    does nothing, we still have the problem of what the status of ok_
    is, but in no case does the constructor set the failbit in the
    owning stream.  (In 27.6.1.1.2/5, it says that "During
    preparation, the constructor may call setstate(failbit) [...]."
    This suggests that the constructor may not call setstate(failbit)
    otherwise.  And of course, 27.6.1.1.2/2 commences by saying "If
    is.good() is true, prepares for formatted or unformatted input."
    If is.good() is false, there is no preparation, and the
    constructor may *not* call the setstate(failbit).)

    Depending on the status of ok_, the extraction routine will either
    do nothing (and thus not set the failbit), or attempt to extract
    data, probably by reading from the streambuf, which may or may not
    return valid characters.  In the first case, code such as:

        std::istringstream in( "123" ) ;
        while ( in >> anInt ) ...

    results in an endless loop; in the second, we really don't know
    what will happen, but the traditional guarantee (if in.eof() is
    true, the next input will fail) is broken.

    IMHO, 27.6.1.1.2/2 should contain an additional sentence : "If
    is.good() is false, the constructor calls is.setstate(failbit),
    and sets ok_ to false."
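
A minimal sketch for observing the behavior in question on a given implementation. The names StreamState, snapshot, TwoReads and read_twice are made up for illustration. It records the four state queries after the first extraction and again after a second one; it shows what a library does, not what the standard text requires.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Snapshot of the four state queries listed above (hypothetical helper type).
struct StreamState {
    bool eof, fail, bad, good;
};

StreamState snapshot(const std::istream& in) {
    return { in.eof(), in.fail(), in.bad(), in.good() };
}

// State after the first and second extraction, plus the value read first.
struct TwoReads {
    StreamState first, second;
    int value;
};

// Extracts one int from `text`, then attempts a second extraction
// from the same stream without clearing any state.
TwoReads read_twice(const std::string& text) {
    std::istringstream in(text);
    int anInt = 0;
    in >> anInt;
    StreamState first = snapshot(in);

    int ignored = 0;
    in >> ignored;                      // the "next input" in question
    StreamState second = snapshot(in);

    return { first, second, anInt };
}
```

Calling read_twice("123") and inspecting `second.fail` tells you whether the library keeps the traditional guarantee that an input attempted after eof() fails.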


2.  Consider the following code:

        std::istringstream in( "123e-" ) ;
        in >> aDouble ;

    With the implementations I currently have access to, we now have:

        in.eof()  == true
        in.fail() == true
        in.bad()  == false
        in.good() == false

    This seems compatible with what the standard says.

    The problem is that for many years, we have been taught, by such
    experts as Steve Clamage, that the correct way to do input was:

        while ( in >> variable ) {
            //  ...
        }
        if ( in.eof() ) {
            //  End of data...
        } else {
            //  Formatting error...
        }

    The problem is that in this case, the code will consider the input
    as an end of file, and not as the format error that it is.

    So my question is: is this actually the desired behavior, and if
    so, how do I distinguish between a real end of file, and
    misformatted data?  And if this is the desired behavior, of what
    possible use is eof()?
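
To make the ambiguity concrete, here is a small sketch (Verdict and classic_verdict are hypothetical names) that runs the classic loop over a string and reports what the trailing test on eof() would conclude. On implementations that set both eofbit and failbit for "123e-", as described above, that input is reported as end of data.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// What the classic "if ( in.eof() ) ... else ..." test concludes.
enum class Verdict { EndOfData, FormatError };

// Runs the classic extraction loop over `text`, then applies the
// classic eof() test to classify why the loop stopped.
Verdict classic_verdict(const std::string& text) {
    std::istringstream in(text);
    double variable = 0.0;
    while (in >> variable) {
        // values are discarded; only the final stream state matters here
    }
    return in.eof() ? Verdict::EndOfData : Verdict::FormatError;
}
```

With this helper, "1 2 3" and "123e-" can end up with the same verdict, while "1 2 x" is reported as a format error because the offending character is still unread and eofbit was never set.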


3.  This is a somewhat more abstract problem, but what is the
    relationship between std::basic_istream<>::peek and
    std::basic_istream<>::putback?  I'm particularly concerned by the
    sequence:

        int ch1 = in.get() ;
        int ch2 = in.peek() ;
        in.putback( static_cast< char >( ch1 ) ) ;

    My impression is that if this is required to work, streambuf's
    must support at least a two character buffer.  We have effectively
    given the application a way to ensure two character look-ahead.
    (I'll admit that I've not yet looked at this one in detail, so
    there are probably some aspects that I've missed.  I'm just
    throwing the idea out to see what others think.)
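
A sketch of the sequence wrapped in a function so that it can be tried against different streambufs. PutbackResult and get_peek_putback are illustrative names, not library facilities. With a std::istringstream the putback succeeds, because the whole string sits in the get area; that says nothing about what other streambufs must support.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Outcome of the get / peek / putback sequence (hypothetical helper type).
struct PutbackResult {
    bool ok;    // true if the stream is still usable after putback
    int  ch2;   // the character seen by peek()
    int  next;  // what the following get() returns, or eof if putback failed
};

PutbackResult get_peek_putback(std::istream& in) {
    int ch1 = in.get();
    int ch2 = in.peek();
    in.putback(static_cast<char>(ch1));

    // A failed putback sets badbit; fail() is true for failbit or badbit.
    bool ok = !in.fail();
    int next = ok ? in.get() : std::char_traits<char>::eof();
    return { ok, ch2, next };
}
```

For an istringstream holding "ab", the result has ok set, ch2 equal to 'b', and next equal to 'a'.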


4.  Are successive calls to std::basic_istream<>::peek guaranteed to
    return the same value?  They don't with Sun CC, and if memory
    serves me right, they don't with VC++ either, although I don't
    have access to the compiler right now to verify.  The basic
    problem is that if the input stream is good, peek just returns the
    return value of rdbuf()->sgetc() (see 27.6.1.3/27), without
    setting any flags, regardless of the return value.

    This is generally only a problem when rdbuf()->sgetc() returns an
    end of file indication.  In many implementations of filebuf, an
    end of file indication is not definitive; the next call to sgetc()
    may return a valid character.  While this is, IMHO, a debatable
    policy in a streambuf, it doesn't seem to correspond with the rest
    of the istream policy at all.

    What is the intent of the standard here?  Should peek set failbit
    if it returns end of file, should sgetc be required to always
    return end of file once it has returned it once, or do we accept
    that peek is simply unreliable, and that if it returns end of
    file, there is no guarantee that it will do so the next time, or
    that the next input will fail?
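
The non-definitive end of file can be modelled at the streambuf level with a toy class. LateDataBuf is a hypothetical streambuf written only for this illustration; it does not describe any real filebuf. Its underflow() reports end of file once and supplies a character on the following call, so two successive sgetc() calls disagree.

```cpp
#include <cassert>
#include <streambuf>
#include <string>

// Hypothetical streambuf: the first underflow() reports end of file,
// the next one makes a single character available in the get area.
class LateDataBuf : public std::streambuf {
public:
    explicit LateDataBuf(char c) : ch_(c), calls_(0) {}

protected:
    int_type underflow() override {
        if (calls_++ == 0) {
            return traits_type::eof();      // "no data yet"
        }
        setg(&ch_, &ch_, &ch_ + 1);         // data has now "arrived"
        return traits_type::to_int_type(ch_);
    }

private:
    char ch_;
    int  calls_;
};
```

Since peek on a good stream is specified in terms of rdbuf()->sgetc(), a streambuf of this kind is the sort of source for which two successive peeks could return different values.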


--
James Kanze                                   mailto:kanze@gabi-soft.de
Beratung in objektorientierter Datenverarbeitung --
                             -- Conseils en informatique orientée objet
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany, Tél.: +49 (0)69 19 86 27

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]





Author: Valentin.Bonnard@free.fr (Valentin Bonnard)
Date: Sun, 21 Oct 2001 18:45:17 GMT
> I'm having some problems with std::basic_istream.

> 2.  Consider the following code:
>
>         std::istringstream in( "123e-" ) ;
>         in >> aDouble ;
>
>     With the implementations I currently have access to, we now have:
>
>         in.eof()  == true
>         in.fail() == true
>         in.bad()  == false
>         in.good() == false
>
>     This seems compatible with what the standard says.
>
>     The problem is that for many years, we have been taught, by such
>     experts as Steve Clamage, that the correct way to do input was:
>
>         while ( in >> variable ) {
>             //  ...
>         }
>         if ( in.eof() ) {
>             //  End of data...
>         } else {
>             //  Formatting error...
>         }

Then write it this way:

if (in.fail ())
 // format error
else
 // eot

>     The problem is that in this case, the code will consider the input
>     as an end of file, and not as the format error that it is.

It is both.

>     So my question is: is this actually the desired behavior, and if
>     so, how do I distinguish between a real end of file, and
>     misformatted data?  And if this is the desired behavior, of what
>     possible use is eof()?

To know that at least one end-of-file has been seen.

The question is: what do you want to do when data appears
after an end-of-file?

> 3.  This is a somewhat more abstract problem, but what is the
>     relationship between std::basic_istream<>::peek and
>     std::basic_istream<>::putback?  I'm particularly concerned by the
>     sequence:
>
>         int ch1 = in.get() ;
>         int ch2 = in.peek() ;
>         in.putback( static_cast< char >( ch1 ) ) ;
>
>     My impression is that if this is required to work,

putback isn't required to work, it depends on the
available room in the buffer.

For example, in a non-buffering streambuf, pbackfail
will be called. By default, pbackfail just fails.

>     streambuf's must support at least a two character buffer.  We
>     have effectively given the application a way to ensure two
>     character look-ahead.  (I'll admit that I've not yet looked at
>     this one in detail yet, so there are probably some aspects that
>     I've missed.  I'm just throwing the idea out to see what others
>     think.)

Actually, one putback may fail, or one hundred putback
calls may succeed, depending on the streambuf.

The only guaranteed look-ahead is one character in underflow().
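
As an illustration of the non-buffering case, here is a sketch with a made-up streambuf (UnbufferedBuf and putback_succeeds are hypothetical names). It never sets up a get area and does not override pbackfail, so putback after a successful get fails, whereas the same sequence on a std::stringbuf succeeds.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical unbuffered streambuf: it never calls setg(), so there is
// no get area and no putback position; pbackfail() keeps its default.
class UnbufferedBuf : public std::streambuf {
public:
    explicit UnbufferedBuf(const std::string& s) : data_(s), pos_(0) {}

protected:
    // Look at the current character without consuming it.
    int_type underflow() override {
        return pos_ < data_.size()
            ? traits_type::to_int_type(data_[pos_])
            : traits_type::eof();
    }
    // Return the current character and advance.
    int_type uflow() override {
        return pos_ < data_.size()
            ? traits_type::to_int_type(data_[pos_++])
            : traits_type::eof();
    }

private:
    std::string data_;
    std::size_t pos_;
};

// Reads one character through an istream on `buf` and tries to put it back.
bool putback_succeeds(std::streambuf& buf) {
    std::istream in(&buf);
    int ch = in.get();
    in.putback(static_cast<char>(ch));
    return !in.fail();   // a failed putback sets badbit
}
```

putback_succeeds returns false for an UnbufferedBuf and true for a std::stringbuf holding the same text.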

  --   VB
