Topic: EOF on ostream::operator<<( streambuf* )


Author: kanze@gabi-soft.fr (J. Kanze)
Date: 1997/10/06
I recently ran into a problem with the g++ implementation of
ostream::operator<<( streambuf* ).  While the g++ implementation is
definitely "wrong" (in the sense that it doesn't conform to the standard
Unix definition of EOF), I'm curious as to just what the standard has to
say -- the error is a result of what I would consider a very reasonable
way of implementing the function.

Basically, the symptom of g++ is that when inputting from the keyboard,
I need two ^D's for EOF to be recognized.  (Strictly speaking, this
doesn't violate the C++ standard, because the C++ standard doesn't
specify what constitutes an EOF on a keyboard.  I'm pretty sure that
this wasn't the conscious intent of the people at Cygnus, however.)  The
reason for this is simple: for performance reasons, the operator uses
the equivalent of sgetn to read the streambuf.  The problem is that this
function doesn't have a separate return code for EOF -- EOF is
recognized because it returns without having read any characters (return
value == 0).  And what is happening here is that while the first call to
sgetn recognizes the first ^D and returns, it has already read some
characters, and so returns them.  And the next sgetn needs another ^D to
terminate it.

My question is really, what is the correct solution:

1. ostream::operator<<( streambuf* ) should read one character at a
time, in order to not miss a potential EOF, or

2. filebuf (which is doing the actual reading) should memorize the EOF.

Both have their problems.  The first is strictly correct, but if this
reasoning is carried to its extreme, when can you use sgetn?  And the
second has the problem of when to clear the EOF, and how.  (The EOF in
the ostream can be cleared with ios::clear, but there is no equivalent
function in streambuf.  From the interface point of view, the only state
in streambuf is the buffer.)

For the moment, I'm modifying my copy to use the first solution, but I'm
far from convinced that this is the best solution.  (In my case, I'm using
this in some demo software, where speed is irrelevant, so the potential
performance problem is unimportant.  I imagine that the people at Cygnus
would prefer checking out all other options, however, since other g++
users may be concerned about performance.)

--
James Kanze    +33 (0)1 39 23 84 71    mailto: kanze@gabi-soft.fr
GABI Software, 22 rue Jacques-Lemercier, 78000 Versailles, France
        I'm looking for a job -- Je recherche du travail
---
[ comp.std.c++ is moderated.  To submit articles: try just posting with      ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu         ]
[ FAQ:      http://reality.sgi.com/employees/austern_mti/std-c++/faq.html    ]
[ Policy:   http://reality.sgi.com/employees/austern_mti/std-c++/policy.html ]
[ Comments? mailto:std-c++-request@ncar.ucar.edu                             ]





Author: jpotter@falcon.lhup.edu (John Potter)
Date: 1997/10/07
kanze@gabi-soft.fr (J. Kanze) wrote:

: I recently ran into a problem with the g++ implementation of
: ostream::operator<<( streambuf* ).  While the g++ implementation is
: definitely "wrong" (in the sense that it doesn't conform to the standard
: Unix definition of EOF), I'm curious as to just what the standard has to
: say -- the error is a result of what I would consider a very reasonable
: way of implementing the function.

I think this is a platform problem.  The standard UN*X definition of
EOF is no more data.  Some shells (Bourne, ksh, but not bash?) use ^D
to mean send what I have typed so far.  If the ^D is at the beginning
of a line, that means no more data.  If there is anything on the line,
it means send that stuff to the program now and I will continue with
the line.  There is no EOF character.  UN*X ^D is not the same as DOS
^Z in any way.

Try:
$ echo H^De^Dll^Do^M

John





Author: kanze@gabi-soft.fr (J. Kanze)
Date: 1997/10/08
jpotter@falcon.lhup.edu (John Potter) writes:

|>  kanze@gabi-soft.fr (J. Kanze) wrote:
|>
|>  : I recently ran into a problem with the g++ implementation of
|>  : ostream::operator<<( streambuf* ).  While the g++ implementation is
|>  : definitely "wrong" (in the sense that it doesn't conform to the standard
|>  : Unix definition of EOF), I'm curious as to just what the standard has to
|>  : say -- the error is a result of what I would consider a very reasonable
|>  : way of implementing the function.
|>
|>  I think this is a platform problem.  The standard UN*X definition of
|>  EOF is no more data.  Some shells (Bourne, ksh, but not bash?) use ^D
|>  to mean send what I have typed so far.  If the ^D is at the beginning
|>  of a line, that means no more data.  If there is anything on the line,
|>  it means send that stuff to the program now and I will continue with
|>  the line.  There is no EOF character.  UN*X ^D is not the same as DOS
|>  ^Z in any way.

I'm aware of this (although it isn't a shell definition, but part of the
tty driver).  I was just trying to simplify the exposition.  I explained
in more detail what was happening in a later posting.  (I've managed to
obtain some counter-intuitive behavior from the Sun CC libraries as
well.  The real problem, of course, is that it is far from clear what
the actual behavior should be.)

The question is awkward: the standard, of course, doesn't define what
EOF might mean from a keyboard, and strictly speaking, as you say, ^D
isn't a real EOF.  On the other hand, if you are reading through
istream, it acts exactly like one, and if you are reading with stdio, it
acts like one.  And if you are extracting characters one by one using
streambuf::sbumpc, it acts like one.  What is funny is its behavior when
reading directly through a streambuf.  I, for one, find it
counter-intuitive that streambuf::sgetc returns different values in
successive calls.
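
The effect can be reproduced without a keyboard.  The following is only
a model, assuming that each read from the tty either delivers some
characters or, for a ^D at the start of a line, delivers nothing.
FakeTtyBuf is a made-up class standing in for a filebuf that does not
memorize the EOF:

```cpp
#include <cstddef>
#include <streambuf>
#include <string>
#include <vector>

// Hypothetical stand-in for a filebuf attached to a terminal.  Each
// element of reads is what one low-level read would deliver; an empty
// element models a ^D typed at the start of a line.
class FakeTtyBuf : public std::streambuf
{
public:
    explicit FakeTtyBuf(const std::vector<std::string>& reads)
        : reads_(reads), next_(0) {}

protected:
    int_type underflow()
    {
        if (next_ >= reads_.size()) {
            return traits_type::eof();      // nothing more will ever arrive
        }
        current_ = reads_[next_++];
        if (current_.empty()) {
            setg(0, 0, 0);
            return traits_type::eof();      // reported once, not memorized
        }
        char* p = &current_[0];
        setg(p, p, p + current_.size());
        return traits_type::to_int_type(*gptr());
    }

private:
    std::vector<std::string> reads_;
    std::size_t next_;
    std::string current_;
};
```

With the reads "ab", "" and "c", two sbumpc calls return 'a' and 'b',
the next sgetc returns EOF, and the sgetc after that returns 'c'.  With
the same reads, the first sgetn returns 2 and the second returns 1: the
EOF between them is never seen by the caller.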

Of course, counter-intuitive != non-conforming.  The whole point of my
question was: what should "conforming" mean in this case?  Perhaps the
correct answer is that once streambuf::sgetc has returned EOF, any
further use of the streambuf is undefined -- although specific
streambufs may define it further.  This would mean that both Sun and
g++ are conforming, but it makes writing filtering streambufs
significantly more complicated.

And it still leaves open the question of streambuf::sgetn (which g++
uses in its implementation of "ostream::operator<<( streambuf* )").  In
the default implementation, streambuf::sgetn will normally finish with a
sgetc (or sbumpc) that returns EOF; it has no way of informing the
caller of this, however.

And the reason it is a standards question is basically because the
standard doesn't define any state for a streambuf, so you are left with
a problem: if you memorize the EOF (as do istream and FILE), how do
you reset it?  The obvious answer is that there should be a virtual
function clear in streambuf as well.  But there isn't, and it is too
late for it in this round of standardization.

--
James Kanze    +33 (0)1 39 23 84 71    mailto: kanze@gabi-soft.fr
GABI Software, 22 rue Jacques-Lemercier, 78000 Versailles, France
        I'm looking for a job -- Je recherche du travail