Thread

Topic: Issue, #309, reposted after mis-posting to comp.lang.c++

Author: jimreesma@gmail.com
Date: Wed, 13 Sep 2006 13:14:30 CST

To the appropriate committee folks,

The current proposed resolution of issue #309
(http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-active.html#309) is
unacceptable.  I write commercial software, and coding around this
makes my code ugly and non-intuitive, and requires comments referring
people to this very issue.  What follows is the full explanation of my
experience.

In the course of writing software for commercial use, I constructed
std::ifstream's based on user-supplied pathnames on typical POSIX
systems.

It was expected that some files that opened successfully might not read
successfully -- such as a pathname which actually referred to a
directory.   Intuitively, I expected the streambuffer underflow() code
to throw an exception in this situation, and recent implementations of
libstdc++'s basic_filebuf do just that (as well as many of my own
custom streambufs).

I also intuitively expected that the istream code would convert these
exceptions to the "badbit" set on the stream object, because I had not
requested exceptions.  I refer to 27.6.1.1 P4.

However, this was not the case on at least two implementations -- if
the first thing I did with an istream was call operator>>( T& ) for T
among the basic arithmetic types and std::string.   Looking further I
found that the sentry's constructor was triggering the exception when it
pre-scanned for whitespace, and the extractor function (operator>>())
was not catching exceptions in this situation.

So, I was in a situation where setting 'noskipws' would change the
istream's behavior even though no characters (whitespace or not) could
ever be successfully read.

Also, calling .peek() on the istream before calling the extractor()
changed the behavior (.peek() had the effect of setting the badbit
ahead of time).
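The two paths that the post describes as behaving predictably can be sketched as follows. This is a minimal illustration, not the poster's code: ThrowingBuf is a hypothetical stand-in for a filebuf whose first physical read fails, and the function names are invented for the example. The default skipws extraction, which is the case issue #309 is about, is deliberately not exercised, because the thread reports that its outcome differed between implementations.

```cpp
#include <cassert>
#include <istream>
#include <stdexcept>
#include <streambuf>

// Hypothetical stand-in for a filebuf whose first physical read fails hard.
struct ThrowingBuf : std::streambuf {
    int_type underflow() override {
        throw std::runtime_error("simulated hard read error");
    }
};

// peek() is an unformatted input function: its sentry does not skip
// whitespace, so the failing read happens inside peek() itself, where
// the exception is caught and converted to badbit.
bool peek_sets_badbit() {
    ThrowingBuf buf;
    std::istream in(&buf);      // exceptions() mask is goodbit by default
    in.peek();
    return in.bad();
}

// With noskipws the sentry performs no read either; the failing read
// happens inside the extractor, which also converts it to badbit.
bool noskipws_extract_sets_badbit() {
    ThrowingBuf buf;
    std::istream in(&buf);
    int value = 0;
    in >> std::noskipws >> value;
    return in.bad();
}

// Usage: both calls complete without an exception escaping.
void check_example() {
    assert(peek_sets_badbit());
    assert(noskipws_extract_sets_badbit());
}
```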

I found this all to be so inconsistent and inconvenient for me and my
code design, that I filed a bugzilla entry for libstdc++.   I was then
told that the bug cannot be fixed until issue #309 is resolved by the
committee.

Jim Rees
ITA Software, Inc.
Cambridge, MA

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: AlbertoBarbati@libero.it (Alberto Ganesh Barbati)
Date: Thu, 14 Sep 2006 02:49:55 GMT

jimreesma@gmail.com wrote:
> To the appropriate committee folks,

I'm not a committee folk, I apologize for the intrusion... :-)

> In the course of writing software for commercial use, I constructed
> std::ifstream's based on user-supplied pathnames on typical POSIX
> systems.
>
> It was expected that some files that opened successfully might not read
> successfully -- such as a pathname which actually referred to a
> directory.   Intuitively, I expected the streambuffer underflow() code
> to throw an exception in this situation, and recent implementations of
> libstdc++'s basic_filebuf do just that (as well as many of my own
> custom streambufs).

Hmmm... If I open a directory as if it were a file, I expect the
operation to fail immediately, before any read operation is attempted.
For example, on Win32 the stream gets failbit set in the ctor. I don't
know POSIX systems very well, but I would expect the same. Apparently,
I'm wrong about this...

Anyway, your problematic scenario could still occur, for example when
opening a file of size 0.

BTW, underflow() can fail either by throwing (as suggested by footnote
275) or by simply returning traits::eof(). So you should not *expect* it
to throw. It might happen on a particular implementation and in
particular cases, but it's not required by the standard.

> I also intuitively expected that the istream code would convert these
> exceptions to the "badbit" set on the stream object, because I had not
> requested exceptions.    I refer to 27.6.1.1. P4.

Notice that if failbit is set, then every operation fails immediately,
without having the chance to set badbit. So, for example, on Win32
badbit won't be set.

Moreover, if underflow() fails without throwing (and we saw that this
case could actually happen), then badbit won't be set anyway, regardless
of issue #309.
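The non-throwing case can be sketched like this. EofOnErrorBuf is a hypothetical buffer, invented for the example, that reports every failure as end of file; the struct and function names are likewise assumptions. The first extraction then ends with failbit and eofbit set and badbit clear, exactly as it would for an empty file.

```cpp
#include <cassert>
#include <istream>
#include <streambuf>

// Hypothetical buffer that reports every failure, hard or not, as end of file.
struct EofOnErrorBuf : std::streambuf {
    int_type underflow() override {
        return traits_type::eof();   // no exception, so nothing to convert to badbit
    }
};

struct FirstReadFlags {
    bool fail;   // fail() is true when failbit or badbit is set
    bool eof;
    bool bad;
};

// Attempts one formatted extraction and reports the resulting stream state.
FirstReadFlags first_read_flags() {
    EofOnErrorBuf buf;
    std::istream in(&buf);
    int value = 0;
    in >> value;
    return FirstReadFlags{in.fail(), in.eof(), in.bad()};
}

// Usage: the failure is visible through fail() and eof(), never through bad().
void check_example() {
    const FirstReadFlags flags = first_read_flags();
    assert(flags.fail && flags.eof && !flags.bad);
}
```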

> However, this was not the case on at least two implementations -- if
> the first thing I did with an istream was call operator>>( T& ) for T
> among the basic arithmetic types and std::string.   Looking further I
> found that the sentry's constructor was invoking the exception when it
> pre-scanned for whitespace, and the extractor function (operator>>())
> was not catching exceptions in this situation.

Hmmm... if the sentry is trying to skip whitespace, then clearly
failbit was not set... However, the sentry will set failbit | eofbit and
you can check for failure with fail() after that. No need to check
badbit.

> So, I was in a situation where setting 'noskipws' would change the
> istream's behavior even though no characters (whitespace or not) could
> ever be successfully read.
>
> Also, calling .peek() on the istream before calling the extractor()
> changed the behavior (.peek() had the effect of setting the badbit
> ahead of time).

I can't give you an answer, but let me ask you this: why are you worried
about badbit? My experience is that checking fail() (that is, either
failbit or badbit) is the right thing to do 99.9% of the time. I don't
really care which of the two bits is set (most of the time it's
either failbit or both). In fact, as I showed you above, using bad()
usually means relying on implementation-defined behaviour, so the code
would be unportable regardless of issue #309.

Perhaps you could motivate your concerns by providing a use case where
checking for badbit rather than failbit really can make a difference. To
be convincing, it would be better if you provide an example that does
not depend on implementation-defined behaviour.

Regards,

Ganesh


Author: "kanze" <kanze@gabi-soft.fr>
Date: Thu, 14 Sep 2006 13:01:05 CST

Alberto Ganesh Barbati wrote:
> jimreesma@gmail.com wrote:
> > In the course of writing software for commercial use, I
> > constructed std::ifstream's based on user-supplied pathnames
> > on typical POSIX systems.

> > It was expected that some files that opened successfully
> > might not read successfully -- such as a pathname which
> > actually referred to a directory.   Intuitively, I expected
> > the streambuffer underflow() code to throw an exception in
> > this situation, and recent implementations of libstdc++'s
> > basic_filebuf do just that (as well as many of my own custom
> > streambufs).

> Hmmm... If I open a directory as if it were a file, I expect the
> operation to fail immediately, before any read operation
> attempt. For example, on Win32 the stream gets failbit set in
> the ctor. I don't know POSIX systems very well, but I would
> expect the same. Apparently, I'm wrong about this...

Yep.  The open normally works.  The read may or may not work; if
the file system is mounted locally, you can read a directory
just like any other file.

> Anyway, your problematic scenario could still occur, for
> example when opening a file of size 0.

Not really.  If I understand correctly, his problem occurs
because the read returns a hard error, and not 0 bytes read.

The obvious solution in his case is simply to do a peek()
immediately after the open, and then check badbit.  This supposes
that the implementation of filebuf handles the error correctly.
(It still means that he can read data from an open directory,
although he'll probably get a format error fairly quickly if he
expects any predetermined format.)

> BTW, underflow() can fail either by throwing (as suggested by
> footnote 275) or by simply returning traits::eof().

I think the intended behavior is for it to throw if it
encounters an error, and to only return EOF if it encounters end
of file.

> So you should not *expect* it to throw. It might occur on a
> particular implementation and in particular cases but it's not
> required by the standard.

It's true that an implementation is not required to test for
possible hardware errors.  His problem, however, if I understand
it correctly, is that systems are detecting the possible
hardware errors, but reporting them differently.

> > I also intuitively expected that the istream code would
> > convert these exceptions to the "badbit" set on the stream
> > object, because I had not requested exceptions.    I refer
> > to 27.6.1.1. P4.

> Notice that if failbit is set, then every operation fails
> immediately, without having the chance to set badbit. So, for
> example, on Win32 badbit won't be set.

> Moreover, if underflow() fails without throwing (and we saw
> that this case could actually happen), then badbit won't be
> set anyway, regardless of issue #309.

How true.  In fact, the standard makes no guarantee as to
whether we can distinguish hard errors from end of file or not.

Again, I think his problem is that the open succeeds, and then
the first read fails with a hard error.  This is standard
behavior under Unix when attempting to use the normal reads on a
directory on a file system which is remote mounted.  (Arguably,
what is broken here is Unix, and not C++.  An open for read
succeeds on a "file" which cannot be read.)

All he's asking for, I think, is that when the system detects a
hard error during a physical read, the behavior of iostreams be
consistent.  The current situation would seem to be that the
behavior depends on whether the error is first detected in the
constructor of the sentry object, or later in the actual
operator>> code, and that in the first case, the behavior isn't
consistent across implementations.

I agree with his argument.  Globally, I would expect that
any time the system detects a hard error (manifested by the
read system call returning -1 under Unix), filebuf would raise
an exception, and that any time an exception occurs during input
(any type of input), badbit be set.

> > However, this was not the case on at least two
> > implementations -- if the first thing I did with an istream
> > was call operator>>( T& ) for T among the basic arithmetic
> > types and std::string.   Looking further I found that the
> > sentry's constructor was invoking the exception when it
> > pre-scanned for whitespace, and the extractor function
> > (operator>>()) was not catching exceptions in this
> > situation.

> Hmmm... if the sentry is trying to skip whitespace, then
> clearly failbit was not set... However the sentry will set
> failbit | eofbit and you can check failure with fail() after
> that. No need to check badbit.

Except that he wants to treat badbit differently.  If he gets
failbit on the first input, that means an empty file or a
format error (depending on eofbit).  If he gets badbit on the
first input, that probably means he opened a file he shouldn't
have.
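A minimal sketch of that three-way distinction, assuming the stream reports hard errors through badbit. The enum FirstInput and the functions classify_first_int and classify_text are hypothetical names for the example. Using eofbit to separate an empty file from a format error is a heuristic: an input that ends in the middle of a malformed number would also set both failbit and eofbit.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

enum class FirstInput { Ok, EmptyInput, FormatError, HardError };

// Attempts to read one int and classifies the outcome from the state bits.
FirstInput classify_first_int(std::istream& in) {
    int value = 0;
    in >> value;
    // bad() is checked first because fail() is also true when badbit is set.
    if (in.bad())   return FirstInput::HardError;
    if (!in.fail()) return FirstInput::Ok;
    return in.eof() ? FirstInput::EmptyInput : FirstInput::FormatError;
}

// Convenience wrapper for trying the classification on in-memory text.
FirstInput classify_text(const std::string& text) {
    std::istringstream in(text);
    return classify_first_int(in);
}

// Usage: the three outcomes that an in-memory stream can produce.
void check_example() {
    assert(classify_text("42")  == FirstInput::Ok);
    assert(classify_text("")    == FirstInput::EmptyInput);
    assert(classify_text("abc") == FirstInput::FormatError);
}
```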

> > So, I was in a situation where setting 'noskipws' would
> > change the istream's behavior even though no characters
> > (whitespace or not) could ever be successfully read.

> > Also, calling .peek() on the istream before calling the
> > extractor() changed the behavior (.peek() had the effect of
> > setting the badbit ahead of time).

> I can't give you an answer, but let me ask you this: why are
> you worried about badbit? My experience is that checking
> fail() (that is either failbit or badbit) is the right thing
> to do 99.9% of the time.

But isn't that experience conditioned by the fact that you
can't reliably do more?  Wouldn't it be better if you handled
hard read errors differently from end of file?  Consider a
server, reading a configuration file.  If for some reason there
is a read error on the disk, do you want it to start, without
any error message, but with the wrong configuration, or do you
want it to abort, signaling a read error in the configuration
file?

A lot depends on context, but in general, the more information,
the better: it is easier to ignore excess information than to
process information you don't have.
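The configuration-file case might look like the sketch below. The function read_config_lines and the buffer FailsAfterFirstLineBuf are hypothetical; the buffer simulates a disk that delivers one line and then fails. It assumes the underlying buffer reports a hard error by throwing, which the thread notes is not guaranteed.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <streambuf>
#include <string>
#include <vector>

// Reads every line; returns false if reading stopped because of a hard
// error (badbit) rather than a normal end of file.
bool read_config_lines(std::istream& in, std::vector<std::string>& lines) {
    std::string line;
    while (std::getline(in, line)) {
        lines.push_back(line);
    }
    return !in.bad();
}

// Hypothetical buffer: serves one line, then simulates a disk read error.
struct FailsAfterFirstLineBuf : std::streambuf {
    char data_[4] = {'a', '=', '1', '\n'};
    bool served_ = false;

    int_type underflow() override {
        if (served_) throw std::runtime_error("simulated disk read error");
        served_ = true;
        setg(data_, data_, data_ + sizeof data_);
        return traits_type::to_int_type(*gptr());
    }
};

// Usage: a clean end of file is accepted, a hard error is reported.
void check_example() {
    std::istringstream good("a=1\nb=2\n");
    std::vector<std::string> all;
    assert(read_config_lines(good, all) && all.size() == 2);

    FailsAfterFirstLineBuf buf;
    std::istream flaky(&buf);
    std::vector<std::string> partial;
    assert(!read_config_lines(flaky, partial) && partial.size() == 1);
}
```

A caller that only tested fail() after the loop could not tell these two outcomes apart, since both end with failbit set.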

> I don't really care which of the two bits is set (most
> of the time it's either failbit or both). In fact, as I
> showed you above, using bad() usually means relying on
> implementation-defined behaviour, so the code would be
> unportable regardless of issue #309.

> Perhaps you could motivate your concerns by providing a use
> case where checking for badbit rather than failbit really can
> make a difference. To be convincing, it would be better if you
> provide an example that does not depend on
> implementation-defined behaviour.

Since any detection of hard errors is to some degree system
dependent, some implementation-defined behavior is bound to be
involved.  In general, in this sort of situation, the intent of
implementation-defined behavior isn't that the implementation do
just anything; the intent is that it do the most it can, without
the standard requiring something impossible for certain
implementations.  Beyond that, it is a quality of implementation
problem.

And I find it hard to imagine any serious software where you
don't distinguish badbit (although it's bloody hard to test:
how do you force a read error on the disk?).  If you
cannot read the input, you don't want to just silently ignore
the fact, and say that everything went right.  (I've actually
had a colleague lose data because a program didn't test badbit
when writing.  The program was a typical Unix filter program,
copying the input file to standard out, with a little
transformation.  The disk was full, but the program said that
the copy was fine, so he deleted the input file, and moved the
output over to replace it.)
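The filter anecdote suggests a check along these lines. This is only a sketch: copy_filter is a hypothetical name, the transformation step is omitted, and FullDiskBuf is an invented buffer that refuses every character in order to imitate a full disk.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <streambuf>

// Hypothetical sink that behaves like a full disk: it accepts no characters.
struct FullDiskBuf : std::streambuf {
    int_type overflow(int_type) override {
        return traits_type::eof();   // signals that the character was not written
    }
};

// Copies in to out and reports success only if the input ended without a
// hard error and every write, including the final flush, succeeded.
bool copy_filter(std::istream& in, std::ostream& out) {
    char ch;
    while (in.get(ch)) {
        out.put(ch);
    }
    out.flush();
    return !in.bad() && !out.fail();
}

// Usage: the same copy succeeds into memory and fails into the full sink.
void check_example() {
    std::istringstream source("abc");
    std::ostringstream memory;
    assert(copy_filter(source, memory) && memory.str() == "abc");

    std::istringstream source_again("abc");
    FullDiskBuf full;
    std::ostream sink(&full);
    assert(!copy_filter(source_again, sink));
}
```

Only when copy_filter returns true would it be safe to delete the input file.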

--
James Kanze                                           GABI Software
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
