Topic: Binary file i/o using streams


Author: awbone@be.the.spam.mindspring.com (Ash)
Date: 1999/01/23
>
>To avoid having to write operators for data_streams and iostreams, why
>not allow a data_stream to be constructed from an iostream reference and
>pass stuff to the iostream operators without change?

In cases where an object is streamed in the same way to both iostreams
and data_streams, that is a solution.  Most of the time what I write
to an iostream is more than just the object contents, though.

ashley


[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]






Author: James.Kanze@dresdner-bank.com
Date: 1999/01/19
In article <77n2e4$4rc$1@newsreader3.core.theplanet.net>,
  "Andrew J Robb" <AJRobb@bigfoot.com> wrote:
> I find the idea of portable binary far more attractive than human readable
> text.

Have you ever tried to debug it, when using it for communication between
two machines?  One of the things I really like about SMTP or NNTP is
that I can simply telnet into the other end, and pretend to be the
client computer.

>  Text is so fraught with complications that I avoid it wherever
> possible.
>
> Whoever thought that it was OK to generate a file like:
>
> cout << 1 << 2;
>
> and not be able to read back the original value with:
>
> int a, b;
> cin >> a >> b;

Whoever decided that iostream was for human oriented input and output,
and not for communication between computers.  You need both.

> where the variable, a, becomes 12 and the read of b fails?
>
> Or (as a consequence of being a simple char stream):
>
> cout << "this is a message";
>
> and not being able to read it back with:
>
> string a;
> cin >> a;
>
> There are so many examples of where these operators fail that I cannot think
> of using them for anything more than printing error messages.

Or outputting to the printer, or inputting a file created in an editor,
or formatting data for display on a screen, or parsing input from a text
window, or ...

>  I hate the
> idea of doing it and I resent the work involved, but I find myself writing
> my own routines to handle formatted text files.  Re-inventing wheels is not
> what C++ is all about, but when the supplied wheel is square...  I love the
> idea of overloaded stream operators but their definition is poor.
>
> I think that there is a strong case for a new class for handling data
> storage/exchange.  Then we can forget about iostream.

I agree on the need for handling data for storage and exchange.  I
disagree that we can forget about people.

>  Derived classes or
> probably format options could handle various file formats, including:
>     native (binary)
>     portable (binary)

Needs defining, first.  (Have you ever seen how BER encodes a double?
It's as portable as you can get, but it sure requires a lot of extra
bytes.)

>     java (binary)

Fine.  And what if my system has no support for the Java types (32-bit
ints, 64-bit longs)?

And while we're talking about binary types, where's BER encoding, and
IIOP?

>     base64 (binary option)
>     gzip (compression option)

These two work on previously existing byte streams -- IMHO, the correct
solution here is a filtering streambuf, and not an iostream.  (Although
when I needed gzip, I found it even easier to use a pipebuf.)
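
A filtering streambuf of the kind mentioned above can be sketched in a few lines. This is only an illustration under stated assumptions: real base64 or gzip filters need buffering and state, so a much simpler transformation (two hex digits per byte) stands in for them here, and the names `hex_filterbuf` and `hex_encode` are invented for the example.

```cpp
#include <ostream>
#include <sstream>
#include <streambuf>
#include <string>

// A write-only filtering streambuf: each byte written to it is forwarded
// to the destination streambuf as two lowercase hex digits.
class hex_filterbuf : public std::streambuf {
public:
    explicit hex_filterbuf(std::streambuf* dest) : dest_(dest) {}

protected:
    // No put area is set up, so every character arrives here.
    int_type overflow(int_type ch) override {
        if (traits_type::eq_int_type(ch, traits_type::eof()))
            return traits_type::not_eof(ch);
        static const char digits[] = "0123456789abcdef";
        const unsigned char byte =
            static_cast<unsigned char>(traits_type::to_char_type(ch));
        if (traits_type::eq_int_type(dest_->sputc(digits[byte >> 4]),
                                     traits_type::eof()) ||
            traits_type::eq_int_type(dest_->sputc(digits[byte & 0x0f]),
                                     traits_type::eof()))
            return traits_type::eof();
        return ch;
    }

    // Flushing the filter flushes the destination.
    int sync() override { return dest_->pubsync(); }

private:
    std::streambuf* dest_;  // not owned
};

// Convenience wrapper: runs a string through the filter into a string sink.
std::string hex_encode(const std::string& bytes) {
    std::ostringstream sink;
    hex_filterbuf filter(sink.rdbuf());
    std::ostream out(&filter);  // ordinary ostream on top of the filter
    out << bytes;
    out.flush();
    return sink.str();
}
```

The formatting code writing to `out` is unaware of the filter, which is the point of doing the transformation at the streambuf level rather than in a new stream class.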

>     text (text - emulating iostream::)
>     csv (text)
>     unicode (text option)
>     ASCII (text option)
>     EBCDIC (text option)

And ISO 8859-1 (the most frequent code set here).  And ISO 8859-2 (the
most frequent code set about 500 km to the east).  And you seem to be
forgetting the Chinese and the Japanese, not to mention the Indians,
Arabs, etc., etc.

Let's face it.  The standard cannot possibly cover all required uses.
You have to write a little bit of code yourself.

--
James Kanze                                           GABI Software, Sàrl
Conseils en informatique orientée objet  --
                          --  Beratung in industrieller Datenverarbeitung
mailto: kanze@gabi-soft.fr          mailto: James.Kanze@dresdner-bank.com

-----------== Posted via Deja News, The Discussion Network ==----------
http://www.dejanews.com/       Search, Read, Discuss, or Start Your Own








Author: "Andrew J Robb" <AJRobb@bigfoot.com>
Date: 1999/01/21
James.Kanze@dresdner-bank.com wrote in message
<782j99$crm$1@nnrp1.dejanews.com>...
>
>In article <77n2e4$4rc$1@newsreader3.core.theplanet.net>,
>  "Andrew J Robb" <AJRobb@bigfoot.com> wrote:
>> I find the idea of portable binary far more attractive than human
>> readable text.
>
>Have you ever tried to debug it, when using it for communication between
>two machines?  One of the things I really like about SMTP or NNTP is
>that I can simply telnet into the other end, and pretend to be the
>client computer.


Both mail and news are intended for human exchange.

>
>>  Text is so fraught with complications that I avoid it wherever
>> possible.
>>
>> Whoever thought that it was OK to generate a file like:
>>
>> cout << 1 << 2;
>>
>> and not be able to read back the original value with:
>>
>> int a, b;
>> cin >> a >> b;
>
>Whoever decided that iostream was for human oriented input and output,
>and not for communication between computers.  You need both.

I did not say that text is only for humans.  Here I am complaining about
a lack of symmetry between output and input.


>
>> where the variable, a, becomes 12 and the read of b fails?
>>
>> Or (as a consequence of being a simple char stream):
>>
>> cout << "this is a message";
>>
>> and not being able to read it back with:
>>
>> string a;
>> cin >> a;
>>
>> There are so many examples of where these operators fail that I cannot
>> think of using them for anything more than printing error messages.
>
>Or outputting to the printer, or inputting a file created in an editor,
>or formatting data for display on a screen, or parsing input from a text
>window, or ...

The trouble with parsing editor files (anything that involves human
vagueness), is the amount of work required to verify that all the data has
been parsed correctly.  I find the overloaded istream & operator>> is not
suited to generating robust parsers.


>
>>  I hate the
>> idea of doing it and I resent the work involved, but I find myself
>> writing my own routines to handle formatted text files.  Re-inventing
>> wheels is not what C++ is all about, but when the supplied wheel is
>> square...  I love the idea of overloaded stream operators but their
>> definition is poor.
>>
>> I think that there is a strong case for a new class for handling data
>> storage/exchange.  Then we can forget about iostream.
>
>I agree on the need for handling data for storage and exchange.  I
>disagree that we can forget about people.

I don't want to forget people - they pay me.  I do want the ability to
output/input data without having to insert spaces between fields.  This is
recognised in ostream_iterator, as used with the copy algorithm, where a
separator is specified.
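
The separator facility referred to here is presumably std::ostream_iterator, whose optional second constructor argument is a delimiter written after every element. A minimal sketch, with `write_fields` and `read_fields` as illustrative names, showing that output produced this way reads back symmetrically through std::istream_iterator:

```cpp
#include <algorithm>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Writes every field followed by a single space, so adjacent numbers
// cannot run together in the output.
std::string write_fields(const std::vector<int>& fields) {
    std::ostringstream out;
    std::copy(fields.begin(), fields.end(),
              std::ostream_iterator<int>(out, " "));
    return out.str();
}

// Reads whitespace-separated ints until extraction fails or input ends.
std::vector<int> read_fields(const std::string& text) {
    std::istringstream in(text);
    return std::vector<int>(std::istream_iterator<int>(in),
                            std::istream_iterator<int>());
}
```

Note that the delimiter is also written after the last element, which is harmless for whitespace but matters for formats such as CSV.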


>
>>  Derived classes or
>> probably format options could handle various file formats, including:
>>     native (binary)
>>     portable (binary)
>
>Needs defining, first.  (Have you ever seen how BER encodes a double?
>It's as portable as you can get, but it sure requires a lot of extra
>bytes.)

I would first look to CORBA that uses just this idea.


>
>>     java (binary)
>
>Fine.  And what if my system has no support for the Java types (32-bit
>ints, 64-bit longs)?

Then have classes that do.
>
>And while we're talking about binary types, where's BER encoding, and
>IIOP?


As I replied above, CORBA/IIOP would be a good basis for "portable".

>
>>     base64 (binary option)
>>     gzip (compression option)
>
>These two work on previously existing byte streams -- IMHO, the correct
>solution here is a filtering streambuf, and not an iostream.  (Although
>when I needed gzip, I found it even easier to use a pipebuf.)


Agreed.

>
>>     text (text - emulating iostream::)


See, I wasn't forgetting humans.

>>     csv (text)
>>     unicode (text option)
>>     ASCII (text option)
>>     EBCDIC (text option)
>
>And ISO 8859-1 (the most frequent code set here).  And ISO 8859-2 (the
>most frequent code set about 500 km to the east).  And you seem to be
>forgetting the Chinese and the Japanese, not to mention the Indians,
>Arabs, etc., etc.


Java 1.1 does a much better job of handling character sets.  It also gives
you the mechanism for creating new filters.  I would rather not have to use
a Java application to translate a character stream for my C++ application.

>
>Let's face it.  The standard cannot possibly cover all required uses.
>You have to write a little bit of code yourself.


It seems to me that this is an area ripe for collaboration - hopefully to be
incorporated into the standard in 10-20 years.





Author: awbone@be.the.spam.mindspring.com (Ash)
Date: 1999/01/21
On 17 Jan 99 02:10:24 GMT, "Andrew J Robb" <AJRobb@bigfoot.com> wrote:

>I find the idea of portable binary far more attractive than human readable
>text.  Text is so fraught with complications that I avoid it wherever
>possible.

-stuff deleted-

>I think that there is a strong case for a new class for handling data
>storage/exchange.  Then we can forget about iostream.  Derived classes or
>probably format options could handle various file formats, including:
>    native (binary)
>    portable (binary)
>    java (binary)
>    base64 (binary option)
>    gzip (compression option)
>    text (text - emulating iostream::)
>    csv (text)
>    unicode (text option)
>    ASCII (text option)
>    EBCDIC (text option)

I agree that the iostream hierarchy isn't a great fit for
_unformatted_ streams of data.  While it is certainly possible
to write locales and streambufs to write to any desired device
in whatever representation you choose, it's also true that there
is an awful lot of machinery in the iostream framework that is
irrelevant to unformatted streams, and simply complicates
development.

In my work, which deals with several forms of low-bandwidth
radio and satellite communications, I deal extensively with
streams of data in binary format.  To facilitate the various
types of i/o we do, I developed a 'data stream' library.  The
i/o interface mirrors that of iostreams, with insertion and extraction
operators, and a similar inheritance structure.  The base data_ios
holds a pointer to a device (i.e., file, serial port, socket), while
data_istream has a 'decoder' and data_ostream an 'encoder'.

The encoder and decoder classes are responsible for encoding/decoding
primitive types into a desired format and writing the encoded data to
the device.  For example, we have native, network byte order, xdr, and
ascii encoders and decoders.
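
The division of labour described above might look roughly like the following. This is a sketch under assumptions, not the library from the post: the class names follow the description, but every signature is invented, a byte vector stands in for the real device (file, serial port, socket), and only one primitive type is handled.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Stand-in for a real device such as a file, serial port, or socket.
typedef std::vector<unsigned char> device;

// An encoder turns a primitive value into bytes and writes them to the device.
class encoder {
public:
    virtual ~encoder() {}
    virtual void encode(device& dev, std::uint32_t value) const = 0;
};

// Network byte order: most significant byte first.
class network_order_encoder : public encoder {
public:
    void encode(device& dev, std::uint32_t value) const override {
        for (int shift = 24; shift >= 0; shift -= 8)
            dev.push_back(static_cast<unsigned char>((value >> shift) & 0xffu));
    }
};

// ASCII: decimal digits followed by a space as field terminator.
class ascii_encoder : public encoder {
public:
    void encode(device& dev, std::uint32_t value) const override {
        const std::string text = std::to_string(value) + ' ';
        dev.insert(dev.end(), text.begin(), text.end());
    }
};

// The stream only forwards to its encoder; the wire format is pluggable.
class data_ostream {
public:
    data_ostream(device& dev, const encoder& enc) : dev_(&dev), enc_(&enc) {}

    data_ostream& operator<<(std::uint32_t value) {
        enc_->encode(*dev_, value);
        return *this;
    }

private:
    device* dev_;         // not owned
    const encoder* enc_;  // not owned
};
```

The same insertion code can then target different wire formats by swapping the encoder, which matches the native / network byte order / xdr / ascii choice mentioned in the post.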

The architecture is clean, simple and extensible, and is minimal with
respect to unformatted i/o.  Some people grumble about having to
write insertion and extraction operators for both iostreams and
data_streams, but I don't consider that a con.  Almost every
iostream insertion operator I write has additional text labels
associated with the object's data, whereas the data_stream
operators deal with the data only.
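
To make the distinction concrete, here is a hedged sketch of one type with both kinds of insertion operator. The `position` type and the `mini_data_ostream` class are invented for the illustration; the latter is a deliberately tiny stand-in, not the data stream library described in this post.

```cpp
#include <cstdint>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical application type.
struct position {
    std::int32_t latitude;
    std::int32_t longitude;
};

// Tiny stand-in for a data stream: appends big-endian bytes to a buffer.
class mini_data_ostream {
public:
    mini_data_ostream& operator<<(std::int32_t value) {
        const std::uint32_t bits = static_cast<std::uint32_t>(value);
        for (int shift = 24; shift >= 0; shift -= 8)
            bytes_.push_back(static_cast<unsigned char>((bits >> shift) & 0xffu));
        return *this;
    }

    const std::vector<unsigned char>& bytes() const { return bytes_; }

private:
    std::vector<unsigned char> bytes_;
};

// For people: text labels accompany the data.
std::ostream& operator<<(std::ostream& out, const position& p) {
    return out << "position(lat=" << p.latitude
               << ", lon=" << p.longitude << ")";
}

// For machines: the data only, in a fixed layout.
mini_data_ostream& operator<<(mini_data_ostream& out, const position& p) {
    return out << p.latitude << p.longitude;
}
```

The two operators share nothing but the member list, which is why forwarding one to the other would not produce the labelled output.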

Anyway, that's how we tackled this issue.  BTW, we also use
iostreams extensively, including writing to WindowsNT consoles
and edit controls.

ashley





Author: Kevin Jacobs <jacobs@darwin.EPBI.CWRU.Edu>
Date: 1999/01/21
In comp.std.c++ Ash <awbone@be.the.spam.mindspring.com> wrote:
> I agree that the iostream hierarchy isn't a great fit for
> _unformatted_ streams of data.  While it is certainly possible
> to write locales and streambufs to write to any desired device
> in whatever representation you choose [...]

Good bloody luck, is all I can say.

I've done some pretty magic things in getting iostream based binary streams
to work via virtually every method my fevered brain could come up with.  My
two main branches were:

 1) use the locale machinery (codecvt, num_get, num_put, etc..)
    to redefine stream functions for various binary representations.

 2) create my own ios class by inheriting from std::basic_ios and
    then "re-invent" my own iostream hierarchy for various types of
    binary representations via custom streambufs.

Approach #1 was very intriguing in spite of its many twists and turns.
In the end, the specialization of codecvt<T> for T=char is what killed the
idea since it meant I had to define a new character class for any binary
iostream.  It can still be done and 90% of it is there in my development
code, but it just got out of hand.  I've used similar bastardization and
perversions to get other text based stream magic working, but it's too much
pain for something as simple as a binary stream.

Approach #2 is not as sexy, but works fairly well.
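
A rough idea of what approach #2 can look like, as a sketch only (this is not the code referred to in the post): two small classes derive from std::basic_ios<char> to reuse its state flags and streambuf pointer, and read or write fixed-width big-endian integers directly through the streambuf. The names `binary_ostream`, `binary_istream`, and `round_trip` are assumptions, and only a 32-bit unsigned type is covered.

```cpp
#include <cstdint>
#include <ios>
#include <sstream>
#include <streambuf>

// Output side: unformatted, big-endian, straight to the streambuf.
class binary_ostream : public std::basic_ios<char> {
public:
    explicit binary_ostream(std::streambuf* buf) { init(buf); }

    binary_ostream& operator<<(std::uint32_t value) {
        char bytes[4];
        for (int i = 0; i < 4; ++i)
            bytes[i] = static_cast<char>((value >> (24 - 8 * i)) & 0xffu);
        if (!rdbuf() || rdbuf()->sputn(bytes, 4) != 4)
            setstate(std::ios_base::badbit);
        return *this;
    }
};

// Input side: reads exactly four bytes or reports failure.
class binary_istream : public std::basic_ios<char> {
public:
    explicit binary_istream(std::streambuf* buf) { init(buf); }

    binary_istream& operator>>(std::uint32_t& value) {
        char bytes[4];
        if (!rdbuf() || rdbuf()->sgetn(bytes, 4) != 4) {
            setstate(std::ios_base::failbit | std::ios_base::eofbit);
            return *this;
        }
        value = 0;
        for (int i = 0; i < 4; ++i)
            value = (value << 8) | static_cast<unsigned char>(bytes[i]);
        return *this;
    }
};

// Writes a value into an in-memory buffer and reads it back.
std::uint32_t round_trip(std::uint32_t value) {
    std::stringbuf storage;
    binary_ostream out(&storage);
    out << value;
    binary_istream in(&storage);
    std::uint32_t result = 0;
    in >> result;
    return result;
}
```

Because the classes talk only to std::streambuf, the same code works over a filebuf or any custom streambuf without change.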

> In my work, which deals with several forms of low-bandwidth
> radio and satellite communications, I deal extensively with
> streams of data in binary format.  To facilitate the various
> types of i/o we do, I developed a 'data stream' library.  The
> i/o interface mirrors that of iostreams, with insertion and extraction
> operators, and a similar inheritance structure.  The base data_ios
> holds a pointer to a device (i.e., file, serial port, socket), while
> data_istream has a 'decoder' and data_ostream an 'encoder'.

That's essentially my approach #2.

-Kevin

--
----------->  Kevin Jacobs  <-----------|------->  (216) 778-8487  <--------
S.A.G.E. Project Technical Coordinator  | Department of Epidemiology
  & System Administrator                |   & Biostatistics
Internet E-mail: jacobs@darwin.cwru.edu | Case Western Reserve University
----------------------------------------------------------------------------





Author: "Andrew J Robb" <AJRobb@bigfoot.com>
Date: 1999/01/22
To avoid having to write operators for data_streams and iostreams, why
not allow a data_stream to be constructed from an iostream reference and
pass stuff to the iostream operators without change?









Author: dietmar.kuehl@claas-solutions.de
Date: 1999/01/18
Hi,
In article <77n2e4$4rc$1@newsreader3.core.theplanet.net>,
  "Andrew J Robb" <AJRobb@bigfoot.com> wrote:
> I find the idea of portable binary far more attractive than human readable
> text.  Text is so fraught with complications that I avoid it wherever
> possible.

You should note that the standard library provides you with mechanisms; it
does not solve your problems for you! If you are producing output you want
to read back in, you should be careful to do it in a way which indeed allows
reading it back in. Apparently you would prefer that the numerical output
operators add a separation character after writing. However, I want to
produce formats like "1, 2.". This would be impossible (or at least
inconvenient) if a separation character were always added. On the other
hand, you can easily change how built-in types are formatted if you want to:
just replace the 'num_put' facet with one of your own (which probably uses
the standard facet to do the actual formatting).
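
As a hedged illustration of the facet replacement suggested above, the sketch below derives from std::num_put<char>, lets the base class do the formatting, and appends one space. The names `spaced_num_put` and `write_fused` are invented; only the `long` overload is overridden (which is also the one `int` and `short` are routed through), so other numeric types would need their own overrides.

```cpp
#include <locale>
#include <sstream>
#include <string>

// A num_put replacement: delegate to the standard facet, then add a space.
class spaced_num_put : public std::num_put<char> {
protected:
    iter_type do_put(iter_type out, std::ios_base& str, char_type fill,
                     long value) const override {
        out = std::num_put<char>::do_put(out, str, fill, value);
        *out++ = ' ';
        return out;
    }
};

// Writes two ints with no explicit separator in the calling code.
std::string write_fused(int x, int y) {
    std::ostringstream out;
    // The locale takes ownership of the facet and deletes it when done.
    out.imbue(std::locale(out.getloc(), new spaced_num_put));
    out << x << y;
    return out.str();
}
```

With the facet installed, `out << x << y` produces output that `in >> a >> b` can read back, while streams without the facet keep the default behaviour needed for formats like "1, 2.".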

> There are so many examples of where these operators fail that I cannot think
> of using them for anything more than printing error messages.  I hate the
> idea of doing it and I resent the work involved, but I find myself writing
> my own routines to handle formatted text files.  Re-inventing wheels is not
> what C++ is all about, but when the supplied wheel is square...  I love the
> idea of overloaded stream operators but their definition is poor.

Their definition is according to common use (partially inherited already from
common use in C). ... and IMO the wheel you are speaking of is not square at
all. Instead, it has that many features that you can change most things about
it! Unfortunately, formatting of types which are not built-in is sometimes
defined and cannot generally be changed (for example, this applies to
basic_string and bitset).

> I think that there is a strong case for a new class for handling data
> storage/exchange.  Then we can forget about iostream.  Derived classes or
> probably format options could handle various file formats, including:
>     native (binary)
>     portable (binary)
>     java (binary)
>     base64 (binary option)
>     gzip (compression option)
>     text (text - emulating iostream::)
>     csv (text)
>     unicode (text option)
>     ASCII (text option)
>     EBCDIC (text option)

I can't see any evidence for a "strong case" which leads to abandoning
IOStreams! Actually, except for the few I/O operators defined for types not
built-in in the standard C++ library and existing user-defined operators, you
can already do what you want with IOStreams: Just implement all IO operators
similar to the IO operators for built-in types (ie. delegate the actual
formatting to some appropriate facet), supply corresponding facets for the
built-in types, and that's it. In case the operators which cannot be changed
are crucial, you might change the character traits argument for the IOStream
classes (however, if I remember correctly, the behavior of the classes is not
guaranteed by the standard in some cases).

Anyway, you should note that compression options are orthogonal to the
formatting and thus best be implemented as specialized stream buffers instead
of a special formatting option. There are other things orthogonal to
formatting, too, like eg. the actual destination (whether it is a file, a
text window, a socket, or whatever you are writing to is completely
independent from the formatting). Also, the distinction between ASCII and
EBCDIC is already possible with the IOStreams as defined by the standard: You
would just install a corresponding 'codecvt' facet translating between the
native representation and whatever external representation you want to
produce. This also applies to Unicode (however, it is likely that you would
use the wide character versions of the IOStream classes in this case).

You should note that IOStreams are not meant to be a poor man's persistence
engine: They are for user interaction. That is, they are intended to produce
error messages and to read and write human readable data streams which at
least might be changed by human interaction.

Anyway, for some kind of persistence or data exchange mechanism, you might
very well build an appropriate binary stream layer on top of the stream
buffer layer (I would definitely reuse this because it allows for example a
portable file access). Formatted IO is clearly not appropriate in all cases
(but then, some data base is often more appropriate than binary streams).
However, I doubt that a binary stream layer will become part of the standard
C++ library in the next standardization round...

--
<mailto:dietmar.kuehl@claas-solutions.de>
homepage: <http://www.informatik.uni-konstanz.de/~kuehl>







Author: "Andrew J Robb" <AJRobb@bigfoot.com>
Date: 1999/01/17
I find the idea of portable binary far more attractive than human readable
text.  Text is so fraught with complications that I avoid it wherever
possible.

Whoever thought that it was OK to generate a file like:

cout << 1 << 2;

and not be able to read back the original value with:

int a, b;
cin >> a >> b;

where the variable, a, becomes 12 and the read of b fails?
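
The behaviour is easy to reproduce with string streams standing in for the file. In this sketch (the names `write_ints`, `read_two`, and `read_result` are illustrative, not from any library) the fused output reads back as the single number 12, and writing an explicit separator restores the round trip:

```cpp
#include <sstream>
#include <string>

// Writes two ints with a caller-chosen separator between them.
std::string write_ints(int x, int y, const std::string& separator) {
    std::ostringstream out;
    out << x << separator << y;
    return out.str();
}

// Result of trying to read two ints back.
struct read_result {
    int a;
    int b;
    bool ok;  // true only if both extractions succeeded
};

read_result read_two(const std::string& text) {
    std::istringstream in(text);
    read_result r = {0, 0, false};
    r.ok = !(in >> r.a >> r.b).fail();
    return r;
}
```

Here `read_two(write_ints(1, 2, ""))` yields `a == 12` with `ok == false`, whereas `read_two(write_ints(1, 2, " "))` recovers both values.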

Or (as a consequence of being a simple char stream):

cout << "this is a message";

and not being able to read it back with:

string a;
cin >> a;
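
The string case fails because operator>> for std::string stops at the first whitespace. One common workaround, sketched here on the assumption that the message contains no newline, is to read the whole line with std::getline (`read_word` and `read_line` are illustrative names):

```cpp
#include <sstream>
#include <string>

// Formatted extraction: stops at the first whitespace character.
std::string read_word(const std::string& text) {
    std::istringstream in(text);
    std::string word;
    in >> word;
    return word;
}

// Line-oriented extraction: reads up to the newline or end of input.
std::string read_line(const std::string& text) {
    std::istringstream in(text);
    std::string line;
    std::getline(in, line);
    return line;
}
```

For strings that may themselves contain newlines, a length prefix or quoting scheme would be needed instead.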

There are so many examples of where these operators fail that I cannot think
of using them for anything more than printing error messages.  I hate the
idea of doing it and I resent the work involved, but I find myself writing
my own routines to handle formatted text files.  Re-inventing wheels is not
what C++ is all about, but when the supplied wheel is square...  I love the
idea of overloaded stream operators but their definition is poor.

I think that there is a strong case for a new class for handling data
storage/exchange.  Then we can forget about iostream.  Derived classes or
probably format options could handle various file formats, including:
    native (binary)
    portable (binary)
    java (binary)
    base64 (binary option)
    gzip (compression option)
    text (text - emulating iostream::)
    csv (text)
    unicode (text option)
    ASCII (text option)
    EBCDIC (text option)