Topic: controlling structure layout portably


Author: "Greg Brewer" <nospam.greg@brewer.net>
Date: Mon, 9 Jul 2001 22:39:32 GMT
As far as packing a structure is concerned, I had a discussion on it about a
year ago.  The majority thought such a modifier was a good idea, with a
couple of strong dissents.  It would be nice if there were a library function
for swapping endian bytes.

Greg

"Matthias Benkmann" <mbenkmann@gmx.de> wrote in message
news:3b4a10e1.2206883@news.cis.dfn.de...
> I wonder why the following has never made it (in some way) into C++
> (or C for that matter)
>
> packed struct BITMAPFILEHEADER {
> };
>
> MSB
>



---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]





Author: Barry Margolin <barmar@genuity.net>
Date: Mon, 9 Jul 2001 22:57:33 GMT
In article <3b4a2ff1.10159837@news.cis.dfn.de>,
Matthias Benkmann <mbenkmann@gmx.de> wrote:
>On Mon,  9 Jul 2001 21:26:39 GMT, Barry Margolin <barmar@genuity.net>
>wrote:
>
>>In article <3b4a10e1.2206883@news.cis.dfn.de>,
>>Matthias Benkmann <mbenkmann@gmx.de> wrote:
>>>Wouldn't the above be a blessing for many people? Who doesn't need to
>>>read/write files of a given format sometimes? Who doesn't need to
>>>read/write network packets with a given layout sometimes? Who doesn't
>>>need to address certain machine-specific data areas sometimes?
>>
>>There are many libraries that serve the purpose of marshaling data in
>>formats specified by external standards.  It's not necessary to build
>>support for all these file formats into the standard.
>
>I think you didn't get what I was driving at. I don't want support for
>bitmap files in C++. That was just an example. The important parts are
>a keyword to mark a structure as packed and a storage class specifier
>little_endian to mark an integer as to be stored in little_endian
>format. It is basically impossible without this to write portable file
>handling unless you work on raw byte data like this:
>
>i=a[0]*256+a[1]

Folks have been doing it for years with functions like ntohs().  For more
elaborate stuff they use XDR, ASN.1/BER, CORBA, etc.
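
A minimal sketch of the byte-wise approach, for comparison with ntohs().
The helper names load_be16, load_le16 and store_be16 are hypothetical, not
part of any standard library, and the code assumes the buffer holds one
octet per unsigned char.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers (illustrative names only).
// Each element is masked to 8 bits so the result is the same even on a
// platform where unsigned char is wider than an octet.

// Reads a 16-bit value stored most significant octet first.
inline std::uint_least16_t load_be16(const unsigned char* p) {
    return static_cast<std::uint_least16_t>(((p[0] & 0xFFu) << 8) | (p[1] & 0xFFu));
}

// Reads a 16-bit value stored least significant octet first.
inline std::uint_least16_t load_le16(const unsigned char* p) {
    return static_cast<std::uint_least16_t>((p[0] & 0xFFu) | ((p[1] & 0xFFu) << 8));
}

// Writes a 16-bit value most significant octet first.
inline void store_be16(unsigned char* p, std::uint_least16_t v) {
    assert(v <= 0xFFFFu);  // uint_least16_t may be wider than 16 bits
    p[0] = static_cast<unsigned char>((v >> 8) & 0xFFu);
    p[1] = static_cast<unsigned char>(v & 0xFFu);
}
```

None of these depend on the byte order of the host, which is what makes
them usable unchanged on either kind of machine.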

>which is horrible. And I think that not even this is really portable
>as there is no "byte" data type (char could be 16 bits, couldn't it?)

I think C99 introduced some new types like uint8_t, uint16_t, etc.  They're
optional types, but if they exist they must be *exactly* the specified
number of bits.  This allows you to do

uint8_t a[BUFSIZE];

to get an array of 8-bit bytes, so you won't be screwed by 16-bit chars.
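
As a hedged illustration of the exact-width types: C99 declares them in
<stdint.h>, and later C++ standards (C++11 onward) expose them as
std::uint8_t and friends in <cstdint>. The sketch below assumes such an
implementation; BUFSIZE and checksum8 are made-up names.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>

// Fails to compile on a platform whose bytes are not octets, which is the
// point: std::uint8_t only exists where an exact 8-bit type does.
static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

const std::size_t BUFSIZE = 16;  // hypothetical buffer size

// Sums the first n octets of the buffer modulo 256.
std::uint8_t checksum8(const std::uint8_t* a, std::size_t n) {
    assert(a != nullptr || n == 0);
    unsigned sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        sum = (sum + a[i]) & 0xFFu;  // keep only the low 8 bits
    }
    return static_cast<std::uint8_t>(sum);
}
```

On a machine with 16-bit chars this translation unit is rejected at compile
time instead of silently misbehaving.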

But this is only really an issue for the folks writing the aforementioned
libraries.  It would be nice if they could be written entirely portably,
but for the user of the library it doesn't really matter if the author had
to use some #ifdef's.

--
Barry Margolin, barmar@genuity.net
Genuity, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.






Author: news_comp.std.c++_expires-2001-09-01@nmhq.net (Niklas Matthies)
Date: Tue, 10 Jul 2001 16:24:13 GMT
On Mon,  9 Jul 2001 22:34:54 GMT, Matthias Benkmann <mbenkmann@gmx.de> wrote:
> On Mon,  9 Jul 2001 21:26:39 GMT, Barry Margolin <barmar@genuity.net>
> wrote:
> >In article <3b4a10e1.2206883@news.cis.dfn.de>,
> >Matthias Benkmann <mbenkmann@gmx.de> wrote:
> >>Wouldn't the above be a blessing for many people? Who doesn't need to
> >>read/write files of a given format sometimes? Who doesn't need to
> >>read/write network packets with a given layout sometimes? Who doesn't
> >>need to address certain machine-specific data areas sometimes?
[···]
> And I think that not even this is really portable
> as there is no "byte" data type (char could be 16 bits, couldn't it?)

The problem with this particular point is not only lack of
octet-granularity memory access, but most probably also lack of
byte-granularity I/O on those platforms. Binary formats are inherently
dependent on the size of the basic data unit they are based on, and a
programming language can't do much about this. How do you want to read
or even just store (say) a GIF file on a platform that does I/O in (say)
15-bit units? The answer is that octet-based binary formats just aren't
designed for use on such platforms. If you want to process such a
format, you have to assume octet units, or you have to invent a
replacement format that stores the same information in differently sized
data units.

-- Niklas Matthies
-- 
When I was a little kid, I always wanted a bicycle, so I prayed to god
for a bicycle. Then I realized that god doesn't work that way, so I stole
a bicycle and prayed for forgiveness.






Author: news_comp.std.c++_expires-2001-09-01@nmhq.net (Niklas Matthies)
Date: Tue, 10 Jul 2001 16:24:35 GMT
[To moderators: Please cancel my first followup, this is a supersede
(replaced "byte" by "octet" in one instance). Thanks.]

On Mon,  9 Jul 2001 22:34:54 GMT, Matthias Benkmann <mbenkmann@gmx.de> wrote:
[···]
> And I think that not even this is really portable
> as there is no "byte" data type (char could be 16 bits, couldn't it?)

The problem with this particular point is not only lack of
octet-granularity memory access, but most probably also lack of
octet-granularity I/O on those platforms. Binary formats are inherently
dependent on the size of the basic data unit they are based on, and a
programming language can't do much about this. How do you want to read
or even just store (say) a GIF file on a platform that does I/O in (say)
15-bit units? The answer is that octet-based binary formats just aren't
designed for use on such platforms. If you want to process such a
format, you have to assume octet units, or you have to invent a
replacement format that stores the same information in differently sized
data units.

-- Niklas Matthies
-- 
Less matters that which you can take by yourself than that which others
have given to you.






Author: mbenkmann@gmx.de (Matthias Benkmann)
Date: Wed, 11 Jul 2001 17:00:13 GMT
On Mon,  9 Jul 2001 22:57:33 GMT, Barry Margolin <barmar@genuity.net>
wrote:

>In article <3b4a2ff1.10159837@news.cis.dfn.de>,
>Matthias Benkmann <mbenkmann@gmx.de> wrote:
>>On Mon,  9 Jul 2001 21:26:39 GMT, Barry Margolin <barmar@genuity.net>
>>wrote:
>>
>>>In article <3b4a10e1.2206883@news.cis.dfn.de>,
>>>Matthias Benkmann <mbenkmann@gmx.de> wrote:
>>>>Wouldn't the above be a blessing for many people? Who doesn't need to
>>>>read/write files of a given format sometimes? Who doesn't need to
>>>>read/write network packets with a given layout sometimes? Who doesn't
>>>>need to address certain machine-specific data areas sometimes?
>>>
>>>There are many libraries that serve the purpose of marshaling data in
>>>formats specified by external standards.  It's not necessary to build
>>>support for all these file formats into the standard.
>>
>>I think you didn't get what I was driving at. I don't want support for
>>bitmap files in C++. That was just an example. The important parts are
>>a keyword to mark a structure as packed and a storage class specifier
>>little_endian to mark an integer as to be stored in little_endian
>>format. It is basically impossible without this to write portable file
>>handling unless you work on raw byte data like this:
>>
>>i=a[0]*256+a[1]
>
>Folks have been doing it for years with functions like ntohs().

But this makes code much more complicated. Instead of using a straight
struct I would have to use accessor methods that do this conversion
all the time. It makes code harder to read and harder to maintain.
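
For concreteness, the accessor style under discussion might look like the
following sketch. The class name le_uint16 and the struct FileHeader are
hypothetical, and nothing here removes the objection: every read and write
still goes through get() and set().

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical wrapper type: keeps a 16-bit value as two octets in
// little-endian order and converts on every access.
class le_uint16 {
public:
    le_uint16() : bytes_() {}  // both octets start at zero

    std::uint_least16_t get() const {
        return static_cast<std::uint_least16_t>(
            static_cast<unsigned>(bytes_[0]) |
            (static_cast<unsigned>(bytes_[1]) << 8));
    }

    void set(std::uint_least16_t v) {
        assert(v <= 0xFFFFu);  // uint_least16_t may be wider than 16 bits
        bytes_[0] = static_cast<unsigned char>(v & 0xFFu);         // low octet first
        bytes_[1] = static_cast<unsigned char>((v >> 8) & 0xFFu);  // high octet second
    }

    // Raw storage, in the order it would appear in the file.
    const unsigned char* data() const { return bytes_; }

private:
    unsigned char bytes_[2];
};

// The struct declaration stays simple; only the field type changes.
// Note: the language does not promise sizeof(FileHeader) == 4.
struct FileHeader {
    le_uint16 type;
    le_uint16 reserved;
};
```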


> For more
>elaborate stuff they use XDR, ASN.1/BER, CORBA, etc.
>
>>which is horrible. And I think that not even this is really portable
>>as there is no "byte" data type (char could be 16 bits, couldn't it?)
>
>I think C99 introduced some new types like uint8_t, uint16_t, etc.  They're
>optional types, but if they exist they must be *exactly* the specified
>number of bits.  This allows you to do
>
>uint8_t a[BUFSIZE];
>
>to get an array of 8-bit bytes, so you won't be screwed by 16-bit chars.
>
>But this is only really an issue for the folks writing the aforementioned
>libraries.  It would be nice if they could be written entirely portably,
>but for the user of the library it doesn't really matter if the author had
>to use some #ifdef's.

You assume that there is a library with an appropriate license and
appropriate design for all file formats and network stuff. There
isn't. And if your program creates files of its own (i.e. no standard
format) then you don't have a premade library.
And if you're working in an embedded environment the library concept
might not even be applicable.

MSB

----
By the way:
Vacuum cleaners suck!






Author: brangdon@cix.co.uk (Dave Harris)
Date: Wed, 11 Jul 2001 17:00:10 GMT
mbenkmann@gmx.de (Matthias Benkmann) wrote (abridged):
>         little_endian unsigned int   bfType:16;
>
> How am I supposed to write portable code that reads/writes files or
> network packets with a given layout and endianness?

Why should the little endian format be favoured? There are many other
permutations, eg:

    0123
    3210
    1032
    2301

I believe at least 3 of those are in common use. How many new keywords are
we adding here?
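
A rough sketch of what handling arbitrary permutations involves. It assumes
one reading of the digit strings above: the digit at position i is the
significance rank (0 = most significant) of the octet stored at offset i,
so that 0123 is big-endian and 3210 is little-endian. The function name
load32 is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assembles a 32-bit value from four octets stored in an arbitrary order.
// rank[i] is the significance of the octet at offset i, 0 = most significant.
std::uint_least32_t load32(const unsigned char* p, const int (&rank)[4]) {
    std::uint_least32_t v = 0;
    for (int i = 0; i < 4; ++i) {
        assert(rank[i] >= 0 && rank[i] < 4);
        // Shift the octet into the position its rank calls for.
        v |= static_cast<std::uint_least32_t>(p[i] & 0xFFu) << (8 * (3 - rank[i]));
    }
    return v;
}
```

With the permutation passed as data, one function covers all four layouts
without any new keyword.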

  Dave Harris, Nottingham, UK | "Weave a circle round him thrice,
      brangdon@cix.co.uk      |   And close your eyes with holy dread,
                              |  For he on honey dew hath fed
 http://www.bhresearch.co.uk/ |   And drunk the milk of Paradise."






Author: Stephen Clamage <stephen.clamage@sun.com>
Date: Wed, 11 Jul 2001 17:00:30 GMT
On Mon,  9 Jul 2001 22:34:54 GMT, mbenkmann@gmx.de (Matthias Benkmann)
wrote:
>I think you didn't get what I was driving at. I don't want support for
>bitmap files in C++. That was just an example. The important parts are
>a keyword to mark a structure as packed and a storage class specifier
>little_endian to mark an integer as to be stored in little_endian
>format.

Big-endian and little-endian are not the only storage layout options.
Two examples: I worked on an 8-bit system where the 32-bit integer
ABCD was stored as BADC -- the word order was big-endian within the
double-word, but the byte order within each word was little-endian.
DEC floating-point formats put the most significant bytes in the
middle of the address range occupied by a floating-point value.
---
Steve Clamage, stephen.clamage@sun.com






Author: juergen@monocerus.demon.co.uk (Juergen Heinzl)
Date: Wed, 11 Jul 2001 18:16:42 GMT
In article <3b4a10e1.2206883@news.cis.dfn.de>, Matthias Benkmann wrote:
> I wonder why the following has never made it (in some way) into C++
> (or C for that matter)
>
> packed struct BITMAPFILEHEADER {
>         little_endian unsigned int   bfType:16;
>         little_endian unsigned int   bfSize:32;
>         little_endian unsigned int   bfReserved1:16;
>         little_endian unsigned int   bfReserved2:16;
>         little_endian unsigned int   bfOffBits:32;
> };
[-]
I think because something like little_endian and big_endian
is machine specific. It's just not a language issue. What if
you run a C(++) compiler on a machine with a CPU that does
not know of endianness at all ...

sizeof(char) == sizeof(short) == sizeof(int) == sizeof(long)

... say there is no such thing as a native byte order ?
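
A hedged sketch of how a program could probe this at run time, including
the "no byte order" case. The enum ByteOrder and the function native_order
are hypothetical names, and the probe assumes unsigned has no padding bits.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstring>

enum class ByteOrder { None, Little, Big, Other };

// Inspects how an unsigned value is laid out in memory.
ByteOrder native_order() {
    const std::size_t n = sizeof(unsigned);
    if (n == 1) {
        return ByteOrder::None;  // a one-byte integer has no byte order
    }
    // Build a value whose bytes are 1, 2, ..., n from most to least significant.
    unsigned v = 0;
    for (std::size_t i = 1; i <= n; ++i) {
        v = (v << CHAR_BIT) | static_cast<unsigned>(i);
    }
    unsigned char b[sizeof(unsigned)];
    std::memcpy(b, &v, n);

    bool big = true, little = true;
    for (std::size_t i = 0; i < n; ++i) {
        if (b[i] != i + 1) big = false;     // big-endian: 1, 2, ..., n
        if (b[i] != n - i) little = false;  // little-endian: n, ..., 2, 1
    }
    assert(!(big && little));
    if (big) return ByteOrder::Big;
    if (little) return ByteOrder::Little;
    return ByteOrder::Other;  // some mixed layout
}
```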

> How am I supposed to write portable code that reads/writes files or
> network packets with a given layout and endianness?
> How is placement new supposed to unfold its full usefulness if I can't
> have accurate control over the layout of the object?
[-]
The higher-level a language, the less you ought to need to care
about the layout of an object.

[-]
> Wouldn't the above be a blessing for many people? Who doesn't need to
> read/write files of a given format sometimes? Who doesn't need to
> read/write network packets with a given layout sometimes? Who doesn't
> need to address certain machine-specific data areas sometimes?
[-]
No, it wouldn't be a blessing since a C++ application on machine A
may quite well communicate with a Java application on machine B which
in turn may want to talk to a GW2001 BASIC application on machine C.

What I mean is that suddenly C++ would redefine the binary representation
and, even worse, it'd be different from the binary representation of
some C binary. And how is application X on B supposed to know what your
application was written in?

A can of worms,
Juergen

--
\ Real name     : Juergen Heinzl                \       no flames      /
 \ EMail Private : juergen@monocerus.demon.co.uk \ send money instead /






Author: "Marcin 'Qrczak' Kowalczyk" <qrczak@knm.org.pl>
Date: Thu, 12 Jul 2001 13:02:08 GMT
Wed, 11 Jul 2001 17:00:13 GMT, Matthias Benkmann <mbenkmann@gmx.de> wrote:

>>Folks have been doing it for years with functions like ntohs().
>
> But this makes code much more complicated. Instead of using a
> straight struct I would have to use accessor methods that do this
> conversion all the time.

You can translate it once after reading from a file, putting data
in a struct with fields sized and aligned naturally for the given
implementation, and not bother with accessor methods.
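
A sketch of that translate-once step, assuming the 14-octet on-disk BMP file
header (2-octet type, 4-octet size, two 2-octet reserved fields, 4-octet
offset, all little-endian). The struct and function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Native representation: field sizes and padding are whatever suits the
// implementation, since this struct is never written to the file directly.
struct BitmapFileHeader {
    std::uint_least16_t type;
    std::uint_least32_t size;
    std::uint_least16_t reserved1;
    std::uint_least16_t reserved2;
    std::uint_least32_t off_bits;
};

const std::size_t kHeaderBytes = 14;  // assumed size of the header in the file

static std::uint_least16_t le16(const unsigned char* p) {
    return static_cast<std::uint_least16_t>((p[0] & 0xFFu) | ((p[1] & 0xFFu) << 8));
}

static std::uint_least32_t le32(const unsigned char* p) {
    return  static_cast<std::uint_least32_t>(p[0] & 0xFFu)
         | (static_cast<std::uint_least32_t>(p[1] & 0xFFu) << 8)
         | (static_cast<std::uint_least32_t>(p[2] & 0xFFu) << 16)
         | (static_cast<std::uint_least32_t>(p[3] & 0xFFu) << 24);
}

// Converts the raw octets once; afterwards the fields are plain integers.
BitmapFileHeader parse_header(const unsigned char* raw, std::size_t len) {
    assert(len >= kHeaderBytes);
    BitmapFileHeader h;
    h.type      = le16(raw + 0);
    h.size      = le32(raw + 2);
    h.reserved1 = le16(raw + 6);
    h.reserved2 = le16(raw + 8);
    h.off_bits  = le32(raw + 10);
    return h;
}
```

After parse_header returns, the rest of the program reads h.size and the
other fields directly, with no conversion at the point of use.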

-- 
 __("<  Marcin Kowalczyk * qrczak@knm.org.pl http://qrczak.ids.net.pl/
 \__/
  ^^                      SYGNATURA ZASTĘPCZA
QRCZAK






Author: mbenkmann@gmx.de (Matthias Benkmann)
Date: Mon, 9 Jul 2001 20:28:22 GMT
I wonder why the following has never made it (in some way) into C++
(or C for that matter)

packed struct BITMAPFILEHEADER {
        little_endian unsigned int   bfType:16;
        little_endian unsigned int   bfSize:32;
        little_endian unsigned int   bfReserved1:16;
        little_endian unsigned int   bfReserved2:16;
        little_endian unsigned int   bfOffBits:32;
};


How am I supposed to write portable code that reads/writes files or
network packets with a given layout and endianness?
How is placement new supposed to unfold its full usefulness if I can't
have accurate control over the layout of the object?
Right now I see only the possibility to work with raw storage on
byte-level (which reminds me that the concept of a "byte" is also
something missing, which would be useful for portable code) which
makes code complicated and inefficient whereas the above would be
trivial to implement in a compiler and could produce much more
efficient code than manual fiddling with bytes.
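
To make the layout problem concrete, here is a hedged sketch. It assumes the
14-octet on-disk header (2-octet type, 4-octet size, two 2-octet reserved
fields, 4-octet offset) and an implementation that provides std::uint16_t
and std::uint32_t; NaiveHeader and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// A plain struct with the same fields as the file header. The compiler is
// free to insert padding, typically before `size` and at the end.
struct NaiveHeader {
    std::uint16_t type;
    std::uint32_t size;
    std::uint16_t reserved1;
    std::uint16_t reserved2;
    std::uint32_t off_bits;
};

const std::size_t kFileHeaderBytes = 14;  // 2 + 4 + 2 + 2 + 4

// Number of padding bytes the implementation added to the struct.
std::size_t padding_bytes() {
    assert(sizeof(NaiveHeader) >= kFileHeaderBytes);
    return sizeof(NaiveHeader) - kFileHeaderBytes;
}

// True only if the struct happens to be laid out exactly like the file.
bool layout_matches_file() {
    return sizeof(NaiveHeader) == kFileHeaderBytes
        && offsetof(NaiveHeader, size) == 2
        && offsetof(NaiveHeader, reserved1) == 6
        && offsetof(NaiveHeader, reserved2) == 8
        && offsetof(NaiveHeader, off_bits) == 10;
}
```

On many common implementations layout_matches_file() returns false, so
reading the file straight into such a struct does not work even before
byte order is considered.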

Wouldn't the above be a blessing for many people? Who doesn't need to
read/write files of a given format sometimes? Who doesn't need to
read/write network packets with a given layout sometimes? Who doesn't
need to address certain machine-specific data areas sometimes?

MSB

----
By the way:
Vacuum cleaners suck!






Author: Barry Margolin <barmar@genuity.net>
Date: Mon, 9 Jul 2001 21:26:39 GMT
In article <3b4a10e1.2206883@news.cis.dfn.de>,
Matthias Benkmann <mbenkmann@gmx.de> wrote:
>Wouldn't the above be a blessing for many people? Who doesn't need to
>read/write files of a given format sometimes? Who doesn't need to
>read/write network packets with a given layout sometimes? Who doesn't
>need to address certain machine-specific data areas sometimes?

There are many libraries that serve the purpose of marshaling data in
formats specified by external standards.  It's not necessary to build
support for all these file formats into the standard.

--
Barry Margolin, barmar@genuity.net
Genuity, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.






Author: mbenkmann@gmx.de (Matthias Benkmann)
Date: Mon, 9 Jul 2001 22:34:54 GMT
On Mon,  9 Jul 2001 21:26:39 GMT, Barry Margolin <barmar@genuity.net>
wrote:

>In article <3b4a10e1.2206883@news.cis.dfn.de>,
>Matthias Benkmann <mbenkmann@gmx.de> wrote:
>>Wouldn't the above be a blessing for many people? Who doesn't need to
>>read/write files of a given format sometimes? Who doesn't need to
>>read/write network packets with a given layout sometimes? Who doesn't
>>need to address certain machine-specific data areas sometimes?
>
>There are many libraries that serve the purpose of marshaling data in
>formats specified by external standards.  It's not necessary to build
>support for all these file formats into the standard.

I think you didn't get what I was driving at. I don't want support for
bitmap files in C++. That was just an example. The important parts are
a keyword to mark a structure as packed and a storage class specifier
little_endian to mark an integer as to be stored in little_endian
format. It is basically impossible without this to write portable file
handling unless you work on raw byte data like this:

i=a[0]*256+a[1]

which is horrible. And I think that not even this is really portable
as there is no "byte" data type (char could be 16 bits, couldn't it?)

MSB

----
By the way:
Vacuum cleaners suck!
