Topic: bit field standards
Author: Jim Cobban <thesnaguy@hotmail.com>
Date: 2000/04/26 Raw View
"Mark M. Young" wrote:
> I was checking on the results of the following program with a Borland
> compiler and a gcc compiler and noticed something interesting. Both
> compilers produced executables that printed '~' (126) as they should
> when 'a' is given 4 bits. However, when 'a' is given 3 bits, Borland
> works fine, but gcc prints 'ý' (253). I know this character will vary
> from system to system. This means that Borland put 'bitmask_rep_' in
> memory with 'b' followed by 'a' and packed the bits to the right. This
> makes sense to me. Also, this means that gcc put 'bitmask_rep_' in
> memory with 'a' followed by 'b' and packed to the left. This makes no
> sense from a computer scientist's viewpoint. Could anyone please clear
> up what the standard says about this (maybe with a URL)?
The standard does not settle this: it leaves bit-field layout to each
implementation. Indeed, a compiler could put the two fields in separate bytes
altogether and still comply with the standard, although it might have
difficulty maintaining market share. The exact wording of the standard,
section 9.6, is:
"Allocation of bit-fields within a class object is implementation-defined.
Alignment of bit-fields is implementation-defined. Bit-fields are packed into
some addressable allocation unit. [Note: bit-fields straddle allocation units
on some machines and not on others. Bit-fields are assigned right-to-left on
some machines, left-to-right on others. ]"
What do YOU mean by "before" and "after"? What do you mean by "left" and
"right"? Why do you apparently think that bit fields should be packed into a
byte from the low order bit up, rather than from the high order bit down?
Indeed from your example it appears to me that both compilers put "a" before
"b" and the only difference is whether the unused bit is put before or after
the 7 bits which you defined.
From a more practical point of view you must understand that Borland C is an
implementation for one specific microprocessor family, Intel x86, which is a
little-endian computer. Multiple byte data units are allocated in memory
with the low address byte containing the low order byte of the value.
gcc is a compiler which started in the UNIX environment and has since been
implemented on practically every processor that any electrical engineer has
ever dreamt up. Some of those processors are little-endian, like the Intel
x86. Some are big-endian like the MC68000, and some can even be switched
between the two modes. I do not know for a fact why gcc allocates bit fields
in the order that it does, but one logical reason for allocating bit fields
starting at the high order bit of each byte is that bits are then allocated
in the order in which they would be printed on a piece of paper, regardless
of whether the processor is big- or little-endian. This is also one of the
arguments in favor of big-endian machine architectures: It is easier to read
memory dumps of multiple byte fields because the numbers are expressed in the
same order in which we conventionally write them on paper.
--
Jim Cobban jcobban@magma.ca
34 Palomino Dr.
Kanata, ON, CANADA
K2M 1M1
+1-613-592-9438
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html ]
Author: "Mark M. Young" <youngmm@wku.edu>
Date: 2000/04/20 Raw View
I was checking on the results of the following program with a Borland
compiler and a gcc compiler and noticed something interesting. Both
compilers produced executables that printed '~' (126) as they should
when 'a' is given 4 bits. However, when 'a' is given 3 bits, Borland
works fine, but gcc prints 'ý' (253). I know this character will vary
from system to system. This means that Borland put 'bitmask_rep_' in
memory with 'b' followed by 'a' and packed the bits to the right. This
makes sense to me. Also, this means that gcc put 'bitmask_rep_' in
memory with 'a' followed by 'b' and packed to the left. This makes no
sense from a computer scientist's viewpoint. Could anyone please clear
up what the standard says about this (maybe with a URL)?
Mark M. Young
youngmm@hera.wku.edu
#include <iostream>

typedef unsigned char u_char;
typedef unsigned int u_int;

typedef struct bitmask_rep_
{
    u_int a:4; // CRITICAL PART: or 'u_int a:3;'
    u_int b:4;
} bitmaskrep_;

typedef union
{
    u_char asChars[ sizeof( bitmask_rep_ ) ];
    bitmask_rep_ asBits;
} bitmask;

int main( int argc, char *argv[] )
{
    bitmask some_bits; // uninitialized, so any unused bits are indeterminate
    some_bits.asBits.a = 7;  // 0111
    some_bits.asBits.b = 14; // 1110
    // std:: qualification is needed with the standard <iostream> header
    std::cout << some_bits.asChars[ 0 ] << std::endl; // prints '~'
    // notice: they are put in memory 'a' followed by
    // 'b' and packed to the left
    // a >><< b => 0111 >><< 1110 => 126 => '~'
    return 0;
}