Topic: Sizeof in C++ versus sizeof in C (was: vtbls)


Author: "Ronald F. Guilmette" <rfg@rahul.net>
Date: 29 Dec 1994 21:06:45 GMT
In article <1994Dec26.201700.9473@sq.sq.com>,
Mark Brader <msb@sq.sq.com> wrote:
>> > > struct S { double d; int i; };
>
>> > Applying sizeof had better yield the type when used as an
>> > array element.  Otherwise the well-known C technique
>> > T* p = malloc(n * sizeof(T));
>> > fails horribly.
>
>> I am forced to agree whole-heartedly.  Unfortunately, the ISO C standard
>> doesn't seem to really require this.  Offhand, I'd say that's a flaw in
>> the C standard.
>
>Sure it does.  From 6.5.2.1 / 3.5.2.1:
>
>#  There may also be unnamed padding at the end of a structure or union,
>#  as necessary to achieve the appropriate alignment were the structure
>#  or union to be an element of an array.

Note the deliberate use of the word `may'.  The C standard doesn't
require stand-alone (i.e. non-element) struct/union objects to include
the kind of trailing padding which must normally be supplied in the
case of array elements.

I believe that this is an oversight, and something that should be
corrected in the C standard.

--

-- Ron Guilmette, Sunnyvale, CA ---------- RG Consulting -------------------
---- E-mail: rfg@segfault.us.com ----------- Purveyors of Compiler Test ----
-------------------------------------------- Suites and Bullet-Proof Shoes -




Author: tanmoy@qcd.lanl.gov (Tanmoy Bhattacharya)
Date: 29 Dec 1994 22:27:37 GMT
In article <3dv8d5$obj@hustle.rahul.net>, "Ronald F. Guilmette" <rfg@rahul.net> writes:
|> In article <1994Dec26.201700.9473@sq.sq.com>,
|> Mark Brader <msb@sq.sq.com> wrote:
|> >> > > struct S { double d; int i; };
|> >
|> >> > Applying sizeof had better yield the type when used as an
|> >> > array element.  Otherwise the well-known C technique
|> >> > T* p = malloc(n * sizeof(T));
|> >> > fails horribly.
|> >
|> >> I am forced to agree whole-heartedly.  Unfortunately, the ISO C standard
|> >> doesn't seem to really require this.  Offhand, I'd say that's a flaw in
|> >> the C standard.
|> >
|> >Sure it does.  From 6.5.2.1 / 3.5.2.1:
|> >
|> >#  There may also be unnamed padding at the end of a structure or union,
|> >#  as necessary to achieve the appropriate alignment were the structure
|> >#  or union to be an element of an array.
|>
|> Note the deliberate use of the word `may'.  The C standard doesn't
|> require stand-alone (i.e. non-element) struct/union objects to include
|> the kind of trailing padding which must normally be supplied in the
|> case of array elements.
|>
|> I believe that this is an oversight, and something that should be
|> corrected in the C standard.
|>

Could you please elaborate on your argument? I find the following reason
appealing:

Assume the declaration:

struct a {double x; char y;};

And assume that the compiler I am talking about inserts no padding
that is not strictly required.

1) I am assuming that the passage quoted above implies:

   struct a b[2];

   ((char*)(void*)(b+1) - (char*)(void*)b) == sizeof(b[0]);
    and,
   ((char*)(void*)(b+2) - (char*)(void*)(b+1)) == sizeof(b[1]);
    and,
   sizeof(b) == sizeof(b[0])+sizeof(b[1]);

   The word `may' makes me unsure of this step.

   This implies that if double needs alignment other than 1, because no padding
   is possible _before_ the first field,

      sizeof(b[0]) > sizeof(double)+sizeof(char).

   i.e. there is some trailing padding.

2) sizeof(x) reports the size of the type of x, so that after

   struct a x, b[2];

   sizeof(x) == sizeof(b[0]);

   i.e. x must contain trailing padding.

3) One translation unit might contain

   struct a x;
   size_t a = sizeof(x);

   and the other might contain

   struct a arr[2];
   extern size_t a;
   size_t b = sizeof(arr[0]);

   and a==b is still guaranteed.

   So, in every implementation in which compilation of translation units cannot
   communicate with each other, and which allows more than one translation unit
   per program, x must contain trailing padding (except by the `as if' rule, e.g.
   if the translation unit declares x with non-external linkage and never takes
   its address, and never uses sizeof on it.)

So, under what conditions do you propose it not contain any trailing padding?

By the way, in an implementation that has a limit of one translation unit per
program, I cannot apply this logic. In fact, is there a rule such that even
the assertion in the following is valid:

struct {char a;} a;
struct {char a;} b;
assert(sizeof(a) == sizeof(b));

On a related topic, I discover that the `initial sequence rule' as I understood
it is too weak: does it apply to struct x {int a[2];} and struct y {int b[1]; int
c[1];}? (I mean is there no guarantee that offsetof c is the same as offsetof
a[1]? Can a perverse enough compiler decide to put padding between b and c?)

Cheers
Tanmoy
--
tanmoy@qcd.lanl.gov(128.165.23.46) DECNET: BETA::"tanmoy@lanl.gov"(1.218=1242)
Tanmoy Bhattacharya O:T-8(MS B285)LANL,NM87544-0285,USA H:#3,802,9 St,NM87545
Others see <gopher://yaleinfo.yale.edu:7700/00/Internet-People/internet-mail>,
<http://alpha.acast.nova.edu/cgi-bin/inmgq.pl>or<ftp://csd4.csd.uwm.edu/pub/
internetwork-mail-guide>. -- <http://nqcd.lanl.gov/people/tanmoy/tanmoy.html>
fax: 1 (505) 665 3003   voice: 1 (505) 665 4733    [ Home: 1 (505) 662 5596 ]




Author: msb@sq.sq.com (Mark Brader)
Date: Sat, 31 Dec 94 10:18:27 GMT
[Apology for long inclusion.  Reminder that thread is cross-posted.]

> > > > > struct S { double d; int i; };
>
> > > > Applying sizeof had better yield the type when used as an
> > > > array element.  Otherwise the well-known C technique
> > > > T* p = malloc(n * sizeof(T));
> > > > fails horribly.
>
> > > I am forced to agree whole-heartedly.  Unfortunately, the ISO C standard
> > > doesn't seem to really require this.  Offhand, I'd say that's a flaw in
> > > the C standard.
>
> > Sure it does.  From 6.5.2.1 / 3.5.2.1:
> >
> > #  There may also be unnamed padding at the end of a structure or union,
> > #  as necessary to achieve the appropriate alignment were the structure
> > #  or union to be an element of an array.
>
> Note the deliberate use of the word `may'.

Well, I didn't think it was accidental, but perhaps some people are confused
about what it means; the sentence would perhaps be subject to multiple
interpretations if it occurred in isolation.  Fortunately, there is enough
context to clarify it.

The word "may" here is not giving the implementation permission to
include padding if it feels like it; "may" is indicating possibility,
and specifically, the possibility that padding will have to be included
*because it would be necessary "were the structure or union to be an
element of an array"*.

> The C standard doesn't require stand-alone (i.e. non-element)
> struct/union objects to include the kind of trailing padding which
> must normally be supplied in the case of array elements.

Does too. :-)  Let me repeat two sentences from my earlier posting:

> > No; 6.5.2.1 is talking about properties of the types, not the objects.
> > The language could be more precise, but the intent is clear.

It says explicitly that it's talking about types: "a structure is a
type ... a union is a type".  What that section is mostly about is the
types' *representations*, although that word is not used.  And the
representation has to be the same for all objects of the type!  Notice
how in some places the section *does* talk about "structure objects".

Of course, the last words of the quoted passage are slightly wrong;
the elements of an array are objects, not types.  A suitable correction
would be "... to be the element type of an array".

> I believe that this is an oversight, and something that should be
> corrected in the C standard.

It was already covered in one of the public reviews before the standard
was adopted.  I made the same complaint about a draft version where the
"were..." wording was absent or different, and the present wording was
adopted as a fix.

As I say, though, the wording could be cleaned up.

--
Mark Brader, msb@sq.com, SoftQuad Inc., Toronto
        "I'm a little worried about the bug-eater," she said.  "We're embedded
        in bugs, have you noticed?"             -- Niven, "The Integral Trees"

This article is in the public domain.




Author: "Ronald F. Guilmette" <rfg@rahul.net>
Date: 24 Dec 1994 09:50:31 GMT
In article <D18nvG.K6@research.att.com>,
Andrew Koenig <ark@research.att.com> wrote:
>In article <3dcjj4$8dr@hustle.rahul.net> "Ronald F. Guilmette" <rfg@rahul.net> writes:
>
>>  struct S { double d; int i; };
>...
>Applying sizeof had better yield the type when used as an
>array element.  Otherwise the well-known C technique
>
> T* p = malloc(n * sizeof(T));
>
>fails horribly.

I am forced to agree whole-heartedly.  Unfortunately, the ISO C standard
doesn't seem to really require this.

Offhand, I'd say that's a flaw in the C standard.

(Note that I have cross-posted this follow-up also to comp.std.c.)

--

-- Ron Guilmette, Sunnyvale, CA ---------- RG Consulting -------------------
---- E-mail: rfg@segfault.us.com ----------- Purveyors of Compiler Test ----
-------------------------------------------- Suites and Bullet-Proof Shoes -




Author: yaakov@cc.gatech.edu (Yaakov Eisenberg)
Date: 24 Dec 1994 19:41:43 -0500
In article <3dgqt7$9d9@hustle.rahul.net>,
Ronald F. Guilmette <rfg@rahul.net> wrote:
>In article <D18nvG.K6@research.att.com>,
>Andrew Koenig <ark@research.att.com> wrote:
>>In article <3dcjj4$8dr@hustle.rahul.net> "Ronald F. Guilmette" <rfg@rahul.net> writes:
>>
>>>  struct S { double d; int i; };
>>...
>>Applying sizeof had better yield the type when used as an
>>array element.  Otherwise the well-known C technique
>>
>> T* p = malloc(n * sizeof(T));
>>
>>fails horribly.
>
>I am forced to agree whole-heartedly.  Unfortunately, the ISO C standard
>doesn't seem to really require this.
>

from the ISO C standard, section 6.3.3.4 (the sizeof operator)

   When applied to an operand that has structure or union type, the
   result is the total number of bytes in such an object, including
   internal and trailing padding.

from section 6.5.2.1

   There may also be unnamed padding at the end of a structure or
   union, as necessary to achieve the appropriate alignment were
   the structure or union to be an element of an array.

The standard certainly allows padding, and it apparently admits that padding
is necessary if structure elements of an array wouldn't be properly aligned
otherwise.
   In any case, I believe that all structures of a given type have the same
size, whether or not they are part of an array.  (Andrew Koenig appears to
disagree with this: "... the type when used as an array element.")
--

     -Yaakov Eisenberg (yaakov@cc.gatech.edu)




Author: msb@sq.sq.com (Mark Brader)
Date: Mon, 26 Dec 94 20:17:00 GMT
> > > struct S { double d; int i; };

> > Applying sizeof had better yield the type when used as an
> > array element.  Otherwise the well-known C technique
> > T* p = malloc(n * sizeof(T));
> > fails horribly.

> I am forced to agree whole-heartedly.  Unfortunately, the ISO C standard
> doesn't seem to really require this.  Offhand, I'd say that's a flaw in
> the C standard.

Sure it does.  From 6.5.2.1 / 3.5.2.1:

#  There may also be unnamed padding at the end of a structure or union,
#  as necessary to achieve the appropriate alignment were the structure
#  or union to be an element of an array.

Now from 6.3.3.4 / 3.3.3.4:

#  When [sizeof is] applied to an operand that has structure or union
#  type, the result is the total number of bytes in such an object,
#  including internal and trailing padding.

This thread is crossposted; I make no comment about C++.
--
Mark Brader, msb@sq.com       "C and C++ are two different languages.
SoftQuad Inc., Toronto         That's UK policy..."  -- Clive Feather




Author: "Ronald F. Guilmette" <rfg@rahul.net>
Date: 22 Dec 1994 19:21:08 GMT
In article <3d2237$7k1@engnews2.Eng.Sun.COM>,
Steve Clamage <clamage@Eng.Sun.COM> wrote:
>... The "sizeof" operator returns the amount of space an
>object requires when used as an array element, including any
>hidden data and padding for alignment...

Golly I'm glad I read this newsgroup!  Otherwise I'd never pick up on
little tidbits like this.

Steve, I take it that you are saying that for your average,
run-of-the-mill, 32-bit UNIX workstation, if I have:

 struct S { double d; int i; };

that `sizeof(struct S)' will yield 16 rather than 12, yes?

I just have one question... is this yet another small incompatibility
with C?  If so, is it listed already in the `compatibility' appendix
of the latest draft?

--

-- Ron Guilmette, Sunnyvale, CA ---------- RG Consulting -------------------
---- E-mail: rfg@segfault.us.com ----------- Purveyors of Compiler Test ----
-------------------------------------------- Suites and Bullet-Proof Shoes -




Author: ark@research.att.com (Andrew Koenig)
Date: Fri, 23 Dec 1994 00:54:52 GMT
In article <3dcjj4$8dr@hustle.rahul.net> "Ronald F. Guilmette" <rfg@rahul.net> writes:

> Golly I'm glad I read this newsgroup!  Otherwise I'd never pick up on
> little tidbits like this.

See?  You learn something new all the time.

> Steve, I take it that you are saying that for your average,
> run-of-the-mill, 32-bit UNIX workstation, if I have:

>  struct S { double d; int i; };

> that `sizeof(struct S)' will yield 16 rather than 12, yes?

Yes indeed -- at least if doubles are expected to be aligned on an
8-byte boundary.

> I just have one question... is this yet another small incompatibility
> with C?  If so, is it listed already in the `compatibility' appendix
> of the latest draft?

No, I don't think it's an incompatibility.

For example, when I tried the following program

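 #include <stdio.h>

 struct S {
  double d;
  int i;
 };

 int main(void)
 {
  /* sizeof yields a size_t, so cast it to match the %lu conversion */
  printf("%lu\n", (unsigned long) sizeof(struct S));
  return 0;
 }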

on my `average run-of-the-mill, 32-bit UNIX workstation,'
it printed 16 when run on the C++ compiler and both the C
compilers available on that machine (gcc and the vendor's
compiler).  I tried it on another machine from a different
vendor and both C compilers produced 16 there too.

Applying sizeof had better yield the size of the type when used as an
array element.  Otherwise the well-known C technique

 T* p = malloc(n * sizeof(T));

fails horribly.
--
    --Andrew Koenig
      ark@research.att.com