Thread

Topic: size of an "empty" class

Author: jamshid@ses.com (Jamshid Afshar)
Date: Sun, 25 Sep 1994 07:40:59 GMT

In article <CwG3BB.80M@world.std.com> tob@world.std.com (Tom O Breton) writes:
>jamshid@ses.com (Jamshid Afshar) writes:
>> > In the very, very, very rare case where you want to use an empty class
>> > in that manner, add a dummy variable. (Frankly, I suspect there are zero
>> > real cases of empty class + virtual functions + want to hold all the
>> > virtuality data elsewhere.)
>>
>> I don't think such cases are very rare.  It's not uncommon to have a
>> class derived from a purely abstract class use a sibling concrete
>> class in its implementation.  For example: [...]
>>
>>         template<class T>
>>         class Container {   // abstract class and no data members
>>         public:
>>            virtual T get() = 0;
>>            //...
>>         };
>
>Your class is not empty. It has a hidden pointer to a virtual table.
>
>Frankly, I don't think you understand what you are talking about here,
>which is a difficulty I'm not willing to overcome.

I don't know who you think you are talking to, but when it comes to
C++ I've never been accused of not knowing what I'm talking about.

Your idea that vtable pointers in the object are the only way for a
C++ compiler to implement polymorphism is provincial and naive.  That
alone is excusable (though your haughty tone is not).  What I find
truly amazing is that you didn't get a clue when I specifically
mentioned the possibility of alternative implementations (e.g., tagged
pointers) in both the articles to which you responded.

>What I already said should be sufficient.

What you already said is barely coherent, much less sufficient.  If
you don't feel like answering the questions in my last article, fine,
but please use restraint when posting future articles to comp.std.c++.
Others might also mistake you for someone qualified to discuss issues
related to the C++ standard.

Jamshid Afshar
jamshid@ses.com

Author: hevi@hilja.it.lut.fi (Petri Heinilä)
Date: Tue, 27 Sep 1994 18:06:49 GMT

In article <CwGBt9.2y0@cdf.toronto.edu>, g2devi@cdf.toronto.edu (Robert N. Deviasse) writes:
> In article <CwG3BB.80M@world.std.com>, Tom O Breton <tob@world.std.com> wrote:
> > ...
> >Your class is not empty. It has a hidden pointer to a virtual table.
> >Frankly, I don't think you understand what you are talking about here,
>
>
> Just because you can't currently see how something can be otherwise does not
> mean that it can't be so.

Yes, for example:

I want to do some image processing in an OO way. I have a type-class
Pixel, which needs 8 bits for a 256-level gray scale. Its protocol
toward the manager object is something like show, calculate, etc.
When I do, for example, edge detection, I need two classes
inherited from Pixel: an EdgePixel, which has 8 states for the direction
in which the edge runs and has different behaviour for show (the line) or
calculate, and a NormalPixel, which has the standard pixel
behaviour. EdgePixels are usually a small minority in the calculated image.

Now say I want to get the edges from a 2048 x 2048 image. If the
compiler has a tagged optimization for virtual functions, the size
likely needed is

 2048 * 2048 * 8 / 8  =  4 Mbytes

and if there is a 32-bit virtual pointer in each Pixel, the size needed is

 2048 * 2048 * (8 + 32) / 8 = 20 Mbytes

which can't fit in a normal machine (today) with 16 Mbytes of RAM.
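
The arithmetic above can be checked with a small sketch. PlainPixel,
VirtualPixel, kSide, image_bytes and report are names made up for this
illustration; they are not from the post. The 5-bytes-per-pixel figure
assumes a 32-bit hidden pointer and no padding. A compiler that stores a
hidden pointer in each object will typically also pad the object for
alignment, so the measured sizeof(VirtualPixel) is likely to be larger
than 5.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical pixel types, named only for this illustration.
struct PlainPixel {              // 8-bit gray value, no virtual functions
    unsigned char gray;
};

struct VirtualPixel {            // same data plus virtual behaviour
    virtual ~VirtualPixel() {}
    virtual void show() const {}
    unsigned char gray;
};

const std::size_t kSide = 2048;  // the image is kSide x kSide pixels

// Total storage for the image at a given per-pixel cost in bytes.
std::size_t image_bytes(std::size_t bytes_per_pixel) {
    return kSide * kSide * bytes_per_pixel;
}

// Print the per-pixel and whole-image cost of both representations,
// as measured on whatever compiler builds this.
void report() {
    std::printf("plain:   %zu bytes/pixel, %zu bytes total\n",
                sizeof(PlainPixel), image_bytes(sizeof(PlainPixel)));
    std::printf("virtual: %zu bytes/pixel, %zu bytes total\n",
                sizeof(VirtualPixel), image_bytes(sizeof(VirtualPixel)));
}
```

image_bytes(1) gives the 4 Mbyte figure and image_bytes(5) the 20 Mbyte
figure; calling report() from main shows what a given compiler really
spends per pixel.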

The other question is whether compiler optimization rules
belong in the standard's definition at all, because they are partly
implementation issues. Then again, C++ already has optimization
directives like "register" and "volatile", inherited from ANSI C.
And C++ is a low-level language (it has pointers), so specifying
optimization might be appropriate.

--
-- <A HREF="http://www.lut.fi/~hevi/">The Page</A> --

Author: jamshid@ses.com (Jamshid Afshar)
Date: Tue, 20 Sep 1994 06:23:41 GMT

In article <CwCHFu.73u@world.std.com> tob@world.std.com (Tom O Breton) writes:
>jamshid@ses.com (Jamshid Afshar) writes:
>> How could this cause a problem in real code?  Say the B constructor
>> stores `this' in a static Set<B*> to keep track of all "live" B
>> objects.  If B::~B() removes `this' from the set, we would get an
>> error when d2 is destroyed because the destructors for both d2 and
>> d2.d1 would try to remove the same B* address from the set.
>
>I do not believe the language should be guarding against such a remote
>special case. It should simply not guarantee pointer uniqueness for
>empty classes. In the very, very, very rare case where you want to use
>an empty class in that manner, add a dummy variable. (Frankly, I suspect
>there are zero real cases of empty class + virtual functions + want to
>hold all the virtuality data elsewhere.)

I don't think such cases are very rare.  It's not uncommon to have a
class derived from a purely abstract class use a sibling concrete
class in its implementation.  For example:

 template<class T>
 class Container {   // abstract class and no data members
 public:
    virtual T get() = 0;
    //...
 };

 template<class T>
 class List : public Container<T> {     // concrete class
    //...
 public:
    virtual void insert(T t, ListIter<T> pos) {/*...*/}
    //...
 };

 template<class T>
 class PriorityQueue : public Container<T> {  // concrete class
    List<T> d;
 public:
    virtual T get() { return d.first(); }
    //...
 };

I simply don't think &pq should be allowed to be equal to &pq.d.  I
think the compiler, not the programmer, should add a dummy data member
or rearrange the object layout to ensure this doesn't happen *when
necessary*.  It's not at all difficult for the compiler to determine
when such overhead is necessary (eg, it would not be necessary in this
example under all C++ compilers I know of).  The compiler can
certainly make this decision much more easily and efficiently than a
programmer.
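
The Container/List/PriorityQueue example above can be turned into a
compilable sketch along the following lines. Several details are
assumptions made only so that it builds: std::vector as the storage
inside List, an insert() without the ListIter argument, the member d
being public, and the helper bases_coincide(), which is not part of the
original example.

```cpp
#include <cstddef>
#include <vector>

template<class T>
class Container {                // abstract class, no data members
public:
    virtual ~Container() {}
    virtual T get() = 0;
};

template<class T>
class List : public Container<T> {          // concrete class
    std::vector<T> items;        // stand-in storage, an assumption
public:
    void insert(T t) { items.push_back(t); }
    T first() const { return items.front(); }
    virtual T get() { return first(); }
};

template<class T>
class PriorityQueue : public Container<T> { // concrete class
public:
    List<T> d;                   // public only so the sketch can take &pq.d
    virtual T get() { return d.first(); }
};

// True if the Container<T> base of pq and the Container<T> base of
// pq.d end up at the same address after the standard conversions.
template<class T>
bool bases_coincide(PriorityQueue<T>& pq) {
    Container<T>* outer = &pq;
    Container<T>* inner = &pq.d;
    return outer == inner;
}
```

With a vtable-pointer implementation the hidden pointer of pq sits in
front of d, so bases_coincide() should return false without any extra
padding being needed.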

>The current decision is against the spirit of C, which is not paying for
>overhead that you don't use. Funny thing, I don't hear the people who
>usually yell "too tough to implement" complaining about this ~status
>quo~ kluge.

The restriction to empty class optimizations that I'm proposing is
*extremely* limited.  Any compiler making an effort to optimize away
space for empty classes could very easily check for this case.  If
they don't, I'm sure this would be the least of your optimization
concerns.

>If it is desired to let the dummy variable disappear in not-empty
>derived classes, it would be no more complex than what is being done
>now, just explicitly noted. But _I_ think that should be left up to the
>user, who will have to make their inheritance tree a bit odd (reminder:
>in a very, very, very rare case) but be otherwise unaffected.

I don't understand what you mean about disappearing data members, or
what "is being done now".  What is being done now?

>> Yes, B would usually have a virtual function or destructor and because
>> of the vtable pointer the compiler could not optimize away the
>> object's size, but a vtable pointer is not the only way for a C++
>> compiler to implement polymorphism.
>
>True -- in fact I could wish the vtable dereference had been left
>user-definable, as copy CTORs, operator=, and new/delete are. When
>that's useful, it's _very_ useful. IE, for multitudes of small objects
 ^^^^^^
>that dare not waste memory, or for objects that want to handle their
>functionality-state themselves. Or for reading heterogeneous objects
>from file.

When what's useful?  I was referring to tagged pointers as an
alternative to vtables.  Even in such situations where the compiler
did not use a vtable pointer, it could do something clever like
consider the empty "base" part of a derived class to be below the
class, which would not impose any size overhead.

>But IMO in principle _some_ member data is required for virtuality. The
>idea of using the address to supply that information is simply a
>roundabout way of using the object's data. You have to add data to make
>the address unique -- well, why not just put the data in the object
>where it belongs in the first place?

Are you implying that empty abstract base classes aren't useful?  I
think they're often part of excellent designs.

>It seems to come down to the fact that there is no way to reliably get
>instance-dependent information (in the Shannon sense) from an object
>without holding information in the object.
>
>> Has ANSI/ISO looked into this problem?
>
>They did, and they agree with you. Take that as a bad sign. }:)

What exactly has ANSI/ISO decided about this?  The only "optimization"
I believe ANSI/ISO should disallow is making the address of a data
member equal the address of its enclosing object when the data member
and enclosing object are derived from a common base class.  I'm all
for optimizing away space for empty base classes when this (rare)
situation does not occur.  Note, I say my situation is rare not
because the design is rare, but because the base class would almost
definitely have virtual functions and all C++ compilers I know of use
vtable pointers to implement virtual functions.

Jamshid Afshar
jamshid@ses.com

Author: g2devi@cdf.toronto.edu (Robert N. Deviasse)
Date: Tue, 20 Sep 1994 23:27:08 GMT

In article <CwG3BB.80M@world.std.com>, Tom O Breton <tob@world.std.com> wrote:
>jamshid@ses.com (Jamshid Afshar) writes:
>> > In the very, very, very rare case where you want to use an empty class
>> > in that manner, add a dummy variable. (Frankly, I suspect there are zero
>> > real cases of empty class + virtual functions + want to hold all the
>> > virtuality data elsewhere.)
>>
>> I don't think such cases are very rare.  It's not uncommon to have a
>> class derived from a purely abstract class use a sibling concrete
>> class in its implementation.  For example:
>>
>>         template<class T>
>>         class Container {   // abstract class and no data members
>>         public:
>>            virtual T get() = 0;
>>            //...
>>         };
>
>
>Your class is not empty. It has a hidden pointer to a virtual table.
>Frankly, I don't think you understand what you are talking about here,

Just because you can't currently see how something can be otherwise does not
mean that it can't be so.

Who says that a compiler has to allocate a vtable to classes with virtual
functions? Off the top of my head I can think of two possible implementations
that could give the above size zero. One is to have "fat pointers" (i.e. a
class pointer is a void pointer plus type information) and the other is to
have a global association list of pointers and types. These implementations
have some definite advantages over the vtable approach and though they will
likely be slower than vtables on current PC's, it need not be the case in
other architectures.
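
A hand-written approximation of the "fat pointer" idea might look like
the sketch below. It only imitates in user code what a compiler could do
internally; it is not how any particular compiler works. All names
(ShapeOps, Triangle, Square, ShapeRef, make_ref) are made up for the
illustration.

```cpp
#include <cstddef>

// Per-type function table, kept outside the objects themselves.
struct ShapeOps {
    int (*sides)(const void* self);
};

struct Triangle {};   // no data members and no hidden pointer
struct Square {};

int triangle_sides(const void*) { return 3; }
int square_sides(const void*)   { return 4; }

const ShapeOps kTriangleOps = { triangle_sides };
const ShapeOps kSquareOps   = { square_sides };

// "Fat pointer": the object's address and its type information
// travel together, so the object does not have to store the latter.
struct ShapeRef {
    const void*     object;
    const ShapeOps* ops;
    int sides() const { return ops->sides(object); }
};

// The type information is attached where the static type is known.
ShapeRef make_ref(const Triangle& t) {
    ShapeRef r = { &t, &kTriangleOps };
    return r;
}

ShapeRef make_ref(const Square& s) {
    ShapeRef r = { &s, &kSquareOps };
    return r;
}
```

Here make_ref(t).sides() dispatches on the table carried by the
reference, while sizeof(Triangle) stays at the minimum the compiler
gives an empty class. The cost moves into every pointer, which is now
two words wide.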

>which is a difficulty I'm not willing to overcome.

Pity.

> What I already said
>should be sufficient.
>

It isn't.

>        Tom
>
>--
>finger me for how Tehomega is coming along (at tob@world.std.com)
>Author of The Burning Tower (from TomBreton@delphi.com) (weekly in
>rec.games.frp.archives)
>

Take care
    Robert

--
/----------------------------------+------------------------------------------\
| Robert N. Deviasse               |"If we have to re-invent the wheel,       |
| EMAIL: g2devi@cdf.utoronto.ca    |  can we at least make it round this time"|
+----------------------------------+------------------------------------------/

Author: tob@world.std.com (Tom O Breton)
Date: Tue, 20 Sep 1994 20:23:34 GMT

jamshid@ses.com (Jamshid Afshar) writes:
> > In the very, very, very rare case where you want to use an empty class
> > in that manner, add a dummy variable. (Frankly, I suspect there are zero
> > real cases of empty class + virtual functions + want to hold all the
> > virtuality data elsewhere.)
>
> I don't think such cases are very rare.  It's not uncommon to have a
> class derived from a purely abstract class use a sibling concrete
> class in its implementation.  For example:
>
>         template<class T>
>         class Container {   // abstract class and no data members
>         public:
>            virtual T get() = 0;
>            //...
>         };


Your class is not empty. It has a hidden pointer to a virtual table.

Frankly, I don't think you understand what you are talking about here,
which is a difficulty I'm not willing to overcome. What I already said
should be sufficient.

        Tom

--
finger me for how Tehomega is coming along (at tob@world.std.com)
Author of The Burning Tower (from TomBreton@delphi.com) (weekly in
rec.games.frp.archives)

Author: tob@world.std.com (Tom O Breton)
Date: Sun, 18 Sep 1994 21:38:18 GMT

[ Followups to comp.std.c++ ]

jamshid@ses.com (Jamshid Afshar) writes:
> How could this cause a problem in real code?  Say the B constructor
> stores `this' in a static Set<B*> to keep track of all "live" B
> objects.  If B::~B() removes `this' from the set, we would get an
> error when d2 is destroyed because the destructors for both d2 and
> d2.d1 would try to remove the same B* address from the set.

I do not believe the language should be guarding against such a remote
special case. It should simply not guarantee pointer uniqueness for
empty classes. In the very, very, very rare case where you want to use
an empty class in that manner, add a dummy variable. (Frankly, I suspect
there are zero real cases of empty class + virtual functions + want to
hold all the virtuality data elsewhere.)

The current decision is against the spirit of C, which is not paying for
overhead that you don't use. Funny thing, I don't hear the people who
usually yell "too tough to implement" complaining about this ~status
quo~ kluge.

If it is desired to let the dummy variable disappear in not-empty
derived classes, it would be no more complex than what is being done
now, just explicitly noted. But _I_ think that should be left up to the
user, who will have to make their inheritance tree a bit odd (reminder:
in a very, very, very rare case) but be otherwise unaffected.

> Yes, B would usually have a virtual function or destructor and because
> of the vtable pointer the compiler could not optimize away the
> object's size, but a vtable pointer is not the only way for a C++
> compiler to implement polymorphism.

True -- in fact I could wish the vtable dereference had been left
user-definable, as copy CTORs, operator=, and new/delete are. When
that's useful, it's _very_ useful. IE, for multitudes of small objects
that dare not waste memory, or for objects that want to handle their
functionality-state themselves. Or for reading heterogeneous objects
from file.

But IMO in principle _some_ member data is required for virtuality. The
idea of using the address to supply that information is simply a
roundabout way of using the object's data. You have to add data to make
the address unique -- well, why not just put the data in the object
where it belongs in the first place?

It seems to come down to the fact that there is no way to reliably get
instance-dependent information (in the Shannon sense) from an object
without holding information in the object.

> Has ANSI/ISO looked into this problem?

They did, and they agree with you. Take that as a bad sign. }:)

        Tom

--
finger me for how Tehomega is coming along (at tob@world.std.com)
Author of The Burning Tower (from TomBreton@delphi.com) (weekly in
rec.games.frp.archives)

Author: fjh@munta.cs.mu.OZ.AU (Fergus Henderson)
Date: Mon, 19 Sep 1994 06:20:09 GMT

tob@world.std.com (Tom O Breton) writes:

>The current decision is against the spirit of C, which is not paying for
>overhead that you don't use.

I agree with you.  But I've long since given up arguing about this one.
It was a quite deliberate decision and there is basically no chance
of it being reversed :-(.

--
Fergus Henderson - fjh@munta.cs.mu.oz.au

Author: jamshid@ses.com (Jamshid Afshar)
Date: Sun, 18 Sep 1994 04:35:03 GMT

Redirected to comp.std.c++.

In article <778976946snz@protech.demon.co.uk>,
Mark Strange <Mark@protech.demon.co.uk> wrote:
>In article <CBARBER.94Sep1141903@apricot.bbn.com>
>cbarber@bbn.com "Christopher Barber" writes:
>[...]
>> The ARM states that "objects of an empty class have a nonzero size"
>> (section 9, p164 of the 1st edition).  This is so that such objects
>> can be allocated and have addresses.  [I think Chris meant "unique"
>> addresses]
>
>Yes, empty classes have a size of 1. You are correct in saying that it is
>so that they are given an address. Since each object, even 'empty' ones
>have an address, it is possible to distinguish, by comparing addresses,
>between objects of the same class.

I wish C++ guaranteed this to be true, but last I heard it will not.
Of course I'm not talking about comparing addresses after casting them
to void*.  The first member of a struct is guaranteed to be at the
same (void*) address as the enclosing object and the first member of
an array is at the same address as the array itself.  I'm referring to
comparisons involving standard conversions (no casts) like:

 class B {};

 class D1 : public B {};

 class D2 : public B {
 public:            // so that &d2.d1 below is accessible
    D1 d1;
 };

 D2 d2;

 D1* p1 = &d2.d1;   // pointer to subobject
 D2* p2 = &d2;

As far as I know, a compiler may make `p1' equal `p2' even though they
are *not* pointing to the "same" object (by any practical definition of
"same").
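
One way to observe what a given compiler does is to convert both
addresses to B* with standard conversions only and compare them, as in
the sketch below. The helper b_subobjects_coincide() is a made-up name,
and d1 is public only so that its address can be taken. Present-day C++
requires two distinct subobjects of the same type to have different
addresses, so a conforming compiler should report false here; that rule
was not settled when this was written.

```cpp
#include <cstddef>

class B {};

class D1 : public B {};

class D2 : public B {
public:
    D1 d1;   // public only so that the sketch can take its address
};

// Compare the two B subobjects of d2 using standard conversions only.
bool b_subobjects_coincide(D2& d2) {
    B* outer = &d2;      // B base of d2 itself
    B* inner = &d2.d1;   // B base of the member d2.d1
    return outer == inner;
}
```

To keep the two B subobjects apart, common compilers place d1 one byte
into d2 instead of at offset zero, so sizeof(D2) usually comes out
larger than sizeof(B).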

How could this cause a problem in real code?  Say the B constructor
stores `this' in a static Set<B*> to keep track of all "live" B
objects.  If B::~B() removes `this' from the set, we would get an
error when d2 is destroyed because the destructors for both d2 and
d2.d1 would try to remove the same B* address from the set.
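
The registry scenario can be sketched as follows, using std::set in
place of the Set<B*> mentioned above. The collisions() counter and the
helper live_count_during_d2() are assumptions added for the
illustration. Copying is disabled to keep the bookkeeping simple.

```cpp
#include <cstddef>
#include <set>

class B {
public:
    // Register on construction; count it if the address is already present.
    B()  { if (!live().insert(this).second) ++collisions(); }
    // Unregister on destruction; count it if the address was not there.
    ~B() { if (live().erase(this) != 1) ++collisions(); }

    B(const B&) = delete;             // copying is left out of this sketch
    B& operator=(const B&) = delete;

    static std::set<B*>& live()  { static std::set<B*> s; return s; }
    static int& collisions()     { static int n = 0; return n; }
};

class D1 : public B {};

class D2 : public B {
public:
    D1 d1;
};

// Create one D2 and return how many B objects are registered while it
// exists; d2 is destroyed again when the function returns.
std::size_t live_count_during_d2() {
    D2 d2;
    return B::live().size();
}
```

If the B part of d2 and the B part of d2.d1 shared an address, the
second insert would find the address already registered and the second
erase would find nothing to remove, so collisions() would end up
nonzero. With distinct addresses the function returns 2 and the counter
stays at zero.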

Yes, B would usually have a virtual function or destructor and because
of the vtable pointer the compiler could not optimize away the
object's size, but a vtable pointer is not the only way for a C++
compiler to implement polymorphism.  Has ANSI/ISO looked into this
problem?

Jamshid Afshar
jamshid@ses.com