Thread

Topic: Downcasting from virtual base class

Author: jvsb@ra.alcbel.be (Johan Vanslembrouck)
Date: 25 Nov 93 14:41:39 GMT

In comp.lang.c++ <1993Nov24.161840.23856@cs.wisc.edu>,
James Larus writes:

> Actually, it is simple (though a bit messy) to downcast a virtual base
> class to a derived class.  The fact that the code below works is a
> strong indication that this restriction is unnecessary and should be
> removed from the language.

I agree that casting a virtual base class to a derived class should be
possible. The reason why it is not possible at this moment is because
of the (IMHO) rather counter-intuitive organisation of the "object
parts" when virtual base classes are involved (cfr. ARM p.225-227).

Consider the following nonvirtual inheritance lattice:

class L { ... };
class A : public L { ... };
class B : public L { ... };
class C : public A, public B { ... };

The memory layout usually looks like this (which is rather intuitive,
I think):

      ----------------
     |  A's L part    |
     |                |
     |----------------|
     |    A part      |
     |                |
     |----------------|
     |  B's L part    |
     |                |
     |----------------|
     |    B part      |
     |                |
     |----------------|
     |    C part      |
     |                |
      ----------------
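
A minimal sketch of the nonvirtual lattice above, assuming each class holds
a single int (the member and function names are illustrative, not from the
post). It shows that a C object carries two distinct L parts, and that the
cast from an L part back to its A part is a fixed offset the compiler can
compute, so static_cast is accepted:

```cpp
struct L { int l; };
struct A : L { int a; };
struct B : L { int b; };
struct C : A, B { int c; };

// True when the L reached through A and the L reached through B are two
// different subobjects of the same C object.
bool has_two_L_parts(C& obj) {
    L* via_a = static_cast<A*>(&obj);   // A's L part
    L* via_b = static_cast<B*>(&obj);   // B's L part
    return via_a != via_b;
}

// With nonvirtual bases the downcast is a compile-time offset adjustment.
A* back_to_A(L* lp) { return static_cast<A*>(lp); }
```
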

Using virtual inheritance ...

class L { ... };
class A : virtual public L { ... };
class B : virtual public L { ... };
class C : public A, public B { ... };

... however, the following layout is used:

      ----------------
     |    A part      |
     |                |
     |      ----------------------
     |----------------|           |
     |    B part      |           |
     |                |           |
     |       ---------------------|
     |----------------|           |
     |    C part      |           |
     |                |           |
     |----------------|           |
     |   A+B's L part |<----------
     |                |
      ---------------

The common L part has moved from the top to the bottom of the
memory layout. Calculating the offset of a derived class part given
a pointer to the virtual base class part is considered too difficult
by the ARM (p.227: "Casting from a virtual base class to a derived
class is disallowed to avoid requiring an implementation to maintain
pointers to enclosing objects").
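
A minimal sketch of the virtual lattice, again assuming one int per class
(names are illustrative). It shows that A and B share a single L part, that
the upcast to L is always available, and that the reverse static_cast is
rejected by the compiler:

```cpp
struct L { int l; };
struct A : virtual L { int a; };
struct B : virtual L { int b; };
struct C : A, B { int c; };

// True when A and B share one L subobject inside the C object.
bool shares_one_L_part(C& obj) {
    L* via_a = static_cast<A*>(&obj);
    L* via_b = static_cast<B*>(&obj);
    return via_a == via_b;
}

// Upcasting to the virtual base is always allowed.
L* to_L(C& obj) { return &obj; }

// The reverse direction does not compile, because L is a virtual base:
//   C* cp = static_cast<C*>(to_L(obj));
```
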

Another disadvantage of this approach is that it is impossible to
create a derived class object at the location of a virtual base class
object *without* explicitly copying the contents of the base class
object; the virtual base class data members are moved to another
location.

This is my alternative, which, I believe, is more intuitive and might
make downcasting easier. The common L part is again at the top,
and a pointer from the derived class part to the base class part
is inserted BEFORE the derived class part.

      -----------------
     |   A+B's L part |<----------
     |                |           |
     |----------------|           |
     |       ---------------------|
     |    A part      |           |
     |                |           |
     |----------------|           |
     |       ---------------------
     |    B part      |
     |                |
     |----------------|
     |    C part      |
     |                |
      ----------------

I would use the same mechanism for nonvirtual inheritance, like in:

      -----------------
     |   A's L part   |<----------
     |                |           |
     |----------------|           |
     |       ---------------------
     |    A part      |
     |                |
     |----------------|
     |   B's L part   |<----------
     |                |           |
     |----------------|           |
     |       ---------------------
     |    B part      |
     |                |
     |----------------|
     |    C part      |
     |                |
      ----------------

To be complete, I would also introduce pointers at the beginning of
each derived class part to the immediate base class part.
The layout for the nonvirtual inheritance case now looks as follows:

      -----------------
     |   A's L part   |<---------- <------
     |                |           |       |
     |----------------|           |       |
     |       ---------------------        |
     |    A part      |                   |
     |                |                   |
     |----------------|                   |
     |   B's L part   |<---------- <------|-------
     |                |           |       |       |
     |----------------|           |       |       |
     |       ---------------------        |       |
     |    B part      |                   |       |
     |                |                   |       |
     |----------------|                   |       |
     |       -----------------------------        |
     |       -------------------------------------
     |    C part      |
     |                |
      ----------------

The layout for the virtual inheritance case looks like:

      -----------------
     |  A+B's L part  |<---------- <------
     |                |           |       |
     |----------------|           |       |
     |       ---------------------|       |
     |    A part      |           |       |
     |                |           |       |
     |----------------|           |       |
     |       ---------------------        |
     |    B part      |                   |
     |                |                   |
     |----------------|                   |
     |       -----------------------------|
     |       -----------------------------
     |    C part      |
     |                |
      ----------------

And why not introduce a pointer at the end of each base class part to
the immediate derived class parts? Too much pointer overhead?
Only in case of very small classes, I think.
An advantage of this approach could be that an implementation
could choose to locate the different parts at different places in
memory instead of using contiguous memory.

-----------------------------------------------------------------------
Johan Vanslembrouck - SE99                 Tel    : +32 3 2407739
Alcatel Bell Telephone                     Telex  : 72128 Bella B
Francis Wellesplein  1                     Fax    : +32 3 2409932
B-2018 Antwerp                             e-mail : jvsb@ra.alcbel.be
Belgium
-----------------------------------------------------------------------

Author: simon@sco.COM (Simon Tooke)
Date: Fri, 26 Nov 1993 14:24:47 GMT

In <1993Nov25.154139@ra.alcbel.be> jvsb@ra.alcbel.be (Johan Vanslembrouck) writes:


>In comp.lang.c++ <1993Nov24.161840.23856@cs.wisc.edu>,
>James Larus writes:

>> Actually, it is simple (though a bit messy) to downcast a virtual base
>> class to a derived class.  The fact that the code below works is a
>> strong indication that this restriction is unnecessary and should be
>> removed from the language.

>I agree that casting a virtual base class to a derived class should be
>possible. The reason why it is not possible at this moment is because
>of
>the (IMHO) rather counter-intuitive organisation of the "object parts"
>when virtual base classes are involved (cfr. ARM p.225-227).

How a class is laid out internally (for anything beyond simple `C' structs,
"PODS") is an implementation detail.

>Consider the following nonvirtual inheritance lattice:

>class L { ... };
>class A : public L { ... };
>class B : public L { ... };
>class C : public A, public B { ... };

>The memory layout usually looks like this (which is rather intuitive,
>I think):

>      ----------------
>     |  A's L part    |
>     |                |
>     |----------------|
>     |    A part      |
>     |                |
>     |----------------|
>     |  B's L part    |
>     |                |
>     |----------------|
>     |    B part      |
>     |                |
>     |----------------|
>     |    C part      |
>     |                |
>      ----------------



>Using virtual inheritance ...

>class L { ... };
>class A : virtual public L { ... };
>class B : virtual public L { ... };
>class C : public A, public B { ... };

>... however, the following layout is used:

>      ----------------
>     |    A part      |
>     |                |
>     |      ----------------------
>     |----------------|           |
>     |    B part      |           |
>     |                |           |
>     |       ---------------------|
>     |----------------|           |
>     |    C part      |           |
>     |                |           |
>     |----------------|           |
>     |   A+B's L part |<----------
>     |                |
>      ---------------

>The common L part has moved from the top to the bottom of the
>memory layout. Calculating the offset of a derived class part given
>a pointer to the virtual base class part is considered too difficult
>by the ARM
>(p.227: "Casting from a virtual base class to a derived class
>is disallowed to avoid requiring an implementation to maintain
>pointers to enclosing objects").

The problem here is that the compiler would have to know, while compiling the
base class, to insert a pointer to the derived class in the base class.  This
would entail knowing exactly how many classes are going to derive from this
base class AT COMPILE TIME - clearly infeasible, especially if the base class
were a PODS inherited from C or FORTRAN.

>Another disadvantage of this approach is that it is impossible to
>create
>a derived class object at the location of a virtual base class object
>*without* explicitly copying the contents of the base class object;
>the virtual base class data members are moved to another location.

Since there is no guarantee the sizes are the same, this won't work often
anyways.

>This is my alternative, which, I believe, is more intuitive and might
>make downcasting easier. The common L part is again at the top,
>and a pointer from the derived class part to the base class part
>is inserted BEFORE the derived class part.

(Scheme deleted as it doesn't take into account multiple virtual bases)

>I would use the same mechanism for nonvirtual inheritance, like in:

(Scheme with pointers in derived classes pointing to base classes)
This scheme works, but really there is no point, since nothing is gained.


>To be complete, I would also introduce pointers at the beginning of
>each derived class part to the immediate base class part.
>The layout for the nonvirtual inheritance case now looks as follows:

>      -----------------
>     |   A's L part   |<---------- <------
>     |                |           |       |
>     |----------------|           |       |
>     |       ---------------------        |
>     |    A part      |                   |
>     |                |                   |
>     |----------------|                   |
>     |   B's L part   |<---------- <------|-------
>     |                |           |       |       |
>     |----------------|           |       |       |
>     |       ---------------------        |       |
>     |    B part      |                   |       |
>     |                |                   |       |
>     |----------------|                   |       |
>     |       -----------------------------        |
>     |       -------------------------------------
>     |    C part      |
>     |                |
>      ----------------

The first rule of computing: if you need to add a feature, just insert
another level of indirection.  This (and several other items in this doc)
goes against the philosophy of not paying for what you don't use.  This
example could double the size of some class layouts.  It _really_ is more
efficient to only insert internal pointers where they are required.

>And why not introduce a pointer at the end of each base class part to
>the
>immediate derived class parts? Too much pointer overhead?
>Only in case of very small classes, I think.
>An advantage of this approach could be that an implementation
>could choose to locate the different parts at different places in
>memory
>instead of using contiguous memory.

Ah, non-contiguous classes - propose it at the next X3J16/WG21 gathering.
Have to be able to directly represent sparse files somehow.

Okay, now for the punchline: the committee voted in Run-Time Type
Identification just a few meetings ago.  This (in many implementations)
inserts a pointer to a "description" of the class in the class's vtbl
(so you can see this only works for classes with virtual functions).

Also added was a dynamic cast, which, at runtime, interprets this description
and allows downcasting from a virtual base to a derived class.  So, no
overhead is involved (on a per-instance basis) until the downcast takes place.
(This still won't work for ambiguous downcasting cases.)
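
A minimal sketch of what the dynamic cast described above permits, assuming
the lattice from this thread with a virtual destructor added to L so that
the classes are polymorphic. The classes M, X, Y and Z are hypothetical and
only serve to illustrate an ambiguous case, where the cast yields a null
pointer:

```cpp
struct L { virtual ~L() {} int l = 0; };
struct A : virtual L { int a = 0; };
struct B : virtual L { int b = 0; };
struct C : A, B { int c = 0; };

// Downcasts from the virtual base; each yields a null pointer when the
// object is not, or not unambiguously, of the requested type.
B* to_B(L* lp) { return dynamic_cast<B*>(lp); }
C* to_C(L* lp) { return dynamic_cast<C*>(lp); }

// Ambiguous case: Z contains two M subobjects that share one virtual L,
// so "the M that this L belongs to" has no single answer.
struct M : virtual L { int m = 0; };
struct X : M {};
struct Y : M {};
struct Z : X, Y {};

M* to_M(L* lp) { return dynamic_cast<M*>(lp); }
```

The per-object cost is the vtbl pointer the classes already carry; the
lookup itself happens only when one of the casts runs.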

-simon tooke



===============================================================================
Simon Tooke  (not speaking for) SCO Canada, Inc.         Voice:  (416) 922-1937
....!scocan!simon             simon@sco.com                Fax:  (416) 922-2704
130 Bloor St. West. Suite 1001, Toronto, Ontario, Canada  M5S 1N5

Author: sdm@cs.brown.edu (Scott Meyers)
Date: Fri, 26 Nov 1993 14:45:29 GMT

In article <1993Nov26.142447.2591@sco.COM> simon@sco.COM (Simon Tooke) writes:
| >An advantage of this approach could be that an implementation
| >could choose to locate the different parts at different places in
| >memory
| >instead of using contiguous memory.
|
| Ah, non-contiguous classes - propose it at the next X3J16/WG21 gathering.
| Have to be able to directly represent sparse files somehow.

I used to think that an implementation was free to allocate the space for a
class in non-contiguous memory (is that really a term?), especially for
classes containing virtual bases, but then it dawned on me that it would be
very difficult to write operator new for such classes, because it's defined
to return a pointer to memory at least as big as its (single) size_t
parameter.  I've since taken this as an implicit requirement that memory
for an object must be contiguous.  Yes?  No?  Maybe?
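
A minimal sketch of the operator new argument, assuming the virtual lattice
from this thread with one int per class (the variable last_request is
illustrative). A class-specific operator new sees one size_t and hands back
one block, which has to hold every part of the object, the virtual base
included:

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

struct L { int l; };
struct A : virtual L { int a; };
struct B : virtual L { int b; };

// Size of the most recent request made to C::operator new.
static std::size_t last_request = 0;

struct C : A, B {
    int c;
    // One size in, one block out: there is no way to return several pieces.
    static void* operator new(std::size_t n) {
        last_request = n;
        void* p = std::malloc(n);
        if (!p) throw std::bad_alloc();
        return p;
    }
    static void operator delete(void* p) { std::free(p); }
};
```
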

Scott

Author: fjh@munta.cs.mu.OZ.AU (Fergus Henderson)
Date: Sat, 27 Nov 1993 13:52:07 GMT

sdm@cs.brown.edu (Scott Meyers) writes:

>I used to think that an implementation was free to allocate the space for a
>class in non-contiguous memory (is that really a term?), especially for
>classes containing virtual bases, but then it dawned on me that it would be
>very difficult to write operator new for such classes, because it's defined
>to return a pointer to memory at least as big as its (single) size_t
>parameter.  I've since taken this as an implicit requirement that memory
>for an object must be contiguous.  Yes?  No?  Maybe?

No - at best, this would require that an implementation allocate
classes for which an operator new() was defined in contiguous memory.
It wouldn't have any effect on classes which didn't define
operator new().

This is completely academic, given that [as another poster pointed
out] the working paper defines an object as denoting a contiguous
region of storage.

--
Fergus Henderson                     fjh@munta.cs.mu.OZ.AU

Author: simon@sco.COM (Simon Tooke)
Date: Sat, 27 Nov 1993 14:10:31 GMT

In <1993Nov26.144529.1777@cs.brown.edu> sdm@cs.brown.edu (Scott Meyers) writes:

>In article <1993Nov26.142447.2591@sco.COM> simon@sco.COM (Simon Tooke) writes:
>| >An advantage of this approach could be that an implementation
>| >could choose to locate the different parts at different places in
>| >memory
>| >instead of using contiguous memory.
>|
>| Ah, non-contiguous classes - propose it at the next X3J16/WG21 gathering.
>| Have to be able to directly represent sparse files somehow.

>I used to think that an implementation was free to allocate the space for a
>class in non-contiguous memory (is that really a term?), especially for
>classes containing virtual bases, but then it dawned on me that it would be
>very difficult to write operator new for such classes, because it's defined
>to return a pointer to memory at least as big as its (single) size_t
>parameter.  I've since taken this as an implicit requirement that memory
>for an object must be contiguous.  Yes?  No?  Maybe?

>Scott

Section 1.3 of the WP states that each object (except bitfields) occupies
one or more contiguous bytes.
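
A minimal sketch of what the contiguity rule means for the virtual lattice
discussed in this thread, assuming one int per class (the helper name inside
is illustrative). Wherever the implementation places the shared L part, it
still lies within the sizeof(C) bytes of the complete object:

```cpp
#include <cstddef>

struct L { int l; };
struct A : virtual L { int a; };
struct B : virtual L { int b; };
struct C : A, B { int c; };

// True when the storage of `part` (size bytes) lies entirely inside the
// sizeof(C) bytes that start at &obj.
bool inside(const C& obj, const void* part, std::size_t size) {
    const char* begin = reinterpret_cast<const char*>(&obj);
    const char* end   = begin + sizeof(C);
    const char* p     = static_cast<const char*>(part);
    return begin <= p && p + size <= end;
}
```
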

-simon


Author: nikki@trmphrst.demon.co.uk (Nikki Locke)
Date: Fri, 26 Nov 1993 10:46:48 +0000

In article <1993Nov25.154139@ra.alcbel.be> jvsb@ra.alcbel.be (Johan Vanslembrouck) writes:
> I agree that casting a virtual base class to a derived class should be
> possible. The reason why it is not possible at this moment is because
> of
> the (IMHO) rather counter-intuitive organisation of the "object parts"
> when virtual base classes are involved (cfr. ARM p.225-227).
...
> Using virtual inheritance ...
>
> class L { ... };
> class A : virtual public L { ... };
> class B : virtual public L { ... };
> class C : public A, public B { ... };

Your suggested layout ...

> The layout for the virtual inheritance case looks like:
>
>       -----------------
>      |  A+B's L part  |<---------- <------
>      |                |           |       |
>      |----------------|           |       |
>      |       ---------------------|       |
>      |    A part      |           |       |
>      |                |           |       |
>      |----------------|           |       |
>      |       ---------------------        |
>      |    B part      |                   |
>      |                |                   |
>      |----------------|                   |
>      |       -----------------------------|
>      |       -----------------------------
>      |    C part      |
>      |                |
>       ----------------

OK, now let's introduce a new class

class D : public B, public A { ... }; // Different order -> different layout

and the following code ...

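void function(int i)
{
  C c;
  D d;
  // Point at the shared L part of either object; the explicit casts give
  // both branches of ?: the same type.
  L *l = i ? (L *)&c : (L *)&d;
  B* b = (B *)l;  // Cast virtual base to derived

  // ...
}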

Please explain how your layout would enable the compiler to output code
for the last (commented) line.
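
A minimal sketch of why that line is hard to compile, assuming one int per
class and today's usual layouts rather than the proposed one (the function
name is illustrative). It measures the distance from the L part to the B
part in a C and in a D. On common implementations the two distances differ,
so no single constant could be added to an L* whose dynamic type is unknown:

```cpp
#include <cstddef>

struct L { int l; };
struct A : virtual L { int a; };
struct B : virtual L { int b; };
struct C : A, B { int c; };
struct D : B, A { int d; };   // bases in the opposite order

// Byte distance from the L part to the B part inside one complete object.
// The value is implementation-specific.
template <class T>
std::ptrdiff_t offset_of_B_from_L(T& obj) {
    L* lp = &obj;
    B* bp = &obj;
    return reinterpret_cast<char*>(bp) - reinterpret_cast<char*>(lp);
}
```
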

--
Nikki Locke,Trumphurst Ltd.(PC and Unix consultancy) nikki@trmphrst.demon.co.uk
trmphrst.demon.co.uk is NOT affiliated with ANY other sites at demon.co.uk.

Author: kanze@us-es.sel.de (James Kanze)
Date: 29 Nov 1993 12:00:11 GMT

In article <1993Nov25.154139@ra.alcbel.be> jvsb@ra.alcbel.be (Johan
Vanslembrouck) writes:

|> In comp.lang.c++ <1993Nov24.161840.23856@cs.wisc.edu>,
|> James Larus writes:

|> > Actually, it is simple (though a bit messy) to downcast a virtual base
|> > class to a derived class.  The fact that the code below works is a
|> > strong indication that this restriction is unnecessary and should be
|> > removed from the language.

|> I agree that casting a virtual base class to a derived class should be
|> possible. The reason why it is not possible at this moment is because
|> of
|> the (IMHO) rather counter-intuitive organisation of the "object parts"
|> when virtual base classes are involved (cfr. ARM p.225-227).

|> Consider the following nonvirtual inheritance lattice:

|> class L { ... };
|> class A : public L { ... };
|> class B : public L { ... };
|> class C : public A, public B { ... };

|> The memory layout usually looks like this (which is rather intuitive,
|> I think):

|>       ----------------
|>      |  A's L part    |
|>      |                |
|>      |----------------|
|>      |    A part      |
|>      |                |
|>      |----------------|
|>      |  B's L part    |
|>      |                |
|>      |----------------|
|>      |    B part      |
|>      |                |
|>      |----------------|
|>      |    C part      |
|>      |                |
|>       ----------------

|> Using virtual inheritance ...

|> class L { ... };
|> class A : virtual public L { ... };
|> class B : virtual public L { ... };
|> class C : public A, public B { ... };

|> ... however, the following layout is used:

|>       ----------------
|>      |    A part      |
|>      |                |
|>      |      ----------------------
|>      |----------------|           |
|>      |    B part      |           |
|>      |                |           |
|>      |       ---------------------|
|>      |----------------|           |
|>      |    C part      |           |
|>      |                |           |
|>      |----------------|           |
|>      |   A+B's L part |<----------
|>      |                |
|>       ---------------

|> The common L part has moved from the top to the bottom of the
|> memory layout. Calculating the offset of a derived class part given
|> a pointer to the virtual base class part is considered too difficult
|> by the ARM
|> (p.227: "Casting from a virtual base class to a derived class
|> is disallowed to avoid requiring an implementation to maintain
|> pointers to enclosing objects").

|> Another disadvantage of this approach is that it is impossible to
|> create
|> a derived class object at the location of a virtual base class object
|> *without* explicitly copying the contents of the base class object;
|> the virtual base class data members are moved to another location.

|> This is my alternative, which, I believe, is more intuitive and might
|> make downcasting easier. The common L part is again at the top,
|> and a pointer from the derived class part to the base class part
|> is inserted BEFORE the derived class part.

|>       -----------------
|>      |   A+B's L part |<----------
|>      |                |           |
|>      |----------------|           |
|>      |       ---------------------|
|>      |    A part      |           |
|>      |                |           |
|>      |----------------|           |
|>      |       ---------------------
|>      |    B part      |
|>      |                |
|>      |----------------|
|>      |    C part      |
|>      |                |
|>       ----------------

This would also be a legal implementation.  But I don't see where it
would help in the downcasting problem.  Given an L* (and no other
declarations except L and B), how can the compiler know where B is
relative to L?

It would also require pointers to members to contain negative values,
which could conceivably cause problems for some systems.

|> I would use the same mechanism for nonvirtual inheritance, like in:

|>       -----------------
|>      |   A's L part   |<----------
|>      |                |           |
|>      |----------------|           |
|>      |       ---------------------
|>      |    A part      |
|>      |                |
|>      |----------------|
|>      |   B's L part   |<----------
|>      |                |           |
|>      |----------------|           |
|>      |       ---------------------
|>      |    B part      |
|>      |                |
|>      |----------------|
|>      |    C part      |
|>      |                |
|>       ----------------

And increase the size of all of the derived classes.

Note that in the above, it is perfectly legal to instantiate an A
alone.  And of course, such an A must be identical to an A which is a
base class of C.

Now I have a number of classes which are just one pointer long (and
contain no virtual functions).  I often use derivation to add to their
functionality.  (This is not really subclassing in an OO sense, but is
very practical none the less.)  Are you proposing to double the size
of my class for this?

While this idea might win votes from memory manufacturers, it is in
direct conflict with the rule: you don't pay for what you don't use.
Of course, it is not forbidden as an implementation.  But what
advantages does it bring in return for its price?
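
A minimal sketch of the cost being discussed, assuming a hypothetical
one-pointer class named Handle. On typical implementations, nonvirtual
derivation that only adds functions leaves the size unchanged, while a
virtual base already makes the object larger. The exact sizes are
implementation-specific:

```cpp
#include <cstddef>

struct Handle { void* p; };                  // one pointer long, no virtuals

struct NamedHandle : Handle {                // derivation that only adds functions
    bool valid() const { return p != nullptr; }
};

struct VirtualHandle : virtual Handle {};    // virtual base needs bookkeeping

std::size_t plain_size()   { return sizeof(Handle); }
std::size_t derived_size() { return sizeof(NamedHandle); }
std::size_t virtual_size() { return sizeof(VirtualHandle); }
```
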

    [Further example layouts deleted...]

|> And why not introduce a pointer at the end of each base class part to
|> the
|> immediate derived class parts?

Principally, I suppose because the compiler cannot know how many
immediately derived class parts it will have.  For that matter, in the
case of virtual base classes, it is not at all clear what "immediately
derived class" means.

|> Too much pointer overhead?
|> Only in case of very small classes, I think.

Not really.  In most big classes, most of the members will themselves
be (smaller) classes.  So the big class pays the price for itself
(admittedly not significant) *and* for all of its members (which are
what makes it big in the first place, very significant).

|> An advantage of this approach could be that an implementation
|> could choose to locate the different parts at different places in
|> memory
|> instead of using contiguous memory.

Hmmm.  And how do you propose implementing pointers to members in this
case?
--
James Kanze                             email: kanze@us-es.sel.de
GABI Software, Sarl., 8 rue du Faisan, F-67000 Strasbourg, France
Conseils en informatique industrielle --
                   -- Beratung in industrieller Datenverarbeitung

Author: jvsb@ra.alcbel.be (Johan Vanslembrouck)
Date: 1 Dec 93 14:35:22 GMT

In <KANZE.93Nov30140939@slsvhdt.us-es.sel.de>, James Kanze writes:

|> On the other hand, you can already do:

|>      B* bp = dynamic_cast< B* >( lp ) ;

|> At least according to the working papers.

|> This solves the problem, at a small runtime expense.  And I suspect
|> that in a typical application, there will be no need for extra
|> information in the object itself.  All of the necessary information
|> will be in the virtual function table.

|> Have you read the proposal for RTTI and dynamic_cast?  It occurs to me
|> that this will probably accomplish everything you want.

Huh ... no. Can you tell me where to find it? But I have seen manually
written code fragments which accomplish dynamic casting.
I certainly do not object to the dynamic cast approach.
The sooner it is available, the better. I only want the class layouts
for virtual and nonvirtual inheritance to be brought into line.

|>|> >Now I have a number of classes which are just one pointer long (and
|>|> >contain no virtual functions).  I often use derivation to add to their
|>|> >functionality.  (This is not really subclassing in an OO sense, but is
|>|> >very practical none the less.)  Are you proposing to double the size
|>|> >of my class for this?

|>|> Why not? This makes me think of a political party that was very happy
|>|> because it doubled its votes. From 1 percent to 2 percent.
|>|> Or, if you have 1 dollar and you get another one, then you have doubled
|>|> your fortune.

|> This is not quite relevant.  As a simple example, my current project
|> requires >250 Megabytes of heapspace.  Not exactly little.  And this,
|> although most of the objects involved are 24 bytes or less.  Of
|> course, the bigger objects are assembled out of these little objects
|> (a typical case, I would think).  Save me 4 bytes for each little
|> object, and I will gain almost 50 Megabytes.  On the other hand, each
|> of these little objects has an inheritance hierarchy of at least 4
|> layers, often more.  Add four pointers to each object, and I would
|> need around 400 Megabytes, instead of 250.

Well, if you had used virtual inheritance frequently, you also would
have had 400 Megabytes.
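
A rough check of the figures in the quoted estimate, assuming every object
is exactly 24 bytes (the upper bound given), pointers are 4 bytes, and
"250 Megabytes" means 250,000,000 bytes. All four constants are assumptions
read off the quoted text:

```cpp
// Assumed inputs, taken from the quoted estimate.
const long long heap_bytes    = 250000000LL;  // ">250 Megabytes"
const long long object_bytes  = 24;           // "24 bytes or less", upper bound
const long long pointer_bytes = 4;            // one pointer per object
const long long layers        = 4;            // "at least 4 layers"

// Number of small objects that fit in the heap.
long long object_count() { return heap_bytes / object_bytes; }

// Heap saved by removing one pointer from every object.
long long bytes_saved() { return object_count() * pointer_bytes; }

// Heap needed after adding one pointer per inheritance layer to every object.
long long bytes_with_pointers() {
    return heap_bytes + object_count() * pointer_bytes * layers;
}
```

Under these assumptions the saving is about 42 Megabytes and the enlarged
heap about 417 Megabytes. Objects smaller than 24 bytes would push both
numbers up.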

|> In my experience, streaming does not have to know the layout.  In
|> fact, it cannot in general use the layout per se; what do you do about
|> classes which allocate memory to store variable length data (a
|> string class, for example)?  Classes which contain pointers (a linked
|> list element, for example)?  Streaming is an entirely different (and
|> very complex) issue.

I have separate streaming functions (to be more exact: streaming classes)
for things such as lists and string classes.
For other classes, the user has to supply a short description, as
data, of the data members. Types, pointers to types, arrays of types
and arrays of pointers to types are currently supported (see example below).
Given this description and knowledge about the class layout,
I can stream any given object.
Unfortunately, the class layout for virtual inheritance is very different
from that for non-virtual inheritance and is not covered yet in my
implementation.

Example: Given a user class:

class Person : public TransObject
{
  ...
  private:
      char            name[20];
      Address*        homeaddress;
      long            age;
      Date            birthday;
} ;

and the following description (which could be generated from the
Person class definition)

STRUCT_DESCRIPTOR_BEGIN(Person)
      FIELD(char,     0,      name,           20,     0)
      FIELD(Address,  1,      homeaddress,    1,      1)
      FIELD(long,     0,      age,            1,      2)
      FIELD(Date,     0,      birthday,       1,      3)
STRUCT_DESCRIPTOR_END(Person)
IMPLEMENT_STRUCT_CLASS(Person)

... I can (un)stream a Person class object.
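
The macros above are not shown in the post, so the following is only a
hypothetical stand-in for the idea: a table of field offsets and sizes that
drives generic streaming. It assumes a simplified Person without the base
class and without the pointer member (pointers need the special handling
the post mentions). FieldDesc, person_fields, stream and unstream are
invented names:

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

struct Date { int day, month, year; };

struct Person {              // simplified, standard-layout stand-in
    char name[20];
    long age;
    Date birthday;
};

struct FieldDesc {
    const char* name;        // field name, for diagnostics only
    std::size_t offset;      // byte offset inside the object
    std::size_t size;        // total size of the field in bytes
};

static const FieldDesc person_fields[] = {
    {"name",     offsetof(Person, name),     sizeof(Person::name)},
    {"age",      offsetof(Person, age),      sizeof(Person::age)},
    {"birthday", offsetof(Person, birthday), sizeof(Person::birthday)},
};

// Copy each described field, in table order, into a byte buffer.
std::vector<unsigned char> stream(const Person& p) {
    std::vector<unsigned char> out;
    const unsigned char* base = reinterpret_cast<const unsigned char*>(&p);
    for (const FieldDesc& f : person_fields)
        out.insert(out.end(), base + f.offset, base + f.offset + f.size);
    return out;
}

// Rebuild a Person from a buffer produced by stream().
Person unstream(const std::vector<unsigned char>& in) {
    Person p;
    std::memset(&p, 0, sizeof p);
    unsigned char* base = reinterpret_cast<unsigned char*>(&p);
    std::size_t pos = 0;
    for (const FieldDesc& f : person_fields) {
        std::memcpy(base + f.offset, in.data() + pos, f.size);
        pos += f.size;
    }
    return p;
}
```

This works because offsetof is well defined for such a simple class. With
virtual bases the offset of a base part depends on the complete object,
which is the gap described above.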

I suppose some OODB implementors use similar mechanisms?

In case a definition like

 Type* var;

does not mean  "pointer to Type" but "pointer to an array of Type"
(like with strings or certain implementations of linked lists),
a specialized streaming class has to be used.

