Thread

Topic: multiple inheritance virtual function implementation?

Author: kanze@gabi-soft.fr (J. Kanze)
Date: 1998/07/18

"Gerhard Fetty" <fetty@ict.tuwien.ac.at> writes:

|>  >b doesn't have to deal with foo, just bar. One way to implement it is to
|>
|>  sorry, my fault. c calls with bar.
|>
|>  >have A's data (including its vtable) at the front, followed by B's data
|>  >(including its vtable), followed by D's data. When the code converts the
|>  >address of a D to the address of an A it doesn't modify the pointer.
|>  >When it converts the address of a D to the address of a B it adds the
|>  >offset of B within D to the pointer, so that the new pointer points to
|>  >the beginning of B's data, which is somewhere inside of the D object.
|>
|>
|>  Ok.
|>  Maybe I have been misunderstood, re-formulation:
|>
|>  Let's have a list of objects of different classes that are '_multiply_'
|>  derived from their base classes. A pointer to a common base class traverses
|>  this list and polymorphically performs an operation on each of the objects.
|>  Ok, that's plain polymorphism, the core of C++. AFAIK, the traversing
|>  base-class pointer does not have to know anything about the derived classes!
|>  This means there must not be any code that determines which sub-classes the
|>  pointer actually points to.
|>
|>  But how can this pointer just by following offsets determine where it can
|>  find its vtable _without_ rtti?

The canonical solution would be for each class -- base or derived -- to
have its own vptr.  In practice, I've never seen a compiler which
wouldn't merge the vptr's when the base class and the derived class had
the same address.  With single inheritance (and the most frequent
implementations), this is always the case.  With multiple inheritance,
there will often be one or more base classes where it is not.

Note that in order for the vptr's to be merged, several things are
necessary: that the classes start at the same address, that the vptr be
situated at the same offset within the class, and that the common
functions (those present in the base class) be at the same offsets in
the vtbl.  Consider the following:

    struct B1 { virtual void f() ; } ;
    struct B2 { virtual void g() ; } ;
    struct D : B1 , B2 {} ;

Obviously, f() will be at the first slot in B1's vtbl, and g() at the
first slot in B2's.  Just as obviously, both f() and g() cannot be at
the first slot in D's vtbl, so the vptr for D cannot be used for one of
the base classes.  A typical layout of the above class would have a
common vptr for B1 and D, followed by the vptr for B2. The vtbl pointed
to by the first vptr would contain two entries: one for f(), followed by
one for g().  In addition, the one for g() would contain information to
tell the program to add the size of a pointer (the offset of the B2
sub-object within D) to the this pointer before actually calling the
function.  The second vptr would point to a vtbl
containing only g() (with no fix-up).  If I redefine g() in D, however,
the second vtbl would specify the fix-up, and not the first, since the
address of this in D::g() is not the same as the address of the B2
sub-object.
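
A minimal sketch of the last case, assuming a C++11 compiler.  The
classes are the B1, B2 and D from above, except that D overrides g();
the member seen_this and the two helper functions are hypothetical
names added only for illustration.  It shows that a call through a B2*
still hands D::g() the address of the complete D object, whatever
mechanism the compiler uses for the fix-up.

```cpp
struct B1 { virtual void f() {} virtual ~B1() {} };
struct B2 { virtual void g() {} virtual ~B2() {} };

struct D : B1, B2 {
    const void* seen_this = nullptr;   // records the this that D::g() received
    void g() override { seen_this = this; }
};

// Calls g() through a B2* and reports whether D::g() saw the address of
// the complete D object, i.e. whether the fix-up was applied.
bool fixup_restores_d(D& d) {
    B2* pb2 = &d;      // implicit conversion; may adjust the address
    pb2->g();          // virtual call through the B2 sub-object's vptr
    return d.seen_this == static_cast<const void*>(&d);
}

// Reports whether the B2 sub-object lives at a different address than D.
// The language does not promise this; it is merely what common layouts do.
bool b2_is_displaced(D& d) {
    return static_cast<void*>(static_cast<B2*>(&d)) != static_cast<void*>(&d);
}
```

fixup_restores_d() has to return true on any conforming compiler.  The
result of b2_is_displaced() depends on the layout the implementation
chooses.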

There are at least two widespread techniques for specifying the fix-up:
either the vtbl contains two-element structures, each with a pointer to
the physical function and the offset to be added to the this pointer, or,
if a fix-up is needed, the function pointer in fact points to a
trampoline: a small piece of code which does the fix-up, then jumps
to the actual function.  Note that since the fix-up for a call through
B2's vtbl is not the same as that through D's (or B1's) virtual table,
this is an additional reason why the pointers cannot be merged.
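
The first technique can be modelled by hand.  What follows is only a
sketch under stated assumptions, not what any particular compiler
emits: Entry, B1Part, B2Part, DObj, call() and the k... tables are all
hypothetical names, plain structs stand in for the class layout, and
the pointer arithmetic relies on DObj being a standard-layout struct.

```cpp
#include <cstddef>

// One vtbl slot: the "physical" function plus the delta to add to this.
struct Entry {
    int (*fn)(void* self);
    std::ptrdiff_t delta;
};

struct B1Part { const Entry* vptr; int b1_data; };
struct B2Part { const Entry* vptr; int b2_data; };
struct DObj   { B1Part b1; B2Part b2; int d_data; };   // B1 first, then B2

const std::ptrdiff_t kB2Offset = static_cast<std::ptrdiff_t>(offsetof(DObj, b2));

int B1_f(void* self) { return static_cast<B1Part*>(self)->b1_data; }
int B2_g(void* self) { return static_cast<B2Part*>(self)->b2_data; }
int D_g (void* self) { return static_cast<DObj*>(self)->d_data; }

// D does not override g(): the fix-up sits in the first (B1/D) table.
const Entry kPrimaryPlain[]   = { {B1_f, 0}, {B2_g, +kB2Offset} };
const Entry kSecondaryPlain[] = { {B2_g, 0} };

// D overrides g(): the fix-up moves to the second (B2) table.
const Entry kPrimaryOver[]    = { {B1_f, 0}, {D_g, 0} };
const Entry kSecondaryOver[]  = { {D_g, -kB2Offset} };

// Simulated virtual call: fetch the slot, adjust the pointer, then call.
int call(void* object, const Entry* vtbl, int slot) {
    const Entry& e = vtbl[slot];
    return e.fn(static_cast<char*>(object) + e.delta);
}

DObj make(bool overrides_g) {
    DObj d;
    d.b1.vptr = overrides_g ? kPrimaryOver   : kPrimaryPlain;
    d.b2.vptr = overrides_g ? kSecondaryOver : kSecondaryPlain;
    d.b1.b1_data = 1; d.b2.b2_data = 2; d.d_data = 3;
    return d;
}

int g_via_d (DObj& d) { return call(&d,    d.b1.vptr, 1); }  // like pd->g()
int g_via_b2(DObj& d) { return call(&d.b2, d.b2.vptr, 0); }  // like pb2->g()
```

Either way the caller never asks which class it is really dealing with:
it fetches a slot at a fixed index and applies whatever delta the table
says.  A trampoline does the same adjustment in a small piece of code
instead of storing the delta as data.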

Now add virtual bases, and it becomes really complex.  See the ARM for
some example implementations.

--
James Kanze    +33 (0)1 39 23 84 71    mailto: kanze@gabi-soft.fr
GABI Software, 22 rue Jacques-Lemercier, 78000 Versailles, France
Conseils en informatique orientée objet --
              -- Beratung in objektorientierter Datenverarbeitung
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: sbnaran@bardeen.ceg.uiuc.edu (Siemel Naran)
Date: 1998/07/18

>But how can this pointer just by following offsets determine where it can
>find its vtable _without_ rtti? I have the impression that somewhere there
>must internally be compiler-generated something like
>
>switch(class(*this)) {
>case baseclass a: this=this+offset_for_base_a; break;
>case baseclass b: this=this+offset_for_base_b; break;
>...
>}
>this->vtable(function);

It's really all up to the implementation.  But ...

All this information can be encapsulated in a virtual table.  This is how
I think it works.  Suppose C derives from B derives from A, and all three
classes define virtual functions "Foo foo(int)" and "Bar bar(double)".
Suppose the class B additionally defines a virtual function "void f()".
Note that class C inherits this virtual function.

For an object of ultimate derived type A, the compiler makes a virtual
table, which is basically an array or struct, consisting of two items:
vtable[0]=&A::foo; // type is   Foo (A::*)(int);
vtable[1]=&A::bar; // type is   Bar (A::*)(double);

For an object of ultimate derived type B, the compiler makes a virtual
table, consisting of three items:
vtable[0]=&B::foo; // type is   Foo  (B::*)(int);
vtable[1]=&B::bar; // type is   Bar  (B::*)(double);
vtable[2]=&B::f  ; // type is   void (B::*)();

For an object of ultimate derived type C, the compiler makes a virtual
table, consisting of three items.  Details omitted.
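
To make the tables above concrete, here is a hand-built sketch of the
idea.  It is an illustration only: Object, Slot, the kVtable... arrays
and the call_... helpers are hypothetical names, and the signatures are
simplified (every slot takes the object and returns a string) so that
one function-pointer type fits all slots.

```cpp
#include <string>

struct Object;                                    // hypothetical object header
typedef std::string (*Slot)(const Object&);
struct Object { const Slot* vptr; };              // every object carries a vptr

std::string A_foo(const Object&) { return "A::foo"; }
std::string A_bar(const Object&) { return "A::bar"; }
std::string B_foo(const Object&) { return "B::foo"; }
std::string B_bar(const Object&) { return "B::bar"; }
std::string B_f  (const Object&) { return "B::f"; }

// One table per class; B's table starts with the same slots as A's,
// in the same order, and appends the new function f.
const Slot kVtableA[] = { A_foo, A_bar };
const Slot kVtableB[] = { B_foo, B_bar, B_f };

enum { SLOT_FOO = 0, SLOT_BAR = 1, SLOT_F = 2 };

// Code that only knows about A uses the fixed slot numbers.
std::string call_foo(const Object& o) { return o.vptr[SLOT_FOO](o); }
std::string call_bar(const Object& o) { return o.vptr[SLOT_BAR](o); }
```

An object whose vptr was set to kVtableB answers call_foo() with
"B::foo" even though call_foo() was written without any knowledge of B.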

You know the ultimate derived type of an object when you create it.  And
the program sets the vptr correctly -- i.e. it will point to either one of
the three tables above.  Then when you call a virtual function through a
base class pointer or reference, the program just fetches the appropriate
address of the function from the virtual table, and then calls the
function.  Note that the type checking is done at compile time, so if you
used a reinterpret_cast to fool the computer about types, you're probably
going to f__k things up.
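
As a sketch of the original question (a list of multiply derived
objects traversed through one common base class pointer), assuming
C++11: the classes Other, Base, X and Y and the functions visit_all()
and demo() are made-up names.  Base is the second base of X and the
first base of Y, so the two objects would typically need different
pointer adjustments.

```cpp
#include <string>
#include <vector>

struct Other { virtual int id() const { return 0; } virtual ~Other() {} };
struct Base  { virtual std::string who() const = 0; virtual ~Base() {} };

struct X : Other, Base { std::string who() const override { return "X"; } };
struct Y : Base, Other { std::string who() const override { return "Y"; } };

// The traversal knows only Base: no switch on the dynamic type anywhere.
std::string visit_all(const std::vector<const Base*>& items) {
    std::string out;
    for (const Base* p : items) out += p->who();
    return out;
}

std::string demo() {
    X x;
    Y y;
    std::vector<const Base*> items;
    items.push_back(&x);   // the static type X is known here, so the
    items.push_back(&y);   // compiler can apply a fixed offset per class
    return visit_all(items);
}
```

The adjustment happens where the derived-to-base conversion is written,
and there the static type is known.  After that each pointer refers to
a Base sub-object with its own vptr, so the loop does not need to know
what surrounds it.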

For multiple inheritance, the most logical thing seems to be for the
compiler to make two virtual tables.  If X inherits from D1 and D2, then
there will be two virtual tables -- one for the D1 part of X, and the other
for the D2 part of X.  Now, if we define additional functions in class X,
which virtual table gets these additional functions?  Both.

Consider this program.


#include <iostream>
using namespace std;


//#define I_PROMISE_NEVER_TO_WRITE_CODE_LIKE_THIS


struct B1 { virtual void foo() const { cout<<"B1::foo()\n"; } };
struct D1 : B1 {    void foo() const { cout<<"D1::foo()\n"; } };

struct B2 { virtual void bar() const { cout<<"B2::bar()\n"; } };
struct D2 : B2 {    void bar() const { cout<<"D2::bar()\n"; } };

struct X : D1,D2 { void foo() const { cout<<"X::foo()\n"; }
                   void bar() const { cout<<"X::bar()\n"; } };


void virtual_B1(const B1& b1)
{
     b1.foo();
#if defined(I_PROMISE_NEVER_TO_WRITE_CODE_LIKE_THIS)
     const B2& b2 = (const B2&)b1;
     b2.bar(); // LINE1: does X::bar() or X::foo() get called?
#endif
     cout << endl;
}

void virtual_B2(const B2& b2)
{
     b2.bar();
#if defined(I_PROMISE_NEVER_TO_WRITE_CODE_LIKE_THIS)
     const B1& b1 = (const B1&)b2;
     b1.foo(); // LINE2: similar question
#endif
     cout << endl;
}

void slicing_B1(B1 b1);
void slicing_B2(B2 b1);


int main(int argc, char **argv)
{
     cout << "sizeof(B1)=" << sizeof(B1) << endl;
     cout << "sizeof(D1)=" << sizeof(D1) << endl;
     cout << "sizeof(B2)=" << sizeof(B2) << endl;
     cout << "sizeof(D2)=" << sizeof(D2) << endl;
     cout << "sizeof(X )=" << sizeof(X ) << endl;
     cout << endl;

     X x;
     virtual_B1(x);
     virtual_B2(x);
     cout << endl;
}



What are your results?  For me (Linux g++ 2.7.2.3), with
I_PROMISE_NEVER_TO_WRITE_CODE_LIKE_THIS defined, the results are:

sizeof(B1)=4
sizeof(D1)=4
sizeof(B2)=4
sizeof(D2)=4
sizeof(X )=8

X::foo()
X::foo()

X::bar()
X::bar()


It should be pointed out that not all pointers have the same size!  See
thread with subject "sizeof pointer" on this newsgroup.  So be cautious
in the interpretation of your results in the 'sizeof' section of the
program.


Now try the following evil experiment.  Give class B2 an extra virtual
function before function bar().

struct B2 { virtual void something() { }
            virtual void bar() const { cout<<"B2::bar()\n"; } };

Then LINE1, as marked above, causes a segmentation fault because it
accesses a virtual table of B1 as if it were a virtual table of B2.  In
other words, it tries to fetch vtable[1], the second element in the array
'vtable', whereas the vtable has only one element.
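
For contrast, here is a sketch of how to get from a B1 reference to the
B2 part of the same object without lying to the compiler, assuming a
C++11 compiler with RTTI enabled.  The classes are simplified versions
of the ones above (X derives directly from B1 and B2, and the functions
return strings instead of printing); bar_via_b1() is a hypothetical
helper.

```cpp
#include <string>

struct B1 { virtual std::string foo() const { return "B1::foo"; } virtual ~B1() {} };
struct B2 { virtual std::string bar() const { return "B2::bar"; } virtual ~B2() {} };

struct X : B1, B2 {
    std::string foo() const override { return "X::foo"; }
    std::string bar() const override { return "X::bar"; }
};

// Returns the result of bar() if the object behind b1 also has a B2
// part, and an empty string otherwise.
std::string bar_via_b1(const B1& b1) {
    // Cross-cast: checked at run time, with the address adjusted as needed.
    const B2* b2 = dynamic_cast<const B2*>(&b1);
    return b2 ? b2->bar() : std::string();
}
```

Unlike the C-style cast at LINE1, which merely reinterprets the address
when B1 and B2 are unrelated, dynamic_cast finds the B2 sub-object if
there is one and yields a null pointer if there is not.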


Anyway, all this is implementation specific.


Finally, good style says that classes with virtual functions should
always have virtual destructors.  And virtual functions should usually
not be defined inline as they will usually be called by address anyway.
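
A small sketch of why the virtual destructor matters, assuming C++11.
Base, Derived and destroy_through_base() are hypothetical names, and
the trace string exists only to record which destructors ran.

```cpp
#include <string>

struct Base {
    explicit Base(std::string* trace) : trace_(trace) {}
    virtual ~Base() { *trace_ += "~Base "; }
protected:
    std::string* trace_;   // where the destructors record themselves
};

struct Derived : Base {
    explicit Derived(std::string* trace) : Base(trace) {}
    ~Derived() override { *trace_ += "~Derived "; }
};

// Destroys a Derived through a Base* and returns the destructor trace.
std::string destroy_through_base() {
    std::string trace;
    Base* p = new Derived(&trace);
    delete p;              // virtual destructor: ~Derived runs, then ~Base
    return trace;
}
```

Without the virtual keyword on ~Base(), the delete through a Base*
would have undefined behaviour; in practice ~Derived() is often simply
skipped.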


--
----------------------------------
Siemel B. Naran (sbnaran@uiuc.edu)
----------------------------------



Author: Pete Becker <petebecker@acm.org>
Date: 1998/07/15

Gerhard Fetty wrote:
>
>
> class B {
> public:
>     int bar(int);
> ...};
> class D: public A, public B {
> ...};
> ...
> D d;
> B* b=&d;
> C* c=&d;
> b->bar(0);
> c->foo(0);
>
> But how can 'b' determine the address of the functions 'foo()' and 'bar()'
> in a similar way as above? Is the internal address of 'b' and 'c' the same
> as 'd'? How are offsets for v-pointer of 'b' and 'c' computed, resp.?

b doesn't have to deal with foo, just bar. One way to implement it is to
have A's data (including its vtable) at the front, followed by B's data
(including its vtable), followed by D's data. When the code converts the
address of a D to the address of an A it doesn't modify the pointer.
When it converts the address of a D to the address of a B it adds the
offset of B within D to the pointer, so that the new pointer points to
the beginning of B's data, which is somewhere inside of the D object.
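
A sketch of these conversions, assuming C++11.  A, B and D follow the
names in the question, with both bases given a virtual function and a
data member; the helper functions are hypothetical.  The actual offsets
are up to the implementation, so the code only measures them.

```cpp
#include <cstddef>

struct A { virtual int foo(int x) { return x; }  virtual ~A() {} int a_data = 0; };
struct B { virtual int bar(int x) { return -x; } virtual ~B() {} int b_data = 0; };
struct D : A, B { int d_data = 0; };

// Byte distance from the start of the D object to its A sub-object.
std::ptrdiff_t offset_of_a(D& d) {
    return reinterpret_cast<char*>(static_cast<A*>(&d)) -
           reinterpret_cast<char*>(&d);
}

// Byte distance from the start of the D object to its B sub-object.
std::ptrdiff_t offset_of_b(D& d) {
    return reinterpret_cast<char*>(static_cast<B*>(&d)) -
           reinterpret_cast<char*>(&d);
}

// Converting D* to B* and back yields the original address again.
bool round_trip(D& d) {
    B* pb = &d;                        // adds the offset of B within D
    return static_cast<D*>(pb) == &d;  // subtracts it again
}

// A null D* converts to a null B*; no offset is added to it.
bool null_stays_null() {
    D* pd = nullptr;
    B* pb = pd;
    return pb == nullptr;
}
```

With the layout described above, offset_of_a() would be zero and
offset_of_b() would be the size of the A part, but a compiler is free
to arrange things differently.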

--
Pete Becker
Dinkumware, Ltd.
http://www.dinkumware.com

Author: "Gerhard Fetty" <fetty@ict.tuwien.ac.at>
Date: 1998/07/15

>b doesn't have to deal with foo, just bar. One way to implement it is to

sorry, my fault. c calls with bar.

>have A's data (including its vtable) at the front, followed by B's data
>(including its vtable), followed by D's data. When the code converts the
>address of a D to the address of an A it doesn't modify the pointer.
>When it converts the address of a D to the address of a B it adds the
>offset of B within D to the pointer, so that the new pointer points to
>the beginning of B's data, which is somewhere inside of the D object.


Ok.
Maybe I have been misunderstood, re-formulation:

Let's have a list of objects of different classes that are '_multiply_'
derived from their base classes. A pointer to a common base class traverses
this list and polymorphically performs an operation on each of the objects.
Ok, that's plain polymorphism, the core of C++. AFAIK, the traversing
base-class pointer does not have to know anything about the derived classes!
This means there must not be any code that determines which sub-classes the
pointer actually points to.

But how can this pointer just by following offsets determine where it can
find its vtable _without_ rtti? I have the impression that somewhere there
must internally be compiler-generated something like

switch(class(*this)) {
case baseclass a: this=this+offset_for_base_a; break;
case baseclass b: this=this+offset_for_base_b; break;
...
}
this->vtable(function);

but that is exactly what oo has always claimed to avoid!

--gerhard




Author: David R Tribble <david.tribble@noSPAM.central.beasys.com>
Date: 1998/07/15

Gerhard Fetty <Gerhard.Fetty@tuwien.ac.at> wrote:
>> If this has been already answered anywhere, please point out where. I
>> haven't found it.

Barry Margolin wrote:
> Section 10.8c of the ARM, titled "Multiple Inheritance and Virtual
> Functions".

And the entire chapter 12 of Stroustrup's "Design and Evolution
of C++" (1994, Addison-Wesley, ISBN 0-201-54330-3).


Author: "Gerhard Fetty" <fetty@ict.tuwien.ac.at>
Date: 1998/07/14

If this has been already answered anywhere, please point out where. I
haven't found it.

Consider the following code snippet:

class A {
public:
    virtual int foo(int);
...};
class C: public A {
...};
...
C c;
A* a=&c;
a->foo(0);

Polymorphism is achieved by the use of a v-pointer in each object and a
v-table for each class. If you have a pointer to an object, a virtual
function can be called like this:

The pointer 'a' is dereferenced. Now we are at the address where the memory
layout of 'c' starts. At a known offset for the class 'C' we find its
v-pointer which in turn is dereferenced and leads us to the v-table. Here
again at a certain offset we find the address of the virtual function 'foo',
which is finally invoked.

Every derived class expands the memory layout of its base class by its
additional member variables and its own v-pointer. So base class pointers
can easily point to derived objects because they would only access the first
n bytes of memory after the address of the pointer.

So far, so good. This theoretical 'thought model' serves me quite well in the
case of single inheritance. How can one imagine the memory layout for
multiple inheritance without having tedious run-time type-checks? If we
additionally have code like this:

class B {
public:
    int bar(int);
...};
class D: public A, public B {
...};
...
D d;
B* b=&d;
C* c=&d;
b->bar(0);
c->foo(0);

But how can 'b' determine the address of the functions 'foo()' and 'bar()'
in a similar way as above? Is the internal address of 'b' and 'c' the same
as 'd'? How are offsets for v-pointer of 'b' and 'c' computed, resp.?

Thanks,

--gerhard




Author: Barry Margolin <barmar@bbnplanet.com>
Date: 1998/07/14

In article <6of5uh$bua$1@news.tuwien.ac.at>,
Gerhard Fetty <Gerhard.Fetty@tuwien.ac.at> wrote:
>If this has been already answered anywhere, please point out where. I
>haven't found it.

Section 10.8c of the ARM, titled "Multiple Inheritance and Virtual
Functions".

--
Barry Margolin, barmar@bbnplanet.com
GTE Internetworking, Powered by BBN, Cambridge, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
