Thread

Topic: address of the class member

Author: Darron Shaffer <darron.shaffer@beasys.com>
Date: 1998/08/24 Raw View

AllanW@my-dejanews.com writes:

> > AllanW@my-dejanews.com writes:
> >
> > > (what can be used instead of vtables?)
> > >
> > > 1. (function pointers in each object)
> > >
> > > 2. (vtable)
> > >
> > > 3. (dispatch function per virtual function, rtti)
>
> In article <vzayasj1q1t.fsf@dalibm2.beasys.com>,
>   Darron Shaffer <darron.shaffer@beasys.com> wrote:
> >
> > 4. (convert the virtual table into a function)
>
> The dispatch function does something like a switch. But if the
> virtual function identifiers are small integers beginning with 0, we
> can optimize this to a table lookup. And since the table lookup is
> so easy on most CPUs, we can make the dispatch function inline;
> instead of the address of a function, we use the address of a
> lookup table.
>
> In other words, a virtual table.

Yes, though on some theoretical machine it may not optimize to a
vtable, but stay as code.

> > 5. Each distinct class hierarchy has its own separate two-dimensional
> >    array (constructed by the compiler/linker).  The first index is the
> >    class id from RTTI info (a small integer unique within the current
> >    class hierarchy) and the second is the virtual function index.  The
> >    array contains a pointer to the real function to use.
>
> A small integer unique within the class heirarchy gives us the class
> id, which is used to index into the table. But this is no more or less
> efficient than having a pointer in the class which obtains the RTTI
> type, and incidentally the address of a (smaller, one-dimensional)
> table of virtual pointers.
>
> The details vary, but it's still essentially a virtual table, except
> that it's 2D instead 1D -- isn't that true?

Yes, though the space in each object instance might be smaller than a
vptr.  If you have few classes but lots of instances this might make
difference in memory usage.  But this is probably slower than a
traditional vtable.

> > 6. Pointers to objects have extra bits that can be used for a class
> >    id.  Now use one of the other RTTI based methods.
>
> This is interesting. You wouldn't want the extra bits to be low-order,
> though, because you would have to mask them off each time you wanted
> to use them. But if you made them high-order bits, the net effect
> would be to complicate operator new somewhat, by segregating all
> objects into regions of virtual memory, in return for which each of
> those objects would be smaller (no wasted space for a v-table pointer).
>
> But would this work with overridden operator new? Say the programmer
> supplies an operator new to force placement of a MyClass object. If
> MyClass had any virtual functions, you would be in trouble, no?

No.  When you write "Foo * f = new(mystorage) Foo;" the pointer placed
into 'f' would have the magic bits added to the bits in the pointer
'mystorage'.  This would happen outside operator new, in the code that
calls operator new.

For this to be efficient, your virtual memory system would have to
have a nice way of ignoring some (probably upper) bits in addresses.
This *could* be done by duplicating page tables.

Pointer casts would have to always preserve these bits, since they
contain the objects dynamic type.

This will break if the following is legal:


 void * vp = new char[sizeof(Foo)];
 Foo * f = new(vp) Foo;

 f = (Foo *) vp;  // Will this give us a legal pointer
    // to our object?

 f->bar();  // Call a virtual function.

> > 7. When you add 0x8000000 to an object pointer you get the address of
> >    another virtual page that has an array of function pointers (the
> >    vtable).  With proper control of the virtual memory system, the free
> >    store, and non-stack automatic variables this could even be pretty
> >    efficient.
>
> This implies that every object has it's own array of function pointers,
> and that the array wouldn't be allowed to be larger than the object
> itself. Perhaps you meant to add 0x8000000 and then mask off the low
> N bits? If so, this is a variation on item #6 -- either way, you would
> use the object's address to find the object, which implies difficulty
> with placed objects.

Yes, I did assume masking off some lower bits.  This is indeed a
variation on #6.  This scheme could be more efficient, within its
limitations.

As I see it these are:

 1) The user cannot replace operator new().  (because it has to
    cluster allocations of objects with the same dynamic type)

 2) The user cannot use placement new.
    (If they use an address that is not near anything else, you
    can build an appropriate virtual page at the correct
    address.  You can't count on this.)

The easiest way to fix these problems is by converting to scheme #6.

Darron


--
 __  __  _
 _ ) ___ _\   Enterprise Middleware Solutions Darron J. Shaffer
 __) __    \  BEA Systems Inc.   Sr. Software Engineer
              17101 Preston Rd   darron.shaffer@beasys.com
              LB# 115, Ste 260    Voice: (972) 738-6137
              Dallas, TX 75248    Fax:   (972) 738-6111
       http://www.beasys.com


[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: orgh@home.com (orgh)
Date: 1998/08/16 Raw View

Hi,

I'm tring to trace the layout of the Virtual Function
Table and I was wandering whether there is a way that I
can get the address of the class member function?
I was tring to extract it using Pointer-to-Member
operators but didn't work...

thx,
slawek


[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: stephen.clamage@sun.com (Steve Clamage)
Date: 1998/08/16 Raw View

orgh@home.com (orgh) writes:

>I'm tring to trace the layout of the Virtual Function
>Table and I was wandering whether there is a way that I
>can get the address of the class member function?
>I was tring to extract it using Pointer-to-Member
>operators but didn't work...

There is no portable way to get the address of a non-static
member function. The only thing you can do with the address of a
non-static member function is call it in association with
an object of the correct type, or to compare its address to
that of another similar function.  Pointer-to-member is the way
you accomplish those tasks.

For virtual functions, there is an extra level of information
hiding. The point of a virtual function is that you want to
be sure that the correct version gets called.  A pointer to a
virtual function cannot contain the address of the function,
because the function that is to be called is not known until
the point where the function is actually called. At that point,
the compiler looks up the address in the virtual table. (In the
most common implementations, that is. The language definition
doesn't mention virtual tables, because other implementations
are possible.)

There is no portable way to peek inside a pointer to member
or virtual table. Compilers often document the layout, so
that people writing tools like debuggers or other analyzers
of object code can get the information needed. Those details
vary from compiler to compiler, and from release to release
of the same compiler.

--
Steve Clamage, stephen.clamage@sun.com

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: AllanW@my-dejanews.com
Date: 1998/08/18 Raw View

In article <6r7a3p$oob$1@engnews1.eng.sun.com>,
  stephen.clamage@sun.com (Steve Clamage) wrote:
[snip]
> A pointer to a
> virtual function cannot contain the address of the function,
> because the function that is to be called is not known until
> the point where the function is actually called. At that point,
> the compiler looks up the address in the virtual table. (In the
> most common implementations, that is. The language definition
> doesn't mention virtual tables, because other implementations
> are possible.)

Is this true?

I am aware that the standard specifically did not mention virtual
tables, the idea being that any mechanism that works correctly
would satisfy the standard. But are there other methods that
accomplish this goal?

There has to be *some* way to translate a virtual function call
into a real, live function address at run-time.

1. The first way, perhaps "obvious" if one hasn't considered
any level of optimization, would be to put the addresses of the
virtual function in the class data itself. The trouble with
this method is that, for classes with more than a couple of
instanciations, this would waste a lot of space.

2. The second way, which I believe is *FAR* more common, is to
have the class contain one address of a hidden data structure,
which in turn has all the function addresses. The hidden data
structure would only be instanciated once, no matter how many
instances of the class exist. The only cost would be one extra
level of indirection in virtual function calls, which is cheap
enough (one extra memory read, and possibly one extra assembly
instruction depending on CPU). You don't have to call this
hidden data structure a virtual table, of course; but a rose
by any other name...

3. I've been stretching, trying to come up with a third method.
Here's what I came up with: each function with virtual members
has a hidden ID field, perhaps used in RTTI. All virtual
functions have a hidden "dispatch" function. Any call to any
virtual function named foo with the same call signature, in any
class (even classes not related heirarchically), would first
call this "hidden foo" dispatch function. That function would
grab the ID, and use it to index it's own dispatch table to
call the appropriate class member function. The disadvantage
is that these per-function-name tables would take up much more
space than the per-class virtual tables, since they would have
so many unused entries (i.e. std::ios::foo wouldn't have any
entry, because class std::ios doesn't define foo). The
advantage, well perhaps there isn't any.

Again, I was just trying to think of any *legal* alternatives
to virtual tables. It seems to me that number three, above, is
legal but pointless. Does anyone know why someone would want to
do it that way? Is there another implementation that would
actually accomplish something useful?

Not a very important question, I guess; I'm just curious.

--
AllanW@my-dejanews.com is a "Spam Magnet" -- never read.
Please reply in USENET only, sorry.

-----== Posted via Deja News, The Leader in Internet Discussion ==-----
http://www.dejanews.com/rg_mkgrp.xp   Create Your Own Free Member Forum

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: sbnaran@fermi.ceg.uiuc.edu (Siemel Naran)
Date: 1998/08/18 Raw View

On 18 Aug 1998 17:04:14 GMT, AllanW@my-dejanews.com

>2. The second way, which I believe is *FAR* more common, is to
>have the class contain one address of a hidden data structure,
>which in turn has all the function addresses. The hidden data
>structure would only be instanciated once, no matter how many
>instances of the class exist. The only cost would be one extra
>level of indirection in virtual function calls, which is cheap
>enough (one extra memory read, and possibly one extra assembly
>instruction depending on CPU). You don't have to call this
>hidden data structure a virtual table, of course; but a rose
>by any other name...

You're talking about the virtual table, right?  I wonder what
implementations there are besides virtual tables.

What happens when you do
     typedef void (Base::*Fptr)(int,int);
     Fptr fptr=&Base::virtualfunction;

fptr can't be the address of Base::virtualfunction, because it might also
need to be the address of Derived::virtualfunction.  It all depends on
which object you call the function with.
     Base    b; (b.*fptr)(1,2); // fptr is &Base::virtualfunction
     Derived d; (d.*fptr)(1,2); // fptr is &Derived::virtualfunction


So what is fptr?  It could be a integer.  Suppose that the virtual table
of class Base has three virtual functions in it, and the second of these
functions has the following properties:
     // index=1 (2nd element in array)
     name      = "virtualfunction"
     signature = void (Base::*)(int,int)

The virtual table of class Derived must have the same three virtual
functions as class Base, in that order.  And it may add more virtual
functions.
     // index=1 (2nd element in array)
     name      = "virtualfunction"
     signature = void (Derived::*)(int,int)

So after compile time checks, "Fptr fptr=&Base::virtualfunction" should,
in this implementation, become "int fptr=1".  And
"Fptr=&Derived::virtualfunction" would be equivalent.



--
----------------------------------
Siemel B. Naran (sbnaran@uiuc.edu)
----------------------------------


[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: David R Tribble <david.tribble@noSPAM.central.beasys.com>
Date: 1998/08/18 Raw View

stephen.clamage@sun.com (Steve Clamage) wrote:
>> A pointer to a
>> virtual function cannot contain the address of the function,
>> because the function that is to be called is not known until
>> the point where the function is actually called. At that point,
>> the compiler looks up the address in the virtual table. (In the
>> most common implementations, that is. The language definition
>> doesn't mention virtual tables, because other implementations
>> are possible.)

AllanW@my-dejanews.com wrote:
> Is this true?
> There has to be *some* way to translate a virtual function call
> into a real, live function address at run-time.

There is; read Steve's comments again.  A pointer to a member
function (virtual or otherwise) is meaningless as a "pointer" value
until it is associated with an object (of that class type), which
happens at the time that the function is dereferenced and called,
since the function needs a valid 'this' pointer (which is a hidden
paramenter of the function).  Virtual member functions add a
complication to this, which require converting the member pointer
into a "real" pointer to the correct (derived) member function,
which depends on the runtime type of the object being associated
with the function at the time of the call, i.e., the vtable of the
object must be consulted at the time of the call.

It is for these reasons that a member function pointer is not
the same as a "real" function pointer (such as a pointer to a
static or non-member function).  Member pointers are thus typically
implemented as offsets or indices into the vtable; they usually
aren't addresses per se.

If, on the other hand, you simply want to convert a member function
pointer into a "real" function pointer without an object to
associate with the call, I have to ask: Why?

-- David R. Tribble, dtribble@technologist.com --

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Christopher Eltschka <celtschk@physik.tu-muenchen.de>
Date: 1998/08/19 Raw View

AllanW@my-dejanews.com wrote:
>
> In article <6r7a3p$oob$1@engnews1.eng.sun.com>,
>   stephen.clamage@sun.com (Steve Clamage) wrote:
> [snip]
> > A pointer to a
> > virtual function cannot contain the address of the function,
> > because the function that is to be called is not known until
> > the point where the function is actually called. At that point,
> > the compiler looks up the address in the virtual table. (In the
> > most common implementations, that is. The language definition
> > doesn't mention virtual tables, because other implementations
> > are possible.)
>
> Is this true?
>
> I am aware that the standard specifically did not mention virtual
> tables, the idea being that any mechanism that works correctly
> would satisfy the standard. But are there other methods that
> accomplish this goal?
>
> There has to be *some* way to translate a virtual function call
> into a real, live function address at run-time.
>

[... three possible implementations snipped ...]

A fourth possibility could be a compiler which analyses the complete
code. In some cases, it is able to replace the virtual call by a direct
call, since there's only one at that position to be called. In many
cases, there are just a few possible functions, and the compiler can
use a switch on a type id. In the remaining cases, the compiler can use
a map of type ids to functions.
This would of course make an expensive compile, and would not work
too well with dynamic libraries; however it would be a conforming
implementation.

The compiler could even go as far as generating different versions
of the same function depending on the real type (and some advanced
logic for comparing the function pointers), therefore "virtualizing"
about every function which calls a virtual (or "virtualized") function
and then building a static call tree of those many functions. This
would cause code bloat and would not work with dynamic linking, but
it would be conforming, and it would be hard to beat the speed...

[...]

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: AllanW@my-dejanews.com
Date: 1998/08/19 Raw View

stephen.clamage@sun.com (Steve Clamage) wrote:

> >> A pointer to a
> >> virtual function cannot contain the address of the function,
> >> because the function that is to be called is not known until
> >> the point where the function is actually called. At that point,
> >> the compiler looks up the address in the virtual table. (In the
> >> most common implementations, that is. The language definition
> >> doesn't mention virtual tables, because other implementations
> >> are possible.)

dtribble@technologist.com (David R. Tribble) mis-quoted me asking:
> > Is this true?
> > There has to be *some* way to translate a virtual function call
> > into a real, live function address at run-time.

and then Mr. Tribble responded:
> There is; read Steve's comments again.  A pointer to a member
> function (virtual or otherwise) is meaningless as a "pointer" value
> until it is associated with an object (of that class type), which
> happens at the time that the function is dereferenced and called,
> since the function needs a valid 'this' pointer (which is a hidden
> paramenter of the function).  Virtual member functions add a
> complication to this, which require converting the member pointer
> into a "real" pointer to the correct (derived) member function,
> which depends on the runtime type of the object being associated
> with the function at the time of the call, i.e., the vtable of the
> object must be consulted at the time of the call.

> It is for these reasons that a member function pointer is not
> the same as a "real" function pointer (such as a pointer to a
> static or non-member function).  Member pointers are thus typically
> implemented as offsets or indices into the vtable; they usually
> aren't addresses per se.

> If, on the other hand, you simply want to convert a member function
> pointer into a "real" function pointer without an object to
> associate with the call, I have to ask: Why?

Well, this is an excellent answer to a question I didn't ask. I think
this was unintentional; you assumed that my question was simpler than
it really was, and then you mis-quoted me by ignoring a full (but very
small) paragraph between the question and the statement following it.

The question "Is this true" was meant to ask about the assertion by
Steve Clamage and others that "...other implementations are possible"
and NOT about the obviously true assertion that "pointer to a virtual
function cannot contain [just] the address of the function."

The correct quote is:

> Is this true?

> I am aware that the standard specifically did not mention virtual
> tables, the idea being that any mechanism that works correctly
> would satisfy the standard. But are there other methods that
> accomplish this goal?

> There has to be *some* way to translate a virtual function call
> into a real, live function address at run-time...

[I continued by listing the three ways I know of, and asking if
 there was any worth to the alternates, or if there were other
 alternatives I had missed.]

--
AllanW@my-dejanews.com is a "Spam Magnet" -- never read.
Please reply in USENET only, sorry.

-----== Posted via Deja News, The Leader in Internet Discussion ==-----
http://www.dejanews.com/rg_mkgrp.xp   Create Your Own Free Member Forum



[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: AllanW@my-dejanews.com
Date: 1998/08/19 Raw View

In article <35DA9796.697D37BE@physik.tu-muenchen.de>,
  Christopher Eltschka <celtschk@physik.tu-muenchen.de> wrote:
>
> AllanW@my-dejanews.com wrote:
> >
> > In article <6r7a3p$oob$1@engnews1.eng.sun.com>,
> >   stephen.clamage@sun.com (Steve Clamage) wrote:
> > [snip]
> > > A pointer to a
> > > virtual function cannot contain the address of the function,
> > > because the function that is to be called is not known until
> > > the point where the function is actually called. At that point,
> > > the compiler looks up the address in the virtual table. (In the
> > > most common implementations, that is. The language definition
> > > doesn't mention virtual tables, because other implementations
> > > are possible.)
> >
> > Is this true?
> >
> > I am aware that the standard specifically did not mention virtual
> > tables, the idea being that any mechanism that works correctly
> > would satisfy the standard. But are there other methods that
> > accomplish this goal?
> >
> > There has to be *some* way to translate a virtual function call
> > into a real, live function address at run-time.
> >
>
> [... three possible implementations snipped ...]
>
> A fourth possibility could be a compiler which analyses the complete
> code. In some cases, it is able to replace the virtual call by a direct
> call, since there's only one at that position to be called. In many
> cases, there are just a few possible functions, and the compiler can
> use a switch on a type id. In the remaining cases, the compiler can use
> a map of type ids to functions.
> This would of course make an expensive compile, and would not work
> too well with dynamic libraries; however it would be a conforming
> implementation.

You suggest using the switch based on type id as an optimization, which
implies a "general" case to which this is an exception. But it seems to
me that this could be used 100% of the time, if desired.  Presumably the
base class version of the function has the switching code, with some
linker fixup to supply the list of possible type id's and the correct
function addresses, no?

This approximates my third case, in which every virtual function has
a list of type id's and associated function addresses. It's an
improvement because the list of addresses wouldn't be so large, and
would have no unused entries. But would this "optimization" prevent
us from optiminzing out the base class versin of the function, if
we knew it wouldn't be used?

As for the "general case" method, a map of type id's to functions.
But that's almost exactly the same as my third method, isn't it?
Again, every virtual function name/signature combination would have
to have this type id-->function address, meaning that most of the
maps would be nearly empty, which wastes space. And would it execute
any faster than the virtual table method? I think it would be slower.

> The compiler could even go as far as generating different versions
> of the same function depending on the real type (and some advanced
> logic for comparing the function pointers), therefore "virtualizing"
> about every function which calls a virtual (or "virtualized") function
> and then building a static call tree of those many functions. This
> would cause code bloat and would not work with dynamic linking, but
> it would be conforming, and it would be hard to beat the speed...

Sorry, but I don't think I followed you on this one. Where is the
"advanced logic for comparing function pointers?" Please explain this
in more detail.

--
AllanW@my-dejanews.com is a "Spam Magnet" -- never read.
Please reply in USENET only, sorry.

-----== Posted via Deja News, The Leader in Internet Discussion ==-----
http://www.dejanews.com/rg_mkgrp.xp   Create Your Own Free Member Forum

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Christopher Eltschka <celtschk@physik.tu-muenchen.de>
Date: 1998/08/20 Raw View

AllanW@my-dejanews.com wrote:
>
> In article <35DA9796.697D37BE@physik.tu-muenchen.de>,
>   Christopher Eltschka <celtschk@physik.tu-muenchen.de> wrote:
> >
> > AllanW@my-dejanews.com wrote:
> > >
> > > In article <6r7a3p$oob$1@engnews1.eng.sun.com>,
> > >   stephen.clamage@sun.com (Steve Clamage) wrote:
> > > [snip]
> > > > A pointer to a
> > > > virtual function cannot contain the address of the function,
> > > > because the function that is to be called is not known until
> > > > the point where the function is actually called. At that point,
> > > > the compiler looks up the address in the virtual table. (In the
> > > > most common implementations, that is. The language definition
> > > > doesn't mention virtual tables, because other implementations
> > > > are possible.)
> > >
> > > Is this true?
> > >
> > > I am aware that the standard specifically did not mention virtual
> > > tables, the idea being that any mechanism that works correctly
> > > would satisfy the standard. But are there other methods that
> > > accomplish this goal?
> > >
> > > There has to be *some* way to translate a virtual function call
> > > into a real, live function address at run-time.
> > >
> >
> > [... three possible implementations snipped ...]
> >
> > A fourth possibility could be a compiler which analyses the complete
> > code. In some cases, it is able to replace the virtual call by a direct
> > call, since there's only one at that position to be called. In many
> > cases, there are just a few possible functions, and the compiler can
> > use a switch on a type id. In the remaining cases, the compiler can use
> > a map of type ids to functions.
> > This would of course make an expensive compile, and would not work
> > too well with dynamic libraries; however it would be a conforming
> > implementation.
>
> You suggest using the switch based on type id as an optimization, which
> implies a "general" case to which this is an exception. But it seems to
> me that this could be used 100% of the time, if desired.  Presumably the
> base class version of the function has the switching code, with some
> linker fixup to supply the list of possible type id's and the correct
> function addresses, no?
>
> This approximates my third case, in which every virtual function has
> a list of type id's and associated function addresses. It's an
> improvement because the list of addresses wouldn't be so large, and
> would have no unused entries. But would this "optimization" prevent
> us from optiminzing out the base class versin of the function, if
> we knew it wouldn't be used?
>
> As for the "general case" method, a map of type id's to functions.
> But that's almost exactly the same as my third method, isn't it?
> Again, every virtual function name/signature combination would have
> to have this type id-->function address, meaning that most of the
> maps would be nearly empty, which wastes space. And would it execute
> any faster than the virtual table method? I think it would be slower.

The difference between your nearly empty table and a map is that a map
is not "nearly empty" - it uses the space it needs for it's elements.
The usual implementation is a balanced tree (see any STL map
implementation). This is also the reason why I did invent a third
method for this case; however, the compiler could decide to do big
switches with the help of a map in general, of course, making the
big case equal to the less-big case.

However, I guess the current vtbl method is still better...

>
> > The compiler could even go as far as generating different versions
> > of the same function depending on the real type (and some advanced
> > logic for comparing the function pointers), therefore "virtualizing"
> > about every function which calls a virtual (or "virtualized") function
> > and then building a static call tree of those many functions. This
> > would cause code bloat and would not work with dynamic linking, but
> > it would be conforming, and it would be hard to beat the speed...
>
> Sorry, but I don't think I followed you on this one. Where is the
> "advanced logic for comparing function pointers?" Please explain this
> in more detail.

Assume you have a hierarchy like this:

struct Base { virtual void f() {} };
struct Der1: Base { void f() {} };
struct Der2: Base { void f() {} };

and a function

void f(Base* pb)
{
  pb->f();
}

Then this hypothetical compiler would generate three functions,
namely

void f(really Base* pb) { pb->Base::f(); }
void f(Der1* as Base* pb) { ((Der1*)pb)->Der1::f(); }
void f(Der2* as Base* pb) { ((Der2*)pb)->Der2::f(); }

Which one is called would be decided by the type really passed in
(which is known, since the calling function is specialized as well,
if the real type is not known at compile time there).
(Objects floating out of a function would be a problem, though;
probably we would get our tables back as "return continue tables"...)

Logically, those functions are just one function, so theoir addresses
must compare equal.

However, thinking more about it, function pointers are far more
complex in this case anyway: You must ennsure that the correct
version of the function is called when invoked through the
function pointer.

The logical thing would be to duplicate the function pointers
as well, and assign them separately. Seems we have a table again...

Another possibility would be to do the same for function pointers
as for object types: At some point, we know which function the pointer
points to, so we specialize all the functions taking such a pointer
for all possible pointer values, and then we know exactly which
vesion of which function to call at the call place. The function
pointers would have been optimized out.

Seems that *comparing* function pointers is not the problematic
part here...

(BTW, the compiler would probably need the resources of a few
hundred workstations and a compile of substantial amounts of
code would probably need a few years ;-))

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: jkanze@otelo.ibmmail.com
Date: 1998/08/20 Raw View

In article <6rcb15$gug$1@nnrp1.dejanews.com>,
  AllanW@my-dejanews.com wrote:
>
> In article <6r7a3p$oob$1@engnews1.eng.sun.com>,
>   stephen.clamage@sun.com (Steve Clamage) wrote:
> [snip]
> > A pointer to a
> > virtual function cannot contain the address of the function,
> > because the function that is to be called is not known until
> > the point where the function is actually called. At that point,
> > the compiler looks up the address in the virtual table. (In the
> > most common implementations, that is. The language definition
> > doesn't mention virtual tables, because other implementations
> > are possible.)
>
> Is this true?
>
> I am aware that the standard specifically did not mention virtual
> tables, the idea being that any mechanism that works correctly
> would satisfy the standard. But are there other methods that
> accomplish this goal?

The most obvious solution is to have the constructor register with a
map< void* , vtbl_t* >.  Using a hash map, on most machines, this can
be made to be very fast.  On the other hand, I don't think it can be
made quite as fast as the standard implementation, and I can't think
of any advantages that would "pay for" the lost speed.  (Maybe a debugging
implementation: if you can't find an entry in the map, you know immediately
that the pointer is out of wack, rather than just accessing, and possibly
branching to, random data.  With a static map per class type, you would
probably catch a certain number of errors that way.)

--
James Kanze    +33 (0)1 39 23 84 71    mailto: kanze@gabi-soft.fr
        +49 (0)69 66 45 33 10    mailto: jkanze@otelo.ibmmail.com
GABI Software, 22 rue Jacques-Lemercier, 78000 Versailles, France
Conseils en informatique orient   e objet --
              -- Beratung in objektorientierter Datenverarbeitung

-----== Posted via Deja News, The Leader in Internet Discussion ==-----
http://www.dejanews.com/rg_mkgrp.xp   Create Your Own Free Member Forum

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Darron Shaffer <darron.shaffer@beasys.com>
Date: 1998/08/20 Raw View

AllanW@my-dejanews.com writes:

<snip>
>
> I am aware that the standard specifically did not mention virtual
> tables, the idea being that any mechanism that works correctly
> would satisfy the standard. But are there other methods that
> accomplish this goal?
>
> 1. (function pointers in each object)
>
> 2. (vtable)
>
> 3. I've been stretching, trying to come up with a third method.
> Here's what I came up with: each function with virtual members
> has a hidden ID field, perhaps used in RTTI. All virtual
> functions have a hidden "dispatch" function. Any call to any
> virtual function named foo with the same call signature, in any
> class (even classes not related heirarchically), would first
> call this "hidden foo" dispatch function. That function would
> grab the ID, and use it to index it's own dispatch table to
> call the appropriate class member function. The disadvantage
> is that these per-function-name tables would take up much more
> space than the per-class virtual tables, since they would have
> so many unused entries (i.e. std::ios::foo wouldn't have any
> entry, because class std::ios doesn't define foo). The
> advantage, well perhaps there isn't any.
>

4. Each class with virtual methods has ONE dispatch function, pointed
   to by a hidden pointer in each instance.  A virtual function call
   loads a register with a "virtual function" identifier and then
   calls this dispatch function.  The dispatch function does something
   like a switch and passes control to the correct function.

5. Each distinct class hierarchy has its own separate two-dimensional
   array (constructed by the compiler/linker).  The first index is the
   class id from RTTI info (a small integer unique within the current
   class hierarchy) and the second is the virtual function index.  The
   array contains a pointer to the real function to use.

6. Pointers to objects have extra bits that can be used for a class
   id.  Now use one of the other RTTI based methods.

7. When you add 0x8000000 to an object pointer you get the address of
   another virtual page that has an array of function pointers (the
   vtable).  With proper control of the virtual memory system, the free
   store, and non-stack automatic variables this could even be pretty
   efficient.

--
 __  __  _
 _ ) ___ _\   Enterprise Middleware Solutions Darron J. Shaffer
 __) __    \  BEA Systems Inc.   Sr. Software Engineer
              17101 Preston Rd   darron.shaffer@beasys.com
              LB# 115, Ste 260    Voice: (972) 738-6137
              Dallas, TX 75248    Fax:   (972) 738-6111
       http://www.beasys.com

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Ron Natalie <ron@sensor.com>
Date: 1998/08/20 Raw View

No matter what scheme you come up with, you pretty much
have to add some identification data in each object with
virtual functions.  Call this a type_id or a pointer.
You might be able to cheat the size down a bit if you
don't use pointers.



[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: AllanW@my-dejanews.com
Date: 1998/08/20 Raw View


> > > AllanW@my-dejanews.com wrote:
> > > > I am aware that the standard specifically did not mention virtual
> > > > tables, the idea being that any mechanism that works correctly
> > > > would satisfy the standard. But are there other methods that
> > > > accomplish this goal?

> > In article <35DA9796.697D37BE@physik.tu-muenchen.de>,
> >   Christopher Eltschka <celtschk@physik.tu-muenchen.de> wrote:
> > > A fourth possibility could be a compiler which analyses the complete
> > > code. In some cases, it is able to replace the virtual call by a direct
> > > call, since there's only one at that position to be called. In many
> > > cases, there are just a few possible functions, and the compiler can
> > > use a switch on a type id. In the remaining cases, the compiler can use
> > > a map of type ids to functions.
> > > This would of course make an expensive compile, and would not work
> > > too well with dynamic libraries; however it would be a conforming
> > > implementation.

Allan:
> > You suggest using the switch based on type id as an optimization, which
> > implies a "general" case to which this is an exception. But it seems to
> > me that this could be used 100% of the time, if desired.  Presumably the
> > base class version of the function has the switching code, with some
> > linker fixup to supply the list of possible type id's and the correct
> > function addresses, no?
> >
> > This approximates my third case, in which every virtual function has
> > a list of type id's and associated function addresses. It's an
> > improvement because the list of addresses wouldn't be so large, and
> > would have no unused entries. But would this "optimization" prevent
> > us from optiminzing out the base class versin of the function, if
> > we knew it wouldn't be used?
> >
> > As for the "general case" method, a map of type id's to functions.
> > But that's almost exactly the same as my third method, isn't it?
> > Again, every virtual function name/signature combination would have
> > to have this type id-->function address, meaning that most of the
> > maps would be nearly empty, which wastes space. And would it execute
> > any faster than the virtual table method? I think it would be slower.

Christopher:
> The difference between your nearly empty table and a map is that a map
> is not "nearly empty" - it uses the space it needs for it's elements.

Yes, that's what I meant when I talked about it being an improvement.

> The usual implementation is a balanced tree (see any STL map
> implementation). This is also the reason why I did invent a third
> method for this case; however, the compiler could decide to do big
> switches with the help of a map in general, of course, making the
> big case equal to the less-big case.

Balanced trees are pretty fast, especially if they're generated and
balanced in advance, and then treated as read-only (this would be
the usage pattern for virtual tables). But it still isn't nearly
as fast as the simple indexed look-up in use by current v-tables.

> However, I guess the current vtbl method is still better...

That's my conclusion. The map isn't any simpler, it isn't any
faster, and it is bigger (even if the data is the same size, we need
some sort of run-time library routine to index it).

Christopher:
> > > The compiler could even go as far as generating different versions
> > > of the same function depending on the real type (and some advanced
> > > logic for comparing the function pointers), therefore "virtualizing"
> > > about every function which calls a virtual (or "virtualized") function
> > > and then building a static call tree of those many functions. This
> > > would cause code bloat and would not work with dynamic linking, but
> > > it would be conforming, and it would be hard to beat the speed...

Allan:
> > Sorry, but I don't think I followed you on this one. Where is the
> > "advanced logic for comparing function pointers?" Please explain this
> > in more detail.

Christopher:
> Assume you have a hierarchy like this:
>
> struct Base { virtual void f() {} };
> struct Der1: Base { void f() {} };
> struct Der2: Base { void f() {} };
>
> and a function
>
> void f(Base* pb)
> {
>   pb->f();
> }
>
> Then this hypothetical compiler would generate three functions,
> namely
>
> void f(really Base* pb) { pb->Base::f(); }
> void f(Der1* as Base* pb) { ((Der1*)pb)->Der1::f(); }
> void f(Der2* as Base* pb) { ((Der2*)pb)->Der2::f(); }
>
> Which one is called would be decided by the type really passed in
> (which is known, since the calling function is specialized as well,
> if the real type is not known at compile time there).
> (Objects floating out of a function would be a problem, though;
> probably we would get our tables back as "return continue tables"...)
>
> Logically, those functions are just one function, so theoir addresses
> must compare equal.

Interesting concept. I think that you're saying that the compiler/linker
would silently turn this:
    struct Class1 { /* ... */ };
    struct Class1a : public Class1 { /* ... */ };
    struct Class1b : public Class1 { /* ... */ };
    struct Class1c : public Class1 ( /* ... */ );
    struct Class1d : public Class1 { /* ... */ };
    struct Class1e : public Class1 { /* ... */ };
    void func1(Class1 *x);
    void func2(Class1 *x) { func1(x); }
    void func3() {
        Class1 *a = new Class1; func2(a); delete a;
        a = new Class1b;        func2(a); delete a;
        a = new Class1d;        func2(a); delete a;
    }
into this:
    struct Class1 { /* ... */ };
    struct Class1a : public Class1 { /* ... */ };
    struct Class1b : public Class1 { /* ... */ };
    struct Class1c : public Class1 ( /* ... */ );
    struct Class1d : public Class1 { /* ... */ };
    struct Class1e : public Class1 { /* ... */ };
    void func1(Class1 *x);
    void func1a(Class1a *x);
    void func1b(Class1b *x);
    void func1c(Class1c *x);
    void func1d(Class1d *x);
    void func1e(Class1e *x);
    void func2(Class1 *x) { func1(x); }
    void func2a(Class1a *x) { func1a(x); }
    void func2b(Class1b *x) { func1b(x); }
    void func2c(Class1c *x) { func1c(x); }
    void func2d(Class1d *x) { func1d(x); }
    void func2e(Class1e *x) { func1e(x); }
    void func3() {
        Class1 *a = new Class1; func2(a);  delete a;
        a = new Class1b;        func2b(a); delete a;
        a = new Class1d;        func2d(a); delete a;
    }
You're right, this would be very fast, because there wouldn't be
any dynamic binding at all. But it does leave two problems.
First, there's exponential growth in code like this:
    void abc(Class1 *a, Class2 *b, Class3 *c, Class4 *d, Class5 *e)
    {
        a->func(b,c,d,e);
        b->func(a,c,d,e);
        c->func(a,b,d,e);
        d->func(a,b,c,e);
        e->func(a,b,c,d);
    }
If each ClassN had 5 derived classes, this would expand into
6x6x6x6x6=7776 functions. Even if the optimizer could eliminate 90%
of them, the code would still be over 700 times larger.

Second, how would you handle this?
    struct Class1 { /* ... */ };
    struct Class1a : public Class1 { /* ... */ };
    struct Class1b : public Class1 { /* ... */ };
    struct Class1c : public Class1 ( /* ... */ );
    struct Class1d : public Class1 { /* ... */ };
    struct Class1e : public Class1 { /* ... */ };
    void def(Class1 *a) { return a->somefunc(); }
    Class1 * read_data() {
        char x;
        std::cin >> x;
        switch (x) {
        default:  return new Class1;
        case 'a': return new Class1a;
        case 'b': return new Class1b;
        case 'c': return new Class1c;
        case 'd': return new Class1d;
        case 'e': return new Class1e;
    }
    int main() {
        for (int i=0; i<10; ++i) def(read_data());
    }
Here, there can be only one read_data based on the parameters. Even
if you break C++ rules by allowing multiple functions with the same
signature and different return type, you can't know which one to call
until you're deep inside of read_data. So how do you get main() to
call the correct def?

> However, thinking more about it, function pointers are far more
> complex in this case anyway: You must ennsure that the correct
> version of the function is called when invoked through the
> function pointer.
>
> The logical thing would be to duplicate the function pointers
> as well, and assign them separately. Seems we have a table again...

True. I think it's forced on you in this case.

> Another possibility would be to do the same for function pointers
> as for object types: At some point, we know which function the pointer
> points to, so we specialize all the functions taking such a pointer
> for all possible pointer values, and then we know exactly which
> vesion of which function to call at the call place. The function
> pointers would have been optimized out.
>
> Seems that *comparing* function pointers is not the problematic
> part here...
>
> (BTW, the compiler would probably need the resources of a few
> hundred workstations and a compile of substantial amounts of
> code would probably need a few years ;-))

It's worse than that. If the answer isn't deterministic, 1000
supercomputers couldn't always solve the problem, even in 1000 years,
because there is no right answer.

--
AllanW@my-dejanews.com is a "Spam Magnet" -- never read.
Please reply in USENET only, sorry.

-----== Posted via Deja News, The Leader in Internet Discussion ==-----
http://www.dejanews.com/rg_mkgrp.xp   Create Your Own Free Member Forum


[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: sbnaran@fermi.ceg.uiuc.edu (Siemel Naran)
Date: 1998/08/21 Raw View

On 20 Aug 1998 23:16:23 GMT, AllanW@my-dejanews.com

>> >   Christopher Eltschka <celtschk@physik.tu-muenchen.de> wrote:
>> > > use a switch on a type id. In the remaining cases, the compiler can use
>> > > a map of type ids to functions.

BTW, this is a nice idea if at some point in our program we have
to do a switch on the typeid.  Ideally, one should use virtual
functions for this sort of stuff.  But this may not be possible.
So instead we do
     if (typeid(...)==something1) doaction1;
     else if (typeid(...)==something2) doaction2;

Storing the various typeid's somethingN in a map is a really good idea.
(Of course, not for one or two typeid's.) For that matter, we could use
a hash table -- might be even faster.

>Balanced trees are pretty fast, especially if they're generated and
>balanced in advance, and then treated as read-only (this would be
>the usage pattern for virtual tables). But it still isn't nearly
>as fast as the simple indexed look-up in use by current v-tables.

Balanced trees are O(log2(N)), and indexed lookup is O(1).

>Christopher:
>> > > The compiler could even go as far as generating different versions
>> > > of the same function depending on the real type (and some advanced
>> > > logic for comparing the function pointers), therefore "virtualizing"

[Allan]
>Interesting concept. I think that you're saying that the compiler/linker
>would silently turn this:
>    void func1(Class1 *x);

>into this:
>    void func1(Class1 *x);
>    void func1a(Class1a *x);
>    void func1b(Class1b *x);
>    void func1c(Class1c *x);
>    void func1d(Class1d *x);
>    void func1e(Class1e *x);

>You're right, this would be very fast, because there wouldn't be
>any dynamic binding at all. But it does leave two problems.
>First, there's exponential growth in code like this:

The optimization is possible, but I don't think it's worhwhile.
What you're essentially doing is replacing virtual functions
with templates.  This avoids the vptr overhead.  But if the
person wanted to use templates, they should have done so in
the first place.  The decision to use virtual functions is
a way of saying 'I don't want templates'; the decision to use
templates is a way of saying 'I don't want virtual functions'.

>Second, how would you handle this?
[snip]
>Here, there can be only one read_data based on the parameters. Even
>if you break C++ rules by allowing multiple functions with the same
>signature and different return type, you can't know which one to call
>until you're deep inside of read_data. So how do you get main() to
>call the correct def?

If the dynamic type cannot be determined statically, then this
optimization doesn't come into play.  You have no choice but to
use the vptr method.  In any case it's ok to overload by return
type if it is the optimizer that's doing it.

So this is the point.  We generally use virtual funcs when we
want polymorphic containers.  For example, if we want a bag of
apples or oranges, we just say list<Fruit*>.  The dynamic type
of each fruit is almost always never known at compile time.
So the above optimization is not even applicable.  Furthermore,
the above optimization is almost always not applicable.  Hence
the above optimization is not worthwhile.

--
----------------------------------
Siemel B. Naran (sbnaran@uiuc.edu)
----------------------------------

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: AllanW@my-dejanews.com
Date: 1998/08/21 Raw View

> AllanW@my-dejanews.com writes:
>
> > I am aware that the standard specifically did not mention virtual
> > tables, the idea being that any mechanism that works correctly
> > would satisfy the standard. But are there other methods that
> > accomplish this goal?
> >
> > 1. (function pointers in each object)
> >
> > 2. (vtable)
> >
> > 3. I've been stretching, trying to come up with a third method.
> > Here's what I came up with: each function with virtual members
> > has a hidden ID field, perhaps used in RTTI. All virtual
> > functions have a hidden "dispatch" function. Any call to any
> > virtual function named foo with the same call signature, in any
> > class (even classes not related heirarchically), would first
> > call this "hidden foo" dispatch function. That function would
> > grab the ID, and use it to index it's own dispatch table to
> > call the appropriate class member function. The disadvantage
> > is that these per-function-name tables would take up much more
> > space than the per-class virtual tables, since they would have
> > so many unused entries (i.e. std::ios::foo wouldn't have any
> > entry, because class std::ios doesn't define foo). The
> > advantage, well perhaps there isn't any.

In article <vzayasj1q1t.fsf@dalibm2.beasys.com>,
  Darron Shaffer <darron.shaffer@beasys.com> wrote:
>
> 4. Each class with virtual methods has ONE dispatch function, pointed
>    to by a hidden pointer in each instance.  A virtual function call
>    loads a register with a "virtual function" identifier and then
>    calls this dispatch function.  The dispatch function does something
>    like a switch and passes control to the correct function.

The dispatch function does something like a switch. But if the
virtual function identifiers are small integers beginning with 0, we
can optimize this to a table lookup. And since the table lookup is
so easy on most CPUs, we can make the dispatch function inline;
instead of the address of a function, we use the address of a
lookup table.

In other words, a virtual table.

> 5. Each distinct class hierarchy has its own separate two-dimensional
>    array (constructed by the compiler/linker).  The first index is the
>    class id from RTTI info (a small integer unique within the current
>    class hierarchy) and the second is the virtual function index.  The
>    array contains a pointer to the real function to use.

A small integer unique within the class heirarchy gives us the class
id, which is used to index into the table. But this is no more or less
efficient than having a pointer in the class which obtains the RTTI
type, and incidentally the address of a (smaller, one-dimensional)
table of virtual pointers.

The details vary, but it's still essentially a virtual table, except
that it's 2D instead 1D -- isn't that true?

> 6. Pointers to objects have extra bits that can be used for a class
>    id.  Now use one of the other RTTI based methods.

This is interesting. You wouldn't want the extra bits to be low-order,
though, because you would have to mask them off each time you wanted
to use them. But if you made them high-order bits, the net effect
would be to complicate operator new somewhat, by segregating all
objects into regions of virtual memory, in return for which each of
those objects would be smaller (no wasted space for a v-table pointer).

But would this work with overridden operator new? Say the programmer
supplies an operator new to force placement of a MyClass object. If
MyClass had any virtual functions, you would be in trouble, no?

> 7. When you add 0x8000000 to an object pointer you get the address of
>    another virtual page that has an array of function pointers (the
>    vtable).  With proper control of the virtual memory system, the free
>    store, and non-stack automatic variables this could even be pretty
>    efficient.

This implies that every object has it's own array of function pointers,
and that the array wouldn't be allowed to be larger than the object
itself. Perhaps you meant to add 0x8000000 and then mask off the low
N bits? If so, this is a variation on item #6 -- either way, you would
use the object's address to find the object, which implies difficulty
with placed objects.

--
AllanW@my-dejanews.com is a "Spam Magnet" -- never read.
Please reply in USENET only, sorry.

-----== Posted via Deja News, The Leader in Internet Discussion ==-----
http://www.dejanews.com/rg_mkgrp.xp   Create Your Own Free Member Forum

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]