Thread

Topic: Dynamic cast of virtual function pointers

Author: "Stuart Yeates" <stuart.yeates@trimble.co.nz>
Date: 1998/08/19



Christopher Eltschka <celtschk@physik.tu-muenchen.de> wrote in article
<35D16165.56CEB83B@physik.tu-muenchen.de>...
>
> Unlike your example, no casts are used at all, nor is the value
> mangled - it's just written to file and re-read. However, the GC
> won't scan the disk for pointers...

There are garbage collection schemes which work with disk-based pointers
(persistent object stores); they require mechanisms similar to those of
distributed garbage collection algorithms.

stuart


[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: AllanW@my-dejanews.com
Date: 1998/08/20

In article <01bdcbc1$dfb09420$e8f83f9b@syeates>,
  "Stuart Yeates" <stuart.yeates@trimble.co.nz> wrote:
>
>
> Christopher Eltschka <celtschk@physik.tu-muenchen.de> wrote in article
> <35D16165.56CEB83B@physik.tu-muenchen.de>...
> >
> > Unlike your example, no casts are used at all, nor is the value
> > mangled - it's just written to file and re-read. However, the GC
> > won't scan the disk for pointers...
>
> there are garbage collection schemes which work with disk based pointers
> (persistent object stores), they require similar mechanisms to distributed
> garbage collection algorithms.

Can you name an example, and/or explain how they work?

It seems a fundamental principle that GC must have a way to scan all
pointers, in order to find objects that are not pointed to. For GC
to work without any unusual heroic compiler support, the "normal"
practice is to scan global, static, and automatic objects for values
which might point to heap objects, and to continue by scanning the
heap items already "seen" for pointers to still other heap items.

So without unusual heroic compiler support, how could GC work if a
pointer was written to disk, eventually to be read back in and used
to dereference an item in the heap?
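
To make the question concrete, here is a minimal sketch of the scan
described above. It assumes a toy heap in which objects refer to each
other by index rather than by address; the names Object, mark, sweep and
collect are made up for illustration and do not come from any real
collector. An object whose only reference sits in a file is not reachable
from any root, so a scan like this one would report it as garbage.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy heap object: it lists the heap indices of the objects it points to.
struct Object {
    std::vector<std::size_t> refs;  // outgoing "pointers" (indices into the heap)
    bool marked = false;
};

// Mark phase: start from the roots, then keep scanning objects already seen.
void mark(std::vector<Object>& heap, const std::vector<std::size_t>& roots) {
    std::vector<std::size_t> worklist(roots.begin(), roots.end());
    while (!worklist.empty()) {
        std::size_t i = worklist.back();
        worklist.pop_back();
        assert(i < heap.size());
        if (heap[i].marked) continue;  // already visited (also handles cycles)
        heap[i].marked = true;
        for (std::size_t r : heap[i].refs) worklist.push_back(r);
    }
}

// Sweep phase: report every unmarked object as garbage and clear the marks.
std::vector<std::size_t> sweep(std::vector<Object>& heap) {
    std::vector<std::size_t> garbage;
    for (std::size_t i = 0; i < heap.size(); ++i) {
        if (!heap[i].marked) garbage.push_back(i);
        heap[i].marked = false;
    }
    return garbage;
}

// One full collection cycle; returns the indices that would be reclaimed.
std::vector<std::size_t> collect(std::vector<Object>& heap,
                                 const std::vector<std::size_t>& roots) {
    mark(heap, roots);
    return sweep(heap);
}
```

Calling collect with a root list that omits an object (because its only
reference was written to disk and then cleared) puts that object in the
returned list, which is exactly the problem under discussion.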

--
AllanW@my-dejanews.com is a "Spam Magnet" -- never read.
Please reply in USENET only, sorry.

-----== Posted via Deja News, The Leader in Internet Discussion ==-----
http://www.dejanews.com/rg_mkgrp.xp   Create Your Own Free Member Forum

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/08/13

In article <35D16165.56CEB83B@physik.tu-muenchen.de>, Christopher Eltschka
<celtschk@physik.tu-muenchen.de> wrote:
>Indeed, there's a much simpler case:
>
>class foo {};
>
>void f()
>{
>  foo* f1 = new foo;
>
>  FILE* f=fopen("whatever", "w");
>  fwrite(&f1, sizeof(f1), 1, f);
>  fclose(f);
>  f1=NULL; // the object is no longer referenced...
>}
>
>void g()
>{
>  foo* f2;
>
>  FILE* f=fopen("whatever", "r");
>  fread(&f2, sizeof(f2), 1, f); // now the object is referenced again
>  fclose(f);
>
>  delete f2;
>}
>
>int main()
>{
>  f();
>  g();
>}
>
>Unlike your example, no casts are used at all, nor is the value
>mangled - it's just written to file and re-read. However, the GC
>won't scan the disk for pointers...

  I am not sure what your example is meant to demonstrate: A conservative
GC works by finding the root set and, from that, scanning (by "mark and
sweep") all active pointers. Objects not reached in that process are
considered inactive and are removed.

  So in your example, it depends on where the program is stopped when the
GC runs. If it stops in g() after the pointer f2 has been read, then the GC
will discover that, and the object will remain. But if the program stops
after g() has terminated, then there is no pointer to the object anymore,
which will be discovered once the mark and sweep has been done, and the
object will be removed.

  You can play around with this idea a bit to see what happens: Say we
instead have
    foo* fp;            // global, so it is part of the root set

    int main()
    {
        f();
        g();
    }
and g() sets this global pointer by
    void g()
    {
        foo* f2;

        FILE* f=fopen("whatever", "r");
        fread(&f2, sizeof(f2), 1, f); // now the object is referenced again
        fclose(f);

        fp = f2;        // Set fp.

        // No delete here: the object stays reachable through fp.
    }
Then, when g() has terminated, fp will be in the root set before main()
has terminated, and the pointer will be traced, discovering that the
object is still in use.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <haberg@REMOVE.member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>

Author: AllanW@my-dejanews.com
Date: 1998/08/13

In article <haberg-1208981118180001@sl83.modempool.kth.se>,
  haberg@REMOVE.matematik.su.se (Hans Aberg) wrote:
>
> In article <35D0D92D.6EC0@noSPAM.central.beasys.com>,
> dtribble@technologist.com wrote:
> >Perhaps there's a misunderstanding here.  I was suggesting that a
> >twisted program manipulates pointers independently of GC so that
> >the GC cannot properly deduce which objects are really dead.
>
>   So then I have to explain so that there are no further misunderstandings:
>
>   Suppose one wants handles that can be moved, and because of that one
> stamps each handle with a unique number, the idea being that a handle must
> be referenced by that number and not directly (because when the handle
> moves, the number remains the same but its address changes). Any mechanism
> must then reference the handles via a look-up function which computes the
> current address of a handle.

He's right.

Consider the abstract idea of a "pointer." If we dereference it, we
access the data it "points" to. Back away from the physical idea we're
all used to, where the pointer contains a number that can be loaded
into some machine register and used to get to the data. That's an
implementation detail, used by most compilers because it's simple and
very quick. But nothing requires the compiler to work this way. So
long as you're able to use a pointer in the ways you've become
familiar with, the compiler is free to add another layer of abstraction.
This would resemble the handles used by several APIs, such as the
so-called memory management of Windows 3.0. You call "malloc" or
"operator new" to allocate a chunk of memory; this call returns a value
which internally is the number 1, but externally it is a "pointer" object
with private data (the handle number). Now pass this address to memset;
the values "pointed to" get erased. And so on.

One way to implement this is to have an internal array of "real" memory
addresses. Then pointers contain an index into this internal array.
At any time, the memory block can be moved and the index updated --
even if code is actively using that pointer at the same time this is
going on! (Except multi-threaded systems; there we would need a lock.)
All that's required is to update the "real" memory address quickly,
and return to the code which can continue to use the unchanged pointer
value.

This has a wide variety of useful applications; two that spring to
mind quickly are diagnostic memory traces and support of virtual
memory on computers which don't have any assistance from the OS. But
the big, obvious use is Garbage Collection. Whenever necessary, the
GC could create free memory by moving existing blocks into contiguous
memory.
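
The indirection described above can be sketched roughly as follows. This
is only an illustration under simplifying assumptions (a fixed-size arena,
bump allocation, handles never reused, no threads); HandleHeap and its
members are hypothetical names, not part of any real API. A handle is an
index into a table of current offsets, deref plays the role of the
look-up function, and compact moves blocks while every handle value stays
the same.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical handle-based allocator: a "pointer" is an index into a
// table of real locations, so blocks can move without changing the handle.
class HandleHeap {
public:
    using Handle = std::size_t;

    explicit HandleHeap(std::size_t bytes) : storage_(bytes), used_(0) {}

    // Allocate n bytes; returns a handle (table index), not an address.
    Handle allocate(std::size_t n) {
        assert(used_ + n <= storage_.size());
        table_.push_back({used_, n, true});
        used_ += n;
        return table_.size() - 1;
    }

    // Mark a block as free; its space is recovered by the next compact().
    void release(Handle h) { table_[h].live = false; }

    // The look-up function: resolve a handle to the block's current address.
    unsigned char* deref(Handle h) {
        assert(table_[h].live);
        return storage_.data() + table_[h].offset;
    }

    // Slide live blocks down to close the gaps; only the table is updated.
    void compact() {
        std::size_t next = 0;
        for (Entry& e : table_) {  // entries stay in address order
            if (!e.live) continue;
            std::memmove(storage_.data() + next,
                         storage_.data() + e.offset, e.size);
            e.offset = next;
            next += e.size;
        }
        used_ = next;
    }

    std::size_t used() const { return used_; }

private:
    struct Entry { std::size_t offset; std::size_t size; bool live; };
    std::vector<unsigned char> storage_;
    std::vector<Entry> table_;
    std::size_t used_;
};
```

Note that the address returned by deref is only valid until the next
compact, which is why all access has to go through the look-up.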

>   So if this look-up function is not built into the GC, then there is no
> way for the GC to trace the objects that the handles point to just by
> tracing pointers: One will start with a subset of the handles which
> constitute the root set, trace the objects they are pointing to. But then
> these objects have no references to the new handles they point to, only
> numbers which must be resolved by the look-up function.

He's saying: let's assume that someone already had such a mechanism,
but it didn't include Garbage Collection. Now we tried to use a
general-purpose Garbage Collection method on that program. But it
fails, because objects don't contain "real" memory addresses of
other objects, they contain the "handles".  Instead, the memory
system itself has an array of "real" memory addresses. But you can't
use that for GC, because every single object is in that list -- there
will never be any "Garbage" to collect!

>   So, for a GC just tracing pointers, the active set of handles cannot be
> found.

--
AllanW@my-dejanews.com is a "Spam Magnet" -- never read.
Please reply in USENET only, sorry.

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/08/13

In article <MPG.103c1b34300dee4d989c91@news.rmi.net>, jcoffin@taeus.com
(Jerry Coffin) wrote:
>How does the lookup function get access to these pointers?
...
> if you allow pointers to get mangled in such
>a way that they can no longer be recognized as pointers, you're going
>to run into a problem with GC.

  Precisely, this is what I wanted to give a simple example of, how one
might mangle them so that they are no longer recognizable as pointers.
Then there is no way for the GC to recognize them, unless the
mangling/unmangling procedure is put into the GC itself.

>Summary: if you mangle pointers, and/or store pointers in areas
>invisible as part of the root set (primarily files) it'll break GC.

  As there are many fishy things one can do, I propose that C++ should
make it easy to find the root set/help tracing pointers of the classes
specially marked so, indicating which GC to use. -- Also note that
different GC's may require the code be written in different special ways
(say if one should allow memory moving operations or not).

  Then one can add GC's as libraries and objects can use the GC which is
best for them.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <haberg@REMOVE.member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>

Author: AllanW@my-dejanews.com
Date: 1998/08/13

In article <MPG.103c1b34300dee4d989c91@news.rmi.net>,
  jcoffin@taeus.com (Jerry Coffin) wrote:
> This discussion has _really_ gone on a long time for something that's
> ultimately pretty simple: if you allow pointers to get mangled in such
> a way that they can no longer be recognized as pointers, you're going
> to run into a problem with GC.
[snip]
> I believe this has already been covered in the C++ standard: at least
> IIRC, about a year ago or so, Andrew Koenig asked about what it'd take
> in the standard to support optional GC.  What came out was pretty
> simple: if you store pointers in ways that they can't be recognized as
> pointers, and you attempt to dereference them later, you get undefined
> results.  This includes not only mangling pointers, but things like
> temporarily storing them in files, then reading them back in.

Sounds pretty reasonable, and that's probably enough of a hook to
allow reasonable garbage collection without any other explicit
mention in the standard.

Is this provision part of the standard? Was it in one of the public drafts?

--
AllanW@my-dejanews.com is a "Spam Magnet" -- never read.
Please reply in USENET only, sorry.

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/08/12

In article <35D0DBB9.6EF4@noSPAM.central.beasys.com>,
dtribble@technologist.com wrote:
>No, I'm proposing that the compiler generate enough information
>available at runtime, both as class layout info and as info kept
>on the stack along with object pointers (possibly), so that the
>GC doesn't have to sift through memory looking for possible stale
>objects; it can simply walk the stack and heap, armed with complete
>knowledge of what is and what isn't a pointer and what each
>object's contents are.  Thus imbuing the GC with total object
>knowledge would eliminate the need for "guessing" about pointers
>and things that look like pointers but aren't.

  I thought I had already suggested this, in the sense that the compiler
knows how to generate the root set given a certain type of object to scan
for.

  However, with the root set in hand, the pointers must be traced, because
it is not possible to tell at compile time how dynamically allocated
objects will link up at runtime. I had in mind a variation where this is
left entirely to the GC implementation, which is then not a part of C++,
but one can also think of a C++ feature that provides support for that too.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <haberg@REMOVE.member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/08/12

In article <35D0D92D.6EC0@noSPAM.central.beasys.com>,
dtribble@technologist.com wrote:
>Perhaps there's a misunderstanding here.  I was suggesting that a
>twisted program manipulates pointers independently of GC so that
>the GC cannot properly deduce which objects are really dead.

  So then I have to explain so that there are no further misunderstandings:

  Suppose one wants handles that can be moved, and because of that one
stamps each handle with a unique number, the idea being that a handle must
be referenced by that number and not directly (because when the handle
moves, the number remains the same but its address changes). Any mechanism
must then reference the handles via a look-up function which computes the
current address of a handle.

  So if this look-up function is not built into the GC, then there is no
way for the GC to trace the objects that the handles point to just by
tracing pointers: One will start with a subset of the handles which
constitute the root set, trace the objects they are pointing to. But then
these objects have no references to the new handles they point to, only
numbers which must be resolved by the look-up function.

  So, for a GC just tracing pointers, the active set of handles cannot be found.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <haberg@REMOVE.member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>

Author: Christopher Eltschka <celtschk@physik.tu-muenchen.de>
Date: 1998/08/12


David R Tribble wrote:
>
> In article <35CA228C.78E9@noSPAM.central.beasys.com>, I,
> dtribble@technologist.com, wrote:
> >> However.  As I write this, I can envision a twisted scheme that
> >> generates pointers using some form of pointer arithmetic, so that
> >> even if an object's address is no longer stored in any
> >> pointer/reference, it can be recomputed at a later time.  (Think
> >> of saving the xor of two pointer values, and then recomputing
> >> the original pointer value by xor'ing again later.)  This
> >> sort of thing would severely complicate any C++ GC mechanism.
>
> Hans Aberg wrote:
> > One could also think of handles that can be moved: Instead of using
> > the address of the handle, the handle gets a number which does not
> > change when it is moved. (Probably very slow.)
> >
> > But such schemes would be implemented as a part of the GC
> > implementation technique I think, and so should pose no problems.
>
> Perhaps there's a misunderstanding here.  I was suggesting that a
> twisted program manipulates pointers independently of GC so that
> the GC cannot properly deduce which objects are really dead.
> For example:
>
>     class Foo
>     {
>         ...
>     };
>
>     void bar()
>     {
>         Foo *   f1;
>         long    save;
>
>         f1 = new Foo;        // Allocate a Foo
>
>         save = (long)(void *)f1;   // Save the Foo pointer
>         save ^= MAGIC;             // Mangle it
>
>         f1 = NULL;           // No more references to the Foo
>
>         // [1] At this point, the Foo is unreferenced but still alive
>         // Q: Does GC assume the Foo can be reclaimed here?
>         ...
>
>         Foo *   f2;
>
>         save ^= MAGIC;             // Unmangle the saved pointer
>         f2 = (Foo *)(void *)save;  // Restore the pointer
>
>         // [2] At this point, the Foo is no longer unreferenced
>         // and is still alive
>         ...
>
>         delete f2;           // Really delete the Foo
>         f2 = NULL;           // No more references to the Foo
>     }
>
> This contrived program creates a situation where an object that
> is dynamically allocated has no references but is still considered
> alive by the program.  At some time later, it reconstructs a valid
> pointer to the object.  This sort of thing complicates GC because
> the object could be considered unreachable (dead) and reclaimable
> between [1] and [2], when in fact the program considers it still
> alive.
>
> But I am not suggesting that a C++ GC mechanism be required to
> deal with this sort of pathological case.

Indeed, there's a much simpler case:

#include <cstdio>   // FILE, fopen, fwrite, fread, fclose

class foo {};

void f()
{
  foo* f1 = new foo;

  FILE* f=fopen("whatever", "w");
  fwrite(&f1, sizeof(f1), 1, f);
  fclose(f);
  f1=NULL; // the object is no longer referenced...
}

void g()
{
  foo* f2;

  FILE* f=fopen("whatever", "r");
  fread(&f2, sizeof(f2), 1, f); // now the object is referenced again
  fclose(f);

  delete f2; // delete the foo object, not the FILE*
}

int main()
{
  f();
  g();
}

Unlike your example, no casts are used at all, nor is the value
mangled - it's just written to file and re-read. However, the GC
won't scan the disk for pointers...



Author: jcoffin@taeus.com (Jerry Coffin)
Date: 1998/08/13

In article <haberg-1208981118180001@sl83.modempool.kth.se>,
haberg@REMOVE.matematik.su.se says...

[ ... ]

>   Suppose one wants handles that can be moved, and because of that one
> stamps each handle with a unique number, the idea being that a handle must
> be referenced by that number and not directly (because when the handle
> moves, the number remains the same but its address changes). Any mechanism
> must then reference the handles via a look-up function which computes the
> current address of a handle.
>
>   So if this look-up function is not built into the GC, then there is no
> way for the GC to trace the objects that the handles point to just by
> tracing pointers: One will start with a subset of the handles which
> constitute the root set, trace the objects they are pointing to. But then
> these objects have no references to the new handles they point to, only
> numbers which must be resolved by the look-up function.

How does the lookup function get access to these pointers?  You've
basically got one of two scenarios: either it has some other data
(which will be in the root set) to get access to them.  By following
that you get to the pointers, and all is good.

Your other possibility is that you've mangled the pointers to produce
the opaque handles.  I.e. the handles really ARE pointers, but they've
been mangled in a way that the GC can't recognize.

This discussion has _really_ gone on a long time for something that's
ultimately pretty simple: if you allow pointers to get mangled in such
a way that they can no longer be recognized as pointers, you're going
to run into a problem with GC.  Otherwise, you're going to have
something in the root set that leads to all memory that's available --
having some lookup function, and using some opaque version of a
pointer really makes NO difference at all: either the lookup is
unmangling pointers, or it's using something in the root set to figure
out addresses from the opaque handles.  Both cases are covered: one
works and the other doesn't.

I believe this has already been covered in the C++ standard: at least
IIRC, about a year ago or so, Andrew Koenig asked about what it'd take
in the standard to support optional GC.  What came out was pretty
simple: if you store pointers in ways that they can't be recognized as
pointers, and you attempt to dereference them later, you get undefined
results.  This includes not only mangling pointers, but things like
temporarily storing them in files, then reading them back in.

I can hardly imagine the committee having a big problem with any of
that, so I suspect it got put in.  If it's there, your objections
simply don't work anymore.

Summary: if you mangle pointers, and/or store pointers in areas
invisible as part of the root set (primarily files) it'll break GC.  I
believe doing so also involves undefined behavior under the C++
standard.  As long as you avoid the things the standard already says
produce undefined results, your root set will include pointers to the
data, and it makes no difference at all whether the pointers are
stored directly, or are hidden in some function that itself has to
access them via something in the root set.

--
    Later,
    Jerry.

The Universe is a figment of its own imagination.


Author: David R Tribble <david.tribble@noSPAM.central.beasys.com>
Date: 1998/08/11

In article <35CA228C.78E9@noSPAM.central.beasys.com>, I,
dtribble@technologist.com, wrote:
>> However.  As I write this, I can envision a twisted scheme that
>> generates pointers using some form of pointer arithmetic, so that
>> even if an object's address is no longer stored in any
>> pointer/reference, it can be recomputed at a later time.  (Think
>> of saving the xor of two pointer values, and then recomputing
>> the original pointer value by xor'ing again later.)  This
>> sort of thing would severely complicate any C++ GC mechanism.

Hans Aberg wrote:
> One could also think of handles that can be moved: Instead of using
> the address of the handle, the handle gets a number which does not
> change when it is moved. (Probably very slow.)
>
> But such schemes would be implemented as a part of the GC
> implementation technique I think, and so should pose no problems.

Perhaps there's a misunderstanding here.  I was suggesting that a
twisted program manipulates pointers independently of GC so that
the GC cannot properly deduce which objects are really dead.
For example:

    class Foo
    {
        ...
    };

    void bar()
    {
        Foo *   f1;
        long    save;

        f1 = new Foo;        // Allocate a Foo

        save = (long)(void *)f1;   // Save the Foo pointer
        save ^= MAGIC;             // Mangle it

        f1 = NULL;           // No more references to the Foo

        // [1] At this point, the Foo is unreferenced but still alive
        // Q: Does GC assume the Foo can be reclaimed here?
        ...

        Foo *   f2;

        save ^= MAGIC;             // Unmangle the saved pointer
        f2 = (Foo *)(void *)save;  // Restore the pointer

        // [2] At this point, the Foo is no longer unreferenced
        // and is still alive
        ...

        delete f2;           // Really delete the Foo
        f2 = NULL;           // No more references to the Foo
    }

This contrived program creates a situation where an object that
is dynamically allocated has no references but is still considered
alive by the program.  At some time later, it reconstructs a valid
pointer to the object.  This sort of thing complicates GC because
the object could be considered unreachable (dead) and reclaimable
between [1] and [2], when in fact the program considers it still
alive.

But I am not suggesting that a C++ GC mechanism be required to
deal with this sort of pathological case.
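
For readers who want to try the idea, here is a self-contained variant of
the example above. It is a sketch only: kMagic stands in for MAGIC with an
arbitrary value, std::uintptr_t replaces long (which is not guaranteed to
be wide enough for a pointer), and mangle, unmangle and
looks_like_pointer_to are hypothetical helper names. The last helper
mimics the test a scanning collector would apply to a word in memory.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical key; any nonzero value changes the stored bit pattern.
constexpr std::uintptr_t kMagic = 0x5A5A5A5Au;

// Hide a pointer as an integer that no longer equals the object's address.
std::uintptr_t mangle(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) ^ kMagic;
}

// Recover the original pointer from its hidden form.
void* unmangle(std::uintptr_t hidden) {
    return reinterpret_cast<void*>(hidden ^ kMagic);
}

// What a scanning collector effectively asks about each word it sees:
// does this value equal the address of the object in question?
bool looks_like_pointer_to(std::uintptr_t word, const void* object) {
    return word == reinterpret_cast<std::uintptr_t>(object);
}

// Allocate, hide, then recover. Between mangle() and unmangle() the only
// record of the address is the hidden integer.
bool hidden_pointer_round_trip() {
    int* original = new int(42);
    std::uintptr_t hidden = mangle(original);
    bool invisible = !looks_like_pointer_to(hidden, original);

    int* recovered = static_cast<int*>(unmangle(hidden));
    assert(recovered == original);  // the round trip restores the pointer
    bool intact = (*recovered == 42);
    delete recovered;
    return invisible && intact;
}
```

Without a collector the round trip is harmless. With one, the object
could be reclaimed while only the hidden integer exists, which is the
window between [1] and [2] above.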

-- David R. Tribble, dtribble@technologist.com --

Author: David R Tribble <david.tribble@noSPAM.central.beasys.com>
Date: 1998/08/12

David Tribble (dtribble@technologist.com) wrote:
> > Because then GC knows exactly what is, and what is not, a pointer.
> > It has total knowledge of what is contained within any object,
> > including (possibly self-referential) pointers.

AllanW@my-dejanews.com wrote:
> If it could always tell that, GC would be almost trivial.
> But it can't, so it isn't.

Why can't it?

Me:
> > Current GC mechanisms that don't have access to the C++ symbolic
> > information (RTTI) typically have to make some guesses about what
> > constitutes a "real" pointer, and usually err on the conservative
> > side; which means that there is the possibility that some dead
> > objects will never get reclaimed.

Allan:
> Right.
> I think you're proposing a change to the language definition, to
> provide hooks for GC. This way it would always be able to sift
> through all memory looking for pointers, and being certain if some
> data is a pointer or not.

No, I'm proposing that the compiler generate enough information
available at runtime, both as class layout info and as info kept
on the stack along with object pointers (possibly), so that the
GC doesn't have to sift through memory looking for possible stale
objects; it can simply walk the stack and heap, armed with complete
knowledge of what is and what isn't a pointer and what each
object's contents are.  Thus imbuing the GC with total object
knowledge would eliminate the need for "guessing" about pointers
and things that look like pointers but aren't.

Allan:
> One could imagine having the ability to implement class *, which
> would be the root class of all pointers. Then it would be a simple
> matter to create a new member variable to implement a linked list.
> You would be able to walk the entire list of pointers any time you
> needed to, finding out which ones point where.

Technically, you have a kind of linked list when you have pointers
on the stack.  Walking down the pointers in the stack is equivalent
to running down the list of pointers.

Allan:
> Come to think of it, you wouldn't really need the linked list. If
> you could detect whenever a pointer was initialized, changed, or
> destroyed, then you could use this information to maintain reference
> counts to the objects pointed to, and delete them when the count
> reached 0.
> But C++ was designed to implement object-oriented concepts without
> the overhead traditionally associated with such systems.

You don't need reference counts for good GC.  You only need a
scan-and-mark strategy which relies on knowing where all the active
pointers are.  Which requires help from the compiler.
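
A rough sketch of what such compiler help might look like, written by
hand here since no compiler generates it: each object carries a header
naming a layout record, and the layout record lists the byte offsets of
its pointer members. TypeInfo, Header, Node, kNodeInfo and trace are all
hypothetical names, and the sketch assumes every traced pointer refers to
the header of another such object.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <set>
#include <vector>

// Stand-in for compiler-generated layout info: where the pointers live.
struct TypeInfo {
    std::vector<std::size_t> pointer_offsets;  // byte offsets of pointer members
};

// Every traced object starts with a header naming its layout.
struct Header {
    const TypeInfo* type;
};

struct Node {
    Header        header;
    Header*       next;     // a real pointer: listed in the layout
    std::intptr_t payload;  // may hold a pointer-like value; never scanned
};

const TypeInfo kNodeInfo{{offsetof(Node, next)}};

// Mark everything reachable from the roots, following only the members
// that the layout info declares to be pointers. No guessing is involved.
std::set<const Header*> trace(const std::vector<const Header*>& roots) {
    std::set<const Header*> marked;
    std::vector<const Header*> worklist(roots);
    while (!worklist.empty()) {
        const Header* obj = worklist.back();
        worklist.pop_back();
        if (obj == nullptr || !marked.insert(obj).second) continue;
        assert(obj->type != nullptr);
        const unsigned char* base = reinterpret_cast<const unsigned char*>(obj);
        for (std::size_t off : obj->type->pointer_offsets) {
            const Header* child;
            std::memcpy(&child, base + off, sizeof child);  // read the pointer member
            worklist.push_back(child);
        }
    }
    return marked;
}
```

An integer member that happens to contain an address, such as payload
here, is ignored, so it neither keeps a dead object alive nor needs to be
guessed about.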

-- David R. Tribble, dtribble@technologist.com --

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/08/07

In article <35CA228C.78E9@noSPAM.central.beasys.com>,
dtribble@technologist.com wrote:
>GC doesn't need to address normally allocated and deallocated
>objects, which use explicit new and delete calls, nor does it
>need to address auto objects whose destructors get called when
>they go out of scope.

  I just want to point out that it is more complicated than that: Suppose
you have a class which is always used as an automatic class, but which
contains a pointer that is dynamically allocated/deallocated.

  Then, even if that class is always used to create automatic objects, if
that automatic object appears inside a class which is dynamically
allocated, it becomes dynamically allocated and the GC should treat it
differently (that is, not part of the root set).

  So even though its destructor is called when it goes out of scope, it
is the GC which is going to trigger that, and not the C++ automatic
feature.

>However.  As I write this, I can envision a twisted scheme that
>generates pointers using some form of pointer arithmetic, so that
>even if an object's address is no longer stored in any
>pointer/reference, it can be recomputed at a later time.  (Think
>of saving the xor of two pointer values, and then recomputing
>the original pointer value by xor'ing again later.)  This
>sort of thing would severely complicate any C++ GC mechanism.

  One could also think of handles that can be moved: Instead of using the
address of the handle, the handle gets a number which does not change when
it is moved. (Probably very slow.)

  But such schemes would be implemented as a part of the GC implementation
technique I think, and so should pose no problems.

>II.
>
>A different argument for adding GC to C++ involves changing the
>language to act more like other languages that have built-in GC,
>such as Eiffel and Java.  In this argument, we completely remove
>destructors from the language, and let all of our objects be
>managed by GC.  We don't explicitly delete objects, we simply
>stop referencing them.
..
>But following this line of argument leads to a different color of
>C++ than what we have today.

  So therefore, I do not think it is feasible: C++ can be changed so that
applications that use a GC can be implemented conveniently and
efficiently, but one should not make a global C++ GC affecting everything.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <haberg@REMOVE.member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>

Author: "Paul D. DeRocco" <pderocco@ix.netcom.com>
Date: 1998/08/07 Raw View

Hans Aberg wrote:
>
>   The C++ extensions need not be that dramatic: A convenient way to
> locate the root set. Objects may need to have "colors" indicating
> their runtime dynamics: Global, stacked, temporary, dynamically
> allocated.

There's one more color, which is for an object that is nested inside
another. Realistically, a reference to a nested object needs to keep the
entire enclosing object alive, since it probably isn't practical to
deallocate the memory occupied by the enclosing object minus the
enclosed object.
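
To make the nested-object point concrete, here is a minimal sketch of
how a collector might map a pointer into the middle of an object back
to the enclosing allocation. The BlockTable type and the
enclosing_block function are hypothetical names for illustration only;
no particular collector is claimed to work this way.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical allocation table: block start address -> block size in bytes.
using BlockTable = std::map<std::uintptr_t, std::size_t>;

// Return the start address of the block that contains addr, or 0 if
// addr does not fall inside any recorded block.
std::uintptr_t enclosing_block(const BlockTable& blocks, std::uintptr_t addr) {
    auto it = blocks.upper_bound(addr);   // first block starting after addr
    if (it == blocks.begin()) return 0;   // every block starts after addr
    --it;                                 // last block starting at or before addr
    return (addr - it->first < it->second) ? it->first : 0;
}

// Example types: a pointer to Outer::inner must keep the whole Outer alive.
struct Inner { int x; };
struct Outer { int header; Inner inner; };
```

With such a table, a reference to the inner member resolves to the
start of the Outer block, so the whole enclosing object is retained.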

--

Ciao,
Paul

Author: David R Tribble <david.tribble@noSPAM.central.beasys.com>
Date: 1998/08/07 Raw View

I, David Tribble (dtribble@technologist.com) wrote:
>> If I recall correctly, many of the more successful GC algorithms
>> use a "mark and sweep" strategy, periodically marking "dead" objects
>> that are no longer referenced, and then sweeping (reclaiming) their
>> storage.
>>
>> That being the case, it would seem reasonable for a GC mechanism
>> in C++ to allow delete() to "mark" the object's memory block after
>> its destructor was called.  This would apply to dynamically
>> allocated objects (on the heap).  Other (static and auto) objects
>> would be destructed, as usual, when their names go out of scope,
>> which also calls delete().

AllanW@my-dejanews.com wrote:
> ...
> What we've just described is the standard "heap" mechanism shipped
> with most C++ run-time systems. This isn't Garbage Collection.

Har har.  I was merely suggesting that the GC would benefit by marking
explicitly deleted objects at the time they are deleted, rather than
determining their dead status later.

>> The only case left is that of dynamic objects (which are created
>> by explicit calls to new()) that are not explicitly deleted, but
>> have no active references to them (also known as "dead" objects).
>> It is this category of objects (a.k.a., "memory leaks") that would
>> get the most benefit from GC.  The only complication is when to
>> execute the destructor for such an object; its "dead" status
>> might not be detected until quite some time after the last pointer
>> to it was changed.  But maybe this isn't such a problem, since it
>> only occurs for dynamic objects (which cannot "go out of scope").

> As I understood it, this *IS* garbage collection. There's certainly
> no need to "collect" objects on the stack, so the "new" objects are
> the only ones that matter.

Precisely.  Hence the usefulness of marking them at the time they
are deleted.  Or, equivalently, taking them out of the reclamation
pool immediately at the time they are deleted.  (This is probably
an inconsequential point.)
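
For readers who want something concrete, the following is a toy sketch
of mark-and-sweep over an index-based object graph, with an explicit
delete that removes an object from the collector's pool immediately.
Note that in the usual formulation the mark phase marks the reachable
(live) objects and the sweep reclaims whatever was left unmarked. All
names here (Obj, ToyHeap, explicit_delete, collect) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Obj {
    std::vector<std::size_t> refs;  // indices of the objects this one points to
    bool freed = false;             // storage has already been reclaimed
    bool marked = false;            // reached from a root in the current cycle
};

struct ToyHeap {
    std::vector<Obj> objs;

    std::size_t alloc() { objs.emplace_back(); return objs.size() - 1; }

    // Explicit delete: take the object out of the reclamation pool right away.
    void explicit_delete(std::size_t i) { objs[i].freed = true; objs[i].refs.clear(); }

    // Mark everything reachable from object i.
    void mark(std::size_t i) {
        if (objs[i].freed || objs[i].marked) return;
        objs[i].marked = true;
        for (std::size_t r : objs[i].refs) mark(r);
    }

    // Mark from the roots, then sweep. Returns how many objects the sweep reclaimed.
    std::size_t collect(const std::vector<std::size_t>& roots) {
        for (Obj& o : objs) o.marked = false;
        for (std::size_t r : roots) mark(r);
        std::size_t reclaimed = 0;
        for (Obj& o : objs)
            if (!o.freed && !o.marked) { o.freed = true; o.refs.clear(); ++reclaimed; }
        return reclaimed;
    }
};
```

Objects that were explicitly deleted are skipped by both phases, which
is the small saving described above.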

>> Another point to make is that a GC mechanism almost undoubtedly
>> performs better if it is aided by the compiler, i.e., if it knows
>> about the RTTI information and the layout of objects, and whether
>> an object is really a pointer or not; chasing down pointers and
>> references on the stack and heap then becomes much simpler, with
>> no need for guessing, and allows the GC to be more aggressive.

> I can't see why it needs any help from the compiler. I certainly do
> see why it needs support from ::operator new and friends. That's why
> GC systems always come with global replacements for these functions.

Because then GC knows exactly what is, and what is not, a pointer.
It has total knowledge of what is contained within any object,
including (possibly self-referential) pointers.

Current GC mechanisms that don't have access to the C++ symbolic
information (RTTI) typically have to make some guesses about what
constitutes a "real" pointer, and usually err on the conservative
side; which means that there is the possibility that some dead
objects will never get reclaimed.

> > (BTW, objects that contain pointers to themselves can be handled
> > as special cases, as long as the GC mechanism has knowledge of the
> > class layouts.)
>
> Or even without such knowledge, so long as it knows how big the
> individual blocks are. But what about circular references:

Not a problem if the GC knows what is contained within an object.
It's more of a problem if it has to guess.

> Another complication:
 [snip]
> The first sizeof(void*) bytes of buff[] might happen to contain a bit
> pattern that matches an address inside one of our collectable objects.
> If the GC doesn't know that this is not an address, that object
> remains in memory until the bit pattern changes.

Exactly my point for suggesting that the compiler give the GC some
symbolic information about the physical layout of classes.
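
A small sketch of the guessing problem: a conservative scanner that has
no layout information must treat any word whose value falls inside a
heap block as a possible pointer. The Block struct and both function
names below are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical description of one heap block known to the collector.
struct Block { std::uintptr_t start; std::size_t size; };

// Conservative test: does this word look like an address inside some block?
bool looks_like_pointer(std::uintptr_t word, const std::vector<Block>& heap) {
    for (const Block& b : heap)
        if (word >= b.start && word - b.start < b.size) return true;
    return false;
}

// Scan a raw buffer one word at a time and count the words that would
// keep a block alive, whether or not they were ever meant as pointers.
std::size_t count_apparent_pointers(const unsigned char* buf, std::size_t len,
                                    const std::vector<Block>& heap) {
    std::size_t hits = 0;
    for (std::size_t off = 0; off + sizeof(std::uintptr_t) <= len;
         off += sizeof(std::uintptr_t)) {
        std::uintptr_t word;
        std::memcpy(&word, buf + off, sizeof word);
        if (looks_like_pointer(word, heap)) ++hits;
    }
    return hits;
}
```

A character buffer whose first bytes happen to spell such a value is
counted as a reference, so the block it "points" into is retained.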

-- David R. Tribble, dtribble@technologist.com --

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/08/07 Raw View

In article <35CA655D.DCB73315@ix.netcom.com>, "Paul D. DeRocco"
<pderocco@ix.netcom.com> wrote:
>There's one more color, which is for an object that is nested inside
>another. Realistically, a reference to a nested object needs to keep the
>entire enclosing object alive, since it probably isn't practical to
>deallocate the memory occupied by the enclosing object minus the
>enclosed object.

  I figure this is not needed, if one uses copy constructors for the GC
memory moving operations: The correct signal will then be automatically
forwarded to the subobjects. (So, because of this, the object oriented
structure of C++ may make writing GC's not so difficult if C++ is
augmented with some well-chosen features.)

  But I figure one finds out exactly what is needed when writing GC's in
C++ OO style.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <haberg@REMOVE.member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>

Author: David R Tribble <david.tribble@noSPAM.central.beasys.com>
Date: 1998/08/06 Raw View

AllanW@my-dejanews.com wrote:
> Since GC explicitly reclaims memory allocated by ::operator new and
> not yet freed, it is inconsistent with per-class ::operator deletes.
>
> X::operator new() and X::operator delete() go together; this is not
> mandated by the standard, but it's a practical reality. I would also
> hope that any class with its own X::operator new() would not
> participate in GC. In fact, that would be the easiest way to specify
> that class X is to be excluded from GC.

I.

GC doesn't need to address normally allocated and deallocated
objects, which use explicit new and delete calls, nor does it
need to address auto objects whose destructors get called when
they go out of scope.

The practical use of GC, however, is for situations where you have
objects that have been allocated by ::new() (or Foo::new()), but
which have not yet been deleted (either by ::delete() or
Foo::delete()), but which also have no pointers or references bound to
("pointing to") them.  In other words, unreferenced "lost" objects
that are candidates for reclamation.  Once the last pointer/reference
to a dynamically allocated object becomes null (or is destructed),
the object can no longer be explicitly deleted; it is lost.

However.  As I write this, I can envision a twisted scheme that
generates pointers using some form of pointer arithmetic, so that
even if an object's address is no longer stored in any
pointer/reference, it can be recomputed at a later time.  (Think
of saving the xor of two pointer values, and then recomputing
the original pointer value by xor'ing again later.)  This
sort of thing would severely complicate any C++ GC mechanism.
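
The xor trick can be sketched in a few lines. This is only an
illustration; xor_link, xor_recover and xor_demo are hypothetical
names, and std::uintptr_t is assumed to be available.

```cpp
#include <cassert>
#include <cstdint>

// Combine two addresses into one integer "link". Once the original
// pointers are dropped, neither address is stored anywhere as a pointer.
inline std::uintptr_t xor_link(const void* a, const void* b) {
    return reinterpret_cast<std::uintptr_t>(a) ^ reinterpret_cast<std::uintptr_t>(b);
}

// Recover the hidden address from the link and the other, still-known address.
inline void* xor_recover(std::uintptr_t link, const void* known) {
    return reinterpret_cast<void*>(link ^ reinterpret_cast<std::uintptr_t>(known));
}

// Hide a heap pointer, drop it, then recompute it and use the object.
inline bool xor_demo() {
    int* hidden = new int(42);
    int anchor = 0;
    std::uintptr_t link = xor_link(hidden, &anchor);
    hidden = nullptr;   // a scanning GC now finds no pointer to the int
    int* back = static_cast<int*>(xor_recover(link, &anchor));
    bool ok = (*back == 42);
    delete back;
    return ok;
}
```

Between the assignment of nullptr and the call to xor_recover, a
collector that scans memory for pointer values would consider the int
unreachable, even though the program can still get it back.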

II.

A different argument for adding GC to C++ involves changing the
language to act more like other languages that have built-in GC,
such as Eiffel and Java.  In this argument, we completely remove
destructors from the language, and let all of our objects be
managed by GC.  We don't explicitly delete objects, we simply
stop referencing them.

The question arises, then, of what to do about cleaning up
resources attached to reclaimed objects (such as open files).
Java's answer is to provide a 'finalize' method that gets executed
whenever the object is reclaimed, subject to some restrictions.
We could, likewise, redefine a C++ destructor to mean something
like a 'finalize' function.  We could also, for convenience,
define the delete operator to perform object reclamation on
demand, allowing for some control over when GC gets done.

But following this line of argument leads to a different color of
C++ than what we have today.

-- David R. Tribble, dtribble@technologist.com --

Author: AllanW@my-dejanews.com
Date: 1998/08/07 Raw View

Schemes that complicated are "contrived" in the sense that they would
only be written to prove that GC can't work in the general case (which
isn't disputed -- is it?).

However, I can think of less-contrived examples. Suppose we're on an
Intel 16-bit machine; although general-purpose addresses require 32
bits of storage, it's also true that for most addresses allocated from
the heap the high 16 bits will take on a limited number of values. If
we assume that there are 16 or fewer of these, and also assume that all
valid objects are aligned on 16-byte boundaries, then we might do
something like this:

    #include <cassert>

    class AddressMunge {
        unsigned short address;            // high 12 bits: offset; low 4 bits: segment index
        static unsigned long segment[16];  // the distinct high-16-bit values seen so far
    public:
        class segment_list_full {}; // Exception
        explicit AddressMunge(void*a=0) { (*this)=a; }
        AddressMunge& operator=(void*a) {
            if (!a) { address=0xFFFF; return *this; }
            unsigned long seg = (unsigned long)a;
            assert(0 == (seg & 0xF)); // Can't store unaligned addresses
            address = (unsigned short)seg;
            seg &= 0xFFFF0000;
            assert(a == (void*)(seg | address));
            for (int i=0; i<16; ++i) {
                if (!segment[i]) segment[i]=seg;  // claim the first free slot
                if (segment[i]==seg) { address |= i; return *this; }
            }
            throw segment_list_full();
        }
        operator void*() {
            if (0xFFFF == address) return 0;
            return (void*)(segment[address&0xF] | (address & 0xFFF0));
        }
    };
    unsigned long AddressMunge::segment[16] = { 0 };

Now we can save a list of munged addresses compactly:

    int main(int,char**) {
        AddressMunge hash_table[32701];
        Data * data;
        while (0 != (data = readData()))
            hash_table[hash(data)] = data;
        // ...
        return 0;
    }

> II.
>
> A different argument for adding GC to C++ involves changing the
> language to act more like other languages that have built-in GC,
> such as Eiffel and Java.  In this argument, we completely remove
> destructors from the language, and let all of our objects be
> managed by GC.  We don't explicitly delete objects, we simply
> stop referencing them.

Yes; the only GC I know anything about for C++ (Great Circle) can
be used in this mode.

> The question arises, then, of what to do about cleaning up
> resources attached to reclaimed objects (such as open files).
> Java's answer is to provide a 'finalize' method that gets executed
> whenever the object is reclaimed, subject to some restrictions.
> We could, likewise, redefine a C++ destructor to mean something
> like a 'finalize' function.

It seems to me that this is already what it means, with the sole
difference being that it's currently explicit (on the heap) or at
least very visible (for auto objects).

Presumably with GC, destructors would still be called for auto
objects that go out of scope, so idioms that reflect "automatic
cleanup" would still work effectively when used in that way.
Likewise when created on the heap in the usual way (i.e. we call
delete when done).

The trouble, as I see it, is that there's no distinction between
these two very different cases:
    * Create something on the heap, when you never mean to explicitly
      destroy it.  We use it while it's needed, and rely on GC to
      clean up after us. The point is that we can't make the error of
      deleting an object that's still in use, or of deleting the same
      object twice, if we let GC deal with it -- and yet, if GC is
      everything it hopes to be, the automatic delete won't be very
      much later than the programmer could have accomplished anyway.
      The only disadvantage is that you can't do anything meaningful
      in the destructor, which won't be called.

    * Create something on the heap, with the intent of explicitly
      destroying it. But due to a programming error, it's never
      destroyed. Since the intent was to explicitly delete it, there
      may very well be a meaningful destructor. GC can reclaim the
      memory, but unless it finds a way to call the destructor it
      won't close the files, or release the lock, or terminate the
      Internet session cleanly, or calculate the grand totals...

How do you support the first case (feature) but not the second (error)?
Can you get the address of the destructor without any performance hits?
If not, do you simply disallow meaningful destructors in any program
that uses GC? Or do you try to put Garbage-Collectable objects in a
separate "arena" so that it doesn't apply to the general case?

> We could also, for convenience,
> define the delete operator to perform object reclamation on
> demand, allowing for some control over when GC gets done.
>
> But following this line of argument leads to a different color of
> C++ than what we have today.

Which isn't necessarily bad, but I don't want to jump too quickly.
Let's make sure we don't accidentally give anyone the impression
that they'll get something for nothing.

--
AllanW@my-dejanews.com is a "Spam Magnet" -- never read.
Please reply in USENET only, sorry.

-----== Posted via Deja News, The Leader in Internet Discussion ==-----
http://www.dejanews.com/rg_mkgrp.xp   Create Your Own Free Member Forum


Author: AllanW@my-dejanews.com
Date: 1998/08/08 Raw View

David Tribble (dtribble@technologist.com) wrote:
> Because then GC knows exactly what is, and what is not, a pointer.
> It has total knowledge of what is contained within any object,
> including (possibly self-referential) pointers.

If it could always tell that, GC would be almost trivial.

But it can't, so it isn't.

> Current GC mechanisms that don't have access to the C++ symbolic
> information (RTTI) typically have to make some guesses about what
> constitutes a "real" pointer, and usually err on the conservative
> side; which means that there is the possibility that some dead
> objects will never get reclaimed.

Right.

I think you're proposing a change to the language definition, to
provide hooks for GC. This way it would always be able to sift
through all memory looking for pointers, and being certain if some
data is a pointer or not.

One could imagine having the ability to implement class *, which
would be the root class of all pointers. Then it would be a simple
matter to create a new member variable to implement a linked list.
You would be able to walk the entire list of pointers any time you
needed to, finding out which ones point where.

Come to think of it, you wouldn't really need the linked list. If
you could detect whenever a pointer was initialized, changed, or
destroyed, then you could use this information to maintain reference
counts to the objects pointed to, and delete them when the count
reached 0.

But C++ was designed to implement object-oriented concepts without
the overhead traditionally associated with such systems. Marking
pointers in this manner would lead to much slower implementations
of pointer operations, perhaps even calling a function through a
pointer.  We would take a hit on projects that wouldn't benefit
from GC, or simply don't want it for historic reasons.

--
AllanW@my-dejanews.com is a "Spam Magnet" -- never read.
Please reply in USENET only, sorry.

Author: AllanW@my-dejanews.com
Date: 1998/08/08 Raw View

    #include <iostream>

    struct little {
        int a;
        // ...
    };
    struct big : public little {
        char buffer[128];
        // ...
    };

    int main(int,char**) {
        little *l = new little;
        big *b = new big();   // value-initialized, so buffer starts out empty
        // ...
        l = b;
        // At this point, GC is free to reclaim the "little" object
        // because there are no more references to it.
        b = 0;
        // At this point, the only reference to the "big" object
        // is through a "little" pointer. Are you suggesting that
        // it's okay to delete the "big" object, except for the
        // "little" subobject?
        b = static_cast<big*>(l);   // the downcast must be explicit
        std::cout << b->buffer << std::endl;
        return 0;
    }

>   But I figure one finds out exactly what is needed when writing GC's in
> C++ OO style.

--
AllanW@my-dejanews.com is a "Spam Magnet" -- never read.
Please reply in USENET only, sorry.

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/08/09 Raw View

In article <6qg69f$626$1@nnrp1.dejanews.com>, AllanW@my-dejanews.com wrote:
>But C++ was designed to implement object-oriented concepts without
>the overhead traditionally associated with such systems. Marking
>pointers in this manner would lead to much slower implementations
>of pointer operations, perhaps even calling a function through a
>pointer.  We would take a hit on projects that wouldn't benefit
>from GC, or simply don't want it for historic reasons.

  So that is why I suggested extending C++ so that GC's can be
implemented conveniently and efficiently, rather than settling for a C++
with a global GC affecting everything:

  Then, a traditional C++ pointer or reference would just be a pointer, or
"do nothing" from the GC point of view. If one needs a pointer that should
be traced by the GC, it is easy to write a special class for such a
pointer. (I think the discussion in this thread overlooks the fact that if
the automatic objects behave as if they turn over references, then there
is no need for pointers, as one then always can use automatic objects
instead.) Then such pointers can be conveniently traced by the GC by using
C++ copy constructors of such classes.

  Since different types of objects may need to use a different GC, I think
this is the only way to go.
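
  A rough sketch of such a "special class" for traced pointers,
assuming a hypothetical global registry (traced_registry) and a
hypothetical class template TracedPtr. Plain pointers stay untouched
and cost nothing; only TracedPtr objects announce themselves through
their constructors and destructor.

```cpp
#include <cassert>
#include <set>

// Hypothetical registry holding the address of every live traced pointer.
std::set<const void*>& traced_registry() {
    static std::set<const void*> registry;
    return registry;
}

template <class T>
class TracedPtr {
    T* p_;
public:
    explicit TracedPtr(T* p = nullptr) : p_(p) { traced_registry().insert(this); }
    // Copying creates a new pointer object, which registers itself too.
    TracedPtr(const TracedPtr& other) : p_(other.p_) { traced_registry().insert(this); }
    // Assignment changes the target but not the identity of this pointer object.
    TracedPtr& operator=(const TracedPtr& other) { p_ = other.p_; return *this; }
    ~TracedPtr() { traced_registry().erase(this); }

    T& operator*() const { return *p_; }
    T* get() const { return p_; }
};
```

A collector could then walk traced_registry() to find exactly the
pointers it is supposed to follow, without scanning anything else.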

>I think you're proposing a change to the language definition, to
>provide hooks for GC. This way it would always be able to sift
>through all memory looking for pointers, and being certain if some
>data is a pointer or not.

  I suggested this already, in the form that one writes
    class Data {
        use Root root;
        ...
    };
Then the class Root object "root" keeps track of class Data objects when
they are part of the root set that the GC needs to trace.

>One could imagine having the ability to implement class *, which
>would be the root class of all pointers. Then it would be a simple
>matter to create a new member variable to implement a linked list.
>You would be able to walk the entire list of pointers any time you
>needed to, finding out which ones point where.

  So I already suggested that.

>Come to think of it, you wouldn't really need the linked list. If
>you could detect whenever a pointer was initialized, changed, or
>destroyed, then you could use this information to maintain reference
>counts to the objects pointed to, and delete them when the count
>reached 0.

  Reference counts are slow, and if one creates self-referential objects,
the reference count cannot be used to remove that object properly.
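
  The cycle problem is easy to demonstrate. The sketch below uses
std::shared_ptr, a reference-counted pointer that was added to the
standard library well after this thread, purely as a stand-in for any
reference-counting scheme; Node and the two function names are
hypothetical.

```cpp
#include <cassert>
#include <memory>

struct Node {
    std::shared_ptr<Node> next;
    static int live;            // number of Node objects currently alive
    Node()  { ++live; }
    ~Node() { --live; }
};
int Node::live = 0;

// Two nodes that point at each other: the counts never reach zero.
// Returns how many nodes are still alive after all outside pointers are gone.
int leaked_after_cycle() {
    int before = Node::live;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;
        b->next = a;            // closes the cycle
    }
    return Node::live - before;
}

// The same two nodes in a simple chain are released as expected.
int leaked_without_cycle() {
    int before = Node::live;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;
    }
    return Node::live - before;
}
```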

  But you may use ref counts to keep track of the root set: Whenever an
element is dynamically allocated set the ref count to zero. Then those
elements with ref count not zero are in the root set, and can be traced by
the GC. (But in this variation, ref count zero does not imply that the
object can be removed.)

  Because all other variations seem to be slow, one wants the C++
compiler to keep track of the root set. So the Root class would search
through the appropriate locations, global, stacked and temporary. If the
writer of the C++ compiler knows about this, those elements could be put
in special locations, thus speeding up the search.

  Otherwise, I gave a suggestion for implementing a conservative GC with a
class keeping track of the root set: Just stack the "this" pointers of a
class Data; this class Data then also has a pointer to the location where
its "this" pointer is stored, and when ~Data is called, this stacked
pointer is set to 0. When popped, the stack only skips back over 0
pointers. You then get the root set in this stack, if you make sure
dynamic allocations are not stacked (which can be done by writing a new
"T::operator new" I think).

  But I think this too is too slow; too many operations: Check stack is
not full, push "this" pointer onto stack, set pointer to stack location,
set "this" pointer 0 when destructor is called. But you may get the feel
of writing a conservative GC using C++ OO features.
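
  The stacking scheme just described might look roughly like this.
It is a sketch under assumptions: root_stack is a hypothetical global,
an index is stored instead of a pointer to the slot so that growth of
the vector cannot invalidate it, and the part that keeps dynamically
allocated objects off the stack is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class Data;

// Hypothetical root stack: each slot holds the "this" pointer of a live
// object, or null once that object has been destroyed.
std::vector<Data*>& root_stack() {
    static std::vector<Data*> slots;
    return slots;
}

class Data {
    std::size_t slot_;  // index of this object's slot in the root stack

    void push_self() {
        slot_ = root_stack().size();
        root_stack().push_back(this);
    }
public:
    Data() { push_self(); }
    Data(const Data&) { push_self(); }              // a copy is a new root with its own slot
    Data& operator=(const Data&) { return *this; }  // the slot is identity and is never copied
    ~Data() {
        std::vector<Data*>& s = root_stack();
        s[slot_] = nullptr;                          // set the stacked pointer to 0
        while (!s.empty() && s.back() == nullptr)    // pop back over cleared slots only
            s.pop_back();
    }
};
```

The non-null entries of root_stack() are then the current root set.
The cost is visible in the code: a push per construction and a store
plus a possible pop loop per destruction.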

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <haberg@REMOVE.member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/08/04 Raw View

In article <35C717C6.2781@wizard.net>, James Kuyper <kuyper@wizard.net> wrote:

>>   Perhaps the implementer of a GC should be able to ensure that when ~T() is
>> called, "operator delete" is not called.
>
>The reason I ask is that the corrected sentence describes the current
>definition of C++; there is no "Perhaps" about it, any implementor of
>C++ must ensure it.

  I think you are possibly confusing "operator delete" with "delete p" here:

  We are speaking about destroying dynamically allocated objects, which in
C++ is done by using "delete p", in which case first ~T() is called
whereafter "operator delete" is called to deallocate memory. Now, with a
GC in place, the GC could in principle decide to call ~T() at one time but
"operator delete" at another time. But one can just set "operator
delete" to do nothing, so it should be no problem, except that one gets a
lot of "do nothing functions to be called", which takes time. (With a
conservative GC, the "operator delete" is going to do nothing to the user,
but not necessarily to the GC, because I figure one can use it to do the
memory moving operations at GC time.)
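
  To illustrate the distinction between "delete p" and "operator
delete", here is a small sketch that records the two steps. The trace
function and the two destroy functions are hypothetical helpers; T is a
placeholder class with a member operator delete that forwards to the
global one.

```cpp
#include <cassert>
#include <new>
#include <string>
#include <vector>

// Hypothetical event log used only to show the order of the steps.
std::vector<std::string>& trace() {
    static std::vector<std::string> t;
    return t;
}

struct T {
    ~T() { trace().push_back("~T"); }
    static void operator delete(void* p) {
        trace().push_back("operator delete");
        ::operator delete(p);
    }
};

// "delete p" runs the destructor and then operator delete, back to back.
void destroy_together() {
    T* p = new T;
    delete p;
}

// A collector could in principle run the same two steps at different times.
void destroy_separately() {
    T* p = new T;
    p->~T();                           // finalize the object now
    trace().push_back("later");        // arbitrary work happens in between
    T::operator delete(p);             // release the storage at collection time
}
```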

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>

Author: AllanW@my-dejanews.com
Date: 1998/08/04 Raw View

In article <35C6799D.75077097@ix.netcom.com>,
  "Paul D. DeRocco" <pderocco@ix.netcom.com> wrote:
[snip]
> The combination of the above two points leads me to feel that garbage
> collection is only appropriate for objects that have no meaningful
> destruction semantics. This is a large category of objects, however,
> including such things as strings.

Indeed, strings are the textbook example, since almost every
implementation of BASIC has garbage-collected strings. But BASIC is also
widely criticized as being slow, and the traditional thinking is that
this is caused by GC strings. (Thus the phrase/pun, "garbage collection
stinks.") This probably isn't quite fair, but it is at least part of the
reason that GC has a stigma.

In the past, the real problem was that GC really was slow; I expect that
use of new techniques, and Virtual Memory on most platforms, reduces
this problem considerably. Also, GC happened at random moments, but
generally the worst ones for perceived performance. (For instance, in
BASIC, the statement
    INPUT "Type your name:", NAME$
is likely to trigger GC right AFTER the input, on small-memory
computers.) Triggering GC during I/O would probably help a lot. Explicit
GC triggers would also help. For instance, the hypothetical
    CALL GC_NOW
    INPUT "Type your name:", NAME$
would look a lot faster, since the GC would occur while the user was
still typing.

Author: James Kuyper <kuyper@wizard.net>
Date: 1998/08/04 Raw View

markw65@my-dejanews.com wrote:
>
> In article <6q2jit$d2j$1@shell7.ba.best.com>,
>   ncm@nospam.cantrip.org (Nathan Myers) wrote:
> > Hans Aberg<haberg@REMOVE.matematik.su.se> wrote:
> > >(Nathan Myers) wrote:
> > >>The time and order of destruction of unbound temporary objects
                                           ^^^^^^^
> > >>is well-specified: they are destroyed at the end of the "containing
> > >>expression" in reverse order of construction.  Bound temporaries
> > >>have the lifetime of the reference they were bound to.
> > >
> > >  Is the C++ standard written so that one can always ensure that all
> > >automatic and temporary objects taken together are released in the reverse
> > >order they are created? This sounds unlikely though, but the reason I ask
> > >is that if so, the feature might be used for keeping track of the root set
> > >when implementing a GC.
> >
> > However unlikely it may seem, it's true.
>
> Actually, I don't think it is... and the quote above about bound temporaries is
> why.
>
> According to 12.2 (Temporary Objects) paragraph 5, if a reference is bound to
> a temporary, the lifetime of the temporary extends to the end of the scope,
> or the end of the lifetime of the reference, whichever is shorter.
>
> If the bound temporary was created from another temporary, then the order of
> destruction will not be the reverse of the order of construction:

Nathan did specifically restrict his initial statement about reverse
order of construction to unbound temporaries.

> void foo()
> {
>  const Y &y = Y(X(1)); // obvious declarations skipped
>  // ...
> }
>
> So we first create an X, then create a Y and bind y to it. Now we destroy the
> X at the end of the full expression containing it, but don't destroy y until
> we exit from foo.
>
> Or did I miss something

Section 6.6 says that "destructors are called for all constructed
objects with automatic storage duration (named objects or temporaries)
that are declared in that scope in the reverse order of their
declaration", not in the reverse order of their creation. I think that
"or temporaries" is inconsistent with "declared in that scope". What is
the point of declaration for a temporary? Could anyone clarify this?
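
The quoted foo() example can be made observable by filling in the
"obvious declarations" with logging constructors and destructors. The
events log and the bodies of X and Y below are assumptions added for
illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical event log recording construction and destruction order.
std::vector<std::string>& events() {
    static std::vector<std::string> e;
    return e;
}

struct X {
    explicit X(int) { events().push_back("X"); }
    ~X() { events().push_back("~X"); }
};

struct Y {
    explicit Y(const X&) { events().push_back("Y"); }
    ~Y() { events().push_back("~Y"); }
};

void foo() {
    // The X temporary dies at the end of this full-expression; the Y
    // temporary is bound to y and lives until foo returns.
    const Y& y = Y(X(1));
    (void)y;
    events().push_back("body");
}
```

After a call to foo(), the log reads X, Y, ~X, body, ~Y: the X is
constructed first and also destroyed first, so destruction is not the
reverse of construction here.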

Author: tom_lippincott@advisories.com (Tom Lippincott)
Date: 1998/08/04 Raw View

Hans Aberg<haberg@REMOVE.matematik.su.se> wrote:

>   Is the C++ standard written so that one can always ensure that all
> automatic and temporary objects taken together are released in the reverse
> order they are created?

The temporaries created by return and throw statements outlive automatic
objects created in nearby scopes.
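
A short sketch of the throw case, with hypothetical types Local and
Error that merely count themselves. It checks that, by the time the
handler runs, the automatic object in the throwing scope is gone while
the exception object is still alive.

```cpp
#include <cassert>

struct Local {
    static bool alive;
    Local()  { alive = true; }
    ~Local() { alive = false; }
};
bool Local::alive = false;

struct Error {
    static int live;                  // number of Error objects currently alive
    Error()             { ++live; }
    Error(const Error&) { ++live; }
    ~Error()            { --live; }
};
int Error::live = 0;

void thrower() {
    Local guard;      // automatic object in the throwing scope
    throw Error();    // the exception object is created here, after guard
}

bool exception_outlives_local() {
    bool seen = false;
    try {
        thrower();
    } catch (const Error&) {
        // Unwinding has already destroyed guard; the exception object remains.
        seen = !Local::alive && Error::live == 1;
    }
    return seen && Error::live == 0;  // it is destroyed when the handler exits
}
```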

                                             --Tom Lippincott

Author: AllanW@my-dejanews.com
Date: 1998/08/05 Raw View

In article <haberg-0408982031330001@sl34.modempool.kth.se>,
  haberg@REMOVE.matematik.su.se (Hans Aberg) wrote:
> In article <35C717C6.2781@wizard.net>, James Kuyper <kuyper@wizard.net> wrote:
>
> >>   Perhaps the implementer of a GC should be able to ensure that
> >> when ~T() is called, "operator delete" is not called.
> >
> >The reason I ask is that the corrected sentence describes the current
> >definition of C++; there is no "Perhaps" about it, any implementor of
> >C++ must ensure it.
>
>   I think you are possibly confusing "operator delete" with "delete p" here:
>
>   We are speaking about destroying dynamically allocated objects, which in
> C++ is done by using "delete p", in which case first ~T() is called
> whereafter "operator delete" is called to deallocate memory. Now, with a
> GC in place, the GC could in principle decide to call ~T() at one time but
> "operator delete" at another time. But one can just set "operator
> delete" to do nothing, so it should be no problem, except that one gets a
> lot of "do nothing functions to be called", which takes time. (With a
> conservative GC, the "operator delete" is going to do nothing to the user,
> but not necessarily to the GC, because I figure one can use it to do the
> memory moving operations at GC time.)

Since GC explicitly reclaims memory allocated by ::operator new and not
yet freed, it is inconsistent with per-class ::operator deletes.

X::operator new() and X::operator delete() go together; this is not
mandated by the standard, but it's a practical reality. I would also
hope that any class with its own X::operator new() would not
participate in GC. In fact, that would be the easiest way to specify
that class X is to be excluded from GC.
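
A small sketch of such a pair, for readers who have not written one. The class X below and the trace log are made up for this illustration, and malloc/free merely stand in for whatever allocator the class would really use. It also shows the sequence discussed above: "delete p" first runs ~X() and only then calls X::operator delete.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>
#include <vector>

// Hypothetical trace log; its own storage comes from the global ::operator new.
std::vector<std::string>& trace() {
    static std::vector<std::string> t;
    return t;
}

struct X {
    X()  { trace().push_back("X()"); }
    ~X() { trace().push_back("~X()"); }

    // Class-specific allocation: used for "new X" instead of ::operator new.
    static void* operator new(std::size_t size) {
        assert(size >= sizeof(X));
        trace().push_back("X::operator new");
        if (void* p = std::malloc(size)) return p;
        throw std::bad_alloc();
    }

    // The matching deallocation: called by "delete p" after ~X() has run.
    static void operator delete(void* p) {
        trace().push_back("X::operator delete");
        std::free(p);
    }
};

std::vector<std::string> run_trace() {
    trace().clear();
    X* p = new X;   // X::operator new, then X()
    delete p;       // ~X(), then X::operator delete
    return trace();
}
```

A collector that only replaces the global ::operator new never sees the memory behind such a class, which is why the pair could double as an opt-out marker.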


Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/08/05 Raw View

  I'd like to round off the GC (garbage collecting) discussions in this
thread by posting a call for participation in a symposium on memory
management. (I got this list from the Haskell <http://haskell.org/> mailing
list.)

  From the subject of this list one can see that many of the topics
discussed in this thread, raised as GC pros or cons, in fact are under
intense study by the experts on this subject.

  I think it would be great if C++ is extended so that it is possible to
conveniently implement these GC techniques efficiently in the form of
libraries which can be used by the programs that need them in the ways
they need them.

  The C++ extensions need not be that dramatic: A convenient way to locate
the root set. Objects may need to have "colors" indicating their runtime
dynamics: Global, stacked, temporary, dynamically allocated. This could
then be used to write special copy constructors if that is used at GC time
for say tracing the pointers. In addition, some special features may be
needed in order to make the GC implementation as efficient as possible.

  Those are just a few things that I would want to have if I had to
write my own conservative GC. Experts on GC's can probably pinpoint the
exact details of what is needed.

-----------------------------------------------------------
          Call for participation

   International Symposium on Memory Management 1998

    Sat 17th - Mon 19th October 1998, Vancouver

        Sponsored by ACM SIGPLAN
        Co-located with OOPSLA

        Full details at:
    http://www.sfu.ca/~burton/ismm98.html

    *****************************************
    *     YOU CAN REGISTER NOW ON THIS URL  *
    *****************************************

The International Symposium on Memory Management is a forum for
research in memory management, especially garbage collection and
dynamic storage allocators. Areas of interest include but are not
limited to: garbage collection, dynamic storage allocation, storage
management implementation techniques and their interactions with
language and OS implementation, and empirical studies of programs'
memory allocation and referencing behavior.


Accepted papers
~~~~~~~~~~~~~~~~
A Compacting Incremental Collector and its Performance in a Production
Quality Compiler, Martin Larose and Marc Feeley, Universite de
Montreal

Combining Card Marking with Remembered Sets: How to Save Scanning
Time, Alain Azagury, Eliot Kolodner, Erez Petrank and Zvi Yehudai, IBM
Haifa Research Laboratory

Barrier techniques for Incremental Tracing, Pekka P. Pirinen,
Harlequin

The Memory Fragmentation Problem: Solved?, Mark S. Johnstone and Paul
R.Wilson, University of Texas at Austin

Using Generational Garbage Collection to Implement Cache-Conscious
Data Placement, Trishul M. Chilimbi and James R. Larus, University of
Wisconsin-Madison

One-bit Counts between Unique and Sticky, David J. Roth and David
S. Wise, Indiana University

Hierarchical Distributed Reference Counting, Luc Moreau, University of
Southampton

Comparing Mostly-Copying and Mark-Sweep Conservative Collection,
Frederick Smith and Greg Morrisett, Cornell University

A Non-Fragmenting Non-Copying Garbage Collector, Gustavo
Rodriguez-Rivera, Michael Spertus and Charles Fiterman, Geodesic
Systems

Garbage Collection in Generic Libraries, Gor V. Nishanov and Sibylle
Schupp, Rensselaer Polytechnic Institute

Memory Management for Prolog with Tabling, Bart Demoen and
Konstantinos Sagonas, Katholieke Universiteit Leuven

The Bits Between the Lambdas - Binary Data in a Lazy Functional
Language, Malcolm Wallace and Colin Runciman, University of York

A Memory-Efficient Real-Time Non-Copying Garbage Collector, Tian
F. Lim, Przemyslaw Pardyak and Brian N. Bershad, University of
Washington

Guaranteeing Non-Disruptiveness and Real-Time Deadlines in an
Incremental Garbage Collector, Fridtjof Siebert

A Study of Large Object Spaces, Michael W. Hicks, Luke Hornof,
Jonathan T. Moore and Scott M. Nettles, University of Pennsylvania

Portable Run-Time Type Description for Conventional Compilers, Sheetal
V. Kakkad, Mark S. Johnstone and Paul R. Wilson, University of Texas
at Austin and Somerset Design Center, Motorola Inc.

Compiler Support to Customize the Mark and Sweep Algorithm, Dominique
Colnet, Philippe Coucaud and Olivier Zendra, INRIA-CNRS-Universite
Henri Poincare

Very Concurrent Mark-&-Sweep Garbage Collection without Fine-Grain
Synchronization, Lorenz Huelsbergen and Phil Winterbottom, Bell
Laboratories

Memory Allocation for Long-Running Server Applications, Per-Ake Larson
and Murali Krishnan, Microsoft

-----------------------------------------------------------

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>



Author: David R Tribble <david.tribble@noSPAM.central.beasys.com>
Date: 1998/08/05 Raw View

<AllanW@my-dejanews.com> wrote:
>> But suppose, hypothetically, that someone did overcome this hurdle,
>> creating a garbage collector that called the destructor for all
>> objects. Would you still object to it?

Nathan Myers wrote:
> I don't object to garbage collectors or to garbage collection.
> In fact, I depend on it daily in my various scripting languages.
> I only object to inflated claims made for GC, and for languages
> that depend on it.
>
> A garbage collector for C++ that called destructors would be more
> or less generally useful depending on how much control it offered
> over when those destructors ran.  Of course generality is not always
> necessary, except in a language standard.

I'm certainly no expert on the theory of GC, but...

If I recall correctly, many of the more successful GC algorithms
use a "mark and sweep" strategy, periodically marking "dead" objects
that are no longer referenced, and then sweeping (reclaiming) their
storage.

That being the case, it would seem reasonable for a GC mechanism
in C++ to allow delete() to "mark" the object's memory block after
its destructor was called.  This would apply to dynamically
allocated objects (on the heap).  Other (static and auto) objects
would be destructed, as usual, when their names go out of scope,
which also calls delete().

The only case left is that of dynamic objects (which are created
by explicit calls to new()) that are not explicitly deleted, but
have no active references to them (also known as "dead" objects).
It is this category of objects (a.k.a., "memory leaks") that would
get the most benefit from GC.  The only complication is when to
execute the destructor for such an object; its "dead" status
might not be detected until quite some time after the last pointer
to it was changed.  But maybe this isn't such a problem, since it
only occurs for dynamic objects (which cannot "go out of scope").

Another point to make is that a GC mechanism almost undoubtedly
performs better if it is aided by the compiler, i.e., if it knows
about the RTTI information and the layout of objects, and whether
an object is really a pointer or not; chasing down pointers and
references on the stack and heap then becomes much simpler, with
no need for guessing, and allows the GC to be more aggressive.

(BTW, objects that contain pointers to themselves can be handled
as special cases, as long as the GC mechanism has knowledge of the
class layouts.)

-- David R. Tribble, dtribble@technologist.com --
-- Win95: Start me up... You make a grown man cry...


Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/08/05 Raw View

In article <6q84oa$f9q$1@nnrp1.dejanews.com>, AllanW@my-dejanews.com wrote:
>Since GC explicitly reclaims memory allocated by ::operator new and not
>yet freed, it is inconsistent with per-class ::operator deletes.

  I speak about local "Base::operator new" and "Base::operator delete".

  But yes, it would be good if the C++ mandatory "operator delete" could
be avoided when using a conservative GC. But a GC can be incremental, too,
in which case one may want to use "operator delete".

>X::operator new() and X::operator delete() go together; this is not
>mandated by the standard, but it's a practical reality.

  I thought it was mandated by the standard.

>I would also
>hope that any class with its own X::operator new() would not
>participate in GC. In fact, that would be the easiest way to specify
>that class X is to be excluded from GC.

  Well, I think the opposite case would be the best, because if you use
handles pointing to a heap, then the handles and the heap would use a
different GC.

  Besides, a program may use several different GC's for the objects on the
heap depending on the nature of the object.

  In addition, for such a GC to work, the program must use the heap in a
handle safe way, and some objects may not use handles at all.

  So the way C++ is now, I think the best is that programs use the
traditional "::operator new/delete", except for the special GC techniques
localized to class hierarchies.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <haberg@REMOVE.member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>


Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/08/05 Raw View

In article <35C797A7.1777@noSPAM.central.beasys.com>,
dtribble@technologist.com wrote:
>If I recall correctly, many of the more successful GC algorithms
>use a "mark and sweep" strategy, periodically marking "dead" objects
>that are no longer referenced, and then sweeping (reclaiming) their
>storage.

  Yes, this is the technique I try to point out that one might use: With
the root set in hand, one uses a variation of the copy constructor to put
forward the mark and sweep procedure. Then a suitably GC time altered
"operator delete" handles the memory moving operation.

>That being the case, it would seem reasonable for a GC mechanism
>in C++ to allow delete() to "mark" the object's memory block after
>its destructor was called.

  This is however not necessary as the GC implementer can easily create
the suitable classes. What is needed is that one can somehow get hold of
the root set, then it is not so difficult to implement the mark and sweep
procedure.

  Here I assume we are speaking about C++ admitting the convenient and
efficient implementation of GC's and not C++ with a global GC, as this is
not really feasible.

>Another point to make is that a GC mechanism almost undoubtedly
>performs better if it is aided by the compiler, i.e., if it knows
>about the RTTI information and the layout of objects, and whether
>an object is really a pointer or not; chasing down pointers and
>references on the stack and heap then becomes much simpler, with
>no need for guessing, and allows the GC to be more aggressive.

  To me it looks as though one would need special copy constructors and
"operator new/delete" at GC time (as opposed to the ones used at runtime
by the program): If C++ supports that, it could help speed up the GC.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <haberg@REMOVE.member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>


Author: AllanW@my-dejanews.com
Date: 1998/08/05 Raw View

In article <35C797A7.1777@noSPAM.central.beasys.com>,
  dtribble@technologist.com wrote:
> If I recall correctly, many of the more successful GC algorithms
> use a "mark and sweep" strategy, periodically marking "dead" objects
> that are no longer referenced, and then sweeping (reclaiming) their
> storage.
>
> That being the case, it would seem reasonable for a GC mechanism
> in C++ to allow delete() to "mark" the object's memory block after
> its destructor was called.  This would apply to dynamically
> allocated objects (on the heap).  Other (static and auto) objects
> would be destructed, as usual, when their names go out of scope,
> which also calls delete().

Take it a step further -- have delete() chain the object's memory
block into a linked list. That way, we would be able to find all of
the marked items quickly, without having to scan memory for them.

Come to think of it, we might as well have operator new() scan the
list to see if there's any memory blocks of about the right size.
That way, we won't have to allocate more memory from the OS if
there's a free block available.

Hmm, the search for new blocks might be more effective if the list
wasn't too fragmented. We could have delete() check the block just
before it, and the block just after it -- if one or both of them
is free, we could merge them into one larger block, making them
available for operator new().

What we've just described is the standard "heap" mechanism shipped
with most C++ run-time systems. This isn't Garbage Collection.
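
For concreteness, a toy version of that mechanism. This is only a sketch under simplifying assumptions: ToyHeap is a made-up name, it hands out offsets into an imaginary arena rather than real memory, and it keeps the free blocks in a std::map ordered by address instead of threading a linked list through the blocks themselves.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>

class ToyHeap {
public:
    static constexpr std::size_t npos = static_cast<std::size_t>(-1);

    explicit ToyHeap(std::size_t arena_size) { free_[0] = arena_size; }

    // First fit: reuse the first free block that is large enough,
    // splitting off whatever is left over.
    std::size_t allocate(std::size_t size) {
        for (auto it = free_.begin(); it != free_.end(); ++it) {
            if (it->second < size) continue;
            const std::size_t offset = it->first;
            const std::size_t rest = it->second - size;
            free_.erase(it);
            if (rest > 0) free_[offset + size] = rest;
            return offset;
        }
        return npos;  // a real heap would ask the OS for more memory here
    }

    // Give a block back and merge it with the free block just after it
    // and the free block just before it, if they are adjacent.
    void release(std::size_t offset, std::size_t size) {
        assert(size > 0);
        auto next = free_.lower_bound(offset);
        if (next != free_.end() && offset + size == next->first) {
            size += next->second;
            next = free_.erase(next);
        }
        if (next != free_.begin()) {
            auto prev = std::prev(next);
            if (prev->first + prev->second == offset) {
                prev->second += size;
                return;
            }
        }
        free_[offset] = size;
    }

    std::size_t free_block_count() const { return free_.size(); }

private:
    std::map<std::size_t, std::size_t> free_;  // offset -> size of each free block
};
```

Nothing here ever looks for blocks the program forgot to release, which is the point being made: it recycles what it is handed and nothing more.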

> The only case left is that of dynamic objects (which are created
> by explicit calls to new()) that are not explicitly deleted, but
> have no active references to them (also known as "dead" objects).
> It is this category of objects (a.k.a., "memory leaks") that would
> get the most benefit from GC.  The only complication is when to
> execute the destructor for such an object; its "dead" status
> might not be detected until quite some time after the last pointer
> to it was changed.  But maybe this isn't such a problem, since it
> only occurs for dynamic objects (which cannot "go out of scope").

As I understood it, this *IS* garbage collection. There's certainly
no need to "collect" objects on the stack, so the "new" objects are
the only ones that matter.

> Another point to make is that a GC mechanism almost undoubtedly
> performs better if it is aided by the compiler, i.e., if it knows
> about the RTTI information and the layout of objects, and whether
> an object is really a pointer or not; chasing down pointers and
> references on the stack and heap then becomes much simpler, with
> no need for guessing, and allows the GC to be more aggressive.

I can't see why it needs any help from the compiler. I certainly do
see why it needs support from ::operator new and friends. That's why
GC systems always come with global replacements for these functions.

> (BTW, objects that contain pointers to themselves can be handled
> as special cases, as long as the GC mechanism has knowledge of the
> class layouts.)

Or even without such knowledge, so long as it knows how big the
individual blocks are. But what about circular references:
    struct Link {
        Link * next;
        Link(Link *l = 0) : next(l) {}  // default argument so "new Link" compiles
        // ...
    };
    Link *a = new Link;
    Link *b = new Link(a);
    Link *c = new Link(b);
    a->next = c;
    // Use a, b, and c, but don't delete them
Will a, b, and c ever be deleted?  Great Circle uses the "mark" method
to find even circular references like this, but doesn't that take a
lot of (hopefully idle) time?
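
A toy sketch of why marking copes with such a cycle. It is not how Great Circle or any other product is implemented: ToyCollector is a made-up class that owns every Link it creates, is told the roots explicitly, and follows only the single next pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Link {
    Link* next;
    bool marked;
    explicit Link(Link* l = nullptr) : next(l), marked(false) {}
};

class ToyCollector {
public:
    ToyCollector() = default;
    ToyCollector(const ToyCollector&) = delete;
    ToyCollector& operator=(const ToyCollector&) = delete;
    ~ToyCollector() { for (Link* p : all_) delete p; }

    Link* make(Link* next = nullptr) {
        all_.push_back(new Link(next));
        return all_.back();
    }

    // Mark everything reachable from the roots, then sweep the rest.
    // Returns the number of objects reclaimed.
    std::size_t collect(const std::vector<Link*>& roots) {
        for (Link* p : all_) p->marked = false;
        for (Link* r : roots)
            // Stopping at an already marked object is what terminates a cycle.
            for (Link* p = r; p != nullptr && !p->marked; p = p->next)
                p->marked = true;

        std::vector<Link*> live;
        std::size_t reclaimed = 0;
        for (Link* p : all_) {
            if (p->marked) {
                live.push_back(p);
            } else {
                delete p;
                ++reclaimed;
            }
        }
        assert(reclaimed + live.size() == all_.size());
        all_.swap(live);
        return reclaimed;
    }

    std::size_t size() const { return all_.size(); }

private:
    std::vector<Link*> all_;  // every object the collector has handed out
};
```

While any one of a, b or c is a root, all three are marked and kept. Once none is a root, nothing gets marked and the whole ring is swept, even though each object is still pointed to by another. The cost is one pass over the live objects plus one pass over everything allocated.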

Another complication:
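    #include <stdio.h>    // sprintf
    #include <stdlib.h>   // srand, rand
    #include <string.h>   // strcpy, strlen
    #include <time.h>     // time
    #include <iostream>

    int  do_the_real_work();           // assumed to be defined elsewhere
    void translate_code(int, char *);  // assumed to be defined elsewhere

    int main(int,char**) {

        // Scratch buffer
        char buff[80];

        // Just to make this example more interesting
        srand(int(time(0)));
        for (size_t i=0; i<sizeof(void*); ++i) buff[i]=char(rand());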

        // Returns a status code
        int code = do_the_real_work(); // Assumed to take a long time

        // Classify the return code
        if      (code<0) sprintf(buff, "Error %d: ", code);
        else if (code>0) sprintf(buff, "Warning %d: ", code);
        else             strcpy (buff, "Success: ");

        // Now look up the appropriate text and spit it out on standard error
        translate_code(code, buff+strlen(buff));
        std::cerr << buff << std::endl;

        // All done!
        return code;
    }
The first sizeof(void*) bytes of buff[] might happen to contain a bit
pattern that matches an address inside one of our collectable objects.
If the GC doesn't know that this is not an address, that object remains
in memory until the bit pattern changes.
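
The test a conservative scanner applies can be sketched roughly as follows. Block and conservatively_referenced are names made up for this illustration, and real collectors use considerably more refined checks; the sketch only shows that the decision is made on the bit pattern alone.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical description of one collectable block.
struct Block {
    std::uintptr_t begin;
    std::uintptr_t end;  // one past the last byte
};

// The scanner cannot tell data from pointers: any pointer-sized word whose
// value falls inside the block is treated as a reference to it.
bool conservatively_referenced(const void* memory, std::size_t bytes,
                               const Block& block) {
    assert(block.begin < block.end);
    const unsigned char* p = static_cast<const unsigned char*>(memory);
    for (std::size_t i = 0; i + sizeof(std::uintptr_t) <= bytes;
         i += sizeof(std::uintptr_t)) {
        std::uintptr_t word;
        std::memcpy(&word, p + i, sizeof word);  // avoids alignment problems
        if (word >= block.begin && word < block.end) return true;
    }
    return false;
}
```

Scanning buff[] from the example with this function reports a reference whenever the random bytes happen to land inside the block's address range, and the block is then retained.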

--
AllanW@my-dejanews.com is a "Spam Magnet" -- never read.
Please reply in USENET only, sorry.


Author: Gabriel Dos Reis <dosreis@DPTMaths.ens-cachan.fr>
Date: 1998/08/04 Raw View

>>>>> «James», kanze  <kanze@my-dejanews.com> wrote:

[...]

James> For any object, you must know its semantics.

Yup. But that is quite *different* from breaking encapsulation. You
don't have to know the object's implementation details to grasp its
semantics. You do know that. What you need is a formal specification
of the object's observable semantics.

--
Gabriel Dos Reis                   |  Centre de Mathématiques et de
dosreis@cmla.ens-cachan.fr         |         Leurs Applications
Fax : (33) 01 47 40 21 69          |   ENS de Cachan -- CNRS (URA 1611)
               61, Avenue du Pdt Wilson, 94235 Cachan - Cedex

Author: ncm@nospam.cantrip.org (Nathan Myers)
Date: 1998/08/04 Raw View

Hans Aberg<haberg@REMOVE.matematik.su.se> wrote:
>(Nathan Myers) wrote:
>  The variation I thought of is this: One has a class Data with a pointer
>to a handle class DataRef, which in its turn has a pointer to a derived
>class of a class Base, and this last object is allocated on the heap.

An aside: "a derived class of a class Base" is more nicely stated
as "a class derived from class Base".

>  Then, in a situation when a Data constructor is called and when
>"Base::operator new" is not used, the pointer that the class Data contains
>to the DataRef handle is stacked, and in a similar situation, the
>destructor ~Data() pops the stack. (That is, when Data does not appear in
>derived class of class Base.) Then if the stack order is not broken by
>temporary objects life-time, this stack will always contain the current
>root set (including the global elements).

If I understand correctly, you are proposing to thread a root-set
stack through the runtime stack, and need to ensure that this
threaded stack, and the operations on it that can only occur at
construction and destruction times, precisely model a stack.
Then, you can walk this threaded stack and avoid looking at
other raw memory?

Of course, binding a reference breaks this ordering.

>>There are two kinds of GC effort I have seen: those that ignore the
>>type system, and those that work with it.  Hans Boehm's method,
>>adopted by Great Circle, is an example of the former.  In this mode
>>the "root set" pointers can only appear in static storage and on the
>>stack.
>
>Well, you have to search at more places than this.

True.  But it seemed hard enough already.

>Nathan Myers wrote:
>>Type-aware GC only collects objects of types that are meant to be
>>collected, and such pointers may appear anywhere, so the global
>>"root set" may not be so interesting.  I imagine allocating
>>objects from a pool, and identifying the root set manually, in that
>>case.
>
>  I have no idea how your idea of a pool would work, because if one allows
>self-referential objects to be created at runtime, then the only way to
>find out which objects are no longer in use is by tracing from the known root
>set.

Right.  But if you have the root set identified for you by the
programmer, you don't have to go searching for it.  If only
a Graph allocates (self-referential) Nodes from a GraphPool,
then the set of Graphs _is_ the root set.

>>For example, the constructor for type Graph would register
>>itself with the GraphNode pool.  Reference-counting GC doesn't
>>work for general graphs, so this is the first place I would expect
>>to see other techniques flourish.
>
>  But you do not need such specialty applications for that to happen: As soon
>as one wants to allocate self-referential objects, like programs with
>iterative or recursive function calls, that happens. When working with
>references, such objects can be created by a simple assignment.
>
>  The case of graphs is a special case of this.

New techniques are always successful first in special cases, and
then get used more generally.

--
Nathan Myers
ncm@nospam.cantrip.org  http://www.cantrip.org/

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/08/04 Raw View

In article <35C5D3C7.59E2@wizard.net>, James Kuyper <kuyper@wizard.net> wrote:
>Hans Aberg wrote:
>...
>> Perhaps the implementor of a GC should be able to ensure that when ~T() is
>> called not "operator delete" is called.
>
>The word "not" doesn't belong in that location in an English sentence.
>I'm not sure what you intended. The only adjustment that seems to fit
>the context would be to move "not" just before the final "called".

  So if you get it, why do you ask? :-) Here is a corrected version:

  Perhaps the implementer of a GC should be able to ensure that when ~T() is
called, "operator delete" is not called.

  The problem is probably not important, I just point out that it might exist.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/08/04 Raw View

In article <35C5CFB8.15FB@wizard.net>, James Kuyper <kuyper@wizard.net> wrote:
>Section 6.6 "Jump statements" paragraph 2 (of CD2) says:
>
>| On   exit   from   a   scope   (however   accomplished),   destructors
>| (_class.dtor_)  are  called for all constructed objects with automatic
>| storage duration (_basic.stc.auto_)  (named  objects  or  temporaries)
>| that  are declared in that scope, in the reverse order of their decla-
>| ration.

  An older version (ARM 1994) of the same paragraph says

    On exit from a scope (however accomplished), destructors are called for
    all constructed class objects in that scope that have not yet been
    destroyed. This applies to both explicitly declared object and
    temporaries.

So it looks as though there might have been a change: The new version
suggests that the stack order can never be broken.

  If this is so, my idea for keeping track of the root set might work:
Just stack the "this" pointer of all the relevant automatic objects
(including temporary) when they are created, and pop the stack whenever an
automatic is destroyed. Then this stack contains the root set at all
times.
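
A rough sketch of that bookkeeping, assuming strict stack order really does hold. Tracked and roots are made-up names; a real handle class would carry its payload as well. The assert in the destructor is exactly the assumption in question: it fires as soon as an object dies out of order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class Tracked {
public:
    Tracked()               { roots().push_back(this); }
    Tracked(const Tracked&) { roots().push_back(this); }
    Tracked& operator=(const Tracked&) { return *this; }

    ~Tracked() {
        // Valid only if objects are destroyed in exact reverse order of creation.
        assert(!roots().empty() && roots().back() == this);
        roots().pop_back();
    }

    // The stack of "this" pointers, i.e. the current root set.
    static std::vector<const Tracked*>& roots() {
        static std::vector<const Tracked*> stack;
        return stack;
    }
};

// Reports how many roots are registered while two nested automatic
// objects are alive.
std::size_t depth_inside_nested_scope() {
    Tracked a;
    std::size_t depth = 0;
    {
        Tracked b;
        depth = Tracked::roots().size();
    }
    return depth;
}
```

Called with an empty root stack, depth_inside_nested_scope() sees two entries and leaves the stack empty again. Objects of this class created with new and deleted in arbitrary order would break the invariant, so the scheme is meant for automatic and temporary objects only.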

 But I would really want to hear a verification from an expert on the
interiors of C++ compilers before I try to implement it.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/08/04 Raw View

In article <35C5CE40.794B@wizard.net>, James Kuyper <kuyper@wizard.net> wrote:
>But a formalism that insists on doing the work for you can be bad, if
>for some reason you need to control when and whether the work is done.
>That is, it should be possible, if necessary, to control the
>circumstances under which GC occurs; I gather that this isn't possible
>with many implementations of GC.

  This is why my ideas are not centered on a C++ with a GC, but on a
C++ which supports the implementation of a GC. In fact the model I have in
my mind uses a different GC approach for different types of data: The
handles use one type of "operator new", and the objects on the heap use
another "operator new", and data objects that have not been analyzed and
rewritten in a handle safe way may use the traditional global "::operator
new".

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/08/04 Raw View

In article <35C6799D.75077097@ix.netcom.com>, "Paul D. DeRocco"
<pderocco@ix.netcom.com> wrote:
>But the only reason for garbage collection is to guarantee the eventual
>freeing of memory in situations where the compiler or application cannot
>otherwise determine when to free the memory. If the compiler can figure
>out, or the application can specify, when to call the destructor, then
>it may as well free the memory at the same time.

  I see two reasons for not freeing the memory immediately, and one is
that it might be slow, and another that when taking a full step into the
dynamic world, it is much harder to know exactly when an object is
deleted.

  A GC that deletes objects a little every time is called incremental, so
the idea exists in the GC world. One can probably think of a lot of
variations, dependent on the application at hand.

>>   Then to the problem where one wants to ensure that say the ~T() is
>> released the very moment nobody is looking at that object anymore.
>
>My point is that if it isn't possible to know when that moment will
>occur, the only thing that should be allowed to happen at the moment are
>other things that the programmer doesn't need to know about, such as the
>deallocation of memory. If a non-trivial destructor does something
>meaningful to the application, then it isn't the sort of thing that we
>want firing off at unpredictable times.
>
>The combination of the above two points leads me to feel that garbage
>collection is only appropriate for objects that have no meaningful
>destruction semantics. This is a large category of objects, however,
>including such things as strings.

  I think you are going to get a combination of features: Think of
something explicit, say a file. Then its buffer and so on could be
implemented on the heap, but the destructor would merely close it. It may
happen that somebody "destroys" the object, that is closing the file, even
though somebody else still wants to use it, so the buffer could then be
reopened. So the object is not then even destroyed once, but several
times. And so on.

  So I think there is not going to be either this or that but a
complicated mixture of techniques.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/08/04 Raw View

In article <6q6kef$1su$1@shell7.ba.best.com>, ncm@nospam.cantrip.org
(Nathan Myers) wrote:
>If I understand correctly, you are proposing to thread a root-set
>stack through the runtime stack, and need to ensure that this
>threaded stack, and the operations on it that can only occur at
>construction and destruction times, precisely model a stack.
>Then, you can walk this threaded stack and avoid looking at
>other raw memory?

  That is correct.

>Of course, binding a reference breaks this ordering.

  That is correct, too, but it looks as though CD2 (on "Temporary", ch 12,
and ch 6.2, second paragraph) has been changed in this respect too (or
has it?).

  It is possible to get around this by having pointers from the automatic
objects to their individual locations on the stack, stored alongside their
"this" pointers: Then, when these elements are destroyed, their pointers to
the stack are set to zero. The stack then just pops off its 0 pointers
(instead of decrementing just one step). One gets a stack with some
"holes" in it, but it does not make much difference in the long run, as the
holes are created by relatively few temporary objects.

  But then one starts to generate so much overhead that a ref count is
perhaps just as fast (or faster ...).
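
  The "stack with holes" idea above can be sketched roughly as follows. This
is only an illustration under assumptions: the names RootStack and Root are
hypothetical, and a std::vector stands in for whatever structure would really
be threaded through the runtime stack. Each root registers its slot on
construction and zeroes it on destruction; the stack then pops every trailing
zero entry.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical root-set stack: one entry per registered root pointer.
class RootStack {
    std::vector<void**> slots_;
public:
    // Register a root and return the index of its slot.
    std::size_t push(void** p) {
        slots_.push_back(p);
        return slots_.size() - 1;
    }
    // Zero the slot, then pop all trailing zero entries ("holes").
    void release(std::size_t i) {
        assert(i < slots_.size());
        slots_[i] = 0;
        while (!slots_.empty() && slots_.back() == 0)
            slots_.pop_back();
    }
    // Number of entries, holes included.
    std::size_t size() const { return slots_.size(); }
    // Number of entries that still refer to a live root.
    std::size_t live() const {
        std::size_t n = 0;
        for (std::size_t i = 0; i < slots_.size(); ++i)
            if (slots_[i] != 0) ++n;
        return n;
    }
};

// Hypothetical root object: registers itself on construction and
// marks its slot as a hole on destruction.
class Root {
    RootStack& stack_;
    void* ptr_;
    std::size_t index_;
    Root(const Root&);            // not copyable
    Root& operator=(const Root&);
public:
    Root(RootStack& s, void* p) : stack_(s), ptr_(p), index_(s.push(&ptr_)) {}
    ~Root() { stack_.release(index_); }
};
```

  A collector would walk only the non-zero entries. Destroying a root in the
middle leaves a hole that disappears once everything above it is gone, which
is also where the extra bookkeeping cost comes from.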

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>
---

Author: markw65@my-dejanews.com
Date: 1998/08/04 Raw View

In article <6q2jit$d2j$1@shell7.ba.best.com>,
  ncm@nospam.cantrip.org (Nathan Myers) wrote:
> Hans Aberg<haberg@REMOVE.matematik.su.se> wrote:
> >(Nathan Myers) wrote:
> >>The time and order of destruction of unbound temporary objects
> >>is well-specified: they are destroyed at the end of the "containing
> >>expression" in reverse order of construction.  Bound temporaries
> >>have the lifetime of the reference they were bound to.
> >
> >  Is the C++ standard written so that one can always ensure that all
> >automatic and temporary objects taken together are released in the reverse
> >order they are created? This sounds unlikely though, but the reason I ask
> >is that if so, the feature might be used for keeping track of the root set
> >when implementing a GC.
>
> However unlikely it may seem, it's true.

Actually, I don't think it is... and the quote above about bound temporaries is
why.

According to 12.2 (Temporary Objects) paragraph 5, if a reference is bound to
a temporary, the lifetime of the temporary extends to the end of the scope,
or the end of the lifetime of the reference, whichever is shorter.

If the bound temporary was created from another temporary, then the order of
destruction will not be the reverse of the order of construction:

eg

void foo()
{
 const Y &y = Y(X(1)); // obvious declarations skipped
 // ...
}

So we first create an X, then create a Y and bind y to it. Now we destroy the
X at the end of the full expression containing it, but don't destroy y until
we exit from foo.
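
The order described above can be observed with a small logging sketch. The
declarations of X and Y below are assumptions (the original skipped them), and
event_log and run_foo are hypothetical helpers added only for illustration.
With a C++11 or later compiler one would expect the log to read
"X() Y() ~X body ~Y", i.e. not the reverse of the construction order.

```cpp
#include <cassert>
#include <string>

// Hypothetical event log, used only for this illustration.
static std::string event_log;

struct X {
    explicit X(int) { event_log += "X() "; }
    ~X() { event_log += "~X "; }
};

struct Y {
    explicit Y(const X&) { event_log += "Y() "; }
    ~Y() { event_log += "~Y "; }
};

void foo() {
    const Y& y = Y(X(1));   // the Y temporary is bound; the X temporary is not
    (void)y;
    // The X temporary died at the end of the full expression above,
    // while the Y constructed from it is still alive.
    assert(event_log.find("~X") != std::string::npos);
    event_log += "body ";
}   // the bound Y temporary is destroyed here

// Runs foo() on a fresh log and returns what was recorded.
std::string run_foo() {
    event_log.clear();
    foo();
    return event_log;
}
```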

Or did I miss something?

Mark Williams


Author: James Kuyper <kuyper@wizard.net>
Date: 1998/08/04 Raw View

Hans Aberg wrote:
>
> In article <35C5D3C7.59E2@wizard.net>, James Kuyper <kuyper@wizard.net> wrote:
> >Hans Aberg wrote:
> >...
> >> Perhaps the implementor of a GC should be able to ensure that when ~T() is
> >> called not "operator delete" is called.
> >
> >The word "not" doesn't belong in that location in an English sentence.
> >I'm not sure what you intended. The only adjustment that seems to fit
> >the context would be to move "not" just before the final "called".
>
>   So if you get it, why do you ask? :-) Here is a corrected version:
>
>   Perhaps the implementer of a GC should be able to ensure that when ~T() is
> called, "operator delete" is not called.

The reason I ask is that the corrected sentence describes the current
definition of C++; there is no "Perhaps" about it, any implementor of
C++ must ensure it.
---

Author: Christopher Eltschka <celtschk@physik.tu-muenchen.de>
Date: 1998/08/04 Raw View

Hans Aberg wrote:
>
>   I found a description that explains both what kind of feature I am
> asking for, and how it might be implemented as a part of the C++ language:
> If somebody has a suggestion for an efficient workaround within the
> existing C++, then this description could be used to pin down the
> semantics.
>
>   But the suggestion amounts to that one should be able to use virtual
> function pointers as objects with fuller information about the virtual
> structure, so it seems to me that it should fit into the picture of a C++
> language addition from that point of view: One gets enhanced capability of
> making use of the features that are already there in the compiler but
> which are not, at this point, available to the C++ programmer.
>
>   If a class X has a virtual function X::f, then there is a unique base
> class V of X in which V::f is first defined as a virtual function. (So the
> class V does not have a base class in which f is defined virtual.) Call
> this unique class _the_ base class of the virtual function f.

This assumption is wrong, as can be easily proven:

class A
{
public:
  virtual void f() {}
};

class B
{
public:
  virtual void f() {}
};

class X: public A, public B
{
public:
  virtual void f() {} // overrides A::f *and* B::f
};

Now, what is _the_ base class of X::f()?
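
To make the point concrete, here is a variant of the classes above in which
f() returns a value, so that one can see the single override in X being
reached through either base. The return values, the virtual destructors, and
the helper names call_via_A, call_via_B and demo are assumptions added for
illustration only.

```cpp
#include <cassert>

struct A {
    virtual ~A() {}
    virtual int f() { return 1; }
};

struct B {
    virtual ~B() {}
    virtual int f() { return 2; }
};

struct X : public A, public B {
    virtual int f() { return 3; }   // overrides A::f *and* B::f
};

int call_via_A(A& a) { return a.f(); }
int call_via_B(B& b) { return b.f(); }

// Both unrelated bases dispatch to the same final overrider.
void demo() {
    X x;
    assert(call_via_A(x) == 3);
    assert(call_via_B(x) == 3);
}
```

Since A and B are unrelated, neither can claim to be the one class in which
f was "first" declared virtual.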

[... conclusions from wrong assumption snipped ...]
---

Author: smaharba@my-dejanews.com
Date: 1998/08/04 Raw View

In article <35C6799D.75077097@ix.netcom.com>,
  "Paul D. DeRocco" <pderocco@ix.netcom.com> wrote:
>
> But the only reason for garbage collection is to guarantee the eventual
> freeing of memory in situations where the compiler or application cannot
> otherwise determine when to free the memory. If the compiler can figure
> out, or the application can specify, when to call the destructor, then
> it may as well free the memory at the same time.

Au contraire, Mr. Roccoco! There are lots of other good reasons for garbage
collection:

1. Can save lots of destructor coding
2. Can reduce programmer errors
3. Can reduce code size
4. Can increase performance (due to increased locality since
   memory freeing can be deferred)

I'm sure there are others.

Of course, as Nathan has rightly pointed out, GC comes with its own problems.
I'm certainly not from the GC-as-panacea camp.

-Dave


Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/08/04 Raw View

In article <35C73C1E.2B5AD765@physik.tu-muenchen.de>, Christopher Eltschka
<celtschk@physik.tu-muenchen.de> wrote:

>Hans Aberg wrote:
..
>>   If a class X has a virtual function X::f, then there is a unique base
>> class V of X in which V::f is first defined as a virtual function. (So the
>> class V does not have a base class in which f is defined virtual.) Call
>> this unique class _the_ base class of the virtual function f.
>
>This assumption is wrong, as can be easily proven:

  A suggestion for a polymorphic virtual function pointer type in the case
of multiple inheritance has already been posted earlier in this thread (see
this article).

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>


Author: Anatoli <anatoli@see.my.sig>
Date: 1998/08/01 Raw View

*** This sub-thread is drifting off-topic rapidly (from discussion
*** of C++ Standard towards C++ programming issues and techniques).
*** Followups set to comp.lang.c++.moderated.

Several things to note.

Hans Aberg wrote:
[snips throughout]
>     Method(Data (T::*f0)(Data&)) { MethodBase::func = f0; }

This shouldn't compile, as Derived-member-pointer-to-X isn't
convertible to Base-member-pointer-to-X.

>   Now, the V::f will be forcefully converted to a Base::f virtual function
> pointer,

You can't do that (see above).  Compilers are required to diagnose
such usage.

If you don't use multiple inheritance, nor virtual inheritance, nor
covariant return types, you may get away with this, but you *have* to
use reinterpret_cast on memfun-ptrs.  Otherwise it simply will not
compile.  Needless to say, such things are inherently non-portable since
they rely on a particular implementation of virtual functions
and member pointers.  And any amount of stack space is cheaper
than one platform-dependent, subtle, hard-to-find bug.

In short:  don't do it unless they pay you ridiculously large sums of
$$$. :)
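
For readers following along, the two directions of member-pointer conversion
can be sketched like this. The class and function names are hypothetical. One
assumption worth flagging: in standard C++ the Derived-to-Base direction can
also be spelled with static_cast, not only reinterpret_cast, but the result
may only be invoked on an object that really contains the member, which is
why the sketch checks with dynamic_cast first.

```cpp
#include <cassert>

struct Base {
    virtual ~Base() {}
    virtual int general() { return 0; }
};

struct Derived : Base {
    virtual int f() { return 42; }   // first declared in Derived
};

typedef int (Base::*BaseFn)();
typedef int (Derived::*DerivedFn)();

// Base-member-pointer to Derived-member-pointer: implicit and always safe.
DerivedFn base_to_derived() { return &Base::general; }

// Derived-member-pointer to Base-member-pointer: needs an explicit cast.
BaseFn derived_to_base() { return static_cast<BaseFn>(&Derived::f); }

// Calling through the narrowed pointer is only defined when obj really is
// a Derived, so check first and fall back to general() otherwise.
int call_if_applicable(Base* obj, BaseFn fn) {
    assert(obj != 0 && fn != 0);
    if (dynamic_cast<Derived*>(obj) == 0)
        return obj->general();
    return (obj->*fn)();
}
```

Without the cast, the assignment in derived_to_base() is rejected by the
compiler, as noted above.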

>  Summing it up, the information needed seems to be in the compiler
> already, but in the absence of that, this is a way to enter it by hand.

It might be in the compiler, but it is certainly not required
to be present at runtime, so

>   A suggestion for a dynamic_cast on function pointers could be

will not work, because memfun-ptrs do not store information
required to do dynamic_cast.  They simply don't need to.
If you need it, you have to do it yourself.

Also, you mentioned something like "basemost appearance of
virtual function" several times in this and earlier posts.
There's no such thing in C++, for several reasons.  First,
there's multiple inheritance, so there may be no unique
basemost declaration of any given member function.
Second, a derived class may override a function
with different (covariant) return type, so there is no
two-way compatibility.  Finally, even if we ignore these
issues, there are (conceptually) contracts that functions
must implement, and function in Derived may enhance contract
found in Base.  So again we don't have two-way compatibility,
this time at a conceptual level.  Summary:  if you mean
&Base::func, write &Base::func, not &Derived::func.  The former
may be always interpreted as the latter, but not the other way
around.

One final thing:  there is only a finite number of different member
function pointers in a given program.  Since Methods correspond to
them directly, there are only so many different Methods in a given
program.  Which means you don't have to allocate them on heap.
Pointers to static Methods will do.  So you win in heap space and
allocation time what you lose in stack space.
--
Regards
Anatoli (anatoli at ptc dot com) -- opinions aren't
---

Author: "Paul D. DeRocco" <pderocco@ix.netcom.com>
Date: 1998/08/01 Raw View

kanze@my-dejanews.com wrote:
>
> The problem is that I've never found anything that I wanted to do
> "because an object goes out of scope", except maybe free memory.  This
> doesn't mean that the C++ model is flawed, only that it is using a
> particular mechanism
> to trigger so-called "finalization".  To take your example, there is
> certainly no implicit relationship between locking and scope, and one
> could probably find examples where it would be logical to acquire and
> release a lock in different scope.  In practice, given the tenets of
> structured programming (a la Dijkstra), however, it is usually fairly
> natural to associate the acquisition/freeing of the lock with a scope
> (perhaps artificially introduced by adding extra braces), and you can
> still use new/delete for the cases where it doesn't fit.

I think there is _usually_ an implicit relationship between locking and
scope. In practice, there may be some situations where locking and
unlocking want to occur in separate scopes, but then it becomes hard to
prove that the program is guaranteed to unlock every lock.
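
A minimal sketch of tying a lock to a scope, for illustration. The Lock class
here is a hypothetical toy that only records whether it is held (it is not
thread-safe), and ScopedLock and guarded_work are assumed names. The point is
only that the release happens on every exit path from the scope, including an
exception.

```cpp
#include <cassert>
#include <stdexcept>

// Toy lock, hypothetical and not thread-safe: it only records its state.
class Lock {
    bool held_;
public:
    Lock() : held_(false) {}
    void acquire() { assert(!held_); held_ = true; }
    void release() { held_ = false; }
    bool held() const { return held_; }
};

// Acquires in the constructor, releases in the destructor.
class ScopedLock {
    Lock& lock_;
    ScopedLock(const ScopedLock&);            // not copyable
    ScopedLock& operator=(const ScopedLock&);
public:
    explicit ScopedLock(Lock& l) : lock_(l) { lock_.acquire(); }
    ~ScopedLock() { lock_.release(); }
};

// The lock is released whether this returns normally or throws.
int guarded_work(Lock& lock, bool fail) {
    ScopedLock guard(lock);
    if (fail)
        throw std::runtime_error("work failed");
    return 1;
}
```

When acquisition and release live in separate scopes, nothing like the
destructor call above enforces the pairing, which is the difficulty being
described.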

--

Ciao,
Paul
---

Author: "Paul D. DeRocco" <pderocco@ix.netcom.com>
Date: 1998/08/01 Raw View

AllanW@my-dejanews.com wrote:
>
> I've never heard of a GC for C++ that managed to call destructors.
> I suspect that it would be technically challenging, and would perhaps
> require support from the compiler (start with a way to get RTTI for
> any type of object, including arrays, from a void*, and then add the
> ability to call a class's destructor given its class name at
> runtime, but not before.)

I was just reading about one. However, it makes no sense to me. If the
programmer doesn't care _when_ an object is deleted, he has no business
caring _what happens_ when an object is deleted, so he has no business
writing a non-trivial destructor for that type of object.

On the other hand, if an object has a non-trivial destructor, and the
language provides a means for it to be called at some predictable time
(such as when a scope is exited), then there is no need for garbage
collection, since the object's memory can be freed at the same time.

--

Ciao,
Paul
---

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/08/01 Raw View

  I have found a variation of the suggestion by Anatoli (anatoli at ptc
dot com), which has the advantage that it does not make use of dynamic
memory: This simplifies writing special constructors (I had to put in a
virtual copy constructor in the dynamic version in order to make it work),
and should be considerably faster.

  In addition, this non-dynamic version shows very clearly what is needed
if such a feature should be implemented, perhaps as a part of C++ in the
future: One builds a pair consisting of a virtual function pointer itself,
and a regular function pointer which can be used to tell if a pointer to a
class object can be used for evaluation.

  I have added a macro which very clearly shows how the feature should be
used: If one writes
    method(V, f);
where V : Base, and f is a virtual function in class V, then this will
expand to the pair
    class MethodBase {
        Data (Base::*func)(Data&);
        bool (*applicable)(Data&);
    };
where func = f (by forced type cast), and where "applicable" is set to the
function
    bool applicable(Data& obj) { return dynamic_cast<T*>(obj.data()) != 0; }
which is defined as a static member of the class Method<V> : MethodBase.

  Then I have put this into the context I use, showing how one can replace
f by Base::general if the object is not applicable to f and flatten the
function call chain, as before.

  Is this a happy solution, so that one does not need to add it to the C++
language? Not really, as the user must keep track of the base-most class V
of every virtual function f used. This could be prone to errors. One could
of course think of the ability to choose V as a feature, but in my context
it is not.

  So over to the details then:

#include <bool.h>

class Data;

class Base
{
public:
    virtual Data argument(Data& obj, Data& arg);
    virtual Data general(Data&);
};

class MethodBase
{
    friend class Data;

protected:
    Data (Base::*func)(Data&);
    bool (*applicable)(Data&);

public:
    MethodBase() : func(0), applicable(0) { }
    MethodBase(Data (Base::*f)(Data&), bool (*a)(Data&))
     : func(f), applicable(a) { }
};

class Data
{
    Base* datap;
    MethodBase mb;

public:
    // Plus stuff for handling the Base* datap!

    Data(MethodBase m) : datap(0), mb(m) { }   // no object attached yet

    Base* data();

    Data evaluate(Data& arg)
    {
        Data d = arg.data()->argument(*this, arg);

        if (d.mb.func != 0 && d.mb.applicable(arg))
            d = (data()->*(mb.func))(arg);
        else
            d = data()->general(arg);

        return d;
    }
};

template<class T>                   // Was: <typename T>
class Method : public MethodBase
{
public:
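    Method(Data (T::*f0)(Data&))
    {   // The forced cast: T::* to Base::* is not an implicit conversion.
        func = static_cast<Data (Base::*)(Data&)>(f0);
        MethodBase::applicable = &Method<T>::applicable;   }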

    static bool applicable(Data& obj)
    {   return dynamic_cast<T*>(obj.data()) != 0;   }
};

#define method(V,f) Method<V>(&V::f)

----- End of Core Classes -----

Then define a new class with a new virtual function:

class V : public Base
{
public:
    virtual Data f(Data&);
};


Wholly independently of this, add another class making use of V::f:

class D : public Base
{
public:
    Data argument(Data&, Data&)
    {   return method(V, f);   }
};


  The way I think of this last line "method(V, f)" is that it defines a
feature of some kind, which one wants to access: One could have "f =
assign" and then it is the ability to assign to the object, or "print" and
then it is the ability to print out the object, but also more special
properties like "Double", and it would be the ability of the object to
compute using Double arguments.

  Then the idea with this "method(V, f)" construction is that objects
which do not deal with say the "Double" arguments in a special way should
not be burdened with having to define that function "V::Double".

  One can also combine working with data and virtual function pointers in
a pleasing way: One could have had
    Data D::argument(Data& obj, Data& arg)
    {    if (obj == arg)
             return arg;
         else
             return method(V, f);
    }
thus interchanging the data "arg" with the method "method(V, f)" depending
on a runtime condition "obj == arg". An example of a class with a similar
function could be a class providing substitutions: If one finds an object
matching what is to be substituted, just replace it, otherwise let the
substitution continue to the other subobjects.

  So in real life programming, I have found it to be very convenient.
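
  For readers who want something that compiles on its own, the pair idea can
be reduced to the following sketch. It is not the code above: Data is replaced
by plain int arguments, the names MethodPair, Method<T>::make and evaluate are
assumptions made for this illustration, and the "forced type cast" is spelled
as a static_cast.

```cpp
#include <cassert>

struct Base {
    virtual ~Base() {}
    virtual int general(int x) { return -x; }   // fallback behaviour
};

// The pair: a member pointer typed on Base, plus an ordinary function
// pointer that tells whether a given object can be used with it.
struct MethodPair {
    int (Base::*func)(int);
    bool (*applicable)(Base*);
};

template <class T>
struct Method {
    static bool applicable(Base* obj) {
        return dynamic_cast<T*>(obj) != 0;
    }
    static MethodPair make(int (T::*f)(int)) {
        MethodPair p;
        p.func = static_cast<int (Base::*)(int)>(f);   // the forced cast
        p.applicable = &Method<T>::applicable;
        return p;
    }
};

// Use the stored method when the object supports it, else fall back.
int evaluate(Base* obj, const MethodPair& m, int arg) {
    assert(obj != 0);
    if (m.func != 0 && m.applicable(obj))
        return (obj->*m.func)(arg);
    return obj->general(arg);
}

struct V : Base {
    virtual int f(int x) { return x * 2; }   // new virtual function
};

struct D : Base {};   // knows nothing about V::f
```

  A call such as evaluate(&v, Method<V>::make(&V::f), 21) would use V::f when
v is a V, while the same pair applied to a D object falls back to
Base::general.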

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>
---

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/08/01 Raw View

  The non-dynamic variation of the suggestion by Anatoli (anatoli at ptc
dot com) that I posted will not work:

  C++ has one polymorphic type, namely a pointer to a class (in C++
required to have at least one virtual function) having derived classes.
Then that pointer can be allowed to point at objects of different derived
classes.

  The other candidate for a polymorphic type, the virtual function pointer,
does not exist as such within the current C++: Polymorphy of virtual
functions must take place via a polymorphic pointer to a class. So
therefore, I think that Anatoli's suggestion is the only possibility, and
that one must accept that "operator new" is called and an extra function
call to be inserted when making use of virtual function pointer
polymorphy.

  Suppose one would implement direct virtual function pointer polymorphy
as a part of the C++ language. Then one might write

class C {
public:
   virtual int f(char);
};

main() {
    int (::*vf)(char) = C::f;  // Saves pair (C, C::f) into polymorphic
                               // virtual function pointer vf.

    P* p = new Q();            // Just a pointer
    char a = 'a';              // and an argument.

    if (applicable(vf, p))  // Check if applicable
       (p->*vf)(a);
    else
        ...                 // Virtual function call does not work.

Here, "(p->*vf)(a)" would be the same thing as
    ((dynamic_cast<vf>(p))->*vf)(a)
where the "<vf>" symbolizes C* as of the pair vf = (C, C::f), and the
"->*vf" symbolizes the "->*C::f". And "applicable(vf, p)" would correspond
to the boolean value
    dynamic_cast<vf>(p) != 0;

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>
---

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/08/02 Raw View

In article <35C2162C.89EF6DAA@ix.netcom.com>, "Paul D. DeRocco"
<pderocco@ix.netcom.com> wrote:

>AllanW@my-dejanews.com wrote:
>>
>> I've never heard of a GC for C++ that managed to call destructors.
..
>I was just reading about one. However, it makes no sense to me. If the
>programmer doesn't care _when_ an object is deleted, he has no business
>caring _what happens_ when an object is deleted, so he has no business
>writing a non-trivial destructor for that type of object.
>
>On the other hand, if an object has a non-trivial destructor, and the
>language provides a means for it to be called at some predictable time
>(such as when a scope is exited), then there is no need for garbage
>collection, since the object's memory can be freed at the same time.

  I think that there is a mixup in this discussion between destructors and
deleting an object: The destructor is in C++ the ~T() method, which is used
for object cleanup in case it is needed, and the object is deleted by
"operator delete", which releases the memory. These two are clearly
distinguished by C++.
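
  The distinction can be seen in a small sketch. The counters and the function
names destroy_only and destroy_and_free are hypothetical, added only for
illustration: an explicit destructor call on an object built with placement
new runs ~T() without touching "operator delete", whereas a delete expression
runs both.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

static int destructor_calls = 0;
static int delete_calls = 0;

struct T {
    ~T() { ++destructor_calls; }
    // Class-specific allocation functions, instrumented for the example.
    static void* operator new(std::size_t n) { return ::operator new(n); }
    static void operator delete(void* p) { ++delete_calls; ::operator delete(p); }
};

// Cleanup only: the storage is an automatic buffer, so no memory is
// released and "operator delete" is never involved.
void destroy_only() {
    alignas(T) unsigned char storage[sizeof(T)];
    T* t = ::new (static_cast<void*>(storage)) T;   // global placement new
    int before = delete_calls;
    t->~T();
    assert(delete_calls == before);
}

// A delete expression: first ~T(), then T::operator delete.
void destroy_and_free() {
    T* t = new T;
    delete t;
}
```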

  Now what happens if one introduces a GC is that "operator delete" is
changed; in the case of a conservative GC, it does nothing. If then the
destructor contains a lot of memory management "delete p" operations,
these become unnecessary.

  So what happens is that the programmer no longer is interested in those
memory allocation issues. But there still are cases where the programmer
may be interested in calling a destructor, and one may even want to have
control over when the destructor is called.

  One thing I slightly worry about the way C++ is now is that the
destructor ~T() and the "operator delete" are not sufficiently independent:
Perhaps the implementor of a GC should be able to ensure that when ~T() is
called not "operator delete" is called. (But the latter can be simply set
to "do nothing", so perhaps this is not a problem.)

  Then to the problem where one wants to ensure that say the ~T() is
released the very moment nobody is looking at that object anymore. This is
a very difficult problem to solve, with or without GC, for the reason that by
using dynamic memory one can build structures that go in loops (are
self-referencing). If this is done, the only way to release that structure is
by tracing the pointers, which can be done in the mind of the programmer,
because he or she knows it will happen, or by a GC by tracing the pointers
at GC time.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>
---

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/08/02 Raw View

In article <6pqqq6$3kv$1@shell7.ba.best.com>, ncm@nospam.cantrip.org
(Nathan Myers) wrote:
>The time and order of destruction of unbound temporary objects
>is well-specified: they are destroyed at the end of the "containing
>expression" in reverse order of construction.  Bound temporaries
>have the lifetime of the reference they were bound to.

  Is the C++ standard written so that one can always ensure that all
automatic and temporary objects taken together are released in the reverse
order they are created? This sounds unlikely though, but the reason I ask
is that if so, the feature might be used for keeping track of the root set
when implementing a GC.

  Experiments (by Dan Edelson) seem to indicate, though, that keeping
track of the root set in such a way generates too much overhead.

  The most efficient way to find the root set seems to be by a search of
the pointers in the appropriate places (stacks and registers and places
where global data is stored). -- This is also the main reason that C++
ought to be extended to support GC implementation techniques, to facilitate
such a search. The data needed for such a search on a selected set of
pointers is only available to the person who writes the C++ compiler.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>
---

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/08/02 Raw View

  Anatoli (anatoli at ptc dot com) mentioned that "the basemost" virtual
function I spoke of need not exist: It is possible in C++, by multiple
inheritance, to make different virtual functions with the same name and
type come together and become one in a derived class. Then they
are semantically the same, and not only the same name used for two
different virtual functions; see ARM 1994, 10.8c for details.

  If one should implement a polymorphic virtual pointer, then there are two
approaches:

  The simpler one would be to simply save the virtual pointer as it stands, e.g.
    int (::*vf)() = C::f;   // Save pair (C, C::f)
saves the pair (C, C::f), and does not attempt to trace the base-most
class in which C::f is defined.

  In the other approach, one does try to trace the basemost class in which
C::f is defined, and issues an error message if this is not unique: The
programmer will then have to give a further specification in order to make
it unique.

  The way I see it, both variations may have their uses: Perhaps one should
settle for two different types of polymorphic virtual function pointers
then.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>
---

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/08/02 Raw View

In article <javhofnl6g.fsf@gatsby.u-net.com>, David Wragg
<dpw@doc.ic.ac.uk> wrote:
>haberg@REMOVE.matematik.su.se (Hans Aberg) writes:
...
>As I understand it, you are implementing a programming language of
>some kind, and this programming language has certain semantics. You
>seem to be trying to implement your language by mapping its semantics
>very closely onto the semantics of C++. Except due to semantics that
>C++ does not have, this approach has problems, hence this discussion.

  Not really: As things have turned out, I implement a semantics alone,
freed from any ideas of a computer language. I have found that this is
real hard for computer people to understand, but this is the way pure math
is structured: One develops notions (semantics), which are described by some
notation (syntax/grammar/language) that one develops to suit those
notions.

  So the model I program is freed both from the idea of a computer
language and the idea of a specific implementation model. However, one
requirement I have is that it should be easy to implement, and it
circulates around C++ basic constructs as that is the language I am
currently using.

>Many of the workarounds that have been suggested have, from a general
>purpose C++ programming perspective, been interesting and fairly
>widely applicable, but not for your purposes. IMHO, this is because
>language implementation is not a typical kind of programming activity.

  I think this is as wrong as saying that implementing an OS (operating
system) is not a common programming activity: A lot of computer scientists
are doing just that, implementing computer languages in various forms.
When analyzing it properly, a lot more is a computer language than what
one would normally think of, as it is a way to build good user runtime
interfaces to non-programmers.

  For example, the "cout" of C++ has a computer language parser in it:

Author: ncm@nospam.cantrip.org (Nathan Myers)
Date: 1998/08/03 Raw View

Hans Aberg<haberg@REMOVE.matematik.su.se> wrote:
>(Nathan Myers) wrote:
>>The time and order of destruction of unbound temporary objects
>>is well-specified: they are destroyed at the end of the "containing
>>expression" in reverse order of construction.  Bound temporaries
>>have the lifetime of the reference they were bound to.
>
>  Is the C++ standard written so that one can always ensure that all
>automatic and temporary objects taken together are released in the reverse
>order they are created? This sounds unlikely though, but the reason I ask
>is that if so, the feature might be used for keeping track of the root set
>when implementing a GC.

However unlikely it may seem, it's true.

>  Experiments (by Dan Edelson) seem to indicate, though, that keeping
>track of the root set in such a way generates too much overhead.

Edelson is careful.

>  The most efficient way to find the root set seems to be by a search of
>the pointers in the appropriate places (stacks and registers and places
>where global data is stored). -- This is also the main reason that C++
>ought to be extended to support GC implementation techniques, to facilitate
>such a search. The data needed for such a search on a selected set of
>pointers is only available to the person who writes the C++ compiler.

There are two kinds of GC effort I have seen: those that ignore the
type system, and those that work with it.  Hans Boehm's method,
adopted by Great Circle, is an example of the former.  In this mode
the "root set" pointers can only appear in static storage and on the
stack.  The compiler knows the lifetime of these pointers (even when
they are buried in class members).

Type-aware GC only collects objects of types that are meant to be
collected, and such pointers may appear anywhere, so the global
"root set" may not be so interesting.  I imagine allocating
objects from a pool, and identifying the root set manually, in that
case.

For example, the constructor for type Graph would register
itself with the GraphNode pool.  Reference-counting GC doesn't
work for general graphs, so this is the first place I would expect
to see other techniques flourish.
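
The pool idea described above might be sketched roughly as follows. This is only an illustration under stated assumptions: the names GraphNode, GraphNodePool and Graph, and the mark-and-sweep strategy, are hypothetical and not taken from any post in this thread. Each Graph registers the address of its head pointer as a root by hand, and the pool frees every node that cannot be reached from a registered root, cycles included.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical node type; edges may form cycles.
struct GraphNode {
    std::vector<GraphNode*> edges;
    bool marked = false;
};

// Hypothetical pool that owns all nodes and knows the manually registered roots.
class GraphNodePool {
    std::vector<GraphNode*> nodes_;   // every node still allocated
    std::vector<GraphNode**> roots_;  // addresses of root pointers

    static void mark(GraphNode* n) {
        std::vector<GraphNode*> work;  // explicit stack, avoids deep recursion
        if (n) work.push_back(n);
        while (!work.empty()) {
            GraphNode* cur = work.back();
            work.pop_back();
            if (cur->marked) continue;
            cur->marked = true;
            for (GraphNode* e : cur->edges)
                if (e) work.push_back(e);
        }
    }

public:
    GraphNodePool() {}
    GraphNodePool(const GraphNodePool&) = delete;
    GraphNodePool& operator=(const GraphNodePool&) = delete;
    ~GraphNodePool() { for (GraphNode* n : nodes_) delete n; }

    GraphNode* allocate() {
        nodes_.push_back(new GraphNode);
        return nodes_.back();
    }
    void add_root(GraphNode** r) { roots_.push_back(r); }
    void remove_root(GraphNode** r) {
        for (std::size_t i = 0; i < roots_.size(); ++i)
            if (roots_[i] == r) { roots_.erase(roots_.begin() + i); return; }
    }
    std::size_t live() const { return nodes_.size(); }

    // Mark from the roots, then free everything unmarked.
    // Returns the number of nodes freed.
    std::size_t collect() {
        for (GraphNode** r : roots_) mark(*r);
        std::vector<GraphNode*> kept;
        std::size_t freed = 0;
        for (GraphNode* n : nodes_) {
            if (n->marked) { n->marked = false; kept.push_back(n); }
            else { delete n; ++freed; }
        }
        nodes_.swap(kept);
        return freed;
    }
};

// The Graph constructor registers itself with the pool, as suggested above.
class Graph {
    GraphNodePool& pool_;
public:
    GraphNode* head;
    explicit Graph(GraphNodePool& p) : pool_(p), head(nullptr) { pool_.add_root(&head); }
    ~Graph() { pool_.remove_root(&head); }
    Graph(const Graph&) = delete;
    Graph& operator=(const Graph&) = delete;
};
```

Only GraphNode objects are ever examined, so the cost of a collection depends on the size of the pool rather than on the size of the whole program.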

--
Nathan Myers
ncm@nospam.cantrip.org  http://www.cantrip.org/
---

Author: Oleg Zabluda <zabluda@math.psu.edu>
Date: 1998/08/03 Raw View

Nathan Myers <ncm@nospam.cantrip.org> wrote:

: I mean that it has been promised that it supports a different style
: of programming in which consequences of resource-consumptive actions
: may be ignored.  When such an action includes consuming a scarce
: resource such as a file descriptor, garbage collection alone suddenly
: doesn't help.  Then the coding style itself is inappropriate, and you
: must retreat to the contract style.  This generally turns out to be
: the norm, not the exception, in real programs.

Right. We can only dump on GC what it can do well and nothing
more. That is memory reclamation. Don't try to reclaim file
descriptors, mutex locks, spawned processes/threads, reference
counts, file locks, instance counters, and god knows what else,
with GC. The compiler should run destructors as usual, at the
usual well-determined times. The only thing which can reasonably
be replaced by GC is ``operator delete'', nothing more.
Surprise! That's exactly what GC's of today are doing.

Oleg.
--
Life is a sexually transmitted, 100% lethal disease.
---

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/08/04 Raw View

In article <6q47go$49c@marianna.psu.edu>, zabluda@math.psu.edu (Oleg
Zabluda) wrote:
> We can only dump on GC what it can do well and nothing
>more. That is memory reclamation. Don't try to reclaim file
>descriptors, mutex locks, spawned processes/threads, reference
>counts, file locks, instance counters, and god knows what else,
>with GC. The compiler should run destructors as usual, at the
>usual well-determined times. The only thing which can reasonably
>be replaced by GC is ``operator delete'', nothing more.
>Surprise! That's exactly what GC's of today are doing.

  I should point out that there is a problem with the notion of "running
destructors as usual at the usual well-determined times" when
transitioning to dynamically allocated objects. The problem is this:

  One is, by dynamic allocation, allowed to create an object that is
self-referential, that is, pointing back to itself. How do we know that
such an object is no longer in use? The only reliable method (if we do not
have special knowledge of the object itself) is to trace the pointers
from the root set, and then discover that this object is not in the set of
all traced pointers.

  It is of course possible for a programmer to destroy the resources
earlier (say a file that is closed). But in this generality it is
impossible to foresee whether somebody else is still using that resource.

  This fact is in reality not negative: By full dynamic allocation, one
simply takes the step out into a wider generality. (If you can come up with
a method that in this generality ensures that a resource is destroyed as
soon as it is no longer in use, please let me know. Please do not come up
with the standard suggestion of ref counts, because that does not work.)
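
The remark about reference counts can be illustrated with a small sketch. It uses std::shared_ptr, which postdates this thread, purely as a convenient ready-made reference count; the Node type and the function name are hypothetical. A node that holds a counted reference to itself keeps its own count at one after the last outside reference is gone, so its destructor never runs.

```cpp
#include <memory>

// Hypothetical node with an owning, reference-counted link.
struct Node {
    static int alive;            // number of Node objects currently existing
    std::shared_ptr<Node> next;
    Node() { ++alive; }
    ~Node() { --alive; }
};
int Node::alive = 0;

// Creates a self-referential node, drops the only outside reference,
// and returns how many nodes still exist afterwards.
int alive_after_dropping_self_reference() {
    {
        std::shared_ptr<Node> n = std::make_shared<Node>();
        n->next = n;   // the count is now 2: `n` itself and n->next
    }                  // `n` goes away; the count drops to 1, never to 0
    return Node::alive;
}
```

A tracing collector would find the node unreachable from the root set, which is the point made above; the reference count alone cannot see that.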

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>
---

Author: James Kuyper <kuyper@wizard.net>
Date: 1998/08/04 Raw View

kanze@my-dejanews.com wrote:
>
> In article <6pnthf$e60$1@shell7.ba.best.com>,
>   ncm@nospam.cantrip.org (Nathan Myers) wrote:
...
> > A GC formalism which could accommodate other resources would not
> > need to break composition and encapsulation.
>
> In general, any formalism which will allow the system to do some of the
> work for you is good.

But a formalism that insists on doing the work for you can be bad, if
for some reason you need to control when and whether the work is done.
That is, it should be possible, if necessary, to control the
circumstances under which GC occurs; I gather that this isn't possible
with many implementations of GC.
---

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/08/04 Raw View

In article <6q2jit$d2j$1@shell7.ba.best.com>, ncm@nospam.cantrip.org
(Nathan Myers) wrote:
>>  Is the C++ standard written so that one can always ensure that all
>>automatic and temporary objects taken together are released in the reverse
>>order they are created? This sounds unlikely though, but the reason I ask
>>is that if so, the feature might be used for keeping track of the root set
>>when implementing a GC.
>
>However unlikely it may seem, it's true.

  The variation I thought of is this: One has a class Data with a pointer
to a handle class DataRef, which in its turn has a pointer to a derived
class of a class Base, and this last object is allocated on the heap.

  Then, in a situation when a Data constructor is called and when
"Base::operator new" is not used, the pointer that the class Data contains
to the DataRef handle is stacked, and in a similar situation, the
destructor ~Data() pops the stack. (That is, when Data does not appear in
a derived class of class Base.) Then if the stack order is not broken by
the lifetimes of temporary objects, this stack will always contain the
current root set (including the global elements).

  So if I can be absolutely sure the stack order is never broken, I can
try it, but otherwise not: So if the C++ standard ensures it, the method
can be used, but otherwise it is going to be too complicated to make a fix
(too much overhead), and the method must be scrapped.

  (In the above, I skip the programming details: All objects are
eventually allocated on the heap, but it is possible to distinguish
between the situations where the originator of that allocation is an
automatic Data object or an object derived from the class Base. In the
first, the pointer to the DataRef handle becomes a part of the root set,
and in the second case it does not.)

>There are two kinds of GC effort I have seen: those that ignore the
>type system, and those that work with it.  Hans Boehm's method,
>adopted by Great Circle, is an example of the former.  In this mode
>the "root set" pointers can only appear in static storage and on the
>stack.  The compiler knows the lifetime of these pointers (even when
>they are buried in class members).

  Well, you have to search at more places than this. Here is a quote from
Hans Boehm:
>1) Treating the registers as potential roots
>2) Treating all locations between the sp and the stack base (sometimes a fixed
>address) as potential roots.
>3) Treating all locations between the linker defined symbols _etext (end of
>program = start of data) and _end (end of statically allocated data) as
>potential roots.
>
>Realistically, this is complicated by dynamic libraries, threads, a nonconstant
>stack base, etc.
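
The three steps quoted above all come down to the same operation: treat every word in some address range as a possible pointer. A minimal sketch of that one operation, assuming a hypothetical table of heap blocks (HeapBlock and mark_from_range are illustrative names, not part of any real collector's interface):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical description of one allocated heap block.
struct HeapBlock {
    std::uintptr_t start;
    std::size_t size;
    bool marked;
};

// Conservatively scans the words in [begin, end). Any word whose value
// falls inside a heap block is assumed to be a pointer to that block,
// whether or not it really is one. Returns the number of blocks newly marked.
std::size_t mark_from_range(const std::uintptr_t* begin,
                            const std::uintptr_t* end,
                            std::vector<HeapBlock>& heap) {
    std::size_t newly_marked = 0;
    for (const std::uintptr_t* p = begin; p != end; ++p) {
        for (HeapBlock& b : heap) {
            if (*p >= b.start && *p - b.start < b.size && !b.marked) {
                b.marked = true;
                ++newly_marked;
            }
        }
    }
    return newly_marked;
}
```

This covers only the root scan. A full collector would go on to scan the marked blocks themselves in the same way, and obtaining the register contents, the stack bounds and the data segment bounds is the platform-specific part mentioned in the quote.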

Nathan Myers wrote:
>Type-aware GC only collects objects of types that are meant to be
>collected, and such pointers may appear anywhere, so the global
>"root set" may not be so interesting.  I imagine allocating
>objects from a pool, and identifying the root set manually, in that
>case.

  I have no idea how your idea of a pool would work, because if one allows
self-referential objects to be created at runtime, then the only way to
find out which objects are no longer in use is by tracing from the known
root set.

  So the root set must be kept track of. I can think of two methods to do
that when allocating the objects: In both methods one distinguishes
between when a root set (non-dynamic) object is allocated and when a
dynamic (heap) object is allocated. The first suggestion is then to use a
stack, and the second is to use a reference count in the DataRef handle. (So
this use of a ref count differs from the usual one, as the ref count does
not change when the originator of the allocation is a heap object,
derived from class Base.)

  But these variations are probably slow relative to the variation where
a selective search on the root set is done by the compiler itself: The
compiler can put these objects in special places, so that one only
searches interesting data.

>For example, the constructor for type Graph would register
>itself with the GraphNode pool.  Reference-counting GC doesn't
>work for general graphs, so this is the first place I would expect
>to see other techniques flourish.

  But one does not need such specialty applications for that to happen: It
happens as soon as one wants to allocate self-referential objects, like
programs with iterative or recursive function calls. When working with
references, such objects can be created by a simple assignment.

  The case of graphs is a special case of this.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>
---

Author: James Kuyper <kuyper@wizard.net>
Date: 1998/08/04 Raw View

Hans Aberg wrote:
>
> In article <6pqqq6$3kv$1@shell7.ba.best.com>, ncm@nospam.cantrip.org
> (Nathan Myers) wrote:
> >The time and order of destruction of unbound temporary objects
> >is well-specified: they are destroyed at the end of the "containing
> >expression" in reverse order of construction.  Bound temporaries
> >have the lifetime of the reference they were bound to.
>
>   Is the C++ standard written so that one can always ensure that all
> automatic and temporary objects taken together are released in the reverse
> order they are created? This sounds unlikely though, but the reason I ask

Section 6.6 "Jump statements" paragraph 2 (of CD2) says:

| On   exit   from   a   scope   (however   accomplished),   destructors
| (_class.dtor_)  are  called for all constructed objects with automatic
| storage duration (_basic.stc.auto_)  (named  objects  or  temporaries)
| that  are declared in that scope, in the reverse order of their decla-
| ration.
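
The rule quoted above, as it applies to named automatic objects, can be observed with a small sketch. Traced, trace_log and run_scope are hypothetical names introduced only for this illustration; each object appends to a log when it is constructed and when it is destroyed.

```cpp
#include <string>

// Shared log of construction (+) and destruction (-) events.
std::string& trace_log() {
    static std::string log;
    return log;
}

// Hypothetical type that records its own lifetime in the log.
struct Traced {
    char id;
    explicit Traced(char c) : id(c) { trace_log() += '+'; trace_log() += id; }
    ~Traced() { trace_log() += '-'; trace_log() += id; }
    Traced(const Traced&) = delete;
    Traced& operator=(const Traced&) = delete;
};

// Declares four automatic objects in nested scopes and returns the event log.
std::string run_scope() {
    trace_log().clear();
    {
        Traced a('a');
        Traced b('b');
        {
            Traced c('c');
        }               // c is destroyed here, before d is ever created
        Traced d('d');
    }                   // d, b, a are destroyed in that order
    return trace_log(); // "+a+b+c-c+d-d-b-a"
}
```

The sketch deliberately involves no temporaries and no return of a tracked object by value, which are the cases disputed elsewhere in this thread.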
---

Author: James Kuyper <kuyper@wizard.net>
Date: 1998/08/04 Raw View

Hans Aberg wrote:
...
> Perhaps the implementor of a GC should be able to ensure that when ~T() is
> called not "operator delete" is called.

The word "not" doesn't belong in that location in an English sentence.
I'm not sure what you intended. The only adjustment that seems to fit
the context would be to move "not" just before the final "called".
However, AFAIK that describes the C++ language as currently specified.
---

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/08/04 Raw View

  So C++ has a polymorphic type, a pointer to a class with a virtual
function and derived classes, and one might think of introducing another
polymorphic type, a polymorphic virtual function pointer.

  There is yet another polymorphic type one may have use of, and while at
it, I can point out this too:

  Implementors of dynamic and functional languages distinguish between
"unboxed" and "boxed" elements: In the model with an automatic type Data
with a pointer to a handle class DataRef with a pointer to a heap class
Base derived class, the "boxed" elements would be on the heap, and the
unboxed elements would be implemented as a union in the DataRef class.

  But it would be much better if the unboxed elements could be implemented
as derived classes of the DataRef class: One way to do this would be to
write a "DataRef::operator new" which always allocates a fixed size, and
elements exceeding this fixed number would cause an error. One advantage
is that the one that writes such classes can add them without altering the
DataRef class.
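
A class-specific allocation function of the kind described above might look like the following sketch. The slot size of 32 bytes and the derived classes IntRef and BigRef are assumptions made up for the example; only the name DataRef comes from the post.

```cpp
#include <cstddef>
#include <new>

class DataRef {
public:
    static const std::size_t kSlotSize = 32;  // assumed fixed slot size

    virtual ~DataRef() {}

    // Every DataRef-derived object gets a slot of the same size;
    // a class too large for the slot is rejected at allocation time.
    static void* operator new(std::size_t n) {
        if (n > kSlotSize) throw std::bad_alloc();
        return ::operator new(kSlotSize);
    }
    static void operator delete(void* p) { ::operator delete(p); }
};

// Hypothetical unboxed element that fits into a slot.
class IntRef : public DataRef {
public:
    int value;
    explicit IntRef(int v) : value(v) {}
};

// Hypothetical element that is too large for a slot.
class BigRef : public DataRef {
public:
    char payload[256];
};

// Returns the stored value after a successful slot allocation.
int make_int_ref(int v) {
    DataRef* p = new IntRef(v);
    int result = static_cast<IntRef*>(p)->value;
    delete p;
    return result;
}

// Returns true if allocating the oversized class is refused.
bool big_ref_is_rejected() {
    try {
        DataRef* p = new BigRef;
        delete p;
        return false;
    } catch (const std::bad_alloc&) {
        return true;
    }
}
```

New derived classes can be added without touching DataRef, as the post notes. The sketch does nothing about mutating an object into another type in place, which is the harder problem raised in the following paragraphs.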

  The problem with C++ though is to allow such objects to mutate into an
object of another type: This does not work with multiple inheritance, as
the new object may need to be adjusted out of the old allocation.

  So, one would then want a way in C++ to ensure that when a pointer to
a derived class is converted to that of a base class, the new pointer
points to the complete object: This could be done if objects have the
first word being a "type" pointer to a table with information of where to
find the various subparts.

  This should then be combined with a way to make such objects to behave
as if implemented as a (typed) union, so that they can self-mutate into
another type.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>
---

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/08/04 Raw View

In article <6q2jit$d2j$1@shell7.ba.best.com>, ncm@nospam.cantrip.org
(Nathan Myers) wrote:
>>  Is the C++ standard written so that one can always ensure that all
>>automatic and temporary objects taken together are released in the reverse
>>order they are created? This sounds unlikely though, but the reason I ask
>>is that if so, the feature might be used for keeping track of the root set
>>when implementing a GC.
>
>However unlikely it may seem, it's true.

  This is not true, not only according to the standard ch 12.2 (about
"Temporary"), but I made a stack implementation checking it on my
compiler, and it breaks exactly as one might expect:

  If one simply writes
    return x;
and the value of "x" is put into a temporary, and it is copied (using the
copy constructor) into a register for the return value, then the "x" value
may immediately be destroyed. I could verify that the stack order was
broken by a stack storing the allocated pointers and setting them to zero
when the object is destroyed: If the active stack then contains some 0
pointers, then the order is broken, and it is easy to make the stack
print it out and to use the debugger to locate the place in the program and
see what is going on.
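
A checking stack of the kind just described could be sketched as below. This is not the code used for the experiment, which was not posted; Tracked, live_stack and holes are hypothetical names. Every object records its slot on construction and zeroes that slot on destruction, so a zero below the top of the stack means an object was destroyed out of stack order.

```cpp
#include <cstddef>
#include <vector>

class Tracked;

// The checking stack: one entry per constructed Tracked object.
std::vector<const Tracked*>& live_stack() {
    static std::vector<const Tracked*> s;
    return s;
}

class Tracked {
    std::size_t slot_;  // index of this object's entry in live_stack()
public:
    Tracked() : slot_(live_stack().size()) { live_stack().push_back(this); }
    Tracked(const Tracked&) : slot_(live_stack().size()) { live_stack().push_back(this); }
    Tracked& operator=(const Tracked&) { return *this; }
    ~Tracked() {
        std::vector<const Tracked*>& s = live_stack();
        s[slot_] = 0;                                      // mark as destroyed
        while (!s.empty() && s.back() == 0) s.pop_back();  // drop zeros at the top
    }
};

// Number of zeroed entries still buried under live objects.
std::size_t holes() {
    const std::vector<const Tracked*>& s = live_stack();
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.size(); ++i)
        if (s[i] == 0) ++n;
    return n;
}

// Strictly nested lifetimes leave no holes.
std::size_t holes_with_nested_scopes() {
    Tracked a;
    { Tracked b; }
    return holes();
}

// Destroying the older object first leaves one hole under the newer one.
std::size_t holes_with_out_of_order_destruction() {
    Tracked* first = new Tracked;
    Tracked* second = new Tracked;
    delete first;
    std::size_t n = holes();
    delete second;
    return n;
}
```

The out-of-order case is forced here with new and delete so that the result does not depend on the compiler. Whether "return x;" produces a hole with this class depends on whether the compiler elides the copy, so that case is left for experiment.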

  So I think you are overlooking the temporary elements held in registers
when you try to think about the root set.

  The method is interesting though, because if one has knowledge about
those temporary objects, those could be added to the list of root
elements. This strengthens my view that C++ ought to have support for GC
implementations, because without such firsthand knowledge of the lifetimes
of the temporary objects, it seems to be rather difficult to keep track of
the root set.

  The interface for C++ GC support could be quite simple:

  One first marks the classes of data that should be tracked for belonging
to the root set:
    class Data {
        use Root root;
    };
This would mean that whenever an object of class Data is created in a
non-dynamic situation, the object "root" (which is a global object
initialized elsewhere) keeps track of that: That is, when data is created
in global, stacked, temporary ..., situations, but not as a part of an
object (also of another type B) created with "operator new". -- Possibly
one should indicate which "B::operator new" that should be excluded.

  Then, at GC-time, one writes
    root.init();   // Initialize root set pointer.
    Data* dp;
    while ((dp = root.next()) != 0)
        // Do GC on the pointer dp.

  With this in hand, the GC implementer can do the rest (marking up
handles, moving data, calling destructors, tracing pointers, freeing unused
memory, etc).
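
Without the proposed "use Root root;" extension, the same interface can only be approximated in library code. The sketch below is such an approximation under a loud assumption: ordinary C++ cannot tell whether a Data object is automatic or part of a heap object, so the creator has to say so through a hypothetical Placement argument. The function root() stands in for the global object "root".

```cpp
#include <cstddef>
#include <vector>

class Data;

// Keeps track of the Data objects that belong to the root set.
class Root {
    std::vector<Data*> members_;
    std::size_t cursor_;
public:
    Root() : cursor_(0) {}
    void add(Data* d) { members_.push_back(d); }
    void remove(Data* d) {
        for (std::size_t i = members_.size(); i > 0; --i)
            if (members_[i - 1] == d) { members_.erase(members_.begin() + (i - 1)); return; }
    }
    void init() { cursor_ = 0; }   // restart the iteration
    Data* next() { return cursor_ < members_.size() ? members_[cursor_++] : 0; }
};

// Stand-in for the global object "root" of the proposal.
Root& root() {
    static Root r;
    return r;
}

class Data {
    bool in_root_;
public:
    // Assumption: the creator states where the object lives.
    enum Placement { kRoot, kHeapMember };

    explicit Data(Placement p = kRoot) : in_root_(p == kRoot) {
        if (in_root_) root().add(this);
    }
    ~Data() { if (in_root_) root().remove(this); }
    Data(const Data&) = delete;
    Data& operator=(const Data&) = delete;
};

// The GC-time loop from the post, here only counting the root elements.
std::size_t count_roots() {
    std::size_t n = 0;
    root().init();
    Data* dp;
    while ((dp = root().next()) != 0)
        ++n;   // a real collector would trace from dp here
    return n;
}
```

The need for the Placement argument, and the bookkeeping cost on every construction and destruction, are exactly what compiler support would remove.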

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>
---

Author: "Paul D. DeRocco" <pderocco@ix.netcom.com>
Date: 1998/08/04 Raw View

Hans Aberg wrote:
>
>   One thing I slightly worry about the way C++ is now is that the
> destructor ~T() and the "operator delete" are not sufficiently
> independent: Perhaps the implementor of a GC should be able to ensure
> that when ~T() is called not "operator delete" is called. (But the
> latter can be simply set to "do nothing", so perhaps this is not a
> problem.)

But the only reason for garbage collection is to guarantee the eventual
freeing of memory in situations where the compiler or application cannot
otherwise determine when to free the memory. If the compiler can figure
out, or the application can specify, when to call the destructor, then
it may as well free the memory at the same time.

>   Then to the problem where one wants to ensure that say the ~T() is
> released the very moment nobody is looking at that object anymore.

My point is that if it isn't possible to know when that moment will
occur, the only thing that should be allowed to happen at the moment are
other things that the programmer doesn't need to know about, such as the
deallocation of memory. If a non-trivial destructor does something
meaningful to the application, then it isn't the sort of thing that we
want firing off at unpredictable times.

The combination of the above two points leads me to feel that garbage
collection is only appropriate for objects that have no meaningful
destruction semantics. This is a large category of objects, however,
including such things as strings.

However, the category of objects that should never be garbage collected
because they are deleted in a controlled manner is also large. I wonder
if the presence of large numbers of such objects makes too much excess
work for Boehm-like collectors. That is, if only a fraction of the
objects are of the sort that can be automatically reclaimable, then a
Boehm collector is going to be wasting gobs of time following fruitless
pointer chains.

--

Ciao,
Paul
---

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/30 Raw View

In article <35BF46D6.E7D47045@see.my.sig>, Anatoli <anatoli@see.my.sig> wrote:
>I believe I have a solution for you.

  I have looked at it, and it seems to be (close to) the right thing.

>  It may be a bit too
>expensive for your needs but it should work.

  Within what is possible within the current C++, it may not be that
expensive at all.

>  Sorry if
>it's something you already know.

  If I already knew everything, posting to this newsgroup would have been
unnecessary.

>class Object
>{
>public:
>  // a common base class; polymorphic so that dynamic_cast works
>  virtual ~Object () {}
>};
>
>class MethodBase
>{
>public:
>  virtual bool IsApplicableTo (Object*) = 0;
>  virtual void ApplyTo (Object*) = 0;
>  virtual ~MethodBase () {}
>};
>
>template <typename T>
>class Method : public MethodBase
>{
>  void (T::*memfun) ();
>public:
>  Method (void (T::*mf) ()) : memfun (mf) {}
>  bool IsApplicableTo (Object* obj)
>    {
>      return dynamic_cast<T*> (obj) != 0;
>    }
>  void ApplyTo (Object* obj)
>    {
>      (dynamic_cast<T&> (*obj).*memfun) ();
>      // throws bad_cast if something is wrong!
>    }
>};
>
>Now you can:
>
>  Object* obj = new MyGizmo;
>  MethodBase* met = new Method<MyGizmo> (&MyGizmo::MyFunc);
>  // 100 KLOC later
>  if (met->IsApplicableTo (obj))
>    met->ApplyTo (obj);
>  else
>    Fallback (obj);

  I think in the terms of what I posted before, with any virtual function
pointer f defined by a pair (V, k), where V is the base-most class in which
f is declared virtual and k is the offset of f in the vtbl of V, one should
write
    new Method<V>(&V::f);
That is, the one writing the C++ code must keep track of the class V where
f is first declared virtual.

  But otherwise I think it is efficient enough: I think one could write

    Object* obj = new C;
    MethodBase* met = new Method<V>(&V::f);
    // 100 KLOC later
    if (met->IsApplicableTo(obj))
        return met->memfun;
    else
        return &Base::general;  // Or Object::general
                                // -- something that always works.

Then the return (say g) is such that (obj->*g)(a) always works, which is
what I wanted.
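
For reference, a compilable form of the quoted Method technique might look like this. It is a sketch, not the original poster's exact code: MyGizmo, Other and apply_or_fallback are hypothetical, and the member functions are made const for convenience.

```cpp
#include <typeinfo>

// Common polymorphic base, so that dynamic_cast can be used.
class Object {
public:
    virtual ~Object() {}
};

class MethodBase {
public:
    virtual bool IsApplicableTo(Object*) const = 0;
    virtual void ApplyTo(Object*) const = 0;
    virtual ~MethodBase() {}
};

// Wraps a member function of T together with the run-time check
// that a given Object really is a T.
template <typename T>
class Method : public MethodBase {
    void (T::*memfun_)();
public:
    explicit Method(void (T::*mf)()) : memfun_(mf) {}
    bool IsApplicableTo(Object* obj) const {
        return dynamic_cast<T*>(obj) != 0;
    }
    void ApplyTo(Object* obj) const {
        // Throws std::bad_cast if obj is not a T.
        (dynamic_cast<T&>(*obj).*memfun_)();
    }
};

// Hypothetical classes for the demonstration.
class MyGizmo : public Object {
public:
    int calls;
    MyGizmo() : calls(0) {}
    virtual void MyFunc() { ++calls; }
};

class Other : public Object {};

// Applies the method if possible; returns false if a fallback is needed.
bool apply_or_fallback(const MethodBase& met, Object* obj) {
    if (met.IsApplicableTo(obj)) {
        met.ApplyTo(obj);
        return true;
    }
    return false;
}
```

Note that the check and the call each perform a dynamic_cast, and the call goes through one extra virtual function, which is the overhead discussed in the follow-up post.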

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>
---

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/30 Raw View

In article <35BF46D6.E7D47045@see.my.sig>, Anatoli <anatoli@see.my.sig> wrote:
>Here's one possible method [bare bones]:

  I have examined it and I post a version which flattens the function call
hierarchy:

>class MethodBase
>{
>public:
>  virtual bool IsApplicableTo (Object*) = 0;
>  virtual void ApplyTo (Object*) = 0;
>  virtual ~MethodBase () {}
>};
...
>Now you can:
>
>  Object* obj = new MyGizmo;
>  MethodBase* met = new Method<MyGizmo> (&MyGizmo::MyFunc);
>  // 100 KLOC later
>  if (met->IsApplicableTo (obj))
>    met->ApplyTo (obj);
>  else
>    Fallback (obj);

  In the original variation, this "met->ApplyTo(obj)" inserts one extra
function call, and the idea is to remove it. So the idea is to put the
virtual function pointer in the class MethodBase (which need not be
abstract), then extract that pointer as an object, and then call. So:

class Data;

class Base
{
public:
    virtual ~Base() {}  // Polymorphic base, so that dynamic_cast works.
    virtual Data argument(Data& obj, Data& arg);
    virtual Data general(Data&);
};

class MethodBase
{
public:
    Data (Base::*func)(Data&);

    virtual bool IsApplicableTo(Base*) { return false; }
    virtual ~MethodBase() {}
};

template<class T>
class Method : public MethodBase
{
public:
    // Forced conversion of the T member pointer to a Base member pointer.
    Method(Data (T::*f0)(Data&))
    {   func = static_cast<Data (Base::*)(Data&)>(f0);   }

    bool IsApplicableTo(Base* obj)
    {   return dynamic_cast<T*>(obj) != 0;   }
};

class Data
{
    Base* datap;        // Data moving over Base hierarchy.
    MethodBase* mbp;

public:
    Data(MethodBase* mb) : datap(NULL), mbp(mb) { }
    ~Data() { delete mbp; delete datap; }

    Base* data();

    Data evaluate(Data& arg)
    {
        Data d = arg.data()->argument(*this, arg);

        // Check the object that func is about to be invoked on.
        if (d.mbp != NULL && d.mbp->IsApplicableTo(data()))
            d = (data()->*(d.mbp->func))(arg);
        else
            d = data()->general(arg);

        return d;
    }
};

  Then define a virtual function pointer for the first time:

class V : public Base
{
public:
    virtual Data f(Data&);
};

and then call for a use of this V::f pointer:

class D : public Base
{
public:
    Data argument(Data&, Data&)
    {   return new Method<V>(&V::f);   }
};

  Now, the V::f will be forcefully converted to a Base::f virtual function
pointer, but it will never be used as such: It will only be used as
(vp->*f)(a) where vp is in class V or derived from class V. So if the
compiler does not do anything funny by this forced conversion to a Base::f
virtual function pointer, it will work.

  In this variation, the Method<V> class saves exactly the information
asked for, namely the pair (V, k) where V is the base-most class in which f
is first declared virtual, and k is the offset of V::f (symbolized by
Base::f).

  You can also see that if Data::evaluate is inlined, then no extra
function calls are inserted in the function call stack when the actual
evaluation takes place. (In reality I use a "while" loop scanning until
real Base data has been found.)

  Summing it up, the information needed seems to be in the compiler
already, but in the absence of that, this is a way to enter it by hand.

  A suggestion for a dynamic_cast on function pointers could be
    dynamic_cast<C::f>(p)
which should be equivalent to
    dynamic_cast<V*>(p)
where V is the base class of the virtual function C::f, that is the
base-most class V of C in which f is declared virtual.

  There are some details one would need to think of in order to make this
conform to C++ ideas: The return type of this dynamic_cast becomes
dynamic V* if C::f is allowed to be a variable. It would then suffice with
a dynamic_cast with return type bool, or perhaps a conditional evaluator
    (p->*(C::f, Base::g))(a);
meaning that if p is a pointer to a class object in or derived from the
base class V of f, compute (p->*V::f)(a), otherwise use (p->*Base::g)(a).
-- The point with this latter construct would be that it can be made fast
by the compiler.
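
The proposed conditional evaluator is not C++, but its intended behavior can be imitated with a helper template, at the cost of an ordinary dynamic_cast on each call. The sketch below is an assumption-laden illustration: call_or is a hypothetical name, and Base, V and C are minimal stand-ins with int arguments instead of Data.

```cpp
// Minimal stand-in hierarchy: f is first declared virtual in V.
class Base {
public:
    virtual ~Base() {}
    int g(int a) { return -a; }            // something that always works
};

class V : public Base {
public:
    virtual int f(int a) { return a + 1; }
};

class C : public V {
public:
    int f(int a) { return a + 100; }       // overrides V::f
};

// Imitates (p->*(V::f, Base::g))(a): if *p is a VT or derived from VT,
// call f on it (with the usual virtual dispatch), otherwise call g.
template <class VT, class B, class R, class A>
R call_or(B* p, R (VT::*f)(A), R (B::*g)(A), A a) {
    if (VT* vp = dynamic_cast<VT*>(p))
        return (vp->*f)(a);
    return (p->*g)(a);
}

// Object is a C, so V::f applies and dispatches to C::f.
int demo_applicable() {
    C c;
    Base* p = &c;
    return call_or(p, &V::f, &Base::g, 5);
}

// Object is a plain Base, so the fallback Base::g is used.
int demo_fallback() {
    Base b;
    return call_or(&b, &V::f, &Base::g, 5);
}
```

A compiler implementing the proposal directly could presumably make the test cheaper than a general dynamic_cast, which is the point of the suggestion; the helper only shows the semantics.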

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>
---

Author: AllanW@my-dejanews.com
Date: 1998/07/30 Raw View

In article <6pnthf$e60$1@shell7.ba.best.com>,
  ncm@nospam.cantrip.org (Nathan Myers) wrote:
> If you use GC to avoid deleting objects, those objects had better not
> have meaningful destructors.  The meaningful destructor is precisely
> the tool that C++ offers to manage resources, both memory and "other".
> You cannot garbage-collect objects that represent resources other than
> memory.  This fact breaks composition and encapsulation, for garbage-
> collection: you cannot compose an object of subobjects that might
> manage a resource other than memory.  To use a subobject, you must
> know details of its implementation, which breaks encapsulation.
>
> A GC formalism which could accommodate other resources would not
> need to break composition and encapsulation.

I've never heard of a GC for C++ that managed to call destructors.
I suspect that it would be technically challenging, and would perhaps
require support from the compiler (start with a way to get RTTI for
any type of object, including arrays, from a void*, and then add the
ability to call a class's destructor given its class name at
runtime, but not before.)
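
Short of compiler support, one way around the missing type information is to record a destructor thunk next to each allocation at the moment the type is still known. The sketch below illustrates only that bookkeeping; FinalizingHeap, Allocation, destroy_as and Counted are hypothetical names, and release_all merely stands in for the point at which a real collector would have decided that the objects are garbage.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// What the heap remembers about each object: its address and
// a function that knows how to destroy it.
struct Allocation {
    void* ptr;
    void (*destroy)(void*);
};

// Type-erased destructor call: the type T is fixed when the thunk is created.
template <class T>
void destroy_as(void* p) {
    static_cast<T*>(p)->~T();
}

class FinalizingHeap {
    std::vector<Allocation> blocks_;
public:
    FinalizingHeap() {}
    FinalizingHeap(const FinalizingHeap&) = delete;
    FinalizingHeap& operator=(const FinalizingHeap&) = delete;
    ~FinalizingHeap() { release_all(); }

    // Allocates and default-constructs a T, remembering how to destroy it.
    template <class T>
    T* create() {
        blocks_.reserve(blocks_.size() + 1);  // so push_back below cannot throw
        void* mem = ::operator new(sizeof(T));
        T* obj;
        try { obj = new (mem) T(); }
        catch (...) { ::operator delete(mem); throw; }
        blocks_.push_back(Allocation{mem, &destroy_as<T>});
        return obj;
    }

    // Runs the recorded destructor for every object, given only a void*,
    // then frees the memory. Returns the number of objects released.
    std::size_t release_all() {
        std::size_t n = blocks_.size();
        while (!blocks_.empty()) {
            Allocation a = blocks_.back();
            blocks_.pop_back();
            a.destroy(a.ptr);
            ::operator delete(a.ptr);
        }
        return n;
    }
};

// Hypothetical type with an observable destructor.
struct Counted {
    static int destroyed;
    ~Counted() { ++destroyed; }
};
int Counted::destroyed = 0;
```

This requires all collected objects to be created through the heap's own interface, so it does not answer the question for arbitrary objects and arrays reached through a void*, which is where compiler support would come in.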

But suppose, hypothetically, that someone did overcome this hurdle,
creating a garbage collector that called the destructor for all objects.
Would you still object to it?

-----== Posted via Deja News, The Leader in Internet Discussion ==-----
http://www.dejanews.com/rg_mkgrp.xp   Create Your Own Free Member Forum
---

Author: kanze@my-dejanews.com
Date: 1998/08/01 Raw View

In article <6pnthf$e60$1@shell7.ba.best.com>,
  ncm@nospam.cantrip.org (Nathan Myers) wrote:
> Alexandre Oliva <oliva@dcc.unicamp.br> wrote:

> >If only it were ensured to run at application termination, it could be
> >used to save the state of persistent objects...
>
> Or if it had other hooks that could be related to program execution
> events, it could be used for other things.  Recall that I mentioned
> it in the context of languages that have GC but offer poor facilities
> for management of resources other than memory.
>
> If you use GC to avoid deleting objects, those objects had better not
> have meaningful destructors.

This is a tautology.  GC is independent of the functionality of
destructors -- if you want something done, CALL the function that does it.

One particularity of C++ (but hardly a universal feature in all languages)
is that there is a special function, called a destructor, which is
automatically called when the object in question goes out of scope,
or by the delete operator.  If you want this function to be called,
have the object go out of scope, or invoke the delete operator.

The above holds true, with or without garbage collection.  Of course,
without garbage collection, there will be a lot more cases when you
will want such a function, and so a lot more code to write.  If you
like writing (and debugging) extra code, then garbage collection is
not for you.  (There are, of course, other reasons why it might not
be for you, that may be applicable.  But usually, they aren't.)

> The meaningful destructor is precisely
> the tool that C++ offers to manage resources, both memory and "other".

Excellently expressed.

> You cannot garbage-collect objects that represent resources other than
> memory.

Of course you can.  You do it exactly like you do things today -- you
call the destructor.  Or you make it clearer what you are doing, by
giving the function a meaningful name, and calling it.  (There is
still much to be said for calling the destructor -- in many such cases,
using a local variable going out of scope may be quite appropriate.)

> This fact breaks composition and encapsulation, for garbage-
> collection: you cannot compose an object of subobjects that might
> manage a resource other than memory.  To use a subobject, you must
> know details of its implementation, which breaks encapsulation.

For any object, you must know its semantics.  If you are using Lock objects,
for example, you must know when the lock is freed.  If it is freed in the
destructor, you must know this, and ensure that the object remains in
existence long enough, but not too long.  One could easily argue that
this breaks encapsulation, and introduces an unnecessary coupling
between the lifetime of the object (determined at least partially by
the rules of the language) and the period that the lock is held.  In
fact, I don't find this to be a problem, but I do find that you must
know about this "feature" of the Lock class in order to use it correctly.

> A GC formalism which could accommodate other resources would not
> need to break composition and encapsulation.

In general, any formalism which will allow the system to do some of the
work for you is good.  If you can define a formalism which will manage
not only memory, but also locks, I'll be the first to suggest adopting
it.

--
James Kanze    +33 (0)1 39 23 84 71    mailto: kanze@gabi-soft.fr
        +49 (0)69 66 45 33 10    mailto: jkanze@otelo.ibmmail.com
GABI Software, 22 rue Jacques-Lemercier, 78000 Versailles, France
Conseils en informatique orientée objet --
              -- Beratung in objektorientierter Datenverarbeitung


Author: kanze@my-dejanews.com
Date: 1998/08/01 Raw View

In article <6po1ji$6qk$1@shell7.ba.best.com>,
  ncm@nospam.cantrip.org (Nathan Myers) wrote:
> >Garbage collection doesn't claim to solve all problems.
> >But for the problems that it does address, it addresses
> >them in a manner that requires less discipline (i.e. effort)
> >on the part of the programmer than programming by contract.
>
> Where GC works, it works.  The problem is that adding use of a
> non-memory resource to a GC program can make the structure of the
> entire program no longer appropriate.  This cannot happen in a
> program written in contract style.  Of course arbitrarily large
> changes in requirements can make any program architecture
> inappropriate, but we like for small changes to have small
> consequences.

Two comments: first, I've yet to see a program that didn't have some
non-memory resources from the beginning.  So the program would (hopefully)
be structured to take them into account, even in the presence of garbage
collection.  Second, this is a general problem in software engineering;
we don't let it stop us from using the appropriate tools.  I'm sure you
wouldn't ban integer arithmetic because a program structured to use
integer arithmetic won't be appropriate if some of the values are changed
to floating point.

> Global GC may seduce you into unwise decisions early in a
> project.  GC that can be applied independently to individual
> data structures is safer that way.

As a general rule, I'm not sure, but I think I could accept that it might
be applicable in C++.  Objects which are to be garbage collected must be
declared as such.  (Sort of a type qualifier.)  I believe, in fact, that
this was the case in the Detlefs proposal.

--
James Kanze    +33 (0)1 39 23 84 71    mailto: kanze@gabi-soft.fr
        +49 (0)69 66 45 33 10    mailto: jkanze@otelo.ibmmail.com
GABI Software, 22 rue Jacques-Lemercier, 78000 Versailles, France
Conseils en informatique orientée objet --
              -- Beratung in objektorientierter Datenverarbeitung


Author: Hyman Rosen <hymie@prolifics.com>
Date: 1998/08/01 Raw View

AllanW@my-dejanews.com wrote:
> I've never heard of a GC for C++ that managed to call destructors.

It's not at all hard. The Boehm collector, for example, allows you to
register a function upon a piece of allocated memory. Before the collector
reclaims the memory, it calls the function, passing in a pointer to the
memory as an argument. All the C++ compiler has to do is to register a
destructor when it invokes new(type).
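
The registration idea can be sketched with a toy registry. This is only an illustration under stated assumptions: `ToyHeap`, `allocate`, `reclaim`, `destroy_thunk`, and `gc_new` are names invented here and are not the Boehm collector's interface, and `reclaim` is called by hand where a real collector would decide for itself that the block is unreachable.

```cpp
#include <cassert>
#include <cstdlib>
#include <map>
#include <new>
#include <string>
#include <vector>

// Toy stand-in for a collector: it remembers one finalizer per block.
// Hypothetical names throughout; this is not the Boehm collector's API.
typedef void (*Finalizer)(void*);

class ToyHeap {
    std::map<void*, Finalizer> finalizers_;
public:
    // Allocate a block and register the function to call before reclaiming it.
    void* allocate(std::size_t n, Finalizer fin) {
        void* p = std::malloc(n);
        if (!p) throw std::bad_alloc();
        finalizers_[p] = fin;
        return p;
    }
    // Called when the block is found unreachable (here: by hand).
    void reclaim(void* p) {
        std::map<void*, Finalizer>::iterator it = finalizers_.find(p);
        if (it == finalizers_.end()) return;
        if (it->second) it->second(p);   // run the registered function first
        finalizers_.erase(it);
        std::free(p);
    }
};

// What a compiler could register for new(T): a thunk that runs ~T().
template <typename T>
void destroy_thunk(void* p) { static_cast<T*>(p)->~T(); }

template <typename T>
T* gc_new(ToyHeap& heap) {
    void* raw = heap.allocate(sizeof(T), &destroy_thunk<T>);
    return new (raw) T();   // construct the object in the registered block
}

std::vector<std::string> event_log;

struct Gizmo {
    Gizmo()  { event_log.push_back("ctor"); }
    ~Gizmo() { event_log.push_back("dtor"); }
};

// Builds one Gizmo, reclaims it, and returns what happened.
std::vector<std::string> finalizer_demo() {
    event_log.clear();
    ToyHeap heap;
    Gizmo* g = gc_new<Gizmo>(heap);
    heap.reclaim(g);
    return event_log;
}
```

The thunk is what carries the type: the heap itself only ever sees a `void*` and a function pointer, so no RTTI lookup from a `void*` is needed in this sketch.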

Author: David Wragg <dpw@doc.ic.ac.uk>
Date: 1998/08/01 Raw View

haberg@REMOVE.matematik.su.se (Hans Aberg) writes:
> [snip]
>   Now, tying this up to the original discussion: I am working with
> something deeper than just identifying functions using types; I am
> working with a C++ style object-orientation, except that I have taken
> the full step out, making the objects completely runtime dynamic.
>
>   This is in part the reason I knock my head so hard on this virtual
> pointer question, because complicated workarounds will be both too
> slow and too difficult to work with.

Having read your posts for several days I think I now have a fairly
good idea of what you are trying to achieve. (I didn't in my earlier
replies; I probably should have read your earlier posts more
carefully). Below I describe some of the thoughts I have had reading
your posts. They don't lead to a solution, but hopefully they might
shed some light on why you have encountered some hard problems.

As I understand it, you are implementing a programming language of
some kind, and this programming language has certain semantics. You
seem to be trying to implement your language by mapping its semantics
very closely onto the semantics of C++. Because your language has
semantics that C++ does not have, this approach runs into problems,
hence this discussion.

Many of the workarounds that have been suggested have, from a general
purpose C++ programming perspective, been interesting and fairly
widely applicable, but not for your purposes. IMHO, this is because
language implementation is not a typical kind of programming activity.

For any general purpose programming language, the basic constructs and
their semantics will be selected with the intention that, by combining
these constructs, a wide range of programming problems can be tackled
with relative ease. Even so, it is not the case that the basic
constructs of one language map cleanly onto the basic constructs of
another. This fact doesn't point to flaws or omissions in any
particular language; it's just that languages have different
"philosophies".

This makes implementing one language using another an interesting
problem. If both languages share a similar "philosophy" it can be
easy; implementing a Lisp-like language in Lisp is fun, as is
implementing a logic language in Prolog. But this doesn't always
follow: Imagine implementing a simple interpreted OO language in C++;
since the class definitions of the interpreted language would only be
seen at run-time, it would not be possible to implement the method
dispatch mechanism in terms of the method dispatch mechanism of C++. I
think this is the kind of fundamental problem Hans has run into.
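
A minimal sketch of what "from scratch" can mean here, assuming a toy interpreted language whose classes exist only at run time. `RtClass`, `RtObject`, `send`, and the method names are all invented for the illustration; the point is only that lookup walks a run-time table instead of a C++ vtable.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

struct RtObject;
typedef int (*RtMethod)(RtObject&);

// A class of the interpreted language: created at run time, not by the compiler.
struct RtClass {
    const RtClass* parent;
    std::map<std::string, RtMethod> methods;
    explicit RtClass(const RtClass* p = 0) : parent(p) {}

    // Hand-written dispatch: search this class, then its ancestors.
    RtMethod lookup(const std::string& name) const {
        for (const RtClass* c = this; c; c = c->parent) {
            std::map<std::string, RtMethod>::const_iterator it = c->methods.find(name);
            if (it != c->methods.end()) return it->second;
        }
        return 0;
    }
};

struct RtObject {
    const RtClass* cls;
    int field;
    RtObject(const RtClass* c, int f) : cls(c), field(f) {}

    int send(const std::string& name) {
        RtMethod m = cls->lookup(name);
        if (!m) throw std::runtime_error("no such method: " + name);
        return m(*this);
    }
};

int base_get(RtObject& self)    { return self.field; }
int derived_get(RtObject& self) { return self.field * 2; }
int base_one(RtObject&)         { return 1; }

int dispatch_demo() {
    RtClass base;
    base.methods["get"] = &base_get;
    base.methods["one"] = &base_one;
    RtClass derived(&base);
    derived.methods["get"] = &derived_get;   // overrides "get", inherits "one"

    RtObject obj(&derived, 21);
    return obj.send("get") + obj.send("one");   // 42 + 1
}
```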

In other words, if you want to implement a language, then you might
have to implement that language "from scratch", rather than riding on
the mechanisms of the language you implement in.

--
Dave Wragg

Author: ncm@nospam.cantrip.org (Nathan Myers)
Date: 1998/08/01 Raw View

Hans Aberg<haberg@REMOVE.matematik.su.se> wrote:
>ncm@nospam.cantrip.org (Nathan Myers) wrote:
>>If you use GC to avoid deleting objects, those objects had better not
>>have meaningful destructors. ...
>
>  However, this can be achieved in the model I presented, a class Data
>with pointers to DataRef handles with a pointer to object of a class Base
>hierarchy: If say the DataRef handles are on two linked lists, those in
>use and those not in use, one just lets the destructors T::~T(), where T :
>Base, be invoked whenever a handle is taken off the list of handles in
>use.
>
>  One problem then, is that one does not know exactly when this is going
>to happen, but that problem exists already within C++, with the temporary
>objects.

The time and order of destruction of unbound temporary objects
is well-specified: they are destroyed at the end of the "containing
expression" in reverse order of construction.  Bound temporaries
have the lifetime of the reference they were bound to.
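
A small sketch of those two rules. `Noisy`, `touch`, and `trace` are names made up for the illustration. The `&&` is used because it fixes the order in which the two temporaries are constructed; plain function arguments would leave that order unspecified.

```cpp
#include <cassert>
#include <cctype>
#include <string>

std::string trace;

// Appends a lowercase letter on construction, the uppercase one on destruction.
struct Noisy {
    char id;
    explicit Noisy(char c) : id(c) { trace += id; }
    ~Noisy() {
        trace += static_cast<char>(std::toupper(static_cast<unsigned char>(id)));
    }
};

bool touch(const Noisy&) { return true; }

std::string temporaries_demo() {
    trace.clear();

    // Unbound temporaries: both live until the end of the full expression,
    // then are destroyed in reverse order of construction.
    bool ok = touch(Noisy('a')) && touch(Noisy('b'));
    (void)ok;
    trace += '|';

    {
        // Bound temporary: lives as long as the reference r.
        const Noisy& r = Noisy('c');
        (void)r;
        trace += '-';
    }
    return trace;   // "abBA|c-C"
}
```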

Memory-like resource management is appropriate for memory-like resources.
Needing to know that everything in a structure satisfies that criterion
still breaks encapsulation, but that may be tolerable in many cases.

--
Nathan Myers
ncm@nospam.cantrip.org  http://www.cantrip.org/

Author: ncm@nospam.cantrip.org (Nathan Myers)
Date: 1998/08/01 Raw View

<AllanW@my-dejanews.com> wrote:
>In article <6pnthf$e60$1@shell7.ba.best.com>,
>  ncm@nospam.cantrip.org (Nathan Myers) wrote:
>> If you use GC to avoid deleting objects, those objects had better not
>> have meaningful destructors. ...
>> [else] you cannot compose an object of subobjects that might
>> manage a resource other than memory.  To use a subobject, you must
>> know details of its implementation, which breaks encapsulation.
>
>But suppose, hypothetically, that someone did overcome this hurdle,
>creating a garbage collector that called the destructor for all objects.
>Would you still object to it?

Since you asked...

I don't object to garbage collectors or to garbage collection.
In fact, I depend on it daily in my various scripting languages.
I only object to inflated claims made for GC, and for languages
that depend on it.

A garbage collector for C++ that called destructors would be more
or less generally useful depending on how much control it offered
over when those destructors ran.  Of course generality is not always
necessary, except in a language standard.

--
Nathan Myers
ncm@nospam.cantrip.org  http://www.cantrip.org/

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/08/01 Raw View

In article <6pqfm2$kh6$1@nnrp1.dejanews.com>, AllanW@my-dejanews.com wrote:
>I've never heard of a GC for C++ that managed to call destructors.
>I suspect that it would be technically challenging, and would perhaps
>require support from the compiler (start with a way to get RTTI for
>any type of object, including arrays, from a void*, and then add the
>ability to call a class's destructor given its class name at
>runtime, but not before.)

  I just made such a suggestion in another post in this thread: The model
of an automatic class Data with a pointer to a class DataRef handle object,
which has a moving pointer to an object of a class derived from the class Base.
The class Base hierarchy objects end up on the heap, but one natural way
is to put the DataRef handles on two linked lists (which are not in the heap
but in a separate array), handles in use and handles not in use. So whenever a
handle in use is taken off that list and put onto the list of handles not
in use, simply call the destructor of the object too.
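
A rough sketch of the handle bookkeeping, under assumptions of my own: the pool size, the member names, and the `acquire`/`release` functions are invented here, and the Data class that would hold a DataRef pointer is left out. It shows only the step where moving a handle to the free list runs the destructor.

```cpp
#include <cassert>
#include <cstddef>

struct Base { virtual ~Base() {} };

// A handle: points at a heap object and links into one of the two lists.
struct DataRef {
    Base* object;
    DataRef* next;
};

class HandlePool {
    static const std::size_t N = 8;   // arbitrary size for the sketch
    DataRef slots_[N];                // the handles live in an array, not the heap
    DataRef* in_use_;
    DataRef* free_;
public:
    HandlePool() : in_use_(0), free_(0) {
        for (std::size_t i = 0; i < N; ++i) {
            slots_[i].object = 0;
            slots_[i].next = free_;
            free_ = &slots_[i];
        }
    }
    ~HandlePool() { while (in_use_) release(in_use_); }

    // Move a handle from the free list to the in-use list.
    DataRef* acquire(Base* obj) {
        if (!free_) return 0;         // pool exhausted; caller still owns obj
        DataRef* h = free_;
        free_ = h->next;
        h->object = obj;
        h->next = in_use_;
        in_use_ = h;
        return h;
    }

    // Move a handle back to the free list and destroy its object.
    void release(DataRef* h) {
        DataRef** p = &in_use_;
        while (*p && *p != h) p = &(*p)->next;
        if (!*p) return;              // not an in-use handle
        *p = h->next;
        delete h->object;             // runs T::~T() through the virtual destructor
        h->object = 0;
        h->next = free_;
        free_ = h;
    }
};

int live = 0;
struct Counted : Base {
    Counted()  { ++live; }
    ~Counted() { --live; }
};

int handle_demo() {
    HandlePool pool;
    DataRef* a = pool.acquire(new Counted);
    DataRef* b = pool.acquire(new Counted);
    (void)b;                          // released by the pool's destructor
    int before = live;                // 2
    pool.release(a);
    int after = live;                 // 1
    return before * 10 + after;
}
```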

>But suppose, hypothetically, that someone did overcome this hurdle,
>creating a garbage collector that called the destructor for all objects.
>Would you still object to it?

  There is no reason to object to this model, as I see it, as a destructor
that is not needed will simply be empty, except for this detail: The destructor
must be virtual, and the way C++ is now, such virtual functions cannot be
inlined. So the burden is having lots of empty destructors called.

  So a change that C++ might need is a way to avoid executing such
virtual destructors if they are empty.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>

Author: abrahams@motu.com (David Abrahams)
Date: 1998/07/29 Raw View

On 29 Jul 98 11:51:15 GMT, kanze@my-dejanews.com wrote:

>In article <6piuim$7ou$1@shell7.ba.best.com>,

>Seriously, I know of no other language which tries to put as much into
>finalization as does C++.  The "finally clause" works well for encapsulating
>things that have to be done before leaving a block.  It isn't necessary
>in C++, because the idiom which has developed is to use destructors
>for this, but many would argue that the finally clause is more natural;
>the cleanup takes place in the same scope as the rest of the function.

You can simulate that in C++ with (HORRORS!) macros, but I doubt if
most compilers would recognize and optimize away the resulting code
duplication.

>In fact, I don't really believe that one is better than the other --
>they are different, that's all.

You say this now, and from my readings on Java you are correct, but
then later...

>In the case of a file descriptor, neither do
>destructors; only the finally clause is really appropriate.  (On all
>of the systems I've used, "freeing" a file descriptor involves closing
>the file, an operation that can fail.  Which means that you must have
>some way of testing the error and handling it.)

As far as I remember, finally doesn't give you any special tools for
dealing with this problem. Have I forgotten something? I was under the
impression that finally and destructor cleanup were semantically
equivalent.
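
For reference, one common C++ way to handle a close that can fail is an explicit `close()` whose result the caller checks, with the destructor kept only as a safety net. The sketch below uses a fake descriptor so that the failure can be provoked; `FakeDescriptor`, `fake_close`, `File`, and `save` are hypothetical names, not any real system interface.

```cpp
#include <cassert>
#include <string>

// Stand-in for an OS file descriptor whose close can fail.
struct FakeDescriptor {
    bool open;
    bool close_will_fail;
};

// Returns false to report a failed close; the descriptor is gone either way.
bool fake_close(FakeDescriptor& d) {
    d.open = false;
    return !d.close_will_fail;
}

class File {
    FakeDescriptor& fd_;
    bool closed_;
public:
    explicit File(FakeDescriptor& fd) : fd_(fd), closed_(false) { fd_.open = true; }

    // Explicit close: the caller sees the error and can handle it.
    bool close() {
        if (closed_) return true;
        closed_ = true;
        return fake_close(fd_);
    }

    // Safety net only: a destructor has nowhere to report the failure.
    ~File() { if (!closed_) fake_close(fd_); }
};

std::string save(FakeDescriptor& fd) {
    File f(fd);
    // ... write data here ...
    if (!f.close()) return "close failed";
    return "ok";
}
```

If `save` left early through an exception, the destructor would still release the descriptor, but the close error would be lost, which is the limitation under discussion.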

-Dave

Author: Anatoli <anatoli@see.my.sig>
Date: 1998/07/30 Raw View

Hans Aberg wrote:
> [total snip]

I believe I have a solution for you.  It may be a bit too
expensive for your needs but it should work.  Sorry if
it's something you already know.

You are trying to determine when a given mem-vfun-ptr
is applicable to a given object.  Note that mem-vfun-ptrs
are little more than offsets in vtbl.  However, you cannot
simply look at vtbl and see whether a given slot is occupied,
because it may be occupied by something completely irrelevant
(a function with different signature).

In other words, you cannot ask the object.  The only other
way is to ask the mem-vfun-ptr.

But again, mem-vfun-ptrs are just offsets.  They do not store
any type information at runtime.  You've asked "C++ standardizers"
to include such information for you, but you can do it yourself.
Just encapsulate mem-vfun-ptrs in a polymorphic class.
Here's one possible method [bare bones]:

class Object
{
public:
  // a common base class; it must be polymorphic for dynamic_cast to work
  virtual ~Object () {}
};

class MethodBase
{
public:
  virtual bool IsApplicableTo (Object*) = 0;
  virtual void ApplyTo (Object*) = 0;
  virtual ~MethodBase () {}
};

template <typename T>
class Method : public MethodBase
{
  void (T::*memfun) ();
public:
  Method (void (T::*mf) ()) : memfun (mf) {}
  bool IsApplicableTo (Object* obj)
    {
      return dynamic_cast<T*> (obj) != 0;
    }
  void ApplyTo (Object* obj)
    {
      (dynamic_cast<T&> (*obj).*memfun) ();
      // throws bad_cast if something is wrong!
    }
};

Now you can:

  Object* obj = new MyGizmo;
  MethodBase* met = new Method<MyGizmo> (&MyGizmo::MyFunc);
  // 100 KLOC later
  if (met->IsApplicableTo (obj))
    met->ApplyTo (obj);
  else
    Fallback (obj);

You may similarly define class ClosureBase and template
class Closure which would contain both object pointer
and mem-vfun-ptr.  The ApplyTo() in this case will return
ClosureBase* (pointing to suitably composed Closure<T>)
instead of calling memfun directly.  You can save about half
of dynamic_cast<>s that way, more if you execute closures
more than once. Or you may pass a fallback function to ApplyTo.
You may also employ the "operator ()" syntax for better
readability.  [Or even "operator->"!]

You may note that this solution is very similar to one
that I've proposed in comp.lang.c++.moderated
under thread "Compilers rejecting explicit casting?".
I think that this construct is powerful enough to play
many, many dynamic tricks in C++.

--
Regards
Anatoli (anatoli at ptc dot com) -- opinions aren't

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/30 Raw View

In article <35BF2588.167E@wizard.net>, James Kuyper <kuyper@wizard.net> wrote:
>I think that by "runtime objects" you mean the same thing I am talking
>about when I say "symbol definitions".

  Not really: I throw away all language constructs and then the evaluation
takes place by runtime objects alone. So all ideas of "symbols" in a
computer language sense are irrelevant, because they do not exist at the
time the evaluations take place.

  For example, in a computer language, an evaluation like exp(sqrt) -> exp
o sqrt could take place by identifying the symbols "exp" and "sqrt" in an
evaluation position and determining that the result should be "exp o
sqrt".

  However, I parse it to the objects "exp" and "sqrt" in an evaluation
position, and then the evaluation is handed over to the objects
themselves: At this point one could remove the name table and all its
symbols from the program, and then perform the evaluation. So what
language and syntax constructs one uses is irrelevant, as the language
itself does not participate in the evaluations that take place.

>Those definitions are represented
>at runtime by C++ object.

  So I think it is just misleading to think in terms of computer languages
and symbols on this question.

>I don't know what you mean by "the symbols are first ways". I'm having
>trouble parsing that phrase.

  It's a typo; it should be: "the symbols are first thrown away".

>You've left me curious as to what
>general() actually does. All you've explained so far are the
>circumstances under which it is called; you have not said what it does
>once it is called.

  It depends on the object what it feels is right. For the discussion in
this thread, it suffices to know that it handles the cases in which the
virtual function pointer (or what it represents) cannot be identified.

  Otherwise, some examples of use: The identity would just return a
reference to its argument. And for a function composition object, (f o
g).general(a) = f(g(a)). A function like "sqrt" could make an evaluation
for a deferred evaluation, and yet other objects may throw an exception.

>>   Not at all (see my other post).
>
>I've looked at that post, and it makes explicit the way in which the
>internal representation and evaluation of member function pointers would
>get more complicated. I don't know for sure, but I think that right now
>such a pointer could be implemented by a 1 byte vtable offset, and
>evaluated by simply subscripting the vtable.

  The problem is that one needs a pair (V, k), where V is the base class
of the virtual function pointer with offset k, and that classes (like V)
are not first class objects that can be treated as variables: If I could
trick C++ into producing the pair (V, k) and then reuse it, I could solve the
problem; but I do not see how one could make C++ do that.

>>   I think that as soon the one version of C++ has been approved, the work
>> on the next revision starts.
>
>My understanding is that ISO rules require a long (5 year?) period
>during which all work is concentrated on maintaining and gaining
>experience with the current standard, before the committee even starts
>thinking about the next version.

  I do not recall the details, but I am told that somehow the old slow ISO
process has been speeded up in order to help languages to have a longer
life-span.

>You're giving more syntactic details about your application, which of
>course complicates the design. I think that this could be handled within
>the context of my suggestion by defining a Pair object derived from
>Base, containing two Data members. The Atan object would then check
>whether dynamic_cast<Pair *>(pBase) was NULL, and if not it would extract
>the Base pointers from the two Data sub-objects and see if they could be
>dynamic_cast<Double *>. How you create the Pair object is a detail which
>depends upon how your parser recognises the ','.

  The problem is not whether one can do it by such means but that one
wants to avoid all kinds of dynamic_cast checks as much as possible, as they
are slow and complicate the design: The ideal is that the path to the
right evaluation should be as short as possible. (I can also mention that
the more dynamic structures I implement, the less need there is for macro
programming as that can be handled by runtime objects: Instead of a STL
pair, I use a dynamic list class which right now happens to be implemented
with multiple inheritance from a STL vector class. Then I can always use
that one for creating lists, as all variables are generic. STL was an
inspiration when I started with these ideas.)

  Usually, I start by dividing into cases as you suggest, and when I
have realized a more suitable structure, I proceed to the virtual function
pointer structure I suggested. There are some other reasons for this
approach: Objects are also supposed to be able to handle deferred
evaluations, and this becomes awfully complicated to do if different cases
are bundled together.

  I mentioned one desired requirement, and that is to not insert any
intermediate function calls unless necessary. So the way I have it now, if
"sqrt(9)" is evaluated, one lands (after this complicated modified
double dispatch evaluation) on sqrt.double(9), and there are no
intermediate function calls in this hierarchy. (So the double
dispatch 9.argument(sqrt) is removed by returning a virtual function
pointer.)
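
  As a rough illustration of that evaluation path (not my actual code; the names `Function`, `Value`, `Handler`, `onDouble`, and `apply` are made up for this sketch): the argument object returns a pointer to a virtual member function of the function hierarchy, and the call through that pointer lands directly on the right overrider, falling back to general() when the function does not handle the case.

```cpp
#include <cassert>
#include <cmath>
#include <stdexcept>

struct Value;

// Function objects: one virtual entry point per argument kind,
// each defaulting to general().
struct Function {
    virtual ~Function() {}
    virtual double general(const Value&) const {
        throw std::runtime_error("argument not understood");
    }
    virtual double onDouble(const Value& a) const { return general(a); }
};

typedef double (Function::*Handler)(const Value&) const;

// Argument objects: each one names the entry point it wants called.
struct Value {
    virtual ~Value() {}
    virtual Handler handler() const = 0;
};

struct Double : Value {
    double v;
    explicit Double(double x) : v(x) {}
    Handler handler() const { return &Function::onDouble; }
};

struct Text : Value {
    Handler handler() const { return &Function::general; }
};

struct Sqrt : Function {
    // Reached only through Double::handler(), so the downcast is known to be safe.
    double onDouble(const Value& a) const {
        return std::sqrt(static_cast<const Double&>(a).v);
    }
};

// One virtual call to fetch the pointer, one virtual call through it.
double apply(const Function& f, const Value& a) {
    Handler h = a.handler();
    return (f.*h)(a);
}

bool rejects_text() {
    try { apply(Sqrt(), Text()); }
    catch (const std::runtime_error&) { return true; }
    return false;
}
```

  With these definitions, apply(Sqrt(), Double(9.0)) ends up in Sqrt::onDouble with no dynamic_cast on the way.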

  Apart from being a pleasing evaluation model, and that it is wise to not
insert more function calls than necessary, I think of some possible
applications: For example, a recursive descent parser which removes
intermediate function calls if not necessary for the parsing.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>

Author: kanze@my-dejanews.com
Date: 1998/07/30 Raw View

In article <haberg-2807981912460001@sl53.modempool.kth.se>,
  haberg@REMOVE.matematik.su.se (Hans Aberg) wrote:
>   Here is a much simpler question:
>
>   In the code
>
>     class B {
>     public:
>         virtual int f();
>     };
>
>     class D : public B {
>     public:
>         int f();
>     };
>
>     typedef int (B::*Bf)();
>
> If I write
>     Bf g = &B::f;
>     B* bp = new D();
> is
>     (bp->*g)();
> guaranteed in C++ to compute using D::f,

Of course.  That's part of why pointers to member functions look so
different from pointers to non-member functions.
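
A complete version of the quoted fragment, as a sketch: the function bodies, the virtual destructor, and the wrapper `call_through_pointer` are additions made here so that the example can be compiled and the result observed.

```cpp
#include <cassert>

class B {
public:
    virtual ~B() {}
    virtual int f() { return 1; }
};

class D : public B {
public:
    int f() { return 2; }   // overrides B::f
};

typedef int (B::*Bf)();

int call_through_pointer() {
    Bf g = &B::f;            // names the virtual function, not one fixed body
    B* bp = new D();
    int result = (bp->*g)(); // dispatches on the dynamic type of *bp
    delete bp;
    return result;           // 2, the value from D::f
}
```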

--
James Kanze    +33 (0)1 39 23 84 71    mailto: kanze@gabi-soft.fr
        +49 (0)69 66 45 33 10    mailto: jkanze@otelo.ibmmail.com
GABI Software, 22 rue Jacques-Lemercier, 78000 Versailles, France
Conseils en informatique orientée objet --
              -- Beratung in objektorientierter Datenverarbeitung


Author: kanze@my-dejanews.com
Date: 1998/07/30 Raw View

In article <or90lcvoan.fsf@sunsite.dcc.unicamp.br>,
  Alexandre Oliva <oliva@dcc.unicamp.br> wrote:
> kanze  <kanze@my-dejanews.com> writes:
>
> > The "finally clause" works well for encapsulating things that have
> > to be done before leaving a block.
>
> But it doesn't encapsulate things that have to be done when an object
> goes out of scope, which is a pity.  If I wanted to implement in Java
> a locking mechanism that releases the locks as the lock object goes
> out of scope, every block that created such an object would have to
> end with a finally clause releasing the lock.  This is totally against
> encapsulation.  :-(

The problem is that I've never found anything that I wanted to do "because
an object goes out of scope", except maybe free memory.  This doesn't mean
that the C++ model is flawed, only that it is using a particular mechanism
to trigger so-called "finalization".  To take your example, there is
certainly no implicit relationship between locking and scope, and one
could probably find examples where it would be logical to acquire and
release a lock in different scopes.  In practice, given the tenets of
structured programming (a la Dijkstra), however, it is usually fairly
natural to associate the acquisition/freeing of the lock with a scope
(perhaps artificially introduced by adding extra braces), and you can
still use new/delete for the cases where it doesn't fit.
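
A small sketch of both cases. The `Resource` class is a toy in which a flag stands in for a real mutex (so nothing here is thread safe), and `begin_transaction`/`end_transaction` are invented names for the acquire-here, release-there situation.

```cpp
#include <cassert>

// Toy lockable: a flag stands in for a real mutex (no thread safety here).
class Resource {
    bool locked_;
public:
    Resource() : locked_(false) {}
    void acquire() { assert(!locked_); locked_ = true; }
    void release() { locked_ = false; }
    bool locked() const { return locked_; }
};

// The lock is held exactly as long as the Lock object exists.
class Lock {
    Resource& r_;
    Lock(const Lock&);              // not copyable
    Lock& operator=(const Lock&);
public:
    explicit Lock(Resource& r) : r_(r) { r_.acquire(); }
    ~Lock() { r_.release(); }
};

// Usual case: braces added only to bound the period the lock is held.
bool held_inside_block(Resource& r) {
    bool held;
    {
        Lock hold(r);
        held = r.locked();
    }   // released here, also when the block is left by an exception
    return held && !r.locked();
}

// Less usual case: acquire in one scope, release in another, via new/delete.
Lock* begin_transaction(Resource& r) { return new Lock(r); }
void end_transaction(Lock* l) { delete l; }
```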

The second *slight* advantage of the finally clause is that it makes
the exact point of finalization visible -- in the typical C++ idiom,
a closing brace can trigger an awful lot.  Using finally, anyone reading
the program will see exactly where the lock is freed, without having
to count nested braces, etc.  All I can say to this is that I've used
the C++ idiom extensively for some years, and I've never seen this
particular aspect cause problems.  While I find it easy to explain why
the Java idiom should be superior, and why the C++ idiom might cause
problems, my practical experience suggests that it really doesn't make
a whole lot of difference.  When writing C++, I don't feel the lack of
a finally clause, and when writing Java, I don't feel the lack of
destructors being called at a deterministic point in time.  Of course,
when writing Java and C++, I don't write the same programs.  They
are different languages, and require a different style of programming.

--
James Kanze    +33 (0)1 39 23 84 71    mailto: kanze@gabi-soft.fr
        +49 (0)69 66 45 33 10    mailto: jkanze@otelo.ibmmail.com
GABI Software, 22 rue Jacques-Lemercier, 78000 Versailles, France
Conseils en informatique orientée objet --
              -- Beratung in objektorientierter Datenverarbeitung


Author: Valentin Bonnard <bonnardv@pratique.fr>
Date: 1998/07/30 Raw View

[ Followups to comp.lang.java ]

Alexandre Oliva <oliva@dcc.unicamp.br> writes:

> kanze  <kanze@my-dejanews.com> writes:

> > And I don't see any practical use for the finalize method of Java.
>
> If only it were ensured to run at application termination, it could be
> used to save the state of persistent objects...

But then every object destroyed is saved... that doesn't
make sense to me.

--

Valentin Bonnard                mailto:bonnardv@pratique.fr
info about C++/a propos du C++: http://pages.pratique.fr/~bonnardv/

Author: kanze@my-dejanews.com
Date: 1998/07/30 Raw View

In article <haberg-2907981617400001@sl103.modempool.kth.se>,
  haberg@REMOVE.matematik.su.se (Hans Aberg) wrote:

>   (I recall that Steve Strassmann, the fellow who wrote the language
> Dylan, told me he felt it is extremely difficult to implement a
> conservative GC using C++. A computer scientist working with implementing
> Haskell all day said to me that the question frightened him. So I do not
> think it is so that one can take an armchair approach to the question
> saying "Sure, C++ has the capacity, just work a little harder with the
> templates".)

It's easier than that.  Conservative garbage collection is available
for C++ as a third party add-in, from someone called Geodesics, I think,
or you can pick up the Boehm collector for free off the net.

--
James Kanze    +33 (0)1 39 23 84 71    mailto: kanze@gabi-soft.fr
        +49 (0)69 66 45 33 10    mailto: jkanze@otelo.ibmmail.com
GABI Software, 22 rue Jacques-Lemercier, 78000 Versailles, France
Conseils en informatique orientée objet --
              -- Beratung in objektorientierter Datenverarbeitung


Author: kanze@my-dejanews.com
Date: 1998/07/30 Raw View

In article <35bf3424.174918783@news.motu.com>,
  abrahams@motu.com wrote:
> On 29 Jul 98 11:51:15 GMT, kanze@my-dejanews.com wrote:

> >In the case of a file descriptor, neither do
> >destructors; only the finally clause is really appropriate.  (On all
> >of the systems I've used, "freeing" a file descriptor involves closing
> >the file, an operation that can fail.  Which means that you must have
> >some way of testing the error and handling it.)
>
> As far as I remember, finally doesn't give you any special tools for
> dealing with this problem. Have I forgotten something? I was under the
> impression that finally and destructor cleanup were semantically
> equivalent.

They're very close.  The only difference is that in a finally clause, you
can call a function and evaluate its return value.  Thus, things like
close, which might fail, can be functions with return values, and still
be called from finally.

It doesn't really help as much as one would like, because there is still
the fundamental problem: you have two errors, and the error reporting
mechanism (exceptions) only supports propagating one.

--
James Kanze    +33 (0)1 39 23 84 71    mailto: kanze@gabi-soft.fr
        +49 (0)69 66 45 33 10    mailto: jkanze@otelo.ibmmail.com
GABI Software, 22 rue Jacques-Lemercier, 78000 Versailles, France
Conseils en informatique orientée objet --
              -- Beratung in objektorientierter Datenverarbeitung


Author: ncm@nospam.cantrip.org (Nathan Myers)
Date: 1998/07/30 Raw View

Alexandre Oliva <oliva@dcc.unicamp.br> wrote:
>kanze  <kanze@my-dejanews.com> writes:
>> And I don't see any practical use for the finalize method of Java.

Neither do I, as it's defined.  (A finalizer is run by any thread, at
an indeterminate time, unrelated to meaningful program events; and
perhaps never.)

>If only it were ensured to run at application termination, it could be
>used to save the state of persistent objects...

Or if it had other hooks that could be related to program execution
events, it could be used for other things.  Recall that I mentioned
it in the context of languages that have GC but offer poor facilities
for management of resources other than memory.

If you use GC to avoid deleting objects, those objects had better not
have meaningful destructors.  The meaningful destructor is precisely
the tool that C++ offers to manage resources, both memory and "other".
You cannot garbage-collect objects that represent resources other than
memory.  This fact breaks composition and encapsulation, for garbage-
collection: you cannot compose an object of subobjects that might
manage a resource other than memory.  To use a subobject, you must
know details of its implementation, which breaks encapsulation.

A GC formalism which could accommodate other resources would not
need to break composition and encapsulation.

--
Nathan Myers
ncm@nospam.cantrip.org  http://www.cantrip.org/

Author: ncm@nospam.cantrip.org (Nathan Myers)
Date: 1998/07/30 Raw View

Fergus Henderson<fjh@cs.mu.OZ.AU> wrote:
>ncm@nospam.cantrip.org (Nathan Myers) writes:
>>Second, a "contract" style of programming
>>eliminates memory leaks and complexity in managing resources.
>
>I don't think it eliminates the complexity.  It just enables you
>to manage that complexity better.

OK.

>Consider the recent thread about the lifetime of the value
>returned by std::exception::what().  The standard is certainly
>supposed to specify a contract between the implementors and the
>users, but it's very easy to forget to document the intended
>lifetime of every piece of data.  "contract" style programming
>is no panacea either.

The example isn't a good one: the question was whether one could
assume the lifetime of the pointer returned was supposed to be
longer than the standard says; and the answer was no.

It is true that contracts can become arbitrarily complicated
and correspondingly difficult to check.  There are no panaceas.

>Garbage collection doesn't claim to solve all problems.
>But for the problems that it does address, it addresses
>them in a manner that requires less discipline (i.e. effort)
>on the part of the programmer than programming by contract.

Where GC works, it works.  The problem is that adding use of a
non-memory resource to a GC program can make the structure of the
entire program no longer appropriate.  This cannot happen in a
program written in contract style.  Of course arbitrarily large
changes in requirements can make any program architecture
inappropriate, but we like for small changes to have small
consequences.

Global GC may seduce you into unwise decisions early in a
project.  GC that can be applied independently to individual
data structures is safer that way.

--
Nathan Myers
ncm@nospam.cantrip.org  http://www.cantrip.org/

Author: AllanW@my-dejanews.com
Date: 1998/07/30 Raw View

I've got it!  I know what is needed here!

First, we define an object f, which I'll say is of class F.  Much later
we want to define x, of class X.  We want to be able to implement
functionality of the form
    f(x)
long after the definition of F is complete, without changing F or
recompiling everything that uses it.

The answer is to use a base class for X, not for F.

    struct F; // Forward declaration
    struct X_base {
        virtual void F_of_this(F&);
    };
    struct F {
        // ...
        void operator()(X_base&);
        // ...
    };
    void F::operator()(X_base& x) {
        x.F_of_this(*this);   // dispatches on the dynamic type of x
    }

    ////// Much, much later
    struct X : public X_base {
        virtual void F_of_this(F&);
    };
    void X::F_of_this(F& f) {
        // Handles case where program has called f(x) where this==&x
        // ...
    }

Now you can define as many x's as you like.  So long as they are
derived from X_base, your programs will be able to call f(x)
without recompiling f.


Author: AllanW@my-dejanews.com
Date: 1998/07/30 Raw View

In article <haberg-2807981912460001@sl53.modempool.kth.se>,
  haberg@REMOVE.matematik.su.se (Hans Aberg) wrote:
>   Here is a much simpler question:
>
>   In the code
>
>     class B {
>     public:
>         virtual int f();
>     };
>
>     class D : public B {
>     public:
>         int f();
>     };
>
>     typedef int (B::*Bf)();
>
> If I write
>     Bf g = &B::f;
>     B* bp = new D();
> is
>     (bp->*g)();
> guaranteed in C++ to compute using D::f, or is the result (according to
> the standard) undefined?

Since f is a virtual function, the code above is guaranteed in C++
to compute using D::f.  Can you use this to solve the problem you've
been having in your calculator-type program?
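
A minimal sketch of the case in question, for anyone who wants to try it.
The function bodies (returning 1 and 2), the virtual destructor and the
helper call_through are assumptions added for illustration; the quoted
code only declares B, D and Bf.

```cpp
#include <cassert>

class B {
public:
    virtual ~B() {}
    virtual int f() { return 1; }   // base version (body assumed)
};

class D : public B {
public:
    int f() { return 2; }           // overrides B::f (body assumed)
};

typedef int (B::*Bf)();

// Hypothetical helper: calls the member designated by g on *bp.
// When g designates a virtual function, the final overrider for the
// dynamic type of *bp is the one that runs.
int call_through(B* bp, Bf g) {
    assert(bp != 0);
    return (bp->*g)();
}
```

With "D d; B b;", call_through(&d, &B::f) yields 2 and
call_through(&b, &B::f) yields 1, although both calls pass the same
pointer-to-member value.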


Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/30 Raw View

In article <6pnqv0$9q0$1@nnrp1.dejanews.com>, kanze@my-dejanews.com wrote:
>It's easier than that.  Conservative garbage collection is available
>for C++ as a third party add-in, from someone called Geodesics, I think,
>or you can pick up the Boehm collector for free off the net.

  You probably did not read the articles in this thread, because we
already discussed the limitations of Hans Boehm's approach (his stuff is
posted at <http://reality.sgi.com/boehm/gc.html>):

  First, he goes underneath the C++, into the OS, and then makes a search
for potential roots in all likely places. So it is not a C++ library in
the sense that it is built on top of C++, and it is not as fast as one
could be with a more efficient approach (like when C++ has support for
GC's), and it is not something that most people would even try to write.
So the approach does not lend itself to people custom-tailoring a GC for
their application.

  So this is an interesting approach, but it ain't a solution.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/30 Raw View

In article <6pnthf$e60$1@shell7.ba.best.com>, ncm@nospam.cantrip.org
(Nathan Myers) wrote:
>If you use GC to avoid deleting objects, those objects had better not
>have meaningful destructors.  The meaningful destructor is precisely
>the tool that C++ offers to manage resources, both memory and "other".
>You cannot garbage-collect objects that represent resources other than
>memory.  This fact breaks composition and encapsulation, for garbage-
>collection: you cannot compose an object of subobjects that might
>manage a resource other than memory.  To use a subobject, you must
>know details of its implementation, which breaks encapsulation.

  This is in fact an interesting point: Even with a GC, some data must
have a destructor. (In C++, ~T() must be applied even if "operator delete"
does nothing).

  For example, suppose the heap has an object pointing to an open file:
When that memory is freed because there are no active pointers to it
anymore, if somebody has not bothered closing the file at that point, the
GC must close it.

  However, this can be achieved in the model I presented, a class Data
with pointers to DataRef handles with a pointer to object of a class Base
hierarchy: If say the DataRef handles are on two linked lists, those in
use and those not in use, one just lets the destructors T::~T(), where T :
Base, be invoked whenever a handle is taken off the list of handles in
use.

  One problem then, is that one does not know exactly when this is going
to happen, but that problem exists already within C++, with the temporary
objects.
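
A rough sketch of the two-list handle scheme described above, under heavy
assumptions: the names HandlePool, attach, sweep and the "marked" flag are
hypothetical, the mark phase that would set the flag is not shown, and
File only simulates an open file with a bool. It is meant to show where
T::~T() runs, not how a real collector would be built.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

struct Base {
    virtual ~Base() {}
};

// Stand-in for an object owning a non-memory resource (an open file).
struct File : Base {
    bool* open_flag;
    explicit File(bool* flag) : open_flag(flag) { *open_flag = true; }
    ~File() { *open_flag = false; }   // "closes" the file
};

// A handle: the one place that points at the heap object.
struct DataRef {
    Base* object;
    bool  marked;   // would be set by a mark phase while still reachable
    DataRef() : object(0), marked(false) {}
};

class HandlePool {
    std::list<DataRef> in_use_;   // handles currently in use
    std::list<DataRef> free_;     // handles available for reuse
    HandlePool(const HandlePool&);             // not copyable
    HandlePool& operator=(const HandlePool&);
public:
    HandlePool() {}
    ~HandlePool() {
        for (std::list<DataRef>::iterator it = in_use_.begin();
             it != in_use_.end(); ++it)
            delete it->object;
    }
    // Takes ownership of obj and returns a handle on the in-use list.
    DataRef* attach(Base* obj) {
        assert(obj != 0);
        if (free_.empty()) free_.push_back(DataRef());
        in_use_.splice(in_use_.end(), free_, free_.begin());
        in_use_.back().object = obj;
        in_use_.back().marked = false;
        return &in_use_.back();
    }
    // Moves every unmarked handle to the free list, running the
    // destructor of its object first; clears the marks of the rest.
    std::size_t sweep() {
        std::size_t reclaimed = 0;
        std::list<DataRef>::iterator it = in_use_.begin();
        while (it != in_use_.end()) {
            std::list<DataRef>::iterator next = it;
            ++next;
            if (it->marked) {
                it->marked = false;
            } else {
                delete it->object;   // T::~T() runs here
                it->object = 0;
                free_.splice(free_.end(), in_use_, it);
                ++reclaimed;
            }
            it = next;
        }
        return reclaimed;
    }
};
```

The file is closed at the sweep that finds its handle unmarked, which is
exactly the "one does not know when" property mentioned above.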

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/30 Raw View

In article <6pohsm$aal$1@nnrp1.dejanews.com>, AllanW@my-dejanews.com wrote:
>First, we define an object f, which I'll say is of class F.  Much later
>we want to define x, of class X.  We want to be able to implement
>functionality of the form
>    f(x)
>long after the definition of F is complete, without changing F or
>recompiling everything that uses it.

  My situation is, though, more general than just this: Object x does not
know anything about objects of type F, and f does not know anything about
objects of type X. Any attempt to link them together would impose
severe restrictions, not solving the problem.

  So the case is that f sends a request to x, and x returns an answer (in
my model represented by a virtual function), normally with no knowledge of
what type f has, or what f can do. Then, with this answer in hand, f
starts to think about it. If f got an answer it does not recognize, then
it should take some kind of action (otherwise the code will break): I
suggested that it should execute f.general(x).

  (One way to think of this is as of two different computers on the
Internet: Then x just sends away an answer say "float" without knowing if
f really has implemented floats. If f does not know about floats, it must
do something: Perhaps f knows how to convert a float into a double and can
carry out the computations that way, or it does not and must throw an
exception. But this is something that x does not know or would want to
know anything about.)
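
One way to picture this request/answer protocol in today's C++ is the
sketch below. It uses dynamic_cast as a stand-in for the mechanism under
discussion, which is not what the proposal asks for; every name in it
(Answer, Object, Number, Label, Doubler, reply) is hypothetical, and
"general" simply throws.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Answers that an x may send back; new kinds can be added at any time.
struct Answer { virtual ~Answer() {} };
struct DoubleAnswer : Answer {
    double value;
    explicit DoubleAnswer(double v) : value(v) {}
};
struct TextAnswer : Answer {
    std::string text;
    explicit TextAnswer(const std::string& t) : text(t) {}
};

// The x side: replies without knowing who is asking.
struct Object {
    virtual ~Object() {}
    virtual const Answer& reply() const = 0;
};
struct Number : Object {
    DoubleAnswer answer;
    explicit Number(double v) : answer(v) {}
    const Answer& reply() const { return answer; }
};
struct Label : Object {
    TextAnswer answer;
    explicit Label(const std::string& t) : answer(t) {}
    const Answer& reply() const { return answer; }
};

// The f side: understands DoubleAnswer only, and owns the fallback.
struct Doubler {
    double operator()(const Object& x) const {
        const Answer& a = x.reply();
        if (const DoubleAnswer* d = dynamic_cast<const DoubleAnswer*>(&a))
            return 2.0 * d->value;
        return general(x);          // answer not recognized by f
    }
    double general(const Object&) const {
        throw std::runtime_error("Doubler: unrecognized answer");
    }
};

// Usage sketch.
void example() {
    Doubler f;
    assert(f(Number(21.0)) == 42.0);
    bool threw = false;
    try { f(Label("hi")); } catch (const std::runtime_error&) { threw = true; }
    assert(threw);
}
```

Neither Number nor Label mentions Doubler, and the handling of an
unknown answer lives entirely in f, which is the independence asked for
above; the cost is a run-time type test on every call.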

>The answer is to use a base class for X, not for F.
>
>    struct F; // Forward
>    struct X_base {
>        virtual void F_of_this(F&);
>    };
>    struct F {
>        // ...
>        void operator()(X_base&);
>        // ...
>    };
>    void F::operator()(X_base&x) {
>        x.F_of_this(*this);
>    }
>
>    ////// Much, much later
>    struct X : public X_base {
>        virtual void F_of_this(F&);
>    };
>    void X::F_of_this(F&f) {
>        // Handles case where program has called f(x) where this==&x
>        // ...
>    }
>
>Now you can define as many x's as you like.  So long as they are
>derived from X_base, your programs will be able to call f(x)
>without recompiling f.

  So the problem is more complicated: First define Base. Then much later
define F. Then, a fellow who has absolutely no idea that F even exists
independently defines X.

  There is a point to this: When defining X, there are numerous classes F
with elementary behavior (like the function composite class) which the
writer of X would not want to know about.

  In addition, the exceptional behavior of f(x) when f does not recognize
the answer that x provides must be handled by f (or a structure associated
with f alone and not with x) from the point of view of object orientation:
Otherwise I start creating dependencies between objects that are logically
independent. (When writing static and global structures, it is very common
to allow such global dependencies between logically independent
components, but in a dynamic model, it does not work well.)

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/25 Raw View

In article <6pb4nm$244$1@nnrp1.dejanews.com>, AllanW@my-dejanews.com wrote:
>What I'm getting from this and other posts, is that you are writing
>a compiler or interpreter for some other language.  Which is a great
>idea; C++ is an ideal choice for this type of project.

  Right. Except that I design runtime objects and onto that, I hook
language-like parsing in order to access the runtime features. So the
structure of the program is not known in advance, and will never be: that
is the problem caused by C++.

>However, if that's correct, you have to bear in mind the difference
>between your language-processor's run time environment and that of
>the program being processed.

  So this is the point I try to make all the time, which readers of this
thread constantly seem to mix up: I want C++ to be a great language for
implementing such languages, interpreters and dynamic runtime objects, but
that is not the case now. This is what the discussion is all about,
improving C++ in this respect: Making it easier to do the job, simply.

>Let's imagine a hypothetical language interpreter which implements a
>language similar (on some level) to C++.  For the sake of discussion,
>let's call the language INTERP.  Your INTERP interpreter is itself
>a computer program, of course, and let's assume that it's written in
>C++.  Your user will use your program, INTERP, to execute an INTERP-
>language program named TEST.  This TEST program contains 50 classes,
>many of which have virtual functions in a complex inheritance
>hierarchy.  You, as the author of C++I, have to write code that
>implements the v-tables needed by this inheritance hierarchy.
>Is that pretty close to the way things are?

  In my situation, the answer is no, because I am not working on a
computer language whose structure is known in advance. In my case, the
names in the lookup table do nothing but call certain constructor
classes by a suitable syntax. So forget about the lookup table, it is
irrelevant. The constructor class however must be implemented in C++.

  So suppose I want to add a C++ compiler to my program called say "gcc".
Then I would need to add a "class gcc : public Base" to my program. I then
may need to add a virtual function "X::gcc" for some classes "X : public
Base". If a class "Y : public Base" does not have the Y::gcc but is asked to
execute it, then it should execute Y::general instead.

  So from that point of view, there is no need putting X::gcc in the Base
class as a Base::gcc, which has the unwanted side effect that all classes
derived from Base (which in effect are all classes essential to the
runtime structure of the program) must be recompiled. But the way C++ is
now, one must add Base::gcc and recompile it.

  But if this was not forced, I could add my gcc later, and at runtime,
only those classes bothering about gcc in a specifically new way would
need to be recompiled.

  The situation is similar to the one before C++ added RTTI: Then the Base
class would use
   class Base {
       enum Type { String, Double, ... };
       static Type type;
   };
Then the number "type" would be set for the derived classes String, Double
and so on. But if a new class "gcc" is added, one would have to add that to
the list in the Base class, and then recompile everything. So that way
both unnecessary compile time is wasted (relative to the use of RTTI), and it
is impossible to put the Base class in a library if the idea is that users
should be able to add their own classes later on.
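
For comparison, a small sketch of the RTTI side of that contrast. The
classes String, Double and Gcc follow the names used above; is_exactly
and example are hypothetical helpers. The point is only that Base carries
no list of derived types, so adding Gcc does not touch it.

```cpp
#include <cassert>
#include <typeinfo>

// Polymorphic base, so typeid reports the dynamic type of an object.
struct Base { virtual ~Base() {} };
struct String : Base {};
struct Double : Base {};

// Added later, possibly by a library user; Base is not edited.
struct Gcc : Base {};

// Hypothetical helper: true when b's dynamic type is exactly T.
template <class T>
bool is_exactly(const Base& b) {
    return typeid(b) == typeid(T);
}

// Usage sketch.
void example() {
    Gcc g;
    String s;
    assert(is_exactly<Gcc>(g));
    assert(!is_exactly<String>(g));
    assert(is_exactly<String>(s));
}
```

With the enum version, the same extension would have required a new
enumerator in Base and a recompilation of every class derived from it.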

>If so, you have to draw a distinction between virtual functions in
>the INTERP program, and virtual functions in TEST.

  It is possible to make the lookup table and the parsing classes derived
from class Base, too, in which case there is no such distinction. For example,
one implements a simple bootstrap syntax which enables one to read the
language, and then one hands over the control to the dynamically read
language to interpret the user program. This is in fact how many
interpreters of languages such as Scheme, SML and Haskell work: It
corresponds to adding a "class Class : public Base" which can be used to
add dynamic language classes.

  So from this point of view, there is no clear distinction between your
INTERP and TEST programs.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/25 Raw View

In article <6pb0hd$t5u$1@nnrp1.dejanews.com>, AllanW@my-dejanews.com wrote:
>From a light reading of that post, I believe it is a problem I am
>familiar with, although the name is new to me.  Basically, the
>problem stems from separate compilation.  C++'s efficiency comes
>from it's knowledge of details about the class, especially the
>size of the object and the size of the virtual table.  But this
>also means that when details about the class change, even if they
>are all private, all the users of the class must also be
>recompiled.  In the case where one class is used across a great
>number of projects, this can be very difficult or even impossible.

  So this is the problem an expert on writing C++ compilers would have to
think about: whether there is an efficient implementation of the feature.

  If there is an efficient implementation of the feature, then it is much
better to add it to the C++ compiler than to have people write workarounds
themselves, because it is going to be much faster.

  There are variations one could think of, too: One could be allowed to write
    class B {
        virtual T f();
    };

    class D : public B {
        virtual T g() { ... }
        T f() -> g;            // Funny command explained below.
    };
Then, at execution time, a call to D::f does not execute it at all but
replaces it directly with g.

  The reason one would add such a feature to C++ is efficiency: A
function call is slower than a direct replacement, and the former
significantly slows down the program (as these are intermediate functions
called every time a simple evaluation takes place).
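
For reference, the closest thing one can write in standard C++ is an
ordinary forwarding function, sketched here. "T f() -> g;" is not valid
C++; the typedef of T to int, the value 7, the pure virtual B::f and the
helpers call_f and example are assumptions made so the fragment compiles.

```cpp
#include <cassert>

typedef int T;   // assumption: the post leaves T unspecified

class B {
public:
    virtual ~B() {}
    virtual T f() = 0;
};

class D : public B {
public:
    virtual T g() { return 7; }
    // Plain forwarding: stands in for the proposed "T f() -> g;".
    // A call through B costs one virtual call to f plus one to g.
    T f() { return g(); }
};

// Hypothetical helper: calls f through the base interface.
T call_f(B& b) { return b.f(); }

// Usage sketch.
void example() {
    D d;
    assert(call_f(d) == 7);
}
```

The proposed syntax would presumably let the compiler place g's entry
directly in f's vtable slot, removing the intermediate call; whether an
optimizer already achieves that for the forwarding version depends on
the implementation.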

  So adding such features, which are static within the language C++, would
give implementors an improved chance to develop dynamic features that
compete in speed with static features.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/25 Raw View

In article <35B91738.A72@noSPAM.central.beasys.com>,
dtribble@technologist.com wrote:
>Stroustrup discusses "double dispatch", also known as
>"multi-methods", in "The Design and Evolution of C++", sect. 13.8.
>In particular, he discusses why he thought it was too complicated
>to add to C++, mainly in dealing with the efficiency of choosing the
>correct function to call and the exponentially-sized function tables
>(vtbl) apparently required.  In essence, it boils down into
>deciding which member function of the form 'intersect(Sh &a, Sh &b)'
>to choose when 'a' and 'b' are types derived from base class 'Sh',
>out of all possible combinations of 'a' and 'b' types.

  This is also a point that I tried to bring out: The reason that dynamic
language features are not part of C++ is not that "C++ should be a static
language", but that those dynamic language features constitute a
technology under development: If there is an efficient, well known way to
implement it, it can be a part of C++, otherwise not.

  With respect to double dispatch, the advice of those implementing
languages like CLOS, and Cecil, is: do not use it! It is too complicated
and too slow.

  So, the discussion here deals with a much simpler thing, namely how to
be able to use virtual function pointers instead of function calls.

  I think the relation between multimethods (as discussed by Stroustrup in
the citation above) and double dispatch is this: Double dispatch is a way
of implementing a language feature called multimethods; the latter is a
way of automatically selecting the right virtual function pointer based on
typing. So one then has to write down what kind of dynamic typing one wants
to have in that language.

  So this is not what I have discussed here in this thread: I have given
examples of how double dispatch could be used to implement such a
feature. (The double dispatch technique is also mentioned in Stroustrup,
loc. cit., 13.8.1.)
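
For readers who have not seen it, here is a compact sketch of the
classic double dispatch technique applied to the intersect(Sh&, Sh&)
example quoted above. Circle, Square, intersect_with and the returned
strings are hypothetical; this is the textbook form, not code from this
thread.

```cpp
#include <cassert>
#include <string>

struct Circle;
struct Square;

struct Sh {
    virtual ~Sh() {}
    // First dispatch: selects on the dynamic type of *this.
    virtual std::string intersect(const Sh& other) const = 0;
    // Second dispatch: selects on the dynamic type of the other operand;
    // the argument is the first operand, now with its static type known.
    virtual std::string intersect_with(const Circle& first) const = 0;
    virtual std::string intersect_with(const Square& first) const = 0;
};

struct Circle : Sh {
    std::string intersect(const Sh& other) const {
        return other.intersect_with(*this);
    }
    std::string intersect_with(const Circle&) const { return "circle/circle"; }
    std::string intersect_with(const Square&) const { return "square/circle"; }
};

struct Square : Sh {
    std::string intersect(const Sh& other) const {
        return other.intersect_with(*this);
    }
    std::string intersect_with(const Circle&) const { return "circle/square"; }
    std::string intersect_with(const Square&) const { return "square/square"; }
};

// Usage sketch: both operands are seen only as Sh.
void example() {
    Circle c;
    Square s;
    const Sh& a = c;
    const Sh& b = s;
    assert(a.intersect(b) == "circle/square");
    assert(b.intersect(a) == "square/circle");
}
```

Note that adding a third shape means adding another intersect_with
overload to Sh itself, so every derived class must be edited and
recompiled; with n shapes there are n*n combinations to write.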

  But in my own programming, I do things that are considerably more subtle
than multimethods: Every new class does not require a new virtual function
pointer. In fact, if one is clever about it, often several classes can
share the same virtual function. In addition, the suggestion I proposed
would help to avoid vtbl's becoming large when not needed.

  And the suggestion is not strictly about double dispatch, as we hand
around virtual function pointers instead of dealing with extra function
calls: This is in fact much easier to work with in actual programming.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/25 Raw View

  I post this example, just to show that working with dynamic features
does not contradict the C++ static interface:

  So if I want to read data from the console, my C++ code looks like:
    Data da;
    Parser parser;
    try {
        cin >> parser >> da;   // "parser" manipulator for reading data "da"
    } catch (Exception& ex) {
        ...                    // Exception code: Parsing error.
    }

  So instead of reading simple things, like doubles, strings and the like,
one can read expressions, lambda formulas and the like. One could easily
switch parsers. The parser could be simple, but could be complicated, like
a whole language; the one writing the code need not know. In the latter
case, one could read whole programs.

  Data could be printed similarly:
    Printer printer;
    cout << printer << da << endl;  // "printer" manipulator for printing "da"
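
How "cin >> parser >> da" might be wired up is not shown in the post, so
the following is only one possible sketch. Data and Parser are given toy
definitions here (a single double, a one-number "grammar"), and
ParsingStream and read_one are hypothetical names: "stream >> parser"
returns a small proxy that remembers the parser, and the proxy's own
operator>> does the reading.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Toy stand-in for the post's Data type.
struct Data {
    double value;
    Data() : value(0) {}
};

// Toy parser interface; a real one could implement a whole language.
struct Parser {
    virtual ~Parser() {}
    virtual void parse(std::istream& in, Data& out) const {
        in >> out.value;   // trivial "grammar": a single number
    }
};

// Proxy produced by "stream >> parser"; remembers which parser to use.
struct ParsingStream {
    std::istream& in;
    const Parser& parser;
    ParsingStream(std::istream& i, const Parser& p) : in(i), parser(p) {}
};

inline ParsingStream operator>>(std::istream& in, const Parser& p) {
    return ParsingStream(in, p);
}

// Reads one Data item with the remembered parser, then hands back the
// plain stream so that ordinary extraction can continue.
inline std::istream& operator>>(ParsingStream ps, Data& d) {
    ps.parser.parse(ps.in, d);
    return ps.in;
}

// Usage sketch: reads one Data item from a string.
Data read_one(const std::string& text) {
    std::istringstream in(text);
    Parser parser;
    Data da;
    in >> parser >> da;
    assert(!in.fail());   // this toy version treats a failed read as a bug
    return da;
}
```

Switching parsers then means passing an object of a different class
derived from Parser; the calling code stays the same. A printer
manipulator for output could be built the same way around std::ostream.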

  So I do not understand this fear that implementing dynamic features in C++
would contradict C++'s static interface.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>



Author: David Wragg <dpw@doc.ic.ac.uk>
Date: 1998/07/26 Raw View

haberg@REMOVE.matematik.su.se (Hans Aberg) writes:
> In article <jasojsb21g.fsf@gatsby.u-net.com>, David Wragg
> <davew@gatsby.u-net.com> wrote:
> >This is not correct. The C++ language is perfectly amenable to the
> >same implementation technique that Java uses (virtual function name to
> >vtable offset resolution at link time or run-time, rather than compile
> >time). There is no "C++ runtime mechanism". Problems you have with
> >current implementations are just that (even if almost all widespread
> >implementations happen to display this flaw).
>
>   In Java, one can also access the dynamic linking names from your
> program; it is called reflections. See
>   http://www.javasoft.com/products/jdk/1.1/docs/guide/reflection/index.html
> This could be used if everything fails, and this you cannot do with C++.

Well, this could be added to C++ without great difficulty. Whether it
should be added is another matter. It isn't necessary to solve the
"FBC" problem.

> >Also, the term "fragile base class problem" is used to refer to two
> >very different things (I recently posted on this subject in another
> >group - see http://x13.dejanews.com/getdoc.xp?AN=373321668).
>
>   The "fragile base class problem" is, I think, the problem that all
> derived classes must be recompiled even if it is unnecessary for the
> changes done to the base class.

I know, I'm not saying your use of the term was in any way ambiguous
or wrong - there is plenty of precedent for using the term in exactly
the way you did. I was just pointing out that the term is also used
(by others) to refer to something different. Since I have yet to find
the first published use of the term, I don't claim that either meaning
is more correct.

> I said that the question here is related to that question, because
> here also the base class must get another virtual function if a
> derived class adds it, and then all derived classes must be
> recompiled.

For typical implementation strategies you are right. However, the
relevant part of the C++ standard is the One Definition Rule
(basic.odr) which (caveat: I haven't seen the FDIS) says a class can
have definitions in multiple translation units, but (to all intents
and purposes) those definitions must be the same. So for instance if
you change a class definition in a header file you have to recompile
everything including that header file.

Of course, implementations may be more lenient. In particular, if the
ODR is followed strictly it causes the "FBC" problem.

In the case of static linking, the best solution is development
environments that automatically do appropriate recompilations when
source code changes are made. How few recompilations they do is a
quality-of-implementation issue (obviously they should always do
enough recompilations to be a conforming implementation under all
circumstances).

The case of dynamic linking is trickier, but dynamic linking is far
outside the scope of the standard anyway, so it seems to me that the
best solution is to lobby vendors to create implementations which deal
intelligently with dynamic linking. The fact that this has been done
by some vendors in the past surely means that others can do it in the
future (though hopefully they will bring their compilers into line
with the standard first).

> So in working around this problem in how the compiler implements it,
> one may need to think about how to work around the fragile base
> class problem in this instance.

Yes, with present implementations workarounds are needed. But since it
is present implementations that are flawed, rather than the standard,
it is the implementations that need fixing.

A workaround I have used (and one that works completely within the
standard) is to use "interface" classes that only contain virtual
functions, and create "implementation" classes which inherit from
these. As long as you make sure the definitions of the "interface"
classes never change, they can be used everywhere. The definition of a
particular "implementation" class should only appear within one
DLL/shared library, so that if the definition changes only the sources
of that DLL/shared library need recompiling.
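
A minimal sketch of this workaround, under stated assumptions: the names
Shape, Square and make_square are hypothetical, and the factory function
is just one possible way to hand out implementation objects without
exposing their definition. Clients would see only the interface class and
the factory declaration.

```cpp
#include <memory>
#include <string>

// "Interface" class: only virtual functions, no data members.
// Assumption: this definition is frozen once published to clients.
class Shape {
public:
    virtual ~Shape() {}
    virtual double area() const = 0;
    virtual std::string name() const = 0;
};

// Factory declared next to the interface, defined inside the library.
std::unique_ptr<Shape> make_square(double side);

// ---- everything below would live in one DLL/shared library only ----
namespace {
class Square : public Shape {
public:
    explicit Square(double side) : side_(side) {}
    double area() const override { return side_ * side_; }
    std::string name() const override { return "square"; }
private:
    double side_;   // may change freely; clients never see this layout
};
}  // namespace

std::unique_ptr<Shape> make_square(double side) {
    return std::unique_ptr<Shape>(new Square(side));
}
```

If Square gains or loses members, only the library containing it has to
be rebuilt, because no client translation unit contains its definition.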

This technique is not directly comparable with your examples, but
seems to be widely applicable. Is it not suitable for your purposes?

--
Dave Wragg

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/26 Raw View

In article <haberg-2507981314240001@sl42.modempool.kth.se>, I wrote:
>  There are variations one could think of, too: One could be allowed to write
>    class B {
>        virtual T f();
>    };
>
>    class D : public B {
>        virtual T g() { ... }
>        T f() -> g;            // Funny command explained below.
>    };
>Then, at execution time, a call to D::f does not execute it at all but
>replaces it directly with g.

  In this context, I can mention that Simon Peyton Jones is developing
some similar kind of features in a portable assembly language called C--,
developed as an offspring for making implementations of the functional
computer language Haskell; see <http://www.dcs.gla.ac.uk/~simonpj>. This
language contains some interesting ideas, so it could be worth having a
look.

  I feel it would be good if C++ could be equipped with the features
needed for such implementations, so that people would not feel it
necessary to develop languages like C--.

  One problem with C and C++ alike is that it is notoriously hard to
implement an efficient GC (garbage collector), and without a GC (experts
say) one cannot expect fast dynamic implementations: The
traditional "new" and "malloc" are 50-100 times slower than dynamic memory
using an efficient GC. A conservative GC is needed, because "smart
pointers" with as little overhead as possible are also needed for fast
implementations of dynamic structures. (So this does not necessarily mean
that C++ must have a GC, but it should be relatively easy for others to
develop GCs which can be used in, say, libraries.)

  One could probably go on to mention some of the other features that
must eventually become part of C++: Dynamic linking, the ability to use
method names from within the program (like Java reflection), parallel
threads (more advanced than POSIX, so that the GC can run in a parallel
thread).

  So it is not the case that C++ is a great language for implementing
dynamic structures, as somebody said here: C++ has some serious
shortcomings in this respect. Adding such features, once efficient
implementations of them become well known, does not change the profile of
C++ as a statically typed language providing fast runtime implementations,
but will enhance it.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>

Author: ncm@nospam.cantrip.org (Nathan Myers)
Date: 1998/07/26 Raw View

Hans Aberg<haberg@REMOVE.matematik.su.se> wrote:
>
>traditional "new" and "malloc" are 50-100 times slower than dynamic
>memory using an efficient GC.

This is frequently repeated, and it accurately describes common
malloc implementations.  However, there is no reason malloc cannot
be implemented 50-100 times faster than is typical.  I have done it.

Garbage collection is no panacea.   First, new/delete can be made
comparably fast.  Second, a "contract" style of programming
eliminates memory leaks and complexity in managing resources.
Finally, and most importantly, the same style that manages memory
well manages other resources as well.

This last point is important because a program manages many
resources.  Memory has looser requirements than other resources
-- memory resources are freely interchangeable, and not scarce.
If your program manages those other resources properly, then
memory is just one more; but your program must not fail to manage
other resources just because it no longer manages memory.
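
The post does not spell out what the "contract" style looks like in code.
One common reading, offered here only as an assumption, is scope-bound
ownership: a resource is acquired in a constructor and released in the
matching destructor. The sketch below applies that to a scarce,
non-memory resource; SlotPool and Lease are hypothetical names invented
for the illustration.

```cpp
#include <stdexcept>

// Hypothetical scarce resource: a pool with a fixed number of slots.
class SlotPool {
public:
    explicit SlotPool(int slots) : free_(slots) {}
    int available() const { return free_; }
private:
    friend class Lease;
    int free_;
};

// A Lease owns one slot for exactly as long as it is in scope.
class Lease {
public:
    explicit Lease(SlotPool& pool) : pool_(pool) {
        if (pool_.free_ == 0) throw std::runtime_error("pool exhausted");
        --pool_.free_;                 // acquire in the constructor
    }
    ~Lease() { ++pool_.free_; }        // release in the destructor
    Lease(const Lease&) = delete;      // one owner per slot
    Lease& operator=(const Lease&) = delete;
private:
    SlotPool& pool_;
};

// Returns how many slots are free while one lease is held.
int free_while_leased(SlotPool& pool) {
    Lease lease(pool);
    return pool.available();
}   // the lease is released here, on every exit path
```

The same shape works for memory, files or locks, which is the sense in
which one style could cover several kinds of resource.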

Languages that offer garbage collection built-in sometimes fail
to provide the tools needed to manage the other resources properly,
out of the delusion that memory is the only resource that needs to
be managed, or that other resources have the same characteristics
as memory.  (Java's "finalization" may be an example of this.)

Before many of us accept garbage collection as the norm, we should
hold out for a formalism that can handle other resources and their
typically more-stringent requirements as well.  Then memory management
will be just one case among many, and the programming styles promised
to be possible given garbage collection may become practical.

In the meantime, garbage collection alone doesn't deliver on its
promises.

--
Nathan Myers
ncm@nospam.cantrip.org  http://www.cantrip.org/

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/26 Raw View

In article <ja3ebpj0kb.fsf@gatsby.u-net.com>, David Wragg
<dpw@doc.ic.ac.uk> wrote:
>>   In Java, one can also access the dynamic linking names from your
>> program; it is called reflections. See
>>   http://www.javasoft.com/products/jdk/1.1/docs/guide/reflection/index.html
>> This could be used if everything fails, and this you cannot do with C++.
>
>Well, this could be added to C++ without great difficulty. Whether it
>should be added is another matter. It isn't necessary to solve the
>"FBC" problem.

  I think it is a spin-off of solving the FBC problem and admitting
dynamic linking by adding the names to the object code: Then the natural
next step is to also let those names be accessed from within the running
program.

>> I said that the question here is related to that question, because
>> here also the base class must get another virtual function if a
>> derived class adds it, and then all derived classes must be
>> recompiled.
>
>For typical implementation strategies you are right. However, the
>relevant part of the C++ standard is the One Definition Rule
>(basic.odr) which (caveat: I haven't seen the FDIS) says a class can
>have definitions in multiple translation units, but (to all intents
>and purposes) those definitions must be the same. So for instance if
>you change a class definition in a header file you have to recompile
>everything including that header file.

  In the suggestion of the virtual function pointer dynamic cast, the idea
is that the one writing the program should be able to avoid altering the
base class, so that it would not have to be recompiled.

  But the second question is then if one can write compilers so that this
is possible, and how efficient such implementations are (that is how
fast).

>A workaround I have used (and one that works completely within the
>standard) is to use "interface" classes that only contain virtual
>functions, and create "implementation" classes which inherit from
>these. As long as you make sure the definitions of the "interface"
>classes never change, they can be used everywhere. The definition of a
>particular "implementation" class should only appear within one
>DLL/shared library, so that if the definition changes only the sources
>of that DLL/shared library need recompiling.

  One suggestion was posted in the thread "Re: Is name __vptr standard?"
by AllanW@my-dejanews.com: It consisted of putting the function pointers
in an STL "map", an associative array with a unique key on names. If you
have a more efficient solution, please post it.
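
A rough sketch of what such a name-keyed table might look like. This is
not the code from the other thread, which is not reproduced here; Data,
Method, MethodTable and the two sample functions are hypothetical names.

```cpp
#include <map>
#include <string>

struct Data { int value; };
typedef Data (*Method)(const Data&);

Data twice(const Data& d)   { Data r = { d.value * 2 }; return r; }
Data negated(const Data& d) { Data r = { -d.value };    return r; }

// Maps method names to function pointers; each name appears at most once.
class MethodTable {
public:
    void add(const std::string& name, Method m) { table_[name] = m; }

    // Returns a null pointer when no method of that name is registered.
    Method find(const std::string& name) const {
        std::map<std::string, Method>::const_iterator it = table_.find(name);
        return it == table_.end() ? nullptr : it->second;
    }
private:
    std::map<std::string, Method> table_;
};

MethodTable make_table() {
    MethodTable t;
    t.add("twice", &twice);
    t.add("negated", &negated);
    return t;
}
```

Each lookup costs a string comparison per tree level, which is the
overhead the request for a more efficient solution is about.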

>This technique is not directly comparable with your examples, but
>seems to be widely applicable. Is it not suitable for your purposes?

  The problem is that as soon as one starts with workarounds, it puts a
heavy burden on the code writing. For example, I used a variation where
every virtual function was implemented by two functions, one returning a
certain function pointer, and the other being the virtual function itself.
It quickly became hopeless to keep track of these two functions.

  So I replaced them with the version I use now, where the virtual
function can return either data or a virtual function pointer, and
from the programming point of view, this is considerably easier to work
with. In addition, if there are efficient low-level ways to implement it, I
think it would be much better for such features to be a part of C++ than
for people to keep looking for workarounds which waste programming,
compile and running time.

  The kind of suggestions you have would eventually lead to implementing a
class Class: However, this does not change the picture, because even with
a class Class, it would be good if one could add new C++ classes for
efficiency.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/26 Raw View

In article <6pepb3$9qg$1@shell7.ba.best.com>, ncm@nospam.cantrip.org
(Nathan Myers) wrote:

>.. there is no reason malloc cannot
>be implemented 50-100 times faster than is typical.  I have done it.

  It would be interesting to know what the "secret" is.

>Garbage collection is no panacea.   First, new/delete can be made
>comparably fast.  Second, a "contract" style of programming
>eliminates memory leaks and complexity in managing resources.
>Finally, and most importantly, the same style that manages memory
>well manages other resources as well.

  This is not the case if one works extensively with dynamic structures,
because the right thing is to work with references, and one needs a good
way to remove unused references. Ref counts are slow, and can create loops
which never are removed.

  Garbage collection is just another technique under development. The
thing that makes a conservative garbage collector so difficult to
implement in C++ is that there is no convenient way to extract the root
set (the set where all active pointers originate): Trying to create lists
of them creates too much overhead. Searching all possible places in the OS
(like Hans Boehm does) seems to be a slow procedure.

  In the setup I am working with, I know (within the C++ language) where
the root set is created which should be handled by a particular GC: If I
spoke about a class Data with a pointer to a class DataRef object, then it
is this pointer and nothing else. So I would want to be able to write my
Data something like
    class Data {
        root my_GC DataRef* dref;   // Mark for use with my_GC.
    };
Then C++ somehow keeps track of it so I can use my_GC to collect the
unused handles: If I only get hold of the root set, this could be done
relatively easily by using the copy constructors of C++.

  So C++ could in this way allow people to implement the GC they prefer.
Good GCs are hybrids of techniques (moving, generational, incremental,
parallel threads, fixed size), so it would be too early to let C++ come
with one, other than in libraries.

>Languages that offer garbage collection built-in sometimes fail
>to provide the tools needed to manage the other resources properly,
>out of the delusion that memory is the only resource that needs to
>be managed, or that other resources have the same characteristics
>as memory.  (Java's "finalization" may be an example of this.)

  So my suggestion would not change anything in C++ or its
programming style, except for facilitating the implementation of a GC. In
particular, the destructors would still exist, if one needs to write them.
You can also combine different memory models within the same program. This
is in fact important when implementing a GC: My DataRef handles would use
a different model than the data on the heap.

>Before many of us accept garbage collection as the norm, we should
>hold out for a formalism that can handle other resources and their
>typically more-stringent requirements as well.  Then memory management
>will be just one case among many, and the programming styles promised
>to be possible given garbage collection may become practical.

  So GC is not the norm in the model I presented: One puts it in at
the spots where it seems to be needed. In other places, one uses the old
"new".

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>

Author: ncm@nospam.cantrip.org (Nathan Myers)
Date: 1998/07/27 Raw View

Hans Aberg<haberg@REMOVE.matematik.su.se> wrote:
>(Nathan Myers) wrote:
>
>>.. there is no reason malloc cannot
>>be implemented 50-100 times faster than is typical.  I have done it.
>
>  It would be interesting to know what the "secret" is.

The short answer is "page tagging".  An early effort is described
at http://www.cantrip.org/wave12.html .  I have lately done better.

>>Garbage collection is no panacea.   First, new/delete can be made
>>comparably fast.  Second, a "contract" style of programming
>>eliminates memory leaks and complexity in managing resources.
>>Finally, and most importantly, the same style that manages memory
>>well manages other resources as well.
>
>  This is not the case if one works extensively with dynamic structures,
>because the right thing is to work with references, and one needs a good
>way to remove unused references. Ref counts are slow, and can create loops
>which never are removed.

Agreed, reference-counting is not always the best way to manage
shared resources.

>  Garbage collection is just another technique under development. The
>thing that makes a conservative garbage collector so difficult to
>implement in C++ is that there is no convenient way to extract the root
>set (the set where all active pointers originate): Trying to create lists
>of them creates too much overhead. Searching all possible places in the OS
>(like Hans Boehm does) seems to be a slow procedure.

True, retrofitting GC to a runtime environment that knows nothing
about it is difficult.  A compiler or library that knows when heap
pointers are being manipulated and offers hooks to operate on them
would make GC a much more practical proposition.  Note "or library".

>  In the setup I am working with, I know (within the C++ language) where
>the root set is created which should be handled by a particular GC: If I
>spoke about a class Data with a pointer to a class DataRef object, then it
>is this pointer and nothing else. So I would want to be able to write my
>Data something like
>    class Data {
>        root my_GC DataRef* dref;   // Mark for use with my_GC.
>    };
>Then C++ somehow keeps track of it so I can use my_GC to collect the
>unused handles: If I only get hold of the root set, this could be done
>relatively easily by using the copy constructors of C++.

This might be one way; but where did that DataRef* pointer value come
from?  Before proposing sweeping language changes, consider how your
goals can be approached using the mechanisms already supported by
the language.  Templates can be a big help, here.
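
As a rough illustration of the template route, here is one way the
"root my_GC DataRef*" declaration might be approximated in a library.
Everything in it is an assumption: root_set, RootBase and Root are
hypothetical names, and the registry only records which roots are alive;
it is not a collector.

```cpp
#include <set>

class RootBase;

// Hypothetical registry standing in for "my_GC": it tracks live roots only.
inline std::set<const RootBase*>& root_set() {
    static std::set<const RootBase*> roots;
    return roots;
}

class RootBase {
public:
    virtual const void* target() const = 0;   // object this root refers to
protected:
    RootBase() { root_set().insert(this); }
    RootBase(const RootBase&) { root_set().insert(this); }   // copies register too
    ~RootBase() { root_set().erase(this); }
};

// Root<T> plays the part of "root my_GC T*": a pointer that registers itself.
template <class T>
class Root : public RootBase {
public:
    explicit Root(T* p = nullptr) : p_(p) {}
    T& operator*() const { return *p_; }
    T* operator->() const { return p_; }
    const void* target() const override { return p_; }
private:
    T* p_;
};

struct DataRef { int payload; };

class Data {
public:
    explicit Data(DataRef* r) : dref(r) {}
    Root<DataRef> dref;   // registered on construction and on copy
};
```

A collector could walk root_set() and call target() on each entry. The
cost is one set insertion and erasure per root, which may or may not be
acceptable for the "too much overhead" concern raised earlier.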

For an example of a C++ library which does what has traditionally
been done with major compiler and language changes, study the Blitz
library: http://monet.uwaterloo.ca/blitz/.

>  So C++ could in this way allow people to implement the GC they prefer.
>Good GCs are hybrids of techniques (moving, generational, incremental,
>parallel threads, fixed size), so it would be too early to let C++ come with
>one, other than in libraries.

Yes.

>>Languages that offer garbage collection built-in sometimes fail
>>to provide the tools needed to manage the other resources properly,
>>out of the delusion that memory is the only resource that needs to
>>be managed, or that other resources have the same characteristics
>>as memory.  (Java's "finalization" may be an example of this.)
>
>  So my suggestion would not change anything in C++ or its
>programming style, except for facilitating the implementation of a GC. In
>particular, the destructors would still exist, if one needs to write them.
>You can also combine different memory models within the same program. This
>is in fact important when implementing a GC: My DataRef handles would use
>a different model than the data on the heap.

I'm not yet convinced that language changes are needed for this
to be achieved.

>>Before many of us accept garbage collection as the norm, we should
>>hold out for a formalism that can handle other resources and their
>>typically more-stringent requirements as well.  Then memory management
>>will be just one case among many, and the programming styles promised
>>to be possible given garbage collection may become practical.
>
>  So GC is not the norm in the model I presented: One puts it in at
>the spots where it seems to be needed. In other places, one uses the old
>"new".

A garbage collector that only needs to look at certain "references"
should be able to do much better than one obliged to look at everything
and apply the same algorithm to everything.  An ability to treat
different uses of memory differently translates well to treating other
resources according to their unique characteristics as well.

The difficulty is in supporting composition, where the components
observe the "old" semantics, and demand the old attention.  E.g.
how do you GC something that contains a reference-counted string?
Do your proposed language changes help, there?  If not, maybe the
scheme requires a discipline of use which can also accommodate the
semantics you need within the language as it is.

Do you really feel you have explored the existing capabilities of the
language fully?  I know I haven't.

--
Nathan Myers
ncm@nospam.cantrip.org  http://www.cantrip.org/

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/29 Raw View

In article <EwFuGG.IKp@cadlab.it>, "Alex Martelli" <martelli@cadlab.it> wrote:
>Double-dispatching, and other variations of multiple dispatch,
>why, sure, I _have_ needed to do that many times over the years,
>including back when I couldn't count even on plain RTTI as being
>portable among the compilers I had to support (and that wasn't
>all that long ago, either).

  Double-dispatching provides a rather steep function stack with functions
inserted between the function calls that actually perform operations on
data.
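
For readers unfamiliar with the term, a conventional double-dispatch
arrangement is sketched below, with hypothetical Shape, Circle and Square
classes. The meet function does no work on data itself; it only forwards,
which is the kind of inserted call referred to above.

```cpp
#include <string>

class Circle;
class Square;

class Shape {
public:
    virtual ~Shape() {}
    // First dispatch: selects on the dynamic type of *this.
    virtual std::string meet(const Shape& other) const = 0;
    // Second dispatch: selects on the dynamic type of the argument of meet.
    virtual std::string met_by(const Circle&) const = 0;
    virtual std::string met_by(const Square&) const = 0;
};

class Circle : public Shape {
public:
    std::string meet(const Shape& other) const override { return other.met_by(*this); }
    std::string met_by(const Circle&) const override { return "circle-circle"; }
    std::string met_by(const Square&) const override { return "square-circle"; }
};

class Square : public Shape {
public:
    std::string meet(const Shape& other) const override { return other.met_by(*this); }
    std::string met_by(const Circle&) const override { return "circle-square"; }
    std::string met_by(const Square&) const override { return "square-square"; }
};
```

Every operation costs two virtual calls, and the stack holds the
forwarding frame while the real work runs.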

  So one simply allows the functions to also return virtual function
pointers. These are evaluated and scanned through in a while loop until one
finds a virtual function that acts on data. This gives a maximally
flattened function stack.
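
The loop can be sketched as follows. This is a simplification under
stated assumptions: it uses plain function pointers rather than virtual
function pointers, and Step, StepFn, done, call and run are hypothetical
names, not the ones used in the actual program.

```cpp
struct Step;
typedef Step (*StepFn)(int);

// A step is either a finished value (next is null) or a request
// to call next(value).
struct Step {
    StepFn next;
    int value;
};

Step done(int v)             { Step s = { nullptr, v }; return s; }
Step call(StepFn f, int arg) { Step s = { f, arg };     return s; }

Step add_one(int x)            { return done(x + 1); }                 // acts on data
Step forward_to_add_one(int x) { return call(&add_one, x); }           // only redirects
Step forward_twice(int x)      { return call(&forward_to_add_one, x); }

// Runs steps until one produces data. Each redirect returns to this
// loop before the next function is entered, so the stack never gets
// deeper than one call.
int run(StepFn f, int arg) {
    Step s = call(f, arg);
    while (s.next != nullptr)
        s = s.next(s.value);
    return s.value;
}
```

Here run(&forward_twice, 41) passes through two redirects and then
add_one, each called directly from the loop rather than from one another.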

  So this is how the problem arises.

  Nothing very deep or complicated.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>

Author: David R Tribble <david.tribble@noSPAM.central.beasys.com>
Date: 1998/07/29 Raw View

David Wragg <davew@gatsby.u-net.com> wrote:
>> Also, the term "fragile base class problem" is used to refer to two
>> very different things (I recently posted on this subject in another
>> group - see http://x13.dejanews.com/getdoc.xp?AN=373321668).

AllanW@my-dejanews.com wrote:
> From a light reading of that post, I believe it is a problem I am
> familiar with, although the name is new to me.  Basically, the
> problem stems from separate compilation.  C++'s efficiency comes
> from its knowledge of details about the class, especially the
> size of the object and the size of the virtual table.  But this
> also means that when details about the class change, even if they
> are all private, all the users of the class must also be
> recompiled.  In the case where one class is used across a great
> number of projects, this can be very difficult or even impossible.
>
> So far, I've only worked on one class where this was a concern.  I
> wrote a class used to give details to a report generator that
> displayed its results in one of our spreadsheets.  I managed to
> design the class so that internal changes need never force all the
> clients to recompile.  Here's a simplified version of what I did:
>
>     // The shared header file
>     class ReportInternal;
>     class Report {
>         ReportInternal *r;
>     public:
>         Report();
>         ~Report();
>         void Function1(int);
>         int Function2(char);
>         // etc...
>     };
>
> And inside the DLL I did this:
>     class ReportInternal {
>         // Whatever data is needed.
>         char data1[];
>         int  data2;
>         double data3;
>         // etc...
>     public:
>         // These functions do the "real" work.
>         void Function1(int);
>         int Function2(char);
>         // etc...
>     };
>
>     // MAINTENANCE NOTE: DO NOT DELETE ANY FUNCTIONS IN REPORT.
>     // You may add new functions, but if old ones become obsolete,
>     // you MUST re-implement them in terms of the new functions.
>     // Be sure to mark these as *deprecated* in the header file
>     // if appropriate.
>     Report::Report() : r(new ReportInternal) {}
>     Report::~Report() { delete r; }
>     void Report::Function1(int i) { r->Function1(i); }
>     int Report::Function2(char i) { return r->Function2(i); }
>     // etc...

This is a common programming technique, known as a "proxy" class.
You give the client a proxy object that hides the details of the
implementation; the proxy class simply forwards all requests
(i.e., member function calls) to the actual implementation class
functions.  The proxy class hides the implementation details from
the client.

This technique has several advantages:

1. The implementation class can change from release to release
without requiring the client code to be recompiled; the client code
only needs to be re-linked to the library containing the new
implementation class code and data.  (But you still have a problem
when you remove/obsolete a proxy member function, in which case the
client code needs to be recompiled; this should probably only be
done for major releases.)

2. The client sees nothing about the implementation class.  This
gets around one of the weaknesses of C++, in that a class
declaration must show all of its private members in its public
header file (which makes true data hiding difficult).  Using
proxy classes allows hiding the private details entirely.

3. Compiles of client code are probably faster, since the proxy
class is simpler than the full-blown implementation class.

The only disadvantages I see are:

1. The library code is a bit larger, because of the proxy forwarding
functions.  But the overhead is generally small, and is a reasonable
trade-off for the benefits.

2. The proxy member functions have to call the implementation
member functions, making the program execute a bit slower.  But,
again, the overhead is very small (and may not even be noticeable
for member functions that perform I/O), and is a reasonable
trade-off for the benefits.

> I'm not sure what I would have done if I had needed to use any
> virtual functions.  That could complicate the whole mechanism.

In this case, virtual functions are bad, because adding new
member functions will cause the _vtbl to change between releases.
Adding or changing virtual functions in the implementation class
is okay, since the client is isolated from it, but such changes to
the proxy class are not okay.  It's better to use only non-virtual
member functions in the proxy class.

> Is there some simpler way to deal with this issue?  I'm always
> in the market for some new technique to make my life simpler.
> (But I don't want to play games with the vtbl, which might or
> might not exist!)

Proxies are a perfectly acceptable and safe technique.

Another way to implement a proxy class is to make it a base class,
and make the implementation class a (hidden) class derived from it:

    // proxy.h
    // (Given to clients)

    class Proxy    // The proxy class, available to clients
    {
    public:
        static Proxy *  create();  // Creates a proxy
                        ~Proxy();
        int             foo();
        void            bar();
    private:
        friend class Impl;         // Lets the hidden Impl call Proxy()
                        Proxy();   // Inaccessible to clients
    };

    // proxy.cpp

    #include "impl.h"

    Proxy * Proxy::create()
    {
        return new Impl;   // Create an Impl object that
                           // looks like a proxy object
    }

    int Proxy::foo()
    {
        return static_cast<Impl *>(this)->iFoo();  // Forward the call
    }

    void Proxy::bar()
    {
        static_cast<Impl *>(this)->iBar();         // Forward the call
    }

    // impl.h
    // (Not given to clients)

    #include "proxy.h"

    class Impl: public Proxy  // Implementation class is derived
    {
    public:
                Impl();
                ~Impl();
        int     iFoo();
        void    iBar();
    };

    // impl.cpp

    #include "impl.h"

    int Impl::iFoo()
    {
        ...Do the actual work here...
    }

    ...etc...

A Proxy object can only be created by calling Proxy::create()
(we declare Proxy::Proxy() as private to ensure this), which
actually creates an Impl object.  The proxy member functions
forward their calls to the implementation member functions.  This
technique saves the overhead of a pointer member in Proxy, but
prevents clients from creating a Proxy object as an auto variable
or by using operator new().

-- David R. Tribble, dtribble@technologist.com --

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/29 Raw View

  I found a description that explains both what kind of feature I am
asking for, and how it might be implemented as a part of the C++ language:
If somebody has a suggestion for an efficient workaround within the
existing C++, then this description could be used to pin down the
semantics.

  But the suggestion amounts to that one should be able to use virtual
function pointers as objects with fuller information about the virtual
structure, so it seems to me that it should fit into the picture of a C++
language addition from that point of view: One gets enhanced capability of
making use of the features that are already there in the compiler but
which are not, at this point, available to the C++ programmer.

  If a class X has a virtual function X::f, then there is a unique base
class V of X in which V::f is first defined as a virtual function. (So the
class V does not have a base class in which f is defined virtual.) Call
this unique class _the_ base class of the virtual function f.

  Then, whenever the programmer writes X::f, the compiler saves the pair
(V, k), where V is the base class of X::f, and k is the offset of V::f on
the virtual function table of V. It makes no difference what this offset
is in the compiler implementation, but here I will assume it is just an
array offset. So if D is a derived class of V and we have a pointer D* dp,
then D::f(a) is called by (*(dp->__vtbl[k]))(dp, a), where __vtbl is the
internal vtbl pointer that the class has. (Cf. Ellis-Stroustrup, ARM 1194,
10.7c, p 228.)

  The feature I then want is that for any pointer C* cp of any class C
whatsoever, I should be able to write C++ code that semantically is
equivalent to the call sequence
    V* vp = dynamic_cast<V*>(cp);
    if (vp != NULL)
        (*(vp->__vtbl[k]))(vp, a);    // Call C::f via the V subobject.
    else
        ...                           // Exception code.

  One should, of course, in C++ write something that hides away this low
level code sequence.
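
Part of this can already be approximated in standard C++, as the sketch
below tries to show. A pointer to member function such as &V::f carries
both the class V (in its type) and the slot (in its value), so it can
stand in for the pair (V, k). The names Object, Other and call_virtual
are hypothetical, and a common polymorphic root class is assumed so that
dynamic_cast can be applied. The caller still has to name V::f rather
than X::f, so this does not give the full feature asked for.

```cpp
#include <typeinfo>

struct Object { virtual ~Object() {} };           // hypothetical common root

struct V : Object { virtual int f(int a) { return a + 1; } };
struct D : V      { int f(int a) override { return a * 2; } };
struct Other : Object {};                         // does not derive from V

// Calls the virtual function named by pmf on *cp if the dynamic type of
// *cp derives from Base; otherwise throws std::bad_cast.
template <class Base, class C, class R, class A>
R call_virtual(C* cp, R (Base::*pmf)(A), A a) {
    Base* vp = dynamic_cast<Base*>(cp);
    if (vp == nullptr) throw std::bad_cast();
    return (vp->*pmf)(a);                         // dispatches to the final overrider
}
```

Given an Object* that really points to a D, call_virtual(cp, &V::f, 20)
runs D::f; given a pointer to an Other, it throws.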

  But this wording of the feature is interesting, because it means that
one has a greater access to information that already is there in the
compiler, the base class V of the virtual function f.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>
---

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/29 Raw View

In article <Ewr8BB.CnK@cadlab.it>, "Alex Martelli" <martelli@cadlab.it> wrote:
>        class I_do_gcc {
>            public:
>            virtual data* gcc(data*) = 0;    // or whatever signature
>        };
>and the "ask object Y to execute gcc" becomes:
>        data* doit(data* input_data, Base * pY)
>        {
>        I_do_gcc* performer = dynamic_cast<I_do_gcc*>(pY);
>        if(performer)
>            return performer->gcc(input_data);
>        else
>            return generic_gcc(input_data);
>        }

  The problem is that the one that gets the request for "ask object Y to
execute gcc" does not know what "gcc" is. So, yes, one can probably do it
with pointers in some way (see my other post), but the question is how to
extract that information conveniently. If I write a function that returns
gcc, I just want to be able to write
   Data X::f() { return Y::gcc; }
and if I add a class using gcc, I just want to write
   class Use_gcc {
   public:
       Data gcc(Data&);  // use gcc.
   };

  If there are any tricks, then they should be such that this programmer
interface is not disturbed (because otherwise the program becomes too
difficult to handle).

  So well, I have used other models, like one in which one is forced to
use two virtual functions:
   class Use_gcc {
   public:
       T gcc_f() { ... }        // Set for use with gcc
       Data gcc(Data&);         // use gcc.
   };
But when one has really a lot of them (just ten or twenty suffices to make
it complicated), it quickly becomes difficult to work with.

  (I am saving these suggestions you post though, to see if I can make use
of them later on: I need to think more carefully about it.)

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>
---

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/29 Raw View

In article <6ph2gl$94o$1@shell7.ba.best.com>, ncm@nospam.cantrip.org
(Nathan Myers) wrote:
>True, retrofitting GC to a runtime environment that knows nothing
>about it is difficult.  A compiler or library that knows when heap
>pointers are being manipulated and offers hooks to operate on them
>would make GC a much more practical proposition.  Note "or library".

  It makes no difference how the feature is provided, of course. But Hans
Boehm (who provides a library) told me that they get hold of the root
set by going underneath the C++ compiler, using special knowledge of
each particular OS about function parameters and registers. Then they are
forced to search through all of it, looking for potential pointers.

  So if one should be able to selectively just extract a root set
associated with some particular pointer, then it seems to me that some new
features must be integrated into the C++ language itself.

>>Data something like
>>    class Data {
>>        root my_GC DataRef* dref;   // Mark for use with my_GC.
>>    };
>>Then C++ somehow keeps track of it so I can use my_GC to collect the
>>unused handles: If I only get hold of the root set, this could be done
>>relatively easily by using the copy constructors of C++.
>
>This might be one way; but where did that Dataref* pointer value come
>from?  Before proposing sweeping language changes, consider how your
>goals can be approached using the mechanisms already supported by
>the language.  Templates can be a big help, here.

  This is just one of the many models for dynamic data: One has a class
Data with a pointer to a handle class DataRef, as above, which in its turn
has a pointer to a class Base hierarchy object. The class Base also has a
pointer back to the class DataRef:

  This way, objects in the class Base hierarchy can reference each other
via the DataRef handle, and it is also possible for an object in this
hierarchy to self-mutate into an object of another type. So if one should
maximize the dynamic capabilities, this is a model one arrives at.
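
  The three-level model just described can be sketched as follows. This is
only a guess at the shape of the classes: the member names (obj, count,
ref, kind, mutate_into, use_count) and the class Other are invented for
the illustration, and the reference count is kept in the handle as the
later paragraphs of this post describe.

```cpp
#include <cassert>

class Base;

// Fixed-size handle: points at the object and carries the one reference count.
struct DataRef {
    Base* obj;
    int   count;
};

class Base {
public:
    DataRef* ref;                            // pointer back to the handle
    Base() : ref(nullptr) {}
    virtual ~Base() {}
    virtual int kind() const { return 0; }

    // Self-mutation: put another object behind the same handle, so every
    // Data that shares the handle sees the new object.
    void mutate_into(Base* other) {
        DataRef* r = ref;
        other->ref = r;
        r->obj = other;
        delete this;                         // no member access after this line
    }
};

class Other : public Base {
public:
    int kind() const override { return 1; }
};

// What the programmer uses: a reference to the handle, never to the object.
class Data {
    DataRef* dref;
    void release() {
        if (--dref->count == 0) { delete dref->obj; delete dref; }
    }
public:
    explicit Data(Base* b) : dref(new DataRef{b, 1}) { b->ref = dref; }
    Data(const Data& d) : dref(d.dref) { ++dref->count; }
    Data& operator=(const Data& d) {
        ++d.dref->count;                     // increment first: safe for self-assignment
        release();
        dref = d.dref;
        return *this;
    }
    ~Data() { release(); }
    Base* operator->() const { return dref->obj; }
    int use_count() const { return dref->count; }
};
```

  Because every access goes through the handle, the object behind it can
be replaced (or moved) without the Data values noticing.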

  From the point of view of memory management, the DataRef are of a fixed
size, so these can be put on a linked list. The data in the Base hierarchy,
as all references are via handles, is movable. The programmer would
normally just use automatic Data, which provides references (non-copied
data), letting the GC take care of keeping track of dead references.

  One can think of other models, for example, small sized data should
perhaps be put onto the DataRef handles, avoiding tracing pointers. This
might be done by deriving the class Base from the DataRef class, and
giving these classes their own "operator new".

  From what I can see there are two problems with implementing a
conservative GC onto this model: First, one needs to keep track of whether
the Data is automatic or on the free store; the latter case arises when
one writes a
    class D : public Base {
        Data d;            // Now, d.dref is not in the root set.
    };

  The second problem is to keep track of the pointers in the root set: The
trickiest part is keeping track of the pointers in the temporary
variables.

  One could think of implementing features keeping track of that (Dan
Edelson's name is mentioned in this context), but it is known that it is
too slow: Keeping track of that data is not easy and creates too much
overhead.

>A garbage collector that only needs to look at certain "references"
>should be able to do much better than one obliged to look at everything
>and apply the same algorithm to everything.  An ability to treat
>different uses of memory differently translates well to treating other
>resources according to their unique characteristics as well.

  This is what I think of too, one should be able to use different memory
techniques.

>The difficulty is in supporting composition, where the components
>observe the "old" semantics, and demand the old attention.  E.g.
>how do you GC something that contains a reference-counted string?
>Do your proposed language changes help, there?  If not, maybe the
>scheme requires a discipline of use which can also accommodate the
>semantics you need within the language as it is.

  I can only say what would happen in my program, if I were able to use it:
For a start, I would first write my classes in the Base class hierarchy
in a handle safe way (if its memory is to be moved) and then be able to
remove the reference count in the class DataRef above.

  However, references out of the class Base hierarchy are allowed, so if I
use, say, some STL list classes and these use traditional "operator new"
memory allocation, this should be no problem. So I could, if I so want,
successively study the problem of having such external data also
implemented with my GC as an optimization, but this is not something
that is required. (So the prudent thing would be to eventually stop using
the ref count string class, because it is not going to be needed anymore.
And it is not needed in my model even now, as I already put the ref count
once and for all in the DataRef class: If one has a pointer back from the
class Base to the DataRef class, one can avoid the need for having two ref
counts.)

  So this is really something that I want: External data should be easy to
add, while making my dynamic model as fast and flexible as possible.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>
---

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/29 Raw View

  Here is a much simpler question:

  In the code

    class B {
    public:
        virtual int f();
    };

    class D : public B {
    public:
        int f();
    };

    typedef int (B::*Bf)();

If I write
    Bf g = &B::f;
    B* bp = new D();
is
    (bp->*g)();
guaranteed in C++ to compute using D::f, or is the result (according to
the standard) undefined?
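
  As far as I can tell the call is well defined: a pointer to a virtual
member function is dispatched on the dynamic type of the object it is
applied to, so (bp->*g)() should reach D::f. A small check of this,
assuming function bodies that return 1 and 2 (the bodies, the virtual
destructor and the helper call_through are additions made only for the
test):

```cpp
#include <cassert>

class B {
public:
    virtual ~B() {}
    virtual int f() { return 1; }
};

class D : public B {
public:
    int f() override { return 2; }
};

typedef int (B::*Bf)();

// Calls through the pointer to member exactly as in the question.
inline int call_through(B* bp, Bf g) { return (bp->*g)(); }
```

  With Bf g = &B::f, call_through gives the D result for a B* that points
at a D object and the B result for a plain B object.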

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>
---

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/29 Raw View

In article <35BC92AA.59E2@wizard.net>, James Kuyper <kuyper@wizard.net> wrote:
> For any pair f,x of defined symbols, it is possible for
>the user to request the evaluation of f(x). The evaluation potentially
>evokes different code, depending upon both the definition of f, and the
>type of x.

>You've chosen to implement this using a variant of double dispatch.

  This is correct.

>Symbol definitions are represented by instances of classes derived from
>a Base class, which has a virtual argument() function which identifies
>the type of the symbol by returning a pointer to a Base virtual member
>function.

  This is however incorrect: I use symbols to access the runtime objects,
but in the actual runtime computations, the symbols are first ways. So the
"f" and "x" are runtime objects that when evaluated as "f(x)" do not know
anything about who called them.

>Every derived class that knows how to handle a given symbol
>type defines an override for the corresponding Base virtual member
>function. Any symbols of types that it doesn't know how to handle get
>passed to a single general() handler, defined as yet another virtual
>member of Base. Presumably general() displays a "syntax error" message
>of some kind.

  So this is wrong: general() ("generic" as I call it) just handles cases
where no specific method has been indicated: If "f" does not recognize the
suggestion from "x" or does not get any suggestion at all, then general()
is used.

>This design is a useful one if most derived classes are able to handle
>most symbol types. However, you have complained about unnecessarily
>having to re-compile your entire body of code just because you had to
>add a single symbol type like Float. This is a problem worth complaining
>about only if most of your derived classes don't implement the Float()
>virtual function.

  So this is correct: In the beginning of the project this was not the
problem. But the more I work on it, the less stuff should be put into the
Base class (from the semantic point of view).

>To avoid that problem, you proposed a change that would allow a
>variation on dynamic_cast<> to be applied to member function pointers. I
>don't know how hard that change would be to implement, though I suspect
>that it would complicate the internal representation or evaluation of
>member function pointers.

  Not at all (see my other post).

>In any event, the new standard has been
>approved, and there won't be any further changes of anything that isn't
>an outright defect for a long time to come, so you'll have to look
>elsewhere for a solution.

  I think that as soon as one version of C++ has been approved, the work
on the next revision starts.

  But even if suggestions are not approved, perhaps we gain some insights. :-)

>I'd like to suggest the following solution.
...
>        // "Base *pBase" is obtained from x, by means that don't matter
>        // for this example
>        Double *pDouble = dynamic_cast<Double *>(pBase);
>        if(pDouble!=NULL)
>        {
>                // code using Double:: members through pDouble
>                // to evaluate f(x)
>                return result;
>        }
>        Float *pFloat = dynamic_cast<Float *>(pBase);
>        if(pFloat!=NULL)
>        {
>                //Evaluation of f(x) through pFloat
>                return result;
>        }
>        return general(x);

  Actually, I am moving the other direction: I started writing code with
cases like this, but I changed because it became illogical. Suppose a
class takes both one and two doubles, say combining atan and atan2 into
one object. (And I here depart from the idea of genericity of all functions
being of type D f(D&) as I just want to illustrate the concept.) Then the
logical way is to write
    class Atan : public Base {
    public:
        double double1(double d) { return atan(d); }
        double double2(double x, double y) { return atan2(y,x); }
    };
where "double1" and "double2" are special virtual functions for taking
care of arguments with one or two doubles.
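
  A compilable version of this idea might look as follows. It assumes that
Base declares double1 and double2 as virtual functions whose default
implementations fall through to a common generic() handler; that fallback
and its exception are guesses made for the illustration, not something
stated above.

```cpp
#include <cassert>
#include <cmath>
#include <stdexcept>

class Base {
public:
    virtual ~Base() {}
    // One virtual function per kind of argument list; by default each one
    // falls through to the generic handler.
    virtual double double1(double) { return generic(); }
    virtual double double2(double, double) { return generic(); }
    virtual double generic() { throw std::runtime_error("no method"); }
};

class Atan : public Base {
public:
    double double1(double d) override { return std::atan(d); }
    double double2(double x, double y) override { return std::atan2(y, x); }
};
```

  A class that only knows the one-argument form would override double1
alone, and calls with two doubles would then end up in generic() without
any case analysis inside the class.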

  This is the logical way to program, because it keeps features that are
semantically apart also apart in the code.

  So this is the logical way to program, and it is also faster than going
through different cases. And if one knows something about functional
languages such as Haskell <http://www.haskell.org/>, which has classes
defined in similar ways, then this is also the logical way to go.

  So the idea with the virtual function pointers is surely the way to go.
If it does not work within C++, then the next step is to refine the
technique by a special implementation so that it works (like a class Class
or something).

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>
---

Author: fjh@cs.mu.OZ.AU (Fergus Henderson)
Date: 1998/07/29 Raw View

ncm@nospam.cantrip.org (Nathan Myers) writes:

>Garbage collection is no panacea.   First, new/delete can be made
>comparably fast.

This is true...

>Second, a "contract" style of programming
>eliminates memory leaks and complexity in managing resources.

I don't think it eliminates the complexity.  It just enables you
to manage that complexity better.

Consider the recent thread about the lifetime of the value
returned by std::exception::what().  The standard is certainly
supposed to specify a contract between the implementors and the
users, but it's very easy to forget to document the intended
lifetime of every piece of data.  "contract" style programming
is no panacea either.

Garbage collection doesn't claim to solve all problems.
But for the problems that it does address, it addresses
them in a manner that requires less discipline (i.e. effort)
on the part of the programmer than programming by contract.

--
Fergus Henderson <fjh@cs.mu.oz.au>  |  "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh>  |  of excellence is a lethal habit"
PGP: finger fjh@128.250.37.3        |     -- the last words of T. S. Garp.
---

Author: kanze@my-dejanews.com
Date: 1998/07/29 Raw View

In article <6piuim$7ou$1@shell7.ba.best.com>,
  ncm@nospam.cantrip.org (Nathan Myers) wrote:
> <kanze@my-dejanews.com> wrote:
> >  ncm@nospam.cantrip.org (Nathan Myers) wrote:
> >> Second, a "contract" style of programming
> >> eliminates memory leaks and complexity in managing resources.
> >
> >It's nice to know you work with programmers who never make a mistake.
>
> Mistakes in use of a contract interface are visible, and can be detected
> during code inspection with the aid of a simple checklist.

Complicating the interface with issues of object ownership adds to its
complexity, increases the probability of error, and above all, increases
the effort needed to code, inspect and test the program.

> Anyway if you
> try to use garbage collection in place of competent coders you will fail.

Nobody suggested such.  With or without garbage collection, developing
complex applications is a difficult job, and requires a certain degree
of sophistication on the part of the people doing it.  Garbage collection
is just a tool, which makes for a little less grunt work.  Sort of like
templates, in this way.

> >The Java tool to manage other resources is called the finally clause.
>
> The "finally clause" is an exception hack.  I was speaking of
> finalization, which is something entirely other done by the
> garbage collector.

"Finalization" is a C++ hack:-).

Seriously, I know of no other language which tries to put as much into
finalization as does C++.  The "finally clause" works well for encapsulating
things that have to be done before leaving a block.  It isn't necessary
in C++, because the idiom which has developed is to use destructors
for this, but many would argue that the finally clause is more natural;
the cleanup takes place in the same scope as the rest of the function.

In fact, I don't really believe that one is better than the other --
they are different, that's all.

And I don't see any practical use for the finalize method of Java.

> >> In the meantime, garbage collection alone doesn't deliver on its
> >> promises.
> >
> >If you mean that garbage collection isn't a silver bullet, which will
> >suddenly make all C++ correct, then you're right.
>
> No.  No reasonable person expects such a hack to make code correct.
> Correct code comes from correct coding.
>
> I mean that it has been promised that it supports a different style
> of programming in which consequences of resource-consumptive actions
> may be ignored.

No it hasn't.  It has promised that if no more references to a block
of memory exist, that block may be recycled.  It doesn't promise anything
else.

If you feel that garbage collection offers more, then you haven't
understood garbage collection or its promises.  If you feel that
having some automatic mechanism to generate parts of the code for
you, or using a library to do part of the work, doesn't reduce the
amount of work necessary to write a correct program, then we'll just
have to agree to disagree.

> When such an action includes consuming a scarce
> resource such as a file descriptor, garbage collection alone suddenly
> doesn't help.

Nobody claims it does.  In the case of a file descriptor, neither do
destructors; only the finally clause is really appropriate.  (On all
of the systems I've used, "freeing" a file descriptor involves closing
the file, an operation that can fail.  Which means that you must have
some way of testing the error and handling it.)

> Then the coding style itself is inappropriate, and you
> must retreat to the contract style.

What do you mean, retreat to the contract style?  As far as I know, no
one ever left it.  (Resource management, in general, is a very small
part of the contract style.)

> This generally turns out to be
> the norm, not the exception, in real programs.

My experience in large projects is that we generally spend about 1/3 or
1/4 of the time, globally, in memory management issues.  With garbage
collection, this drops down to 5-10%.  I find this a significant gain.

Again, because Nathan has carefully cut most of my main argument: garbage
collection doesn't absolve you of the design issues in program development,
including some analysis of memory management (if only to prove that it
will do the job for your application).  It does eliminate a measurable
part of the grunt work in the low level design and the implementation;
since less code means fewer errors, it also reduces the amount of time
needed in integration and program maintenance.

--
James Kanze    +33 (0)1 39 23 84 71    mailto: kanze@gabi-soft.fr
        +49 (0)69 66 45 33 10    mailto: jkanze@otelo.ibmmail.com
GABI Software, 22 rue Jacques-Lemercier, 78000 Versailles, France
Conseils en informatique orientée objet --
              -- Beratung in objektorientierter Datenverarbeitung

---

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/29 Raw View

In article <35BC92AA.59E2@wizard.net>, James Kuyper <kuyper@wizard.net> wrote:
>..For any pair f,x of defined [runtime objects], it is possible for
>the user to request the evaluation of f(x). The evaluation potentially
>evokes different code, depending upon both the definition of f, and the
>type of x.
>You've chosen to implement this using a variant of double dispatch.

  (Instead of "symbols" in the original quote I have changed it to
[runtime objects], in order to make the context correct.)

  It would be misleading to say that I use this variation of double
dispatch to identify the type of x: It can be used that way, but that is
not the case in the full generality. An example of this: I use
    class Base {
    public:
        virtual Data print(Data& arg);     // Print object on ASCII format
        virtual Data printTeX(Data& arg);  // Print object on TeX format
    };
The idea is that in order to print an object, I make a print object, say
    Print p(cout);
and an object is printed out by f(p). The object structure is such that
every subobject a of f will choose a.print(p), and it will print out the
object. Now, I can do the same with printing on TeX format:
    PrintTeX t;
and f(t) will produce a printing on TeX format instead. So far nothing special.

  But suppose some subobjects u do not have the u.printTeX function
defined. The way I have defined it, it is possible to make such objects
use the u.print function, while sending on the request for a printTeX
printing to its subobjects. If I would have written it
    class Base {
    public:
        virtual ostream& print(ostream&);     // Print object on ASCII format
        virtual ostream& printTeX(ostream&);  // Print object on TeX format
    };
this would not have been possible: As soon as a subobject uses the
u.print function, all its subobjects will use that one too.
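
  The difference can be shown with a reduced model. Here a plain struct
Request stands in for the print objects p and t, and a helper emit()
chooses the virtual function that the request suggests; Request, emit,
Alpha and List are hypothetical names, and the real functions above take
and return Data rather than a struct like this.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the print objects: carries the requested
// format and the text produced so far.
struct Request {
    bool tex;
    std::string out;
};

class Base {
public:
    virtual ~Base() {}
    // Choose the method suggested by the request object.
    void emit(Request& r) { if (r.tex) printTeX(r); else print(r); }
    virtual void print(Request& r) = 0;
    // Objects without a TeX form fall back to the ASCII one.
    virtual void printTeX(Request& r) { print(r); }
};

class Alpha : public Base {
public:
    void print(Request& r) override { r.out += "alpha"; }
    void printTeX(Request& r) override { r.out += "\\alpha"; }
};

// A container that only defines print. Its subobjects still receive the
// original request, so they may answer with printTeX.
class List : public Base {
public:
    std::vector<Base*> items;
    void print(Request& r) override {
        r.out += "[";
        for (std::size_t i = 0; i < items.size(); ++i) {
            if (i != 0) r.out += ", ";
            items[i]->emit(r);
        }
        r.out += "]";
    }
};
```

  Since the request travels with the argument instead of being fixed by
the function that happens to be running, the List falls back to print
while its Alpha elements still come out in TeX form.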

  So in this case it is possible for several objects x to share the same
virtual function x.print. Another instance of this use is if the object is
self-referential, with a pointer back to itself. Then the object cannot be
printed out at all, because it will go into an infinite loop, unless one
keeps track of those pointers back and prevents it from printing them
again.

  Suppose I want to print the object
    [u, v]
where v is in reality a reference to the object itself. It is reasonable
to print it something like
    1: [u, :1]
where 1: is a label, and :1 means "goto label 1". Then it is not possible
to do that in a single printing: One must first scan the object, marking
it up and then print it on the second scan.

  But how should I be able to mark it? I could not use a general virtual
function X::general, because it may scan different subobjects than the
printing mechanism. So the simplest answer is to make a new class
    PrintMark m;   // Create an object marking the object to be printed.
Then f(m)(p) will first mark the object properly and then print it on the
right form. (And one can make a special object hiding away this double
scan of the object.)
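
  The two scans can be sketched on a much simpler data structure than the
one discussed here. The struct Node and the functions mark, print_marked
and render below are invented for the illustration: a node with no items
is treated as a leaf with a name, and any node that the first scan reaches
a second time gets a numeric label.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <vector>

struct Node {
    std::string name;                 // text of a leaf
    std::vector<Node*> items;         // non-empty for a list
};

// First scan: label every node that is reached a second time.
void mark(const Node* n, std::set<const Node*>& seen,
          std::map<const Node*, int>& labels) {
    if (!seen.insert(n).second) {
        if (labels.count(n) == 0) {
            int next = static_cast<int>(labels.size()) + 1;
            labels[n] = next;
        }
        return;
    }
    for (std::size_t i = 0; i < n->items.size(); ++i)
        mark(n->items[i], seen, labels);
}

// Second scan: print, writing "k: " at a label and ":k" for a jump back.
void print_marked(const Node* n, const std::map<const Node*, int>& labels,
                  std::set<const Node*>& printed, std::ostream& os) {
    std::map<const Node*, int>::const_iterator it = labels.find(n);
    if (it != labels.end()) {
        if (!printed.insert(n).second) { os << ':' << it->second; return; }
        os << it->second << ": ";
    }
    if (n->items.empty()) { os << n->name; return; }
    os << '[';
    for (std::size_t i = 0; i < n->items.size(); ++i) {
        if (i != 0) os << ", ";
        print_marked(n->items[i], labels, printed, os);
    }
    os << ']';
}

// Hides the double scan behind one call.
std::string render(const Node* root) {
    std::set<const Node*> seen, printed;
    std::map<const Node*, int> labels;
    mark(root, seen, labels);
    std::ostringstream os;
    print_marked(root, labels, printed, os);
    return os.str();
}
```

  For a list whose second element is the list itself, render produces the
form 1: [u, :1] used in the example above.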

  From this, it is also clear that subobjects cannot be allowed to print
directly into an ostream like say
    void U::print(ostream& os, Data& arg) { os << "U"; }
because on the markup phase, stuff will be printed. So the functions must
look like
    Data U::print(Data& arg) { arg << "U"; }

  Now, tying this up to the original discussion: I am working with
something deeper than just identifying functions using types; I am working
with a C++ style object-orientation, except that I have taken the full
step out, making the objects completely runtime dynamic.

  This is in part the reason I knock my head so hard on this virtual
pointer question, because complicated workarounds will be both too slow
and too difficult to work with.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>
---

Author: Alexandre Oliva <oliva@dcc.unicamp.br>
Date: 1998/07/29 Raw View

kanze  <kanze@my-dejanews.com> writes:

> The "finally clause" works well for encapsulating things that have
> to be done before leaving a block.

But it does not encapsulate things that have to be done when an object
goes out of scope, which is a pity.  If I wanted to implement in Java
a locking mechanism that releases the locks as the lock object goes
out of scope, every block that created such an object would have to
end with a finally clause releasing the lock.  This is totally against
encapsulation.  :-(
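
For comparison, the C++ idiom being alluded to puts the release in a
destructor, so the blocks that use the lock contain no cleanup code at
all. A minimal sketch, in which Lock is a toy class with a flag (not a
real mutex) and ScopedLock is a hypothetical name:

```cpp
#include <cassert>

// Toy lock: only records whether it is currently held.
class Lock {
    bool held_;
public:
    Lock() : held_(false) {}
    void acquire() { held_ = true; }
    void release() { held_ = false; }
    bool held() const { return held_; }
};

// Acquires in the constructor and releases in the destructor, so the lock
// is dropped however the enclosing block is left, including by exception.
class ScopedLock {
    Lock& lock_;
    ScopedLock(const ScopedLock&);              // not copyable
    ScopedLock& operator=(const ScopedLock&);   // not assignable
public:
    explicit ScopedLock(Lock& l) : lock_(l) { lock_.acquire(); }
    ~ScopedLock() { lock_.release(); }
};
```

A block that needs the lock just declares a ScopedLock at its top; the
release is written once, in the class, rather than in every user.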

> And I don't see any practical use for the finalize method of Java.

If only it were ensured to run at application termination, it could be
used to save the state of persistent objects...

--
Alexandre Oliva
mailto:oliva@dcc.unicamp.br mailto:aoliva@acm.org
http://www.dcc.unicamp.br/~oliva
Universidade Estadual de Campinas, SP, Brasil
---

Author: James Kuyper <kuyper@wizard.net>
Date: 1998/07/29 Raw View

Hans Aberg wrote:
> In article <35BC92AA.59E2@wizard.net>, James Kuyper <kuyper@wizard.net> wrote:
...
> >Symbol definitions are represented by instances of classes derived from
> >a Base class, which has a virtual argument() function which identifies
> >the type of the symbol by returning a pointer to a Base virtual member
> >function.
>
>   This is however incorrect: I use symbols to access the runtime objects,
> but in the actual runtime computations, the symbols are first ways. So the
> "f" and "x" are runtime objects that when evaluated as "f(x)" do not know
> anything about who called them.

I think that by "runtime objects" you mean the same thing I am talking
about when I say "symbol definitions". Those definitions are represented
at runtime by C++ objects.
I don't know what you mean by "the symbols are first ways". I'm having
trouble parsing that phrase.
Your last sentence matches exactly my understanding, and is not in
conflict with the alternate solution I presented.

> >Every derived class that knows how to handle a given symbol
> >type defines an override for the corresponding Base virtual member
> >function. Any symbols of types that it doesn't know how to handle get
> >passed to a single general() handler, defined as yet another virtual
> >member of Base. Presumably general() displays a "syntax error" message
> >of some kind.
>
>   So this is wrong: general() ("generic" as I call it) just handles cases
> where no specific method has been indicated: If "f" does not recognize the
> suggestion from "x" or does not get any suggestion at all, then general()
> is used.

What general() does is irrelevant; I was just making a guess. My
suggestion would result in it being called under exactly the same
circumstances as it currently does. You've left me curious as to what
general() actually does. All you've explained so far are the
circumstances under which it is called; you have not said what it does
once it is called.

...
> >To avoid that problem, you proposed a change that would allow a
> >variation on dynamic_cast<> to be applied to member function pointers. I
> >don't know how hard that change would be to implement, though I suspect
> >that it would complicate the internal representation or evaluation of
> >member function pointers.
>
>   Not at all (see my other post).

I've looked at that post, and it makes explicit the way in which the
internal representation and evaluation of member function pointers would
get more complicated. I don't know for sure, but I think that right now
such a pointer could be implemented by a 1 byte vtable offset, and
evaluated by simply subscripting the vtable.

> >In any event, the new standard has been
> >approved, and there won't be any further changes of anything that isn't
> >an outright defect for a long time to come, so you'll have to look
> >elsewhere for a solution.
>
>   I think that as soon the one version of C++ has been approved, the work
> on the next revision starts.

My understanding is that ISO rules require a long (5 year?) period
during which all work is concentrated on maintaining and gaining
experience with the current standard, before the committee even starts
thinking about the next version.

...
> >I'd like to suggest the following solution.
> ...
> >        // "Base *pBase" is obtained from x, by means that don't matter
> >        // for this example
> >        Double *pDouble = dynamic_cast<Double *>(pBase);
> >        if(pDouble!=NULL)
> >        {
> >                // code using Double:: members through pDouble
> >                // to evaluate f(x)
> >                return result;
> >        }
> >        Float *pFloat = dynamic_cast<Float *>(pBase);
> >        if(pFloat!=NULL)
> >        {
> >                //Evaluation of f(x) through pFloat
> >                return result;
> >        }
> >        return general(x);
>
>   Actually, I am moving the other direction: I started writing code with
> cases like this, but I changed because it became illogical. Suppose a
> class takes both  one and two doubles, say combining atan and atan2 into
> one object. (And I here depart from the idea of genericity of all functions
> being of type D f(D&) as I just want to illustrate the concept.) Then the
> logical way is to write
>     class Atan : public Base {
>     public:
>         double double1(double d) { return atan(d); }
>         double double2(double x, double y) { return atan2(y,x); }
>     };
> where "double1" and "double2" are special virtual functions for taking
> care of arguments with one or two doubles.

You're giving more syntactic details about your application, which of
course complicates the design. I think that this could be handled within
the context of my suggestion by defining a Pair object derived from
Base, containing two Data members. The Atan object would then check
whether dynamic_cast<Pair *>(pBase) was NULL, and if not it would extract
the Base pointers from the two Data sub-objects and see if they could be
dynamic_cast<Double *>. How you create the Pair object is a detail which
depends upon how your parser recognises the ','.
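
A minimal sketch of the Pair idea, under stated assumptions: only the names Base, Double and Pair come from the thread. The member layout, the free function eval_atan, and the exception used in place of general() are hypothetical, and Pair here holds two Base pointers directly rather than Data sub-objects, to keep the example short.

```cpp
#include <cmath>
#include <stdexcept>

// Hypothetical hierarchy; only the class names are taken from the thread.
struct Base {
    virtual ~Base() {}
};

struct Double : Base {
    double value;
    explicit Double(double v) : value(v) {}
};

// Assumption: Pair stores two Base pointers instead of Data sub-objects.
struct Pair : Base {
    Base* first;
    Base* second;
    Pair(Base* a, Base* b) : first(a), second(b) {}
};

// Evaluates atan(x) for a Double, or atan2(y, x) for a Pair (x, y) of Doubles.
double eval_atan(Base* pBase) {
    if (Pair* pPair = dynamic_cast<Pair*>(pBase)) {
        Double* x = dynamic_cast<Double*>(pPair->first);
        Double* y = dynamic_cast<Double*>(pPair->second);
        if (x && y)
            return std::atan2(y->value, x->value);
    } else if (Double* d = dynamic_cast<Double*>(pBase)) {
        return std::atan(d->value);
    }
    // Stand-in for the general() fallback discussed in the thread.
    throw std::invalid_argument("atan: unsupported argument type");
}
```

A Pair whose members are not both Double falls through to the same fallback as any other unrecognised argument.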
---

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/29 Raw View

In article <6pmvc3$2j6$1@mulga.cs.mu.OZ.AU>, fjh@cs.mu.OZ.AU (Fergus
Henderson) wrote:
>Consider the recent thread about the lifetime of the value
>returned by std::exception::what().  The standard is certainly
>supposed to specify a contract between the implementors and the
>users, but it's very easy to forget to document the intended
>lifetime of every piece of data.  "contract" style programming
>is no panacea either.

  This touches on a point, the lifetime of data, which I think explains
why there should be some special C++ support for implementing a GC (and
not hoping that it can just be done via a library alone):

  The trick is to get hold of the root set (where the pointers originate)
fast and efficiently. Suppose I try to do that within C++ alone. Then if I
only had to deal with global data and stacked data, it would not be so
difficult, I just would stack those, and pop them when they go out of
scope (which can be done, I think, with some trickery with constructors,
destructors and "operator new").
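
A rough sketch of the "stack the roots, pop them on scope exit" idea, assuming strictly nested lifetimes. The names root_stack and Root are hypothetical, and no actual collector is shown; a real one would scan the registered slots at collection time.

```cpp
#include <vector>

// Hypothetical registry: addresses of the pointer slots currently in scope.
inline std::vector<void*>& root_stack() {
    static std::vector<void*> roots;
    return roots;
}

// A handle that registers its pointer slot on construction and
// unregisters it on destruction. This only works while lifetimes
// nest like a stack, which is the property temporaries are said to break.
template <class T>
class Root {
public:
    explicit Root(T* p = 0) : ptr_(p) {
        root_stack().push_back(static_cast<void*>(&ptr_));
    }
    ~Root() { root_stack().pop_back(); }
    T* get() const { return ptr_; }
private:
    Root(const Root&);             // not copyable: a copy would need its own slot
    Root& operator=(const Root&);
    T* ptr_;
};
```

The pop in the destructor is only correct if the handle being destroyed is the most recently constructed one, so any object whose lifetime is not strictly nested needs a different registration scheme.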

  But the problem is to handle the temporary data which breaks any stack
order, and further, the C++ standard is fuzzy in that respect: One only
knows that temporary data is allowed to die when a copy constructor has
been called, but one does not know whether this or that compiler lets the
element live longer for the use of some optimization.

  Further, suppose one just wants to have some pointer to belong to the
root set (like the example with the handles I posted): Then one way to
speed up the search of just those pointers and avoiding searching a lot of
other pointers, could be to put them in special places.

  Both these features, the lifetime of temporary and where the non-dynamic
data is stored is something that the fellow who writes the compiler knows
of: Going underneath the compiler, dealing directly with the OS, does not
help out; the best one can achieve that way is guesses of where to search.

  In addition, one paramount goal with a GC, no matter how it is
implemented in C++ (library, etc), is efficiency: So that does not leave
so many options, because every workaround or extra function call is going
to slow down the GC extraordinarily, as these are operations that are
called every time some piece of dynamic data is created.

  For example, one idea I had was to make use of the copy-constructors to
trace the pointers from the root set and move the data on the heap: At
GC-time, one would alter the copy-constructors to make this work. But this
altering of the copy-constructor means that it must have some extra
instructions in it choosing which variation to use. But this could be a
heavy burden on the GC, because this is on such a low programming level.
So for that reason, one might want to be able to use different copy
constructors, of different "colors", which one could use to ensure that
both the regular and the GC copy-constructor gets maximized performance.
(And I am aware of ideas of a copy-constructor with an extra variable, but I
am not sure it would work in my case and that it does not slow the GC
down. -- I just mention this as example of ideas that might be explored.)

  So this suggests that there are some features that the language C++
needs to have in order to facilitate the implementation of GC's.

  (I recall that Steve Strassmann, the fellow who wrote the language
Dylan, told me he felt it is extremely difficult to implement a
conservative GC using C++. A computer scientist working with implementing
Haskell all day said to me that the question frightened him. So I do not
think it is so that one can take an armchair approach to the question
saying "Sure, C++ has the capacity, just work a little harder with the
templates".)

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>
---

Author: "Alex Martelli" <martelli@cadlab.it>
Date: 1998/07/27 Raw View

Hans Aberg wrote in message ...
    [snip]
>  So suppose I want to add a C++ compiler to my program called say "gcc".
>Then I would need to add a "class gcc : public Base" to my program. I then
>may need to add a virtual function "X::gcc" for some classes "X : public
>B". If a class "Y : public Base" does not have the Y::gcc but is asked to
>execute it, then it should execute Y::general instead.

I may have missed some of your posts (my feed seems to be particularly
unreliable these days), because I still haven't seen your objections to my
suggestion: when you "add a virtual function X::gcc to some classes", do
so by having them (multiply) inherit from a further abstract base:
        class I_do_gcc {
            public:
            virtual data* gcc(data*) = 0;    // or whatever signature
        };
and the "ask object Y to execute gcc" becomes:
        data* doit(data* input_data, Base * pY)
        {
        I_do_gcc* performer = dynamic_cast<I_do_gcc*>(pY);
        if(performer)
            return performer->gcc(input_data);
        else
            return generic_gcc(input_data);
        }

You can suitably wrap this with templates and/or preprocessor
macros for convenience of use, but besides such syntactic-sugar
issues, doesn't this capture the runtime semantics you're after?

Of course, instead of a pointer-to-member-function for gcc, you
would be passing around pointers to objects of abstract class doer:
        class doer {
        public:
            virtual data* doit(data*) = 0;
        };
and the above-shown doit code would be the override of pure virtual
member doit in a suitable gcc_doer concrete class.
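
For reference, a compilable sketch of the idiom above. The typedef for data, the classes X and Y, and the body of generic_gcc are assumptions made only so the example is self-contained; the thread leaves them unspecified.

```cpp
#include <string>

typedef std::string data;   // assumption: stand-in for the unspecified "data" type

struct Base {
    virtual ~Base() {}
};

// Optional capability interface, mixed in only by classes that care about gcc.
class I_do_gcc {
public:
    virtual ~I_do_gcc() {}
    virtual data* gcc(data*) = 0;
};

// Hypothetical fallback for objects that do not implement I_do_gcc.
data* generic_gcc(data* input_data) {
    *input_data += " [generic]";
    return input_data;
}

class X : public Base, public I_do_gcc {
public:
    data* gcc(data* d) { *d += " [X::gcc]"; return d; }
};

class Y : public Base {};   // knows nothing about gcc

data* doit(data* input_data, Base* pY) {
    // Cross-cast: succeeds only if the dynamic type also derives from I_do_gcc.
    I_do_gcc* performer = dynamic_cast<I_do_gcc*>(pY);
    if (performer)
        return performer->gcc(input_data);
    return generic_gcc(input_data);
}
```

Adding X required no change to Base or to Y, which is the recompilation property under discussion.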

>  So from that point of view, there is no need putting X::gcc in the Base
>class as a Base::gcc, which has the unwanted side effect that all classes
>derived from Base (which in effect are all classes essential to the

Oh, I agree with this one -- "put it in THE base class" is no panacea;
however, "put it in A base class" can often be more satisfactory:-).

>runtime structure of the program) must be recompiled. But the way C++ is
>now, one must add Base::gcc and recompile it.

That's where we disagree: "the way C++ is now", one can perfectly
well AVOID adding function gcc to "the" base class, placing it instead
in "a" (further) base class (which will be a base for some objects
only -- those interested in implementing its functions!) -- thanks to
multiple inheritance; and RTTI will then give you the dynamic checks
you're after.

>  But if this was not forced, I could add my gcc later, and at runtime,
>only those classes bothering about gcc in a specifically new way would be
>needed to be recompiled.

The way I suggest, you DO need to recompile only those classes who
"bother about gcc in a specifically new way" -- because to those, you
add the new abstract base class, but not to others.

Please clarify why this rather run-of-the-mill idiom won't meet your
needs...!


Alex
---

Author: kanze@my-dejanews.com
Date: 1998/07/27 Raw View

In article <haberg-2607981300230001@sl118.modempool.kth.se>,
  haberg@REMOVE.matematik.su.se (Hans Aberg) wrote:
> In article <6pepb3$9qg$1@shell7.ba.best.com>, ncm@nospam.cantrip.org
> (Nathan Myers) wrote:
>
> >.. there is no reason malloc cannot
> >be implemented 50-100 times faster than is typical.  I have done it.
>
>   It would be interesting to know what the "secret" is.

#define free( p )

:-).

(Seriously, see my other posting.  I've had similar experiences with
specific applications.  Sun provides at least 3 different malloc/free
pairs, so you can choose the most appropriate for your application, but
I'm still generally able to beat all three.  Just not with the same
malloc/free each time.)

--
James Kanze    +33 (0)1 39 23 84 71    mailto: kanze@gabi-soft.fr
        +49 (0)69 66 45 33 10    mailto: jkanze@otelo.ibmmail.com
GABI Software, 22 rue Jacques-Lemercier, 78000 Versailles, France
Conseils en informatique orientée objet --
              -- Beratung in objektorientierter Datenverarbeitung

-----== Posted via Deja News, The Leader in Internet Discussion ==-----
http://www.dejanews.com/rg_mkgrp.xp   Create Your Own Free Member Forum
---

Author: James Kuyper <kuyper@wizard.net>
Date: 1998/07/27 Raw View

Hans Aberg wrote:
>
> In article <35B74FF1.446B@wizard.net>, James Kuyper <kuyper@wizard.net> wrote:
> >Could you please provide a complete example of actual code showing what
> >exactly you want to do?
>
>   Do you mean posting the whole program on 400000 lines spread on 150-200
> files? :-)

Of course not. Just post the simplest possible code that compiles and
displays the features that make your proposed change necessary. The code
you've actually given doesn't compile, because shortcuts were used in
writing it. I've exchanged private e-mail with you to clarify the parts
that didn't make sense to me. I think I now understand it well enough to
propose an alternative that meets your objectives without requiring a
change to C++.

As I understand it, your application is a sort of programmable
calculator program, which allows users to assign various definitions to
named symbols. For any pair f,x of defined symbols, it is possible for
the user to request the evaluation of f(x). The evaluation potentially
evokes different code, depending upon both the definition of f, and the
type of x.
You've chosen to implement this using a variant of double dispatch.
Symbol definitions are represented by instances of classes derived from
a Base class, which has a virtual argument() function which identifies
the type of the symbol by returning a pointer to a Base virtual member
function. Every derived class that knows how to handle a given symbol
type defines an override for the corresponding Base virtual member
function. Any symbols of types that it doesn't know how to handle get
passed to a single general() handler, defined as yet another virtual
member of Base. Presumably general() displays a "syntax error" message
of some kind.

This design is a useful one if most derived classes are able to handle
most symbol types. However, you have complained about unnecessarily
having to re-compile your entire body of code just because you had to
add a single symbol type like Float. This is a problem worth complaining
about only if most of your derived classes don't implement the Float()
virtual function.

To avoid that problem, you proposed a change that would allow a
variation on dynamic_cast<> to be applied to member function pointers. I
don't know how hard that change would be to implement, though I suspect
that it would complicate the internal representation or evaluation of
member function pointers. In any event, the new standard has been
approved, and there won't be any further changes of anything that isn't
an outright defect for a long time to come, so you'll have to look
elsewhere for a solution.

I'd like to suggest the following solution. Use the new RTTI features of
the standard to identify symbol types. Drop most of the virtual member
functions of your current Base class, except perhaps the general()
function. Derive a class from Base for each symbol type; as appropriate,
derive other classes of the same type from those classes. At the
appropriate point in each of your derived classes execute code like the
following to implement f(x):

 // "Base *pBase" is obtained from x, by means that don't matter
 // for this example
 Double *pDouble = dynamic_cast<Double *>(pBase);
 if(pDouble!=NULL)
 {
  // code using Double:: members through pDouble
  // to evaluate f(x)
  return result;
 }
 Float *pFloat = dynamic_cast<Float *>(pBase);
 if(pFloat!=NULL)
 {
  //Evaluation of f(x) through pFloat
  return result;
 }
 return general(x);

Note that the only code that will need to be recompiled upon adding a
Float class is the code that actually uses it. There is no need to add a
virtual Base::Float() member function.

Your method is more efficient, using only two or three virtual function
evaluations, regardless of symbol type. My alternative requires one
dynamic_cast<> for every symbol type a given class needs to handle. As
long as the average number of such types is small, this shouldn't be a
problem, particularly for an interactive application like yours.
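
A self-contained sketch of the cast chain, assuming a hypothetical function class Describe whose evaluate() plays the role of f(x) and merely reports which branch handled the argument. Text is an invented symbol type that Describe does not know about.

```cpp
#include <string>

struct Base {
    virtual ~Base() {}
};
struct Double : Base {};
struct Float  : Base {};
struct Text   : Base {};   // hypothetical type that Describe does not handle

// Hypothetical function object; evaluate() stands in for f(x).
struct Describe : Base {
    // Fallback for any symbol type this class does not recognise.
    std::string general(Base*) const { return "general"; }

    std::string evaluate(Base* pBase) const {
        // One dynamic_cast per symbol type this class knows how to handle.
        if (dynamic_cast<Double*>(pBase))
            return "handled as Double";
        if (dynamic_cast<Float*>(pBase))
            return "handled as Float";
        return general(pBase);
    }
};
```

The cost grows with the number of casts tried before a match, which is the trade-off against the constant two or three virtual calls of the original design.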
---

Author: kanze@my-dejanews.com
Date: 1998/07/27 Raw View

In article <6pepb3$9qg$1@shell7.ba.best.com>,
  ncm@nospam.cantrip.org (Nathan Myers) wrote:
> Hans Aberg<haberg@REMOVE.matematik.su.se> wrote:
> >
> >traditional "new" and "malloc" are 50-100 times slower than dynamic
> >memory using an efficient GC.
>
> This is frequently repeated, and it accurately describes common
> malloc implementations.  However, there is no reason malloc cannot
> be implemented 50-100 times faster than is typical.  I have done it.

To be more precise: for any particular application, it is probably
possible to implement malloc/free significantly faster than the system
malloc/free, and it is probably possible to find a garbage collection
scheme that is significantly faster than the system malloc/free.  Also,
almost by definition, the fastest garbage collection cannot possibly
be faster than the fastest malloc/free, since you can always implement
malloc the same as the garbage collected allocation, and #define free
to be empty.  (I am talking about C++ here, whose semantics in practice
do not allow relocating garbage collectors.  For many applications, a
relocating garbage collector can be faster than a non-relocating malloc/
free.  For other applications, of course, the opposite holds.)

Typically, if your program does any IO waits, a garbage collector will
seem faster than a typical malloc/free, because it can do the actual
freeing during the IO waits.  However, the difference doesn't have to
be enormous, since you can always implement free to just chain the
memory in a separate free list, and coalesce during the IO waits.  (The
first memory management scheme I wrote, circa 1979, worked this way.)
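
A minimal sketch of the deferred-free idea, with two loudly stated assumptions: the names quick_alloc, quick_free and drain_deferred are hypothetical, and the deferred work shown is simply handing blocks back to std::free, not the coalescing of adjacent blocks that a real allocator would do at that point.

```cpp
#include <cstddef>
#include <cstdlib>

// Intrusive list node stored inside the freed block itself.
struct DeferredBlock {
    DeferredBlock* next;
};

static DeferredBlock* g_deferred = 0;   // hypothetical global list head

// Allocation wrapper that guarantees room for the list node.
void* quick_alloc(std::size_t n) {
    if (n < sizeof(DeferredBlock))
        n = sizeof(DeferredBlock);
    return std::malloc(n);
}

// Constant-time "free": chain the block for later processing.
void quick_free(void* p) {
    if (!p)
        return;
    DeferredBlock* b = static_cast<DeferredBlock*>(p);
    b->next = g_deferred;
    g_deferred = b;
}

// Call while the program would otherwise be idle, e.g. waiting on IO.
// Returns the number of blocks actually released.
std::size_t drain_deferred() {
    std::size_t count = 0;
    while (g_deferred) {
        DeferredBlock* next = g_deferred->next;
        std::free(g_deferred);
        g_deferred = next;
        ++count;
    }
    return count;
}
```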

> Garbage collection is no panacea.   First, new/delete can be made
> comparably fast.  Second, a "contract" style of programming
> eliminates memory leaks and complexity in managing resources.

It's nice to know you work with programmers who never make a mistake.

My experience is slightly different.  At a design level, there isn't much
difference, because garbage collection or no, you still have to verify
that your design doesn't leak.  (And yes, it is perfectly possible to
leak memory even with garbage collection.)  Once past the design level,
however, you often (NOT always) have no implementation, because you've
been able to show that the garbage collector will take care of it.

> Finally, and most importantly, the same style that manages memory
> well manages other resources as well.

IMHO, memory is not a resource like the others (or at least, not like
most others).  In the abstract model we program against, we have infinite
memory (and infinite sized integers, etc.)  At some point in the design/
implementation, we either prove that the code we write cannot violate
the abstract model (typically by value propagation on validated input),
or we add checks to assert that it doesn't. (We're all used to writing
things like "assert( i < INT_MAX / 10 ) " and the like for this, I'm
sure.  And rearranging our expressions so that intermediate values don't
overflow.  This is part of the day to day routine of a programmer.)
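
As an illustration of that kind of check, here is a small sketch that accumulates decimal digits and tests the bound before each multiply-and-add, so that no intermediate value can overflow. The function name parse_digits and its return-false error convention are assumptions; the post's own example uses assert instead.

```cpp
#include <climits>
#include <string>

// Parses a string of decimal digits into an int.
// Returns false, leaving out untouched, on an empty string, a non-digit
// character, or a value that would not fit in an int.
bool parse_digits(const std::string& s, int& out) {
    if (s.empty())
        return false;
    int value = 0;
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        if (s[i] < '0' || s[i] > '9')
            return false;
        int digit = s[i] - '0';
        // value * 10 + digit <= INT_MAX  is equivalent to
        // value <= (INT_MAX - digit) / 10, which cannot itself overflow.
        if (value > (INT_MAX - digit) / 10)
            return false;
        value = value * 10 + digit;
    }
    out = value;
    return true;
}
```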

Memory is a bit awkward, because it is typically not possible to verify
anything in advance.  So about the only solution is to abort (cleanly)
when it runs out, the same as we do on overflow.  Having to worry about
free and/or delete is, in this regard, a bit like having to worry about
the bit pattern of an int or a float.  There are cases where it is
appropriate, but most of the time, it is not relevant to the level of
abstraction we are working at.

There are, of course, other resources for which this is also true; in
many ways, it should be true of a mutex lock.  In practice, we don't
yet know how to do this, because the underlying system doesn't know
how to determine without our saying so explicitly when we are through
with the lock, and perhaps more important, the lock must be freed as
soon as possible.  Thus, our abstract model cannot ignore the lock.

And of course, most other resources have error codes which must be checked
when they are freed.

> This last point is important because a program manages many
> resources.  Memory has looser requirements than other resources
> -- memory resources are freely interchangeable, and not scarce.
> If your program manages those other resources properly, then
> memory is just one more; but your program must not fail to manage
> other resources just because it no longer manages memory.

The difference is that most other resources (e.g. open files, etc.) are
part of the abstraction; the finiteness of memory isn't.  (If it were,
we couldn't speak of infinite recursion:-).)

> Languages that offer garbage collection built-in sometimes fail
> to provide the tools needed to manage the other resources properly,
> out of the delusion that memory is the only resource that needs to
> be managed, or that other resources have the same characteristics
> as memory.  (Java's "finalization" may be an example of this.)

The Java tool to manage other resources is called the finally clause.

But resources aren't the only problem.  Let's not forget data coherence
either.

> Before many of us accept garbage collection as the norm, we should
> hold out for a formalism that can handle other resources and their
> typically more-stringent requirements as well.  Then memory management
> will be just one case among many, and the programming styles promised
> to be possible given garbage collection may become practical.

If you find such a solution, let us know, and I will embrace it fully.
In the meantime, half a loaf seems better than none.  (This certainly
seems the philosophy of C++, anyway.  Namespaces, rather than any real
modules, for example.)

> In the meantime, garbage collection alone doesn't deliver on its
> promises.

If you mean that garbage collection isn't a silver bullet, which will
suddenly make all C++ correct, then you're right.  But this isn't what
the serious proponents of garbage collection are claiming.  All I would
claim for it is that it means less code to write, to verify and to test.
Which are pretty much the same reasons I use existing containers, rather
than inventing my own.

--
James Kanze    +33 (0)1 39 23 84 71    mailto: kanze@gabi-soft.fr
        +49 (0)69 66 45 33 10    mailto: jkanze@otelo.ibmmail.com
GABI Software, 22 rue Jacques-Lemercier, 78000 Versailles, France
Conseils en informatique orientée objet --
              -- Beratung in objektorientierter Datenverarbeitung

---

Author: ncm@nospam.cantrip.org (Nathan Myers)
Date: 1998/07/28 Raw View

<kanze@my-dejanews.com> wrote:
>  ncm@nospam.cantrip.org (Nathan Myers) wrote:
>> Second, a "contract" style of programming
>> eliminates memory leaks and complexity in managing resources.
>
>It's nice to know you work with programmers who never make a mistake.

Mistakes in use of a contract interface are visible, and can be detected
during code inspection with the aid of a simple checklist.  Anyway if you
try to use garbage collection in place of competent coders you will fail.

>The Java tool to manage other resources is called the finally clause.

The "finally clause" is an exception hack.  I was speaking of
finalization, which is something entirely other done by the
garbage collector.

>> In the meantime, garbage collection alone doesn't deliver on its
>> promises.
>
>If you mean that garbage collection isn't a silver bullet, which will
>suddenly make all C++ correct, then you're right.

No.  No reasonable person expects such a hack to make code correct.
Correct code comes from correct coding.

I mean that it has been promised that it supports a different style
of programming in which consequences of resource-consumptive actions
may be ignored.  When such an action includes consuming a scarce
resource such as a file descriptor, garbage collection alone suddenly
doesn't help.  Then the coding style itself is inappropriate, and you
must retreat to the contract style.  This generally turns out to be
the norm, not the exception, in real programs.

--
Nathan Myers
ncm@nospam.cantrip.org  http://www.cantrip.org/
---

Author: sbnaran@bardeen.ceg.uiuc.edu (Siemel Naran)
Date: 1998/07/23 Raw View

>>   One should not try to learn C++ apart from its uses: C++ is more like an
>> encyclopedia. For example, to me it is a nuisance having to write out an
>> empty virtual destructor "virtual ~D() { }" just in order to make sure its
>> memory is released, but surely someone is happy about the extra word is

BTW, the destructor doesn't release the memory for "this".  The releasing
of memory is done by the scope rules of C++: only when an object goes out
of scope is its memory released.  The destructor is there for the object
to do its final cleanup before it goes out of scope.  For example, it
may release dynamic memory held by the object.  But it never releases the
memory for the object itself.
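
The separation between destruction and deallocation can be made visible with placement new, as in this sketch. Tracked, g_destroyed and destroy_in_place are hypothetical names; for an object created with a plain new expression, the matching delete expression runs the destructor and then calls operator delete to release the storage.

```cpp
#include <new>

static int g_destroyed = 0;   // counts destructor calls, for illustration only

struct Tracked {
    int payload;
    Tracked() : payload(42) {}
    ~Tracked() { ++g_destroyed; }
};

// Constructs and destroys a Tracked object inside caller-supplied storage.
// The destructor runs, yet the storage is neither allocated nor released here.
int destroy_in_place(void* storage) {
    Tracked* t = new (storage) Tracked;   // placement new: construct only
    int seen = t->payload;
    t->~Tracked();                        // destroy only; storage is untouched
    return seen;
}
```

The caller must supply storage that is large enough and suitably aligned for Tracked.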


>I'm unfamiliar with that aspect of C++. I thought that the default
>destructor generated by the compiler was equivalent in all of it's
>effects to the empty one you define there. Why does explicitly defining
>it make a difference?

The compiler generated default destructor for some class will only be
virtual if this class inherits from a class which has a virtual default
destructor.  So this means that we must always write a destructor for
the most base class, even if it is trivial destructor, simply in order
to put in the 'virtual' keyword -- i.e. to ensure that all the destructors
inherited from this base class are virtual.  But writing a destructor with
the 'virtual' keyword in each derived class seems to be good style as it
reminds us the destructor is really virtual.
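
With a compiler that provides the C++11 header <type_traits> (newer than the standard being discussed in this thread), the inheritance of virtual-ness can be checked at compile time. The class names below are hypothetical.

```cpp
#include <type_traits>

struct PlainBase {};                       // implicit, non-virtual destructor
struct PlainDerived : PlainBase {};

struct VirtualBase {
    virtual ~VirtualBase() {}              // written out only to add 'virtual'
};
struct VirtualDerived : VirtualBase {};    // implicit destructor, virtual by inheritance

static_assert(!std::has_virtual_destructor<PlainDerived>::value,
              "no virtual destructor anywhere in this hierarchy");
static_assert(std::has_virtual_destructor<VirtualDerived>::value,
              "the implicit destructor inherits virtual-ness from the base");
```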

Also, the compiler generated destructor is inline.  So if your derived class
doesn't declare a destructor, the compiler will define the following one
(it is virtual because virtual-ness is inherited):
     inline virtual Derived::~Derived() { } // oops: syntax error!
Since virtual functions will usually be called by address, each translation
unit that includes <Derived> will have its own static copy of the virtual
destructor.  This leads to code bloat.

BTW, it seems to me that at link time, if the linker knows the dynamic type
of an object, it can inline any trivial functions which weren't explicitly
declared inline.  If this is true, it means that every function has an
implicit 'inline' in it!  On occasion, I've declared a trivial function
non-inline just for insulation.  Yet in the production version of my
program, this function should have been inlined.

For heavyweight objects, Lakos recommends a non-inline destructor, even
if it is a trivial destructor.  This is for insulation.  We may change
the definition of the destructor later, and in this case, we would like to
compile only this one ".c" file.  Perhaps Lakos takes this insulation
thing overboard, but then again, my own code is far far less than one
million lines!


Third, the compiler generated destructor is public.  On occasion, we may
need a protected or private destructor.  So even if it is a trivial
destructor, we'll have to write it.

--
----------------------------------
Siemel B. Naran (sbnaran@uiuc.edu)
----------------------------------



Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/23 Raw View

In article <35B62EEB.41C6@wizard.net>, James Kuyper <kuyper@wizard.net> wrote:

>You're implying that a separate function is needed for each combination
>of f and x; that the problem cannot be factored into virtual functions
>for each f interacting with virtual functions for each x.

  The f and x are handshaking, so first x returns a suggestion for a
virtual function D::g, where D can be any class in the hierarchy. Then f,
belonging to say the class B, examines this D::g and tries to interpret as
a B::f. If B::f is already defined, no problem, just use it; if it is not
already defined, there is no problem with the semantics of the program,
because then B::g should be used instead, but there is a problem with C++,
because it "jumps off the vtbl" -- so the program crashes before B::g can
be inserted.

>Lets call F
>and X the base classes from which all f's, and all x's are respectively
>defined. I'll use dF and dX as examples of classes derived from each.
>Then multi-methods could be implemented within C++ by:

  To begin there is only one base class by which all f's and x's are
derived. This is the class one does not want to change.

>1. defining a set of virtual members X::b(dF& f), overloaded for each
>dF, and overridden by dX::b(dF& f) to implement the actual
>multi-methods.
>
>2. defining a virtual member F::a(X& x), which is overridden in every dF
>to call x.b(*this)

  To make this explicit, the class X may be a dynamic class Double, and it
returns as a suggestion that any class should first try the virtual
function B::Double. Class F, however is the class Identity which just
returns a reference to its argument. So this class does not want to bother
about all the types Double, String, etc, but only return a reference to its
argument, which it does via its only virtual function Identity::General.

  So we work through all those cases of virtual functions and put them
into the base class and make them into a library. Then a fellow comes up
with an idea, "Would it not be great with a class Float", which the
original implementor found unnecessary. So, the way C++ is now, in order
to get this to work, one must add the virtual function B::Float to the
base class and recompile everything.

  But to all classes which do not bother about Floats, like the class
Identity which only returns a reference to its argument, this is wholly
unnecessary, because they already know how to handle the situation.

>> For example, to me it is a nuisance having to write out an
>> empty virtual destructor "virtual ~D() { }" just in order to make sure its
>> memory is released, but surely someone is happy about the extra word is
>
>I'm unfamiliar with that aspect of C++. I thought that the default
>destructor generated by the compiler was equivalent in all of it's
>effects to the empty one you define there. Why does explicitly defining
>it make a difference?

  If the destructor is not virtual, "B* b = new C; delete b;" will only
release the memory of b. Thanks to this C++ feature, one can save a word
if dynamic memory is not used.
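
A sketch of the virtual case, using the B and C of the example above with invented destructor bodies and a hypothetical counter g_log. Note that in standard C++, deleting a C through a B* when ~B is not virtual is formally undefined behaviour; destroying only the B part is what implementations typically do, not something the language guarantees.

```cpp
static int g_log = 0;   // records which destructors ran: +1 for B, +10 for C

struct B {
    virtual ~B() { g_log += 1; }
};
struct C : B {
    ~C() { g_log += 10; }
};

// Creates a C, deletes it through a B*, and reports which destructors ran.
int delete_through_base() {
    g_log = 0;
    B* b = new C;
    delete b;        // well-defined because ~B is virtual: runs ~C, then ~B
    return g_log;
}
```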

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>


Author: "Greg Colvin" <spam@me.not>
Date: 1998/07/23 Raw View

>   So if one is writing a dynamic program of sort (which sould be a
> dynamically typed language), it would be great to be able to use C++ for
> doing that job.

The way I once did it is this:  I wrote a Scheme interpreter in C++,
and wrote the dynamic code in Scheme.  I maintained efficiency by
designing a clean C++ interface to the interpreter, so that new
types and functions could be implemented in C++ for use in Scheme,
and so that Scheme functions could be called from within the C++
main program.

Greg Colvin




Author: rdamon@BeltronicsInspection.com (Richard Damon)
Date: 1998/07/23 Raw View

haberg@REMOVE.matematik.su.se (Hans Aberg) wrote:

>In article <35b4a667.69635670@news.gis.net>, richard_damon@iname.com wrote:
>>.. A pointer to derived
>>function can not be converted to a pointer to base function because a base
>>object may be missing something needed for the derived function.
>>
>>..I assume you will either make sure the
>>function is only applied to a derived or make sure the function only uses the
>>base functionality.
>
>  No, the base knows how to handle virtual pointers it does not recognize,
>because there is a general way to specify that. So I want the base to be
>able to make a runtime check if it is a recognizable virtual function
>pointer, and if not, proceed with the general method to handle it.
>
>>When I define an interface which takes member-function pointers I tend to
>>also add an interface that takes ordinary functions with a class pointer as a
>>parameter. This allows me to add new functions which were not defined in the
>>base class, but which are later needed. The derived class can then define a
>>static function which can convert the base pointer to the derived
>>pointer and call the member-function.
>
>  I am not sure about the details; please explain.

class Base {
public:
    int api(int (Base::*fn)());  // Takes a pointer to a member function
    int api(int (*fn)(Base*));   // Takes an ordinary function

    virtual int op1();  // Possible targets of first api function
    virtual int op2();
};


class Derived : public Base {
public:
    static int op3x(Base*);

    virtual int op3();

};

int Derived::op3x(Base* b) {
    Derived* d = dynamic_cast<Derived*>(b);
    if(d)
        return d->op3();
    else
        return 0;   // code for default processing goes here
}

I can then call
    api(&Base::op1)
    api(&Base::op2)
or
    api(&Derived::op3x) // extended functionality

--
richard_damon@iname.com (Redirector to my current best Mailbox)
rdamon@beltronicsInspection.com (Work Address)
Richad_Damon@msn.com (Just for Fun)

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/23 Raw View

In article <6p5b1l$855$1@nnrp1.dejanews.com>, AllanW@my-dejanews.com wrote:
>>   No, the base knows how to handle virtual pointers it does not recognize,
>> because there is a general way to specify that.
>
>No, it doesn't, not in C++.  Check the newsgroup you're posting to.

  I think you should check the articles you read: The program knows how to
handle it, but C++, as it is now, does not. So we are discussing a
possible change.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>

Author: James Kuyper <kuyper@wizard.net>
Date: 1998/07/23 Raw View

Hans Aberg wrote:
[A complicated verbal description of a specific solution of his problem,
which he believes can't be implemented properly in C++.]

Could you please provide a complete example of actual code showing what
exactly you want to do? Give an example of how it is intended to be
used, not just how you have implemented it. Since you don't believe that
C++ can do what you need to do, please choose one of the following
options:

1. Propose specific modifications to C++, and write the code using those
modifications.

2. Write the code in C++, and indicate where C++ forces you to use
constructs that you find inconvenient.

The incomplete pieces of code you've already provided show what you want
to do, but don't provide enough context to determine whether alternate
techniques might be applicable. The text you've provided gives the
context, but in language I find difficult to interpret. We are all
fairly fluent in C++, so that is the preferred language for explaining
things that can be explained in it.



Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/23 Raw View

In article <01bdb5f7$499ce310$634a5ed1@gcolvin-hpc>, "Greg Colvin"
<spam@me.not> wrote:
>>   So if one is writing a dynamic program of some sort (which should be a
>> dynamically typed language), it would be great to be able to use C++ for
>> doing that job.
>
>The way I once did it is this:  I wrote a Scheme interpreter in C++,
>and wrote the dynamic code in Scheme.  I maintained efficiency by
>designing a clean C++ interface to the interpreter, so that new
>types and functions could be implemented in C++ for use in Scheme,
>and so that Scheme functions could be called from within the C++
>main program.

  So this is sort of the same strategy I am heading for: One implements a
kind of kernel in C++ which can be used to implement a dynamic language.
Then most of the new dynamic features can be implemented in that dynamic
language, and the low level features can be implemented in C++ for
efficiency.

  One difference is that I mainly work with runtime objects and try to
find a good way to implement these efficiently in C++: The better one can
do this in C++, the better the dynamic language will be, and the less one
ends up with a language tied to a certain implementation method. -- If these
runtime objects are sufficiently well-defined and well-designed, they need
not be tied to the idea of a language at all.

  So the discussion circulates around how one should be able to do such
dynamic implementations efficiently using the otherwise rather static C++.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>


Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/23 Raw View

In article <35b75f8b.30515031@news.gis.net>, richard_damon@iname.com wrote:
>>  I am not sure about the details; please explain.
>
>class Base {

  It is possible to make a variation of what you did to get something
similar to the feature I asked for (I would have to think more about it):

class Base;
typedef int (*Fn)(Base*);

class Base {
public:
    virtual Fn arg();
    virtual int general();

    virtual int op1();
    virtual int op2();
};

class D : public Base {
public:
    static int op3x(Base*);

    virtual int op3();
};

int D::op3x(Base* b) {
    D* d = dynamic_cast<D*>(b);
    if(d != NULL)
        return d->op3();
    else
        return b->general();   // Code for default processing
}

class C : public Base {
public:
    Fn arg() { return D::op3x; }
};

Then, with
    C* xp;
    D* fp;
    Base* gp;
one has
    (xp->arg())(fp);    // Computes fp->op3()
    (xp->arg())(gp);    // Computes gp->general()

  There are some complications though: the extra name "op3x", which can be
bothersome when working with many tens of classes and functions, and that
all these extra functions will have essentially the same form
    int op3x(Base* b) {
        D* d = dynamic_cast<D*>(b);
        if(d != NULL)
            return d->op3();
        else
            return b->general();   // Code for default processing
    }
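
  As a side note on the repetition: the forwarders can be generated by a
function template instead of being written by hand. The following is only a
sketch under assumptions; the name forward_or_general and the bodies of
general() and op3() are made up for illustration and are not from the
program discussed here.

```cpp
// Minimal stand-ins for the classes in the post (bodies are assumptions).
class Base {
public:
    virtual ~Base() {}
    virtual int general() { return 0; }   // default processing
};

typedef int (*Fn)(Base*);

// One template replaces every hand-written op3x-style forwarder:
// call Member on b if b really is a T, otherwise fall back to general().
template <class T, int (T::*Member)()>
int forward_or_general(Base* b) {
    T* t = dynamic_cast<T*>(b);
    return t ? (t->*Member)() : b->general();
}

class D : public Base {
public:
    virtual int op3() { return 3; }
};
```

  With this, "Fn f = &forward_or_general<D, &D::op3>;" plays the role of
op3x without a new name per function. It does not remove the extra level in
the call chain, which is the other complication.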

  In addition, one ends up inserting an additional function in the
call chain, "op3x", which is unnecessary from the semantic point of view: The
ideal is that op3x returns a function pointer which then is evaluated, in
order to avoid unnecessary data being stacked on the function parameter
stack.

  But the interesting thing is that the feature I am asking for perhaps
can be added as a feature to C++ without complicating the compiler
construction too much or complicating the language itself.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/23 Raw View

In article <35B74FF1.446B@wizard.net>, James Kuyper <kuyper@wizard.net> wrote:
>Could you please provide a complete example of actual code showing what
>exactly you want to do?

  Do you mean posting the whole program of 400000 lines spread over 150-200
files? :-)

>1. Propose specific modifications to C++, and write the code using those
>modifications.

  I thought this was done: One suggestion is an extension of
dynamic_cast<D*>(f) where "f" is also allowed to be a pointer to a virtual
function: If f is defined in D or a base class of D the result is f,
otherwise NULL. Another suggestion is that "(dp->*f)(arg)", where dp is a
pointer to an object of class D, is computed normally if f is defined in D
or a base class of D, but if not, a virtual_function_cast_fail exception
is thrown.
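
  For comparison, the second suggestion can be approximated in today's C++
by checking the object rather than the function pointer. This is a sketch
under assumptions: checked_call is a made-up helper, the exception type only
borrows the name suggested above, and the class bodies are placeholders. It
tests whether the object is of the class the member pointer belongs to,
which is not the same as the proposed language feature.

```cpp
#include <stdexcept>

// Hypothetical exception type, named after the suggestion in the text.
struct virtual_function_cast_fail : std::runtime_error {
    virtual_function_cast_fail()
        : std::runtime_error("object is not of the member pointer's class") {}
};

class Base {
public:
    virtual ~Base() {}
};

class D : public Base {
public:
    virtual int op3() { return 3; }
};

// Emulates a checked "(dp->*f)(arg)": call f on obj if obj really is a T,
// otherwise throw instead of invoking undefined behaviour.
template <class T>
int checked_call(Base* obj, int (T::*f)()) {
    T* t = dynamic_cast<T*>(obj);
    if (t == nullptr)
        throw virtual_function_cast_fail();
    return (t->*f)();
}
```

  A call such as checked_call(bp, &D::op3) then either runs D::op3 or throws,
so the caller can catch the exception and proceed with a general method.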

>.Give an example of how it is intended to be
>used, not just how you have implemented it. Since you don't believe that
>C++ can do what you need to do, please choose one of the following
>options:

  The problem shows up when implementing an improvement of a dynamic
language feature called double dispatch. So the first thing one must
understand is that if one wants to compute f(x) of two objects f and x,
then all determinations of which method to use is done at runtime, not
compile time. The variables f and x are generic, and one simple way to
implement that in C++ is by having a class Data with a pointer to a handle
class DataRef with a pointer to a class Base hierarchy of many derived
classes. In reality, I have a parser and a name table and such things too,
but that is irrelevant for the discussion.

  For example, suppose I want to compute something simple, say f = sqrt, x
= 9.0. The name table of "f" contains a variable Data with a handle to
DataRef which points to a derived class Sqrt : Base, and similarly for "x",
except that the DataRef handle points to an object of type Double. There
is absolutely no way to figure out at compile time what runtime variables
such as "f" and "x" should contain; in effect they can contain any data.
It is also possible to change things so that "f" contains the 9.0 or vice
versa: Just change the DataRef handle to something else.

  So when computing f(x), then these objects must figure out at runtime
which method to use. One way to do this is by double dispatch:

>2. Write the code in C++, and indicate where C++ forces you to use
>constructs that you find inconvenient.

  class Base {
  public:
      virtual Data argument(Data& obj);
      virtual Data Double(Data& arg);
      virtual Data general(Data& arg);
  };

  class Double : public Base {
  public:
      double d;
      Data argument(Data& obj) { return obj.Double(arg); }
  };

  class Sqrt : public Base {
  public:
      Data Double(Data& arg) { return sqrt(arg.d); }
  };

  The class Data makes sure that when f(x) is called, first x->argument(f)
is called. We then get "return obj.Double(x)", which the class Data
converts to f->Double(x); so we land on the method Sqrt::Double, and the
square root of the double is computed and returned.

  So this is double dispatch.
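
  Stripped of the Data and DataRef handles, the two dispatch steps can be
written as a small compilable sketch. The assumptions here: the result is a
plain double instead of Data, Base::Double is renamed applyDouble to avoid
clashing with the class name, general() returns a placeholder 0.0, and
apply() is a made-up free function standing in for f(x).

```cpp
#include <cmath>

class Double;

class Base {
public:
    virtual ~Base() {}
    // First dispatch: on the dynamic type of the argument x.
    virtual double argument(Base& f) { return f.general(*this); }
    // Second dispatch: on the dynamic type of the function f.
    virtual double applyDouble(Double& x);
    // Placeholder for the general method.
    virtual double general(Base&) { return 0.0; }
};

class Double : public Base {
public:
    explicit Double(double v) : d(v) {}
    double d;
    // A Double tells f that its argument is a Double.
    double argument(Base& f) override { return f.applyDouble(*this); }
};

// By default an object does not know Doubles and uses the general method.
inline double Base::applyDouble(Double& x) { return general(x); }

class Sqrt : public Base {
public:
    double applyDouble(Double& x) override { return std::sqrt(x.d); }
};

// f(x): ask x to report its own type back to f.
inline double apply(Base& f, Base& x) { return x.argument(f); }
```

  Here apply(f, x) with a Sqrt and a Double lands in Sqrt::applyDouble, while
any other combination ends in general(). Note that Base still has to declare
applyDouble, which is exactly the coupling discussed next.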

  But we do not want this complicated pattern of nested functions: The
call chain becomes too steep. So instead let "Double::argument" return a
virtual function pointer, X::Double, because this is only what is needed.
(And the real life situation is more complicated, because this function
pointer could be evaluated, returning yet another pointer and so forth.) To
us, writing the program, "X" could be any class in the hierarchy, but this
does not work with C++. So it is here the problem with C++ starts: If we
know that the virtual function pointers used are always in the base class
Base, no problem, because we could always return that (unless there is a
problem with the C++ standard on virtual function pointers).

  But there is no real good reason (from the point of functionality of the
program and the point of view of writing the program in C++) for putting
the Base::Double there in the class Base: The class Base will in reality
never use the Base::Double virtual function; if it ever is called, it will
convert it into Base::general which will be used instead.

  In addition, if we turn the program into a library, and somebody later
wants to add a class Float, then we are forced to add a Base::Float
virtual function. Then the whole library must be recompiled,
despite the fact that only classes X that take an argument Float will ever
use the X::Float virtual function.

  One could try various workarounds, but the program then quickly
becomes wholly unworkable as a lot of classes are added all the time, and
every small addition forces the recompilation of a rather large program.
These techniques are complicated to work with, so if the C++ code is not
kept very simple, one gets stuck.

  So this is the problem.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>

Author: David Wragg <davew@gatsby.u-net.com>
Date: 1998/07/24 Raw View

haberg@REMOVE.matematik.su.se (Hans Aberg) writes:
>   So this is not really the case: You can only do simple dynamic
> operations with the C++ runtime mechanism. (I think this is known; it
> relates to the fragile base class problem, which is solved by including
> the names of the functions in the object code and dynamical linking, just
> as in Java.)

This is not correct. The C++ language is perfectly amenable to the
same implementation technique that Java uses (virtual function name to
vtable offset resolution at link time or run-time, rather than compile
time). There is no "C++ runtime mechanism". Problems you have with
current implementations are just that (even if almost all widespread
implementations happen to display this flaw).

In fact, I have heard of at least two C++ compilers that have been
developed which do use the alternative technique, in order to better
support reusable software components. (And I believe that adding
support for this technique to existing C++ compilers would be a fairly
simple task, if implementors weren't so busy catching up with
the standard.)

Also, the term "fragile base class problem" is used to refer to two
very different things (I recently posted on this subject in another
group - see http://x13.dejanews.com/getdoc.xp?AN=373321668).

--
Dave Wragg

Author: sbnaran@bardeen.ceg.uiuc.edu (Siemel Naran)
Date: 1998/07/24 Raw View

>  For example, suppose I want to compute something simple, say f = sqrt, x
>= 9.0. The name table of "f" contains a variable Data with a handle to
>DataRef which points to a derived class Sqrt : Base, and similarly for "x",
>except that the DataRef handle points to an object of type Double. There
>is absolutely no way to figure out at compile time what runtime variables
>such as "f" and "x" should contain; in effect they can contain any data.
>It is also possible to change so that "f" contains the 9.0 or vice versa:
>Just change the DataRef handle to something else.

Question:  What does f(x) mean when f=9 and x=sqrt?

Give a sample program showing how the classes are to be used.  The
program should go like this: "int main() { /* 7 LINES OF CODE */ }".
Or maybe the program accepts user input interactively; if so, give
7 lines of the interactive session.


--
----------------------------------
Siemel B. Naran (sbnaran@uiuc.edu)
----------------------------------

Author: AllanW@my-dejanews.com
Date: 1998/07/24 Raw View

In article <haberg-2307980016090001@sl65.modempool.kth.se>,
  haberg@REMOVE.matematik.su.se (Hans Aberg) wrote:
> In article <6p5b1l$855$1@nnrp1.dejanews.com>, AllanW@my-dejanews.com wrote:
> >>   No, the base knows how to handle virtual pointers it does not recognize,
> >> because there is a general way to specify that.
> >
> >No, it doesn't, not in C++.  Check the newsgroup you're posting to.
>
>   I think you should check the articles you read: The program knows how to
> handle it, but C++ does not the way C++ is now. So we are discussing for a
> possible change.

That's a very good point.  This *is* the forum for discussing possible
changes to the language.

Speaking for myself only, it is my opinion that this particular change
would not be a wise one.  Even if it is technically feasible (and I'm
not sure that it is), it would not be consistent with other C++
design goals and existing features.  Among these are strong typing,
and a virtual-function mechanism that is very low cost in many
different measurements (program size, execution speed, and programmer
productivity -- perhaps not in programmer training costs, but once
the training is completed the productivity more than makes up for
this).


Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/24 Raw View

  (This might be slightly off-topic, but one difficulty readers of this
thread seem to have is understanding what it is like to work with a generic
variable. So showing some of it might be helpful for this thread.)

In article <slrn6rfhal.73h.sbnaran@bardeen.ceg.uiuc.edu>,
sbnaran@KILL.uiuc.edu wrote:

>Question:  What does f(x) mean when f=9 and x=sqrt.

  Anything you want it to be: You could decide to throw an exception, or
do as me, view 9 as a constant function so that f(x) = 9. Other
evaluations might return an object representing a deferred evaluation.

>Give a sample program showing how the classes are to be used.  The
>program should go like this: "int main() { /* 7 LINES OF CODE */ }".
>Or maybe the program accepts user input interactively; if so, give
>7 lines of the interactive session.

  I use the program only interactively via a parser and a look-up table,
even though one might use it when programming too. An interaction might
look like:
  >  f = sqrt
     sqrt
  >  x = 9
     9
  >  f(x)
     3
  >  x(f)
     9
  >  f(f)
     sqrt o sqrt   // Return a function composition as deferred evaluation.

  One can do more complicated things too, like using expressions f = sqrt
+ 1, and lambda formulas: Then it starts to be complicated, so it is
important to keep the C++ code simple and clean.

  I can try to write out what the corresponding C++ code might look like:
    Data sqrt_of_9() {
        Data f = new Elementary(sqrt);
        Data x = new Double(9);
        return f(x);
    }
or
    Data sqrt_of_sqrt() {
        Data f = new Elementary(sqrt);
        return f(f);
    }
Here, the class Data has a pointer to a DataRef handle, which in its turn
has a pointer to the class Base hierarchy; the classes Elementary and
Double are derived from the class Base. The main point though, is that I
can let the computations be carried out at runtime without foreseeing what
type of data a Data variable might contain.

  The thing is that the class Data has a Data::operator()(Data&) which
makes the C++ programming interface to be very close to what the program
does dynamically.  (In my code, one could also skip "new", in which case
the class would first clone the data in the class Base hierarchy via a
virtual copy constructor.)
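
  To make the session above concrete, here is a reduced sketch of such a
handle class. It is an assumption-laden illustration, not the actual code:
it collapses Data and DataRef into one std::shared_ptr, uses dynamic_cast
in place of double dispatch, and the names Compose, apply, show,
make_number, make_elementary and sqrt_fn are invented for the example.

```cpp
#include <cmath>
#include <memory>
#include <string>
#include <utility>

class Data;

// Root of the (hypothetical) runtime object hierarchy.
class Base {
public:
    virtual ~Base() {}
    virtual std::string show() const = 0;
    // Apply the object held by `self` to `arg`.
    virtual Data apply(const Data& self, const Data& arg) const = 0;
};

// Handle class: f(x) in C++ forwards to the object behind the handle.
class Data {
public:
    Data(std::shared_ptr<const Base> p) : ref(std::move(p)) {}
    Data operator()(const Data& arg) const { return ref->apply(*this, arg); }
    std::string show() const { return ref->show(); }
    const Base* get() const { return ref.get(); }
private:
    std::shared_ptr<const Base> ref;
};

class Double : public Base {
public:
    explicit Double(double v) : d(v) {}
    double d;
    std::string show() const override { return std::to_string(d); }
    // A number acts as a constant function: x(anything) is x.
    Data apply(const Data& self, const Data&) const override { return self; }
};

// Deferred evaluation: the composition "outer o inner".
class Compose : public Base {
public:
    Compose(Data o, Data i) : outer(std::move(o)), inner(std::move(i)) {}
    std::string show() const override { return outer.show() + " o " + inner.show(); }
    Data apply(const Data&, const Data& arg) const override { return outer(inner(arg)); }
private:
    Data outer, inner;
};

class Elementary : public Base {
public:
    Elementary(std::string n, double (*f)(double)) : name(std::move(n)), fn(f) {}
    std::string show() const override { return name; }
    Data apply(const Data& self, const Data& arg) const override {
        // Evaluate on numbers; otherwise defer by building a composition.
        if (const Double* x = dynamic_cast<const Double*>(arg.get()))
            return Data(std::make_shared<Double>(fn(x->d)));
        return Data(std::make_shared<Compose>(self, arg));
    }
private:
    std::string name;
    double (*fn)(double);
};

inline double sqrt_fn(double v) { return std::sqrt(v); }
inline Data make_number(double v) { return Data(std::make_shared<Double>(v)); }
inline Data make_elementary(const std::string& n, double (*f)(double)) {
    return Data(std::make_shared<Elementary>(n, f));
}
```

  With f = make_elementary("sqrt", sqrt_fn) and x = make_number(9.0), f(x)
holds a Double with value 3, x(f) is x again, and f(f).show() gives
"sqrt o sqrt", matching the interaction shown earlier.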

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>


Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/24 Raw View

In article <jasojsb21g.fsf@gatsby.u-net.com>, David Wragg
<davew@gatsby.u-net.com> wrote:

>haberg@REMOVE.matematik.su.se (Hans Aberg) writes:
>>   So this is not really the case: You can only do simple dynamic
>> operations with the C++ runtime mechanism. (I think this is known; it
>> relates to the fragile base class problem, which is solved by including
>> the names of the functions in the object code and dynamical linking, just
>> as in Java.)
>
>This is not correct. The C++ language is perfectly amenable to the
>same implementation technique that Java uses (virtual function name to
>vtable offset resolution at link time or run-time, rather than compile
>time). There is no "C++ runtime mechanism". Problems you have with
>current implementations are just that (even if almost all widespread
>implementations happen to display this flaw).

  In Java, one can also access the dynamic linking names from your
program; it is called reflection. See
  http://www.javasoft.com/products/jdk/1.1/docs/guide/reflection/index.html
This could be used if everything else fails, and this you cannot do with C++.

>Also, the term "fragile base class problem" is used to refer to two
>very different things (I recently posted on this subject in another
>group - see http://x13.dejanews.com/getdoc.xp?AN=373321668).

  The "fragile base class problem" is, I think, the problem that all
derived classes must be recompiled even when the changes made to the base
class do not require it. I said that the question here is related
to that question, because here also the base class must get another
virtual function if a derived class adds it, and then all derived classes
must be recompiled. So in working around this problem in how the compiler
implements it, one may need to think about how to work around the fragile
base class problem in this instance.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>


Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/24 Raw View

In article <6p9aee$o93$1@nnrp1.dejanews.com>, AllanW@my-dejanews.com wrote:
>Speaking for myself only, it is my opinion that this particular change
>would not be a wise one.  Even if it is technically feasible (and I'm
>not sure that it is), it would not be consistent with other C++
>design goals and existing features.  Among these are strong typing,

  I do not understand how the suggestion breaks the C++ static typing.

>and a virtual-function mechanism that is very low cost in many
>different measurements (program size, execution speed, and programmer
>productivity -- perhaps not in programmer training costs, but once
>the training is completed the productivity more than makes up for
>this).

  And one point in using the technique for a C++ programmer would be that
it is simple and fast to the programmer -- unlike the suggested
workarounds which are both cumbersome and slow.

  And one focus of the discussions here has been to figure out if there
might be a reasonably simple way for the compiler to implement it. -- But
this is really something for experts on writing C++ compilers to think
about.

  The thing is that when implementing dynamic features, C and C++ are
often the choice, because the runtime code can be fast. But it is a
problem with C and C++ that some such dynamic features are very hard to
implement in those languages, so it would be good to change at least C++
in order to facilitate such implementations.

  This does not really have anything to do with the idea that C++ is a
statically typed language: Static typing provides for fast code, because
compile-time structures can be thrown away in the compiled code. For
example, derived classes are statically typed even though these can be
used to implement dynamic runtime objects.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>


Author: David R Tribble <david.tribble@noSPAM.central.beasys.com>
Date: 1998/07/25 Raw View

James Kuyper <kuyper@wizard.net> wrote:
>> Could you please provide a complete example of actual code showing
>> what exactly you want to do?

Hans Aberg wrote:
> ...
>   The problem shows up when implementing an improvement of a dynamic
> language feature called double dispatch. So the first thing one must
> understand is that if one wants to compute f(x) of two objects f and
> x, then all determinations of which method to use is done at runtime,
> not compile time. The variables f and x are generic, and one simple
> way to implement that in C++ is by having a class Data with a pointer
> to a handle class DataRef with a pointer to a class Base hierarchy of
> many derived classes. In reality, I have a parser and a name table and
> such things too, but that is irrelevant for the discussion.
>
>   For example, suppose I want to compute something simple, say
> f = sqrt, x = 9.0. The name table of "f" contains a variable Data with
> a handle to DataRef which points to a derived class Sqrt : Base, and
> similarly for "x", except that the DataRef handle points to an object
> of type Double. There is absolutely no way to figure out at compile
> time what runtime variables such as "f" and "x" should contain; in
> effect they can contain any data.
> It is also possible to change so that "f" contains the 9.0 or vice
> versa: Just change the DataRef handle to something else.
>
>   So when computing f(x), then these objects must figure out at
> runtime which method to use. One way to do this is by double dispatch:
> ...

Stroustrup discusses "double dispatch", also known as
"multi-methods", in "The Design and Evolution of C++", sect. 13.8.
In particular, he discusses why he thought it was too complicated
to add to C++, mainly in dealing with the efficiency of choosing the
correct function to call and the exponentially-sized function tables
(vtbl) apparently required.  In essence, it boils down to
deciding which member function of the form 'intersect(Sh &a, Sh &b)'
to choose when 'a' and 'b' are types derived from base class 'Sh',
out of all possible combinations of 'a' and 'b' types.
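
For readers who have not seen the table-based approach, here is a minimal
library-level sketch of the 'intersect' problem. It is my own illustration,
not code from the book: the shapes Circle and Square, the handler functions
and the table layout are all assumptions, and the lookup matches exact
dynamic types only (a class derived from Circle would fall through to the
generic case).

```cpp
#include <map>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <utility>

struct Sh { virtual ~Sh() {} };
struct Circle : Sh {};
struct Square : Sh {};

typedef std::string (*IntersectFn)(Sh&, Sh&);
typedef std::pair<std::type_index, std::type_index> Key;

// Hypothetical handlers; a real one would downcast and do geometry.
static std::string circle_circle(Sh&, Sh&) { return "circle/circle"; }
static std::string circle_square(Sh&, Sh&) { return "circle/square"; }
static std::string generic(Sh&, Sh&) { return "generic"; }

// The table has one entry per handled pair of types, so it grows with
// the product of the number of types on each side.
inline const std::map<Key, IntersectFn>& table() {
    static const std::map<Key, IntersectFn> t = {
        { Key(typeid(Circle), typeid(Circle)), circle_circle },
        { Key(typeid(Circle), typeid(Square)), circle_square },
    };
    return t;
}

// Select the handler from the dynamic types of both arguments.
inline std::string intersect(Sh& a, Sh& b) {
    std::map<Key, IntersectFn>::const_iterator it =
        table().find(Key(typeid(a), typeid(b)));
    return it != table().end() ? it->second(a, b) : generic(a, b);
}
```

The map lookup is also noticeably slower than a single vtbl index, which
illustrates the efficiency concern mentioned above.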

-- David R. Tribble, dtribble@technologist.com --


Author: AllanW@my-dejanews.com
Date: 1998/07/25 Raw View

In article <haberg-2407981316130001@sl70.modempool.kth.se>,
  haberg@REMOVE.matematik.su.se (Hans Aberg) wrote:

>   I use the program only interactively via a parser and a look-up table,
> even though one might use it when programming too. An interaction might
> look like:
>   >  f = sqrt
>      sqrt
>   >  x = 9
>      9
>   >  f(x)
>      3
>   >  x(f)
>      9
>   >  f(f)
>      sqrt o sqrt   // Return a function composition as deferred evaluation.
>
>   One can do more complicated things too, like using expressions f = sqrt
> + 1, and lambda formulas: Then it starts to be complicated, so it is
> important to keep the C++ code simple and clean.
>
>   I can try to write out what the corresponding C++ code might look:
>     Data sqrt_of_9() {
>         Data f = new Elementary(sqrt);
>         Data x = new Double(9);
>         return f(x);
>     }
> or
>     Data sqrt_of_sqrt() {
>         Data f = new Elementary(sqrt);
>         return f(f);
>     }
> Here, the class Data has a pointer to a DataRef handle, which in its turn
> has a pointer to the class Base hierarchy; the classes Elementary and
> Double are derived from the class Base. The main point though, is that I
> can let the computations be carried at runtime without forseeing what type
> of data a Data variable might contain.
>
>   The thing is that the class Data has a Data::operator()(Data&) which
> makes the C++ programming interface to be very close to what the program
> does dynamically.  (In my code, one could also skip "new", in which case
> the class would first clone the data in the class Base hierarchy via a
> virtual copy constructor.)

What I'm getting from this and other posts, is that you are writing
a compiler or interpreter for some other language.  Which is a great
idea; C++ is an ideal choice for this type of project.

However, if that's correct, you have to bear in mind the difference
between your language-processor's run time environment and that of
the program being processed.

Let's imagine a hypothetical language interpreter which implements a
language similar (on some level) to C++.  For the sake of discussion,
let's call the language INTERP.  Your INTERP interpreter is itself
a computer program, of course, and let's assume that it's written in
C++.  Your user will use your program, INTERP, to execute an
INTERP-language program named TEST.  This TEST program contains 50
classes, many of which have virtual functions in a complex inheritance
hierarchy.  You, as the author of INTERP, have to write code that
implements the v-tables needed by this inheritance hierarchy.
Is that pretty close to the way things are?

If so, you have to draw a distinction between virtual functions in
the INTERP program, and virtual functions in TEST.  You could try to
locate the virtual tables for your own INTERP classes, and use the
values there directly, but that way lies madness.  Instead, here's
a first-cut approximation for an interpreter (this would be totally
inappropriate for a compiler, since compile-time data would have
to be separated from run-time data):
    struct MEMBER_FUNCTION {
        string  name;    // Name of the member function
        VISIBLE visible; // Public, protected, private
        TYPE    type;    // Data type of return value
        ADDRESS address; // Address of the function
        bool    Virtual; // Is the function virtual?
        int     argcount; // Number of arguments
        TYPE   *arglist; // Data types of arguments
    };
    struct MEMBER_DATA {
        string  name;    // Name of the data member
        VISIBLE visible; // Public, protected, private
        TYPE    type;    // Data type of data member
        long    address; // offset from beginning of class data
    };
    struct CLASS {
        string Name;     // Name of the class
        MEMBER_FUNCTION*func; // Array of member-function info
        MEMBER_DATA *data;    // Array of member-data info
        int    size;     // Redundant; sizeof(this class)
        int    bases;    // How many base classes?
        CLASS *base;     // Points to all base classes
        bool  *is_virtual_base; // true if the matching base class is virtual
    };

When a pointer to a class is used, you must determine its actual
data type and locate the appropriate CLASS object.  Then you can
use the information there to look up the correct function and
transfer control to it.
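
The lookup step described above might be sketched as follows. This is only an illustration under stated assumptions: MemberFunction and ClassInfo are hypothetical, trimmed-down versions of the records above, ADDRESS is assumed to be a plain index into the interpreter's code store, and find_function is an invented helper name.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified versions of MEMBER_FUNCTION and CLASS.
// ADDRESS is assumed to be an index into the interpreter's code store.
typedef int ADDRESS;

struct MemberFunction {
    std::string name;        // name of the member function
    ADDRESS     address;     // where the interpreted code starts
    bool        is_virtual;  // declared virtual in the interpreted program?
};

struct ClassInfo {
    std::string                   name;
    std::vector<MemberFunction>   funcs;
    std::vector<const ClassInfo*> bases;  // direct base classes
};

// Search the object's actual class first, then its bases (depth-first).
// Returns 0 when no class in the hierarchy declares the function.
const MemberFunction* find_function(const ClassInfo& cls,
                                    const std::string& name) {
    for (std::size_t i = 0; i < cls.funcs.size(); ++i)
        if (cls.funcs[i].name == name)
            return &cls.funcs[i];
    for (std::size_t i = 0; i < cls.bases.size(); ++i) {
        if (cls.bases[i] == 0)
            continue;  // tolerate an unset base slot
        const MemberFunction* f = find_function(*cls.bases[i], name);
        if (f != 0)
            return f;
    }
    return 0;
}
```

A real interpreter would probably resolve names to slot numbers once, when the class is defined, rather than searching by string on every call; the linear search here only keeps the sketch short.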

There is, of course, a nearly infinite list of useful ways to
subdivide the problem into C++ classes.  My point, here, is that
you shouldn't defeat C++'s virtual table method, if what you
really want to do is to emulate it for your own language.

-----== Posted via Deja News, The Leader in Internet Discussion ==-----
http://www.dejanews.com/rg_mkgrp.xp   Create Your Own Free Member Forum

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: AllanW@my-dejanews.com
Date: 1998/07/25 Raw View

In article <jasojsb21g.fsf@gatsby.u-net.com>,
  David Wragg <davew@gatsby.u-net.com> wrote:
> Also, the term "fragile base class problem" is used to refer to two
> very different things (I recently posted on this subject in another
> group - see http://x13.dejanews.com/getdoc.xp?AN=373321668).

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/22 Raw View

In article <6p38ss$ljn$1@nnrp1.dejanews.com>, AllanW@my-dejanews.com wrote:
>> >>  So the alternative is probably to write ones own vtbl's, via a class
>> >>Class, just as in Java. Would it not be a good idea to improve C++ so that
>> >>one more easily can implement dynamic techniques?
>
>C++ isn't Java, of course.  If you need to use C++ instead of Java,
>then you should try to find some C++ technique that accomplishes
>the same goal.  If you can't find C++ tools that help you to solve
>the same problem efficiently and elegantly, then I encourage you to
>switch back to Java instead.

  I think you have misunderstood the situation: I am programming in C++.
One idea would be to switch to Java, of course, but I would probably
eventually need to write my own class Class there too, due to the nature
of my programming.

  But writing such a class Class in C++ might be forced prematurely, due
to the limitations of C++, if say I cannot work efficiently enough with
virtual function pointers. It is a good thing not being forced to write a
class Class when not needed.

>By using "generic variables of the same C++ static type", you are
>defeating the entire virtual-member concept of C++.  Since you aren't
>using C++ vtbl's, of course you're going to have to write your own.

  I think you have misunderstood this one too: One way to implement a
variable that dynamically can change type is to have a class with a
pointer that moves over another class hierarchy. I just want to use that
feature to its full extent: To do the job simply.

>One good (and bad) thing about systems design is that there is ALWAYS
>more than one way to accomplish a given goal.

  The problem with this argument is that it is too general: it could be
used to demonstrate that no feature ever needs to be added to a
language. Clearly, when programming, one should not have to sit down and
hunt for work-arounds; if one must, something is wrong with the language.

>> If two objects f, x should evaluate, then first the
>> object x may return a virtual function pointer. If then f does not
>> recognize that function pointer, then it should proceed with a generic
>> method instead.
>Again, returning the address of virtual functions, potentially
>from other classes, is not the way to accomplish this.  The simplest
>method is simply to define the "generic" function in the base class,
>and the specialized one in class x.

  I think you have misunderstood this one: It is not a question of simple
specialization of virtual functions. In the hierarchy, the object x
returns a suggestion for a _type_ of a virtual function, not the function
itself. Then f specializes it to a function. If f does not recognize the
suggestion, then f should make another, in advance known, choice.

>If your goal is to avoid the need for unnecessary recompiles, I have
>good news for you.  Once you have redesigned your virtual function
>definitions and usage to conform to C++ norms, this problem should
>be 100% solved -- including the ability to add new libraries for
>new objects, without altering the base class libraries.

  No this does not work in this situation, because somebody adding a class
may also need to add a new virtual function, and then the whole hierarchy
must be recompiled the way you suggest. So if my program were in the
C++ standard library, then every time you use it, you would go in,
changing the library header by adding your virtual function to it and then
recompile it. Then this would also clash with others doing the same thing.

  But from the functional point of view of the program, this is wholly
unnecessary, because the treatment of your added virtual function is
already known.

>I'm not certain what the "fragile base class problem" is, but
>I suspect that Java and C++ have very similar solutions to it.

  No, Java and C++ are distinctly different in this respect: When a base
class is changed in C++ the whole hierarchy must be recompiled, but in
Java this is not necessary.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>
---

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/22 Raw View

In article <m3yatnjoko.fsf@fred.muc.de>, Andi Kleen <ak@muc.de> wrote:
>Wouldn't it be easier to simply use dynamic_cast<> for the passed in
>pointer (and check for NULL) ?

  This would be one suggestion. Here is another suggestion:

  If f is a virtual function of class B, then B::f is an object which can
be used together with any pointer p in the class hierarchy as (p->*f)(a);
in the case this fails, a virtual_function_cast_fail exception is thrown.
(So p can belong to both classes that B is derived from and classes
derived from B, and preferably also derived classes of classes that B is
derived from.)

  One can then write code as normal, and catch the exception if needed.
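
For comparison, something close to this can be approximated in standard C++ by combining the dynamic_cast suggestion with an exception. The sketch below is an emulation, not the proposed language feature: virtual_function_cast_fail is defined here as an ordinary user exception, and checked_call and g_or_fallback are hypothetical helper names.

```cpp
#include <stdexcept>

// Hypothetical exception named after the proposal; not part of standard C++.
struct virtual_function_cast_fail : std::runtime_error {
    virtual_function_cast_fail()
        : std::runtime_error("virtual_function_cast_fail") {}
};

struct B {
    virtual ~B() {}
    virtual int f() { return 1; }
};

struct C : B {
    int f() { return 2; }
    virtual int g() { return 11; }
};

// Call a member function of Derived through a Base*.  Throws when *p is
// not actually a Derived (or a class derived from it).
template <class Derived, class Base, class R>
R checked_call(Base* p, R (Derived::*fn)()) {
    Derived* d = dynamic_cast<Derived*>(p);
    if (d == 0)
        throw virtual_function_cast_fail();
    return (d->*fn)();
}

// "Write code as normal, and catch the exception if needed."
int g_or_fallback(B* p, int fallback) {
    try {
        return checked_call(p, &C::g);
    } catch (const virtual_function_cast_fail&) {
        return fallback;
    }
}
```

The check relies on RTTI for the class that declares the function, so unlike the proposal it cannot apply a pointer across unrelated branches of the hierarchy.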

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>
---

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/22 Raw View

In article <6p3em1$rp1$1@nnrp1.dejanews.com>, AllanW@my-dejanews.com wrote:
>Confused, probably because I don't know what a "multi-method" is.  If
>it's a method that applies to more than one type of object, then C++
>has several different mechanisms to allow this; inherited/overridden
>virtual functions, overloaded functions, templates, etc.

  It is a way of dynamically choosing which method to call dependent on
both f and x in f(x) (usually via some typing). Dynamic languages like
CLOS, Dylan, and Cecil use it.

>Conservative garbage collectors?  When did this enter the discussion
>on this thread?

  I just happened to mention it, and it snowballed.

>One of the design goals of C++ was to create a strongly-typed language.
>They came pretty close, too.  What you're doing is working against
>the grain here; you're trying to start a fire with a magnifying glass.
>While this may work to some degree, it will never be a lighter.

  But we are not speaking about turning C++ into a dynamically typed
language, but about how to enable one to use C++ for implementing dynamic
features. The main reason for not making C++ dynamically typed is probably
because it is a technology under development, and so there is no
well-established way to do that efficiently in C++. So perhaps C++ will
evolve in this respect, too.

  Features that might be added are, I think, support for dynamic linking,
parallel threads, and implementing GC's.

>>   As far as I can see, this is pretty much the idea of C++: It is not a
>> language that should be learned as a self-purpose, but when one sits down
>> and tries to do an implementation, the techniques needed should be
>> available.
>
>I think I missed a paragraph here.  *What* is pretty much the idea of
>C++?  As written, it looks like you're saying that C++ isn't a language
>that should be learned before you use it.  That's nonsense, of course,
>and I'm sure that's not what you meant, but I don't know what you DID
>mean.

  One should not try to learn C++ apart from its uses: C++ is more like an
encyclopedia. For example, to me it is a nuisance having to write out an
empty virtual destructor "virtual ~D() { }" just in order to make sure its
memory is released, but surely someone is happy about the extra word it
can save when not using dynamic allocation. So I try to keep that in my
mind, and not bother too much about it.
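
The virtual destructor remark can be illustrated with a small sketch. Base, Derived and destroy_through_base are hypothetical names chosen for the example; the point is only that deleting a derived object through a base pointer is well-defined when the base destructor is virtual, and undefined behaviour when it is not.

```cpp
struct Base {
    virtual ~Base() {}              // the "empty" virtual destructor
};

struct Derived : Base {
    explicit Derived(int* counter) : counter_(counter) {}
    ~Derived() { ++*counter_; }     // records that the derived part was destroyed
    int* counter_;
};

// Deletes a Derived through a Base* and reports how often ~Derived ran.
int destroy_through_base() {
    int destroyed = 0;
    Base* p = new Derived(&destroyed);
    delete p;                       // dispatches to ~Derived, then ~Base
    return destroyed;
}
```

A class that is never deleted through a base pointer can omit the virtual destructor, which is the saving alluded to above.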

  I said this in response to someone who claimed that a feature that is
important to make my program work cannot be added because he has a team of
programmers that might use it, and it might make life difficult to his
executive position over that team.

>BTW, this works:
>    struct B { virtual int f(); virtual int g(); virtual double h(); };
>    struct C: public B { virtual int g(); };
>    struct D: public B { virtual double h(); };
>
>The "generic" version of g() is B::g().  It's automatically used, for
>instance, when you call g() on a D object.

  This is the variation one wants to avoid, because one must then add the
function to the base class whenever a new class with a new virtual
function is added: The whole stuff must then be recompiled. From a
semantic point of view, this is wholly unnecessary, because the base class
will only contain (say)
    class B {
    public:
        D g(D&);                           // General case

        D f1(D& x)  { return B::g(x); }    // Various special cases,
        ...                                // all the same.
        D f_n(D& x) { return B::g(x); }
    };

  So one will be forced putting lots of unnecessary stuff into the base
class, while at the same time wasting compile time and making it impossible to
build a library.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>


Author: stephen.clamage@sun.com (Steve Clamage)
Date: 1998/07/22 Raw View

haberg@REMOVE.matematik.su.se (Hans Aberg) writes:

>In article <6p38ss$ljn$1@nnrp1.dejanews.com>, AllanW@my-dejanews.com wrote:

>>By using "generic variables of the same C++ static type", you are
>>defeating the entire virtual-member concept of C++.  Since you aren't
>>using C++ vtbl's, of course you're going to have to write your own.

>  I think you have misunderstood this one too: One way to implement a
>variable that dynamically can change type is to have a class with a
>pointer that moves over another class hierarchy. I just want to use that
>feature to its full extent: To do the job simply.

But C++ does not (by deliberate language design) support the
concept of an object that changes type. In C++ an object has
an unchanging type from the completion of its construction until
the start of its destruction. Polymorphism is achieved only by
using pointers and references. (A pointer is an object, and
so does not really change type; but C++ infers a "dynamic type"
for pointers that applies to calling virtual functions.)
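
A minimal sketch of that distinction, assuming hypothetical stand-ins for the Base, Elementary and Double classes mentioned earlier in the thread and an invented Handle class: the handle object keeps one static type for its whole life, while the object it points at can be replaced by one of a different dynamic type.

```cpp
#include <typeinfo>

struct Base {
    virtual ~Base() {}
    virtual int id() const { return 0; }
};
struct Elementary : Base { int id() const { return 1; } };
struct Double     : Base { int id() const { return 2; } };

// The Handle itself never changes type; only its pointer is re-seated.
class Handle {
public:
    explicit Handle(Base* p) : p_(p) {}
    ~Handle() { delete p_; }
    void reset(Base* p) { delete p_; p_ = p; }
    int  id() const { return p_->id(); }   // virtual call on the pointee
    bool holds_double() const { return typeid(*p_) == typeid(Double); }
private:
    Handle(const Handle&);                 // non-copyable, C++98 style
    Handle& operator=(const Handle&);
    Base* p_;
};
```

Each pointee is still an object of one fixed type from construction to destruction; reset destroys the old object and adopts a new one rather than converting anything in place.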

It seems you are trying to bend C++ to fit a programming technique
for which it was not intended. You could consider that a defect of
C++, but no language can properly support every programming technique.

--
Steve Clamage, stephen.clamage@sun.com


Author: James Kuyper <kuyper@wizard.net>
Date: 1998/07/22 Raw View

Hans Aberg wrote:
>
> In article <6p3em1$rp1$1@nnrp1.dejanews.com>, AllanW@my-dejanews.com wrote:
> >Confused, probably because I don't know what a "multi-method" is.  If
> >it's a method that applies to more than one type of object, then C++
> >has several different mechanisms to allow this; inherited/overridden
> >virtual functions, overloaded functions, templates, etc.
>
>   It is a way of dynamically choosing which method to call dependent on
> both f and x in f(x) (usually via some typing). Dynamic languages like
> CLOS, Dylan, and Cecil use it.

You're implying that a separate function is needed for each combination
of f and x; that the problem cannot be factored into virtual functions
for each f interacting with virtual functions for each x. Let's call F
and X the base classes from which all f's, and all x's are respectively
defined. I'll use dF and dX as examples of classes derived from each.
Then multi-methods could be implemented within C++ by:

1. defining a set of virtual members X::b(dF& f), overloaded for each
dF, and overridden by dX::b(dF& f) to implement the actual
multi-methods.

2. defining a virtual member F::a(X& x), which is overridden in every dF
to call x.b(*this)

If I understand C++ virtual functions correctly, the result will be that
f.a(x) will always call dX::b(dF& f), where dX is the dynamic type of x,
and dF is the dynamic type of f, even if f and x are declared as
references to less-derived types. Is that the effect you want?

Obviously this won't work if some of the dX's are not user-defined
types.
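
The two steps above can be sketched as follows. The class names F, X, dF and dX come from the description; the numbered variants (dF1, dF2, dX1, dX2), the int return values and the apply helper are assumptions made so that the example is small enough to check by hand.

```cpp
struct dF1;
struct dF2;   // the concrete f-types, forward-declared for X

// Step 1: X declares one overload of b() per concrete dF.
struct X {
    virtual ~X() {}
    virtual int b(dF1&) { return 0; }   // generic fallback
    virtual int b(dF2&) { return 0; }   // generic fallback
};

// Step 2: F::a() is overridden in every dF to call x.b(*this).
struct F {
    virtual ~F() {}
    virtual int a(X& x) = 0;
};
struct dF1 : F { int a(X& x) { return x.b(*this); } };
struct dF2 : F { int a(X& x) { return x.b(*this); } };

// Each dX overrides the combinations it wants to specialise.
struct dX1 : X {
    int b(dF1&) { return 11; }
    int b(dF2&) { return 12; }
};
struct dX2 : X {
    using X::b;                         // keep the inherited b(dF2&) visible
    int b(dF1&) { return 21; }
};

// Both arguments are seen only through their base-class types here.
int apply(F& f, X& x) { return f.a(x); }
```

The first virtual call selects on the dynamic type of f, and the overload chosen inside a() together with the second virtual call selects on the dynamic type of x.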

...
>   One should not try to learn C++ apart from its uses: C++ is more like an
> encyclopedia. For example, to me it is a nuisance having to write out an
> empty virtual destructor "virtual ~D() { }" just in order to make sure its
> memory is released, but surely someone is happy about the extra word it

I'm unfamiliar with that aspect of C++. I thought that the default
destructor generated by the compiler was equivalent in all of its
effects to the empty one you define there. Why does explicitly defining
it make a difference?

> >BTW, this works:
> >    struct B { virtual int f(); virtual int g(); virtual double h(); };
> >    struct C: public B { virtual int g(); };
> >    struct D: public B { virtual double h(); };
> >
> >The "generic" version of g() is B::g().  It's automatically used, for
> >instance, when you call g() on a D object.
>
>   This is the variation one wants to avoid, because one must then add the
> function to the base class whenever a new class with a new virtual
> function is added: The whole stuff must then be recompiled. From a

In terms of what I wrote above, adding a new dX would require no
recompilation, only new compilation of the implementation of dX, and of
any code that needs to refer explicitly to the declaration of dX. Adding
a new dF would require adding a new override of dX::b(dF& f) for every
dX, and thus recompilation of all code that uses the multi-methods in any
way.  However, no existing F-derived class would need to be re-compiled.  Thus,
if new f's are more common than new 'x's, you might want to rearrange
the idea in terms of x(f), rather than f(x).
---

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/22 Raw View

In article <6p55h6$ei3@engnews1.Eng.Sun.COM>, stephen.clamage@sun.com
(Steve Clamage) wrote:
>>  I think you have misunderstood this one too: One way to implement a
>>variable that dynamically can change type is to have a class with a
>>pointer that moves over another class hierarchy. I just want to use that
>>feature to its full extent: To do the job simply.
>
>But C++ does not (by deliberate language design) support the
>concept of an object that changes type. In C++ an object has
>an unchanging type from the completion of its construction until
>the start of its destruction.

  I think there is a constant mixup in this thread between what C++ is and
can do on the one hand, and what one is using C++ for: I use C++ for
implementing such dynamic features, and what I ask is that C++ should help
me doing that, and I am not asking that C++ should say become a
dynamically typed language or something like that.

>Polymorphism is achieved only by
>using pointers and references.

  So this is what I am saying I am doing, but I hide away that pointer
stuff using C++ object oriented features. Then the true dynamics is of
course only achieved when the program is run, not by a simulation by the
programmer's interaction with the C++ compiler.

  So if one is writing a dynamic program of sort (which could be a
dynamically typed language), it would be great to be able to use C++ for
doing that job.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>


Author: AllanW@my-dejanews.com
Date: 1998/07/22 Raw View

In article <haberg-2107982025400001@sl111.modempool.kth.se>,
  haberg@REMOVE.matematik.su.se (Hans Aberg) wrote:
> In article <35b4a667.69635670@news.gis.net>, richard_damon@iname.com wrote:
> >..I assume you will either make sure the
> >function is only applied to a derived or make sure the function only uses the
> >base functionality.
>
>   No, the base knows how to handle virtual pointers it does not recognize,
> because there is a general way to specify that.

No, it doesn't, not in C++.  Check the newsgroup you're posting to.

Your code might compile without error messages on some particular
implementation.  It might even do what you want it to do.  But it is
not standard C++ today, and personally I hope that future versions of
the standard never support this either.



Author: "Alex Martelli" <martelli@cadlab.it>
Date: 1998/07/21 Raw View

Hans Aberg wrote in message ...
    [snip]
>>>  So the alternative is probably to write ones own vtbl's, via a class
>>>Class, just as in Java. Would it not be a good idea to improve C++ so
>>>that one more easily can implement dynamic techniques?
>..
>>No thanks, C++ is easily complex enough already -- and the extension
>>you require is anything but trivial.
>
>  The idea is to implement multi-methods, conservative garbage collectors
>and such stuff. My guess is that you have not tried doing it.

Tried writing a conservative garbage collector _better than
Boehm's_?!  You're right I haven't tried, nor do I intend to -- he's
a genius as well as a specialist, why should I try to compete?
By the way, he does seem to be doing a great job, and the
changes suggested to C++ for smoother conservative GC are
most emphatically NOT towards greater complexity and more
run-time dynamic fluidity.

Double-dispatching, and other variations of multiple dispatch,
why, sure, I _have_ needed to do that many times over the years,
including back when I couldn't count even on plain RTTI as being
portable among the compilers I had to support (and that wasn't
all that long ago, either).

And I'm still VERY, VERY happy that no more complexity is
going to be added to the language and standard libraries for
a while -- there's PLENTY of that already, and getting good
solid implementations of all that's in, plus teaching people to
use such huge richness appropriately in production code, are
serious problems today.  Have YOU ever tried leading a
software production group, and debugging the subtle bugs
that language complexity introduces in huge programs through
both programmers' misunderstandings and compilers' bugs?!


>>..You seem to be fixated on the "jumping
>>off the VTBL" failure case, but that's obviously NOT the only one...:
>>
>>struct B { virtual int f(); };
>>
>>struct C: public B { virtual int g(); };
>>
>>struct D: public B { virtual double h(); };
>
>  So the idea is not to supply a fool-proof type check, but letting C++
>provide the raw-material allowing the job to be done. (Once one starts
>working with dynamic data, it is better to work with generic variables
>which can hold any type of data. So the virtual functions will all be of
>the type "Data (T::*)(Data&)". Therefore a runtime type check is
>unnecessary in such a case.)

You keep missing the point, which is that the NUMBER of virtual
functions implemented in a certain vtable (that you keep fixating
about, with your concerns about "jumping off the end" of the vtbl)
is absolutely no assurance regarding WHAT virtual functions are
implemented there, and therefore about the appropriateness of
a certain pointer-to-member for a certain class.

C++ provides plenty of "raw material" for you to work with, though
you often have to write lots of boilerplate to use it (but that
boilerplate, in turn, typically lends itself to being generated by
a program generator, wrapped in preprocessor idioms, or
both).  Suppose, for example, that you just couple your "generic
member-function pointers" with a typeid; and have every class
in your hierarchy implement a virtual "apply if appropriate"
member function that will:
    -- check this class's typeid versus the passed-in one, and,
        if equal, then reinterpret_cast the member pointer, apply
        it, and return some indication that the deed has been done;
    -- if not equal, then delegate the job to the base class (or
        classes) -- with the root class simply providing the final
        indication that, no, the application was not possible.
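
One possible reading of this idiom is sketched below. Everything concrete in it is an assumption made for illustration: the Data stand-in, the Root/B/C hierarchy, the TaggedFn pair, the tag helper, and the choice to report success through a bool with the result in an out-parameter. The only language facts relied on are that typeid identifies the owning class and that a pointer to member function survives a reinterpret_cast round trip back to its original type.

```cpp
#include <typeinfo>

struct Data { int value; };          // stand-in for the thread's generic Data
struct TaggedFn;

class Root {
public:
    virtual ~Root() {}
    // End of the chain: no class recognised the function.
    virtual bool apply_if_appropriate(const TaggedFn&, Data&, Data&) {
        return false;
    }
};

// Common storage type for "Data (T::*)(Data&)" pointers of any class T.
typedef Data (Root::*GenericFn)(Data&);

// A member pointer coupled with the typeid of the class it belongs to.
struct TaggedFn {
    const std::type_info* owner;
    GenericFn             fn;
};

template <class T>
TaggedFn tag(Data (T::*fn)(Data&)) {
    TaggedFn t = { &typeid(T), reinterpret_cast<GenericFn>(fn) };
    return t;
}

class B : public Root {
public:
    Data twice(Data& d) { Data r = { d.value * 2 }; return r; }
    bool apply_if_appropriate(const TaggedFn& t, Data& arg, Data& out) {
        if (*t.owner == typeid(B)) {
            // Cast back to the exact original type before calling.
            Data (B::*fn)(Data&) = reinterpret_cast<Data (B::*)(Data&)>(t.fn);
            out = (this->*fn)(arg);
            return true;
        }
        return Root::apply_if_appropriate(t, arg, out);   // delegate upward
    }
};

class C : public B {
public:
    Data negate(Data& d) { Data r = { -d.value }; return r; }
    bool apply_if_appropriate(const TaggedFn& t, Data& arg, Data& out) {
        if (*t.owner == typeid(C)) {
            Data (C::*fn)(Data&) = reinterpret_cast<Data (C::*)(Data&)>(t.fn);
            out = (this->*fn)(arg);
            return true;
        }
        return B::apply_if_appropriate(t, arg, out);      // delegate upward
    }
};
```

Applying a C function to a plain B object falls through to Root and is reported as not possible, which is the runtime check asked for at the start of the thread. The two overrides differ only in the class name, which is the boilerplate a macro or generator would produce.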

This is a "poor man's implementation", with a very simple and
common C++ idiom, of what you seem to be striving for -- and,
if performance should be a problem (VERY deep inheritance),
you can speed it up with table lookups (see, for example, the
last few C++ Report "(B)Leading Edge" columns by J. Reeves,
on variations on the "Indirect Visitor" pattern).

And it does not require further additions of complexity to the
C++ language and standard libraries, thanks be...!



Alex
---

Author: rdamon@BeltronicsInspection.com (Richard Damon)
Date: 1998/07/21 Raw View

haberg@REMOVE.matematik.su.se (Hans Aberg) wrote:

>  Is there a way in C++ to find out if a dynamic cast of virtual function
>pointers has failed? (Do not confuse this with the "dynamic_cast"
>command, which cannot be directly used in this context.)
>
>  For example:
>
>class B;
>typedef int (B::*Bf)();
>
>class B {
>public:
>    virtual int f() { return 1; }
>};
>
>class C : public B {
>public:
>    int f() { return 2; }
>    virtual int g() { return 11; }
>};
>
>  Then
>    Bf f = static_cast<Bf>(&C::f);
>    B* bp = new B();             // Or a pointer of a derived class.
>    cout << (bp->*f)() << endl;
>will print out correctly, whereas
>    B* bp = new B();
>    Bf g = static_cast<Bf>(&C::g);
>    cout << (bp->*g)() << endl;
>will be an error, clearly, as C::g will jump off the VTBL of class B.
>
>  So the question is can this latter be determined by a dynamic, runtime
>cast, so that before (bp->*g)() fails, one can replace it with something
>else? (Say a computation resulting in the NULL pointer, or the function
>(bp->*f)().)
>
>  What is needed is that the VTBL's know their size, and this can be used
>in a runtime dynamic cast to tell when a virtual function pointer jumps
>off it.
>
>  If this is not possible, should it not be allowed in C++? -- It is
>reasonably simple, and it can be used to avoid putting a lot of virtual
>functions in the base class when new derived classes are added, thereby
>avoiding having to recompile the base class.
>
>  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
>               * Email: Hans Aberg <mailto:haberg@member.ams.org>
>               * Home Page: <http://www.matematik.su.se/~haberg/>
>               * AMS member listing: <http://www.ams.org/cml/>
>
>
The problem stems from the direction in which member-function pointers can be converted. A
pointer to base function can be converted to a pointer to derived function as
all functions on a base object can take a derived object. A pointer to derived
function can not be converted to a pointer to base function because a base
object may be missing something needed for the derived function.
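
The two directions can be shown side by side. In this sketch the classes mirror the B and C of the question, while the typedef Cf and the two helper functions are names invented here. Note that the derived-to-base direction is not available implicitly; an explicit static_cast does compile, and it then becomes the caller's job to apply the pointer only to objects that really are of the derived class.

```cpp
struct B {
    virtual ~B() {}
    virtual int f() { return 1; }
};

struct C : B {
    int f() { return 2; }
    virtual int g() { return 11; }
};

typedef int (B::*Bf)();
typedef int (C::*Cf)();

// Base-to-derived: implicit and always safe.
int call_base_fn_on_derived(C& c) {
    Cf pf = &B::f;        // implicit conversion from Bf to Cf
    return (c.*pf)();     // virtual dispatch still selects C::f
}

// Derived-to-base: requires static_cast, and the call is valid only
// when b actually refers to a C.
int call_derived_fn_via_base(B& b) {
    Bf pg = static_cast<Bf>(&C::g);
    return (b.*pg)();     // caller must guarantee the dynamic type
}
```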

You seem to want to do the latter; I assume you will either make sure the
function is only applied to a derived or make sure the function only uses the
base functionality.

When I define an interface which takes member-function pointers I tend to also
add an interface that takes ordinary functions with a class pointer as a
parameter. This allows me to add new functions which were not defined in the
base class, but which are later needed. The derived class can then define a
static function which can convert the base pointer to the derived pointer
and call the member-function.
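
A rough sketch of such a pair of interfaces follows. The names MemberFn, FreeFn, invoke and call_g are hypothetical, and using dynamic_cast with -1 as a "not a C" marker is an assumption of this example; the description above does not say how the conversion is checked.

```cpp
struct B {
    virtual ~B() {}
    virtual int f() { return 1; }
};

// The interface accepts either kind of callback.
typedef int (B::*MemberFn)();
typedef int (*FreeFn)(B*);

int invoke(B* p, MemberFn fn) { return (p->*fn)(); }
int invoke(B* p, FreeFn fn)   { return fn(p); }

struct C : B {
    virtual int g() { return 11; }   // not known to the base class

    // Static adapter: recover the C* and forward to the member function.
    static int call_g(B* p) {
        C* c = dynamic_cast<C*>(p);
        return c ? c->g() : -1;      // -1 is an arbitrary "not a C" marker
    }
};
```

Because call_g is an ordinary function taking a B*, the base-class interface never needs to know that g exists, so adding C requires no change to B.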

--
richard_damon@iname.com (Redirector to my current best Mailbox)
rdamon@beltronicsInspection.com (Work Address)
Richad_Damon@msn.com (Just for Fun)



Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/21 Raw View

In article <35b4a667.69635670@news.gis.net>, richard_damon@iname.com wrote:
>.. A pointer to derived
>function can not be converted to a pointer to base function because a base
>object may be missing something needed for the derived function.
>
>..I assume you will either make sure the
>function is only applied to a derived or make sure the function only uses the
>base functionality.

  No, the base knows how to handle virtual pointers it does not recognize,
because there is a general way to specify that. So I want the base to be
able to make a runtime check if it is a recognizable virtual function
pointer, and if not, proceed with the general method to handle it.

>When I define an interface which takes member-function pointers I tend to also
>add an interface that takes ordinary functions with a class pointer as a
>parameter. This allows me to add new functions which were not defined in the
>base class, but which are later needed. The derived class can then define a
>static function which can convert the base pointer to the derived pointer
>and call the member-function.

  I am not sure about the details; please explain.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>


Author: AllanW@my-dejanews.com
Date: 1998/07/22 Raw View

In article <haberg-2007982330460001@sl67.modempool.kth.se>,
  haberg@REMOVE.matematik.su.se (Hans Aberg) wrote:
> >>  So the alternative is probably to write ones own vtbl's, via a class
> >>Class, just as in Java. Would it not be a good idea to improve C++ so that
> >>one more easily can implement dynamic techniques?

C++ isn't Java, of course.  If you need to use C++ instead of Java,
then you should try to find some C++ technique that accomplishes
the same goal.  If you can't find C++ tools that help you to solve
the same problem efficiently and elegantly, then I encourage you to
switch back to Java instead.

If you really believe that this particular Java technique is the
only way to solve your problem, then you are probably too close to
it.  Step back and think about the issues at a higher level.  With
the rich toolset that C++ represents, there are almost certainly
other effective ways to the same end.

But if you absolutely *MUST* use this Java technique, and you have
user constraints that it absolutely *MUST* be in C++, then you are
starting down a very long, bumpy road.  You will have to do some of
the same work that the Java compiler writers have already done for
you.  (You might want to tell your users about this, to give them
the chance to change their mind.  Very often, realistic projections
of price tags can have a startling effect on arbitrary constraints
such as this.)

> In article <6ovqbl$abj@engnews1.Eng.Sun.COM>, clamage@Eng.Sun.COM (Steve
> Clamage) wrote:
> >What exactly are you trying to accomplish?

Hans:
>   A variation of multimethods: All runtime objects are held in generic
> variables of the same C++ static type, but with certain runtime typing
> (using handles etc).

By using "generic variables of the same C++ static type", you are
defeating the entire virtual-member concept of C++.  Since you aren't
using C++ vtbl's, of course you're going to have to write your own.

One good (and bad) thing about systems design is that there is ALWAYS
more than one way to accomplish a given goal.  A long time ago, I
worked on an assembly-language system that used something called
"dispatch tables".  I didn't invent this, but I wish I had -- they
were very slick, giving our assembly language almost as much power
as certain "high level" languages.  At an abstract level, these
served some of the same purpose as C++ virtual functions, although
the terminology used to describe them wasn't so object-oriented-laden.

Essentially, you could use an object's TYPE (which was a 1-byte
enumeration at the beginning of every object) to locate a table of
functions.  The first entry of each table held the address of a
function to convert the object to a printable text representation,
the second was a function that converted back from text to internal,
etc.  As you can see, this is similar to the C++ virtual-function
concept, yet simple enough to implement at an application-program
level.  Since virtual functions don't serve your current project
very well, you might consider something like this.
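
The dispatch-table scheme described above can be sketched in C++ roughly as
follows. This is only an illustration under stated assumptions: the names
TypeTag, Object, DispatchTable, kTables, to_text and from_text are
hypothetical, and the two example types (integer and boolean) are invented
for the sketch; the original assembly-language system is not shown in this
thread.

```cpp
#include <cassert>
#include <string>

// Hypothetical type tags; the system described stored a 1-byte
// enumeration at the beginning of every object.
enum TypeTag : unsigned char { kInteger = 0, kBoolean = 1, kTagCount };

struct Object {
    TypeTag tag;   // first member: selects the dispatch table
    long value;    // illustrative payload
};

// One table per type. Entry 0 converts to text, entry 1 parses text back.
struct DispatchTable {
    std::string (*to_text)(const Object&);
    Object (*from_text)(const std::string&);
};

static std::string int_to_text(const Object& o) { return std::to_string(o.value); }
static Object int_from_text(const std::string& s) { return Object{kInteger, std::stol(s)}; }
static std::string bool_to_text(const Object& o) { return o.value ? "true" : "false"; }
static Object bool_from_text(const std::string& s) {
    return Object{kBoolean, s == "true" ? 1L : 0L};
}

// The table of tables, indexed by TypeTag.
static const DispatchTable kTables[kTagCount] = {
    { int_to_text, int_from_text },
    { bool_to_text, bool_from_text },
};

// Dispatch on the tag stored in the object itself.
std::string to_text(const Object& o) {
    assert(o.tag < kTagCount);
    return kTables[o.tag].to_text(o);
}

// Dispatch on an explicitly supplied tag.
Object from_text(TypeTag tag, const std::string& s) {
    assert(tag < kTagCount);
    return kTables[tag].from_text(s);
}
```

Adding a new type in this scheme means adding one tag and one table row; no
existing function has to change, which is the property being compared with
virtual functions here.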

Hans:
> If two objects f, x should evaluate, then first the
> object x may return a virtual function pointer. If then f does not
> recognize that function pointer, then it should proceed with a generic
> method instead.

Again, returning the address of virtual functions, potentially
from other classes, is not the way to accomplish this.  The simplest
method is simply to define the "generic" function in the base class,
and the specialized one in class x.

Hans:
>   So if I have a runtime method to recognize that the virtual function
> pointer that x returns has not been implemented in f, then I can avoid
> having to add that function pointer to the static C++ class that
> implements the runtime object f. I would not have to recompile the whole
> program on 400000 lines every time I add a new type of virtual function
> pointer somewhere down the hierarchy. And I could open up the possibility
> for adding libraries, which is not possible if the root base class must be
> altered and recompiled every time somebody wants to add a new class with a
> new virtual function pointer.

If your goal is to avoid the need for unnecessary recompiles, I have
good news for you.  Once you have redesigned your virtual function
definitions and usage to conform to C++ norms, this problem should
be 100% solved -- including the ability to add new libraries for
new objects, without altering the base class libraries.  The only
thing not covered by the language, automatically, is how you manage
to create one of the new objects in the first place (this generally
requires changes to a section of user code called the "class
factory").  Once you have created this new object, you can pass
pointers and references to it anywhere that you can pass pointers
or references to the base class.  The new object's virtual functions
will be used, even though the base class libraries weren't modified
or even recompiled.

Indeed, this is one of the very reasons that virtual functions were
created in the first place.
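
A minimal sketch of that arrangement, assuming a hypothetical base class
Shape, a later-added class Circle, and a factory function make_shape (none
of these names come from the thread):

```cpp
#include <cassert>
#include <memory>
#include <string>

// "Base library": compiled once, knows nothing about later derived classes.
struct Shape {
    virtual ~Shape() {}
    virtual std::string name() const { return "shape"; }  // generic version
};

// Library code written purely against the base interface.
std::string describe(const Shape& s) { return "a " + s.name(); }

// "New library": added later. Shape and describe() are not modified.
struct Circle : Shape {
    std::string name() const override { return "circle"; }
};

// The "class factory" is the one place that has to learn about new classes.
std::unique_ptr<Shape> make_shape(const std::string& kind) {
    if (kind == "circle") return std::unique_ptr<Shape>(new Circle);
    return std::unique_ptr<Shape>(new Shape);
}
```

Here describe(*make_shape("circle")) calls Circle::name() through the
virtual mechanism even though describe() was written before Circle existed;
only make_shape had to change.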

Steve:
> >It seems to me that all the mechanisms you need are already
> >present and are safe.

Hans:
>   So this is not really the case: You can only do simple dynamic
> operations with the C++ runtime mechanism. (I think this is known; it
> relates to the fragile base class problem, which is solved by including
> the names of the functions in the object code and dynamical linking, just
> as in Java.)

Perhaps, compared to your experience in Java, the C++ dynamic
operations are all "simple."  But most of the people that post
to this newsgroup can tell you that it is sufficient for a very
wide range of problems.  Furthermore, one of the main
criticisms it receives is that it is not simple enough,
although I tend to disagree.  (It does get complicated when you
consider functions that hide same-name-but-different-signature
functions in multiple virtual base classes, but for most of us
this type of class hierarchy is rare indeed.)

I'm not certain what the "fragile base class problem" is, but
I suspect that Java and C++ have very similar solutions to it.
If I'm wrong, and if you vastly prefer the Java solution, then
you should probably be using Java.


Author: Andi Kleen <ak@muc.de>
Date: 1998/07/22 Raw View

"Alex Martelli" <martelli@cadlab.it> writes:

> C++ provides plenty of "raw material" for you to work with, though
> you often have to write lots of boilerplate to use it (but that
> boilerplate, in turn, typically lends itself to being generated by
> a program generator, wrapped in preprocessor idioms, or
> both).  Suppose, for example, that you just couple your "generic
> member-function pointers" with a typeid; and have every class
> in your hierarchy implement a virtual "apply if appropriate"
> member function that will:
>     -- check this class's typeid versus the passed-in one, and,
>         if equal, then reinterpret_cast the member pointer, apply
>         it, and return some indication that the deed has been done;
>     -- if not equal, then delegate the job to the base class (or
>         classes) -- with the root class simply providing the final
>         indication that, no, the application was not possible.

Wouldn't it be easier to simply use dynamic_cast<> for the passed in
pointer (and check for NULL) ?
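
One way the dynamic_cast variant might look, as a sketch only: the helper
name apply_if_appropriate is hypothetical, the classes B and C follow the
example used elsewhere in this thread, and the return values are
illustrative.

```cpp
#include <cassert>

struct B { virtual ~B() {} virtual int f() { return 1; } };
struct C : B { int f() override { return 2; } virtual int g() { return 11; } };

// Applies a member function of Derived to obj only if obj really is a
// Derived. Returns true and stores the result on success, false otherwise.
template <class Derived>
bool apply_if_appropriate(B* obj, int (Derived::*fn)(), int& result) {
    Derived* d = dynamic_cast<Derived*>(obj);  // null if obj is not a Derived
    if (d == nullptr) return false;
    result = (d->*fn)();
    return true;
}
```

The member pointer keeps its original type int (Derived::*)() throughout, so
no cast of the member pointer itself is needed; the run-time check is done
on the object pointer instead.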

-Andi

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/22 Raw View

  I think we start getting into the flame realms, so I will stop here.

  However :-):

In article <EwFuGG.IKp@cadlab.it>, "Alex Martelli" <martelli@cadlab.it> wrote:
>Tried writing a conservative garbage collector _better than
>Boehm's_?!  You're right I haven't tried, nor do I intend to -- he's
>a genius as well as a specialist, why should I try to compete?
>By the way, he does seem to be doing a great job, and the
>changes suggested to C++ for smoother conservative GC are
>most emphatically NOT towards greater complexity and more
>run-time dynamic fluidity.

  This is the line I am also into: C++ needs to be improved in this
respect. The problem is finding a good way to implement different GC's as
it is currently a developing technology. Good GC's are usually hybrids of
techniques.

  Do you have a ref (URL) to Boehm's work?

  A good intro to GC's is "Real Time Non-copying Garbage Collection" at
        http://www.cs.utexas.edu/users/oops/papers.html

  Also try to remember, just because you are not trying to implement a
certain feature using C++, somebody else might: The discussions here in
this newsgroup are not among those who do not intend to use a language
feature, but those that might be forced to do it.

>..  Have YOU ever tried leading a
>software production group, and debugging the subtle bugs
>that language complexity introduces in huge programs through
>both programmers' misunderstandings and compilers' bugs?!

  I gather so-called bread programming is different from creative
programming: I only work alone on programs of 400000 lines in
150 files and such.

  So I do not have the option of renting a guy trying to work around
limitations in the computer language, let alone trying to teach him what
it is all about.

>>  So the idea is not to supply a fool-proof type check, but letting C++
>>provide the raw-material allowing the job to be done. (Once one starts
>>working with dynamic data, it is better to work with generic variables
>>which can hold any type of data. So the virtual functions will all be of
>>the type "Data (T::*)(Data&)". Therefore a runtime type check is
>>unnecessary in such a case.)
>
>You keep missing the point, which is that the NUMBER of virtual
>functions implemented in a certain vtable (that you keep fixating
>about, with your concerns about "jumping off the end" of the vtbl)
>is absolutely no assurance regarding WHAT virtual functions are
>implemented there, and therefore about the appropriateness of
>a certain pointer-to-member for a certain class.

  I am well aware that the full problem is more complex. But otherwise
the idea is that one should be able to work around typical static limitations
in order to provide an efficient implementation.

>C++ provides plenty of "raw material" for you to work with, though
>you often have to write lots of boilerplate to use it (but that
>boilerplate, in turn, typically lends itself to being generated by
>a program generator, wrapped in preprocessor idioms, or
>both).  Suppose, for example, that you just couple your "generic
>member-function pointers" with a typeid; and have every class
>in your hierarchy implement a virtual "apply if appropriate"
>member function that will:

  The problem is that all those fancy workarounds become unworkable with
such a large program. (Otherwise I have used various workarounds.)

  This was also one of the original ideas of C++: to enable a single person
to handle larger amounts of code. This really requires implementing C++
language features that make programming efficient.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>

Author: AllanW@my-dejanews.com
Date: 1998/07/22 Raw View

In article <haberg-2007981836150001@sl62.modempool.kth.se>,
  haberg@REMOVE.matematik.su.se (Hans Aberg) wrote:
>   The idea is to implement multi-methods, conservative garbage collectors
> and such stuff. My guess is that you have not tried doing it.

I'm confused, probably because I don't know what a "multi-method" is.  If
it's a method that applies to more than one type of object, then C++
has several different mechanisms to allow this; inherited/overridden
virtual functions, overloaded functions, templates, etc.

Conservative garbage collectors?  When did this enter the discussion
on this thread?

> >..You seem to be fixated on the "jumping
> >off the VTBL" failure case, but that's obviously NOT the only one...:
> >
> >struct B { virtual int f(); };
> >
> >struct C: public B { virtual int g(); };
> >
> >struct D: public B { virtual double h(); };
>
>   So the idea is not to supply a fool-proof type check, but letting C++
> provide the raw-material allowing the job to be done. (Once one starts
> working with dynamic data, it is better to work with generic variables
> which can hold any type of data. So the virtual functions will all be of
> the type "Data (T::*)(Data&)". Therefore a runtime type check is
> unnecessary in such a case.)

One of the design goals of C++ was to create a strongly-typed language.
They came pretty close, too.  What you're doing is working against
the grain here; you're trying to start a fire with a magnifying glass.
While this may work to some degree, it will never be a lighter.

>   As far as I can see, this is pretty much the idea of C++: It is not a
> language that should be learned as a self-purpose, but when one sits down
> and tries to do an implementation, the techniques needed should be
> available.

I think I missed a paragraph here.  *What* is pretty much the idea of
C++?  As written, it looks like you're saying that C++ isn't a language
that should be learned before you use it.  That's nonsense, of course,
and I'm sure that's not what you meant, but I don't know what you DID
mean.

BTW, this works:
    struct B { virtual int f(); virtual int g(); virtual double h(); };
    struct C: public B { virtual int g(); };
    struct D: public B { virtual double h(); };

The "generic" version of g() is B::g().  It's automatically used, for
instance, when you call g() on a D object.
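
A small sketch of that fallback behaviour, assuming illustrative function
bodies (the declarations above have none) and a hypothetical helper call_g:

```cpp
#include <cassert>

struct B {
    virtual ~B() {}
    virtual int f() { return 1; }
    virtual int g() { return 0; }        // "generic" g()
    virtual double h() { return 0.0; }   // "generic" h()
};

struct C : public B { int g() override { return 11; } };      // specialized g()
struct D : public B { double h() override { return 2.5; } };  // specialized h()

// Written against B only; picks the right g() at run time.
int call_g(B& b) { return b.g(); }
```

With these bodies, call_g on a C object yields 11, while call_g on a D
object falls back to B::g() and yields 0, so assert(call_g(d) == 0) holds
for a D object d.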


Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/20 Raw View

In article <slrn6r2q3e.am.sbnaran@fermi.ceg.uiuc.edu>,
sbnaran@KILL.uiuc.edu wrote:
>But the above code is illegal.  Here are my compiler's error messages for
>the following program:

  We are getting off the point: In my compiler (which is old), with the code

class B;
typedef int (B::*Bf)();

class B {
public:
    virtual int f();
};

class C : public B {
public:
    int f();
    virtual int g();
};

then it somehow accepts the assignment
    Bf f = C::f;
but if I define a function
    T h(Bf);
and try to use h(f) then it complains. So I use h((Bf)f) instead with a
forced type cast.

  I can then use this converted (Bf)f as a Bf with all derived classes of
B (above D) as long as it does not jump off the vtbl.

  But in my case, I also know what the behavior of my program should be if
it jumps off the vtbl. If C++ would have had it, this could have been
handled by a dynamic_cast type of function working on virtual function
pointers. Say one would write
    Bf f = dynamic_cast<Bf>(C::f);
just as in the case of the dynamic_cast on class pointers, and get a NULL
whenever the cast is not possible.

  So this is my question: Is it possible to somehow achieve this in C++? If
not, should one not extend C++ so it is possible? The reason is that it
seems to be relatively simple to implement, and one can do implementations
where newly added derived classes with new virtual functions do not
force you to also put these new virtual functions in the base class
because they can be recognized dynamically at runtime.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>

Author: Alexandre Oliva <oliva@dcc.unicamp.br>
Date: 1998/07/20 Raw View

Siemel Naran <sbnaran@fermi.ceg.uiuc.edu> writes:

>   // class C is derived from class B
>   int (B::*g)() = &C::g; // C::g has type   int (C::*)()

> The "&" is not necessary because the only thing you can do with a
> function is call it or take its address.

Nevertheless, it is explicitly required in the standard [expr.unary.op]:

3 A pointer to member is only formed when an explicit & is used and  its
                         ^^^^
  operand  is  a  qualified-id not enclosed in parentheses. [...]

--
Alexandre Oliva
mailto:oliva@dcc.unicamp.br mailto:aoliva@acm.org
http://www.dcc.unicamp.br/~oliva
Universidade Estadual de Campinas, SP, Brasil

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/20 Raw View

In article <35B14962.6D6A90FE@acm.org>, Pete Becker <petebecker@acm.org> wrote:
>>   So it does not answer the original question, as it is such casts one
>> wants to have with the additional capacity of being able to make a runtime
>> check if they jump off the VTBL. (One alternative is that I write my own
>> VTBL's, but then the point of using C++ is somewhat diminished.)
>
>According to the C++ language definition, if you use a cast to do such a
>conversion you must convert back to the original type before you use the
>pointer. If you don't, you're on your own. The language definition does
>not require that it work.

  So the alternative is probably to write ones own vtbl's, via a class
Class, just as in Java. Would it not be a good idea to improve C++ so that
one more easily can implement dynamic techniques?

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>

Author: "Alex Martelli" <martelli@cadlab.it>
Date: 1998/07/20 Raw View

Hans Aberg wrote in message ...
    [snip]
>>>   So it does not answer the original question, as it is such casts one
>>> wants to have with the additional capacity of being able to make a runtime
>>> check if they jump off the VTBL. (One alternative is that I write my own
>>> VTBL's, but then the point of using C++ is somewhat diminished.)
    [snip]
>  So the alternative is probably to write ones own vtbl's, via a class
>Class, just as in Java. Would it not be a good idea to improve C++ so that
>one more easily can implement dynamic techniques?


No thanks, C++ is easily complex enough already -- and the extension
you require is anything but trivial.  You seem to be fixated on the "jumping
off the VTBL" failure case, but that's obviously NOT the only one...:

struct B { virtual int f(); };

struct C: public B { virtual int g(); };

struct D: public B { virtual double h(); };

B* pB = new D;

Now consider using an &C::g (presumably asking for the 2nd vtable
entry) on pB -- it won't "jump off"... it will quietly get D::h(), which
does something completely different, and quietly fail in
horribly-hard-to-track-down-to-the-real-cause ways.  And that's not even
considering multiple or virtual inheritance, just a trivial
single-inheritance case!

You don't have to implement a full-fledged class object with vtbl's
etc to get your desired behaviour -- there are many idioms and
patterns (many with help from RTTI and templates) to achieve
such ends.  E.g., pack a typeid from RTTI together with the
member pointer, have every class in your hierarchy implement
a method taking such a structure and checking if it meets the
typeid needs (possibly by delegating to base classes to check),
as you can in fact do pretty easily with a preprocessor macro.
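
A rough sketch of the typeid-plus-member-pointer idiom just described. The
names TaggedFn, GenericFn, make_tagged and apply are hypothetical, the
function bodies are illustrative, and static_cast is used for storage on the
assumption that every class derives non-virtually from B:

```cpp
#include <cassert>
#include <typeinfo>

struct TaggedFn;  // defined below

struct B {
    virtual ~B() {}
    virtual int f() { return 1; }
    // Root class: the final "not applicable" answer.
    virtual bool apply(const TaggedFn&, int&) { return false; }
};

typedef int (B::*GenericFn)();  // common storage type for member pointers

// A member pointer packed together with the typeid of its owning class.
struct TaggedFn {
    const std::type_info* owner;
    GenericFn fn;
};

// T must derive (non-virtually) from B for the static_cast to be valid.
template <class T>
TaggedFn make_tagged(int (T::*fn)()) {
    TaggedFn t = { &typeid(T), static_cast<GenericFn>(fn) };
    return t;
}

struct C : B {
    virtual int g() { return 11; }
    bool apply(const TaggedFn& t, int& result) override {
        if (*t.owner == typeid(C)) {
            int (C::*fn)() = t.fn;   // base-to-derived: implicit, recovers the original
            result = (this->*fn)();
            return true;
        }
        return B::apply(t, result);  // otherwise delegate to the base class
    }
};

struct D : B { virtual double h() { return 2.5; } };
```

A pointer made with make_tagged(&C::g) is applied when the object is a C
and refused when it is a D, so the silent wrong-slot call from the example
above cannot occur. The per-class apply() body is the kind of boilerplate a
preprocessor macro could generate.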


Alex




Author: clamage@Eng.Sun.COM (Steve Clamage)
Date: 1998/07/20 Raw View

haberg@REMOVE.matematik.su.se (Hans Aberg) writes:

>In article <35B14962.6D6A90FE@acm.org>, Pete Becker <petebecker@acm.org> wrote:
>>>   So it does not answer the original question, as it is such casts one
>>> wants to have with the additional capacity of being able to make a runtime
>>> check if they jump off the VTBL. (One alternative is that I write my own
>>> VTBL's, but then the point of using C++ is somewhat diminished.)
>>
>>According to the C++ language definition, if you use a cast to do such a
>>conversion you must convert back to the original type before you use the
>>pointer. If you don't, you're on your own. The language definition does
>>not require that it work.

>  So the alternative is probably to write ones own vtbl's, via a class
>Class, just as in Java. Would it not be a good idea to improve C++ so that
>one more easily can implement dynamic techniques?

What exactly are you trying to accomplish?

Pointers to members in C++ are perfectly safe as long as you
don't use casts. Without casts, you can't invoke a derived
method on a base class. (It appears that the original question
involved a compiler that accepted invalid code. That's a compiler
bug, not a language flaw.)

You can use dynamic_cast to verify whether the object you
have is of a suitable type. Dynamic cast requires that a
class have virtual functions, but you've already said that
is the case.

It seems to me that all the mechanisms you need are already
present and are safe.

--
Steve Clamage, stephen.clamage@sun.com

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/21 Raw View

In article <6ovqbl$abj@engnews1.Eng.Sun.COM>, clamage@Eng.Sun.COM (Steve
Clamage) wrote:
>haberg@REMOVE.matematik.su.se (Hans Aberg) writes:
>>  So the alternative is probably to write ones own vtbl's, via a class
>>Class, just as in Java. Would it not be a good idea to improve C++ so that
>>one more easily can implement dynamic techniques?
>
>What exactly are you trying to accomplish?

  A variation of multimethods: All runtime objects are held in generic
variables of the same C++ static type, but with certain runtime typing
(using handles etc). If two objects f, x should evaluate, then first the
object x may return a virtual function pointer. If then f does not
recognize that function pointer, then it should proceed with a generic
method instead.

  So if I have a runtime method to recognize that the virtual function
pointer that x returns has not been implemented in f, then I can avoid
having to add that function pointer to the static C++ class that
implements the runtime object f. I would not have to recompile the whole
program on 400000 lines every time I add a new type of virtual function
pointer somewhere down the hierarchy. And I could open up the possibility
for adding libraries, which is not possible if the root base class must be
altered and recompiled every time somebody wants to add a new class with a
new virtual function pointer.

>It seems to me that all the mechanisms you need are already
>present and are safe.

  So this is not really the case: You can only do simple dynamic
operations with the C++ runtime mechanism. (I think this is known; it
relates to the fragile base class problem, which is solved by including
the names of the functions in the object code and dynamical linking, just
as in Java.)

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>


Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/21 Raw View

In article <EwEHnA.52u@cadlab.it>, "Alex Martelli" <martelli@cadlab.it> wrote:
>Hans Aberg wrote:
>>  So the alternative is probably to write ones own vtbl's, via a class
>>Class, just as in Java. Would it not be a good idea to improve C++ so that
>>one more easily can implement dynamic techniques?
..
>No thanks, C++ is easily complex enough already -- and the extension
>you require is anything but trivial.

  The idea is to implement multi-methods, conservative garbage collectors
and such stuff. My guess is that you have not tried doing it.

>..You seem to be fixated on the "jumping
>off the VTBL" failure case, but that's obviously NOT the only one...:
>
>struct B { virtual int f(); };
>
>struct C: public B { virtual int g(); };
>
>struct D: public B { virtual double h(); };

  So the idea is not to supply a fool-proof type check, but letting C++
provide the raw-material allowing the job to be done. (Once one starts
working with dynamic data, it is better to work with generic variables
which can hold any type of data. So the virtual functions will all be of
the type "Data (T::*)(Data&)". Therefore a runtime type check is
unnecessary in such a case.)

  As far as I can see, this is pretty much the idea of C++: It is not a
language that should be learned as a self-purpose, but when one sits down
and tries to do an implementation, the techniques needed should be
available.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>


Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/18 Raw View

  Is there a way in C++ to find out if a dynamic cast of a virtual function
pointer has failed? (Do not confuse this with the "dynamic_cast"
command, which cannot be directly used in this context.)

  For example:

class B;
typedef int (B::*Bf)();

class B {
public:
    virtual int f() { return 1; }
};

class C : public B {
public:
    int f() { return 2; }
    virtual int g() { return 11; }
};

  Then
    Bf f = C::f;
    B* bp = new B();             // Or a pointer of a derived class.
    cout << (bp->*f)() << endl;
will print out correctly, whereas
    B* bp = new B();
    Bf g = C::g;
    cout << (bp->*g)() << endl;
will be an error, clearly, as C::g will jump off the VTBL of class B.

  So the question is can this latter be determined by a dynamic, runtime
cast, so that before (bp->*g)() fails, one can replace it with something
else? (Say a computation resulting in the NULL pointer, or the function
(bp->*f)().)

  What is needed is that the VTBL's know their size, and this can be used
in a runtime dynamic cast to tell when a virtual function pointer jumps
off it.

  If this is not possible, should it not be allowed in C++? -- It is
reasonably simple, and it can be used to avoid putting a lot of virtual
functions in the base class when new derived classes are added, thereby
avoiding having to recompile the base class.

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>



Author: Pete Becker <petebecker@acm.org>
Date: 1998/07/18 Raw View

Hans Aberg wrote:
>
>   Is there a way in C++ to find out if a dynamic cast of a virtual function
> pointer has failed? (Do not confuse this with the "dynamic_cast"
> command, which cannot be directly used in this context.)
>
>     B* bp = new B();
>     Bf g = C::g;
>     cout << (bp->*g)() << endl;
> will be an error, clearly, as C::g will jump off the VTBL of class B.

There's a much simpler answer. The code

Bf g = &C::g;

(the & is required) is illegal. A pointer to a member function of a
derived class cannot be implicitly converted into a pointer to a member
function of a base class. The reason, as Hans points out, is that it
won't work right. Conversions of pointers to members go the other way: a
pointer to a member of a base class can be implicitly converted into a
pointer to a member function of a derived class.
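
The direction of the conversion can be sketched as follows. The typedef Cf
and the helpers as_derived and call_on are hypothetical names added for the
illustration; B, C and Bf follow the example in the original question.

```cpp
#include <cassert>

struct B { virtual ~B() {} virtual int f() { return 1; } };
struct C : B { int f() override { return 2; } virtual int g() { return 11; } };

typedef int (B::*Bf)();
typedef int (C::*Cf)();

// Base-to-derived: implicit, and safe, because every C contains B's members.
Cf as_derived(Bf p) { return p; }

// Derived-to-base is not implicit; this line would not compile:
//   Bf bad = &C::g;

// Calls a member of C through a pointer to member.
int call_on(C& obj, Cf p) { return (obj.*p)(); }
```

Note that call_on(c, as_derived(&B::f)) still dispatches virtually, so with
these illustrative bodies it runs C::f() and yields 2.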

--
Pete Becker
Dinkumware, Ltd.
http://www.dinkumware.com

Author: haberg@REMOVE.matematik.su.se (Hans Aberg)
Date: 1998/07/18 Raw View

In article <35B0DB23.9CB9FA07@acm.org>, Pete Becker <petebecker@acm.org> wrote:
>There's a much simpler answer. The code
>
>Bf g = &C::g;
>
>(the & is required) is illegal. A pointer to a member function of a
>derived class cannot be implicitly converted into a pointer to a member
>function of a base class. The reason, as Hans points out, is that it
>won't work right.

  No, the code is not illegal (at least on my compiler), and clearly the
conversion can be made explicit and will work unless the function pointer
jumps off the virtual table, as the VTBL offsets of the derived functions
must be the same as in the base class. (And the & is not required, as C++
will recognize that it is a function pointer, and not using the & makes
the code easier to read.)

  So it does not answer the original question, as it is such casts one
wants to have with the additional capacity of being able to make a runtime
check if they jump off the VTBL. (One alternative is that I write my own
VTBL's, but then the point of using C++ is somewhat diminished.)

  Hans Aberg   * Anti-spam: Remove "REMOVE." from email address.
               * Email: Hans Aberg <mailto:haberg@member.ams.org>
               * Home Page: <http://www.matematik.su.se/~haberg/>
               * AMS member listing: <http://www.ams.org/cml/>


Author: sbnaran@fermi.ceg.uiuc.edu (Siemel Naran)
Date: 1998/07/19

>>There's a much simpler answer. The code
>>
>>Bf g = &C::g;

  // EXPLANATION OF THE ABOVE
  // class C is derived from class B
  int (B::*g)() = &C::g; // &C::g has type int (C::*)()

>>(the & is required) is illegal. A pointer to a member function of a
>>derived class cannot be implicitly converted into a pointer to a member
>>function of a base class. The reason, as Hans points out, is that it
>>won't work right.


>  No, the code is not illegal (at least on my compiler), and clearly the
>conversion can be made explicit and will work unless the function
>pointer jumps off the virtual table, as the VTBL offsets of the
>derived functions must be the same as in the base class. (And the & is not
>required, as C++ will recognize that it is a function pointer, and not using
>the & makes the code easier to read.)


The "&" is not necessary because the only thing you can do with a function
is call it or take its address.

But the above code is illegal.  Here are my compiler's error messages for
the following program:

struct B     { int f(); };
struct D : B { int f(); };

// PS: overloading across different level scopes is a bad idea!

int main()
{
     int (B::*p)()=D::f;
}

e.cc: In function `int main()':
e.cc:8: type `D' is not a base type for type `B'
e.cc:8:    in pointer to member function conversion

Note that
     int (D::*p)()=B::f;
is allowed because D inherits all the functions of B.
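
A minimal sketch of the two directions, assuming a C++11 or later compiler (for static_assert and <type_traits>). The function bodies and the helper name call_base_member_on_derived are illustrative and not part of the program above; the & is written out because standard C++ requires it when forming a pointer to member.

```cpp
#include <cassert>
#include <type_traits>

struct B     { int f() { return 1; } };
struct D : B { int f() { return 2; } };   // hides B::f, as in the program above

// Base-to-derived is the implicit direction: every D contains a B,
// so a member of B can always be applied to a D object.
static_assert(std::is_convertible<int (B::*)(), int (D::*)()>::value,
              "B::* converts implicitly to D::*");

// Derived-to-base is not implicit: a plain B object has no D::f.
static_assert(!std::is_convertible<int (D::*)(), int (B::*)()>::value,
              "D::* does not convert implicitly to B::*");

// Hypothetical helper: applies B::f to a D object through a D::* pointer.
int call_base_member_on_derived()
{
    int (D::*p)() = &B::f;   // OK: implicit base-to-derived conversion
    D d;
    int result = (d.*p)();   // calls B::f on the B subobject of d
    assert(result == 1);     // B::f, not D::f, because f is not virtual
    return result;
}
```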


Perhaps you are using Microsoft Visual C++?  There was a thread recently
on comp.lang.c++ where that compiler allowed someone to initialize a
static variable declared in the base class B as if it were a member of
the derived class D.  For example, B.h defines class B, which declares a
static int s_num, and file D.cpp initializes the static variable as
"int D::s_num=0".



>  So it does not answer the original question, as it is such casts one
>wants to have with the additional capacity of being able to make a runtime
>check if they jump off the VTBL. (One alternative is that I write my own
>VTBL's, but then the point of using C++ is somewhat diminished.)

The point of the strong typing is that the checks are done at compile time.



--
----------------------------------
Siemel B. Naran (sbnaran@uiuc.edu)
----------------------------------
---

Author: Pete Becker <petebecker@acm.org>
Date: 1998/07/19

Hans Aberg wrote:
>
> In article <35B0DB23.9CB9FA07@acm.org>, Pete Becker <petebecker@acm.org> wrote:
> >There's a much simpler answer. The code
> >
> >Bf g = &C::g;
> >
> >(the & is required) is illegal. A pointer to a member function of a
> >derived class cannot be implicitly converted into a pointer to a member
> >function of a base class. The reason, as Hans points out, is that it
> >won't work right.
>
>   No, the code is not illegal (at least on my compiler),

Even though your compiler accepts it, it is illegal.

> and clearly the
> conversion can be made explicit and will work unless the function
> pointer jumps off the virtual table, as the VTBL offsets of the
> derived functions must be the same as in the base class. (And the & is not
> required, as C++ will recognize that it is a function pointer, and not using
> the & makes the code easier to read.)
>
>   So it does not answer the original question, as it is such casts one
> wants to have with the additional capacity of being able to make a runtime
> check if they jump off the VTBL. (One alternative is that I write my own
> VTBL's, but then the point of using C++ is somewhat diminished.)

According to the C++ language definition, if you use a cast to do such a
conversion you must convert back to the original type before you use the
pointer. If you don't, you're on your own. The language definition does
not require that it work.
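
The round trip described above might look like the following sketch. The typedef Bf is assumed to match the one in the quoted code, and the class bodies, the return value 42, and the helper name round_trip are made up for illustration.

```cpp
#include <cassert>

struct B     { };
struct C : B { int g() { return 42; } };

typedef int (B::*Bf)();   // assumed definition of the Bf in the quoted code

// Hypothetical helper: store a C member pointer as a Bf, then restore it.
int round_trip()
{
    Bf stored = static_cast<Bf>(&C::g);            // explicit derived-to-base cast
    int (C::*restored)() =
        static_cast<int (C::*)()>(stored);         // back to the original type
    assert(restored == &C::g);                     // same member as before the trip
    C c;
    return (c.*restored)();                        // used only via the original type
}
```

Nothing here calls through stored itself; it is only carried around as a Bf and converted back before use.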

--
Pete Becker
Dinkumware, Ltd.
http://www.dinkumware.com
