Topic: garbage collection (delete considered bad style!)


Author: James Kanze <james-albert.kanze@vx.cit.alcatel.fr>
Date: 1997/01/29
Raw View
Ian Haggard <ian@shellus.com> writes:

|>  James Kanze wrote:
|>  >
|>  > Why should any static_cast be a problem for garbage collection?
|>
|>  struct ti {
|>    int i;
|>  };
|>  struct tp {
|>    int* i;
|>  };
|>
|>  main() {
|>    ti* pti=new ti;
|>    tp* ptp=static_cast<tp*>(pti);
|>  }
|>
|>  I think that's all all that need be said.  The way I read the standard,
|>  this is permissible.

Several points:

1. It's illegal.  There is no relationship between ti and tp, so the
only legal cast would be a reinterpret_cast.

2. Even using reinterpret_cast, dereferencing the resulting pointer is
undefined behavior.

3. Even using reinterpret_cast, and admitting that the "undefined
behavior" in fact doesn't do anything disagreeable, how does this pose a
problem with garbage collection.

(I'll admit that the only collector I'm familiar with is the Boehm
collector, and this sort of code doesn't pose any problems for it.)

|>  And it compiles on g++2.7.0, also.  So the
|>  compiler needs to check every static_cast to make sure that it's not
|>  something like this.  This, by the way, is what I meant by casts between
|>  incompatible pointer types -- casts between pointer types that are not
|>  members of a common inheritance hierarchy.

According to the standard, the only pointer (or reference) casts allowed
by static_cast are those that take place implicitly, and their inverse.
(I beleive that there is some ambiguity with regards to casts to and
from void*.  Since there is an implicit conversion of any pointer TO
void*, a static_cast from void* to another pointer type would presumably
be legal.  Other text, however, describing pointer conversions in
detail, does not mention this case, and the introduction to this text
seems to me to imply that it is exhaustive.)

|>  > Current C++ garbage collectors are able to handle pointers into the
|>  > middle of an object.  This is not a particular problem.
|>  I'm glad to hear it.  This being the case, it sounds like we won't have
|>  to wait too long for the day when we all have garbage collection
|>  available in our compilers.

I believe that Cygnus plans it for g++ 2.8.0.  At least, they did at one
time.

|>  > |>  So
|>  > |>  should it be disallowed in a garbage-collecting implementation?  And
|>  > |>  static-casting down an inheritance hierarchy that has virtual functions
|>  > |>  -- should the compiler give an error, should it assume that you know
|>  > |>  what you are doing and hell to pay if you don't, or should it covertly
|>  > |>  change the static cast into a dynamic cast?
|>  >
|>  > If you don't know what you are doing, the results are (currently)
|>  > undefined behavior.  So it really doesn't matter what you do.  IMHO, a
|>  > good compiler would implement a static_cast down in exactly the same way
|>  > it implements a dynamic_cast, with, however, a compiler option to
|>  > suppress the test.  (This is currently legal anyway.)

|>  I agree completely here.

I suppose that we also agree that good compilers are rare:-).

|>  > |>  And actually, even casting
|>  > |>  to and from void* would be doable, if you were willing to have the
|>  > |>  compiler add some run-time checking to such casts.  But should this be
|>  > |>  standard, optional, or disallowed altogether (see note 2)?
|>  >
|>  > The experience of Ellis and Detleff in implementing their C++ garbage
|>  > collector would seem to indicate that this is not a problem.

|>  The problem here is, however, that the Ellis-Detleff collecter is a
|>  conservative collector.  I really don't believe that garbage-collection
|>  will truly take off until we have copy-collection.  Once we have found a
|>  satisfactory subset of the C++ language that is amenable to
|>  copy-collection, then an only then will we be able to leverage off the
|>  tremendous knowledge base in the Lisp community.  Moreover, copy
|>  collection has the potential to be MUCH more efficient than other
|>  garbage collection methods -- not only doesn't in have to worry about
|>  deallocating memory that is no longer referenced, it also improves
|>  locality of reference over time, rather than worsening it.

I think that Boehm and others have done some studies which indicated
that copying didn't make a significant difference in real programs.
I'll admit that intuitively, I would expect copy collection, combined
with generational collection, to make a significant improvement in
locality.  I've learned to mistrust my intuition, however, and would
rather see actual measurements.

In any case, a conservative collector will not result in worse locality
than the current situation.  Let's face it, once a few memory leaks have
fragmented your free space arena permenantly, locality is shot.

|>  >     void
|>  >     hide( T* p , void* dst )
|>  >     {
|>  >         memcpy( dst , &p , sizeof( p ) ) ;
|>  >         for ( int i = 0 ; i < sizeof( p ) ; i ++ )
|>  >             reinterpret_cast< char* >( dst )[ i ] ^= 0x55 ;
|>  >     }
|>  >
|>  >     T*
|>  >     unhide( void* src )
|>  >     {
|>  >         char            tmp[ sizeof( T* ) ] ;
|>  >         memcpy( tmp , src , sizeof( T* ) ) ;
|>  >         for ( int i = 0 ; i < sizeof( p ) ; i ++ )
|>  >             tmp[ i ] ^= 0x55 ;
|>  >         return *(reinterpret_cast< T** >( tmp )) ;
|>  >     }
|>  >
|>  > I don't think anyone does this intentionally, just for the sake of doing
|>  > it, and I doubt that it occurs very often in any case.  Something
|>  > similar, however, without the ^ operator, might occur in practice,
|>  > however, and will result in pointers at odd byte boundaries, potentially
|>  > with the wrong byte order.
|>  Could you please enlighten me as to how anything similar might happen in
|>  practice?  I've never seen anything like this and can't envision a need
|>  for anything like this, either.  The only thing remotely similar I can
|>  think of would be that in the new auto_ptr class, you may end up setting
|>  the low-bit (or maybe some of the upper bits, for 64-bit machines) to
|>  indicate whether or not the thing being pointed to is owned.  Still, I
|>  strongly disapprove of that disgusting hack, and hope the standards
|>  committee repents and goes back to the original auto_ptr, which was much
|>  better designed, more well thought out, and harder to misuse.

Streaming.  Generally, I don't see the interest of streaming a pointer,
since the goal of streaming is to flatten the object in order to
transmit it to an external medium (disk, network), and pointers have no
meaning outside of the process.

Still, I can conceive of programs which pack objects internally, say to
save space.  Obviously, if I stream an object, then compress the
resulting array of bytes, garbage collection is going to have a
difficult time tracing any pointers.

IMHO: C++ (and even C, for that matter), should declare this case to be
undefined behavior.  It isn't that useful, and doesn't occur often
enough in real programs to create a problem defining it as undefined.
And unless it is undefined, there will be conforming programs for which
garbage collection will fail.  (Basically, if there is no pointer to the
object, nor to any memory within the object, it should be undefined
whether the object continues to exist.)

And BTW: using the low order bit of a pointer as a flag (as is sometimes
proposed for the implementation of auto_ptr) does not cause the Boehm
collector to fail.  The pointer itself is still at a correctly aligned
address, and it still points into the allocated object (one byte past
the beginning).  This is all that is required.

|>  > The Ellis/Detleff collector uses a conservative algorithm.  Anything
|>  > that looks like a pointer is treated like one.  They do not depend on
|>  > any particular type information.
|>  If destructors are to be run, then the type of the object being
|>  destroyed must be known.

This is not a significant problem.  The (dynamic) type information can
be generated by the constructor/operator new, and is associated with the
actual memory block, and not the pointers.

|>  Moreover, with pointer arithmetic, nothing can
|>  be guaranteed to be safe without significant overhead.  I bet this
|>  example or something similar would have the potential to break the
|>  Ellis/Detleff collector:
|>    int lenai=100;
|>    int* ai=new int[100];
|>    int* endi=ai+101;
                ^^^^^^

>From your following discussion, I presume you mean "ai + 100".  "ai +
101" is undefined behavior.

|>    ai=NULL;
|>  Now, the array pointed to by ai would be deleted, but endi would still
|>  be there, referencing the array as a one-past-the-end pointer, in good
|>  STL fashion.  Perhaps this is the example you were looking for earlier?

Obviously, the solution is not to delete the array.  A pointer to
one-past-the-end should be treated (by the collector) as a pointer into
the object.  (I think that the Boehm collector handles this correctly.)

|>  > Also, the current standard does not prohibit passing hidden (to the
|>  > programmer) type information to varargs.  To tell the truth, I don't
|>  > know why no one does it.

|>  It would be nice, wouldn't it?

Actually, it occurs to me that Object Center might.  (It also occurs to
me that I'm going to have to get their compiler.)

|>  > |>  But the
|>  > |>  pointer-arithmetic restrictions are a bigger problem, especially given
|>  > |>  the way the STL paradigm has taken off.
|>  >
|>  > Pointer arithmetic is done implicitly when casting up and down a
|>  > hierarchy.  You don't need STL to get it; it's there even if you never
|>  > write a single line with it.

|>  Perhaps I should be more specific.  Pointer arithmetic on arrays of
|>  objects is what I'm talking about here.  As I showed above, using
|>  objects in an STL-like way can cause problems for current garbage
|>  collecters.  Performing run-time checking is probably the only solution
|>  short of disallowing pointer arithmetic altogether.  But this imposes a
|>  significant run-time overhead in all cases except for the few cases in
|>  which it's statically determinable that you are doing valid pointer
|>  arithmetic.  Ellis/Detleff do a pretty good job of outlining this,
|>  however.

I forget exactly what the Ellis/Detleff paper says, but in general, as
long as you don't invoke undefined behavior in the pointer arithmetic,
the Boehm collector should be able to handle it.

The fact that on most machines, invoking undefined behavior doesn't
cause an immediate error, and in fact, may even work as expected, is, of
course, a separate problem.  It does mean, however, that garbage
collection could reveal certain programs which appear to work today to
be broken.  I personally think that this is a good idea; I can imagine
that the vendors (who want to continue selling compilers to the authors
of the broken programs) disagree.

|>  > My recommendation is to down-load the Ellis/Detleff collector, and see
|>  > what they do.  You might also want to see their proposal for garbage
|>  > collection in C++.  It was made too late for serious consideration in
|>  > this round of standardization, but it is very thorough, covering most of
|>  > the standardization issues, and is backed up by an existing
|>  > implementation which works.  (I don't have the co-ordinates here, but if
|>  > you can't find it, email me at my home email address, and I will try and
|>  > look them up.)

|>  I must say, it is a pretty good paper.  For everyone else's reference,
|>  it is available at:
|>  http://gatekeeper.dec.com/pub/DEC/SRC/research-reports/abstracts/src-rr-102.html
|>
|>  I wasn't able to find an implementation of the exact collecter they
|>  described.  Is there one, or did you mean the implementations that they
|>  referred to in their paper?

That's at least partially because I mixed up the author of the paper,
and the author of the implementation (Boehm).  The implementation (plus
some interesting papers) is available at
ftp://ftp.parc.xerox.com/pub/gc.  The Ellis paper is also available at
this site.  (I hope I got it right.  I'm citing it from memory.)

--
James Kanze      home:     kanze@gabi-soft.fr        +33 (0)1 39 55 85 62
                 office:   kanze@vx.cit.alcatel.fr   +33 (0)1 69 63 14 54
GABI Software, Sarl., 22 rue Jacques-Lemercier, F-78000 Versailles France
     -- Conseils en informatique industrielle --
---
[ comp.std.c++ is moderated.  To submit articles: Try just posting with your
                newsreader.  If that fails, use mailto:std-c++@ncar.ucar.edu
  comp.std.c++ FAQ: http://reality.sgi.com/austern/std-c++/faq.html
  Moderation policy: http://reality.sgi.com/austern/std-c++/policy.html
  Comments? mailto:std-c++-request@ncar.ucar.edu
]





Author: Jason Merrill <jason@cygnus.com>
Date: 1997/01/30
Raw View
>>>>> James Kanze <james-albert.kanze@vx.cit.alcatel.fr> writes:

> |>  > Current C++ garbage collectors are able to handle pointers into the
> |>  > middle of an object.  This is not a particular problem.
> |>  I'm glad to hear it.  This being the case, it sounds like we won't have
> |>  to wait too long for the day when we all have garbage collection
> |>  available in our compilers.

> I believe that Cygnus plans it for g++ 2.8.0.  At least, they did at one
> time.

Nope.  We were considering incorporating an implementation of the
Ellis/Detlef proposal at one point, but never did.

Jason
---
[ comp.std.c++ is moderated.  To submit articles: Try just posting with your
                newsreader.  If that fails, use mailto:std-c++@ncar.ucar.edu
  comp.std.c++ FAQ: http://reality.sgi.com/austern/std-c++/faq.html
  Moderation policy: http://reality.sgi.com/austern/std-c++/policy.html
  Comments? mailto:std-c++-request@ncar.ucar.edu
]





Author: Hans-Juergen Boehm <boehm@mti.mti.sgi.com>
Date: 1997/01/29
Raw View
Ian Haggard wrote:
>
> James Kanze wrote:
> >
> > Why should any static_cast be a problem for garbage collection?
>
> struct ti {
>   int i;
> };
> struct tp {
>   int* i;
> };
>
> main() {
>   ti* pti=new ti;
>   tp* ptp=static_cast<tp*>(pti);
> }
>
This is of course not a problem for a conservative collector.

> The problem here is, however, that the Ellis-Detleff collecter is a
> conservative collector.  I really don't believe that garbage-collection
> will truly take off until we have copy-collection. Once we have found a
> satisfactory subset of the C++ language that is amenable to
> copy-collection, then an only then will we be able to leverage off the
> tremendous knowledge base in the Lisp community.  Moreover, copy
> collection has the potential to be MUCH more efficient than other
> garbage collection methods -- not only doesn't in have to worry about
> deallocating memory that is no longer referenced, it also improves
> locality of reference over time, rather than worsening it.

The above is controversial.  In particular, I would argue that pure
copying collection is a bad thing, except for the youngest
generation in a generational collector.  The space overhead associated
with copying old objects often increases the memory footprint of an
application to the point of unacceptability.  I think this actually
contributed significantly to the unpopularity of some of the
older garbage collectors and the associated language implementations.
Note that newer implementations often do something more sophisticated,
such as compacting old objects in place, if at all.

I have other disagreements with the above as well, but we've been
through those before.

The Ellis/Detlefs proposal considered allowing nonconservative collection.
But it's hard to deal with C unions in any reasonable way without some
degree of conservative pointer finding in the collector.


> > The Ellis/Detleff collector uses a conservative algorithm.  Anything
> > that looks like a pointer is treated like one.  They do not depend on
> > any particular type information.
> If destructors are to be run, then the type of the object being
> destroyed must be known.  Moreover, with pointer arithmetic, nothing can
> be guaranteed to be safe without significant overhead.  I bet this
> example or something similar would have the potential to break the
> Ellis/Detleff collector:
>   int lenai=100;
>   int* ai=new int[100];
>   int* endi=ai+101;
>   ai=NULL;
> Now, the array pointed to by ai would be deleted, but endi would still
> be there, referencing the array as a one-past-the-end pointer, in good
> STL fashion.  Perhaps this is the example you were looking for earlier?

I assume you mean ai+100.  Otherwise the code is illegal.
This is fine.  The allocator adds an additional byte at the end of every
object to make this an interior pointer and hence cover this case.
Not free, but ...

> > My recommendation is to down-load the Ellis/Detleff collector, and see
> > what they do.  You might also want to see their proposal for garbage
> > collection in C++.  It was made too late for serious consideration in
> > this round of standardization, but it is very thorough, covering most of
> > the standardization issues, and is backed up by an existing
> > implementation which works.  (I don't have the co-ordinates here, but if
> > you can't find it, email me at my home email address, and I will try and
> > look them up.)
> I must say, it is a pretty good paper.  For everyone else's reference,
> it is available at:
> http://gatekeeper.dec.com/pub/DEC/SRC/research-reports/abstracts/src-rr-102.html
>
> I wasn't able to find an implementation of the exact collecter they
> described.  Is there one, or did you mean the implementations that they
> referred to in their paper?
> --

Ellis and Hull added a C++ interface to our collector.  You can get to
both from

http://reality.sgi.com/employees/boehm_mti/gc.html

This is lacking the C++ compiler support, which never became available.
It's as close as you can get without language extensions.

--
Hans-Juergen Boehm
boehm@mti.sgi.com


[ comp.std.c++ is moderated.  To submit articles: try just posting with      ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu         ]
[ FAQ:      http://reality.sgi.com/employees/austern_mti/std-c++/faq.html    ]
[ Policy:   http://reality.sgi.com/employees/austern_mti/std-c++/policy.html ]
[ Comments? mailto:std-c++-request@ncar.ucar.edu                             ]





Author: Jason Merrill <jason@cygnus.com>
Date: 1997/01/29
Raw View
>>>>> Ian Haggard <ian@shellus.com> writes:

> James Kanze wrote:
>>
>> Why should any static_cast be a problem for garbage collection?

> struct ti {
>   int i;
> };
> struct tp {
>   int* i;
> };

> main() {
>   ti* pti=new ti;
>   tp* ptp=static_cast<tp*>(pti);
> }

> I think that's all all that need be said.  The way I read the standard,
> this is permissible.  And it compiles on g++2.7.0, also.

g++ 2.7.x did not implement the restrictions on the use of new-style casts,
just the syntax.  2.8.0 will implement the restrictions, however, and
rejects this code.

Jason


[ comp.std.c++ is moderated.  To submit articles: try just posting with      ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu         ]
[ FAQ:      http://reality.sgi.com/employees/austern_mti/std-c++/faq.html    ]
[ Policy:   http://reality.sgi.com/employees/austern_mti/std-c++/policy.html ]
[ Comments? mailto:std-c++-request@ncar.ucar.edu                             ]





Author: Ian Haggard <ian@shellus.com>
Date: 1997/01/28
Raw View
James Kanze wrote:
>
> Why should any static_cast be a problem for garbage collection?

struct ti {
  int i;
};
struct tp {
  int* i;
};

main() {
  ti* pti=new ti;
  tp* ptp=static_cast<tp*>(pti);
}

I think that's all all that need be said.  The way I read the standard,
this is permissible.  And it compiles on g++2.7.0, also.  So the
compiler needs to check every static_cast to make sure that it's not
something like this.  This, by the way, is what I meant by casts between
incompatible pointer types -- casts between pointer types that are not
members of a common inheritance hierarchy.

> Current C++ garbage collectors are able to handle pointers into the
> middle of an object.  This is not a particular problem.
I'm glad to hear it.  This being the case, it sounds like we won't have
to wait too long for the day when we all have garbage collection
available in our compilers.

>
> |>  So
> |>  should it be disallowed in a garbage-collecting implementation?  And
> |>  static-casting down an inheritance hierarchy that has virtual functions
> |>  -- should the compiler give an error, should it assume that you know
> |>  what you are doing and hell to pay if you don't, or should it covertly
> |>  change the static cast into a dynamic cast?
>
> If you don't know what you are doing, the results are (currently)
> undefined behavior.  So it really doesn't matter what you do.  IMHO, a
> good compiler would implement a static_cast down in exactly the same way
> it implements a dynamic_cast, with, however, a compiler option to
> suppress the test.  (This is currently legal anyway.)
I agree completely here.

>
> |>  And actually, even casting
> |>  to and from void* would be doable, if you were willing to have the
> |>  compiler add some run-time checking to such casts.  But should this be
> |>  standard, optional, or disallowed altogether (see note 2)?
>
> The experience of Ellis and Detleff in implementing their C++ garbage
> collector would seem to indicate that this is not a problem.
The problem here is, however, that the Ellis-Detleff collecter is a
conservative collector.  I really don't believe that garbage-collection
will truly take off until we have copy-collection.  Once we have found a
satisfactory subset of the C++ language that is amenable to
copy-collection, then an only then will we be able to leverage off the
tremendous knowledge base in the Lisp community.  Moreover, copy
collection has the potential to be MUCH more efficient than other
garbage collection methods -- not only doesn't in have to worry about
deallocating memory that is no longer referenced, it also improves
locality of reference over time, rather than worsening it.

>     void
>     hide( T* p , void* dst )
>     {
>         memcpy( dst , &p , sizeof( p ) ) ;
>         for ( int i = 0 ; i < sizeof( p ) ; i ++ )
>             reinterpret_cast< char* >( dst )[ i ] ^= 0x55 ;
>     }
>
>     T*
>     unhide( void* src )
>     {
>         char            tmp[ sizeof( T* ) ] ;
>         memcpy( tmp , src , sizeof( T* ) ) ;
>         for ( int i = 0 ; i < sizeof( p ) ; i ++ )
>             tmp[ i ] ^= 0x55 ;
>         return *(reinterpret_cast< T** >( tmp )) ;
>     }
>
> I don't think anyone does this intentionally, just for the sake of doing
> it, and I doubt that it occurs very often in any case.  Something
> similar, however, without the ^ operator, might occur in practice,
> however, and will result in pointers at odd byte boundaries, potentially
> with the wrong byte order.
Could you please enlighten me as to how anything similar might happen in
practice?  I've never seen anything like this and can't envision a need
for anything like this, either.  The only thing remotely similar I can
think of would be that in the new auto_ptr class, you may end up setting
the low-bit (or maybe some of the upper bits, for 64-bit machines) to
indicate whether or not the thing being pointed to is owned.  Still, I
strongly disapprove of that disgusting hack, and hope the standards
committee repents and goes back to the original auto_ptr, which was much
better designed, more well thought out, and harder to misuse.

> The Ellis/Detleff collector uses a conservative algorithm.  Anything
> that looks like a pointer is treated like one.  They do not depend on
> any particular type information.
If destructors are to be run, then the type of the object being
destroyed must be known.  Moreover, with pointer arithmetic, nothing can
be guaranteed to be safe without significant overhead.  I bet this
example or something similar would have the potential to break the
Ellis/Detleff collector:
  int lenai=100;
  int* ai=new int[100];
  int* endi=ai+101;
  ai=NULL;
Now, the array pointed to by ai would be deleted, but endi would still
be there, referencing the array as a one-past-the-end pointer, in good
STL fashion.  Perhaps this is the example you were looking for earlier?

> Also, the current standard does not prohibit passing hidden (to the
> programmer) type information to varargs.  To tell the truth, I don't
> know why no one does it.
It would be nice, wouldn't it?

> |>  But the
> |>  pointer-arithmetic restrictions are a bigger problem, especially given
> |>  the way the STL paradigm has taken off.
>
> Pointer arithmetic is done implicitly when casting up and down a
> hierarchy.  You don't need STL to get it; it's there even if you never
> write a single line with it.
Perhaps I should be more specific.  Pointer arithmetic on arrays of
objects is what I'm talking about here.  As I showed above, using
objects in an STL-like way can cause problems for current garbage
collecters.  Performing run-time checking is probably the only solution
short of disallowing pointer arithmetic altogether.  But this imposes a
significant run-time overhead in all cases except for the few cases in
which it's statically determinable that you are doing valid pointer
arithmetic.  Ellis/Detleff do a pretty good job of outlining this,
however.

> My recommendation is to down-load the Ellis/Detleff collector, and see
> what they do.  You might also want to see their proposal for garbage
> collection in C++.  It was made too late for serious consideration in
> this round of standardization, but it is very thorough, covering most of
> the standardization issues, and is backed up by an existing
> implementation which works.  (I don't have the co-ordinates here, but if
> you can't find it, email me at my home email address, and I will try and
> look them up.)
I must say, it is a pretty good paper.  For everyone else's reference,
it is available at:
http://gatekeeper.dec.com/pub/DEC/SRC/research-reports/abstracts/src-rr-102.html

I wasn't able to find an implementation of the exact collecter they
described.  Is there one, or did you mean the implementations that they
referred to in their paper?
--
Ian Haggard || ian@shellus.com (work) || IanHaggard@juno.com (home)
Linux -- "Oh, no, Mr Bill!" || #define employer_opinion !my_opinion
---
[ comp.std.c++ is moderated.  To submit articles: Try just posting with your
                newsreader.  If that fails, use mailto:std-c++@ncar.ucar.edu
  comp.std.c++ FAQ: http://reality.sgi.com/austern/std-c++/faq.html
  Moderation policy: http://reality.sgi.com/austern/std-c++/policy.html
  Comments? mailto:std-c++-request@ncar.ucar.edu
]





Author: Ian Haggard <ian@shellus.com>
Date: 1997/01/22
Raw View
In article <rf53evw8tex.fsf@vx.cit.alcatel.fr>,
  James Kanze <james-albert.kanze@vx.cit.alcatel.fr> wrote:
> Ian Haggard <ian@shellus.com> writes:
> |>    marek@iiasa.ac.at (Marek  MAKOWSKI) wrote:
> |>  >
> |>  > Barry Margolin (barmar@bbnplanet.com) wrote:
> |>  > : In article <5bh1c7$mmm@kane.ico.net>,
> |>  > : Matt Seitz <mseitz@meridian-data.com> wrote:
> |>  > : >That makes sense to me, but I have to wonder:  if one should always set
> |>  > : >a pointer to NULL following a delete, then why doesn't delete just
> |>  > : >automatically set the pointer to NULL?
> |>  >
> |>  > I think that Matt made a good point and your arguments are not convincing.
> |>  > Formally you are right, it would be difficult (and not rational) to trace
> |>  > back all uses of a pointer to be deleted.
> |>  > However, you examples are rather "exotic", in a vast majority of cases
> |>  > setting a deleted pointer to 0 would save a lot of time and aggrevation.
> |>  I'd have to side with Barry here.  What you want to do simply is
> |>  impossible to do correctly without garbage-collection.  And
> |>  garbage-collection is not possible unless the types of casts a user can
> |>  do are seriously restricted in the language.  I personally wouldn't mind
> |>  dropping C-style casts, as well as static_cast and reinterpret_cast from
> |>  the language or further restricting the kinds of casts they can perform
> |>  in exchange for garbage-collection.  But you'd have a lot of other
> |>  people who would be very angry with you.
>
> Several points:
>
> 1. In your restriction concerning static_cast's, I presume that you only
> meant static_cast's involving pointers.  Or do you also want to ban
> casts between floats and ints?
This is actually a pretty interesting topic, and I'm glad you brought it
up.  You are correct that only casts involving pointers (and references)
need to be restricted.  I find it unfortunate that there is no simple
way you can say, "Don't use this cast" or "don't use that cast" in your
code and your code will be garbage-collectable.  It appears the
standards designers did not design the new casts with garbage-collection
in mind.  So, in a garbage-collecting implementation, the compiler would
have to check each cast at compile-time and see if it was allowable.  In
principle, you only need to disallow casts between incompatible pointer
types, casts between pointer-types to non-pointer types.  Actually, it
can get rather messy as to what kinds of casts the user should be
allowed to perform.  For instance, static-casting up a multiple
inheritance hierarchy which has no virtual functions is doable, but
would require some extra work on the part of the garbage-collector since
it allows pointers into the middle of a memory block (see note 1). So
should it be disallowed in a garbage-collecting implementation?  And
static-casting down an inheritance hierarchy that has virtual functions
-- should the compiler give an error, should it assume that you know
what you are doing and hell to pay if you don't, or should it covertly
change the static cast into a dynamic cast?  And actually, even casting
to and from void* would be doable, if you were willing to have the
compiler add some run-time checking to such casts.  But should this be
standard, optional, or disallowed altogether (see note 2)?

Another interesting question that Stroustrup mentions in D&E is should
objects that are garbage-collected have their destructors invoked?  In
the Lisp-variants I have seen, the answer is no unless you specifically
register a destruction function for the object in question.  This allows
copy-collection to really show off how much more efficient it is than
in-place garbage collection.  Stroustrup seems to agree with the Lisp
philosophy here, and he gives some good arguments, too.  For that
matter, what would delete mean in a garbage-collected implementation?
Should it mean do nothing, or should it mean deconstruct the object (and
for in-place collecters potentially deallocate its memory)?

> 2. Casts, including static_cast's and reinterpret_cast's, do not, in
> themselves, cause problems with garbage collection.  The fact that you
> can memcpy a pointer into an array of char's, and memcpy it out later,
> does, as does the fact that you can write it to disk, and read it back
> in later.
But casts (specifically casts going between pointer and non-pointer
types) are what allow all this to happen.  You don't see this issue
coming up, say, in Pascal or Object Pascal, do you?  And that's because
of their extremely strong typing.

> 3. Modulo a few simple restrictions (precisely concerning things like
> writing to disk, etc.), there is a garbage collection implementation
> that works with C++.  I had even heard that it might be incorporated in
> some future version of g++, but that was some time ago.
Stuff such as varargs (since they lose type information) and pointer
arithmetic (since it allows pointers into the middle of a memory block
(see note 1)) also need to be disallowed.  Both of these are potentially
problematic.  printf and its cousins would be disallowed or the compiler
would have to perform some special compile-time or run-time checking on
them (see note 2).  g++ does compile-time checking on printf-like
functions, so this might not be too big an obstacle).  But the
pointer-arithmetic restrictions are a bigger problem, especially given
the way the STL paradigm has taken off.

So the real question is how difficult is it for a garbage-collector to
(a) find out that a particular pointer points to the middle of a memory
block and (b) find the start of that memory block?  I can think of some
theoretically reasonably efficient ways of doing this, but I don't know
if anyone has ever tried to do this in practice.

Anyway, I've gotten way off the original topic here, but I think this is
a topic that is both interesting in its own right and (hopefully)
relevant to future C++ standards.
--

(note 1) It's not so much that the pointers are in the middle of a
memory block, but that they are in the middle of a memory block and
don't know how to find the start of that memory block.  Pointers to a
class with virtual functions may or may not be in the middle of a memory
block, but they always know how to find the begin of their object in
memory.

(note 2) Given the ability to find the begin of a memory block, if a
garbage-collector puts type information at the start of each memory
block, checked casts to and from void* are possible.  This would solve
half the problem for any vararg function.  The other half of the problem
would be to for the vararg function to determine which of its arguments
are pointers and which aren't and disallow the conversion of non-pointer
arguments into pointers and vice-versa.  This too is doable with some
additional overhead if vararg function-calls are known to be such at
compile-time.

--
Ian Haggard || ian@shellus.com (work) || IanHaggard@juno.com (home)
Linux -- "Oh, no, Mr Bill!" || #define employer_opinion !my_opinion
---
[ comp.std.c++ is moderated.  To submit articles: Try just posting with your
                newsreader.  If that fails, use mailto:std-c++@ncar.ucar.edu
  comp.std.c++ FAQ: http://reality.sgi.com/austern/std-c++/faq.html
  Moderation policy: http://reality.sgi.com/austern/std-c++/policy.html
  Comments? mailto:std-c++-request@ncar.ucar.edu
]





Author: James Kanze <james-albert.kanze@vx.cit.alcatel.fr>
Date: 1997/01/22
Raw View
Ian Haggard <ian@shellus.com> writes:

|>  > 1. In your restriction concerning static_cast's, I presume that you only
|>  > meant static_cast's involving pointers.  Or do you also want to ban
|>  > casts between floats and ints?

|>  This is actually a pretty interesting topic, and I'm glad you brought it
|>  up.  You are correct that only casts involving pointers (and references)
|>  need to be restricted.  I find it unfortunate that there is no simple
|>  way you can say, "Don't use this cast" or "don't use that cast" in your
|>  code and your code will be garbage-collectable.  It appears the
|>  standards designers did not design the new casts with garbage-collection
|>  in mind.  So, in a garbage-collecting implementation, the compiler would
|>  have to check each cast at compile-time and see if it was allowable.

Why should any static_cast be a problem for garbage collection?

|>  In
|>  principle, you only need to disallow casts between incompatible pointer
|>  types, casts between pointer-types to non-pointer types.

Casts between pointers and non-pointers are "reinterpret_cast" (I
think).  I'm not sure what you mean by "incompatible pointer types".  If
you mean pointer types which have different representations, then in
practice, these would also be reinterpret_cast's; in practice, all
pointers to class must have the same representation.

|>  Actually, it
|>  can get rather messy as to what kinds of casts the user should be
|>  allowed to perform.  For instance, static-casting up a multiple
|>  inheritance hierarchy which has no virtual functions is doable, but
|>  would require some extra work on the part of the garbage-collector since
|>  it allows pointers into the middle of a memory block (see note 1).

Current C++ garbage collectors are able to handle pointers into the
middle of an object.  This is not a particular problem.

|>  So
|>  should it be disallowed in a garbage-collecting implementation?  And
|>  static-casting down an inheritance hierarchy that has virtual functions
|>  -- should the compiler give an error, should it assume that you know
|>  what you are doing and hell to pay if you don't, or should it covertly
|>  change the static cast into a dynamic cast?

If you don't know what you are doing, the results are (currently)
undefined behavior.  So it really doesn't matter what you do.  IMHO, a
good compiler would implement a static_cast down in exactly the same way
it implements a dynamic_cast, with, however, a compiler option to
suppress the test.  (This is currently legal anyway.)

|>  And actually, even casting
|>  to and from void* would be doable, if you were willing to have the
|>  compiler add some run-time checking to such casts.  But should this be
|>  standard, optional, or disallowed altogether (see note 2)?

The experience of Ellis and Detleff in implementing their C++ garbage
collector would seem to indicate that this is not a problem.

|>  Another interesting question that Stroustrup mentions in D&E is should
|>  objects that are garbage-collected have their destructors invoked?  In
|>  the Lisp-variants I have seen, the answer is no unless you specifically
|>  register a destruction function for the object in question.  This allows
|>  copy-collection to really show off how much more efficient it is than
|>  in-place garbage collection.  Stroustrup seems to agree with the Lisp
|>  philosophy here, and he gives some good arguments, too.  For that
|>  matter, what would delete mean in a garbage-collected implementation?
|>  Should it mean do nothing, or should it mean deconstruct the object (and
|>  for in-place collecters potentially deallocate its memory)?

In practice, in programs written to run under garbage collection,
destructors are the exception rather than the rule.  On the other hand,
the destructors that exist generally have to be run, at a specific
time.  (Which, in fact, may mean that they shouldn't be destructors, but
normal member functions.)

In the Ellis/Detleff garbage collector, whether the destructors are run
or not is configurable.

|>  > 2. Casts, including static_cast's and reinterpret_cast's, do not, in
|>  > themselves, cause problems with garbage collection.  The fact that you
|>  > can memcpy a pointer into an array of char's, and memcpy it out later,
|>  > does, as does the fact that you can write it to disk, and read it back
|>  > in later.

|>  But casts (specifically casts going between pointer and non-pointer
|>  types) are what allow all this to happen.  You don't see this issue
|>  coming up, say, in Pascal or Object Pascal, do you?  And that's because
|>  of their extremely strong typing.

OK.  It's true that without a cast (or the implicit conversion to
void*), you could not even call memcpy on the pointer.  Still, the type
of code I was thinking of was something like:

    void
    hide( T* p , void* dst )
    {
     memcpy( dst , &p , sizeof( p ) ) ;
     for ( int i = 0 ; i < sizeof( p ) ; i ++ )
         reinterpret_cast< char* >( dst )[ i ] ^= 0x55 ;
    }

    T*
    unhide( void* src )
    {
     char            tmp[ sizeof( T* ) ] ;
     memcpy( tmp , src , sizeof( T* ) ) ;
     for ( int i = 0 ; i < sizeof( p ) ; i ++ )
         tmp[ i ] ^= 0x55 ;
     return *(reinterpret_cast< T** >( tmp )) ;
    }

I don't think anyone does this intentionally, just for the sake of doing
it, and I doubt that it occurs very often in any case.  Something
similar, however, without the ^ operator, might occur in practice,
however, and will result in pointers at odd byte boundaries, potentially
with the wrong byte order.

I think that the Ellis/Detleff collector has a configuration parameter
to determine whether it will handle misaligned pointers.

|>  > 3. Modulo a few simple restrictions (precisely concerning things like
|>  > writing to disk, etc.), there is a garbage collection implementation
|>  > that works with C++.  I had even heard that it might be incorporated in
|>  > some future version of g++, but that was some time ago.
|>  Stuff such as varargs (since they lose type information) and pointer
|>  arithmetic (since it allows pointers into the middle of a memory block
|>  (see note 1)) also need to be disallowed.  Both of these are potentially
|>  problematic.

The Ellis/Detleff collector uses a conservative algorithm.  Anything
that looks like a pointer is treated like one.  They do not depend on
any particular type information.

This may result in some additional memory not being collected.  In
practice, the difference is probably not much, although it also probably
depends on the machine architecture.  If you can have dynamically
allocated memory at address 0, for example, it is almost certain that
this block will never be freed.

Also, the current standard does not prohibit passing hidden (to the
programmer) type information to varargs.  To tell the truth, I don't
know why no one does it.

|>  printf and its cousins would be disallowed or the compiler
|>  would have to perform some special compile-time or run-time checking on
|>  them (see note 2).  g++ does compile-time checking on printf-like
|>  functions, so this might not be too big an obstacle).

G++ only does compile-time checking for the exceptional case where the
format string is a string literal.  Since in practice, this case almost
never occurs (the format string is typically read from disk, depending
on the LANGUAGE or LC_LANG environment variable), it is as if the
checking didn't exist.

|>  But the
|>  pointer-arithmetic restrictions are a bigger problem, especially given
|>  the way the STL paradigm has taken off.

Pointer arithmetic is done implicitly when casting up and down a
hierarchy.  You don't need STL to get it; it's there even if you never
write a single line with it.

|>  So the real question is how difficult is it for a garbage-collector to
|>  (a) find out that a particular pointer points to the middle of a memory
|>  block and (b) find the start of that memory block?

Again: the Ellis/Detleff collector doesn't seem to have any problem with
this.

|>  I can think of some
|>  theoretically reasonably efficient ways of doing this, but I don't know
|>  if anyone has ever tried to do this in practice.
|>
|>  Anyway, I've gotten way off the original topic here, but I think this is
|>  a topic that is both interesting in its own right and (hopefully)
|>  relevant to future C++ standards.

My recommendation is to down-load the Ellis/Detleff collector, and see
what they do.  You might also want to see their proposal for garbage
collection in C++.  It was made too late for serious consideration in
this round of standardization, but it is very thorough, covering most of
the standardization issues, and is backed up by an existing
implementation which works.  (I don't have the co-ordinates here, but if
you can't find it, email me at my home email address, and I will try and
look them up.)

--
James Kanze      home:     kanze@gabi-soft.fr        +33 (0)1 39 55 85 62
                 office:   kanze@vx.cit.alcatel.fr   +33 (0)1 69 63 14 54
GABI Software, Sarl., 22 rue Jacques-Lemercier, F-78000 Versailles France
     -- Conseils en informatique industrielle --
---
[ comp.std.c++ is moderated.  To submit articles: try just posting with      ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu         ]
[ FAQ:      http://reality.sgi.com/employees/austern_mti/std-c++/faq.html    ]
[ Policy:   http://reality.sgi.com/employees/austern_mti/std-c++/policy.html ]
[ Comments? mailto:std-c++-request@ncar.ucar.edu                             ]