Topic: OGC++: Optional Non-Conservative Garbage Collector for C++: Library solution - possible breakthrough


Author: Hans_Boehm@hp.com (Hans-J. Boehm)
Date: 22 Oct 2002 13:52:23 -0400
Raw View
"Mirek Fidler" <cxl@centrum.cz> wrote in message news:<aoos4v$10d9$1@news.vol.cz>...
> Anyway, I tested it on Linux/GCC2.96 and Win2k/MSCV6.0 and there seems to be
> no problem.
>
> As for what I mean by optional GC, just imagine
>
> gc_ptr<Foo> p = gc_new Foo;
>
> where gc_ptr has semantics very close to boost::shared_ptr with one
> important difference - instead of deleting pointed object when last
> shared_ptr pointing to it is destructed, it deletes it after it becomes
> unreachable.
>
> Best of all, its performance is comparable to Boehm's GC.
>
> Download form
>
> http://www.volny.cz/cxl
>
Thanks for making this discussion more substantive with a concrete
implementation.

I hacked the (single-threaded) C++ version of GCBench to try to get a
better calibration on performance.  This is a benchmark that has been
fairly widely used to roughly calibrate GC performance.  It has some
known problems, and the C++ version hardly uses the best C++ style.
But I think it's still a reasonable starting point.  Probably the most
substantial error here is introduced by the small object size; most
objects contain only two pointers and two integers.  This is probably
more representative of Scheme than C++.

The hacked version is at
http://www.hpl.hp.com/personal/Hans_Boehm/gc/gc_bench/GCBench_OGC.cpp
Various other versions are in the same directory.  Aside from
supporting OGC, I made two other changes:

1) To make things a little more similar to Mirek's test, I removed a
bit of type information that would normally prevent our collector from
scanning an array of doubles.  Thus the resulting benchmark probably
still involves handling more misidentified pointers than most real
applications, but not as many as Mirek's test.  (In reality, it's easy
to eliminate scanning of the nonpointers in either test.  And nobody
has any objection to using available type information for heap
objects.)

2) I worked around a problem with the smart pointer implementation.
See below.

Here are the results of the experiment:

1) It initially didn't work.  The benchmark makes recursive allocating
calls in a constructor argument.  This apparently causes blocks to be
added to a global list without fully filling in the descriptor
information.  This generates a fairly mysterious segmentation fault in
the collector.  I think this is fixable, and bugs in software this
recent are certainly to be expected.  I mention this only because this
is consistent with my general experience with smart pointers.
Sometimes they work.  Sometimes they break. And it's rarely possible
to understand when they will break without understanding a lot about
the implementation.  (The posted version of the benchmark works around
this with -DBROKEN_SMART_PTRS.)

2) Overall execution times (ancient 266 MHz PentiumII, Linux,
gcc2.95.1):

a) new/delete: 20.5 sec

b) OGC, policy=20 (default): 225.4 sec

c) OGC, policy=100: 131 sec

d) Our collector, default single-threaded config.: 15.4 sec.

OGC with policy = 100 used appreciably more memory than our collector
(33 vs. 24 MB)

If you look at the profile, the top time sinks are FindBlock and the
STL sort call.  I don't think this is surprising.  The live object
count goes to 220K or so.  Most real problems for which GC time
matters are probably larger than that. A garbage collector should
avoid any algorithms that take time more than linear in the number of
heap objects or pointers.  (If you eliminated both of those, it would
still clearly be in last place.  But by far less than an order of
magnitude.)

Hans
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: "Mirek Fidler" <cxl@centrum.cz>
Date: 23 Oct 02 02:05:27 GMT
Raw View
> Thanks for making this discussion more substantive with a concrete
> implementation.

    Thanks for reviewing my code.

> I hacked the (single-threaded) C++ version of GCBench to try to get a
> better calibration on performance.  This is a benchmark that has been
> fairly widely used to roughly calibrate GC performance.  It has some
> known problems, and the C++ version hardly uses the best C++ style.
> But I think it's still a reasonable starting point.  Probably the most
> substantial error here is introduced by the small object size; most
> objects contain only two pointers and two integers.  This is probably
> more representative of Scheme than C++.

    Note that it favors your GC in terms of memory consumption - gc_ptr
is
12 bytes long (on 32bit flat platform). So Node is twice as big using
OGC
than GC.

> 1) To make things a little more similar to Mirek's test, I removed a
> bit of type information that would normally prevent our collector from
> scanning an array of doubles.  Thus the resulting benchmark probably
> still involves handling more misidentified pointers than most real
> applications, but not as many as Mirek's test.  (In reality, it's easy
> to eliminate scanning of the nonpointers in either test.  And nobody
> has any objection to using available type information for heap
> objects.)

    Sure. That is right. For ongoing work on my attempt I have removed
these
"false pointer generators" from my benchmark code completely.

> 1) It initially didn't work.  The benchmark makes recursive allocating
> calls in a constructor argument.  This apparently causes blocks to be
> added to a global list without fully filling in the descriptor
> information.  This generates a fairly mysterious segmentation fault in
> the collector.  I think this is fixable, and bugs in software this
> recent are certainly to be expected.  I mention this only because this
> is consistent with my general experience with smart pointers.
> Sometimes they work.  Sometimes they break. And it's rarely possible
> to understand when they will break without understanding a lot about
> the implementation.  (The posted version of the benchmark works around
> this with -DBROKEN_SMART_PTRS.)

    I will fix it. It is caused by fact that I concentrated on GC
implementation rather than on smart pointer (there are other flaws in
its
design that has to be fixed).

> 2) Overall execution times (ancient 266 MHz PentiumII, Linux,
> gcc2.95.1):
>
> a) new/delete: 20.5 sec
>
> b) OGC, policy=20 (default): 225.4 sec
>
> c) OGC, policy=100: 131 sec
>
> d) Our collector, default single-threaded config.: 15.4 sec.

    I expected it to be worse. This is 8 times slower. Nice for the
beggining.

    I agree that my original benchmark was a little bit favoring OGC.
Also,
I said that performance is comparable, not the same.

    BTW, new/delete performance is bad. Perhaps malloc implementation on
test platform is not that stellar. Allocator I am using seems to be
faster
than your collector, in syntetic examples by 5 times... (I have to find
out
how it applies to this benchmark though).

> OGC with policy = 100 used appreciably more memory than our collector
> (33 vs. 24 MB)

    Actually, this is good. sizeof(Node) is 16 bytes for GC and 32 bytes
for
OGC. So in fact I would expect OGC to allocate as much as twice more
memory.

> If you look at the profile, the top time sinks are FindBlock and the
> STL sort call.  I don't think this is surprising.  The live object
> count goes to 220K or so.  Most real problems for which GC time
> matters are probably larger than that. A garbage collector should
> avoid any algorithms that take time more than linear in the number of
> heap objects or pointers.  (If you eliminated both of those, it would
> still clearly be in last place.  But by far less than an order of
> magnitude.)

    Well, you are right. This seems to be classical tradeoff between
memory
consumption and speed. I have already found a simple way how to improve
speed of my code by 200% by extending block information by 8 bytes. I
also
try to explore new ways how to better schedule collecting.

    In fact, if I was allowed to spend even more memory, I would be
easily
able to get more from OGC. But this is something that I consider
unneeded
for optional GC, as I expect number of GC managed blocks to be
significantly
less than number of manually managed ones.

Mirek




      [ Send an empty e-mail to c++-help@netlab.cs.rpi.edu for info ]
      [ about comp.lang.c++.moderated. First time posters: do this! ]

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]




Author: "Mirek Fidler" <cxl@centrum.cz>
Date: 19 Oct 02 05:23:29 GMT
Raw View
Hi,

It is possible that I have found idea that perhaps leads to building
optional non-GC for C++ without any core language changes.

It is so trivial (250 lines of code) that I am afraid I got something wrong.
If I did, I apologize. If I reinvented wheel, I apologize too (but then I do
not understand why there is boost::shared_ptr but no boost::gc_ptr).

Anyway, I tested it on Linux/GCC2.96 and Win2k/MSCV6.0 and there seems to be
no problem.

As for what I mean by optional GC, just imagine

gc_ptr<Foo> p = gc_new Foo;

where gc_ptr has semantics very close to boost::shared_ptr with one
important difference - instead of deleting pointed object when last
shared_ptr pointing to it is destructed, it deletes it after it becomes
unreachable.

Best of all, its performance is comparable to Boehm's GC.

Download form

http://www.volny.cz/cxl

Comments welcome.

Mirek



      [ Send an empty e-mail to c++-help@netlab.cs.rpi.edu for info ]
      [ about comp.lang.c++.moderated. First time posters: do this! ]

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]