Thread

Topic: One more thing C++ really needs

Author: rp@iscp.bellcore.com (Robert Pearlman)
Date: 1995/12/04


I'll summarize the responses to date, because already the
pattern is clear.

First, there's definitely a problem: there are more recompilations than
are logically necessary, far more than we'd like.  Although users'
source code is protected from changes in implementation, users' binary
code is not.

Second, there's not much point to my proposed "reopening" solution,
which was clearly submitted without enough consideration.  My apologies.

Third, several solutions are already becoming visible (SOM, and "the
IDE handles everything").  Having several incompatible ones can cause
trouble not too far down the road.  So we can hope that the question
will be taken up by the standards effort.

Continuing -- what should be the design objectives?  I suggest one:
don't depend on an IDE or repository.  Many implementations don't have
them available.  Why put up more barriers to entry?

A second suggestion: easy conversion of code to the (probably more
efficient) closely linked version.

--
rp




---
[ comp.std.c++ is moderated.  Submission address: std-c++@ncar.ucar.edu.
  Contact address: std-c++-request@ncar.ucar.edu.  The moderation policy
  is summarized in http://dogbert.lbl.gov/~matt/std-c++/policy.html. ]

Author: rp@iscp.bellcore.com (Robert Pearlman)
Date: 1995/12/01


This is required strictly from a practical viewpoint.  The problem is
simply that changing a header --in any way-- makes every dependent code file
recompile.  That includes adding the occasional forgotten
function, used only when coding the public methods of the class.
So one is often faced with a nasty choice -- write a
tiny function and pay (half hour, hour, more?) or write ungood
code.  Self-discipline requires the function, but it does hurt.  Can we
avoid it without changing all the nine zillion forms of make?

One can touch back the header, but that's also unclean.  Also, some
integrated environments, certainly Borland's, don't always respect the touch.

A similar, but much worse, problem occurs with templates.  It's worse
because the template code is so much more extensive.  No problem in
a library, of course, but if one is writing one's own templates it
gobbles up rebuild time.  I beat this one by breaking up a template into
whatever.h and whatever.hh.  whatever.h has the declarations, whatever.hh
has the function definitions.  Then there's a single source file for each
template-generated class, instantiating all the functions.  It's only
these files which #include whatever.hh, and only these which recompile
when it changes.
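
As a rough illustration of that split, here is a single-file sketch in which
comments mark where each file would begin.  Stack, push, pop, empty and
sumAndDrain are made-up names for the example, and the explicit instantiation
line is an assumption about how "instantiating all the functions" would be
written; it is not taken from the post.

```cpp
#include <cassert>

// ---- whatever.h: declarations only; every client includes this ----
template <class T>
class Stack {
public:
    Stack();
    void push(const T& value);
    T pop();
    bool empty() const;
private:
    enum { kCapacity = 16 };
    T items_[kCapacity];
    int count_;
};

// ---- whatever.hh: member definitions; only instantiation files include this ----
template <class T>
Stack<T>::Stack() : count_(0) {}

template <class T>
void Stack<T>::push(const T& value) {
    assert(count_ < kCapacity);
    items_[count_++] = value;
}

template <class T>
T Stack<T>::pop() {
    assert(count_ > 0);
    return items_[--count_];
}

template <class T>
bool Stack<T>::empty() const { return count_ == 0; }

// ---- stack_int.cc: one source file per template-generated class ----
template class Stack<int>;   // explicit instantiation of every member

// ---- client.cc: in the real layout this would include whatever.h only ----
int sumAndDrain(Stack<int>& s) {
    int total = 0;
    while (!s.empty()) total += s.pop();
    return total;
}
```

With this layout, editing a member definition in whatever.hh should force only
stack_int.cc to recompile; client.cc depends on whatever.h alone and simply
relinks.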

Could we do the same for non-template headers?  Only with some awkwardness
and danger.  This is pretty ugly:

class XX {
 int  dataMember1;
 void functionMember1();
  public:
   XX() { - - - }
#if ALL_OF_XX
 #include "restOfXX.h"
#endif
};

Now your Makefile dependencies need not usually list restOfXX, but you're
terribly exposed to natural errors.  The compiler and the pre-processor
don't talk to each other, so there's no way to forbid data members, virtual
functions, etc. in the conditional region.  So this won't do.

So perhaps we could have a language feature like this:

class XX {
 int  dataMember1;
 void functionMember1();
  public:
   XX() { - - - }
};

When needed (probably in restOfXX.h):

struct XX << { // '<<' means that this is an extension
 int  functionMember2(int) { dataMember1 = 0; return 0; }
};

This looks risky -- after all, you could have different extensions
in different places.  But that's no more likely than having entirely
different definitions in different places.  You can do it and the
language lets you get away with it, for a while.

Extensible definitions are less risky.  The compiler can easily keep bad
things out of the extension -- no data members (they would mess up space
calculations based on the unextended version), no virtual functions
(ditto).  Static data members look OK.

Is the world ready for this?

--
rp


Author: Herb Sutter <herbs@interlog.com>
Date: 1995/12/01

In article <9511302342.AA18036@mercedes.iscp.bellcore.com>,
   rp@iscp.bellcore.com (Robert Pearlman) wrote:
>simply that changing a header --in any way-- makes every dependent code file
>recompile.  That includes adding the occasional forgotten
>function, used only when coding the public methods of the class.

Yet it must... for instance, adding a non-public member function might change
the vtable, and adding a non-public member variable will change the size of
objects of the class.  Client code must be completely in sync.
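
A minimal sketch of that point, assuming a typical implementation.  WidgetV1
and WidgetV2 stand for two versions of one hypothetical header with the same
public interface; PlainV1 and PlainV2 show the effect of introducing the first
virtual function.  All names are invented for the example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical header as clients originally compiled against it.
class WidgetV1 {
public:
    WidgetV1() : id_(0) {}
    int id() const { return id_; }
private:
    int id_;
};

// Same public interface; a private data member and a private helper added.
class WidgetV2 {
public:
    WidgetV2() : id_(0), cache_(0.0) {}
    int id() const { return id_; }
private:
    void refresh() { cache_ = id_; }   // new non-public member function
    int id_;
    double cache_;                     // new non-public member variable
};

// Introducing the first virtual function typically adds a hidden vptr.
struct PlainV1 { int sides; };
struct PlainV2 { int sides; virtual ~PlainV2() {} };

// How many bytes the object grew between versions.
std::size_t sizeGrowth() { return sizeof(WidgetV2) - sizeof(WidgetV1); }

bool virtualChangesSize() { return sizeof(PlainV2) > sizeof(PlainV1); }

// A client built against V1 would allocate too little room for a V2 object.
void checkLayoutChanged() {
    assert(sizeof(WidgetV2) > sizeof(WidgetV1));
}
```

Any client that allocates, copies or embeds the object has the old size
compiled in, which is why it has to be rebuilt even though no public
declaration changed.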

>So one is often faced with a nasty choice -- write a
>tiny function and pay (half hour, hour, more?) or write ungood code.

Or use good version control (e.g. the releasable/versioned class categories
in Bob Martin's book), and let clients move to newer versions when they want.
That way they can always use new functionality when it's needed, and they
can control when they need to rebuild rather than being forced to do it every
time the component changes.  This gives you the best of both worlds.

>One can touch back the header, but that's also unclean.  Also, some
>integrated environments, certainly Borland's, don't always respect the touch.

Yikes!  Even if it did, that's one sure path to errors and 'undefined
behaviour' (see my first paragraph above, and Matt's "Undefined versus
unspecified" article posted yesterday -- BTW, Matt, yes, I think there's a
huge difference, because to me 'undefined' implies an incorrect program and
'unspecified' implies a correct program with a platform dependency).

Herb

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Herb Sutter                 2228 Urwin, Suite 102       voice (416) 618-0184
Connected Object Solutions  Oakville ON Canada L6L 2T2    fax (905) 847-6019


Author: shankar@engr.sgi.com (Shankar Unni)
Date: 1995/12/02

Robert Pearlman (rp@iscp.bellcore.com) wrote:

> This is required strictly from a practical viewpoint.  The problem is
> simply that changing a header --in any way-- makes every dependent code file
> recompile.

There is indeed a group of vendors who are talking to each other about
implementing some sort of "release-to-release binary compatibility", which
basically addresses changes like this in the class interface, allowing the
classes to change in certain ways without forcing all client code to
recompile.

This is especially important when releasing class libraries, especially
shared libraries, where the customer may not have an option to rebuild the
application.

Some partial solutions exist today in certain implementations. SGI (I
mention it first because I worked on it :-) has the "dynamic classes"
feature, IBM has SOM, and Taligent also gives you a way to add a virtual
member function to the end of your current list of virtuals (reliably).

The idea of the RRBC group is to come up with a wider, and more
standardized, set of supported changes.

The downside? Code generation for such "RRBC" classes may not be as
efficient as for current C++ implementations.  Also, there may be some
language restrictions on the use of such classes (for example, you might
not be allowed a case label value or an array dimension in a declaration
that is "sizeof class" or "offsetof (class, member)", since the size can
change).
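
To make that last restriction concrete, here is a small sketch of the kind of
client code it would affect.  Record, RecordBuffer, classify and checkBuffer
are hypothetical names; nothing here comes from an actual RRBC proposal.

```cpp
#include <cassert>
#include <cstddef>

struct Record {          // hypothetical class from a library header
    int  key;
    char tag[12];
};

// The array dimension is fixed when the client is compiled.
typedef unsigned char RecordBuffer[sizeof(Record)];

// Case labels must be constant expressions, so the client bakes in
// today's size and today's member offset.
int classify(std::size_t n) {
    switch (n) {
    case sizeof(Record):        return 1;
    case offsetof(Record, tag): return 2;
    default:                    return 0;
    }
}

// The buffer matches the class only for the version it was compiled with.
void checkBuffer() {
    RecordBuffer buf = {0};
    assert(sizeof(buf) == sizeof(Record));
}
```

If a later library release grew Record without a client rebuild, both the
buffer size and the case labels would silently describe the old layout, which
is presumably why such uses might have to be disallowed.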

--
Shankar Unni    E-Mail: shankar@sgi.com
Silicon Graphics Inc.   Phone: +1-415-933-2072
URL: http://reality.sgi.com/employees/shankar


Author: Etay_Bogner@mail.stil.scitex.com (Etay Bogner)
Date: 1995/12/02


What you really want is :

class X {
    public :
        void publicF();
    protected :
        void protectedF();
    private :
        void privateF();
    };

and the same for data.

Now, the virtual table will be divided into three parts, the public,
protected and private, which are all accessible via one vptr in the object
itself.

This means that each access to code/data is handled through a double pointer
( double dereferencing ), instead of one in the current model.

This also means that now you have REALLY divided your code into three
parts, according to the access privileges. Why?

If you change/add something to your public interface, all the sources must
be re-compiled. But if you change the protected part, only the
derived classes must be re-compiled, and for the private part, only ONE
source has to be re-compiled.

I currently don't have a solution for the make problem ( when the file
changes, make doesn't know which part was changed so it still
re-compiles everything ) but IDEs can do it quite easily.

The price is the double dereference, although one might think of ways to
improve that. For instance, the vptrs of the protected and private vtbls
could be stored inside the public vtbl, so the public functions are still
accessed with one dereference, and protected/private calls can be optimized
since they can be made only from derived classes or from within the class
itself ( so hold the other vptrs in other registers, or whatever ).
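
A hand-rolled sketch of that layout, using plain structs of function pointers
in place of compiler-generated vtbls.  This is only one reading of the
paragraph above, with the protected table left out for brevity; Counter,
PublicTable, PrivateTable and the function names are all invented, and none of
it is meant to describe how any real compiler or SOM lays things out.

```cpp
#include <cassert>

struct Counter;  // hypothetical class whose dispatch tables are hand-rolled

// Private table: only the class's own member code reaches through it.
struct PrivateTable {
    void (*bump)(Counter&);
};

// Public table: clients are compiled against this layout only.
// It carries a pointer to the private table.
struct PublicTable {
    int (*next)(Counter&);
    const PrivateTable* priv;
};

struct Counter {
    const PublicTable* vptr;   // the single per-object pointer
    int value;
};

void bumpImpl(Counter& c) { ++c.value; }

int nextImpl(Counter& c) {
    // Private call: object -> public table -> private table -> function.
    c.vptr->priv->bump(c);
    return c.value;
}

const PrivateTable kPrivate = { bumpImpl };
const PublicTable  kPublic  = { nextImpl, &kPrivate };

Counter makeCounter() {
    Counter c = { &kPublic, 0 };
    return c;
}

int callNext(Counter& c) {
    assert(c.vptr != 0);
    // Public call: object -> public table -> function, as in the usual model.
    return c.vptr->next(c);
}
```

Because clients see only PublicTable, entries could be added to PrivateTable
without changing anything a client was compiled against; the extra hop is paid
only on the private path.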

This, along with a version control system will give the ability to do what
you want.

Now, to the real surprise. This is already implemented. It's called SOM,
and IBM developed it, and Apple is using it for OpenDoc.

This scheme, although not described here, ensures binary compatibility
between versions of the same class/object by adding new functions/data to
the end of the vtbl, so if only the minor version has changed, old
applications can still use a new library without a change.

I think that's worth the price of the double-dereference.

BTW, they also have something called DSOM, which means
distributed-SOM, which enables one to "send" objects on a net to other
computers for processing.

-- Etay Bogner,
-- Etay_Bogner@mail.stil.scitex.com,
-- Scitex Corp.
-- Israel.



Author: rp@iscp.bellcore.com (Robert Pearlman)
Date: 1995/12/02

In article <49n93a$mse@engnews1.eng.sun.com>, Herb Sutter <herbs@interlog.com> writes:
|> In article <9511302342.AA18036@mercedes.iscp.bellcore.com>,
|>    rp@iscp.bellcore.com (Robert Pearlman) wrote:
|> >simply that changing a header --in any way-- makes every dependent code file
|> >recompile.  That includes adding the occasional forgotten
|> >function, used only when coding the public methods of the class.
|>
|> Yet it must... for instance, adding a non-public member function might change
|> the vtable, and adding a non-public member variable will change the size of
|> objects of the class.  Client code must be completely in sync.
|>

Quite so.  The proposal I made in the later part of the post prevented
precisely the cases you cite.

Whether re-opening would be useful is the question.  If you turn
all the private members into protected you get a re-openable class,
but then the division into base and derived classes is pretty-near
permanent.  If you allow re-opening you get a developer's speedup, but
later you can fold the extensions into the original definition and
do a total recompile.

So the question is, should the language design cater to people in heavy
development mode?    Why not?  Who better?

--
rp



Author: jodle@bix.com (jodle)
Date: 1995/12/02

Robert Pearlman (rp@iscp.bellcore.com) wrote:


: This is required strictly from a practical viewpoint.  The problem is
: simply that changing a header --in any way-- makes every dependent code file
: recompile.  That includes adding the occasional forgotten

Put the implementation of functions that might change someplace besides
the header file.  Separating the description of the interface (which must
be public) from the implementation has other benefits, too.
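
One way to apply that to the "forgotten helper" case is to make the helper a
file-local free function rather than a private member.  The sketch below is a
single file with comments marking the header/source boundary; Account,
applyFee, balance and clampToZero are hypothetical names, and this is only one
possible arrangement.

```cpp
#include <cassert>

// ---- account.h: what clients see; unchanged when helpers are added ----
class Account {
public:
    explicit Account(long cents) : cents_(cents) {}
    void applyFee(long feeCents);
    long balance() const { return cents_; }
private:
    long cents_;
};

// ---- account.cc: implementation details ----
namespace {
// The late-arriving helper lives here, not in the class definition,
// so adding it never touches account.h.
long clampToZero(long cents) { return cents < 0 ? 0 : cents; }
}

void Account::applyFee(long feeCents) {
    assert(feeCents >= 0);
    cents_ = clampToZero(cents_ - feeCents);
}
```

The helper gets the data it needs as arguments instead of reaching into the
object, so only account.cc recompiles when it is added or changed.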

: A similar, but much worse, problem occurs with templates.  It's worse
: because the template code is so much more extensive.  No problem in
: a library, of course, but if one is writing one's own templates it

In general, templates are most beneficial when they provide a basic
generic service that many different modules can share.  In general, the
dependencies that templates have on other code should be minimal.  This
implies that the way templates will be used in a context should be
well-understood before they or the code that utilizes them is
implemented.  Again, this way of doing things has several benefits,
including minimizing the kind of tiresome compiles you've described.

[Moderator's note: this is still appropriate for c.s.c++, since it
deals with whether or not a proposed language change is necessary.
Please make sure to keep followups relevant, though: discussion of C++
programming techniques belongs on comp.lang.c++ or
comp.lang.c++.moderated.  mha]
