Topic: Encapsulated Code Generation

Author: jones@cais.cais.com (Ben Jones)
Date: 1 Jul 1994 21:17:38 GMT

Hello out there in C++ standards land.  There is an important issue which
is not being addressed by anyone as far as I can tell:

    ENCAPSULATED CODE GENERATION

What do I mean by that?  I mean anything where the user has to code by hand
using a formula which can't be expressed in the language.  Let me illustrate:

* You declare a class in a header file.  In a separate file you
methodically define function bodies and initializations which pertain to
that class declaration.  Any time you want to add new functions, you have
to declare them in both places.  When you add new members you must make
sure the constructor initializes them.

* You use certain naming conventions solely to prevent name clashes and
not to convey any particular information.  You qualify function and
keyword names whose meaning ought to be obvious from the context.

* You define parameters to functions whose only purpose is to provide
information about other parameters or about the context in which you are
calling the function, which the compiler ought to be able to find in the
symbol table.  For example, you have to pass the size of arrays
separately, or the count of the number of arguments.

* You write code to save and restore the contents of an object.  In order
to write this code, you look at the class declaration to see how to
proceed.  If you change the class declaration, you must change any code
which was written with that class in mind.

* You find yourself copying large blocks of code from other programs, such
as the main loop of a program with a graphical user interface.

What is common to all of the above examples?

All of the code generated by hand could have been derived from information
which the compiler has access to but the user of the compiler does not.

There is currently no way to encapsulate these various formulas in C++.
As a result, programming can be very tedious and error prone.  This is
especially true when you are programming anything to do with a GUI.  In
spite of all the facilities provided by class libraries like Object
Windows Library or Microsoft Foundation Classes, it is still a major
undertaking to create any C++ program which uses windows.

Ben Jones
jones@arsoftware.arclch.com

Author: vincer@iaccess.za (Vincent Risi)
Date: 2 Jul 1994 08:59:22 GMT

 -=> Quoting Jones@cais.cais.com to All <=-

 > ENCAPSULATED CODE GENERATION

 > * You declare a class in a header file.  In a separate file you
 >   methodically define function bodies and initializations which pertain
 >   to that class declaration.  Any time you want to add new functions, you
 >   have to declare them in both places.  When you add new members you must
 >   make sure the constructor initializes them.

 This has been one of my pet hates about C++ from the very first time I
 saw and used it. I am sure that somebody will say that it separates the
 design of the class from the implementation. I would personally like to
 see the whole class defined and implemented in one source with the
 interface generated by the compilation (preferably in machine readable
 form rather than text that can be modified after the compilation.)

 Vince

Author: alex@uqbar.cirfid.unibo.it (Alex Martelli)
Date: Sun, 3 Jul 1994 15:03:59 GMT

vincer@iaccess.za (Vincent Risi) writes:
 ...
> > * You declare a class in a header file.  In a separate file you
> >   methodically define function bodies and initializations which pertain
 ...
> This has been one of my pet hates about C++ from the very first time I
> saw and used it. I am sure that somebody will say that it separates the
> design of the class from the implementation. I would personally like to
> see the whole class defined and implemented in one source with the
> interface generated by the compilation (preferably in machine readable
> form rather than text that can be modified after the compilation.)

NO WAY!  This would mean that the interface would not be available to
me until *after* the implementation is written (and _compiled_, to
boot!), thus practically *forcing* me to code bottom-up.  Terrible
concept!

The way I work most of the time is that I design the interface, in
concert with both programmers who will be _using_ it and ones who will
be _implementing_ it, before one line of implementation is written.  We
then "publish" (internally release) the interface as a header-file, and
class implementation proceeds simultaneously with coding of subsystems
which use it ("clients") -- client subsystems can make use of the
interface to the class being implemented, and still compile; if it is
necessary to be able to link and try things out during development of
client code, dummy/subset/scaffolding implementation of such classes
can also be supplied as needed (when I can swing it, I like scaffolding
coding to be done by the "client" side of the team, and vetted by the
"class-implementation" side, with my supervision as architect in both
cases; this sometimes helps unearth possible differences in
interpretation of design documents, and generally promotes communication).

During such development, of course, any class under development will
generally start with something like:

class designed {
    class internals;
    friend class internals;
    class internals &it;
    // no other private stuff
public:
    // the designed interface (public & protected stuff)
};

Later, when the class implementation is more solid and some performance
profiling has been done, we can, if needed, substitute the set of "real"
private members of the class as implemented, insert inline methods for
further optimization where warranted, and so on.

C++ does seem to be doing a neat job at enabling this development
process.  If I could have *one* wish, it would be for C++ to make
it easier for me to use a *nested* class of "designed" as its own
"internals"; the names of internals classes tend to be of the sort
"designed_internals", generally a symptom of missing nesting, while
I'd much rather have them as "designed::internals".  Unfortunately,
I see no way in C++ as it stands to declare a nested class while
still "forwarding" its inner details to a separate source file, sigh.
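
For readers with a compiler that follows the later ISO standard: there, a
nested class may be declared inside its enclosing class and then defined
afterward, in a different source file, under its qualified name.  What
follows is only a minimal sketch under that assumption.  The members add()
and total(), and the split into "designed.h" and "designed.cpp", are
invented for illustration and are not part of the design described above.

```cpp
#include <cassert>

// designed.h: the published interface.
class designed {
    class internals;                        // declared here, defined elsewhere
    internals &it;                          // the single private data member
    designed(const designed &);             // copying left undefined in this sketch
    designed &operator=(const designed &);
public:
    designed();
    ~designed();
    void add(int amount);                   // hypothetical interface function
    int total() const;                      // hypothetical interface function
};

// designed.cpp: clients never need to see anything below this line.
class designed::internals {
public:
    int sum;
    internals() : sum(0) {}
};

designed::designed() : it(*new internals) {}
designed::~designed() { delete &it; }

void designed::add(int amount) {
    assert(amount >= 0);                    // illustrative precondition only
    it.sum += amount;
}

int designed::total() const { return it.sum; }
```

Changing the contents of designed::internals touches only the
implementation file, so client code that includes the header does not
need to be recompiled.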

Your own desire -- to start coding the class declaration and
implementation in a single place, and have a tool automatically
extract a just-declaration headerfile and a just-implementation
sourcefile for release purposes -- would seem to be reasonably
easy to handle with a tool separate from the language compiler.

It _would_ indeed be neat to have this tool for relatively simple
cases where class design and implementation proceed apace, and
even in a process like ours it would be a small but useful
mechanical help in the "release optimization" phase, where a
class's "true" private stuff is to be placed inside its declaration
instead of the single "reference to" (unspecified) "internals".

We used to have similar helpers for C, tools which would scan a set of
sourcefiles and spit out all of the typedefs and function prototypes
needed to be placed into a .h as the interface to that subsystem.
I don't know of such tools for C++, yet.

Still, I'm _much_ happier not to have the language distorted and to
have to wait for, or build myself, such little tools, than to have
to alter my whole development process, which I like pretty fine as
it stands, thank you...!

Alex
--
 ____    Alex Martelli, Bologna, Italia
 \SM/___
  \/\bi/ Our wars are wars of life, & wounds of love
     \/  With intellectual spears, & long winged arrows of thought.

Author: maxtal@physics.su.OZ.AU (John Max Skaller)
Date: Sun, 3 Jul 1994 15:41:28 GMT

In article <2v215i$srg@sun.cais.com> jones@cais.cais.com (Ben Jones) writes:
>Hello out there in C++ standards land.  There is an important issue which
>is not being addressed by anyone as far as I can tell:
>
>    ENCAPSULATED CODE GENERATION

 Not true. I have addressed a number of these issues
by proposing extensions. Most have been (or will be) rejected.
In each case you list, I agree with your sentiments.

 The word I think is more pertinent than "encapsulated"
is "localised".

>
>* You declare a class in a header file.  In a separate file you
>methodically define function bodies and initializations which pertain to
>that class declaration.  Any time you want to add new functions, you have
>to declare them in both places.  When you add new members you must make
>sure the constructor initializes them.

 Inline functions solve the problem for functions.
I proposed in-class initialisation of BOTH static and non-static
data members to complete the system so that a class could
be written entirely in one place (as is the rule in Eiffel).
The proposal was rejected (except for static const integral
variables initialised by constant expressions -- i.e. what
you could already do with an enum :-(
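
 For comparison only: the 2011 revision of the standard did eventually
admit initialisers for non-static data members inside the class.  The
sketch below assumes a compiler for that later revision; the class
"counter" and its members are invented for illustration and do not come
from the rejected proposal.

```cpp
#include <cassert>

class counter {
    int value = 0;                 // non-static member initialised in the class (C++11)
    static const int limit = 10;   // static const integral case mentioned above
public:
    counter() {}                   // no mem-initializer needed for value

    // Increments the counter unless the limit has been reached.
    bool bump() {
        assert(value <= limit);    // invariant of this sketch
        if (value >= limit) return false;
        ++value;
        return true;
    }

    int get() const { return value; }
};
```

 With this form, adding a data member and giving it its initial value
happen on the same line, so no constructor has to be revisited.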
>
>* You use certain naming conventions solely to prevent name clashes and
>not to convey any particular information.  You qualify function and
>keyword names whose meaning ought to be obvious from the context.

 Namespaces address this effectively.
>
>* You define parameters to functions whose only purpose is to provide
>information about other parameters or about the context in which you are
>calling the function, which the compiler ought to be able to find in the
>symbol table.  For example, you have to pass the size of arrays
>separately, or the count of the number of arguments.

 That is sometimes desirable, other times you can
invent a class just for the purpose of encapsulating the
information. It's just not commonly done, but it can be.
Of course -- the constructor for that class still has to
take the separate arguments :-(
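
 One partial way around that last complaint, sketched under the assumption
of a compiler that supports member templates: when the argument is still a
real array (not a pointer it has decayed to), a constructor template can
deduce the element count from the array's type.  The names "array_ref" and
"sum" are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: bundles an array with its element count.
template <class T>
class array_ref {
    T *first;
    std::size_t count;
public:
    // N is deduced from the argument's type, so the caller never states it.
    template <std::size_t N>
    array_ref(T (&a)[N]) : first(a), count(N) {}

    std::size_t size() const { return count; }

    T &operator[](std::size_t i) {
        assert(i < count);         // bounds check made possible by the stored count
        return first[i];
    }
    const T &operator[](std::size_t i) const {
        assert(i < count);
        return first[i];
    }
};

// Takes one argument instead of a pointer plus a separate length.
inline long sum(const array_ref<int> &a) {
    long total = 0;
    for (std::size_t i = 0; i < a.size(); ++i) total += a[i];
    return total;
}
```

 A caller with "int v[4]" can then write sum(v).  This does nothing for
arrays that have already been passed around as bare pointers, where the
size is gone from the type.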

>
>* You write code to save and restore the contents of an object.  In order
>to write this code, you look at the class declaration to see how to
>proceed.  If you change the class declaration, you must change any code
>which was written with that class in mind.

 Of course you do. The existing mechanisms of encapsulation
limit that work to "added members". If you do it right,
you don't have to make the changes in more than one place
(as well as adding the data member).
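
 One possible reading of "doing it right", as a sketch only: list the
members once, in a single member function template, and let both saving
and restoring go through it.  The types "writer" and "reader", the class
"point", and the function "transfer" are all hypothetical names, and the
text format used is just the simplest thing that works.

```cpp
#include <cassert>
#include <sstream>

// Hypothetical archive types: both expose the same io() call,
// one writing a member out and the other reading it back.
struct writer {
    std::ostream &out;
    explicit writer(std::ostream &o) : out(o) {}
    template <class T> void io(T &value) { out << value << ' '; }
};

struct reader {
    std::istream &in;
    explicit reader(std::istream &i) : in(i) {}
    template <class T> void io(T &value) {
        in >> value;
        assert(!in.fail());        // sketch-level error handling only
    }
};

class point {
    int x, y;
public:
    point(int x_, int y_) : x(x_), y(y_) {}
    int get_x() const { return x; }
    int get_y() const { return y; }

    // The only function to edit when a data member is added:
    // it serves both saving and restoring.
    template <class Archive> void transfer(Archive &ar) {
        ar.io(x);
        ar.io(y);
    }
};
```

 The member list is still written by hand, which is the original
complaint, but it is written in exactly one place besides the
declaration.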
>
>* You find yourself copying large blocks of code from other programs, such
>as the main loop of a program with a graphical user interface.

 That is harder. Often you need to edit it in ways
that can't be easily parameterised. I suspect what we
need to encapsulate here is new control structures. It turns
out that "user defined control structures" are easy to do
if you have garbage collection and closures. Coroutines
are another way to do this.
>
>What is common to all of the above examples?
>
>All of the code generated by hand could have been derived from information
>which the compiler has access to but the user of the compiler does not.

 What you mean is the language could be extended to provide
superior modularisation and better reusability. Yes, it could.
>
>There is currently no way to encapsulate these various formulas in C++.
>As a result, programming can be very tedious and error prone.

 Yes, but not half as bad as using C. :-)

--
        JOHN (MAX) SKALLER,         INTERNET:maxtal@suphys.physics.su.oz.au
 Maxtal Pty Ltd,      CSERVE:10236.1703
        6 MacKay St ASHFIELD,     Mem: SA IT/9/22,SC22/WG21
        NSW 2131, AUSTRALIA

Author: maxtal@physics.su.OZ.AU (John Max Skaller)
Date: Sun, 3 Jul 1994 15:49:30 GMT

In article <2v3a9a$3br@cstatd.cstat.co.za> vincer@iaccess.za (Vincent Risi) writes:
> -=> Quoting Jones@cais.cais.com to All <=-
>
> > ENCAPSULATED CODE GENERATION
>
> > * You declare a class in a header file.  In a separate file you
> >   methodically define function bodies and initializations which pertain
> >   to that class declaration.  Any time you want to add new functions, you
> >   have to declare them in both places.  When you add new members you must
> >   make sure the constructor initializes them.
>
> This has been one of my pet hates about C++ from the very first time I
> saw and used it. I am sure that somebody will say that it separates the
> design of the class from the implementation. I would personally like to
> see the whole class defined and implemented in one source with the
> interface generated by the compilation (preferably in machine readable
> form rather than text that can be modified after the compilation.)

 include "file";

as a language statement requiring that the interface
of "file" be made accessible would do just that.
"file" is compiled recursively, or, the interface portion
of the object of an already compiled file loaded.

It's almost a module system.

--
        JOHN (MAX) SKALLER,         INTERNET:maxtal@suphys.physics.su.oz.au
 Maxtal Pty Ltd,      CSERVE:10236.1703
        6 MacKay St ASHFIELD,     Mem: SA IT/9/22,SC22/WG21
        NSW 2131, AUSTRALIA

Author: jones@cais.cais.com (Ben Jones)
Date: 4 Jul 1994 11:59:06 GMT

John Max Skaller (maxtal@physics.su.OZ.AU) wrote:
: In article <2v215i$srg@sun.cais.com> jones@cais.cais.com (Ben Jones) writes:
: >Hello out there in C++ standards land.  There is an important issue which
: >is not being addressed by anyone as far as I can tell:
: >
: >    ENCAPSULATED CODE GENERATION

:  Not true. I have addressed a number of these issues
: by proposing extensions. Most have been (or will be) rejected.
: In each case you list, I agree with your sentiments.

Precisely my point.  I really wasn't suggesting that nobody had thought of
these points besides myself but rather that the standards committee didn't
seem interested in solving these problems.

:  The word I think is more pertinent than "encapsulated"
: is "localised".

: >
: >* You declare a class in a header file.  In a separate file you
: >methodically define function bodies and initializations which pertain to
: >that class declaration.  Any time you want to add new functions, you have
: >to declare them in both places.  When you add new members you must make
: >sure the constructor initializes them.

:  Inline functions solve the problem for functions.
: I proposed in-class initialisation of BOTH static and non-static
: data members to complete the system so that a class could
: be written entirely in one place (as is the rule in Eiffel).
: The proposal was rejected (except for static const integral
: variables initialised by constant expressions -- i.e. what
: you could already do with an enum :-(

One of the main problems caused by the separation is that anytime you add
new functions or static data, even if they are private, you trigger
recompilations.  You don't necessarily want the functions to be inline.

Suppose that you could define classes simply by enclosing a set of
functions and their related global data in braces and calling it a class,
having the interface automatically generated.  Like:

    export class name
    {
        <functions and data>
    };

You make the rule that the automatically generated interface contains only
public and protected declarations and any private ones which contribute
to the size of the object.  Only functions which are explicitly "inline"
would have their bodies output to the interface.  Furthermore, the interface
is only updated if its information changes, thus triggering recompilations
of those modules which import the interface (the Ada language works this
way).  In fact, the only time you'd need to trigger recompilations would
be when the size of the object changed, when the order of entries in the
virtual function dispatch table changed, when public or protected members
were removed, or when inline function bodies changed.

One of the advantages of this would be that it would dramatically improve
the understanding of object-oriented programming.  As it stands now, because
of the need to split everything up, programmers have to totally change their
coding style when they move into the object-oriented realm rather than simply
moving to the idea of instancing their data areas.

: >
: >* You use certain naming conventions solely to prevent name clashes and
: >not to convey any particular information.  You qualify function and
: >keyword names whose meaning ought to be obvious from the context.

:  Namespaces address this effectively.

There are still a number of cases where qualification is IMHO unnecessarily
required.  See recent posts regarding "enums as bit patterns".  Another
example has to do with member function pointers:

    void (X::*pf)() = &X::f;

The & and X:: are always required, even when you are inside of "X" and in
spite of the fact that "pf" has all the information it needs to place "f"
in context.  It is because of this that member function pointers cannot
be used as manipulators.  It also means that when you switch from non-object
oriented programming to object-oriented, anything you used to do with
function pointers has a totally different syntax.
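
To make the contrast concrete, here is a small sketch that puts the two
syntaxes side by side.  The function "twice", the member "base", and the
helper "demo" are hypothetical names added for the example; only X and f
come from the line above.

```cpp
#include <cassert>

int twice(int n) { return 2 * n; }

struct X {
    int base;
    int f(int n) { return base + n; }

    int call_via_pointer(int n) {
        // Even inside X the qualified form &X::f is required;
        // plain "f" or "&f" does not form a pointer to member.
        int (X::*pm)(int) = &X::f;
        return (this->*pm)(n);
    }
};

inline int demo() {
    int (*pf)(int) = twice;        // ordinary function: the name alone suffices
    int (X::*pm)(int) = &X::f;     // member function: & and X:: both required
    assert(pf != 0);
    X x;
    x.base = 10;
    return pf(1) + (x.*pm)(2);     // two different call syntaxes as well
}
```

Both the declaration and the call change shape when a free function
becomes a member, which is the conversion cost described above.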

: >
: >* You define parameters to functions whose only purpose is to provide
: >information about other parameters or about the context in which you are
: >calling the function, which the compiler ought to be able to find in the
: >symbol table.  For example, you have to pass the size of arrays
: >separately, or the count of the number of arguments.

:  That is sometimes desirable, other times you can
: invent a class just for the purpose of encapsulating the
: information. It's just not commonly done, but it can be.
: Of course -- the constructor for that class still has to
: take the separate arguments :-(

: >
: >* You write code to save and restore the contents of an object.  In order
: >to write this code, you look at the class declaration to see how to
: >proceed.  If you change the class declaration, you must change any code
: >which was written with that class in mind.

:  Of course you do. The existing mechanisms of encapsulation
: limit that work to "added members". If you do it right,
: you don't have to make the changes in more than one place
: (as well as adding the data member).

This is a place where it would be nice if the compiler gave you access to
the class structure.  If you could write a procedure which could generate
the save/restore code by looking at the class declaration you could save
yourself a lot of trouble.

In fact, there are a lot of other areas like this.  If you have a class
which you would like to access via a dialog box, you usually have to create
a dialog box class with a list of control definitions which parallel the
list of members in the class you want to access.  For each of those controls
you have to write code to copy data between the dialog box object and your
object.  In most cases, this code could easily be derived by looking at
your class.

: >
: >* You find yourself copying large blocks of code from other programs, such
: >as the main loop of a program with a graphical user interface.

:  That is harder. Often you need to edit it in ways
: that can't be easily parameterised. I suspect what we
: need to encapsulate here is new control structures. It turns
: out that "user defined control structures" are easy to do
: if you have garbage collection and closures. Coroutines
: are another way to do this.

Macros are often used for this purpose but because #define macros have no
looping and branching mechanisms and no reasonable error reporting mechanism
it becomes difficult to use them effectively.
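
As an illustration of roughly how far #define can be pushed, here is a
sketch of the member-list trick sometimes called an "X-macro".  All the
names in it (SETTINGS_MEMBERS, settings, save, load) are hypothetical.
The list of members is written once and expanded three ways, but there is
no way to branch on a member's type or to report a useful error from
inside the expansion.

```cpp
#include <cassert>
#include <sstream>

// Hypothetical member list: each entry is (type, name).
#define SETTINGS_MEMBERS(M) \
    M(int, width)           \
    M(int, height)          \
    M(double, zoom)

#define DECLARE_MEMBER(type, name) type name;
#define SAVE_MEMBER(type, name)    out << s.name << ' ';
#define LOAD_MEMBER(type, name)    in >> s.name;

// The struct's data members are generated from the list.
struct settings {
    SETTINGS_MEMBERS(DECLARE_MEMBER)
};

// Writes every listed member, in list order.
inline void save(std::ostream &out, const settings &s) {
    SETTINGS_MEMBERS(SAVE_MEMBER)
}

// Reads every listed member back, in the same order.
inline void load(std::istream &in, settings &s) {
    SETTINGS_MEMBERS(LOAD_MEMBER)
    assert(!in.fail());            // sketch-level error handling only
}
```

Adding a member means adding one line to SETTINGS_MEMBERS.  Anything
beyond this uniform treatment, such as skipping a member or handling
pointers differently, is where the lack of looping and branching shows.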

: >
: >What is common to all of the above examples?
: >
: >All of the code generated by hand could have been derived from information
: >which the compiler has access to but the user of the compiler does not.

:  What you mean is the language could be extended to provide
: superior modularisation and better reusability. Yes, it could.

Right.  I think that at the very least, it is time to dump #define and
provide a reasonable macro facility which allows looping, branching,
access to the parse tree/symbol table information, etc.

: >
: >There is currently no way to encapsulate these various formulas in C++.
: >As a result, programming can be very tedious and error prone.

:  Yes, but not half as bad as using C. :-)

I much prefer C++ over C myself.

Ben Jones
jones@arsoftware.arclch.com

Author: jones@cais.cais.com (Ben Jones)
Date: 4 Jul 1994 12:22:27 GMT

Alex Martelli (alex@uqbar.cirfid.unibo.it) wrote:
: vincer@iaccess.za (Vincent Risi) writes:
:  ...
: > > * You declare a class in a header file.  In a separate file you
: > >   methodically define function bodies and initializations which pertain
:  ...
: > This has been one of my pet hates about C++ from the very first time I
: > saw and used it. I am sure that somebody will say that it separates the
: > design of the class from the implementation. I would personally like to
: > see the whole class defined and implemented in one source with the
: > interface generated by the compilation (preferably in machine readable
: > form rather than text that can be modified after the compilation.)

: NO WAY!  This would mean that the interface would not be available to
: me until *after* the implementation is written (and _compiled_, to
: boot!), thus practically *forcing* me to code bottom-up.  Terrible
: concept!

Suppose that you started with the interface as you do now, where you define
your data and function prototypes but without function bodies initially
and declared the class "exportable":

    export class Foo
    {
      public:
        <data declarations>

        <function declarations>

      protected:
        ...
      private:
        ...
    };

Now, the rule would be that the generated interface consists of all public
and protected members and function prototypes, any private non-static
members, and the bodies of functions explicitly marked "inline".  This
interface would only be regenerated if those items changed.

In the course of development, you could add or edit the bodies of functions,
add private static members, add private support functions, all without
causing the interface to be regenerated.

Another advantage of doing things this way is that there would be absolutely
no question of which module emits the virtual function dispatch table.

There would be nothing to prevent you from defining the bodies of functions
in separate files, as always.

: Your own desire -- to start coding the class declaration and
: implementation in a single place, and have a tool automatically
: extract a just-declaration headerfile and a just-implementation
: sourcefile for release purposes -- would seem to be reasonably
: easy to handle with a tool separate from the language compiler.

The Ada language and some flavors of Pascal have had this sort of feature
all along.

Ben Jones
jones@arsoftware.arclch.com

Author: maxtal@physics.su.OZ.AU (John Max Skaller)
Date: Tue, 5 Jul 1994 19:50:36 GMT

> ...
>> > * You declare a class in a header file.  In a separate file you
>> >   methodically define function bodies and initializations which pertain
> ...
>> This has been one of my pet hates about C++ from the very first time I
>> saw and used it. I am sure that somebody will say that it separates the
>> design of the class from the implementation. I would personally like to
>> see the whole class defined and implemented in one source with the
>> interface generated by the compilation (preferably in machine readable
>> form rather than text that can be modified after the compilation.)
>
>NO WAY!  This would mean that the interface would not be available to
>me until *after* the implementation is written (and _compiled_, to
>boot!), thus practically *forcing* me to code bottom-up.  Terrible
>concept!

 The _ability_ to write a class all in one place is
not the same thing as a _requirement_ to do so. I want to
enable BOTH.

--
        JOHN (MAX) SKALLER,         INTERNET:maxtal@suphys.physics.su.oz.au
 Maxtal Pty Ltd,      CSERVE:10236.1703
        6 MacKay St ASHFIELD,     Mem: SA IT/9/22,SC22/WG21
        NSW 2131, AUSTRALIA

Author: jones@cais.cais.com (Ben Jones)
Date: 6 Jul 1994 14:13:13 GMT

John Max Skaller (maxtal@physics.su.OZ.AU) wrote:
: > ...
: >> > * You declare a class in a header file.  In a separate file you
: >> >   methodically define function bodies and initializations which pertain
: > ...
: >> This has been one of my pet hates about C++ from the very first time I
: >> saw and used it. I am sure that somebody will say that it separates the
: >> design of the class from the implementation. I would personally like to
: >> see the whole class defined and implemented in one source with the
: >> interface generated by the compilation (preferably in machine readable
: >> form rather than text that can be modified after the compilation.)
: >
: >NO WAY!  This would mean that the interface would not be available to
: >me until *after* the implementation is written (and _compiled_, to
: >boot!), thus practically *forcing* me to code bottom-up.  Terrible
: >concept!

:  The _ability_ to write a class all in one place is
: not the same thing as a _requirement_ to do so. I want to
: enable BOTH.

The proposed scheme allows both.  If you are interested in trying out this
concept, download yourself a Beta copy of ARC++ from the anonymous FTP:

    arsoftware.arclch.com:  /pub/arsoftware/arc++

and get "arc.READ_ME" for downloading instructions.  Alternatively, you
can get "EXPORT.txt" for more detailed information about the concept
of exportable classes.  The other issues raised in the original post
are also discussed in "*.txt".

Ben Jones
jones@arsoftware.arclch.com