Topic: Abolish CPP => Clearer code, faster compiles, simpler makefiles


Author: maxtal@physics.su.OZ.AU (John Max Skaller)
Date: Sun, 4 Dec 1994 00:28:06 GMT
In article <MATT.94Nov21172619@physics10.berkeley.edu> matt@physics.berkeley.edu writes:
>any notion of a module with an interface/implementation split.  Sure,
>you can use the mechanism of textual inclusion to simulate a module
>system (which is what pretty much everyone does), but the whole notion
>of a header file is simply an extralinguistic programming convention.

 Wish that were true, but it's not. We need rules,
the One Definition Rule in particular, and complicated
template instantiation rules, to make this idea work at all.

 In C they have "compatible types", which is
horrible but probably "just enough".

 In C++ there are inlines and templates -- it's a whole
lot more sophisticated a system, and it demands a module system,
not "non-diagnosable equivalence" rules, for robust system
development.

 The biggest problem, I think, is something most
people will not have come across much yet: template
specialisations. Because, unlike the usual "inline" function
and class declaration problems, where a simple protocol

 "use header files, don't use macros, don't use
 static variables"

is a pretty good idiom, there is NO idiom that can work
properly with specialisations.

 // file 1
 #include <stl.h>
 vector<char> vc;                  // uses the library's vector<char>

 // file 2
 #include <stl.h>
 struct vector<char> { /*...*/ };  // specialisation
 vector<char> special_vc;

WOOPS!  Two definitions of vector<char> -- the standard
one and a specialised one.

It's OK to write a rule and say this program is naughty because
the specialisation used in file 2 is not declared in file 1.

So what? How do you MANAGE multi-vendor libraries with
separate compilation in big systems?

The only idiom that works is to modify the ORIGINAL
source file <stl.h> if you specialise. That guarantees
consistency across a whole site -- and
means recompiling and debugging multiple systems each time
you add a specialisation you may need in only one translation
unit of one program of one project.

Another way is to include:

 #include "project1_stl.h"

where that file #include's STL and then provides specialisations
to be used exclusively in project1 -- which means establishing
a convention never to #include the native Standard Library
headers. Gak.
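
A minimal sketch of that wrapper idiom (the file name and the
specialisation are hypothetical, written in the same pre-standard
specialisation syntax as the example above):

 // project1_stl.h -- the ONLY header project1 code may include
 #ifndef PROJECT1_STL_H
 #define PROJECT1_STL_H
 #include <stl.h>                 // the vendor's library

 struct vector<char> { /*...*/ }; // project-wide specialisation,
                                  // now seen by every client
 #endif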

And when you install multiple libraries from different
vendors, how do you know IF they can possibly work together?
Because if they provide different specialisations they can't.
And that includes one specialising and the other NOT specialising.

IMHO that makes specialisations kind of hard to use for anyone
other than the implementor OF the original library (for example,
the vendor of the Standard Library in the case of STL).
And it makes adding them afterwards (in later releases) hard too.

How come specialisations of templates have that problem when
specialisations of virtual member functions do not?

Answer: modular design of language facilities.


--
        JOHN (MAX) SKALLER,         INTERNET:maxtal@suphys.physics.su.oz.au
 Maxtal Pty Ltd,
        81A Glebe Point Rd, GLEBE   Mem: SA IT/9/22,SC22/WG21
        NSW 2037, AUSTRALIA     Phone: 61-2-566-2189




Author: hevi@hilja.it.lut.fi (Petri Heinilä)
Date: Fri, 25 Nov 1994 03:23:50 GMT
In article <MALCOLM.94Nov22165509@xenon.mlb.sticomet.com>, malcolm@xenon.mlb.sticomet.com (Malcolm McRoberts) writes:
> I'm in favor of a true module/package mechanism, especially one that
> isn' file based.  But this isn't the only thing that cpp does, and you
> aren't proposing any solutions to the other applications.  Rather than
> lose cpp, we should be pushing for a better high-level, C++ syntax
> aware macro facility.

Isn't that called "cfront" :)

But yes, the one piece of cpp functionality, in addition to the
include mechanism, is conditional compilation (the #ifdef
facility), which helps very much in porting software.

And templates might be one thing that could live in cpp, because
their main function is to substitute the generic type (though the
type checking might then be violated).
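
To illustrate that loss of checking, a minimal sketch comparing a
cpp-style "generic" with a real template (the names are invented):

 // A cpp-style "generic": nothing checks that both arguments
 // have the same type -- int and double mix silently here.
 #define MAX(a, b) ((a) > (b) ? (a) : (b))

 // A real template: T must be deduced consistently.
 template<class T> T max(T a, T b) { return a > b ? a : b; }

 int i = MAX(1, 2.5);    // compiles; the result is silently truncated
 // int j = max(1, 2.5); // rejected: T cannot be both int and double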

--
-- <A HREF="http://www.lut.fi/~hevi/">The Page</A> --




Author: andrewfg@emerald.aifh.ed.ac.uk (Andrew Fitzgibbon)
Date: Fri, 18 Nov 1994 17:05:24 GMT

         ** Executive Summary **

   By eliminating the use of CPP, and introducing a language-based
   packaging mechanism, we can easily write precompiled headers.

   In addition, we don't have to look at CPP's 'left-justified chicken
   scratches' [Dennis Ritchie's phrase], meaning prettier code.

   Finally, by identifying the packages with UNIX archive libraries,
   the compiler can automatically determine much of the information
   which is nowadays supplied to make.

   CPP can still remain as a compiler option, and will be needed in
   many situations, but programmers who use it may soon be regarded
   like those who preprocess their code with M4.  (Yes, I've seen some!)

   Please, let's do the Right Thing.  I've got a 100 line file which
   includes 1000 lines of system header, 4000 of XView, and 2000 of
   my own.  There's only so much coffee that I can drink in a day!

         ** Original Proposal **

I've previously posted this article, which I'm including again
because I sent it only to gnu.g++.bug:

> The Problem:
> ------------
>
> Part of what makes precompiling headers so tricky is the C
> preprocessor, a horrible beast, rightly denigrated by Dennis
> Ritchie for turning fine code into an ugly mass of "left-justified
> chicken scratches".  He points out that CPP is largely
> anachronistic and unnecessary, that
>  #define TEN 10
> is much better replaced by
>  const int TEN = 10;
> This makes the symbol visible to the debugger and looks cleaner.
>
> A more compelling example, from sys/fcntl.h, is the O_* macros:
>  #define O_CREAT         0x100
>  #define O_TRUNC         0x200
>  #define O_EXCL          0x400
>  #define O_NOCTTY        0x800
> These would surely be much better replaced by an enum.
>
> Finally, he notes that even #if..#endif can occasionally be
> replaced by a simple "if (CONSTANT) { .. }" that will be optimized
> out.
>
> Without CPP, header precompilation should be a fairly simple task
> of dumping some encoding of the symbol-table additions induced by
> compiling the file, and reloading those encodings at a later
> compile, probably checking that the 'source' file is not newer.
>
> A Suggestion:
> -------------
>
> Perhaps G++ could be persuaded only to precompile headers which
> obey certain restrictions on the use of CPP.  One sequence of
> restrictions in decreasing order of severity, and hence increasing
> difficulty of implementation might be:
>
> 1. No CPP syntax at all
>
> 2. No CPP other than the "include once" wrapper that GNU's CCCP already
>    special-cases.
>
> 3. No internal #define; #if EXPR .. #endif permitted, where EXPR
>    depends only on externally #defined symbols.  At this stage, the
>    precompiled header would
>    have to include some encoding of what the CPP state was on input
>    to the file, so that future compiles could be verified to have
>    the same effect.
>
> 4. Limited macro capability that could be left in a CPP-able header,
>    but with class definitions etc. being put in the encoded file.
>
> Many C++ programmers would be happy to go with #3 if it meant a
> large reduction in compile time, and it might also lead system
> manufacturers to supply more 'precompile-friendly' system headers.
> I know I'd be happy to rewrite the Solaris headers to comply.
>
> Comments?
>
> A.
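
To make those quoted replacements concrete, a minimal sketch (the
O_* values are the ones quoted above; TRACE is an invented example
constant):

---- snip ----
// Instead of #define TEN 10:
const int TEN = 10;           // typed, scoped, visible to the debugger

// Instead of the O_* macros:
enum OpenFlag {
    O_CREAT  = 0x100,
    O_TRUNC  = 0x200,
    O_EXCL   = 0x400,
    O_NOCTTY = 0x800
};

// Instead of #if TRACE ... #endif:
const int TRACE = 0;
void report() { if (TRACE) { /* optimised out when TRACE is 0 */ } }
---- snip ----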

         ** A Rethink: Units **

Since I wrote this, discussions with people on the net have convinced
me that we should not paper over what is effectively a bad crack.

My (and others') suggestion and, I believe, The Right Thing To Do is
to abolish cpp, and replace #include with a decent packaging
construct within the language.

My personal favourite is the old UCSD Pascal 'uses' keyword.
Paraphrasing the old Apple ][ pascal example, we have something like the
following.  AppleII.cc is a user of the turtlegraphics and math
libraries, and so writes "uses turtlegraphics" to load the interface
definitions for the library:

---- snip ----
// AppleII.cc

// New construct:
uses turtlegraphics,math;

main()
{
  pen(DOWN);
  forward(1);
  pen(UP);
}
---- snip ----

I see the units as corresponding directly to UNIX archive libraries,
so that the compiler can know the libraries with which to invoke the
linker.  In the example above, we would automatically link with
'ld -lturtlegraphics -lmath'.

The library implementer can write something like the following
throughout the library, and this defines what is imported into client
code.

--- snip ---
interface turtlegraphics {
 class Pen { ... };
 void forward(double);
};
--- snip ---

In addition, implementation as I've described it is really
simple... the compiler now knows that it's making a library, and can
just make a new "object" in the library containing machine-readable
"precompiled" interface definitions, much as the __.SYMDEF object
does for ranlib archives.  The interface definitions could
conceivably be placed anywhere within the code, which would allow
files to be organized as users see fit.

     ** Objections **

The following are main objections that I foresee:

"What about #if defined(WACKY_OS) && WACKY_VER <= 1.3 ?
 What will we do about portability?"
   Because almost all system headers will have to be rewritten to not
   use cpp, and with the onset of X/Open and POSIX, these sorts of
   hacks should become rarer.  I will admit, though, that this is a
   problem.  My only current solution is to load different libraries
   or units for different OS's.  Of course, the smart-alec answer to
   the question "What will we do about portability?" is "write
   portable code!" :-)

"Make will want to recompile everything all the time"
   Now this is a biggie.  We can have make depend on a per-unit
   timestamp file (the precompiled "header"), but ideally we would
   like library-level granularity, meaning that any interface change
   within the library would mean recompilation of all clients.
   I think the compiler could instead supply make with an "interface
   change time" on a per-class basis, and per-sourcefile class
   dependency listings.  Actually, this answers my other big problem
   too!

"The UNIX philosophy decries monolithic tools -- cpp, cc1, make, and
 ar should remain separate, rather than combining them into one."

   This argument is fine for general-purpose tools such as sed, awk,
   and cpp.  However, cc1 is only ever used as a C compiler and ar is
   very rarely used other than in creating archive libraries[1].  In
   this case, they might as well be lumped in together, especially if
   the result is cleverer, faster and smaller.

"Not another keyword"
   This is no excuse: the language would be so much improved that
   renaming variables that accidentally conflicted would be a minor
   inconvenience.

         ** GCC **

GCC is already trying to provide some of this type of functionality,
but with #pragmas and nonstandard compiler options.  Please let's
make the world a better place.

A.

[1] [For the juniors] ar is sometimes preferred over tar when all the files
    in the archive are printable.

--
Andrew Fitzgibbon (Research Associate),                     andrewfg@ed.ac.uk
Artificial Intelligence, Edinburgh University.               +44 031 650 4504
                        http://www.dai.ed.ac.uk/staff/personal_pages/andrewfg
       "If it ain't broke, don't fix it" - traditional (c1950)
          "A stitch in time saves nine." - traditional (c1590)




Author: maxtal@physics.su.OZ.AU (John Max Skaller)
Date: Sat, 19 Nov 1994 20:45:51 GMT
In article <ANDREWFG.94Nov18170524@emerald.aifh.ed.ac.uk> andrewfg@emerald.aifh.ed.ac.uk (Andrew Fitzgibbon) writes:
>
>         ** Executive Summary **
>
>   By eliminating the use of CPP, and introducing a language-based
>   packaging mechanism, we can easily write precompiled headers.
>
>   Please, let's do the Right Thing.  I've got a 100 line file which
>   includes 1000 lines of system header, 4000 of XView, and 2000 of
>   my own.  There's only so much coffee that I can drink in a day!

 How many users feel this?

 I personally believe that of the _nonfunctional_ changes
to C++ that are important, this is number 1. I think many users
are ****ed off at slow compiles, unreliable instantiation,
lack of ODR checking, macro interference, and other effects of
retaining CPP and not providing a module system.

 However, adding one, while not that demanding, will
take some time and effort and probably delay Standardisation
for 4 months.

 Is it worth it? I think so, but most other members of the
committee do not. Too bad.

>My (and others') suggestion and, I believe, The Right Thing To Do is
>to abolish cpp, and replace #include with a decent packaging
>construct within the language.

 Yea!
>
>My personal favourite is for the old UCSD pascal 'uses' keyword.
>Paraphrasing the old Apple ][ pascal example, we have something like the
>following.  AppleII.cc is a user of the turtlegraphics and math
>libraries, and so writes "uses turtlegraphics" to load the interface
>definitions for the library:

 Yea! But

 include turtlegraphics;

would do. My idea is quite simple; it follows in fact from an idea
expressed by Bjarne Stroustrup (documented in D&E). The idea is
to _preserve_ C and C++ compatibility -- you can use CPP if you
_really_ want to.


 1) provide a way of naming _translation units_.
 2) specify that

  include unitname;

 shall include the _external interface_ of the translation unit;
 the include statement is permitted only in namespace scope
 (not in a class or block -- just to simplify things).

 The effect is to include declarations of each entity in the unit.
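
 For illustration, a hypothetical sketch (the "include" statement
 and the unit name are of course not existing C++):

 // turtle.cc -- an ordinary translation unit, compiled once
 void forward(double dist) { /*...*/ }  // external linkage: exported
 static int calls = 0;                  // static: NOT exported

 // client.cc -- imports turtle.cc's compiled external interface
 include turtle;

 int main() { forward(1.0); return 0; }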

ISSUES.

The meaning of "external interface" must be specified. Here is my
proposal:

 1) All entities with external linkage are imported.
 2) All aliases and enumeration constants are imported.
 3) Static functions and variables are NOT imported.

What does it mean?  Why is this the best proposal?

First, this proposal allows compilation of include files
and header files; the _compiled_ forms can then be included.
Such inclusion does not require any preprocessing, parsing,
or name binding: binary images of symbol tables might well be
loaded, and on some systems "attaching" a file as virtual
memory would make this operation _extremely fast_ (a sketch
of that idea appears below).

Second, whole source files (not just headers) can be compiled
and included. This does away with the need for headers completely.
However -- it does not exclude them! It's your choice.

Third, including compiled translation units eliminates any doubts
introduced by the One Definition Rule. Including an already
compiled function definition does not constitute a second
definition -- you're including the _original_ single definition.

Fourth, implementors may attach definitions to such functions
and load them in too -- although this is not required.
And that means that calls to these functions can be _inlined_
whether or not the function was declared inline -- the function
does not need to be defined in a header file for inlining to work.

Fifth, compiled templates can be processed the same way:
including a module which contains a template definition allows
the template to be instantiated during compilation and inlined,
and does not require this to be delayed until link time
(or an inline definition provided in a header).

Sixth: multiple inclusions are permitted but have no effect.

In short, this facility allows vastly improved compile times,
it provides semantic guarantees of correctness, it removes
the need for separate header files and source files while
not excluding either, and it is compatible with the pre-processor.
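
As an aside, a minimal POSIX sketch of the "attaching a file as
virtual memory" idea mentioned above (the unit file format itself
is of course hypothetical):

 #include <sys/mman.h>
 #include <sys/stat.h>
 #include <fcntl.h>
 #include <unistd.h>

 // Map a precompiled interface image straight into the compiler's
 // address space: no preprocessing, no parsing, no name binding.
 void* attach_unit(const char* path, size_t* len)
 {
     int fd = open(path, O_RDONLY);
     if (fd < 0) return 0;
     struct stat st;
     if (fstat(fd, &st) < 0) { close(fd); return 0; }
     void* img = mmap(0, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
     close(fd);                  // the mapping survives the close
     *len = st.st_size;
     return img == MAP_FAILED ? 0 : img;
 }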

One disadvantage of the proposal is that it does not introduce
a "proper" module system. It defines a module as "any" translation
unit, and extracts the interface automatically.

Of course, this is deliberate, to provide the minimal extension
to the language with the maximal flexibility and power.
You can _use_ this system as a module system if you choose,
but you do not have to.

Naturally, this is _not_ a full technical proposal. There is not
enough support for this to bother writing one. Sorry.

--
        JOHN (MAX) SKALLER,         INTERNET:maxtal@suphys.physics.su.oz.au
 Maxtal Pty Ltd,
        81A Glebe Point Rd, GLEBE   Mem: SA IT/9/22,SC22/WG21
        NSW 2037, AUSTRALIA     Phone: 61-2-566-2189




Author: jones@cais.cais.com (Ben Jones)
Date: 21 Nov 1994 15:10:49 GMT
Andrew Fitzgibbon (andrewfg@emerald.aifh.ed.ac.uk) wrote:
: My (and others') suggestion and, I believe, The Right Thing To Do is
: to abolish cpp, and replace #include with a decent packaging
: construct within the language.

: My personal favourite is for the old UCSD pascal 'uses' keyword.
: Paraphrasing the old Apple ][ pascal example, we have something like the
: following.  AppleII.cc is a user of the turtlegraphics and math
: libraries, and so writes "uses turtlegraphics" to load the interface
: definitions for the library:

A more elegant solution would be to have an exportable class/package:

    export class name: base1, base2,... { ... };

    export package name1: base3, base4,... { ... };

A "package" is a class where all members are "static".  This would serve
the function of both a module and a namespace.  Classes and packages
specified as "bases" would be automatically imported.

A package "inherited" by a class simply has its members directly
accessible to functions and initializations defined in that class.

A class "inherited" by a package behaves as though an instance of
it was created in the package with its members directly accessible
to functions and initializations defined in the package.
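
A hypothetical sketch of both directions (the syntax here is the
proposal's, not existing C++):

    export package math { const double pi = 3.14159; double sin(double); };

    // A package inherited by a class: members of math are directly
    // accessible inside Turtle, as if declared at class scope.
    export class Turtle : math {
        double heading;
        void turn(double deg) { heading += deg * pi / 180.0; }
    };

    // A class inherited by a package: as though one Turtle instance
    // lived in the package, its members directly accessible.
    export package turtlegraphics : Turtle {
        void forward(double dist);
    };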

There would be no need for writing headers or #including them in this
scheme.  A "pre-compiled" header would be generated by compiling the
file containing the "export" of a class or package.

:      ** Objections **

: The following are main objections that I foresee:
[snip snip]

: "Make will want to recompile everything all the time"
:    Now this is a biggie,  we can have make depend on a per-unit
:    timestamp file (the precompiled "header"), but ideally we would
:    like library-level granularity, meaning that any interface change
:    within the library would mean recompilation of all clients.
:    I think the compiler could instead supply make with an "interface
:    change time" on a per-class basis, and per-sourcefile class
:    dependency listings.  Actually, this answers my other big problem
:    too!

The export process could compare the generated pre-compiled header with
a previously generated one and backdate it if changes don't warrant
a recompilation.
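
A minimal sketch of that backdating step (essentially the classic
"move-if-change" trick; the function and file names are invented):

    #include <cstdio>
    #include <cstdlib>
    #include <string>

    // Install a freshly generated pre-compiled header: if it is
    // byte-identical to the previous one, discard it so the old
    // timestamp stands and make sees no change.
    void install_header(const std::string& fresh, const std::string& old)
    {
        std::string cmd = "cmp -s " + fresh + " " + old;
        if (std::system(cmd.c_str()) == 0)
            std::remove(fresh.c_str());
        else
            std::rename(fresh.c_str(), old.c_str());
    }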

This scheme has been implemented in a preprocessor.  If you would like
more information, please contact me:

Ben Jones
ARSoftware Corporation
jones@arsoftware.arclch.com





Author: linh@info.polymtl.ca (Li~nh)
Date: 21 Nov 1994 22:45:45 GMT
John Max Skaller (maxtal@physics.su.OZ.AU) wrote:
: In article <ANDREWFG.94Nov18170524@emerald.aifh.ed.ac.uk> andrewfg@emerald.aifh.ed.ac.uk (Andrew Fitzgibbon) writes:
: >
: >         ** Executive Summary **
: >
: >   By eliminating the use off CPP, and introducing a language-based
: >   packaging mechanism, we can easily write precompiled headers.
: >
: >   Please, let's do the Right Thing.  I've got a 100 line file which
: >   includes 1000 lines of system header, 4000 of XView, and 2000 of
: >   my own.  There's only so much coffee that I can drink in a day!

:  How many users feel this?

I do.

:  I personally believe that of the _nonfunctional_ changes
: to C++ that are important, this is number 1. I think many users
: are ****ed off at slow compiles, unreliable instantiation,
: lack of ODR checking, macro interference, and other effects of
: retaining CPP and not providing a module system.

[stuff deleted]

: Naturally, this is _not_ a full technical proposal. There is not
: enough support for this to bother writing one. Sorry.





Author: matt@physics10.berkeley.edu (Matt Austern)
Date: 22 Nov 1994 01:26:18 GMT
In article <CzJ8CG.IsC@ucc.su.OZ.AU> maxtal@physics.su.OZ.AU (John Max Skaller) writes:

>  However, adding one, while not that demanding, will
> take some time and effort and probably delay Standardisation
> for 4 months.
>
>  Is it worth it? I think so, but most other members of the
> committee do not. Too bad.

I think it's worth it too.  It's really bizarre that C++ doesn't have
any notion of a module with an interface/implementation split.  Sure,
you can use the mechanism of textual inclusion to simulate a module
system (which is what pretty much everyone does), but the whole notion
of a header file is simply an extralinguistic programming convention.


--

                               --matt




Author: andrewfg@aisb.ed.ac.uk (Andrew Fitzgibbon)
Date: Tue, 22 Nov 1994 12:42:30 GMT
In article <3aqd9p$5kf@news.cais.com> you wrote:
> Andrew Fitzgibbon (andrewfg@emerald.aifh.ed.ac.uk) wrote:

 [I wanted 'uses turtlegraphics' to give me all
  external definitions in the group of source files
  which constitute the 'turtlegraphics' library]

> A more elegant solution would be to have an exportable class/package:
>
>     package turtlegraphics: class1, class2,... { ... };
>
[my edit, removing tautological 'export']

Syntactically, I like it.  So the 'client-side' code uses
multiple-inheritance syntax to indicate package dependencies on a
per-class basis.  In addition, the file-scope question completely
disappears -- files may be organized as users see fit.

But what about non-class functions, for example main()?

> There would be no need for writing headers or #including them in this
> scheme.  A "pre-compiled" header would be generated by compiling the
> file containing the "export" of a class or package.

I thought this was so obvious as not to need emphasis, but yes, header
files go.  See Max Skaller's eulogy for the obvious UNIX approach of
mmapped symbol-tables.  In addition, there's the potential of mmapped
function RTLs (GCC terminology) for global cross-file inlining.

Isn't anyone EXCITED?

A.

--
Andrew Fitzgibbon (Research Associate),                     andrewfg@ed.ac.uk
Artificial Intelligence, Edinburgh University.               +44 031 650 4504
                        http://www.dai.ed.ac.uk/staff/personal_pages/andrewfg
       "If it ain't broke, don't fix it" - traditional (c1950)
          "A stitch in time saves nine." - traditional (c1590)





Author: malcolm@xenon.mlb.sticomet.com (Malcolm McRoberts)
Date: 22 Nov 1994 21:55:08 GMT
I'm in favor of a true module/package mechanism, especially one that
isn't file based.  But this isn't the only thing that cpp does, and you
aren't proposing any solutions to the other applications.  Rather than
lose cpp, we should be pushing for a better high-level, C++ syntax
aware macro facility.

-Malcolm
--
____________________________________________________________________________
Malcolm McRoberts
STI
1225 Evans Rd.
Melbourne, Fl. 32904
Email:  malcolm@sticomet.com
Ph:  (407) 723-3999
Fax: (407) 676-4510
____________________________________________________________________________