Topic: Suggestion for improvement of C++ standard: forward refs


Author: nagle@animats.com (John Nagle)
Date: Tue, 31 Dec 2002 16:48:52 +0000 (UTC)
llewelly wrote:

> allan_w@my-dejanews.com (Allan W) writes:
>
> [snip]
>
>>I do know that multi-pass goes against the history of both C and
>>C++. Kernighan and Ritchie wanted C to be an easy language to parse,
>>and Stroustrup wanted to continue this tradition.
>>
> [snip]
>
> *boggle* How did this lead to ISO C++98's enormous, ambiguous grammar?


     It all comes from bad original design of declarations in C.

     In the beginning, C really was a sort of high-level
assembler.  "structs" were just a list of offsets; initially,
you couldn't have the same fieldname in two different structures.
There was no user typing at all.  No "typedef".  C initially
was a slight improvement over BCPL, sometimes known as the
British Cruddy Programming Language.  But
the language was LALR(1), and could be parsed with
a straightforward grammar.

     The whole typing system in C, let alone that of C++,
is a backwards-compatible retrofit to early C.

     "typedef" was the first thing that broke parsing.
Before typedef, you could parse C without looking up
user-defined symbols in the dictionary.
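
     To make that concrete (an illustrative sketch; the names
are invented), the parse of "a * b;" depends entirely on what
"a" turns out to be:

    typedef int a;

    void f() {
        a * b;      // declares b as a pointer to a (an int*)
    }

    void g(int a, int b) {
        a * b;      // the parameter hides the typedef: this
    }               // multiplies a by b and discards the result

So the parser must consult the symbol table just to decide
which grammar production it is in the middle of.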

     Since then, the syntax has become uglier with each
addition to the language.

     If C had gone with Pascal-style declaration syntax,
"var x,y: int; c: char", instead of "int x,y; char c;",
it would have been easy to parse.  But because C didn't
originally have a type system, the need for a real
declaration syntax wasn't foreseen.  Hence the current
mess.

     John Nagle
     Animats

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: llewelly.@@xmission.dot.com (llewelly)
Date: Tue, 31 Dec 2002 05:42:24 +0000 (UTC)
allan_w@my-dejanews.com (Allan W) writes:

[snip]
> I do know that multi-pass goes against the history of both C and
> C++. Kernigan and Ritchie wanted C to be an easy language to parse,
> and Stroustrup wanted to continue this tradition.
[snip]

*boggle* How did this lead to ISO C++98's enormous, ambiguous grammar,
    which is entirely hostile to traditional LALR parsing techniques? I'm
    convinced Ritchie and Stroustrup both had long lists of things they
    viewed as more important than 'be an easy language to parse'. (I
    think they also thought of 'easy to parse' as 'easy for a
    hand-written variable lookahead recursive descent parser', as
    opposed to easy for a machine-generated LALR parser, which is a
    different beast. )
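
    A concrete instance of that hostility (my sketch; the names are
    invented) is the declaration/expression ambiguity:

        struct Timer {};

        struct Gadget {
            Gadget(Timer);
            void blink();
        };

        // Looks like an object g initialized with a temporary Timer,
        // but the rule "if it can be a declaration, it is" makes this
        // a declaration of a *function* g taking a pointer to a
        // function returning Timer:
        Gadget g(Timer());

        // g.blink();   // error: g is a function, not a Gadget

    The grammar itself is ambiguous here; the standard resolves the
    ambiguity by fiat, outside the grammar, which no fixed-lookahead
    LALR machinery can express.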






Author: kanze@gabi-soft.de (James Kanze)
Date: Thu, 2 Jan 2003 06:25:45 +0000 (UTC)
allan_w@my-dejanews.com (Allan W) wrote in message
news:<7f2735a5.0212261455.27cfe47d@posting.google.com>...
> ai@springtimesoftware.com ("David Spector") wrote
> > Hi, all. I am wondering why in these modern times
> > we still have the 'no forward reference' limitation?

Because it helps to make the code readable.

> > This sets C++ apart from most other programming languages and makes
> > programs harder to write and/or less natural to read. For example,
> > functions must be both declared and defined lexically prior to any
> > use (call),

Functions must be declared, not defined, prior to any use.

Most well designed modern languages have completely separate sections
for interface definitions (which are exported) and the implementation
(which isn't exported).  Such languages tend to require special linkers,
etc., which the authors of C++ didn't want.

> You're mistaken. Functions must be EITHER declared or defined
> lexically prior to any use. (A definition counts as a declaration.)
> AFAIK, the sole exception is inline functions; surely, as a former
> compiler writer, you can understand the need for that (with existing
> linker technology).

Come now.  Some modern compilers DO inline functions across module
boundaries, even if the function isn't declared inline.  The goal was,
however, that such sophisticated technologies not be needed.

> > or else an "empty" file-level forward reference declaration must be
> > made.

> Not certain what this means. Are you talking about header files, which
> typically contain a list of declarations?

> > There is very little implementation impact on compilers (it requires
> > at most an additional partial (fixup) pass). I have some knowledge
> > in this area because I spent many years as a compiler writer for
> > companies such as DEC, Concurrent Systems (MASSCOMP), and Prime
> > Computer.

> Maybe you're talking about code like this?

>     int main() {
>         show(2);                     // show()  not declared yet
>         for (int i=3; i<LIMIT; i+=2) // LIMIT   not declared yet
>             if (prime(i))            // prime() not declared yet
>                 show(i);             // show()  not declared yet
>         std::cout << std::endl;
>     }
>     const int LIMIT = 1000;
>     bool prime(int i) {
>         // For exposition only; better algorithms exist
>         for (int j=3; j<i; j+=2)
>             if ((j*j)>i) return true;
>             else if((i%j)==0) return false;
>         return true;  // no divisor found (reached when i <= 3)
>     }
>     #include <iostream>
>     void show(int i) { std::cout << i; }

> Not a compiler expert like yourself, but I can see problems with using
> fixup classes. Before we see the declaration of show(), we don't know
> if it's going to return any data (that we will have to throw away). We
> could emit a bunch of NOPs and then change them to something more
> meaningful if needed, but that seems wasteful -- not optimized.

I'm not sure what you mean by "fixup classes", but today, it wouldn't be
too difficult to build a compiler which didn't require the declarations
to precede the use.

I'm not sure what advantages it would buy us, however.  For very simple
examples, like yours, fine, but typically, the functions you will be
using will be in a different module, in a different file, and you will
pick up the declarations in an include.  Given that, I don't see why you
would want to put the include at the end of the file.
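
Just as a sketch of that convention (the file names here are invented,
not from this thread):

    // show.h -- the interface, included wherever show() is used
    #ifndef SHOW_H
    #define SHOW_H
    void show(int i);
    #endif

    // main.cpp
    #include "show.h"               // declaration precedes every use
    int main() { show(2); }

    // show.cpp -- the implementation, compiled separately
    #include "show.h"
    #include <iostream>
    void show(int i) { std::cout << i; }

Once the declarations arrive via a header at the top of each file,
a forward-reference rule has nothing left to buy you.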

> We could also turn C++ into a multi-pass compiler. The first pass
> would discover that LIMIT, prime, and show are all defined; the second
> pass would use that knowledge, without having to declare the functions
> in addition to defining them. Certainly there is precedent for this; I
> suspect FORTRAN has to be multi-pass, and I know that the DEC assembly
> language was two pass!

Modern C++ requires backtracking, which is a lot more difficult than
multiple passes.
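
The canonical cases are the standard's own declaration/expression
ambiguities; here they are, padded into a sketch that compiles:

    struct T {
        int m;
        T() : m(0) {}
        T(int) : m(0) {}
        T* operator->() { return this; }    // lets the cast form compile
    };

    void f(int a) {
        T(a)->m = 7;    // expression-statement: functional cast of a
        T(b);           // declaration-statement: an object named b
        b.m = 7;        // proof that b is now a variable in scope
    }

Both statements begin with the same tokens, "T ( identifier )".  The
rule is that anything which can be parsed as a declaration is one, so
the parser must try the declaration parse tentatively and back out
when a later token (here, the "->") rules it out.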

There is some history involved.  In C, there is no requirement that a
function be declared before use.  But using an undeclared function
implicitly declares it, and there is a requirement that all declarations
agree.  Thus:

    int
    main()
    {
        //      Illegal: the call implicitly declares f to return
        //      int, and an int doesn't convert to char*.
        char* p = f() ;
        //      Legal.
        int i = g() ;
    }

    //      Illegal, since we have already declared f to return int.
    char*
    f() { ... }

    //      Legal
    int
    g() { ... }

The earliest C compilers were simple beasts, and requiring the
declaration of f before its use certainly simplified things then.
Today, I really don't think it makes a lot of difference, but I don't
see really what we gain either by allowing it.  (Java has to allow it
because they don't allow separating declarations from definitions.  C++
allows it for functions defined directly in the class, where it is also
necessary.  But most coding guidelines I've seen forbid defining
functions directly in the class, for readability reasons.  Why Java
requires what had already been proven to be extremely poor programming
practice in C++, I don't know.)

> People smarter than I can tell you why multi-pass is a disadvantage
> (although it may have something to do with synchronization? If a
> symbol has a different value in pass 1 than in pass 2, the compiled
> code will go a bit crazy.)

C++ already requires multiple passes over parts of the sources, and in
ways far more complex than would be required for this.
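
The best-known case is a member function defined inside its class,
which is analyzed as if it appeared after the complete class -- a
minimal sketch:

    struct Counter {
        int next() { return ++value; }  // uses 'value' before its
        int value;                      // declaration: legal, because
    };                                  // member bodies are processed
                                        // after the whole class is seen

That is exactly the forward referencing being requested, and C++
already grants it inside class definitions.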

> I do know that multi-pass goes against the history of both C and
> C++. Kernighan and Ritchie wanted C to be an easy language to parse,
> and Stroustrup wanted to continue this tradition.

If THAT were Stroustrup's main goal, I think we can agree that he
failed:-).  C++ is one of the hardest languages in the world to parse.

I'm not sure whether parser simplicity was a goal of C, either.  The
ability to generate reasonably good code using very limited machine
resources, yes, but C, while much simpler than C++, isn't as easy to
parse as it could be either.

The only language I'm familiar with where parser simplicity is an avowed
goal is Pascal.  (That doesn't mean that other languages introduce
complications just for the fun of it.)

--
James Kanze                           mailto:jkanze@caicheuvreux.com
Conseils en informatique orientée objet /
                    Beratung in objektorientierter Datenverarbeitung






Author: allan_w@my-dejanews.com (Allan W)
Date: Fri, 27 Dec 2002 20:49:05 +0000 (UTC)
ai@springtimesoftware.com ("David Spector") wrote
> Hi, all. I am wondering why in these modern times
> we still have the 'no forward reference' limitation?
> This sets C++ apart from most other programming languages
> and makes programs harder to write and/or less natural to
> read. For example, functions must be both declared and
> defined lexically prior to any use (call),

You're mistaken. Functions must be EITHER declared or defined
lexically prior to any use. (A definition counts as a
declaration.) AFAIK, the sole exception is inline functions;
surely, as a former compiler writer, you can understand the
need for that (with existing linker technology).

> or else an "empty"
> file-level forward reference declaration must be made.

Not certain what this means. Are you talking about header files,
which typically contain a list of declarations?

> There is very little implementation impact on compilers (it
> requires at most an additional partial (fixup) pass). I have
> some knowledge in this area because I spent many years as
> a compiler writer for companies such as DEC, Concurrent
> Systems (MASSCOMP), and Prime Computer.

Maybe you're talking about code like this?

    int main() {
        show(2);                     // show()  not declared yet
        for (int i=3; i<LIMIT; i+=2) // LIMIT   not declared yet
            if (prime(i))            // prime() not declared yet
                show(i);             // show()  not declared yet
        std::cout << std::endl;
    }
    const int LIMIT = 1000;
    bool prime(int i) {
        // For exposition only; better algorithms exist
        for (int j=3; j<i; j+=2)
            if ((j*j)>i) return true;
            else if((i%j)==0) return false;
        return true;  // no divisor found (reached when i <= 3)
    }
    #include <iostream>
    void show(int i) { std::cout << i; }

Not a compiler expert like yourself, but I can see problems with
using fixup classes. Before we see the declaration of show(), we
don't know if it's going to return any data (that we will have to
throw away). We could emit a bunch of NOPs and then change them
to something more meaningful if needed, but that seems wasteful --
not optimized.

We could also turn C++ into a multi-pass compiler. The first pass
would discover that LIMIT, prime, and show are all defined; the
second pass would use that knowledge, without having to declare
the functions in addition to defining them. Certainly there is
precedent for this; I suspect FORTRAN has to be multi-pass, and
I know that the DEC assembly language was two pass!
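
In effect, pass 1 would synthesize the declarations we currently
write by hand. For the example above, this prefix (my sketch) is
all it would need to record:

    #include <iostream>       // hoisted: main() uses std::cout
    const int LIMIT = 1000;   // moved ahead of its first use
    bool prime(int i);        // signatures pass 1 would collect
    void show(int i);

With that prefix in place (and the later duplicates removed),
main() compiles exactly as written.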

People smarter than I can tell you why multi-pass is a disadvantage
(although it may have something to do with synchronization? If
a symbol has a different value in pass 1 than in pass 2, the
compiled code will go a bit crazy.)

I do know that multi-pass goes against the history of both C and
C++. Kernighan and Ritchie wanted C to be an easy language to parse,
and Stroustrup wanted to continue this tradition.






Author: jackklein@spamcop.net (Jack Klein)
Date: Fri, 27 Dec 2002 22:23:35 +0000 (UTC)
On Thu, 26 Dec 2002 10:04:18 +0000 (UTC), ai@springtimesoftware.com
("David Spector") wrote in comp.std.c++:

> Hi, all. I am wondering why in these modern times
> we still have the 'no forward reference' limitation?
> This sets C++ apart from most other programming languages
> and makes programs harder to write and/or less natural to
> read. For example, functions must be both declared and
> defined lexically prior to any use (call), or else an "empty"
> file-level forward reference declaration must be made.

Neither of these assumptions is correct.

A function must have a prototype in scope at the point where it is
called.  This may be provided by the definition (body) of the
function, which in C++ also provides a valid prototype.  Otherwise a
prototype must be explicitly provided, but it does not need to be at
file or namespace scope.
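
A minimal sketch of the block-scope case (names invented):

    void caller() {
        void helper(int);   // declaration local to this block
        helper(42);         // fine: a declaration is in scope here
    }

    void helper(int) { /* ... */ }  // definition elsewhere

The declaration inside caller() is visible only there; nothing at
file or namespace scope declares helper() at the point of the call.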

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++ ftp://snurse-l.org/pub/acllc-c++/faq






Author: ai@springtimesoftware.com ("David Spector")
Date: Thu, 26 Dec 2002 10:04:18 +0000 (UTC)
Hi, all. I am wondering why in these modern times
we still have the 'no forward reference' limitation?
This sets C++ apart from most other programming languages
and makes programs harder to write and/or less natural to
read. For example, functions must be both declared and
defined lexically prior to any use (call), or else an "empty"
file-level forward reference declaration must be made.

There is very little implementation impact on compilers (it
requires at most an additional partial (fixup) pass). I have
some knowledge in this area because I spent many years as
a compiler writer for companies such as DEC, Concurrent
Systems (MASSCOMP), and Prime Computer.

I would think that there is no compatibility problem, since
relaxing this restriction should not make any existing programs
fail to compile.

David Spector
President, Springtime Software
ai@springtimesoftware.com

