Topic: Semantic-context keywords (was Global "if" statement?)


Author: dacut@henry.ece.cmu.edu (David A. Cuthbert)
Date: 1997/05/08
David A. Cuthbert wrote:
>> Which brings me to another question:  what are the guidelines that the
>> C++ committee followed concerning syntax parsers?  Would it have
>> complicated compilers much if certain keywords were reserved only in
>> specific contexts?  Consider an alternate form of declaring a const
>> virtual method:
>>
>> int  myMethod(int, char)  is(const, virtual);

Steve Clamage  <stephen.clamage@Eng.Sun.COM> replied:
>As best I remember, no one ever seriously considered keywords that
>weren't always reserved. Parsing C++ is already difficult, and adding
>unnecessary complications isn't going to win any votes. Error reporting
>is also better when the program context does not affect which words
>are keywords.

I'd even go so far as to say that parsing C++ involves some pretty
heavy magic.  <g>

I don't know that the is() directive would be any more difficult to
parse.  You would have to add some states to the lexical analyzer,
but this could simplify the parser.  In the end, I don't think that
either method has any more advantages in implementation complexity; of
course, the current way has the advantage of "that's the way we've
always done it (in C)," and, therefore, fewer bugs to track down.

I think that the quality of implementation affects error reporting
more than the parsing method.  One of my peeves is messages like
") expected", ", expected", and "; expected".  I usually want to ask
the compiler, "Why?  Tell me what you thought I was trying to do
(function decl, etc.), so that I can disambiguate and tell you what I
really want to do."

Anyway, I mentioned this because I've noticed that most extensions
that to C that make up  C++ try to use existing syntax in odd,
counter-intuitive ways as much as possible.  This avoids adding
keywords and utilizes expressions that were illegal (good for
backwards compatibility).  You have to wonder, though, if these same
guidelines should apply to fresh code that uses no legacy features
(i.e.:  shouldn't such code be written in C++++?).

--
David A. Cuthbert (henry.ece.cmu.edu!dacut)
Graduate Student, Electrical and Computer Engineering
Data Storage Systems Center, Carnegie Mellon University
---
[ comp.std.c++ is moderated.  To submit articles: try just posting with      ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu         ]
[ FAQ:      http://reality.sgi.com/employees/austern_mti/std-c++/faq.html    ]
[ Policy:   http://reality.sgi.com/employees/austern_mti/std-c++/policy.html ]
[ Comments? mailto:std-c++-request@ncar.ucar.edu                             ]





Author: Steve Clamage <stephen.clamage@eng.sun.com>
Date: 1997/05/10
David A. Cuthbert wrote:
>
> Steve Clamage  <stephen.clamage@Eng.Sun.COM> replied:
> >As best I remember, no one ever seriously considered keywords that
> >weren't always reserved. Parsing C++ is already difficult, and adding
> >unnecessary complications isn't going to win any votes. Error reporting
> >is also better when the program context does not affect which words
> >are keywords.
>
> I think that the quality of implementation affects error reporting
> more than the parsing method.  One of my peeves is messages like
> ") expected", ", expected", and "; expected".  I usually want to ask
> the compiler, "Why?  Tell me what you thought I was trying to do
> (function decl, etc.), so that I can disambiguate and tell you what I
> really want to do."

That is probably so, but is a different point. Independent of parsing
method and the implementor, some language design choices can make error
reporting easier or more difficult.

Assumptions: Quitting after finding one syntax error is not acceptable.
The compiler should recover (resynchronize) after finding an error and
continue analysis to find as many of the actual errors as possible in
any one run. Reporting "errors" that are artifacts of losing
synchronization is not desirable, and should be kept to a minimum.

Example: Why require semicolons to terminate expression statements?
 x = y
 a = b
is clear enough. The "a" following the "y" cannot possibly be part
of the same expression or statement, so it must begin a new
statement. You don't need a semicolon separator.

Answer: Apart from possibly reducing ambiguities, the semicolon
represents a "reliable token" that positively indicates the end of
an expression, and also the end of most statements. (A for-header
is an exception.) That is, when you encounter a semicolon token,
you can reset the expression parser unconditionally.

Without a required semicolon, many kinds of typographical or other
errors in the source code make it hard or impossible to determine
where you can continue parsing and looking for more errors. If
you try to resynchronize too soon, you generate a lot of bogus error
messages due to misinterpreting what might be valid code. You have
to skip ahead to some other "reliable token" like a closing brace.

On the other hand, since a semicolon IS required to terminate an
expression statement, the compiler can report a missing semicolon
after the "y", pretend it saw one, and continue parsing accurately.
Notice this is a language design issue independent of any particular
grammar or parsing method.
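
A minimal sketch of the recovery strategy described above, assuming a toy
grammar in which every statement has the form "identifier = identifier ;".
The names tokenize, parse, and ParseResult are made up for illustration and
do not come from any real compiler. A missing semicolon is reported once and
then treated as if it were present; any other malformed statement is
reported once and the parser skips ahead to the next semicolon.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy grammar: statement := identifier '=' identifier ';'
struct ParseResult {
    int statements = 0;               // statements accepted, including repaired ones
    std::vector<std::string> errors;  // one message per reported error
};

// Split the input into identifiers and single-character punctuation tokens.
std::vector<std::string> tokenize(const std::string& src) {
    std::vector<std::string> toks;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }
        if (std::isalpha(c) || c == '_') {
            std::size_t j = i;
            while (j < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[j])) || src[j] == '_')) {
                ++j;
            }
            toks.push_back(src.substr(i, j - i));
            i = j;
        } else {
            toks.push_back(std::string(1, src[i]));
            ++i;
        }
    }
    return toks;
}

bool isIdent(const std::string& t) {
    return !t.empty() &&
           (std::isalpha(static_cast<unsigned char>(t[0])) || t[0] == '_');
}

ParseResult parse(const std::string& src) {
    const std::vector<std::string> t = tokenize(src);
    ParseResult r;
    // Bounds-safe lookup: returns an empty string past the end of input.
    auto at = [&](std::size_t k) { return k < t.size() ? t[k] : std::string(); };
    std::size_t i = 0;
    while (i < t.size()) {
        const std::size_t start = i;
        if (isIdent(at(i)) && at(i + 1) == "=" && isIdent(at(i + 2))) {
            i += 3;
            if (at(i) == ";") {
                ++i;
            } else {
                // Missing terminator: report it, pretend it was there, carry on.
                r.errors.push_back("';' expected after '" + t[i - 1] + "'");
            }
            ++r.statements;
        } else {
            // Malformed statement: report once, then resynchronize at the
            // next ';', the "reliable token" of this toy language.
            r.errors.push_back("malformed statement near '" + t[i] + "'");
            while (i < t.size() && t[i] != ";") ++i;
            if (i < t.size()) ++i;  // consume the ';'
        }
        assert(i > start);  // recovery must always consume at least one token
    }
    return r;
}
```

With this sketch, parsing "x = y" followed by "a = b;" yields two accepted
statements and a single "';' expected" message, rather than a cascade of
follow-on errors.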

Similarly, if context determines whether an identifier is a keyword,
it is that much harder for the parser to resynchronize after finding
a syntax error. In the presence of syntax errors, you can't be sure
you are making the right assumptions about context, and thus can't
be sure whether you are looking at an ordinary identifier or a
keyword. That in turn makes further recovery more difficult.
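
The following sketch illustrates the problem under an assumed rule taken
from the is() idea earlier in the thread: "const" and "virtual" count as
keywords only between "is(" and the matching ")". The classify function, the
Token type, and the rule itself are hypothetical. The point is only that a
one-character typo in "is" silently changes how every later word in the
list is classified.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

enum class Kind { Keyword, Identifier, Punct };

struct Token {
    Kind kind;
    std::string text;
};

// Assumed rule: "is" is always reserved; "const" and "virtual" are keywords
// only while we believe we are inside "is( ... )".
std::vector<Token> classify(const std::vector<std::string>& words) {
    std::vector<Token> out;
    bool insideIs = false;
    for (std::size_t i = 0; i < words.size(); ++i) {
        const std::string& w = words[i];
        if (w.empty()) continue;
        const bool wordLike =
            std::isalpha(static_cast<unsigned char>(w[0])) || w[0] == '_';
        if (!wordLike) {
            if (w == ")") insideIs = false;  // leaving the is(...) context
            out.push_back({Kind::Punct, w});
        } else if (w == "is") {
            // The context opens only if "is" is directly followed by "(".
            insideIs = (i + 1 < words.size() && words[i + 1] == "(");
            out.push_back({Kind::Keyword, w});
        } else if (insideIs && (w == "const" || w == "virtual")) {
            out.push_back({Kind::Keyword, w});
        } else {
            out.push_back({Kind::Identifier, w});
        }
    }
    return out;
}
```

Given the words is ( const , virtual ), "virtual" comes back as a keyword.
Given iz ( const , virtual ), the same word comes back as an ordinary
identifier, so the parser has no way to tell a misspelled directive from a
function call with two oddly named arguments.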

> Anyway, I mentioned this because I've noticed that most extensions
> that to C that make up  C++ try to use existing syntax in odd,
> counter-intuitive ways as much as possible.  This avoids adding
> keywords and utilizes expressions that were illegal (good for
> backwards compatibility).  You have to wonder, though, if these same
> guidelines should apply to fresh code that uses no legacy features
> (i.e.:  shouldn't such code be written in C++++?).

I don't think this observation is correct. Most added features of
C++ are not backwards compatible with C, and no attempt was made
to make them backwards compatible. C compilers cannot deal with
general C++ class declarations, scope modifiers, pointers to members,
new/delete, overloaded operator functions, namespaces, exceptions, or
templates, for example.

--
Steve Clamage, stephen.clamage@eng.sun.com





Author: dacut@henry.ece.cmu.edu (David A. Cuthbert)
Date: 1997/05/12
Steve Clamage  <stephen.clamage@eng.sun.com> wrote:
[Excellent discussion of why semicolons are required]
>Similarly, if context determines whether an identifier is a keyword,
>it is that much harder for the parser to resynchronize after finding
>a syntax error. In the presence of syntax errors, you can't be sure
>you are making the right assumptions about context, and thus can't
>be sure whether you are looking at an ordinary identifier or a
>keyword. That in turn makes further recovery more difficult.

Ah, good point.  I don't think that one instance where keywords are
defined by context would make too much of a difference; it would be an
easy extension to the parser.  More than one, though... and trouble is
certainly brewing.

[I originally wrote]
>> Anyway, I mentioned this because I've noticed that most extensions
>> that to C that make up  C++ try to use existing syntax in odd,
>> counter-intuitive ways as much as possible.

>I don't think this observation is correct.

Right; I did a bad cut-and-paste job, and that sentence, aside from
being incredibly difficult to read, reads *nothing* like what I meant.

>Most added features of
>C++ are not backwards compatible with C, and no attempt was made
>to make them backwards compatible.

More along the lines of what I was trying to observe:  the features
added to/proposed for C++ were constructed to make them illegal
constructs in C.  Therefore, it is unlikely that said feature would
cause well-formed C programs to break/have a different meaning under
C++.

This has the disadvantage of forcing C++ users to use syntax that is
occasionally not the most intuitive for expressing the idea they are
trying to convey.

An example of this (IMHO) would be reference variables.  I would argue
that T is a very different type from T&, and that a function's behavior
can be, and usually is, completely changed by which one it takes.
Therefore, the
indicator that you're using a reference should be spelled out (along
the lines of reinterpret_cast, etc.).  Furthermore, & already has the
connotations of taking the address of a variable and putting it in a
pointer (which is very different from a reference).  Thus, "ref"
should be a modifier that can be applied to a type, along the lines of
"const".

Of course, to put this in the language would be silly.  I'm sure that
I've even written C code that includes "void* ref;" to declare a
variable called "ref" in a program somewhere.
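
As a small illustration of the two roles of & mentioned above, here is a
sketch in standard C++. The function names are made up for the example. In a
declarator, & makes the parameter an alias for the caller's object; in an
expression, & takes an address. Note that the by-value and by-reference
calls look identical at the call site.

```cpp
// Receives a copy; the caller's variable is untouched.
void bumpValue(int n) { ++n; }

// '&' in a declarator: n is another name for the caller's object.
void bumpRef(int& n) { ++n; }

// Receives a pointer; the caller must pass an address explicitly.
void bumpPtr(int* n) { ++*n; }

int demo() {
    int x = 0;
    bumpValue(x);  // x is still 0
    bumpRef(x);    // x is now 1, yet the call looks the same as the one above
    bumpPtr(&x);   // '&' in an expression: address-of; x is now 2
    return x;
}
```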
--
David A. Cuthbert (henry.ece.cmu.edu!dacut)
Graduate Student, Electrical and Computer Engineering
Data Storage Systems Center, Carnegie Mellon University





Author: Steve Clamage <stephen.clamage@Eng.Sun.COM>
Date: 1997/05/06
David A. Cuthbert wrote:
>
>
> Which brings me to another question:  what are the guidelines that the
> C++ committee followed concerning syntax parsers?  Would it have
> complicated compilers much if certain keywords were reserved only in
> specific contexts?  Consider an alternate form of declaring a const
> virtual method:
>
> int  myMethod(int, char)  is(const, virtual);
>
> That is, create a new keyword, "is," and allow virtual to be used like
> any other variable outside of the is() directive.
>
> No, I'm not proposing this for C++; just curious as to the issues
> involved.

As best I remember, no one ever seriously considered keywords that
weren't always reserved. Parsing C++ is already difficult, and adding
unnecessary complications isn't going to win any votes. Error reporting
is also better when the program context does not affect which words
are keywords.

--
Steve Clamage, stephen.clamage@eng.sun.com





Author: dacut@henry.ece.cmu.edu (David A. Cuthbert)
Date: 1997/05/05

Fergus Henderson <fjh@mundook.cs.mu.OZ.AU> wrote:
>OK, so how about conditional declarations using the conditional-expression
>syntax (`... ? ... : ...')?

Amusing.  :-)  Tossed it into my compiler, which complained about
wanting to see ')' when it encountered '?', which I assume means that
it was expecting a typedef to a pointer-to-function.

Which brings me to another question:  what are the guidelines that the
C++ committee followed concerning syntax parsers?  Would it have
complicated compilers much if certain keywords were reserved only in
specific contexts?  Consider an alternate form of declaring a const
virtual method:

int  myMethod(int, char)  is(const, virtual);

That is, create a new keyword, "is," and allow virtual to be used like
any other variable outside of the is() directive.

No, I'm not proposing this for C++; just curious as to the issues
involved.
--
David A. Cuthbert (henry.ece.cmu.edu!dacut)
Graduate Student, Electrical and Computer Engineering
Data Storage Systems Center, Carnegie Mellon University