Topic: C++ Grammar


Author: jkanze@otelo.ibmmail.com
Date: 1998/05/18
Raw View
In article <19980517173609.28357@fred.muc.de>,
  ak@muc.de wrote:
>
> On Sun, May 17, 1998 at 04:42:37PM +0200, Valentin Bonnard wrote:
> > Andi Kleen <ak@MUC.DE> writes:
> >
> > > benchmark customer <benchmark.customer@Sun.COM> writes:
> > >
> > > > Would anyone be able to point me to a place on the web
> > > > or a book that defines the C++ grammar in LALR or any other
> > > > format that I can use directly with yacc.
> > > > > I tried the ANSI C++ draft, but it doesn't provide an unambiguous
> > > > > grammar.
> > >
> > > You can't because C++ is not LALR(1). C++ compilers using yacc/bison
> > > usually have to implement some hacks in the scanner to implement an
> > > arbitrary lookahead.
> >
> > Are there any reasons to use a tool if you are going to hack
> > it to force it to do something for what it isn't designed ?
>
> You're misunderstanding. bison/yacc only generate parsers, not scanners
> so there is no "hacking of the tool" involved. Usually one uses flex/lex
> to generate the scanners or writes them by hand. None of them require
> any "hacks" for that.

Neither C nor C++ (nor Pascal, for that matter) is really context free,
so some hacking IS necessary, although with C or Pascal it is usually
limited to the scanner.  Basically, identifiers are looked up in
the scanner, and the token returned depends on the semantic
information associated with the identifier.  This is NOT sufficient
for C++.
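
A minimal sketch of that scanner lookup, assuming a symbol table that is
nothing more than the set of names currently known to denote types.  The
names Tok, SymbolTable, declareType and classify are invented for
illustration and do not come from any particular compiler:

```cpp
#include <cctype>
#include <set>
#include <string>

// Hypothetical token kinds; a real scanner would have many more.
enum class Tok { TypeName, Identifier, Other };

// Hypothetical symbol table: only the set of names currently known to be types.
struct SymbolTable {
    std::set<std::string> typeNames;

    // Called by the parser's semantic actions, e.g. after "typedef int T;".
    void declareType(const std::string& name) { typeNames.insert(name); }

    bool isType(const std::string& name) const {
        return typeNames.count(name) != 0;
    }
};

// Classify one already-delimited lexeme.  The same spelling yields a
// different token depending on what has been declared so far.
Tok classify(const std::string& lexeme, const SymbolTable& symbols) {
    const bool startsLikeName =
        !lexeme.empty() &&
        (std::isalpha(static_cast<unsigned char>(lexeme[0])) || lexeme[0] == '_');
    if (!startsLikeName) return Tok::Other;
    return symbols.isType(lexeme) ? Tok::TypeName : Tok::Identifier;
}
```

Before declareType("T") has been called, classify("T", symbols) yields
Tok::Identifier; afterwards it yields Tok::TypeName, and the grammar can
treat the two as distinct terminals.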

> > I am completely in favor of tools when they handle the language
> > except for trivial ambiguities (like with Pascal), but I don't
> > understand their use for languages like C or worse C++.
>
> C is LL(1) so no ambiguities. I think the standard grammar has the
> dangling else problem, but that can be easily solved.

Nonsense.  C is not even context free, much less LL(1).

I think that the C subset without user defined types (typedef) is
probably LL(1).  For that matter, C++ without user defined types
(typedef, class, struct, union) might be LL(1).  But it isn't C++.

> >
> > But I have never written a C or C++ parser. So I ask to
> > implementors: what do you use (if it isn't a secret) ?
>
> From the C++ parsers I have source for:
>
> - Tendra C++: uses some homegrown LL(1) parser generator and uses the
> backtracking in the scanner technique. For some symbols the scanner starts
> to save token streams, and when the parser recognizes that it can't parse
> a statement it backs up and tries another alternative.
>
> - Open C++: uses a normal bison parser and the same trick. Probably the
> smallest and most readable freely available C++ parser [I have not looked
> at the free PCCTS C++ grammar, that might be readable too]
>
> - GNU C++: uses a rather non standard grammar and uses many tricks in the
> grammar to avoid backtracking, but that seems to cause many problems and
> I hear there are plans for EGCS to rewrite the parser to use the grammar
> from the standard and a scanner backtracking technique too.
>
> Bison was certainly not designed to parse LALR(n), n>1 grammars, but the
> scanner backtracking trick is well understood. I wouldn't call it elegant or
> fast, but it works.

Backtracking in general is well understood, but I think I'd go along with
Valentin on this one.  Making a yacc/lex based parser handle backtracking
is a lot of work, and very error prone and difficult to test.  What's the
point in using a tool which doesn't work?
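
For reference, the token-saving technique under discussion amounts to
something like the following minimal sketch.  It is written by hand rather
than on top of yacc/lex, and the TokenBuffer class, the two toy
alternatives and all of their names are assumptions made for illustration,
not code from any of the parsers mentioned:

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical token stream that keeps every token, so the parser can
// return to an earlier position and try another alternative.
class TokenBuffer {
public:
    explicit TokenBuffer(std::vector<std::string> tokens)
        : tokens_(std::move(tokens)), pos_(0) {}

    std::size_t mark() const { return pos_; }   // remember where we are
    void rewind(std::size_t m) { pos_ = m; }    // back up to a mark
    bool atEnd() const { return pos_ >= tokens_.size(); }

    // Consume the next token if it is exactly `text`.
    bool accept(const std::string& text) {
        if (atEnd() || tokens_[pos_] != text) return false;
        ++pos_;
        return true;
    }

    // Consume the next token if it looks like a name.
    bool acceptName() {
        if (atEnd()) return false;
        const std::string& t = tokens_[pos_];
        const unsigned char c =
            t.empty() ? '\0' : static_cast<unsigned char>(t[0]);
        if (!(std::isalpha(c) || c == '_')) return false;
        ++pos_;
        return true;
    }

private:
    std::vector<std::string> tokens_;
    std::size_t pos_;
};

// Toy alternative 1: a declaration of the form  name name ;
bool parseDeclaration(TokenBuffer& in) {
    return in.acceptName() && in.acceptName() && in.accept(";");
}

// Toy alternative 2: an assignment of the form  name = name ;
bool parseAssignment(TokenBuffer& in) {
    return in.acceptName() && in.accept("=") && in.acceptName() &&
           in.accept(";");
}

// Try each alternative in turn, rewinding the token stream after a failure.
// Returns "declaration", "assignment" or "error".
std::string parseStatement(TokenBuffer& in) {
    const std::size_t start = in.mark();
    if (parseDeclaration(in)) return "declaration";
    in.rewind(start);                 // backtrack and try the next alternative
    if (parseAssignment(in)) return "assignment";
    in.rewind(start);                 // leave the stream where it was
    return "error";
}
```

Even in this toy form, every alternative has to be careful to leave the
stream in a known state, and any semantic actions performed before a
failure would have to be undone as well.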

As to the fact that many parsers do use them, you know the old saying:
if all you have is a hammer, everything looks like a nail.  (From
previous discussions in this and other forums, I gather that most
commercial C++ parsers do NOT use a lex/yacc equivalent.)

--
James Kanze    +33 (0)1 39 23 84 71    mailto: kanze@gabi-soft.fr
        +49 (0)69 66 45 33 10    mailto: jkanze@otelo.ibmmail.com
GABI Software, 22 rue Jacques-Lemercier, 78000 Versailles, France
Conseils en informatique orientée objet --
              -- Beratung in objektorientierter Datenverarbeitung



[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]






Author: benchmark customer <benchmark.customer@Sun.COM>
Date: 1998/05/12
Raw View
Would anyone be able to point me to a place on the web
or a book that defines the C++ grammar in LALR or any other
format that I can use directly with yacc.
I tried the ANSI C++ draft, but it doesn't provide an unambiguous
grammar.

Please send your replies either to this news group or to
   umesh@objectek.com

Thank You
Umesh
---





Author: "Dave J.G." <see@signa.ture>
Date: 1998/05/12
Raw View
> Would anyone be able to point me to a place on the web
> or a book that defines the C++ grammar in LALR or any other
> format that I can use directly with yacc.

Check out Bjarne Stroustrup's The C++ Programming Language. One of the appendices
gives the C++ grammar as defined by the new standard. I'm not sure if it is in the
format that you are looking for, though. It's a great book that I highly recommend
for any C++ programmer, or even for any OO programmer in general. For more info,
check out: http://www.research.att.com/~bs/3rd.html .

 Dave Grossman

--
( reply-to address changed to avoid the spammers,
  use the following e-mail address )
daveg    T unpronounceable D   T com
http://www.unpronounceable.com/daves









Author: Andi Kleen <ak@MUC.DE>
Date: 1998/05/12
Raw View
benchmark customer <benchmark.customer@Sun.COM> writes:

> Would anyone be able to point me to a place on the web
> or a book that defines the C++ grammar in LALR or any other
> format that I can use directly with yacc.
> I tried the ANSI C++ draft, but it doesn't provide an unambiguous
> grammar.

You can't because C++ is not LALR(1). C++ compilers using yacc/bison
usually have to implement some hacks in the scanner to implement an
arbitrary lookahead.

-Andi








Author: "Piet Van Vlierberghe" <pieter.vanvlierberghe@lms.be>
Date: 1998/05/13
Raw View
benchmark customer <benchmark.customer@Sun.COM> wrote in article
<35587697.42C790E9@Sun.COM>...
> Would anyone be able to point me to a place on the web
> or a book that defines the C++ grammar in LALR or any other
> format that I can use directly with yacc.

IMHO there is no such thing. In order to disambiguate the C++ syntax,
information needs to flow from semantic analysis into the parser. Most of
the tokens that are to be found in C++ basically boil down to identifiers,
but making the right choice in a parser can only happen when you know that
an identifier denotes a type, a class, an enumeration value, ... . Example:

identifier1 (identifier2);

This might be
- the creation of a temporary object of class identifier1
- a type conversion of identifier2 to type identifier1
- a call of function identifier1 with parameter identifier2
- a forward declaration of an int function identifier1 with a parameter of
type identifier2
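
A minimal sketch of how the same shape, name(name), changes meaning with
what the names denote.  Widget, Real, twice, helper and demo are
hypothetical names chosen for illustration.  Note that when such a form
stands alone as a statement and could be a declaration, the language treats
it as one, so Widget(w); below declares w rather than creating a temporary:

```cpp
struct Widget { int value; };        // hypothetical class
typedef double Real;                 // hypothetical typedef
int twice(int n) { return 2 * n; }   // hypothetical function

int demo() {
    int n = 21;
    int r = twice(n);    // twice names a function: a call
    Widget(w);           // Widget names a type: declares a variable w
    w.value = r;
    Real x = Real(n);    // Real names a type: a conversion of n
    int helper(Real);    // int, then name(type): declares a function helper
    return w.value + static_cast<int>(x);
}
```

None of these lines can be classified from the token shapes alone; each
reading depends on what the names were declared to be earlier.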

There are numerous other examples to be found in the grammar. Only by
untangling all types of identifiers can you ever hope to make this
unambiguous.
This, amongst others, is a reason why template instantiation is as complex
as it is: some expressions, based on template parameters, might turn out to
be a basic type in one instance, a variable in another instance, and a class
in a third instance. So parsing of templates is even worse, since you
cannot make semantic information flow into your lexical analysis.
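
A minimal sketch of the template case, assuming two hypothetical classes
that both have a member called item.  Inside a template, the parser cannot
look item up, so the language has the programmer say which reading is meant:
a dependent name is taken to be a non-type unless it is prefixed with the
typename keyword.

```cpp
struct HasType  { typedef int item; };   // here item is a type
struct HasValue { enum { item = 6 }; };  // here item is a value

// T::item is declared to be a type by the typename keyword.
template <typename T>
int viaType() {
    typename T::item v = 7;
    return v;
}

// Without typename, T::item is parsed as a value, so this is a product.
template <typename T>
int viaValue(int y) {
    return T::item * y;
}
```

viaType<HasType>() and viaValue<HasValue>(7) both compile; swapping the
two template arguments makes each instantiation fail, because the reading
chosen at parse time no longer matches what item turns out to be.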

The mere fact that the C++ syntax is what it is, results from the strategic
choice to build upon the C syntax.

Hope this helps.
---





Author: "Paul D. DeRocco" <pderocco@ix.netcom.com>
Date: 1998/05/13
Raw View
benchmark customer wrote:
>
> Would anyone be able to point me to a place on the web
> or a book that defines the C++ grammar in LALR or any other
> format that I can use directly with yacc.
> I tried the ANSI C++ draft, but it doesn't provide an unambiguous
> grammar.

I'm no expert, but I'm pretty sure that C++ is beyond the capability of
YACC, since it involves lots of feedback from the semantic processor
back into the grammar. Most notably, if a name refers to a class, it can
be used in places where it couldn't otherwise be used, so the parser
needs to know if a name is a class or not. This can certainly be hacked
into a YACC program, but it can't be described within the grammar
itself.

--

Ciao,
Paul
---





Author: Hubert HOLIN <hh@ArtQuest.fr>
Date: 1998/05/14
Raw View
benchmark customer wrote:
>
> Would anyone be able to point me to a place on the web
> or a book that defines the C++ grammar in LALR or any other
> format that I can use directly with yacc.
> I tried the ANSI C++ draft, but it doesn't provide an unambiguous
> grammar.
>
> Please send your replies either to this news group or to
>    umesh@objectek.com
>
> Thank You
> Umesh

[SNIP]

 It will not work with Yacc (or bison), and I do not (yet) know if it is
up to what's in the (unavailable for mere mortals) FDIS, but there is
the following (C++ grammar for PCCTS by John Lilley) which is not too
far from what you wish:

  http://www.empathy.com/pccts/index.html

 This works with the C++ version of PCCTS (1.33).

  Hubert Holin
  holin@mathp7.jussieu.fr
---





Author: ak@muc.de
Date: 1998/05/17
Raw View
On Sun, May 17, 1998 at 04:42:37PM +0200, Valentin Bonnard wrote:
> Andi Kleen <ak@MUC.DE> writes:
>
> > benchmark customer <benchmark.customer@Sun.COM> writes:
> >
> > > Would anyone be able to point me to a place on the web
> > > or a book that defines the C++ grammar in LALR or any other
> > > format that I can use directly with yacc.
> > > I tried the ANSI C++ draft, but it doesn't provide an unambiguous
> > > grammar.
> >
> > You can't because C++ is not LALR(1). C++ compilers using yacc/bison
> > usually have to implement some hacks in the scanner to implement an
> > arbitrary lookahead.
>
> Are there any reasons to use a tool if you are going to hack
> it to force it to do something for what it isn't designed ?

You're misunderstanding. bison/yacc only generate parsers, not scanners
so there is no "hacking of the tool" involved. Usually one uses flex/lex
to generate the scanners or writes them by hand. None of them require
any "hacks" for that.

>
> I am completely in favor of tools when they handle the language
> except for trivial ambiguities (like with Pascal), but I don't
> understand their use for languages like C or worse C++.

C is LL(1) so no ambiguities. I think the standard grammar has the
dangling else problem, but that can be easily solved.

>
> But I have never written a C or C++ parser. So I ask to
> implementors: what do you use (if it isn't a secret) ?



Author: Valentin Bonnard <bonnardv@pratique.fr>
Date: 1998/05/17
Raw View
Andi Kleen <ak@MUC.DE> writes:

> benchmark customer <benchmark.customer@Sun.COM> writes:
>
> > Would anyone be able to point me to a place on the web
> > or a book that defines the C++ grammar in LALR or any other
> > format that I can use directly with yacc.
> > I tried the ANSI C++ draft, but it doesn't provide an unambiguous
> > grammar.
>
> You can't because C++ is not LALR(1). C++ compilers using yacc/bison
> usually have to implement some hacks in the scanner to implement an
> arbitrary lookahead.

Are there any reasons to use a tool if you are going to hack
it to force it to do something for what it isn't designed ?

I am completely in favor of tools when they handle the language
except for trivial ambiguities (like with Pascal), but I don't
understand their use for languages like C or worse C++.

But I have never written a C or C++ parser. So I ask to
implementors: what do you use (if it isn't a secret) ?

--

Valentin Bonnard                mailto:bonnardv@pratique.fr
info about C++/a propos du C++: http://pages.pratique.fr/~bonnardv/

