Topic: c++ grammar
Author: skaller@nospam.com.au ("John Max Skaller")
Date: Tue, 18 Jan 2005 01:35:23 GMT Raw View
On Mon, 10 Jan 2005 18:28:27 +0000, Gabriel Dos Reis wrote:
> "John Max Skaller" <skaller@nospam.com.au> writes:
>
> [...]
>
> | > As far as I know, the C++ grammar requires
> | > backtracking. And did in 1994 as well.
> |
> | That depends on how you parse it: I'm playing with
> | a GLR parser (Elkhound) in Felix. GLR doesn't backtrack,
> | it forks the parse stack instead.
>
> And the resulting forest is horrible.
C++ is horrible .. what do you expect :)
And we have yet to see Elkhound
> accept full C++.
Since Scott is no longer working on Elsa, you probably
never will..
Still .. I have yet to see the new g++ parser, the old
one doesn't work either.
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html ]
Author: skaller@nospam.com.au ("John Max Skaller")
Date: Tue, 18 Jan 2005 06:42:33 GMT Raw View
On Mon, 03 Jan 2005 11:30:59 -0600, Pete Becker wrote:
> John Max Skaller wrote:
>> On Wed, 29 Dec 2004 20:33:50 -0600, John Nagle wrote:
>>
>>> There are some serious downsides to a context dependent
>>>grammar.
>>
>>
>> Hehe. You bet. There is no such thing -- which is
>> a fairly major downside :)
>>
>
> Noam Chomsky disagrees with you. The Chomsky hierarchy includes regular
> grammars, context-free grammars, context-sensitive grammars, and
> unrestricted grammars.
Nope. The hierarchy is about *languages*. Grammars stop at
context free. After that you need a Post Production Scheme
which is like a grammar, but has pattern matches on the LHS.
These are basically pattern match based rewrite rules.
A *grammar* requires a nonterminal on the LHS, if a language
has a grammar it is context free, period, end of story.
The context freeness is guarranteed by the LHS being a non-terminal,
the lack of it is indicated by the LHS being a pattern, since
patterns indicate context.
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html ]
Author: gdr@cs.tamu.edu (Gabriel Dos Reis)
Date: Tue, 18 Jan 2005 06:42:48 GMT Raw View
skaller@nospam.com.au ("John Max Skaller") writes:
| > And we have yet to see Elkhound accept full C++.
|
| Since Scott is no longer working on Elsa, you probably
| never will..
When I assisted at his presentation of Elkhound at Barcelona last
year, I had high hopes for a freely available and "independent" parser
targetting standard C++...
| Still .. I have yet to see the new g++ parser, the old
| one doesn't work either.
Which new one? The g++ parser has been rewritten and part of the
production releases since 3.4. GCC-3.4 was released on April 20 2004.
Today is Januray 17 2005.
--
Gabriel Dos Reis
gdr@cs.tamu.edu
Texas A&M University -- Department of Computer Science
301, Bright Building -- College Station, TX 77843-3112
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html ]
Author: petebecker@acm.org (Pete Becker)
Date: Tue, 18 Jan 2005 16:34:39 GMT Raw View
John Max Skaller wrote:
> On Mon, 03 Jan 2005 11:30:59 -0600, Pete Becker wrote:
>>Noam Chomsky disagrees with you. The Chomsky hierarchy includes regular
>>grammars, context-free grammars, context-sensitive grammars, and
>>unrestricted grammars.
>
>
> Nope. The hierarchy is about *languages*. Grammars stop at
> context free.
Since Chomsky called them all "grammars", I suspect he'd disagree with you.
--
Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html ]
Author: jcoffin@taeus.com
Date: Thu, 20 Jan 2005 22:38:11 CST Raw View
[ .. ]
> > Nope. The hierarchy is about *languages*. Grammars stop at
> > context free.
>
> Since Chomsky called them all "grammars", I suspect he'd disagree
with you.
In fact, if grammars were always context-free by definition, then the
phrase "context free" would never have been invented, because it would
have been redundant by definition.
--
Later,
Jerry.
The universe is a figment of its own imagination.
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html ]
Author: gdr@cs.tamu.edu (Gabriel Dos Reis)
Date: Tue, 11 Jan 2005 05:36:49 GMT Raw View
"John Max Skaller" <skaller@nospam.com.au> writes:
| On Sun, 26 Dec 2004 17:17:00 -0600, Andrey Tatarinov wrote:
|
| > Is there a simple example of C++ code that proves that GLR parsing is
| > one that is most suitable for C++ parsing?
|
| I wouldn't claim it is most suitable. Top down parses have
| the advantage they're a bit easier perhaps to construct by
| hand, which means you can hack about to solve the various
| problems C++ presents.
And probably even importantly, good diagnostics.
Alas, "parse error before token <foo>" seems to be the norm rather the
exception now :-(
--
Gabriel Dos Reis
gdr@cs.tamu.edu
Texas A&M University -- Department of Computer Science
301, Bright Building -- College Station, TX 77843-3112
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html ]
Author: "John Max Skaller" <skaller@nospam.com.au>
Date: Sun, 2 Jan 2005 21:30:03 CST Raw View
On Wed, 29 Dec 2004 20:33:50 -0600, John Nagle wrote:
> Actually, the real issue in a grammar is whether there's
> a useful "ground state". All parsers face the problem
> of what to do when a parsing error occurs. The usual
> answer is to force the parser to some arbitrary "ground
> state". That doesn't work well for C++.
Yes, error reporting is quite hard and very
parser technology dependent.
> There are some serious downsides to a context dependent
> grammar.
Hehe. You bet. There is no such thing -- which is
a fairly major downside :)
[A language with a grammar is necessarily context free]
> Because C++ is so hard to parse, there are few tools
> that modify C++ source well. This is a real lack.
Yes it is. Indeed, my C++ compiler (g++ 3.2.2) can't
parse basic C++ correctly [ the expression (X()) fails
to parse even though it is entirely unambiguous]
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html ]
Author: Pete Becker <petebecker@acm.org>
Date: Mon, 3 Jan 2005 11:30:59 CST Raw View
John Max Skaller wrote:
> On Wed, 29 Dec 2004 20:33:50 -0600, John Nagle wrote:
>
>> There are some serious downsides to a context dependent
>>grammar.
>
>
> Hehe. You bet. There is no such thing -- which is
> a fairly major downside :)
>
Noam Chomsky disagrees with you. The Chomsky hierarchy includes regular
grammars, context-free grammars, context-sensitive grammars, and
unrestricted grammars.
--
Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html ]
Author: gdr@cs.tamu.edu (Gabriel Dos Reis)
Date: Mon, 10 Jan 2005 18:28:27 GMT Raw View
"John Max Skaller" <skaller@nospam.com.au> writes:
[...]
| > As far as I know, the C++ grammar requires
| > backtracking. And did in 1994 as well.
|
| That depends on how you parse it: I'm playing with
| a GLR parser (Elkhound) in Felix. GLR doesn't backtrack,
| it forks the parse stack instead.
And the resulting forest is horrible. And we have yet to see Elkhound
accept full C++.
--
Gabriel Dos Reis
gdr@cs.tamu.edu
Texas A&M University -- Department of Computer Science
301, Bright Building -- College Station, TX 77843-3112
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html ]
Author: Paolo Carlini <pcarlini@suse.de>
Date: Wed, 15 Dec 2004 18:20:27 CST Raw View
Gianluca Silvestri wrote:
> ... He also has written
> that it is possible to write a nice recursive-descent parser for C++. That
> was back in 1994. Is it still true?
I'm not sure to fully understand what you are implying, but definitely,
in the words of the author (Mark Mitchell), the current GCC parser in
the C++ front-end is
"... of the standard recursive-descent variety."
Paolo.
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html ]
Author: kanze@gabi-soft.fr
Date: Thu, 16 Dec 2004 10:26:54 CST Raw View
"Gianluca Silvestri" wrote:
> Stroutrup in D&E has written that his mind is biased towards
> top-down parsing and that is revealed in many grammar
> choices. He also has written that it is possible to write a
> nice recursive-descent parser for C++. That was back in
> 1994. Is it still true? I was told that C and C++ grammars
> were difficult beasts.
It depends on what you consider a "nice recursive-descent
parser". As far as I know, the C++ grammar requires
backtracking. And did in 1994 as well.
--
James Kanze GABI Software http://www.gabi-soft.fr
Conseils en informatique orient e objet/
Beratung in objektorientierter Datenverarbeitung
9 place S mard, 78210 St.-Cyr-l' cole, France, +33 (0)1 30 23 00 34
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html ]
Author: "David Crocker" <dcrocker@eschertech.ccoomm>
Date: Mon, 20 Dec 2004 11:49:16 CST Raw View
""Gianluca Silvestri"" <gianluca_silvestri75@yahoo.it> wrote in message
news:31t955F3facr7U1@individual.net...
>
> Hi,
> Stroutrup in D&E has written that his mind is biased towards top-down
> parsing and that is revealed in many grammar choices. He also has written
> that it is possible to write a nice recursive-descent parser for C++. That
> was back in 1994. Is it still true? I was told that C and C++ grammars
were
> difficult beasts.
>
> Thanks,
> Gianluca Silvestri
>
>
> ---
> [ comp.std.c++ is moderated. To submit articles, try just posting with ]
> [ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
> [ --- Please see the FAQ before posting. --- ]
> [ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html ]
There are 2 widely-used families of programming language grammars: the LL
family (LL1, LL2 etc.) and the LR family (LR1, LR2, also derivatives like
LALR1 and SLR1). One usually sees parsers for LL grammars described as
top-down, and LR parsers described as bottom-up. Another way of looking at
it is that both sorts of parser are top-down, but the LR parser delays
making decisions until there is only one possible interpretation of the
sentence parsed so far.
LL parsers are typically implemented by handwritten recursive-descent
parsers. LR grammars are invariably implemented by table-driven parsers
generated from the grammar by a parser-generating program such as yacc or
bison. In general, LR grammars and parsers are more powerful than LL
grammars and parsers (that is, they can handle a wider range of languages).
The grammar for ANSI C is not too bad. Provided that names defined by
"typedef" are immediately flagged so that they are recognised as type names
when they are encountered later in the parse, the ANSI C language can be
described by an LALR1 grammar with just one ambuguity (the dangling "else"
problem). So building a parser for ANSI C is very easy using yacc, bison or
a similar tool.
Parsing C++ is much harder because the language is genuinely ambiguous at
the syntactic level. For example, there are some C++ constructs that can be
interpreted as either a declaration or an expression. One solution is to
rewrite the grammar to include these ambugious constructs (e.g. invent a new
construct called DeclarationOrExpression) and decide after the parse is
finished how they should be interpreted.
I gave up writing recursive descent parsers many years ago when parser
generators became available; but from a maintenance viewpoint, I can't
imagine any handwritten recursive descent parser for a language as big as
C++ being "nice".
David Crocker
www.eschertech.com
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html ]
Author: Andrey Tatarinov <elephantum@dezcom.mephi.ru>
Date: Sun, 26 Dec 2004 17:17:00 CST Raw View
> That depends on how you parse it: I'm playing with
> a GLR parser (Elkhound) in Felix. GLR doesn't backtrack,
> it forks the parse stack instead.
Is there a simple example of C++ code that proves that GLR parsing is
one that is most suitable for C++ parsing?
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html ]
Author: "John Max Skaller" <skaller@nospam.com.au>
Date: Wed, 29 Dec 2004 11:42:43 CST Raw View
On Sun, 26 Dec 2004 17:17:00 -0600, Andrey Tatarinov wrote:
> Is there a simple example of C++ code that proves that GLR parsing is
> one that is most suitable for C++ parsing?
I wouldn't claim it is most suitable. Top down parses have
the advantage they're a bit easier perhaps to construct by
hand, which means you can hack about to solve the various
problems C++ presents.
On the other hand, generalised parsers such as GLR or
Earley can handle arbitrary context free languages with
a suitable grammar, and grammars being declarative languages
are more likely to work than hackery.
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html ]
Author: Hyman Rosen <hyrosen@mail.com>
Date: Wed, 29 Dec 2004 17:33:00 CST Raw View
John Max Skaller wrote:
> Top down parses have the advantage they're a bit easier
> perhaps to construct by hand
One of the major advantages of a top-down handcrafted
parser is that it makes it easier to generate sensible
error messages and diagnostics for ill-formed programs.
For example, GNAT (the GNU Ada compiler) uses such a
parser and, from what I understand, generates the best
error messages in the business.
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html ]
Author: John Nagle <nagle@animats.com>
Date: Wed, 29 Dec 2004 20:33:50 CST Raw View
Actually, the real issue in a grammar is whether there's
a useful "ground state". All parsers face the problem
of what to do when a parsing error occurs. The usual
answer is to force the parser to some arbitrary "ground
state". That doesn't work well for C++.
The Pascal/Modula/Ada family of languages have
context independent grammars; you can parse them based
on the reserved words and delimiters only. That's not
true of C++. It was true of original C (not K&R C, but
the C that came with V6 UNIX. circa 1978), but as soon
as "typedef" went in, the grammar lost that property.
There are some serious downsides to a context dependent
grammar. One of them is that you can't parse a C++ file
without seeing its include files. People try, in
indenters and such, but it can't be done reliably.
Because C++ is so hard to parse, there are few tools
that modify C++ source well. This is a real lack.
John Nagle
Hyman Rosen wrote:
> John Max Skaller wrote:
>
>> Top down parses have the advantage they're a bit easier
>
> > perhaps to construct by hand
>
> One of the major advantages of a top-down handcrafted
> parser is that it makes it easier to generate sensible
> error messages and diagnostics for ill-formed programs.
> For example, GNAT (the GNU Ada compiler) uses such a
> parser and, from what I understand, generates the best
> error messages in the business.
>
> ---
> [ comp.std.c++ is moderated. To submit articles, try just posting with ]
> [ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
> [ --- Please see the FAQ before posting. --- ]
> [ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html ]
>
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html ]
Author: gianluca_silvestri75@yahoo.it ("Gianluca Silvestri")
Date: Fri, 10 Dec 2004 18:31:05 GMT Raw View
Hi,
Stroutrup in D&E has written that his mind is biased towards top-down
parsing and that is revealed in many grammar choices. He also has written
that it is possible to write a nice recursive-descent parser for C++. That
was back in 1994. Is it still true? I was told that C and C++ grammars were
difficult beasts.
Thanks,
Gianluca Silvestri
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html ]