Topic: If it looks like a declaration it is.


Author: Michiel Salters<Michiel.Salters@cmg.nl>
Date: Fri, 16 Mar 2001 11:02:48 GMT
In article <986d91$b5t$1@coranto.ucs.mun.ca>, Theo Norvell says...
>
>Hi there.  I'm trying to write a top-down parser for C++.
>I'm coming up against a few problems in resolving
>syntactic ambiguity.  Here is the simplest. In fact this
>is so simple I'm probably missing something very simple.
>
>The problem is simply telling an expression statement
>from a declaration statement. Stroustrup and Ellis give
>a very simple rule:
>   If it looks like a declaration, then it is: otherwise
>   If it looks like an expression [statement] it is: otherwise
>   It is a syntax error.
>They also claim, and the standard agrees, that this
>disambiguation is purely syntactic.

Actually, the standard says that the grammar alone isn't sufficient.
Annex A/1 [gram] states:
"Disambiguation rules (6.8, 7.1, 10.2) must be applied to distinguish
expressions from declarations."

Which to me indicates that the grammar won't do.

Regards,
Michiel Salters

>Cheers,
>Theodore Norvell
>(theo@engr.mun.ca)

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]





Author: Anon Sricharoenchai <ans@beethoven.cpe.ku.ac.th>
Date: Mon, 19 Mar 2001 00:10:05 GMT
Theodore Norvell wrote:

> In a sense you are right.  The grammar is ambiguous, so the
> grammar alone won't do.  By analogy, the grammar for if statements
> is ambiguous:
>         statement --> if( condition ) statement
>                     | if( condition ) statement else statement
>                     | other
> gives two parses to the statement
>         if( a ) if( b ) p() ; else q() ;
>

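But I can write an unambiguous grammar for the if statement.  The
"then" part of any if that has an else must itself be fully matched:

        statement_excluding_if_then -->
                      if (condition) statement_excluding_if_then
                          else statement_excluding_if_then
                    | other
        open_statement --> if (condition) statement
                    | if (condition) statement_excluding_if_then
                          else open_statement
        statement --> statement_excluding_if_then
                    | open_statement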


> but there is a "syntactic" disambiguation rule, which is that
> the second production has higher precedence than the first.
> So that's the sort of thing that I meant by "syntactic".






Author: Theodore Norvell <theo@engr.mun.ca>
Date: Fri, 16 Mar 2001 23:22:49 GMT
Michiel Salters wrote:
>
> In article <986d91$b5t$1@coranto.ucs.mun.ca>, Theo Norvell says...
> >
> >Hi there.  I'm trying to write a top-down parser for C++.
> >I'm coming up against a few problems in resolving
> >syntactic ambiguity.  Here is the simplest. In fact this
> >is so simple I'm probably missing something very simple.
> >
> >The problem is simply telling an expression statement
> >from a declaration statement. Stroustrup and Ellis give
> >a very simple rule:
> >   If it looks like a declaration, then it is: otherwise
> >   If it looks like an expression [statement] it is: otherwise
> >   It is a syntax error.
> >They also claim, and the standard agrees, that this
> >disambiguation is purely syntactic.
>
> Actually, the standard claims that the syntax isn't sufficient:
> In Annex A/1 [gram] it states: "
> Disambiguation rules (6.8, 7.1, 10.2) must be applied to distinguish
> expressions from declarations."

Michiel: Thanks for your reply.

I know that disambiguation rules are required. My point is
that I don't understand the rules, since my example of
 f() ;
is classified as a declaration according to my "syntactic" rephrasing
of Ellis and Stroustrup's rule.  Clearly my understanding of that
rule is flawed and I'd like to know how. (I guess there is also
the remote possibility that both Ellis and Stroustrup and
the standard are flawed on this point.)  As a programmer, my flawed
understanding doesn't bother me, since I think I know what the compiler
is going to do in the cases that matter, even if I don't understand
exactly why.  But as a guy trying to write a parser, it is a problem.

>
> Which to me indicates that the grammar won't do.


In a sense you are right.  The grammar is ambiguous, so the
grammar alone won't do.  By analogy, the grammar for if statements
is ambiguous:
 statement --> if( condition ) statement
                    | if( condition ) statement else statement
                    | other
gives two parses to the statement
 if( a ) if( b ) p() ; else q() ;
but there is a "syntactic" disambiguation rule, which is that
the second production has higher precedence than the first.
So that's the sort of thing that I meant by "syntactic".
Another example is my rephrasing of Stroustrup and Ellis's
rule (see my original post). And that's the sort of thing
that I assume the standard means by the disambiguation
being purely syntactic, as opposed to one where you have to
start applying various "context constraints", such as the
constraint that only in the declaration of constructors,
destructors and conversion operators can the type (and
hence all decl-specifiers) be omitted.  In fact both
Stroustrup and Ellis and the standard give an example of
a declaration that violates a context constraint, but claim
that it is to be classified as a declaration, by their rule;
so it is clear that merely violating a context constraint
is not enough to disqualify something from being interpreted
as a declaration.
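
For what it's worth, the if/else precedence rule mentioned above is easy
to observe from a compiler.  This is only a small sketch; the function
name which_branch is made up for the illustration.  It writes the
ambiguous statement exactly as in the example and records which branch
ran.  If the else belonged to the outer if, the call with a == false
would return "q"; since the else binds to the nearest if, it returns an
empty string.

```cpp
#include <string>

// Hypothetical helper: runs the ambiguous statement from the example and
// reports which branch, if any, was taken.
std::string which_branch(bool a, bool b) {
    std::string out;
    // Deliberately unbraced: the else pairs with the inner `if (b)`.
    if (a) if (b) out += "p"; else out += "q";
    return out;
}
```

Most compilers will warn about the dangling else here, which is expected
for this demonstration.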

Anyway, I think I can solve this one, since declarations
of constructors, destructors, and conversion operators
can't (by context condition) occur where an expression
statement might occur.  So I can use two versions of
the nonterminal simple-declaration, one for inside local
blocks (requiring a decl-specifier-seq) and one for elsewhere.
Then I need two versions of the nonterminal declaration too.

More of a concern to me is the question of when to stop
consuming decl-specifiers in a decl-specifier list.
The obvious rule is to consume as many as possible,
but then you can end up consuming the constructor
name in a constructor.

inline A::B A::f() { ... } ;
inline A::A () { ... } ;

In the first case the decl-specifier-seq is "inline" and "A::B",
and then comes the declarator.  In the second case, only "inline" goes in the
decl-specifier-seq and then you have to start parsing the declarator.
Is there a good rule for a top-down parser to apply to stop adding
things that look like decl-specifiers into the decl-specifier list?
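
One possible heuristic, sketched here under stated assumptions and not
taken from the standard: stop as soon as a qualified name repeats its
last component (as in "A::A") and is followed by "(", and otherwise
accept at most one type name.  The sketch assumes the lexer already
delivers a qualified name as a single token and that the parser has a
set of known type names; names_constructor, decl_specifier_count and the
keyword list are invented for the illustration.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: true when the last two components of a qualified
// name are equal, e.g. "A::A" or "N::A::A".
bool names_constructor(const std::string& q) {
    std::string::size_type pos = q.rfind("::");
    if (pos == std::string::npos) return false;
    std::string last = q.substr(pos + 2);
    std::string prefix = q.substr(0, pos);
    std::string::size_type prev = prefix.rfind("::");
    std::string owner =
        (prev == std::string::npos) ? prefix : prefix.substr(prev + 2);
    return !last.empty() && last == owner;
}

// Returns how many leading tokens are taken as decl-specifiers.
// Assumption: type_names holds every name currently known to be a type.
std::size_t decl_specifier_count(const std::vector<std::string>& toks,
                                 const std::set<std::string>& type_names) {
    static const std::set<std::string> keywords = {
        "inline", "static", "virtual", "extern", "friend", "explicit"};
    std::size_t i = 0;
    bool seen_type = false;
    while (i < toks.size()) {
        const std::string& t = toks[i];
        if (keywords.count(t)) { ++i; continue; }
        bool paren_next = i + 1 < toks.size() && toks[i + 1] == "(";
        if (names_constructor(t) && paren_next) break;  // declarator starts
        if (!seen_type && type_names.count(t)) { seen_type = true; ++i; continue; }
        break;  // anything else begins the declarator
    }
    return i;
}
```

With the two declarations above, this takes two tokens ("inline",
"A::B") in the first case and one ("inline") in the second.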

Cheers,
Theodore Norvell

----------------------------
Dr. Theodore Norvell                                    theo@engr.mun.ca
Electrical and Computer Engineering         http://www.engr.mun.ca/~theo
Engineering and Applied Science                    Phone: (709) 737-8962
Memorial University of Newfoundland                  Fax: (709) 737-4042
St. John's, NF, Canada, A1B 3X5






Author: Theo Norvell <theo@peta.engr.mun.ca>
Date: Wed, 7 Mar 2001 22:54:34 GMT
Hi there.  I'm trying to write a top-down parser for C++.
I'm coming up against a few problems in resolving
syntactic ambiguity.  Here is the simplest. In fact this
is so simple I'm probably missing something very simple.

The problem is simply telling an expression statement
from a declaration statement. Stroustrup and Ellis give
a very simple rule:
   If it looks like a declaration, then it is: otherwise
   If it looks like an expression [statement] it is: otherwise
   It is a syntax error.
They also claim, and the standard agrees, that this
disambiguation is purely syntactic.  I'm going to
take a small leap and interpret this as meaning:
  If it can be parsed according to the syntactic definition
  of a declaration, then it is one: and so on

Fine. Now consider the following, where f is not a type:
      f() ;
Syntactically it is a well formed declaration because the
syntactic definition of simple-declaration is
   simple-declaration:
       decl-specifier-seq_opt init-declarator-list_opt ;
But we all know it is an expression statement.
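
To make the contrast concrete, here is a small sketch of how compilers
treat the two look-alike forms.  The names T, f, calls and demo are
invented for the illustration.  When the leading name is a type, the
statement is taken as a declaration; when it is a function, it is a
call.

```cpp
struct T { int v; };

int calls = 0;
void f() { ++calls; }

int demo() {
    f();      // f is not a type: expression statement, calls f
    T(a);     // T is a type: a declaration, equivalent to `T a;`
    a.v = 7;  // only compiles because the line above declared a
    return a.v;
}
```

Some compilers warn about the redundant parentheses in `T(a);`, but they
accept it as a declaration.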

Now I know that the decl-specifier-seq was not made optional
to support "this sort of declaration". It is only intended to be omitted
for constructors, destructors, and conversion functions. Perhaps I can rewrite
simple-declaration as
   simple-declaration:
       decl-specifier-seq init-declarator-list_opt ;
       constructor-destructor-or-conversion-declarator-list_opt ;
but then I need to split the definition of declarator into two kinds of
declarator and the whole grammar gets more complex, and anyway I'm not
sure I'm not missing something simpler.

A related problem is exemplified by this silly example where A and B
are both types:
    A B ;
Syntactically it's ambiguous as a simple-declaration. Is it a sequence of
two decl-specifiers and an empty declarator list, or is it a single
decl-specifier "A" followed by a single init-declarator "B"? Either way
the declaration is semantically wrong, so you may legitimately be
wondering "why does he care?" It has to do with finding a single
simple (and syntactic) rule about when to stop putting things that
look like decl-specifiers in the decl-specifier-seq.  In the case of
    long unsigned int a ;
you put as many decl-specifiers as you can in the decl-specifier-seq, but
in
    MyClass() ; // Constructor declaration
you omit the decl-specifier-seq.  The change to simple-declaration
suggested above might help here too.  I have a feeling that in
a bottom-up parser, this isn't as much of a problem.
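
A rough sketch of one candidate rule for the three cases above, offered
as an assumption and not as what the standard prescribes: built-in type
keywords may combine freely, at most one user-defined type name is
accepted, and the enclosing class's own name followed by "(" is left for
the declarator.  The function specifier_prefix_length, its parameters
and the keyword tables are all hypothetical.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Returns how many leading tokens go into the decl-specifier-seq.
// enclosing_class is empty when not parsing inside a class body.
std::size_t specifier_prefix_length(const std::vector<std::string>& toks,
                                    const std::set<std::string>& user_types,
                                    const std::string& enclosing_class) {
    static const std::set<std::string> builtin = {
        "long", "short", "signed", "unsigned", "int",
        "char", "float", "double", "bool", "void"};
    static const std::set<std::string> other = {
        "const", "volatile", "static", "extern",
        "inline", "virtual", "typedef", "friend"};
    std::size_t i = 0;
    bool seen_user_type = false;
    bool seen_builtin = false;
    for (; i < toks.size(); ++i) {
        const std::string& t = toks[i];
        if (other.count(t)) continue;
        if (builtin.count(t) && !seen_user_type) { seen_builtin = true; continue; }
        if (user_types.count(t) && !seen_user_type && !seen_builtin) {
            // Class name directly followed by "(": treat as a constructor.
            bool ctor = t == enclosing_class &&
                        i + 1 < toks.size() && toks[i + 1] == "(";
            if (ctor) break;
            seen_user_type = true;
            continue;
        }
        break;  // first token that cannot extend the sequence
    }
    return i;
}
```

Under this rule "long unsigned int a ;" yields three specifiers,
"A B ;" yields one (so B becomes the declarator), and "MyClass ( ) ;"
inside MyClass yields none.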

Any help will be greatly appreciated.
If you post a response, please email me too.

Cheers,
Theodore Norvell
(theo@engr.mun.ca)
