Thread

Topic: Operators and tokens

Author: Robert Klemme <bob.news@gmx.net>
Date: Tue, 9 Jul 2002 16:17:21 GMT


John the newbie wrote:
> > No.  The preprocessing token '(' is converted into the token '(', and
> > so on.  The difference between preprocessing tokens and tokens is
> > very subtle: the preprocessor doesn't recognize keywords, for example,
> > so as a preprocessing token, "while" is an identifier, whereas as a
> > token, it is a keyword.
>
> Thus, according to the classification in 2.6 there's no () operator
> but there are two punctuators, right?

correct.

> strlen (  str  );
>  |     |   |   |
>  |     |   |   -------- punctuator )
>  |     |   ------------------------- identifier
>  |     |------------------------------------------- punctuator (
>  |-- identifier
>
> And what if I write f(), i.e. with nothing between the parentheses? Are
> the '(' and ')' still two separate tokens?

yes.

> If so, and if I understand you
> correctly, the only "operators" that are recognized as "operator
> tokens" are those formed by a single character +, -, etc...

there are some more; section 2.12 lists all of the operators and
punctuators:

[lex.operators] 2.12 Operators and punctuators
1 The lexical representation of C++ programs includes a number of
preprocessing tokens which are used in the syntax of the preprocessor
or are converted into tokens for operators and punctuators:

preprocessing-op-or-punc: one of

{ } [ ] # ## ( )
<: :> <% %> %: %:%: ; : ...
new delete ? :: . .*
+ - * / % ^ & | ~
! = < > += -= *= /= %=
^= &= |= << >> >>= <<= == !=
<= >= && || ++ -- , ->* ->
and and_eq bitand bitor compl not not_eq
or or_eq xor xor_eq

Each preprocessing-op-or-punc is converted to a single token in
translation phase 7 (2.1).
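
To make the list above concrete, here is a minimal sketch of how a lexer might split source text into preprocessing tokens, always taking the longest operator or punctuator that matches. The names `PpToken` and `tokenize` are hypothetical, the operator table is only a subset of the 2.12 list (no `new`, `delete`, digraphs or alternative tokens), and literals and comments are ignored; it is an illustration, not what any particular compiler does.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical token record: a coarse category plus the spelling.
struct PpToken {
    std::string kind;   // "identifier", "pp-number", "op-or-punc" or "other"
    std::string text;
};

static bool is_ident_char(char ch) {
    return std::isalnum(static_cast<unsigned char>(ch)) || ch == '_';
}

// Split `src` into preprocessing tokens, preferring the longest operator
// or punctuator at each position (so ">>=" is one token, not three).
std::vector<PpToken> tokenize(const std::string& src) {
    // Subset of the 2.12 list, ordered from longest to shortest spelling.
    static const std::vector<std::string> ops = {
        ">>=", "<<=", "->*", "...",
        "->", "++", "--", "<<", ">>", "<=", ">=", "==", "!=", "&&", "||",
        "::", "+=", "-=", "*=", "/=",
        "(", ")", "[", "]", "{", "}", ";", ":", "?", ",", ".",
        "+", "-", "*", "/", "%", "<", ">", "=", "!", "&", "|", "^", "~"
    };
    std::vector<PpToken> out;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }
        if (std::isalpha(c) || c == '_') {          // identifier
            std::size_t j = i;
            while (j < src.size() && is_ident_char(src[j])) ++j;
            out.push_back({"identifier", src.substr(i, j - i)});
            i = j;
            continue;
        }
        if (std::isdigit(c)) {                      // simplified pp-number
            std::size_t j = i;
            while (j < src.size() && is_ident_char(src[j])) ++j;
            out.push_back({"pp-number", src.substr(i, j - i)});
            i = j;
            continue;
        }
        bool matched = false;
        for (const std::string& op : ops) {         // longest spellings first
            if (src.compare(i, op.size(), op) == 0) {
                out.push_back({"op-or-punc", op});
                i += op.size();
                matched = true;
                break;
            }
        }
        if (!matched) {                             // any other single character
            out.push_back({"other", std::string(1, src[i])});
            ++i;
        }
    }
    return out;
}
```

Under these assumptions, `tokenize("strlen (  str  );")` yields five tokens, with `(` and `)` as two separate op-or-punc entries, and `tokenize("f()")` yields three.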

> (For
> instance what we know as "function call operator" is actually two
> tokens, none of which is an operator according to 2.6). Did I get your
> explanation?

well, maybe we should be more precise here: there are operators
as elements of the syntax (see above).  these are only symbols.
and then there are operators as elements of the language, i.e.,
they have a defined meaning and semantics.

consequently, during the earlier translation phases there are only
symbols (e.g. "someIdent" "(" ")" ).  later on, when the parse tree
is built (see my first posting in this thread), there is a single
node in the tree which is the root of the subtree that produces this
token sequence.  that node is typically used for the computations
that ultimately lead to the generated code that realizes this
operator call.

> > I don't know of one; I learned about compilers long before there was
> > an Internet.
>
> Why? There was something before? :-)

i think there were some carvings in stone...  :-))

> >  When I was learning, the dragon book ("Compilers:
> > Principles, Techniques, and Tools", by Aho, Sethi and Ullman) was
> > *the* reference.  I don't know if this is still the case, but I don't
> > think that the basics of parsing have changed that much since then.
>
> Thanks a lot. I'll look for it on Amazon!

i posted the link already in my first posting.

regards

 robert

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]

Author: Robert Klemme <bob.news@gmx.net>
Date: Tue, 9 Jul 2002 16:17:01 GMT

hi gennaro

Gennaro Prota wrote:
> Please don't be offended, but I'm rapidly losing interest because you
> seem more desirous of having your own way than to discuss C++.

i'm sorry if i was being rude.  sometimes i get frustrated when
the other person does not seem to understand what i mean - but this
is really my fault.  with all the obstacles (only text, not a native
speaker) it sometimes gets quite hard to say what one means...

> The standard never speaks of "preprocessing phase". It simply states
> that there are 9 phases of translation (2.1); it doesn't give them a
> name. So, strictly speaking, we should have both defined what we meant
> by "preprocessing phase".

that's true.

> In the common use, which is what I was
> adopting too, the preprocessing phase is phase 4. The previous phase
> (phase 3) is commonly known as the "tokenization phase". Now, what I
> was pointing out is that your wording seemed to imply that tokens are
> formed in phase 4, while as I wrote, they are formed - except for a
> few situations - in phase 3.

if my wording did indeed imply this, i'm sorry.  that was not
what i wanted to say.

> Of course, not even the expression "output of a phase of translation"
> is defined by the standard, we were speaking a little informally.
> Anyhow, it seems to me that there's a little difference in meaning
> between saying that a given phase outputs prep-tokens (something I
> suppose you would say of phases 5 and 6 too) and saying that
> prep-tokens are *the result* of that phase (which would imply that the
> production of that tokens is exactly what the phase is required to
> do).

from the perspective of the following phase it really does not
matter who created those tokens in the first place; they are
simply the result (or call it "output") of the preceding phase,
whatever means it employed to create this result.  so in
effect, i don't share your linguistic distinction, but it seems
we were trying to say the same thing.  :-)

> If that's not what you meant then it's all right for me. The purpose
> of newsgroup discussions is to share ideas and corrections, not to
> spot people's mistakes at all costs or saying they are wrong playing
> on an unintended nuance of meaning in what they wrote.

how true.

kind regards

 robert


Author: whatiscpp@yahoo.com (John the newbie)
Date: Fri, 5 Jul 2002 22:36:01 GMT

kanze@gabi-soft.de (James Kanze) wrote in message news:<d6651fb6.0207050353.2326582a@posting.google.com>...
> whatiscpp@yahoo.com (John the newbie) wrote in message
> news:<102a8848.0206291109.386d93e9@posting.google.com>...
> > kanze@gabi-soft.de (James Kanze) wrote in message
> > news:<d6651fb6.0206240356.19edff7e@posting.google.com>...
> > > whatiscpp@yahoo.com (John the newbie) wrote in message
> > > news:<102a8848.0206230906.79905ef@posting.google.com>...
>
> > > > section 2.1 of the standard says that each preprocessing token
> > > > is converted in tokens. Consider the following:
>
> > > > char str[] = "test";
> > > > std::strlen (str);
> > > > //          1   2
>
> > > > Here, 1 and 2 are two separate tokens. How they become the ()
> > > > operator?
>
> > > The same way [ and ] become the [] operator, or ? and : become the
> > > ?: operator.  And operator is an abstract construct, which doesn't
> > > necessarily map directly into a single token.
>
> > Thanks. What puzzles me is the sentence in 2.1, point 7 that states:
> > "Each preprocessing token is converted into a token". Is this poor
> > language?
>
> No.  The preprocessing token '(' is converted into the token '(', and
> so on.  The differences between preprocessing tokens and tokens is
> very subtle: the preprocessor doesn't recognize keywords, for example,
> so as a preprocessing token, "while" is an indentifier, whereas as a
> token, it is a keyword.

Thus, according to the classification in 2.6 there's no () operator
but there are two punctuators, right?

strlen (  str  );
 |     |   |   |
 |     |   |   -------- punctuator )
 |     |   ------------------------- identifier
 |     |------------------------------------------- punctuator (
 |-- identifier


And what if I write f(), i.e. with nothing between the parentheses? Are
the '(' and ')' still two separate tokens? If so, and if I understand
you correctly, the only "operators" that are recognized as "operator
tokens" are those formed by a single character: +, -, etc. (For
instance, what we know as the "function call operator" is actually two
tokens, neither of which is an operator according to 2.6.) Did I get
your explanation?


> You seem to be confusing "operator" with "token".  These are two
> separate concepts, and there is no problem with an operator requiring
> several tokens.
>
> > P.S.: Robert Klemme's explanation is really too difficult for me. Is
> > there any Internet resource where I can learn that kind of things?
>
> I don't know of one; I learned about compilers long before there was
> an Internet.

Why? There was something before? :-)

>  When I was learning, the dragon book ("Compilers:
> Principles, Techniques, and Tools", by Aho, Sethi and Ullman) was
> *the* reference.  I don't know if this is still the case, but I don't
> think that the basics of parsing have changed that much since then.

Thanks a lot. I'll look for it on Amazon!


Author: Gennaro Prota <gennaro_prota@yahoo.com>
Date: Sat, 6 Jul 2002 06:20:18 GMT

On Thu,  4 Jul 2002 17:53:37 GMT, Robert Klemme <bob.news@gmx.net>
wrote:

>
>
>Gennaro Prota schrieb:
>> This confirm that you have not a clear understanding of what the
>> preprocessing phase does. See below.
>
>[snip]
>
>> >well, how exactly would you call what the lexical analyser does?
>> >does he put the tokens into the source file?  of course "find" is
>> >not the correct technical term, but i think the informal
>> >description is close enough.  you can replace the sentence with
>> >"tokens that the lexical analyser synthesises directly from the
>> >source file".
>>
>> The remark about the term "find" was indeed marginal.
>
>ok.
>
>> My main
>> objection was against referring to preprocessing tokens as to "what
>> the preprocessor outputs". Except for a few cases dealing with the #
>> and ## operators and, conceptually, with #include directives, they are
>> formed in the previous phase of translation (though they survive,
>> eventually rearranged, to the preprocessing phase and so also belong
>> to its "output").
>
>thank you for confirming that i was right - in spite of what you
>say in the beginning.  these things ("preprocessing tokens") are
>indeed that, what the preprocessor outputs.  of course he first
>has to recognize in the source, so the come into existence in the
>preprocessing phase.  but then they are output to the next
>compiler phase.  you rephrased exactly what i wrote in other
>words.

Please don't be offended, but I'm rapidly losing interest because you
seem more intent on having your own way than on discussing C++.

The standard never speaks of a "preprocessing phase". It simply states
that there are 9 phases of translation (2.1); it doesn't give them
names. So, strictly speaking, we should both have defined what we meant
by "preprocessing phase". In common usage, which is what I was
adopting too, the preprocessing phase is phase 4. The previous phase
(phase 3) is commonly known as the "tokenization phase". Now, what I
was pointing out is that your wording seemed to imply that tokens are
formed in phase 4, while, as I wrote, they are formed - except for a
few situations - in phase 3.

Of course, not even the expression "output of a phase of translation"
is defined by the standard; we were speaking a little informally.
Anyhow, it seems to me that there's a little difference in meaning
between saying that a given phase outputs prep-tokens (something I
suppose you would say of phases 5 and 6 too) and saying that
prep-tokens are *the result* of that phase (which would imply that the
production of those tokens is exactly what the phase is required to
do).

If that's not what you meant then it's all right with me. The purpose
of newsgroup discussions is to share ideas and corrections, not to
spot people's mistakes at all costs or to say they are wrong by playing
on an unintended nuance of meaning in what they wrote.

Genny.


Author: James Dennett <jdennett@acm.org>
Date: Mon, 8 Jul 2002 16:01:39 GMT

Gennaro Prota wrote:
> On Thu,  4 Jul 2002 17:53:37 GMT, Robert Klemme <bob.news@gmx.net>
> wrote:
>
>
>>
>>Gennaro Prota schrieb:
>>
>>>My main
>>>objection was against referring to preprocessing tokens as to "what
>>>the preprocessor outputs". Except for a few cases dealing with the #
>>>and ## operators and, conceptually, with #include directives, they are
>>>formed in the previous phase of translation (though they survive,
>>>eventually rearranged, to the preprocessing phase and so also belong
>>>to its "output").
>>
>>thank you for confirming that i was right - in spite of what you
>>say in the beginning.  these things ("preprocessing tokens") are
>>indeed that, what the preprocessor outputs.  of course he first
>>has to recognize in the source, so the come into existence in the
>>preprocessing phase.  but then they are output to the next
>>compiler phase.  you rephrased exactly what i wrote in other
>>words.
>

[snip]

>
> The standard never speaks of "preprocessing phase". It simply states
> that there are 9 phases of translation (2.1); it doesn't give them a
> name. So, strictly speaking, we should have both defined what we meant
> by "preprocessing phase".

That would have avoided this discussion, yes.

> In the common use, which is what I was
> adopting too, the preprocessing phase is phase 4.

I don't believe that I have come across this use before.
To me, the "preprocessing phase" is that which has often
been implemented in a separate preprocessor, which covers
phases 1--6.  To me it makes little sense to say that
there is any processing *before* preprocessing, but here
it's just down to what you think the words mean.  We
disagree.  That's fine.  I don't think we disagree on
what 2.1 [lex.phases] has to say, only on how we try to
translate it into informal English.

> The previous phase
> (phase 3) is commonly known as the "tokenization phase". Now, what I
> was pointing out is that your wording seemed to imply that tokens are
> formed in phase 4, while as I wrote, they are formed - except for a
> few situations - in phase 3.
>
> Of course, not even the expression "output of a phase of translation"
> is defined by the standard, we were speaking a little informally.
> Anyhow, it seems to me that there's a little difference in meaning
> between saying that a given phase outputs prep-tokens (something I
> suppose you would say of phases 5 and 6 too) and saying that
> prep-tokens are *the result* of that phase (which would imply that the
> production of that tokens is exactly what the phase is required to
> do).

Here it's an argument about what words mean.  I've mostly
implemented a C++ preprocessor, and I would certainly think
in terms of each phase having a particular type of output,
not being put off if it was of the same form as the input.
To my mind, saying that some phase outputs pptokens does
not imply in the slightest that it created those pptokens,
only that it passes them on to the next phase for processing.
The "result" of a phase, to me, is that phase's output.

> If that's not what you meant then it's all right for me. The purpose
> of newsgroup discussions is to share ideas and corrections, not to
> spot people's mistakes at all costs or saying they are wrong playing
> on an unintended nuance of meaning in what they wrote.

Indeed.

--
James Dennett <jdennett@acm.org>


Author: Robert Klemme <bob.news@gmx.net>
Date: Wed, 3 Jul 2002 15:50:13 GMT

Gennaro Prota wrote:
>
> On 2 Jul 2002 07:35:01 GMT, Robert Klemme <bob.news@gmx.net> wrote:
>
> >
> >hi john
> >
> >John the newbie schrieb:
> >> Thanks. What puzzles me is the sentence in 2.1, point 7 that states:
> >> "Each preprocessing token is converted into a token". Is this poor
> >> language?
> >
> >i think this just wants to make clear, that everything that the
> >preprocessor outputs, is parsed the same way as tokens that are
> >found literally in a source file.
>
> Fortunately you think wrong ;-)

well, thank you!

> Preprocessing tokens are not the
> result of the preprocessing phase,

what exactly are they otherwise?  the standard reads:

<quote>
2.4 Preprocessing tokens

1. Each preprocessing token that is converted to a token (2.6) shall
have the lexical form of a keyword, an identifier, a literal, an
operator, or a punctuator.

2. [...] The categories of preprocessing token are: header names,
identifiers, preprocessing numbers, character literals, string
literals, preprocessing-op-or-punc, and single non-white-space
characters that do not lexically match the other preprocessing token
categories.

2.6 Tokens

1. There are five kinds of tokens: identifiers, keywords, literals,
operators, and other separators.
</quote>

in phase 7 "preprocessing tokens" are converted into "tokens",
which need not mean any textual manipulation, but
simply a change of type.

> and there are no tokens that are
> "found" in a source file.

well, what exactly would you call what the lexical analyser does?
does it put the tokens into the source file?  of course "find" is
not the correct technical term, but i think the informal
description is close enough.  you can replace the sentence with
"tokens that the lexical analyser synthesises directly from the
source file".

regards

 robert


Author: Gennaro Prota <gennaro_prota@yahoo.com>
Date: Wed, 3 Jul 2002 16:40:24 GMT

On Wed,  3 Jul 2002 15:50:13 GMT, Robert Klemme <bob.news@gmx.net>
wrote:

>
>
>Gennaro Prota schrieb:
>>
>> On 2 Jul 2002 07:35:01 GMT, Robert Klemme <bob.news@gmx.net> wrote:
>>
>> >
>> >hi john
>> >
>> >John the newbie schrieb:
>> >> Thanks. What puzzles me is the sentence in 2.1, point 7 that states:
>> >> "Each preprocessing token is converted into a token". Is this poor
>> >> language?
>> >
>> >i think this just wants to make clear, that everything that the
>> >preprocessor outputs, is parsed the same way as tokens that are
>> >found literally in a source file.
>>
>> Fortunately you think wrong ;-)
>
>well, thank you!
>
>> Preprocessing tokens are not the
>> result of the preprocessing phase,
>
>what exactly are they otherwise?

This confirms that you do not have a clear understanding of what the
preprocessing phase does. See below.

>[snip...]

>in phase 7 "preprocessing tokens" are converted into "tokens",
>which need not necessarily mean any textual manipulation but
>simply change of type.
>
>> and there are no tokens that are
>> "found" in a source file.
>
>well, how exactly would you call what the lexical analyser does?
>does he put the tokens into the source file?  of course "find" is
>not the correct technical term, but i think the informal
>description is close enough.  you can replace the sentence with
>"tokens that the lexical analyser synthesises directly from the
>source file".


The remark about the term "find" was indeed marginal. My main
objection was to referring to preprocessing tokens as "what the
preprocessor outputs". Except for a few cases dealing with the # and
## operators and, conceptually, with #include directives, they are
formed in the previous phase of translation (though they survive,
possibly rearranged, into the preprocessing phase and so also belong
to its "output").
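
The exceptions mentioned above can be illustrated with a small sketch. The macro names `PASTE` and `STRINGIZE` and the variable `counter42` are made up for this example; the point is only that `##` and `#` produce preprocessing tokens during macro expansion (phase 4) that were not present after tokenization (phase 3).

```cpp
#include <cassert>
#include <cstring>

// Phase 3 has already split the macro arguments into preprocessing tokens.
// In phase 4, `##` pastes two of them into one new preprocessing token.
#define PASTE(a, b) a##b

// In phase 4, `#` turns the argument's tokens into a new string literal.
#define STRINGIZE(x) #x

int counter42 = 7;   // hypothetical variable, named so that PASTE can form it

int read_pasted() {
    // `counter` (an identifier) and `42` (a pp-number) were separate
    // preprocessing tokens; pasting forms the single identifier `counter42`.
    return PASTE(counter, 42);
}

const char* stringized() {
    // Produces the string literal "a + b", which did not exist in phase 3.
    return STRINGIZE(a + b);
}
```

With these definitions, `read_pasted()` returns 7 and `stringized()` points to the text `a + b`. Every other token in the file passes through phase 4 in the form it was given in phase 3.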


Genny.


Author: Robert Klemme <bob.news@gmx.net>
Date: Thu, 4 Jul 2002 17:53:37 GMT


Gennaro Prota wrote:
> This confirms that you do not have a clear understanding of what the
> preprocessing phase does. See below.

[snip]

> >well, how exactly would you call what the lexical analyser does?
> >does he put the tokens into the source file?  of course "find" is
> >not the correct technical term, but i think the informal
> >description is close enough.  you can replace the sentence with
> >"tokens that the lexical analyser synthesises directly from the
> >source file".
>
> The remark about the term "find" was indeed marginal.

ok.

> My main
> objection was against referring to preprocessing tokens as to "what
> the preprocessor outputs". Except for a few cases dealing with the #
> and ## operators and, conceptually, with #include directives, they are
> formed in the previous phase of translation (though they survive,
> eventually rearranged, to the preprocessing phase and so also belong
> to its "output").

thank you for confirming that i was right - in spite of what you
say in the beginning.  these things ("preprocessing tokens") are
indeed what the preprocessor outputs.  of course it first
has to recognize them in the source, so they come into existence in
the preprocessing phase.  but then they are output to the next
compiler phase.  you rephrased exactly what i wrote, in other
words.

regards

 robert


Author: kanze@gabi-soft.de (James Kanze)
Date: Fri, 5 Jul 2002 16:27:34 GMT

whatiscpp@yahoo.com (John the newbie) wrote in message
news:<102a8848.0206291109.386d93e9@posting.google.com>...
> kanze@gabi-soft.de (James Kanze) wrote in message
> news:<d6651fb6.0206240356.19edff7e@posting.google.com>...
> > whatiscpp@yahoo.com (John the newbie) wrote in message
> > news:<102a8848.0206230906.79905ef@posting.google.com>...

> > > section 2.1 of the standard says that each preprocessing token
> > > is converted in tokens. Consider the following:

> > > char str[] = "test";
> > > std::strlen (str);
> > > //          1   2

> > > Here, 1 and 2 are two separate tokens. How they become the ()
> > > operator?

> > The same way [ and ] become the [] operator, or ? and : become the
> > ?: operator.  And operator is an abstract construct, which doesn't
> > necessarily map directly into a single token.

> Thanks. What puzzles me is the sentence in 2.1, point 7 that states:
> "Each preprocessing token is converted into a token". Is this poor
> language?

No.  The preprocessing token '(' is converted into the token '(', and
so on.  The difference between preprocessing tokens and tokens is
very subtle: the preprocessor doesn't recognize keywords, for example,
so as a preprocessing token, "while" is an identifier, whereas as a
token, it is a keyword.
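
As a rough illustration of that conversion, the following sketch reclassifies an identifier-shaped preprocessing token when it becomes a token in phase 7. The names `TokenKind` and `classify_identifier` are hypothetical, and the keyword list is deliberately incomplete.

```cpp
#include <cassert>
#include <set>
#include <string>

enum class TokenKind { Keyword, Identifier };

// Phase 7 sketch: a preprocessing token with the form of an identifier
// becomes a keyword token if its spelling is reserved, and an identifier
// token otherwise. The spelling itself is not changed.
TokenKind classify_identifier(const std::string& spelling) {
    // Incomplete keyword list, for illustration only.
    static const std::set<std::string> keywords = {
        "while", "if", "else", "for", "return", "int", "char", "class"
    };
    return keywords.count(spelling) ? TokenKind::Keyword
                                    : TokenKind::Identifier;
}
```

Here `classify_identifier("while")` gives `TokenKind::Keyword`, while `classify_identifier("strlen")` gives `TokenKind::Identifier`; before phase 7 both would simply be identifiers.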

You seem to be confusing "operator" with "token".  These are two
separate concepts, and there is no problem with an operator requiring
several tokens.

> P.S.: Robert Klemme's explanation is really too difficult for me. Is
> there any Internet resource where I can learn that kind of things?

I don't know of one; I learned about compilers long before there was
an Internet.  When I was learning, the dragon book ("Compilers:
Principles, Techniques, and Tools", by Aho, Sethi and Ullman) was
*the* reference.  I don't know if this is still the case, but I don't
think that the basics of parsing have changed that much since then.

--
James Kanze                           mailto:jkanze@caicheuvreux.com
Conseils en informatique orientée objet/
                    Beratung in objektorientierter Datenverarbeitung


Author: whatiscpp@yahoo.com (John the newbie)
Date: Mon, 1 Jul 2002 22:11:34 GMT

kanze@gabi-soft.de (James Kanze) wrote in message news:<d6651fb6.0206240356.19edff7e@posting.google.com>...
> whatiscpp@yahoo.com (John the newbie) wrote in message
> news:<102a8848.0206230906.79905ef@posting.google.com>...
>
> > section 2.1 of the standard says that each preprocessing token is
> > converted in tokens. Consider the following:
>
> > char str[] = "test";
> > std::strlen (str);
> > //          1   2
>
> > Here, 1 and 2 are two separate tokens. How they become the ()
> > operator?
>
> The same way [ and ] become the [] operator, or ? and : become the ?:
> operator.  And operator is an abstract construct, which doesn't
> necessarily map directly into a single token.
>

Thanks. What puzzles me is the sentence in 2.1, point 7 that states:
"Each preprocessing token is converted into a token". Is this poor
language?

P.S.: Robert Klemme's explanation is really too difficult for me. Is
there any Internet resource where I can learn that kind of thing?


Author: Robert Klemme <bob.news@gmx.net>
Date: Tue, 2 Jul 2002 02:32:34 CST

hi john

John the newbie wrote:
> Thanks. What puzzles me is the sentence in 2.1, point 7 that states:
> "Each preprocessing token is converted into a token". Is this poor
> language?

i think this just wants to make clear that everything the
preprocessor outputs is parsed the same way as tokens that are
found literally in a source file.

> P.S.: Robert Klemme's explanation is really too difficult for me. Is
> there any Internet resource where I can learn that kind of things?

and i thought it was simplistic...  if you find something
easier to understand then please let me know.  but what i've shown
is the basic principle of context free grammars and how they
work.  it's a bit theoretical, but cfgs are really quite a formal
(or mathematical, if you like) concept.

regards

 robert


Author: Gennaro Prota <gennaro_prota@yahoo.com>
Date: Tue, 2 Jul 2002 11:46:31 GMT

On 2 Jul 2002 07:35:01 GMT, Robert Klemme <bob.news@gmx.net> wrote:

>
>hi john
>
>John the newbie schrieb:
>> Thanks. What puzzles me is the sentence in 2.1, point 7 that states:
>> "Each preprocessing token is converted into a token". Is this poor
>> language?
>
>i think this just wants to make clear, that everything that the
>preprocessor outputs, is parsed the same way as tokens that are
>found literally in a source file.

Fortunately you think wrong ;-) Preprocessing tokens are not the
result of the preprocessing phase, and there are no tokens that are
"found" in a source file.


Genny.


Author: whatiscpp@yahoo.com (John the newbie)
Date: 24 Jun 2002 02:30:35 GMT

Hi everybody,

section 2.1 of the standard says that each preprocessing token is
converted into a token. Consider the following:

char str[] = "test";
std::strlen (str);
//          1   2

Here, 1 and 2 are two separate tokens. How do they become the ()
operator? There's also the identifier "str" between the two
parentheses, so how can '(' and ')' be merged?

Thanks for any help.


Author: Robert Klemme <bob.news@gmx.net>
Date: Mon, 24 Jun 2002 07:32:53 GMT

hi john,

here is a short and simplified explanation: the
compiler first detects all tokens in a compilation unit (a file
together with the files it includes).  at this point the tokens
are still unrelated.  afterwards the parser builds up a parse
tree that is consistent with the language's context free grammar
and produces exactly the stream of tokens that you see.  (of
course only when there are no errors in the compilation unit.)
you could view this as a kind of reverse engineering.  the cfg is
a set of production rules such as

S  : E

E  : E '+' C
   | C

C  : number

with this very simple grammar you can produce token streams such
as

1

1 + 2

1 + 2 + 33

etc.

the tree for the last expression:

         S
         |
         E
         |
        / \
      E '+' C
      |     |
     / \   '33'
   E '+' C
   |     |
   C    '2'
   |
  '1'
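
The grammar and tree above could be turned into code roughly as follows. This is a minimal sketch, not how a real C++ front end is written: `Node`, `parse_c`, `parse_e`, `parse_s` and `show` are hypothetical names, the input is scanned character by character instead of from a token stream, and the left-recursive rule for E is implemented with a loop that builds the same left-leaning tree.

```cpp
#include <cassert>
#include <cctype>
#include <memory>
#include <string>
#include <utility>

// Hypothetical parse-tree node: a number leaf, or an E '+' C node.
struct Node {
    std::string text;              // spelling of a number, or "+"
    std::unique_ptr<Node> left;    // left operand (null for a leaf)
    std::unique_ptr<Node> right;   // right operand (null for a leaf)
};

static void skip_spaces(const std::string& s, std::size_t& pos) {
    while (pos < s.size() && std::isspace(static_cast<unsigned char>(s[pos])))
        ++pos;
}

// C : number
std::unique_ptr<Node> parse_c(const std::string& s, std::size_t& pos) {
    skip_spaces(s, pos);
    std::size_t start = pos;
    while (pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos])))
        ++pos;
    if (start == pos) return nullptr;          // no number here: syntax error
    auto leaf = std::make_unique<Node>();
    leaf->text = s.substr(start, pos - start);
    return leaf;
}

// E : E '+' C | C
std::unique_ptr<Node> parse_e(const std::string& s, std::size_t& pos) {
    std::unique_ptr<Node> left = parse_c(s, pos);
    while (left) {
        skip_spaces(s, pos);
        if (pos >= s.size() || s[pos] != '+') break;
        ++pos;                                  // consume '+'
        std::unique_ptr<Node> right = parse_c(s, pos);
        if (!right) return nullptr;             // '+' without a right operand
        auto sum = std::make_unique<Node>();
        sum->text = "+";
        sum->left = std::move(left);            // everything parsed so far
        sum->right = std::move(right);
        left = std::move(sum);
    }
    return left;
}

// S : E   (the whole input must be consumed)
std::unique_ptr<Node> parse_s(const std::string& s) {
    std::size_t pos = 0;
    std::unique_ptr<Node> tree = parse_e(s, pos);
    skip_spaces(s, pos);
    if (!tree || pos != s.size()) return nullptr;
    return tree;
}

// Render the tree with explicit grouping so its shape is visible.
std::string show(const Node& n) {
    if (!n.left) return n.text;
    return "(" + show(*n.left) + " " + n.text + " " + show(*n.right) + ")";
}
```

For the input `1 + 2 + 33`, `show(*parse_s("1 + 2 + 33"))` gives `((1 + 2) + 33)`, which mirrors the tree drawn above: the inner E covers `1 + 2`, and the outer E adds `33`.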


if you want to know more, you should get yourself some information
about context free grammars and compiler construction.  i'm sure
you will find a lot on the web; otherwise this is one of the
classics: http://www.amazon.com/exec/obidos/ASIN/0201100886

regards

 robert

John the newbie wrote:
> section 2.1 of the standard says that each preprocessing token is
> converted in tokens. Consider the following:
>
> char str[] = "test";
> std::strlen (str);
> //          1   2
>
> Here, 1 and 2 are two separate tokens. How they become the ()
> operator? There's also the identifier "str" between the two
> parenthesis, so how can '(' and ')' be merged?


Author: kanze@gabi-soft.de (James Kanze)
Date: Mon, 24 Jun 2002 15:19:44 GMT

whatiscpp@yahoo.com (John the newbie) wrote in message
news:<102a8848.0206230906.79905ef@posting.google.com>...

> section 2.1 of the standard says that each preprocessing token is
> converted in tokens. Consider the following:

> char str[] = "test";
> std::strlen (str);
> //          1   2

> Here, 1 and 2 are two separate tokens. How they become the ()
> operator?

The same way [ and ] become the [] operator, or ? and : become the ?:
operator.  An operator is an abstract construct, which doesn't
necessarily map directly into a single token.

> There's also the identifier "str" between the two
> parenthesis, so how can '(' and ')' be merged?

They aren't merged.
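
A small sketch may make this visible. The function names are made up for the example; each one writes an operator whose two tokens are kept apart by white space, comments, and operands, and the code still compiles because the tokens are only related later, by the grammar.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Function call: '(' and ')' are two tokens with the argument between them.
std::size_t call_with_gaps(const char* str) {
    return std::strlen /* the '(' token follows */ ( str
                                                   ) ;
}

// Subscript: '[' and ']' are separate tokens too.
char index_with_gaps(const char* str) {
    return str [ /* index expression */ 0
               ] ;
}

// Conditional: '?' and ':' enclose the middle operand.
int conditional_with_gaps(bool flag) {
    return flag ? /* middle operand */ 1
                : 2 ;
}
```

For example, `call_with_gaps("test")` returns 4 and `index_with_gaps("test")` returns `'t'`, exactly as the tightly written forms `std::strlen(str)` and `str[0]` would.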

--
James Kanze                           mailto:jkanze@caicheuvreux.com
Conseils en informatique orientée objet/
                    Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(0)69 63198627


Author: Robert Klemme <bob.news@gmx.net>
Date: Mon, 24 Jun 2002 15:45:12 GMT

oops, forgot the crucial part: by building up those trees (see my
other posting) brackets can be grouped.  if you augment the mini
cfg with a rule such as

C : '(' E ')'

you can clearly (hopefully) see how this rule becomes a node in
the tree that is built up by the parser and that relates the two
brackets to one another.
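
As a sketch of that idea, the following self-contained parser handles the mini grammar with the extra rule C : '(' E ')'. All names (`Node`, `NodePtr`, `parse_c`, `parse_e`, `parse_s`, `show`) are hypothetical, and the input is read character by character for brevity. The node of kind '(' is the single tree node that ties an opening bracket to its closing one.

```cpp
#include <cassert>
#include <cctype>
#include <memory>
#include <string>
#include <utility>

struct Node {
    char kind;                     // 'n' number, '+' sum, '(' bracketed group
    std::string text;              // spelling, for numbers only
    std::unique_ptr<Node> left;    // left operand, or the E inside a group
    std::unique_ptr<Node> right;   // right operand of '+', otherwise null
};
using NodePtr = std::unique_ptr<Node>;

NodePtr parse_e(const std::string& s, std::size_t& pos);   // used by parse_c

static void skip_spaces(const std::string& s, std::size_t& pos) {
    while (pos < s.size() && std::isspace(static_cast<unsigned char>(s[pos])))
        ++pos;
}

// C : number | '(' E ')'
NodePtr parse_c(const std::string& s, std::size_t& pos) {
    skip_spaces(s, pos);
    if (pos < s.size() && s[pos] == '(') {
        ++pos;                                  // consume the '(' token
        NodePtr inner = parse_e(s, pos);
        skip_spaces(s, pos);
        if (!inner || pos >= s.size() || s[pos] != ')') return nullptr;
        ++pos;                                  // consume the matching ')'
        auto group = std::make_unique<Node>();
        group->kind = '(';
        group->left = std::move(inner);
        return group;                           // one node for both brackets
    }
    std::size_t start = pos;
    while (pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos])))
        ++pos;
    if (start == pos) return nullptr;
    auto leaf = std::make_unique<Node>();
    leaf->kind = 'n';
    leaf->text = s.substr(start, pos - start);
    return leaf;
}

// E : E '+' C | C   (left recursion written as a loop)
NodePtr parse_e(const std::string& s, std::size_t& pos) {
    NodePtr left = parse_c(s, pos);
    while (left) {
        skip_spaces(s, pos);
        if (pos >= s.size() || s[pos] != '+') break;
        ++pos;
        NodePtr right = parse_c(s, pos);
        if (!right) return nullptr;
        auto sum = std::make_unique<Node>();
        sum->kind = '+';
        sum->left = std::move(left);
        sum->right = std::move(right);
        left = std::move(sum);
    }
    return left;
}

// S : E   (the whole input must be consumed)
NodePtr parse_s(const std::string& s) {
    std::size_t pos = 0;
    NodePtr tree = parse_e(s, pos);
    skip_spaces(s, pos);
    if (!tree || pos != s.size()) return nullptr;
    return tree;
}

std::string show(const Node& n) {
    switch (n.kind) {
        case 'n': return n.text;
        case '(': return "group[" + show(*n.left) + "]";
        default:  return "sum[" + show(*n.left) + ", " + show(*n.right) + "]";
    }
}
```

Parsing `1 + (2 + 33)` and passing the result to `show` gives `sum[1, group[sum[2, 33]]]`. An unbalanced input such as `(1 + 2` is rejected, because no `group` node can be completed without its closing bracket.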

regards

 robert
