Thread

Topic: Macro expansion in conforming compiler

Author: rridge@calum.csclub.uwaterloo.ca (Ross Ridge)
Date: 1995/07/25

rridge@calum.csclub.uwaterloo.ca (Ross Ridge) writes:
>I'd much prefer that you'd come up with a local solution, rather than the
>standard trying to fix a problem with people using non-standard and
>obsolete character sets.

James Kanze US/ESC 60/3/141 #40763 <kanze@lts.sel.alcatel.de> wrote:
>The character sets in question are *very* standard.  An ISO standard,
>in fact.

An obsolete standard.  I'm sure there's a standard for BAUDOT(sp?) too.

>  In practice, of course, the problem isn't really the character set
>(ISO 8859 is very widespread over here), but the keyboard.  A standard
>French or German keyboard simply doesn't have a `|' character on it.
>The fact that the character set generated may contain the character
>doesn't help if I have no way of entering it (or must use gymnastics
>with my fingers in order to hit three keys at once).

How about just having your editor expand the digraphs?  Most systems
also support remapping keyboards.  I used to use a keyboard that was
designed for Pascal programming to write C code.  It was a pain but
I didn't expect or want the C language changed to make it easier
for me.

>|> As a code maintainer, I dread the day I'm
>|> handed code that uses these ridiculous digraphs.  The worst part
>|> about it is that it'll have been written by someone who does have '|'
>|> on their keyboard but is trying to use the alternative tokens as part
>|> of some half-baked coding style.  Having two ways to specify the same
>|> token like this in a language is *wrong*.
>
>Well, I too prefer code using the American characters; I try and have
>an American keyboard on all of the machines I use for developing code.
>But I don't always have a choice.  And if I use the same machine for
>writing code and documentation, and the documentation is in French,
>what am I supposed to do?

Anything you want that doesn't change the language.  (Not that
it isn't pretty much a fait accompli already.  *sigh*)  Even making
it optional or part of a separate standard would be better.

>This said, I think we could live without the special tokens.  Most
>people I know are capable of using both keyboards, and most machines
>allow reprogramming the keyboard for the different languages.  But
>*if* the special tokens are adopted, they must really be special, and
>be understood by the preprocessor if they are to be of any use.

True.  I was just taking the opportunity to gripe about digraphs.

       Ross Ridge

--
 l/  //   Ross Ridge -- The Great HTMU                         +1 519 883 4329
[oo][oo]  rridge@csclub.uwaterloo.ca      http://csclub.uwaterloo.ca/u/rridge/
-()-/()/
 db  //

Author: kanze@lts.sel.alcatel.de (James Kanze US/ESC 60/3/141 #40763)
Date: 1995/07/25

In article <DByLxA.LHs@tigadmin.ml.com> mansionj@lonnds.ml.com (James
Mansion LADS LDN X4923) writes:

|> I don't understand your point.  Why does it matter which translation phase accepts
|> the form that looks like a keyword/identifier rather than the form that looks like an
|> operator?

Yes.  Preprocessor symbols become keywords or identifiers *after* the
preprocessor has evaluated its expressions.

Suppose I write:

 #if defined(A) and defined(B)

If `and' is a preprocessor symbol (which will only later become a
keyword), then this expression is illegal.  To work, `and' must be a
special token, recognized by the preprocessor.  (Ron Guillemet once
warned me that this feature would cause some problems before we were
finished.  I didn't believe him; it seemed too trivial.  He was
right.)

Perhaps treating them as something like predefined macros would be a
solution.
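
[Editorial sketch, not part of the original post.]  The point above can
be tried out on a compiler that implements the alternative tokens.  The
macro names HAS_A, HAS_B and HAS_C and the constants below are
hypothetical, chosen only for illustration; the sketch assumes the
preprocessor recognizes `and', `or' and `not' as operators in #if
expressions, which is what the draft wording appears to require.

```cpp
// Hypothetical feature macros, used only for this illustration.
#define HAS_A 1
#define HAS_B 1
// HAS_C is deliberately left undefined.

// `and' must be recognized by the preprocessor itself for this to work.
#if defined(HAS_A) and defined(HAS_B)
constexpr bool both_a_and_b = true;
#else
constexpr bool both_a_and_b = false;
#endif

// Same test, but one operand is an undefined macro.
#if defined(HAS_A) and defined(HAS_C)
constexpr bool both_a_and_c = true;
#else
constexpr bool both_a_and_c = false;
#endif

// `or' and `not' in a preprocessor expression.
#if (0 or 1) and not defined(HAS_C)
constexpr bool or_and_not_work = true;
#else
constexpr bool or_and_not_work = false;
#endif
```

If `and' were an ordinary identifier at this stage, the first #if would
contain two adjacent operands with no operator between them, which is
the problem described above.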

|> My point is simply that tokens which look like a keyword or identifier should
|> behave consistently as far as possible, and I don't like the look of the special-
|> casing that we have proposed at the moment.

I, too, would prefer to keep the consistency.  But it doesn't work.

|> In article <3ue94i$28m@gabi.gabi-soft.fr>, kanze@gabi-soft.fr (J. Kanze) writes:
|> >James Mansion LADS LDN X4923 (mansionj@lonnds.ml.com) wrote:
|> >|> I feel that this should be taken up with the committee.  It will be
|> >|> difficult to countenance using the alternative tokens at all without
|> >|> the compatibility get-out of the preprocessor.
|> >
|> >|> They look like keywords - they should act like keywords.
|> >
|> >|> Even sizeof has no special status in early phases.
|> >
|> >But this would defeat their purpose.  A token like `|' has a special
|> >meaning to the preprocessor.  But on my PC, there is no `|' on the
|> >keyboard.  The purpose of the new tokens is to allow me to write the
|> >equivalent of `|' even with my PC's (German) keyboard.




--
James Kanze         Tel.: (+33) 88 14 49 00        email: kanze@gabi-soft.fr
GABI Software, Sarl., 8 rue des Francs-Bourgeois, F-67000 Strasbourg, France
Conseils en informatique industrielle --
                              -- Beratung in industrieller Datenverarbeitung

Author: kanze@lts.sel.alcatel.de (James Kanze US/ESC 60/3/141 #40763)
Date: 1995/07/26

In article <u9aga2la9h.fsf@phydeaux.cygnus.com> jason@cygnus.com
(Jason Merrill) writes:

|> I don't think there is actually a problem; from my reading of the draft,

There isn't a real problem in the draft, but some people were
complaining that `and' was getting different treatment than `while'.  I
was just pointing out why this is necessary.
--
James Kanze         Tel.: (+33) 88 14 49 00        email: kanze@gabi-soft.fr
GABI Software, Sarl., 8 rue des Francs-Bourgeois, F-67000 Strasbourg, France
Conseils en informatique industrielle --
                              -- Beratung in industrieller Datenverarbeitung

Author: jason@cygnus.com (Jason Merrill)
Date: 1995/07/25

I don't think there is actually a problem; from my reading of the draft,

#if 1 or 1

is treated just like

#if 1 || 1

Jason

  2.4  Alternative tokens                                  [lex.digraph]

1 Alternative  token representations are provided for some operators and
  punctuators3).

2 In  all  respects  of the language, each alternative token behaves the
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  same,  respectively,  as its primary token, except for its spelling4).
  The set of alternative tokens is defined in Table 2.

  _________________________
  3) These include "digraphs" and additional reserved words.   The  term
  "digraph"  (token  consisting  of two characters) is not perfectly de-
  scriptive, since one of the alternative preprocessing-tokens  is  %:%:
  and of course several primary tokens contain two characters.  Nonethe-
  less, those alternative tokens that aren't lexical keywords are collo-
  quially known as "digraphs".
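
[Editorial sketch, not part of the original post.]  The phrase "behaves
the same ... except for its spelling" can be illustrated roughly as
follows.  The helper names STR, same_text, or_primary, or_alt and table
are hypothetical.  The sketch assumes a compiler that implements the
alternative tokens and that stringizing keeps the source spelling.

```cpp
// Stringize the argument exactly as spelled in the source.
#define STR(x) #x

// Compile-time string comparison (single return keeps it C++11-friendly).
constexpr bool same_text(const char* a, const char* b) {
    return *a == *b && (*a == '\0' || same_text(a + 1, b + 1));
}

// Same behaviour: `bitor' and `|' denote the same operator.
constexpr int or_primary(int a, int b) { return a | b; }
constexpr int or_alt(int a, int b) { return a bitor b; }

// Same behaviour: digraphs <: :> <% %> stand for [ ] { }.
constexpr int table<:3:> = <% 10, 20, 30 %>;

// Different spelling: only visible when the token is stringized.
constexpr bool keeps_named_spelling = same_text(STR(bitor), "bitor");
constexpr bool keeps_digraph_spelling = same_text(STR(<:), "<:");
constexpr bool spellings_differ = !same_text(STR(bitor), STR(|));
```

In ordinary expressions the two forms are interchangeable; the
difference only shows up where the preprocessor looks at the characters
of the token rather than at its meaning.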

Author: kanze@lts.sel.alcatel.de (James Kanze US/ESC 60/3/141 #40763)
Date: 1995/07/20

In article <3ujnaj$87v@calum.csclub.uwaterloo.ca>
rridge@calum.csclub.uwaterloo.ca (Ross Ridge) writes:

|> James Mansion LADS LDN X4923 (mansionj@lonnds.ml.com) wrote:
|> >I feel that this should be taken up with the committee.  It will be
|> >difficult to countenance using the alternative tokens at all without
|> >the compatibility get-out of the preprocessor.
|> >
|> >They look like keywords - they should act like keywords.
|> >
|> >Even sizeof has no special status in early phases.

|> J. Kanze <kanze@gabi-soft.fr> wrote:
|> >But this would defeat their purpose.  A token like `|' has a special
|> >meaning to the preprocessor.  But on my PC, there is no `|' on the
|> >keyboard.  The purpose of the new tokens is to allow me to write the
|> >equivalent of `|' even with my PC's (German) keyboard.

|> I'd much prefer that you'd come up with a local solution, rather than the
|> standard trying to fix a problem with people using non-standard and
|> obsolete character sets.

The character sets in question are *very* standard.  An ISO standard,
in fact.  In practice, of course, the problem isn't really the
character set (ISO 8859 is very widespread over here), but the
keyboard.  A standard French or German keyboard simply doesn't have a
`|' character on it.  The fact that the character set generated may
contain the character doesn't help if I have no way of entering it (or
must use gymnastics with my fingers in order to hit three keys at
once).

|> As a code maintainer, I dread the day I'm
|> handed code that uses these ridiculous digraphs.  The worst part
|> about it is that it'll have been written by someone who does have '|'
|> on their keyboard but is trying to use the alternative tokens as part
|> of some half-baked coding style.  Having two ways to specify the same
|> token like this in a language is *wrong*.

Well, I too prefer code using the American characters; I try and have
an American keyboard on all of the machines I use for developing code.
But I don't always have a choice.  And if I use the same machine for
writing code and documentation, and the documentation is in French,
what am I supposed to do?

This said, I think we could live without the special tokens.  Most
people I know are capable of using both keyboards, and most machines
allow reprogramming the keyboard for the different languages.  But
*if* the special tokens are adopted, they must really be special, and
be understood by the preprocessor if they are to be of any use.
--
James Kanze         Tel.: (+33) 88 14 49 00        email: kanze@gabi-soft.fr
GABI Software, Sarl., 8 rue des Francs-Bourgeois, F-67000 Strasbourg, France
Conseils en informatique industrielle --
                              -- Beratung in industrieller Datenverarbeitung

Author: rridge@calum.csclub.uwaterloo.ca (Ross Ridge)
Date: 1995/07/19

James Mansion LADS LDN X4923 (mansionj@lonnds.ml.com) wrote:
>I feel that this should be taken up with the committee.  It will be
>difficult to countenance using the alternative tokens at all without
>the compatibility get-out of the preprocessor.
>
>They look like keywords - they should act like keywords.
>
>Even sizeof has no special status in early phases.

J. Kanze <kanze@gabi-soft.fr> wrote:
>But this would defeat their purpose.  A token like `|' has a special
>meaning to the preprocessor.  But on my PC, there is no `|' on the
>keyboard.  The purpose of the new tokens is to allow me to write the
>equivalent of `|' even with my PC's (German) keyboard.

I'd much prefer that you'd come up with a local solution, rather than the
standard trying to fix a problem with people using non-standard and
obsolete character sets.  As a code maintainer, I dread the day I'm
handed code that uses these ridiculous digraphs.  The worst part
about it is that it'll have been written by someone who does have '|'
on their keyboard but is trying to use the alternative tokens as part
of some half-baked coding style.  Having two ways to specify the same
token like this in a language is *wrong*.

      Ross Ridge

--
 l/  //   Ross Ridge -- The Great HTMU                         +1 519 883 4329
[oo][oo]  rridge@csclub.uwaterloo.ca      http://csclub.uwaterloo.ca/u/rridge/
-()-/()/
 db  //

Author: mansionj@lonnds.ml.com (James Mansion LADS LDN X4923)
Date: 1995/07/19

I don't understand your point.  Why does it matter which translation phase accepts
the form that looks like a keyword/identifier rather than the form that looks like an
operator?

My point is simply that tokens which look like a keyword or identifier should
behave consistently as far as possible, and I don't like the look of the special-
casing that we have proposed at the moment.

James

In article <3ue94i$28m@gabi.gabi-soft.fr>, kanze@gabi-soft.fr (J. Kanze) writes:
>James Mansion LADS LDN X4923 (mansionj@lonnds.ml.com) wrote:
>|> I feel that this should be taken up with the committee.  It will be
>|> difficult to countenance using the alternative tokens at all without
>|> the compatibility get-out of the preprocessor.
>
>|> They look like keywords - they should act like keywords.
>
>|> Even sizeof has no special status in early phases.
>
>But this would defeat their purpose.  A token like `|' has a special
>meaning to the preprocessor.  But on my PC, there is no `|' on the
>keyboard.  The purpose of the new tokens is to allow me to write the
>equivalent of `|' even with my PC's (German) keyboard.
>--
>James Kanze           (+33) 88 14 49 00          email: kanze@gabi-soft.fr
>GABI Software, Sarl., 8 rue des Francs Bourgeois, 67000 Strasbourg, France
>Conseils en informatique industrielle--
>                             --Beratung in industrieller Datenverarbeitung

Author: kanze@gabi-soft.fr (J. Kanze)
Date: 1995/07/17

James Mansion LADS LDN X4923 (mansionj@lonnds.ml.com) wrote:
|> I feel that this should be taken up with the committee.  It will be
|> difficult to countenance using the alternative tokens at all without
|> the compatibility get-out of the preprocessor.

|> They look like keywords - they should act like keywords.

|> Even sizeof has no special status in early phases.

But this would defeat their purpose.  A token like `|' has a special
meaning to the preprocessor.  But on my PC, there is no `|' on the
keyboard.  The purpose of the new tokens is to allow me to write the
equivalent of `|' even with my PC's (German) keyboard.
--
James Kanze           (+33) 88 14 49 00          email: kanze@gabi-soft.fr
GABI Software, Sarl., 8 rue des Francs Bourgeois, 67000 Strasbourg, France
Conseils en informatique industrielle--
                             --Beratung in industrieller Datenverarbeitung

Author: mansionj@lonnds.ml.com (James Mansion LADS LDN X4923)
Date: 1995/06/28

I feel that this should be taken up with the committee.  It will be
difficult to countenance using the alternative tokens at all without
the compatibility get-out of the preprocessor.

They look like keywords - they should act like keywords.

Even sizeof has no special status in early phases.

James

In article <3spq8d$jls@engnews2.Eng.Sun.COM>, clamage@Eng.Sun.COM (Steve Clamage) writes:
>In article ktf@toon.ctp.com, jroy@ctp.com (John Roy) writes:

>>So my reasoning leads me to conclude that keywords can be replaced via
>>macro expansion whereas lexical keywords cannot.
>
>If by "lexical keyword" you mean what the draft standard calls "alternative
>tokens" (2.4), then yes. "not_eq" is an example.
>
>> Example:
>>
>>#define while XXX // OK
>>#define not_eq YYY // technically OK, no diagnostic given
>
>I think that last line requires a diagnostic. The '#define' is not processed
>until phase 4, but phase 3 converts that line into the equivalent of
> #define != YYY
>which is not well-formed.
>---
>Steve Clamage, stephen.clamage@eng.sun.com
>
>

Author: jroy@ctp.com (John Roy)
Date: 1995/06/27

 I have a question on the sections of the April working
paper concerned with preprocessor macro expansion. I want to know
what the behaviour of a conforming compiler should be with regard to
macro replacement of keywords and lexical keywords.
 My understanding is that C++ keywords ('if', 'while' etc.)
are recognised as 'identifier' preprocessing tokens (see 2.3), whereas
digraphs that are lexical keywords ('and', 'xor_eq' etc.) are recognised
as 'preprocessing-op-or-punc' tokens. I understand that macro replacement
occurs only on those preprocessing tokens that are identifiers. So
my reasoning leads me to conclude that keywords can be replaced via
macro expansion whereas lexical keywords cannot.
 Example:

#define while XXX // OK
#define not_eq YYY // technically OK, no diagnostic given

int a=1;
int b=1;
while( a not_eq b ) {}; // should be expanded to XXX( a not_eq b ) {};

 Is this a correct interpretation?

John

--
_______________________________________________________________________________
    John Roy    |   Cambridge Technology Partners  |   Phone: (617) 374-8375
  jroy@ctp.com  |   304 Vassar Street              |   Fax:   (617) 374-8300
                |   Cambridge, MA 02139            |

Author: clamage@Eng.Sun.COM (Steve Clamage)
Date: 1995/06/27

In article ktf@toon.ctp.com, jroy@ctp.com (John Roy) writes:
> I have a question on the sections of the April working
>paper concerned with preprocessor macro expansion. I want to know
>what the behaviour of a conforming compiler should be with regard to
>macro replacement of keywords and lexical keywords.
> My understanding is that C++ keywords ('if', 'while' etc.)
>are recognised as 'identifier' preprocessing tokens (see 2.3), whereas
>digraphs that are lexical keywords ('and', 'xor_eq' etc.) are recognised
>as 'preprocessing-op-or-punc' tokens. I understand that macro replacement
>occurs only on those preprocessing tokens that are identifiers.

I believe that is correct. In other words, "not_eq" and "!=" are converted
to the same internal representation during phase 3 of preprocessing. (2.1,
2.3, 2.4)

>So my reasoning leads me to conclude that keywords can be replaced via
>macro expansion whereas lexical keywords cannot.

If by "lexical keyword" you mean what the draft standard calls "alternative
tokens" (2.4), then yes. "not_eq" is an example.

> Example:
>
>#define while XXX // OK
>#define not_eq YYY // technically OK, no diagnostic given

I think that last line requires a diagnostic. The '#define' is not processed
until phase 4, but phase 3 converts that line into the equivalent of
 #define != YYY
which is not well-formed.
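
[Editorial sketch, not part of the original post.]  The distinction
drawn above can be approximated like this.  The names
is_different_from, differ_macro and differ_alt are hypothetical.  An
ordinary identifier is used as the macro name instead of a real keyword
such as `while', since redefining keywords is best avoided in real
code.  The rejected line is left commented out so the sketch compiles.

```cpp
// An ordinary identifier is a valid macro name and is replaced in phase 4.
#define is_different_from !=

// An alternative token is not an identifier.  Under the reading above,
// the following line would amount to "#define != ==" and should draw a
// diagnostic, so it is left commented out:
// #define not_eq ==

// Uses the macro: expands to `a != b'.
constexpr bool differ_macro(int a, int b) { return a is_different_from b; }

// Uses the alternative token directly: it cannot have been redefined,
// so it always means `!='.
constexpr bool differ_alt(int a, int b) { return a not_eq b; }
```

Both functions give the same answers, but only the first goes through
macro replacement on the way there.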
---
Steve Clamage, stephen.clamage@eng.sun.com