Topic: Microsoft exploits bug in preprocessor?


Author: "Martijn Lievaart" <nobody@orion.nl>
Date: 1999/04/15
Gernot wrote in message <3715DB57.5A41@eae.com>...
>Martijn Lievaart wrote:
>
>> I think it is incorrect, it should yield two adjacent tokens, not strings.
>
>Hmm, could be, but I'm not really convinced. Ok, the preprocessor can
>identify only "tokens" (in the meaning of character sequences surrounded
>by whitespace) as expandable macros. But then it just replaces these
>tokens with a new sequence of characters. If it has identified two
>adjacent macros like "I()NT", it is not obliged to generate two new
>tokens, like "i nt". The resulting "int" would therefore be correct.

No, no. The preprocessor deals only in existing tokens. It is allowed to
insert existing tokens (that is what macro expansion is all about), but it is
not allowed to form new tokens this way. The only defined way to do that is
with the ## operator.

[ BTW, it is not important that MS uses a non-standard way to parse their
own headers; they are perfectly allowed to do so. What is important is that
this "extension" can break well-formed programs. Admittedly a small risk in
this case. ]

Martijn
--
My reply-to address is intentionally set to /dev/null
reply to <newsgroupname> at greebo.orion in nl
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]





Author: miker3@ix.netcom.com (Michael Rubenstein)
Date: 1999/04/15
On 15 Apr 99 10:16:49 GMT, Ewert_Ahr._Electronic_GmbH@t-online.de
(Gernot) wrote:

>Martijn Lievaart wrote:
>
>> I think it is incorrect, it should yield two adjacent tokens, not strings.
>
>Hmm, could be, but I'm not really convinced. Ok, the preprocessor can
>identify only "tokens" (in the meaning of character sequences surrounded
>by whitespace) as expandable macros. But then it just replaces these
>tokens with a new sequence of characters. If it has identified two
>adjacent macros like "I()NT", it is not obliged to generate two new
>tokens, like "i nt". The resulting "int" would therefore be correct.

No it wouldn't.  A macro does not replace preprocessing tokens
with a sequence of characters; it replaces them with a sequence
of preprocessing tokens.

Look at the description of translation phases in the standard
(2.1).  The source file is decomposed into preprocessing tokens
and white space characters in phase 3.  Macro substitution takes
place in phase 4.  The description of macro substitution (16.3)
specifies that macros are replaced by sequences of preprocessing
tokens; there is no provision for combining two preprocessing
tokens after phase 3 except using the ## operator.
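
To make the distinction concrete, here is a minimal sketch rather than anything
taken from rpcproxy.h. It reuses the I() and NT macros from the example earlier
in the thread; the names ADJACENT, PASTE_, PASTE, PASTED, the typedef of i and
the two demo functions are hypothetical and exist only for illustration. The
typedef is there so that the two separate tokens "i nt" form a valid
declaration on a conforming implementation.

```cpp
// Macros from the thread's example.
#define I()  i
#define NT   nt
#define ADJACENT I()NT            // conforming result: two tokens, i and nt

// Standard token pasting. The extra level lets I() and NT expand
// before ## is applied to them.
#define PASTE_(a, b) a##b
#define PASTE(a, b)  PASTE_(a, b)
#define PASTED PASTE(I(), NT)     // one token: int

typedef long i;                   // hypothetical: makes "i nt" a declaration

long adjacent_demo() {
    ADJACENT = 7;                 // conforming: "i nt = 7;", a long named nt
    return nt;
}

int pasted_demo() {
    PASTED value = 42;            // "int value = 42;"
    return value;
}
```

If a preprocessor glued the adjacent tokens together as described for cl -E,
the first function would presumably see "int = 7;" and fail to compile, which
is the kind of breakage of a well-formed program at issue here.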
--
Michael M Rubenstein





Author: "Martijn Lievaart" <nobody@greebo.orion.nl>
Date: 1999/04/15
[ comp.std.c++ added, this is about what the standard allows. Martijn ]

Gernot wrote in message <37147942.25C8@eae.com>...
>Tom wrote:
>>
>>   There is a file that ships with MS VC++6, rpcproxy.h, which appears
>> to depend upon a bug in the preprocessor.  It concerns these two lines
>> from that header file:
>>
>> // get around the pain of cpp with an extra level of expansion
>> #define EXPANDED_ENTRY_PREFIX() ENTRY_PREFIX
>>
>>   For a glimpse of what they are doing here, consider this complete
>> example:
>>
>> #define I() i
>> #define NT   nt
>> #define INT I()NT
>> INT i;
>>
>>   Under MS:
>>   cl -E test.cpp
>>
>> yields:
>>   int i;
>>
>
>Looks correct to me. Expansion of I() and NT creates two adjacent
>strings. There's no whitespace in between (you have the parentheses as
>delimiters), so why shouldn't it be concatenated?

I think it is incorrect, it should yield two adjacent tokens, not strings.
This "trick" used to be done on some preprocessors by including /**/ between
tokens; ANSI formalised this by adding the ## preprocessor operator. It was
added precisely because the above is unportable at best, and I think it is
plain wrong.

Does anyone know this for sure?

M4
--
My reply-to address is intentionally set to /dev/null
reply to mlievaart at orion in nl





Author: Ewert_Ahr._Electronic_GmbH@t-online.de (Gernot)
Date: 1999/04/15
Martijn Lievaart wrote:

> I think it is incorrect, it should yield two adjacent tokens, not strings.

Hmm, could be, but I'm not really convinced. Ok, the preprocessor can
identify only "tokens" (in the meaning of character sequences surrounded
by whitespace) as expandable macros. But then it just replaces these
tokens with a new sequence of characters. If it has identified two
adjacent macros like "I()NT", it is not obliged to generate two new
tokens, like "i nt". The resulting "int" would therefore be correct.