Topic: Augment the Preprocessor to Get Rid of More Uses of


Author: Ricardo Fabiano de Andrade <ricardofabianodeandrade@gmail.com>
Date: Tue, 13 Dec 2016 09:36:04 -0600

*Motivation*

Over the years, there have been many arguments against the Preprocessor,
and supporting decades of preprocessor-based code is sometimes a daunting
task, but in the current state of the standard one can hardly live without it.

On the other hand, it's hard to deny the usefulness of the Preprocessor
when it comes to purely mechanical tasks involving code generation, the
most extreme example being the Boost Preprocessor library.

Interoperability with C is also a point in favor of keeping the
Preprocessor around.

C++ has gained many tools to avoid using the Preprocessor, though, namely:
constants, templates, inline, constexpr, user-defined literals, to some
extent modules and, soon, reflection.

This last one has the potential to eliminate almost all uses of the
Preprocessor in the future, except for the fact that any reflection facility
added to the language would operate after the work of the Preprocessor is done.

With this in mind, and knowing that almost any piece of legacy code
contains an abundance of macros, it is concerning that the language support
for reflection will not be able to reason about the Preprocessor tokens
which may be declaring the objects and functions that actually make up the
APIs of such code.

Conversely, it's hard to imagine a C++ language feature, reflection or
otherwise, which would allow crossing the Preprocessor boundary to obtain
information about macro definitions in a clean way.

Instead, the standard should be changed to augment the Preprocessor with
some support for introspection, and that is what this proposal is about.

*Preprocessor Minimal Introspection*

The Preprocessor already has some pseudo-reflection facilities in the form
of the # (stringizing) and ## (token-pasting) operators.
However, it does not provide any introspection support. For example,
operations to list the macros defined or to tell if a given macro takes
arguments are not present.
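
As a reminder of what these two operators do today, here is a minimal
sketch. The helper names STRINGIZE, XSTRINGIZE and CONCAT are illustrative
only (they are not standard macros), and LDAP_DEBUG_TRACE is defined with
the value used in the ldap.h snippet quoted in this post.

```cpp
#define LDAP_DEBUG_TRACE 0x001  // same value as in the ldap.h snippet

#define STRINGIZE(x) #x             // # turns the argument's tokens into a string literal
#define XSTRINGIZE(x) STRINGIZE(x)  // extra level: the argument is macro-expanded first
#define CONCAT(a, b) a##b           // ## pastes two tokens into a single token

constexpr const char* kMacroName  = STRINGIZE(LDAP_DEBUG_TRACE);   // "LDAP_DEBUG_TRACE"
constexpr const char* kMacroValue = XSTRINGIZE(LDAP_DEBUG_TRACE);  // "0x001"
constexpr int CONCAT(ldap_, level) = LDAP_DEBUG_TRACE;             // declares ldap_level
```

Both operators only transform tokens that are handed to them. Neither can
ask which macros exist or what their parameter lists look like.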

*The Macro Listing Operation*

This is an operation which lists the defined macros. One may think that
such an operation should be provided as an operator like the ones
mentioned above, but there are no reserved characters which could be used
for a new Preprocessor operator, and an operator would prevent this
operation from taking arguments.

Therefore it is proposed to have a function-like internal macro tentatively
named: __DEFINED_MACROS__(filter)

"filter" is a string literal (more below).
The result is as follows:
- If the filter matches anything, the result is the comma-separated list of
matching macro names.
- If no matches are found, the result is nothing.

*Macro List Filtering*

The number of macros even in a small program can be overwhelming, so to
make the result of this operation more digestible it is proposed that it
supports filtering.

The author of this proposal would like to have regular expressions as the
filter mechanism but realistically speaking it would be very difficult to
reach an agreement on which variant (POSIX, ECMA, grep, etc.) should be
supported or how to select one or define options.

For now, this proposal will stick with the widely accepted wildcard
characters '*' (any sequence of characters) and '?' (any single character)
and the known behavior associated with them in bash "globbing". That
should be sufficient for the majority of use cases.
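
To make the intended matching concrete, here is a minimal sketch of such a
wildcard matcher written as ordinary C++. The function name glob_match is
hypothetical, and the sketch assumes that '*' may also match an empty
sequence, as it does in bash. It only models the filter semantics; it is
not how a Preprocessor would have to implement them.

```cpp
// Returns true if `name` matches `pattern`, where '*' matches any sequence
// of characters (possibly empty) and '?' matches exactly one character.
constexpr bool glob_match(const char* pattern, const char* name) {
    return *pattern == '\0' ? *name == '\0'
         : *pattern == '*'  ? (glob_match(pattern + 1, name) ||  // '*' matches nothing
                               (*name != '\0' &&
                                glob_match(pattern, name + 1)))  // '*' eats one character
         : (*name != '\0' &&
            (*pattern == '?' || *pattern == *name) &&            // single-character match
            glob_match(pattern + 1, name + 1));
}

static_assert(glob_match("LDAP_DEBUG_*", "LDAP_DEBUG_TRACE"), "prefix filter matches");
static_assert(!glob_match("LDAP_DEBUG_*", "OTHER_MACRO"), "unrelated name is filtered out");
```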

For example, using the following snippet (extracted from OpenLDAP ldap.h):

#define LDAP_DEBUG_TRACE 0x001
#define LDAP_DEBUG_PACKETS 0x002
#define LDAP_DEBUG_ARGS  0x004

The call below:
__DEFINED_MACROS__("LDAP_DEBUG_*")

Would have the result of (incomplete for brevity):

LDAP_DEBUG_TRACE, LDAP_DEBUG_PACKETS, LDAP_DEBUG_ARGS ...

Which in turn could be input for (non-working, just illustrative):

#define FOR_EACH(function, ...) // out of the scope of this proposal
#define DECLARE_ENUM(macro) e##macro = macro,
// Redeclare LDAP debug constants as a strongly typed enum.
enum class LdapDebug {
FOR_EACH(DECLARE_ENUM, __DEFINED_MACROS__("LDAP_DEBUG_*"))
}; // aware of the trailing comma, please ignore it

Now, LdapDebug names and values could be used in C++ and would also be
available to reflection.
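
For comparison, the closest approximation with the current Preprocessor is
probably the X-macro technique, where the macro names have to be listed by
hand once. The sketch below assumes the three ldap.h values quoted earlier;
LDAP_DEBUG_LIST is a hypothetical hand-maintained list, which is exactly
the part that __DEFINED_MACROS__ would make unnecessary.

```cpp
#define LDAP_DEBUG_TRACE 0x001
#define LDAP_DEBUG_PACKETS 0x002
#define LDAP_DEBUG_ARGS  0x004

// Hand-written list of the macros of interest (hypothetical helper).
#define LDAP_DEBUG_LIST(X) \
    X(LDAP_DEBUG_TRACE)    \
    X(LDAP_DEBUG_PACKETS)  \
    X(LDAP_DEBUG_ARGS)

// e##macro pastes the unexpanded name; the plain use of macro expands to its value.
#define DECLARE_ENUM(macro) e##macro = macro,

// Redeclare LDAP debug constants as a strongly typed enum.
enum class LdapDebug {
    LDAP_DEBUG_LIST(DECLARE_ENUM)
};

static_assert(static_cast<int>(LdapDebug::eLDAP_DEBUG_PACKETS) == 0x002,
              "enumerator carries the macro's value");
```

The list must be kept in sync with the header manually, so a macro added
to ldap.h later would be silently missing from the enum.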

This might work well for macros which declare objects, but what about
functions?

*The Macro Arguments Operations*

The first operation returns the number of arguments, the second their
names. Again, internal function-like predefined macros are proposed
instead of new operators.

The number of arguments of a macro would be obtained with (tentative name):
__DEFINED_ARGS_COUNT__(macro)

The names of the arguments could be obtained with (tentative name):
__DEFINED_ARGS_NAMES__(macro)

"macro" is the string literal representing the macro name (the # operator
can be used).
The result is as follows:
- If a function-like macro name is passed, the result is the number of
arguments (for ...COUNT) or their comma-separated names (for ...NAMES).
- If a non-function-like macro name is passed, the result is nothing.

Please note that __VA_ARGS__ is counted as a regular argument, and its name
will be present among other argument names in the result.
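
For contrast, what can be counted today is the number of arguments passed
at a call site of a variadic macro, not the number of parameters in another
macro's definition. A minimal sketch of that well-known trick follows; the
names PP_NARG and PP_ARG_N are illustrative, and this version assumes
between 1 and 8 arguments.

```cpp
// Selects the 9th argument; the first eight slots absorb whatever precedes it.
#define PP_ARG_N(p1, p2, p3, p4, p5, p6, p7, p8, N, ...) N
// Appending a descending sequence shifts the right count into slot N.
#define PP_NARG(...) PP_ARG_N(__VA_ARGS__, 8, 7, 6, 5, 4, 3, 2, 1, 0)

static_assert(PP_NARG(level, fmt, arg1, arg2, arg3) == 5, "five arguments were passed");
static_assert(PP_NARG(level) == 1, "one argument was passed");
```

The caller still has to spell out the argument list, so this cannot answer
"how many arguments does the macro Debug take?", which is the question
__DEFINED_ARGS_COUNT__ is meant to address.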

For example (again, from ldap.h):

#define Debug( level, fmt, arg1, arg2, arg3 ) ...

Could be transformed into a lambda (non-working, just illustrative):

#define TEMPLATE_ARG(name) auto name,
#define LDAP_FUNC_DECLARE(macro) auto Ldap##macro = []( \
FOR_EACH(TEMPLATE_ARG, __DEFINED_ARGS_NAMES__(#macro))) \
{ return macro(__DEFINED_ARGS_NAMES__(#macro)); }
// aware of the trailing comma, please ignore it

// Redeclare LDAP functions as generic lambdas.
#if defined(__DEFINED_ARGS_COUNT__("Debug"))
LDAP_FUNC_DECLARE(Debug);
#endif

The lambda could then be made available to C++ and, even though the
current reflection proposal (P0194R2) would not be able to handle it,
the authors of that proposal made clear their intention to support
functions at some point in the future.
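
To show the intended end result, here is a hand-written sketch of what
LDAP_FUNC_DECLARE(Debug) is meant to produce. The body of Debug is elided
in the snippet from ldap.h, so the definition below is a hypothetical
stand-in chosen only to make the example compile; example_use is likewise
illustrative. Generic lambdas require C++14.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-in for the real macro, whose body is elided ("...") above:
// returns the length the formatted message would have, or 0 if level is 0.
#define Debug(level, fmt, arg1, arg2, arg3) \
    ((level) ? std::snprintf(nullptr, 0, (fmt), (arg1), (arg2), (arg3)) : 0)

// Written by hand: one generic parameter per macro argument name.
auto LdapDebug = [](auto level, auto fmt, auto arg1, auto arg2, auto arg3) {
    return Debug(level, fmt, arg1, arg2, arg3);
};

// Unlike the macro, the wrapper is an ordinary C++ object that can be
// passed around, stored, and (eventually) seen by reflection.
inline void example_use() {
    assert(LdapDebug(1, "%d-%d-%d", 10, 20, 30) == 8);  // "10-20-30" has 8 characters
    assert(LdapDebug(0, "%d-%d-%d", 10, 20, 30) == 0);  // level 0 produces nothing
}
```

Today the parameter list of the lambda has to be copied from the macro
definition by hand, which is the step __DEFINED_ARGS_NAMES__ would automate.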

*Conclusion*

Stretch the ideas of this proposal a little and you can envision
definitions being redeclared this way and used in C++ instead of the legacy
macros, which in turn could be deprecated over time (of course, only if C
support is to be dropped). The author of this proposal can see derivations
of such work aiding better support for modules in legacy environments
too.

Please note that C++ is the language that would benefit the most from the
changes proposed here, but nothing prevents such ideas from also being made
available for C, if the concern is having the same Preprocessor for C and C++.

I'm looking forward to hearing comments and suggestions.
If such functionality can already be achieved using existing techniques,
I'd love to hear more.

Thank you,
Ricardo Andrade

