Topic: ARC++ and Enumerators/Manipulators
Author: jones@jameson.arclch.com (Ben Jones)
Date: Fri, 4 Mar 94 10:22:29 EST
This is the fourth in a series of articles which will explore ideas
for extending the C++ language.  These extensions are implemented in
an experimental preprocessor called ARC++ which takes the extended C++
syntax and generates ANSI C++.  Those who wish to try out the ideas
being presented here may obtain a copy of ARC++ for the PC, Mac,
Sparcstation, or Iris by anonymous ftp from "arcfos1.arclch.com".
Please go to the directory "/pub" and download the file "arc.READ_ME"
for instructions.


               ENUMERATORS AND MANIPULATORS
               ============================

                        Ben Jones
              (c) 1994 ARSoftware Corporation
                 jones@jameson.arclch.com


Introduction
============

Enumerator names and function names are often used as keywords in C++
programs.  Enumerators cannot be overloaded in the same scope in C++,
which severely limits their usefulness as keywords.  For example, we
cannot say:

    enum light { off, on };
    enum burner { off, low, medium, high };

in the same scope because they have overlapping enumerators.  As a
result, keywords often have to have ugly-looking names like
"WM_Create" or "O_RDWR".

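As a small sketch of that workaround in standard C++ (not an ARC++
feature), the two enumerations above can share a scope only if every
enumerator is given a distinguishing prefix.  The prefixes "light_"
and "burner_" and the two helper functions are assumptions made up
for this illustration:

```cpp
// Prefixing every enumerator keeps the two lists apart in one scope.
// The prefixes light_ and burner_ are hypothetical naming choices.
enum light  { light_off, light_on };
enum burner { burner_off, burner_low, burner_medium, burner_high };

// Hypothetical helpers showing how the prefixed names read in use.
inline bool is_lit(light s)      { return s == light_on; }
inline bool is_heating(burner s) { return s != burner_off; }
```

The code compiles, but the prefix repeats information that the
parameter type already carries.
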
Function names can be used as keywords (manipulators) by virtue of
overloading:

    void hex(iostream&);
    iostream& operator << (iostream&,void(*)(iostream&));
        ...
    cout << hex;
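
A self-contained sketch of the same technique may help.  It uses a
hypothetical "MiniStream" class in place of iostream; the class, its
"radix" field, the namespace "by_function" and the helper
"radix_after_hex" are all assumptions made for this illustration:

```cpp
namespace by_function {

// Hypothetical stand-in for iostream: it only remembers the radix.
struct MiniStream {
    int radix;
    MiniStream() : radix(10) {}
};

// A manipulator is an ordinary function that changes the stream state.
inline void hex(MiniStream& s) { s.radix = 16; }
inline void dec(MiniStream& s) { s.radix = 10; }

// This overload accepts a pointer to such a function and calls it.
inline MiniStream& operator<<(MiniStream& s, void (*manip)(MiniStream&)) {
    manip(s);
    return s;
}

// Example use: the radix left in effect after "s << hex".
inline int radix_after_hex() {
    MiniStream s;
    s << hex;
    return s.radix;
}

} // namespace by_function
```
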

The operator << function executes "hex" to produce a side effect.
This makes for a clean looking syntax but an ugly implementation.
If all you wanted was to set some status flag, an enumerator would
have served just fine:

    enum iostream_radix { dec, hex, oct };
    iostream & operator << (iostream&,iostream_radix);
        ...
    cout << hex;

The operator << function would simply set the value of the flag in
the stream object.  This is not done because enumerators cannot be
overloaded.  However, there is really no reason for this restriction.
If a function argument knows that it needs an enumerated type, why
can't it compare the name given as an argument with the list of names
defined for that enumerated type?
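
In isolation, the enumerator version already works in standard C++.
The sketch below assumes a hypothetical "MiniStream" class, an
enumeration named "Radix" and a namespace "by_flag", none of which
come from a real library:

```cpp
namespace by_flag {

// Hypothetical radix flag, modelled on the article's iostream_radix.
enum Radix { dec, hex, oct };

// Hypothetical stand-in for iostream: it only stores the flag.
struct MiniStream {
    Radix radix;
    MiniStream() : radix(dec) {}
};

// No call through a function pointer: the operator stores the flag.
inline MiniStream& operator<<(MiniStream& s, Radix r) {
    s.radix = r;
    return s;
}

// Example use: the flag left in effect after "s << hex".
inline Radix radix_after_hex() {
    MiniStream s;
    s << hex;
    return s.radix;
}

} // namespace by_flag
```

The limitation appears as soon as a second enumeration in the same
scope also wants an enumerator called "hex": standard C++ rejects the
second declaration.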

Manipulators are useful if you really want to produce a side effect.
It would be even nicer if member function names could serve the same
purpose.  Then the manipulators could be defined as members of a class.
The reason they can't be is that C++ requires full qualification on
member function names when they are used as pointers:

    class iostream
    {
    public:
        void hex();
        iostream & operator << (void (iostream::*)());
    };
        ...
    cout << &iostream::hex;

As with enumerators, the expression in which "hex" occurs knows
already about the scope in which "hex" might be found.
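
The fully qualified form can be tried out in standard C++ with the
following sketch.  The class "MiniStream", the namespace "by_member"
and the two helper functions are hypothetical names chosen for this
illustration:

```cpp
namespace by_member {

// Hypothetical stand-in for the article's iostream class.
class MiniStream {
public:
    MiniStream() : radix_(10) {}

    void hex() { radix_ = 16; }
    void dec() { radix_ = 10; }

    // Accepts a pointer to a member function and applies it to *this.
    MiniStream& operator<<(void (MiniStream::*manip)()) {
        (this->*manip)();
        return *this;
    }

    int radix() const { return radix_; }

private:
    int radix_;
};

// Standard C++ requires "&MiniStream::hex"; a bare "hex" is rejected.
inline int radix_after_hex() {
    MiniStream s;
    s << &MiniStream::hex;
    return s.radix();
}

// Manipulators chain, but each one repeats the class name.
inline int radix_after_hex_then_dec() {
    MiniStream s;
    s << &MiniStream::hex << &MiniStream::dec;
    return s.radix();
}

} // namespace by_member
```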

What is needed is context-sensitive expression analysis.  There is already
precedent for this.  When the "." or "->" operators occur in expressions,
the name on the right hand of the operator is interpreted according to the
type of the object or pointer on the left.

ARC++ implements solutions to both these problems using the following
rules for interpreting names within expressions.


Enumerations
============

When a named enumeration is defined, its enumerators are defined in a
private scope rather than in the scope in which the enumeration is
declared.  This prevents conflicts with other enumerations declared in
that same scope:

    enum E { a,b,c };
    enum F { c,b,a };

The following rules are used to evaluate a name:

1. If the left-hand operand of a binary operator is of an
enumerated type, the right-hand operand will be evaluated first in
the context of that type.

    E e1 = a;
    F f1 = a;

2. If the argument to "switch" is of an enumerated type, the "case"
labels will be evaluated first in the context of that type.

    switch (e1) { case a: ... case b: ... case c: ... }
    switch (f1) { case c: ... case b: ... case a: ... }

3. If the dummy argument of a function is of an enumerated type, a
name passed as an actual argument will be evaluated first in the
context of that type.  Each test for an overload of the function
will evaluate the name:

    void ee(E);
    void ff(F);
        ...
    ee(a);
    ff(a);

4. It is not necessary for the "enum" to be defined in the current
scope for all this to work:

    class X
       {
    public:
       enum XE { i,j,k };
       XE m;
       void f(XE);
       };
        ...
    X x1;
    x1.m = i;
    switch (x1.m) { case i: ... case j: ... case k: ... }
    x1.f(j);

5. If a name is assigned to an integer and more than one enumeration
defines that name, an ambiguity is flagged.  The scope resolution
operator "::" may be used to resolve the ambiguity:

    int t = F::a;

6. If an integer value occurs in a binary expression with an
enumerated type, it should be promoted to that enumerated type:

    E e1 = a;
    e1 = e1+1;          // Resulting in e1 == b
    e1++;               // Resulting in e1 == c
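
For comparison, the examples in rules 1 through 5 can be written in
standard C++ only with explicit qualification.  The sketch below
wraps each enumeration in a struct; the nested name "type" and the
helper functions are assumptions for illustration, and the code is
not meant to show what ARC++ actually generates:

```cpp
// Each enumeration gets its own scope; "type" is a hypothetical name.
struct E { enum type { a, b, c }; };
struct F { enum type { c, b, a }; };

// Rule 3 by hand: the overloads differ in their enumerated parameter.
inline int which(E::type) { return 1; }
inline int which(F::type) { return 2; }

// Rule 2 by hand: every case label must be qualified.
inline const char* name_of(F::type f) {
    switch (f) {
        case F::c: return "c";
        case F::b: return "b";
        case F::a: return "a";
    }
    return "?";
}

// Rule 5 by hand: the qualifier selects which "a" becomes an int.
inline int e_a_as_int() { return E::a; }   // a is listed first in E
inline int f_a_as_int() { return F::a; }   // a is listed third in F
```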


NOTE: Recent C++ compilers have been moving away from allowing
integers to be converted implicitly to enumerated types.  Rule 6
reverses the current practice, which "promotes" enumerated types
to integers.  Enumerated types would still be converted to integers
when assigned to integers.  Assignment of integers to enumerated
types is disallowed unless a cast is performed.
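
In standard C++ as it stands, the arithmetic in rule 6 has to be
spelled out with a cast or with a user-defined operator.  In the
sketch below, "successor", the "operator++" overload and
"advance_twice" are assumptions for illustration, and no range
checking is done:

```cpp
enum E { a, b, c };

// e + 1 has type int, so converting back to E needs an explicit cast.
inline E successor(E e) { return static_cast<E>(e + 1); }

// Prefix ++ for E has to be supplied by the user.
inline E& operator++(E& e) {
    e = successor(e);
    return e;
}

// Example use: two increments starting from the given enumerator.
inline E advance_twice(E e) {
    ++e;
    ++e;
    return e;
}
```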

One further refinement (not yet in ARC++) should be made for the sake
of consistency.  When an expression consisting of enumerators is
assigned to a variable or dummy argument of an enumerated type, all
the names should be evaluated in the context of that type:

    enum E { a=1,b=2,c=4 };
    E e1 = a | b | c;
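
As a point of comparison, standard C++ accepts this initialization
only if the result of "|" is converted back to the enumerated type,
because "a | b | c" otherwise has type int.  One possible sketch
uses a user-defined operator; the overload and the helper
"all_flags" are assumptions for illustration:

```cpp
enum E { a = 1, b = 2, c = 4 };

// Without this overload, a | b | c is an int and cannot initialize an E.
inline E operator|(E lhs, E rhs) {
    return static_cast<E>(static_cast<int>(lhs) | static_cast<int>(rhs));
}

// Example use: combine all three flags into one value of type E.
inline E all_flags() {
    E e1 = a | b | c;
    return e1;
}
```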


Pointers to Member Functions
============================

Similar context rules are used to deal with pointers to members and
especially pointers to member functions whose syntax is extremely ugly
in current C++:

1. If the left-hand operand of a binary operator is a pointer to
member of a class, the right-hand operand will be evaluated first in
the context of that class:

    class X { public: int f(); };
    int (X::*pf)();
    pf = f;                     // Instead of: pf = &X::f;

Which is to say that it is not necessary to qualify the member
function name as required currently in C++:

    pf = &X::f;

2. If the dummy argument of a function is of type pointer to member of
a class, the actual argument will be evaluated first in the context of
that class:

    void xx(int (X::*)());
    xx(f);                      // Instead of xx(&X::f);

One advantage here is that member functions can be used as
manipulators (which was possible before except that they would have
had to be fully qualified):

    class Out
        {
    public:
        void dec();
        void hex();

        Out & operator << (int);
        Out & operator << (void (Out::*)());
        };

    Out cout;
    cout << hex << 100 << dec << 25;

3. If a pointer to member is used in a context where the "this"
pointer is needed, apply "this->*" automatically.  If the pointer to
member was just obtained by dereferencing an object, use a pointer to
that object as "this" if it is of the same class as the pointer:

    class Y
        {
    public:
        void f();
        void (Y::*pf)();
        void f1()
            {
            pf();           // Instead of:  (this->*pf)();
            }
        };

    Y y;
    Y *py = &y;
    y.pf = f;               // Instead of:  y.pf = &Y::f;
    y.pf();                 // Instead of:  (y.*(y.pf))();
    py->pf = f;             // Instead of:  py->pf = &Y::f;
    py->pf();               // Instead of:  (py->*(py->pf))();
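
For reference, the "Instead of" forms for rule 3 can be checked in
standard C++ with the sketch below.  The "calls" counter, the
constructor and the function "demo_calls" are assumptions added so
that the effect of each call can be observed:

```cpp
class Y {
public:
    Y() : pf(0), calls(0) {}

    void f() { ++calls; }
    void (Y::*pf)();     // data member holding a pointer to member function
    int calls;           // hypothetical counter, not in the article

    void f1() {
        (this->*pf)();   // the article's proposed shorthand: pf();
    }
};

// Runs the long-hand forms once each and reports how often f() ran.
inline int demo_calls() {
    Y y;
    Y* py = &y;
    y.pf = &Y::f;        // proposed shorthand: y.pf = f;
    (y.*(y.pf))();       // proposed shorthand: y.pf();
    py->pf = &Y::f;      // proposed shorthand: py->pf = f;
    (py->*(py->pf))();   // proposed shorthand: py->pf();
    y.f1();
    return y.calls;
}
```

The object has to be named twice in each long-hand call, once to
fetch the pointer and once to supply "this".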


Conclusions
===========

* These rules improve the readability of programs and can improve
their efficiency.  Keywords may have more intuitive names, and
manipulators need be used only when side effects are actually needed.

* We can now talk about libraries of enumerated types.  ARC++ also
allows the export and import of enumerated types.  Since manipulators
may now be member functions, they may be defined as part of an
exportable class (see ARC++ and Exportable Classes) instead of having
to be defined outside of the class.