Topic: ARC++ and User-Defined Operators


Author: jones@jameson.arclch.com (Ben Jones)
Date: Thu, 10 Mar 94 10:40:20 EST
This is the sixth in a series of articles which will explore ideas for
extending the C++ language.  These extensions are implemented in an
experimental preprocessor called ARC++ which takes the extended C++
syntax and generates ANSI C++.  Those who wish to try out the ideas
being presented here may obtain a copy of ARC++ for the PC, Mac,
Sparcstation, or Iris from the anonymous ftp on: "arcfos1.arclch.com".
Please go to the directory "/pub" and download the file "arc.READ_ME"
for instructions.  If you are interested in the other articles in the
series, you may download the files "arc.TUTOR1", "arc.TUTOR2", etc.


                  USER-DEFINED OPERATORS
                  ======================

                        Ben Jones
              (c) 1994 ARSoftware Corporation
                 jones@jameson.arclch.com

Introduction
============

Operator overloading can make for a cleaner-looking C++ program but
you are limited to using the operators already available in the
language, most of which already have an arithmetic meaning.  When
overloaded for non-arithmetic purposes, things can get a little
confused.  A classic example is the use of the shift operators for
input and output in the <stream.h> library:

    cout << a << (b<<1) << (c&1);

The shift operator suggests data flowing in a particular direction,
and many arithmetic expressions can be embedded in a single output
statement.  However, the user has to be careful about precedence and
must parenthesize any use of the shift operator as an actual shift.
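
The precedence trap can be seen in plain C++ without any extensions.
The sketch below is only an illustration: the function names are
hypothetical, and it writes to a string stream instead of cout so the
result can be inspected.

```cpp
#include <sstream>
#include <string>

// Formats three values the way the cout example above does.
// The inner parentheses are required: "os << c & 1" is rejected by the
// compiler because << binds tighter than &, so it would mean (os << c) & 1.
std::string format_values(int a, int b, int c) {
    std::ostringstream os;
    os << a << ' ' << (b << 1) << ' ' << (c & 1);
    return os.str();
}

// The same shift without parentheses compiles, but it inserts b and
// then 1 as two separate values instead of inserting b shifted left.
std::string format_unparenthesized(int a, int b) {
    std::ostringstream os;
    os << a << ' ' << b << 1;
    return os.str();
}
```

With a=1, b=2, c=3 the first function yields "1 4 1", while the second
yields "1 21" for a=1, b=2.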

Sometimes additional operators are proposed for C++.  Some time ago,
there was a quasi-religious furor on "comp.lang.c" over the issue of
whether to provide an exponentiation operator such as Fortran's ** in
the language.  It looked like people were lining up on different sides
to brand each other as heretics for daring to suggest one form or
another.


Defining New Operators
======================

To totally sidestep arguments like this, ARC++ provides Freedom Of
Choice: Any combination of punctuation marks excluding ( ) [ ] { } , ;
? may be used as an operator.  For example, you might want to use the
combination "<--" to indicate a stream.  This would be declared as
follows:

    ostream& operator <-- (ostream&,int);

    cout <-- a <-- b<<1 <-- c&1;

ARC++ also allows names to be defined as operators:

    vector& operator dot (vector&,vector&);

    a = b dot c;

ARC++ also allows any characters in the range \200 to \377 to be
defined as operators.  This might be very useful if you have a good
symbol font available.  However, I cannot show any examples here.


Precedence and Grouping
=======================

When defining new operators, some way of indicating precedence and
grouping is needed.  ARC++ uses the following construction:

    vector& operator_+ dot (vector&,vector&);
    vector& operator_* cross (vector&,vector&);

    a = b cross c dot d cross e;

That is, "operator_" followed by an existing operator copies the
precedence and grouping rules from that existing operator into
the new one being defined.  Here "cross" binds like * and "dot"
binds like +, so the expression groups as (b cross c) dot (d cross e).
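
The effect of copying precedence can be checked mechanically.  The
following is a minimal sketch in standard C++, not ARC++'s actual
parser: it uses precedence climbing over a hypothetical table in which
"dot" shares the level of + and "cross" shares the level of *.  The
numeric levels, the function names, and the restriction to
space-separated tokens are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical operator table: higher numbers bind more tightly.
// "dot" copies the level of '+', "cross" copies the level of '*'.
static const std::map<std::string, int> kPrecedence = {
    {"+", 1}, {"dot", 1},
    {"*", 2}, {"cross", 2},
};

// Split a space-separated expression into tokens.
std::vector<std::string> tokenize(const std::string& text) {
    std::istringstream in(text);
    std::vector<std::string> tokens;
    for (std::string t; in >> t;) tokens.push_back(t);
    return tokens;
}

// Precedence climbing for binary operators that group left to right.
// Returns the expression with explicit parentheses around each operation.
std::string parse(const std::vector<std::string>& tokens,
                  std::size_t& pos, int minPrec) {
    std::string lhs = tokens.at(pos++);  // operand
    while (pos < tokens.size()) {
        auto it = kPrecedence.find(tokens[pos]);
        if (it == kPrecedence.end() || it->second < minPrec) break;
        const std::string op = it->first;
        const int prec = it->second;
        ++pos;
        // Left-to-right grouping: the right operand may only absorb
        // operators that bind more tightly than this one.
        std::string rhs = parse(tokens, pos, prec + 1);
        lhs = "(" + lhs + " " + op + " " + rhs + ")";
    }
    return lhs;
}

// Convenience wrapper: fully parenthesize a space-separated expression.
std::string group(const std::string& text) {
    std::vector<std::string> tokens = tokenize(text);
    std::size_t pos = 0;
    return parse(tokens, pos, 0);
}
```

Calling group("b cross c dot d cross e") returns
"((b cross c) dot (d cross e))", and group("b dot c dot d") returns
"((b dot c) dot d)".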

New levels of precedence may be defined relative to old ones by
using a literal integer to indicate a level of precedence.  An
even number indicates left to right grouping.  An odd number
indicates right to left.  For example:

    double operator_* 3 ** (double,double);

This would define the exponentiation operator ** to have a
higher precedence than the times operator * and have right
to left grouping.  Thus:

    a = b**c**d;

would cause "c" to be raised to the "d" power and then "b" to be
raised to that resulting power, just as in Fortran.
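
The difference between the two groupings is easy to see with ordinary
function calls.  This is a small sketch in standard C++ using std::pow;
the function names are hypothetical and are not what ARC++ generates.

```cpp
#include <cmath>

// Right-to-left grouping, as described above for **: b ** (c ** d).
double power_right(double b, double c, double d) {
    return std::pow(b, std::pow(c, d));
}

// Left-to-right grouping, shown for contrast: (b ** c) ** d.
double power_left(double b, double c, double d) {
    return std::pow(std::pow(b, c), d);
}
```

For b=2, c=3, d=2 the right-to-left form computes 2 to the 9th power,
or 512, while the left-to-right form computes 8 squared, or 64.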


Overloading New Operators
=========================

Once a new operator is defined, it may be overloaded without having
to specify the precedence rules each time.  Within a given
compilation, the precedence of an operator may not be changed once it
has been established.  This restriction could cause conflicts when
combining libraries that define the same new operator with different
precedences.

One potential solution would be to allow different overloads of an
operator to have different precedences.  This would require some
dynamic rearranging of the parse tree based on the types of the
operands.


Other Potential Problems
========================

C and C++ already use every punctuation mark in the ASCII character
set except for the accent grave `, the at sign @, and the dollar
sign $.  This means that new
operator combinations may look very similar to existing ones.  ARC++
looks for the longest operator it can recognize when separating tokens
out of the input stream.  Spaces and parentheses can be used to set
off the operators but there is still the potential for breaking
existing code.

For example, if the ** operator is defined then a**b would no longer be
interpreted as "a" times the item pointed to by "b".  Of course, anyone
whose code is broken by the existence of the ** operator should be shot.
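
The longest-match rule and its effect on a**b can be sketched in
standard C++.  This is not ARC++'s tokenizer, only an illustration of
the technique: the function name and the operator sets passed to it
are hypothetical, and anything that is neither an identifier nor a
listed operator is emitted as a one-character token.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Splits text into identifiers and operators, always taking the longest
// operator from `ops` that matches at the current position.
std::vector<std::string> split_tokens(const std::string& text,
                                      const std::set<std::string>& ops) {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < text.size()) {
        unsigned char ch = static_cast<unsigned char>(text[i]);
        if (std::isspace(ch)) { ++i; continue; }
        if (std::isalnum(ch) || ch == '_') {
            // Identifier or number: consume the whole run.
            std::size_t start = i;
            while (i < text.size() &&
                   (std::isalnum(static_cast<unsigned char>(text[i])) ||
                    text[i] == '_')) {
                ++i;
            }
            tokens.push_back(text.substr(start, i - start));
            continue;
        }
        // Operator: fall back to one character unless a longer
        // registered operator matches here.
        std::size_t best = 1;
        for (const std::string& op : ops) {
            if (op.size() > best && text.compare(i, op.size(), op) == 0) {
                best = op.size();
            }
        }
        tokens.push_back(text.substr(i, best));
        i += best;
    }
    return tokens;
}
```

With only "*" registered, "a**b" splits into a, *, *, b, which is the
existing C++ reading.  Once "**" is also registered, the same text
splits into a, **, b, while "a* *b" still yields a, *, *, b because
the space separates the two stars.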