Topic: Second draft: proposal to add an exponentiation operator to C++


Author: jbuck@forney.berkeley.edu (Joe Buck)
Date: 8 Mar 92 21:14:27 GMT
This is the second round for this proposal.  I've added a bit more
commentary, answered a few more objections, and corrected an error:
-0.25@-2 is perfectly well defined, and I knew that, but I screwed up when
I wrote it up.

What's next?  If someone will send me the "standard form" for writing
up a proposed extension for the C++ standards committee, I'd be willing
to rewrite it.  If this proposal is to go any further, I'll need a
"champion"; i.e. a person who is already on the committee and interested
in this extension who'd be willing to argue for it at committee meetings.
(If such a person doesn't exist, the proposal isn't going to get through
anyway).

Followups are directed to comp.std.c++.
------------------------------------------------------------------------
Draft Proposal to add an exponentiation operator to C++

Version 2
March 8, 1992
Joseph T. Buck (jbuck@ohm.berkeley.edu)

The portion of the grammar described on p. 72 of the ARM is revised as
follows:

exp-expression:
 pm-expression
 pm-expression @ exp-expression

multiplicative-expression:
 exp-expression
 multiplicative-expression * exp-expression
 multiplicative-expression / exp-expression
 multiplicative-expression % exp-expression

The revision introduces a new operator, @, which is right-associative
and binds more tightly than the multiplication/division operators
(*, /, and %), and more loosely than .*, ->*, cast operators, or
unary operators.  The grammar change basically sticks a new production
in between pm-expression and multiplicative-expression.
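Since no compiler accepts @, the proposed grouping can only be mirrored with pow today; a minimal sketch (the function name is mine, not part of the proposal):

```cpp
#include <cmath>

// Under the proposed grammar:
//   a * b @ c @ d   parses as   a * (b @ (c @ d))  (right-associative, tighter than *)
//   -x @ 2          parses as   (-x) @ 2           (unary minus binds tighter than @)
// Mirroring the first grouping with std::pow:
double proposed_grouping(double a, double b, double c, double d) {
    return a * std::pow(b, std::pow(c, d));
}
```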

The @ operator returns the result of raising the first operand to the
power represented by the second operand.

The exponentiation operator, @, groups right-to-left.  The operands
of @ must have arithmetic type.

The "usual arithmetic conversions" of section 4.5 are used, with the
following exception:

If the first argument is of one of the types long double, double, or
float, and the second argument is of integral type, integral promotion
(ch 4.1) takes place on the second argument, but no change is made to
the first argument.

 Commentary: we wish to allow -2.0@i, but not -2.0@x, where i
 is of integral type and x is of floating type.
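 A sketch of the distinction, with pow (today's std::pow, which follows the same domain rule) standing in for the proposed @:

```cpp
#include <cmath>

// A negative base is fine when the exponent is integral...
double neg_base_int_exp() { return std::pow(-2.0, 3); }     // (-2)^3 = -8

// ...but is a domain error when the exponent is a non-integral floating value.
double neg_base_float_exp() { return std::pow(-2.0, 0.5); } // NaN
```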

The effect of these rules is to allow the following combinations of
arguments and result types:

 NOTE: I know that the language does not permit overloading
 on builtin arguments.  I am showing the "simulated prototypes"
 for the argument combinations that can occur.

Group 1:
long double operator@(long double,long double);
double operator@(double,double);
float operator@(float,float);

Group 2:
long double operator@(long double,long int);
long double operator@(long double,unsigned int);
long double operator@(long double,int);
double operator@(double,long int);
double operator@(double,unsigned int);
double operator@(double,int);
float operator@(float,long int);
float operator@(float,unsigned int);
float operator@(float,int);

Group 3:
long int operator@(long int,long int);
unsigned int operator@(unsigned int,unsigned int);
int operator@(int,int);

For all three groups, the result when both arguments are zero
is undefined.

For group 1 (both arguments of floating type), if the first argument
is negative the result is undefined.

For group 2 (floating base raised to integral power), if the first
argument is zero and the second argument is zero or negative, the
result is undefined.  Specification of the precision of group 1 and
group 2 operations is outside the scope of this standard.

 Commentary: note that neither ANSI C nor ANSI C++ mandates
 the use of IEEE floating point; it would be inconsistent
 to demand that the @ operator be specified more strictly
 than the * operator.

For group 3 (both arguments of integral type), the result of a negative
second argument is undefined.  Provided that the result of the operation
can be represented in the result type, group 3 operations are guaranteed
to be exact.
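The proposal does not say how a group 3 (integer) operation is carried out; one plausible sketch (exponentiation by squaring, with a function name of my choosing) that stays exact whenever the mathematical result fits in the result type:

```cpp
// Sketch only: binary exponentiation for an integral base and a
// non-negative integral exponent; exact whenever the result fits in long.
long ipow(long base, unsigned long exp) {
    long result = 1;
    while (exp) {
        if (exp & 1) result *= base;  // fold in the current bit's factor
        exp >>= 1;
        if (exp) base *= base;        // square for the next bit
    }
    return result;
}
```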

 Commentary: I expect such issues to be handled in extensions to
 standards such as the IEEE floating point standard to cover
 exponentiation and the transcendental functions; just as an
 ANSI C compiler need not implement IEEE floating point, an
 ANSI C++ compiler need not implement these rules for
 exponentiation and the transcendentals.

The effect of an overflow is implementation-dependent.

 Commentary: "undefined" means that an arithmetic exception
 may occur, or that the result may be garbage, for example,
 0@0 might return 1, 0, or an error.  I've been asked to
 define 0@0 to be 1, but I can't in good conscience mandate
 something that isn't correct.

 I'm choosing "undefined", rather than specifying exceptions,
 to be consistent with how the ARM deals with things like
 division by zero.

-------------------------------------------------------------------

"class complex" is not part of the standard at this point.  Should
it be standardized, the following operator overloads might be used:

complex operator@(complex b,complex e) {
 return exp(e*log(b));
}

Note, however, that log(complex) is multi-valued, and that just
using the principal value might not suffice in some applications
(e.g. approximation to contour integrals on the complex plane).

// optional: might be desirable because log(double) is cheaper
// and uniquely defined:

complex operator@(double b,complex e) {
 return exp(e*log(b));
}

Alternatively, complex arguments might be passed as "const complex&".
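With today's std::complex standing in for the 1992 "class complex", the overload above can be exercised directly (cx_pow is my stand-in name, since @ cannot be overloaded today).  Because std::log returns the principal branch, (-1)@0.5 comes out as complex(0,1), matching question 1 below:

```cpp
#include <complex>

// Sketch: the proposed complex overload, written with std::complex.
// std::log returns the principal branch, as the text cautions.
std::complex<double> cx_pow(std::complex<double> b, std::complex<double> e) {
    return std::exp(e * std::log(b));
}
```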

-------------------------------------------------------------------

Now for the objections and questions:

1.  But -1@0.5 is complex(0,1).  Why shouldn't that result be returned?

It is inconsistent with the language for the type of the result to
depend on the value of the arguments.  If a user wants to deal with
complex results, it's not difficult to specify complex classes.  Every
other operator in the language restricts the result to be no more
general than the type of its arguments; e.g. 3/2 = 1, not 1.5, even
though 1.5 is "correct".

2.  But nevertheless, complex(0,1) is the right answer.  Users won't
accept this.

Experience with Fortran suggests otherwise.  The semantics I've chosen
generally agree with Fortran, and yet everything fits nicely with C++
conventions.

3.  You're trying to make my favorite language more like Fortran, and
Fortran is an inferior language.

You're not going to get a lot of the people who use Fortran to convert to
C++ as long as you make life difficult for them.  It's not just a matter
of syntax; most compilers are simply going to generate worse code if there
isn't an exponentiation operator, unless the users write their code more
carefully than we can expect.  Fortran is inferior in many ways, but there
is a reason why it's still used for large scientific problems, and the
exponentiation operator is one of the reasons (vectorizability is another,
but that's another argument).

4.  Why don't you just use the "usual arithmetic conversions" (chapter
4.5 of the ARM)?  Why the exception?

Because users will be unhappy if X@2 is undefined for negative X, when
it's perfectly well defined.  Under the usual arithmetic conversions
alone, the 2 would be converted to floating type, turning X@2 into a
group 1 operation, which is undefined for negative X.

5.  Why isn't it enough to use "pow", especially since you can overload it
to define pow(double,double), pow(double,int), and pow(int,int)?

Most scientific codes contain large amounts of exponentiation; raising
a real exponent to an integer power where the integer is known at compile
time is common, but real or integer unknown exponents are also common.
Evaluation of polynomials (where x@1, x@2, ... are used in sequence) is
a common operation.  Forcing the use of pow(...) has several harmful effects:

 The code size increases and complicated expressions become more
 difficult to read.  C/C++ programmers who don't think this is
 a problem haven't seen scientific codes, where even with the
 exponentiation operator, expressions can require several lines
 to write.

 Strength reduction becomes much more difficult to apply, unless
 pow(...) is made a special function known by the compiler.

 Optimizations commonly applied in Fortran compilers when the same
 base appears with several constant exponents are also more difficult.

 Scientific users are put off by the lack of a feature they find
 very useful.  From a posting by Robert Davies:

"Some people have strong feelings about the need for an exponentiation
operator. I quote from Press, Flannery, Teukolsky and Vetterling in their
respected book "Numerical Recipes in C", pp 14 and 23 "... the slowness of C's
penetration into scientific computing has been due to deficiencies in the
language that computer scientists have been (we think, stubbornly) slow to
recognize. Examples are the lack of a good way to raise a number to small
integer powers ...", "The omission of this operator from C is perhaps the
language's most galling insult to the scientific programmer"."
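As a sketch of what the optimizations above buy: a cubic c0 + c1*x + c2*x@2 + c3*x@3 can be rewritten, by the compiler or the programmer, into Horner form with three multiplies and no function calls.  The function below is illustrative only, not part of the proposal:

```cpp
// Horner's rule: c0 + c1*x + c2*x^2 + c3*x^3 in three multiplications --
// the kind of rewriting that is hard once each power is an opaque pow() call.
double horner_cubic(double c0, double c1, double c2, double c3, double x) {
    return c0 + x * (c1 + x * (c2 + x * c3));
}
```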

6.  Why not use ** as in Fortran?

It would break "int main(int argc,char **argv)".  We can't make ** a
token.  As for tricks to make ** do exponentiation for user-defined
classes, I have seen tricky code that overloads both unary * and
binary * to make ** look like an exponentiation operator for a class,
but I'm not impressed with the hackery.  Such tricks tend to fail when
more complicated expressions are considered.

7.  How about ^, ^^, #, or ~ ?

^ is taken (exclusive-or).  It has the wrong precedence and associativity
to be used as an overloaded operator signifying exponentiation.

I could live with ^^, but there is a relation between & and && and |
and || that might suggest a completely different meaning for ^^ to some
(a logical rather than a bitwise exclusive-or).

# doesn't suggest the right thing to me; also it might confuse the
preprocessor if it appeared as the first nonblank character on a line.

~ is currently a unary operator but not a binary operator, and it
would be possible to use it for exponentiation instead; at least
I don't see any conflicts.  I prefer @, but this may be a matter of
personal taste.  Using ~ instead of @ would be acceptable.

8.  @ already has a meaning in gdb.  This means we can't use it.

The Gnu project is busy adding more languages; as they are added, gdb
will also grow to parse all of those languages.  There will inevitably
be conflicts between language features and gdb features, and in all
such conflicts, debuggers should give way before languages.  It
won't be difficult to add a quoting mechanism for distinguishing
language-specific expressions from debugger expressions.  If ~ is
chosen instead of @, it should be for a better reason than not
breaking gdb.

9.  It will be hard to implement.

I disagree.  The modification to the grammar is simple.  The single
exception to the "usual arithmetic conversions" rule is not difficult
to check for.  A quick and dirty implementation can simply insert
function calls for the various templates shown above, and their number
can be reduced in many environments; strength reduction for common
cases (exponent is a small constant integer) is very easy to apply.
Compared to, say, implementing exceptions, the work required for this
change will be trivial.
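One way the "exponent is a small constant integer" case could be strength-reduced is sketched below with modern C++ templates.  This machinery did not exist in 1992 and is only an illustration of the idea, not the mechanism the proposal envisions:

```cpp
// Sketch: compile-time strength reduction for x@N with constant N,
// expanding into multiplies by repeated squaring (requires C++17).
template <unsigned N>
double pow_const(double x) {
    if constexpr (N == 0) {
        return 1.0;
    } else if constexpr (N % 2 == 0) {
        double half = pow_const<N / 2>(x);  // x^N = (x^(N/2))^2
        return half * half;
    } else {
        return x * pow_const<N - 1>(x);     // peel one factor off an odd N
    }
}
```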

10.  Andy Koenig's example and questions: is x@3 guaranteed to be
equal to x@3.0?  To x*x*x?

The short answer to Andy's question is "no".

Such questions are beyond the scope of the C++ standard.  After all,
the C++ standard does not answer the question: is x*2 guaranteed to
be equal to x+x?  The IEEE standard for floating point arithmetic
DOES answer the question, but C++ implementations are not required to
implement IEEE floating point (DEC and Cray would be very upset if
this changed).

Implementers would do well to learn from the experience of Fortran
implementers, who have 30 years of experience in dealing with this
problem.

11.  But for implementations like cfront, which produce C, the lack of
an exponentiation operator in C is a problem.

Just as overloaded operators for classes are turned into function calls
unless they are inlined, a cfront-like implementation could inline
some cases (turning x@2 into x*x, say) and generate function calls
for the rest.  cfront already generates function calls for user-defined
operators; @ could be treated the same way even when used on builtin
types.

12.  What is the effect on existing programs?

None.  "@" is not used in the current grammar; no existing program
will break.

13.  We shouldn't have an exponentiation operator because exponentiation
isn't "close to the machine".

This argument is perhaps better suited for C than for C++.  Just the same,
this argument isn't as clear-cut as it might appear.  Many machines lack
an integer division instruction; on some DSP chips, it can cost about the
same to evaluate i@j as i/j for integers i and j (where their values are
not known until runtime), and cost substantially less when j is known at
compile time, because division can be 30 times more expensive than
multiplication.  Would a language implementer consider not including
division because, on most RISC machines, division is not close to the
machine?

14.  "@" isn't on all keyboards; this is the trigraph problem all over
again.

Someone has commented by email "I believe that @ is specific to US ASCII".
I don't know that it is.  Let's suppose that this is true.  If this were
the first such case, it would be a strong objection.  But since there are
already about nine other cases, this one can be solved in similar ways
(trigraph, special keyword).  If that isn't acceptable, I suppose that
binary ~ could be used instead, but ~ is not on all keyboards either.

15.  Are there any other benefits or impact?

Many have argued that the most important reason for adding the
exponentiation operator is not so that it can be used on builtin types,
but rather so that it can be used on user-defined types.

The new operator might also be used for overloading in situations where
the intuitive meaning "at" is a useful mnemonic, just as << and >> are
used with streams.

If we attract more Fortran users they might ask us to implement
EQUIVALENCE next. :-)
--
Joe Buck jbuck@ohm.berkeley.edu