Topic: A proposal to add an exponentiation operator to the C++ language
Author: matt@physics.Berkeley.EDU (Matt Austern)
Date: 10 Sep 1992 00:40:56 GMT
Just as the subject line states, this is my proposal to add an
exponentiation operator to the C++ language. For the most part, I'm
just repeating the things that I, and other people, said in the
discussion that took place a while ago on comp.lang.c++ and
comp.std.c++. Now it's all in one place, though; I think that I
have addressed all of the important concerns.
This is a fairly long LaTeX file; it will be much more legible if you
run it through LaTeX and print it out. If you want to read it but
don't have TeX, send me mail, and maybe we'll be able to figure
something out.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\documentstyle[12pt]{article}
\catcode`\^=11
\def\~{\char126}
\def\@{\char64}
\def\op{{\tt *^}}
\def\C{{\tt C++}}
\def\oldC{{\tt C}}
\begin{document}
\title{A proposal to add an exponentiation operator to the C{\tt++} language}
\author{Matthew H. Austern\\
{\normalsize\it Lawrence Berkeley Laboratory; Berkeley, CA 94720}\\
{\normalsize\tt (matt@physics.berkeley.edu)}}
\date{September 8, 1992}
\maketitle
\begin{abstract}
This paper is a description of a proposal to extend \C\ by adding an
exponentiation operator. The proposal itself is given in
Section~\ref{proposal}; the remainder of this paper is an argument for
the desirability of this extension, and an analysis of it.
\end{abstract}
\section{Introduction}
On several occasions, I have happened to remark to a friend who does
not know either \oldC\ or \C\ that I am working on a proposal to add
an exponentiation operator to \C. On all such occasions, the response
has been incredulity: they are unwilling to believe that the language
does not already have such an operator. We, who have used \oldC\ and
\C\ for many years, have had time to get used to the exponentiation
operator's absence, but novice users still find its absence
surprising.
I suggest that their na{\"\i}ve expectation is correct: \C\ should
have such an operator, particularly as it is possible to add one to
the language with minimal effort and with no effect on existing code.
This paper is a proposal to extend the \C\ language by adding this
operator.
In Section~\ref{justification}, I explain the reasons why an
exponentiation operator is desirable; this section includes a
discussion of possible alternatives to this operator, and the reasons
why they are inadequate. In Section~\ref{proposal-section}, I
describe the proposal in detail, and, in Section~\ref{rationale}, I
describe the rationale behind the design choices presented there.
Finally, in Section~\ref{questions}, I address various questions about
this extension, and, in Section~\ref{objections}, I address possible
objections to it. I conclude in Section~\ref{conclusion}.
Note that much of the material in this document is not original with
me; to a large extent, I am simply transcribing the consensus about
this issue which has formed on the Usenet newsgroup {\tt
comp.lang.c++}. In particular, I acknowledge the work of Joe Buck.
This proposal differs from his only in small details, and in the
extent of the discussion. I also wish to thank John Skaller for help
with writing this document.
\section{Why is this proposal important?}\label{justification}
\subsection{The importance of exponentiation}\label{importance}
Examining a moderate-sized (30,000 line) {\sc fortran}
program\footnote{{\sc papageno}, written by I.~Hinchliffe.}, I found
that the exponentiation operator was used quite commonly: about half
as often as the division operator. Or, to put it differently: there
was an average of about one use every six lines. In my field, at
least (high-energy physics), this program is rather typical:
exponentiation is a common operation in mathematical expressions. It
is certainly much more common, in the kinds of programs that I write
and work with, than are any of the bitwise operators!
The primary justification for an exponentiation operator, then, is
simple: it is one of the basic binary operators of mathematics. Just
as it would be excessively clumsy to use the syntax {\tt add(x,y)} for
addition, or {\tt div(x,y)} for division, so it is excessively clumsy
to use the syntax {\tt pow(x,y)} for exponentiation. A function call
looks very different from the way that exponentiation is denoted in
ordinary mathematical expressions written down on paper, and in
complicated mathematical expressions this syntactic clumsiness can
have a very serious deleterious effect on clarity.
(It is unnecessary to explain the importance of clarity; there is,
however, a specific reason, in addition to the usual ones, why it is
important for mathematical expressions in particular. It is often
necessary to verify that a formula in the code is the same as a
formula on paper, or in another program. The clearer the notation,
the more likely it is that this can be done without error.)
It should be noted that the most common use of exponentiation, by far,
is raising a floating-point number to a small integral power which is
known at compile time; that is, in {\sc fortran} programs, an
expression like {\tt x ** 4} is much more common than one like {\tt x
** 0.007297}, or one like {\tt x ** y}. The problem, then, is
particularly acute: not only does \C\ not provide an operator for
exponentiation, but it provides no means whatsoever for raising a
number to an integral power. The function call {\tt pow(x,3)}, for
example, is equivalent to the function call {\tt pow(x,3.0)}. On most
systems, this is a serious loss of efficiency, and possibly precision
as well, since cubing a number is much simpler than raising that
number to some arbitrary non-integral power, a task which requires
computing transcendental functions.
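
As a rough illustration of the two routes, compare a call to the
library routine with explicit multiplication. This is a sketch only;
the function names {\tt cubeByPow} and {\tt cubeByMultiply} are
hypothetical, chosen for this example.
\begin{verbatim}
#include <cmath>

// Cube a number through the general-purpose library routine,
// which must cope with arbitrary non-integral exponents.
double cubeByPow(double x) { return std::pow(x, 3.0); }

// Cube a number with two multiplications.
double cubeByMultiply(double x) { return x * x * x; }
\end{verbatim}
The second form is what a programmer would like an integral exponent
to produce; the first is all that the library currently offers.
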
I believe that an exponentiation operator is important primarily for
scientific programmers, and for others who write numerical code.
Almost all scientific programmers find \C's lack of an exponentiation
operator to be at least an inconvenience, and some find it almost
intolerable. (Consider, for example, the very strong language used in
Chapter~1 of {\sl Numerical Recipes in C\/}.\footnote{ W.~H.~Press,
B.~P.~Flannery, S.~A.~Teukolsky, and W.~T.~Vetterling, {\sl Numerical
Recipes in C\/} (Cambridge: Cambridge University Press), 1988.}) Some
scientific programmers have chosen not to use \oldC\ or \C\ partly for
this reason.
People who do not write numerical programs will probably find a \C\
exponentiation operator neither beneficial nor detrimental.
\subsection{Possible alternatives}
\subsubsection{Programming techniques}
In the \C\ language as it currently stands, there are no satisfactory
methods for performing exponentiation. As described above, there are
two distinct problems:
\begin{enumerate}
\item Using function calls is syntactically clumsy.
\item The language provides no way to raise a number to an integral
power.
\end{enumerate}
There is no way to resolve the first difficulty without changing the
language; it cannot be solved by operator overloading. There are two
reasons for this, either of which, on its own, is sufficient to
preclude such a solution. First, operator overloading applies only to
user-defined types; a useful exponentiation operator, however, must be
defined for arguments of type {\tt double}, {\tt float}, and {\tt
int}. Second, there is no operator with a precedence suitable for
this overloading. All of the binary operators which might be chosen
(such as {\tt ^}) have a lower precedence than multiplication and
addition. In ordinary mathematical notation, and in all other
computer languages that have exponentiation operators, exponentiation
binds more tightly than multiplication; an exponentiation operator
which bound less tightly than addition would be confusing, and would
be an invitation to errors. Combined, these two objections are so
formidable a barrier that I have never seen even an attempt to
implement exponentiation by overloading some existing operator.
The second difficulty---the inability to specify that the exponent is
some small integral power---can be circumvented by the programmer, but
only at the cost of some inconvenience. One possibility is to write a
user-defined function, {\tt pow(double,int)}. Unfortunately, with most
compilers today, {\tt pow(x,2)} is unlikely to expand to something as
efficient as {\tt x*x}, no matter how this function is defined and no
matter what is declared {\tt inline}.
Furthermore, a proper implementation of this function is likely to
depend on the architecture of the target machine; such functions
properly belong to the realm of the library author or the compiler
writer, not the individual programmer. Still, with a moderate amount
of effort, it is possible to write a version of {\tt pow(double,int)}
which is preferable to the {\tt pow(double,double)} in the standard
library.
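
One way such a function might be written is by repeated squaring. The
following is a minimal, portable sketch, not a tuned implementation;
the name {\tt ipow} is hypothetical, and no attempt is made at the
architecture-specific refinements mentioned above.
\begin{verbatim}
// Hypothetical helper: raise x to the integral power n by
// repeated squaring, using O(log |n|) multiplications.
// A negative n yields the reciprocal.  No domain checking is
// attempted: ipow(0.0, -1) divides by zero.
double ipow(double x, int n)
{
    // Take the magnitude as an unsigned value, so that the
    // most negative int does not overflow on negation.
    unsigned int m = (n < 0)
        ? 0u - static_cast<unsigned int>(n)
        : static_cast<unsigned int>(n);
    double result = 1.0;
    double base = x;
    while (m != 0) {
        if (m & 1u) result *= base;  // this exponent bit is set
        base *= base;                // square for the next bit
        m >>= 1;
    }
    return (n < 0) ? 1.0 / result : result;
}
\end{verbatim}
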
This effort is sufficiently great, however, that a more commonly used
technique is to define a series of small inline functions: {\tt
square(dou\-ble)}, {\tt cube(dou\-ble)}, {\tt fourth(double)}, {\tt
fifth(double)}, and so on, and then to use the standard library
function {\tt pow(double,double)}, or a user-defined function {\tt
pow(dou\-ble,int)}, for those cases where the exponent is not small or
is not known at compile time. This solution is ugly, and requires a
fair amount of programmer effort, but it does at least allow the
programmer to raise a number to a small integral power.
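
The workaround just described might look like the following sketch.
The three small functions are the ones named above; {\tt sphereVolume}
is a hypothetical caller, added only to show how they read in a
formula.
\begin{verbatim}
inline double square(double x) { return x * x; }
inline double cube(double x)   { return x * x * x; }
inline double fourth(double x) { double s = x * x; return s * s; }

// Example use: volume of a sphere of radius r.
double sphereVolume(double r)
{
    const double pi = 3.141592653589793;
    return 4.0 / 3.0 * pi * cube(r);
}
\end{verbatim}
Each new exponent requires another function, which is the ugliness
complained of above.
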
Some programmers use even more cumbersome workarounds---a lookup table
of powers, for example.
\subsubsection{Changes to the standard library}
If the functions {\tt pow(double,int)} and {\tt pow(float,int)} were
added to the standard library, this would alleviate at least part of
the difficulty. (This was not a possible solution for \oldC, which
does not allow function overloading.) This would be a satisfactory
means of raising numbers to small integral powers, and would be a
distinct improvement over the present situation.
This solution is really only useful if it is done by all compiler
vendors, {\em i.e.}, if it is mandated in the Standard. Nobody is
going to write {\tt pow(x,2)} to square a quantity if most compilers
are just going to pass it to {\tt pow(dou\-ble,dou\-ble)} and compute it
by means of transcendental functions.
One difficulty with this solution is that it would change the meaning
of existing code: {\tt pow(x,2)} and {\tt pow(x,2.)}, for example,
would now represent calls to two different functions. Whenever the
meaning of existing code is changed, even in a seemingly innocuous
way, there is some risk of breaking a currently working program.
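
The change in meaning can be seen in a small sketch. The two functions
below are hypothetical stand-ins placed in their own namespace ({\tt
sketch}), not the library functions themselves; each simply reports
which overload was selected.
\begin{verbatim}
#include <string>

namespace sketch {
    std::string pow(double, double)
    { return "pow(double,double)"; }

    std::string pow(double, int)
    { return "pow(double,int)"; }
}

// With both overloads visible, the type of the literal
// decides which function is called.
std::string withIntLiteral(double x)
{ return sketch::pow(x, 2); }

std::string withDoubleLiteral(double x)
{ return sketch::pow(x, 2.); }
\end{verbatim}
Before the second overload existed, both calls would have reached the
same function.
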
Note also that this solves only one of the two problems
discussed above. The more important problem, the clumsy syntax, still
remains. Library solutions, by their nature, cannot address the
syntactic problem.
\subsubsection{Use of an existing operator}
Several operators exist ({\it e.g.}, {\tt ^}) which are not defined
for floating-point operands; one could imagine changing the language
so that, if at least one argument is of a floating-point type, it
denotes exponentiation. This is a poor idea, however, because all of
these operators have precedence lower than addition; as discussed
above, such a precedence for an exponentiation operator would be
grossly counter-intuitive.
\section{The proposal} \label{proposal-section}
\subsection{Changes to the language} \label{proposal}
A new token, \op, will be added to the \C\ language. This token will
be a binary operator, and will denote exponentiation. It will group
right to left, and will have a precedence higher than multiplication
and division, but lower than the pointer-to-member operators. (Note
that this is a new precedence level.)
Both operands of this operator must be of a numeric type.
If the first operand is a floating-point type and the second is an
integral type, then the type of the expression is that of the first
operand; the expression is evaluated {\em without} converting the
second operand to the same type as the first. Except for this special
case, the usual arithmetic conversions are performed on the operands,
and determine the return type.
The following conditions on the operands constitute a domain error:
\begin{enumerate}
\item The first operand is zero, and the second is negative.
\item The first operand is negative, and the second is not
an integer. (An implementation is allowed, but
not required, to interpret ``not an integer'' as ``not a
number of an integral type.'')
\end{enumerate}
The value of the expression {\tt 0 *^ 0} is implementation-dependent,
regardless of whether the types of the operands are
integral or floating-point. An implementation is allowed to treat
this as a domain error.
A second token, {\tt *^=}, will also be added. It will be defined in
the same way as all of the other {\it op\/}{\tt =} operators.
As with other operators, \op\ and {\tt *^=} may be overloaded by the
user if at least one operand is an object of a class type.
\subsection{Changes to the \C\ reference manual}
A new section describing the exponentiation operator will have to be
added to \S5 of the reference manual, which describes expressions; it
would logically fit between \S5.5 and \S5.6 of the {\sl Annotated
Reference Manual\/}\footnote{M.~A.~Ellis and B.~Stroustrup, {\sl The
Annotated C++ Reference Manual} (Reading: Addison-Wesley),
1990.}~(ARM), but will probably have to be placed elsewhere instead,
to avoid renumbering most of the chapter. It will include the
following description of the grammar:
\begin{it}
\hspace*{3cm}power-expression: \\
\hspace*{4cm} pm-expression \\
\hspace*{4cm} pm-expression *^ power-expression.
\end{it}
In \S5.6, which describes multiplicative expressions, the references
to {\it pm-expression} will have to be changed to {\it
power-expression}.
If an {\it op\/}{\tt =} operator for exponentiation is added as well
(see Section~\ref{op-equals}), it will have to be included in the list
of {\it op\/}{\tt =} operators at the beginning of \S5.17 of the ARM.
\section{Rationale behind details of the proposal} \label{rationale}
\subsection{General discussion}
The primary goal behind this proposal is to make the behavior of the
exponentiation operator as unsurprising as possible. Specifically,
this means that its behavior should be consistent with
\begin{enumerate}
\item The meaning of exponentiation in ordinary mathematical notation;
\item The behavior of exponentiation operators in other languages
(particularly {\sc fortran}); and
\item The behavior of other \C\ operators.
\end{enumerate}
Once these design goals are accepted, there is very little freedom
remaining in choosing how the exponentiation operator should behave;
in almost all cases, only one choice is reasonable.
\subsection{The name of the operator}
Most other languages use {\tt ^} or {\tt **} as exponentiation
operators. Neither is suitable for \C, since both already have very
different meanings. We must therefore look for alternative names.
The names {\tt \@}, {\tt !}, {\tt ^^}, {\tt \~}, and {\tt *^} have been
suggested for an exponentiation operator. Most of these are
unsuitable. {\tt \@} cannot be used, because it is not present in
many countries' character sets. {\tt !} and {\tt ^^} would be
possible choices, but poor ones, because they could cause confusion.
In particular, {\tt !} would be a confusing choice because the
corresponding {\it op\/}{\tt =} operator would be {\tt !=}, which, of
course, already has a quite different meaning. (The confusion would
remain even if it were decided not to define an {\it op\/}{\tt =}
operator for exponentiation.) Similarly, {\tt ^^} would be a
confusing choice because programmers might, reasoning by analogy from
{\tt \&\&} and {\tt ||}, expect it to behave as a
logical exclusive or.
The remaining choices, then, are {\tt *^} and {\tt \~}. Of the two,
{\tt *^} is preferable because it is more mnemonic: it is similar,
even though not identical, to the exponentiation operators used in
other languages. It is also preferable because it is a character
sequence that does not occur in any existing legal \C\ code; making
this extension, then, cannot change the meaning of any currently
working program.
\subsection{Types of the operands}
As emphasized in Section~\ref{importance}, the most common
situation, by far, is an expression like {\tt x *^ 3}, where a
floating-point number is raised to an integral power. Even if the
operator \op\ were only defined for the case where the first operand
is floating-point and the second is integral, then, it would still be
useful; calling a library function for the remaining cases would not
be an undue burden.
The reason why I have proposed a more general operator than that is
simply because I believe that such a restriction would be confusing;
exponentiation is a well-defined operation for two floating-point
operands, or two integral operands, and leaving it undefined for these
cases would be without precedent in either \C\ or any other language.
Furthermore, I do not see any advantage in making this restriction; it
wouldn't make implementation of \op\ significantly easier.
\subsection{Return type}
The first issue to discuss is the general rule, that the return type
of an expression involving \op\ is the same type as the operands.
Specifically, one might question whether the expression {\tt n*^m},
where {\tt n} and {\tt m} are integers, really should return an
integer.
There are two reasons why this expression should return an integer.
First, this is the ordinary rule in \C\ (consider, for example, the
expression {\tt 1/n}), and programmers have the right to expect some
degree of consistency in the language. Second, this behavior is
consistent with the behavior of the {\sc fortran} exponentiation
operator; again, programmers have the right to expect that a \C\
exponentiation operator should behave similarly to exponentiation
operators in other languages. In fact, {\sc fortran} programmers do
sometimes make use of this property; expressions like {\tt (-1)**n}
are not uncommon.
The proposal in Section~\ref{proposal} specifies an exception to the
usual \C\ rule for evaluation of arithmetic operators: if the first
operand of \op\ is a floating-point type and the second is an integral
type, then the return type is that of the first operand, but the
second operand is not promoted to the same type as the first. This
behavior is consistent with that of the {\sc fortran} exponentiation
operator, and it is an essential part of this proposal. The intent is
that if {\tt x} is a variable of some floating-point type, a compiler
may generate different code for the expression {\tt x *^ 3} than for
the expression {\tt x *^ 3.0}.
As emphasized in Section~\ref{justification}, the primary use for an
exponentiation operator is raising a number to a small integral power
which is known at compile time. A good {\sc fortran} compiler can be
expected to optimize an expression like {\tt x ** 4} to two
floating-point multiplies, and the intent of this proposal is that a
good \C\ compiler should be able to perform that same optimization for
the expression {\tt x *^ 4}.
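
The intended distinction can be imitated in present-day \C\ with
overloaded functions. In this sketch the hypothetical function {\tt
power} stands in for the proposed operator; only {\tt double} and {\tt
int} operands are covered, and negative exponents in the all-integer
case are left out.
\begin{verbatim}
#include <cmath>
#include <type_traits>

// Floating-point raised to integral: the exponent stays an
// int, so multiplication alone suffices (simple loop for
// clarity, not speed).
double power(double x, int n)
{
    unsigned int m = (n < 0)
        ? 0u - static_cast<unsigned int>(n)
        : static_cast<unsigned int>(n);
    double result = 1.0;
    for (unsigned int i = 0; i < m; ++i) result *= x;
    return (n < 0) ? 1.0 / result : result;
}

// Both operands floating-point: the general routine.
double power(double x, double y) { return std::pow(x, y); }

// Integral base, floating-point exponent: the usual
// conversions make both operands double.
double power(int b, double y)
{ return std::pow(static_cast<double>(b), y); }

// Both operands integral: the result is an int.
// Precondition for this sketch: n >= 0; no overflow checks.
int power(int b, int n)
{
    int result = 1;
    for (int i = 0; i < n; ++i) result *= b;
    return result;
}

// The result types follow the rule described above.
static_assert(
    std::is_same<decltype(power(2.0, 3)), double>::value,
    "double raised to int is double");
static_assert(
    std::is_same<decltype(power(2, 3)), int>::value,
    "int raised to int is int");
\end{verbatim}
The calls {\tt power(x, 3)} and {\tt power(x, 3.0)} reach different
functions, just as {\tt x *^ 3} and {\tt x *^ 3.0} may be compiled
differently; and {\tt power(-1, n)} alternates between $1$ and $-1$
without leaving the integers.
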
\subsection{Domain errors}
The conditions which are identified in Section~\ref{proposal} as
domain errors are those for which, mathematically, the result of an
exponentiation is either a complex number or is undefined.
One might argue that an expression like {\tt (-1) *^ 0.5} should return
a complex result instead of being an error; this would, however, be a
mistake. First, \C\ has no complex data type; it would be a very poor
design decision if a feature of the language itself depended on the
inclusion of some class library. Second, \C\ is a strongly typed
language, and the type system cannot accommodate an operator which
could return either a {\tt double} or a {\tt Complex} depending on the
values of the operands. Finally, even in {\sc fortran}, which does
have a complex data type, the expression {\tt (-1) ** 0.5} is an error;
users of {\sc fortran} who want complex results must provide complex
operands.
Note that this proposal does not specify the run-time behavior of a
program which contains a domain error. This is consistent with what
the Standard currently says about the treatment of domain errors ({\it
e.g.}, {\tt x/0}) and overflows. In both cases, implementations
should be free to do whatever is reasonable for the specific hardware
and operating environment. Some reasonable choices might be returning
a {\tt NaN}, or raising an exception, or printing a diagnostic and
terminating program execution.
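
As one illustration of these choices, the sketch below raises an
exception for the two conditions listed in Section~\ref{proposal}.
The wrapper {\tt checkedPower} is hypothetical and handles only {\tt
double} operands; an implementation would be equally free to return a
{\tt NaN} or to terminate the program instead.
\begin{verbatim}
#include <cmath>
#include <stdexcept>

// Hypothetical wrapper: report the two domain errors of the
// proposal by throwing, and otherwise defer to std::pow.
double checkedPower(double x, double y)
{
    if (x == 0.0 && y < 0.0)
        throw std::domain_error(
            "zero raised to a negative power");
    if (x < 0.0 && y != std::floor(y))
        throw std::domain_error(
            "negative base, non-integral exponent");
    return std::pow(x, y);
}
\end{verbatim}
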
\subsection{0 \op\ 0}
\catcode`\^=7
Mathematically, the meaning of $0^0$ depends on how this expression is
interpreted; one might sensibly imagine it to mean
\[ \lim_{x \rightarrow 0} x^x,\]
\[ \lim_{x \rightarrow 0} x^0,\]
\[ \lim_{x \rightarrow 0^+} 0^x,\]
or several other possibilities. The values of these expressions are
different. A computer language, then, might plausibly compute the
value of this expression as $0$, or as $1$, or treat it as a domain
error. (In {\sc fortran}, this expression is an error.)
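
For concreteness, the three limits above can be evaluated directly
(taking the first from the right, so that $x^x$ is real):
\[ \lim_{x \rightarrow 0^+} x^x
   = \lim_{x \rightarrow 0^+} e^{x \ln x} = e^0 = 1, \qquad
   \lim_{x \rightarrow 0} x^0 = 1, \qquad
   \lim_{x \rightarrow 0^+} 0^x = 0. \]
The first two agree on $1$ and the third gives $0$, which is why no
single value can be called the mathematically correct one.
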
\catcode`\^=11
It is consistent with the spirit of \C\ to leave this choice up to the
compiler writer; compare, for example, the sign of {\tt \%} when one
operand is negative. As with {\tt \%}, one motive for specifying that
this behavior is implementation-dependent is to allow compiler writers
to make efficient use of whatever hardware features are present.
\subsection{Associativity} \label{associativity}
Unparenthesized expressions involving two exponentiation operators are
not very common, so this choice isn't a matter of terribly great
importance. For the sake of consistency, it is best to follow the
precedent of {\sc fortran}, where the exponentiation operator binds to
the right. That is, in {\sc fortran}, the expression
{\tt x ** y ** z} means the same thing as {\tt x ** (y ** z)}.
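
The difference between the two groupings is easy to see numerically.
In the sketch below, {\tt std::pow} stands in for the operator, and the
function names are hypothetical.
\begin{verbatim}
#include <cmath>

// Right-to-left grouping, as in FORTRAN: x ** (y ** z).
double groupRight(double x, double y, double z)
{ return std::pow(x, std::pow(y, z)); }

// The alternative, left-to-right grouping: (x ** y) ** z.
double groupLeft(double x, double y, double z)
{ return std::pow(std::pow(x, y), z); }
\end{verbatim}
With $x=2$, $y=3$, and $z=2$, the right grouping gives $512$ and the
left grouping gives $64$.
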
\subsection{Precedence}
\subsubsection{Possibilities for the precedence}
Mathematically, exponentiation binds more tightly than multiplication.
This leaves four possibilities, then, for the precedence of a \C\
exponentiation operator:
\begin{itemize}
\item A new precedence level between the multiplicative operators and
the unary operators.
\item The same precedence level as the unary operators.
\item A new precedence level above the unary operators but below the
postfix operators.
\item The same precedence level as the postfix operators.
\end{itemize}
I will consider these in turn, in order of decreasing precedence.
\subsubsection{\op\ as a postfix operator} \label{postfix}
Postfix expressions, as described in \S 5.2 of the ARM, group left to
right. The proposal of Section~\ref{proposal} specifies that the
associativity of \op\ is right to left, but changing this would not be
a terribly serious matter. As noted in Section~\ref{associativity},
it is rare to encounter an expression where the associativity of an
exponentiation operator makes much difference.
A more serious problem, however, is in an expression like
\begin{center}\begin{tt} x *^ p->a. \end{tt}\end{center}
This would be interpreted by the compiler as
\begin{center}\begin{tt} (x*^p)->a, \end{tt}\end{center}
which would be disastrous. Operators like {\tt ->}, {\tt .}, and {\tt
[]} have a high precedence for a reason, which is to ensure that
postfix expressions behave the same way in arithmetic expressions as
ordinary variables do. This property is valuable, and it should not
be broken by an exponentiation operator.
\subsubsection{A new level above unary operators}
The problem here is very similar to that described above: the
expression
\begin{center}\begin{tt} *p *^ x \end{tt}\end{center}
would be interpreted as
\begin{center}\begin{tt} *(p*^x), \end{tt}\end{center}
which is undesirable for exactly the same reason as that given in
Section~\ref{postfix}.
\subsubsection{The same precedence as unary operators}
Unary operators group right to left, so, again, the expression
\begin{center}\begin{tt} *p *^ x \end{tt}\end{center}
would be interpreted in an undesirable way. \op\ must be given a
precedence lower than the unary operators.
The same reasoning about the operator {\tt ->} also applies to the
operators {\tt ->*} and {\tt .*}, implying that the precedence of {\tt
*^} must be below that of the pointer-to-member operators.
\subsubsection{A new level above multiplication}
There are no disastrous problems associated with putting \op\ above
the multiplicative operators and below unary operators, but there is a
small annoyance: the expression {\tt -x*^2} would be interpreted as
{\tt (-x) *^ 2}, which is unlikely to be what the programmer intended.
This is merely an annoyance, however; it is unlikely to be a serious
source of errors. Expressions of this sort are rare, and compilers
could issue warnings when they occur without parentheses.
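
The two readings can be written out explicitly with the library
function. This is a sketch with hypothetical function names, in which
{\tt std::pow} stands in for the operator.
\begin{verbatim}
#include <cmath>

// How -x *^ 2 would be parsed under this proposal:
// the unary minus is applied first.
double asParsed(double x)   { return std::pow(-x, 2.0); }

// The reading suggested by mathematical notation:
// the minus is applied to the square.
double asIntended(double x) { return -std::pow(x, 2.0); }
\end{verbatim}
For $x=3$ the first yields $9$ and the second $-9$; writing the
parentheses explicitly removes the ambiguity.
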
\subsection{Guarantees about the value returned by \op}
Nothing in this proposal guarantees that (for example) {\tt x*^3 ==
x*x*x}, or, for that matter, that {\tt x*^3 == x*^3.0}. This is an
intentional omission. Exact equality of floating-point numbers is a
very strong statement, and it would be grossly inappropriate to
require it here. Doing so would make this proposal much more
complicated, and would also severely constrain the techniques that
could be used for implementation.
Note that this is consistent with the way that the Standard treats
existing operators; there is nothing in the Standard to guarantee that
{\tt x*3 == x+x+x}.
\subsection{A {\tt *^=} operator} \label{op-equals}
The operator {\tt *^=} is not nearly as important as is \op.
Expressions like {\tt x = x *^ 0.5} do occur on occasion, but they are
sufficiently rare that the syntactic convenience of an {\it op\/}{\tt
=} operator is unimportant.
There are, however, several good reasons for including the operator
{\tt *^=}. First, the intention of this proposal is to treat the
exponentiation operator, \op, in a similar manner to the other common
binary arithmetic operators. Accordingly, it would be surprising to
omit an {\it op\/}{\tt =} operator for exponentiation while including
them for all of the others. As always, it is important for the
language to work in as unsurprising a manner as possible. Second, and
perhaps more important: {\tt *^=} is not really necessary if both
operands are of built-in types, but, if \op\ is overloaded for some
user-defined type, it could prove useful to overload {\tt *^=} as
well. The absence of this operator, in such cases, would be an
annoying and gratuitous restriction.
As discussed in Section~\ref{overloading}, there are two distinct
types of overloading to consider: overloading \op\ to provide
exponentiation for some user-defined type, and overloading \op\ for
some purpose unrelated to exponentiation. In both cases, the operator
{\tt *^=} could easily prove useful. For the first type, consider
matrix exponentiation; if the matrices involved are large, then,
unless clever reference-counting techniques are used, the expression
\begin{center}\begin{tt}
M *^= 5
\end{tt}\end{center}
is likely to be much more efficient than the expression
\begin{center}\begin{tt}
M = M *^ 5.
\end{tt}\end{center}
For the second type: if \op\ is overloaded to mean something other
than exponentiation, then the semantics of that operation might make
an {\it op\/}{\tt =} operator useful. It would be useful, for
example, for substring extraction, or for Lorentz transformations.
Despite these reasons for including {\tt *^=}, there are also good
reasons for rejecting its inclusion. First, it makes this proposal
twice as complicated, by adding two new tokens instead of one.
Second, there is no prior art for it: to the best of my knowledge, no
language has an {\it op\/}{\tt =} operator for exponentiation. I do not
anticipate that adding {\tt *^=} will cause any problems, but no
matter how carefully we think about a feature, there is no substitute
for experience; we {\em know\/} that an exponentiation operator causes
no problems, but we can only surmise that about an exponentiation {\it
op\/}{\tt =} operator.
To summarize: I believe that, on balance, it would be better to
include the operator {\tt *^=} than not to include it, but I do not
regard {\tt *^=} as an essential part of this proposal. If the
extensions committee chose to remove the {\tt *^=} operator, I would
regard that decision as completely rational. It is more conservative
to include only {\tt *^}, instead of both {\tt *^} and {\tt *^=}, and
in this case conservatism may be appropriate.
\section{Implications of this proposal} \label{questions}
\subsection{Prior art}
To the best of my knowledge, no existing \oldC\ or \C\ compiler
includes an exponentiation operator. However, this proposal is by no
means without prior art! \oldC\ and \C\ are highly unusual in not
having this operator; I have never used any other language which has
operators at all, but which does not have an exponentiation operator.
Examples of languages with this operator include {\sc fortran},
Pascal, {\sc basic}, Eiffel, and Ada.
This point can scarcely be stressed sufficiently: although designing
algorithms for efficient and accurate floating-point exponentiation is
by no means trivial, we in the \C\ community are not starting {\it ab
initio}. These algorithms already exist, and these issues were
addressed decades ago when the very first {\sc fortran} compilers were
written.
Arithmetic expressions in {\sc fortran} are sufficiently similar to
those in \C\ that the experience with a {\sc fortran}
exponentiation operator should carry over to \C\ with little modification. It
isn't always obvious that experience with a feature in one language
carries over to a different language, but in this case, it is. Many
languages, not just one, have exponentiation operators, and with
regard to arithmetic expressions, {\sc fortran} and \C\ are no more
different than {\sc fortran} and Eiffel. If the {\sc fortran}
experience carries over to other languages with exponentiation
operators, there is no reason to think that \C\ would be an exception.
\subsection{Impact on existing code}
Adding the new token \op\ as an exponentiation operator will not
affect any existing code: this sequence of characters does not appear
in \C\ code in any context. The meaning of code that does not use
this operator will be unchanged. (Note, in particular, that I am not
proposing any change at all in the behavior of the standard library
function {\tt pow()}.)
It should not be necessary to recompile or relink code that does not
use the exponentiation operator.
\subsection{Efficiency and runtime support}
An exponentiation operator should be at least as efficient as the
techniques that are currently used in its absence ({\em e.g.},
inline functions, {\tt pow()}, and so on). On most hardware, it will
probably require runtime support; this will be similar in magnitude to
that presently required by the standard math library function {\tt
pow()}. It is possible, in fact, that {\tt operator*^(double,double)}
might use the same code as {\tt pow(double,double)}.
This runtime support is not a serious burden; programs which use
exponentiation will almost certainly use floating-point math functions
defined in the standard math library, and thus will not require any
more runtime support than they otherwise would.
Finally, there is no reason why any program that does not use
exponentiation should be affected at all by this extension, either in
speed or in size of executable. The technology certainly exists for a
compiler to recognize that exponentiation is not used, and to avoid
linking in any runtime code related to it.
\subsection{Interaction with other features of the language}
\label{overloading}
The obvious feature to consider is operator overloading. There are
two cases to consider: first, overloading \op\ to mean exponentiation,
but for some user-defined types, and second, overloading \op\ to have
some completely different semantics unrelated to exponentiation.
In the first case, I anticipate that \op\ will be overloaded for many
user-defined types which have an algebra similar to that of real
numbers---complex numbers, for example, and matrices. For these
examples, it would be sensible to define (among other overloadings)
\begin{center}\begin{tt}
operator*^ (Complex, Complex)
\end{tt}\end{center}
and
\begin{center}\begin{tt}
operator*^ (Matrix, int).
\end{tt}\end{center}
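The latter overloading, in particular, could be useful: an
implementation of {\tt operator*^ (Matrix, int)} can use repeated
squaring, so that the number of matrix multiplications grows only
logarithmically with the exponent, and this might well be preferable to
executing a string of {\tt operator* (Matrix, Matrix)}.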
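As a rough illustration of why a dedicated matrix power can beat a
string of multiplications, here is a minimal sketch in present-day \C.
Everything in it is an assumption made for illustration: the
$2\times2$ {\tt Matrix} type, the counter {\tt multiplications}, and
the functions {\tt power} and {\tt naive\_power}, the first of which
stands in for {\tt operator*^ (Matrix, int)} because \op\ is not a
token that any existing compiler accepts.

```cpp
#include <cassert>
#include <cstddef>

// A hypothetical 2x2 matrix type, just large enough to show the idea.
struct Matrix {
    double a, b, c, d;   // laid out as [ a b ; c d ]
};

// Counts calls to operator*, so the two strategies can be compared.
static std::size_t multiplications = 0;

Matrix operator*(const Matrix& x, const Matrix& y) {
    ++multiplications;
    return Matrix{x.a * y.a + x.b * y.c, x.a * y.b + x.b * y.d,
                  x.c * y.a + x.d * y.c, x.c * y.b + x.d * y.d};
}

// Hypothetical stand-in for operator*^(Matrix, int): repeated squaring.
// Negative exponents would need a matrix inverse and are not handled.
Matrix power(Matrix base, int exponent) {
    assert(exponent >= 0);
    Matrix result{1.0, 0.0, 0.0, 1.0};   // identity matrix
    while (exponent != 0) {
        if (exponent & 1) result = result * base;
        exponent >>= 1;
        if (exponent != 0) base = base * base;   // skip the unused last square
    }
    return result;
}

// The alternative: a plain string of operator* calls.
Matrix naive_power(const Matrix& base, int exponent) {
    assert(exponent >= 0);
    Matrix result{1.0, 0.0, 0.0, 1.0};   // identity matrix
    for (int i = 0; i < exponent; ++i) result = result * base;
    return result;
}
```

With this sketch, raising a matrix to the tenth power takes five calls
to {\tt operator*} in {\tt power} and ten in {\tt naive\_power}; the
gap widens as the exponent grows.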
I have nothing to say about the second type of overloading. I do not
foresee any particular non-exponentiation use for an operator of this
precedence and associativity; on the other hand, I don't think that I
would have foreseen the use of {\tt operator<<} for stream insertion.
\subsection{Harmony with the ``spirit'' of \C}
This, to be sure, is a subjective question. On the one hand,
exponentiation is a common binary mathematical operation, just like
subtraction or multiplication, and so it is entirely consistent to
represent it by an operator, as those operations
are. On the other hand, it is certainly true that exponentiation
usually requires more computation than most other floating-point
operations, and some people believe that it is in the spirit of \C\
that the language should be close to the hardware---that non-atomic
operations should be performed by calls to library functions.
I wish to suggest, however, that this distinction between atomic and
non-atomic operations is less clear-cut than it appears at first
sight, is less important in \C\ than in \oldC, and is, in any case,
machine-dependent. I am currently writing this proposal on a machine
which does not do floating-point addition in hardware; conversely, I
have worked on machines where floating-point exponentiation {\em is\/}
done in hardware. On some machines, both addition and exponentiation
require runtime support; on other machines, neither does.
In my opinion, then, an exponentiation operator is by no means out of
place; it may not have been one of the PDP-11 machine instructions,
but in use, and in purpose, it is similar in spirit to the other
mathematical operators.
I believe, furthermore, that this change will make the \C\ language
easier to learn, not harder; its presence will be less surprising to
people familiar with other languages than its absence currently is.
\subsection{Why should \C\ support numerical programming?}
This question might be rephrased: ``If you like {\sc fortran}, you
know where to find it.''
The problem is that while {\sc fortran} is suitable for some tasks in
scientific computing, it is very poorly suited to others. Abstract
data types, polymorphism, and inheritance are just as useful for
scientific programming as for other types of programming; many
scientific programs have grown so large that they are becoming
unmanageable, and an increasing number of scientists are
beginning to realize that object-oriented programming could help them
solve their problems. \C\ is a useful language for numerical
programming; the lack of an exponentiation operator is simply a small
blemish.
Unfortunately, some scientists are asking themselves essentially the
same question as the one above, but turned around: ``Why should I
switch to a new language if I have to give up a useful feature?'' It
would be a shame if a largely syntactic matter hindered the acceptance
of object-oriented programming by the scientific community.
\section{Objections to this proposal} \label{objections}
The general idea of an exponentiation operator has been discussed at
some length on the Usenet newsgroup {\tt comp.lang.c++}, and several
objections to it have been raised. I will now summarize these
objections, and my responses to them. Many of them have already been
discussed earlier in this document, so most of the responses will be
brief.
\begin{enumerate}
\item It requires extensive runtime support.
{\bf My response:}
No runtime support is required for programs that do not use
\op. Programs that do use it will require no more extensive runtime
support than programs that do floating-point computation already require.
\item A proper implementation of {\tt pow()} would render this
operator unnecessary.
{\bf My response:}
No matter how {\tt pow()} is defined, the syntactic problem remains.
\item The restrictions on the domain of the operands are excessively
complicated, and are without precedent in the \C\ language.
{\bf My response:}
The restrictions on the domain of \op\ are those dictated by
ordinary mathematics. This is no different in principle from the
\C\ division operator, which may not be given operands for which
division is mathematically undefined.
\item It is too difficult to implement algorithms which perform
exponentiation in an adequate manner.
{\bf My response:}
The techniques for exponentiation are known, and have been
implemented in many {\sc fortran} compilers.
\item It is premature to add an exponentiation operator before resolving
design decisions, specifically,
\begin{enumerate}
\item The return type for different operand types;
\item The value of the expression {\tt 0 *^ 0};
\item The associativity of the operator;
\item The behavior upon domain error; and
\item Whether (for example) {\tt x *^ 3 == x*x*x}.
\end{enumerate}
{\bf My response:}
Each of these issues falls into one of two categories:
\begin{enumerate}
\item Issues which are resolved by examining how the exponentiation
operator works in {\sc fortran}; and
\item Issues which should be left unspecified in the Standard,
because the analogous issues with other arithmetic operators
are left unspecified.
\end{enumerate}
\item This is really a \oldC\ issue, not a \C\ issue, so it should
be referred to the \oldC\ standardization process instead.
{\bf My response:}
It is true that an exponentiation operator could be added to \oldC,
but this is no reason why it should not be added to \C\ first. Note that
this same objection could have been made to function prototypes (which
existed in \C\ before they did in \oldC), and to the use of the
delimiter {\tt //} to begin comments.
\item This is only one of many issues in numerical programming; we
should wait until the Numerical C Extensions Group (NCEG) is
finished, and then consider their proposal as a whole.
{\bf My response:}
The NCEG is working on a very elaborate proposal, which would
change the \oldC\ language in very fundamental ways. These plans will
only be included in mainstream \C\ compilers years from now, if ever.
Exponentiation may well not be the most important issue for numerical
programmers, but it is by far the easiest to implement, and has no
major repercussions; I am not proposing a new C-like language, but
merely a small change which fixes an omission in \C.
\item The code which is generated is not a simple machine instruction,
so it is misleading to use a syntax which suggests that it is.
{\bf My response:}
The distinction between atomic and non-atomic operations makes
less sense in \C, which has operator overloading, than it does in
\oldC. In any case, the distinction is machine-dependent.
\item Adding this operator makes the language more complicated by
adding a new token and a new precedence level.
{\bf My response:}
This objection is valid. A new token could be avoided by using
the operator {\tt \~} instead of \op, but a new precedence level is
unavoidable. In my opinion, the convenience of this operator
outweighs the added complication in the table of precedence levels.
\end{enumerate}
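Several of the objections above concern the edges of the operator's
domain and the relation between {\tt x *^ 3} and {\tt x*x*x}. As a
rough illustration, the following sketch in present-day \C\ expresses
both questions in terms of the existing library function {\tt pow()}.
The helper names {\tt is\_domain\_error} and {\tt cube\_agrees} are
hypothetical, and the rounding tolerance of a few units in the last
place is an assumption about the quality of the library, not a
guarantee.

```cpp
#include <cfloat>
#include <cmath>

// Hypothetical helper: true if base *^ exponent would be a domain error
// under the ordinary rules of real arithmetic, that is, a negative base
// raised to a non-integral power.
inline bool is_domain_error(double base, double exponent) {
    return base < 0.0 && std::floor(exponent) != exponent;
}

// Hypothetical helper: compares pow(x, 3) with the hand-written product
// x*x*x, allowing for a small rounding difference (assumed tolerance).
inline bool cube_agrees(double x) {
    double by_pow = std::pow(x, 3.0);
    double by_product = x * x * x;
    double tolerance = 4.0 * DBL_EPSILON * std::fabs(by_product);
    return std::fabs(by_pow - by_product) <= tolerance;
}
```

On an implementation that follows IEEE~754 arithmetic and Annex~F of
the C standard (assumed here), {\tt std::pow(-8.0, 1.0/3.0)} yields a
NaN and {\tt std::pow(0.0, 0.0)} yields 1. The two ways of cubing a
number need not agree to the last bit, which is why the comparison
above uses a tolerance rather than {\tt ==}.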
\section{Conclusion} \label{conclusion}
An exponentiation operator is a desirable feature; I don't think that
anybody would contest that. I have never, for example, heard anybody
suggest that {\sc fortran} would be a better language if the operator
{\tt **} were removed from it. Not every desirable feature, however,
can be included in \C.
What I have shown in this paper is that an exponentiation operator is
not merely a desirable feature, but, more importantly, also one that
can be added to \C\ with very little effort, with no loss in
efficiency, and with no effect on existing code. I believe that this
justifies its inclusion.
\end{document}
--
Matthew Austern Just keep yelling until you attract a
(510) 644-2618 crowd, then a constituency, a movement, a
austern@lbl.bitnet faction, an army! If you don't have any
matt@physics.berkeley.edu solutions, become a part of the problem!