Thread

Topic: Operator overloading in C

Author: fjh@cs.mu.OZ.AU (Fergus Henderson)
Date: 2000/05/16

James Kuyper <kuyper@wizard.net> writes:

>Fergus Henderson wrote:
>....
>> The idea that references are needed to properly implement operator
>> overloading is a myth.  It seems to be a somewhat persistent one,
>> probably because references were introduced in C++ "primarily to
>> support operator overloading", according to Stroustrup [1].
>> But as Jacob Navia explained, it is not necessary to introduce
>> references to support operator overloading.
>>
>> Please help stamp out this myth.
>
>You're citing Stroustrup, but only in support of the opposite point of
>view. That doesn't help your case, you know. Would you care to flesh out
>your argument against this "myth" more fully? I'm not convinced by
>Navia's alternative. I don't follow all of the implications of his
>suggestion, but it seems to me that the restriction on operator-() could
>rule out smart pointers, and most arithmetic-like classes as well.

Well, Jacob Navia's alternative might have some problems.  But let me
sketch out my own description of an alternative way of handling
operator overloading that does not require introducing reference
types.  The basic idea is the same: use pointers rather than references.

As with current C++, for each operator `op', the phrase `operator op'
would be allowed as a function name.  For each binary operator `op',
every expression of the form `x op y' would be treated as equivalent
to `operator op(&xv, &yv)', where if `x' is an lvalue then `xv' is
just `x', and if `x' is an rvalue then `xv' is a temporary defined by
`const T xv = x', with `T' being the type of `x'; and likewise for `yv'.
(The lifetime of such temporaries would be the full expression in
which they occur.)  Similarly, for each unary operator `op', every
expression of the form `x op' (for postfix operators) or `op x' (for
prefix operators) would be treated as equivalent to `operator
op(&xv)', where `xv' is defined as above.
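To make the rewriting rule concrete, here is a rough sketch, in present-day C++, of what the transformation would produce if done by hand. Real C++ will not accept an `operator +' whose parameters are all pointers, so the ordinary function name `op_add' stands in for it; `vec2', `make_vec' and `desugared_sum' are likewise names invented for this illustration only.

```cpp
#include <cassert>

struct vec2 { int x, y; };

// Stand-in for the proposed `operator +(const vec2 *, const vec2 *)'.
vec2 op_add(const vec2 *a, const vec2 *b) {
    assert(a && b);
    vec2 r = { a->x + b->x, a->y + b->y };
    return r;
}

// Returns an rvalue, so the rule needs a temporary to take its address.
vec2 make_vec(int x, int y) {
    vec2 r = { x, y };
    return r;
}

// Hand-written equivalent of `a + make_vec(3, 4)' under the proposed rule.
vec2 desugared_sum(const vec2 *ap) {
    // `a' (here *ap) is an lvalue, so its address is passed directly.
    // `make_vec(3, 4)' is an rvalue, so it is first copied into a
    // const temporary, as in `const T yv = y'.
    const vec2 yv = make_vec(3, 4);
    return op_add(ap, &yv);
}
```

The temporary `yv' lives until the end of `desugared_sum', which is at least as long as the full expression the rule asks for.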

The above paragraph completely defines the way ordinary overloaded
operators would be handled.  Some special cases would however need
somewhat different treatment.  In particular, as with current C++,
there would need to be special rules for `new', `delete', and `->'.

Rather than extending the type system, this approach uses a simple
source transformation.  The transformation is mostly syntactic; it
does depend on lvalueness, which is a semantic property, but so does
the meaning of other C++ expressions (for example, whether `new T'
calls `operator new' or `operator new []' depends on whether T is an
array type or not).  The resulting language is IMHO significantly
simpler overall than one with references in the type system.

Here are a couple of examples of code written using this scheme.

 //
 // complex numbers
 //

 struct complex { double real, imag; };

 complex operator +(const complex *x, const complex *y) {
  complex r;
  r.real = x->real + y->real;
  r.imag = x->imag + y->imag;
  return r;
 }
 // similarly for *, -, /, etc.

 void example() {
  complex a = { 1, 0 };
  complex b = { 0, 1 };
  complex c = a + b;
  ...
 }

 //
 // bit vectors
 //

 // for simplicity, in this example
 // the bit vector size is fixed
 #include <limits.h>  // for CHAR_BIT
 #define NUM_BITS 32

 struct bitarray {
  unsigned char bits[NUM_BITS / CHAR_BIT];
 };

 struct bitref {
  bitarray *array;
  int index;
 };
 struct const_bitref {
  const bitarray *array;
  int index;
 };

 bitref operator [] (bitarray *array, const int *index) {
  bitref r = { array, *index };
  return r;
 }
 const_bitref operator [] (const bitarray *array, const int *index) {
  const_bitref r = { array, *index };
  return r;
 }

 // implicit conversion operator
 operator bool (const bitref *r) {
  return r->array->bits[r->index / CHAR_BIT] &
   (1 << (r->index % CHAR_BIT));
 }
 operator bool (const const_bitref *r) {
  return r->array->bits[r->index / CHAR_BIT] &
   (1 << (r->index % CHAR_BIT));
 }

 // assignment through the proxy: clear the bit, then set it if *b is true
 bool operator = (bitref *r, const bool *b) {
  r->array->bits[r->index / CHAR_BIT] &=
   ~(1 << (r->index % CHAR_BIT));
  r->array->bits[r->index / CHAR_BIT] |=
   (*b << (r->index % CHAR_BIT));
  return *b;
 }

 void example2() {
  bitarray x;
  x[0] = true;
  x[1] = false;
  x[2] = x[0] && x[1];
  x[3] = x[0] || x[1];
  ...
 }

Note that this scheme does not allow operators to return references.
Instead, a proxy class, like `bitref' in the above example, must be used.
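For anyone who wants to try the proxy idea with an existing compiler, here is a sketch in present-day C++. The named functions `bit_at', `bit_get' and `bit_set' are my stand-ins for the proposed `operator []', conversion to bool, and `operator =' (none of which real C++ accepts in this pointer-taking form), and `store_and_load' is an invented helper showing the hand-desugared calls.

```cpp
#include <cassert>
#include <climits>

#define NUM_BITS 32

struct bitarray { unsigned char bits[NUM_BITS / CHAR_BIT]; };
struct bitref { bitarray *array; int index; };

// Stand-in for `operator [] (bitarray *, const int *)'.
bitref bit_at(bitarray *array, const int *index) {
    assert(*index >= 0 && *index < NUM_BITS);
    bitref r = { array, *index };
    return r;
}

// Stand-in for `operator bool (const bitref *)'.
bool bit_get(const bitref *r) {
    return (r->array->bits[r->index / CHAR_BIT] >> (r->index % CHAR_BIT)) & 1;
}

// Stand-in for `operator = (bitref *, const bool *)'.
// Writing false has to clear the bit, not merely leave it alone.
bool bit_set(bitref *r, const bool *b) {
    unsigned char mask = (unsigned char)(1u << (r->index % CHAR_BIT));
    unsigned char *byte = &r->array->bits[r->index / CHAR_BIT];
    *byte = (unsigned char)(*b ? (*byte | mask) : (*byte & ~mask));
    return *b;
}

// Hand-desugared form of `x[i] = v; return x[i];'.
bool store_and_load(bitarray *x, int i, bool v) {
    bitref r = bit_at(x, &i);   // the proxy is an ordinary value
    bit_set(&r, &v);
    return bit_get(&r);
}
```

The proxy carries a pointer and an index, so copying it around is cheap and no reference type is involved anywhere.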

Interestingly, although this scheme does not make references a part
of the language, with this scheme (and some other C++ features)
you can implement classes with reference-like behaviour:

 //
 // reference classes
 //

 template <class T>
 struct ref { T *ptr; };

 // taking address of ref<T>: returns an ordinary pointer
 template<class T>
 T* operator & (const ref<T> *r) {
  return r->ptr;
 }

 // implicit conversion operator: converts ref<T> to T
 template <class T>
 operator T (const ref<T> *r) {
  return *r->ptr;
 }

 // assignment to ref<T>: assigns to the reference location
 template <class T>
 T operator = (ref<T> *r, const T *val) {
  return *r->ptr = *val;
 }

Here's the same thing for const references:

 template <class T>
 struct const_ref { const T *ptr; };

 template<class T>
 const T* operator & (const const_ref<T> *r) {
  return r->ptr;
 }

 template <class T>
 operator T (const const_ref<T> *r) {
  return *r->ptr;
 }

I'm not sure whether that is actually necessary; you may be able to
get away with just using ref<const T> rather than const_ref<T>.
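As a partial check on that guess, here is a sketch in present-day C++ with named functions (`ref_get', `ref_addr', `ref_set', all invented here) standing in for the proposed conversion, `operator &' and `operator ='. It only shows that the read-side operations instantiate cleanly with T = const int; it says nothing about how overload resolution would behave in the proposed language.

```cpp
#include <cassert>

template <class T>
struct ref { T *ptr; };

// Stand-in for the implicit conversion `operator T (const ref<T> *)'.
template <class T>
T ref_get(const ref<T> *r) {
    assert(r->ptr);
    return *r->ptr;
}

// Stand-in for `operator & (const ref<T> *)'.
template <class T>
T *ref_addr(const ref<T> *r) { return r->ptr; }

// Stand-in for `operator = (ref<T> *, const T *)'.
template <class T>
T ref_set(ref<T> *r, const T *val) { return *r->ptr = *val; }

// Reads through a ref<const int>; no separate const_ref type is needed.
int read_through_const_ref(const int *p) {
    ref<const int> r = { p };
    const int *q = ref_addr(&r);   // T* is `const int *' here
    return ref_get(&r) + *q;
}
```

Calling `ref_set' on a `ref<const int>' does not compile in real C++, which is the behaviour one would want from a const reference. One thing this sketch does not give you is an implicit conversion from `ref<T>' to `ref<const T>'.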

Here's an array class that uses the above reference classes:

 //
 // an array class
 //

 template <class T, int Size>
 struct array { T elems[Size]; };

 // array subscript: returns a (const_)ref<T> proxy
 template <class T, int Size>
 ref<T> operator [] (array<T, Size> *a, const int *i) {
  ref<T> r = { &a->elems[*i] };
  return r;
 }
 template <class T, int Size>
 const_ref<T> operator [] (const array<T, Size> *a, const int *i) {
  const_ref<T> r = { &a->elems[*i] };
  return r;
 }

 void example3() {
  array<int, 20> x;
  x[0] = 42;
  x[1] = x[0] * 3;
  x[2] = x[0] + x[1];
  x[3] = x[0] || x[1];
  ...
 }

You can also have smart pointer classes.  Here's some code for a
"smart" pointer that doesn't add any intelligence:

 template <class T>
 struct ptr { T *ptr; };

 template <class T>
 ref<T> operator * (const ptr<T> *x) {
  ref<T> r = { x->ptr };
  return r;
 }

 template <class T>
 ptrdiff_t operator - (const ptr<T> *x, const ptr<T> *y) {
  return x->ptr - y->ptr;
 }

Note that this `operator -()' does not conflict with the builtin
`operator -()' function, since the prototype for the operator function
corresponding to builtin pointer subtraction is

 template <class T>
 ptrdiff_t operator - (T* const *x, T* const *y);

not

 template <class T>
 ptrdiff_t operator - (T*, T*);
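The lack of conflict can be checked with an existing compiler by giving both functions the same ordinary name. In the sketch below, `op_sub' stands in for `operator -', the `which' tag exists only so that a caller can see which overload was picked, and `raw_difference' and `smart_difference' are invented helpers containing the hand-desugared calls. The result type is `std::ptrdiff_t' because that is what builtin pointer subtraction yields in C++.

```cpp
#include <cassert>
#include <cstddef>

template <class T>
struct ptr { T *p; };

// Result plus a tag recording which overload was selected.
struct sub_result { std::ptrdiff_t diff; int which; };

// Stand-in for the user-defined `operator -' on ptr<T>.
template <class T>
sub_result op_sub(const ptr<T> *x, const ptr<T> *y) {
    assert(x && y);
    sub_result r = { x->p - y->p, 1 };
    return r;
}

// Stand-in for builtin pointer subtraction, written with the extra
// level of indirection that the rewriting rule introduces.
template <class T>
sub_result op_sub(T *const *x, T *const *y) {
    assert(x && y);
    sub_result r = { *x - *y, 2 };
    return r;
}

// Hand-desugared `a - b' for raw pointers: the arguments are int **.
sub_result raw_difference(int *a, int *b) { return op_sub(&a, &b); }

// Hand-desugared `a - b' for smart pointers: the arguments are ptr<int> *.
sub_result smart_difference(ptr<int> a, ptr<int> b) { return op_sub(&a, &b); }
```

For each call, template argument deduction succeeds for only one of the two overloads, so there is nothing to disambiguate.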

It might be nice to provide special treatment for `[]' and unary `*'.
In particular:

 - expressions of the form `x[y] = z' could be treated as
   calls to `operator [] =(&xv, &yv, &zv)';

 - expressions of the form `&x[y]' could be treated as
   calls to `operator & [](&xv, &yv)';

 - expressions of the form `*x = y' could be treated as calls
   to `operator * =(&xv, &yv)' (note that `operator * =' here
   is NOT the same as `operator *='!);

 - expressions of the form `&*x' could be treated as calls
   to `operator & *(&xv)'.

This would mean that you could write array or smart pointer classes
without needing to use a proxy class for smart references.
(This post is long enough already, so I leave the details as
an exercise for the reader.)
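For what it is worth, here is one possible sketch of the `[]' half of that exercise, in present-day C++, with named functions standing in for the compound operator names: `op_index' for `operator []', `op_index_assign' for `operator [] =', and `op_addr_index' for `operator & []'. All of these names, and the helper `desugared_update', are invented for the illustration; this is just one way the details might be filled in.

```cpp
#include <cassert>

template <class T, int Size>
struct array { T elems[Size]; };

// Stand-in for `operator [] (&xv, &yv)': reads an element by value.
template <class T, int Size>
T op_index(const array<T, Size> *a, const int *i) {
    assert(*i >= 0 && *i < Size);
    return a->elems[*i];
}

// Stand-in for `operator [] =(&xv, &yv, &zv)': stores into an element.
template <class T, int Size>
T op_index_assign(array<T, Size> *a, const int *i, const T *v) {
    assert(*i >= 0 && *i < Size);
    return a->elems[*i] = *v;
}

// Stand-in for `operator & [](&xv, &yv)': address of an element.
template <class T, int Size>
T *op_addr_index(array<T, Size> *a, const int *i) {
    assert(*i >= 0 && *i < Size);
    return &a->elems[*i];
}

// Hand-desugared form of `x[1] = x[0] * 3; return &x[1];'.
int *desugared_update(array<int, 4> *x) {
    const int i0 = 0, i1 = 1;            // temporaries for the rvalue indices
    const int t = op_index(x, &i0) * 3;  // rvalue copied into a const temporary
    op_index_assign(x, &i1, &t);
    return op_addr_index(x, &i1);
}
```

No proxy type appears: reads return a plain T, and the two cases that would otherwise need a reference (storing and taking the address) each get their own function.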

Finally, let me note that I'm not saying that introducing references
in C++ was necessarily a bad thing overall; I'm just saying that I
don't find the originally stated rationale convincing.

--
Fergus Henderson <fjh@cs.mu.oz.au>  |  "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh>  |  of excellence is a lethal habit"
PGP: finger fjh@128.250.37.3        |     -- the last words of T. S. Garp.

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: James Kuyper <kuyper@wizard.net>
Date: 2000/05/09

I'm cross-posting this to comp.std.c++; I want to bring in some expert
opinion on this matter. For the benefit of the new audience on
comp.std.c++, here's some material that was referred to but not directly
cited in the message I'm responding to:

In comp.std.c, Jacob Navia proposed creating an implementation of C
extended to include operator overloads. In addition to the operator
overloads familiar to C++ users, with essentially the same notation, he
also proposed operator[]=(). He proposed it as if the meaning of this
notation was obvious; those who responded on comp.std.c indicated
confusion. I made a guess about what he meant, which he confirmed:

Jacob Navia wrote:
> James Kuyper wrote:
....
> > I suspect that he intends something that in C++ would (if it supported
> > his concept) work as follows:
> >
> > struct TenInt {
> > private:
> >         int array[10];  // fixed size for exposition only
> > public:
> >         int operator[](size_t i) const { return array[i];}
> >         int operator[]=(size_t i, int j) { return array[i]=j; }
> >         // Other appropriate members.
> > };
>
> Exactly. I thought it was so obvious that didn't need to be mentioned.
> Sorry.
>
> >
> >
> > Since he intends to implement an extension to C, rather than a
> > simplified version of C++, he's stuck with this approach; C doesn't
> > support references, which are needed to properly implement a non-const
> > operator[].
>
> I implemented references, to allow this extension to be source code
> compatible with C++, but they are not mandatory. You can (contrary to
> C++) use pointers as arguments to those functions, except in operators
> where this would lead to ambiguities:
> Given:
>     Type *a,*b;
>
>     a-b
> could mean either:
>     (((unsigned int)a) - ((unsigned int)b))/(sizeof(*a))
> i.e. the distance between the two pointers in units of sizeof(Type)
> or could mean
>    call operator-(a,b)
>
> That is why my implementation doesn't allow the operator '-' to have
> exclusively pointer operands.
....

Fergus Henderson wrote:
....
> The idea that references are needed to properly implement operator
> overloading is a myth.  It seems to be a somewhat persistent one,
> probably because references were introduced in C++ "primarily to
> support operator overloading", according to Stroustrup [1].
> But as Jacob Navia explained, it is not necessary to introduce
> references to support operator overloading.
>
> Please help stamp out this myth.
>
> [1] Bjarne Stroustrup, "The Design and Evolution of C++", 1994,
>         Addison-Wesley.

You're citing Stroustrup, but only in support of the opposite point of
view. That doesn't help your case, you know. Would you care to flesh out
your argument against this "myth" more fully? I'm not convinced by
Navia's alternative. I don't follow all of the implications of his
suggestion, but it seems to me that the restriction on operator-() could
rule out smart pointers, and most arithmetic-like classes as well.
