Topic: It's a lot worse than it sounds (was The problem with temp...)
Author: mitch@Apple.COM (Mitch Adler)
Date: 22 Aug 91 23:13:40 GMT
In article euaeny@eua.ericsson.se (Erik Nyquist) writes:
>As an observing member of the x3j16 committee I have been given the chance
>to look at Jerry Schwarz's proposal for a standard string class.
>
>Among other things I noticed that there is no
> string operator+(const string&, const string&);
>defined for the string class.
>
>This operator has apparently been removed from the original proposal
>at the Lund x3j16 meeting to avoid the problem with the lifetime of
>temporaries.
[Discussion of why or why not to change this proposal deleted.]
First of all, this is a bigger problem than just strings. Any object
that overrides an operator that does not have assignment semantics
(that is, it takes const parameters and returns a result) must
allocate space for the result, and this space must be freed by the
caller, explicit or implicit, to avoid a memory leak (yes, garbage
collection would solve the whole problem).
Implementation-dependent temporaries don't solve the problem, since
the operator definition (const T &operator+(const T &x, const T &y))
doesn't provide a way for the caller to define storage for the return
value, forcing the function to allocate the space.
OK, so the function has to allocate space, and it has to return an
object that will persist (since it returns a reference, the object
had better persist!).
There are two ways to allocate persistent space. One is static
space, but the pitfalls of reusing that space (a later call can overwrite
a result that is still in use) make it an unacceptable solution in the
general case.
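To make the static-space pitfall concrete, here is a minimal sketch. The names Num, slot and static_result_is_clobbered are made up for illustration and do not come from any proposal; the point is only that a single static result slot gets overwritten by the next call.

```cpp
#include <cassert>

// Hypothetical value class, for illustration only.
struct Num {
    int v;
};

// Returns a reference to one static slot: nothing leaks, but every call
// overwrites the result of the previous call.
const Num& operator+(const Num& x, const Num& y) {
    static Num slot;
    slot.v = x.v + y.v;
    return slot;
}

// Returns true when the result of a + b has been overwritten by c + d.
bool static_result_is_clobbered() {
    Num a = {1}, b = {2}, c = {10}, d = {20};
    const Num& first = a + b;   // slot holds 3
    const Num& second = c + d;  // slot now holds 30, and so does 'first'
    return &first == &second && first.v == 30;
}
```

Calling static_result_is_clobbered() should return true: both references name the same storage, so the value of a + b is gone by the time anyone looks at it.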
This leaves the other way to allocate persistent space: on the heap.
If you allocate a heap object, then it is left up to the caller
(unwitting expression user) of your operator to know you allocated an
object and delete it himself.
For overloaded operators this is BAD. It means he cannot use that
operator in a larger expression. For example, given
    aClass a, b, c, result;
    result = (a + b) + c;
where aClass overrides operator+ using heap space for the result, the memory
allocated for the temporary result of (a + b) is lost!
This happens with ALL classes that override ANY operator that doesn't have
assignment semantics!!!
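A small sketch of the leak, assuming a hypothetical class Leaky that counts its live instances so the orphaned object can be observed (neither the class nor the counter is from the original discussion):

```cpp
#include <cassert>

// Hypothetical class that counts live instances so the leak is observable.
struct Leaky {
    static int live;
    int v;
    explicit Leaky(int x = 0) : v(x) { ++live; }
    Leaky(const Leaky& o) : v(o.v) { ++live; }
    ~Leaky() { --live; }
};
int Leaky::live = 0;

// Returns a reference to a heap object that the caller is expected to delete.
Leaky& operator+(const Leaky& x, const Leaky& y) {
    return *new Leaky(x.v + y.v);
}

// Evaluates (a + b) + c and reports how many heap results were orphaned.
int orphaned_after_chain() {
    int before = Leaky::live;
    {
        Leaky a(1), b(2), c(3);
        Leaky& sum = (a + b) + c;  // the object for (a + b) has no name
        assert(sum.v == 6);
        delete &sum;               // only the outer result can be freed
    }
    return Leaky::live - before;   // the intermediate (a + b) is still alive
}
```

Under these assumptions orphaned_after_chain() returns 1: the caller can free the final result, but never gets hold of the intermediate one.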
If, in overloading an operator/function, you have CHANGED its behavior
(before a class overrides these operators, they guaranteed no
change in resource allocation; now the operator orphans an object), then
you aren't defining the same function. You shouldn't have overridden the
operator in the first place; you should declare a new
function/operator.
So if you agree with the train of thought (almost de-railed, but still
a train) above, you should never overload operators without assignment
semantics (ones that need intermediate results). Which makes the
ability useless.
I think operator overloading should be defined such that the caller
(the compiler in the case of expressions) provides the space for the
result of the operation, instead of the callee (function) producing
the space.
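Written out by hand as ordinary functions, the caller-provides-the-space idea might look like the sketch below. Vec2, add and sum3 are hypothetical names, and this is only one way to read the suggestion, not a proposed language change.

```cpp
#include <cassert>

// Hypothetical value class, for illustration only.
struct Vec2 {
    int x, y;
};

// The caller passes in the storage for the result; nothing is allocated here.
void add(const Vec2& a, const Vec2& b, Vec2& out) {
    out.x = a.x + b.x;
    out.y = a.y + b.y;
}

// What the caller (or a compiler) would do for (a + b) + c: own the
// intermediate itself, so it is destroyed when the expression is done.
Vec2 sum3(const Vec2& a, const Vec2& b, const Vec2& c) {
    Vec2 tmp;     // caller-owned space for the intermediate a + b
    Vec2 result;  // caller-owned space for the final value
    add(a, b, tmp);
    add(tmp, c, result);
    return result;
}
```

With a = {1, 2}, b = {3, 4} and c = {5, 6}, sum3(a, b, c) gives {9, 12}, and no storage outlives the call.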
But my ideas aside, the problem is NOT isolated to strings, it is a
general one that should be solved for ALL objects that want to
implement operators for some syntactic sugar.
Mitch
It's a really nasty problem, and the group I'm in just avoids
it and doesn't override non-assignment operators.
--
Mitch Adler Strayed Member of TEAM CTHULHU (Wisher, to some)
Object Based Systems mitch@apple.com
Apple Computer, Inc. AppleLink: M.Adler
Claimer: These are MY opinions, not Darin's, not Apple's! MINE!
Author: mitch@Apple.COM (Mitch Adler)
Date: 29 Aug 91 00:05:10 GMT
In article <15751@goofy.Apple.COM> mitch@Apple.COM (Mitch Adler) writes:
>In article euaeny@eua.ericsson.se (Erik Nyquist) writes:
[ My argument about non-assignment semantic operators deleted ]
Well, it seems that my argument had a premise I didn't recognize (and
it has been pointed out to me by people here).
You only have this problem with operators trying to return references
from operator functions (which Bjarne strongly recommends against).
The only penalty for doing things so that you won't lose resources is
a call to your copy constructor.
You can return the object, e.g.
    MyObject operator+(const MyObject &a, const MyObject &b)
    {
        return MyObject(a.foo + b.foo);
    }
But this uses the copy constructor, and makes an extra copy (forcing
us to define MyObject pointer types if we want efficiency).
Ok, I can deal with that.
Mitch
Author: ark@alice.att.com (Andrew Koenig)
Date: 29 Aug 91 14:34:31 GMT
In article <15848@goofy.Apple.COM> mitch@Apple.COM (Mitch Adler) writes:
> But this uses the copy constructor, and makes an extra copy (forcing
> us to define MyObject pointer types if we want efficiency).
> Ok, I can deal with that.
So can some compilers -- the call to the copy constructor can
often be optimized away. This is especially likely if the
function in question is inline.
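One way to see what a given compiler does is to count copy-constructor calls. The sketch below assumes a hypothetical MyObject with a public foo member and a static counter; the number it reports depends on the compiler, its options, and the language version (from C++17 on, the language itself requires that no copy be made for a return of this form).

```cpp
#include <cassert>

// Hypothetical class; 'copies' counts copy-constructor calls.
struct MyObject {
    static int copies;
    int foo;
    explicit MyObject(int f = 0) : foo(f) {}
    MyObject(const MyObject& o) : foo(o.foo) { ++copies; }
};
int MyObject::copies = 0;

// Non-member operator+ that returns its result by value.
MyObject operator+(const MyObject& a, const MyObject& b) {
    return MyObject(a.foo + b.foo);
}

// Performs one addition, stores the sum in 'result', and returns how many
// copy-constructor calls the compiler actually emitted for it.
int copies_for_one_addition(int& result) {
    MyObject a(1), b(2);
    MyObject::copies = 0;
    MyObject sum = a + b;
    result = sum.foo;
    return MyObject::copies;
}
```

A compiler that optimizes the copies away reports 0 here; one that makes every copy the source text implies reports 2. Either way the sum is the same and nothing leaks.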
--
--Andrew Koenig
ark@europa.att.com
Author: jbuck@forney.berkeley.edu (Joe Buck)
Date: 29 Aug 91 23:05:33 GMT
In article <15848@goofy.Apple.COM> mitch@Apple.COM (Mitch Adler) writes:
>> But this uses the copy constructor, and makes an extra copy (forcing
>> us to define MyObject pointer types if we want efficiency).
>> Ok, I can deal with that.
In article <20797@alice.att.com> ark@alice.UUCP () writes:
>So can some compilers -- the call to the copy constructor can
>often be optimized away. This is especially likely if the
>function in question is inline.
I took your course a year and a half ago, Andy, and that's what you
told us. Unfortunately, no one has taught cfront (as of version 2.1)
how to do it, even in simple cases like the following:
Here I have omitted many things that a real matrix class might have;
all that's here is the declaration of a copy constructor, a +=
operator, an assignment operator, and a + operator.
class Matrix {
private:
    double* data;
    int rows, cols;
public:
    Matrix();
    Matrix(const Matrix&);
    Matrix& operator+=(const Matrix&);
    Matrix& operator=(const Matrix&);
};
inline Matrix operator+(const Matrix& a,const Matrix& b) {
    Matrix tmp(a);
    tmp += b;
    return tmp;
}
void add(const Matrix& a,const Matrix& b,Matrix& c) {
    c = a + b;
}
If a temporary can ever be optimized away, this is such a case.
If the compiler were allowed to assume that assignment and copy construction
mean assignment and copy construction, it could produce the equivalent of
void add(const Matrix& a,const Matrix& b,Matrix& c) {
    c = a;
    c += b;
}
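The difference between the two forms can be measured with a counter on the copy constructor. This sketch uses a hypothetical fixed-size Mat2 as a stand-in for Matrix so that it is self-contained; add_via_operator and add_by_hand are illustrative names.

```cpp
#include <cassert>

// Hypothetical stand-in for Matrix: a fixed 2x2 payload, plus a counter
// on the copy constructor.
class Mat2 {
public:
    static int copies;
    double d[4];
    Mat2() { for (int i = 0; i < 4; ++i) d[i] = 0.0; }
    Mat2(const Mat2& o) {
        for (int i = 0; i < 4; ++i) d[i] = o.d[i];
        ++copies;
    }
    Mat2& operator=(const Mat2& o) {
        for (int i = 0; i < 4; ++i) d[i] = o.d[i];
        return *this;
    }
    Mat2& operator+=(const Mat2& o) {
        for (int i = 0; i < 4; ++i) d[i] += o.d[i];
        return *this;
    }
};
int Mat2::copies = 0;

inline Mat2 operator+(const Mat2& a, const Mat2& b) {
    Mat2 tmp(a);  // always at least one copy construction
    tmp += b;
    return tmp;   // may cost a second one if the compiler does not elide it
}

// The form a user writes.
void add_via_operator(const Mat2& a, const Mat2& b, Mat2& c) { c = a + b; }

// The form one would like the compiler to produce: no copy constructor at all.
void add_by_hand(const Mat2& a, const Mat2& b, Mat2& c) { c = a; c += b; }
```

With distinct a, b and c the two functions produce the same values; add_by_hand calls the copy constructor zero times, add_via_operator at least once. Note that the hand-written form gives a different answer if c happens to be the same object as b, which is one more thing a compiler would have to rule out before making this rewrite on its own.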
The code generated by Sun's version of AT&T's cfront 2.1 for the
"add" function calls the copy constructor twice and does not optimize
away the temporary. Here is a very-cleaned-up version of what is generated
(I rearranged the comma operators and re-named the temporaries so that
the code can be read by mere mortals; cfront would have no problem winning
the Obfuscated C contest):
char add__FRC6MatrixT1R6Matrix (__0a , __0b , __0c )
struct Matrix *__0a ;
struct Matrix *__0b ;
struct Matrix *__0c ;
{
    struct Matrix *pt1 ;
    struct Matrix *pt2 ;
    struct Matrix t3 ;
    struct Matrix t4 ;
    pt1 = __0a;
    pt2 = __0b;
    __ct__6MatrixFRC6Matrix (&t3, pt1);
    __apl__6MatrixFRC6Matrix (&t3, pt2);
    __ct__6MatrixFRC6Matrix (&t4, &t3);
    __as__6MatrixFRC6Matrix (__0c, &t4);
}
When this code is handed to g++ (version 1.37.1), the exact same four
calls are produced in the same order. However, g++ provides a language
extension: I could have written
inline Matrix operator+(const Matrix& a,const Matrix& b) return tmp(a) {
    tmp += b;
}
That is, I declare the return variable in the header and provide an
expression with the same syntax as those used to initialize the members
of a class when a constructor is used. If I do this, one of the
copy constructors is eliminated. We still don't have the ideal code;
we have the equivalent of
void add(const Matrix& a,const Matrix& b,Matrix& c) {
    Matrix tmp(a);
    tmp += b;
    c = tmp;
}
So the extension helps, but doesn't solve the problem: the compiler
must assume a particular relation between the copy constructor and
the assignment operator to eliminate the temporary. Here's the Sparc
assembler code (with -O):
_add__FRC6MatrixT0R6Matrix:
    !#PROLOGUE# 0
    save %sp,-128,%sp
    !#PROLOGUE# 1
    add %fp,-32,%l0
    mov %l0,%o0
    call ___6MatrixRC6Matrix,0
    mov %i0,%o1
    mov %l0,%o0
    call _op$assign_plus__6MatrixRC6Matrix,0
    mov %i1,%o1
    mov %i2,%o0
    call _op$assign_nop__6MatrixRC6Matrix,0
    mov %l0,%o1
    ret
    restore
The name mangling is different, but you get the idea.
--
--
Joe Buck
jbuck@galileo.berkeley.edu {uunet,ucbvax}!galileo.berkeley.edu!jbuck
Author: ark@alice.att.com (Andrew Koenig)
Date: 30 Aug 91 07:53:07 GMT
In article <1991Aug29.230533.22972@agate.berkeley.edu> jbuck@forney.berkeley.edu (Joe Buck) writes:
> I took your course a year and a half ago, Andy, and that's what you
> told us. Unfortunately, no one has taught cfront (as of version 2.1)
> how to do it, even in simple cases like the following:
Right you are -- cfront 2.1 doesn't support this particular optimization.
At this point, though, it looks like cfront 3.0 will.
--
--Andrew Koenig
ark@europa.att.com
Author: steve@taumet.com (Stephen D. Clamage)
Date: 30 Aug 91 18:12:31 GMT
jbuck@forney.berkeley.edu (Joe Buck) writes:
|In article <20797@alice.att.com> ark@alice.UUCP () writes:
|>So can some compilers -- the call to the copy constructor can
|>often be optimized away. This is especially likely if the
|>function in question is inline.
|I took your course a year and a half ago, Andy, and that's what you
|told us. Unfortunately, no one has taught cfront (as of version 2.1)
|how to do it, even in simple cases like the following:
There are alternatives to cfront. Oregon C++ (a product of my company),
for example, does optimize away the copy constructor and temp in this
case. It is available on many popular platforms, including Sun-4.
--
Steve Clamage, TauMetric Corp, steve@taumet.com