Topic: Unnecessary copying of returned object


Author: bs@alice.UUCP (Bjarne Stroustrup)
Date: 1 Sep 90 13:09:19 GMT

Antony Hodgson @ Massachusetts Institute of Technology writes

 > Consider the following:
 >
 >  Vector Vector::operator + ( Vector & A )
 >   {
 >    Vector temp;
 >    // for all valid i, temp[i] = (*this)[i] + A[i]
 >    return temp;  // copy of temp made here
 >   } // temp deallocated here
 >
 > Could an intelligent compiler bypass this copying without the help of
 > any new syntax?

Yes it could. The technique is described in The Annotated C++ Reference
Manual, section 12.1.1c. I believe the technique is used in the Zortech
compiler (existence proof).




Author: ark@alice.UUCP (Andrew Koenig)
Date: 1 Sep 90 13:21:56 GMT
In article <1990Aug31.235606.166@athena.mit.edu>, ahodgson@hstbme.mit.edu (Antony Hodgson) writes:

>  Vector Vector::operator + ( Vector & A )
>   {
>    Vector temp;
>    // for all valid i, temp[i] = (*this)[i] + A[i]
>    return temp;  // copy of temp made here
>   } // temp deallocated here

> Could an intelligent compiler bypass this copying without the help of
> any new syntax?

Yes.

Moreover, I hear that at least one commercially available C++
implementation already does this.

One fairly easy way to express such an optimization would be to say that
`if every return statement in a function returns the same local
variable, and that local variable has the same type as the
function's return value, then the compiler should alias the
local variable to the return value.'

The language definition permits this optimization.
--
    --Andrew Koenig
      ark@europa.att.com




Author: ahodgson@hstbme.mit.edu (Antony Hodgson)
Date: 31 Aug 90 23:56:06 GMT
I'm a fairly recent user of C++, but I've already noticed one situation
in which a lot of unnecessary object creation occurs which I think could
be solved by a slight change in syntax.  This is the common situation in
which one creates a temporary object which is intended to be the returned
object, but since it is a local object, one must return a copy of it.
Two objects are therefore created when only one ought to be needed.

Consider the following:

 Vector Vector::operator + ( Vector & A )
  {
   Vector temp;
   // for all valid i, temp[i] = (*this)[i] + A[i]
   return temp;  // copy of temp made here
  } // temp deallocated here

Could an intelligent compiler bypass this copying without the help of
any new syntax?  Or, failing that, would it be reasonable to introduce
a new keyword such as 'returnvalue' which would be defined to have the
same type as the specified return value?  Any comments?

Tony Hodgson ahodgson@hstbme.mit.edu




Author: pcg@cs.aber.ac.uk (Piercarlo Grandi)
Date: 4 Sep 90 21:18:59 GMT
On 1 Sep 90 13:21:56 GMT, ark@alice.UUCP (Andrew Koenig) said:

ark> In article <1990Aug31.235606.166@athena.mit.edu>,
ark> ahodgson@hstbme.mit.edu (Antony Hodgson) writes:

ahodgson> Vector Vector::operator + ( Vector & A )
ahodgson>  {
ahodgson>   Vector temp;
ahodgson>   // for all valid i, temp[i] = (*this)[i] + A[i]
ahodgson>   return temp;  // copy of temp made here
ahodgson>  } // temp deallocated here

ahodgson> Could an intelligent compiler bypass this copying without the
ahodgson> help of any new syntax?

ark> Yes.

Yes indeed, at the expense of a substantial bloat in space and time for
the compilation:

ark> One fairly easy to express such an optimization would be to say
ark> that `if every return statement in a function returns the same
ark> local variable, and that local variable has the same type as the
ark> function's return value, then the compiler should alias the local
ark> variable to the return value.'

This is a very good example of the divide between people like me and
people who love compilers that take several hundred kilobytes of core.

A compiler can discover a large number of things by careful analysis of
the source; in this case, by doing two passes over each function it can
discover that it can alias a local variable to the return value slot on
the stack or wherever it is.

Now let me observe: what is the cost of telling the compiler beforehand?
Virtually none. What is the cost of letting the compiler discover it?
large -- you have to make two passes over each function, and delay
generating real code until the second pass, while you build a full in
memory representation of the procedure in the first pass, in order to
check whether you can alias or not.

ark> The language definition permits this optimization.

But it does not permit doing without it; if you have the option of
telling the compiler explicitly, as in GNU C++, and you use it, the
compiler can avoid doing any analysis; and if you do not, it could
then switch to 'slow-and-big' mode (but should not, IMNHO).

Consider my favourite example of wise language design: the 'register'
keyword is defined such that you cannot take the address of variables so
tagged. Why, when the compiler knows by the end of the function whether
you have taken the address of the variable, and could then simply ignore
the 'register' specification if you have?

But simply because Classic C has been carefully designed so as to be
compilable with a simple one pass compiler that has no memory between
statements except for declarations (that is the reason for 'register'
itself -- the compiler can allocate a register to hold a variable
between statements, without ever analyzing more than one statement at a
time, because you told it by a declaration that it would be worthwhile).

Note that Classic C does not actually *forbid* more sophisticated
compilation strategies; it makes it possible for a simple compiler to
compete with a sophisticated one, thanks to some assistance from the
programmer; if such assistance is not forthcoming, the compiler *can* do
more work (but should not IMNHO).

In C++ the trend is towards making it difficult for simple compilers to
be competitive with complicated (and therefore unreliable) ones.

 What is more worrying is that C++ seems to be evolving also
 towards requiring ever more complicated *implementation*
 strategies as well; consider constructors/destructors and
 virtuals, and more recently and ominously, virtual base
 classes, and in the future, templates.

Consider the incredible syntax ambiguities (that could be easily
removed, with virtually 100% backwards compatibility), the complexity of
many rules, the fact that because of a single (IMNHO very unwise and
confusing) rule a single pass compiler is actually *impossible*, and
many other details.

 Did AT&T want to make it very difficult for anybody but
 themselves to produce a conforming implementation? Or at least
 to make it so difficult to understand what a conforming
 implementation is that people would stick with AT&T's for
 fear of the unknown? If this has been true, it has failed;
 it is well known that cfront is not itself conforming, i.e.
 AT&T have shot themselves in the foot :-).

A plea: remove the damn rule that requires two pass compilation of class
definitions. I think that it is damn unwise, and if people want to use a
class member in a member function body that appears before its
declaration, they should be damned. They should avoid doing anything so
gross and unreadable. Encourage people to define member functions after
the class definition, not within it!
--
Piercarlo "Peter" Grandi           | ARPA: pcg%uk.ac.aber.cs@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk




Author: bright@Data-IO.COM (Walter Bright)
Date: 6 Sep 90 01:28:39 GMT
In article <PCG.90Sep4221859@athene.cs.aber.ac.uk> pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:
<On 1 Sep 90 13:21:56 GMT, ark@alice.UUCP (Andrew Koenig) said:
<ark< One fairly easy to express such an optimization would be to say
<ark< that `if every return statement in a function returns the same
<ark< local variable, and that local variable has the same type as the
<ark< function's return value, then the compiler should alias the local
<ark< variable to the return value.'

Zortech C++ 2.1 does exactly this optimization. It actually turns out
to be less code in the compiler to implement the optimization than to
implement the syntax for the named return value.

<This is a very good example of the divide between people like me and
<people who love compilers that take several hundred kilobytes of core.

ZTC++ is the smallest and fastest C++ compiler, by a wide margin.
(Yes, I'm throwing down the gauntlet!)

<A compiler can discover a large number of things by careful analysis of
<the source; in this case, by doing two passes over each function it can
<discover that it can alias a local variable to the return value slot on
<the stack or wherever it is.

A C++ compiler must do multiple passes anyway, in order to determine where
to put the destructors. (It actually scans a 'basic block' version of the
function in memory.)

<Now let me observe: what is the cost of telling the compiler beforehand?
<virtually none. What is the cost of letting the compiler discover it?
<large -- you have to make two passes over each function, and delay
<generating real code until the second pass, while you build a full in
<memory representation of the procedure in the first pass, in order to
<check whether you can alias or not.

This is necessary for C++ anyway. The extra overhead to do 'return value
optimization' is insignificant. (ZTC++ spends most of its time
forming tokens and sitting inside malloc().)




Author: pcg@cs.aber.ac.uk (Piercarlo Grandi)
Date: 6 Sep 90 16:13:46 GMT
On 4 Sep 90 21:18:59 GMT, I wrote, as an aside to a technical discussion:

pcg> Did AT&T want to make it very difficult for anybody but
pcg> themselves to produce a conforming implementation? Or at least
pcg> to make it so difficult to understand what a conforming
pcg> implementation is that people would stick with AT&T's for
pcg> fear of the unknown? If this has been true, it has failed;
pcg> it is well known that cfront is not itself conforming, i.e.
pcg> AT&T have shot themselves in the foot :-).

This seems to have caused offense. I want to state that it was meant as
nothing more than irony, which I hope was understood. Apologies to any
party that took this ironic paragraph seriously.

 Another aside on AT&T and standards: somebody once remarked that
 the SVID and its verification suite were designed in places so
 that apparently one *must* copy the AT&T implementation to pass
 it; for example, where multiple error codes could conceivably
 be returned, the SVID or its verification suite would accept as
 conforming only the code returned by System V, thus forcing
 other implementations to perform certain tests in the same order
 as the System V one. This was suspected to be a ploy to defeat
 cloners, and not sloppiness on the part of the authors of the
 SVID or its verification suite. It happened that eventually
 parts of the System V implementation evolved, other codes were
 returned, and thus *System V would fail to be certified as SVID
 conforming*. Now, this shows that probably, as in all large
 organizations, the left hand does not know what the right hand
 is doing, and this is often a more credible explanation than
 conspiracy theories. However the story has a moral: those who
 did not believe that AT&T had just been sloppy took issue with
 this, and founded the OSF, to ensure an "open process", not
 just an "open specification", which they believed could be
 tweaked. This is the one point on which the ANSIfication of
 C++ is a win -- I would otherwise be confident that Stroustrup
 and his colleagues would do a better job, but (in the eyes of
 the conspiracy theorists) an "open process" is better, even
 if it leads to design-by-committee.

The point of the irony was to (as is usual for me) show the dangers of
complexity; with any work that is complicated enough and still evolving,
even the organization to which its designer belongs has difficulty
keeping up with it. There is no need to invoke conspiracy theories,
except to make irony.

I hope (against all past experience) that the ANSIfication, which after
all benefits from the work of Stroustrup, Shopiro, etc..., as well as
many others, will make the language ever *simpler*, and thus ever more
accessible, to any user and any implementor, including AT&T's :-).
--
Piercarlo "Peter" Grandi           | ARPA: pcg%uk.ac.aber.cs@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk