Topic: Temporaries and optimization?


Author: jpotter@falcon.lhup.edu (John Potter)
Date: Thu, 27 Jun 2002 23:02:45 GMT
On Thu, 27 Jun 2002 20:27:45 GMT, Daniel Frey <daniel.frey@aixigo.de>
wrote:

> John Potter wrote:

> > On Wed, 26 Jun 2002 15:50:49 GMT, Daniel Frey <daniel.frey@aixigo.de>
> > wrote:

> > > I am wondering why a lot of temporaries are not optimized away.

> > Inlining may not change the semantics of a program.

> I know. But what do you mean by this for the code given?

Replace all of the definitions with declarations, place the
definitions in another translation unit, reask the questions and you
will have the answers in front of you.

> > > A test1( A a, const A& b ) { return a += b; }

> > a is a local variable, not a temporary.  It is initialized by the
> > caller.  There is only one temporary produced.

> I should have been more careful with the wording; "temporaries" is
> probably not technically correct. I am interested in the number of
> instances created and the possibility of minimizing them. Whether they
> are named/unnamed or called temporaries doesn't really matter to me.
> Hope this doesn't sound too confusing :)

See 3.7.2/3.  Named objects with side effects in their initialization
or destructor may not be removed even if unused.  In this case, a is
used, and its initialization does have the side effect of producing the
correct result.
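
A minimal sketch of that rule, assuming a hypothetical class Tracer and
hypothetical counters (none of these names come from the thread). The
named local is never read, but its constructor and destructor have
side effects, so the object must still be created and destroyed:

```cpp
#include <cassert>

// Hypothetical counters that make construction and destruction observable.
static int g_constructed = 0;
static int g_destroyed = 0;

// Hypothetical instrumented class: both special members have side effects.
struct Tracer {
    Tracer() { ++g_constructed; }
    ~Tracer() { ++g_destroyed; }
};

// The named local 'unused' is never read, yet it may not be removed:
// its initialization and destruction change the counters.
int run_scope() {
    {
        Tracer unused;
        (void)unused;  // only silences "unused variable" warnings
    }
    // Every Tracer constructed so far has also been destroyed.
    assert(g_destroyed == g_constructed);
    return g_constructed;
}
```

Calling run_scope() once from main leaves both counters at 1, whatever
the optimization level.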

> > > A test3( const A& a, const A& b ) { return A( a ) += b; }

> > Operator += is not required to return *this.  It is asking too
> > much to expect the implementation to look across functions.  I

> Why?

Because the standard does not tell me how to write my code.

> 'return *this;' is a very common thing and could allow a bunch of
> optimizations, thus I'd expect a compiler to recognise this case.

It often can't find it and I sure would not want it to make
assumptions which could break code.

> It's
> far easier than a lot of other things it finds out about functions (e.g.
> the side-effects you mention),

The implementation is not required to check for side effects.  It must
produce code which follows the semantics of the language.  If it wants
to violate that, it may check for side effects and produce bad code as
long as the program can't tell.  Maybe not high on compiler writers'
agendas?

> although it won't make a difference for
> test3, but test4 could benefit, couldn't it?

No, see answer to test5.  The function does not say "return t;"

> > don't think it is allowed other than the as-if rule which would
> > not apply here with observable side effects.

> The side-effects are the same as in test5, aren't they?

Yes, but the standard allows certain temporaries to be removed even
when the optimization has observable side effects.  These are
explicitly permitted violations of the "as-if" rule.

> Hm, the translation unit is the argument to rule out test1 and test2.
> For test3 and test4 I really don't see the reason to forbid
> optimizations because of side-effects when test5 is allowed to optimize.

Because the standard says that RVO and NRVO are allowed to violate the
"as-if" rule.  There are those who object to this and want
implementations to provide a switch to turn it off.  Some do.  Others
do not even implement NRVO.  The rules were tightened quite a bit
between CD2 and the standard.  12.8/15 states what is allowed.
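
A rough sketch of the difference, assuming a cut-down version of the
class A from the original post and hypothetical helper names
(sum_via_reference, sum_via_named, copies_made). The exact counts
depend on the compiler, the language version and its options, so the
comments state only what is required or permitted:

```cpp
#include <cassert>

static int g_copies = 0;  // counts copy-constructor calls

class A {
    int i_;
public:
    explicit A(int i) : i_(i) {}
    A(const A& rhs) : i_(rhs.i_) { ++g_copies; }
    A& operator+=(const A& rhs) { i_ += rhs.i_; return *this; }
    int value() const { return i_; }
};

// The return expression is the reference returned by operator+=, so the
// return value must be copy-constructed from it; the copy is observable
// and is not one of the copies the standard allows to be omitted.
A sum_via_reference(const A& a, const A& b) { A t(a); return t += b; }

// The return expression names the local object itself; the copy into
// the return value may be omitted (NRVO) even though it is observable.
A sum_via_named(const A& a, const A& b) { A t(a); t += b; return t; }

// Hypothetical helper: copy-constructor calls caused by one call of f,
// including the initialization of the caller's object c.
int copies_made(A (*f)(const A&, const A&)) {
    const A a(1), b(2);
    g_copies = 0;
    A c(f(a, b));
    assert(c.value() == 3);
    return g_copies;
}
```

With an implementation that applies both permitted elisions,
copies_made(sum_via_named) would be expected to report one copy and
copies_made(sum_via_reference) two; an implementation without NRVO
reports more for the named form.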

> Anyway, the result is, that test5 shows the recommended way to write
> efficient code. I don't have the impression that this is already used
> wherever appropriate...

It has been discussed for about ten years.  It is about time that
those who care started paying attention.  Education?  Maybe our little
interchange will help.

John

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: Daniel Frey <daniel.frey@aixigo.de>
Date: Fri, 28 Jun 2002 17:11:01 GMT
John Potter wrote:
>
> On Thu, 27 Jun 2002 20:27:45 GMT, Daniel Frey <daniel.frey@aixigo.de>
> wrote:
>
> > > Inlining may not change the semantics of a program.
>
> > I know. But what do you mean by this for the code given?
>
> Replace all of the definitions with declarations, place the
> definitions in another translation unit, reask the questions and you
> will have the answers in front of you.

Got it :)

> > > > A test3( const A& a, const A& b ) { return A( a ) += b; }
>
> > > Operator += is not required to return *this.  It is asking too
> > > much to expect the implementation to look across functions.  I
>
> > Why?
>
> Because the standard does not tell me how to write my code.

Sorry, "Why?" wasn't meant as "Why doesn't the standard require
operator+= to return *this?"; it was meant as "Why is it asking too
much to expect the compiler to detect that the implementation returns
*this?" This is a big problem if the implementation is inside another
translation unit, but with an inlined function, it could be possible.
But you are still right that this doesn't allow optimizations, because
of the side-effects...

> > Anyway, the result is, that test5 shows the recommended way to write
> > efficient code. I don't have the impression that this is already used
> > wherever appropriate...
>
> It has been discussed for about ten years.  It is about time that
> those who care started paying attention.  Education?  Maybe our little
> interchange will help.

I hope so. When starting this thread, I just wanted to make sure that
I'm not getting used to bad style, verifying that test5 *is* the way to
go. I decided that in the future, I will replace 't', 'tmp', 'res',
'result' or whatever it is called today by 'nrv', thus writing:

A operator+( const A& lhs, const A& rhs )
{
   A nrv( lhs ); nrv += rhs; return nrv;
}

Which will hopefully make more people aware of it. People that don't
know about 'nrv' will wonder what 'nrv' means and thus might start to
search for answers. If you think this is a good idea, support it :)

Regards, Daniel

--
Daniel Frey

aixigo AG - financial training, research and technology
Schloß-Rahe-Straße 15, 52072 Aachen, Germany
fon: +49 (0)241 936737-42, fax: +49 (0)241 936737-99
eMail: daniel.frey@aixigo.de, web: http://www.aixigo.de






Author: Daniel Frey <daniel.frey@aixigo.de>
Date: Wed, 26 Jun 2002 15:50:49 GMT
Hi,

I am wondering why a lot of temporaries are not optimized away. I know
that this is dependent on the compiler used, but I don't know whether
there is some influence from the standard that prevents certain
optimizations. I tried the following code:

#include <iostream>
using namespace std;

int ctor( 0 ), cpy( 0 );

class A
{
private:
   int i_;

public:
   explicit A( const int i ) : i_( i ) { ++ctor; cout << "CTOR "; }
   A( const A& rhs ) : i_( rhs.i_ ) { ++cpy; cout << "CPY "; }

   friend ostream& operator<<( ostream& s, const A& v )
   { return s << v.i_; }

   A& operator+=( const A& rhs ) { i_ += rhs.i_; return *this; }
};

A test1( A a, const A& b ) { return a += b; }
A test2( A a, const A& b ) { a += b; return a; }
A test3( const A& a, const A& b ) { return A( a ) += b; }
A test4( const A& a, const A& b ) { A t( a ); return t += b; }
A test5( const A& a, const A& b ) { A t( a ); t += b; return t; }

int main()
{
   const A a( 1 ), b( 2 );
   cout << a << " " << b << endl;

   cout << test1( a, b ) << endl;
   cout << test2( a, b ) << endl;
   cout << test3( a, b ) << endl;
   cout << test4( a, b ) << endl;
   cout << test5( a, b ) << endl;

   cout << "#CTOR: " << ctor << "\n#CPY: " << cpy << endl;
}

The output for GCC 2.95.2, GCC 2.95.3, GCC 3.0.4 and TenDRA 4.1.2
(tested by a colleague) shows TWO temporaries for each of test1 - test5.
If compiled with GCC 3.1, test1 - test4 yield TWO temporaries, while
test5 has only ONE temporary (I counted the occurrences of 'CPY' in the
output). Sure, later optimization passes help for "simple" objects, but
given some larger, more complex classes, this may introduce a lot of
overhead. I'm thinking of classes for matrices, etc. where test1-test5
are the typical ways to implement operators.

This last one gets optimized because of the "named return value
optimization" in GCC 3.1. As temporaries are very important and the
examples are so simple, I wonder if the standard prevents optimizations
of the other functions or if it's just a bad compiler/optimizer. Also,
I'd like to hear about the result of other compilers to decide what's
the best pattern to use for portable code. Sorry if this is slightly OT,
but I don't know any better group to ask :)

Regards, Daniel

--
Daniel Frey

aixigo AG - financial training, research and technology
Schloß-Rahe-Straße 15, 52072 Aachen, Germany
fon: +49 (0)241 936737-42, fax: +49 (0)241 936737-99
eMail: daniel.frey@aixigo.de, web: http://www.aixigo.de






Author: jpotter@falcon.lhup.edu (John Potter)
Date: Thu, 27 Jun 2002 15:49:50 GMT
On Wed, 26 Jun 2002 15:50:49 GMT, Daniel Frey <daniel.frey@aixigo.de>
wrote:

> Hi,
>
> I am wondering why a lot of temporaries are not optimized away.

Strike 1:
Inlining may not change the semantics of a program.

> A test1( A a, const A& b ) { return a += b; }

Strike 2:
a is a local variable, not a temporary.  It is initialized by the
caller.  There is only one temporary produced.

> A test2( A a, const A& b ) { a += b; return a; }

Same as test1.

> A test3( const A& a, const A& b ) { return A( a ) += b; }

Strike 3:
Operator += is not required to return *this.  It is asking too
much to expect the implementation to look across functions.  I
don't think it is allowed other than the as-if rule which would
not apply here with observable side effects.

> A test4( const A& a, const A& b ) { A t( a ); return t += b; }

Same as test3.

> A test5( const A& a, const A& b ) { A t( a ); t += b; return t; }

If you want NRVO, that's the way to write it.  Note that it is all
done within the function which could be in a different translation
unit.

> int main()
> {
>    const A a( 1 ), b( 2 );
>    cout << a << " " << b << endl;
>
>    cout << test1( a, b ) << endl;

You're out.
There must be an object here after test1 completes and it is a
temporary.  A c(test1(a,b)); or A c = test1(a,b); could use c without
a temporary.  Not all compilers support both.  The same applies to the
other four tests.  Note that it is all done at the call site in an
implementation in which the caller provides a pointer to the space
where the return value is to be constructed.

>    cout << test5( a, b ) << endl;

A c(test5(a, b));

You might note that c is copy constructed from a and then modified by
operator+=.  Technically, t was not removed from the subprogram, it
was just considered to be a different name for a temporary which was
also a different name for c.  Since the one temporary was removed
twice, t vaporized in the process.  :)
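
One way to look at this from inside a program, as a sketch only: the
class is a cut-down version of the original A, and g_local_address and
local_is_callers_object are hypothetical names. The address of t is
saved as an integer while t is alive and compared with the address of c
afterwards. Whether the two match depends on whether the implementation
applies both permitted elisions, so no particular result is promised:

```cpp
#include <cassert>
#include <cstdint>

// Address of the named local 't', saved as an integer while 't' is alive.
static std::uintptr_t g_local_address = 0;

class A {
    int i_;
public:
    explicit A(int i) : i_(i) {}
    A(const A& rhs) : i_(rhs.i_) {}
    A& operator+=(const A& rhs) { i_ += rhs.i_; return *this; }
    int value() const { return i_; }
};

A test5(const A& a, const A& b) {
    A t(a);
    g_local_address = reinterpret_cast<std::uintptr_t>(&t);
    t += b;
    return t;
}

// Returns true when 't' and the caller's 'c' occupied the same storage,
// i.e. when 't' was just another name for 'c'.
bool local_is_callers_object() {
    const A a(1), b(2);
    A c(test5(a, b));
    assert(c.value() == 3);
    return g_local_address == reinterpret_cast<std::uintptr_t>(&c);
}
```

A true result means c was copy constructed directly from a and then
modified in place by operator+=.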

John






Author: Michiel.Salters@cmg.nl (Michiel Salters)
Date: Thu, 27 Jun 2002 15:56:08 GMT
Daniel Frey <daniel.frey@aixigo.de> wrote in message news:<3D19BCD2.8B782EDE@aixigo.de>...
> Hi,
>
> I am wondering why a lot of temporaries are not optimized away. I know
> that this is dependent on the compiler used, but I don't know whether
> there is some influence from the standard that prevents certain
> optimizations. I tried the following code:

[ instrumented ctors ]

> As temporaries are very important and the
> examples are so simple, I wonder if the standard prevents optimizations
> of the other functions or if it's just a bad compiler/optimizer.

Your desired "optimizations" alter the externally visible behavior
of the source; less output is produced. If that were legal, the common
"Hello, world\n" program could also be optimized by leaving out the
std::cout << there.

The standard does explicitly allow a change in externally visible behavior
as a result of the elimination of some temporaries, but not all. The
other temporaries can only be eliminated if you do not instrument them,
so your measurements influence the measured quantity.

Regards,
--
Michiel Salters






Author: Matthew Collett <m.collett@auckland.ac.nz>
Date: Thu, 27 Jun 2002 20:26:37 GMT
In article <3D19BCD2.8B782EDE@aixigo.de>,
 Daniel Frey <daniel.frey@aixigo.de> wrote:

> class A
> {
> private:
>    int i_;
>
> public:
[...]
>    A& operator+=( const A& rhs ) { i_ += rhs.i_; return *this; }
> };
>
> A test1( A a, const A& b ) { return a += b; }
> A test2( A a, const A& b ) { a += b; return a; }
> A test3( const A& a, const A& b ) { return A( a ) += b; }
> A test4( const A& a, const A& b ) { A t( a ); return t += b; }
> A test5( const A& a, const A& b ) { A t( a ); t += b; return t; }
>
[...]
> This last one gets optimized because of the "named return value
> optimization" in GCC 3.1. As temporaries are very important and the
> examples are so simple, I wonder if the standard prevents optimizations
> of the other functions or if it's just a bad compiler/optimizer. Also,
> I'd like to hear about the result of other compilers to decide what's
> the best pattern to use for portable code. Sorry if this is slightly OT,
> but I don't know any better group to ask :)
>
> Regards, Daniel

As I understand it (but doubtless the real experts will correct me):-

In test1 and test2, a copy may not be optimised away.  Pass-by-value
requires a copy, and neither version of the RVO applies to function
arguments. (As an unrelated stylistic objection, this form makes the
declared interface of operator+ asymmetric in its arguments for
purely implementation reasons.)

In test3 and test4, the compiler is permitted to optimise a copy
away, but would need to be fairly smart to do so.  It has to
recognise that the reference returned from operator+= is always to
the same value (either 't' or the unnamed temporary) that has just
been constructed.  Since operator+= is inlined here, this should be
possible, but is evidently too subtle for your compiler.

In test5, the NRVO should be immediately and straightforwardly
applicable by any compiler which implements it. (And a quick check
using your code suggests that mine doesn't :-(. )
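
The test1/test2 case can be sketched as follows, assuming a cut-down
version of the original class A and hypothetical names (add_by_value,
copies_for_by_value). The count is given as a lower bound because the
copy into the caller's object c may or may not be elided:

```cpp
#include <cassert>

static int g_copies = 0;  // counts copy-constructor calls

class A {
    int i_;
public:
    explicit A(int i) : i_(i) {}
    A(const A& rhs) : i_(rhs.i_) { ++g_copies; }
    A& operator+=(const A& rhs) { i_ += rhs.i_; return *this; }
    int value() const { return i_; }
};

// 'a' is a by-value parameter: the caller copy-constructs it from its
// argument, and the return statement copies it again, because neither
// form of the RVO applies to function parameters.
A add_by_value(A a, const A& b) { a += b; return a; }

// Hypothetical helper: copy-constructor calls caused by one call.
int copies_for_by_value() {
    const A x(1), y(2);
    g_copies = 0;
    A c(add_by_value(x, y));
    assert(c.value() == 3);
    return g_copies;
}
```

copies_for_by_value() reports at least two copies on any conforming
implementation: one to initialise the parameter and one to build the
return value.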

Best wishes,
Matthew Collett

--
The word "reality" is generally used with the intention of evoking sentiment.
                                                          -- Arthur Eddington






Author: Daniel Frey <daniel.frey@aixigo.de>
Date: Thu, 27 Jun 2002 20:27:45 GMT
John Potter wrote:
>
> On Wed, 26 Jun 2002 15:50:49 GMT, Daniel Frey <daniel.frey@aixigo.de>
> wrote:
>
> > Hi,
> >
> > I am wondering why a lot of temporaries are not optimized away.
>
> Strike 1:
> Inlining may not change the semantics of a program.

I know. But what do you mean by this for the code given?

> > A test1( A a, const A& b ) { return a += b; }
>
> Strike 2:
> a is a local variable, not a temporary.  It is initialized by the
> caller.  There is only one temporary produced.

I should have been more careful with the wording; "temporaries" is
probably not technically correct. I am interested in the number of
instances created and the possibility of minimizing them. Whether they
are named/unnamed or called temporaries doesn't really matter to me.
Hope this doesn't sound too confusing :)

> > A test3( const A& a, const A& b ) { return A( a ) += b; }
>
> Strike 3:
> Operator += is not required to return *this.  It is asking too
> much to expect the implementation to look across functions.  I

Why? 'return *this;' is a very common thing and could allow a bunch of
optimizations, thus I'd expect a compiler to recognise this case. It's
far easier than a lot of other things it finds out about functions (e.g.
the side-effects you mention), although it won't make a difference for
test3, but test4 could benefit, couldn't it?

> don't think it is allowed other than the as-if rule which would
> not apply here with observable side effects.

The side-effects are the same as in test5, aren't they?

> > A test4( const A& a, const A& b ) { A t( a ); return t += b; }
>
> Same as test3.
>
> > A test5( const A& a, const A& b ) { A t( a ); t += b; return t; }
>
> If you want NRVO, that's the way to write it.  Note that it is all
> done within the function which could be in a different translation
> unit.

Hm, the translation unit is the argument to rule out test1 and test2.
For test3 and test4 I really don't see the reason to forbid
optimizations because of side-effects when test5 is allowed to optimize.

Anyway, the result is, that test5 shows the recommended way to write
efficient code. I don't have the impression that this is already used
wherever appropriate...

>
> > int main()
> > {
> >    const A a( 1 ), b( 2 );
> >    cout << a << " " << b << endl;
> >
> >    cout << test1( a, b ) << endl;
>
> You're out.
> There must be an object here after test1 completes and it is a
> temporary.  A c(test1(a,b)); or A c = test1(a,b); could use c without

I know that one "temporary" is necessary; it is just that two are too
many. Again, sorry for using the wrong words :)

Thanks for your explanations :)

Regards, Daniel

--
Daniel Frey

aixigo AG - financial training, research and technology
Schloß-Rahe-Straße 15, 52072 Aachen, Germany
fon: +49 (0)241 936737-42, fax: +49 (0)241 936737-99
eMail: daniel.frey@aixigo.de, web: http://www.aixigo.de
