Thread

Topic: VC++ bug (Was: empty base and copy assignment)

Author: Lisa Lippincott <lisa_lippincott@advisories.com>
Date: 1999/09/02 Raw View

I wrote:
> In the case at hand, the empty class E has an implicitly declared and
> defined copy assignment operator, described in 12.8 [class.copy]:
>
> E& E::operator=( const E& )     { return *this; }
>
> MSVC seems to incorrectly pessimize this function to:
>
> E& E::operator=( const E& e )
>   {
>    memmove( this, &e, sizeof(E) );
>    return *this;
>   }

James Kanze <James.Kanze@dresdner-bank.com> suggested:
> More likely memcpy, since if the objects overlap, the behavior of
> assignment is undefined.

Perhaps I gave MSVC too much credit when I assumed they allowed for
self-assignment.  Or perhaps not.

And he continues:
> It's not really pessimisation: basically (I suppose), VC++ has detected
> that the class contains no objects for which bitwise copy will not work,
> and so replaces the assignment operator by a (probably inline) bitwise
> copy.  I don't know about pentiums, but on earlier Intel chips, this was
> definitely an optimization.

In the example under discussion, a function requiring no work has been
transformed into a function requiring work.  While I'm not familiar with
all the quirks of the Intel architecture, I'll go out on a limb and predict
that the transformed function is both larger and slower.

> What I would expect to be undefined is assigning anything but the most
> derived type.  Every time I've seen it, it has been a programming error
> -- the programmer meant to assign the most derived type.  And I'm not
> sure what the meaning should be.

Certainly I've seen problems that arise when people get confused about
which assignment operator they're calling.  But I've also seen assignment
of base classes put to good use.  In particular, it's common to
call the base class assignment operator from the assignment operator
of a derived type.

But the last-quoted statement puzzles me.  Why would the meaning be
other than "call the operator function chosen by the usual rules?"

                                             --Lisa Lippincott

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: pixi@bloodbath.burble.org (maurice s. barnum)
Date: 1999/09/03 Raw View

Lisa Lippincott <lisa_lippincott@advisories.com> writes:

: James Kanze <James.Kanze@dresdner-bank.com> suggested:
: > More likely memcpy, since if the objects overlap, the behavior of
: > assignment is undefined.
:
: Perhaps I gave MSVC too much credit when I assumed they allowed for
: self-assignment.  Or perhaps not.

this is an interesting question.. if two assumptions are made:

1) assigning an object to itself is an exceptional and rare occurrence.
2) conditional branches are expensive.

i think there is an argument for having a compiler generate code that
elides the "optimization" when the object is being assigned to itself.

 --xmsb

Author: Lisa Lippincott <lisa_lippincott@advisories.com>
Date: 1999/09/01 Raw View

In the case at hand, the empty class E has an implicitly declared and
defined copy assignment operator, described in 12.8 [class.copy]:

E& E::operator=( const E& )     { return *this; }

MSVC seems to incorrectly pessimize this function to:

E& E::operator=( const E& e )
  {
   memmove( this, &e, sizeof(E) );
   return *this;
  }

Presumably, the authors expected the above pessimization to follow the
as-if rule -- that is, they assumed the memmove to have no visible
effect.  The empty base class optimization provides a situation where
that assumption is violated, and gives MSVC a bug.

James Kanze <James.Kanze@dresdner-bank.com> wrote:
> IMHO, the behavior should be undefined in such cases.  I see no reason
> to define a specific behavior to assigning the base class without
> assigning the derived class.  But I don't know off hand of any words in
> the standard which make it undefined.

I'm of the opposite opinion -- I'm happy with the current definition of
the implicitly defined assignment operator.  I see no reason why
the standard should add "but if it's called for a base class subobject,
the results are undefined," particularly since the only reason to add
such a clause would be to allow the pessimization above.

                                                --Lisa Lippincott


Author: James.Kanze@dresdner-bank.com
Date: 1999/09/02 Raw View

In article <010919991421481234%lisa_lippincott@advisories.com>,
  Lisa Lippincott <lisa_lippincott@advisories.com> wrote:
>
> In the case at hand, the empty class E has an implicitly declared and
> defined copy assignment operator, described in 12.8 [class.copy]:
>
> E& E::operator=( const E& )     { return *this; }
>
> MSVC seems to incorrectly pessimize this function to:
>
> E& E::operator=( const E& e )
>   {
>    memmove( this, &e, sizeof(E) );

More likely memcpy, since if the objects overlap, the behavior of
assignment is undefined.

>    return *this;
>   }

It's not really pessimisation: basically (I suppose), VC++ has detected
that the class contains no objects for which bitwise copy will not work,
and so replaces the assignment operator by a (probably inline) bitwise
copy.  I don't know about pentiums, but on earlier Intel chips, this was
definitely an optimization.

> Presumably, the authors expected the above pessimization to follow the
> as-if rule -- that is, they assumed the memmove to have no visible
> effect.  The empty base class optimization provides a situation where
> that assumption is violated, and gives MSVC a bug.

Correct (except that I'm still not sure that the program contains no
undefined behavior).  And that at least on earlier Intel processors, it
definitely did result in a win.

> James Kanze <James.Kanze@dresdner-bank.com> wrote:
> > IMHO, the behavior should be undefined in such cases.  I see no reason
> > to define a specific behavior to assigning the base class without
> > assigning the derived class.  But I don't know off hand of any words in
> > the standard which make it undefined.

> I'm of the opposite opinion -- I'm happy with the current definition
> of the implicitly defined assignment operator.

IMHO, the problem isn't with the definition of the assignment operator
per se; it is with the definition of what happens, or is supposed to
happen, when you only assign part of an object.  In general, I wouldn't
expect this to work, and I can think of no good use for it to motivate
it to work.  All of the rare times I've seen an assignment of just the
base part of a larger object, it has been a program error.

> I see no reason why
> the standard should add "but if it's called for a base class
> subobject, the results are undefined," particularly since the only
> reason to add such a clause would be to allow the pessimization above.

What I would expect to be undefined is assigning anything but the most
derived type.  Every time I've seen it, it has been a programming error
-- the programmer meant to assign the most derived type.  And I'm not
sure what the meaning should be.  (I'm talking here only about the
object on the left hand side of the assignment.)

--
James Kanze                   mailto: James.Kanze@dresdner-bank.com
Conseils en informatique orientée objet/
                  Beratung in objekt orientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627


Author: James.Kanze@dresdner-bank.com
Date: 1999/09/03 Raw View

In article <020919991521259045%lisa_lippincott@advisories.com>,
  Lisa Lippincott <lisa_lippincott@advisories.com> wrote:

> And he continues:
> > It's not really pessimisation: basically (I suppose), VC++ has detected
> > that the class contains no objects for which bitwise copy will not work,
> > and so replaces the assignment operator by a (probably inline) bitwise
> > copy.  I don't know about pentiums, but on earlier Intel chips, this was
> > definitely an optimization.

> In the example under discussion, a function requiring no work has been
> transformed into a function requiring work.  While I'm not familiar with
> all the quirks of the Intel architecture, I'll go out on a limb and predict
> that the transformed function is both larger and slower.

The optimization does not only apply to this specific case; it applies
to many assignments.  And globally, it is almost certainly a win.

I agree that a good compiler would recognize that it isn't a win in this
particular case (since doing nothing is even better), and do nothing.
But this is a separate optimization (which apparently VC++ doesn't
make).

> > What I would expect to be undefined is assigning anything but the
> > most derived type.  Every time I've seen it, it has been a
> > programming error -- the programmer meant to assign the most derived
> > type.  And I'm not sure what the meaning should be.

> Certainly I've seen problems that arise when people get confused about
> which assignment operator they're calling.  But I've also seen
> assignment of base classes put to good use.  In particular, it's
> common to call the base class assignment operator from the assignment
> operator of a derived type.

This particular use occurred to me after I posted.  If I derive from E,
and in the operator= of my derived class, I write E::operator=( other ),
the code had better work.

> But the last-quoted statement puzzles me.  Why would the meaning be
> other than "call the operator function chosen by the usual rules?"

What does it mean to assign the base class, while leaving the derived
class part with the old values?  I cannot conceive of a case where this
would be correct.

--
James Kanze                   mailto: James.Kanze@dresdner-bank.com
Conseils en informatique orientée objet/
                  Beratung in objekt orientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627



Author: James.Kanze@dresdner-bank.com
Date: 1999/09/01 Raw View

In article <slrn7sgclg.2dq.sbnaran@localhost.localdomain>,
  sbnaran@uiuc.edu wrote:
> On 28 Aug 99 17:29:42 GMT, Valentin Bonnard <Bonnard.V@wanadoo.fr> wrote:
> >Vesa A J Karvonen wrote:
>
> >> struct E {};
> >> struct F : E {char c;};
>
> sizeof(class) must be 1 or more.
> Most likely sizeof(E) is 1.
> On a compiler that doesn't do empty base optimization, sizeof(F) is 2.
> On a compiler that does    do empty base optimization, sizeof(F) is 1.
> That is, the E sub-object of the full F object has size 0.
>
> >> int main() {
> >>   F f0;
> >>   F f1;
> >>
> >>   f0.c = '0';
> >>   f1.c = '1';
> >>
> >>   E& e0 = f0;
> >>   E& e1 = f1;
> >>
> >>   cout << f0.c << '\n';
> >>
> >>   e0 = e1;
>
> In general, assignment on polymorphic objects is usually a bad idea
> because you risk leaving the object in an inconsistent state.  In the
> above, you are changing the E part of f0 but not the F-only part of f0.
> You really should change both the E and F parts in tandem.  For this
> reason, I normally make operator= private and non-implemented in the
> base class.
>
> >This one copies sizeof(E) bytes !
> >
> >>   cout << f0.c << '\n';
> >
> >You get 1, right ?
>
> On a compiler that does not do the empty base optimization, you get 1.
> But on a compiler that does do the empty base optimization, isn't the
> behaviour undefined because f0.c will be overwritten with sizeof(E)
> bytes?  What am I missing?

Interesting question.  First, whether the behavior is undefined or not
cannot depend on the optimization -- either it is undefined, or it
isn't.  Second, unless there is a clause in the standard somewhere that
says the behavior is undefined, you must get 0 -- you have only assigned
the E parts, so the F parts must remain unchanged.  (This isn't
difficult for the compiler to do; since it knows that E is empty, it
could treat an assignment of E's as a no-op.  On the other hand, I
wouldn't be overly critical of a compiler writer who didn't think to
handle this special case.)

IMHO, the behavior should be undefined in such cases.  I see no reason
to define a specific behavior to assigning the base class without
assigning the derived class.  But I don't know off hand of any words in
the standard which make it undefined.

--
James Kanze                   mailto: James.Kanze@dresdner-bank.com
Conseils en informatique orientée objet/
                  Beratung in objekt orientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627



Author: Valentin Bonnard <Bonnard.V@wanadoo.fr>
Date: 1999/08/28 Raw View

Vesa A J Karvonen wrote:

> #include <iostream>
> using namespace std;
>
> struct E {};
> struct F : E {char c;};
>
> int main() {
>   F f0;
>   F f1;
>
>   f0.c = '0';
>   f1.c = '1';
>
>   E& e0 = f0;
>   E& e1 = f1;
>
>   cout << f0.c << '\n';
>
>   e0 = e1;

This one copies sizeof(E) bytes !

>   cout << f0.c << '\n';

You get 1, right ?

>   return 0;
> }

The behaviour is very well defined.

This is a known bug in MSVC, caused by the fact that E takes
0 bytes inside F, not sizeof(E) bytes.

I think that you can disable this behaviour (so that E
takes sizeof(E) bytes inside a F).

--

Valentin Bonnard

Author: Vesa A J Karvonen <vkarvone@cc.helsinki.fi>
Date: 1999/08/29 Raw View

Valentin Bonnard <Bonnard.V@wanadoo.fr> wrote:
> Vesa A J Karvonen wrote:
[snip]
> This is a known bug in MSVC, caused by the fact that E takes
> 0 bytes inside F, not sizeof(E) bytes.
[snip]

That's what I thought. Here is another program that I find interesting:

#include <iostream>
using namespace std;

struct B {};

template<int i>
struct D : B {};

struct DD : D<0>, D<1> {};

int main() {
  DD a[2];

  D<1>* d1 = &a[0];
  D<0>* d0 = &a[1];
  B* b1 = d1;
  B* b0 = d0;

  cout << (b0 == b1) << '\n';

  return 0;
}

What should it output?

It seems to me that given the clause 5.10, there is very little, although
significant, scope for storage optimization of base classes.

Anyway, on the subject of compiler bugs, here is yet another bug in
MSVC++:

#include <iostream>
#include <typeinfo>
using namespace std;

int main() {
  double d;
  long double ld;

  cout << typeid(d+ld).name() << '\n';
  cout << typeid(ld+d).name() << '\n';

  return 0;
}

---
Vesa Karvonen


Author: Francis Glassborow <francis@robinton.demon.co.uk>
Date: 1999/08/30 Raw View

In article <user-3008990800070001@aus-as5-130.io.com>, blargg
<postmast.root.admi.gov@iname.com> writes
>You're saying that the object has a different size? I don't buy that. All
>objects of type E have a size of sizeof (E). Maybe you mean that the
>compiler overlays other objects of a *different* type in the same space as
>the E sub-object. That's different than it having a zero size.

I think that is sophistry.

Of course the implementor can side step the problem we are talking about
by making assignment between dataless objects a no-op.



Francis Glassborow      Journal Editor, Association of C & C++ Users
64 Southfield Rd
Oxford OX4 1PA          +44(0)1865 246490
All opinions are mine and do not represent those of any organisation



Author: Valentin Bonnard <Bonnard.V@wanadoo.fr>
Date: 1999/08/30 Raw View

Siemel B. Naran wrote:
>
> On 28 Aug 99 17:29:42 GMT, Valentin Bonnard <Bonnard.V@wanadoo.fr> wrote:

> >You get 1, right ?
>
> On a compiler that does not do the empty base optimization, you get 1.
> But on a compiler that does do the empty base optimization, isn't the
> behaviour undefined

No. MS gets it wrong, once again.

--

Valentin Bonnard


Author: sbnaran@uiuc.edu (Siemel B. Naran)
Date: 1999/08/30 Raw View

On 28 Aug 99 17:29:42 GMT, Valentin Bonnard <Bonnard.V@wanadoo.fr> wrote:
>Vesa A J Karvonen wrote:

>> struct E {};
>> struct F : E {char c;};

sizeof(class) must be 1 or more.
Most likely sizeof(E) is 1.
On a compiler that doesn't do empty base optimization, sizeof(F) is 2.
On a compiler that does    do empty base optimization, sizeof(F) is 1.
That is, the E sub-object of the full F object has size 0.

>> int main() {
>>   F f0;
>>   F f1;
>>
>>   f0.c = '0';
>>   f1.c = '1';
>>
>>   E& e0 = f0;
>>   E& e1 = f1;
>>
>>   cout << f0.c << '\n';
>>
>>   e0 = e1;

In general, assignment on polymorphic objects is usually a bad idea
because you risk leaving the object in an inconsistent state.  In the
above, you are changing the E part of f0 but not the F-only part of f0.
You really should change both the E and F parts in tandem.  For this
reason, I normally make operator= private and non-implemented in the
base class.

>This one copies sizeof(E) bytes !
>
>>   cout << f0.c << '\n';
>
>You get 1, right ?

On a compiler that does not do the empty base optimization, you get 1.
But on a compiler that does do the empty base optimization, isn't the
behaviour undefined because f0.c will be overwritten with sizeof(E)
bytes?  What am I missing?

--
----------------------------------
Siemel B. Naran (sbnaran@uiuc.edu)
----------------------------------

Author: postmast.root.admi.gov@iname.com (blargg)
Date: 1999/08/30 Raw View

In article <slrn7sgclg.2dq.sbnaran@localhost.localdomain>,
sbnaran@uiuc.edu wrote:

> On 28 Aug 99 17:29:42 GMT, Valentin Bonnard <Bonnard.V@wanadoo.fr> wrote:
> >Vesa A J Karvonen wrote:
>
> >> struct E {};
> >> struct F : E {char c;};
>
> sizeof(class) must be 1 or more.
> Most likely sizeof(E) is 1.
> On a compiler that doesn't do empty base optimization, sizeof(F) is 2.
> On a compiler that does    do empty base optimization, sizeof(F) is 1.
> That is, the E sub-object of the full F object has size 0.

You're saying that the object has a different size? I don't buy that. All
objects of type E have a size of sizeof (E). Maybe you mean that the
compiler overlays other objects of a *different* type in the same space as
the E sub-object. That's different than it having a zero size.

> >> int main() {
> >>   F f0;
> >>   F f1;
> >>
> >>   f0.c = '0';
> >>   f1.c = '1';
> >>
> >>   E& e0 = f0;
> >>   E& e1 = f1;
> >>
> >>   cout << f0.c << '\n';
> >>
> >>   e0 = e1;
>
> In general, assignment on polymorphic objects is usually a bad idea
> because you risk leaving the object in an inconsistent state. In the
> above, you are changing the E part of f0 but not the F-only part of f0.
> You really should change both the E and F parts in tandem.  For this
> reason, I normally make operator= private and non-implemented in the
> base class.

Did you miss the fact that this code is only to demonstrate the bug?!?

Discretion.

> >This one copies sizeof(E) bytes !
> >
> >>   cout << f0.c << '\n';
> >
> >You get 1, right ?
>
> On a compiler that does not do the empty base optimization, you get 1.

An optimization is something that doesn't change the effect. I wouldn't
call it an optimization if it broke the program. I'd call it the "empty
base bug" :-)

> But on a compiler that does do the empty base optimization, isn't the
> behaviour undefined because f0.c will be overwritten with sizeof(E)
> bytes?

I don't think it would be a good idea for a behavior to be made undefined
in the standard just because a naive implementation would get it wrong. A
correct implementation won't copy any bytes *ever* when copying E objects
(in this example).

    struct E { };

    E e1, e2;

    e1 = e2; // no code generated

Ignoring the fact that the above code fragment has no observable effects,
because this can all be optimized to nothing, assuming no optimization is
performed, a compiler that employs the empty base optimization would need
to put a special case to not copy anything for objects of types with no
members. This wouldn't be a bad idea in general anyway, since nothing ever
needed to be copied in the first place.

Just consider the effects if this really were undefined - programmers
would have to watch out very carefully for this situation. Most likely,
they would miss cases where it invoked undefined behavior, since it would
be very subtle. I remember reading that Stroustrup, in general, prefers
pushing complexity onto the compiler rather than the programmer when given
a choice: "keep the language simple, not the compiler" (*not* an actual
quote). In this case, the fix in the compiler is so trivial (don't
generate copy code when the struct has no members) that there would be no
reason not to require compilers to get it right.