Topic: Conv. of pointers to unconstructed objects (12.7.2)


Author: fjh@munta.cs.mu.OZ.AU (Fergus Henderson)
Date: 1995/05/24
"Per Ångström" <eri.edt.edtpang@memo.ericsson.se> writes:

>I have some questions regarding the Draft Standard, clause 12.7,
>Construction and destruction [class.cdtor]. It says:
>
>"2 To  explicitly or implicitly convert a pointer to an object of class X
>  to a pointer to a direct or indirect base class B, the construction of
>  X  and  the  construction  of all of its direct or indirect bases that
>  directly or indirectly derive  from  B  shall  have  started  and  the
>  destruction  of  these classes shall not have completed, otherwise the
>  computation results in undefined behavior.  To form  a  pointer  to  a
>  direct  nonstatic member of an object X given a pointer to X, the con-
>  struction of X shall have started and the destruction of X  shall  not
>  have  completed, otherwise the computation results in undefined behav-
>  ior."

I was a member of the Working Group which decided on this text
at the Valley Forge meeting (Nov 94).  We debated the issue at
considerable length.

Although it had not been documented, the undefined behaviour in
question has always existed, because conversion to a virtual base class
requires examining the vtable (or equivalent implementation magic), and
if the class has not been constructed, the vtable may not have been
initialized.
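
To illustrate why the conversion needs run-time information, here is a minimal sketch. The class names V, A, B and C and the helper to_virtual_base are made up for this example. The point is that the same A* to V* conversion must work both for a stand-alone A and for the A subobject of a C, and the position of the V subobject relative to the A subobject is generally not the same in the two cases, so typical implementations look it up through data that the constructors set up.

```cpp
#include <cassert>

// Hypothetical hierarchy, not taken from the original question.
struct V { int tag; V() : tag(42) {} virtual ~V() {} };
struct A : virtual V { int a; A() : a(1) {} };
struct B : virtual V { int b; B() : b(2) {} };
struct C : A, B { int c; C() : c(3) {} };

// Converts A* to V*.  Because V is a virtual base, where the V subobject
// lives depends on the most-derived object, so a fixed compile-time
// offset is generally not enough for this conversion.
inline V* to_virtual_base(A* p) { return p; }

inline void demo_virtual_base_conversion() {
    A standalone;                 // most-derived type is A
    C combined;                   // most-derived type is C
    A* inside_c = &combined;      // A subobject of a C

    // Both conversions are fine here: the objects are fully constructed.
    assert(to_virtual_base(&standalone)->tag == 42);
    assert(to_virtual_base(inside_c)->tag == 42);

    // There is only one V subobject in a C, however it is reached.
    assert(to_virtual_base(inside_c) == static_cast<V*>(&combined));
}
```

Calling to_virtual_base on an object whose construction has not started is exactly the case that the quoted paragraph makes undefined.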

In object systems like SOM (and probably also Delta-C++), the
implementation techniques used for virtual base classes are used for
all base classes, to reduce the need for recompilation in incremental
compilation systems and to provide release-to-release binary compatibility.
The working group wanted C++ to support these modern object systems,
and wanted to minimize any difficulties or incompatibilities between these
systems and the C++ standard.

I argued in favour of making the behaviour undefined only for
virtual base classes rather than for all base classes, in part
because I was worried about the effects on existing code - in
particular, code using conversions to non-virtual
base classes works fine with most existing compilers.
However, it was pointed out that the behaviour in this situation was
really not specified by the ARM, existing implementations differed,
and so it was up to the committee to decide what the behaviour
should be.  Limiting it to virtual base classes would make the
potential problems all the more subtle, and would cause real problems
for the use of object systems like SOM.  The consensus of the working
group was that it was very important for C++ to support a wide variety
of implementation techniques and object models, and that this was more
important than sanctifying the behaviour of some existing
implementations on some rather obscure code.  I stopped debating the
point when I realized I was in a minority of one ;-).

>Below, in "alternative 1", is a synthesized version of the design of
>my current project, which consists of two parallel hierarchies. The main
>point is that at each inheritance level in the {a,b} hierarchy, there is
>a pointer or reference to the corresponding level in the {subA, subB}
>hierarchy.
>
>The problem is that on reading the above paragraph, I realize that the
>way it is currently implemented it will not be "defined" according to the
>Standard. Therefore, I have outlined a number of alternative solutions,
>some probably not legal, some maybe legal.
>
>I would like to have your comments on this issue, more specifically:
>1) Am I right in my initial worry, about the undefinedness of my current
>solution?

Yes.

>2) Am I right about the other alternatives?

Yes.

>3) Does anybody have a better solution?
>4) Is the fact that references are not mentioned in this context an indication
>that casting of references instead of pointers has defined behaviour?

No.

>5) If I am right about alternative 4 being defined, and not alternative 3,
>just because of the order of their respective base classes, is this not a
>potential source of very subtle problems?

Yes.

The cause of the problem is not really 12.7/2, it is really 12.6.2/4,
which comes straight from ARM 12.6.2, where it states that base classes
and members are initialized in declaration order.  Specifying
the order of initialization means that programmers are allowed to
write programs that depend on the order of initialization.
The working group was unanimous in wanting to discourage such code,
but there was really nothing we could do about it - once the order
of initialization is specified, the fact that certain declaration
orders cause defined behaviour and others cause undefined behaviour
is an inevitable consequence.
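
As a small illustration of that rule, the sketch below records the order in which base class constructors run for two classes that differ only in the order of their base lists. All the names (trace, Base1, Base2, AThenB, BThenA, construction_order) are invented for this example.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: a global log of constructor calls.
inline std::string& trace() { static std::string s; return s; }

struct Base1 { Base1() { trace() += "Base1 "; } };
struct Base2 { Base2() { trace() += "Base2 "; } };

// Same bases, opposite declaration order.
struct AThenB : Base1, Base2 { AThenB() { trace() += "body"; } };
struct BThenA : Base2, Base1 { BThenA() { trace() += "body"; } };

// Constructs a T and returns the recorded construction order.
template <class T>
std::string construction_order() {
    trace().clear();
    T object;
    (void)object;
    return trace();
}

inline void demo_declaration_order() {
    // Bases are constructed in the order they are declared,
    // and the constructor body of the derived class runs last.
    assert(construction_order<AThenB>() == "Base1 Base2 body");
    assert(construction_order<BThenA>() == "Base2 Base1 body");
}
```

This is the same difference that separates alternative 3 from alternative 4: only the order of the base list changes, and with it which subobject already exists when the other one is constructed.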

The alternative of revoking 12.6.2/4 was of course very unpalatable,
not to mention way beyond our charter.  Besides, sometimes it really
_is_ useful to write code that depends on the declaration order.
The fact that C++ allows this means that it also has the potential
for some very subtle problems.  It is of course a trade-off, but the
fact that C++ has succeeded in a way that Ada has not seems to indicate
that the market prefers the C++ philosophy of allowing the programmer
to do dangerous things.

>6) I would like to know the rationale for these restrictions.

See above! ;-)
I hope you find my explanation useful.

>//----------------------
>
>class subA{
>public:
> virtual
> void some_function() = 0;
>};
>
>class subB : public subA{
>public:
> void some_function()
>  {}
>};
>
>// alternative 1
>
>class a1{
>protected:
> a1( subA * _psub )
>  : psub( _psub )
>  {}
> subA * psub;
>};
>
>class b1 : public a1{
> subB sub;
>public:
> b1()
>  :a1( &sub ) // undefined; sub not constructed yet

Yes, exactly right.

>  {}
>};
>
>// end alternative 1
>
>
>// alternative 2
>
>class a2{
>protected:
> a2( subA & _rsub )
>  : rsub( _rsub )
>  {}
> subA & rsub;
>};
>
>class b2 : public a2{
> subB sub;
>public:
> b2()
>  : a2( sub ) // is this defined? sub not constructed yet

Yes, this is undefined.  The intent was certainly that the restriction
should apply to references as well as pointers; the fact that 12.7/2
talks only about pointers and not about references is I am sure an
accidental omission (which should with any luck soon be corrected).

>  {}
>};
>
>// end alternative 2
>
>
>// alternative 3
>
>class b3 : public a1, private subB{  // a1 from above
>public:
> b3()
>  : a1( this ) // undefined; subB not constructed yet

Yes, the conversion from `b3 *' to `subA *'
is undefined.  The constructors will be called in the
order specified, so `a1' will get constructed before `subB',
and hence the behaviour is undefined since `subB' has not been
constructed when you attempt to cast from `b3 *' to `subA *'
via `subB *'.

>  {}
>};
>
>// end alternative 3
>
>
>// alternative 4
>
>class b4 : private subB, public a1{  // a1 from above
>public:
> b4()
>  : a1( this ) // defined; subB fully constructed

Yes, switching the order of the base classes makes it defined,
since it is now guaranteed that `subB' will be fully constructed
before the constructor for `a1' is invoked.

>  {}
>};
>
>// end alternative 4
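
As for question 3, one possibility is a variation on alternative 4 that keeps `sub' as a named member: wrap it in a small holder class and list that holder first in the base list. This is only a sketch under some assumptions. The names subB_holder, b5, call and linked are invented here, and some_function has been changed to return int purely so that the effect can be checked.

```cpp
#include <cassert>

class subA {
public:
    virtual ~subA() {}
    virtual int some_function() = 0;   // returns int only for this demo
};

class subB : public subA {
public:
    int some_function() override { return 7; }
};

class a1 {
protected:
    explicit a1(subA* p) : psub(p) {}
    subA* psub;
};

// Hypothetical helper: holds the subB so that it can become a base class.
struct subB_holder {
    subB sub;
};

// subB_holder is declared first, so it (and the subB inside it) is
// completely constructed before the constructor of a1 is invoked.
class b5 : private subB_holder, public a1 {
public:
    b5() : subB_holder(), a1(&sub) {}  // subB* -> subA* on a constructed object

    int call() { return psub->some_function(); }
    bool linked() const { return psub == &sub; }
};
```

Like alternative 4, this still depends on the declaration order of the base classes, so swapping subB_holder and a1 would bring the undefined behaviour back.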

--
Fergus Henderson                       | I'll forgive even GNU emacs as
fjh@cs.mu.oz.au                        | long as gcc is available ;-)
http://www.cs.mu.oz.au/~fjh            |             - Linus Torvalds





Author: "Per Ångström" <eri.edt.edtpang@memo.ericsson.se>
Date: 1995/05/15
I have some questions regarding the Draft Standard, clause 12.7,
Construction and destruction [class.cdtor]. It says:

"2 To  explicitly or implicitly convert a pointer to an object of class X
  to a pointer to a direct or indirect base class B, the construction of
  X  and  the  construction  of all of its direct or indirect bases that
  directly or indirectly derive  from  B  shall  have  started  and  the
  destruction  of  these classes shall not have completed, otherwise the
  computation results in undefined behavior.  To form  a  pointer  to  a
  direct  nonstatic member of an object X given a pointer to X, the con-
  struction of X shall have started and the destruction of X  shall  not
  have  completed, otherwise the computation results in undefined behav-
  ior."

Below, in "alternative 1", is a synthesized version of the design of
my current project, which consists of two parallel hierarchies. The main
point is that at each inheritance level in the {a,b} hierarchy, there is
a pointer or reference to the corresponding level in the {subA, subB}
hierarchy.

The problem is that on reading the above paragraph, I realize that the
way it is currently implemented it will not be "defined" according to the
Standard. Therefore, I have outlined a number of alternative solutions,
some probably not legal, some maybe legal.

I would like to have your comments on this issue, more specifically:
1) Am I right in my initial worry, about the undefinedness of my current
solution?
2) Am I right about the other alternatives?
3) Does anybody have a better solution?
4) Is the fact that references are not mentioned in this context an indication
that casting of references instead of pointers has defined behaviour?
5) If I am right about alternative 4 being defined, and not alternative 3,
just because of the order of their respective base classes, is this not a
potential source of very subtle problems?
6) I would like to know the rationale for these restrictions.

//----------------------

class subA{
public:
 virtual
 void some_function() = 0;
};

class subB : public subA{
public:
 void some_function()
  {}
};

// alternative 1

class a1{
protected:
 a1( subA * _psub )
  : psub( _psub )
  {}
 subA * psub;
};

class b1 : public a1{
 subB sub;
public:
 b1()
  :a1( &sub ) // undefined; sub not constructed yet
  {}
};

// end alternative 1


// alternative 2

class a2{
protected:
 a2( subA & _rsub )
  : rsub( _rsub )
  {}
 subA & rsub;
};

class b2 : public a2{
 subB sub;
public:
 b2()
  : a2( sub ) // is this defined? sub not constructed yet
  {}
};

// end alternative 2


// alternative 3

class b3 : public a1, private subB{  // a1 from above
public:
 b3()
  : a1( this ) // undefined; subB not constructed yet
  {}
};

// end alternative 3


// alternative 4

class b4 : private subB, public a1{  // a1 from above
public:
 b4()
  : a1( this ) // defined; subB fully constructed
  {}
};

// end alternative 4

//-----------------

Per Angstrom (eri.edt.edtpang@memo.ericsson.se)






Author: guus@proxim.franken.de (Guus C. Bloemsma)
Date: 1995/05/18
In article <3p7q6t$64a@erinews.ericsson.se> "Per Ångström"
<eri.edt.edtpang@memo.ericsson.se> writes:
> I have some questions regarding the Draft Standard, clause 12.7,
> Construction and destruction [class.cdtor]. It says:
>
> "2 To  explicitly or implicitly convert a pointer to an object of class X
>   to a pointer to a direct or indirect base class B, the construction of
>   X  and  the  construction  of all of its direct or indirect bases that
>   directly or indirectly derive  from  B  shall  have  started  and  the
>   destruction  of  these classes shall not have completed, otherwise the
>   computation results in undefined behavior.  To form  a  pointer  to  a
>   direct  nonstatic member of an object X given a pointer to X, the con-
>   struction of X shall have started and the destruction of X  shall  not
>   have  completed, otherwise the computation results in undefined behav-
>   ior."
[examples deleted]

I guess this limitation is needed for virtual base classes. The first
thing the constructor of X does is install the necessary pointers to
virtual bases (or the v-table that contains the necessary offsets). After
that (still before completion of X's constructor) the conversions can use
them.

What you are encountering is a general problem where several
interdependent objects need to be initialized in a very specific order.

Maybe a multi-phase constructor could help here:

1:  vtables and pointers to virtual bases are set.
    (here conversions could take place)
2:  all objects are initialized using the arguments.
    (at this point methods may be called)
3:  from here on other objects can be called for extra initialization.

There would have to be some syntax to mark the boundary between phase 2
and 3.

The first phase would be invisible to the programmer because it's
completely generated by the compiler.

The important thing is that the constructor goes through phase 1 for all
objects before starting phase 2 and so on. That way, all methods can
assume that at least phase 1 and 2 have been completed.

The same method can be (and is) used for reading connected objects from a
stream.
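
The language has no such phases today, but the split between phase 2 and phase 3 can be approximated by hand. The sketch below is one way to do it, under the assumption that the base class does not need the pointer during its own construction. The names a6, b6, attach, attached and call are invented, and some_function returns int only so that the result can be checked.

```cpp
#include <cassert>

class subA {
public:
    virtual ~subA() {}
    virtual int some_function() = 0;   // returns int only for this demo
};

class subB : public subA {
public:
    int some_function() override { return 7; }
};

class a6 {
protected:
    a6() : psub(nullptr) {}               // "phase 2": no cross-object links yet
    void attach(subA* p) { psub = p; }    // "phase 3": link to the other object
    subA* psub;
public:
    bool attached() const { return psub != nullptr; }
    int call() { return psub->some_function(); }
};

class b6 : public a6 {
    subB sub;
public:
    b6() : a6(), sub() {
        // When the constructor body runs, all bases and members of b6
        // have been constructed, so converting &sub to subA* is defined.
        attach(&sub);
    }
};
```

The price is that a6 has to tolerate a null psub until attach has been called, which is the manual equivalent of the boundary between phase 2 and phase 3.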

Good luck, Guus.