Thread

Topic: Unions containing objects w/ constructors/destructors

Author: fjh@munta.cs.mu.OZ.AU (Fergus Henderson)
Date: Sun, 12 Feb 1995 17:13:33 GMT

tob@world.std.com (Tom O Breton) writes:

>maxtal@physics.su.OZ.AU (John Max Skaller) writes:
>> >The thinking basically seems to be to make unions typesafe. The
>> >reasoning given in the ARM is clearly based on that assumption.
>>
>>         That is not correct. The reason for the ARM rules
>> is so that an object of a union type can be copied, default initialised,
>> and destroyed.
>
>Sorry, I couldn't quite catch your distinction. Probably I expressed
>myself sloppily.

The distinction is this: the ARM rules ensure that _every_ union
type can be copied, default initialised, and destroyed, by ruling
certain unions to be illegal.  If you only want type safety, then
you can allow those unions, and just disallow copying, etc.,
for them.
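
A minimal sketch of the second option, assuming C++11 or later (where a union may hold members with constructors) and a hypothetical union named `Holder`. The union type is legal, but the compiler declines to generate copying for it:

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical union with a member that has constructors and a destructor.
union Holder {
    int         number;
    std::string text;

    Holder() : number(0) {}   // user-provided: picks the initial member
    ~Holder() {}              // user-provided: cannot know which member is live
};

// The union type itself is allowed and can be created and destroyed ...
static_assert(std::is_default_constructible<Holder>::value, "can be created");
static_assert(std::is_destructible<Holder>::value, "can be destroyed");
// ... but copying is simply not available for it.
static_assert(!std::is_copy_constructible<Holder>::value, "no generated copy ctor");
static_assert(!std::is_copy_assignable<Holder>::value, "no generated assignment");

// Reads back the member that the constructor initialised.
inline int initial_number() {
    Holder h;
    return h.number;
}
```

Here the error, if any, appears only where a copy is attempted, not where the union is declared.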

--
Fergus Henderson - fjh@munta.cs.mu.oz.au
all [L] (programming_language(L), L \= "Mercury") => better("Mercury", L) ;-)

Author: mlg@scr.siemens.com (Michael Greenberg)
Date: Wed, 8 Feb 1995 17:20:44 GMT

Has there been any discussion about allowing unions to contain objects
with constructors or destructors?

One approach would allow the union's constructor/destructor to call
the appropriate sub-object's constructor/destructor. For example:

union u1 {
  myclass1 c1;
  myclass2 c2;
  u1(myclass1& x) : c1(x) { }
  u1(myclass2& x) : c2(x) { }
  ~u1() {
    if (...)
      c1.~myclass1();
    else
      c2.~myclass2();
  }
};
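
The `if (...)` in the destructor needs something to test. A minimal sketch, assuming C++11 or later and a hypothetical tag stored beside the union; the `which_` member, the accessors, and the contents of `myclass1` and `myclass2` are illustrative and not from the post:

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the post's myclass1 and myclass2.
struct myclass1 { std::string name; };
struct myclass2 { std::string label; int count; };

class u1 {
public:
    // Each constructor records which member it brings to life.
    u1(const myclass1& x) : which_(first),  c1_(x) {}
    u1(const myclass2& x) : which_(second), c2_(x) {}

    // The destructor calls the destructor of whichever member is live.
    ~u1() {
        if (which_ == first)
            c1_.~myclass1();
        else
            c2_.~myclass2();
    }

    // Copying is left out of this sketch on purpose.
    u1(const u1&) = delete;
    u1& operator=(const u1&) = delete;

    bool holds_first() const { return which_ == first; }
    const myclass1& as_first() const  { return c1_; }
    const myclass2& as_second() const { return c2_; }

private:
    enum tag { first, second } which_;   // hypothetical discriminator
    union {                              // storage shared by both members
        myclass1 c1_;
        myclass2 c2_;
    };
};
```

The tag lives outside the union because every member of the union itself would overlap it.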

--
Michael Greenberg                      email: mgreenberg@scr.siemens.com
Siemens Corporate Research             phone: 609-734-3347
755 College Road East                  fax: 609-734-6565
Princeton, NJ 08540

Author: andys@thone.demon.co.uk (Andy Sawyer)
Date: Thu, 9 Feb 1995 00:55:24 +0000

In article <D3oyuL.MvD@scr.siemens.com>
           mlg@scr.siemens.com "Michael Greenberg" writes:

>
> Has there been any discussion about allowing unions to contain objects
> with constructors or destructors?
>
> One approach would allow the union's constructor/destructor to call
> the appropriate sub-object's constructor/destructor. For example:
>
> union u1 {
>   myclass1 c1;
>   myclass2 c2;
>   u1(myclass1& x) : c1(x) { }
>   u1(myclass2& x) : c2(x) { }
>   ~u1() {   if (...)
>                 c1.~myclass1() ;
>             else
>                 c2.~myclass2();
>         }
> } ;
>

 From the ARM, section 9.5:

 "... A union may have member functions (including constructors and
destructors), but not virtual functions..."

 "...An object of a class with a constructor or a destructor or a
user-defined assignment operator cannot be a member of a union..."

 and in the commentary:

 "....The rule [against unions of classes with constructors or destructors]
  is necessary, though, because member functions for a class that has
  constructors usually rely on being invoked on objects that have been
  correctly constructed...."

 Aside from anything else, which assignment operator do you use?

e.g. given your example above:
  myclass1 c1;
  myclass2 c2;
  u1 a( c1 );
  u1 b( c2 );

  a = b;   // myclass1::operator=()  or  myclass2::operator=() ?
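
One possible answer, sketched under the assumption of C++11 or later and a hypothetical `holds1_` flag kept beside the union: use the member's own operator when both sides hold the same type, and otherwise destroy the old member and copy-construct the new one. The member contents and accessor names are illustrative only:

```cpp
#include <cassert>
#include <new>
#include <string>

// Hypothetical stand-ins for myclass1 and myclass2.
struct myclass1 { std::string s; };
struct myclass2 { int n; std::string s; };

class u1 {
public:
    u1(const myclass1& x) : holds1_(true),  c1(x) {}
    u1(const myclass2& x) : holds1_(false), c2(x) {}
    u1(const u1& o) : holds1_(o.holds1_) { construct_from(o); }
    ~u1() { destroy(); }

    u1& operator=(const u1& o) {
        if (this == &o) return *this;
        if (holds1_ && o.holds1_) {
            c1 = o.c1;                 // same type: myclass1::operator=
        } else if (!holds1_ && !o.holds1_) {
            c2 = o.c2;                 // same type: myclass2::operator=
        } else {
            // Types differ: neither operator= applies.
            // Not exception safe: a throwing copy leaves no live member.
            destroy();
            holds1_ = o.holds1_;
            construct_from(o);
        }
        return *this;
    }

    bool holds1() const { return holds1_; }
    const myclass1& first() const  { return c1; }
    const myclass2& second() const { return c2; }

private:
    void destroy() {
        if (holds1_) c1.~myclass1(); else c2.~myclass2();
    }
    void construct_from(const u1& o) {
        if (o.holds1_) new (&c1) myclass1(o.c1);
        else           new (&c2) myclass2(o.c2);
    }

    bool holds1_;                      // hypothetical discriminator
    union { myclass1 c1; myclass2 c2; };
};
```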

 As an aside, I have strong feelings against the use of unions anyway. Let's
get ourselves a really nice, _strongly typed_, language, then have unions
that let us completely bypass them....there are, IMHO, very few cases where
they are _absolutely_ necessary.

 (These opinions have not been helped by spending the better part of two days
ploughing through code generated by somebody else's YACC script....and the
problem was, of course, a %union....:-(. Once the problem was found (in the
C++), we all sat around looking at the YACC saying "Of course - it's obvious")

Regards,
 Andy
--
* Andy Sawyer ** e-mail:andys@thone.demon.co.uk ** Compu$erve:100432,1713 **
 The opinions expressed above are my own, but you are granted the right to
 use and freely distribute them. I accept no responsibility for any injury,
 harm or damage arising from their use.                --   The Management.

Author: tob@world.std.com (Tom O Breton)
Date: Thu, 9 Feb 1995 05:23:12 GMT

mlg@scr.siemens.com (Michael Greenberg) writes:
> Has there been any discussion about allowing unions to contain objects
> with constructors or destructors?

Not recently.

> One approach would allow the union's constructor/destructor to call
> the appropriate sub-object's constructor/destructor.

I prefer a much simpler approach - dump typesafety for unions. Union
ctors can of course already initialize whatever member you say it
"really" is. Too bad dtors can't take parameters; in this case it could
be useful.

Here is another message I wrote on the subject:

From: tob@world.std.com
Subject: Wrong to try to make unions typesafe

I think the decision to make unions disallow members with ctors was
wrong and was contrary to the "trust the programmer" spirit of C.

The thinking basically seems to be to make unions typesafe. The
reasoning given in the ARM is clearly based on that assumption.

But that's not helping anything. If the class you want to keep typesafe
is aliased to (say) an array of char, your typesafety is _gone_, so why
pretend?

If I declare a union, I am saying to the compiler that I, not it, will
handle the structured-data-to-raw-binary-data relation. I will either
not put into unions classes that dare not get munged, or I will control
how they are accessed so it's safe.

This was prompted by a project where I needed to use unions in this way.
The recommended solution, pull the data members out of the root of the
hierarchy down into another class and include that, was just awful:
onerous, would have required rewriting a class I considered finished
several levels ago, would have required a hell of a lot of mechanics to
do a tiny thing, etc. Not to mention it won't work for non-trivial
hierarchies or abstract bases.

I ended up doing something like:

    struct ...
        {
        ...
        char    d[ sizeof( biggest_member ) ];
        this_type&
        get_this_type( void ) { return *( this_type *)d; };
        that_type&
        get_that_type( void ) { return *( that_type *)d; };
        };

...which is quite a bit more source than simply writing "union { }" but
still much easier than the recommended workaround.
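
A fuller sketch of the same workaround, with assumptions stated: C++11 or later for `alignas`, hypothetical contents for `this_type` and `that_type`, and a hypothetical wrapper named `holder`. A bare `char` array is not guaranteed to be suitably aligned, so the sketch requests alignment explicitly; strict C++17 would also route the casts through `std::launder`, which is left out here:

```cpp
#include <cassert>
#include <new>
#include <string>

// Hypothetical member types; one has a constructor and destructor.
struct this_type { std::string text; };
struct that_type { double value; };

struct holder {
    // Raw storage, large enough and aligned for either type.
    alignas(this_type) alignas(that_type)
    unsigned char d[sizeof(this_type) > sizeof(that_type)
                        ? sizeof(this_type) : sizeof(that_type)];

    // The caller promises that the requested type is the one currently stored.
    this_type& get_this_type() { return *reinterpret_cast<this_type*>(d); }
    that_type& get_that_type() { return *reinterpret_cast<that_type*>(d); }
};

// The programmer, not the compiler, starts and ends each lifetime.
inline int demo_length() {
    holder h;
    new (h.d) this_type{"hello"};                 // construct in place
    int n = static_cast<int>(h.get_this_type().text.size());
    h.get_this_type().~this_type();               // destroy before reuse
    new (h.d) that_type{2.5};                     // same bytes, other type
    return n + static_cast<int>(h.get_that_type().value);
}
```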

        Tom

--
tob@world.std.com
TomBreton@delphi.com: Author of The Burning Tower

Author: maxtal@physics.su.OZ.AU (John Max Skaller)
Date: Thu, 9 Feb 1995 17:38:18 GMT

In article <D3pwAo.MnH@world.std.com> tob@world.std.com writes:
>mlg@scr.siemens.com (Michael Greenberg) writes:
>> Has there been any discussion about allowing unions to contain objects
>> with constructors or destructors?
>
>Not recently.

 Well, there has on the committee reflector. I wrote
a paper proposing exactly that. As yet the paper has not been
considered.

>
>I think the decision to make unions disallow members with ctors was
>wrong and was contrary to the "trust the programmer" spirit of C.
>
>The thinking basically seems to be to make unions typesafe. The
>reasoning given in the ARM is clearly based on that assumption.

 That is not correct. The reason for the ARM rules
is so that an object of a union type can be copied, default initialised,
and destroyed.

 My proposal, basically, is that if the union has
a reference or constructible member, then the compiler generated
default constructor and destructor do nothing, and the compiler
_refuses_ to generate a copy constructor or assignment operator.

 Which, in my opinion, is 100% compatible
with the ARM union, but allows a union to have any kind of
member, just like it can in ISO C.
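
A rough sketch of what that looks like in use, assuming C++11 or later. C++11 eventually settled on a slightly different rule (the generated constructor and destructor are deleted rather than empty), so the do-nothing versions are written by hand here; the union name `value` and the function `demo` are hypothetical:

```cpp
#include <cassert>
#include <new>
#include <string>

union value {
    int         i;
    std::string s;

    value() {}    // does nothing: no member is alive yet
    ~value() {}   // does nothing: the user must already have cleaned up
};

// The user decides which member is alive at each point.
inline int demo() {
    value v;
    new (&v.s) std::string("union");         // start the string's lifetime
    int n = static_cast<int>(v.s.size());
    v.s.~basic_string();                     // end it by hand
    v.i = 7;                                 // now use the int member
    return n + v.i;
}
```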

 I _also_ proposed that a union IS a class (exactly),
so it can have virtual functions, be a base, or have bases.

 Thus getting rid of most of the restrictions on unions.

>This was prompted by a project where I needed to use unions in this way.
>The recommended solution, pull the data members out of the root of the
>hierarchy down into another class and include that, was just awful:
>onerous, would have required rewriting a class I considered finished
>several levels ago, would have required a hell of a lot of mechanics to
>do a tiny thing, etc. Not to mention it won't work for non-trivial
>hierarchies or abstract bases.

 Exactly. People just do not understand how fundamental
type unification (and subsequent discrimination) are to
computing.

 There is a consistent mistake of thinking switches are
bad, and that polymorphism should be used. In C++ that is
poor thinking, IMHO. (Object Oriented programming
_complements_ structured programming, it does not replace it)

 Unions and switches are _correct_ when you wish
to unify a known finite set of distinct types.

 Inheritance is correct for a _single_ type and
an indeterminate family of possible subtypes.

 The distinction is quite fundamental, just like
"or" and "and" in logic. (Or set union and intersection).

 As someone pointed out to me, the distinction between
"state" and "type" is an engineering compromise. EVERY
conditional can be viewed as switching on either state or type:
a type is just a set of states with coherent operations you
choose to name because it is useful to do so.

>
>I ended up doing something like:
>
>    struct ...
>        {
>        ...
>        char    d[ sizeof( biggest_member ) ];
>        this_type&
>        get_this_type( void ) { return *( this_type *)d; };
>        that_type&
>        get_that_type( void ) { return *( that_type *)d; };
>        };

 Yes. It is ghastly. Works better if you use
a union of pointers.

 But, unions should be discriminated, and C unions are not.
So even a revamped union which permits constructible members
is still far from what, in principle _should_ have been added
to complement the high level extensions to structs. IMHO.

 By the way, if you think unions are not used very often,
you are missing the point: a whole lot of code SHOULD use
unions but uses some other technique.

 IMHO the _worst_ case of this is something I call
implicit unification, and you can find it in most text books.
That's where you unify two structures by making a third
structure with the members of both. For example

 struct A {int a; int b; };
 struct B {int c; int d; };
 struct AorB { int a; int b; int c; int d; };

Then, you just use the members a and b, for type A, and
c and d, for type B, leaving the other two members "unused".

This is a ghastly practice: if you do not
initialise _all_ the members of a struct (other than those
of type "unsigned char") you can't copy it.

But the real problem is that it is very hard to know what the
rules are, but really easy to just add a few more members,
refer to an uninitialised member, use a member the wrong
way -- etc.

 The text book example is:

 struct op { // unary or binary operator
  char opname;
  struct op *left;
  struct op *right;
 };

That should have been:

 struct node {
  enum nomiality {unary_type, binary_type} tag;
  struct unary {
   char opname;
   node *argument;
  };
  struct binary {
   char opname;
   node *left;
   node *right;
  };
  union {
   unary u;
   binary b;
  };
 };

This makes decoding simple. And it makes extension to ternary
and nullary operators simple too. It isn't _safe_, but it
is _clear_.
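
A small sketch of what "decoding" means here: one switch on the tag, one case per alternative. The enum name `kind`, the added nullary `leaf` alternative, and the helper functions are assumptions made for the example; they are not part of the struct above:

```cpp
#include <cassert>

struct node {
    enum kind { leaf_type, unary_type, binary_type } tag;
    struct leaf   { int value; };                      // nullary case (assumed)
    struct unary  { char opname; node* argument; };
    struct binary { char opname; node* left; node* right; };
    union {
        leaf   l;
        unary  u;
        binary b;
    };
};

// Hypothetical helpers that set the tag and the matching member together.
inline node make_leaf(int v) {
    node n; n.tag = node::leaf_type; n.l.value = v; return n;
}
inline node make_unary(char op, node* arg) {
    node n; n.tag = node::unary_type; n.u.opname = op; n.u.argument = arg; return n;
}
inline node make_binary(char op, node* lhs, node* rhs) {
    node n; n.tag = node::binary_type;
    n.b.opname = op; n.b.left = lhs; n.b.right = rhs; return n;
}

// Decoding: the tag selects which union member may be read.
inline int eval(const node& n) {
    switch (n.tag) {
    case node::leaf_type:
        return n.l.value;
    case node::unary_type:
        return n.u.opname == '-' ? -eval(*n.u.argument) : eval(*n.u.argument);
    case node::binary_type: {
        int lhs = eval(*n.b.left);
        int rhs = eval(*n.b.right);
        switch (n.b.opname) {
        case '+': return lhs + rhs;
        case '-': return lhs - rhs;
        case '*': return lhs * rhs;
        }
        return 0;   // unknown operator in this sketch
    }
    }
    return 0;       // unreachable for a valid tag
}
```

Nothing stops code from reading `n.b` when the tag says `unary_type`, which is the "not safe" part.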

--
        JOHN (MAX) SKALLER,         INTERNET:maxtal@suphys.physics.su.oz.au
 Maxtal Pty Ltd,
        81A Glebe Point Rd, GLEBE   Mem: SA IT/9/22,SC22/WG21
        NSW 2037, AUSTRALIA     Phone: 61-2-566-2189

Author: tob@world.std.com (Tom O Breton)
Date: Fri, 10 Feb 1995 05:17:33 GMT

maxtal@physics.su.OZ.AU (John Max Skaller) writes:
> >The thinking basically seems to be to make unions typesafe. The
> >reasoning given in the ARM is clearly based on that assumption.
>
>         That is not correct. The reason for the ARM rules
> is so that an object of a union type can be copied, default initialised,
> and destroyed.

Sorry, I couldn't quite catch your distinction. Probably I expressed
myself sloppily.

>         My proposal, basically, is that if the union has
> a reference or constructible member, then the compiler generated
> default constructor and destructor do nothing, and the compiler
> _refuses_ to generate a copy constructor or assignment operator.

I'm not entirely sure about that. It seems to me that you want to forbid
a default copy ctor if and only if there is at least one member that has
_a copy ctor_, not that has any ctor at all. In cases where all the
members can safely copy bitwise, I see no reason for the restriction.

>         I _also_ proposed that a union IS a class (exactly),
> so it can have virtual functions, be a base, or have bases.
>
>         Thus getting rid of most of the restrictions on unions.

An intriguing idea, at least in theory. When you say it "can have
virtual functions", you have in mind that whatever space is required for
polymorphism (EG, pointer to a vtable) is _not_ aliased, I assume?

>         Yes. It is ghastly. Works better if you use
> a union of pointers.

For engineering reasons, I prefer to have the ability to hold the data
directly rather than by indirection.

>         But, unions should be discriminated, and C unions are not.

Well, I'm talking about a less ambitious near-term vision, where type
discrimination is on the outside and left up to the programmer. Thus it
can work without getting type-discrimination working first.

And there are occasional cases where you don't want to discriminate,
where external requirements dictate that memory be interpreted in two
ways. Such as handling byte-order or fiddling with memory that will be
copied into CPU registers that have overlapping names.

> So even a revamped union which permits constructible members
> is still far from what, in principle _should_ have been added
> to complement the high level extensions to structs. IMHO.

I think I have all your "variants" posts on file somewhere.

        Tom

--
tob@world.std.com
TomBreton@delphi.com: Author of The Burning Tower

Author: maxtal@physics.su.OZ.AU (John Max Skaller)
Date: Sat, 11 Feb 1995 11:38:28 GMT

In article <D3rqpA.Iup@world.std.com> tob@world.std.com writes:
>maxtal@physics.su.OZ.AU (John Max Skaller) writes:
>> >The thinking basically seems to be to make unions typesafe. The
>> >reasoning given in the ARM is clearly based on that assumption.
>>
>>         That is not correct. The reason for the ARM rules
>> is so that an object of a union type can be copied, default initialised,
>> and destroyed.
>
>Sorry, I couldn't quite catch your distinction. Probably I expressed
>myself sloppily.

 The fact that copying works is not the same as
making unions typesafe. Only discriminated unions
can be (statically) typesafe, and then only with draconian restrictions.
(Due to type changing assignments)

>>         My proposal, basically, is that if the union has
>> a reference or constructible member, then the compiler generated
>> default constructor and destructor do nothing, and the compiler
>> _refuses_ to generate a copy constructor or assignment operator.
>
>I'm not entirely sure about that. It seems to me that you want to forbid
>a default copy ctor if and only if there is at least one member that has
>_a copy ctor_, not that has any ctor at all.

 No, it is not that simple. Almost all classes have copy
constructors: some user defined, some generated. Sometimes,
the generated copy ctor can be a bitwise copy. It is still
a copy constructor though.

>In cases where all the
>members can safely copy bitwise, I see no reason for the restriction.

 Yes, but you can't just say that, without defining
when a type is bitwise copyable.   Have a go (it is not quite trivial).
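
For comparison, C++11 and later do carry such a definition, "trivially copyable", and expose it as a type trait. A short sketch of where the line falls; the four example structs and the `bitwise_copy` helper are hypothetical:

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <type_traits>

struct plain     { int a; double b; };               // only built-in members
struct with_text { std::string s; };                 // member has its own copy ctor
struct with_copy {                                   // user-defined copy ctor
    int a;
    with_copy() : a(0) {}
    with_copy(const with_copy& o) : a(o.a) {}
};
struct with_vtbl { virtual ~with_vtbl() {} };        // virtual function

static_assert( std::is_trivially_copyable<plain>::value,     "bitwise copy is fine");
static_assert(!std::is_trivially_copyable<with_text>::value, "member is not bitwise copyable");
static_assert(!std::is_trivially_copyable<with_copy>::value, "user wrote the copy");
static_assert(!std::is_trivially_copyable<with_vtbl>::value, "has a virtual function");

// Copies the bytes of an object, but only for types where that is meaningful.
template <class T>
T bitwise_copy(const T& from) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "T cannot be copied byte by byte");
    T to;
    std::memcpy(&to, &from, sizeof(T));
    return to;
}
```

Note that `with_copy` is excluded even though its copy constructor happens to do what a bitwise copy would, which illustrates why the definition is not quite trivial.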

 The above is a dirty summary, the full proposal
is more pedantic. In fact, the rule is:

 "A union is a class, a union of the form

 union X {
  non-static-members
  other-members
 };

is immediately rewritten as:

 struct {
  union { non-static-members };
  other-members
 }

(In the proposal, I forgot to mention
preserving access specifiers.  A detail, but an important one).
The rules for unions are now complete and exactly
the same as the rules for classes. This leaves only the
existing problem of defining the semantics of anonymous unions
in classes.
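
A sketch of the rewritten form, using a hypothetical `X` with two data members and a few member functions. The non-static data members go into the anonymous union and share storage; everything else belongs to the enclosing struct:

```cpp
#include <cassert>

struct X {
    union {             // non-static-members: these overlap
        int   i;
        float f;
    };

    // other-members: ordinary members of the enclosing struct
    static X from_int(int v) { X x; x.i = v; return x; }
    int get_int() const { return i; }
};

// Both data members start at the same address, as in any union.
inline bool members_overlap(X& x) {
    return static_cast<void*>(&x.i) == static_cast<void*>(&x.f);
}
```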

 In providing a sensible set of rules for that,
it is not hard to notice that a restriction against anonymous
unions having reference or constructible data members is
unnecessary. The restriction has a purpose, and the purpose
is retained, but the time of binding is deferred: it is
moved from the DECLARATION of the union to the USE of
a function which cannot sensibly be generated by the compiler.

 Which is exactly in keeping with the spirit of C++
in many other places (eg templates, calling copy constructors
which can't be generated if a _class_ has a base
with a private copy constructor, etc).
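
The private copy constructor case can be sketched as follows, assuming C++11 or later for the type traits; the class names `base` and `derived` are hypothetical. Declaring the derived class is accepted, and only an attempt to copy it is rejected:

```cpp
#include <cassert>
#include <type_traits>

class base {
public:
    base() {}
private:
    base(const base&);          // private and never defined: copying is off limits
};

// Declaring this class is not an error, even though it cannot be copied.
struct derived : base {
    int n;
    derived() : n(0) {}
};

static_assert(std::is_default_constructible<derived>::value,
              "declaration and default construction are fine");
static_assert(!std::is_copy_constructible<derived>::value,
              "the copy constructor cannot be generated");

// Uses the class without ever copying it, so nothing is diagnosed.
inline int use_without_copying() {
    derived d;
    d.n = 5;
    return d.n;
}
```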

>>         I _also_ proposed that a union IS a class (exactly),
>> so it can have virtual functions, be a base, or have bases.
>>
>>         Thus getting rid of most of the restrictions on unions.
>
>An intriguing idea, at least in theory. When you say it "can have
>virtual functions", you have in mind that whatever space is required for
>polymorphism (EG, pointer to a vtable) is _not_ aliased, I assume?

 No, it's aliased. See above. It is perfectly simple, a
union IS a class.  (Literally)

 It says so in the ARM and I think _that_ part of the ARM is correct.
And in ISO C, unions can have any kind of member a struct can,
and _that_ should remain true in C++ too, if possible. And it
_is_ possible.

>>         Yes. It is ghastly. Works better if you use
>> a union of pointers.
>
>For engineering reasons, I prefer to have the ability to hold the data
>directly rather than by indirection.

 Of course. Otherwise there would be no point in the proposal.
What was a simple struct, suddenly needs to become a full scale
class with complex memory management to make it work.

 I ran into this problem writing a parser. (You can't
write a parser without type unification and discrimination, because
a grammar and a parse tree are heterogeneous and recursive).

>>         But, unions should be discriminated, and C unions are not.
>
>Well, I'm talking about a less ambitious near-term vision, where type
>discrimination is on the outside and left up to the programmer. Thus it
>can work without getting type-discrimination working first.

 Yes. I agree.
>
>And there are occasional cases where you don't want to discriminate,
>where external requirements dictate that memory be interpreted in two
>ways. Such as handling byte-order or fiddling with memory that will be
>copied into CPU registers that have overlapping names.

 In ISO C this is implementation defined. In C++
it should be undefined. The reason is: it MUST be
undefined, as far as I can see, if garbage collectors
are to work properly.
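
For the byte-order case specifically, the bytes of an object can be inspected without reading a union member that was not the one last written, by copying them out with `memcpy`. A minimal sketch with hypothetical helper names:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// True when the least significant byte is stored first on this host.
inline bool host_is_little_endian() {
    const std::uint32_t probe = 1;
    unsigned char bytes[sizeof probe];
    std::memcpy(bytes, &probe, sizeof probe);   // copy out the object's bytes
    return bytes[0] == 1;
}

// Rebuilds a value from four bytes laid out in host order.
inline std::uint32_t from_host_bytes(const unsigned char (&bytes)[4]) {
    std::uint32_t v;
    std::memcpy(&v, bytes, sizeof v);
    return v;
}

// Reverses the byte order using shifts only, independent of the host layout.
inline std::uint32_t byte_swap(std::uint32_t v) {
    return (v >> 24)
         | ((v >> 8) & 0x0000FF00u)
         | ((v << 8) & 0x00FF0000u)
         | (v << 24);
}
```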

>> So even a revamped union which permits constructible members
>> is still far from what, in principle _should_ have been added
>> to complement the high level extensions to structs. IMHO.
>
>I think I have all your "variants" posts on file somewhere.

 After a LOT of analysis, Fergus and I came to the
conclusion that the 100% safe variants were possible but a bit
clumsy to implement assignment correctly.

 A slightly less safe discriminated union leaves
room for errors in type changing assignments inside
type selections -- otherwise it is statically safe.

 A union with constructible members is not safe,
but it can be used idiomatically and some errors will
be trapped by the compiler.

 Without constructible members, you can still
implement it by raw storage management. But it is
REALLY messy. And highly error prone.

--
        JOHN (MAX) SKALLER,         INTERNET:maxtal@suphys.physics.su.oz.au
 Maxtal Pty Ltd,
        81A Glebe Point Rd, GLEBE   Mem: SA IT/9/22,SC22/WG21
        NSW 2037, AUSTRALIA     Phone: 61-2-566-2189