Thread

Topic: Anonymous Union Catch 22

Author: "Daniel M. Pfeffer" <pfefferd@nospam.internet-zahav.net>
Date: 2000/03/09

"James Kuyper" <kuyper@wizard.net> wrote in message
news:38BF1D3B.DBCEFABA@wizard.net...
> "Daniel M. Pfeffer" wrote:
> ....
> > Unions are generally non-portable, because you have no idea how much packing
> > is inserted between elements in structures. For example, if sizeof(short) =
> > 2 and 'short' is always aligned on even addresses, sizeof(long) is 4 chars,
> > and 'long' is always aligned on addresses divisible by 4 and sizeof(double)
> > is 8 and 'double' is always aligned on addresses divisible by 8, the
> > following union will be completely different from a union in an environment
> > that has no special alignment requirements:
> >
> > struct A {
> >     short x;   // 6 bytes of padding if y is aligned mod 8, otherwise no padding.
> >     double y;
> > }
>
> Section 9.5 p1 says that for unions: "Each data member is allocated as
> if it were the sole member of a struct." Now, if that data member were
> the sole member of a struct, then Section 9.2 p17 applies: "A pointer to
> a POD-struct object, suitably converted using a reinterpret_cast, points
> to its initial member (or if that member is a bit-field, then to the
> unit in which it resides) and vice versa."  Therefore, all of the
> members of a union must have the same address as the union itself.
>
> This logic doesn't apply, to data members of a union that would prevent
> a struct containing them from being a POD-struct, but that doesn't apply
> to the member types in your example.
>
> However, even in that case, while the address of a member might be
> different from the address of the union, section 5.9 p2 still applies:
> "If two pointers point to data members of the same union object, they
> compare equal (after conversion to void *, if necessary)."
>
> All of the data members must start at the same location. It's not
> meaningful to talk about having padding "between" them, because there is
> no such thing as a place that is between them.
>
> I can't figure out what misunderstanding you've made, but with the
> implementation you describe, 'a.x' must occupy the first two bytes of
> 'a', 'a.y' must occupy the first eight bytes of 'a', 'a' must be aligned
> on a multiple of 8 bytes, and therefore must have a size which is a
> positive multiple of 8 bytes. The only padding allowed is at the end of
> the union, not at the beginning.
>
> > union {
> >     A a;
> >     short b[sizeof(A)/sizeof(short)];  // 5 elements if no padding. 8 elements if padding & b[1]..b[3] are garbage
> > } t;
>
> The number of elements in t.b must be a positive multiple of 4. It will
> be exactly 4 if there is no padding. I can't figure out where you got
> the number 5. 8 is valid, and represents the minimum non-zero amount of
> padding, but I think you're implying that it's also the maximum, which
> would be false.
>
> I just realized what mistake you've made: your comments make perfect
> sense if you're completely unaware of how a union differs from an
> ordinary structure. Have you ever actually used a union before?

<sarcasm> Yes, I _have_ used unions before, thank you. <\sarcasm>

I am perfectly aware that all members of a union start at the same address.
The elements of structure A are not elements of the union, but have a
(possibly) non-zero offset from the beginning of the structure, i.e. from
the beginning of the union.

My point was that when you include structures as elements of a union,
different alignment requirements can destroy the correspondence between the
elements of two different structures. I gave as an example a compiler for an
environment that _must_ allocate 'double's on an 8-byte boundary vs. a
compiler that may allocate them at any address.

Given the following definitions:

    struct A {
        short x;
        double y;
    };

    union {
        A a;
        short b[sizeof(A)/sizeof(short)];
    } t;

struct A is allocated as either

    XX pppppp YYYYYYYY  (p = padding)

or
    XX YYYYYYYY (no padding)

If we overlay an array of 'short's over it, this array will either have 5
elements (sizeof(A) = 10) or 8 elements (sizeof(A) = 16). Are you claiming
that mere inclusion of struct A in a union forces the compiler to pack all
elements of struct A on byte boundaries? How does this fit in with separate
compilation, where an instance of struct A may be passed to a function in a
different compilation unit?

Furthermore, assuming (as I did) that sizeof(short) == 2 and sizeof(double)
== 8, where is there any need for padding at the end of the union? Alignment
requirements may force padding between the union instance and the next datum
(whatever that may be), but where is there a need for padding in the
_union_, over and above the padding inside struct A?

We agree that &t.a == &t.b. Are you claiming that &t.a.x == &t.a.y merely
because t.a is part of a union?!
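
The layout question can be checked on any given implementation. The following is a minimal sketch, assuming a C++11 compiler; the union is given the hypothetical name T and the helper function names are illustrative only. It reports the padding before A::y and the number of shorts overlaying one A without assuming either of the two layouts drawn above.

```cpp
#include <cassert>
#include <cstddef>

struct A {
    short x;
    double y;
};

// Named here (T is a hypothetical name) so that helper functions can refer to it.
union T {
    A a;
    short b[sizeof(A) / sizeof(short)];
};

// Bytes of padding the implementation inserts between A::x and A::y.
std::size_t padding_before_y() {
    return offsetof(A, y) - sizeof(short);
}

// Number of shorts that overlay one A inside the union.
std::size_t shorts_per_A() {
    return sizeof(A) / sizeof(short);
}

// True if both union members start at the address of the union itself.
bool members_share_address(const T& t) {
    const void* u = &t;
    return u == static_cast<const void*>(&t.a) &&
           u == static_cast<const void*>(&t.b);
}
```

Calling padding_before_y() and shorts_per_A() on two different platforms shows how far the layouts diverge. Only t.a and t.b are guaranteed to share an address; t.a.x and t.a.y never do.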

Daniel Pfeffer

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Michiel Salters <salters@lucent.com>
Date: 2000/03/09

"Daniel M. Pfeffer" wrote:

[ In response to James Kuyper ]

> My point was that when you include structures as elements of a union,
> different alignment requirements can destroy the correspondence between the
> elements of two different structures. I gave as an example a compiler for an
> environment that _must_ allocate 'double's on an 8-byte boundary vs. a
> compiler that may allocate them at any address.

I don't understand the second part of the first sentence. Which
correspondence exists between the elements of two different structures?
And since inclusion in a union doesn't alter a type, what changes?

> Given the following definitions:

>     struct A {
>         short x;
>         double y;
>     };

>     union {
>         A a;
>         short b[sizeof(A)/sizeof(short)];
>     } t;

> struct A is allocated as either

>     XX pppppp YYYYYYYY  (p = padding)

> or
>     XX YYYYYYYY (no padding)

> If we overlay an array of 'short's over it, this array will either have 5
> elements (sizeof(A) = 10) or 8 elements (sizeof(A) = 16). Are [you] claiming
> that mere inclusion of struct A in a union forces the compiler to pack all
> elements of struct A on byte boundaries? How does this fit in with separate
> compilation, where an instance of struct A may be passed to a function in a
> different compilation unit?

The alignment requirements on t.a are at least as strict as those on
any other A, and might be stricter (because &t.a must equal the address of
every other union member), but otherwise t.a has the same layout as any
other A. The padding inside A must therefore be independent of whether the
A is a member of a union.

> furthermore, assuming (as I did) that sizeof(short) == 2 and sizeof(double)
> == 8, where is there any need for padding at the end of the union? Alignment
> requirements may force padding between the union instance and the next datum
> (whatever that may be), but where is there a need for padding in the
> _union_, over and above the padding inside struct A?

Such a need may arise on compilers that must align unions on particular
boundaries. An array of N unions must occupy exactly N*sizeof(union) bytes,
so the union may have to be padded until its size is a multiple of its
alignment.
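
A minimal sketch of the trailing-padding argument, assuming a C++11 compiler (for alignof). The union U and the helper names are hypothetical. The 9-byte char array is chosen so that the largest member is unlikely to be a multiple of the alignment of double.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical union: a 9-byte member next to a double.
union U {
    char c[9];
    double d;
};

// Size of the largest member of U.
std::size_t largest_member() {
    return sizeof(double) > sizeof(char[9]) ? sizeof(double) : sizeof(char[9]);
}

// Bytes added at the end of U beyond its largest member.
std::size_t trailing_padding() {
    return sizeof(U) - largest_member();
}

// Byte distance between two consecutive elements of an array of U.
std::size_t stride() {
    U arr[2];
    return static_cast<std::size_t>(reinterpret_cast<char*>(&arr[1]) -
                                    reinterpret_cast<char*>(&arr[0]));
}
```

Because array elements are contiguous, stride() always equals sizeof(U), and sizeof(U) is always a multiple of alignof(U). Any difference from the largest member is padding at the end of the union.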

Michiel Salters

---

Author: "Daniel M. Pfeffer" <pfefferd@nospam.internet-zahav.net>
Date: 2000/03/01

"Karl Nelson" <kenelson@sequoia.ece.ucdavis.edu> wrote in message
news:89cgp8$1p6$1@mark.ucdavis.edu...
> There appears to be a few bugs in the specification of the
> behavior of anonymous unions or at least the implementation of
> it in a wide number of compilers.  A member of an anonymous union
> may not have a constructor, copy constructor, or operator =.
> This is a reasonable restriction.
>
> However, it appears that it is illegal to explicitly declare
> that these functions are private and can not be used to prevent
> assignment.  Thus if a member is within an anonymous union and
> it assigned the compiler will provide the default
> assignment and thus alter that member.  Thus there does not
> appear to be anyway to prevent assignment.
>
> For example, here is a proxy used in a class which should
> allow partitioning of a classes function's into a number of names.
>    object.function_group.function();
>
> Example:
> -----------------------------------------------------
>
> class Widget;
>
> class Proxy {
> private:
>   Widget* const t;
>   void operator= (const Proxy&);  // we do not want copy to work
> };
>
> class Widget {
>   int i;
> public:
>   union {
>     Widget* const this_;
>     Proxy proxy;                // this should be allowed
>   };
>   Widget(): this_(this) {}
> };
>
> int main() {
>   Proxy r;
>   Widget w;
>   w.proxy=r;                   // this should return an error
>   return 0;
> }
>
> -----------------------------------------------------
>
> This code fails on all compilers tested at the declaration
> of the anonymous union.  (Even without the operator=)
>
> Declaring the Proxy const is not an option because then the
> proxy will appear const when applied to a non-const Widget.
> The const of the parent class is important for the implementation
> of the proxy.
>
> Is there anything in the standard which would screen out the
> above use, or have all the compilers implemented something
> which was more restricted than intended?  (Having read through
> the standard, I couldn't find anything which implies that a
> const member can not be a member of a struct in a union as
> it is assumed that the union member can be assigned through any
> of the union members.)
>
> Thanks,
>
> --Karl Nelson

IMHO, the compiler cannot tell whether you have declared an assignment
operator merely in order to make assignments illegal. For example, you could
conceivably provide a method for assignment as follows:

class foo {
public:
    foo &assign(const foo &op)    { *this = op; return *this; }

private:
    foo &operator =(const foo &);
};

If foo::assign() were defined in a separate compilation unit, there would be
no way for the compiler to check that the assignment operator is never
"actually" called. The only safe way to handle this case is to disallow
operator =() in anonymous unions.
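
A minimal sketch of this point, assuming a C++11 compiler for <type_traits>. The class is renamed Foo and given a hypothetical int payload so that the effect of the assignment is observable. From outside the class the type is not copy-assignable, yet the private operator is still called through assign().

```cpp
#include <cassert>
#include <type_traits>

class Foo {
public:
    explicit Foo(int v) : value_(v) {}

    // Public route to the private assignment operator.
    Foo& assign(const Foo& other) { *this = other; return *this; }

    int value() const { return value_; }

private:
    // Private, but defined and really executed via assign().
    Foo& operator=(const Foo& other) { value_ = other.value_; return *this; }

    int value_;   // hypothetical payload, for illustration only
};

// Outside the class, assignment is inaccessible.
static_assert(!std::is_copy_assignable<Foo>::value,
              "Foo must not be assignable from outside the class");
```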


Daniel Pfeffer



---

Author: Karl Nelson <kenelson@sequoia.ece.ucdavis.edu>
Date: 2000/03/01

Daniel M. Pfeffer <pfefferd@nospam.internet-zahav.net> wrote:
[snip]
:>
:> class Widget;
:>
:> class Proxy {
:> private:
:>   Widget* const t;
:>   void operator= (const Proxy&);  // we do not want copy to work
:> };
[snip]

: IMHO, the compiler cannot tell whether you have declared an assignment
: operator merely in order to make assignments illegal. For example, you could
: conceivably provide a method for assignment as follows:


: class foo {
: public:
:     foo &assign(const foo &op)    { *this = op; return *this; }

: private:
:     foo &operator =(const foo &);
: };

: if foo::assign() were defined in a separate compilation unit, there would be
: no way for the compiler to check that the assignment operator is never
: "actually" called. The only safe way to handle this case is to disallow
: operator =() in anonymous unions.

I don't see why operator = is really a problem for the
anonymous union in this case.  For example, the proxy could be a
pointer-like type in which assignment changes the object pointed to
rather than the pointer itself.   After all, it should
be the responsibility of the user to decide how the union is copied.

But to be a bit more specific, why is this construct legal if this
other one is not?

Legal:
-------------------
struct A {
  int k;
};

struct B {
  int j;
union {
  const int i;
  A a;
};
};


Illegal:
-------------------
struct A {
  const int k;
};

struct B {
  int j;
union {
  const int i;
  A a;
};
};

It would seem to me that declaring the contents const should
be legal in an anonymous union if declaring a const member is legal by itself.
This would allow the user to maintain the copy restrictions.  However,
compilers treat a const member as if it were a user-declared copy operation.

Perhaps more interesting, mutable seems to be fine, but it
doesn't affect the copy constructor, so the value still gets changed.

Also, it does not seem to bother the compilers if you declare any
operator = other than the copy assignment.  So why is it necessary
to be so specific about the copy operations in this case?  After
all, the union is going to be copied in a POD style anyway, so if
there is a copy constructor or assignment operator it by definition
won't get run.  Therefore, wouldn't it be better to allow types with
regular copy definitions in the anonymous union?

--Karl

---

Author: "Daniel M. Pfeffer" <pfefferd@nospam.internet-zahav.net>
Date: 2000/03/02

"Karl Nelson" <kenelson@sequoia.ece.ucdavis.edu> wrote in message
news:89hpoj$413$1@mark.ucdavis.edu...
> Daniel M. Pfeffer <pfefferd@nospam.internet-zahav.net> wrote:
> [snip]
> :>
> :> class Widget;
> :>
> :> class Proxy {
> :> private:
> :>   Widget* const t;
> :>   void operator= (const Proxy&);  // we do not want copy to work
> :> };
> [snip]
>
> : IMHO, the compiler cannot tell whether you have declared an assignment
> : operator merely in order to make assignments illegal. For example, you could
> : conceivably provide a method for assignment as follows:
>
>
> : class foo {
> : public:
> :     foo &assign(const foo &op)    { *this = op; return *this; }
>
> : private:
> :     foo &operator =(const foo &);
> : };
>
> : if foo::assign() were defined in a separate compilation unit, there would be
> : no way for the compiler to check that the assignment operator is never
> : "actually" called. The only safe way to handle this case is to disallow
> : operator =() in anonymous unions.
>
> I don't see why the operator = is even really a problem for the
> anonymous union in this case.  For example, if the proxy is a pointer
> type in which assignment is equivalent to change the contents
> of the pointer and not the pointers itself.   After all it should
> be the responsibility of the user to  decide how the union is copied.
>
> But to be a bit more specific, why is this construct legal if this
> other one it not?
>
> Legal:
> -------------------
> struct A {
>   int k;
> };
>
> struct B {
>   int j;
> union {
>   const int i;
>   A a;
> };
> };
>
>
> Illegal:
> -------------------
> struct A {
>   const int k;
> };
>
> struct B {
>   int j;
> union {
>   const int i;
>   A a;
> };
> };
>
> It would seem to me that declaring the contents to be const should
> be legal in a anonymous union if declaring a const is legal by itself.
> This would allow the user to maintain the copy restrictions.  However,
> compilers interpret a const member as a copy constructor.
>
> Perhaps more interesting mutable seems to be fine, but it is
> doesn't effect the copy constructor so the value still gets changed.
>
> Also it does not seem to bother the compilers if you place any other
> operator = besides the copy assignment.  So why is it necessary
> to be so specific about the copy constructor in this case?  After
> all the union is going to be copied in a POD style anyway so if
> there is a copy constructor or assignment operator it by definition
> won't get run.  Therefore, won't it just be better to allow regular
> types copy definitions in the anonymous union?
>
> --Karl


I see your problem now. Section 9.5 [class.union] of the Standard says that
an object of a class with a non-trivial constructor, non-trivial destructor,
non-trivial copy constructor or non-trivial copy assignment operator cannot
be a member of a union. Section 12.8 [class.copy] defines implicit and trivial
copy constructors and assignment operators, the trivial versions being a
subset of the implicit versions.

struct A of the illegal case has a trivial constructor (A::k contains
garbage), a trivial destructor and a trivial copy constructor, but no
implicit (and therefore no trivial) assignment operator can be generated,
because there is no way to assign to a 'const int'.
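
The difference between the two cases can be observed with C++11 type traits. This is a minimal sketch, assuming a C++11 compiler; the names PlainA and ConstA are hypothetical stand-ins for struct A of the legal and illegal cases. The traits report whether the operations are usable, not the C++98 notion of "trivial" used in 9.5.

```cpp
#include <cassert>
#include <type_traits>

struct PlainA { int k; };        // struct A of the "legal" case
struct ConstA { const int k; };  // struct A of the "illegal" case

// A plain int member can be copy-assigned.
static_assert(std::is_copy_assignable<PlainA>::value,
              "PlainA has a usable copy assignment operator");

// A const member cannot be assigned to, so no usable copy assignment exists.
static_assert(!std::is_copy_assignable<ConstA>::value,
              "ConstA has no usable copy assignment operator");

// Copy construction is unaffected by the const member.
static_assert(std::is_copy_constructible<ConstA>::value,
              "ConstA can still be copy-constructed");
```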


Daniel Pfeffer



---

Author: Karl Nelson <kenelson@redbelly.ece.ucdavis.edu>
Date: 2000/03/02

:> Also it does not seem to bother the compilers if you place any other
:> operator = besides the copy assignment.  So why is it necessary
:> to be so specific about the copy constructor in this case?  After
:> all the union is going to be copied in a POD style anyway so if
:> there is a copy constructor or assignment operator it by definition
:> won't get run.  Therefore, won't it just be better to allow regular
:> types copy definitions in the anonymous union?
:>
:> --Karl

: I see your problem now. Section 9.5 [class.union] of the Standard says that
: an object with a non-trivial constructor, non-trivial destructor,
: non-trivial copy constructor and non-trivial assignment operator cannot be a
: member of a union. Section 12.8 [class.copy] defines implicit and trivial
: copy constructor and assignment operators, trivial versions of these being
: subsets of the implicit versions.

: struct A of the illegal case has a trivial constructor (A::k contains
: garbage), a trivial destructor and a trivial copy constructor, but no
: implicit (and therefore no trivial) assignment operator can be generated,
: because there is no way to assign to a 'const int'.

So is there a remedy?  It would seem that having a member in
each class that matches one in the union would be a potentially useful
construct that is being screened out by an overly specific standard.  The
section on [class.union] was meant to prevent classes that hold resources
needing careful treatment from being placed in a union.

Yes, this is a bit of an abuse of the concept of a union.  A union is
intended when you can have many types but only one of them can occupy
that spot at a time.  Here I am defining that all of the types share
the same location to save them from having to store the same pointer
(the pointer to the parent class) hundreds of times.

Is there an alternate structure which achieves this effect?  My
alternative is decidedly unportable as it uses the reinterpret_cast
quite generously.

------------------------------------------------------------------
struct C { void* i; };
struct A {
  void* get_value() const { return reinterpret_cast<const C*>(this)->i; }
  void method();
  void method() const;
};

struct B {
  int j;
  union {
    const void* i;
    A a;
  };
  B() : i(this) {}
};
------------------------------------------------------------------

Since I can't prevent the use of the assignment operator the most
I can do is reduce it to a totally meaningless operation of copying
a class with no data.

Still, it seems that a union could be defined in such a way as
to allow a non-trivial assignment operator, since the user can decide
to call it explicitly for all members of the union or for the
specific type the union currently holds.  Complete restriction seems
draconian.  Much the same can be said about the destructor: it could be the
responsibility of the user to call the dtor explicitly for the type the
union currently holds.  After all, one can achieve the same effect by defining
a method named dtor in a class and calling it when the parent class is
destroyed.

--Karl

---

Author: Karl Nelson <kenelson@sequoia.ece.ucdavis.edu>
Date: 2000/02/29

There appear to be a few bugs in the specification of the
behavior of anonymous unions, or at least in the implementation of
it in a wide number of compilers.  A member of an anonymous union
may not have a constructor, copy constructor, or operator =.
This is a reasonable restriction.

However, it appears that it is also illegal to explicitly declare
these functions private so that they cannot be used, which would prevent
assignment.  Thus if a member is within an anonymous union and
it is assigned, the compiler will provide the default
assignment and thus alter that member.  Thus there does not
appear to be any way to prevent assignment.

For example, here is a proxy used in a class which should
allow partitioning of a class's functions into a number of names.
   object.function_group.function();

Example:
-----------------------------------------------------

class Widget;

class Proxy {
private:
  Widget* const t;
  void operator= (const Proxy&);  // we do not want copy to work
};

class Widget {
  int i;
public:
  union {
    Widget* const this_;
    Proxy proxy;                // this should be allowed
  };
  Widget(): this_(this) {}
};

int main() {
  Proxy r;
  Widget w;
  w.proxy=r;                   // this should return an error
  return 0;
}

-----------------------------------------------------

This code fails on all compilers tested at the declaration
of the anonymous union.  (Even without the operator=)

Declaring the Proxy const is not an option because then the
proxy will appear const when applied to a non-const Widget.
The const of the parent class is important for the implementation
of the proxy.

Is there anything in the standard which would screen out the
above use, or have all the compilers implemented something
which was more restricted than intended?  (Having read through
the standard, I couldn't find anything which implies that a
const member can not be a member of a struct in a union as
it is assumed that the union member can be assigned through any
of the union members.)

Thanks,

--Karl Nelson

---

Author: "Daniel M. Pfeffer" <pfefferd@nospam.internet-zahav.net>
Date: 2000/03/03

"Karl Nelson" <kenelson@redbelly.ece.ucdavis.edu> wrote in message
news:89jq8r$rmj$1@mark.ucdavis.edu...
> :> Also it does not seem to bother the compilers if you place any other
> :> operator = besides the copy assignment.  So why is it necessary
> :> to be so specific about the copy constructor in this case?  After
> :> all the union is going to be copied in a POD style anyway so if
> :> there is a copy constructor or assignment operator it by definition
> :> won't get run.  Therefore, won't it just be better to allow regular
> :> types copy definitions in the anonymous union?
> :>
> :> --Karl
>
> : I see your problem now. Section 9.5 [class.union] of the Standard says that
> : an object with a non-trivial constructor, non-trivial destructor,
> : non-trivial copy constructor and non-trivial assignment operator cannot be a
> : member of a union. Section 12.8 [class.copy] defines implicit and trivial
> : copy constructor and assignment operators, trivial versions of these being
> : subsets of the implicit versions.
>
> : struct A of the illegal case has a trivial constructor (A::k contains
> : garbage), a trivial destructor and a trivial copy constructor, but no
> : implicit (and therefore no trivial) assignment operator can be generated,
> : because there is no way to assign to a 'const int'.
>
> So is there a remedy?  It would seem that having a matching member in
> each class match that in the union would be a potentially useful construct
> being screened by an overly specific standard.  The section on
> [class.union] was meant to prevent classes which have resources
> which need to be treated with care from being placed in a union.
>
> Yes, this is a bit of an abuse of the concept of a union.  A union is
> intended when you can have many types but only one of them can occupy
> that spot at a time.  Here I am defining that all of the types share
> the same location to save them from having to store the same pointer
> (the pointer to the parent class) hundreds of times.
>
> Is there an alternate structure which achieves this effect?  My
> alternative is decidedly unportable as it uses the reinterpret_cast
> quite generously.

Unions are generally non-portable, because you have no idea how much packing
is inserted between elements in structures. For example, if sizeof(short) =
2 and 'short' is always aligned on even addresses, sizeof(long) is 4 chars,
and 'long' is always aligned on addresses divisible by 4 and sizeof(double)
is 8 and 'double' is always aligned on addresses divisible by 8, the
following union will be completely different from a union in an environment
that has no special alignment requirements:

struct A {
    short x;   // 6 bytes of padding if y is aligned mod 8, otherwise no padding.
    double y;
};

union {
    A a;
    short b[sizeof(A)/sizeof(short)];  // 5 elements if no padding. 8 elements if padding & b[1]..b[3] are garbage
} t;


I suggest that you provide a class with appropriate conversion methods. This
would not solve the problem of using reinterpret_cast<>, but it would
localise it to the class. Hopefully, that is more maintainable than having
reinterpret_cast<> everywhere.

class A {
public:
    // constructors, etc.

    int asInt() const;
    int &asInt();

    void *asVoidP() const;
    void *&asVoidP();

    ...

private:
    // allocate enough room for the "union"
};
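
One way such a class might look is sketched below. This is only an illustration under stated assumptions: the class name Storage and the setter names are hypothetical, and it swaps the reference-returning accessors of the outline above for get/set pairs based on std::memcpy, so that no reinterpret_cast<> is needed at all.

```cpp
#include <cassert>
#include <cstring>

class Storage {
public:
    Storage() { std::memset(buf_, 0, sizeof buf_); }

    // Read the buffer as an int.
    int asInt() const {
        int v;
        std::memcpy(&v, buf_, sizeof v);
        return v;
    }
    // Store an int in the buffer.
    void setInt(int v) { std::memcpy(buf_, &v, sizeof v); }

    // Read the buffer as a void*.
    void* asVoidP() const {
        void* p;
        std::memcpy(&p, buf_, sizeof p);
        return p;
    }
    // Store a void* in the buffer.
    void setVoidP(void* p) { std::memcpy(buf_, &p, sizeof p); }

private:
    // Enough room for the larger of the two alternatives (the "union").
    unsigned char buf_[sizeof(void*) > sizeof(int) ? sizeof(void*) : sizeof(int)];
};
```

As with a real union, reading back a different alternative from the one last stored yields an unspecified value; the class only localises the conversions.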


Daniel Pfeffer




---

Author: Karl Nelson <kenelson@sequoia.ece.ucdavis.edu>
Date: 2000/03/03

Daniel M. Pfeffer <pfefferd@nospam.internet-zahav.net> wrote:
:>
:> So is there a remedy?  It would seem that having a matching member in
:> each class match that in the union would be a potentially useful construct
:> being screened by an overly specific standard.  The section on
:> [class.union] was meant to prevent classes which have resources
:> which need to be treated with care from being placed in a union.
:>
:> Yes, this is a bit of an abuse of the concept of a union.  A union is
:> intended when you can have many types but only one of them can occupy
:> that spot at a time.  Here I am defining that all of the types share
:> the same location to save them from having to store the same pointer
:> (the pointer to the parent class) hundreds of times.
:>
:> Is there an alternate structure which achieves this effect?  My
:> alternative is decidedly unportable as it uses the reinterpret_cast
:> quite generously.

: Unions are generally non-portable, because you have no idea how much packing
: is inserted between elements in structures. For example, if sizeof(short) =
: 2 and 'short' is always aligned on even addresses, sizeof(long) is 4 chars,
: and 'long' is always aligned on addresses divisible by 4 and sizeof(double)
: is 8 and 'double' is always aligned on addresses divisible by 8, the
: following union will be completely different from a union in an environment
: that has no special alignment requirements:
[snip]

Packing is clearly not a problem in my system because all of the
types used in the union are structs containing the exact
same pointer.  They all contain 1 pointer and the pointer is
always the same type.  The type I wish to have in all
of them is "MyStruct * const".


: I suggest that you provide a class with appropriate conversion methods. This
: would not solve the problem of using reinterpret_cast<>, but it would
: localise it to the class. Hopefully, that is more maintainable than having
: reinterpret_cast<> everywhere.

: class A {
: public:
:     // constructors, etc.

:     int asInt() const;
:     int &asInt();

:     void *asVoidP() const;
:     void *&asVoidP();

:     ...

: private:
:     // allocate enough room for the "union"
: };

If the point were to provide a union with multiple types
sharing an address this would be fine.  But in my case
I am trying to have a single address shared by many types which
are proxies containing nothing but methods.

Imagine you have many list-like structures in a class.  Each
of these list structures is protected in such a way that
you can't expose them to the outside.  However, you wish
to provide an STL-equivalent interface for user access.

I can either write this as

class MyClass
  {
    A_iterator A_list_begin();
    A_iterator A_list_end();
    A_iterator A_list_insert(A_iterator A,SomeType);
    ...

    B_iterator B_list_begin();
    B_iterator B_list_end();
    B_iterator B_list_insert(B_iterator B,SomeType);
    ...
  };

This looks superficially like I have the STL list stuff
but the names of methods have to include which list I want to use.
Thus I converted this to a proxy structure...

class A_Proxy
  {
    MyClass* const mc;
    iterator begin();
    iterator end();
    iterator insert(iterator,SomeType);
    ...
  };

class B_Proxy
  {
    MyClass* const mc;
    iterator begin();
    iterator end();
    iterator insert(iterator,SomeType);
    ...
  };

class MyClass
  {
    A_Proxy a;
    B_Proxy b;
    MyClass () : a(this), b(this) {}
  };

Now I have something which really looks like STL lists.
I can refer to one or the other as  myclass.a.begin().

Unfortunately this falls completely apart when I get
several hundred of these proxies together.  When
used like this my users are hitting 64k objects containing nothing
but pointers to this!
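
The cost can be seen in a reduced version of the design. This is a minimal sketch with just two proxies; the constructors and the owner() accessor are hypothetical additions, and the list methods are omitted.

```cpp
#include <cassert>
#include <cstddef>

class MyClass;

class A_Proxy {
public:
    explicit A_Proxy(MyClass* m) : mc(m) {}
    MyClass* owner() const { return mc; }   // hypothetical accessor
private:
    MyClass* const mc;   // back-pointer to the enclosing object
};

class B_Proxy {
public:
    explicit B_Proxy(MyClass* m) : mc(m) {}
    MyClass* owner() const { return mc; }   // hypothetical accessor
private:
    MyClass* const mc;   // the same back-pointer, stored a second time
};

class MyClass {
public:
    MyClass() : a(this), b(this) {}
    A_Proxy a;
    B_Proxy b;
};
```

Every proxy member adds at least sizeof(MyClass*) bytes to each MyClass object, all holding the same value, so several hundred proxies cost several hundred pointers per object.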

Thus my only choice is to combine these with an anonymous
union.  Since they all have identical data, all the usual
problems of construction and destruction are irrelevant because
the proxy just contains a pointer to the object to which it
belongs.   The only problem is that I do not want the users
to be able to change the proxy's object accidentally.  After
all, with lists it is perfectly valid to copy one list to another.


I could combine the proxies and use reinterpret_cast as you
suggest; however, many of my proxies are meant to look like functions.

 widget.show.connect();  // looks like an object
 widget.show();          // looks like a function

Writing reinterpret functions for all these types would thus
be a total mess, since it would be unclear which is the accessor
and which is the call function.

I find the whole logic of "the compiler can't tell, therefore
we shouldn't allow it" to be nothing more than a cheap escape
from the problem.  If unions were allowed to contain
things with constructors and destructors, then users would
just write programs like this to use them....


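class A {}; // non trivial
class B {}; // non trivial
class C
  {
    enum Type { NONE, TYPE_A, TYPE_B } type;
    union {
      A a;
      B b;
    } u;
  public:
    C();
    ~C();
    void do_something();
    C& operator=(const C& c);
  };

C::C()
  {
   type=NONE;
  }

void C::do_something()
  {
   if (type==NONE)
     {
       new (&u.a) A(args);   // args stands for A's constructor arguments
       type=TYPE_A;          // remember which member is now live
     }
  }

C& C::operator=(const C& c)
  {
    if (c.type!=type) return *this;
    switch (type)
      {
        case TYPE_A:
          u.a=c.u.a;
          break;
        case TYPE_B:
          u.b=c.u.b;
          break;
        case NONE:
          break;
      }
    return *this;
  }

C::~C()
  {
    switch (type)
      {
        case TYPE_A:
          u.a.~A();
          break;
        case TYPE_B:
          u.b.~B();
          break;
        case NONE:
          break;
      }
  }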


Here the fact that the compiler did not know how to handle
the data was irrelevant.  The user defined what they wanted
to do with the data.  Placement new and explicit destructor
calls were used to handle the data.
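
As a hedged illustration only: the sketch below assumes a compiler
for C++11 or later, where union members with non-trivial constructors
and destructors are permitted. The names Holder, set_a, set_b, which
and as_a are hypothetical, and std::string and std::vector<int> stand
in for the non-trivial classes A and B.

```cpp
#include <cassert>
#include <new>
#include <string>
#include <vector>

typedef std::string      A;  // stand-in for a non-trivial class
typedef std::vector<int> B;  // stand-in for a non-trivial class

class Holder
  {
  public:
    enum Type { NONE, TYPE_A, TYPE_B };
  private:
    Type type;
    union U
      {
        A a;
        B b;
        U() {}    // constructs no member
        ~U() {}   // destroys no member; Holder decides what to destroy
      } u;

    // Explicitly destroy whichever member is live, if any.
    void clear()
      {
        switch (type)
          {
            case TYPE_A: u.a.~A(); break;
            case TYPE_B: u.b.~B(); break;
            case NONE:   break;
          }
        type = NONE;
      }
  public:
    Holder() : type(NONE) {}
    ~Holder() { clear(); }
    Holder(const Holder&) = delete;             // copying left out of the sketch
    Holder& operator=(const Holder&) = delete;

    // Placement new builds the chosen member in the union's storage.
    void set_a(const A& v) { clear(); new (&u.a) A(v); type = TYPE_A; }
    void set_b(const B& v) { clear(); new (&u.b) B(v); type = TYPE_B; }

    Type which() const { return type; }
    const A& as_a() const { return u.a; }  // valid only while which()==TYPE_A
  };
```

The compiler never constructs or destroys a member by itself here;
the tag records what the user has put in the storage.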

Considering this is largely equivalent to how data is
dealt with in C unions, I don't see why having the
compiler not know is that bad.  Most of the time, the
user can cast things such that the compiler is just guessing,
so why not define a union to hold anything and do nothing
with the data unless told to do so?

--Karl

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: James Kuyper <kuyper@wizard.net>
Date: 2000/03/03 Raw View

"Daniel M. Pfeffer" wrote:
....
> Unions are generally non-portable, because you have no idea how much packing
> is inserted between elements in structures. For example, if sizeof(short) =
> 2 and 'short' is always aligned on even addresses, sizeof(long) is 4 chars,
> and 'long' is always aligned on addresses divisible by 4 and sizeof(double)
> is 8 and 'double' is always aligned on addresses divisible by 8, the
> following union will be completely different from a union in an environment
> that has no special alignment requirements:
>
> struct A {
>     short x;   // 6 bytes of padding if y is aligned mod 8, otherwise no padding.
>     double y;
> }

Section 9.5 p1 says that for unions: "Each data member is allocated as
if it were the sole member of a struct." Now, if that data member were
the sole member of a struct, then Section 9.2 p17 applies: "A pointer to
a POD-struct object, suitably converted using a reinterpret_cast, points
to its initial member (or if that member is a bit-field, then to the
unit in which it resides) and vice versa."  Therefore, all of the
members of a union must have the same address as the union itself.

This logic doesn't apply to data members of a union that would prevent
a struct containing them from being a POD-struct, but that doesn't apply
to the member types in your example.

However, even in that case, while the address of a member might be
different from the address of the union, section 5.9 p2 still applies:
"If two pointers point to data members of the same union object, they
compare equal (after conversion to void *, if necessary)."

All of the data members must start at the same location. It's not
meaningful to talk about having padding "between" them, because there is
no such thing as a place that is between them.
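
A small check of this property, as a sketch: the union Sample and the
function members_share_address are hypothetical names introduced for
illustration, with member types taken from the quoted example.

```cpp
#include <cassert>

union Sample
  {
    short  x;
    long   n;
    double y;
  };

// Returns true when every member starts at the address of the union itself.
inline bool members_share_address(Sample& s)
  {
    void* base = &s;
    return static_cast<void*>(&s.x) == base
        && static_cast<void*>(&s.n) == base
        && static_cast<void*>(&s.y) == base;
  }
```

The comparison is made after conversion to void*, as in the wording
quoted from 5.9 p2.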

I can't figure out what misunderstanding you've made, but with the
implementation you describe, 'a.x' must occupy the first two bytes of
'a', 'a.y' must occupy the first eight bytes of 'a', 'a' must be aligned
on a multiple of 8 bytes, and therefore must have a size which is a
positive multiple of 8 bytes. The only padding allowed is at the end of
the union, not at the beginning.

> union {
>     A a;
>     short b[sizeof(A)/sizeof(short)];  // 5 elements if no padding. 8 elements if padding & b[1]..b[3] are garbage
> } t;

The number of elements in t.b must be a positive multiple of 4. It will
be exactly 4 if there is no padding. I can't figure out where you got
the number 5. 8 is valid, and represents the minimum non-zero amount of
padding, but I think you're implying that it's also the maximum, which
would be false.
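
The arithmetic can be checked with a sketch like the one below. It
assumes, as this reply does, that A is a union of a short and a
double; the name T for the outer union and the constant elements are
introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>

union A
  {
    short  x;
    double y;
  };

union T
  {
    A a;
    short b[sizeof(A) / sizeof(short)];
  };

// Number of short elements that fit in A on this implementation.
const std::size_t elements = sizeof(A) / sizeof(short);
```

On the implementation described in the quoted post (2-byte short,
8-byte double aligned on 8), sizeof(A) would be 8 with no trailing
padding, so elements would be 4. Other implementations may give other
values, which is why the sketch computes the count instead of fixing
it.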

I just realized what mistake you've made: your comments make perfect
sense if you're completely unaware of how a union differs from an
ordinary structure. Have you ever actually used a union before?
