Topic: How far "layout compatibility" can be stretched?


Author: igusarov@akella.com ("Igor A. Goussarov")
Date: Sun, 5 Jan 2003 10:07:57 +0000 (UTC)
Hello group,

    I've some questions connected to the recent GotW, the one about the
unions. In brief, my questions boil down to one: "exactly which
types can be considered layout compatible"? There's no real-world
problem behind this question; it's that I'm just curious.

    And now a lengthy description of how this question arose.

    If a union has two structs in it, and several first data members of
these two structs are "layout compatible", then the standard permits
(9.2p16, 9.5p1) accessing a said member of one struct when another
struct is 'active'. There's no code example, but I trust it could be
written this way:

struct S1       /* A POD */
{
   int       x;
};

struct S2       /* A POD */
{
   int       y;
};

union U1
{
   S1       s1;
   S2       s2;
};

U1        u1;
u1.s1 = S1();
u1.s1.x = 5;
if (u1.s2.y == 5) ...    /* Legal because 'x' and 'y' are among
                             those first layout compatible members
                             in both structs. */

    The first question is: why are the structs so important in this
case? What if I omit them, will the code still be legal?

union U2
{
   int        x;
   int        y;
};

U2         u2;
u2.x = 5;
if (u2.y == 5) ...  /* is this legal? Where does the standard
                        say it is or it isn't? */

    Then, the standard says (3.9p11) that T1 and T2 are layout compatible
if they are the same type. Later it goes on defining layout-compatible
enums, structs and unions. But what about other possibilities? I mean, a
"pointer to T" and a "pointer to const T" are required to have the same
alignment and value representation. Does it count as "layout compatibility"?

union U3
{
   const char*       p1;
         char*       p2;
};

U3        u3;
u3.p1 = "abc";
if (u3.p2 != NULL) ... /* is accessing p2 legal here? */

    Now, the third question. The standard requires that the value
representation of 'char', 'unsigned char' and 'signed char' use all the
bits, and that every possible bit combination maps to a valid value of
the corresponding char type (3.9.1p1). Sure, they also have the same
size and alignment... So is it legal to use, say, 'char' and 'signed
char' like 'const char*' and 'char*' were used in the previous example?
    If it is legal for chars, then what about ints? Sure, there's no
guarantee that every possible bit combination maps to a valid int value,
but there is a guarantee that the common set of values of 'int' and
'unsigned int' have the same value representation (3.9.1p3). Thus,

union U4
{
   int              i1;
   unsigned int     i2;
};

U4     u4;
u4.i2 = 5;
if (u4.i1 == ...)   /* is it legal? Would it be legal if I
                        had written
                           u4.i2 = (unsigned int)INT_MAX + 1u;
                        instead? */

    Thank you for your thoughts on these questions!

Igor

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: gp1@paradise.net.nz (Graeme Prentice)
Date: Tue, 7 Jan 2003 14:13:24 +0000 (UTC)
On Sun, 5 Jan 2003 10:07:57 +0000 (UTC), igusarov@akella.com ("Igor A.
Goussarov") wrote:

>Hello group,
>
>    I've some questions connected to the recent GotW, the one about the
>unions. In brief, my questions boil down to one: "exactly which
>types can be considered layout compatible"? There's no real-world
>problem behind this question; it's that I'm just curious.
>
>    And now a lengthy description of how this question arose.
>
>    If a union has two structs in it, and several first data members of

You left out an important term here: "POD". Para 16 is talking about a
POD-union containing two or more POD-structs.


>these two structs are "layout compatible", then the standard permits
>(9.2p16, 9.5p1) accessing a said member of one struct when another
>struct is 'active'. There's no code example, but I trust it could be
>written this way:
>
>struct S1       /* A POD */
>{
>   int       x;
>};
>
>struct S2       /* A POD */
>{
>   int       y;
>};

S1 and S2 are layout compatible POD structs because they contain the
same number of non static data members in the same order and the
corresponding members have layout compatible types.
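
A minimal sketch of the case described above, assuming the same S1, S2
and U1 as in the quoted code. The helper name store_s1_read_s2 is
hypothetical, and the asserts only check what layout compatibility
implies for this pair of structs on a given implementation; they are
not a general test for layout compatibility.

```cpp
#include <cassert>
#include <cstddef>

struct S1 { int x; };   // POD
struct S2 { int y; };   // POD, same member types in the same order

union U1 { S1 s1; S2 s2; };

// Hypothetical helper: stores through s1, then reads through s2.
inline int store_s1_read_s2(int value)
{
    // Corresponding members sit at the same offset and the structs
    // have the same size.
    assert(offsetof(S1, x) == offsetof(S2, y));
    assert(sizeof(S1) == sizeof(S2));

    U1 u;
    u.s1.x = value;   // s1 is the member last written
    return u.s2.y;    // inspects the common initial sequence via s2
}
```

Calling store_s1_read_s2(5) is expected to return 5, which is the
situation that 9.2 para 16 covers.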


>
>union U1
>{
>   S1       s1;
>   S2       s2;
>};
>
>U1        u1;
>u1.s1 = S1();
>u1.s1.x = 5;
>if (u1.s2.y == 5) ...    /* Legal because 'x' and 'y' are among
>                             those first layout compatible members
>                             in both structs. */

Correct. 9.2 para 16 makes this legal.

>
>    The first question is: why are the structs so important in this
>case? What if I omit them, will the code still be legal?
>
>union U2
>{
>   int        x;
>   int        y;
>};
>
>U2         u2;
>u2.x = 5;
>if (u2.y == 5) ...  /* is this legal? Where does the standard
>                        say it is or it isn't? */


Strictly speaking this is illegal as far as I can see  - however there
doesn't seem any point in having two names for the same piece of storage
with identical type so the fact that u2.y == 5 is technically illegal
doesn't matter.


>
>    Then, the standard says (3.9p11) that T1 and T2 are layout compatible
>if they are the same type. Later it goes on defining layout-compatible
>enums, structs and unions. But what about other possibilities? I mean, a
>"pointer to T" and a "pointer to const T" are required to have the same
>alignment and value representation. Does it count as "layout compatibility"?


No, definitely NOT. This would allow you to modify a const object.
Apart from the enum and union cases you mention, two non-struct POD
types are layout compatible only if they are *exactly* the same type.


>
>union U3
>{
>   const char*       p1;
>         char*       p2;
>};
>
>U3        u3;
>u3.p1 = "abc";
>if (u3.p2 != NULL) ... /* is accessing p2 legal here? */

Nope, it's illegal.


>
>    Now, the third question. The standard requires that the value
>representation of 'char', 'unsigned char' and 'signed char' use all the
>bits, and that every possible bit combination maps to a valid value of
>the corresponding char type (3.9.1p1). Sure, they also have the same
>size and alignment... So is it legal to use, say, 'char' and 'signed
>char' like 'const char*' and 'char*' were used in the previous example?
>    If it is legal for chars, then what about ints? Sure, there's no
>guarantee that every possible bit combination maps to a valid int value,
>but there is a guarantee that the common set of values of 'int' and
>'unsigned int' have the same value representation (3.9.1p3). Thus,

Um, I'll pass on exactly what the significance of this is.


>
>union U4
>{
>   int              i1;
>   unsigned int     i2;
>};
>
>U4     u4;
>u4.i2 = 5;
>if (u4.i1 == ...)   /* is it legal? Would it be legal if I
>                        had written
>                           u4.i2 = (unsigned int)INT_MAX + 1u;
>                        instead? */
>

No, it's not legal either way.

Even if you wrote
struct s1 { int i1; };
struct s2 { unsigned int i2; };
union u1 { s1 m1;  s2 m2; };

it's still not legal to inspect the int via the unsigned int.
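
If the aim is only to look at the bits of an unsigned int as an int,
one commonly used alternative that does not rely on the union rules at
all is to copy the object representation with std::memcpy. The sketch
below assumes that approach; the helper name bits_as_int is
hypothetical. For values representable in both types the result is the
same value; for larger values, such as (unsigned int)INT_MAX + 1u, the
result depends on the implementation's representation of int.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: copies the bytes of an unsigned int into an int.
inline int bits_as_int(unsigned int u)
{
    // int and unsigned int occupy the same amount of storage.
    assert(sizeof(int) == sizeof(unsigned int));

    int i;
    std::memcpy(&i, &u, sizeof i);   // byte-wise copy, no union involved
    return i;
}
```

For example, bits_as_int(5u) yields 5, because the values common to
both types share one value representation (3.9.1p3).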

Graeme






Author: kuyper@wizard.net ("James Kuyper Jr.")
Date: Tue, 7 Jan 2003 18:53:15 +0000 (UTC)
Graeme Prentice wrote:
> On Sun, 5 Jan 2003 10:07:57 +0000 (UTC), igusarov@akella.com ("Igor A.
> Goussarov") wrote:
....
>>union U2
>>{
>>  int        x;
>>  int        y;
>>};
>>
>>U2         u2;
>>u2.x = 5;
>>if (u2.y == 5) ...  /* is this legal? Where does the standard
>>                       say it is or it isn't? */
>
>
>
> Strictly speaking this is illegal as far as I can see  - however there
> doesn't seem any point in having two names for the same piece of storage
> with identical type so the fact that u2.y == 5 is technically illegal
> doesn't matter.

One possible use would be if the union were to be used with two different
templates, one of which needed its template argument to have a member
named 'x', the other of which needed it to have a member named 'y', and
the use you intend to make of the two templates required those
names to refer to the same piece of memory.
That's a pretty obscure possibility, but not impossible.
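
A rough sketch of that scenario, with hypothetical templates set_x and
get_y and a hypothetical helper round_trip. It only illustrates why the
two names might be wanted; whether reading y after writing x is
strictly conforming is exactly the point under discussion, so treat the
result as what implementations do in practice rather than as a
guarantee.

```cpp
#include <cassert>

union U2 { int x; int y; };

// Hypothetical template that needs its argument to have a member 'x'.
template <typename T>
void set_x(T& t, int value) { t.x = value; }

// Hypothetical template that needs its argument to have a member 'y'.
template <typename T>
int get_y(const T& t) { return t.y; }

// Hypothetical helper: uses both templates on the same union object.
inline int round_trip(int value)
{
    U2 u;
    assert(&u.x == &u.y);   // both names designate the same storage
    set_x(u, value);        // writes through the name 'x'
    return get_y(u);        // reads through the name 'y'
}
```

Here U2 satisfies both templates at once, which a struct with separate
x and y members could do only by keeping two copies of the value.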
