Topic: Reusing storage after object destruction


Author: Raoul Gough <chbjrdnr@trwk06111509.net>
Date: Tue, 13 Jan 2009 18:14:31 CST
James Kanze <james.kanze@gmail.com> writes:

> On Dec 31, 1:50 am, Raoul Gough <nyvmk...@qbvq59279308.net> wrote:
>> James Kanze <james.ka...@gmail.com> writes:
>> > On Dec 23, 5:21 am, John Nagle <na...@animats.com> wrote:
>> [snip]
>> > I think the standard could be clearer here, but I'm not sure
>> > what it should say.  Given something like:
>
>> >    void foo( int* p1, double* p2 ) { ... }
>
>> > being able to assume that p1 and p2 don't alias the same
>> > object is an important help to optimization.  But of course,
>> > the compiler cannot assume this in every case: the following
>> > is a perfectly conforming and well-defined bit of C++:
>
>> >    union U { int i ; double d ; } ;
>
>> >    int
>> >    foo( int* p1, double* p2 )
>> >    {
>> >        int r = *p1 ;
>> >        *p2 = 3.1415 ;
>> >        return 4 ;
>> >    }
>
>> >    U u ;
>> >    u.i = 42 ;
>> >    foo( &u.i, &u.d ) ;
>
>> > (On the other hand, if foo writes *p2, then reads *p1, the
>> > compiler can assume that they aren't aliases.  Subtle,
>> > n'est-ce pas?)

I've asked some questions about these examples on the gcc mailing
list, and an opinion there is that the C standards committee
ultimately decided that the union example is *not* valid unless the
union is visible within the code that manipulates the union
members. If you read the following web page to the bottom, it says

http://www.open-std.org/JTC1/SC22/WG14/www/docs/dr_236.htm

"Both programs invoke undefined behavior [...]"

Which is in reference both to the union-based example and another one
using malloc and pointer casts. Or am I missing something?

>> In the case of the memory manager, there may be an explicit
>> "delete" expression, which should give it a hint that the
>> pointer may immediately alias an object of a different type.
>> However, from what I've seen, g++ doesn't seem to use this
>> information, at least not in the case of objects with trivial
>> destructors.

Ah, I thought the same! In fact, by using some compiler-debugging
options, it's possible to examine the internal representation of the
aliasing information. The crucial point seems to be that my example
had an explicit destructor call, but no explicit object construction
when reusing the storage. The code should be like this:

#include <new>

void** global_free_list = 0;

inline void operator delete(void* p2) throw()
{
     if (p2 == 0)
         return;  // operator delete must accept a null pointer
     // Save "next" pointer in re-used client storage.
     new (p2) (void *) (global_free_list);
     global_free_list = static_cast<void **>(p2);
}

double foo(double* p1)
{
     double result = *p1;
     delete p1;
     return result;
}

So this uses placement-new when re-using the storage (instead of
pointer casting). I posted the g++ 4.3.2 alias analysis for this to
the gcc list, and apparently it would correctly prevent reordering the
operations on p1 and p2 for the code as shown (but not if it uses a
cast instead of placement new).

--
Cheers,
Raoul Gough.

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@netlab.cs.rpi.edu]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]





Author: gpderetta <gpderetta@gmail.com>
Date: Fri, 26 Dec 2008 21:42:20 CST
On Dec 23, 6:49 pm, James Kanze <james.ka...@gmail.com> wrote:
> On Dec 23, 5:21 am, John Nagle <na...@animats.com> wrote:
>
<snip>
> being able to assume that p1 and p2 don't alias the same object
> is an important help to optimization.  But of course, the
> compiler cannot assume this in every case: the following is a
> perfectly conforming and well-defined bit of C++:
>
>    union U { int i ; double d ; } ;
>
>    int
>    foo( int* p1, double* p2 )
>    {
>        int r = *p1 ;
>        *p2 = 3.1415 ;
>        return 4 ;
>    }
>
>    U u ;
>    u.i = 42 ;
>    foo( &u.i, &u.d ) ;
>

Actually I'm not so sure it is well defined at all.  There is an open
defect against both the C and C++ standards about almost exactly this
example. See:

http://std.dkuug.dk/jtc1/sc22/wg14/www/docs/dr_236.htm

And the proposed resolution would make the above UB.

<snip>
>
> > As for GCC, if you use "-fstrict-aliasing", you are telling
> > the compiler to assume that "an object of one type is assumed
> > never to reside at the same address as an object of a
> > different type, unless the types are almost the same."  Given
> > that assumption, the compiler's reordering is allowed.
>
> In other words, if you use "-fstrict-aliasing", you're telling
> the compiler that you never use a union or do any type punning.
> This is, IMHO, a valid solution: the compiler generates strictly
> conforming code otherwise, and you use an option to tell it that
> it can go even further.  If you use the option here, then you've
> lied to the compiler.

The problem is that -fstrict-aliasing is on by default with (not so)
recent GCC versions, because, according to the GCC developers, it is
the behavior mandated by the standard.
-fno-strict-aliasing can be used to disable optimizations based on
type alias analysis, but according to the GCC documentation, this
option is not guaranteed to be available in the future (although its
removal is unlikely, as it would break a large percentage of free
software).

> (Of course, all options aside, when the
> compiler sees the reinterpret_cast in the same function, it
> should automatically deactivate the option.  QoI issue.)
>

Well, I do not think it would really work well in the presence of
inlining or aggressive interprocedural optimizations. The optimizer
might no longer have the notion of a function.

> > With "-Wall", you should get the message "dereferencing
> > type-punned pointer will break strict-aliasing rules".  Try
> > that.
> > So this isn't a C++ standard problem.
> > Generally, if you're using "reinterpret_cast" without dire
> > necessity, you're writing bad code.  There are much better
> > ways to code this.
>
> It's a standard idiom in low level memory managers.  It can be
> written in different ways; I generally reinterpret_cast the
> original pointer to a pointer to a union, and do all of the
> accesses through that, e.g.:
>
>    union Header
>    {
>        int             userData ;
>        Header*         next ;
>    } ;
>

GCC (and many other compilers) accept type punning via a union as an
explicit extension (but only when directly accessing union members).
There are other ways though, like using memcpy (at the cost of making
the code hard to read).
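
A minimal sketch of the memcpy approach, applied to the free-list case
under discussion. The helper names storeLink and loadLink are
hypothetical (they do not appear in the original code), and the sketch
assumes the block is at least sizeof(void*) bytes. Because the bytes are
copied rather than accessed through a type-punned pointer, no lvalue of
an unrelated type is ever dereferenced:

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: store a "next" link in raw storage by copying
// the bytes of the pointer, instead of writing through a void**.
inline void storeLink(void* block, void* next)
{
    assert(block != 0);
    std::memcpy(block, &next, sizeof next);
}

// Hypothetical helper: read the link back the same way.
inline void* loadLink(const void* block)
{
    assert(block != 0);
    void* next;
    std::memcpy(&next, block, sizeof next);
    return next;
}
```

Compilers commonly turn such fixed-size memcpy calls into a plain load
or store, but that is an optimization, not something the standard
promises.
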
And BTW, wouldn't using placement operator new actually fix the OP
code:

#include <new>

extern void* globalFreeList;

int foo(int* userPtr) {
 int temp = *userPtr;   // Read user value
 userPtr->~int();   // int is a pod, so this shouldn't actually be necessary
 void** listPointer = new (userPtr) void* (globalFreeList); // Re-use storage here
 globalFreeList = listPointer;
 // Return copied value
 return temp;
}

Does the standard actually guarantee that the above must work?
(Assuming that void* and ints have compatible sizes and alignment, of
course).

--
gpd






Author: James Kanze <james.kanze@gmail.com>
Date: Sat, 27 Dec 2008 15:29:28 CST
On Dec 27, 4:42 am, gpderetta <gpdere...@gmail.com> wrote:
> On Dec 23, 6:49 pm, James Kanze <james.ka...@gmail.com> wrote:
> > On Dec 23, 5:21 am, John Nagle <na...@animats.com> wrote:

> <snip>
> > being able to assume that p1 and p2 don't alias the same
> > object is an important help to optimization.  But of course,
> > the compiler cannot assume this in every case: the following
> > is a perfectly conforming and well-defined bit of C++:

> >    union U { int i ; double d ; } ;

> >    int
> >    foo( int* p1, double* p2 )
> >    {
> >        int r = *p1 ;
> >        *p2 = 3.1415 ;
> >        return 4 ;
> >    }

> >    U u ;
> >    u.i = 42 ;
> >    foo( &u.i, &u.d ) ;

> Actually I'm not so sure it is well defined at all.  There is
> an open defect against both the C and C++ standard about this
> almost exact example. See:

> http://std.dkuug.dk/jtc1/sc22/wg14/www/docs/dr_236.htm

> And the proposed resolution would make the above UB.

And the committee did not accept the proposed wording.

According to the current wording in the C++ standard (and I'm
pretty sure in the C standard as well), the above is well
defined.  It does create a very large problem for compiler
writers, but the only really acceptable solution I see is a
compiler option promising the compiler that you don't do this.

> <snip>
> > > As for GCC, if you use "-fstrict-aliasing", you are
> > > telling the compiler to assume that "an object of one type
> > > is assumed never to reside at the same address as an
> > > object of a different type, unless the types are almost
> > > the same."  Given that assumption, the compiler's
> > > reordering is allowed.

> > In other words, if you use "-fstrict-aliasing", you're
> > telling the compiler that you never use a union or do any
> > type punning.  This is, IMHO, a valid solution: the compiler
> > generates strictly conforming code otherwise, and you use an
> > option to tell it that it can go even further.  If you use
> > the option here, then you've lied to the compiler.

> The problem is that -fstrict-aliasing is on by default with
> (not so) recent GCC versions, because, according to the GCC
> developers, it is the behavior mandated by the standard.
> -fno-strict-aliasing can be used to disable optimizations
> based on type alias analysis, but according to the GCC
> documentation, this option is not guaranteed to be available
> in the future (unlikely, as it would break a large percentage
> of free software).

That is a problem.  If only from a quality of implementation
point of view; for any number of reasons, aliasing does exist in
real code, and it is necessary in low level code.  It's
supported by the standard if 1) the pointers point to members of
a union, or 2) one of the pointers is to a character type.
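
A minimal sketch of case 2, access through a character type. The helper
name sameBytes is hypothetical; it compares the object representations
of two values byte by byte through unsigned char pointers, which is the
kind of aliasing the access rules explicitly permit:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: compare the object representations of two
// objects of the same type by reading them as unsigned char.
template <typename T>
bool sameBytes(const T& a, const T& b)
{
    const unsigned char* pa = reinterpret_cast<const unsigned char*>(&a);
    const unsigned char* pb = reinterpret_cast<const unsigned char*>(&b);
    for (std::size_t i = 0; i != sizeof(T); ++i) {
        if (pa[i] != pb[i])
            return false;
    }
    return true;
}
```

Note that for types with padding bytes the result can differ from
operator==, since padding is part of the object representation.
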

The only real short term solution is to provide an option like
-fstrict-aliasing (which shouldn't be on by default).  The long
term solution, of course, is to develop optimizing techniques
which allow the compiler to detect such cases; modern
optimizers already look beyond the module boundary, so this
isn't so unreasonable.

> > (Of course, all options aside, when the compiler sees the
> > reinterpret_cast in the same function, it should
> > automatically deactivate the option.  QoI issue.)

> Well, I do not think it  would really work well in the
> presence of inlining or aggressive interprocedural
> optimizations. The optimizer might no longer have the notion
> of a function.

No.  It sees more than just one function.  And the more it sees,
the better it can determine whether the optimization is really
safe.

> > > With "-Wall", you should get the message "dereferencing
> > > type-punned pointer will break strict-aliasing rules".
> > > Try that.
> > > So this isn't a C++ standard problem.
> > > Generally, if you're using "reinterpret_cast" without dire
> > > necessity, you're writing bad code.  There are much better
> > > ways to code this.

> > It's a standard idiom in low level memory managers.  It can be
> > written in different ways; I generally reinterpret_cast the
> > original pointer to a pointer to a union, and do all of the
> > accesses through that, e.g.:

> >    union Header
> >    {
> >        int             userData ;
> >        Header*         next ;
> >    } ;

> GCC (and many other compilers) accept type punning via a union
> as an explicit extension. (but only when directly accessing
> union members).  There are other ways though, like using
> memcpy (at the cost of making the code hard to read).

And generally with a performance impact as well.

It is funny that the standard doesn't allow type punning with
unions (except in special cases), but does through
reinterpret_cast, whereas traditionally, unions have been the
most used for this, and most compilers, as you say, do support
it (although I've used one which didn't).  Thus, for example, if
you know you're dealing with big-endian IEEE double, and you
want to extract the exponent, you can write either:

     union U
     {
         double              d ;
         unsigned short      s[ 4 ] ;
     }                   tmp ;
     tmp.d = theDouble ;
     exp = (tmp.s[ 0 ] & 0x7FF0) >> 4 ;

or
     exp = (*reinterpret_cast< unsigned short const* >( &theDouble )
                 & 0x7FF0) >> 4 ;

By my reading of the standard, the intent is that the second
version should work, but the first is undefined behavior.  From
experience, however, more compilers get the first right.  And of
course:

     unsigned char const*tmp
         = static_cast< unsigned char const* >(
                 static_cast< void const* >( &theDouble ) ) ;
     exp = ((tmp[ 0 ] & 0x7F) << 4) | (tmp[ 1 ] >> 4) ;

is guaranteed to work.
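
For comparison, a sketch of the same extraction done with memcpy. The
function name biasedExponent is hypothetical. It assumes an IEEE 754
binary64 double and that 64-bit integers and doubles share the same byte
order on the target, in which case the result does not depend on whether
the machine is big- or little-endian:

```cpp
#include <cassert>
#include <cstring>
#include <stdint.h>

// Hypothetical helper: return the 11-bit biased exponent field of an
// IEEE 754 binary64 value (an assumption about the double format).
inline unsigned biasedExponent(double d)
{
    uint64_t bits;
    assert(sizeof bits == sizeof d);
    std::memcpy(&bits, &d, sizeof bits);   // copy the object representation
    return static_cast<unsigned>((bits >> 52) & 0x7FF);
}
```

Under those assumptions, biasedExponent(1.0) yields the bias itself,
1023.
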

> And BTW, wouldn't using placement operator new actually fix the OP
> code:

> extern void* globalFreeList;

> int foo(int* userPtr) {
>  int temp = *userPtr;   // Read user value
>  userPtr->~int();   // int is a pod, so this shouldn't actually be necessary
>  void** listPointer = new (userPtr) void* (globalFreeList); // Re-use storage here
>  globalFreeList = listPointer;
>  // Return copied value
>  return temp;
> }

> Does the standard actually guarantee that the above must work?
> (Assuming that void* and ints have compatible sizes and
> alignment, of course).

I think it actually guarantees that the original code should
work.  Whether this is intentional or not is another question.

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                    Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34







Author: Raoul Gough <nyvmktcb@qbvq59279308.net>
Date: Tue, 30 Dec 2008 18:50:32 CST
James Kanze <james.kanze@gmail.com> writes:

> On Dec 23, 5:21 am, John Nagle <na...@animats.com> wrote:
[snip]
> I think the standard could be clearer here, but I'm not sure
> what it should say.  Given something like:
>
>    void foo( int* p1, double* p2 ) { ... }
>
> being able to assume that p1 and p2 don't alias the same object
> is an important help to optimization.  But of course, the
> compiler cannot assume this in every case: the following is a
> perfectly conforming and well-defined bit of C++:
>
>    union U { int i ; double d ; } ;
>
>    int
>    foo( int* p1, double* p2 )
>    {
>        int r = *p1 ;
>        *p2 = 3.1415 ;
>        return 4 ;
>    }
>
>    U u ;
>    u.i = 42 ;
>    foo( &u.i, &u.d ) ;
>
> (On the other hand, if foo writes *p2, then reads *p1, the
> compiler can assume that they aren't aliases.  Subtle, n'est-ce
> pas?)

Is this exactly the same problem for the compiler as the memory
manager example, or an even more difficult one? In the case of the
memory manager, there may be an explicit "delete" expression, which
should give it a hint that the pointer may immediately alias an object
of a different type. However, from what I've seen, g++ doesn't seem to
use this information, at least not in the case of objects with trivial
destructors.

[snip]

>> Generally, if you're using "reinterpret_cast" without dire
>> necessity, you're writing bad code.  There are much better
>> ways to code this.
>
> It's a standard idiom in low level memory managers.  It can be
> written in different ways; I generally reinterpret_cast the
> original pointer to a pointer to a union, and do all of the
> accesses through that, e.g.:
>
>    union Header
>    {
>        int             userData ;
>        Header*         next ;
>    } ;
>
> There'll still be a reinterpret_cast (or a static_cast from a
> void*) to get the Header* in the deallocation routine, however.

Yes, exactly. For instance, take a look at some memory management code
from Andrei Alexandrescu (available from the link below)

http://accu.org/content/conf2008/Alexandrescu-memory-allocation.screen.pdf

template <size_t S, class B>
struct ExactFreeList : private B {
// ...
 void deallocate(void * p) {
   if (allocatedSize(p) != S)
     return B::deallocate(p);
   List * pL = static_cast<List*>(p);
   pL->next_ = list_;
   list_= pL;
 }
// ...

I believe this suffers the same storage re-use problem. So I can think
of three possible conclusions:

1. Storing free-list pointers within the (freed) user data area is
inherently non-portable, regardless of programming language.

2. C++ just doesn't provide a (portable) mechanism to fix (1)

3. g++ is being over-zealous in applying its strict-aliasing optimizations

I have a vague suspicion that (3) is correct, at least assuming that
an object has been explicitly destroyed in between the pointer
usages. In James' union case, there isn't even that much information
for the compiler.

However, I don't think my suspicion is going to be enough to convince
anyone to change g++ - from what I understood of the implementation,
its alias analysis operates more or less the same way for C (which does
not have destructors) and C++ (which does). That is, it considers only
whether two pointers of a given type can ever alias one-another. For
example, in C or C++ a char* could alias anything, and in C++, derived
and base pointers could alias each other. However, if I understand
correctly, it considers that int* and void** cannot ever alias the
same storage.
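
For illustration, a minimal sketch of an intrusive free list that
creates the link object with placement new instead of casting the user
pointer. FreeNode and FreeList are hypothetical names, not taken from
the slides above, and the sketch assumes every block passed to push() is
large enough and suitably aligned for a FreeNode. Whether a given
compiler's alias analysis honors the placement new is a separate
question:

```cpp
#include <cassert>
#include <new>

// Hypothetical link type stored inside freed blocks.
struct FreeNode {
    FreeNode* next;
};

// Hypothetical LIFO free list; it owns no memory, it only threads
// links through blocks handed to it.
class FreeList {
public:
    FreeList() : head_(0) {}

    // Reuse the block's storage for a FreeNode. The placement new starts
    // the lifetime of a new object in that storage.
    void push(void* block)
    {
        assert(block != 0);
        FreeNode* node = new (block) FreeNode;
        node->next = head_;
        head_ = node;
    }

    // Hand back the most recently pushed block, or null if empty.
    void* pop()
    {
        if (head_ == 0)
            return 0;
        FreeNode* node = head_;
        head_ = node->next;
        return node;
    }

    bool empty() const { return head_ == 0; }

private:
    FreeNode* head_;
};
```
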

--
Cheers,
Raoul.






Author: James Kanze <james.kanze@gmail.com>
Date: Wed, 31 Dec 2008 13:09:09 CST
On Dec 31, 1:50 am, Raoul Gough <nyvmk...@qbvq59279308.net> wrote:
> James Kanze <james.ka...@gmail.com> writes:
> > On Dec 23, 5:21 am, John Nagle <na...@animats.com> wrote:
> [snip]
> > I think the standard could be clearer here, but I'm not sure
> > what it should say.  Given something like:

> >    void foo( int* p1, double* p2 ) { ... }

> > being able to assume that p1 and p2 don't alias the same
> > object is an important help to optimization.  But of course,
> > the compiler cannot assume this in every case: the following
> > is a perfectly conforming and well-defined bit of C++:

> >    union U { int i ; double d ; } ;

> >    int
> >    foo( int* p1, double* p2 )
> >    {
> >        int r = *p1 ;
> >        *p2 = 3.1415 ;
> >        return 4 ;
> >    }

> >    U u ;
> >    u.i = 42 ;
> >    foo( &u.i, &u.d ) ;

> > (On the other hand, if foo writes *p2, then reads *p1, the
> > compiler can assume that they aren't aliases.  Subtle,
> > n'est-ce pas?)

> Is this exactly the same problem for the compiler as the
> memory manager example, or an even more difficult one?

It's related.  The goal here was to create an alias without a
reinterpret_cast, so the issues of what is guaranteed by
reinterpret_cast don't come into play.

> In the case of the memory manager, there may be an explicit
> "delete" expression, which should give it a hint that the
> pointer may immediately alias an object of a different type.
> However, from what I've seen, g++ doesn't seem to use this
> information, at least not in the case of objects with trivial
> destructors.

Exactly.  I can understand the problem occurring in something
like the above, where the aliasing isn't visible, but from a QoI
point of view, if nothing else, there's no excuse for it if the
"reuse" is visible locally, whether it is made visible by an
explicit call to a destructor, a reinterpret_cast or a union.
(From what I've been led to understand, the only case g++ will
treat this correctly is if there is a union visible locally.)

> [snip]
> So I can think of three possible conclusions:

> 1. Storing free-list pointers within the (freed) user data area is
> inherently non-portable, regardless of programming language.

> 2. C++ just doesn't provide a (portable) mechanism to fix (1)

> 3. g++ is being over-zealous in applying its strict-aliasing
> optimizations

All of the above.  For various reasons, it's almost impossible
to specify fully and portably at the language level.  The
obvious intent of the standard, however, is that
reinterpret_cast or a union be usable for this sort of thing;
and g++ overdoes it by ignoring this intent.

> I have a vague suspicion that (3) is correct, at least
> assuming that an object has been explicitly destroyed in
> between the pointer usages. In James' union case, there isn't
> even that much information for the compiler.

Unless it is doing intermodule analysis, there is no
information.

> However, I don't think my suspicion is going to be enough to
> convince anyone to change g++ - from what I understood of the
> implementation, its alias analysis operates more or less the
> same way for C (which does not have destructors) and C++
> (which does).

Which, of course, doesn't affect my example.  Nor an example
using an explicit cast.  The argument given with regards to the
explicit cast is the wording in §5.2.10/7:

     A pointer to an object can be explicitly converted to a
     pointer to an object of different type.  Except that
     converting an rvalue of type "pointer to T1" to the type
     "pointer to T2" (where T1 and T2 are object types and
     where the alignment requirements of T2 are no stricter
     than those of T1) and back to its original type yields
     the original pointer value, the result of such a pointer
     conversion is unspecified.

You could also use two static_cast: to void*, then to the target
type, but the standard doesn't really say whether this should
work or not, either.  All of this, of course, ignores intent,
and why reinterpret_cast is present to begin with.  For example,
the text in §3.10/15, which explicitly authorizes accessing any
object type as an array of char or unsigned char doesn't make
sense unless there is a means of obtaining the address as a
char* or an unsigned char*, and the text in §3.8 which you cited
to begin with also seems to require some sort of pointer
aliasing to work.  In the exact case in question, of course, you
could (and probably should) use a union, in which case, IMHO,
the standard makes an absolute guarantee that it must work, even
if the union isn't visible in the function in question.  But
there are other legitimate uses where the union is explicitly
given as undefined behavior (e.g. low level code to extract the
exponent from a floating point); it seems obvious to me that the
*intent* of the standard is for this to work where the hardware
supports it, even if, for various pragmatic reasons, it can't
require it explicitly.

> That is, it considers only whether two pointers of a given
> type can ever alias one-another. For example, in C or C++ a
> char* could alias anything, and in C++, derived and base
> pointers could alias each other. However, if I understand
> correctly, it considers that int* and void** cannot ever alias
> the same storage.

Which is manifestly false if you have a union.

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                    Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34







Author: Raoul Gough <tyyjvtdn@cdyx92232360.net>
Date: Mon, 22 Dec 2008 02:52:50 CST
I'm trying to understand whether the following code has
well-defined behaviour or not. At least one version of g++ will
sometimes reorder the load and store when optimizing with "strict
aliasing" enabled.

extern void* globalFreeList;

int foo(int* userPtr) {
 int temp = *userPtr;   // Read user value

 void** listPointer = reinterpret_cast<void**>(userPtr);

 *listPointer = globalFreeList;  // Re-use storage here
 globalFreeList = listPointer;

 // Return copied value
 return temp;
}

(Assume for simplicity that sizeof(int) == sizeof(void*))

So I think how g++ works is that userPtr and listPointer have
incompatible types and it thinks it can therefore freely reorder
the reading of *userPtr and writing of *listPointer. Adding a
pseudo-destructor call like userPtr->~int(); doesn't seem to
help.

I'm having trouble understanding what (I think) is the relevant
part of the C++98 standard**. In Basic Concepts 3.8 Object
Lifetime it refers to the storage of an object being "reused or
released", but how is the compiler supposed to know when that has
happened? I'm thinking particularly of the case of a user-defined
deallocator that gets inlined, producing code like the above. Can
anyone hazard an opinion on this?

** Unfortunately I don't have tr1 to hand - apologies if this
  question is outdated

--
Thanks,
Raoul Gough.

Note - I already asked this question in a more specific context
in the posting "Reusing user data block in (de)allocator" in
comp.lang.c++.moderated, but I don't think there was an answer to
the underlying question.

http://groups.google.com/group/comp.lang.c++.moderated/browse_frm/thread/ddaeccff3c4a16fb






Author: John Nagle <nagle@animats.com>
Date: Mon, 22 Dec 2008 22:21:37 CST
Raoul Gough wrote:
>
> I'm trying to understand whether the following code has
> well-defined behaviour or not. At least one version of g++ will
> sometimes reorder the load and store when optimizing with "strict
> aliasing" enabled.
>
> extern void* globalFreeList;
>
> int foo(int* userPtr) {
>  int temp = *userPtr;   // Read user value
>
>  void** listPointer = reinterpret_cast<void**>(userPtr);
>
>  *listPointer = globalFreeList;  // Re-use storage here
>  globalFreeList = listPointer;
>
>  // Return copied value
>  return temp;
> }
>
> So I think how g++ works is that userPtr and listPointer have
> incompatible types and it thinks it can therefore freely reorder
> the reading of *userPtr and writing of *listPointer. Adding a
> pseudo-destructor call like userPtr->~int(); doesn't seem to
> help.

5.2.10 doesn't say anything specifically relevant to this case.

As for GCC, if you use "-fstrict-aliasing", you are telling the
compiler to assume that "an object of one type is assumed never to
reside at the same address as an object of a different type, unless
the types are almost the same."  Given that assumption, the compiler's
reordering is allowed.

With "-Wall", you should get the message "dereferencing type-punned pointer
will break strict-aliasing rules".  Try that.

So this isn't a C++ standard problem.

Generally, if you're using "reinterpret_cast" without dire necessity,
you're writing bad code.  There are much better ways to code this.

                                       John Nagle
                                       Animats






Author: James Kanze <james.kanze@gmail.com>
Date: Tue, 23 Dec 2008 11:49:09 CST
On Dec 23, 5:21 am, John Nagle <na...@animats.com> wrote:
> Raoul Gough wrote:

> > I'm trying to understand whether the following code has
> > well-defined behaviour or not. At least one version of g++
> > will sometimes reorder the load and store when optimizing
> > with "strict aliasing" enabled.

> > extern void* globalFreeList;

> > int foo(int* userPtr) {
> >  int temp = *userPtr;   // Read user value

> >  void** listPointer = reinterpret_cast<void**>(userPtr);

> >  *listPointer = globalFreeList;  // Re-use storage here
> >  globalFreeList = listPointer;

> >  // Return copied value
> >  return temp;
> > }

> > So I think how g++ works is that userPtr and listPointer
> > have incompatible types and it thinks it can therefore
> > freely reorder the reading of *userPtr and writing of
> > *listPointer. Adding a pseudo-destructor call like
> > userPtr->~int(); doesn't seem to help.

> 5.2.10 doesn't say anything specifically relevant to this case.

§5.2.10 doesn't really say much specific at all :-).  It does say
that the mapping between pointers and integers is "intended to
be unsurprising"; logically, I would expect this to extend to
conversions between pointers to object types as well.

The real issue here, as Raoul has correctly pointed out, is the
lifetime of the original int.  According to the standard, it
ends when "the storage which the object occupies is reused or
released."  The statement he marks "reuses" the storage, so the
lifetime of the object ends.  If moving the code accessing the
object after the lifetime of the object changes the observable
behavior of the program, then the compiler isn't allowed to do
it.

Provided the code contains no undefined behavior.  That is the
real question.  §5.2.10 is relevant, because it says that the
behavior of the reinterpret_cast here is unspecified, not
undefined.  (And if a reinterpret_cast of object pointers can't
be used for type punning, what use is it?)

I think the standard could be clearer here, but I'm not sure
what it should say.  Given something like:

   void foo( int* p1, double* p2 ) { ... }

being able to assume that p1 and p2 don't alias the same object
is an important help to optimization.  But of course, the
compiler cannot assume this in every case: the following is a
perfectly conforming and well-defined bit of C++:

   union U { int i ; double d ; } ;

   int
   foo( int* p1, double* p2 )
   {
       int r = *p1 ;
       *p2 = 3.1415 ;
       return 4 ;
   }

   U u ;
   u.i = 42 ;
   foo( &u.i, &u.d ) ;

(On the other hand, if foo writes *p2, then reads *p1, the
compiler can assume that they aren't aliases.  Subtle, n'est-ce
pas?)

I think it is the intent of the standard that the given snippet
of code work.  At least if all object addresses have the same
size and format, all alignment restrictions are met, etc.  The
standard doesn't guarantee it, of course, because there are
architectures where some of these conditions might not be met,
or where such a requirement would place undue overhead on other
code.

> As for GCC, if you use "-fstrict-aliasing", you are telling
> the compiler to assume that "an object of one type is assumed
> never to reside at the same address as an object of a
> different type, unless the types are almost the same."  Given
> that assumption, the compiler's reordering is allowed.

In other words, if you use "-fstrict-aliasing", you're telling
the compiler that you never use a union or do any type punning.
This is, IMHO, a valid solution: the compiler generates strictly
conforming code otherwise, and you use an option to tell it that
it can go even further.  If you use the option here, then you've
lied to the compiler. (Of course, all options aside, when the
compiler sees the reinterpret_cast in the same function, it
should automatically deactivate the option.  QoI issue.)

> With "-Wall", you should get the message "dereferencing
> type-punned pointer will break strict-aliasing rules".  Try
> that.

> So this isn't a C++ standard problem.

> Generally, if you're using "reinterpret_cast" without dire
> necessity, you're writing bad code.  There are much better
> ways to code this.

It's a standard idiom in low level memory managers.  It can be
written in different ways; I generally reinterpret_cast the
original pointer to a pointer to a union, and do all of the
accesses through that, e.g.:

   union Header
   {
       int             userData ;
       Header*         next ;
   } ;

There'll still be a reinterpret_cast (or a static_cast from a
void*) to get the Header* in the deallocation routine, however.

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                  Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

