Topic: operators new[]/delete[]


Author: "Constantine Antonovich:" <const@Orbotech.Co.IL>
Date: 1996/02/26
Several days ago I sent the following code to this newsgroup,
asking whether the code is incorrect according to the current ANSI
standard or whether I have a bug in my compiler:

//----------------------------------------------------------
#include <iostream.h>
#include <assert.h>
#include <new.h>

class A {
public:
  A(void)     { cout << "A constructed" << endl; }
  ~A()        { cout << "A destructed" << endl; }
};

class B {
public:
  B(void)     { cout << "B constructed" << endl; }
  ~B()        { cout << "B destructed" << endl; }
};

A* foo_allocate(unsigned size)
{
  assert(sizeof(A)==sizeof(B));

  B* bptr=new B[size];
  A* aptr=(A*)bptr; // A* aptr=reinterpret_cast<A*>(bptr); is more correct
                    // but my compilers do not support that yet.
  for (unsigned j=size; j>0;) // this place corrected according
    (bptr+(--j))->~B();       // to remark of a person whose name
                              // I have lost to my regret.
  for (unsigned i=0; i<size; ++i)
    new(aptr+i) A;

  return aptr;
}

int main(void)
{
  A* arr=foo_allocate(2);
  delete [] arr;          // #here
  return 0;
}
//----------------------------------------------------------

    The point I considered problematic is that one of my compilers
treats the line marked "#here" as the destruction of an array of B
objects (the B destructors are called).

    I have received several answers (I also sent this code
personally to Steve Clamage, and he kindly answered me) concluding
that, according to the standard, the program contains operations
with undefined results and so cannot be considered correct.
Nevertheless, after some reflection, I am going to insist on the
following:

    -- The ANSI working papers (April, 1995) leave the correctness
       of the above-mentioned code open to interpretation;
    -- If so, it is necessary to define this matter more precisely
       to eliminate differences of interpretation among compiler
       vendors;
    -- The code can be interpreted as having absolutely correct
       and well-defined behavior.

    In the following text, I am going to prove this point of view.


              1. Little history of the code sample.

    The above-mentioned code sample may look like a play of
imagination with no practical weight. However, this code was
born from another one that makes a little more sense.
    Some time ago, I noticed periodically recurring discussions
about the necessity of a renew operation in C++. Generally, I never
felt deprived by the absence of renew, but seeing arguments for
its usefulness again and again, I started to think: what the hell
is the problem, and is it not possible to implement it by means of
the language itself?
    So I wrote the following example, trying to avoid the redundant
operations that usually occur in reallocation done in the classic
manner:

//----------------------------------------------------------
template<class T>
class Allocator {
private:
  struct filler { char filler_[sizeof(T)]; };
public:
  static T*   allocate_array(unsigned elm);
  static void dup_array(T* dst, T* src, unsigned elm);
  static void fill_array(T* dst, unsigned elm);
};

template<class T>
T* Allocator<T>::allocate_array(unsigned elm)
{
  return (T*)new filler[elm];
}

template<class T>
void Allocator<T>::dup_array(T* dst, T*src, unsigned elm)
{
  for (unsigned i=0; i<elm; ++i)
    new(dst+i) T(src[i]);
}

template<class T>
void Allocator<T>::fill_array(T* dst, unsigned elm)
{
  for (unsigned i=0; i<elm; ++i)
    new(dst+i) T;
}

template<class T>
T* realloc(T*& array, unsigned old_size, unsigned new_size)
{
  set_new_handler(0);
  T* tmp=Allocator<T>::allocate_array(new_size);
  if (tmp) {
    if (new_size>old_size) {
      Allocator<T>::dup_array(tmp,array,old_size);
      Allocator<T>::fill_array(tmp+old_size,new_size-old_size);
    }
    else
      Allocator<T>::dup_array(tmp,array,new_size);
    delete [] array; array=tmp;
  }
  return tmp;
}
//----------------------------------------------------------

    The general idea was to allocate an array without initializing
its objects and then to use the copy constructor to copy the old
elements into the newly allocated array (avoiding redundant
creation of objects with their default constructor when they are
immediately replaced by a following assignment). There is no
problem using such a technique in container classes, where all
memory and object management is completely hidden from the user,
but the imaginary renew operation should be applicable to regular
arrays like:

//----------------------------------------------------------
    A* ap=new A[4];
    realloc(ap,4,8);
    delete [] ap;
//----------------------------------------------------------

    Obviously, the code contained in the
Allocator<T>::allocate_array function produces the mentioned
problem.
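
    For comparison, here is a minimal sketch of the same idea that
never uses new[] or delete[] at all: storage comes from
::operator new and goes back through ::operator delete, so the
question of what delete[] sees does not arise. The names raw_array,
destroy_range and renew are hypothetical (they are not part of the
code above), and a C++11 or later compiler is assumed. Unlike the
code above, allocation failure is reported by an exception, not by a
null pointer, and the sketch deliberately does not apply to arrays
obtained from new T[n].

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical helper: raw, uninitialized storage for n objects of type T.
// Assumes T needs no stricter alignment than ::operator new provides.
template <class T>
T* raw_array(std::size_t n)
{
  return static_cast<T*>(::operator new(n * sizeof(T)));
}

// Hypothetical helper: destroy n objects in reverse order of construction.
template <class T>
void destroy_range(T* p, std::size_t n)
{
  while (n > 0)
    p[--n].~T();
}

// Resize an array that was built with raw_array/renew and currently
// holds old_size live objects. Old elements are copy-constructed into
// the new storage; extra elements are value-initialized.
template <class T>
T* renew(T*& array, std::size_t old_size, std::size_t new_size)
{
  assert(array != 0 || old_size == 0);

  T* tmp = raw_array<T>(new_size);
  const std::size_t keep = old_size < new_size ? old_size : new_size;
  std::size_t built = 0;
  try {
    for (; built < keep; ++built)
      new (tmp + built) T(array[built]);   // copy, no default construction
    for (; built < new_size; ++built)
      new (tmp + built) T();               // fill the tail
  }
  catch (...) {
    destroy_range(tmp, built);             // undo the partial work
    ::operator delete(tmp);
    throw;
  }
  destroy_range(array, old_size);
  ::operator delete(array);
  array = tmp;
  return tmp;
}
```

    The owner of such an array has to finish with destroy_range
followed by ::operator delete, never with delete[]; that is the
price of keeping allocation and construction visibly separate.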

        2. Undefined behavior.

    Of  course, some  constructions  in   a possible program   may
produce undefined behavior.  Nevertheless, even undefined behavior
should have some definition. Let's consider the following code:

//----------------------------------------------------------
      A* ap=new A;
      B* bp=reinterpret_cast<B*>(ap);
      delete bp; // #here
//----------------------------------------------------------

Without any doubt, result of the code  executed in "#here" line is
undefined.   But  definitely  I  wouldn't   like,  as a result  of
uncertainty of the behavior, my  compiler to send email  complaint
to  some   League "C++  compilers     against  stupidity of    the
programmers".  Also   I  wouldn't like   my  compiler to recognize
incorrectness of the code and silently to call A destructor (after
all, bp  points to A  object, isn't it?) instead of  B one. Here I
mean that

      UNDEFINED BEHAVIOR HAS AN ERROR MEANING.
   UNDEFINED BEHAVIOR ALWAYS RESULTING IN CORRECT EXECUTION OF
    A CODE WITH UNDEFINED BEHAVIOR, IS FORBIDDEN.

    In the previous example, we are dealing with code that has
undefined behavior. The code is obviously incorrect. However, in
the case of an imaginary compiler which can recognize the true type
of an object, the compiler could call the A destructor in the line
marked "#here" (because its behavior would be undefined anyway). In
such a case, the invalid code would be executed correctly in every
case (ALWAYS), and so the behavior of the compiler cannot be
considered proper.
    C++, partly by itself and partly as the heir of C, stands on
the principle of not restricting a programmer's actions as long as
they are syntactically correct. So, in the example, in the line
marked "#here", the imaginary compiler should honestly try to
destroy a B object and to free its memory. Applying the B
destructor to an A object will most probably cause "undefined
behavior", but its harm will depend on the particular A and B
classes (and obviously such an application will not ALWAYS be
harmless).


     3. No kidding.

    C++ is not a wizard's language. Generally, its behavior is
understandable, clear enough, and quite predictable. Creation of
C++ objects consists of two parts: memory allocation and object
construction itself. Even if this separation into two independent
parts is not obvious and is not stated directly by the ANSI draft,
that doesn't change anything, because the separation follows from
the language definition anyway.
    C++ memory management is ALMOST ALWAYS typeless. Here "ALMOST
ALWAYS" stands for all cases covered by a standard-conforming
language implementation, except for an enumerable number of cases
where a programmer explicitly changes the language's behavior by
means of the language's own constructs (and, I should add, on his
own responsibility). To illustrate the term, the following example
can be considered:

//----------------------------------------------------------
    A* ap=new A;
//----------------------------------------------------------

What does the code do? Obviously, it creates  a new object of type
A.  Yes, but I should say it creates a new object of type A ALMOST
ALWAYS, just because the definition of A class may be as follows:

//----------------------------------------------------------
    class A {
      // some stuff
    public:
      // some stuff
      void* operator new(unsigned) { exit(1); } // not for heap usage
    };
//----------------------------------------------------------

and in such case obviously the  previous statement will not create
any A object.
    C++ memory management is ALMOST ALWAYS typeless because the
default operators new and new[] have no knowledge about the type of
the object they allocate memory for. On the other hand, these
operators are the only C++ mechanism for managing memory.
Moreover, this and only this part of object creation can be
completely overloaded by a programmer, and that clearly separates
it into an independent stage of object creation.
    Object construction has hidden features (like the
initialization of virtual function tables) and can be only partly
(through constructors and destructors) influenced by a programmer.
Meanwhile, the introduction of "placement new" in the ANSI standard
completed the separation of object construction into an independent
part, since an object can be created with no allocation of memory
(at any place chosen by the programmer, not only on the stack) and
can be legally destroyed with no freeing of the memory (by a direct
call to its destructor).
    If we recall that, according to C++ principles, there should be
no difference between objects and their behavior regardless of
their placement, we have to agree that allocation of memory for an
object and construction of the object in the allocated memory are
two independent stages ALMOST ALWAYS.
    Taking all this into account, even definitions of operator new
and delete can be reconsidered  to eliminate number of  duplicated
definitions, for example:

                        new T(<arg-list>);
                represents shorthand for the sequence
           new(::operator new(sizeof(T))) T(<arg-list>);
                                or
           new(T::operator new(sizeof(T))) T(<arg-list>);
                  if T::operator new is defined.

  delete tp; // where tp is a non-null pointer to an object of type T
             represents shorthand for the atomic sequence
if (tp) { tp->~T(); ::operator delete(<cast-to-mostly-base>tp); }
                                or
if (tp) { tp->~T(); T::operator delete(<cast-to-mostly-base>tp); }
                if T::operator delete is defined.

    Actually, similar redefinition can  be done for new[]/delete[]
also.
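
    The two stages described above can be written out by hand for a
single object. The following is only a sketch under stated
assumptions: the type Tracked and the functions create and destroy
are hypothetical names invented for the illustration, the class has
no operator new of its own and is not polymorphic, and the compiler
is assumed to support exceptions. It shows the pairing of calls; it
does not claim to reproduce what any particular compiler emits.

```cpp
#include <cassert>
#include <new>

// Hypothetical example type that counts its live instances.
struct Tracked {
  static int live;
  int value;
  explicit Tracked(int v) : value(v) { ++live; }
  ~Tracked() { --live; }
};
int Tracked::live = 0;

// Hand-written equivalent of `new Tracked(v)`.
Tracked* create(int v)
{
  void* raw = ::operator new(sizeof(Tracked));  // stage 1: allocation
  try {
    return new (raw) Tracked(v);                // stage 2: construction
  }
  catch (...) {
    ::operator delete(raw);   // constructor failed: give the storage back
    throw;
  }
}

// Hand-written equivalent of `delete p`.
void destroy(Tracked* p)
{
  if (p) {
    assert(Tracked::live > 0);  // there must be an object to destroy
    p->~Tracked();              // stage 1: destruction
    ::operator delete(p);       // stage 2: deallocation
  }
}
```

    Each stage has exactly one counterpart here: storage obtained
from ::operator new is returned to ::operator delete, and an object
built by placement new is ended by an explicit destructor call.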

        4. Alignment and memory allocation.

    In the example at the start of this article, the following code

//----------------------------------------------------------
    assert(sizeof(A)==sizeof(B));

    B* bptr=new B[size];
    A* aptr=(A*)bptr;
//----------------------------------------------------------

really seems dangerous.

    Fergus Henderson writes:
       "This assertion is not guaranteed to succeed.
        It would take an extremely perverse implementation
        for it to fail, however, so I think it would be very
        portable, even though it is not strictly guaranteed
        to work."

This sentence seems reasonable but, in fact, this assertion
guarantees correctness ALMOST ALWAYS, and under that
circumstance this check is absolutely portable.

    ANSI draft says:
      18.4.1.1  Single-object forms
      Effects:
        The allocation function called by a new-expression to
        allocate size  bytes  of storage  suitably aligned to
        represent any object of that size.

      18.4.1.2  Array forms
      Effects:
        The  allocation  function called by the array form of
        a new-expression to  allocate  size bytes  of  storage
        suitably aligned to represent any array object of that
        size or smaller.32)

    We see that ANSI draft   says that allocated memory should  be
suitably  aligned  for any object  and  any array  object with the
single  limitation  of   size.  We can  also   recall C++  pointer
arithmetic   and what is sizeof  of  some particular object (which
contains concept of alignment in the object itself).
    An implementation does not have to be extremely perverse for the
assertion to fail. It can be a very simple one, where the B
class, for example, has its own operator new[] allocating memory with
a specific alignment suitable for B but not for any other class, and
for A in particular (and even that is impossible for compilers
that still do not support overloading of operator new[]).
    But this situation is exactly the "ALMOST ALWAYS" case. If I
take the responsibility of overloading operator new[] for a specific
class, it is also my responsibility to take care with operations
like the one I am performing under the condition that the object
sizes are equal.

           The language  should  stand  on "ALMOST ALWAYS"
        correctness (and de facto it does that).      If a
        programmer  is  doing  something   that  is ALMOST
        ALWAYS correct,  the  language should  demonstrate
        behavior as if it were ALWAYS correct.   Since an
        ALMOST ALWAYS correct action may become  incorrect
        only as a result of a programmer's activity, it is
        also the responsibility of the programmer  to take
        care in the usage of such actions.
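
    The size condition, and the alignment condition that goes with
it, can be stated at compile time with facilities that did not exist
when this was written. The sketch below assumes a C++11 compiler
(static_assert and alignof); the trait name storage_compatible and
the two stand-in classes are hypothetical. Passing the check says
only that the storage layout fits. It does not answer the separate
question of whether delete[] through the other pointer type is
defined.

```cpp
#include <cstddef>

// Hypothetical stand-ins for the two classes of the example.
struct A { unsigned u; };
struct B { unsigned v; };

// True when storage laid out as an array of From can hold objects of
// type To element for element: same element size, and To is not more
// strictly aligned than From.
template <class From, class To>
struct storage_compatible {
  static const bool value =
    sizeof(From) == sizeof(To) && alignof(To) <= alignof(From);
};

// Fails at compile time instead of in an assert() at run time.
static_assert(storage_compatible<B, A>::value,
              "storage for B cannot be reused for A");
```

    A class-specific operator new[] for B, the case discussed above,
is invisible to this check: it constrains the types, not the
addresses a particular allocation function actually returns.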


    Fergus Henderson continues:
//----------------------------------------------------------
    B* bptr=new B[size];
    A* aptr=(A*)bptr;
//----------------------------------------------------------
        "This cast has unspecified behavior. (See 5.2.9
         [expr.cast.reinterpret]/8.). However, I would
         expect it to work on most implementations."

    This article of the ANSI draft treats the operation as
unspecified in the case of a cast from T1 to T2 and back when there
is a difference in alignment between T1 and T2. Obviously, that is
not our case (at least because the allocation function is defined to
return storage suitably aligned for any object).



   5. Rest in peace.

    The following piece of the code has been considered clear
by all experts:

//----------------------------------------------------------
  for (unsigned j=size; j>0;)
    (bptr+(--j))->~B();

  for (unsigned i=0; i<size; ++i)
    new(aptr+i) A;
//----------------------------------------------------------


  6. Undefined behavior (continued).

    All experts have considered the deletion of an array allocated
in such a strange manner to be the most incorrect point, one with
undefined behavior.

    Steve Clamage writes:
        "You do have an operation  with undefined results,
        however. In effect you are doing this:
     A* p = (A*)new B[2];
     delete [] p;
        The rule  is  that  the type of the pointer passed
        to delete[]  must  match  the type  of the pointer
        returned  by  new[],  which is  not  the case here.
        The compiler is not required to diagnose the error,
        and the language  definition does  not say what the
        result is."

    Definitely, I am not doing that. If the standard of the
language allows my code to be interpreted in such a manner, then
there is something wrong with the standard! But even if the
behavior is proclaimed to be undefined, I would like to recall what
I said in section 2. If the uncertainty of the behavior were
properly defined, then the code wouldn't have undefined behavior!
(It would be very interesting to test the code on some other
compilers. I suppose that the code has, in fact, sufficiently
well-defined behavior de facto, as a result of the most logical
implementation of operators new/delete, and that only SPARCompiler
C++, for some unknown reason, stores a pointer to the destructor
function together with the array size.)

    Fergus Henderson agrees with Steve Clamage:
        "This has undefined behaviour.      It contravenes 5.3.5
         [expr.delete]/2,   which   says  that  the   expression
         passed  to `delete []'  must be  a pointer to the first
         element of an array of objects allocated with `new []';
         this is not the case,  because  although there once was
         such an  array at  that memory location,   its lifetime
         ended  when the memory  was  overwritten  by  the calls
         to placement new (see 3.8[basic.life]/1)."

    There is at least one self-contradictory point in that
conclusion. Of course, the lifetime of all the B objects has ended,
but why does that mean the end of the array's life? Or, on the
contrary, if the end of the lifetime of the B objects means the end
of the lifetime of the array, then the creation of the A objects
should mean the creation of a new array, shouldn't it?

    Article 5.3.5.2 of ANSI draft says something slightly
    different:
        "...In the second  alternative  (delete array), the value
        of  the  operand  of  delete  shall  be  a pointer  to an
        array created by a new-expression without a new-placement
        specification. If not, the behavior is undefined."

    So delete takes as its argument POINTER TO ARRAY (even objects
are not mentioned). No one says that
        ...pointer passed to delete[] must match the type
        of the pointer returned  by  new[]...
        ...the expression passed to `delete []' must be a
        pointer to the first element of an array of objects
        allocated with `new []'...
All of that is already an INTERPRETATION of the standard, and it
also shows that the standard permits such interpretations.
Meanwhile, C++ memory management does not seem to need such strong
restrictions, simply because memory can be managed separately from
object construction/destruction and can be reused without
reallocation. I agree that everything said above regarding
operators new/delete is the point of view of common sense (one
should use delete and delete[] with the same pointer to the same
type he got from new and new[], and not play with the pointers, in
99.9999% of the cases where he uses new/delete at all, and in 100%
of cases if he doesn't understand what he is doing), but that has
nothing in common with the boundaries of proper language
processing.
    And here we really arrive at the final point. The ANSI draft
gives no strong definition of an array that would rule out
ambiguous interpretations. And the above-mentioned
common-sense-based understanding of arrays has every right to
exist.

      7. No kidding (continued).

    I suppose that this  ambiguity in interpretation of arrays and
operators new/delete  should be  eliminated from the  standard.  I
would propose the following additions in supposition that they:

      -- do   not  conflict  with  any    of   previous standard's
         definitions;
      -- do not change anything in the standard's common principles
         and common understanding of the standard, except for a very
         specific point with no influence upon the standard itself;
      -- mostly do not affect existing implementations of the
         language, since some implementations use this idea de
         facto and others can easily be corrected;
      -- mostly do not affect existing C++ applications, because
         they concern a very specific point in the standard with
         uncommon and extremely rare (if any) usage;
      -- will make the standard more logically complete.

    Addition to array definition [dcl.array]:
        Array of N T object represents contiguous amount of
        memory  of  suitable  size  and  alignment  with  N
        non-overlapping objects  of type T  placed into the
        memory with no gaps and each properly aligned.

    Addition to operator delete [expr.delete] ("above" here
    means all previously said by the standard):
        In either alternative, the type of the deleted object
        is evaluated  as described above and according to the
        type of the actual operand.

    In my opinion, the additions may be considered excessive,
but experience shows they are not.

        8. Renew

    In my opinion, the previously mentioned additions would make
the example at the start of this article absolutely clear and
ALMOST ALWAYS correct, with no discussion (I continue to insist
that the example is so even now, but with discussion). But what
about renew? The C++ standard accepts new keywords and new features
with great difficulty. But maybe it makes sense, at least for
completeness, to add to the standard's (fairly) new family of
various_cast<T> something like the following:

dynamic_sizeof(object); // should return sizeof evaluated in
                        // run time, for example:

//----------------------------------------------------------
class A {
public:
  unsigned u;
  A();
  virtual ~A();
};

class B : public A {
public:
  int i;
  B();
  ~B();
};

int main(void)
{
  A* a=new B;
  cout << sizeof(A)          << endl; // prints 8
  cout << sizeof(B)          << endl; // prints 12
  cout << sizeof(*a)         << endl; // prints 8
  cout << dynamic_sizeof(*a) << endl; // prints 12
  return 0;
}
//----------------------------------------------------------
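
    Without a language extension, the effect of dynamic_sizeof can
be approximated for polymorphic classes with a virtual function.
This is only a sketch: the member name dynamic_size is hypothetical,
the classes are simplified versions of the ones above with inline
constructors, and the actual sizes depend on the implementation (on
some ABIs sizeof(B) may even equal sizeof(A)).

```cpp
#include <cstddef>

class A {
public:
  unsigned u;
  A() : u(0) {}
  virtual ~A() {}
  // Hypothetical stand-in for dynamic_sizeof(*this).
  virtual std::size_t dynamic_size() const { return sizeof(*this); }
};

class B : public A {
public:
  int i;
  B() : i(0) {}
  ~B() {}
  // Has to be repeated in every derived class; if it is forgotten,
  // the size of the nearest base that does define it is reported.
  virtual std::size_t dynamic_size() const { return sizeof(*this); }
};
```

    With these definitions, for `A* a = new B;` the expression
sizeof(*a) is still evaluated at compile time and equals sizeof(A),
while a->dynamic_size() yields sizeof(B). The need to repeat the
override by hand is exactly what a built-in operator would remove.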


array_sizeof(pointer); // should return sizeof of an array, for example

//----------------------------------------------------------
class A {
public:
  unsigned u;
  A();
};

int main(void)
{
  A* a=new A;
  A  ar[2];
  A* ap=new A[4];

  cout << array_sizeof(a)/sizeof(A)  << endl; // prints 0
  cout << array_sizeof(ar)/sizeof(A) << endl; // prints 2
  cout << array_sizeof(ap)/sizeof(A) << endl; // prints 4
  return 0;
}
//----------------------------------------------------------

    I suppose that the use of arrays of objects with no
destructors (where a compiler may optimize away the storage of the
number of elements) has become fairly rare in contemporary C++ (and
in a lot of cases, the number of elements in such arrays is already
known at compilation time), so the language could fairly easily
provide a programmer with such information as the number of
elements in an array.
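
    To make the idea concrete, here is a sketch of an allocation
scheme that always keeps the element count in front of the
elements, so that the count can be read back from the pointer alone.
It is an illustration under stated assumptions, not a description of
any compiler: the names count_header, new_counted, counted_size and
delete_counted are hypothetical, C++11 is assumed (for
std::max_align_t), overflow of the size computation is not checked,
and the pointer must come from new_counted, never from new[].

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Header placed in front of the elements. The union pads it so that
// the first element stays suitably aligned for any ordinary type.
union count_header {
  std::size_t count;
  std::max_align_t align_;
};

// Allocate storage for a header plus n elements, then
// value-initialize the elements one by one.
template <class T>
T* new_counted(std::size_t n)
{
  void* raw = ::operator new(sizeof(count_header) + n * sizeof(T));
  count_header* h = new (raw) count_header;
  h->count = n;
  T* first = reinterpret_cast<T*>(h + 1);
  std::size_t i = 0;
  try {
    for (; i < n; ++i)
      new (first + i) T();
  }
  catch (...) {
    while (i > 0)
      first[--i].~T();
    ::operator delete(raw);
    throw;
  }
  return first;
}

// Number of elements, read back from the hidden header.
template <class T>
std::size_t counted_size(const T* first)
{
  assert(first != 0);
  return (reinterpret_cast<const count_header*>(first) - 1)->count;
}

// Destroy the elements in reverse order and release the whole block.
template <class T>
void delete_counted(T* first)
{
  if (!first)
    return;
  count_header* h = reinterpret_cast<count_header*>(first) - 1;
  for (std::size_t i = h->count; i > 0;)
    first[--i].~T();
  ::operator delete(h);
}
```

    The cost is one header per array even for types without
destructors, which is the storage an implementation is otherwise
free to omit.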

--
//------------------------------------------------------------------
// Opinions expressed here are my own only
// Constantine Antonovich                       const@orbotech.co.il
//------------------------------------------------------------------
[ To submit articles: Try just posting with your newsreader.
        If that fails, use mailto:std-c++@ncar.ucar.edu
  FAQ:    http://reality.sgi.com/employees/austern_mti/std-c++/faq.html
  Policy: http://reality.sgi.com/employees/austern_mti/std-c++/policy.html
  Comments? mailto:std-c++-request@ncar.ucar.edu
]





Author: fjh@munta.cs.mu.OZ.AU (Fergus Henderson)
Date: 1996/02/27
"Constantine Antonovich:" <const@Orbotech.Co.IL> writes:

>        2. Undefined behavior.
>
>    Of  course, some  constructions  in   a possible program   may
>produce undefined behavior.  Nevertheless, even undefined behavior
>should have some definition.

I disagree.  Imposing restrictions on the behaviour of code
which has "undefined behaviour" would be a very confusing use
of terminology, and more importantly would place constraints on
implementors that would prevent efficient implementations.

You may perhaps be able to make a case that certain specific cases
of undefined behaviour ought to be instead made merely unspecified.
But in the general case, a write via a stray pointer might cause
arbitrary instructions to be executed, thus violating any guarantees.
There is basically no way that an implementation which allows writes
via stray pointers can prevent this.  Preventing stray pointer writes
is possible, but would have a significant efficiency penalty.

>Let's consider the following code:
>
>//----------------------------------------------------------
>      A* ap=new A;
>      B* bp=reinterpret_cast<B*>(ap);
>      delete bp; // #here
>//----------------------------------------------------------
>
>Without any doubt, result of the code  executed in "#here" line is
>undefined.   But  definitely  I  wouldn't   like,  as a result  of
>uncertainty of the behavior, my  compiler to send email  complaint
>to  some   League "C++  compilers     against  stupidity of    the
>programmers".

You might not like it, but depending on the types `A' and `B',
this could cause heap corruption on some implementations.
For example, the compiler might represent pointers to `A' as
pointing not directly to the start of the memory for `A', but
instead pointing to some fixed offset before or after the start.
The reinterpret_cast<B*> might not adjust for that offset.

Heap corruption could cause writes through stray pointers,
which could cause arbitrary code to be executed.  The result
of that could be anything -- sending email complaints is
unlikely, but can't be ruled out.

>Also   I  wouldn't like   my  compiler to recognize
>incorrectness of the code and silently to call A destructor (after
>all, bp  points to A  object, isn't it?) instead of  B one. Here I
>mean that
>
>      UNDEFINED BEHAVIOR HAS AN ERROR MEANING.
>   UNDEFINED BEHAVIOR ALWAYS RESULTING IN CORRECT EXECUTION OF
>    A CODE WITH UNDEFINED BEHAVIOR, IS FORBIDDEN.

Preventing implementations from doing "the right thing" when
executing code with undefined behaviour would place unreasonable
constraints on implementors that would prevent efficient implementations.

For example, a write via a stray pointer might write to some unused
memory, in which case it will have no effect, and the code may continue
to work.  There is basically no way the implementation can avoid this
other than by checking for stray pointer writes, which as I said before
would have a significant efficiency impact.

>        4. Alignment and memory allocation.
>
>    In the example at the start of this article, the following code
>
>//----------------------------------------------------------
>    assert(sizeof(A)==sizeof(B));
>
>    B* bptr=new B[size];
>    A* aptr=(A*)bptr;
>//----------------------------------------------------------
>
>really seems dangerous.
>
>    Fergus Henderson writes:
>       "This assertion is not guaranteed to succeed.
>        It would take an extremely perverse implementation
>        for it to fail, however, so I think it would be very
>        portable, even though it is not strictly guaranteed
>        to work."
>
>This sentence seems reasonable but, in fact, this assertion
>guarantees correctness ALMOST ALWAYS, and under that
>circumstance this check is absolutely portable.

I don't understand what you are saying here.  (This assertion guarantees
the correctness of what?  Under which circumstances?)

>    An implementation does not have to be extremely perverse for the
>assertion to fail. It can be a very simple one, where the B
>class, for example, has its own operator new[] allocating memory with
>a specific alignment suitable for B but not for any other class, and
>for A in particular (and even that is impossible for compilers
>that still do not support overloading of operator new[]).

In the test case, A and B were both identical classes (other than the
class name); I think only a perverse implementation would allocate them
different sizes.

Your talk about B having `operator new[]' is describing a hypothetical
piece of source code, not a hypothetical C++ implementation; I don't
see how it is relevant.

>    Fergus Henderson continues:
>//----------------------------------------------------------
>    B* bptr=new B[size];
>    A* aptr=(A*)bptr;
>//----------------------------------------------------------
>        "This cast has unspecified behavior. (See 5.2.9
>         [expr.cast.reinterpret]/8.). However, I would
>         expect it to work on most implementations."
>
>    This  article    of ANSI draft    interprets the  operation as
>unspecified in case of cast from T1 to T2 and back and if there is
>difference in alignment of T1 and  T2.  Obviously, that is not our
>case (at least because definition of allocation function returning
>suitable for any object alignment).

That's irrelevant, since in your example piece of code, you don't cast
back to `B *'.  5.2.9/8 says that the result in this case is unspecified.

>    Fergus Henderson agrees with Steve Clamage:
>        "This has undefined behaviour.      It contravenes 5.3.5
>         [expr.delete]/2,   which   says  that  the   expression
>         passed  to `delete []'  must be  a pointer to the first
>         element of an array of objects allocated with `new []';
>         this is not the case,  because  although there once was
>         such an  array at  that memory location,   its lifetime
>         ended  when the memory  was  overwritten  by  the calls
>         to placement new (see 3.8[basic.life]/1)."
>
>    There is at least one self-contradictory point in that
>conclusion. Of course, the lifetime of all the B objects has ended,
>but why does that mean the end of the array's life?

The lifetime of an array object is distinct from the lifetime of its
elements.  The ending of the lifetime of all the B objects (which you
did by explicitly calling the destructor) doesn't end the lifetime of
the array.

What ends the lifetime of the array is reusing the memory (which
you did by calling placement new).  3.8 is quite clear about this:
"the lifetime of an array object ... ends when the storage which the array ...
occupies is reused or released."

>Or, on the
>contrary, if the end of the lifetime of the B objects means the end
>of the lifetime of the array, then the creation of the A objects
>should mean the creation of a new array, shouldn't it?

Yes.  But this array was not "created with new []" as required by
5.3.5.2, which you quote below, and thus the behaviour is still
undefined.

>    Article 5.3.5.2 of ANSI draft says something slightly
>    different:
>        "...In the second  alternative  (delete array), the value
>        of  the  operand  of  delete  shall  be  a pointer  to an
>        array created by a new-expression without a new-placement
>        specification. If not, the behavior is undefined."
>
>
>    So delete takes as its argument POINTER TO ARRAY (even objects
>are not mentioned).

This is indeed a minor error; it has been changed in the January 96
draft to say "... shall be a pointer to the first element of an array ...".

>No one says that
>        ...pointer passed to delete[] must match the type
>        of the pointer returned  by  new[]...
>        ...the expression passed to `delete []' must be a
>        pointer to the first element of an array of objects
>        allocated with `new []'...

It does say that the dynamic type of the pointer passed to delete[] must
match its static type, which is effectively the same thing.
(See 5.3.5/3.)

>    And here we really arrive to the final point. ANSI draft gives
>no   strong   array   definition   to   disable  ambiguous   array
>interpretation.  And    above-mentioned  common-sense based  array
>understanding has all rights to exist.

The draft is definitely not easy reading, but I think it does define
things sufficiently well to allow unambiguous interpretation in this
case.

>    Addition to array definition [dcl.array]:
>        Array of N T object represents contiguous amount of
>        memory  of  suitable  size  and  alignment  with  N
>        non-overlapping objects  of type T  placed into the
>        memory with no gaps and each properly aligned.

I think this is already covered by the wording on array lifetimes (3.8)
and the statement in [dcl.array] that an array object consists of
contiguously allocated elements.  But adding that wording might make
things clearer.

>    Addition to operator delete [expr.delete] ("above" here
>    means all previously said by the standard):
>        In either alternative, the type of the deleted object
>        is evaluated  as described above and according to the
>        type of the actual operand.

The draft says that if the dynamic type is different from the static
type, then the behaviour is undefined.  So the above text would be
a change, not just a clarification.  I'm not at all convinced that
it would be a change for the better.  Leaving the behaviour
undefined in this case gives implementors more flexibility, and
I think that flexibility is probably more important than defining
the behaviour of programs like yours which play tricks with memory
allocation.

--
Fergus Henderson              WWW: http://www.cs.mu.oz.au/~fjh
fjh@cs.mu.oz.au               PGP: finger fjh@128.250.37.3
---
[ To submit articles: try just posting with your news-reader.
                      If that fails, use mailto:std-c++@ncar.ucar.edu
  FAQ:      http://reality.sgi.com/employees/austern_mti/std-c++/faq.html
  Policy:   http://reality.sgi.com/employees/austern_mti/std-c++/policy.html
  Comments? mailto:std-c++-request@ncar.ucar.edu.
]





Author: phalpern@truffle.ultranet.com (Pablo Halpern)
Date: 1996/02/29
Raw View
Although I don't agree with most of what Constantine Antonovich says, I
do wonder about one particular set of situations. Many allocation and
re-allocation systems use the default new() operator to allocate memory
in the form of char arrays. This "raw" memory is then recast to an array
of specific type, which is initialized one element at a time:

  template <class T>
  T* dup_array(const T* p, size_t s)
  {
    T *p2 = reinterpret_cast<T*>(new char[s * sizeof(T)]);  // note 1
    while (s-- > 0)
      new (p2 + s) T(p[s]);  // Initialize using copy constructor
    return p2;
  }

  template <class T>
  void f(T* p, size_t s)
  {
    T* newp = dup_array(p, s);
    // do something with newp
    delete [] newp;     // note 2
  }

I believe that something similar to the line marked "note 1" is common
practice for this sort of operation. However, I believe that the line
marked "note 2" is undefined behavior. Is there a way the standard
can be modified so that the above code becomes well-defined and works as
intended? How does the STL deal with this in its "uninitialized copy"
operation?

There is another way to allocate raw memory, but here the operations are
even less defined (I believe):

   void *p2 = operator new[] (s * sizeof(T));
   delete p2;  // What does this do? p2 was not the result of a normal
               // new expression.
   delete reinterpret_cast<T*>(p2);  // What does this do?

How do we solve these problems in a standard-conforming way? Does there
need to be special language for char arrays or void * allocations? (Or
is there already? I haven't found it.)

Thanks,

Pablo Halpern                   phalpern@truffle.ultranet.com

I am self-employed. Therefore, my opinions *do* represent
those of my employer.





Author: kuehl@uzwil.informatik.uni-konstanz.de (Dietmar Kuehl)
Date: 1996/02/29
Raw View
Hi,

Pablo Halpern (phalpern@truffle.ultranet.com) wrote:
:   template <class T>
:   T* dup_array(const T* p, size_t s)
:   {
:     T *p2 = reinterpret_cast<T*>(new char[s * sizeof(T)]);  // note 1

This is in no way portable. However, it IS portable to allocate "raw"
memory with 'operator new()' or 'operator new[]()'. See below...
:   }

:   void f(T* p, size_t s)
:   {
:     T* newp = dup_array(p, s);
:     // do something with newp
:     delete [] newp;     // note 2

This results indeed in undefined behavior.
:   }

: I believe that something similar to the line marked "note 1" is common
: practice for this sort of operation. However, I believe that the line
: marked "note 2" is undefined behavior. Is there a way the standard
: can be modified so that the above code becomes well-defined and works as
: intended? How does the STL deal with this in its "uninitialized copy"
: operation?

There is no need to modify the standard because there is already a
method available to deal with "raw" memory (see below), which is used,
e.g., by the STL.

: There is another way to allocate raw memory, but here the operations are
: even less defined (I believe):

The operations are well defined, if used correctly...

:    void *p2 = operator new[] (s * sizeof(T));
:    delete p2;  // What does this do? p2 was not the result of a normal
:                // new expression.
:    delete reinterpret_cast<T*>(p2);  // What does this do?

Both attempts to release the memory pointed to by 'p2' result in
undefined behavior: 'delete' can only be applied to objects allocated
with 'new T' (for some type 'T').  Likewise, 'delete[]' can only release
array objects allocated with 'new T[i]' (for some type 'T' and some
value 'i'). The whole trick is to distinguish 'new T' from 'operator
new()' and 'delete ptr' from 'operator delete()' (and correspondingly
the array variants):  They are just different operations (see e.g.
"More Effective C++", S. Meyers, Addison-Wesley, Item 8).

I will describe the stuff for arrays because apparently the "renew"
topic is currently "in" :-) 'new T[i]' does something like:

  #include <stddef.h>  // size_t
  #include <new>

  template <class T>
  T *new_T_array(size_t size) // a "homegrown" 'new T[size]'
  {
    // Allocate enough memory to hold the requested array plus
    // additional information about the size of the array. As
    // written this is NOT portable (the array may be insufficiently
    // aligned), but the non-portable part is unnecessary if the
    // operations are encapsulated in an array class like 'vector':
    // the size can be stored somewhere else.

    void   *ptr  = operator new[](size * sizeof(T) + sizeof(size_t));
    size_t *sptr = static_cast<size_t*>(ptr);
    *sptr = size; // first store the size
    // ... then get the address of the actual array
    T *Tptr = static_cast<T*>(static_cast<void*>(sptr + 1));

    // now initialize the array with placement new
    for (size_t idx = 0; idx < size; ++idx)
      new (Tptr + idx) T();
    return Tptr;
  }

However, this is NOT how it is actually implemented; it only depicts
what is basically going on and how it COULD be implemented (well, not
really: it is also necessary to take care of exceptions in the
constructors and to destroy the already constructed objects if there is
an exception). In particular, it shows the basics of implementing your
own routine to allocate an array which basically feels like a built-in
array (i.e. how 'operator new[]()' and "placement new" are used to
create and initialize the array). Unfortunately, you cannot 'delete[]'
an array created with 'new_T_array()'. Instead, you have to mimic the
behavior of 'delete[]' using explicit destruction and 'operator
delete[]()'.  Here are the details:

  template <class T>
  void delete_T_array(T *Tptr)
  {
    // Again this code is somewhat non-portable, but again this doesn't
    // matter because an array class can do a better (and portable) job
    // by storing the size somewhere else.

    // First retrieve the size from where it was stored in 'new_T_array()':
    size_t *sptr = static_cast<size_t*>(static_cast<void*>(Tptr)) - 1;
    // ... then destroy the individual objects in the reverse order of
    // construction:
    for (size_t idx = *sptr; idx-- > 0; )
      Tptr[idx].~T();

    // Finally release the allocated "raw" memory
    operator delete[](static_cast<void*>(sptr));
  }

Again, this is basically how it works (but e.g. with exception handling
excluded). If you need to do similar stuff, you can do it like this.
However, you should place such stuff in a class (e.g. because it is
much simpler to do this portably). If you do so, you are likely to end
up with a class similar to 'vector'.  So, why bother...?
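
As a rough sketch of what "place such stuff in a class" could look like,
the following hypothetical `raw_array` (the name and interface are
assumptions, not part of any library) keeps the element count in the object
itself, so no size header has to be squeezed in front of the elements and
the alignment problem disappears. It also handles a throwing constructor.
Overflow of `n * sizeof(T)` is not checked here.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical fixed-size array class: raw storage from operator new,
// elements built with placement new, size kept in the object itself.
template <class T>
class raw_array {
public:
    explicit raw_array(std::size_t n)
        : data_(static_cast<T*>(operator new(n * sizeof(T)))), size_(0)
    {
        try {
            for (; size_ < n; ++size_)
                new (data_ + size_) T();      // construct each element in place
        } catch (...) {
            destroy_all();                    // undo the elements already built
            operator delete(data_);
            throw;
        }
    }
    ~raw_array() { destroy_all(); operator delete(data_); }

    std::size_t size() const { return size_; }
    T& operator[](std::size_t i) { assert(i < size_); return data_[i]; }

private:
    raw_array(const raw_array&);              // copying omitted in this sketch
    raw_array& operator=(const raw_array&);

    void destroy_all()
    {
        while (size_ > 0)
            data_[--size_].~T();              // reverse order of construction
    }

    T*          data_;
    std::size_t size_;
};
```

A `raw_array<int> a(4);` then behaves like a small fixed-size array whose
elements are value-initialized; anything beyond that (copying, growth) is
what 'vector' already provides.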

Concerning the 'renew' problem: using a combination of the stuff in
'new_T_array()' and 'delete_T_array()' you can implement a
'renew_T_array()' which is capable of 'renewing' arrays allocated with
'new_T_array()' (or, even easier, if used within a class, of 'renewing'
the internal storage used to represent the array).  However, there is
still a minor inefficiency in comparison with 'realloc()': the need to
store the size. Since the memory management almost certainly "knows"
the size of the memory object somehow (I guess in all implementations it
does know the size, but I can also imagine that there could be a strange
environment where this is not the case...), the number of elements in
the memory object could potentially be deduced from this size. There is
no portable method to do so. But I believe that this additional
'size_t' is not so heavy-weight as to be a real problem which justifies
some very specific language extension.
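
A 'renew' along these lines might look like the sketch below. It is only an
illustration: `renew_array` is a hypothetical name, the caller is assumed to
track both sizes (as an array class would), the old block is assumed to come
from 'operator new()', and exception safety is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical 'renew': copies the first min(old_n, new_n) elements into a
// fresh raw block, value-initializes any extra elements, then destroys and
// releases the old block. Not exception-safe: a throwing constructor leaks.
template <class T>
T* renew_array(T* old_p, std::size_t old_n, std::size_t new_n)
{
    assert(old_p != 0 || old_n == 0);         // a null block must be empty

    T* new_p = static_cast<T*>(operator new(new_n * sizeof(T)));
    std::size_t keep = old_n < new_n ? old_n : new_n;

    std::size_t i = 0;
    for (; i < keep; ++i)
        new (new_p + i) T(old_p[i]);          // copy the surviving elements
    for (; i < new_n; ++i)
        new (new_p + i) T();                  // value-initialize the rest

    while (old_n > 0)
        old_p[--old_n].~T();                  // destroy old elements in reverse
    operator delete(old_p);                   // old_p must come from operator new
    return new_p;
}
```

Unlike 'realloc()', this always allocates a new block and copies; it cannot
grow the block in place, which is the inefficiency discussed above.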

: How do we solve these problems in a standard-conforming way. Does there
: need to be special language for char arrays or void * allocations. (Or
: is there already? I haven't found it.)

How to solve these problems in a standard-conforming way: see above.
...  and no, there are no special allocations for 'char' arrays or
'void*', nor are they needed.
--
dietmar.kuehl@uni-konstanz.de
http://www.informatik.uni-konstanz.de/~kuehl
I am a realistic optimist - that's why I appear to be slightly pessimistic





Author: tony@online.tmx.com.au (Tony Cook)
Date: 1996/03/01
Raw View
Pablo Halpern (phalpern@truffle.ultranet.com) wrote:
: Although I don't agree with most of what Constantine Antonovich says, I
: do wonder about one particular set of situations. Many allocation and
: re-allocation systems use the default new() operator to allocate memory
: in the form of char arrays. This "raw" memory is then recast to an array
: of specific type, which is initialized one element at a time:

:   template <class T>
:   T* dup_array(const T* p, size_t s)
:   {
:     T *p2 = reinterpret_cast<T*>(new char[s * sizeof(T)]);  // note 1
:     while (s-- > 0)
:       new (p2 + s) T(p[s]);  // Initialize using copy constructor
:   }

:   void f(T* p, size_t s)
:   {
:     T* newp = dup_array(p, s);
:     // do something with newp
:     delete [] newp;     // note 2
:   }

: I believe that something similar to the line marked "note 1" is common
: practice for this sort of operation. However, I believe that the line
: marked "note 2" is undefined behavior.

Yes it is.

: Is there a way the standard
: can be modified so that the above code becomes well-defined and works as
: intended?

This isn't likely - most delete[] implementations where a
non-trivial destructor is involved will use extra information before
the beginning of the array - which isn't present in your example
(and that information is implementation dependent, so you can't set
it portably in your own code).
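
One way to observe this extra information, sketched under the assumption
that a class-specific 'operator new[]' is acceptable for the experiment
(`Tracked` and `array_overhead` are hypothetical names): record how many
bytes the new-expression actually requests and compare with
n * sizeof(T). The difference is implementation-dependent and may be zero.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

struct Tracked {
    static std::size_t last_request;   // bytes requested by the last new[]
    int value;
    ~Tracked() {}                      // user-provided, hence non-trivial

    static void* operator new[](std::size_t bytes)
    {
        last_request = bytes;          // remember what the compiler asked for
        return ::operator new[](bytes);
    }
    static void operator delete[](void* p) { ::operator delete[](p); }
};
std::size_t Tracked::last_request = 0;

// Returns how many bytes beyond n * sizeof(Tracked) were requested.
std::size_t array_overhead(std::size_t n)
{
    Tracked* p = new Tracked[n];
    assert(Tracked::last_request >= n * sizeof(Tracked));
    std::size_t extra = Tracked::last_request - n * sizeof(Tracked);
    delete [] p;
    return extra;
}
```

Memory obtained as 'new char[...]' carries no such bookkeeping for T, which
is why 'delete []' on the recast pointer cannot be expected to work.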

: How does the STL deal with this in its "uninitialized copy"
: operation?

It destroys the original objects using their destructors (and the
containers call operator delete to release the memory.)

For example:
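   template <class T>
   void f(T* p, size_t s)
   {
     // p must point to s objects built in memory from operator new()
     p += s;
     while (s-- > 0)
       (--p)->~T();        // destroy in reverse order; p ends at the start
     operator delete(p);   // release the raw memory
   }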
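
For the copying side, a minimal sketch using the standard
'std::uninitialized_copy' from <memory> could look as follows. The helper
names `dup_raw` and `destroy_raw` are hypothetical; the point is only that
raw memory from 'operator new()' is paired with explicit destruction and
'operator delete()', never with 'delete []'.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>   // std::uninitialized_copy
#include <new>
#include <string>

// Hypothetical helper: duplicate n objects into freshly allocated raw memory.
template <class T>
T* dup_raw(const T* src, std::size_t n)
{
    assert(src != 0 || n == 0);
    T* dst = static_cast<T*>(operator new(n * sizeof(T)));
    try {
        std::uninitialized_copy(src, src + n, dst);  // copy-constructs in place
    } catch (...) {
        operator delete(dst);   // uninitialized_copy destroyed its partial work
        throw;
    }
    return dst;
}

// Hypothetical helper: the matching release for memory from dup_raw().
template <class T>
void destroy_raw(T* p, std::size_t n)
{
    for (std::size_t i = n; i-- > 0; )
        p[i].~T();              // explicit destructor calls, reverse order
    operator delete(p);         // matches operator new in dup_raw()
}
```

The caller has to remember the element count, just as the containers do.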


: There is another way to allocate raw memory, but here the operations are
: even less defined (I believe):

:    void *p2 = operator new[] (s * sizeof(T));
:    delete p2;  // What does this do? p2 was not the result of a normal
:                // new expression.
:    delete reinterpret_cast<T*>(p2);  // What does this do?

You should use:
 operator delete[](p2);
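
A small self-contained sketch of that pairing, assuming 'int' elements so
that no destructor calls are needed (`sum_in_raw_block` is a hypothetical
name used only for illustration):

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Builds n ints in raw memory from operator new[], sums them, and
// releases the memory with the matching operator delete[].
int sum_in_raw_block(std::size_t n)
{
    assert(n <= static_cast<std::size_t>(-1) / sizeof(int)); // guard the multiplication

    void* raw = operator new[](n * sizeof(int));   // raw bytes, no objects yet
    int* a = static_cast<int*>(raw);
    int sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        new (a + i) int(static_cast<int>(i) + 1);  // placement new builds each element
        sum += a[i];
    }
    // int has a trivial destructor, so no explicit destructor calls are needed.
    operator delete[](raw);                        // matches operator new[]
    return sum;
}
```

For a type with a non-trivial destructor, each element would have to be
destroyed explicitly before the call to 'operator delete[]'.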

--
        Tony Cook - tony@online.tmx.com.au
                    100237.3425@compuserve.com