Topic: operators new[]/delete[]
Author: "Constantine Antonovich:" <const@Orbotech.Co.IL>
Date: 1996/02/26
Several days ago I sent the following code to this newsgroup
(asking whether the code is incorrect according to the current ANSI
standard or whether I have a bug in my compiler):
//----------------------------------------------------------
#include <iostream.h>
#include <assert.h>
#include <new.h>
class A {
public:
    A(void) { cout << "A constructed" << endl; }
    ~A() { cout << "A destructed" << endl; }
};
class B {
public:
    B(void) { cout << "B constructed" << endl; }
    ~B() { cout << "B destructed" << endl; }
};
A* foo_allocate(unsigned size)
{
    assert(sizeof(A)==sizeof(B));
    B* bptr=new B[size];
    A* aptr=(A*)bptr; // A* aptr=reinterpret_cast<A*>(bptr); is more correct
                      // but my compilers do not support that yet.
    for (unsigned j=size; j>0;)  // this place corrected according
        (bptr+(--j))->~B();      // to a remark from a person whose name
                                 // I have lost, to my regret.
    for (unsigned i=0; i<size; ++i)
        new(aptr+i) A;
    return aptr;
}
int main(void)
{
    A* arr=foo_allocate(2);
    delete [] arr; // #here
    return 0;
}
//----------------------------------------------------------
I considered it a problem that one of my compilers treats the
line marked "#here" as destruction of an array of B objects (the B
class destructors are called).
I have received several answers (I also sent this code
personally to Steve Clamage and he kindly answered me) concluding
that, according to the standard, the program contains operations
with undefined results and so cannot be considered correct.
Nevertheless, after some reflection I am going to insist on the
following:
-- The ANSI working papers (April, 1995) leave the correctness of
the above-mentioned code open to interpretation;
-- If so, this matter needs to be defined more precisely to
eliminate differences of interpretation between compiler
vendors;
-- The code can be interpreted as having absolutely correct and
well-defined behavior.
In the following text, I am going to argue for this point of view.
1. A little history of the code sample.
The above-mentioned code sample may look like a play of
imagination with no practical weight. However, this code was
born from another one that makes a little more sense.
Some time ago, I noticed periodically recurring discussions
about the need for a renew operation in C++. Generally, I never
felt deprived by the absence of renew, but seeing arguments for
its usefulness again and again, I started to think: what the hell
is the problem with it, if it's not possible to implement it by
means of the language itself?
So I wrote the following example, trying to avoid the redundant
operations that usually occur in reallocation done in the classic
manner:
//----------------------------------------------------------
template<class T>
class Allocator {
private:
    // raw storage with the size of T but no constructor or destructor
    struct filler { char filler_[sizeof(T)]; };
public:
    static T* allocate_array(unsigned elm);
    static void dup_array(T* dst, T* src, unsigned elm);
    static void fill_array(T* dst, unsigned elm);
};
template<class T>
T* Allocator<T>::allocate_array(unsigned elm)
{
    return (T*)new filler[elm];
}
template<class T>
void Allocator<T>::dup_array(T* dst, T* src, unsigned elm)
{
    for (unsigned i=0; i<elm; ++i)
        new(dst+i) T(src[i]); // copy-construct in place
}
template<class T>
void Allocator<T>::fill_array(T* dst, unsigned elm)
{
    for (unsigned i=0; i<elm; ++i)
        new(dst+i) T; // default-construct in place
}
template<class T>
T* realloc(T*& array, unsigned old_size, unsigned new_size)
{
    set_new_handler(0);
    T* tmp=Allocator<T>::allocate_array(new_size);
    if (tmp) {
        if (new_size>old_size) {
            Allocator<T>::dup_array(tmp,array,old_size);
            Allocator<T>::fill_array(tmp+old_size,new_size-old_size);
        }
        else
            Allocator<T>::dup_array(tmp,array,new_size);
        delete [] array; array=tmp;
    }
    return tmp;
}
//----------------------------------------------------------
The general idea was to allocate an array without initializing
its objects and to use just the copy constructor to copy the old
elements into the newly allocated array (avoiding redundant
creation of objects with their default constructor when they are
immediately replaced by a subsequent assignment).
There are no problems using such a technique in container classes,
where all memory and object management is completely hidden from
the user, but the imaginary renew operation should be applicable
to regular arrays like:
//----------------------------------------------------------
A* ap=new A[4];
realloc(ap,4,8);
delete [] ap;
//----------------------------------------------------------
Obviously, the code contained in the
Allocator<T>::allocate_array function produces the problem
mentioned above.
2. Undefined behavior.
Of course, some constructions in a possible program may
produce undefined behavior. Nevertheless, even undefined behavior
should have some definition. Let's consider the following code:
//----------------------------------------------------------
A* ap=new A;
B* bp=reinterpret_cast<B*>(ap);
delete bp; // #here
//----------------------------------------------------------
Without any doubt, the result of the code executed in the "#here"
line is undefined. But I definitely wouldn't like my compiler, as a
result of the uncertainty of the behavior, to send an email
complaint to some League of "C++ compilers against stupidity of the
programmers". Nor would I like my compiler to recognize the
incorrectness of the code and silently call the A destructor (after
all, bp points to an A object, doesn't it?) instead of the B one.
Here I mean that
UNDEFINED BEHAVIOR HAS AN ERROR MEANING.
UNDEFINED BEHAVIOR ALWAYS RESULTING IN CORRECT EXECUTION OF
A CODE WITH UNDEFINED BEHAVIOR, IS FORBIDDEN.
In the previous example, we are dealing with code that has
undefined behavior. The code is obviously incorrect. Now, an
imaginary compiler that can recognize the true type of an object
could call the A destructor in the line marked "#here" (because its
behavior would be undefined anyway). In such a case, the invalid
code would execute correctly in every case (ALWAYS), and so the
behavior of the compiler cannot be considered proper.
C++, partly by itself, partly as the heir of C, stands on the
principle of not second-guessing the correctness of a programmer's
actions as long as they are syntactically correct. So, in the
example, in the line marked "#here", the imaginary compiler should
honestly try to destroy a B object and free its memory. Applying
the B destructor to an A object will most probably cause "undefined
behavior", but its harm will depend on the particular A and B
classes (and obviously it will not be harmless ALWAYS).
3. No kidding.
C++ is not a wizard's language. Generally, its behavior is
understandable, clear enough and quite predictable. Creation of a
C++ object consists of two parts: memory allocation and the object
construction itself. Even if this separation into two independent
parts is not obvious and is not stated outright by the ANSI draft,
that doesn't change anything, because the separation follows from
the language definition anyway.
C++ memory management is ALMOST ALWAYS typeless. Here "ALMOST
ALWAYS" stands for all cases covered by a standard-conforming
language implementation, except for an enumerable number of cases
where a programmer explicitly changes the language behavior by
means of language constructs (and, I should add, on his own
responsibility). To illustrate the term, consider the following
example:
//----------------------------------------------------------
A* ap=new A;
//----------------------------------------------------------
What does the code do? Obviously, it creates a new object of type
A. Yes, but I should say it creates a new object of type A ALMOST
ALWAYS, just because the definition of A class may be as follows:
//----------------------------------------------------------
class A {
// some stuff
public:
// some stuff
void* operator new(size_t) { exit(1); return 0; } // not for heap usage
};
//----------------------------------------------------------
and in such a case the previous statement obviously will not create
any A object.
C++ memory management is ALMOST ALWAYS typeless because the
default operators new and new[] have no knowledge of the type of
object they allocate memory for. On the other hand, these
operators are the only C++ mechanism for managing memory.
Moreover, this and only this part of object creation can be
completely replaced by a programmer, and that clearly separates it
into an independent stage of object creation.
Object construction has hidden features (like initialization of
virtual function tables) and can only partly (through constructors
and destructors) be influenced by a programmer. Meanwhile, the
introduction of "placement new" in the ANSI standard completed the
separation of object construction into an independent part, since
an object can be created with no allocation of memory (at any place
chosen by the programmer, not only on the stack) and can be legally
destroyed with no freeing of the memory (by a direct call to its
destructor).
If we recall that, according to C++ principles, there should be
no difference between objects and their behavior regardless of
their placement, we have to agree that allocation of memory for an
object and construction of the object in the allocated memory
represent two independent stages ALMOST ALWAYS.
Taking all this into account, even the definitions of operators
new and delete can be reconsidered to eliminate a number of
duplicated definitions, for example:
new T(<arg-list>);
represents shorthand for the sequence
new(::operator new(sizeof(T))) T(<arg-list>);
or
new(T::operator new(sizeof(T))) T(<arg-list>);
if T::operator new is defined.
delete tp; // where tp is a non-null pointer to an object of type T
represents shorthand for the atomic sequence
if (tp) { tp->~T(); ::operator delete(<cast-to-mostly-base>tp); }
or
if (tp) { tp->~T(); T::operator delete(<cast-to-mostly-base>tp); }
if T::operator delete is defined.
Actually, a similar redefinition can be done for new[]/delete[]
as well.
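A minimal sketch of this two-stage view, assuming present-day C++ rather than the 1995 draft. The names Tracked, create_in_two_steps and destroy_in_two_steps are hypothetical and chosen only for illustration; the code places the two stages side by side and does not claim that any compiler implements new and delete exactly this way.

```cpp
#include <cassert>
#include <new>      // ::operator new, ::operator delete, placement new

// Hypothetical class that counts its live objects.
struct Tracked {
    static int live;
    int value;
    explicit Tracked(int v) : value(v) { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Stage 1: allocate raw memory. Stage 2: construct the object in it.
inline Tracked* create_in_two_steps(int v)
{
    void* raw = ::operator new(sizeof(Tracked));
    try {
        return new (raw) Tracked(v);
    } catch (...) {
        ::operator delete(raw);   // give the memory back if construction throws
        throw;
    }
}

// Stage 1: destroy the object. Stage 2: release the memory.
inline void destroy_in_two_steps(Tracked* p)
{
    if (p) {
        p->~Tracked();
        ::operator delete(p);
    }
}
```

Because the memory here comes from ::operator new directly, it is released with ::operator delete and never handed to a delete expression.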
4. Alignment and memory allocation.
In the example that opens this article, the following code
//----------------------------------------------------------
assert(sizeof(A)==sizeof(B));
B* bptr=new B[size];
A* aptr=(A*)bptr;
//----------------------------------------------------------
really seems dangerous.
Fergus Henderson writes:
"This assertion is not guaranteed to succeed.
It would take an extremely perverse implementation
for it to fail, however, so I think it would be very
portable, even though it is not strictly guaranteed
to work."
This sentence seems to be reasonable but, in fact, this assertion
guarantees correctness ALMOST ALWAYS, and under that
circumstance the check is absolutely portable.
ANSI draft says:
18.4.1.1 Single-object forms
Effects:
The allocation function called by a new-expression to
allocate size bytes of storage suitably aligned to
represent any object of that size.
18.4.1.2 Array forms
Effects:
The allocation function called by the array form of
a new-expression to allocate size bytes of storage
suitably aligned to represent any array object of that
size or smaller.
We see that the ANSI draft says the allocated memory should be
suitably aligned for any object and any array object, with size as
the single limitation. We can also recall C++ pointer arithmetic
and what the sizeof of a particular object is (it already
incorporates the alignment of the object itself).
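The alignment claim can be checked with a small sketch. It assumes a C++11 or later compiler (alignof and std::uintptr_t did not exist in the 1995 draft), and the helper names is_aligned and raw_storage_fits_both are hypothetical. It only observes what one implementation does; it proves nothing about the wording of the draft.

```cpp
#include <cassert>
#include <cstddef>   // std::size_t
#include <cstdint>   // std::uintptr_t
#include <new>       // ::operator new[], ::operator delete[]

struct A { unsigned u; };
struct B { unsigned u; };

// True when address p is a multiple of alignment a.
inline bool is_aligned(const void* p, std::size_t a)
{
    return reinterpret_cast<std::uintptr_t>(p) % a == 0;
}

// Allocate raw storage for n objects of size sizeof(A) and report
// whether its address is suitably aligned for both A and B.
inline bool raw_storage_fits_both(std::size_t n)
{
    void* raw = ::operator new[](n * sizeof(A));
    bool ok = is_aligned(raw, alignof(A)) && is_aligned(raw, alignof(B));
    ::operator delete[](raw);
    return ok;
}
```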
An implementation doesn't have to be extremely perverse for the
assertion condition to fail. It can be a very simple one where
class B, for example, has its own operator new[] allocating memory
with a specific alignment suitable for B but not for any other
class, and A in particular (and even that is impossible for
compilers that do not yet support overloading of operator new[]).
But this situation is exactly the "ALMOST ALWAYS" case. If I
take the responsibility of overloading operator new[] for a
specific class, it's also my responsibility to take care when using
operations like the one I am performing under the condition that
the object sizes are equal.
The language should stand on "ALMOST ALWAYS"
correctness (and de facto it does). If a
programmer is doing something that is ALMOST
ALWAYS correct, the language should behave as
if it were ALWAYS correct. Since an ALMOST
ALWAYS correct action may become incorrect only as
a result of the programmer's own activity, it is also
the programmer's responsibility to take care when
using such actions.
Fergus Henderson continues:
//----------------------------------------------------------
B* bptr=new B[size];
A* aptr=(A*)bptr;
//----------------------------------------------------------
"This cast has unspecified behavior. (See 5.2.9
[expr.cast.reinterpret]/8.). However, I would
expect it to work on most implementations."
This article of the ANSI draft treats the operation as
unspecified in the case of a cast from T1 to T2 and back when there
is a difference in alignment between T1 and T2. Obviously, that is
not our case (at least because the allocation function is defined
to return storage suitably aligned for any object).
5. Rest in peace.
The following piece of code has been considered clear
by all experts:
//----------------------------------------------------------
for (unsigned j=size; j>0;)
(bptr+(--j))->~B();
for (unsigned i=0; i<size; ++i)
new(aptr+i) A;
//----------------------------------------------------------
6. Undefined behavior (continued).
All experts have considered deletion of the array allocated in
such a strange manner to be the most incorrect point, with
undefined behavior.
Steve Clamage writes:
"You do have an operation with undefined results,
however. In effect you are doing this:
A* p = (A*)new B[2];
delete [] p;
The rule is that the type of the pointer passed
to delete[] must match the type of the pointer
returned by new[], which is not the case here.
The compiler is not required to diagnose the error,
and the language definition does not say what the
result is."
Definitely, I am not doing that. If the standard of the
language allows my code to be interpreted in such a manner then
there is something wrong with the standard! But even if the
behavior is proclaimed to be undefined, I would like to recall what
I said in section 2. If the uncertainty of the behavior were
properly defined then the code wouldn't have undefined behavior!
(It would be very interesting to test the code on some other
compilers. I suppose that the code in fact has sufficiently
defined behavior de facto, as a result of the most logical
implementation of operators new/delete, and that only SPARCompiler
C++, for an unknown reason, stores a pointer to the destructor
function together with the array size).
Fergus Henderson agrees with Steve Clamage:
"This has undefined behaviour. It contravenes 5.3.5
[expr.delete]/2, which says that the expression
passed to `delete []' must be a pointer to the first
element of an array of objects allocated with `new []';
this is not the case, because although there once was
such an array at that memory location, its lifetime
ended when the memory was overwritten by the calls
to placement new (see 3.8[basic.life]/1)."
There is at least one self-contradictory point in that
conclusion. Of course, the lifetime of all the B objects has ended,
but why does that mean the end of the array's life? Or, on the
contrary, if the end of the lifetime of the B objects means the end
of the lifetime of the array, then creation of the A objects should
mean creation of a new array, shouldn't it?
Article 5.3.5.2 of the ANSI draft says something slightly
different:
"...In the second alternative (delete array), the value
of the operand of delete shall be a pointer to an
array created by a new-expression without a new-placement
specification. If not, the behavior is undefined."
So delete takes as its argument a POINTER TO ARRAY (objects
are not even mentioned). No one says that
...pointer passed to delete[] must match the type
of the pointer returned by new[]...
...the expression passed to `delete []' must be a
pointer to the first element of an array of objects
allocated with `new []'...
All of that is already an INTERPRETATION of the standard, which
also means that the standard allows such interpretations.
Meanwhile, C++ memory management does not seem to need such strong
restrictions, simply because memory can be managed separately from
object construction/destruction and can be reused without
reallocation.
I agree that everything said above regarding operators
new/delete is the common-sense point of view (one should use delete
and delete[] with the same pointer to the same type one got from
new and new[], and not play with the pointers in 99.9999% of the
cases where one uses new/delete at all, and in 100% of the cases if
one doesn't understand what one is doing), but that has nothing in
common with the boundaries of proper language processing.
And here we really arrive at the final point. The ANSI draft
gives no array definition strong enough to rule out ambiguous
interpretation of arrays. And the common-sense understanding of
arrays mentioned above has every right to exist.
7. No kidding (continued).
I suppose that this ambiguity in the interpretation of arrays
and operators new/delete should be eliminated from the standard. I
would propose the following additions on the supposition that they:
-- do not conflict with any of the standard's previous
definitions;
-- do not change anything in the standard's common principles
or the common understanding of the standard, except for a very
specific point with no influence upon the standard itself;
-- have little influence on existing implementations of
the language, since some implementations use this idea de
facto and others can easily be corrected;
-- have little influence on existing C++ applications,
because they concern a very specific point in the
standard that is uncommon and extremely rarely (if ever)
used;
-- will make the standard more logically complete.
Addition to array definition [dcl.array]:
An array of N objects of type T represents a contiguous
region of memory of suitable size and alignment with N
non-overlapping objects of type T placed into the
memory with no gaps and each properly aligned.
Addition to operator delete [expr.delete] ("above" here
means all previously said by the standard):
In either alternative, the type of the deleted object
is evaluated as described above and according to the
type of the actual operand.
In my opinion, the additions may be considered excessive,
but experience shows they are not.
8. Renew
In my opinion, the previously mentioned additions would make
the example that opens this article absolutely clear and ALMOST
ALWAYS correct with no discussion (I continue to insist that the
example is so even now, but with discussion). But what about renew?
The C++ standard accepts new keywords and new features with great
difficulty. But maybe it makes sense, at least for completeness, to
add to the standard's (fairly) new family of various_cast<T>
something like the following:
dynamic_sizeof(object); // should return sizeof evaluated in
// run time, for example:
//----------------------------------------------------------
class A {
public:
unsigned u;
A();
virtual ~A();
};
class B : public A {
public:
int i;
B();
~B();
};
int main(void)
{
A* a=new B;
cout << sizeof(A) << endl; // prints 8
cout << sizeof(B) << endl; // prints 12
cout << sizeof(*a) << endl; // prints 8
cout << dynamic_sizeof(*a) << endl; // prints 12
return 0;
}
//----------------------------------------------------------
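Something close to the proposed dynamic_sizeof can already be written by hand. The following sketch assumes a C++11 compiler; the virtual member dynamic_size() is a hypothetical stand-in for the proposed operator, not an existing language feature, and every derived class has to override it itself.

```cpp
#include <cassert>
#include <cstddef>

class A {
public:
    unsigned u;
    A() : u(0) {}
    virtual ~A() {}
    // Hypothetical stand-in for dynamic_sizeof: reports the size of A.
    virtual std::size_t dynamic_size() const { return sizeof(A); }
};

class B : public A {
public:
    int i;
    B() : i(0) {}
    // Each derived class reports its own size.
    std::size_t dynamic_size() const override { return sizeof(B); }
};
```

Through an A* that points to a B, sizeof(*a) still yields sizeof(A), since it is evaluated from the static type, while a->dynamic_size() yields sizeof(B).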
array_sizeof(pointer); // should return sizeof of an array, for example
//----------------------------------------------------------
class A {
public:
unsigned u;
A();
};
int main(void)
{
A* a=new A;
A ar[2];
A* ap=new A[4];
cout << array_sizeof(a)/sizeof(A) << endl; // prints 0
cout << array_sizeof(ar)/sizeof(A) << endl; // prints 2
cout << array_sizeof(ap)/sizeof(A) << endl; // prints 4
return 0;
}
//----------------------------------------------------------
I suppose that the use of arrays of objects with no
destructors (where a compiler may optimize away storage of the
number of elements) has become rare enough in contemporary C++ (and
in a lot of cases the number of elements in such arrays is already
known at compilation time), so the language could quite easily
provide a programmer with such information as the number of
elements in an array.
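Whether an implementation keeps the element count next to the array can be observed indirectly. The sketch below assumes a C++11 compiler; Counted and array_overhead are hypothetical names. A class-specific operator new[] records how many bytes the new-expression requested, and any excess over n * sizeof(Counted) is bookkeeping space whose size and contents are entirely implementation-specific.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical class whose array allocation function records the request size.
struct Counted {
    static std::size_t last_request;
    unsigned u;
    Counted() : u(0) {}
    ~Counted() {}   // non-trivial destructor: delete[] must know how many to destroy

    static void* operator new[](std::size_t bytes)
    {
        last_request = bytes;
        return ::operator new[](bytes);
    }
    static void operator delete[](void* p) noexcept
    {
        ::operator delete[](p);
    }
};
std::size_t Counted::last_request = 0;

// Returns how many bytes beyond n * sizeof(Counted) were requested for new Counted[n].
inline std::size_t array_overhead(std::size_t n)
{
    Counted* arr = new Counted[n];
    std::size_t extra = Counted::last_request - n * sizeof(Counted);
    delete[] arr;
    return extra;
}
```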
--
//------------------------------------------------------------------
// Opinions expressed here are my own only
// Constantine Antonovich const@orbotech.co.il
//------------------------------------------------------------------
[ To submit articles: Try just posting with your newsreader.
If that fails, use mailto:std-c++@ncar.ucar.edu
FAQ: http://reality.sgi.com/employees/austern_mti/std-c++/faq.html
Policy: http://reality.sgi.com/employees/austern_mti/std-c++/policy.html
Comments? mailto:std-c++-request@ncar.ucar.edu
]
Author: fjh@munta.cs.mu.OZ.AU (Fergus Henderson)
Date: 1996/02/27
"Constantine Antonovich:" <const@Orbotech.Co.IL> writes:
> 2. Undefined behavior.
>
> Of course, some constructions in a possible program may
>produce undefined behavior. Nevertheless, even undefined behavior
>should have some definition.
I disagree. Imposing restrictions on the behaviour of code
which has "undefined behaviour" would be a very confusing use
of terminology, and more importantly would place constraints on
implementors that would prevent efficient implementations.
You may perhaps be able to make a case that certain specific cases
of undefined behaviour ought to be instead made merely unspecified.
But in the general case, a write via a stray pointer might cause
arbitrary instructions to be executed, thus violating any guarantees.
There is basically no way that an implementation which allows writes
via stray pointers can prevent this. Preventing stray pointer writes
is possible, but would have a significant efficiency penalty.
>Let's consider the following code:
>
>//----------------------------------------------------------
> A* ap=new A;
> B* bp=reinterpret_cast<B*>(ap);
> delete bp; // #here
>//----------------------------------------------------------
>
>Without any doubt, result of the code executed in "#here" line is
>undefined. But definitely I wouldn't like, as a result of
>uncertainty of the behavior, my compiler to send email complaint
>to some League "C++ compilers against stupidity of the
>programmers".
You might not like it, but depending on the types `A' and `B',
this could cause heap corruption on some implementations.
For example, the compiler might represent pointers to `A' as
pointing not directly to the start of the memory for `A', but
instead pointing to some fixed offset before or after the start.
The reinterpret_cast<B*> might not adjust for that offset.
Heap corruption could cause writes through stray pointers,
which could cause arbitrary code to be executed. The result
of that could be anything -- sending email complaints is
unlikely, but can't be ruled out.
>Also I wouldn't like my compiler to recognize
>incorrectness of the code and silently to call A destructor (after
>all, bp points to A object, isn't it?) instead of B one. Here I
>mean that
>
> UNDEFINED BEHAVIOR HAS AN ERROR MEANING.
> UNDEFINED BEHAVIOR ALWAYS RESULTING IN CORRECT EXECUTION OF
> A CODE WITH UNDEFINED BEHAVIOR, IS FORBIDDEN.
Preventing implementations from doing "the right thing" when
executing code with undefined behaviour would place unreasonable
constraints on implementors that would prevent efficient implementations.
For example, a write via a stray pointer might write to some unused
memory, in which case it will have no effect, and the code may continue
to work. There is basically no way the implementation can avoid this
other than by checking for stray pointer writes, which as I said before
would have a significant efficiency impact.
> 4. Alignment and memory allocation.
>
> In the starting the article example, the following code
>
>//----------------------------------------------------------
> assert(sizeof(A)==sizeof(B));
>
> B* bptr=new B[size];
> A* aptr=(A*)bptr;
>//----------------------------------------------------------
>
>really seems dangerous.
>
> Fergus Henderson writes:
> "This assertion is not guaranteed to succeed.
> It would take an extremely perverse implementation
> for it to fail, however, so I think it would be very
> portable, even though it is not strictly guaranteed
> to work."
>
>This sentence seems to be reasonable but, in deal, this assertion
>guarantees the correctness ALMOST ALWAYS and under that
>circumstance this check is absolutely portable.
I don't understand what you are saying here. (This assertion guarantees
the correctness of what? Under which circumstances?)
> An implementation hasn't to be extremely perverse the
>assertion condition to fail. It can be very simple one where B
>class for example has its own operator new[] allocating memory in
>specific alignment suitable for B but not for any other class and
>A one particularly (and even that is impossible for compilers
>still not supporting overloading of operator new[]).
In the test case, A and B were both identical classes (other than the
class name); I think only a perverse implementation would allocate them
different sizes.
Your talk about B having `operator new[]' is describing a hypothetical
piece of source code, not a hypothetical C++ implementation; I don't
see how it is relevant.
> Fergus Henderson continues:
>//----------------------------------------------------------
> B* bptr=new B[size];
> A* aptr=(A*)bptr;
>//----------------------------------------------------------
> "This cast has unspecified behavior. (See 5.2.9
> [expr.cast.reinterpret]/8.). However, I would
> expect it to work on most implementations."
>
> This article of ANSI draft interprets the operation as
>unspecified in case of cast from T1 to T2 and back and if there is
>difference in alignment of T1 and T2. Obviously, that is not our
>case (at least because definition of allocation function returning
>suitable for any object alignment).
That's irrelevant, since in your example piece of code, you don't cast
back to `B *'. 5.2.9/8 says that the result in this case is unspecified.
> Fergus Henderson agrees with Steve Clamage:
> "This has undefined behaviour. It contravenes 5.3.5
> [expr.delete]/2, which says that the expression
> passed to `delete []' must be a pointer to the first
> element of an array of objects allocated with `new []';
> this is not the case, because although there once was
> such an array at that memory location, its lifetime
> ended when the memory was overwritten by the calls
> to placement new (see 3.8[basic.life]/1)."
>
> There is at least one self-contradictory point in that
>conclusion. Of course, lifetime of all B objects had been ended,
>by why does that mean end of the array life?
The lifetime of an array object is distinct from the lifetime of its
elements. The ending of the lifetime of all the B objects (which you
did by explicitly calling the destructor) doesn't end the lifetime of
the array.
What ends the lifetime of the array is reusing the memory (which
you did by calling placement new). 3.8 is quite clear about this:
"the lifetime of an array object ... ends when the storage which the array ...
occupies is reused or released."
>Or in contrary, if
>end of life-time of B objects means end of life-time of the array,
>so creation of A objects should mean creation of new array,
>shouldn't it?.
Yes. But this array was not "created with new []" as required by
5.3.5.2, which you quote below, and thus the behaviour is still
undefined.
> Article 5.3.5.2 of ANSI draft says something slightly
> different:
> "...In the second alternative (delete array), the value
> of the operand of delete shall be a pointer to an
> array created by a new-expression without a new-placement
> specification. If not, the behavior is undefined."
>
>
> So delete takes as its argument POINTER TO ARRAY (even objects
>are not mentioned).
This is indeed a minor error; it has been changed in the January 96
draft to say "... shall be a pointer to the first element of an array ...".
>No one says that
> ...pointer passed to delete[] must match the type
> of the pointer returned by new[]...
> ...the expression passed to `delete []' must be a
> pointer to the first element of an array of objects
> allocated with `new []'...
It does say that the dynamic type of the pointer passed to delete[] must
match its static type, which is effectively the same thing.
(See 5.3.5/3.)
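To make the static/dynamic type rule concrete, here is a small sketch of the two cases that are well-defined in present-day C++. Shape, Circle and the two helper functions are hypothetical names used only for illustration. The single-object form allows the static type to be a base class with a virtual destructor; the array form has no such allowance, so only the matching-type case is shown.

```cpp
#include <cassert>

struct Shape {
    static int destroyed;
    virtual ~Shape() { ++destroyed; }
};
int Shape::destroyed = 0;

struct Circle : Shape {
    static int circles_destroyed;
    ~Circle() override { ++circles_destroyed; }
};
int Circle::circles_destroyed = 0;

// Well-defined: single-object delete, and the static type Shape
// is a base class with a virtual destructor.
inline void delete_single_through_base()
{
    Shape* s = new Circle;
    delete s;
}

// Well-defined: array delete where the static and dynamic element type agree.
inline void delete_array_same_type(unsigned n)
{
    Circle* c = new Circle[n];
    delete[] c;
}
```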
> And here we really arrive to the final point. ANSI draft gives
>no strong array definition to disable ambiguous array
>interpretation. And above-mentioned common-sense based array
>understanding has all rights to exist.
The draft is definitely not easy reading, but I think it does define
things sufficiently well to allow unambiguous interpretation in this
case.
> Addition to array definition [dcl.array]:
> Array of N T object represents contiguous amount of
> memory of suitable size and alignment with N
> non-overlapping objects of type T placed into the
> memory with no gaps and each properly aligned.
I think this is already covered by the wording on array lifetimes (3.8)
and the statement in [dcl.array] that an array object consists of
contiguously allocated elements. But adding that wording might make
things clearer.
> Addition to operator delete [expr.delete] ("above" here
> means all previously said by the standard):
> In either alternative, the type of the deleted object
> is evaluated as described above and according to the
> type of the actual operand.
The draft says that if the dynamic type is different from the static
type, then the behaviour is undefined. So the above text would be
a change, not just a clarification. I'm not at all convinced that
it would be a change for the better. Leaving the behaviour
undefined in this case gives implementors more flexibility, and
I think that flexibility is probably more important than defining
the behaviour of programs like yours which play tricks with memory
allocation.
--
Fergus Henderson WWW: http://www.cs.mu.oz.au/~fjh
fjh@cs.mu.oz.au PGP: finger fjh@128.250.37.3
Author: phalpern@truffle.ultranet.com (Pablo Halpern)
Date: 1996/02/29
Although I don't agree with most of what Constantine Antonovich says, I
do wonder about one particular set of situations. Many allocation and
re-allocation systems use the default new() operator to allocate memory
in the form of char arrays. This "raw" memory is then recast to an array
of specific type, which is initialized one element at a time:
template <class T>
T* dup_array(const T* p, size_t s)
{
    T *p2 = reinterpret_cast<T*>(new char[s * sizeof(T)]); // note 1
    while (s-- > 0)
        new (p2 + s) T(p[s]); // Initialize using copy constructor
    return p2;
}
template <class T>
void f(T* p, size_t s)
{
    T* newp = dup_array(p, s);
    // do something with newp
    delete [] newp; // note 2
}
I believe that something similar to the line marked "note 1" is common
practice for this sort of operation. However, I believe that the line
marked "note 2" is undefined behavior. Is there a way the standard
can be modified so that the above code becomes well-defined and works as
intended? How does the STL deal with this in its "uninitialized copy"
operation?
There is another way to allocate raw memory, but here the operations are
even less defined (I believe):
void *p2 = operator new[] (s * sizeof(T));
delete p2; // What does this do? p2 was not the result of a normal
// new expression.
delete reinterpret_cast<T*>(p2); // What does this do?
How do we solve these problems in a standard-conforming way? Does there
need to be special language for char arrays or void * allocations? (Or
is there already? I haven't found it.)
Thanks,
Pablo Halpern phalpern@truffle.ultranet.com
I am self-employed. Therefore, my opinions *do* represent
those of my employer.
---
Author: kuehl@uzwil.informatik.uni-konstanz.de (Dietmar Kuehl)
Date: 1996/02/29 Raw View
Hi,
Pablo Halpern (phalpern@truffle.ultranet.com) wrote:
: template <class T>
: T* dup_array(const T* p, size_t s)
: {
: T *p2 = reinterpret_cast<T*>(new char[s * sizeof(T)]); // note 1
This is in no way portable. However, it IS portable to allocate "raw"
memory with 'operator new()' or 'operator new[]()'. See below...
: }
: void f(T* p, size_t s)
: {
: T* newp = dup_array(p, s);
: // do something with newp
: delete [] newp; // note 2
This does indeed result in undefined behavior.
: }
: I believe that something similar to the line marked "note 1" is common
: practice for this sort of operation. However, I believe that the line
: marked "note 2" is undefined behavior. Is there a way the standard
: can be modified so that the above code becomes well-defined and works as
: intended? How does the STL deal with this in its "uninitialized copy"
: operation?
There is no need to modify the standard because there is already a
method available to deal with "raw" memory (see below), which is used,
e.g., by the STL.
: There is another way to allocate raw memory, but here the operations are
: even less defined (I believe):
The operations are well defined, if used correctly...
: void *p2 = operator new[] (s * sizeof(T));
: delete p2; // What does this do? p2 was not the result of a normal
: // new expression.
: delete reinterpret_cast<T*>(p2); // What does this do?
Both attempts to release the memory pointed to by 'p2' result in
undefined behavior: 'delete' can only be applied to objects allocated
with 'new T' (for some type 'T'). Likewise, 'delete[]' can only release
array objects allocated with 'new T[i]' (for some type 'T' and some
value 'i'). The whole trick is to distinguish 'new T' from 'operator
new()' and 'delete ptr' from 'operator delete()' (and correspondingly
the array variants): They are just different operations (see e.g.
"More Effective C++", S. Meyers, Addison-Wesley, Item 8).
I will describe the stuff for arrays because apparently the "renew"
topic is currently "in" :-) 'new T[i]' does something like:
#include <new>
#include <stddef.h>
// 'T' stands for some complete type with a default constructor.
T *new_T_array(size_t size) // a "homegrown" 'new T[size]'
{
    // Allocate enough memory to hold the requested array plus
    // additional information about the size of the array. This
    // is as written NOT portable (insufficient alignment) but it
    // is also not necessary to do the non-portable stuff, if the
    // operations are encapsulated in an array class like 'vector':
    // the size can be stored somewhere else.
    void *ptr = operator new[](size * sizeof(T) + sizeof(size_t));
    size_t *sptr = static_cast<size_t*>(ptr);
    *sptr = size; // first store the size
    // ... then get the address of the actual array
    T *Tptr = static_cast<T*>(static_cast<void*>(sptr + 1));
    // now initialize the array, one element at a time
    for (size_t idx = 0; idx < size; ++idx)
        new (Tptr + idx) T(); // placement new: construct in place
    return Tptr;
}
However, this is NOT how it is actually implemented; it depicts what
is basically going on and how it COULD be implemented (well, not
really: it is also necessary to take care of exceptions in the
constructors and to destroy the already constructed objects if there
is an exception). In particular, it shows the basics of how to
implement your own routine to allocate an array which basically feels
like a built-in array (i.e. how 'operator new[]()' and "placement new"
are used to create and initialize the array). Unfortunately, you cannot
'delete[]' an array created with 'new_T_array()'. Instead, you have to
mimic the behavior of 'delete[]' using explicit destruction and
'operator delete[]()'. Here are the details:
void delete_T_array(T *Tptr)
{
    // Again this code is somewhat non-portable but again this doesn't
    // matter because an array class can do a better (and portable) job
    // by storing the size somewhere else.
    // First retrieve the size from where it was stored in 'new_T_array()':
    size_t *sptr = static_cast<size_t*>(static_cast<void*>(Tptr)) - 1;
    // ... then destroy the individual objects in the reverse order
    // of their construction:
    for (size_t idx = *sptr; idx-- > 0; )
        Tptr[idx].~T();
    // Finally release the allocated "raw" memory
    operator delete[](static_cast<void*>(sptr));
}
Again, this is basically how it works (but e.g. with exception handling
excluded). If you need to do similar stuff, you can do it like this.
However, you should place such stuff in a class (e.g. because it is
much simpler to do this portably). If you do so, you are likely to end up
with a class similar to 'vector'. So, why bother...?
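To illustrate the remark about placing this in a class, here is a minimal
sketch of such a wrapper. The name 'SimpleArray' and its interface are
made up for this example; it keeps the element count in a data member
instead of in front of the array, and it undoes partial construction if a
constructor throws. It assumes 'T' has no alignment requirement beyond
what 'operator new[]()' guarantees, and it ignores overflow of
'n * sizeof(T)'.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical class name; a much-reduced cousin of 'vector'.
template <class T>
class SimpleArray {
public:
    explicit SimpleArray(std::size_t n)
        : data_(static_cast<T*>(operator new[](n * sizeof(T)))), size_(0)
    {
        try {
            for (; size_ < n; ++size_)
                new (data_ + size_) T();   // placement new constructs one element
        } catch (...) {
            destroy_all();                 // undo the elements built so far
            operator delete[](data_);      // then release the raw memory
            throw;
        }
    }

    ~SimpleArray()
    {
        destroy_all();
        operator delete[](data_);
    }

    std::size_t size() const { return size_; }
    T& operator[](std::size_t i) { return data_[i]; }

private:
    SimpleArray(const SimpleArray&);            // copying omitted in this sketch
    SimpleArray& operator=(const SimpleArray&);

    void destroy_all()
    {
        while (size_ > 0)
            data_[--size_].~T();           // reverse order of construction
    }

    T* data_;            // raw memory obtained from operator new[]
    std::size_t size_;   // number of currently constructed elements
};
```

Because the size lives in the object, no pointer arithmetic in front of
the array is needed, which is the portability gain mentioned above.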
Concerning the 'renew' problem: Using a combination of the stuff in
'new_T_array()' and 'delete_T_array()' you can implement a
'renew_T_array()' which is capable of renewing arrays allocated with
'new_T_array()' (or, even easier, if used within a class, of renewing
the internal storage used to represent the array). However, there is
still a minor inefficiency in comparison with 'realloc()': the need to
store the size. Since the memory management almost certainly "knows"
the size of the memory object somehow (I guess in all implementations it
does know the size, but I can also imagine that there could be a strange
environment where this is not the case...), the number of elements in
the memory object could potentially be deduced from this size. There is
no portable method to do so. But I believe that this additional
'size_t' is not so heavy-weight as to be a real problem which justifies
some very specific language extension.
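One possible shape for such a 'renew' is sketched below. The function
names 'renew_array' and 'destroy_and_free' are invented for this
example, and the caller passes the old element count explicitly (the
"store the size somewhere else" variant) rather than reading it from in
front of the array. The old block must have come from 'operator new[]()'
and hold exactly 'old_size' constructed elements.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Destroys elements [0, count) in reverse order, then frees the raw block.
template <class T>
void destroy_and_free(T* p, std::size_t count)
{
    while (count > 0)
        p[--count].~T();
    operator delete[](static_cast<void*>(p));
}

// Hypothetical 'renew': returns a new block of 'new_size' elements whose
// leading elements are copies of the old ones. 'old_ptr' may be null if
// 'old_size' is 0.
template <class T>
T* renew_array(T* old_ptr, std::size_t old_size, std::size_t new_size)
{
    T* fresh = static_cast<T*>(operator new[](new_size * sizeof(T)));
    std::size_t built = 0;
    try {
        std::size_t keep = old_size < new_size ? old_size : new_size;
        for (; built < keep; ++built)
            new (fresh + built) T(old_ptr[built]);   // copy surviving elements
        for (; built < new_size; ++built)
            new (fresh + built) T();                 // default-construct the rest
    } catch (...) {
        destroy_and_free(fresh, built);              // old array is left untouched
        throw;
    }
    if (old_ptr)
        destroy_and_free(old_ptr, old_size);
    return fresh;
}
```

Unlike 'realloc()', this always allocates a new block and copies; it
cannot grow the old block in place.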
: How do we solve these problems in a standard-conforming way? Does there
: need to be special language for char arrays or void * allocations? (Or
: is there already? I haven't found it.)
How to solve these problems in a standard-conforming way: See above.
... and no, neither are there special allocations for 'char' arrays or
'void*', nor are they needed.
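As a compact illustration of the distinction drawn above between
'new T' and 'operator new()', the following sketch performs the four
steps separately and records how many objects are alive after each one.
The type 'Counted' and the function 'raw_then_construct' are
hypothetical names used only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical helper type: counts live objects so that the effect of
// each step is visible.
struct Counted {
    static int live;
    Counted() { ++live; }
    ~Counted() { --live; }
};
int Counted::live = 0;

// Stores in 'observed' the number of live objects after each step.
void raw_then_construct(int observed[4])
{
    void* raw = operator new(sizeof(Counted));   // allocation only
    observed[0] = Counted::live;

    Counted* obj = new (raw) Counted;            // construction only
    observed[1] = Counted::live;

    obj->~Counted();                             // destruction only
    observed[2] = Counted::live;

    operator delete(raw);                        // deallocation only
    observed[3] = Counted::live;
}
```

A plain 'new Counted' performs the first two steps together, and
'delete ptr' performs the last two together, which is why the two forms
must not be mixed.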
--
dietmar.kuehl@uni-konstanz.de
http://www.informatik.uni-konstanz.de/~kuehl
I am a realistic optimist - that's why I appear to be slightly pessimistic
---
Author: tony@online.tmx.com.au (Tony Cook)
Date: 1996/03/01 Raw View
Pablo Halpern (phalpern@truffle.ultranet.com) wrote:
: Although I don't agree with most of what Constantine Antonovich says, I
: do wonder about one particular set of situations. Many allocation and
: re-allocation systems use the default new() operator to allocate memory
: in the form of char arrays. This "raw" memory is then recast to an array
: of specific type, which is initialized one element at a time:
: template <class T>
: T* dup_array(const T* p, size_t s)
: {
: T *p2 = reinterpret_cast<T*>(new char[s * sizeof(T)]); // note 1
: while (s-- > 0)
: new (p2 + s) T(p[s]); // Initialize using copy constructor
: return p2;
: }
: template <class T>
: void f(T* p, size_t s)
: {
: T* newp = dup_array(p, s);
: // do something with newp
: delete [] newp; // note 2
: }
: I believe that something similar to the line marked "note 1" is common
: practice for this sort of operation. However, I believe that the line
: marked "note 2" is undefined behavior.
Yes it is.
: Is there a way the standard
: can be modified so that the above code becomes well-defined and works as
: intended?
This isn't likely - most delete[] implementations where a
non-trivial destructor is involved will use extra information before
the beginning of the array - which isn't present in your example
(and that information is implementation dependent, so you can't set
it portably in your own code).
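The extra information can be made visible with a class-specific
'operator new[]' that records how many bytes the new-expression asks
for. This is only a sketch: the type 'Tracked' and the function
'array_overhead' are hypothetical, and the amount of overhead (if any) is
implementation dependent, so no particular value should be expected.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

struct Tracked {
    static std::size_t last_request;   // bytes requested by the latest new[]
    int value;
    ~Tracked() {}                      // user-provided, hence non-trivial

    static void* operator new[](std::size_t bytes)
    {
        last_request = bytes;          // remember what the compiler asked for
        return ::operator new[](bytes);
    }
    static void operator delete[](void* p) { ::operator delete[](p); }
};
std::size_t Tracked::last_request = 0;

// Returns how many bytes beyond n * sizeof(Tracked) were requested.
std::size_t array_overhead(std::size_t n)
{
    Tracked* arr = new Tracked[n];
    std::size_t extra = Tracked::last_request - n * sizeof(Tracked);
    delete[] arr;                      // matches the new[] above
    return extra;
}
```

Memory obtained as 'new char[...]' carries no such bookkeeping for 'T',
which is why handing it to 'delete[]' through a 'T*' cannot be made to
work in general.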
: How does the STL deal with this in its "uninitialized copy"
: operation?
It destroys the original objects using their destructors (and the
containers call operator delete to release the memory).
For example:
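template <class T>
void f(T* p, size_t s)
{
    // p must point to s constructed objects in memory from operator new()
    p += s;
    while (s-- > 0)
        (--p)->~T(); // destroy the objects in reverse order
    operator delete(p); // p is back at the start; release the raw memory
}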
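Putting the pieces together, a well-defined variant of the earlier
'dup_array()' might look as follows. It is a sketch only: it uses
'std::uninitialized_copy' from '<memory>' to construct the copies in raw
memory, and the companion function 'free_array' is a name invented here
to replace the 'delete []' marked "note 2".

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>

// Allocates raw memory and copy-constructs s elements of p into it.
template <class T>
T* dup_array(const T* p, std::size_t s)
{
    T* p2 = static_cast<T*>(operator new(s * sizeof(T)));
    try {
        // Destroys any copies already made if a copy constructor throws.
        std::uninitialized_copy(p, p + s, p2);
    } catch (...) {
        operator delete(p2);
        throw;
    }
    return p2;
}

// Hypothetical counterpart: destroys the s elements, then frees the memory.
template <class T>
void free_array(T* p, std::size_t s)
{
    T* end = p + s;
    while (end != p)
        (--end)->~T();       // reverse order of construction
    operator delete(p);      // matches the operator new() in dup_array
}
```

The caller has to remember the element count, since nothing in the block
returned by 'operator new()' records it.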
: There is another way to allocate raw memory, but here the operations are
: even less defined (I believe):
: void *p2 = operator new[] (s * sizeof(T));
: delete p2; // What does this do? p2 was not the result of a normal
: // new expression.
: delete reinterpret_cast<T*>(p2); // What does this do?
You should use:
operator delete[](p2);
--
Tony Cook - tony@online.tmx.com.au
100237.3425@compuserve.com
---