Thread

Topic: operator new[](size_t, void*)': clarifica

Author: Andrew Gierth <andrewg@microlise.co.uk>
Date: 1996/08/23

>>>>> "Dietmar" == Dietmar Kuehl <kuehl@uzwil.informatik.uni-konstanz.de> writes:

 Dietmar> My problem is more the function

 Dietmar>  void *operator new[](size_t, void *p) { return p; }
 Dietmar>      ^^ i.e, the array placement new.

 Dietmar> I would be really interested in a portable use of 'operator
 Dietmar> new[](size_t,void*)'!  I don't think that you can post any,
 Dietmar> basically because there is no such use [snip]

A suggestion that appeared a while back (either here or in c.l.c++.mod,
I forget which) was to do something like this:

void *operator new[](size_t sz, size_t& rsz) { rsz = sz; return NULL; }

Then do something like:

    size_t mem_required;
    new(mem_required) T[n];

    void *ptr = my_allocator(mem_required);
    T *result = new(ptr) T[n];

The only guarantee that this needs from the implementation is that the
array overhead is constant between the two new-expressions.

Changing the standard to guarantee that seems to me simpler than
introducing a new constant (I can imagine (perverse) implementations where
the overhead was not constant, but I can't imagine why an implementation
would supply different sizes to the above two cases).

[Sorry, I can't remember who suggested this method originally.]

--
Andrew Gierth (andrewg@microlise.co.uk)
---
[ comp.std.c++ is moderated.  To submit articles: Try just posting with your
                newsreader.  If that fails, use mailto:std-c++@ncar.ucar.edu
  comp.std.c++ FAQ: http://reality.sgi.com/austern/std-c++/faq.html
  Moderation policy: http://reality.sgi.com/austern/std-c++/policy.html
  Comments? mailto:std-c++-request@ncar.ucar.edu
]

Author: boukanov@sentef1.fi.uib.no (Igor Boukanov)
Date: 1996/08/24

Andrew Gierth (andrewg@microlise.co.uk) wrote:
> void *operator new[](size_t sz, size_t& rsz) { rsz = sz; return NULL; }

....
> The only guarantee that this needs from the implementation is that the
> array overhead is constant between the two new-expressions.

> Changing the standard to guarantee that seems to me simpler than
> introducing a new constant (I can imagine (perverse) implementations where
> the overhead was not constant, but I can't imagine why an implementation
> would supply different sizes to the above two cases).

I think it would be better to introduce a new function, get_allocation_size:
template<class T> size_t get_allocation_size(size_t elems_number);
With almost all compilers it should then be possible to write:
template<class T> size_t get_allocation_size(size_t elems_number)
{
   size_t size;
   new (size) T[elems_number];
   return size;
}

This is necessary because if new returns NULL, "the behavior is undefined"
according to the standard, as was already mentioned in this thread.
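
A minimal sketch of the size-probing idea, written for a present-day compiler rather than the 1996 draft. Declaring the probing operator `noexcept` is an assumption added here and is not part of the posted code: under current rules a non-throwing allocation function may return a null pointer, in which case the new-expression constructs nothing and yields null. The type `Widget` and the no-op placement delete are hypothetical additions for illustration.

```cpp
#include <cstddef>
#include <new>

// Probing form: records the size the new-expression asks for and
// allocates nothing. Because it is noexcept, returning a null pointer
// makes the new-expression skip construction and yield null.
void* operator new[](std::size_t sz, std::size_t& requested) noexcept {
    requested = sz;
    return nullptr;
}

// Matching placement delete; it would only be used if a constructor threw.
void operator delete[](void*, std::size_t&) noexcept {}

// Number of bytes a new-expression for T[n] requests from the probe.
template <class T>
std::size_t get_allocation_size(std::size_t n) {
    std::size_t requested = 0;
    T* unused = new (requested) T[n];  // no storage used, no constructors run
    (void)unused;
    return requested;
}

// Hypothetical element type with a non-trivial destructor.
struct Widget {
    int id = 0;
    ~Widget() {}
};
```

The overhead, if any, is `get_allocation_size<T>(n) - n * sizeof(T)`. Whether a later `new (ptr) T[n]` requests the same amount is exactly the guarantee this thread says the draft does not give, so the probed value should be treated as an estimate.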


--
Regards, Igor Boukanov.
igor.boukanov@fi.uib.no
http://www.fi.uib.no/~boukanov/

Author: "Philippe Verdy" <100105.3120@compuserve.com>
Date: 1996/08/24

Tim Hollebeek <tim@franck.Princeton.EDU> wrote in article <4vhu9s$237@cnn.Princeton.EDU>...
> Dietmar Kuehl (kuehl@uzwil.informatik.uni-konstanz.de) wrote:
>
> : BTW, note that the fact that there is no possibility to use a delete
> : expression to release the object becomes even more error prone for
> : arrays than it already is for single objects, even if the above code to
> : allocate an array would be portable:
>
> :   for (size_t t = s; t-- > 0; )
> :     array[t].~T();
> :   operator delete[](ptr);
>
> : Actually, this is almost certainly necessary (I can't imagine a
> : situation where it would be suitable to use 'delete[] array' but where
> : the placement new is necessary) such that I wonder why the first
> : problem exists at all: There is no code which later accesses the data
> : stored by the new expression. The code which would need the data (e.g.
> : the loop above) cannot [portably] access the data anyway...  Are there
> : plans to allow passing of additional arguments in a delete expression?
> : As far as I know, there is still the problem to find an appropriate
> : syntax...
>
> The "There is no code which later accesses the data stored by the new
> expression" argument is exactly why I hacked gcc not to use any
> overhead in the placement-array-new case; the overhead is then zero,
> so figuring out how much space is needed is no longer a problem.
> However when I mailed in the patch to the gcc people, Mike Stump
> replied along the lines that "placement-array-delete" had been added
> at a recent meeting, and the code to save the size was in anticipation
> of support for it being added to a future version of gcc.  Whether
> "placement-array-delete" is in the latest draft or not I'm not sure;
> it wasn't in the one I was reading when I was fiddling with all this,
> but that was nearly a year ago.
>
> It certainly would be more convenient to have a placement array
> delete, since in order to call all the destructors, one must know how
> many there are, but the compiler won't tell you!  This can lead to the
> size being stored twice; once by the compiler, and once by you in
> order to call destructors manually.
>
> I like Dietmar's suggestion of a max overhead constant; typically it
> will be small anyway, and anyone who wants to manually avoid the
> problem and store the size themselves can via something like:
>
> T *memory = (T *)malloc(n*sizeof(T));
>
> for (int i = 0; i < n; i++)
>     new (memory + i) T;
>
> ---------------------------------------------------------------------------
> Tim Hollebeek         | Disclaimer :=> Everything above is a true statement,
> Electron Psychologist |                for sufficiently false values of true.
> Princeton University  | email: tim@wfn-shop.princeton.edu
> ----------------------| http://wfn-shop.princeton.edu/~tim (NEW! IMPROVED!)
> ---
The main reason for the existence of the overhead is often the system-specific
requirements concerning alignment, while still maintaining information for the
count of objects allocated in the array (so that they can all be destructed
conveniently).

The real formula will most often be that:

   new(2,f) T[5]

results in a call of:

  operator new[](roundup(sizeof(T) + objectoverhead)*5 + arrayoverhead, 2, f).

where roundup() has mainly to do with aligning an object with the next one so
that they all share the same alignment restrictions. In addition, some overhead
may be added on each object as a helper for their safe destruction. However I
think that this also occurs when you expect the size of a static array type:

  sizeof(T[n]) >= sizeof(T) * n

The size of an array may be equal to the product of the size of each item by the
count of objects only for a very few datatypes. At least you can count on the
following: sizeof(char[n]) == sizeof(char) * n. But you should not assume it for
even integral types, nor for floating point types.

For example, a "double" could be on some systems a 6-byte datatype,
with an 8-byte boundary alignment requirement. Even though sizeof(double) will
return 6, and can be safely allocated with standard functions like malloc(6), this
alignment requirement may be enforced by the compiler when allocating automatic
or static variables, or by the standard library malloc() for dynamic objects. This is
why (even in C, not C++) you should not use malloc() to allocate arrays like this:
 (double *)malloc(sizeof(double) * n)
but:
 (double *)calloc(sizeof(double), n)
which allows for alignment restrictions to be enforced.
On such a system, pointer arithmetic will be adjusted so that incrementing a
(double *) will not add 6, the sizeof(double), but 8 instead.
And even: sizeof(T[1]) >= sizeof(T). This could seem quite troublesome, but think of
systems with severe alignment restrictions (like 64-bit architectures), where enforcing
the alignment restrictions could result in the surprising:
 sizeof(double) == 10, and
 sizeof(double[1]) == 16 !!

The same restrictions apply also to C++, so you should take care of it.
So the memory required to allocate a dynamic array with an "operator new[]" will
at least allocate the memory required by a static array, adding some overhead (it
will be often stored at the beginning of the memory block) for the count of objects,
and the associated alignment requirement, which depends on the effective datatype
of the following items of the array.

Finally the array overhead itself is kept for operator delete[](...)
to help it call successfully the required object destructors on each item, within a
counted loop where the count of objects created in the array will be computable.
This last information will often be located within the memory block(s) allocated
for the array, so the address of the dynamic array structure will not be the same
as the address of the first object in the array.

The address obtained from an operator new[] will then not be the address of the
effective memory block allocated! Now you can understand why the overhead
cannot be easily determined. If you plan to create your own "operator new[](size_t, void *)",
it will be very difficult to compute the size of the placement block.

I think that the best thing to do will be to also create your own "operator delete[]" for that
type, so that you can manage the allocation more safely, using your own structure
definition for the array, for example:
struct T_array {
  int count;
  T   array[1];
};
Your compiler should provide a definition to find the effective offset of the internal member array,
so that you can use malloc() accurately, and return the correct address for the returned C++
array.

You should also consider some systems which have a very wide architecture, and can easily
access only a part of an aligned word, but cannot easily access parts of two consecutive words.
For example, when creating arrays of 3-byte-wide elements on a 128-bit architecture: some
elements could be packed on offsets 0, 3, 6, 9, 12, and ... 16 (not 15 !), 19, 22, ...
On such a system, pointer arithmetic will be adapted so that alignment of little objects
is enforced to avoid crossing word boundaries ! For now such systems should be rare, but in the
near future, very wide architectures will emerge where alignment restrictions would be too
restrictive if we could not pack some elements provided they can fit within the same word, and
still maintain low-cost access code, while avoiding wasting much memory. In the previous
example, we would have:
 sizeof(T) == 3,
 sizeof(T[1]) == 3, sizeof(T[2]) == 6, sizeof(T[3]) == 9, sizeof(T[4]) == 12,
 sizeof(T[5]) == 15 (or 16 !),
 sizeof(T[6]) == 19, ...
In other words, some elements could have different sizes, depending on the alignment of their
starting address !

Now consider first checking the size a standard new[] operator would claim, and using that size
to compute additional space for your own size descriptor within your own user defined new[] operator.
I think that this will produce wrong results, because in fact what you need is to allocate (via malloc())
a structure containing your descriptor and THEN the standard array.
The problem is that you cannot predict exactly the additional space required to allocate your own
prefix descriptor, because the following array member will align itself depending only on its own
alignment requirements. So you cannot predict exactly what extra filler bytes will be needed between
your descriptor and the effective array storage. How do you think you will solve this problem ?
I think that the only way is to allow an extension to the language like extending the sizeof() operator
so that it gives the alignment required to extend a type with a second one.
Maybe something like:
  sizeof(T1, T2)
which fixes the size required to extend a type by padding another one to it.
Or maybe to allow the following expression (where n could be a variable or expression):
  sizeof(struct { int count; T array[n]; })
which would compute the correct size.

Then to allow computing the address of the returned array, we would need something like:
  offsetof(struct { int count; T array[n]; }, array)
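
For comparison, the offset asked about here can be computed by rounding the header size up to the element's alignment. The sketch below assumes `alignof` and `constexpr` (C++11), which were not available when this was posted, and uses the hypothetical names `round_up`, `array_offset` and `block_size`.

```cpp
#include <cstddef>

// Round `value` up to the next multiple of `alignment` (alignment > 0).
constexpr std::size_t round_up(std::size_t value, std::size_t alignment) {
    return (value + alignment - 1) / alignment * alignment;
}

// Offset of the first element when an int count is stored in front of
// an array of T: the header size padded to T's alignment.
template <class T>
constexpr std::size_t array_offset() {
    return round_up(sizeof(int), alignof(T));
}

// Total bytes for the count header plus n elements of T.
template <class T>
constexpr std::size_t block_size(std::size_t n) {
    return array_offset<T>() + n * sizeof(T);
}
```

This only places the array correctly if the block itself starts at an address suitably aligned for both `int` and `T`, which memory returned by malloc() is for the built-in types.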

Author: Andrew Gierth <andrewg@microlise.co.uk>
Date: 1996/08/26

>>>>> "Philippe" == Philippe Verdy <100105.3120@compuserve.com> writes:

[snipped entire previous context of array placement new]
[post reformatted extensively for line length]

 Philippe> The main reason for the existence of the overhead is often
 Philippe> the system-specific requirements concerning alignment,
 Philippe> while still maintaining information for the count of
 Philippe> objects allocated in the array (so that they can all be
 Philippe> destructed conveniently).

Oh dear. A misunderstanding in here somewhere, I fear.

 Philippe> The real formula will most often be that:

 Philippe>    new(2,f) T[5]

 Philippe> results in a call of:

 Philippe>   operator new[](roundup(sizeof(T) + objectoverhead)*5
 Philippe>                  + arrayoverhead, 2, f).

That is one of the many possibilities, but not one I would expect from a
normal implementation.

 Philippe> where roundup() has mainly to do with aligning an object
 Philippe> with the next one so that they all share the same alignment
 Philippe> restrictions. In addition, some overhead may be added on
 Philippe> each object as a helper for their safe destruction. However
 Philippe> I think that this also occurs when you expect the size of a
 Philippe> static array type:

 Philippe>   sizeof(T[n]) >= sizeof(T) * n

Ahemm. sizeof(T[n]) == n*sizeof(T), by long-standing definition.
[expr.sizeof] even states this explicitly, as does the ARM [5.3.2].
"This implies that the size of an array of /n/ elements is /n/ times
the size of an element".
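
The rule quoted here can be checked at compile time. This is a small sketch assuming a C++11 compiler for `static_assert` and `constexpr`; the types `Three` and `Padded` and the helper `array_size_is_exact` are hypothetical examples.

```cpp
#include <cstddef>

struct Three  { char bytes[3]; };      // a small element type, typically 3 bytes
struct Padded { double d; char c; };   // any trailing padding is part of sizeof

// True when an array of N objects of type T occupies exactly N * sizeof(T).
template <class T, std::size_t N>
constexpr bool array_size_is_exact() {
    return sizeof(T[N]) == N * sizeof(T);
}

static_assert(array_size_is_exact<char, 7>(),   "char");
static_assert(array_size_is_exact<double, 1>(), "double[1]");
static_assert(array_size_is_exact<Three, 5>(),  "small struct");
static_assert(array_size_is_exact<Padded, 6>(), "padded struct");
```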

 Philippe> The size of an array may be equal to the product of the
 Philippe> size of each item by the count of objects only for a very
 Philippe> few datatypes. At least you can count on the following:
 Philippe> sizeof(char[n]) == sizeof(char) * n. But you should not
 Philippe> assume it for even integral types, nor for floating point
 Philippe> types.

The ARM and the drafts say you can assume it for any type. Anyone
know different?

 Philippe> For example, a "double" could be on some systems a 6-byte
 Philippe> datatype, with an 8-byte boundary alignment requirement. Even
 Philippe> though sizeof(double) will return 6,

I think that's a bug in [expr.sizeof] - the last draft I read (Jan96)
says that padding is included in sizeof() for class types, but doesn't
say the same for builtin types. For arrays to behave, however, requires
that sizeof(double) returns 8 for the above-described case.

 Philippe>                                      and can be safely
 Philippe> allocated with standard functions like malloc(6), this
 Philippe> alignment requirement may be enforced by the compiler when
 Philippe> allocating automatic or static variables, or by the
 Philippe> standard library malloc() for dynamic objects. This is why
 Philippe> (even in C, not C++) you should not use malloc() to
 Philippe> allocate arrays like this:

 Philippe>  (double *)malloc(sizeof(double) * n)
 Philippe> but:
 Philippe>  (double *)calloc(sizeof(double), n)

C++ adopts malloc & calloc from the C standard, and I don't have an
authoritative C reference handy, but in view of the comments I've
already cited, this is nonsense.

 [a restatement elided for brevity]

 Philippe> The same restrictions apply also to C++, so you should take
 Philippe> care of it.  So the memory required to allocate a dynamic
 Philippe> array with an "operator new[]" will at least allocate the
 Philippe> memory required by a static array, adding some overhead (it
 Philippe> will be often stored at the beginning of the memory block)
 Philippe> for the count of objects, and the associated alignment
 Philippe> requirement, which depends on the effective datatype of the
 Philippe> following items of the array.

 Philippe> Finally the array overhead itself is kept for operator
 Philippe> delete[](...)  to help it call successfully the required
 Philippe> object destructors on each item, within a counted loop
 Philippe> where the count of objects created in the array will be
 Philippe> computable.  This last information will often be located
 Philippe> within the memory block(s) allocated for the array, so the
 Philippe> address of the dynamic array structure will not be the same
 Philippe> as the address of the first object in the array.

operator delete[] does not call destructors - it is called *after*
destructors have been invoked. Don't confuse the delete[] expression
with operator delete[]().
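
The distinction can be observed with a class-specific operator delete[] that records how many destructors have already run when it is entered. A minimal sketch; the class `Tracked` and the function `destructors_before_deallocation` are hypothetical names.

```cpp
#include <cstddef>
#include <new>

struct Tracked {
    static int destroyed;             // destructors run so far
    static int destroyed_at_dealloc;  // value of `destroyed` seen by operator delete[]

    ~Tracked() { ++destroyed; }

    static void* operator new[](std::size_t sz) { return ::operator new[](sz); }

    // Only releases storage; by the time it runs, the delete-expression
    // has already invoked every element's destructor.
    static void operator delete[](void* p) noexcept {
        destroyed_at_dealloc = destroyed;
        ::operator delete[](p);
    }
};

int Tracked::destroyed = 0;
int Tracked::destroyed_at_dealloc = -1;

// Creates and deletes n elements, then reports how many destructors had
// already run when operator delete[] was entered.
int destructors_before_deallocation(int n) {
    Tracked::destroyed = 0;
    Tracked::destroyed_at_dealloc = -1;
    Tracked* a = new Tracked[n];
    delete[] a;
    return Tracked::destroyed_at_dealloc;
}
```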

 Philippe> The address obtained from an operator new[] will then not
 Philippe> be the address of the effective memory block allocated !

correct - did anyone say it was?

 Philippe> Now you can understand why the overhead cannot be easily
 Philippe> determined. If you plan to create your own "operator
 Philippe> new[](size_t, void *)", it will be very difficult to
 Philippe> compute the size of the placement block.

The only reason that the overhead can't be determined is that the
standard effectively says it can't; as currently worded, it can vary
between identical allocations. If the standard specified that the
overhead was the same for new-expressions on identical array types,
then the solution already presented is sufficient.

 Philippe> I think that the best thing to do will be to also create
 Philippe> your own "operator delete[]" for that type, so that you can
 Philippe> manage the allocation more safely, using your own structure
 Philippe> definition for the array, for example:

 Philippe> struct T_array {
 Philippe>   int count;
 Philippe>   T   array[1];
                       ^
I hope this isn't the old trick of referencing beyond the end of a
structure - that's specifically banned.

 Philippe> };

 Philippe> Your compiler should provide a definition to find the
 Philippe> effective offset of the internal member array, so that you
 Philippe> can use malloc() accurately, and return the correct address
 Philippe> for the returned C++ array.

 Philippe> You should also consider some systems which have a very wide
 Philippe> architecture, and can easily access only a part of an
 Philippe> aligned word, but cannot easily access parts of two
 Philippe> consecutive words.  For example, when creating arrays
 Philippe> of 3-byte-wide elements on a 128-bit architecture: some
 Philippe> elements could be packed on offsets 0, 3, 6, 9, 12, and
 Philippe> ... 16 (not 15 !), 19, 22, ...  On such a system,
 Philippe> pointer arithmetic will be adapted so that alignment of
 Philippe> little objects is enforced to avoid crossing word
 Philippe> boundaries ! For now such systems should be rare, but in the
 Philippe> near future, very wide architectures will emerge where
 Philippe> alignment restrictions would be too restrictive if we could
 Philippe> not pack some elements provided they can fit within the
 Philippe> same word, and still maintain low-cost access code, while
 Philippe> avoiding wasting much memory. In the previous example, we
 Philippe> would have:

 Philippe>  sizeof(T) == 3,
 Philippe>  sizeof(T[1]) == 3, sizeof(T[2]) == 6,
 Philippe>      sizeof(T[3]) == 9, sizeof(T[4]) == 12,
 Philippe>  sizeof(T[5]) == 15 (or 16 !),
 Philippe>  sizeof(T[6]) == 19, ...

 Philippe> In other words, some elements could have different sizes,
 Philippe> depending on the alignment of their starting address !

Again, this flies in the face of the ARM and the drafts when they
state that sizeof(T[n]) == n*sizeof(T).

I doubt that wasting memory will be an issue on a 128-bit architecture.

 [rest of original post snipped]

--
Andrew Gierth (andrewg@microlise.co.uk)



Author: kanze@gabi-soft.fr (J. Kanze)
Date: 1996/08/26

"Philippe Verdy" <100105.3120@compuserve.com> writes:

> The main reason for the existence of the overhead is often the system-specific
> requirements concerning alignment, while still maintaining information for the
> count of objects allocated in the array (so that they can all be destructed
> conveniently).
>
> The real formula will most often be that:
>
>    new(2,f) T[5]
>
> results in a call of:
>
>   operator new[](roundup(sizeof(T) + objectoverhead)*5 + arrayoverhead, 2, f).
>
> where roundup() has mainly to do with aligning an object with the next one so
> that they all share the same alignment restrictions. In addition, some overhead
> may be added on each object as a helper for their safe destruction. However I
> think that this also occurs when you expect the size of a static array type:
>
>   sizeof(T[n]) >= sizeof(T) * n

The sizeof operator must return the number of bytes that an object
occupies in an array, including any padding.  The only overhead I've
ever seen in an operator new is a bit at the beginning for the compiler
to store the number of elements in the array.  This bit may be rounded
up for alignment; e.g.: if double's require 8 byte alignment, the
compiler will add 8 bytes here for an array of objects containing a
double, even though it only needs 4 for the number of objects.  (This
would be a typical case on a 32 bit machine, for example.)
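
One way to see this overhead on a particular compiler is a class-specific operator new[] that records the size it is asked for. The sketch below is only a measurement aid under that assumption; `Sample`, `last_request` and `array_overhead` are hypothetical names, and the value returned is implementation-specific.

```cpp
#include <cstddef>
#include <new>

struct Sample {
    double value;
    ~Sample() {}                        // non-trivial destructor

    static std::size_t last_request;    // bytes asked for by the last new[]

    static void* operator new[](std::size_t sz) {
        last_request = sz;
        return ::operator new[](sz);
    }
    static void operator delete[](void* p) noexcept { ::operator delete[](p); }
};

std::size_t Sample::last_request = 0;

// Bytes requested beyond n * sizeof(Sample) for one `new Sample[n]`.
std::size_t array_overhead(std::size_t n) {
    Sample* a = new Sample[n];
    std::size_t overhead = Sample::last_request - n * sizeof(Sample);
    delete[] a;
    return overhead;
}
```

Calling `array_overhead(5)` reports whatever the implementation adds in front of the elements, which may be zero.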

> The size of an array may be equal to the product of the size of each item by the
> count of objects only for a very few datatypes.

The size of an array *MUST* be equal to the product of the size of each
item by the count of objects for all datatypes.  See section 6.3.3.4 in
ISO 9899, or section 5.3.3 paragraph 2 in the draft working papers.

> At least you can count on the
> following: sizeof(char[n]) == sizeof(char) * n. But you should not assume it for
> even integral types, nor for floating point types.

It's interesting that this is just a transposition of an example of the
use of sizeof in the C standard.  To determine the number of elements in
an array: "sizeof( array ) / sizeof( array[ 0 ] )".

> For example, a "double" could be on some systems a 6-byte datatype,
> with an 8-byte boundary alignment requirement. Even though sizeof(double) will
> return 6, and can be safely allocated with standard functions like malloc(6), this
> alignment requirement may be enforced by the compiler when allocating automatic
> or static variables, or by the standard library malloc() for dynamic objects.

This is simply wrong, see above.

With regards to malloc( 6 ): IMHO, this is undefined behavior in this
case (since sizeof( double ) must be 8, even if only 6 bytes are
actually used); it is likely to work in most cases, since it is hard to
imagine an implementation of malloc which is capable of actually
allocating less than MAX_ALIGN bytes.

> This is
> why (even in C, not C++) you should not use malloc() to allocate arrays like this:
>  (double *)malloc(sizeof(double) * n)
> but:
>  (double *)calloc(sizeof(double), n)
> which allows for alignment restrictions to be enforced.

If this were true, 99% of all C programs I've ever seen would be
broken.

I'm curious as to where you got this information.  It sounds like a book
that we should definitely warn people against.

I've deleted the rest of the original posting, since it simply continues
with more examples of the same basic error.

--
James Kanze           (+33) 88 14 49 00          email: kanze@gabi-soft.fr
GABI Software, Sarl., 8 rue des Francs Bourgeois, 67000 Strasbourg, France
Conseils en informatique industrielle --
                            -- Beratung in industrieller Datenverarbeitung

Author: clamage@pacific-88.Eng.Sun.COM (Steve Clamage)
Date: 1996/08/21

In article fpo@news.BelWue.DE, kuehl@uzwil.informatik.uni-konstanz.de (Dietmar Kuehl) writes:
>Hi,
>
>I have some trouble understanding the use of 'operator new[](size_t
>size, void *ptr)' defined in lib.new.delete.placement: Is there any
>portable use of this function? This function does nothing except
>returning 'ptr', its second argument (according to
>lib.new.delete.placement).

The function is intended to provide a standard interface to a user-defined
allocator. You generally can't write such an allocator portably either.

Suppose you have a pointer to some area of memory which you somehow know
has the right size and alignment for a type T. You then have a standard
way to cause a constructor to be run at that address, creating a T.

People writing allocators are going to want such an interface function,
and will write
 void* operator new(size_t, void* p) { return p; }
anyway. To avoid multiple-definition problems, this function was
made part of the standard library.

Here is an example of a portable use of this placement-new function:

class T { ... T(int); ... };

void foo()
{
 T t1(1); // create t1, a local T
 ...  // use t1
 t1.~T(); // destroy old object
 new (&t1) T(2); // create a new T in the old memory

} // at function exit, destructor called for t1, destroying new object

The draft standard addresses the points necessary to make this example
valid, and indeed, an example like this was used to clarify the
rules about object lifetimes.
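
A compilable variant of the same pattern, given as a sketch. The class `Counter` and the function `rebuild_in_place` are hypothetical stand-ins, since the class body in the example above is elided.

```cpp
#include <new>

class Counter {
public:
    explicit Counter(int start) : value_(start) {}
    int value() const { return value_; }
private:
    int value_;
};

// Destroys a local object and builds a new one of the same type in the
// same storage; the name then refers to the new object.
int rebuild_in_place() {
    Counter c(1);          // create c, a local Counter
    c.~Counter();          // end the old object's lifetime
    new (&c) Counter(2);   // construct a new Counter in the old memory
    return c.value();      // the destructor runs for the new object on exit
}
```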
---
Steve Clamage, stephen.clamage@eng.sun.com

Author: kanze@lts.sel.alcatel.de (James Kanze US/ESC 60/3/141 #40763)
Date: 1996/08/21

In article <199608202052.NAA05726@taumet.eng.sun.com>
clamage@pacific-88.Eng.Sun.COM (Steve Clamage) writes:

|> In article fpo@news.BelWue.DE, kuehl@uzwil.informatik.uni-konstanz.de (Dietmar Kuehl) writes:
|> >Hi,
|> >
|> >I have some trouble understanding the use of 'operator new[](size_t
|> >size, void *ptr)' defined in lib.new.delete.placement: Is there any
|> >portable use of this function? This function does nothing except
|> >returning 'ptr', its second argument (according to
|> >lib.new.delete.placement).

|> The function is intended to provide a standard interface to a user-defined
|> allocator. You generally can't write such an allocator portably either.

|> Suppose you have a pointer to some area of memory which you somehow know
|> has the right size and alignment for a type T. You then have a standard
|> way to cause a constructor to be run at that address, creating a T.

|> People writing allocators are going to want such an interface function,
|> and will write
|>  void* operator new(size_t, void* p) { return p; }
|> anyway. To avoid multiple-definition problems, this function was
|> made part of the standard library.

|> Here is an example of a portable use of this placement-new function:

|> class T { ... T(int); ... };

|> void foo()
|> {
|>  T t1(1); // create t1, a local T
|>  ...  // use t1
|>  t1.~T(); // destroy old object
|>  new (&t1) T(2); // create a new T in the old memory

|> } // at function exit, destructor called for t1, destroying new object

This is all very true.  But Dietmar's question didn't concern placement
operator new, but placement operator new[].  The new operator is
guaranteed to call the operator new function with the first argument ==
sizeof( T ); since we know the size, we can arrange for sufficient
memory to be there.  (Alignment is a bit more difficult, but using a
union of most of the built-in types, plus a few pointer types, is
probably portable enough for most people.)  In the case of operator
new[], however, the standard explicitly states that the implementation
can demand an undefined amount of memory more.  Which makes arranging
for the memory to actually be there significantly more difficult:-).
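
A sketch of the union approach to alignment for the single-object case. The member list is an assumption about which built-in types are "enough", and `AlignedBuffer`, `Point`, `build_in_buffer` and `buffer_is_aligned` are hypothetical names; `alignof` and `std::uintptr_t` are used only to check the result.

```cpp
#include <cstddef>
#include <cstdint>
#include <new>

// Union whose alignment is at least that of each listed member; the
// unsigned char array supplies the raw bytes.
template <std::size_t Bytes>
union AlignedBuffer {
    unsigned char raw[Bytes];
    long          l;
    long long     ll;
    double        d;
    long double   ld;
    void*         p;
    void        (*fp)();
};

struct Point {
    double x, y;
    Point(double x_, double y_) : x(x_), y(y_) {}
};

// Constructs a Point inside a local buffer and returns x + y.
double build_in_buffer() {
    AlignedBuffer<sizeof(Point)> storage;
    Point* pt = new (storage.raw) Point(1.5, 2.5);
    double sum = pt->x + pt->y;
    pt->~Point();
    return sum;
}

// True when the buffer's address satisfies Point's alignment.
bool buffer_is_aligned() {
    AlignedBuffer<sizeof(Point)> storage;
    return reinterpret_cast<std::uintptr_t>(storage.raw) % alignof(Point) == 0;
}
```

The size passed to the buffer is known exactly here because a single-object new-expression requests sizeof(T); the array form offers no such figure, which is the difficulty described above.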
--
James Kanze         Tel.: (+33) 88 14 49 00        email: kanze@gabi-soft.fr
GABI Software, Sarl., 8 rue des Francs-Bourgeois, F-67000 Strasbourg, France
Conseils, études et réalisations en logiciel orienté objet --
                -- A la recherche d'une activité dans une region francophone




Author: tim@franck.Princeton.EDU (Tim Hollebeek)
Date: 1996/08/21

Steve Clamage (clamage@pacific-88.Eng.Sun.COM) wrote:
: In article fpo@news.BelWue.DE, kuehl@uzwil.informatik.uni-konstanz.de (Dietmar Kuehl) writes:
: >Hi,
: >
: >I have some trouble understanding the use of 'operator new[](size_t
: >size, void *ptr)' defined in lib.new.delete.placement: Is there any
: >portable use of this function? This function does nothing except
: >returning 'ptr', its second argument (according to
: >lib.new.delete.placement).

: The function is intended to provide a standard interface to a user-defined
: allocator. You generally can't write such an allocator portably either.

: Suppose you have a pointer to some area of memory which you somehow know
: has the right size and alignment for a type T. You then have a standard
: way to cause a constructor to be run at that address, creating a T.

In this case, it works.  The correct size is sizeof(T), and the
alignment isn't hard to fix.  But we are talking about _array_ new[];
the compiler is allowed an incredible amount of freedom over and above
n * sizeof(T), making it impossible to know how much memory to
allocate, *even for a specific compiler*.

Case in point: gcc adds 8 bytes if T has a non-trivial destructor, and
none otherwise.  But there is no way to figure out whether the correct
number is 8 or zero based on an arbitrary type T.

I actually hacked gcc at one point to have an option to _disable_ the
overhead for placement-array-new, since the only thing that needs it
is placement-array-delete [which didn't exist at the time I made the
hack].

I found it much easier to manage the number of elements myself, since
C++ has no way of figuring out the number of elements, or if/how the
number is stored, and explicitly call destructors.
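
A minimal sketch of that approach: each element is built with single-object placement new, so no array overhead is involved, and the caller keeps the count. The names `Item`, `create_items` and `destroy_items` are hypothetical, and exception safety for a throwing constructor is left out.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

struct Item {
    static int live;          // objects currently alive
    int id;
    Item() : id(live) { ++live; }
    ~Item() { --live; }
};
int Item::live = 0;

// Builds n Items one at a time in malloc'ed storage; the caller keeps n.
Item* create_items(std::size_t n) {
    void* raw = std::malloc(n * sizeof(Item));
    if (raw == nullptr) throw std::bad_alloc();
    Item* first = static_cast<Item*>(raw);
    for (std::size_t i = 0; i < n; ++i)
        new (first + i) Item;     // single-object placement new: no overhead
    return first;
}

// Destroys the n Items in reverse order and frees the storage.
void destroy_items(Item* first, std::size_t n) {
    for (std::size_t i = n; i-- > 0; )
        first[i].~Item();
    std::free(first);
}
```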

: People writing allocators are going to want such an interface function,
: and will write
:  void* operator new(size_t, void* p) { return p; }
                       ^^^

the question is about new[], which is broken.  Normal placement new
works fine.  Here is your example, reworked to show the problem:

: class T { ... T(); ... };

: void foo()
: {
:  T t1[5]; // create t1, a local array of T
:  ...  // use t1
:       for (i = 0; i < 5; i++)
:           t1[i].~T(); // destroy old array
:  new (t1) T[5]; // all hell breaks loose

: } // even more fun as the destructors try to get called

Note that even if you do something like:

T *t1 = new T[5];
for (i = 0; i < 5; i++)
    t1[i].~T();
new (t1) T[5]; // Illegal.  The amount of overhead is allowed to
        // change, so the amount of memory for this T[5]
        // might not be the same as the first one (!) [1]

expr.new, clause 12:

--new(2,f) T[5] results in a call of
    operator new[](sizeof(T)*5+y,2,f).  Here, x and y are non-negative,
    implementation-defined values representing array allocation
    overhead.  They might vary from one use of new to another.

The longer you look at new()[] the more you will realize there are
some _very_ weird things going on here, which make it completely
unusable even in a NON portable way.

IMO, the best way to fix it would be to require overloaded new[]
functions to handle the number of elements themselves (and to
provide a delete[] that uses the number).

e.g:

// for new(...) T[n]:
// size == n * sizeof(T) [no overhead]
// num == number of elements, 0 if not necessary to be stored (trivial
//                                                      destructor, etc)
void *
operator new[](size_t size, int num, ...) {
    ...
}

and the corresponding delete[].

---------------------------------------------------------------------------
Tim Hollebeek         | Disclaimer :=> Everything above is a true statement,
Electron Psychologist |                for sufficiently false values of true.
Princeton University  | email: tim@wfn-shop.princeton.edu
----------------------| http://wfn-shop.princeton.edu/~tim (NEW! IMPROVED!)


[ comp.std.c++ is moderated.  To submit articles: try just posting with      ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu         ]
[ FAQ:      http://reality.sgi.com/employees/austern_mti/std-c++/faq.html    ]
[ Policy:   http://reality.sgi.com/employees/austern_mti/std-c++/policy.html ]
[ Comments? mailto:std-c++-request@ncar.ucar.edu                             ]

Author: kuehl@uzwil.informatik.uni-konstanz.de (Dietmar Kuehl)
Date: 1996/08/22

Hi,
Steve Clamage (clamage@pacific-88.Eng.Sun.COM) wrote:
: In article fpo@news.BelWue.DE, kuehl@uzwil.informatik.uni-konstanz.de
: (Dietmar Kuehl) writes:
: >I have some trouble understanding the use of 'operator new[](size_t
: >size, void *ptr)' defined in lib.new.delete.placement: Is there any
: >portable use of this function? This function does nothing except
: >returning 'ptr', its second argument (according to
: >lib.new.delete.placement).

: The function is intended to provide a standard interface to a user-defined
: allocator. You generally can't write such an allocator portably either.

: Suppose you have a pointer to some area of memory which you somehow know
: has the right size and alignment for a type T. You then have a standard
: way to cause a constructor to be run at that address, creating a T.

Note that I didn't claim that there is no portable use of 'operator
new(size_t,void*)'! I know that this function can be used portably. I
asked specifically about the array variant of this function (well, I
should have said this more precisely, I guess: those two brackets can
easily be overlooked...).

: People writing allocators are going to want such an interface function,
: and will write
:  void* operator new(size_t, void* p) { return p; }
: anyway. To avoid multiple-definition problems, this function was
: made part of the standard library.

My problem is more the function

 void *operator new[](size_t, void *p) { return p; }
     ^^ i.e, the array placement new.

: Here is an example of a portable use of this placement-new function:
[example using 'operator new(size_t,void*)' snipped]

I would be really interested in a portable use of 'operator
new[](size_t,void*)'!  I don't think that you can post any, basically
because there is no such use (at least, this is what I think and what
Tim Hollebeek confirmed in private communication). And if there
is no portable use of 'operator new[](size_t,void*)', I don't see any
reason why this function should be in the standard: It suggests a false
sense of safety. Here is the problem with this function:

  class T {...};
  ...
  size_t s = ...;
  void *mem = operator new[](sizeof(T) * s + overhead);
  T *array = new(mem) T[s];

The WP says, that the last expression calls

  operator new[](sizeof(T)*s + y, mem)

to allocate the necessary memory (this is only a marginal modification
of the example in [expr.new] paragraph 12) where 'y' is some
implementation-defined value which might vary from one invocation of
'new' to another. Everything would be fine if it were possible to
know in advance what 'y' will be or, alternatively, what 'y' would be at
most: this is what the term 'overhead' stands for in the expression
allocating memory. Basically, I think I would like to modify the
sentence in [expr.new] paragraph 12 which is in the DWP as

  Their value might  vary  from  one invocation of new to another.

to something like this:

  Their value might  vary  from  one invocation of new to another but
  is at most 'std::max_array_overhead' which is a non-negative,
  implementation defined integral constant defined in <new>.

(I'm not used to "standard speak" but I think I made my intention
clear) plus, of course, the necessary [minor, I think] modifications
to introduce the value into the corresponding places (basically the
header <new>). An alternative to this approach, which is taken for some
other limits, is to define a maximum overhead (which is currently not
guaranteed!) in the limits section ([limits]) such that it would be
possible to play safe (and probably waste a lot of memory) in a
portable application by using the value defined in this section.
However, I don't think that an upper limit on something would be
appropriate for the limits section: the other values there represent
lower limits which have to be guaranteed by an
implementation...

BTW, note that the fact that there is no possibility to use a delete
expression to release the object becomes even more error prone for
arrays than it already is for single objects, even if the above code to
allocate an array would be portable:

  for (size_t t = s; t-- > 0; )
    array[t].~T();
  operator delete[](mem);

Actually, this is almost certainly necessary (I can't imagine a
situation where it would be suitable to use 'delete[] array' but where
the placement new is necessary) such that I wonder why the first
problem exists at all: There is no code which later accesses the data
stored by the new expression. The code which would need the data (e.g.
the loop above) cannot [portably] access the data anyway...  Are there
plans to allow passing of additional arguments in a delete expression?
As far as I know, there is still the problem to find an appropriate
syntax...
--
<mailto:dietmar.kuehl@uni-konstanz.de>
<http://www.informatik.uni-konstanz.de/~kuehl/>
I am a realistic optimist - that's why I appear to be slightly pessimistic

Author: boukanov@sentef1.fi.uib.no (Igor Boukanov)
Date: 1996/08/22

Tim Hollebeek (tim@franck.Princeton.EDU) wrote:
> Steve Clamage (clamage@pacific-88.Eng.Sun.COM) wrote:
> : In article fpo@news.BelWue.DE, kuehl@uzwil.informatik.uni-konstanz.de (Dietmar Kuehl) writes:
> : >Hi,
> : >
> : >I have some trouble understanding the use of 'operator new[](size_t
> : >size, void *ptr)' defined in lib.new.delete.placement: Is there any
> : >portable use of this function? This function does nothing except
> : >returning 'ptr', its second argument (according to
> : >lib.new.delete.placement).

> : The function is intended to provide a standard interface to a user-defined
> : allocator. You generally can't write such an allocator portably either.

> : Suppose you have a pointer to some area of memory which you somehow know
> : has the right size and alignment for a type T. You then have a standard
> : way to cause a constructor to be run at that address, creating a T.

> In this case, it works.  The correct size is sizeof(T), and the
> alignment isn't hard to fix.  But we are talking about _array_ new[];
> the compiler is allowed an incredible amount of freedom over and above
> n * sizeof(T), making it impossible to know how much memory to
> allocate, *even for a specific compiler*.

> Case in point: gcc adds 8 bytes if T has a non-trivial destructor, and
> none otherwise.  But there is no way to figure out whether the correct
> number is 8 or zero based on an arbitrary type T.

> I actually hacked gcc at one point to have an option to _disable_ the
> overhead for placement-array-new, since the only thing that needs it
> is placement-array-delete [which didn't exist at the time I made the
> hack].

> I found it much easier to manage the number of elements myself, since
> C++ has no way of figuring out the number of elements, or if/how the
> number is stored, and explicitly call destructors.

You can determine the amount of memory a compiler will ask for in new[]
with the following code:

void* operator new[](size_t size, size_t& retSize) {
   retSize = size;
   return NULL;
}

Here is an example of use (I skip all problems with alignment and with
NULL == malloc()....):

template<class T> T* userDefinedNew(size_t elemsCount) {
   size_t size;
   new(size) T[elemsCount];       // query only: records the requested size
   size += sizeof(size_t);        // room for our own element count
   size_t* p = static_cast<size_t*>(malloc(size));
   *p = elemsCount;
   return new (p + 1) T[elemsCount];
}

template<class T> void userDefinedDelete(void* p_) {
   size_t* p = static_cast<size_t*>(p_) - 1;
   T* pT = static_cast<T*>(p_);
   size_t elemsCount = *p;
   while(elemsCount > 0) {
      pT[--elemsCount].~T();
   }
   free(p);
}

class X { ... };

X* x = userDefinedNew<X>(100);
...
userDefinedDelete<X>(x);   // T cannot be deduced from void*, so name it

--
Regards, Igor Boukanov.
igor.boukanov@fi.uib.no
http://www.fi.uib.no/~boukanov/

Author: tim@franck.Princeton.EDU (Tim Hollebeek)
Date: 1996/08/22

Igor Boukanov (boukanov@sentef1.fi.uib.no) wrote:
: Tim Hollebeek (tim@franck.Princeton.EDU) wrote:

: > I actually hacked gcc at one point to have an option to _disable_ the
: > overhead for placement-array-new, since the only thing that needs it
: > is placement-array-delete [which didn't exist at the time I made the
: > hack].

: > I found it much easier to manage the number of elements myself, since
: > C++ has no way of figuring out the number of elements, or if/how the
: > number is stored, and explicitly call destructors.

: You can determine the amount of memory a compiler will ask for in new[]
: with the following code:

: void* operator new[](size_t size, size_t& retSize) {
:    retSize = size;
:    return NULL;
: }

No you can't.  If you read the draft, the size is allowed to vary from
call to call in the case of new()[].  The above only works for new().
This is ignoring the issue of new[] not returning a pointer to 'size'
bytes, as required.

Also note that returning 0 from operator new() invokes implementation
defined behavior [basic.stc.dynamic.allocation 4] unless I'm reading
it wrong.

BTW, isn't basic.stc.dynamic.allocation clause 4 impossible in view of
clause 2? ("... shall return a pointer to a block of available storage
at least as large as ...", then "if the allocation function returns
the null pointer ...")

Clause 13 of expr.new clearly allows this, although it is odd that the
implementation-definedness isn't mentioned there.  Help?

: Here is an example of use (I skip all problems with alignment and with
: NULL == malloc()....):

: template<class T> T* userDefinedNew(size_t elemsCount) {
:    size_t size;
:    new(size) T[elemsCount];

These next two lines will fail if size_t doesn't have the strictest
alignment requirements (likely since double or function pointers are
often worse than size_t).

:    size += sizeof(size_t);
:    size_t* p = static_cast<size_t*>(malloc(size));
:    *p = elemsCount;
:    return new (p + 1) T[elemsCount];

More undefined behavior.  There is no guarantee p points to enough
space (!).  This is likely to work in practice, but the standard
guarantees no such thing.

I don't think it is possible to write a strictly conforming program
that uses new()[] *at all*.  At this point, I'm almost willing to put
money on it.

---------------------------------------------------------------------------
Tim Hollebeek         | Disclaimer :=> Everything above is a true statement,
Electron Psychologist |                for sufficiently false values of true.
Princeton University  | email: tim@wfn-shop.princeton.edu
----------------------| http://wfn-shop.princeton.edu/~tim (NEW! IMPROVED!)

Author: tim@franck.Princeton.EDU (Tim Hollebeek)
Date: 1996/08/22

Dietmar Kuehl (kuehl@uzwil.informatik.uni-konstanz.de) wrote:

: BTW, note that the fact that there is no possibility to use a delete
: expression to release the object becomes even more error prone for
: arrays than it already is for single objects, even if the above code to
: allocate an array would be portable:

:   for (size_t t = s; t-- > 0; )
:     array[t].~T();
:   operator delete[](mem);

: Actually, this is almost certainly necessary (I can't imagine a
: situation where it would be suitable to use 'delete[] array' but where
: the placement new is necessary) such that I wonder why the first
: problem exists at all: There is no code which later accesses the data
: stored by the new expression. The code which would need the data (e.g.
: the loop above) cannot [portably] access the data anyway...  Are there
: plans to allow passing of additional arguments in a delete expression?
: As far as I know, there is still the problem to find an appropriate
: syntax...

The "There is no code which later accesses the data stored by the new
expression" argument is exactly why I hacked gcc not to use any
overhead in the placement-array-new case; the overhead is then zero,
so figuring out how much space is needed is no longer a problem.
However when I mailed in the patch to the gcc people, Mike Stump
replied along the lines that "placement-array-delete" had been added
at a recent meeting, and the code to save the size was in anticipation
of support for it being added to a future version of gcc.  Whether
"placement-array-delete" is in the latest draft or not I'm not sure;
it wasn't in the one I was reading when I was fiddling with all this,
but that was nearly a year ago.

It certainly would be more convenient to have a placement array
delete, since in order to call all the destructors, one must know how
many there are, but the compiler won't tell you!  This can lead to the
size being stored twice; once by the compiler, and once by you in
order to call destructors manually.

I like Dietmar's suggestion of a max overhead constant; typically it
will be small anyway, and anyone who wants to manually avoid the
problem and store the size themselves can do so via something like:

T *memory = static_cast<T *>(malloc(n * sizeof(T)));

for (int i = 0; i < n; i++)
    new (memory + i) T;
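
A fuller sketch of the element-by-element approach, assuming a C++11
compiler and a T whose alignment malloc() satisfies.  The names Tracked,
construct_n and destroy_n are hypothetical.  Because only the
single-object placement new is used, no array overhead is involved, and
the caller keeps the element count.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical element type that counts live objects so the sketch can
// be checked.
struct Tracked {
    static int live;
    Tracked()  { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Allocate raw storage for n objects and construct them one at a time
// with the single-object placement new.
template <class T>
T* construct_n(std::size_t n) {
    // Ask for at least one byte so that n == 0 still yields a usable pointer.
    void* raw = std::malloc(n ? n * sizeof(T) : 1);
    if (raw == nullptr) throw std::bad_alloc();
    T* first = static_cast<T*>(raw);
    std::size_t i = 0;
    try {
        for (; i < n; ++i)
            new (static_cast<void*>(first + i)) T;
    } catch (...) {
        // Undo the elements already constructed, in reverse order.
        while (i > 0) first[--i].~T();
        std::free(raw);
        throw;
    }
    return first;
}

// Destroy the n elements in reverse order, then release the storage.
// The caller must pass the same n that was given to construct_n.
template <class T>
void destroy_n(T* first, std::size_t n) {
    while (n > 0) first[--n].~T();
    std::free(first);
}
```

Typical use would be T* a = construct_n<T>(n); ... destroy_n(a, n); the
count is stored once, by the caller, rather than once by the compiler and
once by hand.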

---------------------------------------------------------------------------
Tim Hollebeek         | Disclaimer :=> Everything above is a true statement,
Electron Psychologist |                for sufficiently false values of true.
Princeton University  | email: tim@wfn-shop.princeton.edu
----------------------| http://wfn-shop.princeton.edu/~tim (NEW! IMPROVED!)

Author: kanze@lts.sel.alcatel.de (James Kanze US/ESC 60/3/141 #40763)
Date: 1996/08/22

In article <4vg0o2$7ai@news.BelWue.DE>
kuehl@uzwil.informatik.uni-konstanz.de (Dietmar Kuehl) writes:

|> My problem is more the function

|>  void *operator new[](size_t, void *p) { return p; }
|>      ^^ i.e, the array placement new.

|> : Here is an example of a portable use of this placement-new function:
|> [example using 'operator new(size_t,void*)' snipped]

|> I would be really interested in a portable use of 'operator
|> new[](size_t,void*)'!

I would also be interested in seeing a practical use for this operator,
supposing it was defined in a portable way.  In every case I can think
of, the reason for separating allocation from construction (i.e.: not
using the operator new expression directly), at least with arrays,
involves *NOT* constructing everything at once, but only on an as needed
basis (e.g.: as in vector< class T >).

|> I don't think that you can post any, basically
|> because there is no such use (at least, this is what I think and what I
|> got confirmed by Tim Hollebeek in private communication). And if there
|> is no portable use of 'operator new[](size_t,void*)', I don't see any
|> reason why this function should be in the standard: It suggests a false
|> sense of safety.

Orthogonality.  Presumably, this was the reason that it was added.
Given the problems you point out, it might be worth forgoing
orthogonality in this case.

If the function is removed (and I think it should be), the standard
should probably contain a footnote explaining why.  Otherwise (and
probably with the footnote anyway), the question will get asked here at
least once every three months:-).

|> Here is the problem with this function:

|>   class T {...};
|>   ...
|>   size_t s = ...;
|>   void *mem = operator new[](sizeof(T) * s + overhead);
|>   T *array = new(mem) T[s];

|> The WP says, that the last expression calls

|>   operator new[](sizeof(T)*s + y, mem)

|> to allocate the necessary memory (this is only a marginal modification
|> of the example in [expr.new] paragraph 12) where 'y' is some
|> implementation-defined value which might vary from one invocation of
|> 'new' to another. Everything would be fine, if it would be possible to
|> know in advance, what 'y' will be or alternatively, what 'y' would be at
|> most: This is indicated by the term 'overhead' in the expression
|> allocating memory. Basically, I think I would like to modify the
|> sentence in [expr.new] paragraph 12 which is in the DWP as

|>   Their value might  vary  from  one invocation of new to another.

|> to something like this:

|>   Their value might  vary  from  one invocation of new to another but
|>   is at most 'std::max_array_overhead' which is a non-negative,
|>   implementation defined integral constant defined in <new>.

This would be an alternative solution.  My impression at present is that
it is more trouble than it is worth (and it would probably take at least
two committee meetings just to agree on a name).  Can you show any
practical use for the operator if this suggestion were to be adopted?
--
James Kanze         Tel.: (+33) 88 14 49 00        email: kanze@gabi-soft.fr
GABI Software, Sarl., 8 rue des Francs-Bourgeois, F-67000 Strasbourg, France
Conseils, études et réalisations en logiciel orienté objet --
                -- A la recherche d'une activité dans une region francophone



Author: kuehl@uzwil.informatik.uni-konstanz.de (Dietmar Kuehl)
Date: 1996/08/23

Hi,
Igor Boukanov (boukanov@sentef1.fi.uib.no) wrote:
: You can determine the amount of memory a compiler will ask for in new[]
: with the following code:

: void* operator new[](size_t size, size_t& retSize) {
:    retSize = size;
:    return NULL;
: }

: Here is an example of use (I skip all problems with alignment and with
: NULL == malloc()....):

: template<class T> T* userDefinedNew(size_t elemsCount) {
:    size_t size;
:    new(size) T[elemsCount];
:    size += sizeof(size_t);
:    size_t* p = static_cast<size_t*>(malloc(size));
:    *p = elemsCount;
:    return new (p + 1) T[elemsCount];
: }

Just a simple question: Can you back up that this works on the grounds
of the DWP? Did you consider expr.new paragraph 12? I can't see why the
second new expression should not require more memory (although I agree
that this is not what I would expect to happen). I won't accept a
statement like this:  "This code performs as expected with all C++
systems I have".  This does not prevent some weird guy from implementing a
different C++ system where expr.new paragraph 12 is explicitly used!  I
don't want to have to tell my boss "Oh sorry, the multi-million dollar
project failed because I assumed there is no weird compiler out
there...".

I'm convinced that your code can fail on a standard conforming compiler
because the second call to new requests, for whatever strange reason,
more memory and also USES this memory. I can't think of a reason why
this would be the case but this is not the point: It is allowed and
thus might be used. In total: Nice try. The problem remains! To make
'operator new[](size_t,void*)' useful there has to be some stronger
guarantee.
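
One way to at least detect the failure described above, sketched under the
assumption of a C++11 compiler: instead of the library's
'operator new[](size_t,void*)', use a user-defined placement form that is
told how large the buffer is and refuses a request that does not fit.  The
names Arena, Item and make_items are hypothetical.  This does not supply
the missing guarantee; it only turns a silent overrun into an exception.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical placement argument: a buffer together with its capacity.
struct Arena {
    void*       base;
    std::size_t capacity;
};

// Receives the size the implementation actually requests for this
// new-expression (n * sizeof(T) + y) and rejects it if it does not fit.
void* operator new[](std::size_t requested, const Arena& arena) {
    if (requested > arena.capacity) throw std::bad_alloc();
    return arena.base;
}

// Matching placement deallocation function; only called if a constructor
// throws.  The arena does not own its storage, so nothing is freed.
void operator delete[](void*, const Arena&) noexcept {}

struct Item {
    int v;
    Item() : v(7) {}
    ~Item() {}          // non-trivial, so an implementation may add overhead
};

// Builds n Items inside the arena, or throws std::bad_alloc if the
// request (including any array overhead) exceeds the arena's capacity.
// The caller destroys the elements by hand and must remember n.
Item* make_items(const Arena& arena, std::size_t n) {
    return new (arena) Item[n];
}
```

The caller still has to guess a capacity, but a wrong guess is now
reported at the point of the new-expression rather than corrupting
whatever follows the buffer.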
--
<mailto:dietmar.kuehl@uni-konstanz.de>
<http://www.informatik.uni-konstanz.de/~kuehl/>
I am a realistic optimist - that's why I appear to be slightly pessimistic

