Topic: new foo[0]


Author: schuenem@Informatik.TU-Muenchen.DE (Ulf Schuenemann)
Date: 14 Jun 1994 10:22:50 GMT
In article <JASON.94Jun6201445@deneb.cygnus.com>, jason@cygnus.com (Jason Merrill) writes:
|> >>>>> Adrian Filipi-Martin <adrian@mo.cs.wm.edu> writes:
|>
|> >         Has anyone ever considered providing a mechanism for either
|> > accessing the fields of the internal data or validating it from within
|> > C++? The inaccessibility of this data was what made finding the bug so
|> > difficult. Even an implementation dependent mechanism would be
|> > appreciated.
|>
|> Here's the way to do it in g++ currently.  The 'union foo' bit is to force
|> the pointer to be aligned like a double.  Note that the cookie will not be
|> used unless the class has either a destructor or an operator delete [] that
|> takes the optional size_t argument, so array_size cannot be applied to all
|> arrays allocated by new.
|>
|> Jason
|>
|> typedef unsigned long size_t;
|> extern "C" int printf (const char *, ...);
|>
|> size_t array_size (void *p)
|> {
|>   union foo {
|>     struct __new_cookie {
|>       size_t nelts;
|>     } c;
|>     double d;
|>   };
|>
|>   foo *fp = (foo *)p;
|>   --fp;
|>
|>   return fp->c.nelts;
|> }
|>
|> struct A {
|>   ~A() { }
|> };
|>
|> main()
|> {
|>   A *ap = new A[3];
|>   printf ("%ld\n", array_size (ap));
|> }
|>

To implement the delete[] operator, new[] must store information about
the size of the array somewhere (is there a way to do without it?).
I would appreciate it if there were a STANDARD way of obtaining this
information at runtime (with a new keyword - oh, how ugly. I'll
call it XXX here):
With pa being a kind of reference (pointer) to an array:

 XXX(pa)

returns the number of elements in the array referred to by pa.
To work in C++, pa must not be a pointer to an element (a single object
or somewhere in an array), but a reference to (the beginning of)
an array. Its type would be declared by:

 T  (&pa)[];

This pa may only refer to the beginning of an array, not to somewhere
inside it. So "int a[10], (&pa)[]=a" is allowed but ".. (&pa)[]=&a[2]"
is not. The following should then be possible:


void fct(int *p1, int p2[], int (&pa)[])
{
 printf("sizes= %d %d %d elems=%d\n",sizeof(p1),sizeof(p2),sizeof(pa),XXX(pa));

}
void main()
{ int a[10];
 fct(a,a,a);
}

results in "sizes= 4 4 40 elems=10"

I would prefer this over some array_size hack (see above). What is
your opinion?

Ulf Schuenemann

--------------------------------------------------------------------
Ulf Schünemann
Institut für Informatik, Technische Universität München.
email: schuenem@informatik.tu-muenchen.de
WWW:   http://hphalle2/~schuenem  (currently not available from outside)





Author: fjh@munta.cs.mu.OZ.AU (Fergus Henderson)
Date: Tue, 14 Jun 1994 14:40:41 GMT
schuenem@Informatik.TU-Muenchen.DE (Ulf Schuenemann) writes:

>To implement the delete[]-operator new[] must store information about
>size of the array somewhere (Is there a way to do without it?).
>I would appreciate if there were a STANDARD for obtaining this
>information at runtime (with a new keyword - oh, how ugly. I'll
>call it XXX here):

I agree that this would be a good idea, but it shouldn't be
a new keyword, it should be a library function.

>With pa being a kind of reference(pointer) to an array:
>
> XXX(pa)
>
>returns the number of elements in the array refered to by pa.
>To work in C++ pa must not be a pointer to an element (single object
>or somewhere in an array), but a refenence to (the beginning of)
>an array.

Nah, just make `pa' a pointer and require that it point to the
first element of an array allocated using new[], otherwise the
behaviour is undefined.

 template <class T>
 size_t dynamic_array_size(T* pointer_to_first_element);

--
Fergus Henderson - fjh@munta.cs.mu.oz.au




Author: jason@cygnus.com (Jason Merrill)
Date: Tue, 14 Jun 1994 20:07:13 GMT
>>>>> Ulf Schuenemann <schuenem@Informatik.TU-Muenchen.DE> writes:

> To implement the delete[]-operator new[] must store information about
> size of the array somewhere (Is there a way to do without it?).

As I said in my previous article, this is not true for all arrays, only
arrays of objects that have either destructors or an operator delete []
which wants to know the size of the block of memory.

Jason





Author: maxtal@physics.su.OZ.AU (John Max Skaller)
Date: Wed, 15 Jun 1994 11:12:50 GMT
In article <JASON.94Jun14130713@deneb.cygnus.com> jason@cygnus.com (Jason Merrill) writes:
>
>As I said in my previous article, this is not true for all arrays, only
>arrays of objects that have either destructors or an operator delete []
>which wants to know the size of the block of memory.

 All class types have destructors, if not user written
then compiler generated. In both cases the destructor may
be 'trivial', that is, be optimisable to nothing.

--
        JOHN (MAX) SKALLER,         INTERNET:maxtal@suphys.physics.su.oz.au
 Maxtal Pty Ltd,      CSERVE:10236.1703
        6 MacKay St ASHFIELD,     Mem: SA IT/9/22,SC22/WG21
        NSW 2131, AUSTRALIA




Author: rfg@netcom.com (Ronald F. Guilmette)
Date: Sat, 18 Jun 1994 06:10:39 GMT
In article <9415818.9649@mulga.cs.mu.OZ.AU> fjh@munta.cs.mu.OZ.AU (Fergus Henderson) writes:
>eschwarz@shearson.com (Edward Schwarz) writes:
>
>>What, if any, is the difference between
>>
>>A) foo* myfoo0 = new foo[0];
>>
>>and
>>
>>B) foo* myfoo1 = new foo[1];
>
>Yes, there is a difference: after (A), the expression `myfoo0[0]' is not legal,
>whereas after (B), the expression `myfoo1[0]' is quite legal.

Are you sure about that Fergus?  In all contexts?

How about if we put a unary & out in front?

I mean like:

 &(myfoo0[0])

??

--

-- Ron Guilmette, Sunnyvale, CA ---------- RG Consulting -------------------
---- domain addr: rfg@netcom.com ----------- Purveyors of Compiler Test ----
---- uucp addr: ...!uunet!netcom!rfg ------- Suites and Bullet-Proof Shoes -




Author: fjh@munta.cs.mu.OZ.AU (Fergus Henderson)
Date: Sat, 18 Jun 1994 19:06:33 GMT
rfg@netcom.com (Ronald F. Guilmette) writes:
>fjh@munta.cs.mu.OZ.AU (Fergus Henderson) writes:
>>eschwarz@shearson.com (Edward Schwarz) writes:
>>
>>>What, if any, is the difference between
>>>A) foo* myfoo0 = new foo[0];
>>>B) foo* myfoo1 = new foo[1];
>>
>>Yes, there is a difference: after (A), the expression `myfoo0[0]' is not
>>legal, whereas after (B), the expression `myfoo1[0]' is quite legal.
>
>Are you sure about that Fergus?  In all contexts?
>How about if we put a unary & out in front?
>I mean like: &(myfoo0[0]) ??

Ah - now that you mention it - no, not in all contexts.
I do think that its use as the operand of `unary &' should be allowed.
I don't know what the ARM or the latest working paper has to say
about that.  (But most likely this is one of the areas where the
working paper is "incorect, incomplet, and inConSisTeNT".)

[On a related note, I don't think it should be allowed as the
initializer for a reference, as in for example

 foo& r = myfoo0[0];
]
--
Fergus Henderson - fjh@munta.cs.mu.oz.au




Author: jason@cygnus.com (Jason Merrill)
Date: Sat, 18 Jun 1994 22:08:33 GMT
>>>>> Fergus Henderson <fjh@munta.cs.mu.OZ.AU> writes:

> Ah - now that you mention it - no, not in all contexts.
> I do think that it's use as the operand of `unary &' should be allowed.
> I don't know what the ARM or the latest working paper has to say
> about that.  (But most likely this is one of the areas where the
> working paper is "incorect, incomplet, and inConSisTeNT".)

I think that this falls into the provisions of [expr.add], since this
expression is equivalent to &(*(myfoo0+0)).  This section says basically
that evaluating a pointer arithmetic expression which points one past the
end of an array is allowed, but using that expression as the operand of the
unary * operator yields undefined behavior.  A strict reading would
indicate that the behavior of &myfoo0[0] is thus undefined, but I suspect
that [expr.add] only meant to deal with actual dereferences.

> [On a related note, I don't think it should be allowed as the
> initializer for a reference, as in for example

>  foo& r = myfoo0[0];

Certainly such usage would yield undefined behavior, if 'r' were ever
actually used for anything, and consequently an implementation can give a
diagnostic for such code.  I think that explicitly prohibiting this
particular initialization would, however, be excessive special-casing.

Jason




Author: maxtal@physics.su.OZ.AU (John Max Skaller)
Date: Sun, 19 Jun 1994 02:37:10 GMT
In article <JASON.94Jun18150834@deneb.cygnus.com> jason@cygnus.com (Jason Merrill) writes:

>>  foo& r = myfoo0[0];
>
>Certainly such usage would yield undefined behavior, if 'r' were ever
>actually used for anything, and consequently an implementation can give a
>diagnostic for such code.

 What do you mean by 'use'?
It's a sadly abused term. In general for class types "use"
means "call a virtual member function virtually" or "select a data member
of scalar type in a context requiring loading a value" or
"enquire as to the Run Time Type" (and a few other things
like conversions of pointers to virtual bases).

 Everything else is just address calculations not
requiring access to the object.

 Even calling a non-virtual member is just address
calculations: only virtual function calls actually
require an object to exist. Address calculations should be
legal if, and only if, they compute valid addresses:
binding a reference may be thought of as the same as
initialising a pointer.

 Thus above, computing the address of a non-static
member by say:

 r.mem

is invalid (it's "more" than one past the end of the array :-)
But forming the reference 'r' is not.

Such an address calculation is perfectly fine, however,
even if the object does not exist, provided the underlying
memory does -- ONLY attempts to read memory should yield
undefined behaviour (including use of virtual bases
or functions and reading the value of a scalar member subobject.)

--
        JOHN (MAX) SKALLER,         INTERNET:maxtal@suphys.physics.su.oz.au
 Maxtal Pty Ltd,      CSERVE:10236.1703
        6 MacKay St ASHFIELD,     Mem: SA IT/9/22,SC22/WG21
        NSW 2131, AUSTRALIA




Author: fjh@munta.cs.mu.OZ.AU (Fergus Henderson)
Date: Sun, 19 Jun 1994 07:16:24 GMT
jason@cygnus.com (Jason Merrill) writes:

>>>>>> Fergus Henderson <fjh@munta.cs.mu.OZ.AU> writes:
>
>> I do think that it's use as the operand of `unary &' should be allowed.
>
>I think that this falls into the provisions of [expr.add], since this
>expression is equivalent to &(*(myfoo0+0)).  This section says basically
>that evaluating a pointer arithmetic expression which points one past the
>end of an array is allowed, but using that expression as the operand of the
>unary * operator yields undefined behavior.  A strict reading would
>indicate that the behavior of &myfoo[0] is thus undefined, but I suspect
>that [expr.add] only meant to deal with actual dereferences.

Yes, the working paper is incorrect here, IMHO.
But that also raises the question of the legality of `&*foo'.
Here there is no addition involved, so that section doesn't apply.

--
Fergus Henderson - fjh@munta.cs.mu.oz.au




Author: schuenem@Informatik.TU-Muenchen.DE (Ulf Schuenemann)
Date: 15 Jun 1994 15:46:52 GMT

Sorry for the example in my first article. It didn't match what I was saying.
(int a[10] creates no information of how big the array is.... This would already
be an extension to the original idea)

Henderson = fjh@munta.cs.mu.OZ.AU (Fergus Henderson) wrote:
Henderson> >To implement the delete[]-operator new[] must store information about
Henderson> >size of the array somewhere (Is there a way to do without it?).
Henderson> >I would appreciate if there were a STANDARD for obtaining this
Henderson> >information at runtime (with a new keyword - oh, how ugly. I'll
Henderson> >call it XXX here):
Henderson>
Henderson> I agree that this would be a good idea, but it shouldn't be
Henderson> a new keyword, it should be a library function.

Maybe one could use (misuse?) "sizeof(pa)" to give the dynamic (byte-)size
of an array pointed to by a (special - see below) type of 'pointer to array'.
And make XXX a (standard) macro: #define XXX(pa)  (sizeof(pa)/sizeof(pa[0]))

Henderson> >To work in C++ pa must not be a pointer to an element (single object
Henderson> >or somewhere in an array), but a refenence to (the beginning of)
Henderson> >an array.
Henderson>
Henderson> Nah, just make `pa' a pointer and require that it point to the
Henderson> first element of an array allocated using new[], otherwise the
Henderson> behaviour is undefined.

These are the same requirements that apply to delete[] pa: only call
delete[] pa if pa points to the first element of an array allocated using new[].

I think the constraint "must point to first element of an array" could be
integrated into the typesystem (I'm not sure if it is worth it, I'm just
thinking about it).
Say that "TPA pa" declares a pointer to an array of Ts. The difference to
conventional pointers "T *p" would be:
 T a[10]
 pa = a  - ok
 pa++  - illegal: pa may not leave the first element of an array
 pa = p  - illegal: pa must point to an array, could be that p isn't
 ps = new T - illegal: pa must point to an array, new T isn't
 ps = delete - illegal: pa(!=NULL) points to an array, not just one object
 pa = new T[i] - ok
 XXX(pa)  - ok, returns the number of elements, if pa points to
     a new[]ed array AND if T is a type with destructor or
     op delete[] (see Merrill), else undefined
 delete[] pa - ok
 XXX(p)  - not ok (illegal / always = 1)

TPA could be something like "T (*pa)[]" or maybe a template class PtoArray<class T>
(but I'm not experienced enough to predict whether I could achieve everything I intend
 with PtoArray).

Merrill = jason@cygnus.com (Jason Merrill) wrote:
Merrill> > To implement the delete[]-operator new[] must store information about
Merrill> > size of the array somewhere (Is there a way to do without it?).
Merrill>
Merrill> As I said in my previous article, this is not true for all arrays, only
Merrill> arrays of objects that have either destructors or an operator delete []
Merrill> which wants to know the size of the block of memory.
Merrill>
Merrill> Jason

Yes. This is a restriction. Maybe there could be a (standard/compiler-optional/
library) way to force new[] to create this information even without dtor/delete[].


Ulf Schuenemann

--------------------------------------------------------------------
Ulf Schünemann
Institut für Informatik, Technische Universität München.
email: schuenem@informatik.tu-muenchen.de
WWW:   http://hphalle2/~schuenem  (currently not available from outside)





Author: schuenem@Informatik.TU-Muenchen.DE (Ulf Schuenemann)
Date: 15 Jun 1994 17:52:10 GMT
> Say that "TPA pa" declares a pointer to an array of Ts. The difference to
> conventional pointers "T *p" would be:
> ...
I would like to correct myself:
 pa = new T - illegal: pa must point to an array, new T isn't
 ^^ (not ps)
 delete  - illegal: pa(!=NULL) points to an array, not just one object


And I would like to add another rule for TD being derived from T:
 pa = new TD[i]  - illegal, as there is no runtimeinfo for
      how to address pa[1]

C/C++ calculates offsets into arrays statically. So &(p[1]) is always the
same as ((char*)p + sizeof(T)). But with subtyping, p can point to an object of
a derived class TD that is bigger than sizeof(T). Then with p = new TD[2],
p[1] will not refer to the second object in the array, but to some
address after the base object inside p[0]! This may be especially painful
if delete[] calls the destructors of T with the this-pointer pointing to
something that is not a (subobject) T.


If we do not want undefined behavior and do not want to forbid "pa = new TD[i]",
i.e. if we want to refer correctly to the objects in an array, the offset
has to be calculated at runtime from the size of the objects currently
in the array. For classes, this information could be retrieved somehow via
the dynamic type of the first element in the array (current RTTI),
OR new[] could store not only the number of elements in an array, but also
info about the size of the objects currently in the array.

To not break existing code, and because it requires a pointer to
the beginning of the array, runtime-offset-calculation should
only be performed for indexing of pointer-to-array TPA (however
TPA looks like in concrete syntax)


Then we could circumvent the constraint described below by using a
TPA p instead of a T*p:

[In "Re: ANSI draft bug and subsequent compiler bugs"
 Schwarz = jss@lucid.com (Jerry Schwarz) wrote:]
Schwarz>
Schwarz> When you delete an array the (static) type of the pointer must be that
Schwarz> of the actual type of the array objects.  In other words if you do
Schwarz>
Schwarz>         delete [] p ;
Schwarz>
Schwarz> where  p has type "pointer to T".  Then p must originally have been
Schwarz> allocated with
Schwarz>
Schwarz>         new T[...] ;
Schwarz>
Schwarz> This constraint is imposed so that the compiler knows how large
Schwarz> the elements of the array are and what destructor to call on
Schwarz> them is.
Schwarz>
Schwarz> It is true that the system could be required to stash away this
Schwarz> information (and since RTTI has been added to the language it will be
Schwarz> required to) and use it at the delete, but the current working paper
Schwarz> does not impose that overhead on runtime systems.
Schwarz>
Schwarz> I don't know if there is a proposal before the committee to change
Schwarz> this requirement.
Schwarz>
Schwarz>   -- Jerry Schwarz(jss@lucid.com)





Author: jason@cygnus.com (Jason Merrill)
Date: Wed, 15 Jun 1994 20:23:17 GMT
>>>>> John Max Skaller <maxtal@physics.su.OZ.AU> writes:

>  All class types have destructors, if not user written then compiler
> generated. In both cases the destructor may be 'trivial', that is, be
> optimisable to nothing.

I disagree.  If a class has a base or member with a destructor, the
compiler will generate one; otherwise, it won't.  There is a lot of text in
[class.dtor] which talks about classes with or without destructors.

Jason




Author: adrian@mo.cs.wm.edu (Adrian Filipi-Martin)
Date: Mon, 6 Jun 1994 15:20:21 GMT
In article <CquD6C.LEK@lehman.com>, eschwarz@shearson.com (Edward Schwarz) writes:
|> A) foo* myfoo0 = new foo[0];
|>
|> and
|>
|> B) foo* myfoo1 = new foo[1];
|>
|> The ARM says that A) will return a pointer to an object, and that distinct calls will return distinct objects. This would seem to imply that there is no difference between A) and B).
|>

        I think the first one should be read as returning a valid pointer to an
array of objects, not returning a pointer to an object. In order to successfully
execute a delete[] on any pointer to an array of objects, some internal (hidden)
bytes are allocated (usually immediately before the actual array) that contain
the size of the array of objects. In case A, this is 0. But the space to hold the 0
was still allocated and is only reclaimed when delete[] is used.

        How this works became painfully clear when I had an off-by-1 that stepped
on the bytes immediately preceding, but not overlapping, an array of objects. My
visible data was never trashed, but delete[] would cause a core dump. The only
way I discovered the bug (the off-by-1 was in a non-trivial nested loop) was to
watch the memory via gdb for any changes in the vicinity of the array.

        Has anyone ever considered providing a mechanism for either accessing the
fields of the internal data or validating it from within C++? The inaccessibility
of this data was what made finding the bug so difficult. Even an implementation
dependent mechanism would be appreciated.

cheers,
 Adrian
--
adrian@cs.wm.edu          ---->>>>| Support you local programmer,
adrian@icase.edu            --->>>| STOP Software Patent Abuses NOW!
Member: The League for        -->>| membership info at prep.ai.mit.edu:/pub/lpf
       Programming Freedom      ->| print "join.ps" for an application




Author: jason@cygnus.com (Jason Merrill)
Date: Tue, 7 Jun 1994 03:14:45 GMT
>>>>> Adrian Filipi-Martin <adrian@mo.cs.wm.edu> writes:

>         Has anyone ever considered providing a mechanism for either
> accessing the fields of the internal data or validating it from within
> C++? The inaccessibility of this data was what made finding the bug so
> difficult. Even an implementation dependent mechanism would be
> appreciated.

Here's the way to do it in g++ currently.  The 'union foo' bit is to force
the pointer to be aligned like a double.  Note that the cookie will not be
used unless the class has either a destructor or an operator delete [] that
takes the optional size_t argument, so array_size cannot be applied to all
arrays allocated by new.

Jason

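#include <stddef.h>   /* size_t */
#include <stdio.h>    /* printf */

// Implementation-dependent: reads the element count that g++ stores just
// before an array allocated with new[] (the "cookie").
size_t array_size (void *p)
{
  // The union forces the cookie to be aligned like a double.
  union foo {
    struct __new_cookie {
      size_t nelts;
    } c;
    double d;
  };

  foo *fp = (foo *)p;
  --fp;                 // step back over the cookie

  return fp->c.nelts;
}

struct A {
  ~A() { }              // without a destructor, no cookie would be stored
};

int main()
{
  A *ap = new A[3];
  printf ("%lu\n", (unsigned long) array_size (ap));
  delete [] ap;
}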




Author: fjh@munta.cs.mu.OZ.AU (Fergus Henderson)
Date: Tue, 7 Jun 1994 08:59:27 GMT
eschwarz@shearson.com (Edward Schwarz) writes:

>What, if any, is the difference between
>
>A) foo* myfoo0 = new foo[0];
>
>and
>
>B) foo* myfoo1 = new foo[1];

Yes, there is a difference: after (A), the expression `myfoo0[0]' is not legal,
whereas after (B), the expression `myfoo1[0]' is quite legal.

>The ARM says that A) will return a pointer to an object, and that
>distinct calls will return distinct objects. This would seem to imply
>that there is no difference between A) and B).

The object which is pointed to by the return value is a zero-length array
of `foo', not an object of type `foo'.

--
Fergus Henderson - fjh@munta.cs.mu.oz.au




Author: eschwarz@shearson.com (Edward Schwarz)
Date: Fri, 3 Jun 1994 22:01:23 GMT
given a class

class foo
{
public:
   int x,y;
   foo()  { x=0; y=0; }
   ~foo() { x=1; y=1; }
};

What, if any, is the difference between

A) foo* myfoo0 = new foo[0];

and

B) foo* myfoo1 = new foo[1];

The ARM says that A) will return a pointer to an object, and that distinct calls will return distinct objects. This would seem to imply that there is no difference between A) and B).

Thanks for any comments on this one.

---
Ed Schwarz - eschwarz@shearson.com