Thread

Topic: Suggestion: array_size()

Author: Michiel.Salters@logicacmg.com (Michiel Salters)
Date: Fri, 4 Jun 2004 16:07:31 +0000 (UTC)

nagle@animats.com (John Nagle) wrote in message news:<uTIvc.64501$Bh2.21343@newssvr29.news.prodigy.com>...
> Allan W wrote:
> >>jgottman@carolina.rr.com ("Joe Gottman") wrote
> >>
> >>>   I therefore propose a new function:
> >>>    template <typename T>
> >>>    size_t array_size(T const *p)
>
>     At least for fixed-count arrays (i.e. situations
> where sizeof() is meaningful), there should be
> some way to find the capacity of the array at
> compile time.
> The traditional C approach:
>
>  sizeof(tab)/sizeof(tab[0])
>
> can fail when there is alignment fill not counted
> by "sizeof".

All alignment fill must be counted by sizeof, so your statement
is empty. See 3.9/4 of the C++ standard.

Regards,
Michiel Salters

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]

Author: kuyper@wizard.net (James Kuyper)
Date: Fri, 4 Jun 2004 17:42:42 +0000 (UTC)

nagle@animats.com (John Nagle) wrote in message news:<uTIvc.64501$Bh2.21343@newssvr29.news.prodigy.com>...
[...]
> The traditional C approach:
>
>  sizeof(tab)/sizeof(tab[0])
>
> can fail when there is alignment fill not counted
> by "sizeof".

Such fill is supposed to be included in sizeof(). The formula you give
is explicitly blessed by the C standard as a correct way of
calculating that number.
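
To illustrate the point, here is a minimal sketch using a hypothetical
element type, Padded, that most ABIs would pad for alignment. The names
Padded, tab and tab_count are assumptions made for this example only.
Because sizeof(Padded) already includes any trailing padding, the array
occupies exactly 7 * sizeof(Padded) bytes and the quotient is the element
count.

```cpp
#include <cstddef>

// Hypothetical element type: most ABIs add padding after `tag` so that
// `value` stays correctly aligned in every element of an array.
struct Padded {
    double value;
    char tag;
};

// A fixed-count array whose size is known at compile time.
Padded tab[7];

// The traditional idiom; any padding is part of sizeof(tab[0]).
constexpr std::size_t tab_count = sizeof(tab) / sizeof(tab[0]);
```

The idiom only applies while tab is still an array. Once it has decayed
to a pointer, sizeof(tab) is the size of the pointer and the quotient is
meaningless.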

---

Author: allan_w@my-dejanews.com (Allan W)
Date: Wed, 2 Jun 2004 12:44:48 CST

> jgottman@carolina.rr.com ("Joe Gottman") wrote
> >    I therefore propose a new function:
> >     template <typename T>
> >     size_t array_size(T const *p)
> > that returns n when p was created using new T[n], return 0 when p is null,
> > and has undefined behavior in all other cases.  Note that array_size(p)
> > would have well-defined semantics if and only if delete[] p does.

Michiel.Salters@logicacmg.com (Michiel Salters) wrote
[snip]
> this function would affect the interoperability with C. If
> they don't add this function, C arrays will not have a count whereas C++
> arrays of POD elements will. Yet with the proposed syntax (template), C
> can't add it.

First, I don't think we need to specify that this is implemented as a
template... it's okay just to have "compiler magic" to make it work.

Second, this works only if p was created with "new T[n]" which isn't
legal in C anyway.

> Personally, I don't like the undefined behavior for
> int a[10]; array_size(a);

That's a good point. And it's ironic, because this is one of the situations
where we can currently find the array size (not quite this easily though).
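
For reference, a sketch of how the size of a real array (not a pointer)
can be recovered today with a function template. The name
static_array_size is hypothetical, chosen so it does not clash with the
proposed array_size, and constexpr assumes a C++11 or later compiler.

```cpp
#include <cstddef>

// Deduce N from a reference to an array. A pointer argument does not
// match this signature, so misuse is a compile-time error rather than
// undefined behavior.
template <typename T, std::size_t N>
constexpr std::size_t static_array_size(T (&)[N]) noexcept {
    return N;
}
```

With int a[10]; the call static_array_size(a) yields 10, while passing
the result of new int[10] fails to compile because no array type is
available for deduction.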

---

Author: nagle@animats.com (John Nagle)
Date: Thu, 3 Jun 2004 17:28:57 +0000 (UTC)

Allan W wrote:
>>jgottman@carolina.rr.com ("Joe Gottman") wrote
>>
>>>   I therefore propose a new function:
>>>    template <typename T>
>>>    size_t array_size(T const *p)

    At least for fixed-count arrays (i.e. situations
where sizeof() is meaningful), there should be
some way to find the capacity of the array at
compile time.
The traditional C approach:

 sizeof(tab)/sizeof(tab[0])

can fail when there is alignment fill not counted
by "sizeof".


   John Nagle

---

Author: allan_w@my-dejanews.com (Allan W)
Date: Wed, 2 Jun 2004 12:29:32 CST

"Adam H. Peterson" <ahp6@email.byu.edu> wrote
> > In order to make array_size<> work you would have two choices:
> >
> > 1.  Require it to work for all types, in which case you would be
> > adding otherwise unnecessary storage overhead to many array
> > allocations
> >
> > .or.
> >
> > 2.  Allow the compiler to throw an exception if it can't determine the
> > array's size
> >
> >
> > Clearly, neither is a good solution.
>
> 3. Require it to work for non-POD types and allow it to return 0 (or
> some other sentinel) for POD types.
>
> 4. Require it to work for non-POD types and provide unspecified behavior
> (or maybe UB?) for POD types.
>
> 5. Require it to work for non-POD types and provide a diagnostic if used
> for PODs.
>
> There may be other ways to specify array_size<>, but what has come up so
> far doesn't please me much.  I'm not crazy about 1 or 2, naturally.  I
> don't care for 3 either -- the inconsistency makes the facility less
> than ideal and probably prone to create subtle bugs in generic code.  4
> is even worse in that respect.  I like 5 best of these, if it gets added
> at all, but it still suffers from inconsistency, provides one more thing
> the implementors have to implement, and may interfere with optimizations
> if an implementation can determine that destructors can be elided.

How about this.

array_size() on any array where sizeof() is available -- returns the
exact count. Easily calculated as sizeof(array) / sizeof(array[0]).

array_size() on pointer to beginning of non-POD array allocated with
new[] -- returns the exact count. Clearly this number is already available
somewhere, in order to implement delete[] correctly. If this requires
non-trivial calculations on some strange platform, so be it -- the standard
will impose no performance guarantees (QOI issue).

array_size() on pointer to beginning of POD array allocated with new[] --
returns a count representing the amount of space allocated. This number is
at least the number requested in the new-expression, and may be slightly
higher (reflecting the fact that e.g. new char[5] might actually allocate 16
bytes). Clearly this number is available in order to implement delete
correctly.

array_size() on ANY other pointer (e.g. 1+(new char[10]), pointer to an
array on the stack, pointer to static array, etc) -- undefined behavior.

The only problem with this, as I see it, is that we need to add a hook
into new/delete to return a memory block's size. This is a problem with
user-written new/delete. Currently new/delete has to have this
information, but there is no public interface to read it.
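
As a rough sketch of what such a hook could look like, assume a
user-written allocator that keeps the requested size in a header in
front of each block. Everything here (the sketch namespace, BlockHeader,
allocate, block_size, deallocate) is hypothetical and is not an
interface of any existing implementation. The behavior for "ANY other
pointer" remains undefined, exactly as described above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

namespace sketch {

// Header stored in front of every block. alignas keeps the payload that
// follows it suitably aligned for any fundamental type.
struct alignas(std::max_align_t) BlockHeader {
    std::size_t bytes;  // size requested by the caller
};

// Allocate `bytes` of payload and remember the request in the header.
void* allocate(std::size_t bytes) {
    void* raw = std::malloc(sizeof(BlockHeader) + bytes);
    if (raw == nullptr) throw std::bad_alloc();
    BlockHeader* header = new (raw) BlockHeader{bytes};
    return header + 1;  // payload starts right after the header
}

// The "public interface" the post asks for: read the stored size back.
// Only valid for pointers returned by allocate().
std::size_t block_size(const void* p) {
    assert(p != nullptr);
    return (static_cast<const BlockHeader*>(p) - 1)->bytes;
}

// Release a block obtained from allocate(); null is ignored.
void deallocate(void* p) noexcept {
    if (p != nullptr) std::free(static_cast<BlockHeader*>(p) - 1);
}

}  // namespace sketch
```

The header is the storage overhead discussed elsewhere in this thread:
every allocation pays for it, whether or not anyone ever asks for the
size.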

---

Author: "Adam H. Peterson" <ahp6@email.byu.edu>
Date: Tue, 25 May 2004 23:54:38 CST

> In order to make array_size<> work you would have two choices:
>
> 1.  Require it to work for all types, in which case you would be
> adding otherwise unnecessary storage overhead to many array
> allocations
>
> .or.
>
> 2.  Allow the compiler to throw an exception if it can't determine the
> array's size
>
>
> Clearly, neither is a good solution.

3. Require it to work for non-POD types and allow it to return 0 (or
some other sentinel) for POD types.

4. Require it to work for non-POD types and provide unspecified behavior
(or maybe UB?) for POD types.

5. Require it to work for non-POD types and provide a diagnostic if used
for PODs.

There may be other ways to specify array_size<>, but what has come up so
far doesn't please me much.  I'm not crazy about 1 or 2, naturally.  I
don't care for 3 either -- the inconsistency makes the facility less
than ideal and probably prone to create subtle bugs in generic code.  4
is even worse in that respect.  I like 5 best of these, if it gets added
at all, but it still suffers from inconsistency, provides one more thing
the implementors have to implement, and may interfere with optimizations
if an implementation can determine that destructors can be elided.
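
Option 5 could be approximated at the library level with a compile-time
check. The sketch below assumes C++11 type traits and uses "trivially
destructible" as a stand-in for the POD case discussed here, which is
an assumption and not quite the same category. The trait name
array_size_supported is hypothetical.

```cpp
#include <string>
#include <type_traits>

// Hypothetical trait: true when an implementation would need to record
// the element count anyway in order to run destructors in delete[].
template <typename T>
struct array_size_supported
    : std::integral_constant<bool,
                             !std::is_trivially_destructible<T>::value> {};

// A conforming array_size<T> could then begin with:
//   static_assert(array_size_supported<T>::value,
//                 "array_size requires a non-trivial destructor");
```

This only shows where the diagnostic would come from. The lookup of the
stored count itself would still need compiler or runtime support.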

---

Author: jgottman@carolina.rr.com ("Joe Gottman")
Date: Tue, 25 May 2004 00:29:27 +0000 (UTC)

   One of the annoying things about C++ is when the compiler knows some
useful information but there is no way for the programmer to access this
information.  One example of this is demonstrated by the following code:

    using namespace std;
    string *stringPtr = new string[10];
    delete[] stringPtr;

In order to call string's destructor the right number of times and
deallocate the right amount of memory, the system must be able to examine
stringPtr and determine that it points to an array of 10 strings.  However,
there is no way for the programmer to get this information just by examining
stringPtr.  Thus, if stringPtr is the result of a function call the
programmer cannot tell how big an array it refers to.  The usual workarounds
are either to have the function return a pair<T*, size_t>, to use an out
parameter of type size_t & in the function, or to have the function always
create the same size array.  All of these solutions can be maintenance
headaches.

   I therefore propose a new function:
    template <typename T>
    size_t array_size(T const *p)

that returns n when p was created using new T[n], returns 0 when p is null,
and has undefined behavior in all other cases.  Note that array_size(p)
would have well-defined semantics if and only if delete[] p does.  This
would make it much easier to use the result of a new[] in a for loop or an
STL algorithm.
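
For comparison, a minimal sketch of the first workaround mentioned
above, returning the pointer together with its count. The function
names make_strings and total_length are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Allocate n strings and hand back the pointer plus the count, since
// the count cannot be recovered from the pointer alone.
std::pair<std::string*, std::size_t> make_strings(std::size_t n) {
    return std::make_pair(new std::string[n], n);
}

// Example consumer: the count has to travel with the pointer.
std::size_t total_length(const std::pair<std::string*, std::size_t>& strings) {
    assert(strings.first != nullptr || strings.second == 0);
    std::size_t total = 0;
    for (std::size_t i = 0; i < strings.second; ++i) {
        total += strings.first[i].size();
    }
    return total;
}
```

The caller remains responsible for delete[] on the first member, and
nothing stops the two members from drifting out of sync, which is the
maintenance problem described above.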

Joe Gottman

---

Author: tom_usenet@hotmail.com (tom_usenet)
Date: Tue, 25 May 2004 16:02:02 +0000 (UTC)

On Tue, 25 May 2004 00:29:27 +0000 (UTC), jgottman@carolina.rr.com
("Joe Gottman") wrote:

>   I therefore propose a new function:
>    template <typename T>
>    size_t array_size(T const *p)
>
>that returns n when p was created using new T[n], return 0 when p is null,
>and has undefined behavior in all other cases.  Note that array_size(p)
>would have well-defined semantics if and only if delete[] p does.  This
>would make it much easier to use the result of a new[] in a for loop or an
>STL algorithm.

Are you sure that array_size can always be efficiently implemented?
You're assuming that all implementations make the size of the array
readily accessible from a pointer to the first element, but I'm not
sure it has to be that way. Note that when delete[] is called, the
memory allocator (generally) has to do lots of work to free the block,
and retrieving the size of the block might actually only be possible
part way through that process.

OTOH, I don't know whether this is a problem in practice - perhaps all
implementations just put the array size 4 or 8 bytes before the start
of the new[] returned memory.

Tom
--
C++ FAQ: http://www.parashift.com/c++-faq-lite/
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html

---

Author: Michiel.Salters@logicacmg.com (Michiel Salters)
Date: Tue, 25 May 2004 16:12:00 +0000 (UTC)

jgottman@carolina.rr.com ("Joe Gottman") wrote in message news:<mbwsc.73095$jU.4247029@twister.southeast.rr.com>...
> One of the annoying things about C++ is when the compiler knows some
> useful information but there is no way for the programmer to access this
> information.

[ example ]

>    I therefore propose a new function:
>     template <typename T>
>     size_t array_size(T const *p)
>
> that returns n when p was created using new T[n], return 0 when p is null,
> and has undefined behavior in all other cases.

This information may not be present if T doesn't have a destructor. In
that case, the runtime currently has no need to store the number. IIRC,
the Itanium psABI has this optimization. Your proposal would break this,
and require the count to be present always.

In addition, this function would affect the interoperability with C. If
they don't add this function, C arrays will not have a count whereas C++
arrays of POD elements will. Yet with the proposed syntax (template), C
can't add it.

Personally, I don't like the undefined behavior for
int a[10]; array_size(a);

Regards,
Michiel Salters

---

Author: richard@ex-parrot.com (Richard Smith)
Date: Tue, 25 May 2004 16:12:12 +0000 (UTC)

Joe Gottman wrote:
>
>     using namespace std;
>     string *stringPtr = new string[10];
>     delete[] stringPtr;
>
> In order to call string's destructor the right number of times and
> deallocate the right amount of memory, the system must be able to examine
> stringPtr and determine that it points to an array of 10 strings.  However,
> there is no way for the programmer to get this information just by examining
> stringPtr. [...] I therefore propose a new function:
>
>     template <typename T>
>     size_t array_size(T const *p)
>
> that returns n when p was created using new T[n], return 0 when p is null,
> and has undefined behavior in all other cases.

In principle, I think this would be a really good idea; however, in
practice there is no guarantee that the compiler will always have this
information.  Let me give an example:

  char* str = new char[1000];
  std::cout << array_size( str ) << std::endl;
  delete[] str;

Now, because char has a trivial destructor, the compiler is allowed to
optimise it away.  And when the array is deallocated, the deallocation
function, operator delete, is not passed the size of the array. (It is
only class scoped deallocation functions that can have a std::size_t
parameter for the array size.)  Even the internals of malloc/free
(or whatever operators new
and delete call down to) don't necessarily know the size of block of
memory that was requested -- for instance, malloc may have decided to
round up the request for 1000 bytes to 1024.

This is a very real issue.  The ABI used by recent versions of gcc on
Linux on ix86 does indeed avoid storing an "array cookie" (the size of
the array) wherever possible.
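
One way to observe whether a cookie is present is to compare the number
of bytes a new[] expression requests with n * sizeof(T). The sketch
below does this with a class-scoped operator new[]. All names
(last_request, RecordingAlloc, Plain, WithDtor, overhead_for) are
hypothetical, and the amount of overhead is implementation-defined, so
no particular result is promised.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Number of bytes requested by the most recent new[] expression below.
static std::size_t last_request = 0;

// Base class whose array allocation function records the request size.
struct RecordingAlloc {
    static void* operator new[](std::size_t bytes) {
        last_request = bytes;
        void* p = std::malloc(bytes);
        if (p == nullptr) throw std::bad_alloc();
        return p;
    }
    static void operator delete[](void* p) noexcept { std::free(p); }
};

// Trivial destructor: the implementation has no need for a count.
struct Plain : RecordingAlloc {
    int value;
};

// User-provided destructor: delete[] must know how many to destroy.
struct WithDtor : RecordingAlloc {
    int value;
    ~WithDtor() {}
};

// Bytes requested beyond n * sizeof(T), i.e. the room for any cookie.
template <typename T>
std::size_t overhead_for(std::size_t n) {
    T* p = new T[n];
    assert(last_request >= n * sizeof(T));
    std::size_t extra = last_request - n * sizeof(T);
    delete[] p;
    return extra;
}
```

On a toolchain following the Itanium C++ ABI one would typically expect
overhead_for<Plain>(10) to be zero and overhead_for<WithDtor>(10) to be
non-zero, but that is an observation about common implementations, not
something the standard requires.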

--
Richard Smith

---

Author: usenet_cpp@lehrerfamily.com (Joshua Lehrer)
Date: Tue, 25 May 2004 19:02:10 +0000 (UTC)

jgottman@carolina.rr.com ("Joe Gottman") wrote in message news:<mbwsc.73095$jU.4247029@twister.southeast.rr.com>...
> One of the annoying things about C++ is when the compiler knows some
> useful information but there is no way for the programmer to access this
> information.  One example of this is demonstrated by the following code:
>
>     using namespace std;
>     string *stringPtr = new string[10];
>     delete[] stringPtr;
>
>    I therefore propose a new function:
>     template <typename T>
>     size_t array_size(T const *p)
>
> that returns n when p was created using new T[n], return 0 when p is null,

A problem is that the compiler doesn't always know the number passed
in.  For a type with a destructor, then, yes, it most likely stores
the size of the memory block and the count.

For PODs, however, the compiler probably does not store the count, and
only the memory block size.  The count is not necessary for PODs
because no destructors get called.

The size of the memory block can not be used to calculate the count
because frequently the runtime allocates more memory than necessary
(padding, alignment, a block was available, etc...)

In order to make array_size<> work you would have two choices:

1.  Require it to work for all types, in which case you would be
adding otherwise unnecessary storage overhead to many array
allocations

.or.

2.  Allow the compiler to throw an exception if it can't determine the
array's size


Clearly, neither is a good solution.
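
To make the overhead in choice 1 concrete, here is a sketch of what
"always store the count" looks like when done by hand in a small owning
wrapper. The class name CountedArray is hypothetical and
std::unique_ptr assumes C++11. A language-level array_size would pay a
similar per-allocation cost, hidden in the runtime instead of the
object.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Owns a new[]-allocated array and carries its element count alongside.
template <typename T>
class CountedArray {
public:
    // Value-initializes n elements.
    explicit CountedArray(std::size_t n) : data_(new T[n]()), size_(n) {}

    std::size_t size() const noexcept { return size_; }

    T& operator[](std::size_t i) {
        assert(i < size_);
        return data_[i];
    }
    const T& operator[](std::size_t i) const {
        assert(i < size_);
        return data_[i];
    }

private:
    std::unique_ptr<T[]> data_;  // releases the array with delete[]
    std::size_t size_;           // the extra storage choice 1 would require
};
```

The extra std::size_t member is exactly the storage that a POD array
allocated with plain new[] can avoid today.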

joshua lehrer
factset research systems
NYSE:FDS

---