Topic: Quick vector proposal


Author: phalpern@halpernwightsoftware.com (Pablo Halpern)
Date: Thu, 11 Aug 2005 05:26:22 GMT
Raw View
hinnant@metrowerks.com (Howard Hinnant) wrote:

>In article <dffoe11ga96guc9un6a5ruv63ni9c081c2@4ax.com>,
> Pablo Halpern <phalpern@halpernwightsoftware.org> wrote:
>
>> The current proposal (which appears to already have been
>> accepted), requires return of a null pointer for the empty vector.  My
>> proposal is only that the pointer be valid, but not necessarily
>> dereferenceable.
>
>This is a misunderstanding.  The informal description in lwg 464 does
>indeed appear to require the return of a null pointer for the empty
>vector.  But note that this mistake was not propagated to the current
>proposed resolution which merely states:
>
>pointer       data();
>const_pointer data() const;
>
>
>Returns: A pointer such that [data(), data() + size()) is a valid
>range. For a non-empty vector, data() == &front().
>
>Complexity: Constant time.
>
>Throws: Nothing.

Excellent!  Add my voice to those supporting this proposed resolution.

Pablo Halpern                 phalpern@halpernwightsoftware.com
Author: The C++ Standard Library from Scratch
http://www.halpernwightsoftware.com/stdlib-scratch

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: phalpern@halpernwightsoftware.org (Pablo Halpern)
Date: Tue, 9 Aug 2005 14:02:52 GMT
Raw View
James Dennett <jdennett@acm.org> wrote:

>Pablo Halpern wrote:
>> ...  A vector which once had data but is now empty will
>> have hold a valid, but not dereferencable pointer to its zeroith element
>> and will need an extra 'if' statement to return a null pointer.
>
>A pointer to memory that's been deallocated may well be indeterminate,
>so not sufficiently valid to be portably passed to a function even if
>it will not be dereferenced.

Since the capacity of a vector never shrinks (except via swap), a vector
that once held data but is now empty does not contain a pointer to
de-allocated memory.  It contains a pointer to raw allocated memory in
which one or more objects have been destroyed.  The pointer is valid in
the same way that a pointer past the end of any array is valid (think of
it as a zero-sized array with a pointer to one-past the zero-ith
element).  Since the object(s) in the allocated memory have been
destroyed, the pointer is not dereferencable, but it remains a valid
pointer.
Pablo Halpern                 phalpern@halpernwightsoftware.com
Author: The C++ Standard Library from Scratch
http://www.halpernwightsoftware.com/stdlib-scratch

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: hinnant@metrowerks.com (Howard Hinnant)
Date: Wed, 10 Aug 2005 05:44:08 GMT
Raw View
In article <dffoe11ga96guc9un6a5ruv63ni9c081c2@4ax.com>,
 Pablo Halpern <phalpern@halpernwightsoftware.org> wrote:

> The current proposal (which appears to already have been
> accepted), requires return of a null pointer for the empty vector.  My
> proposal is only that the pointer be valid, but not necessarily
> dereferenceable.

This is a misunderstanding.  The informal description in lwg 464 does
indeed appear to require the return of a null pointer for the empty
vector.  But note that this mistake was not propagated to the current
proposed resolution which merely states:

pointer       data();
const_pointer data() const;


Returns: A pointer such that [data(), data() + size()) is a valid
range. For a non-empty vector, data() == &front().

Complexity: Constant time.

Throws: Nothing.

In a DR, it is the proposed resolution that matters.  Other text,
correct or not, is retained for rationale, or for historical reasons.

In this case, the proposed resolution places no requirements on the
return of data() when the vector is empty, except that [data(), data())
be a valid range.  As long as the implementation can copy it (without a
core dump), any value of data() will satisfy that requirement.

-Howard

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: James Dennett <jdennett@acm.org>
Date: Sat, 6 Aug 2005 13:38:57 CST
Raw View
Pablo Halpern wrote:

> The current proposal (which appears to already have been
> accepted), requires return of a null pointer for the empty vector.  My
> proposal is only that the pointer be valid, but not necessarily
> dereferenceable.  A vector which once had data but is now empty will
> have hold a valid, but not dereferencable pointer to its zeroith element
> and will need an extra 'if' statement to return a null pointer.

A pointer to memory that's been deallocated may well be indeterminate,
so not sufficiently valid to be portably passed to a function even if
it will not be dereferenced.  However, implementations which would trap
on passing an indeterminate pointer value could be required to return a
valid pointer, re-adding the (small) cost of nulling the pointer when
its storage is deallocated.  It might be hard to write standardese to
permit returning an invalid pointer iff the implementation allows
it to be copied, and not worthwhile -- the clarity of requiring a null
pointer seems preferable, to me.

-- James

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: Pablo Halpern <phalpern@halpernwightsoftware.org>
Date: 29 Jul 2005 03:10:01 GMT
Raw View
Daniel Kr   gler <dsp@bdal.de> wrote:

>Pablo Halpern wrote:
>>
>> Proposal: Add a two new member functions to std::vector:
>>
>>  T*       data();
>>  T const* data() const;
.
>> Has this been formally proposed by anybody?
>
>See:
>
>http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-active.html#464
>
>Greetings from Bremen,
>
>Daniel Kr   gler

I don't know how I missed this in my perusal of the active issues list.
Perhaps I mistyped the search.

Anyway, that's good news.  It looks like it will be in the next
standard.  (I also like the at() function for map.)

Pablo Halpern                 phalpern@halpernwightsoftware.com
Author: The C++ Standard Library from Scratch
http://www.halpernwightsoftware.com/stdlib-scratch

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: tony_in_da_uk@yahoo.co.uk
Date: Fri, 29 Jul 2005 17:26:38 CST
Raw View
What's the practical difference between using data() and having to
check isn't not NULL, versus checking empty() then taking the address?
Fact is, people do stupid things, and your proposal's no safer.

operator[] is known to be dangerous (hence at()) but it's there anyway
to ensure it's as fast as C indexing.  If you put a check for empty()
inside the operator, and return some bizarre reference, then you've
lost your speed.

Tony

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: jgottman@carolina.rr.com ("Joe Gottman")
Date: Sat, 30 Jul 2005 00:22:28 GMT
Raw View
<tony_in_da_uk@yahoo.co.uk> wrote in message
news:1122647571.389269.268540@z14g2000cwz.googlegroups.com...
> What's the practical difference between using data() and having to
> check isn't not NULL, versus checking empty() then taking the address?
> Fact is, people do stupid things, and your proposal's no safer.
>
> operator[] is known to be dangerous (hence at()) but it's there anyway
> to ensure it's as fast as C indexing.  If you put a check for empty()
> inside the operator, and return some bizarre reference, then you've
> lost your speed.

   Most of the time when you need vector.data() the reason you need it is to
pass it to a function that takes a pointer as a parameter.  Most functions
like this take either two pointers or a pointer and a size_t so that they
know where to end.  In either of these cases, the function you are passing
the data() pointer to takes responsibility for the indexing.

    void foo(double *begin, double *end);
    void bar(double *buffer, std::size_t size);

int main()
{
    vector<double> vec = get_vector();
    foo(vec.data(), vec.data() + vec.size());
    bar(vec.data(), vec.size());
}

Joe Gottman

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: hinnant@metrowerks.com (Howard Hinnant)
Date: Sat, 30 Jul 2005 02:10:10 GMT
Raw View
In article <1122647571.389269.268540@z14g2000cwz.googlegroups.com>,
 tony_in_da_uk@yahoo.co.uk wrote:

> What's the practical difference between using data() and having to
> check isn't not NULL, versus checking empty() then taking the address?
> Fact is, people do stupid things, and your proposal's no safer.

The motivating use case is to interface vector with legacy C code:

extern "C" void legacy_code(double* p, unsigned n);

int main()
{
    vector<double> v;
    ...
    legacy_code(v.data(), v.size());
vs
    legacy_code(v.empty() ? (double*)0 : &v[0], v.size());
}

It is assumed that legacy_code will deal gracefully with the case n==0.

Agreed it is a very minor improvement.  But every little bit helps.

-Howard

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: Pablo Halpern <phalpern@halpernwightsoftware.org>
Date: 31 Jul 2005 16:00:01 GMT
Raw View
tony_in_da_uk@yahoo.co.uk wrote:

>What's the practical difference between using data() and having to
>check isn't not NULL, versus checking empty() then taking the address?
>Fact is, people do stupid things, and your proposal's no safer.

Actually, it's already been proposed by someone else, as has already
been pointed out.

Consider this code:

     vector<char> blob;
     char* eol = std::memchr(blob.data(), '\n', blob.size());

If blob.size() is 0, this code is perfectly safe.  strchr will not
dereference the pointer if there are no bytes to look at.  This is very
common in any kind of loop, as well -- the pointer is ignored if the
loop counter is zero.

>operator[] is known to be dangerous (hence at()) but it's there anyway
>to ensure it's as fast as C indexing.  If you put a check for empty()
>inside the operator, and return some bizarre reference, then you've
>lost your speed.

No. If the vector constructor initializes its internal pointer to null,
then no extra checking (or speed loss) is necessary for my proposal.
This may be a reason to scale back the proposal that is currently on the
table.  The current proposal (which appears to already have been
accepted), requires return of a null pointer for the empty vector.  My
proposal is only that the pointer be valid, but not necessarily
dereferenceable.  A vector which once had data but is now empty will
have hold a valid, but not dereferencable pointer to its zeroith element
and will need an extra 'if' statement to return a null pointer.

Regardless, even if the extra 'if' is needed, it is no more expensive
than the extra 'if' that would be necessary to call memchr or memcpy
conditinally for non-empty vectors, or to encapsulate exactly the same
logic outside of vector.  At best, it makes code easier to read and less
error-prone.  At worst, it is a harmless convenience.

Pablo Halpern                 phalpern@halpernwightsoftware.com
Author: The C++ Standard Library from Scratch
http://www.halpernwightsoftware.com/stdlib-scratch

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: phalpern@halpernwightsoftware.org (Pablo Halpern)
Date: Tue, 26 Jul 2005 15:41:13 GMT
Raw View
Background: In order to access the underlying congiguous array within a
vector, V, the following constructs are popular:

 T* p = &V[0];
        T* p = &V.front();
 T* p = &*V.begin();

Problem: all of the above produce undefined behavior if V is empty.  In
fact, debug-mode in STLPort will trap the above as errors if V is empty.

Proposal: Add a two new member functions to std::vector:

 T*       data();
 T const* data() const;

Both functions return a pointer to the first element of the vector.  If
the vector is empty, they return a valid, but not necessarily
dereferencable pointer.  (The pointer may be null.)

Has this been formally proposed by anybody?
Pablo Halpern                 phalpern@halpernwightsoftware.com
Author: The C++ Standard Library from Scratch
http://www.halpernwightsoftware.com/stdlib-scratch

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: "Stephen Howe" <sjhoweATdialDOTpipexDOTcom@eu.uu.net>
Date: 27 Jul 2005 02:20:01 GMT
Raw View
> Proposal: Add a two new member functions to std::vector:
>
> T*       data();
> T const* data() const;
>
> Both functions return a pointer to the first element of the vector.  If
> the vector is empty, they return a valid, but not necessarily
> dereferencable pointer.  (The pointer may be null.)

What about std::deque and std::list?
They also 'suffer' from the same problem.
What do you propose here?

Stephen Howe


---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: Pablo Halpern <phalpern@halpernwightsoftware.org>
Date: 27 Jul 2005 05:10:05 GMT
Raw View
"Stephen Howe" <sjhoweATdialDOTpipexDOTcom@eu.uu.net> wrote:

>> Proposal: Add a two new member functions to std::vector:
>>
>> T*       data();
>> T const* data() const;
>>
>> Both functions return a pointer to the first element of the vector.  If
>> the vector is empty, they return a valid, but not necessarily
>> dereferencable pointer.  (The pointer may be null.)
>
>What about std::deque and std::list?
>They also 'suffer' from the same problem.
>What do you propose here?

Neither std::deque nor std::list have the property of storing their
elements in contiguous memory locations.  I cannot think of any good
reason to get a pointer to the first element of an empty deque or list.
A vector is guaranteed to be contiguous, however, and we would like to
be able to access the contiguous array within the vector. For example,
we would like the following to work:

    int array[ARRAY_SIZE];
    std::vector<int> v;
    std::memcpy(array, v.data(), v.size() * sizeof(int));

Without the data() member function, you would need to write something
like this:

    int array[ARRAY_SIZE];
    std::vector<int> v;
    if (! v.empty())
        std::memcpy(array, &v[0], v.size() * sizeof(int));

Not horrible, but an easily avoidable source of bugs.  To make matters
worse, the first version will work with many implementations of vector
but then fail with others.  I found a number of examples of broken code
that were detected when I turned on debug mode in our implementation of
STLPort.

Another solution to this problem would be to require that &v[0] produce
a valid, though not necessarily dereferencable, pointer.  This is easy
to implement but quite complicated to specify in standard language.  It
requires the concept of a reference to an object whose address can be
taken but which is otherwise not valid.  I think my proposal is much
more elegant and allows direct expression of the concept.

Pablo Halpern                 phalpern@halpernwightsoftware.com
Author: The C++ Standard Library from Scratch
http://www.halpernwightsoftware.com/stdlib-scratch

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: =?ISO-8859-1?Q?Daniel_Kr=FCgler?= <dsp@bdal.de>
Date: 27 Jul 2005 17:00:22 GMT
Raw View
Pablo Halpern wrote:
> Background: In order to access the underlying congiguous array within a
> vector, V, the following constructs are popular:
>
>  T* p = &V[0];
>         T* p = &V.front();
>  T* p = &*V.begin();
>
> Problem: all of the above produce undefined behavior if V is empty.  In
> fact, debug-mode in STLPort will trap the above as errors if V is empty.
>
> Proposal: Add a two new member functions to std::vector:
>
>  T*       data();
>  T const* data() const;
>
> Both functions return a pointer to the first element of the vector.  If
> the vector is empty, they return a valid, but not necessarily
> dereferencable pointer.  (The pointer may be null.)
>
> Has this been formally proposed by anybody?

See:

http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-active.html#464

Greetings from Bremen,

Daniel Kr   gler

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: Howard Hinnant <hinnant@metrowerks.com>
Date: 27 Jul 2005 17:30:21 GMT
Raw View
In article <cpbbe15p2gibkme67u33nb7af5ct72tksq@4ax.com>,
 phalpern@halpernwightsoftware.org (Pablo Halpern) wrote:

> Background: In order to access the underlying congiguous array within a
> vector, V, the following constructs are popular:
>
>  T* p = &V[0];
>         T* p = &V.front();
>  T* p = &*V.begin();
>
> Problem: all of the above produce undefined behavior if V is empty.  In
> fact, debug-mode in STLPort will trap the above as errors if V is empty.
>
> Proposal: Add a two new member functions to std::vector:
>
>  T*       data();
>  T const* data() const;
>
> Both functions return a pointer to the first element of the vector.  If
> the vector is empty, they return a valid, but not necessarily
> dereferencable pointer.  (The pointer may be null.)
>
> Has this been formally proposed by anybody?

Yes.

http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-active.html#464

-Howard

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: Ben Hutchings <ben-public-nospam@decadentplace.org.uk>
Date: Wed, 27 Jul 2005 12:35:36 CST
Raw View
Pablo Halpern <phalpern@halpernwightsoftware.org> wrote:
> Background: In order to access the underlying congiguous array within a
> vector, V, the following constructs are popular:
>
>         T* p = &V[0];
>         T* p = &V.front();
>         T* p = &*V.begin();
>
> Problem: all of the above produce undefined behavior if V is empty.  In
> fact, debug-mode in STLPort will trap the above as errors if V is empty.
>
> Proposal: Add a two new member functions to std::vector:
>
>         T*       data();
>         T const* data() const;
>
> Both functions return a pointer to the first element of the vector.  If
> the vector is empty, they return a valid, but not necessarily
> dereferencable pointer.  (The pointer may be null.)
>
> Has this been formally proposed by anybody?

This was suggested some time ago by Valentin Bonnard in
<http://groups.google.co.uk/group/comp.lang.c++.moderated/msg/cb90904af54fbb36>
and other messages.  However that was before it was established that
vector must use contiguous storage and that &v[0] etc. could be used
(with the limitation to non-empty vectors).

More recently Joe Gottman proposed vector<...>::data() in
<http://groups.google.co.uk/group/comp.std.c++/msg/6d442ecfe5b28ab0>,
but without any obvious result.

So I think the answer is no.

--
Ben Hutchings
Hoare's Law of Large Problems:
        Inside every large problem is a small problem struggling to get out.

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]