Topic: std::vector<>::data() (Was: std::vector<>)


Author: Christopher Eltschka <celtschk@physik.tu-muenchen.de>
Date: 1998/02/18
Raw View
Valentin Bonnard wrote:
>
> Uwe Keim wrote:
>
> > What do YOU do, when you need a contiguous array in memory?
>
> I once sugested in comp.lang.c++.moderated to add a data()
> function to vector. I would return a pointer to the elements,
> has a maximum complexity of O(size()), and the returned value
> is valid until the next reallocation (or the next non-const
> function called, or whatever).
>
> data() is expected to be O(1) on most implementations.
>
> Matt Austern told me that he didn't liked the idea of
> something almost always true. I would answer that there is
> already such a thing: &v[0]+1 == &v[1]
>
> If you don't care about performance, you can do a copy.

I think there's another problem with a data() member:
If the implementation exposes the internal data structure, the
caller must not delete the pointer, and changing data
through the pointer would change the vector itself, and
vice versa. Thus an implementation which copyes the data
(due to different internal structure) will have to behave
the same.

The first point means that the caller never may delete the
pointer (as usage of the method should not depend on the
implementation), so the vector *must* keep track of any
memory allocated for data() (this would be simplified by
the non-const invalidation rule if data is a non-const
member: The next call of data() invalidates the previous
(this, however, makes a call f(v.data(), v.data()) illegal).

The second point is more complicated. Consider the following
function:

int f()
{
  vector<int> v;
  vector<int> const& v2=v;
  v.push_back(0);
  v.push_back(1);
  int* p=v.data();
  p[1]=p[0];
  return v2[1];  // v2[1] uses const method to access the value
}

Now, what does the function return?
Well, if data() returns the internal representation, the answer
is clear: element 0 is copied to element 1, which is returned.

Now, what for an implementation which copies? p[1]=p[0] makes a
element copy in the contiguous copy; the assignment cannot be
intercepted by the vector as it is done with built-in pointers.
But v2[1] (which is the const version, as called via const
reference) must return the copy of v[0] (because the not-copying
implementation would do so). So it has two choices:

1. Access the pointer version instead of the real representation
   (adds more complexity, as there's always a check needed which
   way the element can be accessed now)

2. Copy the pointer version back to the real representation before
   accessing the value (violating the O(1) constraint in this case)

Choice 2 is a bad option, as it would make the access complexity
dependant if prior calls of data() on some implementations, so
only option 1 is really a good one.

The situation is even worse if you look at the mutators (f.ex.
non-const operator[]): Even if they invalidate the pointer, they
must incorporate all changes made through it. This means that
they must be O(N) in this case (for non-contiguous implementations,
of course).

The problems above could be avoided if data returned a pointer
to const, but this would reduce the usefulness of data().


However, you can use efficient coding today by simply using
templates, f.ex.:

template<class iterator> void f(iterator begin, iterator end);
template<class T> void f(T* begin, T* end);
// the second version is called whenever your iterator is
// just a pointer

You could f.ex. use this for writing a get_data which copies
only in the case where iterators are not pointers. The solution
can then be optimized for your special needs (esp. there's
no requirement of equivalence of both methods in cases where
you don't need it).

If iterators are pointers, you are guaranteed that the memory
is contiguous (else the pointers wouldn't meet the iterator
requirements). However, implementations with contiguous memory
but non-pointer iterators will not be detected by this method.
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]





Author: Valentin Bonnard <bonnardv@pratique.fr>
Date: 1998/02/18
Raw View
Christopher Eltschka wrote:
>
> Valentin Bonnard wrote:
> >
> > Uwe Keim wrote:
> >
> > > What do YOU do, when you need a contiguous array in memory?
> >
> > I once sugested in comp.lang.c++.moderated to add a data()
> > function to vector. I would return a pointer to the elements,
> > has a maximum complexity of O(size()), and the returned value
> > is valid until the next reallocation (or the next non-const
> > function called, or whatever).
> >
> > data() is expected to be O(1) on most implementations.
> >
> > Matt Austern told me that he didn't liked the idea of
> > something almost always true. I would answer that there is
> > already such a thing: &v[0]+1 == &v[1]
> >
> > If you don't care about performance, you can do a copy.
>
> I think there's another problem with a data() member:
> If the implementation exposes the internal data structure, the
> caller must not delete the pointer, and changing data
> through the pointer would change the vector itself, and
> vice versa. Thus an implementation which copyes the data
> (due to different internal structure) will have to behave
> the same.

True. Just as string::data().

You can also have a rope::release_data_memory_whatever_its_name().
(rope is in the SGI STL distribution.)

> The second point is more complicated. Consider the following
> function:

data() return a const value_type*. Just as string::data().

> The problems above could be avoided if data returned a pointer
> to const, but this would reduce the usefulness of data().

Right

> However, you can use efficient coding today by simply using
> templates, f.ex.:
>
> template<class iterator> void f(iterator begin, iterator end);
> template<class T> void f(T* begin, T* end);

Of course I thought about that too. What about the introduction
of a trait: contiguous storage ?



Author: Valentin Bonnard <bonnardv@pratique.fr>
Date: 1998/02/16
Raw View
Uwe Keim wrote:

> What do YOU do, when you need a contiguous array in memory?

I once sugested in comp.lang.c++.moderated to add a data()
function to vector. I would return a pointer to the elements,
has a maximum complexity of O(size()), and the returned value
is valid until the next reallocation (or the next non-const
function called, or whatever).

data() is expected to be O(1) on most implementations.

Matt Austern told me that he didn't liked the idea of
something almost always true. I would answer that there is
already such a thing: &v[0]+1 == &v[1]

If you don't care about performance, you can do a copy.

--

Valentin Bonnard                mailto:bonnardv@pratique.fr
info about C++/a propos du C++: http://www.pratique.fr/~bonnardv/
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]