Topic: Is string allocation contiguous?
Author: ben-public-nospam@decadentplace.org.uk (Ben Hutchings)
Date: Tue, 26 Jul 2005 18:29:19 GMT Raw View
Marc Schoolderman <squell@alumina.nl> wrote:
> We know that vector<T> is required to be contiguous, that is,
>
> &vec[i] == &vec[0] + i if i < vec.size()
>
> But what about basic_string? Originally, it was specified that
>
> str[i] == str.data()[i] if i < str.size()
This equality was and is still required, if "==" is understood to
compare values, as it normally does. But if we apply the address
operator "&" to each side then the equality is not required
(which appears to be what you're concerned about).
<snip>
> However, basic_string<>::iterator is implementation defined and I can
> find no hard guarantee that
>
> &*(str.begin()+i) == &str.data()[i]
That's because it is not required.
> So either;
>
> -. Strings are contiguous. This suggests adding a non-const qualified
> data() member since &str[0] == str.data() anyway.
>
> -. Strings are not contiguous. Theoretically useful, but will also break
> some code relying on the current paragraph.
<snip>
The definition of basic_string is intended to allow a non-contiguous
implementation, but I don't believe any implementers have taken
advantage of this freedom.
--
Ben Hutchings
Anthony's Law of Force: Don't force it, get a larger hammer.
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html ]
Author: "Earl Purple" <earlpurple0@yahoo.com>
Date: Wed, 27 Jul 2005 12:28:02 CST Raw View
Ben Hutchings wrote:
>
> The definition of basic_string is intended to allow a non-contiguous
> implementation, but I don't believe any implementers have taken
> advantage of this freedom.
I'm not sure about that, I think there is at least one that does.
Anyway, the advantage of allowing the string to be non-contiguous is
optimisation when building a string with concatenation. You may
implement thus implement the string as a "rope".
You would need to use a certain amount of lazy evaluation (and thus
either have mutable members or do some const_casting during const
functions to change the physical layout). Only on a call to data() or
c_str() must you then ensure that you have a contiguous buffer.
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html ]
Author: squell@alumina.nl (Marc Schoolderman)
Date: Fri, 22 Jul 2005 14:51:52 GMT Raw View
We know that vector<T> is required to be contiguous, that is,
&vec[i] == &vec[0] + i if i < vec.size()
But what about basic_string? Originally, it was specified that
str[i] == str.data()[i] if i < str.size()
This of course can't happen in the non-const case, but I assume that
implementations use a non-const "_internal_data()". Anyway, the above
statement implies &str[i] == &str[0] + i, which is the same as for vectors.
The current working paper modifies this paragraph (21.3.4) as:
str[i] == *(str.begin()+i) if i < str.size()
See http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-defects.html#259
However, basic_string<>::iterator is implementation defined and I can
find no hard guarantee that
&*(str.begin()+i) == &str.data()[i]
So either;
-. Strings are contiguous. This suggests adding a non-const qualified
data() member since &str[0] == str.data() anyway.
-. Strings are not contiguous. Theoretically useful, but will also break
some code relying on the current paragraph.
Note: This same issue was brought up years ago by Dennis Yelle.
~Marc.
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html ]