Topic: Is string allocation contiguous?


Author: ben-public-nospam@decadentplace.org.uk (Ben Hutchings)
Date: Tue, 26 Jul 2005 18:29:19 GMT
Raw View
Marc Schoolderman <squell@alumina.nl> wrote:
> We know that vector<T> is required to be contiguous, that is,
>
>    &vec[i] == &vec[0] + i      if i < vec.size()
>
> But what about basic_string? Originally, it was specified that
>
>    str[i] == str.data()[i]     if i < str.size()

This equality was and is still required, if "==" is understood to
compare values, as it normally does.  But if we apply the address
operator "&" to each side then the equality is not required
(which appears to be what you're concerned about).

<snip>
> However, basic_string<>::iterator is implementation defined and I can
> find no hard guarantee that
>
>    &*(str.begin()+i) == &str.data()[i]

That's because it is not required.

> So either;
>
> -. Strings are contiguous. This suggests adding a non-const qualified
> data() member since &str[0] == str.data() anyway.
>
> -. Strings are not contiguous. Theoretically useful, but will also break
> some code relying on the current paragraph.
<snip>

The definition of basic_string is intended to allow a non-contiguous
implementation, but I don't believe any implementers have taken
advantage of this freedom.

--
Ben Hutchings
Anthony's Law of Force: Don't force it, get a larger hammer.

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: "Earl Purple" <earlpurple0@yahoo.com>
Date: Wed, 27 Jul 2005 12:28:02 CST
Raw View

Ben Hutchings wrote:
>
> The definition of basic_string is intended to allow a non-contiguous
> implementation, but I don't believe any implementers have taken
> advantage of this freedom.

I'm not sure about that, I think there is at least one that does.
Anyway, the advantage of allowing the string to be non-contiguous is
optimisation when building a string with concatenation. You may
implement thus implement the string as a "rope".

You would need to use a certain amount of lazy evaluation (and thus
either have mutable members or do some const_casting during const
functions to change the physical layout). Only on a call to data() or
c_str() must you then ensure that you have a contiguous buffer.

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: squell@alumina.nl (Marc Schoolderman)
Date: Fri, 22 Jul 2005 14:51:52 GMT
Raw View
We know that vector<T> is required to be contiguous, that is,

   &vec[i] == &vec[0] + i      if i < vec.size()

But what about basic_string? Originally, it was specified that

   str[i] == str.data()[i]     if i < str.size()

This of course can't happen in the non-const case, but I assume that
implementations use a non-const "_internal_data()". Anyway, the above
statement implies &str[i] == &str[0] + i, which is the same as for vectors.

The current working paper modifies this paragraph (21.3.4) as:

   str[i] == *(str.begin()+i)  if i < str.size()

See http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-defects.html#259

However, basic_string<>::iterator is implementation defined and I can
find no hard guarantee that

   &*(str.begin()+i) == &str.data()[i]

So either;

-. Strings are contiguous. This suggests adding a non-const qualified
data() member since &str[0] == str.data() anyway.

-. Strings are not contiguous. Theoretically useful, but will also break
some code relying on the current paragraph.

Note: This same issue was brought up years ago by Dennis Yelle.

~Marc.



---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]