Topic: Do multiple calls to std::string::data() return the same
Author: gregh@podbridge.com (Greg Herlihy)
Date: Tue, 9 Oct 2007 22:07:39 GMT
On 10/9/07 10:09 AM, in article 20071009153829.576F8230C023@nscan3.ucar.edu,
"Hyman Rosen" <hyrosen@mail.com> wrote:
> Greg Herlihy wrote:
>> an object whose memory demands can increase without bound
>> is not a "const" object.
>
> But it is perfectly legal for the implementation of
> string to have mutable members in the class, and for the
> class to allocate as much memory as it wants.
As I pointed out, it is up to each programmer to implement "const
correctness" - there are plenty of ways to circumvent the proper behavior.
But if the Standard's own classes can't be bothered to adhere to "logically
const" behavior, then realistically, why should anyone else?

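To make the "logically const" point concrete, here is a minimal sketch of a
class whose const member function writes to mutable members without changing
the value a caller can observe. The class Label and its members are
hypothetical names made up for illustration; nothing here is taken from any
std::string implementation.

```cpp
#include <cstddef>
#include <string>

// Hypothetical class, for illustration only: a const member function that
// fills in a mutable cache the first time it is called.
class Label {
public:
    explicit Label(const std::string& text)
        : text_(text), cached_(false), length_(0) {}

    // Logically const: the observable value never changes, although the
    // mutable cache members are written on the first call.
    std::size_t length() const {
        if (!cached_) {
            length_ = text_.size();
            cached_ = true;
        }
        return length_;
    }

private:
    std::string text_;
    mutable bool cached_;
    mutable std::size_t length_;
};
```

A cache like this stays within "logical constness" because its size is fixed;
the complaint above is about const operations whose hidden state can grow
without limit.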
>> Note that neither std::string's c_str() nor its data() method actually
>> returns a pointer that refers to "elements of a basic_string
>> sequence". Instead the two functions return pointers to an allocated
>> CharT array containing a -copy- of the CharT elements in the sequence
>> proper. Therefore the quoted paragraph does not apply - so a call to
>> data() will not invalidate a pointer returned by a prior call to
>> c_str() and vice versa.
>
> You are quite wrong. As should be apparent from 20.4.1 and
> 21.3.4, data() returns a pointer to the internal array of
> characters held by the string object.
data() does return a pointer to an array allocated by the string object -
but the array returned is not the same array that the std::string object
uses to manage the string's content (nor is it the same array that
std::string's operator[] references - there is more on that point below).
>> But data() may not return a pointer to the std::string's internal
>> representation - it must return a pointer to an allocated CharT array.
>
> Why do you think this allocated array cannot be the internal
> representation? In fact, as I show below, it must be.
There are a few reasons. For one: data() returns a const char pointer,
whereas a pointer to the internal representation should (at least in some
cases) be modifiable. Yet the Standard is quite clear that under no
circumstances may the program ever modify the characters in the array that
data() returns [21.3.6/4].
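The read-only nature of the returned pointer can be checked at the type
level. The following is only a sketch: it assumes a C++11 or later compiler
(for decltype and <type_traits>), which is newer than the Standard under
discussion, and the function name data_pointer_is_const is made up for this
example. It uses a const string on purpose, since later revisions of the
language added a non-const overload of data().

```cpp
#include <string>
#include <type_traits>

// Sketch: on a const std::string, data() and c_str() both yield a pointer
// to const char, so writing through either one would need a cast.
inline bool data_pointer_is_const() {
    const std::string s("Hello.");
    // decltype inspects the declared return type; nothing is called.
    return std::is_same<decltype(s.data()), const char*>::value
        && std::is_same<decltype(s.c_str()), const char*>::value;
}
```

This says nothing about which array the pointer refers to; it only shows that
the interface itself does not offer write access through data().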
>> The Standard guarantees that data()'s array has already been allocated
>> once a std::string object has been constructed or copy-assigned a
>> value. In fact, the purpose of std::string::data() is to provide clients
>> (read-only) access to this array (and not access to some other array or
>> series of other arrays).
>
> Not just read-only, but writable as well, as per 21.3.4.
>
The contents of the data() array are read-only, as per [21.3.6/4].
>> No, there is no contradiction. As noted above, [string.require]/4 does
>> not apply to the pointers returned by std::string's data() or c_str()
>> methods.
>
> No. This code is legal:
> std::string s("Hello.");
> s[5] = '!'; // Now s is "Hello!"
> 21.3.4 says that
> reference string::operator[](size_type pos)
> returns (assuming bounds checks are OK)
> data()[pos]
Clearly [§21.3.4] and [§21.3.6] contradict each other. For both to be correct
would be absurd, because the latter prohibits a program from altering any
of the characters in the data() array, while the former describes a
std::string method that invites the programmer to do exactly that. So which
of the two paragraphs is best supported by the rest of the Standard - and
which one - if it is correct - would make the most sense?
As I pointed out above, std::string's operator[] must not be returning
data()[pos] - it must instead be returning a reference into a different
array - the string's internal representation. Otherwise,
[string.require]/4 would apply to data(), and the implications if that were
the case would be absurd. For one - not only could data() return a different
pointer upon each call - but the pointer that data() returned on the prior
call could be invalidated (even though the string's contents have not changed
at all between the two calls). Furthermore, the situation when data() returns
the same pointer twice in a row is even worse: in that case, the pointer
that data() returns the second time may already be invalid (because if a
call to data() may invalidate the pointer returned by the prior call, then
invalidating the earlier pointer necessarily invalidates the current one,
since they are both the same pointer).
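For anyone who wants to see what a particular library actually does, a small
check along the following lines would do. It is only a sketch, the helper
name data_is_stable is made up for this example, and the result describes one
implementation's behavior; it does not settle what the wording of the
Standard permits.

```cpp
#include <string>

// Hypothetical helper: reports whether two back-to-back data() calls on an
// unmodified string return the same address on this implementation.
inline bool data_is_stable(const std::string& s) {
    const char* first = s.data();
    const char* second = s.data();
    return first == second;
}
```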
So, the only reasonable conclusion is that [21.3.4] is a defect. And in
fact, the latest draft of the C++ Standard specifies std::string's
operator[] differently: operator[] now returns *(begin() + pos) - which of
course is how the return value should have been specified all along.

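The draft's formulation is easy to express in code. The sketch below (the
function name index_matches_iterator is hypothetical) checks that s[pos] and
*(s.begin() + pos) refer to the same character object. It assumes
pos < s.size() and deliberately does not involve data() at all.

```cpp
#include <string>

// Sketch: checks that s[pos] and *(s.begin() + pos) name the same object,
// which is how the draft wording specifies operator[].
// Precondition (assumed, not checked): pos < s.size().
inline bool index_matches_iterator(std::string& s, std::string::size_type pos) {
    typedef std::string::difference_type diff_t;
    return &s[pos] == &*(s.begin() + static_cast<diff_t>(pos));
}
```

For example, with std::string s("Hello."), calling
index_matches_iterator(s, 5) compares the address of the final '.' reached
both ways.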
> 21.3/6 says that reference is Allocator::reference and
> 20.4.1 says that allocator<charT>::reference is charT &.
> Therefore, it must be the case that data() returns a pointer
> to the string's internal representation, because that pointer
> can be used to change the string's contents.
No, operator[] cannot be calling data() because (according to
[string.require]) calling data() might invalidate references to the string's
elements, while calling the non-const operator[] does not.

> And it is perfectly legal (but perverse) for each call to data()
> to make a new copy of the string contents and establish that as
> the new internal representation, hence the invalidation.
It would be perfectly absurd for the C++ Standard to make such behavior
legal - and it would be a very dangerous, even reckless, behavior to allow as
well. And because it seems quite unlikely that a committee of rational
individuals (give or take a few :-)) would ever sanction this kind of
behavior, it seems more likely that whatever support for this perverse
behavior might be found in the Standard is unintentional - or in error.
Greg
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html ]