Topic: basic_string operator[] and data()


Author: kanze@gabi-soft.de
Date: 2000/08/25
jpotter@falcon.lhup.edu (John Potter) writes:

|>  On Mon, 21 Aug 2000 14:05:15 GMT, "Chris Newton"
|>  <chrisnewton@no.junk.please.btinternet.com> wrote:

|>  > From 21.3.4 (basic_string element access) [abridged]:
|>  >   reference operator[](size_type pos);
|>  > Returns:
|>  >   If pos < size(), returns data()[pos].

|>  > From 21.3.6 (basic_string string operations) [abridged]:
|>  >   const charT* data() const;

|>  > Can anyone tell me how we aren't violating const correctness here,
|>  > please?

|>  I think it's worse than that.  data() returns a pointer to the first
|>  element of an array which need not be the implementation of the
|>  string.  Neither of

|>     return const_cast<char*>(data())[pos];
|>     return const_cast<char&>(data()[pos]);

|>  would be correct either.  It must return a reference to the pos'th
|>  character in the implementation of the string.

Well, that's what I had always thought, too.  But it is apparently not
what the standard says.  Thus, we have :

    std::string s( "some text" ) ;
    assert( &s[ 5 ] == s.data() + 5 ) ;  //  guaranteed.
    s[ 4 ] = 'x' ;
    assert( s == "somextext" ) ;       //  not guaranteed.

Surprising, to say the least.

Anyone for a defect report?

-- 
James Kanze                               mailto:kanze@gabi-soft.de
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]






Author: jpotter@falcon.lhup.edu (John Potter)
Date: 2000/08/26
On Fri, 25 Aug 2000 13:52:31 GMT, kanze@gabi-soft.de wrote:

> jpotter@falcon.lhup.edu (John Potter) writes:
>
> |>  On Mon, 21 Aug 2000 14:05:15 GMT, "Chris Newton"
> |>  <chrisnewton@no.junk.please.btinternet.com> wrote:
>
> |>  > From 21.3.4 (basic_string element access) [abridged]:
> |>  >   reference operator[](size_type pos);
> |>  > Returns:
> |>  >   If pos < size(), returns data()[pos].
>
> |>  > From 21.3.6 (basic_string string operations) [abridged]:
> |>  >   const charT* data() const;
>
> |>  > Can anyone tell me how we aren't violating const correctness here,
> |>  > please?
>
> |>  I think it's worse than that.  data() returns a pointer to the first
> |>  element of an array which need not be the implementation of the
> |>  string.  Neither of
>
> |>     return const_cast<char*>(data())[pos];
> |>     return const_cast<char&>(data()[pos]);
>
> |>  would be correct either.  It must return a reference to the pos'th
> |>  character in the implementation of the string.
>
> Well, that's what I had always thought, too.  But it is apparently not
> what the standard says.  Thus, we have :
>
>     std::string s( "some text" ) ;
>     assert( &s[ 5 ] == s.data() + 5 ) ;  //  guaranteed.
>     s[ 4 ] = 'x' ;
>     assert( s == "somextext" ) ;       //  not guaranteed.
>
> Surprising, to say the least.
>
> Anyone for a defect report?

Everyone hates a DR without a proposed solution.

1.  Remove operator[] and at.  (Credit to Pete?)

Let's see if I can follow the pattern of giving everything in terms
of observable behavior.

2.  Returns a reference(R)/const_reference(CR) such that
    a.  R == data()[pos], CR == data()[pos]
    b.  after R = ch, data()[pos] == ch

I'm sure someone else can say it better, but that is a start.  It does
not cover all of the other things like @= ++ -- that could be done with
R.

John







Author: kanze@gabi-soft.de
Date: 2000/08/28
jpotter@falcon.lhup.edu (John Potter) writes:

|>  Everyone hates a DR without a proposed solution.

|>  1.  Remove operator[] and at.  (Credit to Pete?)

Well, the non-const versions at least:-).

|>  Let's see if I can follow the pattern of giving everything in terms
|>  of observable behavior.

I thought the standard took the approach of defining everything in terms
of the behavior of an abstract machine.  This is explicit for the
language sections, and the library sections are full of private members
"for exposition only", which sounds a lot like the same thing to me.

Of course, in such cases, one must rigorously define what aspects of the
abstract machine one can count on, and which one cannot.  I think we are
agreed, for example that you can't count on s.data() == &s[0], nor the
contrary.  Whereas you can count on s[0]='a'; s[0]=='a'.

And I think I've just found a legitimate use for operator[].  We can use
it to define the abstract semantics.

-- 
James Kanze                               mailto:kanze@gabi-soft.de
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627







Author: jpotter@falcon.lhup.edu (John Potter)
Date: 2000/08/29
On Mon, 28 Aug 2000 22:02:49 GMT, kanze@gabi-soft.de wrote:

> jpotter@falcon.lhup.edu (John Potter) writes:
>
> |>  Let's see if I can follow the pattern of giving everything in terms
> |>  of observable behavior.

Poor choice of words on my part.  It says exactly what I meant; however,
that phrase has a specific meaning in the standard.

Within the string class no other functions need to mention the
implementation in any way other than saying that the contents are the
same as data().  When you construct a string from "Hello", the effects
are that data() contains the characters Hello.

With these functions, it is not the effect which must be described, but
the thing which is returned.  This cannot be phrased in terms of data()
directly.  They return a reference into the actual implementation, which
has nothing to do with the pointer returned by data().

> I thought the standard took the approach of defining everything in terms
> of the behavior of an abstract machine.  This is explicit for the
> language sections, and the library sections are full of private members
> "for exposition only", which sounds a lot like the same thing to me.
>
> Of course, in such cases, one must rigorously define what aspects of the
> abstract machine one can count on, and which one cannot.  I think we are
> agreed, for example that you can't count on s.data() == &s[0], nor the
> contrary.  Whereas you can count on s[0]='a'; s[0]=='a'.

Yes, we agree.  I hope the above clarifies my intentions.  You can also
count on s[0] = 'a'; s.data()[0] == 'a'.  But that still does not say
what is returned.

> And I think I've just found a legitimate use for operator[].  We can use
> it to define the abstract semantics.

Interesting circle.  Now how do you define the return without using
it.  :)

John







Author: llewelly.@@edevnull.dot.com
Date: 2000/08/21
Anders Pytte <anders@milkweed.com> writes:

> in article 8nm2g9$1v3$1@uranium.btinternet.com, Chris Newton at
> chrisnewton@no.junk.please.btinternet.com wrote on 8/21/00 10:05 AM:
>
> > Dear all,
> >
> >> From 21.3.4 (basic_string element access) [abridged]:
> > reference operator[](size_type pos);
> > Returns:
> > If pos < size(), returns data()[pos].

Does not violate const correctness because data() cannot be called on
  a const string. Oh, wait. It seems to imply a call to data(), which
  returns a pointer to a const. hm - checking
  http://anubis.dkuug.dk/jtc1/sc22/wg21/docs/lwg-index.html for an
  issue on this - damn, 104 does not seem applicable.

> >
> >> From 21.3.6 (basic_string string operations) [abridged]:
> > const charT* data() const;

Does not violate const correctness because it returns a pointer to
  const.

> >
> > Can anyone tell me how we aren't violating const correctness here,
> > please?
>
> Either your implementation is incorrect or you have misstated it.

Chris Newton is quoting the standard here, and near as I can tell, he
  has not mis-quoted it.

> My
> implementation (Metrowerks) looks like this:
>
> basic_string<charT, traits, Allocator>::operator[](size_type pos) const
> {
>     return *(__data() + pos);
> }
>
> where __data() is a private function returning pointer to non-const char.
>
> Also, I don't think the standard allows for range checking in operator[] for
> basic_string, so I don't know what the pos < size() is doing in your
> implementation. What does the function return if pos >= size()?
>

21.3.4/1 says: '[...] Otherwise, the behavior is undefined' so an
  implementation can do anything it wants for out of bounds string
  element accesses, including throwing an exception, dumping core, or
  even (in those cases where it is detectable at compile time) emitting
  a compiler error. Your implementation (like gcc, and like most
  others) lets you walk all over unpredictable memory, which is also
  conforming.

[snip]







Author: jpotter@falcon.lhup.edu (John Potter)
Date: 2000/08/21
On Mon, 21 Aug 2000 14:05:15 GMT, "Chris Newton"
<chrisnewton@no.junk.please.btinternet.com> wrote:

> Dear all,
>
> From 21.3.4 (basic_string element access) [abridged]:
>   reference operator[](size_type pos);
> Returns:
>   If pos < size(), returns data()[pos].
>
> From 21.3.6 (basic_string string operations) [abridged]:
>   const charT* data() const;
>
> Can anyone tell me how we aren't violating const correctness here,
> please?

I think it's worse than that.  data() returns a pointer to the first
element of an array which need not be the implementation of the
string.  Neither of

   return const_cast<char*>(data())[pos];
   return const_cast<char&>(data()[pos]);

would be correct either.  It must return a reference to the pos'th
character in the implementation of the string.

John







Author: Anders Pytte <anders@milkweed.com>
Date: 2000/08/21
in article m2aee6fqdq.fsf@brownie.frogger.foobar.snot,
llewelly.@@edevnull.dot.com at llewelly.@@edevnull.dot.com wrote on 8/21/00
2:06 PM:

> Anders Pytte <anders@milkweed.com> writes:
>
>> in article 8nm2g9$1v3$1@uranium.btinternet.com, Chris Newton at
>> chrisnewton@no.junk.please.btinternet.com wrote on 8/21/00 10:05 AM:
>>
>>> Dear all,
>>>
>>>> From 21.3.4 (basic_string element access) [abridged]:
>>> reference operator[](size_type pos);
>>> Returns:
>>> If pos < size(), returns data()[pos].
>
> Does not violate const correctness because data() cannot be called on
> a const string. Oh, wait. It seems to imply a call to data(), which
> returns a pointer to a const. hm - checking
> http://anubis.dkuug.dk/jtc1/sc22/wg21/docs/lwg-index.html for an
> issue on this - damn, 104 does not seem applicable.
>
>>>
>>>> From 21.3.6 (basic_string string operations) [abridged]:
>>> const charT* data() const;
>
> Does not violate const correctness because it returns a pointer to
> const.
>
>>>
>>> Can anyone tell me how we aren't violating const correctness here,
>>> please?
>>
>> Either your implementation is incorrect or you have misstated it.
>
> Chris Newton is quoting the standard here, and near as I can tell, he
> has not mis-quoted it.
>
>> My
>> implementation (Metrowerks) looks like this:
>>

Oops! I should not be posting other people's code!

>>
>> where __data() is a private function returning pointer to non-const char.
>>
>> Also, I don't think the standard allows for range checking in operator[] for
>> basic_string, so I don't know what the pos < size() is doing in your
>> implementation. What does the function return if pos >= size()?
>>
>
> 21.3.4/1 says: '[...] Otherwise, the behavior is undefined' so an
> implementation can do anything it wants for out of bounds string
> element accesses, including throwing an exception, dumping core, or
> even (in those cases where it is detectable at compile time) emitting
> a compiler error. Your implementation (like gcc, and like most
> others) lets you walk all over unpredictable memory, which is also
> conforming.

I was sloppy and did not realize he was quoting the standard. Humble
apologies. Yes, the result is undefined if pos >= size().

But there does seem to be a problem here. I'm not sure what the standard
expects for the non-const version of operator[]. I guess this is an
oversight in the standard.

Anders.

--
Anders Pytte                                   Milkweed Software
PO Box 32                                  voice: (802) 586-2545
Craftsbury, VT 05826                  email: anders@milkweed.com







Author: "Richard Parkin" <rparkin@msi-eu.com>
Date: 2000/08/21
"Anders Pytte" <anders@milkweed.com> wrote in message
news:B5C6CC32.A680%anders@milkweed.com...
> in article 8nm2g9$1v3$1@uranium.btinternet.com, Chris Newton at
> chrisnewton@no.junk.please.btinternet.com wrote on 8/21/00 10:05 AM:
>
> > Dear all,
> >
> >> From 21.3.4 (basic_string element access) [abridged]:
> > reference operator[](size_type pos);
> > Returns:
> > If pos < size(), returns data()[pos].
> >
> >> From 21.3.6 (basic_string string operations) [abridged]:
> > const charT* data() const;
> >
> > Can anyone tell me how we aren't violating const correctness here,
> > please?
>
> Either your implementation is incorrect or you have misstated it.

No, he paraphrased the standard, not an implementation.  The full text
for operator[] reads:

Returns: If pos < size(), returns data()[ pos]. Otherwise, if pos == size(),
the const version returns charT(). Otherwise, the behavior is undefined.


> My
> implementation (Metrowerks) looks like this:
>
> basic_string<charT, traits, Allocator>::operator[](size_type pos) const
> {
>     return *(__data() + pos);
> }
>
> where __data() is a private function returning pointer to non-const char.
>
> Also, I don't think the standard allows for range checking in operator[] for
> basic_string, so I don't know what the pos < size() is doing in your
> implementation. What does the function return if pos >= size()?

Undefined behaviour for > size(), charT() for == size().
Note that by having a buffer with the zero appended, this all comes for
'free' with no range checking. Interestingly, the Dinkumware library does
do some range checking, though it does it in the non-const version, which
is odd.

As for the original poster's point, I think that the wording is intended to
convey the semantics rather than a specific implementation, although I can't
find if that's stated anywhere.

Ric









Author: "Chris Newton" <chrisnewton@no.junk.please.btinternet.com>
Date: 2000/08/21
Dear all,

From 21.3.4 (basic_string element access) [abridged]:
  reference operator[](size_type pos);
Returns:
  If pos < size(), returns data()[pos].

From 21.3.6 (basic_string string operations) [abridged]:
  const charT* data() const;

Can anyone tell me how we aren't violating const correctness here,
please?

Thanks,
Chris








Author: Anders Pytte <anders@milkweed.com>
Date: 2000/08/21
in article 8nm2g9$1v3$1@uranium.btinternet.com, Chris Newton at
chrisnewton@no.junk.please.btinternet.com wrote on 8/21/00 10:05 AM:

> Dear all,
>
>> From 21.3.4 (basic_string element access) [abridged]:
> reference operator[](size_type pos);
> Returns:
> If pos < size(), returns data()[pos].
>
>> From 21.3.6 (basic_string string operations) [abridged]:
> const charT* data() const;
>
> Can anyone tell me how we aren't violating const correctness here,
> please?

Either your implementation is incorrect or you have misstated it. My
implementation (Metrowerks) looks like this:

basic_string<charT, traits, Allocator>::operator[](size_type pos) const
{
    return *(__data() + pos);
}

where __data() is a private function returning pointer to non-const char.

Also, I don't think the standard allows for range checking in operator[] for
basic_string, so I don't know what the pos < size() is doing in your
implementation. What does the function return if pos >= size()?

Anders.

--
Anders Pytte                                   Milkweed Software
PO Box 32                                  voice: (802) 586-2545
Craftsbury, VT 05826                  email: anders@milkweed.com
