Thread

Topic: Library Issue #180 - what is so special about basic_string::replace

Author: "Bo Persson" <bop@gmb.dk>
Date: Thu, 19 Nov 2009 17:18:27 CST Raw View

The insert() and erase() members of basic_string were changed from
taking iterators to taking const_iterators. However, the replace
members were not.

Library issue #180 says:
"We did not make the change in replace, because this change would
affect the implementation because the string may be written into. This
is an issue that should be taken up by concepts."

1. How is replace() different from a combination of erase() and
insert()?

2. Now that we didn't get concepts, should this be reconsidered?


Bo Persson



--
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@netlab.cs.rpi.edu]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: Greg Herlihy <greghe@mac.com>
Date: Fri, 20 Nov 2009 11:56:02 CST Raw View

On Nov 19, 6:18 pm, "Bo Persson" <b...@gmb.dk> wrote:
> The insert() and erase() members of basic_string were changed from
> taking iterators to taking const_iterators. However, the replace
> members were not.
>
> Library issue #180 says:
> "We did not make the change in replace, because this change would
> affect the implementation because the string may be written into. This
> is an issue that should be taken up by concepts."
>
> 1. How is replace() different from a combination of erase() and
> insert()?

I would ask instead: how are erase() and insert() different from
replace()?

And which choice is a std::strings implementor more likely to make: to
write one function that in turn implements two other functions, or to
write two functions to implement just one other function? In other
words, isn't it more likely that a std::string implementation will
have replace() implement both erase() with insert() than it would be
for the implementation to have both insert() and erase() implement
replace()?

Moreover, when replace() is used to implement erase() and insert(),
there is no advantage in having replace() overwrite existing
characters in the string. So  implementations of erase() and insert()
can simply refrain from overwriting any characters. Furthermore, once
erase() and insert() can guarantee that they will not overwrite any
characters, than there is no reason eft why these two functions should
not be declared with const iterators.

For more general replace() operations (those that both remove and add
characters to a string), the ability to overwrite existing characters
can be provide a significant (in fact, an unlimited) optimization. As
an example, to replace 1,000 characters with another 1,000 characters
in a string, could be done much more efficiently by having replace()
overwrite the characters to be replaced with their replacement
characters -than it would be to have replace(), first delete 1,000
characters and then insert 1,000 new characters at the same point
where the just prior deletion took place.

Now, no matter how much benefit const iterators for replace() may
provide to a program, it is clear that the amount of benefit they do
provide - is finite. In contrast, no matter how much benefit
overwriting existing characters may provide to a particular replace()
operation, there always exists another replace() operation where the
optimization does even better. (that is, the more characters being
replaced, the more inefficient the erase() and insert() implementation
becomes relative the implementation that overwrites the old characters
with their replacements, directly.

Probably to no one's surprise, the C++ Committee has effectively
decided in favor of retaining the unlimited benefit. The iterators
passed to replace() will remain non-const.

> 2. Now that we didn't get concepts, should this be reconsidered?

I would conclude that there is no reason to reconsider the non-const
replace() iterators - until concepts are back in C++.

Greg

--
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@netlab.cs.rpi.edu]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: Sean Hunt <rideau3@gmail.com>
Date: Fri, 20 Nov 2009 15:36:31 CST Raw View

On Nov 20, 10:56 am, Greg Herlihy <gre...@mac.com> wrote:
> For more general replace() operations (those that both remove and add
> characters to a string), the ability to overwrite existing characters
> can be provide a significant (in fact, an unlimited) optimization. As
> an example, to replace 1,000 characters with another 1,000 characters
> in a string, could be done much more efficiently by having replace()
> overwrite the characters to be replaced with their replacement
> characters -than it would be to have replace(), first delete 1,000
> characters and then insert 1,000 new characters at the same point
> where the just prior deletion took place.
>
> Now, no matter how much benefit const iterators for replace() may
> provide to a program, it is clear that the amount of benefit they do
> provide - is finite. In contrast, no matter how much benefit
> overwriting existing characters may provide to a particular replace()
> operation, there always exists another replace() operation where the
> optimization does even better. (that is, the more characters being
> replaced, the more inefficient the erase() and insert() implementation
> becomes relative the implementation that overwrites the old characters
> with their replacements, directly.
>
> Probably to no one's surprise, the C++ Committee has effectively
> decided in favor of retaining the unlimited benefit. The iterators
> passed to replace() will remain non-const.

Is there any particular reason, however, why an implementation of
replace() cannot be given a const_iterator and simply cast it to a
regular iterator, before proceeding to do the replace as usual?
There's no risk of altering data that ought to be immutable. This
preserves the effect of the function, but doesn't limit
implementations at all.

Sean


--
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@netlab.cs.rpi.edu]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: "Bo Persson" <bop@gmb.dk>
Date: Fri, 20 Nov 2009 15:40:22 CST Raw View

Greg Herlihy wrote:
> On Nov 19, 6:18 pm, "Bo Persson" <b...@gmb.dk> wrote:
>> The insert() and erase() members of basic_string were changed from
>> taking iterators to taking const_iterators. However, the replace
>> members were not.
>>
>> Library issue #180 says:
>> "We did not make the change in replace, because this change would
>> affect the implementation because the string may be written into.
>> This is an issue that should be taken up by concepts."
>>
>> 1. How is replace() different from a combination of erase() and
>> insert()?
>
> I would ask instead: how are erase() and insert() different from
> replace()?
>
> And which choice is a std::strings implementor more likely to make:
> to write one function that in turn implements two other functions,
> or to write two functions to implement just one other function? In
> other words, isn't it more likely that a std::string implementation
> will have replace() implement both erase() with insert() than it
> would be for the implementation to have both insert() and erase()
> implement replace()?

I haven't thought of it that way, but have seen that the rationale for
having a const_iterator in erase() and insert() is that it just gives
the position for the operation. To me it seems like it would be
similar for replace(), especially if I see this kind of
implementation:

     basic_string& replace(iterator _First, iterator _Last, const
basic_string& _String)
     { return replace(_First - begin(), _Last - _First, _String); }

turning the iterator replace into its start-length version.


>
> Probably to no one's surprise, the C++ Committee has effectively
> decided in favor of retaining the unlimited benefit. The iterators
> passed to replace() will remain non-const.
>
>> 2. Now that we didn't get concepts, should this be reconsidered?
>
> I would conclude that there is no reason to reconsider the non-const
> replace() iterators - until concepts are back in C++.
>

I respect the committee's decisions, just though this might be a case
where the preconditions have changed.


Bo Persson



--
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@netlab.cs.rpi.edu]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: Greg Herlihy <greghe@mac.com>
Date: Mon, 23 Nov 2009 11:24:37 CST Raw View

On Nov 20, 4:40 pm, "Bo Persson" <b...@gmb.dk> wrote:
>
> I haven't thought of it that way, but have seen that the rationale for
> having a const_iterator in erase() and insert() is that it just gives
> the position for the operation.

The const_iterator parameter in string::insert()'s declaration means
that the value of the character at that iterator's position (if there
is one) will be not affected by insert()'s execution. A const_iterator
parameter in a function declaration effectively promises the caller
that - when the function returns - the element at that
const_iterator's position (if there is one) will have the same value
that that element had at the point where the function was called.

Unfortunately, there is a serious error in the proposed changes for
Library Issue #180. The problem is that not every std::string member
function interprets an iterator parameter in the same way as insert().
Specifically, string::erase() starts erasing from the exact position
specified by its iterator parameter (and not at an earlier position).

Here is the proposed declaration for one of std::string erase()
members:

    iterator erase(const_iterator const_position);

So, according its description, string::erase() deletes the character
at the position specified by const_position. But, according to its
declaration, string::erase() will not change the value of the
character at const_iterator's position (a promise that would seem
difficult for erase() to keep, in light of the fact that the character
that furnished that value when erase() was called - is no longer even
in the string when erase() returns).

So I think it would be a good idea to review some of the
const_iterator parameter choices described Issue #180. Otherwise, this
issue will in all likelihood resurface - once a library vendor tries
to have the std::string erase() member functions accept const_iterator
parameters.

Greg

--
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@netlab.cs.rpi.edu]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: =3D?ISO-8859-1?Q?Daniel_Kr=3D FCgler?=3D <daniel.kruegler@googlemail.com>
Date: Mon, 23 Nov 2009 17:19:23 CST Raw View

On 23 Nov., 18:24, Greg Herlihy <gre...@mac.com> wrote:
> On Nov 20, 4:40 pm, "Bo Persson" <b...@gmb.dk> wrote:
>
> > I haven't thought of it that way, but have seen that the rationale for
> > having a const_iterator in erase() and insert() is that it just gives
> > the position for the operation.
>
> The const_iterator parameter in string::insert()'s declaration means
> that the value of the character at that iterator's position (if there
> is one) will be not affected by insert()'s execution. A const_iterator
> parameter in a function declaration effectively promises the caller
> that - when the function returns - the element at that
> const_iterator's position (if there is one) will have the same value
> that that element had at the point where the function was called.
>
> Unfortunately, there is a serious error in the proposed changes for
> Library Issue #180. The problem is that not every std::string member
> function interprets an iterator parameter in the same way as insert().
> Specifically, string::erase() starts erasing from the exact position
> specified by its iterator parameter (and not at an earlier position).
>
> Here is the proposed declaration for one of std::string erase()
> members:
>
>     iterator erase(const_iterator const_position);
>
> So, according its description, string::erase() deletes the character
> at the position specified by const_position. But, according to its
> declaration, string::erase() will not change the value of the
> character at const_iterator's position (a promise that would seem
> difficult for erase() to keep, in light of the fact that the character
> that furnished that value when erase() was called - is no longer even
> in the string when erase() returns).
>
> So I think it would be a good idea to review some of the
> const_iterator parameter choices described Issue #180. Otherwise, this
> issue will in all likelihood resurface - once a library vendor tries
> to have the std::string erase() member functions accept const_iterator
> parameters.

Your statements seem to imply that you would partition the
current set of changes as of #180 in some where this
change is appropriate and in others where it is not. Could
you please explain the reason for these differences between
these sets?

Your above reasoning

"A const_iterator parameter in a function declaration effectively
promises the caller that - when the function returns - the element
at that const_iterator's position (if there is one) will have the
same
value that that element had at the point where the function was
called."

is not granted by the standard in any way. This would only hold,
if the corresponding member function, which accepts the iterator
(e.g. erase, insert, replace) would have const-qualifiers.

The iterator argument(s) provided are only used to compute the
location of the to be erased, inserted, or replaced element(s). The
effect of calling these function is basically equivalent to providing
a position index and a feasible implementation technique would
be to forward to the index-based overloads, e.g.

iterator erase(const_iterator first, const_iterator last) {
 size_type pos = first - begin();
 size_type n = last - first;
 erase(pos, n);
 return begin() + pos;
}

The first two lines of the above function body do not require
any capability to modify the string, the user could have written a
corresponding wrapper on it's own without any difference (except
that the implementation may take a slight advantage of being
aware of the internals of the iterators).

The general advantage of providing iterator-based positioning
functions is, that the container implementation may use the
iterator to access the referred to a element even then very
efficiently, if no random-access iterators are available.

Greetings from Bremen,

Daniel Kr=FCgler

--
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use
mailto:std-c++@netlab.cs.rpi.edu<std-c%2B%2B@netlab.cs.rpi.edu>
]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]