Topic: Standard string class


Author: robert.davies@vuw.ac.nz (Robert Davies)
Date: 1996/02/18
Here are some comments and a query about the standard string class. I am
working from the January 96 working papers, but the comments also apply to
the April 95 version of the standard.


Reallocation
------------

Under the section on reserve the draft standard says that no reallocation
of the storage takes place during insertions until the string reaches
the length given by the last call of reserve. The standard does not say
what insertion means. I assume it means append, +=, and possibly insert
statements. Is this correct? If so the value returned by data() should
remain valid following a call of append, += or insert if the string doesn't
exceed the length set by reserve.

But this conflicts with the description of data(), which says the value
returned becomes invalid after any non-constant call to the string concerned.

Am I missing something or does this still need to be worked out?

...recommendation: this be clarified
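
The question can at least be probed empirically on a present-day compiler. The following is a minimal sketch, assuming a C++11 or later std::string; the helper name pointer_survives_append is hypothetical and not taken from the draft. It records data() after a call to reserve, appends while staying within the reserved capacity, and reports whether the buffer address and capacity are unchanged. Treat the result as an observation about the implementation under test rather than as a statement of what the standard guarantees.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns true if data() and capacity() are unchanged
// after appends that keep size() within the capacity obtained from reserve().
bool pointer_survives_append() {
    std::string s = "abc";
    s.reserve(64);                                   // ask for room up front
    const std::string::size_type cap = s.capacity();
    const void* before = s.data();                   // buffer address right now

    s.append("def");                                 // size() is now 6
    s += 'g';                                        // size() is now 7

    assert(s.size() <= cap);                         // never outgrew the reservation
    return s.capacity() == cap &&
           static_cast<const void*>(s.data()) == before;
}
```

If the function returns false on some implementation, that implementation reallocated even though the reserved length was not exceeded, which is exactly the case the wording needs to settle.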


reserve
-------

The reserve function allows you to increase the capacity but there doesn't
seem to be any way of decreasing it. So before working on a string we might
increase the capacity to something large to avoid repeatedly reallocating
storage. When we have finished we might want to set the capacity down to size()
to save space. There doesn't seem to be any way of doing this.

...recommendation: a command be included to reduce the capacity to the
length of the string, eg reserve().
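
For readers on a later standard: C++11 added a member named shrink_to_fit() that covers this use case, though only as a non-binding request. The sketch below assumes C++11 or later; the function name build_and_trim is hypothetical. It reserves a large buffer while building the string and then asks the library to release the slack.

```cpp
#include <cassert>
#include <string>

// Build a string under a large temporary reservation, then ask the library
// to give back unused storage. shrink_to_fit() is a non-binding request, so
// the only portable guarantee afterwards is capacity() >= size().
std::string build_and_trim() {
    std::string s;
    s.reserve(4096);                 // avoid repeated reallocation while building
    for (int i = 0; i < 10; ++i) {
        s += "chunk ";               // 6 characters per iteration
    }
    s.shrink_to_fit();               // request a capacity close to size()
    return s;
}
```

Because the request is non-binding, code should not assume that capacity() equals size() afterwards.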


at and operator[]
-----------------

The same comments apply here as applied to data() in my comments on
reallocation.

Are we to assume that
x[3] = 'a';
as an example, may cause a reallocation, and so invalidate data()?

at and operator[] seem almost identical except that operator[] isn't required
to check bounds. However, the const version of x[x.size()] is supposed to
return traits::eos(), so if we store our strings without a trailing
traits::eos() we will have to check bounds. Is this what is intended?

The constant version of operator[] returns a charT whereas the constant
version of at returns a const_reference. What is the point of this
difference and what effect does it have?

...recommendation: remove the requirement that x[x.size()] return
traits::eos() and clarify the difference between operator[] and
at.

Question: are we safe to assume that something = x[3] will use the constant
version of the operator?
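
The difference in bounds handling can be shown with a short sketch. It assumes the behaviour of the published standard rather than the 1996 draft: at() throws std::out_of_range for pos >= size(), and the const operator[] yields charT() (here '\0') for pos == size(). The helper names at_throws and one_past_end are hypothetical.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Returns true if at(pos) throws std::out_of_range for the given string.
bool at_throws(const std::string& s, std::string::size_type pos) {
    try {
        (void)s.at(pos);             // bounds-checked access
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}

// Reads the element one past the last character through the const
// operator[]; for pos == size() this is defined to be charT().
char one_past_end(const std::string& s) {
    return s[s.size()];
}
```

Indexing beyond size() with operator[] is undefined, so the sketch deliberately stops at pos == size().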


c_str
-----

Same comments apply as applied to data() in the section on reallocation.


data
----

Also note that the value might be destroyed by a call to c_str(), since we
might have to reallocate the string to accommodate a trailing traits::eos().

... recommendation: under the section of data() note that the returned value
may be invalidated by a call to c_str.
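
For comparison, C++11 later specified data() and c_str() to return the same null-terminated pointer, which removes this particular hazard. The sketch below assumes C++11 or later and does not describe the 1996 draft; the helper name data_matches_c_str is hypothetical.

```cpp
#include <cassert>
#include <string>

// Assumes C++11 or later, where data() and c_str() are specified to return
// the same pointer and the character at index size() is always charT().
bool data_matches_c_str(const std::string& s) {
    const char* d = s.data();
    const char* c = s.c_str();       // under C++11 rules this leaves d valid
    return d == c && d[s.size()] == '\0';
}
```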


reallocation again
------------------

Suppose I am right in thinking we can update a string with operator[], at,
insert or append without causing reallocation. Suppose also we are writing
a string package with copy-on-write. Then copy-on-write must be disabled on
strings for which data() etc. has been called. We need some way of telling a
program that copy-on-write can be reactivated - eg we are no longer interested
in the value returned by data(). Calling reserve() as I have defined above
may be one possibility.

... recommendation: some way be found for allowing a user to re-instate
copy-on-write.
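
To make the copy-on-write concern concrete, here is a minimal single-threaded sketch. The class CowString and all of its members are hypothetical and exist only for illustration; this is not how any particular library implements basic_string. Copies share one buffer, and a mutating call takes a private copy first. The wrapper hands out only const pointers, which sidesteps the problem described above: a writable pointer or reference that outlived the call would let the caller modify a buffer that is still shared.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical copy-on-write wrapper, for illustration only. Copies share
// one buffer until a mutating call detaches the writer.
class CowString {
public:
    explicit CowString(const char* text)
        : rep_(std::make_shared<std::string>(text)) {}

    // Read-only access never copies.
    const char* data() const { return rep_->data(); }
    const std::string& str() const { return *rep_; }

    // Mutating access: take a private copy first if the buffer is shared.
    void set(std::string::size_type pos, char c) {
        detach();
        rep_->at(pos) = c;
    }

    // True while both objects still refer to the same buffer.
    bool shares_with(const CowString& other) const {
        return rep_ == other.rep_;
    }

private:
    void detach() {
        if (rep_.use_count() > 1) {
            rep_ = std::make_shared<std::string>(*rep_);
        }
    }

    std::shared_ptr<std::string> rep_;
};
```

In this sketch a pointer obtained from one object stays valid when a different object sharing the buffer is modified, because the writer is the one that moves to new storage.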


find etc.
---------

Most of the functions that deal with two strings (such as replace and
compare in the January papers) use the order of parameters:
   pos and n for the string on the left of the '.'
   the name of the second string
   pos and n for the second string.

But find does not follow this convention. Here pos follows the second string
but refers to the string on the left of the '.'. The situation is
doubly confusing for str.find(s, pos, n) where s is a charT*. Here
pos refers to str and n refers to s.

...recommendation: fix this.
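
The two conventions can be put side by side. The sketch below uses the compare and find overloads as they exist in the published standard; the wrapper names compare_slices and find_prefix are hypothetical.

```cpp
#include <cassert>
#include <string>

// compare: (pos1, n1) describe the string on the left of the '.', then
// comes the second string, then (pos2, n2) for that second string.
int compare_slices(const std::string& left, const std::string& right) {
    return left.compare(3, 3, right, 1, 3);
}

// find: the pattern comes first; pos is an offset into the string being
// searched, while n is the number of characters taken from the pattern.
std::string::size_type find_prefix(const std::string& text, const char* pattern) {
    return text.find(pattern, 2, 2);
}
```

With text "abcabcabc" and pattern "bcX", find_prefix searches for "bc" starting at offset 2 of the text, so the trailing 'X' in the pattern is ignored.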


That's all for now.

Robert



[ To submit articles: Try just posting with your newsreader.  If that fails,
        use mailto:std-c++@ncar.ucar.edu
  FAQ:    http://reality.sgi.com/employees/austern_mti/std-c++/faq.html
  Policy: http://reality.sgi.com/employees/austern_mti/std-c++/policy.html
  Comments? mailto:std-c++-request@ncar.ucar.edu
]





Author: kanze@gabi-soft.fr (J. Kanze)
Date: 1996/02/20
In article <199602171134.LAA00305@kauri.vuw.ac.nz>
robert.davies@vuw.ac.nz (Robert Davies) writes:

|> Here are some comments and a query about the standard string class. I am
|> working from the January 96 working papers, but the comments also apply to
|> the April 95 version of the standard.


|> Reallocation
|> ------------

|> Under the section on reserve the draft standard says that no reallocation
|> of the storage takes place during insertions until the string reaches
|> the length given by the last call of reserve. The standard does not say
|> what insertion means. I assume it means append, +=, and possibly insert
|> statements. Is this correct? If so the value returned by data() should
|> remain valid following a call of append, += or insert if the string doesn't
|> exceed the length set by reserve.

|> But this conflicts with the description of data(), which says the value
|> returned becomes invalid after any non-constant call to the string
|> concerned.

|> Am I missing something or does this still need to be worked out?

|> ...recommendation: this be clarified

My feeling is that this is more a problem in getting all of the wording
right than anything else.  Personally, I would like to see a global
statement as to when iterators, references and pointers become invalid,
rather than have the cases enumerated at each function.  Something along
the lines of the following would seem to be the intent.

 An iterator into a basic_string, a reference returned by the member
 functions at or operator[] or a pointer returned by the member
 functions c_str and data, may be invalidated by any of the
 following:

 1. Calling a non-const member function on the string, unless
  `reserve' has been previously called, in which case, they will
  only be invalidated if the non-const member function causes
  the string to become longer than the reserved size.

 2. Calling either c_str or data.

 3. Using the string as an argument to the member function swap.

I personally think that one could add assigning to the string, i.e. an
operator= undoes the effect of reserve.  There is nothing in the current
draft that leads me to think that this was the intent, however.

I would prefer seeing this defined purely in terms of validity of
references, without mention of possible reallocation.
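
One practical consequence of rules like these is that code holding a position across a mutating call is safer keeping an index than a pointer or reference. The following is a minimal sketch of that pattern; the function name char_after_growth is hypothetical.

```cpp
#include <cassert>
#include <string>

// Remember a position by index, mutate the string freely, then look the
// character up again. A raw pointer saved before the append could dangle.
char char_after_growth(std::string s, std::string::size_type index) {
    const char expected = s.at(index);
    s.append(10000, '.');            // very likely forces a reallocation
    assert(s.at(index) == expected); // the index still names the same character
    return s[index];
}
```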


|> at and operator[]
|> -----------------

|> The same comments apply here as applied to data() in my comments on
|> reallocation.

|> Are we to assume that
|> x[3] = 'a';
|> as an example, may cause a reallocation, and so invalidate data()?

|> at and operator[] seem almost identical except that operator[] isn't
|> required to check bounds. However, the const version of x[x.size()] is
|> supposed to return traits::eos(), so if we store our strings without a
|> trailing traits::eos() we will have to check bounds. Is this what is
|> intended?

Another interesting question is the meaning of "x[ x.size() ] = 'a'".

|> The constant version of operator[] returns a charT whereas the constant
|> version of at returns a const_reference. What is the point of this
|> difference and what effect does it have?

|> ...recommendation: remove the requirement that x[x.size()] return
|> traits::eos() and clarify the difference between operator[] and
|> at.

|> Question: are we safe to assume that something = x[3] will use the constant
|> version of the operator?

Definitely not.  Whether the const or non-const version will be used
depends entirely upon whether x is const or not.
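
This is ordinary overload resolution and can be demonstrated without std::string at all. The sketch below uses a hypothetical Probe class that records which operator[] overload ran; it assumes C++11 for the default member initializers.

```cpp
#include <cassert>
#include <string>

// Hypothetical probe class: records which operator[] overload was called.
struct Probe {
    std::string text;
    mutable bool const_called = false;
    bool nonconst_called = false;

    char operator[](std::string::size_type i) const {
        const_called = true;
        return text[i];
    }
    char& operator[](std::string::size_type i) {
        nonconst_called = true;
        return text[i];
    }
};

// Performs "something = x[3]" on whatever object the caller passes in.
// T is deduced as Probe or const Probe, which selects the overload.
template <typename T>
char read_third(T& x) {
    char something = x[3];
    return something;
}
```

Reading through a non-const object calls the non-const overload even though the result is only read; reading through a const reference to the same object calls the const one.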


|> reallocation again
|> ------------------

|> Suppose I am right in thinking we can update a string with operator[], at,
|> insert or append without causing reallocation.

As I interpret the intent, this is true if and only if you have called
reserve.  These functions are non-const.

|>  Suppose also we are writing a string package with copy-on-write. Then
|>  copy-on-write must be disabled on strings for which data() etc. has been
|>  called. We need some way of telling a program that copy-on-write can be
|>  reactivated - eg we are no longer interested in the value returned by
|>  data(). Calling reserve() as I have defined above may be one
|>  possibility.

Why does copy-on-write have to be disabled when data is called?  Any
non-const function (which might trigger copy-on-write) should invalidate
the pointer.

The one place I think copy-on-write must be disabled is precisely
reserve, since by calling reserve I have been guaranteed no further
reallocations (no pointer invalidation) as long as the length remains
less than the capacity.

|> ... recommendation: some way be found for allowing a user to re-instate
|> copy-on-write.

This is the purpose behind my suggestion that assignment `undo' the
effects of reserve.

It would probably also be worth mentioning that copy-on-write cannot be
used if the allocators of the two strings compare unequal.  I think
that this is implicit in the current wording, and I'm not quite sure how
to put it into legalese, either, but I think it must be clear;
otherwise, what is the purpose of having the allocators?  (Note that
this is not a problem for string and wstring, since they both use the
default allocator, and all instances of the default allocator compare
equal.)
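
The parenthetical remark about the default allocator can be checked directly. This is a minimal sketch; the function name default_allocators_equal is hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Two default allocators are interchangeable: storage obtained from one
// may be released through the other, which is what operator== reports.
bool default_allocators_equal() {
    std::string a = "left";
    std::string b = "right";
    return a.get_allocator() == b.get_allocator();
}
```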
--
James Kanze           (+33) 88 14 49 00          email: kanze@gabi-soft.fr
GABI Software, Sarl., 8 rue des Francs Bourgeois, 67000 Strasbourg, France
Consulting, design and implementation of object-oriented software --
              -- Seeking a position in a French-speaking region





Author: reycri@atlantis.actrix.gen.nz (Reynaldo Chrisostono)
Date: Sat, 24 Dec 1994 07:11:53 GMT
Where can I get information on the specification of the standard string
class?

I want to start using the string class provided by the compiler that I am
using but I doubt whether all the member functions included in the class
are part of the standard.

If possible, I want a complete description of all public constructors,
destructors, operators, member functions and their behaviour, side-effects
etc. Also, I want the standard name of the header file that I need to
include to declare string objects; and where this header file should
reside.

I want to write portable code so I want to know which features of the
standard string class I should use.





Author: smr@cognex.uucp (Steven Rosenthal)
Date: Sat, 25 Sep 1993 22:25:17 GMT
I understand that the C++ Standards Committee is standardizing certain
library components, including a string class.

Can anybody direct me to an implementation of the standard string
class?  I'd prefer a public domain or GPL implementation.

Thanks!

Steven Rosenthal
smr@cognex.com