Topic: lib.string.access problem?


Author: kanze@gabi-soft.fr (J. Kanze)
Date: 1998/01/09
Matt Austern <austern@isolde.mti.sgi.com> writes:

|>  kanze@gabi-soft.fr (J. Kanze) writes:
|>
|>  > |>  It is almost as if I have to add another internal flag to fstring
|>  > |>  which states: "If the user ever got a non-const reference, then
|>  > |>  don't allow anybody to ever add a reference count to shared
|>  > |>  storage -- instead, force them to make a separate copy of the
|>  > |>  data since it can change via the non-const reference."
|>  > |>
|>  > |>  My questions are:
|>  > |>
|>  > |>  1.  Must I implement fstring using this flag?
|>  >
|>  > If you want to conform to the proposed standard class, yes.  The
|>  > standard defines ways of obtaining references (and iterators) into the
|>  > string class, and the guaranteed valid lifetime for such references.
|>  > Practically, you must somehow inhibit sharing during this lifetime, and
|>  > the only way I know of doing it is such a flag.
|>  >
|>  > An alternative, which I used in my own string classes, is to make the
|>  > returned value of operator[] a helper class, rather than a reference.
|>  > This is not permitted by the standard for a conforming string class,
|>  > however.
|>
|>  A third alternative, though, is just to abandon the whole reference
|>  counting idea altogether.  It's fairly high overhead (it turns
|>  character access from a simple dereference into something that
|>  requires a branch, and it requires at least two additional flags), and
|>  it's not at all clear, given the way that basic_string is defined,
|>  that it buys you very much in this case.

I think it depends on what you are doing.  For various reasons, most of
my input parsing is still done in char[]'s.  (One of the reasons, of
course, is that most of it was written 10 or more years ago:-).)  And
other than parsing, when DO you look at individual characters?  My own
string class was around for years before I added a non-const operator[].

When most of what you are doing is copying strings, if only because of
return values, reference counting is a big win.

--
James Kanze    +33 (0)1 39 23 84 71    mailto: kanze@gabi-soft.fr
GABI Software, 22 rue Jacques-Lemercier, 78000 Versailles, France
Conseils en informatique orientée objet --
              -- Beratung in objektorientierter Datenverarbeitung
---
[ comp.std.c++ is moderated.  To submit articles: try just posting with      ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu         ]
[ FAQ:      http://reality.sgi.com/employees/austern_mti/std-c++/faq.html    ]
[ Policy:   http://reality.sgi.com/employees/austern_mti/std-c++/policy.html ]
[ Comments? mailto:std-c++-request@ncar.ucar.edu                             ]





Author: kanze@gabi-soft.fr (J. Kanze)
Date: 1998/01/05
polk@sprintmail.com (Max Polk) writes:

|>  While trying to implement:
|>
|>       char & fstring::operator [] (int)
|>
|>  where fstring is supposed to be as close to a char-based basic_string as
|>  possible without using templates or exceptions due to my requirement to
|>  use a very old C++ compiler, and where the above prototype was to
|>  simulate the following (see 21.3.4 lib.string.access in final draft):
|>
|>       reference operator [] (size_type pos);
|>
|>  I ran into a problem with reference counting and strings.  I know that
|>  the user can change the character via the reference since it is not
|>  const, so therefore I have to break any ties to other fstrings using the
|>  same internal storage when the above function is called.  That is not a
|>  problem.  The problem is what to do when the user obtains the
|>  reference, then copy-constructs another string from the original,
|>  and then writes through the reference:
|>
|>       fstring  x ("Hello");          // x is "Hello"
|>       char     &c = x[0];            // c refers to the first char of x
|>       fstring  y (x);                // y shares x's internal storage
|>       c = 'J';                       // x is "Jello", y is "Jello" !!!
|>
|>  As you can see, there is no way for the assignment via the non-const
|>  reference to disconnect the internal shared storage of y and x!  The
|>  copy constructor does not invalidate the non-const reference: "[21.3.4]
|>  The reference returned is invalid after any subsequent call to c_str(),
|>  data(), or any non-const member function for the object."  And yet, the
|>  user can change the character they are holding a reference to, and
|>  violate data integrity.
|>
|>  It is almost as if I have to add another internal flag to fstring which
|>  states: "If the user ever got a non-const reference, then don't allow
|>  anybody to ever add a reference count to shared storage -- instead,
|>  force them to make a separate copy of the data since it can change via
|>  the non-const reference."
|>
|>  My questions are:
|>
|>  1.  Must I implement fstring using this flag?

If you want to conform to the proposed standard class, yes.  The
standard defines ways of obtaining references (and iterators) into the
string class, and the guaranteed valid lifetime for such references.
Practically, you must somehow inhibit sharing during this lifetime, and
the only way I know of doing it is such a flag.
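
A minimal sketch of what such a flag might look like, purely for
illustration.  The class name FlaggedFString, the Rep layout, and every
member name are assumptions, not code from any real library.  The sketch
never clears the flag; a fuller version could presumably clear it in the
operations that invalidate references.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical sketch: FlaggedFString and all member names are assumptions.
class FlaggedFString {
    struct Rep {
        std::size_t refs;       // number of strings sharing this buffer
        std::size_t len;
        bool        shareable;  // false once a writable reference has escaped
        char*       data;
    };
    Rep* rep_;

    static Rep* make(const char* s, std::size_t n) {
        Rep* r = new Rep;
        r->refs = 1;
        r->len = n;
        r->shareable = true;
        r->data = new char[n + 1];
        std::memcpy(r->data, s, n);
        r->data[n] = '\0';
        return r;
    }
    // Share the buffer if allowed, otherwise take a private deep copy.
    static Rep* acquire(Rep* r) {
        if (!r->shareable) return make(r->data, r->len);
        ++r->refs;
        return r;
    }
    void release() {
        if (--rep_->refs == 0) { delete[] rep_->data; delete rep_; }
    }
    void make_unique() {
        if (rep_->refs > 1) {
            Rep* mine = make(rep_->data, rep_->len);
            --rep_->refs;
            rep_ = mine;
        }
    }

public:
    FlaggedFString(const char* s) : rep_(make(s, std::strlen(s))) {}
    FlaggedFString(const FlaggedFString& o) : rep_(acquire(o.rep_)) {}
    FlaggedFString& operator=(const FlaggedFString& o) {
        Rep* next = acquire(o.rep_);  // acquire first, so self-assignment is safe
        release();
        rep_ = next;
        return *this;
    }
    ~FlaggedFString() { release(); }

    // Unshare, then mark the buffer so later copies cannot share it.
    char& operator[](std::size_t i) {
        make_unique();
        rep_->shareable = false;
        return rep_->data[i];
    }
    const char* c_str() const { return rep_->data; }
    bool shares_storage_with(const FlaggedFString& o) const {
        return rep_ == o.rep_;
    }
};

// Replays the x/y example: the copy must keep "Hello".
bool copy_stays_intact() {
    FlaggedFString x("Hello");
    char& c = x[0];          // x becomes unshareable here
    FlaggedFString y(x);     // deep copy, because x is flagged
    c = 'J';
    return std::strcmp(x.c_str(), "Jello") == 0 &&
           std::strcmp(y.c_str(), "Hello") == 0;
}
```

Copies made before any reference is handed out still share storage; only
copies made afterwards pay for a deep copy.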

An alternative, which I used in my own string classes, is to make the
returned value of operator[] a helper class, rather than a reference.
This is not permitted by the standard for a conforming string class,
however.
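
For illustration only, a helper-class approach could look roughly like
the sketch below.  This is not the actual class referred to above; the
names ProxyFString and CharRef are hypothetical.  The idea is that
reading through the helper leaves the storage shared, and the unsharing
happens at the moment of the write.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical sketch: ProxyFString and CharRef are assumed names.
class ProxyFString {
    struct Rep {
        std::size_t refs;
        std::size_t len;
        char*       data;
    };
    Rep* rep_;

    static Rep* make(const char* s, std::size_t n) {
        Rep* r = new Rep;
        r->refs = 1;
        r->len = n;
        r->data = new char[n + 1];
        std::memcpy(r->data, s, n);
        r->data[n] = '\0';
        return r;
    }
    void release() {
        if (--rep_->refs == 0) { delete[] rep_->data; delete rep_; }
    }
    void make_unique() {
        if (rep_->refs > 1) {
            Rep* mine = make(rep_->data, rep_->len);
            --rep_->refs;
            rep_ = mine;
        }
    }

public:
    // Stands in for char&; it remembers the string and index, not an address.
    class CharRef {
        ProxyFString& s_;
        std::size_t   i_;
    public:
        CharRef(ProxyFString& s, std::size_t i) : s_(s), i_(i) {}
        CharRef(const CharRef& o) : s_(o.s_), i_(o.i_) {}
        operator char() const { return s_.rep_->data[i_]; }  // read: no copy
        CharRef& operator=(char c) {                         // write: unshare first
            s_.make_unique();
            s_.rep_->data[i_] = c;
            return *this;
        }
        CharRef& operator=(const CharRef& o) {
            return *this = static_cast<char>(o);
        }
    };
    friend class CharRef;  // needed on compilers where nested classes get no access

    ProxyFString(const char* s) : rep_(make(s, std::strlen(s))) {}
    ProxyFString(const ProxyFString& o) : rep_(o.rep_) { ++rep_->refs; }
    ProxyFString& operator=(const ProxyFString& o) {
        ++o.rep_->refs;
        release();
        rep_ = o.rep_;
        return *this;
    }
    ~ProxyFString() { release(); }

    CharRef operator[](std::size_t i) { return CharRef(*this, i); }
    char operator[](std::size_t i) const { return rep_->data[i]; }
    const char* c_str() const { return rep_->data; }
};

// Replays the x/y example with the helper in place of char&.
bool proxy_write_unshares() {
    ProxyFString x("Hello");
    ProxyFString::CharRef c = x[0];
    ProxyFString y(x);       // y shares x's storage
    c = 'J';                 // x gets its own copy before the write
    return std::strcmp(x.c_str(), "Jello") == 0 &&
           std::strcmp(y.c_str(), "Hello") == 0;
}
```

The cost is that `char &c = x[0];` no longer compiles, which is why a
class built this way cannot be a conforming basic_string.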

|>  2.  Is there a problem with the ratified C++ standard for string?
|>      Shouldn't 21.3.4 include some remark about the reference going
|>      invalid if another object's copy constructor is called using the
|>      object?  Or is the C++ standard fine and this only a problem for the
|>      library writer who must know to include such an aforementioned flag?

Well, I happen to think that there are a number of problems with the
ratified string class.  IMHO, it should allow operator[] to return a
smart reference and perhaps use different rules concerning the lifetime
of iterators and references.

--
James Kanze    +33 (0)1 39 23 84 71    mailto: kanze@gabi-soft.fr
GABI Software, 22 rue Jacques-Lemercier, 78000 Versailles, France
Conseils en informatique orientée objet --
              -- Beratung in objektorientierter Datenverarbeitung





Author: Matt Austern <austern@isolde.mti.sgi.com>
Date: 1998/01/06
kanze@gabi-soft.fr (J. Kanze) writes:

> |>  It is almost as if I have to add another internal flag to fstring which
> |>  states: "If the user ever got a non-const reference, then don't allow
> |>  anybody to ever add a reference count to shared storage -- instead,
> |>  force them to make a separate copy of the data since it can change via
> |>  the non-const reference."
> |>
> |>  My questions are:
> |>
> |>  1.  Must I implement fstring using this flag?
>
> If you want to conform to the proposed standard class, yes.  The
> standard defines ways of obtaining references (and iterators) into the
> string class, and the guaranteed valid lifetime for such references.
> Practically, you must somehow inhibit sharing during this lifetime, and
> the only way I know of doing it is such a flag.
>
> An alternative, which I used in my own string classes, is to make the
> returned value of operator[] a helper class, rather than a reference.
> This is not permitted by the standard for a conforming string class,
> however.

A third alternative, though, is just to abandon the whole reference
counting idea altogether.  It's fairly high overhead (it turns
character access from a simple dereference into something that
requires a branch, and it requires at least two additional flags), and
it's not at all clear, given the way that basic_string is defined,
that it buys you very much in this case.

The standard does not require, or even mention, reference counting.
It has always been the committee's intention that it be possible to
write a conforming non-reference counted implementation.
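
As a rough illustration of the non-reference counted route, here is a
sketch of a string that always owns its buffer.  PlainFString is a
hypothetical name, and the class is far smaller than basic_string; it is
meant only to show that character access becomes a plain dereference and
that the x/y example needs no flag.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical sketch: PlainFString is an assumed name, not a real class.
class PlainFString {
    std::size_t len_;
    char*       data_;

    static char* dup(const char* s, std::size_t n) {
        char* p = new char[n + 1];
        std::memcpy(p, s, n);
        p[n] = '\0';
        return p;
    }

public:
    PlainFString(const char* s) : len_(std::strlen(s)), data_(dup(s, len_)) {}
    // Every copy is a deep copy, so no two strings ever share a buffer.
    PlainFString(const PlainFString& o)
        : len_(o.len_), data_(dup(o.data_, o.len_)) {}
    PlainFString& operator=(const PlainFString& o) {
        if (this != &o) {
            char* fresh = dup(o.data_, o.len_);
            delete[] data_;
            data_ = fresh;
            len_ = o.len_;
        }
        return *this;
    }
    ~PlainFString() { delete[] data_; }

    // No branch, no flag: just index into the buffer.
    char& operator[](std::size_t i) { return data_[i]; }
    const char& operator[](std::size_t i) const { return data_[i]; }
    const char* c_str() const { return data_; }
    std::size_t size() const { return len_; }
};

// Replays the x/y example without any reference counting.
bool plain_copy_is_independent() {
    PlainFString x("Hello");
    char& c = x[0];
    PlainFString y(x);       // y has its own buffer
    c = 'J';
    return std::strcmp(x.c_str(), "Jello") == 0 &&
           std::strcmp(y.c_str(), "Hello") == 0;
}
```

The trade-off is that every copy, including those made for return
values, allocates and copies the characters.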





Author: polk@sprintmail.com (Max Polk)
Date: 1997/12/30
While trying to implement:

     char & fstring::operator [] (int)

where fstring is supposed to be as close to a char-based basic_string as
possible without using templates or exceptions due to my requirement to
use a very old C++ compiler, and where the above prototype was to
simulate the following (see 21.3.4 lib.string.access in final draft):

     reference operator [] (size_type pos);

I ran into a problem with reference counting and strings.  I know that
the user can change the character via the reference since it is not
const, so therefore I have to break any ties to other fstrings using the
same internal storage when the above function is called.  That is not a
problem.  The problem is what to do when the user obtains the
reference, then copy-constructs another string from the original,
and then writes through the reference:

     fstring  x ("Hello");          // x is "Hello"
     char     &c = x[0];            // c refers to the first char of x
     fstring  y (x);                // y shares x's internal storage
     c = 'J';                       // x is "Jello", y is "Jello" !!!

As you can see, there is no way for the assignment via the non-const
reference to disconnect the internal shared storage of y and x!  The
copy constructor does not invalidate the non-const reference: "[21.3.4]
The reference returned is invalid after any subsequent call to c_str(),
data(), or any non-const member function for the object."  And yet, the
user can change the character they are holding a reference to, and
violate data integrity.
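
To make the failure concrete, here is a small self-contained sketch of a
naive reference-counted string that reproduces it.  It is not the actual
fstring; NaiveFString and its members are hypothetical names, and the
class is deliberately stripped down to the copy constructor and
operator[].

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical, deliberately naive sketch; not the real fstring.
class NaiveFString {
    struct Rep {
        std::size_t refs;
        std::size_t len;
        char*       data;
    };
    Rep* rep_;

    static Rep* make(const char* s, std::size_t n) {
        Rep* r = new Rep;
        r->refs = 1;
        r->len = n;
        r->data = new char[n + 1];
        std::memcpy(r->data, s, n);
        r->data[n] = '\0';
        return r;
    }
    void release() {
        if (--rep_->refs == 0) { delete[] rep_->data; delete rep_; }
    }

public:
    NaiveFString(const char* s) : rep_(make(s, std::strlen(s))) {}
    // Always shares storage, even if a char& into it is still live.
    NaiveFString(const NaiveFString& o) : rep_(o.rep_) { ++rep_->refs; }
    NaiveFString& operator=(const NaiveFString& o) {
        ++o.rep_->refs;
        release();
        rep_ = o.rep_;
        return *this;
    }
    ~NaiveFString() { release(); }

    // Breaks ties to other strings at the time of the call, and only then.
    char& operator[](std::size_t i) {
        if (rep_->refs > 1) {
            Rep* mine = make(rep_->data, rep_->len);
            --rep_->refs;
            rep_ = mine;
        }
        return rep_->data[i];
    }
    const char* c_str() const { return rep_->data; }
};

// Returns true if the write through c also changed the copy y.
bool copy_was_corrupted() {
    NaiveFString x("Hello");
    char& c = x[0];          // x is unshared at this point
    NaiveFString y(x);       // y now shares x's storage
    c = 'J';                 // plain char assignment, nothing can intercept it
    return std::strcmp(y.c_str(), "Jello") == 0;
}
```

With this class, copy_was_corrupted() reports that y was changed as well,
because the write through c is an ordinary char assignment that the
string never sees.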

It is almost as if I have to add another internal flag to fstring which
states: "If the user ever got a non-const reference, then don't allow
anybody to ever add a reference count to shared storage -- instead,
force them to make a separate copy of the data since it can change via
the non-const reference."

My questions are:

1.  Must I implement fstring using this flag?

2.  Is there a problem with the ratified C++ standard for string?
    Shouldn't 21.3.4 include some remark about the reference going
    invalid if another object's copy constructor is called using the
    object?  Or is the C++ standard fine and this only a problem for the
    library writer who must know to include such an aforementioned flag?