Topic: Can std::basic_string implementations be reference counted?


Author: "Bill Wade" <bill.wade@stoner.com>
Date: 2000/06/27
Raw View
<kanze@gabi-soft.de> wrote
>Herb Sutter <hsutter@peerdirect.com> writes:


|>  There are better (read: both faster and safer) ways of optimizing string
|>  memory allocation than using COW. See GotW #45:

|>     http://www.peerdirect.com/resources/gotw045a.html

|>  See Rob Murray's "C++ Strategies and Tactics" book, pages 70-72, for
|>  more empirical tests that show COW is only beneficial in certain
|>  situations even in single-threaded code.

> I'm familiar with Murray's comments.  Basically, the benefits of COW
> depend on how much reuse the buffer actually gets.  If I remember
> correctly, his measures showed a cut-off point of about 2.5 instances
> per value.  In my own code, I note an average of about 4 or 5 copies.
> And almost never a string that isn't copied.

Rob Murray's example appears to have used two allocations (both with
general-purpose allocators) for each unique string.

Herb Sutter's code (URL above, written for Win32) uses the Win32 function
InterlockedExchangeAdd(&i,0) to perform an atomic read.  However, on Win32
all aligned 32-bit reads are documented as being atomic.
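
The difference between the two kinds of read can be sketched as follows. This is only an illustration: C++11 `std::atomic` is used as a portable stand-in for the Win32 calls (an assumption, since the code under discussion predates it), and the names `RefCount`, `read_via_rmw` and `read_via_load` are hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical reference count, standing in for the count kept in a
// COW string buffer.
using RefCount = std::atomic<long>;

// Read through a read-modify-write that adds zero, similar in spirit to
// InterlockedExchangeAdd(&i, 0): correct, but it pays for a locked operation.
inline long read_via_rmw(RefCount& rc) {
    return rc.fetch_add(0);
}

// Read through a plain atomic load. Where aligned word-sized reads are
// already atomic, this typically compiles to an ordinary load instruction.
inline long read_via_load(const RefCount& rc) {
    return rc.load(std::memory_order_acquire);
}
```

Both functions return the same value; only the cost of obtaining it differs.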

Modifying Herb's code to take advantage of this (and applying several other
minor optimizations to the COW code), his 2A test, with strings of length 9,
runs faster with MT COW (similar to his COW_AtomicInt2) than with non-COW
(his Plain).   Herb's 2A test corresponds (roughly) to Rob's test when there
are about 1.33 instances per value (it makes N copies of one value,
modifying two thirds of them so that they become unshared).  These tests use
the default allocator.  Non-COW with Herb's fast allocator is much faster
for these short strings.  I haven't tried using the fast allocator with COW.

I get some unexpected results when the string size is changed (results
aren't monotone).  Total time (seconds) for two runs with 1000000 copies
each (500 MHz PIII):
StringSize    PlainTime    AtomicIntTime
3             1.5          1.2
9             1.5          1.1
30            1.3          1.3
50            1.4          1.5
90            1.7          1.4
300           2.2          1.9
900           2.7          2.1
3000          4.4          3.3
The "copy" overhead doesn't become obvious until the strings become quite
large.  For shorter strings we are looking almost entirely at allocation
overhead.  I believe the default allocator probably has some block sizes
where it is relatively slow (and AtomicInt2 always allocates blocks that are
twelve bytes larger than the corresponding Plain blocks).
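
To make the block-size arithmetic concrete, here is a rough sketch. The `Header` layout is purely an assumption chosen to arrive at twelve bytes (three 32-bit fields); it is not a description of what the GotW #45 code actually stores, and `plain_block` and `cow_block` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical header placed in front of the characters. The field choice
// is an assumption made only to reach twelve bytes.
struct Header {
    std::uint32_t refs;      // reference count
    std::uint32_t length;    // characters in use
    std::uint32_t capacity;  // characters allocated
};
static_assert(sizeof(Header) == 12, "three 32-bit fields, no padding");

// Bytes requested from the allocator for a string of n characters.
constexpr std::size_t plain_block(std::size_t n) { return n; }
constexpr std::size_t cow_block(std::size_t n)   { return sizeof(Header) + n; }
```

With a layout like this, the two variants can land in different allocator size classes for the same string length, which is one possible reason the timings are not monotone.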

COW certainly has some disadvantages.  I just wanted to point out that some
of the timing results weren't as well optimized as they could have been.  I
must admit that I don't often see code where I'd get real excited about
saving 1.1 seconds when handling six billion characters.


---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]






Author: "David Abrahams" <abrahams@mediaone.net>
Date: 2000/06/20
Raw View
<kanze@gabi-soft.de> wrote in message news:861z1wl3d1.fsf@gabi-soft.de...

> Right now, most implementors are happy if they have any standard
> library, much less a good one.

On behalf of the implementors I know who work hard on not just the
completeness, but the quality of their library implementations, I object.
This is unfair and insulting at best. I personally know of at least four
library implementors who are highly concerned for the quality of their
product... and there aren't too many library implementors out there, AFAICT.

-Dave

And, FWIW, the "full run-time" checking you're hoping for is already
available in the STLport.







Author: kanze@gabi-soft.de
Date: 2000/06/22
Raw View
"David Abrahams" <abrahams@mediaone.net> writes:

|>  <kanze@gabi-soft.de> wrote in message news:861z1wl3d1.fsf@gabi-soft.de...

|>  > Right now, most implementors are happy if they have any standard
|>  > library, much less a good one.

|>  On behalf of the implementors I know who work hard on not just the
|>  completeness, but the quality of their library implementations, I
|>  object.  This is unfair and insulting at best. I personally know of
|>  at least four library implementors who are highly concerned for the
|>  quality of their product... and there aren't too many library
|>  implementors out there, AFAICT.

I didn't mean to criticize the implementors.  I know that they are
working hard, and that full conformance is an important issue for them.
And every implementor I know is concerned about the quality of their
product.  But the standard added an enormous number of new features, and
time and resources are not infinite.  The implementors do the best they
can, given the constraints, but the current situation is that just
conforming, without too many bugs, requires pushing their resources to
the limit.

Only once conformance is attained, can they start considering quality of
implementation issues (other than bugs).

|>  And, FWIW, the "full run-time" checking you're hoping for is already
|>  available in the STLport.

Which is the standard library in no compiler I'm aware of.

I know that such libraries exist.  Cay Horstmann made one available
shortly after the STL first appeared.  There is, however, a big
difference between simply existing somewhere on the net, and being
integrated, with selection through some sort of compiler option.

-- 
James Kanze                               mailto:kanze@gabi-soft.de
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627







Author: kanze@gabi-soft.de
Date: 2000/06/18
Raw View
Kevlin Henney <kevlin@curbralan.com> writes:

|>  I think perhaps it is time to reassess whether or not this
|>  constitutes something we want to consider a defect.

I'd rather prefer that some definition was used for defect, and all that
was evaluated was whether this case met the definition.  I didn't like
the decision when it was made, however, it was made according to the
formal proceedings of the committee, and unless there is really a
contradiction or an ambiguity in it, I think it would be mocking the
committee to try and change it with a defect report.

There are a number of things I don't like in the current standard.
However, I think what C++ needs most at this point is stability.  We
need to allow time for the implementors to catch up.  Making the
standard better is a nice objective, but if it means that the standard
is constantly changing so much that we cannot ever have a conforming
implementation, it does more harm than good.

(Obviously, real defects -- ambiguities or contradictions, for example
-- should be corrected, since it is impossible to conform to a
contradiction, or to know whether one conforms to an ambiguity.)

-- 
James Kanze                               mailto:kanze@gabi-soft.de
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627







Author: kanze@gabi-soft.de
Date: 2000/06/18
Raw View
Herb Sutter <hsutter@peerdirect.com> writes:

|>  kanze@gabi-soft.de writes:
|>  >"Andrei Alexandrescu" <andrewalex@hotmail.com> writes:

|>  >|>  That was my point: the cow does not pull its weight. (What a
|>  >|>  fat cow.)  We (GotW) discovered that mt cow sucks. Now we
|>  >|>  discover that, even if the standard made a conscious effort for
|>  >|>  allowing the cow, it failed in doing so.

|>  >|>  Moreover, all strings in all programs are six characters on the
|>  >|>  average :o).

|>  >And operator new/malloc is a very expensive function in most
|>  >implementations.  Even for six characters:-).

|>  And acquiring a mutex lock is usually far more expensive than a
|>  general-purpose memory allocation.

How do you do a general-purpose memory allocation without acquiring a
mutex lock?  And who says that the shared character blocks must be
acquired using a general-purpose memory allocator?  The idea wouldn't
even occur to me if I were implementing a string class where performance
were a problem.  I'd almost certainly use a fixed length allocator for
the descriptor blocks, and, since I can easily find all of the pointers
to the character blocks, some sort of copying garbage collection for
them.
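
A fixed-length allocator of the kind mentioned for the descriptor blocks might look roughly like this. It is a minimal sketch under stated assumptions: `FixedPool` is a hypothetical name, the pool is not thread-safe (a real one would need a lock or per-thread pools), and blocks still outstanding when the pool is destroyed are simply leaked.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical fixed-size pool: hands out blocks of one size from a free list.
template <std::size_t BlockSize>
class FixedPool {
    union Node {
        Node* next;  // link used while the block sits on the free list
        alignas(std::max_align_t) unsigned char bytes[BlockSize];
    };
    Node* free_ = nullptr;

public:
    FixedPool() = default;
    FixedPool(const FixedPool&) = delete;
    FixedPool& operator=(const FixedPool&) = delete;

    // Return every block on the free list to the general-purpose allocator.
    ~FixedPool() {
        while (free_) {
            Node* n = free_;
            free_ = n->next;
            ::operator delete(n);
        }
    }

    // Reuse a freed block if one exists; otherwise fall back to operator new.
    void* allocate() {
        if (free_) {
            Node* n = free_;
            free_ = n->next;
            return n;
        }
        return ::operator new(sizeof(Node));
    }

    // Push the block onto the free list; no call into the general allocator.
    void deallocate(void* p) {
        Node* n = static_cast<Node*>(p);
        n->next = free_;
        free_ = n;
    }
};
```

Once the pool is warm, allocation and deallocation are a couple of pointer assignments each.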

|>  Even atomic integer operations are
|>  relatively expensive compared to plain integer operations.

True, but still cheaper than allocation and/or acquiring a mutex.

Of course, with the standard string class and COW, you need to acquire
the mutex for a lot of functions where you end up not copying.  This
isn't true for my string class, nor for Java's.

|>  There are better (read: both faster and safer) ways of optimizing string
|>  memory allocation than using COW. See GotW #45:

|>     http://www.peerdirect.com/resources/gotw045a.html

|>  Two of the results reported there are:

|>  3. It is a myth that COW's principal advantage [always] lies in
|>     avoiding memory allocations.  Especially for longer
|>     strings, COW's principal advantage is that it avoids
|>     copying the characters in the string.

|>  Here I should add: On compilers with reasonably efficient default
|>  memory allocators. This includes MSVC. Even so, for short strings
|>  allocation was still important enough to be optimizable:

It apparently doesn't include Sun OS or Solaris, for small blocks.
(The Sun default allocator is OK for large blocks, but becomes
incredibly slow for a lot of small blocks.  And it has been some time
since I made the measures -- I've not had any performance problems
involving allocation recently.)

|>  4. Optimized allocation, not COW, was a consistent true
|>     speed optimization in all cases (but note that it
|>     does trade off space). Here is perhaps the most
|>     important conclusion from the Detailed Measurements
|>     section:

|>    "* Most of COW's primary advantage for small strings could be
|>        gained without COW by using a more efficient allocator.
|>        (Of course, you could also do both -- use COW and an
|>        efficient allocator.)"

|>  See Rob Murray's "C++ Strategies and Tactics" book, pages 70-72, for
|>  more empirical tests that show COW is only beneficial in certain
|>  situations even in single-threaded code.

I'm familiar with Murray's comments.  Basically, the benefits of COW
depend on how much reuse the buffer actually gets.  If I remember
correctly, his measures showed a cut-off point of about 2.5 instances
per value.  In my own code, I note an average of about 4 or 5 copies.
And almost never a string that isn't copied.

-- 
James Kanze                               mailto:kanze@gabi-soft.de
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627







Author: kanze@gabi-soft.de
Date: 2000/06/19
Raw View
Kevlin Henney <kevlin@curbralan.com> writes:

    [concerning std::string...]
|>  >So where's the problem?  In practice, I would expect a quality
|>  >implementation to use a class type for all iterators by default
|>  >anyway.

|>  I personally do not have a problem with this, but I expect that a
|>  number of implementors might.

Right now, most implementors are happy if they have any standard
library, much less a good one.  I expect, however, that once the basics
become stable, most implementors will offer several variants of the
standard library, some with full run-time checking (which means all
iterators are class types, since T*'s don't do a lot of checking on
their own), some without, for performance reasons.  Hopefully, the
default will be with the checking, and the faster versions will require
a special flag, but even in the other case, my makefiles have always
distinguished between test builds and delivery builds, and I'm used to
putting tons of flags in the compiler invocation anyway.

-- 
James Kanze                               mailto:kanze@gabi-soft.de
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627







Author: kanze@gabi-soft.de
Date: 2000/06/19
Raw View
"Andrei Alexandrescu" <andrewalex@hotmail.com> writes:

|> |>  And operator new/malloc is a very expensive function in most
|> |>  implementations.  Even for six characters:-).

|>  Right. However, I saw a nice article by Jack Reeves in the Report. He
|>  stored small strings right in the space used for the pointers. Hard to
|>  implement and very hard to implement generically and portably, but
|>  worth looking into.

How small?  I once did something similar for IO parameter blocks, for
blocks up to four bytes.  In that case, it was definitely useful, because
we were doing a lot of single byte IO.  For strings, I'm less convinced.

With regards to generically, who cares?  Specialize basic_string< char >
and basic_string< wchar_t >; how many people actually use anything else?
Portably, I don't see the problem.  The number of characters that will
fit in a pointer will vary, but other than that?
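
For concreteness, storing a short string in the space otherwise used for the pointer could be sketched like this. It is a non-generic illustration for `char` only; `TinyString` is a hypothetical class, not the design from the article mentioned, and copying is disabled to keep it short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical sketch: short strings live inside the object, long ones on
// the heap. The inline capacity is sizeof(char*), so it varies by platform.
class TinyString {
    static const std::size_t kInline = sizeof(char*);
    union {
        char* heap;                  // used when the text does not fit
        char local[sizeof(char*)];   // used when it does
    } u_;
    std::size_t len_;

public:
    explicit TinyString(const char* s) : len_(std::strlen(s)) {
        if (len_ <= kInline) {
            std::memcpy(u_.local, s, len_);  // fits: no allocation at all
        } else {
            u_.heap = new char[len_];
            std::memcpy(u_.heap, s, len_);
        }
    }
    TinyString(const TinyString&) = delete;
    TinyString& operator=(const TinyString&) = delete;
    ~TinyString() { if (!is_inline()) delete[] u_.heap; }

    bool is_inline() const { return len_ <= kInline; }
    std::size_t size() const { return len_; }
    // Not NUL-terminated; use together with size().
    const char* data() const { return is_inline() ? u_.local : u_.heap; }
};
```

The only platform-dependent part is how many characters fit, which is the point made above.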

-- 
James Kanze                               mailto:kanze@gabi-soft.de
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627







Author: kanze@gabi-soft.de
Date: 2000/06/19
Raw View
Francis Glassborow <francis@robinton.demon.co.uk> writes:

|>  In article <86zoparke6.fsf@gabi-soft.de>, kanze@gabi-soft.de writes
|>  >A defect report should not be used to change an
|>  >explicit decision of the committee, however, even if you and I don't
|>  >agree with that decision.

|>  Even if it proves to be a mistake, unimplementable, ambiguous or
|>  contradictory? I am not claiming that the case in question is any of
|>  these, I just wish to point out that whether something is a
|>  deliberate decision of the Committee or not says little about its
|>  defectiveness.

Are you saying that the committee deliberately decided to make certain
things unimplementable, ambiguous or contradictory:-)?

Seriously, of course.  In this case, however, if I understood correctly,
it was a question of removing some undefined behavior.  Since there was
an explicit vote to make the behavior undefined, I don't think that a
defect report can be used to make it defined.  At least not directly --
if in the course of correcting a real defect, it "accidentally" becomes
defined as a side effect, I guess that is acceptable.

-- 
James Kanze                               mailto:kanze@gabi-soft.de
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627







Author: kanze@gabi-soft.de
Date: 2000/06/19
Raw View
"David Abrahams" <abrahams@mediaone.net> writes:

|>  "Herb Sutter" <hsutter@peerdirect.com> wrote in message
|>  > Good point, and let me address it again: With thread-safe non-COW,
|>  > you do one allocation (incl. lock) per copy operation only. With
|>  > thread-safe COW, you do one lock (mutex or atomic int op) per
|>  > possibly mutating operation on the string whether the code that
|>  > uses it is multithreaded or not -- which is often spelt "ouch."

|>  I think Herb must be overstating the case here. Processors as old as
|>  the Motorola 68000 had an atomic decrement-and-test instruction. I
|>  don't claim to be an expert on the gamut of modern CPU
|>  architectures, but I would be astonished to find that anyone was
|>  doing multithreading these days on a machine that doesn't support
|>  such a facility.

I almost responded similarly -- the original 8086 had a lock prefix
which made the following instruction atomic.  However, on thinking about
it: making something atomic certainly has some implications with regards
to the pipeline.  This certainly wouldn't have a measurable effect on
the 8086 or the 68000, but on a modern processor, if someone showed me a
measurement showing that an atomic incr was five times longer than a
normal one (on the average), I wouldn't necessarily be surprised.

The question still remains as to what this signifies.  In my own work,
I'd guess that most of what I do with strings is copy them, with a fair
amount of concatenation, and a lot of just scanning (read-only
operations).  Typically, the scanning occurs in separate functions,
which are passed std::string const&, so one can hope that threading
issues are completely irrelevant here.  For the rest, in the absence of
measurements, especially on modern processors, I won't really say
anything.  I have a good idea how I would implement std::string, using
copy on write, but until I've actually done so, and measured, I have no
idea whether it would really be significantly faster (although I have
strong reasons to believe so).

-- 
James Kanze                               mailto:kanze@gabi-soft.de
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627







Author: kanze@gabi-soft.de
Date: 2000/06/19
Raw View
Kevlin Henney <kevlin@curbralan.com> writes:

|>  Even if one comes down to mandating concrete types, I think one
|>  needs at least two: modifiable strings and non-modifiable
|>  strings. This is the model I have in some code, with the latter ref
|>  counted and the former not.

I sort of agree.  Before the standard, I had two "string" classes:
String and StringBuffer.  Conceptually, however, I had one unmodifiable
string class String, and a certain number of string builders -- if the
only modifiable "string" was StringBuffer, it was principally because
the only way I'd needed to build up a string was by appending elements
(usually single characters).  Had I needed to build strings in a
different manner, I would have created another string builder.

IMHO: building a string is an operation, which should be done by a
specialized class.  Once built, a string has the value it has -- it
can't be changed.  Except that C++ uses value semantics, so of course,
it can be changed by assignment.
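
The split between an unmodifiable string and a string builder could be sketched along these lines. The classes below are hypothetical stand-ins, not the original String and StringBuffer; `std::shared_ptr` is used here only as a convenient reference-counting mechanism.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical immutable value: copies share one buffer, nothing modifies it.
class ImmutableString {
    std::shared_ptr<const std::string> rep_;

public:
    explicit ImmutableString(std::string s)
        : rep_(std::make_shared<std::string>(std::move(s))) {}

    const std::string& str() const { return *rep_; }
    bool shares_with(const ImmutableString& o) const { return rep_ == o.rep_; }
};

// Hypothetical builder: the only place where characters are appended.
class StringBuilder {
    std::string buf_;

public:
    StringBuilder& append(char c) { buf_ += c; return *this; }
    StringBuilder& append(const std::string& s) { buf_ += s; return *this; }

    // Snapshot the current contents as an immutable value.
    ImmutableString build() const { return ImmutableString(buf_); }
};
```

Assigning to an `ImmutableString` variable rebinds it to another value; the characters of the old value are never touched, so no copy-on-write check is needed.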

-- 
James Kanze                               mailto:kanze@gabi-soft.de
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627







Author: Herb Sutter <hsutter@peerdirect.com>
Date: 2000/06/06
Raw View
"David Abrahams" <abrahams@mediaone.net> writes:
>"Herb Sutter" <hsutter@peerdirect.com> wrote in message
>> Good point, and let me address it again: With thread-safe non-COW, you
>> do one allocation (incl. lock) per copy operation only. With thread-safe
>> COW, you do one lock (mutex or atomic int op) per possibly mutating
>> operation on the string whether the code that uses it is multithreaded
>> or not -- which is often spelt "ouch."
>
>I think Herb must be overstating the case here.

Unfortunately, no. On my test platform, the fastest possible MT-safe COW
string (using efficient atomic int ops) was slower than the plain
(non-COW) string even in cases where there were more copy operations
than possibly-mutating operations (and no actually mutating operations)
in the entire system. The test harness was single-threaded.

I measured commercial implementations. I also measured seven (7) of my
own hand-written and heavily hand-optimized partial string
implementations, for which I provide full source code... quoting from
GotW #45 (October 1998), they are:

             Name  Description
  ---------------  -----------------------------------------------------
            Plain  Non-use-counted string; all others are modeled on
                   this (a refined version of the GotW #43 answer)

       COW_Unsafe  Plain + COW, not thread-safe
                   (a refined version of the GotW #44 answer)

    COW_AtomicInt  Plain + COW + thread-safe
                   (a refined version of the GotW #45 1(a) answer above)

   COW_AtomicInt2  COW_AtomicInt + StringBuf in same buffer as the data
                   (another refined version of GotW #45 #1(a))

      COW_CritSec  Plain + COW + thread-safe (Win32 critical sections)
                   (a refined version of GotW #45 #1(b) answer above)

        COW_Mutex  Plain + COW + thread-safe (Win32 mutexes)
                   (COW_CritSec w/ mutexes instead of critical sections)

  I also threw in a seventh flavour to measure the result of
  optimizing memory allocation instead of optimizing copying:

  Plain_FastAlloc  Plain + an optimized memory allocator


>Processors as old as the
>Motorola 68000 had an atomic decrement-and-test instruction. [...]

The availability of atomic int ops isn't at issue, because I assume
they're available and focused on them. In fact, I coded and tested two
(2) variant implementations of COW with atomic int ops to demonstrate
different potential optimizations.

(Why two versions? Someone claimed that my AtomicInt implementation
wasn't optimal because I was allocating a StringBuf and data separately,
so I implemented the suggested optimization as a variant -- it turned
out to be slightly slower than my original, in which I'd already
specifically optimized away the "double allocation" cost. The same
someone then claimed that I must be incurring function-call overhead on
my atomic int operations, so I hand-coded x86 assembler for the atomic
int ops -- the hand-coded assembler was slightly slower than the Win32
atomic int ops, which were remarkably efficient. Then people stopped
criticizing my optimizations, although I'm still willing to consider new
suggestions -- Bill Wade just suggested another possible optimization
that could help some cases.)

(If atomic int ops weren't available, MT-safe COW would be nearly always
a pure pessimization. Alas, last time I looked a year ago, at least one
popular standard library implementation still shipped a COW basic_string
that used critical sections or mutexes and, when compiled for MT-safe
mode, became one or two orders of magnitude slower in typical use
cases.)

Atomic int operations were still much slower than plain integer
operations on the popular (Pentium MMX or II) platform on which I
tested, resulting in measurable overhead for MT-safe COW even if the
using code happened to be single-threaded or didn't share strings
between threads. Quoting from GotW #45:

  I focused on comparing Plain with COW_AtomicInt.
  COW_AtomicInt was generally the most efficient
  thread-safe COW implementation.  The results were as
  follows:

  1. For all mutating and possibly-mutating operations,
     COW_AtomicInt was always worse than Plain. This is
     natural and expected.

  2. COW should shine when there are many unmodified
     copies, but for an average string length of 50:

     a) When 33% of all copies were never modified,
        and the rest were modified only once each,
        COW_AtomicInt was still slower than Plain.

     b) When 50% of all copies were never modified,
        and the rest were modified only thrice each,
        COW_AtomicInt was still slower than Plain.

     This result may be more surprising to many --
     particularly that COW_AtomicInt is slower in cases
     where there are more copy operations than mutating
     operations in the entire system!

And, incidentally, I failed to note that these were actually only
possibly-mutating operations; none actually modified the string, but
they were enough to trigger an unshare.
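
How a possibly-mutating operation triggers an unshare without modifying anything can be seen in a sketch like the following. It is not the GotW #45 source: `CowString` is a hypothetical class, `std::atomic` stands in for the Win32 atomic int ops, and the unshareable flag a real implementation needs after handing out a reference is omitted.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical COW string with an atomic reference count.
class CowString {
    struct Buf {
        std::atomic<long> refs;
        std::string chars;
        explicit Buf(const std::string& s) : refs(1), chars(s) {}
    };
    Buf* buf_;

    void release() { if (buf_->refs.fetch_sub(1) == 1) delete buf_; }

    // Called by every possibly-mutating operation: at least one atomic
    // operation is paid even when no copy turns out to be necessary.
    void unshare() {
        if (buf_->refs.load() > 1) {
            Buf* fresh = new Buf(buf_->chars);  // deep copy
            release();
            buf_ = fresh;
        }
    }

public:
    explicit CowString(const std::string& s) : buf_(new Buf(s)) {}
    CowString(const CowString& o) : buf_(o.buf_) { buf_->refs.fetch_add(1); }
    CowString& operator=(const CowString&) = delete;
    ~CowString() { release(); }

    // Read-only access: never unshares.
    char operator[](std::size_t i) const { return buf_->chars[i]; }

    // Possibly-mutating access: must unshare, because the caller receives a
    // reference it might write through, even if it only reads.
    char& operator[](std::size_t i) { unshare(); return buf_->chars[i]; }

    bool shares_with(const CowString& o) const { return buf_ == o.buf_; }
};
```

Reading `b[0]` on a non-const `b` selects the non-const overload, so a shared buffer is copied although nothing is written.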

  [... more results ...]

>[...] give all COW FUD a rest.

No FUD. I measured.

  http://www.peerdirect.com/resources/gotw045a.html

To see what the overhead would be on a Motorola or other system, just
take my source code (with minor tweaks to get rid of the Windows-isms)
and compare the Plain and COW_AtomicInt cases on your platform.

See also:

  Sutter, H. "Optimizations That Aren't (In a Multithreaded World),"
             C/C++ Users Journal, 17(6), June 1999.

Herb

---
Herb Sutter (mailto:hsutter@peerdirect.com)

CTO, PeerDirect Inc. (http://www.peerdirect.com)
Chair, ANSI SQL Part 12: SQL/Replication
Editor-in-Chief, C++ Report (http://www.creport.com)







Author: "Andrei Alexandrescu" <andrewalex@hotmail.com>
Date: 2000/06/07
Raw View
Herb Sutter <hsutter@peerdirect.com> wrote in message
news:9vinjs8iotat8odjvq87ebcm0ti98mn4ii@4ax.com...
[snip]

I guess this is reason enough to send the mad cow to other pastures...
not in string implementations.


Andrei








Author: Kevlin Henney <kevlin@curbralan.com>
Date: 2000/06/02
Raw View
In article <1ebiw1e.8g6zt81j2pi6kN@slip139-92-12-136.hm.de.prserv.net>,
Joerg Barfurth <joerg.barfurth@attglobal.net> writes
>Bill Wade <bill.wade@stoner.com> wrote:
>> When the standard talks about a "const member function" I expect that it is
>> referring to
>>   9.3.1/3 "A member function declared const is a const member function."
>
>Note that 9.3.1 is titled "Nonstatic member functions".
>
>> I expect that any member function which is not declared "const" and not
>> declared "const volatile" is a non-const member function.
>
>It is not a const member function, but does that make it non-const ?
>Here you assume that Boolean logic applies this way, which I questioned.

I think, as you have pointed out, that the Law of the Excluded Middle
does not apply here.
____________________________________________________________

  Kevlin Henney                   phone:  +44 117 942 2990
  Curbralan Ltd                   mobile: +44 7801 073 508
  kevlin@curbralan.com            fax:    +44 870 052 2289
____________________________________________________________







Author: "Bill Wade" <bill.wade@stoner.com>
Date: 2000/06/02
Raw View
>If the concept does not apply to a function, you cannot ascribe any
>relation of it to constness except 'indifferent'. This would mean that
>it is neither const nor non-const. But that means your application of
>Boolean logic is in error.

If you dispute the statement that

  All static member functions are non-const.

I would assume you must also dispute the statement that

  All const member functions are non-static.

Both are just different ways of stating that const and static mf's aren't
indifferent, they are exclusive.

The existence of more than two states is not sufficient to say that not-ax
does not apply.

unsigned and non-signed don't mean the same thing.  char is a non-signed
integer type.  That does not mean that it is an unsigned integer type.  It
also doesn't mean that char is indifferent to signedness.  It just means
that it is included in the definition of "integer type" and is not included
in the definition of "signed integer type".
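
The point about char can be checked mechanically. The assertions below use the standard `<type_traits>` header (C++11, so newer than this discussion) and rely only on the language rule that char is a type distinct from both signed char and unsigned char.

```cpp
#include <cassert>
#include <type_traits>

// char is a distinct type from both signed char and unsigned char,
// whichever of the two it happens to share a representation with.
static_assert(!std::is_same<char, signed char>::value,
              "char is not the type signed char");
static_assert(!std::is_same<char, unsigned char>::value,
              "char is not the type unsigned char");
static_assert(std::is_integral<char>::value,
              "char is an integer type");
```

So char is an integer type that is neither of the explicitly signed or unsigned character types, which is the "more than two states" situation described above.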

Saying that a member function is non-const, does not mean that it will be an
"unconst" member function when some future version of the standard adds that
keyword.  It is likely that 'tors and static members will be both non-const
and non-unconst.










Author: Herb Sutter <hsutter@peerdirect.com>
Date: 2000/06/03
Raw View
"Bill Wade" <bill.wade@stoner.com> writes:
>"Herb Sutter" <hsutter@peerdirect.com> wrote in message
>news:knf2jsgkbk6nbo37s9aq8s64i28rm4ioo0@4ax.com...
>> And acquiring a mutex lock is usually far more expensive than a
>> general-purpose memory allocation. Even atomic integer operations are
>> relatively expensive compared to plain integer operations.
>
>I don't really want to dispute your MT conclusions since I don't have a lot
>of practical experience there, so take my MT comments with a large grain of
>salt.
>
>However, many MT allocators also acquire a mutex lock, and most good ones do
>so in the case where a block is allocated in one thread and free'd in
>another.

Good point, and let me address it again: With thread-safe non-COW, you
do one allocation (incl. lock) per copy operation only. With thread-safe
COW, you do one lock (mutex or atomic int op) per possibly mutating
operation on the string whether the code that uses it is multithreaded
or not -- which is often spelt "ouch."

You suggested an additional optimization for the unshareable flag that
can help for some cases. I haven't had a chance to look at it right now,
but I'll make a point of looking at it next time I revisit that
material. Thanks!

Herb

---
Herb Sutter (mailto:hsutter@peerdirect.com)

CTO, PeerDirect Inc. (http://www.peerdirect.com)
Chair, ANSI SQL Part 12: SQL/Replication
Editor-in-Chief, C++ Report (http://www.creport.com)







Author: "Bill Wade" <bill.wade@stoner.com>
Date: 2000/06/03
Raw View
"Herb Sutter" <hsutter@peerdirect.com> wrote in message
news:5dnfjsordeor0acfihd2ev27hiiouj93ss@4ax.com...

> You suggested an additional optimization for the unshareable flag that
> can help for some cases. I haven't had a chance to look at it right now,
> but I'll make a point of looking at it next time I revisit that
> material. Thanks!

It seems like it would also be possible to have an unguarded
(this_thread_knows_the_reference_count_is_1) flag (know_1 for short).  An
operation like append() checks the flag (not atomic).  If true, no atomic
operations will be required.  If false, append() will eventually set it to
true, while leaving the unshareable flag cleared.  Subsequent append
operations (until the next COW) are cheap.

COW would still never need fewer locks than non-COW, but the number of extra
locks would be limited (to something like two extra locks during the
lifetime of an allocated string).
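
A rough sketch of the know_1 idea, under loud assumptions: std::atomic (C++11) stands in for the platform's atomic operations, every name here (SharedBuf, CowString, atomic_checks) is hypothetical, the unshareable flag is not modelled at all, and each handle is assumed to be used, including being copied from, by only one thread at a time. It is not taken from any real library.

```cpp
#include <atomic>
#include <cassert>
#include <string>

// Hypothetical shared buffer. The count is atomic because handles in
// different threads may point at the same buffer.
struct SharedBuf {
    std::atomic<int> refs;
    std::string chars;                      // stands in for the character array
    explicit SharedBuf(const std::string& s) : refs(1), chars(s) {}
};

// Hypothetical COW handle. know_1_ lives in the handle, so only the thread
// that owns the handle reads or writes it and no lock is needed for it.
class CowString {
public:
    explicit CowString(const std::string& s)
        : buf_(new SharedBuf(s)), know_1_(true), atomic_checks_(0) {}

    CowString(const CowString& other)
        : buf_(other.buf_), know_1_(false), atomic_checks_(0) {
        buf_->refs.fetch_add(1);            // sharing always costs one atomic op
        other.know_1_ = false;              // the source is now shared as well
    }

    CowString& operator=(const CowString&) = delete;   // omitted for brevity

    ~CowString() { release(); }

    void append(const std::string& tail) {
        if (!know_1_) {                     // plain, non-atomic test
            ++atomic_checks_;
            if (buf_->refs.load() != 1) {   // atomic read of the shared count
                SharedBuf* fresh = new SharedBuf(buf_->chars);  // copy on write
                release();
                buf_ = fresh;
            }
            know_1_ = true;                 // later appends skip the atomic read
        }
        buf_->chars += tail;
    }

    const std::string& str() const { return buf_->chars; }

    // Number of times append() had to consult the atomic count.
    int atomic_checks() const { return atomic_checks_; }

private:
    void release() {
        if (buf_->refs.fetch_sub(1) == 1) delete buf_;
    }

    SharedBuf* buf_;
    mutable bool know_1_;
    int atomic_checks_;
};
```

After a copy, the first append() on each handle pays for one atomic read; every later append() on that handle takes the cheap path until the handle is copied again.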








Author: "Anders J. Munch" <andersjm@post.tele.dk>
Date: 2000/06/03
Raw View
>Anders J. Munch wrote:
>>Nonsense.  The object does exist.  It just hasn't entered the stage of
>>its existence which the standard calls the 'lifetime'.
>>
>> '3.8/1: "The lifetime of an object is a runtime property of the
>> object."

Francis Glassborow wrote:
>Perhaps until you make a request for interpretation as a result of which
>the C++ Committees uphold your view. We certainly intended (and, yes I
>was actually a participant in that decision so carry part of the blame
>if the words do not express our intent) that an object had no existence
>outside its lifetime. We intended that there was no complete object
>prior to the start of its lifetime nor after the end of it.

I am quite happy with the standard as it is.  Nowhere does it attempt
to define what it means to exist, fortunately.

About the 'lifetime' word.  '3.8/1 tells me that this is a purely
technical term, with a specific meaning in the context of the IS.
Some parts of the IS define when an object is within its 'lifetime',
and other parts of the IS define which operations are valid only
within the 'lifetime'.  Replace the word 'lifetime' with 'frobsz'
everywhere and the meaning stays the same, because the word is used as
an intrareference, not for the common meaning of the word.

Other interpretations are hard to reconcile with (from '1.8/1): "An
_object_ is a region of storage.".  On request I could find plenty of
other places where it is implied that *this is an object or even where
*this is called an 'object' out loud.

If this is not what you intended, then IMHO the IS came out *better*
than you intended.  As for why, see my reply to Herb.

- Anders

--
Anders Munch, andersjm@post.tele.dk
Still confused but at a higher level.










Author: "Anders J. Munch" <andersjm@post.tele.dk>
Date: 2000/06/03
Raw View
Herb Sutter wrote:
>Anders J. Munch wrote:
>>Herb Sutter wrote in message ...
>>>No, it's stronger: While the constructor is running, the object does
>>>not
>>>yet exist. Object lifetime begins when the constructor returns
>>>successfully. It's not that constructors aren't const, so much that
>>>constness isn't an applicable concept until the object in question
>>>exists (is constructed).
>>

>>Nonsense.  The object does exist.

>If so, then what is its type? See below.

>While a building is under construction, it's true that some component
>parts of it do exist. We generally don't call a collection of
>scaffolding a "building" until it's done, though.

Metaphors can be a great way of explaining a concept to someone who is
clueless.  Which I am not.  I'll refrain from a clever reply within the
metaphor, because using metaphors in an argument like this invariably
creates misunderstandings and confusion.  Examples are much better.

>What statement do you disagree with? We agree that the lifetime doesn't
>begin until the constructor ends successfully.

If you are using 'lifetime' as the technical term defined in the IS, we
agree.  If you are using it as in the statement 'the lifetime of X is the
period of time during which X exists' then I disagree.  And the statement
I unconditionally disagree with is this:

>>>While the constructor is running, the object does not yet exist.



>You may want to say that
>an object t of type T "exists" during construction and consider it a
>matter of taste, but I think it's invalid and a bad mental model.
[...]
>Until the constructor finishes, you have memory that's a bundle of bits.
>If you want to call that an object, I ask "of what type?"

If you want to ask "of what type?", I ask "in which type system?" :-)

First, some example code:
    class BASE
    {
        public:  BASE()  {
            // A = the type of x here.
        }
    };

    class X : public BASE
    // Class invariant: j > 0
    {
        public:
            int i, j,k;
            X(int i_) : i(i_)
            {
                // B = the type of *this here
                j = 1;
                // C = the type of *this here
            }
            void f()
            {
                j = -1;
                // D = the type of *this here
                j = +1;
            }
    };

    int main()
    {
        X x(2);
        // E = the type of x here
        x.f();
        x.k = 3;
        // F = the type of x here
    }

I understand the question of what type the object has during
construction as asking for the value of B (or C).

(Before I go any further, let me just say that the dynamic type A
makes no sense, as the object x does not yet exist at this point, I
think we can agree on that.  You will hear no argument from me on what
happens in the initializer list.  It is only when the opening brace is
crossed that I am compelled to say that the object exists.)

In the static type description language which is a part of C++, the
answer is simple.  A = B = C = D = E = F = X.

Since it is hardly any challenge to come up with that, I can only assume
that some other, more descriptive, dynamic type system was implied.
One might ask for the dynamic type expressed using C++ type
expressions, but I think that would be doing Herb a great injustice: I
can think of no alternative to B = C = X in this system, as no other
values of B and C capture the fact that e.g. this->X::i is a valid
expression.

Time to bring out the big gun: the most accurate and detailed type
system that I can imagine.

Definition: the type of x = the (mathematical) set of all operations
that can be performed on or with x within defined behaviour.

To see just how extreme this type system is, consider that the types
of the constants 0 and 1 are different: The type of 1 contains the
operation "can be divided by".  The type of 0 does not.  Strictly
speaking 1 and 2 have different types, too, because the type of 2
contains the operation "can be subtracted one from then divided by"
which 1 does not.

Within the example,
    E = { can have address taken, can have X::f() invoked, can read
   and write to X::i and X::j, can write to X::k ....etc... };

Notably E doesn't contain "can read from X::k", because k is
uninitialized and reading from it would cause undefined behaviour. F
does contain "can read from X::k".

Now with this definition explained, I propose that C = E.  That the
type within the constructor and the type after the constructor are
exactly the same.  And when the most detailed type system I can
imagine can't tell them apart, I certainly don't want to do so either,
and I will insist that at the point of B and C an object does exist,
and that it has static type X.

With respect to the object not existing until it upholds the class invariants:
You are essentially proposing a dynamic type system in which x can be
said to be of type X only if it fulfills the invariants of X.  But
then what is the value of E?  What is the dynamic type of x when,
during the processing of X::f(), the invariants temporarily do not
hold?  Does x temporarily have no type?  Does x temporarily not
exist??  My answer is that, useful as they may be, class invariants
are in the minds of programmers, not a part of the language, and they
have no bearing on which objects exist.

- Anders

--
Anders Munch, andersjm@post.tele.dk
Still confused but at a higher level.








Author: Dennis Yelle <dennis51@jps.net>
Date: 2000/06/04
Raw View
"Anders J. Munch" wrote:
[...]
> If you want to ask "of what type?", I ask "in which type system?" :-)
>
> First, some example code:
>     class BASE
>     {
>         public:  BASE()  {
>             // A = the type of x here.
>         }
>     };
>
>     class X : public BASE
>     // Class invariant: j > 0
>     {
>         public:
>             int i, j,k;
>             X(int i_) : i(i_)
>             {
>                 // B = the type of *this here
>                 j = 1;
>                 // C = the type of *this here
>             }
>             void f()
>             {
>                 j = -1;
>                 // D = the type of *this here
>                 j = +1;
>             }
>     };
>
>     int main()
>     {
>          X x(2);
>         // E = the type of x here
>         x.f();
>         x.k  = 3;
>         // F = the type of x here
>     }
>
> I understand the question of what type the object has during
> construction as asking for the value of B (or C).
>
> (Before I go any further, let me just say that the dynamic type A
> makes no sense, as the object x does not yet exist at this point, I
> think we can agree on that.  You will hear no argument from me on what
> happens in the initializer list.  It is only when the opening brace is
> crossed that I am compelled to say that the object exists.)
>
> In the static type description language which is a part of C++, the
> answer is simple.  A = B = C = D = E = F = X.

I think I agree with everything you say, except when
you say the type at A is X.  A is inside the initializer list of X.
I am not willing to say that X exists at that time.

In the example above, these two lines are equivalent:
                 X(int i_) : i(i_)
                 X(int i_) : BASE(), i(i_)
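
A small sketch of that equivalence, with hypothetical names (trace, Base, Member, Implicit, Explicit, construction_order) that are not from the original example. It records the order in which things are constructed with and without the base class named in the initializer list. The two forms agree here because the base has a user-written default constructor, as BASE does above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical trace log used only to record construction order.
inline std::vector<std::string>& trace() {
    static std::vector<std::string> log;
    return log;
}

struct Base {
    Base() { trace().push_back("Base()"); }
};

struct Member {
    explicit Member(int) { trace().push_back("Member(int)"); }
};

// Base is not mentioned in the initializer list.
struct Implicit : Base {
    Member m;
    explicit Implicit(int v) : m(v) { trace().push_back("body"); }
};

// Base is named explicitly.
struct Explicit : Base {
    Member m;
    explicit Explicit(int v) : Base(), m(v) { trace().push_back("body"); }
};

// Returns the construction order recorded while building one T.
template <class T>
std::vector<std::string> construction_order() {
    trace().clear();
    T object(2);
    (void)object;
    return trace();
}
```

In both cases the base constructor runs first, then the member initializer, then the constructor body.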

Dennis Yelle

--
I am a computer programmer and I am looking for a job.
There is a link to my resume here:  http://table.jps.net/~vert







Author: Herb Sutter <hsutter@peerdirect.com>
Date: 2000/06/05
Raw View
"Anders J. Munch" <andersjm@post.tele.dk> writes:
>Herb Sutter wrote:
>>Until the constructor finishes, you have memory that's a bundle of bits.
>>If you want to call that an object, I ask "of what type?"
>
>If you want to ask "of what type?", I ask "in which type system?" :-)

I guess I don't understand what's so complex about the question "what is
an object's type?" Obviously (to me) it means: What operations are valid
on the object, with what semantics? Does it meet the invariants of an
object of a given type, or not? Can all the operations valid for an
object of the given type be applied to it with the same semantics, or
not?

You claimed that a piece of memory whose constructor has not yet run to
completion is an "object." So what's its type? It can't be the type of
the class whose constructor is being run because the constructor isn't
done yet, and so the object can't be known to yet meet the invariants of
that class.

>    class BASE
>    {
>        public:  BASE()  {
>            // A = the type of x here.
>        }
>    };
>
>    class X : public BASE
>    // Class invariant: j > 0
>    {
>        public:
>            int i, j,k;
>            X(int i_) : i(i_)
>            {
>                // B = the type of *this here
>                j = 1;
>                // C = the type of *this here
>            }
Here I add:      // C' = immediately after the constructor

             virtual void e() = 0;

>            void f()
>            {
>                j = -1;
>                // D = the type of *this here
>                j = +1;
>            }
>    };
>
>    int main()
>    {
>         X x(2);
>        // E = the type of x here
>        x.f();
>        x.k  = 3;
>        // F = the type of x here
>    }
>
>I understand the question of what type the object has during
>construction as asking for the value of B (or C).

Right. And it's not X. At C' the type is X except that I think this
class is broken because its invariants don't cover the class' whole
interface (specifically, k is part of the interface and not covered by
any invariant). But let's try to put that brokenness aside as much as
possible by ignoring k (the culprit) and talk about the rest of the
example, which is useful.

>In the static type description language which is a part of C++, the
>answer is simple.  A = B = C = D = E = F = X.

I'm not sure I follow. Of course A != X. And if you add the pure virtual
function X::e(), then B and C definitely != X, because the pure virtual
function can't be called from them. That's just one example of how
during construction an object has not yet attained its desired type.

So, according to your first possible criterion, it's clear to me that
the object is not of type X until after the constructor has completed
successfully. (For the same reason, the object is no longer of type X as
soon as the destructor begins, since for example pure virtual functions
again can't be called.) Notice that this, not coincidentally, perfectly
maps onto the standard's definition of object lifetime.

>Since it is hardly any challenge to come up with that, I can only assume
>that some other, more descriptive, dynamic type system was implied.
>One might ask for the dynamic type expressed using C++ type
>expressions, but I think that would be doing Herb a great injustice: I
>can think of no alternative to B = C = X in this system, as no other
>values of B and C capture the fact that e.g. this->X::i is a valid
>expression.

No, for this->e() is not valid at B and C, but is valid during the
object's lifetime.

So, according to your second possible criterion, it's clear to me that
the object is not of type X until after the constructor has completed
successfully. (etc., same as above)

>Time to bring out the big gun: the most accurate and detailed type
>system that I can imagine.
>
>Definition: the type of x = the (mathematical) set of all operations
>that can be performed on or with x within defined behaviour.

Sure, and here you're talking about the interface of X (which includes
both member and nonmember functions; see my C++ Report columns of March
1998 and March 1999).

To me this implies that an object whose constructor has not yet
completed (what I call a "non-object" because IMO it isn't one yet) does
not exist not only because it may not yet meet its invariants but
because by definition no operations can be performed on it. It's not
legal to call a member function until after the constructor completes!
Given:

  T t;

Until T::T() completes, you can go ahead and claim that something exists
(I agree that some memory does exist), and you might even try to claim
it's an object (I deny it's an "object" in the normal meaning of the
word which is an instance of a class or builtin type), but according to
C++ and according to your own definition above it's definitely not an
object -of type T-.

>To see just how extreme this type system is, consider that the types
>of the constants 0 and 1 are different: The type of 1 contains the
>operation "can be divided by".  The type of 0 does not.

Not true, because 0 "can be divided by" with well-defined behavior. 0
and 1 are both integers and both have well-defined behavior for all
integer operators. The same goes for 0.0 and 1.0 as objects of type
double, for example; that's what NaNs and infinities are for.
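
A sketch of the floating-point half of that statement only, assuming an implementation whose double follows IEC 559 (IEEE 754), which std::numeric_limits<double>::is_iec559 reports. The helper name divide is made up. For integer operands, clause 5.6 of the C++ standard leaves division by zero undefined, so the sketch deliberately does not try it.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// True when double follows IEC 559; the results described below are only
// expected on such implementations.
constexpr bool double_is_iec559 = std::numeric_limits<double>::is_iec559;

// Divides at run time, so the compiler has no constant expression to
// reject or warn about.
inline double divide(double numerator, double denominator) {
    return numerator / denominator;
}
```

On an IEC 559 implementation divide(1.0, 0.0) gives an infinity and divide(0.0, 0.0) gives a NaN, which can be observed with std::isinf and std::isnan.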

>With respect to the object not existing until it upholds the class invariants:
>You are essentially proposing a dynamic type system in which x can be
>said to be of type X only if it fulfills the invariants of X.

... and of course supports the X interface (incl. all semantics), which
is directly related to class invariants. Right, but I'm not proposing
it; what other meaning can there be for "an object is of type X"?

Compare the following related question: What does Liskov IS-A mean?
You'll end up down the same shining path, and it's no coincidence.

>But then what is the value of E?

The example class is badly written because it fails to initialize a
class member (nearly always bad) and exposes it as part of its interface
(definitely bad). True, you can document the blazes out of it, but
that's just admitting there's really another invariant or rule not
stated above. The fact that X presents an interface (k) not covered by
any invariant is plain old bad design. Arguments based on k therefore
don't carry much weight with me because they build on a faulty
foundation; it's like arguing about a faulty class that uses the
implicitly generated copy constructor but holds an owned subobject by
pointer, and trying to use it as a basis for reasoning about what "to
copy" should mean in C++.

>What is the dynamic type of x when, during the processing of X::f(),
>the invariants temporarily do not hold?  Does x temporarily have no type?

Of course not. A class invariant is a precondition and postcondition for
every member function.

A constructor takes raw memory and turns it into an object of type X
(i.e., an object that meets X's invariants). A member function operates
on a valid object of type X and leaves it as another object of type X.
The intermediate state is irrelevant because the member function is
executed atomically -- and if you are writing reentrant/multithreaded
code it's typically the object user's responsibility to serialize
multithreaded accesses to the object so that member functions are indeed
called atomically and serially. (In a few weird cases, including copy on
write (COW), the class must also do some serialization because the
outside world cannot have enough information to do the whole job, but
the default responsibility definitely lies with the owner of the object
to serialize all access to it.)
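
To make the precondition/postcondition reading concrete, here is a minimal sketch with a hypothetical class Counter that borrows the j > 0 invariant from the example. The invariant is asserted on entry to and exit from the member function, but not in between.

```cpp
#include <cassert>

// Hypothetical class with the invariant from the thread's example: j > 0.
class Counter {
public:
    // The constructor's job is to establish the invariant.
    explicit Counter(int start) : j_(start > 0 ? start : 1) {
        assert(invariant());
    }

    // The invariant holds on entry and on exit, but not in between.
    void bump() {
        assert(invariant());    // precondition
        const int old = j_;
        j_ = -1;                // temporarily broken, invisible to callers
        j_ = old + 1;           // restored before returning
        assert(invariant());    // postcondition
    }

    int value() const { return j_; }
    bool invariant() const { return j_ > 0; }

private:
    int j_;
};
```

A caller that sees the object only between member function calls never observes j_ == -1, which is the sense in which the intermediate state is irrelevant in single-threaded use.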

Incidentally, I don't think that anything I've said above (except the
rule about pure virtual functions) is specific to C++ alone among
object-oriented languages.

Herb

---
Herb Sutter (mailto:hsutter@peerdirect.com)

CTO, PeerDirect Inc. (http://www.peerdirect.com)
Chair, ANSI SQL Part 12: SQL/Replication
Editor-in-Chief, C++ Report (http://www.creport.com)







Author: "David Abrahams" <abrahams@mediaone.net>
Date: 2000/06/05
Raw View
"Herb Sutter" <hsutter@peerdirect.com> wrote in message
> Good point, and let me address it again: With thread-safe non-COW, you
> do one allocation (incl. lock) per copy operation only. With thread-safe
> COW, you do one lock (mutex or atomic int op) per possibly mutating
> operation on the string whether the code that uses it is multithreaded
> or not -- which is often spelt "ouch."

I think Herb must be overstating the case here. Processors as old as the
Motorola 68000 had an atomic decrement-and-test instruction. I don't claim
to be an expert on the gamut of modern CPU architectures, but I would be
astonished to find that anyone was doing multithreading these days on a
machine that doesn't support such a facility. Well, let me ask, since I
don't know: are there any such processors around being used for
multithreading? If so, I stand educated. If not, we should give all COW FUD
a rest.

-Dave







Author: "Anders J. Munch" <andersjm@post.tele.dk>
Date: 2000/06/05
Raw View
Dennis Yelle wrote in message <39393CB3.651FDFB7@jps.net>...
>"Anders J. Munch" wrote:
>> In the static type description language which is a part of C++, the
>> answer is simple.  A = B = C = D = E = F = X.
>
>I think I agree with everything you say, except when
>you say the type at A is X.  A is inside the initializer list of X.
>I am not willing to say that X exists at that time.

I was specifically talking about static types.  Static types is a
compile-time concept.  By definition (with static types) A = E = F,
because they ask for the type of the same entity.  My using the word
"here" in the definitions of the types A-F is solely for the purpose of
dynamic typing; with static types the word "here" is misleading.

The dynamic type, OTOH, refers to the object associated with a
variable or expression at a given point in run-time.  With dynamic
types your point applies, and A is ill-defined.
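
For what it is worth, the language does report something at a point like A. The sketch below uses hypothetical names (Base, Derived, seen_in_ctor, typeid_was_base) and a polymorphic base, unlike the BASE in the example, so that typeid and virtual dispatch have a dynamic type to look at. It only shows what the implementation answers during base-class construction and does not settle what "exists" should mean.

```cpp
#include <cassert>
#include <string>
#include <typeinfo>

struct Base {
    std::string seen_in_ctor;   // what name() resolved to inside Base()
    bool typeid_was_base;       // whether typeid(*this) named Base inside Base()

    Base() {
        // While Base() runs, virtual calls and typeid treat *this as a Base,
        // even when it is the base subobject of a Derived being built.
        seen_in_ctor = name();
        typeid_was_base = (typeid(*this) == typeid(Base));
    }
    virtual ~Base() {}
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    std::string name() const override { return "Derived"; }
};
```

Once construction has finished, the same object answers "Derived" to both questions.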

- Anders









Author: "Bill Wade" <bill.wade@stoner.com>
Date: 2000/06/01
Raw View
"Joerg Barfurth" <joerg.barfurth@attglobal.net> wrote in message
news:1eb9oed.1jjycw31os3iofN@slip139-92-12-86.hm.de.prserv.net...
>Bill Wade <bill.wade@stoner.com> wrote:

>> Boolean logic would say that one of the three statements is true
>>   1) 'tors are not member functions
>>   2) 'tors are const member functions
>>   3) 'tors are (non-const) member functions.

> Where does Boolean logic put static member functions ?

(3)

When the standard talks about a "const member function" I expect that it is
referring to
  9.3.1/3 "A member function declared const is a const member function."
I expect that any member function which is not declared "const" and not
declared "const volatile" is a non-const member function.

A static member function meets that definition (can't be declared const).  I
am pretty sure that some people feel that the definition of "non-const
member function" is more like:

<Any member function which is not a constructor and not a destructor and not
static and not declared const and not declared const volatile.>

The second definition corresponds to "what comes to mind" when you say
non-const member function, because as soon as you see the word const you
stop thinking about 'tors and static functions.
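
A sketch of how the three kinds of member function look to the type system, assuming a C++11 or later compiler for <type_traits>. The struct S and the constant names are hypothetical.

```cpp
#include <type_traits>

struct S {
    void mutate() {}          // member function not declared const
    void inspect() const {}   // const member function (9.3.1/3)
    static void helper() {}   // static: there is no *this to qualify
    // static void bad() const {}   // ill-formed: a static member function
    //                              // cannot be declared const
};

// Both non-static functions yield pointers to member functions...
constexpr bool mutate_is_member_fn =
    std::is_member_function_pointer<decltype(&S::mutate)>::value;
constexpr bool inspect_is_member_fn =
    std::is_member_function_pointer<decltype(&S::inspect)>::value;

// ...while the address of the static one is an ordinary function pointer.
constexpr bool helper_is_member_fn =
    std::is_member_function_pointer<decltype(&S::helper)>::value;
constexpr bool helper_is_plain_fn =
    std::is_same<decltype(&S::helper), void (*)()>::value;

static_assert(mutate_is_member_fn && inspect_is_member_fn, "non-static members");
static_assert(!helper_is_member_fn && helper_is_plain_fn, "static member");
```

Whether helper() is then best described as "non-const" or as outside the question altogether is exactly the wording issue under discussion; the code only shows that it is not declared const and cannot be.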









Author: "Bill Wade" <bill.wade@stoner.com>
Date: 2000/06/01
Raw View
"Herb Sutter" <hsutter@peerdirect.com> wrote in message
news:knf2jsgkbk6nbo37s9aq8s64i28rm4ioo0@4ax.com...

> And acquiring a mutex lock is usually far more expensive than a
> general-purpose memory allocation. Even atomic integer operations are
> relatively expensive compared to plain integer operations.

I don't really want to dispute your MT conclusions since I don't have a lot
of practical experience there, so take my MT comments with a large grain of
salt.

However, many MT allocators also acquire a mutex lock, and most good ones do
so in the case where a block is allocated in one thread and freed in
another.

> There are better (read: both faster and safer) ways of optimizing string
> memory allocation than using COW. See GotW #45:
>
>    http://www.peerdirect.com/resources/gotw045a.html

Your implementation requires an atomic operation even in the case where the
string is already unshareable (because unshareable is flagged with a magic
reference count).  Allowing unlocked access to the unshareable flag (say
that string capacity is always even, an odd number represents unshareable)
should substantially reduce the expected cost of the iterator/reference
operations.  I admit this doesn't help for the modify operations (like
append).
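
A minimal sketch of the even/odd encoding, with hypothetical names (BufHeader, cap_and_flag and the helper functions); it is not the GotW #45 code. Real capacities are rounded up to an even number, so the low bit of the stored value is free to mean "unshareable" and can be tested with a plain read.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical header of a COW string buffer. Real capacities are kept even,
// so the low bit of the stored value is free to mean "unshareable".
struct BufHeader {
    std::size_t cap_and_flag;
};

inline std::size_t round_up_even(std::size_t n) { return n + (n & 1u); }

inline BufHeader make_header(std::size_t requested) {
    return BufHeader{round_up_even(requested)};
}

// Plain (non-atomic) read of the flag.
inline bool is_unshareable(const BufHeader& h) {
    return (h.cap_and_flag & 1u) != 0;
}

// The capacity with the flag bit masked off.
inline std::size_t capacity(const BufHeader& h) {
    return h.cap_and_flag & ~static_cast<std::size_t>(1);
}

inline void mark_unshareable(BufHeader& h) { h.cap_and_flag |= 1u; }

inline void mark_shareable(BufHeader& h) {
    h.cap_and_flag &= ~static_cast<std::size_t>(1);
}
```

The sketch says nothing about when it is safe to write the flag without a lock; presumably that is only after the writer has established that it is the sole owner of the buffer.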

> Two of the results reported there are:
>
> 3. It is a myth that COW's principal advantage [always] lies in
>    avoiding memory allocations.  Especially for longer
>    strings, COW's principal advantage is that it avoids
>    copying the characters in the string.

I agree with this for most real-world examples.  Certainly in a
producer/consumer type relationship it is likely that the number of
allocations is not significantly reduced.  However there are notable
exceptions that avoid the allocation entirely:
  a) Return by value
  b) Operations similar to vector<string>::reserve
In other words, situations where swap() would apply (because the source is no
longer necessary) but there is no convenient way to tell that to the
compiler.
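
A sketch of case (b) done by hand, with a hypothetical helper regrow_by_swap. When the old elements are about to be discarded anyway, each string's buffer can be handed over with swap() instead of being copied, which is the saving COW provides automatically in this situation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Grows a vector<string> by hand, transferring each element with swap()
// instead of copying it: the old elements are about to be discarded anyway.
inline std::vector<std::string> regrow_by_swap(std::vector<std::string>& old,
                                               std::size_t new_capacity) {
    std::vector<std::string> fresh;
    fresh.reserve(new_capacity);
    for (std::size_t i = 0; i < old.size(); ++i) {
        fresh.push_back(std::string());   // new empty element
        fresh.back().swap(old[i]);        // exchange contents instead of copying
    }
    old.clear();
    return fresh;
}
```

The caller has to know that the source is dead, which is the "no convenient way to tell that to the compiler" part of the problem.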

> Here I should add: On compilers with reasonably efficient default memory
> allocators. This includes MSVC. Even so, for short strings allocation
> was still important enough to be optimizable:

Typical ST allocators have gotten much better in the last several years.  I
believe MSVC got much better going from version 4 to 5, but it may have been
going from version 3 to 4.  I believe the art of MT allocators has gotten
better, although I believe that common practice is way behind art in that
area.

> 4. Optimized allocation, not COW, was a consistent true
>    speed optimization in all cases (but note that it
>    does trade off space). Here is perhaps the most
>    important conclusion from the Detailed Measurements
>    section:
>
>   "* Most of COW's primary advantage for small strings could be
>       gained without COW by using a more efficient allocator.

True, but for ST all COW operations are at least "almost" as efficient as
the non-COW version (penalties in both space and time are O(1) with a small
constant).  On the other hand COW can turn many O(size()) operations into
O(1).

> See Rob Murray's "C++ Strategies and Tactics" book, pages 70-72, for
> more empirical tests that show COW is only beneficial in certain
> situations even in single-threaded code.

His implementation required two allocations per string, making it difficult
for COW to win with slow allocators unless more than one copy is made or the
strings are very long.

I suspect (from the date of his book) that his allocator was substantially
slower (proportionately) than today's allocators.  Faster allocators,
especially combined with COW using only one heap block, actually make his
test cases more COW favorable (since the memory copying savings are
proportionately more important in all but the smallest strings).

More importantly, almost any constraint (or freedom) you could write about
is only beneficial in certain situations (contiguous vector,
vector::reserve, packed vector<bool>, vector<bool> meeting sequence
requirements).

If you care about the performance (or safety) of non-const [] more than you
care about copies, COW is bad.  If you really care about the performance of
very long string copies, COW becomes better.  If many of your strings are
short enough (for some definition of many and short), avoid using the heap
at all for short strings.

The problem for the library implementer is to determine what his client
really cares about.  The standard doesn't say which behavior is optimized.
That decision is up to the implementer (presumably influenced by his
customer base).

I'm really not so much for COW as against changes to the standard.
Personally I'd like to see my compiler vendors spend more time doing a good
job with "export" and less time tracking changes to the standard.









Author: joerg.barfurth@attglobal.net (Joerg Barfurth)
Date: 2000/06/02
Raw View
Bill Wade <bill.wade@stoner.com> wrote:

> "Joerg Barfurth" <joerg.barfurth@attglobal.net> wrote in message
> news:1eb9oed.1jjycw31os3iofN@slip139-92-12-86.hm.de.prserv.net...
> >Bill Wade <bill.wade@stoner.com> wrote:
>
> >> Boolean logic would say that one of the three statements is true
> >>   1) 'tors are not member functions
> >>   2) 'tors are const member functions
> >>   3) 'tors are (non-const) member functions.
>
> > Where does Boolean logic put static member functions ?
>
> (3)
>
> When the standard talks about a "const member function" I expect that it is
> referring to
>   9.3.1/3 "A member function declared const is a const member function."

Note that 9.3.1 is titled "Nonstatic member functions".

> I expect that any member function which is not declared "const" and not
> declared "const volatile" is a non-const member function.

It is not a const member function, but does that make it non-const ?
Here you assume that Boolean logic applies this way, which I questioned.

> A static member function meets that definition (can't be declared const).  I
> am pretty sure that some people feel that the definition of "non-const
> member function" is more like:
>
> <Any member function which is not a constructor and not a destructor and not
> static and not declared const and not declared const volatile.>
>
> The second definition corresponds to "what comes to mind" when you say
> non-const member function, because as soon as you see the word const you
> stop thinking about 'tors and static functions.

The reason for this is that the concept of constness does not apply to
static member functions. Constness really is about the (dis)ability to
change an object's state. For static member functions there is no such
object. For 'tors the standard specifies explicitly that the object does
not yet sufficiently exist, and that therefore constness does not apply
either.

If the concept does not apply to a function, you cannot ascribe any
relation of it to constness except 'indifferent'. This would mean that
it is neither const nor non-const. But that means your application of
Boolean logic is in error.

IOW: static member functions (and 'tors) are not in the domain of the
predicates 'const' and 'non-const'. The statement "some static member
functions are non-const" is not true (nor false), but meaningless. This
seems even more convincing if you look at "some free functions are
non-const".

Then "Boolean logic would say that (exactly) one of the four statements
is true":
> >>   1) 'tors are not member functions
-new- 1.1) 'tors are member functions to which constness does not apply
> >>   2) 'tors are const member functions
> >>   3) 'tors are non-const member functions.
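
The claim that static member functions fall outside the const/non-const distinction can be checked against the type system. The following is only a sketch; the struct `S` and the constant `static_is_plain_function` are hypothetical names invented for illustration.

```cpp
#include <cassert>
#include <type_traits>

struct S {
    void mutate() {}            // non-static, not declared const
    void inspect() const {}     // non-static, declared const
    static void helper() {}     // static: no 'this', cannot be declared const
};

// For non-static members, constness is part of the pointer-to-member type.
static_assert(std::is_same<decltype(&S::mutate),  void (S::*)()>::value, "");
static_assert(std::is_same<decltype(&S::inspect), void (S::*)() const>::value, "");

// A static member function yields a plain function pointer. There is no
// object, so there is nothing for a cv-qualifier to attach to.
constexpr bool static_is_plain_function =
    std::is_same<decltype(&S::helper), void (*)()>::value;
static_assert(static_is_plain_function, "");
static_assert(!std::is_member_function_pointer<decltype(&S::helper)>::value, "");
```

On a `const S` object, `inspect()` and `S::helper()` can be called but `mutate()` cannot, so overload resolution treats the static function like neither of the other two.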

Jörg

-- 
Jörg Barfurth                         joerg.barfurth@attglobal.net
--------------- using std::disclaimer; ------------------
Software Engineer                     joerg.barfurth@germany.sun.com
Star Office GmbH                      http://www.sun.com/staroffice







Author: joerg.barfurth@attglobal.net (Joerg Barfurth)
Date: 2000/05/31
Raw View
Bill Wade <bill.wade@stoner.com> wrote:

> Boolean logic would say that one of the three statements is true
>   1) 'tors are not member functions
>   2) 'tors are const member functions
>   3) 'tors are (non-const) member functions.

Where does Boolean logic put static member functions ?

-- 
Jörg Barfurth                         joerg.barfurth@attglobal.net
--------------- using std::disclaimer; ------------------
Software Engineer                     joerg.barfurth@germany.sun.com
Star Office GmbH                      http://www.sun.com/staroffice







Author: Francis Glassborow <francis@robinton.demon.co.uk>
Date: 2000/05/31
Raw View
In article <8gmqkj$b5r$2@news.inet.tele.dk>, Anders J. Munch
<andersjm@post.tele.dk> writes
>One problem is inconsistency.  The keyword 'const' would mean
>something entirely different for a constructor than for a member
>function.  A _truly_ const constructor would be pretty much useless:
>the type of 'this' would be T const*const, and initializing instance
>variables would be impossible!

Why? If we decide to provide a const ctor, we get to write the rules from
scratch. Note that, historically, the type of this was not const
qualified at all.


Francis Glassborow      Association of C & C++ Users
64 Southfield Rd
Oxford OX4 1PA          +44(0)1865 246490
All opinions are mine and do not represent those of any organisation







Author: James Kuyper <kuyper@wizard.net>
Date: 2000/05/31
Raw View
kanze@gabi-soft.de wrote:
>
> Kevlin Henney <kevlin@curbralan.com> writes:
....
> |>  >Calling a non-const function (like begin, here), invalidates all
> |>  >previous iterators.
>
> |>  Except that it does not. That is exactly the reason for my posting. 21.3
> |>  para 5 in part states that
>
> |>  "References, pointers, and iterators referring to the elements of a
> |>  basic_string sequence may be invalidated by the following uses of that
> |>  basic_string object:
> |>  ....
> |>  -- Calling non-const member functions, except operator[](), at(),
> |>  begin(), rbegin(), end(), and rend()...."
>
> And in part states "Subsequent to any of the above usages [...] the
> first call to non-const member functions operator[](), at(), begin(),
> rbegin(), end(), or rend()."  I'm still trying to figure out what is
> supposed to be meant by the first part of the phrase.  Normally, I would
> have thought that "subsequent to" could better be said by "after" --
> that would certainly correspond to the latin meaning.  But in this
> context, the phrase would be meaningless, since any iterators are
> already invalid "subsequent to" the above uses.

Only the iterators that were in existence prior to the use. Iterators
created by and after the use are valid. However, the first subsequent
call as described in the fifth bullet will invalidate them all.







Author: Kevlin Henney <kevlin@curbralan.com>
Date: 2000/05/31
Raw View
In article <86puq7roch.fsf@gabi-soft.de>, kanze@gabi-soft.de writes
[...]
>(And the French suggestion actually went a lot further -- what we wanted
>was to allow operator[] to return a helper class instead of an actual
>reference.  Which in turn can only be made to work if the charT in
>basic_string is limited to a scalar type.)

Yes, this would have made basic_string watertight once and for all. It's
a shame that was not the way things went.

>|>  Alas, when taken together with the recent fix to comparing
>|>  iterators it does not.
>
>Could you please elaborate.

It was considered a defect that const_iterators and iterators could not
be compared. This has now been fixed, but now means that it is easier to
write code that invalidates iterators without intending it as it is the
difference in qualified access (eg to begin) that causes invalidation.

>|>  So it appears that even with fixes, reference counting is only possible
>|>  with non-pointer iterators. You can have your cake, but you can't eat it
>|>  :->
>
>So where's the problem?  In practice, I would expect a quality
>implementation to use a class type for all iterators by default anyway.

I personally do not have a problem with this, but I expect that a number
of implementors might.

>Just because the standard says there is undefined behavior, the
>implementation is not required to go out of its way to make it difficult
>for the user.

Indeed.
____________________________________________________________

  Kevlin Henney                   phone:  +44 117 942 2990
  Curbralan Ltd                   mobile: +44 7801 073 508
  kevlin@curbralan.com            fax:    +44 870 052 2289
____________________________________________________________







Author: Herb Sutter <hsutter@peerdirect.com>
Date: 2000/05/31
Raw View
kanze@gabi-soft.de writes:
>"Andrei Alexandrescu" <andrewalex@hotmail.com> writes:
>
>|>  That was my point: the cow does not pull its weight. (What a fat cow.)
>|>  We (GotW) discovered that mt cow sucks. Now we discover that, even if
>|>  the standard made a conscious effort for allowing the cow, it failed
>|>  in doing so.
>
>|>  Moreover, all strings in all programs are six characters on the
>|>  average :o).
>
>And operator new/malloc is a very expensive function in most
>implementations.  Even for six characters:-).

And acquiring a mutex lock is usually far more expensive than a
general-purpose memory allocation. Even atomic integer operations are
relatively expensive compared to plain integer operations.

There are better (read: both faster and safer) ways of optimizing string
memory allocation than using COW. See GotW #45:

   http://www.peerdirect.com/resources/gotw045a.html

Two of the results reported there are:

3. It is a myth that COW's principal advantage [always] lies in
   avoiding memory allocations.  Especially for longer
   strings, COW's principal advantage is that it avoids
   copying the characters in the string.

Here I should add: On compilers with reasonably efficient default memory
allocators. This includes MSVC. Even so, for short strings allocation
was still important enough to be optimizable:

4. Optimized allocation, not COW, was a consistent true
   speed optimization in all cases (but note that it
   does trade off space). Here is perhaps the most
   important conclusion from the Detailed Measurements
   section:

  "* Most of COW's primary advantage for small strings could be
      gained without COW by using a more efficient allocator.
      (Of course, you could also do both -- use COW and an
      efficient allocator.)"

See Rob Murray's "C++ Strategies and Tactics" book, pages 70-72, for
more empirical tests that show COW is only beneficial in certain
situations even in single-threaded code.

Herb

---
Herb Sutter (mailto:hsutter@peerdirect.com)

CTO, PeerDirect Inc. (http://www.peerdirect.com)
Chair, ANSI SQL Part 12: SQL/Replication
Editor-in-Chief, C++ Report (http://www.creport.com)







Author: kanze@gabi-soft.de
Date: 2000/05/31
Raw View
Kevlin Henney <kevlin@curbralan.com> writes:

|>  (1) Dropping the fifth sub-bullet entirely would eliminate all of these
|>  problems (I believe), but would also practically eliminate possibilities
|>  for reference counting. You could share representation so long as you
|>  never accessed it, ie using find_*, append, replace, etc functions but
|>  not any of the indexing or iterator access functions.

You could share representation as long as you never called a non-const
function on the string.  The problem is, if the string itself is
non-const, you call non-const functions for operator[] or begin even if
you don't intend to modify the string.  And of course, the internal
implementation of string can't know that you won't modify it, and must
switch from reference counting to unique copies, to be safe.  It is this
switch which invalidates references and iterators.
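
The switch from a shared buffer to a unique copy can be sketched as follows. This is a minimal, single-threaded illustration and not the design of any real library; `CowString` and `shares_buffer_with` are hypothetical names, and `std::shared_ptr` stands in for a hand-written reference count.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical copy-on-write string, for illustration only.
class CowString {
public:
    explicit CowString(const char* text)
        : rep_(std::make_shared<std::string>(text)) {}

    // Const access can never modify, so the buffer stays shared.
    const char& operator[](std::size_t i) const { return (*rep_)[i]; }

    // Non-const access hands out a writable reference. The string cannot
    // know whether the caller will write, so it must unshare first.
    char& operator[](std::size_t i) {
        if (rep_.use_count() > 1)
            rep_ = std::make_shared<std::string>(*rep_);  // new buffer
        return (*rep_)[i];
    }

    bool shares_buffer_with(const CowString& other) const {
        return rep_ == other.rep_;
    }

private:
    std::shared_ptr<std::string> rep_;  // copying a CowString shares this
};
```

Reading `a[0]` through a non-const `a` that shares its buffer with a copy is enough to move `a` to a new buffer, so a pointer obtained earlier through a const alias of `a` no longer refers to `a`'s characters. The sketch leaves out the bookkeeping a real implementation would presumably need to stop sharing a buffer once a reference into it has escaped.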

The "correct" solution (IMHO, of course) is that string should not have
non-const functions, except for variants of assignment.  But I really
don't think that we could do this now.

The alternative is for operator[]() et al. in reference counted strings
to return helper classes instead of actual references.  At present, this
is banned by the standard; to make it legal, it would be necessary to
enumerate the legal operations on the results of operator[], instead of
just stating that it returned a reference.  Doing so would break at
least two currently legal operations:

    string      s ;

    char&       cr = s[ i ] ;

and:

    struct C { int a ; int b ; } ;
    basic_string< C > s ;

    s[ i ].a ;

I feel rather secure about the second -- I really don't think the idiom
is widely used (perhaps wrongly).  Personally, I would not like to
maintain code using the first, but I somehow suspect that such code does
exist, and that it may be too late to ban it now.

-- 
James Kanze                               mailto:kanze@gabi-soft.de
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627







Author: "David Abrahams" <abrahams@mediaone.net>
Date: 2000/05/31
Raw View
"Ross Smith" <ross.s@ihug.co.nz> wrote in message
news:8gsimb$93f$1@news.ihug.co.nz...
> > In order to define a string class using cow for which this works, it is
> > necessary for operator[]() to return a helper class.  The standard
> > forbids this, however, and it is *not* implementable as long as the
> > charT type of basic_string can be a struct.
>
> Why not?

Because you'd have to have an operator.() to access the members of the
struct transparently, and there isn't any such operator in C++.
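
A helper class of the kind discussed might look like the sketch below. It is an illustration under assumptions, not a proposal from the thread: `ProxyString`, `CharProxy`, `writes()` and `str()` are hypothetical names, and a plain `std::string` stands in for the shared buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical string whose operator[] returns a helper object, not char&.
class ProxyString {
public:
    class CharProxy {
    public:
        CharProxy(ProxyString& owner, std::size_t index)
            : owner_(owner), index_(index) {}
        CharProxy(const CharProxy&) = default;

        // Read: a user-defined conversion yields the character value.
        operator char() const { return owner_.text_[index_]; }

        // Write: the only point where a COW string would have to unshare.
        CharProxy& operator=(char c) {
            ++owner_.writes_;
            owner_.text_[index_] = c;
            return *this;
        }
        // Needed so that s[i] = s[j] copies the character, not the proxy.
        CharProxy& operator=(const CharProxy& other) {
            return *this = static_cast<char>(other);
        }

    private:
        ProxyString& owner_;
        std::size_t index_;
    };

    explicit ProxyString(const char* text) : text_(text), writes_(0) {}
    CharProxy operator[](std::size_t i) { return CharProxy(*this, i); }
    int writes() const { return writes_; }
    const std::string& str() const { return text_; }

private:
    std::string text_;
    int writes_;  // counts writes that went through a proxy
};
```

With this sketch `char c = s[1];` and `s[0] = 'j';` both work, and only the second counts as a write. `char& cr = s[i];` does not compile, because the conversion yields a value rather than a reference. For a struct element type, `s[i].a` has nothing to bind to, since the member-access dot cannot be overloaded.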

-Dave







Author: Kevlin Henney <kevlin@curbralan.com>
Date: 2000/05/31
Raw View
In article <864s7jrlin.fsf@gabi-soft.de>, kanze@gabi-soft.de writes
>Kevlin Henney <kevlin@curbralan.com> writes:
>
>|>  (1) Dropping the fifth sub-bullet entirely would eliminate all of these
>|>  problems (I believe), but would also practically eliminate possibilities
>|>  for reference counting. You could share representation so long as you
>|>  never accessed it, ie using find_*, append, replace, etc functions but
>|>  not any of the indexing or iterator access functions.
>
>You could share representation as long as you never called a non-const
>function on the string.  The problem is, if the string itself is
>non-const, you call non-const functions for operator[] or begin even if
>you don't intend to modify the string.  And of course, the internal
>implementation of string can't know that you won't modify it, and must
>switch from reference counting to unique copies, to be safe.  It is this
>switch which invalidates references and iterators.

Exactly. A constructed string has little control over its fate, and so
reference counting is of benefit in a few situations, but (1) a string
cannot be aware in advance of this (unless, of course, ctors for const
objects can be treated differently...), and (2) its mechanisms must be
pessimistic.

>The "correct" solution (IMHO, of course) is that string should not have
>non-const functions, except for variants of assignment.  But I really
>don't think that we could do this now.

Yes it's too late. However, I think that the "correct" solution is
actually not to have one string class. In the same way that we have more
than one sequence type in the standard library plus a set of requirements
for more, strings should not be considered single class concepts. A set
of STL reqs would have made for a happier and extensible family of
string types.

Even if one comes down to mandating concrete types, I think one needs at
least two: modifiable strings and non-modifiable strings. This is the
model I have in some code, with the latter ref counted and the former
not.

>The alternative is for operator[]() et al. in reference counted strings
>to return helper classes instead of actual references.  At present, this
>is banned by the standard; to make it legal, it would be necessary to
>enumerate the legal operations on the results of operator[], instead of
>just stating that it returned a reference.  Doing so would break at
>least two currently legal operations:
>
>    string      s ;
>
>    char&       cr = s[ i ] ;

This can be addressed with a UDC.

>    struct C { int a ; int b ; } ;
>    basic_string< C > s ;
>
>    s[ i ].a ;
>
>I feel rather secure about the second

Ditto.
____________________________________________________________

  Kevlin Henney                   phone:  +44 117 942 2990
  Curbralan Ltd                   mobile: +44 7801 073 508
  kevlin@curbralan.com            fax:    +44 870 052 2289
____________________________________________________________







Author: "Anders J. Munch" <andersjm@post.tele.dk>
Date: 2000/05/28
Raw View
Kevlin Henney wrote in message ...
>>Const constructors and const destructors? I'd love them. They'd mean
>>"This guy is meant to be, respectively was, a constant object". Holds
>>water to me. Ain't it?
>
>Yup, I think this would make a real difference, which is why I made a
>proposal last time round to get them in. However, I believe it was too
>much of a core change too late in the day. Maybe next time :->


One problem is inconsistency.  The keyword 'const' would mean
something entirely different for a constructor than for a member
function.  A _truly_ const constructor would be pretty much useless:
the type of 'this' would be T const*const, and initializing instance
variables would be impossible!

--
Anders Munch, andersjm@post.tele.dk
Still confused but at a higher level.









Author: "Anders J. Munch" <andersjm@post.tele.dk>
Date: 2000/05/28
Raw View
Herb Sutter wrote in message ...
>No, it's stronger: While the constructor is running, the object does not
>yet exist. Object lifetime begins when the constructor returns
>successfully. It's not that constructors aren't const, so much that
>constness isn't an applicable concept until the object in question
>exists (is constructed).


Nonsense.  The object does exist.  It just hasn't entered the stage of
its existence which the standard calls the 'lifetime'.

    3.8/1: "The lifetime of an object is a runtime property of the object."

(Herb, I've said this before.  How many times are you going to make
me repeat myself?)

>Ask me whether an object is const, and at first I will say "what
>object?" -- until the constructor ends and the object pops into the
>universe, after which I'll say "oh, that object" and be able to give you
>an answer about whether it's const or not.

What would be the point of exempting constructors from the
const/non-const classification?  The type of 'this' is T*const in a
constructor, just like in a non-const method; non-const methods can be
called from a constructor, just like from a non-const method, etc.  Why
not simply call the constructor non-const and be done with it?

In the absence of any other definition of "non-const" in the standard,
doesn't it simply mean "not const"?   Then a constructor is non-const
simply by virtue of not being designated const.
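
The point about the type of 'this' can be checked with a small sketch. The class `Counter` is hypothetical. Note that `decltype(this)` reports `Counter*` rather than `Counter* const`, because `this` is a prvalue in current C++; the distinction that matters here is that the pointee is non-const.

```cpp
#include <cassert>
#include <type_traits>

class Counter {
public:
    Counter() : value_(0) {
        // Inside a constructor 'this' points to a non-const object, even
        // when the object being constructed is declared const.
        static_assert(std::is_same<decltype(this), Counter*>::value,
                      "constructor sees a non-const object");
        bump();  // calling a non-const member function is allowed here
    }

    void bump() { ++value_; }

    int value() const {
        static_assert(std::is_same<decltype(this), const Counter*>::value,
                      "const member function sees a const object");
        return value_;
    }

private:
    int value_;
};
```

Declaring `const Counter c;` still runs the constructor body above, so `c.value()` is 1 even though `bump()` could not be called on `c` afterwards.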

- Anders

--
Anders Munch, andersjm@post.tele.dk
Still confused but at a higher level.








Author: Francis Glassborow <francis@robinton.demon.co.uk>
Date: 2000/05/28
Raw View
In article <sirq2hkoo1137@news.supernews.com>, Andrei Alexandrescu
<andrewalex@hotmail.com> writes
>Const constructors and const destructors? I'd love them. They'd mean
>"This guy is meant to be, respectively was, a constant object". Holds
>water to me. Ain't it?
Yes, add them to the list of things to be considered for the next C++
release.

Francis Glassborow      Association of C & C++ Users
64 Southfield Rd
Oxford OX4 1PA          +44(0)1865 246490
All opinions are mine and do not represent those of any organisation







Author: "Andrei Alexandrescu" <andrewalex@hotmail.com>
Date: 2000/05/28
Raw View
Anders J. Munch <andersjm@post.tele.dk> wrote in message
news:8gmqkj$b5r$2@news.inet.tele.dk...
> One problem is inconsistency.  The keyword 'const' would mean
> something entirely different for a constructor than for a member
> function.

Yes, indeed it can cause confusion. But in constructors many things
are different, so I can live with that for the sake of the information
I would get.

>  A _truly_ const constructor would be pretty much useless:
> the type of 'this' would be T const*const, and initializing instance
> variables would be impossible!

No doubt about that :o).


Andrei








Author: kanze@gabi-soft.de
Date: 2000/05/28
Raw View
"Andrei Alexandrescu" <andrewalex@hotmail.com> writes:

|>  Cow and multithreading do not work together - see GotW.

Nathan Myers has posted that for any given machine, there is generally
an efficient implementation.  Of course, it's not portable.  (It
generally involves a little bit of assembler.)  But then, the
implementation of a standard library doesn't have to be portable (and
threads aren't portable anyway).

And a well designed cow will give significant performance gains in most
typical applications.

|>  Frankly, I would like to see the mad cow banned. (A good side effect
|>  will be that vendors will finally implement efficient multithreaded
|>  std::strings.) We can do better than cow for strings - see below.

In sum, you want all users to pay for thread safety, even if they aren't
using threads:-).

|>  What I would like to get instead, is introducing a "temporary string"
|>  class, and maybe to allow expression templates with std::string.
    [Interesting suggestion deleted...]

The interface to std::string is awkward to make efficient *and* thread
safe, although it can be done.  IMHO, strings should be value objects,
i.e.: they should not be modifiable, except by assignment.  (The only
non-const function originally in my pre-standard string class was
operator=.  It's also interesting to see that Java took this approach.
And that Java's strings are thread safe, without any synchronized
methods.  Of course, having a garbage collection thread which can lock
all other threads out helps here:-).)  Operators like concatenation or
replace return new string objects, rather than modifying the existing
string.

Of course, in such a system, building up strings character by character
is expensive, since each character results in a new string object.  For
such special cases, I used a special StringBuilder class -- this class
would normally *not* use a shared representation.  But of course, it
would normally not be assigned to other StringBuilder, either, so a
shared representation wouldn't gain anything.
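
The split between an immutable value string and a separate builder can be sketched as below. This is not the pre-standard class described above; `ValueString` and `ValueStringBuilder` are hypothetical names, and `std::shared_ptr<const std::string>` is an assumed stand-in for the shared representation.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical immutable value string: only assignment changes what a
// variable holds, so the buffer can be shared without copy-on-write.
class ValueString {
public:
    ValueString() : rep_(std::make_shared<const std::string>()) {}
    explicit ValueString(std::string text)
        : rep_(std::make_shared<const std::string>(std::move(text))) {}

    const std::string& str() const { return *rep_; }

    // Concatenation returns a new object instead of modifying *this.
    ValueString operator+(const ValueString& rhs) const {
        return ValueString(*rep_ + *rhs.rep_);
    }

    bool shares_buffer_with(const ValueString& other) const {
        return rep_ == other.rep_;
    }

private:
    std::shared_ptr<const std::string> rep_;  // never written after creation
};

// Hypothetical builder: owns an unshared buffer while text accumulates.
class ValueStringBuilder {
public:
    ValueStringBuilder& append(char c) { buffer_ += c; return *this; }
    ValueString build() const { return ValueString(buffer_); }

private:
    std::string buffer_;
};
```

Copies of a `ValueString` share one buffer and never need to unshare, because no member function can write to it. Character-by-character construction goes through the builder, which pays for one final copy in `build()` instead of one new string per character.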

In practice, of course, this is *not* the standard string class -- what
we have is a class that tries to do both.  But a good implementation of
the standard string class should use some simple heuristics, according
to the functions called, to determine which use it is being put to, and
in the case of building, automatically forego cow.

-- 
James Kanze                               mailto:kanze@gabi-soft.de
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627







Author: kanze@gabi-soft.de
Date: 2000/05/28
Raw View
Kevlin Henney <kevlin@curbralan.com> writes:

|>  In article <sigil66ao1180@news.supernews.com>, Andrei Alexandrescu
|>  <andrewalex@hotmail.com> writes
|>  [...]
|>  >I think indeed Kevlin discovered a problem in string,

|>  Well, we had the BSI C++ panel meeting today, and that confirmed
|>  that it was indeed broken, and that the obvious fixes do not work
|>  correctly: Allowing the first call to accessor functions (eg op[]
|>  and begin) after construction or (potential) modification to
|>  invalidate existing refs, ptrs and iters (which is _not_ what it
|>  says at the moment) would initially appear to solve the
|>  problem.

As far as I can recall, allowing the first call to operator[] et al. to
invalidate iterators is actually what was discussed and voted on in the
committee.  The entire question was covered in detail in the French
comments to the CD2, and it was my understanding from the response to
the comments that this was what was adopted.  Now I see that it wasn't.

(For those wondering why operator[] has to be a special case: if it
weren't, and *all* calls to operator[] invalidated references, then
s[i]==s[j] would be undefined.)

(And the French suggestion actually went a lot further -- what we wanted
was to allow operator[] to return a helper class instead of an actual
reference.  Which in turn can only be made to work if the charT in
basic_string is limited to a scalar type.)

|>  Alas, when taken together with the recent fix to comparing
|>  iterators it does not.

Could you please elaborate.

|>  So it appears that even with fixes, reference counting is only possible
|>  with non-pointer iterators. You can have your cake, but you can't eat it
|>  :->

So where's the problem?  In practice, I would expect a quality
implementation to use a class type for all iterators by default anyway.
Just because the standard says there is undefined behavior, the
implementation is not required to go out of its way to make it difficult
for the user.

-- 
James Kanze                               mailto:kanze@gabi-soft.de
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627







Author: kanze@gabi-soft.de
Date: 2000/05/28
Raw View
"Andrei Alexandrescu" <andrewalex@hotmail.com> writes:

|>  That was my point: the cow does not pull its weight. (What a fat cow.)
|>  We (GotW) discovered that mt cow sucks. Now we discover that, even if
|>  the standard made a conscious effort for allowing the cow, it failed
|>  in doing so.

|>  Moreover, all strings in all programs are six characters on the
|>  average :o).

And operator new/malloc is a very expensive function in most
implementations.  Even for six characters:-).

-- 
James Kanze                               mailto:kanze@gabi-soft.de
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627







Author: kanze@gabi-soft.de
Date: 2000/05/28
Raw View
Kevlin Henney <kevlin@curbralan.com> writes:

|>  In article <86ya577gxi.fsf@gabi-soft.de>, kanze@gabi-soft.de writes
|>  >Kevlin Henney <kevlin@curbralan.com> writes:
|>  >|>      // first example: "*******************" should be printed twice
|>  >|>      string original = "some arbitrary text", copy = original;
|>  >|>      const string &alias = original;
|>  >
|>  >|>      string::const_iterator i = alias.begin(), e = alias.end();
|>  >|>      for(string::iterator j = original.begin(); j != original.end(); ++j)

|>  >Calling a non-const function (like begin, here), invalidates all
|>  >previous iterators.

|>  Except that it does not. That is exactly the reason for my posting. 21.3
|>  para 5 in part states that

|>  "References, pointers, and iterators referring to the elements of a
|>  basic_string sequence may be invalidated by the following uses of that
|>  basic_string object:
|>  ....
|>  -- Calling non-const member functions, except operator[](), at(),
|>  begin(), rbegin(), end(), and rend()...."

And in part states "Subsequent to any of the above usages [...] the
first call to non-const member functions operator[](), at(), begin(),
rbegin(), end(), or rend()."  I'm still trying to figure out what is
supposed to be meant by the first part of the phrase.  Normally, I would
have thought that "subsequent to" could better be said by "after" --
that would certainly correspond to the latin meaning.  But in this
context, the phrase would be meaningless, since any iterators are
already invalid "subsequent to" the above uses.  In fact, from the
discussions at the time, the meaning must be "before", or something
similar.  If none of the above uses has taken place, the listed
functions invalidate the iterator.  Or they invalidate any iterators
obtained since the above functions.  Or something like that:-).

So there is probably a defect, but it is one of wording, and not what
you can do with a string.

-- 
James Kanze                               mailto:kanze@gabi-soft.de
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627







Author: kanze@gabi-soft.de
Date: 2000/05/28
Raw View
Alan Griffiths <alan@octopull.demon.co.uk> writes:

|>  OK, we considered the possibility of permitting the first call to non-
|>  const begin (etc) to reallocate (i.e. invalidate existing iterators).
|>  But one problem case still exists:

|>      string s;
|>      const string& cs = s;
|>      s = "...";
|>      cs.begin() == s.begin();

|>  Regardless of the change under discussion this will introduce undefined
|>  behaviour (because as currently worded the last bullet now applies).
|>  This is sufficiently surprising that we did not feel the language
|>  standard should require it.

I don't think that this can be handled as a defect.  The case was well
known and considered by the committee.  (It was mentioned in the French
comments on CD2, for example, with alternatives that solved it.)

The committee had, in fact, three alternatives before it at the time:
  - ban copy on write completely,
  - allow things like the above to invoke undefined behavior, or
  - allow the use of a helper class with operator[].
(The implementation arguments with iterators are a bit of a red herring,
since the iterator can very well be a class, and not just a dumb
pointer.  The return value of operator[]() cannot be.)  The committee
explicitly chose the middle option.  There may be, and apparently are,
small problems with the actual wording.  These can and should be fixed
with a defect report.  A defect report should not be used to change an
explicit decision of the committee, however, even if you and I don't
agree with that decision.

--
James Kanze                               mailto:kanze@gabi-soft.de
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627







Author: kanze@gabi-soft.de
Date: 2000/05/29
Raw View
Christopher Eltschka <celtschk@physik.tu-muenchen.de> writes:

|>  The GNU String class had an even more interesting feature:
|>  You could assign to substrings. In standard string syntax,
|>  this would work like the following:

|>  s = "Hello!";
|>  s.substr(1, 4) = "i there";
|>     // now the characters 1 to 4 are replaced with "i there",
|>     // resulting in the string "Hi there!"

|>  Note that the length of original and replacement substring is
|>  not the same.

Well, if you are going to support a non-const operator[], you might as
well.  At least it is useful, e.g.: in a toupper which converts 'ß' to
"SS" correctly.  (In practice, I've never found the least use for a
non-const operator[], and it is generally more efficient to generate a
completely new string than to assign to a substr. Although by popular
demand, my own string class ended up supporting both.)
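
For comparison, std::basic_string has no assignable substr(), but its
replace() member covers the same case, including replacements whose
length differs from the range replaced.  A minimal sketch, assuming
plain std::string; replace_range is a hypothetical helper name, not a
library function:

```cpp
#include <cassert>
#include <string>

// Replace `count` characters starting at `pos` with `text`.  The
// replacement may be longer or shorter than the range it replaces.
// replace_range is a hypothetical name used only for illustration.
std::string replace_range(std::string s,
                          std::string::size_type pos,
                          std::string::size_type count,
                          const std::string& text)
{
    s.replace(pos, count, text);  // std::basic_string::replace(pos, len, str)
    return s;
}
```

Calling replace_range("Hello!", 1, 4, "i there") replaces "ello" and
gives the "Hi there!" of the quoted example.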

--
James Kanze                               mailto:kanze@gabi-soft.de
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627







Author: Francis Glassborow <francis@robinton.demon.co.uk>
Date: 2000/05/29
Raw View
In article <8gmqki$b5r$1@news.inet.tele.dk>, Anders J. Munch
<andersjm@post.tele.dk> writes
>Nonsense.  The object does exist.  It just hasn't entered the stage of
>its existence which the standard calls the 'lifetime'.
>
> §3.8/1: "The lifetime of an object is a runtime property of the object."
>
>(Herb, I've said this before.  How many times are you going to make
>me repeat myself?)

Perhaps until you make a request for interpretation as a result of which
the C++ Committees uphold your view. We certainly intended (and, yes I
was actually a participant in that decision so carry part of the blame
if the words do not express our intent) that an object had no existence
outside its lifetime. We intended that there was no complete object
prior to the start of its lifetime nor after the end of it.


Francis Glassborow      Association of C & C++ Users
64 Southfield Rd
Oxford OX4 1PA          +44(0)1865 246490
All opinions are mine and do not represent those of any organisation







Author: kanze@gabi-soft.de
Date: 2000/05/29
Raw View
"Marco Dalla Gasperina" <marcodg@home.com> writes:

|>  I can't figure out (given all the words that have been flying)
|>  if the following is well defined:

|>      string s;
|>      const string &r = s;

|>      s[1] = r[2];

It's not.

In order to define a string class using cow for which this works, it is
necessary for operator[]() to return a helper class.  The standard
forbids this, however, and it is *not* implementable as long as the
charT type of basic_string can be a struct.

Given the number of people who are going to write such things, compared
to the number of people who are instantiating basic_string on a struct,
it is obvious which solution I would favor.  (OK: I'm just guessing as
to the number of people in each case.  Does anyone know of any case
where someone is actually using a basic_string instantiated on anything
but an integral type?)
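
A minimal sketch of the helper-class idea, assuming a toy string over
char only.  CowString, CharProxy, shares_with and str are hypothetical
names, and std::shared_ptr stands in for a hand-rolled reference count.
It is not a conforming basic_string; it only shows why a proxy lets the
string tell reads from writes:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Toy copy-on-write string (hypothetical; char only).
class CowString {
public:
    // Helper returned by the non-const operator[].
    class CharProxy {
    public:
        CharProxy(CowString& owner, std::size_t index)
            : owner_(owner), index_(index) {}
        CharProxy(const CharProxy&) = default;

        // Reading through the proxy never unshares the buffer.
        operator char() const { return (*owner_.data_)[index_]; }

        // Writing unshares first, then stores into the private copy.
        CharProxy& operator=(char c) {
            owner_.unshare();
            (*owner_.data_)[index_] = c;
            return *this;
        }
        CharProxy& operator=(const CharProxy& other) {
            return *this = static_cast<char>(other);
        }
    private:
        CowString& owner_;
        std::size_t index_;
    };

    explicit CowString(const std::string& s)
        : data_(std::make_shared<std::string>(s)) {}

    // Const access returns a value, so no reference can be invalidated.
    char operator[](std::size_t i) const { return (*data_)[i]; }
    CharProxy operator[](std::size_t i) { return CharProxy(*this, i); }

    bool shares_with(const CowString& other) const {
        return data_ == other.data_;
    }
    std::string str() const { return *data_; }

private:
    // Take a private copy of the buffer if anyone else still uses it.
    void unshare() {
        if (data_.use_count() > 1)
            data_ = std::make_shared<std::string>(*data_);
    }
    std::shared_ptr<std::string> data_;
};
```

With this toy class, s[1] = r[2] reads a value through the const
overload and writes through the proxy, so the order of evaluation no
longer matters.  The proxy cannot forward member access such as
s[i].member, which is the obstacle when charT is a struct.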

--
James Kanze                               mailto:kanze@gabi-soft.de
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627







Author: "David Abrahams" <abrahams@mediaone.net>
Date: 2000/05/29
Raw View
"Bill Wade" <bill.wade@stoner.com> wrote in message
news:8gmhs9$pus@library2.airnews.net...

> string is not a really good tool for making lots of modifications to a
> sequence of characters.  That remains true even when strings don't support
> COW, but COW does make it worse.  When you care about performance for
> those operations, you should almost always prefer either vector (simple,
> contiguous and has reserve()) or rope (rope == COW on steroids).  In rare
> cases even deque or list might make sense.

I think there can be no question that a reference-counting string is a
useful and good idea for many applications. It is not uncommon to need to
copy strings which just *might* need to change later, but probably won't.
The worst problems here arise with all the ways in which a string is
mutable. A reference-counted string, IMO, should be immutable except for
assignment (and maybe operator+=()). Given the contention above that a
basic_string isn't really a good vehicle for modifications anyway, maybe the
standard should have included such an immutable string in the first place?
You could always construct it from a range of vector<char_t>::iterators.
Hmm, an expression template facility could also be used to eliminate
temporaries in complex string slice/recombine expressions...
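
A rough sketch of what such an immutable string might look like,
assuming std::shared_ptr for the reference count.  ImmutableString and
its members are hypothetical names, and operator+=() and the expression
templates are left out:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical immutable string: copies share one buffer, and the only
// mutating operation is assignment, which rebinds to another buffer.
class ImmutableString {
public:
    ImmutableString()
        : data_(std::make_shared<const std::string>()) {}
    ImmutableString(const char* s)
        : data_(std::make_shared<const std::string>(s)) {}

    // Construct from any iterator range, e.g. vector<char>::iterator.
    template <class InputIt>
    ImmutableString(InputIt first, InputIt last)
        : data_(std::make_shared<const std::string>(first, last)) {}

    // Element access returns a value; there is no non-const overload,
    // so no call ever has to unshare the buffer.
    char operator[](std::size_t i) const { return (*data_)[i]; }
    std::size_t size() const { return data_->size(); }
    const std::string& str() const { return *data_; }

    bool shares_with(const ImmutableString& other) const {
        return data_ == other.data_;
    }

private:
    std::shared_ptr<const std::string> data_;
};
```

Because nothing but assignment changes an ImmutableString, copying only
bumps the count and no access function can invalidate anything.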

-Dave







Author: Kevlin Henney <kevlin@curbralan.com>
Date: 2000/05/29
Raw View
In article <8gmqkj$b5r$2@news.inet.tele.dk>, Anders J. Munch
<andersjm@post.tele.dk> writes
>Kevlin Henney wrote in message ...
>>>Const constructors and const destructors? I'd love them. They'd mean
>>>"This guy is meant to be, respectively was, a constant object". Holds
>>>water to me. Ain't it?
>>
>>Yup, I think this would make a real difference, which is why I made a
>>proposal last time round to get them in. However, I believe it was too
>>much of a core change too late in the day. Maybe next time :->
>
>One problem is inconsistency.  The keyword 'const' would mean
>something entirely different for a constructor than for a member
>function.  A _truly_ const constructor would be pretty much useless:
>the type of 'this' would be T const*const, and initializing instance
>variables would be impossible!

I'd be happy to send a copy of the proposal for anyone interested in the
detail. What is meant by const ctor is perhaps not what you have
assumed: that would indeed be inconsistent and would cause the problems
you describe. Perhaps if we word it another way it becomes clearer: a
ctor that is executed for objects created as const.
____________________________________________________________

  Kevlin Henney                   phone:  +44 117 942 2990
  Curbralan Ltd                   mobile: +44 7801 073 508
  kevlin@curbralan.com            fax:    +44 870 052 2289
____________________________________________________________







Author: Herb Sutter <hsutter@peerdirect.com>
Date: 2000/05/29
Raw View
"Anders J. Munch" <andersjm@post.tele.dk> writes:
>Herb Sutter wrote in message ...
>>No, it's stronger: While the constructor is running, the object does not
>>yet exist. Object lifetime begins when the constructor returns
>>successfully. It's not that constructors aren't const, so much that
>>constness isn't an applicable concept until the object in question
>>exists (is constructed).
>
>Nonsense.  The object does exist.

If so, then what is its type? See below.

>It just hasn't entered the stage of
>its existence which the standard calls the 'lifetime'.
>
> §3.8/1: "The lifetime of an object is a runtime property of the object."
>
>(Herb, I've said this before.  How many times are you going to make
>me repeat myself?)

While a building is under construction, it's true that some component
parts of it do exist. We generally don't call a collection of
scaffolding a "building" until it's done, though.

What statement do you disagree with? We agree that the lifetime doesn't
begin until the constructor ends successfully. You may want to say that
an object t of type T "exists" during construction and consider it a
matter of taste, but I think it's invalid and a bad mental model. I
agree that a block of memory holding some values does exist, but I deny
that the object exists -- not only because its lifetime hasn't begun,
but because it probably doesn't fulfill the T invariants yet, which
means it is not yet an object of type T. The job of the constructor is
to take raw memory and make an object of the given type; by definition
the object cannot exist or be usable, until the constructor has
completed its job successfully.

Until the constructor finishes, you have memory that's a bundle of bits.
If you want to call that an object, I ask "of what type?"

Herb

---
Herb Sutter (mailto:hsutter@peerdirect.com)

CTO, PeerDirect Inc. (http://www.peerdirect.com)
Chair, ANSI SQL Part 12: SQL/Replication
Editor-in-Chief, C++ Report (http://www.creport.com)







Author: Kevlin Henney <kevlin@curbralan.com>
Date: 2000/05/29
Raw View
In article <86d7m7rmkq.fsf@gabi-soft.de>, kanze@gabi-soft.de writes
>Kevlin Henney <kevlin@curbralan.com> writes:
>|>  Except that it does not. That is exactly the reason for my posting. 21.3
>|>  para 5 in part states that
>
>|>  "References, pointers, and iterators referring to the elements of a
>|>  basic_string sequence may be invalidated by the following uses of that
>|>  basic_string object:
>|>  ....
>|>  -- Calling non-const member functions, except operator[](), at(),
>|>  begin(), rbegin(), end(), and rend()...."
>
>And in part states "Subsequent to any of the above usages [...] the
>first call to non-const member functions operator[](), at(), begin(),
>rbegin(), end(), or rend()."  I'm still trying to figure out what is
>supposed to be meant by the first part of the phrase.  Normally, I would
>have thought that "subsequent to" could better be said by "after" --
>that would certainly correspond to the latin meaning.

That is also its English meaning, and I would therefore assume it
naturally to be the one in the standard.

>But in this
>context, the phrase would be meaningless, since any iterators are
>already invalid "subsequent to" the above uses.

Not exactly, it is possible to obtain iterators and references after any
of the above uses without being any of the above uses.

>So there is probably a defect, but it is one of wording, and not what
>you can do with a string.

Well, I think that works out as "yes" to both: because the wording is
defective it changes what you can do with a string!
____________________________________________________________

  Kevlin Henney                   phone:  +44 117 942 2990
  Curbralan Ltd                   mobile: +44 7801 073 508
  kevlin@curbralan.com            fax:    +44 870 052 2289
____________________________________________________________







Author: Francis Glassborow <francis@robinton.demon.co.uk>
Date: 2000/05/29
Raw View
In article <86zoparke6.fsf@gabi-soft.de>, kanze@gabi-soft.de writes
>A defect report should not be used to change an
>explicit decision of the committee, however, even if you and I don't
>agree with that decision.

Even if it proves to be a mistake, unimplementable, ambiguous or
contradictory? I am not claiming that the case in question is any of
these, I just wish to point out that whether something is a deliberate
decision of the Committee or not says little about its defectiveness.


Francis Glassborow      Association of C & C++ Users
64 Southfield Rd
Oxford OX4 1PA          +44(0)1865 246490
All opinions are mine and do not represent those of any organisation







Author: Kevlin Henney <kevlin@curbralan.com>
Date: 2000/05/29
Raw View
In article <86zoparke6.fsf@gabi-soft.de>, kanze@gabi-soft.de writes
>Alan Griffiths <alan@octopull.demon.co.uk> writes:
>
>|>  OK, we considered the possibility of permitting the first call to non-
>|>  const begin (etc) to reallocate (i.e. invalidate existing iterators).
>|>  But one problem case still exists:
>
>|>      string s;
>|>      const string& cs = s;
>|>      s = "...";
>|>      cs.begin() == s.begin();
>
>|>  Regardless of the change under discussion this will introduce undefined
>|>  behaviour (because as currently worded the last bullet now applies).
>|>  This is sufficiently surprising that we did not feel the language
>|>  standard should require it.
>
>I don't think that this can be handled as a defect.  The case was well
>known and considered by the committee.  (It was mentioned in the French
>comments on CD2, for example, with alternatives that solved it.)

I think perhaps it is time to reassess whether or not this constitutes
something we want to consider a defect. At the time this was dealt with
(July '97 wasn't it?) the focus was on finishing up a standard. Now it's
shipped it is perhaps worth considering whether certain consequences
were genuinely desirable:

        void f(const string &s, string &t)
        {
                if(s[0] == t[0]) ...
                ...
        }

        void g()
        {
                string s("?"), t("#"), u("!");
                f(s, t); // yes
                f(u, u); // no
        }
____________________________________________________________

  Kevlin Henney                   phone:  +44 117 942 2990
  Curbralan Ltd                   mobile: +44 7801 073 508
  kevlin@curbralan.com            fax:    +44 870 052 2289
____________________________________________________________







Author: "Andrei Alexandrescu" <andrewalex@hotmail.com>
Date: 2000/05/29
Raw View
<kanze@gabi-soft.de> wrote in message
news:86ln0vro55.fsf@gabi-soft.de...
> "Andrei Alexandrescu" <andrewalex@hotmail.com> writes:
> |>  Moreover, all strings in all programs are six characters on the
> |>  average :o).
>
> And operator new/malloc is a very expensive function in most
> implementations.  Even for six characters:-).

Right. However, I saw a nice article by Jack Reeves in the Report. He
stored small strings right in the space used for the pointers. Hard to
implement and very hard to implement generically and portably, but
worth looking into.
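
A rough sketch of the general idea, not the design from the article:
SmallString and its members are hypothetical names, the 16-byte
capacity is arbitrary, and copying is disabled to keep it short.  Short
strings live in a buffer that shares its storage with the heap pointer,
so they cost no allocation:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical small-buffer string.
class SmallString {
public:
    explicit SmallString(const char* s) : size_(std::strlen(s)) {
        if (is_small()) {
            // Element-wise assignment makes `small` the active member.
            for (std::size_t i = 0; i <= size_; ++i)
                store_.small[i] = s[i];
        } else {
            store_.heap = new char[size_ + 1];
            std::memcpy(store_.heap, s, size_ + 1);
        }
    }
    ~SmallString() {
        if (!is_small())
            delete[] store_.heap;
    }
    SmallString(const SmallString&) = delete;
    SmallString& operator=(const SmallString&) = delete;

    const char* c_str() const {
        return is_small() ? store_.small : store_.heap;
    }
    std::size_t size() const { return size_; }

    // True when the characters and the terminator fit in the buffer.
    bool is_small() const { return size_ < sizeof(store_.small); }

private:
    union Storage {
        char* heap;      // used when the string is too long
        char small[16];  // used otherwise; overlaps the pointer
    };
    std::size_t size_;
    Storage store_;
};
```

A six-character string such as "Hello!" stays in the inline buffer;
only strings of 16 characters or more reach operator new[].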


Andrei









Author: "Andrei Alexandrescu" <andrewalex@hotmail.com>
Date: 2000/05/29
Raw View
<kanze@gabi-soft.de> wrote in message
news:86u2fjrov7.fsf@gabi-soft.de...
> "Andrei Alexandrescu" <andrewalex@hotmail.com> writes:
>
> |>  Frankly, I would like to see the mad cow banned. (A good side effect
> |>  will be that vendors will finally implement efficient multithreaded
> |>  std::strings.) We can do better than cow for strings - see below.
>
> In sum, you want all users to pay for thread safety, even if they aren't
> using threads:-).

Actually, I wanted the opposite - give a fair chance to mt folks and
not harm st folks. But if cow is so important...

> The interface to std::string is awkward to make efficient *and* thread
> safe, although it can be done.  IMHO, strings should be value objects,
> i.e.: they should not be modifiable, except by assignment.  (The only
> non-const function originally in my pre-standard string class was
> operator=.  It's also interesting to see that Java took this approach.
> And that Java's strings are thread safe, without any synchronized
> methods.)

In C++, const member functions introduce a neat discrimination between
const and non-const strings. Except for constructors and destructors -
that's why I think it would be worthwhile introducing them.


Andrei








Author: "Ross Smith" <ross.s@ihug.co.nz>
Date: 2000/05/29
Raw View
> <kanze@gabi-soft.de> wrote in message news:868zwvrly6.fsf@gabi-soft.de...
> "Marco Dalla Gasperina" <marcodg@home.com> writes:
>
> |>      string s;
> |>      const string &r = s;
> |>      s[1] = r[2];
>
> In order to define a string class using cow for which this works, it is
> necessary for operator[]() to return a helper class.  The standard
> forbids this, however, and it is *not* implementable as long as the
> charT type of basic_string can be a struct.

Why not?

--
Ross Smith <ross.s@ihug.co.nz> The Internet Group, Auckland, New Zealand
========================================================================
   "So that's 2 T-1s and a newsfeed ... would you like clues with that?"
                                                       -- Peter Da Silva








Author: Kevlin Henney <kevlin@curbralan.com>
Date: 2000/05/25
Raw View
In article <8ggmuj$qai@library2.airnews.net>, Bill Wade
<bill.wade@stoner.com> writes
>"Alan Griffiths" <alan@octopull.demon.co.uk> wrote in message
>news:BaQryAA6FuK5EwNR@octopull.demon.co.uk...
>> The sequence of events is a call to const begin to obtain an iterator
>> and then a call to non-const begin which (in practice, but not in the
>> standard) unshares the string implementation - non-conformingly
>> invalidating the original iterator.
>
>I don't understand why you say "not in the standard."

Because it isn't in the standard :->

>The fifth bullet
>specifically says that the first call to one of a set of several members,
>including non-const begin(), invalidates existing iterators.

The fifth bullet actually says that the first call sequentially
following any of the above (ie calls specified as invalidating iterators
in the previous four bullets) to one of a set of several members
including non-const begin(), invalidates existing iterators.

If you don't believe us, here's the text. Note (1) the poor wording, and
(2) the precondition on the fifth bullet applying (ie "Subsequent...."):

"Subsequent to any of the above uses except the forms of insert() and
erase() which return iterators, the first call to non-const member
functions operator[](), at(), begin(), rbegin(), end(), or rend()."

Point (1), and the knowledge that the intent of the standard was to
support reference-counted strings, means that it is very easy to read
what one thinks (or would like to think) it says rather than what it
actually says.
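
To make the sequence of events concrete, here is a minimal sketch with
a toy reference-counted string.  SharedString and the helper function
are hypothetical names, std::shared_ptr stands in for the reference
count, and this is not any vendor's basic_string:

```cpp
#include <cassert>
#include <memory>
#include <string>

// Toy reference-counted string (hypothetical).
class SharedString {
public:
    explicit SharedString(const std::string& s)
        : data_(std::make_shared<std::string>(s)) {}

    // Const access never unshares: it points into the shared buffer.
    const char* begin() const { return data_->data(); }

    // Non-const access must assume a write follows, so it unshares.
    char* begin() {
        if (data_.use_count() > 1)
            data_ = std::make_shared<std::string>(*data_);
        return &(*data_)[0];
    }

private:
    std::shared_ptr<std::string> data_;
};

// Returns true if an iterator obtained through a const reference still
// refers to s's elements after the first non-const begin() on s.
inline bool const_iterator_survives_nonconst_begin()
{
    SharedString a("abc");
    SharedString s = a;              // s and a share one buffer
    const SharedString& r = s;
    const char* before = r.begin();  // points into the shared buffer
    s.begin();                       // s now gets its own copy
    return before == r.begin();      // old iterator vs. s's new buffer
}
```

In this toy the function returns false: no call from the first four
bullets has intervened, yet the first non-const begin() left the
earlier iterator pointing at a's buffer rather than at s's elements.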
____________________________________________________________

  Kevlin Henney                   phone:  +44 117 942 2990
  Curbralan Ltd                   mobile: +44 7801 073 508
  kevlin@curbralan.com            fax:    +44 870 052 2289
____________________________________________________________





Author: Rintala Matti <bitti@korppi.cs.tut.fi>
Date: 2000/05/26
Raw View
Kevlin Henney <kevlin@curbralan.com> writes:
> In article <8ge4tq$g9c@library1.airnews.net>, Bill Wade
> <bill.wade@stoner.com> writes
> >Could you clarify?  The only argument I can see along these lines would say
> >that a constructor is not a non-const member function.
...
> >They certainly don't look
> >like they are const.
>
> And it definitely does not look like a non-const member function: non-
> const member functions cannot be called on const objects.

I will not directly take part in the "are reference-counted strings
conforming" discussion, but since at least one of Kevlin Henney's examples
was based on the opinion that constructors are const, I thought I
should quote the following part of the standard:

12.1 p4:
"A constructor can be invoked for a const, volatile or const volatile
object. A constructor shall not be declared const, volatile, or const
volatile (9.3.2). const and volatile semantics (7.1.5.1) are not
applied on an object under construction. Such semantics only come into
effect once the constructor for the most derived object (1.8) ends."

This seems to indicate quite clearly that constructors are _not_
const, even if they are used to initialise a const object. While the
constructor is running, the object is non-const.
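
A small illustration of the quoted paragraph; Counter and
make_const_counter are hypothetical names:

```cpp
#include <cassert>

// The constructor assigns to a member in its body, which would be
// ill-formed if `this` pointed to a const Counter.
struct Counter {
    int value;
    explicit Counter(int start) {
        value = start;  // allowed: const is not applied during construction
        ++value;
    }
    int get() const { return value; }
};

inline int make_const_counter()
{
    const Counter c(41);  // the constructor runs for an object declared const
    // c.value = 0;       // error: const semantics apply once construction ends
    return c.get();
}
```

The constructor modifies the object even though it is declared const,
and the result of make_const_counter() is 42.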

------------ Matti Rintala ----------- bitti@cs.tut.fi ------------
"Well, the way I see it, logic is only a way of being ignorant
 by numbers." (from "Small Gods" by Terry Pratchett)







Author: James Kuyper <kuyper@wizard.net>
Date: 2000/05/26
Raw View
Bill Wade wrote:
>
> "Kevlin Henney" <kevlin@curbralan.com> wrote in message
> news:Ux2cAgAeirK5Ew8H@two-sdg.demon.co.uk...
> > In article <8ge4tq$g9c@library1.airnews.net>, Bill Wade
> > <bill.wade@stoner.com> writes
>
> > >Are constructors member functions?  12/1 "...
> > >constructor[s] ... are ... member functions."
> >
> > If this is the piece of text I think it is, the missing words include
> > "special" and nowhere is non-const mentioned.
>
> True, but in English I expect that "special member functions" are "member
> functions."

Yes, but this is the C++ standard we're talking about. Keep in mind
that, for instance, a "null pointer constant" is null, and constant, but
is guaranteed to not be a pointer.

In this case, I think you're right - they are member functions. However,
I think that they are neither const nor non-const; the distinction
doesn't apply to constructors. A non-const member function can only be
called on a non-const object. Constructors aren't called on objects,
they create objects, so the distinction doesn't apply. 12.1p4: "const
and volatile semantics (7.1.5.1) are not applied to an object under
construction."







Author: Kevlin Henney <kevlin@curbralan.com>
Date: 2000/05/26
Raw View
In article <OlXW4.63804$55.505983@news1.sttls1.wa.home.com>, Marco Dalla
Gasperina <marcodg@home.com> writes
>>         string s;
>>         const string &r = s;
>>         if(r.begin() == s.begin()) ....
>
>I can't figure out (given all the words that have been flying)
>if the following is well defined:
>
>    string s;
>    const string &r = s;
>
>    s[1] = r[2];


Assuming some relevant call between the decls and the assignment,
then no this is not. Once upon a time it was, but from what I can tell
the result of r[2] changed from char to const char & in the November '96
Working Paper, so this is (1) not unreasonable code, and (2) also a
problem case.

On reconsidering, I think I was too conservative in one of my other
postings about what needed to be changed. I had forgotten that const
operator[]() and at() now returned references rather than values. This
leads to an even more interesting state of affairs:

(1) Dropping the fifth sub-bullet entirely would eliminate all of these
problems (I believe), but would also practically eliminate possibilities
for reference counting. You could share representation so long as you
never accessed it, ie using find_*, append, replace, etc functions but
not any of the indexing or iterator access functions. [In this case,
Andrei gets his wish: the mad COW would be well on the way to the
incinerator ;->]

(2) Revert back to returning values for those const functions, and the
wording could be changed to still allow reference counting.
____________________________________________________________

  Kevlin Henney                   phone:  +44 117 942 2990
  Curbralan Ltd                   mobile: +44 7801 073 508
  kevlin@curbralan.com            fax:    +44 870 052 2289
____________________________________________________________







Author: Kevlin Henney <kevlin@curbralan.com>
Date: 2000/05/26
Raw View
In article <8gh7qf$dcm@library2.airnews.net>, Bill Wade
<bill.wade@stoner.com> writes
[...]
>True, but in English I expect that "special member functions" are "member
>functions."

Indeed, but does this help you establish whether they are const, non-
const, or something completely different?

>> Care should be taken in
>> omitting words from quotes.
>
>True.  I'm certainly not entirely comfortable in calling constructors member
>functions.  However  I can't find a clear statement in the standard that
>they aren't.

The standard is indeed not very clear in this respect, but to hijack
this lack of clarity to change the intent of some poor wording for
string seems less than useful. The wording for string is at fault,
period. We can play word games until the COWs come home, but this line
of reasoning seems moribund.

However, if one wishes to play such games, it is trivial to demonstrate
that [21.3 para 5] does not work for the case you are holding out for:

"References, pointers, and iterators referring to the elements of a
basic_string sequence may be invalidated by the following uses of that
basic_string object:
....
-- Calling non-const member functions, except operator[](), at(),
begin(), rbegin(), end(), and rend().
...."

This does not make any kind of sense when we just rewrite to focus on
constructors only:

"References, pointers, and iterators referring to the elements of a
basic_string sequence may be invalidated by calling a constructor."

I think that which was moribund is now dead.

>> >They certainly don't look like they are const.
>>
>> And it definitely does not look like a non-const member function: non-
>> const member functions cannot be called on const objects.
>
>But non-const member functions can create const objects.

Whilst this is true it is alas not in the slightest bit relevant:
Whether or not a function can create objects is nothing to do with
whether or not it can be called on a const object.

>We could probably have a similar debate about whether or not
>  'char' is 'non-signed char'
>in an environment where char is signed.

I doubt it :->

>I'll absolutely agree that a clearer statement in 21.3/5 bullet 4  could
>have saved us both a lot of typing.

No, this is not the bullet at fault. It is bullet 5 that is at fault;
bullet 4 is quite clear (but needs fixing for different reasons).

>>         string s, &r = s;
>>         if(r.begin() == s.begin()) ....
[...]
>If you change r to be reference to const string (I see you fixed this in
>another post), this does not work because the evaluation order is not
>specified, and with the wrong order the r.begin() result is unusable.  That
>is certainly "surprising" if your mental model treats string as a
>vector<char>, but it is what the standard says.  I don't see this particular
>example in the public version of issue 179.  Are you saying that this
>(fixed) example will become normative text?

My mental model treats string as an unstring-like minefield rather than
a vector<char>.

That said, great efforts have been made in supporting the principle of
least astonishment, and the draft (and now the standard) has been fixed
many times to this end. Issue 179 is one such example. Another example
that is more closely related to the matter at hand is one of the ways in
which the draft was fixed between CD2 and FDIS:

        s[i] == s[i]

According to your line of reasoning, people should just have accepted
that this was bad news and simply not call it. Fortunately the result of
the July '97 mtg was more constructive and fixed this up.

>I don't think you'd be surprised to find that the following doesn't work
>
>  vector<char> s;
>  const vector<char> &r = s;
>  if(r.begin() == s.insert(s.begin(), 'x'))
>
>because you know when vector::insert can invalidate iterators.  You won't be
>surprised that the const/non-const version of your example fails if you know
>when string::begin can invalidate iterators.

Comparing like with like might be a more useful exercise:

        vector<char> s;
        const vector<char> &r = s;
        if(r.begin() == s.begin()) ....

>The standard traded safety for implementation freedom, here and in many
>other places.  We may not agree with the decision, but I believe that it is
>wrong to correct it as a DR.

Indeed, but correcting that which is defective is the very purpose of a
DR. It is interesting to play around with possibilities as we have done,
but a fault is still a fault and "if it is broke, fix it".
____________________________________________________________

  Kevlin Henney                   phone:  +44 117 942 2990
  Curbralan Ltd                   mobile: +44 7801 073 508
  kevlin@curbralan.com            fax:    +44 870 052 2289
____________________________________________________________

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]






Author: Rob Stewart <donotspamme@giage.com>
Date: 2000/05/26
Raw View
Bill Wade wrote:
>
> "Kevlin Henney" <kevlin@curbralan.com> wrote:
> >
> > Bill Wade <bill.wade@stoner.com> writes:
> > >
> > >Are constructors member functions?  12/1 "...
> > >constructor[s] ... are ... member functions."
> >
> > If this is the piece of text I think it is, the missing words include
> > "special" and nowhere is non-const mentioned.
>
> True, but in English I expect that "special member functions" are "member
> functions."

Perhaps the right way to think of ctors and dtors is that they
are "special" member functions and, as such, they are neither
const nor non-const.

--
Robert Stewart     |  rob-at-giage-dot-com
Software Engineer  |  using std::disclaimer;
Giage, Inc.        |  http://www.giage.com







Author: James Kuyper <kuyper@wizard.net>
Date: 2000/05/26
Raw View
John Hickin wrote:
>
> James Kuyper wrote:
> >
>
> >
> > I'm not saying that allocators aren't broken, but I'd be interested in
> > knowing the particular features of them that you consider broken.
> > Personally, my least favorite parts are sections 20.1.5 p4 and p5, but I
> > don't think that counts as broken, just extremely frustrating.
> >
>
> All instances of the same allocator are allowed to be treated as
> equivalent. In my opinion this means that they are _broken_. In yours,
> frustrating. The standard goes to great lengths to make sure that
> sufficient flexibility is available to the programmer to get various
> jobs done. This flexibility isn't (yet) available with the allocator
> specification. It is a definite shortcoming. Without the caveat in the
> standard a defect report would surely have been opened.

I don't consider it broken, because I haven't detected any inconsistency
in that specification. Also, from what I've heard, there isn't a lot of
demand for such allocators.

The writers of the standard felt that there was insufficient experience
with  allocators that allow inequivalent instances to justify
standardizing them. However, they wanted to encourage further
experimentation with the concept. Therefore, they allow but do not
mandate standard containers that can work with such allocators. I would
prefer that they were mandated, and would prefer the semantics of such
containers to be defined by the standard, rather than being
implementation-defined. However, I can understand the committee's
hesitation, and therefore I'm only frustrated.
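To make "inequivalent instances" concrete, here is a minimal sketch. TaggedAlloc and its tag member are hypothetical names invented for illustration, and the sketch assumes a C++11-or-later library in which each container keeps and consults its own allocator instance; it says nothing about what the libraries under discussion in this thread do.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical allocator whose instances carry a tag; two instances are
// interchangeable only when their tags match.
template <class T>
struct TaggedAlloc {
    typedef T value_type;
    int tag;

    explicit TaggedAlloc(int t) : tag(t) {}
    template <class U>
    TaggedAlloc(const TaggedAlloc<U>& other) : tag(other.tag) {}

    T* allocate(std::size_t n) {
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }
};

template <class T, class U>
bool operator==(const TaggedAlloc<T>& a, const TaggedAlloc<U>& b) {
    return a.tag == b.tag;
}
template <class T, class U>
bool operator!=(const TaggedAlloc<T>& a, const TaggedAlloc<U>& b) {
    return !(a == b);
}

// Two vectors built from unequal allocator instances; each one reports
// the instance it was constructed with.
bool containers_keep_their_own_allocator() {
    TaggedAlloc<int> a1(1), a2(2);
    std::vector<int, TaggedAlloc<int> > v1(a1), v2(a2);
    v1.push_back(10);
    v2.push_back(20);
    assert(v1[0] == 10 && v2[0] == 20);
    return a1 != a2
        && v1.get_allocator() == a1
        && v2.get_allocator() == a2;
}
```

The interesting questions (what swap or splice should do between v1 and v2) are exactly the semantics that 20.1.5 leaves to the implementation.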

Does anyone know if there's any implementation of the C++ standard
library where the standard containers can handle such allocators?







Author: philip.hibbs@tnt.co.uk
Date: 2000/05/26
Raw View
Kevlin:
> Which means that the first call to them since construction is not
> covered, hence (part of) the problem. In the examples I gave none
> of the conditions for invalidation applied.

So, you aren't counting the constructor as a non-const function?









Author: "Bill Wade" <bill.wade@stoner.com>
Date: 2000/05/26
Raw View
"Marco Dalla Gasperina" <marcodg@home.com> wrote in message
news:OlXW4.63804$55.505983@news1.sttls1.wa.home.com...

> I can't figure out (given all the words that have been flying)
> if the following is well defined:
>
>     string s;
>     const string &r = s;
>
>     s[1] = r[2];
>
> The culprit sequence would be
> 1)  [const] char& t1 = r.operator[](2)
> 2)  char& t2 = s.operator[](1); // Invalidating the previous reference??
> 3)  char c = t1;    // boom?
> 4)  t2 = c;

Well it is clearly undefined because your indices are out of range ;-).

If you initialize s with

  string q = "abc";
  s = q;

The behavior is undefined according to the standard.  In practice this
example is safe on existing implementations when there is only a single
thread of execution.  After your step (2), t1 will no longer reference the
'b' in s, but it will continue to reference the 'b' in q and you will get
the desired effect.  In a multithreaded COW environment you have to worry
that 'q' gets destroyed or modified in another thread before step (3) occurs
in your thread.

If the last time you "modify" s is in its constructor, there is a lively
debate going on about whether or not this is undefined.  However my "in
practice on existing implementations" discussion remains unchanged, because
existing COW implementations do essentially the same thing for both
assignment and construction.
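The "in practice" behavior described above can be reproduced with a toy copy-on-write class. This is only a sketch: CowString is a hypothetical class written for illustration, not the code of any real library, and it uses std::shared_ptr (C++11) for the reference count and ignores threading entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical toy class, not any real library's string.
class CowString {
public:
    CowString() : buf_(std::make_shared<std::string>()) {}
    CowString(const char* text) : buf_(std::make_shared<std::string>(text)) {}
    // Copy construction and assignment are compiler-generated: they copy
    // the shared_ptr, so the character buffer is shared.

    // Const access never unshares.
    const char& operator[](std::size_t i) const { return (*buf_)[i]; }

    // Non-const access hands out a writable reference, so take a private
    // copy of the buffer first if anyone else shares it.
    char& operator[](std::size_t i) {
        if (buf_.use_count() > 1)
            buf_ = std::make_shared<std::string>(*buf_);
        return (*buf_)[i];
    }

    const std::string& str() const { return *buf_; }

private:
    std::shared_ptr<std::string> buf_;
};

// Marco's four steps, with indices that are in range for "abc".
std::string marco_sequence() {
    CowString q = "abc";
    CowString s;
    s = q;                        // s and q now share one buffer
    const CowString& r = s;

    const char& t1 = r[2];        // step 1: refers into the shared buffer
    char& t2 = s[1];              // step 2: s unshares; t1 is left behind
    assert(&t1 == &q.str()[2]);   // t1 now refers to q's 'c', not s's
    char c = t1;                  // step 3: fine only because q is alive
    t2 = c;                       // step 4
    assert(q.str() == "abc");     // q is untouched
    return s.str();
}
```

If q were destroyed (or modified from another thread) between steps 2 and 3, t1 would dangle, which is the multithreaded hazard mentioned above.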









Author: Kevlin Henney <kevlin@curbralan.com>
Date: 2000/05/26
Raw View
In article <w59r9arcbdd.fsf@korppi.cs.tut.fi>, Rintala Matti
<bitti@korppi.cs.tut.fi> writes
>I will not directly take part in the "are reference-counted strings
>conforming" discussion, but since at least one of Kevlin Henney's examples
>was based on the opinion that constructors are const, I thought I
>should quote the following part of the standard:

Really? I made no such statement or assumption, so you must have misread
what I posted. I asked a question about the justification of classifying
constructors as non-const or const member functions.

None of my examples assumed what you state, I'm afraid, nor are they
based on opinions.
____________________________________________________________

  Kevlin Henney                   phone:  +44 117 942 2990
  Curbralan Ltd                   mobile: +44 7801 073 508
  kevlin@curbralan.com            fax:    +44 870 052 2289
____________________________________________________________







Author: Kevlin Henney <kevlin@curbralan.com>
Date: 2000/05/26
Raw View
In article <8ginsp$qnv$1@nnrp1.deja.com>, philip.hibbs@tnt.co.uk writes
>Kevlin:
>> Which means that the first call to them since construction is not
>> covered, hence (part of) the problem. In the examples I gave none
>> of the conditions for invalidation applied.
>
>So, you aren't counting the constructor as a non-const function?

Doesn't matter whether or not I count it as a const or a non-const
member function, the wording in 21.3 para 5 does not deal with
construction.

The paragraph describes when references, pointers, and iterators become
invalidated, and calling a general non-const member function may
invalidate them. If a constructor is a non-const member function, then
you have to arrange to somehow call it on an object after you have
created iterators etc to it!

The whole const/non-const discussion is a complete red herring as it
does not apply.
____________________________________________________________

  Kevlin Henney                   phone:  +44 117 942 2990
  Curbralan Ltd                   mobile: +44 7801 073 508
  kevlin@curbralan.com            fax:    +44 870 052 2289
____________________________________________________________







Author: Herb Sutter <hsutter@peerdirect.com>
Date: 2000/05/26
Raw View
Rintala Matti <bitti@korppi.cs.tut.fi> writes:
>This seems to indicate quite clearly that constructors are _not_
>const, even if they are used to initialise a const object.

Sort of.

>While the constructor is running, the object is non-const.

No, it's stronger: While the constructor is running, the object does not
yet exist. Object lifetime begins when the constructor returns
successfully. It's not that constructors aren't const, so much that
constness isn't an applicable concept until the object in question
exists (is constructed).

Ask me whether an object is const, and at first I will say "what
object?" -- until the constructor ends and the object pops into the
universe, after which I'll say "oh, that object" and be able to give you
an answer about whether it's const or not.
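A small sketch of what this means in code. Counter and its members are hypothetical names; the point illustrated is only that const semantics are not applied to an object while its constructor runs, so the constructor of an object declared const may write to members and call non-const member functions.

```cpp
#include <cassert>

struct Counter {
    int n;
    Counter() : n(0) {
        n = 41;      // plain write to a member of a soon-to-be-const object
        bump();      // non-const member function called during construction
    }
    void bump() { ++n; }
    int value() const { return n; }
};

int value_of_const_counter() {
    const Counter c;      // constness takes effect once construction is done
    // c.bump();          // would not compile: c is const by now
    assert(c.value() == 42);
    return c.value();
}
```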

Herb

---
Herb Sutter (mailto:hsutter@peerdirect.com)

CTO, PeerDirect Inc. (http://www.peerdirect.com)
Chair, ANSI SQL Part 12: SQL/Replication
Editor-in-Chief, C++ Report (http://www.creport.com)







Author: "Andrei Alexandrescu" <andrewalex@hotmail.com>
Date: 2000/05/26
Raw View
Herb Sutter <hsutter@peerdirect.com> wrote in message
news:qc6ris840dnnn9htnsje4oio23tmelipkl@4ax.com...
> Ask me whether an object is const, and at first I will say "what
> object?" -- until the constructor ends and the object pops into the
> universe, after which I'll say "oh, that object" and be able to give
> you an answer about whether it's const or not.

I still think a const-qualified constructor makes conceptual sense.
Here const is *information* - somebody, somewhere, builds an object
that is const. Call me a bad guy, but I think that discrimination by
constness during construction/destruction is good - you can optimize
stuff. It's information.

Const constructors and const destructors? I'd love them. They'd mean
"This guy is meant to be, respectively was, a constant object". Holds
water to me. Ain't it?


Andrei








Author: "Andrei Alexandrescu" <andrewalex@hotmail.com>
Date: 2000/05/26
Raw View
Bill Wade <bill.wade@stoner.com> wrote in message
news:8ghltk$fqu@library2.airnews.net...
> An underlying current of these discussions is a hope that the
> standard will back away from COW support.  COW can be a big help to
> single-threaded programs that mostly treat strings as "entities"
> rather than as arrays to manipulate on a per-character basis.

I would correct this. The mad cow can be a big help to single-threaded
programs that must deal with strings in a very efficient manner.
Otherwise, who cares?

This further reduces cow's applicability.


Andrei








Author: "Bill Wade" <bill.wade@stoner.com>
Date: 2000/05/27
Raw View
"Kevlin Henney" <kevlin@curbralan.com> wrote in message
news:+hvIuCAF4$K5Ew4M@two-sdg.demon.co.uk...
> In article <8ggmuj$qai@library2.airnews.net>, Bill Wade
> <bill.wade@stoner.com> writes

> >I don't understand why you say "not in the standard."
>
> Because it isn't in the standard :->
>
> >The fifth bullet
> >specifically says that the first call to one of a set of several members,
> >including non-const begin(), invalidates existing iterators.
>
> The fifth bullet actually says that the first call sequentially
> following any of the above (ie calls specified as invalidating iterators
> in the previous four bullets) to one of a set of several members
> including non-const begin(), invalidates existing iterators.

So if a constructor is a non-const member function (calls to non-const
member functions are listed in bullet 4) then it follows that the first call
after construction to one of the members listed in bullet five invalidates
any existing iterators and the behavior you don't like is "in the standard."

If either bullet four or bullet five had specifically mentioned constructors
(to either explicitly include or exclude them) then you and I would probably
agree on whether or not your code is well behaved according to the standard.

Boolean logic would say that one of the three statements is true
  1) 'tors are not member functions
  2) 'tors are const member functions
  3) 'tors are (non-const) member functions.
Unfortunately we seem to be stuck with fuzzy logic here.  IMO option (3) is
most consistent with the rest of the standard.  Adding 'tors to bullet 4 is
meaningless in isolation (none of us expect that iterators survive 'tors,
even though I can't find such chapter and verse after looking at lifetime,
iterator, and container requirements), but bullet 5 references bullet 4, and
'tors are certainly meaningful in that context.









Author: Jörg Barfurth <joerg.barfurth@attglobal.net>
Date: 2000/05/27
Raw View

>>>>>>>>>>>>>>>>>> Original Message <<<<<<<<<<<<<<<<<<

On 26.05.00, 09:40:09, Herb Sutter <hsutter@peerdirect.com> wrote on the
subject Re: Can std::basic_string implementations be reference counted?:


> Rintala Matti <bitti@korppi.cs.tut.fi> writes:
> >This seems to indicate quite clearly that constructors are _not_
> >const, even if they are used to initialise a const object.

> Sort of.

> >While the constructor is running, the object is non-const.

> No, it's stronger: While the constructor is running, the object does not
> yet exist. Object lifetime begins when the constructor returns
> successfully. It's not that constructors aren't const, so much that
> constness isn't an applicable concept until the object in question
> exists (is constructed).

> Ask me whether an object is const, and at first I will say "what
> object?" -- until the constructor ends and the object pops into the
> universe, after which I'll say "oh, that object" and be able to give you
> an answer about whether it's const or not.

> Herb

But the (base class and member) subobjects of a const object are
(recursively) const.

While the constructor of an object is running such subobjects already
exist, and they are non-const. This holds even if the object being
constructed (and therefore those same subobjects) will be const after the
constructor is done.

You might say that such subobjects exist only relative to the outer
object and don't have an existence of their own. But then, why do they get
destructed in case of an exception?

What I'm trying to say is that there is a meaning to the statement "While
the constructor is running, the object is non-const.", namely that
(already existing) subobjects are not const.

As all objects are ultimately composed from subobjects of builtin types
and physical mutation of an object's state always involves mutating such
a (POD) subobject, it is reasonable to relate object constness to
constness of the subobjects it is composed of (disregarding members that
are explicitly const or mutable).

-------------------------------
I just found yet another way to look at it: It would be legal for a
compiler (that uses vtables) to remove any non-const virtual functions
from the vtable(s) of a const object and its (non-mutable) subobjects.

It must provide the complete (non-const objects') vtables though while
the object is still being constructed (or is a subobject of an object
being constructed).

--
Jörg








Author: Kevlin Henney <kevlin@curbralan.com>
Date: 2000/05/27
Raw View
In article <8gm05n$fnf@library1.airnews.net>, Bill Wade
<bill.wade@stoner.com> writes
>So if a constructor is a non-const member function (calls to non-const
>member functions are listed in bullet 4) then it follows that the first call
>after construction to one of the members listed in bullet five invalidates
>any existing iterators and the behavior you don't like is "in the standard."

The behavior I "don't like"? Well that's a funny way of putting a defect
:->

Let me try to outline this as simply as possible so you don't miss any
of the working. First of all, here's the text:

"References, pointers, and iterators referring to the elements of a
basic_string sequence may be invalidated by the following uses of that
basic_string object:
-- As an argument to non-member functions swap() (21.3.7.8),
operator>>() (21.3.7.9), and getline() (21.3.7.9).
-- As an argument to basic_string::swap().
-- Calling data() and c_str() member functions.
-- Calling non-const member functions, except operator[](), at(),
begin(), rbegin(), end(), and rend().
-- Subsequent to any of the above uses except the forms of insert() and
erase() which return iterators, the first call to non-const member
functions operator[](), at(), begin(), rbegin(), end(), or rend()."

And here's the logic:

- The five bullets in 21.3 para 5 describe when existing references,
pointers and iterators to the elements of a string may be invalidated.
- The fourth bullet describes calls to an object which may invalidate
any existing references, pointers or iterators to the elements of a
string.
- A constructor cannot be called on a string object to invalidate any
existing references, pointers or iterators to its elements, therefore
the fourth bullet does not apply to constructors.
- The fifth bullet applies subsequent to any of the uses mentioned in
the previous four bullets, which means that constructors are excluded
because they do not apply in the fourth bullet.
- This conclusion is independent of the status of constructors as const,
non-const, or some other kind of special member function.

Whilst I have no doubt that other sophistry may be thrown at this, that
is of less interest than actually trying to resolve the problem!
____________________________________________________________

  Kevlin Henney                   phone:  +44 117 942 2990
  Curbralan Ltd                   mobile: +44 7801 073 508
  kevlin@curbralan.com            fax:    +44 870 052 2289
____________________________________________________________







Author: Kevlin Henney <kevlin@curbralan.com>
Date: 2000/05/27
Raw View
In article <sirq2hkoo1137@news.supernews.com>, Andrei Alexandrescu
<andrewalex@hotmail.com> writes
>I still think a const-qualified constructor makes conceptual sense.
>Here const is *information* - somebody, somewhere, builds an object
>that is const. Call me a bad guy, but I think that discrimination by
>constness during construction/destruction is good - you can optimize
>stuff. It's information.
>
>Const constructors and const destructors? I'd love them. They'd mean
>"This guy is meant to be, respectively was, a constant object". Holds
>water to me. Ain't it?

Yup, I think this would make a real difference, which is why I made a
proposal last time round to get them in. However, I believe it was too
much of a core change too late in the day. Maybe next time :->
____________________________________________________________

  Kevlin Henney                   phone:  +44 117 942 2990
  Curbralan Ltd                   mobile: +44 7801 073 508
  kevlin@curbralan.com            fax:    +44 870 052 2289
____________________________________________________________







Author: "Bill Wade" <bill.wade@stoner.com>
Date: 2000/05/28
Raw View
"Kevlin Henney" <kevlin@curbralan.com> wrote in message
news:x4nFSEAdOSL5Ewgs@two-sdg.demon.co.uk...

> (1) Dropping the fifth sub-bullet entirely would eliminate all of these
> problems (I believe), but would also practically eliminate possibilities
> for reference counting. You could share representation so long as you
> never accessed it, ie using find_*, append, replace, etc functions but
> not any of the indexing or iterator access functions. [In this case,
> Andrei gets his wish: the mad COW would be well on the way to the
> incinerator ;->]

Eliminating bullet 5 makes COW significantly uglier (in particular c_str()
must either mark the string unshareable or not point the same place as
&*s.begin(), because if you remove bullet 5, even const begin() must make the
string unshareable.).

However you'd be surprised at what a COW addict could still do.

COW still works nicely for simple copying cases (return by value, or when
vector<string> changes capacity).  String supports the most common string
algorithms in forms that use indexes instead of iterators.  Indexes are much
less subject to invalidation than iterators.  You can get a single element's
value by index (meaning the string remains shared) using one of the forms of
copy().  You can change character values (breaking any current shares, but
leaving the string in a sharable state) using replace, push_back, etc.  If
you've gotten a string into an unshareable state you can make it shareable
again by swapping it with itself.

string is not a really good tool for making lots of modifications to a
sequence of characters.  That remains true even when strings don't support
COW, but COW does make it worse.  When you care about performance for those
operations, you should almost always prefer either vector (simple,
contiguous and has reserve()) or rope (rope == COW on steroids).  In rare
cases even deque or list might make sense.
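As a sketch of the index-only style described above: char_at, set_char and index_only_demo are hypothetical helper names, while copy(), replace() and push_back() are the standard basic_string members. None of these calls hands out a reference or iterator into the string, which is what lets a COW implementation keep the result shareable.

```cpp
#include <cassert>
#include <string>

// Read one character by index without calling operator[] or begin().
// Throws std::out_of_range if pos > s.size().
char char_at(const std::string& s, std::string::size_type pos) {
    char c = '\0';
    s.copy(&c, 1, pos);        // copies 1 character starting at pos
    return c;
}

// Overwrite one character by index through replace().
void set_char(std::string& s, std::string::size_type pos, char c) {
    s.replace(pos, 1, 1, c);   // replace 1 character at pos with 1 copy of c
}

std::string index_only_demo() {
    std::string s = "cow";
    std::string t = s;         // a COW library would share the buffer here
    assert(char_at(t, 1) == 'o');
    set_char(t, 0, 'n');       // a COW library would unshare at this point
    t.push_back('!');
    assert(s == "cow");        // the original is unaffected either way
    return t;
}
```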










Author: "Andrei Alexandrescu" <andrewalex@hotmail.com>
Date: 2000/05/24
Raw View
Christopher Eltschka <celtschk@physik.tu-muenchen.de> wrote in message
news:3929799C.4E957404@physik.tu-muenchen.de...
> Andrei Alexandrescu wrote:
[snip]
> > Similarly, operator+ can return a temp_string, or alternatively
> > can return an expression template. (I'd be happy enough with a
> > temp_string.)
>
> With operator+, there's a problem: the concatenated string
> inside operator+, which the temp_string would have to refer to,
> doesn't exist any more.

I think there's a misunderstanding. temp_string holds memory just like
string does.

> > You can do something similar with today's string, but who would do
> > that?
> >
> > s.substr(1, 100).swap(s); // efficient but... man!
>
> s.assign(s, 1, 100); // should work, shouldn't it?

You've got a point, but see below.

> > // btw s.swap(s.substr(1, 100)) doesn't work
> > // due to a rule that I basically think is wrong
> >
> > Or:
> >
> > // Old (inefficient) code: s = s1 + s2;
> > // New (efficient but clumsy)
> > (s1 + s2).swap(s);
>
> s.assign(s1 + s2);

Nope, that implies an extra copy.

> Or (probably more efficient):
>
> s = s1; s += s2;

I think (s1 + s2).swap(s) remains the most efficient (only one
allocation call etc.). Anyway, my goal was to make operator+ more
efficient without a loss in safety and expressivity, and I think there
are ways to do it.
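For reference, the swap-with-a-temporary idiom from this exchange written out as compilable code. The function names concat_by_swap and trim_by_swap are hypothetical; the sketch shows only that the results match ordinary assignment, and makes no claim about how many allocations a particular library performs.

```cpp
#include <cassert>
#include <string>

// s ends up owning the buffer built for the temporary (s1 + s2); the old
// contents of s are destroyed together with the temporary.
std::string concat_by_swap(const std::string& s1, const std::string& s2) {
    std::string s = "old contents";
    (s1 + s2).swap(s);
    assert(s == s1 + s2);
    return s;
}

// Same idiom with substr(); throws std::out_of_range if s is empty,
// because position 1 is then past the end.
std::string trim_by_swap(std::string s) {
    s.substr(1, 100).swap(s);
    return s;
}
```

Calling swap() on the temporary is allowed because a non-const member function may be invoked on a temporary of class type; the reverse form, s.swap(s.substr(1, 100)), is rejected because a temporary cannot bind to the non-const reference parameter.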


Andrei








Author: Kevlin Henney <kevlin@curbralan.com>
Date: 2000/05/24
Raw View
In article <3925753D.65CE1C73@giage.com>, Rob Stewart
<donotspamme@giage.com> writes
>kanze@gabi-soft.de wrote:
>>
>> Kevlin Henney <kevlin@curbralan.com> writes:
>>
>> |>      string original = "some arbitrary text", copy = original;
>> |>      const string &alias = original;
>>
>> |>      string::const_iterator i = alias.begin(), e = alias.end();
>> |>      for(string::iterator j = original.begin(); j != original.end(); ++j)
>>
>> Calling a non-const function (like begin, here), invalidates all
>> previous iterators.
>
>However accurate, that means that you can't iterate a string.  If
>you call begin() to get a start iterator, and then call end() to
>test it, you already invalidated the start iterator, so the
>comparison is undefined.

The statement was not accurate, and you can iterate a string, therefore
the comparison is well defined. The standard has reasonably careful
wording to allow typical use of non-const begin and end. The problem is
that it let a few key cases slip through.
____________________________________________________________

  Kevlin Henney                   phone:  +44 117 942 2990
  Curbralan Ltd                   mobile: +44 7801 073 508
  kevlin@curbralan.com            fax:    +44 870 052 2289
____________________________________________________________







Author: Kevlin Henney <kevlin@curbralan.com>
Date: 2000/05/24
Raw View
In article <3924E404.B3D9EBF4@lucent.com>, Michiel Salters
<salters@lucent.com> writes
>kanze@gabi-soft.de wrote:
>>
>> Kevlin Henney <kevlin@curbralan.com> writes:
>
>> |>      string::const_iterator i = alias.begin(), e = alias.end();
>> |>      for(string::iterator j = original.begin(); j != original.end(); ++j)
>
>> Calling a non-const function (like begin, here), invalidates all
>> previous iterators.
>
>Wait a minute.
>Does this mean that the first line is already causing undefined behavior?

No, because there are _no_ calls in my example which invalidated any
iterators according to the wording in the standard.

>I.e. string::const_iterator i = alias.begin(), e = alias.end();

And these are certainly excluded, as calls to const begin and end never
invalidate iterators.
____________________________________________________________

  Kevlin Henney                   phone:  +44 117 942 2990
  Curbralan Ltd                   mobile: +44 7801 073 508
  kevlin@curbralan.com            fax:    +44 870 052 2289
____________________________________________________________







Author: Alan Griffiths <alan@octopull.demon.co.uk>
Date: 2000/05/24
Raw View
In article <39299B6D.7E5E@esva.net>, Beman Dawes <beman@esva.net> writes
>Kevlin Henney wrote:
>>
>> It has always been assumed that strings could be reference counted and,
>
>It is really stronger than an assumption; the standard explicitly says:
>
>"These rules are formulated to allow, but not require, a reference
>counted implementation. A reference counted implementation must have the
>same semantics as a non-reference counted implementation."

And as a result of this non-normative footnote the assumption has been
made.  What we proved at the recent BSI panel meeting is that the costs
of a ref-counted implementation are significantly higher than previously
believed.

>> If I'm wrong, then great and all we need to do is clarify the wording in
>> 21.3 para 5 (in particular the English of the last bullet needs some
>> work), otherwise a more significant fix is required.
>
>It wouldn't hurt to clarify the wording of the last bullet item.  The
>basic idea is to say that in a reference counted implementation, "the
>first time an iterator escapes, the string has to be copied, as it is
>now subject to being modified."  IOW, the standard requires
>copy-on-anticipation-of-write if reference counting is used.  It was
>hard to come up with standardese to say that.

OK, we considered the possibility of permitting the first call to non-
const begin (etc) to reallocate (i.e. invalidate existing iterators).
But one problem case still exists:

    string s;
    const string& cs = s;
    s = "...";
    cs.begin() == s.begin();

Regardless of the change under discussion this will introduce undefined
behaviour (because as currently worded the last bullet now applies).
This is sufficiently surprising that we did not feel the language
standard should require it.
>
>> I can read that
>> para many ways, ranging from making reference counting mostly
>> impractical to effectively impossible.
>
>Well, there was an existence proof that it could be done; at the time of
>the addition of those words to the standard there was at least one
>existing implementation that used reference counting.  Whether it was
>practical or not is another question.

I have access to two such implementations - both fail to behave
correctly in the examples posted at the beginning of the thread.

...
>
>I'm missing something.  Doesn't in both cases the first alias.begin()
>force a copy, and thus everything works as expected?

No, the predicate of the last bullet is false - so a copy is disallowed
(fourth bullet).

>I'm also confused
>by "alias".  My understanding of the standard is that it is immaterial,
>yet I expect you included it because you thought it had some bearing on
>the issue?

It is there to provide a means to invoke the const version of begin.
The sequence of events is a call to const begin to obtain an iterator
and then a call to non-const begin which (in practice, but not in the
standard) unshares the string implementation - non-conformingly
invalidating the original iterator.
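
The sequence described above can be replayed with a toy class. This is only a
minimal sketch, assuming a copy-on-write string that shares its buffer through
std::shared_ptr and uses raw pointers as iterators; CowString and
first_char_seen_by_old_iterator are hypothetical names, not taken from any
vendor's library.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical toy class for illustration; not any vendor's basic_string.
class CowString {
public:
    typedef char* iterator;
    typedef const char* const_iterator;

    CowString(const char* text) : rep_(std::make_shared<std::string>(text)) {}

    // Const access never unshares: the pointer refers to the shared buffer.
    const_iterator begin() const { return rep_->data(); }
    const_iterator end() const { return rep_->data() + rep_->size(); }

    // Non-const access anticipates a write, so it unshares first.
    iterator begin() { unshare(); return &(*rep_)[0]; }
    iterator end() { unshare(); return &(*rep_)[0] + rep_->size(); }

private:
    void unshare() {
        if (rep_.use_count() > 1)
            rep_ = std::make_shared<std::string>(*rep_);  // take a private copy
    }
    std::shared_ptr<std::string> rep_;
};

// Replays the first example from the thread and returns the character that
// the iterator obtained through the const alias sees afterwards.
inline char first_char_seen_by_old_iterator() {
    CowString original("some arbitrary text"), copy = original;
    const CowString& alias = original;
    CowString::const_iterator i = alias.begin();  // const begin(): shared buffer
    for (CowString::iterator j = original.begin(); j != original.end(); ++j)
        *j = '*';                                  // original now owns a new buffer
    return *i;  // reads the old buffer, which only `copy` still owns
}
```

In this sketch the old iterator keeps reading the text that now belongs to
copy, while original holds the asterisks; the first example in the thread
expects the asterisks to be visible through that iterator as well.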

--
Alan Griffiths  (alan@octopull.demon.co.uk)  http://www.octopull.demon.co.uk/
ACCU Chairman   (chair@accu.org)             http://www.accu.org/







Author: "Bill Wade" <bill.wade@stoner.com>
Date: 2000/05/24
"Andrei Alexandrescu" <andrewalex@hotmail.com> wrote in message
news:sild9ggjo176@news.supernews.com...

> In my mt app ... I'd say - let's ban the mad cow and we're home free.

That is exactly a QOI issue.  The standard does not mandate COW vs. non-COW.
You'd be better off with non-COW.  Find an implementation with the qualities
that you want.

There are other problem domains where COW is better.  Giving the vendor the
freedom to make his implementation faster for a particular anticipated use
is very consistent with other aspects of the standard C++ library.









Author: Kevlin Henney <kevlin@curbralan.com>
Date: 2000/05/24
In article <8geted$ijb@library2.airnews.net>, Bill Wade
<bill.wade@stoner.com> writes
>"Andrei Alexandrescu" <andrewalex@hotmail.com> wrote in message
>news:sild9ggjo176@news.supernews.com...
>
>> In my mt app ... I'd say - let's ban the mad cow and we're home free.
>
>That is exactly a QOI issue.  The standard does not mandate COW vs. non-COW.

This is true, but great effort has been put into allowing lightweight
COW as an implementation possibility. However, this has been the source
of many wording bugs and changes in the past, present and -- inevitably
-- future, if the past is the best predictor of the future. This effort
has clearly not paid off, and a more conservative approach would produce
a cleaner standard and better implementations.

From what I can see, wording changes that would fix up the problems we
have identified will allow reference counted implementations but such
implementations will be mutually exclusive with pointer based iterators.
I think this is a fair compromise: if you want a "smart" implementation
use smart pointers (ie iterator classes), if you want a "dumb" one then
use dumb pointers (ie raw ones) :->
____________________________________________________________

  Kevlin Henney                   phone:  +44 117 942 2990
  Curbralan Ltd                   mobile: +44 7801 073 508
  kevlin@curbralan.com            fax:    +44 870 052 2289
____________________________________________________________







Author: Kevlin Henney <kevlin@curbralan.com>
Date: 2000/05/24
In article <8ge4tq$g9c@library1.airnews.net>, Bill Wade
<bill.wade@stoner.com> writes
[....]
>Could you clarify?  The only argument I can see along these lines would say
>that a constructor is not a non-const member function.  At best that seems
>to be sophistry.

Which way? I think it is perhaps sophistry to claim that a ctor is a
non-const member function :->

>Are constructors member functions?  12/1 "...
>constructor[s] ... are ... member functions."

If this is the piece of text I think it is, the missing words include
"special" and nowhere is non-const mentioned. Care should be taken in
omitting words from quotes.

>They certainly don't look
>like they are const.

And it definitely does not look like a non-const member function: non-
const member functions cannot be called on const objects.

>Perhaps you would argue that after
>
>  string s = "Hello";
>  string::iterator i = s.begin();
>  s.~string();
>  new(&s) string;
>
>i should still be valid since the only non-const member that was called is
>begin() and the lifetime section only talks about pointers and references,
>not iterators (and nothing in the standard says that iterators must be
>implemented in terms of either pointers or references)?

basic_string is still subject to sequence requirements, and so this
would be as valid as the same usage on a vector.

>> would
>> initially appear to solve the problem. Alas, when taken together with
>> the recent fix to comparing iterators it does not.
>
>What is this fix?  Looking at
>http://anubis.dkuug.dk/jtc1/sc22/wg21/docs/lwg-defects.html the only thing
>that looks like it comes close to causing a restriction is DR 51, but that
>only applies to containers, and I don't believe that std::basic_string is a
>container as the word is used in the standard.

According to the standard, std::basic_string is indeed a container
because it satisfies sequence requirements. Whether or not it is a good
container is a matter for separate debate ;->

The issue I was referring to is 179 in lwg-active: "Comparison of
const_iterators to iterators doesn't work", but it should and this was
fixed at the Tokyo mtg.

With that fix the following example becomes interesting:

        string s, &r = s;
        if(r.begin() == s.begin()) ....

In effect, it would appear that to make this work it is not enough to
fix basic_string's wording to include "since construction"; it must also
remove all of the iterator functions from having any effect on the
validity at all. This effectively bans reference-counted implementations
that use pointers for their iterators.
____________________________________________________________

  Kevlin Henney                   phone:  +44 117 942 2990
  Curbralan Ltd                   mobile: +44 7801 073 508
  kevlin@curbralan.com            fax:    +44 870 052 2289
____________________________________________________________







Author: Matt Austern <austern@sgi.com>
Date: 2000/05/24
"Andrei Alexandrescu" <andrewalex@hotmail.com> writes:

> In my mt app I use 1000 strings in its single-threaded part, and 10
> strings in the multi-threaded part. What to do? You could argue that
> the vendor could provide an additional non_ref_count_string class, or
> an additional template parameter to std::basic_string, but I'd say -
> let's ban the mad cow and we're home free.

I'm not a fan of reference counting either, but I think it's very
unlikely that the standardization committee will change the standard
to forbid a reference-counted basic_string.  For that matter, I also
think it's very unlikely that the committee will require a reference-
counted basic_string.

There are real-world implementations on important platforms that use
reference-counted strings.  The committee isn't going to turn around
and say that all those implementations are non-conforming.  If it did
something so reckless, the vendors would (justifiably) ignore it.

Engineers have to concern themselves with what's possible, and so do
standardization committees.







Author: Kevlin Henney <kevlin@curbralan.com>
Date: 2000/05/24
In article <39299B6D.7E5E@esva.net>, Beman Dawes <beman@esva.net> writes
>Kevlin Henney wrote:
>>
>> It has always been assumed that strings could be reference counted and,
>
>It is really stronger than an assumption; the standard explicitly says:
>
>"These rules are formulated to allow, but not require, a reference
>counted implementation. A reference counted implementation must have the
>same semantics as a non-reference counted implementation."

Yes, this was why I worded my sentence as I did: It has been assumed
that strings can be reference counted because a piece of non-normative
text in the standard says so and a great deal of effort was put in to
try and accommodate such implementations. This was the assumption, and
unfortunately the reality is different :-(

>It wouldn't hurt to clarify the wording of the last bullet item.  The
>basic idea is to say that in a reference counted implementation, "the
>first time an iterator escapes, the string has to be copied, as it is
>now subject to being modified."  IOW, the standard requires
>copy-on-anticipation-of-write if reference counting is used.  It was
>hard to come up with standardese to say that.

It's not the case that "it wouldn't hurt", in fact it now appears to be
not even a case of clarifying: The wording is wrong and does not handle
the cases that people thought it did. It is fixing rather than
clarifying :-}

While what you say may be the "spirit" of the standard, the standard
defines flesh rather than spirit, and the wording in 21.3 para 5 does
not say what you said in your text :-(

>Well, there was an existence proof that it could be done; at the time of
>the addition of those words to the standard there was at least one
>existing implementation that used reference counting.

Unfortunately, I do not think such an existence proof exists :-( If it's
the implementation that I am thinking of, then it does not conform to
the wording as it appears in the standard.

>>     // first example: "*******************" should be printed twice
>>     string original = "some arbitrary text", copy = original;
>>     const string &alias = original;
>>
>>     string::const_iterator i = alias.begin(), e = alias.end();
>>     for(string::iterator j = original.begin(); j != original.end(); ++j)
>>         *j = '*';
>>     while(i != e)
>>         cout << *i++;
>>     cout << endl;
>>     cout << original << endl;
>>
>>     // second example: "some arbitrary text" should be printed out
>>     string original = "some arbitrary text", copy = original;
>>     const string &alias = original;
>>
>>     string::const_iterator i = alias.begin();
>>     original.begin();
>>     while(i != alias.end())
>>         cout << *i++;
>
>I'm missing something.  Doesn't in both cases the first alias.begin()
>force a copy, and thus everything works as expected?

No, because they are the const begin members rather than the non-const
ones: only the non-const ones are given the opportunity to invalidate.

>I'm also confused
>by "alias".  My understanding of the standard is that it is immaterial,
>yet I expect you included it because you thought it had some bearing on
>the issue?

Yes, it is significant and material: it forces the const variants of the
iterator access functions to be called.
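
The role of the alias can be shown in isolation. A minimal sketch, assuming a
hypothetical Probe type whose two begin() overloads simply report which one
was selected; via_object and via_alias are illustrative helper names.

```cpp
#include <cassert>
#include <string>

// Hypothetical probe type: each overload reports which begin() was chosen.
struct Probe {
    std::string begin() { return "non-const"; }
    std::string begin() const { return "const"; }
};

// Calling through the object itself selects the non-const overload.
inline std::string via_object() {
    Probe p;
    return p.begin();
}

// Calling through a const reference to the same non-const object selects
// the const overload, which is what `alias` does in the examples.
inline std::string via_alias() {
    Probe p;
    const Probe& alias = p;
    return alias.begin();
}
```

Overload resolution looks at the type of the expression the member is called
on, not at whether the underlying object was declared const, so the alias is
enough to reach the const begin() of a modifiable string.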

>> I have tested this on three string implementations, two of which were
>> reference counted. The reference counted implementations gave
>> "surprising behaviour".
>
>Surprising that they worked, or surprising that they didn't:-?

:-)
____________________________________________________________

  Kevlin Henney                   phone:  +44 117 942 2990
  Curbralan Ltd                   mobile: +44 7801 073 508
  kevlin@curbralan.com            fax:    +44 870 052 2289
____________________________________________________________







Author: sirwillard@my-deja.com
Date: 2000/05/24
In article <9krGVDAlDOJ5EwQB@two-sdg.demon.co.uk>,
  Kevlin Henney <kevlin@curbralan.com> wrote:
> In article <86ya577gxi.fsf@gabi-soft.de>, kanze@gabi-soft.de writes
> >Kevlin Henney <kevlin@curbralan.com> writes:
> >|>      // first example: "*******************" should be printed twice
> >|>      string original = "some arbitrary text", copy = original;
> >|>      const string &alias = original;
> >
> >|>      string::const_iterator i = alias.begin(), e = alias.end();
> >|>      for(string::iterator j = original.begin(); j != original.end(); ++j)
> >
> >Calling a non-const function (like begin, here), invalidates all
> >previous iterators.
>
> Except that it does not. That is exactly the reason for my posting. 21.3
> para 5 in part states that
>
> "References, pointers, and iterators referring to the elements of a
> basic_string sequence may be invalidated by the following uses of that
> basic_string object:
> ....
> -- Calling non-const member functions, except operator[](), at(),
> begin(), rbegin(), end(), and rend()...."
>
> Hence the potential DR.

The next bullet reads:

"Subsequent to any of the above uses except the forms of insert() and
erase() which return iterators, the first call to non-const member
functions operator[](), at(), begin(), rbegin(), end(), or rend()."

This leaves the wiggle room required for the COW string to invalidate
the iterators here.  This is a "first call" situation that should
invalidate your iterators in the case of COW implementations.  I don't
see a DR, even if I don't like these rules.

--
William E. Kempf
Software Engineer, MS Windows Programmer








Author: James Kuyper <kuyper@wizard.net>
Date: 2000/05/24
Rob Stewart wrote:
>
> kanze@gabi-soft.de wrote:
> >
> > Kevlin Henney <kevlin@curbralan.com> writes:
> >
> > |>      string original = "some arbitrary text", copy = original;
> > |>      const string &alias = original;
> >
> > |>      string::const_iterator i = alias.begin(), e = alias.end();
> > |>      for(string::iterator j = original.begin(); j != original.end(); ++j)
> >
> > Calling a non-const function (like begin, here), invalidates all
> > previous iterators.
>
> However accurate, that means that you can't iterate a string.  If
> you call begin() to get a start iterator, and then call end() to
> test it, you already invalidated the start iterator, so the
> comparison is undefined.

Even when non-const, end() doesn't invalidate iterators, unless it's the
first call after another operation that also invalidates the iterators.







Author: sirwillard@my-deja.com
Date: 2000/05/25
In article <3929D496.1736AF9D@wizard.net>,
  James Kuyper <kuyper@wizard.net> wrote:
> sirwillard@my-deja.com wrote:
> >
> > In article <86ya577gxi.fsf@gabi-soft.de>,
> >   kanze@gabi-soft.de wrote:
> ....
> > > Calling a non-const function (like begin, here), invalidates all
> > > previous iterators.
> >
> > A very, very, very strong emphasis needs to be put on _CAN_ here (and
> > you left the word out altogether).  Calling a non-const function
>
> <nit-pick>The actual phrase left out was "may be", not "can".
> </nit-pick>

This is more than just a nit-pick, considering I made no attempt
whatsoever to quote the standard here, and the wording I used leads to
the same conclusion.

> Section 23.1 p5 goes on to say:

Now here is a more valid nit... it's not 23.1 but 21.3.

> "except operator[](), at(), begin(), rbegin(), end() and rend()."

I originally thought the next bullet point here left this point moot,
since it clearly specifies that all of the above may indeed invalidate
any previous iterators/pointers/references.  However, my mistake was in
missing the phrase "first call", as you pointed out.  So my overly
trivial example was a very bad one, since it is indeed guaranteed to
work properly according to the standard.  It still seems to me that
this point still leaves too much room for misuse leading to undefined
behavior in code that looks correct at first sight.

--
William E. Kempf
Software Engineer, MS Windows Programmer








Author: Kevlin Henney <kevlin@curbralan.com>
Date: 2000/05/25
In article <fxtg0r9qh8a.fsf@isolde.engr.sgi.com>, Matt Austern
<austern@sgi.com> writes
[...]
>There are real-world implementations on important platforms that use
>reference-counted strings.  The committee isn't going to turn around
>and say that all those implementations are non-conforming.  If it did
>something so reckless, the vendors would (justifiably) ignore it.

The standard has to be pragmatic, but the unfortunate truth is that it
appears these implementations are currently non-conforming. We can fix
the wording to allow reference-counted implementations, but I am not
sure that we can fix the wording to allow all existing implementations
to be valid.
____________________________________________________________

  Kevlin Henney                   phone:  +44 117 942 2990
  Curbralan Ltd                   mobile: +44 7801 073 508
  kevlin@curbralan.com            fax:    +44 870 052 2289
____________________________________________________________







Author: Kevlin Henney <kevlin@curbralan.com>
Date: 2000/05/25
In article <8gf360$78r$1@nnrp1.deja.com>, sirwillard@my-deja.com writes
>In article <9krGVDAlDOJ5EwQB@two-sdg.demon.co.uk>,
>  Kevlin Henney <kevlin@curbralan.com> wrote:
>> "References, pointers, and iterators referring to the elements of a
>> basic_string sequence may be invalidated by the following uses of that
>> basic_string object:
>> ....
>> -- Calling non-const member functions, except operator[](), at(),
>> begin(), rbegin(), end(), and rend()...."
>>
>> Hence the potential DR.
>
>The next bullet reads:
>
>"Subsequent to any of the above uses except the forms of insert() and
>erase() which return iterators, the first call to non-const member
>functions operator[](), at(), begin(), rbegin(), end(), or rend()."
>
>This leaves the wiggle room required for the COW string to invalidate
>the iterators here.  This is a "first call" situation that should
>invalidate your iterators in the case of COW implementations.  I don't
>see a DR, even if I don't like these rules.

I quoted this follow-on bullet elsewhere in this thread, and
unfortunately it does not allow the wriggle room required. This is the
wording that is at fault.

It does not describe the "first call" situation I presented in my code:
it describes the first call _since_ another invalidating call
("Subsequent to any of the above...") but not since construction. Thus a
minimum DR would propose a fix to include "first call since
construction".

However, that would still leave the following code in something of a
mess (which I will post correctly this time, honest!):

        string s;
        const string &r = s;
        if(r.begin() == s.begin()) ....

To fix this requires the removal of "begin(), rbegin(), end(), or
rend()", leaving just "operator[]() or at()". As I mentioned elsewhere
in the thread, this allows COWs but keeps them a little more sane by
preventing pointer-iterator access.
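
One way to picture a reference-counted string without pointer iterators is an
iterator class that remembers the string object and a position rather than an
address. This is only a sketch under that assumption; CowText, ConstIter and
Iter are hypothetical names, and real iterator requirements (categories,
reference types, full arithmetic) are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

class CowText;  // hypothetical toy string, defined below

// Read-only iterator: stores the owning string and an index, not an address.
class ConstIter {
public:
    ConstIter(const CowText* owner, std::size_t pos) : owner_(owner), pos_(pos) {}
    char operator*() const;  // defined after CowText
    ConstIter& operator++() { ++pos_; return *this; }
    const CowText* owner() const { return owner_; }
    std::size_t pos() const { return pos_; }
private:
    const CowText* owner_;
    std::size_t pos_;
};

inline bool operator==(const ConstIter& a, const ConstIter& b) {
    return a.owner() == b.owner() && a.pos() == b.pos();
}
inline bool operator!=(const ConstIter& a, const ConstIter& b) { return !(a == b); }

// Mutable iterator: writes go through the owner, which unshares only then.
class Iter {
public:
    Iter(CowText* owner, std::size_t pos) : owner_(owner), pos_(pos) {}
    void put(char c) const;  // defined after CowText
    Iter& operator++() { ++pos_; return *this; }
    operator ConstIter() const { return ConstIter(owner_, pos_); }
private:
    CowText* owner_;
    std::size_t pos_;
};

class CowText {
public:
    CowText(const char* text) : rep_(std::make_shared<std::string>(text)) {}
    ConstIter begin() const { return ConstIter(this, 0); }
    Iter begin() { return Iter(this, 0); }  // handing out an iterator copies nothing
    char get(std::size_t i) const { return (*rep_)[i]; }
    void set(std::size_t i, char c) { unshare(); (*rep_)[i] = c; }
    bool shared() const { return rep_.use_count() > 1; }
private:
    void unshare() {
        if (rep_.use_count() > 1)
            rep_ = std::make_shared<std::string>(*rep_);
    }
    std::shared_ptr<std::string> rep_;
};

inline char ConstIter::operator*() const { return owner_->get(pos_); }
inline void Iter::put(char c) const { owner_->set(pos_, c); }
```

With iterators of this kind, r.begin() == s.begin() compares an owner and a
position, and neither begin() has to unshare; the copy is deferred to the
actual write, and iterators obtained earlier continue to follow the string.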
____________________________________________________________

  Kevlin Henney                   phone:  +44 117 942 2990
  Curbralan Ltd                   mobile: +44 7801 073 508
  kevlin@curbralan.com            fax:    +44 870 052 2289
____________________________________________________________







Author: "Bill Wade" <bill.wade@stoner.com>
Date: 2000/05/25
"Alan Griffiths" <alan@octopull.demon.co.uk> wrote in message
news:BaQryAA6FuK5EwNR@octopull.demon.co.uk...

> The sequence of events is a call to const begin to obtain an iterator
> and then a call to non-const begin which (in practice, but not in the
> standard) unshares the string implementation - non-conformingly
> invalidating the original iterator.

I don't understand why you say "not in the standard."  The fifth bullet
specifically says that the first call to one of a set of several members,
including non-const begin(), invalidates existing iterators.

The bullets seem to be consistent with a model that has a constructed string
in one of three states:

1) Shareable but unshared.
2) Shareable and shared.
3) Unshareable.

Bullet five describes the transition to state 3.  As long as there is a
valid non-const iterator or reference into the string, the string is
unshareable.  Once all such iterators are invalidated (such as by a call to
c_str()) the string may be put into one of the other states.
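
The three states can be written down as a small transition table. This is a
rough sketch of the model as described here, not of any implementation; the
function names are hypothetical, and the choice of state after c_str() is one
of the options the model leaves open.

```cpp
#include <cassert>

enum State { ShareableUnshared, ShareableShared, Unshareable };

// Another string is copy-constructed or assigned from this one. A shareable
// string starts sharing its buffer; an unshareable one must hand the new
// string a deep copy and stays as it is.
inline State after_copy_made(State s) {
    return s == Unshareable ? Unshareable : ShareableShared;
}

// First non-const begin(), end(), operator[]() and so on (bullet five):
// the string takes a private buffer if needed and stops being shareable.
inline State after_non_const_access(State) {
    return Unshareable;
}

// c_str() invalidates outstanding iterators and references, so an
// unshareable string may become shareable again (assumed here: it does).
inline State after_c_str(State s) {
    return s == Unshareable ? ShareableUnshared : s;
}
```

A string that is copied, then accessed through non-const begin(), then asked
for c_str() would move through states 2, 3 and 1 in that order.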








Author: philhibbs@my-deja.com
Date: 2000/05/25
> Alan Griffiths  (alan@octopull.demon.co.uk):
> It is there to provide a means to invoke the const version of begin.
> The sequence of events is a call to const begin to obtain an iterator
> and then a call to non-const begin which (in practice, but not in the
> standard) unshares the string implementation - non-conformingly
> invalidating the original iterator.

When I first spotted this problem, it seemed totally theoretical to me,
as I could not see a way of calling const begin() on a non-const
string. However, Kevlin came up with the "const reference to a non-
const string" mechanism. I am utterly astonished that anyone with a
decent grasp of the STL could write a reference-counted basic_string
without coming across this potential problem. I was just reading
through Josuttis, learning about the STL and iterators, and this
question just seemed obvious to me. Be ashamed, be very ashamed!

philip.hibbs@tnt.co.uk
phil@snark.freeserve.co.uk








Author: philhibbs@my-deja.com
Date: 2000/05/25
Kevlin:
>-- Subsequent to any of the above uses except the forms of
>insert() and erase() which return iterators, the first call
>to non-const member functions operator[](), at(), begin(),
>rbegin(), end(), or rend()."

I'm not sure that I understand this point, what does it mean?

Does it mean that the first call to begin() might invalidate existing
iterators? If so, your example programs are behaving correctly!

philip.hibbs@tnt.co.uk
phil@snark.freeserve.co.uk








Author: Kevlin Henney <kevlin@curbralan.com>
Date: 2000/05/25
I wrote:
>With that fix the following example becomes interesting:
>
>        string s, &r = s;
>        if(r.begin() == s.begin()) ....

Which is nonsense! What I meant to write was:

        string s;
        const string &r = s;
        if(r.begin() == s.begin()) ....

At which point the following text makes sense:

>In effect, it would appear that to make this work it is not enough to
>fix basic_string's wording to include "since construction"; it must
>also remove all of the iterator functions from having any effect on the
>validity at all. This effectively bans reference-counted
>implementations that use pointers for their iterators.

Sorry for the finger trouble.
____________________________________________________________

  Kevlin Henney                   phone:  +44 117 942 2990
  Curbralan Ltd                   mobile: +44 7801 073 508
  kevlin@curbralan.com            fax:    +44 870 052 2289
____________________________________________________________







Author: Rob Stewart <donotspamme@giage.com>
Date: 2000/05/25
Kevlin Henney wrote:
>
> In article <3925753D.65CE1C73@giage.com>, Rob Stewart
> <donotspamme@giage.com> writes
> >kanze@gabi-soft.de wrote:
> >>
> >> Kevlin Henney <kevlin@curbralan.com> writes:
> >>
> >> |>      string::const_iterator i = alias.begin(), e = alias.end();
> >> |>      for(string::iterator j = original.begin(); j != original.end(); ++j)
> >>
> >> Calling a non-const function (like begin, here), invalidates all
> >> previous iterators.
> >
> >However accurate, that means that you can't iterate a string.  If
> >you call begin() to get a start iterator, and then call end() to
> >test it, you already invalidated the start iterator, so the
> >comparison is undefined.
>
> The statement was not accurate, and you can iterate a string, therefore
> the comparison is well defined.

Sorry, I was being facetious.

--
Robert Stewart     |  rob-at-giage-dot-com
Software Engineer  |  using std::disclaimer;
Giage, Inc.        |  http://www.giage.com







Author: Kevlin Henney <kevlin@curbralan.com>
Date: 2000/05/25
In article <8gg9rb$1lk$1@nnrp1.deja.com>, philhibbs@my-deja.com writes
>Kevlin:
>>-- Subsequent to any of the above uses except the forms of
>>insert() and erase() which return iterators, the first call
>>to non-const member functions operator[](), at(), begin(),
>>rbegin(), end(), or rend()."
>
>I'm not sure that I understand this point, what does it mean?

This is the paragraph written in not-so-plain-English that is causing
some of the problems :->

>Does it mean that the first call to begin() might invalidate existing
>iterators? If so, your example programs are behaving correctly!

Leaving aside the references to insert() and erase(), the plain(er)
English translation would be

"The first call to non-const member functions operator[](), at(),
begin(), rbegin(), end(), or rend(), following a call to any one of the
previously mentioned functions [ie the previous four bullets] may
invalidate references, pointers, and iterators referring to the elements
of a basic_string."

Which means that the first call to them since construction is not
covered, hence (part of) the problem. In the examples I gave none of the
conditions for invalidation applied.
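
The translation above amounts to a one-bit piece of bookkeeping. A minimal
sketch of that reading, ignoring the insert() and erase() exception;
InvalidationModel and its member names are hypothetical and exist only to
make the rule concrete.

```cpp
#include <cassert>

// Hypothetical model of the reading given above: it tracks only whether the
// next non-const accessor call is permitted to invalidate.
class InvalidationModel {
public:
    // Construction is not one of "the above uses", so nothing is pending.
    InvalidationModel() : pending_(false) {}

    // Any use listed in the earlier bullets, e.g. assignment or c_str().
    void invalidating_use() { pending_ = true; }

    // Non-const operator[](), at(), begin(), rbegin(), end() or rend().
    // Returns true if this particular call may invalidate.
    bool non_const_access() {
        bool may_invalidate = pending_;
        pending_ = false;  // only the first call after such a use qualifies
        return may_invalidate;
    }

private:
    bool pending_;
};
```

On a freshly constructed object the first non_const_access() returns false,
which is the uncovered case; after invalidating_use() the next call returns
true and the one after that returns false again.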
____________________________________________________________

  Kevlin Henney                   phone:  +44 117 942 2990
  Curbralan Ltd                   mobile: +44 7801 073 508
  kevlin@curbralan.com            fax:    +44 870 052 2289
____________________________________________________________







Author: "John Hickin" <hickin@nortelnetworks.com>
Date: 2000/05/25
Bill Wade wrote:
>

>
> That is exactly a QOI issue.  The standard does not mandate COW vs. non-COW.
> You'd be better off with non-COW.  Find an implementation with the qualities
> that you want.
>
> There are other problem domains where COW is better.  Giving the vendor the
> freedom to make his implementation faster for a particular anticipated use
> is very consistent with other aspects of the standard C++ library.
>

If allocators were not _broken_ in the Standard Library this detail
might have been taken care of by making COW a property of the
basic_string's allocator.

Regards, John.







Author: "Andrei Alexandrescu" <andrewalex@hotmail.com>
Date: 2000/05/25
Kevlin Henney <kevlin@curbralan.com> wrote in message
news:NTr6YXAp25K5Ewdi@two-sdg.demon.co.uk...
> This is true, but great effort has been put into allowing lightweight
> COW as an implementation possibility. However, this has been the source
> of many wording bugs and changes in the past, present and -- inevitably
> -- future, if the past is the best predictor of the future. This effort
> has clearly not paid off, and a more conservative approach would produce
> a cleaner standard and better implementations.

That was my point: the cow does not pull its weight. (What a fat cow.)
We (GotW) discovered that mt cow sucks. Now we discover that, even if
the standard made a conscious effort for allowing the cow, it failed
in doing so.

Moreover, all strings in all programs are six characters on the
average :o).

Ban the mad cow, and the clients and the library vendors will be
happy. Well, to be realistic, that won't happen. But at least there
has to be a crazy one making waves, otherwise nothing happens.


Andrei








Author: "Bill Wade" <bill.wade@stoner.com>
Date: 2000/05/25
Raw View
"Kevlin Henney" <kevlin@curbralan.com> wrote in message
news:Ux2cAgAeirK5Ew8H@two-sdg.demon.co.uk...
> In article <8ge4tq$g9c@library1.airnews.net>, Bill Wade
> <bill.wade@stoner.com> writes

> >Are constructors member functions?  12/1 "...
> >constructor[s] ... are ... member functions."
>
> If this is the piece of text I think it is, the missing words include
> "special" and nowhere is non-const mentioned.

True, but in English I expect that "special member functions" are "member
functions."

> Care should be taken in
> omitting words from quotes.

True.  I'm certainly not entirely comfortable in calling constructors member
functions.  However  I can't find a clear statement in the standard that
they aren't.  As far as I can tell that particular sentence calls the
default constructor and copy constructor member functions.  A Note to
9.3.2/5 calls all constructors functions.  9.3/1 says that functions (other
than friends) declared in class bodies are member functions.

Note that most (but not all: exceptions deal with typedef of constructors
and cv issues) of the rules in 9.3 "Member functions" traditionally apply to
constructors even though they don't explicitly say so, and the rules are
not repeated anywhere else.

> >They certainly don't look like they are const.
>
> And it definitely does not look like a non-const member function: non-
> const member functions cannot be called on const objects.

But non-const member functions can create const objects.  There is no const
object until the constructor has completed.  Obviously we could debate all
day about the destructor.

We could probably have a similar debate about whether or not
  'char' is 'non-signed char'
in an environment where char is signed.

I'll absolutely agree that a clearer statement in 21.3/5 bullet 4  could
have saved us both a lot of typing.

> basic_string is still subject to sequence requirements

I'd missed that.  Good.  Now I can complain when an implementation's
string::push_back is not amortized O(1).  Now let's work on ostringstream.

> The issue I was referring to is 179 in lwg-active: "Comparison of
> const_iterators to iterators doesn't work", but it should and this was
> fixed at the Tokyo mtg.
>
> With that fix the following example becomes interesting:
>
>         string s, &r = s;
>         if(r.begin() == s.begin()) ....
>
> In effect, it would appear that to make this work it is not enough to
> fix basic_string's wording to include "since construction"; it must also
> remove all of the iterator functions from having any effect on the
> validity at all. This effectively bans reference-counted implementations
> that use pointers for their iterators.

This particular example does work.  There is only one object involved.
Sequence point rules ensure that only one of the begin() calls is the first
one, and the second begin() call does not get to invalidate any iterators
(in particular the one returned by the first call).

If you change r to be reference to const string (I see you fixed this in
another post), this does not work because the evaluation order is not
specified, and with the wrong order the r.begin() result is unusable.  That
is certainly "surprising" if your mental model treats string as a
vector<char>, but it is what the standard says.  I don't see this particular
example in the public version of issue 179.  Are you saying that this
(fixed) example will become normative text?

I don't think you'd be surprised to find that the following doesn't work

  vector<char> s;
  const vector<char>& r = s;
  if(r.begin() == s.insert(s.begin(), 'x'))

because you know when vector::insert can invalidate iterators.  You won't be
surprised that the const/non-const version of your example fails if you know
when string::begin can invalidate iterators.
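
A minimal sketch of the sequencing point above, assuming a C++11 compiler (so that iterator and const_iterator compare directly) and a made-up function name: if the mutating call gets its own statement, and the comparison uses the iterator that insert() returns, the result no longer depends on evaluation order.

```cpp
#include <cassert>
#include <vector>

// Returns true when the const view and the mutating call agree on where
// the sequence starts. The insert happens in its own statement, so no
// iterator obtained before it is ever used afterwards.
bool begin_matches_after_insert() {
    std::vector<char> s(3, 'a');
    const std::vector<char>& r = s;

    // Mutate first; insert() returns an iterator to the new element.
    std::vector<char>::iterator pos = s.insert(s.begin(), 'x');
    assert(pos == s.begin());

    // Only now ask the const view for its begin().
    return r.begin() == pos && *r.begin() == 'x';
}
```

The same discipline applies to the string case: make the potentially invalidating call first, then take the iterators that are going to be compared.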

The standard traded safety for implementation freedom, here and in many
other places.  We may not agree with the decision, but I believe that it is
wrong to correct it as a DR.








Author: "Marco Dalla Gasperina" <marcodg@home.com>
Date: 2000/05/25
Raw View
"Kevlin Henney" <kevlin@curbralan.com> wrote in message
news:s34yhnAnb9K5Ew7T@two-sdg.demon.co.uk...
> However, that would still leave the following code in something of a
> mess (which I will post correctly this time, honest!):
>
>         string s;
>         const string &r = s;
>         if(r.begin() == s.begin()) ....

This is a fascinating discussion...

I can't figure out (given all the words that have been flying)
if the following is well defined:

    string s;
    const string &r = s;

    s[1] = r[2];

The culprit sequence would be
1)  const char& t1 = r.operator[](2);
2)  char& t2 = s.operator[](1); // Invalidating the previous reference??
3)  char c = t1;    // boom?
4)  t2 = c;
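
One way to sidestep the question, sketched under the assumption that both indices are in range: take the value through the const reference in its own statement, so that the read has finished before the non-const operator[] gets a chance to unshare anything. The function name is hypothetical.

```cpp
#include <cassert>
#include <string>

// Copies character `from` to position `to` without ever holding a reference
// that was obtained before the non-const operator[] call.
std::string copy_char(std::string s, std::string::size_type to,
                      std::string::size_type from) {
    assert(to < s.size() && from < s.size());
    const std::string& r = s;

    const char c = r[from];  // const operator[]: the read completes here
    s[to] = c;               // non-const operator[]: may unshare, nothing stale is read
    return s;
}
```

For example, copy_char("abcdef", 1, 2) yields "accdef" whatever order the implementation would have picked for the single-expression form.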

thanks,
marco








Author: "Bill Wade" <bill.wade@stoner.com>
Date: 2000/05/25
Raw View
<philhibbs@my-deja.com> wrote in message news:8gg9rb$1lk$1@nnrp1.deja.com...
> Kevlin:
> >-- Subsequent to any of the above uses except the forms of
> >insert() and erase() which return iterators, the first call
> >to non-const member functions operator[](), at(), begin(),
> >rbegin(), end(), or rend()."
>
> I'm not sure that I understand this point, what does it mean?
>
> Does it mean that the first call to begin() might invalidate existing
> iterators? If so, your example programs are behaving correctly!

If a constructor is a non-const member function (as described in one of the
other bullets) Kevlin's programs can't behave "incorrectly" since they
invoke undefined behavior.  If a program invokes undefined behavior anything
the implementation does is "correct."

If a constructor is something other than a "non-const member function" I
would say that according to the standard Kevlin's examples are well behaved,
and the implementations he tested are broken.  In this case either

  1) Constructors must put strings into an unshareable state (which would
significantly reduce the value of COW) or
  2) The standard is defective in not grouping constructors with other
non-const functions.

An underlying current of these discussions is a hope that the standard will
back away from COW support.  COW can be a big help to single-threaded
programs that mostly treat strings as "entities" rather than as arrays to
manipulate on a per-character basis.  COW becomes much less attractive in
typical MT environments, and it makes iterator/reference access to the
characters much more dangerous.









Author: James Kuyper <kuyper@wizard.net>
Date: 2000/05/25
Raw View
John Hickin wrote:
...
> If allocators were not _broken_ in the Standard Library this detail
> might have been taken care of by making COW a property of the
> basic_string's allocator.

I'm not saying that allocators aren't broken, but I'd be interested in
knowing the particular features of them that you consider broken.
Personally, my least favorite parts are sections 20.1.5 p4 and p5, but I
don't think that counts as broken, just extremely frustrating.







Author: "John Hickin" <hickin@nortelnetworks.com>
Date: 2000/05/25
Raw View
James Kuyper wrote:
>

>
> I'm not saying that allocators aren't broken, but I'd be interested in
> knowing the particular features of them that you consider broken.
> Personally, my least favorite parts are sections 20.1.5 p4 and p5, but I
> don't think that counts as broken, just extremely frustrating.
>

All instances of the same allocator are allowed to be treated as
equivalent. In my opinion this means that they are _broken_. In yours,
frustrating. The standard goes to great lengths to make sure that
sufficient flexibility is available to the programmer to get various
jobs done. This flexibility isn't (yet) available with the allocator
specification. It is a definite shortcoming. Without the caveat in the
standard a defect report would surely have been opened.

Let me rephrase, then, to say that a sufficiently flexible specification
of allocators might suffice to make the use of COW a property of
basic_string<>'s allocator. I think that there would continue to be
unanticipated fallout from such a scheme but the ability to choose for
or against a COW representation would be left to the programmer and not
the implementation.

Regards, John.







Author: Kevlin Henney <kevlin@curbralan.com>
Date: 2000/05/19
Raw View
It has always been assumed that strings could be reference counted and,
in spite of the hazards of using this technique in a multi-threaded
environment, this is indeed a common implementation.

Constraints on reference types, ie they cannot be proxies, means that
the standard already reduces the degree to which reference counting is
effective. A recent question about iterator validity (by Philip Hibbs on
the ACCU mailing list) leads me to believe that the current wording in
the standard makes reference counting completely impractical, and that
existing implementations are therefore broken.

If I'm wrong, then great and all we need to do is clarify the wording in
21.3 para 5 (in particular the English of the last bullet needs some
work), otherwise a more significant fix is required. I can read that
para many ways, ranging from making reference counting mostly
impractical to effectively impossible.

You might want to check my thinking on this, so here's a couple of
pieces of code that demonstrate what I believe to be the wrong
behaviour:

    // first example: "*******************" should be printed twice
    string original = "some arbitrary text", copy = original;
    const string &alias = original;

    string::const_iterator i = alias.begin(), e = alias.end();
    for(string::iterator j = original.begin(); j != original.end(); ++j)
        *j = '*';
    while(i != e)
        cout << *i++;
    cout << endl;
    cout << original << endl;


    // second example: "some arbitrary text" should be printed out
    string original = "some arbitrary text", copy = original;
    const string &alias = original;

    string::const_iterator i = alias.begin();
    original.begin();
    while(i != alias.end())
        cout << *i++;

I have tested this on three string implementations, two of which were
reference counted. The reference counted implementations gave
"surprising behaviour".
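
To make the mechanism behind the first example concrete, here is a toy copy-on-write string. It is a hypothetical sketch (it uses C++11's std::shared_ptr for the count and is not modelled on any of the tested implementations): const begin()/end() hand out pointers into the shared buffer, while non-const begin() unshares first, so the earlier const iterators are left pointing at the old buffer.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Toy copy-on-write string. Hypothetical: it models the sharing strategy
// only and is not taken from any real library.
class CowString {
public:
    CowString(const char* s) : buf_(std::make_shared<std::string>(s)) {}

    // const accessors hand out pointers into the (possibly shared) buffer
    const char* begin() const { return buf_->data(); }
    const char* end() const { return buf_->data() + buf_->size(); }

    // non-const accessors must unshare first, since the caller may write
    char* begin() { detach(); return &(*buf_)[0]; }
    char* end() { detach(); return &(*buf_)[0] + buf_->size(); }

    std::string str() const { return *buf_; }

private:
    void detach() {
        if (buf_.use_count() > 1)
            buf_ = std::make_shared<std::string>(*buf_);  // private copy
        assert(buf_.use_count() == 1);
    }
    std::shared_ptr<std::string> buf_;
};

struct Seen {
    std::string via_old_iterators;  // what i..e still denote
    std::string original;           // what the string now holds
};

Seen first_example() {
    CowString original("some arbitrary text");
    CowString copy = original;           // shares the buffer
    const CowString& alias = original;

    const char* i = alias.begin();       // points into the shared buffer
    const char* e = alias.end();
    for (char* j = original.begin(); j != original.end(); ++j)
        *j = '*';                        // first begin() moved original elsewhere

    Seen out;
    out.via_old_iterators.assign(i, e);  // old buffer is kept alive by `copy`
    out.original = original.str();
    (void)copy;
    return out;
}
```

In this toy the old iterators still read "some arbitrary text" while original holds the asterisks. The read is only well defined here because copy keeps the old buffer alive; a real implementation gives no such guarantee.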

Thoughts?
____________________________________________________________

  Kevlin Henney                   phone:  +44 117 942 2990
  Curbralan Ltd                   mobile: +44 7801 073 508
  kevlin@curbralan.com            fax:    +44 870 052 2289
____________________________________________________________







Author: kanze@gabi-soft.de
Date: 2000/05/19
Raw View
Kevlin Henney <kevlin@curbralan.com> writes:

|>  You might want to check my thinking on this, so here's a couple of
|>  pieces of code that demonstrate what I believe to be the wrong
|>  behaviour:

|>      // first example: "*******************" should be printed twice
|>      string original = "some arbitrary text", copy = original;
|>      const string &alias = original;

|>      string::const_iterator i = alias.begin(), e = alias.end();
|>      for(string::iterator j = original.begin(); j != original.end(); ++j)

Calling a non-const function (like begin, here), invalidates all
previous iterators.

|>          *j = '*';
|>      while(i != e)
|>          cout << *i++;

Which means that this loop is undefined behavior.  In practice, there is
a good chance that e will still point to the previous memory (which
hasn't been freed, since it is still in use in copy), so you get off
with just unexpected output.

|>      cout << endl;
|>      cout << original << endl;


|>      // second example: "some arbitrary text" should be printed out
|>      string original = "some arbitrary text", copy = original;
|>      const string &alias = original;

|>      string::const_iterator i = alias.begin();
|>      original.begin();

Again, you've just invalidated i.

|>      while(i != alias.end())
|>          cout << *i++;

And in this case, a core dump or some really strange output wouldn't
surprise me.

|>  I have tested this on three string implementations, two of which were
|>  reference counted. The reference counted implementations gave
|>  "surprising behaviour".

Once you invoke undefined behavior, nothing should surprise you.  (Of
course, one *could* be surprised that the standard would define
std::string in such a way as to make it almost as dangerous as the char*
it replaces.)

--
James Kanze                               mailto:kanze@gabi-soft.de
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627







Author: Robert Klemme <robert.klemme@myview.de>
Date: 2000/05/20
Raw View
i am a bit puzzled by your first example.  maybe you clarify.

>     // first example: "*******************" should be printed twice
>     string original = "some arbitrary text", copy = original;

copy is not used in this example.  did i miss something?

>     const string &alias = original;
>
>     string::const_iterator i = alias.begin(), e = alias.end();
>     for(string::iterator j = original.begin(); j != original.end(); ++j)
>         *j = '*';
>     while(i != e)
>         cout << *i++;
>     cout << endl;
>     cout << original << endl;


and another one:

std::string a, b;

a       = "12345";
char& c = a[ 3 ];
b       = a;
c       = '7';

what value does b have?  is it allowed to change?  is a allowed to
change?
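
On a plain reading of the invalidation rules, b has to stay "12345": the reference c was handed out before the assignment, nothing has invalidated it since, and so a write through it may only change a. A reference-counted implementation therefore has to treat a as unshareable once a[3] has escaped. A small check of that reading, with a made-up function name:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Returns (a, b) after writing through a reference taken before the copy.
std::pair<std::string, std::string> write_through_old_reference() {
    std::string a, b;

    a       = "12345";
    char& c = a[3];   // reference escapes before the copy is made
    b       = a;      // b must not observe later writes through c
    c       = '7';

    assert(&c == &a[3]);  // c still refers to a's own storage
    return std::make_pair(a, b);
}
```

A conforming library should leave a as "12375" and b as "12345".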

table 43 in section 21.3 of the standard says:

data(): points at the first element of an
allocated copy of the array whose
first element is pointed at by
str.data()

this sounds to me like the buffer HAS to be copied because it says
"allocated copy".  comments?

regards

 robert


--
Robert Klemme
Software Engineer
-------------------------------------------------------------
myview technologies GmbH & Co. KG
Riemekestraße 160 ~ D-33106 Paderborn ~ Germany
E-Mail: mailto:robert.klemme@myview.de
Telefon: +49/5251/69090-321 ~ Fax: +49/5251/69090-399
-------------------------------------------------------------







Author: "Andrei Alexandrescu" <andrewalex@hotmail.com>
Date: 2000/05/22
Raw View
Alan Griffiths <alan@octopull.demon.co.uk> wrote in message
news:qznRyCAp$AK5Ewrc@octopull.demon.co.uk...
[snip]
> Supporting this exclusion is what causes problems for shared body
> implementations.  (And is therefore the point of the original posting.)

I think indeed Kevlin discovered a problem in string, and I'm soooo
glad.

I would like to dwell on a related subject - std::string, cows, and
multithreading.

The fact that many string implementations of today use cow is rather
disturbing to me. In many modern programs, multithreading is the norm.
I do Internet stuff; here, due to network latency, you *must* use
multithreading. (By the way, an asynchronous approach might be more
efficient sometimes, but it's harder to program - you have to maintain
more state - and it's nonportable across Berkeley Sockets
implementations.)

Cow and multithreading do not work together - see GotW.

Frankly, I would like to see the mad cow banned. (A good side effect
will be that vendors will finally implement efficient multithreaded
std::strings.) We can do better than cow for strings - see below.

What I would like to get instead, is introducing a "temporary string"
class, and maybe to allow expression templates with std::string.

Consider this:

s = s.substr(1, 100);

The return value of substring is string, and it shouldn't be. It
should be a distinct type called temp_string, that makes it clear that
it's about a temporary object. This way string::operator= is given a
chance at discriminating between an assignment of a full-fledged
string and a temporary string that's the result of a function. The
latter would just efficiently swap the guts of the temporary with the
guts of the destination.
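
A rough sketch of how that might look, under stated assumptions: Str and TempStr are hypothetical stand-ins for string and temp_string, std::string is used only as the underlying buffer, and the two counters exist purely to show which assignment was chosen.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

class Str;  // stands in for the full-fledged string

// Hypothetical "temporary string": only ever produced as a function result.
class TempStr {
public:
    explicit TempStr(const std::string& s) : guts_(s) {}
private:
    friend class Str;
    std::string guts_;
};

class Str {
public:
    Str(const char* s) : guts_(s) {}

    // Assignment from a full-fledged string: has to copy the characters.
    Str& operator=(const Str& other) {
        guts_ = other.guts_;
        ++deep_copies;
        return *this;
    }

    // Assignment from a temporary: swap buffers instead of copying.
    Str& operator=(TempStr t) {
        guts_.swap(t.guts_);
        ++swaps;
        return *this;
    }

    TempStr substr(std::size_t pos, std::size_t n) const {
        assert(pos <= guts_.size());
        return TempStr(guts_.substr(pos, n));
    }

    const std::string& value() const { return guts_; }

    static int deep_copies;  // bookkeeping for the demonstration only
    static int swaps;
private:
    std::string guts_;
};

int Str::deep_copies = 0;
int Str::swaps = 0;

std::string demo_temp_assign() {
    Str s("The quick brown fox");
    s = s.substr(4, 5);  // selects operator=(TempStr)
    return s.value();
}
```

Overload resolution does the discriminating: the result of substr() is a TempStr, so the swapping assignment is the only viable one and the natural syntax is kept.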

Similarly, operator+ can return a temp_string, or alternatively can
return an expression template. (I'd be happy enough with a
temp_string.)

You can do something similar with today's string, but who would do
that?

s.substr(1, 100).swap(s); // efficient but... man!
// btw s.swap(s.substr(1, 100)) doesn't work
// due to a rule that I basically think is wrong

Or:

// Old (inefficient) code: s = s1 + s2;
// New (efficient but clumsy)
(s1 + s2).swap(s);
// again you cannot at least say s.swap(s1 + s2)
// (which would look slightly better)
// due to the same rule that I think is wrong

I'd like the standard to leave place to implementations that do things
like temp_string and/or expression templates, instead of carefully
leaving place for that cow that forced me to write my own string
class. Now that Kevlin pointed out a defect, we can remove the evil from
the root :o).

A temp_string class and/or allowing string expression templates will
certainly make for better, more efficient code, while keeping a
natural syntax.


Andrei








Author: sirwillard@my-deja.com
Date: 2000/05/23
Raw View
In article <86ya577gxi.fsf@gabi-soft.de>,
  kanze@gabi-soft.de wrote:
> Kevlin Henney <kevlin@curbralan.com> writes:
>
> |>  You might want to check my thinking on this, so here's a couple of
> |>  pieces of code that demonstrate what I believe to be the wrong
> |>  behaviour:
>
> |>      // first example: "*******************" should be printed twice
> |>      string original = "some arbitrary text", copy = original;
> |>      const string &alias = original;
>
> |>      string::const_iterator i = alias.begin(), e = alias.end();
> |>      for(string::iterator j = original.begin(); j != original.end(); ++j)
>
> Calling a non-const function (like begin, here), invalidates all
> previous iterators.

A very, very, very strong emphasis needs to be put on _CAN_ here (and
you left the word out altogether).  Calling a non-const function
_CAN_ invalidate all previous iterators.  It won't necessarily do so,
even with a ref-counted string.  I understand why this provision exists
in the standard, but it leaves us with a difficult time being able to
write portable code, and requires that we fully understand our
implementation just to write code on one platform.  For example,
according to the standard the following line may very well lead to
undefined behavior (even though we all know that it's not bloody likely
to):

for (cont::iterator it = c.begin(); it != c.end(); ++it)
{
}

The call to c.end() could invalidate the iterator we got with c.begin(),
leaving us in a pickle.  Real world implementations won't be this
bad, but it surely illustrates the difficulty you can face in writing
portable code with this provision.

Considering that ref-counting isn't much help with std::basic_string
any way, I'd be in favor of removing this provision on the validity of
iterators.  It would fill in what I see as a gaping hole in our ability
to write portable code.

--
William E. Kempf
Software Engineer, MS Windows Programmer









Author: Kevlin Henney <kevlin@curbralan.com>
Date: 2000/05/23
Raw View
In article <sigil66ao1180@news.supernews.com>, Andrei Alexandrescu
<andrewalex@hotmail.com> writes
[...]
>I think indeed Kevlin discovered a problem in string,

Well, we had the BSI C++ panel meeting today, and that confirmed that it
was indeed broken, and that the obvious fixes do not work correctly:
Allowing the first call to accessor functions (eg op[] and begin) after
construction or (potential) modification to invalidate existing refs,
ptrs and iters (which is _not_ what it says at the moment) would
initially appear to solve the problem. Alas, when taken together with
the recent fix to comparing iterators it does not.

So it appears that even with fixes, reference counting is only possible
with non-pointer iterators. You can have your cake, but you can't eat it
:->

>and I'm soooo
>glad.

Ditto. I am less than a fan of std::basic_string, and its design is
singularly unsuitable for reference counting.

>I would like to dwell on a related subject - std::string, cows, and
>multithreading.
>
>The fact that many string implementations of today use cow is rather
>disturbing to me. In many modern programs, multithreading is the norm.
>I do Internet stuff; here, due to network latency, you *must* use
>multithreading. (By the way, an asynchronous approach might be more
>efficient sometimes, but it's harder to program - you have to maintain
>more state - and it's nonportable across Berkeley Sockets
>implementations.)
>
>Cow and multithreading do not work together - see GotW.
>
>Frankly, I would like to see the mad cow banned. (A good side effect
>will be that vendors will finally implement efficient multithreaded
>std::strings.) We can do better than cow for strings - see below.

Yup, all agreed :->

>What I would like to get instead, is introducing a "temporary string"
>class, and maybe to allow expression templates with std::string.
[...]

Interesting. I have taken a different direction in an implementation
that takes a step back to reconsider strings from first principles.
Unsurprisingly, as well as a few simplifications, this leads to a
distinct separation of string concepts: mutable strings and readonly
strings (const_string in my implementation).

In this case, reference counting for readonly strings can be made
thread-safe and simple, allowing assignment, swapping and concatenation
as pretty much the only operations performed on the handle object. The
mutable strings are non-reference counted, and just do the right thing.
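
A minimal sketch of what such a readonly handle could look like. This is not the implementation described above: it assumes C++11's std::shared_ptr, whose reference count is updated atomically, and every member name other than const_string is invented for illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

class const_string {
public:
    const_string(const char* s)
        : body_(std::make_shared<std::string>(s)) {}

    // Copy construction and assignment are compiler-generated: they share
    // the body and adjust the count, which shared_ptr does atomically.

    void swap(const_string& other) { body_.swap(other.body_); }

    // Concatenation builds a fresh body; neither operand is touched.
    friend const_string operator+(const const_string& a, const const_string& b) {
        return const_string(std::make_shared<std::string>(*a.body_ + *b.body_));
    }

    const std::string& str() const { assert(body_); return *body_; }
    bool shares_body_with(const const_string& o) const { return body_ == o.body_; }

private:
    explicit const_string(std::shared_ptr<const std::string> body)
        : body_(std::move(body)) {}

    std::shared_ptr<const std::string> body_;  // never modified once built
};
```

Because no operation ever writes to a body, there is no unsharing step and no iterator or reference can be invalidated behind the user's back. Two threads assigning to the same handle object would still need their own synchronization.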

I think this separation is a clear and useful one. In theory something
like that could have been implemented in basic_string if const ctors had
been allowed, but that proposal was rejected.

>The return value of substring is string, and it shouldn't be. It
>should be a distinct type called temp_string, that makes it clear that
>it's about a temporary object. This way string::operator= is given a
>chance at discriminating between an assignment of a full-fledged
>string and a temporary string that's the result of a function. The
>latter would just efficiently swap the guts of the temporary with the
>guts of the destination.

Hmm, shades of auto_ptr.... I think I would be happier with the
const_string approach, but I would be interested in seeing the trade-off
in use.

But in terms of the need for A.N.Other class, I'm right there with you!
____________________________________________________________

  Kevlin Henney                   phone:  +44 117 942 2990
  Curbralan Ltd                   mobile: +44 7801 073 508
  kevlin@curbralan.com            fax:    +44 870 052 2289
____________________________________________________________







Author: Rob Stewart <donotspamme@giage.com>
Date: 2000/05/23
Raw View
Andrei Alexandrescu wrote:
>
> Frankly, I would like to see the mad cow banned.

<g>

> We can do better than cow for strings - see below.
>
> What I would like to get instead, is introducing a "temporary string"
> class, and maybe to allow expression templates with std::string.
>
> Consider this:
>
> s = s.substr(1, 100);
>
> The return value of substring is string, and it shouldn't be. It
> should be a distinct type called temp_string, that makes it clear that
> it's about a temporary object. This way string::operator= is given a
> chance at discriminating between an assignment of a full-fledged
> string and a temporary string that's the result of a function. The
> latter would just efficiently swap the guts of the temporary with the
> guts of the destination.

While I don't mean to open Pandora's box WRT expanding the
standardized string, this points to a very handy addition.  If
temp_string offered an assignment operator from both
std::basic_string<E> and E*, and those assignment operators
manipulated the string from which they were created, we could
provide *very* nice syntactic sugar for insert():

std::string s("The brown fox jumped...");
s.substr(4, 0) = "quick "; // invokes temp_string::operator=(char *)
// s now contains "The quick brown fox jumped..."

--
Robert Stewart     |  rob-at-giage-dot-com
Software Engineer  |  using std::disclaimer;
Giage, Inc.        |  http://www.giage.com







Author: James Kuyper <kuyper@wizard.net>
Date: 2000/05/23
Raw View
sirwillard@my-deja.com wrote:
>
> In article <86ya577gxi.fsf@gabi-soft.de>,
>   kanze@gabi-soft.de wrote:
....
> > Calling a non-const function (like begin, here), invalidates all
> > previous iterators.
>
> A very, very, very strong emphasis needs to be put on _CAN_ here (and
> you left the word out all together).  Calling a non-const function

<nit-pick>The actual phrase left out was "may be", not "can".
</nit-pick>

Section 21.3 p5 goes on to say:
"except operator[](), at(), begin(), rbegin(), end() and rend()."

> _CAN_ invalidate all previous iterators.  It won't necessarily do so,

If the iterator may be invalidated, it's a bad idea to write code which
will work correctly only if it wasn't. Luckily, this code isn't an
example of that.

....
> ...  For example,
> according to the standard the following line may very well lead to
> undefined behavior (even though we all know that it's not bloody likely
> to):
>
> for (cont::iterator it = c.begin(); it != c.end(); ++it)
> {
> }
>
> The call to c.end() could invalidate the iterator we got with c.begin(),
> leaving us in a pickle.  Real world implementations won't be this

According to 21.3p5 a call to non-const end() can invalidate an
iterator, but only if it's "the first call ..." "Subsequent to any of
the above uses". Now, one of those "above uses" could occur inside the
empty body of your loop (which I assume is meant to stand in for a
generic non-empty body). However, each of those uses themselves
invalidate the iterator, so it's no additional loss. Kevlin Henney's
example had "*j = '*'" in the body of the loop, which doesn't invoke any
of the potentially-invalidating string member functions.
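
Read that way, a loop is safe as long as the non-const accessors are called before any iterator is kept, and nothing in the body calls a potentially invalidating member. A short sketch of that pattern with std::string (the function name is illustrative only):

```cpp
#include <cassert>
#include <string>

// Overwrites every character with '*'. Takes its argument by value, so on a
// reference-counted implementation the parameter may start out sharing a
// buffer with the caller's string.
std::string star_out(std::string s) {
    // First non-const accessor call after construction: this is the one
    // call that is allowed to invalidate, and nothing is held yet.
    std::string::iterator first = s.begin();
    // Not a "first call" any more, so `first` stays valid.
    std::string::iterator last = s.end();

    for (; first != last; ++first)
        *first = '*';  // writes through the iterator; no member call on s

    assert(s.empty() || s[0] == '*');
    return s;
}
```

The caller's string is untouched in either kind of implementation, since any unsharing happens at the first begin() call, before a single character is written.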







Author: "Bill Wade" <bill.wade@stoner.com>
Date: 2000/05/24
Raw View
"Kevlin Henney" <kevlin@curbralan.com> wrote in message
news:KCerMWA+xYK5Ewtq@two-sdg.demon.co.uk...
> In article <sigil66ao1180@news.supernews.com>, Andrei Alexandrescu
> <andrewalex@hotmail.com> writes
> [...]
> >I think indeed Kevlin discovered a problem in string,
>
> Well, we had the BSI C++ panel meeting today, and that confirmed that it
> was indeed broken, and that the obvious fixes do not work correctly:
> Allowing the first call to accessor functions (eg op[] and begin) after
> construction or (potential) modification to invalidate existing refs,
> ptrs and iters (which is _not_ what it says at the moment)

Could you clarify?  The only argument I can see along these lines would say
that a constructor is not a non-const member function.  At best that seems
to be sophistry.  Are constructors member functions?  12/1 "...
constructor[s] ... are ... member functions."  They certainly don't look
like they are const.

Perhaps you would argue that after

  string s = "Hello";
  string::iterator i = s.begin();
  s.~string();
  new(&s) string;

i should still be valid since the only non-const member that was called is
begin() and the lifetime section only talks about pointers and references,
not iterators (and nothing in the standard says that iterators must be
implemented in terms of either pointers or references)?

> would
> initially appear to solve the problem. Alas, when taken together with
> the recent fix to comparing iterators it does not.

What is this fix?  Looking at
http://anubis.dkuug.dk/jtc1/sc22/wg21/docs/lwg-defects.html the only thing
that looks like it comes close to causing a restriction is DR 51, but that
only applies to containers, and I don't believe that std::basic_string is a
container as the word is used in the standard.

> >The fact that many string implementations of today use cow is rather
> >disturbing to me. In many modern programs, multithreading is the norm.

Of course there is the principle that you don't pay for what you don't use.
If I don't use MT, why should my library pay a performance penalty just to
support MT?  OTOH if your MT vendor is giving you a COW string, I'd say that
is a QOI issue between you and your vendor.

I understand that you (the guy who doesn't want COW) do have to pay in the
form of a less-safe interface.  The standard is full of those tradeoffs, and
we can argue forever about which choices should have been made.  For
example, if I don't need contiguous vectors, I still pay for them in the
current rules about when references become invalid.

> >The return value of substring is string, and it shouldn't be. It
> >should be a distinct type called temp_string, that makes it clear that
> >it's about a temporary object. This way string::operator= is given a
> >chance at discriminating between an assignment of a full-fledged
> >string and a temporary string that's the result of a function. The
> >latter would just efficiently swap the guts of the temporary with the
> >guts of the destination.

Actually as-if rules with the current language already allow compilers to do
even better than swap, although I don't know if any compilers do so.
Obviously this kind of optimization is much more difficult when the library
vendor and core compiler vendor are two different entities.









Author: "Marco Dalla Gasperina" <marcodg@home.com>
Date: 2000/05/24
Raw View
"Kevlin Henney" <kevlin@curbralan.com> wrote in message
news:KCerMWA+xYK5Ewtq@two-sdg.demon.co.uk...
> In article <sigil66ao1180@news.supernews.com>, Andrei Alexandrescu
> <andrewalex@hotmail.com> writes
> [...]
> >I would like to dwell on a related subject - std::string, cows, and
> >multithreading.
> >
> >Cow and multithreading do not work together - see GotW.
> >
> >Frankly, I would like to see the mad cow banned. (A good side effect
> Interesting. I have taken a different direction in an implementation
> that takes a step back to reconsider strings from first principles.
> Unsurprisingly, as well as a few simplifications, this leads to a
> distinct separation of string concepts: mutable strings and readonly
> strings (const_string in my implementation).

I have wished for this myself, only in my case instead of const_string
I have called it a "symbol".  These also have the ability to be pooled.
Funny, but I can think of only a very small number of times that I
have had to dissect a string to get at its parts, but I maintain string
objects as distinct values all the time.

IIRC Jack Reeves wrote about this duality of std::string as being
both a sequence/container of chars and as a distinct value type.
I thought it was a very astute point.

marco








Author: Kevlin Henney <kevlin@curbralan.com>
Date: 2000/05/24
Raw View
In article <39253F19.326308B4@myview.de>, Robert Klemme
<robert.klemme@myview.de> writes
>i am a bit puzzled by your first example.  maybe you clarify.
>
>>     // first example: "*******************" should be printed twice
>>     string original = "some arbitrary text", copy = original;
>
>copy is not used in this example.  did i miss something?

No, it is the inclusion of a copy that makes the example what it is.
Without the copying original would not be sharing its representation
with any other string object. Having the copying means that a reference
counted implementation will be sharing. It is this that gives it its
"interesting" behaviour on reference counted implementations :->

Compile and run it with and without the copy variable to see the effect.
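
For readers who want to try that experiment, here is a minimal sketch of the first example wrapped in a function. The names `first_example`, `run_result` and the `make_copy` flag are hypothetical and were not in the original post. On a library that does not share buffers (as the later C++11 rules require), both fields should come back as 19 asterisks with or without the copy; on a reference counted implementation of the period, the text read through the const iterators may differ when the copy is present.

```cpp
#include <string>

// Hypothetical holder for the two strings the original example prints.
struct run_result {
    std::string via_const_iterators;  // text read through i and e
    std::string final_value;          // value of original afterwards
};

// Sketch of the first example; make_copy controls whether a second
// string object exists that might share original's representation.
run_result first_example(bool make_copy) {
    std::string original = "some arbitrary text";
    std::string copy;
    if (make_copy)
        copy = original;  // a COW implementation starts sharing here
    const std::string& alias = original;

    // Const iterators are obtained first, through the const alias.
    std::string::const_iterator i = alias.begin(), e = alias.end();

    // Non-const begin()/end() are called afterwards on the same object.
    for (std::string::iterator j = original.begin(); j != original.end(); ++j)
        *j = '*';

    run_result r;
    r.via_const_iterators.assign(i, e);
    r.final_value = original;
    return r;
}
```

Comparing `first_example(true)` with `first_example(false)` on a given library shows whether the presence of the copy changes what the earlier const iterators see.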

>table 43 in section 21.3 of the standard says:
>
>data(): points at the first element of an
>allocated copy of the array whose
>first element is pointed at by
>str.data()
>
>this sounds to me like the buffer HAS to be copied because it says
>"allocated copy".  comments?

basic_string::data is well defined and is not the problem. 21.3 para 5
is clear on the results of calling data:

"References, pointers, and iterators referring to the elements of a
basic_string sequence may be invalidated by the following uses of that
basic_string object:
-- As an argument to non-member functions swap() (21.3.7.8),
operator>>() (21.3.7.9), and getline() (21.3.7.9).
-- As an argument to basic_string::swap().
-- Calling data() and c_str() member functions.
-- Calling non-const member functions, except operator[](), at(),
begin(), rbegin(), end(), and rend().
-- Subsequent to any of the above uses except the forms of insert() and
erase() which return iterators, the first call to non-const member
functions operator[](), at(), begin(), rbegin(), end(), or rend()."

However, it is the other parts of para 5 that are in question. In
particular, the 4th and 5th bullets.
____________________________________________________________

  Kevlin Henney                   phone:  +44 117 942 2990
  Curbralan Ltd                   mobile: +44 7801 073 508
  kevlin@curbralan.com            fax:    +44 870 052 2289
____________________________________________________________







Author: Rob Stewart <donotspamme@giage.com>
Date: 2000/05/24
Raw View
kanze@gabi-soft.de wrote:
>
> Kevlin Henney <kevlin@curbralan.com> writes:
>
> |>      string original = "some arbitrary text", copy = original;
> |>      const string &alias = original;
>
> |>      string::const_iterator i = alias.begin(), e = alias.end();
> |>      for(string::iterator j = original.begin(); j != original.end(); ++j)
>
> Calling a non-const function (like begin, here), invalidates all
> previous iterators.

However accurate, that means that you can't iterate a string.  If
you call begin() to get a start iterator, and then call end() to
test it, you already invalidated the start iterator, so the
comparison is undefined.

--
Robert Stewart     |  rob-at-giage-dot-com
Software Engineer  |  using std::disclaimer;
Giage, Inc.        |  http://www.giage.com







Author: Kevlin Henney <kevlin@curbralan.com>
Date: 2000/05/24
Raw View
In article <86ya577gxi.fsf@gabi-soft.de>, kanze@gabi-soft.de writes
>Kevlin Henney <kevlin@curbralan.com> writes:
>|>      // first example: "*******************" should be printed twice
>|>      string original = "some arbitrary text", copy = original;
>|>      const string &alias = original;
>
>|>      string::const_iterator i = alias.begin(), e = alias.end();
>|>      for(string::iterator j = original.begin(); j != original.end(); ++j)
>
>Calling a non-const function (like begin, here), invalidates all
>previous iterators.

Except that it does not. That is exactly the reason for my posting. 21.3
para 5 in part states that

"References, pointers, and iterators referring to the elements of a
basic_string sequence may be invalidated by the following uses of that
basic_string object:
....
-- Calling non-const member functions, except operator[](), at(),
begin(), rbegin(), end(), and rend()...."

Hence the potential DR.

>(Of
>course, one *could* be surprised that the standard would define
>std::string in such a way as to make it almost as dangerous as the char*
>it replaces.)

Given string's history, this would not in fact surprise me :->
____________________________________________________________

  Kevlin Henney                   phone:  +44 117 942 2990
  Curbralan Ltd                   mobile: +44 7801 073 508
  kevlin@curbralan.com            fax:    +44 870 052 2289
____________________________________________________________







Author: Michiel Salters <salters@lucent.com>
Date: 2000/05/24
Raw View
kanze@gabi-soft.de wrote:
>
> Kevlin Henney <kevlin@curbralan.com> writes:

> |>      string::const_iterator i = alias.begin(), e = alias.end();
> |>      for(string::iterator j = original.begin(); j != original.end(); ++j)

> Calling a non-const function (like begin, here), invalidates all
> previous iterators.

Wait a minute.
Does this mean that the first line is already causing undefined behavior?
I.e. string::const_iterator i = alias.begin(), e = alias.end();
After the call to end() I expect the iterator 'i' to be valid.
But apparently 'i' might be invalidated. If they're not valid, I think
I'll ban string iterators - this would be just too weird to consider.

Michiel Salters







Author: "Andrei Alexandrescu" <andrewalex@hotmail.com>
Date: 2000/05/24
Raw View
Bill Wade <bill.wade@stoner.com> wrote in message
news:8ge4tq$g9c@library1.airnews.net...
> > >The fact that many string implementations of today use cow is rather
> > >disturbing to me. In many modern programs, multithreading is the norm.
>
> Of course there is the principle that you don't pay for what you don't use.
> If I don't use MT, why should my library pay a performance penalty just to
> support MT?  OTOH if your MT vendor is giving you a COW string, I'd say that
> is a QOI issue between you and your vendor.

Nonono, it's different.

In my mt app I use 1000 strings in its single-threaded part, and 10
strings in the multi-threaded part. What to do? You could argue that
the vendor could provide an additional non_ref_count_string class, or
an additional template parameter to std::basic_string, but I'd say -
let's ban the mad cow and we're home free.


Andrei








Author: sirwillard@my-deja.com
Date: 2000/05/24
Raw View
In article <39253F19.326308B4@myview.de>,
  Robert Klemme <robert.klemme@myview.de> wrote:
>
> i am a bit puzzled by your first example.  maybe you clarify.
>
> >     // first example: "*******************" should be printed twice
> >     string original = "some arbitrary text", copy = original;
>
> copy is not used in this example.  did i miss something?

The only reason for 'copy' is to add a reference count to the
original.  This causes the invalidation of the iterators later in his
code.

> >     const string &alias = original;
> >
> >     string::const_iterator i = alias.begin(), e = alias.end();
> >     for(string::iterator j = original.begin(); j != original.end(); ++j)
> >         *j = '*';
> >     while(i != e)
> >         cout << *i++;
> >     cout << endl;
> >     cout << original << endl;
>
> and another one:
>
> std::string a, b;
>
> a       = "12345";
> char& c = a[ 3 ];
> b       = a;

After this point, 'c' may have been invalidated, leaving all bets off
on the rest of this code.

> c       = '7';
>
> what value does b have?  is it allowed to change?  is a allowed to
> change?
>
> table 43 in section 21.3 of the standard says:
>
> data(): points at the first element of an
> allocated copy of the array whose
> first element is pointed at by
> str.data()
>
> this sounds to me like the buffer HAS to be copied because it says
> "allocated copy".  comments?

It only has to be copied at the point where data() is called.  Until
then, the internal representation may not even use an array, let alone
be a unique copy.
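
As a point of comparison for the a/b/c example quoted above, here is a small sketch of the same sequence wrapped in a function. The function name `write_through_reference` is hypothetical. It assumes a library that follows the later C++11 rules, under which the assignment to b does not invalidate a reference into a; on such a library a ends up as "12375" and b stays "12345". It says nothing about what the 1998 wording permits.

```cpp
#include <string>
#include <utility>

// Sketch of the quoted example: take a reference into a, copy a into b,
// then write through the reference. Returns (a, b) for inspection.
std::pair<std::string, std::string> write_through_reference() {
    std::string a, b;

    a = "12345";
    char& c = a[3];  // non-const operator[] hands out a writable reference
    b = a;           // copy taken while that reference is outstanding
    c = '7';         // write through the earlier reference

    return std::make_pair(a, b);
}
```

If b also changed, the library would be letting a write through an outstanding reference leak into a logically separate string, which is the behaviour the thread is worried about.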

--
William E. Kempf
Software Engineer, MS Windows Programmer


Sent via Deja.com http://www.deja.com/
Before you buy.







Author: Christopher Eltschka <celtschk@physik.tu-muenchen.de>
Date: 2000/05/24
Raw View
Andrei Alexandrescu wrote:
>
> Alan Griffiths <alan@octopull.demon.co.uk> wrote in message
> news:qznRyCAp$AK5Ewrc@octopull.demon.co.uk...
> [snip]
> > Supporting this exclusion is what causes problems for shared body
> > implementations.  (And is therefore the point of the original
> posting.)
>
> I think indeed Kevlin discovered a problem in string, and I'm soooo
> glad.
>
> I would like to dwell on a related subject - std::string, cows, and
> multithreading.
>
> The fact that many string implementations of today use cow is rather
> disturbing to me. In many modern programs, multithreading is the norm.
> I do Internet stuff; here, due to network latency, you *must* use
> multithreading. (By the way, an asynchronous approach might be more
> efficient sometimes, but it's harder to program - you have to maintain
> more state - and it's nonportable across Berkeley Sockets
> implementations.)
>
> Cow and multithreading do not work together - see GotW.
>
> Frankly, I would like to see the mad cow banned. (A good side effect
> will be that vendors will finally implement efficient multithreaded
> std::strings.) We can do better than cow for strings - see below.
>
> What I would like to get instead, is introducing a "temporary string"
> class, and maybe to allow expression templates with std::string.
>
> Consider this:
>
> s = s.substr(1, 100);
>
> The return value of substring is string, and it shouldn't be. It
> should be a distinct type called temp_string, that makes it clear that
> it's about a temporary object. This way string::operator= is given a
> chance at discriminating between an assignment of a full-fledged
> string and a temporary string that's the result of a function. The
> latter would just efficiently swap the guts of the temporary with the
> guts of the destination.

The GNU String class had an even more interesting feature:
You could assign to substrings. In standard string syntax,
this would work like the following:

s = "Hello!";
s.substr(1, 4) = "i there";
   // now the characters 1 to 4 are replaced with "i there",
   // resulting in the string "Hi there!"

Note that the length of original and replacement substring is
not the same.
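
For comparison, std::string has no assignable substrings, but the same effect is available through the member function replace(). A minimal sketch follows; the helper name `replace_range` is hypothetical and only wraps the standard call.

```cpp
#include <cstddef>
#include <string>

// Replace the len characters starting at pos with the text in with,
// which may be longer or shorter than the range it replaces.
std::string replace_range(std::string s, std::size_t pos, std::size_t len,
                          const std::string& with) {
    s.replace(pos, len, with);
    return s;
}
```

Calling `replace_range("Hello!", 1, 4, "i there")` yields "Hi there!", matching the GNU String example above.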

>
> Similarly, operator+ can return a temp_string, or alternatively can
> return an expression template. (I'd be happy enough with a
> temp_string.)

With operator+, there's a problem: the concatenated string
inside operator+, which the temp_string would have to refer to,
doesn't exist any more.

>
> You can do something similar with today's string, but who would do
> that?
>
> s.substr(1, 100).swap(s); // efficient but... man!

s.assign(s, 1, 100); // should work, shouldn't it?

> // btw s.swap(s.substr(1, 100)) doesn't work
> // due to a rule that I basically think is wrong
>
> Or:
>
> // Old (inefficient) code: s = s1 + s2;
> // New (efficient but clumsy)
> (s1 + s2).swap(s);

s.assign(s1 + s2);

Or (probably more efficient):

s = s1; s += s2;
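
The alternatives discussed above can be put side by side as follows. This is only a sketch to show that they produce the same values; the function names are hypothetical, and it makes no claim about which form is faster on any particular library. The self-referential assign() relies on the library handling the case where the source is the string itself, which the common implementations are understood to do.

```cpp
#include <cstddef>
#include <string>

// s = s.substr(pos, n), written with swap on the temporary.
std::string trim_with_swap(std::string s, std::size_t pos, std::size_t n) {
    s.substr(pos, n).swap(s);
    return s;
}

// The same result via assign(str, pos, n) with the string as its own source.
std::string trim_with_assign(std::string s, std::size_t pos, std::size_t n) {
    s.assign(s, pos, n);
    return s;
}

// s = s1 + s2, written with swap on the temporary.
std::string concat_with_swap(const std::string& s1, const std::string& s2) {
    std::string s("old value");
    (s1 + s2).swap(s);
    return s;
}

// The same result via copy followed by append.
std::string concat_with_append(const std::string& s1, const std::string& s2) {
    std::string s("old value");
    s = s1;
    s += s2;
    return s;
}
```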

[...]







Author: Beman Dawes <beman@esva.net>
Date: 2000/05/24
Raw View
Kevlin Henney wrote:
>
> It has always been assumed that strings could be reference counted and,

It is really stronger than an assumption; the standard explicitly says:

"These rules are formulated to allow, but not require, a reference
counted implementation. A reference counted implementation must have the
same semantics as a non-reference counted implementation."

> in spite of the hazards of using this technique in a multi-threaded
> environment, this is indeed a common implementation.
>
> Constraints on reference types, ie they cannot be proxies, means that
> the standard already reduces the degree to which reference counting is
> effective. A recent question about iterator validity (by Philip Hibbs on
> the ACCU mailing list) leads me to believe that the current wording in
> the standard makes reference counting completely impractical, and that
> existing implementations are therefore broken.
>
> If I'm wrong, then great and all we need to do is clarify the wording in
> 21.3 para 5 (in particular the English of the last bullet needs some
> work), otherwise a more significant fix is required.

It wouldn't hurt to clarify the wording of the last bullet item.  The
basic idea is to say that in a reference counted implementation, "the
first time an iterator escapes, the string has to be copied, as it is
now subject to being modified."  IOW, the standard requires
copy-on-anticipation-of-write if reference counting is used.  It was
hard to come up with standardese to say that.
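
To make "copy-on-anticipation-of-write" concrete, here is a toy sketch, not any vendor's std::basic_string. The class `cow_string` and all of its members are hypothetical, it is single-threaded only, and it uses std::shared_ptr purely for brevity. Non-const begin() and end() unshare the buffer before handing out a writable pointer and then mark it unshareable, so that later copies cannot start sharing a buffer that outstanding pointers may write to.

```cpp
#include <memory>
#include <string>

// Toy reference counted string illustrating copy-on-anticipation-of-write.
class cow_string {
    struct rep {
        std::string text;
        bool shareable;  // false once a writable pointer has escaped
        explicit rep(const std::string& t) : text(t), shareable(true) {}
    };
    std::shared_ptr<rep> rep_;

    // Share the buffer if allowed, otherwise make a private copy.
    static std::shared_ptr<rep> share_or_clone(const std::shared_ptr<rep>& r) {
        if (r->shareable)
            return r;
        return std::make_shared<rep>(r->text);
    }

    // Called before any writable pointer is handed out.
    void unshare_and_pin() {
        if (rep_.use_count() > 1)
            rep_ = std::make_shared<rep>(rep_->text);
        rep_->shareable = false;
    }

public:
    explicit cow_string(const char* text)
        : rep_(std::make_shared<rep>(std::string(text))) {}
    cow_string(const cow_string& other)
        : rep_(share_or_clone(other.rep_)) {}
    cow_string& operator=(const cow_string&) = delete;  // omitted for brevity

    // Const access never writes, so it leaves sharing alone.
    const char* begin() const { return rep_->text.data(); }
    const char* end() const { return rep_->text.data() + rep_->text.size(); }

    // Non-const access may lead to a write: copy in anticipation.
    char* begin() { unshare_and_pin(); return &rep_->text[0]; }
    char* end() { unshare_and_pin(); return &rep_->text[0] + rep_->text.size(); }

    bool shares_with(const cow_string& other) const { return rep_ == other.rep_; }
};
```

In this sketch a const pointer obtained before the first non-const begin() keeps pointing at the old shared buffer, so it no longer tracks the string it was taken from. That is the kind of situation the examples later in this post exercise.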

> I can read that
> para many ways, ranging from making reference counting mostly
> impractical to effectively impossible.

Well, there was an existence proof that it could be done; at the time of
the addition of those words to the standard there was at least one
existing implementation that used reference counting.  Whether it was
practical or not is another question.

> You might want to check my thinking on this, so here's a couple of
> pieces of code that demonstrate what I believe to be the wrong
> behaviour:
>
>     // first example: "*******************" should be printed twice
>     string original = "some arbitrary text", copy = original;
>     const string &alias = original;
>
>     string::const_iterator i = alias.begin(), e = alias.end();
>     for(string::iterator j = original.begin(); j != original.end(); ++j)
>         *j = '*';
>     while(i != e)
>         cout << *i++;
>     cout << endl;
>     cout << original << endl;
>
>     // second example: "some arbitrary text" should be printed out
>     string original = "some arbitrary text", copy = original;
>     const string &alias = original;
>
>     string::const_iterator i = alias.begin();
>     original.begin();
>     while(i != alias.end())
>         cout << *i++;

I'm missing something.  In both cases, doesn't the first alias.begin()
force a copy, so that everything works as expected?  I'm also confused
by "alias".  My understanding of the standard is that it is immaterial,
yet I expect you included it because you thought it had some bearing on
the issue?

> I have tested this on three string implementations, two of which were
> reference counted. The reference counted implementations gave
> "surprising behaviour".

Surprising that they worked, or surprising that they didn't:-?

--Beman







Author: Beman Dawes <beman@esva.net>
Date: 2000/05/24
Raw View
sirwillard@my-deja.com wrote:

> > Calling a non-const function (like begin, here), invalidates all
> > previous iterators.
>
> A very, very, very strong emphasis needs to be put on _CAN_ here (and
> you left the word out altogether).  Calling a non-const function
> _CAN_ invalidate all previous iterators.  It won't necessarily do so,
> even with a ref-counted string.  I understand why this provision exists
> in the standard, but it leaves us with a difficult time being able to
> write portable code, and requires that we fully understand our
> implementation just to write code on one platform.  For example,
> according to the standard the following line may very well lead to
> undefined behavior (even though we all know that it's not bloody likely
> to):
>
> for (cont::iterator it = c.begin(); it != c.end(); ++it)
> {
> }
>
> The call to c.end() could invalidate the iterator we got with c.begin(),
> leaving us in a pickle.

That is not the intent of the standard. The first call above (c.begin())
is the only one which can invalidate iterators.  That's why the wording
is a bit convoluted.  The example given above was one of the test cases
considered by the Library Working Group.

>  Real world implementations won't be this
> bad, but it surely illustrates the difficulty you can face in writing
> portable code with this provision.
>
> Considering that ref-counting isn't much help with std::basic_string
> any way, I'd be in favor of removing this provision on the validity of
> iterators.  It would fill in what I see as a gaping hole in our ability
> to write portable code.

The intent of the standard is that such string code should be portable.
As stated in 21.3 paragraph 6 "A reference
counted implementation must have the same semantics as a non-reference
counted implementation."

Cheers,

--Beman
