Topic: The problem with temporaries <-> x3j16 string proposal
Author: euaeny@eua.ericsson.se (Erik Nyquist)
Date: 15 Aug 91 07:20:52 GMT
As an observing member of the x3j16 committee I have been given the chance
to look at Jerry Schwarz's proposal for a standard string class.
Among other things I noticed that there is no
string operator+(const string&, const string&);
defined for the string class.
This operator has apparently been removed from the original proposal
at the Lund x3j16 meeting to avoid the problem with the lifetime of
temporaries.
--
ARM: p268 ... The compiler must ensure that a temporary object is
destroyed. The exact point of destruction is implementation
dependent. There are only two things that can be done with a
temporary: fetch its value (implicitly copying it) to use in some
other expression, or bind a reference to it. If the value of a
temporary is fetched, that temporary is then dead and can be destroyed
immediately. If a reference is bound to a temporary, the temporary
must not be destroyed until the reference is. This destruction must
take place before exit from the scope in which the temporary is
created.
--
This formulation makes a number of 'good' programming tricks dangerous.
Eg:
// The proposed string class does not depend on the iostream-package.
// Instead it contains a conversion operator: operator const char*();
// to allow a string to be used in places where a const char* could
// have been used.
// Assume that the string class has an operator:
// string operator+(const string&, const string&)
string s1="Erik";
string s2="Nyquist";
void my_charp_function( const char* );
// This call only works if the temporary created by operator+
// lives until my_charp_function has been called.
// ARM does not guarantee that!
my_charp_function(s1+s2);
// The compiler could generate:
// {
// const char* tmp1;
// {
// string tmp2 = s1 + s2;
// tmp1 = tmp2.operator const char*();
// }
// my_charp_function(tmp2);
// OOPS! tmp2 points to an already deleted string
// }
This could have been avoided by using a reference to the temporary:
string& dummy = s1+s2;
my_charp_function(dummy);
But why make the world so complicated!
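A minimal sketch of the sequence the compiler "could generate", assuming a
toy class Str (hypothetical, standing in for the proposed string class) that
counts how many objects are alive. The counts show that the object owning the
character array is already gone at the point where the call would be made.
The dangling pointer is deliberately never dereferenced.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the proposed string class (not the real proposal).
class Str {
public:
    static int live;  // number of Str objects currently alive

    Str(const char* s) : text_(s) { ++live; }
    Str(const std::string& s) : text_(s) { ++live; }
    Str(const Str& other) : text_(other.text_) { ++live; }
    ~Str() { --live; }

    // The conversion under discussion: the pointer is only valid
    // while this object (and its buffer) is alive.
    operator const char*() const { return text_.c_str(); }

    friend Str operator+(const Str& a, const Str& b) {
        return Str(a.text_ + b.text_);
    }

private:
    std::string text_;
};

int Str::live = 0;

struct Counts {
    int inside;  // objects alive while the pointer is still valid
    int after;   // objects alive where the call would happen
};

// Mirrors the "compiler could generate" block from the posting.
Counts early_destruction() {
    Str s1 = "Erik";
    Str s2 = "Nyquist";
    Counts c;
    const char* tmp1;
    {
        Str tmp2 = s1 + s2;
        tmp1 = tmp2;              // operator const char*()
        c.inside = Str::live;     // s1, s2 and tmp2
    }                             // tmp2 and its buffer are destroyed here
    c.after = Str::live;          // only s1 and s2 remain
    (void)tmp1;                   // dangling: must not be passed on or read
    return c;
}
```

Calling early_destruction() reports three live objects inside the inner block
and two after it, which is the gap the posting is worried about.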
In my opinion the language could be improved if temporaries were not destroyed
until exit from the scope in which they were created. I believe that
some people have argued that temporaries should be destroyed as
soon as possible for some reason. What reasons I do not remember!
CONCLUSION:
1. I believe that a standard string class should provide:
string operator+(const string&,const string&);
2. I would like to reformulate the cited lines from ARM to this:
--
The compiler must ensure that a temporary object is destroyed.
Temporary objects are destroyed on exit from the scope in which they
have been created. There are only two things that can be done with a
temporary: fetch its value (implicitly copying it) to use in some
other expression, or bind a reference to it. If a reference is bound
to a temporary, the temporary must not be destroyed until the
reference is.
--
My humble opinion:
--- Erik Nyquist
--
Erik Nyquist Ellemtel Utecklings AB We are no longer the knights that say Ni!
Box 1505 We are the knights that say:
S-125 25 Alvsjo, Sweden Iky,iky,iky,iky,patang,zoop-boing, zowie.
Author: euaeny@eua.ericsson.se (Erik Nyquist)
Date: 15 Aug 91 07:42:20 GMT
OOPS! Typing errors in my original posting:
euaeny@eua.ericsson.se (Erik Nyquist) writes:
>Eg:
>// The proposed string class does not depend on the iostream-package.
>// Instead it contains an conversion operator: operator const char*();
>// to allow a string to be used in places where a const char* could
>// have been used.
>// Assume that the string class has an operator:
>// string operator+(const string&, const string&)
>string s1="Erik";
>string s2="Nyquist";
>void my_charp_function( const char* );
>// This call only works if the temporary created by operator+
>// lives until my_charp_function has been called.
>// ARM does not guarantee that!
>my_charp_function(s1+s2);
>// The compiler could generate:
>// {
>// const char* tmp1;
>// {
>// string tmp2 = s1 + s2;
>// tmp1 = tmp2.operator const char*();
>// }
>// my_charp_function(tmp2);
>// OOPS! tmp2 points to an already deleted string
OOPS! This should have been:
// my_charp_function(tmp1);
// OOPS! tmp1 points to an already deleted string
>// }
--
Erik Nyquist Ellemtel Utecklings AB We are no longer the knights that say Ni!
Box 1505 We are the knights that say:
S-125 25 Alvsjo, Sweden Iky,iky,iky,iky,patang,zoop-boing, zowie.
Author: lmiller@aero.org (Lawrence H. Miller)
Date: 15 Aug 91 16:34:39 GMT
In article <1991Aug15.072052.13383@eua.ericsson.se> euaeny@eua.ericsson.se (Erik Nyquist) writes:
>As an observing member of the x3j16 committee I have been given the chance
>to look at Jerry Schwarz's proposal for a standard string class.
>
>Among other things I noticed that there is no
> string operator +(const string&, const string&);
>defined for the string class.
>
>This operator has apparently been removed from the original proposal
>at the Lund x3j16 meeting to avoid the problem with the lifetime of
>temporaries.
Excuse perhaps my dumbness on this, but why is a temporary
the only way to implement string operator+, as opposed to
using a static string, internal to the operator function?
--
Larry Miller
The Aerospace Corporation
lmiller@aero.org
(213 soon to be 310)336-5597
Author: jss@summit.lucid.com (Jerry Schwarz)
Date: 15 Aug 91 12:24:03 GMT
In article <1991Aug15.072052.13383@eua.ericsson.se> euaeny@eua.ericsson.se (Erik Nyquist) writes:
> As an observing member of the x3j16 committee I have been given the chance
> to look at Jerry Schwarz's proposal for a standard string class.
It's actually Aron Insiga's proposal. I was merely reporting
on it to the committee.
The question of lifetime of temporaries is under active consideration
by x3j16. There is general consensus that the ARM gives too much
freedom and it must be narrowed, but no agreement on what the best
solution is. End of block seems too late because temporaries can be
large and expensive to keep around. As soon as possible (which has
been given a precise technical definition) seems too early as
illustrated by the problems it would cause for string. I personally
favor something along the lines of "end of top level expression".
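To illustrate the "end of top level expression" option: this is the rule that
ISO C++ later adopted (temporaries are destroyed at the end of the
full-expression), so a conforming compiler today behaves as sketched below.
Str, live_during_call and length_seen are names invented for this sketch and
are not part of the string proposal.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for the proposed string class.
class Str {
public:
    static int live;  // number of Str objects currently alive

    Str(const char* s) : text_(s) { ++live; }
    Str(const std::string& s) : text_(s) { ++live; }
    Str(const Str& other) : text_(other.text_) { ++live; }
    ~Str() { --live; }

    operator const char*() const { return text_.c_str(); }

    friend Str operator+(const Str& a, const Str& b) {
        return Str(a.text_ + b.text_);
    }

private:
    std::string text_;
};

int Str::live = 0;

int live_during_call = -1;     // Str objects alive while the callee runs
std::size_t length_seen = 0;   // length the callee read through the pointer

void my_charp_function(const char* p) {
    live_during_call = Str::live;
    length_seen = std::strlen(p);  // safe: the temporary still exists
}

// Returns the number of live Str objects right after the call statement.
int call_with_temporary() {
    Str s1 = "Erik";
    Str s2 = "Nyquist";
    my_charp_function(s1 + s2);  // temporary lives until the end of this statement
    return Str::live;            // the temporary is gone; s1 and s2 remain
}
```

The temporary is neither kept until the end of the block nor destroyed before
the call: it is alive inside my_charp_function and gone on the next statement.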
As the original item noted, the most significant issue is the
impact on the member that gives a "char*" corresponding
to the string. (I carefully did not write "operator char*"
because some people don't want it to be a conversion operator.)
The question of the lifetime of the char array is not settled.
There seem to be several candidates.
1. Lifetime of the string, or until the string is modified, whichever
is sooner. This obviously interacts with the
lifetime of temporaries.
2. Permanent. This would require the program to later delete it.
3. Random. (The proposal is that the last n arrays are preserved
for some arbitrary but presumably large n)
Jerry Schwarz
Author: jbn@lulea.telesoft.se (Johan Bengtsson)
Date: 16 Aug 91 12:13:18 GMT
lmiller@aero.org (Lawrence H. Miller) writes:
> In article <1991Aug15.072052.13383@eua.ericsson.se> euaeny@eua.ericsson.se (Erik Nyquist) writes:
> >As an observing member of the x3j16 committee I have been given the chance
> >to look at Jerry Schwarz's proposal for a standard string class.
> >
> Excuse perhaps my dumbness on this, but why is a temporary
> the only way to implement string operator+, as opposed to
> using a static string, internal to the operator function?
>
Sure, this works for Erik's example
my_charp_function(s1+s2);
It does _not_ work for
my_charp_twice_function(s1+s2,s3+s4);
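
A minimal sketch of why the two-argument call fails, assuming a hypothetical
concat_static that keeps its result in one function-local static (the names
concat_static, same_object and both_arguments_alias are invented here). Both
arguments end up referring to the same object, so one of the two results is
lost whatever the order of evaluation.

```cpp
#include <cassert>
#include <string>

// Hypothetical "operator+" that avoids a temporary by reusing one
// static result object, as suggested in the quoted question.
const std::string& concat_static(const std::string& a, const std::string& b) {
    static std::string result;
    result = a + b;   // overwrites whatever the previous call left here
    return result;
}

// Stands in for my_charp_twice_function: reports whether both
// arguments are in fact one and the same object.
bool same_object(const std::string& x, const std::string& y) {
    return &x == &y;
}

bool both_arguments_alias() {
    std::string s1 = "a", s2 = "b", s3 = "c", s4 = "d";
    // Either argument may be evaluated first; both refer to the same static.
    return same_object(concat_static(s1, s2), concat_static(s3, s4));
}
```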
--
-----------------------------------------------------------------------------
| Johan Bengtsson, Telia Research AB, Aurorum 6, S-951 75 Lulea, Sweden |
| Email: jbn@lulea.telesoft.se; Voice: (+46) 92075471; Fax: (+46) 92075490 |
-----------------------------------------------------------------------------
Author: jbn@lulea.telesoft.se (Johan Bengtsson)
Date: 16 Aug 91 12:38:23 GMT
jss@summit.lucid.com (Jerry Schwarz) writes:
> In article <1991Aug15.072052.13383@eua.ericsson.se> euaeny@eua.ericsson.se (Erik Nyquist) writes:
>
> > As an observing member of the x3j16 committee I have been given the chance
> > to look at Jerry Schwarz's proposal for a standard string class.
>
> It's actually Aron Insiga's proposal. I was merely reporting
> on it to the committee.
>
> As the original item noted, the most significant issue is the
> impact on the member that gives a "char*" corresponding
> to the string. (I carefully did not write "operator char*"
> because some people don't want it to be a conversion operator.)
> The question of the lifetime of the char array is not settled.
> There seem to be several candidates.
>
> 1. Lifetime of the string, or until the string is modified, whichever
> is sooner. This obviously interacts with the
> lifetime of temporaries.
Anything else would be quite confusing.
>
> 2. Permanent. This would require the program to later delete it.
But how can the program possibly know when it is safe to destroy it?
>
> 3. Random. (The proposal is that the last n arrays are preserved
> for some arbitrary but presumably large n)
Don't make this _mistake_!
I have done this in plain C and I am very sorry I did it. No
matter how large you make n (I use 128), you will sooner or
later write code that "crashes", because suddenly a loop that
uses temporary strings runs more than n times, destroying a string
that was still needed.
Example:
my_charp_function(s1+s2,function_with_long_loop_using_temps());
This may or may not work, depending on the order of argument
evaluation.
It is surprisingly easy to unsuspectingly call such a function.
Worse, the program will work, until someone "stresses" it with
a larger job than before. Your programs will then be
characterized as "non-robust".
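A minimal sketch of the failure mode, assuming a hypothetical ring LastN that
preserves the last N strings (N is 4 here rather than 128, to keep it short).
The ring hands out references to its slots, so the overwrite can be observed
without touching freed memory.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical "keep the last N results alive" store.
template <std::size_t N>
class LastN {
public:
    LastN() : next_(0) {}

    // Stores a copy of s in the next slot and returns a reference to it.
    // The slot is silently reused N calls later.
    const std::string& keep(const std::string& s) {
        std::string& slot = slots_[next_];
        next_ = (next_ + 1) % N;
        slot = s;
        return slot;
    }

private:
    std::string slots_[N];
    std::size_t next_;
};

// Keeps one string, then runs a "loop using temps" later_calls times and
// reports whether the first string still holds its original value.
bool survives(std::size_t later_calls) {
    LastN<4> ring;
    const std::string& first = ring.keep("needed later");
    for (std::size_t i = 0; i < later_calls; ++i) {
        ring.keep("temp");
    }
    return first == "needed later";
}
```

With N = 4, survives(3) holds but survives(4) does not: the loop only has to
run one more time than before for the "still needed" string to be replaced.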
--
-----------------------------------------------------------------------------
| Johan Bengtsson, Telia Research AB, Aurorum 6, S-951 75 Lulea, Sweden |
| Email: jbn@lulea.telesoft.se; Voice: (+46) 92075471; Fax: (+46) 92075490 |
-----------------------------------------------------------------------------
Author: dag@control.lth.se (Dag Bruck)
Date: 16 Aug 91 06:20:51 GMT
In article <1991Aug15.163439.25421@aero.org> lmiller@aero.org (Lawrence H. Miller) writes:
>In article <1991Aug15.072052.13383@eua.ericsson.se> euaeny@eua.ericsson.se (Erik Nyquist) writes:
>>
>> string operator +(const string&, const string&);
>
> Excuse perhaps my dumbness on this, but why is a temporary
> the only way to implement string operator+, as opposed to
> using a static string, internal to the operator function?
I don't think an internal static string would work in this case:
String s1, s2, s3, t;
t = s1 + s2 + s3;
I guess you could work hard with the type system to make sure that
expressions of that kind were caught at compile-time. What about
this:
class String { ....};
class StringX : public String {
private:
void operator + (const String &);
};
StringX operator + (const String &, const String &);
The idea is that, in the example above,
s1 + s2
becomes a StringX, call it r. The next operation then becomes
t = r + s3
which will try to invoke the private operator of class StringX.
It is quite easy to circumvent this safeguard by mistake, for example
if you return a String& from a function. I'm quite certain I got some
of the details wrong, and you probably run into difficulties with
automatic type conversions when you derive "magic" types in this way.
The restrictive approach adopted by X3J16 may be the safest in the
long run.
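
The idea can be sketched as follows, assuming C++11 or later for the
compile-time check. The helper can_add and the constants plain_ok and
chained_ok are hypothetical additions used only to show which expressions
compile; the member bodies are filled in with std::string purely for
illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>

class String {
public:
    String(const std::string& s) : text_(s) {}
    const std::string& text() const { return text_; }
private:
    std::string text_;
};

class StringX : public String {
public:
    explicit StringX(const std::string& s) : String(s) {}
private:
    void operator+(const String&);  // declared private, never defined
};

inline StringX operator+(const String& a, const String& b) {
    return StringX(a.text() + b.text());
}

// Hypothetical detection helper: value is true when "A + B" compiles.
template <class T> struct make_void { typedef void type; };

template <class A, class B, class = void>
struct can_add { static const bool value = false; };

template <class A, class B>
struct can_add<A, B,
    typename make_void<decltype(std::declval<A>() + std::declval<B>())>::type> {
    static const bool value = true;
};

const bool plain_ok = can_add<String, String>::value;     // s1 + s2
const bool chained_ok = can_add<StringX, String>::value;  // (s1 + s2) + s3

// The circumvention mentioned above: once the result is viewed through a
// String reference, the private member is no longer considered.
std::string chained_through_base_reference() {
    String s1("a"), s2("b"), s3("c");
    const String& r = s1 + s2;  // static type is String, not StringX
    return (r + s3).text();
}
```

Here plain_ok is true and chained_ok is false, because overload resolution
prefers the private member for a StringX left operand. The last function still
compiles, which is the loophole described above.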
-- Dag
Author: pierson@encore.com (Dan L. Pierson)
Date: 16 Aug 91 14:22:20 GMT
Regarding Re: The problem with temporaries <-> x3j16 string proposal; lmiller@aero.org (Lawrence H. Miller) adds:
> Excuse perhaps my dumbness on this, but why is a temporary
> the only way to implement string operator+, as opposed to
> using a static string, internal to the operator function?
Internal static strings almost always break parallel programs (think
of what happens when two unrelated functions execute string operator+
at the same time). I hope that X3J16 does not intend to specify a
language that is almost guaranteed to be useless for parallel
programming.
--
dan
In real life: Dan Pierson, Encore Computer Corporation, Research
UUCP: {talcott,linus,necis,decvax}!encore!pierson
Internet: pierson@encore.com
Author: Ari.Huttunen@hut.fi (Ari Juhani Huttunen)
Date: 16 Aug 91 21:31:37 GMT
In article <3717@lulea.telesoft.se> jbn@lulea.telesoft.se (Johan Bengtsson) writes:
>> In article <1991Aug15.072052.13383@eua.ericsson.se> euaeny@eua.ericsson.se (Erik Nyquist) writes:
>> As the original item noted, the most significant issue is the
>> impact on the member that gives a "char*" corresponding
>> to the string. (I carefully did not write "operator char*"
>> because some people don't want it to be a conversion operator.)
>> The question of the lifetime of the char array is not settled.
>> There seem to be several candidates.
>> 2. Permanent. This would require the program to later delete it.
> But how can the program possibly know when it is safe to destroy it?
When it is no longer needed == use garbage collection.
Actually, I don't think garbage collection is the solution to this problem.
But since I think it is the only major thing that C++ lacks, I'll write a few
lines about it.
Because garbage collection is expensive, it should be used only when the
programmer decides so. I gather it is already possible to write a garbage
collector using member operators new/delete? But because I don't know how
to do it, I would like to see something like this:
collected class String {
...
};
--
...............................................................................
Ari Huttunen Ari.Huttunen@hut.fi Iä-R'lyeh! Cthulhu fhtagn! Iä! Iä!
90-7285944
Author: jimad@microsoft.UUCP (Jim ADCOCK)
Date: 19 Aug 91 19:14:46 GMT
In article <ARI.HUTTUNEN.91Aug16233137@wonderwoman.hut.fi> Ari.Huttunen@hut.fi (Ari Juhani Huttunen) writes:
>Because garbage collection is expensive, it should be used only when the
>programmer decides so. I gather it is already possible to write a garbage
>collector using member operators new/delete? But because I don't know how
>to do it, I would like to see something like this:
>
>collected class String {
I agree with your conclusion, but your premise is wrong.
Garbage Collection need not be expensive. Yes, there are many
many very slow, very inefficient ways of implementing garbage collection.
The very worst, very slowest, worst code-generating method of all --
reference counting -- seems to be a favorite of C++ programmers,
which may explain in part why C/C++ programmers think garbage collection
is so expensive. And which helps demonstrate the depth of the problem.
However, if any of these old time C/C++ hackers ever actually program up
a competent version of generational scavenging, [or better yet, get a
young non-C/C++ hacker without their historical prejudices to do the
programming] they will find that GC can be LESS expensive than
traditional C/C++ memory management techniques.
Many old-time C/C++ hackers find this unbelievable [and therefore refuse
to even *try* implementing their own generational scavenging GC] and
at the same time refuse to consider the many ways memory management
in today's C++ totally sucks [which in turn are the reasons why it
is not hard to write a GC that is LESS expensive than traditional
C/C++ memory management.]
*** traditional C/C++ memory allocators / deallocators are hundreds
of times slower than the best GC memory allocators / deallocators.
[The best GC allocators have a typical allocation cost of one or two
machine instructions, and a typical deallocation cost of "nothing"]
*** destructors everywhere are expensive and generate damned poor code.
*** "smart" pointers [actually "stoopid" pointers] generate horribly
inefficient code, aided and abetted by the aliasing problems that
the C++ experts are unwilling to address.
*** temporaries are expensive, and their lifetimes are never, nor can
they be "correct." Many of the gratuitous temporaries, or at least
most of their costs go away with GC.
*** locality-of-reference remains an unsolved and un-addressed problem
in C++ -- even though most of the other OOPLs recognized this problem
as central [as do some well-known "object oriented" GUI systems]
GC can easily and quickly reduce the working set memory requirements
by about 20X in the typical case.
*** etc, etc, etc.
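The "one or two machine instructions" allocation cost claimed in the list
above is usually associated with bump-pointer allocation. Below is a minimal
sketch of that technique only, assuming C++11; Nursery and its members are
hypothetical names, this is not the scheme described in this posting, and the
copying of surviving objects that a real generational scavenger performs
before reusing the area is omitted.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-size allocation area ("nursery").
class Nursery {
public:
    static constexpr std::size_t kSize = 4096;
    static constexpr std::size_t kAlign = alignof(std::max_align_t);

    Nursery() : top_(0) {}

    // Allocation is a bounds check plus a pointer bump.
    void* allocate(std::size_t bytes) {
        if (bytes > kSize) return nullptr;
        const std::size_t rounded = (bytes + kAlign - 1) / kAlign * kAlign;
        if (rounded > kSize - top_) return nullptr;  // full: a collector would run here
        void* p = buffer_ + top_;
        top_ += rounded;
        return p;
    }

    // "Deallocation" of everything at once: just reset the bump pointer.
    // Only valid once nothing in the area is still in use.
    void release_all() { top_ = 0; }

    std::size_t used() const { return top_; }

private:
    alignas(std::max_align_t) unsigned char buffer_[kSize];
    std::size_t top_;
};
```

The sketch never runs destructors and never frees objects individually, which
is where the "cost of nothing" comes from, and also why it cannot simply
replace new/delete for classes whose destructors do real work.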
In summary: it is not GC that is expensive, but rather ignorance and
prejudice. Unfortunately, the old-time C/C++ hacker community
has such a *wealth* of ignorance and prejudice against GC that it is hard
to imagine how anyone can overcome it. I know, I tried. I implemented
a successful GC scheme locally, and the old-time C/C++ hacker community
did everything they could [successfully] to defeat its adoption.
I showed the numbers, they called me a liar, and just went on with their
lives, fat, dumb and happy.
Author: dhesi@cirrus.com (Rahul Dhesi)
Date: 20 Aug 91 01:41:48 GMT
In <74266@microsoft.UUCP> jimad@microsoft.UUCP (Jim Adcock) writes:
> I implemented a successful GC scheme locally, and the old-time
> C/C++ hacker community did everything they could [successfully]
> to defeat its adoption.
Maybe if you posted it to Usenet?
--
Rahul Dhesi <dhesi@cirrus.COM>
UUCP: oliveb!cirrusl!dhesi
Author: daniel@arapaho.ucsc.edu (Daniel Edelson)
Date: 20 Aug 91 00:30:24 GMT
In article <74266@microsoft.UUCP> jimad@microsoft.UUCP (Jim ADCOCK) writes:
>
>However, if any of these old time C/C++ hackers ever actually program up
>a competent version of generational scavenging, [or better yet, get a
>young non-C/C++ hacker without their historical prejudices to do the
>programming] they will find that GC can be LESS expensive than
>traditional C/C++ memory management techniques.
In principle I agree completely that GC can be inexpensive enough
to be worthwhile, however,...
Can you suggest a way of implementing Generation Scavenging
for C++ on standard (non-tagged) architectures?
Or indeed *any* copying GC algorithm whatsoever?
How will you detect the roots (pointers on the stack, in registers, and
in globals)? Will you use:
* Software tags
* Stack-frame decoding and program-counter mapping
* Indirection tables
* Conservative scanning (inappropriate for copying collection)
* Other?
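As an illustration of the conservative-scanning option in the list above, here
is a minimal sketch, assuming the stack, registers and globals have already
been gathered into an array of machine words. possible_roots and its
parameters are hypothetical names, not taken from any of the cited work.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the indices of all words whose value falls inside the heap
// range [heap_lo, heap_hi). Each hit *might* be a pointer into the heap.
std::vector<std::size_t> possible_roots(const std::uintptr_t* words,
                                        std::size_t count,
                                        std::uintptr_t heap_lo,
                                        std::uintptr_t heap_hi) {
    std::vector<std::size_t> hits;
    for (std::size_t i = 0; i < count; ++i) {
        if (words[i] >= heap_lo && words[i] < heap_hi) {
            hits.push_back(i);
        }
    }
    return hits;
}
```

A hit may equally be an integer that happens to look like a heap address. The
collector can keep the referenced object alive, but it cannot safely rewrite
the word to a new address, which is why the list marks this approach as
inappropriate for copying collection.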
This seems to be the most important problem that needs to be solved.
A number of fairly bright people have looked at this problem.
(e.g., Bartlett's tech report, Detlef's Ph.D. thesis, and others.)
The fact that there is no widely accepted solution is not just due to lack
of thought, but also because this is a moderately difficult problem.
I am very eager to hear how you think Generation Scavenging should
be implemented for C++. Please email me (or post) if you have concrete
ideas that you're interested in sharing. However, I do not currently
share your view that the only reason it's missing is because nobody
has tried.
---
Daniel Edelson | New motto, copy-collection for C++ is out:
daniel@cis.ucsc.edu, or | ``Don't dangle, recycle, and don't
uunet!ucscc!terra!daniel | make me sweep up after you.''
Author: schwartz@roke.cs.psu.edu (Scott Schwartz)
Date: 20 Aug 91 03:06:05 GMT
daniel@arapaho.ucsc.edu (Daniel Edelson) writes:
> This seems to be the most important problem that needs to be solved.
> A number of fairly bright people have looked at this problem.
> (e.g., Bartlett's tech report, Detlef's Ph.D. thesis, and others.)
> The fact that there is no widely accepted solution is not just due to lack
> of thought, but also because this is a moderately difficult problem.
This is oh so true. While, as you say, there have been a number of
working GC schemes put forward by various parties, I find I can't
bring myself to use them. It just looks too fragile, too
unpredictable. Even if you convince me that they work perfectly in
isolation, who knows how they will interact with malloc/free, or each
other, deep in the guts of some library? C++ (even more than C) seems
to have been designed with the idea that GC would never be used, and
so it shall be.