Topic: "strict" mode for C++0x - beginnings of a proposal
Author: "Hans Boehm" <Hans_Boehm@hp.com>
Date: 31 Jul 2001 12:42:31 -0400
Apologies for replying to an old message. But I think an important point
has been missed.
Jerrold Leichter <jerrold.leichter@smarts.com> wrote in message
news:Pine.SOL.4.21.0107091527450.12514-100000@fair.smarts.com...
> | At least you know the delete will occur at the end of some
> | scope, and in a scope that mentions the pointer involved.
> | With GC, it's really random.
>
> How does knowing that help you in designing or even debugging your program?
>
> Note that in a single-threaded program, GC will only typically run when you
> are calling new - just as predictable (and the prediction is just as useless).
>
> In a multi-threaded program, once you pass off your reference-counted smart
> pointer to another thread, things become effectively unpredictable even today.
It's far worse than that. If you do this with a simple
reference-count-based scheme, the delete will occur synchronously in some
thread, perhaps in response to an assignment that discarded the last
reference to something not obviously related. That thread may be holding
locks and may be in the middle of munging some shared data structure. It
now runs the finalizer on your behalf. If that finalizer accesses shared
data structures and is properly synchronized, this will deadlock. If it's
not properly synchronized, or locks are reentrant, it will see or modify the
shared data structure in an inconsistent state.
I don't think this is entirely hypothetical. Consider some legacy C library
CL which requires an external lock because it wasn't designed to be thread
safe. A finalizer might need to deallocate an object managed by this
library. It would presumably do so after acquiring the lock L for the
library. But the finalizer may be run in the middle of
    acquire L;
    call f() from CL, which sets up some state for g;
    // Continue to hold L, since intervening CL calls between f() and g()
    // are not safe.
    x = y;  // Pointer assignment; x used to point to a data structure
            // that included an object with a finalizer.
            // Finalizer is run; tries to acquire L; DEADLOCK
    g();
    release L;
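The scenario above can be sketched in later C++. This is only an illustration: std::shared_ptr and std::recursive_mutex postdate this thread, and the names lib_lock, Handle, f, g and run_scenario are invented for the example. A reentrant lock is assumed so that the sketch terminates; it therefore shows the "sees an inconsistent state" outcome, whereas a non-reentrant lock would deadlock at the same point.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

std::recursive_mutex lib_lock;        // plays the role of L (assumed reentrant)
bool lib_mid_operation = false;       // true between f() and g()
bool finalizer_saw_bad_state = false; // set if the destructor runs mid-operation

// Hypothetical object whose destructor must call into the library CL.
struct Handle {
    ~Handle() {
        // Re-acquiring succeeds only because the lock is reentrant.
        std::lock_guard<std::recursive_mutex> guard(lib_lock);
        if (lib_mid_operation) finalizer_saw_bad_state = true;
    }
};

void f() { lib_mid_operation = true; }   // sets up state for g()
void g() { lib_mid_operation = false; }  // consumes that state

void run_scenario() {
    std::shared_ptr<Handle> x = std::make_shared<Handle>();
    std::shared_ptr<Handle> y;           // empty
    std::lock_guard<std::recursive_mutex> guard(lib_lock);
    f();
    x = y;   // drops the last reference: ~Handle runs here, synchronously
    g();
}
```

After run_scenario() returns, finalizer_saw_bad_state is set: the destructor ran between f() and g() while the caller still held the lock.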
There are some destructors that really need to be run synchronously. But
there are also finalizers that really MUST be run asynchronously. That's
why Java essentially dictates that they be run in a separate thread. (Some
implementations still haven't gotten that right, but they will eventually.)
The more I think about it, synchronous-reference-count-based finalization
seems just plain broken to me. If you want something like Java or Modula 3
finalization, finalizers MUST run ASYNCHRONOUSLY. Invoking destructors
based on reference counts works if the reference counts are limited to a
small module so that you know something about the contexts in which they can
be invoked. It doesn't work for a global memory management scheme.
You can of course get asynchronous finalizers with reference counting by
introducing the appropriate queue. (If you actually enqueue reference count
updates and not just finalizations, the result might even run a lot faster.)
But that seems to be at odds with the point of the discussion here.
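A minimal sketch of the queue idea, assuming std::shared_ptr with a custom deleter as the reference-counting mechanism (both postdate this thread). FinalizerQueue, make_deferred and Resource are hypothetical names. Only finalizations are enqueued here, not reference count updates, and the queue must outlive every pointer created from it.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <memory>
#include <mutex>
#include <utility>

// Collects finalization actions so they can run later, in a known context.
class FinalizerQueue {
public:
    void enqueue(std::function<void()> fn) {
        std::lock_guard<std::mutex> guard(m_);
        pending_.push_back(std::move(fn));
    }
    // Meant to be called from a dedicated thread, or at a point where the
    // caller holds no locks. Returns the number of actions that were run.
    std::size_t drain() {
        std::deque<std::function<void()>> batch;
        {
            std::lock_guard<std::mutex> guard(m_);
            batch.swap(pending_);
        }
        for (auto& fn : batch) fn();   // run outside the queue lock
        return batch.size();
    }
private:
    std::mutex m_;
    std::deque<std::function<void()>> pending_;
};

int destroyed_count = 0;
struct Resource { ~Resource() { ++destroyed_count; } };

// When the count reaches zero, the deleter only enqueues the real delete.
template <class T>
std::shared_ptr<T> make_deferred(FinalizerQueue& q, T* raw) {
    return std::shared_ptr<T>(raw, [&q](T* p) { q.enqueue([p] { delete p; }); });
}
```

Dropping the last reference then costs one enqueue; the destructor itself runs only when drain() is called.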
Hans
(Hans_Boehm<at>hp<dot>com)
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html ]
Author: kgw-zamboni-news@stiscan.com
Date: 28 Jul 2001 07:01:54 -0400
On Thu, 26 Jul 2001 13:26:44, John Nagle <nagle@animats.com> wrote:
>kgw-zamboni-news@stiscan.com wrote:
>> On Wed, 27 Jun 2001 16:23:52, John Nagle <nagle@animats.com> wrote:
>> Since it is possible to have pointers and auto pointers to both
>> pointers and auto pointers, and they can be passed as parameters,
>> your scope restriction requires that all pointers carry their scope
>> information with their value. Only the direct immediate address of an
>> object and directly addressed auto pointers can be checked at compile
>> time. The rest must be checked at run time with an appropriate throw
>> or null result.
>
> No, "auto scope" is to be enforced at compile time, much
>like "const". It's a type restriction.
>
Then pointers to auto pointers, or objects containing them, cannot be
passed as parameters, making them different from other objects.
> I need to rewrite my note to clarify the semantics.
>
> John Nagle
> Animats
--
Remove -zamboni to reply
All the above is hearsay and the opinion of no one in particular
Author: John Nagle <nagle@animats.com>
Date: 29 Jul 01 11:22:46 GMT
> >kgw-zamboni-news@stiscan.com wrote:
> >> On Wed, 27 Jun 2001 16:23:52, John Nagle <nagle@animats.com> wrote:
> >> Since it is possible to have pointers and auto pointers to both
> >> pointers and auto pointers, and they can be passed as parameters,
> >> your scope restriction requires that all pointers carry their scope
> >> information with their value. Only the direct immediate address of an
> >> object and directly addressed auto pointers can be checked at compile
> >> time. The rest must be checked at run time with an appropriate throw
> >> or null result.
> >
> > No, "auto scope" is to be enforced at compile time, much
> >like "const". It's a type restriction.
> >
> Then pointers to auto pointers, or objects containing them, cannot be
> passed as parameters, making them different from other objects.
No, you can assign auto to auto of lesser scope,
and non-auto to auto, but not auto to non-auto.
Most function parameters which are pointers can and should be auto.
This indicates, enforceably, that the function
won't keep a copy of the pointer after the function returns.
So the caller knows that the function won't mess up the caller's
allocation assumptions. This is the most common case, and
adds no run-time overhead.
If you want to pass a pointer that might be kept, it
has to be a smart pointer. That's standard smart pointer
practice.
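The one-way conversion rule can be approximated with a library type. This is a sketch only, not the proposed language feature: borrowed<T> and read_value are hypothetical names, std::shared_ptr stands in for the smart pointer, and the scope half of the rule (auto may only go to auto of lesser scope) needs compiler support and is not modeled. Nothing stops a function from copying a borrowed<T> into a longer-lived variable.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Hypothetical stand-in for a parameter declared "auto T*" in the proposal.
template <class T>
class borrowed {
public:
    borrowed(T* p) : p_(p) {}                                  // non-auto to auto
    borrowed(const std::shared_ptr<T>& sp) : p_(sp.get()) {}   // smart to auto
    T& operator*() const { return *p_; }
    T* operator->() const { return p_; }
    // Deliberately no conversion back to T* or std::shared_ptr<T>.
private:
    T* p_;
};

// A function taking borrowed<T> advertises that it does not take ownership.
inline int read_value(borrowed<int> p) { return *p; }

static_assert(std::is_convertible<int*, borrowed<int>>::value,
              "non-auto converts to auto");
static_assert(std::is_convertible<std::shared_ptr<int>, borrowed<int>>::value,
              "smart pointer converts to auto");
static_assert(!std::is_convertible<borrowed<int>, int*>::value,
              "auto does not convert to a raw pointer");
static_assert(!std::is_convertible<borrowed<int>, std::shared_ptr<int>>::value,
              "auto does not convert to a smart pointer");
```

As with "const", the checks happen entirely at compile time; borrowed<T> holds a single pointer and adds no run-time bookkeeping.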
John Nagle
Animats
[ Send an empty e-mail to c++-help@netlab.cs.rpi.edu for info ]
[ about comp.lang.c++.moderated. First time posters: do this! ]
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html ]
[ Note that the FAQ URL has changed! Please update your bookmarks. ]
Author: John Nagle <nagle@animats.com>
Date: 26 Jul 2001 09:26:44 -0400
kgw-zamboni-news@stiscan.com wrote:
> On Wed, 27 Jun 2001 16:23:52, John Nagle <nagle@animats.com> wrote:
> Since it is possible to have pointers and auto pointers to both
> pointers and auto pointers, and they can be passed as parameters,
> your scope restriction requires that all pointers carry their scope
> information with their value. Only the direct immediate address of an
> object and directly addressed auto pointers can be checked at compile
> time. The rest must be checked at run time with an appropriate throw
> or null result.
No, "auto scope" is to be enforced at compile time, much
like "const". It's a type restriction.
I need to rewrite my note to clarify the semantics.
John Nagle
Animats
Author: kgw-zamboni-news@stiscan.com
Date: 25 Jul 2001 05:36:54 GMT
On Wed, 27 Jun 2001 16:23:52, John Nagle <nagle@animats.com> wrote:
Since it is possible to have pointers and auto pointers to both pointers
and auto pointers, and they can be passed as parameters, your scope
restriction requires that all pointers carry their scope information with
their value. Only the direct immediate address of an object and directly
addressed auto pointers can be checked at compile time. The rest must be
checked at run time with an appropriate throw or null result.
I gather from the history of pointer classes tried over the past years
that reference counting is way too expensive in time and space and that
garbage collection is more efficient.
Remove -zamboni to reply
All the above is hearsay and the opinion of no one in particular
Author: brangdon@cix.co.uk (Dave Harris)
Date: 19 Jul 2001 04:44:02 GMT
abuse@cabal.org.uk (Peter Corlett) wrote (abridged):
> This is not necessarily true. The finaliser could put a copy of "this"
> somewhere else, thereby creating a reference to the object. It would
> then be finalised again when that reference is deleted.
I don't have chapter and verse to quote, but I am confident that a Java
object won't be finalised twice even if it resurrects itself the first
time.
I gather the usual implementation is to keep a separate list of objects
which need to be finalised. The aim is to avoid overhead for the majority
of objects which never need it. Objects which do need it are put on the
list when they are created, and taken off it when the finaliser is run.
Nothing ever puts them back on the list, so the system will never run the
finaliser twice. This is probably just as well, else the object might
never be reclaimed.
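The list scheme described above can be sketched as follows. This is a toy model, not how any particular Java runtime is implemented: Object, FinalizationList, register_object and on_unreachable are hypothetical names, and the "collector" is just whoever calls on_unreachable.

```cpp
#include <cassert>
#include <unordered_set>

struct Object {
    int finalize_calls = 0;
    virtual ~Object() = default;
    virtual void finalize() { ++finalize_calls; }
};

class FinalizationList {
public:
    // Called once, when an object that needs finalisation is created.
    void register_object(Object* obj) { pending_.insert(obj); }

    // Called by the (imaginary) collector each time obj is found unreachable.
    // Returns true if the finaliser was run on this call.
    bool on_unreachable(Object* obj) {
        if (pending_.erase(obj) == 0) return false;  // not on the list any more
        obj->finalize();   // may resurrect obj; it is never put back on the list
        return true;
    }
private:
    std::unordered_set<Object*> pending_;
};
```

An object that resurrects itself and later becomes unreachable again is simply not found on the list the second time, so its finaliser runs at most once.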
The C# article mentioned recently describes a similar, but more
sophisticated system. Objects are finalised once by default, but can ask
to be finalised a second time.
Dave Harris, Nottingham, UK | "Weave a circle round him thrice,
brangdon@cix.co.uk | And close your eyes with holy dread,
| For he on honey dew hath fed
http://www.bhresearch.co.uk/ | And drunk the milk of Paradise."
Author: Pete Becker <petebecker@acm.org>
Date: 20 Jul 2001 16:09:51 GMT
Dave Harris wrote:
>
> I gather the usual implementation is to keep a separate list of objects
> which need to be finalised. The aim is to avoid overhead for the majority
> of objects which never need it. Objects which do need it are put on the
> list when they are created, and taken off it when the finaliser is run.
> Nothing ever puts them back on the list, so the system will never run the
> finaliser twice. This is probably just as well, else the object might
> never be reclaimed.
>
The issue that folks are seeing is that an object's finalizer can
resurrect the object, for example by inserting it into some active
container, thus making it live once again. And when the last remaining
reference to that object goes away it re-enters the garbage collection
process. The rule in Java, though, is that no matter how many times the
object goes through garbage collection its finalizer is only called once
by the system. In practice this means that the object has a flag that
indicates that its finalizer has been run.
--
Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
Author: John Nagle <nagle@animats.com>
Date: 20 Jul 2001 16:09:51 GMT
There are good points here. I wrote a reply, but it was lost
somewhere in the news system.
This references the documents at
http://www.animats.com/papers/languages/index.html
The issues were:
1. Do we need auto_cast? The answer appears to be no.
2. Details of auto scoping around subroutine calls need to
be clarified. More on this later; I have a solution and
will put it on the web page. Basic concept is that
auto args have to outlive a call in which they're used.
This requires some checks when intermediate temporary
auto objects are generated.
3. Encapsulating "new" in a template is ugly, because new
takes a variable number of arguments. But we have
to prevent the user from generating a raw pointer in
strict mode, so either we have new generate smart
pointers or encapsulate "new." The practical, although
unsatisfying, solution is to have smart pointer templates
support some "smart_new" operation with function
templates provided for 0..N arguments, where N is
perhaps 9 or so.
4. Proponents of type inference want to use "auto"
for that function. I'd like to defer arguing over
that issue for now, as a distraction from the
semantic issues.
John Nagle
Animats
Dave Harris wrote:
>
> nagle@animats.com (John Nagle) wrote (abridged):
> > (http://www.animats.com/papers/languages/index.html)
> >
> > when calling a function, the return value must
> > not outlive the arguments.
....
Author: brangdon@cix.co.uk (Dave Harris)
Date: 21 Jul 2001 15:17:59 GMT
petebecker@acm.org (Pete Becker) wrote (abridged):
> > list of objects which need to be finalised. Objects which do need
> > [finalisation] are put on the list when they are created, and
> > taken off it when the finaliser is run.
>
> The issue that folks are seeing is that an object's finalizer can
> resurrect the object [...] it re-enters the garbage collection
> process
Re-entering the garbage collection process would not put it back onto the
list of objects that must eventually be finalised. That only happens when
the object is created.
The advantage of the list is that dead objects which don't need to be
finalised, don't need to be touched at all. With a copying collector the
"live" objects get copied from one semi-space to another. The dead objects
are left behind, and if it wasn't for finalisation, the old semi-space
could be reused as raw bytes immediately. With finalisation but without a
list, we'd have to scan the entire old semi-space again, identifying
objects and for each one testing whether it needed to be finalised.
Keeping a list avoids that O(N) cost.
It can be a worth-while optimisation if there are large numbers of
short-lived objects, most of which don't need finalisation. Which is the
common case for applications written with GC in mind.
I did allow for resurrection in my previous post, honest :-)
Dave Harris, Nottingham, UK | "Weave a circle round him thrice,
brangdon@cix.co.uk | And close your eyes with holy dread,
| For he on honey dew hath fed
http://www.bhresearch.co.uk/ | And drunk the milk of Paradise."
Author: abuse@cabal.org.uk (Peter Corlett)
Date: 16 Jul 2001 23:02:18 GMT
Dave Harris <brangdon@cix.co.uk> wrote:
[...]
> I don't know C#, but in Java you can call the finaliser yourself as many
> times as you like. The runtime doesn't care, and will happily call it
> again. (Or not - really the main guarantee is that the system won't call
> it twice.)
This is not necessarily true. The finaliser could put a copy of "this"
somewhere else, thereby creating a reference to the object. It would then be
finalised again when that reference is deleted.
Author: Pete Becker <petebecker@acm.org>
Date: 17 Jul 2001 22:04:01 GMT
Peter Corlett wrote:
>
> Dave Harris <brangdon@cix.co.uk> wrote:
> [...]
> > I don't know C#, but in Java you can call the finaliser yourself as many
> > times as you like. The runtime doesn't care, and will happily call it
> > again. (Or not - really the main guarantee is that the system won't call
> > it twice.)
>
> This is not necessarily true. The finaliser could put a copy of "this"
> somewhere else, thereby creating a reference to the object. It would then be
> finalised again when that reference is deleted.
>
Java guarantees that finalize will only be invoked once automatically
for each object. But the main lesson to be learned from Java finalizers
is that they're far too complex for the very limited number of things
that they can actually do. Most folks only use finalizers for error
checking, e.g. displaying "you forgot to release this resource" if the
finalizer gets called and if the resource hasn't been released.
--
Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
Author: John Nagle <nagle@animats.com>
Date: 13 Jul 01 16:25:41 GMT
Dave Harris wrote:
>
> nagle@animats.com (John Nagle) wrote (abridged):
> > http://www.animats.com/papers/languages
> >
> > This is starting to look workable, and not too hard to do.
> > Comments?
>
> Consider:
> smart_ptr<someType> r = smart_new<someType>();
> auto someType* q = r; // OK?
> r = 0;
>
> The paper suggests that the commented line is OK. So what happens to q
> after the assignment to r causes the reference count to go to 0 and the
> object to be deleted? Is q magically reset to 0, or is it left as a
> dangling pointer, or is assignment not supported for smart pointers?
Good point. That's a hole that has to be fixed.
> It seems to me that if assignment is supported for smart pointers, the
> conversion to auto should not be.
> One approach is to have two kinds of smart pointers, one with assignment
> and one with conversion to auto.
The two kinds would seem to be "const" and "non-const". So
we now require that you can only take an auto pointer from a
const smart pointer. A const smart pointer guarantees that the
object pointed to will stay around for the scope lifetime of the
const smart pointer. That's easy to understand and implement.
Consider:
    const smart_ptr<someType> r = smart_new<someType>();
    auto someType* q = r;   // OK
    r = 0;                  // ERROR - assignment to const
and, in a separate scope:
    smart_ptr<someType> r = smart_new<someType>();
    auto someType* q = r;   // ERROR - can't take auto from a non-const
    r = 0;                  // OK
So we now have one kind with assignment, and one kind with
conversion to auto, as you suggested, just by using "const".
If somebody gives you a non-const smart pointer, and you
need an auto, you can do it in two steps:
    smart_ptr<someType> r = smart_new<someType>();
    const smart_ptr<someType> rc = r;
    auto someType* q = rc;  // OK - auto from const smart_ptr
    r = 0;                  // OK - assignment to non-const smart_ptr
So the simple case is easy, the complex case is possible, and
both are now safe. Thanks. Anybody see a problem with this?
I'll update my online draft accordingly.
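The "auto only from a const smart pointer" rule can be approximated in later C++ with ref-qualified member functions. This is a sketch under assumptions: smart_ptr here is a hypothetical wrapper over std::shared_ptr, borrow() stands in for the conversion to an auto pointer, reset() stands in for "r = 0", and can_borrow is a helper invented for the checks. It is weaker than the proposal: a const reference bound to a non-const smart_ptr still passes the check.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

template <class T>
class smart_ptr {
public:
    explicit smart_ptr(std::shared_ptr<T> p) : p_(std::move(p)) {}
    void reset() { p_.reset(); }             // non-const only, like "r = 0"
    T* borrow() const& { return p_.get(); }  // allowed on const objects
    T* borrow() & = delete;                  // rejected on non-const lvalues
    T* borrow() && = delete;                 // rejected on temporaries
private:
    std::shared_ptr<T> p_;
};

// Detects whether an expression of type U supports .borrow().
template <class U, class = void>
struct can_borrow : std::false_type {};
template <class U>
struct can_borrow<U, decltype(void(std::declval<U>().borrow()))>
    : std::true_type {};

static_assert(can_borrow<const smart_ptr<int>&>::value,
              "auto from const smart_ptr is accepted");
static_assert(!can_borrow<smart_ptr<int>&>::value,
              "auto from non-const smart_ptr is rejected");
```

The two-step case from the post then reads: copy r into a const smart_ptr rc, call rc.borrow(), and r.reset() remains legal because rc keeps the object alive.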
John Nagle
Animats
Author: Olaf Krzikalla <Entwicklung@reico.de>
Date: 13 Jul 2001 22:32:26 -0400
Hi,
John Nagle wrote:
> And here's the new draft:
>
> http://www.animats.com/papers/languages
>
> This is starting to look workable, and not too hard to do.
> Comments?
I agree that delete is forbidden in strict mode. But, if I understand
correctly, new could return an auto ptr instead of a non-auto ptr. Thus
smart_new can be avoided.
Best regards
Olaf Krzikalla
Author: "Sam Lindley" <sam@lindleys.screaming.net>
Date: 14 Jul 01 18:09:21 GMT
This paper may be of interest:
http://citeseer.nj.nec.com/gay98memory.html
Sam Lindley
Author: brangdon@cix.co.uk (Dave Harris)
Date: 14 Jul 2001 17:42:22 -0400
jerrold.leichter@smarts.com (Jerrold Leichter) wrote (abridged):
> In C# (and Java), finalization is completely tangled with
> memory deallocation. You can never call the finalizer
> yourself.
I don't know C#, but in Java you can call the finaliser yourself as many
times as you like. The runtime doesn't care, and will happily call it
again. (Or not - really the main guarantee is that the system won't call
it twice.) The object remains valid even if it has been finalised.
Accessing a finalised object does not yield undefined behaviour.
"Valid" doesn't necessarily mean that the object's class invariant is
true. As with any routine, it is up to the finaliser itself whether to
preserve the invariant.
> 2. I propose that the garbage collector call the destructor if you
> haven't, but you can do it yourself if you like. This makes
> it possible to use GC'ed values pointed to by lexically-bound
> smart pointers in a useful way.
I'd prefer to keep destruction separate from finalisation. Finalisation
can follow Java's model. Lexically-bound smart pointers can finalise GC'd
objects, but should not destroy them.
Dave Harris, Nottingham, UK | "Weave a circle round him thrice,
brangdon@cix.co.uk | And close your eyes with holy dread,
| For he on honey dew hath fed
http://www.bhresearch.co.uk/ | And drunk the milk of Paradise."
Author: John Nagle <nagle@animats.com>
Date: 15 Jul 01 14:41:30 GMT
Olaf Krzikalla wrote:
>
> Hi,
>
> John Nagle wrote:
> > And here's the new draft:
> >
> > http://www.animats.com/papers/languages
> >
> > This is starting to look workable, and not too hard to do.
> > Comments?
> I agree that delete is forbidden in strict mode. But, if I understand
> correctly, new could return an auto ptr instead of a non-auto ptr. Thus
> smart_new can be avoided.
Hmm. Good idea if it works. Is it airtight?
You'd have to be able to initialize a smart pointer from
an auto pointer.
Consider
    auto someType* p = new someType;  // allocate new obj
    smart_ptr<someType> s1 = p;       // s1 ref count = 1
    smart_ptr<someType> s2 = p;       // s2 ref count = 1
    s2 = 0;                           // new obj gets deallocated
    s1->foo();                        // BAD: ref to dangling ptr
If reference counts are "intrusive", i.e. part of the
type (typically implemented by deriving all objects
from a common base class), this might work. But if
reference counts are "non-intrusive", stored outside
the type in a separate object, it won't work, because
you can create two copies of the reference count.
Non-intrusive reference counts have to be supported; the
"everything is inherited from the master base class" idea
isn't popular C++ usage.
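The non-intrusive failure can be reproduced with std::shared_ptr, which keeps its count in a separate control block (std::shared_ptr postdates this thread and is used here as an assumption). In this sketch the deleter only records that a count reached zero, so nothing is actually freed twice; CountingDeleter, delete_requests and two_independent_counts are hypothetical names.

```cpp
#include <cassert>
#include <memory>

int delete_requests = 0;   // how many times some count reached zero

// Records the request instead of freeing, so the sketch never double-frees.
struct CountingDeleter {
    void operator()(int*) const { ++delete_requests; }
};

void two_independent_counts() {
    int* p = new int(5);
    {
        std::shared_ptr<int> s1(p, CountingDeleter());
        std::shared_ptr<int> s2(p, CountingDeleter());  // second, unrelated count
        assert(s1.use_count() == 1);
        assert(s2.use_count() == 1);
        s2.reset();                    // "deallocates" while s1 still holds p
        assert(delete_requests == 1);
        assert(s1.get() == p);         // with a real deleter, s1 now dangles
    }
    assert(delete_requests == 2);      // and s1 would free p a second time
    delete p;                          // the one real deallocation
}
```

Each smart pointer built from the raw pointer starts its own count at 1, which is exactly why the raw (or auto) pointer must not be allowed to seed a second owner.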
So we have to encapsulate "new" in something like
"smart_new". Of course, the problem with encapsulating new is
that we have trouble getting variable constructor arguments through
the template system. Is anything happening in the next revision
of C++ that will help there? Is there a workaround now?
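With hindsight, C++11 later added variadic templates and perfect forwarding, which address this question directly; the standard library's std::make_shared takes its constructor arguments the same way. A sketch of smart_new on that basis, assuming it returns a std::shared_ptr (Point is a hypothetical example type):

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Forwards any number of constructor arguments to "new T(...)".
template <class T, class... Args>
std::shared_ptr<T> smart_new(Args&&... args) {
    return std::shared_ptr<T>(new T(std::forward<Args>(args)...));
}

struct Point {
    int x, y;
    std::string label;
    Point(int x_, int y_, std::string l) : x(x_), y(y_), label(std::move(l)) {}
};
```

Usage looks like smart_new<Point>(1, 2, "p") or smart_new<int>(), with no fixed upper bound on the argument count and no raw pointer visible to the caller.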
John Nagle
Animats
Author: John Nagle <nagle@animats.com>
Date: 15 Jul 2001 14:40:16 -0400
Sam Lindley wrote:
> This paper may be of interest:
> http://citeseer.nj.nec.com/gay98memory.html
That's about manual region-based memory management. It doesn't
seem to help safety much. Ada has that feature, and it can be built
in C++ using "placement new". But it's used mostly in specialized
real-time applications.
John Nagle
Animats
Author: brangdon@cix.co.uk (Dave Harris)
Date: 15 Jul 2001 14:40:34 -0400
nagle@animats.com (John Nagle) wrote (abridged):
> The two kinds would seem to be "const" and "non-const".
I suppose. That means:
    const smart_ptr<int> p = new int;
    *p = 0;
is permissible because the const applies to the pointer, not the pointed-to
thing. It's different to a container, where:
    const vector<int> v( ... );
    v[0] = 0; // Compile error.
But I suppose it's OK. It mirrors the difference between:
    const int *p;
    int *const p;
so is probably appropriate for something which behaves like a pointer.
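The same contrast can be checked against the later standard library, taking std::shared_ptr as a stand-in for smart_ptr (an assumption; it postdates this thread). The aliases via_const_ptr and via_const_vec are names invented for the example.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>
#include <vector>

// Type produced by dereferencing a const smart pointer,
// and by indexing a const vector.
using via_const_ptr = decltype(*std::declval<const std::shared_ptr<int>&>());
using via_const_vec = decltype(std::declval<const std::vector<int>&>()[0]);

static_assert(std::is_same<via_const_ptr, int&>::value,
              "const pointer: the pointee stays mutable");
static_assert(std::is_same<via_const_vec, const int&>::value,
              "const container: the elements become const");
```

So "*p = 0" compiles through a const std::shared_ptr<int>, while "v[0] = 0" through a const std::vector<int> does not.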
In practice the conversion to auto is just an efficiency tweak, so it
doesn't matter much if it is fairly hard to get to. It is as easy to make
a function take a smart pointer as an auto pointer.
Dave Harris, Nottingham, UK | "Weave a circle round him thrice,
brangdon@cix.co.uk | And close your eyes with holy dread,
| For he on honey dew hath fed
http://www.bhresearch.co.uk/ | And drunk the milk of Paradise."
Author: news_comp.std.c++_expires-2001-09-01@nmhq.net (Niklas Matthies)
Date: 16 Jul 01 15:15:04 GMT
On 15 Jul 2001 14:40:16 -0400, John Nagle <nagle@animats.com> wrote:
> Sam Lindley wrote:
> > This paper may be of interest:
> > http://citeseer.nj.nec.com/gay98memory.html
>
> That's about manual region-based memory management. It doesn't
> seem to help safety much.
It seems that you didn't take a very close look at the paper. They
achieve safety through region-based reference counting enforced by the
compiler.
-- Niklas Matthies
--
[X] <-- nail here for new monitor
Author: brangdon@cix.co.uk (Dave Harris)
Date: 16 Jul 01 15:14:33 GMT
nagle@animats.com (John Nagle) wrote (abridged):
> (http://www.animats.com/papers/languages/index.html)
>
> when calling a function, the return value must
> not outlive the arguments.
Presumably where the hidden "this" is included as an argument. And if the
arguments have varying lifetimes we use the most conservative. The idea is
that the function is expected to return one of those objects or a value
whose lifetime is connected with one of them in some way. OK.
> I suppose we could have auto_cast.
We don't need another keyword; it should be a variation of const_cast.
Const_cast already deals with volatile as well as const.
I imagine the cast will be needed for the implementation of smart
pointers. I am assuming these pointer templates will be implementable as
ordinary library functions, and that programmers can write new kinds if
they want. For example, a gc_ptr<int> could be used by implementations
that want to support garbage collection.
> Yes, that's unsafe. But in strict mode, you can't use "new".
The problem is really the delete, not the new. I am not sure if it is
ideologically sound to forbid new.
Good engineering dictates that objects have strong class invariants which
they establish in their constructor. This sometimes means the constructor
must take arguments; in principle, arbitrarily many of them. C++ does not
have a direct way to write a forwarding function for arbitrarily many
arguments.
I suppose we can provide direct support for small numbers of arguments,
and we can use parameter objects with weaker invariants for the rest. I
don't particularly like it, though. It's an ugly thing to have at language
level.
> > Also, there is a suggestion to use "auto" for type inference.
>
> Sigh. That's a long way from its current meaning.
Not really. The real type inference meaning is carried by the omission of
an explicit type, which seems logical enough. The compiler fills in the
types which the programmer leaves out. Eg, comparing:
auto int x = 1; // Explicit type.
auto y = 1; // Inferred type.
The meaning of auto has not changed, it merely becomes essential to show
we have a declaration. Probably any storage specifier should do.
extern z = 1; // Inferred type.
There are three language changes which I think are important and urgent,
and typeof() is one of them. And if we have typeof(), I'd like to have
type inference too. A bit of syntactic sugar to sweeten the deal.
> > It is probably worth a new keyword.
>
> Suggestions? It's hard to find something that's short,
> meaningful, and not widely used as a variable or type name. Because
> this is safety-related, it should be short, or people won't use it.
It may be safety-related, but it's also an efficiency tweak. People can
use smart pointers instead. Perhaps it's smart pointers which need the
short name.
How about not using a new keyword at all? Then we can use something like
"ptr".
ptr<int> // Lifetime managed elsewhere.
auto_ptr<int> // As now. "Single owner" semantics.
counted_ptr<int> // Lifetime by reference counting.
gc_ptr<int> // Garbage collection.
The finer nuances of scoped lifetimes need compiler support. I wonder if
they are really worth having - if strict mode itself is worth having as
part of the standard. Or, if there is a more fundamental concept
underlying this. For example, there seems to be a deep difference between
pointers passed to functions and pointers returned from functions.
ptr<int> makes sense for the former but not the latter. Maybe we really
just need a copy constructor which can tell the difference.
> The C/C++ tradition is that the less restrictive form is the
> default. This is because restrictions are added later.
By "later", you mean later in history? I think bad design has sometimes
been forced on C++, sometimes because of backwards compatibility
(especially with C) and sometimes due to lack of foresight. I don't think
it is a tradition, certainly not one that should be perpetuated. There are
exceptions already (eg "mutable").
Dave Harris, Nottingham, UK | "Weave a circle round him thrice,
brangdon@cix.co.uk | And close your eyes with holy dread,
| For he on honey dew hath fed
http://www.bhresearch.co.uk/ | And drunk the milk of Paradise."
[ Send an empty e-mail to c++-help@netlab.cs.rpi.edu for info ]
[ about comp.lang.c++.moderated. First time posters: do this! ]
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html ]
[ Note that the FAQ URL has changed! Please update your bookmarks. ]
Author: geert@cs.uu.nl (Geert-Jan Giezeman)
Date: 10 Jul 2001 18:27:13 -0400 Raw View
In <3B49A254.4782DC05@reico.de> Olaf Krzikalla writes:
>Niklas Matthies wrote:
>> [...]
>> > - force the evaluation of all function results (or have the possibility
>> > to declare a function in a way, that the result can't be ignored)
>
>Agreed, forcing the evaluation of all functions results is too hard. But
>forcing the evaluation of certain functions due to their declarations
>could become quite handy. The first example that comes to my mind is
>perhaps a little bit silly:
>
>std::vector<int> v;
>//...
>for (int i = 0; v.size(); ++i) //...
>
>I don't want to argue about the example.
Ok, then I will.
> The point is: there are a lot
>of functions, where ignoring the result actually is an error. And we
>should enable the compiler to report as much errors as possible at
>compile time.
In this example, the result of v.size() is not ignored. It is converted to
a boolean. I don't see how the proposed rule could be of any help.
I'll complete the 'silly' example in equally silly style to show that what is
written could be what is meant:
// remove the last 30 elements of a vector,
// or the entire vector, if there are fewer
for (int i = 0; v.size(); ++i) {
v.pop_back();
if (i == 29) break; // 30 elements have been removed
}
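A self-contained variant of the same idea, as a rough sketch: the helper name drop_last and its count parameter are made up for illustration and are not from the original post. It keeps v.size() as a loop condition, where it is converted to bool rather than ignored.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: remove the last n elements of v, or all of them
// if v holds fewer than n.
void drop_last(std::vector<int>& v, int n) {
    assert(n >= 0);
    // v.size() is used as a condition here: it converts to bool, so the
    // loop also stops as soon as the vector becomes empty.
    for (int i = 0; i < n && v.size(); ++i) {
        v.pop_back();
    }
}
```

Calling drop_last(v, 30) on a 50-element vector should leave 20 elements; on a 10-element vector it should leave none.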
---
Author: John Nagle <nagle@animats.com>
Date: 11 Jul 01 19:01:50 GMT Raw View
Jerrold Leichter wrote:
>
> I read your proposal,
http://www.animats.com/papers/languages/index.html
> and I think it's a step in the right direction. However,
> I think you're being too conservative on the issue of garbage collection and
> prompt destructors.
....
> So, the question is ... what does prompt deallocation mean for objects created
> with "new"? When do you expect the destructor to run?
>
> a) If you are using traditional pointers, you expect the destructor
> to run when you use delete. In that case, you have to be
> absolutely sure that no pointers to the deleted object will
> be used, or you will have some fun debugging to do.
That's where we are now, of course.
> b) If you use a reference-counted smart pointer, you have no real
> way to predict when the object really gets deleted, hence
> when the destructor runs. Sure, you can try to structure your
> code so that you think you know when you have the last pointer;
> but that's dangerous, and will inevitably fail in the long run.
At least you know the delete will occur at the end of some
scope, and in a scope that mentions the pointer involved.
With GC, it's really random.
> c) If you have an "owning" smart pointer, you have a safer version of
> (a), but as we know these are messy beasts.
True. I was a big fan of single-owner smart pointers
for a while, but they just don't play well with the
STL, which assumes objects in collections are copyable.
I still like the idea, but it's a bad fit to C++ as
presently defined.
> I would suggest the following:
> 1. It's possible to declare a class to be garbage-collected. It's probably
> necessary that this be done in the ultimate base class, and that all
> descendents also be garbage collected. Probably the cleanest way to put
> this into C++ is to have a built-in "gcobject" class. You can inherit from
> it, but when you do so you must use single inheritance.
....
I think you just reinvented Microsoft Managed C++ Objects, a
new feature of their ".NET". See the MSDN article at
http://msdn.microsoft.com/msdnmag/issues/1100/GCI/GCI.asp
That's a garbage collection system with finalization. The
semantics get complicated. Not only is there finalization,
there's "resurrection". Even then, it's not airtight.
If you think C++ should have GC, read that article. It gives
a clear picture of the painful implications of retrofitting
GC to C++. It can work, but it isn't pretty.
John Nagle
Animats
Author: Jerrold Leichter <jerrold.leichter@smarts.com>
Date: 11 Jul 01 19:03:17 GMT Raw View
| > b) If you use a reference-counted smart pointer, you have no real
| > way to predict when the object really gets deleted, hence
| > when the destructor runs. Sure, you can try to structure
| > your code so that you think you know when you have the last
| > pointer; but that's dangerous, and will inevitably fail in
| > the long run.
|
| At least you know the delete will occur at the end of some
| scope, and in a scope that mentions the pointer involved.
| With GC, it's really random.
How does knowing that help you in designing or even debugging your program?
Note that in a single-threaded program, GC will only typically run when you
are calling new - just as predictable (and the prediction is just as useless).
In a multi-threaded program, once you pass off your reference-counted smart
pointer to another thread, things become effectively unpredictable even today.
| I think you just reinvented Microsoft Managed C++ Objects, a
| new feature of their ".NET". See the MSDN article at
|
| http://msdn.microsoft.com/msdnmag/issues/1100/GCI/GCI.asp
|
| That's a garbage collection system with finalization. The
| semantics get complicated. Not only is there finalization,
| there's "resurrection". Even then, it's not airtight.
| If you think C++ should have GC, read that article. It gives
| a clear picture of the painful implications of retrofitting
| GC to C++. It can work, but it isn't pretty.
Actually, no, C# doesn't retrofit anything to C++ - it's as much a new language
as Java is. Both share C/C++ syntax, but the semantics is quite different.
(It's actually fascinating to look at both Java and C# from the perspective
of Modula 3. Both languages start with a subset of Modula 3's ideas and give
them a C/C++ syntax. Both then add their own innovations, some good (e.g.,
inner classes in Java; I'm not yet sure what in C#), some questionable (final
in Java, again I'm not yet sure about C#.) However, I think Tony Hoare's
comment about Algol - that it managed the remarkable feat of being an
improvement not just on its predecessors but on most of its successors - has
some applicability here as well.)
Anyway: There are fundamental differences between what I've proposed, and
what C# does:
1. I propose that destructors do what they do now: Turn an object
into raw memory using its class-specific destruction semantics.
This should be *independent* of memory lifetime.
In C# (and Java), finalization is completely tangled with
memory deallocation. You can never call the finalizer
yourself. The finalizer can feed the *object* back into the
GC. The order of finalization calls is unpredictable - even
for multiple subobjects of a single object.
2. I propose that the garbage collector call the destructor if you
haven't, but you can do it yourself if you like. This makes
it possible to use GC'ed values pointed to by lexically-bound
smart pointers in a useful way.
3. Unlike C#, which gives you unpredictable results if you access an
object after it has been finalized, I specifically require that
the compiler arrange for all such accesses (from safe code,
anyway) to be detected. This means that keeping a reference to
a destructed object is safe - if not usually useful.
4. Destructors have the same semantics they have now: Objects are
always destructed "from the outside in", following the same
rules for order of destruction of contained objects. You
cannot change that order.
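To make point 1 concrete, here is a minimal sketch of destruction being separate from memory lifetime, using only what C++ already offers (placement new and an explicit destructor call). The Tracked type, the live_objects counter and the three helper functions are invented for this illustration; they are not part of the proposal, and a collector would of course replace the malloc/free pair.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

// Counts Tracked objects that have been constructed but not yet destructed.
static int live_objects = 0;

struct Tracked {
    int value;
    explicit Tracked(int v) : value(v) { ++live_objects; }
    ~Tracked() { --live_objects; }
};

// Allocates raw storage and constructs a Tracked in it.
Tracked* make_tracked(void*& storage, int v) {
    storage = std::malloc(sizeof(Tracked));
    assert(storage != nullptr);
    return new (storage) Tracked(v);  // placement new: object lifetime begins
}

// Ends the object's lifetime; the storage itself stays allocated.
void destruct_only(Tracked* t) { t->~Tracked(); }

// Releases the storage; no destructor runs here.
void release_storage(void* storage) { std::free(storage); }
```

After destruct_only() the object is gone but the block of memory is still there, which is the state a destructed-but-still-referenced GC'ed value would be in.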
C# makes things complex because there is no special semantics for a finalize()
routine - a programmer can, but need not, call inherited finalize() routines.
This makes "resurrection" a possibility: A finalize() routine can decide not
to destroy anything but instead stash a pointer to the object somewhere. The
resulting semantics are a mess - and they leave a loophole in the safety of
the system. In my proposal, there is nothing a destructor can do to prevent
the destruction of anything else in its object, including the object's type.
Sure, it can stash away a pointer. That will make the *raw memory* live on -
but the object is still destructed, and is now generally inaccessible.
I have yet to see a really good argument for finalization routines. There are
a few specialized situations in which they are useful, but they are rare. If
you need something like that in a "C++ with GC", my suggestion would be that
you not use GC - the traditional C++ facilities would continue to be there
unchanged.
-- Jerry
Author: brangdon@cix.co.uk (Dave Harris)
Date: 11 Jul 01 19:03:46 GMT Raw View
nagle@animats.com (John Nagle) wrote (abridged):
> The basic concept is that pointers and references explicitly declared
> as "auto" can't be used in ways that would let the data they
> contain outlive the scope of the "auto" variable.
Sounds good.
> The built-in arithmetic operations remain defined for non-auto
> pointers, but don't accept "auto" arguments.
Presumably this is so that we don't use pointer arithmetic to obtain an
invalid pointer from a valid one.
Presumably delete cannot be applied to auto pointers? And auto pointers
will be default-initialised to 0?
{
auto int *ap;
assert( ap == 0 );
}
The idea being, an auto pointer is either 0 or refers to a valid object,
unless some monkey business has been going on. It would be nice if we
could achieve that.
> Smart pointer implementations can be safe, provided they return
> only "auto" scoped pointers when needed, because the lifetime of
> the contents of an "auto" scoped pointer has been limited.
Can you clarify what happens with function return values? I would guess
you intend the lifetime of an auto pointer which is a function result to
be the same as a temporary variable. Ie it ends at the end of the full
expression. Eg:
void func( auto int *ap);
smart_ptr<int> sp( new int );
func( sp ); // OK. *sp outlives the body of func().
auto int *ap;
ap = sp; // Error - ap has longer lifetime than the expression.
sp = 0; // *sp deleted here.
I think this example shows the lifetime cannot be any longer, if we're to
avoid dangling auto pointers. But maybe it is too long. What happens with:
void func( auto int *ap, int *p );
smart_ptr<int> sp( new int );
func( sp, sp=0 );
If the first argument is evaluated first, func() will again see a dangling
pointer. Can you prevent this and still allow auto function results to be
passed as function arguments?
Can auto be used in structs? At first sight I thought it was reasonable,
but it raises some awkward issues about initialisation order, eg:
struct Derived : Base {
auto int *ap;
Derived() : Base(ap), ap(new int) {}
};
does Base get to see an uninitialised auto pointer, or is Derived::ap
"initialised" twice? It's also not especially useful in that auto for
member function arguments will apply to the lifetime of the function, not
the struct it is a member of. So this is probably not worth doing.
> And conversion from non-auto to auto is defined, but auto to
> non-auto conversion is prohibited.
Presumably the auto to non-auto conversion can be made explicitly with a
const_cast<>.
A conversion from non-auto to auto is a bit unsafe, in that it can be used
to make a dangling auto pointer:
{
T *np = new T;
auto T *ap = np;
delete np;
// ap is now a dangling auto pointer.
}
This is a variation on the smart_ptr<> example. The problem of aliases is
very real.
> The underutilized "auto" keyword seems appropriate here.
Except that it is a storage specifier rather than a type qualifier. What
about pointers to auto pointers? I can use "const" to express both that
the thing pointed to is const and that the pointer itself is const:
{
const int x = 42;
const int *const cp = &x;
const int *const *pcp = &cp;
}
Similarly, I would expect:
{
auto int x = 42;
auto int *auto ap = &x;
auto int *auto *pap = &ap;
}
but auto doesn't currently work like that.
Also, there is a suggestion to use "auto" for type inference. As in:
auto x = 42; // x is an int.
auto y = 42.0; // y is a double.
auto z = c.begin(); // z is c.begin()'s return type.
"Auto" is quite a popular keyword for new meanings. I'd rather use it for
type inference :-)
> For now, please think about this "auto scope" proposal and see
> if there are any major issues. Thanks.
It is a shame that the unsafe, non-auto pointers will be the default. I
would much prefer safe code to be shorter and cleaner than the unsafe
equivalent. It is probably worth a new keyword. Especially given the
issues with "auto" I mention above.
Dave Harris, Nottingham, UK | "Weave a circle round him thrice,
brangdon@cix.co.uk | And close your eyes with holy dread,
| For he on honey dew hath fed
http://www.bhresearch.co.uk/ | And drunk the milk of Paradise."
Author: mbenkmann@gmx.de (Matthias Benkmann)
Date: 11 Jul 01 19:04:14 GMT Raw View
On 9 Jul 2001 00:36:57 -0400, jthorn@galileo.thp.univie.ac.at
(Jonathan Thornburg) wrote:
>In article <3B41CBBF.3AAFFEAF@reico.de>,
>Olaf Krzikalla <Entwicklung@reico.de> wrote:
>>To achieve safer code there are other
>>things I would like to see in strict mode. Two of them are:
>>- forbid implicit conversions of built-in types
>
>I'm not sure you want to forbid *all* implicit conversions. For
>example, it's probably desirable to (as C/C++ do right now) continue
>to allow value-preserving implicit conversions, eg bool --> int,
>char --> int, int --> float, and float --> double.
I think that the bool->int conversion is undesirable. On another
newsgroup I've just seen the following mistake (simplified)
if (i = func() < 0)
where it should have been
if ((i=func())<0)
Without implicit conversion from bool to int that would not have
happened (it would be a compiler error). Another common class of bugs is
confusing & and &&, which can be avoided by keeping bool and int distinct.
I think a "strict" mode should not implicitly convert bool-->int or
int-->bool.
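A small sketch of why the first form goes wrong: < binds tighter than =, so the comparison result (a bool) is what gets converted and stored in i. The body of func() and the two wrapper functions are stand-ins invented for this illustration.

```cpp
#include <cassert>

// Stand-in for a function that reports failure with a negative value.
int func() { return -5; }

// Same parse as in "if (i = func() < 0)": i = (func() < 0).
// The bool result is implicitly converted to int.
int as_written() {
    int i;
    i = func() < 0;
    return i;
}

// What was meant: store the result first, then compare it.
int as_intended() {
    int i;
    bool negative = (i = func()) < 0;
    assert(negative);
    return i;
}
```

With this func(), as_written() yields 1 (the converted bool) while as_intended() yields -5.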
MSB
----
By the way:
Vacuum cleaners suck!
Author: brangdon@cix.co.uk (Dave Harris)
Date: 12 Jul 2001 09:09:39 -0400 Raw View
nagle@animats.com (John Nagle) wrote (abridged):
> http://www.animats.com/papers/languages
>
> This is starting to look workable, and not too hard to do.
> Comments?
Consider:
smart_ptr<someType> r = smart_new<someType>();
auto someType* q = r; // OK?
r = 0;
The paper suggests that the commented line is OK. So what happens to q
after the assignment to r causes the reference count to go to 0 and the
object to be deleted? Is q magically reset to 0, or is it left as a
dangling pointer, or is assignment not supported for smart pointers?
It seems to me that if assignment is supported for smart pointers, the
conversion to auto should not be.
One approach is to have two kinds of smart pointers, one with assignment
and one with conversion to auto. This allows code like:
proc( ToAuto(r), r = 0 );
where the ToAuto(r) creates a temporary object which keeps the pointed-to
object alive for the duration of proc(), thus allowing r to be assigned
to.
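A rough sketch of the keep-alive idea, with std::shared_ptr standing in for the hypothetical smart_ptr and a named local copy standing in for the ToAuto(r) temporary. proc() and call_with_keep_alive() are names assumed for this illustration only; the paper defines none of them.

```cpp
#include <cassert>
#include <memory>

// Hypothetical callee: clears the caller's owning pointer (the "r = 0"
// step) and then reads through the raw pointer it was given.
int proc(const int* p, std::shared_ptr<int>& owner) {
    owner.reset();
    return *p;  // valid only because another owner keeps *p alive
}

int call_with_keep_alive(std::shared_ptr<int>& r) {
    std::shared_ptr<int> keep = r;  // plays the role of ToAuto(r)
    assert(keep != nullptr);
    return proc(keep.get(), r);     // keep outlives the call to proc()
}
```

The second owner holds the reference count above zero for the whole call, so resetting r inside proc() does not delete the object.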
Dave Harris, Nottingham, UK | "Weave a circle round him thrice,
brangdon@cix.co.uk | And close your eyes with holy dread,
| For he on honey dew hath fed
http://www.bhresearch.co.uk/ | And drunk the milk of Paradise."
---
Author: Olaf Krzikalla <Entwicklung@reico.de>
Date: 12 Jul 2001 09:10:44 -0400 Raw View
Hi,
Olaf Krzikalla wrote:
> Agreed, forcing the evaluation of all functions results is too hard. But
> forcing the evaluation of certain functions due to their declarations
> could become quite handy. The first example that comes to my mind is
> perhaps a little bit silly:
>
> std::vector<int> v;
> //...
> for (int i = 0; v.size(); ++i) //...
>
Arrgh, first think, then post. I fell into my own trap. Of course
v.size() is evaluated through an implicit conversion to bool. Forget the
example.
Best regards
Olaf Krzikalla
---
Author: John Nagle <nagle@animats.com>
Date: 12 Jul 2001 14:23:37 -0400 Raw View
Dave Harris wrote:
> nagle@animats.com (John Nagle) wrote (abridged):
See paper draft at
(http://www.animats.com/papers/languages/index.html)
for background.
> > The basic concept is that pointers and references explicitly declared
> > as "auto" can't be used in ways that would let the data they
> > contain outlive the scope of the "auto" variable.
>
> Sounds good.
>
> > The built-in arithmetic operations remain defined for non-auto
> > pointers, but don't accept "auto" arguments.
>
> Presumably this is so that we don't use pointer arithmetic to obtain an
> invalid pointer from a valid one.
Yes.
>
> Presumably delete cannot be applied to auto pointers?
Yes.
> And auto pointers will be default-initialised to 0?
Yes.
> The idea being, an auto pointer is either 0 or refers to a valid object,
> unless some monkey business has been going on. It would be nice if we
> could achieve that.
I think we can.
> > Smart pointer implementations can be safe, provided they return
> > only "auto" scoped pointers when needed, because the lifetime of
> > the contents of an "auto" scoped pointer has been limited.
>
> Can you clarify what happens with function return values? I would guess
> you intend the lifetime of an auto pointer which is a function result to
> be the same as a temporary variable. Ie it ends at the end of the full
> expression.
That problem kept me up late one night. The temporary
variable rule won't solve the problem, because then you couldn't
assign an "auto" scoped pointer return value to a local "auto"
variable. That would make functions that return "auto" values useless.
The right answer seems to be this: For "auto" scope pointers,
- within a function, it can be assumed that the
  return value does not outlive the arguments;
- when calling a function, the return value must
  not outlive the arguments.
Examples:
Allowed:
strong_ptr<someType> p = strong_new<someType>();
auto someType* q = p;
This is proper, because q has smaller scope than p.
Not allowed:
auto someType* q = strong_new<someType>();
// ERROR: q outlives temporary value from which it was taken.
The expression generates a temporary strong pointer which goes
out of scope before the next statement. That temporary then
has an implicit conversion applied to it to obtain a pointer
of auto scope. This conversion must generate a compile-time
error, because the argument (a temporary) has smaller scope
than the return value.
This really matters. In the example above, when the temporary
strong pointer is deallocated, the reference count on the new
object goes to 0, and the new object is deallocated. This
would leave q as a dangling pointer.
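The lifetime issue can be observed with a reference-counted pointer that exists today. The sketch below uses std::shared_ptr and std::weak_ptr as stand-ins for the hypothetical strong_ptr and strong_new, and the two function names are invented for the illustration. A weak_ptr is used instead of a raw pointer so that the outcome can be checked without touching freed memory.

```cpp
#include <cassert>
#include <memory>

// Observes an object that was owned only by a temporary.
std::weak_ptr<int> observe_temporary() {
    // The shared_ptr returned by make_shared is a temporary. It is
    // destroyed at the end of this full expression, the reference count
    // drops to 0, and the int is deallocated.
    std::weak_ptr<int> w = std::make_shared<int>(42);
    return w;
}

// Same pattern with a named owner: the object lives as long as p does.
bool alive_while_owner_in_scope() {
    std::shared_ptr<int> p = std::make_shared<int>(42);
    assert(p != nullptr);
    std::weak_ptr<int> w = p;
    return !w.expired();
}
```

The first function hands back an already expired observer, which is the situation the proposed compile-time error is meant to rule out for "auto" pointers; the second corresponds to the allowed case.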
Readers, check me on this. Is this airtight?
> Can auto be used in structs?
> ...probably not worth doing.
For now, I'd say that auto can't be used in structs,
unless somebody comes up with good semantics and a good use
for it there.
> > And conversion from non-auto to auto is defined, but auto to
> > non-auto conversion is prohibited.
>
> Presumably the auto to non-auto conversion can be made explicitly with a
> const_cast<>.
I suppose we could have auto_cast. (Personally, I think that
to use auto_cast, const_cast, or reinterpret_cast, programmers
should first be required to make a small donation to the Fund for
Victims of Bad Software.)
> A conversion from non-auto to auto is a bit unsafe, in that it can be used
> to make a dangling auto pointer:
>
> {
> T *np = new T;
> auto T *ap = np;
> delete np;
> // ap is now a dangling auto pointer.
> }
>
Yes, that's unsafe. But in strict mode, you can't use "new".
You have to be in non-strict ("legacy"?) mode to make that mistake.
And in legacy mode, you can make that mistake with any pointer.
That's where we are today.
> This is a variation on the smart_ptr<> example. The problem of aliases is
> very real.
Is there an aliasing problem not covered above? I may have
missed something.
> > The underutilized "auto" keyword seems appropriate here.
>
> Except that it is a storage specifier rather than a type qualifier.
True, although the distinction isn't that rigid. Consider
the many meanings of "static". I picked "auto" because it's already
a reserved word, rarely used, and short.
> What
> about pointers to auto pointers? I can use "const" to express both that
> the thing pointed to is const and that the pointer itself is const:
>
> {
> const int x = 42;
> const int *const cp = &x;
> const int *const *pcp = &cp;
> }
>
> Similarly, I would expect:
>
> {
> auto int x = 42;
> auto int *auto ap = &x;
> auto int *auto *pap = &ap;
> }
>
> but auto doesn't currently work like that.
Ah, now I see why the distinction between a storage class
specifier and a type qualifier is important. If "auto" gets
these semantics, it probably should be moved to the type qualifier
category. That won't break existing uses.
> Also, there is a suggestion to use "auto" for type inference.
Sigh. That's a long way from its current meaning.
> It is probably worth a new keyword.
Suggestions? It's hard to find something that's short,
meaningful, and not widely used as a variable or type name. Because
this is safety-related, it should be short, or people won't use it.
> It is a shame that the unsafe, non-auto pointers will be the default. I
> would much prefer safe code to be shorter and cleaner than the unsafe
> equivalent.
The C/C++ tradition is that the less restrictive form is the
default. This is because restrictions are added later.
Personally, I agree; I'd prefer "var" and "" to
"" and "const", for example. But it's too late for that.
Excellent comments.
John Nagle
Animats
---
Author: John Nagle <nagle@animats.com>
Date: 6 Jul 2001 11:07:32 -0400 Raw View
Olaf Krzikalla wrote:
>
> Hi,
>
> John Nagle wrote:
> > [strict mode]
>
> I like the idea in general, but I would not restrict the term 'strict
> mode' to memory handling only. To achieve safer code there are other
> things I would like to see in strict mode. Two of them are:
>
> - forbid implicit conversions of built-in types
> - force the evaluation of all function results (or have the possibility
> to declare a function in a way, that the result can't be ignored)
While interesting stylistic restrictions, those aren't necessary
for safety. The political battles implicit in forcing stylistic
constraints would detract from the memory safety issue.
I do propose to restrict casts which can cause pointer
related errors. I'm proposing that you have to use reinterpret_cast
to do anything blatantly unsafe. (There are still some loopholes
that let non-obvious unsafe casts get through; those need to be
closed.) But requiring explicit conversions goes well beyond that.
John Nagle
Animats
---
Author: news_comp.std.c++_expires-2001-09-01@nmhq.net (Niklas Matthies)
Date: 6 Jul 2001 11:07:59 -0400 Raw View
On 05 Jul 01 19:14:27 GMT, Olaf Krzikalla <Entwicklung@reico.de> wrote:
> John Nagle wrote:
> > [strict mode]
>
> I like the idea in general, but I would not restrict the term 'strict
> mode' to memory handling only. To achieve safer code there are other
> things I would like to see in strict mode. Two of them are:
[ ]
> - force the evaluation of all function results (or have the possibility
> to declare a function in a way, that the result can't be ignored)
The first possibility can be problematic, in particular for assignment
operators (where the result is not used in the normal case). Maybe
allowing ignoring reference-typed results would solve these problems.
-- Niklas Matthies
--
If all you have is a hammer, everything looks like a nail.
---
Author: Francis Glassborow <francis.glassborow@ntlworld.com>
Date: 06 Jul 01 18:34:36 GMT Raw View
In article <3B41CBBF.3AAFFEAF@reico.de>, Olaf Krzikalla
<Entwicklung@reico.de> writes
>Hi,
>
>John Nagle wrote:
>> [strict mode]
>
>I like the idea in general, but I would not restrict the term 'strict
>mode' to memory handling only. To achieve safer code there are other
>things I would like to see in strict mode. Two of them are:
>
>- forbid implicit conversions of built-in types
>- force the evaluation of all function results (or have the possibility
>to declare a function in a way, that the result can't be ignored)
The only conceivable way this would get floor space is if an implementor
actually implemented your proposal as an extension. It is not that we
are lazy, just that we do not want to make major changes to C++ unless
there are some very good reasons for doing so and we can be convinced
that the pain is worth the gain.
Francis Glassborow ACCU
64 Southfield Rd
Oxford OX4 1PA +44(0)1865 246490
All opinions are mine and do not represent those of any organisation
Author: bs@research.att.com (Bjarne Stroustrup)
Date: 07 Jul 01 06:46:45 GMT Raw View
John Nagle <nagle@animats.com> writes:
> ...
> I do propose to restrict casts which can cause pointer
> related errors. I'm proposing that you have to use reinterpret_cast
> to do anything blatantly unsafe. (There are still some loopholes
> that let non-obvious unsafe casts get through; those need to be
> closed.) But requiring explicit conversions goes well beyond that.
Is your aim to provide complete memory safety? That is, to guarantee
absence of memory leaks. Or do you simply aim at having significantly
fewer run-time errors related to pointers?
I suspect the former aim requires (complete) type safety in regards to
pointers and references. I suspect that the latter aim will leave many
unsatisfied.
- Bjarne
Bjarne Stroustrup - http://www.research.att.com/~bs
Author: John Nagle <nagle@animats.com>
Date: 8 Jul 2001 12:29:53 -0400 Raw View
Bjarne Stroustrup wrote:
>
> John Nagle <nagle@animats.com> writes:
[strict mode proposal]
> Is your aim to provide complete memory safety? That is, to guarantee
> absence of memory leaks. Or do you simply aim at having significantly
> fewer run-time errors related to pointers?
>
> I suspect the former aim requires (complete) type safety in regards to
> pointers and references. I suspect that the latter aim will leave many
> unsatisfied.
Agreed. I'm working on a new draft, along the following lines:
I'm no longer proposing a specific smart pointer implementation.
I'm now proposing the language infrastructure needed to make smart
pointer implementations safe.
Many C++ smart pointer implementations exist. They share
a common weakness. Using a smart pointer requires obtaining a raw
pointer from the smart pointer. Once a raw pointer has been obtained,
it can be used in ways that break the smart pointer system. This is
a language-level problem and cannot be fixed effectively
through class libraries alone.
The minimal change required to the language is the addition of a
new data attribute that provides the necessary protection. The
underutilized "auto" keyword seems appropriate here. The basic concept
is that pointers and references explicitly declared as "auto" can't
be used in ways that would let the data they contain outlive the scope
of the "auto" variable.
Example:
void fn1(auto someType* p); // takes auto arg
void fn2(someType* p); // takes non-auto arg
void fn3(auto someType* p)
{
    auto someType* q = p; // OK
    fn1(p); // OK
    fn2(p); // ERROR - auto passed to non-auto
    smart_ptr<someType> r = smart_new<someType>(); // use of some smart pointer implementation
    q = r; // OK - smart pointer converts to "auto" raw ptr.
    someType* bad1 = r; // ERROR - auto passed to non-auto
    fn1(r); // OK - smart pointer converts to "auto" raw ptr.
    {
        someType innerobj; // a local instance
        auto someType* innerq = &innerobj; // OK - passed to lesser scope
        q = &innerobj; // ERROR - assigned pointer to inner object to outer scope.
        smart_ptr<someType> s = smart_new<someType>(); // smart pointer in inner scope
        q = s; // ERROR - assigned pointer to inner object to outer scope
    }
}
Note the effect of the restrictions. Smart pointers and
auto scoped pointers play well together. Smart pointer
implementations can be safe, provided they return only "auto"
scoped pointers when needed, because the lifetime of the contents of
an "auto" scoped pointer has been limited.
Note especially that last "q = s;". This is the auto scope mechanism
protecting a smart pointer. At the end of the inner block,
s will be deallocated, and the heap object it points to
will go away because its reference count goes to 0.
"q" would have been a dangling pointer. That gets caught
at compile time. There's no additional run time overhead for auto
scoped objects; it's entirely a compile time check, like "const".
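The dangling pointer that the last "q = s;" would create can be reproduced with library-only smart pointers. What follows is a minimal sketch, not part of the proposal: it assumes C++11's std::shared_ptr and std::weak_ptr as stand-ins for the smart_ptr used above, and the function name is hypothetical. The raw pointer q escapes the inner scope without complaint, which is the hole the "auto" check is meant to close at compile time.

```cpp
#include <memory>

struct someType { int value = 42; };

// Returns true if the heap object is gone once the inner scope ends,
// i.e. if a raw pointer taken inside that scope would now dangle.
bool raw_pointer_would_dangle()
{
    someType* q = nullptr;            // plays the role of the outer pointer q
    std::weak_ptr<someType> probe;    // observes the object without owning it
    {
        std::shared_ptr<someType> s = std::make_shared<someType>();
        q = s.get();                  // the escape the proposal wants rejected
        probe = s;
    }                                 // s is destroyed, the count drops to 0
    (void)q;                          // q is deliberately never dereferenced
    return probe.expired();
}
```

The weak_ptr is only there so the sketch can observe the deallocation safely; dereferencing q after the inner block would be undefined behavior.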
The built-in arithmetic operations remain defined for non-auto
pointers, but don't accept "auto" arguments. And conversion from
non-auto to auto is defined, but auto to non-auto conversion is
prohibited. Pointer arithmetic on auto scoped pointers is
thus prohibited.
"auto" scope allows intermixing auto and raw pointers in the same
program, allowing compatibility. This interpretation of "auto"
is simple to implement in compilers and useful in its own right,
as a way to tighten up existing smart pointer libraries.
I'm suggesting that "auto" have these semantics all the time.
The actual keyword is used so seldom that this won't break much,
if any, code, and if it does, a compile time error is generated.
The next step, optional "strict mode", simply turns off the ability
to generate new raw pointer values. "&" returns "auto",
"new" and "delete" are disallowed, and pointer values must
be initialized. Even in strict mode code, you can manipulate a
raw pointer if you can get one from someplace. But you can't get
out of the strict world without help from non-strict code.
This provides a transition path to strict mode.
More later on remaining problems like built-in arrays, strings, etc.
For now, please think about this "auto scope" proposal and see
if there are any major issues. Thanks.
John Nagle
Animats
---
Author: jthorn@galileo.thp.univie.ac.at (Jonathan Thornburg)
Date: 9 Jul 2001 00:36:57 -0400 Raw View
In article <3B41CBBF.3AAFFEAF@reico.de>,
Olaf Krzikalla <Entwicklung@reico.de> wrote:
>To achieve safer code there are other
>things I would like to see in strict mode. Two of them are:
>- forbid implicit conversions of built-in types
I'm not sure you want to forbid *all* implicit conversions. For
example, it's probably desirable to (as C/C++ do right now) continue
to allow value-preserving implicit conversions, eg bool --> int,
char --> int, int --> float, and float --> double.
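Which conversions are value-preserving depends on the platform, and it can be checked at compile time. The sketch below is an illustration only: the helper name value_preserving is hypothetical, it handles built-in arithmetic types only, and it needs C++14 for the constexpr function body. On a platform with a 32-bit int and an IEEE single-precision float, it reports int --> float as not value-preserving, because float carries only 24 significant bits.

```cpp
#include <limits>

// True if every value of From is exactly representable in To.
// Covers built-in arithmetic types only.
template <typename From, typename To>
constexpr bool value_preserving()
{
    using F = std::numeric_limits<From>;
    using T = std::numeric_limits<To>;
    if (F::is_signed && !T::is_signed) return false;    // negative values would be lost
    if (!F::is_integer && T::is_integer) return false;  // fractions would be lost
    if (F::is_integer && T::is_integer) return T::digits >= F::digits;
    if (F::is_integer)                                   // integer to floating point
        return T::radix == 2 && T::digits >= F::digits && T::max_exponent > F::digits;
    return F::radix == T::radix && T::digits >= F::digits   // floating to floating
        && T::max_exponent >= F::max_exponent
        && T::min_exponent <= F::min_exponent;
}

static_assert(value_preserving<bool, int>(), "bool --> int keeps every value");
static_assert(value_preserving<float, double>(), "float --> double keeps every value");
static_assert(!value_preserving<double, int>(), "double --> int drops fractions");
```

A strict mode could, under these assumptions, allow exactly the implicit conversions for which such a predicate holds.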
>- force the evaluation of all function results (or have the possibility
>to declare a function in a way, that the result can't be ignored)
Would you forbid
#include <cstdio>
int main()
{
std::printf("hello, world\n");
return 0;
}
because the result returned from printf() (a count of the number of
characters written) is ignored? Probably printf() should be flagged
as "safe to discard the function result". But then what about the
other *printf() functions, where the result is sometimes used and
sometimes not?
Another point to ponder is "close" functions, e.g. std::fclose(3) or
the close(2) Unix system call. These both return error flags which
most programs ignore (possibly at their peril).
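For reference, later revisions of the language adopted something close to the second option: the C++17 [[nodiscard]] attribute marks individual functions whose result should not be silently dropped, and leaves functions such as printf() unmarked. The sketch below is only an illustration; Handle, close_handle and close_quietly are hypothetical names, not a real API.

```cpp
// Hypothetical resource type standing in for a FILE* or a file descriptor.
struct Handle { bool open = true; };

// C++17: compilers are encouraged to warn when this result is discarded.
[[nodiscard]] bool close_handle(Handle& h)
{
    if (!h.open) return false;   // a double close is reported as an error
    h.open = false;
    return true;
}

// A caller that really wants to ignore the result has to say so.
void close_quietly(Handle& h)
{
    (void)close_handle(h);       // explicit discard, no warning
}
```

The attribute calls for a diagnostic rather than a hard error, so it is weaker than "forcing" the evaluation of the result.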
--
-- Jonathan Thornburg <jthorn@thp.univie.ac.at>
Max-Planck-Institut fuer Gravitationsphysik (Albert-Einstein-Institut),
Golm, Germany http://www.aei.mpg.de/~jthorn/home.html
Universitaet Wien (Vienna, Austria) / Institut fuer Theoretische Physik
"Stock prices have reached what looks like a permanently high plateau"
-- noted economist Irving Fisher, 15 October 1929
---
Author: John Nagle <nagle@animats.com>
Date: 09 Jul 01 19:14:18 GMT Raw View
John Nagle wrote:
>
> Bjarne Stroustrup wrote:
> >
> > John Nagle <nagle@animats.com> writes:
> [strict mode proposal]
> > Is your aim to provide complete memory safety? That is, to guarantee
> > absence of memory leaks. Or do you simply aim at having significantly
> > fewer run-time errors related to pointers?
> >
> > I suspect the former aim requires (complete) type safety in regards to
> > pointers and references. I suspect that the latter aim will leave many
> > unsatisfied.
>
> Agreed. I'm working on a new draft...
And here's the new draft:
http://www.animats.com/papers/languages
This is starting to look workable, and not too hard to do.
Comments?
John Nagle
Animats
Author: Olaf Krzikalla <Entwicklung@reico.de>
Date: 09 Jul 01 19:14:38 GMT Raw View
Hi,
Niklas Matthies wrote:
> [ ]
> > - force the evaluation of all function results (or have the possibility
> > to declare a function in a way, that the result can't be ignored)
>
> The first possibility can be problematic, in particular for assignment
> operators (where the result is not used in the normal case). Maybe
> allowing ignoring reference-typed results would solve these problems.
Agreed, forcing the evaluation of all function results is too hard. But
forcing the evaluation of certain functions due to their declarations
could become quite handy. The first example that comes to my mind is
perhaps a little bit silly:
std::vector<int> v;
//...
for (int i = 0; v.size(); ++i) //...
I don't want to argue about the example. The point is: there are a lot
of functions, where ignoring the result actually is an error. And we
should enable the compiler to report as many errors as possible at
compile time.
I found another example in a question posted to de.comp.lang.iso_c++:
complex <long double> cz;
cz = (45,48,67);
Here a compiler could warn, but some don't. IMHO we can treat
constants of built-in types and also the appropriate binary operators as
functions where result evaluation is forced.
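To make the pitfall concrete, here is a minimal sketch of what the assignment above actually does; the function name is hypothetical. The parenthesized expression uses the built-in comma operator, so only the last value reaches cz.

```cpp
#include <complex>

std::complex<long double> comma_pitfall()
{
    std::complex<long double> cz;
    cz = (45, 48, 67);   // comma operator: 45 and 48 are evaluated and discarded
    return cz;           // cz is now (67, 0), not built from three values
}
```

Compilers that warn here usually do so because the discarded operands have no effect, which is the same class of diagnostic as an ignored function result.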
Best regards
Olaf Krzikalla
BTW: Treat this post also as a response to Francis' and Jonathan's
comments.
Author: bs@research.att.com (Bjarne Stroustrup)
Date: 30 Jun 01 05:50:36 GMT Raw View
John Nagle <nagle@animats.com> writes:
> Almost all, if not all, the mainstream languages that postdate
> C++ are "safe" with regard to memory allocation. Java, C#, Perl,
> Python, and the various "scripting languages" protect against
> dangling pointers, invalid pointers, and most memory leaks. This
> should tell us something.
but what? and how do you define "mainstream language"?
> I propose that the next version of C++ should have a "strict
> mode", which makes code safe with regard to memory allocation.
> I suggest the following design constraints:
>
> -- Strict mode should have the minimal set of constraints
> required to provide memory allocation safety.
> -- Strict mode should not impose a significant performance
> penalty at run time.
> ...
>
> Please comment on these goals. I see them as achievable,
> and will discuss mechanisms at a future time. My question
> now is whether these goals are widely seen as worth the
> effort required to achieve them.
Naturally - safety is fundamentally a good thing - but the answer to your
questions depends on the cost and effort required to achieve those goals.
Please give us an idea of how you would like to handle the most obvious
problems. For example, array range checking, zero-pointers, dangling
pointers and unions:
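union U {
    int i;
    int* p;
};
void f(int* p1, int* p2, int* p3, U u)
{
    *p1 = 1; // zero pointer
    *p2 = 2; // dangling pointer
    p3[7] = 3; // out of range
    *u.p = 4; // pointer read from a union last written as an int
}
void g()
{
    int* p = new int;
    delete p;
    int* a = new int[4];
    U u;
    u.i = 5;
    f(0,p,a,u);
}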
- Bjarne
Bjarne Stroustrup - http://www.research.att.com/~bs
Author: Nicola Musatti <objectway@divalsim.it>
Date: 30 Jun 01 06:07:00 GMT Raw View
I do not agree with your proposal when considered at the language level.
However, I think that something similar might be achieved in the
standard library. In retrospect I'm not sure that allowing unchecked
pointer/element dereference (or similar functionality) is such a good
idea. After all a program that relies on such facilities either performs
explicit checks or is inherently unsafe.
I'd prefer the library to provide checked access only, and then specify
that the compiler is allowed to optimize away redundant checks; as a
last resort a compiler directive might be provided to disable checks in
specific code portions.
Such an approach would support a good programming approach:
- code relying on library checks;
- debug;
- optimize;
- profile;
- disable checks where the compiler hasn't done so and they impose
unacceptable overhead.
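As a rough illustration of "checked access only", the standard library already offers one checked accessor: std::vector::at() throws std::out_of_range where operator[] would silently run off the end. The wrapper below is a sketch with a hypothetical name, not a proposed interface.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Checked store: returns false instead of writing out of range.
bool checked_store(std::vector<int>& v, std::size_t i, int value)
{
    try {
        v.at(i) = value;             // at() throws std::out_of_range on a bad index
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

With a four-element vector, a store to index 7 is refused rather than corrupting memory; whether a compiler can remove the check in a loop with known bounds is an optimization question the sketch does not address.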
Best regards,
Nicola Musatti
Author: brangdon@cix.co.uk (Dave Harris)
Date: 30 Jun 01 06:07:29 GMT Raw View
nagle@animats.com (John Nagle) wrote (abridged):
> I propose that the next version of C++ should have a "strict
> mode", which makes code safe with regard to memory allocation.
Could you couch your proposal in the form of (a) a class library that
provides a set of "safe" constructs; (b) a list of standard C++ features
which are banned in "strict" mode?
I imagine the banned features would include raw pointers, and new and
delete expressions. The class library would include smart pointers and
checked ways to create and destroy objects.
Ideally this could all be done in a way which was backwards-compatible. A
compiler just needs to warn about the use of non-strict features. The big
issue, I suppose, is how to mark the "strict" code. We need to be able to
call trusted non-strict code from strict code, so I don't think compile
options are adequate. A #pragma with namespace scope might do for now.
Can the class library part be written with the language as it stands? Or
do we need extra mechanisms, eg for cloning, or for forwarding arbitrary
numbers of arguments?
Eg we could have:
template <typename T>
class SafeIterator {
//
};
template <typename T>
SafeIterator<T> create() {
return SafeIterator<T>( new T );
}
as a safe wrapper for new expressions, but how to make it scale well when
the constructor takes arguments?
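For what it is worth, the language later gained a mechanism that addresses this scaling question: C++11 variadic templates with std::forward. The sketch below assumes std::shared_ptr as a stand-in for the safe handle type, and Point is a hypothetical class used only for illustration.

```cpp
#include <memory>
#include <utility>

// Hypothetical class with a two-argument constructor, for illustration only.
struct Point {
    int x, y;
    Point(int x_, int y_) : x(x_), y(y_) {}
};

// Forwards any number of constructor arguments to T (C++11 variadic template).
template <typename T, typename... Args>
std::shared_ptr<T> create(Args&&... args)
{
    return std::make_shared<T>(std::forward<Args>(args)...);
}
```

A call such as create<Point>(1, 2) then needs no new expression in user code, and a single template covers constructors of any arity.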
Dave Harris, Nottingham, UK | "Weave a circle round him thrice,
brangdon@cix.co.uk | And close your eyes with holy dread,
| For he on honey dew hath fed
http://www.bhresearch.co.uk/ | And drunk the milk of Paradise."
Author: John Nagle <nagle@animats.com>
Date: 30 Jun 2001 15:41:04 -0400 Raw View
James Kuyper wrote:
> What do you consider the minimal set? What constitutes safety?
I answered some of this in my previous message.
The purpose of "strict mode" is to protect the machine model.
Formally, the goal is to eliminate undefined semantics which can
cause a crash or unexpected non-local side effects (i.e. stores
into unexpected areas of memory.) And that's all. I'm not
proposing to clean up other historical cruft (declaration syntax,
for example) as part of "strict mode".
Mostly, I'm talking about implementing what we have now in the
STL in ways that are safer. Most of the things that need
to be checked that appear in frequently-executed code can
have most or all of the checks optimized out. That's the
key here.
Again, a brief summary:
-- optimized iterator checking
-- optimized reference counting
-- weak pointers for backpointers, etc.
-- iterator arithmetic allowed, pointer arithmetic not.
-- restrictions on unsafe casts
-- no null reference rule enforced
-- no explicit delete (auto_ptr like encapsulation)
And that's about it.
If this is done right, most programmers won't even be
aware that it's happening, until they get a useful error
message instead of an unexplainable crash.
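The "weak pointers for backpointers" item can be sketched with the smart pointers that were later standardized. This is an illustration under that assumption (C++11 std::shared_ptr and std::weak_ptr), with hypothetical type and function names; it is not the design proposed here.

```cpp
#include <memory>

struct Parent;

struct Child {
    std::weak_ptr<Parent> parent;    // backpointer: does not keep the parent alive
};

struct Parent {
    std::shared_ptr<Child> child;    // owning pointer: counted
};

// Returns true if the parent is freed even though the child points back at it.
bool parent_released_despite_backpointer()
{
    std::weak_ptr<Parent> probe;
    {
        std::shared_ptr<Parent> p = std::make_shared<Parent>();
        p->child = std::make_shared<Child>();
        p->child->parent = p;        // weak link, so no reference cycle
        probe = p;
    }                                // last owner of the parent goes away here
    return probe.expired();
}
```

Had the backpointer been a counted pointer, parent and child would keep each other alive and neither would ever be freed.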
From the programmer's point of view, the biggest problem
will be converting to implicit allocation. The smart pointer/
collection/auto_ptr design is the key issue here. (The current
semantics of auto_ptr are too weird, even after three rounds of
tries in the standard. Mostly this is because the concept needs
more support in the language itself. I recommend rereading the
auto_ptr history (where is that?) before commenting on this.)
For a long time, I was thinking in terms of "owning pointers"
and "use pointers", which is basically what people do in C++ now
without language support. This avoids reference counting. The
problem is that it doesn't encapsulate well within the C++
structure. C++, and especially the STL, doesn't support
uncopyable objects well. This is why auto_ptr has weird assignment
semantics and collections of auto_ptr don't work.
Reference counts do work, and are well understood. The
only problem is performance. So I'm thinking in terms of reference
counted allocation with optimizations to remove most of the
reference count updates. Some of this optimization is to be
done by the compiler, and some by the programmer. The
programmer's tool for this is the "temporary reference", an
"auto" reference taken from a reference-counted pointer and
guaranteed not to outlive the object it was taken from.
(In particular, this means you can't assign a temporary
reference to anything of larger scope, because then a copy
could live too long.) The "auto" keyword, which isn't
used much and is legal in these situations now, might be used:
void foo(auto valarray<double>& aref); // auto restrictions apply
void bar()
{
    valarray<double> a;
    valarray<double>& ref = a; // ref to a, reference counted
    auto valarray<double>& aref = a; // auto ref to a, no count needed
    auto valarray<double>& bref = aref; // OK, narrower scope
    static valarray<double>& cref = aref; // ERROR: cref can outlive aref.
    foo(aref); // param must be auto
}
Note the passing of a parameter as an auto ref. That
means the function doesn't have to do any reference counting
on the object at all. (Great for math libraries). The
only restriction is that it can't keep a pointer to the
object that outlives the function's return, because that
would break reference counting.
This gets us the safety of reference counting with NO overhead
in inner loops. And it's easily understandable and easy to
retrofit to existing code, given, of course, proper language
and compiler support. So there's a concrete example of added
safety without overhead.
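The cost difference between a counted parameter and a borrowed one can be observed with library smart pointers. The sketch below assumes C++11's std::shared_ptr in place of the reference-counted pointer discussed above; all names are hypothetical. Unlike the proposed "auto" reference, a plain C++ reference gives no compile-time guarantee that the callee will not keep a pointer to the object.

```cpp
#include <memory>
#include <valarray>

using Array = std::valarray<double>;

// Borrowing callee: a plain reference, no reference count touched.
double total(const Array& aref) { return aref.sum(); }

// Owning callee: copying the smart pointer bumps the count for the call.
long count_when_passed_by_value(std::shared_ptr<Array> p) { return p.use_count(); }

struct Observed { double sum; long by_value; long after_borrow; };

Observed observe()
{
    std::shared_ptr<Array> owner = std::make_shared<Array>(1.5, 4); // four elements of 1.5
    Observed r;
    r.by_value = count_when_passed_by_value(owner); // 2: the owner plus the parameter copy
    r.sum = total(*owner);                          // borrowed for the duration of the call
    r.after_borrow = owner.use_count();             // still 1
    return r;
}
```

Passing by value costs an increment and a decrement of the count per call, while the borrowed reference costs none, which is the saving claimed for inner loops.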
> Keep in mind that different users have different needs for safety; it's
> entirely appropriate for the standard to require only the least common
> denominator of safety, leaving market forces to impose requirements for
> higher levels of safety.
Huh? What is this, the Republican party platform?
It's important for the C++ community to come out of denial
and realize the language has major problems. If it didn't,
we wouldn't have the need for languages that are very C++ like
but are safe, like Java and C#. In particular, I'm concerned that
if the safety problem isn't fixed, Microsoft will run over C++ with
their proprietary C#.
John Nagle
Animats
---
Author: John Nagle <nagle@animats.com>
Date: 01 Jul 01 02:30:13 GMT Raw View
Attila Feher wrote:
>
> Ron Natalie wrote:
> [SNIP]
> Now what I don't get is: for MS environments there is BoundsChecker
> doing this. Here we have OS support (Solaris) and since we did not like
> it we have made our own "boundschecker for xxx". So the solution for
> _testing_ exists. _No one_ would do a _final_ application in this strict
> mode anyways, since it _does_ come with speed and memory use penalties.
The problem with such tools is that they tend to impose so much
overhead that they're turned off early. If we can get checking
overhead down to 10-20%, it can be left on in important programs.
Note that Java programs routinely run with checking on, so when
it does break in production, you have a good idea why.
I realize that there are some macho programmers who consider
checking to be an insult to their masculinity, but castration
anxiety is not a criterion for language design.
John Nagle
Animats
Author: jthorn@galileo.thp.univie.ac.at (Jonathan Thornburg)
Date: 01 Jul 01 02:34:12 GMT Raw View
In article <3B3C1CCB.DB3A455B@animats.com>,
John Nagle <nagle@animats.com> wrote:
> -- Iterator arithmetic is safe, because iterators have
> meaningful error semantics and can be checked.
> Pointer arithmetic is not safe. I propose to require
> the use of iterators where pointer arithmetic is
> desired.
> [[...]]
> -- For backwards compatibility, we could allow
> arithmetic on pointers to const. This retains
> "const char*", which can only cause the
> reading of junk or a machine exception; it can't
> overstore. This allows "printf", and
> it disallows "sprintf", cause of many
> buffer overflows.
Unfortunately, it also disallows snprintf() , which [assuming a
complete reengineering into C++ strings isn't appropriate] is the
standard way of fixing sprintf() .
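A minimal sketch of why snprintf() counts as the fix: it takes the buffer size and truncates instead of overrunning. The wrapper name format_id is hypothetical; std::snprintf is the C99 function as exposed through <cstdio> in C++11.

```cpp
#include <cstddef>
#include <cstdio>

// Formats "id=<n>" into buf without ever writing past buf[size - 1].
// Returns the length the full text would have needed, as snprintf does.
int format_id(char* buf, std::size_t size, int id)
{
    return std::snprintf(buf, size, "id=%d", id);
}
```

With a four-byte buffer and the id 12345, the buffer receives "id=" plus the terminator and the return value of 8 tells the caller that truncation occurred. The function still writes through a non-const char*, which is why a rule based only on pointers to const would reject it along with sprintf().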
The problem of ensuring safety is an interesting and challenging one.
I'm just not sure it's solvable in practice. I'd really want to see
(say) apache and lynx modified to use a proposed "safe" dialect before
I'd be convinced that this is workable in practice.
--
-- Jonathan Thornburg <jthorn@thp.univie.ac.at>
http://www.thp.univie.ac.at/~jthorn/home.html
Universitaet Wien (Vienna, Austria) / Institut fuer Theoretische Physik
"It's every man for himself, said the elephant as he danced on the
anthill!" -- T. C. "Tommy" Douglas's description of ((you guess))
Author: "James Kuyper Jr." <kuyper@wizard.net>
Date: 02 Jul 01 03:00:09 GMT Raw View
John Nagle wrote:
>
> James Kuyper wrote:
> > What do you consider the minimal set? What constitutes safety?
>
> I answered some of this in my previous message.
Not adequately. I'm having a hard time figuring out how to specify what
it is that is missing, but your proposal is too vague to be evaluated.
> Again, a brief summary:
> -- optimized iterator checking
> -- optimized reference counting
It's outside the scope of the standard to specify optimization. An
implementation of C++ that takes 10 billion years to evaluate "p = p;"
would still be a conforming implementation, and that's a deliberate
feature of the standard.
> -- iterator arithmetic allowed, pointer arithmetic not.
I'm very unclear about what you mean by this. Under the current
standard, pointers are iterators; pointer arithmetic IS one particular
kind of iterator arithmetic. One possibility is that you're proposing a
change in the definition of iterators. Could you describe that change?
Another possibility is that there are certain features of iterator
arithmetic, which come up only when the iterator happens to be a
pointer, that you want to disallow. Could you specify what those
features are?
> -- restrictions on unsafe casts
I'm not sure what you mean by this. There are several different ways
that the standard can say that a given piece of code is bad:
"The code allows undefined behavior."
"The code is ill-formed."
"The code requires a diagnostic."
One thing the standard very deliberately never does, is prohibit an
implementation from accepting code. An implementation is always free to
define extensions which make any particular non-conforming code
acceptable to that particular implementation. I hope you're not
suggesting that be changed.
So, which of the above things would you like to be specified for unsafe
casts? Most of the ways in which casts are unsafe are already covered by
"undefined behavior". The C++ standard specifies that code which
contains a violation of any diagnosable rule requires a diagnostic; that
seems to cover all of the remaining cases worth worrying about. I
presume you don't want to require a diagnostic for any case of unsafe
casting that cannot be diagnosed? Do you want some of those cases
promoted to "ill-formed"? That's seldom appropriate; "ill-formed" is
meant to be a syntactic judgment; most of the ways in which casts can be
bad are semantic.
Of course, in real life, the diagnostic requirement is widely ignored,
and for good reason - it's excessively broad. For some rules, diagnosing
violation of them is equivalent to solving the halting problem; they can
unpredictably require infinite time to diagnose. I don't think it's
appropriate to require that such violations be diagnosed. This has led
to a situation where implementors essentially diagnose whatever they
want to diagnose. If it's too much trouble, they excuse it as
"undiagnosable". I prefer C99's approach, which makes it far clearer
which rule violations must be diagnosed. It's still not entirely clear,
but it's much clearer.
> -- no null reference rule enforced
Please describe the form this enforcement should take.
...
> > Keep in mind that different users have different needs for safety; it's
> > entirely appropriate for the standard to require only the least common
> > denominator of safety, leaving market forces to impose requirements for
> > higher levels of safety.
>
> Huh? What is this, the Republican party platform?
I'm not a Republican - I've never seen a Republican candidate I could
vote for. But I am a believer in free markets (which is one reason why I
vote Democratic as seldom as possible - believe it or not, there are
other parties than those two in the US).
I don't believe that it's a good idea for the standard to mandate a safe
mode; if I want the level of safety you're suggesting, I'll buy a
compiler that provides it. If I want more efficiency than a safe
implementation can provide, I'll buy a compiler that provides it. You
say that this "strict" mode would be optional; but it's an option that
every implementation of C++ would be required to support. If I won't be
needing to use the "strict" mode, I don't want my compiler vendor to pay
the costs of the extra time it took to implement this mandatory option,
which they would then pass on to me.
> It's important for the C++ community to come out of denial
> and realize the language has major problems. If it didn't,
> we wouldn't have the need for languages that are very C++ like
> but are safe, like Java and C#. In particular, I'm concerned that
> if the safety problem isn't fixed, Microsoft will run over C++ with
> their proprietary C#.
There's always a tradeoff between safety and efficiency. If the C++
standard were to set minimum requirements for pointer safety, those
requirements would indirectly set upper limits on the efficiency of an
implementation. Java is very popular right now, but it's not universally
popular, and this is one of the reasons. The best available Java
implementations on a given platform are often very slow, and things like
this are part of the cause.
Author: John Nagle <nagle@animats.com>
Date: 2 Jul 2001 10:43:28 -0400 Raw View
Jonathan Thornburg wrote:
>
> In article <3B3C1CCB.DB3A455B@animats.com>,
> John Nagle <nagle@animats.com> wrote:
> > -- Iterator arithmetic is safe, because iterators have
> > meaningful error semantics and can be checked.
> > Pointer arithmetic is not safe. I propose to require
> > the use of iterators where pointer arithmetic is
> > desired.
> > [[...]]
> > -- For backwards compatibility, we could allow
> > arithmetic on pointers to const. This retains
> > "const char*", which can only cause the
> > reading of junk or a machine exception; it can't
> > overstore. This allows "printf", and
> > it disallows "sprintf", cause of many
> > buffer overflows.
>
> Unfortunately, it also disallows snprintf() , which [assuming a
> complete reengineering into C++ strings isn't appropriate] is the
> standard way of fixing sprintf().
Agreed. But we've had C++ strings for a few years now;
it's not unreasonable to deprecate some older C features.
> The problem of ensuring safety is an interesting and challenging one.
> I'm just not sure it's solvable in practice. I'd really want to see
> (say) apache and lynx modified to use a proposed "safe" dialect before
> I'd be convinced that this is workable in practice.
I agree completely about Apache. It's important, widely used,
and needs to be safe. Lynx probably isn't worth the effort.
John Nagle
Animats
---
Author: John Nagle <nagle@animats.com>
Date: 2 Jul 2001 12:32:35 -0400 Raw View
"James Kuyper Jr." wrote:
>
> John Nagle wrote:
> ...I'm having a hard time figuring out how to specify what
> it is that is missing, but your proposal is too vague to be evaluated.
Agreed. I'm writing a preliminary draft, and will post
a link shortly, probably around mid-week.
> There's always a tradeoff between safety and efficiency.
No. It's possible to have safety and run-time efficiency.
The real tradeoffs involve ease of programming and compiler
complexity. The problem is finding an acceptable combination of
things to ask of the programmer, things to ask of
the compiler, and things to ask of the run-time system which
result in safety.
Thought for today: a 20% performance penalty is four months
of Moore's Law. It's easy to lose four months on a big project
just chasing memory-related bugs.
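The arithmetic behind the "four months" figure can be checked with a short calculation. The sketch below assumes that a 20% penalty means the program takes 1.2 times as long, and that hardware speed doubles at a fixed interval; the doubling period is an assumption, not something stated above. With an 18-month doubling period the result is about 4.7 months, and with 24 months about 6.3.

```cpp
#include <cmath>

// Months of hardware progress needed to win back a slowdown, assuming
// speed doubles every `doubling_months` months (an assumed figure).
double months_to_recover(double slowdown, double doubling_months)
{
    // A program taking (1 + slowdown) times as long needs that factor in speed.
    return doubling_months * std::log2(1.0 + slowdown);
}
```

The estimate only applies to platforms whose speed actually follows such a curve.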
John Nagle
Animats
---
Author: Ron Natalie <ron@sensor.com>
Date: 02 Jul 01 18:21:50 GMT Raw View
dave wrote:
>
> Ron Natalie <ron@sensor.com> writes:
>
> > Yeah, that there is a place for languages that provide some
> > insulation against these issues at the cost of performance.
> > We're still 30 times faster in our C++ benchmarks over equivalent
> > JAVA code, we ain't switching for the critical stuff.
>
> This isn't always true. Microsoft's JIT compiler makes Java bytecodes run
> as fast or faster than their native C code on some occasions. I don't think
> C++ has many occasions to be faster than C [perhaps as fast as C and rarely
> faster IMHO].
>
> This proves one of two things.
> 1) Microsoft wrote a great JIT compiler that runs Java class bytecode almost
> as fast as native binaries and they did a great job of it.
> 2) Microsoft compilers for C/C++ just suck :).
It's not always false either and that's my point. Yes Java can be as
fast as some C++ code, but there are times when it is still 30 times
slower. My point was that even though JAVA is superior in a lot of
ways for a lot of tasks, there are still times when C++ is going to
be required.
[ Send an empty e-mail to c++-help@netlab.cs.rpi.edu for info ]
[ about comp.lang.c++.moderated. First time posters: do this! ]
[ Note that the FAQ URL has changed! Please update your bookmarks. ]
Author: Pete Becker <petebecker@acm.org>
Date: 02 Jul 01 18:22:05 GMT Raw View
John Nagle wrote:
>
> Thought for today: a 20% performance penalty is four months
> of Moore's Law. It's easy to lose four months on a big project
> just chasing memory-related bugs.
>
Now you just have to convince the 8051 chip that it should grow in
accordance with Moore's law.
--
Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
Author: James Kuyper <kuyper@wizard.net>
Date: 03 Jul 01 17:27:37 GMT Raw View
John Nagle wrote:
>
> "James Kuyper Jr." wrote:
> >
> > John Nagle wrote:
> > ...I'm having a hard time figuring out how to specify what
> > it is that is missing, but your proposal is too vague to be evaluated.
>
> Agreed. I'm writing a preliminary draft, and will post
> a link shortly, probably around mid-week.
>
> > There's always a tradeoff between safety and efficiency.
>
> No. It's possible to have safety and run-time efficiency.
Claiming there's no tradeoff doesn't make it so. To take just one
example from your list: you can't have checked pointers without having
time spent checking them, and space used up somewhere storing the
information about what the valid range for each pointer is. Also, a
language implementor can't provide checked pointers as a built-in
option, without spending a fair amount of time implementing them.
The key things that determine what trade-offs make sense are the cost of
programmer time, and the cost of CPU time, and how much of each are
needed for a given application. For a simple application needing very
little programmer time, but which will be consuming a lot of CPU time,
efficiency can have much greater relative importance than it would for a
complicated program that will run exactly once and will complete in a
few seconds. The cost of CPU time has dropped dramatically, reducing the
importance of efficiency, and producing a corresponding increase in the
relative importance of safety. However, the importance of efficiency
will never drop to exactly 0, until the cost of CPU time also drops to
exactly 0.
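To make the cost concrete, here is a minimal sketch of what a "checked pointer" might look like if written by hand. The name checked_ptr and its layout are assumptions made for illustration, not anything from the proposal: the bounds take two extra words per pointer, and every dereference pays for a comparison.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical "checked pointer": the current position plus the bounds
// of the array it points into. Illustrative only.
struct checked_ptr {
    int* cur;    // where the pointer currently points
    int* begin;  // first valid element
    int* end;    // one past the last valid element

    // Time cost: every dereference tests the position against the bounds.
    int& deref() const {
        if (cur < begin || cur >= end) throw std::out_of_range("checked_ptr");
        return *cur;
    }
};

// Space cost: three pointers are stored where a raw pointer needs one.
static_assert(sizeof(checked_ptr) >= 3 * sizeof(int*), "bounds cost space");
```

A compiler that can prove the test always succeeds could drop it, but the general case pays for both the storage and the comparison.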
Author: Attila Feher <Attila.Feher@lmf.ericsson.se>
Date: 5 Jul 2001 11:44:28 -0400 Raw View
John Nagle wrote:
>
> Attila Feher wrote:
> >
> > Ron Natalie wrote:
> > [SNIP]
> > Now what I don't get is: for MS environments there is BoundsChecker
> > doing this. Here we have OS support (Solaris) and since we did not like
> > it we have made our own "boundschecker for xxx". So the solution for
> > _testing_ exists. _No one_ would do a _final_ application in this strict
> > mode anyways, since it _does_ come with speed and memory use penalties.
>
> The problem with such tools is that they tend to impose so much
> overhead that they're turned off early. If we can get checking
> overhead down to 10-20%, it can be left on in important programs.
> Note that Java programs routinely run with checking on, so when
> it does break in production, you have a good idea why.
>
> I realize that there are some macho programmers who consider
> checking to be an insult to their masculinity, but castration
> anxiety is not a criterion for language design.
Slipping into personal insults adds nothing of technical value to the
argument. When doing realtime work (where there ain't no such thing as
enough CPU power) there is no time left for checking in the production binary.
Language level support (which can be 100% removed from production code)
however would be great!
[Moderator's note: we discussed John Nagle's article when it
was submitted, but we were of the opinion that his wording,
while colorful, was not intended as a personal attack.
However, we strongly encourage all posters to take care to avoid
making comments which could be construed as personal attacks.]
A
---
Author: John Nagle <nagle@animats.com>
Date: 5 Jul 2001 11:47:29 -0400 Raw View
James Kuyper wrote:
>
> John Nagle wrote:
> >
> > "James Kuyper Jr." wrote:
> Claiming there's no tradeoff doesn't make it so. To take just one
> example from your list: you can't have checked pointers without having
> time spent checking them, and space used up somewhere storing the
> information about what the valid range for each pointer is.
Consider
int sum(vector<int>& tab)
{   int total = 0;
    for (vector<int>::iterator p = tab.begin(); p != tab.end(); p++)
    {   total += *p; }
    return total;
}
What checks do we need? Let's write them as asserts.
int sum(vector<int>& tab)
{   int total = 0;
    for (vector<int>::iterator p = tab.begin();
         p != tab.end();
         assert(bound_to_collection(p)), assert(p != tab.end()), p++)
    {   assert(bound_to_collection(p)); // null check
        assert(p != tab.end()); // end check
        total += *p;
    }
    return total;
}
("bound_to_collection" tests whether the iterator is bound to a valid
collection, rather than just being an uninitialized iterator.)
Now we optimize. A simple static analysis indicates that
the iterator p is initially bound to a collection, and that no
operation performed on p can unbind it from that collection.
So all the "bound_to_collection" tests are true and can be dropped.
int sum(vector<int>& tab)
{   int total = 0;
    for (vector<int>::iterator p = tab.begin();
         p != tab.end();
         assert(p != tab.end()), p++)
    {   assert(p != tab.end()); // end check
        total += *p;
    }
    return total;
}
Next, we use some simple program verification techniques to
observe that the second "assert(p != tab.end())" follows the
loop test "p != tab.end()" without intervening operations on
p. Therefore, "(p != tab.end()) implies (p != tab.end())",
and we can remove the second assert.
int sum(vector<int>& tab)
{   int total = 0;
    for (vector<int>::iterator p = tab.begin(); p != tab.end();
         assert(p != tab.end()), p++)
    {
        total += *p;
    }
    return total;
}
Continuing in this vein, we observe that p is not modified
within the loop other than in the FOR statement, and thus
the loop termination test "p != tab.end()" from the
previous iteration implies the precondition for "p++".
Thus, that test can't fail. This leaves
int sum(vector<int>& tab)
{   int total = 0;
    for (vector<int>::iterator p = tab.begin(); p != tab.end(); p++)
    {
        total += *p;
    }
    return total;
}
from which all the run-time checks have been removed at compile
time. At this point, run-time information connecting the iterator
to the collection for checking purposes is no longer needed
and can be dropped. All run-time checking overhead has now
been eliminated.
This is a very simple form of program verification. Much
more can be done; see my old paper, Practical Program
Verification, in POPL '83. Using some of those techniques
for optimization of checks is very effective, and far easier
than full proof of correctness. Anything you can prove easily,
you don't have to check; anything you can't gets checked at
run time.
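For readers who want to see the "before optimization" state as real code, here is a rough sketch of an iterator that carries its collection with it and checks every operation. The class checked_iter and the function checked_sum are hypothetical names invented for this illustration; no compiler or library is claimed to work this way.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical checked iterator: it remembers which collection it is
// bound to, so each operation can be validated at run time.
class checked_iter {
public:
    checked_iter() : tab_(nullptr), pos_(0) {}  // unbound iterator
    explicit checked_iter(const std::vector<int>& tab) : tab_(&tab), pos_(0) {}

    bool bound_to_collection() const { return tab_ != nullptr; }

    bool at_end() const {
        check_bound();
        return pos_ == tab_->size();
    }

    int operator*() const {
        check_bound();  // the "null check"
        if (pos_ == tab_->size())  // the "end check"
            throw std::out_of_range("dereference of end iterator");
        return (*tab_)[pos_];
    }

    checked_iter& operator++() {
        check_bound();
        if (pos_ == tab_->size())
            throw std::out_of_range("increment past end");
        ++pos_;
        return *this;
    }

private:
    void check_bound() const {
        if (!tab_) throw std::logic_error("iterator not bound to a collection");
    }
    const std::vector<int>* tab_;  // run-time link to the collection
    std::size_t pos_;
};

// Same loop as sum(), written against the checked iterator.
int checked_sum(const std::vector<int>& tab) {
    int total = 0;
    for (checked_iter p(tab); !p.at_end(); ++p) total += *p;
    return total;
}
```

In checked_sum every check is implied by the loop test, which is the situation the steps above exploit; the checks only ever fire for code that misuses the iterator.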
John Nagle
Animats
---
Author: Olaf Krzikalla <Entwicklung@reico.de>
Date: 05 Jul 01 19:14:27 GMT Raw View
Hi,
John Nagle wrote:
> [strict mode]
I like the idea in general, but I would not restrict the term 'strict
mode' to memory handling only. To achieve safer code there are other
things I would like to see in strict mode. Two of them are:
- forbid implicit conversions of built-in types
- force the evaluation of all function results (or have the possibility
to declare a function in such a way that the result can't be ignored)
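Neither item existed in the language when this was posted. As a rough sketch of what they could look like, later standards added partial answers: brace initialization (C++11) rejects narrowing conversions, and the [[nodiscard]] attribute (C++17) asks the compiler to warn when a result is dropped. The function names below are made up for the example.

```cpp
#include <cassert>

// Hypothetical function whose result should not be ignored.
// Writing "try_reserve(8);" as a statement draws a compiler warning.
[[nodiscard]] bool try_reserve(int slots) {
    return slots > 0 && slots <= 1024;
}

// Hypothetical function showing an explicit built-in conversion.
int scaled(double factor) {
    // int n{factor * 10};  // ill-formed: narrowing double -> int in braces
    int n{static_cast<int>(factor * 10)};  // conversion spelled out
    return n;
}
```

This covers only part of the wish list: ordinary assignment and parenthesized initialization still convert implicitly, and [[nodiscard]] produces a warning rather than an error.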
Best regards
Olaf Krzikalla
Author: Ron Natalie <ron@sensor.com>
Date: 28 Jun 01 02:47:38 GMT Raw View
John Nagle wrote:
>
> Almost all, if not all, the mainstream languages that postdate
> C++ are "safe" with regard to memory allocation. Java, C#, Perl,
> Python, and the various "scripting languages" protect against
> dangling pointers, invalid pointers, and most memory leaks.
The ones you mention are all "scripting languages".
> This should tell us something.
Yeah, that there is a place for languages that provide some
insulation against these issues at the cost of performance.
We're still 30 times faster in our C++ benchmarks over equivalent
JAVA code, we ain't switching for the critical stuff.
>
> -- Strict mode should have the minimal set of constraints
> required to provide memory allocation safety.
You want to enumerate how this is to happen? You want to remove
pointers from the language like Java did?
> -- Strict mode should not impose a significant performance
> penalty at run time. More static checking at compile time
> is acceptable. That's the big difference between this
> approach (and C++ generally) and Java, etc.
Yeah, well nothing comes for free? Why do you think there is a
performance penalty in Java, etc...?
>
> Note that if achieved, this would mean, among other things,
> the end of "buffer overflow" security exploits, a major
> reduction if not elimination of memory leaks in long running
> programs, and a significant reduction in hard to find bugs.
> Those are some of the reasons programmers and projects are
> moving away from C++.
It's not clear how you are going to achieve any of this without
substantially changing the language, either to remove all the
constructs which would be unsafe or to put significant runtime
checking on things like pointer manipulation with a lot of attendant
bookkeeping to keep this from happening.
It is also laughable to think that this will end "buffer overflow"
security exploits. Those programs were laughably wrong to begin with
and this won't fix it. We've exploited similar things in "safe"
environments. Failure to check bounds on input is a mistake even
with shell scripts, LIST, or JAVA.
Author: Ernest Friedman-Hill <ejfried@alum.mit.edu>
Date: Thu, 28 Jun 2001 15:39:12 GMT Raw View
Ron Natalie wrote:
>
> John Nagle wrote:
> >
> > Almost all, if not all, the mainstream languages that postdate
> > C++ are "safe" with regard to memory allocation. Java, C#, Perl,
> > Python, and the various "scripting languages" protect against
> > dangling pointers, invalid pointers, and most memory leaks.
>
> The ones you mention are all "scripting languages".
Ummm, no. C# and Java are both compiled languages.
>
> > This should tell us something.
>
> Yeah, that there is a place for languages that provide some
> insulation against these issues at the cost of performance.
> We're still 30 times faster in our C++ benchmarks over equivalent
> JAVA code, we ain't switching for the critical stuff.
Ummmm, no. For server-side stuff, Java is usually well inside a factor of
two of equivalent C++ code. It depends a lot on the C++ compiler
and the JVM, of course. I've never seen a C# benchmark, but it's going to
be pretty fast, too.
> >
> > -- Strict mode should have the minimal set of constraints
> > required to provide memory allocation safety.
>
> You want to enumerate how this is to happen? You want to remove
> pointers from the language like Java did?
He said he will.
>
> > -- Strict mode should not impose a significant performance
> > penalty at run time. More static checking at compile time
> > is acceptable. That's the big difference between this
> > approach (and C++ generally) and Java, etc.
>
> Yeah, well nothing comes for free? Why do you think there is a
> performance penalty in Java, etc...?
No, not free; but not terribly costly, either.
>
> >
> > Note that if achieved, this would mean, among other things,
> > the end of "buffer overflow" security exploits, a major
> > reduction if not elimination of memory leaks in long running
> > programs, and a significant reduction in hard to find bugs.
> > Those are some of the reasons programmers and projects are
> > moving away from C++.
>
> It's not clear how you are going to achieve any of this without
> substantially changing the language, either to remove all the
> constructs which would be unsafe or to put significant runtime
> checking on things like pointer manipulation with a lot of attendant
> bookkeeping to keep this from happening.
>
> It is also laughable to think that this will end "buffer overflow"
> security exploits. Those programs were laughably wrong to begin with
> and this won't fix it. We've exploited similar things in "safe"
> environments. Failure to check bounds on input is a mistake even
> with shell scripts, LIST, or JAVA.
Well, for Java you're just plain wrong. For UNIX shells, no one has ever
called them "safe." But plenty of the other environments John mentioned are safe, and
there's no such thing as a buffer overrun in one of these languages.
And what the heck is "LIST"? LISP, maybe?
--
---------------------------------------------------------
Ernest Friedman-Hill
Distributed Systems Research Phone: (925) 294-2154
Sandia National Labs FAX: (925) 294-2234
Org. 8920, MS 9012 ejfried@ca.sandia.gov
PO Box 969 http://herzberg.ca.sandia.gov
Livermore, CA 94550
======================================= MODERATOR'S COMMENT:
Please don't quote the signature block.
---
Author: Attila Feher <Attila.Feher@lmf.ericsson.se>
Date: 28 Jun 2001 18:22:54 -0400 Raw View
Ron Natalie wrote:
[SNIP]
> You want to enumerate how this is to happen? You want to remove
> pointers from the language like Java did?
[SNIP]
> > Note that if achieved, this would mean, among other things,
> > the end of "buffer overflow" security exploits, a major
> > reduction if not elimination of memory leaks in long running
> > programs, and a significant reduction in hard to find bugs.
> > Those are some of the reasons programmers and projects are
> > moving away from C++.
>
> It's not clear how you are going to achieve any of this without
> substantially changing the language, either to remove all the
> constructs which would be unsafe or to put significant runtime
> checking on things like pointer manipulation with a lot of attendant
> bookkeeping to keep this from happening.
>
> It is also laughable to think that this will end "buffer overflow"
> security exploits. Those programs were laughably wrong to begin with
> and this won't fix it. We've exploited similar things in "safe"
> environments. Failure to check bounds on input is a mistake even
> with shell scripts, LIST, or JAVA.
Now what I don't get is: for MS environments there is BoundsChecker
doing this. Here we have OS support (Solaris) and since we did not like
it we have made our own "boundschecker for xxx". So the solution for
_testing_ exists. _No one_ would do a _final_ application in this strict
mode anyways, since it _does_ come with speed and memory use penalties.
So I just don't get it: why a strict mode when on most of the platforms
solutions for this kind of testing (BoundsChecker, Purify, etc. etc.)
exist?
I am sorry for programmers who leave C++ believing this is the reason.
I believe it is due to their ignorance (not knowing the existing
tools) or to the fact that their project allows no time for proper
design... which is again a human fault, not one of the
language. C++ is a power tool - and you don't give a chainsaw to
someone who is not proficient with it...
Example of "good design" that most probably has no use for any "strict
mode":
class MyBrilliantClass {
public:
MyBrilliantClass() : mModesty( false) { ; }
//..
void doThis( <allpars>);
private:
bool mModesty;
//...
};
void My...::doThis(...) {
// Check arguments
# ifdef MYPROJNAME_PARANOID_DEBUG
// Do tests, throw if error
// In most of the projects no ifdef here!
# endif
// Do real work
# ifdef MYPROJNAME_PARANOID_MEMCHECK
// Do special (platform dependent) test
// to find the case when the "real work"
// screwed up the memory or stack
# endif
}
Of course it would be nice to have support from the language for this:
but since many of these tests are platform dependent I don't see how
it could get into the standard.
BTW, what would be nice is a standardized "debug mode" STL, eg: bounds
check on the vector, exception thrown on dereferencing or incrementing
an end() iterator...
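One piece of this is already in the standard library: std::vector::at() checks the index and throws std::out_of_range, while operator[] does not check. A minimal sketch of relying on it follows; the helper name access_rejected is made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: returns true if reading tab[index] through the
// checked accessor at() is rejected with std::out_of_range.
bool access_rejected(const std::vector<int>& tab, std::size_t index) {
    try {
        (void)tab.at(index);  // at() performs the bounds check
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

This only covers indexed access. Iterators themselves (dereferencing or incrementing end()) are not checked by the standard interface, which is the part a "debug mode" would have to add.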
A
---
Author: James Dennett <jdennett@acm.org>
Date: 29 Jun 01 14:04:05 GMT Raw View
John Nagle wrote:
>
> Almost all, if not all, the mainstream languages that postdate
> C++ are "safe" with regard to memory allocation. Java, C#, Perl,
> Python, and the various "scripting languages" protect against
> dangling pointers, invalid pointers, and most memory leaks. This
> should tell us something.
>
> I propose that the next version of C++ should have a "strict
> mode", which makes code safe with regard to memory allocation.
> I suggest the following design constraints:
>
> -- Strict mode should have the minimal set of constraints
> required to provide memory allocation safety.
> -- Strict mode should not impose a significant performance
> penalty at run time. More static checking at compile time
> is acceptable. That's the big difference between this
> approach (and C++ generally) and Java, etc.
> -- Garbage collection should not be required.
> -- Reference counting may be required, but should be
> optimized out by the compiler when possible. This
> must work well enough that inner loops generally do not
> have reference count overhead.
> -- Both single-thread and multi-thread programs should be
> safe in strict mode.
> -- Destructors should, as at present, be invoked at
> well-defined times, not at some random later time as
> with finalizers.
> -- Run-time errors in strict mode should raise exceptions.
>
> Note that if achieved, this would mean, among other things,
> the end of "buffer overflow" security exploits, a major
> reduction if not elimination of memory leaks in long running
> programs, and a significant reduction in hard to find bugs.
> Those are some of the reasons programmers and projects are
> moving away from C++.
>
> Please comment on these goals. I see them as achievable,
> and will discuss mechanisms at a future time. My question
> now is whether these goals are widely seen as worth the
> effort required to achieve them.
Your ideas closely echo thoughts of my own. The question is
whether an idea of separate "modes" for C++ will fly. I hope
that they can; there's little chance of changing some of the
cruft in C++ unless we explicitly allow such modes so that we
can disable the nastiness in "safe" mode and leave things as
they are in "fast" mode.
-- James Dennett
Author: James Kuyper <kuyper@wizard.net>
Date: 29 Jun 01 14:04:37 GMT Raw View
John Nagle wrote:
>
> Almost all, if not all, the mainstream languages that postdate
> C++ are "safe" with regard to memory allocation. Java, C#, Perl,
> Python, and the various "scripting languages" protect against
> dangling pointers, invalid pointers, and most memory leaks. This
> should tell us something.
>
> I propose that the next version of C++ should have a "strict
> mode", which makes code safe with regard to memory allocation.
> I suggest the following design constraints:
>
> -- Strict mode should have the minimal set of constraints
> required to provide memory allocation safety.
What do you consider the minimal set? What constitutes safety? Some
would argue that 'new' and 'delete' already qualify. I'm not saying that
new and delete are perfect, or even particularly good (though they're a
definite improvement over malloc/free). I'm saying that you need to be
more specific. What specific kinds of requirements do you want the
standard to establish for implementations? Perfect safety is impossible;
large amounts of safety can get arbitrarily expensive. How much safety,
of what kind, are you asking us to buy?
Keep in mind that different users have different needs for safety; it's
entirely appropriate for the standard to require only the least common
denominator of safety, leaving market forces to impose requirements for
higher levels of safety.
...
> Please comment on these goals. I see them as achievable,
> and will discuss mechanisms at a future time. My question
> now is whether these goals are widely seen as worth the
> effort required to achieve them.
Of course they're worthwhile, but whether they're worth the effort
depends upon how much effort you're talking about. That won't be clear
until you've provided some more specific suggestions.
Author: "Stephen Howe" <NOSPAMsjhowe@dial.pipex.com>
Date: 29 Jun 01 14:04:48 GMT Raw View
"John Nagle" <nagle@animats.com> wrote in message
news:3B376536.D51B80C4@animats.com...
> -- Strict mode should have the minimal set of constraints
> required to provide memory allocation safety.
> -- Strict mode should not impose a significant performance
> penalty at run time.
So are you advocating run-time checking? In general, that is against the
direction C++ has been heading.
> More static checking at compile time
> is acceptable.
Sure, that is the direction C++ has been heading and therefore any
additional static checking is a good thing.
Stephen Howe
Author: John Nagle <nagle@animats.com>
Date: 29 Jun 2001 15:33:07 -0400 Raw View
Ernest Friedman-Hill wrote:
>
> Ron Natalie wrote:
> >
> > John Nagle wrote:
> > You want to enumerate how this is to happen? You want to remove
> > pointers from the language like Java did?
>
> He said he will.
Thanks for your comment in comp.std.c++.
I have a number of drafts for "strict C++" designs, none of which
I'm really happy with. If you, or others you know, are interested in
this, let me know. I'd like to get some people thinking about
this.
My basic thinking runs as follows:
-- Memory allocation needs to be semi-explicit. This means
a combination of reference counts and things like auto_ptr,
integrated with collections. Garbage collection is
undesirable for C++; even if it can be made to work,
destructor semantics are too important to C++ to be
replaced with finalizers called from GC.
-- Weak pointers are needed to handle back pointers and
such, to avoid unnecessary circularity. (Think Perl
weak pointers, not Java weak pointers.)
-- In addition to weak and strong pointers, we would
want temporary references and iterators. These
are required to have scope smaller than some
strong pointer, so they can't outlive the strong
pointer. Such temporary references and iterators
then don't need reference counts. This
fixes the performance problem for number-crunching
loops.
-- Iterator arithmetic is safe, because iterators have
meaningful error semantics and can be checked.
Pointer arithmetic is not safe. I propose to require
the use of iterators where pointer arithmetic is
desired.
-- Compilers need to know about iterators so they can
optimize checking. Checks should throw exceptions.
It should be permissible to throw a checking
exception as soon as an error becomes inevitable.
This allows hoisting checks to the top of loops.
-- For backwards compatibility, we could allow
arithmetic on pointers to const. This retains
"const char*", which can only cause the
reading of junk or a machine exception; it can't
overstore. This allows "printf", and
it disallows "sprintf", cause of many
buffer overflows. That's probably a good
compromise between safety and compatibility.
This is enough for single-thread programs. Thread
safety raises other issues, which need to be addressed.
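To illustrate the strong/weak split for back pointers, here is a sketch using std::shared_ptr and std::weak_ptr from later versions of the standard library (C++11). They are a stand-in for the idea, not the mechanism proposed here, and the Node type and function name are invented for the example.

```cpp
#include <cassert>
#include <memory>

// Hypothetical tree node: the parent owns the child through a strong
// pointer; the child refers back through a weak pointer.
struct Node {
    std::shared_ptr<Node> child;  // strong: keeps the child alive
    std::weak_ptr<Node> parent;   // weak: back pointer, does not own
};

// Builds a parent/child pair and returns the parent.
std::shared_ptr<Node> make_pair_of_nodes() {
    auto parent = std::make_shared<Node>();
    parent->child = std::make_shared<Node>();
    parent->child->parent = parent;  // no ownership cycle is created
    return parent;
}
```

Because the back pointer is weak, dropping the last strong reference to the parent destroys it at that point, and the child can detect that with expired() or lock(). Had both links been strong, the two nodes would keep each other alive.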
John Nagle
Animats
http://www.animats.com
---
Author: dave <dleimbac@earthlink.net>
Date: 29 Jun 2001 15:34:54 -0400 Raw View
Ron Natalie <ron@sensor.com> writes:
> Yeah, that there is a place for languages that provide some
> insulation against these issues at the cost of performance.
> We're still 30 times faster in our C++ benchmarks over equivalent
> JAVA code, we ain't switching for the critical stuff.
This isn't always true. Microsoft's JIT compiler makes Java bytecodes run
as fast or faster than their native C code on some occasions. I don't think
C++ has many occasions to be faster than C [perhaps as fast as C and rarely
faster IMHO].
This proves one of two things.
1) Microsoft wrote a great JIT compiler that runs Java class bytecode almost
as fast as native binaries and they did a great job of it.
2) Microsoft compilers for C/C++ just suck :).
---
Author: John Nagle <nagle@animats.com>
Date: 29 Jun 2001 15:35:17 -0400 Raw View
Ron Natalie wrote:
>
> John Nagle wrote:
> >
> > Almost all, if not all, the mainstream languages that postdate
> > C++ are "safe" with regard to memory allocation. Java, C#, Perl,
> > Python, and the various "scripting languages" protect against
> > dangling pointers, invalid pointers, and most memory leaks.
>
> We're still 30 times faster in our C++ benchmarks over equivalent
> JAVA code, we ain't switching for the critical stuff.
> >
> > -- Strict mode should have the minimal set of constraints
> > required to provide memory allocation safety.
>
> You want to enumerate how this is to happen? You want to remove
> pointers from the language like Java did?
We're in the STL era now; raw pointer manipulation is on
the decline. We can take advantage of that fact in the next
go-round of C++.
Iterators are not pointers, although they're sometimes
implemented that way. Bad pointer operations aren't illegal
until the next dereference. Bad iterator operations are
illegal at the point the iterator is updated. Thus, iterator
arithmetic is checkable. Better, most checks can be optimized out.
In particular, optimizing out all the checking for most "for"
loops is straightforward, and the simpler cases of "while" loops aren't
hard either. There's an old paper on Pascal
optimization that found that about 90% of subscript checks
can be optimized out.
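As a hand-written illustration of a check leaving a loop body, compare the two functions below. The names are invented for the example, and the "optimization" is done manually here; a compiler would be expected to do the equivalent transformation itself.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Per-element checking: at() tests the index on every iteration.
int sum_first_checked(const std::vector<int>& tab, std::size_t n) {
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) total += tab.at(i);
    return total;
}

// Hoisted checking: one test before the loop covers every index below n,
// so the body can use the unchecked operator[].
int sum_first_hoisted(const std::vector<int>& tab, std::size_t n) {
    if (n > tab.size()) throw std::out_of_range("n exceeds size");
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) total += tab[i];
    return total;
}
```

Both versions reject the same bad calls with the same exception type. The hoisted one reports the error before doing any work, where the per-element one fails only when it reaches the first bad index.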
Bear in mind that all of this would apply in some yet to be
defined "strict mode", a la Perl "use strict". I realize we
can't make people tighten up their old code much, but that
doesn't mean we can't get rid of some of the legacy problems.
More on this later; it's 1 AM.
John Nagle
Animats
---
Author: James Kanze <James.Kanze@dresdner-bank.com>
Date: 30 Jun 01 05:47:56 GMT Raw View
John Nagle wrote:
> Almost all, if not all, the mainstream languages that postdate
> C++ are "safe" with regard to memory allocation. Java, C#, Perl,
> Python, and the various "scripting languages" protect against
> dangling pointers, invalid pointers, and most memory leaks. This
> should tell us something.
> I propose that the next version of C++ should have a "strict
> mode", which makes code safe with regard to memory allocation.
In what way would it still be C++? Do you count on adding additional
information to the sources somehow?
I'd be very interested in something like ESC for C++. I'm not sure to
what degree it is possible.
[...]
> Note that if achieved, this would mean, among other things, the end
> of "buffer overflow" security exploits, a major reduction if not
> elimination of memory leaks in long running programs, and a
> significant reduction in hard to find bugs. Those are some of the
> reasons programmers and projects are moving away from C++.
Don't promise a silver bullet. I'm convinced that there can be
significant improvement. I'm also convinced that from a practical
point of view, the programmer is going to have to give the compiler a
bit more information. I know of some experiments concerning tracing
values in C++. The results vary -- in at least one case, compile
times increased exponentially with total application size. I'm all in
favor of a maximum of compile time checking, but I can't accept
compile times measured in centuries. (I would actually accept up to
about a week, provided I could be reasonably certain that certain
classes of errors were 100% eliminated.)
> Please comment on these goals. I see them as achievable, and will
> discuss mechanisms at a future time. My question now is whether
> these goals are widely seen as worth the effort required to achieve
> them.
The goals are laudable. They are certainly achievable in the absolute.
The question is, are they achievable with reasonable compile times.
--
James Kanze mailto:kanze@gabi-soft.de
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627
[ Send an empty e-mail to c++-help@netlab.cs.rpi.edu for info ]
[ about comp.lang.c++.moderated. First time posters: do this! ]
Author: John Nagle <nagle@animats.com>
Date: 27 Jun 2001 12:23:52 -0400
Almost all, if not all, the mainstream languages that postdate
C++ are "safe" with regard to memory allocation. Java, C#, Perl,
Python, and the various "scripting languages" protect against
dangling pointers, invalid pointers, and most memory leaks. This
should tell us something.
I propose that the next version of C++ should have a "strict
mode", which makes code safe with regard to memory allocation.
I suggest the following design constraints:
-- Strict mode should have the minimal set of constraints
required to provide memory allocation safety.
-- Strict mode should not impose a significant performance
penalty at run time. More static checking at compile time
is acceptable. That's the big difference between this
approach (and C++ generally) and Java, etc.
-- Garbage collection should not be required.
-- Reference counting may be required, but should be
optimized out by the compiler when possible. This
must work well enough that inner loops generally do not
have reference count overhead.
-- Both single-thread and multi-thread programs should be
safe in strict mode.
-- Destructors should, as at present, be invoked at
well-defined times, not at some random later time as
with finalizers.
-- Run-time errors in strict mode should raise exceptions.
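
As a rough illustration of the reference-counting constraint, the sketch
below shows the kind of transformation a compiler would have to perform.
Everything in it is an assumption made for the example: Buffer, Handle,
and the g_count_updates tally are hypothetical names, and the counter
exists only so that the count traffic can be observed. It is not a
description of how strict mode would implement reference counting.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical counted object (illustration only).
struct Buffer {
    std::vector<int> data;
    long refs;
    explicit Buffer(std::size_t n) : data(n, 1), refs(0) {}
};

// Tally of every reference-count increment and decrement performed.
static long g_count_updates = 0;

// Hypothetical counting handle; each copy and each destruction touches the count.
class Handle {
public:
    explicit Handle(Buffer* b) : b_(b) { acquire(); }
    Handle(const Handle& other) : b_(other.b_) { acquire(); }
    ~Handle() { release(); }
    const Buffer& operator*() const { return *b_; }

private:
    Handle& operator=(const Handle&);  // assignment left out to keep the sketch short

    void acquire() { ++b_->refs; ++g_count_updates; }
    void release() {
        ++g_count_updates;
        if (--b_->refs == 0) delete b_;
    }

    Buffer* b_;
};

// Naive form: a handle is copied on every iteration, so every iteration
// pays one increment and one decrement.
long sum_naive(const Handle& h) {
    long total = 0;
    const std::size_t n = (*h).data.size();
    for (std::size_t i = 0; i < n; ++i) {
        Handle local(h);  // count +1 here, -1 at the end of the iteration
        total += (*local).data[i];
    }
    return total;
}

// Hoisted form: one reference is taken before the loop and keeps the
// buffer alive, so the loop body has no count traffic at all.
long sum_hoisted(const Handle& h) {
    Handle keep_alive(h);  // count +1 once, -1 once on return
    const Buffer& buf = *keep_alive;
    long total = 0;
    for (std::size_t i = 0; i < buf.data.size(); ++i) total += buf.data[i];
    return total;
}

// Reports how many count updates a call to fn performs on an n-element buffer.
long count_updates_during(long (*fn)(const Handle&), std::size_t n) {
    Handle h(new Buffer(n));
    const long before = g_count_updates;
    fn(h);
    return g_count_updates - before;
}
```

For a buffer of n elements the naive loop performs 2n count updates and
the hoisted loop performs 2, whatever n is. The hoisting is valid here
only because nothing in the loop can release the last reference, which is
the property a compiler would need to prove.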
Note that if achieved, this would mean, among other things,
the end of "buffer overflow" security exploits, a major
reduction if not elimination of memory leaks in long running
programs, and a significant reduction in hard to find bugs.
Those are some of the reasons programmers and projects are
moving away from C++.
Please comment on these goals. I see them as achievable,
and will discuss mechanisms at a future time. My question
now is whether these goals are widely seen as worth the
effort required to achieve them.
John Nagle
Animats
---