Thread

Topic: Defect Report: 'use' of invalid pointer value not defined

Author: "James Russell Kuyper Jr." <kuyper@wizard.net>
Date: Tue, 25 Sep 2001 16:46:18 GMT

Martin von Loewis wrote:
>
> "James Russell Kuyper Jr." <kuyper@wizard.net> writes:
>
> > > a) A value is 'used' in a program if a variable holding this value
> > >    appears in an expression that is evaluated.
> >
> > A variable stores a value; it isn't the same thing as the value. That is
> > a use of the variable, but not necessarily a use of the value stored in
> > it. The value is used if it plays a part in determining the value of the
> > expression.
>
> How do I find out whether a value plays part in determining the value
> of the expression?

By reading the description of the operator. For instance, i+j calculates
the sum of the value stored in i, and the value stored in j. It can't
calculate that sum without using those values. i=j, on the other hand,
takes the value stored in j, and stores a copy of it in i. The value
stored in i plays no role in that operation.
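A minimal sketch of that distinction applied to pointers, under the reading given above. The function name delete_and_reset is hypothetical, chosen only for illustration; nothing here is quoted from the standard.

```cpp
// Sketch only: delete_and_reset is a hypothetical name used for illustration.
bool delete_and_reset()
{
    int* x = new int(0);
    int* y = new int(1);

    int sum = *x + *y;  // '+' cannot be computed without the values of both operands

    delete x;           // the value stored in x is now invalid
    x = y;              // like i=j: the value in y is read; x is only the destination,
                        // so the invalid value previously stored in x plays no role

    bool ok = (x == y) && (sum == 1);  // both pointers are still valid here

    delete y;           // x and y now both hold an invalid value; neither is read again
    return ok;
}
```

Under this reading, the assignment after the delete is fine because the old value of x is overwritten rather than read.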

> > > b) A value is 'used' if an expression evaluates to that value.
> > >    This would render the sequence
> > >
> > >    int *x = new int(0);
> > >    delete x;
> > >    x->~int();
> > >
> > >    into undefined behaviour; according to 5.2.4, the variable x is
> > >    'evaluated' as part of evaluating the pseudo destructor call. This,
> >
> > Yes, that is undefined behavior, for two separate reasons. In addition
> > to the reason you've given, that code calls the destructor twice on the
> > same object.
>
> Actually, it doesn't, since int has no destructor (5.3.5/6 says that
> the destructor is called only if the object has any). If I rewrite

You're right. I was thinking of the general case, but for 'int' there's
no such problem.

> this program as
>
> int *x = new int(0);
> operator delete(x);
> x->~int();
>
> then this objection goes away, right? Is then the program still with
> undefined behaviour?

Again, I was thinking primarily of the general case, but I have to admit
that for 'int', the explicit destructor call shouldn't actually need to
use the indeterminate value stored in 'x', in order to call the
non-existent destructor.

...
> > >    in turn, would mean that all containers (23) of pointers show
> > >    undefined behaviour, e.g. 23.2.2.3 requires to invoke the
> > >    destructor as part of the clear() method of the container.
> >
> > The clear() method would not require any code comparable to the example
> > you've given.
>
> If the application deletes the elements of the container before
> clearing it, then this code sequence would occur. Consider
>
> #include <vector>
> int main(){
>   std::vector<int*> l;
>   l.push_back(new int(0));
>   delete l[0];
>   l.clear();
> }
>
> Does that program show undefined behaviour?

No, but only because there's only one element in the vector. If there
were two elements, l.clear() could legally use the copy constructor to
copy the value of one into the other's location, before erasing them
both. It couldn't use the assignment operator, since clear() has a
complexity requirement on the assignment operator. It also couldn't use
a temporary extra element, because it would have to destroy the
temporary, and clear() has complexity limit on calls to the destructor.

Actually, using the as-if rule, you could argue that std::vector<T>
could be specialized for built-in types to ignore the limit on
destructor calls. That would allow clear() to make a copy even if
there's only one element in the vector.

If std::vector<T>::clear() did copy a deallocated pointer, the behavior
would be undefined. However, that undefined behavior is fundamentally
attributable to the delete, not the container. l[0] is no longer
CopyConstructible after the delete.

...
> > A container of pointers does not delete the pointed-at objects when the
> > pointers are destroyed. Therefore, the values of the pointers stored in
> > the container are not themselves rendered indeterminate; The destructor
> > is called only for the pointers themselves. This is, of course, a
> > potential memory leak if those were the only copies you had of pointers
> > to previously allocated memory.
>
> You are right; I was thinking about the case where the application
> deleted the elements of the container before clearing it.

That's not legal, for the reasons given.

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]

Author: "James Russell Kuyper Jr." <kuyper@wizard.net>
Date: Tue, 25 Sep 2001 17:05:53 GMT

thp@cs.ucr.edu wrote:
>
> James Russell Kuyper Jr. <kuyper@wizard.net> wrote:
> : thp@cs.ucr.edu wrote:
> :>
> :> James Russell Kuyper Jr. <kuyper@wizard.net> wrote:
> : ...
> :> : The variable 'x' is used, but only as a place to store a new value; the
> :> : value stored in 'x' is not used after the 'delete' which renders it
> :> : indeterminate.
> :>
> :> I completely agree, but I have a question about the proper use of the
> :> noun "pointer".  Is it correct to say that the value stored in x is a
> :> "pointer"?  Or should one say that the value stored in x is a "pointer
> :> value"?  Similarly, is it appropriate to say that x is a "pointer"?
> :> Or should one say that x is a "pointer object"?
>
> : That sounds reasonable. However, conventional use is not very careful
> : about such distinctions, and rarely needs to be. The term "pointer" is
> : often used interchangeably to refer either to a value, or an object
> : containing such values. I think this is true even in the standard
> : itself. There's seldom any real confusion over the matter. When it
> : matters, the phrases "pointer value" and "pointer object" are quite
> : clear, even though one of them may be technically redundant. I'm a
> : notorious pedant, but some distinctions are too trivial to matter even
> : to me. :-)
>
> The distinction between an object and its value seems both basic and
> non-trivial.  As a teacher and as a reader of this newsgroup, I've
> observed a significant amount of confusion that appears to be caused
> by ambiguous use of the same term to designate both rvalues and lvalues
> of the same type.  For whatever it's worth, I'd recommend picking a
> single meaning for "pointer," and sticking to it.

If it is that important, I'd recommend that someone (I'm certainly not
sufficiently interested to do it myself) systematically review the
standard, to determine which of the two meanings would require the
fewest re-writes. I've found the term 'pointer' used in both senses in
different places in the standard. The term is so frequently used that
such a review would be an awful lot of work.

---

Author: Martin von Loewis <loewis@informatik.hu-berlin.de>
Date: Tue, 25 Sep 2001 17:35:59 GMT

joerg.barfurth@attglobal.net (Joerg Barfurth) writes:

> I personally would tend to say that a value is used, if it is used as
> rvalue in the evaluation of an expression. For lvalue expressions, this
> means that they are used if an lvalue to rvalue conversion is applied to
> them. But the lack of a clear definition really is somewhat troubling
> here.

Indeed. The problem gets more complicated by the fact that the
standard is not always clear which expressions require rvalues, and
which don't. It sometimes says that lvalue-to-rvalue conversions are
not performed, sometimes, when one might expect that it does, it
actually doesn't (e.g. pseudo destructor call).

Regards,
Martin

---

Author: Martin von Loewis <loewis@informatik.hu-berlin.de>
Date: Wed, 26 Sep 2001 10:42:36 GMT

"James Russell Kuyper Jr." <kuyper@wizard.net> writes:

> By reading the description of the operator. For instance, i+j calculates
> the sum of the value stored in i, and the value stored in j. It can't
> calculate that sum without using those values. i=j, on the other hand,
> takes the value stored in j, and stores a copy of it in i. The value
> stored in i plays no role in that operation.

That's a good suggestion in general. However, how do I know what the
implementation "can" and "can't" do? It is apparent that, in order to
compute the sum, the values of the operands must be known. In the case
of pseudo-destructor calls, it is not so clear what the implementation
"can" do.

> Again, I was thinking primarily of the general case, but I have to admit
> that for 'int', the explicit destructor call shouldn't actually need to
> use the indeterminate value stored in 'x', in order to call the
> non-existent destructor.

That brings me back to the original issue. How do I know whether a
value is "used"? You seem to say it depends on the type of the value,
whether a certain expression would use it or not.

> No, but only because there's only one element in the vector. If there
> were two elements, l.clear() could legally use the copy constructor to
> copy the value of one into the other's location, before erasing them
> both. It couldn't use the assignment operator, since clear() has a
> complexity requirement on the assignment operator. It also couldn't use
> a temporary extra element, because it would have to destroy the
> temporary, and clear() has complexity limit on calls to the destructor.

It's a vector of pointers, which don't have destructors. Does this
make a difference? (it should, because the expression foo.~T() becomes
a pseudo destructor call if T is not a class name)
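A minimal sketch of that point, assuming a container destroys each element through something like the hypothetical helper destroy_sketch below (the names destroy_sketch, Counted and destroy_demo are illustrative, not taken from any library). The same expression is a real destructor call when T is a class and a pseudo-destructor call when T is a pointer type.

```cpp
#include <new>

// Hypothetical helper, similar in spirit to what a container might do per element.
template <class T>
void destroy_sketch(T* p)
{
    p->~T();  // destructor call for class types, pseudo-destructor call otherwise
}

// Illustrative class type that counts how often its destructor runs.
struct Counted {
    static int destroyed;
    ~Counted() { ++destroyed; }
};
int Counted::destroyed = 0;

int destroy_demo()
{
    // Class type: the destructor body runs exactly once.
    alignas(Counted) unsigned char buf[sizeof(Counted)];
    Counted* c = new (buf) Counted;
    destroy_sketch(c);

    // Pointer type: T is int*, so p->~T() is a pseudo-destructor call.
    int value = 0;
    int* elem = &value;
    destroy_sketch(&elem);  // elem is deliberately not read again after this

    return Counted::destroyed;
}
```

Whether the pseudo-destructor call counts as a 'use' of the pointer value stored in elem is exactly the open question; the sketch only shows that one template expression covers both cases.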

> If std::vector<T>::clear() did copy a deallocated pointer, the behavior
> would be undefined. However, that undefined behavior is fundamentally
> attributable to the delete, not the container. l[0] is no longer
> CopyConstructible after the delete.

CopyConstructible is a requirement on a template argument (see 20.1,
20.1.3). It is a type that is either CopyConstructible or not, not a
specific value. So the statement 'l[0] is no longer CopyConstructible'
is wrong: being CopyConstructible is a static property.
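The static nature of the property can be sketched with the `<type_traits>` header, assuming a compiler that supports C++11 (a facility that postdates this discussion). The trait is evaluated on the type alone at compile time, so no run-time event such as a delete can change its answer. The function name below is hypothetical.

```cpp
#include <type_traits>

// Checked at compile time, on the type int* alone; no pointer value is involved.
static_assert(std::is_copy_constructible<int*>::value,
              "int* is a copy-constructible type");

// Hypothetical wrapper so the result can also be inspected at run time.
bool pointer_type_is_copy_constructible()
{
    int* p = new int(0);
    delete p;  // p now holds an invalid value, but the type of p is unchanged
    return std::is_copy_constructible<int*>::value;  // does not read p
}
```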

Regards,
Martin

---

Author: "James Russell Kuyper Jr." <kuyper@wizard.net>
Date: Wed, 26 Sep 2001 17:09:25 GMT

Martin von Loewis wrote:
>
> "James Russell Kuyper Jr." <kuyper@wizard.net> writes:
...
> > Again, I was thinking primarily of the general case, but I have to admit
> > that for 'int', the explicit destructor call shouldn't actually need to
> > use the indeterminate value stored in 'x', in order to call the
> > non-existent destructor.
>
> That brings me back to the original issue. How do I know whether a
> value is "used"? You seem to say it depends on the type of the value,
> whether a certain expression would use it or not.

Actually, I was trying to avoid saying that, because I'm not sure. Would
anyone else with greater expertise care to weigh in on this question?

...
> It's a vector of pointers, which don't have destructors. Does this
> make a difference? (it should, because the expression foo.~T() becomes
> a pseudo destructor call if T is not a class name)

Yes. That's why I mentioned the possibility of specialization for
built-in types.

> > If std::vector<T>::clear() did copy a deallocated pointer, the behavior
> > would be undefined. However, that undefined behavior is fundamentally
> > attributable to the delete, not the container. l[0] is no longer
> > CopyConstructible after the delete.
>
> CopyConstructible is a requirement on a template argument (see 20.1,
> 20.1.3). It is a type that is either CopyConstructible or not, not a
> specific value. So the statement 'l[0] is no longer CopyConstructible'
> is wrong: being CopyConstructible is a static property.

I was aware of that issue, but wanted to gloss it over. That's because I
see only two ways to acknowledge that issue:

1. No pointer type is necessarily CopyConstructible, because on some
implementations there are some pointer values that can't be safely
copied. This means that you can't put pointers in containers, at least
not when using such implementations.

2. CopyConstructible does not necessarily mean that the value can be
safely copied. As a result, when uncopyable pointer values are possible,
containers would have to be partially specialized for Container<T*>.
Within that partial specialization, containers would be able to take
advantage of CopyConstructible requirements only by using
implementation-specific methods of testing pointers for safety.

I don't like either option. The same issues apply to Assignable,
EqualityComparable, and LessThanComparable. Something must be done,
either by users or by implementors, to prevent standard templates from
using pointer values that have been rendered indeterminate by
deallocation. One solution that would put the burden on the users would
be to re-cast these as requirements on the particular objects placed in
a container, rather than as a requirement on the type used as a template
argument.
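One way the user-side burden might look in practice is sketched below, assuming the user owns the pointees and is willing to overwrite each element immediately after deleting it. The helper name delete_all_and_clear is hypothetical; this is one possible discipline, not something the standard prescribes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: deletes every pointee and overwrites each element with a
// null pointer before the container gets a chance to copy or compare it.
std::size_t delete_all_and_clear(std::vector<int*>& v)
{
    std::size_t deleted = 0;
    for (std::vector<int*>::iterator it = v.begin(); it != v.end(); ++it) {
        delete *it;  // reads the element while its value is still valid
        *it = 0;     // stores a new value; the invalidated value is not read
        ++deleted;
    }
    v.clear();       // the container now holds only null pointer values
    return deleted;
}
```

With this approach no container operation ever sees a pointer value that has been invalidated by deallocation, at the cost of one extra store per element.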

---

Author: thp@cs.ucr.edu
Date: Wed, 26 Sep 2001 21:17:36 GMT

Martin von Loewis <loewis@informatik.hu-berlin.de> wrote:
: joerg.barfurth@attglobal.net (Joerg Barfurth) writes:

:> I personally would tend to say that a value is used, if it is used as
:> rvalue in the evaluation of an expression. For lvalue expressions, this
:> means that they are used if an lvalue to rvalue conversion is applied to
:> them. But the lack of a clear definition really is somewhat troubling
:> here.

: Indeed. The problem gets more complicated by the fact that the
: standard is not always clear which expressions require rvalues, and
: which don't. It sometimes says that lvalue-to-rvalue conversions are
: not performed, sometimes, when one might expect that it does, it
: actually doesn't (e.g. pseudo destructor call).

I presume that the standard *defines* the behavior of the
pseudo-destructor call.  The question is whether it also invokes
undefined behavior by constituting a "use" of the pointer.  I vaguely
recall someone, either in this group or comp.std.c, quoting some
policy to the effect that in such cases the interpretation that yields
defined behavior is the proper one.  Does anyone else recall or know
of such a thing?  (My memory is very vague on this point.)

Tom Payne

---

Author: thp@cs.ucr.edu
Date: Sat, 22 Sep 2001 15:19:18 GMT

James Russell Kuyper Jr. <kuyper@wizard.net> wrote:
: Martin von Loewis wrote:
:>
:> [Moderator's note: Forwarded to C++ Committee. -sdc ]
:>
:> 3.7.3.2, [basic.stc.dynamic.deallocation]/4, mentions that the effect
:> of using an invalid pointer value is undefined. However, the standard
:> never says what it means to 'use' a value.
:>
:> There are a number of possible interpretations, but it appears that
:> each of them leads to undesired conclusions:
:>
:> a) A value is 'used' in a program if a variable holding this value
:>    appears in an expression that is evaluated.

: A variable stores a value; it isn't the same thing as the value. That is
: a use of the variable, but not necessarily a use of the value stored in
: it. The value is used if it plays a part in determining the value of the
: expression.

:>    This interpretation would render the sequence
:>
:>    int *x = new int(0);
:>    delete x;
:>    x = 0;
:>
:>    into undefined behaviour. As this is a common idiom, this is
:>    clearly undesirable.

: The variable 'x' is used, but only as a place to store a new value; the
: value stored in 'x' is not used after the 'delete' which renders it
: indeterminate.

I completely agree, but I have a question about the proper use of the
noun "pointer".  Is it correct to say that the value stored in x is a
"pointer"?  Or should one say that the value stored in x is a "pointer
value"?  Similarly, is it appropriate to say that x is a "pointer"?
Or should one say that x is a "pointer object"?

It might be worth noting that both K&R and Stroustrup say (in effect)
that pointers are address-valued *objects*.

Tom Payne

---

Author: Martin von Loewis <loewis@informatik.hu-berlin.de>
Date: Mon, 24 Sep 2001 15:27:48 GMT

Francis Glassborow <francis.glassborow@ntlworld.com> writes:

> In article <j47kutd6iy.fsf@informatik.hu-berlin.de>, Martin von Loewis
> <loewis@informatik.hu-berlin.de> writes
> >b) A value is 'used' if an expression evaluates to that value.
> >   This would render the sequence
> >
> >   int *x = new int(0);
> >   delete x;
> >   x->~int();
>
> Please explain why you think this does not already always exhibit
> undefined behaviour.

As Mr. Kuyper points out, it might, on the basis of x being destroyed
twice. However, since the type of *x does not have a destructor, the
delete expression won't invoke one (5.3.5/6). Why would you think this
is already undefined? How about

int *x = new int(0);
operator delete(x);
x->~int();

instead? That does not 'already' exhibit undefined behaviour. The
Defect now is that the standard doesn't tell me whether this fragment
shows undefined behaviour, since I cannot tell whether x is 'used' in
the pseudo-destructor call.

Regards,
Martin

---

Author: "James Russell Kuyper Jr." <kuyper@wizard.net>
Date: Mon, 24 Sep 2001 16:53:52 GMT

thp@cs.ucr.edu wrote:
>
> James Russell Kuyper Jr. <kuyper@wizard.net> wrote:
...
> : The variable 'x' is used, but only as a place to store a new value; the
> : value stored in 'x' is not used after the 'delete' which renders it
> : indeterminate.
>
> I completely agree, but I have a question about the proper use of the
> noun "pointer".  Is it correct to say that the value stored in x is a
> "pointer"?  Or should one say that the value stored in x is a "pointer
> value"?  Similarly, is it appropriate to say that x is a "pointer"?
> Or should one say that x is a "pointer object"?

That sounds reasonable. However, conventional use is not very careful
about such distinctions, and rarely needs to be. The term "pointer" is
often used interchangeably to refer either to a value, or an object
containing such values. I think this is true even in the standard
itself. There's seldom any real confusion over the matter. When it
matters, the phrases "pointer value" and "pointer object" are quite
clear, even though one of them may be technically redundant. I'm a
notorious pedant, but some distinctions are too trivial to matter even
to me. :-)

---

Author: joerg.barfurth@attglobal.net (Joerg Barfurth)
Date: Mon, 24 Sep 2001 17:07:48 GMT

Francis Glassborow <francis.glassborow@ntlworld.com> wrote:

> In article <j47kutd6iy.fsf@informatik.hu-berlin.de>, Martin von Loewis
> <loewis@informatik.hu-berlin.de> writes
> >b) A value is 'used' if an expression evaluates to that value.
> >   This would render the sequence
> >
> >   int *x = new int(0);
> >   delete x;
> >   x->~int();
>
> Please explain why you think this does not already always exhibit
> undefined behaviour.

From the accompanying text, it seems that the OP intended to invoke the
pseudo-destructor on the pointer rather than the already deleted int. Of
course this would then look like:

 typedef int * pint;
 pint x = new int(0);
 delete x;
 x.~pint();

If the last line here 'uses the value of x' and thus produces undefined
behavior, that would indeed be a problem for common containers of
pointers.

I personally would tend to say that a value is used, if it is used as
rvalue in the evaluation of an expression. For lvalue expressions, this
means that they are used if an lvalue to rvalue conversion is applied to
them. But the lack of a clear definition really is somewhat troubling
here.
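A rough sketch of what that reading would allow after a delete, assuming that only an lvalue-to-rvalue conversion on x counts as a use. The function name is hypothetical, and the classification in the comments follows this proposed reading rather than any wording in the standard.

```cpp
#include <cstddef>

// Hypothetical demonstration: after the delete, x appears only in expressions
// that do not apply an lvalue-to-rvalue conversion to it.
bool no_value_read_after_delete()
{
    int* x = new int(0);
    delete x;                      // the value stored in x is now invalid

    int** where = &x;              // address of the object x; its value is not read
    std::size_t bytes = sizeof x;  // unevaluated operand; its value is not read
    x = 0;                         // x is the target of the assignment only

    // From here on x holds a null pointer value, so reading it is fine.
    return where == &x && bytes == sizeof(int*) && x == 0;
}
```

Something like `int* y = x;` placed directly after the delete would, under the same reading, be a use, since initializing y requires the value of x.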

Regards, Joerg
-- 
Jörg Barfurth                         joerg.barfurth@attglobal.net
<<<<<<<<<<<<< using std::disclaimer;  <<<<<<<<<<<<<<<<<<<<<<<<<<<<
Software Developer                    http://www.OpenOffice.org
StarOffice Configuration              http://www.sun.com/staroffice

---

Author: Martin von Loewis <loewis@informatik.hu-berlin.de>
Date: Mon, 24 Sep 2001 17:21:25 GMT

"James Russell Kuyper Jr." <kuyper@wizard.net> writes:

> > a) A value is 'used' in a program if a variable holding this value
> >    appears in an expression that is evaluated.
>
> A variable stores a value; it isn't the same thing as the value. That is
> a use of the variable, but not necessarily a use of the value stored in
> it. The value is used if it plays a part in determining the value of the
> expression.

How do I find out whether a value plays part in determining the value
of the expression?

> > b) A value is 'used' if an expression evaluates to that value.
> >    This would render the sequence
> >
> >    int *x = new int(0);
> >    delete x;
> >    x->~int();
> >
> >    into undefined behaviour; according to 5.2.4, the variable x is
> >    'evaluated' as part of evaluating the pseudo destructor call. This,
>
> Yes, that is undefined behavior, for two separate reasons. In addition
> to the reason you've given, that code calls the destructor twice on the
> same object.

Actually, it doesn't, since int has no destructor (5.3.5/6 says that
the destructor is called only if the object has any). If I rewrite
this program as

int *x = new int(0);
operator delete(x);
x->~int();

then this objection goes away, right? Is then the program still with
undefined behaviour?

> However, if you have a user-defined type with a non-trivial virtual
> destructor, rather than 'int', it's quite likely that this will go
> wrong.

Certainly. However, I picked 'int' on purpose.

>
> >    in turn, would mean that all containers (23) of pointers show
> >    undefined behaviour, e.g. 23.2.2.3 requires to invoke the
> >    destructor as part of the clear() method of the container.
>
> The clear() method would not require any code comparable to the example
> you've given.

If the application deletes the elements of the container before
clearing it, then this code sequence would occur. Consider

#include <vector>
int main(){
  std::vector<int*> l;
  l.push_back(new int(0));
  delete l[0];
  l.clear();
}

Does that program show undefined behaviour?

> A container of pointers does not delete the pointed-at objects when the
> pointers are destroyed. Therefore, the values of the pointers stored in
> the container are not themselves rendered indeterminate; The destructor
> is called only for the pointers themselves. This is, of course, a
> potential memory leak if those were the only copies you had of pointers
> to previously allocated memory.

You are right; I was thinking about the case where the application
deleted the elements of the container before clearing it.

Regards,
Martin

---

Author: thp@cs.ucr.edu
Date: Tue, 25 Sep 2001 11:30:06 GMT

James Russell Kuyper Jr. <kuyper@wizard.net> wrote:
: thp@cs.ucr.edu wrote:
:>
:> James Russell Kuyper Jr. <kuyper@wizard.net> wrote:
: ...
:> : The variable 'x' is used, but only as a place to store a new value; the
:> : value stored in 'x' is not used after the 'delete' which renders it
:> : indeterminate.
:>
:> I completely agree, but I have a question about the proper use of the
:> noun "pointer".  Is it correct to say that the value stored in x is a
:> "pointer"?  Or should one say that the value stored in x is a "pointer
:> value"?  Similarly, is it appropriate to say that x is a "pointer"?
:> Or should one say that x is a "pointer object"?

: That sounds reasonable. However, conventional use is not very careful
: about such distinctions, and rarely needs to be. The term "pointer" is
: often used interchangeably to refer either to a value, or an object
: containing such values. I think this is true even in the standard
: itself. There's seldom any real confusion over the matter. When it
: matters, the phrases "pointer value" and "pointer object" are quite
: clear, even though one of them may be technically redundant. I'm a
: notorious pedant, but some distinctions are too trivial to matter even
: to me. :-)

The distinction between an object and its value seems both basic and
non-trivial.  As a teacher and as a reader of this newsgroup, I've
observed a significant amount of confusion that appears to be caused
by ambiguous use of the same term to designate both rvalues and lvalues
of the same type.  For whatever it's worth, I'd recommend picking a
single meaning for "pointer," and sticking to it.

Tom Payne

---

Author: Martin von Loewis <loewis@informatik.hu-berlin.de>
Date: Sun, 30 Sep 2001 21:37:39 GMT

"James Russell Kuyper Jr." <kuyper@wizard.net> writes:

> I don't like either option. The same issues apply to Assignable,
> EqualityComparable, and LessThanComparable. Something must be done,
> either by users or by implementors, to prevent standard templates from
> using pointer values that have been rendered indeterminate by
> deallocation.

Thanks for this comment :-) My preferred solution to this problem, and
to the issue I submitted, would be to outlaw implementations that
break when accessing pointer values which have been deleted,
i.e. making 'use' of 'invalid pointer values' well-defined.  The
usage restrictions on such a pointer would be the same as in 3.8/5.

The rationale for the restriction is given in a footnote. To me, this
is an indication that the committee agrees that this restriction is
debatable (many other restrictions are given without rationale, as
they are obviously desirable).

I don't know what architectures the committee had in mind when
introducing this restriction. People bring up x86 as an example, where
a segment register load causes a failure if the segment descriptor has
been invalidated. This argument is incorrect, IMO: On this
architecture, an implementation could easily avoid loading pointer
values into segment registers unless it wants to perform one of the
operations in 3.8/5. E.g. to simply copy a pointer value, plain
registers (EDX:EAX) could be used, instead of loading segment
registers.

Regards,
Martin

---

Author: "James Russell Kuyper Jr." <kuyper@wizard.net>
Date: Mon, 1 Oct 2001 20:54:01 GMT

Martin von Loewis wrote:
>
> "James Russell Kuyper Jr." <kuyper@wizard.net> writes:
>
> > I don't like either option. The same issues apply to Assignable,
> > EqualityComparable, and LessThanComparable. Something must be done,
> > either by users or by implementors, to prevent standard templates from
> > using pointer values that have been rendered indeterminate by
> > deallocation.
>
> Thanks for this comment :-) My preferred solution to this problem, and
> to the issue I submitted, would be to outlaw implementations that
> break when accessing pointer values which have been deleted,
> i.e. making 'use' of 'invalid pointer values' well-defined.  The
> usage restrictions on such a pointer would be the same as in 3.8/5.

The restricted uses permitted by 3.8p5 apply only between allocation and
the end of construction, or between the beginning of destruction and
deallocation of the memory. A pointer to an object which has been
deleted is already past that time, so 3.8p5 doesn't help with this
issue. A pointer to an object which has merely been destroyed, and not
deleted, is perfectly safe as long as it's not dereferenced.

There's a good reason why 'use' of 'invalid pointer values' is undefined
behavior; there's many real and popular platforms where such use causes
a program to abort; if we require such use to be safe, we prohibit
implementation of C++ on such platforms.

> The rationale for the restriction is given in a footnote. To me, this
> is an indication that the committee agrees that this restriction is
> debatable (many other restrictions are given without rationale, as
> they are obviously desirable).

More accurately, I'd say that the footnote is intended to help the large
number of people who are unaware of the problem that this rule copes
with. Without an understanding of why this rule exists, it can be very
difficult to understand when this particular rule applies, and when it
doesn't.

> I don't know what architectures the committee had in mind when
> introducing this restriction. People bring up x86 as an example, where
> a segment register load causes a failure if the segment descriptor has
> been invalidated. This argument is incorrect, IMO: On this
> architecture, an implementation could easily avoid loading pointer
> values into segment registers unless it wants to perform one of the
> operations in 3.8/5. E.g. to simply copy a pointer value, plain
> registers (EDX:EAX) could be used, instead of loading segment
> registers.

Yes, but that would be a mistake. The behavior of the address registers
is a safety feature, and a fairly well-motivated one. It would be a bad
idea to bypass it, or at least to mandate bypassing it. The standard
shouldn't mandate high-safety implementations, but it shouldn't prohibit
them, either. A program that copies an invalid pointer value is
generally a program that is unaware of the fact that the pointer is
invalid; the only good reason for bothering to copy something is if you
intend to use it, and there's no good use for an invalid pointer. This
in turn implies that, somewhere along the line, the program is likely to
try to dereference the invalid pointer. It's best if such programs are
guaranteed to die as early and reliably as possible - that makes
debugging them a lot easier. At least, that's one reasonable approach,
and the standard shouldn't prohibit such an approach.

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]

Author: Martin von Loewis <loewis@informatik.hu-berlin.de>
Date: Tue, 2 Oct 2001 17:26:12 GMT Raw View

"James Russell Kuyper Jr." <kuyper@wizard.net> writes:

> There's a good reason why 'use' of 'invalid pointer values' is undefined
> behavior; there's many real and popular platforms where such use causes
> a program to abort; if we require such use to be safe, we prohibit
> implementation of C++ on such platforms.

Could you give a few examples? (if there are many of them, ten should
be sufficient :-). Seriously, I'd like to learn about some of these
systems, to see the need for this restriction.

> Yes, but that would be a mistake. The behavior of the address registers
> is a safety feature, and a fairly well-motivated one. It would be a bad
> idea to bypass it, or at least to mandate bypassing it.

I don't see how adding undefined behaviour adds safety, when defining
the behaviour well would be straight-forward. Furthermore, I cannot
see how an implementation that fails when a certain value is loaded
provides safety by doing so. It just means that the implementation
might 'randomly' fail, if a certain execution pattern occurs.

An implementation failing on access to deleted memory would indeed add
safety. An implementation that fails when loading a pointer to deleted
memory doesn't, IMO.

> The standard shouldn't mandate high-safety implementations, but it
> shouldn't prohibit them, either. A program that copies an invalid
> pointer value is generally a program that is unaware of the fact
> that the pointer is invalid; the only good reason for bothering to
> copy something is if you intend to use it, and there's no good use
> for an invalid pointer.

There is at least one useful application: printing the pointer, to
compare the printed value with earlier values. Comparing a pointer to
deallocated memory may also occur programmatically, e.g. when it is
used as the key in a std::map.
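
One way to keep that kind of bookkeeping without ever reading the pointer after deallocation is to capture an integer image of it while it is still valid. This is only a sketch under assumptions: it relies on std::uintptr_t from <cstdint> (C++11, and optional even there), and the function name and label text are made up for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Looks up a label after the object has been deleted, using a key that
// was computed while the pointer was still valid.
std::string label_after_delete() {
    std::map<std::uintptr_t, std::string> labels;

    int* p = new int(0);
    // Converted while p is valid; the integer carries no pointer hazards.
    const std::uintptr_t key = reinterpret_cast<std::uintptr_t>(p);
    labels[key] = "first allocation";

    delete p;  // p is invalid from here on and is not read again

    return labels.at(key);  // only the integer key is used
}
```

The same integer could be printed for comparison with earlier output.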

Regards,
Martin


Author: "James Russell Kuyper Jr." <kuyper@wizard.net>
Date: Wed, 3 Oct 2001 00:29:16 GMT Raw View

Martin von Loewis wrote:
>
> "James Russell Kuyper Jr." <kuyper@wizard.net> writes:
>
> > There's a good reason why 'use' of 'invalid pointer values' is undefined
> > behavior; there's many real and popular platforms where such use causes
> > a program to abort; if we require such use to be safe, we prohibit
> > implementation of C++ on such platforms.
>
> Could you give a few examples? (if there are many of them, ten should

You've already mentioned the most widely used family of such machines:
the Intel 80386 and above. I've never programmed Intel machines that
advanced, so I can't speak of them from personal experience. I switched
to programming for various Unix platforms at roughly the same time that
protected mode was added to Intel chips, and I've never had to go back.
That doesn't matter - my programs use invalid pointers only by accident,
so I wouldn't be very familiar with the failure mode even if I was using
one.

You dismissed those machines as an example, but I've already expressed
my disagreement with your reasons for doing so.

...
> > Yes, but that would be a mistake. The behavior of the address registers
> > is a safety feature, and a fairly well-motivated one. It would be a bad
> > idea to bypass it, or at least to mandate bypassing it.
>
> I don't see how adding undefined behaviour adds safety, when defining
> the behaviour well would be straight-forward. Furthermore, I cannot
> see how an implementation that fails when a certain value is loaded
> provides safety by doing so. It just means that the implementation
> might 'randomly' fail, if a certain execution pattern occurs.

Paradoxically, the reason why it's safer is that it makes the failure
MORE likely to occur. The point is that a program will fail quicker,
which is safer for precisely the same reason that a plane which fails to
take off is safer than a plane which fails while in the air. It
increases the likelihood that problems will be detected before delivery;
maybe even before baselining.

This is one definition of "safety". There are legitimate, conflicting
definitions of safety. No one of those definitions is the right one for
all applications. I don't think the standard should outlaw
implementations that are safer in this sense of the term. That's what
you'd do if you required copy assignment or copy construction of
pointers to be safe.

Note: bytewise copies are still perfectly safe; they don't use the value
of the pointer, but only the values of the individual bytes, interpreted
as unsigned char. Unsigned char doesn't have dangerous values.
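
A minimal sketch of such a bytewise copy follows; the function name is hypothetical. The pointer object p is read only through std::memcpy and std::memcmp, which treat it as an array of unsigned char, so the invalid pointer value itself is never used:

```cpp
#include <cassert>
#include <cstring>

// Copies the object representation of a deleted pointer and checks that
// the copy matches, without ever reading p as a pointer.
bool bytewise_copy_of_deleted_pointer() {
    int* p = new int(0);
    delete p;  // p now holds an invalid pointer value

    unsigned char image[sizeof(int*)];
    std::memcpy(image, &p, sizeof p);  // reads the bytes of p, not its value

    // Comparing the two byte sequences is likewise a bytewise operation.
    return std::memcmp(image, &p, sizeof p) == 0;
}
```

By contrast, writing `int* q = p;` after the delete would be a use of the pointer value in the sense discussed here.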


Author: thp@cs.ucr.edu
Date: Wed, 3 Oct 2001 10:18:13 GMT Raw View

James Russell Kuyper Jr. <kuyper@wizard.net> wrote:
[...]
: Paradoxically, the reason why it's safer is that it makes the failure
: MORE likely to occur. The point is that a program will fail quicker,
: which is safer for precisely the same reason that a plane which fails to
: take off is safer than a plane which fails while in the air. It
: increases the likelihood that problems will be detected before delivery;
: maybe even before baselining.

Thanks for making that very important point.  One of my pet peeves has
been the doctrine of "defensive programming", which holds that you
should code so that your programs muddle along in spite of bugs.  The
advice goes, "Don't make your program a sucker for bugs."

My favorite debugging tool, the assert statement, is designed to do
exactly the opposite, i.e., abort the program as soon as a bug is
detected.  IMHO, the earlier a bug is detected, the easier it is to
track down and the less damage it does.

Yes, if I'm on a life-support system, I'd like the last-to-crash of
the system's redundant control programs to have been coded defensively.

My apologies for the off-topic journey to the soapbox.

Tom Payne







Author: Martin von Loewis <loewis@informatik.hu-berlin.de>
Date: 20 Sep 2001 20:48:00 GMT Raw View

[Moderator's note: Forwarded to C++ Committee. -sdc ]

3.7.3.2, [basic.stc.dynamic.deallocation]/4, mentions that the effect
of using an invalid pointer value is undefined. However, the standard
never says what it means to 'use' a value.

There are a number of possible interpretations, but it appears that
each of them leads to undesired conclusions:

a) A value is 'used' in a program if a variable holding this value
   appears in an expression that is evaluated.
   This interpretation would render the sequence

   int *x = new int(0);
   delete x;
   x = 0;

   into undefined behaviour. As this is a common idiom, this is
   clearly undesirable.

b) A value is 'used' if an expression evaluates to that value.
   This would render the sequence

   int *x = new int(0);
   delete x;
   x->~int();

   into undefined behaviour; according to 5.2.4, the variable x is
   'evaluated' as part of evaluating the pseudo destructor call. This,
   in turn, would mean that all containers (23) of pointers show
   undefined behaviour, e.g. 23.2.2.3 requires to invoke the
   destructor as part of the clear() method of the container.

If any other meaning was intended for 'using an expression', that
meaning should be stated explicitly.

Regards,
Martin




Author: "James Russell Kuyper Jr." <kuyper@wizard.net>
Date: Thu, 20 Sep 2001 20:14:17 CST Raw View

Martin von Loewis wrote:
>
> [Moderator's note: Forwarded to C++ Committee. -sdc ]
>
> 3.7.3.2, [basic.stc.dynamic.deallocation]/4, mentions that the effect
> of using an invalid pointer value is undefined. However, the standard
> never says what it means to 'use' a value.
>
> There are a number of possible interpretations, but it appears that
> each of them leads to undesired conclusions:
>
> a) A value is 'used' in a program if a variable holding this value
>    appears in an expression that is evaluated.

A variable stores a value; it isn't the same thing as the value. That is
a use of the variable, but not necessarily a use of the value stored in
it. The value is used if it plays a part in determining the value of the
expression.

>    This interpretation would render the sequence
>
>    int *x = new int(0);
>    delete x;
>    x = 0;
>
>    into undefined behaviour. As this is a common idiom, this is
>    clearly undesirable.

The variable 'x' is used, but only as a place to store a new value; the
value stored in 'x' is not used after the 'delete' which renders it
indeterminate.
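
A small sketch of the idiom, with a hypothetical function name, marks where the variable is written and where a value is actually read:

```cpp
#include <cassert>

// Returns true if x is null after the delete-then-reset idiom.
bool delete_then_reset() {
    int* x = new int(0);
    delete x;  // the value held in x is no longer valid
    // An 'int* y = x;' at this point would read the invalid value.
    x = 0;     // writes to x; the old value plays no part in the assignment
    return x == 0;  // reads the new, valid null pointer value
}
```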

> b) A value is 'used' if an expression evaluates to that value.
>    This would render the sequence
>
>    int *x = new int(0);
>    delete x;
>    x->~int();
>
>    into undefined behaviour; according to 5.2.4, the variable x is
>    'evaluated' as part of evaluating the pseudo destructor call. This,

Yes, that is undefined behavior, for two separate reasons. In addition
to the reason you've given, that code calls the destructor twice on the
same object. However, since the destructor doesn't actually do anything,
and the value of the pointer isn't actually needed to call the
destructor, there's little that's actually likely to go wrong in this
case. However, if you have a user-defined type with a non-trivial
virtual destructor, rather than 'int', it's quite likely that this will
go wrong.

>    in turn, would mean that all containers (23) of pointers show
>    undefined behaviour, e.g. 23.2.2.3 requires to invoke the
>    destructor as part of the clear() method of the container.

The clear() method would not require any code comparable to the example
you've given. It's defined in terms of erasing every element in the
container, and the typical sequence for erasing a single element would
be

 get_allocator().destroy(p);
 get_allocator().deallocate(p,1);

where p is an Allocator::pointer to one of the pointers stored in
the container. This is equivalent, for containers using the default
allocator std::allocator, to

 ((T*)p)->~T();
 ::operator delete(p);

That makes only one call to the destructor, and makes no use of p after
its value has been rendered indeterminate by the call to ::operator
delete().
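
The destroy-then-deallocate sequence can be sketched with std::allocator_traits. That interface is C++11, so it postdates this discussion and is an assumption of the example, as are the type Tracked and the function name. The counter is there only to make the single destructor call observable:

```cpp
#include <cassert>
#include <memory>

// Hypothetical element type that counts its destructor calls.
struct Tracked {
    static int destroyed;
    ~Tracked() { ++destroyed; }
};
int Tracked::destroyed = 0;

// Allocates, constructs, destroys and deallocates one element, and
// returns how many destructor calls that sequence made.
int erase_one_element() {
    typedef std::allocator<Tracked> Alloc;
    typedef std::allocator_traits<Alloc> Traits;

    Tracked::destroyed = 0;
    Alloc alloc;
    Tracked* p = Traits::allocate(alloc, 1);  // raw storage for one element
    Traits::construct(alloc, p);              // element lifetime begins

    Traits::destroy(alloc, p);        // one destructor call, p still valid
    Traits::deallocate(alloc, p, 1);  // p becomes invalid; not used afterwards

    return Tracked::destroyed;
}
```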

A container of pointers does not delete the pointed-at objects when the
pointers are destroyed. Therefore, the values of the pointers stored in
the container are not themselves rendered indeterminate; the destructor
is called only for the pointers themselves. This is, of course, a
potential memory leak if those were the only copies you had of pointers
to previously allocated memory.
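
A short sketch of that behavior, with a hypothetical function name: clear() destroys the stored int* elements, while the int they pointed at stays alive until it is deleted explicitly.

```cpp
#include <cassert>
#include <vector>

// Returns the value of the pointed-at int after the container that held
// a pointer to it has been cleared.
int pointee_survives_clear() {
    int* owned = new int(7);

    std::vector<int*> v;
    v.push_back(owned);
    v.clear();  // destroys the stored int* elements only

    int result = *owned;  // still valid: the int itself was never deleted
    delete owned;         // without this line the int would leak
    return result;
}
```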

There is a problem if you try to store pointers to deallocated objects
in a container; or to deallocate an object pointed at by a pointer
stored in a container. Such pointers are not CopyConstructible - copying
their value allows undefined behavior. Any container is free to copy any
contained object at any time that doing so would not be prohibited by
complexity requirements. There's no obvious reason for
std::list<T>::clear() to copy the contained objects, so you're probably
safe with regard to 23.2.2.3. However, for other containers it can be
dangerous; technically, it's even dangerous for std::list<T>::clear().
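
One defensive ordering, offered only as a sketch, is to remove the element from the container while the pointer is still valid and to delete the object afterwards. That way the container never holds an invalid value it might copy or compare. The function name and the choice of std::set are assumptions made for the example:

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Removes a pointer from the container before deleting its object and
// returns how many elements were left at that point.
std::size_t erase_then_delete() {
    std::set<int*> live;
    int* a = new int(1);
    int* b = new int(2);
    live.insert(a);
    live.insert(b);

    live.erase(a);  // a is still a valid pointer value during the lookup
    delete a;       // no container element now holds the invalid value

    std::size_t remaining = live.size();

    live.erase(b);  // same ordering for the second object
    delete b;
    return remaining;
}
```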


Author: Francis Glassborow <francis.glassborow@ntlworld.com>
Date: Fri, 21 Sep 2001 22:47:15 GMT Raw View

In article <j47kutd6iy.fsf@informatik.hu-berlin.de>, Martin von Loewis
<loewis@informatik.hu-berlin.de> writes
>b) A value is 'used' if an expression evaluates to that value.
>   This would render the sequence
>
>   int *x = new int(0);
>   delete x;
>   x->~int();

Please explain why you think this does not already always exhibit
undefined behaviour.



Francis Glassborow
I offer my sympathy and prayers to all those who are suffering
as a result of the events of September 11 2001.
