Thread

Topic: adding "move" to c++ - reference lifetime issues

Author: AlbertoBarbati@libero.it (Alberto Barbati)
Date: Sun, 17 Oct 2004 06:44:13 GMT

John Nagle wrote:
>
> A function that returns a reference to one of its own arguments
> is doing a "keep".  In the example above, "foo" is doing a
> "keep", and would have to be written
>
>     T& foo(keep T& x){ return rand()&1 ? t : x; }
>
> You can't pass a temporary to a "keep", so "bar" would
> be in error.

You don't need "keep" to say "don't allow a temporary here", as
temporaries won't bind to non-const references. Therefore using "T&"
would suffice. Moreover function bar() does not pass a temporary but a
local variable. Maybe you meant "can't pass a local variable here"? That
seems a bit too much to me, as IMHO this use of foo() is perfectly
reasonable:

  void bar()
  {
    T y;
    std::cout << foo(y);
  }

(provided there's a suitable overload of operator<<, of course).

The problem, as has been already said, is that it's difficult to
describe object lifetime. Even with "keep" there's no way to tell the
compiler "how long" the reference will be kept. So the compiler must
assume that the reference is kept indefinitely and that may seriously
limit the usefulness of the concept (see more on this below).

On a side note, I just realized that there's no way to avoid binding a
temporary to a const reference, so maybe "keep const T&" could indeed be
an interesting idiom.
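
For comparison, later revisions of the language (C++11 onwards) can express
this particular restriction without a new keyword, by deleting the overload
that takes an rvalue reference. The following is only a sketch of that
idea: T, keep_ref and can_keep are names invented for the illustration, and
this is not the "keep" semantics proposed in this thread.

```cpp
#include <type_traits>
#include <utility>

struct T { int value; };

// Accepts named objects (lvalues) by const reference.
inline const T* keep_ref(const T& x) { return &x; }

// Deleted overload: overload resolution prefers it for temporaries, so a
// call such as keep_ref(T{1}) fails to compile instead of silently binding.
const T* keep_ref(const T&&) = delete;

// Compile-time probe: true if keep_ref(Arg) is a valid call expression.
template <typename Arg, typename = void>
struct can_keep : std::false_type {};

template <typename Arg>
struct can_keep<Arg, decltype(void(keep_ref(std::declval<Arg>())))>
    : std::true_type {};

static_assert(can_keep<T&>::value, "a named object is accepted");
static_assert(can_keep<const T&>::value, "a const named object is accepted");
static_assert(!can_keep<T>::value, "a temporary is rejected");
```

Note that this only rejects temporaries; a local variable is still accepted,
so it does not say anything about how long the reference is kept.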

>    Other situations that require a "keep" include
> things like this
>
>     void f1(keep T& x)
>     { static T* lastx = &x;
>     }

This is a much more interesting use-case. In this case the reference is
indeed kept indefinitely. However, I can think of uses where a local
variable can reasonably be passed to this function:

   int main()
   {
     for(T y; /*condition*/; y = next(y))
       f1(y);
     return 0;
   }

Or any other use case where function f1() is never called again after
leaving the function that owns the local variable.

> Of course, it means retrofitting code for "keep correctness",
> much as we once had to retrofit for "const correctness".
> But "keeping" is relatively rare, compared to "constness".  With
> the notable exception of iostreams, very few functions
> in the standard libraries "keep" their arguments.

Talking about iostreams, if "keep" was used with them then a lot of
reasonable uses of string streams would become illegal. For example:

   std::string format_msg(int errcode)
   {
     std::ostringstream msg;
     msg << "Error: code is " << errcode;
     return msg.str();
   }
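
The reason iostreams would count as "keeping" under the proposed rule is
that every operator<< hands back a reference to the very stream it was
given, which is what makes chaining possible. A small sketch of that pattern
with a user-defined inserter; Point and describe are names made up for the
example:

```cpp
#include <ostream>
#include <sstream>
#include <string>

struct Point { int x; int y; };

// Returns a reference to its own stream argument, as stream inserters do.
std::ostream& operator<<(std::ostream& os, const Point& p)
{
    return os << '(' << p.x << ", " << p.y << ')';
}

// The stream is a local object, yet passing it to operator<< is harmless:
// the returned reference is used only while the stream is still alive.
std::string describe(const Point& p)
{
    std::ostringstream msg;
    msg << "Point is " << p;
    return msg.str();
}
```

A rule that forbids passing locals to any function returning one of its
reference arguments would reject describe() as well.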

> And, as I pointed out, smart pointers don't have to update
> reference counts for non-"keep" variables.  How to implement
> that is another subject.  But this is a way to reduce
> reference counting overhead substantially.

Smart-pointers that need to have their refcounts updated are usually
passed by value, not by reference. If you don't need to update refcounts,
you pass them by reference instead. So there's no need to add a "keep"
concept for smart-pointers, we already have it.
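
A quick way to see the difference is to look at the use count inside the
callee. The sketch below assumes std::shared_ptr, which was only
standardized later (C++11); boost::shared_ptr is expected to behave the same
way. The two function names are invented for the illustration.

```cpp
#include <memory>

// By value: the parameter is a new owner, so the count is bumped for the
// duration of the call.
inline long count_by_value(std::shared_ptr<int> p) { return p.use_count(); }

// By const reference: no new owner is created, so the count is untouched.
inline long count_by_ref(const std::shared_ptr<int>& p)
{
    return p.use_count();
}
```

With a single owner in the caller, count_by_ref() reports 1 and
count_by_value() reports 2; after both calls the count is back to 1.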

> It's not fully backwards compatible, unfortunately.  But it's
> a step in the right direction.

If such a thing is introduced, it's going to be big. So a "step in the
right direction" argument is not going to be acceptable. Either it's the
Right Thing(tm) or we should not attempt it at all. That's my personal
opinion.

There are a few holes in this idea that prevent me from agreeing with you
about its usefulness. If we agree that a "keep T&" is kept indefinitely,
the question becomes "what could be bound to a keep T&?". We already
excluded temporaries and local variables. So there remain only static
variables and dynamically allocated objects. To have "keep" work for
dynamically allocated objects, "keep" would need to be a qualifier like
const and volatile and should be available for pointers also, and
operator new should return a "keep T*" instead of a "T*". Unlike regular
cv-qualifiers, a "keep T*" could be implicitly converted to a "T*" but
not the opposite. If you don't have this, the "keep" could be easily
circumvented, for example:

   T& foo(keep T* t) { return *t; }
   T& bar(T* t)      { return foo(t); }
   T& baz()          { T y; return bar(&y); }

Even with this machinery in place, you cannot prevent delete from causing havoc:

   T& bar()
   {
     keep T* t = new T;
     T& r = foo(t);
     delete t;
     return r;
   }

So "keeping" dynamically allocated objects is not completely safe.

That said, "keep" has clear and non-error-prone semantics only when
used with static variables. I'm not so sure that such a feature would be
worth the effort...

Alberto

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]

Author: nagle@animats.com (John Nagle)
Date: Tue, 19 Oct 2004 06:28:27 GMT

Alberto Barbati wrote:

> John Nagle wrote:

> The problem, as has been already said, is that it's difficult to
> describe object lifetime. Even with "keep" there's no way to tell the
> compiler "how long" the reference will be kept. So the compiler must
> assume that the reference is kept indefinitely and that may seriously
> limit the usefulness of the concept (see more on this below).

     Good point. The options are 1) give up, and accept the bugs, and
2) figure out better ways to tell the language what you're doing.
Let's explore 2) some more here.

> On a side note, I just realized that there's no way to avoid binding a
> temporary to a const reference, so maybe "keep const T&" could indeed be
> an interesting idiom.
>
>>    Other situations that require a "keep" include
>> things like this
>>
>>     void f1(keep T& x)
>>     { static T* lastx = &x;
>>     }
>
> This is a much more interesting use-case. In this case the reference is
> indeed kept indefinitely.

    Yes, that's the true "keep" case.  That's the one that bites
programmers, because something they didn't think kept a reference
or pointer actually does.  This leads to subtle bugs.


> Talking about iostreams, if "keep" was used with them then a lot of
> reasonable uses of string streams would become illegal. For example:
>
>   std::string format_msg(int errcode)
>   {
>     std::ostringstream msg;
>     msg << "Error: code is " << errcode;
>     return msg.str();
>   }

    That's a serious problem.

    Maybe we need to be able to explicitly say that a function
returns one of its own reference arguments. It's not "keeping"
it, it's "returning" it.  Perhaps like this:

    T& foo(return T& x)
    { return(x);  }

"return" does what iostreams needs.  It's also easy to understand.
x goes in, and x comes back out.

Returning a reference to a parameter is a common enough idiom that
it's worth viewing it as a special case.

    So, if we want to describe reference lifetime, we need "keep",
which means the kept reference outlives the call, and "return",
which means the reference is returned and the caller
determines what happens with it.  Is this sufficient?
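
As a point of comparison, Clang offers an attribute,
[[clang::lifetimebound]], that plays a role similar to the "return" marker
sketched above: it tells the compiler that the result may refer to the
annotated argument. The following is a rough illustration only. The
RETURNS_ARG macro and the function names are invented here, and on compilers
without the attribute the macro expands to nothing, so no checking happens.

```cpp
// Use the Clang attribute where available; expand to nothing elsewhere.
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(clang::lifetimebound)
#    define RETURNS_ARG [[clang::lifetimebound]]
#  endif
#endif
#ifndef RETURNS_ARG
#  define RETURNS_ARG
#endif

struct T { int value; };

// "x goes in, and x comes back out": the result refers to the argument.
inline const T& foo(const T& x RETURNS_ARG) { return x; }

// Fine: the reference is used while the named object y is still alive.
inline int use_locally()
{
    T y{42};
    const T& r = foo(y);
    return r.value;
}

// With the attribute active, Clang is expected to warn about a line such as
//     const T& dangling = foo(T{1});
// because the temporary is destroyed at the end of the full expression.
```

The caller decides what happens with the returned reference, which is the
behaviour described for "return"; nothing comparable exists here for "keep".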

>> And, as I pointed out, smart pointers don't have to update
>> reference counts for non-"keep" variables.  How to implement
>> that is another subject.  But this is a way to reduce
>> reference counting overhead substantially.
>
>
> Smart-pointers that need to have their refcounts updated are usually
> passed by value, not by reference. If you don't need to update refcounts,
> you pass them by reference instead. So there's no need to add a "keep"
> concept for smart-pointers, we already have it.

     I want to think about that a bit more.  It could work.

> If such a thing is introduced, it's going to be big. So a "step in the
> right direction" argument is not going to be acceptable. Either it's the
> Right Thing(tm) or we should not attempt it at all. That's my personal
> opinion.

      Good point.  Are "keep", "return", and "" (no attribute) enough
attributes to make this work?

> There are a few holes in this idea that prevent me from agreeing with you
> about its usefulness. If we agree that a "keep T&" is kept indefinitely,
> the question becomes "what could be bound to a keep T&?". We already
> excluded temporaries and local variables. So there remain only static
> variables and dynamically allocated objects. To have "keep" work for
> dynamically allocated objects, "keep" would need to be a qualifier like
> const and volatile and should be available for pointers also, and
> operator new should return a "keep T*" instead of a "T*". Unlike regular
> cv-qualifiers, a "keep T*" could be implicitly converted to a "T*" but
> not the opposite. If you don't have this, the "keep" could be easily
> circumvented, for example:
>
>   T& foo(keep T* t) { return *t; }
>   T& bar(T* t)      { return foo(t); }
>   T& baz()          { T y; return bar(&y); }

     "return", as described above, deals with these "in and out"
cases.
>
> Even with this machinery in place, you cannot prevent delete from causing
> havoc:
>
>   T& bar()
>   {
>     keep T* t = new T;
>     T& r = foo(t);
>     delete t;
>     return r;
>   }
>
> So "keeping" dynamically allocated objects is not completely safe.

    That's another issue - who can do a delete.  However, the
destructor for a smart pointer should throw an exception (ouch)
if called with a nonzero use count.  So we can catch that at
run time.

    What we really need is a general mechanism for declaring
new attributes and the rules enforced for them.  Then you
could try out things like this without modifying the
language.  Is anyone working on that?

    John Nagle
    Animats


Author: laurie.cheers@btinternet.com (Laurie Cheers)
Date: Thu, 21 Oct 2004 01:56:00 GMT

John Nagle wrote:
> Alberto Barbati wrote:
>> John Nagle wrote:
>>>    Other situations that require a "keep" include things like this
>>>
>>>     void f1(keep T& x)
>>>     { static T* lastx = &x;
>>>     }
>>
>> This is a much more interesting use-case. In this case the reference is
>> indeed kept indefinitely.
>
>     Yes, that's the true "keep" case.  That's the one that bites
> programmers, because something they didn't think kept a reference
> or pointer actually does.  This leads to subtle bugs.

If I understand your intent correctly, the most common "keep" case would be:

class Window
{
public:
  void setTitle( keep const string* title ) { m_Title = title; }
  const string* m_Title;
};

So how about this - would the parameter need to be declared "keep" here,
or not?

void Window::copyTitle( (keep?) Window* window ) { m_Title = window->m_Title; }

>     So, if we want to describe reference lifetime, we need "keep",
> which means the kept reference outlives the call, and "return",
> which means the reference is returned and the caller
> determines what happens with it.  Is this sufficient?

I hope "return" will also permit you to write out a value to an out parameter?
This function seems perfectly reasonable (albeit rather pointless):

void setString( return string* input, string** output ) { *output = input; }

If that's acceptable, how about this one?

void setTitle( Window* window, return string* title )
{
  window->setTitle( title );
}

window->setTitle wants a "keep" value - surely it should be legal to pass it
a "return" value, given that 'window' is one of our inputs?

And can you also declare things "return keep", if they do both things? Or will
"keep" on its own imply that?

> I want to think about that a bit more.  It could work.

I think so too. But the semantics need thoroughly hammering out.

> > There are a few holes in this idea that prevent me from agreeing with you
> > about its usefulness. If we agree that a "keep T&" is kept indefinitely,
> > the question becomes "what could be bound to a keep T&?". We already
> > excluded temporaries and local variables.

Well, I think a local variable should be perfectly acceptable, provided that
the object which is 'keeping' it is also a local variable...

int main()
{
  string title("Hello");
  Window window;
  window.setTitle(&title);
}
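
The reason this is safe is that automatic objects are destroyed in reverse
order of declaration, so the window goes away before the title it points to.
A small sketch that simply records destructor calls; Tracked and
destruction_order are names made up for the illustration:

```cpp
#include <string>
#include <vector>

// Appends its name to a log when destroyed.
struct Tracked {
    std::vector<std::string>* log;
    std::string name;
    Tracked(std::vector<std::string>* l, const std::string& n)
        : log(l), name(n) {}
    ~Tracked() { log->push_back(name); }
};

// Mirrors main() above: "title" is declared first, "window" second.
inline std::vector<std::string> destruction_order()
{
    std::vector<std::string> log;
    {
        Tracked title(&log, "title");
        Tracked window(&log, "window");
    }   // locals are destroyed in reverse order of declaration
    return log;
}
```

If the two declarations were swapped, the window would outlive the title,
which is the case a "keep" check would have to reject.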

>> Even with this machinery in place, you cannot prevent delete from causing
>> havoc:
>>
>>   T& bar()
>>   {
>>     keep T* t = new T;
>>     T& r = foo(t);
>>     delete t;
>>     return r;
>>   }
>>
>> So "keeping" dynamically allocated objects is not completely safe.
>
>     That's another issue - who can do a delete.  However, the
> destructor for a smart pointer should throw an exception (ouch)
> if called with a nonzero use count.  So we can catch that at
> run time.

Instead of reference counting, perhaps a more stable approach would be
transfer of ownership? For example:

owner T& foo(owner T& a, owner T& b)
{
  if (rand()%2)
  {
    delete a; // compile time error if this is not done
    return b;
  }
  else
  {
    delete b; // compile time error if this is not done
    return a;
  }
}

owner T& bar()
{
  owner T* t = new T;
  owner T* u = new T;
  owner T& r = foo(*t, *u);
  delete t; // compile-time error, t is now an invalid owner
  return r;
}

In other words, functions are required to clean up after themselves. If a
function is passed an "owner" variable as a parameter, it is now the owner of
that memory, and it must either delete it, keep it somewhere, or output it as
an "owner" return value.

When an "owner" variable is passed to a function, the existing owner
(t in the example above) will lose its owner attribute. Until a new value
is assigned to it, it's no longer possible to use this value as an owner.

If the variable isn't local, the compiler will demand that your function
assigns it a new value (e.g. NULL) before returning.
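
For comparison, std::unique_ptr (added to the standard library later, in
C++11) gives a run-time approximation of this scheme: ownership moves into
the callee, and the caller's pointer is left null. The sketch below is only
an illustration of that, not the "owner" qualifier described above; pick and
demo are invented names.

```cpp
#include <memory>
#include <utility>

struct T { int value; };

// Takes ownership of both objects and hands one back. The one that is not
// returned is destroyed automatically when its parameter goes out of scope.
inline std::unique_ptr<T> pick(std::unique_ptr<T> a, std::unique_ptr<T> b,
                               bool first)
{
    return first ? std::move(a) : std::move(b);
}

inline int demo()
{
    std::unique_ptr<T> t(new T{1});
    std::unique_ptr<T> u(new T{2});
    std::unique_ptr<T> r = pick(std::move(t), std::move(u), false);
    // t and u are null now. Unlike the "owner" idea, the compiler does not
    // reject later uses of them; that part is left to the programmer.
    return (t == nullptr && u == nullptr) ? r->value : -1;
}
```

The callee cannot forget to clean up, which covers the "compile time error
if this is not done" comments, but the invalid-owner check in bar() has no
compile-time equivalent here.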

--
Laurie Cheers


Author: nagle@animats.com (John Nagle)
Date: Thu, 21 Oct 2004 06:55:43 GMT

Laurie Cheers wrote:
> John Nagle wrote:
> If I understand your intent correctly, the most common "keep" case would be:
>
> class Window
> {
> public:
>   void setTitle( keep const string* title ) { m_Title = title; }
>   const string* m_Title;
> };
>
> So how about this - would the parameter need to be declared "keep" here,
> or not?
>
> void Window::copyTitle( (keep?) Window* window ) { m_Title = window->m_Title; }

    Yes.  You now have two pointers to the same thing, where before you
only had one, so you've definitely "kept" something.

    "keep" is intended to be conservative.  Some ambiguous cases
will require "keep", even when the code doesn't really
cause long term storage.

>>    So, if we want to describe reference lifetime, we need "keep",
>>which means the kept reference outlives the call, and "return",
>>which means the reference is returned and the caller
>>determines what happens with it.  Is this sufficient?
>
>
> I hope "return" will also permit you to write out a value to an out parameter?

    No, that's getting too fancy to support in a declaration.  Going
that way leads to a requirement that the full dependency graph of the
parameters be declared, which is too much to ask.

> This function seems perfectly reasonable (albeit rather pointless):

    Um.

>>I want to think about that a bit more.  It could work.
>
>
> I think so too. But the semantics need thoroughly hammering out.

    It does.

    Spring (the predecessor to Java) had something like this, and
it might be worth checking that out.  Spring was trying to solve
this problem, but in the end, they just made everything in Java
garbage collected to make the problem go away.

    It's too bad there's no C++ high-reliability group working on
things like this.

    John Nagle
    Animats


Author: nagle@animats.com (John Nagle)
Date: Sat, 16 Oct 2004 07:10:48 GMT

Bill Wade wrote:
> nagle@animats.com (John Nagle) wrote
>
>
>>    Are there troublesome cases here that can't be detected at
>>compile time?  Since references can't be reassigned, the compiler knows
>>the scope of the target of a reference at compile time, and should
>>be able to reliably prevent the return of an out of scope reference.
>
>
> static T t;
> T& foo(T& x){ return rand()&1 ? t : x; }
> T& bar(){ T y; return foo(y); }
>
> foo() never returns an out-of-scope reference.

    Yes, that's safe, and it's possible to confirm at compile time
that it's safe.
>
> To know if bar() returns an out-of-scope reference, the compiler has
> to look at the internals of both foo() and rand().  Even then it may
> not be able to answer the question at compile time.

    Good point.

    As previously discussed, the standard currently requires that
function arguments have lifetimes that survive the call in which
they appear.  So it is permissible to return a reference to an
argument.  This is widely used (most notably by iostreams) so
we're stuck with that.

> Of course, out-of-scope isn't the only possible problem.  Object (and
> memory) lifetime isn't strictly tied to scope.

    To really solve this problem, we need more
attributes for function arguments.  One useful set is "read",
"write", "delete", and "keep". Think of these as permissions.

    "read" allows reading an object.  This is implied in C++.
    "write" allows writing an object. This is implied by the
 absence of "const".
    "delete" allows deleting an object.  Currently, any argument
 (even const, I think) can be deleted, which is rather
 strong.  The absence of "delete" means the caller can
 assume the object will still be there after the call.
 (This is a separate subject and will not be discussed
 further in this note.)
    "keep" allows keeping a reference or pointer to an object
 beyond the return from a function.  The absence of
 "keep" means that reference counting systems don't
 have to update reference counts for objects passed
 without "keep", so there's a big performance
 win for smart pointers.  "keep" is relatively rare,
 and users of functions need to know when a function
 keeps its arguments, so this is a reasonable attribute
 to add to C++.  You can't pass a temporary to a
 "keep", of course, because it would result in a
 reference to a temporary.

A function that returns a reference to one of its own arguments
is doing a "keep".  In the example above, "foo" is doing a
"keep", and would have to be written

 T& foo(keep T& x){ return rand()&1 ? t : x; }

You can't pass a temporary to a "keep", so "bar" would
be in error.

So this is a way to do reference parameters safely, and
close a troublesome hole.

    Other situations that require a "keep" include
things like this

 void f1(keep T& x)
 { static T* lastx = &x;
 }

or the common idiom

 class Tptr {
  T& backptr;
 public:
  Tptr(keep T& p)
  : backptr(p) {}
 };

This is something worth having in declarations.
If you see "keep", you know there are variable lifetime
issues.  If you don't see "keep", you don't have to worry
about parameter lifetimes.

Of course, it means retrofitting code for "keep correctness",
much as we once had to retrofit for "const correctness".
But "keeping" is relatively rare, compared to "constness".  With
the notable exception of iostreams, very few functions
in the standard libraries "keep" their arguments.

And, as I pointed out, smart pointers don't have to update
reference counts for non-"keep" variables.  How to implement
that is another subject.  But this is a way to reduce
reference counting overhead substantially.

It's not fully backwards compatible, unfortunately.  But it's
a step in the right direction.

Comments?

   John Nagle
   Animats
