Topic: Rvalue Reference argument binding


Author: Douglas Turk <douglas.turk@gmail.com>
Date: Mon, 19 Apr 2010 14:59:17 CST
Hello,

I have been looking at the stated behavior of rvalue reference
argument binding in the Final Committee Draft, and the implemented
behavior in GCC 4.5 and Visual C++. I'm not sure what the intended
behavior is in a number of cases, and I think that the compilers'
behavior is surprising (but probably conforming with the FCD).

Given the following declarations:
#include <string>

void f(int &&);
void g(std::string &&);

struct X {
 operator int() const { return 5; }
 operator std::string() const { return "abc"; }
};

int i = 5;
int && ri = 5;
long j = 5L;
const char * sz = "abc";
const char sz2[] = "abc";
std::string s = "abc";
X x;
X && x2 = X();
X && x3 = X();
int && x4 = X();
std::string && x5 = X();

...which of the following calls are legal, and which would you
intuitively expect to be legal? Also, consider what would happen if
the formal argument types to f and g were const lvalue refs, or bare
values, instead of rvalue refs, and whether the answers should be
consistent. (The non-rvalue-ref cases are relatively well-understood,
and consistency with these cases is presumably desirable, especially
in light of papers such as N3010 - "Rvalue refs as 'funny' lvalues".)
Here are my test cases:

f(5);
f(i);
f(int(i));
f(j);
f(int(j));
f(j + 0);
f(X());
f(x);
f(x2);
f(x4);

g(std::string("abc"));
g("abc");
g(sz);
g(sz2);
g(static_cast<const char *>(sz));
g(sz + 0);
g(&sz[0]);
g(*&sz);
g(s);
g(s + "");
g(std::string(s));
g(X());
g(x);
g(x3);
g(x5);

(I didn't test std::move cases; I know that these work as expected.)

There are some clear-cut cases here, and some that are less clear. For
example, the first call to f and the first call to g are both clearly
legal; and the calls with lvalues that otherwise match the argument
type (f(i) and g(s)) are probably intended to be illegal under
13.3.3.1.4/3 - as proposed by paper N2844 ("Fixing a safety problem
with rvalue references").

I have informally tested this with Microsoft's Visual C++ 2010 (both
the EDG front-end used in the Intellisense feature, and the actual
compiler), and in GCC 4.5. The compilers are quite consistent. In all
cases, the compilers seem to apply the lvalue/rvalue test quite early
- they check to see if the argument's expression denotes an lvalue,
and reject anything that is clearly a non-rref lvalue (e.g. i, j, s,
sz, sz2, x, x2 ... x5), before checking to see if standard conversion
sequences might apply. Arguments that are obviously rvalues (e.g. j + 0,
sz + 0) are accepted by both. GCC and Visual C++ disagree over
whether string literals like "abc" are lvalues or rvalues - GCC says
they're rvalues, Visual C++ and EDG say they're lvalues.

Moving past this difference, I suggest that the overall behavior is
rather unintuitive - why should g(sz) fail, and yet g(sz + 0) succeed,
from a programmer's point of view? Surely a temporary string would
need to be created, in both cases, and the temporary could (and would)
get bound to the rvalue reference? After all, rvalue references refer
conceptually to temporaries or otherwise unwanted objects, and it is
the string which is the temporary in this case, not 'sz' or 'sz + 0'.
I am unsure whether this is what is intended in the FCD wording or
not; it seems like both are possible, but I don't know.

Going further, one may ask: why should an expression like g(sz + 0) be
valid, but g(s) be invalid? Both would clearly be legal if g accepted
a const lvalue ref, or a string by value. Surely a "real" std::string
(even if an lvalue) is conceptually closer to the argument type of g
than a const char *? Why should a programmer have to write
g(std::string(s)) when they don't have to write g(std::string(sz)),
due to an implicit conversion? In light of the issue raised in N2844,
we can't let the lvalue bind directly to the rvalue ref. However,
we could consider the normal copy constructor to be an implicit
conversion constructor, and define the behavior of g(s) to create an
intermediate copied temporary, then bind the temporary to the rvalue
ref argument, as if we wrote g(std::string(s)). It doesn't seem likely
that this is the intended interpretation, but 13.3.3.1.2/4 seems to
consider a copy or move constructor to be a user-defined conversion
constructor, so maybe it's not that unreasonable.

I suggest that this behavior could make the rvalue ref arguments
significantly more intuitive, and more consistent with lvalue
reference arguments. I also suggest that it could make rvalue ref
arguments more useful in their own right - currently they are mostly
used as part of an overload set with const lvalue-ref arguments, or
for move-only types, but with this interpretation, it's quite
reasonable to declare a function such as g(std::string&&), above, by
itself, even for a copyable type like std::string. Such an argument
would mean "move if you can, copy construct into a temporary and then
move otherwise". (Presumably the implicit copy construction into a
temporary would not be allowed for non-copyable types like
unique_ptr.)

Maybe all of this is me misunderstanding the wording, or looking at
the behavior of compilers that haven't been updated to reflect the
latest wording on this issue - I don't work on any compilers, and I'm
not involved in the standardization process, so I don't know.

Thanks in advance for your comments,
Doug Turk

--
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@netlab.cs.rpi.edu]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]