Thread

Topic: Defining lvalue, rvalue, pointer and reference

Author: thp@cs.ucr.edu
Date: Fri, 7 Nov 2003 06:38:51 +0000 (UTC)

James Dennett <jdennett@acm.org> wrote:
+ thp@cs.ucr.edu wrote:
+> johnchx <johnchx2@yahoo.com> wrote:
+> + thp@cs.ucr.edu wrote
+> +
+> +>    Widget {
+> +>      ...stuff...
+> +>      Widget* me() { return this; }
+> +>    };
+> +>
+> +>    Widget f();
+> +>
+> +>    Widget& r = *(f().me());
+> +>
+> +> Now we've attached an lvalue, r, to that rvalue-denoted object.
+> +>
+> +
+> + No, you've created a dangling reference.  r binds direclty to an
+> + lvalue, not to the temporary object returned by f(), so the
+> + temporary's lifetime ends at the end of the full expression.
+>
+> Not so:
+>
+>   There are two contexts in which temporaries are destroyed at a
+>   different point than the end of the full-expression. [...]  The
+>   second is when a reference is bound to a temporary. ...
+>                                               [12.2#4 and 12.2#5]
+
+ Vaguely interesting case.  I don't think the intent
+ was for this to apply: for example, you can't bind a
+ non-const reference to a temporary, but the code above
+ does successfully bind r to *(f().me()) -- because the
+ rhs doesn't count as a temporary, even though it's
+ actually the same object as f(), which *is* a temporary.

Yup.  I goofed.  I've posted a corrected version as a reply
to johnchx's followup to the same posting.

+ Let's drop the word "temporary"... f() is an rvalue
+ (expression).  f().me() uses an rvalue to lvalue
+ conversion to call me(), and returns another rvalue.
+ But *(f().me()) is an lvalue.  lvalues are not considered
+ to be temporaries.  The actual object that is the value
+ of this expression is the result of f(), and that has no
+ reason to live beyond the end of the full expression...
+ so r is a dangling reference.

I completely agree.

Tom Payne

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]

Author: do-not-spam-benh@bwsint.com (Ben Hutchings)
Date: Thu, 30 Oct 2003 22:46:42 +0000 (UTC)

johnchx wrote:
> do-not-spam-benh@bwsint.com (Ben Hutchings) wrote
<snip>
>> Here's a comparison of how C++ and the other languages handle
>> references - omitting Object Pascal as I have no experience with
>> it and, unlike those last three languages, I didn't think it was
>> worth looking into:
>
> Thanks for the comparison!  Interesting....
>
> BTW, out of curiousity, which of these languages has a language
> construct *called* a reference (or reference type)?
<snip>

C++, Simula 67 and Eiffel.  Algol 68 makes the distinction, since
it supports both by-value and by-reference manipulation, but it
calls references "names".  VB 6 has the "ByRef" keyword for passing
function arguments by reference, but otherwise doesn't seem to use
the term.

Note that Simula is based on Algol 60, Eiffel is based on Simula,
and C++ has been influenced by Algol (partly via C) and Simula.
("C++: Simula in wolf's clothing." - Bjarne Stroustrup.)  So
explicit references seem to be an Algol feature that has been
inherited by most of its derivatives.  As an exception to this,
Java, which seems semantically somewhere between Simula and
Smalltalk (and only syntactically influenced by C++), does not
have them.


Author: thp@cs.ucr.edu
Date: Thu, 30 Oct 2003 22:47:21 +0000 (UTC)

johnchx <johnchx2@yahoo.com> wrote:
+  Each expression has two (almost) orthogonal characteristics:
+  (a) type (int, bool, T, what-have-you...)
+  (b) category (lvalue or rvalue)

Oops.  I thought that you meant that lvalue and rvalue were orthogonal
to each other.
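
For what it's worth, the type/category split can be checked mechanically.
This is only a sketch, and it assumes a C++11 compiler (decltype postdates
this thread); foo, bar, j and the helper is_lvalue_expr are illustrative
names.  decltype((e)) yields T& when e is an lvalue and plain T when e is
a prvalue, so the category can be read off while the underlying type stays
int in every case:

```cpp
#include <type_traits>

int  foo();    // returns by value; only used in unevaluated operands below
int& bar();    // returns by reference; likewise never actually called
int  j = 0;

// Hypothetical helper: true when T is what decltype((e)) yields for an lvalue e.
template <typename T>
constexpr bool is_lvalue_expr() { return std::is_lvalue_reference<T>::value; }

// Category differs...
static_assert(is_lvalue_expr<decltype((j))>(),      "j is an lvalue");
static_assert(is_lvalue_expr<decltype((bar()))>(),  "bar() is an lvalue");
static_assert(!is_lvalue_expr<decltype((foo()))>(), "foo() is an rvalue");

// ...while the type of all three expressions is int.
static_assert(std::is_same<std::remove_reference<decltype((j))>::type, int>::value, "");
static_assert(std::is_same<std::remove_reference<decltype((bar()))>::type, int>::value, "");
static_assert(std::is_same<decltype((foo())), int>::value, "");
```
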

+> + I'm claiming that rvalues and lvalues denoting objects denote their
+> + referents differently.  An rvalue denotes the value of the object; an
+> + lvalue denotes the location of the object.
+>
+> I'm not going to say that there's no difference, but one can via an
+> rvalue mutate an object (i.e., change its value), e.g., f().flip(),
+> where f returns by value an object of a user-defined type having a
+> mutating member function, flip().
+
+ Well yes, but no.  This is a misleading formulation.  (I know it's
+ ususally taught and explained this way, but I think it causes more
+ confusion than it prevents.)
+
+ You can't mutate an rvalue.  Period.
+
+ If, however, the rvalue happens to have class type, you can call one
+ of its member functions.  That function receives the implicit this
+ pointer.  *this is an lvalue.  Thus the member function has a
+ perfectly valid lvalue to modify as it chooses.

The bottom line is that you can invoke a mutating member function of
an rvalue.  Consider:

   struct Widget {
     // ...stuff...
     Widget* me() { return this; }
   };

   Widget f();

   Widget& r = *(f().me());

Now we've attached an lvalue, r, to that rvalue-denoted object.
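
A smaller sketch of just the "mutating member function on an rvalue" part,
leaving the reference out of it.  Toggle, make_toggle and flipped_state
are made-up names, and the default member initializer assumes C++11:

```cpp
struct Toggle {
    bool on = false;
    // Non-const member function: inside it, *this is an lvalue,
    // even when the function is called on an rvalue.
    Toggle& flip() { on = !on; return *this; }
};

// Returns by value, so the call expression make_toggle() is an rvalue.
Toggle make_toggle() { return Toggle(); }

bool flipped_state() {
    // The temporary is modified and read within one full-expression,
    // so nothing here outlives the object it refers to.
    return make_toggle().flip().on;
}
```

flipped_state() returns true: the temporary was changed through the
lvalue *this, which is the point both sides seem to agree on.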

+> + Example:
+> +
+> +  int  foo();
+> +  int& bar();
+> +  int  j;
+> +  extern int& k;
+> +
+> +  int main() {
+> +    j;
+> +    k;
+> +    foo();
+> +    bar();
+> +  }
+> +
+> + In main(), the expression "j" is an lvalue.  The expression "k" is
+> + also an lvalue, whose type is adjusted to int.  That k is declared
+> + with reference type doesn't affect the interpretation of the
+> + expression "k".
+> +
+> + On the other hand, the expression "bar()" is an lvalue *because* it
+> + has reference type.  The expression "foo()," which doesn't have
+> + reference type, is an rvalue.
+>
+> Hmmmmm.  Consider:
+>
+>    int i;
+>    class Widget {
+>    public:
+>       int& k;
+>       Widget(i);
+>    } w1,w2;
+>    cout << &w1.k == &w2.k ? "yes" : "no" << endl;
+>
+
+ Does this compile?  I don't see what you're driving at.

Oops.  The point was to come up with a counter-example to (what I
understand to be) your claim that it makes no difference whether k is
an int& or an int.  So, consider:

    class Widget {
    public:
      int& k;
      Widget(int& i) : k(i) {}
    };
    int i;
    Widget w1(i);
    Widget w2(i);
    cout << (&w1.k == &w2.k ? "yes" : "no") << endl;

As it stands, k is an int& and we should get "yes" as output.  But, if
we drop the "&" in the declaration of k, we get "no".
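
Here is a self-contained version of that comparison, as a sketch.
RefWidget, ValWidget and members_share_address are names invented for the
illustration; the two structs differ only in whether k is an int& or an
int:

```cpp
struct RefWidget {
    int& k;                                 // k refers to the caller's int
    explicit RefWidget(int& i) : k(i) {}
};

struct ValWidget {
    int k;                                  // k is a separate int per object
    explicit ValWidget(int& i) : k(i) {}
};

// Builds two widgets from the same int and compares the addresses of
// their k members.
template <typename W>
bool members_share_address() {
    int i = 0;
    W w1(i);
    W w2(i);
    return &w1.k == &w2.k;
}
```

members_share_address<RefWidget>() is true ("yes") and
members_share_address<ValWidget>() is false ("no").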

+> + In other words, in C++, names declared with reference type behave just
+> + like all the other names in this respect -- why introduce a
+> + distinction only to explain that it doesn't actually apply. ;-)
+>
+> To disspell the all too common misconception that references are
+> names.
+>
+
+ I'm not sure I follow you here.  Names declared with reference type
+ are certainly names.

Right.  They name references, which at run-time can be initialized to
refer to this or that object.  But not all references have names, e.g.,
the return value of int& f(){return *new int;}.

+> C++ references can be bound at run time.  In C++, the default behavior
+> is that entities that can be bound at run time can be rebound at
+> run-time.
+
+ Huh?  Can you illustrate this?

   int i = 1;
   i = 2;

+> The point is that arbitrary decisions need to be mentioned -- they
+> don't go without saying.
+
+ [snip]
+
+> Such objects are an exceptions to the notion that references behave
+> like "implictly dereferenced pointers".
+
+ I guess that's the point, really: if you think of references as
+ automatically dereferenced pointers, then lots exceptions and special
+ rules have to be introduced and these will appear arbitrary.

The special rules have to be there in any case.  The question of
whether or not the decision to make C++ references non-reseatable was
"arbitrary" depends on which interpretation of "arbitrary" you pick.
Consider the following definition of "arbitrary":

  based on or determined by individual preference or convenience
  rather than by necessity or the intrinsic nature of something

(That's one of three possibilities from Merriam-Webster.)

Now consider the following from page 86 of D&E:

  It is not possible to change what a reference refers to after
  initialization.  That is, once a C++ reference is initialized it cannot
  be made to refer to a different object later; it cannot be re-bound.
  I had in the past been bitten by Algol68 references where r1=r2 can
  either assign through r1 to the object referred to or assign a new
  reference value to r1 (re-binding r1) depending on the type of r2.  I
  wanted to avoid such problems in C++.

Judge for yourself.
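
For concreteness, a sketch of what "cannot be re-bound" means in practice,
with a pointer alongside for contrast.  Outcome and
assign_through_reference are hypothetical names, and the braced return
assumes C++11:

```cpp
struct Outcome {
    int  a;
    int  b;
    bool ref_still_a;   // does r still denote a after "r = b"?
    bool ptr_now_b;     // does p point at b after "p = &b"?
};

Outcome assign_through_reference() {
    int a = 1, b = 2;
    int& r = a;
    int* p = &a;

    r = b;     // assigns b's value to a; r is not re-bound
    p = &b;    // a pointer, by contrast, is reseated

    return Outcome{a, b, &r == &a, p == &b};
}
```

After the call, a and b are both 2, r still denotes a, and p points at b.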

Tom Payne


Author: ahp6@email.byu.edu ("Adam H. Peterson")
Date: Fri, 31 Oct 2003 17:28:22 +0000 (UTC)

>>To disspell the all too common misconception that references are
>>names.
>>
>
>
> I'm not sure I follow you here.  Names declared with reference type
> are certainly names.

I'm not sure we want to resort to name calling.

(To the moderator:  Sorry, I couldn't resist.)


Author: johnchx2@yahoo.com (johnchx)
Date: Sat, 1 Nov 2003 03:24:27 +0000 (UTC)

thp@cs.ucr.edu wrote

>    Widget {
>      ...stuff...
>      Widget* me() { return this; }
>    };
>
>    Widget f();
>
>    Widget& r = *(f().me());
>
> Now we've attached an lvalue, r, to that rvalue-denoted object.
>

No, you've created a dangling reference.  r binds directly to an
lvalue, not to the temporary object returned by f(), so the
temporary's lifetime ends at the end of the full expression.

The term "bind" is used by the standard in this context in a very
specific way; in particular what a reference binds to is *not*
identical with what it denotes.

> The point was to come up with a counter-example to (what I
> understand to be) your claim that it makes no difference whether k is
> an int& or an int.  So, consider:
>
>     class Widget {
>     public:
>       int& k;
>       Widget(int& i) : k(i) {}
>     };
>     int i;
>     Widget w1(i);
>     Widget w2(i);
>     cout << &w1.k == &w2.k ? "yes" : "no" << endl;
>
> As it stands, k is an int& and we should get "yes" as output.  But, if
> we drop the "&" in the declaration of k, we get "no".
>

w1.k and w2.k are both lvalues which denote the objects they were
initialized to denote.  Change the type of Widget::k to int and,
voila, w1.k and w2.k remain lvalues which denote the objects they were
initialized to denote.  That's the only sense in which I mean they are
"the same."

The initialization semantics are different, of course.  If the type of
Widget::k is int, the compiler will implicitly create the denoted
object; if the type is int&, the user must explicitly identify the
denoted object.


> +> + In other words, in C++, names declared with reference type behave just
> +> + like all the other names in this respect -- why introduce a
> +> + distinction only to explain that it doesn't actually apply. ;-)
> +>
> +> To disspell the all too common misconception that references are
> +> names.
> +>
> +
> + I'm not sure I follow you here.  Names declared with reference type
> + are certainly names.
>
> Right.  They name references,

NO, NO, NO!  ;-)

That's the intuition I'm trying to undo: that "a reference" is a
"thing", and that the name of a reference denotes that thing.  This is
simply not so.

3/4 tells us:

  A *name* is the use of an identifier (2.10) that denotes an
  entity or a *label* (6.6.4, 6.1).

3/3 tells us:

  An *entity* is a value, object, subobject, base class subobject,
  array element, variable, function, instance of a function,
  enumerator, type, class member, template or namespace.

Thus a name never denotes "a reference".  A name declared with
reference type denotes the object or function it was initialized to
refer to, and nothing else.  "The reference" does not exist.
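
A small sketch of what that looks like from inside a program.  The names
twice and reference_names_denote_referents are invented for the example;
it covers both an object and a function:

```cpp
int twice(int x) { return 2 * x; }

bool reference_names_denote_referents() {
    int i = 0;
    int& r = i;                // r denotes the object i
    int (&fr)(int) = twice;    // fr denotes the function twice

    r = 5;                     // writes to i; there is no separate "r" to write to

    // Taking the address of either name yields the address of the referent.
    return &r == &i && &fr == &twice && i == 5 && fr(3) == 6;
}
```

The function returns true: every operation applied to r or fr lands on i
or twice.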


>
> +> C++ references can be bound at run time.  In C++, the default behavior
> +> is that entities that can be bound at run time can be rebound at
> +> run-time.
> +
> + Huh?  Can you illustrate this?
>
>    int i = 1;
>    i = 2;
>

i denotes the same object before and after the assignment.  I don't
see any rebinding here.  Unless you're suggesting that built-in
assignment ends the lifetime of the original object and begins the
lifetime of a new one by reusing its storage.  I don't think the standard
supports that view, but if it does, the same "rebinding" mechanism
works for references (3.8/7).


> +> The point is that arbitrary decisions need to be mentioned -- they
> +> don't go without saying.
> +
> + [snip]
> +
> +> Such objects are an exceptions to the notion that references behave
> +> like "implictly dereferenced pointers".
> +
> + I guess that's the point, really: if you think of references as
> + automatically dereferenced pointers, then lots exceptions and special
> + rules have to be introduced and these will appear arbitrary.
>
> The special rules have to be there in any case.  The question of
> whether or not the decision to make C++ reference non-reseatable was
> "arbitrary" depends on which interpretation of "arbitrary" you pick.
> Consider the following definition of "arbitrary":
>
>   based on or determined by individual preference or convenience
>   rather than by necessity or the intrinsic nature of something
>
> (That's one of three possibilities from Merriam-Webster.)
>

I like it.

I find that if I think of references simply as expressions (or names,
where applicable) whose referent must be specified explicitly, almost
all of the "rules" for their use emerge naturally.  If I think of
them as pointers whose pointer-ness has been hidden from me by the
design of the language, then the rules for their use seem arbitrary
and confusing.

I suppose that one could argue that references are *really*
automatically dereferenced pointers hemmed in by a bunch of special
rules and exceptions.  I don't know what references *really* are, in
some metaphysical sense.  Does it matter?


> Now consider the following from page 86 of D&E:
>
>   It is not possible to change what a reference refers to after
>   initialization.  That is, once a C++ reference is intialized it cannot
>   be made to refer to a different object later; it cannot be re-bound.
>   I had in the past been bitten by Algol68 references where r1=r2 can
>   either assign through r1 to the object referred to or assign a new
>   reference value to r1 (re-binding r1) depending on the type of r2.  I
>   wanted to avoid such problems in C++.
>
> Judge for yourself.

Hmmm.  So, in Algol68, assignment to a reference can *sometimes*
rebind it and *sometimes* assign a new value to the existing referent.
And you find C++ references strange and arbitrary?  :-)


Author: thp@cs.ucr.edu
Date: Sat, 1 Nov 2003 08:28:22 +0000 (UTC)

johnchx <johnchx2@yahoo.com> wrote:
+ thp@cs.ucr.edu wrote
+
+>    Widget {
+>      ...stuff...
+>      Widget* me() { return this; }
+>    };
+>
+>    Widget f();
+>
+>    Widget& r = *(f().me());
+>
+> Now we've attached an lvalue, r, to that rvalue-denoted object.
+>
+
+ No, you've created a dangling reference.  r binds direclty to an
+ lvalue, not to the temporary object returned by f(), so the
+ temporary's lifetime ends at the end of the full expression.

Not so:

  There are two contexts in which temporaries are destroyed at a
  different point than the end of the full-expression. [...]  The
  second is when a reference is bound to a temporary. ...
                                              [12.2#4 and 12.2#5]

+ The term "bind" is used by the standard in this context in a very
+ specific way; in particular what a reference binds to is *not*
+ identical with what it denotes.

Nevertheless, per 8.5.3#5:

  If the initializer expression ... is an lvalue ... then the
  reference is bound directly to the initializer expression lvalue ...

Specifically, in the case at hand, "Widget& r = *(f().me());" binds
r directly to the result of evaluating "*(f().me())".

+> The point was to come up with a counter-example to (what I
+> understand to be) your claim that it makes no difference whether k is
+> an int& or an int.  So, consider:
+>
+>     class Widget {
+>     public:
+>       int& k;
+>       Widget(int& i) : k(i) {}
+>     };
+>     int i;
+>     Widget w1(i);
+>     Widget w2(i);
+>     cout << &w1.k == &w2.k ? "yes" : "no" << endl;
+>
+> As it stands, k is an int& and we should get "yes" as output.  But, if
+> we drop the "&" in the declaration of k, we get "no".
+>
+
+ w1.k and w2.k are both lvalues which denote the objects they were
+ initialized to denote.  Change the type of Widget::k to int and,
+ viola, w1.k and w2.k remain lvalues which denote the objects they were
+ initialized to denote.  That's the only sense in which I mean they are
+ "the same."


+ The initialization semantics are different, of course.  If the type of
+ Widget::k is int, the compiler will implicitly create the denoted
+ object; if the type is int&, the user must explicitly identify the
+ denoted object.

And in the second case we've identified the same object, while in the
first case the compiler must create distinct objects.  That's a
detectable difference.

+> +> + In other words, in C++, names declared with reference type behave just
+> +> + like all the other names in this respect -- why introduce a
+> +> + distinction only to explain that it doesn't actually apply. ;-)
+> +>
+> +> To disspell the all too common misconception that references are
+> +> names.
+> +>
+> +
+> + I'm not sure I follow you here.  Names declared with reference type
+> + are certainly names.
+>
+> Right.  They name references,
+
+ NO, NO, NO!  ;-)
+
+ That's the intuition I'm trying to undo: that "a reference" is a
+ "thing", and that the name of a reference denotes that thing.  This is
+ simply not so.
+
+ 3/4 tells us:
+
+  A *name* is the use of an identifier (2.10) that denotes an
+  entity or a *label* (6.6.4, 6.1).
+
+ 3/3 tells us:
+
+  An *entity* is a value, object, subobject, base clas subobject
+  array element, variable, function, instance of a function,
+  enumerator, type, class member, template or namespace.
+
+ Thus a name never denotes "a reference".  A name declared with
+ reference type denotes the object or function it was initialized to
+ refer to, and nothing else.  "The reference" does not exist.

So, what about references that have no names, e.g., the return value
of int& f() {return *new int;} ?  The Standard may choose to call it a
non-entity, but it exists at run time and at run time can be given a
name, e.g, "int& r = f();".  But don't believe that it needs a name ---
"f().print_it;" might print out some very important information.

+> +> C++ references can be bound at run time.  In C++, the default behavior
+> +> is that entities that can be bound at run time can be rebound at
+> +> run-time.
+> +
+> + Huh?  Can you illustrate this?
+>
+>    int i = 1;
+>    i = 2;
+>
+
+ i denotes the same object before and after the assignment.

Yes, but names aren't entities per your enumeration above.  The entity
that gets rebound is the object that i denotes.

+ I don't see any rebinding here.  Unless you're suggesting that built-in
+ assignment ends the lifetime of the original object and begins the
+ lifetime a new one by reusing its storage.  I don't think the standard
+ supports that view, but if it does, the same "rebinding" mechanism
+ works for references (3.8/7).

That object (nameless variable) gets re-bound to a new value, namely 2.

+> +> The point is that arbitrary decisions need to be mentioned -- they
+> +> don't go without saying.
+> +
+> + [snip]
+> +
+> +> Such objects are an exceptions to the notion that references behave
+> +> like "implictly dereferenced pointers".
+> +
+> + I guess that's the point, really: if you think of references as
+> + automatically dereferenced pointers, then lots exceptions and special
+> + rules have to be introduced and these will appear arbitrary.
+>
+> The special rules have to be there in any case.  The question of
+> whether or not the decision to make C++ reference non-reseatable was
+> "arbitrary" depends on which interpretation of "arbitrary" you pick.
+> Consider the following definition of "arbitrary":
+>
+>   based on or determined by individual preference or convenience
+>   rather than by necessity or the intrinsic nature of something
+>
+> (That's one of three possibilities from Merriam-Webster.)
+>
+
+ I like it.
+
+ I find that if I think of references simply as expressions (or names,
+ where applicable) whose referent must be specified explicitly, almost
+ all of the "rules" for their use emerege naturally.  If I think of
+ them as pointers whose pointer-ness has been hidden from me by the
+ design of the language, then the rules for their use seem arbitrary
+ and confusing.
+
+ I suppose that one could argue that references are *really*
+ automatically dereferneced pointers hemmed in by a bunch of special
+ rules and exceptions.  I don't know what references *really* are, in
+ some metaphysical sense.  Does it matter?
+
+
+> Now consider the following from page 86 of D&E:
+>
+>   It is not possible to change what a reference refers to after
+>   initialization.  That is, once a C++ reference is intialized it cannot
+>   be made to refer to a different object later; it cannot be re-bound.
+>   I had in the past been bitten by Algol68 references where r1=r2 can
+>   either assign through r1 to the object referred to or assign a new
+>   reference value to r1 (re-binding r1) depending on the type of r2.  I
+>   wanted to avoid such problems in C++.
+>
+> Judge for yourself.
+
+ Hmmm.  So, in Algol68, assignment to a reference can *sometimes*
+ rebind it and *sometimes* assign a new value to the existing referent.
+ And you find C++ references strange and arbitrary?  :-)

I've claimed that the decision to render references unreseatable in
C++ is "arbitrary", per the above quotes.  I've not claimed that it is
"strange".

In any case, Simula67, Java, Python, C#, and many other languages have
reseatable references.  C++ is the only language I know of whose
references are not reseatable.  FWIW, I've been told by folks who
claim to know Algol68 that Stroustrup is incorrect.  They say that
assignment's left operand is given its most dereferenced type and that
the right operand is dereferenced until it has a type that is
assignment-compatible with the left operand.

Tom Payne


Author: thp@cs.ucr.edu
Date: Sat, 1 Nov 2003 16:00:22 CST

Ben Hutchings <do-not-spam-benh@bwsint.com> wrote:
+ johnchx wrote:
+> do-not-spam-benh@bwsint.com (Ben Hutchings) wrote
+ <snip>
+>> Here's a comparison of how C++ and the other languages handle
+>> references - omitting Object Pascal as I have no experience with
+>> it and, unlike those last three languages, I didn't think it was
+>> worth looking into:
+>
+> Thanks for the comparison!  Interesting....
+>
+> BTW, out of curiousity, which of these languages has a language
+> construct *called* a reference (or reference type)?
+ <snip>
+
+ C++, Simula 67 and Eiffel.  Algol 68 makes the distinction, since
+ it supports both by-value and by-reference manipulation, but it
+ calls references "names".

I've never compiled an Algol68 program, but I've had to go over some
of its features in programming-language courses.  What I recall is
that ref is (in essence) a type qualifier that means "object".
Consider:
  int  i = 3;
  int& r = i;
Per Algol 68, 3 is an int, i names a "ref int", and r names a "ref ref
int".  I would say that 3 is an int, i names an int-valued object, etc.

+ VB 6 has the "ByRef" keyword for passing
+ function arguments by reference, but otherwise doesn't seem to use
+ the term.
+
+ Note that Simula is based on Algol 60, Eiffel is based on Simula,
+ and C++ has been influenced by Algol (partly via C) and Simula.
+ ("C++: Simula in wolf's clothing." - Bjarne Stroustrup.)  So
+ explicit references seem to be an Algol feature that has been
+ inherited by most of its derivatives.

Perhaps the influence was the other way around: Simula67
vs. Algol68, where the suffix indicates the year.  But I suspect that
references were an idea whose time had come.

+ As an exception to this,
+ Java, which seems semantically somewhere between Simula and
+ Smalltalk (and only syntactically influenced by C++), does not
+ have them.

Rather, Java has nothing else, at least where user-defined types are
concerned.

Tom Payne


Author: johnchx2@yahoo.com (johnchx)
Date: Sun, 2 Nov 2003 05:05:43 +0000 (UTC)

thp@cs.ucr.edu wrote
> johnchx <johnchx2@yahoo.com> wrote:
> + thp@cs.ucr.edu wrote
> +
> +>    Widget {
> +>      ...stuff...
> +>      Widget* me() { return this; }
> +>    };
> +>
> +>    Widget f();
> +>
> +>    Widget& r = *(f().me());
> +>
> +> Now we've attached an lvalue, r, to that rvalue-denoted object.
> +>
> +
> + No, you've created a dangling reference.  r binds direclty to an
> + lvalue, not to the temporary object returned by f(), so the
> + temporary's lifetime ends at the end of the full expression.
>
> Not so:
>
>   There are two contexts in which temporaries are destroyed at a
>   different point than the end of the full-expression. [...]  The
>   second is when a reference is bound to a temporary. ...
>                                               [12.2#4 and 12.2#5]
>

Correct: so the question is whether a reference is bound to the
temporary.


> Nevertheless, per 8.5.3#5:
>
>   If the initializer expression ... is an lvalue ... then the
>   reference is bound directly to the initializer expression lvalue ...
>

Yes: the reference is bound directly to the lvalue.  Notice carefully
what the standard does *not* say -- it doesn't say that the reference
is bound to the object denoted by the lvalue.

If you read further in 8.5.3/5, you'll come to the text that describes
the cases in which references bind to objects.  These always involve
references initialized with rvalue expressions.

I believe most compilers these days get this right. (I know that gcc
does and that Borland doesn't.)  Here's a quick test:

   #include <iostream>

   struct foo
   {
      ~foo() { std::cout << "foo dtor" << std::endl; }
      foo* get() {return this;}
   };

   int main() {
      std::cout << "before" << std::endl;
      foo& f =  *(foo().get()) ;
      std::cout << "after" << std::endl;
   }
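
As a contrast, here is a sketch that puts the two cases side by side: a
reference initialized with an lvalue (as in the test above) and a const
reference initialized directly with an rvalue.  It records events in a
vector instead of printing so the order can be inspected afterwards;
Tracked, events, bound_to_lvalue and bound_to_rvalue are names made up
for this illustration.

```cpp
#include <string>
#include <vector>

std::vector<std::string> events;   // illustrative event log

struct Tracked {
    ~Tracked() { events.push_back("dtor"); }
    Tracked* get() { return this; }
};

// Initializer is an lvalue: no reference is bound to the temporary,
// so it is destroyed at the end of the full-expression.
std::vector<std::string> bound_to_lvalue() {
    events.clear();
    {
        events.push_back("before");
        Tracked& t = *(Tracked().get());   // t dangles from here on
        (void)t;                           // names t without reading the dead object
        events.push_back("after");
    }
    return events;
}

// Initializer is an rvalue: the reference is bound to the temporary,
// which then lives as long as the reference does.
std::vector<std::string> bound_to_rvalue() {
    events.clear();
    {
        events.push_back("before");
        const Tracked& t = Tracked();
        (void)t;
        events.push_back("after");
    }                                      // temporary destroyed here
    return events;
}
```

On a conforming compiler bound_to_lvalue() yields before, dtor, after,
while bound_to_rvalue() yields before, after, dtor.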


> + The initialization semantics are different, of course.  If the type of
> + Widget::k is int, the compiler will implicitly create the denoted
> + object; if the type is int&, the user must explicitly identify the
> + denoted object.
>
> And in the first case we've identified the same object, while in the
> first case, the compiler must create distinct objects.  That's a
> detectable difference.

As I said, the initialization semantics are different.


> +> +> + In other words, in C++, names declared with reference type behave just
> +> +> + like all the other names in this respect -- why introduce a
> +> +> + distinction only to explain that it doesn't actually apply. ;-)
> +> +>
> +> +> To disspell the all too common misconception that references are
> +> +> names.
> +> +>
> +> +
> +> + I'm not sure I follow you here.  Names declared with reference type
> +> + are certainly names.
> +>
> +> Right.  They name references,
> +
> + NO, NO, NO!  ;-)
> +
> + That's the intuition I'm trying to undo: that "a reference" is a
> + "thing", and that the name of a reference denotes that thing.  This is
> + simply not so.
> +
> + 3/4 tells us:
> +
> +  A *name* is the use of an identifier (2.10) that denotes an
> +  entity or a *label* (6.6.4, 6.1).
> +
> + 3/3 tells us:
> +
> +  An *entity* is a value, object, subobject, base clas subobject
> +  array element, variable, function, instance of a function,
> +  enumerator, type, class member, template or namespace.
> +
> + Thus a name never denotes "a reference".  A name declared with
> + reference type denotes the object or function it was initialized to
> + refer to, and nothing else.  "The reference" does not exist.
>
> So, what about references that have no names, e.g., the return value
> of int& f() {return *new int;} ?  The Standard may choose to call it a
> non-entity, but it exists at run time and at run time can be given a
> name, e.g, "int& r = f();".  But don't believe that it needs a name ---
> "f().print_it;" might print out some very important information.

This seems like a non-sequitur, so maybe we're missing each other's
points.  Let me recap, and perhaps we can figure out where the thread
was lost.

I was making a point about names declared with reference type.

You responded with an allusion to "the all too common misconception
that references are names."

Since my point was about names declared with reference type, I replied
that such names were certainly names.

You answered that such names name references.

I replied that that is not possible.  Names name labels or entities.
References are neither labels nor entities.  Therefore it cannot be
true that names declared with reference type name references.

Your reply reiterated the existence of references without names
(function parameters and returns declared with reference type).

I'm at a loss to understand what they have to do with names declared
with reference type.  Perhaps you're making an unrelated point, and
I'm confusing myself by trying to tie it back to where we started.
Can you clarify?


> +> +> C++ references can be bound at run time.  In C++, the default behavior
> +> +> is that entities that can be bound at run time can be rebound at
> +> +> run-time.
> +> +
> +> + Huh?  Can you illustrate this?
> +>
> +>    int i = 1;
> +>    i = 2;
> +>
> +
> + i denotes the same object before and after the assignment.
>
> Yes, but names aren't entities per your enumeration above.  The entity
> that gets rebound is the object that i denotes.

Hmmm... I don't think I understand your definition of binding, then.
As I use the term in this context, "binding" is the relationship
between a name and an entity.  Re-binding means causing an existing
name to refer to a different entity.  What do you mean by "binding"
and "re-binding"?


> + Hmmm.  So, in Algol68, assignment to a reference can *sometimes*
> + rebind it and *sometimes* assign a new value to the existing referent.
> + And you find C++ references strange and arbitrary?  :-)
>
> I've claimed that the decision to render references unreseatable in
> C++ is "arbitrary", per the above quotes.  I've not claimed that it is
> "strange".

You're right of course.  I've gotten the impression that you find the
behavior of C++ references to be surprising or unnatural, but I may be
inferring too much.  My apologies.


> In any case, Simula67, Java, Python, C#, and many other languages have
> reseatable references.  C++ is the only language I know of whose
> references are not reseatable.

Maybe it would be better to say that C++ doesn't have anything
comparable to references in those languages.

I say that because the differences between a C++ reference and an
automatically dereferenced pointer (which is, I gather, more or less
the notion of "reference" in these other languages) are deeper than
just "no reseating."  (For example, a reference isn't guaranteed to
have a size; a reference isn't an object.)
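
A minimal sketch of that difference, assuming a C++11 or later compiler
(std::reference_wrapper postdates this thread, and the function names
are made up for illustration).  Assigning through a C++ reference
stores into the referent, while a pointer-like wrapper can be made to
refer to something else:

```cpp
#include <cassert>
#include <functional>

// Assigning through a C++ reference stores into the referent; the
// binding itself never changes.
bool assignment_writes_through() {
    int a = 1, b = 2;
    int& r = a;
    r = b;                          // a becomes 2; r still denotes a
    return a == 2 && &r == &a;
}

// std::reference_wrapper behaves like an automatically dereferenced
// pointer: assigning another wrapper reseats it and leaves a untouched.
bool wrapper_reseats() {
    int a = 1, b = 2;
    auto w = std::ref(a);           // std::reference_wrapper<int>
    w = std::ref(b);                // now refers to b
    return a == 1 && &w.get() == &b;
}
```

Both functions return true; the wrapper is an object with a size and
an address of its own, which the reference is not required to have.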

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]

Author: jdennett@acm.org (James Dennett)
Date: Mon, 3 Nov 2003 21:36:47 +0000 (UTC) Raw View

thp@cs.ucr.edu wrote:
> johnchx <johnchx2@yahoo.com> wrote:
> + thp@cs.ucr.edu wrote
> +
> +>    struct Widget {
> +>      ...stuff...
> +>      Widget* me() { return this; }
> +>    };
> +>
> +>    Widget f();
> +>
> +>    Widget& r = *(f().me());
> +>
> +> Now we've attached an lvalue, r, to that rvalue-denoted object.
> +>
> +
> + No, you've created a dangling reference.  r binds directly to an
> + lvalue, not to the temporary object returned by f(), so the
> + temporary's lifetime ends at the end of the full expression.
>
> Not so:
>
>   There are two contexts in which temporaries are destroyed at a
>   different point than the end of the full-expression. [...]  The
>   second is when a reference is bound to a temporary. ...
>                                               [12.2#4 and 12.2#5]

Vaguely interesting case.  I don't think the intent
was for this to apply: for example, you can't bind a
non-const reference to a temporary, but the code above
does successfully bind r to *(f().me()) -- because the
rhs doesn't count as a temporary, even though it's
actually the same object as f(), which *is* a temporary.

Let's drop the word "temporary"... f() is an rvalue
(expression).  f().me() uses an rvalue to lvalue
conversion to call me(), and returns another rvalue.
But *(f().me()) is an lvalue.  lvalues are not considered
to be temporaries.  The actual object that is the value
of this expression is the result of f(), and that has no
reason to live beyond the end of the full expression...
so r is a dangling reference.
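
A rough way to observe this, assuming a Widget instrumented with a
live-object counter (the counter, the body of f() and the two helper
functions are illustrative additions, not part of the code quoted
above):

```cpp
#include <cassert>

static int alive = 0;   // number of Widget objects currently alive

struct Widget {
    Widget() { ++alive; }
    Widget(const Widget&) { ++alive; }
    ~Widget() { --alive; }
    Widget* me() { return this; }
};

Widget f() { return Widget(); }

// Binding a const reference directly to the rvalue f() extends the
// temporary's lifetime to that of the reference.
int alive_after_direct_binding() {
    const Widget& r = f();
    (void)r;
    return alive;               // the temporary is still alive here
}

// Going through me() yields an lvalue, so no lifetime extension
// applies: the temporary dies at the end of the full expression.
int alive_after_round_trip() {
    Widget& r = *(f().me());    // r dangles from here on
    (void)r;                    // named but never read through
    return alive;               // the temporary is already gone
}
```

The first function returns 1 and the second returns 0, which is the
difference between the two bindings.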

-- James.

---

Author: do-not-spam-benh@bwsint.com (Ben Hutchings)
Date: Mon, 3 Nov 2003 21:38:34 +0000 (UTC) Raw View

In article <bnvll9$8hv$2@glue.ucr.edu>, thp@cs.ucr.edu wrote:
> Ben Hutchings <do-not-spam-benh@bwsint.com> wrote:
<snip>
> + Note that Simula is based on Algol 60, Eiffel is based on Simula,
> + and C++ has been influenced by Algol (partly via C) and Simula.
> + ("C++: Simula in wolf's clothing." - Bjarne Stroustrup.)  So
> + explicit references seem to be an Algol feature that has been
> + inherited by most of its derivatives.
>
> Perhaps, the influence was the other way around.  Simula67
> vs. Algol68.  The suffix indicates the year.  But I suspect that
> references were an idea whose time had come.

Sorry, yes, you're right - Algol 60 had only the bizarre call-by-
name feature and not the more general name mode (reference type)
concept of Algol 68.  Of course, Algol and Simula came out of
academia, where it is common to discuss programming features
without having an implementation or complete design for them, so
references presumably could be implemented in multiple languages
simultaneously based on papers rather than on a single
predecessor.

> + As an exception to this,
> + Java, which seems semantically somewhere between Simula and
> + Smalltalk (and only syntactically influenced by C++), does not
> + have them.
>
> Rather, Java has nothing else, at least where user-defined types are
> concerned.

Neither does Simula, but it's still explicit about references.

(It's probably time, or past time, to end this thread as it no
longer seems relevant to C++.)

---

Author: johnchx2@yahoo.com (johnchx)
Date: Sun, 19 Oct 2003 19:25:19 +0000 (UTC) Raw View

do-not-spam-benh@bwsint.com (Ben Hutchings) wrote

> Following the thread "Fancy pointers that behave like Java-style
> reference?", here's my attempt to define what these things are and
> how they are related.  They aren't actually that simple, but
> hopefully they avoid using vague terms such as 'alias' and are also
> correct.

A good idea indeed...so good I feel compelled to take a stab at it myself.  :-)


Expressions
-----------
-> every expression has a type

-> every expression is an lvalue or an rvalue (3.10/1)

-> the language provides rules which determine
   whether an expression is an lvalue or an rvalue


Reference Type
--------------
-> an expression with reference type is interpreted as an
   lvalue, regardless of the other rules which might apply

-> having reference type has no further effect on the
   interpretation of an expression ( 5/6 )


References
----------
-> the phrase "a reference" may refer to one of the
   following:  a non-member name with reference type,
   a member name with reference type, the value of a
   function whose declared return type has reference
   type, or a function parameter of reference type

-> it is possible to declare a non-member name as having
   reference type; the definition of such a name shall
   initialize it with an lvalue of compatible type

-> it is possible to declare a member name as having
   reference type; such a member must be initialized
   with an lvalue of compatible type

-> it is possible to declare a function as returning a
   reference type; a function call expression calling
   such a function is an lvalue; the function shall
   be defined to return an lvalue

-> it is possible to declare a function parameter of
   reference type; a function call expression calling
   such a function must initialize such a parameter with
   an lvalue of compatible type.
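
The four constructs listed above can be sketched as follows.  All
names here are invented for illustration:

```cpp
#include <cassert>

int global = 10;

int& named_ref = global;              // 1. non-member name with reference type

struct Holder {
    int& member;                      // 2. member name with reference type
    explicit Holder(int& target) : member(target) {}
};

int& pick() { return global; }        // 3. function returning a reference type

void bump(int& param) { ++param; }    // 4. function parameter of reference type

// Each construct is initialized with the lvalue "global", and each use
// below modifies that same object.
int exercise_all_four() {
    global = 10;
    named_ref += 1;                   // 11
    Holder h(global);
    h.member += 1;                    // 12
    pick() += 1;                      // the call expression is an lvalue: 13
    bump(global);                     // 14
    return global;
}
```

exercise_all_four() returns 14, since all four routes lead to the one
object.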


lvalues & rvalues
-----------------
-> the concepts of type an lvalue/rvalue-ness are
   completely orthogonal, with two exceptions:

   (a) it is impossible to have an rvalue of
       function type and
   (b) all expressions with reference type are
       lvalues

-> put another way, the purpose of reference type
   is to allow the programmer to use the type system
   to affect the lvalue-ness of expressions

-> intuitively, an lvalue denotes a location, while
   an rvalue denotes a value.

-> an lvalue-to-rvalue conversion is accomplished by
   reading the value stored at the location denoted
   by the lvalue (this is not allowed for lvalues
   of function type)

-> an rvalue-to-lvalue "conversion" can in general
   be accomplished only by copying the value
   denoted by the rvalue into a known location,
   which is thereafter denoted by the lvalue; this
   is called "introducing a temporary"

-> it is possible to take the address of an lvalue;
   it is not possible to take the address of an
   rvalue
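
As a hedged illustration of the rules above, here is a sketch that
uses an overload pair to report whether an int expression is an lvalue.
It relies on rvalue references (int&&), a C++11 feature that did not
exist when this was written, and every name in it is hypothetical:

```cpp
#include <cassert>

// Overload resolution picks the first for lvalues, the second for rvalues.
bool is_lvalue(int&)  { return true; }
bool is_lvalue(int&&) { return false; }

int storage = 7;
int by_value() { return 42; }
int& by_reference() { return storage; }

// Binding a const reference to an rvalue introduces a temporary; the
// temporary has a location of its own, so its address can be taken.
bool temporary_has_its_own_location() {
    const int& r = by_value();
    return r == 42 && &r != &storage;
}
```

With these definitions is_lvalue(storage) and
is_lvalue(by_reference()) are true, while is_lvalue(by_value()) and
is_lvalue(storage + 1) are false.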


Pointers
--------
-> if p has the type pointer-to-T, the expression *p is
   an lvalue of type T (5.3.1/1); it does not have
   reference type
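
A small sketch of what that means in practice (the function name is
made up):

```cpp
#include <cassert>

// *p designates the pointed-to int itself, so it can be assigned to
// and have its address taken like any other lvalue of type int.
bool deref_is_lvalue() {
    int x = 9;
    int* p = &x;
    *p = 10;                    // store through the lvalue *p
    return x == 10 && &*p == &x;
}
```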


Comments and corrections welcome, of course!

-- John

---

Author: thp@cs.ucr.edu
Date: Mon, 20 Oct 2003 01:47:52 +0000 (UTC) Raw View

johnchx <johnchx2@yahoo.com> wrote:
+ do-not-spam-benh@bwsint.com (Ben Hutchings) wrote
+
+> Following the thread "Fancy pointers that behave like Java-style
+> reference?", here's my attempt to define what these things are and
+> how they are related.  They aren't actually that simple, but
+> hopefully they avoid using vague terms such as 'alias' and are also
+> correct.
+
+ A good idea indeed...so good I feel compelled to take a stab at it myself.  :-)
+
+
+ Expressions
+ -----------
+ -> every expression has a type
+
+ -> every expression is an lvalue or an rvalue (3.10/1)
+
+ -> the language provides rules which determine
+   whether an expression is an lvalue or an rvalue
+
+
+ Reference Type
+ --------------
+ -> an expression with reference type is interpreted as an
+   lvalue, regardless of the other rules which might apply
+
+ -> having reference type has no further effect on the
+   interpretation of an expression ( 5/6 )
+
+
+ References
+ ----------
+ -> the phrase "a reference" may refer to one of the
+   following:  a non-member name with reference type,
+   a member name with reference type, the value of a
+   function whose declared return type has reference
+   type, or a function parameter of reference type
+
+ -> it is possible to declare a non-member name as having
+   reference type; the definition of such a name shall
+   initialize it with an lvalue of compatible type
+
+ -> it is possible to declare a member name as having
+   reference type; such a member must be initialized
+   with an lvalue of compatible type
+
+ -> it is possible to declare a function as returning a
+   reference type; a function call expression calling
+   such a function is an lvalue; the function shall
+   be defined to return an lvalue
+
+ -> it is possible to declare a function parameter of
+   reference type; a function call expression calling
+   such a function must initialize such a parameter with
+   an lvalue of compatible type.

+ lvalues & rvalues
+ -----------------
+ -> the concepts of type an lvalue/rvalue-ness are
+   completely orthogonal, with two exceptions:
+
+   (a) it is impossible to have an rvalue of
+       function type and
+   (b) all expressions with reference type are
+       lvalues
+
+ -> put another way, the purpose of reference type
+   is to allow the programmer to use the type system
+   to affect the lvalue-ness of expressions
+
+ -> intuitively, an lvalue denotes a location, while
+   an rvalue denotes a value.
+
+ -> an lvalue-to-rvalue conversion is accomplished by
+   reading the value stored at the location denoted
+   by the lvalue (this is not allowed for lvalues
+   of function type)
+
+ -> an rvalue-to-lvalue "conversion" can in general
+   be accomplished only by copying the value
+   denoted by the rvalue into a known location,
+   which is thereafter denoted by the lvalue; this
+   is called "introducing a temporary"
+
+ -> it is possible to take the address of an lvalue;
+   it is not possible to take the address of an
+   rvalue
+
+
+ Pointers
+ --------
+ -> if p has the type pointer-to-T, the expression *p is
+   an lvalue of type T (5.3.1/1); it does not have
+   reference type
+
+
+ Comments and corrections welcome, of course!

 1) Nice work.

 2) You omitted initialization of const references via rvalues.

 3) I don't understand what you mean by "the concepts of type an
    lvalue/rvalue-ness are completely orthogonal ..."  By definition,
    they are complements of each other, in the sense that every
    expression is one or the other but never both.

 4) In the clause:

      intuitively, an lvalue denotes a location, while
      an rvalue denotes a value.

    I presume that "location" means roughly the same thing as
    "object".  It might be worthwhile mentioning that some
    lvalues, e.g., "*(int*)0" don't represent objects, while
    some rvalues denote objects, e.g., "f()" where f returns
    a value of a user-defined type.

 5) In the clause:

      an expression with reference type is interpreted as an
      lvalue, regardless of the other rules which might apply

    I'd drop "interpreted as" and add "denoting the referent
    of the corresponding reference.  It may undergo subsequent
    lvalue-to-rvalue conversion."

 6) The fact that all post-initialization occurrences of references
    denote the referent makes it impossible directly to reseat
    a reference and/or determine its location.  To prohibit doing
    either of those things via indirection:

      - Pointers, references, and arrays to/of references are not
        permitted.

      - Struct-or-class objects having reference members have
        no layout rules.

 7) By definition, sizeof(T&) denotes sizeof(T).

Tom Payne



---

Author: johnchx2@yahoo.com (johnchx)
Date: Mon, 20 Oct 2003 18:01:31 +0000 (UTC) Raw View

thp@cs.ucr.edu wrote
>
>  1) Nice work.
>

Thanks, and thanks for the comments as well.

>  2) You omitted initialization of const references via rvalues.
>

Well, they're in there if you know where to look.  :-)  They're the
"rvalue-to-lvalue 'conversion'" I referred to.  (At least that's one
way an rvalue-to-lvalue "conversion" can occur.)

I did leave out the special rule demanding that references initialized
this way be of const qualified type, and a number of other
reference-related rules (e.g. lifetimes of temporaries bound to
references).  I chose to do that only because I was trying to find a
clean way to present the idea of references -- and directly related
ideas like lvalue -- rather than a comprehensive treatment of the
related topics.  And I think the "const rule" has more to do with the
error-prone nature of the implicit conversions inherited from C than
with the essential ideas of references.  (IIRC, the "const rule"
didn't exist at all in the primordial soup of C-with-classes, at least
in the early days.)


>  3) I don't understand what you mean by "the concepts of type an
>     lvalue/rvalue-ness are completely orthogonal ..."  By definition,
>     they are complements of each other, in the sense that every
>     expression is one or the other but never both.
>

That should read "the concepts of type AND lvalue/rvalue-ness are
completely orthogonal ... "  That is, every expression (a) has a type
and (b) has an lvalue/rvalue-ness, and (a) is orthogonal to (b), with
certain exceptions.

I put it this way to set up the notion that the purpose of reference
type is to allow one of these dimensions (type) to affect the other
(lvalue/rvalue-ness).

BTW, there's one more "non-orthogonality": you can't have an rvalue of
array type.  I left that out originally.

>  4) In the clause:
>
>       intuitively, an lvalue denotes a location, while
>       an rvalue denotes a value.
>
>     I presume that "location" means roughly the same thing as
>     "object".  It might be worthwhile mentioning that some
>     lvalues, e.g., "*(int*)0" don't represent objects, while
>     some rvalues denote objects, e.g., "f()" where f returns
>     a value of a user-defined type.
>

I stand by my phrasing here -- in particular, location does *not* mean
the same as object, even roughly.  (While every object is a region of
storage, not every region of storage is an object.)  Lvalues may
denote functions, for example.

What I'm driving at here is the notion that there are two different
aspects of a referent -- its location and its value  -- which may be
denoted by an expression, and lvalues denote the first aspect while
rvalues denote the second.

I've found this to be a useful intuitive guide, in the sense that when
I think about expressions this way, I can accurately "guess" what the
standard will say, and I can explain things about the standard that
would otherwise be mysterious (e.g. when does the function - to -
pointer-to-function conversion occur?).
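
For instance, a function name is an lvalue denoting the function.  A
short sketch with invented names, showing where the conversion to a
pointer does and does not take place:

```cpp
#include <cassert>

int twice(int n) { return 2 * n; }

bool function_lvalue_demo() {
    int (&fref)(int) = twice;   // binds directly to the function lvalue
    int (*fptr)(int) = twice;   // function-to-pointer conversion applies here
    return fref(3) == 6 && fptr(3) == 6 && fptr == &twice;
}
```

The reference needs the location, so no conversion is applied; the
pointer needs a value, so the lvalue is converted.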


>  5) In the clause:
>
>       an expression with reference type is interpreted as an
>       lvalue, regardless of the other rules which might apply
>
>     I'd drop "interpreted as"

Well...I'm trying to draw attention to the idea that having reference
type changes the interpretation of an expression which, if it didn't
have reference type, would have been an rvalue.

>     and add "denoting the referent
>     of the corresponding reference.

All expressions denote their referents.  lvalue/rvalue-ness and having
or lacking reference type doesn't change this. (I know that the
suggested language is more or less a quotation from the standard...I
suppose that the import of this phrasing is that the implementation
is prohibited from, e.g., copying the referent and treating the
expression as denoting the copy.  So it probably does need to be in
the formal text, but I think it complicates the exposition.)

Another point to mention: I've tried to avoid mixing the notion of
"reference type" with the terms "a reference" or "the reference."

There are (at least) four distinct language constructs that we
commonly refer to as "a reference," and I think that some of the
difficulty that we've seen in defining "a reference" comes from trying
to find some phraseology that applies equally well to all
four constructs.

I think it is possible to give a single coherent account of the
meaning of reference type.  I don't think "a reference" has a single,
unified definition and meaning.

>     It may undergo subsequent
>     lvalue-to-rvalue conversion."
>

This just follows from the general rules for building up a full
expression from sub-expressions, I think.  In other words, it may
undergo any valid conversion; there's nothing special about the
lvalue-to-rvalue conversion in this context.


>  6) The fact that all post-initialization occurrences of references
>     denote the referent makes it impossible directly to reseat
>     a reference and/or determine its location.

I think this is a confusing notion that arises from thinking about
names declared with reference type as "pointer-like."  In fact,
there's nothing unusual at all about this.  Consider:

int main() {

  int i = 0;
  int* pi = &i;

}

After its definition, "i" is an lvalue which denotes an object of type
int.  How would you "reseat" it?  How would you take its address?  Not
the address of the denoted object, but of "i" itself?

"pi" is the same: after its definition, it is an lvalue denoting an
object of type int*.  You can change the value stored there, but you
can't cause "pi" to denote some other object.  What you *can* do is
"re-seat" the result of *pi.  But again, there's no magic.  Consider:

struct A {
  int* mp1;
  int* mp2;
  void reseat( bool b ) { mb = b; }
  int& foo() { return mb ? *mp1 : *mp2; }
  bool mb;
  A(int* p1, int* p2 ):
    mp1(p1), mp2(p2), mb (true) {}
};

int i, j;
A a (&i, &j);

If I change the value of a.mb, I can change the result of a.foo(), an
expression with reference type.  Have I "re-seated" a reference?  Not
in any meaningful sense.

What I'm getting at here is that the quality of being "un-reseatable"
is in no way peculiar to expressions with reference type.
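
To make that concrete, here is the struct again with a small driver.
The function foo_result_follows_flag() is an illustrative addition, and
the data members are grouped together only for readability:

```cpp
#include <cassert>

struct A {
    int* mp1;
    int* mp2;
    bool mb;
    A(int* p1, int* p2) : mp1(p1), mp2(p2), mb(true) {}
    void reseat(bool b) { mb = b; }
    int& foo() { return mb ? *mp1 : *mp2; }
};

// The object denoted by a.foo() depends on the stored flag, yet no
// reference was ever rebound: each call simply yields a fresh lvalue.
bool foo_result_follows_flag() {
    int i = 0, j = 0;
    A a(&i, &j);
    a.foo() = 1;            // mb is true: writes i
    a.reseat(false);
    a.foo() = 2;            // mb is false: writes j
    return i == 1 && j == 2;
}
```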

>     To prohibit doing
>     either of those things via indirection:
>
>       - Pointers, references, and arrays to/of references are not
>         permitted.

I think it's confusing to view these as special prohibitions.  They
simply follow from the definition of the compound types.  Pointers are
pointers to void or to objects or functions of a given type.
References are references to objects or functions of a given type.
Arrays are arrays of objects of a given type.

Saying that you can't have a pointer to "a reference" is, in my view,
a lot like saying that you can't have a pointer to a namespace -- the
idea is nonsensical.
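
A sketch of what happens when you try anyway, assuming a C++11
compiler for <type_traits> (the function name is hypothetical):

```cpp
#include <cassert>
#include <type_traits>

// Applying & to a name declared with reference type yields the address
// of the referent, with the ordinary type int*.
bool address_of_reference_is_address_of_referent() {
    int i = 0;
    int& r = i;
    int* p = &r;
    return p == &i;
}

// The library trait agrees: asking for "pointer to int&" strips the
// reference and produces int*, because no type int&* exists.
static_assert(std::is_same<std::add_pointer<int&>::type, int*>::value,
              "add_pointer<int&> yields int*");
```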

>
>       - Struct-or-class objects having reference members have
>         no layout rules.
>

Doesn't 9.2/12 apply?

>  7) By definition, sizeof(T&) denotes sizeof(T).

Yes, though I think the only reason that there's a special rule on
this is that adjusting the type of an expression from T& to T might be
considered a step in the evaluation of the expression, and sizeof is
guaranteed not to evaluate its operand.
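
Both halves of that can be checked with a short sketch.  Big, calls
and counted() are names invented here, and static_assert needs C++11:

```cpp
#include <cassert>
#include <cstddef>

struct Big { char bytes[64]; };

// sizeof applied to a reference type gives the size of the referent type.
static_assert(sizeof(Big&) == sizeof(Big),
              "sizeof(T&) is sizeof(T)");

int calls = 0;
Big& counted() { ++calls; static Big b; return b; }

// The operand of sizeof is not evaluated, so counted() is never called.
std::size_t size_without_evaluation() {
    return sizeof(counted());
}
```

After calling size_without_evaluation(), calls is still 0.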


Thanks again for the thought-provoking comments!

-- John

---

Author: qrczak@knm.org.pl ("Marcin 'Qrczak' Kowalczyk")
Date: Mon, 20 Oct 2003 22:38:57 +0000 (UTC) Raw View

On Sun, 19 Oct 2003 19:25:19 +0000, johnchx wrote:

> -> an expression with reference type is interpreted as an
>    lvalue, regardless of the other rules which might apply

What is the difference between T lvalue and T& lvalue, where T is any
non-reference type?

I mean: would anything change if we said that all lvalues have the
appropriate reference types? Then lvalueness and reference type would
always coincide, so we might start using only one term.

--
   __("<         Marcin Kowalczyk
   \__/       qrczak@knm.org.pl
    ^^     http://qrnik.knm.org.pl/~qrczak/

---

Author: dave@boost-consulting.com (David Abrahams)
Date: Mon, 20 Oct 2003 22:39:50 +0000 (UTC) Raw View

thp@cs.ucr.edu writes:

> It might be worthwhile mentioning that some
>     lvalues, e.g., "*(int*)0" don't represent objects

Are you sure?

    *(int*)0

is not even a valid expression.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

---

Author: do-not-spam-benh@bwsint.com (Ben Hutchings)
Date: Mon, 20 Oct 2003 22:40:08 +0000 (UTC) Raw View

johnchx wrote:
> do-not-spam-benh@bwsint.com (Ben Hutchings) wrote
>
>> Following the thread "Fancy pointers that behave like Java-style
>> reference?", here's my attempt to define what these things are and
>> how they are related.  They aren't actually that simple, but
>> hopefully they avoid using vague terms such as 'alias' and are also
>> correct.
>
> A good idea indeed...so good I feel compelled to take a stab at it
> myself.  :-)
<snip>

I was expecting some specific criticism rather than a complete
alternative.  I know there are errors even in my second article:

- I forgot to mention that references can't be bound to bitfields.
- I said that lvalues can have reference type, but this is not
  correct; expressions apparently having reference type are
  lvalues of the referred-to type.  Reference types in
  declarations are a way to require the use of lvalues rather
  than rvalues.
- I forgot to mention the other special case of lvalue-to-rvalue
  conversion: array-to-pointer.
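
The bit-field and array cases can be sketched like this (Flags and
the function names are made up for the example):

```cpp
#include <cassert>

struct Flags {
    unsigned low : 3;
};

// A non-const reference cannot bind to a bit-field at all.  A const
// reference compiles, but it is bound to a temporary copy of the value.
bool const_ref_to_bitfield_is_a_copy() {
    Flags f = {1};
    const unsigned& r = f.low;  // binds to a temporary holding 1
    f.low = 5;
    return r == 1u && f.low == 5u;
}

// Array-to-pointer conversion turns an array lvalue into a pointer
// rvalue that points at the first element.
bool array_decays_to_pointer_to_first_element() {
    int arr[3] = {1, 2, 3};
    int* p = arr;
    return p == &arr[0];
}
```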

What I was hoping was that it would be possible to come up with a
clearer explanation of these concepts and their relations than
there is in the current section 3.10, which would perhaps avoid
the differing (mis-)understandings apparent in the thread I
referred to (and others I have seen in comp.lang.c++.moderated).

As justification for this, I could point out some problems with
the current text:

Para 2: "An lvalue refers to an object or function."  That
seems to mean that an lvalue is a reference, but technically
"reference" means something subtly different.  It also has
the well-known problem of describing run-time behaviour
when lvalues are actually identified at translation time.

Para 2: "Some rvalue expressions...refer to objects."  This
just confuses the issue rather than drawing a clear
distinction between the two.

Para 5.  If lvalues and rvalues are expressions, then the
results of function calls cannot be lvalues or rvalues - only
the function call expressions can be.

Para 6: "An expression which holds a temporary object..."  How
can an expression "hold" an object?  Surely this should refer
to something like "A cast expression that potentially yields a
temporary object"?

---

Author: gdr@integrable-solutions.net (Gabriel Dos Reis)
Date: Mon, 20 Oct 2003 23:56:15 +0000 (UTC) Raw View

qrczak@knm.org.pl ("Marcin 'Qrczak' Kowalczyk") writes:

| On Sun, 19 Oct 2003 19:25:19 +0000, johnchx wrote:
|
| > -> an expression with reference type is interpreted as an
| >    lvalue, regardless of the other rules which might apply
|
| What is the difference between T lvalue and T& lvalue, where T is any
| non-reference type?
|
| I mean: would anything change if we said that all lvalues have the
| appropriate reference types?

Yes, I see a problem.  If you equate lvalue with reference type then
what would be "T" in the following?

   template<class T>
     void sink(T) { }

   int main()
   {
      int x = 9;
      int* p = &x;
      sink(*p);          // *p is an lvalue.  T = ?
   }

| Then lvalueness and reference type would
| always coincide, so we might start using only one term.

--
                                                       Gabriel Dos Reis
                                           gdr@integrable-solutions.net

---

Author: "Marcin 'Qrczak' Kowalczyk" <qrczak@knm.org.pl>
Date: Mon, 20 Oct 2003 19:50:41 CST Raw View

On Mon, 20 Oct 2003 23:56:15 +0000, Gabriel Dos Reis wrote:

> Yes, I see a problem.  If you equate lvalue with reference type then what
> would be "T" in the following?

int. Of course the language would work as today, only described
differently, so we would say that during template parameter deduction
if the parameter type is not a reference, then any toplevel reference of
the argument type is stripped.
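
What compilers actually deduce can be checked with a sketch like the
following, which assumes C++11 for std::is_same and uses invented
function names:

```cpp
#include <cassert>
#include <type_traits>

// By-value parameter: T is deduced as int for the lvalue argument *p.
template <class T>
bool sink_deduces_int(T) { return std::is_same<T, int>::value; }

// Reference parameter: T is again int, and the parameter type is int&.
template <class T>
bool ref_sink_deduces_int(T&) { return std::is_same<T, int>::value; }

bool deduction_demo() {
    int x = 9;
    int* p = &x;
    return sink_deduces_int(*p) && ref_sink_deduces_int(*p);
}
```

deduction_demo() returns true: T is int in both cases.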

In fact the description in the draft seems to assume that the type of the
argument can be a reference. It doesn't say that a reference parameter
type must be matched with an lvalue argument, it says that if both are
of the form T& then T can be deduced.

If it didn't change in the final standard, I'm afraid it's wrong. Either
argument types which would be references are turned into lvalues (and then
it's not possible to match T& parameter) or they remain references (and
then T parameter binds T to a reference type if the argument is a reference).

--
   __("<         Marcin Kowalczyk
   \__/       qrczak@knm.org.pl
    ^^     http://qrnik.knm.org.pl/~qrczak/

---

Author: Gabriel Dos Reis <gdr@integrable-solutions.net>
Date: Mon, 20 Oct 2003 21:57:18 CST Raw View

dave@boost-consulting.com (David Abrahams) writes:

| thp@cs.ucr.edu writes:
|
| > It might be worthwhile mentioning that some
| >     lvalues, e.g., "*(int*)0" don't represent objects
|
| Are you sure?
|
|     *(int*)0
|
| is not even a valid expression.

Consider  sizeof(*(int*)0).
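
Spelled out as a sketch (static_assert needs C++11; the function name
is hypothetical):

```cpp
#include <cassert>
#include <cstddef>

// The operand of sizeof is never evaluated, so nothing is dereferenced;
// only the type of the lvalue *(int*)0 matters.
static_assert(sizeof(*(int*)0) == sizeof(int),
              "the operand is an unevaluated lvalue of type int");

std::size_t size_of_null_deref() { return sizeof(*(int*)0); }
```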

--
                                                       Gabriel Dos Reis
                                           gdr@integrable-solutions.net

---

Author: johnchx2@yahoo.com (johnchx)
Date: Tue, 21 Oct 2003 18:16:16 +0000 (UTC) Raw View

do-not-spam-benh@bwsint.com (Ben Hutchings) wrote
> johnchx wrote:
> > A good idea indeed...so good I feel compelled to take a stab at it
> > myself.  :-)
> <snip>
>
> I was expecting some specific criticism rather than a complete
> alternative.

I know...but starting with a clean sheet is sometimes just too
tempting to resist.  ;-)

[snip]

> What I was hoping was that it would be possible to come up with a
> clearer explanation of these concepts and their relations than
> there is in the current section 3.10, which would perhaps avoid
> the differing (mis-)understandings apparent in the thread I
> referred to (and others I have seen in comp.lang.c++.moderated).
>
> As justification for this, I could point out some problems with
> the current text:
>
> Para 2: "An lvalue refers to an object or function."  That
> seems to mean that an lvalue is a reference,

Reference types must be reference to an object type or reference to a
function type, and lvalues denote objects or functions.  This isn't
just a coincidence, but it also doesn't imply that every lvalue has
reference type.

> but technically
> "reference" means something subtly different.

Yes, in particular it is a (compound) type, not an expression
category.  (lvalue and rvalue are *not* types, of course.)

> It also has
> the well-known problem of describing run-time behaviour
> when lvalues are actually identified at translation time.
>
> Para 2: "Some rvalue expressions...refer to objects."  This
> just confuses the issue rather than drawing a clear
> distinction between the two.
>

I don't think it's meant to draw a distinction.  Paragraphs 4, 5 and 6
(and parts of Clause 5) provide the rules for distinguishing lvalues
from rvalues.  Paragraph 2 is just telling us to what lvalues and
rvalues may refer.  In particular, lvalues always denote either
objects or functions, and never anything else.  Some rvalues denote
objects; others don't.

The standard is strangely silent at this point about what rvalues which
do not denote objects actually denote.  The answer, I think, boils
down to non-object values, which in turn means values that only exist
in a register.  Since these values do not occupy a region of storage,
they are not objects (despite having an object type, such as int or
double).  The classic example is the return value from a function
which returns a built-in type by value.

> Para 5.  If lvalues and rvalues are expressions, then the
> results of function calls cannot be lvalues or rvalues - only
> the function call expressions can be.
>

Yes, the wording here could be improved.  Or dropped -- I think 5/6
may convey the necessary information.

> Para 6: "An expression which holds a temporary object..."  How
> can an expression "hold" an object?  Surely this should refer
> to something like "A cast expression that potentially yields a
> temporary object"?

Yes, the standard is mostly pretty consistent about saying that
expressions "denote" or "refer to" or "designate" objects, so "holds"
seems a strange thing to say here.  Paragraph 6 may actually be
redundant with 5.4/1.

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]

Author: thp@cs.ucr.edu
Date: Tue, 21 Oct 2003 18:45:24 +0000 (UTC) Raw View

johnchx <johnchx2@yahoo.com> wrote:
+ thp@cs.ucr.edu wrote
+>
+>  1) Nice work.
+>
+
+ Thanks, and thanks for the comments as well.
+
+>  2) You omitted initialization of const references via rvalues.
+>
+
+ Well, they're in there if you know where to look.  :-)  They're the
+ "rvalue-to-lvalue 'conversion'" I referred to.  (At least that's one
+ way an rvalue-to-lvalue "conversion" can occur.)

But you didn't indicate that it could happen automatically, and AFAIK
it doesn't happen automatically in any other context.

[...]
+>  3) I don't understand what you mean by "the concepts of type an
+>     lvalue/rvalue-ness are completely orthogonal ..."  By definition,
+>     they are complements of each other, in the sense that every
+>     expression is one or the other but never both.
+>
+
+ That should read "the concepts of type AND lvalue/rvalue-ness are
+ completely orthogonal ... "  That is, every expression (a) has a type
+ and (b) has an lvalue/rvalue-ness, and (a) is orthogonal to (b), with
+ certain exceptions.

Ah, so.

+ I put it this way to set up the notion that the purpose of reference
+ type is to allow one of these dimensions (type) to affect the other
+ (lvalue/rvalue-ness).

But, in that case, they're not "orthogonal".

[...]
+ What I'm driving at here is the notion that there are two different
+ aspects of a referent -- its location and its value  -- which may be
+ denoted by an expression, and lvalues denote the first aspect while
+ rvalues denote the second.

In some cases an rvalue denotes an object, which in general is
somewhat different from that object's value.

+ I've found this to be a useful intuitive guide, in the sense that when
+ I think about expressions this way, I can accurately "guess" what the
+ standard will say, and I can explain things about the standard that
+ would otherwise be mysterious (e.g. when does the function - to -
+ pointer-to-function conversion occur?).

That's the best that any paradigm can accomplish.

+>  5) In the clause:
+>
+>       an expression with reference type is interpreted as an
+>       lvalue, regardless of the other rules which might apply
+>
+>     I'd drop "interpreted as"
+
+ Well...I'm trying to draw attention to the idea that having reference
+ type changes the interpretation of an expression which, if it didn't
+ have reference type, would have been an rvalue.

All lvalues are subject to lvalue-to-rvalue conversions, whether or
not they are lvalues by virtue of having reference type.  (But you
knew that, so perhaps you had something else in mind.)

+>     and add "denoting the referent
+>     of the corresponding reference.
+
+ All expressions denote their referents.

That's certainly the case in say Algol68, but in C++ many people have
serious objection to saying that the expression "3" *refers* to the
integer three.

+ lvalue/rvalue-ness and having
+ or lacking reference type doesn't change this. (I know that the
+ suggested language is more or less a quotation from the standard...I
+ suppose that the import of this phrasing is that the implementation
+ is prohibited from, e.g., copying the referent and treating the
+ expression as denoting the copy.  So it probably does need to be in
+ the formal text, but I think it complicates the exposition.)

AFAIK, the exposition in the standard normally mentions specifically
what entity each sort of expression denotes.

+ Another point to mention: I've tried to avoid mixing notion of
+ "reference type" with the terms "a reference" or "the reference."

One can do that by distributing the duty of specifying what object
each sort of expression of reference type denotes one level down, to
the clauses specifying the semantics of each operator whose return
value has reference type.  But things get more tedious.

+ There are (at least) four distinct language constructs that we
+ commonly refer to as "a reference," and I think that some of the
+ difficulty that we've seen in defining "a reference" comes from trying
+ to find some phraseology that applies equally well to all
+ four constructs.
+
+ I think it is possible to give a single coherent account of the
+ meaning of reference type.  I don't think "a reference" has a single,
+ unified definition and meaning.

To fully characterize any language's notion of reference requires more
than one clause.  Attempts like "references are aliases" degenerate
into humpty-dumpty games on the term "alias".  Analogies such as
"references behave like implicitly dereferenced pointers" require a
number of exception clauses.

+>     It may undergo subsequent
+>     lvalue-to-rvalue conversion."
+>
+
+ This just follows from the general rules for building up a full
+ expression from sub-expressions, I think.  In other words, it may
+ undergo any valid conversion; there's nothing special about the
+ lvalue-to-rvalue conversion in this context.

Of course.  The point is to warn newbie readers not to be astonished
when, via two stage implicit conversion (but don't call it that), an
expression of type T& becomes an rvalue of type T.

+>  6) The fact that all post-initialization occurrences of references
+>     denote the referent makes it impossible directly to reseat
+>     a reference and/or determine its location.
+
+ I think this is a confusing notion that arises from thinking about
+ names declared with reference type as "pointer-like."  In fact,
+ there's nothing unusual at all about this.  Consider:
+
+ int main() {
+
+  int i = 0;
+  int* pi = &i;
+
+ }
+
+ After its definition, "i" is an lvalue which denotes an object of type
+ int.  How would you "reseat" it?

The identifier i is statically bound to the corresponding offset
(location) in certain data regions.  That binding is not subject to
dynamic rebinding.  Most references are dynamically bound to objects.

In most languages references are subject to dynamic rebinding, but in
C++ such rebinding has consciously been precluded because of the
designer's negative experiences with reference rebinding in Algol68.
That preclusion seems noteworthy, not just to me but to the designer
of the language, who has noted it in several places in his writing on
the matter.

+ How would you take its address?  Not
+ the address of the denoted object, but of "i" itself?
+ "pi" is the same: after its definition, it is an lvalue denoting an
+ object of type int*.  You can change the value stored there, but you
+ can't cause "pi" to denote some other object.

Of course not.  The identifier "pi" exists only at compile time and
gets bound at compile time.  At run time there is nothing left of "pi"
to bind or rebind.  By contrast, most references get dynamically
bound, and some have no identifiers associated with them.  References
and identifiers are quite different notions.

+ What you *can* do is
+ "re-seat" the result of *pi.  But again, there's no magic.  Consider:
+
+ struct A {
+  int* mp1;
+  int* mp2;
+  void reseat( bool b ) { mb = b; }
+  int& foo() { return mb ? *mp1 : * mp2 ; }
+  bool mb;
+  A(int* p1, int* p2 ):
+    mp1(p1), mp2(p2), mb (true) {}
+ };
+
+ int i, j;
+ A a (&i, &j);
+
+ If I change the value of a.mb, I can change the result of a.foo(), an
+ expression with reference type.  Have I "re-seated" a reference?  Not
+ in any meaningful sense.
+
+ What I'm getting at here is that the quality of being "un-reseatable"
+ is in no way peculiar to expressions with reference type.

The fact that we can't at run time change the binding of an
identifier, in no way makes the non-reseatability of references
*expected behavior*.  Rather, the non-reseatability of references is a
phenomenon that is peculiar to C++ and the result of a somewhat
arbitrary decision by the language's designer.  The more appropriate
analogy would be a const object, which gets bound at run-time but
cannot be rebound.
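
The non-reseatability discussed here can be seen in a few lines. This is only a minimal sketch: the function names are made up for illustration and appear nowhere in the thread. The first function shows that assigning to a reference writes to its referent; the second shows the const-pointer analogue, which is also bound at run time and cannot be rebound.

```cpp
#include <cassert>

// Assigning to a reference stores into the referent; it does not rebind.
int write_through_reference() {
    int a = 1;
    int b = 2;
    int& r = a;        // r is bound to a at initialization
    r = b;             // copies b's value into a
    assert(&r == &a);  // r still denotes a, and &r yields a's address
    return a;
}

// The const-pointer analogue: bound once at run time, never rebound.
int write_through_const_pointer() {
    int a = 1;
    int b = 2;
    int* const p = &a; // "p = &b;" would be ill-formed
    *p = b;            // explicit dereference does what "r = b" did above
    assert(p == &a);
    return a;
}
```

Both functions return the updated value of a, so a caller can confirm that the write reached the original object in each case.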

+>     To prohibit doing
+>     either of those things via indirection:
+>
+>       - Pointers, reference, and arrays to/of references are not
+>         permitted.
+
+ I think it's confusing to view these as special prohibitions.  They
+ simply follow from the definition of the compound types.  Pointers are
+ pointers to void or to objects or functions of a given type.
+ References are references to objects or functions of a given type.
+ Arrays are arrays of objects of a given type.
+
+ Saying that you can't have a pointer to "a reference" is, in my view,
+ a lot like saying that you can't have a pointer to a namespace -- the
+ idea is non-sensical.

It is something that has been prohibited by design.  References are an
idea imported into C++ from other languages and deliberately crippled.
All I'm saying is that the results of that crippling are noteworthy,
and the designer himself finds them noteworthy.  He seems not to
believe that they go without saying.

+>       - Struct-or-class objects having reference members have
+>         no layout rules.
+>
+
+ Doesn't 9.2/12 apply?

Oops!  Apparently I have a misimpression here.  The appropriate claim
would be that:

  There is no guarantee that a pointer to a struct-or-class object
  having a reference member points to the object's first member.

The significance of this is that replacing post-initialization
occurrences of references by similarly dereferenced pointers can cause
such an object to become POD and subject to more rigid layout rules.
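
The layout point can be checked mechanically with a later compiler. The sketch below assumes C++11 or newer, where the layout half of the POD notion is called "standard-layout" and is exposed through `std::is_standard_layout`; the thread itself predates that trait. The two struct names are hypothetical.

```cpp
#include <type_traits>

struct WithReference { int& r; };  // hypothetical: holds a reference member
struct WithPointer   { int* p; };  // the same member rewritten as a pointer

// A reference member takes the class out of the standard-layout category,
// so the first-member address guarantee no longer applies to it.
static_assert(!std::is_standard_layout<WithReference>::value,
              "a class with a reference member is not standard-layout");

// With a pointer member the stricter layout rules apply again.
static_assert(std::is_standard_layout<WithPointer>::value,
              "a class with only a pointer member is standard-layout");
```

Nothing here says how a given implementation actually lays out `WithReference`; it only shows that the guarantee is withdrawn.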

+>  7) By definition, sizeof(T&) denotes sizeof(T).
+
+ Yes, though I think the only reason that there's a special rule on
+ this is that adjusting the type of an expression from T& to T might be
+ considered a step in the evaluation of the expression, and sizeof is
+ guaranteed not to evaluate its operand.

But T& is not an expression and is not subject to evaluation.
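
Both readings of point 7 can be checked at compile time: `sizeof` applied to the type `T&`, and `sizeof` applied to an expression of reference type. A minimal sketch, assuming a C++11 compiler for `static_assert`; `Big` and `get_big` are hypothetical names.

```cpp
// Hypothetical type, large enough to tell apart from a pointer by eye.
struct Big { char bytes[64]; };

// sizeof applied to a reference type gives the size of the referenced type.
static_assert(sizeof(Big&) == sizeof(Big), "sizeof(T&) is sizeof(T)");
static_assert(sizeof(int&) == sizeof(int), "sizeof(T&) is sizeof(T)");

// Declared only: sizeof does not evaluate its operand, so no definition
// of get_big is needed.
Big& get_big();
static_assert(sizeof(get_big()) == sizeof(Big),
              "an expression of reference type has the referent's size");
```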

+ Thanks again for the thought-provoking comments!

Any time.

Tom Payne

Author: dave@boost-consulting.com (David Abrahams)
Date: Tue, 21 Oct 2003 18:45:57 +0000 (UTC) Raw View

Gabriel Dos Reis <gdr@integrable-solutions.net> writes:

> dave@boost-consulting.com (David Abrahams) writes:
>
> | thp@cs.ucr.edu writes:
> |
> | > It might be worthwhile mentioning that some
> | >     lvalues, e.g., "*(int*)0" don't represent objects
> |
> | Are you sure?
> |
> |     *(int*)0
> |
> | is not even a valid expression.
>
> Consider  sizeof(*(int*)0).

OK, but *(int*)0 isn't evaluated.  I should have said it's undefined
behavior to evaluate it.  I don't see how it can be defined to be an
lvalue and produce undefined behavior at the same time.
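
One way to see that lvalue-ness is decided at translation time is to ask the compiler about the expression without evaluating it. The sketch below assumes a C++11 compiler (for `decltype` and `<type_traits>`); the typedef name is hypothetical. It does not settle whether evaluating the expression is defined, only how the expression is classified.

```cpp
#include <type_traits>

// sizeof and decltype do not evaluate their operands, so the null pointer
// is never actually dereferenced here.
static_assert(sizeof(*(int*)0) == sizeof(int),
              "the unevaluated expression has type int");

// decltype((e)) is T& exactly when e is an lvalue of type T.
typedef decltype((*(int*)0)) deref_type;
static_assert(std::is_same<deref_type, int&>::value,
              "*(int*)0 is classified as an lvalue of type int");
```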

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

Author: thp@cs.ucr.edu
Date: Fri, 24 Oct 2003 19:47:31 +0000 (UTC) Raw View

Gabriel Dos Reis <gdr@integrable-solutions.net> wrote:
[...]
+ There can be expressions which are well-formed from a static analysis
+ point of view, but which are not defined at runtime. That is part
+ of the reason why we use C++ :-)

Agreed, but AFAIK "*(int*)0" poses the same problem in pure C.

Tom Payne

Author: do-not-spam-benh@bwsint.com (Ben Hutchings)
Date: Fri, 24 Oct 2003 19:47:56 +0000 (UTC) Raw View

In article <4fb4137d.0310221029.6a8261e2@posting.google.com>,
johnchx wrote:
> thp@cs.ucr.edu wrote in message
>> johnchx <johnchx2@yahoo.com> wrote:
<snip>
> On the other hand, is it simpler to say:
>
>    In C++, some identifiers are statically bound and some are
>    dynamically bound.  You might think that you can re-bind
>    dynamically bound identifiers, but you can't.
>
> Or:
>
>   In C++, you can't rebind identifiers.

That's nice and simple, though I'm not sure whether "identifier" is
exactly the right term.

<snip>
>> In most languages references are subject to dynamic rebinding,
>
> Aha!  That may be a key difference in perspective.  I don't work with
> other languages that have constructs called references, so my "default
> assumptions" have been formed by how they work in C++, not by the
> approaches taken in other languages.
>
> I have worked with languages that follow the "everything is secretly
> an automatically-dereferenced-reference-counted-shared-pointer-that-
> may-or-may-not-be-copy-on-write" paradigm (VB, Object Pascal).  I
> hate them. ;-)

If it stays secret, I don't see the problem.  If the implementation
doesn't work quite right and sometimes causes weird behaviour, I would
be concerned.  So far as I could tell, VB 6's treatment of references
is slightly odd but predictable.

> I gather that Java and Python also more or less fit this paradigm.
> But I don't know algol or simula or eiffel, and perhaps if I did, I'd
> be more inclined to see "un-reseatable" as a special case.
<snip>

Here's a comparison of how C++ and the other languages handle
references - omitting Object Pascal as I have no experience with it
and, unlike those last three languages, I didn't think it was worth
looking into:

Language        C++     VB 6     Java   Python   Algol   Simula  Eiffel
                                                 68      67

Fundamental     yes     yes      yes    no       yes     yes     yes
values?[1]

Fundamental     yes     yes      no     yes      yes     no      yes
references?[2]                                   [8]

Other values?   yes     no       no     no       yes     no      yes

Other           yes     yes      yes    yes      yes     yes     yes[9]
references?                                      [8]

Value           =       Let =    =      N/A      :=      :=      :=
assignment

Value equality  ==,     =,       ==,    ==,      =,      =,      =,
comparison[3]   !=      <>       !=     !=, <>   /=      <>      /=

Reference       N/A     Set =    =      =        :=      :-      :=
rebinding

Null reference  N/A     Nothing  null   None[7]  NIL     None    Void

Reference       none    Is       ==,    is,      :=:,    ==,     =,
comparison      [5]              !=     is not   :/:     =/=     /=

Other           as for  .        ., []  many     as for  .       as for
reference       values           [6]             values          values?
operations[4]

[1] Does the language permit manipulation of values (not references
    to values) of fundamental types (arithmetic types and maybe a few
    others)?
[2] Does the language permit manipulation of references to values of
    fundamental types?
[3] The operators used to compare values, possibly through
    references.  Where these are the same as the operators used to
    compare references, there is no direct way to compare values
    through references to them.
[4] Other operations defined on references.
[5] Except by combination of & and == operators.
[6] Class types support "."; array types "[]".  The java.lang.String
    type also supports the "+" operator.
[7] This is actually a built-in name that refers to a dummy object.
[8] Note that they are called "names".
[9] Except for "expanded class" types.

There are probably some errors in this table.

Clearly C++ is the odd one out in not permitting rebinding.  Yet
there is considerable variation between these languages in their
treatment of references, so it would not be accurate to consider
C++ references wholly unusual.

Author: belvis@pacbell.net (Bob Bell)
Date: Sat, 25 Oct 2003 20:16:33 +0000 (UTC) Raw View

do-not-spam-benh@bwsint.com (Ben Hutchings) wrote in message news:<slrnbpg2pg.1q8.do-not-spam-benh@tin.bwsint.com>...
> Clearly C++ is the odd one out in not permitting rebinding.

Perhaps that's because C++ provides pointers. How many of these
languages also provide a type like a C++ pointer, which does allow
"rebinding"?

> Yet
> there is considerable variation between these languages in their
> treatment of references, so it would not be accurate to consider
> C++ references wholly unusual.

One central question that seems to get overlooked in these discussions
about C++'s references vs. other languages' references is "what's the
problem for C++?" It's not enough to point out that C++ is different;
why are C++'s reference semantics a problem for C++ programmers? I've
never cared that references couldn't be reseated or can't be NULL,
because when I want that kind of behavior I use a pointer.

Bob

Author: thp@cs.ucr.edu
Date: Mon, 27 Oct 2003 01:44:09 +0000 (UTC) Raw View

johnchx <johnchx2@yahoo.com> wrote:
+ thp@cs.ucr.edu wrote in message
+> johnchx <johnchx2@yahoo.com> wrote:
+> + thp@cs.ucr.edu wrote
+> +>  2) You omitted initialization of const references via rvalues.
+> +>
+> +
+> + Well, they're in there if you know where to look.  :-)  They're the
+> + "rvalue-to-lvalue 'conversion'" I referred to.  (At least that's one
+> + way an rvalue-to-lvalue "conversion" can occur.)
+>
+> But you didn't indicate that it could happen automatically, and AFAIK
+> it doesn't happen automatically in any other context.
+
+ Hmmm...I'm not sure I understand what could "happen automatically."
+ Could you elaborate?

Consider:

  const int& r1 = 3;  // automatic rvalue-to-lvalue conversion happens.
  int& r2 = 3;        // no automatic rvalue-to-lvalue conversion happens.
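
The difference between these two declarations can also be tested at compile time. A minimal sketch, assuming a C++11 or later compiler (the thread predates `<type_traits>`); `binds_to_rvalue` is a hypothetical helper name, not a standard facility.

```cpp
#include <type_traits>

// Hypothetical helper: true when a reference of type Ref can be
// initialized from an rvalue of type T.
template <typename Ref, typename T>
struct binds_to_rvalue : std::is_constructible<Ref, T&&> {};

// "const int& r1 = 3;" is well-formed: a const reference accepts an rvalue.
static_assert(binds_to_rvalue<const int&, int>::value,
              "a const reference binds to an rvalue");

// "int& r2 = 3;" is ill-formed: a non-const lvalue reference does not.
static_assert(!binds_to_rvalue<int&, int>::value,
              "a non-const lvalue reference does not bind to an rvalue");
```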

+> + I put it this way to set up the notion that the purpose of reference
+> + type is to allow one of these dimensions (type) to affect the other
+> + (lvalue/rvalue-ness).
+>
+> But, in that case, they're not "orthogonal".
+
+ Yes...that is why I said "with two exceptions."
+
+ And in a way that's the point (or one of them, anyhow): reference type
+ is an exception to an otherwise clear rule.  And being an exception to
+ that rule is its entire purpose.

Perhaps I'm quibbling about the meaning of "orthogonal".  To me it
means "independent" as in the case of orthogonal vectors.  Vectors
that point in opposite directions are as non-orthogonal as they come.
I see "rvalue" and "lvalue" as opposite notions: an expression is
an rvalue if and only if it isn't an lvalue.

+> [...]
+> + What I'm driving at here is the notion that there are two different
+> + aspects of a referent -- its location and its value  -- which may be
+> + denoted by an expression, and lvalues denote the first aspect while
+> + rvalues denote the second.
+>
+> In some cases an rvalue denotes an object, which in general is
+> somewhat different from that object's value.
+>
+
+ I don't think so, but I may simply not understand your point.
+
+ I'm claiming that rvalues and lvalues denoting objects denote their
+ referents differently.  An rvalue denotes the value of the object; an
+ lvalue denotes the location of the object.

I'm not going to say that there's no difference, but one can via an
rvalue mutate an object (i.e., change its value), e.g., f().flip(),
where f returns by value an object of a user-defined type having a
mutating member function, flip().
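
A minimal sketch of the f().flip() case, with a made-up class (`Toggle`) and helper (`flip_temporary`) standing in for the user-defined type the post has in mind:

```cpp
#include <cassert>

// Hypothetical user-defined type with a mutating member function.
struct Toggle {
    bool on;
    Toggle() : on(false) {}
    // Changes the object's value and reports the new state.
    bool flip() { on = !on; return on; }
};

// Returns by value, so the call expression f() is an rvalue.
Toggle f() { return Toggle(); }

// Calls the mutating member directly on the rvalue f().
bool flip_temporary() {
    bool state = f().flip();
    assert(state);  // the temporary started out off, so it is now on
    return state;
}
```

The object denoted by f() is modified even though f() is not an lvalue, which is the distinction being drawn between an rvalue denoting an object and denoting only that object's value.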

[...]
+> +>  5) In the clause:
+> +>
+> +>       an expression with reference type is interpreted as an
+> +>       lvalue, regardless of the other rules which might apply
+> +>
+> +>     I'd drop "interpreted as"
+> +
+> + Well...I'm trying to draw attention to the idea that having reference
+> + type changes the interpretation of an expression which, if it didn't
+> + have reference type, would have been an rvalue.
+>
+> All lvalues are subject to lvalue-to-rvalue conversions, whether or
+> not they are lvalues by virtue of having reference type.  (But you
+> knew that, so perhaps you had something else in mind.)
+
+ I'm claiming that if an expression would have been an lvalue,
+ regardless of whether it has reference type, then having reference
+ type does not affect the meaning of the expression.  If, on the other
+ hand, the expression would be an rvalue if it doesn't have reference
+ type, then having reference type has an effect -- namely causing the
+ expression to be interpreted as an lvalue.
+
+ Example:
+
+  int  foo();
+  int& bar();
+  int  j;
+  extern int& k;
+
+  int main() {
+    j;
+    k;
+    foo();
+    bar();
+  }
+
+ In main(), the expression "j" is an lvalue.  The expression "k" is
+ also an lvalue, whose type is adjusted to int.  That k is declared
+ with reference type doesn't affect the interpretation of the
+ expression "k".
+
+ On the other hand, the expression "bar()" is an lvalue *because* it
+ has reference type.  The expression "foo()," which doesn't have
+ reference type, is an rvalue.
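
The classification of these four expressions can be confirmed by the compiler. The sketch below assumes C++11 or later, where decltype((e)) is T& when e is an lvalue of type T and plain T when e is a prvalue; the thread predates decltype, so this is a later tool applied to the same example.

```cpp
#include <type_traits>

int  foo();      // declared only; never called
int& bar();      // declared only; never called
int  j;
extern int& k;   // declared only; never odr-used

static_assert(std::is_same<decltype((j)),     int&>::value, "j is an lvalue");
static_assert(std::is_same<decltype((k)),     int&>::value, "k is an lvalue");
static_assert(std::is_same<decltype((bar())), int&>::value, "bar() is an lvalue");
static_assert(std::is_same<decltype((foo())), int >::value, "foo() is an rvalue");
```

Note that j and k come out identical here, which matches the claim that declaring k with reference type does not change how the expression "k" is interpreted.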

Hmmmmm.  Consider:

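   int i;
   class Widget {
   public:
      int& k;
      Widget() : k(i) {}   // bind the member reference k to i
   } w1, w2;
   // Parentheses are needed: << binds tighter than == and ?:
   cout << (&w1.k == &w2.k ? "yes" : "no") << endl;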

+> +>     and add "denoting the referent
+> +>     of the corresponding reference.
+> +
+> + All expressions denote their referents.
+>
+> That's certainly the case in say Algol68, but in C++ many people have
+> serious objection to saying that the expression "3" *refers* to the
+> integer three.
+>
+
+ I've followed a little of this, but I may not fully understand the
+ issue.  Some of the objections I've seen seem to arise from the
+ misconception that rvalues always denote objects, which isn't so.
+
+ I think it's perfectly correct to say that the literal 3 is an rvalue
+ which denotes the value 3.

No one is quibbling about "denote".  The arguments involve "refer",
"referent", and "reference".

+> + lvalue/rvalue-ness and having
+> + or lacking reference type doesn't change this. (I know that the
+> + suggested language is more or less a quotation from the standard...I
+> + suppose that the import of this phrasing is that the implementation
+> + is prohibited from, e.g., copying the referent and treating the
+> + expression as denoting the copy.  So it probably does need to be in
+> + the formal text, but I think it complicates the exposition.)
+>
+> AFAIK, the exposition in the standard normally mentions specifically
+> what entity each sort of expression denotes.
+>
+
+ My only real objection to the language is that it seems to imply that
+ something tricky is going on, when in fact, something trivial is going
+ on.  In my opinion, it would be a lot clearer to say:
+
+   The referent of the expression is unchanged.
+
+ than to say
+
+  ...the expression designates the object or function denoted by the
+ reference.
+
+ Which raises questions like:
+
+   What's the difference between "designating" and "denoting"?
+
+ and
+
+   What "reference?"  How did we go from talking about an
+   expression with reference type to "the reference?"

AFAIK, the normal terminology is to say that functions whose return
type is T return entities of type T.  In particular, functions of
return type T& are said to return entities of type reference-to-T.
For example, we say that int& f(){return i;} returns a reference (to i).
We don't say that f returns an int, because f's return type isn't int.
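
The terminology can be tied to what the compiler reports. A minimal sketch, assuming C++11 for `decltype`; g and `assign_through_f` are hypothetical additions alongside the f from the paragraph above.

```cpp
#include <cassert>
#include <type_traits>

int i = 0;

int& f() { return i; }  // returns a reference to i
int  g() { return i; }  // returns a copy of i's value

static_assert(std::is_same<decltype(f()), int&>::value, "f returns int&");
static_assert(std::is_same<decltype(g()), int>::value,  "g returns int");

// Assigning to the call expression f() stores into i itself.
int assign_through_f(int value) {
    f() = value;
    assert(&f() == &i);  // the call denotes i, not a copy of it
    return i;
}
```

An assignment such as "g() = value" would be rejected, since g() is an rvalue of type int.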

+> +>  6) The fact that all post-initialization occurrences of references
+> +>     denote the referent makes it impossible directly to reseat
+> +>     a reference and/or determine its location.
+> +
+> + I think this is a confusing notion that arises from thinking about
+> + names declared with reference type as "pointer-like."  In fact,
+> + there's nothing unusual at all about this.  Consider:
+> +
+> + int main() {
+> +
+> +  int i = 0;
+> +  int* pi = &i;
+> +
+> + }
+> +
+> + After its definition, "i" is an lvalue which denotes an object of type
+> + int.  How would you "reseat" it?
+>
+> The identifier i is statically bound to the corresponding offset
+> (location) in certain data regions.  That binding is not subject to
+> dynamic rebinding.  Most references are dynamically bound to objects.
+>
+
+ Yes, I understand what you're saying here.
+
+ On the other hand, is it simpler to say:
+
+   In C++, some identifiers are statically bound and some are
+   dynamically bound.  You might think that you can re-bind
+   dynamically bound identifiers, but you can't.
+
+ Or:
+
+  In C++, you can't rebind identifiers.
+
+ In other words, in C++, names declared with reference type behave just
+ like all the other names in this respect -- why introduce a
+ distinction only to explain that it doesn't actually apply. ;-)

To dispel the all too common misconception that references are
names.

+> In most languages references are subject to dynamic rebinding,
+
+ Aha!  That may be a key difference in perspective.  I don't work with
+ other languages that have constructs called references, so my "default
+ assumptions" have been formed by how they work in C++, not by the
+ approaches taken in other languages.
+
+ I have worked with languages that follow the "everything is secretly
+ an automatically-dereferenced-reference-counted-shared-pointer-that-
+ may-or-may-not-be-copy-on-write" paradigm (VB, Object Pascal).  I
+ hate them. ;-)
+
+ I gather that Java and Python also more or less fit this paradigm.
+ But I don't know algol or simula or eiffel, and perhaps if I did, I'd
+ be more inclined to see "un-reseatable" as a special case.
+
+
+> The fact that we can't at run time change the binding of an
+> identifier, in no way makes the non-reseatability of references
+> *expected behavior*.
+
+ Well, expectations vary.  Sometimes having a limited basis for
+ comparison can be an advantage.  ;-)

C++ references can be bound at run time.  In C++, the default behavior
is that entities that can be bound at run time can be rebound at
run-time.  If we want to prevent rebinding, we must put "const" in
front of their declaration.

There is the fact that references redirect all direct attempts to
modify their binding to their referent.  So there is good reason not
to expect direct attempts at reseating to work.  Steps had to be taken
to prevent reseating via indirection: no references, pointers, arrays
to/of references and relaxed layout rules for struct-or-class objects
having reference members.

+> It is something that has been prohibited by design.  References are an
+> idea imported into C++ from other language and deliberately crippled.
+
+ I understand.  Wishful thinking on my part, perhaps, about the
+ assumptions people would bring with them to C++.

The point is that arbitrary decisions need to be mentioned -- they
don't go without saying.


+>   There is no guarantee that a pointer to a struct-or-class object
+>   having a reference member points to the object's first member.
+>
+> The significance of this is that replacing post-initialization
+> occurrences of references by similarly dereferenced pointers can cause
+> such an object to become POD and subject to more rigid layout rules.
+
+ I'm not sure I understand this.  What replacement do you have in mind?

Such objects are an exception to the notion that references behave
like "implicitly dereferenced pointers".

Tom Payne

Author: thp@cs.ucr.edu
Date: Mon, 27 Oct 2003 01:44:25 +0000 (UTC) Raw View

Bob Bell <belvis@pacbell.net> wrote:
+ do-not-spam-benh@bwsint.com (Ben Hutchings) wrote in message news:<slrnbpg2pg.1q8.do-not-spam-benh@tin.bwsint.com>...
+> Clearly C++ is the odd one out in not permitting rebinding.
+
+ Perhaps that's because C++ provides pointers. How many of these
+ languages also provide a type like a C++ pointer, which does allow
+ "rebinding"?

These other languages have reseatable references, which are like C's
pointers but get implicitly dereferenced in most contexts. I've heard
that C# has explicitly dereferenced pointers as well.

Apparently C's notion of pointer was influenced by Algol68 references
-- see <http://www.cse.ucsc.edu/~pohl/Spring01/correspond.htm>:

Correspondence

Dennis Ritchie on the Influence of Algol68

From Ira Pohl to Dennis Ritchie:

i have been lecturing to my grad students on prog lang design issues
and of course have told them the bcpl-b-c and unix story. i have also
described the algol 60-68-pascal history. my question is were you to
any extent influenced by algol68 ideas - especially their ideas on
mode(type).

Response from Dennis:

There was some, but not an awful lot. It is possible that the
composition of types (pointers to pointers etc) was at least a little
influential here, but I might have realized independently that if you
have pointers, you also needed pointers to pointers. This part of C is
in some ways quite similar to A68 (though without, of course, A68's
automatic coercions).

More explicit borrowing, which came later, were unions (though they
could have come from elsewhere), and very explicitly, casts. Even the
name came from A68.

Regards,
Dennis

+> Yet there is considerable variation between these languages in their
+> treatment of references, so it would not be accurate to consider
+> C++ references wholly unusual.
+
+ One central question that seems to get overlooked in these discussions
+ about C++'s references vs. other languages' references is "what's the
+ problem for C++?" It's not enough to point out that C++ is different;
+ why are C++'s reference semantics a problem for C++ programmers? I've
+ never cared that references couldn't be reseated or can't be NULL,
+ because when I want that kind of behavior I use a pointer.

I've worked with languages where lvalue-to-rvalue conversion wasn't
implicit. After people got used to it, they began to think of it as
natural and to think of implicit lvalue-to-rvalue conversion as
confusing. Even in English some people prefer to say "the man called
John" rather than to simply say "John". To me, however, such
unnecessary use of explicit indirection is usually (but not always)
distracting.

Tom Payne

Author: johnchx2@yahoo.com (johnchx)
Date: Wed, 29 Oct 2003 23:47:37 +0000 (UTC) Raw View

do-not-spam-benh@bwsint.com (Ben Hutchings) wrote
> johnchx wrote:
> >
> >   In C++, you can't rebind identifiers.
>
> That's nice and simple, though I'm not sure whether "identifier" is
> exactly the right term.
>

You're right.  I'm actually not sure that there is a term with exactly
the right scope to say what I mean here.  "Name" is less broad than
identifier, but still too broad (encompasses type names and labels).
"Variable" might do, though the idea of a "reference variable" is open
to controversy.  (The standard uses the term, but it also says that a
variable is "introduced by the declaration of an object," and the
typical reference declaration doesn't declare an object.)

"Name" may be the best bet, since it is true that names can't be
rebound.  (They can be hidden, but that's a different kettle of fish.)

> Here's a comparison of how C++ and the other languages handle
> references - omitting Object Pascal as I have no experience with it
> and, unlike those last three languages, I didn't think it was worth
> looking into:

Thanks for the comparison!  Interesting....

BTW, out of curiosity, which of these languages has a language
construct *called* a reference (or reference type)?  For instance,
Python, IIRC, just has names, and you can rebind names; more or less
the same with VB (though VB allows function parameters with a ByRef
qualifier).

In other words, when we talk about the "reference feature" in these
languages, in which cases are we talking about a particular construct
*called* reference, and where do we have just a general notion of
"reference semantics"?

---

Author: johnchx2@yahoo.com (johnchx)
Date: Wed, 29 Oct 2003 23:48:13 +0000 (UTC) Raw View

thp@cs.ucr.edu wrote in message news:<bne9h6$g30$1@glue.ucr.edu>...
> Consider:
>
>   const int& r1 = 3;  // automatic rvalue-to-lvalue conversion happens.
>   int& r2 = 3;        // no automatic rvalue-to-lvalue conversion happens.
>

Hmmm...well, yes, I suppose.  In the second case, nothing happens
(except a compiler error).
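
For illustration, the two declarations can be wrapped in a small
function.  This is only a sketch; the function name
read_through_const_ref is made up for the example and does not appear
anywhere in the thread.

```cpp
// A reference-to-const may be initialised from an rvalue: a temporary int
// is created, and it lives as long as the reference does.
int read_through_const_ref() {
    const int& r1 = 3;   // OK: binds to a temporary initialised with 3
    // int& r2 = 3;      // ill-formed: a non-const reference needs an lvalue
    return r1;           // reads the temporary through the reference
}
```

Uncommenting the r2 line should be rejected by any conforming compiler,
which is the "nothing happens (except a compiler error)" case.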

> + And in a way that's the point (or one of them, anyhow): reference type
> + is an exception to an otherwise clear rule.  And being an exception to
> + that rule is its entire purpose.
>
> Perhaps I'm quibbling about the meaning of "orthogonal".  To me it
> means "independent" as in the case of orthogonal vectors.  Vectors
> that point in opposite directions are as non-orthogonal as they come.
> I see "rvalue" and "lvalue" as opposite notions: an expression is
> an rvalue if and only if it isn't an lvalue.

No, I agree completely with you.  Perhaps I'm not being clear about
what I'm claiming is (almost) orthogonal to what.  Let me reformulate:

  Each expression has two (almost) orthogonal characteristics:
  (a) type (int, bool, T, what-have-you...)
  (b) category (lvalue or rvalue)


> + I'm claiming that rvalues and lvalues denoting objects denote their
> + referents differently.  An rvalue denotes the value of the object; an
> + lvalue denotes the location of the object.
>
> I'm not going to say that there's no difference, but one can via an
> rvalue mutate an object (i.e., change its value), e.g., f().flip(),
> where f returns by value an object of a user-defined type having a
> mutating member function, flip().

Well yes, but no.  This is a misleading formulation.  (I know it's
usually taught and explained this way, but I think it causes more
confusion than it prevents.)

You can't mutate an rvalue.  Period.

If, however, the rvalue happens to have class type, you can call one
of its member functions.  That function receives the implicit this
pointer.  *this is an lvalue.  Thus the member function has a
perfectly valid lvalue to modify as it chooses.
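
A minimal sketch of that mechanism, assuming a hypothetical class
Switch with a mutating member flip() (neither name comes from the
thread, which only mentions f().flip()):

```cpp
struct Switch {                      // hypothetical type for illustration
    bool on;
    Switch() : on(false) {}
    // Inside a non-const member function, *this is an lvalue of type Switch,
    // so the member can be modified freely.
    Switch& flip() { on = !on; return *this; }
};

Switch make_switch() { return Switch(); }   // the call expression is an rvalue

bool flipped_temporary_is_on() {
    // The temporary is modified through *this inside flip(); it lives until
    // the end of the full expression, long enough to read .on.
    return make_switch().flip().on;
}
```

The expression make_switch() itself stays an rvalue throughout; only
the member function sees an lvalue.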

> +
> + Example:
> +
> +  int  foo();
> +  int& bar();
> +  int  j;
> +  extern int& k;
> +
> +  int main() {
> +    j;
> +    k;
> +    foo();
> +    bar();
> +  }
> +
> + In main(), the expression "j" is an lvalue.  The expression "k" is
> + also an lvalue, whose type is adjusted to int.  That k is declared
> + with reference type doesn't affect the interpretation of the
> + expression "k".
> +
> + On the other hand, the expression "bar()" is an lvalue *because* it
> + has reference type.  The expression "foo()," which doesn't have
> + reference type, is an rvalue.
>
> Hmmmmm.  Consider:
>
>    int i;
>    class Widget {
>    public:
>       int& k;
>       Widget(i);
>    } w1,w2;
>    cout << &w1.k == &w2.k ? "yes" : "no" << endl;
>

Does this compile?  I don't see what you're driving at.


> + In other words, in C++, names declared with reference type behave just
> + like all the other names in this respect -- why introduce a
> + distinction only to explain that it doesn't actually apply. ;-)
>
> To dispel the all too common misconception that references are
> names.
>

I'm not sure I follow you here.  Names declared with reference type
are certainly names.


> C++ references can be bound at run time.  In C++, the default behavior
> is that entities that can be bound at run time can be rebound at
> run-time.

Huh?  Can you illustrate this?


> The point is that arbitrary decisions need to be mentioned -- they
> don't go without saying.

[snip]

> Such objects are exceptions to the notion that references behave
> like "implicitly dereferenced pointers".

I guess that's the point, really: if you think of references as
automatically dereferenced pointers, then lots of exceptions and special
rules have to be introduced and these will appear arbitrary.

---

Author: do-not-spam-benh@bwsint.com (Ben Hutchings)
Date: Thu, 16 Oct 2003 08:47:55 +0000 (UTC) Raw View

Following the thread "Fancy pointers that behave like Java-style
reference?", here's my attempt to define what these things are and
how they are related.  They aren't actually that simple, but
hopefully they avoid using vague terms such as 'alias' and are also
correct.

Lvalue and rvalue:

Every expression can be statically determined to yield either an
lvalue or an rvalue.  An lvalue has an object, function, reference-
to-object or reference-to-function type and potentially addresses
an object or function.  An rvalue has an object type and is
potentially a temporary value.  Evaluating an expression may
result in undefined behaviour and thus not yield anything
meaningful, hence the key word 'potentially'.

Reference and pointer:

A reference is an lvalue that is initialised to address either the
same object or function as another lvalue or a new object
initialised from an rvalue.  In the latter case the new object has
a const-qualified type and the reference has a reference-to-const
type.  [Explanation of the lifetime of the object omitted.]

A pointer is a value that is potentially the address of an object,
function, or region of storage, but it is not necessarily a valid
address.  An lvalue of pointer type (potentially) addresses such
a value and not any entity that is (potentially) addressed by the
pointer itself.

Relations between the above:

An lvalue that addresses a non-const object can generally be used
to assign an rvalue of the corresponding type to that object, but
this can be disabled by class-types.

An lvalue that addresses an object can be converted to an rvalue
if the object has been initialised or assigned to [are there any
exceptions to this?].  The type of the rvalue is the
corresponding object type stripped of any cv-qualification and
its value is the current value of the object.  An lvalue that
addresses a function can be converted to an rvalue of the
corresponding pointer-to-function type.

The unary '&' and '*' operators convert between rvalues of
pointer-to-object and pointer-to-function types and lvalues of
the corresponding object and function types.
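
As a rough sketch of those conversions (the function names here are
invented for the example):

```cpp
int deref_and_address_roundtrip() {
    int  x = 1;
    int* p = &x;    // unary &: lvalue x  ->  rvalue of type int*
    *p = 42;        // unary *: pointer value  ->  lvalue of type int
    int* q = &*p;   // applying both gets back the original pointer value
    return (q == &x) ? x : -1;
}

int twice(int v) { return 2 * v; }

int call_through_pointer() {
    int (*pf)(int) = &twice;   // lvalue of function type -> pointer to function
    return (*pf)(21);          // *pf is an lvalue of function type again
}
```

Both functions are expected to return 42 if the conversions behave as
described.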

[Explanation of ill-formed and undefined cases omitted.]

---

Author: do-not-spam-benh@bwsint.com (Ben Hutchings)
Date: Thu, 16 Oct 2003 20:07:46 +0000 (UTC) Raw View

Despite spending a lot of time writing this, I later recognised some
errors in it.  Before everyone jumps on me, here are my own
revisions:

I wrote:
> Every expression can be statically determined to yield either an
> lvalue or an rvalue.

Every well-formed expression can be determined at translation time to
be either an lvalue or an rvalue.

[The current standard says "is", not "yields".]

> An lvalue has an object, function, reference-to-object or
> reference-to-function type...

An lvalue has a type other than void...

[The original is overly complex, yet also omits member-function
types.]

> An rvalue has an object type and is potentially a temporary
> value.

An rvalue has an object or void type and potentially yields a
temporary value or (in the case of void) a lack of value.

> A reference is an lvalue that is initialised to address either the
> same object or function as another lvalue or a new object
> initialised from an rvalue.

A reference addresses an object or function.  It can be initialised
with an lvalue of the same type or of the type it refers to.  If it
is a reference to a const object type, it can be initialised with
an rvalue of the unqualified object type.

[An lvalue is a kind of expression, so a reference can't be an
lvalue.]

---

Author: gdr@integrable-solutions.net (Gabriel Dos Reis)
Date: Tue, 21 Oct 2003 19:20:44 +0000 (UTC) Raw View

dave@boost-consulting.com (David Abrahams) writes:

| Gabriel Dos Reis <gdr@integrable-solutions.net> writes:
|
| > dave@boost-consulting.com (David Abrahams) writes:
| >
| > | thp@cs.ucr.edu writes:
| > |
| > | > It might be worthwhile mentioning that some
| > | >     lvalues, e.g., "*(int*)0" don't represent objects
| > |
| > | Are you sure?
| > |
| > |     *(int*)0
| > |
| > | is not even a valid expression.
| >
| > Consider  sizeof(*(int*)0).
|
| OK, but *(int*)0 isn't evaluated.  I should have said it's undefined
| behavior to evaluate it.

agreed.

So, *(int*)0 is a valid expression. It invokes an undefined behaviour
only if it is evaluated.  Now, consider

  double g(int&) { return 0; }

  int main()
  {
     return  sizeof (double) != sizeof (g(*(int*)0));
  }

this program is well-defined because the operand of sizeof is never
evaluated. It must type-check. But then, that means the expression
*(int*)0 is bound to the (non-const) reference parameter int&.  That cannot
happen if *(int*)0 is an rvalue.
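
The same point can be probed with the classic sizeof-overload trick.
This is a sketch only: the names one, two and probe are hypothetical,
and the probe functions are declared but never defined or called.

```cpp
typedef char (&one)[1];   // sizeof(one) == 1
typedef char (&two)[2];   // sizeof(two) == 2

one probe(int&);          // viable only for a non-const lvalue of type int
two probe(const int&);    // viable for rvalues (and const lvalues) as well

// sizeof never evaluates its operand, so nothing is dereferenced here;
// overload resolution alone decides which probe would be called.
bool null_deref_is_lvalue() { return sizeof(probe(*(int*)0)) == 1; }
bool literal_is_lvalue()    { return sizeof(probe(0)) == 1; }
```

If *(int*)0 were an rvalue, only the const int& overload would be
viable and the first function would return false.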

| I don't see how it can be defined to be an
| lvalue and produce undefined behavior at the same time.

If it isn't an lvalue, are you saying it is an rvalue?

--
                                                       Gabriel Dos Reis
                                           gdr@integrable-solutions.net

---

Author: dave@boost-consulting.com (David Abrahams)
Date: Tue, 21 Oct 2003 23:38:29 +0000 (UTC) Raw View

gdr@integrable-solutions.net (Gabriel Dos Reis) writes:

> So, *(int*)0 is a valid expression. It invokes an undefined behaviour
> only if it is evaluated.  Now, consider
>
>   double g(int&) { return 0; }
>
>   int main()
>   {
>      return  sizeof (double) != sizeof (g(*(int*)0));
>   }
>
> this program is well-defined because the operand of sizeof is never
> evaluated. It must type-check. But then, that means the expression
> *(int*)0 is bound to the (non-const) reference parameter int&.  That cannot
> happen if *(int*)0 is an rvalue.
>
> | I don't see how it can be defined to be an
> | lvalue and produce undefined behavior at the same time.
>
> If it isn't an lvalue, are you saying it is an rvalue?

I think I'm saying it's no kind of value.  This may not be supported
by the standard.  My intuition tells me lvalues and rvalues are
runtime things and if it doesn't exist at runtime it can't be either
one.  But intuition ain't proof.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

---

Author: thp@cs.ucr.edu
Date: Wed, 22 Oct 2003 01:46:54 +0000 (UTC) Raw View

David Abrahams <dave@boost-consulting.com> wrote:
+ thp@cs.ucr.edu writes:
+
+> It might be worthwhile mentioning that some
+>     lvalues, e.g., "*(int*)0" don't represent objects
+
+ Are you sure?
+
+    *(int*)0
+
+ is not even a valid expression.

"Invalid"?  I grant that its evaluation would invoke undefined
behavior.  But AFAIK "int main(){if(0)*(int*)0;}" conforms.

Tom Payne

---

Author: gdr@integrable-solutions.net (Gabriel Dos Reis)
Date: Wed, 22 Oct 2003 01:47:30 +0000 (UTC) Raw View

"Marcin 'Qrczak' Kowalczyk" <qrczak@knm.org.pl> writes:

| On Mon, 20 Oct 2003 23:56:15 +0000, Gabriel Dos Reis wrote:
|
| > Yes, I see a problem.  If you equate lvalue with reference type then what
| > would be "T" in the following?
|
| int. Of course the language would work as today, only described
| differently,

this remains to be demonstrated.  Not because I'm being argumentative,
but because that assertion is far from trivial.

|              so we would say that during template parameter deduction
| if the parameter type is not a reference, then any toplevel reference of
| the argument type is stripped.
|
| In fact the description in the draft seems to assume that the type of the
| argument can be a reference. It doesn't say that a reference parameter
| type must be matched with an lvalue argument, it says that if both are
| of the form T& then T can be deduced.
|
| If it didn't change in the final standard, I'm afraid it's wrong. Either

Well, in this specific case what is wrong or what is right boils down
to a judgment call.  The wording about T& and T did not change.
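
For readers who want to see the deduction behaviour itself, here is a
small sketch.  It relies on C++11 facilities (decltype, static_assert,
<type_traits>) that postdate this discussion, and the helper names
by_value, by_ref, get_ref and ci are hypothetical.

```cpp
#include <type_traits>

// Each helper reports, through its return type, whether T was deduced as int.
template <typename T> std::is_same<T, int> by_value(T);
template <typename T> std::is_same<T, int> by_ref(T&);

int& get_ref();          // declared only; used in unevaluated operands
extern const int ci;     // declared only

// The argument get_ref() is an lvalue of type int, so T is int either way.
static_assert(decltype(by_value(get_ref()))::value, "T deduced as int");
static_assert(decltype(by_ref(get_ref()))::value,   "T deduced as int");

// For a const lvalue, top-level const is dropped by value, kept by reference.
static_assert( decltype(by_value(ci))::value, "T deduced as int");
static_assert(!decltype(by_ref(ci))::value,   "T deduced as const int");
```

In no case is T deduced as a reference type, whichever way the rule is
worded.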

I've tried several times to suggest that lvalueness should be
reflected in the type system (and I did so again a year ago when the
debates on typeof/decltype and "move semantics" were very active) but
in the end I've come to the conclusion that even if the current
situation is not ideal, changing the standard to encode lvalueness in
the type system will cause more surprises than is needed.
The issue of encoding lvalueness as reference is not novel.  It is a
recurrent theme.  And from what I can see from the archives, the
committee had a long debate a long time ago, and finally decided to
retain C rules.  It is not an issue that is cast in stone, but I'm
under the impression that one would need a really *new* killer argument
in favor of encoding lvalueness in the type system.


--
                                                       Gabriel Dos Reis
                                           gdr@integrable-solutions.net

---

Author: thp@cs.ucr.edu
Date: Wed, 22 Oct 2003 15:50:10 +0000 (UTC) Raw View

David Abrahams <dave@boost-consulting.com> wrote:
+ gdr@integrable-solutions.net (Gabriel Dos Reis) writes:
+
+> So, *(int*)0 is a valid expression. It invokes an undefined behaviour
+> only if it is evaluated.  Now, consider
+>
+>   double g(int&) { return 0; }
+>
+>   int main()
+>   {
+>      return  sizeof (double) != sizeof (g(*(int*)0));
+>   }
+>
+> this program is well-defined because the operand of sizeof is never
+> evaluated. It must type-check. But then, that means the expression
+> *(int*)0 is bound to the (non-const) reference parameter int&.  That cannot
+> happen if *(int*)0 is an rvalue.
+>
+> | I don't see how it can be defined to be an
+> | lvalue and produce undefined behavior at the same time.
+>
+> If it isn't an lvalue, are you saying it is an rvalue?
+
+ I think I'm saying it's no kind of value.  This may not be supported
+ by the standard.  My intuition tells me lvalues and rvalues are
+ runtime things and if it doesn't exist at runtime it can't be either
+ one.  But intuition ain't proof.

So, expressions are compile-time things, and

       expressions == rvalues + lvalues

Worse yet, there are situations like

       *f();

where f returns (int*)0 if and only if this program can be proven to
have defined behavior.

The point is that we have to make the lvalue-vs-rvalue decision at
compile time, and there's no way to tell what'll happen at run time.
Thus *<exp> must be an lvalue whenever <exp> has pointer type.
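
A compile-time check along these lines is sketched below.  It uses
C++11 decltype, which was not available when this was written, and the
trait name deref_is_lvalue is invented for the example.  The function f
is only declared, so what it would return at run time plays no part.

```cpp
#include <type_traits>
#include <utility>

int* f();   // declared only; its run-time result is unknowable here

// For an lvalue expression of type T, decltype yields T&.
static_assert(std::is_same<decltype(*f()), int&>::value,
              "*f() is an lvalue of type int");

// The same holds for any pointer to an object type T.
template <typename T>
struct deref_is_lvalue
    : std::is_lvalue_reference<decltype(*std::declval<T*>())> {};

static_assert(deref_is_lvalue<int>::value, "int* dereferences to an lvalue");
```

The check passes without f ever being defined, which matches the claim
that the category is fixed by the types alone.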

Tom Payne

---

Author: gdr@integrable-solutions.net (Gabriel Dos Reis)
Date: Wed, 22 Oct 2003 15:50:41 +0000 (UTC) Raw View

dave@boost-consulting.com (David Abrahams) writes:

| gdr@integrable-solutions.net (Gabriel Dos Reis) writes:
|
| > So, *(int*)0 is a valid expression. It invokes an undefined behaviour
| > only if it is evaluated.  Now, consider
| >
| >   double g(int&) { return 0; }
| >
| >   int main()
| >   {
| >      return  sizeof (double) != sizeof (g(*(int*)0));
| >   }
| >
| > this program is well-defined because the operand of sizeof is never
| > evaluated. It must type-check. But then, that means the expression
| > *(int*)0 is bound to the (non-const) reference parameter int&.  That cannot
| > happen if *(int*)0 is an rvalue.
| >
| > | I don't see how it can be defined to be an
| > | lvalue and produce undefined behavior at the same time.
| >
| > If it isn't an lvalue, are you saying it is an rvalue?
|
| I think I'm saying it's no kind of value.

I see.

|  This may not be supported
| by the standard.  My intuition tells me lvalues and rvalues are
| runtime things and if it doesn't exist at runtime it can't be either
| one.  But intuition ain't proof.

lvalue and rvalue are static properties, i.e. they are attributes
derived from static analysis of the program text. In particular, a
reference need not exist at run time.

There can be expressions which are well-formed from a static analysis
point of view, but which are not defined at runtime. That is part
of the reason why we use C++ :-)

--
                                                       Gabriel Dos Reis
                                           gdr@integrable-solutions.net

---

Author: do-not-spam-benh@bwsint.com (Ben Hutchings)
Date: Wed, 22 Oct 2003 15:51:26 +0000 (UTC) Raw View

In article <uvfqi443c.fsf@boost-consulting.com>, David Abrahams wrote:
> gdr@integrable-solutions.net (Gabriel Dos Reis) writes:
>
>> So, *(int*)0 is a valid expression. It invokes an undefined behaviour
>> only if it is evaluated.
<snip>
> I think I'm saying it's no kind of value.  This may not be supported
> by the standard.  My intuition tells me lvalues and rvalues are
> runtime things and if it doesn't exist at runtime it can't be either
> one.  But intuition ain't proof.

Well, they are apparently defined to be kinds of expression which can
be (and must be) distinguished at translation time.  Of course, there
are corresponding run-time entities, the results of evaluating those
expressions, which the same names may loosely be applied to - and
the standard itself seems to do this in places.

---

Author: thp@cs.ucr.edu
Date: Wed, 22 Oct 2003 17:44:23 +0000 (UTC) Raw View

Gabriel Dos Reis <gdr@integrable-solutions.net> wrote:
[...]
+ I've tried several times to suggest that lvalueness should be
+ reflected in the type system (and I did so again a year ago when the
+ debates on typeof/decltype and "move semantics" were very active) but
+ in the end I've come to the conclusion that even if the current
+ situation is not ideal, changing the standard to encode lvalueness in
+ the type system will cause more surprises than is needed.
+ The issue of encoding lvalueness as reference is not novel.  It is a
+ recurrent theme.  And from what I can see from the archives, the
+ committee had a long debate a long time ago, and finally decided to
+ retain C rules.  It is not an issue that is cast in stone, but I'm
+ under the impression that one would need a really *new* killer argument
+ in favor of encoding lvalueness in the type system.

Thanks for that insight.  I'd suspected so, but wasn't sure.  A change
in the metalanguage of the standard would indeed be a massive
undertaking.  I can sympathize with those who don't want to do so
without strong reason.  The reason I would offer for change would be
ease of conceptualization/learning.  That reason would not, however,
convince me if I were the one who had to do the rewriting.  ;-)

Tom Payne

---

Author: kanze@gabi-soft.fr
Date: Thu, 23 Oct 2003 06:15:34 +0000 (UTC) Raw View

dave@boost-consulting.com (David Abrahams) wrote in message
news:<uvfqi443c.fsf@boost-consulting.com>...

> I think I'm saying it's no kind of value.  This may not be supported
> by the standard.  My intuition tells me lvalues and rvalues are
> runtime things and if it doesn't exist at runtime it can't be either
> one.  But intuition ain't proof.

That's interesting, because I would have thought just the opposite:
lvalues and rvalues are purely static, compile time attributes that have
no real meaning at run-time (and nothing but a formal meaning at compile
time -- certainly nothing that anywhere resembles anything my intuition
is capable of even working on).

If you ask me why, I guess it is just because I've never heard the terms
anywhere outside of language specifications.  If I'm studying
algorithms, or something like that, I don't use the terms.  And when I
consider what happens at the hardware level when a program runs, I don't
either.

In C, I could always imagine that if something had an address, it was an
lvalue.  But even there -- on some of the machines I've worked on, there
were no floating point registers, nor floating point immediate
instructions, and floating point literals had an address.  And of
course, in C++...

In the end, in this group (comp.STD.c++), an lvalue is something that
the standard says is an lvalue -- the definition is spread out all over
chapter 5, and a few other places as well.  Elsewhere, I think it is
almost one of those "I know it when I see it, but I can't explain it"
sort of things.

None of which, of course, is particularly intellectually satisfying.

--
James Kanze           GABI Software        mailto:kanze@gabi-soft.fr
Conseils en informatique orientée objet/     http://www.gabi-soft.fr
                    Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, +33 (0)1 30 23 45 16

---

Author: johnchx2@yahoo.com (johnchx)
Date: Thu, 23 Oct 2003 06:21:47 +0000 (UTC) Raw View

thp@cs.ucr.edu wrote in message
> johnchx <johnchx2@yahoo.com> wrote:
> + thp@cs.ucr.edu wrote
> +>  2) You omitted initialization of const references via rvalues.
> +>
> +
> + Well, they're in there if you know where to look.  :-)  They're the
> + "rvalue-to-lvalue 'conversion'" I referred to.  (At least that's one
> + way an rvalue-to-lvalue "conversion" can occur.)
>
> But you didn't indicate that it could happen automatically, and AFAIK
> it doesn't happen automatically in any other context.

Hmmm...I'm not sure I understand what could "happen automatically."
Could you elaborate?

> + I put it this way to set up the notion that the purpose of reference
> + type is to allow one of these dimensions (type) to affect the other
> + (lvalue/rvalue-ness).
>
> But, in that case, they're not "orthogonal".

Yes...that is why I said "with two exceptions."

And in a way that's the point (or one of them, anyhow): reference type
is an exception to an otherwise clear rule.  And being an exception to
that rule is its entire purpose.

>
> [...]
> + What I'm driving at here is the notion that there are two different
> + aspects of a referent -- its location and its value  -- which may be
> + denoted by an expression, and lvalues denote the first aspect while
> + rvalues denote the second.
>
> In some cases an rvalue denotes an object, which in general is
> somewhat different from that object's value.
>

I don't think so, but I may simply not understand your point.

I'm claiming that rvalues and lvalues denoting objects denote their
referents differently.  An rvalue denotes the value of the object; an
lvalue denotes the location of the object.

In an intuitive sense, of course.  Which is probably worth
belaboring...because the standard gives the term "value" a technical
meaning which I *don't* mean in the context of my intuitive
characterization of lvalues and rvalues.  Put another, perhaps even
more confusing way, when I say an rvalue denotes the value of its
referent, I'm not implying that its referent is "a value."

Sigh.  If somebody can suggest a term -- that hasn't been "hijacked"
by the standard -- to replace "value" in my exposition, I'd be
grateful.

> +>  5) In the clause:
> +>
> +>       an expression with reference type is interpreted as an
> +>       lvalue, regardless of the other rules which might apply
> +>
> +>     I'd drop "interpreted as"
> +
> + Well...I'm trying to draw attention to the idea that having reference
> + type changes the interpretation of an expression which, if it didn't
> + have reference type, would have been an rvalue.
>
> All lvalues are subject to lvalue-to-rvalue conversions, whether or
> not they are lvalues by virtue of having reference type.  (But you
> knew that, so perhaps you had something else in mind.)

I'm claiming that if an expression would have been an lvalue,
regardless of whether it has reference type, then having reference
type does not affect the meaning of the expression.  If, on the other
hand, the expression would be an rvalue if it doesn't have reference
type, then having reference type has an effect -- namely causing the
expression to be interpreted as an lvalue.

Example:

  int  foo();
  int& bar();
  int  j;
  extern int& k;

  int main() {
    j;
    k;
    foo();
    bar();
  }

In main(), the expression "j" is an lvalue.  The expression "k" is
also an lvalue, whose type is adjusted to int.  That k is declared
with reference type doesn't affect the interpretation of the
expression "k".

On the other hand, the expression "bar()" is an lvalue *because* it
has reference type.  The expression "foo()," which doesn't have
reference type, is an rvalue.
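
The four classifications can be checked mechanically with C++11
decltype, which did not exist when this was posted, so treat the
following as a sketch.  The only change to the declarations is that k
is defined here (bound to j) instead of being extern.

```cpp
#include <type_traits>

int  foo();      // declared only; never called
int& bar();      // declared only; never called
int  j = 0;
int& k = j;      // defined here so the example links on its own

// decltype((e)) is T& exactly when the expression e is an lvalue of type T.
static_assert( std::is_lvalue_reference<decltype((j))>::value,     "j: lvalue");
static_assert( std::is_lvalue_reference<decltype((k))>::value,     "k: lvalue");
static_assert(!std::is_lvalue_reference<decltype((foo()))>::value, "foo(): rvalue");
static_assert( std::is_lvalue_reference<decltype((bar()))>::value, "bar(): lvalue");

// As expressions, j and k have the same type and the same category.
static_assert(std::is_same<decltype((j)), decltype((k))>::value,
              "j and k are indistinguishable as expressions");
```

The last assertion is the point being made about k: its declared
reference type leaves no trace in the expression "k".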

>
> +>     and add "denoting the referent
> +>     of the corresponding reference.
> +
> + All expressions denote their referents.
>
> That's certainly the case in say Algol68, but in C++ many people have
> serious objection to saying that the expression "3" *refers* to the
> integer three.
>

I've followed a little of this, but I may not fully understand the
issue.  Some of the objections I've seen seem to arise from the
misconception that rvalues always denote objects, which isn't so.

I think it's perfectly correct to say that the literal 3 is an rvalue
which denotes the value 3.

> + lvalue/rvalue-ness and having
> + or lacking reference type doesn't change this. (I know that the
> + suggested language is more or less a quotation from the standard...I
> + suppose that the import of this phrasing is that the implementation
> + is prohibited from, e.g., copying the referent and treating the
> + expression as denoting the copy.  So it probably does need to be in
> + the formal text, but I think it complicates the exposition.)
>
> AFAIK, the exposition in the standard normally mentions specifically
> what entity each sort of expression denotes.
>

My only real objection to the language is that it seems to imply that
something tricky is going on, when in fact, something trivial is going
on.  In my opinion, it would be a lot clearer to say:

   The referent of the expression is unchanged.

than to say

  ...the expression designates the object or function denoted by the
reference.

Which raises questions like:

   What's the difference between "designating" and "denoting"?

and

   What "reference?"  How did we go from talking about an
   expression with reference type to "the reference?"

> +>  6) The fact that all post-initialization occurrences of references
> +>     denote the referent makes it impossible directly to reseat
> +>     a reference and/or determine its location.
> +
> + I think this is a confusing notion that arises from thinking about
> + names declared with reference type as "pointer-like."  In fact,
> + there's nothing unusual at all about this.  Consider:
> +
> + int main() {
> +
> +  int i = 0;
> +  int* pi = &i;
> +
> + }
> +
> + After its definition, "i" is an lvalue which denotes an object of type
> + int.  How would you "reseat" it?
>
> The identifier i is statically bound to the corresponding offset
> (location) in certain data regions.  That binding is not subject to
> dynamic rebinding.  Most references are dynamically bound to objects.
>

Yes, I understand what you're saying here.

On the other hand, is it simpler to say:

   In C++, some identifiers are statically bound and some are
   dynamically bound.  You might think that you can re-bind
   dynamically bound identifiers, but you can't.

Or:

  In C++, you can't rebind identifiers.

In other words, in C++, names declared with reference type behave just
like all the other names in this respect -- why introduce a
distinction only to explain that it doesn't actually apply? ;-)
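
To make "you can't rebind identifiers" concrete, here is a minimal sketch (the function name is hypothetical). What looks like an attempt to reseat a reference is just an assignment to the referent.

```cpp
#include <cassert>

// Returns true when assigning to r left r bound to i.
bool assignment_does_not_reseat()
{
    int i = 1;
    int j = 2;
    int& r = i;   // r denotes i from here on

    r = j;        // not a rebinding: copies the value of j into i
    assert(i == 2);

    j = 3;        // r is unaffected, since it never denoted j
    return &r == &i && r == 2 && i == 2;
}
```

The same holds for the plain name i: no statement makes i denote a different object, so names declared with reference type are not a special case in this respect.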

> In most languages references are subject to dynamic rebinding,

Aha!  That may be a key difference in perspective.  I don't work with
other languages that have constructs called references, so my "default
assumptions" have been formed by how they work in C++, not by the
approaches taken in other languages.

I have worked with languages that follow the "everything is secretly an
automatically-dereferenced-reference-counted-shared-pointer-that-may-or-may-not-be-copy-on-write"
paradigm (VB, Object Pascal).  I hate them. ;-)

I gather that Java and Python also more or less fit this paradigm.
But I don't know Algol or Simula or Eiffel, and perhaps if I did, I'd
be more inclined to see "un-reseatable" as a special case.

> The fact that we can't at run time change the binding of an
> identifier, in no way makes the non-reseatability of references
> *expected behavior*.

Well, expectations vary.  Sometimes having a limited basis for
comparison can be an advantage.  ;-)

> It is something that has been prohibited by design.  References are an
> idea imported into C++ from other languages and deliberately crippled.

I understand.  Wishful thinking on my part, perhaps, about the
assumptions people would bring with them to C++.

>   There is no guarantee that a pointer to a struct-or-class object
>   having a reference points to the object's first member.
>
> The significance of this is that replacing post-initialization
> occurrences of references by similarly dereferenced pointers can cause
> such an object to become POD and subject to more rigid layout rules.

I'm not sure I understand this.  What replacement do you have in mind?
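
One possible reading of the quoted claim, sketched with C++11 type traits, which postdate this thread and are an assumption here, as are the struct and function names. Swapping a reference member for a pointer member changes whether the class gets the layout guarantees.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical types for illustration only.
struct WithReference { int& target; };   // has a reference member
struct WithPointer   { int* target; };   // same idea, pointer member instead

// A reference member rules out standard layout (and POD-ness in C++03 terms).
static_assert(!std::is_standard_layout<WithReference>::value,
              "a class with a reference member is not standard-layout");
static_assert(std::is_standard_layout<WithPointer>::value,
              "the pointer version is standard-layout");

// For the standard-layout version, a pointer to the object, suitably
// converted, points to its first member.
bool pointer_version_starts_at_first_member()
{
    int value = 0;
    WithPointer wp = { &value };
    assert(*wp.target == 0);
    return reinterpret_cast<void*>(&wp) == static_cast<void*>(&wp.target);
}
```

No such guarantee is made for WithReference, which is why the sketch does not even try to locate its member.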

>
> +>  7) By definition, sizeof(T&) denotes sizeof(T).
> +
> + Yes, though I think the only reason that there's a special rule on
> + this is that adjusting the type of an expression from T& to T might be
> + considered a step in the evaluation of the expression, and sizeof is
> + guaranteed not to evaluate its operand.
>
> But T& is not an expression and is not subject to evaluation.

Oh, I see, you meant the type-id T&, not an expression having the type
T&.  Never mind.  ;-)
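
For the type-id case, a short sketch, assuming C++11 static_assert; Wide and the function name are hypothetical. Applied to a reference type, sizeof yields the size of the referenced type, whatever the implementation uses to represent a reference.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical type, deliberately larger than any plausible pointer.
struct Wide { char bytes[64]; };

// sizeof applied to the type-id T& yields sizeof(T).
static_assert(sizeof(Wide&) == sizeof(Wide), "sizeof(T&) is sizeof(T)");
static_assert(sizeof(int&) == sizeof(int), "sizeof(T&) is sizeof(T)");

std::size_t size_via_reference_type()
{
    Wide w = {};
    Wide& r = w;
    // The expression form agrees: r denotes w, so sizeof r is sizeof(Wide).
    assert(sizeof r == sizeof w);
    return sizeof(Wide&);
}
```

So sizeof gives no way to observe how much storage, if any, a reference itself occupies.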

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]

Author: johnchx2@yahoo.com (johnchx)
Date: Thu, 23 Oct 2003 06:21:51 +0000 (UTC) Raw View

dave@boost-consulting.com (David Abrahams) wrote

> > | I don't see how it can be defined to be an
> > | lvalue and produce undefined behavior at the same time.
> >

Aren't there plenty of lvalues the use of which produces undefined
behavior?  What I have in mind is everything that runs afoul of
3.10/15.
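
A hedged sketch of the kind of case I mean. The function name is hypothetical, and the sketch assumes C++11 and a 32-bit float. The cast expression in the comment is a perfectly good lvalue, yet reading through it runs afoul of 3.10/15; copying the bytes is the well-defined alternative.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// Hypothetical helper: returns the object representation of f as an integer.
std::uint32_t bits_of(float f)
{
    // *reinterpret_cast<std::uint32_t*>(&f) is an lvalue of type
    // std::uint32_t, but accessing the stored float through it would be
    // undefined behavior under 3.10/15, so it is left as a comment.

    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);   // well-defined: copies the bytes
    return bits;
}
```

Being an lvalue and being safe to use are separate questions: the commented-out expression is the former without being the latter.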

---