Topic: references vs. pointers


Author: Tom Payne <thp@cs.ucr.edu>
Date: 1996/11/06
J. Kanze <kanze@gabi-soft.fr> wrote:
[...]
: Since the position of the standard is that references are not objects,
:  ...

It's not that simple!

The roots of the complexity are in the C Standard, which defines an
object to be "a region of data storage in the execution environment,
the contents of which can represent values. [3.14]" The C++ draft
standard appears to follow suit by defining an object to be "a region
of storage. [1.7]"

This segment-of-storage definition of object has a problem in that it
defines a semantic concept in terms of implementation.  Still it is
based on sound intuition and meshes well with other, more semantic,
definitions.  For example, it coincides with that of Booch [Object
Oriented Design, p. 77]:
    An object has state, behavior, and identity.
via the identifications
    state     ~  value
    identity  ~  address
    behavior  ~  type-specific operations.
It also meshes with a subsequent (implicit) definition in the
C Standard:
    Types are partitioned into object types (types that describe
    objects), function types (types that describe functions), and
    incomplete types (types that describe objects but lack
    information needed to determine their sizes).  [6.1.2.5],
which implies that an object is any instance of a data type.

In an aggressively optimized program most local variables, which are
instances of data types, do not occupy segments of storage.  They are
typically kept in registers and may be spilled to a different place in
storage each time they are spilled.  These distinct definitions of
"object" in the C Standard are commonly reconciled, however, via the
as-if rule -- all instances of data types behave as if they occupy
segments of storage, in the sense that there is no experiment one can
do to determine that they don't.  A stretch, but a consistent one.

The C++ Standard similarly contains a subsequent (implicit)
redefinition of "object":
  An object type is a (possibly cv-qualified) type that is not a
  function type, not a reference type, and not incomplete (except for
  an incompletely-defined object type). [3.9.9]
which implies that an object is any instance of a non-reference data
type.  Unfortunately, this redefinition can't be reconciled by a
consistent appeal to the as-if rule.
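
The classification quoted above can be checked mechanically with the
type traits of later C++ (C++11 onward).  This is only a sketch: the
trait std::is_object postdates the draft under discussion and treats
incomplete types slightly differently, and the helper name
reference_is_object_type is invented for illustration.

```cpp
#include <cassert>
#include <type_traits>

// Classification by the standard library's type traits (C++11 and later).
// Reference and function types fall outside the "object type" category.
static_assert(std::is_object<int>::value, "int is an object type");
static_assert(std::is_object<int*>::value, "a pointer is an object type");
static_assert(!std::is_object<int&>::value, "a reference is not an object type");
static_assert(!std::is_object<int()>::value, "a function is not an object type");

// Hypothetical run-time accessor so the result can be inspected in a test.
inline bool reference_is_object_type() {
    return std::is_object<int&>::value;
}
```

The trait only reports how the type is classified; it says nothing
about whether an implementation sets aside storage for a reference.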

The problem is that a dynamically initialized reference has as much
or more need for a "segment of memory" as a constant static int whose
address isn't used.  So, to reconcile the redefinition, the C++
Standards Committee argues that

 The reference, which needs memory to record its referent, doesn't
 really have any, because there is no way to find its size or location.
 The int, whose value can be determined at compile time, needs and has
 memory, because (in another program) its address could be used.

I simply find this casuistry a bit silly, but many reasonable
programmers take it seriously and are misled into concluding, for
instance, that "references are not allowed to occupy memory."
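
A minimal sketch of the point about storage, assuming a mainstream
implementation.  The names Holder, pick, read_through and holder_size
are invented for illustration.  The standard does not say how large
Holder is; on common compilers it comes out pointer-sized.

```cpp
#include <cassert>
#include <cstddef>

// sizeof applied to a reference type yields the size of the referent
// type, so the language offers no way to ask for the reference's own size.
static_assert(sizeof(int&) == sizeof(int), "sizeof sees through the reference");

// A class whose only member is a reference bound at run time.
struct Holder {
    int& target;
    explicit Holder(int& t) : target(t) {}
};

// The referent depends on a run-time flag, so each Holder has to
// record it somewhere.
inline int& pick(bool first, int& a, int& b) { return first ? a : b; }

inline int read_through(bool first) {
    int a = 1;
    int b = 2;
    Holder h(pick(first, a, b));
    return h.target;  // reads a or b, depending on the flag
}

// Implementation-dependent: typically the size of a pointer.
inline std::size_t holder_size() { return sizeof(Holder); }
```

On typical implementations holder_size() equals sizeof(int*), which
shows the reference member taking up memory even though its own size
and address cannot be queried directly.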

Tom Payne








[ comp.std.c++ is moderated.  To submit articles: try just posting with      ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu         ]
[ FAQ:      http://reality.sgi.com/employees/austern_mti/std-c++/faq.html    ]
[ Policy:   http://reality.sgi.com/employees/austern_mti/std-c++/policy.html ]
[ Comments? mailto:std-c++-request@ncar.ucar.edu                             ]





Author: kanze@gabi-soft.fr (J. Kanze)
Date: 1996/10/28
Tom Payne <thp@cs.ucr.edu> writes:

   [I've cut most of the following.  I really want to hear the point of
view of people who are actively involved in the standardization, and not
my own...]

> : > Presumably, you are referring to the line: "References might or might
> : > not require storage; however, the storage duration categories apply to
> : > references as well."  I would take this to mean that the lifetime of a
> : > reference is what you'd expect it to be if it were an object.
>
> : Correct.  Your interpretation corresponds to mine, however...  The
> : lifetime of an object depends on whether it has a non-trivial
> : constructor or not.  Do references have a non-trivial constructor?
>
> AFAIK there is nothing to suggest that references have constructors.

Since the position of the standard is that references are not objects,
they obviously cannot have constructors.  The quoted passage, however,
suggests that their lifetime is the same as objects with the same
storage class.  The problem with this is, of course, that not all
objects with the same storage class have the same lifetime.

> : There is also the problem of the default initialization: all other
> : built-in objects with static lifetime have a lifetime equal to the
> : duration of the program.  Such objects are normally initialized by
> : assigning 0 to them.  What is the effect of this with regards to
> : references?
>
> There are (at least) two distinct ways to interpret the lifetime of a
> reference:
>   (1) the same lifetime as const ints of similar storage class,
>   (2) extending from initialization to destruction of referent or
>       end of the lifetime associated with its storage-class, whichever
>       comes first.
> Under (1), for instance, the use of an uninitialized pointer yields
> undefined behavior, because it has an indeterminate and possibly
> invalid value.  Under (2) the behavior is undefined because the
> reference is "nonexistent."  I find (1) to be conceptually simpler.

The problem with (1) is that const ints and similar such things (like
const pointers) have a two step initialization, at least when they have
static lifetime.  First they are initialized "as if" zero were assigned
to them.  This is rather awkward for a reference, since there is no way
that a reference can be legally initialized with 0 (even if you consider
references as automatically dereferenced pointers).

This interpretation requires special wording to handle this problem.
I'm not saying that this wording is necessarily difficult (it isn't),
but it isn't there now.
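
The two-step initialization of static const ints mentioned above can be
observed in a small sketch.  The names seed, read_seed, early_copy and
late_value are invented for illustration; the volatile read is only
there to keep the compiler from turning the second initializer into a
constant.

```cpp
#include <cassert>

// A volatile read cannot be folded at compile time, so late_value
// below has to be initialized dynamically.
volatile int seed = 7;
inline int read_seed() { return seed; }

extern const int late_value;  // declared here, defined further down

// Dynamic initialization runs in definition order within a translation
// unit, so this reads late_value after its zero-initialization but
// before read_seed() has supplied its real value.
const int early_copy = late_value;

const int late_value = read_seed();
```

By the time main runs, early_copy holds 0 and late_value holds 7.  No
corresponding "zero" state is available to describe a reference that
has not yet been bound.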

> : When speaking of the "lifetime of an object", in 3.8, a reference
> : definitely acts like an object with a non-trivial constructor (but a
> : trivial destructor), at least in all reasonable implementations.
>
> Why is a reference's constructor any less trivial than that of a const
> int or a const pointer?

It's not (supposing that it had a constructor).  However, again: objects
with static lifetime and trivial constructors are 0-initialized.
References are obviously *NOT* 0-initialized, since such an
initialization is not allowed by the current draft.  In fact, I cannot
find anything in the draft as to what a reference could possibly be
before initialization.  Except, of course, if it occupies storage; then
it could always be raw memory.

This corresponds to the lifetime of an object with a non-trivial
constructor.

> : I think that 3.6.2, on the other hand, needs a little reworking.  As
> : currently worded: "Objects of POD types with static storage duration
> : initialized with constant expressions shall be initialized before any
> : dynamic initialization takes place."  A reference (or an object
> : containing a reference) is not a POD type; the current wording implies
> : that *ALL* references follow the ordering of dynamic initialization.
>
> : I don't think that this is what is wanted.  For example:
>
> :   int             a ;
> :   extern int&     ra ;
> :   X               x ;
> :   int&            ra = a ;
>
> : As currently worded, "ra" is explicitly uninitialized (or zero
> : initialized?) when the constructor for x is called.
>
> The simplest view would seem to be that, at the point where x is
> initialized, ra has been initialized to a possibly invalid value (say
> zero), so any access other than initialization provokes undefined
> behavior.

Or that ra doesn't exist before initialization is finished, so any
access constitutes undefined behavior.
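
For readers who want to try the four-line example, here is a sketch
that fills in a hypothetical X whose constructor looks at ra.  The
namespace and the member saw_bound_reference are invented for
illustration.  As far as I can tell, later revisions of the language
(C++11 onward) treat "int& ra = a;" at namespace scope as constant
initialization, so the reference is bound before any constructor runs.
The sketch shows that later behavior and does not settle what the 1996
draft wording required.

```cpp
#include <cassert>

namespace init_order_demo {

int a;
extern int& ra;

// Hypothetical stand-in for the X of the example: its constructor
// records whether ra already aliases a while x is being constructed.
struct X {
    bool saw_bound_reference;
    X() : saw_bound_reference(&ra == &a) {}
};

X x;          // dynamically initialized (non-constexpr constructor)
int& ra = a;  // bound to an object with static storage duration

}  // namespace init_order_demo
```

With a current compiler, init_order_demo::x.saw_bound_reference is
true, which matches reading the binding of ra as static rather than
dynamic initialization.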

--
James Kanze          +33 3 88 14 49 00           email: kanze@gabi-soft.fr
GABI Software, Sarl., 8 rue des Francs Bourgeois, 67000 Strasbourg, France
Conseils en informatique industrielle --
                            -- Beratung in industrieller Datenverarbeitung







Author: Tom Payne <thp@cs.ucr.edu>
Date: 1996/10/22
In comp.lang.c++.moderated J. Kanze <kanze@gabi-soft.fr> wrote:
:     [This has become strictly a standardization issue.  I've
: cross-posted to comp.std.c++, and set follow-ups there.]

: Tom Payne <thp@cs.ucr.edu> writes:

: > J. Kanze <kanze@gabi-soft.fr> wrote:

[...]
: > : I'm not sure about lifetime;
: > : there is text in 3.7 which makes some assertions concerning the lifetime
: > : of the nonexistent storage of a reference, but I am unable to determine
: > : what this is supposed to mean.  (If a reference always refers to
: > : something, then by definition, it doesn't exist until it refers to
: > : something, i.e. it is initialized.)
: >
: > Presumably, you are referring to the line: "References might or might
: > not require storage; however, the storage duration categories apply to
: > references as well."  I would take this to mean that the lifetime of a
: > reference is what you'd expect it to be if it were an object.

: Correct.  Your interpretation corresponds to mine, however...  The
: lifetime of an object depends on whether it has a non-trivial
: constructor or not.  Do references have a non-trivial constructor?

AFAIK there is nothing to suggest that references have constructors.

: There is also the problem of the default initialization: all other
: built-in objects with static lifetime have a lifetime equal to the
: duration of the program.  Such objects are normally initialized by
: assigning 0 to them.  What is the effect of this with regards to
: references?

There are (at least) two distinct ways to interpret the lifetime of a
reference:
  (1) the same lifetime as const ints of similar storage class,
  (2) extending from initialization to destruction of referent or
      end of the lifetime associated with its storage-class, whichever
      comes first.
Under (1), for instance, the use of an uninitialized pointer yields
undefined behavior, because it has an indeterminate and possibly
invalid value.  Under (2) the behavior is undefined because the
reference is "nonexistent."  I find (1) to be conceptually simpler.

: When speaking of the "lifetime of an object", in 3.8, a reference
: definitely acts like an object with a non-trivial constructor (but a
: trivial destructor), at least in all reasonable implementations.

Why is a reference's constructor any less trivial than that of a const
int or a const pointer?

: I think that 3.6.2, on the other hand, needs a little reworking.  As
: currently worded: "Objects of POD types with static storage duration
: initialized with constant expressions shall be initialized before any
: dynamic initialization takes place."  A reference (or an object
: containing a reference) is not a POD type; the current wording implies
: that *ALL* references follow the ordering of dynamic initialization.

: I don't think that this is what is wanted.  For example:

:   int             a ;
:   extern int&     ra ;
:   X               x ;
:   int&            ra = a ;

: As currently worded, "ra" is explicitly uninitialized (or zero
: initialized?) when the constructor for x is called.

The simplest view would seem to be that, at the point where x is
initialized, ra has been initialized to a possibly invalid value (say
zero), so any access other than initialization provokes undefined
behavior.

[...]
: > The standard should, however, adopt the view that references are
: > objects, not because of any difference in the semantics of the
: > language, but because of differences in the teachability of the
: > language.  Stroustrup lists teachability among the Language-Technical
: > Rules in D&E, p. 119:
: >
: >    IF IN DOUBT, PICK THE VARIANT OF A FEATURE THAT IS EASIEST TO
: >    TEACH ...  One intent is to ease the task for educators ...
: >
: > I grant that teachability tends to be in the mind of the beholder,
: > but consider a student who is skilled in C transferring into a program
: > where the standard language is C++.  He asks an educator what a
: > reference is.  Does telling him that "a reference is not an object,
: > but rather an alias" suggest that a reference:
: >   *  Can be returned by functions?
: >   *  Can be initialized at run time?
: >   *  Can dangle?
: >   *  Has a lifetime?
: >   *  Need not have a name?
: >   *  Can appear as a member of a struct or class object?

: But teaching him that "a reference is an object" suggests that a
: reference:
:   *  Can be (bytewise) copied.
:   *  Can be assigned to/modified.
:   *  Has an address.

Each of these is an immediate consequence of the fundamental behavior
of references: a reference refers all operations to its referent.
C++ has been carefully crafted so that there is no way to find the
address of a reference and indirectly change what object it refers to.
Nothing, however, says that references don't have addresses.  (A tree
falling in the forest ALWAYS makes a sound.)
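
The "refers all operations to its referent" behavior can be sketched as
follows.  The function names are invented for illustration.

```cpp
#include <cassert>

// Taking the address of a reference yields the address of its referent.
inline bool address_of_reference_is_address_of_referent() {
    int value = 1;
    int& ref = value;
    return &ref == &value;
}

// Assigning to a reference modifies the referent and does not rebind.
inline int assign_through_reference() {
    int first = 1;
    int second = 2;
    int& ref = first;
    ref = second;  // copies 2 into first; ref still refers to first
    second = 3;    // has no effect on first
    return first;
}

// After the assignment the reference still designates the same object.
inline bool still_bound_after_assignment() {
    int first = 1;
    int second = 2;
    int& ref = first;
    ref = second;
    return &ref == &first;
}
```

Because every operation is forwarded in this way, no expression in the
language designates the reference itself rather than the object it
refers to.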

: > He will most likely leap to one of the following models/analogies,
: > none of which suggest the features listed above:
: >   *  Aliases defined via the C preprocessor.
: >   *  Hard links in the Unix file system.
: >   *  Soft links in the Unix file system.

: In fact, I find that the best analogy for references is links on the
: Unix file system.  Sort of a combination of hard links and soft links.
: (Mostly soft, since they can dangle.)

Close.  But links necessarily have names associated with them.  Also,
deleting a file only removes a single link (unless it was the final
hard link).

: > If, on the other hand, one tells him that "references are a special
: > kind of object that behaves somewhat like automatically dereferenced
: > pointers," each of those features is automatically implied.

: Plus a lot of others which aren't true.

Only a few, and they are either:
  1)  immediate consequences of the fundamental behavior of references,
  2)  by-products of C++'s treatment of references that have to be
      introduced as special cases under either view (e.g., no
      arrays of references).
Each major category of object, however, has certain peculiarities,
e.g., arrays and class objects cannot really be returned by value.

In building a reasonable taxonomy of entities, the relevant question
is whether references share the properties that tie together the other
categories of objects.  Semantically, what objects have in common
(and other entities lack) are attributes (specifically, values) that
cannot be determined until run time -- hence their need for a "segment
of storage" to store such attributes.  A simple diagonal argument
constructs references whose value cannot be determined until run time.

Unfortunately, C/C++ defines "object" in terms of implementation: "a
segment of storage."  Under normal implementations, such storage is
required for all objects (and references) that
  (1) have a value that varies at run time, or
  (2) have their address used, or
  (3) are members of a POD struct-or-class object, or
  (4) are initialized at run time, or
  (5) have external linkage.
In other (degenerate) cases, such storage can be and often is
optimized away, but we still consider the item to be an object if
other instances of its type require storage, e.g., a linked static
const int is still an object even if its storage gets optimized
away.

Tom Payne





Author: kanze@gabi-soft.fr (J. Kanze)
Date: 1996/10/15
    [This has become strictly a standardization issue.  I've
cross-posted to comp.std.c++, and set follow-ups there.]

Tom Payne <thp@cs.ucr.edu> writes:

> J. Kanze <kanze@gabi-soft.fr> wrote:

    [...]
> So, in C++ one cannot have a pure value of a (nontrivial) class type,
> simply because this is a pointer.  I've long felt that this should
> have been a reference, but only for aesthetic reasons.

You're not alone in feeling this.  I was once told (by Stroustrup, or
Andy Koenig?) that the reason "this" is a pointer is that references
weren't yet in the language when "this" was introduced.

> : I'm not sure about lifetime;
> : there is text in 3.7 which makes some assertions concerning the lifetime
> : of the nonexistent storage of a reference, but I am unable to determine
> : what this is supposed to mean.  (If a reference always refers to
> : something, then by definition, it doesn't exist until it refers to
> : something, i.e. it is initialized.)
>
> Presumably, you are referring to the line: "References might or might
> not require storage; however, the storage duration categories apply to
> references as well."  I would take this to mean that the lifetime of a
> reference is what you'd expect it to be if it were an object.

Correct.  Your interpretation corresponds to mine, however...  The
lifetime of an object depends on whether it has a non-trivial
constructor or not.  Do references have a non-trivial constructor?

There is also the problem of the default initialization: all other
built-in objects with static lifetime have a lifetime equal to the
duration of the program.  Such objects are normally initialized by
assigning 0 to them.  What is the effect of this with regards to
references?

When speaking of the "lifetime of an object", in 3.8, a reference
definitely acts like an object with a non-trivial constructor (but a
trivial destructor), at least in all reasonable implementations.

I think that 3.6.2, on the other hand, needs a little reworking.  As
currently worded: "Objects of POD types with static storage duration
initialized with constant expressions shall be initialized before any
dynamic initialization takes place."  A reference (or an object
containing a reference) is not a POD type; the current wording implies
that *ALL* references follow the ordering of dynamic initialization.

I don't think that this is what is wanted.  For example:

  int             a ;
  extern int&     ra ;
  X               x ;
  int&            ra = a ;

As currently worded, "ra" is explicitly uninitialized (or zero
initialized?) when the constructor for x is called.

> : > There is nothing inconsistent or inconvenient in viewing references as
> : > objects.
>
> : There would be nothing inconsistent or inconvenient in defining
> : references as objects.  The standards committee chose a different
> : direction.  There is something inconsistent in viewing references as
> : something other than what they are defined as.
>
> Regardless of the intent of the committee, AFAIK there is nothing in
> the current wording that is inconsistent with viewing or implementing
> references as objects.

I'm not sure.  If you want to consider them as objects, they are very
special objects, since they cannot be copied, etc.  See the discussion
of types, 3.9.

> : > Denying that they are objects, however, begs the question:
> : > What the hell are they then?
>
> : They are references.
>
> From the standard: "An entity is a value, object, subobject, base
> class subobject, array element, variable, function, set of functions,
> instance of a function, enumerator, type, class member, template, or
> namespace."  Which is it?

From the standard (3.9): "Types describe objects, references or
functions." :-)  Can it be that the committee accidentally forgot one of
the entities in the list?  (Or can it be that the draft is ambiguous in
this regard?)

> : > The C++ world would be a less confusing place if the committee would
> : > simply accept that references are objects and be done with it.
>
> : Maybe.  I think that the committee have a very difficult job.
>
> Agreed!!
>
> : I certainly think that the status of references could be clearer, even
> : without declaring them as objects.
>
> Agreed.  Nevertheless, the committee should then take note of the fact
> that in terms of the semantics of the language, i.e., which programs
> and implementations conform, it makes no difference whether the
> standard specifies that references are or are not objects.

I think that we can agree that references are like objects in some ways,
and different from objects in others.  This leaves two ways open to
define them in the standard:

1. As objects.  In places where they do not behave as objects (e.g. byte
copying as in 3.9), the standard must then refer to "objects other than
references", or something similar.

2. As non-objects.  In places where they do behave like objects, the
standard must then explicitly state "objects and/or references".

Which way is better is largely a matter of personal opinion.  I tend to
prefer keeping them separate; if necessary, one could invent a third
term ("object-like entities"?) for the cases where they are the same.
(This is a basic preference in my tastes.  Thus, for example, I also
consider that unions are not classes; unions and classes are two
distinct concepts, which share certain characteristics.)

> The standard should, however, adopt the view that references are
> objects, not because of any difference in the semantics of the
> language, but because of differences in the teachability of the
> language.  Stroustrup lists teachability among the Language-Technical
> Rules in D&E, p. 119:
>
>    IF IN DOUBT, PICK THE VARIANT OF A FEATURE THAT IS EASIEST TO
>    TEACH ...  One intent is to ease the task for educators ...
>
> I grant that teachability tends to be in the mind of the beholder,
> but consider a student who is skilled in C transferring into a program
> where the standard language is C++.  He asks an educator what a
> reference is.  Does telling him that "a reference is not an object,
> but rather an alias" suggest that a reference:
>   *  Can be returned by functions?
>   *  Can be initialized at run time?
>   *  Can dangle?
>   *  Has a lifetime?
>   *  Need not have a name?
>   *  Can appear as a member of a struct or class object?

But teaching him that "a reference is an object" suggests that a
reference:
  *  Can be (bytewise) copied.
  *  Can be assigned to/modified.
  *  Has an address.

> He will most likely leap to one of the following models/analogies,
> none of which suggest the features listed above:
>   *  Aliases defined via the C preprocessor.
>   *  Hard links in the Unix file system.
>   *  Soft links in the Unix file system.

In fact, I find that the best analogy for references is links on the
Unix file system.  Sort of a combination of hard links and soft links.
(Mostly soft, since they can dangle.)

> If, on the other hand, one tells him that "references are a special
> kind of object that behaves somewhat like automatically dereferenced
> pointers," each of those features is automatically implied.

Plus a lot of others which aren't true.
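
The "automatically dereferenced pointer" analogy debated above can be
sketched as follows.  Node, larger and demo_pointer_analogy are names
invented for illustration.  The sketch covers only the properties
where the analogy holds: return from a function, run-time
initialization, no name of its own, and class membership.

```cpp
#include <cassert>

// A reference member next to its const-pointer counterpart.
struct Node {
    int& ref;        // reference as a class member
    int* const ptr;  // pointer that cannot be reseated either
    explicit Node(int& r) : ref(r), ptr(&r) {}
};

// A reference returned by a function; the referent is chosen at run time.
inline int& larger(int& a, int& b) { return a < b ? b : a; }

inline int demo_pointer_analogy() {
    int x = 3;
    int y = 5;
    Node n(larger(x, y));  // the unnamed result initializes both members
    n.ref += 1;            // implicit dereference: y becomes 6
    *n.ptr += 1;           // explicit dereference: y becomes 7
    return y;
}
```

Had larger returned a reference to one of its own local variables, the
result would dangle in the same way a returned pointer to a local
would.  The analogy stops at the points raised in the objection: the
reference cannot be reseated, and its own address cannot be taken.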

--
James Kanze           (+33) 88 14 49 00          email: kanze@gabi-soft.fr
GABI Software, Sarl., 8 rue des Francs Bourgeois, 67000 Strasbourg, France
Conseils en informatique industrielle --
                            -- Beratung in industrieller Datenverarbeitung

      [ Send an empty e-mail to c++-help@netlab.cs.rpi.edu for info ]
      [ about comp.lang.c++.moderated. First time posters: do this! ]