Thread

Topic: C/C++ object identity (was: C++ briar patch (Was: Object IDs are bad))

Author: fjh@mundook.cs.mu.OZ.AU (Fergus Henderson)
Date: 1997/05/30

Tom Payne <thp@cs.ucr.edu> writes:

>In comp.std.c++ Fergus Henderson <fjh@mundook.cs.mu.OZ.AU> wrote:
>: Hypothetically assuming that the C standard committee had really
>: intended to make the results of `==' on pointers to unrelated objects
>: undefined,
>
>I used to subscribe to that hypothesis, but 6.3.9 (Equality operators)
>"Constraints" ... makes no restriction that the referents of these
>pointers be related.

OK, the fact that the "Constraints" section doesn't explicitly
disallow it means that it is not explicitly undefined.  But (one
could argue) it is not explicitly defined either, hence it is
still undefined behaviour.

--
Fergus Henderson <fjh@cs.mu.oz.au>   |  "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh>   |  of excellence is a lethal habit"
PGP: finger fjh@128.250.37.3         |     -- the last words of T. S. Garp.
---
[ comp.std.c++ is moderated.  To submit articles: try just posting with      ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu         ]
[ FAQ:      http://reality.sgi.com/employees/austern_mti/std-c++/faq.html    ]
[ Policy:   http://reality.sgi.com/employees/austern_mti/std-c++/policy.html ]
[ Comments? mailto:std-c++-request@ncar.ucar.edu                             ]

Author: James Kanze <james-albert.kanze@vx.cit.alcatel.fr>
Date: 1997/05/26

"Paul D. DeRocco" <pderocco@strip_these_words.ix.netcom.com> writes:

|>  Konstantin Baumann wrote:
|>  >
|>  > So it is not possible to check (e.g. for the assignment-operator of a
|>  > class) if two given objects are the same?
|>  >
|>  > class A {
|>  >         ...
|>  >         A& operator=(const A& other) {
|>  >                 if(this != &other) { ... }
|>  >                 return(*this);
|>  >         }
|>  >         ...
|>  > };
|>
|>  The only way to get two unequal pointers to the same memory location is
|>  in a segmented architecture, and I don't believe any implementation will
|>  ever produce such a result unless you coax it into doing so. If two
|>  pointers have different segment parts, then the only way they can point
|>  to the same location is if you've gone off the end of an array (which
|>  the language says produces undefined results anyway) or used casts to do
|>  magic pointer arithmetic, in which case you're on your own. In normal
|>  programming, the above is perfectly safe.

The problem has nothing to do with segmented architecture.  The problem
is simply that the standards (both C and C++) do not guarantee that
pointers to different objects compare unequal.  Pointers to the same
object *are* guaranteed to compare equal, and a null pointer is
guaranteed to compare unequal to any valid non-null pointer.  At least
in C; I presume that if the current C++ draft does not guarantee as much
as the C standard, it is an editorial error, which will be fixed, and
not something that the C++ standardization committee decided.

Consider the following:

    int         x[ 1 ] ;
    int         y[ 1 ] ;
    assert( &x[ 1 ] != &y[ 0 ] ) ;

Logically, one should be able to assume that this assertion never fails,
since the two pointers refer to different objects.  In fact, the
assertion will fail in certain cases for every compiler I've ever used.
I suspect that the reason for the current wording in the C standard is
an attempt to "legalize" these implementations.

In practice, such problems only occur in two cases: 1) with pointers one
past the end of an array, and 2) with pointers into string literals.  So
one possible solution would be to add words to the effect that two
pointers to different objects will compare unequal, except that if one
or both of the pointers points to one past the end of an array, or into
a string literal, the results of the comparison are not defined.  (I
don't generally like such special cases, but this seems to be an
adequate guarantee, and corresponds to existing practice.)
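
A minimal sketch of the two situations, written in C++. The function names
are made up for illustration, and `x + 1` is used in place of `&x[ 1 ]` so
that no element past the end is ever named. The first comparison has a
fixed answer; the second depends on where the implementation happens to
place `y` relative to `x`, so no particular result should be assumed.

```cpp
#include <cassert>

int x[1];
int y[1];

// Always true: x[0] and y[0] are distinct objects that both exist.
bool first_elements_differ() { return &x[0] != &y[0]; }

// May be true or false: x + 1 points one past the end of x, and an
// implementation is free to place y at exactly that address.
bool past_end_meets_y() { return x + 1 == y; }

// A check that relies only on what is certain here.
void check() {
    assert(first_elements_differ());
    assert(x + 1 == &x[0] + 1);   // the same past-the-end pointer, formed twice
}
```

Code that needs object identity can therefore compare pointers to actual
elements and avoid drawing conclusions from a past-the-end pointer.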

--
James Kanze      home:     kanze@gabi-soft.fr        +33 (0)1 39 55 85 62
                 office:   kanze@vx.cit.alcatel.fr   +33 (0)1 69 63 14 54
GABI Software, Sarl., 22 rue Jacques-Lemercier, F-78000 Versailles France
     -- Conseils en informatique industrielle --

Author: Tom Payne <thp@cs.ucr.edu>
Date: 1997/05/26

In comp.std.c++ Fergus Henderson <fjh@mundook.cs.mu.OZ.AU> wrote:
[...]
: When I said "If `==' was really intended to be undefined ...",
: I meant "If the C standard committee had really intended to make
: the results of `==' undefined ...".
[...]
: Hypothetically assuming that the C standard committee had really
: intended to make the results of `==' on pointers to unrelated objects
: undefined,

What else could the committee possibly mean by the words: "If the
objects pointed to are not members of the same aggregate or union
object, the result is undefined ..."  [6.3.8]

[...]
: if compiler writers aren't going to take advantage of that freedom,
: then what's the point? The standard should reflect existing practice.

... and sound semantic definitions.  The standard defines equality of
pointers in terms of "sameness of objects", a concept whose definition
entails sameness of addresses (i.e., equality of pointers).  Not cool!


Tom Payne

Author: msb@sq.com (Mark Brader)
Date: 1997/05/26

> : When I said "If `==' was really intended to be undefined ...",
> : I meant "If the C standard committee had really intended to make
> : the results of `==' undefined ...".
>
> What else could the committee possibly mean by the words: "If the
> objects pointed to are not members of the same aggregate or union
> object, the result is undefined ..."  [6.3.8]

That in those circumstances, the result of <, <=, >, or >= is undefined.
The section defining == (and !=) is 6.3.9.
--
Mark Brader       The "I didn't think of that" type of failure occurs because
SoftQuad Inc.     I didn't think of that, and the reason I didn't think of it
msb@sq.com        is because it never occurred to me.  If we'd been able to
Toronto           think of 'em, we would have.         -- John W. Campbell

My text in this article is in the public domain.

Author: "Paul D. DeRocco" <pderocco@strip_these_words.ix.netcom.com>
Date: 1997/05/27

James Kanze wrote:
>
> Consider the following:
>
>     int         x[ 1 ] ;
>     int         y[ 1 ] ;
>         assert( &x[ 1 ] != &y[ 0 ] ) ;
>
> Logically, one should be able to assume that this assertion never fails,
> since the two pointers refer to different objects.  In fact, the
> assertion will fail in certain cases for every compiler I've ever used.
> I suspect that the reason for the current wording in the C standard is
> an attempt to "legalize" these implementations.
>
> In practice, such problems only occur in two cases: 1) with pointers one
> past the end of an array, and 2) with pointers into string literals.  So
> one possible solution would be to add words to the effect that two
> pointers to different objects will compare unequal, except that if one
> or both of the pointers points to one past the end of an array, or into
> a string literal, the results of the comparison are not defined.  (I
> don't generally like such special cases, but this seems to be an
> adequate guarantee, and corresponds to existing practice.)

Both ambiguities can be resolved without burdening the definition of
pointer comparisons. First, the standard could explicitly state that
x[1] in your example isn't an object. (Perhaps it already does.) Second,
the standard could explicitly state that two string literals may refer
to parts of the same object.

--

Ciao,
Paul

(Please remove the "strip_these_words." prefix from the return
address, which has been altered to foil junk mail senders.)

Author: ark@research.att.com (Andrew Koenig)
Date: 1997/05/28

In article <5mcilv$16d@skylark.ucr.edu> Tom Payne <thp@cs.ucr.edu> writes:

> What else could the committee possibly mean by the words: "If the
> objects pointed to are not members of the same aggregate or union
> object, the result is undefined ..."  [6.3.8]

The intent, which I hope the committee will be able to correct,
is to say that if you compare the addresses of two objects that
are not part of the same aggregate or union, then the results
of == and != are defined (false and true, respectively), but the
results of <, <=, >, and >= are undefined.
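
A minimal sketch of that distinction in standard C++. The function names
are made up for illustration. `==` is used for identity, while ordering
across unrelated objects goes through `std::less`, which the C++ standard
library specifies to yield a total order on pointers even where the
built-in `<` gives no guarantee.

```cpp
#include <cassert>
#include <functional>

// Identity: == on pointers to two distinct live objects yields false,
// and a pointer always compares equal to itself.
inline bool same_object(const int* a, const int* b) { return a == b; }

// Ordering: the built-in < is not reliable across unrelated objects,
// so this sketch uses std::less, which provides a total order.
inline bool ordered_before(const int* a, const int* b) {
    return std::less<const int*>()(a, b);
}

// Two unrelated objects: exactly one of them is ordered before the other.
inline void check() {
    int p = 0;
    int q = 0;
    assert(!same_object(&p, &q));
    assert(ordered_before(&p, &q) != ordered_before(&q, &p));
}
```

Which of the two objects comes first is not predictable; only the
consistency of the order is.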
--
    --Andrew Koenig
      ark@research.att.com
      http://www.research.att.com/info/ark

Author: James Kanze <james-albert.kanze@vx.cit.alcatel.fr>
Date: 1997/05/28

"Paul D. DeRocco" <pderocco@strip_these_words.ix.netcom.com> writes:

 |> James Kanze wrote:
 |> >
 |> > Consider the following:
 |> >
 |> >     int         x[ 1 ] ;
 |> >     int         y[ 1 ] ;
 |> >         assert( &x[ 1 ] != &y[ 0 ] ) ;
 |> >
 |> > Logically, one should be able to assume that this assertion never fails,
 |> > since the two pointers refer to different objects.  In fact, the
 |> > assertion will fail in certain cases for every compiler I've ever used.
 |> > I suspect that the reason for the current wording in the C standard is
 |> > an attempt to "legalize" these implementations.
 |> >
 |> > In practice, such problems only occur in two cases: 1) with pointers one
 |> > past the end of an array, and 2) with pointers into string literals.  So
 |> > one possible solution would be to add words to the effect that two
 |> > pointers to different objects will compare unequal, except that if one
 |> > or both of the pointers points to one past the end of an array, or into
 |> > a string literal, the results of the comparison are not defined.  (I
 |> > don't generally like such special cases, but this seems to be an
 |> > adequate guarantee, and corresponds to existing practice.)
 |>
 |> Both ambiguities can be resolved without burdening the definition of
 |> pointer comparisons. First, the standard could explicitly state that
 |> x[1] in your example isn't an object. (Perhaps it already does.) Second,
 |> the standard could explicitly state that two string literals may refer
 |> to parts of the same object.

Actually, I think you could interpret the current C standard in this
way.  x[1] definitely isn't an object, and the right for string literals
to overlap implies that they may refer to parts of the same object.
Still, I think it would be much clearer if one didn't have to infer this
indirectly, but that it was explicitly stated.  While one can interpret
the current C standard this way, I'm not sure that one has to.

--
James Kanze      home:     kanze@gabi-soft.fr        +33 (0)1 39 55 85 62
                 office:   kanze@vx.cit.alcatel.fr   +33 (0)1 69 63 14 54
GABI Software, Sarl., 22 rue Jacques-Lemercier, F-78000 Versailles France
     -- Conseils en informatique industrielle --

Author: Julian Pardoe <pardoej@lonnds.ml.com>
Date: 1997/05/28

James Kanze wrote:

> In practice, such problems only occur in two cases: 1) with pointers one
> past the end of an array, and 2) with pointers into string literals.
> So one possible solution would be to add words to the effect that two
> pointers to different objects will compare unequal, except that if one
> or both of the pointers points to one past the end of an array, or into
> a string literal, the results of the comparison are not defined.  (I
> don't generally like such special cases, but this seems to be an
> adequate guarantee, and corresponds to existing practice.)

This solution seems adequate.  Past-the-end pointers are a bit odd,
so having special rules for them isn't too bad.  As for string literals,
are they a special problem anyway?  We cannot tell whether two occurrences
of the literal string "fred" denote the same object or two different objects.

[In other words
    foo ("fred", "fred");
might be equivalent to
    static char __lit_fred[] = { 'f', 'r', 'e', 'd', '\0' };
    foo (__lit_fred, __lit_fred);
or to
    static char __lit_fred1[] = { 'f', 'r', 'e', 'd', '\0' };
    static char __lit_fred2[] = { 'f', 'r', 'e', 'd', '\0' };
    foo (__lit_fred1, __lit_fred2);
.]

If string literals show any problems for pointer comparison, can't this
be justified on these grounds, which are already covered by the standard?
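
A minimal sketch of the consequence, with made-up function names. Whether
two identical literals share storage is left to the implementation, so the
pointer comparison may go either way; comparing the characters does not
depend on that choice.

```cpp
#include <cassert>
#include <cstring>

// May return true or false: the two literals may or may not be the
// same array, depending on the implementation.
inline bool literals_share_storage() {
    const char* a = "fred";
    const char* b = "fred";
    return a == b;
}

// Compares the text rather than the addresses.
inline bool same_text(const char* a, const char* b) {
    return std::strcmp(a, b) == 0;
}

inline void check() {
    assert(same_text("fred", "fred"));   // holds either way
}
```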

-- jP --

Author: Julian Pardoe <pardoej@lonnds.ml.com>
Date: 1997/05/28

Paul D. DeRocco wrote:
>
> Clive D.W. Feather wrote:
> >
> > In article <3383ED66.5622@strip_these_words.ix.netcom.com>, "Paul D.
> > DeRocco" <pderocco@strip_these_words.ix.netcom.com> writes
> >
> > >Does casting a long to a far pointer create a non-conforming program?
> > >For instance, (int far*)0x00000400 and (int far*)0x00400000
> >
> > Use of far (or even __far) makes you not strictly conforming.
>
> True. But I don't actually have to write "far" to provoke the problem.
> If I'm compiling in a large memory model, then (int*)0x00000400 and
> (int*)0x00400000 will point to the same byte (under MS-DOS). The
> question remains, does casting a number to a pointer render a program
> nonconforming?

Clive Feather says the answer is "yes" (i.e. the program is not _strictly_
conforming).  However, your question can be answered even if casting
pointers to numbers and back is allowed.  Casting a pointer to a number and
back should not change its representation so (as long as the compiler only
issues pointers in some "canonical" form) casting should have no effect.
It may be true that (int*)0x00000400 and (int*)0x00400000 will point to the
same byte (under MS-DOS), but one of those representations must be non-
canonical and the only way you can get one of those is by manipulating
the pointer representation, which is certainly not sanctioned by the
Standard.
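
A minimal sketch of the round trip, using `std::uintptr_t` from present-day
C++ (an optional type that was not available in 1997; it is assumed to
exist on the target). The function name is made up for illustration. A
pointer that came from the implementation is converted to an integer and
back unchanged, so it is expected to compare equal to the original.

```cpp
#include <cassert>
#include <cstdint>

// Converts a pointer to an integer and back without touching the value.
inline int* round_trip(int* p) {
    std::uintptr_t n = reinterpret_cast<std::uintptr_t>(p);
    return reinterpret_cast<int*>(n);
}

inline void check() {
    int v = 7;
    assert(round_trip(&v) == &v);   // same object after the round trip
}
```

Doing arithmetic on `n` before converting back is the kind of manipulation
of the representation that falls outside this guarantee.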

-- jP --

Author: Tom Payne <thp@cs.ucr.edu>
Date: 1997/05/28

In comp.std.c++ Fergus Henderson <fjh@mundook.cs.mu.OZ.AU> wrote:
[...]
: When I said "If `==' was really intended to be undefined ...",
: I meant "If the C standard committee had really intended to make
: the results of `==' undefined ...".
[...]
: Hypothetically assuming that the C standard committee had really
: intended to make the results of `==' on pointers to unrelated objects
: undefined,

I used to subscribe to that hypothesis, but 6.3.9 (Equality operators)
states:

   "Constraints

        One of the following shall hold:

   ---  ... ;

   ---  both operands are pointers to qualified or unqualified
        versions of compatible types;

   ---  ... ; or

   ---  ... ."

In particular, it makes no restriction that the referents of these
pointers be related.  Granted, 6.3.9 goes on to say:

   Where the operands have types and values suitable for the
   relational operators, the semantics detailed in 6.3.8 apply.

But unrelated objects apparently are not "suitable for the relational
operator" ( <, >, <= and >= ), since, according to 6.3.8:

   If the objects pointed to are not members of the same aggregate
   or union object, the result is undefined. ...

I thought that the above 6.3.8 clause applied to equality operators,
since 6.3.8 goes on to discuss conditions under which pointers
"compare equal".  A more careful reading indicates that it applies
only to the relational operators, which seem to have a strictly
smaller domain than the equality operators.

Tom Payne

Author: "Paul D. DeRocco" <pderocco@strip_these_words.ix.netcom.com>
Date: 1997/05/22

Fergus Henderson wrote:

> I'm pretty sure that all the 16-bit x86 implementations have the following
> properties:
>
>         1.  There is no way for a strictly conforming program to create
>             two pointers to the same address but have different segments.

Does casting a long to a far pointer create a non-conforming program?
For instance, (int far*)0x00000400 and (int far*)0x00400000 both point
to the start of the BIOS data area on a DOS machine.
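
A minimal sketch of why those two values alias. It assumes the layout used
by 16-bit DOS compilers, with the segment in the high 16 bits of the cast
value and the offset in the low 16 bits; `FarPtr` and the function names
are hypothetical stand-ins, not compiler types. Address wraparound at 1 MB
is ignored here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a 16-bit far pointer.
struct FarPtr {
    std::uint16_t seg;
    std::uint16_t off;
};

// Splits a 32-bit value the way the cast is assumed to.
inline FarPtr from_long(std::uint32_t v) {
    return FarPtr{static_cast<std::uint16_t>(v >> 16),
                  static_cast<std::uint16_t>(v & 0xFFFFu)};
}

// Compares the representation: both segment and offset.
inline bool same_representation(FarPtr a, FarPtr b) {
    return a.seg == b.seg && a.off == b.off;
}

// Compares the byte actually addressed in real mode (segment * 16 + offset).
inline bool same_byte(FarPtr a, FarPtr b) {
    auto phys = [](FarPtr p) {
        return (static_cast<std::uint32_t>(p.seg) << 4) + p.off;
    };
    return phys(a) == phys(b);
}

inline void check() {
    FarPtr a = from_long(0x00000400u);   // 0000:0400
    FarPtr b = from_long(0x00400000u);   // 0040:0000
    assert(!same_representation(a, b));  // different segment/offset pairs
    assert(same_byte(a, b));             // both reach physical 0x400
}
```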

> So I don't buy your explanation.  If `==' was really intended
> to be undefined when comparing pointers to unrelated objects, why do
> all 16-bit x86 implementations compare both segment and offset with `==',
> rather than just comparing the offset (which would be more efficient)?

What does "intended to be undefined" mean? If something is undefined in
the standard, it doesn't mean that a particular implementation cannot
produce a definable, coherent result. It doesn't even prohibit an
implementation from defining the result. It just means that the standard
writers don't define it.

--

Ciao,
Paul

(Please remove the "strip_these_words." prefix from the return
address, which has been altered to foil junk mail senders.)

Author: "Paul D. DeRocco" <pderocco@strip_these_words.ix.netcom.com>
Date: 1997/05/22

Konstantin Baumann wrote:
>
> So it is not possible to check (e.g. for the assignment-operator of a
> class) if two given objects are the same?
>
> class A {
>         ...
>         A& operator=(const A& other) {
>                 if(this != &other) { ... }
>                 return(*this);
>         }
>         ...
> };

The only way to get two unequal pointers to the same memory location is
in a segmented architecture, and I don't believe any implementation will
ever produce such a result unless you coax it into doing so. If two
pointers have different segment parts, then the only way they can point
to the same location is if you've gone off the end of an array (which
the language says produces undefined results anyway) or used casts to do
magic pointer arithmetic, in which case you're on your own. In normal
programming, the above is perfectly safe.
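
A minimal sketch of the idiom in a complete class. `Buffer` and its
members are hypothetical, chosen only so that the guard has something to
protect; the elided parts of the quoted class are not reconstructed here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical class owning a heap buffer; it exists only to show the guard.
class Buffer {
public:
    explicit Buffer(const char* s)
        : len_(std::strlen(s)), data_(new char[len_ + 1]) {
        std::memcpy(data_, s, len_ + 1);
    }
    Buffer(const Buffer& other)
        : len_(other.len_), data_(new char[len_ + 1]) {
        std::memcpy(data_, other.data_, len_ + 1);
    }
    Buffer& operator=(const Buffer& other) {
        if (this != &other) {   // same object: nothing to copy
            char* fresh = new char[other.len_ + 1];
            std::memcpy(fresh, other.data_, other.len_ + 1);
            delete[] data_;
            data_ = fresh;
            len_ = other.len_;
        }
        return *this;
    }
    ~Buffer() { delete[] data_; }
    const char* c_str() const { return data_; }

private:
    std::size_t len_;   // declared first, so it is initialized first
    char* data_;
};

inline void check() {
    Buffer a("abc");
    Buffer& alias = a;
    a = alias;   // self-assignment through a reference
    assert(std::strcmp(a.c_str(), "abc") == 0);
}
```

Both operands of `!=` here point to complete, live objects of the same
type, which is the ordinary case the thread treats as safe.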

--

Ciao,
Paul

(Please remove the "strip_these_words." prefix from the return
address, which has been altered to foil junk mail senders.)

Author: fjh@mundook.cs.mu.OZ.AU (Fergus Henderson)
Date: 1997/05/22

"Paul D. DeRocco" <pderocco@strip_these_words.ix.netcom.com> writes:

>Fergus Henderson wrote:
>
>> I'm pretty sure that all the 16-bit x86 implementations have the following
>> properties:
>>
>>         1.  There is no way for a strictly conforming program to create
>>             two pointers to the same address but have different segments.
>
>Does casting a long to a far pointer create a non-conforming program?

No, but it does create a not-strictly-conforming program.

The notion "conforming program" defined in the C standard has been shown
to be basically useless, since all FORTRAN programs are "conforming"
by that definition.

The notion "strictly conforming", also defined in the C standard,
is much more useful.

>> So I don't buy your explanation.  If `==' was really intended
>> to be undefined when comparing pointers to unrelated objects, why do
>> all 16-bit x86 implementations compare both segment and offset with `==',
>> rather than just comparing the offset (which would be more efficient)?
>
>What does "intended to be undefined" mean?

When I said "If `==' was really intended to be undefined ...",
I meant "If the C standard committee had really intended to make
the results of `==' undefined ...".

>If something is undefined in
>the standard, it doesn't mean that a particular implementation cannot
>produce a definable, coherent result. It doesn't even prohibit an
>implementation from defining the result. It just means that the standard
>writers don't define it.

Sure, but look at the possible answers to my rhetorical question above:

Hypothetically assuming that the C standard committee had really
intended to make the results of `==' on pointers to unrelated objects
undefined, the reason that all 16-bit x86 implementations compare both
segment and offset with `==', rather than just comparing the offset,
was:

 1.  The 16-bit x86 C implementors just misunderstood the standard.
 2.  The 16-bit x86 C implementors understood the standard, but
     didn't realize the possible efficiency gains that could be
     had from taking advantage of this undefined behaviour.
 3.  The 16-bit x86 C implementors understood the standard, and
     understood the possible efficiency gains, but decided to
     forgo them.

I don't find any of these answers very likely.
Certainly `2' is quite implausible.  `1' is possible I guess, but
you'd think that at least some of the x86 C implementors would have
some people who understood the standard.  In fact I would expect
at least some of the x86 C implementors to have been participating
in the standardization process.  Which makes 3 fairly implausible too:
why would the committee make something undefined if the implementors
that could benefit are going to forgo that benefit anyway?
There is no point in making something undefined in the standard
if _every_ implementation actually defines it, even including those
ones that might potentially benefit from not defining it.
The only advantage of leaving something as undefined behaviour is that
it gives compiler writers more freedom.  But if compiler writers aren't
going to take advantage of that freedom, then what's the point?
The standard should reflect existing practice.

--
Fergus Henderson <fjh@cs.mu.oz.au>   |  "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh>   |  of excellence is a lethal habit"
PGP: finger fjh@128.250.37.3         |     -- the last words of T. S. Garp.

Author: smayo@ziplink.net (Scott Mayo)
Date: 1997/05/22

<> > For example, given
<> >
<> >       int x, y;
<> >       if (&x == &y) ...
<>
<> As I understand it, the reason that == between pointers is undefined is
<> because there is no guarantee of a one-to-one mapping from pointers to
<> integers. Because of 8086/80286 like segmented memory models where a pointer
<> consists of a segment address and an offset. Two different segment/offset
<> pairs can point to the same physical address. Furthermore, casting a pointer
<> to an int, doing arithmetic, and casting back to a pointer will not work
<> reliably in such a model.

I know these comparisons are not supported by the standard.

But, == and != desperately need to be special-cased by the
standard, and allowed to return whether the two pointers
point to the same object or not. This is an utterly fundamental concept.

I know the real life answer. Segmented memory models are still around,
but not generally the focus of new development in compilers; so NO
company is going to support a standards change that makes them reach
back into old compilers and force them to normalise pointers before
such a comparison. And someone will point out a piece of code that
relies on the fact that identical pointers with different representations
compare unequal. And someone else will point out that making == work
while > cannot is evil. And someone else again will point out
that normalizing pointers behind the scenes violates the spirit of C;
though I'd argue that the original spirit is off howling among the
gravestones because you can't compare two arbitrary pointers with
impunity anymore. :) So it won't get fixed.

C++ has no excuse. It should guarantee == and != across any two
pointers. Then I will be able to take my almost-ANSI-C code and
compile it with a C++ compiler, which I will be able to view
as an almost-ANSI-C-compiler-with-quirks, and stop worrying about
the vast damage done by the ancient hardware designer who thought that
overlapping segments were a nifty idea. :(

Author: "Clive D.W. Feather" <clive@on-the-train.demon.co.uk>
Date: 1997/05/23

In article <3383ED66.5622@strip_these_words.ix.netcom.com>, "Paul D.
DeRocco" <pderocco@strip_these_words.ix.netcom.com> writes
>> I'm pretty sure that all the 16-bit x86 implementations have the following
>> properties:
>>         1.  There is no way for a strictly conforming program to create
>>             two pointers to the same address but have different segments.
>Does casting a long to a far pointer create a non-conforming program?
>For instance, (int far*)0x00000400 and (int far*)0x00400000

Use of far (or even __far) makes you not strictly conforming.

>> So I don't buy your explanation.  If `==' was really intended
>> to be undefined when comparing pointers to unrelated objects, why do
>> all 16-bit x86 implementations compare both segment and offset with `==',
>> rather than just comparing the offset (which would be more efficient)?
>What does "intended to be undefined" mean? If something is undefined in
>the standard, it doesn't mean that a particular implementation cannot
>produce a definable, coherent result. It doesn't even prohibit an
>implementation from defining the result. It just means that the standard
>writers don't define it.

The intention of the Standard is:

    < <= > and >= can only be used on pointers to the same array
    == and != can be used on *any* two pointers

That is, the latter does not have undefined behaviour for any two valid
pointers that can be compared, while the former does.

--
Clive D.W. Feather    | Director of Software Development  | Home email:
Tel: +44 181 371 1138 | Demon Internet Ltd.               | <clive@davros.org>
Fax: +44 181 371 1037 | <clive@demon.net>                 | Abuse:
Written on my laptop; please observe the Reply-To address | <clive@bofh.org>

Author: dawks@best.com (phil dawkins)
Date: 1997/05/23

"Paul D. DeRocco" <pderocco@strip_these_words.ix.netcom.com> wrote:
>Konstantin Baumann wrote:
>> So it is not possible to check (e.g. for the assignment-operator of a
>> class) if two given objects are the same?
>> class A {
>>         ...
>>         A& operator=(const A& other) {
>>                 if(this != &other) { ... }
>>                 return(*this);
>>         }
>>         ...
>> };

>The only way to get two unequal pointers to the same memory location is
>in a segmented architecture, and I don't believe any implementation will
>ever produce such a result unless you coax it into doing so.

This is it really. On an Intel x86 architecture (8086...Pentium and
beyond I suspect) running in "real" mode, an address is 20 bits,
formed from a 16-bit segment and a 16-bit offset. The segment is
shifted left by four bits before the offset is added in a 20-bit add.
The result is aliasing, where most actual addresses can be referenced
by multiple segment/offset pairs, e.g. 0x0040:0x0000 == 0x0000:0x0400
== 0x0004:0x03c0 (I think).
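
A minimal sketch of that arithmetic. `SegOff`, `linear` and `normalize`
are hypothetical names, and the canonical form chosen here (offset kept
below 16) is only one possible convention, not something the thread
attributes to any particular compiler.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a real-mode segment:offset pair.
struct SegOff {
    std::uint16_t seg;
    std::uint16_t off;
};

// Physical address: segment * 16 + offset, truncated to 20 bits
// (the original 8086 wraps around at 1 MB).
inline std::uint32_t linear(SegOff p) {
    return ((static_cast<std::uint32_t>(p.seg) << 4) + p.off) & 0xFFFFFu;
}

// One possible canonical form: the largest segment, with offset 0..15.
inline SegOff normalize(SegOff p) {
    const std::uint32_t a = linear(p);
    return SegOff{static_cast<std::uint16_t>(a >> 4),
                  static_cast<std::uint16_t>(a & 0xFu)};
}

// The three pairs from the post all reach physical address 0x400.
inline void check() {
    assert(linear(SegOff{0x0040, 0x0000}) == 0x400u);
    assert(linear(SegOff{0x0000, 0x0400}) == 0x400u);
    assert(linear(SegOff{0x0004, 0x03c0}) == 0x400u);
}
```

After `normalize`, two pairs that address the same byte have the same
segment and the same offset, so comparing both halves gives the expected
answer.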

This sort of punning is avoided by the compiler defining memory
models. Even in most assemblers one can use these models by loading a
segment register and leaving it alone while manipulating offset
registers and there are proprietary extensions to compilers allowing
casting to construct larger pointers from smaller or vice versa.

AFAIK, most who indulge in this somewhat esoteric activity of naughty
pointer manipulation do it for a good or bad reason and know that they
are doing it. Attempting to support it in a standard manner in a
compiler would probably be a horrifying concept and cause serious
damage. Due to the uptake of embedded PCs, many of which are run in
real mode for a variety of reasons, together with the massive interest
in C++ for embedded systems, the issue won't go away, but it is really
a non-issue. It is like the guy who went to a doctor and said "doctor,
it hurts when I do this" and the doctor said "well, stop doing that".

phil.

--
The world is divided into two sorts of people: those that believe the
world is divided into two sorts of people, and those that don't.

Author: "Paul D. DeRocco" <pderocco@strip_these_words.ix.netcom.com>
Date: 1997/05/23 Raw View

Fergus Henderson wrote:

> Hypothetically assuming that the C standard committee had really
> intended to make the results of `==' on pointers to unrelated objects
> undefined, the reason that all 16-bit x86 implementations compare both
> segment and offset with `==', rather than just comparing the offset,
> was:
>
>         1.  The 16-bit x86 C implementors just misunderstood the standard.
>         2.  The 16-bit x86 C implementors understood the standard, but
>             didn't realize the possible efficiency gains that could be
>             had from taking advantage of this undefined behaviour.
>         3.  The 16-bit x86 C implementors understood the standard, and
>             understood the possible efficiency gains, but decided to
>             forgo them.
>
> I don't find any of these answers very likely.
> Certainly `2' is quite implausible.  `1' is possible I guess, but
> you'd think that at least some of the x86 C implementors would have
> some people who understood the standard.  In fact I would expect
> at least some of the x86 C implementors to have been participating
> in the standardization process.  Which makes 3 fairly implausible too:
> why would the committee make something undefined if the implementors
> that could benefit are going to forgo that benefit anyway?
> There is no point in making something undefined in the standard
> if _every_ implementation actually defines it, even including those
> ones that might potentially benefit from not defining it.
> The only advantage of leaving something as undefined behaviour is that
> it gives compiler writers more freedom.  But if compiler writers aren't
> going to take advantage of that freedom, then what's the point?
> The standard should reflect existing practice.

I think that perhaps the implementors decided that the standard was
wrong, and that it was important to define what the standard leaves
undefined. At least that's my opinion. It would be disastrous to allow
pointers to two different objects to compare equal. It is absolutely
essential, having created a couple of actual objects, to be able to
compare pointers to them and see that they aren't equal. And the failure
of comparing just the offsets isn't even rare: the typical large-model
memory allocator in a DOS program will _always_ return a far pointer
with an offset part of 4 or 8 and a different segment address.
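
To make the allocator case concrete, here is a rough C++ sketch. FarPtr,
equal_full and equal_offset_only are hypothetical names, and the segment
values are invented; only the "same small offset, different segment"
pattern is taken from the paragraph above.

```cpp
#include <cstdint>

// Hypothetical model of a large-model far pointer.
struct FarPtr {
    std::uint16_t segment;
    std::uint16_t offset;
};

// Equality that looks at both halves of the pointer.
constexpr bool equal_full(FarPtr a, FarPtr b) {
    return a.segment == b.segment && a.offset == b.offset;
}

// The cheaper comparison under discussion: offsets only.
constexpr bool equal_offset_only(FarPtr a, FarPtr b) {
    return a.offset == b.offset;
}

// Two distinct blocks as an allocator of this kind might return them:
// different segments, identical offset (illustrative values).
constexpr FarPtr block_a{0x1A00, 0x0004};
constexpr FarPtr block_b{0x1B40, 0x0004};

static_assert(!equal_full(block_a, block_b),
              "full comparison tells the blocks apart");
static_assert(equal_offset_only(block_a, block_b),
              "offset-only comparison calls them equal");
```

With offsets alone, every pair of blocks from such an allocator would
compare equal.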

Note that all examples of aliasing that people have come up with so far
involve the creation of a pointer that isn't strictly speaking a pointer
to an object, in the normal sense of the word. A pointer that points
past the end of an array doesn't point to an object, at least in the C++
standard (I don't have a copy of the C standard), and a segmented
pointer that is mangled by pointer arithmetic is already outside the
standard anyway, so there is no reason why the fact that they might
compare unequal when they point to the same memory location should scare
the standard committee away from defining the equality of pointers to
different objects.

--

Ciao,
Paul

(Please remove the "strip_these_words." prefix from the return
address, which has been altered to foil junk mail senders.)

Author: "Paul D. DeRocco" <pderocco@strip_these_words.ix.netcom.com>
Date: 1997/05/24 Raw View

Clive D.W. Feather wrote:
>
> In article <3383ED66.5622@strip_these_words.ix.netcom.com>, "Paul D.
> DeRocco" <pderocco@strip_these_words.ix.netcom.com> writes
>
> >Does casting a long to a far pointer create a non-conforming program?
> >For instance, (int far*)0x00000400 and (int far*)0x00400000
>
> Use of far (or even __far) makes you not strictly conforming.

True. But I don't actually have to write "far" to provoke the problem.
If I'm compiling in a large memory model, then (int*)0x00000400 and
(int*)0x00400000 will point to the same byte (under MS-DOS). The
question remains, does casting a number to a pointer render a program
nonconforming?

--

Ciao,
Paul

(Please remove the "strip_these_words." prefix from the return
address, which has been altered to foil junk mail senders.)

Author: "Clive D.W. Feather" <clive@on-the-train.demon.co.uk>
Date: 1997/05/25 Raw View

In article <33869FD9.193E@strip_these_words.ix.netcom.com>, "Paul D.
DeRocco" <pderocco@strip_these_words.ix.netcom.com> writes
>If I'm compiling in a large memory model, then (int*)0x00000400 and
>(int*)0x00400000 will point to the same byte (under MS-DOS). The
>question remains, does casting a number to a pointer render a program
>nonconforming?

Non strictly conforming, yes.

--
Clive D.W. Feather    | Director of Software Development  | Home email:
Tel: +44 181 371 1138 | Demon Internet Ltd.               | <clive@davros.org>
Fax: +44 181 371 1037 | <clive@demon.net>                 | Abuse:
Written on my laptop; please observe the Reply-To address | <clive@bofh.org>

Author: fjh@murlibobo.cs.mu.OZ.AU (Fergus Henderson)
Date: 1997/05/20 Raw View

Jeffrey Mark Siskind <qobi@ai.emba.uvm.edu> writes:

>fjh@mundook.cs.mu.OZ.AU (Fergus Henderson) writes:
>
>> Neither the C standard nor the C++ draft working
>> paper guarantee that pointer equality tests will reflect object identity.
>
>As I understand it, the reason that == between pointers is undefined is
>that there is no guarantee of a one-to-one mapping from pointers to
>integers. This comes from 8086/80286-like segmented memory models, where a
>pointer consists of a segment address and an offset. Two different
>segment/offset pairs can point to the same physical address.

I'm pretty sure that all the 16-bit x86 implementations have the following
properties:

 1.  There is no way for a strictly conforming program to create
     two pointers that point to the same address but have different
     segments.

 2.  The `==' and `!=' operators are implemented by comparing both the
     segment and offset, whereas the `<', `<=', `>', and `>=' operators
     are implemented by comparing only the offset.

Thus in all of these implementations, `==' does reflect object identity, at
least for strictly conforming programs.
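
A rough C++ model of property 2 follows. SegOff, ptr_equal, ptr_less and
linear are hypothetical names and the pointer values are invented; this
sketches the behaviour described above, not any particular compiler's
generated code.

```cpp
#include <cstdint>

// Hypothetical model of a 16-bit x86 large-model pointer.
struct SegOff {
    std::uint16_t segment;
    std::uint16_t offset;
};

// `==' looks at both segment and offset.
constexpr bool ptr_equal(SegOff a, SegOff b) {
    return a.segment == b.segment && a.offset == b.offset;
}

// `<' looks at the offset only.
constexpr bool ptr_less(SegOff a, SegOff b) {
    return a.offset < b.offset;
}

// 20-bit linear address, used here only to check the results.
constexpr std::uint32_t linear(SegOff p) {
    return (static_cast<std::uint32_t>(p.segment) << 4) + p.offset;
}

// Two elements of one array share a segment, so offsets order them.
constexpr SegOff elem0{0x2000, 0x0010};
constexpr SegOff elem3{0x2000, 0x0016};
static_assert(ptr_less(elem0, elem3) && linear(elem0) < linear(elem3),
              "`<' agrees with memory order inside one object");

// Unrelated objects: `==' still tells them apart, while `<' says
// obj_x comes first although it sits higher in memory.
constexpr SegOff obj_x{0x3000, 0x0004};
constexpr SegOff obj_y{0x1000, 0x0008};
static_assert(!ptr_equal(obj_x, obj_y), "`==' distinguishes the objects");
static_assert(ptr_less(obj_x, obj_y) && linear(obj_x) > linear(obj_y),
              "`<' is meaningless across objects");
```

In this model the relational operators are only trustworthy within one
object, while equality stays reliable across objects.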

So I don't buy your explanation.  If `==' was really intended
to be undefined when comparing pointers to unrelated objects, why do
all 16-bit x86 implementations compare both segment and offset with `==',
rather than just comparing the offset (which would be more efficient)?

The text of the C standard and C++ DWP certainly doesn't reflect existing
practice; I think these standards are just defective, and ought to be fixed.

--
Fergus Henderson <fjh@cs.mu.oz.au>   |  "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh>   |  of excellence is a lethal habit"
PGP: finger fjh@128.250.37.3         |     -- the last words of T. S. Garp.

Author: Jeffrey Mark Siskind <qobi@ai.emba.uvm.edu>
Date: 1997/05/20 Raw View

fjh@mundook.cs.mu.OZ.AU (Fergus Henderson) writes:

> Ironically, although C and C++ do in practice have the memory model that
> Alexander Stepanov wanted, this is not actually guaranteed by their
> respective standards!  Neither the C standard nor the C++ draft working
> paper guarantee that pointer equality tests will reflect object identity.
>
> For example, given
>
>  int x, y;
>  if (&x == &y) ...

As I understand it, the reason that == between pointers is undefined is
that there is no guarantee of a one-to-one mapping from pointers to
integers. This comes from 8086/80286-like segmented memory models, where a
pointer consists of a segment address and an offset. Two different
segment/offset pairs can point to the same physical address. Furthermore,
casting a pointer to an int, doing arithmetic, and casting back to a pointer
will not work reliably in such a model.

People also forget that there is no guarantee that pointers are all of the
same size, or that there is an integer type of the same size as pointers.
That is an artifact of architectures whose word size is a multiple of 8. But
on the PDP-10, and yes there were C compilers for the PDP-10, an int * might
be 18 bits (a half word) while a char * might be 36 bits (a `byte pointer' in
PDP-10 parlance). And incrementing a PDP-10 byte pointer as if it were a
36-bit integer would not make it point to the next character. A special
`increment byte pointer' instruction is needed.

As I understand it, the vast majority of C programs that cast between ints and
pointers, or that do == between pointers, are not ANSI compliant and not
portable.

    Jeff (home page http://www.emba.uvm.edu/~qobi)

Author: Steve Clamage <stephen.clamage@Eng.Sun.COM>
Date: 1997/05/21 Raw View

Konstantin Baumann wrote:
>
> > fjh@mundook.cs.mu.OZ.AU (Fergus Henderson) writes:
> >
> > > Ironically, although C and C++ do in practice have the memory model that
> > > Alexander Stepanov wanted, this is not actually guaranteed by their
> > > respective standards!  Neither the C standard nor the C++ draft working
> > > paper guarantee that pointer equality tests will reflect object identity.
> > >
> > > For example, given
> > >
> > >       int x, y;
> > >       if (&x == &y) ...
> > As I understand it, the vast majority of C programs that cast between ints and
> > pointers, or that do == between pointers, are not ANSI compliant and not
> > portable.
>
> So it is not possible to check (e.g. for the assignment-operator of a
> class) if two given objects are the same?
>
> class A {
>         ...
>         A& operator=(const A& other) {
>                 if(this != &other) { ... }
>                 return(*this);
>         }
>         ...
> };

That would be one of many bad consequences. There seems to be much
agreement at this point among members of the C and C++ committees
that we have an oversight in the wording in the standards, not a
deliberate exclusion of comparing unrelated pointers for equality.

I believe the C++ draft standard will be corrected before being
issued as a final standard. The C Committee is currently updating
the C standard, so perhaps it will also be corrected or made
more explicit.

--
Steve Clamage, stephen.clamage@eng.sun.com

Author: "Konstantin Baumann" <kostab@uni-muenster.de>
Date: 1997/05/21 Raw View

Jeffrey Mark Siskind wrote:
>
> fjh@mundook.cs.mu.OZ.AU (Fergus Henderson) writes:
>
> > Ironically, although C and C++ do in practice have the memory model that
> > Alexander Stepanov wanted, this is not actually guaranteed by their
> > respective standards!  Neither the C standard nor the C++ draft working
> > paper guarantee that pointer equality tests will reflect object identity.
> >
> > For example, given
> >
> >       int x, y;
> >       if (&x == &y) ...
>
> As I understand it, the reason that == between pointers is undefined is
> that there is no guarantee of a one-to-one mapping from pointers to
> integers. This comes from 8086/80286-like segmented memory models, where a
> pointer consists of a segment address and an offset. Two different
> segment/offset pairs can point to the same physical address. Furthermore,
> casting a pointer to an int, doing arithmetic, and casting back to a pointer
> will not work reliably in such a model.
>
> [deleted]
>
> As I understand it, the vast majority of C programs that cast between ints and
> pointers, or that do == between pointers, are not ANSI compliant and not
> portable.

So it is not possible to check (e.g. for the assignment-operator of a
class) if two given objects are the same?

class A {
 ...
 A& operator=(const A& other) {
  if(this != &other) { ... }
  return(*this);
 }
 ...
};
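
For concreteness, a filled-in version of that idiom might look like the
sketch below. The members size_ and data_, the two-argument constructor,
and the helper self_assignment_is_harmless are invented for illustration;
the elided parts of the class above are not known.

```cpp
#include <cassert>
#include <cstddef>

class A {
public:
    A(std::size_t n, int value) : size_(n), data_(new int[n]) {
        for (std::size_t i = 0; i < n; ++i) data_[i] = value;
    }
    A(const A& other) : size_(other.size_), data_(new int[other.size_]) {
        for (std::size_t i = 0; i < size_; ++i) data_[i] = other.data_[i];
    }
    ~A() { delete[] data_; }

    A& operator=(const A& other) {
        // Without this check, a = a would free data_ and then copy
        // from the freed buffer.
        if (this != &other) {
            delete[] data_;
            size_ = other.size_;
            data_ = new int[size_];  // not exception-safe; kept short
            for (std::size_t i = 0; i < size_; ++i) data_[i] = other.data_[i];
        }
        return *this;
    }

    std::size_t size() const { return size_; }
    int at(std::size_t i) const {
        assert(i < size_);
        return data_[i];
    }

private:
    std::size_t size_;
    int* data_;
};

// Assigning through an alias of the same object leaves it intact.
inline bool self_assignment_is_harmless() {
    A a(3, 7);
    A& alias = a;
    a = alias;
    return a.size() == 3 && a.at(0) == 7 && a.at(2) == 7;
}
```

The whole idiom rests on this != &other being false exactly when both
refer to the same object.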

Kosta