Thread

Topic: Is &array[array_size] well defined for all types ?

Author: "Heinz Ozwirk" <wansor42@gmx.de>
Date: Wed, 12 Jun 2002 18:24:37 GMT

"Mogens Hansen" <mogens_h@dk-online.dk> wrote in message
news:ae5efv$a4q$1@news.cybercity.dk...
> Is
>   const size_t   array_size = 4;              // any number
>   T              array[array_size];           // T has a default constructor
>   T*             pt = &array[array_size];     // do not deref pointer
> guaranteed to be well defined for all types of T (which has a default
> constructor) ?
>
> In particular, is it guaranteed to be well defined if T overloads
> "operator&" ?
>
> class T
> {
> public:
>     T*  operator&(void)
>         {    return this;    }    // perhaps do something more
> };
>
>
> The book
>   The C++ Programming Language, Third Edition/Special Edition
>   Bjarne Stroustrup
> page 91/92, says that taking a pointer to the element one beyond the end of
> an array is guaranteed to work.

Taking the address of an element is something different from calling
operator&. If you somehow get the address of an element in an array (a
pointer to an element), you can manipulate this pointer to point one beyond
the end of the array. For example, the following code is valid for all types
T:

    T array[size];
    T* oneBeyond1 = reinterpret_cast<T*>(&array) + size;

As long as no padding is involved, this might also be valid (but I wouldn't
dare to write it in real code):

    T* oneBeyond2 = reinterpret_cast<T*>(&array + 1);

In both cases T::operator&() will not be called, because & is applied to
the array and not to an element of type T. The built-in operator is used
instead, and it yields a pointer to the array, of type T (*)[size].
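
A minimal sketch of the same idea without any cast, relying only on
array-to-pointer decay and pointer arithmetic. The names Tracked,
one_past_end and calls_after_taking_end are hypothetical and exist only for
this illustration; the counter is there to show that the overloaded
operator& is never invoked.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical type for illustration: counts calls to its overloaded operator&.
struct Tracked {
    static int address_of_calls;
    Tracked* operator&() { ++address_of_calls; return this; }
};
int Tracked::address_of_calls = 0;

// Returns the one-past-the-end pointer using only array-to-pointer decay and
// pointer arithmetic. No element is named, so no operator& can be called.
template <typename T, std::size_t N>
T* one_past_end(T (&array)[N]) {
    return array + N;
}

// Builds an array, takes both ends by decay and arithmetic, and returns how
// many times Tracked::operator& ran along the way.
int calls_after_taking_end() {
    Tracked items[4];
    Tracked* begin = items;               // decay, no operator&
    Tracked* end = one_past_end(items);   // arithmetic, no operator&
    assert(end - begin == 4);
    return Tracked::address_of_calls;
}
```

Calling calls_after_taking_end() should leave the counter untouched, since
the element type's operator& never enters the picture.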

> The example is a built-in type - int.
>
> The C++ Standard, §5.2.1, says that
>   E1[E2]
> by definition is identical to
>   *((E1)+(E2))
> which means that
>   T*                pt = &array[array_size];
> is identical to
>   T*                pt = &*(array + array_size);
> which I think is identical to
>   T*               pt = &(*(array + array_size));
> because & and * have the same precedence and they are right-associative,
> which means that it is _not_ identical to
>   T*               pt = array + array_size;
>
> I've been told (I do not have a copy) that C99, §6.5.3.2#3 says that
>   T*               pt = &array[array_size];
> is identical to
>   T*               pt = array + array_size;
> Does C++ have a similar rule?

I haven't found it explicitly stated in the standard, but as long as the
default implementation of T::operator&() is used, it should be possible to
deduce that &* (and *&) are identity operations. Otherwise pointers in C++
would be close to useless.

Regards
    Heinz


---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]

Author: "Mogens Hansen" <mogens_h@dk-online.dk>
Date: Tue, 11 Jun 2002 13:24:35 CST

Is
  const size_t   array_size = 4;              // any number
  T              array[array_size];           // T has a default constructor
  T*             pt = &array[array_size];     // do not deref pointer
guaranteed to be well defined for all types of T (which has a default
constructor) ?

In particular, is it guaranteed to be well defined if T overloads
"operator&" ?

class T
{
public:
    T*  operator&(void)
        {    return this;    }    // perhaps do something more
};


The book
  The C++ Programming Language, Third Edition/Special Edition
  Bjarne Stroustrup
page 91/92, says that taking a pointer to the element one beyond the end of
an array is guaranteed to work.
The example is a built-in type - int.

The C++ Standard, §5.2.1, says that
  E1[E2]
by definition is identical to
  *((E1)+(E2))
which means that
  T*                pt = &array[array_size];
is identical to
  T*                pt = &*(array + array_size);
which I think is identical to
  T*               pt = &(*(array + array_size));
because & and * have the same precedence and they are right-associative,
which means that it is _not_ identical to
  T*               pt = array + array_size;

I've been told (I do not have a copy) that C99, §6.5.3.2#3 says that
  T*               pt = &array[array_size];
is identical to
  T*               pt = array + array_size;
Does C++ have a similar rule?


When executing:
#include <iostream>

using namespace std;

class foo
{
public:
   foo()
      {   cout << "foo constructor, this == " << this << endl;   }
   ~foo()
      {   cout << "foo destructor,  this == " << this << endl;   }
   foo(const foo&)
      {   cout << "foo copy constr, this == " << this << endl;   }
   foo*   operator&()
      {
         cout << "foo operator&,   this == " << this << endl;
         return this;
      }
};

int main()
{
   foo   foos[4];
   foo*   fp = &foos[4];
   cout <<  "&foos[4]:   " << fp << endl;
   fp = &*(foos+4);
   cout <<  "&*(foos+4): " << fp << endl;
   fp = foos + 4;
   cout <<  "foos + 4:   " << fp << endl;
}

compiled with 3 different compilers released this year, they all output
something like:

foo constructor, this == 0012FE88
foo constructor, this == 0012FE89
foo constructor, this == 0012FE8A
foo constructor, this == 0012FE8B
foo operator&,   this == 0012FE8C
&foos[4]:   0012FE8C
foo operator&,   this == 0012FE8C
&*(foos+4): 0012FE8C
foos + 4:   0012FE8C
foo destructor,  this == 0012FE8B
foo destructor,  this == 0012FE8A
foo destructor,  this == 0012FE89
foo destructor,  this == 0012FE88

where it is obvious that "foo::operator&" is called for an object at
0x12FE8C, which has never been constructed.
If foo does _not_ overload "operator&", then there don't seem to be any
problems.

Kind regards

Mogens Hansen






Author: kanze@gabi-soft.de (James Kanze)
Date: Wed, 12 Jun 2002 16:12:31 GMT

"Mogens Hansen" <mogens_h@dk-online.dk> wrote in message
news:<ae5efv$a4q$1@news.cybercity.dk>...

> Is
>   const size_t   array_size = 4;              // any number
>   T              array[array_size];           // T has a default constructor
>   T*             pt = &array[array_size];     // do not deref pointer
> guaranteed to be well defined for all types of T (which has a default
> constructor) ?

Whether or not there is a constructor, or anything else, is
irrelevant.  The expression "&array[sizeof(array)/sizeof(array[0])]"
is undefined behavior.  For all types, user defined or otherwise.

The definition of the [] operator is that A[N] is exactly equivalent
to *(A+N).  If A+N does not point to a valid object (e.g. it is a
pointer to one past the end of an array), the expression is undefined
behavior.  Regardless of what you do with the results.
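
As a rough sketch of the distinction (the function names sum and demo are
made up for this illustration): for valid indices A[N], *(A+N) and N[A] all
name the same element, while the one-past-the-end pointer can be formed by
arithmetic alone and then only compared, never dereferenced.

```cpp
#include <cassert>
#include <cstddef>

// Sums an int array by walking [first, last). The one-past-the-end pointer
// is only ever compared against, never dereferenced.
int sum(const int* first, const int* last) {
    int total = 0;
    for (const int* p = first; p != last; ++p) {
        total += *p;   // p always points at a real element here
    }
    return total;
}

int demo() {
    const std::size_t array_size = 4;
    int array[array_size] = {1, 2, 3, 4};

    // For valid indices the different spellings denote the same element.
    assert(&array[2] == &*(array + 2));
    assert(&array[2] == &2[array]);

    // array + array_size is well defined. The spelling &array[array_size]
    // is the one under discussion and is deliberately avoided.
    return sum(array, array + array_size);
}
```

Here demo() adds up the four elements without ever forming an lvalue for
the nonexistent element array[array_size].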

It is true that most, if not all, compilers will in fact optimize the
dereference out, and that the code will in fact work as expected.  If
we were in comp.lang.c++.moderated, rather than comp.std.c++, I would
tell you that you can, in fact, use the expression without fear,
especially as C99 special-cases it to make it legal.

> In particular, is it guaranteed to be well defined if T overloads
> "operator&" ?

If T overloads operator&, then all bets are off.  From a standards
point of view, it is the same problem -- this time, you have called a
member function on an invalid object, rather than dereferencing it
directly, but it is still undefined behavior.

From a practical point of view, unlike the case with the built-in
operator&, you are playing with fire.  If the function just does
something trivial, like "return this" or "return 0", you might get
away with it.  Or you might not.  If the function does anything more
complicated, it is almost certain that it will not work as expected.

> class T
> {
> public:
>     T*  operator&(void)
>         {    return this;    }    // perhaps do something more
> };

> The book
>   The C++ Programming Language, Third Edition/Special Edition
>   Bjarne Stroustrup

> page 91/92, says that taking a pointer to the element one beyond the
> end of an array is guaranteed to work.

Have you checked the corrections at his web site?  This is a known
error in the book (although I think Stroustrup takes a pragmatic point
of view -- standard or not, it will work in practice, so you can use
it).

It's interesting to note that the problem has been present since the
very first days of C, but that it was only recognized very recently.
The C committee decided to special-case it, in order to make it legal,
but it was too late in the C++ standardization to discuss the issue.
I suspect that there will be some discussion about it for the next
version of the standard: IMHO, C compatibility argues strongly enough
in favor that we should allow it, although I think it is not good
programming technique in C++, since it really does cause problems in
cases where [] is used on a debugging version of a collection or an
iterator.
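
To illustrate the last point, here is a minimal sketch of a bounds-checked
array.  The class checked_array and the two helper functions are
hypothetical stand-ins for a debugging collection, not anything from a real
library.  With such a type, &a[N] fails at the subscript before the address
is ever taken, whereas an end() built on pointer arithmetic has no such
problem.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical bounds-checked array, standing in for a debugging collection.
template <typename T, std::size_t N>
class checked_array {
public:
    T& operator[](std::size_t i) {
        if (i >= N) throw std::out_of_range("checked_array: index out of range");
        return data_[i];
    }
    T* begin() { return data_; }
    T* end() { return data_ + N; }   // pointer arithmetic only, no subscript
private:
    T data_[N];
};

// Returns true if the checked operator[] rejects the index in &a[4].
bool end_via_subscript_throws() {
    checked_array<int, 4> a;
    try {
        int* p = &a[4];   // throws inside operator[] before & is applied
        (void)p;
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

// Returns the distance between begin() and end(), formed without operator[].
std::ptrdiff_t end_via_member() {
    checked_array<int, 4> a;
    return a.end() - a.begin();
}
```

The design choice in this sketch is simply to expose the end pointer through
a member that never goes through the checked subscript.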

> The example is a built-in type - int.

> The C++ Standard, §5.2.1, says that
>   E1[E2]
> by definition is identical to
>   *((E1)+(E2))
> which means that
>   T*                pt = &array[array_size];
> is identical to
>   T*                pt = &*(array + array_size);
> which I think is identical to
>   T*               pt = &(*(array + array_size));
> because & and * have the same precedence and they are
> right-associative, which means that it is _not_ identical to
>   T*               pt = array + array_size;

The analysis is correct.  And *(array + array_size) is undefined
behavior, regardless of what you do with the results.

> I've been told (I do not have a copy) that C99, §6.5.3.2#3 says
> that

>   T*               pt = &array[array_size];
> is identical to
>   T*               pt = array + array_size;
> Does C++ have a similar rule?

No.  I expect that it will get one in the next revision.  Hopefully
somewhat better formulated than the C one.

Roughly speaking, the undefined behavior should only come into play
when the lvalue to rvalue conversion is invoked -- the result of
array[array_size] is an lvalue which cannot be converted to an rvalue.
This is, in fact, the situation with real compilers; I don't know of
any compiler where you cannot bind it to a reference, for example.

> When executing:

    [source deleted...]

> where it is obvious that "foo::operator&" is called for an object at
> 0x12FE8C, which has never been constructed.

> If foo does _not_ overload "operator&", then there doesn't seem to
> be any problems.

It's undefined behavior.  Anything can happen.  Including what you
actually wanted.

--
James Kanze                                mailto:kanze@gabi-soft.de
Conseils en informatique orientée objet/
                    Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(0)69 63198627
