Thread

Topic: Is pointer arithmetic defined when the static and the dynamic types are different?

Author: kanze@gabi-soft.fr (J. Kanze)
Date: 1996/11/11

vandevod@cs.rpi.edu (David Vandevoorde) writes:

> >>>>> "JK" == J Kanze <kanze@gabi-soft.fr> writes:
> [...]
> JK> Given classes B and D derived from B, and the following function call:
>
> JK>   D           x ;
> JK>   f( x , 1 ) ;
>                   ^-`&' here
>
> JK> Does this result in undefined behavior for the following definition of
> JK> "f"?  (Note that if it does, then the compiler can legally use static
> JK> binding for the function call in the loop.)
>
> JK>   void
> JK>   f( B* a , int l )
> JK>   {
> JK>       for ( int i = 0 ; i < l ; i ++ )
> JK>           a[ i ].doIt() ;
> JK>   }
>
> From reading [expr.sub] and [expr.add] it seems (unfortunately) that
> this does not result in undefined behavio(u)r. In particular,
>
>   [expr.add] 5.7/4
>   For the purposes of these operators, a pointer to a nonarray object
>   behaves the same as a pointer to the first element of an array of
>   length one with the type of the object as its element type.
>
> (I think that your other examples are also well-defined in light of
>  this paragraph.)

Actually, this could invalidate the version with a++ in the loop.  A
pointer must always point to an element in the array, or one past the
end of the array; otherwise, undefined behavior results.  If the program
is to behave "as if" we had passed the address of an array of 1 D to the
function, then a++ (where a has type B*) will not meet this criterion,
and undefined behavior results.
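
The "element or one past the end" rule can be sketched for the uncontested case, where the pointer type and the object type agree. This is only an illustration in present-day C++; the struct B and the helper span_of_single_object are hypothetical stand-ins, not code from the thread.

```cpp
#include <cassert>
#include <cstddef>

struct B { int value; };   // hypothetical stand-in; the thread never shows B

// For pointer arithmetic, a lone object behaves like element 0 of an
// array of length one, so exactly two pointer values are usable here.
std::ptrdiff_t span_of_single_object()
{
    B x{};
    B* first = &x;         // points at the (only) element
    B* end = first + 1;    // one past the end: may be formed and compared
    assert(first < end);
    // Neither *end nor end + 1 is evaluated: both would be undefined.
    return end - first;
}
```

Calling span_of_single_object() yields 1. The open question in this thread is what changes when the object is a D and the pointer is a B*.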

--
James Kanze          +33 3 88 14 49 00           email: kanze@gabi-soft.fr
GABI Software, Sarl., 8 rue des Francs Bourgeois, 67000 Strasbourg, France
Conseils en informatique industrielle --
                            -- Beratung in industrieller Datenverarbeitung
---
[ comp.std.c++ is moderated.  To submit articles: try just posting with      ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu         ]
[ FAQ:      http://reality.sgi.com/employees/austern_mti/std-c++/faq.html    ]
[ Policy:   http://reality.sgi.com/employees/austern_mti/std-c++/policy.html ]
[ Comments? mailto:std-c++-request@ncar.ucar.edu                             ]

Author: dak <pierreba@poster.cae.ca>
Date: 1996/11/12

Chelly Green (chelly@eden.com) wrote:
> David Vandevoorde wrote:
> >
> > >>>>> "CG" == Chelly Green <chelly@eden.com> writes:
> > [...]
> > >>> void
> > >>> f( B* a , int l )
> > >>> {
> > >>> for ( int i = 0 ; i < l ; i ++ )
> > >>> a[ i ].doIt() ;
> > >>> }
> > [...]
> > >> [expr.add] 5.7/4
> > >> For the purposes of these operators, a pointer to a nonarray object
> > >> behaves the same as a pointer to the first element of an array of
> > >> length one with the type of the object as its element type.

And taking from [dcl.array] in 8.3.4 Arrays, point 6:

 "6 Except  where  it has been declared for a class (_over.sub_), the sub-"
 "  script operator [] is interpreted in such a way that E1[E2] is identi-"
 "  cal to *((E1)+(E2))."

And thus accessing the zero'th element of any pointer seems legal to me.
And if virtual functions are involved, they should stay virtual.
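
A minimal sketch of this reading, in present-day C++. The bodies of B and D and the helper names are assumptions made for illustration; the thread only names B, D and doIt. It takes element 0 through a B* that really points to a D and relies on the call staying virtual.

```cpp
#include <cassert>
#include <string>

// Hypothetical class bodies; only the names B, D and doIt come from the thread.
struct B {
    virtual ~B() = default;
    virtual std::string doIt() const { return "B"; }
};

struct D : B {
    int extra = 0;   // makes D larger than B
    std::string doIt() const override { return "D"; }
};

// a[0] is *((a)+(0)), so it names the object a points to.
std::string first_element(B* a)
{
    assert(&a[0] == a);
    return a[0].doIt();    // dispatched on the dynamic type
}

std::string call_with_single_d()
{
    D d;
    return first_element(&d);
}
```

Under this reading call_with_single_d() returns "D". Whether the standard text actually guarantees that is the point in dispute here.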

[...]

> > My understanding is that this allows `a[0]' (again, I'm not at all sure
> > about that) but the call `a[0].doIt()' still seems to be a virtual
> > function call for an incomplete object.
>
> But since the actual type of the object and the pointer to object type
> differ, it should be undefined, right? This would allow a compiler to
> optimize all array accesses as non-virtual. If `a[0].doIt()' is
> valid, then this optimization would need a special case for the 0
> element (in case the index were a variable) if it wanted to optimize
> non-virtual.

I always believed that compilers were allowed to call functions
non-virtually in the case of an array because:

  o The "pointer" is const, being an array.
  o The complete type of the object is known because the _array_
    _declaration_ is in scope.

These are quite ordinary conditions, which can also apply to non-array
objects. In the case of an argument to a function, the declaration is not
in scope. Declaring arguments as arrays is already a special case of
arrays (because the type is incomplete).
---

Author: Chelly Green <chelly@eden.com>
Date: 1996/11/12

dak wrote:
>
> Chelly Green (chelly@eden.com) wrote:
> > David Vandevoorde wrote:
> > >
> > > >>>>> "CG" == Chelly Green <chelly@eden.com> writes:
> > > [...]
> > > >>> void
> > > >>> f( B* a , int l )
> > > >>> {
> > > >>> for ( int i = 0 ; i < l ; i ++ )
> > > >>> a[ i ].doIt() ;
> > > >>> }

and calling the function with:

    struct D : B { };

    void g()
    {
        D d;
        f( &d, 1 );
    }

> > > [...]
> > > >> [expr.add] 5.7/4
> > > >> For the purposes of these operators, a pointer to a nonarray object
> > > >> behaves the same as a pointer to the first element of an array of
> > > >> length one with the type of the object as its element type.
                             ^^^^^^^^^^^^^^^^^^
Complete (dynamic) type or incomplete (static) type?

> And taking from [dcl.array] in 8.3.4 Arrays, point 6:
>
>  "6 Except  where  it has been declared for a class (_over.sub_), the sub-"
>  "  script operator [] is interpreted in such a way that E1[E2] is identi-"
>  "  cal to *((E1)+(E2))."
>
> And thus accessing the zero'th element of any pointer seems legal to me.
> And if virtual functions are involved, they should stay virtual.

Yes, if the above ([expr.add] 5.7/4) refers to the static type, then
accessing element 0 should work.

This is the big question, then: does the [expr.add] portion quoted above
mean the dynamic or the static type of the object?

If it means the dynamic type, then even accessing element 0 from a base
class pointer that really points to a derived class is undefined. This
allows compiler optimizations (statically resolving virtual calls). This
is consistent with accessing element x of the array, if x != 0.

On the other hand, if it means the static type, then accessing element 0
is fine. This would be useful to allow functions that operate on arrays
to also operate on a single object, treating it as an array of only one
object.
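
The usefulness mentioned in the last paragraph can be sketched for the simple case with no inheritance involved. The function sum and its two callers are hypothetical and only show one array-oriented function serving both an array and a single object.

```cpp
#include <cassert>

// Hypothetical array-oriented function: adds up l consecutive ints.
int sum(const int* a, int l)
{
    assert(l >= 0);
    int total = 0;
    for (int i = 0; i < l; i++)
        total += a[i];
    return total;
}

int sum_of_array()
{
    int xs[3] = { 1, 2, 3 };
    return sum(xs, 3);       // a real array of three elements
}

int sum_of_single()
{
    int x = 7;
    return sum(&x, 1);       // a lone object, treated as an array of one
}
```

Here sum_of_array() gives 6 and sum_of_single() gives 7. With int there is no static/dynamic distinction, so this case is not in question.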

I wonder how current compilers interpret this section. Do any optimize
out the virtual mechanism for array accesses? I doubt the compiler I use
would, because it only optimizes out virtual calls for local variables
in functions, but not for members of a class, etc.

--
Chelly Green | chelly@eden.com | C++ - http://www.eden.com/~chelly
---

Author: dak <pierreba@poster.cae.ca>
Date: 1996/11/13

Chelly Green <chelly@eden.com> wrote:
>
 [snipped: calling function expecting a class pointer with a derived
           pointer and using the pointer in a subscript expression with
           index zero (0).]

> > > > [...]
> > > > >> [expr.add] 5.7/4
> > > > >> For the purposes of these operators, a pointer to a nonarray object
> > > > >> behaves the same as a pointer to the first element of an array
> > > > >> of length one with the type of the object as its element type.
>                                 ^^^^^^^^^^^^^^^^^^
> Complete (dynamic) type or incomplete (static) type?
>
> > And taking from [dcl.array] in 8.3.4 Arrays, point 6:
> >
> >  "6 Except  where  it has been declared for a class (_over.sub_), the"
> >  "  subscript operator [] is interpreted in such a way that E1[E2] is"
> >  "  identical to *((E1)+(E2))."
> >

[snip]

> This is the big question, then, does the [expr.add] quoted portion above
> mean the dynamic or static type of the object?
>
> If it means the dynamic type, then even accessing element 0 from a base
> class pointer that really points to a derived class is undefined. This
> allows compiler optimizations (statically resolving virtual calls). This
> is consistent with accessing element x of the array, if x != 0.
>
> On the other hand, if it means the static type, then accessing element 0
> is fine. This would be useful to allow functions that operate on arrays
> to also operate on a single object, treating it as an array of only one
> object.

[snipped: proposition of optimization behavior of compilers]

First, the paragraph I quoted is clear about what it says: it states the
equivalence of subscripting and pointer arithmetic. Second, nowhere under
the rule I quoted does the standard make a distinction between dynamic and
static types.  Third, I don't think the standard ever talks about
optimization. All optimizations are shortcuts that must give the same
behavior as required. Finally, pointer arithmetic has always worked with
the type of the pointer, not the real type of the object.

All that leads me to think that if the compiler turns a call that is
supposed to be virtual into a non-virtual one, then it is the compiler's
responsibility to make sure the result is the same as if the call were
virtual.

I suppose the different points of view are based upon the behaviour we
expect from the subscript operator. I see it as a shortcut (syntactic
sugar) for pointer arithmetic. Thus it takes no shortcut on the
type of its operand. Many people see it as an array operator, so that
its result type is the element type of the array it seems to operate upon.
I disagree with that view.

That is why there is no ambiguity in my mind about which type (static or
dynamic) the standard is talking about. It is always the type of the
declared pointer (the base pointer, as declared in the function argument,
in the case of the function call).
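
The difference between the two kinds of type can be made visible in present-day C++. The sketch below is an assumption-laden illustration: the class bodies and helper names are hypothetical. decltype reports the static type of the subscript expression, while typeid reports the dynamic type of the object it names.

```cpp
#include <cassert>
#include <type_traits>
#include <typeinfo>

// Hypothetical class bodies; the thread only names B and D.
struct B { virtual ~B() = default; };
struct D : B {};

bool dynamic_type_is_D(B* a)
{
    assert(a != nullptr);
    // The static type of both spellings follows the declared pointer type.
    static_assert(std::is_same<decltype(a[0]), B&>::value,
                  "a[0] has static type B");
    static_assert(std::is_same<decltype(*(a + 0)), B&>::value,
                  "*(a + 0) has static type B");
    // typeid on a polymorphic object reports the dynamic type.
    return typeid(a[0]) == typeid(D);
}

bool check_single_d() { D d; return dynamic_type_is_D(&d); }
bool check_single_b() { B b; return dynamic_type_is_D(&b); }
```

check_single_d() is true and check_single_b() is false, although the static type of a[0] is B in both calls.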
---

Author: kanze@gabi-soft.fr (J. Kanze)
Date: 1996/11/01

In following-up to a posting in comp.lang.c++.moderated, the following
question occurred to me:

Given classes B and D derived from B, and the following function call:

  D           x ;
  f( x , 1 ) ;

Does this result in undefined behavior for the following definition of
"f"?  (Note that if it does, then the compiler can legally use static
binding for the function call in the loop.)

  void
  f( B* a , int l )
  {
      for ( int i = 0 ; i < l ; i ++ )
          a[ i ].doIt() ;
  }

Note that "a[ i ]" is the equivalent of "*(a + i)"; and that pointer
arithmetic uses the static type of the pointer.  Also, of course,
pointer arithmetic is only defined as long as the pointer stays within
an array (or one beyond the last element in the array), and array
elements cannot have a dynamic type which differs from their static
type.
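
Why the static type matters for i > 0 can be sketched by comparing element strides. The class bodies and the stride helper are hypothetical, and the static_assert records the assumption that D is larger than B. No arithmetic is done on a B* that points into a D array, since that is exactly the case whose behavior is in doubt.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical class bodies; only the names B, D and doIt come from the post.
struct B { virtual ~B() = default; virtual void doIt() {} };
struct D : B { int extra[4] = {}; void doIt() override {} };

static_assert(sizeof(D) > sizeof(B), "this sketch assumes D is larger than B");

// Byte distance between neighbouring elements of an array of T.
template <typename T>
std::size_t stride()
{
    T arr[2];
    assert(&arr[1] == arr + 1);
    return static_cast<std::size_t>(
        reinterpret_cast<const char*>(&arr[1]) -
        reinterpret_cast<const char*>(&arr[0]));
}
```

stride<B>() equals sizeof(B) and stride<D>() equals sizeof(D). A B* stepped by one therefore does not land on the second D of a D array.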

What if "f" is defined as follows?

  void
  f( B* a , int l )
  {
      for ( ; l > 0 ; l -- )
          a++ -> doIt() ;
  }

(Note the operation "a++" results in an invalid pointer if "a" points
to a D, rather than a B.)

Or?

  void
  f( B* a , int l )
  {
      B*          limit = a + l ;
      while ( a < limit )
          a++ -> doIt() ;
  }

(Obviously, if the preceding is illegal, so is this.)
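
For reference, the three formulations can be compared in the case nobody disputes, an array whose elements really are B. The call counter and the names f_index, f_count, f_limit and total_calls are hypothetical additions for this sketch.

```cpp
#include <cassert>

// Hypothetical B that counts how often doIt is called on it.
struct B {
    int calls = 0;
    virtual ~B() = default;
    virtual void doIt() { ++calls; }
};

// The three loop forms from the post, renamed so they can coexist.
void f_index(B* a, int l) { for (int i = 0; i < l; i++) a[i].doIt(); }
void f_count(B* a, int l) { for (; l > 0; l--) a++->doIt(); }
void f_limit(B* a, int l) { B* limit = a + l; while (a < limit) a++->doIt(); }

int total_calls()
{
    B arr[3];              // static and dynamic element type agree
    f_index(arr, 3);
    f_count(arr, 3);
    f_limit(arr, 3);
    assert(arr[0].calls == 3);
    int total = 0;
    for (const B& b : arr)
        total += b.calls;
    return total;
}
```

total_calls() returns 9: each form visits each of the three elements once. The forms only come apart when "a" points to something other than an array of B.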

--
James Kanze          +33 3 88 14 49 00           email: kanze@gabi-soft.fr
GABI Software, Sarl., 8 rue des Francs Bourgeois, 67000 Strasbourg, France
Conseils en informatique industrielle --
                            -- Beratung in industrieller Datenverarbeitung



Author: vandevod@cs.rpi.edu (David Vandevoorde)
Date: 1996/11/04

>>>>> "JK" == J Kanze <kanze@gabi-soft.fr> writes:
[...]
JK> Given classes B and D derived from B, and the following function call:

JK>   D           x ;
JK>   f( x , 1 ) ;
                  ^-`&' here

JK> Does this result in undefined behavior for the following definition of
JK> "f"?  (Note that if it does, then the compiler can legally use static
JK> binding for the function call in the loop.)

JK>   void
JK>   f( B* a , int l )
JK>   {
JK>       for ( int i = 0 ; i < l ; i ++ )
JK>           a[ i ].doIt() ;
JK>   }

Author: Chelly Green <chelly@eden.com>
Date: 1996/11/06

David Vandevoorde wrote:
>
> J Kanze <kanze@gabi-soft.fr> writes:
> [...]
>> Given classes B and D derived from B, and the following function call:
>>
>>             D           x ;
>>             f( x , 1 ) ;
>>                   ^-`&' here
>>
>> Does this result in undefined behavior for the following definition of
>> "f"?  (Note that if it does, then the compiler can legally use static
>> binding for the function call in the loop.)
>>
>>             void
>>             f( B* a , int l )
>>             {
>>                 for ( int i = 0 ; i < l ; i ++ )
>>                     a[ i ].doIt() ;
>>             }
>
> From reading [expr.sub] and [expr.add] it seems (unfortunately) that
> this does not result in undefined behavio(u)r. In particular,
>
>   [expr.add] 5.7/4
>   For the purposes of these operators, a pointer to a nonarray object
>   behaves the same as a pointer to the first element of an array of
>   length one with the type of the object as its element type.
>
> (I think that your other examples are also well-defined in light of
>  this paragraph.)

Hmmm, I interpret the above to mean it *does* result in undefined
behavior. Replacing the original calling code with

    D x [1];
    f( x, 1 );

would result in undefined behavior because the actual type of the array
differs from the static type of the pointer inside f(). The part you
quoted from the standard says this is equivalent to the original. It
seems that even accessing element 0 is undefined. This allows the
compiler to resolve member function calls of objects in an array
statically (i.e. bypass virtual dispatch).

    D x [1];
    B* b = x;
    b [0].doIt(); // undefined

--
Chelly Green | chelly@eden.com | C++ - http://www.eden.com/~chelly



Author: vandevod@cs.rpi.edu (David Vandevoorde)
Date: 1996/11/06

>>>>> "CG" == Chelly Green <chelly@eden.com> writes:
[...]
>>> void
>>> f( B* a , int l )
>>> {
>>> for ( int i = 0 ; i < l ; i ++ )
>>> a[ i ].doIt() ;
>>> }
[...]
>> [expr.add] 5.7/4
>> For the purposes of these operators, a pointer to a nonarray object
>> behaves the same as a pointer to the first element of an array of
>> length one with the type of the object as its element type.
>>
>> (I think that your other examples are also well-defined in light of
>> this paragraph.)

CG> Hmmm, I interpret the above to mean it *does* result in undefined
CG> behavior. Replacing the original calling code with

CG>     D x [1];
CG>     f( x, 1 );

CG> would result in undefined behavior because the actual type of the array
CG> differs from the static type of the pointer inside f().

I'm not at all sure about my own interpretation, but here is how it
goes.  When `x' (which decays to a pointer to an array object) is
bound to `a', the derived-to-base-conversion causes a to be a pointer
to a nonarray (incomplete) object (of type B) which for the purposes
of pointer arithmetic (5.7/4 above) is treated as a pointer to an array
of one B, but not one D.

My understanding is that this allows `a[0]' (again, I'm not at all sure
about that) but the call `a[0].doIt()' still seems to be a virtual
function call for an incomplete object.

 Daveed


Author: Chelly Green <chelly@eden.com>
Date: 1996/11/09

David Vandevoorde wrote:
>
> >>>>> "CG" == Chelly Green <chelly@eden.com> writes:
> [...]
> >>> void
> >>> f( B* a , int l )
> >>> {
> >>> for ( int i = 0 ; i < l ; i ++ )
> >>> a[ i ].doIt() ;
> >>> }
> [...]
> >> [expr.add] 5.7/4
> >> For the purposes of these operators, a pointer to a nonarray object
> >> behaves the same as a pointer to the first element of an array of
> >> length one with the type of the object as its element type.
> >>
> >> (I think that your other examples are also well-defined in light of
> >> this paragraph.)
>
> CG> Hmmm, I interpret the above to mean it *does* result in undefined
> CG> behavior. Replacing the original calling code with
>
> CG>     D x [1];
> CG>     f( x, 1 );
>
> CG> would result in undefined behavior because the actual type of the array
> CG> differs from the static type of the pointer inside f().
>
> I'm not at all sure about my own interpretation, but here is how it
> goes.  When `x' (which decays to a pointer to an array object) is
> bound to `a', the derived-to-base-conversion causes a to be a pointer
> to a nonarray (incomplete) object (of type B) which for the purposes
> of pointer arithmetic (5.7/4 above) is treated as a pointer to an array
> of one B, but not one D.

"...length one with the type of the object as its element type."
                        ^^^^^^^^^^^^^^^^^^

A pointer/reference to an object is not the object. I assume they mean
the type of the *complete* object. If you just have a pointer, you don't
have an object.

> My understanding is that this allows `a[0]' (again, I'm not at all sure
> about that) but the call `a[0].doIt()' still seems to be a virtual
> function call for an incomplete object.

But since the actual type of the object and the pointer to object type
differ, it should be undefined, right? This would allow a compiler to
optimize all array accesses as non-virtual. If `a[0].doIt()' is
valid, then this optimization would need a special case for the 0
element (in case the index were a variable) if it wanted to optimize
non-virtual.

Anyone who *knows* for sure care to comment? (this is, after all, a
standards group!)
--
Chelly Green | chelly@eden.com | C++ - http://www.eden.com/~chelly
---