Thread

Topic: base 1 subscripts in C++

Author: Jonathan de Boyne Pollard <J.deBoynePollard@tesco.net>
Date: 2000/06/13

JH> "Undefined behavior" says that in at least some circumstances,
JH> which circumstances include your hardware, software, and lunar
JH> date, your program will fail in completely undocumented ways.
JH> It says nothing about the probability of such circumstances.

ERT> Even if that probability is 0.

What makes you think that the probability of undefined behaviour is 0 in this
particular instance?  There exist at least two platforms where subtracting 1
from an address can result in an address that will cause a machine exception
when loaded into a CPU register.

(Of course, on both of those platforms the behaviour that results from a
machine exception due to an illegal address _is_ in fact documented, by the
operating system and CPU technical reference manuals.  So the program won't
fail in _undocumented_ ways.)

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: "Martijn Lievaart" <news-from@greebo.orion.nl>
Date: 2000/06/06

"E. Robert Tisdale" <edwin@netwood.net> wrote in message
news:3935416F.B646784F@netwood.net...
> Jim Hill wrote:
>
> > It might work. It might even always work on the machines you use.
> > But somewhere, on some machine, some night, it *will* fail.
> > The standards don't specify undefined behavior lightly.
>
> Should I really worry about this?
> Are there any viable C++ compilers
> for which this example will fail to compile or run correctly?
>

No, I cannot force you. I just hope I don't have to use (or, gasp, depend)
on your software if this is your attitude towards software engineering.

I have seen too many 'tricks' (including a lot of my own) fail after having
worked perfectly for some time. Many of those were of the kind "I cannot
see this ever failing". Luckily I've grown to the point where I adhere to
the standards, even where they seem unreasonably restrictive or downright
silly, exactly because so many of these tricks failed because my assumptions
turned out faulty in the end.

Will your example fail on any machine in existence today? Maybe not; maybe
it fails on the next release of your compiler. Don't worry about it, learn
the language restrictions and live with them. I think the reasons why the
language is as it is were already stated several times on this thread. If
you don't accept them, use a different language or file a defect report.

M4
--
Contrary to popular belief, the number of the beast is not 666,
it's 555-37689.
Please post replies to this newsgroup. If you must reach
me by email, use <newsgroup-name> at greebo.orion in nl.

---

Author: "E. Robert Tisdale" <edwin@netwood.net>
Date: 2000/06/01

wmm@fastdial.net wrote:

> In article <39332A20.77F5B261@netwood.net>,
> "E. Robert Tisdale" <edwin@netwood.net> wrote:
> >
> > Did Bjarne get it wrong?
> > Is there a more compelling reason
> > for this restriction on pointers?
> > Or are there really situations
> > where the compiler is obliged to emit code
> > that forms an address
> > for an object which is never referenced?
>
> No, that's the reason for the restriction.

It's unusual for anyone to be able to identify a single reason
for a decision taken collectively.

> There are two points in response to your question:
>
> First, there are situations where optimization
> (at least short of full global optimization)
> cannot eliminate forming the address.
> For instance, if the address escapes the local context
> (returned as the function's value, passed as an argument,
>  stored in a global variable),

Neither C nor C++ defines address data types.
They define abstract pointer data types
but they do not require pointers
to be represented by machine addresses
even if machine addresses are the most obvious
and efficient implementation of pointers.

> there's no way, in general,
> for the compiler to do the constant folding
> that eliminates the negative offset.
>
> Second, standards generally do not mandate
> a given level of optimization in implementations.
> The intent is to allow a completely straightforward
> implementation with _no_ optimization,
> just a slavishly literal application of the abstract rules,
> to be conforming.  Differing levels of optimization
> are regarded as a quality-of-implementation issue,
> better regulated by market pressures than by standardization.
> That means that it's irrelevant
> whether "a good optimizing C++ compiler"
> could avoid the problem;
> the Standard doesn't require that a C++ implementation
> be good at optimizing to be conformant.

I'm glad that you cleared that up.
I had hoped that the design of the C++ programming language
would be free to exploit the best compiler technology available
instead of being restricted by the worst.

---

Author: jthill@telus.net (Jim Hill)
Date: 2000/06/01

E. Robert Tisdale <edwin@netwood.net> wrote:
> Bjarne Stroustrup <bs@research.att.com> wrote:
> >
> >         int v[100];
> >         int* p = &v[50];
> >
> >         void f(void) {
> >           int a = p[-20];         // ok
> >           int b = p[-200];        // error
> >           // ...
> >           }
>
>         void f(int a[], int n) {
>           int *aa = a - 1;
>           for (int i = 1; i <= n; i++)
>             aa[i] = i;
>           }

The two examples are, for the purpose of your argument, wildly
different.

In the example you're responding to, the initialization of b refers to
nonexistent memory. Anything can happen, and this can be detected just
by reading the source provided.

> are there really situations where the compiler is obliged to emit code
> that forms an address for an object which is never referenced?

In the example you're responding _with_, there's no telling what actual
object `a` refers to, so the behavior may be "ok" in the same way
`p[-20]` above is ok.

And your source both applies and compensates for the offset in the same
function - so of course any half-decent compiler will do the offset at
compile time, which will almost certainly (imo) work even on machines
that can't form invalid memory addresses.

But that has nothing to do with what is *guaranteed* to work by all
compiler vendors claiming to sell conforming C++ compilers.

"Undefined behavior" says that in at least some circumstances, which
circumstances include your hardware, software, and lunar date, your
program will fail in completely undocumented ways. It says nothing about
the probability of such circumstances.

Rewrite your example function as:

void foo(int a[], int n)
{
  for (int i = 1; i <= n; ++i)
    a[i] = i;
}

and try to pass it an address bar-1, where bar is a real array, and you
get undefined behavior. It might work. It might even always work on the
machines you use. But somewhere, on some machine, some night, it *will*
fail. The standards don't specify undefined behavior lightly.

Jim

---

Author: Gabriel Dos Reis <dosreis@cmla.ens-cachan.fr>
Date: 2000/06/01

"E. Robert Tisdale" <edwin@netwood.net> writes:

[...]

| I'm glad that you cleared that up.
| I had hoped that the design of the C++ programming language
| would be free to exploit the best compiler technology available
| instead of being restricted by the worst.

C++ should be usable here and now. Please take a look at "D&E".

--
Gabriel Dos Reis, dosreis@cmla.ens-cachan.fr

---

Author: "E. Robert Tisdale" <edwin@netwood.net>
Date: 2000/06/01

Jörg Barfurth wrote:

> Because you tell it to?
> While the C and C++ Standards tell us
> what effect of some programs must be -
> and that for other programs, like yours, the effect is undefined -
> they can't prescribe whether, how
> and to what extent a compiler should optimize code.
> On some platforms there might not be
> what you qualify as 'good optimizing compiler'.
> You'd be glad to find a compiler there at all.
> Also, what is an optimization for one architecture,
> might turn out to be the opposite on another architecture
> (or even the next generation of the same architecture).

Well, then I'm sorry that I even mentioned the word optimization.
The question is whether a compiler actually needs to emit code
to form the address of an invalid object
just because there is a pointer to that object.
Apparently, you believe that it is necessary
and I don't wish to change your mind about that.

[snip]

> My code does form such address.

But your code runs just fine.  It doesn't fault
when it forms an address to an invalid object
and it returns the correct result.

> Other architectures have more dedicated address registers.
> And maybe indexed access is better there than pointer arithmetic.
> I do recall a processor where the pointer would be stored in main memory,
> but the index could be kept in a register.
> And I have never even looked at assembler
> for mainframes or supercomputers, which might be vastly different.

> From what one compiler does on one platform,
> you can't infer anything.

I can infer that at least one compiler on one platform
does not necessarily need to form an address for an invalid object
just because there is a pointer to that object.
My question is whether any compiler on any platform
needs to do so at any time.

[snip]

> No.  Why?  Maybe.
>
> No, he is right.  As he said, there are platforms
> where even forming an invalid address causes a hardware error.
> A universal programming language standard
> should cater for such platforms.

Yes.  But does the compiler actually need to emit code
which forms "an invalid address"?

> Why should there be?
> Isn't it sufficient that leaving out such provisions
> would severely restrict the feasibility
> of implementing C/C++ translators
> for perfectly reasonable architectures
> (instances of which did, do and maybe will exist)?

But would it severely restrict the feasibility
or merely make it more difficult to implement?
The C and C++ programming languages
pose all kinds of difficult problems for compiler developers
on all sorts of important computer architectures.
Why is this problem with addresses of invalid objects special?
Why does the language specification need to be altered
to accommodate it and not other important features
of other computer architectures?

> Is there a more compelling reason to remove this restriction?

Because most compilers ignore it?

> In C++ you can get the same usage with a simple adapter class (template).

At the expense of subtraction
for every reference to an element of an array?
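
A minimal sketch of the kind of adapter class under discussion, assuming nothing
beyond the standard library. The names OneBased and fill are hypothetical and do
not come from the thread. The subtraction is applied to the index rather than to
the pointer, so a - 1 is never formed; whether the per-access i - 1 survives
optimization is left to the compiler.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical adapter: presents a zero-based array through one-based subscripts.
template <typename T>
class OneBased {
public:
    OneBased(T* first, std::size_t n) : first_(first), n_(n) {}

    // Valid subscripts are 1..n. The pointer first_ - 1 is never formed:
    // the adjustment is made on the index, not on the pointer.
    T& operator[](std::size_t i) const {
        assert(i >= 1 && i <= n_);
        return first_[i - 1];
    }

    std::size_t size() const { return n_; }

private:
    T* first_;
    std::size_t n_;
};

// The fill loop from the thread, written against the adapter.
inline void fill(OneBased<int> a) {
    for (std::size_t i = 1; i <= a.size(); ++i)
        a[i] = static_cast<int>(i);
}
```

Given int raw[4], calling fill(OneBased<int>(raw, 4)) stores 1 through 4 in
raw[0] through raw[3] using only defined pointer arithmetic.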

---

Author: wmm@fastdial.net
Date: 2000/06/01

In a previous article,  "E. Robert Tisdale"  <edwin@netwood.net> writes:
>Pierre Baillargeon wrote:
>
>> The other compiler that will not optimize that line away
>> and will even produce bad code from time to time is called "maintainer".
>> I've been in places where the original programmer had found "neat tricks".
>> Note that I no longer work there.  Amen.
>
>This begs the question.  Is it really necessary
>for the compiler to form the address of an invalid object
>just because there is a pointer to that object?

As Steve Clamage pointed out, in the absence of global data flow
analysis, yes.  If you create the pointer value in one translation
unit and apply the subscript in a different one, it's going to be
impossible for any traditional separate-compilation translator to
optimize this away.  (As you point out in another message, it
would be possible to avoid this by using an artificial pointer
representation instead of native machine addresses; however, I
don't believe most users would be willing to accept that level of
inefficiency.)

-- William M. Miller

     -----  Posted via NewsOne.Net: Free Usenet News via the Web  -----
     -----  http://newsone.net/ --  Discussions on every subject. -----
   NewsOne.Net prohibits users from posting spam.  If this or other posts
made through NewsOne.Net violate posting guidelines, email abuse@newsone.net

---

Author: wmm@fastdial.net
Date: 2000/06/01

In a previous article,  "E. Robert Tisdale"  <edwin@netwood.net> writes:
>wmm@fastdial.net wrote:
>
>> No, that's the reason for the restriction.
>
>It's unusual for anyone to be able to identify a single reason
>for a decision taken collectively.

That may be, but in all the discussions on the Committee I recall
concerning issues like this, this was the reason for the decisions
that were made.

>> There are two points in response to your question:
>>
>> First, there are situations where optimization
>> (at least short of full global optimization)
>> cannot eliminate forming the address.
>> For instance, if the address escapes the local context
>> (returned as the function's value, passed as an argument,
>>  stored in a global variable),
>
>Neither C nor C++ defines address data types.
>They define abstract pointer data types
>but they do not require pointers
>to be represented by machine addresses
>even if machine addresses are the most obvious
>and efficient implementation of pointers.

Of course.  However, for production use, most people want as
efficient execution as possible, which means using the native
facilities of the machine architecture.  We were very careful
to specify the _requirements_ of the language so as not to
penalize excessively any popular or realistically foreseeable
architecture.  One of the goals of C, inherited by C++, was to
allow efficient implementation on as wide a range of machine
architectures as possible.

>> there's no way, in general,
>> for the compiler to do the constant folding
>> that eliminates the negative offset.
>>
>> Second, standards generally do not mandate
>> a given level of optimization in implementations.
>> The intent is to allow a completely straightforward
>> implementation with _no_ optimization,
>> just a slavishly literal application of the abstract rules,
>> to be conforming.  Differing levels of optimization
>> are regarded as a quality-of-implementation issue,
>> better regulated by market pressures than by standardization.
>> That means that it's irrelevant
>> whether "a good optimizing C++ compiler"
>> could avoid the problem;
>> the Standard doesn't require that a C++ implementation
>> be good at optimizing to be conformant.
>
>I'm glad that you cleared that up.
>I had hoped that the design of the C++ programming language
>would be free to exploit the best compiler technology available
>instead of being restricted by the worst.

I think you may be misunderstanding the meaning and intent of
"undefined behavior."  Implementations are certainly free to
exceed the minimum requirements of the Standard, and that includes
defining what happens when your code exhibits "undefined behavior"
according to the Standard.  If you like that definition, and you're
willing to restrict your program to using that particular
implementation, fine.  All the Standard is saying by the "undefined
behavior" label is that implementations aren't required to produce
inefficient code or use heroic measures in the translator to
support code like that, so your program won't be portable.

To put it another way, C++ compilers _are_ "free to exploit the
best compiler technology available" -- we're just not saying that
a compiler _must_ use that level of technology in order to
conform to the minimal requirements of the language.

-- William M. Miller


---

Author: "E. Robert Tisdale" <edwin@netwood.net>
Date: 2000/06/01

wmm@fastdial.net wrote:

> As Steve Clamage pointed out,
> in the absence of global data flow analysis, yes.
> If you create the pointer value in one translation unit
> and apply the subscript in a different one,
> it's going to be impossible for any traditional,
> separate-compilation translator to optimize this away.
> (As you point out in another message, it would be possible
>  to avoid this by using an artificial pointer representation
>  instead of native machine addresses;
>  however, I don't believe most users would be willing
>  to accept that level of inefficiency.)

This is pretty close to the explanation that I would give
but I'm not sure how you can judge what users will accept.
If I force the user to write

    void f(int a[], int n) {
      for (int i = 1; i <= n; i++)
        a[i-1] = i;
      }

instead of

    void f(int a[], int n) {
      int *aa = a - 1;
      for (int i = 1; i <= n; i++)
        aa[i] = i;
      }

The user may be obliged to accept the inefficiency
of computing i-1 in each iteration of the for loop.
How do you determine that this inefficiency
is more acceptable to users than the inefficiency
imposed by converting an "artificial" pointer
into a machine address?

---

Author: Hyman Rosen <hymie@prolifics.com>
Date: 2000/06/01

"E. Robert Tisdale" <edwin@netwood.net> writes:
> Yes.  But does the compiler actually need to emit code
> which forms "an invalid address"?

If you are allowed to form a pointer to an element before the start
of an array, you can then pass that pointer as a parameter to a
function. On architectures where you may not form invalid addresses,
this would force the compiler to represent all pointers as base+offset
pairs. The language standards chose not to require such a thing.
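
To make the cost concrete, here is a rough sketch of what a base+offset pointer
representation could look like if written by hand. OffsetPtr and fill are
hypothetical names, and this is only an illustration of the idea, not how any
particular compiler represents pointers. Arithmetic touches only the integer
offset; a machine address is formed only at the point of access.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical "artificial" pointer: a valid base address plus a signed
// element offset.
struct OffsetPtr {
    int* base;              // always points at a real array element
    std::ptrdiff_t offset;  // may be negative; no address is formed from it yet

    // Pointer arithmetic only adjusts the integer offset.
    OffsetPtr operator-(std::ptrdiff_t k) const { return {base, offset - k}; }
    OffsetPtr operator+(std::ptrdiff_t k) const { return {base, offset + k}; }

    // The machine address is formed only here, when an element is accessed.
    int& operator[](std::ptrdiff_t i) const {
        assert(offset + i >= 0);
        return base[offset + i];
    }
};

// Receives the pair from a caller, as in the escaping-pointer case.
inline void fill(OffsetPtr aa, int n) {
    for (int i = 1; i <= n; ++i)
        aa[i] = i;
}
```

Every pointer now occupies two words and every access pays an extra addition,
which is the overhead the paragraph above says the standards declined to impose.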

---

Author: "E. Robert Tisdale" <edwin@netwood.net>
Date: 2000/06/01

Jim Hill wrote:

> But that has nothing to do with what is *guaranteed* to work
> by all compiler vendors claiming to sell conforming C++ compilers.

Compiler vendors claiming to sell conforming C++ compilers
might guarantee that the trick will work.
Exactly how many compiler vendors
claim to sell conforming C++ compilers?

> "Undefined behavior" says that in at least some circumstances,
> which circumstances include your hardware, software, and lunar date,
> your program will fail in completely undocumented ways.
> It says nothing about the probability of such circumstances.

Even if that probability is 0.

> Rewrite your example function as:
>
>         void foo(int a[], int n) {
>           for (int i = 1; i <= n; ++i)
>             a[i] = i;
>           }
>
> and try to pass it an address bar-1, where bar is a real array,
> and you get undefined behavior.

No.  I'll get an error message because foo accepts only int arrays.
You probably meant that bar is an int array.

> It might work. It might even always work on the machines you use.
> But somewhere, on some machine, some night, it *will* fail.
> The standards don't specify undefined behavior lightly.

Should I really worry about this?
Are there any viable C++ compilers
for which this example will fail to compile or run correctly?

---

Author: Hyman Rosen <hymie@prolifics.com>
Date: 2000/06/01

"E. Robert Tisdale" <edwin@netwood.net> writes:
> This begs the question.  Is it really necessary
> for the compiler to form the address of an invalid object
> just because there is a pointer to that object?

It's the most obvious thing for a compiler to do, and no one was
interested in preventing them from doing so.

---

Author: Hyman Rosen <hymie@prolifics.com>
Date: 2000/06/01

"E. Robert Tisdale" <edwin@netwood.net> writes:
> I'm glad that you cleared that up.
> I had hoped that the design of the C++ programming language
> would be free to exploit the best compiler technology available
> instead of being restricted by the worst.

It's "being restricted by the worst" that makes C and C++ so widely
available, from the largest mainframes to the tiniest microcontrollers.
The language designs allow for implementations to use simple and obvious
approaches to code generation if they want to do so.

---

Author: Hyman Rosen <hymie@prolifics.com>
Date: 2000/06/01

"E. Robert Tisdale" <edwin@netwood.net> writes:
> If I force the user to write
>     void f(int a[], int n) {
>       for (int i = 1; i <= n; i++)
>         a[i-1] = i;
>       }
> The user may be obliged to accept the inefficiency
> of computing i-1 in each iteration of the for loop.
> How do you determine that this inefficiency
> is more acceptable to users than the inefficiency
> imposed by converting an "artificial" pointer
> into a machine address?

Because we are allowing C++ compilers to make use of the best,
most modern, compiler technology, and that will allow the compiler
to optimize away the recomputations.

The modern, state-of-the-art technology that enables this is
probably over forty years old by now.

---

Author: "Peter Dimov" <pdimov@mmltd.net>
Date: 2000/06/01

E. Robert Tisdale <edwin@netwood.net> wrote in message
news:39354645.B57E2147@netwood.net...

[...]

> If I force the user to write
>
>     void f(int a[], int n) {
>       for (int i = 1; i <= n; i++)
>         a[i-1] = i;
>       }
>
> instead of
>
>     void f(int a[], int n) {
>       int *aa = a - 1;
>       for (int i = 1; i <= n; i++)
>         aa[i] = i;
>       }
>
> The user may be obliged to accept the inefficiency
> of computing i-1 in each iteration of the for loop.

Why do you think so?

a[i-1] is equivalent to *(a + i - 1); any half-decent compiler that can
safely form a-1 will transform this to *(a - 1 + i). This is exactly what
you wrote in the second function above.
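
As a rough illustration of that point, the two functions below fill an array
identically. The names fill_subscript and fill_pointer are hypothetical. The
second shows, written out by hand, the kind of loop an optimizer may reduce the
first to; no particular compiler is guaranteed to do so.

```cpp
// One-based loop using the standard-conforming subscript a[i - 1].
inline void fill_subscript(int a[], int n) {
    for (int i = 1; i <= n; ++i)
        a[i - 1] = i;
}

// The same loop with the adjustment removed by hand: a pointer walks from
// a to a + n (one past the end, which is allowed), so neither a - 1 nor
// i - 1 is ever computed.
inline void fill_pointer(int a[], int n) {
    int* p = a;
    for (int i = 1; i <= n; ++i)
        *p++ = i;
}
```

Both versions stay within the rules of the language, so the one-based loop does
not have to rely on forming a - 1 to avoid the per-iteration subtraction.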

--
Peter Dimov
Multi Media Ltd.


---

Author: "E. Robert Tisdale" <edwin@netwood.net>
Date: 2000/06/02

Peter Dimov wrote:

> Why do you think so?

I don't.

> a[i-1] is equivalent to *(a + i - 1);
> any half-decent compiler that can safely form a-1
> will transform this to *(a - 1 + i).
> This is exactly what you wrote in the second function above.

Jörg Barfurth wrote:

> On some platforms there might not be
> what you qualify as 'good optimizing  compiler'.
> You'd be glad to find a compiler there at all.
> Also, what is an optimization for one architecture,
> might turn out to be the opposite on another architecture
> (or even the next generation of the same architecture).

I'll let you argue the point with him.

---

Author: Steve Clamage <stephen.clamage@sun.com>
Date: 2000/05/31

"E. Robert Tisdale" wrote:
>
> Over in comp.lang.c++
> Bjarne Stroustrup <bs@research.att.com> wrote:
> >
> > "E. Robert Tisdale" <edwin@netwood.net> wrote:
> >
> > > Chris Mears <cmears@bigpond.com> wrote:
> > >
> > > > There's no standard way of doing this.
> > > > A terribly non-portable way
> > > > would be to take the address of the element before the first,
> > > > and index from that.  ...
> > > >
> > > In fact, it will work almost everywhere.  And, I believe that
> > > if Brian W. Kernighan, Dennis M. Ritchie and Bjarne Stroustrup
> > > had their way the standards documents would clearly state
> > > that it should work everywhere.
> >
> > I don't think so.  As far as I know, no C or C++ manual or standard
> > has ever allowed access beyond the beginning of an array
> > or even allowed forming a pointer to there.
> > Nor have I ever heard Dennis or Brian argue for such to be allowed.
> >
> > You can use negative indices, but only as long as
> > the resulting pointer refers to an element of the same array
> > as the pointer you index off, or one beyond the end. For example:
> >
> >         int v[100];
> >         int* p = &v[50];
> >
> >         void f(void) {
> >           int a = p[-20];         // ok
> >           int b = p[-200];        // error
> >           // ...
> >           }
> >
> > An ordinary compiler would not catch the error
> > except possibly in trivial cases like my example.
> >
> > The reason to disallow such pointers is that an array, such as v here,
> > might be allocated on or near a hardware boundary
> > so that there logically and physically isn't an element to refer to.
> > Hardware might give some hardware error
> > when the address is formed in a register or used for access,
> > or truncate the pointer that is formed for p[-200]
> > so that it refers to some other memory
> > (probably beyond the other end of the array).
>
> I'm not sure that this is a particularly compelling reason.
> It isn't clear to me why a good optimizing C++ compiler
> needs to form an address for such a pointer
> or store it in an address register.

> Consider, for example, the following function: ...
> which I compiled with my C++ compiler to obtain: ...
> Did Bjarne get it wrong? ...

You don't seem to grasp the difference between defining semantics
for a programming language, and the details of individual
implementations.

C and C++ take the view that the language should be implementable
efficiently on a wide range of machines. Consequently, the semantics
of the languages tend to be divorced from details of any class
of machines or translation techniques.

I'm going to make a small modification to Bjarne's example:
 int v[100]; // initialized before first use
 int foo(int* p, int i) { return  p[i]; }
 int bar1() { return foo(&v[50], -51); } // v[-1]
 int bar2() { return foo(&v[50], -100000); } // v[-99950]

Consider function foo.  In C and C++, the results are undefined
by the language definitions.  That means that anything might
happen. The compiler and run-time system might be such that v[-1]
is a valid address, and you'll get back some result. Or the address
might be invalid and you'll get a run-time hardware trap. Or the
compiler might generate code to validate array indexing, and you'll
get an out-of-range error.

That last possibility is an implementation option. You might like
to get a helpful error indication. But you might not want to pay
the cost of validating all indexing operations. Compilers can offer
you a choice, or you can choose a compiler that does what you want
-- providing efficiency or safety.

Suppose instead that the language defined what must happen. The
choices would be, it seems to me, that all indexing must produce a
result, or that all out-of-range indexing must be an error. (An
implementation might allow you to choose between these alternatives
on a case-by-case basis via pragmas, but I'd hate to see something
like that written into a language definition.)  Now you place an
efficiency burden on all programs, even carefully-written ones where
out-of-range indexing can't happen. (Sometimes the compiler can
determine that the check is not necessary, but only whole-program
data-flow analysis can determine whether it is necessary in foo;
foo might be separately compiled.)

We have an example in Pascal. The language definition says all
array indexing must be checked, and out-of-range errors must
be reported. You often found that compilers didn't do the checking
by default, and some didn't even have it as an option, because
the implementors decided that their customers didn't want the
efficiency hit.

In writing portable Pascal code, if you want safety, you have to
write source-level checks, which will be redundant (worsening the
efficiency hit) when compiled with standard-conforming compilers.
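
The trade-off between the two choices can be sketched as follows. The function
names get_unchecked and get_checked are hypothetical and chosen only for this
illustration; the sketch assumes the caller knows the array length n.

```cpp
#include <cstddef>
#include <stdexcept>

// Unchecked: a plain indexed load. An out-of-range i is undefined behavior,
// so the implementation owes the caller nothing in that case.
inline int get_unchecked(const int* a, std::ptrdiff_t i) {
    return a[i];
}

// Checked: pays a comparison on every access and reports the error
// instead of touching memory outside the array.
inline int get_checked(const int* a, std::size_t n, std::ptrdiff_t i) {
    if (i < 0 || static_cast<std::size_t>(i) >= n)
        throw std::out_of_range("index out of range");
    return a[i];
}
```

A program that can prove its indices are in range can call get_unchecked and pay
nothing, while one that wants a diagnostic can opt in to get_checked. Leaving
the behavior undefined is what lets the language offer both.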

--
Steve Clamage, stephen.clamage@sun.com

---

Author: "E. Robert Tisdale" <edwin@netwood.net>
Date: 2000/05/31

Steve Clamage wrote:

> You don't seem to grasp the difference
> between defining semantics for a programming language,
> and the details of individual implementations.

I thought I did.

> C and C++ take the view that the language
> should be implementable efficiently on a wide range of machines.
> Consequently, the semantics of the languages
> tend to be divorced from details
> of any class of machines or translation techniques.

Actually, I thought that was supposed to be true
of any high level computer programming language.

> I'm going to make a small modification to Bjarne's example:
>
>         int v[100]; // initialized before first use
>         int foo(int* p, int i) { return  p[i]; }
>         int bar1() { return foo(&v[50], -51); } // v[-1]
>         int bar2() { return foo(&v[50], -100000); } // v[-99950]
>
> Consider function foo.
> In C and C++, the results are undefined by the language definitions.
> That means that anything might happen.
> The compiler and run-time system might be such that
> v[-1] is [at] a valid address, and you'll get back some result.
> Or the address might be invalid and you'll get a run-time hardware trap.
> Or the compiler might generate code to validate array indexing,
> and you'll get an out-of-range error.

No.
You missed the point entirely I think.
There is no question that the behavior is undefined
if you actually try to reference an invalid object.
The question is whether simply computing a pointer
to an invalid object should be allowed, or rather,
"Why shouldn't it be allowed?"
Bjarne seems to think it is because the compiler might emit code
to compute the machine address of the invalid object
and cause a hardware fault, perhaps,
when the address is loaded into an address register.
But neither C nor C++ has an address data type.
The languages specify only abstract pointer types
and NOT the obvious connection with machine addresses.
Compilers don't always need to emit code
to form the address of an invalid object
just because there is a pointer to that object and some don't.
Is there a compelling reason why a compiler should emit code
to form the machine address of an invalid object
just because there is a pointer to that object?

---

Author: Hyman Rosen <hymie@prolifics.com>
Date: 2000/05/31

"E. Robert Tisdale" <edwin@netwood.net> writes:
> Is there a compelling reason why a compiler should emit code
> to form the machine address of an invalid object
> just because there is a pointer to that object?

The logical and compelling reason is that you wrote code that formed
such an address! Instead of requiring compilers to work around goofy
code like that, the standard says that such code is undefined.

If you check back on Deja for the '\e' thread, you're going to see
that people on this newsgroup are a tenacious bunch. We're not going
to back down from our correct opinions, and you're never going to be
allowed to have the last word.

---

Author: Jörg Barfurth <joerg.barfurth@attglobal.net>
Date: 2000/05/31

On 30.05.00, 17:20:27, "E. Robert Tisdale" <edwin@netwood.net> wrote on the
topic "base 1 subscripts in C++":

> Over in comp.lang.c++
> Bjarne Stroustrup <bs@research.att.com> wrote:
> >
> > "E. Robert Tisdale" <edwin@netwood.net> wrote:
> >
> > > Chris Mears <cmears@bigpond.com> wrote:
> > >
> > > > There's no standard way of doing this.
> > > > A terribly non-portable way
> > > > would be to take the address of the element before the first,
> > > > and index from that.  e.g.:

> > > > The method that I provided *might* work on your compiler,
> > > > but I can't guarantee it.
> > >
> > > In fact, it will work almost everywhere.  And, I believe that
> > > if Brian W. Kernighan, Dennis M. Ritchie and Bjarne Stroustrup
> > > had their way the standards documents would clearly state
> > > that it should work everywhere.
> >
> > I don't think so.  As far as I know, no C or C++ manual or standard
> > has ever allowed access beyond the beginning of an array
> > or even allowed forming a pointer to there.
> > Nor have I ever heard Dennis or Brian argue for such to be allowed.
> >
> > > The reason to disallow such pointers is that an array, such as v here,
> > might be allocated on or near a hardware boundary
> > so that there logically and physically isn't an element to refer to.
> > Hardware might give some hardware error
> > when the address is formed in a register or used for access,
> > or truncate the pointer that is formed for p[-200]
> > so that it refers to some other memory
> > (probably beyond the other end of the array).

> I'm not sure that this is a particularly compelling reason.
> It isn't clear to me why a good optimizing C++ compiler
> needs to form an address for such a pointer
> or store it in an address register.

Because you tell it to?
While the C and C++ Standards tell us what the effect of some programs must
be - and that for other programs, like yours, the effect is undefined -
they can't prescribe whether, how and to what extent a compiler should
optimize code.
On some platforms there might not be what you qualify as a 'good optimizing
compiler'. You'd be glad to find a compiler there at all.
Also, what is an optimization for one architecture might turn out to be
the opposite on another architecture (or even the next generation of the
same architecture).

> Consider, for example, the following function:

Let me modify your example to use one-based subscripting consistently:

         void f(int a[], int n) {
           for (int i = 1; i <= n; i++)
             a[i] = i;
           }
used as in
         int * g(int n) {
           int *aa = new int[n] - 1;
           f(aa,n);
           return aa;
         }

> which I compiled with my C++ compiler to obtain:
...

As did I:

1:     void f(int a[], int n) {
   mov         edx,dword ptr [esp+8]
   mov         eax,1
   cmp         edx,eax
   jl          f+1Eh (0040101e)
   mov         ecx,dword ptr [esp+4]  <<<<<< here
   add         ecx,4
2:       for (int i = 1; i <= n; i++)
3:         a[i] = i;
   mov         dword ptr [ecx],eax
   inc         eax
   add         ecx,4
   cmp         eax,edx
   jle         f+14h (00401014)
4:     }
   ret

5:
6:     int * g(int n) {
   push        esi
   push        edi
   mov         edi,dword ptr [esp+0Ch]
   lea         eax,[edi*4]
   push        eax
   call        operator new (00401070)
   mov         esi,eax
7:       int *aa = new int[n] - 1;
8:       f(aa,n);
   push        edi
   sub         esi,4          <<<<<< and here
   push        esi
   call        f (00401000)
   add         esp,0Ch
9:       return aa;
   mov         eax,esi
   pop         edi
   pop         esi
10:    }

> This code never forms an address corresponding to
>     int *aa = a - 1;

My code does form such address.

Other architectures have more dedicated address registers. And maybe
indexed access is better there than pointer arithmetic. I do recall a
processor where the pointer would be stored in main memory, but the index
could be kept in a register.
And I have never even looked at assembler for mainframes or
supercomputers, which might be vastly different.

From what one compiler does on one platform, you can't infer anything.
BTW: what does your compiler generate if you turn off optimizations? I
wouldn't like code that worked only when optimized.

> Did Bjarne get it wrong?
> Is there a more compelling reason
> for this restriction on pointers?
> Or are there really situations
> where the compiler is obliged to emit code
> that forms an address
> for an object which is never referenced?

No. Why? Maybe.

No, he is right. As he said, there are platforms where even forming an
invalid address causes a hardware error. A universal programming language
standard should cater for such platforms.

Why should there be? Isn't it sufficient that leaving out such
provisions would severely restrict the feasibility of implementing C/C++
translators for perfectly reasonable architectures (instances of which
did, do and maybe will exist)?
Is there a more compelling reason to remove this restriction? In C++ you
can get the same usage with a simple adapter class (template).
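
One way such an adapter might look is sketched below. This is an
illustration under assumptions, not code from the thread: the names
OneBased and fill are hypothetical, and std::vector is used for storage
only to keep the sketch short. The point is that the offset is applied
to the index, so no pointer before the start of the storage is formed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// OneBased is a hypothetical adapter: valid subscripts are 1..size().
template <typename T>
class OneBased {
public:
    explicit OneBased(std::size_t n) : data_(n) {}

    // The index is shifted, never the pointer, so the invalid
    // "one before the first element" address is not computed.
    T& operator[](std::size_t i) {
        assert(i >= 1 && i <= data_.size());
        return data_[i - 1];
    }
    const T& operator[](std::size_t i) const {
        assert(i >= 1 && i <= data_.size());
        return data_[i - 1];
    }

    std::size_t size() const { return data_.size(); }

private:
    std::vector<T> data_;
};

// One-based counterpart of the f() discussed in the thread.
inline void fill(OneBased<int>& a) {
    for (std::size_t i = 1; i <= a.size(); ++i)
        a[i] = static_cast<int>(i);
}
```

After "OneBased<int> a(5); fill(a);" the elements a[1] through a[5] hold
1 through 5. An optimizer is free to fold the "- 1" away, but correctness
does not depend on it doing so.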

Maybe there are, e.g. if you allow my program. If you disallow it, please
tell me why.
OTOH it doesn't really matter whether the compiler is obliged to emit
such code. You tell it to do it that way, so it is unreasonable to
require the compiler _not_ to emit such code. IOW the standard should not
mandate 'optimizations' or kludgy workarounds.
Look at the distinction between builtin pointer comparison and std::less
to see how carefully the committee weighed restrictions to accommodate
existing architectures' restrictions against application needs.
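
A small sketch of that distinction, assuming a hypothetical helper named
totally_ordered. The built-in < applied to pointers into unrelated
objects gives an unspecified result, whereas std::less specialised for
pointers is required to yield a total order, which the library can
implement in whatever way the platform needs.

```cpp
#include <cassert>
#include <functional>

// Returns true when std::less places the two pointers in a definite
// order (exactly one of "p before q" and "q before p" holds).
// For equal pointers neither holds, so the result is false.
inline bool totally_ordered(const int* p, const int* q) {
    std::less<const int*> before;
    return before(p, q) != before(q, p);
}
```

For two distinct objects x and y, totally_ordered(&x, &y) is true even
though "&x < &y" itself carries no such guarantee.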

Essentially it comes down to: The core language should not restrict
implementors or hardware unduly, as long as the 'missing' feature can be
provided by a library. Some such features should be in the standard
library if (a) they are universally needed and/or (b) their
implementation strongly depends on or benefits from platform-specific
implementation.

Regards, Jörg

---

Author: Pierre Baillargeon <pb@artquest.net>
Date: 2000/05/31

"E. Robert Tisdale" wrote:
>
> Consider, for example, the following function:
>
>         void f(int a[], int n) {
>           int *aa = a - 1;
>           for (int i = 1; i <= n; i++)
>             aa[i] = i;
>           }
>

[...]

> Or are there really situations
> where the compiler is obliged to emit code
> that forms an address
> for an object which is never referenced?

As in: turning debugging on? Most compilers I've used turn off most
optimizations when producing debug information, and more importantly,
will generate at least one opcode for each line of C++ code so that
breakpoints can be set and variables examined. Do you want to forfeit
the use of a debugger?

The other compiler that will not optimize that line away and will even
produce bad code from time to time is called "maintainer". I've been in
places where the original programmer had found "neat tricks". Note that
I no longer work there. Amen.

---

Author: "E. Robert Tisdale" <edwin@netwood.net>
Date: 2000/06/01

Pierre Baillargeon wrote:

> As in: turning debugging on?
> Most compilers I've used turn off most optimizations
> when producing debug information, and more importantly,
> will generate at least one opcode for each line of C++ code
> so that breakpoints can be set and variables examined.
> Do you want to forfeit the use of a debugger?

No.  I sometimes find a debugger useful
when debugging other people's code.

> The other compiler that will not optimize that line away
> and will even produce bad code from time to time is called "maintainer".
> I've been in places where the original programmer had found "neat tricks".
> Note that I no longer work there.  Amen.

This begs the question.  Is it really necessary
for the compiler to form the address of an invalid object
just because there is a pointer to that object?

---

Author: "E. Robert Tisdale" <edwin@netwood.net>
Date: 2000/05/30

Over in comp.lang.c++
Bjarne Stroustrup <bs@research.att.com> wrote:
>
> "E. Robert Tisdale" <edwin@netwood.net> wrote:
>
> > Chris Mears <cmears@bigpond.com> wrote:
> >
> > > There's no standard way of doing this.
> > > A terribly non-portable way
> > > would be to take the address of the element before the first,
> > > and index from that.  e.g.:
> > >
> > >     void f(void) {
> > >       int array[10];
> > >       int *ptr = array - 1;
> > >       ptr[1] = 1;
> > >       ptr[10] = 10;
> > >        /* Cannot use ptr[0]. */
> > >       }
> > >
> > > The method that I provided *might* work on your compiler,
> > > > but I can't guarantee it.
> > > The compiler's documentation might mention something about it.
> > > Don't use that code if you want 100% reliability.
> >
> > In fact, it will work almost everywhere.  And, I believe that
> > if Brian W. Kernighan, Dennis M. Ritchie and Bjarne Stroustrup
> > had their way the standards documents would clearly state
> > that it should work everywhere.
>
> I don't think so.  As far as I know, no C or C++ manual or standard
> has ever allowed access beyond the beginning of an array
> or even allowed forming a pointer to there.
> Nor have I ever heard Dennis or Brian argue for such to be allowed.
>
> You can use negative indices, but only as long as
> the resulting pointer refers to an element of the array
> that the pointer you index off, or one beyond the end. For example:
>
>         int v[100];
>         int* p = &v[50];
>
>         void f(void) {
>           int a = p[-20];         // ok
>           int b = p[-200];        // error
>           // ...
>           }
>
> An ordinary compiler would not catch the error
> except possibly in trivial cases like my example.
>
> The reason to disallow such pointers is that an array, such as v here,
> might be allocated on or near a hardware boundary
> so that there logically and physically isn't an element to refer to.
> Hardware might give some hardware error
> when the address is formed in a register or used for access,
> or truncate the pointer that is formed for p[-200]
> so that it refers to some other memory
> (probably beyond the other end of the array).

I'm not sure that this is a particularly compelling reason.
It isn't clear to me why a good optimizing C++ compiler
needs to form an address for such a pointer
or store it in an address register.

Consider, for example, the following function:

        void f(int a[], int n) {
          int *aa = a - 1;
          for (int i = 1; i <= n; i++)
            aa[i] = i;
          }

which I compiled with my C++ compiler to obtain:

        gcc2_compiled.:
        .text
                .align 4
        .globl f__FPii
                .type    f__FPii,@function
        f__FPii:
        .LFB1:
                pushl %ebp
        .LCFI0:
                movl %esp,%ebp
        .LCFI1:
                movl 12(%ebp),%ecx      # n
                movl $1,%eax            # i = 1
                cmpl %ecx,%eax          # i - n
                jg .L7                  # if (i > n)
                movl 8(%ebp),%edx       # a
                .align 4
        .L5:
                movl %eax,(%edx)        # *a = i
                addl $4,%edx            # a += 1
                incl %eax               # i++
                cmpl %ecx,%eax          # i - n
                jle .L5                 # if (i <= n)
        .L7:
                leave
                ret

This code never forms an address corresponding to

    int *aa = a - 1;

Did Bjarne get it wrong?
Is there a more compelling reason
for this restriction on pointers?
Or are there really situations
where the compiler is obliged to emit code
that forms an address
for an object which is never referenced?

---

Author: wmm@fastdial.net
Date: 2000/05/31

In article <39332A20.77F5B261@netwood.net>,
  "E. Robert Tisdale" <edwin@netwood.net> wrote:
> Over in comp.lang.c++
> Bjarne Stroustrup <bs@research.att.com> wrote:
> >
> > I don't think so.  As far as I know, no C or C++ manual or standard
> > has ever allowed access beyond the beginning of an array
> > or even allowed forming a pointer to there.
> > Nor have I ever heard Dennis or Brian argue for such to be allowed.
> >
> > You can use negative indices, but only as long as
> > the resulting pointer refers to an element of the array
> > that the pointer you index off, or one beyond the end. For example:
> >
> >         int v[100];
> >         int* p = &v[50];
> >
> >         void f(void) {
> >           int a = p[-20];         // ok
> >           int b = p[-200];        // error
> >           // ...
> >           }
> >
> > An ordinary compiler would not catch the error
> > except possibly in trivial cases like my example.
> >
> > > The reason to disallow such pointers is that an array, such as v here,
> > might be allocated on or near a hardware boundary
> > so that there logically and physically isn't an element to refer to.
> > Hardware might give some hardware error
> > when the address is formed in a register or used for access,
> > or truncate the pointer that is formed for p[-200]
> > so that it refers to some other memory
> > (probably beyond the other end of the array).
>
> I'm not sure that this is a particularly compelling reason.
> It isn't clear to me why a good optimizing C++ compiler
> needs to form an address for such a pointer
> or store it in an address register.
>
> Did Bjarne get it wrong?
> Is there a more compelling reason
> for this restriction on pointers?
> Or are there really situations
> where the compiler is obliged to emit code
> that forms an address
> for an object which is never referenced?

No, that's the reason for the restriction.  There are two
points in response to your question:

First, there are situations where optimization (at least short
of full global optimization) cannot eliminate forming the
address.  For instance, if the address escapes the local context
(returned as the function's value, passed as an argument, stored
in a global variable), there's no way for the compiler in
general to do the constant folding that eliminates the negative
offset.

Second, standards generally do not mandate a given level of
optimization in implementations.  The intent is to allow a
completely straightforward implementation with _no_ optimization,
just a slavishly literal application of the abstract rules, to
be conforming.  Differing levels of optimization are regarded as
a quality-of-implementation issue, better regulated by market
pressures than by standardization.  That means that it's
irrelevant whether "a good optimizing C++ compiler" could
avoid the problem; the Standard doesn't require that a C++
implementation be good at optimizing to be conformant.
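
The escaping case in the first point can be avoided at the source level.
The sketch below is an illustration with hypothetical names
(fill_one_based, end_of): only the real base pointer crosses the function
boundary, and the one-based offset is applied to the index at each access.
It also shows the asymmetry in the rules: a pointer one past the end may
be formed and passed around, while one before the beginning may not.

```cpp
#include <cassert>
#include <cstddef>

// The caller passes the genuine start of the array. The subscript
// i runs from 1 to n, and the "- 1" is applied to the index, so the
// function never depends on the compiler folding away "base - 1".
inline void fill_one_based(int* base, std::size_t n) {
    for (std::size_t i = 1; i <= n; ++i)
        base[i - 1] = static_cast<int>(i);
}

// Forming (but not dereferencing) the one-past-the-end pointer is
// explicitly allowed, so it is safe for this value to escape.
inline int* end_of(int* base, std::size_t n) {
    return base + n;
}

// By contrast, "return base - 1;" would form a pointer before the
// array and is undefined even if the result is never dereferenced.
```

With "int a[4]; fill_one_based(a, 4);" the array holds 1, 2, 3, 4 and
"end_of(a, 4) - a" is 4.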
--
William M. Miller, wmm@fastdial.net
OnDisplay, Inc. (www.ondisplay.com)


---