Topic: Bounds checking


Author: danielgutson@hotmail.com (danielgutson@hotmail.com)
Date: Thu, 2 Jan 2003 19:24:04 +0000 (UTC)
Raw View
allan_w@my-dejanews.com (Allan W) wrote in message news:<7f2735a5.0212261407.5687ee93@posting.google.com>...
> hyrosen@mail.com (Hyman Rosen) wrote
> > This from a language without bounds checking.
>
> Is C++ a language without bounds checking? Certainly all of the
> popular C and C++ compilers don't bounds-check on built-in arrays.
> But is this due to the standard, or just due to existing practice?
>
> Going beyond array bounds is already either compiler-defined or
> undefined behavior, right? (I couldn't find this in the standard,
> but I *know* it's in there somewhere!)
>
> On the other hand, I also couldn't find anything in the standard
> that guarantees the compiler will NOT check bounds.
>
> Seems to me it's possible to apply bounds-checking right now,
> without a language change -- and that the only reason compilers
> don't already do this, is that the earliest compilers didn't.
> That makes it a QOI issue.
>
> Yes, this would break legacy code which depends on this behavior,
> so any compiler which started checking bounds would be well-advised
> to supply a compiler switch to turn it off (or maybe a local
> extension to turn it off for one particular array). Presumably this
> is the reason that (AFAIK) no current compiler checks bounds.
>
> Still, I'm under the impression that the language does allow
> compilers to check it now -- and if I'm wrong, I'm sure I'll hear
> about it here.
>
> ---
> [ comp.std.c++ is moderated.  To submit articles, try just posting with ]
> [ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
> [              --- Please see the FAQ before posting. ---               ]
> [ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]

Some STL implementations have some asserts for bound checking in
vector(i.e. Microsoft's VisualC++), that is, always available at debug
time.
So, whenever you use the [] operator, it performs range checking.

You might also want 'ranges', as in PASCAL.
You can achive a range by defining a template and wrapping all the
integral operations:

template <int Min, int Max> class Range
{
public:
    Range<Min, Max>& operator = (int x);
    template <int Min2, int Max2> operator = (const Range<Min2, Max2>&
r);
    ...
private:
    int _value;
};

with some asserts in the members.

Hope it helps,
  Daniel.

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: kanze@gabi-soft.de (James Kanze)
Date: Tue, 31 Dec 2002 19:21:30 +0000 (UTC)
Raw View
nagle@animats.com (John Nagle) wrote in message
news:<3E0DE586.1000901@animats.com>...
> Allan W wrote:

> > hyrosen@mail.com (Hyman Rosen) wrote

> >>This from a language without bounds checking.

> > Is C++ a language without bounds checking? Certainly all of the
> > popular C and C++ compilers don't bounds-check on built-in arrays.
> > But is this due to the standard, or just due to existing practice?

Existing practice.  The effect of an out of bounds access is undefined
behavior, both in C and in C++.  This is intentional.

    [...]
>     The problem with built-in C arrays is that C treats arrays as a
> synonym for pointer arithmetic.  When you pass a pointer, the language
> doesn't distinguish between a pointer to a single object, a pointer to
> the beginning of an array of objects, or a pointer to a position
> within an array.

The C standard was carefully written to expressedly allow 'fat
pointers', that is, pointers which contain the extra information
necessary for bounds checking.  At least one implementation in the past,
CenterLine, did this.

Speed is probably the only reason implementations don't do it;
CenterLine itself advertised itself as a debugging compiler, and didn't
pretend to generate code for use in final production.  And since you
mentionned (in the part I cut) STL implementations with a debug mode,
why don't compilers have a debug mode for the built-in stuff: fat
pointers, array bounds checking, overflow detection, etc.?  It would run
significantly slower than normal, but you wouldn't have to use it for
release code.

--
James Kanze                           mailto:jkanze@caicheuvreux.com
Conseils en informatique orient   e objet/
                    Beratung in objektorientierter Datenverarbeitung

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: hyrosen@mail.com (Hyman Rosen)
Date: Tue, 31 Dec 2002 21:06:59 +0000 (UTC)
Raw View
James Kanze wrote:
> why don't compilers have a debug mode for the built-in stuff: fat
> pointers, array bounds checking, overflow detection, etc.?

Because of the interface difficulties involved in linking to
system and vendor libraries compiled without this mode.

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: petebecker@acm.org (Pete Becker)
Date: Tue, 31 Dec 2002 21:12:05 +0000 (UTC)
Raw View
James Kanze wrote:
>
> And since you
> mentionned (in the part I cut) STL implementations with a debug mode,
> why don't compilers have a debug mode for the built-in stuff: fat
> pointers, array bounds checking, overflow detection, etc.?  It would run
> significantly slower than normal, but you wouldn't have to use it for
> release code.
>

Borland provides this. With full checking on it's about 10 times slower
than without.

--

Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: nagle@animats.com (John Nagle)
Date: Sat, 28 Dec 2002 19:59:15 +0000 (UTC)
Raw View
Allan W wrote:

> hyrosen@mail.com (Hyman Rosen) wrote
>
>>This from a language without bounds checking.
>>
>
> Is C++ a language without bounds checking? Certainly all of the
> popular C and C++ compilers don't bounds-check on built-in arrays.
> But is this due to the standard, or just due to existing practice?


    It's possible to check bounds on STL collections, and
STLport does this in debug mode.  This is a useful feature,
and more STL implementations should have it.  (Incidentally,
such checks can be optimized out in many cases.  Hoisting
subscript checks out of loops is often possible. An old
study of Pascal optimization found that 95% of subscript
checks could be optimized out at compile time.  With
proper compiler support, subscript checking doesn't
incur much run-time cost.  You probably get your money
back the first time it finds a bug.)

    The problem with built-in
C arrays is that C treats arrays as a synonym for pointer
arithmetic.  When you pass a pointer, the language doesn't
distinguish between a pointer to a single object, a pointer
to the beginning of an array of objects, or a pointer to
a position within an array.

    There have been attempts to work around this.  If you
have a storage allocator which, given a raw pointer, can
find the block information for the storage block into
which that pointer points, you can check subscripts even
for raw pointers.  But it's really slow.

    It's also possible to implement pointers as references
to objects and offsets within the object, but few C
implementations do this.  Too much code assumes that
pointers are numbers, not structures.


     John Nagle
     Animats

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: scottm@toast.net ("Scott Mayo")
Date: Sun, 29 Dec 2002 04:04:20 +0000 (UTC)
Raw View
"Allan W":

> Is C++ a language without bounds checking? Certainly all of the
> popular C and C++ compilers don't bounds-check on built-in arrays.
> But is this due to the standard, or just due to existing practice?

Planting bounds checks on the code generated by a[i] requires
adding instructions to check i. This typically doubles the number
of instructions, and more than halves the speed. Plenty of people
would be willing to form a lynch mob if someone made that
speed loss a permanent part of C++. Anyway, there are
packages out there that can do it for you *optionally*,
when you need to debug. (Unsurprisingly, most of them
cause a huge performance hit.)

And while checking a[i] for a[], an automatic or static array, is
easy, checking p[i], p some pointer, is not. It can be impossible:

    extern int p[];
    extern char* q;

    p[37]; //legal, yes. In bounds? Who knows?
    q[37]; //legal, yes. In bounds? Who knows?

By heritage, C++ is just assembler with a nice paint job. Not too
many assemblers know how to check array bounds. You want
a high level language for what you are describing.



---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: pjp@dinkumware.com ("P.J. Plauger")
Date: Sun, 29 Dec 2002 21:44:51 +0000 (UTC)
Raw View
"John Nagle" <nagle@animats.com> wrote in message news:3E0DE586.1000901@animats.com...

> > Is C++ a language without bounds checking? Certainly all of the
> > popular C and C++ compilers don't bounds-check on built-in arrays.
> > But is this due to the standard, or just due to existing practice?
>
>     It's possible to check bounds on STL collections, and
> STLport does this in debug mode.  This is a useful feature,
> and more STL implementations should have it.

Indeed. Our V4.0 library, now available at our web site, has a superset
of the checks in STLport.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com


---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: llewelly.@@xmission.dot.com (llewelly)
Date: Mon, 30 Dec 2002 03:12:19 +0000 (UTC)
Raw View
scottm@toast.net ("Scott Mayo") writes:

> "Allan W":
>
> > Is C++ a language without bounds checking? Certainly all of the
> > popular C and C++ compilers don't bounds-check on built-in arrays.
> > But is this due to the standard, or just due to existing practice?
>
> Planting bounds checks on the code generated by a[i] requires
> adding instructions to check i. This typically doubles the number
> of instructions, and more than halves the speed. Plenty of people
> would be willing to form a lynch mob if someone made that
> speed loss a permanent part of C++. Anyway, there are
> packages out there that can do it for you *optionally*,
> when you need to debug. (Unsurprisingly, most of them
> cause a huge performance hit.)
>
> And while checking a[i] for a[], an automatic or static array, is
> easy, checking p[i], p some pointer, is not. It can be impossible:
>
>     extern int p[];
>     extern char* q;

This is not impossible. A pointer could implemented as 3-member
    struct:

struct fat_pointer
{
    T* start_span;
    T* end_span;
    T* position;
};
[snip]

In addition to the code required for the bounds checks, this method
    also increases the size of pointers (by a factor of 3), and the
    overhead of copying them.

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: nagle@animats.com (John Nagle)
Date: Mon, 30 Dec 2002 20:16:58 +0000 (UTC)
Raw View
llewelly wrote:

> scottm@toast.net ("Scott Mayo") writes:
>
>
>>"Allan W":
>>
>>
>>>Is C++ a language without bounds checking? Certainly all of the
>>>popular C and C++ compilers don't bounds-check on built-in arrays.
>>>But is this due to the standard, or just due to existing practice?
>>>
>>Planting bounds checks on the code generated by a[i] requires
>>adding instructions to check i. This typically doubles the number
>>of instructions, and more than halves the speed.


    Compilers can do a far better job than that.  About 95%
of subscript checks can be eliminated by simple optimizations.
But compilers need to know more about arrays to do this.

    STL iterator checking could potentially be optimized
quite a bit.  Consider the following approach.

    First, iterators should be classified by the compiler
as "simple" or "complex".  Simple iterators have the
following properties:

    - The first assignment to the iterator binds the iterator to a
 collection.
    - There are no possible reference to the iterator before
 its first assignment.
    - No operation applied to the iterator other than the first
 assignment can potentially bind the iterator to a
 different collection.

Simple iterators are thus bound to a single collection for
their entire lives, and the compiler doesn't have to carry
a pointer to their collection along for checking purposes.
The compiler can represent them as raw pointers, yet still
check them.

    Second, iterator checks for simple iterators
should be hoisted out of loops whenever possible.
Most of the common patterns of iterator usage allow
this optimization.

    With these optimizations, most of the checking overhead
can be eliminated from inner loops.  There will still be
some run-time checks required, but they probably won't
impact performance much.

    If anybody is really interested, I'll describe how to do
this in more detail.  It's not that hard.

    John Nagle
    Animats

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: scottm@toast.net ("Scott Mayo")
Date: Thu, 2 Jan 2003 06:24:40 +0000 (UTC)
Raw View
"llewelly" <llewelly.@@xmission.dot.com> wrote in message
news:86n0mow6iw.fsf@Zorthluthik.foo...
> scottm@toast.net ("Scott Mayo") writes:

> > And while checking a[i] for a[], an automatic or static array, is
> > easy, checking p[i], p some pointer, is not. It can be impossible:
> >
> >     extern int p[];
> >     extern char* q;

> This is not impossible. A pointer could implemented as 3-member
>     struct:
>
> struct fat_pointer
> {
>     T* start_span;
>     T* end_span;
>     T* position;
> };

This works for many things. It doesn't work well in the presence of realloc,
even if realloc doesn't move the array.

Speaking of which, if there is one thing I'd like to see added to C++, it's
the ability
to change the size of an allocated array of type T, such that if I extend
the array, T's ctor
is run on the new elements, and if I shrink it, T's dtor runs on the dropped
elements.
I *think* C++ already has enough mechanics in place to grant this
more or less for free, if there was a syntax in place to specify it.

I end up simulating this with an extra class, a realloc and some mechanics
when I want it. Or is there a better way?




---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: witless@attbi.com (Witless)
Date: Thu, 2 Jan 2003 06:40:55 +0000 (UTC)
Raw View
James Kanze wrote:

> why don't compilers have a debug mode for the built-in stuff: fat
> pointers, array bounds checking, overflow detection, etc.?  It would run
> significantly slower than normal, but you wouldn't have to use it for
> release code.

Actually you would have to use it.  See Hoare's discussion of the reactions
of his customers when he first implemented bounds checks.  They insisted that
the bounds checks be left in place because correctness was more important to
them than performance.

I can't think of anything in the last ~40 years to reverse those priorities.
Is it because computers are faster now that it is OK for them to be wrong
more often?

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]





Author: allan_w@my-dejanews.com (Allan W)
Date: Fri, 27 Dec 2002 20:41:53 +0000 (UTC)
Raw View
hyrosen@mail.com (Hyman Rosen) wrote
> This from a language without bounds checking.

Is C++ a language without bounds checking? Certainly all of the
popular C and C++ compilers don't bounds-check on built-in arrays.
But is this due to the standard, or just due to existing practice?

Going beyond array bounds is already either compiler-defined or
undefined behavior, right? (I couldn't find this in the standard,
but I *know* it's in there somewhere!)

On the other hand, I also couldn't find anything in the standard
that guarantees the compiler will NOT check bounds.

Seems to me it's possible to apply bounds-checking right now,
without a language change -- and that the only reason compilers
don't already do this, is that the earliest compilers didn't.
That makes it a QOI issue.

Yes, this would break legacy code which depends on this behavior,
so any compiler which started checking bounds would be well-advised
to supply a compiler switch to turn it off (or maybe a local
extension to turn it off for one particular array). Presumably this
is the reason that (AFAIK) no current compiler checks bounds.

Still, I'm under the impression that the language does allow
compilers to check it now -- and if I'm wrong, I'm sure I'll hear
about it here.

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]