Topic: size_t confusion please help...

Author: clive@sco.com (Clive D.W. Feather)
Date: Thu, 6 Jan 1994 20:59:01 GMT

>>>>>As an aside, one of the great mysteries of the ANSI C standard, at least
>>>>>to me, is that while pointer addition involves a pointer and an int,
>>>>>the difference of two pointers yields a ptrdiff_t.
>>> Also subject of a defect report I think.
>>Can someone please provide more info on this too.

This is *not* the subject of a Defect Report (though there was one on
what "size_t" and "ptrdiff_t" actually mean).

Using [] to mean optional, the type rules for pointer arithmetic are:

    pointer + [unsigned] [long] int => pointer
    pointer - [unsigned] [long] int => pointer
    [unsigned] [long] int + pointer => pointer
    pointer - pointer => "type PDT"

[char and short types will have been promoted].

"type PDT" is implementation-defined, but it is one of:

    signed char
    signed short int
    signed int
    signed long int

If you include <stddef.h>, the name "ptrdiff_t" becomes reserved with
file scope, and is a typedeffed type which is identical to "type PDT".
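
The rules above can be checked at compile time in C++. This is only a sketch, assuming a C++11 compiler; the aliases and the helper `distance_in_elements` are hypothetical names made up for illustration, not anything from the Standard.

```cpp
#include <cassert>
#include <cstddef>      // std::ptrdiff_t
#include <type_traits>  // std::is_same, std::is_signed
#include <utility>      // std::declval

// pointer - pointer => "type PDT", which <cstddef> names std::ptrdiff_t.
using Diff = decltype(std::declval<const int*>() - std::declval<const int*>());
static_assert(std::is_same<Diff, std::ptrdiff_t>::value,
              "pointer - pointer yields ptrdiff_t");
static_assert(std::is_signed<Diff>::value, "ptrdiff_t is a signed type");

// pointer +/- any integral type => pointer, in either operand order for +.
using PlusUL = decltype(std::declval<const int*>() + std::declval<unsigned long>());
using MinusL = decltype(std::declval<const int*>() - std::declval<long>());
using UPlus  = decltype(std::declval<unsigned>() + std::declval<const int*>());
static_assert(std::is_same<PlusUL, const int*>::value, "pointer + unsigned long");
static_assert(std::is_same<MinusL, const int*>::value, "pointer - long");
static_assert(std::is_same<UPlus, const int*>::value, "unsigned + pointer");

// Hypothetical helper: signed distance between two pointers into one array.
inline std::ptrdiff_t distance_in_elements(const int* first, const int* last) {
    return last - first;
}
```

For an `int a[5]`, `distance_in_elements(a, a + 5)` gives 5 and `distance_in_elements(a + 3, a)` gives -3.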

Clear now ?

--
Clive D.W. Feather     | Santa Cruz Operation    | If you lie to the compiler,
clive@sco.com          | Croxley Centre          | it will get its revenge.
Phone: +44 923 816 344 | Hatters Lane, Watford   |   - Henry Spencer
Fax:   +44 923 817 688 | WD1 8YN, United Kingdom | <== * NOTE NEW INFORMATION *

Author: maxtal@physics.su.OZ.AU (John Max Skaller)
Date: Fri, 7 Jan 1994 00:20:58 GMT

In article <CJ87MD.BH3@x.co.uk> clive@sco.com (Clive D.W. Feather) writes:
>>>>>>As an aside, one of the great mysteries of the ANSI C standard, at least
>>>>>>to me, is that while pointer addition involves a pointer and an int,
>>>>>>the difference of two pointers yields a ptrdiff_t.
>>>> Also subject of a defect report I think.
>>>Can someone please provide more info on this too.
>
>This is *not* the subject of a Defect Report (though there was one on
>what "size_t" and "ptrdiff_t" actually mean).
>
>Using [] to mean optional, the type rules for pointer arithmetic are:
>
>    pointer + [unsigned] [long] int => pointer
>    pointer - [unsigned] [long] int => pointer
>    [unsigned] [long] int + pointer => pointer
>    pointer - pointer => "type PDT"
>
>[char and short types will have been promoted].
>
>"type PDT" is implementation-defined, but it is one of:
>
>    signed char
>    signed short int
>    signed int
>    signed long int
>
>If you include <stddef.h>, the name "ptrdiff_t" becomes reserved with
>file scope, and is a typedeffed type which is identical to "type PDT".
>
>Clear now ?

 Sure: there is no assurance that

 p1 + (p2 - p1) == p2

unless the pointers are separated by fewer than 127 units.
Even if type PDT is long, because size_t might be unsigned long
we have

 unsigned long n;
 (p1 + n) - p1 == n // not guaranteed

In particular, what is the result of pointer subtraction
if the pointers are too far apart? Can I convert the
result to unsigned to get the actual distance if I know
the order?


--
        JOHN (MAX) SKALLER,         INTERNET:maxtal@suphys.physics.su.oz.au
 Maxtal Pty Ltd,      CSERVE:10236.1703
        6 MacKay St ASHFIELD,     Mem: SA IT/9/22,SC22/WG21
        NSW 2131, AUSTRALIA

Author: clive@sco.com (Clive D.W. Feather)
Date: Sat, 8 Jan 1994 20:17:13 GMT

In article <CJ8Gyy.5xq@ucc.su.OZ.AU> maxtal@physics.su.OZ.AU (John Max Skaller) writes:
>In article <CJ87MD.BH3@x.co.uk> clive@sco.com (Clive D.W. Feather) writes:
>> Using [] to mean optional, the type rules for pointer arithmetic are:
>>    pointer + [unsigned] [long] int => pointer
>>    pointer - [unsigned] [long] int => pointer
>>    [unsigned] [long] int + pointer => pointer
>>    pointer - pointer => "type PDT"
>> [char and short types will have been promoted].
>> "type PDT" is implementation-defined, but it is one of:
>>    signed char
>>    signed short int
>>    signed int
>>    signed long int
>> If you include <stddef.h>, the name "ptrdiff_t" becomes reserved with
>> file scope, and is a typedeffed type which is identical to "type PDT".

> Sure: there is no assurance that
> p1 + (p2 - p1) == p2
> unless the pointers are separated by fewer than 127 units.

No. PDT must be capable of holding the difference between any two
elements of an array. What a suitable type is will depend on the
implementation, but the Standard requires 32767 element arrays. If an
implementation allows arrays of 1000000 elements, then PDT must be able
to hold +1000001 and -1000001.

So the above is true if p1 and p2 are pointers to the same array or one
beyond the end; if not, p2 - p1 is undefined.
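
A small sketch of the identity p1 + (p2 - p1) == p2 for pointers into one array, including the one-past-the-end position. The names `round_trips` and `demo_round_trip` are hypothetical, and the precondition (same array) is assumed rather than checked, since it cannot be checked portably.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: true when p1 + (p2 - p1) lands back on p2.
// Precondition (assumed, not verified): p1 and p2 point into the same
// array, or one past its end. Otherwise p2 - p1 is undefined.
inline bool round_trips(const char* p1, const char* p2) {
    std::ptrdiff_t d = p2 - p1;  // guaranteed to fit for same-array pointers
    return p1 + d == p2;
}

inline bool demo_round_trip() {
    static char buf[1000];
    const char* begin = buf;
    const char* end = buf + 1000;  // one past the end: may be formed, not dereferenced
    return round_trips(begin, end)              // positive difference
        && round_trips(end, begin)              // negative difference
        && round_trips(begin + 10, begin + 500);
}
```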

> Even if type PDT is long, because size_t might be unsigned long
> we have
> unsigned long n;
> (p1 + n) - p1 == n // not guaranteed

If p1 points to element I of an array of J elements (0 <= I < J), then this
is always true provided that (I + n <= J). Otherwise (p1 + n) is undefined.

The type of size_t doesn't matter.
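
To make the (I + n <= J) condition concrete, here is a sketch that refuses to form an out-of-range pointer and otherwise checks that (p1 + n) - p1 == n. The function name and parameters are hypothetical; the bounds test is written as a subtraction so that I + n cannot itself overflow.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper. `base` points at an array of `size` chars (J),
// `i` is the starting index (I) and `n` is the offset to add.
// Returns false instead of forming a pointer beyond one past the end.
inline bool offset_round_trips(const char* base, std::size_t size,
                               std::size_t i, unsigned long n) {
    if (i > size || n > size - i) {
        return false;  // I + n > J: p1 + n would be undefined
    }
    const char* p1 = base + i;
    const char* p2 = p1 + n;  // at most one past the end
    // The difference is non-negative here, so the conversion is value-preserving.
    return static_cast<unsigned long>(p2 - p1) == n;
}
```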

> In particular, what is the result of pointer subtraction
> if the pointers are too far apart? Can I convert the
> result to unsigned to get the actual distance if I know
> the order?

If the elements are in the same array, then the distance between them
*will* fit into PDT. If not, you can't subtract them anyway.

--
Clive D.W. Feather     | Santa Cruz Operation    | If you lie to the compiler,
clive@sco.com          | Croxley Centre          | it will get its revenge.
Phone: +44 923 816 344 | Hatters Lane, Watford   |   - Henry Spencer
Fax:   +44 923 817 688 | WD1 8YN, United Kingdom | <== * NOTE NEW INFORMATION *

Author: fjh@munta.cs.mu.OZ.AU (Fergus Henderson)
Date: Sun, 9 Jan 1994 02:46:22 GMT

clive@sco.com (Clive D.W. Feather) writes:

>maxtal@physics.su.OZ.AU (John Max Skaller) writes:
>>clive@sco.com (Clive D.W. Feather) writes:
>>> Using [] to mean optional, the type rules for pointer arithmetic are:
>>>    pointer - pointer => "type PDT"
>>> "type PDT" is implementation-defined, but it is one of:
>>>    signed char
>>>    signed short int
>>>    signed int
>>>    signed long int
>>> If you include <stddef.h>, the name "ptrdiff_t" becomes reserved with
>>> file scope, and is a typedeffed type which is identical to "type PDT".
>
>> Sure: there is no assurance that
>> p1 + (p2 - p1) == p2
>> unless the pointers are separated by fewer than 127 units.
>
>No. PDT must be capable of holding the difference between any two
>elements of an array. What a suitable type is will depend on the
>implementation, but the Standard requires 32767 element arrays. If an
>implementation allows arrays of 1000000 elements, then PDT must be able
>to hold +1000001 and -1000001.

So that means that a compiler that allowed 64000 element arrays, but
for which 64000 did not fit into a ptrdiff_t, would be non-conforming?

(If so, then both Borland and Microsoft C are non-conforming, I believe.)

--
Fergus Henderson        |   "People who brook no compromise in programming
                        |   languages should program in lambda calculus or
fjh@munta.cs.mu.OZ.AU   |   machine language, depending." --Andrew Koenig.

Author: sheeran@ndg.co.jp (Frank Sheeran)
Date: Mon, 10 Jan 1994 02:27:11 GMT

Author: maxtal@physics.su.OZ.AU (John Max Skaller)
Date: Mon, 27 Dec 1993 14:10:13 GMT

In article <CII96F.K60@ses.com> jamshid@ses.com (Jamshid Afshar) writes:
>Crossposting to comp.std.c.  I think the question is whether
>(unsigned)-1 is guaranteed to give you UINT_MAX.
>
>In article <CIDIqy.AIs@ucc.su.oz.au>,
>John Max Skaller <maxtal@physics.su.OZ.AU> wrote:
>>In article <CICz0G.FtA@ses.com> jamshid@ses.com (Jamshid Afshar) writes:

>>>(size_t)-1 bytes since `operator new()' takes a size_t.  (note,
>>>casting -1 to an unsigned integral type gives you the largest
>>>value
>>
>> Prove it.

 I withdraw my comment. It's well defined in ISO C and C++.

> When an integer is converted to an unsigned type, the value
> is the least unsigned integer congruent to the signed integer
> (modulo 2^n where n is the number of bits used to represent
> the unsigned type).  In a two's complement representation,
> this conversion is conceptual and there is no change in the
> bit pattern.

 Equivalent in C.
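
The modulo rule quoted above can be confirmed at compile time. A sketch assuming a C++11 compiler; `to_unsigned` is a hypothetical helper name.

```cpp
#include <cassert>
#include <climits>   // UINT_MAX, UCHAR_MAX
#include <cstddef>   // std::size_t
#include <limits>    // std::numeric_limits

// Converting -1 to an unsigned type yields that type's largest value,
// because the result is reduced modulo 2^n.
static_assert(static_cast<unsigned>(-1) == UINT_MAX, "(unsigned)-1 is UINT_MAX");
static_assert(static_cast<unsigned char>(-1) == UCHAR_MAX, "same for unsigned char");
static_assert(static_cast<std::size_t>(-1) ==
                  std::numeric_limits<std::size_t>::max(),
              "(size_t)-1 is the largest size_t");

// Hypothetical helper: the conversion applied to an arbitrary int.
constexpr unsigned to_unsigned(int v) { return static_cast<unsigned>(v); }
```

The result does not depend on two's complement representation; the rule is stated in terms of values, not bit patterns.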

>>>Someone else wrote:
>>>>As an aside, one of the great mysteries of the ANSI C standard, at least
>>>>to me, is that while pointer addition involves a pointer and an int,
>>>>the difference of two pointers yields a ptrdiff_t.
>>
>> Also subject of a defect report I think.
>
>Can someone please provide more info on this too.

 Sorry, it doesn't appear to be listed in the record
of responses, and I can't find the list of defects.

--
        JOHN (MAX) SKALLER,         INTERNET:maxtal@suphys.physics.su.oz.au
 Maxtal Pty Ltd,      CSERVE:10236.1703
        6 MacKay St ASHFIELD,     Mem: SA IT/9/22,SC22/WG21
        NSW 2131, AUSTRALIA

Author: jamshid@ses.com (Jamshid Afshar)
Date: Tue, 21 Dec 1993 00:07:27 GMT

In article <1993Dec15.202555.3650@sat.mot.com>,
Joseph Hall <hall_j@sat.mot.com> wrote:
>Seems it was grumpy@cbnewse.cb.att.com (Paul J Lucas) who said:
>>From article <2el6gq$42o@wuecl.wustl.edu>, by abed@saturn.wustl.edu (Abed M. Hammoud):
>>>    X[-i + 1]....is this valid ? and is it well defined...
>>> can (i) be of type size_t
>>
>> As far as I can tell, there is no merit to doing this at all.
>
>It's also possibly illegal.  Array indices (or integral values explicitly
>added to pointers) must be of type int.  long is not allowed. [...]
>In any event, size_t can easily be larger than int and thus larger than
>array indices are allowed to be.

No, pointer arithmetic is guaranteed to work with any integral type,
whether it is int or long or signed or unsigned.  See ARM 5.7.  The
results are defined as long as you don't add or subtract a value which
results in an out-of-bounds reference (actually, you can point to one
past the end of the array).
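
For illustration, a sketch that indexes one array with offsets of several integral types. The function name is hypothetical, and the caller is assumed to pass an array of at least five ints.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: reads elements of `a` using short, long, unsigned
// and size_t offsets. Assumes `a` points at an array of at least 5 ints.
inline int sum_with_mixed_index_types(const int* a) {
    short s = 1;
    long l = 2;
    unsigned u = 3;
    std::size_t z = 4;

    const int* mid = a + z;  // pointer + size_t
    return a[s]              // short index (promoted to int)
         + *(a + l)          // pointer + long
         + a[u]              // unsigned index
         + mid[-1L];         // negative long index, still inside the array
}
```

With `int a[6] = {10, 20, 30, 40, 50, 60}` the call adds a[1], a[2], a[3] and a[3] again, giving 130.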

There is a limit in how much memory you can allocate or how large
objects can be.  A definite limit is that you can't allocate more than
(size_t)-1 bytes since `operator new()' takes a size_t.  (note,
casting -1 to an unsigned integral type gives you the largest
value).  For example, size_t on non-extended MS-DOS/Windows compilers
is only 16-bits so you can't allocate arrays larger than 64K.  Some
compilers even restrict you to a couple of words less than that.
There are some compiler-specific language extensions to overcome some
of these limitations (`__huge' pointers).

Whether to use size_t or int is probably mostly a style issue.  In my
personal library I use size_t extensively because my Array class
(which is used by a lot of other classes) uses it.  I used it in my
array class so that I didn't have to check for negative values
everywhere and so that I could declare arrays of 64,000 chars when int
maxed out at 32,767 on my machine.  Using size_t instead of int just
feels "purer" to me.  Either way, make sure you're checking for
under/overflows.

>As an aside, one of the great mysteries of the ANSI C standard, at least
>to me, is that while pointer addition involves a pointer and an int,
>the difference of two pointers yields a ptrdiff_t.

Pointer arithmetic is not restricted to `int'.  ptrdiff_t must be a
signed integer type.  If size_t is `unsigned int' on a machine where
sizeof(long) > sizeof(int), the implementation should define ptrdiff_t
as `long' instead of `int' so that the following will work:

 char* p = new char[64000];
 ptrdiff_t n = &p[63999] - &p[0];
 // result is 63999, which would overflow a 16-bit signed int

I don't think an implementation is required to make the above code
work, though.  If ptrdiff_t is a 16-bit `int' it could crash since
overflow during signed integer arithmetic results in undefined
behavior.
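
One defensive approach is to test whether an element count is representable in ptrdiff_t before relying on pointer subtraction over that distance. This is a sketch with a hypothetical helper name; the comparison goes through `std::uintmax_t` so that neither side is truncated.

```cpp
#include <cassert>
#include <cstddef>   // std::size_t, std::ptrdiff_t
#include <cstdint>   // std::uintmax_t
#include <limits>

// Hypothetical helper: can a distance of `count` elements be held in ptrdiff_t?
constexpr bool fits_in_ptrdiff(std::size_t count) {
    return static_cast<std::uintmax_t>(count) <=
           static_cast<std::uintmax_t>(std::numeric_limits<std::ptrdiff_t>::max());
}
```

On the 16-bit configuration described above (ptrdiff_t a 16-bit int with maximum 32,767), `fits_in_ptrdiff(63999)` would be false; on implementations with a wider ptrdiff_t it is true.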

Jamshid Afshar
jamshid@ses.com

Author: maxtal@physics.su.OZ.AU (John Max Skaller)
Date: Tue, 21 Dec 1993 07:13:46 GMT

In article <CICz0G.FtA@ses.com> jamshid@ses.com (Jamshid Afshar) writes:
>
>No, pointer arithmetic is guaranteed to work with any integral type,
>whether it is int or long or signed or unsigned.  See ARM 5.7.  The
>results are defined as long as you don't add or subtract a value which
>results in an out-of-bounds reference (actually, you can point to one
>past the end of the array).

 Yes. My proposed memory model will also probably
allow "one byte past the end of an object" of any kind.

>There is a limit in how much memory you can allocate or how large
>objects can be.  A definite limit is that you can't allocate more than
>(size_t)-1 bytes since `operator new()' takes a size_t.  (note,
>casting -1 to an unsigned integral type gives you the largest
>value

 Prove it.  The Library Group changed the specs of "string"
just because it isn't the case. As far as I know, the conversion
of signed to unsigned types (of the same size) is supported
only if the value of the signed type is representable
as an unsigned type, and -1 is not in that category.

 This issue was the subject of a defect notice for ISO C.

>
>Whether to use size_t or int is probably mostly a style issue.

 I prefer to use a signed type myself, just so
there is a defined "illegal" value. But this does cause
loss of size: arrays on 16-bit machines only support
32K objects instead of 64K.

>In my
>personal library I use size_t extensively because my Array class
>(which is used by a lot of other classes) uses it.  I used it in my
>array class so that I didn't have to check for negative values
>everywhere and so that I could declare arrays of 64,000 chars when int
>maxed out at 32,767 on my machine.  Using size_t instead of int just
>feels "purer" to me.  Either way, make sure you're checking for
>under/overflows.

 Yes. Unsigned types are dangerous because you might
loop to a negative value and the program keeps on working
when it should fail.
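
A sketch of the hazard: decrementing an unsigned zero wraps to the largest value, so a test such as `i >= 0` can never fail for a size_t counter. The helper names are hypothetical; `reverse_indices` shows one common way to count down without wrapping inside the loop body.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Decrementing 0 in an unsigned type is well defined, but it wraps around
// to the largest value rather than becoming negative.
inline std::size_t decrement(std::size_t i) { return i - 1; }

// Hypothetical helper: visits n-1, n-2, ..., 0 with an unsigned counter.
// The test `i-- > 0` compares before decrementing, so the body never
// sees a wrapped value and the loop ends after index 0.
inline std::vector<std::size_t> reverse_indices(std::size_t n) {
    std::vector<std::size_t> visited;
    for (std::size_t i = n; i-- > 0; ) {
        visited.push_back(i);
    }
    return visited;
}
```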
>
>>As an aside, one of the great mysteries of the ANSI C standard, at least
>>to me, is that while pointer addition involves a pointer and an int,
>>the difference of two pointers yields a ptrdiff_t.

 Also subject of a defect report I think.

>Pointer arithmetic is not restricted to `int'.  ptrdiff_t must be a
>signed integer type.  If size_t is `unsigned int' on a machine where
>sizeof(long) > sizeof(int), the implementation should define ptrdiff_t
>as `long' instead of `int' so that the following will work:
>
> char* p = new char[64000];
> ptrdiff_t n = &p[63999] - &p[0];
> // result is 63999, which would overflow a 16-bit signed int
>
>I don't think an implementation is required to make the above code
>work, though.  If ptrdiff_t is a 16-bit `int' it could crash since
>overflow during signed integer arithmetic results in undefined
>behavior.

 It's sure nasty that the following is not guaranteed:

 char *p1 = ... ;
 size_t n = ...;
 char *p2 = p1 + n;
 ptrdiff_t m = p2 - p1;
 assert(n==m);

--
        JOHN (MAX) SKALLER,         INTERNET:maxtal@suphys.physics.su.oz.au
 Maxtal Pty Ltd,      CSERVE:10236.1703
        6 MacKay St ASHFIELD,     Mem: SA IT/9/22,SC22/WG21
        NSW 2131, AUSTRALIA

Author: jamshid@ses.com (Jamshid Afshar)
Date: Thu, 23 Dec 1993 20:35:02 GMT

Crossposting to comp.std.c.  I think the question is whether
(unsigned)-1 is guaranteed to give you UINT_MAX.

In article <CIDIqy.AIs@ucc.su.oz.au>,
John Max Skaller <maxtal@physics.su.OZ.AU> wrote:
>In article <CICz0G.FtA@ses.com> jamshid@ses.com (Jamshid Afshar) writes:
>>There is a limit in how much memory you can allocate or how large
>>objects can be.  A definite limit is that you can't allocate more than
>>(size_t)-1 bytes since `operator new()' takes a size_t.  (note,
>>casting -1 to an unsigned integral type gives you the largest
>>value
>
> Prove it.  The Library Group changed the specs of "string"
>just because it isn't the case. As far as I know, the conversion
>of signed to unsigned types (of the same size) is supported
>only if the value of the signed type is representable
>as an unsigned type, and -1 is not in that category.
> This issue was the subject of a defect notice for ISO C.

Can someone provide more information about the defect report?  The
_Annotated C++ Reference Manual_ Section 4.2 says:

 When an integer is converted to an unsigned type, the value
 is the least unsigned integer congruent to the signed integer
 (modulo 2^n where n is the number of bits used to represent
 the unsigned type).  In a two's complement representation,
 this conversion is conceptual and there is no change in the
 bit pattern.

The next sentence and the commentary explain that conversions to a
*signed* integer are implementation dependent if the original value
cannot be represented.  Is Standard C any different?

>>Someone else wrote:
>>>As an aside, one of the great mysteries of the ANSI C standard, at least
>>>to me, is that while pointer addition involves a pointer and an int,
>>>the difference of two pointers yields a ptrdiff_t.
>
> Also subject of a defect report I think.

Can someone please provide more info on this too.

Thanks,
Jamshid Afshar
jamshid@ses.com

Author: grumpy@cbnewse.cb.att.com (Paul J Lucas)
Date: Wed, 15 Dec 1993 13:09:30 GMT

Author: hall_j@sat.mot.com (Joseph Hall)
Date: Wed, 15 Dec 1993 20:25:55 GMT

Seems it was grumpy@cbnewse.cb.att.com (Paul J Lucas) who said:
>From article <2el6gq$42o@wuecl.wustl.edu>, by abed@saturn.wustl.edu (Abed M. Hammoud):
>> The above code works fine but then what if you were doing funky stuff
>> inside the loop like
>>
>>    X[-i + 1]....is this valid ? and is it well defined...
>>
>> can (i) be of type size_t
>
> As far as I can tell, there is no merit to doing this at all.

It's also possibly illegal.  Array indices (or integral values explicitly
added to pointers) must be of type int.  long is not allowed.  On many
32-bit machines, size_t is an int, but ...

In any event, size_t can easily be larger than int and thus larger than
array indices are allowed to be.

As an aside, one of the great mysteries of the ANSI C standard, at least
to me, is that while pointer addition involves a pointer and an int,
the difference of two pointers yields a ptrdiff_t.

--
Joseph Nathan Hall |  Whales: smart food for smart people
Software Architect |    (* for the extremely humor-impaired, this is a joke)
Gorca Systems Inc. |                 joseph@joebloe.maple-shade.nj.us (home)
(on assignment)    |         (602) 732-2549 (work)   Joseph_Hall@sat.mot.com

Author: abed@saturn.wustl.edu (Abed M. Hammoud)
Date: 14 Dec 1993 20:08:58 GMT

Hello, I am still not sure when to use size_t variables and whether they
have an advantage over plain int.

for example:

for (size_t i = n; i; i--)
   etc

as opposed to

for (int i = n; i; i--)
   etc

If I remember correctly I read in a column by Plauger that you should
try to use size_t for all positive int's, like loop counters.

The above code works fine but then what if you were doing funky stuff
inside the loop like

   X[-i + 1]....is this valid ? and is it well defined...

can (i) be of type size_t
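
To see what the index expression evaluates to in each case, here is a sketch with two hypothetical helpers. With an unsigned i the negation wraps modulo 2^N, so for i >= 2 the index is a huge positive number and X[-i + 1] would be out of bounds; with a signed i the index is negative and is valid only if X points far enough into an array.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: value of -i + 1 when i is a size_t.
// Written as (0 - i) + 1, which is the same unsigned computation as -i + 1.
inline std::size_t index_unsigned(std::size_t i) { return (0 - i) + 1; }

// Hypothetical helper: value of -i + 1 when i is an int.
inline int index_signed(int i) { return -i + 1; }
```

For i == 3, `index_signed` gives -2, while `index_unsigned` gives the same value as `(size_t)-2`, the second-largest size_t.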

Thanks,
+-------------------------------------------------+-----------------------+
|Abed M. Hammoud (KB0INX)                         | abed@saturn.wustl.edu |
|Washington University.                           | Office:               |
|Electronic Systems & Signals Research Laboratory.| -Voice:(314) 935-7547 |
|Department of Electrical/Biomedical Engineering. | -FAX:  (314) 935-4842 |
|Campus Box 1161, One Brookings Drive.            |                       |
|St. Louis, MO , 63130 USA                        |                       |
+-------------------------------------------------+-----------------------+