Topic: Converting int to unsigned int (was: reinterpret cast int* to...)


Author: hp@vmars.tuwien.ac.at (Peter Holzer)
Date: 4 Aug 1994 12:43:21 GMT
volpe@bart.crd.ge.com (Christopher R. Volpe) writes:

>In article <318t08$t4d@news.tuwien.ac.at>, hp@vmars.tuwien.ac.at (Peter
>Holzer) writes:
>
>>diamond@jrd.dec.com (Norman Diamond) writes:
>>
>>>Suppose char types are 16 bits and int is 16 bits. Furthermore
>>>suppose that the machine's signed arithmetic uses one's complement.
>>>Now, one's complement is no problem for unsigned arithmetic and our
>>>unsigned chars, right? Wrong. Suppose you stick 0x8000 in an unsigned
>>>char and want to write it. fputc()'s first argument has type int,
>>>so the value gets demoted to type int in an implementation-defined
>>>manner, and the result must be in the range -32767 to +32767, in an
>>>int. fputc() internally converts this value back to unsigned char in
>>>a standard-defined manner, and the result is in the range 32769 to
>>>65535 or 0 to 32767; it cannot possibly be 32768. So fputc() cannot
>>>write the value of 0x8000 correctly.
>>
>>This is correct.

>I'm missing something here. Why can't it be 32768? Surely, the unsigned
>char is capable of holding the value 32768, if it's capable of holding
>32767 and 32769, right? Why can't its representation be 0x8000?

The unsigned char is capable of holding 32768, but it is converted to
int when passed to fputc. Since a one's complement 16-bit int can only
hold the values -32767 .. 32767, 32768 must be converted to one of
these values. The conversion back to unsigned char is well defined and
will map -32767 to 32769 ... -1 to 65535. Thus the unsigned char 32768
cannot be written on this implementation (the most likely value for
(int)32768 is -0, which would be converted to (unsigned char)0).

I still believe that the implementation above is not conforming, but
the standard is rather vague in this respect. Could somebody submit a
DR for me (or tell me how to submit one myself, if that is possible)?

 hp
--
   _  | hp@vmars.tuwien.ac.at | Peter Holzer | TU Vienna | CS/Real-Time Systems
|_|_) |------------------------------------------------------------------------
| |   | It's not what we don't know that gets us into trouble, it's
__/   | what we know that ain't so. -- Will Rogers




Author: hp@vmars.tuwien.ac.at (Peter Holzer)
Date: 4 Aug 1994 12:45:16 GMT
danpop@cernapo.cern.ch (Dan Pop) writes:

>In <31b1lo$gru@news.tuwien.ac.at> hp@vmars.tuwien.ac.at (Peter Holzer) writes:

>>yours might be valid. But if you are right, fread and fwrite are almost
>>useless, and arbitrary bytes cannot be written even to binary streams,
>>which severely reduces their usefulness, too.

>What is a byte? :-)

The amount of storage occupied by a char, unsigned char or signed
char or any value representable in any of these types.

 hp
--
   _  | hp@vmars.tuwien.ac.at | Peter Holzer | TU Vienna | CS/Real-Time Systems
|_|_) |------------------------------------------------------------------------
| |   | It's not what we don't know that gets us into trouble, it's
__/   | what we know that ain't so. -- Will Rogers




Author: hp@vmars.tuwien.ac.at (Peter Holzer)
Date: 29 Jul 1994 13:54:00 GMT
diamond@jrd.dec.com (Norman Diamond) writes:

>In article <318t08$t4d@news.tuwien.ac.at> hp@vmars.tuwien.ac.at (Peter Holzer) writes:
>>diamond@jrd.dec.com (Norman Diamond) writes:

>>fwrite and fread are defined to write resp. read arrays of elements
>>of a given size. The type of the elements isn't restricted, so I think
>>any object can be written to a binary stream with fwrite and read back
>>with fread without loss of information.

>The type of the elements isn't restricted, but all writing goes via fputc(),
>and fputc() doesn't write what you want it to write.  In a binary stream,
>fgetc() (and fread()) have to read back what fputc() (and fwrite()) wrote,
>not what you wanted them to write.

Yes, I see your point. You say that the data that was written is the int
values that are passed to fputc from fwrite. I say that the data that
is written by fwrite is the object described by the first 3 parameters
of fwrite (and that the clause stating that all I/O goes through
fgetc/fputc just places additional requirements on fputc, but does not
weaken the requirement that data read with fread compares equal to
data written earlier by fwrite). I find nothing to prove my point, so
yours might be valid. But if you are right, fread and fwrite are almost
useless, and arbitrary bytes cannot be written even to binary streams,
which severely reduces their usefulness, too.

 hp
--
   _  | hp@vmars.tuwien.ac.at | Peter Holzer | TU Vienna | CS/Real-Time Systems
|_|_) |------------------------------------------------------------------------
| |   | It's not what we don't know that gets us into trouble, it's
__/   | what we know that ain't so. -- Will Rogers




Author: volpe@bart.crd.ge.com (Christopher R. Volpe)
Date: Fri, 29 Jul 1994 15:40:59 GMT
In article <318t08$t4d@news.tuwien.ac.at>, hp@vmars.tuwien.ac.at (Peter Holzer) writes:
>diamond@jrd.dec.com (Norman Diamond) writes:
>
>>Suppose char types are 16 bits and int is 16 bits. Furthermore suppose
>>that the machine's signed arithmetic uses one's complement. Now, one's
>>complement is no problem for unsigned arithmetic and our unsigned
>>chars, right? Wrong. Suppose you stick 0x8000 in an unsigned char
>>and want to write it. fputc()'s first argument has type int, so the
>>value gets demoted to type int in an implementation-defined manner,
>>and the result must be in the range -32767 to +32767, in an int.
>>fputc() internally converts this value back to unsigned char in a
>>standard-defined manner, and the result is in the range 32769 to 65535
>>or 0 to 32767; it cannot possibly be 32768. So fputc() cannot write the
>>value of 0x8000 correctly.
>
>This is correct.

I'm missing something here. Why can't it be 32768? Surely, the unsigned char
is capable of holding the value 32768, if it's capable of holding 32767 and
32769, right? Why can't its representation be 0x8000?

--

Chris Volpe    Phone: (518) 387-7766 (Dial Comm 8*833
GE Corporate R&D   Fax:   (518) 387-6560
PO Box 8, Schenectady, NY 12301  Email: volpecr@crd.ge.com





Author: danpop@cernapo.cern.ch (Dan Pop)
Date: Fri, 29 Jul 1994 21:18:58 GMT
In <31b1lo$gru@news.tuwien.ac.at> hp@vmars.tuwien.ac.at (Peter Holzer) writes:

>yours might be valid. But if you are right, fread and fwrite are almost
>useless, and arbitrary bytes cannot be written even to binary streams,
>which severely reduces their usefulness, too.

What is a byte? :-)

Dan
--
Dan Pop
CERN, CN Division
Email: danpop@cernapo.cern.ch
Mail:  CERN - PPE, Bat. 31 R-004, CH-1211 Geneve 23, Switzerland




Author: diamond@jrd.dec.com (Norman Diamond)
Date: 28 Jul 1994 04:10:23 GMT
In article <313nj8$do6@news.tuwien.ac.at> hp@vmars.tuwien.ac.at (Peter Holzer) writes:
>However, fwrite and fread are supposed to be able to write arbitrary
>objects to a binary stream and read it back.  Since all stdio I/O
>functions work as if they called fgetc/fputc and fputc converts its
>argument to unsigned char, we can assume that any object can be treated
>as an array of unsigned char without loss of information.

No we can't!  Too bad the other two or three threads involving this
issue aren't also crossposted to comp.std.c++ for their enjoyment.

Suppose char types are 16 bits and int is 16 bits.  Furthermore suppose
that the machine's signed arithmetic uses one's complement.  Now, one's
complement is no problem for unsigned arithmetic and our unsigned chars,
right?  Wrong.  Suppose you stick 0x8000 in an unsigned char and want to
write it.  fputc()'s first argument has type int, so the value gets
demoted to type int in an implementation-defined manner, and the result
must be in the range -32767 to +32767, in an int.  fputc() internally
converts this value back to unsigned char in a standard-defined manner,
and the result is in the range 32769 to 65535 or 0 to 32767; it cannot
possibly be 32768.  So fputc() cannot write the value of 0x8000 correctly.

If fgetc() reads a value of 0x8000, the implementation-defined conversion
to int will yield some int value.  It can have any int value because it
doesn't have to match anything; no call to fputc() could have written it.

Since all stdio I/O functions work as if they called fgetc/fputc, we
cannot assume that any object can be written and read without loss of
information.

I think I know two relatively painless ways to solve the problem:
(1)  Outlaw one's complement implementations; and/or
(2)  Require that INT_MAX be strictly greater than UCHAR_MAX.
--
 <<  If this were the company's opinion, I would not be allowed to post it.  >>
segmentation fault (california dumped)




Author: hp@vmars.tuwien.ac.at (Peter Holzer)
Date: 28 Jul 1994 18:22:00 GMT
diamond@jrd.dec.com (Norman Diamond) writes:

>In article <313nj8$do6@news.tuwien.ac.at> hp@vmars.tuwien.ac.at (Peter Holzer) writes:
>>However, fwrite and fread are supposed to be able to write arbitrary
>>objects to a binary stream and read it back.  Since all stdio I/O
>>functions work as if they called fgetc/fputc and fputc converts its
>>argument to unsigned char, we can assume that any object can be treated
>>as an array of unsigned char without loss of information.

>No we can't!  Too bad the other two or three threads involving this
>issue aren't also crossposted to comp.std.c++ for their enjoyment.

Well, I'm reading comp.std.c, so I see them.

fwrite and fread are defined to write resp. read arrays of elements
of a given size. The type of the elements isn't restricted, so I think
any object can be written to a binary stream with fwrite and read back
with fread without loss of information.

>Suppose char types are 16 bits and int is 16 bits. Furthermore suppose
>that the machine's signed arithmetic uses one's complement. Now, one's
>complement is no problem for unsigned arithmetic and our unsigned
>chars, right? Wrong. Suppose you stick 0x8000 in an unsigned char
>and want to write it. fputc()'s first argument has type int, so the
>value gets demoted to type int in an implementation-defined manner,
>and the result must be in the range -32767 to +32767, in an int.
>fputc() internally converts this value back to unsigned char in a
>standard-defined manner, and the result is in the range 32769 to 65535
>or 0 to 32767; it cannot possibly be 32768. So fputc() cannot write the
>value of 0x8000 correctly.

This is correct.

[...]

>Since all stdio I/O functions work as if they called fgetc/fputc, we
>cannot assume that any object can be written and read without loss of
>information.

This is where we disagree. The standard guarantees that all objects can
be written to a binary stream and read back unchanged. Therefore your
example implementation is not conforming.

>I think I know two relatively painless ways to solve the problem:
>(1)  Outlaw one's complement implementations; and/or
>(2)  Require that INT_MAX be strictly greater than UCHAR_MAX.

I would vote for (2), since it solves the problems discussed in the
other two or three threads, too.


 hp
--
   _  | hp@vmars.tuwien.ac.at | Peter Holzer | TU Vienna | CS/Real-Time Systems
|_|_) |------------------------------------------------------------------------
| |   | It's not what we don't know that gets us into trouble, it's
__/   | what we know that ain't so. -- Will Rogers




Author: diamond@jrd.dec.com (Norman Diamond)
Date: 29 Jul 1994 00:48:23 GMT
In article <318t08$t4d@news.tuwien.ac.at> hp@vmars.tuwien.ac.at (Peter Holzer) writes:
>diamond@jrd.dec.com (Norman Diamond) writes:
>>In article <313nj8$do6@news.tuwien.ac.at> hp@vmars.tuwien.ac.at (Peter Holzer) writes:
>>>However, fwrite and fread are supposed to be able to write arbitrary
>>>objects to a binary stream and read it back.  Since all stdio I/O
>>>functions work as if they called fgetc/fputc and fputc converts its
>>>argument to unsigned char, we can assume that any object can be treated
>>>as an array of unsigned char without loss of information.

>>No we can't!

>fwrite and fread are defined to write resp. read arrays of elements
>of a given size. The type of the elements isn't restricted, so I think
>any object can be written to a binary stream with fwrite and read back
>with fread without loss of information.

The type of the elements isn't restricted, but all writing goes via fputc(),
and fputc() doesn't write what you want it to write.  In a binary stream,
fgetc() (and fread()) have to read back what fputc() (and fwrite()) wrote,
not what you wanted them to write.  The explanation follows again:

>>Suppose char types are 16 bits and int is 16 bits. Furthermore suppose
>>that the machine's signed arithmetic uses one's complement. Now, one's
>>complement is no problem for unsigned arithmetic and our unsigned
>>chars, right? Wrong. Suppose you stick 0x8000 in an unsigned char
>>and want to write it. fputc()'s first argument has type int, so the
>>value gets demoted to type int in an implementation-defined manner,
>>and the result must be in the range -32767 to +32767, in an int.
>>fputc() internally converts this value back to unsigned char in a
>>standard-defined manner, and the result is in the range 32769 to 65535
>>or 0 to 32767; it cannot possibly be 32768. So fputc() cannot write the
>>value of 0x8000 correctly.

>This is correct.

>>Since all stdio I/O functions work as if they called fgetc/fputc, we
>>cannot assume that any object can be written and read without loss of
>>information.

>This is where we disagree. The standard guarantees that all objects can
>be written to a binary stream and read back unchanged. Therefore your
>example implementation is not conforming.

No!  The standard guarantees that the data written out will be read in.
It does not guarantee that the data written out will be what the relevant
argument of fwrite() points to, or what some expression was before being
demoted to signed int in the relevant argument of fputc().  My example
implementation writes what the standard requires to be written, and reads
back what it writes, and it conforms.
--
 <<  If this were the company's opinion, I would not be allowed to post it.  >>
segmentation fault (california dumped)




Author: maxtal@physics.su.OZ.AU (John Max Skaller)
Date: Mon, 25 Jul 1994 16:34:59 GMT
In article <CtHIMn.CDB@scone.london.sco.com> clive@sco.com (Clive D.W. Feather) writes:
>
>There's an issue of "spare bits". It might be reasonable for an unsigned
>type to have unused bits, while the signed type does not. For example,
>it might be easier to implement unsigned types on a specific machine by
>ignoring the top bit of every byte.
>
>> I also assumed that Ux_MAX == 2 * x_MAX + 1
>> for x = {SHRT,INT,LONG}.
>
>WG14 seems to disagree with you. I have had an informal response to a DR
>which says that, in particular, Ux_MAX == x_MAX is a reasonable
>implementation.

 It had better not be the case for x==CHAR, or copying
memory using unsigned char arrays will fail.  I expect quite a bit
of C code depends on that. Doesn't the C Standard even define
memcpy in terms of copying arrays of char?


--
        JOHN (MAX) SKALLER,         INTERNET:maxtal@suphys.physics.su.oz.au
 Maxtal Pty Ltd,
        81A Glebe Point Rd, GLEBE   Mem: SA IT/9/22,SC22/WG21
        NSW 2037, AUSTRALIA     Phone: 61-2-566-2189




Author: diamond@jrd.dec.com (Norman Diamond)
Date: 26 Jul 1994 03:15:09 GMT
In article <CtI8qC.Ix9@ucc.su.OZ.AU> maxtal@physics.su.OZ.AU (John Max Skaller) writes:
>In article <CtHIMn.CDB@scone.london.sco.com> clive@sco.com (Clive D.W. Feather) writes:
>>I have had an informal response to a DR which says that, in particular,
>>Ux_MAX == x_MAX is a reasonable implementation.

>It had better not be the case for x==CHAR, or copying memory using unsigned
>char arrays will fail.  I expect quite a bit of C code depends on that.

And quite a bit of C code depends on plain char, and could potentially
depend on signed char.  I think there is no reliable way to copy memory
other than using the builtin functions.

You know, some defect reports point out obvious errors in the standard,
some point out possible errors (it is not clear if the standard permits
implementations to do certain things that make C useless), and some ask
only for clarification.  But overall, I thought that the purpose of defect
reports was to request that the standard be corrected, so that standard C
could be made almost as useful as some traditional C implementations used
to be.  In many cases, the committee seems to agree that the standard is
not so clear, but then makes standard C even more unusable, cutting off
its own noses along with ours, in order to spite our faces.  It used to
be possible even for C haters to use it for practical work, but now the
committee seems to hate C even more than we do.

>Doesn't the C Standard even define memcpy in terms of copying arrays of char?

No.  str* functions take char* (and some const char*) but their internal
implementations do not have to be C; they have to be whatever magic the
implementor wants to use to make the results correct.  mem* functions
take void* and again do not have to be implemented in C themselves.
--
 <<  If this were the company's opinion, I would not be allowed to post it.  >>
A program in conformance will not tend to stay in conformance, because even if
it doesn't change, the standard will.       Force = program size * destruction.
Every technical corrigendum is met by an equally troublesome new defect report.




Author: hp@vmars.tuwien.ac.at (Peter Holzer)
Date: 26 Jul 1994 19:19:04 GMT
maxtal@physics.su.OZ.AU (John Max Skaller) writes:

>In article <CtHIMn.CDB@scone.london.sco.com> clive@sco.com (Clive D.W. Feather) writes:
>>
>>> I also assumed that Ux_MAX == 2 * x_MAX + 1
>>> for x = {SHRT,INT,LONG}.
>>
>>WG14 seems to disagree with you. I have had an informal response to a DR
>>which says that, in particular, Ux_MAX == x_MAX is a reasonable
>>implementation.

> It had better not be the case for x==CHAR, or copying
>memory using unsigned char arrays will fail.

It must be the case on implementations where the default char is
unsigned. This is why I didn't include CHAR.

>I expect quite a bit of C code depends on that. Doesn't the C Standard
>even define memcpy in terms of copying arrays of char?

Not quite. It defines it in terms of copying characters. Whether these
are signed char, char, unsigned char or just some general kind of
characters isn't clear. However, fwrite and fread are supposed to be
able to write arbitrary objects to a binary stream and read it back.
Since all stdio I/O functions work as if they called fgetc/fputc and
fputc converts its argument to unsigned char, we can assume that any
object can be treated as an array of unsigned char without loss of
information.

 hp
--
   _  | hp@vmars.tuwien.ac.at | Peter Holzer | TU Vienna | CS/Real-Time Systems
|_|_) |------------------------------------------------------------------------
| |   | It's not what we don't know that gets us into trouble, it's
__/   | what we know that ain't so. -- Will Rogers




Author: clive@sco.com (Clive D.W. Feather)
Date: Tue, 19 Jul 1994 08:19:26 GMT
In article <30ekug$eil@news.tuwien.ac.at> ,
Peter Holzer <hp@vmars.tuwien.ac.at>  wrote:
> Intuitively I tended to agree with Mark, because his parsing makes
> sense, whereas Ron's does not. But when I tried to parse that paragraph
> myself, I got a third result:
>
> When (a signed integer is converted to an unsigned integer with equal
>       or greater size) {
[...]
> } otherwise {
[...]
> }
> So, like Ron, I put the Otherwise at the same level as the When

I don't believe that you can reasonably put "otherwise" with "when" in
either of your parsings. "otherwise" has to match with "if"; "when"
defines the situations that the rules apply to.

If you try to view it any other way, then the "otherwise" clause applies to
any action except for converting a signed integer to an unsigned integer with
equal or greater size; for example, it applies to adding floating point
with the + operator, or renaming files with the rename() function. This
is clearly nonsense.

> Also note that the if-clause in the first sentence is completely
> redundant, since all positive values of a signed integer can be
> represented by an unsigned integer of equal or greater size,

Almost right. There's no actual requirement anywhere that ULONG_MAX >=
UINT_MAX, but if you assume there is, then you are right.

--
Clive D.W. Feather     | Santa Cruz Operation    | If you lie to the compiler,
clive@sco.com          | Croxley Centre          | it will get its revenge.
Phone: +44 923 816 344 | Hatters Lane, Watford   |   - Henry Spencer
Fax:   +44 923 210 352 | WD1 8YN, United Kingdom |




Author: diamond@jrd.dec.com (Norman Diamond )
Date: 20 Jul 1994 03:34:33 GMT
In article <Ct6HsG.DEs@scone.london.sco.com> clive@sco.com (Clive D.W. Feather) writes:
>In article <30ekug$eil@news.tuwien.ac.at> ,
>Peter Holzer <hp@vmars.tuwien.ac.at>  wrote:
>> Also note that the if-clause in the first sentence is completely
>> redundant, since all positive values of a signed integer can be
>> represented by an unsigned integer of equal or greater size,

>Almost right. There's no actual requirement anywhere that ULONG_MAX >=
>UINT_MAX, but if you assume there is, then you are right.

There almost is.  For signed integer types, there is ANSI Classic section
3.1.2.5, page 23 lines 42 to 43.  Page 24 lines 2 to 3 require that each
unsigned type "uses" the same amount of storage as the corresponding
signed type, including the sign.  Has the committee ruled yet on how
much use must occur in this use?
--
 <<  If this were the company's opinion, I would not be allowed to post it.  >>
A program in conformance will not tend to stay in conformance, because even if
it doesn't change, the standard will.       Force = program size * destruction.
Every technical corrigendum is met by an equally troublesome new defect report.




Author: clive@sco.com (Clive D.W. Feather)
Date: Wed, 20 Jul 1994 11:00:36 GMT
In article <30i609$qs1@usenet.pa.dec.com>,
Norman Diamond  <diamond@jrd.dec.com> wrote:
>In article <Ct6HsG.DEs@scone.london.sco.com> clive@sco.com (Clive D.W. Feather) writes:
>> There's no actual requirement anywhere that ULONG_MAX >=
>> UINT_MAX, but if you assume there is, then you are right.
> There almost is.  For signed integer types, there is ANSI Classic section
> 3.1.2.5, page 23 lines 42 to 43.  Page 24 lines 2 to 3 require that each
> unsigned type "uses" the same amount of storage as the corresponding
> signed type, including the sign.  Has the committee ruled yet on how
> much use must occur in this use?

Not yet; hopefully they will do so next week. A preliminary response
indicated to me that they believe that UINT_MAX == INT_MAX is permitted,
which derails your line of argument above.

--
Clive D.W. Feather     | Santa Cruz Operation    | If you lie to the compiler,
clive@sco.com          | Croxley Centre          | it will get its revenge.
Phone: +44 923 816 344 | Hatters Lane, Watford   |   - Henry Spencer
Fax:   +44 923 210 352 | WD1 8YN, United Kingdom |




Author: maxtal@physics.su.OZ.AU (John Max Skaller)
Date: Wed, 20 Jul 1994 19:27:57 GMT
In article <30i609$qs1@usenet.pa.dec.com> diamond@jrd.dec.com (Norman Diamond ) writes:
>In article <Ct6HsG.DEs@scone.london.sco.com> clive@sco.com (Clive D.W. Feather) writes:
>>In article <30ekug$eil@news.tuwien.ac.at> ,
>>Peter Holzer <hp@vmars.tuwien.ac.at>  wrote:
>>> Also note that the if-clause in the first sentence is completely
>>> redundant, since all positive values of a signed integer can be
>>> represented by an unsigned integer of equal or greater size,
>
>>Almost right. There's no actual requirement anywhere that ULONG_MAX >=
>>UINT_MAX, but if you assume there is, then you are right.
>
>There almost is.  For signed integer types, there is ANSI Classic section
>3.1.2.5, page 23 lines 42 to 43.  Page 24 lines 2 to 3 require that each
>unsigned type "uses" the same amount of storage as the corresponding
>signed type, including the sign.  Has the committee ruled yet on how
>much use must occur in this use?

 No, but it was actively discussed at Waterloo.
--
        JOHN (MAX) SKALLER,         INTERNET:maxtal@suphys.physics.su.oz.au
 Maxtal Pty Ltd,      CSERVE:10236.1703
        6 MacKay St ASHFIELD,     Mem: SA IT/9/22,SC22/WG21
        NSW 2131, AUSTRALIA




Author: hp@vmars.tuwien.ac.at (Peter Holzer)
Date: 21 Jul 1994 14:04:45 GMT
clive@sco.com (Clive D.W. Feather) writes:

>In article <30ekug$eil@news.tuwien.ac.at> ,
>Peter Holzer <hp@vmars.tuwien.ac.at>  wrote:
>> Intuitively I tended to agree with Mark, because his parsing makes
>> sense, whereas Ron's does not. But when I tried to parse that paragraph
>> myself, I got a third result:
>>
>> When (a signed integer is converted to an unsigned integer with equal
>>       or greater size) {
>[...]
>> } otherwise {
>[...]
>> }
>> So, like Ron, I put the Otherwise at the same level as the When

>I don't believe that you can reasonably put "otherwise" with "when" in
>either of your parsings. "otherwise" has to match with "if"; "when"
>defines the situations that the rules apply to.

>If you try to view it any other way, then the "otherwise" clause applies to
>any action except for converting a signed integer to an unsigned integer with
>equal or greater size; for example, it applies to adding floating point
>with the + operator, or renaming files with the rename() function. This
>is clearly nonsense.

Not quite. The whole paragraph is still within a section devoted to
conversions between signed and unsigned integers, so it wouldn't apply
to adding floats. It could apply to every unsigned/signed conversion
except the one described in the first sentence which is still nonsense.

I agree that the only sensible parsing is the one Mark posted. I
understood it that way, too. Only after Ron posted his rather bizarre
interpretation of this paragraph, I tried to parse it formally, too,
and got a result which isn't much better (but then I am not a native
English speaker and my English grammar is even worse than my German
:-).


>> Also note that the if-clause in the first sentence is completely
>> redundant, since all positive values of a signed integer can be
>> represented by an unsigned integer of equal or greater size,

>Almost right. There's no actual requirement anywhere that ULONG_MAX >=
>UINT_MAX, but if you assume there is, then you are right.

Yes, I assumed that. What sense is there in decreeing that sizeof
(unsigned long) >= sizeof (unsigned int) if this doesn't mean that the
range of representable values is at least as large in an unsigned
long as in an unsigned int? I also assumed that Ux_MAX == 2 * x_MAX + 1
for x = {SHRT,INT,LONG}.

 hp
--
   _  | hp@vmars.tuwien.ac.at | Peter Holzer | TU Vienna | CS/Real-Time Systems
|_|_) |------------------------------------------------------------------------
| |   | It's not what we don't know that gets us into trouble, it's
__/   | what we know that ain't so. -- Will Rogers




Author: clive@sco.com (Clive D.W. Feather)
Date: Mon, 25 Jul 1994 07:11:10 GMT
In article <30lv9t$rr5@news.tuwien.ac.at>,
Peter Holzer <hp@vmars.tuwien.ac.at> wrote:
>clive@sco.com (Clive D.W. Feather) writes:
>> Almost right. There's no actual requirement anywhere that ULONG_MAX >=
>> UINT_MAX, but if you assume there is, then you are right.
> Yes, I assumed that. What sense is there in decreeing that sizeof
> (unsigned long) >= sizeof (unsigned int) if this doesn't mean that the
> range of representable values is at least as large in an unsigned
> long as in an unsigned int?

There's an issue of "spare bits". It might be reasonable for an unsigned
type to have unused bits, while the signed type does not. For example,
it might be easier to implement unsigned types on a specific machine by
ignoring the top bit of every byte.

> I also assumed that Ux_MAX == 2 * x_MAX + 1
> for x = {SHRT,INT,LONG}.

WG14 seems to disagree with you. I have had an informal response to a DR
which says that, in particular, Ux_MAX == x_MAX is a reasonable
implementation.

--
Clive D.W. Feather     | Santa Cruz Operation    | If you lie to the compiler,
clive@sco.com          | Croxley Centre          | it will get its revenge.
Phone: +44 923 816 344 | Hatters Lane, Watford   |   - Henry Spencer
Fax:   +44 923 210 352 | WD1 8YN, United Kingdom |




Author: rfg@netcom.com (Ronald F. Guilmette)
Date: Sun, 17 Jul 1994 00:45:36 GMT
In article <1994Jul10.044446.1610@sq.sq.com> msb@sq.sq.com (Mark Brader) writes:
>Okay, we need to quote the text in question.  It reads:
>
>#  When a signed integer is converted to an unsigned integer with equal
>#  or greater size, if the value of the signed integer is nonnegative,
>#  its value is unchanged.  Otherwise: if the unsigned integer has
>#  greater size, the signed integer is first promoted to the signed
>#  integer corresponding to the unsigned integer; the value is converted
>#  to unsigned by adding to it one greater than the largest number that
>#  can be represented in the unsigned integer type.
>
>Note the unusual punctuation in the second sentence.

Yes.  That punctuation is indeed rather bizarre, but that's an awfully thin
thread to hang your interpretation on.

>The precondition
>on that sentence is simply the word "Otherwise"; the if-clause following
>it applies only to the wording up to the semicolon.  If that wasn't the
>intended meaning, they wouldn't have punctuated it that way.

Mark, that is (I'll admit) one plausible interpretation, but I do not
believe that it is the only one (and I fully intend to submit a defect
report on this particular paragraph, now that it has become apparent that
some people's interpretations, and parsing, of that paragraph differ from
my own.... a fact which I still find rather surprising).

>In other words, the paragraph is to be parsed as follows:
>
> When (a signed integer is converted to an unsigned integer
>   with equal or greater size) {
>  if (the value of the signed integer is nonnegative) {
>   its value is unchanged.
>  } Otherwise {
>   if (the unsigned integer has greater size) {
>    the signed integer is first promoted
>    to the signed integer corresponding
>    to the unsigned integer;
>   }
>   the value is converted to unsigned by adding to
>   it one greater than the largest number that
>   can be represented in the unsigned integer type.
>  }
> }
>
>Presumably Ron failed to follow this intended parsing, and misunderstood
>the paragraph.

I did indeed parse the paragraph differently.  As to which of us
misunderstood, I will be happy to let WG14 decide.  It seems to me that an
equally valid (if not actually MORE valid) parse is:

 When (a signed integer is converted to an unsigned integer
       with equal or greater size && the value is nonnegative)
          {
            its value is unchanged.
          }
 Otherwise if (the unsigned integer has greater size)
          {
            the signed integer is first promoted to the signed integer
            corresponding to the unsigned integer /* and then converted */;
            /* the value is converted to unsigned by adding to it one
               greater than the largest number that can be represented
               in the unsigned integer type. */
          }

Note that the statement that ``...the value is converted to unsigned by
adding...'' only tells us exactly how the bits get twiddled when a
particular kind of conversion takes place.  It certainly does not tell us
what conversions can, do, or will take place (with well-defined behavior).
For that, we must read the remainder of the words in the given paragraph.
And those words only identify two cases where integral conversions (to
equal or greater size) can take place and yield well-defined behavior.

--

-- Ron Guilmette, Sunnyvale, CA ---------- RG Consulting -------------------
---- domain addr: rfg@netcom.com ----------- Purveyors of Compiler Test ----
---- uucp addr: ...!uunet!netcom!rfg ------- Suites and Bullet-Proof Shoes -




Author: hp@vmars.tuwien.ac.at (Peter Holzer)
Date: 18 Jul 1994 19:25:04 GMT
Raw View
rfg@netcom.com (Ronald F. Guilmette) writes:

>In article <1994Jul10.044446.1610@sq.sq.com> msb@sq.sq.com (Mark Brader) writes:
>>Okay, we need to quote the text in question.  It reads:
>>
>>#  When a signed integer is converted to an unsigned integer with equal
>>#  or greater size, if the value of the signed integer is nonnegative,
>>#  its value is unchanged.  Otherwise: if the unsigned integer has
>>#  greater size, the signed integer is first promoted to the signed
>>#  integer corresponding to the unsigned integer; the value is converted
>>#  to unsigned by adding to it one greater than the largest number that
>>#  can be represented in the unsigned integer type.
[...]

>>In other words, the paragraph is to be parsed as follows:
>>
>> When (a signed integer is converted to an unsigned integer
>>   with equal or greater size) {
>>  if (the value of the signed integer is nonnegative) {
>>   its value is unchanged.
>>  } Otherwise {
>>   if (the unsigned integer has greater size) {
>>    the signed integer is first promoted
>>    to the signed integer corresponding
>>    to the unsigned integer;
>>   }
>>   the value is converted to unsigned by adding to
>>   it one greater than the largest number that
>>   can be represented in the unsigned integer type.
>>  }
>> }
>>

> When (a signed integer is converted to an unsigned integer
>       with equal or greater size && the value is nonnegative)
>          {
>            its value is unchanged.
>          }
> Otherwise if (the unsigned integer has greater size)
>          {
>     the signed integer is first promoted to the signed integer
>            corresponding to the unsigned integer /* and then converted */;
>     /* the value is converted to unsigned by adding to it one
>        greater than the largest number that can be represented
>        in the unsigned integer type. */
>   }

Intuitively I tended to agree with Mark, because his parsing makes
sense, whereas Ron's does not. But when I tried to parse that paragraph
myself, I got a third result:

When (a signed integer is converted to an unsigned integer with equal
      or greater size) {
       if (the value is nonnegative) {
        its value is unchanged
       }
} otherwise {
 if (the unsigned integer has greater size) {
  the signed integer is first promoted to the signed
  integer corresponding to the unsigned integer
 }
 the value is converted to unsigned by adding to it one greater
 than the largest number that can be represented in the unsigned
 integer type
}

So, like Ron, I put the Otherwise at the same level as the When, but
parsed the second sentence like Mark. I still think that Mark's
parsing is the one intended by the committee, because in Ron's and my
parsing the second sentence would also apply to the conversions
described in the following paragraph.

Also note that the if-clause in the first sentence is completely
redundant, since all nonnegative values of a signed integer can be
represented by an unsigned integer of equal or greater size, and that
the if-clause in the second sentence is also redundant, since the
addition is obviously not done in machine arithmetic (in which `one
greater than the largest number that can be represented in the unsigned
integer type' does not exist), but in `normal' integer arithmetic.
Therefore the paragraph could be rewritten as:

 When a negative signed integer is converted to an unsigned integer with
 equal or greater size the value is converted to unsigned by adding to
 it one greater than the largest number that can be represented in the
 unsigned integer type.

 hp
--
   _  | hp@vmars.tuwien.ac.at | Peter Holzer | TU Vienna | CS/Real-Time Systems
|_|_) |------------------------------------------------------------------------
| |   | It's not what we don't know that gets us into trouble, it's
__/   | what we know that ain't so. -- Will Rogers