Thread

Topic: char and overflow

Author: igodardA@TpacbellDO.Tnet ("Ivan Godard")
Date: Sat, 7 Aug 2004 21:04:15 GMT

----- Original Message -----
From: "Francis Glassborow" <francis@robinton.demon.co.uk>
Newsgroups: comp.std.c++
Sent: Sunday, July 25, 2004 5:38 PM
Subject: Re: char and overflow


<snip>

> I think it is time we cleaned up these issues that were originally
> intended to allow C to be supported on platforms that used other than
> two's complement to represent negative integers. I think we should only
> allow one of three (implementation defined) behaviours for overflow of a
> signed integer value:
>
> 1) effectively modulo arithmetic offset by half the range. I.e. wrapping
> as for unsigned
> 2) Saturation (which is normally detectable from within the program)
> 3) Raising an exception.
>
> I think that making overflow of some integer types undefined behaviour
> is (or should be) unacceptable.

This is being addressed by hardware in some new architectures. The "Mill"
CPU from Out-of-the-Box Computing has four variants (a 2-bit field in the
opcode) for overflow handling for all integral operations that can overflow.
The supported semantics are modulo, saturation, exception, and double-width
result. The desired mode can be specified as a default by compiler option or
reached directly in assembler or via equivalent intrinsic functions in a
library.
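
As a rough illustration of the four semantics listed above, here is a minimal C++ sketch for adding two 8-bit unsigned values. The names OverflowMode and add_u8 are hypothetical, invented for this example; they are not Mill intrinsics, and the sketch says nothing about how the hardware encodes the modes.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical names for illustration only; not part of any real toolchain.
enum class OverflowMode { Modulo, Saturating, Excepting, DoubleWidth };

// Adds two 8-bit values under the requested overflow policy.
// The return type is 16 bits wide so that DoubleWidth can hand back the full
// sum; the other modes always return a value that fits in 8 bits.
inline std::uint16_t add_u8(std::uint8_t a, std::uint8_t b, OverflowMode mode) {
    // Both operands are promoted to int, so the sum (at most 510) is exact.
    const std::uint16_t wide = static_cast<std::uint16_t>(a + b);
    const bool overflowed = wide > 0xFF;
    switch (mode) {
    case OverflowMode::Modulo:
        return wide & 0xFF;                 // keep the low 8 bits
    case OverflowMode::Saturating:
        return overflowed ? 0xFF : wide;    // clamp to the largest 8-bit value
    case OverflowMode::Excepting:
        if (overflowed) throw std::overflow_error("8-bit add overflowed");
        return wide;
    case OverflowMode::DoubleWidth:
        return wide;                        // caller receives the 16-bit result
    }
    return wide;  // not reached; keeps compilers quiet about missing returns
}
```

For example, add_u8(200, 100, OverflowMode::Modulo) keeps only the low byte of 300, while the Saturating mode clamps the same sum to 255.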

In addition, the compilers support cv-qualification-style specifiers that
can be used to decorate an integral type with the desired overflow
semantics: "_saturating unsigned char pixel;". The compilers propagate the
overflow specification of a destination back through the dataflow of
ordinary expressions: "pixel = a + b;" uses saturation for the "+"
regardless of the overflow specification of "a" and "b". This provides what
is nearly always wanted (and avoids a nightmare of promotion rules), but can
be overridden via intermediate destinations (including casts) or use of an
explicit operator:
      pixel = (_excepting unsigned char)(a+b);
      pixel = overflow::excepting_plus(a, b);
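
In standard C++ today the destination-driven behaviour can only be approximated. The sketch below uses a hypothetical wrapper class, SaturatingU8, that clamps at the point of assignment. It assumes 8-bit unsigned char and relies on integral promotion to make "a + b" exact in int; it does not propagate the policy back through longer expressions the way the specifier described above is said to.

```cpp
#include <cassert>

// Hypothetical wrapper, loosely imitating "_saturating unsigned char".
// Assumption: unsigned char is 8 bits wide, so the clamp limit is 255.
class SaturatingU8 {
public:
    SaturatingU8() = default;

    // Assigning an int clamps it into the range 0..255.
    SaturatingU8& operator=(int v) {
        value_ = clamp(v);
        return *this;
    }

    // Reads back the stored 8-bit value.
    operator unsigned char() const { return value_; }

private:
    static unsigned char clamp(int v) {
        if (v < 0) return 0;
        if (v > 255) return 255;
        return static_cast<unsigned char>(v);
    }

    unsigned char value_ = 0;
};

// Mirrors "pixel = a + b;" from the text above.
inline unsigned char blend(unsigned char a, unsigned char b) {
    SaturatingU8 pixel;
    pixel = a + b;  // the "+" is evaluated in int; the destination applies the clamp
    return pixel;
}
```

With this sketch, blend(200, 100) yields 255 rather than the wrapped value 44.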

Similar methods are used for rounding mode specification in floating point:
      _decimal _round_toward_zero long double EuroTax;

This seems a reasonable candidate for standardization.

Ivan

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]

Author: jackklein@spamcop.net (Jack Klein)
Date: Sun, 25 Jul 2004 06:21:25 GMT

On Sat, 24 Jul 2004 18:03:01 GMT, francis@robinton.demon.co.uk
(Francis Glassborow) wrote in comp.std.c++:

>
> I may be having nightmares unnecessarily, but unsigned integer types wrap
> in a completely defined way, whereas signed integer types have undefined
> behaviour if a calculation takes a value out of the specified range.
>
> Now does this mean that:
>
> char c(127);
> ++c;
>
> Has undefined behaviour if char uses a signed 8-bit representation?

No, it has an implementation-defined result, since it is the int value
127 that will be incremented to the int value 128, then this
out-of-range value will be assigned back to 'c'.  It is the final
assignment that generates the implementation-defined result, as stated
in paragraph 3 of 4.7 of my original version of the standard.

But in an implementation with 16-bit ints:

    int i(32767);
    ++i;

...does have undefined behavior.

AFAIK there are no platforms in production today that use anything
other than "2's complement with silent overflow", although I am not
qualified as an expert in this area.

The "strange" hardware architectures today, in terms of the C and C++
integer types, tend to be DSP and RISC architectures that have no
support for CHAR_BIT == 8.  I have worked on an Analog Devices DSP
where all integer types were 32-bit 2's complement, and am currently
working on a TI DSP where char, short, and int are all 16 bits because
there is no hardware support for 8-bit objects.  The first few
versions of the ARM architecture did not provide hardware support for
signed 8-bit values, nor 16-bit values at all, although this support
was eventually added.

Personally I do not expect to ever write code for a platform with any
of these characteristics:

1.  1's complement or signed-magnitude integer representation.

2.  Any integer types having widths that are not exact powers of 2.

3.  Any integer types with padding bits.

Again, to my knowledge, there are no current architectures that have
any of these characteristics, although there might have been in the
past.
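
The three assumptions above can be written down as compile-time checks, so that a build fails on a platform that violates them. This is only a sketch: the helper names has_no_padding_bits and is_power_of_two are made up for the example, and only int is checked here.

```cpp
#include <cassert>
#include <climits>
#include <limits>

// True when every bit of T's object representation is a value bit or the sign bit.
template <typename T>
constexpr bool has_no_padding_bits() {
    return std::numeric_limits<T>::digits + (std::numeric_limits<T>::is_signed ? 1 : 0)
           == static_cast<int>(sizeof(T) * CHAR_BIT);
}

// True when n is an exact power of two.
constexpr bool is_power_of_two(unsigned n) {
    return n != 0 && (n & (n - 1)) == 0;
}

// 1. Two's complement: -1 has all bits set, so its low two bits are 11.
//    One's complement would give 2 here, sign-magnitude would give 1.
static_assert((-1 & 3) == 3, "int is not two's complement");

// 2. The width of int is an exact power of two.
static_assert(is_power_of_two(sizeof(int) * CHAR_BIT), "int width is not a power of 2");

// 3. int has no padding bits.
static_assert(has_no_padding_bits<int>(), "int has padding bits");
```

The same checks can be repeated for the other integer types a program depends on.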

My understanding is that programming tools for such older
architectures tend to be in "maintenance" phase.

Perhaps a little research can determine if any such architecture has
or expects to have a conforming C++ (or C99) implementation.  If not,
such backwards support could be dropped from future versions of both
the C and C++ standards.

This would allow a precise definition of the consequences of integer
overflow, as well as integer conversion of out-of-range values.  On
the other hand, division by 0 and conversion of out-of-range floating
point values would most likely need to remain undefined, due to
hardware traps on processors and floating point hardware.

Please, however, consider such changes in conjunction with the C
standard committee.  There is already one difference between the
common subset of integer types in the two languages.  C99 allows
padding bits in signed char, C++ 98 does not.  Fortunately, if my
assumptions above are correct, this is a moot point for somewhere
between 99.999% and 100% of C and C++ programmers.

I welcome comment from anyone with direct knowledge of platforms that
do not meet my assumptions, and which expect to have continually
evolving C and C++ implementations that upgrade to future versions of
the language standards.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~ajo/docs/FAQ-acllc.html


Author: ux9i003@yahoo.com (Max Polk)
Date: Sun, 25 Jul 2004 07:30:45 GMT

Francis Glassborow wrote:
>
> I may be having nightmares unnecessarily, but unsigned integer types wrap
> in a completely defined way, whereas signed integer types have undefined
> behaviour if a calculation takes a value out of the specified range.
>
> Now does this mean that:
>
> char c(127);
> ++c;
>
> Has undefined behaviour if char uses a signed 8-bit representation?
>

Are you referring to 4.7.3?  "If the destination type is signed, the value is
unchanged if it can be represented in the destination type (and bitfield
width); otherwise, the value is implementation defined."

Implementation defined is not undefined, so the implementation will make "c"
in your example some valid value, although you are allowed to be surprised by
its value.  Am I using 4.7.3 correctly?

That is different than creating a situation where an invalid bit pattern
results.  For example, see 3.9.1.  "For character types, all bits of the
object representation participate in the value representation.  For unsigned
character types, all possible bit patterns of the value representation
represent numbers."

With signed char, even though all bits participate in the value
representation, the standard seems to allow for invalid bit patterns, which is
unlike unsigned char where every bit pattern *has to* represent some number.

If I'm using 4.7.3 and 3.9.1 correctly, then in your example, an
implementation-defined value gets assigned, whereas random bit flipping of a
signed char may be undefined.
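
To make the distinction concrete, here is a small sketch. The function names narrow and bits_of are invented for the example. The first performs the out-of-range conversion whose result is implementation-defined under the wording quoted above; the second inspects a signed char through unsigned char, for which every bit pattern is a number.

```cpp
#include <cassert>
#include <climits>
#include <cstring>

// Converts an int to signed char. If v is out of range, the wording quoted
// above makes the result implementation-defined, but it is still some valid
// signed char value.
inline signed char narrow(int v) {
    return static_cast<signed char>(v);
}

// Copies the object representation of a signed char into an unsigned char,
// where every possible bit pattern represents a number.
inline unsigned char bits_of(signed char c) {
    unsigned char u;
    std::memcpy(&u, &c, sizeof u);
    return u;
}
```

On the common two's complement implementations narrow(128) comes out as -128, but the wording quoted above does not promise that particular value.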


Author: francis@robinton.demon.co.uk (Francis Glassborow)
Date: Mon, 26 Jul 2004 00:38:38 GMT

In article <4_EMc.179015$tH1.7390882@twister.southeast.rr.com>, Max Polk
<ux9i003@yahoo.com> writes
>Francis Glassborow wrote:
>> I may be having nightmares unnecessarily, but unsigned integer types
>> wrap in a completely defined way, whereas signed integer types have
>> undefined behaviour if a calculation takes a value out of the
>> specified range.
>>
>> Now does this mean that:
>>
>> char c(127);
>> ++c;
>>
>> Has undefined behaviour if char uses a signed 8-bit representation?
>>
>
>Are you referring to 4.7.3?  "If the destination type is signed, the
>value is unchanged if it can be represented in the destination type
>(and bitfield width); otherwise, the value is implementation defined."

No, I had missed that in the context of a narrowing conversion (in this
case storing an int rvalue of 128 in a char using an 8-bit signed
representation).

Now that rule, IIUC, means that the result of:

short s(32767);
++s;

Is implementation defined if the representation of s uses fewer bits
than that for int, but undefined otherwise.
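
Which of the two cases applies on a given implementation can be worked out at compile time. The following is only a sketch of that reasoning; Outcome and increment_short_at_max are hypothetical names, and the classification follows the rules as discussed in this thread.

```cpp
#include <cassert>
#include <limits>

enum class Outcome { ImplementationDefined, Undefined };

// Classifies "short s(max); ++s;" for the current implementation.
// If short has a smaller range than int, the addition is done in int without
// overflow and only the conversion back to short is out of range.
// If short has the same range as int, the int addition itself overflows.
constexpr Outcome increment_short_at_max() {
    return std::numeric_limits<short>::max() < std::numeric_limits<int>::max()
               ? Outcome::ImplementationDefined
               : Outcome::Undefined;
}
```

On a typical platform with 16-bit short and 32-bit int this reports the implementation-defined case.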

I think it is time we cleaned up these issues that were originally
intended to allow C to be supported on platforms that used other than
two's complement to represent negative integers. I think we should only
allow one of three (implementation defined) behaviours for overflow of a
signed integer value:

1) effectively modulo arithmetic offset by half the range. I.e. wrapping
as for unsigned
2) Saturation (which is normally detectable from within the program)
3) Raising an exception.

I think that making overflow of some integer types undefined behaviour
is (or should be) unacceptable.
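
For comparison, the three behaviours can already be obtained in portable code by testing for overflow before it happens. The sketch below is one way to do that for int; the names Policy and add are made up for the example, and the Wrap branch assumes a two's complement int (minimum equal to -maximum - 1).

```cpp
#include <cassert>
#include <limits>
#include <stdexcept>

enum class Policy { Wrap, Saturate, Throw };

// Adds two ints without ever evaluating an overflowing signed expression.
inline int add(int a, int b, Policy p) {
    const int max = std::numeric_limits<int>::max();
    const int min = std::numeric_limits<int>::min();
    const bool over  = b > 0 && a > max - b;  // a + b would exceed max
    const bool under = b < 0 && a < min - b;  // a + b would fall below min
    if (!over && !under) return a + b;        // in range: ordinary addition

    switch (p) {
    case Policy::Saturate:
        return over ? max : min;
    case Policy::Throw:
        throw std::overflow_error("signed add overflowed");
    case Policy::Wrap:
        break;
    }

    // Wrap: unsigned arithmetic is defined modulo 2^N, so compute the sum there.
    const unsigned u = static_cast<unsigned>(a) + static_cast<unsigned>(b);
    if (u <= static_cast<unsigned>(max)) return static_cast<int>(u);
    // Map the upper half of the unsigned range onto the negative ints without
    // performing an out-of-range conversion (assumes two's complement).
    return static_cast<int>(u - static_cast<unsigned>(min)) + min;
}
```

The pre-checks subtract from max or min only in the direction that cannot itself overflow, which is what keeps the test free of undefined behaviour.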

--
Francis Glassborow      ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit
For project ideas and contributions: http://www.spellen.org/youcandoit/projects


Author: musiphil@bawi.org (Seungbeom Kim)
Date: Mon, 26 Jul 2004 00:39:26 GMT

Jack Klein wrote:
> On Sat, 24 Jul 2004 18:03:01 GMT, francis@robinton.demon.co.uk
> (Francis Glassborow) wrote in comp.std.c++:
>
>
>>I may be having nightmares unnecessarily, but unsigned integer types wrap
>>in a completely defined way, whereas signed integer types have undefined
>>behaviour if a calculation takes a value out of the specified range.
>>
>>Now does this mean that:
>>
>>char c(127);
>>++c;
>>
>>Has undefined behaviour if char uses a signed 8-bit representation?
>
>
> No, it has an implementation-defined result, since it is the int value
> 127 that will be incremented to the int value 128, then this
> out-of-range value will be assigned back to 'c'.  It is the final
> assignment that generates the implementation-defined result, as stated
> in paragraph 3 of 4.7 of my original version of the standard.

So, you mean incrementing a char involves conversion to and from an int,
just as the following:

     char c(127);
     c = char(int(c) + 1);   // ++c

Can you refer me to the relevant clause of the standard?

--
Seungbeom Kim


Author: do-not-spam-benh@bwsint.com (Ben Hutchings)
Date: Mon, 26 Jul 2004 16:37:01 GMT

Jack Klein wrote:
<snip>
> Personally I do not expect to ever write code for a platform with any
> of these characteristics:
>
> 1.  1's complement or signed-magnitude integer representation.
>
> 2.  Any integer types having widths that are not exact powers of 2.
>
> 3.  Any integer types with padding bits.
>
> Again, to my knowledge, there are no current architectures that have
> any of these characteristics, although there might have been in the
> past.

Unisys Corporation still sells UNIVAC 1100 mainframes under the name
ClearPath <http://www.unisys.com/products/clearpath__servers/>.  This
architecture has 36-bit words and 1's complement signed representation.

A company called XKL has been selling a PDP-10 clone called TOAD-1.
The PDP-10 architecture has 36-bit words; I don't remember what signed
representation it uses.

> My understanding is that programming tools for such older
> architectures tend to be in "maintenance" phase.
>
> Perhaps a little research can determine if any such architecture has
> or expects to have a conforming C++ (or C99) implementation.  If not,
> such backwards support could be dropped from future versions of both
> the C and C++ standards.
<snip>

It is not clear to me whether there is a C or C++ implementation for
ClearPath.  XKL was funding a port of the GNU compiler tools to the
PDP-10 <http://pdp10.nocrew.org/> but I don't know whether this is
still ongoing.


Author: do-not-spam-benh@bwsint.com (Ben Hutchings)
Date: Mon, 26 Jul 2004 21:30:40 GMT

Seungbeom Kim wrote:
> Jack Klein wrote:
>> On Sat, 24 Jul 2004 18:03:01 GMT, francis@robinton.demon.co.uk
>> (Francis Glassborow) wrote in comp.std.c++:
>>
>>
>>>I may be having nightmares unnecessarily, but unsigned integer types wrap
>>>in a completely defined way, whereas signed integer types have undefined
>>>behaviour if a calculation takes a value out of the specified range.
>>>
>>>Now does this mean that:
>>>
>>>char c(127);
>>>++c;
>>>
>>>Has undefined behaviour if char uses a signed 8-bit representation?
>>
>>
>> No, it has an implementation-defined result, since it is the int value
>> 127 that will be incremented to the int value 128, then this
>> out-of-range value will be assigned back to 'c'.  It is the final
>> assignment that generates the implementation-defined result, as stated
>> in paragraph 3 of 4.7 of my original version of the standard.
>
> So, you mean incrementing a char involves conversion to and from an int,
> just as the following:
>
>      char c(127);
>      c = char(int(c) + 1);   // ++c
>
> Can you refer me to the relevant clause of the standard?

++c is equivalent to c += 1 (5.3.2/1), which is equivalent to c = c + 1
(5.17/7).  The "usual arithmetic conversions" (5/9) are performed on
the operands of the addition operator (5.7/1) and they include the
integral promotions (4.5) which convert char to int.  The result of
the addition has type int and is converted back to type char as part
of the assignment.
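
The chain of rules above can be spelled out step by step in code. This is a sketch rather than anything the standard mandates literally; the function name increment is invented, and signed char is used so that the promotion is to int on every implementation.

```cpp
#include <cassert>
#include <type_traits>

// The operands of "+" are promoted first, so "signed char + int" has type int.
static_assert(std::is_same<decltype(static_cast<signed char>(0) + 1), int>::value,
              "signed char + int is expected to have type int");

// Writes out the individual steps that "++c" goes through for a signed char.
inline signed char increment(signed char c) {
    int promoted = c;        // integral promotion (4.5)
    int sum = promoted + 1;  // addition in int (5.7); no overflow if int is wider
    // Conversion back to the narrow type as part of the assignment; the result
    // is implementation-defined when sum is out of range for signed char.
    return static_cast<signed char>(sum);
}
```

For in-range values increment(c) gives the same result as ++c; the interesting case is c == 127, where only the final conversion is out of range.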


Author: jackklein@spamcop.net (Jack Klein)
Date: Tue, 27 Jul 2004 03:43:57 GMT

On Mon, 26 Jul 2004 00:38:38 GMT, francis@robinton.demon.co.uk
(Francis Glassborow) wrote in comp.std.c++:

> In article <4_EMc.179015$tH1.7390882@twister.southeast.rr.com>, Max Polk
> <ux9i003@yahoo.com> writes
> >Francis Glassborow wrote:

   [snip to avoid rejection for over quoting]

> I think it is time we cleaned up these issues that were originally
> intended to allow C to be supported on platforms that used other than
> two's complement to represent negative integers. I think we should only
> allow one of three (implementation defined) behaviours for overflow of a
> signed integer value:
>
> 1) effectively modulo arithmetic offset by half the range. I.e. wrapping
> as for unsigned
> 2) Saturation (which is normally detectable from within the program)
> 3) Raising an exception.
>
> I think that making overflow of some integer types undefined behaviour
> is (or should be) unacceptable.

I very much agree, despite Ben Hutchings' information elsethread about
Unisys and XKL.  If these machines exist to support legacy
applications, the existing tools (K&R C, or C89/90, and whatever
pre-standard C++ compiler they might possibly have) will still work as
before.

I imagine that the test would be if there is either a C99
implementation or a C++ 98 implementation for either of these
platforms or any like them.  Given the age of the original version of
these standards, if there are legacy oddball platforms that haven't
implemented them yet, they are not likely to implement future
versions.

And while we are at it, can we please get rid of padding bits and trap
representations?

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~ajo/docs/FAQ-acllc.html


Author: francis@robinton.demon.co.uk (Francis Glassborow)
Date: Sat, 24 Jul 2004 18:03:01 GMT

I may be having nightmares unnecessarily, but unsigned integer types wrap
in a completely defined way, whereas signed integer types have undefined
behaviour if a calculation takes a value out of the specified range.

Now does this mean that:

char c(127);
++c;

Has undefined behaviour if char uses a signed 8-bit representation?

--
Francis Glassborow      ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit
For project ideas and contributions: http://www.spellen.org/youcandoit/projects
