Thread

Topic: No invalid bit patterns!

Author: NULL@NULL.NULL ("Tomás")
Date: Sat, 13 May 2006 15:45:28 GMT Raw View

Jeff Rife posted:


>> Is there not a requirement whereby an X-Bit unsigned integral type
>> can:
>>
>> a) Hold 2^X unique values
>>
>> b) Store values 0 to ( 2^X - 1 ) inclusive
>
> For "signed char" and "unsigned char" (and thus, "char"), there appears
> to be this requirement based on the wording in 3.9.1.1, but "these
> requirements do not hold for other types".  There are no changes to
> this section in the draft concerning this wording, so I'd say it's not
> gonna change.


What about 3.9.1.4 (page 54, or 82):

/* Begin Quotation */

Unsigned integers, declared unsigned, shall obey the laws of arithmetic
modulo 2^n where n is the number of bits in the value representation of that
particular size of integer.
 (This implies that unsigned arithmetic does not overflow because a result
that cannot be represented by the resulting unsigned integer type is reduced
modulo the number that is one greater than the largest value that can be
represented by the resulting unsigned integer type.)

/* End Quotation */


From the first paragraph, I deduce the following:

  If an unsigned integer type is 16-Bit, it should
  obey the laws of arithmetic modulo 65 536.

From the second paragraph, I deduce the following:

  Should 16-Bit unsigned arithmetic overflow (e.g. 2 * 40 000),
  the resultant figure is reduced modulo the number that is
  one greater than the largest value that can be represented
  by the resulting unsigned integer type.

Combining the two, I deduce that:

  Upon overflow, the number by which the resultant figure is
  moduloed, is one greater than the largest value which can
  be represented by the resulting unsigned integer type.

We know that the resultant figure is reduced modulo 65 536. Therefore, the
largest value the integer type can represent is 65 535 (i.e. one less than
the number by which it is moduloed).
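The arithmetic in this deduction can be checked with a small sketch. The
function names reduce_mod_2_16 and wrap_past_max are hypothetical, the
literal 65536 assumes the 16-bit value representation used in the example
above, and a C++11 or later compiler is assumed for constexpr. It says
nothing about the object representation.

```cpp
#include <limits>

// Reduce a mathematical result modulo 2^16, i.e. modulo one more than
// 65535, the largest value a 16-bit value representation can hold.
// (Hypothetical helper; 65536 is an assumption taken from the example.)
constexpr unsigned long reduce_mod_2_16(unsigned long value)
{
    return value % 65536UL;
}

// Unsigned arithmetic wraps instead of overflowing: max + 1 is 0.
constexpr unsigned wrap_past_max()
{
    return std::numeric_limits<unsigned>::max() + 1u;
}

// 2 * 40 000 = 80 000, and 80 000 - 65 536 = 14 464.
static_assert(reduce_mod_2_16(2UL * 40000UL) == 14464UL, "80000 mod 65536");
static_assert(wrap_past_max() == 0u, "unsigned arithmetic is modular");
```

Both checks are evaluated at compile time, so no variable is ever read
without having been given a value.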

I can assert the following:

1) A 16-Bit unsigned integer variable can store all numbers in the range 0
to 65 535 inclusive.
2) By the laws of mathematics, this makes use of each and every unique bit
pattern that the unsigned integer type can represent.
3) Accordingly, there are no "left-over" bit patterns, and, as such, there
can be no "invalid" bit pattern.

Therefore I assert that the following code could not possibly exhibit
undefined behaviour:

#include <iostream>

int main()
{
    unsigned char a;
    unsigned short b;
    unsigned c;
    unsigned long d;

    std::cout << a << b << c << d;
}


I attest that the Standard's limitations should be relaxed, indicating the
following:

It is perfectly okay to read the value of an uninitialised variable whose
type is one of the unsigned integer types.

-Tomás

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]

Author: NULL@NULL.NULL ("Tomás")
Date: Sat, 13 May 2006 22:14:40 GMT Raw View

Disregard the post immediately above this one in the thread; I posted it
two days ago and it's only shown up now -- but I had sent a second post in
the meantime which showed up before it.

-Tomás


Author: bop@gmb.dk ("Bo Persson")
Date: Sun, 14 May 2006 02:02:27 GMT Raw View

"Tomás" <NULL@NULL.NULL> wrote in message
news:wuO8g.9058$j7.305482@news.indigo.ie...

>/* Begin Quotation */
>
>Unsigned integers, declared unsigned, shall obey the laws of arithmetic
>modulo 2^n where n is the number of bits in the value representation of that
>particular size of integer.
> (This implies that unsigned arithmetic does not overflow because a result
>that cannot be represented by the resulting unsigned integer type is reduced
>modulo the number that is one greater than the largest value that can be
>represented by the resulting unsigned integer type.)
>
>/* End Quotation */

Note that it says "value representation", not "bit pattern".

[...]

> I can assert the following:
>
> 1) A 16-Bit unsigned integer variable can store all numbers in the range 0
> to 65 535 inclusive.
> 2) By the laws of mathematics, this makes use of each and every unique bit
> pattern that the unsigned integer type can represent.
> 3) Accordingly, there are no "left-over" bit patterns, and, as such, there
> can be no "invalid" bit pattern.

This is a false assertion, as you start by assuming that the value
representation must use all the bits. That is where it fails!

Nothing stops an implementation from storing your 16-bit value in an
18-bit memory word. That leaves plenty of room for invalid bit
patterns.
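The gap between the two representations can be measured on a given
implementation. The following is only a sketch: padding_bits is a
hypothetical helper name, it is meant for unsigned integer types, and a
C++11 or later compiler is assumed. It compares the size of the object
representation with the number of value bits reported by
std::numeric_limits.

```cpp
#include <climits>
#include <cstddef>
#include <limits>

// Number of bits in the object representation of T that are not part of
// its value representation (padding bits). Intended for unsigned integer
// types; padding_bits is a hypothetical name used for illustration.
template <typename T>
constexpr std::size_t padding_bits()
{
    return sizeof(T) * CHAR_BIT
         - static_cast<std::size_t>(std::numeric_limits<T>::digits);
}

// unsigned char must use every bit of its object representation.
static_assert(padding_bits<unsigned char>() == 0,
              "unsigned char has no padding bits");
```

On an implementation like the one described above, with a 16-bit value
stored in an 18-bit word, padding_bits<unsigned short>() would be 2. On
most current hardware it is 0, but that is a property of the
implementation and not a guarantee.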


Bo Persson



Author: kuyper@wizard.net
Date: Sat, 13 May 2006 21:04:55 CST Raw View

"Tomás" wrote:
..
> What about 3.9.1.4 (page 54, or 82):
..
> Unsigned integers, declared unsigned, shall obey the laws of arithmetic
> modulo 2^n where n is the number of bits in the value representation of that
> particular size of integer.

Key phrase: "value representation". The object representation can
contain bits that aren't part of the value representation.

> 1) A 16-Bit unsigned integer variable can store all numbers in the range 0
> to 65 535 inclusive.
> 2) By the laws of mathematics, this makes use of each and every unique bit
> pattern that the unsigned integer type can represent.
> 3) Accordingly, there are no "left-over" bit patterns,

Yes. There are no left over bits in the value representation.

> ... and, as such, there
> can be no "invalid" bit pattern.

Unless the bits that aren't part of the value representation play a
role in rendering it invalid.

> Therefore I assert that the following code could not possibly exhibit
> undefined behaviour:
>
> #include <iostream>
>
> int main()
> {
>     unsigned char a;
>     unsigned short b;
>     unsigned c;
>     unsigned long d;
>
>     std::cout << a << b << c << d;
> }

A program either has defined behavior, or it doesn't. You use
"undefined behavior" as if it were some specific list of behaviors,
presumably undesirable ones. When a program has undefined behavior, as
is the case with this one, anything is permitted, including doing
precisely what the programmer indefensibly expected it to do. For
example, in this case one of the permitted behaviors of this program is
to print out "0 0 0 0". Another is to print out "Program aborted, for
attempting to use the value of uninitialized variables."
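
For contrast, here is a minimal sketch of a variant of the quoted program
whose behavior is defined, assuming a C++11 or later compiler. The
function name print_initialized is hypothetical, and the spaces between
the values are added for readability. The only substantive change is that
each variable is value-initialized to zero before it is read.

```cpp
#include <sstream>
#include <string>

// Same four variables as the quoted program, but value-initialized to
// zero so that reading them is well defined.
// (print_initialized is a hypothetical name used for illustration.)
std::string print_initialized()
{
    unsigned char a{};
    unsigned short b{};
    unsigned c{};
    unsigned long d{};

    std::ostringstream out;
    // Convert the unsigned char so it prints as a number, not a character.
    out << static_cast<unsigned>(a) << ' ' << b << ' ' << c << ' ' << d;
    return out.str();
}
```

With the initializers in place the result is the string "0 0 0 0" on
every conforming implementation, instead of being one permitted outcome
among many.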


Author: nagle@animats.com (John Nagle)
Date: Mon, 15 May 2006 01:54:10 GMT Raw View

kuyper@wizard.net wrote:
> "Tomás" wrote:
>>... and, as such, there
>>can be no "invalid" bit pattern.
>
> Unless the bits that aren't part of the value representation play a
> role in rendering it invalid.

    Once upon a time, there were machines which had such representations,
such as the Burroughs 5000 and its successors through the Unisys A
series.  The Symbolics LISP machines also had such properties.
Burroughs used 36 bits to store both 24-bit integers and floating
point numbers; if the high 8 bits were zero, it was a positive
integer.  Symbolics had tag bits on each word, which were outside
the part of the word accessed by arithmetic operations.

    Burroughs used signed-magnitude integers.  Univac used
ones-complement.  IBM switched from ones-complement to
twos-complement in the 1960s.  Almost everybody who started
after 1960 used twos-complement, which is what we have today.

    C++ was originally intended to accommodate all this variability.
That's why the numeric representation isn't nailed down,
as it is in Java.

    John Nagle
    Animats
