Topic: Left-shift of signed integers


Author: nagle@animats.com (John Nagle)
Date: Mon, 28 Aug 2006 19:15:09 GMT
Old Wolf wrote:
> In the C standard, if E1 is a signed int, then E1 << 1 is explicitly
> defined as (E1 * 2), with undefined behaviour if this value is larger
> than INT_MAX.
>
> But the C++ standard (1998 version) says:
>     The value of E1 << E2 is E1 (interpreted as a bit pattern)
>     leftshifted E2 bit positions; vacated bits are zerofilled.
>
> and has no comment on the resulting value. Does this mean
> left shifts of signed integers are undefined?

    That's correct; left shifts of signed integers result
in undefined behavior.  Try them on a UNISYS A series (48-bit signed magnitude)
or B series (36-bit ones complement) machine, or a PDP-10 (36-bit
ones complement) family machine.  Admittedly there's very little running
hardware that isn't byte-oriented twos complement.  But UNISYS is still
selling those things.

Ref: http://www.unisys.com/eprise/main/admin/corporate/doc/dorado-280-spec.pdf
http://www.unisys.com/eprise/main/admin/corporate/doc/ClearPath_Plus_Libra_Model_300_Server_Spec_Sheet.pdf

The UNISYS B series, the old UNIVAC 1100 line, is the longest-lived line in
computing, dating back to the vacuum-tube UNIVAC 1101 in 1951.

Considering that UNISYS pushes Java, which has standardized numeric formats, the
problems of simulating twos-complement arithmetic on the old iron have
been solved.  With the PDP-10 line dead, it would probably be acceptable
to insist that C++ implementations now use a twos-complement representation.
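
A minimal sketch of how a program could check which signed representation it is running on. The names `SignRepr` and `detect_sign_repr` are hypothetical, chosen for illustration only. The check relies on the low two bits of -1 differing between the three representations: 11 for twos complement, 10 for ones complement, 01 for signed magnitude.

```cpp
// Hypothetical names, for illustration only.
enum class SignRepr { TwosComplement, OnesComplement, SignMagnitude };

// Inspect the low two bits of -1 to identify the representation of int.
inline SignRepr detect_sign_repr() {
    switch (-1 & 3) {
        case 3:  return SignRepr::TwosComplement; // ...1111
        case 2:  return SignRepr::OnesComplement; // ...1110
        default: return SignRepr::SignMagnitude;  // 1...0001
    }
}
```

Code that wants to insist on twos complement could instead refuse to compile elsewhere, for example with `static_assert((-1 & 3) == 3, "twos complement required");`.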

    John Nagle

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]





Author: bop@gmb.dk ("Bo Persson")
Date: Tue, 29 Aug 2006 04:48:34 GMT
"John Nagle" <nagle@animats.com> wrote in message
news:X6HIg.1263$Cq4.306@newssvr25.news.prodigy.net...
> Old Wolf wrote:
>> In the C standard, if E1 is a signed int, then E1 << 1 is explicitly
>> defined as (E1 * 2), with undefined behaviour if this value is larger
>> than INT_MAX.
>>
>> But the C++ standard (1998 version) says:
>>     The value of E1 << E2 is E1 (interpreted as a bit pattern)
>>     leftshifted E2 bit positions; vacated bits are zerofilled.
>>
>> and has no comment on the resulting value. Does this mean
>> left shifts of signed integers are undefined?
>
>    That's correct; left shifts of signed integers result
> in undefined behavior.  Try them on a UNISYS A series (48-bit signed magnitude)
> or B series (36-bit ones complement) machine, or a PDP-10 (36-bit
> ones complement) family machine.  Admittedly there's very little running
> hardware that isn't byte-oriented twos complement.  But UNISYS is still
> selling those things.
>
> Ref:
> http://www.unisys.com/eprise/main/admin/corporate/doc/dorado-280-spec.pdf
> http://www.unisys.com/eprise/main/admin/corporate/doc/ClearPath_Plus_Libra_Model_300_Server_Spec_Sheet.pdf
>
> The UNISYS B series, the old UNIVAC 1100 line, is the longest-lived line in
> computing, dating back to the vacuum-tube UNIVAC 1101 in 1951.
>
> Considering that UNISYS pushes Java, which has standardized numeric formats, the
> problems of simulating twos-complement arithmetic on the old iron have
> been solved.

No, not really. The "solution" is to add additional processors
with the "proper" hardware support for the Java standard types.

> With the PDP-10 line dead, it would probably be acceptable
> to insist that C++ implementations now use a twos-complement
> representation.

To make left shift of signed integers defined, or some other
improvement?

How about requiring 32-bits, byte addressed, IEEE floating point, no
EBCDIC, little endian, non-segmented memory, no padding bits?

Why is it so important to have fixed hardware characteristics, for
everyone?


And if you don't like Unisys, what about these guys:

http://www-03.ibm.com/systems/z/


Bo Persson







Author: "kanze" <kanze@gabi-soft.fr>
Date: Tue, 29 Aug 2006 09:23:30 CST
Old Wolf wrote:
> In the C standard, if E1 is a signed int, then E1 << 1 is explicitly
> defined as (E1 * 2), with undefined behaviour if this value is larger
> than INT_MAX.

That's interesting, because it is a real change compared to C90,
and to existing practice as well.  Note that it also says that it
is undefined behavior if E1 is negative!
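
A minimal sketch of the C99 rule as described here, for operands of type int: the shift is defined only when the count is in range, E1 is non-negative, and E1 * 2^E2 fits in the result type. The function name `shl_is_defined_c99` is hypothetical, and the reading of the rule is the one given in this thread, not a quotation from the standard.

```cpp
#include <limits>

// Hypothetical helper: true when, under the C99 rule discussed above,
// e1 << e2 has a defined value for operands of type int.
bool shl_is_defined_c99(int e1, int e2) {
    const int value_bits = std::numeric_limits<int>::digits; // excludes the sign bit
    if (e2 < 0 || e2 > value_bits) return false; // count must be in [0, width)
    if (e1 < 0) return false;                    // negative E1 is undefined
    // E1 * 2^E2 must be representable, i.e. E1 <= INT_MAX / 2^E2.
    return e1 <= (std::numeric_limits<int>::max() >> e2);
}
```

With this predicate, `shl_is_defined_c99(-1, 1)` is false even though the same shift gives -2 on typical twos-complement compilers, which is the change from existing practice noted above.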

> But the C++ standard (1998 version) says:
>     The value of E1 << E2 is E1 (interpreted as a bit pattern)
>     leftshifted E2 bit positions; vacated bits are zerofilled.

> and has no comment on the resulting value.

The C standard also says this.  It then comments on what it
means.

> Does this mean left shifts of signed integers are undefined?

No.  It means that the value bits of the representation will be
left shifted, with 0 bits inserted on the right.

> What if the int has padding bits interspersed with value bits,

They don't count.  In general, padding bits are invisible,
except when considering the relationship between the size of an
object, and its max and min values.  Otherwise, they wouldn't be
padding bits, but would participate in the value representation.

> and the shift results in a trap representation? What if 0x4000
> is left-shifted by one bit, when INT_MAX is 0x7FFF ?

The C++ standard says that you will get whatever integer is
represented by the bit pattern 0x8000.  Note that on a one's
complement or a signed magnitude machine, this may be a trapping
value.  Which is probably why the C standard is worded the way
it is.

Personally, I would have preferred that the "clarification" be
more along the lines of "the value is the value represented by
the resulting bit pattern, and may be trapping."  Followed, if
need be, by some statements like those in the C standard,
guaranteeing that it won't be trapping for certain values.
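
A minimal sketch of the 0x4000 example, assuming a 16-bit int with no padding bits. The shift is done on an unsigned pattern, and the result is then decoded under each of the three signed representations. The names `Repr16`, `shl16` and `decode16` are hypothetical. The decoder only reports the arithmetic value; it cannot model a trap, so a negative zero is reported as 0.

```cpp
#include <cstdint>

// Hypothetical names; the 16-bit width matches INT_MAX == 0x7FFF.
enum class Repr16 { TwosComplement, OnesComplement, SignMagnitude };

// Shift a 16-bit pattern left by n (assumed n < 16), zero-filling on the
// right and discarding bits shifted out of the top.
std::uint16_t shl16(std::uint16_t bits, unsigned n) {
    return static_cast<std::uint16_t>(
        (static_cast<std::uint32_t>(bits) << n) & 0xFFFFu);
}

// Decode a 16-bit pattern as a signed value under the given representation.
std::int32_t decode16(std::uint16_t bits, Repr16 r) {
    const std::int32_t low15 = bits & 0x7FFF;
    const bool sign = (bits & 0x8000) != 0;
    if (!sign) return low15;
    switch (r) {
        case Repr16::TwosComplement: return low15 - 0x8000; // sign bit weighs -2^15
        case Repr16::OnesComplement: return low15 - 0x7FFF; // sign bit weighs -(2^15 - 1)
        case Repr16::SignMagnitude:  return -low15;         // sign bit negates the magnitude
    }
    return 0; // not reached
}
```

Under these assumptions `shl16(0x4000, 1)` yields the pattern 0x8000, which decodes to -32768 in twos complement, -32767 in ones complement, and negative zero in signed magnitude. Which of those patterns, if any, is a trap representation is up to the implementation.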

--
James Kanze                                           GABI Software
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34







Author: nagle@animats.com (John Nagle)
Date: Wed, 30 Aug 2006 19:52:45 GMT
Bo Persson wrote:
> "John Nagle" <nagle@animats.com> wrote in message
> news:X6HIg.1263$Cq4.306@newssvr25.news.prodigy.net...

>>Considering that UNISYS pushes Java, which has standardized numeric formats, the
>>problems of simulating twos-complement arithmetic on the old iron have
>>been solved.
>
>
> No, it has not really. The "solution" is to add additional processors,
> with the "proper" hardware support for Java standard types.

    Surprisingly enough, UNISYS is actually supporting Java
on the OS2200 36-bit CPUs.

http://ecommunity.unisys.com/unisys/solution_center/sc_documents/whitepaper/41264862-100.pdf

    Yes, you can also plug in an x86 CPU board and run Java on it in the
same cabinet, but you don't have to.

    However, it appears that UNISYS does not support C++ for the OS2200 line, so
perhaps we can forget about ones complement hardware.

> And if you don't like Unisys, what about these guys:
> http://www-03.ibm.com/systems/z/

    IBM mainframes have 32-bit twos-complement integers, but the floating point
formats are not IEEE compatible.  No problem there.

    It looks like there's nothing left that runs C++ and doesn't have twos
complement integers.

    John Nagle






Author: "Old Wolf" <oldwolf@inspire.net.nz>
Date: Mon, 28 Aug 2006 09:04:13 CST
In the C standard, if E1 is a signed int, then E1 << 1 is explicitly
defined as (E1 * 2), with undefined behaviour if this value is larger
than INT_MAX.

But the C++ standard (1998 version) says:
    The value of E1 << E2 is E1 (interpreted as a bit pattern)
    leftshifted E2 bit positions; vacated bits are zerofilled.

and has no comment on the resulting value. Does this mean
left shifts of signed integers are undefined? What if the int
has padding bits interspersed with value bits, and the shift
results in a trap representation? What if 0x4000 is left-shifted
by one bit, when INT_MAX is 0x7FFF ?






Author: kuyper@wizard.net
Date: Mon, 28 Aug 2006 11:33:48 CST
Old Wolf wrote:
> In the C standard, if E1 is a signed int, then E1 << 1 is explicitly
> defined as (E1 * 2), with undefined behaviour if this value is larger
> than INT_MAX.
>
> But the C++ standard (1998 version) says:
>     The value of E1 << E2 is E1 (interpreted as a bit pattern)
>     leftshifted E2 bit positions; vacated bits are zerofilled.
>
> and has no comment on the resulting value. Does this mean
> left shifts of signed integers are undefined? What if the int
> has padding bits interspersed with value bits, and the shift
> results in a trap representation? ...


The C++98 standard was approved before C99; as a result, the portions
of C++98 that are compatible with some version of the C standard are
closer to the wording in C90 than to the (mostly) improved wording of
C99. Technically, therefore, C++98 incorporates many of the defects
that were fixed in C99. In general, I would recommend assuming that
some future version of the C++ standard will contain similar revisions
to fix those same defects. In the meantime I would recommend avoiding
writing code where the difference matters, if possible.
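
One way to avoid depending on the difference, sketched under the assumption that the intent of the shift is arithmetic (multiplication by a power of two): do the multiplication explicitly and check the range first, so no `<<` is ever applied to a signed operand. The name `mul_pow2` is hypothetical.

```cpp
#include <limits>

// Hypothetical helper: computes value * 2^n without using << on a signed
// operand. Returns false, leaving out untouched, if the result would not
// fit in int.
bool mul_pow2(int value, unsigned n, int& out) {
    const int max = std::numeric_limits<int>::max();
    const int min = std::numeric_limits<int>::min();
    if (value == 0) { out = 0; return true; } // zero never overflows
    int result = value;
    for (unsigned i = 0; i < n; ++i) {
        // Doubling is safe only while result stays within [min / 2, max / 2].
        if (result > max / 2 || result < min / 2) return false;
        result *= 2;
    }
    out = result;
    return true;
}
```

Unlike the shift, this also has a well-defined meaning for negative values: `mul_pow2(-3, 2, r)` stores -12 in `r`. Code that really is manipulating bit patterns rather than numbers is usually better served by unsigned types.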

Padding bits were formally recognised only in C99, therefore the C90
wording that was copied over into C++98 didn't have to worry about
them. The C99 wording implies that they are skipped over by the shift.
