Thread

Topic: Possible 32-bit or 64-bit char?

Author: kanze@gabi-soft.fr (J. Kanze)
Date: 1998/08/31

"Paul D. DeRocco" <pderocco@ix.netcom.com> writes:

|>  Steve Clamage wrote:

|>  > 16-bit integers are not going to be portable to all systems
|>  > in any event. For example, 36-bit machines used to be common,
|>  > and they did not support 16-bit data types at all. You'd have
|>  > to do a lot of unportable bit twiddling to read or write a file
|>  > of 16-bit quantities, if you could do it at all.

|>  Are there really any non-power-of-two machines that have modern C++
|>  implementations on them? I would think that a 36-bit machine would have
|>  to have completely non-standard everything, including disks that store
|>  9-bit bytes.

|>  For that matter, are there any one's complement or sign-magnitude
|>  machines that have modern C++ implementations?

|>  I'm just wondering how many actual implementations would be rendered
|>  impossible if C++ specified that chars had eight bits, and that signed
|>  numbers used two's complement form.

There is a machine in the current Unisys catalog which uses 40 bit
signed magnitude for integers.

--
James Kanze    +33 (0)1 39 23 84 71    mailto: kanze@gabi-soft.fr
GABI Software, 22 rue Jacques-Lemercier, 78000 Versailles, France
Conseils en informatique orientée objet --
              -- Beratung in objektorientierter Datenverarbeitung


[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: AllanW@my-dejanews.com
Date: 1998/08/31

In article <m3k93p5kgg.fsf@gabi-soft.fr>,
  kanze@gabi-soft.fr (J. Kanze) wrote:
>
> |>  Steve Clamage wrote:
> |>  > 16-bit integers ... 36-bit machines ...

> "Paul D. DeRocco" <pderocco@ix.netcom.com> writes:
> |>  Are there really any non-power-of-two machines that have modern C++
> |>  implementations on them?

The definition of "bit" leads us to the conclusion that all supported
machines are power-of-two machines.

> There is a machine in the current Unisys catalog which uses 40 bit
> signed magnitude for integers.

The maximum unsigned integer would be one less than 2 to the power of 40.

Okay, I'm being intentionally obtuse. You're really asking if there are
any machines where the number of bits in a word isn't itself a power of
two, as in 32-bit machines where 2^N=32 if N=5. But I ask, why does it
matter?

Imagine some hypothetical machine which had *ONLY* 32-bit math, and
where a machine address specifies a 32-bit value. The compiler can
simulate 8-bit chars with AND and OR, but this is slow. I submit that
this implementation will have exactly the same problems -- no better
and no worse -- than the 40-bit system.
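
A rough sketch of the AND/OR simulation described above, assuming a
word-addressed machine modelled here as a vector of 32-bit words. The
names load_char8 and store_char8 are hypothetical, and the choice to put
the first char in the low-order bits is an assumption, not something any
particular compiler is known to do:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helpers: treat a vector of 32-bit words as an array of
// 8-bit "chars", four per word, using only shifts, AND and OR.
// Assumed layout: char 0 occupies the low-order 8 bits of word 0.
inline unsigned load_char8(const std::vector<std::uint32_t>& words,
                           std::size_t i)
{
    assert(i / 4 < words.size());
    const unsigned shift = static_cast<unsigned>(i % 4) * 8;
    return static_cast<unsigned>((words[i / 4] >> shift) & 0xFFu);
}

inline void store_char8(std::vector<std::uint32_t>& words,
                        std::size_t i, unsigned value)
{
    assert(i / 4 < words.size());
    const unsigned shift = static_cast<unsigned>(i % 4) * 8;
    const std::uint32_t mask = std::uint32_t(0xFFu) << shift;
    // Clear the old 8 bits, then OR in the new value (truncated to 8 bits).
    words[i / 4] = (words[i / 4] & ~mask)
                 | ((std::uint32_t(value) & 0xFFu) << shift);
}
```

Every char access costs a shift and a mask on top of the word access,
which is the slowness referred to above.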

--
AllanW@my-dejanews.com is a "Spam Magnet" -- never read.
Please reply in USENET only, sorry.

Author: David R Tribble <david.tribble@noSPAM.central.beasys.com>
Date: 1998/08/04

Steve Clamage wrote:
>> 16-bit integers are not going to be portable to all systems
>> in any event. For example, 36-bit machines used to be common,
>> and they did not support 16-bit data types at all. You'd have
>> to do a lot of unportable bit twiddling to read or write a file
>> of 16-bit quantities, if you could do it at all.

Paul D. DeRocco wrote:
> Are there really any non-power-of-two machines that have modern C++
> implementations on them? I would think that a 36-bit machine would
> have to have completely non-standard everything, including disks that
> store 9-bit bytes.

C was ported to the DEC-20, a machine having 36-bit words, quite a
number of years ago.  I don't know the details, but I would assume
that:

    char     => 9 bits (4 per word), or 7 bits (5 per word)
    short    => 18 bits
    int      => 36 bits (a single word)
    long     => 36 bits

Since TOPS-20, the O/S that ran on the DEC-20, used ASCII characters,
either 7 or 9 bits would have been sufficient for 'char'.

Reading and writing data on this kind of system is different, of
course, than reading and writing data on an 8-bit (or 16- or 32-bit)
system.  Data is written to files in convenient units, probably
"words", rather than "characters".  Thus files were composed of
streams of 36-bit words, which of course fit nicely into C 'int'
data objects.

Writing characters entailed packing multiple characters into words
as they were written (and unpacking them from words as they were
read).  The TOPS-20 system I used back in college wrote standard
8-bit (9-track) magnetic tapes in a form that packed groups (pairs?)
of 36-bit words into 8-bit bytes (which apparently was a widespread
technique for DEC systems).
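
To make the packing concrete, here is a minimal sketch of the "7 bits,
5 per word" option. The 36-bit word is simulated in the low bits of a
64-bit integer, and the layout (first character in the highest bits, one
spare low bit) is an assumption for illustration only. Word36, pack5x7
and unpack5x7 are hypothetical names:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// A 36-bit word, simulated in the low 36 bits of a wider integer.
using Word36 = std::uint_least64_t;

// Pack up to five 7-bit characters into one word. Missing characters
// are padded with zero. 5 * 7 = 35 bits are used; the spare bit is the
// lowest one and stays zero (assumed layout).
inline Word36 pack5x7(const std::string& s)
{
    assert(s.size() <= 5);
    Word36 w = 0;
    for (std::size_t i = 0; i < 5; ++i) {
        const unsigned c =
            i < s.size() ? (static_cast<unsigned char>(s[i]) & 0x7Fu) : 0u;
        w = (w << 7) | c;
    }
    return w << 1;  // shift past the spare low bit
}

// Unpack always yields five characters (including any zero padding).
inline std::string unpack5x7(Word36 w)
{
    std::string s;
    for (int i = 0; i < 5; ++i) {
        const unsigned shift = 1u + 7u * static_cast<unsigned>(4 - i);
        s.push_back(static_cast<char>((w >> shift) & 0x7Fu));
    }
    return s;
}
```

With 9-bit chars the arithmetic is simpler (4 * 9 = 36, no spare bit),
at the cost of one fewer character per word.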

I assume that if C could be ported to such a machine, then C++
could be, too.

-- David R. Tribble, dtribble@technologist.com --

Author: "Paul D. DeRocco" <pderocco@ix.netcom.com>
Date: 1998/08/05

Scott Schurr wrote:
>
> I'm currently using a C (not C++) compiler for a DSP that has 32-bit
> chars -- the Analog Devices SHARC family of DSPs.  The DSP has no
> support for 8-bit data types, and very little use for characters.  So
> for this processor it's not a major drawback.  But it does make the
> character-oriented aspects of the standard C library awkward.

I've yet to see a C++ version for such a DSP, although I admit one is
plausible. But as long as the word size is a multiple of 8 bits, it
would be possible to do an implementation that used 8-bit chars. It
would be rather inefficient when operating on arrays of chars or
pointers to chars, but still feasible. Such a translation would be
impossible on a 36-bit machine, though. I'd still be shocked if anyone
attempted a modern, standard-conforming C++ compiler on a 36-bit
machine. My guess is that mandating 8-bit chars would probably kill
about 0.01% of the C++ compiler market.

--

Ciao,
Paul

Author: "Jim Cobban" <Jim.Cobban.jcobban@nt.com>
Date: 1998/08/05

In article <35C78F82.3C27@noSPAM.central.beasys.com>,
David R Tribble  <dtribble@technologist.com> wrote:
>
>C was ported to the DEC-20, a machine having 36-bit words, quite a
>number of years ago.  I don't know the details, but I would assume
>that:
>
>    char     => 9 bits (4 per word), or 7 bits (5 per word)
>    short    => 18 bits
>    int      => 36 bits (a single word)
>    long     => 36 bits
>
The first machine on which FORTRAN was implemented had 36 bit words.  I
attended a seminar many years ago at which the chief engineer for that
processor spoke.  He explained that the reason the processor had 36 bit
words was that it allowed him to optimize the design of his checkers playing
program.  The checkers playing program was used as part of the "burn-in" of
these machines before they were shipped to the customer.  On that processor
characters were 6 bits, since that supported all of the characters used in
FORTRAN (upper case letters, numbers, special characters).  As a result you
could put 6 characters in a word, which is why FORTRAN variable names were
for decades limited to 6 characters.
--
Jim Cobban   |  jcobban@nortel.ca                   |  Phone: (613) 763-8013
Nortel (MCS) |                                      |  FAX:   (613) 763-5199


Author: "Paul D. DeRocco" <pderocco@ix.netcom.com>
Date: 1998/08/04

Steve Clamage wrote:
>
> 16-bit integers are not going to be portable to all systems
> in any event. For example, 36-bit machines used to be common,
> and they did not support 16-bit data types at all. You'd have
> to do a lot of unportable bit twiddling to read or write a file
> of 16-bit quantities, if you could do it at all.

Are there really any non-power-of-two machines that have modern C++
implementations on them? I would think that a 36-bit machine would have
to have completely non-standard everything, including disks that store
9-bit bytes.

For that matter, are there any one's complement or sign-magnitude
machines that have modern C++ implementations?

I'm just wondering how many actual implementations would be rendered
impossible if C++ specified that chars had eight bits, and that signed
numbers used two's complement form.
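
For reference, a program can at least find out which of the three
representations it is running on. A minimal sketch, relying on the
well-known trick of inspecting the low two bits of -1; the function name
signed_representation is hypothetical:

```cpp
#include <string>

// Classify the signed-integer representation from the low two bits
// of -1:
//   two's complement   ...11  ->  3
//   ones' complement   ...10  ->  2
//   sign-magnitude     ...01  ->  1
inline std::string signed_representation()
{
    switch (-1 & 3) {
        case 3:  return "two's complement";
        case 2:  return "ones' complement";
        case 1:  return "sign-magnitude";
        default: return "unknown";
    }
}
```

Code that depends on one representation can use a check like this to
refuse to run elsewhere instead of silently misbehaving.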

--

Ciao,
Paul
---
Author: scotts@ims.com (Scott Schurr)
Date: 1998/08/04

In article <35C68038.E11004C8@ix.netcom.com>, "Paul D. DeRocco" <pderocco@ix.netcom.com> writes:
|> I'm just wondering how many actual implementations would be rendered
|> impossible if C++ specified that chars had eight bits, and that signed
|> numbers used two's complement form.

I'm currently using a C (not C++) compiler for a DSP that has 32-bit
chars -- the Analog Devices SHARC family of DSPs.  The DSP has no
support for 8-bit data types, and very little use for characters.  So
for this processor it's not a major drawback.  But it does make the
character-oriented aspects of the standard C library awkward.

--------------------------------------
Scott Schurr
  Integrated Measurement Systems, Inc.
  Voice: (503) 626-7117
  Fax:   (503) 644-6969
  Email: scotts@ims.com
--------------------------------------
---
Author: AllanW@my-dejanews.com
Date: 1998/08/04

In article <XK1x1.2449$3G4.5289765@news21.bellglobal.com>,
  "Michel Michaud" <Michel.Michaud@sympatico.ca> wrote:
> >In particular, there was some doubt about whether a conforming
> >implementation could have sizeof(char) == sizeof(int), because
> >various functions in the standard library such as fgetc()
> >assume that int can hold all the values in unsigned char
> >and also EOF.
> I did think about that but I forgot it was also a C++ issue because
> of istream's get()...
>
> >I don't recall what the conclusion was, however.
> :(

Is it required that all possible unsigned chars can be represented in
the file system? For instance
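    #include <assert.h>
    #include <stdio.h>

    int main()
    {
        unsigned char uc;
        int i;
        FILE *f = fopen("abc","w"); // or fopen("abc","wb");
        assert(f);
        fputc(0,f);
        // write every nonzero unsigned char value, 1..UCHAR_MAX
        for (uc=1; uc; ++uc) fputc(uc,f);
        fputc(1,f);
        fclose(f);
        f = fopen("abc","r"); // or fopen("abc","rb");
        assert(f);
        i = fgetc(f);
        assert(!i);
        // each value should read back unchanged
        for (uc=1; uc; ++uc) {
            i = fgetc(f);
            assert(uc==(unsigned char)i);
        }
        i = fgetc(f);
        assert(1==i);
        fclose(f);
        return 0;
    }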
Assuming no I/O errors, I think that assert() might still trigger
on some systems. Consider a hypothetical system that uses 8-bit
ASCII characters in its file system, but uses 12-bit bytes
internally. In this case, fputc(256,f) would write char(256%256),
which is '\0'. Reading this back in with fgetc(f) would return 0,
not 256. Isn't that right?

If so, then having sizeof(int)==sizeof(char) wouldn't be a problem.
Set EOF to -1, which is a value that can't possibly be read in
from fgetc() or anything else that reads data from the file system.

Author: "Michel Michaud" <Michel.Michaud@sympatico.ca>
Date: 1998/08/01

The (draft) standard states: (5.3.3)
  "The sizeof operator yields the number of bytes in the"
  "object representation of its operand... sizeof(char),"
  "sizeof(signed char) and sizeof(unsigned char) are 1..."

So IS IT possible that a conforming standard compiler would have

1 == sizeof(char) == sizeof(short) == sizeof(int) == sizeof(long)

which means 32-bit char (or larger) to support the minimum range
of long (byte is not always 8-bit as stated in 1.6) ?

If so, does this mean that there is no PORTABLE way to write even
a simple program like Unix's "cat" ?

Does it mean that there could be no way AT ALL to read or write to a
file containing, for example, a sequence of 16-bit integers ? (even
using bit field, because there could be an odd number of values)

I think that the standard should have stated that char MUST be as small
as the smallest unit of addressable data the target machine is using, or
the size of natural character...
(it does state that
  "...Plain ints have the natural size suggested by the architecture"
  "of the execution environment... ")

In his latest book, Stroustrup says that "the char type is supposed
to be chosen by the implementation to be the most suitable..." but I don't
see that in the standard, only that it must be "large enough" (3.9.1).

I guess this is also a C problem... although there may be a C++ solution
that is not C...

I think compiler vendors would not be foolish enough to have a 32-bit char
in an 8-bit environment, but the standard could have prevented that...

Michel Michaud micm19@mail2.cstjean.qc.ca
---
Author: clamage@Eng.Sun.COM (Steve Clamage)
Date: 1998/08/02

"Michel Michaud" <Michel.Michaud@sympatico.ca> writes:

>The (draft) standard states: (5.3.3)
>  "The sizeof operator yields the number of bytes in the"
>  "object representation of its operand... sizeof(char),"
>  "sizeof(signed char) and sizeof(unsigned char) are 1..."

>So IS IT possible that a conforming standard compiler would have

>1 == sizeof(char) == sizeof(short) == sizeof(int) == sizeof(long)

>which means 32-bit char (or larger) to support the minimum range
>of long (byte is not always 8-bit as stated in 1.6) ?

Yes. But 1.6 in the final standard does not mention bytes or
sizes. Section 1.7 mentions bytes, but does not say anything
about them being 8 bits. The C and C++ standards require a
byte to be at least 8 bits (since a char must be at least 8
bits and has a size of 1 byte). A byte can be larger.
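
These guarantees can be written down as compile-time checks. A minimal
sketch using CHAR_BIT from <climits>; the helper name bits_in is
hypothetical, and static_assert postdates this discussion:

```cpp
#include <climits>
#include <cstddef>

// Width in bits of type T's object representation on this
// implementation (sizeof counts bytes, CHAR_BIT is the bits per byte).
template <typename T>
constexpr std::size_t bits_in() { return sizeof(T) * CHAR_BIT; }

// Guarantees every conforming implementation has to meet.
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");
static_assert(CHAR_BIT >= 8, "a byte has at least 8 bits");
static_assert(bits_in<short>() >= 16, "short covers at least 16 bits");
static_assert(bits_in<long>() >= 32, "long covers at least 32 bits");
```

Nothing here rules out CHAR_BIT == 32 together with
sizeof(long) == 1, which is the case being asked about.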

>If so, does this mean that there is no PORTABLE way to write even
>a simple program like Unix's "cat" ?

Read and write one char at a time. It doesn't matter how many
bits the implementation uses internally to hold a character. It
maps the external representation of chars (e.g. in the OS's file
system) to the internal representation on input, and back again
on output. All the I/O operations on chars have the same defined
meaning independent of internal representation.
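
A minimal sketch of that approach using iostreams; the function name
copy_chars is hypothetical. Nothing in it depends on how many bits a
char has:

```cpp
#include <istream>
#include <ostream>
#include <sstream>  // only for the in-memory usage shown below the block

// Copy every character from `in` to `out`, one char at a time.
// Returns the number of characters copied.
inline unsigned long copy_chars(std::istream& in, std::ostream& out)
{
    unsigned long count = 0;
    char c;
    // get() fails at end of input; put() fails on an output error.
    while (in.get(c) && out.put(c)) {
        ++count;
    }
    return count;
}
```

A "cat" would call copy_chars(std::cin, std::cout). For a quick check
without files, a std::istringstream and a std::ostringstream can stand
in for the two streams.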

>Does it mean that there could be no way AT ALL to read or write to a
>file containing, for example, a sequence of 16-bit integers ? (even
>using bit field, because there could be an odd number of values)

16-bit integers are not going to be portable to all systems
in any event. For example, 36-bit machines used to be common,
and they did not support 16-bit data types at all. You'd have
to do a lot of unportable bit twiddling to read or write a file
of 16-bit quantities, if you could do it at all. In general,
you can't expect to write a fully portable C or C++ program that
depends on exact object sizes. If you depend only on minimum
sizes, you can write a portable program that works on a wide
variety of systems.

But if you need to share files with incompatible systems, you can't
do that portably in C or C++ directly. You'd need to use something
like a data serializing I/O library customized for each platform,
but your code that uses the library would be portable.
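
One possible shape for the bottom layer of such a library is sketched
below. It assumes the external file is a sequence of octets and that a
binary stream maps each char value 0..255 to exactly one octet. That
holds on common 8-bit systems, but it is precisely the part that would
have to be customized on, say, a 36-bit platform. The names write_u16_le
and read_u16_le and the low-octet-first order are arbitrary choices for
this sketch:

```cpp
#include <istream>
#include <ostream>
#include <sstream>  // only for the in-memory usage shown below the block

// Write the low 16 bits of `value` as two 8-bit units, low unit first.
// Relies only on the minimum guarantee that unsigned holds 16 bits.
inline void write_u16_le(std::ostream& out, unsigned value)
{
    out.put(static_cast<char>(static_cast<unsigned char>(value & 0xFFu)));
    out.put(static_cast<char>(
        static_cast<unsigned char>((value >> 8) & 0xFFu)));
}

// Read back a value written by write_u16_le.
// Returns false if fewer than two units could be read.
inline bool read_u16_le(std::istream& in, unsigned& value)
{
    char lo, hi;
    if (!in.get(lo) || !in.get(hi)) return false;
    value = ((static_cast<unsigned>(static_cast<unsigned char>(hi)) & 0xFFu)
                 << 8)
          |  (static_cast<unsigned>(static_cast<unsigned char>(lo)) & 0xFFu);
    return true;
}
```

Code built on these two calls never mentions sizeof(short) or CHAR_BIT,
so only the two functions themselves need porting. Writing 0xBEEF to a
std::stringstream and reading it back gives 0xBEEF again.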

>I think compiler vendors would not be foolish enough to have a 32-bit char
>in an 8-bit environment, but the standard could have prevented that...

There isn't any way for a language standard to prevent foolish
implementations without also preventing efficient implementations
on some systems. For example, Java places portability of object
code among its highest design goals. It therefore specifies the
exact size and implementation of all the basic data types. Java
will then be inefficient on, say, a 36-bit machine with
proprietary floating-point. C and C++ have different design goals,
placing efficient implementation above portability.  C or C++
code that depends on object sizes will not be as portable as
similar Java code, but will probably be more efficient.

"Quality of implementation" is a catch-all phrase that refers to
choices that are best left to a community of implementors and users
of a given platform. The exact implementation of basic data
types falls under that category. There might be reasons for
choosing 32 (or 64) bits as the size of a byte. Those reasons
might not suit your purposes, in which case you would choose
some other implementation.

--
Steve Clamage, stephen.clamage@sun.com
---
Author: fjh@cs.mu.OZ.AU (Fergus Henderson)
Date: 1998/08/02

"Michel Michaud" <Michel.Michaud@sympatico.ca> writes:

>The (draft) standard states: (5.3.3)
>  "The sizeof operator yields the number of bytes in the"
>  "object representation of its operand... sizeof(char),"
>  "sizeof(signed char) and sizeof(unsigned char) are 1..."
>
>So IS IT possible that a conforming standard compiler would have
>
>1 == sizeof(char) == sizeof(short) == sizeof(int) == sizeof(long)

This question has been debated in comp.std.c quite a bit,
and I think it may even be addressed in C9X.
In particular, there was some doubt about whether a conforming
implementation could have sizeof(char) == sizeof(int), because
various functions in the standard library such as fgetc()
assume that int can hold all the values in unsigned char
and also EOF.

I don't recall what the conclusion was, however.
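
To illustrate the doubt (not to settle it), here is a sketch of the
condition fgetc() relies on, plus a read loop that does not depend on
it. The names eof_is_unambiguous and count_chars are hypothetical:

```cpp
#include <climits>
#include <cstdio>

// True when int can represent every unsigned char value and still
// leave room for a distinct negative EOF. When sizeof(char) ==
// sizeof(int), UCHAR_MAX exceeds INT_MAX and this is false.
constexpr bool eof_is_unambiguous()
{
    return UCHAR_MAX <= INT_MAX;
}

// Count the characters in a stream. Where eof_is_unambiguous() is
// false, a data character could compare equal to EOF, so feof() and
// ferror() are consulted before the loop stops.
inline unsigned long count_chars(std::FILE* f)
{
    unsigned long n = 0;
    for (;;) {
        const int c = std::fgetc(f);
        if (c == EOF && (std::feof(f) || std::ferror(f))) break;
        ++n;
    }
    return n;
}
```

The usual "while ((c = fgetc(f)) != EOF)" loop is equivalent only when
eof_is_unambiguous() holds.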

--
Fergus Henderson <fjh@cs.mu.oz.au>  |  "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh>  |  of excellence is a lethal habit"
PGP: finger fjh@128.250.37.3        |     -- the last words of T. S. Garp.
---
Author: "Michel Michaud" <Michel.Michaud@sympatico.ca>
Date: 1998/08/02

>In particular, there was some doubt about whether a conforming
>implementation could have sizeof(char) == sizeof(int), because
>various functions in the standard library such as fgetc()
>assume that int can hold all the values in unsigned char
>and also EOF.
I did think about that but I forgot it was also a C++ issue because
of istream's get()...

>I don't recall what the conclusion was, however.
:(

Michel Michaud micm19@removethis.mail2.cstjean.qc.ca
---
Author: Christopher Eltschka <celtschk@physik.tu-muenchen.de>
Date: 1998/08/03

Michel Michaud wrote:
>
> >In particular, there was some doubt about whether a conforming
> >implementation could have sizeof(char) == sizeof(int), because
> >various functions in the standard library such as fgetc()
> >assume that int can hold all the values in unsigned char
> >and also EOF.
> I did think about that but I forgot it was also a C++ issue because
> of istream's get()...

Even without istream::get, it would be a C++ issue, since C++
includes the complete C standard library, including fgetc.
---