Topic: Types of wide character


Author: AllanW@my-dejanews.com
Date: 1998/08/07
There are three distinct types of character:
    signed char
    unsigned char
    char
The third type must behave exactly like one of the first two, but
which one is implementation-defined.
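
To make the "three distinct types" point concrete, here is a minimal sketch. It assumes a C++11 or later compiler (which postdates this thread) for static_assert and <type_traits>; the names which, plain_char_is_signed and check_char_types are invented for illustration.

```cpp
#include <cassert>
#include <climits>
#include <type_traits>

// Plain char shares its representation with signed char or unsigned char,
// but it is still a separate type from both of them.
static_assert(!std::is_same<char, signed char>::value,
              "char and signed char are distinct types");
static_assert(!std::is_same<char, unsigned char>::value,
              "char and unsigned char are distinct types");

// Because the three types are distinct, all three overloads can coexist.
inline int which(char)          { return 0; }
inline int which(signed char)   { return 1; }
inline int which(unsigned char) { return 2; }

// Implementation-defined: true where plain char behaves like signed char.
inline bool plain_char_is_signed() { return CHAR_MIN < 0; }

// Hypothetical helper that exercises the declarations above.
inline void check_char_types() {
    assert(which('a') == 0);
    assert(which(static_cast<signed char>('a')) == 1);
    assert(which(static_cast<unsigned char>('a')) == 2);
    assert(plain_char_is_signed() == std::is_signed<char>::value);
}
```

Calling check_char_types() should pass on any conforming implementation, whichever signedness it picks for plain char.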

Are there three distinct types of wide character?
    signed wchar_t
    unsigned wchar_t
    wchar_t
If so, must the "sign implementation" match that of chars? (i.e. is
it invalid for a compiler to make char act like signed char, but make
wchar_t act like unsigned wchar_t)?

But if there is only one type of wide character, is it signed or
unsigned?

--
AllanW@my-dejanews.com is a "Spam Magnet" -- never read.
Please reply in USENET only, sorry.

-----== Posted via Deja News, The Leader in Internet Discussion ==-----
http://www.dejanews.com/rg_mkgrp.xp   Create Your Own Free Member Forum


[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]






Author: stephen.clamage@sun.com (Steve Clamage)
Date: 1998/08/07
In article 1@nnrp1.dejanews.com, AllanW@my-dejanews.com writes:
>There are three distinct types of character:
>    signed char
>    unsigned char
>    char
>The third type must behave exactly like one of the first two, but
>which one is implementation-defined.

Right. That's a legacy from C, which allowed the char type to be
used as a tiny integer as well as to represent characters. K&R C
didn't have the "signed" keyword, so there were two available
types originally: char and unsigned char. Typical implementations
made char signed. Some implementations made char an unsigned type
for performance (or other) reasons, meaning there would be no
way to specify a signed tiny integer. Standard C added the "signed"
keyword (already present in some C implementations), and thus
provided three char types. That allowed maximum compatibility
with existing practice. C++ adopted the C rules.

>
>Are there three distinct types of wide character?

No. No historical precedent applies. There is only one wchar_t.
In C, it is typedefed to one of the integral types. In C++ it
is a built-in type which has the same implementation as one of the
integral types. The implementation chooses which type (or typedef
in C) to use. It can be signed or unsigned. Implementations vary.

Writing "signed wchar_t" or "unsigned wchar_t" is not allowed in
C++, and is not portable in C. (In the C++ standard, see 3.9.1
paragraph 5, and 7.1.5.2. In the C standard, see 7.1.6.)
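
The C++ side of this can be sketched as follows. This assumes a C++11 or later compiler and a mode in which wchar_t is a built-in type (some compilers offer a switch that makes it a typedef, in which case the assertions below would fail). The names kind and check_wchar_distinct are invented for illustration.

```cpp
#include <cassert>
#include <type_traits>

// In C++ wchar_t is an integral type of its own; it is not an alias for
// whichever integral type the implementation picked as its underlying type.
static_assert(std::is_integral<wchar_t>::value, "wchar_t is integral");
static_assert(!std::is_same<wchar_t, short>::value, "not short");
static_assert(!std::is_same<wchar_t, unsigned short>::value, "not unsigned short");
static_assert(!std::is_same<wchar_t, int>::value, "not int");
static_assert(!std::is_same<wchar_t, unsigned int>::value, "not unsigned int");
static_assert(!std::is_same<wchar_t, long>::value, "not long");
static_assert(!std::is_same<wchar_t, unsigned long>::value, "not unsigned long");

// Since the type is distinct, overloading on wchar_t and int is unambiguous.
inline int kind(wchar_t) { return 0; }
inline int kind(int)     { return 1; }

// Neither of the following is valid C++, so both are left commented out:
// signed wchar_t s;
// unsigned wchar_t u;

// Hypothetical helper that exercises the overloads above.
inline void check_wchar_distinct() {
    assert(kind(L'x') == 0);  // a wide character literal has type wchar_t
    assert(kind(42) == 1);    // an integer literal has type int
}
```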

---
Steve Clamage, stephen.clamage@sun.com










Author: James Kuyper <kuyper@wizard.net>
Date: 1998/08/07
The standard says that "Type wchar_t shall have the same size,
signedness, and alignment requirements (_intro.memory_) as one of the
other integral types, called its underlying type."

This wording implies to me that there are no guarantees about the
signedness of wchar_t, and I couldn't find any other wording that
specifies the signedness. I don't think you can legally precede wchar_t
with 'signed' or 'unsigned'. You can find out the signedness by checking
whether WCHAR_MIN is less than 0.
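
The WCHAR_MIN check can be sketched like this, alongside two equivalent tests for comparison. It assumes <cwchar> supplies WCHAR_MIN and uses C++11 for the rest; the function names are invented for illustration.

```cpp
#include <cassert>
#include <cwchar>   // WCHAR_MIN
#include <limits>

// The check described above: a signed wchar_t has a negative minimum.
inline bool wchar_signed_by_macro() { return WCHAR_MIN < 0; }

// The same question asked through the C++ library.
inline bool wchar_signed_by_limits() {
    return std::numeric_limits<wchar_t>::is_signed;
}

// A third way that needs no macro: -1 converted to an unsigned wchar_t
// becomes a large positive value, so the comparison is false.
inline bool wchar_signed_by_cast() { return static_cast<wchar_t>(-1) < 0; }

// Hypothetical helper: all three tests must agree on a given implementation.
inline void check_wchar_signedness() {
    assert(wchar_signed_by_macro() == wchar_signed_by_limits());
    assert(wchar_signed_by_cast() == wchar_signed_by_limits());
}
```

The result itself differs between implementations, so portable code should branch on it rather than assume either answer.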








Author: AllanW@my-dejanews.com
Date: 1998/08/08
I think that 3.9.1/5 also answered my other question, about whether
it's signed or unsigned, by simply refusing to state (it shall have
the same ... signedness ... as one of the other integral types).

What I wanted to do is take applications that had no knowledge of
wide characters, and compile them with the equivalent of
    #define char wchar_t
but that isn't going to fly if it uses signed char or unsigned char.
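
One possible workaround, sketched here under stated assumptions and not taken from the thread: route the character type through typedefs and derive the signed and unsigned companions from it, so nothing ever spells "unsigned wchar_t". USE_WIDE_CHARS, app_char, app_uchar and app_schar are hypothetical names, and std::make_unsigned / std::make_signed require C++11 or later. The application source would have to use these names in place of char, which a bare macro substitution does not achieve.

```cpp
#include <type_traits>

#if defined(USE_WIDE_CHARS)   // hypothetical build switch
typedef wchar_t app_char;
#else
typedef char app_char;
#endif

// Integer types with the same size as app_char and an explicit signedness.
typedef std::make_unsigned<app_char>::type app_uchar;
typedef std::make_signed<app_char>::type   app_schar;

static_assert(sizeof(app_uchar) == sizeof(app_char), "same size as app_char");
static_assert(sizeof(app_schar) == sizeof(app_char), "same size as app_char");
static_assert(std::is_unsigned<app_uchar>::value, "app_uchar is unsigned");
static_assert(std::is_signed<app_schar>::value, "app_schar is signed");
```

When app_char is wchar_t, the two derived types are ordinary integer types of matching size, not additional wide character types.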

In the words of the late great Lucille Ball: "Oh, well."

--
AllanW@my-dejanews.com is a "Spam Magnet" -- never read.
Please reply in USENET only, sorry.









Author: "Alberto Barbati" <albbarbZZZ@tin.it>
Date: 1998/08/09
AllanW@my-dejanews.com wrote in message <6qffr7$ehv$1@nnrp1.dejanews.com>...

>Are there three distinct types of wide character?
>    signed wchar_t
>    unsigned wchar_t
>    wchar_t

No, there aren't. Section 3.9.1  "Fundamental types" of the draft is kind of
obscure to me, but surely it does not say that the three types are to be
distinct. Paragraph 5 says

----------------
5 Type  wchar_t  is  a distinct type whose values can represent distinct
  codes for all members of the largest extended character set  specified
  among  the  supported locales (_lib.locale_).  Type wchar_t shall have
  the same size, signedness, and alignment requirements (_intro.memory_)
  as one of the other integral types, called its underlying type.
----------------

My personal interpretation (I may be wrong, comments appreciated) is that
the type wchar_t cannot be declared signed or unsigned, so there is only
one type.
Both MW CodeWarrior and MSVC do it that way, with the difference that CW
has a built-in wchar_t type and silently ignores any signed/unsigned in
front of it, while MSVC has wchar_t typedef'd to unsigned short, so a
signed/unsigned before it gives a compile error.

Do other implementations behave in the same way?

--
Alberto Barbati
Please remove ZZZ from my e-ddress when replying




