Topic: wchar_t as a basic type
Author: bright@nazgul.UUCP (Walter Bright)
Date: 22 Dec 92 08:53:37 GMT
In article <1992Dec9.103811.28262@alf.uib.no> yngvar@novaja.imr.no (Yngvar Foelling) writes:
/Sorry about following up this so long afterwards. But here's a point that I
/didn't see mentioned in the discussion.
/wchar_t is the type of a wide character constant, i.e. L'x'.
/Walter, you're a compiler writer. Would you like the idea of making the type
/of a constant, defined in the language, a type from the library?
It already is. L'x' is a wchar_t, which is defined in the library.
/How would you handle the correct insertion of a string like this:
/cout << L"Wide string constant";
The compiler "knows" about wchar_t, just like it does now
for ANSI C. No, it is not as clean as I would like, but the
L wart is the only one I've found so far.
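
A minimal sketch of the literal types under discussion. It assumes a compiler in which wchar_t is a built-in type and in which wide text is inserted into a wide stream (std::wostringstream here) rather than into cout. Neither had been settled when this thread was written. The helper names insert_wide and self_check are hypothetical.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <type_traits>

// The type of a wide character constant is wchar_t, with no header needed
// to name it (assumption: wchar_t is a keyword, as in later C++).
static_assert(std::is_same<decltype(L'x'), wchar_t>::value,
              "L'x' has type wchar_t");

// Hypothetical helper: insert a wide string into a wide stream and
// return what was written.
inline std::wstring insert_wide(const wchar_t* text) {
    std::wostringstream os;
    os << text;  // wide inserter for const wchar_t*
    return os.str();
}

// Small self-check showing the intended use.
inline void self_check() {
    assert(insert_wide(L"Wide string constant") == L"Wide string constant");
}
```

The static_assert is the part that bears on the question above: the compiler has to know the type of L'x' whether or not any library header has been included.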
Author: yngvar@novaja.imr.no (Yngvar Foelling)
Date: Wed, 9 Dec 92 10:38:11 GMT
Sorry about following up this so long afterwards. But here's a point that I
didn't see mentioned in the discussion.
wchar_t is the type of a wide character constant, i.e. L'x'.
Walter, you're a compiler writer. Would you like the idea of making the type
of a constant, defined in the language, a type from the library?
How would you handle the correct insertion of a string like this:
cout << L"Wide string constant";
--
ISO 8859-1: Yngvar Følling | Snail mail:
ASCII: Yngvar Foelling | Tertnesveien 121
| N-5084 Tertnes
E-mail: yngvar@imr.no | Norway
Author: bright@nazgul.UUCP (Walter Bright)
Date: 24 Nov 92 00:40:39 GMT
I understand that wchar_t, which is a typedef in ANSI C, is now proposed to
be a basic integral type in ANSI C++. The argument for this is that it
enables functions to be overloaded based on wchar_t.
I am puzzled why wchar_t cannot be handled as a class in the Standard Library.
Making it part of the library is a much simpler solution than trying to
integrate it into the type system of the language. If wchar_t cannot be
handled as a class, like complex is, then isn't that a failure of one of the
goals of C++, which is the ability to define new types and the operations
on those types?
Author: jss@lucid.com (Jerry Schwarz)
Date: Sun, 29 Nov 92 00:26:58 GMT
In article <1507@nazgul.UUCP>, bright@nazgul.UUCP (Walter Bright) writes:
|> I understand that wchar_t, which is a typedef in ANSI C, is now proposed to
|> be a basic integral type in ANSI C++. The argument for this is that it
|> enables functions to be overloaded based on wchar_t.
|>
|> I am puzzled why wchar_t cannot be handled as a class in the Standard Library.
|> Making it part of the library is a much simpler solution than trying to
|> integrate it into the type system of the language. If wchar_t cannot be
|> handled as a class, like complex is, then isn't that a failure of one of the
|> goals of C++, which is the ability to define new types and the operations
|> on those types?
wchar_t is part of ANSI/ISO C and is required to be an integral type.
The C++ committee could not reasonably change this. The proposal
to make wchar_t a distinct type was originally made by the library
working group because we were looking at iostreams whose current
specification ignores wchar_t.
Consider
wchar_t wc ;
cout << wc ;
Although a final decision hasn't been made on the meaning of a
wchar_t inserter, it seems that we should be able to define it
differently than a short or int inserter even on systems where
wchar_t has the same representation as a short or int.
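
The overloading argument can be sketched roughly as follows, assuming wchar_t is a distinct built-in type as proposed. The describe functions are hypothetical stand-ins for the inserters, not part of any library. If wchar_t were a typedef for short, the third overload would be a redefinition of the first and would not compile.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical overload set standing in for the short, int and wchar_t
// inserters.
inline std::string describe(short)   { return "short"; }
inline std::string describe(int)     { return "int"; }
inline std::string describe(wchar_t) { return "wchar_t"; }

// As a distinct type, wchar_t is never the same type as short or int,
// whatever representation it happens to share with them.
static_assert(!std::is_same<wchar_t, short>::value, "distinct from short");
static_assert(!std::is_same<wchar_t, int>::value, "distinct from int");

// Small self-check: each argument selects its own overload.
inline void self_check() {
    assert(describe(L'x') == "wchar_t");
    assert(describe(static_cast<short>(1)) == "short");
    assert(describe(1) == "int");
}
```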
There is a secondary efficiency consideration. Some compilers
would deal with objects based on the type
class Wchar { short w ; } ;
less efficiently than with
typedef short Wchar ;
In particular some will allocate more space for the former.
Before you jump to condemn such implementations, consider
the tradeoffs that have to be made when deciding layouts on
32 bit word addressable machines. Even on machines that are
byte addressable, word aligned operations (such as copying)
may be significantly faster than unaligned operations.
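
One way to see what a particular compiler does with the two declarations is to compare their size and alignment. This is only a sketch: WcharClass, WcharAlias and Layout are hypothetical names, alignof assumes a C++11 or later compiler, and the results are implementation-specific.

```cpp
#include <cassert>
#include <cstddef>

// The two candidate representations from the post, under hypothetical names.
class WcharClass { public: short w; };
typedef short WcharAlias;

// Size and alignment as reported by the compiler in use.
struct Layout {
    std::size_t size;
    std::size_t align;
};

inline Layout class_layout() { return { sizeof(WcharClass), alignof(WcharClass) }; }
inline Layout alias_layout() { return { sizeof(WcharAlias), alignof(WcharAlias) }; }

// The class can never be smaller than the short it contains; whether it is
// larger is the implementation choice under discussion.
inline void self_check() {
    assert(class_layout().size >= alias_layout().size);
    assert(class_layout().align >= alias_layout().align);
}
```

If class_layout() and alias_layout() report the same values on a given compiler, the space part of the efficiency concern does not arise there.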
-- Jerry Schwarz
Author: checker@acf5.NYU.EDU (checker)
Date: 29 Nov 92 03:51:16 GMT
jss@lucid.com (Jerry Schwarz) defends wchar_t as a builtin:
>In article <1507@nazgul.UUCP>, bright@nazgul.UUCP (Walter Bright) writes:
>|> I am puzzled why wchar_t cannot be handled as a class in the Standard Library.
>|> Making it part of the library is a much simpler solution than trying to
>|> integrate it into the type system of the language. If wchar_t cannot be
>|> handled as a class, like complex is, then isn't that a failure of one of the
>|> goals of C++, which is the ability to define new types and the operations
>|> on those types?
>wchar_t is part of ANSI/ISO C and is required to be an integral type.
>The C++ committee could not reasonably change this.
This is a legitimate point.
>There is a secondary efficiency consideration. Some compilers
>would deal with objects based on the type
> class Wchar { short w ; } ;
>less efficiently than with
> typedef short Wchar ;
>In particular some will allocate more space for the former.
>Before you jump to condemn such implementations, consider
>the tradeoffs that have to be made when deciding layouts on
>32 bit word addressable machines. Even on machines that are
>byte addressable, word aligned operations (such as copying)
>may be significantly faster than unaligned operations.
This isn't. Walter's point about this highlighting a failure in the
implementation of user-defined types rings true. If, in the face of a
`difficult-to-optimize' problem, people run to the compiler to hack in a
new builtin type, where will the incentive to write those optimizations
come from? More importantly, how can the committee deny with a straight
face every Tom, Dick, and Harry their own little builtin types, like
vectors, matrices, complex, strings, etc.? Each of these has performance
considerations that rival wchar_t.
Small, high performance, types like these will become more important as
people try to escape from the typeless morass that is the current set
of builtins; let's have the committee lead the way. Use the features of
the language to extend the language, not the compiler.
Chris
Author: jss@lucid.com (Jerry Schwarz)
Date: Sun, 29 Nov 92 22:00:13 GMT
In article <2092@acf5.NYU.EDU>, checker@acf5.NYU.EDU (checker) writes:
jss:
|> >wchar_t is part of ANSI/ISO C and is required to be an integral type.
|> >The C++ committee could not reasonably change this.
checker:
|> This is a legitimate point.
|>
jss:
|> >There is a secondary efficiency consideration. Some compilers
|> >would deal with objects based on the type
|> > class Wchar { short w ; } ;
|> >less efficiently than with
|> > typedef short Wchar ;
|> >In particular some will allocate more space for the former.
|> >Before you jump to condemn such implementations, consider
|> >the tradeoffs that have to be made when deciding layouts on
|> >32 bit word addressable machines. Even on machines that are
|> >byte addressable, word aligned operations (such as copying)
|> >may be significantly faster than unaligned operations.
|>
checker:
|> This isn't. ...
Apparently I was misunderstood. This secondary consideration
played _no_ role in the discussions leading up to the committee
decision to make wchar_t a distinct type.
My point is that on some architectures there are difficult decisions
to be made with regard to alignment and representations of classes
and integral types. Reasonable decisions can result in the class
and the typedef having different performance characteristics.
Nothing in checker's response refuted that point.
-- Jerry Schwarz
Author: bright@nazgul.UUCP (Walter Bright)
Date: 2 Dec 92 18:39:19 GMT
In article <1992Nov29.002658.16405@lucid.com> jss@lucid.com (Jerry Schwarz) writes:
/In article <1507@nazgul.UUCP>, bright@nazgul.UUCP (Walter Bright) writes:
/|> I am puzzled why wchar_t cannot be handled as a class in the Standard Library.
/wchar_t is part of ANSI/ISO C and is required to be an integral type.
/The C++ committee could not reasonably change this.
They change other things. And I bet making wchar_t a class would
break almost no existing code.
/Although a final decision hasn't been made on the meaning of a
/wchar_t inserter, it seems that we should be able to define it
/differently than a short or int inserter even on systems where
/wchar_t has the same representation as a short or int.
Which making it a class handles nicely.
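
A rough sketch of the class approach, for illustration only. Wchar here is a hypothetical library class, not a real standard type, and the bracketed output format is an arbitrary choice made to show that its inserter can differ from the short inserter.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical library class wrapping a 16-bit character code.
class Wchar {
public:
    explicit Wchar(short code) : w(code) {}
    short code() const { return w; }
private:
    short w;
};

// The class gets its own inserter, separate from the one for short.
// The "<...>" formatting is an assumption made for this example.
inline std::ostream& operator<<(std::ostream& os, Wchar wc) {
    return os << '<' << wc.code() << '>';
}

// Hypothetical helper: format any insertable value into a string.
template <class T>
std::string to_text(const T& value) {
    std::ostringstream os;
    os << value;
    return os.str();
}

// Same underlying value, different inserter selected by type.
inline void self_check() {
    assert(to_text(Wchar(65)) == "<65>");
    assert(to_text(static_cast<short>(65)) == "65");
}
```

The cost is that Wchar is no longer an integral type, which is the objection raised in the quoted text above.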
/There is a secondary efficiency consideration. Some compilers
/would deal with objects based on the type
/ class Wchar { short w ; } ;
/less efficiently than with
/ typedef short Wchar ;
The quality of C++ compilers is steadily improving, and as
a C++ implementor I'll say that these efficiency problems can
be removed, and probably will be long before C++ becomes a standard.
/In particular some will allocate more space for the former.
/Before you jump to condemn such implementations, consider
/the tradeoffs that have to be made when deciding layouts on
/32 bit word addressable machines. Even on machines that are
/byte addressable, word aligned operations (such as copying)
/may be significantly faster than unaligned operations.
On machines I am familiar with, short is a 16 bit word, and
there is no speed gain from aligning a short on a 32 bit boundary
rather than a 16 bit one. I find it strange that a compiler
would straddle a word for a 16 bit struct but not for a 16 bit
auto. Sounds like a problem with the compiler.