Topic: Type-safe C++


Author: Tim Ottinger <tottinge@oma.com>
Date: 1997/01/15
Bjorn Fahller wrote:
>
> John Burger wrote:
[snip]
> > class String {
> > public:
> > String();       // Empty string
> > String(char);   // Single char string
> > String(int);    // Character representation of int. 1234->"1234"
> > String(char *); // Conversion of char array to string.
> > void operator +=(char); // Append a char
> > void operator +=(int);  // Append a representation of an int
> > void operator +=(char *);       // Append a char array
> > };
> >
> > The above will compile, but if I do:
> > String s(0); s += 0; // Stringise integers
> > I'll get ambiguity problems!
>
> True, but fortunately the disambiguation isn't that hard:
>
> String s(int(0)); s+=char(0);
>
> It takes a few keystrokes more, but on the other hand also
> disambiguates for a reader. To me, the meaning of the above
> is clearer than with no explicit type info.

Not to be a C++ apologist, but I think this is not only easy
and clearer: I much prefer that the compiler stop and ask me
whenever there is a doubt.  I've had a compiler quietly make
multiple casts, calls, and coercions.

Sometimes it is much better to bail than to give a bad answer.

--
Tim
-------------------------------------------------------------
Tim Ottinger      | Object Mentor Inc. | OOA/D, C++, more..
tottinge@oma.com  | http://www.oma.com | Training/Consulting
-------------------------------------------------------------
"... remember, there are ways of succeeding that we would not
 personally have chosen. "              - Bjarne Stroustrup
-------------------------------------------------------------


[ comp.std.c++ is moderated.  To submit articles: try just posting with      ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu         ]
[ FAQ:      http://reality.sgi.com/employees/austern_mti/std-c++/faq.html    ]
[ Policy:   http://reality.sgi.com/employees/austern_mti/std-c++/policy.html ]
[ Comments? mailto:std-c++-request@ncar.ucar.edu                             ]





Author: Bjorn Fahller <Bjorn.Fahller@ebc.ericsson.se>
Date: 1997/01/13
John Burger wrote:
>
> My gripe comes from not being able to write a "natural" string class.
>
> class String {
> public:
> String();       // Empty string
> String(char);   // Single char string
> String(int);    // Character representation of int. 1234->"1234"
> String(char *); // Conversion of char array to string.
> void operator +=(char); // Append a char
> void operator +=(int);  // Append a representation of an int
> void operator +=(char *);       // Append a char array
> };
>
> The above will compile, but if I do:
> String s(0); s += 0; // Stringise integers
> I'll get ambiguity problems!

True, but fortunately the disambiguation isn't that hard:

String s(int(0)); s+=char(0);

It takes a few keystrokes more, but on the other hand also
disambiguates for a reader. To me, the meaning of the above
is clearer than with no explicit type info.
   _
/Bjorn.
--
Bjorn Fahller                  Tel: +46 8 4220898 /
NA/EBC/FNM/T                   -------------------
Ericsson Business Networks AB / A polar bear is a rectangular
S-131 89 Stockholm/SWEDEN    /  bear after a coordinate transform
---
[ comp.std.c++ is moderated.  To submit articles: Try just posting with your
                newsreader.  If that fails, use mailto:std-c++@ncar.ucar.edu
  comp.std.c++ FAQ: http://reality.sgi.com/austern/std-c++/faq.html
  Moderation policy: http://reality.sgi.com/austern/std-c++/policy.html
  Comments? mailto:std-c++-request@ncar.ucar.edu
]





Author: bs@research.att.com (Bjarne Stroustrup)
Date: 1997/01/13

 > From: "John Burger" <john.burger@compucat.com.au> writes:
 >
 > Thank you, Steve, for your considered response. However...
 >
 > Steve Clamage <stephen.clamage@Eng.Sun.COM> wrote in article
 > <199612311940.LAA09352@taumet.eng.sun.com>...
 > >
 > > The problem is complicated enough that you might want to consider
 > > not overloading unless it doesn't matter very much which version of
 > > the function gets called. (That is, when the overloading addresses
 > > efficiency but not correctness.) Apart from ambiguities, you may find
 > > an unexpected version of the function gets called when you don't supply
 > > a full set of choices.
 >
 > Unfortunately, C++ _requires_ overloading in (at least) two circumstances:
 > 1) Constructors
 > 2) Operators.
 >
 > My gripe comes from not being able to write a "natural" string class.
 >
 > class String {
 > public:
 > String(); // Empty string
 > String(char); // Single char string
 > String(int); // Character representation of int. 1234->"1234"
 > String(char *); // Conversion of char array to string.
 > void operator +=(char); // Append a char
 > void operator +=(int); // Append a representation of an int
 > void operator +=(char *); // Append a char array
 > };
 >
 > The above will compile, but if I do:
 > String s(0); s += 0; // Stringise integers
 > I'll get ambiguity problems!

Actually, you don't. 0 is an int that if necessary can be converted to
a char, short, pointer, etc.

One might wonder if having conversions from three built-in types is wise,
but your example resolves as you indicate you want it resolved. For example

void f()
{
 String s1(0); // Stringise integers
 s1 += 0;

 String s2('0'); // add characters
 s2 += '0';

 String s3("0"); // add strings
 s3 += "0";
}
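
A minimal sketch of the resolution described above, written in later
(C++11) syntax so that it can be checked at compile time. The free
function `which` and the enum `Picked` are hypothetical stand-ins for the
three String constructors, and `const char *` is used instead of `char *`
because string literals no longer convert to `char *` in C++11 and later.

```cpp
// Hypothetical stand-ins for String(char), String(int), String(char *).
enum class Picked { Char, Int, CharPtr };

constexpr Picked which(char)         { return Picked::Char; }
constexpr Picked which(int)          { return Picked::Int; }
constexpr Picked which(const char *) { return Picked::CharPtr; }

// The literal 0 has type int, so which(int) is an exact match. The other
// two overloads would need a conversion (int -> char, 0 -> null pointer),
// so they lose and the call is not ambiguous.
static_assert(which(0)   == Picked::Int,     "0 is an int");
static_assert(which('0') == Picked::Char,    "'0' is a char");
static_assert(which("0") == Picked::CharPtr, "a literal decays to a pointer");
```

The ambiguity only appears when no overload takes int, because then every
candidate needs a conversion of the same rank.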

 - Bjarne

Bjarne Stroustrup, AT&T Research, http://www.research.att.com/~bs/homepage.html







Author: "John Burger" <john.burger@compucat.com.au>
Date: 1997/01/10
Thank you, Steve, for your considered response. However...

Steve Clamage <stephen.clamage@Eng.Sun.COM> wrote in article
<199612311940.LAA09352@taumet.eng.sun.com>...
>
> The problem is complicated enough that you might want to consider
> not overloading unless it doesn't matter very much which version of
> the function gets called. (That is, when the overloading addresses
> efficiency but not correctness.) Apart from ambiguities, you may find
> an unexpected version of the function gets called when you don't supply
> a full set of choices.

Unfortunately, C++ _requires_ overloading in (at least) two circumstances:
1) Constructors
2) Operators.

My gripe comes from not being able to write a "natural" string class.

class String {
public:
String(); // Empty string
String(char); // Single char string
String(int); // Character representation of int. 1234->"1234"
String(char *); // Conversion of char array to string.
void operator +=(char); // Append a char
void operator +=(int); // Append a representation of an int
void operator +=(char *); // Append a char array
};

The above will compile, but if I do:
String s(0); s += 0; // Stringise integers
I'll get ambiguity problems!

John Burger





Author: "John Burger" <john.burger@compucat.com.au>
Date: 1997/01/10
Ted Clancy <s341282@student.uq.edu.au> wrote in article
<32CEF8EC.3E51@student.uq.edu.au>...

> Agreed about the logically different types. That's why I'm thankful for
> the new three different chars. I use signed and unsigned chars as small
> integers/raw memory since they are promotable to signed int and unsigned
> int, and are defined to take up the smallest addressable space on the
> platform. I typedef them to byte and ubyte, so that I have
> byte  <= short  <= int  <=  long
> ubyte <= ushort <= uint <= ulong

Which was my original point. You're using unsigned and signed char as small
integers, and chars as characters. Although characters are internally
represented as numbers, when externally represented (on the screen or on
paper) they have a graphical representation rather than a numeric one. If
you want to overload on "how to represent the information", I'd like to
overload on ubyte/byte (your words) or byte/tiny (my words; I picture bytes
as unsigned) independently of char. My classic example is the String class:
String(char); ('\x41' becomes "A")
String(byte); (255 becomes "255")
String(tiny); ((tiny)255 becomes "-1")
String(char *); (0 becomes "")
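
A rough sketch of how far the three char types already go toward this.
The names `sketch::byte`, `sketch::tiny`, `show` and `Shown` are
hypothetical; the point is only that char, signed char and unsigned char
are distinct types for overloading. An untyped literal such as 255 is
still an int, so the argument has to carry its type explicitly.

```cpp
#include <type_traits>

namespace sketch {
typedef unsigned char byte; // "byte" in the sense used above (unsigned)
typedef signed char   tiny; // "tiny" in the sense used above (signed)

// How the value would be rendered by a String-like class.
enum class Shown { AsCharacter, AsUnsigned, AsSigned };

constexpr Shown show(char) { return Shown::AsCharacter; }
constexpr Shown show(byte) { return Shown::AsUnsigned; }
constexpr Shown show(tiny) { return Shown::AsSigned; }
} // namespace sketch

// Plain char is a third type, distinct from both signed and unsigned char.
static_assert(!std::is_same<char, sketch::byte>::value, "");
static_assert(!std::is_same<char, sketch::tiny>::value, "");

// Each typed argument selects its own overload by exact match.
static_assert(sketch::show('\x41') == sketch::Shown::AsCharacter, "");
static_assert(sketch::show(sketch::byte(255)) == sketch::Shown::AsUnsigned, "");
static_assert(sketch::show(sketch::tiny(-1)) == sketch::Shown::AsSigned, "");
```

Calling `sketch::show(255)` without the cast would be ambiguous, since int
converts equally well to all three parameter types.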

> I use char (platform-dependent signed-ness, promotable to wchar) and the
> new wchar for text (ASCII or Unicode) for platform-dependent text.

That works, since it means that wchar is used when graphical representation
is required. Of course, C++ hampers this by offering implicit conversions
between ints and wchars!

John Burger





Author: James Kanze <james-albert.kanze@vx.cit.alcatel.fr>
Date: 03 Jan 1997 10:43:15 +0100
stephen.clamage@Eng.Sun.COM (Steve Clamage) writes:

|>  In designing a language from scratch, that would make sense. Java, for
|>  example, with few compatibility constraints, uses 16-bit chars, and
|>  mandates Unicode (among many implementation requirements). C++ is
|>  constrained to be compatible with C and with tons of legacy code and
|>  operating systems. That constraint rules out a lot of nice possibilities.

IMHO, the problem is less obvious than it seems.  Logically, one would
like a separate type for all conceptually distinct types.  Thus, text
characters, small integers, and raw memory really deserve a separate
type, much in the same way C++ recently added a bool type.  In practice,
however, I'm less than certain.  Fixing a type in the language means a
definite loss of flexibility.  While I don't think that there is any
question that we know enough about bool to do this, I'm much less sure
about textual characters.

Even 20 years ago, it was pretty sure that a boolean type would have two
legal values.  20 years ago, however, it was generally well known that 7
bits were enough for a character.  Had C, from its origins, fixed a
separate character type, specified as tightly as Java does today, it
would probably have been US-ASCII!  While 16 bit Unicode seems like a
minimum today, there are already text applications for which it is not
sufficient (Arabic with explicit ligatures, for example).  In my
opinion, Unicode is definitely the way to go in an application today,
but it is far from obvious, to me, at least, that it will still be
considered the "best" solution even 10 years from now.

Given this, although I really do regret the loss of additional type
checking (to prevent assigning an arbitrary integral value to a
character, etc.), I think that the current situation in C/C++ is the
only viable one long term.  It leaves the representation of the
characters up to the application (and partially, the library); this
means more work for the application programmer (and is more
error-prone), but applications are more easily changed than language
specifications.  Imposing Unicode for textual representations means that
some applications (Omega, for example) simply cannot be written in the
language.

--
James Kanze         home:     kanze@gabi-soft.fr        +33 (0)3 88 14 49 00
                    office:   kanze@vx.cit.alcatel.fr   +33 (0)1 69 63 14 54
GABI Software, Sarl., 8 rue des Francs Bourgeois, F-67000 Strasbourg, France
       -- Conseils en informatique industrielle --







Author: Ted Clancy <s341282@student.uq.edu.au>
Date: 1997/01/06
James Kanze wrote:
>
> stephen.clamage@Eng.Sun.COM (Steve Clamage) writes:
>
> |>  In designing a language from scratch, that would make sense. Java, for
> |>  example, with few compatibility constraints, uses 16-bit chars, and
> |>  mandates Unicode (among many implementation requirements). C++ is
> |>  constrained to be compatible with C and with tons of legacy code and
> |>  operating systems. That constraint rules out a lot of nice possibilities.
>
> IMHO, the problem is less obvious than it seems.  Logically, one would
> like a separate type for all conceptually distinct types.  Thus, text
> characters, small integers, and raw memory really deserve a separate
> type, much in the same way C++ recently added a bool type.  In practice,
> however, I'm less than certain.  Fixing a type in the language means a
> definite loss of flexibility.  While I don't think that there is any
> question that we know enough about bool to do this, I'm much less sure
> about textual characters.
>
Agreed about the logically different types. That's why I'm thankful for
the new three different chars. I use signed and unsigned chars as small
integers/raw memory since they are promotable to signed int and unsigned
int, and are defined to take up the smallest addressable space on the
platform. I typedef them to byte and ubyte, so that I have
byte  <= short  <= int  <=  long
ubyte <= ushort <= uint <= ulong

I use char (platform-dependent signed-ness, promotable to wchar) and the
new wchar for text (ASCII or Unicode) for platform-dependent text.

I assumed the two logically-separate-type uses of char was the reason
for the changes to the chars.
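
A small sketch of the typedef scheme described above, assuming the names
are placed in a hypothetical namespace `sketch` to keep them clear of any
system headers. The size ordering is what the language guarantees; nothing
here says how many bits any of the types actually has.

```cpp
#include <type_traits>

namespace sketch {
typedef signed char    byte;
typedef unsigned char  ubyte;
typedef unsigned short ushort;
typedef unsigned int   uint;
typedef unsigned long  ulong;
} // namespace sketch

// sizeof(char) is 1 by definition, and the same holds for its signed and
// unsigned variants, so these are the smallest addressable objects.
static_assert(sizeof(sketch::byte) == 1 && sizeof(sketch::ubyte) == 1, "");

// byte <= short <= int <= long
static_assert(sizeof(sketch::byte) <= sizeof(short) &&
              sizeof(short) <= sizeof(int) &&
              sizeof(int) <= sizeof(long), "");

// ubyte <= ushort <= uint <= ulong
static_assert(sizeof(sketch::ubyte) <= sizeof(sketch::ushort) &&
              sizeof(sketch::ushort) <= sizeof(sketch::uint) &&
              sizeof(sketch::uint) <= sizeof(sketch::ulong), "");

// Both small-integer types are distinct from plain char, which is what
// leaves char free to mean "text".
static_assert(!std::is_same<char, sketch::byte>::value, "");
static_assert(!std::is_same<char, sketch::ubyte>::value, "");
```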

> Even 20 years ago, it was pretty sure that a boolean type would have two
> legal values.  20 years ago, however, it was generally well known that 7
> bits were enough for a character.  Had C, from its origins, fixed a
> separate character type, specified as tightly as Java does today, it
> would probably have been US-ASCII!  While 16 bit Unicode seems like a
> minimum today, there are already text applications for which it is not
> sufficient (Arabic with explicit ligatures, for example).  In my
> opinion, Unicode is definitely the way to go in an application today,
> but it is far from obvious, to me, at least, that it will still be
> considered the "best" solution even 10 years from now.
>
> Given this, although I really do regret the loss of additional type
> checking (to prevent assigning an arbitrary integral value to a
> character, etc.), I think that the current situation in C/C++ is the
> only viable one long term.  It leaves the representation of the
> characters up to the application (and partially, the library); this
> means more work for the application programmer (and is more
> error-prone), but applications are more easily changed than language
> specifications.  Imposing Unicode for textual representations means that
> some applications (Omega, for example) simply cannot be written in the
> language.
>
> --
> James Kanze         home:     kanze@gabi-soft.fr        +33 (0)3 88 14 49 00
>                     office:   kanze@vx.cit.alcatel.fr   +33 (0)1 69 63 14 54
> GABI Software, Sarl., 8 rue des Francs Bourgeois, F-67000 Strasbourg, France
>               -- Conseils en informatique industrielle --
>

--
Ted Clancy                               | "...This is Pauline Hanson of Borg
s341282@student.uq.edu.au                |     Resistance is futile
B Engineering, University of Queensland. |     You _will_ be assimilated..."





Author: d96-mst@nada.kth.se (Mikael Ståldal)
Date: 1997/01/06
In article <rf5loab5c24.fsf@vx.cit.alcatel.fr>,
James Kanze <james-albert.kanze@vx.cit.alcatel.fr> wrote:

>Given this, although I really do regret the loss of additional type
>checking (to prevent assigning an arbitrary integral value to a
>character, etc.), I think that the current situation in C/C++ is the
>only viable one long term.

But what about only specifying that the 'char' type is distinct from all
other integral types (including 'signed char' and 'unsigned char') so
that a conversion from an integral constant to char is worse than a
conversion from an integral constant to any other integral type?

void f(char);
void f(signed char);

f(0); // will call f(signed char), but doesn't currently work?
f('\0'); // will call f(char) and works already
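
To make the current behaviour concrete, here is a sketch that asks the
compiler whether each call is well-formed, using a detection trait in
C++11 syntax. The trait name `f_accepts` is hypothetical. Under the
existing rules an int argument needs an integral conversion for either
overload, and the two conversions rank equally, so f(0) is rejected as
ambiguous rather than choosing f(signed char).

```cpp
#include <type_traits>
#include <utility>

void f(char);
void f(signed char);

// Primary template: assume the call f(T) is not well-formed.
template <class T, class = void>
struct f_accepts : std::false_type {};

// Chosen only when overload resolution for f(T) succeeds; a failed or
// ambiguous resolution silently discards this specialization.
template <class T>
struct f_accepts<T, decltype(f(std::declval<T>()), void())>
    : std::true_type {};

static_assert(f_accepts<char>::value,        "f('\\0') calls f(char)");
static_assert(f_accepts<signed char>::value, "exact match for f(signed char)");
static_assert(!f_accepts<int>::value,        "f(0) is ambiguous today");
```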





Author: James Kanze <james-albert.kanze@vx.cit.alcatel.fr>
Date: 1997/01/07
Ted Clancy <s341282@student.uq.edu.au> writes:

|>  Agreed about the logically different types. That's why I'm thankful for
|>  the new three different chars. I use signed and unsigned chars as small
|>  integers/raw memory since they are promotable to signed int and unsigned
|>  int, and are defined to take up the smallest addressable space on the
|>  platform. I typedef them to byte and ubyte, so that I have
|>  byte  <= short  <= int  <=  long
|>  ubyte <= ushort <= uint <= ulong
|>
|>  I use char (platform-dependent signed-ness, promotable to wchar) and the
|>  new wchar for text (ASCII or Unicode) for platform-dependent text.
|>
|>  I assumed the two logically-separate-type uses of char was the reason
|>  for the changes to the chars.

I don't think so.  It was more a case that sometimes you need signed,
and sometimes it doesn't matter.  Since the signed-ness of char was
implementation defined, it was felt necessary to offer a char type that
had to be signed.

I like your separation, although it isn't really supported by the
language.  ("char" promotes to "int", regardless of the type of wchar_t,
for example.)  At any rate, it makes the intention clearer.  Also, as I
explain in another posting, you really need char-1 (ISO 8859-1), char-2
(ISO 8859-2), etc.
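
The promotion point can be checked directly. This is a sketch in C++11
syntax, assuming a typical platform where int is wider than char; the
names `g` and `Target` are hypothetical.

```cpp
#include <type_traits>

// Unary plus applies the integral promotions to its operand, so
// decltype(+c) names the promoted type.
static_assert(std::is_same<decltype(+char('a')), int>::value,
              "char promotes to int");
static_assert(std::is_same<decltype(+static_cast<signed char>(1)), int>::value,
              "signed char promotes to int");

enum class Target { Int, Wide };

constexpr Target g(int)     { return Target::Int; }
constexpr Target g(wchar_t) { return Target::Wide; }

// char -> int is a promotion, while char -> wchar_t is only a conversion,
// so a char argument prefers the int overload over the wchar_t one.
static_assert(g('a')  == Target::Int,  "");
static_assert(g(L'a') == Target::Wide, "");
```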

--
James Kanze         home:     kanze@gabi-soft.fr        +33 (0)3 88 14 49 00
                    office:   kanze@vx.cit.alcatel.fr   +33 (0)1 69 63 14 54
GABI Software, Sarl., 8 rue des Francs Bourgeois, F-67000 Strasbourg, France
       -- Conseils en informatique industrielle --





Author: stephen.clamage@Eng.Sun.COM (Steve Clamage)
Date: 1997/01/02
In article DF5DD940@owl.compucat.com.au, John Burger <john.burger@compucat.com.au> writes:
>
>While we're correcting the type system, what about the following
>ambiguity?
>
>int fn(char *);
>int fn(long);
>int fn(char);
>
>At one level, these could be called unambiguously by:
>fn("Hi there!");
>fn(0x12345678L);
>fn('A');
>although not all compilers necessarily agree!
>
>However, what does the following call?
>fn(0);

The call is ambiguous, since any of the three functions can be called
with a standard conversion, but none with a promotion or exact
match. The problem is caused by an ill-considered use of overloading.

In general, when you overload a function on numeric types, you need
to overload on type int to avoid ambiguities. You should also
overload on type double if floating-point can be involved, and
on unsigned int if unsigned values can be involved.
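
As a sketch of that advice applied to the three declarations quoted
above: the overloads are written as constexpr functions returning a
hypothetical `Called` tag so the choice can be checked at compile time,
and `const char *` replaces `char *` so that a string literal is accepted
by later compilers.

```cpp
enum class Called { CharPtr, Long, Char, Int };

constexpr Called fn(const char *) { return Called::CharPtr; }
constexpr Called fn(long)         { return Called::Long; }
constexpr Called fn(char)         { return Called::Char; }

// With only the three overloads above, fn(0) is ambiguous: 0 -> null
// pointer, 0 -> long and 0 -> char are all conversions of the same rank.
// Adding an overload on int gives the literal an exact match.
constexpr Called fn(int)          { return Called::Int; }

static_assert(fn(0)           == Called::Int,     "");
static_assert(fn("Hi there!") == Called::CharPtr, "");
static_assert(fn(0x12345678L) == Called::Long,    "");
static_assert(fn('A')         == Called::Char,    "");
```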

The problem is complicated enough that you might want to consider
not overloading unless it doesn't matter very much which version of
the function gets called. (That is, when the overloading addresses
efficiency but not correctness.) Apart from ambiguities, you may find
an unexpected version of the function gets called when you don't supply
a full set of choices.

Example: In the iostream classes, the overloading of the << and >>
functions involve correctness and not just efficiency, and so are
overloaded on every basic type plus void* and char*. The basic idea,
input/output, is the same for all the functions, but the details
are different and important.

If you overload
 fire(Gun&)      // shoot a pistol
 fire(BBQpit&)   // start a cooking fire
 fire(Employee&) // discharge a worker
in the same program, things just get confusing.


>I would LOVE to be able to use
>fn(nil);  // Or whatever. NULL is a de-facto reserved word.
>fn('\0'); // Only way to pass a char - use char syntax!
>fn(0);   // Cannot be confused with char
>
>This requires going away from the C++ concept that the char is the
>universal unit.
>I know that (8-bit) bytes are hardly universal even now, but adding
>"byte" to the language would make things SO much easier! No longer would
>there be confusion with char, signed char, and unsigned char (only char,
>signed byte and unsigned byte),
>and sizeof would return number of bytes an object used, rather than the
>number of chars.
>
>This would also allow UNICODE (and other) multi-byte characters to be
>supported later intuitively as chars, once ASCII and EBCDIC (finally)
>die, and not spoil the concept of sizes being expressed in bytes.

In designing a language from scratch, that would make sense. Java, for
example, with few compatibility constraints, uses 16-bit chars, and
mandates Unicode (among many implementation requirements). C++ is
constrained to be compatible with C and with tons of legacy code and
operating systems. That constraint rules out a lot of nice possibilities.
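
For readers coming to this thread later: the fn(nil) wish quoted above is
close to what the nullptr keyword, added in C++11, provides. A sketch,
assuming a C++11 compiler and using a hypothetical `Called` tag to record
which overload is chosen:

```cpp
enum class Called { CharPtr, Long, Char };

constexpr Called fn(char *) { return Called::CharPtr; }
constexpr Called fn(long)   { return Called::Long; }
constexpr Called fn(char)   { return Called::Char; }

// nullptr converts to any pointer type but not to an integer, so only
// the pointer overload is viable.
static_assert(fn(nullptr) == Called::CharPtr, "");

// Character and long literals already select their own overloads.
static_assert(fn('\0') == Called::Char, "");
static_assert(fn(0L)   == Called::Long, "");

// fn(0) is still ambiguous with these three overloads; the literal 0
// remains an int that converts equally well to all of them.
```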

---
Steve Clamage, stephen.clamage@eng.sun.com