Topic: Defect Report: contradiction in [lex.string] "String literals" and [lex.ccon] "Character literals"


Author: kanze@gabi-soft.de (James Kanze)
Date: 23 Apr 03 10:28:57 GMT
 [Moderator's note: this defect report has been
 forwarded to the C++ committee. -moderator.]

§2.13.4 [lex.string]/5 reads "Escape sequences and
universal-character-names in string literals have the same meaning as in
character literals, except that the single quote ' is representable
either by itself or by the escape sequence \', and the double quote "
shall be preceded by a \. In a narrow string literal, a
universal-character-name may map to more than one char element due to
multibyte encoding."

The first sentence refers us to §2.13.2 [lex.ccon], where we read in the
first paragraph that "An ordinary character literal that contains a
single c-char has type char [...]."  Since the grammar shows that a
universal-character-name is a c-char, something like '\u1234' must have
type char (and thus be a single char element); in paragraph 5, we read
that "A universal-character-name is translated to the encoding, in the
execution character set, of the character named.  If there is no such
encoding, the universal-character-name is translated to an
implementation-defined encoding."
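
The rule quoted from [lex.ccon]/1 can be sketched as follows.  This
assumes C++11 or later (static_assert and decltype postdate this
report), and single_cchar_literal_is_char is a hypothetical helper
name.  A literal such as '\u1234' is deliberately left out, since
whether and how it compiles is exactly the point in question.

```cpp
#include <type_traits>

// [lex.ccon]/1: an ordinary character literal that contains a single
// c-char has type char.
static_assert(std::is_same<decltype('a'), char>::value,
              "'a' has type char");

// An escape sequence is also a single c-char, so the same rule applies.
static_assert(std::is_same<decltype('\x41'), char>::value,
              "'\\x41' has type char");

// sizeof(char) is 1 by definition, so such a literal is exactly one
// char element.
static_assert(sizeof('a') == 1, "a char literal is one char element");

// Hypothetical helper so the same property can be checked at run time.
bool single_cchar_literal_is_char() {
    return std::is_same<decltype('\''), char>::value;
}
```

If a universal-character-name is just another c-char, the same
reasoning would force '\u1234' into a single char element as well.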

This is in obvious contradiction with the second sentence of
[lex.string]/5.  In addition, I'm not really clear what is supposed to
happen in the case where the execution (narrow-)character set is UTF-8.
Consider the character \u0153 (the oe ligature in the French word
oeuvre).  Should '\u0153' be a char, with an "error" value, say '?' (in
conformance with the requirement that it be a single char), or an int,
with the two char values 0xC5, 0x93, in an implementation-defined order
(in conformance with the requirement that a character representable in
the execution character set be represented)?  Supposing the former,
should "\u0153" be the equivalent of "?" (in conformance with the first
sentence), or "\xC5\x93" (in conformance with the second)?
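
A minimal sketch of how one might probe what a given implementation
does with the string case, again assuming C++11 or later.  The names
element_count, utf8_spelled_out_count and oe_element_count are
hypothetical.  The count returned for "\u0153" depends on the
execution character set, so no particular value is claimed for it.

```cpp
#include <cstddef>

// Hypothetical helper: number of char elements in a narrow string
// literal, not counting the terminating null character.
template <std::size_t N>
constexpr std::size_t element_count(const char (&)[N]) {
    return N - 1;
}

// The UTF-8 encoding of U+0153 written out with hex escapes is always
// two char elements, whatever the execution character set.
static_assert(element_count("\xC5\x93") == 2,
              "two hex escapes give two char elements");

std::size_t utf8_spelled_out_count() {
    return element_count("\xC5\x93");
}

// The universal-character-name form.  With a UTF-8 execution character
// set one would expect 2 here; with a single-byte set that contains the
// character, or with a substitute such as '?', one would expect 1.
std::size_t oe_element_count() {
    return element_count("\u0153");
}
```

Comparing the two counts shows whether the implementation follows the
multibyte reading of the second sentence or the single-element reading
of the first.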

--
James Kanze             GABI Software             mailto:kanze@gabi-soft.fr
Conseils en informatique orientée objet/
                           Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, Tél. : +33 (0)1 30 23 45 16
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]