Topic: Defect report: byte-order mark as an extended character in
Author: David Krauss <potswa@gmail.com>
Date: Wed, 7 Dec 2011 12:24:26 -0800 (PST)
C++11 makes numerous additions to the list of Unicode code points
allowed in identifiers (Annex E). This includes the byte order mark,
U+FEFF, which falls within the range FE47-FFFD. However, this
character is also used as a prefix or envelope for UTF-16 and UTF-8
files.
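As a rough illustration (not text from the standard), the overlap can
be checked at compile time. The function name in_fe47_fffd and the
constant kByteOrderMark are made up for this sketch, and only the one
range quoted above is modeled, not the full table of allowed ranges.

```cpp
#include <cstdint>

// Hypothetical helper: models only the single range FE47-FFFD quoted
// in the post, not the complete list of allowed code points.
constexpr bool in_fe47_fffd(std::uint32_t cp) {
    return cp >= 0xFE47 && cp <= 0xFFFD;
}

// U+FEFF, the byte order mark (also ZERO WIDTH NO-BREAK SPACE).
constexpr std::uint32_t kByteOrderMark = 0xFEFF;

// The BOM sits inside the allowed range, which is the point of the report.
static_assert(in_fe47_fffd(kByteOrderMark), "U+FEFF lies inside FE47-FFFD");
```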
The most reasonable thing for an implementation to do is ignore a BOM
at the beginning of a source file, but it's still possible (if
unlikely) for a file to begin with an identifier whose first character
is U+FEFF, which would be indistinguishable from a BOM.
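A minimal sketch of the "ignore a leading BOM" behavior for UTF-8
input, assuming the source file has already been read into a byte
string. The helper skip_utf8_bom is hypothetical and is not taken from
any particular compiler.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the offset of the first byte after a
// leading UTF-8 encoded U+FEFF (bytes EF BB BF), or 0 if there is none.
inline std::size_t skip_utf8_bom(const std::string& bytes) {
    if (bytes.size() >= 3 &&
        static_cast<unsigned char>(bytes[0]) == 0xEF &&
        static_cast<unsigned char>(bytes[1]) == 0xBB &&
        static_cast<unsigned char>(bytes[2]) == 0xBF) {
        return 3;
    }
    return 0;
}
```

A front end that does this unconditionally cannot tell a BOM followed
by the identifier x apart from a two-character identifier made of
U+FEFF and x, since both begin with the same three bytes.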
Although this ambiguity is unlikely to arise in practice, it probably
makes more sense to exclude the BOM from identifiers. In its other
role as ZERO WIDTH NO-BREAK SPACE, U+FEFF has been superseded by
WORD JOINER, U+2060, as the invisible, non-breaking space character
preferred for use in text. The latter is also allowed in identifiers,
for better or worse.
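One way the suggested exclusion could be expressed is to split the
range around U+FEFF. This is only a sketch of the idea, not the
wording of any adopted resolution, and the name in_range_without_bom
is invented for the example.

```cpp
#include <cstdint>

// Hypothetical replacement for the single range FE47-FFFD:
// FE47-FEFE and FF00-FFFD, which leaves out U+FEFF and nothing else.
constexpr bool in_range_without_bom(std::uint32_t cp) {
    return (cp >= 0xFE47 && cp <= 0xFEFE) ||
           (cp >= 0xFF00 && cp <= 0xFFFD);
}

// The BOM is rejected; its immediate neighbors are still accepted.
static_assert(!in_range_without_bom(0xFEFF), "U+FEFF is excluded");
static_assert(in_range_without_bom(0xFEFE), "U+FEFE is still allowed");
static_assert(in_range_without_bom(0xFF00), "U+FF00 is still allowed");
```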
See also the discussion on Stack Overflow,
http://stackoverflow.com/questions/8227642/is-the-byte-order-marker-really-a-valid-identifier
- D
--
[ comp.std.c++ is moderated. To submit articles, try posting with your ]
[ newsreader. If that fails, use mailto:std-cpp-submit@vandevoorde.com ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html ]