Topic: Unicode support by extending std::locale. Can we
Author: Tom Honermann <tom@honermann.net>
Date: Wed, 28 Mar 2018 10:12:20 -0400
Hi, Dimitrij! Thanks for taking an interest in Unicode support!

In case you haven't heard, the C++ committee formed Study Group 16 at
the Jacksonville meeting a little over a week ago to focus on improving
Unicode support. We'll soon have a new SG16 Unicode mailing list set up,
and I'd like to encourage you to resubmit this for discussion to that
mailing list once it opens (I'll let you know when that happens).

Tom.

On 03/28/2018 08:45 AM, Dimitrij Mijoski wrote:
>
>
> Unicode support by extending std::locale. Can we make it by 2020?
>
> Standard Unicode support has been requested many times, so I won't
> argue the case again. It is obvious that we need it.
>
> Goals:
>
> * No new string class
> * No new character type
> * Reuse std::locale and facet interfaces
> * Follow best practices and see how Linux and POSIX handle locales.
> * Follow the ICU library.
> * See boost::locale, which extends std::locale.
> * Design bottom-up: first define the low-level pieces (facets),
> then their use (e.g. in iostreams).
>
>
> 1. Terms and definitions
>
> * Natural language - a spoken language with its own script/alphabet.
> A script consists of all the characters needed for a particular
> language.
> * Computer character set - a strictly defined set (in the mathematical
> sense) of characters. One character set can usually serve multiple
> languages.
> * Character map - a mapping between a character set and integers.
> * Encoding - a scheme that specifies how a character from a set is
> encoded into bits.
>   o A popular scheme is to first map the characters to integers
>   and then binary-encode each integer into a fixed number of bits
>   and bytes.
>   o Another popular scheme is to encode a character into a
>   variable-length sequence of (consecutive) bytes.
>   o A less popular scheme is to use shift states. In such encodings
>   there are bytes that alter a shift state, and bytes that form
>   an actual character. The final character depends on its "own"
>   bytes plus any earlier bytes that altered the shift state.
>   o Other schemes may exist.
> * Unicode - a standard that gathers characters into a single set with
> room for about 1.1 million code points, maps each character to a
> unique integer (its code point) and defines several encodings:
> UTF-32, UTF-16 and UTF-8. It also defines byte serializations of
> UTF-16 and UTF-32: UTF-16BE, UTF-16LE, UTF-32BE and UTF-32LE.
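
To make the "variable-length sequence of bytes" scheme concrete, here is a minimal sketch of how a single code point is encoded as UTF-8. The function name encode_utf8 is made up for illustration, and the sketch assumes the input is a valid Unicode scalar value (it does not reject surrogates or values above 0x10FFFF).

```cpp
#include <string>

// Hypothetical helper: encode one code point as 1 to 4 UTF-8 bytes.
// Assumes cp is a valid scalar value; no validation is done here.
std::string encode_utf8(char32_t cp) {
    std::string out;
    if (cp < 0x80) {                       // 7 bits: one byte, same as ASCII
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 11 bits: two bytes
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 16 bits: three bytes
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 21 bits: four bytes
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

The same code point would occupy exactly one 32-bit unit in UTF-32, which is the fixed-length scheme from the first bullet.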
>
>
> 2. Current state
>
>
> In C and C++, a locale (in most implementations) is identified by a
> pair: a natural-language identifier and a narrow-encoding identifier.
>
> In C and C++, the standard groups the encodings into three groups:
>
> 1. Narrow single-byte encodings.
> 2. Narrow multi-byte encodings.
> 3. Wide encodings - fixed-length encodings where each character is
> mapped to a single wchar_t unit, which may (but need not) be wider
> than one byte and always has the same size.
>
> Examples are:
>
> 1. ASCII, ISO-8859-1 (latin1), ISO-8859-2 (latin2), ISO-8859-5
> (cyrillic).
> 2. Shift-JIS, UTF-8, UTF-16BE, UTF-16LE
> 3. UCS-2, UTF-32
>
> In C and C++, the third group is defined in terms of the first two.
> The standard says that for each multi-byte (and single-byte) encoding
> there should exist a wide fixed-length encoding that encodes each
> character of the same character set into a single distinct wide
> character. In other words, for each supported character set there
> should be one narrow (single-byte or multi-byte) encoding and one wide
> encoding.
>
> One last requirement from the standard is that every character set
> (narrow and wide encoded) must be a superset of the basic execution
> character set (narrow and wide encoded).
>
> The standard allows the wide encoding of one character set to differ
> from the wide encoding of another. In practice, implementations do not
> use this freedom:
>
> * On Linux, Android and various Unixes the wide encoding is always
> UTF-32, whichever narrow encoding the locale selects.
> * On Windows, the wide encoding is always UCS-2 when calling
> standard library functions. When calling the Windows API, wide
> strings are interpreted as UTF-16.
>
> From what we see, the C++ standard allows, but does not mandate,
> decent Unicode support in an implementation. Linux with glibc as the
> C standard library and libstdc++ as the C++ standard library is one
> such implementation. Just create a locale with UTF-8 as the narrow
> encoding, std::locale("en_US.UTF-8"), and you get acceptable Unicode
> support. But even on Linux there are drawbacks: locales first have to
> be enabled and generated with a command-line utility such as
> locale-gen.
>
> Other platforms are less fortunate. On Windows you cannot create a
> locale with UTF-8 as the narrow encoding, so you don't get UTF-32 as
> the wide encoding; you can only get UCS-2.
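
Because a named locale may be missing on a given system, constructing one can fail at run time. The following is a minimal sketch of a defensive wrapper; the function name utf8_locale_or_classic and the fallback policy are assumptions made for illustration, not part of the proposal.

```cpp
#include <locale>
#include <stdexcept>

// Hypothetical helper: try to construct a named locale and fall back to
// the classic "C" locale when the system does not provide it.
std::locale utf8_locale_or_classic(const char* name = "en_US.UTF-8") {
    try {
        return std::locale(name);        // throws std::runtime_error if unknown
    } catch (const std::runtime_error&) {
        return std::locale::classic();   // always available
    }
}
```

A caller can compare the returned locale's name() against "C" to find out whether the fallback was taken.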
>
>
> 3. Future proposal
>
>
> ctype<char32_t>
>
> This facet should behave almost like |ctype<wchar_t>| on systems where
> |wchar_t| is UTF-32 encoded. The wide facet on Linux has one little
> gotcha: when used from the classic/"C" locale, it does not work for
> characters above ASCII. We can test that with the following program.
>
> #include <locale>
> #include <cassert>
> using namespace std;
> int main()
> {
>     // Wide facet of "C" locale does not work for anything above ASCII.
>     // This is little gotcha.
>     auto& c = std::locale::classic();
>     assert(isupper(L'ß', c) == false);
>     assert(islower(L'ß', c) == false);
>     assert(isupper(L'ẞ', c) == false);
>     assert(islower(L'ẞ', c) == false);
>     assert(toupper(L'ß', c) == L'ß');
>     assert(tolower(L'ß', c) == L'ß');
>     assert(toupper(L'ẞ', c) == L'ẞ');
>     assert(tolower(L'ẞ', c) == L'ẞ');
>     assert(isupper(L'б', c) == false);
>     assert(islower(L'б', c) == false);
>     assert(isupper(L'Б', c) == false);
>     assert(islower(L'Б', c) == false);
>     assert(toupper(L'б', c) == L'б');
>     assert(tolower(L'б', c) == L'б');
>     assert(toupper(L'Б', c) == L'Б');
>     assert(tolower(L'Б', c) == L'Б');
>
>     // Generic unicode locale, classifies and converts Latin and Cyrillic
>     // letters correctly.
>     auto cu8 = locale("C.UTF-8");
>     assert(isupper(L'ß', cu8) == false);
>     assert(islower(L'ß', cu8) == true);
>     assert(isupper(L'ẞ', cu8) == true);
>     assert(islower(L'ẞ', cu8) == false);
>     assert(toupper(L'ß', cu8) == L'ß'); // why not ẞ?
>     assert(tolower(L'ß', cu8) == L'ß');
>     assert(toupper(L'ẞ', cu8) == L'ẞ');
>     assert(tolower(L'ẞ', cu8) == L'ß');
>     assert(isupper(L'б', cu8) == false);
>     assert(islower(L'б', cu8) == true);
>     assert(isupper(L'Б', cu8) == true);
>     assert(islower(L'Б', cu8) == false);
>     assert(toupper(L'б', cu8) == L'Б');
>     assert(tolower(L'б', cu8) == L'б');
>     assert(toupper(L'Б', cu8) == L'Б');
>     assert(tolower(L'Б', cu8) == L'б');
>
>     // Latin-1 locale. The wide facet classifies and converts case
>     // correctly even for Unicode characters that are not part of latin-1.
>     auto l1 = locale("en_US.ISO-8859-1");
>     assert(isupper(L'ß', l1) == false);
>     assert(islower(L'ß', l1) == true);
>     assert(isupper(L'ẞ', l1) == true);
>     assert(islower(L'ẞ', l1) == false);
>     assert(toupper(L'ß', l1) == L'ß'); // why not ẞ?
>     assert(tolower(L'ß', l1) == L'ß');
>     assert(toupper(L'ẞ', l1) == L'ẞ');
>     assert(tolower(L'ẞ', l1) == L'ß');
>     assert(isupper(L'б', l1) == false);
>     assert(islower(L'б', l1) == true);
>     assert(isupper(L'Б', l1) == true);
>     assert(islower(L'Б', l1) == false);
>     assert(toupper(L'б', l1) == L'Б');
>     assert(tolower(L'б', l1) == L'б');
>     assert(toupper(L'Б', l1) == L'Б');
>     assert(tolower(L'Б', l1) == L'б');
>
>     // English UTF-8 locale, works as expected
>     // same as the two above.
>     auto u8 = locale("en_US.UTF-8");
>     assert(isupper(L'ß', u8) == false);
>     assert(islower(L'ß', u8) == true);
>     assert(isupper(L'ẞ', u8) == true);
>     assert(islower(L'ẞ', u8) == false);
>     assert(toupper(L'ß', u8) == L'ß'); // why not ẞ?
>     assert(tolower(L'ß', u8) == L'ß');
>     assert(toupper(L'ẞ', u8) == L'ẞ');
>     assert(tolower(L'ẞ', u8) == L'ß');
>     assert(isupper(L'б', u8) == false);
>     assert(islower(L'б', u8) == true);
>     assert(isupper(L'Б', u8) == true);
>     assert(islower(L'Б', u8) == false);
>     assert(toupper(L'б', u8) == L'Б');
>     assert(tolower(L'б', u8) == L'б');
>     assert(toupper(L'Б', u8) == L'Б');
>     assert(tolower(L'Б', u8) == L'б');
>     return 0;
> }
>
> We should avoid this gotcha completely and make |ctype<char32_t>| work
> out of the box for the whole Unicode range. The locale name should
> modify only the |widen()| and |narrow()| functions.
>
> We can additionally allow the natural-language part of the locale to
> slightly modify the case conversion, e.g. for the Turkish dotted and
> dotless I <https://en.wikipedia.org/wiki/Dotted_and_dotless_I>.
>
> Defining this facet should also enable decent Unicode regexes, since
> std::regex_traits relies on the ctype facet of its locale.
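
As a rough illustration of language-sensitive case conversion, the sketch below shows how a language tag could tailor an otherwise language-neutral mapping. Everything here is hypothetical: simple_toupper is not a standard function, the language tag parameter is an assumption, and the table covers only ASCII, basic Cyrillic and the two Turkic special cases.

```cpp
#include <string>

// Hypothetical helper, not a standard API: simple (one-to-one) uppercase
// mapping for a tiny subset of Unicode, with optional Turkic tailoring.
char32_t simple_toupper(char32_t c, const std::string& lang = "") {
    const bool turkic = (lang == "tr" || lang == "az");
    if (turkic && c == U'i')
        return 0x0130;                       // i -> İ (capital I with dot above)
    if (c == 0x0131)
        return U'I';                         // dotless ı -> I
    if (c >= U'a' && c <= U'z')
        return static_cast<char32_t>(c - (U'a' - U'A'));
    if (c >= 0x0430 && c <= 0x044F)          // Cyrillic а..я -> А..Я
        return static_cast<char32_t>(c - 0x20);
    return c;                                // everything else is unchanged
}
```

A real facet would draw its mappings from the Unicode Character Database rather than hard-coded ranges; the point is only that the language part of the locale selects a small set of exceptions on top of a shared default.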
>
>
> ctype<char16_t>
>
> This should behave exactly the same as the above, except that it will
> accept only the first 65,536 code points of Unicode, i.e. characters
> from the Basic Multilingual Plane (BMP).
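
A small sketch of the BMP restriction follows. The helper name fits_in_one_char16 is made up for illustration; it additionally excludes the surrogate range, on the assumption that a lone surrogate is not a character such a facet would want to classify.

```cpp
// Hypothetical helper: true when a code point can be represented by a
// single char16_t unit, i.e. it lies in the BMP and is not a surrogate.
constexpr bool fits_in_one_char16(char32_t cp) {
    return cp <= 0xFFFF && !(cp >= 0xD800 && cp <= 0xDFFF);
}
```

Code points for which this returns false need a surrogate pair in UTF-16, so a per-unit |ctype<char16_t>| could not see them as one character.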
>
>
> codecvt
>
> The |codecvt| specializations for |char16_t| and |char32_t| already
> exist, but they convert strictly from/to UTF-8. I am not happy with
> this because it is not aligned with the |codecvt| for |wchar_t|, which
> can convert from/to a custom encoding when the implementation allows
> it.
>
> Maybe we should allow |codecvt_byname| to create a |codecvt| that
> converts between a custom encoding and UTF-32/UTF-16.
>
> Additionally, I am not happy with the general design of codecvt,
> mostly because the facets are created by locale name. A codecvt
> depends only on the encoding part of the locale identifier pair, not
> on the natural-language identifier. Maybe we should create a new
> codecvt2 facet.
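
For reference, this is roughly what using the existing UTF-8-only specialization looks like. It is a minimal sketch: the function name utf8_to_utf32 is made up, error handling is reduced to a single exception, and note that this specialization was later deprecated in C++20, so newer compilers may warn.

```cpp
#include <cwchar>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: decode a complete UTF-8 string into UTF-32 using
// the standard codecvt<char32_t, char, mbstate_t> facet.
std::u32string utf8_to_utf32(const std::string& in) {
    using Cvt = std::codecvt<char32_t, char, std::mbstate_t>;
    const Cvt& cvt = std::use_facet<Cvt>(std::locale::classic());

    std::mbstate_t state{};
    std::u32string out(in.size(), U'\0');   // never more code points than bytes
    const char* from_next = nullptr;
    char32_t* to_next = nullptr;

    const auto res = cvt.in(state,
                            in.data(), in.data() + in.size(), from_next,
                            &out[0], &out[0] + out.size(), to_next);
    if (res != std::codecvt_base::ok)
        throw std::runtime_error("invalid or incomplete UTF-8 input");

    out.resize(static_cast<std::size_t>(to_next - &out[0]));
    return out;
}
```

The external encoding is fixed by the specialization itself, which is the mismatch with |codecvt<wchar_t, char, mbstate_t>| described above.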
>
>
> codecvt2<InternT, ExternT> facet
>
> codecvt2 should be constructed only with an encoding name and convert
> from/to that encoding. This is the way |iconv| works: you first call
> |iconv_open| with two encoding names. It is also the way ICU's
> ucnv.h <http://icu-project.org/apiref/icu4c/ucnv_8h.html> works: you
> open a |UConverter| with |ucnv_open()| by giving the name of the
> encoding.
>
> * |codecvt2<char, char>| may convert between various narrow encodings.
> * |codecvt2<char32_t, char>| will convert between a custom narrow
> encoding and UTF-32.
> * |codecvt2<char32_t, char16_t>| may convert between UTF-32 and UTF-16.
>
> Besides the standard |in()| and |out()|, additional functions should
> be provided, for example a simple function that converts one string
> into another without keeping state. The input string is known to be
> complete; it is not a fragment of a larger text. Such functionality,
> |wstring_convert|, was added in C++11 and deprecated in C++17.
>
> An additional function that converts only one output character may
> also be provided.
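
To show the "construct from encoding names only" model that codecvt2 would borrow, here is a minimal sketch of a whole-string conversion built on POSIX iconv. The wrapper name convert is made up for illustration, it assumes a platform that provides <iconv.h> (for example glibc), and it does not flush shift state, so it is only suitable for stateless encodings.

```cpp
#include <cerrno>
#include <cstddef>
#include <iconv.h>
#include <stdexcept>
#include <string>

// Hypothetical wrapper: convert a complete string between two encodings
// that are identified purely by name, as iconv_open() does.
std::string convert(const std::string& input, const char* to, const char* from) {
    iconv_t cd = iconv_open(to, from);
    if (cd == reinterpret_cast<iconv_t>(-1))
        throw std::runtime_error("unsupported conversion");

    std::string out;
    char buf[256];
    char* in_ptr = const_cast<char*>(input.data());  // iconv does not modify the input
    std::size_t in_left = input.size();

    while (in_left > 0) {
        char* out_ptr = buf;
        std::size_t out_left = sizeof buf;
        const std::size_t rc = iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left);
        out.append(buf, sizeof buf - out_left);      // keep whatever was produced
        if (rc == static_cast<std::size_t>(-1) && errno != E2BIG) {
            iconv_close(cd);
            throw std::runtime_error("invalid or truncated input");
        }
        // E2BIG only means the 256-byte buffer was full; loop again.
    }
    iconv_close(cd);
    return out;
}
```

For instance, convert(text, "ISO-8859-1", "UTF-8") names both encodings and involves no natural-language identifier at all, which is the separation the codecvt2 idea is after.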
>
>
> Conclusion so far
>
> Specifying the above facets is the absolute minimum to get decent
> Unicode support. More advanced Unicode features like:
>
> 1. Querying character properties like general category
> 2. Language sensitive string case transformations (not character)
> 3. Normalization
>
> will need facets of their own.
>
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=D0=91'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> c</span><span style=3D"col=
or: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #000;=
" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> </span><span style=3D"color: #008;" class=3D"styled-by=
-prettify">false</span><span style=3D"color: #660;" class=3D"styled-by-pret=
tify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">toupper</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=D0=B1'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> c</span><span style=3D"col=
or: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #000;=
" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> L</span><span style=3D"color: #080;" class=3D"styled-b=
y-prettify">'=D0=B1'</span><span style=3D"color: #660;" class=3D"styled-by-=
prettify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify=
">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">tolower</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=D0=B1'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> c</span><span style=3D"col=
or: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #000;=
" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> L</span><span style=3D"color: #080;" class=3D"styled-b=
y-prettify">'=D0=B1'</span><span style=3D"color: #660;" class=3D"styled-by-=
prettify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify=
">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">toupper</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=D0=91'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> c</span><span style=3D"col=
or: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #000;=
" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> L</span><span style=3D"color: #080;" class=3D"styled-b=
y-prettify">'=D0=91'</span><span style=3D"color: #660;" class=3D"styled-by-=
prettify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify=
">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">tolower</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=D0=91'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> c</span><span style=3D"col=
or: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #000;=
" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> L</span><span style=3D"color: #080;" class=3D"styled-b=
y-prettify">'=D0=91'</span><span style=3D"color: #660;" class=3D"styled-by-=
prettify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify=
">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #800;" class=3D"st=
yled-by-prettify">// Generic Unicode locale, classifies and converts Latin =
and Cyrillic</span><span style=3D"color: #000;" class=3D"styled-by-prettify=
">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #800;" class=3D"st=
yled-by-prettify">// letters correctly.</span><span style=3D"color: #000;" =
class=3D"styled-by-prettify">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">auto</span><span style=3D"color: #000;" class=3D"styled-b=
y-prettify"> cu8 </span><span style=3D"color: #660;" class=3D"styled-by-pre=
ttify">=3D</span><span style=3D"color: #000;" class=3D"styled-by-prettify">=
=20locale</span><span style=3D"color: #660;" class=3D"styled-by-prettify">=
(</span><span style=3D"color: #080;" class=3D"styled-by-prettify">"C.UTF-8"=
</span><span style=3D"color: #660;" class=3D"styled-by-prettify">);</span>=
<span style=3D"color: #000;" class=3D"styled-by-prettify">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">isupper</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=C3=9F'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> cu8</span><span style=3D"c=
olor: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #00=
0;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> </span><span style=3D"color: #008;" class=3D"styled-by=
-prettify">false</span><span style=3D"color: #660;" class=3D"styled-by-pret=
tify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">islower</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=C3=9F'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> cu8</span><span style=3D"c=
olor: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #00=
0;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> </span><span style=3D"color: #008;" class=3D"styled-by=
-prettify">true</span><span style=3D"color: #660;" class=3D"styled-by-prett=
ify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">isupper</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=E1=BA=9E'</spa=
n><span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span s=
tyle=3D"color: #000;" class=3D"styled-by-prettify"> cu8</span><span style=
=3D"color: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color=
: #000;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" =
class=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" clas=
s=3D"styled-by-prettify"> </span><span style=3D"color: #008;" class=3D"styl=
ed-by-prettify">true</span><span style=3D"color: #660;" class=3D"styled-by-=
prettify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify=
">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">islower</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=E1=BA=9E'</spa=
n><span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span s=
tyle=3D"color: #000;" class=3D"styled-by-prettify"> cu8</span><span style=
=3D"color: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color=
: #000;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" =
class=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" clas=
s=3D"styled-by-prettify"> </span><span style=3D"color: #008;" class=3D"styl=
ed-by-prettify">false</span><span style=3D"color: #660;" class=3D"styled-by=
-prettify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettif=
y">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">toupper</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=C3=9F'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> cu8</span><span style=3D"c=
olor: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #00=
0;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> L</span><span style=3D"color: #080;" class=3D"styled-b=
y-prettify">'=C3=9F'</span><span style=3D"color: #660;" class=3D"styled-by-=
prettify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify=
"> </span><span style=3D"color: #800;" class=3D"styled-by-prettify">//why n=
ot =E1=BA=9E?</span><span style=3D"color: #000;" class=3D"styled-by-prettif=
y">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">tolower</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=C3=9F'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> cu8</span><span style=3D"c=
olor: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #00=
0;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> L</span><span style=3D"color: #080;" class=3D"styled-b=
y-prettify">'=C3=9F'</span><span style=3D"color: #660;" class=3D"styled-by-=
prettify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify=
">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">toupper</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=E1=BA=9E'</spa=
n><span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span s=
tyle=3D"color: #000;" class=3D"styled-by-prettify"> cu8</span><span style=
=3D"color: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color=
: #000;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" =
class=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" clas=
s=3D"styled-by-prettify"> L</span><span style=3D"color: #080;" class=3D"sty=
led-by-prettify">'=E1=BA=9E'</span><span style=3D"color: #660;" class=3D"st=
yled-by-prettify">);</span><span style=3D"color: #000;" class=3D"styled-by-=
prettify">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">tolower</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=E1=BA=9E'</spa=
n><span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span s=
tyle=3D"color: #000;" class=3D"styled-by-prettify"> cu8</span><span style=
=3D"color: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color=
: #000;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" =
class=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" clas=
s=3D"styled-by-prettify"> L</span><span style=3D"color: #080;" class=3D"sty=
led-by-prettify">'=C3=9F'</span><span style=3D"color: #660;" class=3D"style=
d-by-prettify">);</span><span style=3D"color: #000;" class=3D"styled-by-pre=
ttify">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">isupper</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=D0=B1'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> cu8</span><span style=3D"c=
olor: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #00=
0;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> </span><span style=3D"color: #008;" class=3D"styled-by=
-prettify">false</span><span style=3D"color: #660;" class=3D"styled-by-pret=
tify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">islower</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=D0=B1'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> cu8</span><span style=3D"c=
olor: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #00=
0;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> </span><span style=3D"color: #008;" class=3D"styled-by=
-prettify">true</span><span style=3D"color: #660;" class=3D"styled-by-prett=
ify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">isupper</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=D0=91'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> cu8</span><span style=3D"c=
olor: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #00=
0;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> </span><span style=3D"color: #008;" class=3D"styled-by=
-prettify">true</span><span style=3D"color: #660;" class=3D"styled-by-prett=
ify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">islower</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=D0=91'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> cu8</span><span style=3D"c=
olor: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #00=
0;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> </span><span style=3D"color: #008;" class=3D"styled-by=
-prettify">false</span><span style=3D"color: #660;" class=3D"styled-by-pret=
tify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">toupper</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=D0=B1'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> cu8</span><span style=3D"c=
olor: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #00=
0;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> L</span><span style=3D"color: #080;" class=3D"styled-b=
y-prettify">'=D0=91'</span><span style=3D"color: #660;" class=3D"styled-by-=
prettify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify=
">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">tolower</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=D0=B1'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> cu8</span><span style=3D"c=
olor: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #00=
0;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> L</span><span style=3D"color: #080;" class=3D"styled-b=
y-prettify">'=D0=B1'</span><span style=3D"color: #660;" class=3D"styled-by-=
prettify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify=
">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">toupper</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=D0=91'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> cu8</span><span style=3D"c=
olor: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #00=
0;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> L</span><span style=3D"color: #080;" class=3D"styled-b=
y-prettify">'=D0=91'</span><span style=3D"color: #660;" class=3D"styled-by-=
prettify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify=
">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">tolower</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=D0=91'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> cu8</span><span style=3D"c=
olor: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #00=
0;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> L</span><span style=3D"color: #080;" class=3D"styled-b=
y-prettify">'=D0=B1'</span><span style=3D"color: #660;" class=3D"styled-by-=
prettify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify=
">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #800;" class=3D"st=
yled-by-prettify">// Latin-1 locale. The wide facet classifies and converts=
=20case</span><span style=3D"color: #000;" class=3D"styled-by-prettify">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #800;" class=3D"st=
yled-by-prettify">// correctly even for Unicode characters that are not par=
t of Latin-1.</span><span style=3D"color: #000;" class=3D"styled-by-prettif=
y">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">auto</span><span style=3D"color: #000;" class=3D"styled-b=
y-prettify"> l1 </span><span style=3D"color: #660;" class=3D"styled-by-pret=
tify">=3D</span><span style=3D"color: #000;" class=3D"styled-by-prettify"> =
locale</span><span style=3D"color: #660;" class=3D"styled-by-prettify">(</s=
pan><span style=3D"color: #080;" class=3D"styled-by-prettify">"en_US.ISO-88=
59-1"</span><span style=3D"color: #660;" class=3D"styled-by-prettify">);</s=
pan><span style=3D"color: #000;" class=3D"styled-by-prettify">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">isupper</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=C3=9F'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> l1</span><span style=3D"co=
lor: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #000=
;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> </span><span style=3D"color: #008;" class=3D"styled-by=
-prettify">false</span><span style=3D"color: #660;" class=3D"styled-by-pret=
tify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">islower</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=C3=9F'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> l1</span><span style=3D"co=
lor: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #000=
;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> </span><span style=3D"color: #008;" class=3D"styled-by=
-prettify">true</span><span style=3D"color: #660;" class=3D"styled-by-prett=
ify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">isupper</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=E1=BA=9E'</spa=
n><span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span s=
tyle=3D"color: #000;" class=3D"styled-by-prettify"> l1</span><span style=3D=
"color: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #=
000;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" cla=
ss=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=
=3D"styled-by-prettify"> </span><span style=3D"color: #008;" class=3D"style=
d-by-prettify">true</span><span style=3D"color: #660;" class=3D"styled-by-p=
rettify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify"=
>
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">islower</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=E1=BA=9E'</spa=
n><span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span s=
tyle=3D"color: #000;" class=3D"styled-by-prettify"> l1</span><span style=3D=
"color: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #=
000;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" cla=
ss=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=
=3D"styled-by-prettify"> </span><span style=3D"color: #008;" class=3D"style=
d-by-prettify">false</span><span style=3D"color: #660;" class=3D"styled-by-=
prettify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify=
">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">toupper</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=C3=9F'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> l1</span><span style=3D"co=
lor: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #000=
;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> L</span><span style=3D"color: #080;" class=3D"styled-b=
y-prettify">'=C3=9F'</span><span style=3D"color: #660;" class=3D"styled-by-=
prettify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify=
"> </span><span style=3D"color: #800;" class=3D"styled-by-prettify">//why n=
ot =E1=BA=9E?</span><span style=3D"color: #000;" class=3D"styled-by-prettif=
y">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">tolower</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=C3=9F'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> l1</span><span style=3D"co=
lor: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #000=
;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> L</span><span style=3D"color: #080;" class=3D"styled-b=
y-prettify">'=C3=9F'</span><span style=3D"color: #660;" class=3D"styled-by-=
prettify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify=
">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">toupper</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=E1=BA=9E'</spa=
n><span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span s=
tyle=3D"color: #000;" class=3D"styled-by-prettify"> l1</span><span style=3D=
"color: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #=
000;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" cla=
ss=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=
=3D"styled-by-prettify"> L</span><span style=3D"color: #080;" class=3D"styl=
ed-by-prettify">'=E1=BA=9E'</span><span style=3D"color: #660;" class=3D"sty=
led-by-prettify">);</span><span style=3D"color: #000;" class=3D"styled-by-p=
rettify">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">tolower</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=E1=BA=9E'</spa=
n><span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span s=
tyle=3D"color: #000;" class=3D"styled-by-prettify"> l1</span><span style=3D=
"color: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #=
000;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" cla=
ss=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=
=3D"styled-by-prettify"> L</span><span style=3D"color: #080;" class=3D"styl=
ed-by-prettify">'=C3=9F'</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">);</span><span style=3D"color: #000;" class=3D"styled-by-pret=
tify">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">isupper</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=D0=B1'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> l1</span><span style=3D"co=
lor: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #000=
;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> </span><span style=3D"color: #008;" class=3D"styled-by=
-prettify">false</span><span style=3D"color: #660;" class=3D"styled-by-pret=
tify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">islower</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=D0=B1'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> l1</span><span style=3D"co=
lor: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #000=
;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> </span><span style=3D"color: #008;" class=3D"styled-by=
-prettify">true</span><span style=3D"color: #660;" class=3D"styled-by-prett=
ify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">isupper</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=D0=91'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> l1</span><span style=3D"co=
lor: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #000=
;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> </span><span style=3D"color: #008;" class=3D"styled-by=
-prettify">true</span><span style=3D"color: #660;" class=3D"styled-by-prett=
ify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">islower</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=D0=91'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> l1</span><span style=3D"co=
lor: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #000=
;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> </span><span style=3D"color: #008;" class=3D"styled-by=
-prettify">false</span><span style=3D"color: #660;" class=3D"styled-by-pret=
tify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">toupper</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=D0=B1'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> l1</span><span style=3D"co=
lor: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #000=
;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> L</span><span style=3D"color: #080;" class=3D"styled-b=
y-prettify">'=D0=91'</span><span style=3D"color: #660;" class=3D"styled-by-=
prettify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify=
">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">tolower</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=D0=B1'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> l1</span><span style=3D"co=
lor: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #000=
;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> L</span><span style=3D"color: #080;" class=3D"styled-b=
y-prettify">'=D0=B1'</span><span style=3D"color: #660;" class=3D"styled-by-=
prettify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify=
">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">toupper</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=D0=91'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> l1</span><span style=3D"co=
lor: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #000=
;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> L</span><span style=3D"color: #080;" class=3D"styled-b=
y-prettify">'=D0=91'</span><span style=3D"color: #660;" class=3D"styled-by-=
prettify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify=
">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">tolower</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=D0=91'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> l1</span><span style=3D"co=
lor: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #000=
;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> L</span><span style=3D"color: #080;" class=3D"styled-b=
y-prettify">'=D0=B1'</span><span style=3D"color: #660;" class=3D"styled-by-=
prettify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify=
">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #800;" class=3D"st=
yled-by-prettify">// English UTF-8 locale, works as expected</span><span st=
yle=3D"color: #000;" class=3D"styled-by-prettify">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #800;" class=3D"st=
yled-by-prettify">// same as the two above.</span><span style=3D"color: #00=
0;" class=3D"styled-by-prettify">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">auto</span><span style=3D"color: #000;" class=3D"styled-b=
y-prettify"> u8 </span><span style=3D"color: #660;" class=3D"styled-by-pret=
tify">=3D</span><span style=3D"color: #000;" class=3D"styled-by-prettify"> =
locale</span><span style=3D"color: #660;" class=3D"styled-by-prettify">(</s=
pan><span style=3D"color: #080;" class=3D"styled-by-prettify">"en_US.UTF-8"=
</span><span style=3D"color: #660;" class=3D"styled-by-prettify">);</span><=
span style=3D"color: #000;" class=3D"styled-by-prettify">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">isupper</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=C3=9F'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> u8</span><span style=3D"co=
lor: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #000=
;" class=3D"styled-by-prettify"> =C2=A0</span><span style=3D"color: #660;" =
class=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" clas=
s=3D"styled-by-prettify"> </span><span style=3D"color: #008;" class=3D"styl=
ed-by-prettify">false</span><span style=3D"color: #660;" class=3D"styled-by=
-prettify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettif=
y">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">islower</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=C3=9F'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> u8</span><span style=3D"co=
lor: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #000=
;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> </span><span style=3D"color: #008;" class=3D"styled-by=
-prettify">true</span><span style=3D"color: #660;" class=3D"styled-by-prett=
ify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">isupper</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=E1=BA=9E'</spa=
n><span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span s=
tyle=3D"color: #000;" class=3D"styled-by-prettify"> u8</span><span style=3D=
"color: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #=
000;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" cla=
ss=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=
=3D"styled-by-prettify"> </span><span style=3D"color: #008;" class=3D"style=
d-by-prettify">true</span><span style=3D"color: #660;" class=3D"styled-by-p=
rettify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify"=
>
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">islower</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=E1=BA=9E'</spa=
n><span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span s=
tyle=3D"color: #000;" class=3D"styled-by-prettify"> u8</span><span style=3D=
"color: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #=
000;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" cla=
ss=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=
=3D"styled-by-prettify"> </span><span style=3D"color: #008;" class=3D"style=
d-by-prettify">false</span><span style=3D"color: #660;" class=3D"styled-by-=
prettify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify=
">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">toupper</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=C3=9F'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> u8</span><span style=3D"co=
lor: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #000=
;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> L</span><span style=3D"color: #080;" class=3D"styled-b=
y-prettify">'=C3=9F'</span><span style=3D"color: #660;" class=3D"styled-by-=
prettify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify=
"> </span><span style=3D"color: #800;" class=3D"styled-by-prettify">// why =
not =E1=BA=9E?</span><span style=3D"color: #000;" class=3D"styled-by-pretti=
fy">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">tolower</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=C3=9F'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> u8</span><span style=3D"co=
lor: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #000=
;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> L</span><span style=3D"color: #080;" class=3D"styled-b=
y-prettify">'=C3=9F'</span><span style=3D"color: #660;" class=3D"styled-by-=
prettify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify=
">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">toupper</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=E1=BA=9E'</spa=
n><span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span s=
tyle=3D"color: #000;" class=3D"styled-by-prettify"> u8</span><span style=3D=
"color: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #=
000;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" cla=
ss=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=
=3D"styled-by-prettify"> L</span><span style=3D"color: #080;" class=3D"styl=
ed-by-prettify">'=E1=BA=9E'</span><span style=3D"color: #660;" class=3D"sty=
led-by-prettify">);</span><span style=3D"color: #000;" class=3D"styled-by-p=
rettify">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">tolower</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=E1=BA=9E'</spa=
n><span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span s=
tyle=3D"color: #000;" class=3D"styled-by-prettify"> u8</span><span style=3D=
"color: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #=
000;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" cla=
ss=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=
=3D"styled-by-prettify"> L</span><span style=3D"color: #080;" class=3D"styl=
ed-by-prettify">'=C3=9F'</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">);</span><span style=3D"color: #000;" class=3D"styled-by-pret=
tify">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">isupper</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=D0=B1'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> u8</span><span style=3D"co=
lor: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #000=
;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> </span><span style=3D"color: #008;" class=3D"styled-by=
-prettify">false</span><span style=3D"color: #660;" class=3D"styled-by-pret=
tify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">islower</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=D0=B1'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> u8</span><span style=3D"co=
lor: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #000=
;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> </span><span style=3D"color: #008;" class=3D"styled-by=
-prettify">true</span><span style=3D"color: #660;" class=3D"styled-by-prett=
ify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">isupper</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=D0=91'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> u8</span><span style=3D"co=
lor: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #000=
;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> </span><span style=3D"color: #008;" class=3D"styled-by=
-prettify">true</span><span style=3D"color: #660;" class=3D"styled-by-prett=
ify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">islower</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=D0=91'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> u8</span><span style=3D"co=
lor: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #000=
;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> </span><span style=3D"color: #008;" class=3D"styled-by=
-prettify">false</span><span style=3D"color: #660;" class=3D"styled-by-pret=
tify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">toupper</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=D0=B1'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> u8</span><span style=3D"co=
lor: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #000=
;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> L</span><span style=3D"color: #080;" class=3D"styled-b=
y-prettify">'=D0=91'</span><span style=3D"color: #660;" class=3D"styled-by-=
prettify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify=
">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">tolower</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=D0=B1'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> u8</span><span style=3D"co=
lor: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #000=
;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> L</span><span style=3D"color: #080;" class=3D"styled-b=
y-prettify">'=D0=B1'</span><span style=3D"color: #660;" class=3D"styled-by-=
prettify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify=
">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">toupper</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=D0=91'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> u8</span><span style=3D"co=
lor: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #000=
;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> L</span><span style=3D"color: #080;" class=3D"styled-b=
y-prettify">'=D0=91'</span><span style=3D"color: #660;" class=3D"styled-by-=
prettify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify=
">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">assert</span><span style=3D"color: #660;" class=3D"styled=
-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-prett=
ify">tolower</span><span style=3D"color: #660;" class=3D"styled-by-prettify=
">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">L</span=
><span style=3D"color: #080;" class=3D"styled-by-prettify">'=D0=91'</span><=
span style=3D"color: #660;" class=3D"styled-by-prettify">,</span><span styl=
e=3D"color: #000;" class=3D"styled-by-prettify"> u8</span><span style=3D"co=
lor: #660;" class=3D"styled-by-prettify">)</span><span style=3D"color: #000=
;" class=3D"styled-by-prettify"> </span><span style=3D"color: #660;" class=
=3D"styled-by-prettify">=3D=3D</span><span style=3D"color: #000;" class=3D"=
styled-by-prettify"> L</span><span style=3D"color: #080;" class=3D"styled-b=
y-prettify">'=D0=B1'</span><span style=3D"color: #660;" class=3D"styled-by-=
prettify">);</span><span style=3D"color: #000;" class=3D"styled-by-prettify=
">
=C2=A0 =C2=A0 =C2=A0 =C2=A0 </span><span style=3D"color: #008;" class=3D"st=
yled-by-prettify">return</span><span style=3D"color: #000;" class=3D"styled=
-by-prettify"> </span><span style=3D"color: #066;" class=3D"styled-by-prett=
ify">0</span><span style=3D"color: #660;" class=3D"styled-by-prettify">;</s=
pan><span style=3D"color: #000;" class=3D"styled-by-prettify">
</span><span style=3D"color: #660;" class=3D"styled-by-prettify">}</span></=
div></code></div>
</span></code></pre>
</div>
<p>We should completely avoid this gotcha and make <code>ctype<c=
har32_t></code>
work out of the box for the whole Unicode range. The locale
name should modify only the <code>widen()</code> and <code>narrow=
()</code>
functions.</p>
<p>We can additionally allow the natural-language part of the
locale to slightly modify the case conversion, e.g. the
Turkish <a
href=3D"https://en.wikipedia.org/wiki/Dotted_and_dotless_I"
moz-do-not-send=3D"true">dotted and dotless I</a>.</p>
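<p>As a rough illustration of such a language-dependent tweak, the
sketch below hard-codes the simple one-to-one case mappings for the
four I-like letters. The names <code>to_upper_i</code>,
<code>to_lower_i</code> and the <code>turkic</code> flag are
hypothetical and not part of any existing or proposed facet; a real
facet would presumably derive the flag from the language part of the
locale name.</p>

```cpp
#include <cassert>

// Hypothetical helpers, for illustration only.
// Only the I-like letters are handled; every other code point is
// returned unchanged to keep the sketch short.

// Simple upper-casing. In Turkic languages 'i' maps to U+0130
// (capital I with dot above); elsewhere it maps to plain 'I'.
char32_t to_upper_i(char32_t c, bool turkic) {
    if (c == U'i')              return turkic ? char32_t{0x0130} : U'I';
    if (c == char32_t{0x0131})  return U'I';   // dotless i -> I
    return c;
}

// Simple lower-casing. In Turkic languages 'I' maps to U+0131
// (dotless i); elsewhere it maps to plain 'i'.
char32_t to_lower_i(char32_t c, bool turkic) {
    if (c == U'I')              return turkic ? char32_t{0x0131} : U'i';
    if (c == char32_t{0x0130})  return U'i';   // dotted capital I -> i
    return c;
}

// Small self-check showing the intended behaviour.
void check_turkic_sketch() {
    assert(to_upper_i(U'i', false) == U'I');
    assert(to_upper_i(U'i', true)  == char32_t{0x0130});
    assert(to_lower_i(U'I', true)  == char32_t{0x0131});
}
```

<p>The point of the sketch is only that the language enters as one
extra input to an otherwise locale-independent Unicode mapping.</p>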
<p>Defining this facet will automatically enable decent Unicode
regexes, since <code>std::regex_traits</code> relies on the
<code>ctype</code> facet of its locale.</p>
<p><br>
</p>
<h3>ctype<char16_t></h3>
<p>This should behave exactly the same as the above, except that it
will accept only the first 65536 characters of Unicode, i.e.
characters from the basic multilingual plane (BMP).</p>
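<p>The BMP limit follows from how UTF-16 works: a code point above
U+FFFF needs two code units (a surrogate pair), so it cannot be
passed to a function that takes a single <code>char16_t</code>. The
sketch below shows the arithmetic. The name
<code>encode_utf16</code> is hypothetical, and the input is assumed
to be a valid Unicode scalar value.</p>

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, for illustration only.
// Writes the UTF-16 encoding of `cp` into `out` and returns the
// number of code units used (1 or 2).
// Assumes `cp` is a valid Unicode scalar value:
// at most 0x10FFFF and not in the range 0xD800..0xDFFF.
std::size_t encode_utf16(char32_t cp, char16_t out[2]) {
    if (cp < 0x10000) {
        // BMP code point: fits into one char16_t.
        out[0] = static_cast<char16_t>(cp);
        return 1;
    }
    // Supplementary code point: split the 20-bit offset into a
    // high surrogate and a low surrogate.
    cp -= 0x10000;
    out[0] = static_cast<char16_t>(0xD800 + (cp >> 10));
    out[1] = static_cast<char16_t>(0xDC00 + (cp & 0x3FF));
    return 2;
}

// Small self-check: U+1E9E is in the BMP, U+1F600 is not.
void check_utf16_sketch() {
    char16_t buf[2] = {};
    assert(encode_utf16(char32_t{0x1E9E}, buf) == 1);
    assert(encode_utf16(char32_t{0x1F600}, buf) == 2);
}
```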
<p><br>
</p>
<h3>codecvt</h3>
<p>Codecvt specializations for <code>char16_t</code> and
<code>char32_t</code> already exist. They convert strictly from/to
UTF-8. I am not that happy with this because it is not aligned with
the codecvt of <code>wchar_t</code>, which can convert from/to a
custom encoding when the implementation allows.</p>
<p>Maybe we should allow <code>codecvt_byname</code> to create
<code>codecvt</code> that can convert between custom encoding
and UTF-32/UTF-16.</p>
<p>Additionally, I am not even happy with the general design of
codecvt, mostly because it is created with the same names as
locales. Codecvt depend only on the encoding part of the
locale identifier pair, not on the natural language
identifier. Maybe we should create a new codecvt2 facet.</p>
<p><br>
</p>
<h3>codecvt2&lt;InternT, ExternT&gt; facet</h3>
<p>codecvt2 should be constructed only with an encoding name. It
will convert from/to that encoding. This is the way <code>iconv</code>
works: you first call <code>iconv_open</code> with two
encoding names. It is also the way ICU <a
href="http://icu-project.org/apiref/icu4c/ucnv_8h.html"
moz-do-not-send="true">ucnv.h</a> works: you open a <code>UConverter</code>
with <code>ucnv_open()</code> by giving the name of the
encoding.</p>
<ul>
<li><code>codecvt2&lt;char, char&gt;</code> may convert
between various narrow encodings,</li>
<li><code>codecvt2&lt;char32_t, char&gt;</code> will convert
between a custom narrow encoding and UTF-32,</li>
<li><code>codecvt2&lt;char32_t, char16_t&gt;</code> may
convert between UTF-32 and UTF-16.</li>
</ul>
<p>Besides the standard <code>in()</code> and <code>out()</code>,
additional functions should be provided, for example a simple
function that converts one string into another without keeping
state, where the input string is known to be complete and
isn't part of a larger text. Such functionality was added in
C++11 as <code>wstring_convert</code> and deprecated in C++17.</p>
<p>An additional function that converts only one output character
may be provided.</p>
<p><br>
</p>
<h3>Conclusion so far</h3>
<p>Specifying the above facets is the absolute minimum needed to get
decent Unicode support. More advanced Unicode features, such as:</p>
<ol>
<li>Querying character properties like general category</li>
<li>Language-sensitive string case transformations (not
character)</li>
<li>Normalization</li>
</ol>
<p>will need facets of their own.</p>
</div>
-- <br>
You received this message because you are subscribed to the Google
Groups "ISO C++ Standard - Future Proposals" group.<br>
To unsubscribe from this group and stop receiving emails from it,
send an email to <a
href=3D"mailto:std-proposals+unsubscribe@isocpp.org"
moz-do-not-send=3D"true">std-proposals+unsubscribe@isocpp.org</a>.<=
br>
To post to this group, send email to <a
href=3D"mailto:std-proposals@isocpp.org" moz-do-not-send=3D"true">s=
td-proposals@isocpp.org</a>.<br>
To view this discussion on the web visit <a
href=3D"https://groups.google.com/a/isocpp.org/d/msgid/std-proposals/453037=
92-68f2-4545-8ce4-4a3e1ec35b1b%40isocpp.org?utm_medium=3Demail&utm_sour=
ce=3Dfooter"
moz-do-not-send=3D"true">https://groups.google.com/a/isocpp.org/d/m=
sgid/std-proposals/45303792-68f2-4545-8ce4-4a3e1ec35b1b%40isocpp.org</a>.<b=
r>
</blockquote>
<p><br>
</p>
</body>
</html>
--------------7A6A8252F112C83DB5D465BE--
.
Author: Nicol Bolas <jmckesson@gmail.com>
Date: Wed, 28 Mar 2018 07:18:38 -0700 (PDT)
------=_Part_13219_840773978.1522246718442
Content-Type: multipart/alternative;
boundary="----=_Part_13220_938119649.1522246718443"
------=_Part_13220_938119649.1522246718443
Content-Type: text/plain; charset="UTF-8"
On Wednesday, March 28, 2018 at 8:45:40 AM UTC-4, Dimitrij Mijoski wrote:
>
> Unicode support by extending std::locale. Can we make it by 2020?
>
> The need for standard Unicode support has been requested many times and I
> wont get into it. It is very obvious that we need it.
>
> Goals:
>
> - No new string class
> - No new character type
>
We kinda need a new character type. Unless you really like having to use
`u8path` every time you want to use a UTF-8 string with
`std::filesystem::path`. Having a type that says, "I'm really a UTF-8
string" is important.
>
> - Reuse std::locale and facet interfaces
>
For the love of God, *why?!*

I'm being serious: why would we want to compound the mistakes of
`std::locale` by trying to improve it? It was a bad idea; let it die.
>
> - Follow best practices and see how Linux and POSIX handle locales.
> - Follow library ICU.
> - See boost::locale which extends std::locale.
> - Use bottom up approach while designing. First define low level stuff
> (facets), then their use (e.g. in iostreams).
>
> <snip>
>
>
> Conclusion so far
>
> Specifying the above facets are the absolute minimum to get a decent
> Unicode support.
>
How would this deal with Unicode case conversions that operate not on
letters but on *strings*? That is, a single lowercase codepoint converts into
two uppercase ones, or vice-versa? How would this handle Unicode titlecase?
And so forth.

This is not "decent Unicode support" even if you ignore having to deal with
`std::locale`'s garbage.
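
To make the titlecase point concrete, here is a minimal sketch. The names `CaseForms`, `dz_with_caron` and `to_title` are hypothetical and exist only for this illustration; they are not part of any standard or ICU API. It shows one letter whose titlecase form differs from both its lowercase and uppercase forms, which a `toupper`/`tolower` pair cannot express.

```cpp
// Hypothetical table entry: one letter with three distinct case forms.
struct CaseForms {
    char32_t lower;
    char32_t title;
    char32_t upper;
};

// The "dz with caron" digraph: U+01C6 (lower), U+01C5 (title), U+01C4 (upper).
constexpr CaseForms dz_with_caron{U'\u01C6', U'\u01C5', U'\u01C4'};

// Hypothetical helper: titlecase mapping for a single code point.
constexpr char32_t to_title(char32_t c) {
    if (c == dz_with_caron.lower || c == dz_with_caron.upper) {
        return dz_with_caron.title;
    }
    // ASCII fallback; titlecase and uppercase coincide for these letters.
    return (c >= U'a' && c <= U'z') ? static_cast<char32_t>(c - U'a' + U'A') : c;
}
```

A real implementation would take these mappings from the Unicode Character Database rather than from a hand-written entry.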
> More advanced Unicode features like:
>
> 1. Querying character properties like general category
> 2. Language sensitive string case transformations (not character)
> 3. Normalization
>
> Will need facets on their own.
>
------=_Part_13220_938119649.1522246718443--
------=_Part_13219_840773978.1522246718442--
.
Author: martinho.fernandes@native-instruments.de
Date: Wed, 28 Mar 2018 07:55:16 -0700 (PDT)
------=_Part_5348_309466553.1522248916354
Content-Type: multipart/alternative;
boundary="----=_Part_5349_37367184.1522248916354"
------=_Part_5349_37367184.1522248916354
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
I find that there are several important issues in this proposal that need
to be addressed.

On Wednesday, March 28, 2018 at 2:45:40 PM UTC+2, Dimitrij Mijoski wrote:
>
> Goals:
>
> - [...]
> - Reuse std::locale and facet interfaces
>

This goal contradicts the goal of Unicode support. Several of these
interfaces are simply not suitable for Unicode support (and IMO should be
deprecated). Some of the textbook counterexamples are right there in the
code sample provided. `charT toupper(charT, locale)` just cannot possibly
correctly uppercase `ß` into `SS` (which AFAIK is still the correct way to
uppercase this according to the CLDR locales). Even if the locale is changed
so that it uppercases to `ẞ` (following last year's decision of the Rat für
deutsche Rechtschreibung, which makes it an option, but not a requirement),
it's still impossible to uppercase some hundred other characters, e.g.
U+01F0 LATIN SMALL LETTER J WITH CARON. The fundamental assumption built
into the locale interface is that case mapping is a 1:1 mapping, but that
isn't true.
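
As a rough sketch of what a string-level interface has to look like, consider the following. The function `to_upper_full` is a hypothetical helper written for this illustration, not an existing standard or library API, and its table covers only the two characters mentioned above plus ASCII. The point is the signature: the result is a string that may be longer than the input, which no `charT -> charT` function can return.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper, not a standard or ICU API: uppercases a UTF-32 string
// using a tiny hand-written subset of Unicode's full case mappings.
std::u32string to_upper_full(std::u32string_view in) {
    std::u32string out;
    out.reserve(in.size());
    for (char32_t c : in) {
        switch (c) {
        case U'\u00DF':  // sharp s -> "SS" (one code point becomes two)
            out += U"SS";
            break;
        case U'\u01F0':  // j with caron -> J followed by U+030C COMBINING CARON
            out += U"J\u030C";
            break;
        default:
            // ASCII only; a real implementation would consult the Unicode tables.
            out += (c >= U'a' && c <= U'z')
                       ? static_cast<char32_t>(c - U'a' + U'A')
                       : c;
        }
    }
    return out;
}
```

With this sketch, `to_upper_full(U"stra\u00DFe")` yields `U"STRASSE"`: seven code points produced from six.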
>
> - See boost::locale which extends std::locale.
>
Note that the Boost.Locale documentation even acknowledges the problem I
described above: "You may notice that there are existing functions to_upper
and to_lower in the Boost.StringAlgo library. The difference is that these
function operate over an entire string instead of performing incorrect
character-by-character conversions."

> - Unicode - a standard that combines ~ 1 million characters into
> single set, then maps each character into unique integer and defines couple
> of encodings. Namely: UTF-32, UTF-16, and UTF-8. Then defines byte
> serialization of UTF-16 and UTF-32 as UTF-16-BE, UTF-16-LE, UTF-32-BE and
> UTF-32-LE.
>

Nitpick: Note that the Unicode Standard defines a lot more than characters
and encodings.

> 3. Future proposal
>
>
> ctype<char32_t>
>
> We should completely avoid this gotcha and make ctype<char32_t> work out
> of the box for the whole Unicode range. The locale name should modify only
> the widen() and narrow() functions.
>

As mentioned above, this is not enough because the interface itself is
unsuitable for this purpose.

> Defining this facet will automatically enable decent Unicode regexes.
>

This is really debatable. The only thing that `char32_t` gives is the
ability to match on code points instead of matching on code units (which is
a disaster with `char16_t` and `char`). However, this isn't enough for even
Level 1 conformance (Basic Unicode Support) as defined by UTS #18, Unicode
Regular Expressions, because the facilities in <regex> are currently
unsuited for this purpose.
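
A small sketch of the code-unit problem with `char`, using only `std::regex` as it exists today. The names `whole_match` and `e_acute` are made up for this example. The pattern `.` is meant to match one character, but against UTF-8 input it matches one byte, so a single non-ASCII code point needs two dots.

```cpp
#include <regex>
#include <string>

// True if the whole input matches the pattern when both are treated as
// sequences of char code units, which is how std::regex sees them.
bool whole_match(const std::string& input, const std::string& pattern) {
    return std::regex_match(input, std::regex(pattern));
}

// U+00E9 (e with acute) encoded as UTF-8: one code point, two code units.
const std::string e_acute = "\xC3\xA9";
```

Here `whole_match(e_acute, ".")` is false while `whole_match(e_acute, "..")` is true, which is the opposite of what a user thinking in characters would expect.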

> ctype<char16_t>
>
> This should behave exactly same as the above, except that it will accept
> only the first 65536 characters of Unicode, i.e. characters from the basic
> multilingual plane (BMP).
>

This is just designing for deprecation. This ctype would prove entirely
useless for UTF-16, for example. It's essentially UCS-2-only. Pretending
UCS-2 is relevant is the same kind of mistake that <codecvt> made. This
isn't "Unicode support"; it's "Unicode subset support". It's wishful
thinking that people don't use, e.g., the Supplementary Ideographic Plane,
mathematical symbols, or, heck, emoji. Let's not do that again.
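
To illustrate why a per-`char16_t` facet cannot see such characters, here is a minimal sketch. The helper names are hypothetical and chosen for this example only. An emoji outside the BMP occupies two UTF-16 code units (a surrogate pair), and neither unit is a character on its own, so a function that classifies one `char16_t` at a time never gets to look at the actual code point.

```cpp
#include <string>

constexpr bool is_high_surrogate(char16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
constexpr bool is_low_surrogate(char16_t u)  { return u >= 0xDC00 && u <= 0xDFFF; }

// Combines a surrogate pair into the code point it encodes.
constexpr char32_t combine_surrogates(char16_t high, char16_t low) {
    return 0x10000 + ((static_cast<char32_t>(high) - 0xD800) << 10)
                   + (static_cast<char32_t>(low) - 0xDC00);
}

// U+1F600 GRINNING FACE lies outside the BMP.
const std::u16string grinning_face = u"\U0001F600";
```

`grinning_face.size()` is 2, and only `combine_surrogates(grinning_face[0], grinning_face[1])` recovers U+1F600; any classification has to happen after that step, on the code point rather than on the code unit.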
------=_Part_5349_37367184.1522248916354--
------=_Part_5348_309466553.1522248916354--
.