Topic: Unicode character names in universal-character-names


Author: Eelis <eelis@eelis.net>
Date: Sun, 15 Sep 2013 14:25:03 +0200
Raw View
Wouldn't it be nice to be able to use Unicode character names in
universal-character-names?

     std::cout << "\u{PER MILLE SIGN}";

I think this better expresses the intent than "\u2030".

--

---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposals+unsubscribe@isocpp.org.
To post to this group, send email to std-proposals@isocpp.org.
Visit this group at http://groups.google.com/a/isocpp.org/group/std-proposals/.

.


Author: Ville Voutilainen <ville.voutilainen@gmail.com>
Date: Sun, 15 Sep 2013 15:43:01 +0300
Raw View

On 15 September 2013 15:25, Eelis <eelis@eelis.net> wrote:

> Wouldn't it be nice to be able to use Unicode character names in
> universal-character-names?
>
>     std::cout << "\u{PER MILLE SIGN}";
>
> I think this better expresses the intent than "\u2030".
>


Seems like a good idea. I played with some work-around ideas, and they...
don't work:

constexpr auto PER_MILLE_SIGN = "2030";
int main()
{
    // both clang and gcc complain that the universal-character-name is incomplete
    auto x = "\u" PER_MILLE_SIGN;
    auto y = "\u2030";
    cout << (string(x) == string(y));
}


constexpr auto PER_MILLE_SIGN = "\u2030";
int main()
{
    auto x = PER_MILLE_SIGN;  // sorta works, doesn't embed into strings
    auto y = "\u2030";
    cout << (string(x) == string(y));
}

Any kind of define would require string literal concatenation anyway, so
it's never going to be as nice
as "\u{name}".
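
For illustration, here is a minimal sketch of what the two work-arounds amount to once the UCN is kept whole. It assumes C++11 or later and uses char32_t strings so the result does not depend on the execution character set. The names rate, per_mille and joined are made up for this example.

```cpp
#include <string>

// A macro keeps the UCN intact inside its own string literal.
#define PER_MILLE_SIGN "\u2030"

// Adjacent string literals are concatenated by the compiler; the U prefix on
// a neighbouring literal applies to the whole concatenated string.
const std::u32string rate = U"5" PER_MILLE_SIGN U" of samples";

// A constexpr pointer holds the character, but it is an object, not a
// literal, so it cannot take part in literal concatenation.
constexpr const char32_t* per_mille = U"\u2030";
// const std::u32string bad = U"5" per_mille;  // ill-formed: not adjacent literals
const std::u32string joined = std::u32string(U"5") + per_mille;
```

Both routes produce the same characters; the difference is that the macro form is resolved entirely at compile time, while the constexpr pointer needs a run-time append.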


.


Author: Maurice Bos <m-ou.se@m-ou.se>
Date: Sun, 15 Sep 2013 14:50:18 +0200
Raw View

Should this only work inside string literals, or do you want to modify the
'universal-character-name' grammar to allow this in identifiers as well?
(just like \uXXXX)


2013/9/15 Ville Voutilainen <ville.voutilainen@gmail.com>

>
>
>
> On 15 September 2013 15:25, Eelis <eelis@eelis.net> wrote:
>
>> Wouldn't it be nice to be able to use Unicode character names in
>> universal-character-names?
>>
>>     std::cout << "\u{PER MILLE SIGN}";
>>
>> I think this better expresses the intent than "\u2030".
>>
>
>
> Seems like a good idea. I played with some work-around ideas, and they..
> ..don't work:
>
> constexpr auto PER_MILLE_SIGN = "2030";
> int main()
> {
>     // both clang and gcc complain that the universal-character-name is incomplete
>     auto x = "\u" PER_MILLE_SIGN;
>     auto y = "\u2030";
>     cout << (string(x) == string(y));
> }
>
>
> constexpr auto PER_MILLE_SIGN = "\u2030";
> int main()
> {
>     auto x = PER_MILLE_SIGN;  // sorta works, doesn't embed into strings
>     auto y = "\u2030";
>     cout << (string(x) == string(y));
> }
>
> Any kind of define would require string literal concatenation anyway, so
> it's never going to be as nice
> as "\u{name}".
>

.


Author: Eelis <eelis@eelis.net>
Date: Sun, 15 Sep 2013 14:51:17 +0200
Raw View
On 2013-09-15 14:43, Ville Voutilainen wrote:
> Seems like a good idea. I played with some work-around ideas, and they..
> ..don't work:
 >
> [...]
 >
> it's never going to be as nice
> as "\u{name}".

Yeah, I think so too. If there is interest, I'll implement this in Clang
and then write proposed wording.



.


Author: Eelis <eelis@eelis.net>
Date: Sun, 15 Sep 2013 14:52:40 +0200
Raw View
On 2013-09-15 14:50, Maurice Bos wrote:
> Should this only work inside string literals, or do you want to modify
> the 'universal-character-name' grammar to allow this in identifiers as
> well? (just like \uXXXX)

I think doing it by extending universal-character-name makes the most sense.


.


Author: Ville Voutilainen <ville.voutilainen@gmail.com>
Date: Sun, 15 Sep 2013 16:08:01 +0300
Raw View

On 15 September 2013 15:52, Eelis <eelis@eelis.net> wrote:

> On 2013-09-15 14:50, Maurice Bos wrote:
>
>> Should this only work inside string literals, or do you want to modify
>> the 'universal-character-name' grammar to allow this in identifiers as
>> well? (just like \uXXXX)
>>
>
> I think doing it by extending universal-character-name makes the most
> sense.
>
>
>
You know... I guess doing this stuff with a user-defined literal would be
one option.
It would look like "\u{PER MILLE SIGN}"_symbolic_unicode ;)
The suffix is certainly up for bike-shedding, but it would avoid having to
do it in the compiler.


.


Author: Ville Voutilainen <ville.voutilainen@gmail.com>
Date: Sun, 15 Sep 2013 16:11:00 +0300
Raw View

On 15 September 2013 16:08, Ville Voutilainen
<ville.voutilainen@gmail.com> wrote:

>
>
>
> On 15 September 2013 15:52, Eelis <eelis@eelis.net> wrote:
>
>> On 2013-09-15 14:50, Maurice Bos wrote:
>>
>>> Should this only work inside string literals, or do you want to modify
>>> the 'universal-character-name' grammar to allow this in identifiers as
>>> well? (just like \uXXXX)
>>>
>>
>> I think doing it by extending universal-character-name makes the most
>> sense.
>>
>>
>>
> You know.. I guess doing this stuff with a user-defined literal would be
> one option.
> It would look like "\u{PER MILLE SIGN}"_symbolic_unicode ;)
>

...except it probably can't use \u inside the string, so it would need a
different magic cookie to work, lest the universal-character-name cause an
error before the UDL is invoked.
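
As a rough sketch of the library-only route, assuming plain braces as the cookie instead of \u: the suffix _named, the helper lookup and the two-entry table kNames are all hypothetical, and a real table would have to be generated from the Unicode Character Database. The sketch works at run time, assumes the literal itself is plain ASCII, and returns a char32_t string to sidestep encoding questions.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>

// Hypothetical name table with just two entries for illustration.
struct NamedChar {
    std::string_view name;
    char32_t code;
};
constexpr NamedChar kNames[] = {
    {"PER MILLE SIGN", 0x2030},
    {"FULLWIDTH DIGIT ONE", 0xFF11},
};

// Linear search; throws if the name is not in the table.
inline char32_t lookup(std::string_view name) {
    for (const NamedChar& entry : kNames)
        if (entry.name == name) return entry.code;
    throw std::invalid_argument("unknown character name: " + std::string(name));
}

// Hypothetical suffix: replaces each {NAME} with the named code point and
// copies every other (assumed ASCII) character through unchanged.
inline std::u32string operator""_named(const char* s, std::size_t n) {
    std::string_view in(s, n);
    std::u32string out;
    std::size_t i = 0;
    while (i < in.size()) {
        if (in[i] == '{') {
            const std::size_t close = in.find('}', i + 1);
            if (close == std::string_view::npos)
                throw std::invalid_argument("unterminated '{' in literal");
            out.push_back(lookup(in.substr(i + 1, close - i - 1)));
            i = close + 1;
        } else {
            out.push_back(static_cast<unsigned char>(in[i]));
            ++i;
        }
    }
    return out;
}
```

With this, "5{PER MILLE SIGN}"_named yields the two code points U+0035 and U+2030. Unlike a core-language escape, a misspelled name is only caught when the literal is evaluated, and the syntax cannot be used in identifiers.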


.


Author: David Krauss <potswa@gmail.com>
Date: Sun, 15 Sep 2013 06:54:42 -0700 (PDT)
Raw View


On Sunday, September 15, 2013 8:25:03 PM UTC+8, Eelis wrote:
>
> Wouldn't it be nice to be able to use Unicode character names in
> universal-character-names?
>
>      std::cout << "\u{PER MILLE SIGN}";
>
> I think this better expresses the intent than "\u2030".
>

I think it's better to simply use a comment. You can put comments inside
strings by availing of the string catenation facility.

std::cout << "\uFF11" /* FULLWIDTH DIGIT ONE */ "\u2030" /* PER MILLE SIGN */;

Every Unicode draft will expand the dictionary of names. Even if compilers
keep up by methodically adopting these, users won't reliably upgrade. So
there's a portability issue.

Many character names are oddly spelled. I initially put "FULL WIDTH" in the
above comment but had to fix it. Other oddities like "LAMDA" abound. This
is a critical usability issue.

Also, there's an issue in what the stringize operator does to UCNs;
according to the letter of the law it would require

#define S(X) #X

S("\u{DIGIT ZERO}") // => "\"\\u{DIGIT ZERO}\""

This is a minor functionality issue, but I mention it because I want that
defect fixed... Exact spelling of UCNs is beyond the intent.


.


Author: David Krauss <potswa@gmail.com>
Date: Sun, 15 Sep 2013 07:12:22 -0700 (PDT)
Raw View



On Sunday, September 15, 2013 8:43:01 PM UTC+8, Ville Voutilainen wrote:
>
>
>
>
> On 15 September 2013 15:25, Eelis <ee...@eelis.net> wrote:
>
>> Wouldn't it be nice to be able to use Unicode character names in
>> universal-character-names?
>>
>>     std::cout << "\u{PER MILLE SIGN}";
>>
>> I think this better expresses the intent than "\u2030".
>>
>
>
> Seems like a good idea. I played with some work-around ideas, and they..
> ..don't work:
>

UCNs are translated in phase 1, before anything else, so they're pretty
much atomic.

Your best bet is to put it inside a string:

#define PER_MILLE_SIGN "\u2030"

// Prepend char32_t prefix, get first element of string literal.
#define CODEPOINT_(x) * U ## x
// Tame catenation operator: expand x before pasting.
#define CODEPOINT(x) CODEPOINT_(x)

const wchar_t *s1 = L"" PER_MILLE_SIGN;
const char *s2 = "123 " PER_MILLE_SIGN;
// Narrowing safe; CODEPOINT is a constant expression.
char16_t c{ CODEPOINT( PER_MILLE_SIGN ) };
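
To check the constant-expression claim, here is a small self-contained sketch, assuming C++11 or later. The macros are repeated from above with added parentheses; the variable name per_mille is made up for this example.

```cpp
#define PER_MILLE_SIGN "\u2030"

// Paste the char32_t prefix onto the literal, then take its first element.
#define CODEPOINT_(x) (*U##x)
// Extra macro level so that x is expanded to the literal before pasting.
#define CODEPOINT(x) CODEPOINT_(x)

// Reading the first element of a string literal is a constant expression,
// so the value can be checked at compile time.
static_assert(CODEPOINT(PER_MILLE_SIGN) == 0x2030,
              "CODEPOINT should yield U+2030");

// Brace initialization would reject a narrowing conversion; this compiles
// because the constant value fits in char16_t.
constexpr char16_t per_mille{CODEPOINT(PER_MILLE_SIGN)};
```

If the named character lay outside the Basic Multilingual Plane, the char16_t initialization would fail to compile, which is the safety property the narrowing comment refers to.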


> Any kind of define would require string literal concatenation anyway, so
> it's never going to be as nice
> as "\u{name}".
>

Is there reasoning here, or just an aesthetic bias? String literal
catenation may feel hackish, but it's nothing compared to UCNs which depend
*somewhat* on context (inside a literal vs in an identifier), and still
have a few ill-specified rough edges.

UCNs are their own text encoding. This proposal is about making a textual
text encoding. Verging on XML territory here.

>

--

---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposals+unsubscribe@isocpp.org.
To post to this group, send email to std-proposals@isocpp.org.
Visit this group at http://groups.google.com/a/isocpp.org/group/std-proposals/.

------=_Part_2660_18242798.1379254342386
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

<div dir=3D"ltr"><br><br>On Sunday, September 15, 2013 8:43:01 PM UTC+8, Vi=
lle Voutilainen wrote:<blockquote class=3D"gmail_quote" style=3D"margin: 0;=
margin-left: 0.8ex;border-left: 1px #ccc solid;padding-left: 1ex;"><div dir=
=3D"ltr"><br><div><br><br><div class=3D"gmail_quote">On 15 September 2013 1=
5:25, Eelis <span dir=3D"ltr">&lt;<a href=3D"javascript:" target=3D"_blank"=
 gdf-obfuscated-mailto=3D"HH3EOxqS_iAJ">ee...@eelis.net</a>&gt;</span> wrot=
e:<br><blockquote class=3D"gmail_quote" style=3D"margin:0px 0px 0px 0.8ex;b=
order-left:1px solid rgb(204,204,204);padding-left:1ex">
Wouldn't it be nice to be able to use Unicode character names in universal-=
character-names?<br>
<br>
&nbsp; &nbsp; std::cout &lt;&lt; "\u{PER MILLE SIGN}";<br>
<br>
I think this better expresses the intent than "\u2030".<span><font color=3D=
"#888888"><br></font></span></blockquote><div><br><br></div><div>Seems like=
 a good idea. I played with some work-around ideas, and they.. ..don't work=
:<br></div></div></div></div></blockquote><div dir=3D"ltr"><br>UCNs are tra=
nslated in phase 1, before anything else, so they're pretty much atomic.<br=
><br>Your best bet is to put it inside a string:<br><div class=3D"prettypri=
nt" style=3D"background-color: rgb(250, 250, 250); border-color: rgb(187, 1=
87, 187); border-style: solid; border-width: 1px; word-wrap: break-word;"><=
code class=3D"prettyprint"><div class=3D"subprettyprint"><span style=3D"col=
or: #000;" class=3D"styled-by-prettify"><br></span><span style=3D"color: #8=
00;" class=3D"styled-by-prettify">#define</span><span style=3D"color: #000;=
" class=3D"styled-by-prettify"> PER_MILLE_SIGN </span><span style=3D"color:=
 #080;" class=3D"styled-by-prettify">"\u2030"</span><span style=3D"color: #=
000;" class=3D"styled-by-prettify"><br><br></span><span style=3D"color: #80=
0;" class=3D"styled-by-prettify">#define</span><span style=3D"color: #000;"=
 class=3D"styled-by-prettify"> CODEPOINT_</span><span style=3D"color: #660;=
" class=3D"styled-by-prettify">(</span><span style=3D"color: #000;" class=
=3D"styled-by-prettify">x</span><span style=3D"color: #660;" class=3D"style=
d-by-prettify">)</span><span style=3D"color: #000;" class=3D"styled-by-pret=
tify"> </span><span style=3D"color: #660;" class=3D"styled-by-prettify">*</=
span><span style=3D"color: #000;" class=3D"styled-by-prettify"> U </span><s=
pan style=3D"color: #800;" class=3D"styled-by-prettify">## x // Prepend cha=
r32_t prefix, get first element of string literal.</span><span style=3D"col=
or: #000;" class=3D"styled-by-prettify"><br></span><span style=3D"color: #8=
00;" class=3D"styled-by-prettify">#define</span><span style=3D"color: #000;=
" class=3D"styled-by-prettify"> CODEPOINT</span><span style=3D"color: #660;=
" class=3D"styled-by-prettify">(</span><span style=3D"color: #000;" class=
=3D"styled-by-prettify">x</span><span style=3D"color: #660;" class=3D"style=
d-by-prettify">)</span><span style=3D"color: #000;" class=3D"styled-by-pret=
tify"> CODEPOINT_</span><span style=3D"color: #660;" class=3D"styled-by-pre=
ttify">(</span><span style=3D"color: #000;" class=3D"styled-by-prettify">x<=
/span><span style=3D"color: #660;" class=3D"styled-by-prettify">)</span><sp=
an style=3D"color: #000;" class=3D"styled-by-prettify"> </span><span style=
=3D"color: #800;" class=3D"styled-by-prettify">// Tame catenation operator.=
</span><span style=3D"color: #000;" class=3D"styled-by-prettify"><br><br></=
span><span style=3D"color: #008;" class=3D"styled-by-prettify">wchar_t</spa=
n><span style=3D"color: #000;" class=3D"styled-by-prettify"> </span><span s=
tyle=3D"color: #660;" class=3D"styled-by-prettify">*</span><span style=3D"c=
olor: #000;" class=3D"styled-by-prettify">s1 </span><span style=3D"color: #=
660;" class=3D"styled-by-prettify">=3D</span><span style=3D"color: #000;" c=
lass=3D"styled-by-prettify"> L</span><span style=3D"color: #080;" class=3D"=
styled-by-prettify">""</span><span style=3D"color: #000;" class=3D"styled-b=
y-prettify"> PER_MILLE_SIGN</span><span style=3D"color: #660;" class=3D"sty=
led-by-prettify">;</span><span style=3D"color: #000;" class=3D"styled-by-pr=
ettify"><br></span><span style=3D"color: #008;" class=3D"styled-by-prettify=
">char</span><span style=3D"color: #000;" class=3D"styled-by-prettify"> </s=
pan><span style=3D"color: #660;" class=3D"styled-by-prettify">*</span><span=
 style=3D"color: #000;" class=3D"styled-by-prettify">s2 </span><span style=
=3D"color: #660;" class=3D"styled-by-prettify">=3D</span><span style=3D"col=
or: #000;" class=3D"styled-by-prettify"> </span><span style=3D"color: #080;=
" class=3D"styled-by-prettify">"123 "</span><span style=3D"color: #000;" cl=
ass=3D"styled-by-prettify"> PER_MILLE_SIGN</span><span style=3D"color: #660=
;" class=3D"styled-by-prettify">;</span><span style=3D"color: #000;" class=
=3D"styled-by-prettify"><br>char16_t c</span><span style=3D"color: #660;" c=
lass=3D"styled-by-prettify">{</span><span style=3D"color: #000;" class=3D"s=
tyled-by-prettify"> CODEPOINT</span><span style=3D"color: #660;" class=3D"s=
tyled-by-prettify">(</span><span style=3D"color: #000;" class=3D"styled-by-=
prettify"> PER_MILLE_SIGN </span><span style=3D"color: #660;" class=3D"styl=
ed-by-prettify">)</span><span style=3D"color: #000;" class=3D"styled-by-pre=
ttify"> </span><span style=3D"color: #660;" class=3D"styled-by-prettify">};=
</span><span style=3D"color: #000;" class=3D"styled-by-prettify"> </span><s=
pan style=3D"color: #800;" class=3D"styled-by-prettify">// Narrowing safe; =
CODEPOINT is constant expression.</span><span style=3D"color: #000;" class=
=3D"styled-by-prettify"><br><br></span></div></code></div><br><blockquote s=
tyle=3D"margin: 0px 0px 0px 0.8ex; border-left: 1px solid rgb(204, 204, 204=
); padding-left: 1ex;" class=3D"gmail_quote"><div>Any kind of define would =
require string literal concatenation anyway, so it's never going to be as n=
ice<br>as "\u{name}".<br></div></blockquote><div><br>Is there reasoning her=
e, or just an aesthetic bias? String literal catenation may feel hackish, b=
ut it's nothing compared to UCNs which depend <i>somewhat</i> on context (i=
nside a literal vs in an identifier), and still have a few ill-specified ro=
ugh edges.<br><br>UCNs are their own text encoding. This proposal is about =
making a textual text encoding. Verging on XML territory here.<br></div></d=
iv><blockquote class=3D"gmail_quote" style=3D"margin: 0;margin-left: 0.8ex;=
border-left: 1px #ccc solid;padding-left: 1ex;">
</blockquote></div>

------=_Part_2660_18242798.1379254342386--

.


Author: Eelis <eelis@eelis.net>
Date: Sun, 15 Sep 2013 16:18:51 +0200
Raw View
On 2013-09-15 15:54, David Krauss wrote:
> Every Unicode draft will expand the dictionary of names. Even if
> compilers keep up by methodically adopting these, users won't reliably
> upgrade. So there's a portability issue.

An implementation could emit a diagnostic if the user attempts to use a
character name that is newer than the C++ standard used.

A typical implementation would allow this diagnostic to be overridden,
so that users could then make a conscious choice to reduce the
portability of their program by adding a Unicode requirement newer than
the C++ standard.


.


Author: Philipp Stephani <p.stephani2@gmail.com>
Date: Sun, 15 Sep 2013 17:34:08 +0200
Raw View
2013/9/15 Eelis <eelis@eelis.net>

> Wouldn't it be nice to be able to use Unicode character names in
> universal-character-names?
>
>     std::cout << "\u{PER MILLE SIGN}";
>
> I think this better expresses the intent than "\u2030".
>
>
I think it's a very good idea. Many other languages have it, and it's a
simple and localized change.

.


Author: stackmachine@hotmail.com
Date: Mon, 16 Sep 2013 23:46:30 -0700 (PDT)
Raw View
On Sunday, 15 September 2013 17:34:08 UTC+2, Philipp Stephani wrote:
>
> I think it's a very good idea. Many other languages have it, and it's a
> simple and localized change.
>
Name a few, please. I am not aware of a single one, and a quick Google
search did not turn up any results.

.


Author: Thiago Macieira <thiago@macieira.org>
Date: Tue, 17 Sep 2013 09:24:23 -0500
Raw View
On Sunday, 15 September 2013 17:34:08, Philipp Stephani wrote:
> I think it's a very good idea. Many other languages have it, and it's a
> simple and localized change.

It only requires the compiler to have a full list of character names from the
Unicode database. It will also require the C++ standard to mandate a minimum
version of Unicode, update it once in a while, provide a macro to indicate
which version of Unicode is known, etc.
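
To make the "full list of character names" requirement concrete, here is a minimal sketch of the lookup a compiler would need. The type `NamedCodePoint`, the table `kNameTable`, and the function `lookup_character_name` are hypothetical names invented for illustration, and the four entries stand in for the full Unicode name list, which a real implementation would presumably generate from the Unicode Character Database for whichever version it supports. The sketch assumes C++17.

```cpp
#include <algorithm>
#include <array>
#include <optional>
#include <string_view>

// Hypothetical miniature name table (illustration only). A real compiler
// would need every name from the Unicode version it claims to support.
struct NamedCodePoint {
    std::string_view name;
    char32_t code_point;
};

// Entries must stay sorted by name for the binary search below.
inline constexpr std::array<NamedCodePoint, 4> kNameTable{{
    {"EURO SIGN", U'\u20AC'},
    {"GREEK SMALL LETTER LAMDA", U'\u03BB'},
    {"PER MILLE SIGN", U'\u2030'},
    {"ZERO WIDTH SPACE", U'\u200B'},
}};

// Returns the code point for an exact character name, or nullopt if the
// name is not in the table (where a compiler would emit a diagnostic).
inline std::optional<char32_t> lookup_character_name(std::string_view name) {
    const auto it = std::lower_bound(
        kNameTable.begin(), kNameTable.end(), name,
        [](const NamedCodePoint& entry, std::string_view key) {
            return entry.name < key;
        });
    if (it != kNameTable.end() && it->name == name) return it->code_point;
    return std::nullopt;
}
```

The lookup itself is cheap; the cost raised above is in shipping and maintaining the table, and in specifying which Unicode version it must cover.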

--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
   Software Architect - Intel Open Source Technology Center
      PGP/GPG: 0x6EF45358; fingerprint:
      E067 918B B660 DBD1 105C  966C 33F5 F005 6EF4 5358

.


Author: Zhihao Yuan <zy@miator.net>
Date: Tue, 17 Sep 2013 10:56:15 -0400
Raw View
On Tue, Sep 17, 2013 at 10:24 AM, Thiago Macieira <thiago@macieira.org> wrote:
> It only requires the compiler to have a full list of character names from the
> Unicode database. It will also require the C++ standard to mandate a minimum
> version of Unicode [...]

Then it's much more than "only" :)

I don't like the idea because people can seldom remember
Unicode names, while programs are written for humans to
read.  Even if you can remember those names, some input
methods, like Fcitx, support entering Unicode characters
by name, so it's not an issue.


--
Zhihao Yuan, ID lichray
The best way to predict the future is to invent it.
___________________________________________________
4BSD -- http://4bsd.biz/

.


Author: Martinho Fernandes <martinho.fernandes@gmail.com>
Date: Tue, 17 Sep 2013 17:19:37 +0200
Raw View
On Sun, Sep 15, 2013 at 3:54 PM, David Krauss <potswa@gmail.com> wrote:
> Many character names are oddly spelled. I initially put "FULL WIDTH" in the
> above comment but had to fix it. Other oddities like "LAMDA" abound. This is
> a critical usability issue.
>

It doesn't stop at oddities. There are several character names that
are just wrong, like ʟᴀᴛɪɴ sᴍᴀʟʟ ʟᴇᴛᴛᴇʀ ᴠ ᴡɪᴛʜ ʜᴏᴏᴋ (U+028B), and some
like ᴘʀᴇsᴇɴᴛᴀᴛɪᴏɴ ғᴏʀᴍ ғᴏʀ ᴠᴇʀᴛɪᴄᴀʟ ʀɪɢʜᴛ ᴡʜɪᴛᴇ ʟᴇɴᴛɪᴄᴜʟᴀʀ ʙʀᴀᴋᴄᴇᴛ
(U+FE18) actually have misspellings. The Unicode Stability Policy
prevents these mistakes from being fixed, as it states that the Name
property won't ever change once assigned.

I like this feature in principle, but since this is about readability
and it can actually work against readability, it's not as enticing.

.


Author: Eelis <eelis@eelis.net>
Date: Tue, 17 Sep 2013 19:36:39 +0200
Raw View
On 2013-09-17 16:24, Thiago Macieira wrote:
> On Sunday, 15 September 2013 17:34:08, Philipp Stephani wrote:
>> I think it's a very good idea. Many other languages have it, and it's a
>> simple and localized change.
>
> It only requires the compiler to have a full list of character names from the
> Unicode database. It will also require the C++ standard to mandate a minimum
> version of Unicode, update it once in a while, provide a macro to indicate
> which version of Unicode is known, etc.

Such a macro already exists: __STDC_ISO_10646__.


.


Author: Thiago Macieira <thiago@macieira.org>
Date: Tue, 17 Sep 2013 12:40:50 -0500
Raw View
On Tuesday, 17 September 2013 19:36:39, Eelis wrote:
> On 2013-09-17 16:24, Thiago Macieira wrote:
> > On Sunday, 15 September 2013 17:34:08, Philipp Stephani wrote:
> >> I think it's a very good idea. Many other languages have it, and it's a
> >> simple and localized change.
> >
> > It only requires the compiler to have a full list of character names from
> > the Unicode database. It will also require the C++ standard to mandate a
> > minimum version of Unicode, update it once in a while, provide a macro to
> > indicate which version of Unicode is known, etc.
>
> Such a macro already exists: __STDC_ISO_10646__.

Obviously the macro does not mean that character names are permitted because
they aren't right now. We can change the meaning of the macro and have it
contain a value that indicates the version of Unicode that is supported by the
compiler.

--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
   Software Architect - Intel Open Source Technology Center
      PGP/GPG: 0x6EF45358; fingerprint:
      E067 918B B660 DBD1 105C  966C 33F5 F005 6EF4 5358

.


Author: Ville Voutilainen <ville.voutilainen@gmail.com>
Date: Tue, 17 Sep 2013 20:47:01 +0300
Raw View
On 17 September 2013 20:40, Thiago Macieira <thiago@macieira.org> wrote:

> On Tuesday, 17 September 2013 19:36:39, Eelis wrote:
> > > It only requires the compiler to have a full list of character names
> > > from the Unicode database. It will also require the C++ standard to
> > > mandate a minimum version of Unicode, update it once in a while,
> > > provide a macro to indicate which version of Unicode is known, etc.
> >
> > Such a macro already exists: __STDC_ISO_10646__.
>
> Obviously the macro does not mean that character names are permitted
> because they aren't right now. We can change the meaning of the macro and
> have it contain a value that indicates the version of Unicode that is
> supported by the compiler.
>

It already contains the information about which C standard is the one
matching the version of C++ indicated by that macro (by virtue of the
aforementioned C++ standard referring to a certain C standard), so if a
given C++ version refers to a certain Unicode version, the macro reveals
that Unicode version, too, indirectly.

.


Author: Martinho Fernandes <martinho.fernandes@gmail.com>
Date: Tue, 17 Sep 2013 19:49:11 +0200
Raw View
On Tue, Sep 17, 2013 at 7:40 PM, Thiago Macieira <thiago@macieira.org> wrote:
>> Such a macro already exists: __STDC_ISO_10646__.
>
> Obviously the macro does not mean that character names are permitted because
> they aren't right now. We can change the meaning of the macro and have it
> contain a value that indicates the version of Unicode that is supported by the
> compiler.

It already indicates enough for this feature. It states a year and
month and makes the Unicode required set consist of "all the
characters that are defined by ISO/IEC 10646, along with all
amendments and technical corrigenda as of the specified year and month."
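
For reference, a small sketch of how a program might inspect that macro. The function name `iso10646_support` is made up for this example. The macro is conditionally defined, so the sketch handles its absence, and it assumes the value has the yyyymmL form described above.

```cpp
#include <string>

// Describes the implementation's __STDC_ISO_10646__ value, if it has one.
// iso10646_support is a hypothetical helper name, not a standard facility.
inline std::string iso10646_support() {
#ifdef __STDC_ISO_10646__
    // Assumed form: yyyymmL, for example 201103L.
    const long stamp = __STDC_ISO_10646__;
    const long year = stamp / 100;
    const long month = stamp % 100;
    return "ISO/IEC 10646 as of " + std::to_string(year) + "-" +
           (month < 10 ? "0" : "") + std::to_string(month);
#else
    return "__STDC_ISO_10646__ is not defined";
#endif
}
```

Note that this only reports a date for the character repertoire; it says nothing by itself about whether character names are accepted in literals.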

.


Author: Eelis <eelis@eelis.net>
Date: Tue, 17 Sep 2013 19:48:05 +0200
Raw View
On 2013-09-17 17:19, Martinho Fernandes wrote:
> I like this feature in principle, but since this is about readability
> and it can actually work against readability, it's not as enticing.

Would you not agree that /most/ features in C++, including the ones we
love, can be abused? :)

.


Author: Philipp Stephani <p.stephani2@gmail.com>
Date: Tue, 17 Sep 2013 20:50:25 +0200
Raw View
2013/9/17 <stackmachine@hotmail.com>

> On Sunday, 15 September 2013 17:34:08 UTC+2, Philipp Stephani wrote:
>
>> I think it's a very good idea. Many other languages have it, and it's a
>> simple and localized change.
>
> Name a few please. I am not aware of a single one and a quick search on
> google did not give any results.

Perl (http://perldoc.perl.org/charnames.html) and Python
(http://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals)

.


Author: Eelis <eelis@eelis.net>
Date: Tue, 17 Sep 2013 22:15:05 +0200
Raw View
On 2013-09-17 20:50, Philipp Stephani wrote:
> Perl (http://perldoc.perl.org/charnames.html), Python
> (http://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals)

Ah, cool. I did not know about these. Thanks!

.


Author: Richard Smith <richard@metafoo.co.uk>
Date: Tue, 17 Sep 2013 15:03:58 -0700
Raw View
On Tue, Sep 17, 2013 at 11:50 AM, Philipp Stephani <p.stephani2@gmail.com> wrote:

> 2013/9/17 <stackmachine@hotmail.com>
>
>> On Sunday, 15 September 2013 17:34:08 UTC+2, Philipp Stephani wrote:
>>
>>> I think it's a very good idea. Many other languages have it, and it's a
>>> simple and localized change.
>>
>> Name a few please. I am not aware of a single one and a quick search on
>> google did not give any results.
>
> Perl (http://perldoc.perl.org/charnames.html), Python
> (http://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals)

This thread seems to be missing justification. This proposal imposes a
significant cost on compiler vendors (and indeed on programmers, who now
need to learn another obscure lexical rule) and I've not seen anyone
present any compelling use cases.

So, it would be useful if someone could provide:
 (a) some examples of actual code using this feature to good effect in Perl
     or Python
 (b) a demonstration that this should be a core language feature (as
     opposed to, say, a UDL, much as Ville proposed:
     R"(\u{PER MILLE SIGN})"_symbolic_unicode)

I would not expect this proposal to stand much chance in EWG without more
analysis in this direction.

I note also that Perl's approach allows for character naming schemes other
than the official Unicode character names, which might suggest to some that
this proposal is insufficient as-is, and that a UDL might be a better
approach.
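
For readers unfamiliar with the UDL alternative, a rough sketch of what such a library solution might look like follows. It is an illustration under stated assumptions, not a proposed design: the helpers `code_point_for` and `append_utf8` are hypothetical, the two-entry name table stands in for a real one, the expansion happens at run time rather than at compile time, and the result is assumed to be wanted as UTF-8. It requires C++17.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>

// Hypothetical two-entry name table; a real library would carry far more.
inline char32_t code_point_for(std::string_view name) {
    if (name == "PER MILLE SIGN") return 0x2030;
    if (name == "EURO SIGN") return 0x20AC;
    throw std::invalid_argument("unknown character name: " + std::string(name));
}

// Appends the UTF-8 encoding of cp (assumed to be a valid scalar value).
inline void append_utf8(std::string& out, char32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Replaces each \u{NAME} in the literal with the UTF-8 bytes of the named
// character; all other characters are copied through unchanged.
inline std::string operator""_symbolic_unicode(const char* text, std::size_t size) {
    std::string_view in(text, size);
    std::string out;
    while (!in.empty()) {
        if (in.substr(0, 3) == "\\u{") {
            const std::size_t close = in.find('}');
            if (close == std::string_view::npos)
                throw std::invalid_argument("unterminated \\u{");
            append_utf8(out, code_point_for(in.substr(3, close - 3)));
            in.remove_prefix(close + 1);
        } else {
            out += in.front();
            in.remove_prefix(1);
        }
    }
    return out;
}
```

With this in scope, `R"(5 \u{PER MILLE SIGN})"_symbolic_unicode` yields a `std::string` holding "5 " followed by the three UTF-8 bytes of U+2030. The raw string literal matters: it keeps the compiler from trying to interpret `\u{` itself. An unknown name is only detected when the literal is evaluated, which is one trade-off against a core language feature.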

Finally, compilers are increasingly allowing UTF-8 source files. Given such
a compiler, when would this proposal be preferable to direct use of the
relevant characters?

.


Author: Eelis <eelis@eelis.net>
Date: Wed, 18 Sep 2013 00:27:28 +0200
Raw View
On 2013-09-18 00:03, Richard Smith wrote:
> Finally, compilers are increasingly allowing UTF-8 source files. Given
> such a compiler, when would this proposal be preferable to direct use of
> the relevant characters?

When the characters are nonprintable, or when project coding standards
require ASCII source files, for example.

.


Author: Klaim - Joël Lamotte <mjklaim@gmail.com>
Date: Wed, 18 Sep 2013 00:29:03 +0200
Raw View
--047d7b343cd217a08504e69bd9d3
Content-Type: text/plain; charset=ISO-8859-1

On Wed, Sep 18, 2013 at 12:03 AM, Richard Smith <richard@metafoo.co.uk>wrote:

> (a) some examples of actual code using this feature to good effect in Perl
> or Python


By the way, since Python mostly evolves through standard proposals, it's
easy to look up the related paper for the rationale.
I'm not sure if it helps here but: http://www.python.org/dev/peps/pep-0263/


--047d7b343cd217a08504e69bd9d3--

.


Author: Zhihao Yuan <zy@miator.net>
Date: Tue, 17 Sep 2013 19:41:13 -0400
Raw View
On Tue, Sep 17, 2013 at 6:29 PM, Klaim - Joël Lamotte <mjklaim@gmail.com> wrote:
> On Wed, Sep 18, 2013 at 12:03 AM, Richard Smith <richard@metafoo.co.uk>
> wrote:
>>
>> (a) some examples of actual code using this feature to good effect in Perl
>> or Python
>
> By the way, as Python is mostly built over standard propositions, it's easy
> to look for the related paper for rational.
> I'm not sure if it helps here but: http://www.python.org/dev/peps/pep-0263/

Richard is asking whether Unicode names are useful; your
link is about Unicode source file support...

The C++ standard knows about Unicode, and an implementation can
pick any encoding to support it.  AFAIK, clang supports
Unicode in string literals as well as Unicode identifiers with UTF-8,
but it seems that it does not support other encodings (correct me
if I'm wrong); GCC up to 4.8 supports Unicode in string literals
with any encoding (-finput-charset; I love GB18030), but does
not support Unicode identifiers.

Considering that C++ is portable to systems which do
not even recognize ASCII, I think leaving the encoding
implementation-defined is the right approach.

In short, I am not worried about the readability of C++ source
code without Unicode character name support.

--
Zhihao Yuan, ID lichray
The best way to predict the future is to invent it.
___________________________________________________
4BSD -- http://4bsd.biz/


.


Author: David Krauss <potswa@gmail.com>
Date: Tue, 17 Sep 2013 20:16:01 -0700 (PDT)
Raw View
------=_Part_4654_15492555.1379474161128
Content-Type: text/plain; charset=ISO-8859-1



On Wednesday, September 18, 2013 2:50:25 AM UTC+8, Philipp Stephani wrote:
>
> 2013/9/17 <stackm...@hotmail.com <javascript:>>
>
>>
>>
>> On Sunday, 15 September 2013 17:34:08 UTC+2, Philipp Stephani wrote:
>>
>>> I think it's a very good idea. Many other languages have it, and it's a
>>> simple and localized change.
>>>
>> Name a few please. I am not aware of a single one and a quick search on
>> google did not give any results.
>>
>> Perl (http://perldoc.perl.org/charnames.html), Python (
> http://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals
> )
>

The Perl feature goes much further and allows user-defined aliases,
character sequences, and fuzzy matching. It appears to be a plugin, not
part of their core language.

The Python 3.3 feature is closer to what is proposed here. That
documentation links to a Unicode
spec <http://www.unicode.org/Public/6.1.0/ucd/NameAliases.txt> with
abbreviations and corrections, including "PRESENTATION FORM FOR VERTICAL
RIGHT WHITE LENTICULAR BRACKET" following Martinho's mention, but there's
still no LAMBDA. (The hooked V thing looks to me more like an
orthographical issue.)

I tried Googling for a while but couldn't find any other mention of this
feature besides a bug report <http://bugs.python.org/issue12753>, and a
StackOverflow answer that merely copy-pasted the documentation.
At best, the feature is obscure.

Is there anything my technique above doesn't do? It even gets codepoints as
compile-time constants, just like character literals. That's something even
Perl doesn't do.

Once again:

#define PER_MILLE_SIGN "\u2030"

// Prepend the char32_t prefix, then take the first element of the string literal.
#define CODEPOINT_(x) * U ## x
// Tame the catenation operator: the extra level expands the argument before pasting.
#define CODEPOINT(x) CODEPOINT_(x)

const wchar_t *s1 = L"" PER_MILLE_SIGN;
const char *s2 = "123 " PER_MILLE_SIGN;
// Narrowing is safe because CODEPOINT is a constant expression.
char16_t c{ CODEPOINT( PER_MILLE_SIGN ) };

(Live demo <http://ideone.com/D4g0wl>.)
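
For anyone who wants to check the claims without the live demo, here is a
self-checking sketch of the same macro technique. The names kPerMille and
kText are hypothetical, and the parentheses around the macro body are an
addition for safety, not part of the code above. It assumes a C++11
compiler that accepts pasting U onto a string literal token, as the demo
does.

```cpp
// Named macro expands to an ordinary string literal, so it concatenates
// with adjacent literals of any encoding prefix.
#define PER_MILLE_SIGN "\u2030"

// Paste the char32_t prefix onto the literal and dereference its first element.
#define CODEPOINT_(x) (*U ## x)
// Extra level so PER_MILLE_SIGN is expanded before the paste happens.
#define CODEPOINT(x) CODEPOINT_(x)

// Hypothetical names used only for this check.
constexpr char32_t kPerMille = CODEPOINT(PER_MILLE_SIGN);
constexpr char16_t kText[] = u"5 " PER_MILLE_SIGN;

static_assert(kPerMille == 0x2030, "code point is a compile-time constant");
static_assert(kText[2] == 0x2030, "macro concatenates into a char16_t literal");
static_assert(sizeof(kText) / sizeof(kText[0]) == 4,
              "'5', ' ', U+2030 and the terminating null");
```

The remaining limitation is the one already noted in the thread: the name
cannot appear inside a literal, so every use needs string literal
concatenation.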


------=_Part_4654_15492555.1379474161128--

.