Topic: Converting some text to a number


Author: Marshall Clow <mclow.lists@gmail.com>
Date: Fri, 23 Nov 2012 14:17:41 -0800
[ Originally posted to std-discussion, where, after some, well, discussion, it was suggested that I post this here - even though it's not a proposal ]


Let's suppose that I have some text, and I think it's numeric, and I want to convert it to a number.
(For simplicity, let's say that I want a whole number - not a floating point one).

[ I am NOT assuming ASCII for the text, and I am NOT assuming any particular integral type ]

What's the best facility for doing that in the standard library?

There are:
 int stoi ( const std::string& str, size_t *pos = 0, int base = 10 );
 long stol ( const std::string& str, size_t *pos = 0, int base = 10 );
 long long stoll( const std::string& str, size_t *pos = 0, int base = 10 ); (since C++11)

but they only work with std::string, and only with int/long/long long.
I don't believe there are any versions for unsigned or wstring, etc.
On the other hand, at least they report errors.
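
For reference, a minimal sketch of how the `std::sto*` functions report errors: they throw `std::invalid_argument` when no conversion can be performed and `std::out_of_range` when the value does not fit the result type. The wrapper name `try_stoi` is made up for this illustration and is not part of the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: wraps std::stoi and turns its exceptions into a bool.
// On success, `out` holds the value and `consumed` the number of characters parsed.
bool try_stoi(const std::string& text, int& out, std::size_t& consumed, int base = 10) {
    try {
        out = std::stoi(text, &consumed, base);
        return true;
    } catch (const std::invalid_argument&) {
        return false;  // no digits could be parsed at all
    } catch (const std::out_of_range&) {
        return false;  // digits were found, but the value does not fit in int
    }
}
```

Trailing text after the number is not an error; the caller has to compare `consumed` with the string length to detect it.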

We can drop back to C:

 int       atoi ( const char *str );
 long      atol ( const char *str );
 long long atoll( const char *str );
and
 long strtol ( const char *str, char **str_end, int base );
 long long strtoll ( const char *str, char **str_end, int base );
 unsigned long strtoul ( const char *str, char **str_end, int base );
 unsigned long long strtoull( const char *str, char **str_end, int base );

but they only work with NULL terminated char pointers, and long/long long/unsigned long/unsigned long long.
No other character types.
Also, since they are C APIs, there's no error handling/reporting.
Overflow is defined by "undefined behavior" (gee, thanks), or maybe returning XXX_MAX - or maybe setting errno.
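
To make the distinction concrete: for the `strto*` family the C standard does pin the behaviour down (the result is clamped to `LONG_MIN`/`LONG_MAX` and `errno` is set to `ERANGE`), while for the `ato*` functions an unrepresentable result really is undefined. A rough sketch of a checked call follows; `ParseStatus` and `checked_strtol` are names invented for this example.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>

// Possible outcomes of a checked strtol call (hypothetical enum).
enum class ParseStatus { ok, no_digits, out_of_range };

// Converts a NUL-terminated string with std::strtol and reports what happened.
// `end`, if non-null, receives the position just past the parsed text.
ParseStatus checked_strtol(const char* text, long& out,
                           const char** end = nullptr, int base = 10) {
    char* stop = nullptr;
    errno = 0;  // strtol sets errno on overflow but never clears it
    const long value = std::strtol(text, &stop, base);
    if (end) *end = stop;
    if (stop == text) return ParseStatus::no_digits;        // nothing was consumed
    if (errno == ERANGE) return ParseStatus::out_of_range;  // value was clamped
    out = value;
    return ParseStatus::ok;
}
```

The error reporting exists, but it takes three separate checks to use it, which is arguably the real complaint.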

There's always sscanf:
 sscanf ( str, "%d", &int );

'Nuff said.


And, of course, there's the boost::lexical_cast way:
 String >> StringStream;
 StringStream >> Int;

that has the advantage that it can be extended to work with user defined types, as well as various string classes.
But as people never tire of pointing out… it's really slow.
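
A minimal sketch of that stream-based approach, to show what "String >> StringStream >> Int" amounts to. This is not Boost's implementation; `stream_cast` is a made-up name. As far as I know, the standard only guarantees the required stream facets for `char` and `wchar_t`, so a sketch like this is not expected to work for `std::u16string`.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: converts text to Target by round-tripping through a stream.
// Works for any Target that has an operator>> for the matching stream type.
template <typename Target, typename CharT>
Target stream_cast(const std::basic_string<CharT>& text) {
    std::basic_istringstream<CharT> stream(text);
    Target value;
    // The extraction must succeed ...
    if (!(stream >> value)) {
        throw std::invalid_argument("stream_cast: no value could be read");
    }
    // ... and only whitespace may follow the value.
    stream >> std::ws;
    if (!stream.eof()) {
        throw std::invalid_argument("stream_cast: trailing characters");
    }
    return value;
}
```

The extensibility comes for free (any type with a stream extraction operator works), and so does the cost of constructing a stream for every conversion.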

So - how do I use the standard library to convert my std::u16string into an extended precision integer?

Seems to me that I want something like:

 template <typename Integer, typename Iterator>
 Integer to_integer ( Iterator first, Iterator last, Iterator *end = nullptr, unsigned base = 10 );

which I could call like this:
 auto foo = to_integer<myInteger_t> ( u16Str.begin(), u16Str.end () );
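
A rough sketch of what such a function could look like, under some loud assumptions: only ASCII digits and letters are recognised, there is no overflow detection, and the helper `ascii_digit_value` is invented for this example. It is meant to show the shape of the interface, not a finished design.

```cpp
#include <cassert>
#include <string>

// Digit value of one code unit, or -1 if it is not a digit.
// Assumption: only ASCII '0'-'9' and letters are recognised, and the letters
// are contiguous (true for ASCII and Unicode, not for EBCDIC).
template <typename CharT>
int ascii_digit_value(CharT c) {
    if (c >= CharT('0') && c <= CharT('9')) return static_cast<int>(c - CharT('0'));
    if (c >= CharT('a') && c <= CharT('z')) return static_cast<int>(c - CharT('a')) + 10;
    if (c >= CharT('A') && c <= CharT('Z')) return static_cast<int>(c - CharT('A')) + 10;
    return -1;
}

// Sketch of the interface. No overflow detection is attempted; Integer only
// needs construction from int/unsigned plus *, + and unary -.
template <typename Integer, typename Iterator>
Integer to_integer(Iterator first, Iterator last, Iterator* end = nullptr, unsigned base = 10) {
    assert(base >= 2 && base <= 36);
    Iterator start = first;
    bool negative = false;
    if (first != last && (*first == '-' || *first == '+')) {
        negative = (*first == '-');
        ++first;
    }
    Integer result = 0;
    bool any_digit = false;
    for (; first != last; ++first) {
        const int digit = ascii_digit_value(*first);
        if (digit < 0 || static_cast<unsigned>(digit) >= base) break;
        result = result * static_cast<Integer>(base) + static_cast<Integer>(digit);
        any_digit = true;
    }
    // Report where parsing stopped; if no digit was seen, nothing was consumed.
    if (end) *end = any_digit ? first : start;
    return negative ? -result : result;
}
```

With this, `to_integer<long>(u16Str.begin(), u16Str.end())` works for a `std::u16string`, and a caller who cares about trailing text can pass the address of an iterator as the third argument.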

What do people think?

-- Marshall

.


Author: Daniel Krügler <daniel.kruegler@gmail.com>
Date: Fri, 23 Nov 2012 23:35:21 +0100

2012/11/23 Marshall Clow <mclow.lists@gmail.com>

> [ Originally posted to std-discussion, where, after some, well,
> discussion, it was suggested that I post this here - even though it's not a
> proposal ]


That is fine, because it seems what you want to discuss here is a potential
proposal.


> Let's suppose that I have some text, and I think it's numeric, and I want
> to convert it to a number.
> (For simplicity, let's say that I want a whole number - not a floating
> point one).
>
> [ I am NOT assuming ASCII for the text, and I am NOT assuming any
> particular integral type ]
>
> What's the best facility for doing that in the standard library?
>
> There are:
>         int stoi ( const std::string& str, size_t *pos = 0, int base = 10 );
>         long stol ( const std::string& str, size_t *pos = 0, int base = 10 );
>         long long stoll( const std::string& str, size_t *pos = 0, int base = 10 ); (since C++11)
>
> but they only work with std::string, and only with int/long/long long.
> I don't believe there are any versions for unsigned


This is incorrect. We also have

unsigned long stoul(const string& str, size_t *idx = 0, int base = 10);
unsigned long long stoull(const string& str, size_t *idx = 0, int base = 10);


> or wstring, etc.
>

This is incorrect; there are also corresponding overloads for std::wstring.
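
A short sketch of the unsigned and wide-string overloads in use. The wrapper names `parse_hex` and `parse_decimal` are made up for this example; the `std::stoul` and `std::stoull` calls are the standard ones.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Parses a hexadecimal value using the std::wstring overload of std::stoul.
// `consumed` receives the number of characters that formed the number.
unsigned long parse_hex(const std::wstring& text, std::size_t& consumed) {
    return std::stoul(text, &consumed, 16);
}

// Parses a decimal value from a narrow string into unsigned long long.
unsigned long long parse_decimal(const std::string& text) {
    return std::stoull(text);
}
```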


> On the other hand, at least they report errors.
>

Yep.


> We can drop back to C:
>
>         int       atoi ( const char *str );
>         long      atol ( const char *str );
>         long long atoll( const char *str );
> and
>         long strtol ( const char *str, char **str_end, int base );
>         long long strtoll ( const char *str, char **str_end, int base );
>         unsigned long strtoul ( const char *str, char **str_end, int base );
>         unsigned long long strtoull( const char *str, char **str_end, int base );
>
> but they only work with NULL terminated char pointers, and long/long
> long/unsigned long/unsigned long long.
> No other character types.
> Also, since they are C APIs, there's no error handling/reporting.
> Overflow is defined by "undefined behavior" (gee, thanks), or maybe
> returning XXX_MAX - or maybe setting errno.
>
> There's always sscanf:
>         sscanf ( str, "%d", &int );
>
> 'Nuff said.
>
>
> And, of course, there's the boost::lexical_cast way:
>         String >> StringStream;
>         StringStream >> Int;
>
> that has the advantage that it can be extended to work with user defined
> types, as well as various string classes.
> But as people never tire of pointing out… it's really slow.
>

This must be a very old lexical_cast implementation, because there were a
lot of improvements in the past. There is no evidence (e.g. based on the
template signature) that this function couldn't be as efficient as the
current numeric conversion functions that you mentioned above.


> So - how do I use the standard library to convert my std::u16string into
> an extended precision integer?
>
> Seems to me that I want something like:
>
>         template <typename Integer, typename Iterator>
>         Integer to_integer ( Iterator first, Iterator last, Iterator *end = nullptr, unsigned base = 10 );
>
> which I could call like this:
>         auto foo = to_integer<myInteger_t> ( u16Str.begin(), u16Str.end () );
>
> What do people think?
>

Personally I would like to have something like lexical_cast, but I agree
that its signature does not allow a base to be provided. This makes your
proposal interesting, but I would like to see how the conversion would be
specified for character types other than char and wchar_t.

- Daniel


.


Author: Marshall Clow <mclow.lists@gmail.com>
Date: Fri, 23 Nov 2012 14:44:17 -0800

On Nov 23, 2012, at 2:35 PM, Daniel Krügler <daniel.kruegler@gmail.com> wrote:

> 2012/11/23 Marshall Clow <mclow.lists@gmail.com>
> [ Originally posted to std-discussion, where, after some, well, discussion, it was suggested that I post this here - even though it's not a proposal ]
>
> That is fine, because it seems what you want to discuss here is a potential proposal.
>
> Let's suppose that I have some text, and I think it's numeric, and I want to convert it to a number.
> (For simplicity, let's say that I want a whole number - not a floating point one).
>
> [ I am NOT assuming ASCII for the text, and I am NOT assuming any particular integral type ]
>
> What's the best facility for doing that in the standard library?
>
> There are:
>         int stoi ( const std::string& str, size_t *pos = 0, int base = 10 );
>         long stol ( const std::string& str, size_t *pos = 0, int base = 10 );
>         long long stoll( const std::string& str, size_t *pos = 0, int base = 10 ); (since C++11)
>
> but they only work with std::string, and only with int/long/long long.
> I don't believe there are any versions for unsigned
>
> This is incorrect. We also have
>
> unsigned long stoul(const string& str, size_t *idx = 0, int base = 10);
> unsigned long long stoull(const string& str, size_t *idx = 0, int base = 10);

Thanks for pointing this out.

>
> or wstring, etc.
>
> This is incorrect; there are also corresponding overloads for std::wstring.

And thanks for this, too - but there aren't any for u16string or u32string - or did I miss them, too?

>
> On the other hand, at least they report errors.
>
> Yep.
>
> We can drop back to C:
>
>         int       atoi ( const char *str );
>         long      atol ( const char *str );
>         long long atoll( const char *str );
> and
>         long strtol ( const char *str, char **str_end, int base );
>         long long strtoll ( const char *str, char **str_end, int base );
>         unsigned long strtoul ( const char *str, char **str_end, int base );
>         unsigned long long strtoull( const char *str, char **str_end, int base );
>
> but they only work with NULL terminated char pointers, and long/long long/unsigned long/unsigned long long.
> No other character types.
> Also, since they are C APIs, there's no error handling/reporting.
> Overflow is defined by "undefined behavior" (gee, thanks), or maybe returning XXX_MAX - or maybe setting errno.
>
> There's always sscanf:
>         sscanf ( str, "%d", &int );
>
> 'Nuff said.
>
>
> And, of course, there's the boost::lexical_cast way:
>         String >> StringStream;
>         StringStream >> Int;
>
> that has the advantage that it can be extended to work with user defined types, as well as various string classes.
> But as people never tire of pointing out… it's really slow.
>
> This must be a very old lexical_cast implementation, because there were a lot of improvements in the past. There is no evidence (e.g. based on the template signature) that this function couldn't be as efficient as the current numeric conversion functions that you mentioned above.

The beauty of lexical_cast (or most general template based ideas) is that you (the library implementer, or the app programmer, etc) can specialize an implementation for a particular type, and frequently get significant improvements over a general mechanism.

If people think that this is worth pursuing, I would be sure to run some timing tests based on lexical_cast, etc.


> So - how do I use the standard library to convert my std::u16string into an extended precision integer?
>
> Seems to me that I want something like:
>
>         template <typename Integer, typename Iterator>
>         Integer to_integer ( Iterator first, Iterator last, Iterator *end = nullptr, unsigned base = 10 );
>
> which I could call like this:
>         auto foo = to_integer<myInteger_t> ( u16Str.begin(), u16Str.end () );
>
> What do people think?
>
> Personally I would like to have something like lexical_cast, but I agree that its signature does not allow a base to be provided. This makes your proposal interesting, but I would like to see how the conversion would be specified for character types other than char and wchar_t.

That's one interesting point, certainly.  If you're using Unicode, then '0' .. '9' are in the same place as ASCII.
However, there are encodings (EBCDIC, cough) where '0'..'9' are not in the same place as ASCII/Unicode.
There are also "alternative" digit representations in Unicode (such as U+0669, aka ARABIC-INDIC DIGIT NINE).
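
To illustrate the "alternative digits" point, here is a small sketch that maps a code point to its decimal value for a few Unicode digit blocks. The function name `decimal_digit_value` is invented, and the table is deliberately incomplete: Unicode has many more decimal-digit blocks than the three listed.

```cpp
#include <cassert>

// Returns the decimal value of a code point, or -1 if it is not a recognised digit.
// Assumption: only three of Unicode's decimal-digit blocks are listed here.
int decimal_digit_value(char32_t cp) {
    // Code point of digit zero in each supported block; within a block the
    // digits zero through nine are contiguous.
    static const char32_t zeros[] = {
        0x0030,  // DIGIT ZERO (ASCII)
        0x0660,  // ARABIC-INDIC DIGIT ZERO
        0x0966,  // DEVANAGARI DIGIT ZERO
    };
    for (char32_t zero : zeros) {
        if (cp >= zero && cp <= zero + 9) return static_cast<int>(cp - zero);
    }
    return -1;
}
```

Whether a conversion function should accept such digits by default, or mix digits from different blocks in one number, is exactly the kind of question a specification would have to answer.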

-- Marshall

Marshall Clow     Idio Software   <mailto:mclow.lists@gmail.com>

A.D. 1517: Martin Luther nails his 95 Theses to the church door and is promptly moderated down to (-1, Flamebait).
        -- Yu Suzuki


.


Author: DeadMG <wolfeinstein@gmail.com>
Date: Fri, 23 Nov 2012 15:00:49 -0800 (PST)

String handling functions should really be left for a Unicode proposal. In
addition, I find your suggestion to be fairly woefully underspecified.

However, if you insist, then I would suggest something rather different. I
mean, output values? Eww. In addition, the customizability and error
handling of your suggestion are limited: for example, when parsing a decimal
integer, what thousands separator is to be used?

template<typename F> undefined fail(F f);
template<typename F> undefined last(F f);

template<typename num, typename iterator, typename T1 = implementation_defined, typename T2 = implementation_defined>
num parse(iterator first, iterator last, T1 t1 = T1(), T2 t2 = T2());
template<typename num, typename iterator, typename T1 = implementation_defined, typename T2 = implementation_defined>
num parse(iterator first, iterator last, std::locale l, T1 t1 = T1(), T2 t2 = T2());

Iterator shall be at least an input iterator over Unicode codepoints.
If one of the last two arguments is the result of a call to fail(f)
for some function object f, then when the function fails, it shall call f
with no arguments. If f returns num, then this value is returned. Otherwise,
an exception shall be thrown after f is called.
If one of the last two arguments is the result of a call to last(f) for
some function object f, then when the function succeeds, it shall call f
with the iterator which is one beyond the last codepoint forming a part of
the integer parsed. This must be before, or equal to, last.
If the function succeeds, it returns the parsed value.
The last two arguments shall not be both from fail() or both from last().
If so, the program is ill-formed, and a diagnostic is required.
If no argument is provided that is a result of fail(), then the default
argument shall be a function object that returns nothing and performs no
action.
If no argument is provided that is a result of last(), then the default
argument shall be a function object that takes the iterator and performs no
action.
The overload which does not take a locale shall simply pass the global
locale to the overload which does.
The parse function should respect digits which are part of Unicode and
locale data such as thousands separators.
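
A simplified sketch of how the fail()/last() handlers above could be wired up, under several assumptions: no locale overload, only ASCII decimal digits, no overflow checks, and a fail() handler must return something assignable to num (the "f returns nothing, then throw" case is left out). The names `fail_handler`, `last_handler`, `no_handler`, `handle_fail` and `handle_last` are invented here; the post leaves those types unspecified.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <type_traits>
#include <utility>

// Tag wrappers standing in for the unspecified ("undefined") return types.
template <typename F> struct fail_handler { F f; };
template <typename F> struct last_handler { F f; };
template <typename F> fail_handler<F> fail(F f) { return fail_handler<F>{std::move(f)}; }
template <typename F> last_handler<F> last(F f) { return last_handler<F>{std::move(f)}; }

// Placeholder for "implementation_defined": a handler that does nothing.
struct no_handler {};

template <typename T> struct is_fail : std::false_type {};
template <typename F> struct is_fail<fail_handler<F>> : std::true_type {};
template <typename T> struct is_last : std::false_type {};
template <typename F> struct is_last<last_handler<F>> : std::true_type {};

// Failure dispatch: only a fail() handler can supply a replacement value.
template <typename H, typename Num>
bool handle_fail(H&, Num&) { return false; }
template <typename F, typename Num>
bool handle_fail(fail_handler<F>& h, Num& out) { out = h.f(); return true; }

// Success dispatch: only a last() handler is told where parsing stopped.
template <typename H, typename Iterator>
void handle_last(H&, Iterator) {}
template <typename F, typename Iterator>
void handle_last(last_handler<F>& h, Iterator it) { h.f(it); }

// The second iterator is called `end` here so that it does not hide last().
template <typename Num, typename Iterator, typename H1 = no_handler, typename H2 = no_handler>
Num parse(Iterator first, Iterator end, H1 h1 = H1(), H2 h2 = H2()) {
    static_assert(!(is_fail<H1>::value && is_fail<H2>::value), "only one fail() handler allowed");
    static_assert(!(is_last<H1>::value && is_last<H2>::value), "only one last() handler allowed");

    Num value = 0;
    Iterator it = first;
    for (; it != end && *it >= U'0' && *it <= U'9'; ++it)
        value = value * 10 + static_cast<Num>(*it - U'0');

    if (it == first) {  // no digits at all counts as failure
        Num fallback = 0;
        if (handle_fail(h1, fallback) || handle_fail(h2, fallback)) return fallback;
        throw std::invalid_argument("parse: no digits found");
    }
    handle_last(h1, it);
    handle_last(h2, it);
    return value;
}
```

A call then reads `parse<int>(s.begin(), s.end(), fail([] { return -1; }))`, and the two handlers can be given in either order because dispatch is on the wrapper type rather than on position.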

Not to mention dealing with bases, floating-point numbers, booleans, etc.

Damn, I really need to make that proposal about named arguments.


.